Using Other Methods to Open and Read Member Files

00:00 Other Methods of Opening and Reading Member Files. If you are looking for a more flexible way to read from member files and create and add new member files to an archive, then ZipFile’s .open() method is for you. Like the built-in open() function, this method implements the context manager protocol, and therefore it supports the with statement.

00:29 Here you open hello.txt for reading. The first argument to .open() is name, indicating the member file that you want to open. The second argument is the mode, which defaults to "r" as usual.

00:44 ZipFile.open() also accepts a pwd argument for opening encrypted files. This argument works as the equivalent pwd argument does in .read().
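The read-mode call described above can be sketched as follows. This is a minimal, self-contained illustration rather than the exact on-screen code: it first builds a throwaway sample.zip in a temporary directory, and the content of hello.txt is an assumption. The archive is not encrypted, so pwd is left out.

```python
import tempfile
import zipfile
from pathlib import Path

# Build a throwaway archive so the example runs on its own.
# The file names mirror the lesson; the content is an assumption.
workdir = Path(tempfile.mkdtemp())
archive_path = workdir / "sample.zip"
with zipfile.ZipFile(archive_path, mode="w") as archive:
    archive.writestr("hello.txt", "Hello, World!\n")

with zipfile.ZipFile(archive_path, mode="r") as archive:
    # .open() returns a binary file-like object for the member file.
    with archive.open("hello.txt", mode="r") as hello:
        lines = list(hello)  # iterating yields bytes, one line at a time

print(lines)
```

Because the member file is opened in binary mode, each item in lines is a bytes object rather than a string.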

00:56 You can also use .open() with the "w" mode. This mode allows you to create a new member file, write content to it, and finally append the file to the underlying archive, which you should open in append mode.

01:14 First, you open sample.zip in append mode. Then you create new_hello.txt by calling .open() with the "w" mode.

01:23 This method returns a file-like object that supports .write(), which allows you to write bytes into this newly created file. Note that you need to pass .open() a filename that doesn't already exist in the archive.

01:36 If you use a filename that already exists in the underlying archive, then you’ll end up with a duplicated member file, and Python will issue a UserWarning.
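To see the duplicate-name behavior in isolation, here is a small sketch that deliberately reuses an existing member name. It assumes a freshly built sample.zip in a temporary directory and uses the standard warnings module to capture the warning instead of letting it print.

```python
import tempfile
import warnings
import zipfile
from pathlib import Path

# Throwaway archive with a single member (content is an assumption).
workdir = Path(tempfile.mkdtemp())
archive_path = workdir / "sample.zip"
with zipfile.ZipFile(archive_path, mode="w") as archive:
    archive.writestr("hello.txt", "Hello, World!\n")

# Record warnings so that the duplicate-name warning can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    with zipfile.ZipFile(archive_path, mode="a") as archive:
        # Reusing an existing name adds a second entry; nothing is replaced.
        with archive.open("hello.txt", mode="w") as duplicate:
            duplicate.write(b"Hello again!")

with zipfile.ZipFile(archive_path, mode="r") as archive:
    names = archive.namelist()

print(names)
print([str(item.message) for item in caught])
```

The archive ends up listing hello.txt twice, which is why a new, unused name is the safer choice.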

01:45 Here you write the bytes b"Hello, World!" into new_hello.txt. When the execution flow exits the inner with statement, Python writes the input bytes into the member file.

01:57 When the outer with statement exits, Python writes new_hello.txt into the underlying ZIP file, sample.zip.

02:11 The second part of the code confirms that new_hello.txt is now a member file of sample.zip. A detail to notice in the output of this example is that .write() sets the Modified date of the newly added file to 1980-01-01, which is slightly unusual behavior that you should keep in mind when using this method.
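The two-part example walked through above might look like the following sketch. It is an approximation of the on-screen code, not a copy of it: the archive is created in a temporary directory first so the snippet runs on its own, and the initial content of sample.zip is an assumption.

```python
import tempfile
import zipfile
from pathlib import Path

# Throwaway archive standing in for the lesson's sample.zip.
workdir = Path(tempfile.mkdtemp())
archive_path = workdir / "sample.zip"
with zipfile.ZipFile(archive_path, mode="w") as archive:
    archive.writestr("hello.txt", "Hello, World!\n")

# Part 1: open the archive in append mode and create a new member file.
with zipfile.ZipFile(archive_path, mode="a") as archive:
    with archive.open("new_hello.txt", mode="w") as new_hello:
        new_hello.write(b"Hello, World!")  # .write() expects bytes

# Part 2: confirm that the new member is now part of the archive.
with zipfile.ZipFile(archive_path, mode="r") as archive:
    names = archive.namelist()
    new_content = archive.read("new_hello.txt")
    archive.printdir()  # prints name, modified date, and size per member
```

The printdir() call is where the Modified column for new_hello.txt can be checked.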

02:34 As you saw previously, you can use the .read() and .write() methods to read from and write to member files without extracting them from the containing ZIP archive.

02:43 Both of these methods work exclusively with bytes. However, when you have a ZIP archive containing text files, you may want to read their content as text instead of bytes.

02:53 There are at least two ways to do this. You can use bytes.decode() or io.TextIOWrapper. Because ZipFile.read() returns the content of the target member file as bytes, .decode() can operate on these bytes directly. The .decode() method decodes a bytes object into a string using a given character encoding format.

03:18 On-screen, you’ll see how to use .decode() to read text from the hello.txt file in the sample.zip archive.

03:35 Here, you read the content of hello.txt as bytes. Then you call .decode() to decode the bytes into a string using UTF-8 as encoding.

03:48 To set the encoding argument, you use the "utf-8" string.

03:55 However, you could use any other valid encoding, such as UTF-16 or cp1252. You pass the encoding name as a string, and the name is case-insensitive. Note that "utf-8" is the default value of the encoding argument to .decode().
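A minimal sketch of the .decode() approach is shown here. It assumes a throwaway sample.zip whose hello.txt member was stored as UTF-8 text; the content itself is made up for illustration.

```python
import tempfile
import zipfile
from pathlib import Path

# Throwaway archive; writestr() stores str data encoded as UTF-8.
workdir = Path(tempfile.mkdtemp())
archive_path = workdir / "sample.zip"
with zipfile.ZipFile(archive_path, mode="w") as archive:
    archive.writestr("hello.txt", "Hello, World!\n")

with zipfile.ZipFile(archive_path, mode="r") as archive:
    raw = archive.read("hello.txt")      # bytes
    text = raw.decode(encoding="utf-8")  # str

print(type(raw).__name__, type(text).__name__)
print(text)
```

Since "utf-8" is the default, raw.decode() with no arguments would give the same result here; spelling it out simply makes the assumption about the file visible.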

04:11 It’s important to keep in mind that you need to know beforehand the character encoding format of any member file that you want to process using .decode().

04:19 If you use the wrong character encoding, then your code can fail to decode the underlying bytes into text and raise a UnicodeDecodeError, or it can decode them without complaint and leave you with indecipherable characters.
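Both failure modes can be reproduced with plain bytes objects, which behave the same way as the bytes returned by .read(). The sample strings below are assumptions chosen because they contain non-ASCII characters.

```python
original = "¡Hola, señor!"

# Case 1: UTF-8 bytes decoded as cp1252 succeed but produce garbled text.
utf8_bytes = original.encode("utf-8")
garbled = utf8_bytes.decode("cp1252")
print(garbled)

# Case 2: cp1252 bytes decoded as UTF-8 are not valid UTF-8 and raise an error.
cp1252_bytes = original.encode("cp1252")
try:
    cp1252_bytes.decode("utf-8")
    failed = False
except UnicodeDecodeError as error:
    failed = True
    print(f"Decoding failed: {error}")
```

The first case is the more dangerous one in practice, because no exception signals that the text is wrong.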

04:29 The second option for reading text out of a member file is to use an io.TextIOWrapper object, which provides a buffered text stream. This time you need to use .open() instead of .read().

04:43 On-screen is an example of using io.TextIOWrapper to read the content of the hello.txt member file as a stream of text.

04:57 In the inner with statement, you open the hello.txt member file from the sample.zip archive. You then pass the resulting binary file-like object, hello, as an argument to io.TextIOWrapper.

05:11 This creates a buffered text stream by decoding the content of hello using the UTF-8 character encoding format. As a result, you get a stream of text directly from the target member file.

05:26 Just as with .decode(), the io.TextIOWrapper class takes an encoding argument. You should always specify a value for this argument because the default text encoding depends on the system running the code and may not be the right value for the file that you are trying to decode.
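The io.TextIOWrapper approach can be sketched like this. As before, the archive is a throwaway stand-in for sample.zip, and the two-line content of hello.txt is an assumption.

```python
import io
import tempfile
import zipfile
from pathlib import Path

# Throwaway archive; writestr() stores str data encoded as UTF-8.
workdir = Path(tempfile.mkdtemp())
archive_path = workdir / "sample.zip"
with zipfile.ZipFile(archive_path, mode="w") as archive:
    archive.writestr("hello.txt", "Hello, World!\nWelcome to ZIP files!\n")

text_lines = []
with zipfile.ZipFile(archive_path, mode="r") as archive:
    with archive.open("hello.txt", mode="r") as hello:
        # Wrap the binary stream so that iteration yields str, not bytes.
        for line in io.TextIOWrapper(hello, encoding="utf-8"):
            text_lines.append(line.strip())

print(text_lines)
```

Compared with .decode(), this reads the member file as a stream, so the whole content does not have to be held in memory as a single bytes object first.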

05:43 In the next section of the course, you’ll see how to extract member files from ZIP archives and how to close ZIP files after use.
