Reading Information From ZIP Files
You’ve already put .printdir() into action. It’s a useful method that you can use to list the content of your ZIP files quickly. Along with .printdir(), the ZipFile class provides several handy methods for extracting metadata from existing ZIP files. On-screen, you can see a summary of these methods:
.getinfo() returns a ZipInfo object with information about a given member file, .infolist() returns a list of ZipInfo objects, one per member file, and .namelist() returns a list holding the names of all the member files.
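As a minimal sketch of what these three methods return, the snippet below first builds a small throwaway archive so that it can run on its own. The archive name sample.zip and the member names hello.txt and lorem.md are placeholders chosen for illustration, not files from the course.

```python
import tempfile
import zipfile
from pathlib import Path

# Build a small throwaway archive (hypothetical names) so the example is self-contained.
archive_path = Path(tempfile.mkdtemp()) / "sample.zip"
with zipfile.ZipFile(archive_path, mode="w") as archive:
    archive.writestr("hello.txt", "Hello, World!\n")
    archive.writestr("lorem.md", "# Lorem Ipsum\n")

with zipfile.ZipFile(archive_path, mode="r") as archive:
    names = archive.namelist()                 # list of member names (str)
    infos = archive.infolist()                 # list of ZipInfo objects
    hello_info = archive.getinfo("hello.txt")  # a single ZipInfo object

print(names)
print([info.filename for info in infos])
print(hello_info.filename)
```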
.getinfo() takes the name of a member file as an argument and returns a ZipInfo object with information about it. ZipInfo objects have several attributes that allow you to retrieve valuable information about the target member file. For example, .file_size and .compress_size hold the size, in bytes, of the original and compressed files, respectively.
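To see the two size attributes side by side, here is a short sketch that writes deliberately repetitive text with ZIP_DEFLATED compression and then inspects the resulting ZipInfo object. It assumes the zlib module is available in your Python installation, and the file names are again placeholders.

```python
import tempfile
import zipfile
from pathlib import Path

# Hypothetical archive and member names, created only for this example.
archive_path = Path(tempfile.mkdtemp()) / "sample.zip"
text = "Hello, World!\n" * 200  # repetitive text compresses well

# ZIP_DEFLATED needs zlib, which is assumed to be installed.
with zipfile.ZipFile(archive_path, mode="w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("hello.txt", text)

with zipfile.ZipFile(archive_path, mode="r") as archive:
    info = archive.getinfo("hello.txt")

print(f"Original size:   {info.file_size} bytes")
print(f"Compressed size: {info.compress_size} bytes")
```

With an uncompressed (stored) member, both attributes would report the same number.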
The for loop iterates over the ZipInfo objects from .infolist(), retrieving the filename, the last modification date, the normal size, and the compressed size of each member file. In this example, you use datetime to format the date in a human-readable way.
Sometimes you have a ZIP file and need to read the content of a given member file without extracting it. To do that, you can use .read(). This method takes a member file’s name and returns that file’s content as bytes.
To use .read(), you need to open the ZIP file for reading or appending. Note that .read() returns the content of the target file as a stream of bytes. In this example, you use .split() to split the stream into lines using the line feed character, passed as the bytes literal b"\n", as a separator.
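Here is a minimal sketch of reading a member without extracting it. The archive and its content are made up for the example. Because .read() returns bytes, the separator passed to .split() is a bytes literal, and decoding to text is shown as an optional extra step.

```python
import tempfile
import zipfile
from pathlib import Path

# Throwaway archive with a hypothetical member file.
archive_path = Path(tempfile.mkdtemp()) / "sample.zip"
with zipfile.ZipFile(archive_path, mode="w") as archive:
    archive.writestr("hello.txt", "Hello, World!\nWelcome!\n")

with zipfile.ZipFile(archive_path, mode="r") as archive:
    raw = archive.read("hello.txt")  # bytes, nothing is written to disk

byte_lines = raw.split(b"\n")                  # separator must be bytes too
text_lines = raw.decode("utf-8").splitlines()  # optional: decode to str first

for line in byte_lines:
    print(line)
```

Note that splitting on b"\n" leaves an empty bytes object at the end when the content finishes with a line feed, while .splitlines() on the decoded text doesn't.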
First, you provide the password secret to read the encrypted file. The pwd argument accepts values of the bytes type. As you can see here, if you use .read() on an encrypted file without providing the required password, then you get a RuntimeError.
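The pattern can be sketched as a small helper. The function name read_member() and the file names are hypothetical. Because zipfile can't create encrypted archives, the demo archive below is unencrypted, in which case the password is simply ignored. With a real encrypted archive, leaving out the password would take the except branch.

```python
import tempfile
import zipfile
from pathlib import Path


def read_member(archive_path, member, password=None):
    """Return a member's content as bytes, or None if it can't be decrypted.

    Hypothetical helper: password must be a bytes object such as b"secret".
    """
    with zipfile.ZipFile(archive_path, mode="r") as archive:
        try:
            return archive.read(member, pwd=password)
        except RuntimeError as error:
            # Raised for encrypted members when the password is missing or wrong.
            print(f"Could not read {member!r}: {error}")
            return None


# Demo on an unencrypted throwaway archive, where pwd has no effect.
archive_path = Path(tempfile.mkdtemp()) / "sample.zip"
with zipfile.ZipFile(archive_path, mode="w") as archive:
    archive.writestr("hello.txt", "Hello, World!\n")

content = read_member(archive_path, "hello.txt", password=b"secret")
print(content)
```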
Note that zipfile can decrypt ZIP files but can’t create encrypted ones, so you need an external file archiver to encrypt your files. Some popular file archivers include 7z and WinRAR for Windows, Ark and GNOME Archive Manager for Linux, and Archiver and Keka for macOS.
For large encrypted ZIP files, keep in mind that the decryption operation can be extremely slow because it’s implemented in pure Python. In such cases, consider using a specialized program to handle your archives instead of using zipfile.
If you regularly work with encrypted files, then you may want to avoid providing the decryption password every time you call .read() or another method that accepts a pwd argument.
If that’s the case, you can use ZipFile.setpassword() to set a global password. With .setpassword(), you just need to provide the password once. ZipFile then uses that single password for decrypting all of the member files.
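As a rough sketch, setting the password once and then reading several members could look like this. The archive here is an unencrypted stand-in with placeholder names, since zipfile can't produce encrypted files, so the password goes unused. With an archive encrypted by an external tool, the same calls would decrypt each member with b"secret".

```python
import tempfile
import zipfile
from pathlib import Path

# Unencrypted stand-in archive with hypothetical member names.
archive_path = Path(tempfile.mkdtemp()) / "sample.zip"
with zipfile.ZipFile(archive_path, mode="w") as archive:
    archive.writestr("hello.txt", "Hello, World!\n")
    archive.writestr("lorem.md", "# Lorem Ipsum\n")

with zipfile.ZipFile(archive_path, mode="r") as archive:
    archive.setpassword(b"secret")  # must be bytes, set once for the whole archive
    # No pwd argument is needed on the individual calls from here on.
    contents = {name: archive.read(name) for name in archive.namelist()}

for name, data in contents.items():
    print(name, data)
```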
If you call .read() on a ZIP file that uses an unsupported compression method, this raises a NotImplementedError. You’ll also get an error if the required compression module isn’t available in your Python installation.
In the next section of the course, you’ll see some other ways of opening and reading the contents of ZIP files.