Manipulating Existing ZIP Files
00:00
Manipulating Existing ZIP Files With Python’s zipfile. Python’s zipfile module provides convenient classes and functions that allow you to create, read, write, extract, and list the contents of your ZIP files.
00:13
Here are some additional features that zipfile supports: ZIP files greater than four gigabytes; data decryption; several compression algorithms, such as Deflate, Bzip2, and LZMA; and information integrity checks with CRC32.
00:32
Be aware that zipfile does have some limitations. For example, the current data decryption feature can be fairly slow because it uses pure Python code. The module also can’t create encrypted ZIP files.
00:44
And finally, multi-disk ZIP files aren’t supported either. Despite these limitations, zipfile is a great and useful tool, and you’ll learn all about its capabilities in this course. In the zipfile module, you’ll find the ZipFile class.
01:01
This class works pretty much like Python’s built-in open() function, allowing you to open ZIP files using different modes. The read mode ("r") is the default.
01:12
You can also use the write ("w"), append ("a"), and exclusive ("x") modes. You’ll learn more about these later on.
01:23
ZipFile implements the context manager protocol, so you can use the class in a with statement. This feature allows you to quickly open and work with a ZIP file without worrying about closing the file after you finish your work.
01:37
Before writing any code, make sure you have a copy of the files and archives that you’ll be using by downloading the course materials. You can then unzip them using your operating system’s built-in ZIP handling: double-click the archive in macOS, or right-click it and choose Extract All in Windows.
02:05
Create a new directory called python-zipfile/ in your home folder. Then move into the directory and copy the unzipped files into it.
02:22
You should see a listing similar to the one seen on-screen. Once you have these files and directories present in your working folder, you’re ready to open a Python interactive session and start working with ZIP files. To warm up, you’ll start by reading the ZIP file called sample.zip. To do that, you can use the code seen on-screen.
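The on-screen code isn't part of this transcript, so here is a minimal sketch of what reading an archive with a with statement might look like. It assumes a small stand-in for sample.zip, built in a temporary directory so the example runs anywhere. The hello.txt member and its content are made up for illustration, and the namelist() call is an addition that captures the member names in a variable.

```python
import tempfile
import zipfile
from pathlib import Path

# Build a small stand-in for sample.zip in a temporary directory.
# With the course materials in place, you would skip this step.
workdir = Path(tempfile.mkdtemp())
sample_path = workdir / "sample.zip"
with zipfile.ZipFile(sample_path, mode="w") as archive:
    archive.writestr("hello.txt", "Hello, World!")  # hypothetical member

# Open the archive for reading; the with statement closes it afterward.
with zipfile.ZipFile(sample_path, mode="r") as archive:
    archive.printdir()  # prints a table of the archive's members
    member_names = archive.namelist()
```

With the real course files, you would pass "sample.zip" as the first argument instead of the temporary path.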
02:53
The first argument to the initializer of ZipFile can be a string representing the path to the ZIP file that you need to open. This argument can accept file-like and path-like objects too. On-screen, you use a string-based path.
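To make the three kinds of first argument concrete, the sketch below opens the same archive from a string path, a pathlib.Path object, and an open binary file object. The archive and its hello.txt member are stand-ins created only for this example.

```python
import tempfile
import zipfile
from pathlib import Path

# Create a throwaway archive to open in three different ways.
workdir = Path(tempfile.mkdtemp())
archive_path = workdir / "sample.zip"
with zipfile.ZipFile(archive_path, mode="w") as archive:
    archive.writestr("hello.txt", "Hello, World!")  # hypothetical member

# 1. A string-based path
with zipfile.ZipFile(str(archive_path)) as archive:
    from_string = archive.namelist()

# 2. A path-like object
with zipfile.ZipFile(archive_path) as archive:
    from_path = archive.namelist()

# 3. A file-like object opened in binary mode
with open(archive_path, "rb") as file_obj:
    with zipfile.ZipFile(file_obj) as archive:
        from_file_object = archive.namelist()

print(from_string, from_path, from_file_object)
```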
03:08
The second argument to ZipFile is a single-letter string representing the mode that you’ll use to open the file. As you saw earlier, ZipFile can accept four possible modes depending on your needs.
03:19
The mode argument defaults to "r", so you can omit it if you want to open the archive for reading only. Inside the with statement, you call the .printdir() method on archive.
03:31
The archive variable holds the ZipFile instance itself. The .printdir() method provides a quick way to display the content of the underlying ZIP file on your screen.
03:41
Its output has a user-friendly tabular format with three informative columns: File Name, Modified, and Size. If you want to make sure that you’re targeting a valid ZIP file before you try to open it, then you can wrap ZipFile in a try … except statement and catch any BadZipFile exception, as seen on-screen.
04:07
When this code is run on sample.zip, it lists the archive as before without raising an exception. However, running the code on bad_sample.zip raises a BadZipFile exception, because that file is not a valid ZIP file.
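As a rough sketch of the try … except pattern described here, the code below creates a file that only pretends to be a ZIP archive and then tries to open it. The file content is made up for the example, and the outcome variable is just a convenient way to record which branch ran.

```python
import tempfile
import zipfile
from pathlib import Path

# Create a stand-in for bad_sample.zip: a plain file with a .zip extension.
workdir = Path(tempfile.mkdtemp())
bad_path = workdir / "bad_sample.zip"
bad_path.write_bytes(b"This is not a ZIP archive.")

try:
    with zipfile.ZipFile(bad_path) as archive:
        archive.printdir()
    outcome = "valid"
except zipfile.BadZipFile as error:
    # ZipFile raises BadZipFile when the file lacks a valid ZIP structure.
    outcome = f"invalid: {error}"

print(outcome)
```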
04:29
To check for a valid ZIP file, you can also use the is_zipfile() function.
04:38
Here you use a conditional statement with is_zipfile() as the condition. This function takes a filename argument that holds the path to a ZIP file in your file system.
04:48
This argument can be a string, a file-like object, or a path-like object. The function returns True if filename is a valid ZIP file. Otherwise, it returns False.
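The check described above could look something like the following sketch. It builds one valid archive and one invalid file in a temporary directory, then uses is_zipfile() as the condition. The file names mirror the ones in the course, but the contents here are stand-ins.

```python
import tempfile
import zipfile
from pathlib import Path

workdir = Path(tempfile.mkdtemp())

# A valid archive and an invalid one, both created just for this example.
good_path = workdir / "sample.zip"
with zipfile.ZipFile(good_path, mode="w") as archive:
    archive.writestr("hello.txt", "Hello, World!")  # hypothetical member
bad_path = workdir / "bad_sample.zip"
bad_path.write_bytes(b"This is not a ZIP archive.")

results = {}
for path in (good_path, bad_path):
    if zipfile.is_zipfile(path):
        # Only open the file once you know it is a ZIP archive.
        with zipfile.ZipFile(path) as archive:
            archive.printdir()
        results[path.name] = True
    else:
        print(f"{path.name} is not a valid ZIP file")
        results[path.name] = False
```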
05:22
Now let’s say you want to add hello.txt to a hello.zip archive using ZipFile. To do that, you can use the write mode ("w").
05:31
This mode opens a ZIP file for writing. If the target ZIP file exists, then the "w" mode truncates it and writes any new content you pass in.
05:40
Note that if you’re using ZipFile with existing files, then you should be careful with the "w" mode, because it can truncate your ZIP file and you’ll lose all the original content.
05:52
If the target ZIP file doesn’t exist, then ZipFile creates it for you when you close the archive.
06:04
In this example, you call .write() on the ZipFile object. This method allows you to write member files into your ZIP archives.
06:12
Note that the argument to .write() should be an existing file. After running this code, you’ll have a hello.zip file in your working directory.
06:24
If you list the archive’s content using .printdir(), then you’ll see that hello.txt is there.
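Here is a minimal sketch of the write-mode workflow described above. It works in a temporary directory rather than your python-zipfile/ folder, and it passes the optional arcname argument to .write() so the member is stored as plain hello.txt instead of under its full temporary path. Both choices are assumptions made to keep the example self-contained.

```python
import tempfile
import zipfile
from pathlib import Path

workdir = Path(tempfile.mkdtemp())

# .write() needs an existing file, so create hello.txt first.
hello_txt = workdir / "hello.txt"
hello_txt.write_text("Hello, World!")

# Write mode creates hello.zip, or truncates it if it already exists.
archive_path = workdir / "hello.zip"
with zipfile.ZipFile(archive_path, mode="w") as archive:
    archive.write(hello_txt, arcname="hello.txt")

# Reopen in the default read mode to confirm the member is there.
with zipfile.ZipFile(archive_path) as archive:
    archive.printdir()
    names = archive.namelist()
```

If you run the code from the folder that holds hello.txt, you can pass the bare file names and drop arcname.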
06:36
ZipFile is smart enough to create a new archive when you use the class in write mode and the target archive doesn’t exist. However, the class doesn’t create new directories in the path to the target ZIP file if those directories don’t already exist.
06:51
This explains why the code seen on-screen won’t work.
07:06
Because the missing/ directory in the path to the target hello.zip file doesn’t exist, you get a FileNotFoundError exception.
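The failure can be reproduced with a sketch like the one below, which targets a hello.zip inside a missing/ directory that doesn't exist yet. The second half goes beyond what the lesson shows: creating the parent directory yourself with Path.mkdir() before opening the archive is one possible workaround.

```python
import tempfile
import zipfile
from pathlib import Path

workdir = Path(tempfile.mkdtemp())
target = workdir / "missing" / "hello.zip"  # missing/ does not exist yet

try:
    with zipfile.ZipFile(target, mode="w") as archive:
        archive.writestr("hello.txt", "Hello, World!")
    first_attempt = "created"
except FileNotFoundError:
    # ZipFile does not create intermediate directories for you.
    first_attempt = "failed"

# One possible fix (not from the lesson): create the directory first.
target.parent.mkdir(parents=True, exist_ok=True)
with zipfile.ZipFile(target, mode="w") as archive:
    archive.writestr("hello.txt", "Hello, World!")

print(first_attempt, target.exists())
```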
07:17
The append mode ("a") allows you to append new member files to an existing ZIP file. This mode doesn’t truncate the archive, so its original content is safe.
07:27
If the target ZIP file doesn’t exist, then the "a" mode creates a new one for you and appends any input files that you pass as an argument to .write().
07:38
To try out the "a" mode, go ahead and add the new_hello.txt file to your newly created hello.zip archive. Here you use the append mode to add new_hello.txt to the existing hello.zip file.
07:58
Then you run .printdir() to confirm the new file is present in the ZIP file.
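As an illustration of the append step, the sketch below first recreates hello.zip with hello.txt in a temporary directory and then appends new_hello.txt to it. The file contents and the arcname arguments are assumptions that keep the example self-contained.

```python
import tempfile
import zipfile
from pathlib import Path

workdir = Path(tempfile.mkdtemp())
hello_txt = workdir / "hello.txt"
hello_txt.write_text("Hello, World!")
new_hello_txt = workdir / "new_hello.txt"
new_hello_txt.write_text("Hello again!")

# Start from an archive that already holds hello.txt.
archive_path = workdir / "hello.zip"
with zipfile.ZipFile(archive_path, mode="w") as archive:
    archive.write(hello_txt, arcname="hello.txt")

# Append mode keeps the existing members and adds the new one.
with zipfile.ZipFile(archive_path, mode="a") as archive:
    archive.write(new_hello_txt, arcname="new_hello.txt")
    archive.printdir()
    names = archive.namelist()
```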
08:11
ZipFile also supports an exclusive mode ("x"). This mode allows you to exclusively create new ZIP files and write new member files into them.
08:20
You’ll use the exclusive mode when you want to make a new ZIP file without overwriting an existing one. If the target file already exists, then you get a FileExistsError.
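A short sketch of the exclusive mode follows. The first call creates the archive, and the second call on the same path is refused. The member names are invented, and writestr() is used instead of .write() only so the example needs no input files on disk.

```python
import tempfile
import zipfile
from pathlib import Path

workdir = Path(tempfile.mkdtemp())
archive_path = workdir / "hello.zip"

# The first call succeeds because hello.zip does not exist yet.
with zipfile.ZipFile(archive_path, mode="x") as archive:
    archive.writestr("hello.txt", "Hello, World!")

# The second call targets the same path, so exclusive mode refuses it.
try:
    with zipfile.ZipFile(archive_path, mode="x") as archive:
        archive.writestr("other.txt", "This is never written.")
    second_attempt = "created"
except FileExistsError:
    second_attempt = "refused"

# The original archive is left untouched.
with zipfile.ZipFile(archive_path) as archive:
    names = archive.namelist()
print(second_attempt, names)
```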
08:32
If you create a ZIP file using the "w", "a", or "x" mode and then close the archive without adding any member files, then ZipFile creates an empty archive with the appropriate ZIP format.
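You can check this behavior with a sketch like the following, which opens a new archive in write mode and closes it right away. The empty.zip file name is made up for the example.

```python
import tempfile
import zipfile
from pathlib import Path

workdir = Path(tempfile.mkdtemp())
empty_path = workdir / "empty.zip"  # hypothetical file name

# Open in write mode and close without adding any members.
with zipfile.ZipFile(empty_path, mode="w"):
    pass

# The result is still a well-formed ZIP file with no members.
is_valid = zipfile.is_zipfile(empty_path)
with zipfile.ZipFile(empty_path) as archive:
    names = archive.namelist()
print(is_valid, names)
```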
08:44
In the next section of the course, you’ll see how to read information and files from ZIP files.