Manipulating Existing ZIP Files

00:00 Manipulating Existing ZIP Files With Python’s zipfile. Python’s zipfile provides convenient classes and functions that allow you to create, read, write, extract, and list the contents of your ZIP files.

00:13 Here are some additional features that zipfile supports: ZIP files greater than four gigabytes; data decryption; several compression algorithms, such as Deflate, Bzip2, and LZMA; and information integrity checks with CRC32.
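As a rough illustration of two of these features, the sketch below writes one member with Deflate compression and then checks its CRC32. The file name compressed.zip, the temporary directory, and the payload are assumptions made up for this example, not part of the course materials.

```python
import tempfile
import zipfile
import zlib
from pathlib import Path

# Hypothetical scratch location (mkdtemp does not clean up after itself).
workdir = Path(tempfile.mkdtemp())
archive_path = workdir / "compressed.zip"
payload = b"Hello, World!\n" * 100  # repetitive data compresses well

# Write one member using the Deflate algorithm (requires the zlib module).
with zipfile.ZipFile(archive_path, mode="w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("hello.txt", payload)

with zipfile.ZipFile(archive_path) as archive:
    info = archive.getinfo("hello.txt")
    # testzip() reads every member and returns the first name with a bad CRC,
    # or None when all members pass the check.
    first_bad_member = archive.testzip()
    # The stored CRC should match one computed directly from the payload.
    crc_matches = info.CRC == zlib.crc32(payload)
    is_smaller = info.compress_size < info.file_size

print(first_bad_member, crc_matches, is_smaller)
```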

00:32 Be aware that zipfile does have some limitations. For example, the current data decryption feature can be fairly slow because it uses pure Python code. The module can’t handle the creation of encrypted ZIP files.

00:44 And finally, the use of multi-disk ZIP files isn’t supported either. Despite these limitations, zipfile is a great and useful tool, and you’ll learn all about its capabilities in this course. In the zipfile module, you’ll find the ZipFile class.

01:01 This class works pretty much like Python’s built-in open() function, allowing you to open ZIP files using different modes. The read mode ("r") is the default.

01:12 You can also use the write ("w"), append ("a"), and exclusive ("x") modes. You’ll learn more about these later on.

01:23 ZipFile implements the context manager protocol so you can use the class in a with statement. This feature allows you to quickly open and work with a ZIP file without worrying about closing the file after you finish your work.

01:37 Before writing any code, make sure you have a copy of the files and archives that you’ll be using by downloading the course materials. You can then unzip them using your operating system’s built-in ZIP handling: on macOS, double-click the archive, as shown here; on Windows, right-click the archive and choose Extract All.

02:05 Create a new directory called python-zipfile/ in your home folder. Then move into the directory and copy the unzipped files into it.

02:22 You should see a listing similar to the one seen on-screen. Once you have these files and directories present in your working folder, you are ready to open up a Python interactive session and to start working with ZIP files. To warm up, you’ll start by reading the ZIP file called sample.zip. To do that, you can use the code seen on-screen.

02:53 The first argument to the initializer of ZipFile can be a string representing the path to the ZIP file that you need to open. This argument can accept file-like and path-like objects too. On-screen, you use a string-based path.

03:08 The second argument to ZipFile is a single-letter string representing the mode that you’ll use to open the file. As you saw earlier, ZipFile can accept four possible modes depending on your needs.

03:19 The mode positional argument defaults to "r", so you can get rid of it if you want to open the archive for reading only. Inside the with statement, you call the .printdir() method on archive.

03:31 The archive variable now holds the instance of ZipFile itself. The .printdir() method provides a quick way to display the content of the underlying ZIP file on your screen.
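The on-screen code isn't part of this transcript, so here is a minimal sketch of what opening an archive for reading might look like. Because the course's sample.zip may not be at hand, the sketch first builds a small stand-in archive in a temporary directory; the member names and contents are assumptions for illustration only.

```python
import tempfile
import zipfile
from pathlib import Path

# Hypothetical stand-in for the course's sample.zip, built in a scratch
# directory so the example runs anywhere (mkdtemp does not clean up).
workdir = Path(tempfile.mkdtemp())
sample_path = workdir / "sample.zip"
with zipfile.ZipFile(sample_path, mode="w") as archive:
    archive.writestr("hello.txt", "Hello, World!\n")
    archive.writestr("lorem.md", "# Lorem ipsum\n")

# Open the archive for reading. mode="r" is the default and could be omitted.
with zipfile.ZipFile(sample_path, mode="r") as archive:
    archive.printdir()  # prints a File Name / Modified / Size table
    member_names = archive.namelist()  # the same names as a list

print(member_names)
```

If you have the course materials, you could pass the string "sample.zip" as the first argument instead of the temporary path.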

03:41 The output of .printdir() has a user-friendly tabular format with three informative columns: File Name, Modified, and Size. If you want to make sure that you are targeting a valid ZIP file before you try to open it, then you can wrap ZipFile in a try … except statement and catch any BadZipFile exception, as seen on-screen.

04:07 When this code is run on sample.zip, it executes without raising an exception as before. However, running the code on bad_sample.zip leads to an exception, as the file is not a valid ZIP file.
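A sketch of that pattern could look like the following. The two files are hypothetical stand-ins for the course's sample.zip and bad_sample.zip, and the helper try_open() is a name invented for this example.

```python
import tempfile
import zipfile
from pathlib import Path

# Hypothetical scratch directory (not cleaned up automatically).
workdir = Path(tempfile.mkdtemp())

# Stand-in for a valid archive.
good_path = workdir / "sample.zip"
with zipfile.ZipFile(good_path, mode="w") as archive:
    archive.writestr("hello.txt", "Hello, World!\n")

# Stand-in for an invalid archive: plain text with a .zip extension.
bad_path = workdir / "bad_sample.zip"
bad_path.write_text("This is not a ZIP archive.")


def try_open(path):
    """Return True if path opens as a ZIP file, False on BadZipFile."""
    try:
        with zipfile.ZipFile(path) as archive:
            archive.printdir()
    except zipfile.BadZipFile as error:
        print(f"Error: {error}")
        return False
    return True


good_result = try_open(good_path)
bad_result = try_open(bad_path)
```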

04:29 To check for a valid ZIP file, you can also use the is_zipfile() function.

04:38 Here you use a conditional statement with is_zipfile() as a condition. This function takes a filename argument that holds the path to a ZIP file in your file system.

04:48 This argument can accept string, file-, or path-like objects. The function returns True if filename is a valid ZIP file. Otherwise, it returns False.
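Here is one way the conditional might be written. As before, the two files are made-up stand-ins created in a temporary directory, and the results dictionary exists only to record the outcome for this example.

```python
import tempfile
import zipfile
from pathlib import Path

# Hypothetical scratch directory (not cleaned up automatically).
workdir = Path(tempfile.mkdtemp())

good_path = workdir / "sample.zip"
with zipfile.ZipFile(good_path, mode="w") as archive:
    archive.writestr("hello.txt", "Hello, World!\n")

bad_path = workdir / "bad_sample.zip"
bad_path.write_text("This is not a ZIP archive.")

results = {}
for path in (good_path, bad_path):
    # is_zipfile() returns a Boolean instead of raising an exception.
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            archive.printdir()
        results[path.name] = True
    else:
        print(f"{path.name} is not a valid ZIP file")
        results[path.name] = False
```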

05:22 Now let’s say you want to add hello.txt to a hello.zip archive using ZipFile. To do that, you can use the write mode ("w").

05:31 This mode opens a ZIP file for writing. If the target ZIP file exists, then the "w" mode truncates it and writes any new content you pass in.

05:40 Note that if you’re using ZipFile with existing files, then you should be careful with the "w" mode. You can truncate your ZIP file and lose all the original content.

05:52 If the target ZIP file doesn’t exist, then ZipFile creates it for you when you close the archive.

06:04 In this example, you call .write() on the ZipFile object. This method allows you to write member files into your ZIP archives.

06:12 Note that the argument to .write() should be an existing file. After running this code, you’ll have a hello.zip file in your working directory.

06:24 If you list the file content using .printdir(), then you’ll notice that hello.txt will be there.
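The write-mode steps could be sketched as follows. The example creates its own hello.txt in a temporary directory rather than relying on the course files, and it passes arcname so that the member is stored under its bare name instead of the temporary path; both choices are assumptions made to keep the sketch self-contained.

```python
import tempfile
import zipfile
from pathlib import Path

# Hypothetical scratch directory (not cleaned up automatically).
workdir = Path(tempfile.mkdtemp())
hello_txt = workdir / "hello.txt"
hello_txt.write_text("Hello, World!\n")  # .write() needs an existing file
hello_zip = workdir / "hello.zip"

# "w" creates hello.zip, or truncates it if it already exists.
with zipfile.ZipFile(hello_zip, mode="w") as archive:
    archive.write(hello_txt, arcname="hello.txt")

# Reopen in the default read mode to confirm the member is there.
with zipfile.ZipFile(hello_zip) as archive:
    archive.printdir()
    written_members = archive.namelist()
```

When you run the code from inside the directory that holds hello.txt, as in the course, a plain archive.write("hello.txt") should be enough.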

06:36 ZipFile is smart enough to create a new archive when you use the class in write mode and the target archive doesn’t exist. However, the class doesn’t create new directories in the path to the target ZIP file if those directories don’t already exist.

06:51 This explains why the code seen on-screen won’t work.

07:06 Because the missing/ directory in the path to the target hello.zip file doesn’t exist, you get a FileNotFoundError exception.
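The failure can be reproduced with a sketch like this one. The paths live in a temporary directory invented for the example, and the final step, creating the parent directory with Path.mkdir(), is one possible workaround rather than something shown in the lesson.

```python
import tempfile
import zipfile
from pathlib import Path

# Hypothetical scratch directory (not cleaned up automatically).
workdir = Path(tempfile.mkdtemp())
hello_txt = workdir / "hello.txt"
hello_txt.write_text("Hello, World!\n")

# The missing/ directory does not exist yet.
target = workdir / "missing" / "hello.zip"

caught = None
try:
    with zipfile.ZipFile(target, mode="w") as archive:
        archive.write(hello_txt, arcname="hello.txt")
except FileNotFoundError as error:
    caught = type(error).__name__
    print(f"{caught}: {error}")

# Possible workaround: create the parent directory first, then retry.
target.parent.mkdir(parents=True, exist_ok=True)
with zipfile.ZipFile(target, mode="w") as archive:
    archive.write(hello_txt, arcname="hello.txt")
```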

07:17 The append mode ("a") allows you to append new member files to an existing ZIP file. This mode doesn’t truncate the archive, so its original content is safe.

07:27 If the target ZIP file doesn’t exist, then the "a" mode creates a new one for you and appends any input files that you pass as an argument to .write().

07:38 To try out the "a" mode, go ahead and add the new_hello.txt file to your newly created hello.zip archive. Here you use the append mode to add new_hello.txt to the new hello.zip file.

07:58 Then you run .printdir() to confirm the new file is present in the ZIP file.
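A sketch of the append workflow, assuming both text files are created on the spot in a temporary directory instead of coming from the course materials:

```python
import tempfile
import zipfile
from pathlib import Path

# Hypothetical scratch directory (not cleaned up automatically).
workdir = Path(tempfile.mkdtemp())
hello_txt = workdir / "hello.txt"
hello_txt.write_text("Hello, World!\n")
new_hello_txt = workdir / "new_hello.txt"
new_hello_txt.write_text("Hello again!\n")
hello_zip = workdir / "hello.zip"

# Start with an archive that holds a single member.
with zipfile.ZipFile(hello_zip, mode="w") as archive:
    archive.write(hello_txt, arcname="hello.txt")

# "a" keeps the existing members and adds the new one at the end.
with zipfile.ZipFile(hello_zip, mode="a") as archive:
    archive.write(new_hello_txt, arcname="new_hello.txt")

with zipfile.ZipFile(hello_zip) as archive:
    archive.printdir()
    appended_members = archive.namelist()
```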

08:11 ZipFile also supports an exclusive mode ("x"). This mode allows you to exclusively create new ZIP files and write new member files into them.

08:20 You’ll use the exclusive mode when you want to make a new ZIP file without overwriting an existing one. If the target file already exists, then you get a FileExistsError.
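The exclusive mode's behavior might be demonstrated like this. The archive path and member content are assumptions for illustration, and the second_attempt variable exists only to record what happened.

```python
import tempfile
import zipfile
from pathlib import Path

# Hypothetical scratch directory (not cleaned up automatically).
workdir = Path(tempfile.mkdtemp())
target = workdir / "hello.zip"

# First attempt: the file does not exist, so "x" creates it.
with zipfile.ZipFile(target, mode="x") as archive:
    archive.writestr("hello.txt", "Hello, World!\n")

# Second attempt: the file now exists, so "x" refuses to touch it.
second_attempt = "created"
try:
    with zipfile.ZipFile(target, mode="x") as archive:
        archive.writestr("other.txt", "This should never be written.\n")
except FileExistsError as error:
    second_attempt = type(error).__name__
    print(f"{second_attempt}: {error}")

# The original content is still intact.
with zipfile.ZipFile(target) as archive:
    surviving_members = archive.namelist()
```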

08:32 If you create a ZIP file using the "w", "a", or "x" mode and then close the archive without adding any member files, then ZipFile creates an empty archive with the appropriate ZIP format.
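To see this for yourself, you could try a sketch along these lines. It opens and immediately closes one archive per mode, then reads each one back; the empty_*.zip file names are invented for the example.

```python
import tempfile
import zipfile
from pathlib import Path

# Hypothetical scratch directory (not cleaned up automatically).
workdir = Path(tempfile.mkdtemp())

empty_results = {}
for mode in ("w", "a", "x"):
    path = workdir / f"empty_{mode}.zip"
    # Open and close the archive without adding any member files.
    with zipfile.ZipFile(path, mode=mode):
        pass
    # The result can be reopened for reading and lists no members.
    with zipfile.ZipFile(path) as archive:
        empty_results[mode] = archive.namelist()

print(empty_results)
```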

08:44 In the next section of the course, you’ll see how to read information and files from ZIP files.
