Using Compression
00:00 Using compression. You can create an archive file like you would a regular one, with the addition of a suffix that corresponds to the desired compression type, as seen in the list onscreen.
00:14 pandas can deduce the compression type by itself. Here, you created a compressed CSV file as an archive. The size of the regular CSV file is 999 bytes, while the compressed file only has 730 bytes.
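As a rough sketch of that size comparison in code: the DataFrame below is made-up sample data (the lesson's own data isn't reproduced here), and the files go into a temporary directory, so the byte counts you see will differ from the ones quoted above.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data, repeated so there is something worth compressing.
df = pd.DataFrame({
    "country": ["China", "India", "USA", "Brazil"] * 50,
    "population": [1398.72, 1351.16, 329.74, 210.32] * 50,
})

workdir = tempfile.mkdtemp()
plain_path = os.path.join(workdir, "data.csv")
zip_path = os.path.join(workdir, "data.csv.zip")

df.to_csv(plain_path)  # regular CSV file
df.to_csv(zip_path)    # the .zip suffix tells pandas to write a ZIP archive

plain_size = os.path.getsize(plain_path)
zip_size = os.path.getsize(zip_path)
print(f"regular: {plain_size} bytes, compressed: {zip_size} bytes")
```

On very small files, the archive overhead can outweigh the savings, which is why this sketch repeats the rows.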
00:37 You can open this compressed file as usual with the pandas read_csv() function.
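Here's a minimal sketch of reading a compressed CSV file back in. It first writes its own small archive so that it runs on its own. The sample data and the temporary path are assumptions for illustration, not the lesson's actual files.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data standing in for the lesson's DataFrame.
df = pd.DataFrame({
    "country": ["China", "India", "USA"],
    "population": [1398.72, 1351.16, 329.74],
})

path = os.path.join(tempfile.mkdtemp(), "data.csv.zip")
df.to_csv(path)  # written as a ZIP archive because of the suffix

# No extra arguments are needed for decompression: pandas picks it up
# from the .zip suffix. index_col=0 restores the saved row labels.
restored = pd.read_csv(path, index_col=0)
print(restored)
```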
00:51 read_csv() decompresses the file before reading it into a DataFrame. You can specify the type of compression with the optional parameter compression, which can take on any of the values seen onscreen.
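To illustrate passing compression explicitly, here's a small sketch that uses a file name whose suffix says nothing about the compression type. The name data.csv.compressed and the sample data are hypothetical, and 'gzip' is just one of the values you could pass.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({
    "country": ["China", "India", "USA"],
    "population": [1398.72, 1351.16, 329.74],
})

# The suffix is not one pandas recognizes, so the type is stated explicitly.
path = os.path.join(tempfile.mkdtemp(), "data.csv.compressed")
df.to_csv(path, compression="gzip")

# The same value is needed when reading the file back.
restored = pd.read_csv(path, index_col=0, compression="gzip")
print(restored)
```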
01:05 The default value compression='infer' indicates that pandas should deduce the compression type from the file extension. Here’s how to compress a pickle file.
01:24 This should generate the file data.pickle.compress that you can later decompress and read.
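A sketch of that pickle round trip might look like the following. The file name data.pickle.compress comes from the lesson, while the sample data, the temporary directory, and the choice of 'gzip' are assumptions made for this example.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({
    "country": ["China", "India", "USA"],
    "population": [1398.72, 1351.16, 329.74],
})

path = os.path.join(tempfile.mkdtemp(), "data.pickle.compress")

# .compress is not a suffix pandas can infer a type from,
# so the compression is named explicitly for both writing and reading.
df.to_pickle(path, compression="gzip")
restored = pd.read_pickle(path, compression="gzip")
print(restored)
```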
01:38 Note that this again corresponds to the DataFrame with the same data as before. You can give the other compression methods a try as well. If you’re using pickle files, then keep in mind that the zip format only supports reading. Next up, you’ll see how to choose columns when working with big data.