Compression of Pickled Objects

00:00 In this lesson, you’re going to see how you can easily compress and decompress your pickled objects using the bz2 module in the standard library. While the pickle format is already relatively compact, you can often make pickled data even smaller by using bzip2 compression.

00:16 So to see this, go ahead and import pickle, and then import bz2. To get something to pickle, go ahead and define a new string, so I’m just going to say my_string, and then I’m going to paste in a bunch of random text that’s not quite a Lorem ipsum but it came from the first result on Google for a Lorem ipsum. So let me make a multiline string here, and then paste that in.

00:44 You can see there’s quite a bit there. Okay! Go ahead and pickle it, so you can say pickled = pickle.dumps(my_string), which dumps the string into a bytes object.
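
As a rough sketch of that step, the snippet below pickles a string and inspects the result. The text in my_string is only a placeholder and not the filler text used in the video.

```python
import pickle

# Placeholder text standing in for the long filler string from the lesson.
my_string = "Lorem ipsum dolor sit amet, consectetur adipiscing elit.\n" * 20

# dumps() serializes the object and returns the result as a bytes object.
pickled = pickle.dumps(my_string)

print(type(pickled))
# For plain ASCII text, the pickle is a little longer than the string itself,
# because the format adds a small header and a few opcodes around the data.
print(len(my_string), len(pickled))
```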

01:00 And then you can say compressed = bz2.compress(). Pass in pickled. And then to see what this looks like, you can go ahead and print out the length of my_string, and then print out the length of compressed, which is now a byte string.
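
Here is a minimal sketch of the script up to this point. The sample text is an assumption: it's a single sentence repeated several times, so it compresses well, and your numbers will differ from the ones in the video.

```python
import bz2
import pickle

# Placeholder text, not the text from the video. It is deliberately
# repetitive, so bzip2 has a lot of redundancy to work with.
my_string = (
    "Lorem ipsum dolor sit amet, consectetur adipiscing elit, "
    "sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.\n"
) * 10

pickled = pickle.dumps(my_string)   # serialize to bytes
compressed = bz2.compress(pickled)  # compress the pickled bytes

print(len(my_string))   # number of characters in the original string
print(len(pickled))     # number of bytes after pickling
print(len(compressed))  # number of bytes after compression
```

Keep in mind that len(my_string) counts characters while len(compressed) counts bytes, so comparing len(pickled) with len(compressed) is the more direct measure of what compression saved.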

01:22 And then just to make sure that nothing broke, you can say something like uncompressed = bz2.decompress(compressed).

01:34 And then just to make sure, you can say my_unpickled_string = pickle.loads(uncompressed), which loads the object back from the decompressed bytes.

01:46 And to see that, you can print out the length of my_unpickled_string, which should equal the length of the original my_string from before.
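
The full round trip could look something like the sketch below. The variable names follow the lesson, while the sample text is a placeholder. Comparing the strings directly is a slightly stronger check than comparing their lengths.

```python
import bz2
import pickle

# Placeholder text standing in for the lesson's filler string.
my_string = "Lorem ipsum dolor sit amet, consectetur adipiscing elit.\n" * 20

# Forward direction: pickle, then compress.
compressed = bz2.compress(pickle.dumps(my_string))

# Reverse direction: decompress, then unpickle.
# Only call pickle.loads() on data that comes from a source you trust.
uncompressed = bz2.decompress(compressed)
my_unpickled_string = pickle.loads(uncompressed)

print(len(my_unpickled_string) == len(my_string))
print(my_unpickled_string == my_string)
```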

01:57 Save this, and try to run it! I’m going to have to reactivate my virtual environment.

02:06 Then go ahead and run pickle_compression. All right! You can see right here the length of the first string was about 1200 characters, the compressed one was only 723 bytes, and then the decompression worked here as well.

02:22 To get a better view of what you just did, you can see that you took this ridiculously long string up here,

02:29 you were able to dump that into a pickled object, you then compressed it using bz2.compress(), which got you the shorter byte string, and then decompressing it was just a matter of using bz2.decompress(), and then you deserialized it using pickle.loads() on the decompressed data.

02:50 So in this case, it probably wasn’t really worth it to compress this relatively small amount of data up here, and compression does incur a speed penalty, but depending on what you’re trying to store and what constraints you have for your project, it’s good to know that it’s pretty straightforward to use bz2 with pickle to compress your objects. All right!
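
To get a feel for that trade-off, you could try a quick experiment like the sketch below. The sizes() helper and both sample payloads are made up for illustration, and the timings will vary from machine to machine.

```python
import bz2
import pickle
import timeit

small = "hi"                      # tiny payload
large = "spam and eggs " * 1000   # larger, highly repetitive payload


def sizes(obj):
    """Return (pickled size, pickled-and-compressed size) in bytes."""
    pickled = pickle.dumps(obj)
    return len(pickled), len(bz2.compress(pickled))


small_sizes = sizes(small)
large_sizes = sizes(large)

# bzip2 adds fixed overhead, so a tiny payload ends up bigger after
# compression, while the repetitive payload shrinks considerably.
print("small:", small_sizes)
print("large:", large_sizes)

# Rough timing of pickling alone versus pickling plus compression.
plain = timeit.timeit(lambda: pickle.dumps(large), number=100)
with_bz2 = timeit.timeit(lambda: bz2.compress(pickle.dumps(large)), number=100)
print(f"pickle only: {plain:.4f} s, pickle + bz2: {with_bz2:.4f} s")
```

If the compressed size isn't noticeably smaller for the kind of data you store, the extra time spent compressing probably isn't buying you anything.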

03:12 So now that you know how to do that, in the next lesson, you’re going to see an example of a security concern when using pickled objects, and why you need to differentiate between trusted and untrusted data.
