Compression of Pickled Objects
00:00
In this lesson, you’re going to see how you can easily compress and decompress your pickled objects using the bz2
module in the standard library. While pickled objects are already compact, you can make them even smaller by using bzip2
compression.
00:16
So to see this, go ahead and import pickle, and then import bz2. To get something to pickle, go ahead and define a new string, so I’m just going to say my_string, and then I’m going to paste in a bunch of random text that’s not quite a Lorem ipsum, but it came from the first result on Google for a Lorem ipsum. So let me make a multiline string here, and then paste that in.
00:44
You can see there’s quite a bit there. Okay! Go ahead and pickle it, so you can say pickled is going to equal pickle.dumps(), which dumps this into a byte string, and pass in my_string.
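As a rough sketch of the steps so far, the script might start like this. The text below is a short placeholder, not the passage pasted in during the lesson, so the lengths it prints will differ from the ones shown in the video.

```python
import pickle

# Placeholder text (an assumption); the lesson pastes in a much longer passage.
my_string = """Lorem ipsum dolor sit amet, consectetur adipiscing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna aliqua."""

# pickle.dumps() returns the serialized object as bytes
# instead of writing it to a file.
pickled = pickle.dumps(my_string)

print(type(pickled))
print(len(pickled))
```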
01:00
And then you can say compressed = bz2.compress(). Pass in pickled. And then to see what this looks like, you can go ahead and print out the length of my_string, and then print out the length of compressed, which is now a byte string.
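A minimal sketch of the compression step could look like the following. The sample text is an assumption: it repeats one sentence, and repeated text compresses far better than ordinary prose, so don't read the printed sizes as typical.

```python
import bz2
import pickle

# Placeholder text (an assumption): one sentence repeated 20 times.
my_string = (
    "Lorem ipsum dolor sit amet, consectetur adipiscing elit, "
    "sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. "
) * 20

pickled = pickle.dumps(my_string)

# bz2.compress() takes bytes and returns the compressed bytes.
compressed = bz2.compress(pickled)

print(len(my_string))   # characters in the original string
print(len(pickled))     # bytes after pickling
print(len(compressed))  # bytes after compression
```

Note that len() counts characters for my_string but bytes for pickled and compressed, so comparing pickled against compressed is the fairer measure of what the compression itself saved.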
01:22
And then just to make sure that nothing broke, you can say something like uncompressed = bz2.decompress(compressed).
01:34
And then, to finish the round trip, you can say my_unpickled_string is just going to be pickle.loads(), which loads from the byte string, and pass in uncompressed.
01:46
And to see that, you can print out the length of my_unpickled_string, which should equal the length of the original my_string from before.
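Put together, the whole round trip might look something like this sketch. The sample text is again a placeholder rather than the passage used in the lesson. Comparing the strings directly, not just their lengths, is a slightly stronger check that nothing broke.

```python
import bz2
import pickle

# Placeholder text (an assumption), standing in for the lesson's long passage.
my_string = (
    "Lorem ipsum dolor sit amet, consectetur adipiscing elit, "
    "sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. "
) * 20

# Serialize, then compress.
pickled = pickle.dumps(my_string)
compressed = bz2.compress(pickled)

# Decompress, then deserialize.
uncompressed = bz2.decompress(compressed)
my_unpickled_string = pickle.loads(uncompressed)

print(len(my_string))
print(len(compressed))
print(len(my_unpickled_string))
print(my_unpickled_string == my_string)
```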
01:57
Save this, and try to run it! I’m going to have to reactivate my virtual environment.
02:06
Then go ahead and run pickle_compression. All right! You can see right here the length of the first string was about 1200 characters, the compressed one was only 723 bytes, and then the decompression worked here as well.
02:22
To get a better view of what you just did, you can see that you took this ridiculously long string up here,
02:29
you were able to dump that into a pickled object, you then compressed it using bz2, which got you the shorter byte string, and then uncompressing it was just a matter of using bz2.decompress(), and then you deserialized it using pickle.loads() on the uncompressed data.
02:50
So in this case, it probably wasn’t worth it to compress this relatively small amount of data up here, and compression does incur a speed penalty, but depending on what you’re trying to store and what constraints you have for your project, it’s good to know that it’s pretty straightforward to use bz2 with pickle to compress your objects. All right!
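If you want a feel for that speed penalty, one rough way to measure it is to time the two steps separately. This is only a sketch: the list of integers is made-up sample data, not something from the lesson, and the timings will vary from machine to machine and run to run.

```python
import bz2
import pickle
import time

# Hypothetical sample data (an assumption): a list of 100,000 integers.
data = list(range(100_000))

# Time the pickling step on its own.
start = time.perf_counter()
pickled = pickle.dumps(data)
pickle_seconds = time.perf_counter() - start

# Time the extra compression step.
start = time.perf_counter()
compressed = bz2.compress(pickled)
compress_seconds = time.perf_counter() - start

print(f"pickle.dumps(): {pickle_seconds:.4f} s, {len(pickled)} bytes")
print(f"bz2.compress(): {compress_seconds:.4f} s, {len(compressed)} bytes")
```

Whether the smaller size is worth the extra time depends on your data and on whether storage or speed is the tighter constraint in your project.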
03:12
So now that you know how to do that, in the next lesson, you’re going to see an example of a security concern when using pickled objects, and why you need to differentiate between trusted and untrusted data.