Using __getstate__ and __setstate__
In the last lesson, you saw how you can use
dill to extend the capabilities of
pickle. This doesn’t always work, however, as there are still some cases where
dill can’t serialize a certain data type.
When this occurs, you can sometimes exclude things from the serialization process. In this lesson, you’re going to learn how to exclude items from serialization and then reinitialize them when deserializing. When you call
pickle on an object, it looks for the
.__getstate__() dunder method to determine what needs to be serialized.
If that isn’t defined, it will use the default
.__dict__ attribute to determine what needs to be stored. To see this, I’ve created a new Python script called
custom_pickling.py. In your text editor, go ahead and import
pickle and then define a new class called
foobar. You can then make a constructor, so define your
.__init__() method, which will take
self. And then set the
.a property to an integer, the
.b property to a string of
"test", and then get a little complicated and set
.c equal to a lambda expression.
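Here’s a minimal sketch of the class as described so far, just to confirm that the lambda is what trips pickle up. The value 35 for .a is a placeholder chosen for illustration, and the exact exception type differs between Python versions, so the sketch catches the likely candidates rather than assuming one.

```python
import pickle


class foobar:
    def __init__(self):
        self.a = 35                 # placeholder integer, chosen for illustration
        self.b = "test"
        self.c = lambda x: x * x    # pickle can't serialize this attribute


instance = foobar()

try:
    pickle.dumps(instance)
    pickled_ok = True
except (pickle.PicklingError, AttributeError, TypeError) as error:
    # The exception type varies across Python versions.
    pickled_ok = False
    print(f"Pickling failed: {error}")
```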
So from before,
pickle shouldn’t have any trouble serializing
.a or
.b, because it knows how to handle integers and strings.
But we know that there’s a problem with this lambda. To get around this, you can define a new method called
.__getstate__(). So with two underscores, type in
__getstate__(), and this will also take
self. And here you want to take the properties to be serialized and return them.
You can say something like
attributes and set this equal to
self, and then access this
.__dict__ property, and from
.__dict__ call the
.copy() method, so you’re working on a copy rather than on the instance’s own attributes. In its default behavior,
pickle is going to look at this
self.__dict__ to return all of the data that needs to be serialized. Because the
.c property is a
lambda function and cannot be serialized, you can delete that. So say
del, go to
attributes, and get rid of
'c' like so.
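It’s worth seeing why you’d want to work on a copy of the dictionary here. This is a small sketch, using a hypothetical stand-in class called Example, that shows how deleting a key from a copy leaves the live object alone, while deleting it from .__dict__ itself removes the attribute from the object.

```python
class Example:  # hypothetical stand-in class, not part of the lesson's script
    def __init__(self):
        self.a = 1
        self.c = "placeholder for something unpicklable"


obj = Example()

# Deleting from a copy leaves the object untouched.
snapshot = obj.__dict__.copy()
del snapshot["c"]
has_c_after_copy_delete = hasattr(obj, "c")

# Deleting from the real __dict__ removes the attribute from the object.
alias = obj.__dict__
del alias["c"]
has_c_after_alias_delete = hasattr(obj, "c")

print(has_c_after_copy_delete, has_c_after_alias_delete)
```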
And now you can
return attributes. To see how this works, go ahead and make an instance of the
foobar class and assign it to a variable. Then
my_pickle_string is going to equal
pickle.dumps(), so you’ll dump to a bytes string and pass in the
foobar instance. Now you can go ahead and deserialize it, so say
my_new_instance and set this equal to
pickle.loads(), so this time you’ll load from a string and pass in
my_pickle_string.
And to see what this looks like, go ahead and print
my_new_instance and access
.__dict__ off of it. Okay! Before running this, let’s go ahead and take a look at what happened here.
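Put together, the script so far might look something like the sketch below. The value 35 and the variable name my_foobar_instance are assumptions made for illustration, and the dictionary is copied before 'c' is deleted so the original instance keeps its lambda.

```python
import pickle


class foobar:
    def __init__(self):
        self.a = 35                 # placeholder integer, chosen for illustration
        self.b = "test"
        self.c = lambda x: x * x

    def __getstate__(self):
        # Copy first so that deleting 'c' doesn't touch the live instance.
        attributes = self.__dict__.copy()
        del attributes["c"]
        return attributes


my_foobar_instance = foobar()       # variable name assumed for this sketch
my_pickle_string = pickle.dumps(my_foobar_instance)
my_new_instance = pickle.loads(my_pickle_string)

print(my_new_instance.__dict__)
```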
You’ve defined a new custom class that contains a property that
pickle cannot handle.
You then modified what gets pickled by defining the
.__getstate__() method and removing the attribute that
pickle can’t handle, and then returning what’s left.
So if you run this, try and think about what you expect to see. I’m going to save it, and then I’m going to run
python custom_pickling.py. And like you may have expected, you can see that the
'a' property and the
'b' property made it over. The
'c' property that contained the
lambda function is nowhere to be found. So while this works, you did lose a pretty significant part of your custom class, and if you don’t want this to happen, you can get around this by reinitializing the property with the
.__setstate__() dunder method. If this is present, this method is called when deserializing the object and can modify what comes out. So going back to your custom class,
go ahead and define a new method called
.__setstate__(), which will take
self and state. Inside here, you’ll take
self.__dict__ and set this equal to
state, and now you can reinitialize that property by saying
self.c is equal to the lambda expression that you defined earlier, so
lambda x: x * x.
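As a rough sketch, the class with both methods could look like this. Again, 35 is just a placeholder for .a, and calling .c with 4 at the end is an extra check added here to show the restored lambda actually works.

```python
import pickle


class foobar:
    def __init__(self):
        self.a = 35                 # placeholder integer, chosen for illustration
        self.b = "test"
        self.c = lambda x: x * x

    def __getstate__(self):
        attributes = self.__dict__.copy()
        del attributes["c"]
        return attributes

    def __setstate__(self, state):
        # state is whatever __getstate__() returned when the object was pickled.
        self.__dict__ = state
        # Rebuild the attribute that was left out of the pickle.
        self.c = lambda x: x * x


my_new_instance = pickle.loads(pickle.dumps(foobar()))

print(my_new_instance.__dict__)
print(my_new_instance.c(4))
```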
So now, when you deserialize the object, pickle looks up this method on the class and calls it. The method itself isn’t stored in the pickled data, only a reference to the class is. So
state here will contain the
.a and
.b properties, and then you’ve re-added
.c right here.
So let’s save this and see what comes out! Okay. So you can see you still have the
'a' and
'b' properties, and now you have
'c', which is representing this function over here.
This may seem strange: you can’t save a
lambda function as a property of a custom class, but you can get it back by reinitializing it when you deserialize. One way to think about this is that
pickle doesn’t know how to handle the lambda expression itself, but the class definition carries the instructions for redefining it, and pickle runs those instructions when it loads the object.
Now keep in mind, because this method is run every time this is deserialized, there are some security concerns here because you’re running code. So go ahead and add in a
print() statement here, like
'I am deserializing', and then save it and rerun it.
And you’ll see that when running the script,
I am deserializing printed out. So anything that gets put into the
.__setstate__() method is executed by whatever is deserializing it.
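To make that concrete, here’s a small sketch that counts how often the method runs. The load_count attribute is a hypothetical counter added only for this demonstration, and the class is trimmed down to two plain attributes to keep the example short.

```python
import pickle


class foobar:
    load_count = 0  # hypothetical counter, added only for this demonstration

    def __init__(self):
        self.a = 35                 # placeholder integer, chosen for illustration
        self.b = "test"

    def __setstate__(self, state):
        # This body runs in whichever process calls pickle.loads().
        foobar.load_count += 1
        print("I am deserializing")
        self.__dict__ = state


data = pickle.dumps(foobar())

# Each load triggers __setstate__() again.
pickle.loads(data)
pickle.loads(data)

print(foobar.load_count)
```

Every call to pickle.loads() runs the method body again, which is why you should only unpickle data that comes from a source you trust.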
We’ll talk about this in a little bit more detail in a later lesson. So anyway, you should have a pretty good idea of how you can get around some of these unserializable data types by using the
.__getstate__() and
.__setstate__() methods. In the next video, you’re going to see how you can compress your serialized output when using
pickle.