Serializing Objects in Python
00:00 In this lesson, you’re going to learn what serialization is and a couple of different methods for serializing data in Python. Serialization is the process of converting a data structure into a linear byte stream.
00:12 This means taking complex data and encoding it into a stream of bytes in a way that allows a second operation to take that data and convert it back into the original data.
This can be useful for storing data or sending it over a network. Python has a number of built-in modules for this process:
marshal is the oldest of the three serialization modules. It’s primarily used to read and write compiled bytecode from Python modules. If you’ve ever seen .pyc files pop up in your working directory or its __pycache__ folder when importing modules, that’s marshal working behind the scenes.
The biggest takeaway is not to use marshal. It’s mainly used by the interpreter itself, and its format can change between Python versions in ways that would break your code.
json is the newest of the three serialization modules. It produces standard JSON output, which means that it’s human-readable and works very well with other languages that can parse JSON.
An issue with json is that it only works with certain data types, and you may have seen a TypeError pop up when trying to use the module to serialize more complex objects.
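As a rough illustration of that limitation, here is a minimal sketch. The Point class and the sample dictionary are hypothetical and are not part of the lesson; they only show a supported type succeeding and a custom object failing.

```python
import json


class Point:
    """Hypothetical custom class, used only for illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


# Dicts, lists, strings, and numbers map directly onto JSON.
text = json.dumps({"name": "origin", "coords": [0, 0]})
restored = json.loads(text)

# A custom object has no default JSON representation, so dumps() raises TypeError.
try:
    json.dumps(Point(1, 2))
    error_message = None
except TypeError as exc:
    error_message = str(exc)
```

Here text is an ordinary string that any JSON parser could read, while error_message holds the complaint that the Point object is not JSON serializable.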
This is where pickle comes into play. pickle serializes data in a binary format, which means that it’s not human-readable. A benefit of pickle is that it works out of the box with many Python data types, including custom ones that you define, and it’s fast.
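A minimal sketch of a pickle round trip might look like the following. The record dictionary is a made-up example, chosen because it contains types (a set, a tuple, bytes, and a datetime) that json would reject without extra work.

```python
import pickle
from datetime import datetime

# Hypothetical sample data mixing several built-in Python types.
record = {
    "tags": {"a", "b"},                # set
    "position": (3, 4),                # tuple
    "payload": b"\x00\x01",            # bytes
    "created": datetime(2020, 1, 1),   # datetime
}

# dumps() returns a bytes object in pickle's binary format.
data = pickle.dumps(record)

# loads() rebuilds an equal, but separate, Python object from those bytes.
restored = pickle.loads(data)
```

The restored dictionary compares equal to record with the original types preserved, including the tuple and the set.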
So, bottom line: of the three built-in Python modules for serialization, don’t use marshal, use json if you need human-readable output or you need your output to work in other languages, and for everything else, go ahead and use pickle. All right! In the next lesson, you’re going to see how to use pickle to serialize a custom class.