Decoding Custom Types From JSON
In this video, you’ll learn how to deserialize a non-serializable type given in a JSON file.
We can represent a complex object in JSON like this:
{
    "__complex__": true,
    "real": 42,
    "imaginary": 36
}
If we let the load() method deserialize this, we’ll get a Python dict instead of our desired complex object. That’s because JSON objects deserialize to Python dict. We can write a custom decoder function that will read this dictionary and return our desired complex object.
def decode_complex(dct):
    if "__complex__" in dct:
        return complex(dct["real"], dct["imaginary"])
    else:
        return dct
Now, we need to read our JSON file and deserialize it. We can use the optional object_hook argument to specify our decoding function.
with open("complex_data.json") as complex_data:
    z = json.load(complex_data, object_hook=decode_complex)
Now, if we print the type of z, we’ll see:
<class 'complex'>
We have now deserialized a complex object from a JSON file!
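To make the difference concrete, here is a minimal sketch that parses the same JSON twice, once without the hook and once with it. The JSON is inlined as a string and passed to json.loads() so the example runs without a complex_data.json file; that substitution is an assumption made for illustration.

```python
import json


def decode_complex(dct):
    # Rebuild a complex number only when the marker key is present.
    if "__complex__" in dct:
        return complex(dct["real"], dct["imaginary"])
    return dct


# Same content as the JSON shown above, inlined so the sketch is self-contained.
text = '{"__complex__": true, "real": 42, "imaginary": 36}'

plain = json.loads(text)                               # no hook: a dict
hooked = json.loads(text, object_hook=decode_complex)  # with hook: a complex

print(type(plain))   # <class 'dict'>
print(type(hooked))  # <class 'complex'>
print(hooked)        # (42+36j)
```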
Congratulations, you made it to the end of the course! What’s your #1 takeaway or favorite thing you learned? How are you going to put your newfound skills to use? Leave a comment in the discussion section and let us know.
00:00 Welcome back to our series on working with JSON data in Python. In this final video, we’re going to learn how to deserialize custom types like the complex type we worked with in the last video.
00:13 I’m starting with a blank Python program here in Visual Studio Code, and I’ve also got a JSON file. Let’s pretend that I obtained this file over the internet, and I know nothing about it.
00:25 All I know is that I need to turn this data into a valid Python object of whatever type that is. The problem is, if I deserialize this like normal, I’m just going to get a Python dict back. And worse, I’ll think that’s fine because a dict is a Python object.
00:44 But the person who actually serialized this original JSON file intends for this data to be reconstructed as a complex object like we saw in the last video.
00:54 But how am I supposed to know that? Python doesn’t know that either. We’re missing metadata, which is information about the type of data that we’re supposed to be encoding.
01:04 So I’ll edit this JSON file, and I’ll add a key at the very beginning called "__complex__", and I’ll give this a value of true.
01:14 Now we can create a decoder function that will check for this key, and if it’s there, it’ll know to create a complex object out of the rest of this data. So let’s jump back to our Python program, and as usual, I’ll start by importing json.
01:30 We want to define a custom decoding function, so we’ll type def decode_complex() with the parameter of dct, short for dictionary.
01:42 Now we’ll say if "__complex__" in dct: (that’s the key we added before), and we want to return a new complex object with arguments of dct["real"] and dct["imaginary"].
01:58 Finally, if "__complex__" was not in the dictionary, then this is not supposed to be a complex object, and so we’ll just return the original dictionary object back to the default decoder to work with. Just like with the serializer, the deserializer will present us with the option to use this method for deserialization before it uses its default algorithm.
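One detail is easier to see in code: the json module calls the object_hook for every JSON object it decodes, nested ones included, and uses whatever the hook returns in place of the dict. The sketch below is only an illustration; the keys z1 and z2 and their values are made up to show a document that holds several complex numbers.

```python
import json


def decode_complex(dct):
    # Called once per JSON object; inner objects are decoded before outer ones.
    if "__complex__" in dct:
        return complex(dct["real"], dct["imaginary"])
    return dct


# Made-up document with two nested, marked objects.
text = """
{
    "z1": {"__complex__": true, "real": 46, "imaginary": 13},
    "z2": {"__complex__": true, "real": 3, "imaginary": 4}
}
"""

data = json.loads(text, object_hook=decode_complex)
print(data)  # {'z1': (46+13j), 'z2': (3+4j)}
```

The outer object has no "__complex__" key, so the hook returns it unchanged, while each inner object comes back as a complex number.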
02:21 Let’s read from the file and see it in action. We’ll type with open("complex_data.json") as complex_data: and then we want to get the actual contents of the document, so we’ll say data = complex_data.read(), which will return us a string of the file contents.
02:43 And now we can try to deserialize the data by saying z = json.loads() (because our data is a string), and we’ll pass in the data from the file.
02:57 We also need to give it an object hook, which is how the method will know what our custom decoder function is. So I’ll say object_hook=decode_complex.
03:10 Finally, we’ll print out the type() of z, followed by the actual object itself. I’ll go ahead and run this code and we’ll see here that z is now an object of type complex with a value of 42+36j. We’ve now deserialized a complex object given to us in a JSON file, and now we can do whatever we want with it in our Python code. And that’s it!
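The video reads the file into a string and calls json.loads(), while json.load() accepts the open file object directly. Here is a minimal sketch showing that both routes give the same result. The temporary file stands in for complex_data.json and is an assumption made so the code runs anywhere.

```python
import json
import os
import tempfile


def decode_complex(dct):
    if "__complex__" in dct:
        return complex(dct["real"], dct["imaginary"])
    return dct


with tempfile.TemporaryDirectory() as folder:
    # Stand-in for the complex_data.json file edited in the video.
    path = os.path.join(folder, "complex_data.json")
    with open(path, "w") as new_file:
        new_file.write('{"__complex__": true, "real": 42, "imaginary": 36}')

    # As in the video: read the text, then deserialize the string with loads().
    with open(path) as complex_data:
        data = complex_data.read()
        z = json.loads(data, object_hook=decode_complex)

    # Equivalent shortcut: hand the open file object to load().
    with open(path) as complex_data:
        z_from_file = json.load(complex_data, object_hook=decode_complex)

print(type(z))           # <class 'complex'>
print(z)                 # (42+36j)
print(z == z_from_file)  # True
```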
03:37 We’ve covered all of this course material, and now you know how to work with JSON data in your Python programs. You learned what JSON is, where it’s used, and how to serialize and deserialize both built-in types and custom types that are inherently non-serializable.
03:56 You also learned how to manipulate the data coming from a JSON file and how to derive meaning from that data, which we did with the TODO list. Just to quickly recap, JSON is a standardized format commonly used to transfer data.
04:11 It’s used in many APIs and databases. We serialize data to JSON format and then deserialize it to recreate the original data in a form that we can use, like a Python object.
04:24 The built-in Python module json is capable of all of this, but we need to help it out with our custom encoding and decoding functions if the object we’re trying to encode is not natively serializable.
04:39 If you’re looking for more practice, search online for some APIs that expose JSON data. There are plenty of them and you can use them to create some really cool Python programs.
04:50 I’m Austin Cepalia with realpython.com and I wish you the best of luck in your programming endeavors. Happy coding!
ChrisF on March 31, 2019
Excellent course, really loved it
Raghunandana SK on April 1, 2019
Yep, this really takes some time to understand. But it’s explained well for sure.
Claudemiro on April 14, 2019
Would be nice to have some examples with nested custom types.
SamR on June 17, 2019
Very helpful course. Thanks!
Fahim on July 9, 2019
Well explained. Thanks
Pygator on Aug. 24, 2019
What do you mean by apis that expose json? Otherwise fantastic course! Can’t wait to try this for my own objects and data.
Austin Cepalia RP Team on Aug. 27, 2019
When I say “APIs that expose JSON”, I’m saying “APIs that return some data in the form of JSON”. For example, if our program needed the ability to see weather forecasts, we could use a weather API. When our program sends a request to the API server, it may return data in JSON format. This course shows you how to interpret JSON data and use it within your Python programs.
Pygator on Aug. 27, 2019
Thanks. Great explanations! JSON has always been so mysterious.
Vincenzo Fiorentini on Oct. 10, 2019
cool. I guess in your example one might also check if complex is true, like
cmp = "complex"
if cmp in dct and dct[cmp]:
    return complex(dct["real"], dct["imaginary"])
else:
    return dct
looks like it is ‘false’ and ‘true’ in json instead of False and True. is that so? thanks.
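On the true/false question, a quick sketch of how the standard json module maps JSON's lowercase literals to Python's capitalized constants and back:

```python
import json

# JSON's lowercase literals become Python constants when deserializing...
as_true = json.loads("true")
as_false = json.loads("false")
as_none = json.loads("null")
print(as_true, as_false, as_none)  # True False None

# ...and are written back in lowercase when serializing.
serialized = json.dumps({"__complex__": True})
print(serialized)  # {"__complex__": true}
```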
Doug Creighton on Nov. 3, 2019
Is there a course that goes over best practices to convert an API string or JSON to pandas or a database structure and then back to JSON?
Andrew E on Dec. 13, 2019
How would you read in multiple complex numbers? That is, if your data json file looked like this:
{
    "z1": {
        "complex": true,
        "real": 46,
        "complex": 13,
    },
    "z2": {
        "complex": true,
        "real": 3,
        "complex": 4,
    }
}
mikesult on Feb. 20, 2020
Great course, I really liked working with the custom type encoder/decoder. I usually type the exercises out by hand and I have been using single quotes instead of double quotes. One thing I learned was that JSON requires double quotes, single quotes don’t work. You might have mentioned that and I missed it (I think I read that somewhere and already had forgotten it). That made me fumble around for a while when I wrote out the complex_data.json using single quotes and JSON_Ex5.py didn’t work. Those are the kind of things that can trip me a bit. I appreciate learning about working with JSON. Thank you.
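For anyone who hits the same snag, here is a small sketch of the quoting rule. JSON strings and keys must use double quotes, so the json module rejects a document written with single quotes by raising json.JSONDecodeError. The two sample strings are made up for illustration.

```python
import json

valid = '{"real": 42, "imaginary": 36}'    # double quotes inside: valid JSON
invalid = "{'real': 42, 'imaginary': 36}"  # single quotes inside: not JSON

parsed = json.loads(valid)
print(parsed)  # {'real': 42, 'imaginary': 36}

try:
    json.loads(invalid)
    rejected = False
except json.JSONDecodeError as error:
    rejected = True
    print("Invalid JSON:", error)
```

Note that the quotes Python uses around the string literal do not matter; only the quotes inside the JSON text do.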
rgusaas on March 8, 2020
Good course Austin. It would be helpful if you described an example of when complex types come into play. What are some common uses for complex types and why use them at all?
Also, Links to any references for further study would make this a really solid tutorial.
zeroeum on April 10, 2020
I think the complex number example of non serializable, custom type encoding/decoding wasn’t the best choice. Went from a very useful/trivial example using JSON from API’s straight to a non trivial use case.
Muthukumar Kusalavan on May 7, 2020
Thank you very much for the tutorial, Mr.Austin
J on May 8, 2020
I am getting an error that I dont understand when trying to do this example.
Traceback (most recent call last):
  File "c:/Users/jcartwright/Downloads/TEST.PY", line 16, in <module>
    z = json.load(complex_data, object_hook=decode_complex)
  File "C:\Users\jcartwright\AppData\Local\Continuum\anaconda3\lib\json\__init__.py", line 296, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "C:\Users\jcartwright\AppData\Local\Continuum\anaconda3\lib\json\__init__.py", line 361, in loads
    return cls(**kw).decode(s)
  File "C:\Users\jcartwright\AppData\Local\Continuum\anaconda3\lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\jcartwright\AppData\Local\Continuum\anaconda3\lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
J on May 8, 2020
What you have in the video is not what is in the text below it, and I have not been able to get what’s in your video to work or the stuff below it. In the video you have loads and below you have load, so what’s going on here? How come no one has said anything about this yet?
Ricky White RP Team on May 8, 2020
Hi @cartwrightjarrod. Can you share your code so we can see how we can help you fix your error?
tsusadivyago on May 13, 2020
the custom json type encoding and decoding is an addition for me
datascigit on July 10, 2020
Great Course. It helped me in using datasets via API for my data science projects. Thanks.
Ghani on Oct. 31, 2020
Very good course; thank you!
Deepak Nallajerla on Jan. 13, 2021
Great course, it would be helpful if some use case for the complex type were provided.
gerardrealpython on May 13, 2021
In the last lesson, it is better to use:
if "__complex__" in dct and dct["__complex__"]:
instead of
if "__complex__" in dct:
So that when __complex__ is False, it is not decoded as a complex object.
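A minimal sketch of that stricter check. It uses dct.get("__complex__") is True, a slightly different test from the one suggested above, so that only a literal JSON true triggers decoding; treat it as one possible variant and not as the course's own code.

```python
import json


def decode_complex(dct):
    # Decode only when the marker is present *and* set to true.
    if dct.get("__complex__") is True:
        return complex(dct["real"], dct["imaginary"])
    return dct


flagged = json.loads(
    '{"__complex__": true, "real": 42, "imaginary": 36}',
    object_hook=decode_complex,
)
unflagged = json.loads(
    '{"__complex__": false, "real": 42, "imaginary": 36}',
    object_hook=decode_complex,
)

print(flagged)    # (42+36j)
print(unflagged)  # {'__complex__': False, 'real': 42, 'imaginary': 36}
```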
dudutt on June 15, 2021
After watching the decoder class, I was wondering if it is possible to use a class to decode, like we’ve done with the encoder.
The difference is that we have to override the __init__ and object_hook methods, instead of default.
Here is my code:
class Decoder(json.JSONDecoder):
    def __init__(self, *args, **kwargs):
        # Register the method below as the hook used by the base decoder.
        json.JSONDecoder.__init__(self, object_hook=self.object_hook, *args, **kwargs)

    def object_hook(self, dct):
        if 'name' in dct and 'age' in dct:
            return Person(dct['name'], dct['age'])
        else:
            return dct
Then you call:
with open("singers.json", "r") as singers_file:
    singers_json = json.load(singers_file, cls=Decoder)
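Here is a self-contained sketch of this class-based approach. The Person dataclass, the PersonDecoder name, and the sample names and ages are made up for illustration, and json.loads() on an inline string replaces the singers.json file so the code runs on its own.

```python
import json
from dataclasses import dataclass


@dataclass
class Person:  # hypothetical stand-in for the course's Person class
    name: str
    age: int


class PersonDecoder(json.JSONDecoder):
    def __init__(self, **kwargs):
        # Register our hook with the base decoder.
        super().__init__(object_hook=self.decode_person, **kwargs)

    @staticmethod
    def decode_person(dct):
        if "name" in dct and "age" in dct:
            return Person(dct["name"], dct["age"])
        return dct


# Made-up sample data in place of singers.json.
text = '[{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]'
singers = json.loads(text, cls=PersonDecoder)
print(singers)  # [Person(name='Alice', age=30), Person(name='Bob', age=25)]
```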
shiningdonkey on June 27, 2021
Thanks @Austin for the intro course about JSON and python’s json module. I am very excited to have learned about custom json encoder and decoder.
Is it possible to build these into my class definition? So that the custom cls/object_hook arguments do not need to be supplied.
rcole1025 on July 20, 2021
Overall a good primer on working with JSON data in python. However, I would change the section on Custom Objects to Complex Objects as the lesson starts off using a Class Person custom type but never comes back to how one would serialize/deserialize this custom object type. On the plus side, this leads me to search for how to serialize a class object and I learned several methods of serializing JSON including customizing JSONEncoder class (overriding the default method) and using the pickle library.
techsukenik on Sept. 1, 2021
When serializing/deserializing heterogeneous objects (i.e. Person, Student), how do I avoid having one encoder/decoder function? This will allow new encoder/decoder functions for new classes without changing the code of a router type decoder/encoder function.
Is it possible to bypass the default encoder option and directly have Python have the encoder function?
Bartosz Zaczyński RP Team on Sept. 2, 2021
@techsukenik If you find the json module’s API inconvenient to work with, then you always have the option to use the adapter pattern by wrapping the deserialized dictionary with custom code at a higher level of abstraction:
person_dict = json.loads("""\
{
    "firstName": "Joe",
    "lastName": "Doe",
    "birthdate": "1978-05-12"
}
""")
person = Person(
    first_name=person_dict["firstName"],
    last_name=person_dict["lastName"],
    birth_date=date.fromisoformat(person_dict["birthdate"]))
Calling json.loads() returns a plain dictionary that you’re free to cherry-pick elements from and transform them as you wish. That way, you’ll have total control over what happens and when.
If you don’t mind keeping the serialization and deserialization logic together with your model class, then you might combine them:
import json
from datetime import date
from typing import NamedTuple

class Person(NamedTuple):
    first_name: str
    last_name: str
    birth_date: date

    @staticmethod
    def from_json(text: str) -> "Person":
        dict_ = json.loads(text)
        return Person(first_name=dict_["firstName"],
                      last_name=dict_["lastName"],
                      birth_date=date.fromisoformat(dict_["birthdate"]))

    def to_json(self) -> str:
        return json.dumps({
            "firstName": self.first_name,
            "lastName": self.last_name,
            "birthdate": self.birth_date.isoformat()
        })
Alternatively, you could come up with a more clever solution that would work universally across many model classes. For example, you’d store the information about a specific data type in JSON, and then use that information to pick the right deserializer like in the video.
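One way such a universal approach might look, as a rough sketch only: each serialized object carries a type tag, and a registry maps tags to decoder functions, so supporting a new class means adding a registry entry and leaving the hook untouched. The "__type__" key, the DECODERS registry, and the decode_tagged function are hypothetical names chosen for this illustration.

```python
import json
from datetime import date

# Hypothetical registry: maps a "__type__" tag stored in the JSON to a decoder.
DECODERS = {
    "complex": lambda d: complex(d["real"], d["imaginary"]),
    "date": lambda d: date.fromisoformat(d["value"]),
}


def decode_tagged(dct):
    # Look up a decoder by tag; untagged or unknown objects stay plain dicts.
    tag = dct.get("__type__")
    decoder = DECODERS.get(tag) if isinstance(tag, str) else None
    return decoder(dct) if decoder else dct


text = """
{
    "z": {"__type__": "complex", "real": 42, "imaginary": 36},
    "birthdate": {"__type__": "date", "value": "1978-05-12"},
    "plain": {"a": 1}
}
"""

data = json.loads(text, object_hook=decode_tagged)
print(data["z"])          # (42+36j)
print(data["birthdate"])  # 1978-05-12
print(data["plain"])      # {'a': 1}
```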
techsukenik on Sept. 2, 2021
@Bartosz Zaczyński …
For the JSON class, is it possible to expand it to cover json.JSONEncoder and json.JSONDecoder?
See .. alexis-gomes19.medium.com/custom-json-encoder-with-python-f52c91b48cd2
Using the json.JSONEncoder, a universal ObjectEncoder can be coded that automatically supports custom objects. stackoverflow.com/questions/3768895/how-to-make-a-class-json-serializable
import json
import inspect

class ObjectEncoder(json.JSONEncoder):
    def default(self, obj):
        if hasattr(obj, "to_json"):
            return self.default(obj.to_json())
        elif hasattr(obj, "__dict__"):
            d = dict(
                (key, value)
                for key, value in inspect.getmembers(obj)
                if not key.startswith("__")
                and not inspect.isabstract(value)
                and not inspect.isbuiltin(value)
                and not inspect.isfunction(value)
                and not inspect.isgenerator(value)
                and not inspect.isgeneratorfunction(value)
                and not inspect.ismethod(value)
                and not inspect.ismethoddescriptor(value)
                and not inspect.isroutine(value)
            )
            return self.default(d)
        return obj
Example
class C(object):
    c = "NO"

    def to_json(self):
        return {"c": "YES"}

class B(object):
    b = "B"
    i = "I"

    def __init__(self, y):
        self.y = y

    def f(self):
        print("f")

class A(B):
    a = "A"

    def __init__(self):
        self.b = [{"ab": B("y")}]
        self.c = C()

print(json.dumps(A(), cls=ObjectEncoder, indent=2, sort_keys=True))
Result:
{
  "a": "A",
  "b": [
    {
      "ab": {
        "b": "B",
        "i": "I",
        "y": "y"
      }
    }
  ],
  "c": {
    "c": "YES"
  },
  "i": "I"
}
Bartosz Zaczyński RP Team on Sept. 3, 2021
@techsukenik I took a stab at what I think you’re trying to achieve here and created a GitHub repository to showcase my approach. Hopefully, this will give you some inspiration or clear your doubts.
techsukenik on Sept. 3, 2021
@Bartosz Zaczyński Thanks for spending the time writing the code. I look forward to doing a deep dive on the code.
chestnutj2000 on July 16, 2024
Why did the last element of the output print as 36j? Where did the “j” character come from, at 3:56? The JSON only had two integers. Also, why did it print the two elements separated by a “+”? Thanks much!
Bartosz Zaczyński RP Team on July 16, 2024
@jtchestnut It’s the default string representation of complex numbers in Python. The letter “j” indicates the imaginary part of a complex number to avoid confusing it with regular addition:
>>> 42 + 36
78
>>> 42 + 36j
(42+36j)
>>> type(42 + 36j)
<class 'complex'>
chestnutj2000 on July 17, 2024
Thank you for pointing me in the right direction here, Bartosz!
I believe I originally understood the types in the JSON, “real” and “imaginary”, to be arbitrary labels from a random JSON data object (not translated into Python types). So, for a complex object, it sounds like the elements need to be labeled as Python types.
How would one take a random JSON data object from, say, an AWS resource describe command, and load it into a Python object without having to modify the JSON file? I assume create a routine specific to process the JSON and map to a custom object? Was curious if there was some automated way to get close. I see object_hook() of json.load() may be a way.
Bartosz Zaczyński RP Team on July 17, 2024
@chestnutj2000 You’re welcome!
The “labels” or keys in JSON can indeed be arbitrary. It’s your custom decoder class that decides how to interpret them. In particular, to create an instance of a complex number, which isn’t normally JSON-serializable, you need two values that you can read from a JSON document once you know the expected keys or labels they’re stored at.
As to your second question, I’m not sure I’m following 🤔 If a piece of data was successfully serialized to JSON, then you can load it into Python using the json module. You’d only need a custom decoder if you wanted to use specific or non-standard representations of the serialized values.
Ed Schneider on July 29, 2024
I’m new to JSON. I found this tutorial confusing. It begins with a description of JSON which emphasizes its power as a data interchange format. Then it gave an interesting example of what you might do with it beyond the basic role of transmitting data. That opened my eyes. Where it lost me is when it showed a limitation of JSON and how you can get around it. Unfortunately, that meant the bulk of the tutorial was on overcoming a shortcoming. That introduced confusion; notice all the comments regarding complex number issues. Are we to conclude that to use JSON you should be ready to provide your own decoders? Or was that just muscle flexing and a neat solution? I would have liked to see more examples of where JSON was useful.
James Magruder on March 30, 2019
This was excellent!! However, it will take me a while to absorb it all. I may have to watch it again.