Diving Deeper Into defaultdict
Welcome back. In the last lesson, I showed you how you can use defaultdict to solve concrete problems in different scenarios. In this lesson, we’ll look a bit deeper into defaultdict and how it works under the hood.
Following that, we’ll contrast defaultdict with .setdefault(). Remember that .setdefault() is a method which works on dict right out of the box and allows you to address the missing key problem. And then finally, we will say a few words about .__missing__().
Okay, so let’s start by comparing defaultdict and dict. I’ll start by typing out set(dir()) twice, and in each case I’m going to pass an object to it. First, I’ll use defaultdict without the parentheses, since I’m passing the class itself rather than calling it to create an instance. And then I’ll do the same for dict, and don’t worry, I’m going to explain what I’m doing here in a minute. I have my code set up here. What dir() does is return a list of all the attributes and methods which an object has, and set() in turn boils that list down into a set of unique items.
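The on-screen code is not part of this transcript, so the following is only a sketch of one way to run the comparison. It assumes the two sets are subtracted from each other, and the variable names extra and lacking are made up for illustration.

```python
from collections import defaultdict

# Names that defaultdict exposes but dict does not.
extra = set(dir(defaultdict)) - set(dir(dict))
print(sorted(extra))

# Names that dict exposes but defaultdict does not.
# defaultdict is a subclass of dict, so nothing should be left over here.
lacking = set(dir(dict)) - set(dir(defaultdict))
print(sorted(lacking))
```

The first set should include default_factory, __missing__, and __copy__. The exact contents may vary a little between Python versions, so treat the printed result as indicative rather than fixed.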
So let’s go ahead and create this and just have a look inside, so def_dict, and here you can see. This is a defaultdict, and its contents are exactly the same, in fact, as in the dict I just created.
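The transcript does not show what the two objects contain, so here is a rough reconstruction with hypothetical keys and values. The name std_dict is also an assumption. Only def_dict appears in the lesson.

```python
from collections import defaultdict

# Hypothetical contents: the lesson's actual keys and values are not shown.
std_dict = {"numbers": [1, 2, 3], "letters": ["a", "b", "c"]}

# Build a defaultdict with list as the factory and the same contents.
def_dict = defaultdict(list, std_dict)

# The repr shows the factory first, then the stored items.
print(def_dict)

# Equality compares the stored items only, not the factory.
print(std_dict == def_dict)
```

The repr starts with the factory (list) followed by the stored items, and the comparison only looks at those items, which is why a dict and a defaultdict with the same contents compare as equal.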
The key difference is that I have list here as the callable. Let’s look at how similar these two dictionaries actually are. To do that, I’m going to take my dict and check whether it’s equal to def_dict. So here we go, and you can see this is True. So these two objects, one of which is a dict and one of which is a defaultdict, compare as equal.
So as I mentioned, .default_factory is really the heart and soul of defaultdict. In fact, it’s what really sets defaultdict apart from dict. Let’s have a look at how .default_factory works in a bit more detail. We’re back in the REPL, and keep in mind I’ve already imported collections, so I don’t need to do that again. So as mentioned earlier, .default_factory is set to a callable, and the callable is the first argument that you pass when you’re creating a defaultdict.
But what happens if you don’t pass anything? Like here, I’ve created a defaultdict but I didn’t pass any arguments to it at all, so in particular I didn’t pass a first argument, which would be the callable.
So if I now try to access a missing key, I get the same traceback, a KeyError, just as I would with a regular dict. In order to avoid this, what I need to do is pass a callable. So, something like str (string). Let’s go with list, and of course you should not capitalize list, like I did.
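As a minimal sketch of the two cases just described (the variable names are made up for illustration):

```python
from collections import defaultdict

# No callable passed: default_factory is None, so a missing key raises KeyError.
no_factory = defaultdict()
try:
    no_factory["missing key"]
    raised = False
except KeyError:
    raised = True
print(raised)

# list passed as the callable: a missing key is created with an empty list.
with_factory = defaultdict(list)
value = with_factory["missing key"]
print(value)
print(with_factory)
```

Note that the callable is list in lowercase. Writing List would fail with a NameError unless that name happens to be defined.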
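But what happens if I try to get 'missing key' not with a normal key reference but by using .get(), which is a normal dictionary method that’s also available to me here with defaultdict? In that case, .get() returns None (or whatever default you pass it) without calling .default_factory, and the key isn’t added to the dictionary.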
The last thing I’d like to say here about .default_factory is that this is an attribute of my defaultdict. So I can inspect it just as I would any normal attribute, and you can see here it’s set to list. And I can also update it, so I can set this to str, and now if I try to access another missing key (I’ve already used this key, so let’s call it 'missing key 2'), then this time, instead of a traceback or an empty list as in the previous example, an empty str is generated.
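A short sketch of inspecting and then swapping the factory, using the same key names as in the lesson (the variable names first and second are illustrative):

```python
from collections import defaultdict

def_dict = defaultdict(list)
first = def_dict["missing key"]        # created by calling list()

# default_factory is an ordinary attribute: it can be read and reassigned.
print(def_dict.default_factory)
def_dict.default_factory = str

second = def_dict["missing key 2"]     # created by calling str()
print(repr(first), repr(second))
```

Changing the factory only affects keys created afterwards. The value already stored under 'missing key' stays an empty list.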
If you leave it empty or set it to None, then your defaultdict behaves just like a normal dict. And of course only .__getitem__() calls .default_factory(), so other methods for getting values from dictionaries, such as .get(), won’t trigger it. Okay.
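To make that concrete, here is a small sketch showing which lookups call the factory and which don’t (variable names are illustrative):

```python
from collections import defaultdict

def_dict = defaultdict(list)

# .get() and the "in" operator do not call the factory and do not add the key.
got = def_dict.get("missing key")
present = "missing key" in def_dict
size_before = len(def_dict)

# Square-bracket access goes through __getitem__, so the factory runs.
value = def_dict["missing key"]
size_after = len(def_dict)

print(got, present, size_before, value, size_after)
```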
Moving swiftly onwards, let’s compare defaultdict with .setdefault(). In one of the previous lessons I already mentioned .setdefault() a bit, but it’s worth having a quick look at it here again.
Imagine I am going to look for a key, and it’s a key which isn’t there. I can put anything here since this dictionary is completely empty. And then the second argument which .setdefault() takes, and you can already see it down here, is default. This is a default value which will be stored under the key and returned if the key is not in the dictionary.
Let’s enter an empty list here, and you can see what I did was I called .setdefault() on my dict and looked for the string 'a'. It’s not there, so an empty list was provided. This is exactly what defaultdict would do.
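A minimal sketch of that call on a plain, empty dict (std_dict and the other variable names are assumptions for illustration):

```python
std_dict = {}

# 'a' is missing, so the default is stored under 'a' and returned.
result = std_dict.setdefault("a", [])
result.append(1)

# 'a' now exists, so the new default is ignored and the stored list comes back.
again = std_dict.setdefault("a", [])

print(std_dict)
print(again is result)
```

One difference worth keeping in mind is that the default passed to .setdefault() is built on every call, even when the key already exists, whereas defaultdict only calls its factory when a key is actually missing.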
And that kind of raises the question, “Why should I use one or the other since they seem to do the same thing?” Well, defaultdict is arguably more readable, user-friendly, Pythonic, and straightforward.
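To judge the readability claim for yourself, here is the same grouping task written both ways. The data and variable names are hypothetical and not taken from the lesson.

```python
from collections import defaultdict

# Hypothetical (category, item) pairs to group by category.
pairs = [("fruit", "apple"), ("veg", "carrot"), ("fruit", "pear")]

# With a regular dict and .setdefault():
grouped_std = {}
for category, item in pairs:
    grouped_std.setdefault(category, []).append(item)

# With defaultdict, the default is declared once, up front:
grouped_def = defaultdict(list)
for category, item in pairs:
    grouped_def[category].append(item)

print(grouped_std)
print(dict(grouped_def))
```

Both versions produce the same grouping. The defaultdict version states the default once at construction instead of at every access.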
If your code is heavily based on dictionaries and you’re dealing with missing keys all of the time, rather than as an occasional stumbling block, then you should really consider using defaultdict rather than a regular dict. If your dictionary items need to be initialized with a constant default value, then defaultdict also makes sense for you. And finally, if you’re using the dictionaries in your code for things such as aggregating, accumulating, counting, or grouping (basically the use cases we saw in the previous lesson), then defaultdict is also a good option. Regular dict, as opposed to defaultdict, does have a slight speed advantage, but in most cases where your code is heavily reliant on dictionaries, the convenience of defaultdict outweighs that. Okay, so that brings us to the last item in this lesson, .__missing__().
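To see the .__missing__() hook on its own, here is a rough sketch using a hypothetical dict subclass called LoggingDict. It is not how defaultdict is implemented internally, but it relies on the same hook, which dict’s .__getitem__() calls when a key is absent.

```python
class LoggingDict(dict):
    """Hypothetical dict subclass that records each call to __missing__."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.missing_calls = []

    def __missing__(self, key):
        # Reached only from dict.__getitem__ when the key is absent.
        self.missing_calls.append(key)
        value = self[key] = []   # store a fresh default, as defaultdict(list) would
        return value


d = LoggingDict()
d["a"].append(1)   # __getitem__ -> __missing__ -> default stored
d.get("b")         # .get() bypasses __missing__
"c" in d           # __contains__ bypasses __missing__ too

print(d, d.missing_calls)
```

Only the square-bracket lookup shows up in missing_calls. The .get() call and the membership test leave the dictionary untouched.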
I touched on this briefly earlier, but the key thing to remember here is that when you look for a key in a dictionary, you’re triggering .__getitem__(), which, if the key is missing, in turn triggers .__missing__(), which in turn triggers .default_factory(). The important thing about .__missing__() is that it’s only triggered by .__getitem__(), so other methods which can be used to look for keys, such as .get() or .__contains__(), won’t be triggering it. It’s less likely that .__contains__() will trip you up, but if you’re used to using .get() to look for something, then you do have to be aware that there is this potential trap that you can fall into if you’re expecting it to trigger .default_factory() and allow your defaultdict to work as such.
Okay, so that was it for this lesson. In the next lesson, I’ll be telling you about different ways in which you can pass arguments to defaultdict. I’ll see you there!