Welcome back. In this lesson, I’m going to be showing you how to use
defaultdict. How exactly you use it depends on the use case, but the general pattern is that we map keys to Python’s built-in mutable data structures, such as lists and sets.
This all sounds a bit abstract, but it’s much clearer once we start looking at the code, so let’s jump right in. Here we are in the REPL, and the very first thing I’m going to do is I’m going to import
So what this allows me to do is that I can append something to a key which isn’t present, such as the key
'key'. And as you can see, this didn’t cause my code to error out. In fact, if I have a look at my dictionary, you can see that there’s a single
'key' and the value is
Well, there we go. That didn’t go as smoothly as I planned, but we can have a look at the
defaultdict and see what it contains. And you can see that what happened was that the value
2 was smoothly appended to this list. Again, no errors, no issues of any kind.
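To recap that first demo, here is a minimal sketch of what it might look like in the REPL. The variable name dd and the first appended value 1 are placeholders of mine; only the key 'key' and the appended value 2 come from the walkthrough above.

```python
from collections import defaultdict

# list is the default factory: any missing key gets a fresh empty list.
dd = defaultdict(list)

# 'key' is not in dd yet. Looking it up creates [] and then we append to it.
dd["key"].append(1)  # placeholder value, not taken from the lesson
dd["key"].append(2)  # the second append described above

print(dd)  # defaultdict(<class 'list'>, {'key': [1, 2]})
```

With a regular dict, the first append would raise a KeyError, because there would be no list under 'key' to append to.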
So basically, it’s a list of tuples, and each tuple is a department and an employee. So in the Sales department, we have John Doe and Martin Smith. In Marketing, we have Elizabeth Smith and Adam Doe. And Jane Doe works all alone in Accounting.
What I want to do here is I want to create a dictionary which will group people by department. So for example, I would like to have one key for
'Marketing', and then I would like to have
'Elizabeth Smith' and
'Adam Doe' both listed in a list under that key, since they both work in Marketing, right?
But we can imagine that this is a huge company—there are many departments, and I might not know all of them from the beginning. So now I’ve created a new
defaultdict. This time it’s called
So now that I’ve done that setup, what I can do is I can iterate over each
department and each
in dep—so in my list of departments and employees—and I can append each
employee to a list which is mapped to a key which is the
As you can see, that ran without any issues. Let’s try looking at the
defaultdict which I just created. And you can see here, for instance, that
'Sales' has two employees and indeed, they’re both in my list—
'John Doe' and
'Martin Smith' are both there.
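Here is a sketch of that grouping example. The names dep and dep_dd and the five department and employee pairs follow the lesson; the order of the tuples in the list is my assumption.

```python
from collections import defaultdict

# (department, employee) pairs; the ordering here is assumed.
dep = [
    ("Sales", "John Doe"),
    ("Sales", "Martin Smith"),
    ("Accounting", "Jane Doe"),
    ("Marketing", "Elizabeth Smith"),
    ("Marketing", "Adam Doe"),
]

# Each new department starts out with an empty list.
dep_dd = defaultdict(list)
for department, employee in dep:
    dep_dd[department].append(employee)

print(dep_dd["Sales"])  # ['John Doe', 'Martin Smith']
```

Notice that I never had to list the departments up front. Each key is created the first time it shows up in the data.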
Let’s come back to the REPL. First of all, I’m going to clear it so that we have sort of a blank slate, and I’m going to create my
dep list again. Except this time, the data isn’t as clean. In fact, it’s quite messy in the sense that I have multiple entries for the same values.
So for example, if you look at the last three values, it’s three times
'Adam Doe' in the
'Marketing' department. And this is a very common situation, right? We’re often working with dirty data, which is not optimally presented to us.
The way to do this is very similar to what we did previously. In the example just before this one, I created a
defaultdict. I called it
dep_dd, just like here. Instead of passing
set as the
.default_factory argument, I had passed
list. What I’m going to do now is pass
So the syntax for doing this is very similar to what I had done just before. Again, I’m iterating over my tuples, over
employee, and I’m this time adding—rather than appending them—to keys which are the
department values. But this time the mutable data structure which I’m using is a
set instead of a
So rather than extending the
list with repeated values, the
set will only accept unique values. Let’s see how this went. I’ll have a look at my
defaultdict to see what it contains, and you can see in
'Sales', I have
But what’s more interesting is that in
'Marketing', I have
'Adam Doe' and
'Elizabeth Smith', and
'Adam Doe' only appears once even though in my original dataset up here, we had
'Adam Doe' three times.
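A sketch of the deduplication example could look like this. The exact messy dataset isn’t reproduced here, so the list below is an assumption, apart from the last three entries, which repeat 'Adam Doe' in 'Marketing' as described above.

```python
from collections import defaultdict

# Messy data with repeated entries (assumed, except the last three rows).
dep = [
    ("Sales", "John Doe"),
    ("Sales", "Martin Smith"),
    ("Sales", "John Doe"),
    ("Accounting", "Jane Doe"),
    ("Marketing", "Elizabeth Smith"),
    ("Marketing", "Adam Doe"),
    ("Marketing", "Adam Doe"),
    ("Marketing", "Adam Doe"),
]

# set is the default factory, so duplicates are dropped automatically.
dep_dd = defaultdict(set)
for department, employee in dep:
    dep_dd[department].add(employee)  # .add() for sets, not .append()

print(sorted(dep_dd["Marketing"]))  # ['Adam Doe', 'Elizabeth Smith']
```

I’m sorting before printing only because sets have no guaranteed order.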
For this example, I’m going to be using the same list I used in the very first example so there are no repetitions, just because it’s a bit cleaner and easier to work with. If you thought I was going to start out by creating a
defaultdict, as I did in the previous examples, you would be right.
So there we go. I’ve created my
defaultdict. Next, I will iterate over my list of tuples. And now what I’ll be doing is I’m going to be incrementing the
int, which I’ve added to each entry where I didn’t have a key in my
As you can see, this time there is an
int mapped to each key. The key is a department name. So the first one is
'Sales' and the value I have here is
2, and that’s because two people work in 'Sales'.
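The counting example can be sketched like this. The list is the clean one from the first grouping example, with the tuple order assumed, and the name dep_count is hypothetical, since the name used on screen isn’t given here.

```python
from collections import defaultdict

# Same clean (department, employee) data as before; ordering is assumed.
dep = [
    ("Sales", "John Doe"),
    ("Sales", "Martin Smith"),
    ("Accounting", "Jane Doe"),
    ("Marketing", "Elizabeth Smith"),
    ("Marketing", "Adam Doe"),
]

# int() returns 0, so every new department starts counting from zero.
dep_count = defaultdict(int)  # hypothetical name
for department, _employee in dep:
    dep_count[department] += 1

print(dep_count["Sales"])  # 2
```

The employee name isn’t needed for counting, so I’m only using the department from each tuple.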
What we have here is a series of departments again—or you can imagine these are sales types—and we have a value for each of them. So for instance, here at the very top line, you can see that we’ve sold three types of books, or we have three entries for sales in
'Books', or perhaps this is an amount spent on books.
And as in the previous use cases, I’m going to be iterating over this list of tuples, and I’m going to be accumulating values. This time, I’m going to be using products as keys.
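Here is a rough sketch of that accumulation pattern. The actual dataset isn’t shown here, so everything in the list below is made up for illustration: the name sales, the products other than 'Books', and all of the amounts. I’m also assuming float as the default factory, and int would work the same way for whole numbers.

```python
from collections import defaultdict

# Hypothetical (product, amount) pairs; none of these figures are from the lesson.
sales = [
    ("Books", 10.0),
    ("Tutorials", 5.5),
    ("Books", 2.5),
    ("Courses", 20.0),
    ("Books", 7.5),
]

# float() returns 0.0, so each product's total starts at zero.
totals = defaultdict(float)  # hypothetical name
for product, amount in sales:
    totals[product] += amount

print(totals["Books"])  # 20.0
```

This is the same loop as in the counting example, except that I’m adding the value from each tuple instead of adding 1.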
That’s the end of this lesson. We looked at four different use cases and how
defaultdict can help us in each of them. In each case, we’re grouping, consolidating, or otherwise reducing the items we have, sometimes down to unique items, and mapping them into different categories.
We’re using keys as the way in which we can retrieve these consolidated values. So, I’ve shown you four use cases. I hope I’ve convinced you that
defaultdict can be helpful and useful in resolving concrete problems. In the next lesson, we’ll go deeper into
defaultdict and see more of how it works under the hood.