Creating dictionary-like classes may be a requirement in your Python career. Specifically, you may be interested in making custom dictionaries with modified behavior, new functionalities, or both. In Python, you can do this by inheriting from an abstract base class, by subclassing the built-in dict class directly, or by inheriting from UserDict.
In this tutorial, you’ll learn how to:
- Create dictionary-like classes by inheriting from the built-in dict class
- Identify common pitfalls that can happen when inheriting from dict
- Build dictionary-like classes by subclassing UserDict from the collections module
Additionally, you’ll code a few examples that’ll help you understand the pros and cons of using dict vs UserDict to create your custom dictionary classes.
To get the most out of this tutorial, you should be familiar with Python’s built-in dict class and its standard functionality and features. You’ll also need to know the basics of object-oriented programming and understand how inheritance works in Python.
Creating Dictionary-Like Classes in Python
The built-in dict class provides a valuable and versatile collection data type, the Python dictionary. Dictionaries are everywhere, including in your code and the code of Python itself.
Sometimes, the standard functionality of Python dictionaries isn’t enough for certain use cases. In these situations, you’ll probably have to create a custom dictionary-like class. In other words, you need a class that behaves like a regular dictionary but with modified or new functionality.
You’ll typically find at least two reasons for creating custom dictionary-like classes:
- Extending the regular dictionary by adding new functionality
- Modifying the standard dictionary’s functionality
Note that you could also face situations in which you need to both extend and modify the dictionary’s standard functionality.
Depending on your specific needs and skill level, you can choose from a few strategies for creating custom dictionaries. You can:
- Inherit from an appropriate abstract base class, such as MutableMapping
- Inherit from the Python built-in dict class directly
- Subclass UserDict from collections
There are a few key considerations when you’re selecting the appropriate strategy to implement. Keep reading for more details.
Building a Dictionary-Like Class From an Abstract Base Class
This strategy for creating dictionary-like classes requires that you inherit from an abstract base class (ABC), like MutableMapping. This class provides concrete generic implementations of all the dictionary methods except for .__getitem__(), .__setitem__(), .__delitem__(), .__iter__(), and .__len__(), which you’ll have to implement by yourself.
Additionally, suppose you need to customize the functionality of any other standard dictionary method. In that case, you’ll have to override the method at hand and provide a suitable implementation that fulfills your needs.
This process implies a fair amount of work. It’s also error-prone and requires advanced knowledge of Python and its data model. It can also imply performance issues because you’ll be writing the class in pure Python.
The main advantage of this strategy is that the parent ABC will alert you if you miss any method in your custom implementation.
For these reasons, you should embrace this strategy only if you need a dictionary-like class that’s fundamentally different from the built-in dictionary.
In this tutorial, you’ll focus on creating dictionary-like classes by inheriting from the built-in dict class and the UserDict class, which seem to be the quickest and most practical strategies.
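For reference, here’s a minimal sketch of what the abstract base class strategy could look like. The LowerCaseMapping class, its ._store attribute, and the sample data are hypothetical names made up for this illustration. The sketch only implements the five required methods and lets MutableMapping supply the rest:

```python
from collections.abc import MutableMapping


class LowerCaseMapping(MutableMapping):
    """Hypothetical mapping that stores string keys in lowercase."""

    def __init__(self, *args, **kwargs):
        self._store = {}
        # The inherited .update() routes every item through .__setitem__()
        self.update(*args, **kwargs)

    def __getitem__(self, key):
        return self._store[key.lower()]

    def __setitem__(self, key, value):
        self._store[key.lower()] = value

    def __delitem__(self, key):
        del self._store[key.lower()]

    def __iter__(self):
        return iter(self._store)

    def __len__(self):
        return len(self._store)


config = LowerCaseMapping({"Host": "localhost"}, PORT=8080)
config["Debug"] = True
print(dict(config))
print(config.get("HOST"), "PORT" in config)


class Incomplete(MutableMapping):
    """Hypothetical subclass that forgets most of the required methods."""

    def __getitem__(self, key):
        raise KeyError(key)


try:
    Incomplete()
except TypeError as error:
    # The ABC refuses to instantiate a class with missing abstract methods
    print("TypeError:", error)
```

Methods like .get(), .update(), and the membership test come for free from the ABC, and they all go through your special methods. The trade-off is that every operation runs in pure Python.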
Inheriting From the Python Built-in dict Class
For a long time, it was impossible to subclass Python types implemented in C. Python 2.2 fixed this issue. Now you can directly subclass built-in types, including dict. This change brings several technical advantages to the subclasses because now they:
- Will work in every place that requires the original built-in type
- Can define new instance, static, and class methods
- Can store their instance attributes in a .__slots__ class attribute, which essentially replaces the .__dict__ attribute
The first item in this list may be a requirement for C code that expects a Python built-in class. The second item allows you to add new functionality on top of the standard dictionary behavior. Finally, the third item will enable you to restrict the attributes of a subclass to only those attributes predefined in .__slots__.
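To make these three advantages more concrete, here’s a small sketch. The TaggedDict class, its .tag attribute, and its .describe() method are hypothetical names used only for illustration:

```python
import json


class TaggedDict(dict):
    """Hypothetical dict subclass with one extra instance attribute."""

    __slots__ = ("tag",)  # Only .tag is allowed; instances get no .__dict__

    def __init__(self, *args, tag=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.tag = tag

    def describe(self):
        # A new instance method on top of the standard dictionary behavior
        return f"{self.tag}: {len(self)} item(s)"


scores = TaggedDict({"alice": 3}, tag="scores")

# The subclass works where a real dict is expected
print(isinstance(scores, dict))
print(json.dumps(scores))

print(scores.describe())

try:
    scores.color = "red"  # Not listed in .__slots__
except AttributeError as error:
    print("AttributeError:", error)
```

Because TaggedDict is a real dict, code that checks for the built-in type, such as the json module, accepts it without any extra work.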
Even though subclassing built-in types has several advantages, it also has some drawbacks. In the specific case of dictionaries, you’ll find a few annoying pitfalls. For example, say that you want to create a dictionary-like class that automatically stores all its keys as strings where all the letters, if present, are uppercase.
To do this, you can create a subclass of dict that overrides the .__setitem__() method:
>>> class UpperCaseDict(dict):
...     def __setitem__(self, key, value):
...         key = key.upper()
...         super().__setitem__(key, value)
...
>>> numbers = UpperCaseDict()
>>> numbers["one"] = 1
>>> numbers["two"] = 2
>>> numbers["three"] = 3
>>> numbers
{'ONE': 1, 'TWO': 2, 'THREE': 3}
Cool! Your custom dictionary seems to work well. However, there are some hidden issues in this class. If you try to create an instance of UpperCaseDict using some initialization data, then you’ll get surprising and buggy behavior:
>>> numbers = UpperCaseDict({"one": 1, "two": 2, "three": 3})
>>> numbers
{'one': 1, 'two': 2, 'three': 3}
What just happened? Why doesn’t your dictionary convert the keys into uppercase letters when you call the class’s constructor? It looks like the class’s initializer, .__init__(), doesn’t call .__setitem__() implicitly to create the dictionary. So, the uppercase conversion never runs.
Unfortunately, this issue affects other dictionary methods, like .update() and .setdefault(), for example:
>>> numbers = UpperCaseDict()
>>> numbers["one"] = 1
>>> numbers["two"] = 2
>>> numbers["three"] = 3
>>> numbers
{'ONE': 1, 'TWO': 2, 'THREE': 3}
>>> numbers.update({"four": 4})
>>> numbers
{'ONE': 1, 'TWO': 2, 'THREE': 3, 'four': 4}
>>> numbers.setdefault("five", 5)
5
>>> numbers
{'ONE': 1, 'TWO': 2, 'THREE': 3, 'four': 4, 'five': 5}
Again, your uppercase functionality isn’t working well in these examples. To solve this issue, you must provide custom implementations of all the affected methods. For example, to fix the initialization issue, you can write an .__init__() method that looks something like this:
# upper_dict.py

class UpperCaseDict(dict):
    def __init__(self, mapping=None, /, **kwargs):
        if mapping is not None:
            mapping = {
                str(key).upper(): value for key, value in mapping.items()
            }
        else:
            mapping = {}
        if kwargs:
            mapping.update(
                {str(key).upper(): value for key, value in kwargs.items()}
            )
        super().__init__(mapping)

    def __setitem__(self, key, value):
        key = key.upper()
        super().__setitem__(key, value)
Here, .__init__() converts the keys into uppercase letters and then initializes the current instance with the resulting data.
With this update, the initialization process of your custom dictionary should work correctly. Go ahead and give it a try by running the following code:
>>> from upper_dict import UpperCaseDict
>>> numbers = UpperCaseDict({"one": 1, "two": 2, "three": 3})
>>> numbers
{'ONE': 1, 'TWO': 2, 'THREE': 3}
>>> numbers.update({"four": 4})
>>> numbers
{'ONE': 1, 'TWO': 2, 'THREE': 3, 'four': 4}
Providing your own .__init__() method fixed the initialization issue. However, other methods like .update() continue to work incorrectly, as you can see from the "four" key, which is still lowercase.
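To get a feel for how much extra code a complete fix could take, here’s one possible sketch that also overrides .update() and .setdefault(). It’s an illustration under a few assumptions rather than a definitive implementation. In particular, it applies str(key).upper() everywhere for consistency, and it leaves lookups and deletions case-sensitive:

```python
class UpperCaseDict(dict):
    def __init__(self, mapping=None, /, **kwargs):
        super().__init__()
        # Reuse the custom .update() so that keys get converted
        self.update(mapping or {}, **kwargs)

    def __setitem__(self, key, value):
        super().__setitem__(str(key).upper(), value)

    def update(self, other=(), /, **kwargs):
        # dict() accepts both mappings and iterables of key-value pairs
        for key, value in dict(other, **kwargs).items():
            self[key] = value  # Goes through the custom .__setitem__()

    def setdefault(self, key, default=None):
        key = str(key).upper()
        if key not in self:
            self[key] = default
        return self[key]


numbers = UpperCaseDict({"one": 1}, two=2)
numbers.update({"three": 3})
numbers.update([("four", 4)])
numbers.setdefault("five", 5)
print(numbers)
```

Even with these overrides in place, other entry points may need the same treatment. For example, the in-place union operator (|=) is handled by the built-in dict machinery and doesn’t go through your .__setitem__() method.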
Why do dict subclasses behave this way? Built-in types were designed and implemented with the open–closed principle in mind. Therefore, they’re open to extension but closed to modification. Allowing modifications to the core features of these classes can potentially break their invariants. So, Python core developers decided to protect them from modifications.
That’s why subclassing the built-in dict class can be a little bit tricky, labor-intensive, and error-prone. Fortunately, you still have alternatives. The UserDict class from the collections module is one of them.
Subclassing UserDict From collections
Starting with Python 1.6, the language has provided UserDict as part of the standard library. This class initially lived in a module named after the class itself. In Python 3, UserDict was moved to the collections module, which is a more intuitive place for it, based on the class’s primary purpose.
UserDict was created back when it was impossible to inherit from Python’s dict directly. Even though the need for this class has been partially supplanted by the possibility of subclassing the built-in dict class directly, UserDict is still available in the standard library, both for convenience and for backward compatibility.
UserDict is a convenient wrapper around a regular dict object. This class provides the same behavior as the built-in dict data type with the additional feature of giving you access to the underlying dictionary through the .data instance attribute. This feature can facilitate the creation of custom dictionary-like classes, as you’ll learn later in this tutorial.
UserDict was specially designed for subclassing purposes rather than for direct instantiation, which means that the class’s main purpose is to allow you to create dictionary-like classes through inheritance.
There are also other hidden differences. To uncover them, go back to your original implementation of UpperCaseDict and update it like in the code below:
>>> from collections import UserDict
>>> class UpperCaseDict(UserDict):
...     def __setitem__(self, key, value):
...         key = key.upper()
...         super().__setitem__(key, value)
...
This time, instead of inheriting from dict, you’re inheriting from UserDict, which you imported from the collections module. How will this change affect the behavior of your UpperCaseDict class? Check out the following examples:
>>> numbers = UpperCaseDict({"one": 1, "two": 2})
>>> numbers["three"] = 3
>>> numbers.update({"four": 4})
>>> numbers.setdefault("five", 5)
5
>>> numbers
{'ONE': 1, 'TWO': 2, 'THREE': 3, 'FOUR': 4, 'FIVE': 5}
Now UpperCaseDict works correctly all the time. You don’t need to provide custom implementations of .__init__(), .update(), or .setdefault(). The class just works! This is because in UserDict, all the methods that update existing keys or add new ones consistently rely on your .__setitem__() version.
As you learned before, the most notable difference between UserDict and dict is the .data attribute, which holds the wrapped dictionary. Using .data directly can make your code more straightforward because you don’t need to call super() all the time to provide the desired functionality. You can just access .data and work with it as you would with any regular dictionary.
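As a quick illustration of the .data attribute, the short sketch below pokes at a plain UserDict instance. The sample data is made up for the example:

```python
from collections import UserDict

numbers = UserDict({"one": 1, "two": 2})

# The wrapped object is a regular dictionary
print(type(numbers.data))

# UserDict wraps a dict, but it isn't a dict subclass itself
print(isinstance(numbers, dict))

# Changes made through .data are visible through the wrapper
numbers.data["three"] = 3
print(numbers["three"])
```

Keep in mind that writing to .data directly skips any custom .__setitem__() that a subclass defines, so it’s best reserved for the internals of your own methods.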
Coding Dictionary-Like Classes: Practical Examples
You already know that subclasses of dict don’t call .__setitem__() from methods like .update() and .__init__(). This fact makes subclasses of dict behave differently from a typical Python class with a .__setitem__() method.
To work around this issue, you can inherit from UserDict, which does call .__setitem__() from all the operations that set or update values in the underlying dictionary. Because of this feature, UserDict can make your code safer and more compact.
Admittedly, when you think of creating a dictionary-like class, inheriting from dict is more natural than inheriting from UserDict. This is because all Python developers know about dict, but not all Python developers are aware of the existence of UserDict.
Inheriting from dict often implies certain issues that you can probably fix by using UserDict instead. However, these issues aren’t always relevant. Their relevance very much depends on how you want to customize the dictionary’s functionality.
The bottom line is that UserDict isn’t the right solution all the time. In general, if you want to extend the standard dictionary without affecting its core structure, then it’s totally okay to inherit from dict. On the other hand, if you want to change the core dictionary behavior by overriding its special methods, then UserDict is your best alternative.
In any case, remember that dict is written in C and is highly optimized for performance. In contrast, UserDict is written in pure Python, which can represent a significant limitation in terms of performance.
You should consider several factors when deciding whether to inherit from dict or UserDict. These factors include, but aren’t limited to, the following:
- Amount of work
- Risk of errors and bugs
- Ease of use and coding
- Performance
In the following section, you’ll experience the first three factors in this list by coding a few practical examples. You’ll learn about performance implications a bit later, in the section on performance.
A Dictionary That Accepts British and American Spelling of Keys
As the first example, say you need a dictionary that stores keys in American English and allows key lookup in either American or British English. To code this dictionary, you’ll need to modify at least two special methods, .__setitem__() and .__getitem__().
The .__setitem__() method will allow you to always store keys in American English. The .__getitem__() method will make it possible to retrieve the value associated with a given key, whether it’s spelled in American or British English.
Because you need to modify the core behavior of the dict class, using UserDict would be a better option to code this class. With UserDict, you won’t have to provide custom implementations of .__init__(), .update(), and so on.
When you subclass UserDict, you have two main ways to code your class. You can rely on the .data attribute, which may facilitate the coding, or you can rely on super() and special methods.
Here’s the code that relies on .data:
# spelling_dict.py

from collections import UserDict

UK_TO_US = {"colour": "color", "flavour": "flavor", "behaviour": "behavior"}

class EnglishSpelledDict(UserDict):
    def __getitem__(self, key):
        try:
            return self.data[key]
        except KeyError:
            pass
        try:
            return self.data[UK_TO_US[key]]
        except KeyError:
            pass
        raise KeyError(key)

    def __setitem__(self, key, value):
        try:
            key = UK_TO_US[key]
        except KeyError:
            pass
        self.data[key] = value
In this example, you first define a constant, UK_TO_US, containing the British words as keys and the matching American words as values.
Then you define EnglishSpelledDict, inheriting from UserDict. The .__getitem__() method looks for the current key. If the key exists, then the method returns the associated value. If the key doesn’t exist, then the method checks if the key was spelled in British English. If that’s the case, then the key is translated to American English and its value is retrieved from the underlying dictionary. If both lookups fail, then the method raises a KeyError.
The .__setitem__() method tries to find the input key in the UK_TO_US dictionary. If the input key exists in UK_TO_US, then it’s translated to American English. Finally, the method assigns the input value to the target key.
Here’s how your EnglishSpelledDict class works in practice:
>>> from spelling_dict import EnglishSpelledDict
>>> likes = EnglishSpelledDict({"color": "blue", "flavour": "vanilla"})
>>> likes
{'color': 'blue', 'flavor': 'vanilla'}
>>> likes["flavour"]
'vanilla'
>>> likes["flavor"]
'vanilla'
>>> likes["behaviour"] = "polite"
>>> likes
{'color': 'blue', 'flavor': 'vanilla', 'behavior': 'polite'}
>>> likes.get("colour")
'blue'
>>> likes.get("color")
'blue'
>>> likes.update({"behaviour": "gentle"})
>>> likes
{'color': 'blue', 'flavor': 'vanilla', 'behavior': 'gentle'}
By subclassing UserDict, you’re saving yourself from writing a lot of code. For example, you don’t have to provide methods like .get(), .update(), or .setdefault(), because their default implementations will automatically rely on your .__getitem__() and .__setitem__() methods.
If you have less code to write, then you’ll have less work to do. More importantly, you’ll be safer because less code often implies a lower risk of bugs and errors.
Note: If you need the del statement to work with both spellings, then you’ll have to implement a custom .__delitem__() method in your EnglishSpelledDict dictionary. Similarly, if you want membership tests to work with both spellings, then you’ll have to override the .__contains__() method.
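The two extra methods mentioned in the note above could look something like the sketch below. This is a condensed variant of the class rather than the exact code from this tutorial: the ._to_us() helper is a hypothetical name, and .__getitem__() and .__setitem__() are shortened to use it:

```python
from collections import UserDict

UK_TO_US = {"colour": "color", "flavour": "flavor", "behaviour": "behavior"}


class EnglishSpelledDict(UserDict):
    @staticmethod
    def _to_us(key):
        # Hypothetical helper: translate British spelling, keep other keys
        return UK_TO_US.get(key, key)

    def __getitem__(self, key):
        return self.data[self._to_us(key)]

    def __setitem__(self, key, value):
        self.data[self._to_us(key)] = value

    def __delitem__(self, key):
        # Lets the del statement accept both spellings
        del self.data[self._to_us(key)]

    def __contains__(self, key):
        # Lets membership tests accept both spellings
        return self._to_us(key) in self.data


likes = EnglishSpelledDict({"colour": "blue", "flavor": "vanilla"})
print("colour" in likes, "color" in likes)

del likes["flavour"]
print(likes)
```

Funneling every key through a single helper keeps the four special methods consistent with each other, which is the main design choice in this sketch.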
The main drawback of this implementation is that if you someday decide to update EnglishSpelledDict and make it inherit from dict, then you’ll have to rewrite most of the code to remove the use of .data.
The example below shows how to provide the same functionality as before using super() and some special methods. This time, your custom dictionary is fully compatible with dict, so you can change the parent class anytime you like:
# spelling_dict.py

from collections import UserDict

UK_TO_US = {"colour": "color", "flavour": "flavor", "behaviour": "behavior"}

class EnglishSpelledDict(UserDict):
    def __getitem__(self, key):
        try:
            return super().__getitem__(key)
        except KeyError:
            pass
        try:
            return super().__getitem__(UK_TO_US[key])
        except KeyError:
            pass
        raise KeyError(key)

    def __setitem__(self, key, value):
        try:
            key = UK_TO_US[key]
        except KeyError:
            pass
        super().__setitem__(key, value)
This implementation looks slightly different from the original one but works the same. It could also be harder to code because you’re not using .data anymore. Instead, you’re using super(), .__getitem__(), and .__setitem__(). This code requires certain knowledge of Python’s data model, which is a complex and advanced topic.
The main advantage of this new implementation is that your class is now compatible with dict, so you can change the superclass at any time if you ever need to do so.
Note: Remember that if you inherit directly from dict, then you need to reimplement .__init__() and other methods so that they also translate keys to American spelling when the keys are added to the dictionary.
It’s often more convenient to extend the standard dictionary functionality by subclassing UserDict than by subclassing dict. The main reason is that the built-in dict has some implementation shortcuts and optimizations that end up forcing you to override methods that you can just inherit if you use UserDict as the parent class.
A Dictionary That Accesses Keys Through Values
Another common requirement for a custom dictionary is to provide additional functionality apart from the standard behavior. For example, say that you want to create a dictionary-like class that provides methods to retrieve the key that maps to a given target value.
You need a method that retrieves the first key that maps to the target value. You also want a method that returns an iterator over those keys that map to equal values.
Here’s a possible implementation of this custom dictionary:
# value_dict.py

class ValueDict(dict):
    def key_of(self, value):
        for k, v in self.items():
            if v == value:
                return k
        raise ValueError(value)

    def keys_of(self, value):
        for k, v in self.items():
            if v == value:
                yield k
This time, instead of inheriting from UserDict, you’re inheriting from dict. Why? In this example, you’re adding functionality that doesn’t alter the dictionary’s core features. Therefore, inheriting from dict is more appropriate. It’s also more efficient in terms of performance, as you’ll see later in this tutorial.
The .key_of() method iterates over the key-value pairs in the underlying dictionary. The conditional statement checks for values that match the target value. The if code block returns the key of the first matching value. If no key maps to the target value, then the method raises a ValueError.
As a generator method that yields keys on demand, .keys_of() will yield only those keys whose value matches the value provided as an argument in the method call.
Here’s how this dictionary works in practice:
>>> from value_dict import ValueDict
>>> inventory = ValueDict()
>>> inventory["apple"] = 2
>>> inventory["banana"] = 3
>>> inventory.update({"orange": 2})
>>> inventory
{'apple': 2, 'banana': 3, 'orange': 2}
>>> inventory.key_of(2)
'apple'
>>> inventory.key_of(3)
'banana'
>>> list(inventory.keys_of(2))
['apple', 'orange']
Cool! Your ValueDict dictionary works as expected. It inherits the core dictionary’s features from Python’s dict and implements new functionality on top of that.
In general, you should use UserDict to create a dictionary-like class that acts like the built-in dict class but customizes some of its core functionality, mostly special methods like .__setitem__() and .__getitem__().
On the other hand, if you just need a dictionary-like class with extended functionality that doesn’t affect or modify the core dict behavior, then you’re better off inheriting directly from dict. This practice will be quicker, more natural, and more efficient.
A Dictionary With Additional Functionalities
As a final example of how to implement a custom dictionary with additional features, say that you want to create a dictionary that provides the following methods:
Method | Description |
---|---|
.apply(action) | Takes a callable action as an argument and applies it to all the values in the underlying dictionary |
.remove(key) | Removes a given key from the underlying dictionary |
.is_empty() | Returns True or False depending on whether the dictionary is empty or not |
To implement these three methods, you don’t need to modify the core behavior of the built-in dict class. So, subclassing dict rather than UserDict seems to be the way to go.
Here’s the code that implements the required methods on top of dict:
# extended_dict.py

class ExtendedDict(dict):
    def apply(self, action):
        for key, value in self.items():
            self[key] = action(value)

    def remove(self, key):
        del self[key]

    def is_empty(self):
        return len(self) == 0
In this example, .apply() takes a callable as an argument and applies it to every value in the underlying dictionary. The transformed value is then reassigned to the original key. The .remove() method uses the del statement to remove the target key from the dictionary. Finally, .is_empty() uses the built-in len() function to find out if the dictionary is empty or not.
Here’s how ExtendedDict works:
>>> from extended_dict import ExtendedDict
>>> numbers = ExtendedDict({"one": 1, "two": 2, "three": 3})
>>> numbers
{'one': 1, 'two': 2, 'three': 3}
>>> numbers.apply(lambda x: x**2)
>>> numbers
{'one': 1, 'two': 4, 'three': 9}
>>> numbers.remove("two")
>>> numbers
{'one': 1, 'three': 9}
>>> numbers.is_empty()
False
In these examples, you first create an instance of ExtendedDict using a regular dictionary as an argument. Then you call .apply() on the extended dictionary. This method takes a lambda function as an argument and applies it to every value in the dictionary, transforming the target value into its square.
Then, .remove() takes an existing key as an argument and removes the corresponding key-value pair from the dictionary. Finally, .is_empty() returns False because numbers isn’t empty. It would have returned True if the underlying dictionary were empty.
Considering Performance
Inheriting from UserDict may imply a performance cost because this class is written in pure Python. On the other hand, the built-in dict class is written in C and highly optimized for performance. So, if you need to use a custom dictionary in performance-critical code, then make sure to time your code to find potential performance issues.
To check if performance issues can arise when you inherit from UserDict instead of dict, get back to your ExtendedDict class and copy its code into two different classes, one inheriting from dict and the other inheriting from UserDict.
Your classes should look something like this:
# extended_dicts.py

from collections import UserDict

class ExtendedDict_dict(dict):
    def apply(self, action):
        for key, value in self.items():
            self[key] = action(value)

    def remove(self, key):
        del self[key]

    def is_empty(self):
        return len(self) == 0

class ExtendedDict_UserDict(UserDict):
    def apply(self, action):
        for key, value in self.items():
            self[key] = action(value)

    def remove(self, key):
        del self[key]

    def is_empty(self):
        return len(self) == 0
The only difference between these two classes is that ExtendedDict_dict subclasses dict, and ExtendedDict_UserDict subclasses UserDict.
To check their performance, you can start by timing core dictionary operations, such as class instantiation. Run the following code in your Python interactive session:
>>> import timeit
>>> from extended_dicts import ExtendedDict_dict
>>> from extended_dicts import ExtendedDict_UserDict
>>> init_data = dict(zip(range(1000), range(1000)))
>>> dict_initialization = min(
...     timeit.repeat(
...         stmt="ExtendedDict_dict(init_data)",
...         number=1000,
...         repeat=5,
...         globals=globals(),
...     )
... )
>>> user_dict_initialization = min(
...     timeit.repeat(
...         stmt="ExtendedDict_UserDict(init_data)",
...         number=1000,
...         repeat=5,
...         globals=globals(),
...     )
... )
>>> print(
...     f"UserDict is {user_dict_initialization / dict_initialization:.3f}",
...     "times slower than dict",
... )
UserDict is 35.877 times slower than dict
In this code snippet, you use the timeit module along with the min() function to measure the execution time of a piece of code. In this example, the target code consists of instantiating ExtendedDict_dict and ExtendedDict_UserDict.
Once you’ve run this timing code, you can compare both initialization times. In this specific example, initializing the class based on UserDict is about thirty-six times slower than initializing the class derived from dict. The exact figure will vary with your machine and Python version, but a gap of this size indicates a serious performance difference.
Measuring the execution time of new functionalities may also be interesting. For example, you can check the execution time of .apply(). To do this check, go ahead and run the following code:
>>> extended_dict = ExtendedDict_dict(init_data)
>>> dict_apply = min(
...     timeit.repeat(
...         stmt="extended_dict.apply(lambda x: x**2)",
...         number=5,
...         repeat=2,
...         globals=globals(),
...     )
... )
>>> extended_user_dict = ExtendedDict_UserDict(init_data)
>>> user_dict_apply = min(
...     timeit.repeat(
...         stmt="extended_user_dict.apply(lambda x: x**2)",
...         number=5,
...         repeat=2,
...         globals=globals(),
...     )
... )
>>> print(
...     f"UserDict is {user_dict_apply / dict_apply:.3f}",
...     "times slower than dict",
... )
UserDict is 1.704 times slower than dict
The performance difference between the class based on UserDict and the class based on dict isn’t that big this time, but it still exists.
Often, when you create a custom dictionary by subclassing dict, you can expect standard dictionary operations to be more efficient in this class than in a class based on UserDict. On the other hand, new functionality may have similar execution time in both classes. How would you know which is the most efficient way to go? Well, you have to time your code.
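If you want to repeat this kind of comparison for several operations, then a small helper can save you some typing. The sketch below is one possible way to do it. The slowdown() function, the two empty subclasses, and the names d, cls, and data available to the timed statement are all hypothetical choices made for this illustration:

```python
import timeit
from collections import UserDict


class DictBased(dict):
    pass


class UserDictBased(UserDict):
    pass


def slowdown(stmt, number=200, repeat=3):
    """Return how many times slower stmt runs on UserDictBased."""
    data = dict(zip(range(1000), range(1000)))
    timings = []
    for cls in (DictBased, UserDictBased):
        # The statement can refer to d, cls, and data
        namespace = {"d": cls(data), "cls": cls, "data": data}
        timings.append(
            min(
                timeit.repeat(
                    stmt, number=number, repeat=repeat, globals=namespace
                )
            )
        )
    dict_time, user_dict_time = timings
    return user_dict_time / dict_time


for stmt in ("cls(data)", "d[500]", "500 in d", "len(d)"):
    print(f"{stmt:<10} UserDict / dict time ratio: {slowdown(stmt):.2f}")
```

The ratios that you get will depend on your machine and Python version, so treat them as a rough guide rather than as fixed numbers.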
It’s worth noting that if you’re aiming to modify the core dictionary functionality, then UserDict is probably the way to go because, in this case, you’ll be mostly rewriting the dict class in pure Python.
Conclusion
Now you know how to create custom dictionary-like classes with modified behavior and new functionalities. You’ve learned to do this by subclassing the built-in dict class directly and by inheriting from the UserDict class available in the collections module.
In this tutorial, you learned how to:
- Create dictionary-like classes by inheriting from the built-in dict class
- Identify common pitfalls of inheriting from the Python built-in dict class
- Build dictionary-like classes by subclassing UserDict from the collections module
You’ve also written some practical examples that helped you understand the pros and cons of using UserDict vs dict when creating your custom dictionary classes.
You’re now ready to create your custom dictionaries and to leverage the full power of this useful data type in Python in response to your coding needs.