C3 Superclass Linearization (Optional)
00:00 In the last video, you learned that Python uses the method resolution order, or MRO, to determine what order to search parent classes for the desired method. In Python 3, the algorithm that generates the MRO is called the C3 superclass linearization algorithm, or C3 for short.
00:24 This algorithm was developed in academia and eventually found its way into the heart of Python.
00:32 The algorithm is best explained with an example. This UML diagram on the right should look pretty familiar by now. Understanding the MRO becomes crucial when trying to patch fix the diamond problem, as you saw in the last video.
00:50
You might remember that the order of inheritance matters. (Secretary, HourlyEmployee)
did not yield the same error as (HourlyEmployee, Secretary)
.
01:03
This video will explain why that is. In this example, assume that TemporarySecretary
inherits from Secretary
, and then HourlyEmployee
, as seen in red. Linearization is the process of generating an ordered list of classes, aka the MRO. As I mentioned before, each class has its own linearization.
01:30 This process is recursive, as each class’s linearization relies on the linearization of its parents. The linearization of the base class, which can be seen as the base case of this recursion, is just itself.
01:47
For the sake of simplicity, we will assume Employee
is the base class, even though in Python it is technically the object
superclass that you learned about earlier in the course.
02:00 The linearization of a class will be denoted like a function where L() means linearization and the letters in parentheses represent the initials of the class we’re currently linearizing.
02:15
We’re going to start at the top with Employee
and work our way down until we have the linearization for TemporarySecretary
. The linearization of the base class is just itself, which I will list as E. Next, let’s compute the linearization of SalaryEmployee
, or SE. The order of this computation doesn’t matter, so long as we linearize all the classes from top to bottom.
02:44
Each linearization starts out with itself plus this merge() function. You’ll understand how the merge() function works as we go on, but for now, just know that the merge() function will take in the linearizations of the parent classes plus a list of those classes in the same order. For SalaryEmployee
,
03:08 that’s pretty small. It’s just the linearization of E and [E] itself. This simplifies to merge([E], [E]) since the linearization of E is just E itself.
03:22
Now that the merge() arguments are all simplified, we can start merging them into the list on the left, one by one. In the case where the merge() function contains just one class, it will merge this class with the list on the left, and that is the entire linearization of SalaryEmployee
. Intuitively, this makes sense.
03:47
We know that when you use the .__init__()
method to construct an object out of this class, Python will try to use the .__init__()
method in SalaryEmployee
, and if one doesn’t exist there, it will search Employee
for one.
04:02
Let’s keep going. In the interest of time, I have skipped over the steps for linearizing HourlyEmployee
. It’s the exact same as SalaryEmployee
because they both only inherit from Employee
.
04:17
Now we can linearize Secretary
. That looks like this: S plus merge the linearization of SE, plus [SE] itself.
04:30 As we’ve determined, the linearization of SE is [SE, E], and now we can see how the merge() function really works.
04:42 It’s going to search the first item in the first list, called the head.
04:48 As long as that first item isn’t not the first item in the remaining lists, it will pull that item out of all the lists and merge it with the list on the very left. In this case, SE is the first item of two lists, so we’ll pull that out and merge it with the list on the left.
05:13
Then, when the merge() function has just a single class, it gets merged with the list on the left automatically. The linearization of Secretary
is Secretary
, then SalaryEmployee
, and then Employee
, as you might’ve expected. Finally, we can linearize TemporarySecretary
.
05:37 This involves multiple inheritance, meaning that things are going to start getting a little bit more interesting. As usual, we have the sum of the class itself, plus the merging of its parents’ linearizations, and a list of parents themselves.
05:56 This is why the order of inheritance matters. If we swapped L(S) and L(HE), we would get a different result. As usual, I will simplify those linearizations and now we are left with three lists to merge.
06:15 First, S is checked, and S looks like a good candidate for merging because it’s not the middle or the last element of any list. So S gets merged with the list on the left.
06:30 The same thing happens to SE. Next, E is checked, but E is a bad candidate because it’s the last class in the middle list.
06:43 HE, on the other hand, is the first, so the algorithm merges that instead. Once again, we are left with two Es, which get merged automatically.
06:55
The linearization for TemporarySecretary
is TemporarySecretary
, Secretary
, SalaryEmployee
, HourlyEmployee
, Employee
.
07:09 As a final note, there are cases where this algorithm can fail, and if that happens, Python will raise an exception. One example of this is with cyclical inheritance, which will ultimately result in infinite recursion and a stack overflow.
07:28 You can try running this code on the screen to see this for yourself.
07:34
If you’d like to practice this yourself, try generating the linearization of TemporarySecretary
when it inherits from HourlyEmployee
before Secretary
.
07:46
You can check your answer by printing the .__mro__
attribute of the TemporarySecretary
class.
nizamraza920 on April 29, 2020
last example form cyclical inheritance how?
theramstoss on June 10, 2020
because when you work out the linearization algorithm, you get:
L[y] = [Y] + merge(L(object), L(X), [object, x]) = [y] + merge([object], [X, object], [object, X])
Since X and object are both the last element of one of the lists, none of the elements can get merged.
petrovicdjordje750 on Jan. 6, 2021
Great video,
Could you please elaborate more how the merge function works or point me to some article about algorithm if you have some recommendations?
Didn’t quite understand this part:
04:42 It’s going to search the first item in the first list, called the head.
04:48 As long as that first item isn’t not the first item in the remaining lists, it will pull that item out of all the lists and merge it with the list on the very left
Does this mean that it will always merge the first item, as long as the first item is not in the first list and list count > 2?
Thank you
Nehemie on Sept. 6, 2022
Can I get some help with getting the MRO of TemporarySecretary with a different inheritance order? Pleaase see below where I get stuck. I can output what employees.TemporarySecretary’s MRO is*(<class 'employees.TemporarySecretary'>, <class 'employees.Secretary'>, <class 'employees.SalaryEmployee'>, <class 'employees.HourlyEmployee'>, <class 'employees.Employee'>, <class 'object'>)*
but how do I get that result? What’s the rulewhen we do not have a good candidate like in the previous examples?
# MRO practice
TemporarySecretary(Secretary, HourlyEmployee)
L(TS) = [TS] + merge(L(S), L(HE))
[TS ] + merge([S, SE, E], [HE, E])
[ TS,
Nehemie on Sept. 6, 2022
Please ignore my comment above
Nehemie on Sept. 6, 2022
Here is how I finally managed to get it right
TemporarySecretary(HourlyEmployee, Secretary)
L(TS) = [TS] + merge(L(HE), L(S), [HE, S])
= [TS ] + merge([HE, E], [S, SE, E], [HE, S])
= [ TS, HE ] + merge([E], [S, SE, E], [S])
= [ TS, HE, S ] + merge([E], [SE, E])
= [ TS, HE, S, SE ] + merge([E])
= [ TS, HE, S, SE, E ]
Ariba S on June 6, 2024
The double negative comment on the merge for secretary is a bit confusing :)
Isaac P on Nov. 5, 2024
It’s not clear for me what is the cyclic problem of C3 algorithm in example:
class X(object):
class Y(X, object)
Why not to resolve it as following?
L(Y) = [Y] + merge(L(object), L(X))
= [Y] + merge([object], [X,object]) #object is not the first in 2'nd
= [Y,X] + merge([object])
= [Y,X,object]
Become a Member to join the conversation.
Matheus Rosa on April 21, 2020
In Python 3.8.0, it is not even allowed to create the
Y
class in this example:Thanks for this class, Austin. It is indeed very interesting and really cool to know about this!