The is Operator
So, why do I say that
is compares identity rather than equality? Well, in Python, or at least in CPython—the reference implementation of Python that you’re probably using—identity refers to the memory address at which an object is stored.
00:25 And you can think of this as kind of analogous to thinking about the identities of two people or something like that, right? Even if you have a pair of identical twins or two people who look really similar, they might be equal in many features, right?
They might have equally-sized noses and length of hair or something, but they’re still not the same person. They don’t have the same identity, right? The way that Python defines identity is simply by memory address, or ID number, and I’ll inspect that and show you how that works under the hood with the
id() function later in the terminal. Importantly, this is different from the variable name, which points to an object. You might have, you know, 10, 20, 50, a hundred, a million different variables that all actually reference the same underlying object, like a list or a string or something like that. To continue the people analogy, you might have one person that goes by many different names, right?
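To make the difference between names and objects concrete, here is a minimal sketch. The variable names `first`, `second`, and `alias` are made up for illustration and are not from the lesson.

```python
# Two separately built lists with equal contents
first = [1, 2, 3]
second = [1, 2, 3]

# A second name for the object that `first` already points to
alias = first

print(first == second)         # True: equal contents
print(first is second)         # False: two distinct objects
print(alias is first)          # True: one object, two names
print(id(alias) == id(first))  # True: same identity

# A change made through one name is visible through the other
alias.append(4)
print(first)                   # [1, 2, 3, 4]
```

Equality looks at the contents, while `is` and `id()` only ask whether two names refer to the very same object.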
I don’t go quite that far, I think that sometimes it can be useful to compare the memory addresses of objects—maybe you’re debugging, maybe you have a very specific program that really needs to work with these memory addresses—but in general usage, you should use it to compare with
None pretty much exclusively.
Let’s take a look at how all this works in the terminal. Well, I’ll need some variables to operate on, so I’ll have
a, which has the content
"This is a string", and then I’m going to have
b, and it’s actually also just going to say,
"This is a string".
And be careful when you look at these IDs, because you might be tempted to think that because the ID of
b is larger, that means it was declared later than
a. That may or may not be true, and it certainly isn’t always the case. There’s a complex algorithm under the hood for how those ID numbers are generated, so don’t rely on anything about the ID number itself to tell you about the declaration pattern. But you might get confused if you do something like this.
You might say
x = 20,
y = 20. And in this case, of course, they’re equal. And if you do
x is y, you actually also get
True, and so that might confuse you if you’ve just watched the first part of this lesson, because you’d think, “Well, they’re declared separately, right?
So they should be different objects, so the
is relation shouldn’t be satisfied.” Well, this is because of a cool feature of the Python interpreter under the hood, which is called interning. And the numbers from
-5 through 256 (I’ll put this in a comment just so that you can see it) are interned by default. What does this mean?
04:06 It means that each value in this range has a distinct memory location that it occupies. And every variable that’s declared to be equal to that value—or assigned to that value is maybe more accurate to say—each of those variables actually just points to the same underlying memory address.
This is one of the ways that the Python interpreter optimizes because these numbers are pretty small and so they’re used really often in code, but if you did two integers and you said
a = 257 and
b = 257 and you said
a is b, that’s actually
False because those are not included in the interned numbers.
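Here is a small sketch of that behavior. It relies on a CPython implementation detail, so treat the printed identity results as typical rather than guaranteed. The integers are built at runtime with `int()` because, when identical literals appear in the same piece of code, the compiler may reuse one object for them and hide the effect. The variable names are made up for illustration.

```python
# Build the integers at runtime so the compiler can't share a single literal
small_a = int("20")
small_b = int("20")
big_a = int("257")
big_b = int("257")

# Inside the -5 to 256 range, CPython hands back the same cached object
print(small_a == small_b, small_a is small_b)

# Outside that range, each call typically creates a fresh object
print(big_a == big_b, big_a is big_b)
```

On CPython the first line typically prints `True True` and the second `True False`, which is one more reason to compare numbers with `==` rather than `is`.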
04:49 And this happens as well for various small strings that are often used, and it’s difficult to tell sometimes what those strings are, but they might be interned when they’re really frequently used.
as I showed you earlier, the
is relationship is not satisfied. But if I say
a = intern(a) and
b = intern(b), then all of a sudden the ID of
a, and the ID of
b are the same, and so
a is b is now True.
And that’s because what the
intern() function does is exactly what the interpreter does—it interns this
a and puts it at a specific location in memory, and then any other variables which are declared or interned which have the same value as
a point to that exact same location in memory.
06:01 So this is an optimization tactic, and if you’re going to declare or work with many different variables, all of which have the same value, it might be useful to you to intern that value just so that you get a little more speed.
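The interning steps can be sketched like this. In Python 3 the function lives in the `sys` module, so the sketch calls it as `sys.intern()`. The two strings are assembled at runtime with `join()`, an assumption made here so that they start out as separate objects instead of one shared literal.

```python
import sys

# Build two equal strings at runtime so they start out as separate objects
a = " ".join(["This", "is", "a", "string"])
b = " ".join(["This", "is", "a", "string"])

print(a == b)  # True: same characters
print(a is b)  # usually False on CPython: two separate objects

# sys.intern() returns the single canonical copy for this value
a = sys.intern(a)
b = sys.intern(b)

print(a is b)          # True: both names now point to the interned copy
print(id(a) == id(b))  # True
```

After interning, an identity check is enough to tell that the two strings are equal, which is where the small speed gain comes from.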
So, I’ve taken you through all this, but I haven’t yet shown you the actual use case that I told you was best for the
is and is not operators, which is comparing with
None. I’ll show you that now.
And so if I say
a is not None, then I get
True, right? But it might not be obvious why this is useful or why you might want to compare with
None. Well, I’ll come up with a contrived example and then I’ll leave you to kind of extrapolate. So, imagine you’re making a web crawler, and so you had a function that gave you a list of web addresses, and those were the web addresses maybe that had a certain picture on them or something.
You’re searching the web for a particular picture and you want all of the addresses which had that picture. So I’ll say
address1—let’s actually make them strings.
["address1.com", "address2.net"]. But when you’re crawling the web, you might very well find that sometimes you can’t get a response from a particular website.
You want to split those up so that you can get both the domain name and then the
".net" reference there. So this would be great except—oh, I’m sorry. I said
for address in address, so be careful typing and coding—or, talking and coding at the same time is a very dangerous game. But the error that I wanted to show you was this
'NoneType' object has no attribute 'split'.
So what you need to do instead is say
for address in web_addresses:
if address is not None: then you know you’re safe to print the
address.split("."), and now you get exactly what I wanted, which was the actual address part of it and then the
"net", or suffix of the web address.
So, that’s one example, and this is a problem that comes up a lot when you’re working with real-world data, is that you can’t always get the data that you want and sometimes you have to have things like
NoneType in there to kind of fill the empty space.
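The failing loop and the guarded loop can be sketched as follows. The exact contents of `web_addresses`, including where `None` sits, the `split_addresses` list, and the `else` branch that reports an unreachable address are assumptions made for illustration.

```python
# Hypothetical crawl results; None marks a site that gave no response
web_addresses = ["address1.com", None, "address2.net"]

# Without a check, the None entry raises an AttributeError
try:
    for address in web_addresses:
        print(address.split("."))
except AttributeError as error:
    print(error)

# With the identity check, only real strings are split
split_addresses = []
for address in web_addresses:
    if address is not None:
        split_addresses.append(address.split("."))
    else:
        print("Web address not reachable")

print(split_addresses)
```

The first loop stops at the `None` entry and prints the `'NoneType' object has no attribute 'split'` message, while the second loop skips it and splits the two real addresses.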
Right? You can just say something like that, that might be nice. You have one here,
Web address not reachable. So you know at least that this one address couldn’t be reached, and of course you would have to do some more work to know which address you were actually trying to reach. But regardless of all that, that’s a use case for this
is not None comparison, which is really, for most cases, the only use case that you need to use this
is not operator for. All the stuff with memory addresses is super cool and super interesting and it’s fun to think about why it is the way it is, but when you’re doing kind of casual or more general purpose programming, you’ll probably just need the
is operator to compare with None.