The is Operator
In this lesson, I’ll take you through how to compare the identities of two objects with the
is and is not operators.
So, why do I say that
is compares identity rather than equality? Well, in Python, or at least in CPython—the reference implementation of Python that you’re probably using—identity refers to the memory address at which an object is stored.
00:25 And you can think of this as kind of analogous to thinking about the identities of two people or something like that, right? Even if you have a pair of identical twins or two people who look really similar, they might be equal in many features, right?
They might have equally-sized noses and length of hair or something, but they’re still not the same person. They don’t have the same identity, right? The way that Python defines identity is simply by memory address, or ID number, and I’ll inspect that and show you how that works under the hood with the
id() function later in the terminal. Importantly, this is different from the variable name, which points to an object. Because you might have, you know, 10, 20, 50, a hundred, a million different variables, which all reference actually the same underlying object, like a list or a string or something like that. To continue the people analogy, you might have one person that goes by many different names, right?
01:21 You might have a person named Jonathan who goes by the name Jack with his friends, by the name John at work, and by the name Jonathan with his parents, right?
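To make the one-person-many-names idea concrete, here is a minimal sketch. The variable names (`jonathan`, `jack`, `john`) and the list contents are made up for illustration and are not part of the lesson's terminal session.

```python
# One list object, three names that all refer to it.
jonathan = ["reads", "codes", "sleeps"]
jack = jonathan  # no copy is made; jack is another name for the same object
john = jonathan

# All three names share one identity.
same_identity = jonathan is jack and jack is john
print(same_identity)
print(id(jonathan) == id(jack) == id(john))

# A change made through one name is visible through every other name.
jack.append("runs")
print(john)
```

Assignment only binds a new name to the existing object, which is why the change made through `jack` shows up when you print `john`.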
So you can have many different variable names, but you can only have one identity, and that is one memory address. With all that in mind, what should you actually use the
is operator for?
And the is not operator, its counterpart? Well, you should use it to compare with
None, and many people in the Python community would go so far as to say that’s the only use case for it.
I don’t go quite that far, I think that sometimes it can be useful to compare the memory addresses of objects—maybe you’re debugging, maybe you have a very specific program that really needs to work with these memory addresses—but in general usage, you should use it to compare with
None pretty much exclusively.
Let’s take a look at how all this works in the terminal. Well, I’ll need some variables to operate on, so I’ll have
a, which has the content
"This is a string", and then I’m going to have
b, and it’s actually also just going to say,
"This is a string".
To illustrate the difference between equality and identity, first, let’s take a look at
a == b. This is
True because the two strings have the same value.
They both say
"This is a string", right? So they’re equal to one another.
But if you say
a is b, you get
False, because a and b were declared separately from one another, right? So they’re technically separate objects even though they have the same value.
And you can check this by saying
id(a) and id(b), and those are two different numbers, meaning they are two separate objects in memory.
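The equality-versus-identity check can also be sketched in a script. This sketch swaps the strings for lists on purpose: when identical string literals appear in the same script, CPython may store them as a single object, so the string version is only dependable when typed line by line in the REPL, as in the lesson. The names `first` and `second` are illustrative.

```python
# Two lists built separately, with the same contents.
first = [1, 2, 3]
second = [1, 2, 3]

equal = first == second      # compares values
identical = first is second  # compares identities
print(equal)
print(identical)
print(id(first), id(second))
```

The two `id()` values differ from run to run, so only the fact that they are different from each other carries any meaning.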
And be careful when you look at these IDs, because you might be tempted to think that because the ID of
b is larger, that means it was declared later than
a. This may or may not be true, and it’s certainly not always the case. There’s a complex allocation process under the hood that determines those ID numbers, so don’t rely on anything about the ID number itself to tell you about the order of declaration. But you might get confused if you do something like this.
You might say
x = 20,
y = 20. And in this case, of course, they’re equal. And if you do
x is y, you actually also get
True, and so that might confuse you if you’ve just watched the first part of this lesson, because you’d think, “Well, they’re declared separately, right?
So they should be different objects, so the
is relation shouldn’t be satisfied.” Well, this is because of a cool feature of the Python interpreter under the hood, which is called interning. And the numbers from
-5 (and I’ll put this in a comment just so that you can see it) to
256 are interned by default. What does this mean?
04:06 It means that each value in this range has a distinct memory location that it occupies. And every variable that’s declared to be equal to that value—or assigned to that value is maybe more accurate to say—each of those variables actually just points to the same underlying memory address.
This is one of the ways that the Python interpreter optimizes because these numbers are pretty small and so they’re used really often in code, but if you did two integers and you said
a = 257 and
b = 257 and you said
a is b, that’s actually
False because those are not included in the interned numbers.
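Here is a hedged sketch of the small-integer behavior for a script rather than the REPL. It assumes CPython, and the cached range is an implementation detail that other interpreters or versions may not share. The integers are built with `int()` at run time, a choice made for this sketch so that the compiler cannot merge equal literals into one constant.

```python
# Build the integers at run time so equal literals are not merged
# by the compiler into a single constant object.
small_a = int("256")
small_b = int("256")
large_a = int("257")
large_b = int("257")

small_shared = small_a is small_b  # inside the range the lesson describes
large_shared = large_a is large_b  # just outside that range
print(small_shared)
print(large_shared)
print(large_a == large_b)  # equality holds regardless of identity
```

Whatever the identity checks print on your interpreter, the equality check is the one to rely on in real code.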
04:49 And this happens as well for various small strings that are often used, and it’s difficult to tell sometimes what those strings are, but they might be interned when they’re really frequently used.
You can see how this works by importing the intern() function
from sys. That teaches me not to talk and code at the same time.
from sys import intern.
05:13 So, those strings that we used earlier, let’s reinstantiate those because in a silly fashion, I wrote over them.
But if you have two strings
"This is a string",
as I showed you earlier, the
is relationship is not satisfied. But if I say
a = intern(a) and
b = intern(b), then all of a sudden the ID of
a, and the ID of
b are the same, and so
a is b is now
True. And that’s because what the
intern() function does is exactly what the interpreter does—it interns this
a and puts it at a specific location in memory, and then any other variables which are declared or interned which have the same value as
a point to that exact same location in memory.
06:01 So this is an optimization tactic, and if you’re going to declare or work with many different variables, all of which have the same value, it might be useful to you to intern that value just so that you get a little more speed.
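A minimal sketch of the `intern()` step, assuming CPython. The two strings are assembled with `join()`, which is a choice made for this sketch rather than what the lesson types, so that they start out as separately built objects even inside a script.

```python
from sys import intern

# Build both strings at run time so they begin as separate objects.
a = " ".join(["This", "is", "a", "string"])
b = " ".join(["This", "is", "a", "string"])
shared_before = a is b

# intern() returns the single shared copy for a given string value.
a = intern(a)
b = intern(b)
shared_after = a is b

print(shared_before)
print(shared_after)
```

After both names are rebound to the value returned by `intern()`, they refer to the same object, which is what makes the identity check succeed.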
So, I’ve taken you through all this, but I haven’t yet shown you the actual use case that I told you was best for the
is and is not operators, which is comparing with
None. I’ll show you that now.
a is of course not equal to
None, so you get False for
a is None, and that’s because
a has a value
"This is a string".
And so if I say
a is not None, then I get
True, right? But it might not be obvious why this is useful or why you might want to compare with
None. Well, I’ll come up with a contrived example and then I’ll leave you to kind of extrapolate. So, imagine you’re making a web crawler, and so you had a function that gave you a list of web addresses, and those were the web addresses maybe that had a certain picture on them or something.
You’re searching the web for a particular picture and you want all of the addresses which had that picture. So I’ll say
address1. Actually, let’s make them strings.
["address1.com", "address2.net"]. But when you’re crawling the web, sometimes you can’t get a response from a particular website.
So this will probably have some
None values in it. And then you’ll have
You definitely have a
None in here, and so if you just try to say
for address in address:,
and then maybe you want to do some string manipulation on it, right? Maybe you say
print(address.split()) on the period (".").
You want to split those up so that you can get both the domain name and then the
".net" reference there. So this would be great except—oh, I’m sorry. I said
for address in address, so be careful typing and coding—or, talking and coding at the same time is a very dangerous game. But the error that I wanted to show you was this
'NoneType' object has no attribute 'split'.
So if you’re trying to operate on all of these things but some of them are
None because your other function had to return
None in some cases, then you’re going to run into issues.
So what you need to do instead is say
for address in web_addresses:
if address is not None: then you know you’re safe to print the
address.split("."), and now you get exactly what I wanted, which was the actual address part of it and then the
"net", or suffix of the web address.
So, that’s one example, and this is a problem that comes up a lot when you’re working with real-world data, is that you can’t always get the data that you want and sometimes you have to have things like
None in there to kind of fill the empty space.
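The failure and the fix described above can be sketched together. The contents of `web_addresses`, including where the `None` sits, are assumed for illustration, since the exact list from the terminal session is not shown here.

```python
# Hypothetical crawl results; None marks a site that did not respond.
web_addresses = ["address1.com", None, "address2.net"]

# Without a guard, calling a string method on None raises AttributeError.
try:
    for address in web_addresses:
        address.split(".")
except AttributeError as error:
    failure = str(error)
    print(failure)

# With the guard, only real strings are split.
parts = []
for address in web_addresses:
    if address is not None:
        parts.append(address.split("."))
print(parts)
```

The guard uses `is not None` rather than a plain truthiness test, so an empty string would still be treated as a real (if useless) address instead of being silently skipped.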
08:56 But often, you’ll want to either ignore them or treat them in a special fashion, and so you have to compare. And so I’ll just do one more quick example.
I’ll add in an
else clause. You could print, you know,
"Web address not reachable".
Right? You can just say something like that, that might be nice. You have one here,
Web address not reachable. So you know at least that this one address couldn’t be reached, and of course you would have to do some more work so that you know which address you were actually trying to reach. But regardless of all that, that’s a use case for this
is not None comparison, which is really, for most cases, the only use case that you need to use this
is not operator for. All the stuff with memory addresses is super cool and super interesting and it’s fun to think about why it is the way it is, but when you’re doing kind of casual or more general purpose programming, you’ll probably just need the
is operator to compare with None.
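To wrap up the crawler example, here is a sketch of the loop with the else branch included. The list contents are assumed for illustration, and collecting the results in a `messages` list is a choice made for this sketch so that the outcome is easy to inspect.

```python
# Hypothetical crawl results; None marks a site that did not respond.
web_addresses = ["address1.com", None, "address2.net"]

messages = []
for address in web_addresses:
    if address is not None:
        messages.append(address.split("."))
    else:
        # Treat the missing value in a special fashion instead of crashing.
        messages.append("Web address not reachable")

for message in messages:
    print(message)
```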
@jamesbrown68 hmm, that’s really interesting! I don’t get that behavior with my REPL, and interning is generally only preserved over each invocation of Python, so I’m not sure what’s causing that to happen. Out of curiosity, what OS and REPL are you running?
I’m on Windows 10, Python 3.8.1.
python
>>> x = 20
>>> id(x)
1617684704
exit()
python
>>> x = 20
>>> id(x)
1617684704
Wow @jamesbrown68, I have to admit this one stumps me. I wonder if there’s some system setting in your OS that’s causing Python to run in exactly the same way every time?
Wondering why I’m getting this output?
[root@vedang]# python
Python 2.7.5 (default, Aug 7 2019, 00:51:29)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-39)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> a = "hello"
>>> b = "hello"
>>> a is b
True
>>> a == b
True
>>> id(a)
140575040349312
>>> id(b)
140575040349312
>>>
the interning feature seems random?
“Interning is an implementation-dependent optimization that depends on many factors. It can be interesting to understand how it works, but never depend on it working any particular way.” (Source)
i have not been able to recreate the above example.
a='bro!'
b='bro!'
print(a is b)
a='bro '
b='bro '
print(a is b)
True
True
[Program finished]
a='bro'
b='bro'
print(a is b)
a=a+'!'
b=b+'!'
print(a is b)
True
False
[Program finished]
is this use case safe?
if a is True:
Whether it’s safe depends on what you’re doing within the
if statement. But checking if an object is
True is common in Python. However, if you are only checking for the presence of the object
a, then you can shorten it to just:
if a:. If a is a boolean, and you want to check its truthiness, then you should use
== instead of
is. Hope that helps.
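For anyone comparing the three checks mentioned in this reply, here is a small sketch of how they differ. The helper name `describe` and the sample values are made up for illustration.

```python
def describe(value):
    """Return how `value` fares under three common checks."""
    return {
        "is True": value is True,   # identity with the True singleton
        "== True": value == True,   # equality; shown only for comparison
        "truthy": bool(value),      # what a plain `if value:` tests
    }


for candidate in (True, 1, "text", [], None):
    print(repr(candidate), describe(candidate))
```

Only the actual `True` object passes the identity check, while `1` is equal to `True` without being identical to it, and a non-empty string is truthy without being equal to it. Which check is appropriate depends on which of those three questions you are really asking.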
jamesbrown68 on March 31, 2020
About the interned numbers (am I spelling that right?): I noticed that the id for ‘x = 20’ was the same, even after I exited Python and started a new REPL. I was expecting the assignments to occur when the REPL started, but I suppose not. So what’s determining the ids for -5 to 256? My OS?