How pandas Uses Boolean Operators
As filtering can be a bit tricky in pandas, you’ll learn in this lesson how pandas uses boolean operators.
00:00 Pandas can be a little tricky when filtering, so in this video you’re going to learn how Pandas uses Boolean operators. You may remember from math class PEMDAS, or the order of operations.
00:11
This is a set of rules that defines the order in which mathematical operations occur. Python has its own order of operations, and the ones we’re concerned with are listed down here in order of increasing precedence. So first we have the Boolean operators, which are `and`, `not`, and `or`, which are evaluated after the comparison operators (`<`, `<=`, and so on), which are evaluated after the bitwise operators: and (`&`), or (`|`), and the inverse or complement operator (`~`). Pandas doesn’t use these Boolean operators and instead opts for these bitwise operators. Let’s open up a terminal and see this in action.
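As a rough illustration of that ordering, here is a minimal sketch using plain integers rather than pandas objects. The variable names are made up for this example.

```python
# & binds more tightly than <, so the bitwise step runs first.
unparenthesized = 1 < 2 & 3      # parsed as 1 < (2 & 3), and 2 & 3 is 2
parenthesized = (1 < 2) & 3      # True & 3, which is the integer 1

print(unparenthesized)  # True
print(parenthesized)    # 1

# and binds less tightly than <, so the comparison is grouped first.
boolean_version = 1 < 2 and 3    # parsed as (1 < 2) and 3
print(boolean_version)  # 3
```

The same expression gives a different result depending on where the parentheses go, which is the effect the rest of the lesson relies on.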
00:52
I’ve got my terminal here, I’m going to start the Python interpreter, and `import pandas as pd`. Okay. Starting with some more simple data, we can say `4 < 3 and 5 > 4`.
01:09
And because this will evaluate, and then this section will evaluate, and then they’ll be compared, we should end up with `False and True`, which should evaluate to `False`. If you throw some parentheses into this, you can change it a little bit.
01:23
So let’s say something like `(3 and 5)` in parentheses and then type the same thing out. And now this evaluates to `True`. This is actually because of short-circuit evaluation, where the `3` and the `5` actually evaluate to `5`, because it’s the last argument in this statement.
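To check what `and` actually does with these values, here is a small sketch that can be pasted into the interpreter. The `record` helper and the `calls` list are hypothetical names added only to show which operands get evaluated.

```python
calls = []

def record(label, value):
    """Hypothetical helper: note that an operand was evaluated, then return it."""
    calls.append(label)
    return value

# and returns its first falsy operand, or the last operand if all are truthy.
print(3 and 5)   # 5
print(0 and 5)   # 0

# A chained comparison such as 4 < 5 > 4 behaves like (4 < 5) and (5 > 4).
print(4 < (3 and 5) > 4)   # True

# Short-circuiting: the right side is skipped once the left side is falsy.
result = record("left", 4 < 3) and record("right", 5 > 4)
print(result)  # False
print(calls)   # ['left']
```

In `3 and 5`, both operands are evaluated because `3` is truthy, so nothing is skipped there. The skipping only shows up in the last expression, where `4 < 3` is already `False`.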
01:40
So then this whole section, then, is `4 < 5 > 4`, which is `True`. Let’s take a look at a Pandas-specific example. So take a Pandas `Series` that you can set equal to `[True, True, False]`, and then using the bitwise `&`, compare it to another `Series` that is `[True, False, False]`. When you run this, you should get another `Series` that is `True`, `False`, and `False`.
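The `Series` comparison described here can be reproduced with a short sketch like the following. The names `left`, `right`, and `combined` are chosen for this example and are not from the video.

```python
import pandas as pd

left = pd.Series([True, True, False])
right = pd.Series([True, False, False])

# & pairs up the items by position and combines each pair.
combined = left & right
print(combined.tolist())  # [True, False, False]

# | and ~ work element by element in the same way.
print((left | right).tolist())  # [True, True, False]
print((~left).tolist())         # [False, False, True]
```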
02:09
If you tried to run this with `and`,
02:14
like this, you’d end up with an error that says `The truth value of a Series is ambiguous.`
So let’s take a look at these two different statements.
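A minimal sketch of the failing `and` version, assuming the same two example `Series`, might look like this. The `try`/`except` is only there so the snippet keeps running after the error.

```python
import pandas as pd

left = pd.Series([True, True, False])
right = pd.Series([True, False, False])

message = ""
try:
    left and right  # and first asks for bool(left), which pandas refuses
except ValueError as error:
    message = str(error)
    print(message)

# Reducing a Series to one value explicitly removes the ambiguity.
print(left.any())  # True: at least one item is True
print(left.all())  # False: not every item is True
```

Calling `.any()` or `.all()` answers a different question than an element-by-element `&`, so these are shown only to illustrate why a whole `Series` has no single truth value.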
02:27
The bitwise operators actually compare the bits of a value as opposed to how truthy or falsy the value is. So as a bit of an analogy, you can think of the bits as being each piece in the `Series`. So if you compare the first items, they’re both `True`, so `True and True` is `True`.
02:46
`True and False` is `False`, and `False and False` is `False`. Don’t try to say that five times fast. Down here, however, it attempts to compare the entire `Series` with the entire `Series`, so if you had a `Series` that was `True`, `True`, and `False`, how do you determine if that’s actually `True` or `False`?
03:04
And that’s what this error is trying to say: that it’s ambiguous. Let’s look at this a different way. Create a new `Series` called `s`,
03:15
and set this equal to a `Series` that’s just a `range(10)`,
03:21
so this will just be a set of numbers like so. You may think you’d be able to do an operation like `s % 2 == 0 & s > 3` as a way to find even numbers that are larger than `3`. But if you run this, you end up with that same error.
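The failing expression can be reproduced with a sketch along these lines. The `failed` flag is a made-up name used only to record that the error occurred.

```python
import pandas as pd

s = pd.Series(range(10))

failed = False
try:
    # No parentheses: Python groups this as (s % 2) == (0 & s) > 3.
    s % 2 == 0 & s > 3
except ValueError as error:
    failed = True
    print(error)

print(failed)  # True
```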
03:40 Let’s try to break down exactly what’s happening here. Let’s go back to the statement and start putting some parentheses in to see what’s going on. First, we can say that this operation is occurring,
03:55
because the modulo operator (`%`) has the tightest binding. Next, this `&` statement
04:02
is the next most tightly bound operator, because it’s that bitwise `&`. Using logic, you can expand that statement by putting an `and` here and saying another `(0 & s)`, like that.
04:19
And because the actual Boolean `and` is the least binding, you can then group these like so.
04:34
Because we didn’t change the order of anything, if you try to run this, you’ll get that same `truth value […] is ambiguous` error, and that’s because you can think of this as trying to evaluate each of these pieces and then compare the resulting `Series` with the resulting `Series`.
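One way to double-check this grouping without running any pandas code is to ask Python’s own parser. This is a sketch using the standard-library `ast` module, and `ast.unparse` assumes Python 3.9 or later.

```python
import ast

tree = ast.parse("s % 2 == 0 & s > 3", mode="eval")
comparison = tree.body

# The whole expression is one chained comparison with two operators.
print(type(comparison).__name__)                     # Compare
print([type(op).__name__ for op in comparison.ops])  # ['Eq', 'Gt']

# The left side is s % 2, and the middle operand is 0 & s.
print(ast.unparse(comparison.left))            # s % 2
print(ast.unparse(comparison.comparators[0]))  # 0 & s
print(ast.unparse(comparison.comparators[1]))  # 3
```

A chained comparison is evaluated as if its two halves were joined by `and`, which is where the truth value of a `Series` ends up being requested.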
04:50 So let’s go back to that original statement. This bitwise operator here makes sense because we want to do an element-by-element comparison of each of these statements. The problem is, because of the order of operations, this is evaluated before these comparison operators.
05:08 So, to fix this, right from the start just wrap these in parentheses.
05:17
Now if you run this, you can see that the even numbers that are greater than `3` evaluate to `True`. What you should take away from here is that if you run into a `ValueError` while doing Boolean indexing, there’s a chance that you just need to add some more parentheses. Using these bitwise operators instead of `and`, `or`, or `not` changes the order in which they’re evaluated, which means that your comparison operations that used to have precedence now occur after the bitwise operator. So when in doubt, parentheses it out.
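The parenthesized version can be sketched as follows. The name `mask` is an assumption made for this example, and the last line shows how the result might be used for Boolean indexing.

```python
import pandas as pd

s = pd.Series(range(10))

# Each comparison is wrapped so it runs before the bitwise &.
mask = (s % 2 == 0) & (s > 3)
print(mask.tolist())
# [False, False, False, False, True, False, True, False, True, False]

# Using the mask as a filter keeps only the matching values.
print(s[mask].tolist())  # [4, 6, 8]
```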
Matt Williams on Aug. 18, 2020
I don’t know that the statement above is correct, because if you perform the evaluation `5 < 3 or 3 < 4`, the output is `True`. Obviously the interpreter looked at both numerical comparisons in the operation, otherwise it would have stopped at `5 < 3` and returned false, as rworreby is suggesting above.
Dan Bader RP Team on Aug. 18, 2020
Thanks for flagging this @rworreby. What’s referred to as “short-circuit evaluation” in the video around the 1:30 mark isn’t actually short-circuit evaluation but the evaluation of a nested boolean expression.
We’ll get the terminology cleared up with the next update of this course. Sorry about any confusion this may have caused.
In the meantime, please check out the following resources for a deeper look at short-circuit evaluation of boolean expressions in Python:
rworreby on May 27, 2020
I actually want to point out a section that is wrong in this video, starting from 1:00 to 1:45:
As Python uses short-circuit evaluation for algebraic expressions, the expression in the video (`4 < 3 and 5 > 4`) would be treated the following way: `4 < 3` is False and therefore the expression would return False immediately (as “False and ” is always false). The statement in the video that `4 < 3` is evaluated, then `5 > 3` is evaluated and then both compared (???) is therefore wrong.
In the next section, short-circuit evaluation is mentioned, however, in the example that is given, there is no short-circuit evaluation, as it’s a nested comparison. It is also wrong that `(3 and 5)` “just” evaluates to 5. In a logical test, the result is the last evaluated statement. It’s not the last argument.
As a reference, check the following things:
I think in the video there is a wrong understanding of what short-circuit evaluation is. Truth tables are a thing to check out, but applied to boolean logic in Python these are the base cases of short-circuit evaluation: