Using the bisect Module
00:00
In this lesson, I’m going to show you how to use the bisect
module, which is in-built in Python. And the reason I’m showing this to you before showing you how to implement binary search on your own in Python is just to impress upon you the importance of using library modules that are verified and vetted for performance and security before you necessarily reinvent the wheel. Now, of course, there are reasons to actually implement binary search on your own, and I’ll talk about those when I get into doing so. But for the moment, let’s take a look at bisect
.
00:33
Now, the bisect
module performs the same work as a binary search function, but it does it in a little bit of an interesting way. What bisect()
is concerned with is finding the index at which an item could be inserted in the list to maintain the sorted order.
00:50
So, the bisect()
function, aka bisect_right()
—because this is really just an alias for bisect_right()
—when you call it on an item that is in the list, it will actually return the index after that item, because it’s the index at which you could insert a new copy of this item, if you wanted to maintain sorted order. And in the case of duplicates, it returns the rightmost index.
01:12
So bisect_left()
is actually the function that will generate the index of existing items in the list, because it returns the leftmost possible index that you could insert an item at to maintain sorted order. Now, I realize that may not be entirely clear, but I’ll demonstrate in the REPL how this works, and I think it will become a lot more obvious.
01:32
And then the insort
functions are essentially just functions that call bisect()
and then actually insert the item at the index found. So, there’s an insort_right()
and an insort_left()
just like there’s a bisect_right()
and bisect_left()
.
01:46 So, let’s see these in action.
01:49
Here, I’ve imported bisect
, and I’ve actually defined beforehand a list of fruits that has an 'apple'
, three copies of the string 'banana'
, then 'orange'
, and 'pineapple'
. So as you can see, this is in sorted order—'a'
, 'b'
, 'o'
, 'p'
—and so that means that I can use bisect
. An important feature of bisect
is that it doesn’t enforce this sorted order, so if you call it on a list that’s not already sorted, it will probably give you an odd behavior. And so you don’t want to do that.
02:20
Let’s see what happens when I call bisect.bisect()
on the string 'banana'
in the fruits
array. So, param a
, param x
—the list and then the actual item to find—so fruits
and then "banana"
.
02:37
It’s going to return the index 4
. 0
, 1
, 2
, 3
, and then 4
. So it says, “If I insert 'banana'
at index 4
, pushing these two over an index, then I’ll still have a list that’s in sorted order.” And this is exactly the same as bisect.bisect_right()
—It actually returns exactly the same thing—but if I call bisect_left()
, I get a bit of a different behavior. I get the index 1
, which interestingly enough is the first index of that string in the list.
03:04
Because I could insert at this index, and now that copy of 'banana'
would be the leftmost instance of 'banana'
and the whole list would still be in sorted order.
03:13
So that’s the key to understanding bisect
, is it’s concerned with “Where can I put stuff to make sure that this list maintains sorted order?” And when there are duplicate copies, that’ll be on either side with bisect_left()
and bisect_right()
.
03:25
Interestingly enough, if I call bisect_right()
and then I subtract the call on bisect_left()
with the same parameters,
03:34
then I’ll actually get the number of occurrences of this item here, which is in fact 3
. So that’s kind of a cool, random feature of bisect
that you might be able to use.
03:44
So now, if I call bisect.insort
and let’s call bisect.insort_left()
of fruits
, and then let’s say "kiwi"
just for fun, and see what it does.
03:56
So, bisect.insort_left(fruits, "kiwi")
. If I print out fruits
, it will put 'kiwi'
right after 'banana'
.
04:03
Now let’s put in one more copy of "banana"
here,
04:09
and you’ll see—of course it won’t be entirely clear what side that it actually put it on because these are strings, so there’s no obvious way to see the differences here, but it did in fact put it on the left. So now you might be wondering “Why, in fact, are there these bisect_right()
, and bisect_left()
and insort_right()
and insort_left()
if it’s never going to be obvious that you’ve actually put it on either side?” Well, the reason for that is when you start to have more complex objects, you might have objects that compare equal to one another, but are in fact different objects. That is, they occupy different memory locations, but they consist of the same data that’s compared for equality.
04:51
And if you’re interested, I’d encourage you to check out the tutorials on Real Python about the is
and equals (==
) comparison operators, because they’ll give you a little bit more context as to why this distinction between insort_left()
and insort_right()
can actually be important in various cases.
05:07 But for now, just know that that is what these functions do, and this is how it looks in practice. In the next lesson, I’ll show you how to actually implement a binary search on your own without any help from library modules.
Become a Member to join the conversation.