Using the bisect Module
In this lesson, I’m going to show you how to use the
bisect module, which is in-built in Python. And the reason I’m showing this to you before showing you how to implement binary search on your own in Python is just to impress upon you the importance of using library modules that are verified and vetted for performance and security before you necessarily reinvent the wheel. Now, of course, there are reasons to actually implement binary search on your own, and I’ll talk about those when I get into doing so. But for the moment, let’s take a look at
bisect module performs the same work as a binary search function, but it does it in a little bit of an interesting way. What
bisect() is concerned with is finding the index at which an item could be inserted in the list to maintain the sorted order.
bisect() function, aka
bisect_right()—because this is really just an alias for
bisect_right()—when you call it on an item that is in the list, it will actually return the index after that item, because it’s the index at which you could insert a new copy of this item, if you wanted to maintain sorted order. And in the case of duplicates, it returns the rightmost index.
bisect_left() is actually the function that will generate the index of existing items in the list, because it returns the leftmost possible index that you could insert an item at to maintain sorted order. Now, I realize that may not be entirely clear, but I’ll demonstrate in the REPL how this works, and I think it will become a lot more obvious.
And then the
insort functions are essentially just functions that call
bisect() and then actually insert the item at the index found. So, there’s an
insort_right() and an
insort_left() just like there’s a
01:46 So, let’s see these in action.
Here, I’ve imported
bisect, and I’ve actually defined beforehand a list of fruits that has an
'apple', three copies of the string
'pineapple'. So as you can see, this is in sorted order—
'p'—and so that means that I can use
bisect. An important feature of
bisect is that it doesn’t enforce this sorted order, so if you call it on a list that’s not already sorted, it will probably give you an odd behavior. And so you don’t want to do that.
Let’s see what happens when I call
bisect.bisect() on the string
'banana' in the
fruits array. So,
param x—the list and then the actual item to find—so
fruits and then
It’s going to return the index
3, and then
4. So it says, “If I insert
'banana' at index
4, pushing these two over an index, then I’ll still have a list that’s in sorted order.” And this is exactly the same as
bisect.bisect_right()—It actually returns exactly the same thing—but if I call
bisect_left(), I get a bit of a different behavior. I get the index
1, which interestingly enough is the first index of that string in the list.
Because I could insert at this index, and now that copy of
'banana' would be the leftmost instance of
'banana' and the whole list would still be in sorted order.
So that’s the key to understanding
bisect, is it’s concerned with “Where can I put stuff to make sure that this list maintains sorted order?” And when there are duplicate copies, that’ll be on either side with
Interestingly enough, if I call
bisect_right() and then I subtract the call on
bisect_left() with the same parameters,
then I’ll actually get the number of occurrences of this item here, which is in fact
3. So that’s kind of a cool, random feature of
bisect that you might be able to use.
So now, if I call
bisect.insort and let’s call
fruits, and then let’s say
"kiwi" just for fun, and see what it does.
bisect.insort_left(fruits, "kiwi"). If I print out
fruits, it will put
'kiwi' right after
Now let’s put in one more copy of
and you’ll see—of course it won’t be entirely clear what side that it actually put it on because these are strings, so there’s no obvious way to see the differences here, but it did in fact put it on the left. So now you might be wondering “Why, in fact, are there these
insort_left() if it’s never going to be obvious that you’ve actually put it on either side?” Well, the reason for that is when you start to have more complex objects, you might have objects that compare equal to one another, but are in fact different objects. That is, they occupy different memory locations, but they consist of the same data that’s compared for equality.
And if you’re interested, I’d encourage you to check out the tutorials on Real Python about the
is and equals (
==) comparison operators, because they’ll give you a little bit more context as to why this distinction between
insort_right() can actually be important in various cases.
05:07 But for now, just know that that is what these functions do, and this is how it looks in practice. In the next lesson, I’ll show you how to actually implement a binary search on your own without any help from library modules.
Become a Member to join the conversation.