Implementing Binary Search
Let’s take it a look at this binary search function. I’m going to start out with an iterative approach, so I’m just going to call it
binary_iterative(), and it’s going to take in a list of
elements and then a
00:29 Okay. So with that boilerplate out of the way, first, you have to define the search space, which is what the binary search will actually decrease—hopefully, by around half—on each run-through of the algorithm.
Normally, the way that you can do that is you can say
right, and those are equal to the boundaries of the actual list that you’re searching, right? So the zeroth index and then the far-rightmost index.
Now, you have to have a termination condition. I like to say,
while left <= right. You can also do
while left < right and do a little bit more work later, but I like this because it essentially just says, “As long as there’s anything that you’re searching for at all, keep running.” And if there’s nothing left to search for—if your search base is
0—then you know to just kick out. So,
while left <= right: the first thing that you have to do is that you have to get a middle index, and a way that works pretty well to do that is to just say
(left + right), with integer division of
The first is if your middle element is equal to your
search_item. And this case is simple—you just return the
middle_dex, so you return the currently searched index. Then you have two more cases.
If middle element is less than your
search_item—remember that these have to be comparable and in sorted order—so if that’s the case, then you can discard the half of the elements that are less than or equal to the middle element. And a way to do that that’s really convenient is to just say
left is equal to the
middle_dex + 1—you just move the left bound over. Then you can say
elif middle_el—and really, you could just say
else here, because you’ve already checked for this one case and there’s nothing else.
But I’m going to say just for clarity’s sake
if middle_el > search_item: then you can say
right = middle_dex - 1. And then you just continue on into your loop. And if you exit out of the loop, you know that you’ve never returned a middle index, so you just know that you need to return
None. So, this works just fine.
As you can see, it returns
2—so that’s pretty nice. And then I can run it on
9 and it gets
4, which works just great. Run it on
12 and it returns
None, which isn’t printed out but it was in fact what was returned. Run it on
3, it gives me
0. It seems to be working just fine.
But if I modify
els just a little bit and I say
els +=, let’s just say
[9, 10], and then I run it again here on
5, I’ll actually get
els is now equal to this seven-element list that has the rightmost
5 in the middle, so that’s what gets picked first. So importantly, this iterative method that I’ve defined so far doesn’t make any guarantees about which index of an item that you get, which isn’t always desirable because, in general, you’d like your algorithms to be a little bit more deterministic.
So there are a couple of different ways that you can do this, and I’m going to do the simplest and the easiest of these methods. I’m going to define
binary_leftmost(), which is going to take in—just as you might expect—
elements and a
And one way you can handle this slight issue where you’re not sure if you’re going to get the leftmost, the rightmost, or somewhere in the middle, is you can first call your iterative algorithms—so you can get, let’s call it
tentative for the tentative index, is just the
binary_iterative() of those
elements and that
And then you say, first,
if tentative is None: then you just return
None because you know that’s not going to change. But otherwise, you can say
while tentative, or I should say,
while elements[tentative] == search_item:
then you just subtract from
tentative. So you say
tentative -= 1. And then finally, at the end, you just return
tentative + 1, because you know that this
while loop will cause you to go one too far, so you want to actually return the last one that you did. So
tentative + 1.
05:22 And this essentially just says, “Hey, I know that I’ve found an instance of the search item and now I just need to look left until I find the end of those search items.” This implementation actually has a slight bug in it.
was composed of copies of what you’re looking for. At the moment, there’s no guard to make sure that if you hit the leftmost index, the zeroth index, you stop because that’s about as far left as it’s possible to go. With this implementation you might run into negative indices and keep looping indefinitely. Of course, one super-simple way to fix this is to just add in a quick
and statement that says
and tentative >= 0.
And as you’ll notice, it returns
2, which is the correct index for the leftmost
5. But remember, of course, that the original binary search would also have returned the same index in this case, so I need to give it one more test here, which is to add
[9, 10], which was the case that this failed on earlier.
So, the original binary search would now return this middle index instead of this leftmost index, but
binary_leftmost() works just fine. So it seems to be working. On your own, of course, you should do a little bit more rigorous testing than I’ve shown in this demonstration, but that seems to fix the general problem. Next, I want to show you the recursive implementation of binary search, which is a little different than the iterative version, but follows pretty much the same logic in a different paradigm.
07:39 Recursive implementation of the binary search functions in much the same way and with exactly the same logic as the iterative approach, but what it lets you do is it lets you discard these left and right bounds, and instead use slices of your actual input array and then recursive calls to accomplish the same objective.
Let me type this out and then I’ll run you through it. So there it is, the recursive implementation of binary search. Now, the biggest thing you’ll probably notice is this
dex_mod parameter with a default value that I’ve now passed into
dex_mod parameter can be a little confusing, so I’ve made a slide to hopefully illustrate it a little bit more clearly. The root of the problem is that the recursive binary search algorithm can return the wrong index if you’re not careful.
And that’s because of what I’m going to call the sub-list indexing problem. So, take a look at how it works. Imagine that I’m searching this list for the element
5, which is at index
4 because of zero indexing.
But then I recursively call the function on the last three elements of the list as a sub-list. But of course, if you look at this new list,
5 is at the index
0 in this sub-list, so its sub-list index is
0, but its real index is
4 in the main list.
And without some kind of parameter to keep track of the things that you’ve discarded, there is no way for a new function call to know that you’ve already discarded four elements and so that that needs to be added to the index of
So, whenever you recurse on a right-hand half of the list, you need to actually add in the number of elements you’ve just discarded. It doesn’t matter when you’re discarding elements that are on the right-hand side, because then this index is still
0 no matter if you discard two elements or if you don’t, right?
10:00 I would encourage you to check out this code that I’ll include below this lesson if you want to get a closer look at this and see why some of the logic works exactly as it does. But in terms of the actual functionality of the algorithm, nothing here has really changed.
When I call
binary_recursive() on my
els list, it will give me exactly the same results that the original
binary_iterative() implementation would do. And of course, if I want to find the leftmost, I’ll have to either work this recursive algorithm into the binary leftmost search, somehow. And that won’t be hard to do, so I encourage you to do that as an exercise on your own.
10:42 So now you know several ways to implement a binary search and how to deal with some of the deficiencies of a super basic implementation. In the next lesson, I’ll get into some even trickier parts of this algorithm that often trip people up.
Become a Member to join the conversation.