Covering Tricky Details
00:00 As usual in these kinds of demonstrations, there are some tricky details that I haven’t fully gotten into yet. Let’s run through some of these now and get into a mode of thinking where you’re always looking for these kinds of little edge cases when you implement algorithms like binary search.
A famous binary search bug is the integer overflow problem. When you calculate a middle index, you add left and right together and then divide the result by 2 using integer division. However, in many languages where an integer is represented by a fixed number of bits, if left and right are sufficiently large, this calculation may overflow and give you a completely wrong number, because the sum of these two numbers is bigger than it’s possible to represent. Now, this isn’t actually a problem, technically, in Python, because Python integers can be as large as you want, up to the limits of your system memory.
00:59 But I’ll tell you why it’s still an issue: because many libraries in Python are written in C under the hood, and C is a language that does have the possibility of integer overflow.
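You can see this happen without leaving Python by using NumPy, whose integer types are fixed-width C integers under the hood. This is a sketch assuming NumPy is installed; it isn’t part of the standard library:

```python
import numpy as np

# Python's built-in int never overflows, but NumPy's fixed-width
# integers wrap around silently in element-wise array arithmetic.
left = np.array([2_000_000_000], dtype=np.int32)
right = np.array([2_000_000_000], dtype=np.int32)

naive_mid = (left + right) // 2        # left + right wraps past 2**31 - 1
safe_mid = left + (right - left) // 2  # stays within the int32 range

print(naive_mid[0])  # -147483648, a nonsensical "middle index"
print(safe_mid[0])   # 2000000000, as expected
```

The naive sum, 4,000,000,000, doesn’t fit in a signed 32-bit integer, so it wraps to a negative value before the division ever happens.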
01:10 So it’s always best to be safe in case you’re trying to pipe this result into something that’s being used by a C library. You can fix this problem by calculating the offset in advance.
The offset is (right - left) // 2, and you then add that to the lower bound, left. This cannot overflow, no matter how big left and right are, as long as left and right themselves can be represented in the integer system.
01:36 And in general, it’s always best to be thinking about these kinds of issues so that your Python knowledge can be transferable to other languages as well.
01:43 So, that’s the problem of integer overflow, but don’t worry because as long as you use this little fix, it won’t be a problem.
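To put the fix in context, here’s a minimal iterative sketch of binary search using the overflow-safe midpoint. The function name and return convention here are my own and may differ from the course’s implementation:

```python
def binary_search(elements, value):
    """Return an index of value in sorted elements, or None if absent."""
    left, right = 0, len(elements) - 1
    while left <= right:
        # left + (right - left) // 2 instead of (left + right) // 2:
        # equivalent in pure Python, but safe from overflow if this
        # logic is ever ported to a language with fixed-width integers.
        middle = left + (right - left) // 2
        if elements[middle] == value:
            return middle
        if elements[middle] < value:
            left = middle + 1
        else:
            right = middle - 1
    return None

print(binary_search([1, 3, 5, 7, 9], 7))  # 3
print(binary_search([1, 3, 5, 7, 9], 4))  # None
```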
01:50 Another possible problem you might run into is a stack overflow exception. You might recognize this from the name of the popular internet forum, but a stack overflow is what happens when you make too many recursive calls and you overflow the stack. You don’t need to get into the technical details here, but generally, every recursive call that you make puts more local variables and such onto the stack of your computer’s memory. But the stack has a limited size, so there’s a recursion limit: the number of recursive calls a function can make is limited by the system. In Python, the default is 1,000, though it can be changed, and it varies in other languages.
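You can check the limit yourself with the standard library, and see what happens when runaway recursion hits it. This is a small sketch; the helper function here is my own illustration:

```python
import sys

# The recursion limit is a property of the interpreter; the CPython
# default is 1000, but environments are free to change it.
print(sys.getrecursionlimit())

def recurse_forever(n):
    # No base case: every call just makes another call.
    return recurse_forever(n + 1)

try:
    recurse_forever(0)
except RecursionError as exc:
    print(exc)  # "maximum recursion depth exceeded"
```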
Now, of course, a good binary search implementation should never cause a stack overflow, but it’s really easy to write bugs that do, so you should know how to recognize this, just in case. For example, take this truncated binary_search() implementation here. If you were to make a recursive binary_search() call without shrinking the input as necessary, so if you didn’t actually take a slice of the elements, as I showed you how to do in the recursive implementation earlier, then this would recurse infinitely, because you would never actually diminish your search space.
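For contrast, here’s a sketch of a recursive version that does shrink its search space on every call, so it can never recurse forever. The name and the boolean return value are my own choices, not necessarily the course’s:

```python
def binary_search_recursive(elements, value):
    """Return True if value is in sorted elements, else False."""
    if not elements:  # base case: empty search space, value not found
        return False
    middle = len(elements) // 2
    if elements[middle] == value:
        return True
    if elements[middle] < value:
        # Recurse on a strictly smaller slice; passing elements
        # unchanged here is exactly the bug that recurses forever.
        return binary_search_recursive(elements[middle + 1:], value)
    return binary_search_recursive(elements[:middle], value)

print(binary_search_recursive([1, 3, 5, 7, 9], 5))  # True
print(binary_search_recursive([1, 3, 5, 7, 9], 4))  # False
```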
03:01 So that’s a case where you could have a stack overflow.
03:04 A problem that I’ve already discussed a little bit is that binary search in its most basic form doesn’t give you a consistent left- or right-most index of a given item. It just gives you one of the indices of a given item.
03:17 And of course, you know several ways to deal with that already, so I won’t belabor the point, but just make sure that in a tense situation—like, say, a technical interview or just a higher-pressure coding environment—that you’re not relying on behavior that doesn’t actually exist with the basic binary search algorithm.
A problem that’s not unique to binary search but can show up in binary search is the problem of floating-point rounding. Due to limitations in how computers represent floating-point numbers, comparisons in Python can sometimes behave strangely. Take a look at this example on the right. If you build a list with the comprehension [.1 * i for i in range(10)], you’d expect it to contain 0.0, 0.1, 0.2, 0.3, 0.4, all the way up to 0.9. But if you try to actually do some of these comparisons, .1 in floats and .2 in floats will work, while .3 in floats will be False, simply because of quirks in the IEEE 754 floating-point representation.
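You can reproduce this in a REPL with the same comprehension:

```python
# Each element is computed as 0.1 * i, and 0.1 * 3 rounds to a value
# just slightly above 0.3, so an exact membership test fails.
floats = [0.1 * i for i in range(10)]

print(floats[3])      # 0.30000000000000004, not 0.3
print(0.2 in floats)  # True: 0.1 * 2 happens to round to exactly 0.2
print(0.3 in floats)  # False
```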
When I enter this list comprehension, [.1 * i for i in range(10)], you can take a look at it. What it actually contains, for some of these numbers, are values that are really, really close to what you might expect, like 0.7, but they’re just barely off. This is, of course, not necessarily a bug in Python; it’s just a consequence of the fact that real numbers can require infinite precision. I could write 0.00000..., you know, with a hundred 0s, or a hundred thousand 0s, and then a 1 at the end, and that would be a perfectly legitimate number.
05:00 But on a computer you would need infinite precision to represent that, which means you would need infinite bits, which means you would need infinite memory, which is just not possible.
05:09 So there have to be some small weaknesses in how floating-point numbers work, no matter how you choose to implement them. For this reason, you can often decide on the precision that you need and then use integers in your code. If, for example, you only need four decimal places of precision, you might scale every value by 10,000 and work with integers, or wrap that idea in a small data class of your own.
But if you really need to do comparisons with floating-point numbers, you can also use the math.isclose() function to get more reliable comparisons.
05:40 This, of course, will just compare to see whether a number is within a small margin of the actual number that you’re looking for. So that’s a way to deal with the fact that when you do a binary search, you have to do comparisons, and if those comparisons are between floats, they can generate weird results. In the next lesson, I’ll talk about the time and space complexity of binary search, and I’ll also talk about how you can search even more quickly than binary search.
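As a sketch of what that looks like, here’s a tolerant membership check built on math.isclose(). The function name close_index is my own, invented for illustration:

```python
import math

floats = [0.1 * i for i in range(10)]

def close_index(elements, value):
    """Return the index of the first element within tolerance of value,
    or None. Uses math.isclose() instead of exact equality."""
    for i, element in enumerate(elements):
        # math.isclose() defaults to a relative tolerance of 1e-09,
        # far larger than typical rounding error, so 0.30000000000000004
        # counts as a match for 0.3.
        if math.isclose(element, value):
            return i
    return None

print(0.3 in floats)             # False: exact comparison fails
print(close_index(floats, 0.3))  # 3: tolerant comparison succeeds
```

This is a linear scan for clarity; the same isclose() comparison can replace the equality check inside a binary search as well.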