Floating Point Complications
00:00 In the previous lesson, I showed you the half-away-from-zero and half-to-even rounding strategies. In this lesson, I’ll show you the complications of floating-point numbers and how they can mess up your rounding.
00:13 Floating point is a way of storing approximations of decimal values in a fixed number of bytes in memory. How this gets done is laid out in IEEE standard number 754.
00:23 Most CPython implementations use 64 bits to store what is known as a double precision float. I say most because CPython essentially uses the underlying C data types, which some compilers on some platforms might do something else with.
00:39 I’m pretty confident the computer you’re watching this on is using 64 bits to store a float, though. The range of values that can be stored in a float is mind-bendingly large.
00:49 Seriously, I spent a while trying to come up with an analogy and failed. I so failed. I couldn’t come close. You’ll see what I mean in a minute.
00:58 You can get some details about how floats are handled in your version of Python by examining the float_info value in the sys module.
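For example, on a typical 64-bit build, inspecting `sys.float_info` in the REPL shows the values discussed next:

```python
import sys

# sys.float_info describes how the platform's C double is configured.
print(sys.float_info.max)  # largest representable float, about 1.8e+308
print(sys.float_info.min)  # smallest positive normalized float, about 2.2e-308
print(sys.float_info)      # the full record: digits, epsilon, exponent range, etc.
```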
01:07 The two numbers I’ve highlighted here are the maximum possible value and the minimum possible value. The max is just a little less than a two with 308 zeros after it, and the min is a two sitting about that many places to the right of the decimal point. To try to wrap your mind around this, the max is this massive thing
01:31 and the min it ranges down to is this tiny thing.
01:38 Trying to wrap your head around 616 orders of magnitude is kind of hard. Let me give it a shot. The width of a helium atom is about 0.05 nanometers, which makes two of them 0.1 nanometers.
01:55 If you line them up, it takes around 20 trillion of them with a T to make a kilometer, about two thirds of a mile for my American friends. Keep lining them up and you need around two times 10 to the power of 26 helium atoms to make a light year.
02:13 By comparison, the magnitude range of a floating-point number spans about 10 to the 616. The radius of the observable universe is about 47 billion light years, which only adds about 10 more orders of magnitude on top of that 10 to the 26. You’re still over 570 orders of magnitude short of the span between the smallest floating-point value and the largest.
02:36 It’s a huge range. At this point, someone in the comments is going to point out that I missed a zero or I have one too many and I anticipate your comment and smile.
02:46 Even if I’ve messed this up by several orders of magnitude, it’s a rounding error at this scale. See, I brought it back to rounding. Anyhow, computers aren’t infinite, and so squishing all of that range into 64 measly little bits is lossy compression.
03:05 Lots of numbers can’t be precisely represented in floating point. Let’s go to the REPL to prove this.
03:12 I’m going to go back a few lessons and grab our half_up function.
03:17 Just as a reminder, this one rounds breaking ties by shifting up.
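The function’s body isn’t shown on screen in this transcript, but a common implementation of half-up rounding, and one consistent with the behavior demonstrated next, shifts the decimal point, adds 0.5, and floors:

```python
import math

def half_up(value, decimals=0):
    """Round value to `decimals` places, breaking ties toward positive infinity."""
    multiplier = 10 ** decimals
    # Shift the decimal point right, nudge by 0.5, floor, then shift back.
    return math.floor(value * multiplier + 0.5) / multiplier
```

Note this is a sketch reconstructed from the described behavior, not necessarily the exact code from the earlier lesson. With it, `half_up(1.225, 2)` gives `1.23` on a typical 64-bit float platform, which looks right until you try the negative case below.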
03:24 Let me do that again with negative 1.225
03:30 and you get different behavior. That’s not up, that’s not to the right, that’s down. You’re probably ahead of me. Let’s see why floating point is to blame.
03:41 Let me do a little math on negative 1.225.
03:46 You can’t actually represent negative 1.225 exactly as a float. Our algorithm multiplied that negative 1.225 by a hundred, and the lack of precision in floating point caused what should have been negative 122.5 to shift a bit.
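The REPL math likely looks something like this; converting the float to a `Decimal` exposes the exact value actually stored:

```python
from decimal import Decimal

# The closest 64-bit float to -1.225 is slightly farther from zero than -1.225:
print(Decimal(-1.225))  # -1.2250000000000000888...

# Multiplying by 100 carries that tiny error along,
# so the result lands just past the -122.5 tie boundary:
print(-1.225 * 100)     # -122.50000000000001
```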
04:02 The shift was just enough to muck with our boundaries in our rounding algorithm. The most common example of the floating-point problem is some very simple math that’s not quite right, is it?
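That most common example is almost certainly the classic sum of two simple decimals:

```python
# Two perfectly ordinary decimal values that have no exact binary representation:
print(0.1 + 0.2)         # 0.30000000000000004
print(0.1 + 0.2 == 0.3)  # False
```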
04:15 This isn’t a Python thing, but a floating-point thing. Every programming language that uses the IEEE standard has this problem, and by the way, that IEEE standard is implemented in the hardware on your machine.
04:27 There’s no getting away from it. This means you get some weirdness, even if you use Python’s built-in round() function.
04:37 Recall the built-in round() function rounds to even, so this is the expected value. The closest even to that seven in our tiebreaker is eight.
04:46 But this looks wrong. It’s technically correct, because 2.675 isn’t actually 2.675. The floating-point representation is off, and it gets rounded in the wrong direction.
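The exact on-screen inputs aren’t in the transcript, but a pair of calls like these reproduces both behaviors, and `Decimal` again shows why the second one misfires:

```python
from decimal import Decimal

# 2.75 is exactly representable in binary, so round-to-even works as advertised:
print(round(2.75, 1))   # 2.8

# 2.675 is not exactly representable; the stored value is slightly below 2.675,
# so there is no tie to break and it rounds down instead of to the even 2.68:
print(round(2.675, 2))  # 2.67
print(Decimal(2.675))   # 2.67499999999999982236431605997495353221893310546875
```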
05:04 Floats are tricky. Depending on what kind of math you’re doing, it likely doesn’t matter, but if you’re doing money, this is very problematic. Never use floats for money; you’ll end up owing or being owed.
05:17 In the next lesson, I’ll show you Python’s decimal library, which can be used for money, as well as a couple of other alternatives.