
LRU in Python

00:00 In the previous lesson, I showed you how different caching policies work. In this lesson, I’ll introduce you to the @lru_cache decorator found in the functools module.

00:12 Your hardware and your operating system implement caches to help make your computing experience snappy, or at least snappier than it would be without a cache. To take advantage of the same concepts in your Python code, you can use the provided @lru_cache decorator. As it is a decorator, it is used by wrapping a function that you already have.

00:33 You might use it if your function does a lot of computation, or if your function does an operation that is slow, like accessing the network to fetch a document.

00:43 I’m going to stick with the first case, as it means writing less code to demonstrate, but any situation where results take time to compute and might be reused is a candidate for caching.

00:56 Now for a quick aside—spoiler alert, there’s a long aside coming later—understanding a bit about computation may help you better understand when a cache may be useful.

01:06 Big O notation represents a bounded runtime for a computation. The O is short for Ordnung, the German word for “order,” as in the order of approximation. My apologies to any German speakers in the audience for butchering that pronunciation.

01:20 All orders are relative to the amount of data given as an input. O(1), also known as constant time, means that no matter how much data is involved, the amount of time it takes is the same.

01:32 Getting the first item from a list is usually O(1). The length of that list is irrelevant to the cost of accessing the first item. Linear time, or O(n), is a runtime that is proportionate to the amount of data.

01:47 Finding the largest value in an unordered list is O(n). You have to scan the whole list. Quadratic time is where things start to get messy. In this case, the runtime grows with the square of the amount of data.

02:01 And on the higher end, you have exponential time, where the runtime grows exponentially with the amount of input. This last case is a really good one for caching.

02:11 It gets expensive really fast. The example I’m going to be playing with is the Fibonacci sequence. This goes from costing microseconds to calculate early on in the sequence to seven orders of magnitude slower than that by the thirty-fifth spot in the list.

02:27 Exponential can bite you quickly.
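
The first two orders are easy to see with everyday list operations. Here’s a quick illustration (my own sketch, not code from the lesson):

```python
# O(1) versus O(n), shown with a large list.
data = list(range(1_000_000))

first = data[0]       # O(1): constant time, regardless of the list's length
largest = max(data)   # O(n): linear time, every element must be scanned
print(first, largest)
```

Doubling the length of `data` leaves the first lookup unchanged but roughly doubles the cost of the scan; that difference is what the orders capture.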

02:31 In case you’re not familiar, let me briefly introduce you to the Fibonacci sequence. Although it sounds like it might be yummy smothered in an Alfredo sauce, the Fibonacci sequence is a numeric computation named after an Italian mathematician who documented the concept in 1202. As with a lot of things in history, it was discovered long before that, but the first European to write it down gets the credit. The sequence goes like this: you start with a 1, then another 1. Then you sum those two values to get 2.

03:03 I didn’t even have to use my fingers. The next item in the sequence is the sum of the previous two numbers. That would be 3. So then 5, then 8. And it goes on forever. Interestingly, this kind of geometric progression is common in nature and is related to spirals.

03:24 The number of seeds in each ring of a sunflower follows this kind of sequence. This algorithm is best tackled with recursion. Let’s go take a look. In the top window here, I have a program that prints out ten numbers from the Fibonacci sequence, ending with the number passed in on the command line. Lines 3 through 7 are the recursive algorithm. If the number in the sequence is 0 or 1, then it returns 1. If the number is more than that, it is the sum of the two previous numbers in the sequence, which here I’m using recursion to calculate. Lines 10 through 12 get the command-line argument and turn it into an integer, then calculate the range of numbers being requested.

04:08 I’m using 0 to indicate the first number in the sequence because I’m a programmer, and lists start with the zeroth item. Lines 14 through 16 loop through the range, printing each of the numbers.

04:21 All right, let’s try this sucker out, starting with the first ten numbers.

04:27 That was quick. Let me scroll back here. And there’s the familiar 1, 1, 2, 3, 5, et cetera.

04:36 Let me try that again, ending with the fifteenth in the sequence.

04:42 Still quick. It won’t get messy until you get higher up. With me so far? Okay. With that established, I can talk caching now.

04:53 Python’s implementation of the LRU cache is provided as a decorator inside of the functools library. All you have to do is wrap your function with the decorator, and you’re instantly cached. That’s it. It’s that easy. Let’s go play with it.
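
In its simplest form, the wrapping looks like this. This is a minimal sketch of my own; `slow_square` is a hypothetical stand-in for any expensive function:

```python
from functools import lru_cache

@lru_cache(maxsize=128)  # maxsize=128 is also the default
def slow_square(n):      # hypothetical stand-in for an expensive function
    return n * n

slow_square(9)  # first call: computed, then stored in the cache
slow_square(9)  # second call: served straight from the cache
print(slow_square.cache_info())
```

The `cache_info()` method reports hits and misses, which is a handy way to confirm the cache is actually being used.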

05:10 To understand performance improvement, you need to measure performance. Here, I’ve modified the previous code just a bit to add some timing information.

05:19 The fibonacci() function is the same, but inside of the for loop at the bottom, I’m printing out both the result and how long it took to get the result.

05:27 I’m measuring the time by calling the time module’s perf_counter() function and finding the difference between the value from before and after the call.
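
Here’s roughly what that timed loop looks like. The exact range and output formatting are assumptions on my part:

```python
from time import perf_counter

def fibonacci(num):
    if num < 2:
        return 1

    return fibonacci(num - 1) + fibonacci(num - 2)


for num in range(0, 11):
    before = perf_counter()             # timestamp before the call
    result = fibonacci(num)
    elapsed = perf_counter() - before   # difference is the elapsed time
    print(f"fibonacci({num}) = {result}, took {elapsed:.6f}s")
```

`perf_counter()` is the right tool here because it uses the highest-resolution clock available and is designed for measuring short durations.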

05:37 Let’s see this in action.

05:45 You’ve seen the first ten numbers before. Note that although it is pretty quick, there are two orders of magnitude difference between the first number and the eleventh. Let’s do it again.

05:59 By the time you get to the thirty-first number in the sequence, you’ve got a difference of over three hundred thousand times. Let’s take this up a few more numbers.

06:20 And by going up just five more numbers, you are now into a place where the delay is easily discernible by humans. This is the cost of an exponential algorithm.

06:29 So let’s cache this sucker, yeah?

06:37 Same basic code, just a couple of new additions. I import the lru_cache decorator from functools. I wrap the fibonacci() function with that decorator. And finally, I’ve added a little bit of debug through a print() statement in the Fibonacci calculation. Before I do this, take a guess.

06:56 How much difference do you expect? Well, here it comes.

07:07 Pretty awesome, right? You’ve got the thirty-fifth number coming out at a speed similar to the first couple of numbers. To understand how this worked, think about the print() statement that I added.

07:18 It’s the first thing that gets run when the function is called. As the calculation is recursive, calculating the thirty-fifth number should end up calling the function thirty-six times.

07:29 But the print() statement is only showing up once between the thirty-fourth and thirty-fifth. That is because of the cache. The decorator gets called before the function does, sees that it has a relevant value, and returns that value instead of calling the function. As each call to the Fibonacci function uses the two values before, you’re guaranteed to have them in the cache, which means once the cache is primed, you only need to do one computation per number.
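
Put together, the cached version described above looks roughly like this reconstruction; the debug message wording is an assumption:

```python
from functools import lru_cache

@lru_cache  # bare form needs Python 3.8+; use @lru_cache() on older versions
def fibonacci(num):
    print(f"Calculating fibonacci({num})")  # runs only on a cache miss
    if num < 2:
        return 1

    return fibonacci(num - 1) + fibonacci(num - 2)


fibonacci(35)  # primes the cache: the body runs once per number, 36 in all
fibonacci(35)  # pure cache hit: no print, no recursion
```

The second call produces no debug output at all: the decorator finds the answer in its dictionary and never enters the function body.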

07:57 Let me scroll back here, and you’ll see the cache getting primed. The first time through, for the twenty-fifth number, the function gets called for each number before it.

08:10 Calculating the twenty-fifth number took 0.000131 seconds.

08:18 But the twenty-sixth number had a primed cache. All the values that preceded it were in the cache, and so this number gets calculated two orders of magnitude faster.

08:34 What’s more, each subsequent calculation takes a similar amount of time. That’s the power of a cache.

08:43 There is a small “but” in all of this, which is due to how the decorator is written. To achieve this speed, the cache is implemented using a dictionary.

08:52 The keys in the dictionary are the arguments to the function being wrapped. That means you can only use the LRU decorator on functions whose arguments are hashable.

09:03 If your function has an argument that can’t be used as a key in a dictionary, your function won’t work with the LRU decorator. Another minor thing: due to how the arguments are inserted into the dictionary, the LRU decorator considers the order of arguments important.

09:19 If you call with the same two named arguments but pass them in a different order, you’ll get a cache miss. So foo() called with a=3 and b=4 and foo() called with b=4 and a=3, although programmatically the same thing, end up taking two distinct spots in the cache.
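
Both restrictions are easy to demonstrate with a tiny function. This `foo` is the hypothetical example from the narration, fleshed out by me:

```python
from functools import lru_cache

@lru_cache
def foo(a, b):  # hypothetical example function
    return a + b

foo(a=3, b=4)  # miss: computed and cached
foo(b=4, a=3)  # another miss: different keyword order means a new cache slot
print(foo.cache_info())  # misses=2, hits=0

try:
    foo([1, 2], 3)  # a list isn't hashable, so it can't be a dictionary key
except TypeError as err:
    print("unhashable argument:", err)
```

Positional calls with the same values in the same order, on the other hand, do share a cache entry, so being consistent about how you call a cached function pays off.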

09:40 These restrictions aren’t too bad. As long as you can live with them, you can get the power of caching with an import and a single line of code.

09:50 The LRU decorator takes some optional arguments. In the next lesson, I’ll show you how to use them.
