Introduction to Caching
00:12 Software often exhibits the principle of locality. Recently used things are likely to be used again. This works at many different levels in your architecture. Memory blocks that were used before are more likely to be used again than other memory blocks.
00:40 And this is quite common on the Web. Your browser has caches built in to avoid refetching things from the Internet. Consider a news site. Even if it is updated regularly, the logos on it aren’t likely to change frequently.
00:53 Keeping the logos in a cache means saving the transit time of downloading them again. Caching values means you can get at them again quickly, but that comes at the cost of space. This cost is usually in memory, but it can also be at the cost of disk space.
01:10 If the cost of computing something or fetching something is high, a cache will save you time if that value is ever reused. This is so common that your processor has this concept built in. Modern CPUs often have several layers of caches to reduce the number of trips to RAM.
Virtual memory in your operating system allows the total memory use of all your running programs to exceed the amount of RAM you have. How the pages of memory are swapped in and out of RAM uses the same kind of mechanisms that I’ll be showing you for your cache. And finally, you can take advantage of these concepts in your own code with the use of the right library. In Python, that’d be the lru_cache decorator in the functools module.
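As a quick taste, here’s a minimal sketch of that standard-library decorator in action. The function name and sizes are just my illustration; the point is that the second identical call comes from the cache instead of being recomputed.

```python
from functools import lru_cache

@lru_cache(maxsize=3)  # keep at most three results cached
def expensive(n):
    # stands in for a slow computation or network fetch
    print(f"computing {n}")
    return n * n

expensive(3)   # miss: "computing 3" is printed
expensive(3)   # hit: nothing printed, cached result returned
print(expensive.cache_info())  # CacheInfo(hits=1, misses=1, ...)
```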
02:04 The 5 block has some friends, nine of them in fact. These blocks are big, big enough that only a few of them can be used at a time. Consider the base case of only being able to keep one of these blocks in memory at a time. When I need a block, I swap all of its contents into memory.
02:22 And if I want something else, I have to wipe out what is there to make room for another block, block 6 in this case. There aren’t a lot of choices to make if you can only fit one block into memory at a time. So let’s increase our RAM.
03:06 Let’s look at several different kinds of caching policies. The first is called FIFO, short for First-In/First-Out. This means that whatever got put into the cache first is the thing that gets removed when the cache is full. LIFO, or Last-In/First-Out, means the last thing that got put in the cache is the first to go.
03:27 This mechanism favors whatever got loaded early on and then pretty much stays forever. LRU, or Least Recently Used, tracks which item in the cache was used the longest ago and evicts that one first. This will be where the rest of this course concentrates.
03:44 MRU, or Most Recently Used, is LRU’s cousin. And LFU, which just sounds rude, is Least Frequently Used, where you keep frequency counters on the blocks and kick out the one that has been accessed the least. FIFO is pretty cheap to implement, but you might be getting rid of something that is used a lot. LFU optimizes for use, but takes a lot of extra bookkeeping. Every time you kick something out, you have to scan through to find the lowest frequency or re-sort your list.
04:15 Both of these can be expensive operations. LRU is a compromise between the two, hence why it is one of the more common policies. Let’s see some of these in action to better understand the choices.
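To make that LFU bookkeeping cost concrete, here is a minimal sketch (the class name and structure are my own illustration, not from the course): a plain dict of values plus a per-key frequency counter, with the linear scan for the lowest count happening on every eviction.

```python
class LFUCache:
    """Least Frequently Used cache sketch with O(n) eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = {}    # key -> value
        self.counts = {}  # key -> access frequency

    def get(self, key):
        if key in self.data:
            self.counts[key] += 1
            return self.data[key]
        return None  # miss

    def put(self, key, value):
        if key not in self.data and len(self.data) >= self.capacity:
            # the expensive part: scan every counter to find the victim
            victim = min(self.counts, key=self.counts.get)
            del self.data[victim]
            del self.counts[victim]
        self.data[key] = value
        self.counts[key] = self.counts.get(key, 0) + 1
```

A production LFU would keep the keys bucketed by frequency to avoid that scan, but this shows why the naive version gets expensive.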
04:40 The cache is empty, so the 3 gets inserted. Now a 5. There’s still space, so 5 gets filed away. Now another 3. This is called a cache hit. Block 3 can be returned from the cache, and the cache itself looks the same before and after a hit.
05:02 Remember these numbers are really representing large chunks of data. You’d never just cache a single number. Now the 9 block, it takes up the last spot. In the immortal words of Tim Curry, “Antici-pation.” So what do we do? Here’s the 1 block, and the cache is full. This is called a miss.
05:52 And so it is. This cache can be implemented with an ordered list and a pointer. The pointer indicates what is to be replaced next, and when it is, the pointer advances by one spot. In the case of pointing to the last spot, the pointer wraps around to the first spot again.
06:09 This is what I meant when I said FIFO is relatively cheap. You really only need one additional pointer to implement this method. Note that although the 3 was used, it was still the first thing to go.
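That list-plus-pointer idea can be sketched in a few lines. This is my own minimal illustration of the mechanism described above, essentially a circular buffer, with a side dict added so lookups don’t have to scan the slots.

```python
class FIFOCache:
    """FIFO cache sketch: fixed slots plus one wrapping pointer."""

    def __init__(self, capacity):
        self.slots = [None] * capacity  # (key, value) pairs
        self.index = {}                 # key -> slot position, for fast lookup
        self.pointer = 0                # next slot to overwrite

    def get(self, key):
        pos = self.index.get(key)
        return None if pos is None else self.slots[pos][1]

    def put(self, key, value):
        if key in self.index:           # already cached: update in place
            self.slots[self.index[key]] = (key, value)
            return
        old = self.slots[self.pointer]
        if old is not None:             # evict whatever the pointer targets,
            del self.index[old[0]]      # even if it was just used
        self.slots[self.pointer] = (key, value)
        self.index[key] = self.pointer
        self.pointer = (self.pointer + 1) % len(self.slots)
```

Replaying the walkthrough’s sequence (3, 5, a hit on 3, then 9, then 1) against a three-slot version of this shows exactly the behavior in the video: the 1 evicts the 3, recent use notwithstanding.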
07:09 The concept of Last-In/First-Out is common in queuing theory, which is related to this idea, but I’ve never seen it implemented as a caching mechanism. A LIFO cache will fill up with the first few items, then constantly replace the last one. It’s really cheap to implement—just use the last spot in the cache once it is full—but you’re probably not going to have a good balance locality-wise with this choice. Let’s try something else. All right.
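For completeness, a LIFO sketch is about as small as caches get; the class below is my own illustration of the “just churn the last spot” behavior described above.

```python
class LIFOCache:
    """LIFO cache sketch: fill once, then keep replacing the last slot."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = []  # (key, value) pairs

    def get(self, key):
        for k, v in self.slots:
            if k == key:
                return v
        return None  # miss

    def put(self, key, value):
        for i, (k, _) in enumerate(self.slots):
            if k == key:                    # update in place on a hit
                self.slots[i] = (key, value)
                return
        if len(self.slots) < self.capacity:
            self.slots.append((key, value))
        else:
            self.slots[-1] = (key, value)   # always evict the last in
```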
07:49 So far, so good. And now a choice that needs to be made. Here comes the 1. That’s still a miss. You can draw a line from the 1 back through the history of accesses until you are three distinct items deep; whatever falls outside that window was used the longest ago, and that’s what gets evicted.
08:43 There is a memory cost and a slight access cost to this. The access cost is negligible in comparison to what you’re caching, or it should be. Otherwise, you shouldn’t bother caching. But that memory cost might be expensive.
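One common way to pay that bookkeeping cost in Python is to lean on collections.OrderedDict, which remembers insertion order and can move entries around cheaply. This sketch is my own illustration, not the course’s implementation:

```python
from collections import OrderedDict

class LRUCache:
    """LRU cache sketch: the OrderedDict tracks recency of use."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None  # miss
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        elif len(self.data) >= self.capacity:
            self.data.popitem(last=False)  # evict the least recently used
        self.data[key] = value
```

Running the walkthrough’s sequence through a three-slot version of this reproduces the choice in the video: after 3, 5, a hit on 3, and 9, inserting the 1 evicts the 5, because the hit on 3 made 5 the least recently used.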