Computers and Latency
00:20 This is where the actual computation happens. Memory stores what is being worked on, and the CPU talks to memory constantly, not only to find out which instruction to run next, but also to fetch and store the data it works on. Memory is generally volatile, and its contents are gone once you turn the computer off, so for longer-term storage there's usually a device like a hard drive.
00:41 And then, finally, there is some set of peripherals. Peripherals are usually used for input and output. This includes things like network cards, video cards, and external devices like keyboards and mice.
01:04 When a program is loaded, that information is pulled from storage into the CPU, and the CPU then sends it off to memory. In modern computers, there are ways of skipping the CPU to do this (direct memory access), which speeds things up, but for the purposes of this conversation I'm going to keep things simple.
01:18 Once the program’s been loaded in memory, the CPU needs to get the next instruction from the memory and run that instruction inside of the CPU. That instruction often impacts peripherals—for example, sending something out onto the network.
01:33 The CPU sends information down to the peripheral card, and then the peripheral card itself sends information to the outside world. Each of these components runs at a different speed, and this is where latency comes into play.
02:10 Now take that nanosecond and multiply it by 100. That's about how long it takes to talk to main memory. So every time the CPU needs to talk to memory, you need to delay by about 100 nanoseconds. Again, modern computers have ways of speeding this up, like L2 caches, but for the purposes of what I'm talking about, let's keep it simple.
03:18 There’s a huge difference in scale between the instruction level, memory level, disk level, and peripheral level in your computer. There can be a factor of a thousand or more between different steps in this stack. To try and put this in perspective, let’s think about this like a distance. Think about a single CPU instruction as a meter, or about a yard. For the purposes of this analogy, they’re about the same. To help you visualize, that’s about the height of a doorknob off the ground on a regular door.
03:47 A single instruction runs in 0.01 nanoseconds. In 1 nanosecond, you can run 100 CPU instructions on that Intel i7 I mentioned earlier. That would be 100 meters or about 100 yards, which is roughly the length of an American football field or a soccer pitch, give or take a few meters.
04:21 3 microseconds, which is about how long it takes to read 1 megabyte from memory, is 300 kilometers or 186 miles. That's about one and a half times the length of the Suez Canal, so now you're looking at large distances on the face of the Earth.
04:37 Going from memory to disk just makes that worse. Reading 1 megabyte from disk is 82,500 kilometers or 51,000 miles. That's over twice the Earth's circumference. And that read time holds only if the disk's head is already in the correct position and the megabyte is laid out in order on the disk, i.e. it's sequential data. If the head needs to move around, there's a cost for that too.
05:03 It takes about 2 milliseconds to do a disk seek. That's 200,000 kilometers, 125,000 miles, or about half the average distance between the Earth and the Moon. And that ping time to Europe, 150 milliseconds? That's 15 million kilometers, about a tenth of the distance from the Earth to the Sun.
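The whole distance analogy boils down to one scaling factor, which you can check with a few lines of arithmetic. This is just the conversion described above (one instruction = 0.01 ns = 1 meter, so 1 ns = 100 meters); the 825-microsecond figure for the 1 MB disk read is back-derived from the 82,500 kilometers mentioned above rather than stated directly in the talk.

```python
# Scale each latency to a distance using the talk's analogy:
# one CPU instruction = 0.01 ns = 1 meter, so 1 ns = 100 m.
METERS_PER_NS = 100

latencies_ns = {
    "one CPU instruction": 0.01,
    "main memory access": 100,
    "read 1 MB from RAM": 3_000,        # 3 microseconds
    "read 1 MB from disk": 825_000,     # 825 microseconds (back-derived)
    "disk seek": 2_000_000,             # 2 milliseconds
    "ping to Europe": 150_000_000,      # 150 milliseconds
}

distances_km = {name: ns * METERS_PER_NS / 1000
                for name, ns in latencies_ns.items()}

for name, km in distances_km.items():
    print(f"{name:22s} {km:>14,.3f} km")
```

Running this reproduces the numbers above: 300 km for the RAM read, 82,500 km for the disk read, 200,000 km for the seek.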
05:35 These differences are huge and hard to wrap your head around. Let me try it another way to see if I can drive it home. Pretend that instead of an instruction taking a fraction of a nanosecond, it took a full second. That 100-nanosecond trip to main memory would then take 2 hours and 47 minutes, the time it takes to run 10,000 instructions. That pesky disk seek? 6 years and 4 months, or 200 million instructions.
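The time-stretched version of the analogy is the same arithmetic with a different unit: one instruction stretched to a full second multiplies every latency by the same factor. A quick sanity check of the figures above:

```python
# Pretend one instruction takes 1 second instead of 0.01 ns.
# At 100 instructions per nanosecond, a latency of N nanoseconds
# stretches to N * 100 seconds.
INSTRUCTIONS_PER_NS = 100

def stretched_seconds(latency_ns):
    """Seconds an operation would take if each instruction took 1 s."""
    return latency_ns * INSTRUCTIONS_PER_NS

memory_access = stretched_seconds(100)       # 100 ns memory access
disk_seek = stretched_seconds(2_000_000)     # 2 ms disk seek

print(memory_access / 3600)                  # ~2.78 hours (2 h 47 min)
print(disk_seek / (365.25 * 24 * 3600))      # ~6.34 years (6 y 4 mo)
```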
06:17 The gaps between the levels in the computing stack are phenomenally large. And just to make it that much more complicated, that Intel i7 that I said runs 100 instructions per nanosecond? Yeah, that's an 8-year-old processor.
06:30 The modern ones are about three or four times that. Unfortunately for these latency gaps, it's easier from a physics standpoint to increase the speed of a CPU than it is to increase the speed of network traffic. As a result, processors are getting faster at a higher rate than networks are. So as CPUs get better and better, the latency difference between performing an instruction and going out to the network is getting more extreme, not less.
07:02 This is why most programs are I/O-bound. If your program accesses RAM, thousands of instructions could run in the time it spends waiting. If it needs to access disk, that can be millions or hundreds of millions of instructions before the program is ready to run again. And if it has to go out to the network, it's billions of instructions.
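As a rough sketch using the talk's own figures, you can count how many instructions a CPU running 100 instructions per nanosecond could have retired during each kind of wait:

```python
# Instructions a 100-instruction-per-nanosecond CPU could retire
# while stalled on each kind of access, using the talk's latencies.
INSTRUCTIONS_PER_NS = 100

waits_ns = {
    "main memory access": 100,                    # 100 ns
    "disk seek": 2_000_000,                       # 2 ms
    "network round trip to Europe": 150_000_000,  # 150 ms
}

lost = {name: ns * INSTRUCTIONS_PER_NS for name, ns in waits_ns.items()}

for name, n in lost.items():
    print(f"{name}: {n:,} instructions")
```

That's 10,000 instructions lost to a memory access, 200 million to a disk seek, and 15 billion to the network round trip, which is why the waiting, not the computing, dominates.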