Understanding Memory and File I/O
00:00 In the previous lesson, I gave an overview of the course. In this lesson, I’ll cover background on memory and file I/O in your computer and all the stuff you’ll be glad your operating system abstracts away for you.
00:13 First off, let's talk about performance. Your computer can be broken down into four basic concepts: the CPU, where the computation takes place; the memory, which acts like a kind of chalkboard that temporarily holds things too big to fit in the CPU all at once; a storage unit, like a disk drive; and, not pictured here, an interface to peripherals like your video card.
00:37 There are huge performance differences between each of these areas. I know I exaggerated the huge in that last sentence, but it truly is kind of mind-blowing.
00:49 See that dot? That dot represents a nanosecond. That’s one billionth of a second. To try and imagine that, light travels a whole thirty centimeters, or just about a foot, in that amount of time.
01:04 Your CPU is so fast that a middling Intel i7 from a few years ago could comfortably do a hundred instructions in that time. Now instructions in a CPU are basic building blocks, so that's not a hundred lines of code—more like a hundred multiplications—but that's still a lot.
01:23 Your CPU has little tiny memory things inside of it called registers. This is where it stores stuff it is operating on, but there aren’t a lot of registers in most CPUs, and so there is a constant shuffle to memory to get stuff to fill the registers.
01:39 Just accessing a spot in memory takes about a hundred nanoseconds. The dot above is a single pixel. This line is a hundred pixels long. You see what I’m doing? Instead of just accessing memory, let’s read a bunch of it.
01:59 Grabbing a megabyte of data takes about three thousand nanoseconds. That's three thousand pixels of lines there. This is something every programmer should have a basic understanding of: going out to memory is significantly more expensive than doing something directly on the processor. And you ain't seen nothing yet.
02:18 Let’s take a thousand nanoseconds—that’s a bunch of those red lines—and compress them down into that tiny yellow dot. I’m back to a single pixel. You thought going out to memory was slow? Well, going to disk is much worse.
02:37 Yep, you read that right. 825,000 nanoseconds to read that same megabyte block from disk. That's a difference of more than two orders of magnitude.
02:49 It isn't relevant to this course, but there is worse: the network is slower still.
02:55 Locality in computing is tied to huge performance gains. If you can keep everything in memory, it can make a big difference. If you can keep everything on the CPU, even better.
03:07 All these numbers are just rough. Because of these differences, your hardware will have caches between these boundaries to help improve performance. That makes measuring things a bit weird.
03:17 You'll get different results between the first and second time you do anything. That's those caches. The difference is so stark that most CPUs have multiple levels of caches on them to help avoid going out to memory too frequently.
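If you want to see those caches at work, here's a minimal sketch you could run yourself. The filename example.txt is just a placeholder for any reasonably large file on your machine:

```python
import time

def timed_read(path):
    # Time one full read of the file, in seconds.
    start = time.perf_counter()
    with open(path, "rb") as f:
        f.read()
    return time.perf_counter() - start

# The second read is typically faster because the OS has
# cached the file's contents in memory after the first one.
print(timed_read("example.txt"))
print(timed_read("example.txt"))
```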
03:33 Consider the relatively simple case of adding two variables together. Variables in your program are stored in memory. The act of addition is done on the CPU. To do the addition, the variables have to be read from memory and put into registers on the CPU.
03:51 Then the CPU does the addition, typically putting the result in a third register, although some hardware uses two registers and overwrites one of them. Then, to put the result in a third variable, it has to be written back out to memory.
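Loosely speaking, and ignoring the overhead of Python's interpreter, that whole round trip hides inside a single innocent-looking line:

```python
a = 2
b = 3

# Reading a and b pulls their values from memory into CPU registers,
# the addition happens on the CPU, and assigning to c writes the
# result back out to memory.
c = a + b
```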
04:05 Think back to the latency values from before. Let’s simplify this and say that the two variables can be read in a single access. An access costs one hundred nanoseconds. In one nanosecond, the CPU can do a hundred instructions.
04:21 That means the CPU could do ten thousand additions (one hundred nanoseconds times a hundred instructions per nanosecond) in the time it takes just to fetch the variables from memory into the registers. I'm sure you can guess where this is going.
04:33 A molasses-like pace of disk-reading is in your future.
04:39 All right, let’s change it up a bit and talk about different kinds of memory. Up until now, I’ve been talking about the physical memory in your hardware—most likely RAM. There’s only so much RAM on your machine, and every process running wants some of it.
04:53 So your operating system abstracts this away as virtual memory. When a program wants memory, it is given virtual memory, which could currently be in RAM or on the disk.
05:06 The OS swaps the contents in and out of RAM from a swap file. This enables all the programs on your machine to use more memory than is physically available. The OS simply puts some of it down to the disk when RAM is tight.
05:20 If your OS is smart, it typically makes this decision based on what isn't being used right now. This is why computers seem to go from zipping along quickly to chugging at a horrible pace. When two or more processes are fighting over most of the RAM, the OS has to frequently swap memory in and out of the disk.
05:40 As disks are orders of magnitude slower than memory, it's no wonder you notice the performance difference. Thankfully, all of this is done by the OS, and you don't have to manage it yourself.
05:52 Being aware of the consequences of asking for a lot of memory can make you a better programmer, or at least a programmer whose users are less cranky about the sluggishness of their machine.
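If you're curious what this looks like on your own machine, the third-party psutil package (an assumption here, installed with pip install psutil) can report both sides of the story:

```python
import psutil

# Physical RAM: total, available, percent used, and so on.
print(psutil.virtual_memory())

# The swap file backing virtual memory: total, used, free.
print(psutil.swap_memory())
```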
06:04 Another memory concept is shared memory. The simplest model is for each program that you run to be contained inside of a process. Each process is managed by the OS and, among other things, is allocated some memory. For safety reasons, this memory is self-contained.
06:21 You wouldn’t want my process writing all over your process’s memory. That’d be bad. Your program can actually have multiple processes. There are a variety of reasons for doing this, but most of them have to do with trying to do more than one thing at a time.
06:37 Since memory is allocated to a process, you need a special situation to share memory between two processes. This is another feature offered by your operating system, and it is called, logically enough, a shared memory block.
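In Python, for example, the standard library has exposed this since version 3.8 through multiprocessing.shared_memory. A minimal sketch:

```python
from multiprocessing import shared_memory

# Create a named block of shared memory that other processes can attach to.
block = shared_memory.SharedMemory(create=True, size=10)
block.buf[:5] = b"hello"

# A second process would attach to the same block by its name:
#     other = shared_memory.SharedMemory(name=block.name)
#     bytes(other.buf[:5])  # b"hello"

block.close()   # detach this process from the block
block.unlink()  # free the block once everyone is done with it
```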
06:53 Okay, you’re an expert on memory. Now how about that storage stuff?
Consider this bit of code, which reads all the contents of a file and puts it in a variable named text, which of course lives in memory. To read the file, you have to give up control of your program to the OS by making a system call.
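The code on screen is roughly this minimal version, with example.txt standing in for whatever file you're reading:

```python
# Read the entire file into a variable named text, which lives in memory.
with open("example.txt") as f:
    text = f.read()
```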
07:13 Then the OS interacts with the disk, and it buffers the data from the disk before putting it into memory. This is a vast oversimplification. In fact, that code there is going to break down into at least two system calls: one for opening the file, the other for reading.
07:30 But even that reading is more complicated. There are file pointers that need to move. There are buffers that need filling. In fact, how many system calls there will be depends partially on the size of the file being read.
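You can make those system calls visible by dropping down to the os module. This is a rough sketch of the same read, with an assumed 64 KB chunk size:

```python
import os

fd = os.open("example.txt", os.O_RDONLY)  # system call: open
chunks = []
while True:
    chunk = os.read(fd, 65536)            # system call: read
    if not chunk:                         # bigger files mean more reads
        break
    chunks.append(chunk)
os.close(fd)                              # system call: close
text = b"".join(chunks).decode()
```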
And I know this seems like it might be an obvious observation, but changing the variable named text will do nothing to the file. If you want to change the file, you have to do that whole process again, but writing things down to disk instead of reading.
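A quick sketch of that write-back round trip, continuing the earlier example:

```python
# Changing text only changes the copy in memory...
text = text.upper()

# ...so persisting the change means another trip through the OS to disk.
with open("example.txt", "w") as f:
    f.write(text)
```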
07:58 Why am I making obvious statements? Well, mmap does things differently.