
Threading and Multitasking

00:00 In the previous lesson, I started my introduction to concurrency by explaining processes. In this lesson, I’ll cover their lighter-weight siblings, threads.

00:09 Processes are heavy. Having a complete copy of the code and their own chunk of memory means a lot of allocation and resources. And because memory isn’t shared between processes, communication requires special code, like the chat room mechanism I mentioned.

00:26 This heaviness caused operating system designers to come up with other ways of doing concurrency.

00:33 Remember when I mentioned that time slicing isn’t all that great a way to do scheduling? One reason is that most software spends a lot of its time waiting rather than using the CPU.

00:43 Consider that a modern processor can execute several instructions per nanosecond, and that’s just on a single CPU. Reading from main memory can take hundreds of nanoseconds, and programs read from memory all the time.

00:57 That means, while waiting for stuff to come back from memory, the CPU could be idle for the equivalent of thousands of instructions.

01:06 CPUs have ways of optimizing for this that aren’t related to concurrency, but the picture’s complicated enough without that little bit, so let’s ignore it for now.

01:14 You also don’t tend to read just tiny little bits from memory. Transferring larger chunks of course takes time. Half a megabyte takes about a thousand nanoseconds to fetch.

01:26 It takes about 2 million nanoseconds to do a seek on a disk platter, which is why solid state is so much faster. It’s closer to being memory. And finally, using the network is insanely slow.

01:38 By comparison, a single packet pinging from the US to Europe can take 150 million nanoseconds.

01:46 Take any of these three times and multiply them by the several instructions a CPU can execute each nanosecond, and you have a whole lot of waiting around.

01:54 This is why simple time slicing is naive. Instead, it would be great to have a way to put a process to sleep, relinquishing the CPU until that packet came back from its summer trip to France.

02:05 A better process scheduler can do this, but you still have the overhead of a full copy of the code and extra memory. Instead, there’s the thread: a lightweight mini-process that isn’t really a process at all but operates inside one. Threads share memory, and in the simplest case, threads execute on the same CPU.
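To make that shared memory concrete, here’s a minimal sketch using Python’s standard threading module; the shared list and the worker function are just illustrative names:

```python
import threading

shared = []  # one list, visible to every thread in this process

def worker(name):
    # No copying and no chat room needed: threads read and write
    # their process's memory directly.
    shared.append(f"hello from {name}")

threads = [threading.Thread(target=worker, args=(f"thread-{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared)  # all three messages landed in the same list
```

Contrast this with processes, where getting those three messages into one list would require a pipe, a queue, or some other explicit communication channel.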

02:27 In fact, the threads are getting sliced around inside of the process’s allocated slice; you can think of a thread as a sub-slice of a process’s time slice. This too is a simplification, as modern operating systems may schedule threads across CPUs, depending on your OS, your programming language, and other factors.

02:47 Coding-wise, you now have a paradigm. Because threads are small and operate inside the same process, they’re ideal for I/O-bound concurrency: one thread waits on some I/O while another thread executes in the meantime.
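Here’s a rough sketch of that idea, with time.sleep() standing in for a slow network call:

```python
import threading
import time

def fake_download(label):
    # time.sleep() stands in for waiting on the network; while this
    # thread sleeps, the scheduler can run another thread.
    time.sleep(1)
    print(f"{label} finished")

start = time.perf_counter()
threads = [threading.Thread(target=fake_download, args=(f"req-{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"elapsed: {time.perf_counter() - start:.1f}s")  # roughly 1s, not 3s
```

The three waits overlap, so the whole run takes about as long as the single slowest wait.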

03:04 Processes, on the other hand, are good for CPU-bound work. If you’re doing big number crunching with little I/O, having extra threads likely won’t speed you up.

03:13 In fact, although threads are lighter weight, they do have some overhead, so a multi-threaded program doing CPU-bound work might even be slower than a single-threaded one due to the cost of swapping threads.
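You can see the difference with a sketch like this one, which runs the same CPU-bound function on a thread pool and then on a process pool. The exact timings will vary by machine, but on a GIL-based CPython build the process pool usually wins:

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def crunch(n):
    # Pure number crunching: no I/O to wait on, so threads
    # have nothing useful to overlap.
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == "__main__":
    jobs = [5_000_000] * 4

    for pool_class in (ThreadPoolExecutor, ProcessPoolExecutor):
        start = time.perf_counter()
        with pool_class(max_workers=4) as pool:
            results = list(pool.map(crunch, jobs))
        print(f"{pool_class.__name__}: {time.perf_counter() - start:.1f}s")
```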

03:25 So now you’ve got a program, and instead of forking a new process,

03:30 it spins up some threads operating inside of it.

03:33 Two levels of scheduling are going on here: one at the process level, and one for the threads inside the process. The operating system manages them both, swapping Program 2 out for Program 1 and then back again.

03:47 When Program 1 is active, a subset of its slice gets used for each of its threads. But again, if the threads are dealing with I/O-bound concurrency, they might just be sitting there waiting as the packet in France is having a good time in the cafe and hasn’t quite made it back yet.

04:03 So far, I’ve only brushed the surface of the scheduling iceberg. Do you brush icebergs? Ah, mixed metaphors, the friend of lazy writers everywhere. I’ve already said that pure time slicing isn’t ideal and that there are other ways of doing it.

04:17 In the thread space, you’ll come across two common scheduling terms. The first is preemptive multi-tasking. Time slicing is a simple version of this where the OS is in control of when your process or thread gets swapped out for another one.

04:32 By contrast, cooperative multi-tasking is where the program is in control of when something gets swapped out. If everyone is well behaved, this is ideal, as a thread can say, “Hey, I’m waiting on a croissant-eating packet, you can have a turn”. But just like in life, not all programs are well behaved. In real systems with cooperative multi-tasking, there is usually a layer of preemption sitting on top to make sure nobody bogarts the CPU.

04:57 Some programming languages aren’t happy with the OS being in control of thread scheduling. This is more common in languages with cross-platform runtimes. And in order to give the developer a consistent thread scheduling experience on different operating systems, a language might implement its own threading mechanism.

05:14 These kinds of threads are known as “green threads”. Python has both regular and green threads. The asyncio library is a green-thread library, and it uses a cooperative multi-tasking mechanism known as coroutines.
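As a small taste of what that looks like, here’s a sketch using asyncio, again with a sleep standing in for a network wait:

```python
import asyncio

async def fetch(label):
    # The await is the cooperative part: this coroutine volunteers
    # to give up control, and the event loop runs something else
    # until the wait is over.
    await asyncio.sleep(1)  # stand-in for a slow network call
    return f"{label} done"

async def main():
    # Run three waits concurrently on a single thread.
    results = await asyncio.gather(fetch("req-1"), fetch("req-2"), fetch("req-3"))
    print(results)  # finishes in about one second total

asyncio.run(main())
```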

05:29 Your code signals when it’s ready to give up control, typically when it’s waiting on some I/O from that wine-sipping packet as he’s touring the Louvre.

05:38 I know I keep bringing up the whole packet in France example, but put it in perspective for a second. At this point in the course, I’ve uttered a little over 2000 words, which contain about 12,000 characters.

05:49 If each character represented a CPU instruction, and the CPU executed ten instructions per nanosecond, then a packet heading from the US to France would only be 0.0008% of the way there. This course would have to be two years and nine and a half months long for the packet to even arrive.
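If you want to check that arithmetic yourself, here’s the back-of-the-envelope version, with the ten-instructions-per-nanosecond figure as the stated assumption:

```python
# Assumption: the CPU retires about ten instructions per nanosecond.
instructions_per_ns = 10
ping_ns = 150_000_000        # US-to-Europe ping, in nanoseconds
chars_spoken = 12_000        # roughly 2,000 words so far

instructions_per_ping = ping_ns * instructions_per_ns
progress = chars_spoken / instructions_per_ping
print(f"{progress:.4%}")     # 0.0008%
```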

06:08 The orders-of-magnitude difference between a CPU and something like network access is astounding.

06:15 Okay, one more topic before you’re ready to understand why the GIL exists. What happens when two threads try to do the same thing at the same time? Next up, race conditions.
