Taking Advantage of Latency With Concurrency Patterns
00:00 In the previous lesson, I talked about the degrees of latency in the different components of your computer. In this lesson, I’m going to show you how that’s taken advantage of to create concurrency.
00:11 Recall this diagram from the previous lesson. It’s common for a program to have to wait long periods of time before it can execute its next instruction.
00:21 It needs to wait on RAM, disk, and network. Time slicing is the idea of the operating system mixing other programs into these wait states. While you’re waiting for that network packet to come back from Prague, the computer can insert multiple other programs to take advantage of that time.
00:40 This time slicing is how a single processor computer can look like it’s doing multiple things at the same time.
00:48 Time slicing for multitasking can be thought of at three levels. The lowest level is not having any at all. In the olden days of personal computing, the DOS operating system worked this way.
00:59 You could only run a single program at a time. Cooperative multitasking is when the program willingly gives up the CPU. If you know that you’re about to go out to the network and you’re not going to need the CPU for a while, you signal the operating system and the operating system can schedule a different program.
01:17 In home computing, Windows 3.1 used this kind of multitasking. If you happen to be old enough to remember using this operating system, you may recall that at certain times a program would just stop refreshing.
01:29 That would happen when a program got stuck in a wait state without giving up the CPU. It never told the operating system it was waiting, so nothing else could be scheduled and the UI stopped being updated.
01:42 As the name implies, cooperative multitasking requires all of the processes to cooperate with each other. With many programs sharing a machine, or when a program has a problem, this can be a challenge.
01:54 So preemptive multitasking was created. The operating system in this case interrupts the program whether or not it’s ready. The operating system is responsible for swapping between the different programs. In most operating systems, this is done intelligently.
02:10 If your program does do a network call, that triggers the operating system to take advantage of it and put the program into a wait state. But it can also preempt in the case where the program’s being a hog and hasn’t given up the CPU. Mainframes have done this from very early on, as has Unix. In the Windows world, NT and Windows 95 are where preemptive multitasking was first introduced.
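As a rough sketch of the cooperative idea in modern Python (this example isn’t part of the lesson itself), asyncio works the same way within a single program: each task voluntarily gives up control at an await point whenever it’s about to wait, so another task can run on the same thread. The task names and delays here are placeholders.

import asyncio


async def fetch(name, delay):
    # Each "await" is a cooperative yield point: the task announces it
    # is about to wait, so the event loop can run another task.
    print(f"{name}: starting request")
    await asyncio.sleep(delay)  # stand-in for waiting on the network
    print(f"{name}: response received")


async def main():
    # Both tasks make progress on a single thread by yielding to each
    # other whenever they would otherwise block.
    await asyncio.gather(fetch("task-1", 1.0), fetch("task-2", 1.0))


asyncio.run(main())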
02:36 On larger computers, you can have this time slicing as well as multiple processors. On each CPU, you can now have different programs running. In some cases, the same program may also be resident on those different CPUs.
02:51 In this diagram, programs one, four and five are tied to a single CPU, but programs two and three are across the two CPUs. Part of the operating system’s responsibility is scheduling within the CPU and across the CPUs.
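If you want to see this on your own machine, one hedged sketch uses Python’s multiprocessing module, which starts one worker process per CPU by default and lets the operating system schedule them across cores. The busy function and the input sizes are arbitrary placeholders for CPU-bound work.

from multiprocessing import Pool


def busy(n):
    # CPU-bound work; each call can be scheduled on a different core.
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    with Pool() as pool:  # one worker process per CPU by default
        print(pool.map(busy, [5_000_000] * 4))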
03:09 Not all types of software can take advantage of concurrency equally. The type of concurrency you have can change what kinds of models you would use when creating a concurrent program.
03:20 Trivial concurrency is when the different parts of the program can be split apart without concern for each other. This can usually be achieved when activities within the program are completely independent of each other.
03:33 This is easiest if there’s no shared data between the different components.
03:38 Consider our web server. You can have multiple clients talking to the web server at a time because those clients don’t need to talk to each other.
03:47 There may be some contention at the disk level, but as you saw in the lesson on latency, a huge amount of processing can be done while a server’s waiting for the disk.
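A minimal sketch of that trivial case, using only the standard library (the port number and the response body are arbitrary): ThreadingHTTPServer hands each connection to its own thread, and because the requests share no data, no coordination between them is needed.

from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Each request is handled independently -- the trivial case,
        # because there is no shared data between clients.
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"hello\n")


if __name__ == "__main__":
    # Each incoming connection gets its own thread.
    ThreadingHTTPServer(("localhost", 8000), Handler).serve_forever()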
03:56 In the non-trivial case, you have to share data across your concurrent components. When you’re thinking about this situation, it’s useful to think of three steps in processing: input, computing, and output.
04:09 Concurrency is generally done by splitting up the compute portion, but that may mean that there’s coordination necessary for the input and output stages.
04:18 You may also need coordination amongst the different compute nodes depending on what kind of algorithm you are running.
04:25 Using the input, compute, and output model, you can think of programs in three parts. A producer is a component that produces data. This might mean reading something off the disk.
04:37 The worker is the component that does the actual computation, and the consumer is the component that consumes the data or aggregates the output.
04:47 Consider for a moment doing some image processing on a very large file. The producer portion of your program would read the large image file off the disk. The worker portion of your program would do the actual filtering on the image, and the consumer component would consume the result from the worker and write the new image to the hard drive.
05:08 Depending on the architecture of your program, these concepts might be mixed and matched.
05:14 Using these three ideas, you can introduce different patterns in concurrency. This simple pattern has a producer pushing data to a worker, which pushes data to a consumer.
05:25 It might not be immediately evident where you can get concurrency here, but because producers and consumers are interacting with the disk, a lot of work can be done in the worker while producers and consumers are working independently.
05:39 Thinking back to the image processing example, the producer might read thousands of bytes off the disk and hand them off to the worker. The worker could then manipulate those thousands of bytes, creating a result, which the consumer would then write to the disk.
05:54 Unless the worker is particularly computationally intensive, it will in all likelihood spend most of its time waiting for the producer and consumer, but as soon as the worker has been fed data, it can start working.
06:07 The producer does not have to have read the entire image off of the disk before the worker starts. That overlap is where the concurrency comes from.
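Here is one minimal sketch of that pipeline using threads and queues from the standard library (Python 3.8+). The numeric chunks and the multiply-by-ten “filter” are stand-ins; a real program would be reading and writing actual image data.

import queue
import threading

SENTINEL = None  # signals "no more data"


def producer(out_q):
    # Stand-in for reading chunks of the image off the disk.
    for chunk in range(5):
        out_q.put(chunk)
    out_q.put(SENTINEL)


def worker(in_q, out_q):
    # The compute stage: it can start as soon as the first chunk arrives.
    while (chunk := in_q.get()) is not SENTINEL:
        out_q.put(chunk * 10)  # stand-in for filtering the chunk
    out_q.put(SENTINEL)


def consumer(in_q):
    # Stand-in for writing the processed image back to disk.
    while (result := in_q.get()) is not SENTINEL:
        print("writing", result)


raw, processed = queue.Queue(), queue.Queue()
threads = [
    threading.Thread(target=producer, args=(raw,)),
    threading.Thread(target=worker, args=(raw, processed)),
    threading.Thread(target=consumer, args=(processed,)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()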
06:15 A variation on this pattern uses multiple workers. Your producer reads information off of the disk and hands it off to a worker, which begins computation. The producer then reads more information off the disk and hands it off to a second or third worker.
06:30 The workers work independently, creating results, and the consumer is responsible for writing them back to disk.
06:37 This pattern is common with the processing of very large images. For the most part, an image can be broken up into components and image processing can be done on those components independently.
06:49 This means the workers don’t have to talk to each other very often, giving a high degree of parallelism between the work they’re performing.
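A hedged sketch of that fan-out variation, again with placeholder data: several workers pull chunks from a shared queue, process them independently, and push results to a second queue. In this sketch, the main thread plays the producer and consumer roles.

import queue
import threading

NUM_WORKERS = 3
SENTINEL = None

raw, processed = queue.Queue(), queue.Queue()


def worker(worker_id):
    # Each worker pulls the next available chunk; the workers never
    # talk to each other, so they run with no coordination.
    while (chunk := raw.get()) is not SENTINEL:
        processed.put((worker_id, chunk * 10))  # stand-in for filtering
    raw.put(SENTINEL)  # pass the shutdown signal on to the next worker


workers = [threading.Thread(target=worker, args=(i,)) for i in range(NUM_WORKERS)]
for w in workers:
    w.start()

for chunk in range(10):  # the producer hands out chunks of the "image"
    raw.put(chunk)
raw.put(SENTINEL)

for w in workers:
    w.join()

while not processed.empty():  # the consumer gathers the results
    print(processed.get())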
06:58 A variation on this pattern is for the producer to broadcast. In this case, each worker sees all of the data. It may not operate on all of the data, but as the data is exposed to it, it can continue to work.
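One way to sketch the broadcast variation is to give each worker its own queue and have the producer put a copy of every chunk onto each of them: every worker sees all of the data but only acts on its share. The modulo split used to decide who handles what is an arbitrary placeholder.

import queue
import threading

NUM_WORKERS = 2
SENTINEL = None

# Broadcast: every worker gets its own copy of the full data stream.
inboxes = [queue.Queue() for _ in range(NUM_WORKERS)]


def worker(worker_id, inbox):
    while (chunk := inbox.get()) is not SENTINEL:
        # Every worker sees every chunk but only acts on its share.
        if chunk % NUM_WORKERS == worker_id:
            print(f"worker {worker_id} handled chunk {chunk}")


workers = [
    threading.Thread(target=worker, args=(i, inbox))
    for i, inbox in enumerate(inboxes)
]
for w in workers:
    w.start()

for chunk in range(6):      # the producer broadcasts every chunk...
    for inbox in inboxes:   # ...to every worker's queue
        inbox.put(chunk)
for inbox in inboxes:
    inbox.put(SENTINEL)

for w in workers:
    w.join()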
07:11 These patterns can also be mixed and matched. You can have your producers broadcasting to workers, workers talking to other workers, and those workers consolidating information for the consumer.
07:23 Next up, the challenges that concurrency introduces and how it’s changing in Python as we speak.