00:00 In the previous lesson, I talked about the degrees of latency in the different components of your computer. In this lesson, I’m going to show you how that’s taken advantage of to create concurrency.
00:21 It needs to wait on RAM, disk, and network. Time slicing is the idea of the operating system mixing other programs into these wait states. While you’re waiting for that network packet to come back from Prague, the computer can insert multiple other programs to take advantage of that time.
00:59 You could only run a single program at a time. Cooperative multitasking is when the program willingly gives up the CPU. If you know that you’re about to go out to the network and you’re not going to need the CPU for a while, you signal the operating system and the operating system can schedule a different program. In home computing, Windows 3.1 introduced this kind of multitasking.
01:21 If you happen to be old enough to remember using this operating system, you may recall that at certain times the program would just stop refreshing. That would be because it didn’t give up the CPU, but it was stuck in a wait state, so the UI was not being updated because the program had not told the operating system to put it into the wait state. As the name implies, cooperative multitasking requires all of the processes to cooperate with each other. In shared computing, or in the case of a program with a problem, this can be a challenge.
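To make the cooperative idea concrete, here is a minimal sketch that uses Python generators as stand-ins for programs. The names `task` and `run_round_robin` are made up for this illustration, and this toy scheduler is not how Windows 3.1 or any real operating system was implemented. Each task gives up its turn by reaching a `yield`.

```python
from collections import deque

def task(name, steps, log):
    """A toy program that hands control back after every step."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # voluntarily give up the "CPU"

def run_round_robin(tasks):
    """A toy scheduler: run each task until it yields, then move on."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # run until the task's next yield
            ready.append(current)  # it cooperated, so put it back in line
        except StopIteration:
            pass                   # the task finished; drop it

log = []
run_round_robin([task("A", 2, log), task("B", 3, log)])
print(log)
```

If one of these tasks looped without ever reaching its `yield`, no other task would get a turn, which is the same kind of freeze described above.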
01:54 So preemptive multitasking was created. The operating system, in this case, interrupts the program whether or not it’s ready. The operating system is responsible for swapping between the different programs.
02:36 On larger computers, you can have this time slicing as well as multiple processors. On each CPU, you can now have different programs running. In some cases, the same program may also be resident on those different CPUs.
02:51 In this diagram, programs 1, 4, and 5 are tied to a single CPU, but programs 2 and 3 are across the two CPUs. Part of the operating system’s responsibility is scheduling within the CPU and across the CPUs.
03:20 Trivial concurrency is where the different parts of the program can be split apart without concern for each other. This can usually be achieved when activities within the program are completely independent of each other. This is easiest if there’s no shared data between the different components.
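As a rough illustration of the trivial case, the sketch below counts words in a few unrelated strings. The `word_count` function and the sample `documents` are assumptions made up for this example. Because no piece of work needs anything from another, the items can simply be handed to a pool of threads.

```python
from concurrent.futures import ThreadPoolExecutor

def word_count(text):
    """Work on one item without touching any shared data."""
    return len(text.split())

documents = ["the quick brown fox", "jumps over", "the lazy dog"]

# Each document is independent, so the pool can process them in any order.
# pool.map still returns the results in the order of the inputs.
with ThreadPoolExecutor(max_workers=3) as pool:
    counts = list(pool.map(word_count, documents))

print(counts)
```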
03:56 In the non-trivial case, you have to share data across your concurrent components. When you’re thinking about this situation, it’s useful to think of three steps in processing: input, computing, and output.
04:09 Concurrency is generally done by splitting up the compute portion, but that may mean that there’s coordination necessary for the input and output stages. You may also need coordination amongst the different compute nodes, depending on what kind of algorithm you’re running.
04:37 The worker is the component that does the actual computation. And the consumer is the component that consumes the data or aggregates the output. Consider for a moment doing some image processing on a very large file.
04:52 The producer portion of your program would read the large image format, the worker portion of your program would do the actual filtering on the image, and the consumer component would consume the result from the worker and write the new image to the hard drive. Depending on the architecture of your program, these concepts might be mixed and matched. Using these three ideas, you can introduce different patterns in concurrency.
05:19 This simple pattern has a producer pushing data to a worker, which pushes data to a consumer. It might not be immediately evident where you can get concurrency here, but because producers and consumers are interacting with the disk, a lot of work can be done in the worker while producers and consumers are working independently. Thinking back to the image processing example, the producer might read thousands of bytes off the disk and hand them off to the worker.
05:47 The worker could then manipulate those thousands of bytes, creating a result, which the consumer would then write to the disk. Unless the worker is particularly computationally intensive, it will most likely spend most of its time waiting for the producer and consumer.
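One way to picture this pipeline is with three threads joined by queues. This is only a sketch under some assumptions: the "image" is a couple of small `bytes` objects instead of a file on disk, the filter is a simple inversion, and `SENTINEL` is a made-up marker meaning "no more data".

```python
import queue
import threading

SENTINEL = None  # hypothetical end-of-data marker

def producer(chunks, to_worker):
    """Stand-in for reading chunks off the disk."""
    for chunk in chunks:
        to_worker.put(chunk)
    to_worker.put(SENTINEL)

def worker(to_worker, to_consumer):
    """Invert every byte in each chunk (a toy image filter)."""
    while (chunk := to_worker.get()) is not SENTINEL:
        to_consumer.put(bytes(255 - b for b in chunk))
    to_consumer.put(SENTINEL)

def consumer(to_consumer, output):
    """Stand-in for writing results back to the disk."""
    while (result := to_consumer.get()) is not SENTINEL:
        output.append(result)

chunks = [bytes([0, 10, 20]), bytes([30, 40, 50])]
to_worker, to_consumer = queue.Queue(), queue.Queue()
output = []

threads = [
    threading.Thread(target=producer, args=(chunks, to_worker)),
    threading.Thread(target=worker, args=(to_worker, to_consumer)),
    threading.Thread(target=consumer, args=(to_consumer, output)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(output)
```

With a single worker and first-in, first-out queues, the results arrive in the same order the chunks were produced.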
06:15 A variation on this pattern uses multiple workers. Your producer reads information off of the disk and hands it off to a worker, which begins computation. The producer then reads more information off of the disk and hands it off to a second or third worker. The workers work independently, creating a result, and then the consumer is responsible for writing it back to disk.
06:37 This pattern is common with the processing of very large images. For the most part, an image can be broken up into components, and image processing can be done on those components independently.
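Here is a small sketch of the multiple-worker variation. It assumes the image is just a list of pixel values, the filter doubles each value and caps it at 255, and each chunk carries an index so the consumer can put the pieces back in order. All of the names are made up for this example.

```python
import queue
import threading

def split(data, size):
    """Break the data into chunks of at most `size` items."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def worker(tasks, results):
    """Brighten chunks until the producer signals there are no more."""
    while True:
        item = tasks.get()
        if item is None:  # hypothetical "no more work" marker
            break
        index, chunk = item
        results.put((index, [min(255, p * 2) for p in chunk]))

pixels = list(range(0, 200, 20))
chunks = split(pixels, 4)
tasks, results = queue.Queue(), queue.Queue()

workers = [threading.Thread(target=worker, args=(tasks, results)) for _ in range(3)]
for w in workers:
    w.start()

# Producer: hand out indexed chunks, then one stop marker per worker.
for index, chunk in enumerate(chunks):
    tasks.put((index, chunk))
for _ in workers:
    tasks.put(None)
for w in workers:
    w.join()

# Consumer: results can arrive in any order, so sort by chunk index.
collected = sorted(results.get() for _ in chunks)
brightened = [p for _, chunk in collected for p in chunk]
print(brightened)
```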
06:57 A variation on this pattern is for the producer to broadcast. In this case, each worker sees all of the data. It may not operate on all of the data, but as the data is exposed to it, it can continue to work.
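A broadcast could be sketched by giving every worker its own queue and having the producer copy each value into all of them. The two workers below are made-up examples: one uses every value, and the other sees everything but only acts on the even ones.

```python
import queue
import threading

def summing_worker(inbox, results):
    """Uses every value it sees."""
    total = 0
    while (value := inbox.get()) is not None:
        total += value
    results["sum"] = total

def even_counting_worker(inbox, results):
    """Sees every value, but only acts on the even ones."""
    evens = 0
    while (value := inbox.get()) is not None:
        if value % 2 == 0:
            evens += 1
    results["evens"] = evens

def broadcast(values, inboxes):
    """Producer: send each value to every worker, then a stop marker."""
    for value in values:
        for inbox in inboxes:
            inbox.put(value)
    for inbox in inboxes:
        inbox.put(None)

results = {}
inboxes = [queue.Queue(), queue.Queue()]
workers = [
    threading.Thread(target=summing_worker, args=(inboxes[0], results)),
    threading.Thread(target=even_counting_worker, args=(inboxes[1], results)),
]
for w in workers:
    w.start()
broadcast([3, 8, 5, 12, 7], inboxes)
for w in workers:
    w.join()

print(results)
```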
07:11 These patterns can also be mixed and matched. You can have your producers broadcasting to workers, workers talking to other workers, and those workers consolidating information for the consumer.
07:23 Programming concurrent systems introduces all sorts of new challenges to your software. Some things to consider. You must make sure that you’ve got execution coordination—different processes have to be able to sync up with each other.
07:36 Think of the image processing example. Let’s say your image was a megabyte large and split up into 10-kilobyte chunks. Each worker could work on one of those 10-kilobyte chunks, but in all likelihood, the borders between those 10-kilobyte chunks might impact the calculation in the chunk of image next door. So although for the most part the workers can work independently, when they reach the borders of the portions they’re working on, they may need to speak with other workers.
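The border problem shows up even in a tiny example. The sketch below assumes a one-dimensional "image" and a three-point averaging blur, both made up for illustration. Instead of having workers talk to each other, it takes one common shortcut: each chunk is handed one extra pixel from each neighbor, and those extra pixels are thrown away after the blur. The chunks are processed in a plain loop here, but each one could be given to a separate worker.

```python
def blur(pixels):
    """Three-point moving average, clamping at the ends of the list."""
    n = len(pixels)
    return [
        (pixels[max(i - 1, 0)] + pixels[i] + pixels[min(i + 1, n - 1)]) / 3
        for i in range(n)
    ]

def blur_in_chunks(pixels, size):
    """Blur chunk by chunk, borrowing one border pixel from each neighbor."""
    out = []
    for start in range(0, len(pixels), size):
        stop = min(start + size, len(pixels))
        lo = max(start - 1, 0)            # include the left neighbor's edge
        hi = min(stop + 1, len(pixels))   # include the right neighbor's edge
        padded = blur(pixels[lo:hi])
        offset = start - lo
        out.extend(padded[offset:offset + (stop - start)])  # drop borrowed pixels
    return out

pixels = [10, 20, 90, 40, 50, 60, 0, 80, 30, 100]
print(blur_in_chunks(pixels, 4) == blur(pixels))
```

Without the borrowed pixels, the values at each chunk boundary would come out differently from blurring the whole image at once.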
08:05 Once you have multiple processes operating, the operating system or the programmer has to determine how memory is allocated across those processes. This is hard enough in a regular program, and it becomes more complicated when dealing with concurrency. Scheduling is about determining which processes are active and when.
08:25 Thankfully, operating systems have done a lot of work on this. For the most part when you’re writing concurrent programs, you just let the operating system do the scheduling. But of course, that also means you’re giving up control, so there may be situations where this isn’t ideal.
08:40 Typically, the reason you’re writing a concurrent application in the first place is to be able to get things done faster. And really what you’re talking about here is higher throughput—more work done per unit time. How you manage execution coordination, memory allocation, and scheduling can change the throughput on your system, and you might need to fine-tune how these things behave in order to actually see speed-up. Somewhat related to scheduling is also distribution.
09:19 Deadlocks are when two or more components are waiting on each other. If A is expecting B to do something, and B is waiting for A to do something, nothing is ever going to happen. And finally, resource starvation may also be a problem.
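The A-waits-for-B situation can be sketched with two locks. In this made-up example, the `Account` class and `transfer` function are assumptions for illustration. Two threads move money in opposite directions, which could deadlock if each grabbed its own source account's lock first. One common way to avoid that, shown here, is to always take the locks in the same fixed order.

```python
import threading

class Account:
    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        self.lock = threading.Lock()

def transfer(source, target, amount):
    """Move money while holding both locks, always taken in name order."""
    first, second = sorted((source, target), key=lambda account: account.name)
    with first.lock, second.lock:
        source.balance -= amount
        target.balance += amount

def repeat_transfer(source, target, amount, times):
    for _ in range(times):
        transfer(source, target, amount)

a = Account("a", 5000)
b = Account("b", 5000)

# The two threads transfer in opposite directions at the same time.
t1 = threading.Thread(target=repeat_transfer, args=(a, b, 1, 1000))
t2 = threading.Thread(target=repeat_transfer, args=(b, a, 2, 1000))
t1.start()
t2.start()
t1.join()
t2.join()

print(a.balance, b.balance)
```

Because both threads ask for lock "a" before lock "b", neither can end up holding one lock while waiting forever on the other.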
09:50 You can run Python in a window or in your IDE and let the operating system take care of the fact that other things are happening at the same time. If you have multiple processors available on your computer, you can use the multiprocessing library in Python to have code execute on those different processors.
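As a minimal sketch of running code on several processors, the example below uses `Pool` from the standard library's `multiprocessing` module. The `count_primes` function and the chosen limits are assumptions made up for this illustration. It is simply a deliberately slow, CPU-heavy calculation.

```python
from multiprocessing import Pool

def count_primes(limit):
    """Count the primes below `limit` using slow trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

def main():
    limits = [10_000, 20_000, 30_000]
    # Each limit is handed to a separate process, so the calculations
    # can run on different processors at the same time.
    with Pool() as pool:
        print(pool.map(count_primes, limits))

if __name__ == "__main__":
    main()
```

The `if __name__ == "__main__":` guard matters here because, on some platforms, each new process imports the script again, and the guard stops it from starting pools of its own.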
10:09 Modern operating systems have a lighter-weight way of getting concurrency called threads. Each thread in a program can execute different code. Threads can then be scheduled to take advantage of the I/O-bound nature of your program.
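Here is a small sketch of threads overlapping their waiting. The `fetch` function is made up for this example and uses `time.sleep()` as a stand-in for waiting on the network or disk.

```python
import threading
import time

def fetch(name, delay, results):
    """Pretend to wait on I/O, then record a result."""
    time.sleep(delay)  # stand-in for a network or disk wait
    results[name] = f"{name} done"

results = {}
threads = [
    threading.Thread(target=fetch, args=(f"request-{i}", 0.1, results))
    for i in range(3)
]

start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# The three waits overlap, so the total is typically much closer to
# one delay than to the sum of all three.
print(sorted(results), f"{elapsed:.2f}s")
```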
In addition to having threads in the standard library, Python has another mechanism using coroutines called asyncio. The rest of this course describes the asyncio, threading, and multiprocessing standard libraries in Python and the differences between them. When considering concurrency inside of Python, you have to be aware of the GIL.
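For a first taste of the coroutine approach, here is a minimal sketch using the standard library's `asyncio` module. The `fetch` coroutine is made up for this example, and `asyncio.sleep()` stands in for a real I/O wait.

```python
import asyncio

async def fetch(name, delay):
    """Pretend to wait on I/O without blocking other coroutines."""
    await asyncio.sleep(delay)
    return f"{name} done"

async def main():
    # Both coroutines wait at the same time. gather() returns the
    # results in the order the coroutines were passed in.
    return await asyncio.gather(fetch("a", 0.2), fetch("b", 0.1))

results = asyncio.run(main())
print(results)
```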
11:05 If you have this kind of race condition inside of memory allocation, you will get memory leaks. The GIL was introduced to solve this problem. One of the powers of Python is the ability to extend it using C. You can write code in C and plug it in underneath and then have it called by Python. Although this makes Python powerful, it also makes memory allocation challenging.
11:29 Now you have to deal with memory both at the interpreter level and at the C extension level. The GIL is particularly important when dealing with these kinds of interactions. If you do a quick Google on Python and GIL, you’ll find all sorts of opinions out there. It is seen by some as a limitation. Whatever you feel about it, it’s deeply nested inside of the interpreter, and removing it is not a simple thing.
12:05 This is one of the reasons it’s stuck around for such a long time. To be clear, the GIL is an implementation detail of the interpreter. CPython and PyPy both have it but Jython and IronPython do not use the GIL, so depending on your underlying interpreter, you may or may not run into this.
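If you’re not sure which interpreter you’re running, the standard library can tell you. This short sketch uses `platform.python_implementation()`. The `has_gil_by_default` name is made up for this example and simply encodes the CPython and PyPy statement above.

```python
import platform

# Returns a string such as "CPython", "PyPy", "Jython", or "IronPython".
implementation = platform.python_implementation()

# Assumption for illustration: based on the interpreters listed above.
has_gil_by_default = implementation in {"CPython", "PyPy"}

print(implementation, has_gil_by_default)
```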
12:25 This course was recorded in the fall of 2020. At the time of recording, PEP 554 was being worked on. This PEP introduces Multiple Interpreters in the Stdlib. This concept is something called subinterpreters.
12:40 CPython already has this idea and it allows C extensions to kick off multiple Python interpreters at a time. Each subinterpreter has its own GIL, and so C extension level code is able to get concurrency without the challenges of the GIL involved.
12:57 The interpreters are essentially independent of each other. PEP 554 proposes exposing these interpreters in the Python standard library. This doesn’t fix the GIL, but would give programmers access to more tools that allow them to work around the GIL. And as changes happen around the GIL, they could be exposed to the programmers earlier inside of these subinterpreters, hopefully improving the ability to do concurrency in Python.
Although the GIL is a challenge, the I/O-bound nature of most programs means that you can still do concurrency in Python. In the next lesson, I’ll introduce you to the threading library, a good way to do just that.