
Speed Up Python With Concurrency (Summary)

You now know how to decide which concurrency method to use for a given problem, or whether you should use any at all! You’ve also gained a better understanding of some of the problems that can arise when you’re using concurrency.

In this course, you’ve learned how to:

  • Understand how latency between the CPU and the components of your computer provides opportunities for concurrency
  • Use the threading library to write concurrent programs
  • Write code using async and await with the asyncio library
  • Get full use of all your CPUs with the multiprocessing library
  • Distinguish between I/O bound and CPU bound workloads

Here are resources for additional information about latency:

Here are resources about concurrency and the Python GIL:

Here are resources about PEP 554 and subinterpreters:

Here are resources about threading, futures and asyncio:

Here are resources about multiprocessing and distributed programming:


Sample Code (.zip), 8.2 KB

Course Slides (.pdf), 1.6 MB

00:00 In the previous lesson, I talked about the difference between I/O-bound and CPU-bound workloads. In this final topic, I’ll summarize the contents of the course and point you at some further reading.

00:12 Different programs use your computer in different ways. A lot of software is I/O-bound, spending most of its time waiting for the disk or network in comparison to the amount of cycles available to do computation. Some kinds of problems are CPU-bound, meaning they spend most of their time using the CPU to do work.

00:33 A large amount of latency is involved in getting off the CPU and going to memory, even more in going to disk, and even more in going to the network. Single-processor computers are able to look like they’re doing multiple things at a time because they’re quickly switching back and forth between programs, taking advantage of this latency. This apparently simultaneous work is called concurrency.

00:55 This course introduced you to a number of patterns to help you think about how concurrency works. The first was the pipeline: a producer responsible for feeding data into the computation, a worker that does the computation on that data, and a consumer that consolidates the worker’s output.
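As a minimal sketch of that pipeline, each stage below runs in its own thread and hands data along through the standard library’s `queue.Queue`. This is an illustration rather than code from the course; the squaring step stands in for whatever real computation the worker would do.

```python
import queue
import threading

def producer(in_q):
    # Feed data into the computation, then signal the end with a sentinel.
    for item in range(5):
        in_q.put(item)
    in_q.put(None)

def worker(in_q, out_q):
    # Transform each item and pass the result along; forward the sentinel.
    while (item := in_q.get()) is not None:
        out_q.put(item * item)
    out_q.put(None)

def consumer(out_q, results):
    # Consolidate the worker's output.
    while (item := out_q.get()) is not None:
        results.append(item)

in_q, out_q, results = queue.Queue(), queue.Queue(), []
threads = [
    threading.Thread(target=producer, args=(in_q,)),
    threading.Thread(target=worker, args=(in_q, out_q)),
    threading.Thread(target=consumer, args=(out_q, results)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)  # [0, 1, 4, 9, 16]
```

Because `queue.Queue` is thread-safe, the stages need no explicit locking to hand items to each other.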

01:16 The N-workers pattern uses a similar model, but multiplies the number of workers. This pattern is particularly useful in CPU-bound computing. Having one worker for each CPU can drastically speed up your program.

01:31 In the N-workers pattern, the producer breaks up the data and passes chunks on to each of the workers. The broadcast pattern is a variation on this, where the producer sends all of the data to each of the workers and the workers themselves decide what to work on.
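The producer-side chunking in the N-workers pattern can be sketched like this. The `chunk()` helper is a hypothetical illustration, and `ThreadPoolExecutor` is used only to keep the example self-contained; for genuinely CPU-bound work you would swap in `ProcessPoolExecutor` from the same module.

```python
from concurrent.futures import ThreadPoolExecutor

def chunk(data, n):
    # Producer side: break the data into n roughly equal chunks.
    size = (len(data) + n - 1) // n
    return [data[i:i + size] for i in range(0, len(data), size)]

def work(items):
    # Each worker processes its own chunk independently.
    return sum(x * x for x in items)

data = list(range(100))
n_workers = 4
with ThreadPoolExecutor(max_workers=n_workers) as pool:
    partials = list(pool.map(work, chunk(data, n_workers)))

# Consumer side: consolidate the partial results.
total = sum(partials)
print(total)  # 328350, the sum of squares 0..99
```

One worker per chunk keeps the workers independent, so the only coordination needed is the final consolidation step.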

01:48 Python includes three different modules to meet your concurrency needs in the standard library. The first is threading, which helps you do I/O-bound processing and is tied to the threads inside of your operating system. The second is asyncio, which is an event loop and coroutine mechanism that is similar to threading, but is completely contained inside of the Python interpreter and isn’t operating system-dependent.
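As a rough sketch of that event-loop model, `asyncio.sleep()` below stands in for real I/O such as a network request; while one coroutine awaits, the loop runs the other, all on a single thread.

```python
import asyncio

async def fetch(name, delay):
    # Simulate an I/O-bound task: the event loop runs other
    # coroutines while this one is awaiting.
    await asyncio.sleep(delay)
    return f"{name} done"

async def main():
    # Run both "requests" concurrently; gather() preserves argument order.
    return await asyncio.gather(
        fetch("first", 0.2),
        fetch("second", 0.1),
    )

results = asyncio.run(main())
print(results)  # ['first done', 'second done']
```

The total runtime is roughly the longest single delay rather than the sum of the delays, which is the whole point of overlapping I/O waits.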

02:13 And last, is the multiprocessing library that allows you to spin up multiple interpreters across the CPUs on your computer.

02:21 When thinking about concurrency in your software, the first thing you need to do is decide whether or not you really need it. There’s additional overhead and more code necessary just to manage the concurrency, so make sure that you’re actually going to benefit before you write that code.

02:37 If you are going to use concurrency, determine whether or not your problem is an I/O-bound problem. If it is, then threading or asyncio would be your answer.

02:47 If it isn’t, then you need to use the multiprocessing library. In the case of an I/O-bound program, you should prefer asyncio over threading if you can. It tends to be more efficient and requires less overhead.

03:00 Not all libraries support asyncio, so this decision may actually be made for you, depending on your third-party library needs. Finally, be careful about how your concurrent program deals with memory and the interactions between the parallel portions of your software. threading and asyncio use the same interpreter, so you have to be careful about race conditions messing up your results. In the multiprocessing situation, you don’t have this problem, but you need to do extra work to get the different processes talking to each other and sharing values.
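A small illustration of the threading case: the counter below is shared between threads, and the increment is safe only because each read-modify-write happens under a `threading.Lock`. (Removing the lock makes the result unpredictable, since the increment is not atomic.)

```python
import threading

counter = 0
lock = threading.Lock()

def add_many(n):
    global counter
    for _ in range(n):
        # counter += 1 is a read-modify-write; without the lock,
        # two threads could interleave here and lose updates.
        with lock:
            counter += 1

threads = [threading.Thread(target=add_many, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 400000, guaranteed only because of the lock
```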

03:32 If you’d like to learn more about latency inside of software, these two articles are helpful. The second article is the original, and the first article was written by somebody else doing an update on the numbers.

This is where I got a lot of the data about component timing inside of the computer. The general Concurrency article on Wikipedia gives you a high-level introduction to the topic and points you to different models and ways of thinking about it from a computer science perspective.

04:00 If you want to learn more about the GIL, there’s an article available on Real Python, or you can go to the Python Wiki to see some internals. If you’re interested in the subinterpreters, PEP 554 has the proposed changes, and this article on Medium discusses the pros and cons of the approach.

04:20 If you’re interested in threading without using the futures library, the Python docs are probably the best place to start, or you can read this introduction on Real Python. Generally, I wouldn’t recommend using the old-school methods. Take advantage of concurrent.futures if you can.
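A minimal sketch of the concurrent.futures approach, with a placeholder `download()` function and example URLs standing in for real I/O-bound work:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def download(url):
    # Placeholder for real I/O work (an HTTP request, a file read, ...).
    return f"fetched {url}"

urls = ["https://example.com/a", "https://example.com/b"]

with ThreadPoolExecutor(max_workers=2) as pool:
    # submit() returns a Future; as_completed() yields them as they finish.
    futures = {pool.submit(download, u): u for u in urls}
    results = [f.result() for f in as_completed(futures)]

print(sorted(results))
```

The same code works with `ProcessPoolExecutor` swapped in, which is one reason the executor interface is preferred over hand-managed threads.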

04:37 More information on these can also be found in the documentation. If you want to dig into asyncio, here’s a good article introducing you to the concepts, and this conversation on Stack Overflow goes into great detail about how it actually works. This is the link to the multiprocessing library, and this is an excellent article that introduces you to the different concepts.

05:00 Finally, if you want to up your concurrency game, there’s nothing like making things concurrent across multiple computers. This is referred to as distributed computing.

05:09 This used to be something that was extremely difficult to do unless you had a rack full of servers available to yourself. Now with the advent of Amazon Web Services, Google Cloud Platform, Azure, and other services like it, you have access to someone else’s large warehouse filled with computers.

05:26 This page at the Python Wiki shows different tools that you can use for doing distributed programming. And then finally, Dask and Celery are two common Python libraries that you can use to attack these kinds of problems.

05:41 Thanks for your attention. I hope this course has been useful for you.

frankhofstede on Dec. 15, 2020

I think the celery link is broken.

Chris Bailey RP Team on Dec. 15, 2020

Hi @frankhofstede, it looks like that link is currently down, not sure why. But these links may work to get you more information on Celery and how to use it.

Lin Gao on Dec. 27, 2020

Nice course! Thanks to this course, I finally understand the difference between threads and processes and when we should take advantage of multithreading or multiprocessing. One suggestion based on my experience taking a high-performance-computing course before: show in depth how to tune the number of threads/processes using a systematic approach.

blackray on Dec. 28, 2020

Very nice write-up. This is one of those advanced Python topics that is a must-read for data engineers.

Christopher Trudeau RP Team on Dec. 28, 2020

Hi Lin,

Yeah, that’s a tough topic and is a little bit black-magical. Process-wise, you typically don’t want to exceed the number of processors on your machine. Thread-wise, it depends on how I/O-bound your computation is. You also have to factor in the extra complexity in your code and the overhead of inter-concurrent communication.

This could make a good article topic on its own. …ct

Pavlo Kurochka on Dec. 29, 2020

Excellent course. I finally got a comprehensive and current overview of the options and the reasoning behind choosing one over the other. Code samples are great too.
