The concurrent.futures Module
In this lesson, you’ll see that the concurrent.futures module is the newer way of doing asynchronous computation in Python. It has a clean interface for working with process pools and thread pools and is only available in Python 3.
You’ll replace your multiprocessing code with code from the concurrent.futures module. When working with this module, you use classes that have Executor in their names. They represent different execution strategies for how your code is run in parallel, whether that’s across multiple processes or multiple threads within a single process, and they all follow the context manager protocol.
00:00 All right. Let’s replace this code with the concurrent.futures module.
00:07 This is the new and shiny way to do asynchronous computation in Python. It has a clean interface for working with process pools and also thread pools,
00:20 and it’s kind of cool. It’s only available in Python 3. The first thing I’m going to show you is how we can replace this multiprocessing code here with code from the concurrent.futures module.
00:29 Let’s just get this set up. Here, I can go concurrent.futures.ProcessPoolExecutor(). The way this interface works in the concurrent.futures module is that you have these different classes that are called executors, and they are different execution strategies for how your code is run in parallel, whether that’s across multiple processes or multiple threads within a single process. They all follow the context manager protocol, so we can just enter this executor here and then do stuff with it.
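As a rough sketch of that idea: because both executor classes share one interface, a helper can take the class as a parameter and switch between threads and processes without other changes. The names square() and run_with() are hypothetical and do not come from the lesson's code.

```python
import concurrent.futures


def square(x):
    # Hypothetical stand-in for real work; not the lesson's transform().
    return x * x


def run_with(executor_class, func, data):
    # Entering the executor creates the pool. Leaving the block shuts it
    # down and waits for pending work, so no manual cleanup is needed.
    with executor_class() as executor:
        return list(executor.map(func, data))


if __name__ == "__main__":
    # The guard matters for process pools on platforms that start workers
    # by re-importing the main module (for example, Windows).
    data = [1, 2, 3, 4]
    print(run_with(concurrent.futures.ThreadPoolExecutor, square, data))   # [1, 4, 9, 16]
    print(run_with(concurrent.futures.ProcessPoolExecutor, square, data))  # [1, 4, 9, 16]
```

Only the class passed in changes between the two calls; the with block and the map() call stay the same.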
01:08 It makes it very easy to do the cleanup here, as well. Here, I can just go result = executor.map(), and again, you can see here the central importance of this map() function as a parallel processing primitive. I’m just going to pass it my transform() function and my input data, and hopefully, this is going to run. All right, now as you can see here, we’re getting pretty much the same result that we got with the multiprocessing-based implementation. Again, this is fanning out and running across four processes in parallel, so it’s doing the transform() calculations in parallel. It takes about two seconds to complete, and then we’re getting this <itertools.chain object>.
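The step described here might look roughly like the sketch below. The body of transform(), the input data, and max_workers=4 are assumptions made for illustration; the lesson's actual function and data are not shown in this transcript.

```python
import concurrent.futures
import time


def transform(x):
    # Hypothetical stand-in for the lesson's transform(): sleep to
    # simulate one slow calculation per item.
    time.sleep(0.25)
    return x * 10


def main():
    data = range(8)  # made-up input data
    start = time.perf_counter()
    # max_workers=4 is an assumption chosen to mirror the four processes
    # mentioned in the lesson; the default depends on the machine.
    with concurrent.futures.ProcessPoolExecutor(max_workers=4) as executor:
        result = executor.map(transform, data)
        print(result)  # an iterator object, not the values themselves
        values = list(result)  # results come back in input order
    elapsed = time.perf_counter() - start
    print(values, f"{elapsed:.2f}s")
    return values


if __name__ == "__main__":
    main()
```

With eight items and four workers, the elapsed time should land well under the two seconds a sequential loop would need, though the exact figure depends on the machine.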
01:54 This is maybe a small difference from what you’ve seen before, where multiprocessing.Pool.map() gives you a list of results, whereas this will give you an iterator.
02:08 And if I wanted to convert that back into an immutable data structure,
02:15 I’d probably just call tuple() on it. And again, you might want to go back to some of the previous videos to see why I needed to do that, because I explained it, I think, in the video on the map() operation that’s built into Python directly. Again, we’re going to rerun this, and now we’re getting the expected output because we’re consuming this iterator and turning it into a tuple with all these output elements so we can print them nice and cleanly.
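A small sketch of why the conversion matters: the object returned by executor.map() can only be consumed once, so the tuple is what you keep around afterward. Here, transform() and collect() are hypothetical names used for illustration.

```python
import concurrent.futures


def transform(x):
    # Hypothetical stand-in for the lesson's transform() function.
    return x + 1


def collect(data):
    with concurrent.futures.ProcessPoolExecutor() as executor:
        result = executor.map(transform, data)
        first_pass = tuple(result)   # consumes the iterator
        second_pass = tuple(result)  # nothing left to consume
    return first_pass, second_pass


if __name__ == "__main__":
    first_pass, second_pass = collect([1, 2, 3])
    print(first_pass)   # (2, 3, 4)
    print(second_pass)  # ()
```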
Martin Breuss RP Team on April 27, 2021
Awesome, thanks for sharing this here, and nice work for figuring it out! 🙌
squeakyboots on April 27, 2021
Ha, just killed at least 30m because I named the file for the last section in the directory I’m working through this tutorial in as ‘multiprocessing.py’ so when trying to run the examples with concurrent.futures in a new file it failed saying
  File "C:\Users\[user]\AppData\Local\Programs\Python\Python39\lib\concurrent\futures\process.py", line 52, in <module>
    import multiprocessing.connection
  ModuleNotFoundError: No module named 'multiprocessing.connection'; 'multiprocessing' is not a package
because it was importing my own code instead of the actual multiprocessing package, which apparently concurrent.futures uses =p
That will teach me to be careful with my file naming =D
Also, the “is not a package” error is a big tell when you think you’re trying to import a package. I learned double on this video =)