Parallel Processing With concurrent.futures: Overview
In this section, you’ll learn how to do multithreading and parallel programming in Python using functional programming principles and the concurrent.futures module.
You’ll take the example data set based on an immutable data structure that you previously transformed using the built-in map() function. But this time, you’ll process the data in parallel, across multiple threads, using Python 3’s concurrent.futures module, which is available in the standard library.
You’ll see, step by step, how to parallelize an existing piece of Python code so that it can execute much faster and leverage all of your available CPU cores and computing power. You’ll learn how to use the ProcessPoolExecutor and ThreadPoolExecutor classes and their parallel map implementations, which make parallelizing most Python code written in a functional style a breeze.
By knowing the difference between the two executors available in the concurrent.futures module, you’ll be able to parallelize your Python functions across multiple threads and across multiple processes. You’ll get a brief introduction to the Python Global Interpreter Lock, also known as the GIL, and see how you can work around its limitations by using the correct executor implementation.
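As a rough sketch of what that looks like in code: both executor classes expose the same map() method, so the executor type is the only part that changes. The names square and parallel_map are hypothetical helpers made up for this illustration, not code from the course.

```python
import concurrent.futures


def square(x):
    # Hypothetical stand-in for the per-record transformation.
    return x * x


def parallel_map(executor_cls, func, items):
    # Both executor classes share the same map() interface, so the
    # executor type is the only thing that differs between the
    # thread-based and the process-based version.
    with executor_cls() as executor:
        # executor.map() yields results in the order of the inputs.
        return tuple(executor.map(func, items))


if __name__ == "__main__":
    data = (1, 2, 3, 4)
    print(parallel_map(concurrent.futures.ThreadPoolExecutor, square, data))
    # Passing concurrent.futures.ProcessPoolExecutor instead runs the
    # same map across worker processes. Each process has its own
    # interpreter and its own GIL, but func and items must be picklable.
```

Threads share one interpreter, so CPU-bound work in a ThreadPoolExecutor is still serialized by the GIL. A process pool avoids that at the cost of starting processes and pickling the data.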
Once again, you’ll use your little testbed program from the last video to measure the execution time with the time.time() function. This allows you to compare the single-threaded and multithreaded implementations of the same algorithm.
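A minimal sketch of that kind of measurement might look like the following. It assumes a made-up slow_double function that sleeps to simulate waiting on I/O. The helper names (timed, run_sequential, run_threaded) and the data are illustrative assumptions, not the testbed from the video.

```python
import concurrent.futures
import time

DATA = tuple(range(8))  # hypothetical input records


def slow_double(x):
    # Sleeping simulates I/O-bound work, which releases the GIL.
    time.sleep(0.05)
    return x * 2


def run_sequential():
    # Plain built-in map(): one record after another.
    return tuple(map(slow_double, DATA))


def run_threaded():
    # Same transformation, spread across four worker threads.
    with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
        return tuple(executor.map(slow_double, DATA))


def timed(label, run):
    # Measure wall-clock time around a call using time.time().
    start = time.time()
    result = run()
    elapsed = time.time() - start
    print(f"{label}: {elapsed:.2f}s")
    return result, elapsed


if __name__ == "__main__":
    timed("sequential", run_sequential)
    timed("threaded", run_threaded)
```

Because the simulated work only waits, the threaded run should finish noticeably sooner. For CPU-bound work, the gap would depend on the executor you choose.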
Hey there and welcome to another video in my Functional Programming in Python series. In the last video, you saw how to take a piece of code that used the built-in map() function and refactor it so that it works in a parallel processing fashion, so it gets executed in parallel, processing multiple records at the same time. That can lead to huge speedups in the execution time.
We did that using the multiprocessing module that’s available in Python 2 and Python 3. Now, I already hinted at this towards the end of the previous video: there are other ways to implement parallel processing using a functional programming style in Python. So, what I want to show you in this video is how to use the concurrent.futures module that’s built into Python 3. So, that’s not available in Python 2, but it’s kind of the nice and clean interface for doing parallel processing and parallel programming in Python 3. All right.
So, let’s bring back the
multiprocessing implementation for a second, and just to run this example program again… So, what you can see here is that, well, we’re taking this input data set, we’re generating this output here, and this takes about two seconds to complete using
01:18 We can see here, based on our logging output that I set up, that the work is distributed across a bunch of different processes. We have these four worker processes, here, that we can identify based on their process ID, and they’re working on these records in parallel.