How to Create a multiprocessing.Pool() Object
In this lesson, you’ll create a multiprocessing.Pool object. This is an interface that you can use to run your transform() function on your input data in parallel, spread out over multiple CPU cores. This Pool instance has a map() method, so you can map your transform() function over scientists.
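The setup described above could look roughly like the following sketch. The names transform() and scientists come from the lesson, but the record type, the sample data, the one-second sleep, and the REFERENCE_YEAR constant are assumptions made for illustration, since the lesson's actual code is not shown here.

```python
import collections
import multiprocessing
import os
import time

# Hypothetical record type and sample data (assumptions for illustration).
Scientist = collections.namedtuple("Scientist", ["name", "born"])

scientists = (
    Scientist(name="Ada Lovelace", born=1815),
    Scientist(name="Emmy Noether", born=1882),
    Scientist(name="Marie Curie", born=1867),
    Scientist(name="Tu Youyou", born=1930),
    Scientist(name="Ada Yonath", born=1939),
    Scientist(name="Vera Rubin", born=1928),
    Scientist(name="Sally Ride", born=1951),
)

REFERENCE_YEAR = 2017  # assumed year used to compute an age


def transform(record):
    """Simulate one second of work, then return a transformed record."""
    print(f"Process {os.getpid()} working on {record.name}")
    time.sleep(1)
    result = {"name": record.name, "age": REFERENCE_YEAR - record.born}
    print(f"Process {os.getpid()} done with {record.name}")
    return result


if __name__ == "__main__":
    # With no arguments, Pool() sizes itself from the CPU count
    # reported by the operating system.
    with multiprocessing.Pool() as pool:
        # map() blocks until every record has been transformed and
        # returns the results in the same order as the input.
        result = pool.map(transform, scientists)
    print(result)
```

The `if __name__ == "__main__":` guard matters here: on platforms that start worker processes by re-importing the main module, creating the pool at module level would fail.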
Now, when you run your program, you’ll see that you get the same result, but you get it a lot faster. This happened because you did your processing in two batches. In the next lesson, you’ll keep working with
So, what I’m going to do now is, we’re going to replace the sequential step here. We’re going to replace it with some multiprocessing code. What we need to do here, first, is we need to create a multiprocessing.Pool object and we need to store that somewhere. A multiprocessing.Pool, it’s basically an interface that we can use to run our transformation, or our transform() function, on this input
01:03 So remember, before, this took about seven seconds to execute. If we run this again, now—well, we’re getting a way different output, right? It looks like we’re actually starting the processing here for four records all at once, and then they all complete as a batch of four, and we have another three—I guess that’s the remaining ones—and then those complete as well.
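The batching seen in that output can be reproduced with a small timing sketch. The pool size of four is an assumption inferred from the "batch of four" in the output (the transcript does not state it), and slow_square() and estimated_batches() are hypothetical helpers standing in for the lesson's transform() function.

```python
import math
import multiprocessing
import time


def slow_square(x):
    """Hypothetical stand-in for transform(): one second of work per record."""
    time.sleep(1)
    return x * x


def estimated_batches(n_records, n_workers):
    """Rough number of rounds needed when each worker takes one record at a time."""
    return math.ceil(n_records / n_workers)


if __name__ == "__main__":
    records = range(7)

    start = time.time()
    # Assumed pool size of four workers, fixed explicitly for the demo.
    with multiprocessing.Pool(processes=4) as pool:
        result = pool.map(slow_square, records)
    elapsed = time.time() - start

    print(result)
    print(f"Estimated batches: {estimated_batches(len(records), 4)}")
    print(f"Elapsed: {elapsed:.2f}s")
```

With seven one-second records and four workers, the estimate is two rounds, so the elapsed time should land near two seconds plus process start-up overhead, rather than the roughly seven seconds of the sequential version.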