Multiprocessing Testbed Overview
In this lesson, you’re going to look at a little testbed program that you’re going to build and use to measure execution time with the
time.time() function, so that you can compare the single-threaded and multithreaded implementations of the same algorithm.
In the next lesson, you’ll see why you’d want to do all this. Because you wrote your code in a functional programming style, you can parallelize it fairly easily. There’s a parallel
map construct that you can use. That way, you can run your processing steps in parellel.
00:00 So, what I’ve got here is a simple testbed that brings back our data structure here. We’ve got this immutable data structure that represents a bunch of scientists that worked in different fields, when they were born, their names, whether or not they won the Nobel Prize at some point.
then create a transformed output data structure that contains the scientist’s name and their age as of 2017. All of this is going to look familiar, so at this point, you probably want to check out the first video in the series on how and why we set up this data structure the way it looks like here—and, of course, also here, that’s about the
namedtuple—and then you probably will also want to watch the video on the
map() function that showed you how we can take this
scientists structure here and transform it into a new data
structure. All right, so, let’s run this example because I think that’s going to make it a little bit easier to follow. I saved this as
parallel.py, and when I run this, you can see here, well, okay, we’re just printing out the input data structure—this
scientists thing here—and it looks like this.
We get all of the ages for these scientists and their names. Now, this
transform() function here is really simple, right? We’re working with a relatively small amount of data here, and we’re working with a simple transformation that doesn’t take a lot of computations and it will execute very fast.
01:43 But if you imagine if this was a more complex operation, here—for example, if this needed to go out and fetch some data from the internet and then process it. So, if it was I/O bound, it was waiting for I/O to complete, for that website fetch to complete.
Or, if this was doing some more extensive number crunching, then performing this
map() operation would actually take quite a while, right? Like, if I run this right now, it’s instantaneous, but if we needed to do something more complicated here, this would actually take longer.
little bit slower. We’re just going to insert a
time.sleep() call here and we’re just going to sleep for a second. And if I run this now I can see here, this actually takes a little bit longer to process.
Become a Member to join the conversation.