Comparing the Performance Between Loops and Generators
00:00 Comparing the Performance Between Loops and Generators. As always, when measuring performance, you shouldn’t read too much into any one set of results. Instead, design a test for your own code with your own real-world data before you make any important decisions.
00:18 You also need to weigh complexity against readability. Sometimes shaving off a few milliseconds just isn’t worth it. For this test, you’ll want to create a function that can create lists of an arbitrary size with a certain value at a certain position.
00:38 First, pprint is imported. It will be used to output the built list to make it more readable. Otherwise, the list would appear on a single line by default, making it much harder to read.
00:49 The build_list() function creates a list filled with identical items. All items in the list, except for one, are copies of the fill argument.
00:58 The single outlier is the value argument, and it’s placed at the index provided by the at_position argument. With this function, you’ll be able to create a large set of lists with the target value at various positions in the list.
01:13 You can use this to compare how long it takes to find an element at the start and the end of the list.
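The transcript describes build_list() without showing it, so here is a minimal sketch of what such a function might look like. The parameter names fill, value, and at_position come from the narration. The size parameter, the list-comprehension body, and the sample dictionaries are assumptions made for illustration, not necessarily the code shown on screen.

```python
from pprint import pprint


def build_list(size, fill, value, at_position):
    """Return a list of `size` items: `value` at `at_position`, `fill` elsewhere."""
    return [value if i == at_position else fill for i in range(size)]


# Hypothetical sample data: every item is the filler except the one at index 2.
example = build_list(
    size=4,
    fill={"country": "Nowhere", "population": 10},
    value={"country": "Atlantis", "population": 100},
    at_position=2,
)

# pprint wraps long structures across lines instead of printing one long line.
pprint(example)
```

Note that every filler slot refers to the same fill object, which is fine for a read-only search test but would matter if the items were later modified.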
01:35 The build_list() function will be a key part of the script created next, which will eventually chart the performance of for loops and generators.
01:44 First, timeit is imported, allowing the time taken for the functions to be measured easily. Some constants are declared, allowing the testing intensity and depth to be easily altered in the future.
01:58 The build_list() function that you created earlier is re-created within the script to allow the creation of a list of dictionaries.
02:12 Next, find_match_loop() is created, which will return the first item whose population value is over 50 by iterating over the list using a for loop.
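As a rough sketch of the loop-based search described here: the function name find_match_loop() and the population threshold of 50 come from the narration, while the parameter name, the explicit return None, and the sample data are assumptions.

```python
def find_match_loop(iterable):
    """Return the first dictionary whose "population" is over 50, else None."""
    for item in iterable:
        if item["population"] > 50:
            return item  # stop at the first match
    return None  # no item matched


# Hypothetical sample data for illustration only.
countries = [
    {"country": "Nowhere", "population": 10},
    {"country": "Atlantis", "population": 100},
    {"country": "Elsewhere", "population": 200},
]
first_match = find_match_loop(countries)
print(first_match)
```

Because the function returns as soon as it finds a match, the time it takes depends on how far into the list the target sits.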
02:28 The function using a generator to return the first match is created next, once again returning the first item whose population value is over 50.
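One common way to write the generator-based version is to pass a generator expression to the built-in next() with a default. The sketch below assumes that approach and uses the hypothetical name find_match_gen(); the transcript does not state the function's name or its exact body.

```python
def find_match_gen(iterable):
    """Return the first dictionary whose "population" is over 50, else None."""
    # The generator expression is lazy: next() pulls items only until the
    # first match, and returns the default (None) if nothing matches.
    return next(
        (item for item in iterable if item["population"] > 50),
        None,
    )


# Hypothetical sample data for illustration only.
countries = [
    {"country": "Nowhere", "population": 10},
    {"country": "Atlantis", "population": 100},
]
gen_match = find_match_gen(countries)
print(gen_match)
```

Without the second argument to next(), an iterable with no match would raise StopIteration instead of returning None.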
02:43 Both of these functions are hard-coded to keep things simple for this test. Later in the course, you’ll be creating a reusable function. Now the main loop of the program is created, first creating the three lists that will be used to store the data.
03:08 The current progress is printed on-screen, with end set to a carriage return so that each update overwrites the previous text.
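The progress display can be sketched as follows. The helper progress_message() and the constant POSITION_INCREMENT are hypothetical names introduced for this example, and the values 500 and 10 are assumed. The only detail taken from the narration is that end is set to a carriage return.

```python
LIST_SIZE = 500          # assumed value
POSITION_INCREMENT = 10  # hypothetical constant: step between tested positions


def progress_message(position, total):
    """Format progress as a whole-number percentage."""
    return f"Progress {position / total:.0%}"


for position in range(0, LIST_SIZE, POSITION_INCREMENT):
    # "\r" moves the cursor back to the start of the line without advancing,
    # so the next print() overwrites this message in place.
    print(progress_message(position, LIST_SIZE), end="\r")

print(progress_message(LIST_SIZE, LIST_SIZE))
```

Since the percentage only grows, each message is at least as long as the one it overwrites, so no stray characters are left behind.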
03:45 This creates a list of dictionaries of size LIST_SIZE, and then timeit is used to measure the time taken for the for loop …
04:09 and the generator to perform the search.
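Putting the pieces together, the timing script might look roughly like the sketch below. The names TIMEIT_TIMES, LIST_SIZE, build_list(), and find_match_loop() appear in the narration. Everything else is an assumption: the names find_match_gen, POSITION_INCREMENT, looping_times, generator_times, and positions, the constant values (chosen to match the hundred runs and fifty tests mentioned later), the sample dictionaries, and the choice to pass a lambda to timeit().

```python
from timeit import timeit

TIMEIT_TIMES = 100       # how many times timeit runs each search
LIST_SIZE = 500          # assumed list length
POSITION_INCREMENT = 10  # hypothetical: 500 / 10 gives 50 tested positions


def build_list(size, fill, value, at_position):
    return [value if i == at_position else fill for i in range(size)]


def find_match_loop(iterable):
    for item in iterable:
        if item["population"] > 50:
            return item
    return None


def find_match_gen(iterable):
    return next((item for item in iterable if item["population"] > 50), None)


looping_times = []    # seconds taken by the for loop at each position
generator_times = []  # seconds taken by the generator at each position
positions = []        # index of the target element for each measurement

for position in range(0, LIST_SIZE, POSITION_INCREMENT):
    print(f"Progress {position / LIST_SIZE:.0%}", end="\r")
    positions.append(position)

    list_to_search = build_list(
        LIST_SIZE,
        {"country": "Nowhere", "population": 10},    # filler (hypothetical data)
        {"country": "Atlantis", "population": 100},  # target (hypothetical data)
        position,
    )

    # timeit accepts a callable and returns the total time for `number` calls.
    looping_times.append(
        timeit(lambda: find_match_loop(list_to_search), number=TIMEIT_TIMES)
    )
    generator_times.append(
        timeit(lambda: find_match_gen(list_to_search), number=TIMEIT_TIMES)
    )

print("Progress 100%")
```

The lambda adds a small, equal call overhead to both measurements, so it should not bias the comparison between the two approaches.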
04:29 As you’ve seen, the script makes use of three lists. Two of them will contain the time it took to find the element with either the loop or the generator.
04:39 The third list will contain the corresponding position of the target element in the list. It’s possible to take a look at the data by running the script using Python’s interactive mode.
04:56 Here the script runs quickly, as timeit is set to only run each test a hundred times, and there are only fifty tests run for each function. You can then examine the lists that have been created.
05:13 While you clearly could have created this output by adding print statements to the program, there is an advantage to using interactive mode.
05:20 You can create a dataset, which may take some time if you increase TIMEIT_TIMES or LIST_SIZE, and then perform calculations and experiments on the data without having to recalculate it each time. For instance, you can calculate the ratio between the times taken for generators and loops.
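The ratio calculation could be done with a short list comprehension. In this sketch, time_ratios() is a hypothetical helper, and the numbers passed to it are made-up placeholders that only show the arithmetic. They are not measurements from the script.

```python
def time_ratios(generator_times, looping_times):
    """Return generator time divided by loop time for each paired measurement."""
    return [gen / loop for gen, loop in zip(generator_times, looping_times)]


# Placeholder values, not real timings: a ratio above 1 would mean the
# generator took longer than the loop for that position.
ratios = time_ratios([2.0, 3.0], [1.0, 2.0])
print(ratios)  # [2.0, 1.5]
```

In an interactive session, the same comprehension can be applied directly to the lists the script has already filled, without rerunning the timings.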
05:56 Ideally, you’d want to produce a chart of these findings, and that’s what you’ll be doing in the next section of the course.