Set Speed Test
00:00
To see how long functions take to run, you can use the %timeit magic command that is built into IPython, which is an interactive Python terminal. So use %timeit and then just put some expression like this. It will run it a bunch of times, find the mean, and then also find the standard deviation of how long it took.
00:23 So for example, it took 7 nanoseconds plus or minus 0.019 nanoseconds per loop. It had 7 runs of, I don’t know, 10 billion loops each, or something like that—10 million loops.
00:35 So this is pretty accurate. It ran it a lot, and this is very, very quick. Let’s see how long it takes to do some operations on sets, lists, and tuples.
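Outside IPython, the standard library's timeit module gives a similar measurement. The sketch below is not from the video: the statement being timed and the run and loop counts are arbitrary choices for illustration.

```python
import statistics
import timeit

# Time a small expression: 5 runs of 10,000 loops each.
# (The statement and both counts are illustrative assumptions.)
LOOPS = 10_000
runs = timeit.repeat("sum(range(100))", repeat=5, number=LOOPS)

# Each entry in `runs` is the total time for one run, in seconds.
per_loop = [total / LOOPS for total in runs]

mean = statistics.mean(per_loop)
stdev = statistics.stdev(per_loop)
print(f"{mean * 1e9:.1f} ns ± {stdev * 1e9:.1f} ns per loop")
```

The printed numbers will differ from machine to machine, so no expected output is shown.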
00:46 First, we will just initialize our set.
00:52
We’ll loop through 1000, we’ll add them all to our set. Remember, .add() mutates our set, and then just return s.
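The function described here probably looks something like the following. This is a reconstruction from the narration, not the exact code shown on screen; the name initialize_set is taken from later in the video.

```python
def initialize_set():
    """Build a set containing the integers 0 through 999."""
    s = set()
    for i in range(1000):
        s.add(i)  # .add() mutates the existing set in place
    return s
```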
01:01
I’ll load this interactively so we have access to this function. We’ll do %timeit like this, wait a little bit.
01:14 And it took 72 microseconds plus or minus 390 nanoseconds, and it ran 7 runs with 10,000 loops each. So here, let’s initialize our list.
01:32
We use .append(), because lists use .append(). Exit this, reload this, clear, use our %timeit.
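The list version is presumably the same loop with .append() in place of .add(). As before, this is a sketch reconstructed from the narration; the name initialize_lst comes from later in the video.

```python
def initialize_lst():
    """Build a list containing the integers 0 through 999."""
    lst = []
    for i in range(1000):
        lst.append(i)  # .append() mutates the existing list in place
    return lst
```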
01:49 And it took 71 microseconds plus or minus 480. So, let’s do Up arrow... oh, it actually remembered. So, it takes about the same amount of time to initialize a set with 1,000 numbers as it does a list.
02:05
Let’s just quickly do it for a tuple, tup = tuple(). So remember, we use += because tuples are immutable and we have to use this little interesting parentheses-comma thing to add tuples.
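A sketch of the tuple version, again reconstructed from the narration rather than copied from the screen. The "parentheses-comma thing" is the one-element tuple (i,); without the trailing comma, (i) would just be the integer i.

```python
def initialize_tuple():
    """Build a tuple containing the integers 0 through 999."""
    tup = tuple()
    for i in range(1000):
        # Tuples cannot be mutated, so this creates a new tuple
        # from the old one plus (i,) and rebinds the name `tup`.
        tup += (i,)
    return tup
```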
02:25 Exit, clear, load. This is defined twice. So, basically, I’m using VS Code, which has an extension called Python, which will show you if you have any syntax errors and also auto-format when you save. It’s just useful to make your coding a little bit faster.
02:44 Why is this…? There we go. Okay.
02:54 So, we see here that initializing tuples takes longer than initializing sets and lists. Well, it’s mostly because with sets and lists we are mutating an existing object, while with tuples we have to create a new tuple by adding tuples together and then reassign it to the variable.
03:15 That’s a lot harder than just adding to an object that already exists in memory. But that wasn’t the point of this video, just something cool you learn when you start running functions and looking at how long they take. What we really care about is membership.
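The difference between mutating an object and building a new one can be seen by keeping a second reference to the original and comparing identities. This small example is an illustration added here, not something shown in the video.

```python
lst = [0]
same_lst = lst          # second reference to the same list object
lst.append(1)           # mutates that object in place
print(lst is same_lst)  # True: still one list, now [0, 1]

tup = (0,)
old_tup = tup           # keep a reference to the original tuple
tup += (1,)             # builds a new tuple and rebinds the name `tup`
print(tup is old_tup)   # False: `tup` now names a different object
print(old_tup)          # (0,): the original tuple is unchanged
```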
03:34
Let’s loop through our numbers again. Well, first let’s define our set. s = initialize_set(), lst = initialize_lst(), tup = initialize_tuple().
03:50
This will happen outside the function, so this will happen when I load the file so it won’t actually be counted when I start timing these functions. Let’s just do i in s. This is sort of an interesting expression here.
04:04
This is literally just like writing True or False or, like, x * 10 or something. It doesn’t actually do anything, because we’re not assigning it to anything, but it does execute the code.
04:15
So this will actually execute i in s for each number between 0 and 1000, not including 1000. Let’s try membership_lst(), change this to lst, and membership_tuple().
04:36
Eh, we’ll just call it tuple for… what did I call it here? Okay, yeah. This.
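Put together, the three membership functions might look like the sketch below. The function names come from the narration, but the bodies are a reconstruction. To keep the example self-contained, the containers are built here with range() rather than with the video's initialize_* functions.

```python
# Built once at load time, so construction is not part of the timing.
# (Assumption: range() stands in for the video's initialize_* functions.)
s = set(range(1000))
lst = list(range(1000))
tup = tuple(range(1000))


def membership_set():
    for i in range(1000):
        i in s  # result is discarded; only the lookup itself is being timed


def membership_lst():
    for i in range(1000):
        i in lst


def membership_tuple():
    for i in range(1000):
        i in tup
```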
04:46
Perfect! Let’s exit this, and now we will %timeit membership_set(). It took 57 microseconds plus or minus 1 microsecond per loop.
05:07 The list version took 5.7 milliseconds plus or minus 25 microseconds, and it only did 100 loops. I’m assuming that %timeit picks the number of loops automatically, because if it were to do 10,000 loops for everything, this would take way too long.
05:20 But it still gives us a pretty realistic mean and standard deviation, so we can see that checking membership in sets is about a hundred times faster, I believe, which is pretty significant.
05:36 And here we’ll try the tuple. So, checking membership in lists and tuples is pretty similar. Checking membership in sets is almost instant, or at least instant relative to lists and tuples, and is much faster.
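The same comparison can be run as a plain script with the standard library's timeit module. This is a sketch under assumptions: the helper check_all and the run and loop counts are made up for this example, and the timings will vary by machine.

```python
import timeit

N = 1000
containers = {
    "set": set(range(N)),
    "list": list(range(N)),
    "tuple": tuple(range(N)),
}


def check_all(container):
    """Test membership of every integer from 0 to N - 1 (hypothetical helper)."""
    for i in range(N):
        i in container


LOOPS = 20
results = {}
for name, container in containers.items():
    # Best of 3 runs of LOOPS loops each, converted to seconds per loop.
    best = min(timeit.repeat(lambda: check_all(container), repeat=3, number=LOOPS))
    results[name] = best / LOOPS
    print(f"{name:>5}: {results[name] * 1e6:.1f} µs per loop")
```

Sets look up items by hash, so each check takes roughly constant time on average, while lists and tuples are scanned element by element. That is why the gap grows with the size of the container.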
James Uejio RP Team on April 4, 2020
Hi @Minh thank you for the response. See ipython.org/install.html for how to install IPython where you can use %timeit.
Zarata on April 6, 2020
The tuple tutorial made a point that tuples are used because they are faster for many operations compared to lists. You’ve proven that the membership operation is “the same” tuple vs. list. Is this detail worth an explanation / rectification?
keyurratanghayra on April 7, 2020
Hey There,
May I know which IDE, you are using for this lesson? It looks swanky.
Ricky White RP Team on April 8, 2020
He’s using VS Code :)
James Uejio RP Team on April 11, 2020
@keyurratanghayra Ricky is correct I am using VS Code! I have a “Dainty – Monokai” theme see dainty.site/vs. Also using iTerm for my terminal.
@Zarata I’m a little confused regarding the question. At around 5:25 you can see membership in sets is 57 microseconds while membership in lists is 5 milliseconds. So membership in sets is 100x faster. Let me know if you are referring to another part of the video!
Zarata on April 13, 2020
In the Introduction to Python learning path as a whole, there is another module Lists and Tuples. In that module (not this Sets module) the author made the point that tuples are used because they are faster than lists. Your video clocks comparable list and tuple operations as ~5.7 ms both. Your video validates the speed superiority of set implementation, but invalidates that tuple/list statement by the other author. I’m curious if there’s a rectification. Your timing example re-ignites the question I remember from the tuples/list module “why use tuples at all?”
BTW, I recall someone along the way asked “what are sets good for?” I have found set concepts VERY useful in some graph theory numerical analysis I performed using Java, but I didn’t find in Java at the time all of the method and operator support that Python provides. To that person: sets do have real world use and the set concept can make some numerical analysis problems “easier”.
RMS on April 13, 2020
Awesome, didn’t know about the %timeit functionality of ipython.
Is that why you’re using ipython instead of ptpython? Or are there any other advantages?
James Uejio RP Team on April 15, 2020
@zarata Great observation and after some research, the article is correct to an extent. Tuples are only faster than lists when initializing them, not necessarily adding them together (like in the video). For example:
>>> %timeit (1, 2, 3)
6.9 ns ± 0.0379 ns per loop (mean ± std. dev. of 7 runs, 100000000 loops each)
>>> %timeit [1, 2, 3]
46.4 ns ± 0.341 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In terms of what I showed in the video, you can see lists are slightly faster because for tuples you need to initialize and add together vs just appending to a list. I probably should have named the functions initialize_and_add or something. In the real world however this is pretty negligible and it’s not necessarily worth worrying about optimizing tuples vs. lists in terms of speed. You can learn more here: stackoverflow.com/questions/3340539/why-is-tuple-faster-than-list-in-python.
However there are many benefits to tuples because they are hashable so you can use them as keys for dictionaries or you can put them in sets. They are also immutable which can lead to cleaner code and thread safe programming. See hackernoon.com/5-benefits-of-immutable-objects-worth-considering-for-your-next-project-f98e7e85b6ac for more!
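As a small illustration of the hashability point (the dictionary and set contents below are made up for this example):

```python
# Tuples are hashable, so they can be dictionary keys...
point_names = {(0, 0): "origin", (1, 0): "unit x"}
print(point_names[(0, 0)])  # origin

# ...and members of sets.
seen = {(1, 2), (3, 4)}
print((1, 2) in seen)  # True

# Lists are mutable and therefore unhashable, so this raises TypeError.
try:
    bad = {[0, 0]: "origin"}
except TypeError as exc:
    print(f"TypeError: {exc}")
```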
@RMS I’m not super familiar with ptpython but ipython has been pretty easy for me to use. I don’t think the REPL matters too much (sometimes when I’m lazy I’ll just pull up the built-in one to double check some syntax), but ipython is easy to install and use so I chose that one. This video course realpython.com/courses/finding-perfect-python-code-editor/ has some more information on choosing a python code editor, which is probably more useful than finding the perfect REPL.
Minh Pham on March 25, 2020
Hi James,
Please tell us how to set up the Python Console and test as you do in this Speed Test
Thanks Minh