The Quicksort Algorithm
00:00 In the previous lesson, I showed you how to traverse data trees using recursion. In this lesson, I’ll show you how to write the Quicksort algorithm. Spoiler alert, it uses recursion.
00:12 Quicksort is a sorting algorithm that uses a divide-and-conquer approach. It works like this: Given some data, you choose a pivot point, then create two lists, one with stuff less than the pivot point and the other with stuff greater than the pivot. Finally, you recursively call Quickstart on the sublists. Correct selection of the value for the pivot will make the algorithm much faster.
00:41 You’re trying to find the value that’s midway through the sorted list. You have to be careful with this, though. If you spend too much time figuring it out, you’ll never get a performant Quicksort, so it’s a bit of a balancing act to your guess.
00:56 And here it is: Quicksort in all its glory. First off, you check if the data is empty or has a single item. If so, it’s sorted already, so just return it.
01:09 Next is finding the pivot. The technique I’m using finds the median of the first, last, and middle item in the list. If you’ve got knowledge about your data, you might be able to make a better choice than this. Also, this will only work for numbers. If you’re sorting other kinds of data, you need a different way of picking the pivot.
01:30 I know the algorithm said create two sublists, but I’m going to create three. Stuff smaller than the pivot, stuff equal to the pivot, and stuff larger than the pivot.
01:41 This style handles the case where there are duplicates in the list a little bit better. With the pivot selected, it’s time to create the sublists by iterating through this set.
01:52 If the value is smaller or larger or the same, add it to the appropriate sublist. And the last line here recursively calls Quicksort with the smaller list and the larger list.
02:05 There’s no need to sort the pivot list, as it will contain one or more copies of the same value, and as it’s the pivot, it goes between the other two sublists.
02:15 Because the lists are lists, the addition operation here will concat them together into a single value to return, the sorted list. Let’s try this out. Importing …
02:34
Sorting empty returns empty ([]
). That’s good.
02:39 Sorting a single value returns the same thing. Also good.
02:48 And there it is, sorted. Let me try one more time, this time with some duplicates,
02:59
and it looks like our Quicksort is working to dig into this a little more. I’ve added some print()
statements to the Quicksort function. Let me run it again so you can see the process.
03:18
Just going to scroll back up, and let’s take a look. The first call to the function takes the full list and the median of 5
, 3
, and 1
is 3
, so that’s the pivot. Now, time to calculate the sublists. Everything smaller than 3
, the list with the pivots, and everything larger than 3
.
03:42
The next step is to begin the recursion, starting with the smaller list. So there it is, Quicksort called again with the parent’s sublist containing the smaller values, 2
and 1
.
03:55 The sublists for the second level are then created, and once again, it’s time to recurse. This time, third level down, the list being sorted is empty. Then the second level’s larger list gets sorted, returning itself.
04:12 Then the first level’s larger list gets sorted,
04:18 and so on until it all gets assembled back into the sorted list.
04:26 The version of Quicksort I showed you here is a bit more memory intensive. It needs copies of data for the sublists, along with the overhead of the lists themselves. There are versions of Quicksort that don’t require copies, but they’re harder to do in Python, and you typically see this in languages that support pointers.
04:46
The name of Quicksort is appropriate. It’s one of the faster sorting algorithms. If Python is your first programming language, you might not have come across Quicksort before, and that’s because Python has .sort()
built in. If you’re curious, the underlying algorithm for .sort()
in Python is not Quicksort, but something called Timsort. It is also pretty quick.
05:10 Its worst-case performance is faster than Quicksort, but there are other cases where Quicksort is faster, so it’s kind of a draw. Timsort does tend to work better if there are a lot of objects to dereference, and as everything in Python is an object, it kind of makes sense that Timsort was chosen as the defacto sorting in Python. That’s it for recursion.
05:35 Last up, I summarize the course and point you at further investigation.
Become a Member to join the conversation.