# Merge Sort

Here are resources for indexing and slicing in Python:

**00:00**
The next algorithm I want to consider is merge sort. Now merge sort is a really clever algorithm that relies on an interesting property of sorted lists.

**00:11**
You can merge two sorted lists into one bigger sorted list just by including the elements from both, and you can actually do that in order of *n* time—that is, in as many operations as there are elements in the sum of those two sorted lists. That may not seem that important but it actually is the key to this algorithm, which is a recursive algorithm.

**00:34**
So let’s take a look at how it works.

**00:36**
The basic idea of merge sort is that you break the list into two halves. You sort those halves recursively also using merge sort, and then you merge the halves together again.

**00:47**
It relies on the fact that merging two sorted lists is actually an order of *n* operation. So if you take a look here, what you can do is divide this list of five elements into two smaller lists, which are both about half the size of the big list.

**01:01**
And then you do the same on all of these sublists that you generate there. And then you actually merge those two lists back together in a sorted manner.

**01:10**
So `[8]`

and `[2]`

, which are two one-element lists, become the two-element list `[2, 8]`

. And then that gets merged with the result of this recursive computation on `[6, 4, 5]`

, which is `[4, 5, 6]`

, and those two get merged into one big list. Now if you’re not super familiar with algorithmic theory, this may not be obvious to you that this is faster than something like insertion sort or bubble sort. But there’s a basic rule of algorithmic study which is that if you can perform recursive operations on half-sized lists, and then if you can merge them together quickly, you often get a big performance boost as opposed to having to do some kind of iterative method.

**01:52**
So, let’s take a look at how this algorithm actually gets implemented in practice, and it may become a little bit more clear to you.

**02:01**
The great thing about merge sort is that if you have a function called `merge_sorted_lists()`

that takes in two already sorted lists—let’s call them `left`

and `right`

—and merges them into one larger sorted list that contains all of the elements of both of the sublists, then the actual logic of `merge_sort()`

itself is pretty much exactly as I described it in the slide a second ago.

**02:25**
You just say if the length of the `items`

is less than or equal to `1`

, so a little guard clause, you just return `items`

, because obviously you can’t sort a list with one item in it. It’s already sorted, you can’t get any better than that.

**02:40**
So the next thing to do is, as I described earlier, you have to divide the list into two sublists roughly of equal size, and the best way to do that is to divide them right at the middle.

**02:52**
So I can choose the middle index, which is the `len(items) // 2`

, and I use integer division to make sure that the `midpoint`

is actually a valid index, and it’s not a huge deal because it doesn’t really matter if one sublist has one more element than the other. That’s not a problem for this algorithm.

**03:09**
So once you have that, you can divide the list pretty straightforwardly into left and right halves by saying `items`

at the beginning of the list up to the `midpoint`

, and then `items`

**03:24**
from the `midpoint`

up to the end of the list. And this is Python’s awesome slice operator, and if you’re not super familiar with it, I would encourage you to go check out some of the Real Python resources on the slice operator. It’s amazingly convenient.

**03:38**
It’s one of my favorite Python features, this ability to slice lists in a variety of different ways as you might need them. Now there’s really only one step left here, which is to say `return merge_sorted_lists()`

of the recursively sorted left-hand list and the recursively sorted right-hand list.

**03:59**
So you merge sort the two halves and then you merge them together. That’s really all there is to merge sort, which is crazy because it turns out to only be what—what is this? 5 lines of code? That’s crazy.

**04:11**
But of course you actually need this `merge_sorted_lists()`

function, so let’s take a look and implement that. Before I get into it, though, I’d encourage you to think about how you can merge sorted lists as quickly as possible. The ideal, of course, is to do it in order of *n* time, because that will give this algorithm the order of *nlog(n)* runtime, which is the, kind of, coveted sorting runtime. In fact, there’s a proof out there, a mathematical proof, which shows that any comparison-based sorting algorithm, the fastest possible time you can take is order of *nlog(n)*. And so if we can achieve that here, that’s a pretty darn great accomplishment, especially with five lines of code for the real meat of the algorithm.

**04:54**
So, let’s merge some sorted lists. The key to understanding this algorithm—which is, I’d say, a little bit more opaque than the actual merge sort—is that the idea is if you have two sorted lists, you know that the smallest elements of both of those lists are at the beginning of the lists, right? Because they’re sorted.

**05:14**
And if you want to find the smallest element in either of those two lists, you just have to take the smallest of the two elements at the beginning of each list, right?

**05:23**
Because even if one list has all smaller elements than the other, you know that its smallest element has to be at the beginning because we know it’s already sorted. So this is a really convenient property, and it leads to an algorithm which looks something like this.

**05:39**
You get a left index and a right index, and those will both start at `0`

, so start at the beginning of both lists, and the idea is to slowly build up a return list, which is going to start as empty, and then you can just say, while the length of the `return_list`

is less than the length of `left`

plus the length of `right`

. And you don’t actually really need these parentheses here, but I just put them there for my own kind of clarity.

**06:09**
What you can do is you can say, let’s get the elements at each of these indices.

**06:18**
And there’s a little bit of a tricky step here, which is to say, I can’t just get the left item and say `left`

at `left_index`

. I actually have to check if this is a valid index for it to be at, so I say `left[left_index] if left_index < len(left)`

.

**06:39**
And what do I put otherwise? What do I put in the `else`

clause? Well, this is another little tricky thing, which is I’m going to say `float()`

of `'inf'`

(infinity), which is a common Python idiom which essentially just gives an infinite or max numeric value.

**06:54**
And the reason for why I do that will become clear in one second. So let me just type out, I’ll do the exact same thing with the `right_index`

and the `right`

list.

**07:07**
And then the key to this is I’m just going to choose between these two items and slowly move through these two lists—or quickly, really, as it were—and I’ll just keep adding one element at a time.

**07:19**
So, `if left_item < right_item:`

then I know that `left_item`

is the smallest remaining item that hasn’t yet been added to the `return_list`

.

**07:29**
So I can say `return_list.append(left_item)`

.

**07:35**
And then, of course, since I appended that `left_item`

, I need to add `1`

to my `left_index`

because I no longer need to consider that item. And in all other cases, I can say here `return_list.append(right_item)`

and increment the `right_index`

.

**07:53**
And, of course, if the two are equal, then I’ll just add the `right_item`

but it won’t really matter because I could have added either of those two in that case. Then all that’s left to do is just return the `return_list`

.

**08:05**
So, of course you might expect that to be the final step because of its name.

**08:10**
And so really all this is saying is “Choose the smallest remaining item and add it to the `return_list`

.” So that will guarantee that you’re always adding things to this `return_list`

in sorted order, and that’s the key to this algorithm. So, there it is.

**08:28**
And if this is not looking 100% clear to you, I encourage you to check out the downloadable code underneath this lesson, because it will have a little bit more comments and a little bit more, kind of, just annotation for you to hopefully help you understand it.

**08:42**
And you can also run that code, of course, and add whatever print statements you want to better kind of follow the progress of this algorithm as it goes. So, I already imported `random`

and `time`

, so let me just add a little bit of quick kind of benchmarking code here.

**08:57**
So, my little decorator doesn’t work so well anymore because this is a recursive algorithm and it’s kind of hard, I would just have to write some more code and it would be a little more opaque if I used a decorator for this.

**09:08**
So I’m just going to do this in kind of a brute force way, which is just start the timer, run the algorithm, and then print out how much time has passed.

**09:16**
So let’s see this work with just a 10-element list in the terminal.

**09:21**
When I run this from the command line in Python,

**09:25**
as you can see, this outputs a nicely sorted list of these random integers, and then it says—wow, this took an unbelievably small amount of time, almost infinitesimal.

**09:36**
So let’s bump it up to `100`

, see how long that takes to run and make sure it’s still sorted. And as you can see, it looks like everything is in sorted order.

**09:46**
Of course, you can take some time to check that if you want, but I can guarantee it’s in sorted order. And it still took a really small period of time. Let’s remove this print statement, I don’t think I want to see a thousand-element list printed into my terminal. And I’ll change that to `1000`

, and we can kind of start to see it move up.

**10:04**
And of course, you know, it would be nice if I had made this into a nice script that just would increment slowly and go up all of this but, you know, it’s nice to kind of see it done in just a low-tech kind of way as well. As you can see, it’s starting to take a little bit longer. It took a little over a second there. And if I add one more zero it will take an even longer period of time, but it’s still so much faster than insertion sort and bubble sort for the same number of elements in the list. So, that’s merge sort for you.

**10:35**
And as I said, feel free to browse the code for a little bit more context and the ability to look at this on your own. A great thing about merge sort—it’s super fast compared to the two sorts that I showed you earlier, and it’s really quite simple once you understand the power of recursion.

**10:52**
The actual algorithm itself is just not super, super complicated. And it has another great property, which is that merge sorts are stable, meaning that you can sort on multiple keys and maintain the order of previous sorts.

**11:06**
That’s a very desirable property when you’re sorting large data structures that often need to be in very complex orders based on multiple pieces of the data that they contain. Next up, I’m going to talk about quicksort, which in some ways is a really similar algorithm to merge sort but ends up having a lot of distinct behavior.

Become a Member to join the conversation.