Locked learning resources

Join us and get access to thousands of tutorials and a community of expert Pythonistas.

Unlock This Lesson

Locked learning resources

This lesson is for members only. Join us and get access to thousands of tutorials and a community of expert Pythonistas.

Unlock This Lesson

Insertion Sort

00:00 Next up is insertion sort, which is a similarly simple algorithm to bubble sort, and it has similar issues with time complexity. That is, it can be a very expensive algorithm in terms of the time it takes to run when the input lists are large. However, it’s not quite as slow as bubble sort in many cases and it has some advantages.

00:23 Especially on really short lists, it tends to be very, very fast.

00:28 Insertion sort is very similar in its basic idea to bubble sort. You might even call it kind of a reverse version of bubble sort, but that’s not quite accurate.

00:37 So, what it does is it starts at the second element of the list and it just moves each item that it’s iterating over into its correct position in the growing sorted subsegment of this list.

00:52 It inserts each element into its correct place in a sorted sublist that eventually grows to become the full list. You can see in this diagram, you start with 2 and you move it before 8 because it’s smaller than 8, and so now you know that 2 and 8 are sorted, so you continue to the next item.

01:08 You move down until you get to the correct place for 6, you put it there, now you have a list of three sorted elements. You keep doing that until you’ve moved all of the items into their correct spots.

01:20 So, you may already have noticed that there’s a slight difference here with bubble sort, is that insertion sort doesn’t do any swapping. It doesn’t flip these two elements as it goes.

01:31 It just moves these elements up as it inserts this one in here. So it does slightly less work on each run through the list because it’s not flipping the two elements each time—it’s simply moving one element up, right?

01:47 So it keeps track of the element that needs to be inserted and it inserts it in at the right place. So you actually lose those extra, kind of, swapping assignment operations.

01:57 Now this doesn’t tend to be a huge deal in practice because modern computers are very, very fast at memory operations but it’s still a slight improvement in performance, especially in very small lists. So that’s something to keep in mind.

02:13 So again, I’m going to have my little timing function here,

02:18 and I’m going to define a timed function, which will just print out its runtime, essentially. It’s going to be called insertion_sort(). It’s going to take in a list of items, just like it did for bubble_sort().

02:31 And now, insertion sort is a little bit more complex to implement than bubble sort. That’s one kind of thing that you might want to keep in mind is you have to be a little more careful about your indices.

02:40 So, you know that you have to iterate through the whole list, but starting at the second item. So I can say for i in range 1 to the length of items.

02:50 Now you need to define your… you could call it a pointer or your, kind of, counter variable, which is the item right before i, right? So if I start at the second element in the list, this will be the very first element of the list.

03:04 And now you’re going to iterate until either j is less than 0, so until you get to the very beginning of the list, or—and I’m saying, “or” as I write and, but these are the two stop conditions.

03:18 If either one of these conditions fails, then we’ll stop iterating. So, current_item is less than the items at j.

03:29 This is essentially saying “Go through and break when the current element is in the right place.”

03:41 And I forgot to actually define the current element, but of course the current element—or, the current_item, I should say—is equal to items at i.

03:51 So, I have a current_item that is the one that I’m trying to insert into its correct place. And I know that it’s in its correct place if either you get to the end of the list or you have the current_item is actually greater than the items at your, kind of, pointer.

04:07 So, while I’m doing this, I want to be moving up the item at j, right? Because I have to move all of these other items up one in the list to account for the insertion of the actual element itself.

04:21 So items[j + 1] = items[j], just moving this item up. And then, of course, I’ll subtract from j. So just kind of a counter variable that’s pointing to the next item to look at in the list. Now of course, when this loop is done, I know that the index at j + 1 is equal to the current_item, or that’s where the current_item needs to be inserted, right?

04:50 Because j is subtracting until it’s either less than 0 or the current_item is the opposite of this condition, right? So now you know that items[j + 1] has the correct item.

05:03 And this is actually all there is to this algorithm! And so it turns out to be quite simple. You’re just going to return items, right? So, get the current item and initialize a counter right before it, and then iterate until you either get to the beginning of the list or until you get to the correct point for the current item. In the meantime, move all of the other elements up so that you can insert this new item into its correct place, and then insert the item. Right? So I’ll just say here, let’s say # Insert current_item into its correct spot. And then return items!

05:43 So, this works quite well, and let’s do a similar trial to the one that I did last time with bubble_sort(), and let’s see how long it takes for this thing to run.

05:51 So if you remember correctly, it was items = [], I think it was random.randint from 1 to 1000 for blank (_) in range I think I started with 1000. Let me just import random,

06:05 and then I will just say here, let’s just say print(insertion_sort(items)).

06:15 So when I say python insertion_sort.py, it’s going to sort that whole list and it just prints out the time it took, which was about five hundredths of a second, or six hundredths of a second if you want to round.

06:27 So that’s pretty fast, and it’s definitely faster than bubble_sort(), which was 11 hundredths of a second for this smaller list. Now let’s bump it up to 10000 and see how long this one takes.

06:37 It’ll also be quite a bit longer than it was for a list of 1000 because insertion sort is also an order of n squared operation. But as you can see, it’s much faster than bubble sort by a whole six seconds or so, even though they’re the same technical, like, time complexity of O(n**2), this one tends to be much faster because of the lack of flipping, right?

07:01 It just is doing one assignment at each iteration instead of two, and that really does tend to be a pretty big deal especially when you’re working with really large lists. But of course, this is still an order of n squared operation because you’re going through the whole list, and then each time you go through the whole list, you’re going through a subsection of that list, which can be as large as the whole list.

07:24 So, it’s a general rule, again, that if you iterate through the list and then each time you iterate through some subset of that list, then it’s still going to be an order of n squared operation.

07:36 So, still pretty slow and we’ll see faster algorithms in the next lesson, but much faster than bubble sort, so there’s some improvement there. The only notable thing with insertion sort is that if you’re implementing this on your own, you have to be really careful with the indices and just make sure that you are correctly following those indices.

07:53 Something to worry about is off-by-one errors, where you forget a + 1, or you forget a - 1 where you need it, and so that’s something that makes this kind of a little tricky to implement at times. Okay. In the next lesson, I’m going to talk about merge sort, which is a little bit of a more difficult to understand sorting algorithm but it’s much faster, and it’s actually quite simple to implement still.

Avatar image for Fabio Alvarez

Fabio Alvarez on Oct. 29, 2022

I implemented this, using the insertion logic

def insertion_sort(lst):
    for i in range(len(lst)-1):
        if lst[i+1] < lst[i]:
            lst[i], lst[i+1] = lst[i+1], lst[i]
    return lst

insertion_sort(lst)
Avatar image for letsgamble92

letsgamble92 on Dec. 8, 2022

In the insertion sort, while we start iterating from 1, why do we ever need j >= 0 condition in line 9?

Avatar image for Neil

Neil on Dec. 19, 2022

Swapping is slower and needs more operations, so the recommended approach is as shown by Liam

Become a Member to join the conversation.