Comparing Performance of Lists vs Linked Lists
Here are resources for more information about comparing linked lists and arrays:
00:00 In the default implementation of Python, called CPython, lists are represented as an array of objects in memory. Because arrays are stored in sequential, contiguous blocks of memory, they support random access.
00:16 This means that we can access any element by its index in O(1), or constant time. C arrays have some fundamental differences from Python lists.
00:29 The important difference for this course is that arrays cannot grow or shrink like a list can. You cannot simply add a new object to the end of an array that is already full.
00:42 Instead, you have to recreate the entire array, allocating more or less space as needed. This happens behind the scenes so that your Python list can grow or shrink without you—the Python developer—having to worry about memory management.
01:00 You never have to think about allocating new space for the array that represents your list. CPython automatically resizes the underlying arrays that make up Python lists when it determines it needs to. When that happens, it typically allocates more space than needed so that the next few append operations on the list don’t require further resizing of the array. In fact, this resizing happens so infrequently that we say the time complexity of an append operation on a list is O(1).
01:37 So, that explains why accessing elements in a list and appending new ones is a constant time operation. Where things fall apart is when trying to insert new items into the list anywhere else.
01:51 If we just insert the new item at the end—aka appending—it’s an O(1) operation. However, if we need to insert an item somewhere in the middle or at the beginning, it becomes an O(N) operation where N is the number of elements in the list.
02:10 In other words, the amount of time the operation takes depends on how many elements are in the list. This is because every element after the desired location for the new item must be pushed down by one address to make space for the new one.
02:28 Then, the new element can be added. As you can imagine, the slowest scenario is inserting a new item at the beginning of a list, which requires that all the existing elements in the underlying array be shifted down by one address.
02:46 The bottom line is this: lists are great when we need to quickly obtain items at a specific index or when we need to append new items to the end, but they start to slow down when we try to insert new items somewhere in the middle or—in the worst case—at the beginning. Let’s switch gears to the linked list.
03:09 First of all, linked lists are not represented by C arrays under the hood. Nodes are simply stored in sections of random memory, with that section of memory containing a pointer to the data stored in that node, as well as a pointer to the next node in the linked list.
03:28 Really, nodes are just objects in memory with an instance attribute for the data and an instance attribute that acts as a pointer to the next node. These nodes are often created from a class or a dataclass. Because no arrays are
03:45 involved, lists don’t suffer from the same limitations as arrays, such as needing to be resized. They’re really just a series of random objects in memory that point to one another. Where linked lists shine is inserting elements at the beginning or the
04:03
end. Imagine I have a linked list with a few nodes in it. As usual, the head pointer points to the first node and the last node points to None
.
04:16 If I want to insert a new node at the beginning, all I have to do is create the new node,
04:22
redirect the .next
attribute to the previous head, and change the head to point to the new node. I didn’t have to resize anything or even touch most of the other nodes.
04:35
If I want to add a node to the end, however, I’m first going to need to traverse through the entire linked list until I reach the end node. Then, I can simply create a new node, point the previous end node to it, and make this new node point to None
.
04:54 As you can imagine, the time it takes to traverse the linked list grows with the number of elements, so accessing elements is an O(N) operation.
05:05 The reason why I said it’s quick to insert nodes at the end is because it’s common to store a separate pointer to the end node of a linked list so you don’t have to keep traversing the whole thing every time.
05:19 The process of creating a new node and redirecting pointers is always an O(1) operation. Additionally, linked lists never involve any array resizing no matter how large they grow, so we never suffer the performance penalty associated with that.
05:38 Later on, you’ll learn about a more advanced type of linked list that can traverse forward or backward starting from either end. Let’s recap what you learned in this video. Python lists are great for when you need to access an element at a specific index. They are not as great, however, when you are constantly inserting new elements into the list somewhere in the middle or towards the beginning. On the other hand, linked lists are great for when you need to frequently access or modify the first or last elements in the collection.
06:15 They’re not as fast when you need to retrieve an item from somewhere in the middle, since they require traversing through all of the previous nodes.
06:25 If you’d like to learn more about the time complexities for various operations on both collection types, see the link in the video notes down below. In the next two videos, you’ll learn about some common data structures that play to the strengths of the linked list.
Become a Member to join the conversation.