Join us and get access to hundreds of tutorials and a community of expert Pythonistas.

Unlock This Lesson

This lesson is for members only. Join us and get access to hundreds of tutorials and a community of expert Pythonistas.

Unlock This Lesson

Hint: You can adjust the default video playback speed in your account settings.
Hint: You can set the default subtitles language in your account settings.
Sorry! Looks like there’s an issue with video playback 🙁 This might be due to a temporary outage or because of a configuration issue with your browser. Please see our video player troubleshooting guide to resolve the issue.

heapq and PriorityQueue

00:00 In the previous lesson, I showed you two ways of using the built-in list type as a priority queue. In this lesson, I’ll introduce you to heapq and the queue library’s PriorityQueue object.

00:13 Heap queues are based on heaps. A heap is a special form of a tree data structure. There are several different kinds of heaps. The max-heap specifies that for any node in the tree, all children of that node have a value greater than or equal to their parent node. The min-heap is similar, but the value is less than or equal to the parent nodes. Binary heaps are heaps where no node has more than two children. Here’s an example.

00:42 The relationships between the nodes means that it is O(log n) to find the right spot to insert and also O(log n) to rearrange the tree when the smallest element is removed.

00:56 It turns out that a binary heap tree can be flattened and stored in a linear fashion, such as an array. You can think of this as a special way of sorting the array, but the sort order is the position in the heap. Inserting into a binary heap as an array is done by putting the element at the end of the array, then swapping parent and child elements until the newly inserted item is in the correct position in the array. In Python, the built-in list type is implemented as an array. Inserting at the end of the list is therefore cheap. Swapping items in a list is also cheap. You are directly accessing two elements in memory.

01:35 Deleting an item from the binary heap is a bit more complicated, but similar to the insertion process. The nature of the relationship between nodes in the tree make insert and delete O(log n) operations.

01:49 The heapq library provides methods for keeping a list in binary heap format. I’ll start off with a list as my queue, import the heapq library, then use the heappush() method from the library to insert a new item in the heap. Nothing new here. Doing it again,

02:16 and you can see the result. One more time.

02:24 Notice that the order isn’t just a standard sort, but specific to the representation of a binary heap.

02:33 Popping off an item using heappop() gets the lowest numbered item out of the heap. A relatively performant priority queue for your priority queue needs.

02:45 Using heapq to maintain your list is all well and good as long as you remember to do so. If you forget to use the heapq methods at some point, you will damage the tree, further messing up its use in the future.

02:57 Complications arise from sort stability as well. First off, what is your desired behavior when two items have the same value? If you’re using tuples, you will get the tuple sort order, similar to the insort() method in the previous lesson. If, on the other hand, you want the priority of items with the same priority value to be based on their insertion order—a common practice in scheduling—then you’re going to have to augment your tuples with more information.

03:25 Removing items from the heap can also be problematic, possibly damaging the sort order of the tree. The easiest way to do this is just to remove the item, then call heapify() to return heap sorting order on the list.

03:39 In the case of moving an item—say, if its priority changes—do the delete first, then re-insert afterwards. Previous lessons talked about thread safety in concurrent programming. Python provides a thread-safe version of a heapq in the queue library called PriorityQueue. It has the same performance and restrictions of heapq, but also uses locks to ensure its methods are atomic. Similar to the other objects in the queue library, PriorityQueue uses .put(), .get(), and .get_nowait(). To get started, do an import,

04:19 then create the object. Use .put() to push items into the queue. Like the other members of the queue library, not a particularly useful repr.

04:32 Let me put a few more things inside.

04:38 Now my .get(), and another .get(). And don’t forget those were blocking calls. For the non-blocking version, use .get_nowait() with its expected Empty exception.

04:54 Pretty much the same as the other members of the queue library. Isn’t consistency grand? Next up, I’ll talk about the pros and cons of the different priority queue mechanisms.

Become a Member to join the conversation.