heapq and PriorityQueue
00:00
In the previous lesson, I showed you two ways of using the built-in list
type as a priority queue. In this lesson, I’ll introduce you to heapq
and the queue
library’s PriorityQueue
object.
00:13 Heap queues are based on heaps. A heap is a special form of a tree data structure. There are several different kinds of heaps. The max-heap specifies that for any node in the tree, all children of that node have a value less than or equal to the parent nodes. The min-heap is similar, but the value is greater than or equal to their parent node. Binary heaps are heaps where no node has more than two children. Here’s an example.
00:42 The relationships between the nodes means that it is O(log n) to find the right spot to insert and also O(log n) to rearrange the tree when the smallest element is removed.
00:56
It turns out that a binary heap tree can be flattened and stored in a linear fashion, such as an array. You can think of this as a special way of sorting the array, but the sort order is the position in the heap. Inserting into a binary heap as an array is done by putting the element at the end of the array, then swapping parent and child elements until the newly inserted item is in the correct position in the array. In Python, the built-in list
type is implemented as an array. Inserting at the end of the list
is therefore cheap. Swapping items in a list
is also cheap. You are directly accessing two elements in memory.
01:35 Deleting an item from the binary heap is a bit more complicated, but similar to the insertion process. The nature of the relationship between nodes in the tree make insert and delete O(log n) operations.
01:49
The heapq
library provides methods for keeping a list in binary heap format. I’ll start off with a list
as my queue, import the heapq
library, then use the heappush()
method from the library to insert a new item in the heap. Nothing new here. Doing it again,
02:16 and you can see the result. One more time.
02:24 Notice that the order isn’t just a standard sort, but specific to the representation of a binary heap.
02:33
Popping off an item using heappop()
gets the lowest numbered item out of the heap. A relatively performant priority queue for your priority queue needs.
02:45
Using heapq
to maintain your list is all well and good as long as you remember to do so. If you forget to use the heapq
methods at some point, you will damage the tree, further messing up its use in the future.
02:57
Complications arise from sort stability as well. First off, what is your desired behavior when two items have the same value? If you’re using tuples, you will get the tuple sort order, similar to the insort()
method in the previous lesson. If, on the other hand, you want the priority of items with the same priority value to be based on their insertion order—a common practice in scheduling—then you’re going to have to augment your tuples with more information.
03:25
Removing items from the heap can also be problematic, possibly damaging the sort order of the tree. The easiest way to do this is just to remove the item, then call heapify()
to return heap sorting order on the list.
03:39
In the case of moving an item—say, if its priority changes—do the delete first, then re-insert afterwards. Previous lessons talked about thread safety in concurrent programming. Python provides a thread-safe version of a heapq
in the queue
library called PriorityQueue
. It has the same performance and restrictions of heapq
, but also uses locks to ensure its methods are atomic. Similar to the other objects in the queue
library, PriorityQueue
uses .put()
, .get()
, and .get_nowait()
. To get started, do an import,
04:19
then create the object. Use .put()
to push items into the queue. Like the other members of the queue
library, not a particularly useful repr.
04:32 Let me put a few more things inside.
04:38
Now my .get()
, and another .get()
. And don’t forget those were blocking calls. For the non-blocking version, use .get_nowait()
with its expected Empty
exception.
04:54
Pretty much the same as the other members of the queue
library. Isn’t consistency grand? Next up, I’ll talk about the pros and cons of the different priority queue mechanisms.
Christopher Trudeau RP Team on May 4, 2023
Hi Andrew,
Yes, you’re right. I must have been asleep when I recorded that. Fix coming soon.
andrewodrain on May 4, 2023
Thanks for the quick response!
Become a Member to join the conversation.
andrewodrain on May 3, 2023
I was under the impression that ‘Max Heaps’ are trees whose child nodes were less than or equal to their parents & ‘Min Heaps’ are trees whose child nodes are greater than or equal to their parents. Implying for a ‘Max Heap’, the root node is the greatest value, and for a ‘Min Heap’ the root node is the minimum value. Is this correct? Thank You!