heapq and PriorityQueue
00:13 Heap queues are based on heaps. A heap is a special form of a tree data structure. There are several different kinds of heaps. The max-heap specifies that for any node in the tree, all children of that node have a value less than or equal to the parent nodes. The min-heap is similar, but the value is greater than or equal to their parent node. Binary heaps are heaps where no node has more than two children. Here’s an example.
It turns out that a binary heap tree can be flattened and stored in a linear fashion, such as an array. You can think of this as a special way of sorting the array, but the sort order is the position in the heap. Inserting into a binary heap as an array is done by putting the element at the end of the array, then swapping parent and child elements until the newly inserted item is in the correct position in the array. In Python, the built-in
list type is implemented as an array. Inserting at the end of the
list is therefore cheap. Swapping items in a
list is also cheap. You are directly accessing two elements in memory.
01:35 Deleting an item from the binary heap is a bit more complicated, but similar to the insertion process. The nature of the relationship between nodes in the tree make insert and delete O(log n) operations.
heapq library provides methods for keeping a list in binary heap format. I’ll start off with a
list as my queue, import the
heapq library, then use the
heappush() method from the library to insert a new item in the heap. Nothing new here. Doing it again,
heapq to maintain your list is all well and good as long as you remember to do so. If you forget to use the
heapq methods at some point, you will damage the tree, further messing up its use in the future.
Complications arise from sort stability as well. First off, what is your desired behavior when two items have the same value? If you’re using tuples, you will get the tuple sort order, similar to the
insort() method in the previous lesson. If, on the other hand, you want the priority of items with the same priority value to be based on their insertion order—a common practice in scheduling—then you’re going to have to augment your tuples with more information.
Removing items from the heap can also be problematic, possibly damaging the sort order of the tree. The easiest way to do this is just to remove the item, then call
heapify() to return heap sorting order on the list.
In the case of moving an item—say, if its priority changes—do the delete first, then re-insert afterwards. Previous lessons talked about thread safety in concurrent programming. Python provides a thread-safe version of a
heapq in the
queue library called
PriorityQueue. It has the same performance and restrictions of
heapq, but also uses locks to ensure its methods are atomic. Similar to the other objects in the
.get_nowait(). To get started, do an import,
Become a Member to join the conversation.