# Hard Interview Question (Optimal Solution & PriorityQueue)

In this lesson, you’ll see the **final optimal solution** to solving the ** merge_k_lists** hard interview question. The solution takes advantage of the

**abstract data type, which is similar to a queue but associates each element with a priority. Getting an element, it retrieves the element with the highest priority. Both**

`PriorityQueue`

`.put()`

and `.get()`

take logarithmic time. Here’s an example:

```
>>> from queue import PriorityQueue
>>> pq = PriorityQueue()
>>> pq.put(2)
>>> pq.put(1)
>>> pq.put(3)
>>> pq.get()
1
>>> pq.get()
2
>>> pq.get()
3
>>> pq.empty()
True
>>> pq.put((1, "hello"))
>>> pq.put((2, "world"))
>>> pq.get()
(1, "hello")
>>> pq.get()
(2, "world")
```

You can check out the Python documentation on PriorityQueue and the Wikipedia article.

You’ll see all solutions at the end of the course.

**00:00**
In this video, we’ll look at the current solution we have for the `merge_k_linked_list()`

question and see if we can improve it. Just to review, the algorithm looks at the front value of all the linked lists, finds the minimum, puts it in the result linked list, remove that value from this `copy_linked_lists`

, then keeps going until there are no more values to add.

**00:18**
So really, there are a couple places in this `while`

loop that we need to look at. These two *O(k)* lines can sort of be combined into one by iterating, but that’s not really an optimization. Here, this is sort of annoying that we have to look through all the linked lists again to find the one that matches that minimum value.

**00:35**
It would be great if we had a mapping from the value of the linked list to that link. So, the key idea here is `# keep a mapping of the value to the linked list`

.

**00:46**
That way, we don’t need to iterate through the linked list again to find the one that corresponds to the min value. We can think, long term, that if we optimize this to have a mapping and then somehow optimize a way to get the minimum value faster than *O(k)*, and then optimize the `while`

condition, we can get to a better solution.

**01:02**
So, let’s start with optimizing this by having a mapping. Let’s actually just put this out here. `# keep a mapping of the value to the linked list`

. Well, whenever you think of mapping, you think of dictionary.

**01:13**
What is this dictionary going to map? It’s going to map `val_to_link`

. Well, what if linked lists have the same value? This should be `val_to_links`

.

**01:22**
It’s going to be a number map to a list of linked lists. So, instead of using a regular dictionary, we can use a `defaultdict`

. Just import it right here, `from collections import defaultdict`

, and the default is going to be a `list()`

.

**01:37**
Then, let’s add all the linked lists to this mapping `for link in copy_linked_lists:`

`val_to_links[link.val].append(link)`

. Let’s print `val_to_links,`

just to see what happens here. Let’s just return, so we don’t execute the rest of the code.

**01:56**
Okay. We got a `defaultdict`

of mapping `1`

to `Link(1, Link(2))`

, `2`

to mapping `Link(2, Link(4))`

, `3`

to mapping `Link(3, Link(3))`

.

**02:04**
It seemed to map correctly. I guess in this example, `1`

, `2`

, `3`

are all unique, so there’s no list of linked lists.

**02:14**
Okay. Now, we have a mapping of the value to linked list. This logic here becomes pretty duplicate, so let’s just comment it out. And now, we can find the `link`

basically being this variable, `val_to_links[min_val].pop()`

.

**02:30**
I’m using `.pop()`

because `.pop()`

is a constant time, and that will actually pop off one of the linked lists in our mapping. `pointer.next = Link(link.val)`

.

**02:41**
`pointer = pointer.next`

. And let’s save and see what happens. Let’s just remove this to make it really clean, like this, run it. `pop from empty list`

—okay, that makes sense in our solution because every time we’re popping off from the list—and that can be empty. So, what should we do? Well, we know that if we’ve popped off the last linked list, then there are no more linked lists mapping to that min value.

**03:05**
So we can do something like `if len()`

of here is `None`

—`== 0`

—let’s just close this like that—

**03:17**
then, we should just delete the mapping, because there are no more linked lists mapped to that `min_val`

. And the issue that this is causing is the next iteration, that min value is going to be the min value, but there’s not going to be any linked list mapped to it.

**03:31**
And then, we actually forgot one part. Every time we’re popping off, but nowhere in the code—except for up here—are we ever adding a linked list back to this mapping.

**03:42**
Well, if `link.next`

exists, then we need to add the mapping of the front of `link.next`

to `link.next`

. `val_to_links[link.next.val] = link.next`

.

**03:58**
Let’s save it, let’s run it. Okay. Again, we’re popping off an empty list. I assume—well, this line’s not even right. It should be `.append()`

.

**04:09**
I don’t think that’s going to fix it, but let’s see. Well, this logic is wrong. This is legacy logic, where we loop through the `copy_linked_lists`

.

**04:17**
We’re not changing `copy_linked_lists`

anymore, so I’m assuming that might be causing it? Let’s remove this. Let’s get `min_val`

as actually the min of `val_to_links`

.

**04:30**
Run it. `min() arg is an empty sequence`

. Okay, well that’s because this condition is wrong, because we’re not changing `copy_link_lists`

. We’re actually not even using `copy_linked_lists`

, so let’s actually just replace this, replace this, delete this copy—just to really make the code clean.

**04:44**
And here, it should really be if there are any values in `val_to_links`

.

**04:55**
Cool! And it passed. It’s a little hard to read the code—I’m going to remove the terminal output—but basically, this is checking, “Okay, are there still values mapped? And if there are, then do this logic, but if there aren’t, then that means we’ve ran out of linked lists to map.” Okay, so let’s look at the runtime. Let’s scroll all the way to the top. Here, this is constant.

**05:14**
This is going to be *O(k)*. Appending to a list is constant time. Constant, constant. This is actually constant because taking the length of a list is constant.

**05:23**
This line—so, how many keys can there possibly be in `val_to_links`

? Well, the initial loop—there can be max *k* keys, but you might be thinking, “Okay, as we’re adding and deleting links from that mapping, won’t it grow?” Well, no, it can’t. Because if there are no more linked lists mapped to a certain value, it will get removed from the keys.

**05:43**
And so, basically—scrolling all the way to the top—the mapping would include `1`

, `2`

, and `3`

as keys, but then when we’ve looked at all the linked list mapping to `1`

, it will get removed from the keys, and then `2`

already exists there.

**05:55**
So, at a certain time, there could only be *k* unique numbers mapped in `val_to_keys`

, so this min actually still takes *O(k)*. This `.pop()`

is still constant.

**06:06**
Constant, constant, constant—deleting from a dictionary is constant. Constant. And appending is constant. So, what is our final runtime? Well, how many times does a `while`

loop run? Well, even though we changed the condition, it still runs *k* * *n* times because it has to go through every number in all the linked lists. So the current solution is still *k* * *n* * *k*.

**06:24**
The previous solution in the previous video was *k* * *n* * *k*, and we still got *k* * *n* * *k*. So, you might be thinking, “Okay, that’s a lot of work to do nothing,” but notice how we’ve removed all the *O(k)* runtimes, except for one line.

**06:37**
So, if there was some way to get this minimum in less than *O(k)* time, we’ve optimized the solution. We don’t have to worry about any other place because everything else is constant time. And that’s exactly what we’re going to do.

**06:49**
So, we’re going to introduce a new data structure called the `PriorityQueue`

. I felt like the `PriorityQueue`

is not a large enough data structure to warrant its own video, but we’re going to sneak it into here and optimize this solution.

**07:00**
A `PriorityQueue`

is a data structure that allows you to put items into it and get items based on a priority. So, `from queue import PriorityQueue`

.

**07:10**
`pq = PriorityQueue()`

. And then, when you do `pq.put()`

, the value that you’re putting in is going to also be the priority. So, `pq.put(1)`

, then `2`

, then `3`

.

**07:23**
`1`

has the highest priority, so `pq.get()`

is going to return `1`

, then `2`

, then `3`

. For some reason, negative numbers have a higher priority, so `pq.put(-1)`

—`1`

. And `-5`

. `pq.get()`

will be `-5`

, `-1`

, and `1`

. You

**07:42**
can also put tuples where the first number is the priority. `(1, "hello")`

, `(2, "world")`

, `pq.get()`

will return a tuple `(1, 'hello')`

, and then, `(2, 'world')`

. You can check if it’s empty by `pq.empty()`

.

**08:02**
`PriorityQueues`

are really useful because they’re usually implemented with heaps, which allows you to `put()`

and `get()`

in *log(n)* time, where *n* is the length of the `PriorityQueue`

. I will link some more documentation down below, but let’s use this to speed up our solution.

**08:20**
`from queue import PriorityQueue`

.

**08:29**
`pq.put(link.val)`

. And then now, to get the minimum it’s just `pq.get()`

.

**08:38**
It doesn’t even need to be `min()`

, it’s literally just like that, which now takes *log(k)* time because the `PriorityQueue`

will at any time have max *k* values. Here, we don’t need to do anything because the `.get()`

actually removes it from the `PriorityQueue`

. Here, though, we need to put

**08:55**
`link.next.val`

into the `PriorityQueue`

. Save it, exit, and run it…

**09:08**
So now, let’s call this `# find min val solution`

and this’ll be `# val to link`

`solution`

, and now, finally, `# priority queue solution`

—

**09:23**
*O(k * n * log(k))*. So, we actually optimized from *log(k * n)* to *log(k)*, and interviewers really, really like these small optimizations, because it shows that you know your data structures, you know how to think about the best solution, and think about ways to improve it.

**09:42**
Let’s review all four solutions and think about what concepts we used and what we learned. So, the brute force solution—you put all the values of all the linked lists into a list, sorted it, and created a final linked list.

**09:55**
Here, we took advantage of the `sorted()`

built-in function, and it got us comfortable with building a linked list. The final resulting runtime was *k* * *n* * *(log(k * n)*. Then, we used a solution that took advantage of the fact that the lists were sorted by looking at the front of all the linked lists, found the minimum, put it in the result linked list, removed that value that we’ve added, kept going until there were no more values to add.

**10:18**
We copied the input because we didn’t want to mutate the input, and then kept going until there were no more values, found the minimum value, found the linked list that was mapped to the minimum value, and then added that value to the result, and then returned it.

**10:33**
The final runtime was *k* * *n* * *k*. Then, we saw that we could optimize this part by using a map to map the value to the linked lists.

**10:41**
We used a `defaultdict`

to map `val_to_links`

, actually saw a way to optimize the `while`

condition so that it’s constant time, found the minimum value, found the link that corresponds to it, deleted that min value mapping if there were no more linked lists, and then added a mapping to the `.next`

value of that `Link`

that we just popped off, and the final runtime was *k* * *n* * *k*.

**11:03**
But we did that because we wanted the only line to take *O(k)* time to be this `min()`

call, and then used a `PriorityQueue`

to decrease that runtime from *O(k)* to *log(k)* so that the final `PriorityQueue`

solution is *k* * *n* * *log(k)*.

**11:17**
This concludes the three-part video on finding multiple ways to solve the same problem and using all these different concepts that we’ve talked about: `defaultdict`

, `min()`

, `sorted()`

, and then a new data structure, `PriorityQueue`

.

**11:30**
The next video is a conclusion video, and you’ll go through everything that you learned in this course.

**Daniel A** on May 18, 2020

Not sure if i’m doing the Big O notation right but i believe this solution may run O(n) times.

```
from functool import reduce
def merge_k_linked_lists(linked_lists):
next_links = [link.next for link in linked_lists if link.next] # O(n)
total_links = next_links + linked_lists # O(1)
total_links_sorted = sorted(total_links, key=lambda link: link.val, reverse=True) # Olog(k)
result = reduce(lambda val, next_val: Link(next_val.val, next=val), total_links_sorted) # O(n)
return result
```

**Daniel A** on May 18, 2020

This solution essentially builds the links backwards from the largest link to the smallest.

**Simon Yates** on July 11, 2020

You’re relying on the linked lists only being 2 items deep, which wasn’t stated in the problem (although it’s true of both the test cases).

**ambreen2006** on Nov. 19, 2020

In the optimal solution without the priority queue there is an improvement made by removing the redundant iteration to find the index to which the min value belongs to. I’m thinking though that it does a tad bit more than just this change, because it has this line:

```
val_to_links[min_val].pop()
```

The pop() without the argument should be removing the last item from the list associated with the min value in the dictionary. However, if there are multiple linked list with the same min value, the program in the previous video advances the linked list found first but this one advances the linked list that was found last in the creation/update of ‘val_to_links’ dictionary. That’s fine here but if the Link object had more attributes associated to it, the order in which the sorted list is returned would be different for these two programs.

**ambreen2006** on Nov. 19, 2020

PriorityQueue solution can also do without the additional book keeping of indices.

Because duplicates are allowed, the entry into the PQ needs to be wrapped (or the next element of the tuple needs to be comparable).

```
class PQ_Entry:
def __init__(self, value, data):
self.value = value
self.data = data
def __lt__(self, other):
return self.value < other.value
```

Now, we can just use the PQ without needing the dictionary.

```
def merge_k_linked_lists(linked_lists):
pq = PriorityQueue()
for link in linked_lists:
if link:
pq.put(PQ_Entry(link.val, link))
result = Link(0)
ptr = result
while not pq.empty():
pq_entry = pq.get()
min_val, min_link = pq_entry.value, pq_entry.data
ptr.next = Link(min_val)
ptr = ptr.next
if min_link.next:
pq.put(PQ_Entry(min_link.next.val, min_link.next))
return result.next
```

**Geoff Rliey** on Jan. 30, 2021

A quick note about `PriorityQueue`

, in the video at TC:07:23 you say “`1`

has the highest priority, so `pq.get()`

is going to return `1`

, then `2`

, then `3`

. **For some reason, negative numbers have a higher priority**, so `pq.put(-1)`

, `1`

and `-5`

. `pq.get()`

will be `-5`

, `-1`

, and `1`

.”

I’m sure you’ve realised by now, but in case anyone else is wondering, the reason that negative numbers have a higher priority is simply because they are lower values than positive numbers.

I could probably get a degree in stating the blinking obvious. ;)

**pavolsevera** on Feb. 22, 2021

If we’re allowed to use the standard library, the simplest solution seems to use heapq.merge. The only thing we need to do then is to convert linked lists to iterators and iterators to linked lists, which is easy and pleasant.

**stephencarey1905** on July 20, 2021

I solved the problem initially with recursion but I’m not sure what the big O complexity might be for that. Mainly because I dont know what the complexity of my merge(merged_list, list) is.... its effectivly merge sort so o(n log n)?

```
def merge_k_linked_lists(linked_lists):
merged_list = linked_lists.pop(0)
# o(k)
for list in linked_lists:
# o(nlogn)???
merged_list = merge(merged_list, list)
return merged_list
def merge(lst_1, lst_2):
if lst_1 is None:
return lst_2
if lst_2 is None:
return lst_1
if lst_1.val < lst_2.val:
return Link(lst_1.val, merge(lst_1.next, lst_2))
return Link(lst_2.val, merge(lst_1, lst_2.next))
```

**Engleang Sam** on Jan. 4, 2022

I think it’s **O( k * n )** not **O( k * n log(n))** . By merging the list it will take **O(n)** rather than **O( n * log( n))** .

Become a Member to join the conversation.

James UejioRP Team on April 27, 2020Here is the Python documentation on PriorityQueue and a Wikipedia article.