Common Issues With Sorting in Python
00:00 As powerful and flexible as sorting in Python is, there are a few things you need to keep in mind that can cause headaches if you miss them. First, lists must contain comparable data types.
00:12
Some types of data can’t be sorted because they are too different to be compared. Let’s take a look at some in the interpreter. Go ahead and make a list called mixed_types
and set this equal to [None, 0]
. If you call sorted()
on mixed_types
, you’ll get an error.
00:31
This is because sorted()
is trying to do a less-than comparison (<) between an integer and a NoneType
. You can actually take a look at this and see it for yourself if you do None < 0
, and you’ll get that same issue there.
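Here is a minimal sketch of that failure in script form, using the same mixed_types list. The try/except wrappers and the sort_error and compare_error names are only added for illustration, so the snippet can run to completion instead of stopping at the first error.

```python
mixed_types = [None, 0]

# sorted() compares elements with <, which None and int don't support together.
try:
    sorted(mixed_types)
except TypeError as error:
    sort_error = error
    print(f"sorted() failed: {error}")

# The same kind of comparison fails on its own, outside of sorted().
try:
    None < 0
except TypeError as error:
    compare_error = error
    print(f"None < 0 failed: {error}")
```

Both attempts end in a TypeError, which is the signal that the two values are not comparable.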
00:45
If you have multiple data types in your list and they can be compared without throwing a TypeError
, then you should be okay. Sometimes you can convert your elements to a common type so that they can be compared, a process known as casting.
00:58
Go ahead and make a new list, call this mixed_numbers
, and this will be 5
, and then a "1"
in quotes, 100
, and then a "34"
in quotes.
01:10
What you end up with is an integer, a string, integer, and string. Now if you try to call sorted()
on this—and make sure Caps Lock is off—
01:22
you’ll see that you get this error here, because you’re trying to compare strings and integers. But you know that this "1"
can be evaluated as an integer if you were to call something like int()
and pass in the "1"
.
01:35
So in this case, you can cast those strings to integers as needed. Call sorted() and pass in a new list built with a list comprehension: [int(x) for x in mixed_numbers]
.
01:51
And make sure you’ve got your closing parenthesis. And there you go! sorted()
was able to handle that because all of the elements were evaluated as integers.
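The casting step can be sketched as follows, using the same mixed_numbers list. The sorted_numbers name and the try/except wrapper are additions for illustration only.

```python
mixed_numbers = [5, "1", 100, "34"]

# Sorting the list as-is raises TypeError, because int and str can't be compared.
try:
    sorted(mixed_numbers)
except TypeError as error:
    print(f"sorted() failed: {error}")

# Casting every element to int first gives sorted() comparable values.
sorted_numbers = sorted([int(x) for x in mixed_numbers])
print(sorted_numbers)  # [1, 5, 34, 100]
```

Note that the result holds only integers; the original strings are replaced by their integer values. If the original elements need to be preserved, passing key=int to sorted() is one alternative worth considering.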
02:01
Sometimes, Python will try to implicitly convert values to different data types as needed. You might be familiar with something like this if you ever do an if
statement and pass in an empty list, because you know that an empty list will evaluate to False
.
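As a quick illustration of that implicit conversion, the sketch below passes an empty list to an if statement. The empty_list and message names are hypothetical and only serve the example.

```python
empty_list = []

# An if statement evaluates the truth value of its condition, so no explicit
# bool() call is needed: an empty list counts as false.
if empty_list:
    message = "list has items"
else:
    message = "list is empty"

print(message)           # list is empty
print(bool(empty_list))  # False
```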
02:15
Let’s see how this works for sorting by making a new list called similar_values
, and set this equal to [False, 0, 1, 'A' == 'B', 1 <= 0]
.
02:34
Right away, Python is going to try to evaluate those comparisons. So if you just take a look at similar_values
, you’re going to end up with [False, 0, 1, False, False]
because 'A' and 'B' aren’t equal, and 1
is greater than 0
.
02:48
So let’s call sorted()
on similar_values
. And look at that! You’ve got False
, 0
, False
, False
, and then 1
has been brought to the end. Now, interestingly, the 0
was not sorted separately from these False
values, so you can tell that Python was evaluating these to be equal.
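The interpreter session described above can be reproduced with a short script like this one. The sorted_values name is an addition for illustration. The final line shows why 0 stays among the False values: in Python, bool is a subclass of int, and False compares equal to 0.

```python
similar_values = [False, 0, 1, 'A' == 'B', 1 <= 0]

# The comparisons are evaluated as soon as the list is created.
print(similar_values)  # [False, 0, 1, False, False]

# False and 0 compare as equal, so only 1 changes position.
sorted_values = sorted(similar_values)
print(sorted_values)  # [False, 0, False, False, 1]

print(False == 0)  # True
```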
03:08 This highlights another aspect of sorting called sort stability. When two elements compare as equal, Python keeps them in the same relative order they had in the original list.
03:20
So looking at the original list, you can see that you had False
and 0
. Now 1
was brought to the end, but then these False
values were moved forward.
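One way to see sort stability in isolation is to sort values that compare as equal and check that their input order survives. In the sketch below, the records list and the by_number name are hypothetical, and the key argument is used so that only the number in each tuple is compared.

```python
# False and 0 compare as equal, so a stable sort leaves them in input order.
print(sorted([False, 0]))  # [False, 0]
print(sorted([0, False]))  # [0, False]

# Hypothetical records: sort by the number only and watch the letters.
records = [(1, "b"), (0, "z"), (1, "a"), (0, "y")]
by_number = sorted(records, key=lambda record: record[0])
print(by_number)  # [(0, 'z'), (0, 'y'), (1, 'b'), (1, 'a')]
```

Within each group of equal numbers, the letters appear in the order they had in records, not in alphabetical order.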
03:29 Finally, case is also important when sorting, as you saw earlier. Remember how the capital letters would appear before the lowercase letters, no matter where they landed in the alphabet.
03:40
This is because Python compares strings by the Unicode code point of each character, which is a numeric value. To see this, let’s go ahead and make a names_with_case
list, which is going to equal 'harry'
, and then 'Suzy'
with a capital 'S'
,
04:00
'al'
with a lowercase 'a'
, and then 'Mark'
with a capital 'M'
. So if you take a look at that, you’ve got a mix here of lowercase and capital letters.
04:13
If you call sorted()
on this,
04:18
you’ll see that the capital letters appear first. And this makes sense if you were to look at the Unicode code points for each of those letters. Let’s go ahead and make a new list here, and you’re going to use the ord()
function for the first letter in each name, and then also just show what that first letter is in each name.
04:39
And actually, let’s make this a tuple and do this for name in sorted()
, and then pass in names_with_case
.
04:50
You’ll see here that the capital letters have lower ord()
values than lowercase letters, and they’re sorted in ascending order. Now, we haven’t looked at any strings that start with the same letter, but in that case, Python will just look at the next letter and continue on until it finds a difference.
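The code point comparison described above can be sketched like this, using the names_with_case list from the video. The first_letters name is an addition for illustration.

```python
names_with_case = ['harry', 'Suzy', 'al', 'Mark']

# Capitalized names sort ahead of lowercase ones.
print(sorted(names_with_case))  # ['Mark', 'Suzy', 'al', 'harry']

# Pair each first letter with its Unicode code point, in sorted order.
first_letters = [(ord(name[0]), name[0]) for name in sorted(names_with_case)]
print(first_letters)  # [(77, 'M'), (83, 'S'), (97, 'a'), (104, 'h')]
```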
05:07 So if you wanted to make some very similar strings
05:12
and make something like ['hhhd', 'hhhb', 'hhha']
,
05:23
when you call sorted()
on this,
05:27 it’ll just go to that last letter of each one because the first three letters are the same in each case. Likewise, if all the letters are the same but they’re of different lengths,
05:42
when you call sorted()
on this, Python will place the shortest strings first, because when one string runs out of characters to compare, it counts as the smaller one. All right!
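The two string cases above can be sketched as follows. The variable names are assumptions, and the contents of different_lengths are a hypothetical example, since the transcript does not show the list that was typed for the second case.

```python
# Same first three letters: the fourth letter decides the order.
very_similar_strings = ['hhhd', 'hhhb', 'hhha']
print(sorted(very_similar_strings))  # ['hhha', 'hhhb', 'hhhd']

# Same letter throughout but different lengths (hypothetical list):
# the string that runs out of characters first sorts first.
different_lengths = ['hhhh', 'hh', 'hhhhh', 'h']
print(sorted(different_lengths))  # ['h', 'hh', 'hhhh', 'hhhhh']
```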
05:54 Hopefully you have a pretty good idea of how to deal with some of the strange cases that can come up when you’re sorting. The big takeaways are to keep in mind that everything in your list or iterable that you’re trying to sort must be comparable.
06:07
And when you’re trying to sort alphabetically, you’re actually sorting by Unicode code point values. In the next video, we’re going to talk about times when you’ll want to use sorted()
over .sort()
and vice versa.
Harsha Vardhan on Jan. 2, 2020
To sort mixed numbers like in your example, I used the key argument instead of a list comprehension, something like the following.
Thanks Joe for the course. Really helpful.