Sorting is going to be one of the main operations that you do on a DataFrame. To sort a DataFrame, we’re going to use the
.sort_values() method, which takes one required positional argument.
This is going to be the column that we want to sort by. So, for example, if we wanted to sort our DataFrame by the
js-score column, we simply type in the name of the column.
The default is to go from low to high. If we want to change the order, in other words, to go from highest to lowest, we pass in a value to the
ascending keyword argument.
The default is
True, and so it would be sorting from smallest to largest, but if we want to sort from highest to lowest instead, we pass in a value of
False. In this case, we’re going from highest JS score to lowest JS score.
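As a minimal sketch of the two calls described so far, the snippet below sorts a small DataFrame by js-score in both directions. The names and scores are made up for illustration and are not the table used in the lesson.

```python
import pandas as pd

# Hypothetical data, not the table from the lesson.
df = pd.DataFrame(
    {
        "name": ["Ann", "Ben", "Cal", "Dee"],
        "py-score": [88.0, 79.0, 95.0, 81.0],
        "js-score": [71.0, 91.0, 80.0, 91.0],
    }
)

# Column name passed as the required positional argument.
# ascending defaults to True, so this goes from low to high.
low_to_high = df.sort_values("js-score")

# ascending=False flips the order to go from high to low.
high_to_low = df.sort_values("js-score", ascending=False)

print(low_to_high)
print(high_to_low)
```

Neither call changes df itself. Each one hands back a new, sorted DataFrame.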
Now, the name of this required positional argument is
by, and so we can also type that in just to make it a little bit more readable. We can also sort by more than one criterion, or more than one column name.
So, for example, notice right here that in the
js-score column, there are two candidates that have a
91.0. And so if we then wanted to also sort by a second criterion or a second column, we would pass into the
by keyword argument a list with a second column, or as many as we wanted. So, for example, if we then wanted to sort by
py-score, and we also wanted this to be, say, from highest to lowest, we would pass in
False for the
ascending keyword argument, and then we get
js-score from highest to lowest, and then
py-score would be the tiebreaker for any rows that have the same value for the
js-score. So in this case, we’re going to be sorting these two candidates that had the same
js-score from highest to lowest in terms of the
py-score.
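Here is a small sketch of sorting by two columns with a tiebreaker. The data is hypothetical and is set up so that two candidates share a js-score of 91.0, as in the situation described above.

```python
import pandas as pd

# Hypothetical data with a tie on js-score (Ben and Dee both have 91.0).
df = pd.DataFrame(
    {
        "name": ["Ann", "Ben", "Cal", "Dee"],
        "py-score": [88.0, 79.0, 95.0, 81.0],
        "js-score": [71.0, 91.0, 80.0, 91.0],
    }
)

# by takes a list of column names; js-score is the primary key and
# py-score only decides the order of rows that tie on js-score.
# A single False applies to every column in the list.
ranked = df.sort_values(by=["js-score", "py-score"], ascending=False)

# Spelling out one Boolean per column gives the same result.
ranked_explicit = df.sort_values(
    by=["js-score", "py-score"], ascending=[False, False]
)

print(ranked)
```

With this data, Dee comes before Ben because both have a js-score of 91.0 and Dee has the higher py-score.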
Now, similar to a lot of the other methods that we’ve looked at already for DataFrames,
.sort_values() is going to return a new DataFrame.
So we could either assign the result to a new variable or just reassign the DataFrame
df that we’ve been working with, or we can also use the
inplace keyword argument. In this case, the default value is
False, and if we pass in
True, run that, and then take a look at the current state of our DataFrame, we’ve got it sorted. Again, it’s
js-score from highest to lowest, and then for any JS scores that are equal, the tiebreaker is going to be
py-score, and that’s also going to be from highest to lowest.
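The difference between getting a new DataFrame back and sorting in place can be sketched like this. The data and the variable names sorted_copy, original_order, and result are made up for illustration.

```python
import pandas as pd

# Hypothetical data, not the table from the lesson.
df = pd.DataFrame(
    {
        "name": ["Ann", "Ben", "Cal", "Dee"],
        "py-score": [88.0, 79.0, 95.0, 81.0],
        "js-score": [71.0, 91.0, 80.0, 91.0],
    }
)

# Default (inplace=False): a new DataFrame comes back and df is untouched.
sorted_copy = df.sort_values(by=["js-score", "py-score"], ascending=False)
original_order = df["name"].tolist()

# inplace=True: df itself is reordered and the call returns None.
result = df.sort_values(
    by=["js-score", "py-score"], ascending=False, inplace=True
)

print(result)
print(df)
```

Because the in-place call returns None, writing df = df.sort_values(..., inplace=True) would replace the DataFrame with None, so pick either reassignment or inplace=True, not both.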
02:50 It’s worth mentioning that you can also sort a pandas DataFrame where the sort key is a row instead of a column, and therefore what you’ll actually be doing is sorting the columns. Now in this case, that really doesn’t make sense because for any given row, we’ve got different data types. We’ve got strings and we’ve got floats.
But depending on what you’re doing, you may actually want to sort the columns. Because sorting is an important operation on a DataFrame, let’s do a quick recap of the different keyword arguments to the
.sort_values() method. It takes one required positional argument and several keyword arguments, and it actually takes in more keyword arguments than the ones that are listed here, but these are probably the most used ones, and so we’ll focus on these.
The required positional argument
by, as we saw, is a string or a list of strings. This determines which column or list of columns is used to sort the rows in the DataFrame.
For the axis keyword argument, the default value is
0, which means that you want to be sorting the rows. Another way to write this, instead of
0 for the value of the
axis keyword argument, is to pass in
'index', and so that makes it a little bit more readable. Instead of this value of
0, you’re saying that we want to sort the rows, and so you pass in the string
'index'.
If, however, you want to sort the columns, then you can pass in a value of
1 or the string
'columns', and so then that would mean that the string or list of strings that you pass into the
by argument is going to be the index labels of the rows that you want to use as the sort key. So in other words, you want to use a particular row to sort the columns by, and you would then pass in a value of
'columns' or a value of
1.
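To make the column-sorting case concrete, here is a hedged sketch that uses an all-numeric DataFrame so that every value in a row can be compared. The scores table, the names used as index labels, and the sql-score column are assumptions made up for this example.

```python
import pandas as pd

# Hypothetical all-numeric table, indexed by candidate name.
scores = pd.DataFrame(
    {
        "py-score": [88.0, 79.0],
        "js-score": [71.0, 91.0],
        "sql-score": [80.0, 85.0],
    },
    index=["Ann", "Ben"],
)

# With axis=1, by refers to an index label (a row), and the columns
# are reordered by the values found in that row.
by_ann = scores.sort_values(by="Ann", axis=1)

# The string 'columns' is the more readable spelling of axis=1.
by_ann_named = scores.sort_values(by="Ann", axis="columns")

print(by_ann)
```

Ann's scores are 71.0, 80.0, and 88.0 from low to high, so the columns come out as js-score, sql-score, py-score.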
Next, as we saw, is the
ascending keyword argument. This is a Boolean value. The default value is
True, which means that the sorting will be done in ascending order.
If you want it the other way, just pass in
False. And then there’s the
inplace keyword argument. Again, this is a Boolean. The default value is
False, which means that whenever you call the
.sort_values() method on a DataFrame, you’re going to get back a new DataFrame.
But if, instead, you want to sort in place, then pass in a value of
True to the
inplace keyword argument.
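As a quick check of the defaults recapped above, this sketch spells out all four arguments with their default values and compares the result with the shortest form of the call. The data is hypothetical.

```python
import pandas as pd

# Hypothetical data, not the table from the lesson.
df = pd.DataFrame(
    {
        "name": ["Ann", "Ben", "Cal", "Dee"],
        "py-score": [88.0, 79.0, 95.0, 81.0],
        "js-score": [71.0, 91.0, 80.0, 91.0],
    }
)

# Shortest form: only the required argument.
implicit = df.sort_values("js-score")

# Same call with the defaults discussed in the recap written out.
explicit = df.sort_values(
    by="js-score", axis=0, ascending=True, inplace=False
)

# axis='index' is the readable spelling of axis=0.
explicit_named = df.sort_values(by="js-score", axis="index")

print(implicit.equals(explicit))
```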
05:21 Coming up next, we’ll take a look at the general operation of filtering data from a DataFrame.