Sorting Your DataFrame on Multiple Columns
00:00 Sorting Your DataFrame on Multiple Columns. In data analysis, it’s common to want to sort your data based on the values of multiple columns. Imagine you have a dataset with people’s first and last names.
In the first example, you sorted your DataFrame on a single column named city08. From an analysis standpoint, the MPG in city conditions is an important factor that could determine a car’s desirability. In addition to the MPG in city conditions, you may want to look at it for highway conditions.
The next example will explain how to specify the sort order and why it’s important to pay attention to the list of column names you use. To sort the DataFrame on multiple columns, you must provide a list of column names. For example, to sort by make and model, you should create the following list and then pass it to
01:46 Now your DataFrame is sorted in ascending order by make. If there are two or more identical makes, then it’s sorted by model. The order in which the column names are specified in your list corresponds to how your DataFrame will be sorted.
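As a rough sketch of what sorting by make and then model looks like, the snippet below builds a tiny stand-in DataFrame and passes a list of column names to .sort_values(). The four sample rows are made up for illustration and are not taken from the course's dataset; only the column names make, model, and city08 come from the lesson.

```python
import pandas as pd

# Hypothetical sample rows standing in for the course's fuel economy data.
df = pd.DataFrame(
    {
        "make": ["Toyota", "Audi", "Toyota", "Audi"],
        "model": ["Corolla", "A4", "Camry", "A3"],
        "city08": [30, 24, 28, 26],
    }
)

# Sort by make first; rows that share a make are then ordered by model.
sorted_df = df.sort_values(by=["make", "model"])

print(sorted_df)
```

Both Audi rows come before both Toyota rows, and within each make the models appear alphabetically. The original index labels travel with their rows, so they no longer appear in order.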
With textual data, the sort is case sensitive, meaning that capitalized text will appear first in ascending order and last in descending order. You might be wondering if it’s possible to sort using multiple columns and to have those columns use different ascending arguments. With pandas, you can do this with a single method call.
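One way this single call might look is sketched below: the ascending parameter of .sort_values() accepts a list of Booleans, one per column name in by. The sample rows are invented for illustration, and the choice to sort make and model ascending with city08 descending is an assumption about the kind of ordering you might want, not necessarily the exact call used in the course.

```python
import pandas as pd

# Hypothetical sample rows; each make and model appears twice with different MPG.
df = pd.DataFrame(
    {
        "make": ["Toyota", "Audi", "Toyota", "Audi"],
        "model": ["Corolla", "A4", "Corolla", "A4"],
        "city08": [30, 24, 32, 26],
    }
)

# One Boolean per column: make and model ascending, city08 descending.
mixed = df.sort_values(
    by=["make", "model", "city08"],
    ascending=[True, True, False],
)

print(mixed)
```

The list passed to ascending must be the same length as the list passed to by. Here the rows are grouped alphabetically by make and model, and within each group the higher city08 value comes first.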
04:13 This is helpful because it groups the cars in a categorical order and shows the highest-MPG cars first. Now that you know how to sort a DataFrame on multiple columns, in the next section of the course, you’ll see how to sort one on its index.