For more information on concepts covered in this lesson, you can check out:
Sorting Your DataFrame on Its Index
Before sorting on the index, it’s a good idea to know what an index represents. A DataFrame has an .index property, which by default is a numerical representation of its rows’ locations.
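As a minimal sketch of what that default index looks like, assuming a small hypothetical DataFrame (the column names and values are made up for illustration and are not the lesson's dataset):

```python
import pandas as pd

# Hypothetical sample data, not the lesson's dataset
df = pd.DataFrame(
    {
        "make": ["Toyota", "Ford", "Honda"],
        "model": ["Corolla", "Focus", "Civic"],
    }
)

# With no index specified, pandas labels the rows 0, 1, 2, ...
print(df.index)
print(list(df.index))  # [0, 1, 2]
```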
00:30 Sorting by column values, like you did in a previous example, reorders the rows in your DataFrame, so the index becomes disorganized. This can also happen when you filter a DataFrame or when you drop or add rows.
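The effect described here can be sketched roughly as follows. The DataFrame and its mpg column are hypothetical stand-ins, not the lesson's data:

```python
import pandas as pd

# Hypothetical sample data; "mpg" stands in for any numeric column
df = pd.DataFrame(
    {"model": ["Corolla", "Focus", "Civic"], "mpg": [30, 26, 32]}
)

# Sorting by column values reorders the rows, and each row keeps its label,
# so the index is no longer in order
by_value = df.sort_values(by="mpg")
print(list(by_value.index))  # [1, 0, 2]

# Sorting on the index puts the rows back in label order
restored = by_value.sort_index()
print(list(restored.index))  # [0, 1, 2]
```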
Using this method, you replace the default integer-based row index with two axis labels. This is considered a multi-index or hierarchical index. Your DataFrame is now indexed by more than one key, which you can sort on with .sort_index().
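A minimal sketch of building and sorting a two-level index. It assumes the method in question is pandas' .set_index() and that the sort is done with .sort_index(); the columns and values are hypothetical:

```python
import pandas as pd

# Hypothetical sample data, not the lesson's dataset
df = pd.DataFrame(
    {
        "make": ["Toyota", "Ford", "Toyota", "Ford"],
        "model": ["Corolla", "Focus", "Camry", "Escape"],
        "mpg": [30, 26, 28, 24],
    }
)

# Assumption: .set_index() with two columns replaces the default
# integer index with a two-level (hierarchical) index
multi = df.set_index(["make", "model"])
print(multi.index.nlevels)  # 2

# .sort_index() orders rows by the first level, then by the second
sorted_multi = multi.sort_index()
print(list(sorted_multi.index))
```

Passing a list of column names is what produces the two levels here; passing a single name would produce an ordinary one-level index.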
For the next example, you’ll sort your DataFrame by its index in descending order. Remember from sorting your DataFrame with .sort_values() that you can reverse the sort order by setting ascending to False.
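As a rough sketch of a descending index sort, assuming the ascending parameter of .sort_index() is set to False and using a hypothetical three-row DataFrame:

```python
import pandas as pd

# Hypothetical sample data with the default 0, 1, 2 index
df = pd.DataFrame({"model": ["Corolla", "Focus", "Civic"]})

# ascending=False reverses the order of the index labels
descending = df.sort_index(ascending=False)
print(list(descending.index))  # [2, 1, 0]
```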
Now your DataFrame is sorted by its index in descending order. One difference between using .sort_index() and .sort_values() is that .sort_index() has no by parameter, since it sorts a DataFrame on the row index by default.
There are many cases in data analysis when you want to sort on a hierarchical index. You’ve already seen how you can use model in a multi-index. For this dataset, you could also use the id column as an index.
Using the id column as the index could be helpful in linking related datasets. For example, the EPA’s emissions dataset also uses id to represent vehicle record IDs. This links the emissions data to the fuel economy data.
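One way this kind of linking might look, as a sketch only: the two DataFrames below are tiny hypothetical stand-ins for the fuel economy and emissions data, and the score column is invented for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for the two datasets, each indexed by id
fuel = pd.DataFrame(
    {"id": [3, 1, 2], "model": ["Civic", "Corolla", "Focus"]}
).set_index("id")
emissions = pd.DataFrame(
    {"id": [2, 3, 1], "score": [5.0, 7.0, 6.0]}
).set_index("id")

# DataFrame.join() aligns rows on the index by default, so records
# with the same id are matched even though the row order differs
linked = fuel.join(emissions).sort_index()
print(linked)
```

Because both frames share id as their index, no explicit key column has to be named in the join.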