Sorting Your DataFrame on a Single Column

For more information on concepts covered in this lesson, you can check out Introduction to Sorting Algorithms in Python.

00:00 Sorting Your DataFrame on a Single Column. To sort the DataFrame based on the values in a single column, you’ll use .sort_values(). By default, this will return a new DataFrame sorted in ascending order.

00:14 It doesn’t modify the original DataFrame. To use .sort_values(), you pass a single argument to the method containing the name of the column you want to sort by. In this example, you sort the DataFrame by the city08 column, which represents city miles per gallon for fuel-only cars.

00:39 This sorts your DataFrame using the column values from city08, showing the vehicles with the lowest miles per gallon first, because .sort_values() sorts in ascending order by default.
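As a rough sketch of the call described here, the snippet below sorts a small, made-up DataFrame by city08. The rows and values are hypothetical stand-ins for the fuel economy data used in the lesson, not the real dataset.

```python
import pandas as pd

# Hypothetical stand-in for the lesson's fuel economy data; values are made up.
df = pd.DataFrame(
    {
        "make": ["Alfa Romeo", "Ferrari", "Dodge", "Subaru", "Toyota"],
        "city08": [19, 9, 23, 17, 23],
    }
)

# Passing the column name sorts in ascending order and returns a new DataFrame.
sorted_df = df.sort_values("city08")

print(sorted_df)
print(df)  # the original DataFrame keeps its row order
```

Printing both DataFrames should show that only sorted_df is reordered, while df is left untouched.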

00:53 Although you didn’t specify a name for the argument you passed to .sort_values(), you actually used the by parameter, which you’ll see in the next example.

01:03 Another parameter of .sort_values() is ascending, which by default is set to True. If you want the DataFrame sorted in descending order, then you can pass False to this parameter, as seen on-screen.

01:24 By passing False to ascending, you reverse the sort order. Now your DataFrame is sorted in descending order by the average miles per gallon measured in city conditions.

01:34 The vehicles with the highest miles-per-gallon values are in the first rows.
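A minimal sketch of the descending sort, again on made-up data rather than the lesson's dataset. It names the by parameter explicitly and passes False to ascending.

```python
import pandas as pd

# Hypothetical sample data; the real dataset has many more rows and columns.
df = pd.DataFrame(
    {
        "make": ["Alfa Romeo", "Ferrari", "Dodge", "Subaru", "Toyota"],
        "city08": [19, 9, 23, 17, 23],
    }
)

# by names the column to sort on; ascending=False reverses the sort order.
desc_df = df.sort_values(by="city08", ascending=False)

print(desc_df)
```

The rows with the highest city08 values should now come first.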

01:40 It’s good to note that pandas allows you to choose different sorting algorithms to use with both .sort_values() and .sort_index().

01:47 The available algorithms are quicksort, mergesort, and heapsort. For more information on these different sorting algorithms, check out the Real Python course Introduction to Sorting Algorithms in Python.

02:01 The algorithm used by default when sorting on a single column is quicksort. To change this to a stable sorting algorithm, use mergesort.

02:11 You can do that with the kind parameter in .sort_values() or .sort_index(), as seen on-screen.

02:26 Using kind, you set the sorting algorithm to mergesort. The previous output used the default quicksort algorithm.

02:35 Looking at the highlighted indices, you can see the rows are in a different order. This is because quicksort is not a stable sorting algorithm, but mergesort is. Note that in pandas, kind is ignored when you sort on more than one column or label.
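The following sketch shows the kind parameter on a tiny, made-up DataFrame that contains tied city08 values. It uses the default ascending order for simplicity. Whether quicksort actually reorders tied rows depends on the data, so the example only demonstrates what mergesort guarantees: tied rows keep their original relative order.

```python
import pandas as pd

# Hypothetical sample data with repeated city08 values (ties).
df = pd.DataFrame(
    {
        "make": ["Toyota", "Ferrari", "Dodge", "Subaru", "Bugatti"],
        "city08": [23, 9, 23, 17, 9],
    }
)

# kind="mergesort" selects a stable algorithm for this single-column sort.
stable_df = df.sort_values(by="city08", kind="mergesort")

# Tied rows appear in their original order: index 1 before 4, and 0 before 2.
print(stable_df.index.tolist())
```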

02:52 When you’re sorting multiple records that have the same key, a stable sorting algorithm will maintain the original order of those records after sorting. For that reason, using a stable sorting algorithm is necessary if you plan to perform multiple sorts one after another and want each later sort to preserve the order that the earlier sorts established among tied records.
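To illustrate why stability matters across multiple sorts, here is a small sketch on made-up data. It sorts by city08 first and then by make, using mergesort both times. Because the second sort is stable, rows with the same make stay ordered by city08, which gives the same result as sorting on both columns at once.

```python
import pandas as pd

# Hypothetical sample data; makes repeat so the second sort has ties.
df = pd.DataFrame(
    {
        "make": ["Toyota", "Dodge", "Toyota", "Dodge", "Subaru"],
        "city08": [23, 9, 19, 17, 23],
    }
)

# First sort on the secondary key, then stable-sort on the primary key.
by_city = df.sort_values(by="city08", kind="mergesort")
two_step = by_city.sort_values(by="make", kind="mergesort")

# For comparison: a single sort on both columns.
multi = df.sort_values(by=["make", "city08"])

print(two_step)
print(two_step.equals(multi))
```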

03:08 Now that you’re familiar with sorting a DataFrame on a single column, you’re ready to see how to sort one on multiple columns. And that’s what will be covered in the next section of the course.
