How to Sort With Pandas
00:00 Sometimes, you might want to achieve something that’s not directly doable using plotnine. For example, in the previous lesson, you thought about wanting to sort this descending or ascending by value counts instead of going alphabetically as it does by default.
00:16 Now, you can always fall back to the underlying pandas library if you want to do something such as custom sorting, and I’d also encourage you to take a look at the documentation for plotnine that has examples for a lot of these things.
00:28 So, if I head over to the documentation—you’ll see a link for that in the description—then you can see an example that is very similar to the one that you were looking at before.
00:37 So, it’s even using the same dataset, and you see the default sorting. Again, it’s alphabetical, but you want to sort by count instead. And the plotnine documentation suggests that for a specific ordering, you use a pandas.Categorical variable and then order the categories to your preference.
00:59 Now, since this is such a similar example, you can probably just go ahead and copy this and then adapt it to what you need. So, I’m going to give this a go, put it in here. We’re going to need some imports.
01:13 So here, you’re working with pandas as pd, so I’m going to have to import that separately, as pd.
01:21 And then the dataset, you already have that imported from before. And then on this one, what you’re doing is # Determine order and create a categorical type, # Note that value_counts() is already sorted.
01:32 So, you’re calling .value_counts(), which here is a pandas Series method, on a specific Series object of the mpg DataFrame.
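As a rough illustration of what .value_counts() returns, here is a minimal sketch. The small cars DataFrame is a made-up stand-in for the mpg dataset, not the real data.

```python
import pandas as pd

# Hypothetical stand-in for the mpg dataset: one "class" column, made-up rows.
cars = pd.DataFrame(
    {"class": ["suv", "compact", "suv", "2seater", "compact", "suv"]}
)

# Selecting a single column gives a Series. value_counts() tallies each
# distinct value and returns the tallies sorted from most to least frequent.
class_counts = cars["class"].value_counts()

# The index holds the class names, ordered here as: suv, compact, 2seater.
print(class_counts)
```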
01:40 In this case, they were using the manufacturer, but we’re actually interested in the class, so I’m going to have to change these ones out here. And this is also not going to be the manufacturer… It might be easier just to write it, but ha, let’s keep going down this road. All right.
01:57 So, class_list. Let’s go line by line. So this first one, you want to make a list of the classes that exist. And as this one says here, .value_counts() already sorts them automatically. Taking the Series object on 'class' of this dataset,
02:16 which is this column here. Sort it, and then put it into a list. And then for the class categorical, you’re using pandas.Categorical and passing in, again, a Series object, and then the categories are taken from the list that you just created before.
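The two lines described here might look roughly like the sketch below. It assumes a made-up cars DataFrame in place of mpg. The names class_list and class_cat follow the transcript.

```python
import pandas as pd

# Hypothetical stand-in for the mpg dataset, with made-up rows.
cars = pd.DataFrame(
    {"class": ["suv", "compact", "suv", "2seater", "compact", "suv"]}
)

# Class names ordered by how often they occur (most frequent first).
class_list = cars["class"].value_counts().index.tolist()

# Same values as the original column, but with an explicit category order.
class_cat = pd.Categorical(cars["class"], categories=class_list)

print(class_cat.categories.tolist())
```

The values themselves are unchanged. Only the order of the categories differs from the alphabetical default.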
02:34 Now you want to assign a new column to the DataFrame, so you’re going to say the mpg dataset gets a new column that we’re going to call class_cat (class categorical),
02:44 and we’re going to pass this categorical object here, class_cat, and then you’re already ready to plot this. I’m going to, oops, make it look a bit nicer.
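Adding the new column could be sketched like this, again with a made-up cars DataFrame standing in for mpg. DataFrame.assign() returns a new DataFrame, so the result is bound back to the same name.

```python
import pandas as pd

# Hypothetical stand-in for the mpg dataset, with made-up rows.
cars = pd.DataFrame(
    {"class": ["suv", "compact", "suv", "2seater", "compact", "suv"]}
)

class_list = cars["class"].value_counts().index.tolist()
class_cat = pd.Categorical(cars["class"], categories=class_list)

# assign() leaves the original untouched and returns a copy with the
# extra column, so rebind the name to keep the result.
cars = cars.assign(class_cat=class_cat)

print(cars.dtypes)
```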
02:57 And for x, we want to use the new column that you just created, which is called 'class_cat'. That’s the column here. And we’re using a geom_bar(size=20), okay, we’ll see how that goes.
03:11 And then the labels: 'Count' versus 'Class', and the title, 'Number of Cars by Class'.
03:19 Okay, so you saw what I did here, is I just went over here to the documentation, found an example that made sense—it’s achieving what I want to achieve. I copied the code, and then adapted it to change out… In this case, you just had to change out what is the column that you’re working with. And keep the names that you’re using somewhat descriptive. Okay.
03:41 Now that I run this, you can see that it’s sorted by the counts, and it starts off with the 2seater, the lowest one, and then just increases in the amounts.
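If you want the categories in the opposite order, one option is to reverse the list before building the categorical. This is a minimal sketch with a made-up cars DataFrame. The names descending and ascending are only illustrative.

```python
import pandas as pd

# Hypothetical stand-in for the mpg dataset, with made-up rows.
cars = pd.DataFrame(
    {"class": ["suv", "compact", "suv", "2seater", "compact", "suv"]}
)

# value_counts() orders from most to least frequent by default.
descending = cars["class"].value_counts().index.tolist()

# Reversing the list gives least to most frequent.
ascending = descending[::-1]

class_cat = pd.Categorical(cars["class"], categories=ascending)
print(class_cat.categories.tolist())
```

Passing ascending=True to .value_counts() is another way to get the least frequent values first.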
03:50 And you see that the most entries are SUVs, the fewest entries are 2-seaters, and then there’s the rest in between. So, what I wanted to show you with this lesson is two things: First of all, not everything can be done inside of plotnine, and if you can’t do it, then fall back to pandas. And then, secondly, take a look at the documentation. There are examples in the plotnine docs.
04:16 All right. So, if there’s something you can’t do, fall back to pandas. If you need an example of how to do something, check out the plotnine docs or research it online, and you will find something that’s going to be helpful.