Take Advantage of Accessor Methods
Do you know what accessor methods are and why you can gain a lot of advantages from them? If not, this lesson is right for you! You’ll not only learn what they are, but how to get the most out of them.
00:00 In this video, you’re going to learn what accessor methods are in Pandas and why you’d want to use them. You can think of a Pandas accessor as a property that serves as an interface to additional methods. If that didn’t make any sense, don’t worry.
00:15 We’re going to look at some examples before we get too into it. Go ahead and open up your terminal.
00:25 Start the Python interpreter, and import pandas as pd. So, the Series class in Pandas actually has four different types of accessors.
00:37 You can see this by going to Series._accessors. We’re only going to be concerned about three of these: the 'str' (string) accessor, the 'cat' (categorical) accessor, and the 'dt' (datetime) accessor. 'sparse' is a newer one that’s set up to handle sparse data structures, and it works along the same lines as these other three. To get started, create a small Series called addr (address),
01:10 and this will just be something simple with a couple of strings in it. It says something like 'Washington, D.C.' and a zip code, 'Brooklyn' and a 9-digit zip code,
01:44 and finish it up with 'Pittsburgh'.
01:54 All right, close the bracket and the parenthesis.
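The setup so far might look something like the sketch below. The full address strings are assumptions: the video only names the cities and says whether each zip code has 5 or 9 digits, and the third entry is a hypothetical one added here so that the Series has four items, matching the four counts read out later. Series._accessors is a private attribute, so the names it lists may differ between Pandas versions.

```python
import pandas as pd

# Names of the accessors registered on Series. This is a private
# attribute, so newer Pandas versions may list more than four.
print(sorted(pd.Series._accessors))

# Illustrative address strings (assumed, not copied from the video).
addr = pd.Series([
    "Washington, D.C. 20003",
    "Brooklyn, NY 11211-1755",
    "Omaha, NE 68154",  # hypothetical third entry
    "Pittsburgh, PA 15211",
])
print(addr)
```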
02:01 Now for a regular string s, if I were to have something like ' hello', you could then call s.upper(), and it would convert that to uppercase, with the leading space kept too, just because I put the space here.
02:16 If you wanted to call .upper() on every item in this Series, you may think you could do something like addr.apply(upper), like so.
02:28 But you see that that doesn’t work, because .upper() is actually a string method that’s called off of a string object, and not a standalone function that would fit in .apply(). Here’s where that .str (string) accessor comes into play.
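Here is a small sketch of the two steps just described: calling .upper() on a plain string, and then the failed attempt to pass upper to .apply(). The address strings are the same illustrative assumptions as before, and the error is caught only so the snippet runs to the end. In a fresh interpreter the failure is a NameError, because no standalone name called upper exists.

```python
import pandas as pd

# Illustrative address strings (assumed, not copied from the video).
addr = pd.Series([
    "Washington, D.C. 20003",
    "Brooklyn, NY 11211-1755",
    "Omaha, NE 68154",  # hypothetical third entry
    "Pittsburgh, PA 15211",
])

# On a plain string, .upper() is a method of the string object itself.
s = " hello"
shouted = s.upper()  # the leading space is kept
print(repr(shouted))

# There is no standalone function named `upper`, so this lookup fails
# before .apply() is even called.
try:
    addr.apply(upper)
except NameError as exc:
    error_message = str(exc)
    print(error_message)
```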
02:41 You can actually type addr.str.upper(), just like that. And now every item in that Series has been converted to uppercase letters. This extends a little bit further, too, so if you wanted to find out if it was a 5- or a 9-digit zip code, you could call that .str accessor and then call .count() off of that and actually pass in a regular expression.
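As a rough sketch of those two calls: the address strings are illustrative assumptions again, and so is the pattern r"\d" (one match per digit), since the video doesn’t spell out which regular expression it passes to .count().

```python
import pandas as pd

# Illustrative address strings (assumed, not copied from the video).
addr = pd.Series([
    "Washington, D.C. 20003",
    "Brooklyn, NY 11211-1755",
    "Omaha, NE 68154",  # hypothetical third entry
    "Pittsburgh, PA 15211",
])

# The .str accessor applies the string method to every item.
upper_addr = addr.str.upper()
print(upper_addr)

# Count the digits in each item; r"\d" is an assumed pattern.
digit_counts = addr.str.count(r"\d")
print(digit_counts)
```

With these assumed strings, the digit count per item tells a 5-digit zip code apart from a 9-digit one.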
03:12 And now you can see there is a 5-digit zip code, a 9, and then a 5, and a 5. So, each of these accessors maps to its own class of methods, so that you can call them off of your Series or DataFrame.
03:25 These classes are then attached to the Series or DataFrame using a CachedAccessor, which is a type of cached property. And this just means that it’s a property that’s only computed once per instance, and then replaced by an attribute.
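To make the "computed once, then replaced by an attribute" idea concrete, here is a simplified sketch of that pattern in plain Python. It is not Pandas’ actual implementation, and every name in it (CachedAccessorSketch, ShoutMethods, Words, shout) is hypothetical.

```python
class CachedAccessorSketch:
    """Simplified stand-in for a cached accessor; not Pandas' real code."""

    def __init__(self, name, accessor_cls):
        self._name = name
        self._accessor_cls = accessor_cls

    def __get__(self, obj, owner):
        if obj is None:
            # Looked up on the class itself: hand back the accessor class.
            return self._accessor_cls
        accessor = self._accessor_cls(obj)
        # Store the result on the instance under the same name, so later
        # lookups find the plain attribute and never reach this method.
        object.__setattr__(obj, self._name, accessor)
        return accessor


class ShoutMethods:
    """Hypothetical group of methods exposed through the accessor."""

    def __init__(self, data):
        self._data = data

    def upper(self):
        return [item.upper() for item in self._data]


class Words:
    """Hypothetical container that gets a `shout` accessor attached."""

    shout = CachedAccessorSketch("shout", ShoutMethods)

    def __init__(self, items):
        self.items = list(items)

    def __iter__(self):
        return iter(self.items)


words = Words(["hello", "world"])
first = words.shout   # built on first access
second = words.shout  # read straight from the instance attribute
print(first is second)
print(words.shout.upper())
```

Because the descriptor defines only __get__, the attribute stored on the instance takes priority on every later lookup, which is why the accessor object is built just once per instance.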
03:39 This allows you to keep pulling that value without calculating it every time. The second type of accessor that we’ll talk about is the .dt (datetime) one.
03:47 So create another Series called daterng (date range), and set this equal to a Series() that’s just a pd.date_range(),
04:04 and we’ll grab 9 periods off that, and say the freq is quarterly.
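A sketch of that Series is below, with two assumptions. The start value "2019" is inferred from the output described in the video, not stated outright. The frequency is passed as a pd.offsets.QuarterEnd() object instead of a frequency string, because the accepted spelling of the quarterly string alias has changed between Pandas versions.

```python
import pandas as pd

# Nine quarter-end dates. The "2019" start is inferred from the video's
# output, and the offset object stands in for a quarterly freq string.
daterng = pd.Series(
    pd.date_range("2019", periods=9, freq=pd.offsets.QuarterEnd())
)
print(daterng)
```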
04:14 Take a look at what that looks like, and you can see the end of each quarter for 2019, 2020, and the first one for 2021. So, how do we use the .dt accessor?
04:30 Just like the .str one. So you could call something like .day_name() off of this, and you actually get the day name of each of those dates.
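That call could look like the following sketch, assuming the same quarter-end Series as before (the "2019" start and the offset object are assumptions, not taken from the video).

```python
import pandas as pd

# Assumed reconstruction of the quarter-end Series from the video.
daterng = pd.Series(
    pd.date_range("2019", periods=9, freq=pd.offsets.QuarterEnd())
)

# The .dt accessor exposes datetime methods for every item at once.
day_names = daterng.dt.day_name()
print(day_names)
```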
04:41 And if you want, you can start chaining these together a little bit and get some cool results. So, let’s say you want to see the dates in the range where the .quarter is greater than 2. Run that, and you see the dates for the third and the fourth quarter. Likewise, if you wanted to look at the items for the end of the year, you could index daterng with a Boolean mask: use your accessor, and check if each date is the year end or not.
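Both filters can be sketched like this. The Series is the same assumed reconstruction as before, and .dt.is_year_end is my reading of the "check if it is the year end" step; the video doesn’t show the exact attribute it uses.

```python
import pandas as pd

# Assumed reconstruction of the quarter-end Series from the video.
daterng = pd.Series(
    pd.date_range("2019", periods=9, freq=pd.offsets.QuarterEnd())
)

# Keep only the dates that fall in the third or fourth quarter.
second_half = daterng[daterng.dt.quarter > 2]
print(second_half)

# Keep only the dates that are the last day of a year.
year_ends = daterng[daterng.dt.is_year_end]
print(year_ends)
```

In both cases the accessor produces a Boolean Series, and indexing daterng with it keeps only the rows where the value is True.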
05:17 And just like that, you’ve returned the two items that represent the last day of the year. The .cat (categorical) accessor is a little special and deserves its own video, but hopefully you’ve got a pretty good idea of how these accessor methods work.
05:32 Think of them as applying a certain set of methods to a Series or DataFrame. This can be a bit tricky, so feel free to look up the Pandas documentation on this and also more information on cached properties. Like everything, though, they take practice. Try and see where you can use them! Thanks for watching.