Working With Rows and Columns in DataFrames
A frequent operation that you’ll be doing on a DataFrame is extracting rows or columns. Let’s start with a column, say, the city column. Here, you’re going to be using notation that’s similar to a Python dictionary. So in this case, you’re going to access the column with the label 'city', and this will return a pandas Series object. You can think of a pandas Series object as either an entire row or an entire column of a DataFrame. So if we check out the type of the 'city' column, we get a pandas Series. Let’s save this column in a variable, say, cities.
And if we take a look at this again, we see that we not only extracted the data in that column, we also extracted the index, or the row labels. And so a pandas Series object will also contain an .index attribute, which, in this case, will be the same as the .index in the DataFrame, because we extracted an entire column of the DataFrame.
Another way that you can extract a column is to use dot notation, but this will only work if the name of the column is a string that’s a valid Python identifier. So, for example, if we wanted to extract, say, the age column, we would simply add .age to the DataFrame’s name. And then, in this case, we get that column back. But if we tried this with the Python score column, so .py-score, we’re going to get an AttributeError, because Python reads this as extracting a column called py and then subtracting from it some other object called score. So if we wanted to extract that py-score column, we’d have to use the bracket notation and simply write out the full column name.
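To make the two column notations concrete, here’s a minimal sketch. The DataFrame below is hypothetical stand-in data, not the one used in the lesson; only the column labels 'city', 'age', and 'py-score' and the variable name cities come from the discussion above, and the names df, ages, and py_scores are assumptions for illustration.

```python
import pandas as pd

# Hypothetical stand-in data; not the DataFrame used in the lesson.
df = pd.DataFrame(
    {
        "city": ["Lisbon", "Oslo", "Lima"],
        "age": [41, 28, 33],
        "py-score": [88.0, 79.0, 81.0],
    },
    index=[101, 102, 103],
)

# Bracket notation works for any column label and returns a Series.
cities = df["city"]
print(isinstance(cities, pd.Series))  # True

# The extracted column carries the same row labels as the DataFrame.
print(cities.index.equals(df.index))  # True

# Dot notation works only when the label is a valid Python identifier.
ages = df.age

# df.py-score is parsed as (df.py) - score. The df.py lookup fails first,
# so Python raises AttributeError before it ever reaches the subtraction.
try:
    df.py-score
except AttributeError as error:
    print("AttributeError:", error)

# Bracket notation handles the hyphenated label without trouble.
py_scores = df["py-score"]
```

Bracket notation is the safer default, since it works for every label, while dot notation is only a shortcut for labels that happen to be valid identifiers.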
Now let’s talk about extracting rows. So the row labels, if you remember, are found in the .index attribute. We know that these are integers running up to, but not including, 108, so the last one’s 107. So to access a whole row, say with index value 103, we use the .loc accessor. So the way to do this is to write the DataFrame’s name, then .loc, then bracket notation, and then the actual label of the row that we want to access, so let’s say 103. This returns a pandas Series object as well. Let’s see the data type for that just to make sure.
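Here’s a short sketch of that row lookup. Again, the DataFrame is hypothetical stand-in data rather than the one from the lesson, and the names df and row are assumptions; only the .loc accessor and the label 103 come from the discussion above.

```python
import pandas as pd

# Hypothetical stand-in data; not the DataFrame used in the lesson.
df = pd.DataFrame(
    {
        "city": ["Lisbon", "Oslo", "Lima"],
        "age": [41, 28, 33],
        "py-score": [88.0, 79.0, 81.0],
    },
    index=[101, 102, 103],
)

# .loc with a row label returns that whole row as a Series.
row = df.loc[103]
print(isinstance(row, pd.Series))  # True

# The row's own index holds the DataFrame's column labels.
print(list(row.index))  # ['city', 'age', 'py-score']

# Individual fields can then be read by column label.
print(row["city"])  # Lima
```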
And so we have a pandas Series. Now, if you recall, we also have the cities Series, and its index is the exact same as the DataFrame’s. However, with Series objects, contrary to when we were working with a DataFrame, where we had to use the .loc accessor, we can access a value by its index label directly, just using bracket notation. So that’s one key difference between Series objects and DataFrame objects. So if we go ahead and try that, we’ll directly get the only value stored at that index label. Whereas when we used the index value on a DataFrame, we needed to use the .loc accessor, and we saw then, of course, that this returns a whole Series. With a Series object, using one of the values of the index gives us a single value.
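The difference is easier to see side by side. This is a minimal sketch on hypothetical stand-in data, not the DataFrame from the lesson; the label 102 and the names df, single_value, and whole_row are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical stand-in data; not the DataFrame used in the lesson.
df = pd.DataFrame(
    {
        "city": ["Lisbon", "Oslo", "Lima"],
        "age": [41, 28, 33],
        "py-score": [88.0, 79.0, 81.0],
    },
    index=[101, 102, 103],
)
cities = df["city"]

# On a Series, plain brackets with an index label return one value.
single_value = cities[102]
print(single_value)  # Oslo

# On a DataFrame, the same label needs .loc and returns a whole row.
whole_row = df.loc[102]
print(isinstance(whole_row, pd.Series))  # True

# Plain brackets on a DataFrame look up column labels, so a row label
# that is not also a column name raises KeyError.
try:
    df[102]
except KeyError:
    print("KeyError: 102 is a row label, not a column label")
```

Writing cities.loc[102] gives the same single value and makes the label-based lookup explicit, which can be worth the extra characters when the index holds integers.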
All right! So, with this lesson and the previous lesson, you got a broad overview of the pandas library and some of the basics of creating a pandas DataFrame and also working with the Series objects that make up a DataFrame.
We also quickly went over how you access rows and columns in a DataFrame. In the next lesson, we’ll take a look at other ways to create a pandas DataFrame.