Modifying Values in DataFrames: Label Indices
One of the main things you’re going to be doing with a DataFrame is accessing and modifying its values. Let’s introduce some markup here. We’re going to give this section a level-two heading, and this will be
## Accessing and Modifying Data.
Now, this accesses columns. We may also want to access rows directly. To access rows, we use the .loc accessor, which works with square-bracket notation, much like indexing a list. So, for example, if I want to access the first row of the DataFrame, I just pass in the label.
So I’m going to get an error. Let’s go all the way down here. Basically, I have a KeyError: this is not a valid index label. If we take a look at the labels again, we know that the labels are 12, and so on. So if I’m going to use the .loc accessor to get an individual row, I need to use the actual label name.
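To make the KeyError concrete, here is a minimal sketch. The data and the row labels 10, 11, 12 are assumptions made up for illustration, not the DataFrame from the lesson.

```python
import pandas as pd

# Hypothetical data; the integer labels 10, 11, 12 are an assumption.
df = pd.DataFrame(
    {
        "name": ["Ann", "Bob", "Cara"],
        "city": ["Oslo", "Lima", "Kyoto"],
        "age": [41, 28, 33],
        "py-score": [88.0, 79.0, 81.0],
    },
    index=[10, 11, 12],
)

# 0 is a position, not a label in this index, so .loc raises KeyError.
try:
    df.loc[0]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Using a label that actually exists in the index works.
row = df.loc[11]
print(lookup_failed)
print(row)
```

The point is that .loc never counts rows from zero. It only looks the value up among the index labels.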
So that means that the labels of the index can be any type of hashable Python object. Strings and integers are the usual types of labels that you’ll use. So in this case, for example, if I want to access the row with label 11, then I would use that with the .loc accessor. Now, .loc can also be used to access subsets of your DataFrame, or a sub-DataFrame, just like you would with a NumPy array. And again, with .loc, what you pass in has to be actual labels of the rows and actual labels of the columns.
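As a rough sketch of both points, here is a DataFrame with string labels instead of integers. The data and the labels "a", "b", "c" are hypothetical.

```python
import pandas as pd

# Hypothetical data with string row labels; any hashable object can be a label.
df = pd.DataFrame(
    {
        "name": ["Ann", "Bob", "Cara"],
        "age": [41, 28, 33],
    },
    index=["a", "b", "c"],
)

# A single label returns that row as a Series.
row_b = df.loc["b"]

# Lists of row labels and column labels return a sub-DataFrame.
sub = df.loc[["a", "c"], ["name"]]
print(row_b)
print(sub)
```

Passing a list for both rows and columns keeps the result two-dimensional, even when only one column is selected.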
So I would have a comma separator: before the comma I would pass in the row labels, and after it I would pass in the column labels, just like a two-dimensional NumPy array. And so, for example, if I wanted to get all of the rows for, say, the age and the py-score columns, I’m passing in a list containing the two column names.
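A minimal sketch of that row-comma-column form might look like this. The column names age and py-score come from the lesson, while the data and the row labels are assumptions.

```python
import pandas as pd

# Hypothetical data; the row labels 10, 11, 12 are an assumption.
df = pd.DataFrame(
    {
        "name": ["Ann", "Bob", "Cara"],
        "city": ["Oslo", "Lima", "Kyoto"],
        "age": [41, 28, 33],
        "py-score": [88.0, 79.0, 81.0],
    },
    index=[10, 11, 12],
)

# A bare colon selects every row; the list picks the two columns by label.
subset = df.loc[:, ["age", "py-score"]]

# One row label and one column label return a single value.
bob_age = df.loc[11, "age"]
print(subset)
print(bob_age)
```

The colon plays the same role it does in NumPy slicing: take everything along that axis.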
I want to pick off the labels, so I’ll say x for x in the .index if the index label, which is going to be an integer, is a multiple of 2. So I’ll use the modulo operator (%) and pass in x % 2. If the label is even, I’m going to get 0, and so I want to flip the Boolean value from
and so then we get just the name and the city columns where the row labels are even. So with .loc, the key thing to remember is that the values you pass in to .loc have to be the actual labels of the rows and the columns.
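The even-label selection can be sketched as follows. The data and the labels 10 through 16 are hypothetical, and using not to flip the result of x % 2 is an assumption about how the filter was written; x % 2 == 0 would give the same labels.

```python
import pandas as pd

# Hypothetical data; the integer labels 10..16 are an assumption.
df = pd.DataFrame(
    {
        "name": ["Ann", "Bob", "Cara", "Dev", "Eli", "Fay", "Gus"],
        "city": ["Oslo", "Lima", "Kyoto", "Pune", "Bonn", "Cork", "Graz"],
        "age": [41, 28, 33, 34, 38, 31, 37],
    },
    index=range(10, 17),
)

# x % 2 is 0 (falsy) for even labels, so `not` flips it to True.
even_labels = [x for x in df.index if not x % 2]

# Row labels first, column labels second.
result = df.loc[even_labels, ["name", "city"]]
print(result)
```

The list comprehension produces real index labels, which is exactly what .loc expects.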