Working With Rows and Columns in DataFrames

00:00 A frequent operation that you’ll be doing on a DataFrame is to extract rows or columns. Let’s start with a column, say the city column. Here, you’re going to be using notation that’s similar to looking up a key in a Python dictionary.

00:17 So in this case, you’re going to access the column with the label 'city', and this will return a pandas Series object. A Series is a one-dimensional labeled array, and for now you can think of it as either an entire row or an entire column of a DataFrame.

00:35 So if we check out the type of the 'city' column, we get a pandas Series. Let’s save this column in the variable, say, cities.
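As a rough sketch of what this looks like at the prompt: the DataFrame below is a stand-in for the one used in the lesson. Only the column labels, the row labels 101 to 107, and 'Prague' at label 103 come from the transcript; the other values are assumed for illustration.

```python
import pandas as pd

# Illustrative data: apart from the column labels, the 101-107 row labels,
# and 'Prague' at label 103, these values are assumptions.
df = pd.DataFrame(
    {
        "city": ["Mexico City", "Toronto", "Prague", "Shanghai",
                 "Manchester", "Cairo", "Osaka"],
        "age": [41, 28, 33, 34, 38, 31, 37],
        "py-score": [88.0, 79.0, 81.0, 80.0, 68.0, 61.0, 84.0],
    },
    index=range(101, 108),
)

# Dictionary-style access with the column label returns a Series.
cities = df["city"]
print(type(cities))
print(cities)
```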

00:49 And if we take a look at this again, we see that we not only extracted the data in that column, we also extracted the index, or the row labels. And so a pandas Series object will also contain an .index attribute,

01:07 which, in this case, will be the same as the .index in the DataFrame, because we extracted an entire column of the DataFrame. Another way that you can extract a column is to use dot notation, but this will only work if the name of the column that you want to extract is a string that’s a valid Python identifier. So, for example, if we wanted to extract, say, the age column, we would simply type .age. And then, in this case, we get that Series object.
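A minimal sketch of both points, again on an assumed stand-in DataFrame (the values are illustrative, not taken from the lesson): the extracted column carries the same row labels as the DataFrame, and dot notation works for a column whose name is a valid identifier.

```python
import pandas as pd

# Illustrative data; only the column labels and the 101-107 row labels
# are taken from the lesson.
df = pd.DataFrame(
    {
        "city": ["Mexico City", "Toronto", "Prague", "Shanghai",
                 "Manchester", "Cairo", "Osaka"],
        "age": [41, 28, 33, 34, 38, 31, 37],
    },
    index=range(101, 108),
)

cities = df["city"]

# The Series keeps the row labels of the DataFrame it came from.
same_index = cities.index.equals(df.index)
print(same_index)

# Dot notation is equivalent to bracket notation for this column name.
ages = df.age
print(ages.equals(df["age"]))
```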

01:37 But if we tried this with the Python score column, so .py-score,

01:46 we’re going to get an AttributeError because Python reads the hyphen as a minus sign. It looks for an attribute called py on the DataFrame, which doesn’t exist, and it would then try to subtract something called score from it.

01:58 So if we wanted to extract that py-score column, we’d have to use the bracket notation and simply write out the full column name.
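Here is a small sketch of that failure and the fix, using an assumed stand-in DataFrame with illustrative values. The try block is only there so the example runs from top to bottom.

```python
import pandas as pd

# Illustrative data; the values are assumptions.
df = pd.DataFrame(
    {
        "age": [41, 28, 33, 34, 38, 31, 37],
        "py-score": [88.0, 79.0, 81.0, 80.0, 68.0, 61.0, 84.0],
    },
    index=range(101, 108),
)

try:
    # Python parses this as (df.py) - score, and df has no attribute "py".
    df.py-score
except AttributeError as exc:
    error_name = type(exc).__name__
    print(error_name, exc)

# Bracket notation works for any column label, hyphen included.
py_scores = df["py-score"]
print(py_scores)
```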

02:12 Now let’s talk about extracting rows. The row labels, if you remember, are stored in the .index attribute. We know that these run from 101 up to, but not including, 108, so the last one’s 107.

02:28 So to access a whole row, say with index value 103, we use the .loc accessor. So the way to do this is to write the DataFrame, then .loc, and then square brackets containing the actual label of the row that we want to access, so let’s say 103.

02:50 This returns a pandas Series object as well. Let’s see the data type for that just to make sure.
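Sketched out on an assumed stand-in DataFrame (only 'Prague' at label 103 comes from the lesson, the rest is illustrative), the row lookup might look like this. Note that the index of the returned Series is made up of the DataFrame’s column labels.

```python
import pandas as pd

# Illustrative data; only 'Prague' at label 103 comes from the lesson.
df = pd.DataFrame(
    {
        "city": ["Mexico City", "Toronto", "Prague", "Shanghai",
                 "Manchester", "Cairo", "Osaka"],
        "age": [41, 28, 33, 34, 38, 31, 37],
        "py-score": [88.0, 79.0, 81.0, 80.0, 68.0, 61.0, 84.0],
    },
    index=range(101, 108),
)

# .loc with a row label returns that row as a Series.
row = df.loc[103]
print(type(row))
print(row)

# The row's index holds the column labels of the DataFrame.
print(list(row.index))
```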

02:59 And so we have a pandas Series. Now, if you recall, we also have the cities Series, and its index is exactly the same as the DataFrame’s. However, with a DataFrame we had to use the .loc accessor to look up a row label, whereas with a Series object we can look up a label directly, just using bracket notation like this.

03:26 So that’s one key difference between Series objects and DataFrame objects. So if we go ahead and try that, we’ll directly get the only value for that index, which is 'Prague'.

03:39 Whereas when we use the index value on a DataFrame, we needed to use the .loc accessor, and we saw then, of course, that this returns a whole Series object.

03:50 Whereas if we’re working with a Series object and we use one of the values of the index to access a value, we’re going to get a single value.
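To put the two lookups side by side, here is a brief sketch on an assumed stand-in DataFrame (values other than 'Prague' at label 103 are illustrative). It also shows, as an extra observation not covered in the lesson, that plain brackets on the DataFrame look for a column with that label and fail.

```python
import pandas as pd

# Illustrative data; only 'Prague' at label 103 comes from the lesson.
df = pd.DataFrame(
    {
        "city": ["Mexico City", "Toronto", "Prague", "Shanghai",
                 "Manchester", "Cairo", "Osaka"],
        "age": [41, 28, 33, 34, 38, 31, 37],
    },
    index=range(101, 108),
)

cities = df["city"]

# Series: brackets with an index label return a single value.
city = cities[103]
print(city)

# DataFrame: .loc with the same label returns a whole row as a Series.
row = df.loc[103]
print(type(row).__name__)

# DataFrame: plain brackets look up columns, so a row label raises KeyError.
try:
    df[103]
except KeyError:
    print("df[103] looks for a column labeled 103")
```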

04:02 All right! So, with this lesson and the previous lesson, you got a broad overview of the pandas library and some of the basics of creating a pandas DataFrame and working with the Series objects that make up a DataFrame.

04:17 We also quickly went over how you access rows and columns in a DataFrame. In the next lesson, what we’ll do is we’ll take a look at other ways to create a pandas DataFrame.
