Join us and get access to thousands of tutorials and a community of expert Pythonistas.

Unlock This Lesson

This lesson is for members only. Join us and get access to thousands of tutorials and a community of expert Pythonistas.

Unlock This Lesson

Hint: You can adjust the default video playback speed in your account settings.
Hint: You can set your subtitle preferences in your account settings.
Sorry! Looks like there’s an issue with video playback 🙁 This might be due to a temporary outage or because of a configuration issue with your browser. Please refer to our video player troubleshooting guide for assistance.

Iterating Through the Data

00:00 Now that you can load the worksheets, it’s time to put their data to use. openpyxl has a couple of different ways to iterate through the data in your worksheets and, in some ways, should feel familiar if you’re used to doing this in Excel.

00:11 So, go ahead and open up an interpreter session and do your imports. So, from openpyxl import load_workbook. Say that workbook is going to equal load_workbook with the filename="sample.xlsx", and the sheet is going to equal workbook.active. In a way that’s very similar to Excel, you can slice through the rows and columns using a : (colon) to separate the top-left and bottom-right cells. So let’s say sheet, and you want to access ["A1:C2"], like that.

00:45 You can see here that you actually end up with a tuple of tuples, and in each tuple are the cells. So A1, B1, and C1, and then A2, B2, and C2.

00:56 And I’m actually going to go over to the spreadsheet and change the sheet name to just amazon, so that it prints out a bit better.

01:05 Save that, and then let’s reload the workbook and then reload the sheet.

01:13 Try that again. And yeah, that’s a bit cleaner. So now, you can really see that A1, B1, and C1 and then A2, B2, and C2, just like that.

01:24 Okay. And just like in Excel, if you ever want to access an entire column, you can say sheet and then just put in the column letter. So if you want to see everything, that’s in "A", you can do that.

01:36 And if you look, it’s counting all the way down to row 100, starting from A1. Likewise, if you want to access a couple of columns, you can just put a : between them.

01:49 So "A:B" (A to B) would be like that, and this grabs multiple columns. So, something to note here is you have A1, A2, A3, and so on, and then all the B’s are in a row, too.

02:02 So, of the tuples that are returned, each tuple contains an entire column. If you want to grab an entire row, you can get those the same way. Let’s say you just want to grab row 5, just put in the row number as an integer, run that, and you can see A5, B5, C5, and so on.

02:21 These are all contained in a tuple. openpyxl also provides generators to go through the data, which might feel a bit more like Python than Excel.

02:30 .iter_rows() and .iter_cols() can take a range of rows and columns, and then iterate through the cells. These will also return tuples, which are either entire rows or columns, depending on which one you call.

02:41 So, to try this out, say something like for row in sheet and then call the .iter_rows() method and say that min_row is going to equal 1, max_row is going to equal 2, min_col is going to equal 1, and max_col is going to equal 3, like, so. And then, just print the row.

03:06 There you go! A1, B1, C1, A2, B2, and C2. This will work for columns, too. So, for column in sheet I’m going to call .iter_cols(), and then with the same name parameters—so, min_row=1, max_row=2, min_col=1, and

03:27 max_col=3—and make sure that’s an underscore. And do the same thing, but print column, this time.

03:37 And now you can see, you get A1 and A2, B1 and B2, and C1 and C2. Now, you might be noticing here that this is returning the entire Cell object.

03:46 So, if you just need the values, there’s a parameter values_only that you can set. Take a look at this. It says something like for value in sheet, and this time do rows.

03:57 So, min_row=1, max_row=2, min_col=1, and max_col=3, and now pass in

04:08 values_only set to True. Come back here, print the value, and now you should see the resulting values from those cells, instead of the Cell objects. Finally, if you want to go through the entire worksheet and you don’t have any min or max rows or columns, you can just say something like for row in sheet and then access the .rows property. So if you go print(row) from here, you should see that everything prints out just like that.

04:36 Let me scroll back up…

04:41 And column would be the same way. So, when you do this, it’s the same as calling .iter_rows() or .iter_cols(), but it gets rid of the minimum or maximum parameters and goes through the entire dataset. All right!

04:54 So now, you know how to iterate through the data and it’s now time to see how you can pull this data into Python data structures to really open up the possibilities of what you can do with it.

Become a Member to join the conversation.