Reading Excel Spreadsheets
Before you can start writing spreadsheets, it’s a good idea to know how to read them. For this video, we’re going to use the
sample.xlsx spreadsheet, which, if you open it up, contains a bunch of data from Amazon. You’ll see here things like marketplace, customer_id, and review_id—and just a number of columns here about different watches and the reviews that people left for them.
So let’s go to the terminal—I’m just going to bring this up here— and open up a Python interpreter session. I’m going to use bpython. Go ahead and do the imports, so from
openpyxl go ahead and
All right. So now, just say
sheet and like before, you can say
workbook.active to grab that first sheet that opens. And now, if you take a look at the
sheet, you get this object here, which is a
Worksheet, and conveniently gives you the name of the sheet over here.
01:53 Assigning the entire workbook to a variable and then grabbing the sheets off of it is a common way to access data from an Excel spreadsheet, and we’re going to be doing this quite a bit more throughout the course.
So, if you say something like
sheet and then
["A1"], you can see that now it returns a
Cell. It’ll also tell you the name of the sheet and the cell location, but not the actual data. If you want to see the values, you need to access the
Cell and then access the
.value property off of it, and now you can see
'marketplace'. Let’s take a look at another one, say something like
"F10", and just take the
.value off of it.
Let’s go ahead and do that again, but this time call
.value, and you should get the same value there. Moving forward, we’re mostly going to be accessing cells using this notation, as the letter for column and number for rows is a bit more similar to how you would access things within an Excel worksheet.
So if you’re used to things—like Python—being zero-indexed, just make sure you get that number correct. Another thing to consider also is that you have a couple options when you call that
So, let’s say you say
workbook1 and call
load_workbook() again. You can see that there are a few options here.
read_only does just like what it says and will open up the workbook in a read-only format. The default is
False, but if you say
True, you can actually open up much larger spreadsheets based on the way that
openpyxl handles memory.
Another nice option is this
data_only option down here, which also defaults to
False. If you set this to
True any formulas present in the cells will only come over to Python as their resulting values. If you’ve ever copied cells and pasted the values from them, it’s a very similar result. All right!
Become a Member to join the conversation.