Reading Excel Spreadsheets
Before you can start writing spreadsheets, it’s a good idea to know how to read them. For this video, we’re going to use the
sample.xlsx spreadsheet, which, if you open it up, contains a bunch of data from Amazon. You’ll see here things like marketplace, customer_id, and review_id—and just a number of columns here about different watches and the reviews that people left for them.
00:27 There are about a hundred rows here—actually, exactly 100 rows—so definitely something that we can get some good practice with.
So let’s go to the terminal—I’m just going to bring this up here— and open up a Python interpreter session. I’m going to use bpython. Go ahead and do the imports, so from
openpyxl go ahead and
Then now, instead of creating a new workbook, you can say
workbook = load_workbook() and then pass in the
"sample.xlsx", just like that.
So now you can see that a
Workbook is made up of a number of sheets by calling
workbook.sheetnames, and this will return a list of the titles of each sheet.
So here, you’ll see
'amazon_reviews_us_Watches'. And if you go back, it’s a little small here, but the title of this sheet is also amazon_reviews_us_Watches.
All right. So now, just say
sheet and like before, you can say
workbook.active to grab that first sheet that opens. And now, if you take a look at the
sheet, you get this object here, which is a
Worksheet, and conveniently gives you the name of the sheet over here.
If you want to take a look at just the title, you could say
sheet.title and access the
.title property, and now you can see it over here.
01:53 Assigning the entire workbook to a variable and then grabbing the sheets off of it is a common way to access data from an Excel spreadsheet, and we’re going to be doing this quite a bit more throughout the course.
02:04 Now that you have the sheet, you can access the data inside a couple of different ways. One way is to index the cell by putting in the column and row from the sheet.
So, if you say something like
sheet and then
["A1"], you can see that now it returns a
Cell. It’ll also tell you the name of the sheet and the cell location, but not the actual data. If you want to see the values, you need to access the
Cell and then access the
.value property off of it, and now you can see
'marketplace'. Let’s take a look at another one, say something like
"F10", and just take the
.value off of it.
"G-Shock Men's Grey Sport Watch". All right. Another way to access a cell is by calling the
.cell() method off of the
So you could say something like
sheet.cell() and then pass in something like
This should be the same, because we access the 10th row here, and F is the 6th letter, so column F10. And you can see here that it returns a
Let’s go ahead and do that again, but this time call
.value, and you should get the same value there. Moving forward, we’re mostly going to be accessing cells using this notation, as the letter for column and number for rows is a bit more similar to how you would access things within an Excel worksheet.
A thing to note here, also, is that everything in these worksheets is going to be one-indexed. So here, where you said
column=6, it starts with column
So if you’re used to things—like Python—being zero-indexed, just make sure you get that number correct. Another thing to consider also is that you have a couple options when you call that
So, let’s say you say
workbook1 and call
load_workbook() again. You can see that there are a few options here.
read_only does just like what it says and will open up the workbook in a read-only format. The default is
False, but if you say
True, you can actually open up much larger spreadsheets based on the way that
openpyxl handles memory.
Another nice option is this
data_only option down here, which also defaults to
False. If you set this to
True any formulas present in the cells will only come over to Python as their resulting values. If you’ve ever copied cells and pasted the values from them, it’s a very similar result. All right!
04:24 So, now you can access data in a workbook that already exists. In the next video, you’re going to see how to iterate through that data and use the values that you get.
Become a Member to join the conversation.