Reading Excel Spreadsheets
00:00
Before you can start writing spreadsheets, it’s a good idea to know how to read them. For this video, we’re going to use the sample.xlsx
spreadsheet, which, if you open it up, contains a bunch of data from Amazon. You’ll see here things like marketplace, customer_id, and review_id—and just a number of columns here about different watches and the reviews that people left for them.
00:27 There are about a hundred rows here—actually, exactly 100 rows—so definitely something that we can get some good practice with.
00:35
So let’s go to the terminal—I’m just going to bring this up here— and open up a Python interpreter session. I’m going to use bpython. Go ahead and do the imports, so from openpyxl
go ahead and import load_workbook
.
00:52
Then now, instead of creating a new workbook, you can say workbook = load_workbook()
and then pass in the filename
of "sample.xlsx"
, just like that.
01:04
So now you can see that a Workbook
is made up of a number of sheets by calling workbook.sheetnames
, and this will return a list of the titles of each sheet.
01:14
So here, you’ll see 'amazon_reviews_us_Watches'
. And if you go back, it’s a little small here, but the title of this sheet is also amazon_reviews_us_Watches.
01:28
All right. So now, just say sheet
and like before, you can say workbook.active
to grab that first sheet that opens. And now, if you take a look at the sheet
, you get this object here, which is a Worksheet
, and conveniently gives you the name of the sheet over here.
01:45
If you want to take a look at just the title, you could say sheet.title
and access the .title
property, and now you can see it over here.
01:53 Assigning the entire workbook to a variable and then grabbing the sheets off of it is a common way to access data from an Excel spreadsheet, and we’re going to be doing this quite a bit more throughout the course.
02:04 Now that you have the sheet, you can access the data inside a couple of different ways. One way is to index the cell by putting in the column and row from the sheet.
02:12
So, if you say something like sheet
and then ["A1"]
, you can see that now it returns a Cell
. It’ll also tell you the name of the sheet and the cell location, but not the actual data. If you want to see the values, you need to access the Cell
and then access the .value
property off of it, and now you can see 'marketplace'
. Let’s take a look at another one, say something like "F10"
, and just take the .value
off of it.
02:39
"G-Shock Men's Grey Sport Watch"
. All right. Another way to access a cell is by calling the .cell()
method off of the sheet
.
02:47
So you could say something like sheet.cell()
and then pass in something like row=10
and column=6
.
02:55
This should be the same, because we access the 10th row here, and F is the 6th letter, so column F10. And you can see here that it returns a Cell
.
03:06
Let’s go ahead and do that again, but this time call .value
, and you should get the same value there. Moving forward, we’re mostly going to be accessing cells using this notation, as the letter for column and number for rows is a bit more similar to how you would access things within an Excel worksheet.
03:23
A thing to note here, also, is that everything in these worksheets is going to be one-indexed. So here, where you said column=6
, it starts with column 1
.
03:33
So if you’re used to things—like Python—being zero-indexed, just make sure you get that number correct. Another thing to consider also is that you have a couple options when you call that load_workbook()
.
03:44
So, let’s say you say workbook1
and call load_workbook()
again. You can see that there are a few options here. read_only
does just like what it says and will open up the workbook in a read-only format. The default is False
, but if you say True
, you can actually open up much larger spreadsheets based on the way that openpyxl
handles memory.
04:05
Another nice option is this data_only
option down here, which also defaults to False
. If you set this to True
any formulas present in the cells will only come over to Python as their resulting values. If you’ve ever copied cells and pasted the values from them, it’s a very similar result. All right!
04:24 So, now you can access data in a workbook that already exists. In the next video, you’re going to see how to iterate through that data and use the values that you get.
Become a Member to join the conversation.