Adding Charts
00:00
By now, you should feel pretty comfortable using openpyxl
for a number of different operations when handling Excel spreadsheets. We’re going to wrap up with a very cool topic. You’re going to learn how to add charts to your worksheets with Python. openpyxl
has support for just about all the chart types in Excel, but we’re only going to look at a couple here because the process of creating them is almost identical between each type.
00:23
The big thing to keep in mind is that you need to find the data for the chart, and then the type of chart, and then where you want it. So to get started, create a new script. I’m going to call mine openpyxl_charts.py
.
00:41
So, the first thing you’re going to want to do is create a new workbook. from openpyxl import Workbook
and then from openpyxl.chart import BarChart,
and Reference
.
00:54
Make your new blank workbook
, which will just be a Workbook
, and then grab the active sheet off of it.
01:03
Okay. Now, it’s time to make some sample data. You can say rows
and make a list of lists. We’ll start this off with a "Product"
…
01:14
and "Store"
. This will be, like, some fake sample sales data for a couple products. Then, just throw in some numbers here.
01:38
All right. Now that you have that data, go ahead and append this to the worksheet. So, for row in rows:
sheet.append(row)
. Okay. So now that you have the data set up, you can make the chart, so say, chart = BarChart()
, and then the data
is going to equal a Reference
. Now, define where the data is.
01:58
You can say worksheet=sheet
and the min_row=1
, max_row=8
, min_col=2
, and the max_col=3
. Now, take your chart, add the data to it, pass in the data
, and then the titles_from_data
you can set equal to True
, and that’s because this first row has the titles for what the data represents.
02:28
Then, sheet.add_chart()
, pass in chart
, and then where you want it. So, let’s say something like "E2"
—and that’ll be the top-left corner of where the chart is, or where the chart’s top-left corner will be. So from here, you can say workbook.save()
, and I’m just going to call this something like "chart.xlsx"
. Save it, go ahead and run it.
02:54 No errors, and it’s in the directory, so let’s open it up. And there we go! Here’s the sample data that was appended to the worksheet, and then starting in cell E2, here’s the chart. Now, one thing that’s cool here—because you didn’t pass in the data, you passed in the reference—if you click here, you can see that it’s pointing to the data that’s still on the worksheet.
03:16
That means that a user could edit the data as needed and the chart would then reflect that edited data. So now, let’s take a look at line charts. I’m just going to take all of this code here and comment it out and then start over. I’ll make a new section, call this Line Chart
, and start over with the imports.
03:36
I’m going to import random
, and then from openpyxl import Workbook
, and then from openpyxl.chart import LineChart,
03:47
and Reference
. Like before, make a new Workbook()
,
03:53
and grab the active sheet. Okay. And also, like before, I’m going to make some fake sample data. This will be a little different. rows
is still going to be a list of lists. This will be a blank string and then each of the months.
04:14 Okay? And then here’s going to be a little different.
04:19
I’m going to make some lists like this, where it’s just a number and then blank. That’ll get filled in later on. Now, for row in rows:
sheet.append(row)
.
04:34
Then, for row in sheet.iter_rows()
and this will be starting at the minimum row of 2
, to not include the headers. Then, set max_row=4
, and then min_col
is going to start with 2
because that first column is just blank.
04:54
And then max_col
is going to be 13
to include all 12 months. And then now, for cell in row
—because this is not values_only
, it’s going to return the Cell
object—you can say the Cell
object is going to equal random
and just pass into .randrange()
from 5
to 100
. Okay.
05:15
That should set you up for sample data. Now, you’ll actually make the charts and you can say chart = LineChart()
, and then data = Reference()
and pass in the Worksheet
as the worksheet
—so, sheet
.
05:31
The min_row=2
, and the max_row=4
, min_col=1
, and max_col=13
. Okay. And then go .add_chart()
, pass in data
, and then from_rows=True
and titles_from_data
is also True
.
05:52
Okay, now that the chart
is ready, you can take your sheet
and then add the chart to it, pass in the chart
, and then this time have it go to "C6"
. Now, take your workbook and save it. You can call this one line_Chart.xlsx
.
06:09 If you go ahead and run this again—ah, you’ll see an error.
06:16
This 'line_Chart' is not defined
, because it’s not a string. Now it is! Let’s save this, run this again, and let’s go see what it looks like. Okay, and here you go!
06:31 You’ve got all your sample data here for three different products, and for each month of the year. These are all random, so yours will probably look a little different. You can see that all of these have been plotted out.
06:46
One thing to note here is that when you created the chart, you set this from_rows=True
, which was not the case during the bar chart. If you did not set that equal to True
instead of the chart being made up of each row, it would have gone column by column, so the data would kind of be flipped around.
07:08
So, this chart looks nice and somebody seeing this and seeing 1 through 12 might assume that that’s month, but they might not. openpyxl
will let you add categories to your chart based on your data.
07:20
I’m going to close this out and go back to the script. And now, before adding the chart, you’re going to add categories. Just call this cats
and set this equal to a Reference
. This will be on the same worksheet, so worksheet=sheet
. min_row=1
, and then max _row=1
—this is just to get that first row.
07:42
min_col=2
and max_col=13
. This will get you that first row of month names. Now, you can say chart.set_categories()
and pass in cats
.
07:55 If you rerun the script and then reopen up the chart or spreadsheet, take a look at it and you’ll see those numbers have been replaced with the month names, which makes it a lot more readable for someone to take a quick glance at the chart.
08:09
The last thing that we’ll cover for charts in openpyxl
is how to add axis titles, because you may know that these are months, but you might not.
08:17
And these numbers over here don’t really mean anything at all without a description. So back in the script we’ll make a new section and this will be add axis labels
.
08:29
You can say chart.x_axis.title
and set this equal to "Months"
and then chart.y_axis.title
—and actually, these are properties, so change these underscores to periods. Set this to "Sales"
, then do something like "(per unit)"
. All right, save this, rerun the script, and let’s open it up and see what it looks like. All right, look at that!
08:55
All your sample data is still there and charts are still there, but now you see axis titles: Months and Sales. Cool! So, that just about covers everything you need to know how to add charts to openpyxl
. Like I said before, there are plenty of different types of charts that you can use with openpyxl
, but they all have just about the same workflow to getting them set up, so if you need to use a certain type of chart, take a look at the openpyxl
documentation and you can find the specifics on how to use them there. And with that, we’re going to wrap up the how to write the spreadsheets using openpyxl
section of the course. In the next video, you’re going to take a look at how openpyxl
works with pandas
, which is a popular data analysis library.
karmajna on Sept. 28, 2020
I think for Line Chart Reference, min_col = 1
Become a Member to join the conversation.
Patricio Urrutia on Sept. 13, 2020
Hi Joe, thanks for the course.
I have a doubt regarding the data. For example, on the second chart you have:
I understand min_row = 2 because on the first row we have the “month names” not the data. But, why is not the same for columns? Since in col 1 we have the “product names”…
In other words, in the data, why the columns start with 1 instead of 2?