For more information on concepts covered in this lesson, you can check out the following resources:
An Introduction to a Grammar of Graphics
00:00 You’ll get started by learning about what a grammar of graphics even means. This term comes from an influential book, The Grammar of Graphics, that was written a while ago. Building on it, Hadley Wickham described what he called a layered grammar of graphics and created a library in the R language that implements it, called ggplot. ggplot2 is the currently used version of it. And plotnine, the library that you’ll be working with now, is essentially a port of ggplot2 to Python.
00:30 So, it applies the same structure and the same approach of using a layered grammar of graphics to represent graphics. But it is written for Python, so it will be easy for you to use and integrate into your Python apps.
00:44 Now, what is a layered grammar of graphics? Essentially, you can think of it as putting more and more layers on top of each other to create a graphic. That’s also why it’s called a layered grammar of graphics.
00:56 And there are three important layers. There are a couple more, but you’re going to learn about these three primarily: data as the first base layer, then aesthetics on top of that, which are mappings from the data to specific visible elements in the graph, and then geometric objects, which define how to represent the data points.
01:17 Let’s look at these in a bit more detail. When you think of data, you might just think of a table like this, which is a common way to represent the data, which consists of rows and of columns and has these different data items in there. Now, if you would just apply the data layer to your plotnine graph, then all you would end up with is a gray square.
So, this gray square already has the data in there, but there’s no information on how to display the data, so plotnine can’t do anything more with it than just telling you, “Okay, there’s some data.” But you do need to establish this first layer of data so that you can move on to the second layer, which is aesthetics. And here, as I mentioned before, this is about mapping values that exist in your data to things that you can perceive on the graph. Most importantly, those will be the x position and the y position.
02:21 There’s different ones that you can choose for these as well. But plotnine just applies some good defaults for these for you, so you usually don’t have to worry about those unless you really want to fine-tune your graph.
And if you think about the dataset that you looked at before, you could pick out a lot of things from this dataset, but let’s say you’re going to map this column, class, to the x-axis and you’re going to map this column here, the highway miles that the car can drive per gallon, to the y-axis.
03:37 So, this already looks more like a graph, but it’s still missing the data points. And this is the third important piece of the puzzle, which is the geometric objects. These geometric objects tell plotnine how you want the data points to be represented. So, in this case, you’re telling it that you want little black circles for each data point that’s in the dataset. And with this, you have a complete graph that makes sense. You have a y-axis mapping, you have an x-axis mapping, you have the data underlying all of it, and then the top layer of geometric objects telling plotnine how to display that data. And here, you can read that SUVs, for example, can’t drive a ton of miles on the highway per gallon that they use, so their fuel consumption is much worse than, for example, this subcompact car that sits all the way up there.
04:34 Okay! As a quick recap, what you need to build a plot with plotnine are three layers of the layered grammar of graphics: first, the data, then aesthetics, and third, geometric objects.