Plotting With Pandas

The pandas library has become popular not just for enabling powerful data analysis but also for its handy pre-canned plotting methods.

Interestingly, those plotting methods are really just convenient wrappers around existing matplotlib calls. You can use matplotlib and pandas to produce even more sophisticated visualizations.

00:00 The Pandas library has become popular not just for enabling powerful data analysis, but also for its handy, pre-canned plotting methods. Interestingly, though, Pandas’ plotting methods are really just convenient wrappers around existing Matplotlib calls. We can use Matplotlib and Pandas together to produce even more sophisticated visualizations.

00:28 Before we start scripting, let’s learn a bit about how Pandas works by running it in the interactive shell.

00:36 I’m going to start by importing all three of the required libraries, matplotlib.pyplot, numpy, and finally pandas. Let’s create a pandas.Series, which is a one-dimensional labeled array.

00:54 I’ll call this s and I’ll get it with pd.Series() passing in np.arange(5) to generate an ndarray from 0 to 4 inclusive.

01:11 And for the index, I’ll convert the string 'abcde' into a list.
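The Series described so far can be sketched as follows. This is a minimal reconstruction from the narration, using the conventional np and pd aliases; the exact code typed in the video is not shown here.

```python
import numpy as np
import pandas as pd

# np.arange(5) supplies the values 0 through 4;
# list('abcde') supplies one label per value.
s = pd.Series(np.arange(5), index=list('abcde'))
print(s)
```

Each value can then be looked up by its label, for example s['c'].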

01:18 If I inspect this Series, you can see what it looks like. To plot this, I’ll type ax = s.plot(). A pandas.Series has a .plot() method, which calls the Matplotlib plotting function internally.

01:37 If you recall, that method implicitly tracks the current Axes, and so we can obtain that Axes object by storing it in a variable.

01:48 And just to show that this ax variable is an Axes, I can use the built-in type() function passing in ax. And look at that: AxesSubplot, just like before.

02:02 Remember, we’re tracking the current Figure with pyplot under the hood, and so I can compare the ID of the object returned by the gca() (get current axes) function with the ID of our Axes object.

02:20 And they are the same. This shows that we can use Pandas in a similar way to stateful Matplotlib, but with the additional functionality of Pandas. This also means that we can take a stateless approach, obtaining our Axes object and modifying it manually before plotting the whole Figure.
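A short sketch of that check, assuming a fresh session with no other figures open. The Agg backend line is only there so the snippet runs without a display; leave it out in an interactive shell. The class name printed by type() depends on the installed Matplotlib version.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit in an interactive session
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

s = pd.Series(np.arange(5), index=list('abcde'))

ax = s.plot()              # pandas returns the Matplotlib Axes it drew on
print(type(ax))            # an Axes (sub)class; the name varies by version

# pyplot's "current Axes" is the very object pandas handed back.
same = id(ax) == id(plt.gca())
print(same)
```

Because both names refer to one object, any change made through ax shows up in the figure that pyplot is tracking.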

02:43 I’ll show you how to do that next. I’m here in Visual Studio Code in a new file called plot5.py. We’ll be plotting the moving average of a widely watched financial time series, the CBOE market volatility index, or CBOE VIX. In other words, financial data.

03:08 The first thing we’ll do is grab all of our libraries. The only new one here is matplotlib.transforms, which I will import as mtransforms.

03:22 And of course, we need pandas too. Next, I’ll declare a variable for the URL and initialize it with this string. This links to a CSV file containing dates and their associated volatility.

03:38 We need to turn this into a pandas.Series, so I’ll say vix = pd.read_csv(), passing in the URL to read from as well as some other arguments that help with parsing dates (parse_dates=True) and handling not-available (NA) values, which are marked in the file with a dot (.).
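The effect of those arguments can be tried without the remote file. The sketch below reads a small in-memory CSV instead; the column names and values are made up for illustration, and the lesson's actual URL and full argument list are not reproduced here.

```python
import io
import pandas as pd

# Hypothetical stand-in for the downloaded file: a date column,
# a value column, and a lone dot marking a missing observation.
csv_text = """DATE,VIXCLS
2019-01-02,23.22
2019-01-03,25.45
2019-01-04,.
2019-01-07,21.40
"""

df = pd.read_csv(
    io.StringIO(csv_text),
    index_col=0,        # use the first column (the dates) as the index
    parse_dates=True,   # convert that index to datetimes
    na_values='.',      # read a lone dot as a missing value
)

# Select the single data column as a Series and drop the missing row.
vix = df['VIXCLS'].dropna()
print(vix)
```

Passing a URL string in place of the StringIO object works the same way, since read_csv accepts either.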

04:04 We also need to generate a Series of the 90-day rolling averages, which can be done by chaining the .rolling() and .mean() methods, just like this.
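Here is a sketch of a 90-day rolling mean on synthetic data. It assumes a time-based window ('90D') rather than a fixed count of 90 rows; which of the two the lesson uses is not stated in the transcript.

```python
import numpy as np
import pandas as pd

# Synthetic daily series standing in for the VIX data (values are made up).
dates = pd.date_range('2019-01-01', periods=200, freq='D')
vix = pd.Series(np.linspace(10, 30, 200), index=dates)

# Time-based window: each point is the mean of the observations
# that fall within the 90 days up to and including that date.
ma = vix.rolling('90D').mean()
print(ma.tail())
```

With a time-based window, the earliest dates are averaged over whatever data is available, so the result has no leading gaps.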

04:17 In order to split this data into bins, I’ll use the pandas.cut() function. Each date will be assigned to a bin corresponding to a severity level obtained from the rolling average associated with that date.

04:34 The bins are labeled 0, 1, 2, and 3, and which date goes into which bin is determined by these cut-offs: 14, 18, and 24.
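The binning step might look like the sketch below. The open-ended outer edges (minus and plus infinity) are an assumption, added so that every value lands in one of the four bins; the sample values are made up.

```python
import numpy as np
import pandas as pd

# Made-up rolling-average values, including the cut-offs themselves.
ma = pd.Series([12.0, 14.0, 16.5, 18.0, 21.0, 24.0, 30.0])

# Four bins separated by the cut-offs 14, 18, and 24.
# Intervals are right-inclusive by default, so 14.0 falls in bin 0.
state = pd.cut(ma, bins=[-np.inf, 14, 18, 24, np.inf], labels=[0, 1, 2, 3])
print(state.tolist())  # [0, 0, 1, 1, 2, 2, 3]
```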

04:49 Now I need to decide on a colormap and store it in a variable. I’ll call this cmap and I’ll get it with plt.get_cmap(), passing in the reversed Red-Yellow-Green colormap, just like I did before. To actually create the plot, we can call the .plot() method on our ma object.

05:15 In this case, we want to plot this rolling average as a black line in an 8-by-4-inch figure. Remember, the Pandas .plot() method calls pyplot.plot() under the hood, and so now pyplot is tracking a current Axes.

05:34 Let’s grab it and store it as a variable so that we can further modify it.

05:41 ax = plt.gca(). Now I’ll quickly set some Axes properties with methods that we’ve used before.
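Those plotting and styling steps can be sketched like this, on synthetic data. The label and title strings are placeholders, not necessarily the ones used in the video, and the Agg backend line is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit in an interactive session
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the rolling average (values are made up).
dates = pd.date_range('2019-01-01', periods=200, freq='D')
ma = pd.Series(np.linspace(12, 26, 200), index=dates)

# Black line in an 8x4 inch figure, drawn through pandas.
ma.plot(color='black', figsize=(8, 4))

# Grab the Axes that pyplot is now tracking and adjust it directly.
ax = plt.gca()
ax.set_xlabel('')
ax.set_ylabel('90d moving average')      # placeholder label
ax.set_title('Volatility Regime State')  # placeholder title
ax.grid(False)
```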

05:52 This is nothing new. This code here will use a Matplotlib transform to draw colored bars in the visualization based on those state bins we created earlier.

06:06 This is another way to visualize the level of fear in the marketplace.

06:12 In other words, we’re mapping bins—or severity levels—to a specific color and then drawing those on the screen. And finally, I want to draw a horizontal dashed line at the mean of our VIX data.
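One way to draw such bars is a blended transform combined with fill_between(), sketched below on synthetic data. The colormap name 'RdYlGn_r' is Matplotlib's reversed Red-Yellow-Green map; the sample points 0.2 to 0.8 are an illustrative choice. x_compat=True is an addition for this sketch: it keeps pandas from switching a regular daily index to its own date units, so the dates passed to fill_between() line up with the plotted line.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit in an interactive session
import matplotlib.pyplot as plt
import matplotlib.transforms as mtransforms
import numpy as np
import pandas as pd

# Synthetic rolling average and its severity bins (values are made up).
dates = pd.date_range('2019-01-01', periods=200, freq='D')
ma = pd.Series(np.linspace(12, 26, 200), index=dates)
state = pd.cut(ma, bins=[-np.inf, 14, 18, 24, np.inf], labels=[0, 1, 2, 3])

cmap = plt.get_cmap('RdYlGn_r')  # green (low) through red (high)
ax = ma.plot(color='black', figsize=(8, 4), x_compat=True)

# Blend two coordinate systems: x in data units (dates),
# y in Axes units (0 = bottom edge, 1 = top edge).
trans = mtransforms.blended_transform_factory(ax.transData, ax.transAxes)

# One full-height band per severity level, shown only where that level applies.
for level, color in enumerate(cmap([0.2, 0.4, 0.6, 0.8])):
    ax.fill_between(ma.index, 0, 1, where=(state == level),
                    facecolor=color, transform=trans)
```

Because the y-coordinates are in Axes units, the bands always span the full height of the plot, whatever the data range.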

06:31 We can do that with Matplotlib’s .axhline() method, passing in vix.mean() and some styling properties.
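A sketch of the mean line on synthetic data follows. The styling values and the legend label are illustrative assumptions. Note that XKCD color names are written with no space after the colon, as in 'xkcd:dark grey'.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit in an interactive session
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the VIX series (values are made up).
dates = pd.date_range('2019-01-01', periods=200, freq='D')
vix = pd.Series(np.linspace(10, 30, 200), index=dates)

ax = vix.plot(color='black', figsize=(8, 4))

# Horizontal dashed line at the mean of the whole series.
mean_line = ax.axhline(vix.mean(), linestyle='dashed',
                       color='xkcd:dark grey', alpha=0.6,
                       label='Full-period mean')
ax.legend(loc='upper center')
```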

06:43 And now I will show this with pyplot. If we observe this plot, you can see that the colored bars actually correspond to the black line. A higher average results in more red colors, and a lower average corresponds to more green. It all depends on the bin each date was placed in.

07:05 Now, this course is by no means a dedicated Pandas tutorial, as you can probably tell. We’ve got dedicated tutorials at realpython.com if you’re interested in learning more about how this Pandas code works in-depth.

07:22 The takeaway here is that we can use Pandas to aid in our data analysis and plotting alongside Matplotlib. It just opens the door for new opportunities.

Marco Belo on Oct. 30, 2019

ValueError: Invalid RGBA argument: ‘xkcd: dark grey’

Xavier on Nov. 2, 2019

Re the invalid “dark grey”: permissible XKCD colours can be found here: https://matplotlib.org/3.1.0/tutorials/colors/colors.html#xkcd-colors. It looks like you must also remove the space after the colon, e.g. color='xkcd:darkblue'

Xavier on Nov. 2, 2019

Typo in setting of vix. read_csv argument should be parse_dates=True

Ranit Pradhan on April 6, 2020

TypeError: parser_f() got an unexpected keyword argument ‘pase_dates’

Ranit Pradhan on April 6, 2020

OK, got it. Thank you, Mr. Xavier, it should be parse_dates=True.
