Plotting With Pandas

The pandas library has become popular not just for enabling powerful data analysis but also for its handy pre-canned plotting methods.

Interestingly, those plotting methods are really just convenient wrappers around existing matplotlib calls. You can use matplotlib and pandas to produce even more sophisticated visualizations.

00:00 The pandas library has become popular not just for enabling powerful data analysis, but also for its handy, pre-canned plotting methods. Interestingly, though, pandas' plotting methods are really just convenient wrappers around existing Matplotlib calls. We can use Matplotlib and pandas together to produce even more sophisticated visualizations.

00:28 Before we start scripting, let’s learn a bit about how Pandas works by running it in the interactive shell.

00:36 I’m going to start by importing all three of the required libraries, matplotlib.pyplot, numpy, and finally pandas. Let’s create a pandas.Series, which is a one-dimensional labeled array.

00:54 I’ll call this s and I’ll get it with pd.Series() passing in np.arange(5) to generate an ndarray from 0 to 4 inclusive.

01:11 And for the index, I’ll convert the string 'abcde' into a list.
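The Series described above can be reproduced with a short sketch. It uses only the calls named in the transcript (`pd.Series()`, `np.arange(5)`, and a list built from the string 'abcde'):

```python
import numpy as np
import pandas as pd

# One-dimensional labeled array: the values 0..4 with the labels 'a'..'e'.
s = pd.Series(np.arange(5), index=list("abcde"))

# Inspect the Series: each label is printed next to its value.
print(s)

# Values can be looked up by label as well as by position.
print(s["c"])
```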

01:18 If I inspect this Series, you can see what it looks like. To plot this, I’ll type ax = s.plot(). A pandas.Series has a .plot() method, which calls the matplotlib plotting method internally.

01:37 If you recall, that method implicitly tracks the current Axes, and so we can obtain that Axes object by storing it in a variable.

01:48 And just to show that this ax variable is an Axes, I can use the built-in type() function passing in ax. And look at that: AxesSubplot, just like before.

02:02 Remember, we’re tracking the current Figure with pyplot under the hood, and so I can compare the ID of the object returned by the gca() (get current axes) function with the ID of our Axes object.

02:20 And they are the same. This shows that we can use Pandas in a similar way to stateful Matplotlib, but with the additional functionality of Pandas. This also means that we can take a stateless approach, obtaining our Axes object and modifying it manually before plotting the whole Figure.
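A minimal sketch of this check, assuming a non-interactive backend so that it also runs without a display (the `matplotlib.use("Agg")` line and the title text are additions for illustration, not part of the lesson):

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless backend; drop this line in an interactive shell

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

s = pd.Series(np.arange(5), index=list("abcde"))

# Series.plot() draws through Matplotlib and returns the Axes it drew on.
ax = s.plot()

# Older Matplotlib releases report AxesSubplot here; newer ones report Axes.
print(type(ax))

# pyplot tracks that same object as the current Axes.
same_object = id(ax) == id(plt.gca())
print(same_object)

# Because ax is an ordinary Axes, it can be modified directly afterwards.
ax.set_title("Series plotted through pandas")  # illustrative title
```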

02:43 I’ll show you how to do that next. I’m here in Visual Studio Code in a new file called plot5.py. We’ll be plotting the moving average of a widely watched financial time series, the CBOE market volatility index, or CBOE VIX. In other words, financial data.

03:08 The first thing we’ll do is grab all of our libraries. The only new one here is matplotlib.transforms, which I will import as mtransforms.

03:22 And of course, we need pandas too. Next, I’ll declare a variable for the URL and initialize it with this string. This links to a CSV file containing dates and their associated volatility.

03:38 We need to turn this into a pandas.Series, so I’ll say vix = pd.read_csv(), passing in the URL to read from as well as some other arguments that will help with interpreting dates and removing missing (not available) values, which are marked in the file with a dot (.).
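As a rough sketch of this loading step, the example below reads a few made-up rows from an in-memory string instead of the real URL, so it runs offline. The column names `DATE` and `VIXCLS` and all the values are assumptions for illustration. The exact arguments used in the video are not shown in the transcript; `parse_dates=True` and `na_values="."` are one way to get the behavior described.

```python
from io import StringIO

import pandas as pd

# Made-up sample rows standing in for the downloaded file; "." marks a missing value.
csv_text = """DATE,VIXCLS
2020-01-02,12.5
2020-01-03,14.0
2020-01-06,.
2020-01-07,13.8
"""

vix = (
    pd.read_csv(
        StringIO(csv_text),  # a URL or a local file path can be passed here instead
        index_col=0,         # use the date column as the index
        parse_dates=True,    # parse the index as dates (note the spelling)
        na_values=".",       # treat "." as a missing value
    )
    .squeeze("columns")      # single-column DataFrame -> Series
    .dropna()                # drop the rows that were marked with "."
)

print(vix)
```

Passing a local path in place of the `StringIO` object is also a workaround when the download itself fails because of a network problem.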

04:04 We also need to generate a Series of the 90-day rolling averages, which can be done by chaining the .rolling() and .mean() methods, just like this.
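A small sketch of the rolling-average step on synthetic data. The series here is a made-up ramp, not VIX data, and the time-based window string `"90D"` is an assumption about how the 90-day window is specified:

```python
import numpy as np
import pandas as pd

# Synthetic daily series (illustration only): the values 0, 1, 2, ... on consecutive days.
dates = pd.date_range("2020-01-01", periods=200, freq="D")
vix = pd.Series(np.arange(200, dtype=float), index=dates)

# Time-based window: each point is averaged with the observations
# from the preceding 90 calendar days.
ma = vix.rolling("90D").mean()

print(ma.tail())
```

With a time-based window, the first dates are averaged over however many observations exist so far, so the result has no leading missing values.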

04:17 In order to split this data into bins, I’ll use the pandas.cut() function. Each date will be assigned to a bin corresponding to a severity level obtained from the rolling average associated with that date.

04:34 The bins are labeled 0, 1, 2, and 3, and which date goes into which bin is determined by these cut-offs: 14, 18, and 24.
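The binning can be sketched with `pandas.cut()` and the cut-offs named above. The input values are invented for illustration, and closing the outer bins with infinities is an assumption about how values below 14 and above 24 are handled:

```python
import numpy as np
import pandas as pd

# Illustrative rolling-average values (made up), one per date.
ma = pd.Series([10.0, 14.0, 16.0, 20.0, 30.0])

# Four bins separated by the cut-offs 14, 18, and 24.
# Each interval includes its right edge, so 14.0 falls into bin 0.
state = pd.cut(ma, bins=[-np.inf, 14, 18, 24, np.inf], labels=[0, 1, 2, 3])

print(state)
```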

04:49 Now I need to decide on a color map and store it in a variable. I’ll call this cmap and I’ll get it with plt.get_cmap() passing in Red-Yellow-Green reversed, just like I did before. To actually create the plot, we can call the .plot() method on our ma object.
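For reference, a sketch of the color map lookup. `"RdYlGn_r"` is Matplotlib's name for the reversed Red-Yellow-Green map; the four sample positions are an assumption chosen to give one color per bin:

```python
import matplotlib.pyplot as plt

# The "_r" suffix reverses the map: low values are green, high values are red.
cmap = plt.get_cmap("RdYlGn_r")

# Calling the color map with a number between 0 and 1 returns an RGBA tuple.
# Four evenly spaced positions give one color per severity level.
colors = [cmap(x) for x in (0.2, 0.4, 0.6, 0.8)]

for level, rgba in enumerate(colors):
    print(level, rgba)
```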

05:15 In this case, we want to plot this rolling average as a black line in an 8 by 4 figure. Remember, the pandas .plot() method calls pyplot’s plotting function under the hood, and so now pyplot is tracking a current Axes.

05:34 Let’s grab it and store it as a variable so that we can further modify it.

05:41 ax = plt.gca(). Now I’ll quickly set some Axes properties with methods that we’ve used before.
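A sketch of these plotting steps on synthetic data. The ramp values, the label strings, and the headless backend line are assumptions for illustration; the specific Axes properties set in the video are not listed in the transcript.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the VIX series and its 90-day rolling average.
dates = pd.date_range("2020-01-01", periods=200, freq="D")
vix = pd.Series(np.linspace(10, 30, 200), index=dates)
ma = vix.rolling("90D").mean()

# pandas creates the figure and draws the line through pyplot.
ma.plot(color="black", figsize=(8, 4), label="VIX 90d MA")

# pyplot now tracks the Axes that pandas drew on, so we can grab it.
ax = plt.gca()

# Ordinary Axes methods work from here on (illustrative label texts).
ax.set_xlabel("")
ax.set_ylabel("90d moving average")
ax.set_title("Volatility Regime State")
ax.grid(False)
```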

05:52 This is nothing new. This code here will use a Matplotlib transform to draw colored bars in the visualization based on those state bins we created earlier.

06:06 This is another way to visualize level of fear in the marketplace.

06:12 In other words, we’re mapping bins—or severity levels—to a specific color and then drawing those on the screen. And finally, I want to draw a horizontal dashed line at the mean of our VIX data.
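One way to draw such colored bars is a blended transform combined with `fill_between()`. The sketch below runs on synthetic data and makes several assumptions: the data is a made-up ramp, the color positions are arbitrary, and the line is drawn with `ax.plot()` rather than through pandas so that the x-axis uses plain Matplotlib dates.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display

import matplotlib.pyplot as plt
import matplotlib.transforms as mtransforms
import numpy as np
import pandas as pd

# Synthetic rolling average and its severity bins (illustration only).
dates = pd.date_range("2020-01-01", periods=200, freq="D")
ma = pd.Series(np.linspace(10, 30, 200), index=dates).rolling("90D").mean()
state = pd.cut(ma, bins=[-np.inf, 14, 18, 24, np.inf], labels=[0, 1, 2, 3])

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(ma.index, ma.to_numpy(), color="black")

# Blended transform: x is in data coordinates (dates), while y is in
# Axes coordinates, so 0 to 1 spans the full height of the plot.
trans = mtransforms.blended_transform_factory(ax.transData, ax.transAxes)

cmap = plt.get_cmap("RdYlGn_r")
for level, color in enumerate(cmap([0.2, 0.4, 0.6, 0.8])):
    # Shade only the dates whose bin matches this severity level.
    ax.fill_between(
        ma.index, 0, 1,
        where=(state == level).to_numpy(),
        facecolor=color,
        transform=trans,
    )
```

Because y is expressed in Axes coordinates, the bars keep filling the plot vertically even if the y-limits change later.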

06:31 We can do that with Matplotlib’s .axhline() method passing in the vix.mean() and some styling properties.
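A sketch of the mean line, using made-up values in place of the VIX data. Note that a named XKCD color must be written without a space after the colon: `'xkcd:dark grey'` is accepted, while `'xkcd: dark grey'` raises a ValueError.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display

import matplotlib.colors as mcolors
import matplotlib.pyplot as plt
import pandas as pd

vix = pd.Series([12.0, 18.0, 30.0])  # made-up values for illustration

fig, ax = plt.subplots()

# Horizontal dashed line across the whole Axes at the mean of the data.
line = ax.axhline(
    vix.mean(),
    linestyle="dashed",
    color="xkcd:dark grey",  # no space after the colon
    alpha=0.6,
    label="Full-period mean",
)

# A color string can be checked before it is used.
print(mcolors.is_color_like("xkcd:dark grey"))
print(mcolors.is_color_like("xkcd: dark grey"))
```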

06:43 And now I will show this with pyplot. If we observe this plot, you can see that the colored bars actually correspond to the black line. A higher average results in more red colors, and a lower average corresponds to more green. It all depends on the bin each date was placed in.

07:05 Now, this course is by no means a dedicated Pandas tutorial, as you can probably tell. We’ve got dedicated tutorials at realpython.com if you’re interested in learning more about how this Pandas code works in-depth.

07:22 The takeaway here is that we can use Pandas to aid in our data analysis and plotting alongside Matplotlib. It just opens the door for new opportunities.

Marco Belo on Oct. 30, 2019

ValueError: Invalid RGBA argument: ‘xkcd: dark grey’

Xavier on Nov. 2, 2019

Re the invalid “dark grey”, permissible XKCD colours can be found here https://matplotlib.org/3.1.0/tutorials/colors/colors.html#xkcd-colors Looks like you must also remove the space after the colon, e.g. color='xkcd:darkblue'

Xavier on Nov. 2, 2019

Typo in setting of vix. read_csv argument should be parse_dates=True

Ranit Pradhan on April 6, 2020

TypeError: parser_f() got an unexpected keyword argument ‘pase_dates’

Ranit Pradhan on April 6, 2020

OK, got it. Thank you, Mr. Xavier. It should be parse_dates=True.

patientwriter on Feb. 24, 2022

The very idea that we should use pandas, of all things, in order to simplify working with matplotlib, tells you all you need to know about how crazy complex and unpythonic matplotlib is to begin with. The idea of bringing matlab functionality into python is fine, but matplotlib clearly was far too slavish in following matlab’s original implementation rather than transforming it to python equivalents. For the core devs to abandon Pylab was no small concession to this reality. Thus, it is no wonder there are now so many alternative plotting libraries in the python ecosystem.

Bartosz Wilk on Aug. 18, 2022

Traceback (most recent call last):
  File "/Volumes/Work/realpython/plotting_with_matplotlib/plotting_with_panda.py", line 31, in <module>
    ax.axhline(vix.mean(), linestyle='dashed', color='xkcd: dark grey', alpha=0.6, label='Full-period mean', marker='')
  File "/Volumes/Work/realpython/venv/lib/python3.10/site-packages/matplotlib/axes/_axes.py", line 737, in axhline
    l = mlines.Line2D([xmin, xmax], [y, y], transform=trans, **kwargs)
  File "/Volumes/Work/realpython/venv/lib/python3.10/site-packages/matplotlib/lines.py", line 370, in __init__
    self.set_color(color)
  File "/Volumes/Work/realpython/venv/lib/python3.10/site-packages/matplotlib/lines.py", line 1030, in set_color
    mcolors._check_color_like(color=color)
  File "/Volumes/Work/realpython/venv/lib/python3.10/site-packages/matplotlib/colors.py", line 130, in _check_color_like
    raise ValueError(f"{v!r} is not a valid value for {k}")
ValueError: 'xkcd: dark grey' is not a valid value for color

mindconnect dot cc on April 1, 2023

Replace color='xkcd: dark grey' with color='#444'.

walterrieppi on Feb. 20, 2024

I think the URL has changed, or something else is wrong, because I get a urlopen error [WinError 10060].

Bartosz Zaczyński RP Team on Feb. 20, 2024

@walterrieppi The URL is still very much alive. It might be some kind of a network problem. Try downloading the file manually and passing the path to a local file instead of the URL to pandas.
