Working With a Stocks Dataset

In this lesson, you’ll learn how to access stock data and visualize it using Vincent. This video concludes the course.

00:00 Seeing that we have some stuff—well, let’s get a bit of a different dataset. This is kind of simple and rudimentary. Let’s maybe take a dataset from, like, a stock site or something like that.

00:09 Luckily, there’s something built in Pandas that allows us to get new datasets from sites like Yahoo. So we’re just going to do a few imports to get Vincent up and running.

00:21 We’re going to go from vincent.ipynb import init_d3, init_vg, and display_vega. Once you have those three, we’re going to go init_d3(),

00:41 add the JavaScript needed to do that one, as well as the JavaScript needed to do this one. Once we have that, we’re going to go from pandas—and this is what lets us get the real-time dataset from Yahoo, the DataReader class.

00:58 We’re going to assign that to some data. So we’re going to go DataReader(),

01:03 we’re going to ask for the stock of one of my favorite companies, Tesla. We’re then going to ask for the source 'yahoo', and we’re also going to specify a start time.

01:16 So the company went public in 2010, let’s get all the data from 2010. And that’ll give us a DataFrame, hopefully.

01:28 Oh, typo there. And that gives us a DataFrame with 777 entries from June 29th, 2010 until today. What we need to do, though, is just get a slice. So as I said before, the .head() allows us to get a smaller section, so when you try to print it won’t group up like it has here.

01:46 Just so you can see that. So you can see, it has many columns: it has an open price, a high price, a low price, a close price, adjusted close, and volume.
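
As a rough illustration of the columns listed above, here is a minimal sketch that builds a small stand-in DataFrame by hand instead of downloading one. The variable name `data` and every number in it are assumptions (round placeholder values, not real Tesla quotes); only the column names come from the lesson.

```python
import pandas as pd

# Hypothetical stand-in for the downloaded stock data.
# All values are made-up placeholders, not real quotes.
dates = pd.date_range("2010-06-29", periods=8, freq="B")  # business days only
data = pd.DataFrame(
    {
        "Open": [20.0, 21.0, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0],
        "High": [22.0, 23.0, 24.0, 25.0, 26.0, 27.0, 28.0, 29.0],
        "Low": [19.0, 20.0, 21.0, 22.0, 23.0, 24.0, 25.0, 26.0],
        "Close": [21.0, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0, 28.0],
        "Adj Close": [21.0, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0, 28.0],
        "Volume": [1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700],
    },
    index=dates,
)

# .head() returns only the first five rows, which keeps the printout short.
first_rows = data.head()
print(first_rows)
print(list(data.columns))
```

Calling `.head()` on the real, much longer DataFrame works the same way: it leaves the data untouched and only limits what gets displayed.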

01:55 So, that allows us to figure out what we’re interested in. We want to see what the daily highs were since the beginning of their stock. So to do that, which I hadn’t mentioned before, you can slice along columns, which will return a Series, as you saw before.

02:12 So let’s say we’re interested in the high. All we’re going to do is go high is equal to that, and that will be a Series. So we have to print high, and that’ll give us a timeseries object, like we dealt with in the first example, with a date and a high price.
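
The column selection described here can be sketched as follows. This is a minimal example under assumptions: the tiny `data` frame and its values are placeholders invented for illustration, and the column label `"High"` is taken from the column list mentioned in the lesson.

```python
import pandas as pd

# Hypothetical miniature of the stock DataFrame (placeholder values).
dates = pd.date_range("2010-06-29", periods=3, freq="D")
data = pd.DataFrame(
    {"High": [25.0, 30.0, 26.0], "Low": [17.0, 23.0, 20.0]},
    index=dates,
)

# Selecting a single column by its label returns a Series
# that keeps the DataFrame's date index.
high = data["High"]
print(type(high).__name__)
print(high)
```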

02:27 So, let’s go and graph that. What we’re going to need to do is we’re going to need to resample so our axes are all the same. Even though the dataset is set to be on days, it actually has a time attached to it if you were to inspect it further.

02:41 So we need to resample it so it’s grouped together per day. So we’re going to go high = high.resampled(),

02:53 daily ('D'). And how do we want to sample that? We want to make that a sum, and there’s only one per day, so it won’t actually change any of our values.
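
The daily grouping can be sketched on a small hand-made Series. This is an illustration under assumptions, not the lesson's exact code: the timestamps and values are placeholders, the method is spelled `resample` as in current pandas, and `min_count=1` is an added choice so that calendar days with no reading come out as missing values rather than as a sum of zero.

```python
import pandas as pd

# Placeholder readings: one per trading day, each with a time of day attached.
# 2010-07-03 and 2010-07-04 fall on a weekend, so they have no reading.
stamps = pd.to_datetime(
    [
        "2010-06-29 16:00",
        "2010-06-30 16:00",
        "2010-07-01 16:00",
        "2010-07-02 16:00",
        "2010-07-05 16:00",
    ]
)
high = pd.Series([25.0, 30.0, 26.0, 23.0, 20.0], index=stamps, name="High")

# Group by calendar day. With one reading per day the sum equals that reading.
# min_count=1 marks days without any reading as NaN instead of 0.
daily = high.resample("D").sum(min_count=1)

# Drop the empty days to get back one row per trading day, stamped at midnight.
trading_days = daily.dropna()
print(daily)
print(trading_days)
```

Resampling to `'D'` produces a row for every calendar day in the range, so the weekend shows up in `daily` even though the original values are unchanged; whether to keep or drop those rows depends on how the chart should treat gaps.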

03:06 Once we have that, we’re going to declare a Line from Vincent, so we’re going to go vincent.Line() and that’ll give us the structure of a line graph. Then we’re going to pass in some tabular data,

03:19 tabular data being the high values here that we just collected, high. And then we’re going to simply go display_vega(line). Now it should return to us a graph of all the…

03:38 It is .resample(), not .resampled(). And that should give us a daily trend of the high prices for all the stuff. Now, this is a bit squished.

03:47 Vincent allows you to edit the way the thing looks as well, so you can go line.update(), you can give it some padding.

04:12 So, as you can see, we get a beautiful graph here with a nice structure over the period of the time that we selected—over multiple years. And then we have year gaps and breakages.

04:22 So, that’s your introduction to Vincent and Pandas! Very powerful tools to get some impressive graphics up and running.

Sciencificity on March 14, 2019

Hi there, I have installed vincent; however, from vincent.ipynb import init_d3, init_vg, display_vega causes this error: ModuleNotFoundError: No module named 'vincent.ipynb'. Any guidance? Thanks.

Mahidhar Nyayapati on March 15, 2019

Need more examples to illustrate the power of Vincent… Why should I use Vincent when I have tools like matplotlib and plotly?

charliem22 on June 29, 2019

I had the same issue as you did RE vincent. I think the core problem is that this tutorial is way out of date. It’s my understanding (although I am just coming to data science via Python) that matplotlib and plotly are far better plotting tools. I’m going to work on getting up to speed with them.

Bradley Grant on Nov. 14, 2019

Hello from 2019! This code is 6 years old and a lot of things have changed:

  • The Pandas datareader is now a separate package. You’ll need to pip install pandas-datareader, then from pandas_datareader import data, wb
  • Consider using stock_data = data.DataReader('tsla', 'yahoo', start='01-01-2010')
  • The resample method now looks like high = high.resample('D').sum()
  • The vincent imports are now import vincent and vincent.core.initialize_notebook()
  • Pass the data directly into the Line object like this: line = vincent.Line(stock_data)
  • Display the line by calling display(line)

Olivier Rachoin on Nov. 21, 2019

Dears, thank you for the tutorial, but we need a clearer explanation of the power of Vincent. matplotlib & plotly are already showing great results; where does vincent stand against the two mentioned above?

A Janifal on May 13, 2020

Hello from 2020, =) more tutorials about Vincent n Pandas please

Louis Voet on Aug. 26, 2020

I think another line in this tutorial might be outdated. The following did not seem to work for me:

from pandas.io.data import DataReader

It gave me this error:

ModuleNotFoundError: No module named 'pandas.io.data'

It was fixed by changing the import line to:

from pandas_datareader import DataReader

sumeet87 on May 29, 2021

not very informative, seemed rushed through.
