Importing CSV Data Into a Pandas DataFrame

This first video of the course defines the course objectives. You’ll also learn how to configure the packages used during the course, explore the dataset, and load it into a Pandas DataFrame.

00:00 We’re going to be talking about Pandas DataFrames. This time we’re going to dive deeper into the methods that DataFrames provide: inspecting datasets, slicing and dicing a DataFrame, gathering statistics such as mean and median, and working with things like .groupby().

00:23 Seeing as we’ll need some data to work with when dealing with DataFrames, I’ve decided to pull up Kevin Durant’s 2012-2013 basketball season stats. We have stats on the number of minutes he’s played, who he’s played against, his age, what game, number of free throws—all kinds of stuff.

00:41 And that is an interesting set of data we can use to gather some stuff. This is kind of cool, because this is my first year playing in a fantasy basketball league, so this might be fun just to explore what information we can gather on particular players, seeing how they do against certain teams, and maybe even see things about their play type and how long they’ve played and how many points they’ll score in a particular minute. So, we have this data.

01:05 What I went ahead and did was import this data into our Python Notebook. First, I imported vincent, which we’ll use to visualize some of the statistical information we gather later on. I’ve also imported pandas, and specifically the DataFrame and Series objects.

01:22 We’ve initialized vincent to work with the Python Notebook.

01:26 And we set our print .max_columns to None. What this allows us to do is render long tables like so. These tables here are quite long and won’t render sometimes, so you’ll get a compressed view of the DataFrame.

01:38 This setting expands the view, so we get as many columns as needed along with a scroll bar, as you can see here. I have then taken all the column names and written them out in a list, so those will be our column names.
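The display setting described above can be sketched as follows. This is a minimal sketch, not the notebook from the video: the transcript names the option `print .max_columns`, while this example uses `display.max_columns`, which is the option name in current pandas. The 30-column frame and its `col_N` names are made up purely to show the effect.

```python
import pandas as pd
from pandas import DataFrame

# Show every column instead of collapsing wide frames with "..."
pd.set_option("display.max_columns", None)

# Hypothetical wide frame (30 columns), used only to demonstrate the setting
wide = DataFrame(
    [list(range(30))],
    columns=[f"col_{i}" for i in range(30)],
)

# With the option set to None, the printed frame includes all 30 columns
print(wide)
```

In a notebook, a frame rendered this way typically becomes horizontally scrollable rather than truncated.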

01:52 We then took the kevin.csv file you saw a second ago and imported it into pandas with the column names I listed above, and we end up with something similar to this.
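Loading a CSV file with your own list of column names might look like the sketch below. It assumes the file has no header row. The in-memory text stands in for kevin.csv, and both the rows and the column names are invented placeholders, not the real stats or the names used in the video.

```python
import io

import pandas as pd

# Hypothetical stand-in for kevin.csv: made-up rows with no header line
csv_text = (
    "1,2012-11-01,AAA,38,23\n"
    "2,2012-11-02,BBB,41,23\n"
    "3,2012-11-04,CCC,39,22\n"
)

# Hypothetical column names, written out as a list
column_names = ["game", "date", "opponent", "minutes", "points"]

# names= assigns the labels; with names given, pandas treats the
# first line of the file as data rather than as a header
data = pd.read_csv(io.StringIO(csv_text), names=column_names)
print(data)
```

If the file does contain its own header row, passing `header=0` together with `names` replaces that row with the supplied names instead of reading it as data.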

02:06 The .head() method lets us specify how many rows to return from the top of the DataFrame.
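As a quick illustration of `.head()`, here is a minimal sketch on a small made-up frame (the column names and values are placeholders, not the course data):

```python
import pandas as pd

# Hypothetical ten-row frame used only to demonstrate .head()
data = pd.DataFrame(
    {
        "game": list(range(1, 11)),
        "points": [20, 25, 31, 18, 27, 22, 35, 29, 24, 30],
    }
)

first_five = data.head()    # no argument: the first 5 rows
first_three = data.head(3)  # explicit count: the first 3 rows

print(first_three)
```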

Anonymous on June 1, 2019

where do i go to get the “kevin.csv” file that you are using?

Schumi Chou on June 21, 2019

I was wondering the same question - could RealPython help to share “kevin.csv” which this course uses? So that we can play with it directly to have more practical impression and experience. Thanks so much!

Dan Bader RP Team on June 21, 2019

Thanks for the heads up, you can download the CSV file at basketball-reference.com. Just click on Share & more, then Get table as CSV (for Excel), and you can copy the table and save it to a file named kevin.csv.

olagappanmuthu on Dec. 28, 2019

How do I install the “vincent” package?

pshekhar2707 on March 5, 2020

To install the vincent package, I used the following command at the Anaconda prompt (logged in as admin): conda install -c conda-forge vincent

pshekhar2707 on March 5, 2020

After reading the file from the site Dan mentioned, you’ll notice two columns in the DataFrame named 'Unnamed: 5' and 'Unnamed: 7'.

So we need to rename those columns: data.rename(columns={'Unnamed: 5': 'Home_Away', 'Unnamed: 7': 'Win_Loss'}, inplace=True)

gmodelgado on March 15, 2020

May I have the course notebook?

Ricky White RP Team on March 16, 2020

Hi gmodelgado. There is not a notebook that accompanies this course. Sorry.

The link to the CSV file, however, is above.

Hung Chua on May 17, 2020

pip3 list shows that I’ve got vincent, but somehow it won’t load in Jupyter. No problems with pandas.
