Using the ColumnDataSource Object

This video covers Bokeh’s ColumnDataSource object. The ColumnDataSource is the foundation for passing data to the glyphs in your visualization. Its primary function is to map names to the columns of your data, which makes it easier to reference data elements by name when you build a visualization.
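To make the name-to-column mapping concrete, here is a minimal sketch that builds a ColumnDataSource from a plain dictionary. The column names mirror the ones used in the lesson, but the values are made up for illustration and are not taken from the NBA data set.

```python
from bokeh.models import ColumnDataSource

# Made-up sample values, not taken from the lesson's NBA data
data = {
    'stDate': ['2017-10-17', '2017-10-18', '2017-10-19'],
    'gameWon': [1, 1, 2],
}

# Each dictionary key becomes a named column in the source
source = ColumnDataSource(data=data)

# The names are what you later pass to glyph methods such as fig.step()
print(source.column_names)

# The underlying values are reachable by the same name
print(source.data['gameWon'])
```

When a glyph method receives source=source, the strings you pass for x and y are looked up as column names in that source rather than treated as literal values.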

For more information about integrating data sources, check out the Bokeh user guide’s section on the ColumnDataSource and the other source objects available.

Bokeh provides a helpful list of CSS color names categorized by their general hue. Also, htmlcolorcodes.com is a great site for finding CSS, hex, and RGB color codes.

File: read_nba_data.py

import pandas as pd 

# Read the csv files
player_stats = pd.read_csv('data/2017-18_playerBoxScore.csv',
                           parse_dates=['gmDate'])
team_stats = pd.read_csv('data/2017-18_teamBoxScore.csv',
                          parse_dates=['gmDate'])
standings = pd.read_csv('data/2017-18_standings.csv',
                         parse_dates=['stDate'])

# Create west_top_2
west_top_2 = (standings[(standings['teamAbbr'] == 'HOU') | 
              (standings['teamAbbr'] == 'GS')]
              .loc[:, ['stDate', 'teamAbbr', 'gameWon']]
              .sort_values(['teamAbbr', 'stDate']))

File: WestConfTop2.py

# Bokeh libraries
from bokeh.io import output_file
from bokeh.plotting import figure, show
from bokeh.models import ColumnDataSource, CDSView, GroupFilter

# Import the data
from read_nba_data import west_top_2

# Output to static HTML file
output_file('west_top_2_standings_race.html',
            title='Western Conference Top 2 Teams Wins Race')

# Isolate the data for the Rockets and Warriors
rockets_data = west_top_2[west_top_2['teamAbbr'] == 'HOU']
warriors_data = west_top_2[west_top_2['teamAbbr'] == 'GS']

# Create a ColumnDataSource object for each team
rockets_cds = ColumnDataSource(rockets_data)
warriors_cds = ColumnDataSource(warriors_data)

# Create and configure the figure
fig = figure(x_axis_type='datetime',
             plot_height=300, plot_width=600,
             title='Western Conference Top 2 Teams Wins Race, 2017-18',
             x_axis_label='Date', y_axis_label='Wins',
             toolbar_location=None)

# Render the race as step lines
fig.step('stDate', 'gameWon', 
         color='#CE1141', legend='Rockets', 
         source=rockets_cds)
fig.step('stDate', 'gameWon', 
         color='#006BB6', legend='Warriors', 
         source=warriors_cds)

# Move the legend to the upper left corner
fig.legend.location = 'top_left'

# Show the plot
show(fig)

Comments & Discussion

Pygator on Aug. 18, 2019

I get the following error when trying to build the dataframe:

FileNotFoundError                         Traceback (most recent call last)
<ipython-input-1-90fbea3810d4> in <module>
      3 # Read the csv files
      4 player_stats = pd.read_csv('data/2017-18_playerBoxScore.csv',
----> 5                            parse_dates=['gmDate'])
      6 team_stats = pd.read_csv('data/2017-18_teamBoxScore.csv',
      7                           parse_dates=['gmDate'])

~/Bokeh/venv/lib/python3.7/site-packages/pandas/io/parsers.py in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, dialect, error_bad_lines, warn_bad_lines, delim_whitespace, low_memory, memory_map, float_precision)
    683         )
    684
--> 685         return _read(filepath_or_buffer, kwds)
    686
    687     parser_f.__name__ = name

~/Bokeh/venv/lib/python3.7/site-packages/pandas/io/parsers.py in _read(filepath_or_buffer, kwds)
    455
    456     # Create the parser.
--> 457     parser = TextFileReader(fp_or_buf, **kwds)
    458
    459     if chunksize or iterator:

~/Bokeh/venv/lib/python3.7/site-packages/pandas/io/parsers.py in __init__(self, f, engine, **kwds)
    893             self.options["has_index_names"] = kwds["has_index_names"]
    894
--> 895         self._make_engine(self.engine)
    896
    897     def close(self):

~/Bokeh/venv/lib/python3.7/site-packages/pandas/io/parsers.py in _make_engine(self, engine)
   1133     def _make_engine(self, engine="c"):
   1134         if engine == "c":
-> 1135             self._engine = CParserWrapper(self.f, **self.options)
   1136         else:
   1137             if engine == "python":

~/Bokeh/venv/lib/python3.7/site-packages/pandas/io/parsers.py in __init__(self, src, **kwds)
   1904         kwds["usecols"] = self.usecols
   1905
-> 1906         self._reader = parsers.TextReader(src, **kwds)
   1907         self.unnamed_cols = self._reader.unnamed_cols
   1908

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader.__cinit__()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._setup_parser_source()

FileNotFoundError: [Errno 2] File b'data/2017-18_playerBoxScore.csv' does not exist: b'data/2017-18_playerBoxScore.csv'

Pygator on Aug. 21, 2019

I downloaded the data from the link Dan gave and moved the three CSVs into the data/ subfolder. Datetimes aren’t working for me; I don’t understand how to format them at all.

Chris Bailey RP Team on Aug. 21, 2019

Hi Pygator, are you getting the same error as you posted above? I just tried the link that Dan provided for the data, and it downloaded the CSVs with incorrect names. Each file name should start with 2017-18, but what I got was 017-18. That would cause the file-not-found error above. I will work with Dan on how to fix the issue, but in the meantime you can rename the files by adding a “2” to the front of each. If you are past that first error and the problem is specific to datetimes, can you send me more info about where it is failing? Thanks for taking the time to comment, and I hope I can help you solve this.
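The renaming described in the reply above can also be scripted. The following is a minimal sketch, assuming the downloaded files sit in the data/ folder and that their names start with 017-18_ as reported. The helper name restore_leading_two is hypothetical and not part of the lesson's code.

```python
from pathlib import Path


def restore_leading_two(data_dir):
    """Rename files such as '017-18_standings.csv' to '2017-18_standings.csv'.

    Hypothetical helper for illustration; returns the new file names.
    """
    renamed = []
    # Only match names that are missing the leading "2"
    for path in sorted(Path(data_dir).glob('017-18_*.csv')):
        target = path.with_name('2' + path.name)
        if not target.exists():  # never overwrite a correctly named file
            path.rename(target)
            renamed.append(target.name)
    return renamed
```

Calling restore_leading_two('data') from the project folder should leave correctly named files untouched and return the list of names it changed, after which read_nba_data.py can find the CSVs at the paths it expects.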
