Using GroupFilter and CDSView

This video expands on Bokeh’s ColumnDataSource object by exploring GroupFilter and CDSView. These features let you filter your data and create multiple views of a single ColumnDataSource, so you can do much of your data wrangling with Bokeh’s own tools.

File: WestConfTop2.py

Python
# Bokeh libraries
from bokeh.io import output_file
from bokeh.plotting import figure, show
from bokeh.models import ColumnDataSource, CDSView, GroupFilter

# Import the data
from read_nba_data import west_top_2

# Output to static HTML file
output_file('west_top_2_standings_race.html',
            title='Western Conference Top 2 Teams Wins Race')

# Create a ColumnDataSource
west_cds = ColumnDataSource(west_top_2)

# Create view for each team
rockets_view = CDSView(source=west_cds,
                       filters=[GroupFilter(column_name='teamAbbr', group='HOU')])
warriors_view = CDSView(source=west_cds,
                        filters=[GroupFilter(column_name='teamAbbr', group='GS')])

# Create and configure the figure
west_fig = figure(x_axis_type='datetime',
                  plot_height=300, plot_width=600,
                  title='Western Conference Top 2 Teams Wins Race, 2017-18',
                  x_axis_label='Date', y_axis_label='Wins',
                  toolbar_location=None)

# Render the race as step lines
west_fig.step('stDate', 'gameWon',
              source=west_cds, view=rockets_view,
              color='#CE1141', legend='Rockets')
west_fig.step('stDate', 'gameWon',
              source=west_cds, view=warriors_view,
              color='#006BB6', legend='Warriors')

# Move the legend to the upper left corner
west_fig.legend.location = 'top_left'

# Show the plot
show(west_fig)

00:00 ColumnDataSource objects can do more than just serve as an easy way to reference DataFrame columns. In fact, the ColumnDataSource object has three built-in filters that you can use to create views on the data using what’s called a CDSView object.

00:15 The first is called a GroupFilter. It selects rows from a ColumnDataSource based on a categorical reference value. You’ll use that shortly.

00:23 Next is an IndexFilter, that filters the ColumnDataSource using a list of integer indices. And last is a BooleanFilter, which allows you to use a list of Boolean values with True rows being selected.
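
To make the three filter types concrete, here is a minimal plain-Python sketch of which rows each one selects. It is an illustration of the selection logic only, not Bokeh's implementation: the helper functions `group_filter`, `index_filter`, and `boolean_filter` are hypothetical names, and the rows are made-up sample data rather than the NBA dataset.

```python
# Plain-Python illustration of what each filter type selects.
# Assumption: these helpers are hypothetical stand-ins, not Bokeh APIs,
# and the data below is invented for the example.
rows = {
    'teamAbbr': ['HOU', 'GS', 'HOU', 'GS', 'HOU'],
    'gameWon': [1, 1, 2, 1, 3],
}


def group_filter(data, column_name, group):
    """Return indices of rows whose value in column_name equals group."""
    return [i for i, value in enumerate(data[column_name]) if value == group]


def index_filter(indices):
    """Return the given row indices unchanged."""
    return list(indices)


def boolean_filter(booleans):
    """Return indices of rows whose flag is True."""
    return [i for i, keep in enumerate(booleans) if keep]


# Categorical match, like GroupFilter(column_name='teamAbbr', group='HOU')
hou_rows = group_filter(rows, 'teamAbbr', 'HOU')

# Explicit row positions, like IndexFilter
first_two = index_filter([0, 1])

# One True/False flag per row, like BooleanFilter
multi_win = boolean_filter([won >= 2 for won in rows['gameWon']])

print(hou_rows, first_two, multi_win)
```

In each case the result is a set of row positions in the same underlying table, which is the idea a view builds on.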

00:36 In the previous example, you made two ColumnDataSource objects—one each from a subset of the west_top_2 DataFrame. For this next example, you’ll recreate the same output but using only one ColumnDataSource based on that same DataFrame. Instead, you’ll use the new tool GroupFilter to create views on the data.

00:56 Let’s see what that looks like. So here you’re starting out very similarly. You’re still going to go ahead and output to a file. And from plotting you’re still going to use figure() and show(). But from bokeh.models, along with the ColumnDataSource, you’re going to import a couple other items, CDSView and GroupFilter. Okay.

01:14 You still need to bring in your data. That will bring in west_top_2, so that will stay the same. And the output’s going to be to the same static HTML file.

01:23 But this is where it’s going to change. Instead of isolating this manually and creating two separate column data sources, this is going to be done a little differently.

01:30 Create a single ColumnDataSource called west_cds.

01:35 And again, that’s from west_top_2. Great, there’s the ColumnDataSource. Now create views for each team. A rockets_view, that’s going to use our new friend CDSView(), which takes a source, in this case the west_cds you set up a second ago. And here, this is where you’ll create your filter.

01:57 It’s going to be a GroupFilter with a column_name equal to 'teamAbbr' (team abbreviation) and a group, a grouping of 'HOU'.

02:08 So a GroupFilter with this column name and the grouping of this. Looks good! In fact, to save some effort, copy that view and paste. For the warriors_view,

02:19 it’s going to be 'GS'. Okay.

02:24 Now you created your two CDSViews, and next up, let’s rename this figure and call it west_fig. It’s still going to be an x_axis_type of 'datetime' and the same plot_height and all the other stuff will remain the same.

02:40 But then to render the two teams here, for west_fig you’re going to do the same two columns and the same color and legend,

02:50 but remove this information and put a couple of lines in here. So after the two columns, put in your source of west_cds, and then pick the legend—or pick the view, sorry, of rockets_view.

03:05 Does it make sense? Okay, so now we have these two columns being used, source=west_cds, a ColumnDataSource, and the view being rockets_view that you set up here. Okay.

03:16 So, same kind of thing here for the other one, for the Warriors.

03:22 west_fig,

03:26 source=west_cds, and view=warriors_view. Don’t forget your commas. Okay. Update this to be the west_fig and this to be west_fig.

03:39 That all looks good. Okay. Now that you’ve made that change—here you’re always picking the same source, you’re just picking these two views instead.
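
The "one source, two views" idea can be sketched without Bokeh. In this hedged illustration, the shared table is stored once and each view is only a list of row positions into it. The names `make_view` and `series_for` are hypothetical helpers, and the dates and win counts are invented sample values, not the real standings.

```python
# One shared table, two views into it (plain Python, not Bokeh's internals).
# Assumption: the values below are made up for illustration.
source = {
    'stDate': ['2017-10-17', '2017-10-17', '2017-10-18',
               '2017-10-20', '2017-10-21'],
    'teamAbbr': ['HOU', 'GS', 'HOU', 'GS', 'HOU'],
    'gameWon': [1, 0, 2, 1, 2],
}


def make_view(data, column_name, group):
    """A view is just the row positions that match the group."""
    return [i for i, value in enumerate(data[column_name]) if value == group]


def series_for(data, view, x_col, y_col):
    """Collect the (x, y) pairs a glyph would draw for this view."""
    return [(data[x_col][i], data[y_col][i]) for i in view]


rockets_view = make_view(source, 'teamAbbr', 'HOU')
warriors_view = make_view(source, 'teamAbbr', 'GS')

# Both series read from the same source; only the view differs.
rockets_series = series_for(source, rockets_view, 'stDate', 'gameWon')
warriors_series = series_for(source, warriors_view, 'stDate', 'gameWon')

print(rockets_series)
print(warriors_series)
```

Because neither view copies the columns, adding another team means adding one more list of row positions rather than another copy of the data.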

03:47 What’s nice is you could create many other filters using the same methodology creating these GroupFilters and building up these CDSViews. Make sure to save, and now in the terminal, run the script. Okay. Let’s see if it looks the same.

04:03 Yep! It looks great. We’ve got the same data here, the different wins and the date. Now that you’ve gone through the steps of creating a couple CDSViews and creating the west_fig (Western figure), in the next exercise, you’ll create a second figure called east_fig.

04:17 And at that point with two different figures, now you’ll need to worry about layouts. That’s coming up next.

andresgtn on March 30, 2020

What is the rationale behind using CDSViews with group filters over creating multiple CDS? It’s not clear what the benefit of one over the other is. It would be nice to understand the idea behind using two different ways of seemingly achieving the same outcome, e.g. one is more memory-efficient than the other.

Chris Bailey RP Team on March 30, 2020

Hi @andresgtn, One of the key reasons for using a single CDS (Column Data Source) is to have interactivity, which is covered in much more detail coming up. With one source, if you make selections in one view, they can also be represented in another view. I think you are correct that it is also memory efficient. Link to the docs for more: Bokeh Column Data Source
