I am paddy_mullen. I work for Continuum Analytics, where we write the Bokeh open source plotting library. This tutorial walks you through the basic Bokeh plotting API and shows you some of the advanced possibilities. Peter Wang, Bryan Van de Ven, Hugo Shi, and I are the primary contributors.
If you have conda installed, run the following shell commands:
mkdir bokeh_example
cd bokeh_example/
git clone https://github.com/paddymul/bokeh_tutorial.git
conda create -n bokeh_tutorial bokeh ipython-notebook pyyaml pyaudio anaconda=1.8 --yes
source activate bokeh_tutorial
cd bokeh_tutorial
ipython notebook
Then in the IPython notebook, open the bokeh_tutorial notebook.
If you are executing this notebook, please use the menu and select Cell -> All Output -> Clear, then reload the page. This quirk will be going away in 0.3.
import numpy as np
from bokeh.plotting import output_notebook
import pandas as pd
output_notebook()
Configuring embedded BokehJS mode.
Here is a simple plot.
from bokeh.plotting import line, show
x = np.linspace(0, 4*np.pi, 20)
y = np.sin(x)
line(x,y, color="#0000FF", tools=[])
show()
Take a moment to play around with a simple line plot.
from bokeh.plotting import line
import numpy as np
x = np.linspace(0, 4*np.pi, 20)
y = np.sin(x)
line(x,y, color="#0000FF", tools=[], plot_width=400, plot_height=400)
show()
There are many, many types of glyphs. Here is a list.
Let's look at combining two glyph renderers onto the same plot.
To do this we use the hold() function, which allows us to combine renderers onto the same plot.
from bokeh.plotting import rect
rect([10,20,30], [10,20,30], width=2, height=5, plot_width=400, plot_height=400, tools=[])
show()
Due to a bug, multiple plots will show up here; just look at the first two.
from bokeh.plotting import annular_wedge, hold, figure, show
figure() #create a new figure
hold(False)
annular_wedge(
    [10,20,30], [30,25,10], 10, 20, 0.6, 4.1,
    inner_radius_units="screen", outer_radius_units="screen",
    color="#8888ee", tools=[])
hold(True)
rect([10,20,30], [10,20,30], width=2, height=5, plot_width=400, plot_height=400, tools=[])
show()
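The effect of hold() in the cells above can be modelled in a few lines of plain Python. This is only a sketch of the idea, not Bokeh's implementation: `PlotState` and `add_glyph` are hypothetical names invented for illustration, and the real library tracks considerably more state.

```python
# Hypothetical model of hold(); these names are illustrative, not Bokeh internals.
class PlotState(object):
    """Tracks the plots created so far and whether new glyphs reuse the last one."""

    def __init__(self):
        self.plots = []       # each plot is modelled as a list of glyph names
        self.holding = False  # mirrors hold(True) / hold(False)

    def hold(self, value=True):
        self.holding = value

    def add_glyph(self, name):
        # With hold off (or before any plot exists), a glyph call starts a new plot.
        if not self.holding or not self.plots:
            self.plots.append([])
        # The glyph's renderer is attached to the current (most recent) plot.
        self.plots[-1].append(name)
        return self.plots[-1]


state = PlotState()
state.hold(False)
state.add_glyph("annular_wedge")  # starts the first plot
state.hold(True)
state.add_glyph("rect")           # joins the first plot
state.hold(False)
state.add_glyph("line")           # starts a second plot
print(state.plots)
```

In this model the wedge and the rect end up in the same plot because hold(True) was called between them, which mirrors the order of calls in the cell above.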
So at this point we have a very flexible, powerful plotting system. What about our data ranges? They are automatically configured for us. Notice that we don't have to specify pixels, only data sizes.
Bokeh ships with existing tools for pan, zoom, preview, save, resize, and embed.
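To make "automatically configured" concrete, here is a minimal sketch of how a range could be derived from the data alone. The function name `auto_range` and the 10% padding are assumptions chosen for illustration; Bokeh's actual rule may differ.

```python
def auto_range(values, padding=0.1):
    """Return (start, end) covering `values`, widened by a fraction of the span.

    Hypothetical helper: the padding rule is an assumption, not Bokeh's own.
    """
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        span = 1.0  # avoid a zero-width range when all values are equal
    pad = span * padding
    return lo - pad, hi + pad


# The x coordinates used for the rect glyphs earlier in this tutorial.
xs = [10, 20, 30]
print(auto_range(xs))
```

The caller only ever supplies data values; the mapping from this range to screen pixels is left to the renderer.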
Tools are added with the tools kwarg of plots, like this:
hold(False)
x = np.linspace(0, 4*np.pi, 20)
y = np.sin(x)
# note: select doesn't work on lines; a scatter plot would be needed for selection
line(x,y, color="#0000FF", tools="pan, zoom, resize, select, save")
show()
BokehJS renders plots based on their object graph. Objects such as renderers are described inside this graph.
Plots have renderers, axes, grids, and tools. Renderers (Circle, Quad, Line, ...) have references to DataRanges and DataSources. Data ranges operate on a DataSource to describe which portion of the data space should be rendered. Data sources contain the actual data to be displayed. Multiple columns are aligned on the same x-axis in data sources.
Now here is the really cool thing. Since plots don't have an attribute of min x and max x, but instead they have a reference to a data range, two separate plots can share the same data range. This means that they will pan and zoom together.
from IPython.display import Image
Image(filename='bokeh_objects.png')
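The object graph idea can be sketched without Bokeh at all: each object gets an id, and attributes that point at other objects are written out as references rather than copies. The `Node` class and the JSON layout below are assumptions made for illustration and are not the format BokehJS actually consumes.

```python
import itertools
import json

_ids = itertools.count(1)


class Node(object):
    """Hypothetical graph object: a type name, an id, and a dict of attributes."""

    def __init__(self, type_name, **attrs):
        self.type_name = type_name
        self.id = next(_ids)
        self.attrs = attrs

    def ref(self):
        # A reference names the target object instead of embedding it.
        return {"type": self.type_name, "id": self.id}


def encode(value):
    """Replace Node instances (also inside lists) with references."""
    if isinstance(value, Node):
        return value.ref()
    if isinstance(value, (list, tuple)):
        return [encode(v) for v in value]
    return value


def serialize(nodes):
    return [{"type": n.type_name, "id": n.id,
             "attributes": dict((k, encode(v)) for k, v in n.attrs.items())}
            for n in nodes]


source = Node("ColumnDataSource", data={"x": [0, 1, 2], "y": [0, 1, 4]})
xdr = Node("DataRange1d", source=source, column="x")
ydr = Node("DataRange1d", source=source, column="y")
renderer = Node("Glyph", data_source=source, xdata_range=xdr, ydata_range=ydr)
plot = Node("Plot", x_range=xdr, y_range=ydr, renderers=[renderer])

graph = serialize([source, xdr, ydr, renderer, plot])
print(json.dumps(graph[-1], indent=2, sort_keys=True))
```

Because the plot and the renderer both hold a reference to the same range id, a client that rebuilds the graph ends up with one shared range object rather than two copies.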
Let's see what a simple line plot looks like if we build the object graph up by hand:
from numpy import pi, arange, sin, cos
import numpy as np
import os.path
from bokeh.objects import (Plot, DataRange1d, LinearAxis,
ObjectArrayDataSource, ColumnDataSource, Glyph, GridPlot,
PanTool, ZoomTool)
from bokeh.glyphs import Line, Rect
from bokeh import session
x = np.linspace(-2*pi, 2*pi, 100)
y = sin(x)
z = cos(x)
widths = np.ones_like(x) * 0.02
heights = np.ones_like(x) * 0.2
#I'm putting all of this into a function so that we don't pollute the global namespace
def simple_line_object():
    from bokeh.plotting import curplot
    source = ColumnDataSource(data=dict(x=x,y=y,z=z,widths=widths,
                                        heights=heights))
    xdr = DataRange1d(sources=[source.columns("x")])
    ydr = DataRange1d(sources=[source.columns("y")])
    line_glyph = Line(x="x", y="y", line_color="blue")
    renderer = Glyph(data_source = source,
                     xdata_range = xdr, ydata_range = ydr,
                     glyph = line_glyph)
    plot = Plot(x_range=xdr, y_range=ydr, data_sources=[source],
                border=50, height=300, width=300)
    xaxis = LinearAxis(plot=plot, dimension=0, location="bottom")
    yaxis = LinearAxis(plot=plot, dimension=1, location="left")
    pantool = PanTool(dataranges = [xdr, ydr], dimensions=["width","height"])
    zoomtool = ZoomTool(dataranges=[xdr,ydr], dimensions=("width","height"))
    plot.renderers.append(renderer)
    plot.tools = [pantool, zoomtool]
    sess = curplot()._session
    sess.add(plot, renderer, xaxis, yaxis, source, xdr, ydr, pantool, zoomtool)
    sess.plotcontext.children.append(plot)
simple_line_object()
show()
Now we will create two plots which share the same DataRange object
from bokeh.glyphs import Wedge, Rect
def simple_linked():
    from bokeh.plotting import curplot
    source = ColumnDataSource(data=dict(x=x,y=y,z=z,widths=widths,
                                        heights=heights))
    xdr = DataRange1d(sources=[source.columns("x")])
    ydr = DataRange1d(sources=[source.columns("y")])
    line_glyph = Line(x="x", y="y", line_color="blue")
    #FIXME, I can't seem to get other glyph styles to work
    rect_glyph = Rect(x="x", y="y", height=.5, width=.05, angle=30)
    wedge_glyph = Wedge(x="x", y="y", radius=np.pi/4,
                        start_angle= np.pi/6, end_angle=np.pi/2, direction="clock", color="red")
    renderer = Glyph(data_source = source, xdata_range = xdr,
                     ydata_range = ydr, glyph = line_glyph)
    plot = Plot(x_range=xdr, y_range=ydr, data_sources=[source],
                border=50, height=300, width=300)
    plot.renderers.append(renderer)
    renderer2 = Glyph(data_source = source, xdata_range = xdr,
                      ydata_range = ydr, glyph = line_glyph)
    plot2 = Plot(x_range=xdr, y_range=ydr, data_sources=[source],
                 border=50, height=300, width=300)
    pantool2 = PanTool(dataranges = [xdr, ydr], dimensions=["width","height"])
    zoomtool2 = ZoomTool(dataranges=[xdr,ydr], dimensions=("width","height"))
    plot2.renderers.append(renderer2)
    plot2.tools = [pantool2, zoomtool2]
    sess = curplot()._session
    sess.add(plot, renderer, source, xdr, ydr)
    sess.plotcontext.children.append(plot)
    show()
    sess.add(plot2, renderer2, pantool2, zoomtool2)
    sess.plotcontext.children.append(plot2)
simple_linked()
show()
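The reason two plots that share a DataRange pan and zoom together can be shown with plain Python objects. This is a sketch under stated assumptions: `SharedRange` and `MiniPlot` are hypothetical stand-ins invented here, not Bokeh classes.

```python
class SharedRange(object):
    """Hypothetical stand-in for a data range: just a start and an end."""

    def __init__(self, start, end):
        self.start = start
        self.end = end


class MiniPlot(object):
    """Hypothetical plot that holds references to its ranges, not copies."""

    def __init__(self, x_range, y_range):
        self.x_range = x_range
        self.y_range = y_range

    def pan_x(self, delta):
        # Panning mutates the range object, so every plot holding it sees the change.
        self.x_range.start += delta
        self.x_range.end += delta


xdr = SharedRange(-6.0, 6.0)
plot_a = MiniPlot(xdr, SharedRange(-1.0, 1.0))
plot_b = MiniPlot(xdr, SharedRange(-1.0, 1.0))                      # shares xdr with plot_a
plot_c = MiniPlot(SharedRange(-6.0, 6.0), SharedRange(-1.0, 1.0))   # has its own x range

plot_a.pan_x(2.0)
print(plot_b.x_range.start, plot_b.x_range.end)  # follows plot_a
print(plot_c.x_range.start, plot_c.x_range.end)  # unchanged
```

Only the x range is shared here, so panning plot_a vertically would leave plot_b alone; sharing or not sharing each range independently is what the four-plot grid below relies on.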
def line_advanced():
    from bokeh.plotting import curplot
    source = ColumnDataSource(data=dict(x=x,y=y,z=z,widths=widths,
                                        heights=heights))
    xdr = DataRange1d(sources=[source.columns("x")])
    xdr2 = DataRange1d(sources=[source.columns("x")])
    ydr = DataRange1d(sources=[source.columns("y")])
    ydr2 = DataRange1d(sources=[source.columns("y")])
    line_glyph = Line(x="x", y="y", line_color="blue")
    wedge_glyph = Wedge(x="x", y="y", radius=np.pi/14,
                        start_angle= 3*np.pi/6, end_angle=4*np.pi/4, direction="clock")
    renderer = Glyph(data_source = source, xdata_range = xdr,
                     ydata_range = ydr, glyph = line_glyph)
    pantool = PanTool(dataranges = [xdr, ydr], dimensions=["width","height"])
    zoomtool = ZoomTool(dataranges=[xdr,ydr], dimensions=("width","height"))
    plot = Plot(x_range=xdr, y_range=ydr, data_sources=[source],
                border=50, height=400, width=400)
    plot.tools = [pantool, zoomtool]
    plot.renderers.append(renderer)
    # plot2 shares xdr with plot but has its own y data range (ydr2)
    renderer2 = Glyph(data_source = source, xdata_range = xdr,
                      ydata_range = ydr2, glyph = line_glyph)
    plot2 = Plot(x_range=xdr, y_range=ydr2, data_sources=[source],
                 border=50, height=400, width=400)
    plot2.renderers.append(renderer2)
    # plot3 shares ydr with plot but has its own x data range (xdr2)
    renderer3 = Glyph(data_source = source, xdata_range = xdr2,
                      ydata_range = ydr, glyph = line_glyph)
    plot3 = Plot(x_range=xdr2, y_range=ydr, data_sources=[source],
                 border=50, height=400, width=400)
    plot3.renderers.append(renderer3)
    # this is a dummy plot with no renderers
    plot4 = Plot(x_range=xdr2, y_range=ydr, data_sources=[source],
                 border=50, height=400, width=400)
    sess = curplot()._session
    sess.add(plot, renderer, source, xdr, ydr, pantool, zoomtool)
    sess.add(plot2, renderer2, ydr2, xdr2, renderer3, plot3, plot4)
    grid = GridPlot(children=[[plot, plot2], [plot3, plot4 ]], name="linked_advanced")
    sess.add(grid)
    sess.plotcontext.children.append(grid)
line_advanced()
show()
There are 3 distinct components to the bokeh plotting library.
This architecture lets us do some remarkable things.
It is possible to run Bokeh without the plot server. The file-based examples that we have seen output static JavaScript that includes everything BokehJS needs to display the plot. It is also important to understand that nearly all of Bokeh (roughly 99%) stays the same regardless of how the plot is output.
from bokeh.plotting import hold, line
hold(False)
x = np.linspace(0, 4*np.pi, 20)
y = np.sin(x)
hold(True)
line_plot = line(x,y, color="#0000FF", tools="pan, zoom, preview, resize, select, embed, save")
line_snippet = line_plot.inject_snippet()
print(line_snippet)
hold(False)
import webbrowser
import os
#ok let's create an html page with that snippet
# the with block closes (and flushes) the file before the browser opens it
with open("foo.html", "w") as f:
    f.write("""
<html>
<body>
<h1> Embed example </h1>
%s
<h2> after embed </h2>
</body>
</html>""" % line_snippet)
webbrowser.open("file://" + os.path.abspath("foo.html"))
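The idea of a static page that carries everything it needs can be sketched independently of Bokeh. The helper below embeds plot data as JSON inside a script tag; `render_static_page`, the template, and the `plot-data` id are assumptions made for illustration and do not reflect what `inject_snippet()` actually emits.

```python
import json
from xml.sax.saxutils import escape

PAGE_TEMPLATE = """<html>
<body>
<h1>%(title)s</h1>
<script type="application/json" id="plot-data">%(payload)s</script>
</body>
</html>"""


def render_static_page(title, data):
    """Return an HTML page with `data` embedded as JSON (hypothetical helper)."""
    # Escape "</" so a string inside the data cannot close the script tag early;
    # "\/" is a valid JSON escape for "/", so the payload still parses.
    payload = json.dumps(data).replace("</", "<\\/")
    return PAGE_TEMPLATE % {"title": escape(title), "payload": payload}


page = render_static_page("Embed <example>", {"x": [0, 1, 2], "label": "</script>"})
print(page)
```

No server is involved at any point: the resulting string can be written to a file and opened directly, in the same way as foo.html above.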
Since plots are first-class objects in BokehJS and the Bokeh Python system, they can be modified. Because the Bokeh plot server communicates updates to the browser, we can animate plots from Python. For these demos to work, you must be running the plot server.
$ bokeh-server
The Bokeh plot server does not yet work on Windows. Once you have started the server, navigate to the plot server at http://localhost:5006/bokeh in another browser tab. Due to a bug in Bokeh, all of the plots created start out zoomed in; you must zoom out to see the whole animation.
The IPython kernel runs the animation. To interrupt the kernel, type CTRL-m i.
print("Go to http://localhost:5006/bokeh to view this plot")
import numpy as np
from numpy import pi, cos, sin, linspace
from bokeh.plotting import *
colors = ("#A6CEE3", "#1F78B4", "#B2DF8A")
N = 36
r_base = 8
theta = linspace(0, 2*pi, N)
r_x = linspace(0, 6*pi, N-1)
rmin = r_base - cos(r_x) - 1
rmax = r_base + sin(r_x) + 1
output_server("wedge animate")
cx = cy = np.ones_like(rmin)
annular_wedge(cx, cy,
              rmin, rmax, theta[:-1], theta[1:],
              inner_radius_units="data",
              outer_radius_units="data",
              color = colors[0],
              line_color="black", tools="pan,zoom,resize")
#show()
import time
from bokeh.objects import GlyphRenderer
renderer = [r for r in curplot().renderers if isinstance(r, GlyphRenderer)][0]
ds = renderer.data_source
while True:
    for i in np.linspace(-2*np.pi, 2*np.pi, 50):
        rmin = ds.data["inner_radius"]
        rmin = np.roll(rmin, 1)
        ds.data["inner_radius"] = rmin
        rmax = ds.data["outer_radius"]
        rmax = np.roll(rmax, -1)
        ds.data["outer_radius"] = rmax
        ds._dirty = True
        session().store_obj(ds)
        time.sleep(.25)
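The animation loop above changes nothing except the order of the radius values on each frame. The sketch below reproduces that step with the standard library only; `roll` is a small helper written here that behaves like `np.roll` for one-dimensional input, and the sample radii are made up.

```python
from collections import deque


def roll(values, shift):
    """Rotate a sequence by `shift` places, like np.roll on a 1-D array."""
    d = deque(values)
    d.rotate(shift)  # positive shift moves elements to the right, wrapping around
    return list(d)


# Made-up radii for four wedges.
inner = [7, 8, 9, 10]
outer = [11, 12, 13, 14]

for frame in range(3):
    inner = roll(inner, 1)    # inner radii travel one wedge forward
    outer = roll(outer, -1)   # outer radii travel one wedge backward
    print(frame, inner, outer)
```

Because the two columns rotate in opposite directions, the wedge shapes appear to ripple around the circle even though the set of radius values never changes.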
import numpy as np
from numpy import pi, cos, sin, linspace, zeros, \
    short, fromstring, hstack, transpose
from scipy import fft
import time
from bokeh.plotting import *
NUM_SAMPLES = 1024
SAMPLING_RATE = 44100
MAX_FREQ = SAMPLING_RATE / 8
FREQ_SAMPLES = NUM_SAMPLES / 8
SPECTROGRAM_LENGTH = 400
_stream = None
def read_mic():
    import pyaudio
    global _stream
    if _stream is None:
        pa = pyaudio.PyAudio()
        _stream = pa.open(format=pyaudio.paInt16, channels=1, rate=SAMPLING_RATE,
                          input=True, frames_per_buffer=NUM_SAMPLES)
    try:
        audio_data = fromstring(_stream.read(NUM_SAMPLES), dtype=short)
        normalized_data = audio_data / 32768.0
        # integer division keeps the slice index an int
        return (abs(fft(normalized_data))[:NUM_SAMPLES//2], normalized_data)
    except:
        return None
def get_audio_data(interval=0.05):
    time.sleep(interval)
    starttime = time.time()
    while time.time() - starttime < interval:
        data = read_mic()
        if data is not None:
            return data
    return None
output_server("spectrogram")
# Create the base plot
N = 36
theta = linspace(0, 2*pi, N+1)
rmin = 10
rmax = 20 * np.ones(N)
cx = cy = np.ones(N)
annular_wedge(cx, cy, rmin, rmax, theta[:-1], theta[1:],
              inner_radius_units = "data",
              outer_radius_units = "data",
              color = "#A6CEE3", line_color="black",
              tools="pan,zoom,resize")
show()
from bokeh.objects import GlyphRenderer
renderer = [r for r in curplot().renderers if isinstance(r, GlyphRenderer)][0]
ds = renderer.data_source
while True:
    data = get_audio_data()
    if data is None:
        continue
    else:
        data = data[0]
    # Zoom in to a frequency range:
    data = data[:len(data)//2]
    histdata = (np.histogram(data, N, density=True)[0] * 5) + rmin
    ds.data["outer_radius"] = histdata
    ds._dirty = True
    session().store_obj(ds)
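The last step of the loop above turns spectrum magnitudes into wedge radii: a density histogram, scaled by 5 and offset by the inner radius. Here is a pure-Python sketch of that mapping. `density_histogram` and `to_outer_radius` are hypothetical helpers written for this illustration, the sample magnitudes are made up, and the binning only approximates what `np.histogram(..., density=True)` does.

```python
def density_histogram(values, bins):
    """Equal-width histogram normalised so that sum(density * bin_width) == 1."""
    lo, hi = float(min(values)), float(max(values))
    if lo == hi:
        lo, hi = lo - 0.5, hi + 0.5  # widen a degenerate range
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The last bin is closed on the right, so the maximum lands in it.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    total = float(len(values))
    return [c / (total * width) for c in counts]


def to_outer_radius(values, bins, rmin, scale=5):
    """Scale the densities and offset them so every wedge extends past rmin."""
    return [d * scale + rmin for d in density_histogram(values, bins)]


# Made-up spectrum magnitudes standing in for one frame of microphone data.
magnitudes = [0.0, 0.1, 0.1, 0.2, 0.4, 0.4, 0.4, 0.8]
radii = to_outer_radius(magnitudes, bins=4, rmin=10)
print(radii)
```

Adding rmin guarantees that each outer radius is at least as large as the inner radius, so no wedge collapses when a bin is empty.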
There are a lot of exciting things in store for bokeh. These include: