from IPython.core.display import HTML
with open('creative_commons.txt', 'r') as f:
html = f.read()
name = '2014-01-13-tides_mpld3'
html = """
<small>
<p> This post was written as an IPython notebook. It is available for
<a href="https://ocefpaf.github.com/python4oceanographers/downloads/
notebooks/%s.ipynb">download</a> or as a static
<a href="https://nbviewer.ipython.org/url/ocefpaf.github.com/
python4oceanographers/downloads/notebooks/%s.ipynb">html</a>.</p>
<p></p>
%s """ % (name, name, html)
%matplotlib inline
from matplotlib import style
style.use('ggplot')
I usually use IPython notebook in my classes, they are a great tool for teaching code and science or even both at the same time.
However, when I am using the notebook to teach, I miss interaction with
matplotlib figures. One can drop the %matplotlib inline
magic to force the
figures to pop-up in the screen and use the interaction, but that's confusing, force
students to deal with multiple windows, and breaks the linear flow of the
notebook.
I've looked at some alternative to create interactive figures in HTML documents like:
Some of those, like Bokeh, even have some sort of integration with the IPython notebook. The problem with them is that they all require you to learn a new tool!
So here comes Jake to the rescue! (Like he already did before, and before, and before...) He is developing a D3 Renderings of Matplotlib Graphics. That means we can keep using the familiar matplotlib interface to make nice d3 interactive plots.
As an example I will re-write my tidal harmonic analysis using his mpld3
module so my student will be
able to explore the plot, zooming-in and -out at any point.
For that I will use sea level data from the GLOSS Station 194. Here is the data format:
%%bash
sed -n 71,83p ./data/gloss_hourly.fmt
The data records are coded as: field bytes length comment ----------------- ------- ------ ------------------------------------ station number 1-3 3 exactly 3 digits station version 4 1 letter from A to Z station name 6-9 4 Abbreviated if necessary year 12-15 4 month 16-17 2 numerical value day 18-19 2 day record count 20 1 either 1 or 2 sea level data 21-80 60 12 sea level values, hours 00-11 (1); hours 12-23 (2)
To read that data let's define some handy functions and load the data directly from the zipfile into a pandas Series object.
import os
import numpy as np
import numpy.ma as ma
from zipfile import ZipFile
from datetime import datetime
from pandas import Series, date_range
def basename(fname):
return os.path.splitext(os.path.basename(fname))[0]
def make_date(data):
date = str(int(data))[:-1]
return datetime.strptime(date, '%Y%m%d')
def load_gloss(fname):
data = np.loadtxt(fname, skiprows=1, usecols=range(2, 15))
elev = ma.masked_equal(data[:, 1:].ravel(), 9999)
start = make_date(data[0, 0])
dates = date_range(start, periods=len(elev), freq='H')
return Series(elev, index=dates)
series = Series()
with ZipFile('./data/h281a.zip', 'r') as ziped:
for fname in sorted(ziped.namelist()):
if not fname.endswith('/'): # Skip directories.
if False:
with ziped.open(fname) as f:
print(f.readline())
name = basename(fname)
series = series.append(load_gloss(ziped.open(fname)))
Now we can convert the data from millimeters to meters, print a data description, and produce a quick plot to inspect the data.
series /= 1e3
print(series.describe())
ax = series.plot(figsize=(10, 3))
count 457699.000000 mean 1.689822 std 0.404013 min 0.140000 25% 1.400000 50% 1.690000 75% 1.980000 max 4.880000 dtype: float64
The data need some more pre-processing to fill-in gaps, and to remove outliers that passed the initial QA/QC. That is beyond of my scope here so I'll just slice a clean and continuous 2-month period from 2004, remove the mean, and move ahead.
obs = series.ix['2004'] - series.ix['2004'].mean()
obs = obs.ix['2004-07':'2004-08']
ax = obs.plot(figsize=(8, 3))
print(obs.describe())
count 1488.000000 mean -0.031990 std 0.425528 min -1.213540 25% -0.339540 50% -0.009540 75% 0.283710 max 0.960460 dtype: float64
Looks good to me. The next 3 cells are just a copy-and-paste from my previous post on calling t_tide via oct2py.
from matplotlib.dates import date2num
dates = date_range(datetime.now(), periods=72, freq='H')
tim = date2num(dates.to_pydatetime()) + 366 # Matlab Argh!!!
elev = obs.values
%load_ext oct2py.ipython
%%octave -i elev -i tim -o tidestruc -o pout -o yout
pkg load all
addpath(genpath('./t_tide_v1.3beta'))
[tidestruc, pout] = t_tide(elev, 'interval', 1, 'latitude', -25, 'starttime', [1998, 1, 1, 0]);
yout = t_predic(tim, tidestruc);
number of standard constituents used: 35 Points used: 1487 of 1488 percent of var residual after lsqfit/var original: 27.26 % Greenwich phase computed with nodal corrections applied to amplitude and phase relative to center time Using nonlinear bootstrapped error estimates Generating prediction with nodal corrections, SNR is 2.000000 warning: isstr is obsolete and will be removed from a future version of Octave, please use ischar instead percent of var residual after synthesis/var original: 33.93 % ----------------------------------- date: 02-May-2014 nobs = 1488, ngood = 1487, record length (days) = 62.00 start time: 01-Jan-1998 rayleigh criterion = 1.0 Greenwich phase computed with nodal corrections applied to amplitude \n and phase relative to center time x0= -0.0365, x trend= 0 var(x)= 0.1812 var(xp)= 0.11915 var(xres)= 0.061482 percent var predicted/var original= 65.8 % tidal amplitude and phase with 95% CI estimates tide freq amp amp_err pha pha_err snr MM 0.0015122 0.1571 0.183 64.59 77.12 0.74 MSF 0.0028219 0.0256 0.164 231.18 210.70 0.024 ALP1 0.0343966 0.0019 0.006 16.85 171.73 0.091 *2Q1 0.0357064 0.0109 0.007 235.60 44.29 2.2 *Q1 0.0372185 0.0473 0.008 212.37 9.84 36 *O1 0.0387307 0.1614 0.007 203.26 2.58 4.8e+02 NO1 0.0402686 0.0029 0.005 240.58 99.45 0.35 *K1 0.0417807 0.1032 0.008 29.44 3.80 1.9e+02 J1 0.0432929 0.0021 0.006 295.19 161.03 0.14 OO1 0.0448308 0.0051 0.008 227.13 101.19 0.44 UPS1 0.0463430 0.0086 0.010 230.15 78.72 0.73 EPS2 0.0761773 0.0050 0.016 227.96 161.74 0.1 *MU2 0.0776895 0.0356 0.019 65.92 34.86 3.5 *N2 0.0789992 0.0650 0.019 190.48 18.93 11 *M2 0.0805114 0.3383 0.021 83.78 3.94 2.7e+02 *L2 0.0820236 0.0488 0.026 44.37 29.43 3.5 *S2 0.0833333 0.2356 0.023 210.99 5.78 1e+02 ETA2 0.0850736 0.0336 0.031 143.32 60.92 1.2 *MO3 0.1192421 0.0596 0.011 152.47 10.36 30 *M3 0.1207671 0.0703 0.008 42.64 6.71 70 *MK3 0.1222921 0.0397 0.009 16.25 14.31 21 *SK3 0.1251141 0.0489 0.010 298.89 11.84 25 *MN4 0.1595106 0.0241 0.006 166.80 13.51 18 *M4 0.1610228 0.0526 0.005 182.36 5.57 97 SN4 0.1623326 0.0054 0.005 125.25 61.09 1.1 *MS4 0.1638447 0.0270 0.006 49.39 12.76 20 S4 0.1666667 0.0029 0.005 325.77 113.83 0.28 2MK5 0.2028035 0.0005 0.003 17.45 225.26 0.03 2SK5 0.2084474 0.0031 0.005 312.31 93.55 0.44 2MN6 0.2400221 0.0004 0.002 319.96 174.12 0.061 M6 0.2415342 0.0012 0.002 246.25 100.47 0.44 *2MS6 0.2443561 0.0054 0.002 333.73 24.77 7.7 2SM6 0.2471781 0.0029 0.002 100.65 51.16 1.6 3MK7 0.2833149 0.0025 0.003 181.48 67.60 0.81 M8 0.3220456 0.0023 0.002 216.09 47.06 1.9
Now I'll use mpld3 to plot the observed data, tide prediction from the analysis, and the residue.
The main difference from my previous post is that, while the original plot was just a static png
figure, this one allow us to interact with the graph.
import mpld3
mpld3.enable_notebook()
import matplotlib.pyplot as plt
fig, (ax0, ax1, ax2) = plt.subplots(nrows=3, sharey=True, sharex=True, figsize=(8, 4))
ax0.plot(obs.index, obs)
ax1.plot(obs.index, pout.squeeze(), alpha=0.5)
ax2.plot(obs.index, obs-pout.squeeze(), alpha=0.5)
[<matplotlib.lines.Line2D at 0x7f2e8336b110>]
Now I can tell my students to zoom-in at the observed high water level event that occurred around July 14$^{\text{th}}$ and start quizzing them about what is happening there.
HTML(html)
This post was written as an IPython notebook. It is available for download or as a static html.