This notebook and its helper script (Helper_DDToyHMM.py) contain all information necessary to make plots summarizing our experiments on a diagonally-dominant toy dataset we call DDToyHMM
. These plots are shown in Fig. 2 of our paper on scalable learning for the HDP-HMM.
You have completed all runs from the DDToyHMM dataset experiments.
See the scripts experiments/Launch_DDToyHMM_*.py
in our experiment repository.
Load required modules, configure notebook to display figures in-line, etc.
import numpy as np
import bnpy; # bnpy package for learning and plotting BNP models
import DDToyHMM # Module describing the toy dataset in question
import glob; # for simple check that saved runs exist
import os;
from bnpy.viz.PlotUtil import pylab
%pylab inline
Populating the interactive namespace from numpy and matplotlib
bnpy.viz.PlotUtil.ConfigPylabDefaults(pylab);
doSaveEPS = 1; # Switch this to 1 to save these plots to .eps, for LaTeX publications.
This notebook requires saved runs located in the following directory: `$BNPYOUTDIR/DDToyHMM/nipsexperiment...
If these runs do not exist, you need to go run the experiments matching the pattern experiments/Launch_DDToyHMM_*.py
pathPattern = "%s/%s/%s" % (os.environ['BNPYOUTDIR'], 'DDToyHMM', 'nipsexperiment*')
dirList = glob.glob(pathPattern)
if len(dirList) == 0:
raise ValueError("STOP! You have not run the expected experiments yet!")
print "Found %d directories for jobs under the name 'nipsexperiment' for toy dataset DDToyHMM" % (len(dirList))
Found 22 directories for jobs under the name 'nipsexperiment' for toy dataset DDToyHMM
Jdict maps keys like "birth Sticky=0 K=50" to full paths names like "nipsexperiment-alg=bnpyHDPHMMstoch-lik=Gauss-hmmKappa=0-..."
import Helper_DDToyHMM as Helper
reload(Helper);
Jdict = Helper.setUp(lineStyleByKey='Sticky')
for key in Jdict.keys():
print key
stoch Sticky=0 K=50 stoch Sticky=0 K=100 stoch Sticky=50 K=50 stoch Sticky=50 K=100 sampler Sticky=0 K=50 sampler Sticky=0 K=100 sampler Sticky=50 K=50 sampler Sticky=50 K=100 memo Sticky=0 K=50 memo Sticky=0 K=100 memo Sticky=50 K=50 memo Sticky=50 K=100 delmerge Sticky=0 K=50 delmerge Sticky=0 K=100 delmerge Sticky=50 K=50 delmerge Sticky=50 K=100 birth Sticky=0 K=1 birth Sticky=0 K=10 birth Sticky=0 K=50 birth Sticky=50 K=1 birth Sticky=50 K=10 birth Sticky=50 K=50
Separate all jobs into two groups: those that are non-sticky (hmmKappa=0) and sticky (hmmKappa=50).
Jnonsticky = Helper.getSubsetByName(Jdict, 'Sticky=0')
Jsticky = Helper.getSubsetByName(Jdict, 'Sticky=50')
J50 = Helper.getSubsetByName(Jdict, 'K=50')
We have 8 true states that are each well-separated Gaussian blobs.
The generative HMM state transition model has high stickiness. Each state will self-transition with probability 0.999, and move to one other state (illustrated by arrow) with probability 0.001.
DDToyHMM.transPi
array([[ 0.999, 0.001, 0. , 0. , 0. , 0. , 0. , 0. ], [ 0. , 0.999, 0.001, 0. , 0. , 0. , 0. , 0. ], [ 0. , 0. , 0.999, 0.001, 0. , 0. , 0. , 0. ], [ 0. , 0. , 0. , 0.999, 0.001, 0. , 0. , 0. ], [ 0. , 0. , 0. , 0. , 0.999, 0.001, 0. , 0. ], [ 0. , 0. , 0. , 0. , 0. , 0.999, 0.001, 0. ], [ 0. , 0. , 0. , 0. , 0. , 0. , 0.999, 0.001], [ 0.001, 0. , 0. , 0. , 0. , 0. , 0. , 0.999]])
DDToyHMM.illustrate(Colors=Helper.getStateSeqColorMap());
pylab.xticks([-30, -15, 0, 15, 30]);
pylab.yticks([-30, -15, 0, 15, 30]);
if doSaveEPS:
pylab.savefig('DDToyHMM_Illustrated.eps', bbox_inches='tight')
Helper.makeLegendInOwnFigure(Jdict,
names=['sampler', 'stoch', 'memo',
'delmerge:delete,merge', 'birth:birth,delete,merge',
'Sticky=0:Non-stick, kappa=0',
'Sticky=50:Sticky, kappa=50'])
if doSaveEPS:
pylab.savefig('LegendForAllAlgorithms.eps', bbox_inches='tight');
!sed -i -e 's/BoundingBox: 174 276 437 515/BoundingBox: 230 375 430 505/' LegendForAllAlgorithms.eps
Conclusions:
Helper.PlotUtil.plotKeff(J50, loc=None, xscale='log');
pylab.ylim([0, 60]);
pylab.gca().set_yticks([8, 25, 50]);
pylab.gca().set_yticklabels(['*8', '25', '50']);
if doSaveEPS:
pylab.savefig('DDToyHMM_KeffVsLaps_Kinit=50.eps', bbox_inches='tight');
Conclusions:
Helper.PlotUtil.plotHammingDist(J50, loc=None, xscale='log');
if doSaveEPS:
pylab.savefig('DDToyHMM_HammingVsLaps_Kinit=50.eps', bbox_inches='tight');
Note that birth methods (purple) are always solid, though we consider also initializing with K=1, K=10, or K=50 states.
# Setup
Helper.PlotUtil.LineStyleMap = Helper.getLineStyleMap_ByKValue()
Conclusions:
pylab.gca().set_yticks([8, 25, 50]);
Helper.PlotUtil.plotKeff(Jnonsticky, loc=None, xscale='log');
pylab.ylabel('num states K');
if doSaveEPS:
pylab.savefig('DDToyHMM_KeffVsLaps_nonsticky.eps', bbox_inches='tight');
Conclusions
Helper.PlotUtil.plotHammingDist(Jnonsticky, loc=None, xscale='log');
if doSaveEPS:
pylab.savefig('DDToyHMM_HammingVsLaps_nonsticky.eps', bbox_inches='tight');
Conclusions
Helper.PlotUtil.plotELBO(Jnonsticky, loc=None, xscale='log');
if doSaveEPS:
pylab.savefig('DDToyHMM_ELBOVsLaps_nonsticky.eps', bbox_inches='tight');
nipsexperiment-alg=foxHDPHMMsampler-lik=Gauss-hmmKappa=0-ECovMat-eye-sF=1.0-K=50-initname=randcontigblocks-nBatch=8/1/evidence-saved-params.txt nipsexperiment-alg=foxHDPHMMsampler-lik=Gauss-hmmKappa=0-ECovMat-eye-sF=1.0-K=100-initname=randcontigblocks-nBatch=8/1/evidence-saved-params.txt
Again, we'll look at algorithm performance across a variety of initializations.
Conclusions:
Helper.PlotUtil.plotKeff(Jsticky, loc=None, xscale='log');
pylab.ylabel('num states K');
if doSaveEPS:
pylab.savefig('DDToyHMM_KeffVsLaps_hmmKappa=50.eps', bbox_inches='tight');
Helper.PlotUtil.plotHammingDist(Jsticky, loc=None, xscale='log');
if doSaveEPS:
pylab.savefig('DDToyHMM_HammingVsLaps_hmmKappa=50.eps', bbox_inches='tight');
Helper.PlotUtil.plotELBO(Jsticky, loc=None, xscale='log');
if doSaveEPS:
pylab.savefig('DDToyHMM_ELBOVsLaps_hmmKappa=50.eps', bbox_inches='tight');
nipsexperiment-alg=foxHDPHMMsampler-lik=Gauss-hmmKappa=50-ECovMat-eye-sF=1.0-K=50-initname=randcontigblocks-nBatch=8/1/evidence-saved-params.txt nipsexperiment-alg=foxHDPHMMsampler-lik=Gauss-hmmKappa=50-ECovMat-eye-sF=1.0-K=100-initname=randcontigblocks-nBatch=8/1/evidence-saved-params.txt
Each figure will show the segmentations of one algorithm after some number of iterations.
We show the segmentations of sequences 1, 3, 5, and 7, which were chosen because together they contain long segments of each of the 8 true states of interest.
seqNames = ['', '', '', '', '', '', '']
sequences = [1,3,5,7]
xticks = [0, 200, 400, 600, 800]
def MakeTitleForSegmentation(algName, lapQuery, Kinit=50.0):
''' Reads info from stored job on disk to make title with stats on segmentation.
Returns
-------
titleStr : string of form "sampler after 100 min (30 laps). Kinit=50, Kfinal=8."
'''
lapQuery = str(lapQuery)
Kinit = str(Kinit)
path = Helper.PlotUtil.MakePath(Jnonsticky[algName + ' K=' + str(int(float(Kinit)))] + "/1/")
lineNum = !grep -n $lapQuery $path/laps-saved-params.txt
lineNum = lineNum[0].split(":")[0] # convert 25:1000 to 25
lineNum_p = lineNum + "p"
lap = !sed -n $lineNum_p $path/laps-saved-params.txt
time_sec = !sed -n $lineNum_p $path/times-saved-params.txt
Kfinal = !sed -n $lineNum_p $path/Keff-saved-params.txt
lap = lap[0]
time_sec = time_sec[0]
Kfinal = Kfinal[0]
time_min = float(time_sec) / 60 # convert to minutes
return "%s: K=%s after %s laps in %.0f min." % (algName, Kfinal, lapQuery, time_min)
MakeTitleForSegmentation("sampler", 1000)
'sampler: K=11 after 1000 laps in 40 min.'
lapToShow = 2000
algName = 'sampler'
ax = Helper.PlotUtil.plotStateSeq(Jnonsticky[algName + ' K=50'], taskids='1',
sequences=sequences, seqNames=seqNames,
xticks=xticks, showELBOInTitle=0, lap=lapToShow)
ax[0].set_title(MakeTitleForSegmentation(algName, lapToShow));
if doSaveEPS:
pylab.savefig('DDToyHMM_EstZ_%s_lap=%d.eps' % (algName, lapToShow));
lapToShow = 5000
algName = 'sampler'
ax = Helper.PlotUtil.plotStateSeq(Jnonsticky[algName + ' K=50'], taskids='1',
sequences=sequences, seqNames=seqNames,
xticks=xticks, showELBOInTitle=0, lap=lapToShow)
ax[0].set_title(MakeTitleForSegmentation(algName, lapToShow));
if doSaveEPS:
pylab.savefig('DDToyHMM_EstZ_%s_lap=%d.eps' % (algName, lapToShow));
Takes about 3 hrs, bc some laps do viterbi as well as local fwd/bwd alg step.
lapToShow = 2000
algName = 'stoch'
ax = Helper.PlotUtil.plotStateSeq(Jnonsticky[algName + ' K=50'], taskids='1',
sequences=sequences, seqNames=seqNames,
xticks=xticks, showELBOInTitle=0, lap=lapToShow)
ax[0].set_title(MakeTitleForSegmentation(algName, lapToShow));
if doSaveEPS:
pylab.savefig('DDToyHMM_EstZ_%s_lap=%d.eps' % (algName, lapToShow));
lapToShow = 100
algName = 'delmerge'
ax = Helper.PlotUtil.plotStateSeq(Jnonsticky[algName + ' K=50'], taskids='1',
sequences=sequences, seqNames=seqNames,
xticks=xticks, showELBOInTitle=0, lap=lapToShow)
ax[0].set_title(MakeTitleForSegmentation(algName, lapToShow).replace('delmerge', 'delete,merge'));
if doSaveEPS:
pylab.savefig('DDToyHMM_EstZ_%s_lap=%d.eps' % (algName, lapToShow));
A crucial part of our experimental set-up is that we used matched initializations for all algorithms, to try to be as fair as possible in the comparison. This means that for each experimental model condition (e.g. $\kappa=50$ and 50 initial states), the runs for the Gibbs sampler algorithm, the memoized variational algorithm, and the stochastic variational algorithms started from the same (randomly-generated) segmentation. We repeated each condition across 10 trials, so there were 10 random initial segmentations.
Below, we sketch out the "random-contiguous-blocks" procedure we used to make the initial segmentation, which is specified via the flag --initname randcontigblocks
in bnpy. The code is easily found in the source file FromScratchGauss.py.
INPUT: number of states $K$, complete dataset of $N=32$ toy data sequences
For each cluster $k$,
This procedure yields a valid, data-driven initialization for global parameters, so that our variational algorithms can immediately start optimization steps.
However, the sampler is not parameterized by posteriors over $\phi_k$. To transfer this initialization to the sampler, we first segment all the sequences (via Viterbi) from these posteriors, and then use the resulting hard segmentation to initialize the sampler. We can see from the plots below and the early iterations of the ELBO objective plots that this yields very consistent initial configurations for all methods.
lapToShow = 0
algName = 'sampler'
ax = Helper.PlotUtil.plotStateSeq(Jnonsticky[algName + ' K=50'], taskids='1',
sequences=sequences, seqNames=seqNames,
xticks=xticks, showELBOInTitle=0, lap=lapToShow)
ax[0].set_title(MakeTitleForSegmentation(algName, lapToShow));
lapToShow = 0
algName = 'memo'
ax = Helper.PlotUtil.plotStateSeq(Jnonsticky[algName + ' K=50'], taskids='1',
sequences=sequences, seqNames=seqNames,
xticks=xticks, showELBOInTitle=0, lap=lapToShow)
ax[0].set_title(MakeTitleForSegmentation(algName, lapToShow));