%pylab inline
from numpy import *
from numpy.random import rand, randn, randint
from dPCA import dPCA
Populating the interactive namespace from numpy and matplotlib
We first build surrogate data to apply dPCA to.
# number of neurons, time-points and stimuli
N,T,S = 100,250,6
# noise-level and number of trials in each condition
noise, n_samples = 0.2, 10
# build two latent factors
zt = (arange(T)/float(T))
zs = (arange(S)/float(S))
# build trial-by-trial data
trialR = noise*randn(n_samples,N,S,T)
trialR += randn(N)[None,:,None,None]*zt[None,None,None,:]
trialR += randn(N)[None,:,None,None]*zs[None,None,:,None]
# trial-average data
R = mean(trialR,0)
# center data
R -= mean(R.reshape((N,-1)),1)[:,None,None]
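As a quick sanity check (plain NumPy, independent of dPCA), the trial-averaged array should have shape (N, S, T), and after centering each neuron's mean over all conditions and time points should be essentially zero. A minimal sketch mirroring the construction above, using `numpy.random.default_rng` instead of the pylab namespace:

```python
import numpy as np

N, T, S = 100, 250, 6
noise, n_samples = 0.2, 10

rng = np.random.default_rng(0)
zt = np.arange(T) / float(T)
zs = np.arange(S) / float(S)

# trial-by-trial data: noise plus a time factor and a stimulus factor
trialR = noise * rng.standard_normal((n_samples, N, S, T))
trialR += rng.standard_normal(N)[None, :, None, None] * zt[None, None, None, :]
trialR += rng.standard_normal(N)[None, :, None, None] * zs[None, None, :, None]

# trial average and per-neuron centering
R = trialR.mean(0)                               # shape (N, S, T)
R -= R.reshape((N, -1)).mean(1)[:, None, None]

print(R.shape)                                   # (100, 6, 250)
print(np.allclose(R.reshape((N, -1)).mean(1), 0))  # True
```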
We then instantiate a dPCA model where the two parameter axes are labeled 's' (stimulus) and 't' (time), respectively. We set the regularizer to 'auto' so that the regularization parameter is optimized when we fit the data.
dpca = dPCA.dPCA(labels='st',regularizer='auto')
dpca.protect = ['t']
You chose to determine the regularization parameter automatically. This can take substantial time and grows linearly with the number of crossvalidation folds. The latter can be set by changing self.n_trials (default = 3). Similarly, use self.protect to set the list of axes that are not supposed to get shuffled (e.g. upon splitting the data into test- and training, time-points should always be drawn from the same trial, i.e. self.protect = ['t']). This can significantly speed up the code.
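The labels 's' and 't' determine how dPCA splits the centered data into marginalizations: a time part, a stimulus part, and an interaction ("mixing") part. A minimal plain-NumPy sketch of that decomposition (this mirrors the idea, not dPCA's internal code):

```python
import numpy as np

rng = np.random.default_rng(1)
N, S, T = 4, 3, 5
R = rng.standard_normal((N, S, T))
R -= R.reshape((N, -1)).mean(1)[:, None, None]       # center per neuron

Rt  = R.mean(1, keepdims=True) * np.ones((1, S, 1))  # time marginalization (average over stimuli)
Rs  = R.mean(2, keepdims=True) * np.ones((1, 1, T))  # stimulus marginalization (average over time)
Rst = R - Rt - Rs                                    # interaction (mixing) residual

print(np.allclose(Rt + Rs + Rst, R))                 # True: the parts sum back to the data
```

dPCA then finds demixed components within each of these marginalizations.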
Now fit the data (R) using the model we just instantiated. Note that the trial-by-trial data is only needed when we want to optimize over the regularization parameter.
Z = dpca.fit_transform(R,trialR)
Start optimizing regularization. Starting trial 1 / 3 Starting trial 2 / 3 Starting trial 3 / 3 Optimized regularization, optimal lambda = 0.000449987958058483 Regularization will be fixed; to compute the optimal parameter again on the next fit, please set opt_regularizer_flag to True.
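The optimization above selects a ridge-style regularization strength by cross-validation. As a minimal plain-NumPy sketch of why this parameter matters (ordinary ridge regression, not dPCA's internal code), a larger lambda shrinks the fitted coefficients toward zero:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((50, 5))
w_true = np.array([1.0, -2.0, 0.0, 0.5, 3.0])
y = X @ w_true + 0.1 * rng.standard_normal(50)

def ridge(X, y, lam):
    # closed-form ridge solution: (X^T X + lam I)^{-1} X^T y
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

w_unreg = ridge(X, y, 0.0)
w_shrunk = ridge(X, y, 1e4)
print(np.linalg.norm(w_shrunk) < np.linalg.norm(w_unreg))  # True: larger lambda shrinks the weights
```

Cross-validation picks the lambda that best trades off this shrinkage against fit quality on held-out trials.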
time = arange(T)
figure(figsize=(16,7))
subplot(131)
for s in range(S):
    plot(time,Z['t'][0,s])
title('1st time component')
subplot(132)
for s in range(S):
    plot(time,Z['s'][0,s])
title('1st stimulus component')
subplot(133)
for s in range(S):
    plot(time,Z['st'][0,s])
title('1st mixing component')
show()
The 1st mixing component looks like mere noise. But to be sure, we can run a significance analysis:
significance_masks = dpca.significance_analysis(R, trialR, n_shuffles=10, n_splits=10, n_consecutive=10)
Compute score of data: Finished. Compute score of shuffled data: 10 / 10. Finished.
We can highlight the significant parts of the demixed components with a black bar underneath. Note that there is no significance analysis for the time components, since there are no classes to compute the significance over.
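Each entry of `significance_masks` is a boolean mask over time points. A plain-NumPy sketch (using a small hypothetical mask, not actual dPCA output) of converting such a mask into start/stop indices of consecutive significant runs, which is convenient for drawing bars:

```python
import numpy as np

# hypothetical significance mask over 10 time points (True = significant)
mask = np.array([0, 0, 1, 1, 1, 0, 0, 1, 1, 0], dtype=bool)

# edges of consecutive True-runs: +1 marks a run start, -1 a run end
edges = np.diff(mask.astype(int))
starts = np.flatnonzero(edges == 1) + 1
stops = np.flatnonzero(edges == -1) + 1
if mask[0]:
    starts = np.r_[0, starts]
if mask[-1]:
    stops = np.r_[stops, mask.size]

runs = [(int(a), int(b)) for a, b in zip(starts, stops)]  # stop index exclusive
print(runs)  # [(2, 5), (7, 9)]
```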
time = arange(T)
figure(figsize=(16,7))
subplot(131)
for s in range(S):
    plot(time,Z['t'][0,s])
title('1st time component')
subplot(132)
for s in range(S):
    plot(time,Z['s'][0,s])
imshow(significance_masks['s'][0][None,:],extent=[0,250,amin(Z['s'])-1,amin(Z['s'])-0.5],aspect='auto',cmap='gray_r',vmin=0,vmax=1)
ylim([amin(Z['s'])-1,amax(Z['s'])+1])
title('1st stimulus component')
subplot(133)
for s in range(S):
    plot(time,Z['st'][0,s])
dZ = amax(Z['st'])-amin(Z['st'])
imshow(significance_masks['st'][0][None,:],extent=[0,250,amin(Z['st'])-dZ/10.,amin(Z['st'])-dZ/5.],aspect='auto',cmap='gray_r',vmin=0,vmax=1)
ylim([amin(Z['st'])-dZ/10.,amax(Z['st'])+dZ/10.])
title('1st mixing component')
show()