The fundamental idea is to blend somewhat inaccurate measurements with a somewhat inaccurate model of how the system behaves, to get a filtered estimate that is better than either information source by itself.
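As a toy sketch of that blending (mine, not the post's): a one-dimensional Kalman update weights the model's prediction and the measurement by their variances, and the blended estimate ends up with lower variance than either input.

def kalman_update(pred_mean, pred_var, meas, meas_var):
    gain = pred_var / (pred_var + meas_var)       # Kalman gain in [0, 1]
    mean = pred_mean + gain * (meas - pred_mean)  # blended estimate
    var = (1 - gain) * pred_var                   # smaller than both input variances
    return mean, var

# model says 10.0 (var 4.0); sensor reads 12.0 (var 1.0) -> lean on the sensor
print kalman_update(10.0, 4.0, 12.0, 1.0)  # roughly (11.6, 0.8)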
[figure: beta_figures[0]]
import numpy as np

# how does beta/alpha change over time? random walk!
transition_matrices = np.eye(2)
# is there any deterministic drift in beta/alpha? no, a random walk has none
transition_offsets = [0, 0]
# GUESS: how much does beta/alpha change due to unmodeled forces?
# (delta is a small constant, assumed defined earlier)
transition_covariance = delta / (1 - delta) * np.eye(2)
# VIX returns are 1-dimensional
n_dim_obs = 1
# how do we convert beta/alpha to VIX returns?
# each day's observation matrix is [x_t, 1], so y_t = beta * x_t + alpha
observation_matrices = np.expand_dims(
    np.vstack([[x], [np.ones(len(x))]]).T,
    axis = 1
)
# GUESS: how much do VIX returns deviate from the beta/alpha fit?
observation_covariance = .01
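These pieces plug straight into pykalman. A minimal sketch of wiring them up (the initial-state priors here are my assumptions, not the post's):

from pykalman import KalmanFilter

kf = KalmanFilter(
    n_dim_obs = n_dim_obs,
    transition_matrices = transition_matrices,
    transition_offsets = transition_offsets,
    transition_covariance = transition_covariance,
    observation_matrices = observation_matrices,
    observation_covariance = observation_covariance,
    initial_state_mean = [0, 0],          # assumed prior: beta = alpha = 0
    initial_state_covariance = np.eye(2)  # assumed prior uncertainty
)
# column 0 of state_means is the filtered beta, column 1 the filtered alpha
state_means, state_covs = kf.filter(y)    # y = the VIX return series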
To pick the two covariance GUESSes, cross-validate with a walk-forward scheme:

1. Split the series into k folds.
2. On all previous folds, learn the covariances with pykalman's em:

   kf.em(
       X = y_training,
       n_iter = 10,
       em_vars = ['transition_covariance', 'observation_covariance']
   )

3. Use the MSE on the current fold to score the fit.
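A minimal sketch of that walk-forward loop (the fold count and scoring details are illustrative assumptions, not the post's exact code):

n_folds = 5
fold_size = len(y) // n_folds
fold_mse = []
for i in range(1, n_folds):
    split = i * fold_size
    # learn the two covariance GUESSes on everything before the current fold
    kf.em(
        X = y[:split],
        n_iter = 10,
        em_vars = ['transition_covariance', 'observation_covariance']
    )
    # filter through the current fold and score only its residuals
    state_means, _ = kf.filter(y[:split + fold_size])
    beta_f = state_means[split:, 0]
    alpha_f = state_means[split:, 1]
    fold_mse.append(np.mean((beta_f * x[split:split + fold_size] + alpha_f
                             - y[split:split + fold_size]) ** 2))
print np.mean(fold_mse)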
[figure: beta_figures[1]]
beta_df.dropna(inplace=True)

# in-sample residuals: each day's beta estimate against that same day's returns
resid_kf = beta_df['beta'] * beta_df['x'] - beta_df['y']
resid_30 = beta_df['beta_30'] * beta_df['x'] - beta_df['y']
resid_250 = beta_df['beta_250'] * beta_df['x'] - beta_df['y']
resid_500 = beta_df['beta_500'] * beta_df['x'] - beta_df['y']

# mean squared error of each residual series
print np.sum(resid_kf**2) / len(resid_kf)
print np.sum(resid_30**2) / len(resid_30)
print np.sum(resid_250**2) / len(resid_250)
print np.sum(resid_500**2) / len(resid_500)
0.00145648345444
0.00136178536756
0.00162842090621
0.00176787087888

In sample, the 30-day beta actually edges out the Kalman filter estimate.
# out-of-sample residuals: yesterday's beta estimate against today's returns
resid_kf = beta_df['beta'].shift(1) * beta_df['x'] - beta_df['y']
resid_30 = beta_df['beta_30'].shift(1) * beta_df['x'] - beta_df['y']
resid_250 = beta_df['beta_250'].shift(1) * beta_df['x'] - beta_df['y']
resid_500 = beta_df['beta_500'].shift(1) * beta_df['x'] - beta_df['y']

print np.sum(resid_kf**2) / len(resid_kf)
print np.sum(resid_30**2) / len(resid_30)
print np.sum(resid_250**2) / len(resid_250)
print np.sum(resid_500**2) / len(resid_500)
0.00157659837059
0.00159214078837
0.0016753622042
0.0017999663961

Out of sample, using yesterday's beta against today's returns, the Kalman filter estimate now has the lowest MSE.
[figure: beta_figures[2]]
[figure: pairs_figures[0]]
[figure: pairs_figures[1]]
[figure: pairs_figures[2]]
f = prediction function (transition_functions)
x = states
w = transition noise ~ N(0, Q) (transition_covariance)
z = observable readings
g = mapping of x to observables (observation_functions)
v = observation noise, a.k.a. measurement noise ~ N(0, R) (observation_covariance)
where $ w_t $ and $ b_t $ are correlated, with correlation $ \rho $.
For Heston, $ k = 0.5 $. For GARCH, $ k = 1.0 $. For 3/2, $ k = 1.5 $.
Let:
$$ z_t = \frac{1}{\sqrt{1 - \rho^{2}}} (b_t - \rho w_t) $$

$ z_t $ and $ w_t $ are not correlated!
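A quick check, assuming both shocks are standard normal with $ E[b_t w_t] = \rho $:

$$ E[z_t w_t] = \frac{ E[b_t w_t] - \rho \, E[w_t^2] }{ \sqrt{1 - \rho^{2}} } = \frac{ \rho - \rho }{ \sqrt{1 - \rho^{2}} } = 0 $$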
$$ w_t = \frac{ r_t - \mu }{ \sqrt{\nu_t} } $$

$$ b_t = \sqrt{1 - \rho^{2}} \, z_t + \rho w_t $$

$$ b_t = \sqrt{1 - \rho^{2}} \, z_t + \rho \, \frac{ r_t - \mu }{ \sqrt{\nu_t} } $$

Finally...
$$ \nu_{t+1} = \nu_{t} + \theta \, (\omega - \nu_t) + \xi \, \nu_t^{k} \left[ \sqrt{1 - \rho^{2}} \, z_t + \rho \, \frac{ r_t - \mu }{ \sqrt{\nu_t} } \right] $$

$$ r_t = \mu + \sqrt{\nu_t} \, w_t $$

import math
from functools import partial

def sv_kf_functions(mu, theta, omega, xi, k, rho, log_returns):
    # transition: nu_{t+1} = nu_t + theta * (omega - nu_t) + xi * nu_t^k * b_t
    def f(current_state, transition_noise, log_return):
        # rebuild b_t from the uncorrelated noise z_t and the observed return
        b_t = math.sqrt(1 - rho ** 2) * transition_noise + rho * (log_return - mu) / math.sqrt(current_state)
        return current_state + theta * (omega - current_state) + xi * current_state ** k * b_t
    # observation: r_t = mu + sqrt(nu_t) * w_t
    def g(current_state, observation_noise):
        return mu + math.sqrt(current_state) * observation_noise
    # one transition function per time step, each with that day's return baked in
    return [partial(f, log_return=log_return) for log_return in log_returns], g
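These functions feed pykalman's UnscentedKalmanFilter directly, since it accepts a list of per-timestep transition functions. A minimal sketch; the parameter values are hypothetical placeholders, and log_returns is assumed to already be in scope:

from pykalman import UnscentedKalmanFilter

# hypothetical parameter values, for illustration only
f_list, g = sv_kf_functions(
    mu=0.0, theta=0.05, omega=0.04, xi=0.2, k=0.5, rho=-0.7,
    log_returns=log_returns
)
ukf = UnscentedKalmanFilter(
    transition_functions = f_list,
    observation_functions = g,
    transition_covariance = 1.0,     # Q: z_t ~ N(0, 1)
    observation_covariance = 1.0,    # R: w_t ~ N(0, 1)
    initial_state_mean = 0.04,       # assumed starting variance
    initial_state_covariance = 0.01  # assumed prior uncertainty
)
# filtered estimate of the latent variance nu_t, one value per observation
filtered_variance, _ = ukf.filter(log_returns)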
I'm a trader/quant.
I started blogging on Alphamaximus.
I'm also a co-founder of Bonjournal.
Follow me on Twitter.