This notebook is a simple demonstration of how you may be able to predict the phase of the gait cycle from the states from a simple linear regression.
warning This notebook doesn't seem to run with Pandas < 0.14.0. I get an error calling results.rsquared
.
import sys
sys.path.append('..')
import matplotlib.pyplot as plt
from pandas import concat
import statsmodels.formula.api as smf
from gaitanalysis.gait import plot_steps
/home/moorepants/anaconda/envs/walk/lib/python2.7/site-packages/pandas/io/excel.py:626: UserWarning: Installed openpyxl is not supported at this time. Use >=1.6.1 and <2.0.0. .format(openpyxl_compat.start_ver, openpyxl_compat.stop_ver))
from src import utils
from src.grf_landmark_settings import settings
%matplotlib inline
from IPython.core.pylabtools import figsize
figsize(14, 10)
Load the path to the directory with the experimental data.
trials_dir = utils.trial_data_dir()
Trials data directory is set to /home/moorepants/Data/human-gait/gait-control-identification
This is the "training" data set. We simple collect all of the gait cycles from the perturbed portion of one trial from subject "A".
trial_number = '068'
event_data_frame, meta_data, event_data_path = utils.write_event_data_frame_to_disk(trial_number)
walking_data, walking_data_path = utils.write_inverse_dynamics_to_disk(event_data_frame, meta_data, event_data_path)
params = settings[trial_number]
perturbed_steps, walking_data = utils.section_signals_into_steps(walking_data, walking_data_path,
filter_frequency=params[0],
threshold=params[1],
num_samples_lower_bound=params[2],
num_samples_upper_bound=params[3])
Trials data directory is set to /home/moorepants/Data/human-gait/gait-control-identification Temporary data directory is set to ../data Loading pre-cleaned data: ../data/cleaned-data-068-longitudinal-perturbation.h5 0.10 s Loading pre-computed inverse dynamics from ../data/walking-data-068-longitudinal-perturbation.h5. 0.08 s Loading pre-computed steps from ../data/walking-data-068-longitudinal-perturbation.h5. 0.077136
Now create a big data frame which has a column with the percent gait cycle phi
. It also contains columns for all of the state values.
dfs = []
for i, df in perturbed_steps.iteritems():
dfs.append(df)
bigdf = concat(dfs)
bigdf['phi'] = bigdf.index.values
Now specify a patsy formula for the module. We simply want to predict phi
using all of the states as the multivariate linear regressors.
sensors, controls = utils.load_sensors_and_controls()
model = 'phi ~ Q("' + '") + Q("'.join(sensors) + '")' #+ ' + Q("FP1.ForY") + Q("FP2.ForY")'
model
'phi ~ Q("Right.Ankle.PlantarFlexion.Angle") + Q("Right.Ankle.PlantarFlexion.Rate") + Q("Right.Knee.Flexion.Angle") + Q("Right.Knee.Flexion.Rate") + Q("Right.Hip.Flexion.Angle") + Q("Right.Hip.Flexion.Rate") + Q("Left.Ankle.PlantarFlexion.Angle") + Q("Left.Ankle.PlantarFlexion.Rate") + Q("Left.Knee.Flexion.Angle") + Q("Left.Knee.Flexion.Rate") + Q("Left.Hip.Flexion.Angle") + Q("Left.Hip.Flexion.Rate")'
results = smf.ols(model, data=bigdf).fit()
The results show that we have a decent model with a high $R^2$ value.
print(results.summary())
OLS Regression Results ============================================================================== Dep. Variable: phi R-squared: 0.907 Model: OLS Adj. R-squared: 0.907 Method: Least Squares F-statistic: 6971. Date: Fri, 17 Oct 2014 Prob (F-statistic): 0.00 Time: 13:11:56 Log-Likelihood: 8690.6 No. Observations: 8580 AIC: -1.736e+04 Df Residuals: 8567 BIC: -1.726e+04 Df Model: 12 ========================================================================================================= coef std err t P>|t| [95.0% Conf. Int.] --------------------------------------------------------------------------------------------------------- Intercept 1.1361 0.035 32.176 0.000 1.067 1.205 Q("Right.Ankle.PlantarFlexion.Angle") 0.4920 0.014 34.856 0.000 0.464 0.520 Q("Right.Ankle.PlantarFlexion.Rate") 0.0086 0.001 11.482 0.000 0.007 0.010 Q("Right.Knee.Flexion.Angle") -0.0145 0.011 -1.297 0.195 -0.036 0.007 Q("Right.Knee.Flexion.Rate") -0.1013 0.001 -110.647 0.000 -0.103 -0.100 Q("Right.Hip.Flexion.Angle") -0.5951 0.027 -22.308 0.000 -0.647 -0.543 Q("Right.Hip.Flexion.Rate") 0.0607 0.004 14.145 0.000 0.052 0.069 Q("Left.Ankle.PlantarFlexion.Angle") 0.0269 0.014 1.962 0.050 2.55e-05 0.054 Q("Left.Ankle.PlantarFlexion.Rate") 0.0124 0.001 16.831 0.000 0.011 0.014 Q("Left.Knee.Flexion.Angle") -0.0420 0.011 -3.801 0.000 -0.064 -0.020 Q("Left.Knee.Flexion.Rate") 0.0454 0.001 53.012 0.000 0.044 0.047 Q("Left.Hip.Flexion.Angle") 0.8355 0.026 31.640 0.000 0.784 0.887 Q("Left.Hip.Flexion.Rate") -0.1008 0.004 -25.103 0.000 -0.109 -0.093 ============================================================================== Omnibus: 1351.627 Durbin-Watson: 2.368 Prob(Omnibus): 0.000 Jarque-Bera (JB): 5590.747 Skew: 0.729 Prob(JB): 0.00 Kurtosis: 6.676 Cond. No. 151. ==============================================================================
Now we load in data from the same subject but at a different speed to test the model.
trial_number = '069'
event_data_frame, meta_data, event_data_path = utils.write_event_data_frame_to_disk(trial_number)
walking_data, walking_data_path = utils.write_inverse_dynamics_to_disk(event_data_frame, meta_data, event_data_path)
params = settings[trial_number]
perturbed_steps, walking_data = utils.section_signals_into_steps(walking_data, walking_data_path,
filter_frequency=params[0],
threshold=params[1],
num_samples_lower_bound=params[2],
num_samples_upper_bound=params[3])
Trials data directory is set to /home/moorepants/Data/human-gait/gait-control-identification Temporary data directory is set to ../data Loading pre-cleaned data: ../data/cleaned-data-069-longitudinal-perturbation.h5 0.08 s Loading pre-computed inverse dynamics from ../data/walking-data-069-longitudinal-perturbation.h5. 0.08 s Loading pre-computed steps from ../data/walking-data-069-longitudinal-perturbation.h5. 0.093573
dfs = []
for i, df in perturbed_steps.iteritems():
dfs.append(df)
bigdf = concat(dfs)
bigdf['phi'] = bigdf.index.values
The following plot shows the actual percent gait cycle and that predicted by the linear model given the states from the new trial.
plt.plot(results.predict(bigdf)[:400])
plt.plot(bigdf['phi'].values[:400])
plt.legend(['Prediction', 'Actual'])
plt.ylabel('Percent Gait Cycle')
<matplotlib.text.Text at 0x7fc95c642510>
!git rev-parse HEAD
7ba68f0160c23a61204291ca107ad570ac6f6e5a
!git --git-dir=/home/moorepants/src/GaitAnalysisToolKit/.git --work-tree=/home/moorepants/src/GaitAnalysisToolKit rev-parse HEAD
a3732352747bc03ca839df9ff02ddcbd889e636d
%install_ext http://raw.github.com/jrjohansson/version_information/master/version_information.py
Installed version_information.py. To use it, type: %load_ext version_information
%load_ext version_information
%version_information gaitanalysis, dtk, numpy, scipy, pandas, matplotlib, tables, oct2py
Software | Version |
---|---|
Python | 2.7.8 64bit [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] |
IPython | 2.3.0 |
OS | Linux 3.13.0 37 generic x86_64 with debian jessie sid |
gaitanalysis | 0.1.0dev |
dtk | 0.3.5 |
numpy | 1.8.2 |
scipy | 0.14.0 |
pandas | 0.14.0 |
matplotlib | 1.4.0 |
tables | 3.1.1 |
oct2py | 2.4.0 |
Fri Oct 17 13:11:58 2014 EDT |
!pip freeze
DynamicistToolKit==0.3.5 -e git+git@github.com:csu-hmc/GaitAnalysisToolKit.git@a3732352747bc03ca839df9ff02ddcbd889e636d#egg=GaitAnalysisToolKit-origin/HEAD Jinja2==2.7.2 MarkupSafe==0.18 PyYAML==3.11 backports.ssl-match-hostname==3.4.0.2 ipython==2.3.0 matplotlib==1.4.0 numexpr==2.3.1 numpy==1.8.2 oct2py==2.4.0 pandas==0.14.0 patsy==0.2.1 pyparsing==2.0.1 python-dateutil==1.5 pytz==2014.7 pyzmq==14.3.0 scipy==0.14.0 six==1.8.0 statsmodels==0.5.0 sympy==0.7.5 tables==3.1.1 tornado==3.2.1 uncertainties==2.4.6.1 wsgiref==0.1.2