Getting ready for the analysis.
import os
import concurrent.futures
import numpy as np
from utils import utils, video
# Load the raw tracking tables for the two participant groups, plus the
# recording metadata (used below for block start/end times, static
# calibration points, and the bench position).
a_data = video.load_table('A')
b_data = video.load_table('B')
metadata = video.load_metadata()
Filter the tables using the index and the metadata describing the start and end of each block.
# Build per-group frame indices from the metadata block boundaries, then
# restrict each raw table to its group's blocks.
# The two original loops were copy-paste duplicates; drive both from one table.
a_index, b_index = [], []
group_blocks = [
    (a_index, ['block 1', 'block 2', 'block 3']),
    (b_index, ['block 5', 'block 6', 'block 7']),
]
for index, block_names in group_blocks:
    for block_name in block_names:
        block = metadata[block_name]
        # 0.5 is the sampling step passed to f_range — presumably seconds
        # between frames; TODO confirm against utils.f_range.
        index.extend(utils.f_range(block['start'], block['end'], 0.5))
groups = {'A': a_data.loc[a_index], 'B': b_data.loc[b_index]}
Analyze the percentage of existing data in the dataset (percentage of not NaN values). Fill NaNs using forward filling.
# Report data completeness (percentage of non-NaN cells) per group, then
# forward-fill the missing samples in place.
# Fixes: DataFrame.as_matrix() was removed in pandas 1.0 -> to_numpy();
# the boolean-mask count is just the mean of the mask;
# fillna(method='ffill') is deprecated -> DataFrame.ffill().
for name, group_data in sorted(groups.items()):
    exists = group_data.notnull().to_numpy().flatten()
    exist_percentage = float(exists.mean())  # fraction of non-NaN cells
    print('Group {}, percentage of not NaN values: {}'.format(name, 100 * exist_percentage))
    group_data.ffill(inplace=True)
Group A, percentage of not NaN values: 99.93267186392629 Group B, percentage of not NaN values: 99.85931513409962
Transform the view using the metadata, and clip to reasonable dance floor boundaries (after calculating irregularities).
# Re-project each group's coordinates into the absolute frame using the
# static calibration points from the metadata. The two transforms are
# independent, so run them in parallel worker processes.
known_ys = metadata['static points']['absolute']
a_known_xs = metadata['static points']['group a']
b_known_xs = metadata['static points']['group b']
with concurrent.futures.ProcessPoolExecutor() as executor:
    futures = {
        name: executor.submit(video.transform, groups[name], xs, known_ys)
        for name, xs in (('A', a_known_xs), ('B', b_known_xs))
    }
    for name, future in futures.items():
        groups[name] = future.result()
Clip to reasonable dance floor boundaries (after calculating irregularities).
# Clip coordinates to plausible dance-floor boundaries and report, per
# group, how many values fell outside them. rows_list keeps the same
# statistics in tabular form.
lower = -4
upper = 15
rows_list = []
for name in sorted(groups):
    flat = groups[name].values
    total = len(flat.flatten())
    below = len(flat[flat < lower])
    above = len(flat[flat > upper])
    inside = total - below - above
    stats = [100 * below / total, 100 * above / total, 100 * inside / total]
    rows_list.append([name] + stats)
    groups[name] = groups[name].clip(lower=lower, upper=upper)
    # NOTE(review): 'boundarie' typo kept intentionally — it matches the
    # recorded cell output below.
    print('Group: {}'.format(name))
    print('Percentage of values below boundarie: {}'.format(stats[0]))
    print('Percentage of values above boundarie: {}'.format(stats[1]))
    print('Total in boundaries: {}'.format(stats[2]))
    print()
Group: A Percentage of values below boundarie: 4.557051736357193 Percentage of values above boundarie: 2.273210489014883 Total in boundaries: 93.16973777462792 Group: B Percentage of values below boundarie: 0.05836925287356322 Percentage of values above boundarie: 0.11673850574712644 Total in boundaries: 99.82489224137932
There was a bench in the experiment area where the participants gathered together some of the time. Calculate its position based on metadata['bench'].
# Locate the bench in absolute coordinates. Each video yields its own
# estimate via that group's affine fit; the two should agree up to
# calibration error.
benches = []
for group_name, known_xs in (('a', a_known_xs), ('b', b_known_xs)):
    mat, offset = video.solve_affine_transformation(known_xs, known_ys)
    bench_raw = metadata['bench']['group {}'.format(group_name)]
    benches.append(video.transform_vec(bench_raw, mat, offset))
# How far apart the two estimates land (a measure of the calibration error).
distance = utils.euclidean_distance(*benches)
print('Distance of the error between the position values:', distance, '(meters)')
# Take the mean of the two estimates as the final bench position.
bench_pos = np.array(benches).mean(axis=0)
Distance of the error between the position values: 0.379929100616 (meters)
# Persist the cooked artifacts: one CSV of coordinates per group, the clip
# boundaries, and the averaged bench position.
for name, group in groups.items():
    group.index.name = 'frame'
    csv_path = os.path.join('cooked', 'video_group_{}.csv'.format(name.lower()))
    group.to_csv(csv_path)
np.savetxt(os.path.join('cooked', 'video_boundaries.csv'), (lower, upper))
np.savetxt(os.path.join('cooked', 'video_bench_pos.csv'), bench_pos)