Coin flipping, round 2! Computational statistics!¶

Let's change the coin flipping code from the other homework this week (http://nbviewer.ipython.org/urls/raw.github.com/beacon-center/intro-computational-science/master/hw-week5-coinflips.ipynb) to be a little more convenient.

Our goal, below, is to be able to reliably distinguish between fair and unfair coin flipping functions, to a p < 0.05.

In [2]:

import random

def flip_a_fair_coin(N):
    coin = [0, 1]           # 0/1 instead of H/T
    
    record = []
    for i in range(N):
        record.append(random.choice(coin))

    return record

def flip_an_unfair_coin(N):
    coin = [0, 1, 1, 1]           # 0/1 instead of H/T
    
    record = []
    for i in range(N):
        record.append(random.choice(coin))

    return record

In [3]:

x = flip_a_fair_coin(5)
print 'A completely fair coin would yield an average of 2.5:', sum(x) / 5.

A completely fair coin would yield an average of 2.5: 0.8

Plotting distributions of coin flips¶

Computers make it easy to try things lots of times! Here is how you plot the distribution of X averages of N coin flips.

In [4]:

# do X independent coin flip experiments, N times each.
def distribution_of_coin_flips(N, X, coin_flipping_fn):
    results = []
    for i in range(X):
        avg = sum(coin_flipping_fn(N)) / float(N)
        results.append(avg)
    return results

In [5]:

d = distribution_of_coin_flips(100, 500, flip_a_fair_coin)
e = distribution_of_coin_flips(100, 500, flip_an_unfair_coin)

In [6]:

hist(d, normed=True)
hist(e, normed=True)
axis(xmin=0, xmax=1)

Out[6]:

(0, 1, 0.0, 12.0)

T-tests!¶

Next we are going to apply a one sample Student's t-test -- see these convenient links:

Student's T-test

Student's t-distribution

Basically, what you're doing here is comparing the distributions (plotted above) to see how much overlap they have. If the distributions have lots of overlap then it will be hard to determine that they are different (e.g. that one coin is fair and the other is unfair). This is dependent on # of samples and the unfairness of the coins.

Below is an implementation taken from the t-test and t-distribution pages, above.

In [7]:

import math

def one_sample_t_test(sample_mean, sample_stddev, sample_n, expected_mean):
    numerator = sample_mean - expected_mean
    denominator = sample_stddev / float(math.sqrt(sample_n))
    
    return abs(numerator / denominator)

p05_value = 6.314

# apply like so:
flip_record = flip_a_fair_coin(20)

# avg = ...
# stddev = ...
# t_statistic = one_sample_t_test(avg, stddev, len(flip_record), 0.5)
# if t_statistic < p05_value:
#    print 'FAIR'
# else:
#    print 'UNFAIR'

The homework!¶

For the functions 'flip1', 'flip2', and 'flip3', imported below, determine which of them represent fair and which of them represent unfair coins.

A: plot the distributions of their averages.

B: apply the one-sample t-test defined above, to a certainty of p < 0.05 -- which functions are fair and which are unfair?

C: for each 'flip' function write a loop to roughly determine the number of coin flips you need to record in order to reach a conclusion about fair or unfair. (Yes, that's vague. Deal. :)

In [8]:

from hw_week5_coin_fns import flip1, flip2, flip3

print flip1(5)
print flip2(5)
print flip3(5)

[1, 1, 0, 0, 0]
[1, 1, 1, 1, 1]
[1, 1, 0, 1, 1]

In [11]:

d = distribution_of_coin_flips(100, 500, flip1)
e = distribution_of_coin_flips(100, 500, flip2)
f = distribution_of_coin_flips(100, 500, flip3)

hist(d, normed=True)
hist(e, normed=True)
hist(f, normed=True)

axis(xmin=0, xmax=1)

Out[11]:

<matplotlib.legend.Legend at 0x7fe774dc1e90>

In [14]:

# so, clearly, flip1 is biased.  what about flip2 and flip3? they look *slightly* different...

import math
def avg(fliplist):
    return sum(fliplist) / float(len(fliplist))
    
def dev(fliplist):
    mean = avg(fliplist)
    diffs = []
    for k in fliplist:
        diffs.append((k - mean) ** 2)
    diffs = sum(diffs)
    diffs = math.sqrt(diffs / float(len(fliplist)))
    return diffs

In [22]:

N = 1000000
flip1_record = flip1(N)
flip2_record = flip2(N)
flip3_record = flip3(N)

flip1_t = one_sample_t_test(avg(flip1_record), dev(flip1_record), len(flip1_record), 0.5)
if flip1_t < p05_value:
    print 'flip1: fair'
else:
    print 'flip1: unfair'
    

flip2_t = one_sample_t_test(avg(flip2_record), dev(flip2_record), len(flip2_record), 0.5)
if flip2_t < p05_value:
    print 'flip2: fair'
else:
    print 'flip2: unfair'

    
flip3_t = one_sample_t_test(avg(flip3_record), dev(flip3_record), len(flip3_record), 0.5)
if flip3_t < p05_value:
    print 'flip3: fair'
else:
    print 'flip3: unfair'

flip1: unfair
flip2: fair
flip3: unfair

In [ ]:

# curious to see what people did for the loop :)