import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
import matplotlib.mlab as mlab
%pylab inline
The first plot was generated by drawing samples from the probability distribution of a random process. 30 samples were drawn (one for each of the 30 coins), and the variance was calculated for 162 trials (the number of times each coin was flipped).
For this type of random process (a fair coin), the variance of the fraction of heads is:
$ \sigma^2 = \frac{0.5 \times 0.5}{\#\text{trials}} $
The second coin-flip plot shows the probability density itself, using the same mean and variance.
Note: I'm using a normal approximation for this binomial distribution.
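As a rough check on that approximation, the sketch below compares an exact binomial probability with its normal counterpart for 162 flips of a fair coin. It uses only the standard library so it runs independently of the notebook's imports. The helper names (`binom_pmf`, `normal_cdf`), the 75-to-87 heads window, and the half-flip continuity correction are illustrative assumptions, not part of the original analysis.

```python
import math

n_flips = 162   # flips per coin, as in the text above
p_heads = 0.5   # fair coin

def binom_pmf(k, n, p):
    """Exact probability of k heads in n flips (hypothetical helper)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def normal_cdf(x, mean, std):
    """Cumulative normal distribution via the error function (hypothetical helper)."""
    return 0.5 * (1 + math.erf((x - mean) / (std * math.sqrt(2))))

# Mean and standard deviation of the number of heads
mean_heads = n_flips * p_heads
std_heads = math.sqrt(n_flips * p_heads * (1 - p_heads))

# Probability of 75 to 87 heads inclusive (an arbitrary window around the mean)
lo, hi = 75, 87
exact = sum(binom_pmf(k, n_flips, p_heads) for k in range(lo, hi + 1))

# Normal approximation, widened by half a flip on each side (continuity correction)
approx = (normal_cdf(hi + 0.5, mean_heads, std_heads)
          - normal_cdf(lo - 0.5, mean_heads, std_heads))

print("exact: %.4f  normal approximation: %.4f" % (exact, approx))
```

With this many flips and p = 0.5, the two values should agree closely, which is why the normal curve is a reasonable stand-in for the plots.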
#distribution of random process
mean=.5
stddev=np.sqrt(.5*.5/162)
samples=30
coinflips=np.random.normal(loc=mean, scale=stddev, size=samples)
plt.plot(coinflips,np.ones(len(coinflips)),'.',markersize=40, alpha=.4)
plt.xlim(mean-3.5*stddev,mean+3.5*stddev)
plt.yticks([])
plt.xlabel("% heads")
plt.show()
#probability distribution for random process
mean=.5
stddev=np.sqrt(.5*.5/162)
x = np.linspace(mean-4*stddev,mean+4*stddev,100)
#mlab.normpdf was removed from matplotlib; scipy's stats.norm.pdf is equivalent
plt.plot(x,stats.norm.pdf(x,mean,stddev))
plt.xlim(mean-3.5*stddev,mean+3.5*stddev)
plt.xlabel("% heads")
plt.show()
The distribution of winning percentages uses a mean of 50% and a standard deviation of .072. These numbers are from
http://www.insidethebook.com/ee/index.php/site/comments/true_talent_levels_for_sports_leagues/
#distribution of baseball winning percentage
mean=.5
stddev=.072
x = np.linspace(mean-4*stddev,mean+4*stddev,100)
plt.plot(x,stats.norm.pdf(x,mean,stddev))
plt.xlim(mean-3.5*stddev,mean+3.5*stddev)
plt.xlabel("% wins")
plt.show()
#comparison of the random process and baseball distributions
mean=.5
stddev=np.sqrt(.5*.5/162)
x = np.linspace(mean-6*stddev,mean+6*stddev,100)
plt.plot(x,stats.norm.pdf(x,mean,stddev),label="random")
mean=.5
stddev=.072
x = np.linspace(mean-4*stddev,mean+4*stddev,100)
plt.plot(x,stats.norm.pdf(x,mean,stddev),label="baseball")
plt.xlim(mean-3.5*stddev,mean+3.5*stddev)
plt.yticks([])
plt.legend()
plt.show()
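To put numbers on the gap between the two curves, the sketch below compares their standard deviations and the share of each distribution that lies above a chosen winning fraction. The .600 threshold and the `normal_tail` helper are illustrative assumptions; the mean and standard deviations are the ones used in the cells above.

```python
import math

n_games = 162
random_std = math.sqrt(0.5 * 0.5 / n_games)   # coin-flip spread over 162 trials
baseball_std = 0.072                          # spread quoted for baseball winning percentage

def normal_tail(x, mean, std):
    """Probability that a normal variable exceeds x (hypothetical helper)."""
    return 0.5 * math.erfc((x - mean) / (std * math.sqrt(2)))

threshold = 0.600   # assumed cutoff, chosen only for illustration

ratio = baseball_std / random_std
tail_random = normal_tail(threshold, 0.5, random_std)
tail_baseball = normal_tail(threshold, 0.5, baseball_std)

print("baseball std / random std: %.2f" % ratio)
print("share above %.3f (random):   %.4f" % (threshold, tail_random))
print("share above %.3f (baseball): %.4f" % (threshold, tail_baseball))
```

The wider baseball curve puts noticeably more weight above the threshold than the pure coin-flip curve does.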