Part 1¶

The effect of each bet on your capital is multiplicative - your capital is multiplied by $1 + 2f$ on heads and by $1 - f$ on tails - so that your fortune at the end depends only on the total number of heads and tails, not on the order in which they occured.

In particular, for a given $f$, there will be a number $h(f)$ such that if the number of heads flipped is greater than or equal to $h(f)$, our final fortune will be greater than one billion.

In [1]:

import matplotlib.pyplot as plt
import numpy as np

f = 0.25
hs = range(1001)
final_amounts = [(1 + 2 * f)**h * (1 - f)**(1000 - h) for h in hs]

big_enough = [h for h, final_amount in zip(hs, final_amounts) if final_amount >= 10**9]

fig, ax = plt.subplots(figsize=(12, 8))
ax.semilogy(hs, final_amounts)
ax.axhline(1e9, color='red', linestyle='--')
ax.axvspan(min(big_enough), max(big_enough), color='green', alpha=0.2)
ax.set_xlim(0, 1000)
ax.set_ylabel('Final fortune')
ax.set_xlabel('Number of heads')
ax.set_title('{0} heads needed for f = {1}'.format(min(big_enough), f));

We can solve directly for the number of heads needed for a given $f$.

$$ (1 + 2f)^h (1 - f)^{1000 - h} = 10^9$$$$ h \log_{10} (1 + 2f) + (1000 - h) \log_{10} (1 - f) = 9 $$$$ h = \left \lceil{\frac{9 - 1000 \log_{10} (1 - f)}{\log_{10} (1 + 2f) - \log_{10} (1 - f)}}\right \rceil $$

In [2]:

def min_h_greater_given_f(f, mult=2):
    ''' Returns the minimimum number of heads needed to end up with at least
        10^9 pounds after 1000 flips if your capital is multiplied by 
        (1 + mult * f) for each head and by (1 - f) for each tail. If even
        1000 heads are not enough, returns 1001.
    '''
    if f == 0:
        # Impossible to end up above 1 if wagering 0.
        h = 1001
    elif f == 1:
        # Need to never get a tail if wagering everything.
        h = 1000
    else:
        exact = (9 - 1000 * np.log10(1 - f)) / (np.log10(1 + mult * f) - np.log10(1 - f))
        h = min(1001, np.ceil(exact))
    return h

An optimal $f$ minimizes the number of heads needed. Notice that because the number of heads is discrete, a range of $f$ values will be optimal.

In [3]:

fs = np.linspace(0, 1, 1000)
hs = [min_h_greater_given_f(f) for f in fs]

min_h = min(hs)
optimal_fs = [f for f, h in zip(fs, hs) if h == min_h]

fig, (full_ax, zoom_ax) = plt.subplots(1, 2, figsize=(12, 6))

full_ax.plot(fs, hs)
full_ax.set_ylim(0, 1001)
full_ax.set_xlabel('f')
full_ax.set_ylabel('Heads needed')

zoom_ax.plot(fs, hs)
zoom_ax.set_ylim(min_h - 10, min_h + 10)
zoom_ax.set_xlim(min(optimal_fs) - 0.1, max(optimal_fs) + 0.1);
zoom_ax.set_xlabel('f')
zoom_ax.set_ylabel('Heads needed');

The probability of achieving a given number of heads is given by a Binomial(1000, 0.5) distribution. We are interested in the probability of getting greater than or equal to a given number of heads. We can compute this ourselves from the formula...

In [4]:

from scipy.misc import comb

hs = range(1002)
binomial_coefficients = [comb(1000, h) for h in hs]
pdf = np.array(binomial_coefficients) / 2.**1000
p_geq = np.cumsum(pdf[::-1])[::-1]

fig, (pdf_ax, p_geq_ax) = plt.subplots(1, 2, figsize=(12, 6))

pdf_ax.plot(hs, pdf)
pdf_ax.axis('tight')
pdf_ax.set_ylabel('Probability that H = h')
pdf_ax.set_xlabel('h')

p_geq_ax.plot(hs, p_geq)
p_geq_ax.set_ylabel('Probability that H >= h')
p_geq_ax.set_xlabel('h')
p_geq_ax.axis('tight');

... or use scipy's built-in binomial distribution object.

In [5]:

from scipy.stats import binom

p_geq_alternate = [binom(1000, 0.5).sf(h - 1) for h in range(1002)]

print np.allclose(p_geq, p_geq_alternate)

True

The solution is the probability of getting at least the minimum number of heads.

In [6]:

print p_geq[min_h]

0.999992836186

Part 2¶

The problem as stated multiplies your wager by 2 on a win. What is the smallest this factor can be while still leaving the probability of ending up above one billion greater than 0.5, assuming that you play with an optimal f given the value of the factor?

In [7]:

def best_p_given_mult_crude(mult, return_f=False):
    ''' For a given value mult of the factor, returns the probability 
        of ending up a billionaire given optimal f.
    '''
    fs = np.linspace(0, 1, 1000)
    min_h = min(min_h_greater_given_f(f, mult) for f in fs)
    best_p = p_geq[min_h]
    return best_p

In [8]:

fig, axs = plt.subplots(1, 2, figsize=(16, 8))
lims_list = [(1, 2), (1.48, 1.52)]

for ax, (x_min, x_max) in zip(axs, lims_list):
    mults = np.linspace(x_min, x_max, 1000)
    ps = [best_p_given_mult_crude(mult) for mult in mults]

    ax.plot(mults, ps)

    ax.axis('tight')
    ax.set_ylabel('P(billionaire) given optimal f')
    ax.set_xlabel('Factor on heads')

We know that p < 0.5 if the factor is 1 and p > 0.5 if the factor is 2. We can repeatedly bisect this interval, maintaining the invariant that the left point of each new interval always produces a p < 0.5 and the right point always produces a p > 0.5.

In [9]:

def perform_bisection(best_p_finder):
    low, high = 1., 2.
    num_digits = 15
    while (high - low) > 10**-num_digits:
        mid = (low + high) / 2
        mid_p = best_p_finder(mid)
        if mid_p < 0.5:
            low = mid
        else:
            high = mid
        
    print 'A factor of {0:0.{precision}f} is high enough, but {1:0.{precision}f} is not.'.format(high, low, precision=num_digits + 1)
    return high

In [10]:

min_factor = perform_bisection(best_p_given_mult_crude)

A factor of 1.5046762137696037 is high enough, but 1.5046762137696028 is not.

I was wrong!¶

As Group 2 has pointed out, the above is wrong in the sense that it is not as small as possible to the claimed number of digits. The problem is that the method for finding the optimal $f$ was too crude. I was checking 1000 equally spaced points between 0 and 1. If the range of optimal $f$'s is small enough to fall entirely between these points, it won't be found, and the reported minimum number of heads will be to high. For the smaller value of the factor they report, 100,000 points are enough to show this.

In [11]:

smaller_mult = 1.504673877089423

fig, axs = plt.subplots(1, 2, figsize=(12, 6))
num_points_list = [1000, 100000]

for ax, num_points in zip(axs, num_points_list):
    fs = np.linspace(0, 1, num_points)
    hs = [min_h_greater_given_f(f, mult=smaller_mult) for f in fs]

    min_h = min(hs)
    optimal_fs = [f for f, h in zip(fs, hs) if h == min_h]

    ax.plot(fs, hs)
    ax.axvspan(min(optimal_fs), max(optimal_fs), color='green', alpha=0.2)
    ax.set_ylim(min_h - 10, min_h + 10)
    ax.set_xlim(0.05, 0.3);
    ax.set_xlabel('f')
    ax.set_title('{0:,} points'.format(num_points))
    ax.set_ylabel('Heads needed');

Of course, if 1,000 points wasn't enough, how do we know 1,000,000 is? In retrospect, it is clear that as you approach the infimum of all valid factors from the right, the range of optimal $f$'s will become arbitrarily small, so for any number of equally spaced points, we can get close enough to the infimum to make the range of optimal $f$'s fall through the cracks.

To get around this, we can do more sophisticated numerical minimization of the equation for $h$ in terms of $f$. Since this requires $h$ to be a continuous function of $f$, we need to do this while $h$ is still allowed to take fractional values, then take the ceil after minimizing. (This was John's idea.)

In [12]:

from scipy.optimize import minimize_scalar

def best_p_given_mult(mult, return_f=False):
    ''' For a given value mult of the factor, returns the probability 
        of ending up a billionaire given optimal f.
    '''
    get_fractional_h = lambda f: (9 - 1000 * np.log10(1 - f)) / (np.log10(1 + mult * f) - np.log10(1 - f))
    minimization_result = minimize_scalar(get_fractional_h,
                                          bounds=(0.001, 0.999),
                                          method='Bounded',
                                          tol=1e-16,
                                         )
    fractional_h = minimization_result.fun
    min_h = min(1001, np.ceil(fractional_h))
    best_p = p_geq[min_h]
    if return_f:
        return best_p, minimization_result.x
    else:
        return best_p

In [13]:

even_smaller_mult = perform_bisection(best_p_given_mult)

A factor of 1.5046738770873436 is high enough, but 1.5046738770873427 is not.

This is smaller than the number reported by group 2 by more than $10^{-15}$.

In [14]:

smaller_mult - even_smaller_mult

Out[14]:

2.079447725122918e-12

Let's confirm that there exists an $f$ that makes this a valid factor.

In [15]:

p, optimal_f = best_p_given_mult(even_smaller_mult, return_f=True)
min_h_greater_given_f(optimal_f, even_smaller_mult)

Out[15]:

500.0

So I'm back on top!