#!/usr/bin/env python
# coding: utf-8

# # Primarily about computing factorials

# Browsing Wikipedia one day I came across an interesting line from the [Factorial page under computation](http://en.wikipedia.org/wiki/Factorial#Computation):

# > The asymptotically best efficiency is obtained by computing $n!$ from its prime factorization. As documented by Peter Borwein, prime factorization allows $n!$ to be computed in time $\mathcal{O} \big(n {(\log n \log\log n)}^2\big)$, provided that a fast multiplication algorithm is used (for example, the Schönhage–Strassen algorithm). Peter Luschny presents source code and benchmarks for several efficient factorial algorithms, with or without the use of a prime sieve.
#
# In this post I describe two algorithms for computing $n!$, code them up in Python, plot and discuss their runtimes, share links to detailed analysis, and try to parse the important differences between the two algorithms.

# The reason such an algorithm can be more efficient is that we can compute the unique prime factorisation of the factorial first and then use it to build the factorial itself. I'll go through the steps to do this now.
#
# Let's consider $9!$ as an example.

# The naive way to compute it is by the definition, $$9! = 9\cdot 8\cdot7\cdot6\cdot5\cdot4\cdot3\cdot2 = \prod_{k=2}^{9}k$$ Instead, let's try computing it by factoring first.

# We need to count the number of times each prime divides into $9!$; consider the explicit product for $9!$ above.

# Starting with $2$: it divides into $2,4,6,8$, divides once more into $4$ and $8$, and a third time into $8$. So the number of times $2$ divides $9!$ is $4+2+1=7$, i.e. $2^7$ divides $9!$.

# The next prime is $3$: it divides into $3,6,9$ and once more into $9$, so the number of times $3$ divides $9!$ is $3+1=4$, i.e. $3^4$ divides $9!$.

# The next primes are $5$ and $7$, both of which divide only once into $9!$.

# Putting this together we get $$9! = 2^7 \cdot 3^4 \cdot 5^1 \cdot 7^1 = 128\cdot81\cdot5\cdot7 = 362880$$

# What we did was count how many times each prime and its powers divide into $9\cdot 8\cdot7\cdot6\cdot5\cdot4\cdot3\cdot2$, i.e. how many multiples of each prime and of its powers there are between $1$ and $9$.

# This is easily computed with division. Revisiting $2$: $\frac{9}{2^2} = 2.25$, and we can ignore the $.25$ since we only care about how many times it divides evenly. The mathematical way of writing this uses the floor function $\lfloor x \rfloor$, which rounds down to the nearest integer, so we want $\Big\lfloor \frac{9}{2^2} \Big\rfloor = 2$. So to get the same count as before, we sum $\Big\lfloor\frac{9}{2}\Big\rfloor + \Big\lfloor\frac{9}{2^2}\Big\rfloor + \Big\lfloor\frac{9}{2^3}\Big\rfloor = 4+2+1=7$.

# So for a general $n$, to get the prime factorisation we do this for each prime $p$ and each of its powers.

# The power of $p$ dividing $n!$ is $$\nu _{p}(n!) = \sum_{k=1}^{\lfloor \log_pn \rfloor}\Big\lfloor\frac{n}{p^k}\Big\rfloor$$

# The $\lfloor \log_pn \rfloor$ ensures we stop summing once $p^k > n$, since from then on $\Big\lfloor\frac{n}{p^k}\Big\rfloor = 0$.

# So writing out $n!$ explicitly this way, we get
# $$n! = \prod_{p \leq n} p^{ \nu _{p}(n!)}$$
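# As a quick check of the formula, here's a small throwaway helper (written just for illustration, it isn't used again below) that computes $\nu_p(9!)$ for each prime up to $9$ and rebuilds $9!$ from the factorisation.

# In[ ]:


def nu(n, p):
    # Legendre's formula: sum floor(n / p**k) for as long as p**k <= n
    v, power = 0, p
    while power <= n:
        v += n // power
        power *= p
    return v

print([nu(9, p) for p in (2, 3, 5, 7)])                        # [7, 4, 1, 1]
print(2**nu(9, 2) * 3**nu(9, 3) * 5**nu(9, 5) * 7**nu(9, 7))   # 362880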
# In[2]:


get_ipython().run_line_magic('matplotlib', 'inline')
import math
import numpy as np
import timeit
import matplotlib.pyplot as plt


# We will use a sieve of Eratosthenes to get the set of all the primes up to $n$. The code, which I found on [Rosetta Code](http://rosettacode.org/wiki/Sieve_of_Eratosthenes#Using_set_lookup), is a nice, concise use of Python [generators](https://www.jeffknupp.com/blog/2013/04/07/improve-your-python-yield-and-generators-explained/) and [sets](https://docs.python.org/2/library/sets.html).

# As we iterate through `eratosthenes(n)`, it checks whether the current number $i$ is in the `composite` set; if not, it yields $i$ as a prime and adds its multiples to the set, starting from its square. We can start updating from $i^2$ because the smaller multiples have already been added, e.g. $2i$ was added when we crossed off the multiples of $2$.

# In[3]:


def eratosthenes(n):
    composite = set()
    for i in range(2, n+1):
        if i not in composite:
            yield i
            composite.update(range(i**2, n+1, i))


# `factorFactorial(n)` computes $n!$ using the method discussed above.

# In[4]:


def factorFactorial(n):
    factorial = 1
    for p in eratosthenes(n):
        # exponent of p in n!: sum floor(n / p**k) for as long as p**k <= n
        v_p = 0
        p_k = p
        while p_k <= n:
            v_p += n // p_k
            p_k *= p
        factorial = factorial*(p**v_p)
    return factorial


# `iterFactorial(n)` computes $n!$ using the definition.

# In[5]:


def iterFactorial(n):
    fact = 1
    for i in range(1, n+1):
        fact = fact * i
    return fact


# Let's do a quick test computing $10^3!$, which has over 2500 digits, so it's fairly large already.

# In[39]:


get_ipython().run_line_magic('timeit', 'factorFactorial(10**3)')
get_ipython().run_line_magic('timeit', 'iterFactorial(10**3)')


# The `iterFactorial` function is a lot faster here, but there is a lot of overhead in the factorisation; let's try a larger example, $10^5!$.

# In[53]:


get_ipython().run_line_magic('timeit', '-n 1 factorFactorial(10**5)')
get_ipython().run_line_magic('timeit', '-n 1 iterFactorial(10**5)')


# OK, phew, so it was a bit faster to go through the factoring process, although it's only worth it for sufficiently large $n$. Let's plot how long the two algorithms take, with some averaging to smooth out random fluctuations.

# In[ ]:


def loop(factorial, myRange, averageNum):
    timesAvg = []
    for i in myRange:
        times = []
        for j in range(averageNum):
            start_time = timeit.default_timer()
            factorial(i**3)
            elapsed_time = timeit.default_timer() - start_time
            times.append(elapsed_time)
        timesAvg.append(np.mean(times))
    return timesAvg


# In[90]:


from mpl_toolkits.axes_grid1.inset_locator import zoomed_inset_axes
from mpl_toolkits.axes_grid1.inset_locator import mark_inset


# In[95]:


fig, ax = plt.subplots(figsize=[10, 10])
loops = []
myRange = range(1, 40)
# note: the old axisbg keyword has been removed from matplotlib; facecolor is its replacement
ax2 = plt.axes([.2, .4, .3, .3], facecolor='w')
for i, f in enumerate([factorFactorial, iterFactorial, math.factorial]):
    loops.append(loop(f, myRange, 10))
    ax.plot(np.power(myRange, 3), loops[i], label=f.__name__)
    ax2.plot(np.power(myRange, 3)[:20], loops[i][:20])
    #ax2.set_yticklabels([])
ax.legend(loc='upper left')
ax.set_xlabel(r"$n!$", fontsize=20)
ax.set_ylabel("time", fontsize=20)
mark_inset(ax, ax2, loc1=2, loc2=4, fc="none", ec="0.5")
ax.set_title("Comparison of different factorial algorithms")
plt.show()
plt.close()


# The `factorFactorial(n)` function seems to gain the upper hand around `n=5000`, and the gap widens from then on. As a matter of interest I also plotted Python's built-in `math.factorial(n)`; it relies on a more sophisticated algorithm and is written in C. It vastly outperforms `factorFactorial(n)`, although it's not as fast as other prime-factoring algorithms written in C.
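# Before moving on to the complexity results, a quick sanity check (a minimal snippet, nothing more): all three implementations should agree exactly, since Python integers have arbitrary precision.

# In[ ]:


for n in (0, 1, 2, 9, 100, 1000):
    assert factorFactorial(n) == iterFactorial(n) == math.factorial(n)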
# In the paper [On the complexity of calculating factorials, Peter B. Borwein](http://www.cecm.sfu.ca/personal/pborwein/PAPERS/P29.pdf), algorithms very similar to the ones above are compared thoroughly, the results being:

# Naive multiplication, as in `iterFactorial(n)`, is $\mathcal{O}\big(n \,\, M\big(n\log n \big) \big)$.

# An algorithm similar to `factorFactorial(n)` is $\mathcal{O}\big( \log \log n \,\, M\big(n \log n\big)\big)$, where $M\big(n\big)$ is the complexity of multiplying two $n$-digit numbers.

# I think a lot of what the paper describes is slightly beyond the scope of this post, so I'll try to give a brief explanation of why `factorFactorial(n)` is faster for $n \gg 1$. Notice that the difference between the two complexities above is the factor $n$ being replaced by the much slower-growing $\log\log n$; the origin of these terms might shed some light on the differences.
#
# The $n$ in `iterFactorial(n)` comes from doing the $n$ multiplications.
#
# The $\log \log n$ in `factorFactorial(n)` comes from summing the cost of the steps for each prime. At each prime $p_i$ we cross off $n/p_i$ multiples, so per element the work is proportional to $\sum_{p_i \leq n} \frac{1}{p_i} = \mathcal{O} \big(\log\log n\big)$. [(Sum of reciprocals of the primes)](https://en.wikipedia.org/wiki/Divergence_of_the_sum_of_the_reciprocals_of_the_primes)

# There are lots of possible optimisations; I kept the code as simple as possible to highlight the algorithm. Finding further optimisations could be an interesting exercise, e.g. slightly modifying the algorithm, using another method to find the prime numbers, or rewriting the algorithm with NumPy (a rough sketch of the NumPy idea is at the end of this post).

# Here are a couple of resources I used in making this post: [Prime numbers and Numpy, Rebrained blog](http://rebrained.com/?p=458), [Python factorial source code](https://hg.python.org/cpython/file/7937aa6b7e92/Modules/mathmodule.c#l1218), [On the complexity of calculating factorials, Peter B. Borwein](http://www.cecm.sfu.ca/personal/pborwein/PAPERS/P29.pdf), [A great webpage detailing and comparing lots of factorial algorithms](http://www.luschny.de/math/factorial/FastFactorialFunctions.htm)
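# As one illustration of the NumPy idea mentioned above, here is a minimal sketch of a boolean-array sieve (illustrative only, not benchmarked in this post); it returns plain Python ints so it could be swapped in for `eratosthenes(n)` inside `factorFactorial(n)`.

# In[ ]:


def eratosthenesNumpy(n):
    # Boolean-array sieve of Eratosthenes: mark the composites, keep the rest.
    is_prime = np.ones(n + 1, dtype=bool)
    is_prime[:2] = False                    # 0 and 1 are not prime
    for i in range(2, int(n**0.5) + 1):
        if is_prime[i]:
            is_prime[i*i::i] = False        # cross off multiples, starting from i**2
    # convert back to Python ints so that p**v_p keeps arbitrary precision
    return [int(p) for p in np.flatnonzero(is_prime)]

print(eratosthenesNumpy(30))   # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]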