In this example, we show how to accelerate a pure Python algorithm with Cython, just by giving some type information about the local variables.
We will implement the Eratosthenes Sieve algorithm to find all prime numbers smaller than a given number. The first version is coded in pure Python.
def primes1(n):
primes = [False, False] + [True] * (n - 2)
i = 2
while i < n:
# we do not deal with composite numbers
if not primes[i]:
i += 1
continue
k = i * i
# mark multiples of i as composite numbers
while k < n:
primes[k] = False
k += i
i += 1
return [i for i in xrange(2, n) if primes[i]]
Let's evaluate the performance of this first version.
n = 10000
primes1(20)
[2, 3, 5, 7, 11, 13, 17, 19]
%timeit primes1(n)
100 loops, best of 3: 10.8 ms per loop
Now, we load the Cython magic extension to write Cython code right in the notebook.
%load_ext cythonmagic
All we do is to add %%cython
in the first line of the cell.
%%cython
def primes2(n):
primes = [False, False] + [True] * (n - 2)
i = 2
while i < n:
if not primes[i]:
i += 1
continue
k = i * i
while k < n:
primes[k] = False
k += i
i += 1
return [i for i in xrange(2, n) if primes[i]]
primes2(20)
[2, 3, 5, 7, 11, 13, 17, 19]
The speed up improvement is modest.
timeit primes2(n)
100 loops, best of 3: 6.97 ms per loop
Now, we will precise the type of each local variable so that Cython can considerably optimize the code.
%%cython
def primes3(int n):
primes = [False, False] + [True] * (n - 2)
cdef int i = 2
cdef int k = 0
while i < n:
if not primes[i]:
i += 1
continue
k = i * i
while k < n:
primes[k] = False
k += i
i += 1
return [i for i in xrange(2, n) if primes[i]]
timeit primes3(n)
1000 loops, best of 3: 1.36 ms per loop
We achieve a 8x speed improvement just by specifying the variable types.