We illustrate the notion of array computation with a simple example consisting of finding, among a large list of locations, the closest one from a position of interest.
The pure Python version loops through all positions, computes the distance from the position of interest, and returns the index of the closest location.
def closest(position, positions):
x0, y0 = position
dbest, ibest = None, None
for i, (x, y) in enumerate(positions):
d = (x - x0) ** 2 + (y - y0) ** 2
if dbest is None or d < dbest:
dbest, ibest = d, i
return ibest
import random
We generate a list of random positions.
positions = [(random.random(), random.random()) for _ in xrange(10000000)]
Now we evaluate the time required to compute the closest position from (0.5, 0.5)
.
%timeit closest((.5, .5), positions)
1 loops, best of 3: 12.8 s per loop
This version uses NumPy to compute all distances in a vectorized way, without using Python loops. This version is far more efficient than the pure Python version.
First, we need to activate the pylab mode so that NumPy is loaded and the NumPy objects are available in the current namespace.
%pylab
Welcome to pylab, a matplotlib-based Python environment [backend: module://IPython.zmq.pylab.backend_inline]. For more information, type 'help(pylab)'.
Generating random positions with NumPy's rand
function is much faster than using a list.
positions = rand(10000000, 2)
positions
is a NumPy ndarray
object.
type(positions)
numpy.ndarray
It has two dimensions and a shape of (10000000, 2)
.
positions.ndim, positions.shape
(2, (10000000, 2))
We can easily get the columns of this matrix, which contain here the x and y coordinates of all positions.
x, y = positions[:,0], positions[:,1]
Now we compute the distances in a vectorized fashion, which is much more efficient than with a Python loop.
distances = (x - .5) ** 2 + (y - .5) ** 2
%timeit exec In[9]
1 loops, best of 3: 341 ms per loop
%timeit ibest = distances.argmin()
10 loops, best of 3: 24.2 ms per loop
Using NumPy is about 35 times faster than using pure Python code here.