Matplotlib and NumPy basics¶

In [1]:

%pylab inline

Welcome to pylab, a matplotlib-based Python environment [backend: module://IPython.zmq.pylab.backend_inline].
For more information, type 'help(pylab)'.

Basic plotting¶

In [2]:

dat = [3, -1, 0.5, 4, 2]
plot(dat)

Out[2]:

[<matplotlib.lines.Line2D at 0x106119510>]

In [3]:

# plotting points
A = (0,1)
B = (1,0)
C = (2,1)

# x = [A[0], B[0], C[0]]
x = map(lambda i: i[0], [A,B,C])
y = [i[1] for i in [A,B,C]]

plot(x, y, 'bo-')
axis([-0.1, 2.1, -0.1, 1.1]); # note ; at the end

Out[3]:

[-0.1, 2.1, -0.1, 1.1]

Style and plot attributes¶

MATLAB-like API

In [4]:

x = range(11)
y1 = [i**2 for i in x]
y2 = [i**2 for i in reversed(x)]

plot(x, y1, 'ro-')
plot(x, y2, 'gs--')
xlabel('x')
ylabel('y')
title(r'$f(x) \sim x^2$')  # raw string keeps the LaTeX backslash intact
grid()

Lists vs arrays¶

In [5]:

x = range(5)
y = x + 4

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-5-7c650f8da057> in <module>()
      1 x = range(5)
----> 2 y = x + 4

TypeError: can only concatenate list (not "int") to list

In [6]:

y = 2 * x   # repeats the list instead of doubling each element
y

Out[6]:

[0, 1, 2, 3, 4, 0, 1, 2, 3, 4]

In [7]:

x = array(range(5)) # convert a list to an array object
y = 2*x + 1
print(y)
type(y)

[1 3 5 7 9]

Out[7]:

numpy.ndarray
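
The cells above can be condensed into one side-by-side comparison. This is a minimal sketch, written with explicit `np.` names and `print()` so that it does not depend on the `%pylab` namespace; the variable names are illustrative only.

```python
import numpy as np

x_list = list(range(5))
x_arr = np.array(x_list)

# On a list, * means repetition; on an array it is element-wise arithmetic.
doubled_list = 2 * x_list
doubled_arr = 2 * x_arr

# The list equivalent of 2*x + 1 needs an explicit loop or comprehension.
y_list = [2 * i + 1 for i in x_list]
y_arr = 2 * x_arr + 1

print(doubled_list)
print(doubled_arr)
print(y_list, y_arr)
```

The array version expresses the formula directly, which is the main reason the rest of this notebook works with arrays rather than lists.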

In [8]:

x = arange(11) # numpy built-in
plot(x, x**2, 'ro-')
plot(x, (10-x)**2, 'gs-')

Out[8]:

[<matplotlib.lines.Line2D at 0x1061e52d0>]

In [9]:

x = arange(-1, 1.1, 0.1) # arange works with floats
plot(x, x**2)

Out[9]:

[<matplotlib.lines.Line2D at 0x1061ef510>]

In [10]:

x = linspace(-pi, pi, 256) # specify the number of points instead of the step size
plot(x, sin(x), label=r'$\sin(x)$')
plot(x, cos(x), label=r'$\cos(x)$' )
legend(loc='upper left')

Out[10]:

<matplotlib.legend.Legend at 0x106481810>

Note on pylab import¶

"%pylab inline" does the following:

from numpy import *
from numpy.fft import *
from numpy.random import *
from numpy.linalg import *
from matplotlib.pyplot import *

# provide the recommended module abbrevs in the pylab namespace
# recommended for scripts
import matplotlib.pyplot as plt
import numpy as np
import numpy.ma as ma

Try help(cos) to see which function the name now refers to.
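
For scripts, the explicit abbreviations listed above make the origin of every name visible. The following is a small sketch of the sine/cosine computation from the earlier cell written that way; the plotting calls are shown only as comments, and `identity_holds` is a name introduced here for illustration.

```python
import numpy as np

x = np.linspace(-np.pi, np.pi, 256)
y_sin = np.sin(x)   # np.sin, not the star-imported sin
y_cos = np.cos(x)

# Sanity check: sin^2 + cos^2 should equal 1 up to rounding error.
identity_holds = np.allclose(y_sin**2 + y_cos**2, 1.0)
print(identity_holds)

# With matplotlib, the plot would use the plt prefix in the same way:
#   import matplotlib.pyplot as plt
#   plt.plot(x, y_sin, label=r'$\sin(x)$')
#   plt.legend(loc='upper left')
```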

Efficiency¶

In [11]:

import random

def rand_gauss(N):
    return [random.gauss(0,1) for _ in xrange(N)]

sample_size = 500

d1 = rand_gauss(sample_size)

d2 = np.random.randn(sample_size)

subplot(1,2,1)
hist(d1, label='gauss')
subplot(1,2,2)
hist(d2, color='red')

Out[11]:

(array([  1,   6,  32,  68, 127, 118,  89,  45,  12,   2]),
 array([-3.36075237, -2.70074813, -2.04074388, -1.38073963, -0.72073539,
       -0.06073114,  0.5992731 ,  1.25927735,  1.91928159,  2.57928584,
        3.23929008]),
 <a list of 10 Patch objects>)

In [12]:

%timeit rand_gauss(sample_size)

1000 loops, best of 3: 656 us per loop

In [13]:

%timeit np.random.randn(sample_size)

10000 loops, best of 3: 20.2 us per loop
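
Outside IPython, a similar comparison can be made with the standard `timeit` module. This is a rough sketch, assuming the same sample size as above; `repeats` is an arbitrary choice, `range` replaces `xrange` so the code also runs on Python 3, and absolute timings will differ from machine to machine.

```python
import random
import timeit

import numpy as np

def rand_gauss(n):
    # One Python-level call to random.gauss per sample.
    return [random.gauss(0, 1) for _ in range(n)]

sample_size = 500
repeats = 200   # number of calls to average over (arbitrary)

t_list = timeit.timeit(lambda: rand_gauss(sample_size), number=repeats)
t_numpy = timeit.timeit(lambda: np.random.randn(sample_size), number=repeats)

print('pure Python: %.1f us per call' % (1e6 * t_list / repeats))
print('NumPy:       %.1f us per call' % (1e6 * t_numpy / repeats))
```

The NumPy call generates the whole sample in compiled code, which is consistent with the large gap measured by `%timeit` above.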

NumPy: N-dimensional array object¶

In [14]:

x_lst = range(10)  # regular list
x_lst[0:-1:2]      # [start:stop:step]

Out[14]:

[0, 2, 4, 6, 8]

In [15]:

x_arr = arange(10)
x_arr[0:-1:2]

Out[15]:

array([0, 2, 4, 6, 8])

In [16]:

print x_lst
x_lst[0] = 'hi'
print x_lst

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
['hi', 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [17]:

x_arr[0] = 'hi'

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-17-862aa56c851a> in <module>()
----> 1 x_arr[0] = 'hi'

ValueError: invalid literal for long() with base 10: 'hi'

In [18]:

x_arr[0] = 10.5 # danger zone
x_arr

Out[18]:

array([10,  1,  2,  3,  4,  5,  6,  7,  8,  9])

NumPy arrays are statically typed and homogeneous: the element type (dtype) is fixed when the array is created, so every assigned value is cast to it. That is why 10.5 was silently truncated to 10 above.
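
One way to avoid the silent truncation is to convert to a floating-point dtype before assigning. A minimal sketch, with variable names chosen for illustration:

```python
import numpy as np

ints = np.arange(3)            # integer dtype inferred from the arguments
ints[0] = 10.5                 # cast to the integer dtype: stored as 10

floats = ints.astype(float)    # astype returns a new array with the new dtype
floats[0] = 10.5               # now stored without truncation

print(ints, ints.dtype)
print(floats, floats.dtype)
```

Note that `astype` leaves the original array untouched, so `ints` keeps its integer dtype.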

In [19]:

def a_info(arr):
    print 'dtype:', arr.dtype   # data type
    print 'nbytes:', arr.nbytes # number of bytes
    print 'ndim:', arr.ndim
    print 'shape:', arr.shape
    print 'size:', arr.size
    
a_info(x_arr)

dtype: int64
nbytes: 80
ndim: 1
shape: (10,)
size: 10

In [20]:

len(x_arr) == x_arr.size

Out[20]:

True

In [21]:

y = arange(3, dtype='float')
a_info(y)

dtype: float64
nbytes: 24
ndim: 1
shape: (3,)
size: 3

In [22]:

# type casting
a_info(y.astype('complex'))

dtype: complex128
nbytes: 48
ndim: 1
shape: (3,)
size: 3

Multidimensional arrays and indexing¶

In [23]:

M = array( [[2.0, 1, 3], [0.1, 0.2, 0.3]] )
M

Out[23]:

array([[ 2. ,  1. ,  3. ],
       [ 0.1,  0.2,  0.3]])

In [24]:

a_info(M)

dtype: float64
nbytes: 48
ndim: 2
shape: (2, 3)
size: 6

In [25]:

M[0,0]

Out[25]:

2.0

In [26]:

M[0] # select 1st row

Out[26]:

array([ 2.,  1.,  3.])

In [27]:

M[1,:] # select 2nd row

Out[27]:

array([ 0.1,  0.2,  0.3])

In [28]:

M[:, -1] # select last column

Out[28]:

array([ 3. ,  0.3])

In [29]:

# assignment works the same way
M[:,-1] = 100
M

Out[29]:

array([[   2. ,    1. ,  100. ],
       [   0.1,    0.2,  100. ]])

In [30]:

M[0:, 1:] # all rows, last 2 columns

Out[30]:

array([[   1. ,  100. ],
       [   0.2,  100. ]])

Array generation functions¶

In [31]:

print arange(-1.0, 1.0, 0.2)  # low, high (excl), step

[ -1.00000000e+00  -8.00000000e-01  -6.00000000e-01  -4.00000000e-01
  -2.00000000e-01  -2.22044605e-16   2.00000000e-01   4.00000000e-01
   6.00000000e-01   8.00000000e-01]

In [32]:

print linspace(-1.0, 1.0, 9)  # low, high (incl), nb_of_points

[-1.   -0.75 -0.5  -0.25  0.    0.25  0.5   0.75  1.  ]
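
With a floating-point step, `arange` accumulates rounding error (note the -2.22e-16 where 0 is expected in the output above), and the NumPy documentation suggests `linspace` for non-integer steps. The sketch below builds a grid with spacing 0.1 on [-1, 1] that way; the number of points is worked out by hand here as (1 - (-1)) / 0.1 + 1 = 21.

```python
import numpy as np

step = 0.1
n_points = 21                              # (1 - (-1)) / step + 1
grid = np.linspace(-1.0, 1.0, n_points)    # both endpoints are included
spacing = grid[1] - grid[0]

print(grid[0], grid[-1], len(grid))
print(np.isclose(spacing, step))
```

Specifying the number of points fixes the length of the result up front, which is harder to guarantee when the step is a float.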

In [33]:

print zeros((3,3))

[[ 0.  0.  0.]
 [ 0.  0.  0.]
 [ 0.  0.  0.]]

In [34]:

print diag((3.,2,1))

[[ 3.  0.  0.]
 [ 0.  2.  0.]
 [ 0.  0.  1.]]

In [35]:

print np.random.rand(2, 4)  # random numbers from a uniform distribution on [0, 1)

[[ 0.66800989  0.96622993  0.43213735  0.33973   ]
 [ 0.79521694  0.76777532  0.24617852  0.78354752]]

In [36]:

print np.random.randn(2, 4)  # random numbers from a normal distribution (mu=0, sigma=1)

[[ 1.00360327  2.09461226  1.1273358   0.51963957]
 [ 0.59788146 -0.17339674  0.8364903   1.04364158]]

Fancy indexing¶

In [37]:

x = linspace(-1, 1, 5)
print 'x:', x

c = (x >= 0)
print 'c:', c
print c.dtype

print 'x[c]:', x[c]

y = exp(x)
print 'y[c]', y[c]

x: [-1.  -0.5  0.   0.5  1. ]
c: [False False  True  True  True]
bool
x[c]: [ 0.   0.5  1. ]
y[c] [ 1.          1.64872127  2.71828183]

In [38]:

c = (x >= -0.5) & (x <= 0.5)  # element-wise AND of two boolean masks
x[c]

Out[38]:

array([-0.5,  0. ,  0.5])
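
Boolean masks can be combined with the element-wise operators `&` (and), `|` (or) and `~` (not). A short sketch on the same `x` as above; the mask names are illustrative. The parentheses matter, because these operators bind more tightly than comparisons.

```python
import numpy as np

x = np.linspace(-1, 1, 5)

inside = (x >= -0.5) & (x <= 0.5)   # both conditions hold
outside = ~inside                   # negation of the mask
either = (x < -0.5) | (x > 0.5)     # at least one condition holds

print(x[inside])
print(x[outside])
print(np.array_equal(outside, either))
```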

Array methods¶

In [43]:

def a_meth(arr):
    print 'min:', arr.min()
    print 'max:', arr.max()
    print 'mean:', arr.mean()
    print 'std:', arr.std()

x = randn(100)*2.3 + 1.7
a_meth(x)

min: -3.009859938
max: 7.72678056926
mean: 1.65085133248
std: 2.27740367094
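
Multiplying standard normal samples by 2.3 and adding 1.7 gives a sample with standard deviation close to 2.3 and mean close to 1.7, which is what the output above shows approximately. The sketch below repeats the experiment with a larger sample to show the estimates tightening; the seed and the sample sizes are arbitrary choices made here for reproducibility.

```python
import numpy as np

np.random.seed(0)        # arbitrary fixed seed, only for reproducibility
mu, sigma = 1.7, 2.3

small = np.random.randn(100) * sigma + mu
large = np.random.randn(100000) * sigma + mu

print('n=100:    mean=%.3f std=%.3f' % (small.mean(), small.std()))
print('n=100000: mean=%.3f std=%.3f' % (large.mean(), large.std()))
```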

Array operations¶

In [48]:

v = arange(6)
print 2 * v - 5 # linear transform

[-5 -3 -1  1  3  5]

In [61]:

v2 = v[::-1]
print v + v2    # element-wise sum

[5 5 5 5 5 5]

In [66]:

print (1 - np.exp(v))  # vectorized function

[   0.           -1.71828183   -6.3890561   -19.08553692  -53.59815003
 -147.4131591 ]

In [67]:

M = (rand(4,3) * 10).astype('int')
print 2*M

[[12  2 18]
 [18 16 12]
 [ 6 10 16]
 [18 18 10]]

In [68]:

v = array([0,1,2])
print M * v       # multiply every row by vector

[[ 0  1 18]
 [ 0  8 12]
 [ 0  5 16]
 [ 0  9 10]]
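
The last cell relies on broadcasting: a vector of length 3 is matched against the last axis of the 4x3 matrix, so each column gets its own factor. To scale each row instead, the vector needs shape (4, 1). A minimal sketch with a deterministic matrix (the values are chosen here for illustration, not taken from the random `M` above):

```python
import numpy as np

M = np.arange(12).reshape(4, 3)       # 4 rows, 3 columns
row = np.array([0, 1, 2])             # shape (3,): one factor per column
col = np.array([1, 10, 100, 1000])    # shape (4,): one factor per row

by_column = M * row                   # (4, 3) * (3,)   -> every row times `row`
by_row = M * col[:, np.newaxis]       # (4, 3) * (4, 1) -> row i times col[i]

print(by_column)
print(by_row)
```

Writing `M * col` without the added axis would raise an error, because shapes (4, 3) and (4,) cannot be aligned along the last axis.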

File I/O¶

In [69]:

M = randn(10, 3)
save('data/M.npy', M)  # dump an array into a binary file (similar to MATLAB's .mat)

In [71]:

N = load('data/M.npy')
(N - M).sum()

Out[71]:

0.0

In [76]:

savetxt('data/M.csv', M, fmt='%.3f', delimiter='\t')

In [77]:

!head data/M.csv

1.036	1.617	-0.977
0.600	-1.656	1.417
-1.850	0.007	0.246
1.088	-0.170	0.941
1.316	-1.822	0.389
-0.447	1.119	0.179
-0.393	-0.904	-1.427
1.363	-0.968	1.443
1.425	-0.509	-0.882
-0.031	0.431	0.011

In [81]:

N = genfromtxt('data/M.csv')
N

Out[81]:

array([[ 1.036,  1.617, -0.977],
       [ 0.6  , -1.656,  1.417],
       [-1.85 ,  0.007,  0.246],
       [ 1.088, -0.17 ,  0.941],
       [ 1.316, -1.822,  0.389],
       [-0.447,  1.119,  0.179],
       [-0.393, -0.904, -1.427],
       [ 1.363, -0.968,  1.443],
       [ 1.425, -0.509, -0.882],
       [-0.031,  0.431,  0.011]])

In [80]:

(N-M).sum()

Out[80]:

0.0028902802726487797
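
The small non-zero difference comes from `fmt='%.3f'`, which rounds every value to three decimals before writing. The sketch below repeats both round trips in a temporary directory and compares element-wise, since a plain sum of differences can hide errors that cancel. The directory, the file names and the use of `loadtxt` are choices made for this example.

```python
import os
import shutil
import tempfile

import numpy as np

M = np.random.randn(10, 3)
tmp_dir = tempfile.mkdtemp()                 # scratch stand-in for data/
npy_path = os.path.join(tmp_dir, 'M.npy')
txt_path = os.path.join(tmp_dir, 'M.tsv')    # tab-separated, hence .tsv

np.save(npy_path, M)                         # binary: values stored exactly
from_binary = np.load(npy_path)

np.savetxt(txt_path, M, fmt='%.3f', delimiter='\t')
from_text = np.loadtxt(txt_path, delimiter='\t')

max_text_error = np.abs(from_text - M).max()
print(np.array_equal(from_binary, M))        # exact round trip
print(max_text_error)                        # bounded by the rounding step

shutil.rmtree(tmp_dir)                       # clean up the scratch files
```

If the text file has to reproduce the array more closely, a format with more digits (for example `fmt='%.18e'`) can be used at the cost of a larger file.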