III-NumPy - Numerical Python¶

Lecturer: José Pedro Silva¹ - silva_at_math.uni-wuppertal.de

NumPy is the backbone of scientific computing in Python. It provides the high-dimensional data structures (arrays), the operations on them, and interfaces to C and Fortran.

Usually it is imported as np

In [1]:

import numpy as np

For ease of writing in these tutorials we will populate the entire namespace with numpy

In [2]:

from numpy import *

"NumPy's main object is the homogeneous multidimensional array. It is a table of elements (usually numbers), all of the same type, indexed by a tuple of positive integers. In Numpy dimensions are called axes. The number of axes is rank."

Let's create an array then

Create array from list¶

In [4]:

v = [1,2,3,4]
av = np.array(v)
av

Out[4]:

array([1, 2, 3, 4])

Create matrix from list (of lists)¶

In [5]:

m = [[1,2],[3,4]]
am = array(m)
am

Out[5]:

array([[1, 2],
       [3, 4]])

Arrays have different methods and attributes¶

In [8]:

am.size #total number of elements

Out[8]:

4

In [9]:

am.shape   #array shape (equivalent to Matlab's size)

Out[9]:

(2, 2)

In [12]:

am.ndim

Out[12]:

2

In total, a lot of attributes and methods are available by default

In [13]:

len(dir(am))

Out[13]:

One important attribute is dtype. NumPy arrays are statically typed: every element of an array has the same type. (For the next generation of array containers, see Blaze.) If the dtype is not specified, it is inferred from the data when the array is created.

In [14]:

am.dtype

Out[14]:

dtype('int64')

In [15]:

array([1,2,3]).dtype

Out[15]:

dtype('int64')

In [16]:

array([1.1,2.2,3.4]).dtype

Out[16]:

dtype('float64')

In [17]:

array([1+2j,2+4j,3]).dtype

Out[17]:

dtype('complex128')

In [18]:

array([True,False]).dtype

Out[18]:

dtype('bool')
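As a minimal sketch of what the fixed dtype means in practice (the variable names are illustrative and not part of the lecture): a float assigned into an integer array is truncated to fit, and changing the element type requires astype or an explicit dtype at creation.

```python
import numpy as np

ints = np.array([1, 2, 3])          # dtype inferred as an integer type
ints[0] = 7.9                       # the float is truncated to fit the integer dtype
print(ints[0])                      # 7, not 7.9

floats = ints.astype(np.float64)    # astype returns a converted copy
floats[0] = 7.9                     # now the fractional part is kept
print(floats.dtype, floats[0])

explicit = np.array([1, 2, 3], dtype=np.complex128)  # dtype stated explicitly
print(explicit.dtype)
```

The original array ints is unchanged by astype, so both versions can be kept side by side.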

There is an extensive list of available dtypes

In [19]:

sctypes

Out[19]:

{'complex': [numpy.complex64, numpy.complex128, numpy.complex256],
 'float': [numpy.float16, numpy.float32, numpy.float64, numpy.float128],
 'int': [numpy.int8, numpy.int16, numpy.int32, numpy.int64],
 'others': [bool, object, str, unicode, numpy.void],
 'uint': [numpy.uint8, numpy.uint16, numpy.uint32, numpy.uint64]}

Array generating functions¶

In [20]:

arange(0,10,2)

Out[20]:

array([0, 2, 4, 6, 8])

In [21]:

arange(-1,1,0.1)

Out[21]:

array([ -1.00000000e+00,  -9.00000000e-01,  -8.00000000e-01,
        -7.00000000e-01,  -6.00000000e-01,  -5.00000000e-01,
        -4.00000000e-01,  -3.00000000e-01,  -2.00000000e-01,
        -1.00000000e-01,  -2.22044605e-16,   1.00000000e-01,
         2.00000000e-01,   3.00000000e-01,   4.00000000e-01,
         5.00000000e-01,   6.00000000e-01,   7.00000000e-01,
         8.00000000e-01,   9.00000000e-01])
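The -2.22044605e-16 where 0 would be expected is floating-point round-off caused by the non-integer step. A hedged sketch of one common alternative, assuming it is acceptable to include the endpoint, is to give linspace the number of points instead of the step size:

```python
import numpy as np

stepped = np.arange(-1, 1, 0.1)      # 20 points, the endpoint 1 is excluded
spaced = np.linspace(-1, 1, 21)      # 21 points, the endpoint 1 is included

print(len(stepped), len(spaced))
print(spaced[10])                    # the midpoint of the interval
print(np.allclose(stepped, spaced[:-1]))   # same grid up to round-off
```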

In [22]:

linspace(0,10,6)

Out[22]:

array([  0.,   2.,   4.,   6.,   8.,  10.])

In [23]:

_.dtype

Out[23]:

dtype('float64')

In [24]:

linspace(0,10,6,dtype='uint32')

Out[24]:

array([ 0,  2,  4,  6,  8, 10], dtype=uint32)

In [25]:

_.dtype

Out[25]:

dtype('uint32')

In [26]:

linspace(-1,10,6,dtype='uint32')

Out[26]:

array([4294967295,          1,          3,          5,          7,
               10], dtype=uint32)

In [27]:

finfo('float64')

Out[27]:

finfo(resolution=1e-15, min=-1.7976931348623157e+308, max=1.7976931348623157e+308, dtype=float64)

In [28]:

iinfo(np.int8), iinfo('uint8')

Out[28]:

(iinfo(min=-128, max=127, dtype=int8), iinfo(min=0, max=255, dtype=uint8))
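The value 4294967295 in Out[26] appears because -1 cannot be represented in uint32. One way to catch this before casting is to compare the data against the limits reported by iinfo. The sketch below assumes a small helper named fits_in, which is hypothetical and not part of NumPy:

```python
import numpy as np

def fits_in(values, dtype):
    """Return True if every value lies within the range of an integer dtype.

    Hypothetical helper for illustration; it is not part of NumPy.
    """
    info = np.iinfo(dtype)
    values = np.asarray(values)
    return bool(values.min() >= info.min and values.max() <= info.max)

points = np.linspace(-1, 10, 6)        # float64 values, starting at -1
print(fits_in(points, np.uint32))      # False: -1 is below the uint32 minimum of 0
print(fits_in(points, np.int8))        # True: all values lie within [-128, 127]
```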

In [29]:

logspace(0,1,10)

Out[29]:

array([  1.        ,   1.29154967,   1.66810054,   2.15443469,
         2.7825594 ,   3.59381366,   4.64158883,   5.9948425 ,
         7.74263683,  10.        ])

In [30]:

x,y = mgrid[0:5,0:5]
x, y

Out[30]:

(array([[0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1],
        [2, 2, 2, 2, 2],
        [3, 3, 3, 3, 3],
        [4, 4, 4, 4, 4]]), array([[0, 1, 2, 3, 4],
        [0, 1, 2, 3, 4],
        [0, 1, 2, 3, 4],
        [0, 1, 2, 3, 4],
        [0, 1, 2, 3, 4]]))

When we already have the coordinates along each axis, we call meshgrid

In [31]:

x = linspace(0,1,10)
y = linspace(-1,1,5)
meshgrid(x,y)

Out[31]:

[array([[ 0.        ,  0.11111111,  0.22222222,  0.33333333,  0.44444444,
          0.55555556,  0.66666667,  0.77777778,  0.88888889,  1.        ],
        [ 0.        ,  0.11111111,  0.22222222,  0.33333333,  0.44444444,
          0.55555556,  0.66666667,  0.77777778,  0.88888889,  1.        ],
        [ 0.        ,  0.11111111,  0.22222222,  0.33333333,  0.44444444,
          0.55555556,  0.66666667,  0.77777778,  0.88888889,  1.        ],
        [ 0.        ,  0.11111111,  0.22222222,  0.33333333,  0.44444444,
          0.55555556,  0.66666667,  0.77777778,  0.88888889,  1.        ],
        [ 0.        ,  0.11111111,  0.22222222,  0.33333333,  0.44444444,
          0.55555556,  0.66666667,  0.77777778,  0.88888889,  1.        ]]),
 array([[-1. , -1. , -1. , -1. , -1. , -1. , -1. , -1. , -1. , -1. ],
        [-0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5],
        [ 0. ,  0. ,  0. ,  0. ,  0. ,  0. ,  0. ,  0. ,  0. ,  0. ],
        [ 0.5,  0.5,  0.5,  0.5,  0.5,  0.5,  0.5,  0.5,  0.5,  0.5],
        [ 1. ,  1. ,  1. ,  1. ,  1. ,  1. ,  1. ,  1. ,  1. ,  1. ]])]
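A typical use of the two coordinate arrays is to evaluate a function of two variables on the whole grid at once. The following is a small sketch; the function x² + y² is chosen only for illustration:

```python
import numpy as np

x = np.linspace(0, 1, 10)
y = np.linspace(-1, 1, 5)
X, Y = np.meshgrid(x, y)       # both have shape (len(y), len(x)) = (5, 10)

Z = X**2 + Y**2                # evaluated elementwise at every grid point
print(X.shape, Y.shape, Z.shape)
print(Z[0, -1])                # value at x = 1, y = -1
```

Note that the first axis of the result follows y and the second follows x, which matches the output shown above.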

A lot of times we will want to generate random data to test our implementations¶

In [32]:

from numpy import random

In [33]:

random.normal(0,1)

Out[33]:

0.6633650258161853

In [34]:

random.normal(-1,0.001)

Out[34]:

-1.0001122377014229

In [35]:

random.normal(0,1,[2,3])

Out[35]:

array([[-0.0965079 ,  0.61404973, -0.64773592],
       [ 0.71630596,  0.76143585, -0.46887797]])

In [36]:

random.poisson()

Out[36]:

In [37]:

random.rand(5,5)

Out[37]:

array([[ 0.87390908,  0.24764657,  0.99397698,  0.38955504,  0.48710422],
       [ 0.73542102,  0.87108142,  0.27943569,  0.04596862,  0.93223845],
       [ 0.12292627,  0.03837154,  0.94424234,  0.54406483,  0.25399445],
       [ 0.04583676,  0.09379588,  0.91877364,  0.51265172,  0.36080634],
       [ 0.56637395,  0.39093857,  0.52816819,  0.65556558,  0.94067937]])
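When random data is used for testing, it often helps if a run can be repeated exactly. A minimal sketch, assuming the legacy np.random.seed interface that matches the random.rand calls used here; the seed value 42 is arbitrary:

```python
import numpy as np

np.random.seed(42)                 # arbitrary seed, chosen only for illustration
first = np.random.rand(2, 3)

np.random.seed(42)                 # resetting the seed restarts the same sequence
second = np.random.rand(2, 3)

print(np.array_equal(first, second))   # True
```

Without the seed, two calls (or two people running the same cell) will in general get different numbers.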

Indexing¶

Unlike Matlab, where parentheses are used for indexing, NumPy uses square brackets

Note: Check with your colleagues whether you both have the same array a in the next step. It is very important to have the same values at this point.

In [46]:

a = random.rand(5,5)
a

Out[46]:

array([[ 0.75916444,  0.94064948,  0.35674098,  0.76391983,  0.02329327],
       [ 0.26212745,  0.55539317,  0.71104502,  0.31895431,  0.56786133],
       [ 0.43040634,  0.2584529 ,  0.34181508,  0.24693744,  0.71610839],
       [ 0.78713362,  0.21469213,  0.00365139,  0.21454292,  0.94248957],
       [ 0.53490537,  0.61891915,  0.73147886,  0.72756939,  0.26000907]])

In [47]:

a[0,0]

Out[47]:

0.75916443575824311

In [48]:

a[0]

Out[48]:

array([ 0.75916444,  0.94064948,  0.35674098,  0.76391983,  0.02329327])

In [49]:

a[:,0]

Out[49]:

array([ 0.75916444,  0.26212745,  0.43040634,  0.78713362,  0.53490537])

Very important: Slicing and indexing with integers return views of the original array, whereas indexing with a list (as in a[[0]]) returns a copy. Can you see any difference?

In [50]:

aview1 = a[[0],:]
aview1

Out[50]:

array([[ 0.75916444,  0.94064948,  0.35674098,  0.76391983,  0.02329327]])

In [51]:

aview2 = a[0]
aview2

Out[51]:

array([ 0.75916444,  0.94064948,  0.35674098,  0.76391983,  0.02329327])

In [52]:

aview3 = a[[0]]
aview3

Out[52]:

array([[ 0.75916444,  0.94064948,  0.35674098,  0.76391983,  0.02329327]])

In [53]:

print(aview1.shape, aview2.shape, aview3.shape)

(1, 5) (5,) (1, 5)
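The shapes are not the only difference: a[0] shares memory with a, while indexing with a list such as a[[0]] produces an independent copy. A short sketch using np.shares_memory (the array and variable names are illustrative):

```python
import numpy as np

a = np.arange(25.0).reshape(5, 5)

row_view = a[0]          # basic indexing: a view, shape (5,)
row_copy = a[[0]]        # list (fancy) indexing: a copy, shape (1, 5)

print(np.shares_memory(a, row_view))   # True
print(np.shares_memory(a, row_copy))   # False

row_view[0] = -1.0       # writes through to a
row_copy[0, 1] = -2.0    # leaves a untouched
print(a[0, :2])          # [-1.  1.]
```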

Slicing¶

In [56]:

from IPython.display import YouTubeVideo
YouTubeVideo('q_2TBbfMLzs',start=15)

Out[56]:

In [57]:

a

Out[57]:

array([[ 0.75916444,  0.94064948,  0.35674098,  0.76391983,  0.02329327],
       [ 0.26212745,  0.55539317,  0.71104502,  0.31895431,  0.56786133],
       [ 0.43040634,  0.2584529 ,  0.34181508,  0.24693744,  0.71610839],
       [ 0.78713362,  0.21469213,  0.00365139,  0.21454292,  0.94248957],
       [ 0.53490537,  0.61891915,  0.73147886,  0.72756939,  0.26000907]])

In [58]:

a[:,:]

Out[58]:

array([[ 0.75916444,  0.94064948,  0.35674098,  0.76391983,  0.02329327],
       [ 0.26212745,  0.55539317,  0.71104502,  0.31895431,  0.56786133],
       [ 0.43040634,  0.2584529 ,  0.34181508,  0.24693744,  0.71610839],
       [ 0.78713362,  0.21469213,  0.00365139,  0.21454292,  0.94248957],
       [ 0.53490537,  0.61891915,  0.73147886,  0.72756939,  0.26000907]])

In [59]:

a[0:2,1:3]

Out[59]:

array([[ 0.94064948,  0.35674098],
       [ 0.55539317,  0.71104502]])

In [60]:

a = arange(10)

In [61]:

a[1:6:3]

Out[61]:

array([1, 4])

In [62]:

a[::2]

Out[62]:

array([0, 2, 4, 6, 8])

In [63]:

a[::-1]

Out[63]:

array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])

In [64]:

%%timeit 
a[::-1]

1000000 loops, best of 3: 222 ns per loop

In [65]:

%%timeit
reversed(a)

1000000 loops, best of 3: 182 ns per loop
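The two timings do not measure the same work: a[::-1] builds a new array object that views the data in reverse order, whereas reversed(a) only creates a lazy Python iterator and touches no elements. A small sketch of the difference:

```python
import numpy as np

a = np.arange(10)

rev_view = a[::-1]            # an ndarray that views a's data in reverse order
rev_iter = reversed(a)        # a Python iterator; no array is produced yet

print(type(rev_view).__name__)             # ndarray
print(isinstance(rev_iter, np.ndarray))    # False
print(np.array(list(rev_iter)))            # materialising the iterator gives the same values
```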

In [66]:

print(a[5:], a[:4], a[-2:], a[:-4])

[5 6 7 8 9] [0 1 2 3] [8 9] [0 1 2 3 4 5]

In [67]:

a = random.rand(5,5)
a

Out[67]:

array([[ 0.63611189,  0.45387041,  0.84324789,  0.41564385,  0.8736876 ],
       [ 0.15186863,  0.75865389,  0.11447856,  0.03593258,  0.35814001],
       [ 0.76592388,  0.7361502 ,  0.36358767,  0.25983878,  0.94569104],
       [ 0.43403234,  0.23080921,  0.64767593,  0.50096602,  0.27094573],
       [ 0.79041938,  0.41486588,  0.4061916 ,  0.33896987,  0.68529799]])

In [68]:

a[0]

Out[68]:

array([ 0.63611189,  0.45387041,  0.84324789,  0.41564385,  0.8736876 ])

In [69]:

rows = [1,3]
cols = [0,2]
a[rows], a[:,cols], a[rows,cols]

Out[69]:

(array([[ 0.15186863,  0.75865389,  0.11447856,  0.03593258,  0.35814001],
        [ 0.43403234,  0.23080921,  0.64767593,  0.50096602,  0.27094573]]),
 array([[ 0.63611189,  0.84324789],
        [ 0.15186863,  0.11447856],
        [ 0.76592388,  0.36358767],
        [ 0.43403234,  0.64767593],
        [ 0.79041938,  0.4061916 ]]),
 array([ 0.15186863,  0.64767593]))
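Note that a[rows, cols] pairs the two lists element by element, so it picks the entries (1, 0) and (3, 2) rather than a 2x2 block. If the full block of every row/column combination is wanted, np.ix_ is one option. A sketch on an array whose entries are easy to check by hand:

```python
import numpy as np

a = np.arange(25).reshape(5, 5)   # a[i, j] == 5*i + j, so results are easy to check
rows = [1, 3]
cols = [0, 2]

paired = a[rows, cols]            # elements (1, 0) and (3, 2)
block = a[np.ix_(rows, cols)]     # every combination of rows and cols

print(paired)                     # [ 5 17]
print(block)                      # [[ 5  7]
                                  #  [15 17]]
```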

Caution! round vs np.round

In [70]:

mask = np.around(a)
mask = mask.astype('bool')

In [71]:

a[mask]

Out[71]:

array([ 0.63611189,  0.84324789,  0.8736876 ,  0.75865389,  0.76592388,
        0.7361502 ,  0.94569104,  0.64767593,  0.50096602,  0.79041938,
        0.68529799])
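np.round (and its alias np.around) rounds ties to the nearest even value, which differs from the built-in round in Python 2, where ties are rounded away from zero. For values in [0, 1) the rounding mask above therefore selects exactly the entries greater than 0.5. A sketch with hand-picked values (illustrative, not the lecture's data):

```python
import numpy as np

halves = np.array([0.5, 1.5, 2.5, 3.5])
print(np.round(halves))            # ties go to the nearest even value: 0, 2, 2, 4

a = np.array([0.2, 0.5, 0.5000001, 0.9])
mask_rounded = np.around(a).astype(bool)   # 0.5 rounds to 0, so it is excluded
mask_compared = a > 0.5
print(np.array_equal(mask_rounded, mask_compared))   # True
```

Writing the comparison directly, as in the cells that follow, states the intent more clearly than going through rounding.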

In [72]:

mask = a > 1.0
a[mask]

Out[72]:

array([], dtype=float64)

In [73]:

mask = a < 1.0
a[mask]

Out[73]:

array([ 0.63611189,  0.45387041,  0.84324789,  0.41564385,  0.8736876 ,
        0.15186863,  0.75865389,  0.11447856,  0.03593258,  0.35814001,
        0.76592388,  0.7361502 ,  0.36358767,  0.25983878,  0.94569104,
        0.43403234,  0.23080921,  0.64767593,  0.50096602,  0.27094573,
        0.79041938,  0.41486588,  0.4061916 ,  0.33896987,  0.68529799])

In [74]:

mask = a > 0.7
a[mask]

Out[74]:

array([ 0.84324789,  0.8736876 ,  0.75865389,  0.76592388,  0.7361502 ,
        0.94569104,  0.79041938])

In [75]:

where(mask)

Out[75]:

(array([0, 0, 1, 2, 2, 2, 4]), array([2, 4, 1, 0, 1, 4, 0]))

Linear Algebra¶

In [76]:

from numpy.linalg import *

In [77]:

x = arange(10)
b = random.rand(10,1)
A = random.randn(10,10)
print(x)

[0 1 2 3 4 5 6 7 8 9]

In [78]:

x*2

Out[78]:

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

In [79]:

x+5

Out[79]:

array([ 5,  6,  7,  8,  9, 10, 11, 12, 13, 14])

In [80]:

x**2

Out[80]:

array([ 0,  1,  4,  9, 16, 25, 36, 49, 64, 81])

In [81]:

x*x

Out[81]:

array([ 0,  1,  4,  9, 16, 25, 36, 49, 64, 81])

And one of the best features: broadcasting

In [82]:

from IPython.display import Image
Image('https://scipy-lectures.github.io/_images/numpy_broadcasting.png',width=800)

Out[82]:

In [83]:

x.shape, b.shape

Out[83]:

((10,), (10, 1))

In [84]:

x+b

Out[84]:

array([[ 0.62627322,  1.62627322,  2.62627322,  3.62627322,  4.62627322,
         5.62627322,  6.62627322,  7.62627322,  8.62627322,  9.62627322],
       [ 0.01516076,  1.01516076,  2.01516076,  3.01516076,  4.01516076,
         5.01516076,  6.01516076,  7.01516076,  8.01516076,  9.01516076],
       [ 0.28310379,  1.28310379,  2.28310379,  3.28310379,  4.28310379,
         5.28310379,  6.28310379,  7.28310379,  8.28310379,  9.28310379],
       [ 0.79613997,  1.79613997,  2.79613997,  3.79613997,  4.79613997,
         5.79613997,  6.79613997,  7.79613997,  8.79613997,  9.79613997],
       [ 0.6313566 ,  1.6313566 ,  2.6313566 ,  3.6313566 ,  4.6313566 ,
         5.6313566 ,  6.6313566 ,  7.6313566 ,  8.6313566 ,  9.6313566 ],
       [ 0.29106039,  1.29106039,  2.29106039,  3.29106039,  4.29106039,
         5.29106039,  6.29106039,  7.29106039,  8.29106039,  9.29106039],
       [ 0.6707564 ,  1.6707564 ,  2.6707564 ,  3.6707564 ,  4.6707564 ,
         5.6707564 ,  6.6707564 ,  7.6707564 ,  8.6707564 ,  9.6707564 ],
       [ 0.67791273,  1.67791273,  2.67791273,  3.67791273,  4.67791273,
         5.67791273,  6.67791273,  7.67791273,  8.67791273,  9.67791273],
       [ 0.25119872,  1.25119872,  2.25119872,  3.25119872,  4.25119872,
         5.25119872,  6.25119872,  7.25119872,  8.25119872,  9.25119872],
       [ 0.39701644,  1.39701644,  2.39701644,  3.39701644,  4.39701644,
         5.39701644,  6.39701644,  7.39701644,  8.39701644,  9.39701644]])
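The result has shape (10, 10) because the shapes are compared from the right and any axis of length 1 (or a missing axis) is stretched to match the other operand. A smaller sketch of the same rule, with the stretching written out explicitly via np.tile for comparison:

```python
import numpy as np

x = np.arange(4)                  # shape (4,)
b = np.arange(3).reshape(3, 1)    # shape (3, 1)

# (4,) is treated as (1, 4), so (1, 4) + (3, 1) gives (3, 4).
total = x + b
print(total.shape)
print(total)

# The same result written out explicitly with tiled copies of each operand
explicit = np.tile(x, (3, 1)) + np.tile(b, (1, 4))
print(np.array_equal(total, explicit))   # True
```

Broadcasting gives the same values without building the tiled copies.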

In [85]:

x2 = solve(A,b)

In [86]:

A*x2-b

Out[86]:

array([[ -3.95409374e-01,   4.13685767e-01,   2.14409468e-01,
         -6.11842916e-01,   9.99325848e-02,  -1.80705335e-01,
          4.60119301e-01,   1.85241297e-02,   1.83146509e-01,
         -5.73630120e-01],
       [ -6.88983026e-01,  -9.55154870e-01,   2.54459630e-01,
          2.43030937e-01,  -2.06568041e-01,  -4.28551990e-01,
          6.72499735e-01,  -4.51001736e-01,   1.46594431e-02,
         -4.69708206e-01],
       [ -3.46408297e-01,  -1.19438477e-01,  -3.27836245e-01,
         -1.09431832e-01,  -3.93504767e-01,  -1.93695090e-01,
         -2.53422338e-01,  -4.14724995e-01,  -2.97374203e-01,
         -3.93128055e-01],
       [ -7.78827743e-01,  -9.17404612e-01,  -7.45494711e-01,
         -9.22583512e-01,  -9.41219026e-01,  -7.53176435e-01,
         -8.45786322e-01,  -6.66898316e-01,  -7.99559214e-01,
         -8.22526185e-01],
       [ -1.03200940e+00,  -6.94543850e-01,  -1.15150462e+00,
         -2.35762676e-01,  -5.29343164e-01,  -4.93288471e-01,
         -5.16766367e-01,  -1.07569529e+00,  -8.58115139e-02,
         -1.35220025e+00],
       [ -7.80664600e-01,   2.68440722e-01,  -2.41399829e-01,
         -3.06844995e-01,   3.31572377e-01,   9.34426675e-01,
         -5.17301177e-01,   5.37634677e-01,   2.54035523e-04,
          1.01212903e-01],
       [ -1.72659782e+00,  -1.61006484e+00,  -1.73012235e-01,
         -1.22882031e+00,  -1.87884725e+00,  -1.00480080e+00,
         -2.72855063e-01,   7.06177709e-01,  -1.02047558e+00,
         -1.53621766e+00],
       [ -4.25488959e-01,  -9.28754504e-01,  -1.09543169e+00,
         -8.60858296e-01,  -4.05409203e-01,  -5.91867877e-01,
         -8.29765544e-01,  -6.12709584e-01,  -4.15395974e-01,
         -6.09868912e-01],
       [ -2.50242819e-01,  -2.87483231e-01,  -2.42666143e-01,
         -2.39819874e-01,  -2.31899871e-01,  -2.65062491e-01,
         -2.44599600e-01,  -2.49597843e-01,  -2.76084286e-01,
         -2.74808172e-01],
       [ -2.44695529e-01,  -2.42136197e-01,  -2.85091549e-01,
         -2.06551965e-01,  -2.18391036e-01,  -2.75058884e-01,
         -4.99375982e-01,  -3.36798199e-01,  -2.14202261e-01,
         -2.67609670e-01]])

In [87]:

A.dot(x2)-b

Out[87]:

array([[  0.00000000e+00],
       [ -4.16333634e-17],
       [  2.22044605e-16],
       [ -1.11022302e-16],
       [  0.00000000e+00],
       [ -6.66133815e-16],
       [ -1.11022302e-16],
       [  4.44089210e-16],
       [ -5.55111512e-17],
       [  0.00000000e+00]])

In [88]:

np.allclose(A.dot(x2),b)

Out[88]:

True
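The large values in Out[86] are not a solver error: for arrays, * is an elementwise product, so A*x2 broadcasts the (10, 1) column against the (10, 10) matrix instead of computing a matrix product. The residual has to be formed with dot, as in In [87]. A sketch on a small hand-made system (the numbers are illustrative):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([[3.0],
              [5.0]])

x = np.linalg.solve(A, b)          # solves A x = b; x has shape (2, 1)

elementwise = A * x                # broadcasting: shape (2, 2), not a matrix product
product = A.dot(x)                 # matrix-vector product, shape (2, 1)

print(elementwise.shape, product.shape)
print(np.allclose(product, b))     # True
```

In Python 3.5 and later, A @ x is an equivalent way to write A.dot(x).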

In [89]:

invA = inv(A)
np.allclose(invA.dot(A),identity(shape(A)[0]))

Out[89]:

True

In [90]:

A.transpose()

Out[90]:

array([[ 0.37271708,  1.46971809,  0.57517022,  0.11021651,  1.27367445,
        -0.80129291, -2.09517358,  0.80999239,  0.05558992,  1.3773463 ],
       [ 1.67895704,  2.05028301, -1.48702545, -0.77201885,  0.20087213,
         0.91568713, -1.86392975, -0.80491598, -2.11011398,  1.40048879],
       [ 1.35723633, -0.58808677,  0.40642882,  0.32242783,  1.65354952,
         0.08127515,  0.98770556, -1.33975964,  0.49620902,  1.0120694 ],
       [ 0.02329694, -0.56315889, -1.57794353, -0.80498979, -1.2575923 ,
        -0.0258333 , -1.10740186, -0.58704659,  0.66173285,  1.72225559],
       [ 1.17241964,  0.41749102,  1.00307793, -0.92363086, -0.32430052,
         1.01900926, -2.39729186,  0.8744255 ,  1.12231799,  1.615202  ],
       [ 0.71934504,  0.90167482, -0.81234691,  0.27352293, -0.43891831,
         2.00564881, -0.66286563,  0.27610583, -0.80624321,  1.10278876],
       [ 1.75392144, -1.49990155, -0.26967886, -0.31606838, -0.36428213,
        -0.37026875,  0.78958106, -0.48727433,  0.38376945, -0.92557577],
       [ 1.04099014,  0.95064144,  1.19588005,  0.82280369,  1.4125518 ,
         1.35625362,  2.73233834,  0.20922773,  0.09309828,  0.54451729],
       [ 1.30676399, -0.0650428 ,  0.1296577 , -0.02176826, -1.73428674,
         0.47676915, -0.6939701 ,  0.84237936, -1.44721209,  1.65307857],
       [ 0.08498941,  0.99144335,  0.99965521, -0.16798512,  2.2915605 ,
         0.64199981, -1.71739006,  0.21834304, -1.37300009,  1.17014751]])

In [91]:

A.sum()

Out[91]:

22.004705986670082

In [92]:

np.sum(A)

Out[92]:

22.004705986670082

In [93]:

sum(A)

Out[93]:

22.004705986670082

In [94]:

sum is np.sum

Out[94]:

True
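The three calls agree only because from numpy import * replaced the built-in sum with np.sum. Without the star import, the built-in sum iterates over the first axis and adds the rows, which gives a different result for a 2-D array. A sketch assuming Python 3, where the original built-in is reachable through the builtins module:

```python
import builtins
import numpy as np

A = np.arange(6).reshape(2, 3)     # [[0, 1, 2], [3, 4, 5]]

print(np.sum(A))                   # 15: all elements
print(np.sum(A, axis=0))           # [3 5 7]: column sums
print(builtins.sum(A))             # [3 5 7]: the built-in adds the rows one by one
```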


Numpy for Matlab Users

Other syntax conversions between languages
