# import some things
%pylab --no-import-all inline
import numpy as np
import pylab as pl
from scipy import linalg
Populating the interactive namespace from numpy and matplotlib
You just need to know one language to do almost anything !
Python interpreter: executes Python code
IPython: an advanced Python shell
NumPy: provides numerical array objects
SciPy: scientific computing (linear algebra, optimization, regression, etc.)
Matplotlib a.k.a. Pylab: 2-D visualization, "publication-ready" plots
Mayavi : 3-D visualization
Many application specific packages for e.g., machine learning, image processing, symbolic math, .. incomplete list
Get a scientific-Python environment:
)
Start the IPython shell (from terminal or Windows cmd shell):
$ ipython --pylab
The IPython Shell is an interactive shell:
Now we can write our "Hello World" program by typing:
s = "Hello World!"
print s
Hello World!
Let's say the file my_script.py
contains:
s = 'Hello World!'
print s
In IPython you can run it as follows:
%run my_script.py
Hello World!
You can use Spyder, a scientific Python IDE. Or the IPython Notebook
Start the IPython Notebook as follows
$ ipython notebook --pylab=inline
Integer variables:
>>> 1 + 1
2
>>> a = 4
floats:
>>> c = 2.1
complex (a native type in Python!):
>>> a = 1.5 + 0.5j
>>> a.real
1.5
>>> a.imag
0.5
and booleans:
>>> 3 < 4
True
>>> test = (3 > 4)
>>> test
False
>>> type(test)
<type 'bool'>
Note that you don't need to specify the type of the variable
int a = 1; # in C
Python can replace your pocket calculator with : +
, -
, *
, /
, %
(modulo)
>>> 7 * 3.
21.0
>>> 2**10
1024
>>> 8 % 3
2
WARNING : Integer division
>>> 3 / 2 # !!!
1
>>> 3 / 2. # Trick: use floats
1.5
>>> 3 / float(2) # type conversion
1.5
my_str = 'Hello World!'
print my_str
print my_str[0]
Hello World! H
Notice: Indexing in Python starts at zero (like in C)
Strings are objects with many useful methods:
print my_str.replace('World', 'Why N\' How')
print my_str.upper()
Hello Why N' How! HELLO WORLD!
An ordered container that can hold arbitrart Python objects
my_list = [1, 2, 3, 'test'] # Notice: [] creates a list
print my_list
[1, 2, 3, 'test']
We can append and insert things
my_list.append('test2')
print my_list
my_list.insert(1, 0)
print my_list
[1, 2, 3, 'test', 'test2'] [1, 0, 2, 3, 'test', 'test2']
We can access elements using their index
print my_list
print my_list[0] # first element
print my_list[-1] # last element
print my_list[-2] # second last element
[1, 0, 2, 3, 'test', 'test2'] 1 test2 test
We can also use slicing to obtain sublists
print my_list
print my_list[2:5] # Notice: index 5 is not included
[1, 0, 2, 3, 'test', 'test2'] [2, 3, 'test']
The slicing syntax is l[start:stop:step]
. This can be very useful
print my_list[:3] # first 3 elements
print my_list[-3:] # last 3 elements
print my_list[::2] # every 2nd element
print my_list[::-1] # list with order reversed
[1, 0, 2] [3, 'test', 'test2'] [1, 2, 'test'] ['test2', 'test', 3, 2, 0, 1]
A dictionary (dict
) is basically an efficient table that maps keys to
values. It is an unordered container:
phone = {'joe': 554, 'bob': 308} # using {} creates a dict
print phone # Notice: no order
{'bob': 308, 'joe': 554}
We can access elements using their key
print phone['joe']
554
And add new elements (Notice: key does not have to be a string)
phone[0] = 101
print phone
print phone.keys() # list with the keys
print phone.values() # list with the values
{0: 101, 'bob': 308, 'joe': 554} [0, 'bob', 'joe'] [101, 308, 554]
Allow the conditional execution of code
a = 10
if a == 1:
print 1
print 22
elif a == 2:
print 2
else:
print 'a lot'
a lot
Notice: Blocks are delimited by indentation (4 spaces)
Can be used to iterate over lists, dicts, etc. For example:
for word in ['cool', 'powerful', 'readable']:
print 'Python is %s !!!' % word
Python is cool !!! Python is powerful !!! Python is readable !!!
Functions are defined using def, they allow us to group code for specific tasks.
def disk_area(radius):
area = 3.14 * radius * radius
return area
print disk_area(1.0)
print disk_area(2.0)
3.14 12.56
Arguments are not copied when passed to a function (not like with Matlab)
import copy
def foo(a):
a.append(1)
b = [0]
foo(b)
print b # a has been modified !!!
[0, 1]
NumPy is:
Reference documentation: http://docs.scipy.org/doc/numpy/reference
For Matlab users: http://wiki.scipy.org/NumPy_for_Matlab_Users
import numpy as np # import numpy so we can use it
a = np.array([0, 1, 2, 3], dtype=np.float) # create array
print a
print a.ndim # number of dimensions, in Matlab `ndims(a)`
print a.shape # shape, in Matlab `size(a)`
print a.dtype # the data type of the array
[ 0. 1. 2. 3.] 1 (4,) float64
Arrays can have an arbitrary number of dimensions
# 2-D array
b = np.array([[0, 1, 2], [3, 4, 5]]) # 2 x 3 array
print b
print b.dtype # Notice: here the data type is int64
print b.shape
[[0 1 2] [3 4 5]] int64 (2, 3)
# 3-D
c = np.array([[[1], [2]], [[3], [4]]])
print c.shape # in Matlab `size(c)`
(2, 2, 1)
a = np.ones((3, 3))
print a
[[ 1. 1. 1.] [ 1. 1. 1.] [ 1. 1. 1.]]
b = np.zeros((2, 3))
print b
[[ 0. 0. 0.] [ 0. 0. 0.]]
c = np.eye(3)
print c
[[ 1. 0. 0.] [ 0. 1. 0.] [ 0. 0. 1.]]
NumPy arrays can be indexed and sliced like Python lists
a = np.diag(np.arange(3))
print a
[[0 0 0] [0 1 0] [0 0 2]]
print a[1, 1]
print a[:,1] # takes the entire second row!
1 [0 1 0]
# slicing
a = np.arange(10)
print a
print a[::2] # every 2nd element
print a[-5:] # last 5 elements
[0 1 2 3 4 5 6 7 8 9] [0 2 4 6 8] [5 6 7 8 9]
a = np.arange(10)
print a
b = a[::2]
print b
[0 1 2 3 4 5 6 7 8 9] [0 2 4 6 8]
b[0] = 100
print b
print a # a was modified as well!
[100 2 4 6 8] [100 1 2 3 4 5 6 7 8 9]
If you want a copy you have to specify it:
a = np.arange(10)
b = a[::2].copy() # force a copy
b[0] = 100
print b
print a
[100 2 4 6 8] [0 1 2 3 4 5 6 7 8 9]
This behavior can be surprising at first sight...
but it allows to save both memory and time.
NumPy has its own file format for saving and loading arrays:
a = np.arange(10)
np.save('test.npy', a)
a = 0
a = np.load('test.npy')
print a
[0 1 2 3 4 5 6 7 8 9]
But Python supports well-known (& more obscure) file formats:
scipy.io.mmread
, scipy.io.mmread
..
Matrix multiplication:
a = np.triu(np.ones((2, 2)), 1) # see help(np.triu)
print 'a:' + str(a)
b = np.diag([1, 2])
print 'b:' + str(b)
c = np.dot(a, b) # same as a.dot(b)
print 'c:' + str(c)
a:[[ 0. 1.] [ 0. 0.]] b:[[1 0] [0 2]] c:[[ 0. 2.] [ 0. 0.]]
WARNING: Element-wise multiplication vs. matrix multiplication
print a * b # element-wise multiplication
[[ 0. 0.] [ 0. 0.]]
Transpose:
a_t = a.T
print a_t
[[ 0. 0.] [ 1. 0.]]
Note: As with slicing, there is no copy. We can verify this by inspecting the arrays:
print 'a.flags:\n' + str(a.flags)
print 'a_t.flags:\n' + str(a_t.flags)
a.flags: C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False a_t.flags: C_CONTIGUOUS : False F_CONTIGUOUS : True OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False
Inverse, systems of linear equations and SVD:
from numpy import linalg # OR
from scipy import linalg # even better
A = np.triu(np.ones((3, 3)), 0)
print 'A:\n' + str(A)
B = linalg.inv(A)
C = np.dot(B, A)
print 'C:\n' + str(C)
x = linalg.solve(A, [1, 2, 3]) # linear system
U, s, V = linalg.svd(A) # SVD
vals = linalg.eigvals(A) # Eigenvalues
A: [[ 1. 1. 1.] [ 0. 1. 1.] [ 0. 0. 1.]] C: [[ 1. 0. 0.] [ 0. 1. 0.] [ 0. 0. 1.]]
Computing sums:
x = np.arange(5)
print x
print np.sum(x) # or x.sum()
[0 1 2 3 4] 10
Sum by rows and by columns:
x = np.array([[1, 1], [2, 2]])
print np.sum(x, axis=0), # columns (first dimension)
print np.sum(x, axis=1) # rows (second dimension)
[3 3] [2 4]
Same with np.mean, np.argmax, np.argmin, np.min, np.max, np.cumsum, np.sort
etc.
scipy
contains various toolboxes dedicated to common issues in scientific computing.
scipy
can be compared to other standard scientific-computing libraries, such as the GSL (GNU Scientific Library for C and C++), or Matlab's toolboxes.
scipy
is the core package for scientific routines in Python.
scipy
is meant to operate efficiently on numpy
arrays.
scipy.io
for IO (e.g. read / write Matlab files)scipy.linalg
for optimized linear algebrascipy.stats
for basic stats (t-tests, simple anova, ranksum etc.)scipy.signal
for signal processingscipy.sparse
for sparse matricesscipy.fftpack
for FFTsscipy.ndimage
for N-D image processing (e.g., smoothing)scipy.stats
¶A T-test to decide whether the two sets of observations have different means:
from scipy import stats
a = np.random.normal(0, 1, size=10)
b = np.random.normal(1, 1, size=10)
tval, pval = stats.ttest_ind(a, b)
print 'T=%0.4f, p=%0.4f' % (tval, pval)
T=-3.0062, p=0.0076
Matplotlib provides functions to create publication-quality figures
import pylab as pl
t = np.linspace(0, 8 * np.pi, 1000)
pl.plot(t, np.sin(t))
pl.xlabel('$x$')
pl.ylabel('$sin(x)$')
pl.ylim([-1.1, 1.1])
pl.savefig('pylab_demo.pdf') # natively save pdf, svg, etc.
image = np.random.rand(30, 30)
pl.imshow(image)
pl.gray()
pl.show()
Even more:
A really active community !