In [180]:

# import some things
%pylab --no-import-all inline
import numpy as np
import pylab as pl
from scipy import linalg

Populating the interactive namespace from numpy and matplotlib

Python for Scientific Computing¶

Martin Luessi

Martinos Center "Why N' How", September 19, 2013

What is Python and why would I use it?¶

Python is an interpreted high-level programming language
Python is free (as in speech)
Python runs on most platforms
It "combines remarkable power with very clear syntax"
Well suited for high performance numerical computing (NumPy, ...)
High quality 2D and 3D visualizations (pylab, mlab, ...)
Increasingly popular in neuroscience (nipy, nipype, nitime, ...)

What you should be able to do after this talk¶

Start Python
Do simple math
Get started with linear algebra and scientific computing
Plot some nice figures

Use Python for what?¶

Scripting (like shell scripts, e.g., bash, csh)
Make web sites
Build GUI applications
Science (like Matlab, IDL, R, Octave, Scilab)
Etc.

You just need to know one language to do almost anything!

Scientific Python building blocks¶

Python interpreter: executes Python code
IPython: an advanced Python shell
NumPy: provides numerical array objects
SciPy: scientific computing (linear algebra, optimization, regression, etc.)
Matplotlib a.k.a. Pylab: 2-D visualization, "publication-ready" plots
Mayavi: 3-D visualization
Many application-specific packages, e.g., for machine learning, image processing, symbolic math, ... (incomplete list)

First Steps¶

Get a scientific-Python environment:

Comes with every Linux distribution
Python(x,y) on Windows: http://www.pythonxy.com
Enthought Canopy or EPD: http://www.enthought.com
Continuum Analytics Anaconda http://www.continuum.io
At the Martinos Center, use the EPD-based network installation; see [here](http://surfer.nmr.mgh.harvard.edu/fswiki/DevelopersGuide/NMRCenterPython/UsersGuide)

Start the IPython shell (from terminal or Windows cmd shell):

$ ipython --pylab

Hello world!¶

The IPython Shell is an interactive shell:

Now we can write our "Hello World" program by typing:

In [181]:

s = "Hello World!"
print s

Hello World!

My first script¶

Let's say the file my_script.py contains:

s = 'Hello World!'
print s

In IPython you can run it as follows:

In [182]:

%run my_script.py

Hello World!

If you are scared of the terminal¶

You can use Spyder, a scientific Python IDE, or the IPython Notebook.

Start the IPython Notebook as follows:

$ ipython notebook --pylab=inline

Python basics: Numerical types¶

Integer variables:

>>> 1 + 1
2
>>> a = 4

floats:

>>> c = 2.1

complex (a native type in Python!):

>>> a = 1.5 + 0.5j
>>> a.real
1.5
>>> a.imag
0.5

Python basics: Numerical types¶

and booleans:

>>> 3 < 4
True
>>> test = (3 > 4)
>>> test
False
>>> type(test)
<type 'bool'>

Note that you don't need to specify the type of the variable

int a = 1;  # in C

Python basics: Numerical types¶

Python can replace your pocket calculator with: +, -, *, /, % (modulo), ** (power)

>>> 7 * 3.
21.0
>>> 2**10
1024
>>> 8 % 3
2

WARNING: Integer division (Python 2: `/` between two integers drops the fractional part)

>>> 3 / 2  # !!!
1
>>> 3 / 2.  # Trick: use floats
1.5
>>> 3 / float(2)  # type conversion
1.5
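As a small sketch of the alternatives to the float trick: the `//` operator always performs floor division, and `divmod` returns quotient and remainder together. The variable names are only illustrative, and `print` is written with parentheses and a single argument so the snippet runs under the Python 2 used in these slides as well as under Python 3.

```python
# Division that behaves the same on Python 2 and Python 3
a, b = 7, 2

floor_q = a // b        # floor division: always drops the fractional part
true_q = a / float(b)   # force true division by converting one operand
q, r = divmod(a, b)     # quotient and remainder in one call

print('%d // %d = %d' % (a, b, floor_q))
print('%d / float(%d) = %.1f' % (a, b, true_q))
print('divmod(%d, %d) = (%d, %d)' % (a, b, q, r))
```

Using `//` whenever an integer result is intended keeps the code independent of the Python version.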

Strings¶

In [183]:

my_str = 'Hello World!'
print my_str
print my_str[0]

Hello World!
H

Notice: Indexing in Python starts at zero (like in C)

Strings are objects with many useful methods:

In [184]:

print my_str.replace('World', 'Why N\' How')
print my_str.upper()

Hello Why N' How!
HELLO WORLD!
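A few more string methods, as a minimal sketch; the variable names are illustrative. It shows splitting, joining, and slicing, which works on strings the same way it does on lists.

```python
my_str = 'Hello World!'

words = my_str.split()          # split on whitespace -> list of words
joined = '_'.join(words)        # glue the words back together with '_'
n_chars = len(my_str)           # number of characters
has_world = 'World' in my_str   # substring test
first_word = my_str[:5]         # strings can be sliced like lists

print(words)
print(joined)
```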

Container types: list¶

An ordered container that can hold arbitrary Python objects

In [185]:

my_list = [1, 2, 3, 'test']   # Notice: [] creates a list
print my_list

[1, 2, 3, 'test']

We can append and insert things

In [186]:

my_list.append('test2')
print my_list
my_list.insert(1, 0)
print my_list

[1, 2, 3, 'test', 'test2']
[1, 0, 2, 3, 'test', 'test2']

Container types: list¶

We can access elements using their index

In [187]:

print my_list
print my_list[0]   # first element
print my_list[-1]  # last element
print my_list[-2]  # second-to-last element

[1, 0, 2, 3, 'test', 'test2']
1
test2
test

Container types: list¶

We can also use slicing to obtain sublists

In [188]:

print my_list
print my_list[2:5]  # Notice: index 5 is not included

[1, 0, 2, 3, 'test', 'test2']
[2, 3, 'test']

The slicing syntax is l[start:stop:step]. This can be very useful

In [189]:

print my_list[:3]   # first 3 elements
print my_list[-3:]  # last 3 elements
print my_list[::2]  # every 2nd element
print my_list[::-1] # list with order reversed

[1, 0, 2]
[3, 'test', 'test2']
[1, 2, 'test']
['test2', 'test', 3, 2, 0, 1]
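To make the `l[start:stop:step]` rule concrete, the sketch below rebuilds a slice by hand with `range`. The helper `slice_by_hand` is hypothetical and only covers non-negative `start`/`stop` with a positive `step`; real slicing also handles negative values and omitted parts.

```python
my_list = [1, 0, 2, 3, 'test', 'test2']

def slice_by_hand(seq, start, stop, step):
    """Illustrative helper: mimic seq[start:stop:step] for simple cases."""
    stop = min(stop, len(seq))  # slicing silently clips an oversized stop
    return [seq[i] for i in range(start, stop, step)]

by_slice = my_list[1:5:2]                  # indices 1 and 3
by_hand = slice_by_hand(my_list, 1, 5, 2)  # same indices, spelled out

print(by_slice)
print(by_hand)
```

Both forms visit indices `start, start + step, ...` and stop before `stop`, which is why index 5 is never included.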

Container types: dictionary¶

A dictionary (dict) is basically an efficient table that maps keys to values. It is an unordered container:

In [190]:

phone = {'joe': 554, 'bob': 308}  # using {} creates a dict
print phone  # Notice: no order

{'bob': 308, 'joe': 554}

We can access elements using their key

In [191]:

print phone['joe']

And add new elements (Notice: key does not have to be a string)

In [192]:

phone[0] = 101
print phone
print phone.keys()    # list with the keys
print phone.values()  # list with the values

{0: 101, 'bob': 308, 'joe': 554}
[0, 'bob', 'joe']
[101, 308, 554]
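Some other common dictionary operations, sketched under the assumption that all keys are strings so they can be sorted; the name `alice` and the default value `-1` are made up for the example.

```python
phone = {'joe': 554, 'bob': 308}

has_joe = 'joe' in phone         # membership test looks at the keys
alice = phone.get('alice', -1)   # default value instead of a KeyError

lines = []
for name in sorted(phone):       # sort the keys to get a stable order
    lines.append('%s: %d' % (name, phone[name]))
print('\n'.join(lines))

del phone['bob']                 # remove an entry
```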

Basic control flow: Conditional statements¶

Allow the conditional execution of code

In [193]:

a = 10
if a == 1:
    print 1
    print 22
elif a == 2:
    print 2
else:
    print 'a lot'

a lot

Notice: Blocks are delimited by indentation (4 spaces by convention)

Basic control flow: Loops¶

Can be used to iterate over lists, dicts, etc. For example:

In [194]:

for word in ['cool', 'powerful', 'readable']:
    print 'Python is %s !!!' % word

Python is cool !!!
Python is powerful !!!
Python is readable !!!
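Two loop idioms that come up constantly, as a short sketch: `enumerate` when the index is needed, and `range` for counting. The last line shows the same loop written as a list comprehension.

```python
words = ['cool', 'powerful', 'readable']

# enumerate yields (index, item) pairs
numbered = []
for i, word in enumerate(words):
    numbered.append('%d: %s' % (i, word))

# range(n) counts from 0 to n - 1
squares = []
for n in range(4):
    squares.append(n ** 2)

# the same loop as a list comprehension
squares_lc = [n ** 2 for n in range(4)]

print(numbered)
print(squares)
```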

My first function¶

Functions are defined using def. They allow us to group code for specific tasks.

In [195]:

def disk_area(radius):
    area = 3.14 * radius * radius
    return area

print disk_area(1.0)
print disk_area(2.0)

3.14
12.56
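A possible refinement, not part of the original slide: use `math.pi` instead of the rounded constant and give the argument a default value. The default of `1.0` is an arbitrary choice for illustration.

```python
import math

def disk_area(radius=1.0):
    """Return the area of a disk with the given radius."""
    return math.pi * radius ** 2

unit_area = disk_area()              # uses the default radius
big_area = disk_area(radius=2.0)     # arguments can be passed by keyword

print('%.4f' % unit_area)
print('%.4f' % big_area)
```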

My second function¶

Arguments are not copied when passed to a function (unlike in Matlab)

In [196]:

def foo(a):
    a.append(1)  # modifies the caller's list in place

b = [0]
foo(b)
print b  # b has been modified !!!

[0, 1]
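If the caller's object must stay unchanged, one option is to pass a copy explicitly. A minimal sketch using the standard `copy` module (for a flat list, `list(b)` or `b[:]` would do the same):

```python
import copy

def foo(a):
    a.append(1)  # modifies whatever list object it receives

b = [0]
foo(copy.copy(b))         # pass a shallow copy; b itself is untouched
b_after_copy = list(b)

foo(b)                    # pass b directly; b is modified in place
b_after_direct = list(b)

print(b_after_copy)
print(b_after_direct)
```

For nested containers, `copy.deepcopy` also copies the inner objects.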

NumPy: N-dimensional arrays in Python¶

NumPy is:

An extension package to Python for multidimensional arrays (matrices in n-dimensions)
Designed for efficient scientific computation
Unlike Python lists, all elements of the array have the same type (int, float, etc)

Reference documentation: http://docs.scipy.org/doc/numpy/reference

For Matlab users: http://wiki.scipy.org/NumPy_for_Matlab_Users

NumPy: Creating arrays¶

In [197]:

import numpy as np  # import numpy so we can use it
a = np.array([0, 1, 2, 3], dtype=float)  # create array (np.float was an alias of float, removed in recent NumPy)
print a

print a.ndim   # number of dimensions, in Matlab `ndims(a)`
print a.shape  # shape, in Matlab `size(a)`
print a.dtype  # the data type of the array

[ 0.  1.  2.  3.]
1
(4,)
float64

NumPy: Creating arrays¶

Arrays can have an arbitrary number of dimensions

In [198]:

# 2-D array
b = np.array([[0, 1, 2], [3, 4, 5]]) # 2 x 3 array
print b
print b.dtype  # Notice: here the data type is int64
print b.shape

[[0 1 2]
 [3 4 5]]
int64
(2, 3)

In [199]:

# 3-D
c = np.array([[[1], [2]], [[3], [4]]])
print c.shape  # in Matlab `size(c)`

(2, 2, 1)

NumPy: Creating arrays¶

Common arrays: ones, zeros and eye (like in Matlab)

In [200]:

a = np.ones((3, 3))
print a

[[ 1.  1.  1.]
 [ 1.  1.  1.]
 [ 1.  1.  1.]]

In [201]:

b = np.zeros((2, 3))
print b

[[ 0.  0.  0.]
 [ 0.  0.  0.]]

In [202]:

c = np.eye(3)
print c

[[ 1.  0.  0.]
 [ 0.  1.  0.]
 [ 0.  0.  1.]]
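Other creation routines that are useful in practice, shown here as a short sketch: `np.arange` and `np.linspace` for ranges (the latter is used in the plotting example later in the talk), and `reshape` to turn a 1-D range into a 2-D array.

```python
import numpy as np

a = np.arange(0, 10, 2)          # start, stop (excluded), step
b = np.linspace(0, 1, 5)         # 5 evenly spaced points, endpoints included
c = np.arange(6).reshape(2, 3)   # 1-D range reshaped to 2 x 3

print(a)
print(b)
print(c)
```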

NumPy: Indexing and slicing¶

NumPy arrays can be indexed and sliced like Python lists

In [203]:

a = np.diag(np.arange(3))
print a

[[0 0 0]
 [0 1 0]
 [0 0 2]]

In [204]:

print a[1, 1]
print a[:, 1]  # takes the entire second column!

1
[0 1 0]

In [205]:

# slicing
a = np.arange(10)
print a
print a[::2] # every 2nd element
print a[-5:] # last 5 elements

[0 1 2 3 4 5 6 7 8 9]
[0 2 4 6 8]
[5 6 7 8 9]
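Because a diagonal matrix looks the same along rows and columns, a non-symmetric array makes 2-D indexing easier to follow. In this sketch the first index selects rows and the second selects columns; the array `m` is made up for the example.

```python
import numpy as np

m = np.arange(6).reshape(2, 3)   # [[0 1 2], [3 4 5]]

row = m[1, :]       # second row
col = m[:, 1]       # second column
block = m[:, 1:]    # all rows, columns 1 and 2

print(row)
print(col)
print(block)
```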

NumPy: Copies and views¶

A slicing operation creates a view on the original array

In [206]:

a = np.arange(10)
print a
b = a[::2]
print b

[0 1 2 3 4 5 6 7 8 9]
[0 2 4 6 8]

The original array is not copied in memory: when modifying the view, the original array is modified as well.

In [207]:

b[0] = 100
print b
print a  # a was modified as well!

[100   2   4   6   8]
[100   1   2   3   4   5   6   7   8   9]

NumPy: Copies and views¶

If you want a copy you have to specify it:

In [208]:

a = np.arange(10)
b = a[::2].copy()  # force a copy
b[0] = 100
print b
print a

[100   2   4   6   8]
[0 1 2 3 4 5 6 7 8 9]

This behavior can be surprising at first sight...

but it saves both memory and time.
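One way to check whether two arrays share data is sketched below. It assumes a NumPy version that provides `np.shares_memory` (1.11 or later); the `base` attribute offers a second check, since a view keeps a reference to the array that owns the memory.

```python
import numpy as np

a = np.arange(10)
view = a[::2]            # slicing: no data is copied
copied = a[::2].copy()   # explicit copy: new memory

view_shares = np.shares_memory(a, view)
copy_shares = np.shares_memory(a, copied)
view_base_is_a = view.base is a   # a view remembers the array it came from

print(view_shares)
print(copy_shares)
```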

NumPy: File formats¶

NumPy has its own file format for saving and loading arrays:

In [209]:

a = np.arange(10)
np.save('test.npy', a)
a = 0
a = np.load('test.npy')
print a

[0 1 2 3 4 5 6 7 8 9]

Python also supports many well-known (and more obscure) file formats:

Matlab: scipy.io.loadmat, scipy.io.savemat
HDF5: h5py, PyTables
NetCDF: scipy.io.netcdf_file, netcdf4-python
MatrixMarket: scipy.io.mmread, scipy.io.mmwrite
...
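As an example for the Matlab case, here is a minimal round trip with `scipy.io.savemat` and `scipy.io.loadmat`. The file name `test.mat` and the temporary directory are assumptions made for the sketch.

```python
import os
import tempfile

import numpy as np
from scipy import io as sio

a = np.arange(10)
tmp_dir = tempfile.mkdtemp()
mat_path = os.path.join(tmp_dir, 'test.mat')  # hypothetical file name

sio.savemat(mat_path, {'a': a})   # dict keys become Matlab variable names
loaded = sio.loadmat(mat_path)    # returns a dict of arrays
a_back = loaded['a']              # Matlab has no 1-D arrays, so this is 2-D

print(a_back.shape)
print(a_back.ravel())
```

Note that the array comes back two-dimensional; `ravel()` or `squeeze()` recovers the 1-D shape.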

NumPy: Linear algebra¶

Matrix multiplication:

In [210]:

a = np.triu(np.ones((2, 2)), 1)   # see help(np.triu)
print 'a:' + str(a)
b = np.diag([1, 2])
print 'b:' + str(b)
c = np.dot(a, b)  # same as a.dot(b)
print 'c:' + str(c)

a:[[ 0.  1.]
 [ 0.  0.]]
b:[[1 0]
 [0 2]]
c:[[ 0.  2.]
 [ 0.  0.]]

WARNING: Element-wise multiplication vs. matrix multiplication

In [211]:

print a * b  # element-wise multiplication

[[ 0.  0.]
 [ 0.  0.]]
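The difference is easier to see with matrices that have no zeros in awkward places. In this sketch (the values are made up), `*` multiplies matching entries while `np.dot` computes the usual row-by-column product.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[0, 1],
              [1, 0]])   # permutation matrix: swaps columns when applied from the right

elementwise = a * b      # entry-by-entry product
matrix = np.dot(a, b)    # matrix product

print(elementwise)
print(matrix)
```

For Matlab users: `a * b` here corresponds to `a .* b` in Matlab, and `np.dot(a, b)` to `a * b`.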

NumPy: Linear algebra¶

Transpose:

In [212]:

a_t = a.T
print a_t

[[ 0.  0.]
 [ 1.  0.]]

Note: As with slicing, there is no copy. We can verify this by inspecting the arrays:

In [213]:

print 'a.flags:\n' + str(a.flags)
print 'a_t.flags:\n' + str(a_t.flags)

a.flags:
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
a_t.flags:
  C_CONTIGUOUS : False
  F_CONTIGUOUS : True
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

NumPy: Linear algebra¶

Inverse, systems of linear equations and SVD:

In [214]:

from numpy import linalg  # OR
from scipy import linalg  # even better
A = np.triu(np.ones((3, 3)), 0)
print 'A:\n' + str(A)
B = linalg.inv(A)
C = np.dot(B, A)
print 'C:\n' + str(C)
x = linalg.solve(A, [1, 2, 3])  # linear system
U, s, Vh = linalg.svd(A)  # SVD: A = U * diag(s) * Vh (Vh is already transposed)
vals = linalg.eigvals(A)  # Eigenvalues

A:
[[ 1.  1.  1.]
 [ 0.  1.  1.]
 [ 0.  0.  1.]]
C:
[[ 1.  0.  0.]
 [ 0.  1.  0.]
 [ 0.  0.  1.]]
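The results of `solve` and `svd` are not printed above, so here is a small sketch that checks them numerically. The right-hand side name `rhs` is introduced only for this example.

```python
import numpy as np
from scipy import linalg

A = np.triu(np.ones((3, 3)))
rhs = np.array([1.0, 2.0, 3.0])

# Solve A x = rhs and check the residual
x = linalg.solve(A, rhs)
solve_ok = np.allclose(np.dot(A, x), rhs)

# SVD returns U, the singular values s, and Vh (V transposed)
U, s, Vh = linalg.svd(A)
A_rebuilt = np.dot(U, np.dot(np.diag(s), Vh))
svd_ok = np.allclose(A_rebuilt, A)

print(x)
print(solve_ok)
print(svd_ok)
```

Using `solve` is generally preferable to computing `inv(A)` and multiplying, both for speed and for numerical accuracy.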

NumPy: Reductions¶

Computing sums:

In [215]:

x = np.arange(5)
print x
print np.sum(x)  # or x.sum()
    

[0 1 2 3 4]
10

Sum by rows and by columns:

In [216]:

x = np.array([[1, 1], [2, 2]])
print np.sum(x, axis=0),   # column sums (collapses the first dimension)
print np.sum(x, axis=1)   # row sums (collapses the second dimension)

[3 3] [2 4]

Same with np.mean, np.argmax, np.argmin, np.min, np.max, np.cumsum, np.sort etc.
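A short sketch of how some of these reductions behave with the `axis` argument; the array values are arbitrary.

```python
import numpy as np

x = np.array([[1, 5],
              [4, 2]])

col_mean = np.mean(x, axis=0)         # mean of each column
row_max = np.max(x, axis=1)           # largest value in each row
row_argmax = np.argmax(x, axis=1)     # column index of that value
running = np.cumsum(x, axis=1)        # running sum along each row
flat_sorted = np.sort(x, axis=None)   # axis=None flattens before sorting

print(col_mean)
print(row_max)
print(row_argmax)
```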

SciPy¶

scipy contains various toolboxes dedicated to common issues in scientific computing.
scipy can be compared to other standard scientific-computing libraries, such as the GSL (GNU Scientific Library for C and C++), or Matlab's toolboxes.
scipy is the core package for scientific routines in Python.
scipy is meant to operate efficiently on numpy arrays.

SciPy¶

scipy.io for IO (e.g. read / write Matlab files)
scipy.linalg for optimized linear algebra
scipy.stats for basic stats (t-tests, simple anova, ranksum etc.)
scipy.signal for signal processing
scipy.sparse for sparse matrices
scipy.fftpack for FFTs
scipy.ndimage for N-D image processing (e.g., smoothing)
etc.

SciPy: Example of `scipy.stats`¶

A t-test to decide whether two sets of observations have different means:

In [217]:

from scipy import stats
a = np.random.normal(0, 1, size=10)
b = np.random.normal(1, 1, size=10)
tval, pval = stats.ttest_ind(a, b)
print 'T=%0.4f, p=%0.4f' % (tval, pval)

T=-3.0062, p=0.0076
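To show what `stats.ttest_ind` computes by default (equal variances assumed), the sketch below reproduces the statistic by hand from the pooled variance and compares it with SciPy's result. The fixed seed is an arbitrary choice that only makes the run repeatable; it does not reproduce the numbers printed above.

```python
import numpy as np
from scipy import stats

rng = np.random.RandomState(0)   # arbitrary seed for repeatability
a = rng.normal(0, 1, size=10)
b = rng.normal(1, 1, size=10)
n_a, n_b = len(a), len(b)
dof = n_a + n_b - 2              # degrees of freedom

# Pooled variance from the unbiased sample variances (ddof=1)
pooled_var = ((n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1)) / dof
t_manual = (a.mean() - b.mean()) / np.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))

# Two-sided p-value from the t distribution
p_manual = 2 * stats.t.sf(abs(t_manual), dof)

tval, pval = stats.ttest_ind(a, b)
print('manual: T=%0.4f, p=%0.4f' % (t_manual, p_manual))
print('scipy:  T=%0.4f, p=%0.4f' % (tval, pval))
```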

Visualization with Python¶

Matplotlib provides functions to create publication-quality figures

In [218]:

import pylab as pl
t = np.linspace(0, 8 * np.pi, 1000)
pl.plot(t, np.sin(t))
pl.xlabel('$x$')
pl.ylabel(r'$\sin(x)$')  # raw string so the LaTeX backslash is kept
pl.ylim([-1.1, 1.1])
pl.savefig('pylab_demo.pdf')  # natively save pdf, svg, etc.
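The same kind of figure can be built with Matplotlib's object-oriented interface, which is handy in scripts. This is a sketch under a few assumptions: the `Agg` backend is chosen so that it runs without a display, and the output file name and temporary directory are made up for the example.

```python
import os
import tempfile

import numpy as np
import matplotlib
matplotlib.use('Agg')             # non-interactive backend: render to files only
import matplotlib.pyplot as plt

t = np.linspace(0, 2 * np.pi, 200)

fig, ax = plt.subplots()
ax.plot(t, np.sin(t), label='sin')
ax.plot(t, np.cos(t), '--', label='cos')   # dashed line for the second curve
ax.set_xlabel('$t$')
ax.set_ylim([-1.1, 1.1])
ax.legend()

out_path = os.path.join(tempfile.mkdtemp(), 'sin_cos_demo.png')  # hypothetical name
fig.savefig(out_path)
plt.close(fig)                    # free the figure's memory
```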

Visualization with Python¶

2-D (such as images)

In [219]:

image = np.random.rand(30, 30)
pl.imshow(image)
pl.gray()
pl.show()

Visualization with Python¶

Mayavi: 3-D visualization

PySurfer uses Mayavi to visualize cortical surfaces

Learn more¶

Even more:

Matlab like IDE environment: http://packages.python.org/spyder
Parallel computing: http://packages.python.org/joblib
Cython: write Python, get C code: http://cython.org

Python for brain imaging¶

NiBabel: handling neuroimaging file formats
Nipype: pipelines for SPM, FSL, FreeSurfer, etc.
PySurfer: visualization of FreeSurfer surfaces
MNE-Python: MEG and EEG data analysis
scikit-learn: machine learning and statistics
NiLearn: machine learning for neuroimaging (uses scikit-learn)
NIPY: various neuroimaging packages
etc.

A really active community!
