The following is part of my notes on scientific computing with python:

Notebook Setup¶

NOTE: Please run the following cell before skipping to any section. Do not change it without knowing what you're doing.

In [2]:

# load numpy
import numpy as np

# load scipy
import scipy as sp

# load data visualization
import matplotlib.pyplot as plt  # the tidy way 

# allows embedding of plots in the notebook
%matplotlib inline 

# load image processing
from IPython.display import Image

The Scipy Stack¶

Unlike Matlab, Scilab or R, Python does not come with a pre-bundled set of modules for scientific computing. Below are the basic building blocks that can be combined to obtain a scientific computing environment:

Python, a generic and modern computing language.
- Python language: data types (string, int), flow control, data collections (lists, dictionaries), patterns, etc.
- Modules of the standard library.
- A large number of specialized modules or applications written in Python: web protocols, web framework, etc. ... and scientific computing.
- Development tools (automatic testing, documentation generation)
IPython, an advanced Python shell http://ipython.scipy.org/
Numpy : provides powerful numerical array objects and routines to manipulate them. http://www.numpy.org/
Scipy : high-level data processing routines: optimization, regression, interpolation, etc. http://www.scipy.org/
Matplotlib : 2-D visualization, “publication-ready” plots http://matplotlib.sourceforge.net/
Mayavi : 3-D visualization http://code.enthought.com/projects/mayavi/

Some tools for python development:

IPython : a fantastic way to work with/experiment with python. http://ipython.org/
SublimeText : my favorite text editor http://www.sublimetext.com/
Spyder : a MATLAB like IDE for working with python: https://code.google.com/p/spyderlib/

Scipy Stack on Windows¶

To get the complete scipy stack on windows, I use Continuum's Anaconda. It packages python 2.7 + numpy + scipy + spyder + etc. Pretty much everything you need! It even has several optimized libraries that are free with an academic license.

Interactive Workflow¶

Using IPython for learning and experimenting is a great way to work with Python. When you're ready for more advanced scripting and debugging, that can also be done inside IPython, as well as inside Spyder, Sublime Text, or the console. IPython provides many benefits over the plain Python console, and we will explore them in this notebook.

The Python Language¶

Python is a full programming language. Only the bare minimum necessary for getting started with Numpy and Scipy is addressed here. To learn more about the language, consider going through the excellent tutorial http://docs.python.org/tutorial. Dedicated books are also available, such as http://diveintopython.org/.

Python is a programming language, as are C, Fortran, BASIC, PHP, etc. Some specific features of Python are as follows:

an interpreted (as opposed to compiled) language. Contrary to e.g. C or Fortran, one does not compile Python code before executing it. In addition, Python can be used interactively: many Python interpreters are available, from which commands and scripts can be executed.
a free software released under an open-source license: Python can be used and distributed free of charge, even for building commercial software.
multi-platform: Python is available for all major operating systems, Windows, Linux/Unix, MacOS X, most likely your mobile phone OS, etc.
a very readable language with clear non-verbose syntax
a language for which a large variety of high-quality packages are available for various applications, from web frameworks to scientific computing.
a language very easy to interface with other languages, in particular C and C++.
an object-oriented language, with dynamic typing (the same variable can contain objects of different types during the course of a program).
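
To make the last point concrete, here is a minimal sketch of dynamic typing. The variable names `value` and `seen_types` are only illustrative.

```python
# The same name can be rebound to objects of different types.
value = 42
seen_types = [type(value).__name__]

value = 'forty-two'
seen_types.append(type(value).__name__)

value = [4, 2]
seen_types.append(type(value).__name__)

# One entry per type the name has referred to
print(seen_types)
```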

See http://www.python.org/about/ for more information about distinguishing features of Python.

Data Types¶

In [6]:

# integer
1 + 1

Out[6]:

2

In [8]:

# floats 
c = 2.1
type(c)

Out[8]:

float

In [10]:

# complex 
a = 1.5 + 0.5j
type(a)

Out[10]:

complex

In [12]:

a.real

Out[12]:

1.5

In [15]:

a.imag

Out[15]:

0.5

In [17]:

# bool
test = (3 > 4)
type(test)

Out[17]:

bool

In [18]:

# be careful with integer division (Python 2 semantics: int / int truncates)
b = 3/2 # int division: gives 1 here (1.5 on Python 3)
c = 3/2. # float division
b,c

Out[18]:

(1, 1.5)

In [20]:

# explicit floor division (rounds down, also for floats)
3.4//2.1

Out[20]:

1.0
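
Since the result of `/` on two integers depends on the Python version, here is a small sketch that uses only operations behaving the same on Python 2.7 and Python 3. The variable names are my own.

```python
# Floor division: always rounds toward negative infinity
floor_int = 7 // 2
floor_negative = -7 // 2

# Converting one operand to float forces true division on any version
true_div = float(7) / 2

# divmod returns the floor quotient and the remainder in one call
quotient, remainder = divmod(7, 2)

print(floor_int)
print(floor_negative)
print(true_div)
print(quotient)
print(remainder)
```

The negative case is worth remembering: floor division does not truncate toward zero.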

Containers¶

There are several types of containers in Python: lists, strings, etc. They can be operated on directly, through their object methods, or, in the case of numpy arrays, through numpy functions.

Lists¶

Note: Lists are mutable objects - they can be modified

In [34]:

# list = ordered collection of objects that can be of different types
L = ['red',2,'blue']
type(L)

Out[34]:

list

In [35]:

# indexing lists
L[2]

Out[35]:

'blue'

In [36]:

# lists are mutable - they can be modified
L[1] = 'orange'
L

Out[36]:

['red', 'orange', 'blue']

Note: For collections of numerical data that all have the same type, it is often more efficient to use the array type provided by the numpy module. A NumPy array is a chunk of memory containing fixed-sized items. With NumPy arrays, operations on elements can be faster because elements are regularly spaced in memory and more operations are performed through specialized C functions instead of Python loops.
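
A minimal sketch of the difference described in this note, assuming numpy is installed as in the setup cell. It is not a benchmark, it only contrasts the two styles.

```python
import numpy as np

values = list(range(5))

# Plain Python: an explicit loop (here a list comprehension) over the elements
squares_list = [v ** 2 for v in values]

# NumPy: one vectorized operation applied to the whole array
arr = np.array(values)
squares_arr = arr ** 2

print(squares_list)
print(squares_arr)

# Every element shares one dtype and one fixed size in bytes
print(arr.dtype)
print(arr.itemsize)
```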

In [39]:

# add to list (append modifies L in place; this cell was run three times, hence three 'pink' entries below)
L.append('pink')
L

Out[39]:

['red', 'orange', 'blue', 'pink', 'pink', 'pink']

In [41]:

# concatenate lists
R = [2, 4, 6]
L + R

Out[41]:

['red', 'orange', 'blue', 'pink', 'pink', 'pink', 2, 4, 6]

Strings¶

Note: Strings are immutable objects - once created they can't be modified, but new strings can be created from them.

In [42]:

# different ways to specify strings
s = 'Hello, how are you?'
s = "Hi, what's up"
# tripling the quotes allows the string to span more than one line
# (a comment placed inside the quotes would become part of the string)
s = '''Hello,
       how are you'''
s = """Hi,
what's up?"""

In [43]:

a = "hello"
a[0]

Out[43]:

'h'

In [3]:

# negative indices count from the end
a[-1]

Out[3]:

'o'

In [4]:

# slicing syntax is start:stop:step
a[1::2] 

Out[4]:

'el'

In [5]:

#string are immutable!
a[1] = 'b' 

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-5-bb548594ff10> in <module>()
      1 #string are immutable!
----> 2 a[1] = 'b'

TypeError: 'str' object does not support item assignment

In [44]:

# strings are first class objects - use methods
a.replace('l','z')

Out[44]:

'hezzo'

Dictionaries¶

A dictionary is basically an efficient table that maps keys to values. In the Python 2.7 used here it is an unordered container (since Python 3.7, dictionaries preserve insertion order).

In [46]:

tel = {'emmanuelle': 5752, 'sebastian': 5578}
tel['francis'] = 5915
tel

Out[46]:

{'emmanuelle': 5752, 'francis': 5915, 'sebastian': 5578}

In [48]:

tel['sebastian']

Out[48]:

5578

In [50]:

tel.keys()

Out[50]:

['sebastian', 'francis', 'emmanuelle']

In [52]:

tel.values()

Out[52]:

[5578, 5915, 5752]

In [54]:

'francis' in tel

Out[54]:

True
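
A few more dictionary operations that I find myself using often, as a small sketch. The key 'guido' is just a made-up missing entry.

```python
tel = {'emmanuelle': 5752, 'sebastian': 5578, 'francis': 5915}

# Sorting the keys gives a predictable iteration order on any Python version
for name in sorted(tel):
    print('%s: %d' % (name, tel[name]))

# .get returns a default instead of raising KeyError for a missing key
missing = tel.get('guido', None)
print(missing)

# del removes an entry
del tel['francis']
print('francis' in tel)
```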

Tuples¶

These are immutable lists. The elements of a tuple are written between parentheses, or just separated by commas:

In [58]:

t = 12345, 321, 'hea'
t

Out[58]:

(12345, 321, 'hea')

In [59]:

t[1]

Out[59]:

321

Sets¶

A set is an unordered collection of unique items.

In [62]:

s = set(('a','b','c','a'))
s

Out[62]:

{'a', 'b', 'c'}

In [64]:

s.difference(('a','b'))

Out[64]:

{'c'}
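
The other common set operations work the same way. A short sketch, where the set `other` is only an illustration:

```python
s = set(('a', 'b', 'c', 'a'))
other = set(('b', 'c', 'd'))

union = s | other        # items in either set
common = s & other       # items in both sets
only_in_s = s - other    # same as s.difference(other)

# sorted() is used only to get a stable order for printing
print(sorted(union))
print(sorted(common))
print(sorted(only_in_s))
```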

Further Notes¶

There are other containers, methods, etc. not covered here. See the full python tutorial mentioned above! Also, for a good discussion of mutable vs immutable objects in python, see this article: Types and Objects in Python
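
As a quick reminder of why the mutable/immutable distinction matters, here is a minimal sketch using only lists and strings:

```python
# Lists are mutable: two names can refer to the same object
a = [1, 2, 3]
b = a              # no copy is made, b is another name for the same list
b.append(4)
print(a)           # the change made through b is visible through a

# Slicing creates a new (shallow) copy that can be changed independently
c = a[:]
c.append(5)
print(a)
print(c)

# Strings are immutable: methods return new objects, the original is untouched
s = 'hello'
t = s.replace('l', 'z')
print(s)
print(t)
```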

Control Flow¶

Note: indentation is significant and necessary to delimit control blocks. Python also has several interesting ways to iterate over sequences!

In [8]:

if b == 4:
    print(1)
    print(2)
else:
    print(3)
print(5)

3
5

In [66]:

vowels = 'aeiouy'  # iterate over any sequence
for i in 'powerful':
    if i in vowels:
        print(i),

o e u

Note: it is not safe to modify the sequence you're iterating over
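
Two safe alternatives, sketched with a made-up list of numbers: iterate over a copy, or build a new list instead of modifying the old one.

```python
numbers = [1, 2, 3, 4, 5, 6]

# Option 1: loop over a copy (numbers[:]) while removing from the original
for n in numbers[:]:
    if n % 2 == 0:
        numbers.remove(n)
print(numbers)

# Option 2: build a new list with a comprehension
original = [1, 2, 3, 4, 5, 6]
odds = [n for n in original if n % 2 == 1]
print(odds)
```

The second form is usually the one I would reach for first, since it leaves the original list untouched.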

In [10]:

# Enumerating in python
words = ('cool', 'powerful', 'readable')
for index, item in enumerate(words):
    print index, item

0 cool
1 powerful
2 readable

In [11]:

# list comprehension: square each i in range(4)
[i**2 for i in range(4)] 

Out[11]:

[0, 1, 4, 9]

Functions¶

Functions are first-class objects in Python, meaning we can:

assign them to a variable
store them as an item in a list (or any collection)
pass them as an argument to another function
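
A minimal sketch of those three uses. The function names `square`, `cube` and `apply_to` are hypothetical helpers made up for this illustration.

```python
def square(x):
    return x ** 2

def cube(x):
    return x ** 3

# 1. assign a function to a variable
f = square

# 2. store functions in a list
funcs = [square, cube]

# 3. pass a function as an argument to another function
def apply_to(func, value):
    return func(value)

results = [apply_to(fn, 3) for fn in funcs]
print(f(4))
print(results)
```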

In [12]:

# Defining Functions
def disk_area(radius):
    return 3.14*radius**2
disk_area(1)

Out[12]:

3.14

In [13]:

# default parameter values
def calc_area(radius=1):
    return 3.14*radius**2
a = calc_area()
b = calc_area(2)
a,b

Out[13]:

(3.14, 12.56)

Note: Default values are evaluated when the function is defined, not when it is called. This can be problematic when using mutable types (e.g. dictionary or list) and modifying them in the function body, since the modifications will be persistent across invocations of the function.

In [14]:

# Using an immutable type in a keyword argument:
bigx = 10

def double_it(x=bigx):
    return x * 2

a = double_it()
bigx = 1e9  # Now really big
b = double_it()
a,b

Out[14]:

(20, 20)

In [15]:

# Using a mutable type in a keyword argument (and modifying it inside the function body):
def add_to_dict(args={'a': 1, 'b': 2}):
    for i in args.keys():
        args[i] += 1
        print 'args: ', args

In [16]:

add_to_dict()

args:  {'a': 2, 'b': 2}
args:  {'a': 2, 'b': 3}

In [17]:

add_to_dict()

args:  {'a': 3, 'b': 3}
args:  {'a': 3, 'b': 4}
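
The usual way around this is to use `None` as the default and create the mutable object inside the function. A sketch of that idiom, reusing the function name from above but returning the dictionary instead of printing it:

```python
def add_to_dict(args=None):
    # A fresh dictionary is created on every call that omits the argument
    if args is None:
        args = {'a': 1, 'b': 2}
    for key in list(args.keys()):
        args[key] += 1
    return args

first = add_to_dict()
second = add_to_dict()

# Both calls start from the same default values
print(first == second)
```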

In [3]:

# variable number of parameters
def variable_args(*args, **kwargs):
    print 'args is', args
    print 'kwargs is', kwargs

variable_args('one', 'two', x=1, y=2, z=3)

args is ('one', 'two')
kwargs is {'y': 2, 'x': 1, 'z': 3}

In [4]:

va = variable_args
va('three', x=10, y=20)

args is ('three',)
kwargs is {'y': 20, 'x': 10}
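
The same `*` and `**` syntax also works in the other direction, to unpack an existing tuple or dictionary when calling a function. A small sketch, where `describe` is a hypothetical function that returns its findings instead of printing them:

```python
def describe(*args, **kwargs):
    # Return the number of positional arguments and the sorted keyword names
    return len(args), sorted(kwargs)

positional = ('one', 'two')
named = {'x': 1, 'y': 2}

# * unpacks the tuple into positional arguments, ** unpacks the dict into keywords
count, names = describe(*positional, **named)
print(count)
print(names)
```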

In [68]:

# Docstrings
def funcname(params):
    """Concise one-line sentence describing the function.

    Extended summary which can contain multiple paragraphs.
    """
    # function body
    pass

In [1]:

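# show the docstring (the cell defining funcname must have been run in this
# session, otherwise IPython reports that the object is not found, as below)
funcname?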

Object `funcname` not found.
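
Outside IPython, the docstring is also available through the `__doc__` attribute (and through the built-in `help`). A minimal sketch, with a docstring I wrote for the `disk_area` example from earlier:

```python
def disk_area(radius):
    """Return the area of a disk of the given radius.

    Uses the approximation pi = 3.14, as in the earlier example.
    """
    return 3.14 * radius ** 2

# The raw docstring is stored on the function object
print(disk_area.__doc__)

# The first line is the concise one-line summary
first_line = disk_area.__doc__.splitlines()[0]
print(first_line)
```

Calling `help(disk_area)` prints the same text with some extra formatting.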

Unnamed (lambda) functions¶

In python we can also create unnamed functions using the lambda keyword:

In [2]:

f1 = lambda x: x**2
    
# is equivalent to 

def f2(x):
    return x**2

This is useful when we want to pass a simple function as an argument to another function:

In [3]:

# map is a built-in python function
map(lambda x: x**2, range(-3,4))

Out[3]:

[9, 4, 1, 0, 1, 4, 9]
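
Another place where a lambda comes in handy is the `key` argument of `sorted`. A short sketch; note that wrapping `map` in `list()` gives the same list on Python 3, where `map` alone returns an iterator.

```python
words = ['readable', 'cool', 'powerful']

# Sort by word length; words of equal length keep their original order
by_length = sorted(words, key=lambda w: len(w))
print(by_length)

# list(...) makes the result a list on both Python 2 and Python 3
squares = list(map(lambda x: x ** 2, range(-3, 4)))
print(squares)
```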

Scripts/Modules/Programs¶

Lines starting with '%' are special IPython ("magic") commands. The %run command below runs an external script.

In [1]:

%run files/simplescript.py

Hello
how
are
you?

In [21]:

# script variables also added to ipython workspace
message

Out[21]:

'Hello how are you?'

Note: Scripts can also take in command line arguments - see files/simplescript.py
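
For reference, a script that reads command line arguments could look roughly like this. This is a hypothetical sketch, not the actual contents of files/simplescript.py; the helper `build_message` and the fallback words are my own assumptions.

```python
import sys

def build_message(argv):
    # argv[0] is the script name; the remaining entries are the arguments.
    # Fall back to a default sentence when no arguments are given (assumption).
    words = argv[1:] or ['Hello', 'how', 'are', 'you?']
    return ' '.join(words)

if __name__ == '__main__':
    message = build_message(sys.argv)
    # Print one word per line
    for word in message.split():
        print(word)
```

With %run, arguments are simply appended after the script name.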

In [22]:

# importing
import os
os

Out[22]:

<module 'os' from 'C:\Anaconda\lib\os.pyc'>

In [23]:

# list directory contents
os.listdir('.')

Out[23]:

['.ipynb_checkpoints',
 'admm.py',
 'Introduction to Python Notebook.ipynb',
 'simplescript.py']

In [24]:

# importing a specific name from a module
from os import listdir

In [25]:

# shorthand importing (np is the alias from "import numpy as np" in the setup cell)
np.linspace(0,10,6)

Out[25]:

array([  0.,   2.,   4.,   6.,   8.,  10.])

In [26]:

# star imports - usually bad practice
from os import *

Note: Rule of thumb:

Sets of instructions that are called several times should be written inside functions for better code reusability.
Functions (or other bits of code) that are called from several scripts should be written inside a module, so that only the module is imported in the different scripts (do not copy-and-paste your functions in the different scripts!).

Ideally, you'd keep all of your modules in one directory and add that directory to the PYTHONPATH environment variable, so they can be imported from anywhere.
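
If changing PYTHONPATH is not convenient, the search path can also be extended at runtime through `sys.path`. A minimal sketch; the directory name `python_modules` is a hypothetical example, not something this notebook relies on.

```python
import os
import sys

# Hypothetical directory holding personal modules
module_dir = os.path.join(os.path.expanduser('~'), 'python_modules')

# Append it once so later "import mymodule" statements can find files in it
if module_dir not in sys.path:
    sys.path.append(module_dir)

print(module_dir in sys.path)
```

Directories listed in PYTHONPATH end up in this same `sys.path` list when the interpreter starts.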

Notes¶

Some useful comments, links, etc for future reference.

Clarify the "view" concept and how that relates to immutable/mutable property
Update to python3 instead?

Tutorials/Links¶

Some useful tutorials I've found/should come back to:

Scientific Computing with Python - A series of lecture notes on github
Starcluster + Anaconda - setting up starcluster using an anaconda ami on EC2
- SC+Anaconda: counting words
- SC+Anaconda: example repo
Scipy Advanced Lecture Notes - the second part of scipy's lecture notes - advanced issues like optimization/memory management/etc.
CVXPY

Extensions¶

There are several useful extensions to add to your ipython experience. Below is a list of the ones I've installed so far:

Table of contents - display a floating toc for easy navigation
Gist - make a gist out of your notebook

Getting Help¶

Rather than knowing all functions in Numpy and Scipy, it is important to be able to find information quickly in the documentation and the available help. Here are some ways to get information:

In IPython, the help function opens the docstring of a function. Type only the beginning of the function’s name and use tab completion to display the matching functions. Note: in IPython it is not possible to open a separate window for help and documentation; however, one can always open a second IPython shell just to display help and docstrings.
Numpy’s and Scipy’s documentation can be browsed online at http://docs.scipy.org/doc. The search button is quite useful inside the reference documentation of the two packages (http://docs.scipy.org/doc/numpy/reference/ and http://docs.scipy.org/doc/scipy/reference/). Tutorials on various topics as well as the complete API with all docstrings are found on this website.
Numpy’s and Scipy’s documentation is enriched and updated on a regular basis by users on a wiki http://docs.scipy.org/numpy/. As a result, some docstrings are clearer or more detailed on the wiki, and you may want to read the documentation on the wiki directly instead of the official documentation website. Note that anyone can create an account on the wiki and write better documentation; this is an easy way to contribute to an open-source project and improve the tools you are using!
Scipy’s cookbook http://www.scipy.org/Cookbook gives recipes on many common problems frequently encountered, such as fitting data points, solving ODE, etc.
Matplotlib’s website http://matplotlib.sourceforge.net/ features a very nice gallery with a large number of plots, each of them shows both the source code and the resulting plot. This is very useful for learning by example. More standard documentation is also available.
Mayavi’s website http://code.enthought.com/projects/mayavi/docs/development/html/mayavi/ also has a very nice gallery of examples http://code.enthought.com/projects/mayavi/docs/development/html/mayavi/auto/examples.html in which one can browse for different visualization solutions.

Finally, two more “technical” possibilities are useful as well:

In IPython, the magic function %psearch searches for objects matching patterns. This is useful if, for example, one does not know the exact name of a function.

import numpy as np
%psearch np.diag*
np.diag
np.diagflat
np.diagonal

numpy.lookfor looks for keywords inside the docstrings of specified modules:

import numpy as np
np.lookfor('convolution')
Search results for 'convolution'
--------------------------------
numpy.convolve
    Returns the discrete, linear convolution of two one-dimensional
sequences.
numpy.bartlett
    Return the Bartlett window.
numpy.correlate
    Discrete, linear correlation of two 1-dimensional sequences.
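
Outside IPython, a %psearch-style lookup can be approximated with the standard library. A rough sketch, where `search_names` is a hypothetical helper and the `math` module stands in for any module such as `np`:

```python
import fnmatch
import math

def search_names(module, pattern):
    """Return the names defined in `module` that match a shell-style pattern."""
    return sorted(fnmatch.filter(dir(module), pattern))

# Example: every name in math that starts with "log"
matches = search_names(math, 'log*')
print(matches)
```

Calling `search_names(np, 'diag*')` should list the same kind of names as the %psearch example above.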

If everything listed above fails (and Google doesn’t have the answer)... don’t despair! Write to the mailing-list suited to your problem: you should have a quick answer if you describe your problem well. Experts on scientific python often give very enlightening explanations on the mailing-list.
- Numpy discussion (numpy-discussion@scipy.org): all about numpy arrays, manipulating them, indexation questions, etc.
- SciPy Users List (scipy-user@scipy.org): scientific computing with Python, high-level data processing, in particular with the scipy package.
- matplotlib-users@lists.sourceforge.net for plotting with matplotlib.