Introduction to / review of Python¶

For a full version of these notes with the examples filled in click here. This is useful if you fall behind or want to refer back to these notes later.

Why Python?¶

Clean and powerful language designed by people who highly value elegance and readability.
Strong set of 3rd party analysis tools that are professionally and actively developed.
- In particular a lot of momentum behind use of Python to develop scientific tools over the last decade: If you analyze data from Chandra, LOFAR, Fermi, Herschel, HST, JWST, ALMA, EVLA (etc etc) you will likely end up using Python even if you didn’t intend it. So learning to use Python for for Astronomy data analysis has a good chance of paying off.
Interfaces well with existing C, C++, and FORTRAN libraries, providing access to existing libraries, such as robust numerical libraries like BLAS, ATLAS, etc. but with an easier to use interface.
- Easier then to write numerical code, while also interacting with databases, using data from the web, visualization, and other such tasks that a higher-level language makes easier, all without leaving Python.
Very active developer community.
Free as in open source, and "free as in beer". Anyone can run Python on almost any platform for free--neither you nor your collaborators will need to worry about having the proper licenses to run software written in Python (unless of course the software itself is proprietary, though that is uncommon).

Why not?

Switching from a familiar analysis environment is NOT FREE (hours * dollars/hour * overhead_rate)
- “Ugh, I could do this so much faster the old way.”
- “I can't find the functions I need to do [insert favorite IDL function].”
- “Stupid Python, why isn’t this working?!”
- “These plots look ugly.”

However, Python has been actively used in the astronomy community for 10+ years, and has even driven some of the development of Python.

Table of Contents¶

1. Individual things
2. Commands that operate on things
- EXERCISE 1 - Introducing logistic growth
3. Collections of things
- EXERCISE 2 - Storing population size in a list
- EXERCISE 3 - Storing data in arrays
4. Repeating yourself
- EXERCISE 4 - Using loops to repeat calculations
5. Making choices
- EXERCISE 5 - Making the model stochastic with an if statement
6. Creating chunks with functions and modules
- EXERCISE 6 - Creating a logistic growth function
- EXERCISE 7 - Putting the growth function(s) in a module

1. Individual things¶

The most basic component of any programming language are "things", also called objects.

The most common basic "things" in Python are integers, floats, strings, booleans, and some special objects of various types. We'll meet many of these as we go through the lesson.

TIP: To run the code in a cell quickly, press Ctrl-Enter.

TIP: To quickly create a new cell below an existing one, hit the ESC key to go into the notebook's "command mode" and then hit the 'b' key. Other shortcuts for making, deleting, and moving cells are in the menubar at the top of the screen.

In [1]:

# A thing
2

Out[1]:

In [2]:

# Use print to show multiple things in the same cell
# Note that you can use single or double quotes for strings
print 2
print "hello"

2
hello

In [3]:

print 'hello'

hello

In [4]:

print 'hello "AAS"'

hello "AAS"

In [5]:

# Things can be stored as variables
a = 2
b = 'hello'
c = True
print a, b, c

2 hello True

In [6]:

a = b

In [7]:

print a

hello

In [8]:

# The type function tells us the type of thing we have
print type(a)
print type(b)
print type(c)
print type(3)

<type 'str'>
<type 'str'>
<type 'bool'>
<type 'int'>

^Back to top

2. Commands that operate on things¶

Just storing data in variables isn't much use to us. Right away, we'd like to start performing operations and manipulations on data and variables.

There are three very common means of performing an operation on a thing.

2.1 Use an operator¶

All of the basic math operators work like you think they should for numbers. They can also do some useful operations on other things, like strings. There are also boolean operators that compare quantities and give back a bool variable as a result.

In [9]:

# Standard math operators work as expected in numerical expressions
print 2 + 3
print 2 * 3
print 2 ** 3

5
6
8

In [10]:

# We can also assign numbers to named variables for use in expressions
a = 2
b = 3
print a + b
print a * b
print a ** b

5
6
8

In [13]:

# But be careful with integer division
# Also watch out for exponentiation - you want **, not ^
print 2/3
print 5//4
print 2/3.0

0
1
0.666666666667

In [16]:

# There are also operators for strings
print 'hello' + ' world'
print 'hello' * 3
a = '1'
b = 2
print a * b

hello world
hellohellohello
11

In [23]:

# Boolean operators compare two things
print 1 > 3
print 3 == 3
print (1 > 3) or (3 == 3)
print (1 > 3) and (3 == 3)

False
True
True
False

In [24]:

# This prints the "Zen of Python" which lists some of the principles
# behind the design of Python itself, as well was (ideally) software
# written in Python
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

2.2 Use a function¶

These will be very familiar to anyone who has programmed in any language, and work like you would expect.

In [25]:

# There are thousands of functions that operate on things
print type(3)
print len('hello')
print round(3.3)

<type 'int'>
5
3.0

TIP: To find out what a function does, you can type it's name and then a question mark to get a pop up help window. Or, to see what arguments it takes, you can type its name, an open parenthesis, and hit tab.

In [26]:

round?
round()

TIP: Many useful functions are not in the Python built in library, but are in external scientific packages. These need to be imported into your Python notebook (or program) before they can be used. Probably the most important of these are numpy and matplotlib.

In [4]:

# Many useful functions are in external packages
# Let's meet numpy
import numpy as np
np.array

Out[4]:

<function numpy.core.multiarray.array>

In [30]:

# To see what's in a package, type the name, a period, then hit tab
np?
np.asscalar

In [32]:

# Some examples of numpy functions and "things"
print np.sqrt(4)
print np.pi
print np.sin(np.pi)

2.0
3.14159265359
1.22464679915e-16

2.3 Use a method¶

Before we get any farther into the Python language, we have to say a word about "objects". We will not be teaching object oriented programming in this workshop, but you will encounter objects throughout Python (in fact, even seemingly simple things like ints and strings are actually objects in Python).

In the simplest terms, you can think of an object as a small bundled "thing" that contains within itself both data and functions that operate on that data. For example, strings in Python are objects that contain a set of characters and also various functions that operate on the set of characters. When bundled in an object, these functions are called "methods".

Instead of the "normal" function(arguments) syntax, methods are called using the syntax variable.method(arguments).

In [34]:

# A string
a = 'hello'
print a.capitalize()
print a.replace('l', 'x')
print a.upper()
len(a)

Hello
hexxo
HELLO

Out[34]:

In [35]:

# Note, most (finite) collections of things have a length that can
# be found with the len() operator on that object (what is a string
# a collection of?)
len([1, 2, 3])

Out[35]:

In [36]:

# Scalar values like individual numbers do not have a length
len(2.3)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-36-adc6b044d635> in <module>()
----> 1 len(2.3)

TypeError: object of type 'float' has no len()

In [ ]:

# Objects have bundled methods

EXERCISE 1 - Introducing logistic growth¶

Throughout this lesson, we will successively build towards a program that will calculate the logistic growth of a population of bacteria in a petri dish (or bears in the woods, if you prefer). The exercises will build on each other - if at any time you get behind, you can open the notebook python-full to review answers to the previous exercises and catch up.

As a reminder, a commonly used discrete time equation for logistic population growth is

$$ n(t+1) = n(t) + r n(t) [1 - n(t) / K] $$

where $ n(t)$ is the population size at time $ t $, $ r $ is the net per capita growth rate, and $ K $ is the carrying capacity of the dish/woods.

To get started, write Python expressions that do the following:

Create variables for r, K, and n0, setting these equal to 0.6, 100, and 10, respectively.
Create the variable n1 and calculate its value. Do the same for n2.
Check the type of n2 - what is it?
Modify your calculations for n1 and n2 so that these values are rounded to the nearest

integer.

Bonus

Test whether n2 is larger than 20, and print out a line that says "n2 more than 20: "

followed by the answer (either True or False).

Figure out how to test whether n2 is an integer (a mathematical integer, not necessarily

whether it is an integer type) (HINT: look at the methods of n2 by typing n2. and pressing tab.)

In [42]:

r = 0.6
K = 100.0
n0 = 10

n1 = round(n0 + (r * n0) * (1 - n0 / K))
n2 = round(n1 + (r * n1) * (1 - n1 / K))

print n0, n1, n2
print type(n2)

print 'n2 more than 20:', n2 > 20
print 'n2 is an integer', n2.is_integer()

10 15.0 23.0
<type 'float'>
n2 more than 20: True
n2 is an integer True

^Back to top

3. Collections of things¶

Once the number of variables that you are interested in starts getting large, working with them all individually starts to get unwieldy. To help stay organized, we can use collections of things.

Probably 99% of your work in scientific Python will use one of four types of collections: lists, tuples, dictionaries, and numpy arrays. We'll look quickly at each of these and what they can do for you.

3.1 Lists¶

Lists are probably the handiest and most flexible type of container. Lists are declared with square brackets []. Individual elements of a list can be selected using the syntax a[ind].

In [43]:

# Lists are created with square bracket syntax
a = ['hi', 'hello', 'yo']
print a
print type(a)

['hi', 'hello', 'yo']
<type 'list'>

In [44]:

# Lists (and all collections) are also indexed with square brackets
# NOTE: The first index is zero, not one
print a[0]
print a[1]

hi
hello

In [45]:

# Indexing on a variable like `a` is exactly the same as if we were
# directly indexing the object pointed to by that variable
['hi', 'hello', 'yo'][0]

Out[45]:

'hi'

In [47]:

# Lists can be sliced by putting a colon between indexes
# NOTE: The end value is not inclusive
print a[0:2]
print [a[0], a[1]]

['hi', 'hello']
['hi', 'hello']

In [54]:

# You can leave off the start or end if desired
print a[:2]  # same as a[0:2]
print a[2:]
print a[0:0]
print a[:]
print a[:-1]  # a[0:len(a)-2]
print a[:-2]

['hi', 'hello']
['yo']
[]
['hi', 'hello', 'yo']
['hi', 'hello']
['hi']

In [ ]:

# Just remember that a[:] makes a *copy* of list

In [58]:

# Lists are objects, like everything else, and have methods such as append
a.append('hiya')
print a

a.append([1, 2])
print a

a.append(20)
print a

['hi', 'hello', 'yo', 'hiya', 'hiya', [1, 2], 'hiya', [1, 2], 20, 'hiya']
['hi', 'hello', 'yo', 'hiya', 'hiya', [1, 2], 'hiya', [1, 2], 20, 'hiya', [1, 2]]
['hi', 'hello', 'yo', 'hiya', 'hiya', [1, 2], 'hiya', [1, 2], 20, 'hiya', [1, 2], 20]

In [60]:

# A very common use of lists is to create a range of integers
# This is easy in Python using the built-in range function like
# range(start, end); note that like the 'slice' notation the end value
# is *not* inclusive
print range(10)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

TIP: A 'gotcha' for some new Python users is that many collections, including lists, actually store pointers to data, not the data itself. This means that you need to be careful with writing things b = a and then changing a, since this will also change b. Have a look at the copy module if you want to make copies of variables.

EXERCISE 2 - Storing population size in a list¶

Copy your code from Exercise 1 into the box below, and do the following:

Modify your code so that the values of n0, n1, n2, and n3 are stored in a list and not as

separate individual variables. HINT: You can start off by declaring an empty list using the syntax n = [], and then append each new calculated value of nt to the list.

Get the first and last values in the list, calculate their ratio, and print out "Grew by a factor of "

followed by the result.

Bonus

Extract the last value in two different ways: first, by using the index for

the last item in the list, and second, presuming that you do not know how long the list is.

Change the values of r and K to make sure that your cell still runs correctly and gives

reasonable answers.

In [62]:

######################################
# This code deletes our old variables
try: del n0, n1, n2, r, K
except: pass
######################################
r = 0.6
K = 100.0
n = []
n.append(10)

n.append(round(n[0] + (r * n[0]) * (1 - n[0] / K)))
n.append(round(n[1] + (r * n[1]) * (1 - n[1] / K)))
n.append(round(n[2] + (r * n[2]) * (1 - n[2] / K)))

print n
print 'Grew by a factor of', n[-1]/n[0]

[10, 15.0, 23.0, 34.0]
Grew by a factor of 3.4

3.2 Dictionaries¶

Dictionaries are the collection to use when you want to store and retrieve things by their names (or some other kind of key) instead of by their position in the collection. A good example is a set of model parameters, each of which has a name and a value. Dictionaries are declared using {}.

In [63]:

# Make a dictionary of model parameters
params = {'n0': 10, 'r': 0.5}
print params
print params['r']

{'n0': 10, 'r': 0.5}
0.5

In [64]:

params['K'] = 200

In [66]:

# All the variables you define are actually saved in a special
# dictionary internal to Python that can you can access using the built-in
# locals() function:

K = 200
locals()['K']

Out[66]:

3.3 Tuples¶

We won't say a whole lot about tuples except to mention that they basically work just like lists, with two major exceptions:

You declare tuples using () instead of []
Once you make a tuple, you can't change what's in it

You'll see tuples come up throughout the Python language, and over time you'll develop a feel for when to use them. In general, they're often used instead of lists to group items when the position in the collection is critical to understanding the item's meaning, such as (x,y), and when you want to make sure that you don't accidentally modify any of the items later.

3.4 Numpy arrays (ndarrays)¶

Even though numpy arrays (often written as ndarrays, for n-dimensional arrays) are not part of the core Python libraries, they are so useful in scientific Python that we'll include them here in the core lesson. Numpy arrays are collections of things, all of which must be the same type, that work similarly to lists (as we've described them so far). The most important are:

You can easily perform elementwise operations (and matrix algebra) on arrays
Arrays can be n-dimensional
Arrays must be pre-allocated (ie, there is no equivalent to append)

Arrays can be created from existing collections such as lists, or instantiated "from scratch" in a few useful ways.

When getting started with scientific Python, you will probably want to try to use ndarrays whenever possible, saving the other types of collections for those cases when you have a specific reason to use them.

In [67]:

# Python lists don't support numerical operations (Python has no way of ensuring
# that all elements of the list are a numeric type, since lists can contain a mixture
# of any object types)
a = [1, 2, 3]
a + 4

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-67-38b719833a7f> in <module>()
      1 a = [1, 2, 3]
----> 2 a + 4

TypeError: can only concatenate list (not "int") to list

In [68]:

# (Surprisingly, multiplication of a list by a number does work, but it doesn't do what
# we might want to do with numerical data...)
a * 4

Out[68]:

[1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3]

In [69]:

# Make an array from a list
import numpy as np

alist = [2, 3, 4]
blist = [5, 6, 7]

a = np.array(alist)
b = np.array(blist)

print a
print b

[2 3 4]
[5 6 7]

In [75]:

# Do arithmetic on arrays
print a + 4
print a ** 2
print np.sin(a)
print a * b

# Broadcasting in Numpy

print np.dot(a, b)

[6 7 8]
[ 4  9 16]
[ 0.90929743  0.14112001 -0.7568025 ]
[10 18 28]
56

In [82]:

# Boolean operators work on arrays too, and they return boolean arrays
print a
print a > 2
print a[a > 2]

c = a > 2
type(c)
print c.dtype
print a.dtype

f = np.array([1.0, 2.0, 3.0])
print(f.dtype)

[2 3 4]
[False  True  True]
[3 4]
bool
int64
float64

In [86]:

# Indexing arrays
print a[0]
print a[0:2]

2
[2 3]

In [93]:

c = np.random.rand(3, 3)
print c
c[1][1] = 42
print c
c[1]
# c[1][1]  # don't use this, use this:
c[1, 1]

[[ 0.23976327  0.96496104  0.42249143]
 [ 0.09107841  0.74538248  0.34056617]
 [ 0.44997111  0.84155608  0.29793788]]
[[  0.23976327   0.96496104   0.42249143]
 [  0.09107841  42.           0.34056617]
 [  0.44997111   0.84155608   0.29793788]]

Out[93]:

42.0

In [96]:

# Arrays can also be indexed with other boolean arrays
print a > 2
print a[a > 2]

mask = a > 2
print a[mask]

[False  True  True]
[3 4]
[3 4]

In [101]:

# ndarrays have attributes in addition to methods
print a.dtype
print a.prod()
print a.sum()
print a.mean()
print a.shape

print c.shape

int64
24
9
3.0
(3,)
(3, 3)

In [107]:

# There are handy ways to make arrays full of ones and zeros
print np.zeros(5)
print np.zeros([5, 5])
print np.ones(3)

[ 0.  0.  0.  0.  0.]
[[ 0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.]]
[ 1.  1.  1.]

In [111]:

E = np.empty(5)
print E
E.fill(2)
print E

# arr = malloc(5 * sizeof(double))
# memset(arr, 5, 5 * sizeof(double))

[  1.71777771e-316   1.27945216e-316   6.75386947e-317   5.20185118e-317
   6.90396305e-310]
[ 2.  2.  2.  2.  2.]

In [112]:

# You can also easily make arrays of number sequences
np.identity(5)

Out[112]:

array([[ 1.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.,  0.],
       [ 0.,  0.,  1.,  0.,  0.],
       [ 0.,  0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  1.]])

In [115]:

# Some other useful routines for creating new arrays
# Read the help page for these functions using np.arange? etc.
print np.arange(0, 10)
print np.linspace(0, 10, 5)
print np.linspace(0, 10, 100)

[0 1 2 3 4 5 6 7 8 9]
[  0.    2.5   5.    7.5  10. ]
[  0.           0.1010101    0.2020202    0.3030303    0.4040404
   0.50505051   0.60606061   0.70707071   0.80808081   0.90909091
   1.01010101   1.11111111   1.21212121   1.31313131   1.41414141
   1.51515152   1.61616162   1.71717172   1.81818182   1.91919192
   2.02020202   2.12121212   2.22222222   2.32323232   2.42424242
   2.52525253   2.62626263   2.72727273   2.82828283   2.92929293
   3.03030303   3.13131313   3.23232323   3.33333333   3.43434343
   3.53535354   3.63636364   3.73737374   3.83838384   3.93939394
   4.04040404   4.14141414   4.24242424   4.34343434   4.44444444
   4.54545455   4.64646465   4.74747475   4.84848485   4.94949495
   5.05050505   5.15151515   5.25252525   5.35353535   5.45454545
   5.55555556   5.65656566   5.75757576   5.85858586   5.95959596
   6.06060606   6.16161616   6.26262626   6.36363636   6.46464646
   6.56565657   6.66666667   6.76767677   6.86868687   6.96969697
   7.07070707   7.17171717   7.27272727   7.37373737   7.47474747
   7.57575758   7.67676768   7.77777778   7.87878788   7.97979798
   8.08080808   8.18181818   8.28282828   8.38383838   8.48484848
   8.58585859   8.68686869   8.78787879   8.88888889   8.98989899
   9.09090909   9.19191919   9.29292929   9.39393939   9.49494949
   9.5959596    9.6969697    9.7979798    9.8989899   10.        ]

EXERCISE 3 - Storing data in arrays¶

Copy your code from Exercise 2 into the box below, and do the following:

Modify your code to store the calculation results in an array instead of a list. HINT:

Make sure to pre-allocate space for the array using np.zeros() or something similar. Pre-allocate an array of length 100, as if we were going to fill in 100 time steps, including n0. 2. Imagine that each discrete time step actually represents 0.25 of an hour. Create an array t storing the time, in hours, of each step of the calculations (for example, t0 is 0, t1 is 0.25, t2 is 0.5, etc.) up to 100 time steps (the final time will thus be 24.75). 3. Use boolean indexing to extract the value of n corresponding to a t of 0.5.

In [120]:

r = 0.6
K = 100.0
n = np.zeros(100)
n[0] = 10

n[1] = round(n[0] + r*n[0] * (1 - n[0] / K))
n[2] = round(n[1] + r*n[1] * (1 - n[1] / K))
n[3] = round(n[2] + r*n[2] * (1 - n[2] / K))

#print n

t = np.arange(0, 25, 0.25)
#print t

print n[t > 0.5]

[ 34.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.
   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.
   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.
   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.
   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.
   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.
   0.   0.   0.   0.   0.   0.   0.]

^Back to top

4. Repeating yourself¶

So far, everything that we've done could, in principle, be done by hand calculation. In this section and the next, we really start to take advantage of the power of programming languages to do things for us automatically.

We start here with ways to repeat yourself. The two most common ways of doing this are known as for loops and while loops. For loops in Python are useful when you want to cycle over all of the items in a collection (such as all of the elements of an array), and while loops are useful when you want to cycle for an indefinite amount of time until some condition is met.

The basic examples below will work for looping over lists, tuples, and arrays. Looping over dictionaries is a bit different, since there is a key and a value for each item in a dictionary. Have a look at the Python docs for more information.

In [124]:

# A basic for loop - don't forget the white space!
wordlist = ['hi', 'hello', 'bye']
for word in wordlist:
    print word + '!'
    print '------'

print 'Done loop'

hi!
------
hello!
------
bye!
------
Done loop

In [128]:

# Whitespace is an important delimiter for code blocks in Python
from __future__ import braces

  File "<ipython-input-128-2aebb3fc8ecf>", line 1
    from __future__ import braces
SyntaxError: not a chance

Note on indentation: Notice the indentation once we enter the for loop. Every idented statement after the for loop declaration is part of the for loop. This rule holds true for while loops, if statements, functions, etc. Required identation is one of the reasons Python is such a beautiful language to read.

If you do not have consistent indentation you will get an IndentationError. Fortunately, most code editors will ensure your indentation is correction.

In [129]:

# Indentation error: Fix it!
for word in wordlist:
     print word
   print word

  File "<ipython-input-129-fd4f4b9b1d76>", line 4
    print word
              ^
IndentationError: unindent does not match any outer indentation level

In [ ]:

# Sum all of the values in a collection using a for loop

In [ ]:

# Often we want to loop over the indexes of a collection, not just the items

In [ ]:

# While loops are useful when you don't know how many steps you will need,
# and want to stop once a certain condition is met.

TIP: Once we start really generating useful and large collections of data, it becomes unwieldy to inspect our results manually. The code below shows how to make a very simple plot of an array. We'll do more plotting later on, this is just to get started.

In [5]:

# Load up pylab, a useful plotting library
%matplotlib inline
from matplotlib import pyplot as plt

# Make some x and y data and plot it
x = np.linspace(0, 2 * np.pi)
y = np.sin(x)
plt.plot(x, y)

Out[5]:

[<matplotlib.lines.Line2D at 0x43c5850>]

EXERCISE 4 - Using loops to repeat calculations¶

FINALLY, let's get smart about our calculations of nt. Copy your code from Exercise 3 into the box below, and do the following:

Write a for loop to fill in the values of nt for 100 time steps. HINT: You will need to

create an array of the step numbers using a command like step = range(1, 100). (Why does this list start at 1 and not at 0?). Then, loop over the values of the step list, and use each step value to index the array n.

Plot the array n.
Play around with the values of r and K and see how it changes the plot. What happens if you set

r to 1.9 or 3?

Bonus

Modify your code to use a while loop that will stop your calculation once the population size is

greater than 90. HINT: Start a step counter i at 1, and check that the value in n[i-1] is less than 90 each time around the loop. Increment the step counter within the loop so that you have a record of what step the calculation stopped at.

In [135]:

r = 0.6
K = 100.0
n = np.zeros(100, dtype=float)
n[0] = 10

steps = range(1, 100)
for idx in steps:
    n[idx] = round(n[idx - 1] + r*n[idx - 1] * (1 - n[idx - 1]/K))
    
plt.plot(n)

Out[135]:

[<matplotlib.lines.Line2D at 0x31b0990>]

^Back to top

5. Making choices¶

Often we want to check if a condition is True and take one action if it is, and another action if the condition is False. We can achieve this in Python with an if statement.

TIP: You can use any expression that returns a boolean value (True or False) in an if statement. Common boolean operators are ==, !=, <, <=, >, >=. You can also use is and is not if you want to check if two variables are identical in the sense that they are stored in the same location in memory.

In [139]:

print bool(1)
print bool(0)

z = 1
if z:
    print 'z is non-zero'

True
False
z is non-zero

In [143]:

print bool('hello')
print bool('')
print bool('1')
print bool('0')

True
False
True
True

In [144]:

# A simple if statement
x = 3
if x > 0:
    print 'x is positive'
elif x < 0:
    print 'x is negative'
else:
    print 'x is zero'

x is positive

In [ ]:

# If statements can rely on boolean variables

EXERCISE 5 - Making the model stochastic with an if statement¶

Deterministic models are boring, so let's introduce some element of randomness into our logistic growth model. We'll model a simple "catastrophe" process, in which a catastrophe happens in 10% of the time steps that reduces the population back down to the size at n0. Copy your code from Exercise 4 into the box below, and do the following:

Inside your for loop, add a variable called cata, for catastrophe, that will be True if a catastrophe

has occurred, and False if it hasn't. A simple way to do this is to generate a random number using np.random.rand(), which will give you a pseudorandom number between 0 and 1. Check whether this number is less than 0.1 - this check will be True 10% of the time.

Using your boolean variable cata, add an if statement to your for loop that checks whether cat is

true in each time step. If it is true, set the population back to the size at n[0]. Otherwise, perform the usual logistic growth calculation.

Plot your results. Run the cell again to see a different random population growth path.

Bonus

Now that you have the array n, count the number of time steps in which the population was above 50.

Although you can do this with a for loop (loop through each value of nt, check if it is > 50, and if so increment a counter), you can do this in one line with a simple boolean operation. HINT: If you take the sum of a boolean array (using np.sum()), it will give you the number of True values (since a True is considered to be a 1, and False is a 0).

In [149]:

a = 'x is nonzero' if x else 'x is zero'
print a

x is nonzero

In [151]:

r = 0.6
K = 100.0
n = np.zeros(100, dtype=float)
n[0] = 10

steps = range(1, 100)
for idx in steps:
    catastrophe = np.random.rand() < 0.42
    if catastrophe:
        n[idx] = n[0]
    else:
        n[idx] = round(n[idx - 1] + r*n[idx - 1] * (1 - n[idx - 1]/K))
    
plt.plot(n)

Out[151]:

[<matplotlib.lines.Line2D at 0x385c410>]

In [148]:

if x: print 'x is nonzero'

x is nonzero

^Back to top

6. Creating chunks with functions and modules¶

One way to write a program is to simply string together commands, like the ones described above, in a long file, and then to run that file to generate your results. This may work, but it can be cognitively difficult to follow the logic of programs written in this style. Also, it does not allow you to reuse your code easily - for example, what if we wanted to run our logistic growth model for several different choices of initial parameters?

The most important ways to "chunk" code into more manageable pieces is to create functions and then to gather these functions into modules, and eventually packages. Below we will discuss how to create functions and modules. A third common type of "chunk" in Python is classes, but we will not be covering object-oriented programming in this workshop.

In [ ]:

# We've been using functions all day

In [152]:

# It's very easy to write your own functions
def multiply(x, y):
    return x * y

In [154]:

# Once a function is "run" and saved in memory, it's available just like any other function
print type(multiply)
print multiply(2, 3)

<type 'function'>
6

In [157]:

# It's useful to include docstrings to dxescribe what your function does
def say_hello(time, people):
    """
    This function says a greeting.  useful for engendering good will.
    
    Parameters
    ==========
    time : str
        The time at which we are greeting our guest.
    people : object
        The people who we want to greet (could be a string, a list, etc.)
    """
    
    return 'Good ' + time + ', ' + people

In [156]:

say_hello?

Docstrings: A docstring is a special type of comment that tells you what a function does. You can see them when you ask for help about a function.

In [ ]:

# All arguments must be present, or the function will return an error

In [ ]:

# Keyword arguments can be used to make some arguments optional by giving them a default value
# All mandatory arguments must come first, in order

EXERCISE 6 - Creating a logistic growth function¶

Finally, let's turn our logistic growth model into a function that we can use over and over again. Copy your code from Exercise 5 into the box below, and do the following:

Turn your code into a function called logistic_growth that takes four arguments: r, K, n0,

and p (the probability of a catastrophe). Make p a keyword argument with a default value of 0.1. Have your function return the n array.

Write a nice docstring describing what your function does.
In a subsequent cell, call your function with different values of the parameters to make sure it works.

Store the returned value of n and make a plot from it.

Bonus

Refactor your function by pulling out the line that actually performs the calculation of the new

population given the old population. Make this line another function called grow_one_step that takes in the old population, r, and K, and returns the new population. Have your logistic_growth function use the grow_one_step function to calculate the new population in each time step.

In [7]:

# Set up initial model parameters
def logistic_growth(r, K, n0, prob_catastrophe=0.1):
    """
    Function to simulate discrete time stocastic logistic growth. 
    
    Parameters
    ----------
    r : float
        Reproductive rate of the population.
    K : int
        Carrying capacity of the environment.
    n0 : int
        Initial population size
    prob_catastrophe : float
        Probability of a catastrophe resetting the population
        size to n0 (default: 0.1)
        
    Returns
    -------
    n : ndarray
        Array of 100 time steps of population values
    """
    
    n = np.zeros(100)
    n[0] = float(n0)

    # Loop through all time steps
    steps = np.arange(1, 100)
    for idx in steps:
        cata = np.random.rand() < prob_catastrophe
        if cata:
            n[idx] = n[0]
        else:
            n[idx] = grow_one_step(n[idx -1], r, K)
    return n


def grow_one_step(prev_population, r, K):
    return round(prev_population + r*prev_population*(1 - prev_population/K))

In [8]:

grow_one_step

Out[8]:

<function __main__.grow_one_step>

In [9]:

n = logistic_growth(0.6, 100, 10)
plt.plot(n)

Out[9]:

[<matplotlib.lines.Line2D at 0x4598ad0>]

Advanced solution--we can change logistic_growth to a function that takes a function step_func as one of its arguments. That is, when you call logistic_growth you pass in as an argument a function that itself takes three arguments: the previous population size, the growth rate, and the carrying capacity. It then returns the population size for the next time step. Note that logistic_growth no longer has the "stochastic" feature...yet. We'll come to that:

In [15]:

# Set up initial model parameters
def grow_one_step(prev_population, r, K):
    return round(prev_population + r*prev_population*(1 - prev_population/K))


def logistic_growth(r, K, n0, step_func=grow_one_step):
    """
    Function to simulate discrete time stocastic logistic growth. 
    
    Parameters
    ----------
    r : float
        Reproductive rate of the population.
    K : int
        Carrying capacity of the environment.
    n0 : int
        Initial population size
    step_func : function
        A function which, given the previous population, the growth rate,
        and the carrying capacity, returns the population size at the next
        time step.  By default this computes:
        
        .. math::
        
            n(t+1) = n(t) + r n(t) [1 - n(t) / K]
        
    Returns
    -------
    n : ndarray
        Array of 100 time steps of population values
    """
    
    n = np.zeros(100)
    n[0] = float(n0)

    # Loop through all time steps
    steps = np.arange(1, 100)
    for idx in steps:
        n[idx] = step_func(n[idx -1], r, K)
        
    return n

In [16]:

plt.plot(logistic_growth(0.6, 100, 10))

Out[16]:

[<matplotlib.lines.Line2D at 0x4e8e5d0>]

This now implements our basic logistic_growth (without the stochastic feature). The default step_func computes the next time step normally, and we can call logistic_growth without specifying the step_func argument, since it has a default value pointing to the grow_one_step function.

However, our logistic_growth function was designed so that the step_func called in its internal loop can be customized, so we can implement more sophisticated growth models without rewriting the entire logistic_growth function. This is a "Pythonic" example of a programming design pattern called the strategy pattern. So now say we want to implement the stochastic model. We can define a new function that wraps the basic grow_one_step but includes the probability for a catastrophe:

In [17]:

def grow_one_step_stochastic(prev_population, rate, carrying_capacity, init_population=10,
                              prob_catastrophe=0.1):
    """
    Computes the next population size using a stochastic model.
    
    With a probability of `prob_catastrophe` (given as a value from 0 to 1), the population
    size is reset back to `init_population` rather than growing larger.
    """
    
    cata = np.random.rand() < prob_catastrophe
    if cata:
        return init_population
    else:
        return grow_one_step(prev_population, rate, carrying_capacity)

But now we have a problem--the interface defined by logistic_growth requires the function passed to step_func to take three arguments. But our grow_one_step_stochastic requires five! We have specified defaults for init_population and prob_catastrophe. So if we pass in grow_one_step_stochastic with the defaults for this arguments it will still work:

In [18]:

n = logistic_growth(0.6, 100, 10, step_func=grow_one_step_stochastic)
plt.plot(n)

Out[18]:

[<matplotlib.lines.Line2D at 0x5102750>]

But what if we want to change to init_population and prob_catastrophe parameters to grow_one_step_stocastic? One thing we can do is use an advanced technique called "partial evaluation". We create a new function from grow_one_step_stochastic that has new values for init_population and prob_catastrophe "baked in", while leaving the other arguments (rate, etc.) as required arguments.

Python has a utility in the functools library called functools.partial that makes it easy to make partially-evaluated functions. To start with a simpler example:

In [21]:

import functools

def say_hello(name, time='morning'):
    print 'Good', time, name + '!'
    
say_hello('Erik', time='afternoon')

Good afternoon Erik!

If I want to say "Good afternoon" I have to specify the time='afternoon' argument every time I call say_hello (otherwise it will say "Good morning" by default). I can "bake in" time='afternoon' by making a partial evaluation of say_hello:

In [23]:

say_good_afternoon = functools.partial(say_hello, time='afternoon')

In [24]:

say_good_afternoon('Erik')

Good afternoon Erik!

We can use this with our grow_one_step_stochastic to specify, for example, the probability of catastrophe when we pass it in to logistic_growth:

In [27]:

n = logistic_growth(0.6, 100, 10,
                    step_func=functools.partial(grow_one_step_stochastic, prob_catastrophe=0.4))
plt.plot(n)

Out[27]:

[<matplotlib.lines.Line2D at 0x537a210>]

EXERCISE 7 - Putting the growth functions in a module¶

We can make our functions more easily reusable by placing them into modules that we can import, just like we have been doing with numpy. It's pretty simple to do this.

Copy your function(s) from Exercise 6 into a new text file, in the same directory as this notebook,

called pop.py.

In the cell below, type import pop to import the module. Type pop. and hit tab to see the available

functions in the module. Try running the logistic growth function that was imported from the module with a few different parameter values.

In [ ]:

# import pop
# plt.plot(pop.logistic_growth(0.6, 100, 10))

^Back to top