You have some experience with programming (preferably in a scripting-like language, e.g., IDL or similar).
Forty-five minutes of coverage can only hit certain points. People will be available to answer specific questions that come up in the activities. Otherwise, refer to the documentation if its location is obvious (googling is often the fastest way).
Using Python 2, but with Python 3 print and division behavior (we will be switching to Python 3 within a year as our main development and operations platform)
Using IPython notebook (future name: Jupyter) for nearly all tutorial material. You don't have to use it for general work, but we will require project activities to use it to display their results. Some instruction on using it will be given later.
Since many of the attendees have significant IDL experience:
Python is much more general and powerful than IDL, but IDL does make some things more convenient for numerical analysis since it is specialized for that. One just has to live with that. The plusses far outweigh the minuses.
For IDL users, there are multiple web pages that map IDL operations into their Python/numpy/matplotlib equivalents:
More for the library equivalences:
We will occasionally address differences in the following material.
To get the notebooks, type in a terminal window:
git clone https://github.com/spacetelescope/UserTraining2015.git
A new directory tree named UserTraining2015 should appear in the current directory. The notebooks are in that directory.
%matplotlib inline
# import statements will be talked about later, for now, just do it.
from __future__ import print_function, division # Makes both of these Python 3 form.
# Good to get used to...
x = 1
y = 3.14
s = "Hello There"
print(s)
Hello There
# string operations
print(s + s)
print(2*s)
print(len(s))
Hello ThereHello There
Hello ThereHello There
11
# Lists
L = ["red", "green", "blue"] # yes, case matters in variable names
print(L)
# lists can contain objects of any type, even other lists
# But usual practice is to limit members to items that share some essential property.
L = L + [2]
print(L)
['red', 'green', 'blue']
['red', 'green', 'blue', 2]
# Indexing, access a list member
print(L[1])
# Slicing, extract a sublist
print(L[1:3]) # second index is non-inclusive!
# huh?
green
['green', 'blue']
# Indexing conveniences
L[-1]
2
L[:4:2]
['red', 'blue']
# indexing works on strings too
s[::-2]
'eeTolH'
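The indexing and slicing rules shown so far can be collected in one short sketch. The list `colors` is a hypothetical example, not part of the tutorial data, and the code is written so that it should run under both Python 2 (with the `__future__` import) and Python 3.

```python
from __future__ import print_function, division

colors = ["red", "green", "blue", "pink", "purple"]  # hypothetical example data

# A slice is [start:stop:step]. The stop index is excluded, so with a
# step of 1 the result has stop - start elements.
middle = colors[1:3]        # elements 1 and 2
first_two = colors[:2]      # omitted start means "from the beginning"
last_two = colors[-2:]      # negative indices count from the end
every_other = colors[::2]   # step of 2
backwards = colors[::-1]    # negative step walks the list in reverse

print(middle, first_two, last_two, every_other, backwards)

# Slices that share a boundary index rebuild the original exactly,
# which is one practical payoff of excluding the stop index.
rebuilt = colors[:2] + colors[2:]
print(rebuilt == colors)
```

The exclusive stop index is what surprises most newcomers; the last two lines show why it is convenient in practice.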
# Modify a list
# Replace an element
L[-1] = "pink"
L.append('purple')
print(L)
# insertion via slice assignment (here the two new items replace 'green')
L[1:2] = ['orange', 'brown']
print(L)
# deletion
del L[4:6]
['red', 'green', 'blue', 'pink', 'purple']
['red', 'orange', 'brown', 'blue', 'pink', 'purple']
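Slice assignment replaces whatever the slice covers, so whether it inserts or overwrites depends on the slice. The sketch below, using a hypothetical `colors` list, contrasts it with the list methods that do one job each.

```python
from __future__ import print_function, division

colors = ["red", "green", "blue"]  # hypothetical example data

colors.insert(1, "orange")   # pure insertion before index 1
step1 = list(colors)         # ['red', 'orange', 'green', 'blue']

colors[1:1] = ["brown"]      # empty slice: inserts without replacing anything
step2 = list(colors)         # ['red', 'brown', 'orange', 'green', 'blue']

colors[0:2] = ["white"]      # two elements replaced by one
step3 = list(colors)         # ['white', 'orange', 'green', 'blue']

removed = colors.pop()       # remove and return the last element
colors.remove("orange")      # remove the first element equal to 'orange'

print(step1, step2, step3, removed, colors)
```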
# Dictionaries: Indexing a collection by arbitrary names or objects
D = {'M1':('5:34.5','22:01'), 'M51':('13:29.9','47:12'), 'M27':('19:59.6','22:43')}
D['M51']
('13:29.9', '47:12')
# Note about Tuples. Just like lists except use parentheses.
# Single element Tuple must have a comma, otherwise it is treated as a math expression:
et = (3.14,)
# Can be indexed like lists, but cannot be changed once created (immutable)
# only immutable objects can be used as keys to dictionaries (e.g., numbers, strings, tuples)
# Tuples usually used to contain different kinds of things (unlike lists),
# but with a consistent arrangement, e.g., ("M51", ra, dec, "cool interacting galaxy")
D['M51'][0] = 0 # try to set a new RA; note "traceback" on an error
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-10-16a0a2ced564> in <module>()
      6 # Tuples usually used to contain different kinds of things (unlike lists),
      7 # but with a consistent arrangement (e.g., ("M51", ra, dec, "cool interacting galaxy")
----> 8 D['M51'][0] = 0 # try to set a new RA; note "traceback" on an error

TypeError: 'tuple' object does not support item assignment
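Since a tuple cannot be changed in place, the usual way to "update" one is to build a new tuple and rebind the dictionary entry. A minimal sketch follows; `catalog` is a hypothetical stand-in for `D`, and the replacement RA string is a placeholder, not a real coordinate.

```python
from __future__ import print_function, division

catalog = {"M51": ("13:29.9", "47:12")}  # hypothetical stand-in for D

ra, dec = catalog["M51"]                 # tuple unpacking

# In-place assignment fails, exactly as in the traceback above.
try:
    catalog["M51"][0] = "00:00.0"
    changed_in_place = True
except TypeError:
    changed_in_place = False

# Rebinding the key to a brand-new tuple works.
catalog["M51"] = ("00:00.0", dec)        # placeholder RA for illustration

# Tuples are hashable, so they can serve as dictionary keys; lists cannot.
by_position = {("13:29.9", "47:12"): "M51"}
try:
    bad = {["13:29.9", "47:12"]: "M51"}
    list_key_ok = True
except TypeError:
    list_key_ok = False

print(changed_in_place, catalog, list_key_ok)
```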
Many other useful data structures not covered here:
# Note about how variables and values work
L2 = L # L2 is not a copy of L, it refers to the same list as L
L3 = L[:] # this is a copy
L[0] = 'lime green'
print(L2)
print(L3)
['lime green', 'orange', 'brown', 'blue']
['red', 'orange', 'brown', 'blue']
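One caveat worth seeing once: `L[:]` makes a shallow copy, so nested lists are still shared between the original and the copy. The sketch below uses a hypothetical nested list and the standard `copy.deepcopy` to contrast the three cases.

```python
from __future__ import print_function, division
import copy

nested = [["red", "green"], ["blue"]]   # hypothetical example data

alias = nested                  # same list object, just another name
shallow = nested[:]             # new outer list, same inner lists
deep = copy.deepcopy(nested)    # new outer list and new inner lists

nested[0][0] = "lime green"     # change an inner list
nested.append(["pink"])         # change the outer list

print(alias)     # sees both changes
print(shallow)   # sees the inner change only
print(deep)      # sees neither change
```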
# Libraries store code in modules and packages (think of packages as groups of modules).
# A module is a Python file in Python's search path (it can also be compiled C code...)
# Multiple ways of importing the code into Python
import math # contents in the "math" namespace
print(math.sqrt(2.))
print(sqrt(2.)) # doesn't work
1.41421356237
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-12-8320e6c86738> in <module>()
      4 import math # contents in the "math" namespace
      5 print(math.sqrt(2.))
----> 6 print(sqrt(2.)) # doesn't work

NameError: name 'sqrt' is not defined
import math as m # give an alias to the original name
print(m.sqrt(2))
from math import sqrt # move sqrt into the current namespace
print(sqrt(2)) # works now!
from math import * # dump all things from math into current namespace
print(floor(2.5)) # very convenient, but usually a very, very bad idea,
# particularly in scripts and libraries
# Name collisions possible (one function hides another),
# Or hard to determine where a function comes from if many modules imported
1.41421356237
1.41421356237
2.0
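The name-collision risk mentioned in the comments is easy to demonstrate with two standard library modules that both define `sqrt`. This is only an illustration of the hazard; the variable names are arbitrary.

```python
from __future__ import print_function, division
import math
import cmath

from math import sqrt
real_root = sqrt(4)        # math.sqrt returns a float

from cmath import sqrt     # silently replaces the sqrt imported above
complex_root = sqrt(4)     # cmath.sqrt returns a complex number

print(real_root, complex_root)

# Qualified names never collide, and they show where a function comes from.
print(math.sqrt(4), cmath.sqrt(-1))
```

No warning is issued when the second import hides the first, which is why qualified names (or short aliases such as `np`) are the safer habit in scripts and libraries.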
Important note: if you are developing a module of your own and need to reload it after changing its file,
import mymodule
will not reload your module.
Python sees that the module is already loaded and concludes that nothing needs to be done.
You must use the following form instead:
reload(mymodule)
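The following sketch demonstrates the behavior end to end with a throwaway module written to a temporary directory. The module name `mymodule_demo` is hypothetical. One detail not covered above: `reload` is a built-in only in Python 2, while in Python 3 it lives in `importlib`, so the sketch picks whichever is available.

```python
from __future__ import print_function, division
import importlib
import os
import sys
import tempfile

try:
    reload_module = reload             # Python 2 built-in
except NameError:
    reload_module = importlib.reload   # Python 3 location

sys.dont_write_bytecode = True         # keep the demo free of cached bytecode

workdir = tempfile.mkdtemp()
module_path = os.path.join(workdir, "mymodule_demo.py")  # hypothetical module

with open(module_path, "w") as f:
    f.write("VALUE = 1\n")

sys.path.insert(0, workdir)
import mymodule_demo
first = mymodule_demo.VALUE

# Simulate editing the file after it has been imported.
with open(module_path, "w") as f:
    f.write("VALUE = 22  # edited\n")

import mymodule_demo                   # no effect: module is already loaded
after_import = mymodule_demo.VALUE

reload_module(mymodule_demo)           # re-executes the file
after_reload = mymodule_demo.VALUE

print(first, after_import, after_reload)
```

Names that were copied out earlier with `from mymodule_demo import VALUE` would still hold the old value after a reload, which is one more reason to prefer qualified access while developing.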
print(s.upper()) # change to upper case
print(s.find('llo')) # is the supplied string inside s?
print(s.split()) # split string into multiple strings as a list; defaults to splitting on whitespace
HELLO THERE
2
['Hello', 'There']
# Formatted printing
pi = 3.1415926
print("{}".format(pi))
print("{1} {0}".format("hello",pi))
print("{:.2f}".format(pi))
# See https://docs.python.org/2/library/string.html#format-specification-mini-language
# for all formatting options and many examples
3.1415926
3.1415926 hello
3.14
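A few more format specifications that come up often when printing tables of numbers are sketched below. The variable names are arbitrary; the specifications themselves come from the format mini-language linked above.

```python
from __future__ import print_function, division

pi = 3.1415926

fixed = "{:8.3f}".format(pi)          # width 8, 3 decimals, right-aligned
sci = "{:.2e}".format(pi * 1000)      # scientific notation
padded = "{:05d}".format(42)          # zero-padded integer
left = "{:<6}|".format("M51")         # left-aligned in a field of width 6
named = "{name}: {value:.1f}".format(name="pi", value=pi)  # named fields

for text in (fixed, sci, padded, left, named):
    print(repr(text))
```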
A word about indentation in Python
Indentation Matters! No use of delimiters to define blocks. What you see is what you get. Unless you use tabs!
Never ever use real tabs for indentation. Most editors can be set to turn tabs into spaces.
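To make "indentation matters" concrete, here is a sketch of two hypothetical functions that differ only in how far one line is indented, and therefore in which block that line belongs to.

```python
from __future__ import print_function, division

def sum_positive(values):
    total = 0
    for v in values:
        if v > 0:
            total += v      # inside the if block: runs only for positive v
    return total

def sum_all(values):
    total = 0
    for v in values:
        if v > 0:
            pass            # placeholder so the if block is not empty
        total += v          # one level out: runs for every v
    return total

data = [1, -2, 3]
print(sum_positive(data), sum_all(data))
```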
# define a simple function
def square(x):
    '''square supplied number''' # this is a docstring; many tools can use this automatically
    newvalue = x**2
    return newvalue
print(square(4))
print(square.__doc__)
square?
16
square supplied number
# handling looping
def first_n_squares(n):
    '''return the first n squares as a list'''
    squares = []
    for x in range(n):
        squares.append(square(x))
    return squares
first_n_squares(10)
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
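The append-in-a-loop pattern is common enough that Python offers a compact form for it, the list comprehension. The sketch below redefines `square` so that it is self-contained; `even_squares` is a hypothetical extra to show the optional filter clause.

```python
from __future__ import print_function, division

def square(x):
    '''square supplied number'''
    return x**2

def first_n_squares(n):
    '''return the first n squares as a list, using a list comprehension'''
    return [square(x) for x in range(n)]

# An optional "if" clause filters the items that are processed.
even_squares = [square(x) for x in range(10) if x % 2 == 0]

print(first_n_squares(10))
print(even_squares)
```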
# typical loop constructs
for color in L:
    print(color)
# if a count is needed while looping
print('------')
for i, color in enumerate(L):
    print(i, color)
# if you have two matching sequences you want to iterate through
# in parallel
print('------')
for color, n in zip(L, range(5,5+len(L))):
    print(color, n)
lime green
orange
brown
blue
------
0 lime green
1 orange
2 brown
3 blue
------
lime green 5
orange 6
brown 7
blue 8
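The same loop style extends to dictionaries. As a sketch, the dictionary `D` from earlier is repeated here so the example is self-contained; sorting the items is a choice made only to get a predictable order.

```python
from __future__ import print_function, division

D = {'M1': ('5:34.5', '22:01'),
     'M51': ('13:29.9', '47:12'),
     'M27': ('19:59.6', '22:43')}

rows = []
# items() yields (key, value) pairs; the value tuple is unpacked in place.
for name, (ra, dec) in sorted(D.items()):
    rows.append("{} {} {}".format(name, ra, dec))

for row in rows:
    print(row)
```

Note that the keys sort as strings, so 'M27' comes before 'M51' because '2' sorts before '5', not because 27 is less than 51.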
# 0, 0., empty strings, lists, tuples, dictionaries,
# and a special value "None" all are treated as false.
# The official true and false values are "True" and "False" (not strings)
if 0:
    print('Hi')
if True:
    print('True!')
True!
if None:
    print('None case')
elif []:
    print('empty list case')
elif [0]:
    print('nonempty list case')
else:
    print('nothing was true')
nonempty list case
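The truth-value rules can be checked directly with `bool()`. The sketch below also shows a common pitfall: when 0 or an empty string is a legitimate value, test for `None` explicitly with `is None` rather than relying on truthiness. The helper names are hypothetical.

```python
from __future__ import print_function, division

falsy = [0, 0.0, "", [], (), {}, None, False]
truthy = [1, -1.5, " ", "0", [0], (None,), {"a": 0}, True]

print([bool(v) for v in falsy])
print([bool(v) for v in truthy])

def describe(seq):
    '''idiomatic emptiness test: no need to compare len(seq) with 0'''
    if seq:
        return "{} item(s)".format(len(seq))
    return "empty"

def is_missing(value):
    '''True only for None; 0 and "" count as real values'''
    return value is None

print(describe([]), describe([0]), is_missing(0), is_missing(None))
```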
import numpy as np
x = np.arange(20)
print(x)
[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19]
# changing dimensionality
x.shape = (4, 5) # must have same number of elements
print(x)
[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]
f = x.astype(np.float32) # how to convert to a new type
f
array([[  0.,   1.,   2.,   3.,   4.],
       [  5.,   6.,   7.,   8.,   9.],
       [ 10.,  11.,  12.,  13.,  14.],
       [ 15.,  16.,  17.,  18.,  19.]], dtype=float32)
f**2 + np.sin(f) # arrays work in mathematical expressions
array([[   0.        ,    1.84147096,    4.90929747,    9.14111996,   15.24319744],
       [  24.04107666,   35.72058487,   49.65698624,   64.98935699,   81.412117  ],
       [  99.45597839,  120.00000763,  143.46342468,  169.42016602,  196.99060059],
       [ 225.65028381,  255.71209717,  288.03860474,  323.24902344,  361.14987183]], dtype=float32)
fs = np.sin(f)
fs
array([[ 0.        ,  0.84147096,  0.90929741,  0.14112   , -0.7568025 ],
       [-0.95892429, -0.27941549,  0.65698659,  0.98935825,  0.41211849],
       [-0.54402113, -0.99999022, -0.53657293,  0.42016703,  0.99060738],
       [ 0.65028787, -0.28790331, -0.96139747, -0.75098723,  0.14987721]], dtype=float32)
# Indexing issues: 1-d works like lists
# Numpy supports multiple dimension indexing
# Order of indices is OPPOSITE of what IDL and Fortran users expect
# Most rapidly varying index in memory is the last one, not first!
print(fs[0,1]) # First row, second column
0.841471
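For readers coming from IDL or Fortran, a small sketch may help fix the index order in mind. The array `image` is a hypothetical 3-row by 4-column example; think of the first index as y (row) and the last as x (column).

```python
from __future__ import print_function, division
import numpy as np

image = np.arange(12).reshape(3, 4)   # shape is (rows, columns) = (ny, nx)

value = image[1, 2]     # row 1, column 2
row = image[1]          # the whole of row 1
column = image[:, 2]    # the whole of column 2

# Memory layout: moving one step along the LAST index moves one element
# in memory; moving along the first index skips a whole row.
strides_in_items = tuple(s // image.itemsize for s in image.strides)

print(image.shape, value, row, column, strides_in_items)
```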
# slices are views, not copies!
fsview = fs[1:3,0:3]
print(fsview)
fsview[0,0] = 100
print(fs)
[[-0.95892429 -0.27941549  0.65698659]
 [-0.54402113 -0.99999022 -0.53657293]]
[[   0.            0.84147096    0.90929741    0.14112      -0.7568025 ]
 [ 100.           -0.27941549    0.65698659    0.98935825    0.41211849]
 [  -0.54402113   -0.99999022   -0.53657293    0.42016703    0.99060738]
 [   0.65028787   -0.28790331   -0.96139747   -0.75098723    0.14987721]]
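When an independent array is needed, an explicit `.copy()` breaks the link to the original. The sketch below uses a hypothetical `data` array and `np.shares_memory` (available in reasonably recent numpy versions) to check which arrays share storage.

```python
from __future__ import print_function, division
import numpy as np

data = np.arange(12, dtype=np.float32).reshape(3, 4)  # hypothetical data

view = data[1:3, 0:2]                 # basic slice: a view onto data
independent = data[1:3, 0:2].copy()   # explicit copy: separate storage

view[0, 0] = 100        # writes through to data[1, 0]
independent[0, 1] = -1  # leaves data untouched

view_shares = np.shares_memory(view, data)
copy_shares = np.shares_memory(independent, data)

print(data)
print(view_shares, copy_shares)
```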
# using mask arrays to index
fs[fs>0]
array([ 0.84147096, 0.90929741, 0.14112 , 100. , 0.65698659, 0.98935825, 0.41211849, 0.42016703, 0.99060738, 0.65028787, 0.14987721], dtype=float32)
# using index arrays
nz = np.where(fs > 0)
print(nz)
fs[nz]
(array([0, 0, 0, 1, 1, 1, 1, 2, 2, 3, 3]), array([1, 2, 3, 0, 2, 3, 4, 3, 4, 0, 4]))
array([ 0.84147096, 0.90929741, 0.14112 , 100. , 0.65698659, 0.98935825, 0.41211849, 0.42016703, 0.99060738, 0.65028787, 0.14987721], dtype=float32)
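Mask arrays and `np.where` can also be used to modify an array, not just to read from it. A minimal sketch with a hypothetical `values` array:

```python
from __future__ import print_function, division
import numpy as np

values = np.array([[-1.0, 2.0, -3.0],
                   [4.0, -5.0, 6.0]])   # hypothetical data

mask = values > 0                 # boolean array with the same shape
positives = values[mask]          # 1-d array of the selected elements
count = int(mask.sum())           # True counts as 1

rows, cols = np.where(mask)       # one index array per dimension

# Three-argument form: take values where mask is True, 0.0 elsewhere.
clipped = np.where(mask, values, 0.0)

# Assignment through a mask changes the array in place.
values[~mask] = 0.0

print(positives, count, rows, cols)
print(clipped)
```

Unlike slices, indexing with a mask or with index arrays returns a copy, so changing `positives` afterwards would not alter `values`.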
from matplotlib import pyplot as plt # yet another import variant...
plt.ion()
plt.plot(np.sin(np.arange(100)/5))
[<matplotlib.lines.Line2D at 0x107ccd490>]
What mode are you using matplotlib in?
Non-interactive: call plt.show() when all done.
Interactive: call plt.ion() so all matplotlib commands render immediately.
What backend are you using?
Image orientation convention
Two kinds of image display:
Three very important links:
The latter is very useful for finding something visually that matches what you want to do. Click on the gallery example to see the source code that generated it.
The IPython notebook provides a number of advantages:
There are some drawbacks
How to start:
ipython notebook: starts a basic notebook in the browser
To enable inline plots, type as the first command: %matplotlib inline
Markdown cheatsheet: https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet#code
Code cells are editable. If you make a mistake, just put the cursor in the cell and edit it. When you are done making changes, just re-execute. Rather than a long history of failed attempts, as you would get in traditional logging of an interactive session, you just keep changing one thing until you get it right.
Note the menus and toolbar at the top of the browser page. These allow quite a few actions to be performed. This tutorial won't go into exhaustive detail on each choice, but will highlight some key features. You can always explore the choices on your own, and there is a help menu item for the IPython notebook and other Python tools. If you hover over toolbar widgets, a tooltip will appear. Finally, many of these have keyboard shortcuts; you can see these under the help menu (most of the keyboard shortcuts are two-character combinations that start with a common prefix key, which avoids conflicts with other programs).
Click on the text to the right of "Notebook" at the top of the screen to rename your Notebook to something more informative.
An IPython Notebook is a document much like any other. If you don't save, your changes can be lost. To save your notebook, just select "save" from the "File" dropdown at the top of the browser window or click the disk icon. The file will appear in the directory where you ran ipython notebook, with the title as the name of the file and .ipynb as its extension.
In the "Edit" menu and in the toolbar are options for editing actions like delete, copy, cut, paste, merge, split, etc.
IPython will automatically insert new cells at the end of the Notebook. In the menus and toolbars are buttons for inserting cells above and below the cursor.
Notebook cells can be switched between different input types, including cells expecting Markdown (like this cell), cells expecting Python code, and cells that take raw text and do nothing to it. Cells can be converted between these types using the "Cell" menu or the toolbar dropdown.
There are keyboard shortcuts for most of the actions described above. To see all of the available shortcuts click the "Keyboard Shortcuts" item in the "Help" menu (or type
A Notebook can be executed linearly from top to bottom by selecting "Run All" from the "Cell" menu. This can be useful when resuming work in an existing Notebook.
You can put nicely rendered equations into Markdown cells. Use $...$ for inline equations and $$...$$ for block equations. For example, the source $E = mc^2$
is rendered as $E = mc^2$. The source $$f(x) = \int_0^x \sin(\theta) \, d\theta$$
is rendered as:
$$f(x) = \int_0^x \sin(\theta) \, d\theta$$
Some of these are taken from fperez/org/py4science/starter_kit.html (which is dated and has broken links)
Astropy:
Book on learning how to do Interactive Data Analysis in Python by Greenfield and Jedrzejewski. A bit dated, but mostly still valid. Assumes no Python knowledge and introduces Python as more and more data analysis is introduced. IMHO, the chapter on Object-Oriented programming is more relevant for astronomers than almost anything else out there, but I'm biased.
Book (pdf): http://stsdas.stsci.edu/perry/pydatatut.pdf
Full set of data (114 MB): http://stsdas.stsci.edu/perry/full.tar.gz
Partial set of data (3.2 MB, missing large ACS file only) http://stsdas.stsci.edu/perry/partial.tar.gz
Lectures (video available) and course with exercises at STScI, early 2015: https://github.com/spacetelescope/scientific-python-training-2015 Follow the link to videos under the Course Material heading. More oriented toward astronomy tasks than the book above. The approach to some issues is dated; in some cases Astropy has better solutions now.
Astrobetter/Python: http://www.astrobetter.com/wiki/python (some other links...)
Practical Python for Astronomers: https://python4astronomers.github.io/
numpy:
scipy
matplotlib
https://pythonconquerstheuniverse.wordpress.com/2009/09/10/debugging-in-python/
Most popular IDEs (not free, but not that expensive)