When people talk about Python, they can mean a couple of things: the Python language itself, or the interpreter, the program that runs code written in that language.
While there is only a single definition of the Python language, maintained under the Python Software Foundation, there are many versions of the interpreter, written in different languages to run on different types of systems. (Curious fact: the "standard" Python interpreter, called CPython, is written in the C programming language, while the program that compiles C code has traditionally been written in C itself!)
Python can be run in two basic modes. In the first, we run a script written in Python by calling the Python interpreter from the command line:
$ python myprogram.py
For the second, we would type
$ python
at the command line to begin an interactive Python session. In this session, we can execute Python commands interactively (just like the shell!), get the results, and use the output. You will know you are in the Python shell when you see the prompt
>>>
instead of your normal shell prompt.
Note that the ability to run programs interactively is one of the things that separates languages like Python (and R and MATLAB) from compiled languages like C and Java. This is one of the major reasons we teach Python for data analysis.
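As a minimal sketch of the two modes, suppose myprogram.py contained the lines below. The file contents, the function name mean, and the numbers are all made up for illustration. Running the file from the command line executes everything top to bottom, while in an interactive session you could type the same lines one at a time and inspect each result as you go.

```python
# Hypothetical contents of myprogram.py (illustration only).
def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


if __name__ == "__main__":
    # This part runs only when the file is executed as a script,
    # e.g. with "python myprogram.py" at the command line.
    measurements = [1, 2, 3, 4]  # made-up numbers
    print(mean(measurements))
```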
If you read enough about Python, you will eventually see mention of differences between Python 2 and Python 3. Several years ago, the folks in charge of Python realized that they needed to make some serious changes in the language that would mean old software would no longer run, thus breaking what software engineers call "backward compatibility." The transition has taken several years, but as of now, nearly all major Python packages have been ported to Python 3, and ongoing development of Python 2 packages has been (or soon will be) discontinued.
For this reason, you should begin by learning Python 3. For most Python you see in the wild, the differences are fairly minor, and code written for Python 2 can easily be made to run under Python 3.
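To give a flavor of what "fairly minor" means, here is a short sketch of three well-known differences, written for Python 3. The variable names are just for illustration, and this is not a complete list of changes.

```python
# Python 3: print is a function, so the parentheses are required.
# (In Python 2 you would often see: print "Hello, world!")
print("Hello, world!")

# Python 3: / is true division; // is floor division.
half = 7 / 2      # 3.5 here; Python 2 would give 3 for two integers
floored = 7 // 2  # 3 in both versions

# Python 3: strings are Unicode text, and bytes are a separate type.
text = "lemur"
raw = text.encode("utf-8")

print(half, floored, type(text).__name__, type(raw).__name__)
```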
Python is a good language, and easier to learn than many. But learning to program isn't actually the hard part. The hard part is doing the actual analysis, which often means finding tools that make the analysis possible. Since most of us aren't professional-grade programmers, we use code written by others to do our science. And when that code isn't readily available, it makes our lives needlessly difficult.
All of which is to say we teach Python because Python has the tools for scientific computing. There are lots and lots and lots of libraries for Python, and in the last several years, these have coalesced around a bunch of key technologies, including
- NumPy, which defines a fast, efficient array that can be used for heavy number crunching
- SciPy, which includes functions for signal processing, special functions, and statistics
- Pandas, which defines a type of object called a data frame for organizing and manipulating data sets
- Matplotlib, bokeh, seaborn, and a host more for plotting
- statsmodels and patsy for defining and fitting statistical models like regressions
- Scikit-learn for machine learning
- Scikit-image and PIL for image processing
- Cython, PyPy, blaze, and Shed Skin for making your code run fast
- SymPy for computer algebra (think Mathematica)

Even more important, if you work with neuroscience data, there are tremendously good libraries available:
Sounds like a mess, I know, but you can visualize it a little like the figure below. There, arrows represent dependencies. For instance, arrows point from SciPy, Cython, and NumPy to Pandas because Pandas builds on all of these. Similarly, statsmodels builds on Pandas. As you can see, NumPy is at the heart of many of these advanced tools.
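If it helps to think of the dependency arrows as data, here is a small sketch that encodes only the arrows named in the paragraph above and asks Python for an order in which every package comes after the ones it builds on. The names builds_on and build_order are made up for illustration, and graphlib is assumed to be available (it is in the standard library from Python 3.9 on).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each key builds on the packages in its set.
# Only the arrows mentioned in the text are included here.
builds_on = {
    "Pandas": {"SciPy", "Cython", "NumPy"},
    "statsmodels": {"Pandas"},
}

# static_order() yields each package only after everything it builds on.
build_order = list(TopologicalSorter(builds_on).static_order())
print(build_order)
```

The exact order among NumPy, SciPy, and Cython may vary, but Pandas always comes after all three and statsmodels always comes last.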
Almost a decade ago, Fernando Perez began the IPython project to design a better version of the Python shell. As you'll recall, the shell is itself just another program. It gets input from the user, talks to the file system and operating system, and runs other commands and programs. You can view the Python shell as a similar type of program, except that, instead of talking to the operating system, it talks to the Python interpreter. IPython simply replaces the old Python shell with something much more flexible and powerful.
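To make the "a shell is just another program" idea concrete, here is a minimal sketch built on the standard library's code module. It hands lines of text to a Python interpreter one at a time and collects whatever gets printed, which is the core of what an interactive shell does. The function name run_session is made up for illustration, and this is not how IPython itself is implemented.

```python
import code
import io
from contextlib import redirect_stdout


def run_session(lines):
    """Feed each line to a Python interpreter and return what it printed."""
    interpreter = code.InteractiveInterpreter()
    captured = io.StringIO()
    with redirect_stdout(captured):
        for line in lines:
            # Like typing the line at the >>> prompt and pressing Enter.
            interpreter.runsource(line)
    return captured.getvalue()


output = run_session(["x = 2 + 3", "x * 10"])
print(output)
```

The assignment prints nothing, while the bare expression on the second line is echoed back, just as it would be at the prompt.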
More recently, IPython took a big step forward in releasing the IPython server and the IPython notebook, which is the system that generated this document. When you type
ipython notebook
at the command prompt, the IPython Notebook Server starts on your local machine, allowing you to interact with Python through your browser:
Even more exciting, though, the IPython server doesn't have to run on your machine. You can interact with an IPython server running notebooks backed by a powerful computing cluster at a remote location!
The IPython notebook has proven incredibly popular for a few reasons:
You'll also note those spiffy menus and buttons at the top of the browser window. These allow you to do things like run the entire notebook, stop the notebook (if your code is taking too long for some reason), create or delete cells, and change what type of cell you're working on. There are also handy keyboard shortcuts for many of these things.
For those who really want to dive in, I suggest the extended tutorial here, but let's go ahead and see what sorts of fun things are really very easy to do:
Markdown started out as a quick and dirty way to write HTML without all the clutter of tags. Write using something like normal text, and the program would convert it to a well-formatted web document. However, Markdown proved so successful that places like GitHub adopted and extended it, so that now there are at least half a dozen Markdown dialects out there, some of which add features like code highlighting.
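To see what "convert it to a web document" means in practice, here is a toy sketch that handles only pound-sign headings and dash bullets. The function name tiny_markdown_to_html is made up, and real Markdown converters handle far more syntax than this.

```python
import re


def tiny_markdown_to_html(text):
    """Convert '#' headings and '- ' bullets to HTML; other lines become <p>."""
    html = []
    in_list = False
    for line in text.splitlines():
        if line.startswith("- "):
            if not in_list:
                html.append("<ul>")  # open the list at its first item
                in_list = True
            html.append(f"<li>{line[2:]}</li>")
            continue
        if in_list:
            html.append("</ul>")  # any non-bullet line closes the list
            in_list = False
        heading = re.match(r"(#{1,6}) (.+)", line)
        if heading:
            level = len(heading.group(1))  # number of pound signs
            html.append(f"<h{level}>{heading.group(2)}</h{level}>")
        elif line.strip():
            html.append(f"<p>{line}</p>")
    if in_list:
        html.append("</ul>")
    return "\n".join(html)


print(tiny_markdown_to_html("## this is a subheading\n- one item\n- another item"))
```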
Markdown is very easy to learn. For instance:
# Pound signs indicate varying levels of header
## this is a subheading
### this is a sub-subheading
lists work like this
- one item
- another item
- etc.
or this:
1. first point
1. second point
1. notice these all start with 1? Markdown numbers automatically
And [web links](http://www.duke.edu) are easy to do, too.
Finally, you can get code highlighting like so:
~~~python
print("Hello, world!")
~~~
becomes
lists work like this
or this:
And web links are easy to do, too.
Finally, you can get code highlighting like so:
print("Hello, world!")
To have your cell converted to Markdown in the notebook, you can either select "Markdown" from the dropdown box of cell types or hit Escape (so that your cursor disappears) and hit m. This will change the selector up top as well.
For those of you who know LaTeX, IPython uses MathJax to render equations in the browser:
Here is one of Maxwell's Equations:
$$
\nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} = \mathbf{0}
$$
becomes
Here is one of Maxwell's Equations: $$ \nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} = \mathbf{0} $$
Remember all that bash we learned? You can execute shell commands through the notebook by starting your code cell with a ! (pronounced "bang"):
!ls
Basic Data Analysis.ipynb
Basic Programming in Python.ipynb
Introduction to Python and IPython.ipynb
Working with Array Data.ipynb
data
ecosystem.svg
ipython_communication.svg
ipython_local.svg
ipython_remote.svg
lemurfig.pdf
lemurs.py
lemurs.pyc
marmoset.jpg
summary_data.csv
!pwd
/Users/jmxp/code/DIBS_materials/python
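The bang syntax only works inside IPython. For comparison, here is a rough sketch of asking for the same thing from plain Python with the standard library's subprocess module. It assumes a Unix-like system where the pwd command exists, and the variable names are made up for illustration.

```python
import subprocess

# Roughly what "!pwd" asks for: run a command and capture what it prints.
# Assumes a Unix-like system where the pwd command is available.
result = subprocess.run(["pwd"], capture_output=True, text=True, check=True)
working_directory = result.stdout.strip()
print(working_directory)
```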
Finally (or sort of finally; there's much, much more), IPython has a set of extensions to the normal Python shell called magics. These come in especially handy for running scripts saved in .py files, debugging, and interfacing with R or Matlab.