
PYTHON/ANACONDA TUTORIAL

Getting Started With Python and Anaconda

Introduction
Anaconda
Downloading Anaconda
Installation on Linux
Installation on Windows
Installation on Mac
Spyder
Getting Started: Variables and Arrays
Matrix Operations
Importing Data
IPython
IPython Notebooks
Functions and Scripts
References

1. Introduction

Python is a very popular general-purpose language that offers the modern and classic programming constructs software developers expect. This is one advantage of Python over MATLAB; another is that it is not proprietary, and various open-source Python distributions are freely and publicly available.

However, the very fact that Python is a general-purpose language rather than software dedicated to scientific computing can also be considered a drawback. To address this, several scientific computing packages (i.e. sets of functions, classes, and so on) have been developed and released for it. These packages contain a large variety of functions that solve everyday computational problems for researchers in many fields of engineering and science.

The remaining problem is to find, install, maintain, and update all these packages while keeping them consistent with one another and with Python itself. This is where Anaconda comes in.

2. Anaconda

Anaconda is a completely free Python distribution (i.e. set of packages) for scientific purposes, as stated on its website. It contains more than 125 Python packages for science, mathematics, engineering and data analysis.

Installing Anaconda not only gives you a Python system that is ready to use out of the box, together with a fully featured IDE (Integrated Development Environment), but also relieves you of the burden of manually installing packages and managing the dependency and consistency requirements between them.

3. Downloading Anaconda

To download Anaconda, go to http://continuum.io/downloads and download the file compatible with your system. The download page looks like this:

It may ask for your e-mail as well. Please note that the Python version Anaconda uses is Python 2.7.

4. Installation on Linux

Installing Anaconda is simple. On Linux-based systems, all you need to do is run the following command:

bash Anaconda-1.x.x-Linux-x86[_64].sh

No root access is required. However, you need to add the Python executables to your PATH environment variable manually if you want to run them from any folder. This can be done by adding the following line to your ~/.bashrc file (it assumes Anaconda was installed in the folder ~/anaconda; adjust the path otherwise):

export PATH=~/anaconda/bin:$PATH
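One way to check that the PATH setting took effect is to ask Python which interpreter is actually running. The following is a minimal sketch; the test for the word "anaconda" in the path is only a heuristic and assumes the install folder name contains it, which may not hold on your system.

```python
import sys

# Full path of the interpreter that is currently running.
interpreter_path = sys.executable
print(interpreter_path)

# Heuristic check only: assumes the install folder name contains "anaconda".
uses_anaconda = "anaconda" in interpreter_path.lower()
print(uses_anaconda)
```

If the printed path points to a system-wide Python instead, the PATH line has probably not been picked up yet; opening a new terminal usually reloads ~/.bashrc.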

5. Installation on Windows

Installing Anaconda on Windows should be easy, and Anaconda is added to your PATH automatically. If you run into problems, temporarily disabling your anti-virus software may help.

6. Installation on Mac

On Mac OS X, all you need to do is run the graphical installer. Anaconda is added to your PATH automatically. In some cases an error message may appear during installation; this is not a problem. Simply click "Install for me only" and continue.

On older versions of OS X, the following error may be generated when launching the IPython notebook (see section 12):

ValueError: unknown locale: UTF-8

In that case, run the command locale in the terminal and inspect the value of the environment variable LC_CTYPE. It is probably just UTF-8.

Now open the file ~/.profile and add the following line:

export LC_CTYPE='en_US.UTF-8'

Then close the terminal session and try again. You might need to replace en_US with a different value matching your system configuration.

7. Spyder

Spyder is a popular and very handy GUI for Python that is included in Anaconda by default. It is very similar to MATLAB's GUI, so if you have already worked with MATLAB, you will not get lost in Spyder. You can start Spyder with the following command in your command line:

spyder

When you run the command above, the following screen appears.

The Console window at the bottom right is where you type your commands and view their results. The Object inspector window at the top right shows the help available for the functions you type in the console. In the top bar of the application you can see and change your working directory. This is the default location in which Spyder expects to find the files you read, the scripts you run, and so on.

Finally, the Editor window on the left is where you create and edit your own functions. We will talk about functions later in this tutorial.

8. Getting Started: Variables and Arrays

Variables are where you store your values and the results of your operations. Unlike in some other programming languages, in Python there is no need to declare a variable or its type before using it. The most important variable type you will be working with is the array type. Please note the very important fact that in Python, array indices start from zero. This is in contrast to some other systems (most notably MATLAB), in which arrays and matrices are indexed starting from one.

Initializing arrays is simple and can be done using the following commands, for example:

In [1]:

v1=[1,2,3,4]
v2=[4,2,7,4]
v3=[v1,v2]

You may or may not put a semicolon at the end of the commands you type.

The above commands create three arrays. The first two are four-element numerical arrays containing different numbers. The third one, v3, is known as an array of arrays, a multi-dimensional (here two-dimensional) array, or simply a matrix. v3 is a two-element array, each element of which is itself a four-element array. Hence it is a $2\times 4$ matrix. Strictly speaking, the bracket notation creates Python lists; the numerical functions used below accept them and convert them to arrays automatically. Numbers written without a decimal point are integers, whereas a number written with a fractional part (such as $1.0$) is a floating-point number.

Please note that hereafter, we use the terms "matrix" and "array" interchangeably since in Python and in the programming language community in general, a matrix is simply an array of arrays or a multi-dimensional array.
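The zero-based indexing mentioned above can be illustrated with the same three variables. This is a small sketch using only built-in list indexing; print is written with parentheses so that the snippet also runs on Python 3.

```python
v1 = [1, 2, 3, 4]
v2 = [4, 2, 7, 4]
v3 = [v1, v2]

first = v1[0]        # first element of v1 (index 0, not 1)
last = v1[3]         # fourth and last element of v1
entry = v3[1][2]     # second row, third column of v3

print(first)   # 1
print(last)    # 4
print(entry)   # 7
```

Asking for v1[4] would raise an IndexError, because the valid indices of a four-element array are 0 to 3.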

To see the value of an array (or any variable in general), we can either simply write its name in the console:

In [2]:

v2

Out[2]:

[4, 2, 7, 4]

or use the more elegant and more flexible print command:

In [3]:

print v2

[4, 2, 7, 4]

There are useful commands which facilitate creating and initializing matrices. Two of them are the functions zeros and ones, which create arrays of arbitrary shape with all values initialized to $0$ and $1$, respectively. Consider the following commands for example:

In [4]:

%pylab inline
z1=zeros(2)
o1=ones((2,2))

Populating the interactive namespace from numpy and matplotlib

The first line, %pylab inline, loads the numpy and matplotlib functions into the interactive session. The second line creates a two-element array z1 with all elements initialized to $0$, whereas the third line creates a $2\times 2$ matrix o1 with all elements initialized to $1$.

Another useful group of functions queries the structure or shape of an array, i.e. its number of elements and dimensions. The first one is shape, which returns the number of elements in each dimension of an array (like MATLAB's size()).
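Outside a %pylab session, for example in a standalone script, the same functions have to be imported explicitly. The sketch below assumes the common convention of importing numpy under the name np.

```python
import numpy as np

z1 = np.zeros(2)        # two-element array of zeros
o1 = np.ones((2, 2))    # 2x2 matrix of ones; the shape is passed as a tuple

print(z1)
print(o1)
print(z1.dtype)         # float64: zeros and ones produce floating-point values by default
```

Note that a multi-dimensional shape is passed as a single tuple, which explains the double parentheses in ones((2, 2)).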

An example is the following line of code (assuming you have already executed the above commands in your current Anaconda session),

In [5]:

shape(v3)

Out[5]:

(2L, 4L)

which outputs $(2L,4L)$, meaning that v3 is a $2\times 4$ matrix. The letter $L$ in the output stands for "long", indicating that the dimensions are reported as long integer values.

The second one is size, which outputs the number of elements in the whole array/matrix or along a specific dimension of it. (Despite its name, it is not the counterpart of MATLAB's size(), which corresponds to shape here; it is closer to MATLAB's length() or numel(). Take care not to confuse them.) Consider executing the following lines of code (again assuming you have already executed the above commands in your current session):

In [6]:

print size(v3,0)
print size(v3,1)

2
4

You will see the outputs $2$ and $4$, respectively. These are the numbers of elements in each column and in each row of v3; equivalently, they are the numbers of rows and columns of v3.
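The relationship between shape and size can be summarized in one short, self-contained sketch. It assumes numpy is imported as np instead of relying on the %pylab namespace; on Python 3 the L suffix no longer appears in the output.

```python
import numpy as np

v3 = [[1, 2, 3, 4], [4, 2, 7, 4]]

print(np.shape(v3))     # (2, 4)
print(np.size(v3))      # 8, total number of elements
print(np.size(v3, 0))   # 2, number of rows
print(np.size(v3, 1))   # 4, number of columns
print(np.ndim(v3))      # 2, number of dimensions
```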

Please note that the following command leads to an error, since v3 has only two dimensions.

In [7]:

size(v3,2)

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-7-0f189a214d88> in <module>()
----> 1 size(v3,2)

E:\Programs\Anaconda\lib\site-packages\numpy\core\fromnumeric.pyc in size(a, axis)
   2536             return a.shape[axis]
   2537         except AttributeError:
-> 2538             return asarray(a).shape[axis]
   2539 
   2540 

IndexError: tuple index out of range
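An error like the one above can be avoided by checking the number of dimensions before asking for the size along an axis. The helper safe_size below is hypothetical, written only for illustration; it is not part of numpy.

```python
import numpy as np

def safe_size(a, axis):
    """Return the length of `a` along `axis`, or None if the axis does not exist.

    Hypothetical helper for illustration; it is not part of NumPy.
    """
    if 0 <= axis < np.ndim(a):
        return np.size(a, axis)
    return None

v3 = [[1, 2, 3, 4], [4, 2, 7, 4]]
print(safe_size(v3, 1))   # 4
print(safe_size(v3, 2))   # None
```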

9. Matrix Operations

Many functions are provided to manipulate an array. Some useful ones are explained here.

To flatten a multi-dimensional array into a one-dimensional vector, we use a combination of the commands flatten and list. For example, the following line of code flattens the array v3 defined above:

In [8]:

v4=list(flatten(v3))

We can see the output by typing "v4" in the Console window and viewing the output:

In [9]:

v4

Out[9]:

[1, 2, 3, 4, 4, 2, 7, 4]
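The same result can be obtained with numpy alone, which may be convenient in scripts that do not use the %pylab namespace. This sketch swaps in numpy's ravel function as an alternative to the flatten/list combination shown above.

```python
import numpy as np

v3 = [[1, 2, 3, 4], [4, 2, 7, 4]]

# ravel() returns the elements row by row as a one-dimensional array;
# tolist() converts that array back to a plain Python list.
v4 = np.ravel(v3).tolist()
print(v4)   # [1, 2, 3, 4, 4, 2, 7, 4]
```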

The function transpose transposes v3, similar to MATLAB's transpose operator (single-quotation). This is shown below:

In [10]:

v5=transpose(v3)
print v5

[[1 4]
 [2 2]
 [3 7]
 [4 4]]

The matrix multiplication and element-wise multiplication are performed using the commands dot and multiply, respectively:

In [11]:

v6=dot(v3,v5)
v7=multiply(v3,v3)
print v6
print v7

[[30 45]
 [45 85]]
[[ 1  4  9 16]
 [16  4 49 16]]
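The difference between the two kinds of multiplication is easiest to see from the shapes of the results. The following sketch repeats the computation with explicit numpy imports (an assumption about your setup; inside a %pylab session the np. prefix is optional).

```python
import numpy as np

v3 = np.array([[1, 2, 3, 4], [4, 2, 7, 4]])
v5 = np.transpose(v3)          # 4x2 matrix

v6 = np.dot(v3, v5)            # (2x4) times (4x2) gives a 2x2 matrix
v7 = np.multiply(v3, v3)       # element-wise product, still 2x4

print(v6.shape)   # (2, 2)
print(v7.shape)   # (2, 4)

# The array operator * is element-wise, so it agrees with multiply, not dot.
print(np.array_equal(v3 * v3, v7))   # True
```

This is a common source of confusion for MATLAB users, since in MATLAB the * operator denotes matrix multiplication.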

Please note that the "+" operator has a completely different meaning for the arrays defined above (which are Python lists) than for scalar numbers. The output of the following line of code, for example, is the concatenation of the two vectors v1 and v2 rather than their element-wise sum, as one might guess.

In [12]:

v1+v2

Out[12]:

[1, 2, 3, 4, 4, 2, 7, 4]

For element-wise summation, we should use the add function. A few other useful functions are summarized in the table below. Please note that in Python 2.7 the division operator applied to integer values (the default numeric type) performs integer division and discards the fractional part. That means 2/4 outputs $0$, for example, whereas 2.0/4 outputs $0.5$.

Function    Operation
add         Addition
subtract    Subtraction
dot         Matrix multiplication
multiply    Element-wise multiplication
divide      Element-wise division
np.power    Element-wise power
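The two pitfalls discussed in this section, list concatenation with "+" and integer division, can be reproduced in a few lines. The sketch below uses the // operator, which performs floor division in both Python 2 and Python 3, so the output does not depend on the Python version.

```python
import numpy as np

v1 = [1, 2, 3, 4]
v2 = [4, 2, 7, 4]

concatenated = v1 + v2             # list concatenation
summed = np.add(v1, v2).tolist()   # element-wise addition

print(concatenated)   # [1, 2, 3, 4, 4, 2, 7, 4]
print(summed)         # [5, 4, 10, 8]

# // is floor division in both Python 2 and Python 3.
print(2 // 4)     # 0
print(2.0 / 4)    # 0.5
print(2 % 4)      # 2, the modulo (remainder) operator gives a different result
```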

10. Importing Data

Various commands provide means to import data in different formats and various modalities. A few of them are explained here.

To read the data stored in an ASCII text file, the function loadtxt is used, as in the following piece of code:

In [14]:

v11=loadtxt('data.txt')
print v11

[[ 1.  3.  5.]
 [ 2.  4.  6.]]

This reads the data in the file data.txt into the matrix v11 and prints the matrix. Each line of the text file is stored in a row of v11.

Values in a line are assumed to be separated by whitespace by default. To change this, for example in the case of comma-separated CSV files, we use the following form:

In [15]:

v12=loadtxt('data.csv',delimiter=',')
print v12

[[ 0.  3.  6.]
 [ 1.  5.  8.]]
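To try loadtxt without creating files on disk, the file contents can be simulated in memory. In the sketch below, the two StringIO objects are stand-ins for data.txt and data.csv; their contents are assumed from the printed output above.

```python
import io
import numpy as np

# In-memory stand-ins for data.txt and data.csv (contents assumed for illustration).
text_file = io.StringIO(u"1 3 5\n2 4 6\n")
csv_file = io.StringIO(u"0,3,6\n1,5,8\n")

v11 = np.loadtxt(text_file)                  # whitespace-separated by default
v12 = np.loadtxt(csv_file, delimiter=",")    # comma-separated

print(v11.shape)   # (2, 3)
print(v12[1, 2])   # 8.0
```

The values are read as floating-point numbers by default, which is why the output shows 8.0 rather than 8.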

We can also import data from MATLAB's .mat files. This can be done using the following lines of code:

In [16]:

import scipy.io as sio
sio.loadmat('matlab.mat')

Out[16]:

{'M': array([[-1.79467884, -0.19412354, -1.20784549, -2.0518163 , -0.29906603,
         0.96422942, -0.58902903,  0.79141606,  0.86202161, -0.06786555],
       [ 0.84037553, -2.13835527,  2.90800803, -0.35385   ,  0.02288979,
         0.5200601 , -0.2937536 , -1.33200442, -1.36169447, -0.1952212 ],
       [-0.88803208, -0.83958875,  0.82521889, -0.82358653, -0.26199543,
        -0.02002785, -0.84792624, -2.32986716,  0.45502956, -0.21760635],
       [ 0.10009283,  1.35459433,  1.37897198, -1.57705702, -1.75021237,
        -0.03477109, -1.1201283 , -1.44909729, -0.84870938, -0.30310762],
       [-0.54452893, -1.07215529, -1.05818026,  0.50797465, -0.28565097,
        -0.79816358,  2.52599969,  0.33351083, -0.33488694,  0.02304562],
       [ 0.30352079,  0.96095387, -0.46861558,  0.28198406, -0.83136651,
         1.01868528,  1.65549759,  0.3913536 ,  0.55278335,  0.05129036],
       [-0.60032656,  0.1240498 , -0.27246941,  0.03347988, -0.97920631,
        -0.13321748,  0.30753516,  0.45167942,  1.03909065,  0.82606279],
       [ 0.48996532,  1.43669662,  1.09842462, -1.33367794, -1.15640166,
        -0.71453016, -1.25711836, -0.13028465, -1.11763868,  1.52697669],
       [ 0.73936312, -1.9609    , -0.27787193,  1.12749228, -0.53355711,
         1.35138577, -0.86546803,  0.1836891 ,  1.26065871,  0.46691444],
       [ 1.71188778, -0.19769823,  0.70154146,  0.35017941, -2.00263574,
        -0.22477106, -0.17653411, -0.47615302,  0.66014314, -0.20971334]]),
 'N': array([[ 0.81472369,  0.15761308,  0.6557407 ,  0.70604609,  0.43874436,
         0.27602508,  0.75126706,  0.84071726,  0.35165951,  0.07585429],
       [ 0.90579194,  0.97059278,  0.03571168,  0.03183285,  0.38155846,
         0.67970268,  0.25509512,  0.25428218,  0.83082863,  0.05395012],
       [ 0.12698682,  0.95716695,  0.84912931,  0.27692298,  0.76551679,
         0.655098  ,  0.50595705,  0.81428483,  0.58526409,  0.53079755],
       [ 0.91337586,  0.48537565,  0.93399325,  0.04617139,  0.7951999 ,
         0.16261174,  0.69907672,  0.24352497,  0.54972361,  0.77916723],
       [ 0.63235925,  0.80028047,  0.67873515,  0.09713178,  0.1868726 ,
         0.11899768,  0.89090325,  0.92926362,  0.91719366,  0.93401068],
       [ 0.0975404 ,  0.14188634,  0.75774013,  0.82345783,  0.4897644 ,
         0.49836405,  0.95929143,  0.34998377,  0.28583902,  0.12990621],
       [ 0.27849822,  0.42176128,  0.74313247,  0.69482862,  0.4455862 ,
         0.95974396,  0.54721553,  0.19659525,  0.75720023,  0.56882366],
       [ 0.54688152,  0.91573553,  0.39222702,  0.31709948,  0.64631301,
         0.34038573,  0.13862444,  0.25108386,  0.75372909,  0.46939064],
       [ 0.95750684,  0.79220733,  0.65547789,  0.95022205,  0.70936483,
         0.58526775,  0.14929401,  0.61604468,  0.38044585,  0.01190207],
       [ 0.96488854,  0.95949243,  0.17118669,  0.03444608,  0.75468668,
         0.22381194,  0.25750825,  0.47328885,  0.56782164,  0.33712264]]),
 '__globals__': [],
 '__header__': 'MATLAB 5.0 MAT-file, Platform: PCWIN64, Created on: Tue Apr 22 00:06:25 2014',
 '__version__': '1.0'}

The first line is required because the package containing the loadmat function is not loaded by default when Spyder is started, so we need to import it manually.

Finally, to load an image into the Spyder environment, the function imread is used:
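As the output above shows, loadmat returns a dictionary that maps MATLAB variable names to arrays. The following sketch shows how individual variables can be extracted from it. Since matlab.mat is not available here, a small stand-in file is first written to memory with savemat; the variable names M and N mirror the output above, but the values are made up.

```python
import io
import numpy as np
import scipy.io as sio

# Write a small stand-in .mat file to memory (the variable names M and N
# mirror the example output; the values here are made up).
buffer = io.BytesIO()
sio.savemat(buffer, {"M": np.eye(2), "N": np.arange(6).reshape(2, 3)})
buffer.seek(0)

contents = sio.loadmat(buffer)   # dictionary: variable name -> array
M = contents["M"]
N = contents["N"]

print(M.shape)   # (2, 2)
print(N.shape)   # (2, 3)

# Keys that start with a double underscore hold file metadata, not data.
variable_names = sorted(k for k in contents if not k.startswith("__"))
print(variable_names)   # ['M', 'N']
```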

In [17]:

I=imread('David.bmp')
print shape(I)

(96L, 96L, 4L)

The command in the first line is used to read the image David.bmp into a matrix, here I. Supported formats may vary depending on the device and the operating system. The second line prints the size and the number of color channels of the image.

11. IPython

IPython is an interactive shell for Python which offers much more functionality than the default Python command line. Its features range from more direct operating system support (e.g. for manipulating data files or interacting with external processes) to a logged system state (i.e. storing the results of all operations performed so far). IPython is installed by Anaconda and can be used as an alternative to the default Spyder Console, although some of its features exist there too.

To run IPython you should type the following command in the command prompt of your operating system:

ipython

You will see a prompt reading In [1]. This means that it is your first command. When you run something, the result is shown in an output line starting with Out[1]. As you execute more commands, the number increases. You can always refer to the output of command number n through the variable named _n (where n is replaced by the command number; for example, _3 holds the output of the third command). This is a very useful feature of IPython.

Another feature of IPython is its more direct connection to the underlying operating system. When a command line is prefixed with the character "!", that line is passed to the operating system to be executed directly. You will find this especially useful when you want to run OS commands or execute external files from within Python. As an example, consider the following code:

!vlc test.avi

This will try to execute the vlc media player from the operating system. More information about IPython and its features can be found in this paper.
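The "!" prefix only works inside IPython. In an ordinary Python script, a comparable effect can be sketched with the standard subprocess module. The command below is an illustrative assumption: it simply starts a second Python interpreter, so that the example runs without any extra software such as vlc.

```python
import subprocess
import sys

# Run a child process and capture what it prints. The command here simply
# starts another Python interpreter, so it works without extra software.
command = [sys.executable, "-c", "print('hello from a child process')"]
output = subprocess.check_output(command).decode("utf-8").strip()

print(output)   # hello from a child process
```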

Recent versions of IPython have added a very interesting feature called IPython notebooks. We will talk about these in the next section.

12. IPython Notebooks

IPython notebooks are a very interesting feature added to recent versions of IPython. Notebooks are interactive documents that allow running Python code and reading (or writing) notes and documentation in the same place. Readers can therefore not only see the results they are reading about, but also produce different results by changing the documented code.

A notebook is stored as a JSON file that contains the text, the Python code, and the outputs of the document. When opened through the notebook web server, it allows interactive execution and editing of the code inside the document. It can also be exported and viewed as an ordinary, nicely formatted HTML page. The document you are currently reading is itself an IPython notebook.

The command to run the IPython Notebooks web server is the following:

ipython notebook

When you execute the command above, a new browser window is opened which shows the notebooks in the current folder. The IPython notebook files have the ".ipynb" extension.

There are a lot of notebooks available on the web which you can view and read. The GitHub repository available here contains many useful and interesting ones. The source code of these notebooks is also available through the GitHub version control system.

Another interesting source is the book "Python For Signal Processing", which is publicly available as a series of IPython notebooks at this address.

13. Functions and Scripts

You can define your own functions and scripts in Python as well. We start with scripts here.

You can create, edit, save and execute your scripts through the Editor panel on the left side of the Spyder window. A script can be any piece of code that you want to run together and more than once. For example, the following code draws 10000 random samples from the standard Gaussian distribution and plots a histogram of them with 100 bins. You can verify the statistical distribution of the samples by looking at the histogram.

In [18]:

x = randn(10000);
hist(x, 100);

To save the above code as a script, first create a new file using the menu item "File > New File", type your script in the file, and then save it under your preferred name. The default folder for saving scripts is the working folder we talked about earlier. You can then execute the script by clicking the green play button in the toolbar. This runs the currently open script.

Looking at the console, you will notice that clicking the play button is equivalent to running the following function in the console:

runfile('C:/Users/mrazavi/Documents/Python Scripts/untitled1.py', wdir=r'C:/Users/mrazavi/Documents/Python Scripts')

The first argument is the script file and the second one is the working directory.
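A script that is run outside a %pylab session cannot rely on names such as randn and hist being predefined. The following is a sketch of a standalone variant of the histogram script with explicit imports. It uses numpy's histogram function, which computes the bin counts without drawing a figure, so that the distribution can be checked numerically; the fixed seed is an assumption added to make runs repeatable.

```python
import numpy as np

np.random.seed(0)                  # fixed seed so repeated runs give the same samples
x = np.random.randn(10000)         # 10000 samples from the standard Gaussian

# 100 equal-width bins between the smallest and the largest sample.
counts, bin_edges = np.histogram(x, bins=100)

print(counts.sum())                # 10000, every sample falls in some bin
print(len(bin_edges))              # 101, one more edge than bins
print(abs(x.mean()) < 0.1)         # True
print(abs(x.std() - 1.0) < 0.1)    # True
```

The sample mean close to 0 and the standard deviation close to 1 are what one expects from the standard Gaussian distribution.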

You can also define your own functions in a script. A function definition starts with the def keyword. Consider the following code for example:

In [30]:

def celsius_to_fahrenheit(c_temp):
    return 9.0 / 5.0 * c_temp + 32
print celsius_to_fahrenheit(0)

32.0

The code above first defines a function which receives a temperature in degrees Celsius and returns its Fahrenheit equivalent. Then it prints the result of applying the function to the value $0$.

After you run the script containing a function for the first time, the function is added to the current Spyder session, meaning that you can later call it independently from the console. However, if you change the function, you should run the script again for the changes to take effect.
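Functions written with ordinary arithmetic also work on numpy arrays, where they are applied element by element. The sketch below reuses celsius_to_fahrenheit from above; the inverse function fahrenheit_to_celsius and the sample temperatures are additions made for illustration.

```python
import numpy as np

def celsius_to_fahrenheit(c_temp):
    return 9.0 / 5.0 * c_temp + 32

def fahrenheit_to_celsius(f_temp):
    # Inverse of the conversion above (added for illustration).
    return (f_temp - 32) * 5.0 / 9.0

temps_c = np.array([0.0, 37.0, 100.0])
temps_f = celsius_to_fahrenheit(temps_c)   # arithmetic is applied element-wise

print(temps_f)

# Converting back should recover the original values up to rounding error.
print(np.allclose(fahrenheit_to_celsius(temps_f), temps_c))   # True
```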

14. References

There is a lot of documentation available on the web, both about Anaconda and its packages and also about Python itself. A few of them are listed below.

http://docs.continuum.io/anaconda/install.html
http://wiki.scipy.org/NumPy_for_Matlab_Users
http://mathesaurus.sf.net/matlab-numpy.html
http://mathesaurus.sf.net/matlab-python-xref.pdf