Developed by Mark Bakker
"Portable, powerful, and a breeze to use", Python is a popular, open-source programming language used for both scripting applications and standalone programs (see "Learning Python" by Mark Lutz). Python can be used to do pretty much anything. For example, you can use Python as a calculator. Position your cursor in the code cell below and hit [shift][enter]. The output should be 12 (-:
6 * 2
12
Note that the extra spaces are added to make the code more readable.
2 * 3
works just as well as 2*3
. And it is considered good style. Use the extra spaces in all your Notebooks.
When you are programming, you want to store your values in variables:
a = 6
b = 2
a * b
12
Both a
and b
are now variables. Each variable has a type. In this case, they are both integers (whole numbers). To write the value of a variable to the screen, use the print
function (the last statement of a code cell is automatically printed to the screen if it is not stored in a variable, as was shown above). Note that multiplication of two integers results in an integer, but division of two integers results in a float (a number with decimal places).
print(a)
print(b)
print(a * b)
print(a / b)
6
2
12
3.0
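If you want to check the type of a result yourself, the built-in `type` function reports it. The short sketch below repeats the calculations above; the floor division operator `//` is not discussed in this Notebook and is included only as an aside for cases where an integer result is wanted.

```python
a = 6
b = 2

product = a * b    # integer times integer stays an integer
quotient = a / b   # regular division always gives a float
floored = a // b   # floor division of two integers gives an integer

print(type(product), product)
print(type(quotient), quotient)
print(type(floored), floored)
```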
You can add some text to the print
function by putting the text string between quotes (single or double quotes both work, as long as you use the same kind at the beginning and end) and separating the text string and the variable with a comma:
print('the value of a is', a)
the value of a is 6
A variable can be raised to a power by using **
(a hat ^
, as used in some other languages, doesn't work).
a ** b
36
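Be aware that the hat does not produce an error message when used with integers. In Python, `^` is the bitwise exclusive-or operator, so it quietly returns a different number. A small sketch to illustrate the difference:

```python
a = 6
b = 2

power = a ** b   # 6 raised to the power 2
xor = a ^ b      # bitwise exclusive-or of 0b110 and 0b010, not a power

print('a ** b gives', power)
print('a ^ b gives', xor)
```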
Compute the value of the polynomial $y=ax^2+bx+c$ at $x=-2$, $x=0$, and $x=2.1$ using $a=1$, $b=1$, $c=-6$ and print the results to the screen.
Division works as well
print('1/3 gives', 1 / 3)
1/3 gives 0.3333333333333333
(Note for Python 2 users (you should really change to Python 3!): 1/3
gives zero in Python 2, as the division of two integers returned an integer in Python 2). The above print statement looks pretty ugly with 16 values of 3 in a row. A better and more readable way to print both text and the value of a variable to the screen is to use what are called f-strings. f-strings allow you to insert the value of a variable anywhere in the text by surrounding it with braces {}
. The entire text string needs to be between quotes and be preceded by the letter f
a = 1
b = 3
c = a / b
print(f'{a} divided by {b} gives {c}')
1 divided by 3 gives 0.3333333333333333
The complete syntax between braces is {variable:width.precision}
. When width
and precision
are not specified, Python will use all digits and figure out the width for you. If you want a floating point number with 3 decimals, you specify the number of digits, 3
, followed by the letter f
for floating point (you can still let Python figure out the width by not specifying it). If you prefer exponent (scientific) notation, replace the f
by an e
. The text after the #
is a comment in the code. Any text on the line after the #
is ignored by Python.
print(f'{a} divided by {b} gives {c:.3f}') # three decimal places
print(f'{a} divided by {b} gives {c:10.3f}') # width 10 and three decimal places
print(f'{a} divided by {b} gives {c:.3e}') # three decimal places scientific notation
1 divided by 3 gives 0.333
1 divided by 3 gives      0.333
1 divided by 3 gives 3.333e-01
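The effect of the width is easiest to see when the formatted value is stored as a string and printed between visible markers. The sketch below does that for the three formats discussed above; the variable names and the `|` markers are chosen here for illustration only.

```python
c = 1 / 3

fixed = f'{c:.3f}'        # three decimals, width chosen by Python
padded = f'{c:10.3f}'     # same number, right-aligned in 10 characters
scientific = f'{c:.3e}'   # three decimals in exponent notation

# Surround each value with | to make the padding visible
print(f'|{fixed}|')
print(f'|{padded}|')
print(f'|{scientific}|')
```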
Compute the value of the polynomial $y=ax^2+bx+c$ at $x=-2$, $x=0$, and $x=2.1$ using $a=1$, $b=1$, $c=-6$ and print the results to the screen using f-strings and 2 decimal places.
Once you have created a variable in a Python session, it will remain in memory, so you can use it in other cells as well. For example, the variables a
and b
, which were defined two code cells above in this Notebook, still exist.
print(f'the value of a is: {a}')
print(f'the value of b is: {b}')
the value of a is: 1
the value of b is: 3
The user (in this case: you!) decides the order in which code blocks are executed. For example, In [6]
means that it is the sixth execution of a code block. If you change the same code block and run it again, it will get number 7. If you define the variable a
in code block 7, it will overwrite the value of a
defined in a previous code block.
Variable names may be as long as you like (you gotta do the typing though). Selecting descriptive names aids in understanding the code. Variable names cannot have spaces, nor can they start with a number. And variable names are case sensitive. So the variable myvariable
is not the same as the variable MyVariable
. The name of a variable may be anything you want, except for reserved words in the Python language. For example, it is not possible to create a variable for = 7
, as for
is a reserved word. You will learn many of the reserved words when we continue; they are colored bold green when you type them in the Notebook.
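If you are unsure whether a name is allowed, you can ask Python. The sketch below uses the standard `keyword` module and the string method `isidentifier`, neither of which is covered in this Notebook; the candidate names are made up for illustration.

```python
import keyword

for_reserved = keyword.iskeyword('for')           # 'for' is a reserved word
name_reserved = keyword.iskeyword('myvariable')   # an ordinary name is not

# isidentifier checks the spelling rules: no spaces, no leading number
with_space = 'my variable'.isidentifier()
leading_number = '2fast'.isidentifier()
trailing_number = 'fast2'.isidentifier()

print(for_reserved, name_reserved)
print(with_space, leading_number, trailing_number)
print(keyword.kwlist)   # the full list of reserved words
```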
Plotting is not part of standard Python, but a nice package exists to create pretty graphics (and ugly ones, if you want). A package is a library of functions for a specific set of tasks. There are many Python packages and we will use several of them. The graphics package we use is called matplotlib
. To be able to use the plotting functions in matplotlib
, we have to import it. We will learn several different ways of importing packages. For now, we import the plotting part of matplotlib
and call it plt
. Before we import matplotlib
, we tell the Jupyter Notebook to show any graphs inside this Notebook and not in a separate window using the %matplotlib inline
command (more on these commands later).
%matplotlib inline
import matplotlib.pyplot as plt
Packages only have to be imported once in a Python session. After the above import statement, any plotting function may be called from any code cell as plt.function
. For example
plt.plot([1, 2, 4, 2])
[<matplotlib.lines.Line2D at 0x1179cc5b0>]
Let's try to plot $y$ vs $x$ for $x$ going from $-4$ to $+4$ for the polynomial
$y=ax^2+bx+c$ with $a=1$, $b=1$, $c=-6$.
To do that, we need to evaluate $y$ at a bunch of points. A sequence of values of the same type is called an array (for example an array of integers or floats). Array functionality is available in the package numpy
. Let's import numpy
and call it np
, so that any function in the numpy
package may be called as np.function
.
import numpy as np
To create an array x
consisting of, for example, 5 equally spaced points between -4
and 4
, use the linspace
command
x = np.linspace(-4, 4, 5)
print(x)
[-4. -2.  0.  2.  4.]
In the above cell, x
is an array of 5 floats (-4.
is a float, -4
is an integer).
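A few properties of the array can be inspected directly. The following sketch checks the type of the values, the number of points, and the spacing between neighboring points; the attribute `dtype` and the indexing with square brackets are not introduced in this Notebook and are used here only to look inside the array.

```python
import numpy as np

x = np.linspace(-4, 4, 5)

number_of_points = len(x)
spacing = x[1] - x[0]   # distance between the first two points

print(x.dtype)          # floats, even though -4 and 4 were given as integers
print(number_of_points)
print(spacing)
print(x[0], x[-1])      # both end points are included
```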
If you type np.linspace
and then an opening parenthesis like:
np.linspace(
and then hit [shift-tab] a little help box pops up to explain the input arguments of the function. When you click on the + sign, you can scroll through all the documentation of the linspace
function. Click on the x sign to remove the help box. Let's plot $y$ using 100 $x$ values from
$-4$ to $+4$.
a = 1
b = 1
c = -6
x = np.linspace(-4, 4, 100)
y = a * x ** 2 + b * x + c # Compute y for all x values
plt.plot(x, y)
[<matplotlib.lines.Line2D at 0x117af4ca0>]
Note that one hundred y
values are computed in the simple line y = a * x ** 2 + b * x + c
. Python treats arrays in the same fashion as it treats regular variables when you perform mathematical operations. The math is simply applied to every value in the array (and it runs much faster than doing every calculation separately).
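To see what the single array statement replaces, the sketch below computes the same one hundred values a second time, one value at a time. It uses a for loop and a list, which this Notebook has not introduced, so treat it as an illustration only; the names `y_loop` and `xvalue` are made up here.

```python
import numpy as np

a = 1
b = 1
c = -6
x = np.linspace(-4, 4, 100)

# Array version: one statement computes all one hundred values
y = a * x ** 2 + b * x + c

# Equivalent explicit version, one value at a time
y_loop = []
for xvalue in x:
    y_loop.append(a * xvalue ** 2 + b * xvalue + c)

print(len(y), len(y_loop))
print(np.allclose(y, y_loop))   # both approaches give the same numbers
```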
You may wonder what the statement like [<matplotlib.lines.Line2D at 0x30990b0>]
is (the numbers above on your machine may look different). This is actually a handle to the line that is created with the last command in the code block (in this case plt.plot(x, y)
). Remember: the result of the last line in a code cell is printed to the screen, unless it is stored in a variable. You can tell the Notebook not to print this to the screen by putting a semicolon after the last command in the code block (so type plt.plot(x, y);
). We will learn later on that it may also be useful to store this handle in a variable.
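As a preview, here is a minimal sketch of storing the handle and using it to change the line afterwards. The methods `set_linewidth` and `set_color` exist on matplotlib lines, but are not discussed in this Notebook. The `matplotlib.use('Agg')` line is an assumption that makes the sketch run outside a Notebook; leave it out when working in a Notebook with %matplotlib inline.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: draw off-screen so this also runs as a script
import matplotlib.pyplot as plt

lines = plt.plot([1, 2, 4, 2])   # plt.plot returns a list of line handles
line = lines[0]                  # one curve was drawn, so the list has one item

line.set_linewidth(4)            # change the line after it was created
line.set_color('r')
print(line.get_linewidth())
```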
The plot
function can take many arguments. Looking at the help box of the plot
function, by typing plt.plot(
and then shift-tab, gives you a lot of help. Typing plt.plot?
gives a new scrollable subwindow at the bottom of the notebook, showing the documentation on plot
. Click the x in the upper right hand corner to close the subwindow again.
In short, plot
can be used with one argument as plot(y)
, which plots y
values along the vertical axis and enumerates the horizontal axis starting at 0. plot(x, y)
plots y
vs x
, and plot(x, y, formatstring)
plots y
vs x
using colors and markers defined in formatstring
, which can be a lot of things. It can be used to define the color, for example 'b'
for blue, 'r'
for red, and 'g'
for green. Or it can be used to define the line type, for example '-'
for a solid line, '--'
for a dashed line, and ':'
for a dotted line. Or you can define markers, for example 'o'
for circles and 's'
for squares. You can even combine them: 'r--'
gives a red dashed line, while 'go'
gives green circular markers.
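The combinations described above can be tried out with a few made-up data points. The sketch below stores the line handles only so that the resulting line type and marker can be printed; the data values and variable names are chosen for illustration. The `matplotlib.use('Agg')` line is an assumption for running outside a Notebook and can be left out inside one.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: draw off-screen so this also runs as a script
import matplotlib.pyplot as plt

x = [0, 1, 2, 3]
dashed = plt.plot(x, [0, 1, 4, 9], 'r--')[0]    # red dashed line
circles = plt.plot(x, [0, 1, 2, 3], 'go')[0]    # green circles, no connecting line
squares = plt.plot(x, [9, 4, 1, 0], 'bs:')[0]   # blue squares joined by a dotted line

print(dashed.get_linestyle(), dashed.get_marker())
print(circles.get_linestyle(), circles.get_marker())
print(squares.get_linestyle(), squares.get_marker())
```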
If that isn't enough, plot
takes a large number of keyword arguments. A keyword argument is an optional argument that may be added to a function. The syntax is function(keyword1=value1, keyword2=value2)
, etc. For example, to plot a line with width 6 (the default is 1), type
plt.plot([1, 2, 3], [2, 4, 3], linewidth=6);
Keyword arguments must come after regular arguments. plt.plot(linewidth=6, [1, 2, 3], [2, 4, 3])
gives an error.
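The rule about the order of arguments holds for every Python function, not just for plotting. The sketch below uses a hypothetical function `describe` (it is not part of matplotlib, and writing your own functions is not covered in this Notebook) with one keyword argument. The built-in `compile` function is used to show that the wrongly ordered call is rejected as a syntax error before any code runs.

```python
def describe(x, y, linewidth=1):
    """Hypothetical stand-in for a plotting function."""
    return f'{len(x)} points with linewidth {linewidth}'

default_text = describe([1, 2, 3], [2, 4, 3])             # keyword omitted, default used
wide_text = describe([1, 2, 3], [2, 4, 3], linewidth=6)   # keyword after regular arguments
print(default_text)
print(wide_text)

# A keyword argument placed before regular arguments is a syntax error
bad_call = 'describe(linewidth=6, [1, 2, 3], [2, 4, 3])'
try:
    compile(bad_call, '<example>', 'eval')
    rejected = False
except SyntaxError:
    rejected = True
print('rejected:', rejected)
```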
Names may be added along the axes with the xlabel
and ylabel
functions, e.g., plt.xlabel('this is the x-axis')
. Note that both functions take a string as argument. A title can be added to the figure with the plt.title
command. Multiple curves can be added to the same figure by giving multiple plotting commands in the same code cell. They are automatically added to the same figure.
Whenever you give a plotting statement in a code cell, a figure with a default size is automatically created, and all subsequent plotting statements in the code cell are added to the same figure. If you want a different size of the figure, you can create a figure first with the desired figure size using the plt.figure(figsize=(width, height))
syntax. Any subsequent plotting statement in the code cell is then added to the figure. You can even create a second figure (or third or fourth...).
plt.figure(figsize=(10, 3))
plt.plot([1, 2, 3], [2, 4, 3], linewidth=6)
plt.title('very wide figure')
plt.figure() # new figure of default size
plt.plot([1, 2, 3], [1, 3, 1], 'r')
plt.title('second figure');
Plot $y=(x+2)(x-1)(x-2)$ for $x$ going from $-3$ to $+3$ using a dashed red line. On the same figure, plot a blue circle for every point where $y$ equals zero. Set the size of the markers to 10 (you may need to read the help of plt.plot
to find out how to do that). Label the axes as 'x-axis' and 'y-axis'. Add the title 'First nice Python figure of Your Name', where you enter your own name.
As was already mentioned above, good coding style is important. It makes the code easier to read so that it is much easier to find errors and bugs. For example, consider the code below, which recreates the graph we produced earlier (with a wider line), but now there are no additional spaces inserted
a=1
b=1
c=-6
x=np.linspace(-4,4,100)
y=a*x**2+b*x+c#Compute y for all x values
plt.plot(x,y,linewidth=3)
[<matplotlib.lines.Line2D at 0x117eee3d0>]
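The code in the previous code cell is difficult to read. Good style includes at least the following:
- spaces around every mathematical symbol (=, +, -, *, /), but not needed around **
- no spaces around the equal sign of a keyword argument (so linewidth=3 is correct)
- one space after every comma
- one space after each #
- two spaces before a # when it follows a Python statement
- no space between the function name and the list of arguments, so plt.plot(x, y) is good style, and plt.plot (x, y) is not good style.
These rules are (a very small part of) the official Python style guide called PEP8. When these rules are applied, the code is much easier to read, as you can see below: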
a = 1
b = 1
c = -6
x = np.linspace(-4, 4, 100)
y = a * x**2 + b * x + c  # Compute y for all x values
plt.plot(x, y, linewidth=3);
Use correct style in all other exercises and all Notebooks to come.
Go back to your Exercise 2 and apply correct style.
Numerical data can be loaded from a data file using the loadtxt
function of numpy
; i.e., the command is np.loadtxt
. You need to make sure the file is in the same directory as your notebook, or provide the full path. The filename (or path plus filename) needs to be between quotes.
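If you want to try `np.loadtxt` before you have a data file, you can pass it a file-like object instead of a filename. The sketch below uses `io.StringIO` from the standard library for that purpose; the numbers are made up and do not come from any of the data files used in this Notebook.

```python
import io
import numpy as np

# Made-up numbers standing in for the contents of a data file
fake_file = io.StringIO('1.5 2.5 3.5\n4.5 5.5 6.5\n')

data = np.loadtxt(fake_file)   # with a real file: np.loadtxt('filename.dat')
print(data.shape)              # two rows and three columns
print(data)
```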
You are provided with data files containing the mean monthly temperature of Holland, New York City, and Beijing. The Dutch data is stored in holland_temperature.dat
, and the other filenames are similar. Plot the temperature for each location against the number of the month (starting with 1 for January) all in a single graph. Add a legend by using the function plt.legend(['line1','line2'])
, etc., but then with more descriptive names. Find out about the legend
command using plt.legend?
. Place the legend in an appropriate spot (the upper left-hand corner may be nice, or let Python figure out the best place).
Load the average monthly air temperature and seawater temperature for Holland. Create one plot with two graphs above each other using the subplot
command (use plt.subplot?
to find out how). On the top graph, plot the air and sea temperature. Label the ticks on the horizontal axis as 'jan', 'feb', 'mar', etc., rather than numbers. Use plt.xticks?
to find out how. In the bottom graph, plot the difference between the air and seawater temperature. Add legends, axes labels, the whole shebang.
If you don't specify a color for a plotting statement, matplotlib
will use its default colors. The first three default colors are special shades of blue, orange and green. The names of the default colors are a capital C
followed by the number, starting with number 0
. For example
plt.plot([0, 1], [0, 1], 'C0')
plt.plot([0, 1], [1, 2], 'C1')
plt.plot([0, 1], [2, 3], 'C2')
plt.legend(['default blue', 'default orange', 'default green']);
color1 = 'fuchsia'
color2 = 'lime'
color3 = 'DodgerBlue'
plt.plot([0, 1], [0, 1], color1)
plt.plot([0, 1], [1, 2], color2)
plt.plot([0, 1], [2, 3], color3)
plt.legend([color1, color2, color3]);
The coolest (and nerdiest) way is probably to use the xkcd color names, which need to be prefaced by xkcd:
. The list of color names comes from the xkcd color survey and includes favorites such as 'baby puke green' and a number of brown colors varying from poo
to poop brown
and baby poop brown
. Try it out:
plt.plot([1, 2, 3], [4, 5, 2], 'xkcd:baby puke green');
plt.title('xkcd color baby puke green');
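When you are not sure whether matplotlib knows a color name, you can check it before plotting. The sketch below uses the function `is_color_like` from `matplotlib.colors`, which is not discussed in this Notebook; the list of names is chosen for illustration and deliberately includes an xkcd name without its prefix.

```python
import matplotlib.colors as mcolors

names = ['C0', 'fuchsia', 'DodgerBlue', 'xkcd:baby puke green', 'baby puke green']

known = {}
for name in names:
    known[name] = mcolors.is_color_like(name)   # True if matplotlib accepts the name
    print(name, known[name])
```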
The plotting package matplotlib
allows you to make very fancy graphs. Check out the matplotlib gallery to get an overview of many of the options. The following exercises use several of the matplotlib options.
At the 2012 London Olympics, the top ten countries (plus the rest) receiving gold medals were ['USA', 'CHN', 'GBR', 'RUS', 'KOR', 'GER', 'FRA', 'ITA', 'HUN', 'AUS', 'OTHER']
. They received [46, 38, 29, 24, 13, 11, 11, 8, 8, 7, 107]
gold medals, respectively. Make a pie chart (use plt.pie?
or go to the pie charts in the matplotlib gallery) of the top 10 gold medal winners plus the others at the London Olympics. Try some of the keyword arguments to make the plot look nice. You may want to give the command plt.axis('equal')
to make the scales along the horizontal and vertical axes equal so that the pie actually looks like a circle rather than an ellipse. Use the colors
keyword in your pie chart to specify a sequence of colors. The sequence must be between square brackets, each color must be between quotes preserving upper and lower cases, and they must be separated by commas like ['MediumBlue','SpringGreen','BlueViolet']
; the sequence is repeated if it is not long enough.
Load the air and sea temperature, as used in Exercise 4, but this time make one plot of temperature vs the number of the month and use the plt.fill_between
command to fill the space between the curve and the horizontal axis. Specify the alpha
keyword, which defines the transparency. Some experimentation will give you a good value for alpha (stay between 0 and 1). Note that you need to specify the color using the color
keyword argument.
a = 1
b = 1
c = -6
x = -2
y = a * x ** 2 + b * x + c
print('y evaluated at x = -2 is', y)
x = 0
y = a * x ** 2 + b * x + c
print('y evaluated at x = 0 is', y)
x = 2.1
y = a * x ** 2 + b * x + c
print('y evaluated at x = 2.1 is', y)
y evaluated at x = -2 is -4
y evaluated at x = 0 is -6
y evaluated at x = 2.1 is 0.5099999999999998
a = 1
b = 1
c = -6
x = -2
y = a * x ** 2 + b * x + c
print(f'y evaluated at x = {x} is {y}')
x = 0
y = a * x ** 2 + b * x + c
print(f'y evaluated at x = {x} is {y}')
x = 2.1
y = a * x ** 2 + b * x + c
print(f'y evaluated at x = {x} is {y:.2f}')
y evaluated at x = -2 is -4
y evaluated at x = 0 is -6
y evaluated at x = 2.1 is 0.51
x = np.linspace(-3, 3, 100)
y = (x + 2) * (x - 1) * (x - 2)
plt.plot(x, y, 'r--')
plt.plot([-2, 1, 2], [0, 0, 0], 'bo', markersize=10)
plt.xlabel('x-axis')
plt.ylabel('y-axis')
plt.title('First Python Figure of Mark Bakker');
holland = np.loadtxt('holland_temperature.dat')
newyork = np.loadtxt('newyork_temperature.dat')
beijing = np.loadtxt('beijing_temperature.dat')
plt.plot(np.linspace(1, 12, 12), holland)
plt.plot(np.linspace(1, 12, 12), newyork)
plt.plot(np.linspace(1, 12, 12), beijing)
plt.xlabel('Number of the month')
plt.ylabel('Mean monthly temperature (Celsius)')
plt.xticks(np.linspace(1, 12, 12))
plt.legend(['Holland', 'New York', 'Beijing'], loc='best');
air = np.loadtxt('holland_temperature.dat')
sea = np.loadtxt('holland_seawater.dat')
plt.subplot(211)
plt.plot(air, 'b', label='air temp')
plt.plot(sea, 'r', label='sea temp')
plt.legend(loc='best')
plt.ylabel('temp (Celsius)')
plt.xlim(0, 11)
plt.xticks([])
plt.subplot(212)
plt.plot(air - sea, 'ko')
plt.xticks(np.linspace(0, 11, 12),
           ['jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug', 'sep', 'oct', 'nov', 'dec'])
plt.xlim(0, 11)
plt.ylabel('air - sea temp (Celsius)');
gold = [46, 38, 29, 24, 13, 11, 11, 8, 8, 7, 107]
countries = ['USA', 'CHN', 'GBR', 'RUS', 'KOR', 'GER', 'FRA', 'ITA', 'HUN', 'AUS', 'OTHER']
plt.pie(gold, labels=countries, colors=['Gold', 'MediumBlue', 'SpringGreen', 'BlueViolet'])
plt.axis('equal');
air = np.loadtxt('holland_temperature.dat')
sea = np.loadtxt('holland_seawater.dat')
plt.fill_between(range(1, 13), air, color='b', alpha=0.3)
plt.fill_between(range(1, 13), sea, color='r', alpha=0.3)
plt.xticks(np.arange(1, 13), ['jan', 'feb', 'mar', 'apr',
           'may', 'jun', 'jul', 'aug', 'sep', 'oct', 'nov', 'dec'])
plt.xlabel('Month')
plt.ylabel('Temperature (Celsius)');