Computational science and engineering (CSE) is the discipline lying at the intersection of applied mathematics, computer science, and traditional science and engineering. As a discipline, CSE uses high-performance computing (HPC), numerical algorithms, and physical insight to further engineering and scientific ends.
Although most scientific code is written in Fortran or C/C++, Python has rapidly become a serious and viable contender as a complete scientific computing language, particularly in competition with MATLAB. The packages discussed in this notebook, NumPy and SciPy, are the current enablers of that prowess. Because much of their underlying functionality takes advantage of compiled C libraries, their performance is comparable to traditional compiled languages, yet they retain the dynamic flexibility of Python.
This notebook discusses the main features of Numerical Python (numpy), Scientific Python (scipy), and their relatives in an engineering context.
Although any installation of IPython will work with a version of this notebook, we recommend that you download and install either the Enthought Canopy Distribution (free for academic users) or Anaconda. To launch the notebook, open a command terminal, type ipython notebook tutorial.ipynb, and press Return.
A few notes on this tutorial:
The code in this tutorial is written for Python 3, so if you have Python 2 then you will need to make some sensible modifications.
Code blocks starting with $ are intended to be run on the command line, not executed as Python code.
As we will be utilizing a number of packages with reasonably long names, we will adopt the de facto standard module abbreviations in the following header. We also ensure that our division behavior is sensible by importing from __future__: division of int or long values promotes to double, so that 1/2 == 0.5. Although this is the default in Python 3, it is a trivial way to help this notebook work in Python 2 if that's what you are using.
from __future__ import division, print_function
import numpy as np
import scipy as sp
import matplotlib as mpl
import matplotlib.pyplot as plt
#IPython magic command for inline plotting
%matplotlib inline
#a better plot shape for IPython
mpl.rcParams['figure.figsize']=[15,3]
from scipy import optimize
help(optimize.fmin)
This problem will obey the standard kinematic law for an object subject to gravitational acceleration, $$y(t) = \tfrac{1}{2} a t^{2} + v_{0} t + x_0 \,\text{,}$$ where $a = -g = -9.8\,\text{m/s}^{2}$.
We take the object to be initially stationary and falling from an initial position $x_{0} = 1000 \,\text{m}$.
# This code snippet models a falling object using the well-known formula.
a = -9.8  #m/s^2
v = 0     #m/s
x0 = 1000 #m
t = 1     #s
y = 0.5*a*t**2 + v*t + x0
print(y)
Python has a simple syntax that allows you to express mathematical and numerical expressions succinctly. Its syntax was in part inspired by MATLAB's, although it makes some distinct changes and improvements.
y = 0.5*a*t**2 + v*t + x0
How does this line differ from what you would typically expect? Often, however, we don't just want to evaluate a function like this at one point, but at an array of points: for instance, time steps. The intuitive way to do this in Python is to try using a list. So take the code snippet above and try to use it on a list of numbers:
# This code snippet models a falling object using the well-known formula.
a = -9.8  #m/s^2
v = 0     #m/s
x0 = 1000 #m
t = [0.0, 0.25, 0.5, 0.75, 1.0]
y = 0.5*a*t**2 + v*t + x0 #fails: ** is not defined for lists
print(y)
Basic pure Python cannot go beyond this point trivially: you have to introduce loops or some more advanced logic to get this to work properly with lists. A better way is NumPy.
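For reference, a pure-Python sketch of the loop the text alludes to, evaluating the kinematic formula $\tfrac{1}{2}at^{2} + vt + x_0$ once per time step with a list comprehension:

```python
# Pure-Python fallback: evaluate the formula at each time step with a
# list comprehension (an explicit for loop works the same way).
a = -9.8   # m/s^2
v = 0      # m/s
x0 = 1000  # m
t = [0.0, 0.25, 0.5, 0.75, 1.0]

y = [0.5*a*ti**2 + v*ti + x0 for ti in t]
print(y)
```

This works, but the loop runs in interpreted Python; NumPy moves it into compiled code.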
The NumPy solution to this problem introduces the array object (more pedantically, the ndarray object). This object has the common mathematical operators overloaded to work properly with it out-of-the-box; thus,
# This code snippet models a falling object using the well-known formula.
a = -9.8  #m/s^2
v = 0     #m/s
x0 = 1000 #m
t = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
y = 0.5*a*t**2 + v*t + x0
print(y)
You can generalize this calculation a little bit more by abstracting it out into a function.
Many features of array-based calculations will work in your functions as you would intuitively expect, although sometimes you need to tweak things to make them work correctly. A set of powerful methods in the SciPy stack (including NumPy, SciPy, MatPlotLib, and others) allows you to vectorize a function and make it apply to sets of numbers.
def y_fall(t, x0, v0):
    a = -9.8
    y = 0.5*a*t**2 + v0*t + x0
    return y

y_fall(t, x0, v)
Now we can trivially utilize NumPy functions, like arange, to calculate our result vector.
t = np.arange(0,10,0.1)
y = y_fall(t, x0, v)
print(y)
plt.plot(t, y_fall(t, x0, v))
plt.show()
Further improvements could be made in the function and the physical model. However, this suffices to make our basic case: NumPy and SciPy greatly extend the power of Python to investigate numerical and physical models. This notebook will examine some of the features and their applications.
from numpy import zeros
dx = 0.1
dy = 0.1
dx2 = dx*dx
dy2 = dy*dy

def py_update(u):
    nx, ny = u.shape
    for i in range(1, nx-1):
        for j in range(1, ny-1):
            u[i,j] = ((u[i+1, j] + u[i-1, j]) * dy2 +
                      (u[i, j+1] + u[i, j-1]) * dx2) / (2*(dx2+dy2))

def calc(N, Niter=100, func=py_update, args=()):
    u = zeros([N, N])
    u[0] = 1
    for i in range(Niter):
        func(u, *args)
    return u
This code takes a long time (if you don't want to wait, go to Kernel->Interrupt in the menu above). About *** on my machine.
%timeit calc(100, 200, func=py_update)
Let's compare a NumPy version.
def num_update(u):
    u[1:-1,1:-1] = ((u[2:,1:-1] + u[:-2,1:-1])*dy2 +
                    (u[1:-1,2:] + u[1:-1,:-2])*dx2) / (2*(dx2+dy2))
%timeit calc(100, 200, func=num_update)
So (on my machine) that's (6.18 s)/(21.9 ms) = 282$\times$ speedup! (You can do even better moving to more optimized code, but we'll ignore that for now.) The point is that for any serious numerical work, NumPy is the package to build on.
(By the by, you can also do better by including C code directly, for instance using the scipy.weave package, which we won't discuss further; note that weave works only under Python 2 and has since been removed from SciPy. In this particular case, it's not much better. We will also discuss the incorporation of C and Fortran code and libraries into Python in a few lessons.)
from scipy import weave

def weave_update(u):
    code = """
    int i, j;
    for (i=1; i<Nu[0]-1; i++) {
        for (j=1; j<Nu[1]-1; j++) {
            U2(i,j) = ((U2(i+1, j) + U2(i-1, j))*dy2 + \
                       (U2(i, j+1) + U2(i, j-1))*dx2) / (2*(dx2+dy2));
        }
    }
    """
    weave.inline(code, ['u', 'dx2', 'dy2'])
%timeit calc(100, 200, func=weave_update)
NumPy is the foundational numerical module in modern Python (both 2 and 3). NumPy provides the basic array manipulation and mathematical routines that are drawn upon by scientific packages. The common packages (which we will discuss here) are either built directly on NumPy (SciPy, Pandas, MatPlotLib) or can readily interface with it (mpmath).
from IPython.display import Image
Image(filename='numpy.png', width=800, embed=True)
NumPy is implemented in a combination of Python and C, so it executes much faster than the corresponding pure Python code would. In particular, for array-based problems, NumPy provides vastly improved performance, comparable to MATLAB. NumPy provides the following key features which we will touch on in this notebook:
a powerful generic $N$-dimensional array object
vectorized functions
useful linear algebra, Fourier transform, and random number capabilities (these are greatly extended by SciPy)
NumPy also contains tools for integrating C/C++ and Fortran code with Python for numerical applications, although this falls beyond the scope of this tutorial.
A = np.array([[1,2],[3,4]])
print(A)
You can make them directly from a list, as above, or you can use one of the built-in helper functions to do it.
For instance, numpy.ones takes the dimensions of the desired array, as either a list or a tuple.
B = np.ones([3,2])
print(B)
C = np.zeros((4,5))
print(C)
I = np.eye(3)
print(I)
These arrays now behave "properly"—that is, it's a little harder to run into a mathematical operation that doesn't just work as you may expect.
A + 1
# elementwise multiplication
B * B
# matrix multiplication
B.dot(A)
B - 3
B - (0+3j)
Naturally, addition or multiplication of arrays with incompatible shapes does not work.
A + B
A.dot(B)
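A sketch of what actually happens: the mismatched shapes above raise a ValueError, while shapes that differ but are broadcast-compatible still succeed (the row vector below is an illustrative addition, not from the cells above):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])   # shape (2, 2)
B = np.ones([3, 2])              # shape (3, 2)

# Incompatible shapes fail loudly with a ValueError.
try:
    A + B
except ValueError as err:
    print("addition failed:", err)

# Broadcasting still permits some mixed shapes: (3, 2) plus a length-2
# row vector works because the trailing dimensions agree.
print(B + np.array([10.0, 20.0]))
```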
As you can see, values promote to the expected data type, although the default is double (float64). If you need to, you may also specify the desired variable type from those available in NumPy (notable examples include double float64, complex double complex128, extended-precision float128, and long int64). These reflect the data types available in the underlying C implementation of the numerical code.
# Initialize array elements to one.
np.ones((3,2), dtype=np.int32)
# Initialize array elements to zero.
np.zeros((6,1), dtype=np.complex128)
# Do not initialize array elements to any particular values---use with caution!
np.empty((4,5))
Numerical calculations often require sequential arrays for coordinates. These can be nontrivial to create without a loop in pure Python, but naturally NumPy supports a way—actually two—to create coordinate arrays or ranges trivially.
arange
arange is the floating-point counterpart to range. range only returns integers and fails on floating-point arguments. arange returns an array of evenly-spaced floating-point values.
Python 2 v. 3: In Python 2, range generates a list of values. In Python 3, range returns an iterable instead (equivalent to Python 2's xrange). If you're not sure of the difference yet, the behavior should still be as you generally expect.
print(range(10))
print('range with integers: ', [i for i in range(10)])
print('range with integers: ', [i for i in range(0,10,2)])
print('range with floats: ', [i for i in range(0,10,0.5)]) #fails: range requires integer arguments
print(np.arange(10))
print('arange with integers: ', np.arange(10))
print('arange with integers: ', np.arange(0,10,2))
print('arange with floats: ', np.arange(0,10,0.5))
# Create a range of floating-point values (useful in for loops).
dx = 0.1
xhi = 2.0
xlo = 1.0
x = np.arange(xlo, xhi+dx, dx)
print(x)
# Create an array with 11 equal spaces from 0 to 1 inclusive.
np.linspace(0, 1, 11)
Careful! linspace is actually preferred for most cases, since numerical drift (discussed in a moment) leads arange to accrue small truncation errors when using non-integer steps. np.linspace is often the better choice, if less conveniently expressed.
Some other functions you should investigate for creating useful ranges are r_, c_, and item.
Use these cells to familiarize yourself with both IPython notebooks and NumPy/SciPy code.
Create a range of values using linspace. Then do it using arange.
Create an array x with the numbers $0, 0.5, 1.0$.
Create an array y with the numbers $-5.3, -1.8, 1.5$.
Combine x and y into an array as two subsequent rows of data. (We haven't talked about this yet—try a few things to see what works.)
Output all of these variables using print. Hint: remember to enclose the variable names in a list.
def y_fall(t, x0, v0):
    a = -9.8
    y = 0.5*a*t**2 + v0*t + x0
    return y

t = np.arange(0,300,0.1)
print(y_fall(t, x0, v))
With base data types, simple assignment makes a copy ($y$ doesn't change when $x$ changes).
x = 7
y = x
x = 8
print(x, y)
However, with arrays (particularly large ones), we don't want to incautiously make copies which could rapidly consume our available memory. Thus the behavior for assignment in NumPy is to not copy an array, but refer to it.
A = np.linspace(0.0, 9.0, 10)
B = A
print("A is B?", A is B)
A = np.linspace(0.0, 9.0, 10)
B = A
print("A =", A)
B[0] = -1 #changes element in A since B is simply another name for A
print("A =", A)
B.shape = 2,5
print("A.shape =", A.shape)
print("A =", A)
print("B.shape =", B.shape)
print("B =", B)
Views are particularly useful for referring to slices, such as ghost cells, which require regular access but are inconveniently addressed by a complex slicing expression.
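As a small sketch of the ghost-cell idea (the 1-D grid here is illustrative, not from the text): name the interior view once, then update it without repeating the slice:

```python
import numpy as np

u = np.zeros(10)        # grid with one ghost cell at each end
interior = u[1:-1]      # a view: shares memory with u

interior[:] = 5.0       # writes through to u itself
print(u)                # the ghost cells at the ends are still 0

u[0] = u[1]             # e.g., apply a zero-gradient boundary condition
print(u)
```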
A = np.linspace(0.0, 9.0, 10)
C = A.view()
print("C is A?", C is A)
print("C.base is A?", C.base is A)
A = np.linspace(0.0, 9.0, 10)
C = A.view()
print("A =", A)
C[0] = -1 #changes element in A since C shares the same data as A
print("A =", A)
C.shape = 2,5
print("A.shape =", A.shape)
print("A =", A)
print("C.shape =", C.shape)
print("C =", C)
D = C[1].view()
print(D)
A = np.linspace(0.0, 9.0, 10)
E = A.copy()
print("A is E?", A is E)
print("E.base is A?", E.base is A)
A = np.linspace(0.0, 9.0, 10)
E = A.copy()
print("A =", A)
E[0] = -1 #doesn't change elements in A since E is now a different instance from A
print("A =", A)
print("E =", E)
E.shape = 2,5
print("A.shape =", A.shape)
print("A =", A)
print("E.shape =", E.shape)
print("E =", E)
NumPy slicing simply extends the traditional Python concept to $n$ dimensions. Slices are views of the original array, and so are useful for manipulating or iterating over subsets. The basic format for each dimension is start:stop:step. Any or all of these may be omitted.
x = np.linspace(0, 9, 10)
print("x =", x)
print("x[0:5] =", x[0:5])
print("x[0:10:3] =", x[0:10:3])
print("x[1::2] =", x[1::2])
print("x[::] =", x[::])
print("x[::-1] =", x[::-1])
import numpy.random as npr
X = npr.random((3,3))
print("X =\n", X)
print("X[0:2, 0] =", X[0:2, 0])
print("X[1::2] =", X[1::2])
print("X[:, 0] =", X[:, 0])
print("X[1, :] =", X[1, :])
print("X[::-1, ::-1] =\n", X[::-1, ::-1])
Create a slice of the first column of A called A_col. Change the contents of this slice to twos. Similarly, create a row slice A_row and change its contents to threes.
Sorting by a nested element in a list of lists is rather complicated, requiring the definition of a key function for sorted, for instance. NumPy provides a trivial solution:
A = np.array([[1.5,0.9,4.6,0.1],[0.3,0.8,1.3, 2.7],[2.5,2.5,0.6,3.2]])
print('A = \n', A, '\n')
# np.sort sorts everything in its column or row.
print('Sorted along first index:\n', np.sort(A, 0), '\n')
print('Sorted along second index:\n', np.sort(A, 1), '\n')
# To keep things together, try this:
print('Indices of sorting:\n', np.argsort(A))
print('yielding the matrix columns sorted by the first row:\n', np.take(A, np.argsort(A), 1)[:,0])
print('or the rows sorted by the first column:\n', np.take(A, np.argsort(A, 0), 0)[:,0])
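For comparison, the pure-Python approach alluded to above: sorted needs an explicit key function to order the rows of a nested list by, say, their first element:

```python
# Sort the rows of a nested list by their first element using sorted()
# and a key function (pure Python, no NumPy).
A_list = [[1.5, 0.9, 4.6, 0.1],
          [0.3, 0.8, 1.3, 2.7],
          [2.5, 2.5, 0.6, 3.2]]

by_first = sorted(A_list, key=lambda row: row[0])
print(by_first)
```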
ndarrays
NumPy arrays can be written to disk in either a plain-text or binary format for future use. This is especially convenient for crash recovery (interim matrices can be recovered) and result storage.
The save function creates a binary .npy file containing the array. This file may be loaded again using load.
Q = npr.random((10,10))
print('Q = \n', Q)
np.save('random-data.npy', Q)
R = np.load('random-data.npy')
print('R = \n', R)
The savetxt function creates a text file (of user-specified extension) containing the array. This file may be loaded using loadtxt or the more robust genfromtxt (which handles missing values).
np.savetxt('random-data.dat', Q)
S = np.genfromtxt('random-data.dat')
print('S = \n', S)
Binary formats are preferred over plain text when large data sets are stored.
import os
statinfo = os.stat('random-data.npy')
print('Binary file random-data.npy consists of %d bytes.'%statinfo.st_size)
statinfo = os.stat('random-data.dat')
print('Plain-text file random-data.dat consists of %d bytes.'%statinfo.st_size)
In many cases, existing functions (from outside the NumPy/SciPy family) require modification to work properly with ndarrays. This is particularly the case with if statements over arrays (which return boolean arrays rather than single values).
For these cases, NumPy provides the convenience function vectorize, which creates an alternate version of the function that can operate on arrays. (The resulting vectorized function is not optimized, however, and should be avoided in performance-critical cases in favor of hand-tuned code.)
from numpy import expm1

def f_scalar(n):
    if n < 0.0:
        return 0.0
    else:
        return expm1(n)

print('Scalar function on scalar value:', f_scalar(1.0)) #this will succeed
print('Scalar function on vector value:', f_scalar(np.linspace(0,1,6))) #this will fail
Why does this fail? Let's evaluate the conditional test in the if statement.
np.linspace(0,1,6) < 0.0
So we get an array out, which the if statement doesn't know what to do with. NumPy knows that things like this will occasionally be a problem, so the convenience function vectorize is provided in order to make things work.
from numpy import vectorize
f_vector = vectorize(f_scalar)
print('Vector function on vector value:', f_vector(np.linspace(0,1,6))) #this will succeed now
You can also hand-code a vectorized function, for instance using the useful where function (try it):
from numpy import where

def f_hand(n):
    return where(n < 0.0, 0.0, expm1(n))

print('Handwritten vector function on vector value:', f_hand(np.linspace(0,1,6)))
print('Handwritten vector function on scalar value:', f_hand(1.0))
Let's actually pull where out so you can see what it does.
n = np.linspace(-5,5,11)
print(where(n < 0.0, 0.0, expm1(n)))
print(where(n < 0.0, 0.0, n))
print(where(n < 0.0))
def add_2nd_elems(b, a):
    """Adds the second elements of two number containers."""
    print(b, a)
    return b[1] + a[1]

aa = [[1,2,3,4], [5,6,7,8], [9,10,11,12]]
bb = [[100,200,300,400], [500,600,700,800]]
print(add_2nd_elems(bb,aa)) #works okay

add_2nd_vec = np.vectorize(add_2nd_elems)
print(add_2nd_vec(bb, aa)) #breaks---why?
numpy.linalg will be your workhorse module if you are interested in classical numerics: eigenvalues, matrix solution, decomposition, etc. It has been designed to reflect a MATLAB-like syntax as well for ease of reading and code conversion.
Many basic features of linear algebra have been promoted from the numpy.linalg module to the main numpy module and are available in both namespaces.
from numpy import linalg as LA
A = np.array([[1.,2],[3,4]])
B = np.array([[0.,-1],[1,0]])
I = np.eye(2)
print(A)
print(B)
print(I)
The * operator provides elementwise multiplication between two arrays. NumPy additionally provides the dot product, inner product, outer product, tensor dot product along specified axes, and support for Einstein summation notation.
I*A #Elementwise multiplication
np.dot(I,A) #Conventional matrix multiplication
np.inner(I,A) #Inner product: sums over the last axes, equivalent to I.dot(A.T)---not conventional matrix multiplication
LA.eig(A)
LA.eig(B)
A.T #transpose
LA.inv(A) #inverse
Matrix inversion, while convenient formally, is a tedious, memory-hogging process in practice and should be avoided in almost all cases. Use the more intelligent linalg.solve command, or execute an LU decomposition (scipy.linalg.lu), where inversion can be done by back-substitution, etc.
# Solve the matrix problem C*x = t
C = np.array([[2.843,-1.326,9.841],[8.673,1.295,-3.215],[0.173,-7.724,2.832]])
t = np.array([[5.643,3.124,1.694]]).T
x = LA.solve(C,t)
print('x =\n', repr(x))
from scipy.linalg import lu
# Solve the matrix problem C*x = t by LU decomposition. (Note that SciPy P equals MATLAB P^-1.)
(P, L, U) = lu(C)
print('P =\n', P)
print('L =\n', L)
print('U =\n', U)

#C*x = P*L*U*x = t --> z = P^-1*t, y = L^-1*z, x = U^-1*y
z = LA.inv(P).dot(t)
y = LA.inv(L).dot(z)
x_lu = LA.inv(U).dot(y)
print('x =\n', x_lu)
print('This should be zero:\nx_solve - x_lu =\n', x - x_lu)
print('Determinant of C =', LA.det(C), '\n')
print('Matrix norm of C =', LA.norm(C), '\n')
print('Inverse of C =\n', LA.inv(C), '\n')
print('Eigenvalues of C =\n', LA.eig(C)[0], '\n')
print('Eigenvectors of C =\n', LA.eig(C)[1])
NumPy, by default, uses ndarrays, and so numpy.linalg is built around these objects. Additionally, numpy.matlib provides support for matrix, an alternative to ndarray which is rarely used but provides a convenient syntax for matrix–matrix multiplication (*) and matrix exponentiation (**), for instance.
A common problem in mechanics is the solution of forces in a truss. This is solved statically by the method of joints, in which an equation is written for each node of the truss and the resulting set of linear equations is solved.
The system of linear equations on the right can be solved in matrix form. Let us write $T x = f$.
You could write the solution to this as $x = T^{-1}f$. While attractive formally, it is often far too expensive to calculate and store an inverse matrix in memory for large problems. Matrix inversion is a brute-force solution to a linear algebra problem. There are a number of cleverer methods for solving linear systems built into NumPy and SciPy. The most frequent approach is not to invert the matrix, but instead to use the solve function, as above.
T = np.array([[0.5, 1, 0, 0, 0, 0, 0],
[0.866, 0, 0, 0, 0, 0, 0],
[-0.5, 0, 0.5, 1, 0, 0, 0],
[0.866, 0, 0.866, 0, 0, 0, 0],
[0, -1, -0.5, 0, 0.5, 1, 0],
[0, 0, 0.866, 0, 0.866, 0, 0],
[0, 0, 0, -1, -0.5, 0, 0.5]])
print(T)
f1 = 1000
f2 = 2000
f = np.array([f1, -0.433*f1-0.5*f2, -f1, 0, 0, f2, 0])
Solve this system by explicitly inverting the matrix (numpy.linalg.inv). How long does it take?
Now solve the same system with the scipy.linalg.solve function. How long does it take?
The larger the matrix, the more pronounced the difference will be, both in run time and in the accrual of small numerical errors between the two methods.
It is apparent that the matrix in the previous example is largely diagonal: most of the values away from the diagonal are zero. Something similar is often observed in the solution of differential equations using linear algebra. Matrices of this form are referred to as banded, and it is more efficient to represent them sparsely: that is, we only specify the position and value of nonzero entries and presume that all others are zero. For a $7 \times 7$ matrix, this is marginally efficient, if it helps at all; for a $10,000 \times 10,000$ matrix, sparse representation is of obvious utility.
One typical pattern, the tridiagonal banded matrix, is shown below.
Other patterns are common for sparse matrices; all have in common that they can be more efficiently represented in sparse form than dense form.
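As a sketch, scipy.sparse.diags assembles such a tridiagonal matrix directly from its bands (the 5×5 size here is arbitrary):

```python
import numpy as np
from scipy.sparse import diags

n = 5
# Sub-, main, and super-diagonal bands at offsets -1, 0, and +1.
K = diags([-np.ones(n-1), 2.0*np.ones(n), -np.ones(n-1)],
          offsets=[-1, 0, 1])
print(K.toarray())
```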
Let's examine the sparse representation of the truss example.
from scipy.sparse import coo_matrix # COOrdinate format, a common sparse representation
print(coo_matrix(T))
from sys import getsizeof
print('%d bytes for dense T'%getsizeof(T))
print('%d bytes for sparse T'%getsizeof(coo_matrix(T))) #caveat: getsizeof does not count the index and data arrays held inside the sparse object
To recapitulate, for problems requiring large grids, it often becomes economical to use sparse matrix representations. This is the case when most entries of a matrix are zero, meaning that it is more compact to store values and indices rather than the entire matrix. These types of matrices often occur in the solution of finite difference, finite element, and nodal discretizations of differential equations.
SciPy sparse matrix objects support methods such as min and diagonal; however, conventional slicing does not generally work with these matrices.
Finally, recall that sparse matrix representation is designed primarily for memory efficiency. For numerical efficiency, there are several representations to choose from, each with varying features.
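For instance (a sketch, with hypothetical values): LIL supports cheap incremental assignment during construction, while CSR is the usual choice for fast arithmetic, so a common pattern is to build in one format and convert to the other:

```python
import numpy as np
from scipy.sparse import lil_matrix

# Build incrementally in LIL (list-of-lists) format...
M = lil_matrix((4, 4))
M[0, 0] = 1.0
M[1, 2] = 5.0

# ...then convert once to CSR for fast matrix-vector products.
M_csr = M.tocsr()
v = np.ones(4)
print(M_csr.dot(v))
```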
Indicial notation is convenient when populating banded matrices, such as for finite difference equations.
# Banded dense matrix
nx = 4
H = np.zeros((nx,nx), dtype=np.float128)
i,j = np.indices(H.shape)
H[i==j] = 2.0
H[i==j-1] = -1.0
H[i==j+1] = -1.0
print(H)
# Banded sparse matrix
nx = 4
from scipy.sparse import csr_matrix # Compressed Sparse Row format, a common sparse representation
J = csr_matrix((nx,nx), dtype=np.float128)
i,j = np.indices(J.shape)
#note: item assignment on a CSR matrix works but emits a SparseEfficiencyWarning; lil_matrix is better suited to incremental assignment
J[i==j] = 2.0
J[i==j-1] = -1.0
J[i==j+1] = -1.0
print(J)
print(J.todense())
The next examples will illustrate how to use the built-in polynomial convenience functions (such as Chebyshev, Laguerre, Legendre, Hermite, etc.).
First, the representation of polynomials: an intuitive power-series representation is used to carry the coefficients, which may be evaluated using polyval or integrated and differentiated, etc. (The index gives the power to which the independent variable is raised.)
from numpy.polynomial.polynomial import polyval
poly = [1,4,6,7] # 1 + 4x + 6x**2 + 7x**3
print(polyval(2.0, poly))
Normally, you'll probably just use a list of coefficients. Occasionally, however, converting to the Polynomial class explicitly can be convenient, as you gain access to a number of methods like linspace and roots.
from numpy.polynomial import Polynomial
Polynomial([1,2,3]).linspace(25, [0,1])
Polynomial([4,2,-1]).roots()
Naturally, the first (and maybe only) thing most of us want to do with polynomials is use them to fit discrete data. polyfit fits the bill.
Let's pull a data set of a few hundred points and try fitting it. The Energy Information Administration provides monthly summaries of energy consumption going back to January 1973, so let's pull those values up and see if we can fit them reasonably well.
# Load the energy consumption data from online source.
import urllib.request
url = urllib.request.urlopen("https://raw.githubusercontent.com/uiuc-cse/python-sp15/gh-pages/lessons/data/eia-data.csv").read(20000).decode('utf-8') # read only 20 000 chars
rows = url.split('\r') # then split it into lines
data_list = [row.split(',') for row in rows]
data_list = [float(pt) for row in data_list for pt in row ]
energy_data = np.array(data_list)
print(energy_data.shape[0])
energy_data = np.reshape(energy_data, (energy_data.shape[0]//2, 2)) #integer division: reshape requires an int
ax = plt.subplot(111)
ax.plot(energy_data[:,0], energy_data[:,1])
ax.set_title(r'U.S. Energy Consumption in $10^{15}$ BTU')
plt.show()
The data are a bit choppy to fit directly with a polynomial (although you could probably throw in a cyclical multiplier on an annual or semiannual period if you were interested in fitting the data that closely). Let's sample periodically, which is easy in NumPy:
sample = energy_data[3::12] # sampling in March
ax = plt.subplot(111)
ax.plot(sample[:,0], sample[:,1])
ax.set_title(r'U.S. Energy Consumption in $10^{15}$ BTU')
plt.show()
Okay, now it's actually trivial to carry out a least-squares fit of the data to whatever polynomial degree they support:
from numpy.polynomial.polynomial import polyfit
order = 3
x = sample[:,0]
y = sample[:,1]
y_poly = polyfit(x, y, order)
y_fit = polyval(x, y_poly)
ax = plt.subplot(111)
ax.plot(x, y)
ax.plot(x, y_fit, 'r-')
ax.set_title(r'U.S. Energy Consumption in $10^{15}$ BTU, March 1973–March 2014')
plt.show()
The usual caveats about curve fitting, extrapolation, etc., of course apply.
You can also find out information about the residual and other statistics as documented:
y_poly, y_stats = polyfit(x, y, order, full=True)
y_fit = polyval(x, y_poly)
print(y_poly)
print(y_stats)
The Chebyshev functions are a set of orthogonal basis functions used in the solution of differential equations. They can be recursively defined as
$$T_0(x) = 1\text{,} \qquad T_1(x) = x\text{,} \qquad T_{n+1}(x) = 2xT_n(x) - T_{n-1}(x)\text{.}$$
The first few functions are:
$$T_0(x) = 1\text{,} \quad T_1(x) = x\text{,} \quad T_2(x) = 2x^2 - 1\text{,} \quad T_3(x) = 4x^3 - 3x\text{,} \quad T_4(x) = 8x^4 - 8x^2 + 1\text{,}$$
et cetera.
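The recurrence is easy to check numerically from the explicit forms above, e.g. $T_3(x) = 2x\,T_2(x) - T_1(x)$:

```python
import numpy as np

x = np.linspace(-1, 1, 5)

# Explicit forms of the first few Chebyshev polynomials.
T1 = x
T2 = 2*x**2 - 1
T3 = 4*x**3 - 3*x

# Verify T_{n+1}(x) = 2 x T_n(x) - T_{n-1}(x) for n = 1 and n = 2.
print(np.allclose(T2, 2*x*T1 - 1))
print(np.allclose(T3, 2*x*T2 - T1))
```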
from numpy.polynomial import Chebyshev as T
from numpy.polynomial import Polynomial as P
x = np.linspace(-1, 1, 100)
for i in range(6):
ax = plt.plot(x, T.basis(i)(x), lw=2, label=r'$T_%d$'%i)
print(r'T_%d(x) = %s'%(i, T.basis(i).convert(kind=P)))
plt.legend(loc="upper left")
plt.show()
Many other basis functions and standard polynomials are available as well.
The functionality for creating Taylor series expansions of existing functions can be found in several places in the SciPy suite. For instance, scipy.interpolate provides approximate_taylor_polynomial to estimate the Taylor series around a point of an arbitrary function by polynomial fitting.
# Select an arbitrary function to approximate.
from scipy.special import j0 # Bessel function of the first kind, order zero
# Approximate the function f by the Taylor series expansion p (translated to the origin, so p(0) = f(x0)).
from scipy.interpolate import approximate_taylor_polynomial
x0 = 3.5 # location at which we will approximate
order = 3 # <-- tuning knob for order of approximation
series = approximate_taylor_polynomial(j0, x0, order, 0.01)
print(series)
# Plot the approximation(s) versus the original function.
x = np.linspace(x0-0.5, x0+0.5, 200)
mpl.rcParams['figure.figsize']=[15,6]
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(x, j0(x), 'r-', lw=2, label=r'$J_0(x)$')
ax.plot(x, series(x-x0))
# Plot residual in subfigure.
ax = fig.add_subplot(222)
ax.plot(x, j0(x)-series(x-x0), lw=2)
plt.show()
You can also utilize the sympy.series function:
from sympy import Symbol, series
from sympy.functions.special.bessel import besselj
x = Symbol('x')
series(besselj(0,x), x, x0=x0, n=order+1).evalf()
Random numbers are generated in numpy.random using a fairly sophisticated pseudorandom number generation algorithm, the Mersenne Twister (incidentally, it's unsuitable for cryptography but acceptable for Monte Carlo simulations).
Dozens of distributions are available for your use. When you require a random value, you sample from a specific distribution with its own limits, weights, and moments: uniform [0,1) (uniform), Gaussian normal (normal), $\chi^{2}$ (chisquare), Pareto (pareto), and so on.
from numpy.random import random
print(random((2,3)))
from numpy.random import uniform
print(uniform(0,1))
print(uniform(0,100))
print(uniform(5)) #careful: this sets low=5 and leaves high at its default of 1.0
from numpy.random import normal
n = (1000,1)
x = normal(size=n)
avg = np.mean(x)
std = np.std(x)
x_avg = np.ones(n)* avg
x_stdl = np.ones(n)*(avg-std)
x_stdh = np.ones(n)*(avg+std)
plt.plot(x,'bx',x_avg,'r-',x_stdl,'r--',x_stdh,'r--')
plt.title('%d Random Gaussian Numbers' % n[0]); plt.xlabel(r'$n$'); plt.ylabel(r'$U(n)$')
plt.show()
plt.hist(x,bins=20)
plt.title('Distribution of %d Random Gaussian Numbers' % n[0]); plt.xlabel(r'$U(n)$'); plt.ylabel(r'Frequency of $U$')
plt.show()
A seed is necessary to start a PRNG. Typically the system clock is used when the internal RandomState object is created, although you may provide a seed manually, in which case the same random number sequence will be reproduced each time the program is invoked. This is particularly helpful when you are trying to create replicable results or debug your code.
A factorial experiment tries all combinations or ordered pairs of experimental conditions. It is a best practice to run these trials in a random order, thus attempting to minimize bias and environmental factors.
You can use the shuffle function to trivially generate a randomized factorial design for experimentation. Consider a two-factor experiment where each independent variable can assume four states. There are thus sixteen trials to be held:
# Generate an array of ordered pairs of independent variable values.
factors = np.array([[i, j] for i in range(0,4) for j in range(0,4)])
print('Initial array:\n', factors)
# Shuffle the ordered pairs into a random order. Presto! A randomized factorial design!
from numpy.random import shuffle
shuffle(factors)
print('Shuffled array:\n', factors)
SciPy is a collection of mathematical routines built on top of NumPy. SciPy also provides convenience functions for scientific computing. (SciPy can refer to either the entire system of modules around NumPy or specifically to the SciPy library; we consistently take the latter sense in this document.)
You can get help for a component of scipy using the info function. Similarly, source will let you view the source code for a function.
from scipy.special import jn
sp.info(jn)
from scipy import optimize
sp.source(optimize.fmin)
scipy.constants provides mathematical constants and physical constants in SI units (unless otherwise stated). It also provides primitive unit conversion support.
scipy.constants.physical_constants is a dictionary of named quantities in the format physical_constants[name] = (value, unit, uncertainty).
from scipy import constants
from math import e as ee
print(constants.R) #molar gas constant in J/mol/K
print(constants.physical_constants['molar gas constant'])
print(constants.c)
print(constants.physical_constants['speed of light in vacuum'])
print(constants.physical_constants['neutron mass'])
print(constants.physical_constants['{220} lattice spacing of silicon'])
print(constants.physical_constants['Planck mass energy equivalent in GeV'])
print(constants.femto*constants.gram) #SI equivalent of fg (i.e., in kg)
#(1,000 miles to m) / (speed of sound at 15°C, 1 atm in m/s)
1000 * constants.mile / constants.mach
Make sure you know what you are asking for.
print(ee) #Euler's number, base of the natural logarithm
print(constants.e) #elementary charge, not Euler's number
from scipy.special import jn, jn_zeros
xlo = 0.0
xhi = jn_zeros(1,4)[3]
x = np.linspace(xlo, xhi, 501)
y = jn(1, x)
plt.plot(x, y, 'r-')
plt.show()
Some integration functions operate on sampled numerical data rather than on functions. Note that both trapz and simps take the sample values first and the sample points second: trapz(y, x).
from scipy.integrate import trapz
val = trapz(y, x)
print('The result of the trapezoid rule is %f.'%val)
from scipy.integrate import simps
val = simps(y, x)
print('The result of Simpson\'s rule is %f.'%val)
Others can operate directly on functions and yield error estimates as well. quad
is the recommended routine for general-purpose definite integrals.
from scipy.integrate import quadrature
(val, err) = quadrature(lambda x: jn(1, x), xlo, xhi)
print('The result of quadrature is %f with an error of %e.'%(val, err))
from scipy.integrate import quad
(val, err) = quad(lambda x: jn(1, x), xlo, xhi)
print('The result of general integration is %f with an error of %e.'%(val, err))
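quad also accepts np.inf as a limit, so improper integrals need no manual change of variables. A small sketch (separate from the Bessel example above) using $\int_0^\infty e^{-x}\,dx = 1$:

```python
import numpy as np
from scipy.integrate import quad

# The exact value of the integral of exp(-x) over [0, inf) is 1.
(val, err) = quad(lambda x: np.exp(-x), 0, np.inf)
print('The result is %f with an error of %e.'%(val, err))
```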
You can also integrate the convenience classes provided in numpy.polynomial
. These results are analytic.
from numpy.polynomial.polynomial import polyint, polyval
c = [0.667, 4.675, -2.349, 5.602] #coefficients in order of increasing degree
print('The base polynomial coefficients are', c)
print('The integrated polynomial coefficients are', polyint(c))
x = np.linspace(-2, 2, 1001)
#np.polyval expects coefficients highest degree first, so use this
#module's own polyval, which matches polyint's low-to-high convention
plt.plot(x, polyval(x, c))
plt.plot(x, polyval(x, polyint(c)))
plt.ylim((0, 20))
plt.show()
odeint
provides an interface for the solution of systems of first-order ordinary differential equations.
Solve the equation $\left( 1 + x^2 \right) \textrm{d}y = \left( \frac{1}{x} - x y \right) \textrm{d}x$.
Rewrite in a canonical form as follows:
$\frac{\textrm{d}y}{\textrm{d}x} = \frac{\left( \frac{1}{x} - x y \right)}{1 + x^2} = \frac{1}{x + x^3} - \frac{x}{1 + x^2} y$.
from scipy.integrate import odeint
def dydx(y, x):
if x == 0:
return float('Inf')
return (1/(x+x**3.0)) - x/(1+x*x)*y
y0 = np.array((1.0))
dx = 1e-5
x = np.arange(1e-2, 10.0, dx)
sol= odeint(dydx, y0, x)
plt.plot(x, sol)
plt.show()
Solve the equation $\frac{\textrm{d}y}{\textrm{d}x} = y(x)^2$ subject to the condition $y(0) = A$ for $A \in \left\{0.2, 0.4, ..., 2.0\right\}$ over the range $x \in \left[0, \frac{1}{2}\right]$.
def dydx(y, x):
return y*y
y0 = np.arange(0.2,2.01,0.2)
x = np.arange(0.0, 0.5, 1e-5)
sol= odeint(dydx, y0, x)
import matplotlib.cm as cm
for i in np.arange(10):
c = cm.rainbow(i/10.,1)
plt.plot(x[:], sol[:,i], color=c)
plt.ylim((0,10))
plt.show()
Higher-order ODEs may be solved by recasting them as a system of coupled first-order equations:
$$\frac{du}{dt} = v$$$$\frac{dv}{dt} = f(u, v)$$Boundary conditions must be handled using the shooting method, as only one boundary condition can be specified at a time with the current odeint
function.
Solve the 1D steady-state heat equation $g(x) = \alpha \left( \frac{\partial^2 u}{\partial x^2} \right)$ with $\alpha = 1$ and the boundary conditions $u(x = 0) = 0$, $u(x = 1) = 1$ over the range $x \in [0, 1]$ with $g(x) = -20 \sin(x)$.
Rewrite the equation as a system of two first-order equations:
In this case (a linear equation), given two trial solutions $u_1$ and $u_2$ that satisfy the left-hand boundary condition $u(0) = 0$ but differ in their initial slopes, the actual solution is a linear combination $u = c_1 u_1 + c_2 u_2$. The coefficients follow from matching the right-hand boundary condition $u(1) = 1$:
#f = [u, v]
def dfdx(f, x):
return np.array([f[1], -20*np.sin(x)])
f1 = odeint(dfdx, np.array([0.0, 0.0]), np.linspace(0.0, 1.0, 10000))
f2 = odeint(dfdx, np.array([0.0, 1.0]), np.linspace(0.0, 1.0, 10000))
#Shooting method for converting a BVP to an IVP.
c1 = (1.0 - f2[:,0][-1]) / (f1[:,0][-1] - f2[:,0][-1])
c2 = (f1[:,0][-1] - 1.0) / (f1[:,0][-1] - f2[:,0][-1])
f = c1 * f1[:,0] + c2 * f2[:,0]
plt.plot(np.linspace(0.0, 1.0, 10000), f1[:,0], 'r--',
np.linspace(0.0, 1.0, 10000), f2[:,0], 'g--',
np.linspace(0.0, 1.0, 10000), f, 'b-')
plt.title(r'Solution of 1D heat equation $u_{xx}(x) = -20 \sin(x)$')
plt.ylabel(r'$u(x)$')
plt.xlabel(r'$x$')
plt.ylim((-4,2))
plt.show()
scipy.interpolate
provides splines, polynomial interpolators, and other resources for univariate and multivariate interpolation.
interp1d
returns a function interpolating data within the range of the data.
from scipy.interpolate import interp1d
xpts = np.linspace(0, 10, 11)
y = np.sin(-xpts**2/8.0)
f = interp1d(xpts, y, kind='linear')
f2 = interp1d(xpts, y, kind='cubic')
x = np.linspace(0, 10, 201)
plt.plot(xpts, y, 'kx',
x, f(x), 'r--',
x, f2(x), 'r-',
x, np.sin(-x**2/8.0), 'k-')
plt.legend(['data', 'linear', 'cubic', 'exact'], loc='upper left', ncol=2)
plt.ylim((-1.5,1.5))
plt.show()
print('f(%f) = %f'%(2.71828, f(2.71828)))
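By default the interpolant raises a ValueError outside the data range; if you prefer a sentinel value instead, interp1d accepts bounds_error and fill_value arguments. A sketch with the same data:

```python
import numpy as np
from scipy.interpolate import interp1d

xpts = np.linspace(0, 10, 11)
y = np.sin(-xpts**2/8.0)
# Out-of-range queries now return NaN instead of raising an exception.
f = interp1d(xpts, y, kind='linear', bounds_error=False, fill_value=np.nan)
print(f(5.0))   # inside the data range: a normal interpolated value
print(f(12.0))  # outside the data range: nan
```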
griddata
interpolates unstructured $N$-dimensional data according to one of several methods (nearest, linear, and cubic).
def func(x, y):
return x*(1-x)*np.cos(4*np.pi*x) * np.sin(2*np.pi*np.sqrt(y))
# Define the basic grid coordinates.
grid_x, grid_y = np.mgrid[0:1:250j, 0:1:250j]
# Define a random subset of the grid for which we will generate data.
pts = np.random.rand(500,2)
vals = func(pts[:,0], pts[:,1])
# Load the methods and generate a grid for each approach.
from scipy.interpolate import griddata
grid_z0 = griddata(pts, vals, (grid_x, grid_y), method='nearest')
grid_z1 = griddata(pts, vals, (grid_x, grid_y), method='linear')
grid_z2 = griddata(pts, vals, (grid_x, grid_y), method='cubic')
plt.subplot(221)
plt.imshow(func(grid_x, grid_y).T, extent=(0,1,0,1), origin='lower', cmap=cm.PiYG)
plt.plot(pts[:,0], pts[:,1], 'k.', ms=1)
plt.title('Original')
plt.subplot(222)
plt.imshow(grid_z0.T, extent=(0,1,0,1), origin='lower', cmap=cm.PiYG)
plt.title('Nearest')
plt.subplot(223)
plt.imshow(grid_z1.T, extent=(0,1,0,1), origin='lower', cmap=cm.PiYG)
plt.title('Linear')
plt.subplot(224)
plt.imshow(grid_z2.T, extent=(0,1,0,1), origin='lower', cmap=cm.PiYG)
plt.title('Cubic')
plt.gcf().set_size_inches(12, 12)
plt.show()
Spline interpolations in one and two dimensions are also available.
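For instance, UnivariateSpline fits a smoothing spline to 1D data; with s=0 it interpolates the points exactly, and the fitted object exposes derivatives and analytic integrals. A minimal sketch:

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

xpts = np.linspace(0, 10, 21)
y = np.sin(xpts)
# s=0 forces the spline through every data point; larger s trades
# fidelity for smoothness.
spl = UnivariateSpline(xpts, y, s=0)
print(spl(np.pi/2))            # close to sin(pi/2) = 1
print(spl.integral(0, np.pi))  # close to 2
```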
As an example, we will optimize the function $\sqrt[x]{x}$, which has a global maximum at $x = e = 2.718281828459045...$. This gives us a straightforward benchmark for testing convergence and accuracy.
from math import e as ee
def f(x):
return x ** (1/x)
x = np.linspace(0,4,401)
y = f(x)
mpl.rcParams['figure.figsize']=[15,3]
plt.plot(x, y, 'b-')
plt.axis((0.0, 4.0, 0.0, 1.5))
plt.show()
The only wrinkle is that SciPy provides a minimize
function but no maximize, so we minimize $1 - f(x)$ instead.
from scipy.optimize import minimize
def fmod(x):
return 1.0-f(x)
x0 = 1.0 #a bad guess
res = minimize(fmod, x0, method='nelder-mead', options={'xtol': 1e-8, 'disp': True})
print(res)
print('Absolute error is %e.'%(res.x-ee))
mpl.rcParams['figure.figsize']=[15,3]
plt.plot(x, y, 'b-', [ee, ee], [0.0, 1.5], 'r--', [res.x, res.x], [0.0, 1.5], 'g--')
plt.axis((0.0, 4.0, 0.0, 1.5))
plt.show()
There is a wide selection of optimization methods available in SciPy, from old standbys like the Nelder–Mead simplex and conjugate gradient to simulated annealing and the dog-leg trust-region algorithm.
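For example, a gradient-based method such as BFGS can be swapped in by changing only the method argument. A sketch on the same $x^{1/x}$ benchmark (here minimizing the negative rather than $1 - f(x)$):

```python
import numpy as np
from scipy.optimize import minimize

def g(x):
    # Negate to turn maximization of x**(1/x) into minimization.
    return -(x ** (1.0/x))

# BFGS estimates gradients by finite differences and builds an
# approximate inverse Hessian as it iterates.
res = minimize(g, 1.0, method='BFGS')
print(res.x)  # should be close to e = 2.71828...
```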
Special functions typically refer to particular mathematical functions with well-established notations and roles, although there is no formal definition of the category.
Common special functions include the trigonometric ($\sin, \cos, \tan, \sinh, \cosh, \tanh$) (which are in the scipy
namespace), Bessel ($J_\nu, Y_\nu, I_\nu, K_\nu$), Airy ($\textrm{Ai}$), spherical harmonics ($Y^m_\ell$), $\Gamma$, binomial coefficient, and error ($\textrm{erf}, \textrm{erfc}$) functions. Many more esoteric functions, such as Kelvin ($\textrm{ber}, \textrm{bei}, \textrm{ker}, \textrm{kei},$ etc.), Mathieu ($\textrm{CE}_n, \textrm{SE}_n$, etc.), and hypergeometric functions ($_pF_q, \textrm{Li}_2$, etc.) are also supported in scipy.special
.
from scipy.special import j0, jn_zeros
x = np.linspace(0, 16, 201)
plt.plot(x, j0(x), 'r-', jn_zeros(0, 5), [0]*5, 'bo')
plt.ylim((-1.0, 1.0))
plt.show()
from scipy.special import airy
(Ai, Aip, Bi, Bip) = airy(1.0)
print('Ai(1.0) = %.4f\nBi(1.0) = %.4f'%(Ai, Bi))
from scipy.special import erf, gamma, eval_legendre
x = np.linspace(0.001, 3, 201)
plt.plot(x, erf(x), 'r-',
x, gamma(x), 'g-',
x, eval_legendre(8, x), 'b-')
plt.ylim((-0.5, 2.5))
plt.show()
There are both generic and optimized specific versions of many of these functions, particularly Bessel and Hankel functions.
from scipy.special import jn, jv, j0
print('j0(1.5) = %s\n'%j0(1.5))
# Let's do some time trials to see which is faster. (This is for 1e6 evaluations, not one.)
import timeit
t = timeit.Timer("jv(0, 1.5)","from scipy.special import jv")
print('jv(0, 1.5) @ %.4s s'%t.timeit())
t = timeit.Timer("jn(0, 1.5)","from scipy.special import jn")
print('jn(0, 1.5) @ %.4s s'%t.timeit())
t = timeit.Timer("j0(1.5)","from scipy.special import j0")
print('j0(1.5) @ %.4s s'%t.timeit())
scipy.stats
contains many probability distributions and associated convenience functions and statistical functions.
from scipy.stats import alpha, chi2, maxwell, rayleigh, pareto
x = np.linspace(0.01, 10, 1001)
ax = plt.plot(x, alpha.pdf(x, 1.0), lw=2, label=r'$\alpha$')
ax = plt.plot(x, chi2.pdf(x, 1.0), lw=2, label=r'$\chi^2$')
ax = plt.plot(x, maxwell.pdf(x, 1.0), lw=2, label=r'Maxwell')
ax = plt.plot(x, rayleigh.pdf(x, 1.0), lw=2, label=r'Rayleigh')
ax = plt.plot(x, pareto.pdf(x, 1.0), lw=2, label=r'Pareto')
plt.legend(loc='upper right')
plt.ylim(0.0, 1.2)
plt.show()
from numpy.random import normal
n = (1000,1)
x = normal(size=n)
plt.plot(x, 'bo')
from scipy.stats import tmean, skewtest, histogram
x_mean = np.mean(x)
x_tmean= tmean(x, (-1, 1)) #trimmed mean: mean of interval [-1, 1]
print('x_mean =\n', x_mean)
print('x_tmean =\n', x_tmean)
print('is skew significant? the p-value is', skewtest(x)[1])
h = histogram(x, 5)
print('histogram bins size = %f, low value = %f, occupancy = %s'%(h[2], h[1], h[0]))
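Each distribution also carries rvs (sampling) and fit (maximum-likelihood parameter estimation) methods. A sketch recovering the parameters of a normal distribution from generated samples:

```python
import numpy as np
from scipy.stats import norm

np.random.seed(1)
# Draw from N(mean=5, std=2), then recover the parameters by
# maximum-likelihood fitting.
data = norm.rvs(loc=5.0, scale=2.0, size=10000)
(mu, sigma) = norm.fit(data)
print('fitted mean = %.3f, fitted std = %.3f'%(mu, sigma))
```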
Finite-difference models are used throughout engineering to obtain numerical solutions to differential equations. This particular system models the heat equation
$$ \frac{1}{\alpha} \frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}$$given an initial condition of $u(x,t=0) = \sin^2\left(\pi x/L\right)$ and boundary conditions of $u(x=0,t) = 0$ and $u(x=L,t) = 0$.
To approximate a derivative, the most straightforward way is to take the formal definition
$$f'(x) = \frac{f(x+h)-f(x)}{h}$$and use a small but nonzero step $h$ in your calculation.
Application of this principle to the heat equation leads to a statement of the form
$$ \frac{1}{\alpha} \frac{u^m_i - u^{m-1}_i}{\Delta t} = \frac{u^{m-1}_{i-1} - 2 u^{m-1}_{i} + u^{m-1}_{i+1}}{\Delta x^2} $$or $u^m_i = \frac{\alpha \Delta t}{\Delta x^2}u^{m-1}_{i-1} + \left[1 - 2\left(\frac{\alpha \Delta t}{\Delta x^2}\right)\right]u^{m-1}_{i} + \frac{\alpha \Delta t}{\Delta x^2}u^{m-1}_{i+1}$.
This clearly yields a way to calculate subsequent time steps point-by-point from the previous time step's data.
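One caveat before coding this up: the explicit (forward-time, centered-space) scheme is only stable when $r = \alpha \Delta t / \Delta x^2 \le \frac{1}{2}$. A quick check with the parameter values used below:

```python
# Parameters matching the finite-difference example below.
alpha, length, tmax = 0.1, 1.0, 0.5
nx, nt = 10, 60
dx = length / (nx - 1)
dt = tmax / (nt - 1)
r = alpha * dt / (dx * dx)
# The explicit scheme amplifies errors without bound unless r <= 0.5.
print('r = %.4f -> %s'%(r, 'stable' if r <= 0.5 else 'UNSTABLE'))
```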
# Basic parameters
nt = 60
nx = 10
alpha = 0.1
length = 1.0
tmax = 0.5
# Derived parameters: mesh spacing and time step size
dx = length / (nx-1)
dt = tmax / (nt-1)
# Create arrays to save data in process.
x = []
t = []
u = []
for i in range(nt):
t.append(i*dt)
ulist = []
for j in range(nx):
ulist.append(0.0)
u.append(ulist)
for i in range(nx):
x.append(i*dx)
# Set initial and boundary conditions.
from math import sin, pi
for j in range(nx):
u[0][j] = sin(pi*(j*dx)/length)**2
#boundaries are implicitly set by this initial condition
# Loop through each time step.
r = alpha * dt / (dx*dx)
s = 1 - 2*r
for n in range(1, nt):
for j in range(1, nx-1):
u[n][j] = r*u[n-1][j-1] + s*u[n-1][j] + r*u[n-1][j+1]
# Plot the results (initial and final conditions).
mpl.rcParams['figure.figsize']=[15,3]
plt.plot(x, u[0], 'k-', lw=2)
plt.plot(x, u[-1], 'b-', lw=2)
plt.ylim((0,1))
plt.xlim((0,1))
plt.show()
So the code above basically works, but we can improve it by handing more of the work over to NumPy.
First, let's read in the parameters from a separate file. (There's also a version in coding.ipynb
which takes command-line arguments as input.) We'll also start using NumPy ndarray
s for the data, which simplifies things considerably.
# Load parameters from disk.
with open('fd-params.cfg', 'r') as config:
for line in config:
variable, value = [word.strip() for word in line.split('=')]
exec(variable + '=' + value)
# Compute mesh spacing and time step.
dx = length / (nx-1)
dt = tmax / (nt-1)
# Create arrays to save data in process.
x = np.linspace(0, length+1e-15, nx)
t = np.linspace(0, tmax+1e-15, nt)
u = np.zeros([nx, nt])
# Set initial and boundary conditions.
u[:, 0] = np.sin(np.pi*x/length)**2
#boundaries are implicitly set by this initial condition
# Loop through each time step.
r = alpha * dt / (dx*dx)
s = 1 - 2*r
for n in range(1, nt):
for j in range(1, nx-1):
u[j, n] = r*u[j-1, n-1] + s*u[j, n-1] + r*u[j+1, n-1]
# Plot the results (initial and final conditions).
mpl.rcParams['figure.figsize']=[15,3]
plt.plot(x, u[:,0], 'k-', lw=2)
plt.plot(x, u[:,-1], 'b-', lw=2)
plt.ylim((0,1))
plt.xlim((0,1))
plt.show()
Now we can convert the calculation to use a matrix instead of an element-by-element for
loop. We also introduce a few functions to tidy things up.
def calc_params(nx, nt, alpha, length, tmax):
# Compute mesh spacing and time step.
dx = length / (nx-1)
dt = tmax / (nt-1)
return (dx, dt)
def create_arrays(nx, nt):
# Create arrays to save data in process.
x = np.linspace(0, length+1e-15, nx)
t = np.linspace(0, tmax+1e-15, nt)
u = np.zeros([nx, nt])
return (x, t, u)
def set_ic(x, length):
# Set initial and boundary conditions.
u[:, 0] = np.sin(np.pi*x/length)**2
#boundaries are implicitly set by this initial condition
return u
def plot_results(u, x):
# Plot the results (initial and final conditions).
mpl.rcParams['figure.figsize']=[15,3]
plt.plot(x, u[:,0], 'k-', lw=2)
plt.plot(x, u[:,-1], 'b-', lw=2)
#import matplotlib.cm as cm
#for i in range(len(u[:,0:-1:10])-1):
# c = cm.rainbow(i/len(u[:,0:-1:10]), 1)
# plt.plot(x[:], u[:,i], color=c, lw=2)
plt.ylim((0,1))
plt.xlim((0,1))
plt.show()
def gen_matrix(nx, alpha, dt, dx):
r = alpha * dt / (dx*dx)
s = 1 - 2*r
A = np.zeros((nx, nx), dtype=np.float128)
i,j = np.indices(A.shape)
A[i==j] = s
A[i==j-1] = r
A[i==j+1] = r
return A
# Load parameters from disk.
with open('fd-params.cfg', 'r') as config:
for line in config:
variable, value = [word.strip() for word in line.split('=')]
exec(variable + '=' + value)
(dx, dt) = calc_params(nx, nt, alpha, length, tmax)
(x, t, u ) = create_arrays(nx, nt)
u = set_ic(x, length)
A = gen_matrix(nx, alpha, dt, dx)
# Loop through each time step.
for n in range(1, nt):
u[:,n] = A.dot(u[:,n-1])
plot_results(u, x)
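Since A is tridiagonal, storing all $n_x^2$ entries is wasteful for large grids; scipy.sparse.diags builds the same operator from its three diagonals alone. A sketch with illustrative values of r and s (s = 1 - 2r, as in the scheme):

```python
import numpy as np
from scipy.sparse import diags

nx = 10
r = 0.05
s = 1 - 2*r
# Assemble the tridiagonal update matrix from its diagonals only.
A = diags([r*np.ones(nx-1), s*np.ones(nx), r*np.ones(nx-1)], [-1, 0, 1])
u = np.sin(np.pi*np.linspace(0, 1, nx))**2
u_next = A.dot(u)  # one time step, identical to the dense product
print(u_next[:3])
```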
Now examine heat_eqn_cl.py
to see how to read parameters off the command line.
Another useful trick is to store parameters in a dictionary for easy lookup, rather than cluttering the namespace with dozens of variables.
You can see how inputting the parameters from a file can let you handle units automatically as well (e.g., dx = 1e-4 m
).
The Mandelbrot set is defined by iterating the map $z_{n+1} = z_{n}^2 + c$ (with $z_{0} = 0$) for each point $c$ in the complex plane. If $\left|z_n\right| \rightarrow \infty$ as $n \rightarrow \infty$, then the point $c$ is not part of the Mandelbrot set. Conversely, if $z_n$ remains bounded no matter how large $n$ becomes, $c$ is part of the set.
From Wikipedia, "Mandelbrot set images are made by sampling complex numbers and determining for each whether the result tends towards infinity when a particular mathematical operation is iterated on it. Treating the real and imaginary parts of each number as image coordinates, pixels are colored according to how rapidly the sequence diverges, if at all."
The following code implements the escape-time algorithm, which tells you how many iterations until a point "escapes", or becomes unbounded, if it does so under a certain maximum limit. This version does not utilize complex numbers, although a version could be written which does so.
This version lumps everything into one function.
def mandelbrot(minR, maxR, minI, maxI, samples=51, iters=25):
"""
Generate the Mandelbrot set within the boundaries of the limits
maxR, minR, minI, maxI with samples and a maximum iteration of iters.
"""
# Generate a mesh for the set.
setR = np.linspace(minR, maxR, samples)
setI = np.linspace(minI, maxI, samples)
# Calculate the values at each point of the mesh by the escape-time
# fractal algorithm.
pts = np.zeros([samples, samples])
for ii in range(len(setR)):
for jj in range(len(setI)):
it = 0
x = 0.0
y = 0.0
xx = setR[ii]
yy = setI[jj]
# Look for escape---i.e., does the value settle down in a few
# iterations or does it keep going?
while(x * x + y * y < 4 and it < iters):
xtmp = x * x - y * y + xx
y = 2 * x * y + yy
x = xtmp
it += 1
pts[ii, jj] = it
return setR, setI, pts
# Plot boundaries
minR = -2.25
maxR = 0.75
minI = -1.5
maxI = 1.5
samples = 201
iters = 20
x, y, z = mandelbrot(minR, maxR, minI, maxI, samples, iters)
z = z.transpose()
mpl.rcParams['figure.figsize']=[8,8]
plt.imshow(z, interpolation='nearest')
plt.show()
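As mentioned above, a version using complex numbers could be written. A vectorized NumPy sketch of the same escape-time count (same grid boundaries; the function name is illustrative):

```python
import numpy as np

def mandelbrot_complex(samples=201, iters=20):
    # Build the grid of c values directly as complex numbers.
    re = np.linspace(-2.25, 0.75, samples)
    im = np.linspace(-1.5, 1.5, samples)
    c = re[np.newaxis, :] + 1j*im[:, np.newaxis]
    z = np.zeros_like(c)
    counts = np.zeros(c.shape, dtype=int)
    for _ in range(iters):
        mask = np.abs(z) < 2            # points that have not yet escaped
        z[mask] = z[mask]**2 + c[mask]  # iterate only the survivors
        counts[mask] += 1
    return counts

counts = mandelbrot_complex()
```

Points still inside after iters iterations carry the maximum count, exactly as in the loop version.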
Let's illustrate the use of a parameter dictionary and split up the functions a little better. We'll also show better form and use the __main__
check to see if this file is being import
ed or executed.
The mand_alg
function is a great candidate for parallelization as well since each point is independent of the others.
def mand_mesh(params):
"""Generate a mesh for the Mandelbrot set calculation."""
setR = np.linspace(params['r.min'], params['r.max'], params['samples'])
setI = np.linspace(params['i.min'], params['i.max'], params['samples'])
pts = np.zeros([params['samples'], params['samples']])
return (setR, setI, pts)
def mand_escape(params, xx, yy):
# Look for escape---i.e., does the value settle down in a few
# iterations or does it keep going?
x, y = 0.0, 0.0
it = 0
while(x * x + y * y < 4 and it < params['iter.max']):
z_np1 = x * x - y * y + xx
y = 2 * x * y + yy
x = z_np1
it += 1
return it
def mand_alg(params):
"""
Generate the Mandelbrot set within the boundaries of the limits
r.max, r.min, i.min, i.max with samples and a maximum iteration of iter.max.
(These are keys in params dictionary.)
"""
# Calculate the values at each point of the mesh by the escape-time
# fractal algorithm.
setR, setI, pts = mand_mesh(params)
for ii in range(params['samples']):
for jj in range(params['samples']):
xx = setR[ii]
yy = setI[jj]
pts[ii, jj] = mand_escape(params, xx, yy)
return setR, setI, pts
def mand_plot(z):
mpl.rcParams['figure.figsize']=[8,8]
plt.imshow(z, interpolation='nearest')
plt.show()
if __name__ == "__main__":
params = dict()
params['r.min'] = -2.25
params['r.max'] = 0.75
params['i.min'] = -1.50
params['i.max'] = 1.50
params['samples'] = 401
params['iter.max'] = 20
x, y, z = mand_alg(params)
z = z.transpose()
mand_plot(z)
Neal Davis developed these materials for Computational Science and Engineering at the University of Illinois at Urbana–Champaign.
This content is available under a [Creative Commons Attribution 3.0 Unported License](https://creativecommons.org/licenses/by/3.0/).