Rationale
A fast and simple interpolation routine is needed in the Stochastic Dynamic Programming algorithm. The "fast and simple" requirement leads to choose the linear interpolation method.
The method needs to be applicable to N-dimensional input, i.e. to interpolate a scalar function :
$$ x\mapsto f(x)\in\mathbb{R} ,\quad x\in D \subset \mathbb{R}^n$$where the dimension $n$ will be in practice less than 3 or 4.
We here try to evaluate what routines can be found "off-the-shelf".
References
scipy.interpolate
doc http://docs.scipy.org/doc/scipy/reference/interpolate.htmlscipy.ndimage
docdolo
project: https://github.com/EconForge/dolo/tree/master/dolo/numeric/interpolation (Multilinear interpolation removed since then...)Update February 2015: add scipy.interpolate.RegularGridInterpolator in the benchmark (added in scipy
0.14, released in May 2014)
import numpy as np
from numpy import pi
import matplotlib.pyplot as plt
import matplotlib as mpl
%matplotlib inline
from scipy.interpolate import (
LinearNDInterpolator, RectBivariateSpline,
RegularGridInterpolator)
from scipy.ndimage import map_coordinates
from dolointerpolation import MultilinearInterpolator
# Tweak how images are plotted with imshow
mpl.rcParams['image.interpolation'] = 'none' # no interpolation
mpl.rcParams['image.origin'] = 'lower' # origin at lower left corner
mpl.rcParams['image.cmap'] = 'RdBu_r'
def f_2d(x,y):
'''a function with 2D input to interpolate on [0,1]'''
twopi = 2*pi
return np.exp(-x)*np.cos(x*2*pi)*np.sin(y*2*pi)
# quick check :
f_2d(0,0.25)
1.0
def f_3d(x,y,z):
'''a function with 3D input to interpolate on [0,1]'''
twopi = 2*pi
return np.sin(x*2*pi)*np.sin(y*2*pi)*np.sin(z*2*pi)
# quick check :
f_3d(0.25,0.25,0.25)
1.0
Ndata = 50
xgrid = np.linspace(0,1, Ndata)
ygrid = np.linspace(0,1, Ndata+1) # use a slighly different size to check differences
zgrid = np.linspace(0,1, Ndata+2)
f_2d_grid = f_2d(xgrid.reshape(-1,1), ygrid)
plt.imshow(f_2d_grid.T)
plt.title(u'image of a 2D function ({}² pts)'.format(Ndata));
f_2d_grid.shape
(50, 51)
The 3D case
f_3d_grid = f_3d(xgrid.reshape(-1,1,1), ygrid.reshape(1,-1,1), zgrid)
f_3d_grid.shape
(50, 51, 52)
Try several interpolation routines.
Notice how each routine has its special way (API) to
# Define the grid to interpolate on :
Ninterp = 1000
xinterp = np.linspace(0,1, Ninterp)
yinterp = np.linspace(0,1, Ninterp+1) # use a slighly different size to check differences
zinterp = np.linspace(0,1, 5) # small dimension to avoid size explosion
LinearNDInterpolator
uses an unstructured data which is provided as a list of points. There is also a cousin : NearestNDInterpolator
LinearNDInterpolator(points, values)
(documentation)
Performance :
instanciation :
18 ms for 50x50 pts,
but 19.1 s for 50x50x50 pts
evaluation : 45 ms for 1 Mpts.
# Build data for the interpolator
points_x, points_y = np.broadcast_arrays(xgrid.reshape(-1,1), ygrid)
points = np.vstack((points_x.flatten(),
points_y.flatten())).T
values = f_2d_grid.flatten()
# Build
%timeit f_2d_interp = LinearNDInterpolator(points, values)
100 loops, best of 3: 17.7 ms per loop
f_2d_interp = LinearNDInterpolator(points, values)
# Evaluate
%timeit f_2d_interp(xinterp.reshape(-1,1), yinterp)
10 loops, best of 3: 44.7 ms per loop
# Display
plt.imshow(f_2d_interp(xinterp.reshape(-1,1), yinterp).T)
plt.title(u'interpolation of a 2D function ({}² pts)'.format(Ninterp));
The 3D case : terrible performance !
# Build data for the interpolator
points_x, points_y, points_z = np.broadcast_arrays(xgrid.reshape(-1,1,1), ygrid.reshape(1,-1,1), zgrid)
points = np.vstack((points_x.flatten(),
points_y.flatten(),
points_z.flatten()
)).T
values = f_3d_grid.flatten()
points.shape
(132600, 3)
%time f_3d_interp = LinearNDInterpolator(points, values)
CPU times: user 19.1 s, sys: 88 ms, total: 19.2 s Wall time: 19.2 s
f_3d_interp = LinearNDInterpolator(points, values)
# Evaluate:
%time f_3d_interp(xinterp.reshape(-1,1), np.linspace(0,1,5), 0.25);
CPU times: user 2min 44s, sys: 0 ns, total: 2min 44s Wall time: 2min 44s
array([[ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00], [ 0.00000000e+00, 6.23317279e-03, 7.64852773e-19, -6.23317279e-03, -1.53553034e-18], [ 0.00000000e+00, 1.24663456e-02, 1.52970555e-18, -1.24663456e-02, -3.07106068e-18], ..., [ 0.00000000e+00, -1.25138148e-02, -1.53553034e-18, 1.25138148e-02, 3.07106068e-18], [ 0.00000000e+00, -6.25690738e-03, -7.67765169e-19, 6.25690738e-03, 1.53553034e-18], [ 0.00000000e+00, -2.44098405e-16, -2.99525375e-32, 2.44098405e-16, 5.99050750e-32]])
Interpolation on a slice along the x-axis, just to check that it works.
f_3d_interp_grid = f_3d_interp(xinterp.reshape(-1,1,1), 0.25, 0.25)
plt.plot(f_3d_interp_grid.flatten());
for 2D interpolation only !
RectBivariateSpline(x, y, z, bbox=[None, None, None, None], kx=3, ky=3, s=0)
(documentation)
Performance
%%timeit # Build
f_2d_interp = RectBivariateSpline(xgrid, ygrid, f_2d_grid)
10000 loops, best of 3: 192 µs per loop
f_2d_interp = RectBivariateSpline(xgrid, ygrid, f_2d_grid)
%%timeit # Evaluate
f_2d_interp(xinterp, yinterp)
10 loops, best of 3: 21.2 ms per loop
# Display
plt.imshow(f_2d_interp(xinterp, yinterp).T)
plt.title(u'interpolation of a 2D function ({}² pts)'.format(Ninterp));
notice that this method has no separation between instanciation and evaluation !
map_coordinates(input, coordinates, output=None, order=3, mode='constant', cval=0.0, prefilter=True)
(documentation)
Performance :
# Prepare the coordinates to evaluate the array on :
points_x, points_y = np.broadcast_arrays(xinterp.reshape(-1,1), yinterp)
coord = np.vstack((points_x.flatten()*(len(xgrid)-1) , # a weird formula !
points_y.flatten()*(len(ygrid)-1)))
coord.shape
(2, 1001000)
%%timeit # Build and Evaluate
f_2d_interp = map_coordinates(f_2d_grid, coord, order=1)
# Reshape
f_2d_interp = f_2d_interp.reshape(len(xinterp), len(yinterp))
10 loops, best of 3: 56.4 ms per loop
# Display
f_2d_interp = map_coordinates(f_2d_grid, coord, order=1).reshape(len(xinterp), len(yinterp))
plt.imshow(f_2d_interp.T)
plt.title(u'interpolation of a 2D function ({}² pts)'.format(Ninterp));
The 3D case
# Prepare the coordinates to evaluate the array on :
points_x, points_y, points_z = np.broadcast_arrays(xinterp.reshape(-1,1,1), yinterp.reshape(1,-1,1), zinterp)
coord = np.vstack((points_x.flatten()*(len(xgrid)-1), # a weird formula !
points_y.flatten()*(len(ygrid)-1),
points_z.flatten()*(len(zgrid)-1)
))
coord.shape
(3, 5005000)
%%timeit # Build and Evaluate
f_3d_interp = map_coordinates(f_3d_grid, coord, order=1)
# Reshape
f_3d_interp = f_3d_interp.reshape(len(xinterp), len(yinterp), len(zinterp))
1 loops, best of 3: 511 ms per loop
f_3d_interp = map_coordinates(f_3d_grid, coord, order=1)
f_3d_interp = f_3d_interp.reshape(len(xinterp), len(yinterp), len(zinterp))
f_3d_interp.shape
(1000, 1001, 5)
n_z = 1
plt.imshow(f_3d_interp[:,:,n_z])
plt.title('f_3d(x,y, z={:.2f})'.format(zinterp[n_z]))
plt.colorbar();
Performance :
References :
smin = [xgrid[0], ygrid[0]]
smax = [xgrid[-1], ygrid[-1]]
orders = [len(xgrid), len(ygrid)]
print(smin, smax, orders)
([0.0, 0.0], [1.0, 1.0], [50, 51])
%%timeit
minterp = MultilinearInterpolator(smin,smax,orders)
minterp.set_values(np.atleast_2d(f_2d_grid.flatten()))
100000 loops, best of 3: 9.47 µs per loop
minterp.grid.shape
(2, 2550)
f_2d_grid.shape
(50, 51)
# Prepare the coordinates to evaluate the array on :
points_x, points_y = np.broadcast_arrays(xinterp.reshape(-1,1), yinterp)
coord = np.vstack((points_x.flatten(),
points_y.flatten()))
coord.shape
(2, 1001000)
%timeit f_2d_interp = minterp(coord).reshape(len(xinterp), len(yinterp))
100 loops, best of 3: 12.6 ms per loop
# Display
f_2d_interp = minterp(coord).reshape(len(xinterp), len(yinterp))
plt.imshow(f_2d_interp.T)
plt.title(u'interpolation of a 2D function ({}² pts)'.format(Ninterp));
plt.colorbar();
<matplotlib.colorbar.Colorbar instance at 0x7fe0c49d0b00>
smin = [xgrid[0], ygrid[0], zgrid[0]]
smax = [xgrid[-1], ygrid[-1], zgrid[-1]]
orders = [len(xgrid), len(ygrid), len(zgrid)]
print(smin, smax, orders)
([0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [50, 51, 52])
minterp = MultilinearInterpolator(smin,smax,orders)
minterp.set_values(np.atleast_2d(f_3d_grid.flatten()))
# Prepare the coordinates to evaluate the array on :
points_x, points_y, points_z = np.broadcast_arrays(xinterp.reshape(-1,1,1), yinterp.reshape(1,-1,1), zinterp)
coord = np.vstack((points_x.flatten(), # a weird formula !
points_y.flatten(),
points_z.flatten()
))
coord.shape
(3, 5005000)
A very nice interpolation performance !
%timeit minterp(coord)
10 loops, best of 3: 105 ms per loop
f_3d_interp = minterp(coord)
f_3d_interp = f_3d_interp.reshape(len(xinterp), len(yinterp), len(zinterp))
f_3d_interp.shape
(1000, 1001, 5)
n_z = 1
plt.imshow(f_3d_interp[:,:,n_z])
plt.title('f_3d(x,y, z={:.2f})'.format(zinterp[n_z]))
plt.colorbar();
Interpolation throuput:
print('{:.1f} Mpts/s'.format(5e6 / 0.105 /1e6) )
47.6 Mpts/s
actual interpolation code: https://github.com/scipy/scipy/blob/master/scipy/interpolate/interpolate.py#L1577 (pure Python, no Cython involved!)
API:
RegularGridInterpolator(points, values, [...])
points
: tuple of ndarray of float, with shapes (m1, ), ..., (mn, ).value
: array_like, shape (m1, ..., mn, ...).Performance
%%timeit # Instanciate the interpolator
f_2d_interp = RegularGridInterpolator((xgrid, ygrid), f_2d_grid)
10000 loops, best of 3: 21.6 µs per loop
f_2d_interp = RegularGridInterpolator((xgrid, ygrid), f_2d_grid)
# Prepare the coordinates to evaluate the array on :
points_x, points_y = np.broadcast_arrays(xinterp.reshape(-1,1), yinterp)
coord = np.vstack((points_x.flatten(),
points_y.flatten()))
coord.shape
(2, 1001000)
f_2d_interp
<scipy.interpolate.interpolate.RegularGridInterpolator at 0x7fe0c4933ed0>
%%timeit # Interpolate
f_2d_interp_res = f_2d_interp(coord.T).reshape(len(xinterp), len(yinterp))
10 loops, best of 3: 112 ms per loop
# Display
f_2d_interp_res = f_2d_interp(coord.T).reshape(len(xinterp), len(yinterp))
plt.imshow(f_2d_interp_res.T)
plt.title(u'interpolation of a 2D function ({}² pts)'.format(Ninterp));
plt.colorbar();
f_3d_interp = RegularGridInterpolator((xgrid, ygrid, zgrid), f_3d_grid)
# Prepare the coordinates to evaluate the array on :
points_x, points_y, points_z = np.broadcast_arrays(xinterp.reshape(-1,1,1), yinterp.reshape(1,-1,1), zinterp)
coord = np.vstack((points_x.flatten(), # a weird formula !
points_y.flatten(),
points_z.flatten()
))
coord.shape
(3, 5005000)
%%timeit # Interpolate
f_3d_interp_res = f_3d_interp(coord.T).reshape(len(xinterp), len(yinterp), len(zinterp))
1 loops, best of 3: 1.8 s per loop
table
Method | Instanciation | Evaluation (1Mpts 2D) | |
---|---|---|---|
dolo | 0.010 ms | 13 ms | 105 ms |
RegularGridInterpolator | 0.022 ms | 116 ms | 1.8 s |
Conclusion: Dolo's Multilinear interpolation (for an even grid only) is still the best!