Curve Fitting¶

In [2]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
from scipy.interpolate import interp1d
from scipy.optimize    import curve_fit


Curve fitting is used to represent a set of data by a model function.

• The model function (the curve) can be a form meant for convenience. That is, we can represent the data using a function instead of the raw data itself.
• This can give some flexibility in data analysis (integration, differentiation, compact representation, etc.)
• Often, there is an underlying physical process assumed that created the data. Our model may be an equation representing this process.
• Find the best fit parameters for the curve.
• Either we want the parameters for their own sake,
• Or, we may just want parameters that allow the curve to fit well.

Recall the procedure in Excel

1. Use a trendline with a plot.
• Several model forms available: polynomial, linear, power law, logarithic, exponential, etc.
• Sometimes needed to massage the data to get in a model form.
2. Use the solver to minimize the sum square error by changing the model parameters.

Python

• Two methods:
1. polyfit for polynomial fits (including linear)
2. curve_fit for general curve fits

Polynomial fits¶

• p = np.polyfit(x, y, polynomial_order)
• np.polyval(p, x)
• Set or obtain your given x and y data
• Get a polyfit object using polyfit
• Evaluate the polynomial at desired locations using polyval

Exercise¶

• Given the data below,
• Fit a third order polynomial.
• Print out the polynomial coefficients.
• Plot the data and the polynomial curve
• Use more points for the curve than just the xg data.
In [3]:
xg = np.array([0., 1., 2., 3., 4., 5.])         # given x data
yg = np.array([0, 0.8, 0.9, 0.1, -0.8, -1.0])   # given y data

#-------------

p3 = np.polyfit(xg, yg, 3)

xx = np.linspace(0,5,1000)
yy = np.polyval(p3, xx)

plt.rc('font', size=14)
plt.plot(xg, yg, 'o')
plt.plot(xx, yy, '-')
plt.xlabel('x')
plt.ylabel('y')
plt.legend(['data', 'fit'], frameon=False);


Excercise¶

• Given the data below for $2\le x\le 8$
• Fit a 6th order polynomial to the data.
• Plot the results for $0\le x\le 12$.
In [8]:
xg = np.array([2, 3, 4, 5, 6, 7, 8])
yg = np.array([1, 2, 1, 2, 1, 2, 1])

#-------------

p6 = np.polyfit(xg, yg, 6)

xx = np.linspace(-2,12,1000)
yy = np.polyval(p6, xx)

plt.plot(xg,yg, 'o')
plt.plot(xx,yy, '-')
plt.xlabel('x')
plt.ylabel('y')
plt.legend(['data', 'fit'], frameon=False);

plt.figure()
plt.plot(xg,yg, 'o')
plt.plot(xx,yy, '-')
plt.xlabel('x')
plt.ylabel('y')
plt.legend(['data', 'fit'], frameon=False);
plt.xlim([2,8])
plt.ylim([0,4])

Out[8]:
(0, 4)

Question¶

• What does this tell you about using the polynomial fit outside of the original range of the data?

General Curve Fits¶

We can fit a general function f(x; a, b, c) where f is a function of x with parameters a, b, c that we want to optimize for a set of given data.

Use curve_fit available from from scipy.optimize import curve_fit

• params, extras = curve_fit(f, xg, yg)

Example from scipy.org¶

Fit the following function to the given data $$f(x) = a\exp(-bx) + c$$

• Plot the data and the curve fit
In [10]:
#-------- Set some given data (normally, this is already available)

xg = np.linspace(0,4,50)
yg = 2.5*np.exp(-1.3*xg)+0.5 + 0.2*np.random.normal(size=len(xg))

#-------- Define the function with parameters: x comes first

def f(x, a, b, c) :
return a*np.exp(-b*x) + c

#-------- Do the curve fit

abc, extras = curve_fit(f, xg, yg)

a = abc[0]
b = abc[1]
c = abc[2]

xx = xg
yy = f(xx,a,b,c)

#-------- Output / plot the results

print(f"a={abc[0]:.4f}, b={abc[1]:.4f}, c={abc[2]:.4f}")

plt.plot(xg, yg, 'o')
plt.plot(xx, yy, '-')
plt.legend(['data', 'fit'], frameon=False)
plt.xlabel('x')
plt.ylabel('y');

a=2.4418, b=1.3884, c=0.5064


Question¶

• What appears to be missing from this curve_fit function call?

• Use help(curve_fit) for details on this and other options.