Curve Fitting

In [2]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
from scipy.integrate   import quad       
from scipy.interpolate import interp1d
from scipy.optimize    import curve_fit

Curve fitting is used to represent a set of data by a model function.

  • The model function (the curve) can be a form meant for convenience. That is, we can represent the data using a function instead of the raw data itself.
    • This can give some flexibility in data analysis (integration, differentiation, compact representation, etc.)
  • Often, there is an underlying physical process assumed that created the data. Our model may be an equation representing this process.
  • Find the best fit parameters for the curve.
    • Either we want the parameters for their own sake,
    • Or, we may just want parameters that allow the curve to fit well.

Recall the procedure in Excel

  1. Use a trendline with a plot.
    • Several model forms available: polynomial, linear, power law, logarithic, exponential, etc.
    • Sometimes needed to massage the data to get in a model form.
  2. Use the solver to minimize the sum square error by changing the model parameters.

Python

  • Two methods:
    1. polyfit for polynomial fits (including linear)
    2. curve_fit for general curve fits

Polynomial fits

  • p = np.polyfit(x, y, polynomial_order)
  • np.polyval(p, x)
  • Set or obtain your given x and y data
  • Get a polyfit object using polyfit
  • Evaluate the polynomial at desired locations using polyval

Exercise

  • Given the data below,
  • Fit a third order polynomial.
  • Print out the polynomial coefficients.
  • Plot the data and the polynomial curve
    • Use more points for the curve than just the xg data.
In [3]:
xg = np.array([0., 1., 2., 3., 4., 5.])         # given x data
yg = np.array([0, 0.8, 0.9, 0.1, -0.8, -1.0])   # given y data

#-------------

p3 = np.polyfit(xg, yg, 3)

xx = np.linspace(0,5,1000)
yy = np.polyval(p3, xx)

plt.rc('font', size=14)
plt.plot(xg, yg, 'o')
plt.plot(xx, yy, '-')
plt.xlabel('x')
plt.ylabel('y')
plt.legend(['data', 'fit'], frameon=False);

Excercise

  • Given the data below for $2\le x\le 8$
  • Fit a 6th order polynomial to the data.
  • Plot the results for $0\le x\le 12$.
In [8]:
xg = np.array([2, 3, 4, 5, 6, 7, 8])
yg = np.array([1, 2, 1, 2, 1, 2, 1])

#-------------

p6 = np.polyfit(xg, yg, 6)

xx = np.linspace(-2,12,1000)
yy = np.polyval(p6, xx)

plt.plot(xg,yg, 'o')
plt.plot(xx,yy, '-')
plt.xlabel('x')
plt.ylabel('y')
plt.legend(['data', 'fit'], frameon=False);

plt.figure()
plt.plot(xg,yg, 'o')
plt.plot(xx,yy, '-')
plt.xlabel('x')
plt.ylabel('y')
plt.legend(['data', 'fit'], frameon=False);
plt.xlim([2,8])
plt.ylim([0,4])
Out[8]:
(0, 4)

Question

  • What does this tell you about using the polynomial fit outside of the original range of the data?

General Curve Fits

We can fit a general function f(x; a, b, c) where f is a function of x with parameters a, b, c that we want to optimize for a set of given data.

Use curve_fit available from from scipy.optimize import curve_fit

  • params, extras = curve_fit(f, xg, yg)

Example from scipy.org

Fit the following function to the given data $$f(x) = a\exp(-bx) + c$$

  • Plot the data and the curve fit
In [10]:
#-------- Set some given data (normally, this is already available)

xg = np.linspace(0,4,50)
yg = 2.5*np.exp(-1.3*xg)+0.5 + 0.2*np.random.normal(size=len(xg))

#-------- Define the function with parameters: x comes first

def f(x, a, b, c) :     
    return a*np.exp(-b*x) + c

#-------- Do the curve fit

abc, extras = curve_fit(f, xg, yg)

a = abc[0]
b = abc[1]
c = abc[2]

xx = xg
yy = f(xx,a,b,c)

#-------- Output / plot the results

print(f"a={abc[0]:.4f}, b={abc[1]:.4f}, c={abc[2]:.4f}")

plt.plot(xg, yg, 'o')
plt.plot(xx, yy, '-')
plt.legend(['data', 'fit'], frameon=False)
plt.xlabel('x') 
plt.ylabel('y');
a=2.4418, b=1.3884, c=0.5064

Question

  • What appears to be missing from this curve_fit function call?

Answer

  • An initial guess for the parameters is not required, but it is possible to give this if desired.
    • Use help(curve_fit) for details on this and other options.
In [ ]:
 
In [ ]: