Fitting a function to data points with scipy.optimize.curve_fit is easy.
But how can we make it clear to users how the fit parameter errrors are computed?
Links:
import numpy as np
from scipy.optimize import curve_fit
def f(x, slope, intercept):
"""Linear function (two fit parameters)"""
return slope * x + intercept
# Example data
xdata = np.array([0, 1, 2, 3, 4, 5])
ydata = np.array([1, 1, 5, 7, 8, 12])
sigma = np.array([1, 2, 1, 2, 1, 2])
# Here's what curve_fit gives by default
popt, pcov = curve_fit(f, xdata, ydata, p0=[2, 0], sigma=sigma)
perr_scaled = np.sqrt(np.diag(pcov))
#print('slope: {0:.5f} +- {1:.5f}'.format(popt[0], perr[0]))
#print('intercept: {0:.5f} +- {1:.5f}'.format(popt[1], perr[1]))
print 'popt:', popt
print 'perr_scaled:', perr_scaled
popt: [ 1.95283019 0.77044025] perr_scaled: [ 0.20659803 0.57204404]
# But the errors are *not* what e.g. physicists expect!
# Let's use iminuit (https://github.com/iminuit/iminuit) to find
# the errors a physicist expects.
from iminuit import Minuit
def chi2(slope, intercept):
"""Define fit statistic, interpreting sigma as errors on ydata"""
chi = (ydata - f(xdata, slope, intercept)) / sigma
return np.sum(chi ** 2)
minuit = Minuit(chi2, pedantic=False, print_level=0)
minuit.migrad()
minuit.hesse()
popt = np.array(minuit.values.values())
perr = np.array(minuit.errors.values())
print 'popt:', popt
print 'perr:', perr
# Note that the best-fit values are the same as reported by curve_fit,
# but the parameter errors are very different!
popt: [ 1.95283019 0.77044025] perr: [ 0.30714756 0.85045308]
# How to get the HESSE errors from curve_fit?
chi = (ydata - f(xdata, *popt)) / sigma
chi2 = (chi ** 2).sum()
dof = len(ydata) - len(popt)
perr_scale_factor = 1 / np.sqrt((chi2 / dof))
print 'perr:', perr_scale_factor * perr_scaled
perr: [ 0.30714756 0.85045308]
Because it is equivalent to assuming that the sigmas only represent relative weights of the data points, but the absolute scale of the sigmas is actually unknown.
In this case it makes sense to scale the sigmas so that a chi ** 2 / dof = 1
is obtained, corresponding to an average goodness of fit.
As shown below, dividing sigma
by perr_scale_factor
is equivalent to multiplying perr
by perr_scale_factor
and forces chi ** 2 / dof = 1
.
def chi2(slope, intercept):
"""Define fit statistic, interpreting sigma as errors on ydata"""
chi = (ydata - f(xdata, slope, intercept)) / (sigma / perr_scale_factor)
return np.sum(chi ** 2)
minuit = Minuit(chi2, pedantic=False, print_level=0)
minuit.migrad()
minuit.hesse()
popt = np.array(minuit.values.values())
perr2 = np.array(minuit.errors.values())
print 'popt:', popt
print 'perr_scaled:', perr2
print 'chi2 / dof:', chi2(**minuit.values) / dof
popt: [ 1.95283019 0.77044025] perr_scaled: [ 0.20659803 0.57204404] chi2 / dof: 1.0