Investigating scipy.optimize.curve_fit covariance output

In Aug 2011 there was a thread Unexpected covariance matrix from scipy.optimize.curve_fit where Christoph Deil reported that "scipy.optimize.curve_fit returns parameter errors that don't scale with sigma, the standard deviation of ydata, as I expected." Today I independently came to the same conclusion.

This thread generated some discussion but seemingly no agreement that the covariance output of curve_fit is not what would be expected. I think the discussion wasn't as focused as possible because the example was too complicated. With that I provide here about the simplest possible example, which is fitting a constant to a constant dataset, aka computing the mean and error on the mean. Since we know the answers we can compare the output of curve_fit.

In [1]:
import numpy as np
from scipy.optimize import curve_fit

Generate a normally distributed dataset of 100 points with $\mu = 0.0$ and $\sigma_y = 1.0$

In [2]:
x = np.arange(100, dtype=np.float)
y = np.zeros_like(x)
yn = y + np.random.normal(size=len(y))

Define a constant function

In [3]:
def const(x, a):
    return a * np.ones_like(x)

Fit a constant to a set of normally distributed points

This is equivalent to determining the mean and uncertainty on the mean. We know the answer analytically:

  • $\mu = 0.0$
  • $\sigma_\mu = \sigma_y / N^{1/2}$

If no sigma parameter is provided to curve_fit then we get the expected results:

In [15]:
popt, pcov = curve_fit(const, x, yn)
print 'Best fit constant =', popt[0]
print 'Uncertainty on constant =', np.sqrt(pcov[0, 0])
plot(x, yn, '.')
plot(x, const(x, popt[0]))
grid()
Best fit constant = -0.114460255805
Uncertainty on constant = 0.0990467843199

But now setting sigma=100 doesn't change the output covariance at all:

In [22]:
sigma = 100
popt, pcov = curve_fit(const, x, yn, sigma=sigma)
print 'Best fit constant =', popt[0]
print 'Uncertainty on constant =', np.sqrt(pcov[0, 0])
Best fit constant = -0.114460248457
Uncertainty on constant = 0.099046785584

Try to fix the covariance matrix using the method suggested by @cdeil.

This gives the expected result.

In [25]:
chi = (yn - const(x, *popt)) / sigma
chi2 = (chi ** 2).sum()
dof = len(x) - len(popt)
factor = (chi2 / dof)
pcov_sigma = pcov / factor
print chi2
print dof
print factor
print np.sqrt(pcov_sigma[0, 0])
0.00971216310043
99
9.81026575801e-05
9.99999998803

Use Sherpa to do the fit and compute covariance

Sherpa is a modeling and fitting package used in astronomy. There is a relatively steep learning curve and installation isn't so easy, but the optimization methods are robust.

In [6]:
import sherpa.ui as ui  # sherpa user interface

Load dataset 1 with the same x and yn dataset, with sigma=100

In [7]:
ui.load_arrays(1, x, yn, 100.0 * np.ones_like(yn))

Set the model for dataset 1 to a constant model named c1

In [8]:
ui.set_model(1, ui.const1d.c1)
c1.c0.min = -100  # set the parameter bounds to -100 to +100
c1.c0.max = 100

Fit the dataset using Levenberg-Marquart minimization and $\chi^2$ statistics

The answer is exactly the same as curve_fit.

In [9]:
ui.fit()
Dataset               = 1
Method                = levmar
Statistic             = chi2gehrels
Initial fit statistic = 0.0221324
Final fit statistic   = 0.00971216 at function evaluation 4
Data points           = 100
Degrees of freedom    = 99
Probability [Q-value] = 1
Reduced statistic     = 9.81027e-05
Change in statistic   = 0.0124202
   c1.c0          -0.11446    

In [10]:
ui.plot_fit()  # this shows the large error bars

Compute the covariance

The diagonal element of the covariance is automatically converted to an effective $1-\sigma$ confidence interval on the fitted parameter. In this case, we used $\sigma_y=100$, so the answer of $\sigma_\mu = \sigma_y / N^{1/2} = 100 / 100^{1/2} = 10$ is correct.

In [11]:
ui.covar()
Dataset               = 1
Confidence Method     = covariance
Iterative Fit Method  = None
Fitting Method        = levmar
Statistic             = chi2gehrels
covariance 1-sigma (68.2689%) bounds:
   Param            Best-Fit  Lower Bound  Upper Bound
   -----            --------  -----------  -----------
   c1.c0            -0.11446          -10           10