`scipy.optimize.curve_fit`

covariance outputIn Aug 2011 there was a thread Unexpected covariance matrix from scipy.optimize.curve_fit where Christoph Deil reported that "scipy.optimize.curve_fit returns parameter errors that don't scale with sigma, the standard deviation of ydata, as I expected." Today I independently came to the same conclusion.

This thread generated some discussion but seemingly no agreement that the covariance output of `curve_fit`

is not what would be expected. I think the discussion wasn't as focused as possible because the example was too complicated. With that I provide here about the simplest possible example, which is fitting a constant to a constant dataset, aka computing the mean and error on the mean. Since we know the answers we can compare the output of `curve_fit`

.

In [1]:

```
import numpy as np
from scipy.optimize import curve_fit
```

In [2]:

```
x = np.arange(100, dtype=np.float)
y = np.zeros_like(x)
yn = y + np.random.normal(size=len(y))
```

In [3]:

```
def const(x, a):
return a * np.ones_like(x)
```

This is equivalent to determining the mean and uncertainty on the mean. We know the answer analytically:

- $\mu = 0.0$
- $\sigma_\mu = \sigma_y / N^{1/2}$

If no `sigma`

parameter is provided to `curve_fit`

then we get the expected results:

In [15]:

```
popt, pcov = curve_fit(const, x, yn)
print 'Best fit constant =', popt[0]
print 'Uncertainty on constant =', np.sqrt(pcov[0, 0])
plot(x, yn, '.')
plot(x, const(x, popt[0]))
grid()
```

But now setting `sigma=100`

doesn't change the output covariance at all:

In [22]:

```
sigma = 100
popt, pcov = curve_fit(const, x, yn, sigma=sigma)
print 'Best fit constant =', popt[0]
print 'Uncertainty on constant =', np.sqrt(pcov[0, 0])
```

This gives the expected result.

In [25]:

```
chi = (yn - const(x, *popt)) / sigma
chi2 = (chi ** 2).sum()
dof = len(x) - len(popt)
factor = (chi2 / dof)
pcov_sigma = pcov / factor
print chi2
print dof
print factor
print np.sqrt(pcov_sigma[0, 0])
```

Sherpa is a modeling and fitting package used in astronomy. There is a relatively steep learning curve and installation isn't so easy, but the optimization methods are robust.

In [6]:

```
import sherpa.ui as ui # sherpa user interface
```

**Load dataset 1 with the same x and yn dataset, with sigma=100**

In [7]:

```
ui.load_arrays(1, x, yn, 100.0 * np.ones_like(yn))
```

**Set the model for dataset 1 to a constant model named c1**

In [8]:

```
ui.set_model(1, ui.const1d.c1)
c1.c0.min = -100 # set the parameter bounds to -100 to +100
c1.c0.max = 100
```

**Fit the dataset using Levenberg-Marquart minimization and $\chi^2$ statistics**

The answer is exactly the same as `curve_fit`

.

In [9]:

```
ui.fit()
```

In [10]:

```
ui.plot_fit() # this shows the large error bars
```

**Compute the covariance**

The diagonal element of the covariance is automatically converted to an effective $1-\sigma$ confidence interval on the fitted parameter. In this case, we used $\sigma_y=100$, so the answer of $\sigma_\mu = \sigma_y / N^{1/2} = 100 / 100^{1/2} = 10$ is correct.

In [11]:

```
ui.covar()
```