name = '2015-07-27-least_squares'
title = "Simple least squares fit to study residuals"
%matplotlib inline
import os
from datetime import datetime
from IPython.core.display import HTML
with open('creative_commons.txt', 'r') as f:
html = f.read()
hour = datetime.utcnow().strftime('%H:%M')
comments="true"
date = '-'.join(name.split('-')[:3])
slug = '-'.join(name.split('-')[3:])
metadata = dict(title=title,
date=date,
hour=hour,
comments=comments,
slug=slug,
name=name)
markdown = """Title: {title}
date: {date} {hour}
comments: {comments}
slug: {slug}
{{% notebook {name}.ipynb cells[2:] %}}
""".format(**metadata)
content = os.path.abspath(os.path.join(os.getcwd(),
os.pardir,
os.pardir,
'{}.md'.format(name)))
with open('{}'.format(content), 'w') as f:
f.writelines(markdown)
html = '''
<small>
<p> This post was written as an IPython notebook.
It is available for <a href='https://ocefpaf.github.com/python4oceanographers/downloads/notebooks/%s.ipynb'>download</a>
or as a static <a href='https://nbviewer.ipython.org/url/ocefpaf.github.com/python4oceanographers/downloads/notebooks/%s.ipynb'>html</a>.</p>
<p></p>
%s''' % (name, name, html)
I got a question from a friend: how do I fit a function, using least-squares, to analyze the residuals. To answer that I cook this notebook up and made a post out of it ;-)
For the example I create a fake tidal signal, that we want fit and remove, added some noise and an offset (mean current).
import numpy as np
def fake_tide(t, M2amp, M2phase):
"""
Generate a minimally realistic-looking fake semidiurnal tide.
t is time in hours
phases are in radians
"""
out = M2amp * np.sin(2 * np.pi * t / 12.42 - M2phase)
return out
N = 500
res = 5
noise = np.random.randn(N)
t = np.arange(N)
amp, phase = 2, 0
u = fake_tide(t, amp, phase) + res + noise
Here is the signal we created.
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(11, 2.75))
ax.plot(t, u, 'darkgreen', label='Signal')
leg = ax.legend()
Now lets try to specify a first guess for the non-linear fitting. Note that the phase is a little off on purpose.
mean = np.mean(u)
std = amp * np.std(u) / (amp**0.5)
# First guess.
first_guess = fake_tide(t, std, 0.25) + mean
fig, ax = plt.subplots(figsize=(11, 2.75))
ax.plot(t, first_guess, 'darkorange', label='First guess')
ax.plot(t, u, 'darkgreen', label='Signal')
leg = ax.legend()
Nice first guess, we did not had to do such a good jib ;-), but now it is time
to feed that into SciPy's optimize.leastsq
to get a least squares fit of the
parameters.
from scipy.optimize import leastsq
optimize_func = lambda x: x[0] * np.sin(2 * np.pi * t / 12.42 + x[1]) + x[2] - u
est_std, est_phase, est_mean = leastsq(optimize_func, [std, phase, mean])[0]
Let's check if the parameters we estimated are close to reality.
print('Orignal: std {:.2f}, phase {:.2f}, and res {:.2f}'.format(std, phase, res))
print('Estimated: std {:.2f}, phase {:.2f}, and res {:.2f}'.format(est_std, est_phase, est_mean))
Orignal: std 2.39, phase 0.00, and res 5.00 Estimated: std 1.98, phase -0.00, and res 5.05
Awesome! Close enough =)
Now lets re-create the signal using the estimated parameters and plot everything.
fit = fake_tide(t, est_std, est_phase) + est_mean
fig, ax = plt.subplots(figsize=(11, 2.75))
ax.plot(t, fit, 'darkorange', label='Fitted')
ax.plot(t, first_guess, 'crimson', label='First guess')
ax.plot(t, u, 'darkgreen', label='Signal')
ax.plot(t, u-fit, 'crimson', label='Residue')
leg = ax.legend()
_ = ax.set_xlim(100, 150)
HTML(html)
This post was written as an IPython notebook. It is available for download or as a static html.