using KernSmooth, Winston
Generate some data and plot it:
srand(12345)
x = randn(250) * 10
y = sin(x) + x ./15.0 + randn(250)
xmn, xmx = extrema(x)
xmn = floor(xmn)
xmx = ceil(xmx)
xgrid = linspace(xmn, xmx, 401)
ytrue = sin(xgrid) + xgrid ./15.0
plot(x, y, ".", xgrid, ytrue, "r-")
We want to estimate the red curve from the data. We can start by picking some arbitrary bandwidths: 0.5, 1.0, and 2.0. The locpoly
function chooses it's own grid based on the x
values on which to estimate the regression.
xgrid2, yhat0_5 = locpoly(x, y, 0.5)
yhat1_0 = locpoly(x, y, 1.0)[2]
yhat2_0 = locpoly(x, y, 2.0)[2];
We can also use the dpill
function which implements a direct plug-in method to select a bandwidth.
h = dpill(x, y)
0.7812098873536838
And then estimate the curve using the selected bandwidth.
yhath = locpoly(x, y, h)[2];
We now plot our estimates.
ymn, ymx = extrema(y)
ymn = floor(ymn) + 1.0
ymx = ceil(ymx) + 2.0
p = FramedPlot(xrange=(xmn, xmx), yrange = (ymn, ymx))
t = Curve(xgrid, ytrue, color="red")
setattr(t, label="truth")
f0_5 = Curve(xgrid2, yhat0_5, color="blue")
setattr(f0_5, label="h = 0.5")
f1_0 = Curve(xgrid2, yhat1_0, color="green")
setattr(f1_0, label="h = 1.0")
f2_0 = Curve(xgrid2, yhat2_0, color="purple")
setattr(f2_0, label="h = 2.0")
fh = Curve(xgrid2, yhath, color="orange")
setattr(fh, label="h = $(round(h, 3))")
s = Points(x, y, kind = "dot")
l = Legend(.1, .9, {t, f0_5, f1_0, f2_0, fh})
add(p, t, f0_5, f1_0, f2_0, fh, s, l)
Save the plot
file(p, "scatter.png", height=400, width=600)