%pylab inline
Populating the interactive namespace from numpy and matplotlib
How to determine the weights?
Simple approach: Squared error
\begin{equation} E(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \| \mathbf{y}(\mathbf{x}_n, \mathbf{w}) - \mathbf{t}_n \|^2 \end{equation}
where $\mathbf{x}_n$ are the training samples and $\mathbf{t}_n$ are the desired targets.
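As a minimal sketch of the squared-error formula, the snippet below evaluates $E(\mathbf{w})$ for a hypothetical linear model $\mathbf{y}(\mathbf{x}, \mathbf{w}) = W\mathbf{x}$. The model, the toy data, and the name `squared_error` are assumptions made for illustration only; they are not part of the original notebook.

```python
import numpy as np

def squared_error(Y, T):
    """Sum-of-squares error: 0.5 * sum_n ||y_n - t_n||^2.

    Y, T: arrays of shape (N, K) holding network outputs and targets.
    """
    diff = np.asarray(Y, dtype=float) - np.asarray(T, dtype=float)
    return 0.5 * np.sum(diff ** 2)

# Hypothetical linear "network" y(x, w) = W x, used only for illustration.
W = np.array([[1.0, 0.0],
              [0.0, 2.0]])
X = np.array([[1.0, 1.0],
              [2.0, 0.0]])   # N = 2 training samples, one per row
T = np.array([[1.0, 1.0],
              [2.0, 1.0]])   # desired targets, one per row

Y = X @ W.T                  # model outputs for every sample
err = squared_error(Y, T)
print(err)
```

The error is zero only when every output matches its target; each mismatched component contributes half its squared difference.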
But first...the "probabilistic interpretation"
Assume the target $t$ is real-valued and Gaussian distributed, with mean given by the network output:
\begin{equation} p(t|\mathbf{x}, \mathbf{w}, \sigma^2) = \mathcal{N}(t|y(\mathbf{x},\mathbf{w}), \sigma^2) \end{equation}
Then the likelihood function over $N$ independent observations is
\begin{equation} p(T|X,\mathbf{w},\sigma^2) = \prod_{n=1}^N p(t_n|\mathbf{x}_n, \mathbf{w}, \sigma^2) \end{equation}
with negative log
\begin{equation} -\log p(T|X,\mathbf{w},\sigma^2) = \frac{1}{2\sigma^2} \sum_{n=1}^N (y(\mathbf{x}_n,\mathbf{w})-t_n)^2 + \frac{N}{2}\log(\sigma^2) + \frac{N}{2} \log(2\pi) \end{equation}
Only the first term depends on $\mathbf{w}$, so under a maximum likelihood approach, finding $\mathbf{w}$ is equivalent to minimising the squared error.
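The equivalence can be checked numerically. The sketch below assumes a hypothetical one-parameter model $y = wx$ with synthetic noisy targets (the data, the grid `ws`, and the function names are illustrative, not from the original notebook). It evaluates both the negative log-likelihood and the squared error on a grid of $w$ values and compares where each is smallest.

```python
import numpy as np

def neg_log_likelihood(y, t, sigma2):
    """Negative log of the Gaussian likelihood for outputs y and targets t."""
    n = len(t)
    return (np.sum((y - t) ** 2) / (2 * sigma2)
            + 0.5 * n * np.log(sigma2)
            + 0.5 * n * np.log(2 * np.pi))

def squared_error(y, t):
    """Sum-of-squares error 0.5 * sum_n (y_n - t_n)^2."""
    return 0.5 * np.sum((y - t) ** 2)

# Synthetic data for a hypothetical model y = w * x (illustration only).
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 50)
t = 1.5 * x + 0.1 * rng.standard_normal(x.size)

ws = np.linspace(0, 3, 301)   # candidate weights
sigma2 = 0.25                 # any fixed noise variance works here
nll = np.array([neg_log_likelihood(w * x, t, sigma2) for w in ws])
sse = np.array([squared_error(w * x, t) for w in ws])

w_nll = ws[np.argmin(nll)]    # maximum likelihood weight on the grid
w_sse = ws[np.argmin(sse)]    # least-squares weight on the grid
print(w_nll, w_sse)
```

The two curves differ only by the scale factor $1/\sigma^2$ and an offset that does not depend on $w$, so their minimisers coincide.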
#...
# Say our error function looks like
ls = linspace(-1,1,100)
w1, w2 = meshgrid(ls, ls)
def E(w1, w2):
    return w1**2 + w2**4 + w1*w2 #+ 0.5*w2
imshow(E(w1, w2), origin='lower')
<matplotlib.image.AxesImage at 0x1968f390>
# Gradients (partial derivatives of E = w1**2 + w2**4 + w1*w2)
def E_w1(w1, w2):
    return 2*w1 + w2
def E_w2(w1, w2):
    return 4*w2**3 + w1# + 0.5
def grad(w1, w2):
    return array([E_w1(w1,w2), E_w2(w1,w2)])
contour(ls, ls, E_w1(w1, w2), 40)
figure()
contour(ls, ls, E_w2(w1, w2), 40)
<matplotlib.contour.QuadContourSet instance at 0x19add6c8>
# Hessian: only the w2-w2 entry depends on the evaluation point
E_w1_w1 = lambda w1,w2: 2
E_w1_w2 = lambda w1,w2: 1
E_w2_w1 = lambda w1,w2: 1
E_w2_w2 = lambda w1,w2: 12*w2**2
def hess(w1, w2):
    return array([[E_w1_w1(w1,w2), E_w1_w2(w1,w2)], [E_w2_w1(w1,w2), E_w2_w2(w1,w2)]])
# Consider point w0
w0 = array([0, -0.8])
E(*w0)
0.40960000000000008
# Hessian evaluated at w0
H = hess(*w0)
H
array([[2.  , 1.  ], [1.  , 7.68]])
# Taylor expansion around w0
@vectorize
def E_approx(w1, w2):
    w = array([w1, w2])
    return E(*w0) + dot((w-w0), grad(*w0))# + 0.5 * dot(w-w0, dot(H, w-w0))
print(round(float(E_approx(*(w0 + .1))), 4))
0.1248
rangex = ls#linspace(w0[0]-.5, w0[0]+.5,100)
rangey = ls#linspace(w0[1]-.5, w0[1]+.5,100)
rx, ry = meshgrid(rangex, rangey)
contour(rx, ry, E_approx(rx, ry), 20, colors='r')
contour(rx, ry, E(rx, ry), 20, colors='b')
scatter(*w0)
<matplotlib.collections.PathCollection at 0x1994f590>
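The Taylor expansion can also be carried to second order, which is what the commented-out Hessian term in `E_approx` hints at. The following self-contained sketch compares the two orders at a point near `w0`. The derivatives are written out analytically from $E = w_1^2 + w_2^4 + w_1 w_2$; the names `hess`, `taylor1`, and `taylor2` are introduced here for illustration.

```python
import numpy as np

def E(w):
    w1, w2 = w
    return w1**2 + w2**4 + w1*w2

def grad(w):
    """Analytic gradient of E."""
    w1, w2 = w
    return np.array([2*w1 + w2, 4*w2**3 + w1])

def hess(w):
    """Analytic Hessian of E; only the w2-w2 entry varies with w."""
    w1, w2 = w
    return np.array([[2.0, 1.0],
                     [1.0, 12*w2**2]])

w0 = np.array([0.0, -0.8])    # expansion point

def taylor1(w):
    """First-order expansion of E around w0."""
    d = w - w0
    return E(w0) + d @ grad(w0)

def taylor2(w):
    """Second-order expansion: adds the quadratic Hessian term."""
    d = w - w0
    return taylor1(w) + 0.5 * d @ hess(w0) @ d

w = w0 + 0.1                  # a point close to w0
err1 = abs(E(w) - taylor1(w))
err2 = abs(E(w) - taylor2(w))
print(err1, err2)
```

Close to `w0` the second-order error should be noticeably smaller than the first-order one, because the quadratic term captures the local curvature that the tangent plane ignores.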