This document is the technical supplement for instructors to Statistical Inference for Everyone, an introductory statistical inference textbook written from the perspective of "probability theory as logic".
$\newcommand{\bvec}[1]{\mathbf{#1}}$ Given \begin{eqnarray} y_k&=& m x_k + b + \epsilon \end{eqnarray} where the noise term has the (known) Gaussian distribution \begin{eqnarray} p(\epsilon|I) &=& \frac{1}{\sqrt{2\pi \sigma^2}}e^{-\epsilon^2/2\sigma^2} \end{eqnarray} then we have
\begin{eqnarray} p(m,b|\bvec{y},I)&=& \int_0^{\infty} p(m,b,\sigma|\bvec{y},I)d\sigma \\\\ &\propto& \int_0^{\infty} p(\bvec{y}|m,b,\sigma,I)p(m,b,\sigma|I)d\sigma \\\\ p(y_k|m,b,\sigma,I)&=& \frac{1}{\sqrt{2\pi \sigma^2}}e^{-(m x_k+b-y_k)^2/2\sigma^2}\\\\ p(\bvec{y}|m,b,\sigma,I)&\propto& \frac{1}{\sigma^N}e^{-\sum (m x_k+b-y_k)^2/2\sigma^2} \end{eqnarray}As before, with uniform priors, we get (assuming we know $\sigma$) \begin{eqnarray} p(m,b|\bvec{y},I)&\propto&\frac{1}{\sigma^N}e^{-\sum (m x_k+b-y_k)^2/2\sigma^2} \\\\ L&=&{\rm constant} - \sum (m x_k+b-y_k)^2/2\sigma^2 \\\\ \nabla_{m,b} L &=& 0 \mbox{ (maximum $p$ = maximum $L$ = minimum squares)} \end{eqnarray} where $L$ is the log of the posterior. Setting the gradient to zero gives the two equations \begin{eqnarray} \sum (m x_k+b-y_k) x_k &=& 0 \\\\ \sum (m x_k+b-y_k) &=& 0 \end{eqnarray} Define \begin{eqnarray} v&=&\sum x_k^2 \\\\ c&=&\sum x_k y_k \\\\ \bar{x}&=&\frac{1}{N}\sum x_k \\\\ \bar{y}&=&\frac{1}{N}\sum y_k \end{eqnarray}
and we have
\begin{eqnarray} \sum (m x_k+b-y_k) x_k &=& 0 \\\\ vm + N\bar{x} b - c &=& 0 \end{eqnarray}and \begin{eqnarray} \sum (m x_k+b-y_k) &=& 0 \\\\ N\bar{x}m + N b - N\bar{y} &=& 0 \end{eqnarray}
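As a quick numerical check of these two conditions, the following sketch evaluates their left-hand sides for a candidate line. The data set (points lying exactly on $y = 2x + 1$) and the helper name `normal_equation_residuals` are illustrative assumptions, not part of the text.

```python
def normal_equation_residuals(x, y, m, b):
    """Return the left-hand sides of the two conditions
    v*m + N*xbar*b - c  and  N*xbar*m + N*b - N*ybar
    for a candidate slope m and intercept b."""
    N = len(x)
    v = sum(xk * xk for xk in x)              # v = sum of x_k^2
    c = sum(xk * yk for xk, yk in zip(x, y))  # c = sum of x_k * y_k
    xbar = sum(x) / N
    ybar = sum(y) / N
    r1 = v * m + N * xbar * b - c
    r2 = N * xbar * m + N * b - N * ybar
    return r1, r2

# Made-up data lying exactly on the line y = 2x + 1
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]

# Both conditions vanish at the true line ...
best = normal_equation_residuals(x, y, m=2.0, b=1.0)
# ... but not for a different slope.
other = normal_equation_residuals(x, y, m=1.0, b=1.0)
print(best, other)
```

Both residuals are zero only at the maximum of the posterior; any other choice of $m$ and $b$ leaves at least one of them nonzero.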
A nice little trick I learned in high school for quickly solving $2\times 2$ systems of linear equations is to use determinants (Cramer's rule). It can be used for systems of any size, but it is particularly expedient for the $2\times 2$ case.
Write the equations in the following form: \begin{eqnarray} ax+by&=&c \\\\ dx+ey&=&f \end{eqnarray}
Form the determinant of the left-hand side parameters like \begin{eqnarray} D&\equiv& \left|\begin{array}{cc}a & b \\\\ d & e\end{array} \right| \\\\ &=& ae - bd \end{eqnarray}
The solutions are formed by the following ratios \begin{eqnarray} x&=& \frac{\left|\begin{array}{cc}c & b \\\\ f & e\end{array} \right|}{D} \\\\ &=& \frac{ce-bf}{ae-bd} \end{eqnarray} and \begin{eqnarray} y&=& \frac{\left|\begin{array}{cc}a & c \\\\ d & f\end{array} \right|}{D} \\\\ &=& \frac{af-cd}{ae-bd} \end{eqnarray}
Notice that the numerators are made in the same way as $D$, except that the relevant column (1st column for $x$, 2nd for $y$) is replaced with the right-hand side parameters.
Why is this any better than solving for one variable and plugging in? I find the arithmetic in this recipe to be more straightforward, and less prone to careless errors.
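The determinant recipe can be sketched in a few lines of Python. The function name `cramer_2x2` and the example coefficients are assumptions chosen for illustration; the argument names follow the $a, b, c, d, e, f$ labelling used above.

```python
def cramer_2x2(a, b, c, d, e, f):
    """Solve  a*x + b*y = c,  d*x + e*y = f  using determinants."""
    D = a * e - b * d                 # determinant of the left-hand side
    if D == 0:
        raise ValueError("determinant is zero: no unique solution")
    x = (c * e - b * f) / D           # 1st column replaced by (c, f)
    y = (a * f - c * d) / D           # 2nd column replaced by (c, f)
    return x, y

# Example system: 2x + y = 5,  x + 3y = 10
x_sol, y_sol = cramer_2x2(2, 1, 5, 1, 3, 10)
print(x_sol, y_sol)
```

Substituting the result back into both equations is a cheap way to confirm the arithmetic.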
So, dividing the second equation by $N$, we have \begin{eqnarray} vm + N\bar{x} b &=& c \\\\ \bar{x}m + b &=&\bar{y} \end{eqnarray}
Solving we get
\begin{eqnarray} D&\equiv& \left|\begin{array}{cc}v & N\bar{x} \\\\ \bar{x} & 1\end{array} \right| \\\\ &=& v - N (\bar{x})^2 \\\\ m&=& \frac{\left|\begin{array}{cc}c & N\bar{x} \\\\ \bar{y} & 1\end{array} \right|}{D} = \frac{c-N\bar{x}\bar{y}}{v - N (\bar{x})^2} \\\\ b&=& \frac{\left|\begin{array}{cc}v & c \\\\ \bar{x} & \bar{y}\end{array} \right|}{D} = \frac{v\bar{y}-c\bar{x}}{v - N (\bar{x})^2} \\\\ \end{eqnarray}
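The closed-form expressions for $m$ and $b$ translate directly into code. This is a minimal sketch, assuming plain Python lists of data; the function name `fit_line` and the sample data (points lying exactly on $y = 2x + 1$) are hypothetical.

```python
def fit_line(x, y):
    """Return (m, b) maximizing the posterior for y = m*x + b
    with uniform priors and known sigma (the least-squares line)."""
    N = len(x)
    v = sum(xk * xk for xk in x)              # v = sum of x_k^2
    c = sum(xk * yk for xk, yk in zip(x, y))  # c = sum of x_k * y_k
    xbar = sum(x) / N
    ybar = sum(y) / N
    D = v - N * xbar ** 2                     # determinant of the system
    if D == 0:
        raise ValueError("all x values are identical: slope is undetermined")
    m = (c - N * xbar * ybar) / D
    b = (v * ybar - c * xbar) / D
    return m, b

# Made-up data lying exactly on the line y = 2x + 1
m_hat, b_hat = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(m_hat, b_hat)
```

Note that $D = v - N\bar{x}^2 = \sum (x_k - \bar{x})^2$, so the determinant vanishes only when every $x_k$ is the same, which is exactly the case where a slope cannot be estimated.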