import sympy as S
S.var("x mu sigma", real=True)
Here's the likelihood for a 1-d normal distribution, in terms of two parameters: the mean $\mu$ and the standard deviation $\sigma$:
f = 1/(S.sqrt(2*S.pi)*sigma)*S.exp(-(x - mu)**2/(2*sigma**2))
f
And here's the log likelihood, which we can tidy up:
ll = S.log(f)
ll
ll = S.simplify(S.logcombine(ll, force=True))
ll
Let's pull out the derivatives of the log-likelihood: $\frac{\partial \ell}{\partial \mu}$, $\frac{\partial \ell}{\partial \sigma}$
ll.diff(mu)
ll.diff(sigma)
Second derivatives:
ll.diff(mu, mu)
ll.diff(mu, sigma)
ll.diff(sigma, sigma)
S.simplify(_)
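Aside: these second derivatives are the ingredients of the Fisher information. A sketch (not in the original notebook; $\sigma$ is declared positive here so the Gaussian integral evaluates):

```python
import sympy as S

x, mu = S.symbols("x mu", real=True)
sigma = S.symbols("sigma", positive=True)  # positive so the Gaussian integral evaluates

f = 1/(S.sqrt(2*S.pi)*sigma)*S.exp(-(x - mu)**2/(2*sigma**2))
ll = S.log(f)

# The (mu, mu) entry of the Fisher information is E[-d^2 ll / d mu^2],
# i.e. the expectation of the negative second derivative under f.
d2 = S.simplify(ll.diff(mu, mu))          # -1/sigma**2
info_mu = S.integrate(-d2*f, (x, -S.oo, S.oo))
print(S.simplify(info_mu))                # 1/sigma**2 for a single observation
```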
Let's find the point where the gradient is zero for $\mu$ and $\sigma$
S.solve(f.diff(mu), mu)
S.solve(f.diff(sigma), sigma)
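The single-observation solutions above generalize: for $n$ iid samples, setting the gradient of the summed log-likelihood to zero gives the familiar estimators $\hat\mu = \bar{x}$ and $\hat\sigma^2 = \frac{1}{n}\sum_i (x_i - \bar{x})^2$. A quick numerical sanity check (a sketch with made-up data, using scipy.optimize rather than sympy):

```python
import numpy as np
from scipy.optimize import minimize

np.random.seed(0)
data = np.random.normal(2.0, 1.5, 1000)  # hypothetical sample

# Closed-form MLEs from setting the gradient of the summed log-likelihood to zero.
mu_hat = data.mean()
sigma_hat = np.sqrt(((data - mu_hat)**2).mean())

# Minimize the negative log-likelihood numerically as a cross-check.
def nll(params):
    mu, sigma = params
    return 0.5*len(data)*np.log(2*np.pi*sigma**2) + ((data - mu)**2).sum()/(2*sigma**2)

res = minimize(nll, x0=[0.0, 1.0])
print(res.x, (mu_hat, sigma_hat))  # the two estimates should agree closely
```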
What if we want to use these calculations in numerical optimization code?
d2f_sigma2 = S.lambdify((mu, sigma, x), f.diff(sigma, sigma))
d2f_sigma2(1.2, 2, 3)
%timeit d2f_sigma2(1.2, 2, 3)
1000000 loops, best of 3: 1.61 µs per loop
from sympy.utilities import codegen
codegen.ccode(f)
'(1.0L/2.0L)*sqrt(2)*exp(-1.0L/2.0L*pow(-mu + x, 2)/pow(sigma, 2))/(sqrt(M_PI)*sigma)'
print(codegen.codegen(('d2f_dsigma2', f), 'C', 'd2f_dsigma2')[0][1])
/******************************************************************************
 *                    Code generated with sympy 0.7.4.1                       *
 *                                                                            *
 *              See http://www.sympy.org/ for more information.               *
 *                                                                            *
 *                       This file is part of 'project'                       *
 ******************************************************************************/
#include "d2f_dsigma2.h"
#include <math.h>

double d2f_dsigma2(double mu, double sigma, double x) {

   return (1.0L/2.0L)*sqrt(2)*exp(-1.0L/2.0L*pow(-mu + x, 2)/pow(sigma, 2))/(sqrt(M_PI)*sigma);

}
Too lazy to compile a C extension. Do it for me!
from sympy.utilities import autowrap
fw = autowrap.autowrap(f.diff(sigma, sigma))
timeit fw(1.2, 2, 3)
1000000 loops, best of 3: 214 ns per loop
logpdf = autowrap.autowrap(ll)
logpdf(2.3, 6., 1)
from scipy.stats.distributions import norm
s_logpdf = norm.logpdf
s_logpdf(1, 2.3, 6.)
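As a sanity check that the symbolic log-likelihood agrees with scipy's implementation, here's a sketch using `lambdify` instead of `autowrap` (so no compiler is needed) — note the argument orders differ: the symbolic function takes `(mu, sigma, x)`, scipy takes `(x, loc, scale)`:

```python
import sympy as S
from scipy.stats import norm

x, mu, sigma = S.symbols("x mu sigma", real=True)
ll = S.log(1/(S.sqrt(2*S.pi)*sigma)*S.exp(-(x - mu)**2/(2*sigma**2)))

# lambdify turns the symbolic expression into a plain numerical function
logpdf_num = S.lambdify((mu, sigma, x), ll, modules="numpy")

print(logpdf_num(2.3, 6.0, 1))
print(norm.logpdf(1, loc=2.3, scale=6.0))  # should match
```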
%timeit logpdf(2.3, 6., 1)
1000000 loops, best of 3: 197 ns per loop
%timeit norm.logpdf(1, 2.3, 6.)
10000 loops, best of 3: 86 µs per loop
import numpy as np
x = np.arange(100)
vlogpdf = np.vectorize(logpdf)
%timeit vlogpdf(2.3, 6, x)
10000 loops, best of 3: 51.8 µs per loop
%timeit s_logpdf(x, 2.3, 6.)
10000 loops, best of 3: 108 µs per loop
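`np.vectorize` still loops in Python under the hood. A likely faster alternative (a sketch, using fresh symbol names to avoid clobbering the `x` array above): `lambdify` with the numpy backend generates a function that broadcasts over arrays directly.

```python
import numpy as np
import sympy as S

x_s, mu_s, sigma_s = S.symbols("x mu sigma", real=True)
ll = S.log(1/(S.sqrt(2*S.pi)*sigma_s)*S.exp(-(x_s - mu_s)**2/(2*sigma_s**2)))

# The generated function accepts numpy arrays and evaluates elementwise,
# with no Python-level loop.
logpdf_vec = S.lambdify((mu_s, sigma_s, x_s), ll, modules="numpy")

xs = np.arange(100)
vals = logpdf_vec(2.3, 6.0, xs)   # one vectorized call
print(vals.shape)                 # (100,)
```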