%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import GPy
import pods
from IPython.display import display
In this section we'll combine expectation propagation with the low-rank approximation to build a simple image classification application. For this toy example we'll classify whether or not the subject of the image is wearing glasses. Set up the IPython environment and download the data:
data = pods.datasets.olivetti_glasses()
Xtrain = data['X']
ytrain = data['Y']
Here’s a simple way to visualise the data. Each pixel in the image will become an input to the GP.
plt.imshow(Xtrain[120].reshape(64,64,order='F'),
interpolation='nearest',cmap=plt.cm.gray)
The class labels (glasses / no glasses) were loaded above into ytrain along with the images. Next, divide the data into a training and a testing set:
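A minimal sketch of one way to do the split: shuffle the indices and hold out a fixed number of images for testing. The array sizes and the hold-out size of 50 below are illustrative assumptions, not taken from the dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200                                   # stand-in for the number of images
X = rng.standard_normal((n, 4096))        # stand-in for the image data
y = rng.integers(0, 2, size=(n, 1))       # stand-in for the binary labels

n_test = 50                               # illustrative hold-out size
perm = rng.permutation(n)                 # shuffle indices before splitting
test_idx, train_idx = perm[:n_test], perm[n_test:]
Xtest, ytest = X[test_idx], y[test_idx]
Xtrain, ytrain = X[train_idx], y[train_idx]
```

Shuffling before splitting matters here because the faces are stored grouped by subject, so a contiguous split would put whole subjects entirely in one set.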
Next we choose some inducing inputs. One option is to apply k-means clustering to the training data (commented out below); here we simply standardise the inputs and take a random subset of M training points. Think about whether this is a good scheme for choosing the inducing inputs. Can you devise a better one?
from scipy import cluster
M = 8
#Z, distortion = cluster.vq.kmeans(Xtrain,M)
Xtrain_std = (Xtrain - Xtrain.mean(0)[None,:] )/ (Xtrain.std(0)[None,:])
Z = np.random.permutation(Xtrain_std)[:M].copy()
print(Xtrain.mean())
132.132182617
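For comparison, the commented-out k-means option can be sketched as below. This is a standalone illustration on random stand-in data (the real code would pass `Xtrain_std` instead); note that SciPy's `kmeans` can occasionally return fewer than `M` centroids if a cluster empties.

```python
import numpy as np
from scipy.cluster.vq import kmeans

rng = np.random.default_rng(0)
X_std = rng.standard_normal((200, 16))   # stand-in for Xtrain_std
M = 8

# Z holds the cluster centroids, which serve as the inducing inputs
Z, distortion = kmeans(X_std, M)
```

Centroids tend to sit in dense regions of the input space, which is often a better starting point than a uniformly random subset, though the inducing inputs are optimized afterwards in either case.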
Finally, we’re ready to build the classifier object.
k = GPy.kern.RBF(Xtrain.shape[1],lengthscale=20) + GPy.kern.White(Xtrain.shape[1],0.001)
model = GPy.models.SparseGPClassification(Xtrain_std, ytrain, kernel=k, Z=Z)
display(model)
model.optimize()
Model: SparseGPClassification
Log-likelihood: -329.515082269
Number of Parameters: 32771
SparseGPClassification | Value | Constraint | Prior | Tied to
---|---|---|---|---
inducing inputs | (8, 4096) | | |
add.rbf.variance | 1.0 | +ve | |
add.rbf.lengthscale | 20.0 | +ve | |
add.white.variance | 0.001 | +ve | |
Look at the following figure. What is being shown? Why does it look like this?
plt.figure()
plt.imshow(model.Z[0].gradient.reshape(64,64,order='F'),
interpolation='nearest',
cmap=plt.cm.gray)
plt.colorbar()
Write some code to evaluate the model’s performance, using the held-out data that we separated earlier. What is the error rate? Is it better than random guessing?
Hint:
GPy.util.classification.conf_matrix(prob_estimate,ytest)
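As a starting point, here is a plain-NumPy sketch of turning probability estimates into an error rate, independent of the GPy helper in the hint. The probabilities and labels below are random stand-ins; in the exercise they would come from `model.predict` on the held-out inputs.

```python
import numpy as np

# hypothetical stand-ins for the model's probability estimates and test labels
rng = np.random.default_rng(0)
ytest = rng.integers(0, 2, size=50)
prob_estimate = rng.uniform(size=50)

# threshold the class-1 probabilities at 0.5 to get hard predictions
pred = (prob_estimate > 0.5).astype(int)
error_rate = np.mean(pred != ytest)
```

Comparing `error_rate` against 0.5 (and against the base rate of the majority class) tells you whether the classifier beats random guessing.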
Write a simple for loop to repeatedly optimize the model. How low can you get the error rate to go? What kind of kernel do you think might be appropriate for this classification task?