In [2]:

```
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import GPy
import pods
from IPython.display import display
```

In this section we'll combine expectation propagation with the low rank approximation to build a simple image classification application. For this toy example we'll classify whether or not the subject of the image is wearing glasses. Set up the ipython environment and download the data:

In [13]:

```
data = pods.datasets.olivetti_glasses()
Xtrain = data['X']
ytrain = data['Y']
```

Here’s a simple way to visualise the data. Each pixel in the image will become an input to the GP.

In [14]:

```
plt.imshow(Xtrain[120].reshape(64,64,order='F'),
interpolation='nearest',cmap=plt.cm.gray)
```

Out[14]:

Now fetch the class labels and divide the data into training and testing sets:

In [14]:

```
```
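The cell above was left empty as an exercise. A minimal sketch of one way to do the split, shown here on fabricated stand-in arrays (the held-out size of 50 and the random seed are arbitrary choices, not from the original notebook):

```python
import numpy as np

# Stand-ins with the same shapes as the Olivetti data: 64x64 images
# flattened to 4096-dimensional inputs, binary glasses/no-glasses labels.
rng = np.random.RandomState(0)
X = rng.randn(200, 64 * 64)
y = rng.randint(0, 2, (200, 1))

# Shuffle once, then carve off a held-out test set.
perm = rng.permutation(X.shape[0])
n_test = 50
test_idx, train_idx = perm[:n_test], perm[n_test:]
Xtrain, ytrain = X[train_idx], y[train_idx]
Xtest, ytest = X[test_idx], y[test_idx]
```

Shuffling before splitting matters here because the Olivetti faces are ordered by subject, so a contiguous split would put whole subjects only in one set.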

Next we choose some inducing inputs. One option is to apply k-means clustering to the training data (commented out below); instead we'll simply pick M random training points. Is this a good scheme for choosing the inducing inputs? Can you devise a better one?

In [15]:

```
from scipy import cluster
M = 8
#Z, distortion = cluster.vq.kmeans(Xtrain,M)
```
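For reference, the commented-out k-means route works as follows; the centroids (which are generally not actual training points) would serve as the inducing inputs Z. A sketch on stand-in data:

```python
import numpy as np
from scipy import cluster

rng = np.random.RandomState(0)
X = rng.randn(100, 10)  # stand-in for the training inputs
M = 8

# k-means returns up to M centroids plus the mean distortion;
# the centroid array plays the role of the inducing inputs Z.
Z, distortion = cluster.vq.kmeans(X, M)
```

Because the centroids summarise dense regions of the input space, they often cover the data better than a uniformly random subset, at the cost of the clustering step itself.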

In [16]:

```
# Standardise each pixel to zero mean, unit variance, then pick M random points.
Xtrain_std = (Xtrain - Xtrain.mean(0)[None,:]) / Xtrain.std(0)[None,:]
Z = np.random.permutation(Xtrain_std)[:M].copy()
print(Xtrain.mean())
```

Finally, we’re ready to build the classifier object.

In [17]:

```
k = GPy.kern.RBF(Xtrain.shape[1],lengthscale=20) + GPy.kern.White(Xtrain.shape[1],0.001)
model = GPy.models.SparseGPClassification(Xtrain_std, ytrain, kernel=k, Z=Z)
display(model)
model.optimize()
```

Look at the following figure. What is being shown? Why does it look like this?

In [20]:

```
plt.figure()
plt.imshow(model.Z[0].gradient.reshape(64,64,order='F'),
interpolation='nearest',
cmap=plt.cm.gray)
plt.colorbar()
```

Out[20]:

Write some code to evaluate the model's performance, using the held-out data that we separated earlier. What is the error rate? Is it better than random guessing?

*Hint:*

```
GPy.util.classification.conf_matrix(prob_estimate,ytest)
```
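One way to turn probability estimates into an error rate, sketched with numpy on fabricated predictions (the 0.5 threshold is an assumption, and `prob_estimate` here is a random stand-in for what `model.predict` on the held-out inputs would return):

```python
import numpy as np

rng = np.random.RandomState(0)
ytest = rng.randint(0, 2, (50, 1))   # stand-in held-out labels
prob_estimate = rng.rand(50, 1)      # stand-in predicted probabilities

# Threshold the probabilities at 0.5 and count disagreements.
yhat = (prob_estimate > 0.5).astype(int)
error_rate = np.mean(yhat != ytest)
```

On balanced classes, random guessing gives an error rate around 0.5, so anything consistently below that indicates the model has learned something.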

Write a simple for loop that repeatedly optimizes the model. How low can you get the error rate to go? What kind of kernel do you think might be appropriate for this classification task?
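A common pattern for such a loop is random restarts: re-initialise the hyperparameters, optimise, and keep the best result. With GPy this would be `model.randomize()` followed by `model.optimize()` inside the loop, keeping the model with the best log likelihood; the sketch below shows the same pattern on a toy objective so it runs standalone:

```python
import numpy as np
from scipy import optimize

def neg_log_likelihood(theta):
    # Toy stand-in for the model's objective (minimum of 1.0 at theta=3).
    return (theta[0] - 3.0) ** 2 + 1.0

rng = np.random.RandomState(0)
best = None
for restart in range(5):
    theta0 = rng.randn(1) * 10.0                         # ~ model.randomize()
    res = optimize.minimize(neg_log_likelihood, theta0)  # ~ model.optimize()
    if best is None or res.fun < best.fun:
        best = res                                       # keep the best restart
```

Restarts matter because the marginal likelihood surface is multi-modal in the kernel hyperparameters, so a single optimisation run can get stuck.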

In [ ]:

```
```