$\newcommand{\trace}[1]{\operatorname{tr}\left\{#1\right\}}$ $\newcommand{\Norm}[1]{\lVert#1\rVert}$ $\newcommand{\RR}{\mathbb{R}}$ $\newcommand{\inner}[2]{\langle #1, #2 \rangle}$ $\newcommand{\DD}{\mathscr{D}}$ $\newcommand{\grad}[1]{\operatorname{grad}#1}$ $\DeclareMathOperator*{\argmin}{arg\,min}$

Setting up the environment

In [ ]:

```
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
```

We will use an old dataset on the price of housing in Boston (see description). The aim is to predict the median value of the owner-occupied homes from various other factors. We will use a normalised version of this data, where each row is an example. The median value of homes is given in the first column (the label) and the value of each subsequent feature has been normalised to be in the range $[-1,1]$. Download this dataset from mldata.org.

Read in the data using `np.loadtxt` with the optional argument `delimiter=','`. Check that the data is as expected using `print()`. Use `np.delete` and `del` to remove the column containing the binary variable 'chas' and the corresponding label, respectively. `names.index('chas')` is a convenient way to get the index of that column. This should give you an `np.ndarray` with 506 rows (examples) and 13 columns (1 label and 12 features).

In [ ]:

```
names = ['medv', 'crim', 'zn', 'indus', 'chas', 'nox', 'rm', 'age', 'dis', 'rad', 'tax', 'ptratio', 'b', 'lstat']
```

In [ ]:

```
# Solution goes here
```
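One way the loading step could look. `load_housing` is a name of my choosing, and the two inline CSV rows are made-up stand-ins for the downloaded file; replace the `StringIO` object with the real file path:

```python
import io
import numpy as np

# The fourteen column names of the (normalised) Boston housing data.
names = ['medv', 'crim', 'zn', 'indus', 'chas', 'nox', 'rm', 'age',
         'dis', 'rad', 'tax', 'ptratio', 'b', 'lstat']

def load_housing(source, names):
    """Load the CSV, then drop the binary 'chas' column and its name."""
    data = np.loadtxt(source, delimiter=',')
    names = list(names)                   # copy, so the caller's list survives
    chas = names.index('chas')
    data = np.delete(data, chas, axis=1)  # remove the data column
    del names[chas]                       # remove the matching name
    return data, names

# Two made-up rows stand in for the real download.
fake = io.StringIO(
    '24.0,-0.9,0.2,-0.5,0,-0.3,0.4,0.1,-0.2,-0.9,-0.6,0.3,0.4,-0.8\n'
    '21.6,-0.9,-1.0,-0.2,1,-0.4,0.2,0.6,-0.1,-0.8,-0.7,0.2,0.4,-0.7\n')
data, kept = load_housing(fake, names)
print(data.shape, kept)   # → (2, 13) and the name list without 'chas'
```

With the real file, `data.shape` should come out as `(506, 13)`.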

In [ ]:

```
print('name, min, max, #unique:')
print('\n'.join([str((name, min(vals), max(vals), len(set(vals)))) for name, vals in zip(names, data.T)]))
assert data.shape == (506,13)
```

Plotting is done using the matplotlib toolbox. For example:

In [ ]:

```
x = [0,1.2,2,3,5.1,7,8,9]
y1 = [1.1,3,2,4,5,6,8.1,8.2]
y2 = [4.2,4.2,4.1,5,6,3.2,4.8,6]
fig = plt.figure(figsize=(11,5))
ax = fig.add_subplot(121)
ax.plot(x,y1,'b--')
ax.plot(x,y1,'bs',label='y1')
ax.plot(x,y2,'r:')
ax.plot(x,y2,'r>',label='y2')
ax.set_title('Some random data')
ax.set_ylabel('labels')
ax.legend(loc='upper left', numpoints=1)
ax = fig.add_subplot(122)
ax.plot(x,y1,'bo')
ax.set_title('same data as before, without lines')
ax.set_xlabel(r'examples (symbols, e.g. $\alpha,\beta,\gamma$, works)')
```

Plot the median value of the property (vertical axis) versus the tax rate (horizontal axis).

In [ ]:

```
# Solution goes here
```
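A sketch of the requested scatter plot on random stand-in data. In the notebook, `data` and the pruned `names` list would come from the loading step, and the `Agg` backend line is unnecessary; it is only there so the sketch runs headless:

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')     # headless backend; not needed inside the notebook
import matplotlib.pyplot as plt

# Stand-in for the loaded array: label 'medv' in column 0, 'chas' removed.
names = ['medv', 'crim', 'zn', 'indus', 'nox', 'rm', 'age',
         'dis', 'rad', 'tax', 'ptratio', 'b', 'lstat']
rng = np.random.default_rng(0)
data = rng.uniform(-1, 1, size=(506, 13))

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(data[:, names.index('tax')], data[:, 0], 'bo')
ax.set_xlabel('tax rate (normalised)')
ax.set_ylabel('median value (medv)')
ax.set_title('Median home value vs. tax rate')
```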

Implement the sum-of-squares error function and use it to find the maximum likelihood solution $w_{ML}$ for the regression problem. Also implement subroutines for the polynomial basis function of degree 2 (see the expansion based on the binomial formula).

In [ ]:

```
# Solution goes here
```
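A sketch of the two pieces, under assumed names (`phi_linear`, `phi_poly2`, `fit_ml`, `sse`). The degree-2 basis contains the bias, the raw features, and all pairwise products, which is where the binomial count of terms comes from; the demonstration data is synthetic:

```python
import numpy as np

def phi_linear(X):
    """Standard basis: a bias column prepended to the raw features."""
    return np.column_stack([np.ones(len(X)), X])

def phi_poly2(X):
    """Degree-2 polynomial basis: bias, x_i, and products x_i*x_j (i <= j)."""
    n, d = X.shape
    cols = [np.ones(n)] + [X[:, i] for i in range(d)]
    cols += [X[:, i] * X[:, j] for i in range(d) for j in range(i, d)]
    return np.column_stack(cols)

def fit_ml(Phi, t):
    """Maximum likelihood weights = least squares solution of Phi w = t."""
    w, *_ = np.linalg.lstsq(Phi, t, rcond=None)
    return w

def sse(Phi, t, w):
    """Sum-of-squares error 0.5 * ||t - Phi w||^2."""
    r = t - Phi @ w
    return 0.5 * (r @ r)

# Synthetic check: targets generated exactly from a quadratic are fit exactly.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(50, 3))
t = 2 + X[:, 0] - 3 * X[:, 1] * X[:, 2]
w = fit_ml(phi_poly2(X), t)
print(sse(phi_poly2(X), t, w))   # essentially zero
```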

Use half of the available data to train the model by maximum likelihood; allocate the rest to the test set. Report the root mean squared error (RMSE) for the training set and the test set.

In [ ]:

```
# Solution goes here
```
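A sketch of the split-and-evaluate step with the standard basis. The synthetic `data` array stands in for the housing data (label in column 0), and `rmse`/`design` are names of my choosing:

```python
import numpy as np

def rmse(y, yhat):
    """Root mean squared error between targets and predictions."""
    return np.sqrt(np.mean((y - yhat) ** 2))

def design(X):
    """Standard basis: bias column plus raw features."""
    return np.column_stack([np.ones(len(X)), X])

# Synthetic stand-in: a linear signal plus noise, label in column 0.
rng = np.random.default_rng(2)
data = rng.uniform(-1, 1, size=(506, 13))
data[:, 0] = data[:, 1:] @ rng.normal(size=12) + 0.1 * rng.normal(size=506)

half = len(data) // 2
train, test = data[:half], data[half:]

w, *_ = np.linalg.lstsq(design(train[:, 1:]), train[:, 0], rcond=None)
r_train = rmse(train[:, 0], design(train[:, 1:]) @ w)
r_test = rmse(test[:, 0], design(test[:, 1:]) @ w)
print(f'train RMSE: {r_train:.3f}   test RMSE: {r_test:.3f}')
```

On data this size, expect the test RMSE to be slightly above the training RMSE.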

Using the standard basis function (no transformations), find the feature with the biggest weight. Plot two figures, one for the training set and one for the test set. In each figure, plot the label against this most important feature. Also include a line showing your maximum likelihood predictor (*hint: use* `np.arange` *to generate the input values for the line*).

In [ ]:

```
# Solution goes here
```
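One possible shape of the answer, shown on synthetic data where the 'rm' column is made dominant on purpose; note the predictor line holds every other feature at zero, and the headless backend line is unnecessary inside the notebook:

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')     # headless backend; not needed inside the notebook
import matplotlib.pyplot as plt

names = ['medv', 'crim', 'zn', 'indus', 'nox', 'rm', 'age',
         'dis', 'rad', 'tax', 'ptratio', 'b', 'lstat']
rng = np.random.default_rng(3)
data = rng.uniform(-1, 1, size=(506, 13))        # stand-in for the real array
data[:, 0] = 3 * data[:, 5] + 0.1 * rng.normal(size=506)   # 'rm' dominates

X, t = data[:, 1:], data[:, 0]
Phi = np.column_stack([np.ones(len(X)), X])
w, *_ = np.linalg.lstsq(Phi, t, rcond=None)

best = np.argmax(np.abs(w[1:]))                  # skip the bias weight
print('most important feature:', names[1 + best])

fig, ax = plt.subplots()
ax.plot(X[:, best], t, 'bo', label='examples')
xs = np.arange(-1, 1, 0.01)                      # inputs for the predictor line
ax.plot(xs, w[0] + w[1 + best] * xs, 'r-',       # other features held at zero
        label='ML predictor')
ax.set_xlabel(names[1 + best])
ax.set_ylabel('medv')
ax.legend()
```

For the exercise itself, repeat the plotting part once for the training set and once for the test set.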

Implement regularized least squares regression to find the solution $w_{reg}$ with regularizer $\lambda>0$. (Warning: `lambda` is a reserved keyword in Python.)

In [ ]:

```
# Solution goes here
```
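A sketch of the closed-form solution $w_{reg} = (\lambda I + \Phi^T\Phi)^{-1}\Phi^T t$; the parameter is spelled `lam` to dodge the reserved word, and the sanity check against plain least squares runs on synthetic data:

```python
import numpy as np

def fit_regularized(Phi, t, lam):
    """Regularized least squares: w = (lam*I + Phi^T Phi)^{-1} Phi^T t."""
    m = Phi.shape[1]
    return np.linalg.solve(lam * np.eye(m) + Phi.T @ Phi, Phi.T @ t)

# As lam -> 0 this approaches the unregularized least squares solution.
rng = np.random.default_rng(4)
Phi = np.column_stack([np.ones(100), rng.uniform(-1, 1, size=(100, 5))])
t = Phi @ np.array([1.0, 2.0, -1.0, 0.5, 0.0, 3.0])
w0, *_ = np.linalg.lstsq(Phi, t, rcond=None)
print(np.allclose(fit_regularized(Phi, t, 1e-10), w0))   # → True
```

Larger values of `lam` shrink the weight vector towards zero.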

Similar to the previous exercise, plot two figures showing the most important feature along with the label and prediction. Use $\lambda = 1.1$.

In [ ]:

```
# Solution goes here
```

Compare the RMSE of regression with and without regularization. By also considering the plots, describe what you observe and explain your observations.

The choice of basis function and the value of the regularization parameter $\lambda$ both affect the performance of the predictor. Using the same training and test data as before, compute the RMSE for each combination of:

- the standard basis (as used above) and the polynomial basis function of degree 2,
- $\lambda \in \{0.01, 0.1, 1, 10, 100\}$.

In [ ]:

```
# Solution goes here
```
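The comparison could be organised as a double loop over bases and $\lambda$ values. Everything below (helper names, the synthetic stand-in for the housing split) repeats assumptions from the earlier sketches:

```python
import numpy as np

def rmse(y, yhat):
    return np.sqrt(np.mean((y - yhat) ** 2))

def phi_linear(X):
    return np.column_stack([np.ones(len(X)), X])

def phi_poly2(X):
    n, d = X.shape
    cols = [np.ones(n)] + [X[:, i] for i in range(d)]
    cols += [X[:, i] * X[:, j] for i in range(d) for j in range(i, d)]
    return np.column_stack(cols)

def fit_reg(Phi, t, lam):
    m = Phi.shape[1]
    return np.linalg.solve(lam * np.eye(m) + Phi.T @ Phi, Phi.T @ t)

# Synthetic stand-in for the housing train/test split (label in column 0).
rng = np.random.default_rng(5)
data = rng.uniform(-1, 1, size=(506, 13))
data[:, 0] = data[:, 1:] @ rng.normal(size=12) + 0.1 * rng.normal(size=506)
train, test = data[:253], data[253:]

results = {}
for basis in (phi_linear, phi_poly2):
    for lam in [0.01, 0.1, 1, 10, 100]:
        w = fit_reg(basis(train[:, 1:]), train[:, 0], lam)
        err = rmse(test[:, 0], basis(test[:, 1:]) @ w)
        results[(basis.__name__, lam)] = err
        print(f'{basis.__name__:10s} lambda={lam:>6}: test RMSE {err:.3f}')
```

On the real housing data, tabulate the ten numbers and pick the best basis/$\lambda$ pair by test RMSE.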