A really great feature of Lasagne is the ability to easily manipulate the training (and test) data on the fly. This is typically done in a random fashion, by applying transformations under which the objects are invariant. Practically all competition winners use some form of clever data augmentation. So let's give it a try...
Note that this notebook is just to show you how the augmentation works from a programming standpoint. If you want to see the benefit of augmentation, have a look at DataAugmentationII.ipynb
To augment the data on the fly, one has to implement a BatchIterator
with one's own transformation method. In the following we perform a very simple and crude transformation: we simply flip each image about the y-axis, i.e. left-right. In principle, using cv2 or other libraries, quite complex operations are possible and are used in practice.
For non-Python experts like me, a short reminder that reversing a sequence can be done as follows:
a = (1,2,3,4)
a[::-1]
(4, 3, 2, 1)
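The same slicing trick works on numpy arrays, axis by axis. Below is a small illustration (the array and its shape are made up just for this example) of how reversing the last axis mirrors each image in a batch:

import numpy as np

# A batch of two 1-channel 3x3 "images", shape (2, 1, 3, 3)
batch = np.arange(18).reshape(2, 1, 3, 3)
flipped = batch[:, :, :, ::-1]  # reverse only the column axis

print(batch[0, 0])    # [[0 1 2]
                      #  [3 4 5]
                      #  [6 7 8]]
print(flipped[0, 0])  # [[2 1 0]
                      #  [5 4 3]
                      #  [8 7 6]]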
As in the previous notebooks we define the network and load the data.
# We load the net and the data from the last example
from lasagne import layers
from lasagne import nonlinearities
from nolearn.lasagne import NeuralNet
import cPickle as pickle
import gzip

with gzip.open('mnist_4000.pkl.gz', 'rb') as f:
    (X, y) = pickle.load(f)
PIXELS = len(X[0, 0, 0, :])
X.shape, y.shape, PIXELS
def createNet():
    return NeuralNet(
        # Geometry of the network
        layers=[
            ('input', layers.InputLayer),
            ('conv1', layers.Conv2DLayer),
            ('pool1', layers.MaxPool2DLayer),
            ('conv2', layers.Conv2DLayer),
            ('pool2', layers.MaxPool2DLayer),
            ('hidden4', layers.DenseLayer),
            ('output', layers.DenseLayer),
        ],
        input_shape=(None, 1, PIXELS, PIXELS),  # None in the first axis indicates that the batch size can be set later
        conv1_num_filters=32, conv1_filter_size=(3, 3), pool1_pool_size=(2, 2),  # pool_size used to be called ds in old versions of Lasagne
        conv2_num_filters=64, conv2_filter_size=(2, 2), pool2_pool_size=(2, 2),
        hidden4_num_units=500,
        output_num_units=10, output_nonlinearity=nonlinearities.softmax,
        # Learning rate parameters
        update_learning_rate=0.01,
        update_momentum=0.90,
        regression=False,
        # We only train for 10 epochs
        max_epochs=10,
        verbose=1,
        # Fraction of the training data held out for validation
        eval_size=0.2,
    )
Fitting the model with the standard BatchIterator, which does no on-the-fly manipulation.
net1 = createNet()
d = net1.fit(X[0:1000,:,:,:], y[0:1000]);  # Training on the first 1000 examples
DenseLayer        (None, 10)          produces      10 outputs
DenseLayer        (None, 500)         produces     500 outputs
MaxPool2DLayer    (None, 64, 6, 6)    produces    2304 outputs
Conv2DLayer       (None, 64, 12, 12)  produces    9216 outputs
MaxPool2DLayer    (None, 32, 13, 13)  produces    5408 outputs
Conv2DLayer       (None, 32, 26, 26)  produces   21632 outputs
InputLayer        (None, 1, 28, 28)   produces     784 outputs

 Epoch  |  Train loss  |  Valid loss  |  Train / Val  |  Valid acc  |  Dur
--------|--------------|--------------|---------------|-------------|------
     1  |    2.273287  |    2.175909  |     1.044753  |     42.11%  |  6.9s
     2  |    1.983694  |    1.815794  |     1.092467  |     49.69%  |  7.0s
     3  |    1.438780  |    1.172793  |     1.226798  |     62.91%  |  7.0s
     4  |    0.854319  |    0.802410  |     1.064692  |     76.42%  |  7.1s
     5  |    0.541192  |    0.684301  |     0.790869  |     82.05%  |  7.2s
     6  |    0.375430  |    0.662087  |     0.567040  |     81.27%  |  7.0s
     7  |    0.286839  |    0.616086  |     0.465583  |     84.15%  |  6.9s
     8  |    0.223945  |    0.570648  |     0.392440  |     85.46%  |  6.9s
     9  |    0.181760  |    0.609484  |     0.298219  |     84.81%  |  6.9s
    10  |    0.158122  |    0.545048  |     0.290107  |     85.59%  |  6.9s
Now we create our own BatchIterator, which flips each image left-right (mirroring it about the vertical axis).
from nolearn.lasagne import BatchIterator

class SimpleBatchIterator(BatchIterator):
    def transform(self, Xb, yb):
        # Both the incoming and the outgoing shape is (10, 1, 28, 28)
        Xb, yb = super(SimpleBatchIterator, self).transform(Xb, yb)
        return Xb[:, :, :, ::-1], yb  # <--- Here we do the flipping of the images
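Before plugging the iterator into the network, a quick sanity check does not hurt. This snippet is only illustrative (it assumes X and y are loaded as above) and verifies that transform really mirrors each image:

import numpy as np

it = SimpleBatchIterator(batch_size=10)
Xb, yb = it.transform(X[0:10], y[0:10])
# Each returned image should equal the original with its columns reversed
assert np.array_equal(Xb[0, 0], X[0, 0, :, ::-1])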
# Setting the new batch iterator
net1.batch_iterator_train = SimpleBatchIterator(batch_size=10)
d = net1.fit(X[0:1000,:,:,:],y[0:1000])
DenseLayer        (None, 10)          produces      10 outputs
DenseLayer        (None, 500)         produces     500 outputs
MaxPool2DLayer    (None, 64, 6, 6)    produces    2304 outputs
Conv2DLayer       (None, 64, 12, 12)  produces    9216 outputs
MaxPool2DLayer    (None, 32, 13, 13)  produces    5408 outputs
Conv2DLayer       (None, 32, 26, 26)  produces   21632 outputs
InputLayer        (None, 1, 28, 28)   produces     784 outputs

 Epoch  |  Train loss  |  Valid loss  |  Train / Val  |  Valid acc  |  Dur
--------|--------------|--------------|---------------|-------------|------
     1  |    1.187332  |    1.545440  |     0.768281  |     48.83%  |  7.4s
     2  |    0.233399  |    2.098233  |     0.111236  |     44.24%  |  7.4s
     3  |    0.089768  |    2.770103  |     0.032406  |     43.32%  |  7.3s
     4  |    0.026483  |    3.479021  |     0.007612  |     36.92%  |  7.3s
     5  |    0.016780  |    3.645117  |     0.004603  |     42.27%  |  7.5s
     6  |    0.003624  |    3.896002  |     0.000930  |     43.85%  |  7.5s
     7  |    0.001605  |    3.995633  |     0.000402  |     43.59%  |  7.3s
     8  |    0.001078  |    4.079319  |     0.000264  |     43.59%  |  7.3s
     9  |    0.000839  |    4.153205  |     0.000202  |     43.59%  |  7.5s
    10  |    0.000698  |    4.219086  |     0.000165  |     43.59%  |  7.8s
We see that each epoch takes a bit longer (about 0.4 s), which is strange, since in theory the flipping should be done on the CPU while the GPU is busy with the fitting. Maybe the network is not big enough to keep the GPU busy? Anyway.
The second observation is that the validation accuracy drops from approx. 85% to approx. 44%. The reason may be that it's not such a good idea to flip the training data in the first place. Any idea why?
%matplotlib inline
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(10, 10))
for i in range(18):
    # Original digit on the left...
    a = fig.add_subplot(6, 6, 2 * i + 1, xticks=[], yticks=[])
    plt.imshow(-X[i, 0, :, :], cmap=plt.get_cmap('gray'))
    # ...and its flipped version on the right
    a = fig.add_subplot(6, 6, 2 * i + 2, xticks=[], yticks=[])
    plt.imshow(-X[i, 0, :, ::-1], cmap=plt.get_cmap('gray'))
Looking at the plot, the problem becomes obvious: handwritten digits are not invariant under mirroring, so the network was trained on images that look quite different from the (unflipped) validation data. Maybe we should do a more clever data augmentation next time: have a look at DataAugmentationII.ipynb
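For transformations under which the objects really are invariant, one would typically flip only a random subset of each batch instead of every image. A minimal sketch of such a randomized iterator (the class name RandomFlipBatchIterator is made up for illustration; it is not part of nolearn):

import numpy as np
from nolearn.lasagne import BatchIterator

class RandomFlipBatchIterator(BatchIterator):
    def transform(self, Xb, yb):
        Xb, yb = super(RandomFlipBatchIterator, self).transform(Xb, yb)
        Xb = Xb.copy()  # do not modify the caller's array in place
        # Pick roughly half of the batch at random and mirror those images
        mask = np.random.rand(Xb.shape[0]) < 0.5
        Xb[mask] = Xb[mask, :, :, ::-1]
        return Xb, yb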
If overriding the transform
method is not sufficient, one might even override the __iter__
method of the BatchIterator.
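For example, to feed every batch twice, once unchanged and once mirrored, one could yield additional batches. A minimal sketch (the class name DoublingBatchIterator is invented here), assuming the parent __iter__ yields (Xb, yb) tuples as nolearn's BatchIterator does at the time of writing:

from nolearn.lasagne import BatchIterator

class DoublingBatchIterator(BatchIterator):
    def __iter__(self):
        for Xb, yb in super(DoublingBatchIterator, self).__iter__():
            yield Xb, yb                 # the original batch...
            yield Xb[:, :, :, ::-1], yb  # ...and its mirrored version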