Much of the code here is from the scikit-learn documentation: http://scikit-learn.org/stable/auto_examples/applications/svm_gui.html
Some supervised learning problems can be solved by very simple models (called generalized linear models), depending on the data. Others cannot.
To grasp the difference between the two cases, we'll run the interactive
graphical example found in the figures directory. To do this, you can
open a terminal and run the file svm_gui.py
$ cd scripts
$ python svm_gui.py
Place some data points belonging to one of the two target
classes ('white' or 'black') using left-click and right-click.
Choose some parameters of a Support Vector Machine to be trained
on this toy dataset (n_samples
is the number of clicks, n_features
is 2).
Click the Fit button to train the model and see the decision boundary.
The accuracy of the model is displayed on stdout.
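Under the hood, the GUI does something like the following sketch (the click coordinates and labels here are hypothetical stand-ins for the points you would draw):

```python
import numpy as np
from sklearn import svm

# Each "click" contributes one sample with 2 features (its x, y position);
# +1 / -1 stand in for the 'white' and 'black' classes.
X = np.array([[1.0, 1.0], [2.0, 1.5], [1.5, 2.0],   # left clicks
              [4.0, 4.0], [5.0, 4.5], [4.5, 5.0]])  # right clicks
y = np.array([1, 1, 1, -1, -1, -1])

# Fit a linear Support Vector Machine on the clicked points.
clf = svm.SVC(kernel="linear", C=1.0)
clf.fit(X, y)

# Training accuracy, like the figure printed on stdout by the GUI.
print("Accuracy:", clf.score(X, y))
```

These two clusters are well separated, so a linear kernel reaches perfect training accuracy.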
The following figures demonstrate two cases: in the first, a linear model can perfectly separate the two classes; in the second, the data is not linearly separable (a model with a Gaussian kernel is required in that case).
%matplotlib inline
from figures.svm_gui_frames import plot_linear_model, plot_rbf_model
plot_linear_model()
This figure shows a linear Support Vector Machine trained to perfectly separate two sets of data points labeled as white and black in a 2D space.
plot_rbf_model()
This shows a Support Vector Machine with a Gaussian kernel trained to separate two sets of data points labeled as white and black in a 2D space. This dataset could not have been separated by a simple linear model.
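The contrast between the two kernels can also be reproduced in code. The exact dataset in the figure is not specified, so the sketch below uses scikit-learn's `make_circles` as an illustrative stand-in for a non-linearly-separable problem:

```python
from sklearn import svm
from sklearn.datasets import make_circles

# Two concentric rings of points: one class surrounds the other,
# so no straight line can separate them.
X, y = make_circles(n_samples=100, factor=0.3, noise=0.05, random_state=0)

linear_clf = svm.SVC(kernel="linear").fit(X, y)
rbf_clf = svm.SVC(kernel="rbf", gamma=2.0).fit(X, y)

print("linear kernel accuracy:", linear_clf.score(X, y))
print("rbf kernel accuracy:   ", rbf_clf.score(X, y))
```

The linear kernel does little better than chance on this data, while the Gaussian (RBF) kernel separates the rings almost perfectly.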
Use the GUI to fit a model that can solve the XOR problem. The XOR problem consists of 4 samples: the points (0, 0) and (1, 1) belong to one class, while (0, 1) and (1, 0) belong to the other.
Question: is the XOR problem linearly separable?
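One way to investigate the question is to fit both kernels on the four XOR samples and compare their training accuracies; a minimal sketch (kernel parameters chosen here for illustration):

```python
import numpy as np
from sklearn import svm

# The four XOR samples: opposite corners of the unit square share a label.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

linear_clf = svm.SVC(kernel="linear").fit(X, y)
rbf_clf = svm.SVC(kernel="rbf", gamma=4.0, C=10.0).fit(X, y)

print("linear kernel accuracy:", linear_clf.score(X, y))
print("rbf kernel accuracy:   ", rbf_clf.score(X, y))
```

No straight line can put (0, 0) and (1, 1) on one side and (0, 1) and (1, 0) on the other, so the linear kernel cannot reach perfect accuracy, while the Gaussian kernel can.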
Use the GUI to construct a problem with fewer than 10 points where the predictive accuracy of the best linear model is 50%.