Much of the code here is from the scikit-learn documentation: http://scikit-learn.org/stable/auto_examples/applications/svm_gui.html
Some supervised learning problems can be solved by very simple models (called generalized linear models), depending on the data. Others cannot.
To grasp the difference between the two cases, we'll run the interactive
graphical example found in the figures directory. To do this, you can
open a terminal and run the file svm_gui.py
$ cd scripts
$ python svm_gui.py
Place some data points belonging to one of the two target
classes ('white' or 'black') using left-click and right-click.
Choose some parameters of a Support Vector Machine to be trained
on this toy dataset (n_samples
is the number of clicks, n_features
is 2).
Click the Fit button to train the model and see the decision boundary.
The accuracy of the model is displayed on stdout.
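Under the hood, the GUI does something like the following sketch (the click coordinates and labels here are hypothetical stand-ins for the points you would draw):

```python
import numpy as np
from sklearn import svm

# Each "click" contributes one sample with 2 features (its x, y position);
# +1 / -1 stand in for the 'white' and 'black' classes.
X = np.array([[1.0, 1.0], [2.0, 1.5], [1.5, 2.0],   # left clicks
              [4.0, 4.0], [5.0, 4.5], [4.5, 5.0]])  # right clicks
y = np.array([1, 1, 1, -1, -1, -1])

# Fit a linear Support Vector Machine on the clicked points.
clf = svm.SVC(kernel="linear", C=1.0)
clf.fit(X, y)

# Training accuracy, like the figure printed on stdout by the GUI.
print("Accuracy:", clf.score(X, y))
```

These two clusters are well separated, so a linear kernel reaches perfect training accuracy.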
The following figures demonstrate two cases: in the first, a linear model can perfectly separate the two classes; in the second, the data is not linearly separable (a model with a Gaussian kernel is required in that case).
%matplotlib inline
from figures.svm_gui_frames import plot_linear_model, plot_rbf_model
plot_linear_model()
This figure shows a linear Support Vector Machine trained to perfectly separate two sets of data points labeled as white and black in a 2D space.
plot_rbf_model()
This shows a Support Vector Machine with a Gaussian kernel trained to separate two sets of data points labeled as white and black in a 2D space. This dataset could not have been separated by a simple linear model.
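The contrast between the two kernels can also be reproduced in code. The exact dataset in the figure is not specified, so the sketch below uses scikit-learn's `make_circles` as an illustrative stand-in for a non-linearly-separable problem:

```python
from sklearn import svm
from sklearn.datasets import make_circles

# Two concentric rings of points: one class surrounds the other,
# so no straight line can separate them.
X, y = make_circles(n_samples=100, factor=0.3, noise=0.05, random_state=0)

linear_clf = svm.SVC(kernel="linear").fit(X, y)
rbf_clf = svm.SVC(kernel="rbf", gamma=2.0).fit(X, y)

print("linear kernel accuracy:", linear_clf.score(X, y))
print("rbf kernel accuracy:   ", rbf_clf.score(X, y))
```

The linear kernel does little better than chance on this data, while the Gaussian (RBF) kernel separates the rings almost perfectly.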
Use the GUI to fit a model that can solve the XOR problem. The XOR problem consists of 4 samples: the points (0, 0) and (1, 1) belong to one class, while (0, 1) and (1, 0) belong to the other.
Question: is the XOR problem linearly separable?
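One way to investigate the question is to fit both kernels on the four XOR samples and compare their training accuracies; a minimal sketch (kernel parameters chosen here for illustration):

```python
import numpy as np
from sklearn import svm

# The four XOR samples: opposite corners of the unit square share a label.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

linear_clf = svm.SVC(kernel="linear").fit(X, y)
rbf_clf = svm.SVC(kernel="rbf", gamma=4.0, C=10.0).fit(X, y)

print("linear kernel accuracy:", linear_clf.score(X, y))
print("rbf kernel accuracy:   ", rbf_clf.score(X, y))
```

No straight line can put (0, 0) and (1, 1) on one side and (0, 1) and (1, 0) on the other, so the linear kernel cannot reach perfect accuracy, while the Gaussian kernel can.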
Use the GUI to construct a problem with fewer than 10 points where the predictive accuracy of the best linear model is 50%.