This exercise will walk you through the process of using machine learning for facial recognition.
from __future__ import print_function, division
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# use seaborn for better matplotlib styles
import seaborn; seaborn.set(style='white')
The data we'll use is a number of snapshots of the faces of world leaders. We'll fetch the data as follows:
from sklearn.datasets import fetch_lfw_people
faces = fetch_lfw_people(min_faces_per_person=70, resize=0.4)
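Before plotting anything, it helps to see what the fetched object contains. The sketch below is one possible way to summarize it; summarize_faces is a hypothetical helper (not part of scikit-learn), and the small random Bunch is a stand-in so the code runs without downloading the dataset. It assumes only the images, data, target, and target_names attributes that fetch_lfw_people returns.

```python
import numpy as np
from sklearn.utils import Bunch


def summarize_faces(bunch):
    """Collect basic facts about a Bunch shaped like the fetch_lfw_people result."""
    n_samples, height, width = bunch.images.shape
    return {
        "n_samples": n_samples,
        "image_shape": (height, width),
        "n_features": bunch.data.shape[1],  # flattened pixels per image
        "n_people": len(bunch.target_names),
    }


# Synthetic stand-in (arbitrary 8x5 "images"), NOT the real face data
rng = np.random.RandomState(0)
images = rng.rand(6, 8, 5)
demo = Bunch(
    images=images,
    data=images.reshape(len(images), -1),
    target=np.array([0, 1, 2, 0, 1, 2]),
    target_names=np.array(["person A", "person B", "person C"]),
)

summary = summarize_faces(demo)
print(summary)
```

Calling summarize_faces(faces) on the real dataset should report the corresponding values for the face images.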
1. Use plt.imshow to plot several of the images. How many pixels are in each image?
2. Use sklearn.model_selection.train_test_split to split the data into a training set and a test set.

Let's use some dimensionality reduction routines to try to understand the data. Just a warning: you'll probably find that, unlike in the case of the handwritten digits, the projections will be a bit too jumbled to give much insight. Still, it's always a useful step in understanding your data!
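A minimal sketch of the split and projection steps follows. It assumes PCA as the dimensionality reduction routine (the exercise leaves the choice open) and uses random arrays as a stand-in for faces.data and faces.target, so the shapes and the 25% test fraction are illustrative only.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split

# Stand-in data: 60 fake "images" of 48 pixels each, 3 fake identities
rng = np.random.RandomState(42)
X = rng.rand(60, 48)
y = rng.randint(0, 3, size=60)

# Hold out a quarter of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Project the training images onto their first two principal components
pca = PCA(n_components=2)
X_proj = pca.fit_transform(X_train)

# Scatter plot of the projection, colored by label
fig, ax = plt.subplots()
ax.scatter(X_proj[:, 0], X_proj[:, 1], c=y_train, cmap="viridis")
ax.set_xlabel("component 1")
ax.set_ylabel("component 2")

print(X_train.shape, X_test.shape, X_proj.shape)
```

With the real data, replace X and y with faces.data and faces.target. Other estimators from sklearn.manifold or sklearn.decomposition can be swapped in for PCA in the same way.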
Here we'll perform a classification task on our data. Given a training set, we want to build a classifier that accurately predicts the labels of the test set.
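1. Split the data into a training set and a test set (you can use sklearn.model_selection.train_test_split, which replaces the older sklearn.cross_validation.train_test_split).
2. Use a support vector classifier (sklearn.svm.SVC) to classify the data. Import this and instantiate the estimator.
3. Fit the estimator to the training data, predict on the test data, and use sklearn.metrics.accuracy_score to see how well you're doing.
4. Adjust the C parameter of SVC. Look at the SVC doc string and try some choices for the kernel, for C, and for gamma. What's the best accuracy you can find?
5. Examine the results with sklearn.metrics.classification_report and sklearn.metrics.confusion_matrix, and plot some of the images with the true and predicted label. How well does it do?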
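The classification steps above can be sketched roughly as follows. This is one possible workflow, not a reference solution: it runs on synthetic data from make_classification (an assumption, used so the code works without the face images), and the small grid of kernel, C, and gamma values is arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic 3-class problem standing in for (faces.data, faces.target)
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=10, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Try a few hyperparameter combinations and keep the most accurate model
best_acc, best_clf = -1.0, None
for kernel in ("linear", "rbf"):
    for C in (0.1, 1, 10):
        for gamma in ("scale", 0.01):  # gamma is ignored by the linear kernel
            clf = SVC(kernel=kernel, C=C, gamma=gamma)
            clf.fit(X_train, y_train)
            acc = accuracy_score(y_test, clf.predict(X_test))
            if acc > best_acc:
                best_acc, best_clf = acc, clf

# Per-class precision/recall and the matrix of true vs. predicted labels
y_pred = best_clf.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
print("best accuracy:", best_acc)
print(classification_report(y_test, y_pred))
print(cm)
```

Note that picking hyperparameters by test-set accuracy, as the exercise does for simplicity, gives an optimistic estimate; a separate validation split or cross-validation is the usual remedy. For the face data, passing target_names=faces.target_names to classification_report labels the rows with the people's names.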