Both classification and regression are done by Estimator objects (classifiers and regressors extend this class). For most classifiers and regressors, an Estimator looks more or less like the following:
class Estimator(object):
    def __init__(self, *args, **kwargs):
        # Initialization of the object
        pass

    def fit(self, X, y):
        """Train the estimator.

        Arguments:
            X (numpy array-like): training data
            y (numpy array-like): labels
        """
        # The learning algorithm goes here; fit does not return
        # predictions, it updates the Estimator object in place
        pass

    def predict(self, X):
        """Predict on the test data.

        Arguments:
            X (numpy array-like): test data

        Returns:
            y (numpy array): predicted labels
        """
        # compute and return the predictions
        return predictions
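To make the interface concrete, here is a toy estimator that fills in this skeleton. It is a hypothetical example, not part of scikit-learn: a majority-class classifier whose fit simply memorizes the most frequent training label and whose predict returns it for every test instance.

```python
import numpy as np

class MajorityClassifier:
    """Toy estimator: always predicts the most frequent training label."""

    def fit(self, X, y):
        # Learn the most common label in the training set
        labels, counts = np.unique(y, return_counts=True)
        self.majority_ = labels[np.argmax(counts)]
        return self  # scikit-learn estimators conventionally return self

    def predict(self, X):
        # Predict the stored majority label for every test instance
        return np.full(len(X), self.majority_)

clf = MajorityClassifier()
clf.fit(np.array([[0], [1], [2]]), np.array([1, 1, 0]))
print(clf.predict(np.array([[5], [6]])))  # → [1 1]
```

Returning self from fit is what lets calls be chained, e.g. clf.fit(X, y).predict(X_test).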
If we want to summarize Scikit Learn in three lines:
est = Estimator()
est.fit(X_train, y_train)
est.predict(X_test)
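Those three lines run unchanged with any concrete estimator. As one example (assuming scikit-learn is installed), here they are with LogisticRegression on the bundled iris dataset; the specific estimator and dataset are my choice for illustration:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load a toy dataset and split it into train and test sets
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The same three lines, with a concrete estimator
est = LogisticRegression(max_iter=1000)
est.fit(X_train, y_train)
predictions = est.predict(X_test)
```

Swapping LogisticRegression for, say, a decision tree or an SVM changes only the first of the three lines.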
First, we initialize the estimator, then fit it by providing the training dataset and its labels. Then we predict on the test dataset to get the predictions. Classification produces discrete labels (one of a fixed number of classes) whereas regression produces real numbers. However, the API stays the same for classifiers and regressors in supervised learning. For unsupervised learning, there is no target variable, so we do not have a predict function; we have fit and transform functions instead.
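As an example of the unsupervised side (again assuming scikit-learn is installed, and picking StandardScaler purely for illustration): fit learns per-column statistics from X alone, with no labels, and transform applies them.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])

# Unsupervised: fit takes no label argument, it learns the
# per-column mean and scale from X itself
scaler = StandardScaler()
scaler.fit(X)
X_scaled = scaler.transform(X)
```

After transforming, each column has zero mean and unit variance; fit_transform(X) is a common shorthand for the two calls.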