Deep learning is a specific kind of machine learning. In order to understand deeplearning well, one must have a solid understanding of the basic principles of ma-chine learning
In order to evaluate the abilities of a machine learning algorithm, we must designa quantitative measure of its performance. Usually this performance measure Pis specific to the task T being carried out by the system
and
with
so the error increases whenever the Euclidean distance between the predictionsand the targets increases.
in a way that reduces
when the algorithmis allowed to gain experience by observing a training set
we can simply solve for where its gradient is 0:
so the mapping from parameters to predictions is still a linear function but themapping from features to predictions is now an affine function.
The central challenge in machine learning is that we must perform well on new,previously unseen inputs—not just those on which our model was trained. ==> generalization
as another feature provided to the linear regression model, wecan learn a model that is quadratic as a function of
:
that expresses a preference for the weights to have smaller squared
norm.
Specifically,
where
is a value chosen ahead of time that controls the strength of our preferencefor smaller weights.
When
we impose no preference,
and larger
forces the weights to become smaller.
Minimizing
results in a choice of weights thatmake a tradeoff between fitting the training data and being small.