Regularization
Regularization - Introduction
• Overfitting: If we have too many features, the learned hypothesis may fit
the training set very well, but fail to generalize to new examples.
[Figure: a housing-price example with many candidate features — size of house, no. of bedrooms, no. of floors, age of house, average income in neighborhood, kitchen size — and Price-vs-Size plots showing fits of increasing complexity; also a classification example plotted in features x1 and x2, where g is the sigmoid function.]
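The overfitting idea above can be sketched numerically. Below is a minimal illustration on an invented 1-D "price vs. size" dataset (all values here are made up): a straight line versus a degree-7 polynomial fitted with numpy.polyfit. The flexible model always achieves lower training error, which is exactly why low training error alone says nothing about generalization.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D "price vs. size" data: a linear trend plus noise.
x_train = np.linspace(0.0, 1.0, 10)
y_train = 2.0 * x_train + rng.normal(scale=0.1, size=x_train.size)

def train_mse(deg):
    """Mean squared error on the training set for a degree-`deg` fit."""
    coeffs = np.polyfit(x_train, y_train, deg=deg)
    return float(np.mean((np.polyval(coeffs, x_train) - y_train) ** 2))

# The 8-parameter model fits the 10 training points almost perfectly;
# the straight line cannot. Lower training error is not evidence of
# better generalization to new examples.
assert train_mse(7) <= train_mse(1)
```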
Cost function (linear regression with an L2 penalty; $\theta_0$ is not regularized):

$$J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + \lambda\sum_{j=1}^{n}\theta_j^2\right]$$

Gradient descent:

Repeat {
$$\theta_0 := \theta_0 - \alpha\,\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_0^{(i)}$$
$$\theta_j := \theta_j - \alpha\left[\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)} + \frac{\lambda}{m}\,\theta_j\right] \quad (j = 1,\dots,n)$$
}
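The update rule above can be sketched directly in code. This is a minimal numpy implementation (the data in the usage example is invented): batch gradient descent for linear regression with an L2 penalty, where the intercept theta[0] is left unregularized, matching the two-case update.

```python
import numpy as np

def ridge_gradient_descent(X, y, lam=1.0, alpha=0.1, iters=2000):
    """Batch gradient descent for linear regression with an L2 penalty.

    X is (m, n) without the intercept column; theta[0] is the intercept
    and, following the update rule above, is not penalized.
    """
    m, n = X.shape
    Xb = np.hstack([np.ones((m, 1)), X])   # prepend x0 = 1
    theta = np.zeros(n + 1)
    for _ in range(iters):
        grad = Xb.T @ (Xb @ theta - y) / m  # (1/m) * sum of errors * x_j
        grad[1:] += (lam / m) * theta[1:]   # penalty term skips theta_0
        theta -= alpha * grad
    return theta

# Example: with lam = 0 this reduces to plain linear regression,
# recovering y = 1 + 2x on noiseless data.
X = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
y = 1.0 + 2.0 * X.ravel()
theta = ridge_gradient_descent(X, y, lam=0.0, alpha=0.5, iters=5000)
```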
• If lambda is zero, the penalty term vanishes and we recover Ordinary Least Squares.
• If lambda is very large, the penalty dominates the cost and drives the coefficients toward zero, so the model under-fits.
• How lambda is chosen is therefore important; with a suitable value, this technique works very well to avoid over-fitting.
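The two extremes in the bullets above can be checked with the closed-form ridge solution, theta = (XᵀX + λI)⁻¹ Xᵀy (intercept omitted for brevity; the data below is synthetic and invented for illustration): λ = 0 reproduces the least-squares solution, and a huge λ crushes the coefficients toward zero.

```python
import numpy as np

def ridge_closed_form(X, y, lam):
    """Closed-form ridge regression: theta = (X^T X + lam*I)^{-1} X^T y."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = X @ np.array([3.0, -2.0, 1.0]) + rng.normal(scale=0.1, size=100)

ols = ridge_closed_form(X, y, 0.0)       # lambda = 0: plain least squares
shrunk = ridge_closed_form(X, y, 1e6)    # huge lambda: coefficients -> 0

assert np.linalg.norm(shrunk) < np.linalg.norm(ols)
```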
12/26/2021 School of Computer Science and Engineering 24
• As the formulas for L1 and L2 regularization show, L1 regularization adds a penalty term to the cost function equal to the sum of the absolute values of the weights (wj), while L2 regularization adds the sum of the squared weights (wj).
• L2 is Ridge:
• It will not eliminate any variables; instead it assigns weights (think of them as importance) to the variables, so that the model keeps all of them while giving more weight to the important ones, in order to build a good model for you.
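The contrast between the two penalties is easiest to see in the special, well-known case of an orthonormal design (XᵀX = I), where both have closed-form solutions in terms of the unpenalized least-squares weights: ridge rescales every weight, while lasso soft-thresholds them and can therefore set small weights exactly to zero. A minimal sketch (the weight vector below is invented for illustration):

```python
import numpy as np

def ridge_orthonormal(w_ols, lam):
    """Ridge solution when X^T X = I: every weight is shrunk by 1/(1+lam)."""
    return w_ols / (1.0 + lam)

def lasso_orthonormal(w_ols, lam):
    """Lasso solution when X^T X = I: soft-threshold each weight by lam."""
    return np.sign(w_ols) * np.maximum(np.abs(w_ols) - lam, 0.0)

w_ols = np.array([4.0, 0.5, -3.0, 0.2])  # hypothetical OLS weights
w_ridge = ridge_orthonormal(w_ols, 1.0)  # all weights stay nonzero
w_lasso = lasso_orthonormal(w_ols, 1.0)  # small weights become exactly 0
```

This is why ridge keeps all variables while lasso can be used for feature selection.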