Regression and Regularization
▪ The linear regression model takes the form y = a1x1 + a2x2 + … + anxn + b,
where {x1, x2, …, xn} are the independent variables of the dataset
and {a1, a2, …, an} are the coefficients of these variables, representing the weight that each
independent variable contributes to the resultant target variable y.
The b value is the constant or bias term.
▪ In nonlinear regression, the relationship is a nonlinear equation (e.g., polynomial or
exponential)
▪ Objective?
▪ find the optimum values for the coefficients {a1, a2, …, an}
▪ Steps?
▪ model selection, model fitting, model prediction, and model evaluation
▪ Given: one independent variable and one dependent variable
▪ Required: simple linear regression
▪ Model selection: y = ax + b
▪ Residue: a small amount that remains after the main part has been taken away
▪ Model fitting: find a and b
For datasets with more than one independent variable, the same procedure needs to be
followed for each of the independent variables to find all the involved parameters.
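The fitting step above can be sketched with the closed-form least-squares solution for the one-variable case. This is a minimal illustration; the function and variable names (`fit_simple_linear`, `xs`, `ys`) are illustrative, not from the slides.

```python
# Minimal sketch: fitting y = a*x + b by ordinary least squares.
def fit_simple_linear(xs, ys):
    """Return (a, b) minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance(x, y) divided by variance(x)
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
    # Intercept: forces the fitted line through the point of means
    b = mean_y - a * mean_x
    return a, b

# Model prediction on a perfectly linear dataset y = 2x + 1
a, b = fit_simple_linear([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)  # -> 2.0 1.0
```

With more than one independent variable, the same least-squares idea is solved for all coefficients at once (e.g., via the normal equations), as the slide notes.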
R2 Metric for Correlation Analysis
▪ Coefficient of determination is employed to gauge the quality of fit of the derived
regression relationship.
▪ When a linear regression fails to fit a linear model with strong correlation from the
independent variable to the dependent variable, a nonlinear or polynomial
regression is advised to correlate the data variables in a curvilinear model.
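The coefficient of determination described above is R² = 1 − SSres / SStot, where SSres is the residual sum of squares and SStot the total sum of squares. A minimal sketch (names are illustrative):

```python
# Coefficient of determination: R^2 = 1 - SS_res / SS_tot
def r_squared(ys, preds):
    mean_y = sum(ys) / len(ys)
    # Residual sum of squares: what the model fails to explain
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    # Total sum of squares: variability around the mean
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

print(r_squared([3, 5, 7, 9], [3, 5, 7, 9]))  # perfect fit -> 1.0
print(r_squared([3, 5, 7, 9], [4, 5, 7, 8]))  # imperfect fit, below 1.0
```

A value near 1 indicates a strong fit; a low value suggests trying a nonlinear or polynomial model, as the slide advises.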
▪ The rationale for logistic regression is based on the odds ratio, where the odds of a
specific event occurring are defined as the probability of the event occurring divided
by the probability of that event not occurring
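The definition above translates directly into a one-line computation (a minimal sketch; the probability 0.8 is an illustrative value):

```python
# Odds of an event: P(event) / P(not event)
def odds(p):
    return p / (1 - p)

# An event with probability 0.8 has odds of 4 to 1
print(round(odds(0.8), 6))  # -> 4.0
```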
▪ Logistic regression is considered an extension of the linear regression analysis,
and its corresponding model can be used to classify input data records into a set
of given categories or discrete values that form the dependent variable.
▪ For example, the dependent variable (y) may represent a loan prediction
(approved/rejected) based on a credit score (i.e., the independent variable).
▪ Example: assume two values for y: “0” to denote a rejected loan status
and “1” to denote an approved loan status.
▪ In this example, a linear regression model could possibly present a continuous
range of values including “0” and “1,” but it will not exactly produce a {“0”, “1”}
output.
▪ Therefore, a logit function with an “S” shape is introduced to the linear regression
equation, and the underlying linear model will map the predictions with the logistic
function to produce the binary values {“0”, “1”}.
▪ The maximum likelihood method generates the logit function that predicts the
natural logarithm of the odds ratio: ln(p / (1 − p)) = ax + b
▪ The predicted odds ratio and the predicted probability of success are found next.
Hence, the logistic regression equation is p = 1 / (1 + e^−(ax + b))
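The mapping from the linear model to a {0, 1} output can be sketched as follows. The coefficients a = 1, b = −5 and the threshold 0.5 are illustrative assumptions, not fitted values from the loan example.

```python
import math

# The logistic (sigmoid) function squashes the linear output into (0, 1)
def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def predict(x, a, b, threshold=0.5):
    p = sigmoid(a * x + b)             # predicted probability of success
    return 1 if p >= threshold else 0  # map to the binary {0, 1} output

# With a = 1, b = -5: low inputs map to 0, high inputs to 1
print(predict(3, a=1, b=-5))  # -> 0
print(predict(8, a=1, b=-5))  # -> 1
```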
▪ The plot of the logistic equation on the given dataset is shown in the source:
https://towardsdatascience.com/logit-of-logistic-regression-understanding-the-fundamentals-f384152a33d1
▪ Regression analysis tries to develop the best fit between the independent
variables and the dependent variable such that the loss function is minimized
and/or the value of R2 is maximized.
▪ However, the developed model can overfit the training dataset and lose the value of
generalization when applied to a different testing dataset.
▪ Furthermore, while tuning the model for its best fit, the existence of any wrong
data points (i.e., outliers) affects the development, leading to an incorrect
regression model.
▪ To avoid the issues of overfitting and outliers and to have a more robust model, we
penalize the loss function by adding a penalty term to the regression model. Such a
penalty is known as regularization.
▪ For the case of regression, it comes in two common forms: ridge and lasso
regularization.
▪ The ridge regression (or L2 regularization) proceeds by adding a term to the loss
function that penalizes the sum of squares of the model coefficients:
before: L = Σ (yi − ŷi)²
after: L = Σ (yi − ŷi)² + λ Σ aj²
where λ is a constant that controls the level of the penalty, and aj stands for the model
coefficients.
The higher λ is, the greater the emphasis on the reduction of the coefficient magnitudes, at
the expense of tolerating higher residuals.
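The penalized loss above can be sketched in a few lines (a minimal illustration; the names `ridge_loss`, `lam`, and the sample values are assumptions, not from the slides):

```python
# L2-penalized loss: residual sum of squares plus lambda * sum of squared coefficients
def ridge_loss(ys, preds, coeffs, lam):
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    penalty = lam * sum(a ** 2 for a in coeffs)  # ridge penalty term
    return ss_res + penalty

# A higher lambda puts more weight on shrinking the coefficients
print(ridge_loss([3, 5], [3.1, 4.8], [2.0, 1.0], lam=0.0))  # residuals only
print(ridge_loss([3, 5], [3.1, 4.8], [2.0, 1.0], lam=0.5))  # residuals + penalty
```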
▪ In L1 (lasso) regularization, the objective is to minimize the sum of the absolute
values of the coefficients instead of their squares:
before: L = Σ (yi − ŷi)²
after: L = Σ (yi − ŷi)² + λ Σ |aj|
▪ As a result, both large and small coefficient values are addressed and driven
down.
▪ In summary, the ridge (L2) penalty is λ Σ aj², and the lasso (L1) penalty is λ Σ |aj|.
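The contrast between the two penalty terms can be sketched on the same set of coefficients (a minimal illustration; the sample coefficients are assumed values):

```python
# Ridge penalty: sum of squared coefficients
def l2_penalty(coeffs, lam):
    return lam * sum(a ** 2 for a in coeffs)

# Lasso penalty: sum of absolute coefficient values
def l1_penalty(coeffs, lam):
    return lam * sum(abs(a) for a in coeffs)

coeffs = [4.0, 0.5, -0.1]
# Squaring emphasizes the large coefficient (16.0 of the total),
# while absolute values weight small and large ones proportionally
print(l2_penalty(coeffs, lam=1.0))
print(l1_penalty(coeffs, lam=1.0))
```

This is why lasso tends to drive small coefficients all the way to zero, while ridge mainly shrinks the largest ones.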