
CS446: Machine Learning

Lecture 14-15 (ML Models – Linear Regression)


Instructor:
Dr. Muhammad Kabir
Assistant Professor
muhammad.kabir@umt.edu.pk

School of Systems and Technology


Department of Computer Science
University of Management and Technology, Lahore
Previous Lectures…
 Selecting the right ML model
 Cross-validation – statistical measures
 Overfitting in ML – concept, signs, causes and prevention
 Underfitting in ML – concept, signs, causes and prevention
 Bias-variance tradeoff in ML
 Loss function
 Model evaluation – accuracy score, mean squared error
 Model parameters and hyperparameters
 Gradient descent in ML
Today’s Lectures…
 Linear Regression – intuition
 Linear Regression – mathematical understanding
 Gradient Descent for Linear Regression
How Machine Learning Works….
Linear Regression….

What would be the salary of a person with three years of experience?

~650,000 PKR per month
Linear Regression….

X: 1  2  3  4  5
Y: 5  7  9 11 13
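The table above is exactly linear: each unit increase in X raises Y by 2, and the line Y = 2X + 3 passes through every point. As a minimal sketch (using NumPy's `polyfit`, which is not part of the lecture), a least-squares fit recovers these values:

```python
import numpy as np

# Data from the slide's table
X = np.array([1, 2, 3, 4, 5], dtype=float)
Y = np.array([5, 7, 9, 11, 13], dtype=float)

# Least-squares fit of a straight line Y = m*X + c
m, c = np.polyfit(X, Y, 1)
print(m, c)  # m ≈ 2.0, c ≈ 3.0, since Y = 2X + 3 fits the data exactly
```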
Linear Regression….

What if there are more than two variables?
Multiple Linear Regression
 Multiple linear regression is a model for predicting the value of one
dependent variable based on two or more independent variables.
 Salary was dependent only on experience – What about qualification?
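As an illustrative sketch (the data below is made up for the example, not taken from the lecture), multiple linear regression with two inputs, experience and qualification, can be fit by least squares on a design matrix:

```python
import numpy as np

# Hypothetical data: salary (thousands of PKR) generated from
# salary = 20 + 15*experience + 10*qualification, for illustration only.
experience = np.array([1, 2, 3, 4, 5], dtype=float)
qualification = np.array([1, 1, 2, 2, 3], dtype=float)  # e.g. 1=BS, 2=MS, 3=PhD
salary = np.array([45, 60, 85, 100, 125], dtype=float)

# Design matrix: a column of ones (for the intercept) plus one column per feature
A = np.column_stack([np.ones_like(experience), experience, qualification])
(b, w1, w2), *_ = np.linalg.lstsq(A, salary, rcond=None)
print(b, w1, w2)  # recovers the generating values 20, 15, 10
```

Each weight tells how much the prediction changes per unit of its feature, holding the other feature fixed.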
Linear Regression….
Advantages
 Very simple to implement.
 Performs well on data with a linear relationship.

Disadvantages
 Not suitable for data with a non-linear relationship.
 Prone to underfitting.
 Sensitive to outliers.
Linear Regression….

Y = mX + c

X – X value
Y – Y value
m – slope
c – intercept

Which one is best?
Loss Function - Linear Regression….

Randomly assigned parameters: m = 3 and c = 2

Y = mX + c
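With the randomly assigned parameters m = 3 and c = 2, the line overshoots the table data (X = 1…5, Y = 5…13), and the mean squared error quantifies by how much. A small sketch:

```python
import numpy as np

X = np.array([1, 2, 3, 4, 5], dtype=float)
Y = np.array([5, 7, 9, 11, 13], dtype=float)

m, c = 3.0, 2.0       # randomly assigned parameters from the slide
Y_pred = m * X + c    # predictions: [5, 8, 11, 14, 17]
mse = np.mean((Y - Y_pred) ** 2)
print(mse)  # (0 + 1 + 4 + 9 + 16) / 5 = 6.0
```

A better choice of m and c would drive this loss toward zero; that search is exactly what optimization does.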
Model parameters….
Weight: Weight decides how much influence the input will have on the
output.
X – features or input variable
Y – Target or output variable
w- weight
b- bias

Bias:
- Bias is the offset value given to the model.
- It is used to shift the model in a particular direction.
- Similar to a Y-intercept.
- ‘b’ is equal to “Y” when all the features values are zero.
Hyperparameter – for Optimization….
Learning Rate: A tunable parameter used in loss function optimization; it
determines the step size at each iteration while moving toward a minimum of
the loss function.
Gradient descent – optimization algorithm.

Number of Epochs:
- Represents the number of times the model iterates over the entire dataset.

Proper values for the learning rate and number of epochs are required to avoid
overfitting and underfitting.
Model Optimization.
Optimization: Determining the best parameters for a model so that its loss
function decreases and, as a result, the model predicts more accurately.
- Finding the best parameters to get optimal results.
- Gradient descent is one of the most widely used optimization algorithms.

Initial parameters → updated parameters
Gradient Descent for Linear Regression...
Working of Gradient Descent.
Gradient Descent - Optimization algorithm.
- Optimization algorithm used for minimizing the loss function in
various ML algorithms.
- It is used for updating the parameters of the learning model.
- Formula for updating w and b is:

  w = w - L * dw
  b = b - L * db

- w --> weight
- b --> bias
- L --> learning rate
- dw --> partial derivative of the loss function with respect to w.
- db --> partial derivative of the loss function with respect to b.
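Putting the update rule together for the table data (Y = 2X + 3): the sketch below uses the MSE gradients dw = -(2/n) Σ X(Y - Ŷ) and db = -(2/n) Σ (Y - Ŷ); the learning rate and epoch count are illustrative choices, not values from the lecture.

```python
import numpy as np

X = np.array([1, 2, 3, 4, 5], dtype=float)
Y = np.array([5, 7, 9, 11, 13], dtype=float)  # true line: Y = 2X + 3

w, b = 0.0, 0.0   # initial parameters
L = 0.01          # learning rate (illustrative choice)
epochs = 10_000   # number of passes over the dataset
n = len(X)

for _ in range(epochs):
    Y_pred = w * X + b
    dw = -(2 / n) * np.sum(X * (Y - Y_pred))  # dMSE/dw
    db = -(2 / n) * np.sum(Y - Y_pred)        # dMSE/db
    w -= L * dw                               # w = w - L * dw
    b -= L * db                               # b = b - L * db

print(w, b)  # converges toward w ≈ 2, b ≈ 3
```

Each iteration moves the parameters a small step against the gradient, so the MSE loss shrinks until the fitted line matches the data.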
Chapter Reading

Chapter 01 of:
- Machine Learning by Tom Mitchell
- Pattern Recognition and Machine Learning by Christopher M. Bishop
