
CS446: Machine Learning

Lecture 14-15 (ML Models – Linear Regression)


Instructor:
Dr. Muhammad Kabir
Assistant Professor
muhammad.kabir@umt.edu.pk

School of Systems and Technology


Department of Computer Science
University of Management and Technology, Lahore
Previous Lectures…
 Selecting the right ML model
 Cross-validation – statistical measures
 Overfitting in ML – concept, signs, causes and prevention
 Underfitting in ML – concept, signs, causes and prevention
 Bias-variance tradeoff in ML
 Loss function
 Model evaluation – accuracy score, mean squared error
 Model parameters and hyperparameters
 Gradient descent in ML
Today’s Lectures…
 Linear Regression – intuition
 Linear Regression – mathematical understanding
 Gradient Descent for Linear Regression
How Machine Learning Works….
Linear Regression….

What would be the salary of a person with three years of experience?

~650,000 PKR per month
Linear Regression….

X: 1  2  3  4  5
Y: 5  7  9 11 13
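The table above is exactly linear: each unit increase in X raises Y by 2, and the line Y = 2X + 3 passes through every point. As a minimal sketch (using NumPy's `polyfit`, which is not part of the lecture), a least-squares fit recovers these values:

```python
import numpy as np

# Data from the slide's table
X = np.array([1, 2, 3, 4, 5], dtype=float)
Y = np.array([5, 7, 9, 11, 13], dtype=float)

# Least-squares fit of a straight line Y = m*X + c
m, c = np.polyfit(X, Y, 1)
print(m, c)  # m ≈ 2.0, c ≈ 3.0, since Y = 2X + 3 fits the data exactly
```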
Linear Regression….

What if there are more than two variables?
Multiple Linear Regression
 Multiple linear regression is a model for predicting the value of one
dependent variable based on two or more independent variables.
 Salary was dependent only on experience – What about qualification?
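As an illustrative sketch (the data below is made up for the example, not taken from the lecture), multiple linear regression with two inputs, experience and qualification, can be fit by least squares on a design matrix:

```python
import numpy as np

# Hypothetical data: salary (thousands of PKR) generated from
# salary = 20 + 15*experience + 10*qualification, for illustration only.
experience = np.array([1, 2, 3, 4, 5], dtype=float)
qualification = np.array([1, 1, 2, 2, 3], dtype=float)  # e.g. 1=BS, 2=MS, 3=PhD
salary = np.array([45, 60, 85, 100, 125], dtype=float)

# Design matrix: a column of ones (for the intercept) plus one column per feature
A = np.column_stack([np.ones_like(experience), experience, qualification])
(b, w1, w2), *_ = np.linalg.lstsq(A, salary, rcond=None)
print(b, w1, w2)  # recovers the generating values 20, 15, 10
```

Each weight tells how much the prediction changes per unit of its feature, holding the other feature fixed.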
Linear Regression….
Advantages
 Very simple to implement.
 Performs well on data with a linear relationship.

Disadvantages
 Not suitable for data with a non-linear relationship.
 Prone to underfitting.
 Sensitive to outliers.
Linear Regression….

Y = mX + c

X – X value
Y – Y value
m – slope
c – intercept

Which one is best?
Loss Function - Linear Regression….

Randomly assigned parameters: m = 3 and c = 2

Y = mX + c
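With the randomly assigned parameters m = 3 and c = 2, the line overshoots the table data (X = 1…5, Y = 5…13), and the mean squared error quantifies by how much. A small sketch:

```python
import numpy as np

X = np.array([1, 2, 3, 4, 5], dtype=float)
Y = np.array([5, 7, 9, 11, 13], dtype=float)

m, c = 3.0, 2.0       # randomly assigned parameters from the slide
Y_pred = m * X + c    # predictions: [5, 8, 11, 14, 17]
mse = np.mean((Y - Y_pred) ** 2)
print(mse)  # (0 + 1 + 4 + 9 + 16) / 5 = 6.0
```

A better choice of m and c would drive this loss toward zero; that search is exactly what optimization does.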
Model parameters….
Weight: Weight decides how much influence the input will have on the
output.
X – features or input variable
Y – Target or output variable
w- weight
b- bias

Bias:
- Bias is the offset value given to the model.
- It is used to shift the model in a particular direction.
- Similar to a Y-intercept.
- ‘b’ is equal to “Y” when all the features values are zero.
Hyperparameter – for Optimization….
Learning Rate: A tunable parameter used in loss function optimization; it
determines the step size at each iteration while moving toward a minimum of
the loss function.
Gradient descent – optimization algorithm.

Number of Epochs:
- Represents the number of times the model iterates over the entire dataset.

Proper values for the learning rate and number of epochs are required to avoid
overfitting and underfitting.
Model Optimization.
Optimization: Determining the best parameters for a model so that its loss
function decreases and, as a result, the model predicts more accurately.
- Finding the best parameters to get optimal results.
- Gradient descent is one of the most widely used optimization algorithms.

Initial parameters → updated parameters
Gradient Descent for Linear Regression...
Working of Gradient Descent.
Gradient Descent - Optimization algorithm.
- Optimization algorithm used for minimizing the loss function in
various ML algorithms.
- It is used for updating the parameters of the learning model.
- Formula for updating w and b is:

  w = w - L * dw
  b = b - L * db

- w --> weight
- b --> bias
- L --> learning rate
- dw --> partial derivative of the loss function with respect to w.
- db --> partial derivative of the loss function with respect to b.
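Putting the update rule together for the table data (Y = 2X + 3): the sketch below uses the MSE gradients dw = -(2/n) Σ X(Y - Ŷ) and db = -(2/n) Σ (Y - Ŷ); the learning rate and epoch count are illustrative choices, not values from the lecture.

```python
import numpy as np

X = np.array([1, 2, 3, 4, 5], dtype=float)
Y = np.array([5, 7, 9, 11, 13], dtype=float)  # true line: Y = 2X + 3

w, b = 0.0, 0.0   # initial parameters
L = 0.01          # learning rate (illustrative choice)
epochs = 10_000   # number of passes over the dataset
n = len(X)

for _ in range(epochs):
    Y_pred = w * X + b
    dw = -(2 / n) * np.sum(X * (Y - Y_pred))  # dMSE/dw
    db = -(2 / n) * np.sum(Y - Y_pred)        # dMSE/db
    w -= L * dw                               # w = w - L * dw
    b -= L * db                               # b = b - L * db

print(w, b)  # converges toward w ≈ 2, b ≈ 3
```

Each iteration moves the parameters a small step against the gradient, so the MSE loss shrinks until the fitted line matches the data.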
Chapter Reading

Chapter 01 of:
- Machine Learning by Tom Mitchell
- Pattern Recognition and Machine Learning by Christopher M. Bishop
