Simple Linear Regression
Introduction
Least squares “Linear Regression” is a statistical method for regressing data in which the dependent variable has continuous values, while the independent variables can have either continuous or categorical values. In other words, “Linear Regression” is a method to predict a dependent variable (Y) based on the values of independent variables (X).
Prerequisites
To start with Linear Regression, you must be aware of a few basic concepts of statistics, namely:
Correlation (r) – Explains the relationship between two variables; possible values range from -1 to +1
Variance (σ²) – Measure of spread in your data
Standard Deviation (σ) – Measure of spread in your data (square root of variance)
Normal distribution
Residual (error term) – Actual value – Predicted value
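To make these quantities concrete, here is a minimal Python sketch (the small x/y arrays are made-up illustrative data, and the crude rule predicted = 2*x is only a placeholder model) that computes correlation, variance, standard deviation, and residuals with NumPy:

import numpy as np

# Made-up toy data, for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

r = np.corrcoef(x, y)[0, 1]      # correlation, lies between -1 and +1
variance = np.var(y, ddof=1)     # sample variance (sigma squared)
std_dev = np.sqrt(variance)      # standard deviation (square root of variance)

# Residual = actual value - predicted value; here the "model" is a crude guess y = 2*x
predicted = 2 * x
residuals = y - predicted

print(r, variance, std_dev, residuals)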
Assumptions
Dependent variable is continuous
Linear relationship between the dependent variable and the independent variables.
No multicollinearity (no relationship between the independent variables).
Residuals should follow a normal distribution.
Residuals should have constant variance (homoscedasticity).
Residuals should be independently distributed (no autocorrelation).
To check the relationship between the dependent and independent variables:
1. Perform bivariate analysis.
2. Calculate the Variance Inflation Factor (VIF): a value close to 1 is ideal, with about 4 usually taken as the maximum acceptable value.
To find out whether the residuals are normally distributed:
1. Plot a histogram or boxplot of the residuals.
2. Perform the Kolmogorov–Smirnov (K-S) test.
To check homoscedasticity:
1. Plot residuals vs. predicted values; there should be no pattern.
2. Perform a non-constant variance test.
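A rough Python sketch of these diagnostic checks is shown below; it assumes the residuals, predicted values, and a design matrix X are already available from a fitted model (random placeholder arrays stand in for them here), and uses SciPy, Matplotlib, and statsmodels:

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)

# Placeholder inputs standing in for the outputs of an already-fitted model
X = np.column_stack([np.ones(50), rng.random(50), rng.random(50)])  # intercept + 2 predictors
residuals = rng.normal(size=50)
predicted = 10 * rng.random(50)

# 1. Multicollinearity: VIF for each predictor (near 1 is ideal, up to about 4 acceptable)
for i in range(1, X.shape[1]):
    print("VIF for column", i, "=", variance_inflation_factor(X, i))

# 2. Normality of residuals: histogram plus Kolmogorov-Smirnov test
plt.hist(residuals, bins=10)
plt.title("Histogram of residuals")
plt.show()
print(stats.kstest(residuals, "norm", args=(residuals.mean(), residuals.std())))

# 3. Homoscedasticity: residuals vs. predicted values should show no pattern
plt.scatter(predicted, residuals)
plt.axhline(0, color="red")
plt.xlabel("Predicted values")
plt.ylabel("Residuals")
plt.show()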
Linear Regression Line
While doing linear regression, our objective is to fit a line through the distribution that is nearest to most of the points, thereby reducing the distance (error term) of the data points from the fitted line.
For example, in the figure above, the dots (left) represent various data points and the line (right) represents an approximate line that can explain the relationship between the ‘x’ and ‘y’ axes. Through linear regression we try to find such a line. For example, if we have one dependent variable ‘Y’ and one independent variable ‘X’, the relationship between ‘X’ and ‘Y’ can be represented in the form of the following equation:
Y = β0 + β1X
Where,
Y = Dependent Variable
X = Independent Variable
β0 = Constant term, a.k.a. the intercept
β1 = Coefficient of relationship between ‘X’ & ‘Y’
A few properties of the linear regression line
The regression line always passes through the mean of the independent variable (x) as well as the mean of the dependent variable (y).
The regression line minimizes the sum of the squared residuals; that is why this method of linear regression is known as “Ordinary Least Squares (OLS)”. The objective it minimizes is written out below.
β1 explains the change in Y for a one-unit change in X. In other words, it tells us how much the value of Y changes if we increase the value of ‘X’ by one unit.
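To state the second property precisely, the quantity that OLS minimizes can be written in LaTeX notation as the sum of squared residuals over the N observations (this is the standard least squares criterion, not anything specific to a particular software package):

\[
\min_{\beta_0, \beta_1} \; \mathrm{RSS}(\beta_0, \beta_1)
  = \min_{\beta_0, \beta_1} \sum_{i=1}^{N} \left( y_i - \beta_0 - \beta_1 x_i \right)^2
\]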
Finding a Linear Regression Line
Using a statistical tool, e.g., Excel, R, or SAS, you will directly obtain the constants (β0 and β1) as the result of the linear regression function. But conceptually, as discussed, linear regression works on the OLS concept and tries to reduce the sum of squared errors; software packages use this very concept to calculate these constants.
For example, let’s say we want to predict ‘y’ from the ‘x’ values given in the following table, and let’s assume that our regression equation will look like “y = B0 + B1*x”:
x     y     Predicted 'y'
1     2     B0 + B1*1
2     1     B0 + B1*2
3     3     B0 + B1*3
4     6     B0 + B1*4
5     9     B0 + B1*5
6     11    B0 + B1*6
7     13    B0 + B1*7
8     15    B0 + B1*8
9     17    B0 + B1*9
10    20    B0 + B1*10
Where (Table 1):
Std. Dev. of x              3.02765
Std. Dev. of y              6.617317
Mean of x                   5.5
Mean of y                   9.7
Correlation between x & y   0.989938
If we differentiate the Residual Sum of Squares (RSS) with respect to B0 and B1 and equate the results to zero, we get the following equations as a result:
B1 = Correlation * (Std. Dev. of y / Std. Dev. of x)
B0 = Mean(y) – B1 * Mean(x)
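Sketched in LaTeX notation, setting the two partial derivatives of the RSS to zero yields the normal equations, and solving them gives exactly the closed-form expressions quoted above:

\[
\frac{\partial \mathrm{RSS}}{\partial B_0} = -2 \sum_{i=1}^{N} \left( y_i - B_0 - B_1 x_i \right) = 0,
\qquad
\frac{\partial \mathrm{RSS}}{\partial B_1} = -2 \sum_{i=1}^{N} x_i \left( y_i - B_0 - B_1 x_i \right) = 0
\]
\[
\Rightarrow \quad
B_1 = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}
    = r \cdot \frac{\sigma_y}{\sigma_x},
\qquad
B_0 = \bar{y} - B_1 \bar{x}
\]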
Putting the values from Table 1 into the above equations:
B1 = 0.989938 * (6.617317 / 3.02765) ≈ 2.16
B0 = 9.7 – 2.16 * 5.5 ≈ -2.2
Hence, the least squares regression equation becomes:
y = -2.2 + 2.16*x
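As a sanity check, a short Python sketch (using NumPy and the x/y values from the table above) reproduces these coefficients both from the closed-form formulas and from NumPy’s own least squares fit:

import numpy as np

# Data from the table above
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=float)
y = np.array([2, 1, 3, 6, 9, 11, 13, 15, 17, 20], dtype=float)

r = np.corrcoef(x, y)[0, 1]                 # ~0.989938
b1 = r * (y.std(ddof=1) / x.std(ddof=1))    # ~2.16
b0 = y.mean() - b1 * x.mean()               # ~-2.2

# The same slope and intercept from a degree-1 least squares fit
slope, intercept = np.polyfit(x, y, 1)

print(b1, b0)            # approximately 2.16 and -2.2
print(slope, intercept)  # should match the values above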
Model Performance
Once you build the model, the next logical question is whether the model is good enough to predict future values, or, put differently, whether the relationship you have established between the dependent and independent variables is good enough.
For this purpose, there are various metrics we look into:
i. R-Square (R2)
The formula for calculating R2 is given by:
R2 = (TSS – RSS) / TSS = 1 – RSS / TSS
where:
Total Sum of Squares (TSS): TSS is a measure of total variance in the
response/ dependent variable Y and can be thought of as the amount of variability
inherent in the response before the regression is performed.
Residual Sum of Squares (RSS): RSS measures the amount of variability that
is left unexplained after performing the regression.
(TSS – RSS) measures the amount of variability in the response that is explained
(or removed) by performing the regression.
For simple linear regression, R2 also equals the square of the correlation coefficient r between x and y, with r = Σ (xi – mean of x)(yi – mean of y) / (N * σx * σy), where N is the number of observations used to fit the model, σx is the standard deviation of x, and σy is the standard deviation of y.
R2 ranges from 0 to 1.
R2 of 0 means that the dependent variable cannot be predicted from the independent
variable.
R2 of 1 means the dependent variable can be predicted without error from the
independent variable.
An R2 between 0 and 1 indicates the extent to which the dependent variable is
predictable. An R2 of 0.20 means that 20 percent of the variance in Y is predictable
from X; an R2 of 0.40 means that 40 percent is predictable; and so on.
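A minimal Python sketch of this calculation, reusing the data and the fitted line y = -2.2 + 2.16*x from the worked example above:

import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=float)
y = np.array([2, 1, 3, 6, 9, 11, 13, 15, 17, 20], dtype=float)
y_pred = -2.2 + 2.16 * x                 # predictions from the fitted line

tss = np.sum((y - y.mean()) ** 2)        # Total Sum of Squares
rss = np.sum((y - y_pred) ** 2)          # Residual Sum of Squares
r_squared = 1 - rss / tss                # = (TSS - RSS) / TSS

print(r_squared)                         # roughly 0.98 for this data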
ii. Root Mean Square Error (RMSE)
RMSE measures the dispersion of the predicted values from the actual values. The formula for calculating RMSE is:
RMSE = sqrt( Σ (Actual – Predicted)² / N )
where N is the total number of observations.
Though RMSE is a good measure of error, the issue with it is that it is sensitive to the range of your dependent variable: if the dependent variable has a narrow range, RMSE will be low, and if it has a wide range, RMSE will be high. Hence, RMSE is a good metric for comparing different iterations of the same model.
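A quick Python sketch of the RMSE formula, again reusing the actual and predicted values from the worked example:

import numpy as np

y = np.array([2, 1, 3, 6, 9, 11, 13, 15, 17, 20], dtype=float)
y_pred = -2.2 + 2.16 * np.arange(1, 11)     # predictions from the fitted line

rmse = np.sqrt(np.mean((y - y_pred) ** 2))  # square root of the mean squared error
print(rmse)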
iii. Mean Absolute Percentage Error (MAPE)
To overcome this limitation of RMSE, analysts prefer MAPE, which expresses the error as a percentage and is hence comparable across models. The formula for calculating MAPE can be written as:
MAPE = (100 / N) * Σ | (Actual – Predicted) / Actual |
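A corresponding Python sketch for MAPE (expressed as a percentage; note it assumes no actual value is zero):

import numpy as np

y = np.array([2, 1, 3, 6, 9, 11, 13, 15, 17, 20], dtype=float)
y_pred = -2.2 + 2.16 * np.arange(1, 11)         # predictions from the fitted line

mape = 100 * np.mean(np.abs((y - y_pred) / y))  # mean absolute percentage error
print(mape)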