
linkedin.com/in/vikrantkumar95

Linear Regression
Clearly Explained

What Is Linear Regression? (1/2)
Technical definition: Linear Regression is a statistical method
used to model a relationship between one dependent variable and
one or more independent variables.

Suppose we had a dataset that showed the Crop Yield of farmers in different regions and the Rainfall received by each region.

Here our goals are:

To see how strong the relationship is between Rainfall (the independent variable) and Crop Yield (the dependent variable)
To see if we can predict the expected yield of a crop at certain levels of rainfall

What Is Linear Regression? (2/2)
Simply put, Linear Regression is a way to predict one thing ( Crop Yield
or Sales) based on the linear relationships it has with other things
(Time or Rainfall). It's like drawing a straight line through a scatter of
points to show the general trend.

[Figure: scatter of points with a straight trend line; the slope of the line is B = tan θ]

The above line would have an equation with Crop Yield (the dependent
variable we are trying to predict) as Y-axis and Rainfall (the
independent variable that will be the predictor) as X-axis. The
equation would be
Crop Yield = A + B (Rainfall)

A is the Y-intercept and B is the slope of the line. The higher the value of B, the steeper the line, i.e. the more sensitive the Crop Yield is to changes in Rainfall.
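To make the equation concrete, here is a minimal sketch in Python. The intercept A and slope B below are made-up illustrative values, not coefficients fitted from real data:

```python
def predict_crop_yield(rainfall, a=2.0, b=0.5):
    """Predict crop yield using the line Y = A + B * X.

    a (intercept) and b (slope) are hypothetical values for illustration.
    """
    return a + b * rainfall

# A larger b means the predicted yield is more sensitive to rainfall.
print(predict_crop_yield(100))  # 2.0 + 0.5 * 100 = 52.0
```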

So How Do We Fit A Line? (1/2)
[Figure: Graphs 1, 2 and 3, showing the line being rotated and shifted toward the best fit]

In Linear Regression, a line is fit by minimizing the Sum of Squared Residuals. Here, a Residual is the difference between the Predicted Crop Yield (Y predicted) and the actual Crop Yield (Y actual).

We start with an arbitrary line in Graph 1 and calculate the Sum of Squared Residuals. We then rotate the line (i.e. change the slope B) and adjust the intercept (i.e. A) to get Graph 2. We see that the Sum of Squared Residuals has reduced. We then adjust the line further to get Graph 3, which has the minimum Sum of Squared Residuals. This is the best fit line.
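The rotate-and-adjust idea can be sketched as a small grid search over candidate slopes and intercepts, keeping whichever line has the smallest Sum of Squared Residuals. The data points and the candidate ranges below are made up for illustration:

```python
# Hypothetical (rainfall, crop_yield) observations.
data = [(50, 30), (80, 45), (100, 52), (120, 65), (150, 78)]

def sum_squared_residuals(a, b):
    """Sum of squared gaps between actual and predicted crop yield."""
    return sum((y - (a + b * x)) ** 2 for x, y in data)

# Try a coarse grid of intercepts (A) and slopes (B), like rotating
# and shifting the line from Graph 1 to Graph 3, and keep the best.
best = min(
    ((a, 0.1 * k) for a in range(0, 11) for k in range(0, 11)),
    key=lambda ab: sum_squared_residuals(*ab),
)
print("best A, B:", best, "SSR:", sum_squared_residuals(*best))
```

Real statistical software solves for the minimum directly rather than searching a grid, but the objective being minimized is the same.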

So How Do We Fit A Line? (2/2)
As we saw in the previous slide, Linear Regression involves fitting a line by minimizing the Sum of Squared Residuals, which is why it's sometimes known as Least Squares Regression.

Different orientations of the line result in different Sums of Squared Residuals, with the goal being to achieve the minimum, which is the best fit.

Don't worry though, the actual process of arriving at the minimum Sum of Squared Residuals is handled by statistical software.

How Do We Tell How Good The Fit Is?
We fit a line by minimizing the squared residuals. However, how do we know if the best fit line actually captures the underlying relationship? It could just be a less poor fit amongst a bunch of poor fits. This is where R² comes in.

R² = (SST - SSR) / SST

Here:
SST is the Sum of Squares Total
SSR is the Sum of Squared Residuals

R² is a ratio between 0 and 1, with 0 implying that Rainfall (the independent variable) has no predictive power with respect to Crop Yield (the dependent variable), while 1 implies that Rainfall perfectly explains the variation in Crop Yield.

Don't worry, we'll be looking at exactly what this formula and its components are, and how they tell us how well a line has fit.

R² Explained (1/3)
We saw two terms in the R² formula - SST and SSR. We’ll take
a look at both here.

SST stands for Sum of Squares Total, which is the sum of the squared residuals around the mean of the dependent variable (Crop Yield). This basically means calculating the Sum of Squares (SS) around a horizontal fit line that passes through the mean of Crop Yield.

[Figure: horizontal line through the mean of Crop Yield, with squared deviations labelled SST]

We'll see in the next slide that this is essentially equivalent to shifting all the data points to the Y-axis and calculating the Sum of Squared Residuals around the mean of the Crop Yields.

R² Explained (2/3)
We can see below that calculating the Sum of Squared Residuals (SS) for either of the graphs gives the same result, which is the SST.

[Figure: shifting all the points horizontally to the Y-axis leaves the SST unchanged]

In simple terms, Sum of Squares Total (SST) measures the total variation in the dependent variable (the Crop Yield in this case) in a dataset. It's calculated by summing up the squares of the differences between each observed value and the overall mean of the dependent variable.

Variation around Mean = Var(mean) = SST / n

The average of the Sum of Squares Total is what we call the Variation of Crop Yield around its Mean.
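As a minimal sketch (the yield values are made up for illustration), SST and Var(mean) follow directly from the definitions above:

```python
# Hypothetical crop yields for illustration.
yields = [30, 45, 52, 65, 78]

mean_yield = sum(yields) / len(yields)            # the horizontal fit line
sst = sum((y - mean_yield) ** 2 for y in yields)  # Sum of Squares Total
var_mean = sst / len(yields)                      # variation around the mean

print(mean_yield, sst, var_mean)
```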

R² Explained (3/3)
The second term in the formula was SSR, the Sum of Squared Residuals. This is what we minimized to achieve our best fit. It's calculated by summing the squares of the differences between each observed value and its corresponding predicted value from the regression model.

[Figure: fitted regression line with squared residuals labelled SSR]

Essentially, SSR quantifies how much the data points deviate from the fitted regression line.

Variation around Fit = Var(fit) = SSR / n

The average of the Sum of Squared Residuals is what we call the Variation of Crop Yield around its Fit line.
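SSR can be sketched the same way, given a fitted line. The data points and the intercept/slope values below are hypothetical, chosen only to illustrate the calculation:

```python
# Hypothetical (rainfall, crop_yield) observations and a candidate line.
data = [(50, 30), (80, 45), (100, 52), (120, 65), (150, 78)]
a, b = 4.0, 0.5  # hypothetical intercept and slope of the fitted line

# Sum of Squared Residuals: squared gaps between actual and predicted yield.
ssr = sum((y - (a + b * x)) ** 2 for x, y in data)
var_fit = ssr / len(data)  # variation around the fitted line

print(ssr, var_fit)
```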
Calculating & Interpreting R² (1/2)
[Figures: variation around the mean (SST) vs variation around the fitted line (SSR)]

Suppose there are 10 data points (n) and the SST comes out to be
400. Then Var(mean) would be:
Var(mean) = 400 / 10 = 40
Now let's assume the SSR comes out to be 120. Then Var(fit)
would be:
Var(fit) = 120 / 10 = 12

We can see that the variation in the second graph, i.e. around the line fit by least squares, is less than the variation around the mean in the first graph. Therefore, we can say that some of the variation in the Crop Yield is explained by taking the Rainfall into consideration.

Calculating & Interpreting R² (2/2)
Let us look at the formula for R² again:

R² = Explained Variance / Total Variance

Here the Explained Variance is the difference between Var(mean) and Var(fit), and the Total Variance is the variance around the mean, Var(mean). Hence, based on the values previously, the calculation for R² would be:

R² = (40 - 12) / 40 = 0.7
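This arithmetic can be checked in a couple of lines, using the SST = 400, SSR = 120 and n = 10 values from the example above:

```python
n, sst, ssr = 10, 400, 120

var_mean = sst / n  # total variance, around the mean
var_fit = ssr / n   # variance around the fitted line

r_squared = (var_mean - var_fit) / var_mean
print(var_mean, var_fit, r_squared)  # 40.0 12.0 0.7
```

Note that dividing both terms by n cancels out, so this is the same as computing (SST - SSR) / SST directly.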

Here we get an R² of 0.7, or 70%. This means that there is a 70% reduction in variance when we take Rainfall into consideration. Or we can say that Rainfall explains 70% of the variation in Crop Yield. An R² of 0.7 would be considered a reasonably good Linear Regression model (assuming it's significant, something we'll see in later modules).

A perfect model that passes through all the data points would have zero variance around the fit and hence an R² = 1. Although that rarely happens in the real world and is usually a sign of overfitting.

Future Learnings
What we saw here was just an example of Simple Linear Regression (only one independent and one dependent variable). There are a few more things that we'll cover later which will broaden your understanding of Linear Regression models:

Multiple Linear Regression: Expanding from one independent variable to multiple variables.
Assumptions of Linear Regression: Discussing normality, homoscedasticity, independence, and linearity.
Interpreting Coefficients: Understanding what the regression coefficients mean.
Model Validation Techniques: Techniques like cross-validation and splitting data into training and test sets.
Diagnostics and Residual Analysis: Identifying issues like multicollinearity, autocorrelation, and influential outliers.
Model Improvement Strategies: Techniques like transformation and regularization.
Comparison with Non-Linear Models: Understanding when linear models are appropriate and when to consider non-linear alternatives.

Sample Code to Train and
Visualize a Simple Linear
Regression Model
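The code on the original slide was an image; here is a minimal sketch of what such code typically looks like, fitting a simple linear regression with the closed-form least-squares formulas on made-up rainfall/yield data. The plotting step is optional and is skipped if matplotlib isn't installed:

```python
# Hypothetical rainfall (mm) and crop yield data for illustration.
rainfall = [50, 80, 100, 120, 150]
crop_yield = [30, 45, 52, 65, 78]

n = len(rainfall)
mean_x = sum(rainfall) / n
mean_y = sum(crop_yield) / n

# Closed-form least-squares estimates: B = cov(x, y) / var(x), A = mean_y - B * mean_x.
sxx = sum((x - mean_x) ** 2 for x in rainfall)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(rainfall, crop_yield))
b = sxy / sxx
a = mean_y - b * mean_x

# Goodness of fit: R² = (SST - SSR) / SST.
sst = sum((y - mean_y) ** 2 for y in crop_yield)
ssr = sum((y - (a + b * x)) ** 2 for x, y in zip(rainfall, crop_yield))
r2 = (sst - ssr) / sst

print(f"Crop Yield = {a:.2f} + {b:.3f} * Rainfall, R² = {r2:.3f}")

# Optional visualization, skipped if matplotlib isn't available.
try:
    import matplotlib
    matplotlib.use("Agg")  # headless backend so this runs without a display
    import matplotlib.pyplot as plt

    plt.scatter(rainfall, crop_yield, label="observed")
    plt.plot(rainfall, [a + b * x for x in rainfall], label="best fit")
    plt.xlabel("Rainfall")
    plt.ylabel("Crop Yield")
    plt.legend()
    plt.savefig("linear_regression.png")
except ImportError:
    pass
```

In practice you would more likely reach for a library such as scikit-learn or statsmodels, which also report significance and diagnostics; the point here is that the fit itself is just these two formulas.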

Enjoyed reading?

Follow for everything Data and AI!

