Linear Regression
Posted by Ramana PV
Linear regression is a statistical technique to estimate the mathematical relationship between a dependent variable (usually
denoted as Y) and an independent variable (usually denoted as X). In other words, it predicts the change in the dependent
variable for a given change in the independent variable.
Dependent variable or criterion variable – the variable for which we wish to make predictions
Independent variable or predictor variable – the variable used to explain the dependent variable
In multiple linear regression, more than one independent variable is used to predict a single dependent variable. In fact, the
basic difference between simple and multiple regression is the number of explanatory variables.
For example, compare the crop yield rate against the rainfall rate in a season.
known as a scatter plot, to observe the relationship between the dependent and independent variables, because if the data is
https://sixsigmastudyguide.com/linear-regression/ 1/17
2/3/2021 Linear Regression | Six Sigma Study Guide
Draw the line that covers the majority of the points; this line is considered the “best fit” line
Where
Random error (ε-Epsilon) – The difference between an observed value of y and the mean value of y for a given value of
x.
The response variable is continuous, and the residuals have almost the same spread throughout the regression line
The method of least squares is a standard approach in regression analysis to determine the best fit line for a given data set. It
In general, the dependent variable is plotted on the y-axis, while the independent variable is plotted on the x-axis.
The least squares method determines the position of a straight line, also called a trend line, and the equation of that line.
The least squares method means that the overall solution minimizes the sum of the squares of the errors made in the results of
every single equation. For instance, the least squares equations can be used to find the values of the coefficients a and b
Compute the â and b̂ values, then substitute them into the equation of a line to obtain the least squares prediction equation.
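The computation described above can be sketched in Python. This is a minimal illustration, not the article's own worked example: the data points are hypothetical, the function name `least_squares_fit` is ours, and the sketch assumes the standard closed-form estimates b̂ = Sxy / Sxx and â = ȳ − b̂·x̄ for the line ŷ = â + b̂x.

```python
def least_squares_fit(x, y):
    """Return (a_hat, b_hat) for the least squares line y_hat = a_hat + b_hat * x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    b_hat = s_xy / s_xx               # slope
    a_hat = y_mean - b_hat * x_mean   # intercept
    return a_hat, b_hat


# Hypothetical data, chosen only to keep the arithmetic easy to follow
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a_hat, b_hat = least_squares_fit(x, y)
print(f"y_hat = {a_hat:.2f} + {b_hat:.2f} x")
```

For this toy data set, Sxy = 6 and Sxx = 10, so the slope is 0.6 and the intercept is 4 − 0.6 × 3 = 2.2.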
A passenger vehicle manufacturer is reviewing the training records of 10 salespersons. The main aim is to compare each
salesperson's achieved target (in %) with the number of sales training modules completed.
Furthermore, predict y for a given value of x by substituting it into the prediction equation. For example, if a salesperson
completes 15 training modules, then the predicted achieved target would be:
ŷ = 31.09 + 3.5742(15) ≈ 84.7%
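The same substitution can be written as a small Python helper. The coefficients are the ones quoted in the prediction equation above; the function name `predict_target` is ours and purely illustrative.

```python
# Coefficients taken from the article's fitted line: y_hat = 31.09 + 3.5742 * x
A_HAT = 31.09    # intercept (achieved target, in %)
B_HAT = 3.5742   # slope (percentage points per training module)


def predict_target(modules_completed):
    """Predicted achieved target (%) for a given number of training modules."""
    return A_HAT + B_HAT * modules_completed


predicted = predict_target(15)
print(f"{predicted:.1f}%")  # prints 84.7%
```

Such a prediction is most trustworthy for x values inside the range of the data used to fit the line.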
A random error (ε) affects the error of prediction. Hence the variability of the random errors (σε²) is the key parameter
Example: From the above data, compute the variability of the random errors
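The article's data table and formula are not reproduced in this text, so the following is only a sketch of how such a computation typically goes. It assumes the usual estimator σ̂ε = sqrt(SSE / (n − 2)), where SSE is the sum of squared residuals around the fitted line; whether the article uses exactly this form is an assumption. The data and the fitted coefficients below are hypothetical, so the result will not match the article's figure.

```python
import math


def residual_std_error(x, y, a_hat, b_hat):
    """Estimate the standard deviation of the random errors as sqrt(SSE / (n - 2))."""
    sse = sum((yi - (a_hat + b_hat * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(sse / (len(x) - 2))


# Hypothetical data and its least squares line y_hat = 2.2 + 0.6 x
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a_hat, b_hat = 2.2, 0.6

sigma_hat = residual_std_error(x, y, a_hat, b_hat)
band = 1.96 * sigma_hat  # half-width of the approximate 95% band around the line

print(f"sigma_hat = {sigma_hat:.3f}, 95% band = +/- {band:.3f}")
```

Here SSE = 2.4 with n − 2 = 3 degrees of freedom, so σ̂ε = sqrt(0.8) ≈ 0.894.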
From the calculation on the salesperson data, σ̂ε is 5.38. Thus, most of the points will fall within ±1.96 σ̂ε, i.e. ±10.54, of the line; approximately
95% of the values should be in this region. Moreover, from the above graph, it is clearly evident that all the values are within
to 0. If b is not equal to 0, there is a linear relationship. The null and alternative hypotheses are H0: b = 0 and H1: b ≠ 0.
Example: From the above data, determine if the slope results are significant at a 95% confidence level
Determine the critical values of t for 8 degrees of freedom at the 95% confidence level
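Critical values are normally read from a t table or a statistics library. As a rough, standard-library-only sketch, the two-sided critical value can also be found numerically by integrating the Student t density and bisecting on the result. The helper names (`t_pdf`, `t_cdf`, `t_critical`) are ours, and the step counts and search bounds are arbitrary choices for illustration.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """P(T <= t) for t >= 0, using Simpson's rule on [0, t] plus the symmetric half."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(confidence, df):
    """Two-sided critical value: the t with P(-t < T < t) = confidence."""
    target = 1 - (1 - confidence) / 2   # e.g. 0.975 for 95% confidence
    lo, hi = 0.0, 50.0
    for _ in range(60):                 # bisection on the monotone CDF
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


t_crit = t_critical(0.95, 8)            # about 2.306
t_calc = 5.481                          # calculated t value quoted in the article
reject_null = abs(t_calc) > t_crit

print(f"critical t = {t_crit:.3f}, reject H0: {reject_null}")
```

With 8 degrees of freedom this gives roughly ±2.306, so a calculated t of 5.481 falls well outside the acceptance region.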
The calculated t value is 5.481, which does not lie between -2.306 and 2.306. We can reject the null hypothesis if the t value is outside this range.
In this case, we can reject the null hypothesis and conclude that b≠0 and there is a linear relationship between the dependent and independent variables.
Example: From the above data, compute the confidence interval around the slope of the line
2.0707<b<5.07
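The interval can be cross-checked with a short Python sketch. The standard error of the slope is not stated in this text, so it is back-derived here from the quoted slope and t value under the assumption that the test statistic was t = b̂ / s_b; the interval is then b̂ ± t_crit × s_b. Small differences from the quoted bounds are expected because the inputs are rounded.

```python
b_hat = 3.5742    # slope quoted in the prediction equation
t_calc = 5.481    # calculated t value quoted in the significance test
t_crit = 2.306    # critical t for 8 degrees of freedom, 95% confidence

# Assumption: t_calc = b_hat / s_b, so the standard error can be recovered
s_b = b_hat / t_calc

margin = t_crit * s_b
lower, upper = b_hat - margin, b_hat + margin

print(f"{lower:.2f} < b < {upper:.2f}")
```

This yields an interval of roughly 2.07 to 5.08, in line with the bounds quoted above. Because the interval does not contain 0, it agrees with the decision to reject the null hypothesis.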
sample.
The line slopes upward to the right when r is positive
The line slopes downward to the right when r is negative
distribution. Often, more than one variable is collected in a study or experiment. When two variables are measured on a
single experimental unit, the resulting data are called bivariate data. Example: job satisfaction stratified by income.
In most instances of bivariate data, one variable influences the other. The values of the
two variables are often plotted on a scatter plot to explore the relation between them.
Depending on the type of data, bivariate data can be described with graphs and numerical measures. If one or both variables
are qualitative, use a pie chart or bar chart to see the relation between the variables. For example, compare the
relationship between opinion and gender. If the two variables are quantitative, use a scatter plot. The Correlation
Example: Correlation between the amount of time spent in a casino (independent variable) and the amount ($) lost
(dependent variable).
The correlation coefficient varies between -1 and +1. Values approaching -1 or +1 indicate strong correlation (negative or positive)
A negative correlation is not necessarily bad news. It merely means that as the independent variable goes more negative,
r = 0 does not indicate the absence of a relationship; a curvilinear pattern may exist. r = -0.76 has the same predictive power
as r = +0.76.
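Both points can be illustrated with a small Python sketch. It assumes the usual Pearson formula r = Sxy / sqrt(Sxx · Syy); the function name `pearson_r` and all data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    s_yy = sum((yi - y_mean) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)


x = [-2, -1, 0, 1, 2]

# A perfect curvilinear relationship (y = x^2) still gives r = 0
y_curve = [xi ** 2 for xi in x]
r_curve = pearson_r(x, y_curve)

# A perfectly decreasing straight line gives r = -1
y_down = [10 - 2 * xi for xi in x]
r_down = pearson_r(x, y_down)

# The sign of r does not change the share of variation explained (r squared)
same_power = math.isclose((-0.76) ** 2, 0.76 ** 2)

print(r_curve, r_down, same_power)
```

The first case is why a scatter plot should be inspected before relying on r alone: the relationship is exact, yet it is not linear.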
regression is performed.
Example: From the above data, compute the coefficient of determination
We can say that 79% of the variation in the sales target achieved can be explained by variation in the number of training modules
completed.
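A short Python sketch shows how the coefficient of determination can be computed, and offers a rough cross-check of the 79% figure. The first part uses hypothetical data and the standard definition r² = Sxy² / (Sxx · Syy). The second part relies on an identity that holds for simple linear regression, r² = t² / (t² + n − 2), using the t value and degrees of freedom quoted earlier; applying that identity here is our addition, not a step taken from the article.

```python
def r_squared(x, y):
    """Coefficient of determination for a simple linear regression of y on x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    s_yy = sum((yi - y_mean) ** 2 for yi in y)
    return s_xy ** 2 / (s_xx * s_yy)


# Hypothetical data: Sxy = 6, Sxx = 10, Syy = 6, so r^2 = 36 / 60 = 0.6
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r2_toy = r_squared(x, y)

# Cross-check against the article's quoted t value with n - 2 = 8
t_calc, df = 5.481, 8
r2_from_t = t_calc ** 2 / (t_calc ** 2 + df)

print(f"toy r^2 = {r2_toy:.2f}, r^2 implied by t = {r2_from_t:.2f}")
```

The value implied by t = 5.481 with 8 degrees of freedom is about 0.79, which agrees with the 79% stated above.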
Contributors
Ramana PV
This entry was posted in Analyze and tagged ASQ, Black Belt, Green Belt, IASSC. Bookmark the permalink.
Comments (7)
Ronald Bettinardi
September 18, 2018 at 10:19 am
LUCA AMADEI
May 2, 2019 at 12:48 pm
Ted, the link doesn’t work. Could you kindly provide a valid one? Thank you!
Ted Hessing
May 3, 2019 at 9:45 am
Hi all, Updated with a few links and a few videos. Let me know how this works for you!
Best, Ted.
Lyla
January 10, 2020 at 2:13 am
All your contributions are very useful for professionals and non-professionals. I appreciate your availability to
share these types of great and valuable info And you did it very well! Can’t wait to read more… You nailed it……..
Ted Hessing
January 10, 2020 at 8:40 am
Anshika Tela
April 13, 2020 at 3:26 am
From the above calculation σ̂Є is 5.38. Thus, most of the points will fall within ±1.96 σ̂Є i.e 10.54 of the line,
hence approx 95% of the values should be in this region. Moreover from the above graph, it is clearly evident that
Ramana
April 15, 2020 at 9:22 am
Anshika,
A random error (ε) affects the error of prediction. Hence the variability of the random errors (σε²) is
Random errors in experimental measurements are caused by unknown and unpredictable changes in
For the standard normal distribution, P(-1.96 < Z < 1.96) = 0.95, i.e., there is a 95% probability that a
standard normal variable, Z, will fall between -1.96 and 1.96. (refer Z table)
From the calculation, the variability of the random error is 5.38, and 1.96 × 5.38 = 10.54. About 95% of values should be in
this region, and if you observe the above graph (in the example), all the points fall within ±10.54 of the LS
line.
Thanks