Slides by John Loucks, St. Edward’s University
Slide1
Chapter 14, Part A
Simple Linear Regression
Slide2
Regression
Managerial decisions often are based on the relationship between two or more
variables.
Regression analysis can be used to develop an equation showing how one
variable depends on one or more independent variables.
The equation is used to estimate or predict the average value of the
dependent variable from known values of the independent variable.
When the dependent variable depends on a single independent variable, the analysis is
called simple regression; when it depends on more than one, it is called multiple regression.
If the relationship is a straight line, the regression is linear; otherwise it is
curvilinear.
Dependent variable: also called the regressand, predictand, response, or explained variable (assumed to be
random)
Independent variable: also called the regressor, predictor, regression variable, or explanatory variable
Slide3
Simple Linear Regression
Simple linear regression involves one independent variable and one dependent
variable.
Deterministic models, in which the relationship between the variables is exact, do not require regression.
Regression is appropriate for non-deterministic models, where the linear relationship is
not exact (e.g., the relationship between age and height, where observed heights deviate from the line).
y = b0 + b1x + e
where:
b0 (the intercept) and b1 (the slope, also called the regression coefficient, which may be positive or
negative) are the population parameters of the model,
e is a random variable called the error term.
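To make the role of the error term concrete, the sketch below simulates observations from y = b0 + b1x + e. The parameter values, the normally distributed error, and the helper name simulate_observations are assumptions chosen for illustration; none of them come from the slides.

```python
import random

def simulate_observations(b0, b1, xs, sigma, seed=0):
    """Return y = b0 + b1*x + e for each x, with e drawn from Normal(0, sigma).

    The normal error distribution is an assumption made for this sketch.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [b0 + b1 * x + rng.gauss(0.0, sigma) for x in xs]

xs = [1, 2, 3, 4, 5]

# sigma = 0 removes the error term: every point lies exactly on the line.
exact = simulate_observations(2.0, 0.5, xs, sigma=0.0)

# sigma > 0 gives the non-deterministic case: points scatter around the line.
noisy = simulate_observations(2.0, 0.5, xs, sigma=1.0)

print(exact)
print(noisy)
```

With sigma set to zero the model is deterministic; any positive sigma gives the inexact relationship that regression is meant to handle.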
Slide4
Assumptions About the Error Term e
Slide5
Simple Linear Regression Equation
E(y) = b0 + b1x
• Assuming the expected value of the error term is zero, E(e) = 0, the graph of the
regression equation is a straight line.
b0 is the y-intercept of the regression line, and b1 is the slope of the regression
line.
E(y) is the expected value of y for a given x value.
Slide6
Simple Linear Regression Equation
[Figure: regression line of E(y) plotted against x, with intercept b0; slope b1 is positive]
Slide7
Simple Linear Regression Equation
[Figure: regression line of E(y) plotted against x, with intercept b0; slope b1 is negative]
Slide8
Simple Linear Regression Equation
[Figure: regression line of E(y) plotted against x, showing no relationship]
Slide9
Estimated Simple Linear Regression Equation
Slide10
Least Squares Method
Least squares criterion:
min Σ(yi - ŷi)^2
where:
yi = observed value of the dependent variable for the ith observation
ŷi = estimated value of the dependent variable for the ith observation
The least squares method is a statistical procedure that finds the best-fitting line for a set of
data points by minimizing the sum of the squared residuals (the vertical distances between the
observed points and the fitted line). The resulting least squares regression equation is used to
predict values of the dependent variable.
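As a rough illustration of what "best fit" means here, the sketch below compares the sum of squared residuals for three candidate lines. It uses the nine (X, Y) pairs from the worked example in these slides. The first line uses the rounded coefficients reported there (1.47 and 2.83); the two alternative lines are hypothetical choices made only for comparison.

```python
# Nine (X, Y) pairs from the worked example in the slides.
x = [5, 6, 8, 10, 12, 13, 15, 16, 17]
y = [16, 19, 23, 28, 36, 41, 44, 45, 50]

def sum_squared_residuals(b0, b1, x, y):
    """Sum of squared vertical distances between each y and the line b0 + b1*x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

fitted = sum_squared_residuals(1.47, 2.83, x, y)   # line reported in the slides
steeper = sum_squared_residuals(0.0, 3.0, x, y)    # hypothetical alternative
flatter = sum_squared_residuals(5.0, 2.5, x, y)    # hypothetical alternative

print(round(fitted, 2), round(steeper, 2), round(flatter, 2))
```

The fitted line gives the smallest of the three sums, which is the sense in which it is the best fit.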
Slide11
Least Squares Method
Slope for the Estimated Regression Equation
b1 = Σ(xi - x̄)(yi - ȳ) / Σ(xi - x̄)^2
where:
xi = value of independent variable for ith observation
yi = value of dependent variable for ith observation
x̄ = mean value for independent variable
ȳ = mean value for dependent variable
y-Intercept for the Estimated Regression Equation
b0 = ȳ - b1 x̄
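The least squares slope (sum of cross-deviations divided by the sum of squared x-deviations) and intercept (mean of y minus slope times mean of x) can be sketched in Python as below. The function name fit_line is a hypothetical helper; the data are the nine (X, Y) pairs from the worked example in these slides.

```python
def fit_line(x, y):
    """Return (b0, b1) for the least squares line y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

x = [5, 6, 8, 10, 12, 13, 15, 16, 17]
y = [16, 19, 23, 28, 36, 41, 44, 45, 50]

b0, b1 = fit_line(x, y)
print(round(b0, 2), round(b1, 2))  # 1.47 2.83
```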
Slide12
Example
Slide13
SOLUTION
X      Y      Xi-Xbar  Yi-Ybar  (Xi-Xbar)^2  (Xi-Xbar)(Yi-Ybar)
5.00   16.00  -6.33    -17.56   40.11        111.18
6.00   19.00  -5.33    -14.56   28.44        77.63
8.00   23.00  -3.33    -10.56   11.11        35.18
10.00  28.00  -1.33    -5.56    1.78         7.41
12.00  36.00  0.67     2.44     0.44         1.63
13.00  41.00  1.67     7.44     2.78         12.41
15.00  44.00  3.67     10.44    13.44        38.30
16.00  45.00  4.67     11.44    21.78        53.41
17.00  50.00  5.67     16.44    32.11        93.19
Means: Xbar = 11.33, Ybar = 33.56
Sums: Σ(Xi-Xbar)^2 = 152.00, Σ(Xi-Xbar)(Yi-Ybar) = 430.33
b1 = 430.33 / 152.00 = 2.83
b0 = Ybar - b1 Xbar = 1.47
Ŷ = 1.47 + 2.83X
The estimated regression coefficient b1 = 2.831 indicates that Y increases by 2.831 units, on average,
for each one-unit increase in X.
Slide14
Example 2
1. The data are from a bivariate population: both X and Y are random in this case, so there
are two regression lines.
2. To find the regression equation predicting length (Y), we take Y as the dependent variable and X as the
independent (non-random) variable.
3. To find the regression equation predicting weight (X), we take X as the dependent variable and Y as the
independent variable.
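The idea of two regression lines can be sketched as follows, using the ten (X, Y) pairs listed in the second table of this example. The helper name regress is hypothetical. The final line relies on a standard identity (the product of the two slopes equals the squared correlation), which is stated here as background rather than taken from the slides.

```python
def regress(u, v):
    """Least squares line predicting v from u; returns (intercept, slope)."""
    n = len(u)
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    suv = sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))
    suu = sum((ui - u_bar) ** 2 for ui in u)
    slope = suv / suu
    return v_bar - slope * u_bar, slope

x = [3, 5, 6, 9, 10, 12, 15, 20, 22, 28]
y = [10, 12, 12, 18, 20, 22, 27, 30, 32, 34]

a_yx, b_yx = regress(x, y)  # Y on X: predicts Y from X
a_xy, b_xy = regress(y, x)  # X on Y: predicts X from Y

print(round(a_yx, 2), round(b_yx, 2))
print(round(a_xy, 2), round(b_xy, 2))  # -6.58 0.9
print(round(b_yx * b_xy, 3))           # product of the two slopes
```

The two lines coincide only when the points lie exactly on a straight line; otherwise each minimizes squared errors in a different direction.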
Slide15
X      Y      Xi-Xbar  Yi-Ybar  (Xi-Xbar)^2  (Xi-Xbar)(Yi-Ybar)
3.00   10.00  -8.33    -10.33   69.44        86.11
5.00   12.00  -6.33    -8.33    40.11        52.78
6.00   12.00  -5.33    -8.33    28.44        44.44
9.00   18.00  -2.33    -2.33    5.44         5.44
10.00  20.00  -1.33    -0.33    1.78         0.44
12.00  22.00  0.67     1.67     0.44         1.11
15.00  27.00  3.67     6.67     13.44        24.44
20.00  30.00  8.67     9.67     75.11        83.78
22.00  32.00  10.67    11.67    113.78       124.44
Means: Xbar = 11.33, Ybar = 20.33
Sums: Σ(Xi-Xbar)^2 = 348.00, Σ(Xi-Xbar)(Yi-Ybar) = 423.00
b1 = 423.00 / 348.00 = 1.22
b0 = Ybar - b1 Xbar = 6.56
Ŷ = 6.56 + 1.22X
Slide16
Y      X      Yi-Ybar  Xi-Xbar  (Yi-Ybar)^2  (Yi-Ybar)(Xi-Xbar)
10.00  3.00   -11.70   -10.00   136.89       117.00
12.00  5.00   -9.70    -8.00    94.09        77.60
12.00  6.00   -9.70    -7.00    94.09        67.90
18.00  9.00   -3.70    -4.00    13.69        14.80
20.00  10.00  -1.70    -3.00    2.89         5.10
22.00  12.00  0.30     -1.00    0.09         -0.30
27.00  15.00  5.30     2.00     28.09        10.60
30.00  20.00  8.30     7.00     68.89        58.10
32.00  22.00  10.30    9.00     106.09       92.70
34.00  28.00  12.30    15.00    151.29       184.50
Means: Ybar = 21.70, Xbar = 13.00
Sums: Σ(Yi-Ybar)^2 = 696.10, Σ(Yi-Ybar)(Xi-Xbar) = 628.00
b1 = 628.00 / 696.10 = 0.90
b0 = Xbar - b1 Ybar = -6.58
X̂ = -6.58 + 0.90Y
Slide17
Example
X  Y   XY   X^2  Y^2
3  10  30   9    100
5  12  60   25   144
6  15  90   36   225
9  18  162  81   324
Slide18
Example
Slide19
Coefficient of Determination
The variability in the values of the dependent variable Y is called the total
variation and is given by Σ(yi - ȳ)^2. It is composed of two parts:
Σ(ŷi - ȳ)^2, the variation explained by the regression line
Σ(yi - ŷi)^2, the variation the regression line fails to explain
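The split of total variation into an explained and an unexplained part can be checked numerically. The sketch below uses a small hypothetical data set (not from the slides), and the names sst, ssr and sse are conventional shorthand introduced here for the total, explained and unexplained sums of squares.

```python
# Hypothetical illustrative data, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares slope and intercept.
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)                # total variation
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)            # explained by the line
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # left unexplained

r_squared = ssr / sst  # coefficient of determination
print(round(sst, 3), round(ssr, 3), round(sse, 3), round(r_squared, 3))
```

For this data the explained share is 0.6, so the line accounts for 60% of the variation in y.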
Slide20
Coefficient of Determination 4910.54/646
Slide21
SOLUTION
X Y XY X^2
5 16 80 25
6 19 114 36
8 23 184 64
10 28 280 100
12 36 432 144
13 41 533 169
15 44 660 225
16 45 720 256
17 50 850 289
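The XY and X^2 columns of the table above feed the computational (shortcut) form of the least squares formulas, which works from column totals instead of deviations from the mean. The sketch below is one way to carry that out; the variable names are illustrative.

```python
x = [5, 6, 8, 10, 12, 13, 15, 16, 17]
y = [16, 19, 23, 28, 36, 41, 44, 45, 50]

n = len(x)
sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))  # total of the XY column
sum_x2 = sum(xi ** 2 for xi in x)              # total of the X^2 column

# Shortcut formulas: algebraically the same as the deviation-based ones.
b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b0 = (sum_y - b1 * sum_x) / n

print(sum_x, sum_y, sum_xy, sum_x2)  # 102 302 3853 1308
print(round(b0, 2), round(b1, 2))    # 1.47 2.83
```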
The value of r^2 shows that 95.8% of the variability in Y is explained by the independent variable X
Slide23
Slide24
Slide25
Linear Correlation
The strength of the linear relationship between two variables is measured by the correlation coefficient.
In other words, the linear correlation coefficient measures how closely the points in a scatter
diagram are clustered around the regression line.
The correlation coefficient calculated for population data is denoted by ρ (Greek letter rho),
and the one calculated for sample data is denoted by r.
(Note that the square of the correlation coefficient is equal to the coefficient of determination.)
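The note above can be written out as a short derivation. The shorthand S_xx, S_yy and S_xy is introduced here for brevity and is not notation from the slides; b_1 is the least squares slope.

```latex
S_{xx} = \sum (x_i - \bar{x})^2, \qquad
S_{yy} = \sum (y_i - \bar{y})^2, \qquad
S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})

r = \frac{S_{xy}}{\sqrt{S_{xx}\, S_{yy}}}, \qquad
b_1 = \frac{S_{xy}}{S_{xx}}

r^2 = \frac{S_{xy}^2}{S_{xx}\, S_{yy}}
    = \frac{b_1 S_{xy}}{S_{yy}}
    = \frac{\sum (\hat{y}_i - \bar{y})^2}{\sum (y_i - \bar{y})^2}
```

The last step uses the fact that the explained variation equals b_1 S_xy for a least squares line, so r^2 is the explained share of the total variation. The sign of r matches the sign of the slope b_1.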
Slide26
Correlation
Slide27
Sample Correlation Coefficient
Slide28
x   y   y^2  xy    x^2   x-xbar  y-ybar  (x-xbar)^2  (y-ybar)^2  (x-xbar)(y-ybar)
55  14  196  770   3025  -0.14   -1.43   0.02        2.04        0.20
83  24  576  1992  6889  27.86   8.57    776.07      73.44       238.74
38  13  169  494   1444  -17.14  -2.43   293.85      5.90        41.66
61  16  256  976   3721  5.86    0.57    34.32       0.32        3.34
33  9   81   297   1089  -22.14  -6.43   490.27      41.34       142.37
49  15  225  735   2401  -6.14   -0.43   37.72       0.18        2.64
67  17  289  1139  4489  11.86   1.57    140.61      2.46        18.62
Sums: Σx = 386, Σy = 108, Σy^2 = 1792, Σxy = 6403, Σx^2 = 23058, Σ(x-xbar) = 0.0, Σ(y-ybar) = -0.01
Σ(x-xbar)^2 = 1772.86, Σ(y-ybar)^2 = 125.71, Σ(x-xbar)(y-ybar) = 447.57
Means: xbar = 55.14, ybar = 15.43
r = 447.57 / sqrt(1772.86 × 125.71) = 447.57 / 472.087 = 0.95
The value r = 0.95 shows that income (X) has a strong positive linear association with food expenditure (Y)
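The calculation above can be reproduced with a few lines of Python. This is a minimal sketch using the seven income and food expenditure pairs from the table; the function name correlation is a hypothetical helper.

```python
import math

income = [55, 83, 38, 61, 33, 49, 67]  # x values from the table
food = [14, 24, 13, 16, 9, 15, 17]     # y values from the table

def correlation(x, y):
    """Sample correlation coefficient r computed from deviations about the means."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

r = correlation(income, food)
print(round(r, 2))  # 0.95
```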
Slide29