SLIDES BY John Loucks, St. Edward’s University

Slide1
Chapter 14, Part A
Simple Linear Regression

 Simple Linear Regression Model
 Least Squares Method
 Coefficient of Determination
 Model Assumptions
 Testing for Significance

Slide2
Regression
 Managerial decisions are often based on the relationship between two or more
variables.
 Regression can be used to develop an equation showing how one variable depends
on one or more independent variables.
 The equation is used to estimate or predict the average value of the dependent
variable from known values of the independent variable.
 We study the dependence of one variable on a single independent variable (simple
regression) or on more than one independent variable (multiple regression).
 If the relationship is described by a straight line, the regression is linear;
otherwise it is curvilinear.
Dependent variable: also called the regressand, predictand, response, or explained variable (assumed to be
random)
Independent variable: also called the regressor, predictor, regression variable, or explanatory variable

Slide3
Simple Linear Regression

 Simple linear regression involves one independent variable and one dependent
variable.
 Deterministic models, in which y is determined exactly by x, do not call for regression.
 Regression is appropriate for non-deterministic models, where the linear relationship is
not exact (e.g., the relationship between age and height, which includes a random error).

y = b0 + b1x + e

where:
b0 (the intercept) and b1 (the slope, also called the regression coefficient, which may be
positive or negative) are the population parameters of the model,
e is a random variable called the error term.
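
As a minimal sketch of the difference between a deterministic and a non-deterministic model, the Python snippet below generates y values from y = b0 + b1x + e. The parameter values, the x values, and the helper name `simulate` are illustrative assumptions, not part of the slides; the error term is drawn from a normal distribution with mean zero.

```python
import random

def simulate(b0, b1, xs, sigma, seed=0):
    """Return y values generated as y = b0 + b1*x + e, with e ~ Normal(0, sigma)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [b0 + b1 * x + rng.gauss(0.0, sigma) for x in xs]

xs = [1, 2, 3, 4, 5]                        # hypothetical x values
exact = simulate(2.0, 0.5, xs, sigma=0.0)   # no error: every point lies on the line
noisy = simulate(2.0, 0.5, xs, sigma=1.0)   # random error: points scatter around the line

print(exact)
print(noisy)
```

With sigma set to zero the points fall exactly on the line, which is the deterministic case; any positive sigma gives the inexact relationship that regression is meant to describe.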

Slide4
Assumptions About the Error Term e

1. The error e is a random variable with a mean of zero.

2. The variance of e, denoted by σ^2, is the same for all values of the independent variable.

3. The values of e are independent.

4. The error e is a normally distributed random variable.

Slide5
Simple Linear Regression Equation

 The simple linear regression equation is:

E(y) = 0 + 1x

• Assume E(e )=0 , The expected value of the error term is zero, then graph of the
regression equation is a straight line.
b0 is the y intercept of the regression line. b1 is the slope of the regression
line.
E(y) is the expected value of y for a given x value.

Slide6
Simple Linear Regression Equation

 Positive Linear Relationship

[Figure: regression line with E(y) on the vertical axis; intercept b0; slope b1 is positive]

Slide7
Simple Linear Regression Equation

 Negative Linear Relationship

[Figure: regression line with E(y) on the vertical axis; intercept b0; slope b1 is negative]

Slide8
Simple Linear Regression Equation

 No Relationship

[Figure: horizontal regression line with E(y) on the vertical axis; intercept b0; slope b1 is 0]

Slide9
Estimated Simple Linear Regression Equation

 The estimated simple linear regression equation is:

ŷ = b0 + b1x

• The graph is called the estimated regression line.
• b0 is the y intercept of the line.
• b1 is the slope of the line.
• ŷ is the estimated value of y for a given x value.

Slide10
Least Squares Method

 Least Squares Criterion

min Σ (yi − ŷi)^2

where:
yi = observed value of the dependent variable
for the ith observation
ŷi = estimated value of the dependent variable
for the ith observation
 The least squares method is a statistical procedure that finds the best-fitting line for a set of
data points by minimizing the sum of the squared residuals (the vertical offsets of the points from
the fitted line). The resulting least squares regression line is used to predict the value of the
dependent variable.
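
The criterion can be illustrated with a short Python sketch that computes the sum of squared residuals for a candidate line. The five data points and the function name `sse` are hypothetical, chosen only for illustration; for these points the least squares line happens to be ŷ = 2.2 + 0.6x, so any other line should give a larger sum.

```python
def sse(b0, b1, xs, ys):
    """Sum of squared residuals for the candidate line y_hat = b0 + b1*x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3, 4, 5]   # hypothetical data
ys = [2, 4, 5, 4, 5]

best = sse(2.2, 0.6, xs, ys)    # the least squares line for these points
other = sse(2.0, 0.6, xs, ys)   # same slope, slightly different intercept

print(best, other)
```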
Slide11
Least Squares Method

 Slope for the Estimated Regression Equation

b1 = Σ (xi − x̄)(yi − ȳ) / Σ (xi − x̄)^2

where:
xi = value of independent variable for ith observation
yi = value of dependent variable for ith observation
x̄ = mean value for independent variable
ȳ = mean value for dependent variable
 y-Intercept for the Estimated Regression Equation

b0 = ȳ − b1 x̄

Slide12
Example

Slide13
SOLUTION
           X       Y   Xi-Xbar   Yi-Ybar   (Xi-Xbar)^2   (Xi-Xbar)(Yi-Ybar)
        5.00   16.00     -6.33    -17.56         40.11               111.18
        6.00   19.00     -5.33    -14.56         28.44                77.63
        8.00   23.00     -3.33    -10.56         11.11                35.18
       10.00   28.00     -1.33     -5.56          1.78                 7.41
       12.00   36.00      0.67      2.44          0.44                 1.63
       13.00   41.00      1.67      7.44          2.78                12.41
       15.00   44.00      3.67     10.44         13.44                38.30
       16.00   45.00      4.67     11.44         21.78                53.41
       17.00   50.00      5.67     16.44         32.11                93.19
Mean   11.33   33.56
Sum                                             152.00               430.33

b1 = 430.33 / 152.00 = 2.83
b0 = 1.47

Ŷ = 1.47 + 2.83X

The estimated regression coefficient, b1 = 2.831, indicates that the value of Y increases by 2.831
units for each 1-unit increase in X.
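
The calculation in the table can be reproduced with a few lines of Python. This is a sketch rather than the slides' own procedure: the function name `fit_line` is an assumption, and it takes the slope as the total of the (Xi-Xbar)(Yi-Ybar) column divided by the total of the (Xi-Xbar)^2 column, with the intercept as Ybar minus the slope times Xbar.

```python
def fit_line(xs, ys):
    """Return (b0, b1) for the least squares line y_hat = b0 + b1*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-products of deviations and sum of squared x deviations
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# X and Y columns from the solution table
xs = [5, 6, 8, 10, 12, 13, 15, 16, 17]
ys = [16, 19, 23, 28, 36, 41, 44, 45, 50]

b0, b1 = fit_line(xs, ys)
print(f"Y_hat = {b0:.2f} + {b1:.2f} X")   # Y_hat = 1.47 + 2.83 X
```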
Slide14
Example 2

1. The data are from a bivariate population in which both X and Y are random. Therefore, there
will be two regression lines.
2. To find the regression equation predicting length Y, we take Y as the dependent variable and X
as the independent (non-random) variable.
3. To find the regression equation predicting weight X, we take X as the dependent variable and Y
as the independent variable.
To
Slide15
           X       Y   Xi-Xbar   Yi-Ybar   (Xi-Xbar)^2   (Xi-Xbar)(Yi-Ybar)
        3.00   10.00    -10.00    -11.70        100.00               117.00
        5.00   12.00     -8.00     -9.70         64.00                77.60
        6.00   12.00     -7.00     -9.70         49.00                67.90
        9.00   18.00     -4.00     -3.70         16.00                14.80
       10.00   20.00     -3.00     -1.70          9.00                 5.10
       12.00   22.00     -1.00      0.30          1.00                -0.30
       15.00   27.00      2.00      5.30          4.00                10.60
       20.00   30.00      7.00      8.30         49.00                58.10
       22.00   32.00      9.00     10.30         81.00                92.70
       28.00   34.00     15.00     12.30        225.00               184.50
Mean   13.00   21.70
Sum                                             598.00               628.00

b1 = 628.00 / 598.00 = 1.05
b0 = 21.70 − 1.05(13.00) = 8.05

Ŷ = 8.05 + 1.05X
Slide16
           Y       X   Yi-Ybar   Xi-Xbar   (Yi-Ybar)^2   (Yi-Ybar)(Xi-Xbar)
       10.00    3.00    -11.70    -10.00        136.89               117.00
       12.00    5.00     -9.70     -8.00         94.09                77.60
       12.00    6.00     -9.70     -7.00         94.09                67.90
       18.00    9.00     -3.70     -4.00         13.69                14.80
       20.00   10.00     -1.70     -3.00          2.89                 5.10
       22.00   12.00      0.30     -1.00          0.09                -0.30
       27.00   15.00      5.30      2.00         28.09                10.60
       30.00   20.00      8.30      7.00         68.89                58.10
       32.00   22.00     10.30      9.00        106.09                92.70
       34.00   28.00     12.30     15.00        151.29               184.50
Mean   21.70   13.00
Sum                                             696.10               628.00

b1 = 628.00 / 696.10 = 0.90
b0 = -6.58

X̂ = −6.58 + 0.90Y

Slide17
Example

         X      Y     XY    X^2    Y^2
         3     10     30      9    100
         5     12     60     25    144
         6     15     90     36    225
         9     18    162     81    324
        10     20    200    100    400
        12     22    264    144    484
        15     27    405    225    729
        20     30    600    400    900
        22     32    704    484   1024
        28     34    952    784   1156
Sum    130    220   3467   2288   5486
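
Column totals like these are the inputs to the computational (shortcut) form of the least squares formulas, which avoids computing each deviation from the mean. The sketch below assumes that is how the totals are meant to be used, since the slide's own solution is not reproduced in the text; the variable names are illustrative.

```python
# Column totals from the table (n = 10 observations)
n = 10
sum_x, sum_y, sum_xy, sum_x2 = 130, 220, 3467, 2288

# Shortcut forms of the deviation sums
sxy = sum_xy - sum_x * sum_y / n    # equals the sum of (x - x_bar)(y - y_bar)
sxx = sum_x2 - sum_x ** 2 / n       # equals the sum of (x - x_bar)^2

b1 = sxy / sxx                      # slope
b0 = sum_y / n - b1 * sum_x / n     # intercept: y_bar - b1 * x_bar

print(f"Sxy = {sxy:.0f}, Sxx = {sxx:.0f}")
print(f"Y_hat = {b0:.2f} + {b1:.2f} X")
```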

Slide18
Example

Slide19
Coefficient of Determination

The variability in the values of the dependent variable Y is called the total
variation and is given by Σ (yi − ȳ)^2. It is composed of two parts:
 the variation explained by the regression line, Σ (ŷi − ȳ)^2
 the variation the regression line fails to explain, Σ (yi − ŷi)^2

 Total variation = Explained variation + Unexplained variation

 The coefficient of determination, r^2, measures the proportion of the variability in
the values of the dependent variable that is explained by its linear relationship
with the independent variable. It is defined as the ratio of the explained variation
to the total variation.
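
The decomposition can be checked numerically. The Python sketch below uses five hypothetical data points (not taken from the slides) and illustrative variable names: it fits the least squares line, computes the total, explained, and unexplained variation, and takes their ratio.

```python
xs = [1, 2, 3, 4, 5]   # hypothetical data
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least squares slope and intercept
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum((x - x_bar) ** 2 for x in xs)
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * x for x in xs]

sst = sum((y - y_bar) ** 2 for y in ys)                # total variation
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained variation
sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))   # unexplained variation

r2 = ssr / sst   # coefficient of determination
print(f"total = {sst:.2f}, explained = {ssr:.2f}, unexplained = {sse:.2f}, r^2 = {r2:.2f}")
```

For these points the explained and unexplained parts add up to the total, and the line accounts for 60% of the variability in y.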

Slide20
Coefficient of Determination 4910.54/646

 A value of r^2 = 1 signifies that 100% of the variability in the dependent
variable Y is explained by the independent variable X.
 r^2 = 0 indicates that none of the variability in Y is explained by X.
 r^2 = 0.93 indicates that 93% of the variability in Y is explained by X.

Slide21
SOLUTION
         X      Y     XY    X^2
         5     16     80     25
         6     19    114     36
         8     23    184     64
        10     28    280    100
        12     36    432    144
        13     41    533    169
        15     44    660    225
        16     45    720    256
        17     50    850    289
Sum    102    302   3853   1308

The estimated regression coefficient, b1 = 2.831, indicates that the value of Y
increases by 2.831 units for each 1-unit increase in X.
Slide22
Coefficient of Determination

The value of r^2 shows that 95.8% of the variability in Y is explained by the independent variable X.
Slide23
Slide24
Slide25
Linear Correlation
 The strength of the linear relationship between two variables is measured by the correlation coefficient.
 In other words, the linear correlation coefficient measures how closely the points in a scatter
diagram are clustered around the regression line.
 The correlation coefficient calculated for the population data is denoted by ρ (Greek letter rho)
and the one calculated for sample data is denoted by r.
 (Note that the square of the correlation coefficient is equal to the coefficient of determination.)

Slide26
Correlation

Slide27
Sample Correlation Coefficient

Slide28
          x       y    y^2     xy     x^2   x-xbar   y-ybar   (x-xbar)^2   (y-ybar)^2   (x-xbar)(y-ybar)
         55      14    196    770    3025    -0.14    -1.43         0.02         2.04               0.20
         83      24    576   1992    6889    27.86     8.57       776.07        73.44             238.74
         38      13    169    494    1444   -17.14    -2.43       293.85         5.90              41.66
         61      16    256    976    3721     5.86     0.57        34.32         0.32               3.34
         33       9     81    297    1089   -22.14    -6.43       490.27        41.34             142.37
         49      15    225    735    2401    -6.14    -0.43        37.72         0.18               2.64
         67      17    289   1139    4489    11.86     1.57       140.61         2.46              18.62
Sum     386     108   1792   6403   23058     0.00    -0.01      1772.86       125.71             447.57
Mean  55.14   15.43

r = 447.57 / SQRT(1772.86 × 125.71) = 447.57 / 472.087 = 0.95
This shows that income (X) is strongly and positively associated with food expenditure (Y).
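
The same calculation can be sketched in Python from the x and y columns of the table. The function name `correlation` and the variable names are assumptions for illustration; the formula is the one applied above, the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations.

```python
import math

def correlation(xs, ys):
    """Sample correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

income = [55, 83, 38, 61, 33, 49, 67]   # x column of the table
food = [14, 24, 13, 16, 9, 15, 17]      # y column of the table

r = correlation(income, food)
print(round(r, 2))   # 0.95
```

Squaring r gives the coefficient of determination for the same data, roughly 0.90 here.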

Slide29
