You are on page 1of 33

Simple Linear Regression

 Simple Linear Regression Model


 Least Squares Method
 Coefficient of Determination
 Model Assumptions
 Testing for Significance

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
1
or duplicated, or posted to a publicly accessible website, in whole or in part.
Simple Linear Regression

 Managerial decisions often are based on the


relationship between two or more variables.
 Regression analysis can be used to develop an
equation showing how the variables are related.
 The variable being predicted is called the dependent
variable and is denoted by y.
 The variables being used to predict the value of the
dependent variable are called the independent
variables and are denoted by x.

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
2
or duplicated, or posted to a publicly accessible website, in whole or in part.
Simple Linear Regression

 Simple linear regression involves one independent


variable and one dependent variable.
 The relationship between the two variables is
approximated by a straight line.
 Regression analysis involving two or more
independent variables is called multiple regression.

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
3
or duplicated, or posted to a publicly accessible website, in whole or in part.
Simple Linear Regression Model

 The equation that describes how y is related to x and


an error term is called the regression model.
 The simple linear regression model is:

y = b0 + b1x +e

where:
b0 and b1 are called parameters of the model,
e is a random variable called the error term.

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
4
or duplicated, or posted to a publicly accessible website, in whole or in part.
Simple Linear Regression Equation

 The simple linear regression equation is:

E(y) = b0 + b1x

• Graph of the regression equation is a straight line.


• b0 is the y intercept of the regression line.
• b1 is the slope of the regression line.
• E(y) is the expected value of y for a given x value.

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
5
or duplicated, or posted to a publicly accessible website, in whole or in part.
Simple Linear Regression Equation

 Positive Linear Relationship

E(y)

Regression line

Intercept Slope b1
b0
is positive

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
6
or duplicated, or posted to a publicly accessible website, in whole or in part.
Simple Linear Regression Equation

 Negative Linear Relationship

E(y)

Intercept
b0 Regression line

Slope b1
is negative

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
7
or duplicated, or posted to a publicly accessible website, in whole or in part.
Simple Linear Regression Equation

 No Relationship

E(y)

Intercept Regression line


b0
Slope b1
is 0

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
8
or duplicated, or posted to a publicly accessible website, in whole or in part.
Estimated Simple Linear Regression Equation

 The estimated simple linear regression equation

ŷ  b0  b1 x

• The graph is called the estimated regression line.


• b0 is the y intercept of the line.
• b1 is the slope of the line.
• ŷis the estimated value of y for a given x value.

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
9
or duplicated, or posted to a publicly accessible website, in whole or in part.
Estimation Process

Regression Model Sample Data:


y = b0 + b1x +e x y
Regression Equation x1 y1
E(y) = b0 + b1x . .
Unknown Parameters . .
b0, b1 xn yn

Estimated
b0 and b1 Regression Equation
provide estimates of ŷ  b0  b1 x
b0 and b1 Sample Statistics
b0, b1

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
10
or duplicated, or posted to a publicly accessible website, in whole or in part.
Least Squares Method

 Least Squares Criterion

min  (y i  y i ) 2

where:
yi = observed value of the dependent variable
for the ith observation
y^i = estimated value of the dependent variable
for the ith observation

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
11
or duplicated, or posted to a publicly accessible website, in whole or in part.
Least Squares Method

 Slope for the Estimated Regression Equation

b1   ( x  x )( y  y )
i i

 (x  x ) i
2

where:
xi = value of independent variable for ith
observation
yi = value of dependent variable for ith
_ observation
x = mean value for independent variable
_
y = mean value for dependent variable
© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
12
or duplicated, or posted to a publicly accessible website, in whole or in part.
Least Squares Method

 y-Intercept for the Estimated Regression Equation

b0  y  b1 x

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
13
or duplicated, or posted to a publicly accessible website, in whole or in part.
Simple Linear Regression

 Example: Reed Auto Sales


Reed Auto periodically has a special week-long sale.
As part of the advertising campaign Reed runs one or
more television commercials during the weekend
preceding the sale. Data from a sample of 5 previous
sales are shown on the next slide.

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
14
or duplicated, or posted to a publicly accessible website, in whole or in part.
Simple Linear Regression

 Example: Reed Auto Sales

Number of Number of
TV Ads (x) Cars Sold (y)
1 14
3 24
2 18
1 17
3 27
Sx = 10 Sy = 100
x2 y  20

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
15
or duplicated, or posted to a publicly accessible website, in whole or in part.
Estimated Regression Equation

 Slope for the Estimated Regression Equation

b1   ( xi  x )( y i  y ) 20
 5
 ( xi  x ) 2
4
 y-Intercept for the Estimated Regression Equation
b0  y  b1 x  20  5(2)  10
 Estimated Regression Equation
yˆ  10  5x

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
16
or duplicated, or posted to a publicly accessible website, in whole or in part.
Coefficient of Determination

 Relationship Among SST, SSR, SSE

SST = SSR + SSE

 i
( y  y ) 2
  i
( ˆ
y  y ) 2
  i i
( y  ˆ
y ) 2

where:
SST = total sum of squares
SSR = sum of squares due to regression
SSE = sum of squares due to error

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
17
or duplicated, or posted to a publicly accessible website, in whole or in part.
Coefficient of Determination

 The coefficient of determination is:

r2 = SSR/SST

where:
SSR = sum of squares due to regression
SST = total sum of squares

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
18
or duplicated, or posted to a publicly accessible website, in whole or in part.
Coefficient of Determination

r2 = SSR/SST = 100/114 = .8772


The regression relationship is very strong; 87.72%
of the variability in the number of cars sold can be
explained by the linear relationship between the
number of TV ads and the number of cars sold.

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
19
or duplicated, or posted to a publicly accessible website, in whole or in part.
Sample Correlation Coefficient

rxy  (sign of b1 ) Coefficient of Determination


rxy  (sign of b1 ) r 2

where:
b1 = the slope of the estimated regression
equation yˆ  b0  b1 x

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
20
or duplicated, or posted to a publicly accessible website, in whole or in part.
Sample Correlation Coefficient

rxy  (sign of b1 ) r 2

yˆ  10  is
The sign of b1 in the equation 5 x“+”.

rxy = + .8772

rxy = +.9366

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
21
or duplicated, or posted to a publicly accessible website, in whole or in part.
Assumptions About the Error Term e

1. The error e is a random variable with mean of zero.

2. The variance of e , denoted by  2, is the same for


all values of the independent variable.

3. The values of e are independent.

4. The error e is a normally distributed random


variable.

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
22
or duplicated, or posted to a publicly accessible website, in whole or in part.
Testing for Significance

To test for a significant regression relationship, we


must conduct a hypothesis test to determine whether
the value of b1 is zero.

Two tests are commonly used:


t Test and F Test

Both the t test and F test require an estimate of  2,


the variance of e in the regression model.

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
23
or duplicated, or posted to a publicly accessible website, in whole or in part.
Testing for Significance

 An Estimate of  2
The mean square error (MSE) provides the estimate
of  2, and the notation s2 is also used.

s 2 = MSE = SSE/(n  2)

where:

SSE   ( yi  yˆ i ) 2   ( yi  b0  b1 xi ) 2

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
24
or duplicated, or posted to a publicly accessible website, in whole or in part.
Testing for Significance

 An Estimate of 
• To estimate  we take the square root of  2.
• The resulting s is called the standard error of
the estimate.

SSE
s  MSE 
n2

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
25
or duplicated, or posted to a publicly accessible website, in whole or in part.
Testing for Significance: t Test

 Hypotheses

H0 : b1  0
H a : b1  0

 Test Statistic

b1 s
t where sb1 
sb1 S( xi  x ) 2

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
26
or duplicated, or posted to a publicly accessible website, in whole or in part.
Testing for Significance: t Test

 Rejection Rule

Reject H0 if p-value < 


or t < -t or t > t

where:
t is based on a t distribution
with n - 2 degrees of freedom

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
27
or duplicated, or posted to a publicly accessible website, in whole or in part.
Testing for Significance: t Test

1. Determine the hypotheses. H0 : b1  0


H a : b1  0
2. Specify the level of significance.  = .05

b1
3. Select the test statistic. t 
sb1

4. State the rejection rule. Reject H0 if p-value < .05


or |t| > 3.182 (with
3 degrees of freedom)

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
28
or duplicated, or posted to a publicly accessible website, in whole or in part.
Testing for Significance: t Test

5. Compute the value of the test statistic.


b1 5
t   4.63
sb1 1.08

6. Determine whether to reject H0.


t = 4.541 provides an area of .01 in the upper
tail. Hence, the p-value is less than .02. (Also,
t = 4.63 > 3.182.) We can reject H0.

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
29
or duplicated, or posted to a publicly accessible website, in whole or in part.
Testing for Significance: F Test

 Hypotheses

H0 : b1  0
H a : b1  0
 Test Statistic

F = MSR/MSE

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
30
or duplicated, or posted to a publicly accessible website, in whole or in part.
Testing for Significance: F Test

 Rejection Rule

Reject H0 if
p-value < 
or F > F
where:
F is based on an F distribution with
1 degree of freedom in the numerator and
n - 2 degrees of freedom in the denominator

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
31
or duplicated, or posted to a publicly accessible website, in whole or in part.
Testing for Significance: F Test

1. Determine the hypotheses. H0 : b1  0


H a : b1  0

2. Specify the level of significance.  = .05

3. Select the test statistic. F = MSR/MSE

4. State the rejection rule. Reject H0 if p-value < .05


or F > 10.13 (with 1 d.f.
in numerator and
3 d.f. in denominator)

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
32
or duplicated, or posted to a publicly accessible website, in whole or in part.
Testing for Significance: F Test

5. Compute the value of the test statistic.

F = MSR/MSE = 100/4.667 = 21.43

6. Determine whether to reject H0.


F = 17.44 provides an area of .025 in the upper
tail. Thus, the p-value corresponding to F = 21.43
is less than .025. Hence, we reject H0.
The statistical evidence is sufficient to conclude
that we have a significant relationship between the
number of TV ads aired and the number of cars sold.

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied Slide
33
or duplicated, or posted to a publicly accessible website, in whole or in part.