
Slides Prepared by

JOHN S. LOUCKS
St. Edward’s University

© 2002 South-Western/Thomson Learning Slide


1
Chapter 14
Simple Linear Regression
■ Simple Linear Regression Model
■ Least Squares Method
■ Coefficient of Determination
■ Model Assumptions
■ Testing for Significance
■ Using the Estimated Regression Equation for Estimation and Prediction
■ Computer Solution
■ Residual Analysis: Validating Model Assumptions
■ Residual Analysis: Outliers and Influential Observations

The Simple Linear Regression Model

■ Simple Linear Regression Model
    $y = \beta_0 + \beta_1 x + \epsilon$
■ Simple Linear Regression Equation
    $E(y) = \beta_0 + \beta_1 x$
■ Estimated Simple Linear Regression Equation
    $\hat{y} = b_0 + b_1 x$

Least Squares Method

■ Least Squares Criterion
    $\min \sum (y_i - \hat{y}_i)^2$
  where:
    $y_i$ = observed value of the dependent variable for the ith observation
    $\hat{y}_i$ = estimated value of the dependent variable for the ith observation

The Least Squares Method

■ Slope for the Estimated Regression Equation
    $b_1 = \frac{\sum x_i y_i - (\sum x_i \sum y_i)/n}{\sum x_i^2 - (\sum x_i)^2/n}$
■ y-Intercept for the Estimated Regression Equation
    $b_0 = \bar{y} - b_1 \bar{x}$
  where:
    $x_i$ = value of independent variable for ith observation
    $y_i$ = value of dependent variable for ith observation
    $\bar{x}$ = mean value for independent variable
    $\bar{y}$ = mean value for dependent variable
    $n$ = total number of observations

Example: Reed Auto Sales

■ Simple Linear Regression
Reed Auto periodically has a special week-long sale.
As part of the advertising campaign, Reed runs one or
more television commercials during the weekend
preceding the sale. Data from a sample of 5 previous
sales are shown below.

Number of TV Ads    Number of Cars Sold
       1                    14
       3                    24
       2                    18
       1                    17
       3                    27

Example: Reed Auto Sales

■ Slope for the Estimated Regression Equation
    $b_1 = \frac{220 - (10)(100)/5}{24 - (10)^2/5} = 5$
■ y-Intercept for the Estimated Regression Equation
    $b_0 = 20 - 5(2) = 10$
■ Estimated Regression Equation
    $\hat{y} = 10 + 5x$
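
The slope and intercept formulas can be checked numerically. The following is a minimal sketch in plain Python that applies them to the five Reed Auto observations; the function name `least_squares_fit` and the variable names are illustrative choices, not part of the slides.

```python
# Reed Auto sample: number of TV ads (x) and number of cars sold (y).
tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]


def least_squares_fit(x, y):
    """Return (b0, b1) using the computational least squares formulas."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    # Slope: [sum(x*y) - (sum x)(sum y)/n] / [sum(x^2) - (sum x)^2/n]
    b1 = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    # Intercept: y-bar minus b1 times x-bar
    b0 = sum_y / n - b1 * sum_x / n
    return b0, b1


b0, b1 = least_squares_fit(tv_ads, cars_sold)
print(f"y-hat = {b0:g} + {b1:g}x")  # prints: y-hat = 10 + 5x
```

The intermediate sums match the values on the slide: 220 for the sum of products, 10 and 100 for the sums of x and y, and 24 for the sum of squared x values.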

Example: Reed Auto Sales

■ Scatter Diagram
[Figure: scatter diagram of Cars Sold (vertical axis, 0 to 30) against TV Ads (horizontal axis, 0 to 4), with the fitted line y = 5x + 10.]

Coefficient of Determination

■ In linear regression, the coefficient of determination is
commonly interpreted as how well the independent variables,
taken together, explain the variance of the dependent variable.

The Coefficient of Determination

■ Relationship Among SST, SSR, SSE
    SST = SSR + SSE
    $\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$
■ Coefficient of Determination
    $r^2 = SSR/SST$
  where:
    SST = total sum of squares
    SSR = sum of squares due to regression
    SSE = sum of squares due to error

Example: Reed Auto Sales

■ Coefficient of Determination
    $r^2 = SSR/SST = 100/114 = .8772$
The regression relationship is very strong: about
88% of the variation in the number of cars sold can be
explained by the linear relationship between the
number of TV ads and the number of cars sold.
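
The three sums of squares behind this result can be reproduced directly from the data. The sketch below assumes the fitted equation ŷ = 10 + 5x from the earlier slide; the variable names are illustrative.

```python
tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
b0, b1 = 10, 5  # estimated regression equation from the slides

y_bar = sum(cars_sold) / len(cars_sold)
y_hat = [b0 + b1 * x for x in tv_ads]

sst = sum((y - y_bar) ** 2 for y in cars_sold)               # total
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)                 # due to regression
sse = sum((y - yh) ** 2 for y, yh in zip(cars_sold, y_hat))  # due to error
r_squared = ssr / sst

print(f"SST={sst:g} SSR={ssr:g} SSE={sse:g} r^2={r_squared:.4f}")
```

The run confirms SST = SSR + SSE for this sample (114 = 100 + 14), which is where the ratio 100/114 comes from.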

The Correlation Coefficient

■ Sample Correlation Coefficient
    $r_{xy} = (\text{sign of } b_1)\sqrt{\text{Coefficient of Determination}}$
    $r_{xy} = (\text{sign of } b_1)\sqrt{r^2}$
  where:
    $b_1$ = the slope of the estimated regression equation $\hat{y} = b_0 + b_1 x$

Example: Reed Auto Sales

■ Sample Correlation Coefficient
    $r_{xy} = (\text{sign of } b_1)\sqrt{r^2}$
  The sign of $b_1$ in the equation $\hat{y} = 10 + 5x$ is "+".
    $r_{xy} = +\sqrt{.8772}$
    $r_{xy} = +.9366$
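
As a cross-check, the same value can be obtained two ways: from the sign of b1 and the square root of r², as on the slide, and from the usual product-moment formula for the sample correlation. The product-moment formula is not shown in the slides and is brought in here only for comparison; this is a sketch with illustrative names.

```python
import math

tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
b1 = 5                 # slope from the slides
r_squared = 100 / 114  # SSR/SST from the slides

# Route 1 (slides): attach the sign of b1 to the square root of r^2.
r_from_r2 = math.copysign(math.sqrt(r_squared), b1)

# Route 2 (assumed cross-check): product-moment correlation.
n = len(tv_ads)
x_bar = sum(tv_ads) / n
y_bar = sum(cars_sold) / n
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(tv_ads, cars_sold))
s_xx = sum((x - x_bar) ** 2 for x in tv_ads)
s_yy = sum((y - y_bar) ** 2 for y in cars_sold)
r_direct = s_xy / math.sqrt(s_xx * s_yy)

print(f"{r_from_r2:.4f} {r_direct:.4f}")
```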

Model Assumptions

■ Assumptions About the Error Term $\epsilon$
• The error $\epsilon$ is a random variable with a mean of zero.
• The variance of $\epsilon$, denoted by $\sigma^2$, is the same for
all values of the independent variable.
• The values of $\epsilon$ are independent.
• The error $\epsilon$ is a normally distributed random variable.

Testing for Significance

■ To test for a significant regression relationship, we
must conduct a hypothesis test to determine whether
the value of $\beta_1$ is zero.
■ Two tests are commonly used:
• t Test
• F Test
■ Both tests require an estimate of $\sigma^2$, the variance of $\epsilon$
in the regression model.

Testing for Significance

■ An Estimate of $\sigma^2$
The mean square error (MSE) provides the estimate
of $\sigma^2$, and the notation $s^2$ is also used.
    $s^2 = MSE = SSE/(n - 2)$
  where:
    $SSE = \sum (y_i - \hat{y}_i)^2 = \sum (y_i - b_0 - b_1 x_i)^2$

Testing for Significance

■ An Estimate of $\sigma$
• To estimate $\sigma$ we take the square root of $s^2$.
• The resulting $s$ is called the standard error of the estimate.
    $s = \sqrt{MSE} = \sqrt{\frac{SSE}{n - 2}}$
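
For the Reed Auto data, the mean square error and the standard error of the estimate work out as in the sketch below. It assumes the fitted equation ŷ = 10 + 5x from the earlier slide; variable names are illustrative.

```python
import math

tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
b0, b1 = 10, 5  # estimated regression equation from the slides

n = len(tv_ads)
# SSE: sum of squared differences between observed and estimated values.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(tv_ads, cars_sold))
mse = sse / (n - 2)   # s^2, the estimate of sigma^2
s = math.sqrt(mse)    # standard error of the estimate

print(f"SSE = {sse}, s^2 = MSE = {mse:.3f}, s = {s:.3f}")
```

The MSE of about 4.667 is the same value that appears later in the F test example.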

Testing for Significance: t Test

■ Hypotheses
    $H_0: \beta_1 = 0$
    $H_a: \beta_1 \neq 0$
■ Test Statistic
    $t = \frac{b_1}{s_{b_1}}$
■ Rejection Rule
    Reject $H_0$ if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$
  where $t_{\alpha/2}$ is based on a t distribution with
  n - 2 degrees of freedom.

Example: Reed Auto Sales

■ t Test
• Hypotheses
    $H_0: \beta_1 = 0$
    $H_a: \beta_1 \neq 0$
• Rejection Rule
    For $\alpha = .05$ and d.f. = 3, $t_{.025} = 3.182$
    Reject $H_0$ if $t < -3.182$ or $t > 3.182$
• Test Statistic
    $t = 5/1.08 = 4.63$
• Conclusion
    Reject $H_0$
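
The slide uses 1.08 for the estimated standard deviation of b1 without showing where it comes from. A minimal sketch follows, assuming the standard textbook formula s_b1 = s / sqrt(Σ(x_i - x̄)²), which is not stated in the slides. The critical value 3.182 is copied from the slide rather than computed.

```python
import math

tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
b0, b1 = 10, 5      # estimated regression equation from the slides
t_critical = 3.182  # t_.025 with 3 d.f., as quoted on the slide

n = len(tv_ads)
x_bar = sum(tv_ads) / n
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(tv_ads, cars_sold))
s = math.sqrt(sse / (n - 2))                  # standard error of the estimate
s_xx = sum((x - x_bar) ** 2 for x in tv_ads)  # sum of squared x deviations
s_b1 = s / math.sqrt(s_xx)                    # assumed formula for s_b1
t_stat = b1 / s_b1
reject_h0 = abs(t_stat) > t_critical          # two-tailed rejection rule

print(f"s_b1 = {s_b1:.2f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```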

Confidence Interval for $\beta_1$

■ We can use a 95% confidence interval for $\beta_1$ to test the
hypotheses just used in the t test.
■ $H_0$ is rejected if the hypothesized value of $\beta_1$ is not
included in the confidence interval for $\beta_1$.

Confidence Interval for $\beta_1$

■ The form of a confidence interval for $\beta_1$ is:
    $b_1 \pm t_{\alpha/2} s_{b_1}$
  where:
    $b_1$ is the point estimate
    $t_{\alpha/2} s_{b_1}$ is the margin of error
    $t_{\alpha/2}$ is the t value providing an area of $\alpha/2$ in the
    upper tail of a t distribution with n - 2 degrees of freedom

Example: Reed Auto Sales

■ Rejection Rule
    Reject $H_0$ if 0 is not included in the confidence interval for $\beta_1$.
■ 95% Confidence Interval for $\beta_1$
    $b_1 \pm t_{\alpha/2} s_{b_1} = 5 \pm 3.182(1.08) = 5 \pm 3.44$
    or 1.56 to 8.44
■ Conclusion
    Reject $H_0$, because the hypothesized value $\beta_1 = 0$
    lies outside the confidence interval for $\beta_1$.
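
The interval arithmetic can be reproduced in a few lines. This sketch takes b1 = 5, s_b1 = 1.08, and the critical value 3.182 directly from the slides; the variable names are illustrative.

```python
b1 = 5              # point estimate of the slope, from the slides
s_b1 = 1.08         # estimated standard deviation of b1, from the slides
t_critical = 3.182  # t_.025 with 3 d.f., from the slides
hypothesized_beta1 = 0

margin = t_critical * s_b1
lower, upper = b1 - margin, b1 + margin
# H0 is rejected when the hypothesized value falls outside the interval.
reject_h0 = not (lower <= hypothesized_beta1 <= upper)

print(f"{b1} +/- {margin:.2f} -> {lower:.2f} to {upper:.2f}, reject H0: {reject_h0}")
```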

Testing for Significance: F Test

■ Hypotheses
    $H_0: \beta_1 = 0$
    $H_a: \beta_1 \neq 0$
■ Test Statistic
    $F = MSR/MSE$
■ Rejection Rule
    Reject $H_0$ if $F > F_\alpha$
  where $F_\alpha$ is based on an F distribution with 1 d.f. in
  the numerator and n - 2 d.f. in the denominator.

Example: Reed Auto Sales

■ F Test
• Hypotheses
    $H_0: \beta_1 = 0$
    $H_a: \beta_1 \neq 0$
• Rejection Rule
    For $\alpha = .05$ and d.f. = 1, 3: $F_{.05} = 10.13$
    Reject $H_0$ if $F > 10.13$.
• Test Statistic
    $F = MSR/MSE = 100/4.667 = 21.43$
• Conclusion
    We can reject $H_0$.
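
The F statistic follows from the sums of squares. In the sketch below, MSR is taken as SSR divided by its single numerator degree of freedom, which is the usual definition for one independent variable and is assumed here since the slides do not spell it out. The critical value 10.13 is copied from the slide.

```python
tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
b0, b1 = 10, 5      # estimated regression equation from the slides
f_critical = 10.13  # F_.05 with 1 and 3 d.f., as quoted on the slide

n = len(tv_ads)
y_bar = sum(cars_sold) / n
y_hat = [b0 + b1 * x for x in tv_ads]

ssr = sum((yh - y_bar) ** 2 for yh in y_hat)
sse = sum((y - yh) ** 2 for y, yh in zip(cars_sold, y_hat))
msr = ssr / 1        # one numerator degree of freedom (assumed definition)
mse = sse / (n - 2)  # n - 2 denominator degrees of freedom
f_stat = msr / mse
reject_h0 = f_stat > f_critical

print(f"F = {f_stat:.2f}, reject H0: {reject_h0}")
```

With a single independent variable the F statistic equals the square of the t statistic, so both tests lead to the same conclusion here (4.63² is approximately 21.43).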

Using the Estimated Regression Equation for Estimation and Prediction

■ Confidence Interval Estimate of $E(y_p)$
    $\hat{y}_p \pm t_{\alpha/2} s_{\hat{y}_p}$
■ Prediction Interval Estimate of $y_p$
    $\hat{y}_p \pm t_{\alpha/2} s_{ind}$
  where the confidence coefficient is $1 - \alpha$ and
  $t_{\alpha/2}$ is based on a t distribution with n - 2 d.f.

Example: Reed Auto Sales

■ Point Estimation
If 3 TV ads are run prior to a sale, we expect the mean
number of cars sold to be:
    $\hat{y} = 10 + 5(3) = 25$ cars
■ Confidence Interval for $E(y_p)$
95% confidence interval estimate of the mean number
of cars sold when 3 TV ads are run is:
    $25 \pm 4.61$ = 20.39 to 29.61 cars
■ Prediction Interval for $y_p$
95% prediction interval estimate of the number of
cars sold in one particular week when 3 TV ads are run is:
    $25 \pm 8.28$ = 16.72 to 33.28 cars
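
The margins 4.61 and 8.28 are quoted without their standard errors. The sketch below reproduces them under the standard textbook formulas, which are an assumption here because the slides do not show them: s_ŷp = s·sqrt(1/n + (x_p - x̄)²/Σ(x_i - x̄)²) for the mean, and s_ind = s·sqrt(1 + 1/n + (x_p - x̄)²/Σ(x_i - x̄)²) for an individual value.

```python
import math

tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
b0, b1 = 10, 5      # estimated regression equation from the slides
t_critical = 3.182  # t_.025 with 3 d.f., as quoted on the slides
x_p = 3             # number of TV ads for the estimate

n = len(tv_ads)
x_bar = sum(tv_ads) / n
s_xx = sum((x - x_bar) ** 2 for x in tv_ads)
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(tv_ads, cars_sold))
s = math.sqrt(sse / (n - 2))

y_hat_p = b0 + b1 * x_p
# Assumed textbook formulas for the two standard errors.
s_mean = s * math.sqrt(1 / n + (x_p - x_bar) ** 2 / s_xx)
s_ind = s * math.sqrt(1 + 1 / n + (x_p - x_bar) ** 2 / s_xx)

ci_margin = t_critical * s_mean
pi_margin = t_critical * s_ind
print(f"CI: {y_hat_p - ci_margin:.2f} to {y_hat_p + ci_margin:.2f}")
print(f"PI: {y_hat_p - pi_margin:.2f} to {y_hat_p + pi_margin:.2f}")
```

The prediction interval is wider because it accounts for the variability of a single week's sales in addition to the uncertainty in the estimated mean.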

Residual Analysis

■ Residual for Observation i
    $y_i - \hat{y}_i$
■ Standardized Residual for Observation i
    $\frac{y_i - \hat{y}_i}{s_{y_i - \hat{y}_i}}$
  where: $s_{y_i - \hat{y}_i} = s\sqrt{1 - h_i}$

Example: Reed Auto Sales

■ Residuals

Observation    Predicted Cars Sold    Residual
     1                 15                -1
     2                 25                -1
     3                 20                -2
     4                 15                 2
     5                 25                 2
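
The residual table can be regenerated, and extended with standardized residuals, as sketched below. The leverage formula h_i = 1/n + (x_i - x̄)²/Σ(x_i - x̄)² is the standard one for simple linear regression; it is an assumption here because the slides use h_i without defining it.

```python
import math

tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
b0, b1 = 10, 5  # estimated regression equation from the slides

n = len(tv_ads)
x_bar = sum(tv_ads) / n
s_xx = sum((x - x_bar) ** 2 for x in tv_ads)

predicted = [b0 + b1 * x for x in tv_ads]
residuals = [y - yh for y, yh in zip(cars_sold, predicted)]
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

# Assumed leverage formula for simple linear regression.
leverages = [1 / n + (x - x_bar) ** 2 / s_xx for x in tv_ads]
# Standardized residual: residual divided by s * sqrt(1 - h_i).
std_residuals = [e / (s * math.sqrt(1 - h)) for e, h in zip(residuals, leverages)]

for i, (yh, e, r) in enumerate(zip(predicted, residuals, std_residuals), start=1):
    print(f"{i}  predicted={yh}  residual={e:+d}  standardized={r:+.3f}")
```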

Example: Reed Auto Sales

■ Residual Plot
[Figure: "TV Ads Residual Plot", showing Residuals (vertical axis, -3 to 3) against TV Ads (horizontal axis, 0 to 4).]

Residual Analysis

■ Detecting Outliers
• An outlier is an observation that is unusual in
comparison with the other data.
• Minitab classifies an observation as an outlier if its
standardized residual value is < -2 or > +2.
• This standardized residual rule sometimes fails to
identify an unusually large observation as being
an outlier.
• This rule's shortcoming can be circumvented by
using studentized deleted residuals.
• For an outlying observation, the absolute value of the
ith studentized deleted residual will be larger than the
absolute value of the ith standardized residual.
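
The two outlier rules can be compared on the Reed Auto data. The sketch below is hedged on two points not given in the slides: the leverage formula h_i = 1/n + (x_i - x̄)²/Σ(x_i - x̄)², and the shortcut t_i = r_i·sqrt((n - p - 1)/(n - p - r_i²)) for converting a standardized residual r_i into a studentized deleted residual, with p = 2 estimated parameters. Function names are illustrative.

```python
import math


def standardized_residuals(x, y):
    """Fit y on x by least squares and return the standardized residuals."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s = math.sqrt(sum(e ** 2 for e in resid) / (n - 2))
    # Assumed leverage formula: h_i = 1/n + (x_i - x_bar)^2 / s_xx
    return [e / (s * math.sqrt(1 - (1 / n + (xi - x_bar) ** 2 / s_xx)))
            for e, xi in zip(resid, x)]


def studentized_deleted(std_resid, p=2):
    """Convert standardized residuals using the assumed shortcut formula."""
    n = len(std_resid)
    return [r * math.sqrt((n - p - 1) / (n - p - r ** 2)) for r in std_resid]


tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]

std = standardized_residuals(tv_ads, cars_sold)
deleted = studentized_deleted(std)
# Flag observations (1-based) whose standardized residual is beyond +/-2.
outliers = [i for i, r in enumerate(std, start=1) if abs(r) > 2]

for i, (r, t) in enumerate(zip(std, deleted), start=1):
    print(f"{i}  standardized={r:+.3f}  studentized deleted={t:+.3f}")
print("flagged as outliers:", outliers)
```

None of the five observations is flagged under the ±2 rule. Under the shortcut formula, the deleted residual exceeds the standardized one in absolute value only when the standardized residual is itself larger than 1 in absolute value, which is the situation that matters for outliers.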

SPSS Practice
■ Data from a sample of 5 previous sales are shown below.

Number of TV Ads    Number of Cars Sold
       1                    14
       3                    24
       2                    18
       1                    17
       3                    27

Steps

■ Analyze, Regression, Linear
■ Move the dependent and independent variables into
their respective boxes
■ Statistics (tick the options needed), Continue
■ Save, beta, standardized beta, Continue
■ OK
■ Read the output tables
■ Draw the model
■ Implications

End of Chapter 14
