10th Edition
Chapter 13
Basic Business Statistics, 10e © 2006 Prentice-Hall, Inc. Chap 13-6
Types of Relationships
(continued)
[Figure: scatter plots of Y against X contrasting strong relationships and weak relationships]
Types of Relationships
(continued)
[Figure: scatter plot showing no relationship between X and Y]
Simple Linear Regression
Model
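$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$
where
$Y_i$ = dependent variable
$\beta_0$ = population Y intercept
$\beta_1$ = population slope coefficient
$X_i$ = independent variable
$\varepsilon_i$ = random error term
$\beta_0 + \beta_1 X_i$ is the linear component and $\varepsilon_i$ is the random error component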
[Figure: the line Y = β0 + β1X with slope = β1 and intercept = β0; at a given Xi, the observed value of Y differs from the predicted value of Y by the random error εi for this Xi value]
Simple Linear Regression
Equation (Prediction Line)
The simple linear regression equation provides an
estimate of the population regression line
$$\hat{Y}_i = b_0 + b_1 X_i$$
where
$\hat{Y}_i$ = estimated (or predicted) Y value for observation i
$b_0$ = estimate of the regression intercept
$b_1$ = estimate of the regression slope
$X_i$ = value of X for observation i
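As a minimal sketch of how b0 and b1 might be obtained, the snippet below applies the usual least-squares formulas (b1 = SSXY / SSX and b0 = Ȳ - b1·X̄). These formulas are assumed here; they do not appear in the surviving slide text. The function name `fit_line` and the toy data are hypothetical.

```python
def fit_line(x, y):
    """Return (b0, b1) for the least-squares line Y-hat = b0 + b1 * X."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squares of X about their means
    ssxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ssxy / ssx          # estimated slope
    b0 = y_bar - b1 * x_bar  # estimated intercept
    return b0, b1

# Hypothetical toy data lying exactly on Y = 1 + 2X (not from the slides)
b0, b1 = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(b0, b1)
```

With data that fall exactly on a line, the fitted intercept and slope recover that line; real data would leave nonzero residuals.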
[Figure: scatter plot of House Price ($1000s) against Square Feet]
ANOVA
             df   SS           MS           F         Significance F
Regression   1    18934.9348   18934.9348   11.0848   0.01039
Residual     8    13665.5652   1708.1957
Total        9    32600.5000
[Figure: fitted regression line of House Price ($1000s) against Square Feet; slope = 0.10977, intercept = 98.248]
$$\hat{Y} = 98.25 + 0.1098(2000) = 317.85$$
The predicted price for a house with 2000
square feet is 317.85 ($1,000s) = $317,850
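The arithmetic above can be checked with a short sketch. It uses the rounded coefficients from the slide (98.25 and 0.1098); the function name `predict_price` is hypothetical.

```python
B0 = 98.25   # rounded intercept from the slide, in $1000s
B1 = 0.1098  # rounded slope from the slide, in $1000s per square foot

def predict_price(square_feet):
    """Predicted house price in $1000s for a given size in square feet."""
    return B0 + B1 * square_feet

price = predict_price(2000)
print(round(price, 2), "thousand dollars")
```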
Interpolation vs. Extrapolation
When using a regression model for prediction,
only predict within the relevant range of data
[Figure: scatter plot of House Price ($1000s) against Square Feet, marking the relevant range for interpolation]
Do not try to extrapolate beyond the range of observed X’s
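One way to enforce this rule in code is to refuse predictions outside the observed range of X. The sketch below is illustrative only: the function name and the range limits of 1000 and 2500 square feet are hypothetical placeholders, since the observed minimum and maximum are not listed on this slide.

```python
def predict_within_range(x, b0, b1, x_min, x_max):
    """Predict Y only when x lies inside the observed range [x_min, x_max]."""
    if not (x_min <= x <= x_max):
        raise ValueError(
            f"X = {x} is outside the relevant range [{x_min}, {x_max}]; "
            "prediction would be extrapolation"
        )
    return b0 + b1 * x

# Hypothetical range limits; replace with the smallest and largest observed X
X_MIN, X_MAX = 1000, 2500

print(predict_within_range(2000, 98.25, 0.1098, X_MIN, X_MAX))
try:
    predict_within_range(5000, 98.25, 0.1098, X_MIN, X_MAX)
except ValueError as err:
    print("Refused:", err)
```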
Measures of Variation
Coefficient of Determination, r²
The coefficient of determination is the portion
of the total variation in the dependent variable
that is explained by variation in the
independent variable
The coefficient of determination is also called
r-squared and is denoted as r²
$$r^2 = \frac{SSR}{SST} = \frac{\text{regression sum of squares}}{\text{total sum of squares}}$$
note: $0 \le r^2 \le 1$
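As a quick check of this ratio, the sketch below plugs in the sums of squares from the ANOVA table for the house price example (SSR = 18934.9348, SST = 32600.5000). The variable names are hypothetical.

```python
SSR = 18934.9348  # regression sum of squares, from the ANOVA table
SST = 32600.5000  # total sum of squares, from the ANOVA table

r_squared = SSR / SST
print(round(r_squared, 5))
```

The result agrees with the R Square value of 0.58082 reported in the Excel output: about 58% of the variation in house prices is explained by variation in square feet.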
[Figure: example scatter plots against X with r² = 1]
Examples of Approximate
r² Values
r² = 0
No linear relationship
between X and Y:
$$S_{YX} = \sqrt{\frac{SSE}{n-2}} = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n-2}}$$
Where
SSE = error sum of squares
n = sample size
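A short sketch of this calculation for the house price example, assuming SSE = 13665.5652 (the Residual sum of squares in the ANOVA table) and n = 10 observations. The variable names are hypothetical.

```python
import math

SSE = 13665.5652  # error (residual) sum of squares, from the ANOVA table
n = 10            # sample size

s_yx = math.sqrt(SSE / (n - 2))
print(round(s_yx, 5))
```

This matches the Standard Error of 41.33032 in the Excel regression statistics, and SSE / (n - 2) is the Residual MS of 1708.1957.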
ANOVA
             df   SS           MS           F         Significance F
Regression   1    18934.9348   18934.9348   11.0848   0.01039
Residual     8    13665.5652   1708.1957
Total        9    32600.5000
[Figure: scatter of Y about the regression line for small $S_{YX}$ versus large $S_{YX}$]
[Figure: Y against x and residuals against x for two cases: Not Linear and Linear]
Residual Analysis for
Independence
[Figure: residuals plotted against X for two cases: Not Independent and Independent]
[Figure: Percent plotted against Residual, with Residual running from -3 to 3]
[Figure: Y against x and residuals against x for two cases: Non-constant variance and Constant variance]
[Figure: Residuals plotted against Time (t)]
Here, residuals show a cyclic pattern, not
random. Cyclical patterns are a sign of
positive autocorrelation
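$$D = \frac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2}$$
D less than 2 may signal positive
autocorrelation; D greater than 2 may
signal negative autocorrelation
[Figure: decision scale for D marked at 0, dL, dU and 2]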
Testing for Positive
Autocorrelation
(continued)
[Figure: Sales against Time with fitted line y = 30.65 + 4.7038x, R² = 0.8976]
Is there autocorrelation?
Excel/PHStat output:
Durbin-Watson Calculations
Sum of Squared Difference of Residuals   3296.18
Sum of Squared Residuals                 3279.98
Durbin-Watson Statistic                  1.00494
$$D = \frac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2} = \frac{3296.18}{3279.98} = 1.00494$$
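The statistic can be sketched as a small function over a list of residuals ordered in time. The function name `durbin_watson` and the alternating toy residuals are hypothetical; the final lines only reproduce the ratio of the two sums reported in the Excel/PHStat output.

```python
def durbin_watson(residuals):
    """Durbin-Watson D: sum of squared successive differences over sum of squares."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals that alternate in sign, so D comes out above 2
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))

# Ratio of the two sums reported in the PHStat output for the sales data
d_sales = 3296.18 / 3279.98
print(round(d_sales, 5))
```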
Testing for Positive
Autocorrelation
(continued)
Here, n = 25 and there is k = 1 independent variable
Using the Durbin-Watson table, dL = 1.29 and dU = 1.45
D = 1.00494 < dL = 1.29, so reject H0 and conclude that
significant positive autocorrelation exists
Therefore the linear model is not the appropriate model
to forecast sales
Decision: reject H0 since
D = 1.00494 < dL
$$S_{b_1} = \frac{S_{YX}}{\sqrt{SSX}} = \frac{S_{YX}}{\sqrt{\sum (X_i - \bar{X})^2}}$$
where:
$S_{b_1}$ = Estimate of the standard error of the least squares slope
$S_{YX} = \sqrt{\frac{SSE}{n-2}}$ = Standard error of the estimate
Excel Output
Regression Statistics
Multiple R 0.76211
R Square 0.58082
Adjusted R Square 0.52842
Standard Error 41.33032
Observations 10
$S_{b_1} = 0.03297$
ANOVA
             df   SS           MS           F         Significance F
Regression   1    18934.9348   18934.9348   11.0848   0.01039
Residual     8    13665.5652   1708.1957
Total        9    32600.5000
$$t = \frac{b_1 - \beta_1}{S_{b_1}}, \qquad \text{d.f.} = n - 2$$
where:
$b_1$ = regression slope coefficient
$\beta_1$ = hypothesized slope
$S_{b_1}$ = standard error of the slope
Inference about the Slope:
t Test
(continued)
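H0: β1 = 0
H1: β1 ≠ 0
From Excel output:
              Coefficients   Standard Error   t Stat    P-value
Intercept     98.24833       58.03348         1.69296   0.12892
Square Feet   0.10977        0.03297          3.32938   0.01039
In the Square Feet row, the coefficient is $b_1$ and the standard error is $S_{b_1}$
$$t = \frac{b_1 - \beta_1}{S_{b_1}} = \frac{0.10977 - 0}{0.03297} = 3.32938$$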
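A brief sketch of the test statistic, using b1 = 0.10977 and a standard error of 0.03297 from the Excel output. The two-tail critical value 2.3060 (8 degrees of freedom, α = .05) is the t-table value quoted elsewhere in these slides; the variable names are hypothetical.

```python
b1 = 0.10977         # estimated slope, from the Excel output
beta1_h0 = 0.0       # hypothesized slope under H0
s_b1 = 0.03297       # standard error of the slope, from the Excel output
t_critical = 2.3060  # two-tail critical value for 8 d.f. at alpha = .05

t_stat = (b1 - beta1_h0) / s_b1
reject_h0 = abs(t_stat) > t_critical
print(round(t_stat, 3), "reject H0" if reject_h0 else "do not reject H0")
```

The computed value differs from Excel's 3.32938 only in the last digits because the inputs are rounded.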
d.f. = 10 - 2 = 8
[Figure: two-tail rejection regions with α/2 = .025 in each tail; reject H0 below -tα/2 or above tα/2, do not reject H0 in between]
Decision:
Reject H0
Conclusion:
There is sufficient evidence
$$F = \frac{MSR}{MSE}, \qquad MSR = \frac{SSR}{k}, \qquad MSE = \frac{SSE}{n - k - 1}$$
where F follows an F distribution with k numerator and (n – k - 1)
denominator degrees of freedom
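For the house price example (k = 1, n = 10), the mean squares and the F ratio can be recomputed from the ANOVA sums of squares, as in this sketch. The variable names are hypothetical.

```python
SSR = 18934.9348  # regression sum of squares, from the ANOVA table
SSE = 13665.5652  # error (residual) sum of squares, from the ANOVA table
k = 1             # number of independent variables
n = 10            # sample size

msr = SSR / k
mse = SSE / (n - k - 1)
f_stat = msr / mse
print(round(msr, 4), round(mse, 4), round(f_stat, 4))
```

With a single independent variable, this F value equals the square of the slope t statistic (3.32938² ≈ 11.0848), so the two tests lead to the same conclusion.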
Test statistic
$$t = \frac{r - \rho}{\sqrt{\dfrac{1 - r^2}{n - 2}}}$$
(with n – 2 degrees of freedom)
where
$r = +\sqrt{r^2}$ if $b_1 > 0$
$r = -\sqrt{r^2}$ if $b_1 < 0$
Example: House Prices
Is there evidence of a linear relationship
between square feet and house price at
the .05 level of significance?
$$t = \frac{r - \rho}{\sqrt{\dfrac{1 - r^2}{n - 2}}} = \frac{.762 - 0}{\sqrt{\dfrac{1 - .762^2}{10 - 2}}} = 3.329$$
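A minimal sketch of this calculation with r = .762 and n = 10 follows. The function name `correlation_t` is hypothetical. Because r is rounded to three decimals here, the result lands slightly below the slide's 3.329.

```python
import math

def correlation_t(r, n, rho=0.0):
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return (r - rho) / math.sqrt((1 - r ** 2) / (n - 2))

t_stat = correlation_t(0.762, 10)
print(round(t_stat, 3))
```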
Example: Test Solution
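$$t = \frac{r - \rho}{\sqrt{\dfrac{1 - r^2}{n - 2}}} = \frac{.762 - 0}{\sqrt{\dfrac{1 - .762^2}{10 - 2}}} = 3.329$$
d.f. = 10 - 2 = 8
[Figure: two-tail rejection regions with α/2 = .025 in each tail; critical values -tα/2 = -2.3060 and tα/2 = 2.3060; the test statistic 3.329 falls in the upper rejection region]
Decision:
Reject H0
Conclusion:
There is evidence of a linear association
at the 5% level of significance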
Estimating Mean Values and
Predicting Individual Values
Goal: Form intervals around Y to express
uncertainty about the value of Y for a given Xi
[Figure: prediction line Ŷ = b0 + b1Xi plotted against X; at a given Xi, it shows the confidence interval for the mean of Y and the prediction interval for an individual Y]
Confidence Interval for
the Average Y, Given X
Confidence interval estimate for the
mean value of Y given a particular Xi
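Confidence interval for $\mu_{Y|X=X_i}$:
$$\hat{Y} \pm t_{n-2} S_{YX} \sqrt{h_i}$$
where
$$h_i = \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum (X_i - \bar{X})^2}$$
Size of interval varies according
to distance away from mean, $\bar{X}$
Prediction interval for an individual Y, given Xi:
$$\hat{Y} \pm t_{n-2} S_{YX} \sqrt{1 + h_i}$$
For the house price example, the confidence interval for the mean is
$$\hat{Y} \pm t_{n-2} S_{YX} \sqrt{\frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum (X_i - \bar{X})^2}} = 317.85 \pm 37.12$$
and the prediction interval for an individual value is
$$\hat{Y} \pm t_{n-2} S_{YX} \sqrt{1 + \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum (X_i - \bar{X})^2}} = 317.85 \pm 102.28$$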
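The two half-widths can be related with a short sketch. The critical value 2.3060 (8 degrees of freedom) and the standard error of the estimate 41.33032 come from the slides. The value of h_i is an assumption: it is back-solved so that the confidence-interval half-width comes out near 37.12, because the underlying X data are not listed here. The function name is hypothetical.

```python
import math

T_CRIT = 2.3060  # t critical value for n - 2 = 8 d.f., from the slides
S_YX = 41.33032  # standard error of the estimate, from the Excel output
Y_HAT = 317.85   # predicted price ($1000s) at X = 2000, from the slides
H_I = 0.1517     # ASSUMPTION: back-solved from the reported half-width 37.12

def interval_half_widths(t_crit, s_yx, h_i):
    """Half-widths of the confidence interval (mean Y) and prediction interval."""
    ci = t_crit * s_yx * math.sqrt(h_i)
    pi = t_crit * s_yx * math.sqrt(1 + h_i)
    return ci, pi

ci_half, pi_half = interval_half_widths(T_CRIT, S_YX, H_I)
print(f"CI: {Y_HAT} +/- {ci_half:.2f}")
print(f"PI: {Y_HAT} +/- {pi_half:.2f}")
```

The extra 1 under the square root is what makes the prediction interval for an individual Y wider than the confidence interval for the mean of Y at the same Xi.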
In Excel, use
PHStat | regression | simple linear regression …
Check the
“confidence and prediction interval for X=”
box and enter the X-value and confidence level
desired