Basic Business Statistics: Introduction To Multiple Regression
(9th Edition)
Chapter 14
Introduction to Multiple
Regression
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + \varepsilon_i$
Dependent (Response) variable: $Y_i$
Independent (Explanatory) variables: $X_{1i}, X_{2i}, \dots, X_{ki}$
[Figure: response plane over the X1 and X2 axes, with an observation at the point (X1i, X2i)]
$\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 X_{2i}$
◼ Slope (bj )
◼ The average value of Y is estimated to change by
bj for each 1 unit increase in Xj , holding all other
variables constant (ceteris paribus)
◼ Example: If b1 = -2, then fuel oil usage (Y) is
expected to decrease by an estimated 2 gallons for
each 1 degree increase in temperature (X1), given
the inches of insulation (X2)
◼ Y-Intercept (b0)
◼ The estimated average value of Y when all Xj = 0
$\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \dots + b_k X_{ki}$
Excel Output: Coefficients
Intercept      562.1510092
X Variable 1   -5.436580588
X Variable 2   -20.01232067
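As a minimal sketch of how the fitted equation is used, the snippet below plugs the Excel coefficients into the estimated regression equation. It assumes X Variable 1 is temperature (degrees) and X Variable 2 is insulation (inches), as in the fuel oil example; the function name and the input values are hypothetical and chosen only for illustration.

```python
# Coefficients copied from the Excel output above.
B0 = 562.1510092    # intercept
B1 = -5.436580588   # X Variable 1 (assumed: temperature, degrees)
B2 = -20.01232067   # X Variable 2 (assumed: insulation, inches)


def predict_oil(temp, insulation):
    """Estimated average fuel oil usage (gallons) from the fitted equation."""
    return B0 + B1 * temp + B2 * insulation


# Hypothetical inputs, not taken from the slides.
usage = predict_oil(temp=30, insulation=6)
print(round(usage, 2))

# Raising temperature by 1 degree with insulation held fixed changes the
# prediction by exactly the slope b1.
slope_effect = predict_oil(31, 6) - predict_oil(30, 6)
print(round(slope_effect, 4))
```

The second print illustrates the slope interpretation: with insulation held constant, the predicted usage moves by b1 per degree.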
$r^2 = \frac{SSR}{SSR + SSE}$
[Venn diagram: circles for Oil and Temp; labels: variation of oil, SSR]
© 2004 Prentice-Hall, Inc. Chap 14-12
Venn Diagrams and
Explanatory Power of Regression
Variation NOT explained by Temp nor Insulation (SSE)
Overlapping variation in both Temp and Insulation is used in explaining the variation in Oil but NOT in the estimation of $\beta_1$ nor $\beta_2$
[Venn diagram: circles for Oil, Temp, and Insulation]
Coefficient of
Multiple Determination
◼ Proportion of Total Variation in Y Explained by
All X Variables Taken Together
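◼ $r^2_{Y \bullet 12 \cdots k} = \frac{SSR}{SST} = \frac{\text{Explained Variation}}{\text{Total Variation}}$
◼ With two explanatory variables: $r^2_{Y \bullet 12} = \frac{SSR}{SSR + SSE}$
[Venn diagram: circles for Oil, Temp, and Insulation]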
Adjusted Coefficient of Multiple
Determination
◼ Proportion of Variation in Y Explained by All the X
Variables Adjusted for the Sample Size and the
Number of X Variables Used
◼ $r^2_{adj} = 1 - \left[ \left( 1 - r^2_{Y \bullet 12 \cdots k} \right) \frac{n - 1}{n - k - 1} \right]$, where k is the number of X variables
◼ Penalizes excessive use of independent variables
◼ Smaller than $r^2_{Y \bullet 12 \cdots k}$
◼ Useful in comparing among models
◼ Can decrease if an insignificant new X variable is
added to the model
Coefficient of Multiple
Determination
Excel Output: Regression Statistics
Multiple R          0.982654757
R Square            0.965610371
Adjusted R Square   0.959878766
Standard Error      26.01378323
Observations        15
R Square: $r^2_{Y \bullet 12} = \frac{SSR}{SST}$
Adjusted R Square: adjusted $r^2$
❑ reflects the number of explanatory variables and sample size
❑ is smaller than $r^2$
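The adjusted value in the Excel output can be checked by hand. The sketch below applies the adjusted coefficient of multiple determination formula to the reported R Square, with n = 15 observations and k = 2 explanatory variables (temperature and insulation); the function name is illustrative only.

```python
def adjusted_r2(r2, n, k):
    """Adjusted r^2 = 1 - (1 - r^2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Values read from the Excel regression statistics above.
r2 = 0.965610371
n = 15   # observations
k = 2    # explanatory variables

r2_adj = adjusted_r2(r2, n, k)
print(round(r2_adj, 6))   # compare with Excel's Adjusted R Square
```

The result agrees with Excel's Adjusted R Square of 0.959878766 up to rounding of the inputs, and it is smaller than the unadjusted value.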
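◼ Multiple Regression:
◼ Oil = $\beta_0 + \beta_1$ Temp $+ \beta_2$ Insulation $+ \varepsilon$
Coefficients (regression on Temp alone)
Intercept    436.4382299
Temp         -5.462207697
Coefficients (regression on Insulation alone)
Intercept    345.3783784
Insulation   -20.35027027
The three e's are different
Temp coefficient: -5.4366 (multiple regression) vs. -5.4622 (Temp alone)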
Simple and Multiple Regression
Compared: r2
Oil = b0 + b1 Temp + b2 Insulation + e
Regression Statistics
Multiple R          0.982654757
R Square            0.965610371
Adjusted R Square   0.959878766
Standard Error      26.01378323
Observations        15
$0.96561 \neq 0.75645 + 0.21630 \; (= 0.97275)$
◼ Residuals vs. $\hat{Y}$
◼ May need to transform Y variable
◼ Residuals vs. X1
◼ May need to transform X1 variable
◼ Residuals vs. X2
◼ May need to transform X2 variable
◼ Residuals vs. time (if the data are a time series)
◼ May have autocorrelation
[Figure: Insulation residual plot, residuals vs. insulation. No discernible pattern.]
◼ Hypotheses:
◼ H0: $\beta_1 = \beta_2 = \dots = \beta_k = 0$ (No linear relationship)
◼ H1: At least one $\beta_j \neq 0$ (At least one independent
variable affects Y)
◼ The Null Hypothesis is a Very Strong Statement
◼ The Null Hypothesis is Almost Always Rejected
Testing for Overall Significance
(continued)
◼ Test Statistic:
◼ $F = \frac{MSR}{MSE} = \frac{SSR / k}{SSE / (n - k - 1)}$
◼ Where F has k numerator and (n-k-1)
denominator degrees of freedom
[Excel output callout: F test statistic = MSR / MSE]
Test for Overall Significance:
Example Solution
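H0: $\beta_1 = \beta_2 = \dots = \beta_k = 0$
H1: At least one $\beta_j \neq 0$
$\alpha = .05$
df = 2 and 12
Critical Value: 3.89 (review how to find critical value)
Test Statistic: F = 168.47 (Excel Output)
Decision: Reject at $\alpha = 0.05$.
Conclusion: There is evidence that at least one independent variable affects Y.
[Figure: F distribution with the rejection region of area $\alpha = 0.05$ to the right of the critical value 3.89]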
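The reported F statistic can be reproduced from the R Square in the Excel output. The sketch below uses an algebraic rearrangement that is not on the slides: dividing the numerator and denominator of F = (SSR / k) / (SSE / (n - k - 1)) by SST gives F = (r2 / k) / ((1 - r2) / (n - k - 1)). The critical value 3.89 is taken from the example above rather than computed.

```python
def overall_f(r2, n, k):
    """Overall F statistic written in terms of r^2 (SSR/SST)."""
    msr_over_sst = r2 / k                    # (SSR / k) / SST
    mse_over_sst = (1 - r2) / (n - k - 1)    # (SSE / (n - k - 1)) / SST
    return msr_over_sst / mse_over_sst


r2 = 0.965610371   # R Square from the Excel output
n, k = 15, 2       # observations, explanatory variables

f_stat = overall_f(r2, n, k)
critical_value = 3.89          # from the example, df = 2 and 12, alpha = .05
reject_h0 = f_stat > critical_value

print(round(f_stat, 2), reject_h0)
```

The computed statistic matches the Excel value of 168.47 to two decimals and falls far into the rejection region.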
Test for Significance:
Individual Variables
$t = \frac{b_j}{S_{b_j}}$
[Excel output callout: t Test Statistic for X2 (Insulation)]
◼ $r^2_{Yj \bullet \text{all others}} = \frac{SSR(X_j \mid \text{all others})}{SST - SSR(\text{all}) + SSR(X_j \mid \text{all others})}$
◼ With two explanatory variables: $r^2_{Y1 \bullet 2} = \frac{SSR(X_1 \mid X_2)}{SST - SSR(X_1, X_2) + SSR(X_1 \mid X_2)}$
[Venn diagram: circles for Oil, Temp, and Insulation; callouts: "the part of both x1 and x2", "only x1 not x2"]
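A rough numerical check of the partial determination formula is possible from the r2 values quoted earlier in the chapter. The sketch below rests on two assumptions that should be verified against the original slides: that 0.21630 is the r2 of the Insulation-only regression (with 0.75645 belonging to Temp alone), and that SSR(X1 | X2) = SSR(X1, X2) - SSR(X2). Dividing the formula through by SST then gives r2(Y1.2) = (r2_full - r2_x2) / (1 - r2_x2).

```python
def partial_r2(r2_full, r2_other):
    """Coefficient of partial determination of Y with one variable,
    holding the other constant, expressed through r^2 values.

    Assumes SSR(X1 | X2) = SSR(X1, X2) - SSR(X2), so that
    SST - SSR(X1, X2) + SSR(X1 | X2) = SST - SSR(X2).
    """
    return (r2_full - r2_other) / (1 - r2_other)


r2_full = 0.96561         # Temp and Insulation together
r2_insulation = 0.21630   # assumed: Insulation-only regression
r2_temp = 0.75645         # assumed: Temp-only regression

r2_y1_2 = partial_r2(r2_full, r2_insulation)   # Temp, given Insulation
r2_y2_1 = partial_r2(r2_full, r2_temp)         # Insulation, given Temp

print(round(r2_y1_2, 4), round(r2_y2_1, 4))
```

Under these assumptions, temperature explains most of the variation in oil usage that insulation leaves unexplained.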
Coefficient of Partial
Determination in PHStat
$SSR(X_1 \text{ and } X_3 \mid X_2) = SSR(X_1, X_2 \text{ and } X_3) - SSR(X_2)$
◼ $SSR(X_1, X_2 \text{ and } X_3)$: from ANOVA section of regression for $\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + b_3 X_{3i}$
◼ $SSR(X_2)$: from ANOVA section of regression for $\hat{Y}_i = b_0 + b_2 X_{2i}$
Testing Portions of Model
◼ Examines the Contribution of a Subset Xs of
Explanatory Variables to the Relationship with Y
◼ Null Hypothesis:
◼ Variables in the subset do not improve the model
significantly when all other variables are included
◼ Alternative Hypothesis:
◼ At least one variable in the subset is significant
when all other variables are included
◼ Hypotheses:
◼ H0 : Variable Xj does not significantly improve
the model given all others included
◼ H1 : Variable Xj significantly improves the
model given all others included
◼ Test Statistic:
◼ $F = \frac{SSR(X_j \mid \text{all others})}{MSE(\text{all})}$
◼ with df = 1 and (n-k-1)
◼ m = 1 here (a single variable is tested); when a subset of several variables is tested, m ≠ 1
Testing Portions of Model:
Example
Test at the $\alpha = .05$ level to determine if the variable of average temperature significantly improves the model, given that insulation is included.
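A hedged sketch of this test, using only r2 values quoted earlier in the chapter: dividing SSR(X1 | X2) / MSE(all) through by SST gives F = (r2_full - r2_x2) / ((1 - r2_full) / (n - k - 1)). It assumes that 0.21630 is the r2 of the Insulation-only regression and that SSR(X1 | X2) = SSR(X1, X2) - SSR(X2). The critical value 4.75 for df = 1 and 12 is a standard F table value, not a number from the slides.

```python
def partial_f(r2_full, r2_reduced, n, k):
    """Partial F statistic for one added variable, in terms of r^2.

    Numerator:   SSR(Xj | all others) / SST = r2_full - r2_reduced
    Denominator: MSE(all) / SST = (1 - r2_full) / (n - k - 1)
    """
    return (r2_full - r2_reduced) / ((1 - r2_full) / (n - k - 1))


r2_full = 0.96561         # Temp and Insulation together
r2_insulation = 0.21630   # assumed: Insulation-only regression
n, k = 15, 2              # observations, variables in the full model

f_temp = partial_f(r2_full, r2_insulation, n, k)
critical_value = 4.75     # F table, alpha = .05, df = 1 and 12 (not from the slides)

print(round(f_temp, 1), f_temp > critical_value)
```

Because the r2 inputs are rounded to five decimals, the statistic is only approximate, but it is far above the critical value, so temperature would be judged to improve the model significantly given insulation.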
[Figure: Y vs. X1 (Square footage); two parallel lines with the same slope b1 and different intercepts, b0 and b0 + b2]
Interpretation of the Dummy-
Variable Coefficient (with 2 Levels)
Example:
$\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 X_{2i} = 20 + 5 X_{1i} + 6 X_{2i}$
Y: Annual salary of college graduate in thousands of dollars
X1: GPA
X2: 0 = non-business degree, 1 = business degree
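To make the dummy-variable coefficient concrete, the sketch below evaluates the example equation for two graduates with the same GPA, one with and one without a business degree. The function name and the GPA of 3.0 are hypothetical; the coefficients 20, 5, and 6 come from the example.

```python
def predicted_salary(gpa, business_degree):
    """Predicted annual salary in thousands of dollars: 20 + 5*X1 + 6*X2."""
    x2 = 1 if business_degree else 0   # dummy variable
    return 20 + 5 * gpa + 6 * x2


gpa = 3.0   # hypothetical GPA, same for both graduates
business = predicted_salary(gpa, business_degree=True)
non_business = predicted_salary(gpa, business_degree=False)
gap = business - non_business

print(business, non_business, gap)
```

The gap equals b2 = 6 regardless of the GPA chosen: holding GPA constant, a business degree is associated with an estimated 6 thousand dollars more in average annual salary.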
◼ Given:
◼ $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{1i} X_{2i} + \varepsilon_i$
◼ With X2 = 0: Y = 1 + 2X1 + 3(0) + 4X1(0) = 1 + 2X1
[Figure: Y (ticks at 0, 4, 8) vs. X1 (from 0 to 1.5)]
Effect (slope) of X1 on Y depends on X2 value
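The dependence of the X1 slope on X2 can be sketched numerically. The coefficients below (1, 2, 3, 4) are read from the worked line Y = 1 + 2X1 + 3(0) + 4X1(0); treating them as the full model Y = 1 + 2X1 + 3X2 + 4X1X2 is an assumption, and the function names are illustrative.

```python
def y_interaction(x1, x2):
    """Assumed example model: Y = 1 + 2*X1 + 3*X2 + 4*X1*X2."""
    return 1 + 2 * x1 + 3 * x2 + 4 * x1 * x2


def slope_of_x1(x2):
    """Change in Y per unit increase in X1 at a fixed X2: 2 + 4*X2."""
    return y_interaction(1, x2) - y_interaction(0, x2)


slope_when_x2_is_0 = slope_of_x1(0)   # Y = 1 + 2*X1
slope_when_x2_is_1 = slope_of_x1(1)   # Y = 4 + 6*X1

print(slope_when_x2_is_0, slope_when_x2_is_1)
```

With no interaction term the two slopes would be equal; here they differ by the interaction coefficient, 4, for each unit of X2.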
Interaction Regression Model
Worksheet
Case, i   Yi   X1i   X2i   X1i X2i
1         1    1     3     3
2         4    8     5     40
3         1    3     2     6
4         3    5     6     30
:         :    :     :     :
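The last worksheet column is simply the product of the two X columns, computed case by case before the regression is run. A minimal sketch using the four cases shown above (the variable names are illustrative):

```python
# (Yi, X1i, X2i) for the four cases listed in the worksheet.
cases = [
    (1, 1, 3),
    (4, 8, 5),
    (1, 3, 2),
    (3, 5, 6),
]

# Build the interaction column X1i * X2i for each case.
interaction_column = [x1 * x2 for (_, x1, x2) in cases]

print(interaction_column)
```

The interaction column is then supplied to the regression as a third explanatory variable alongside X1 and X2.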
Female + 2 + 3
Male + 1 + 1 + 1
+ 2 + 4 + 3 + 5
◼ Hypotheses
◼ H0 : $\beta_j = 0$ (Xj is not significant)
◼ H1 : $\beta_j \neq 0$ (Xj is significant)
◼ Test Statistic
◼ The Wald statistic is normally distributed
◼ A two-tail test with left and right-tail rejection
regions