Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc. Chap 13-1
The Multiple Regression Model
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k + \varepsilon$
Multiple Regression Equation
$\hat{y}_i = b_0 + b_1 x_{1i} + b_2 x_{2i} + \cdots + b_k x_{ki}$
In this chapter we will always use a computer to obtain the
regression slope coefficients and other regression
summary measures.
Two-variable model:
$\hat{y} = b_0 + b_1 x_1 + b_2 x_2$
[Figure: the fitted two-variable model drawn over axes x1, x2, and y]
Standard Multiple Regression
Assumptions
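No set of constants $c_0, c_1, \dots, c_K$, not all zero, satisfies
$c_0 + c_1 x_{1i} + c_2 x_{2i} + \cdots + c_K x_{Ki} = 0$ for every observation $i$
(This is the property of no linear relation for the Xj’s)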
Example:
2 Independent Variables
A distributor of frozen dessert pies wants to
evaluate factors thought to influence demand
Pie Sales Example
Multiple regression equation:
Sales = b0 + b1 (Price) + b2 (Advertising)

Week   Pie Sales   Price ($)   Advertising ($100s)
  1       350        5.50           3.3
  2       460        7.50           3.3
  3       350        8.00           3.0
  4       430        8.00           4.5
  5       350        6.80           3.0
  6       380        7.50           4.0
  7       430        4.50           3.0
  8       470        6.40           3.7
  9       450        7.00           3.5
 10       490        5.00           4.0
 11       340        7.20           3.5
 12       300        7.90           3.2
 13       440        5.90           4.0
 14       450        5.00           3.5
 15       300        7.00           2.7
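The slides obtain the coefficients from Excel. As a minimal sketch of what that computation does, the least-squares estimates for this data set can be reproduced in plain Python. The function name `fit_two_predictors` and the closed-form solution of the centered normal equations are illustrative choices, not the textbook's procedure.

```python
# Pie sales data from the table: weekly sales, price ($), advertising ($100s).
sales = [350, 460, 350, 430, 350, 380, 430, 470, 450, 490, 340, 300, 440, 450, 300]
price = [5.50, 7.50, 8.00, 8.00, 6.80, 7.50, 4.50, 6.40, 7.00, 5.00, 7.20, 7.90, 5.90, 5.00, 7.00]
advertising = [3.3, 3.3, 3.0, 4.5, 3.0, 4.0, 3.0, 3.7, 3.5, 4.0, 3.5, 3.2, 4.0, 3.5, 2.7]


def fit_two_predictors(y, x1, x2):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 (hypothetical helper)."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    # Sums of squares and cross-products about the means.
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (v - my) for a, v in zip(x1, y))
    s2y = sum((b - m2) * (v - my) for b, v in zip(x2, y))
    # Solve the 2x2 normal equations for the slopes, then recover the intercept.
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


b0, b1, b2 = fit_two_predictors(sales, price, advertising)
print(f"Sales = {b0:.3f} + ({b1:.3f})(Price) + ({b2:.3f})(Advertising)")
```

The closed form only covers two predictors; with more variables a general linear solver would be the natural replacement.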
Estimating a Multiple Linear
Regression Equation
Excel will be used to generate the coefficients
and measures of goodness of fit for multiple
regression
Excel:
Tools / Data Analysis... / Regression
PHStat:
PHStat / Regression / Multiple Regression…
Multiple Regression Output
Regression Statistics
Multiple R           0.72213
R Square             0.52148
Adjusted R Square    0.44172
Standard Error       47.46341
Observations         15

Sales = 306.526 - 24.975(Price) + 74.131(Advertising)

ANOVA         df    SS           MS           F         Significance F
Regression     2    29460.027    14730.013    6.53861   0.01201
Residual      12    27033.306     2252.776
Total         14    56493.333
The Multiple Regression Equation
Coefficient of Determination, $R^2$
Regression Statistics
Multiple R           0.72213
R Square             0.52148
Adjusted R Square    0.44172
Standard Error       47.46341
Observations         15

$R^2 = \frac{SSR}{SST} = \frac{29460.0}{56493.3} = .52148$
52.1% of the variation in pie sales is explained by the variation in
price and advertising

ANOVA         df    SS           MS           F         Significance F
Regression     2    29460.027    14730.013    6.53861   0.01201
Residual      12    27033.306     2252.776
Total         14    56493.333
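The ratio can be checked directly from the sums of squares in the ANOVA table. This is a small arithmetic sketch; the variable names are illustrative.

```python
# Sums of squares copied from the ANOVA table.
ssr = 29460.027  # regression sum of squares
sse = 27033.306  # residual (error) sum of squares
sst = 56493.333  # total sum of squares

# Share of the variation in sales explained by price and advertising.
r_squared = ssr / sst
print(f"R^2 = {r_squared:.5f}")

# The decomposition SST = SSR + SSE should hold up to rounding.
print(f"SSR + SSE = {ssr + sse:.3f}")
```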
Estimation of Error Variance
Consider the population regression model
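$Y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_K x_{Ki} + \varepsilon_i$
$s_e^2 = \frac{\sum_{i=1}^{n} e_i^2}{n-K-1} = \frac{SSE}{n-K-1}$
where $e_i = y_i - \hat{y}_i$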
ANOVA df SS MS F Significance F
Regression 2 29460.027 14730.013 6.53861 0.01201
Residual 12 27033.306 2252.776
Total 14 56493.333
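For the pie sales data the estimate follows from the Residual row of the ANOVA table. A short sketch, with illustrative variable names:

```python
import math

sse = 27033.306  # residual sum of squares from the ANOVA table
n = 15           # observations
K = 2            # independent variables (price, advertising)

# Estimated error variance: SSE divided by its degrees of freedom.
error_variance = sse / (n - K - 1)
# Its square root is the "Standard Error" line of the regression output.
standard_error = math.sqrt(error_variance)

print(f"s_e^2 = {error_variance:.3f}, s_e = {standard_error:.5f}")
```

The variance matches the MS entry in the Residual row (2252.776), and its square root matches the reported Standard Error (47.46341).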
Adjusted Coefficient of
Determination, $\bar{R}^2$
$R^2$ never decreases when a new X variable is
added to the model, even if the new variable is
not an important predictor variable
This can be a disadvantage when comparing
models
What is the net effect of adding a new variable?
We lose a degree of freedom when a new X
variable is added
Did the new X variable add enough
explanatory power to offset the loss of one
degree of freedom?
Used to correct for the fact that adding non-relevant
independent variables will still reduce the error sum of
squares
$\bar{R}^2 = 1 - \frac{SSE/(n-K-1)}{SST/(n-1)}$
(where n = sample size, K = number of independent variables)
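Applied to the pie sales output, the formula reproduces the Adjusted R Square value in the regression statistics. A minimal sketch with illustrative variable names:

```python
sse = 27033.306  # residual sum of squares from the ANOVA table
sst = 56493.333  # total sum of squares from the ANOVA table
n = 15           # sample size
K = 2            # number of independent variables

# Each sum of squares is divided by its own degrees of freedom before the ratio is taken.
adjusted_r_squared = 1 - (sse / (n - K - 1)) / (sst / (n - 1))
# Unadjusted value for comparison.
r_squared = 1 - sse / sst

print(f"R^2 = {r_squared:.5f}, adjusted R^2 = {adjusted_r_squared:.5f}")
```

The adjusted value is smaller than $R^2$ because the residual sum of squares is penalized for the two degrees of freedom used by the slope estimates.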
Coefficient of Multiple
Correlation
The coefficient of multiple correlation is the correlation
between the predicted value and the observed value of
the dependent variable
$R = r(\hat{y}, y) = \sqrt{R^2}$
It is the square root of the multiple coefficient of
determination
Used as another measure of the strength of the linear
relationship between the dependent variable and the
independent variables
Comparable to the correlation between Y and X in
simple regression
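The identity can be checked numerically on the pie sales data: fit the model, compute the fitted values, and correlate them with the observed sales. This is a sketch only; `fit_two_predictors` and `pearson` are hypothetical helpers written for this illustration.

```python
import math

sales = [350, 460, 350, 430, 350, 380, 430, 470, 450, 490, 340, 300, 440, 450, 300]
price = [5.50, 7.50, 8.00, 8.00, 6.80, 7.50, 4.50, 6.40, 7.00, 5.00, 7.20, 7.90, 5.90, 5.00, 7.00]
advertising = [3.3, 3.3, 3.0, 4.5, 3.0, 4.0, 3.0, 3.7, 3.5, 4.0, 3.5, 3.2, 4.0, 3.5, 2.7]


def fit_two_predictors(y, x1, x2):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via centered normal equations."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (v - my) for a, v in zip(x1, y))
    s2y = sum((b - m2) * (v - my) for b, v in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return my - b1 * m1 - b2 * m2, b1, b2


def pearson(u, v):
    """Sample correlation coefficient between two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


b0, b1, b2 = fit_two_predictors(sales, price, advertising)
fitted = [b0 + b1 * p + b2 * a for p, a in zip(price, advertising)]
multiple_r = pearson(fitted, sales)
print(f"R = {multiple_r:.5f}, R^2 = {multiple_r ** 2:.5f}")
```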
Evaluating Individual
Regression Coefficients
Test Statistic:
$t = \frac{b_j - 0}{s_{b_j}}$ (df = n - k - 1)
Regression Statistics
Multiple R           0.72213
R Square             0.52148
Adjusted R Square    0.44172
Standard Error       47.46341
Observations         15

t-value for Price is t = -2.306, with p-value .0398
t-value for Advertising is t = 2.855, with p-value .0145

ANOVA         df    SS           MS           F         Significance F
Regression     2    29460.027    14730.013    6.53861   0.01201
Residual      12    27033.306     2252.776
Total         14    56493.333
Example: Evaluating Individual
Regression Coefficients
$H_0: \beta_j = 0$
$H_1: \beta_j \neq 0$

From Excel output:
               Coefficients   Standard Error   t Stat      P-value
Price          -24.97509      10.83213         -2.30565    0.03979
Advertising     74.13096      25.96732          2.85478    0.01449

d.f. = 15-2-1 = 12
$\alpha = .05$
$t_{12,\,.025} = 2.1788$
[Figure: t distribution with rejection regions of area $\alpha/2 = .025$ below -2.1788 and above 2.1788; do not reject $H_0$ in between]
The test statistic for each variable falls in the rejection region (p-values < .05)
Decision:
Reject $H_0$ for each variable
Conclusion:
There is evidence that both Price and Advertising affect
pie sales at $\alpha = .05$
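The decision rule can be traced step by step with the coefficients and standard errors from the Excel output and the critical value 2.1788 quoted on the slide. A minimal sketch; the dictionary layout and names are illustrative.

```python
# (coefficient, standard error) pairs copied from the Excel output.
estimates = {
    "Price": (-24.97509, 10.83213),
    "Advertising": (74.13096, 25.96732),
}
t_critical = 2.1788  # t with 12 d.f. and .025 in each tail, as quoted on the slide

results = {}
for name, (coefficient, std_error) in estimates.items():
    # Test statistic for H0: beta_j = 0.
    t_stat = (coefficient - 0) / std_error
    # Two-tailed test: reject when |t| exceeds the critical value.
    reject = abs(t_stat) > t_critical
    results[name] = (t_stat, reject)
    print(f"{name}: t = {t_stat:.5f}, reject H0: {reject}")
```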
Confidence Interval Estimate
for the Slope
Confidence interval limits for the population slope βj
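The formula itself did not survive in this transcript. Assuming the usual interval, coefficient plus or minus the critical t value (n - K - 1 degrees of freedom) times the coefficient's standard error, a sketch for the pie sales slopes looks like this. The inputs are taken from the Excel output and the critical value 2.1788 used in the t tests; the helper name is hypothetical.

```python
def slope_interval(coefficient, std_error, t_critical):
    """Two-sided confidence limits: coefficient +/- t_critical * std_error."""
    margin = t_critical * std_error
    return coefficient - margin, coefficient + margin


t_critical = 2.1788  # t with 12 d.f. and .025 in each tail (95% confidence)

price_low, price_high = slope_interval(-24.97509, 10.83213, t_critical)
adv_low, adv_high = slope_interval(74.13096, 25.96732, t_critical)

print(f"Price:       ({price_low:.3f}, {price_high:.3f})")
print(f"Advertising: ({adv_low:.3f}, {adv_high:.3f})")
```

Neither interval contains zero, which agrees with rejecting $H_0: \beta_j = 0$ for both variables at the .05 level.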
Test on All Coefficients
F-Test for Overall Significance of the Model
Shows if there is a linear relationship between all
of the X variables considered together and Y
Use F test statistic
Hypotheses:
H0: β1 = β2 = … = βk = 0 (no linear relationship)
H1: at least one βi ≠ 0 (at least one independent
variable affects Y)
F-Test for Overall Significance
Test statistic:
$F = \frac{MSR}{s_e^2} = \frac{SSR/K}{SSE/(n-K-1)}$
ANOVA df SS MS F Significance F
Regression 2 29460.027 14730.013 6.53861 0.01201
Residual 12 27033.306 2252.776
Total 14 56493.333
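The F statistic in the table can be rebuilt from its sums of squares and degrees of freedom. A short arithmetic sketch with illustrative names; the decision uses the Significance F value printed in the table rather than a critical value.

```python
ssr = 29460.027  # regression sum of squares
sse = 27033.306  # residual sum of squares
n = 15           # observations
K = 2            # independent variables

msr = ssr / K                    # mean square for regression
error_variance = sse / (n - K - 1)  # s_e^2, the residual mean square
f_stat = msr / error_variance

significance_f = 0.01201  # p-value reported in the ANOVA table
alpha = 0.05
reject_h0 = significance_f < alpha

print(f"F = {f_stat:.5f}, reject H0 at alpha = {alpha}: {reject_h0}")
```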
F-Test for Overall Significance
(continued)
$H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_r = 0$
$H_1$: at least one $\alpha_j \neq 0$ $(j = 1, \dots, r)$
Tests on a Subset of
Regression Coefficients
(continued)
Reject $H_0$ if $F = \frac{(SSE(r) - SSE)/r}{s_e^2} > F_{r,\,n-K-r-1,\,\alpha}$
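As a hedged illustration, the subset test can be applied to the pie sales data with Advertising as the single variable under test (r = 1) and Price as the variable kept in both models (K = 1). This reading of K and r is an assumption taken from the degrees of freedom n - K - r - 1; the slide defining the model is not preserved here. With r = 1 the statistic should equal the square of the Advertising t statistic (2.85478). Helper names are hypothetical.

```python
sales = [350, 460, 350, 430, 350, 380, 430, 470, 450, 490, 340, 300, 440, 450, 300]
price = [5.50, 7.50, 8.00, 8.00, 6.80, 7.50, 4.50, 6.40, 7.00, 5.00, 7.20, 7.90, 5.90, 5.00, 7.00]
advertising = [3.3, 3.3, 3.0, 4.5, 3.0, 4.0, 3.0, 3.7, 3.5, 4.0, 3.5, 3.2, 4.0, 3.5, 2.7]
n = len(sales)
my, m1, m2 = sum(sales) / n, sum(price) / n, sum(advertising) / n

# Centered sums of squares and cross-products.
s11 = sum((p - m1) ** 2 for p in price)
s22 = sum((a - m2) ** 2 for a in advertising)
s12 = sum((p - m1) * (a - m2) for p, a in zip(price, advertising))
s1y = sum((p - m1) * (s - my) for p, s in zip(price, sales))
s2y = sum((a - m2) * (s - my) for a, s in zip(advertising, sales))

# Full model: Sales on Price and Advertising.
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det
b0 = my - b1 * m1 - b2 * m2
sse_full = sum((s - (b0 + b1 * p + b2 * a)) ** 2
               for s, p, a in zip(sales, price, advertising))

# Restricted model: Advertising coefficient forced to 0, so Sales on Price only.
slope_r = s1y / s11
intercept_r = my - slope_r * m1
sse_restricted = sum((s - (intercept_r + slope_r * p)) ** 2
                     for s, p in zip(sales, price))

K, r = 1, 1                               # variables kept, variables tested
error_variance = sse_full / (n - K - r - 1)  # s_e^2 from the full model
f_stat = ((sse_restricted - sse_full) / r) / error_variance
f_critical = 2.1788 ** 2                  # F(1, 12) at .05 equals the squared t critical value

print(f"F = {f_stat:.4f}, critical value = {f_critical:.4f}")
```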
Prediction
Given a population regression model
Using The Equation to Make
Predictions
Predict sales for a week in which the selling
price is $5.50 and advertising is $350:
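The worked answer is not preserved in this transcript, so here is a minimal sketch using the fitted equation Sales = 306.526 - 24.975(Price) + 74.131(Advertising). Advertising is measured in $100s, so $350 enters the equation as 3.5. The function name is illustrative.

```python
def predict_sales(price, advertising_hundreds):
    """Predicted weekly pie sales from the fitted equation in the regression output."""
    return 306.526 - 24.975 * price + 74.131 * advertising_hundreds


advertising_dollars = 350
# Convert dollars to the $100s units used in the data table.
predicted = predict_sales(5.50, advertising_dollars / 100)
print(f"Predicted sales: {predicted:.2f} pies")
```

Passing 350 instead of 3.5 would overstate the advertising effect by a factor of 100, which is the most common slip with this example.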
Predictions in PHStat
PHStat | regression | multiple regression …
Check the “confidence and prediction interval estimates” box
[Screenshot: PHStat output with these callouts]
Input values
Predicted y value
Confidence interval for the mean y value, given these x’s
Prediction interval for an individual y value, given these x’s
Residuals in Multiple Regression
Two-variable model: $\hat{y} = b_0 + b_1 x_1 + b_2 x_2$
Residual: $e_i = y_i - \hat{y}_i$
[Figure: a sample observation $y_i$ and its fitted value $\hat{y}_i$ at $(x_{1i}, x_{2i})$; axes x1, x2, y]
Dummy Variables
Dummy Variable Example
$\hat{y} = b_0 + b_1 x_1 + b_2 x_2$
Let:
y = Pie Sales
x1 = Price
x2 = Holiday (x2 = 1 if a holiday occurred during the week;
x2 = 0 if there was no holiday that week)
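[Figure: sales (y) for the two values of the dummy variable, drawn as two lines with intercepts b0 + b2 and b0 (different intercept, same slope)]
If $H_0: \beta_2 = 0$ is rejected, then “Holiday” has a
significant effect on pie sales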
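The "different intercept, same slope" picture can be sketched in code. The coefficients below are hypothetical values chosen only for illustration; the slides do not report estimates for the holiday model.

```python
def predict_sales(price, holiday, b0, b1, b2):
    """Fitted value for y = b0 + b1*price + b2*holiday, where holiday is 0 or 1."""
    return b0 + b1 * price + b2 * holiday


# Hypothetical coefficients, NOT estimates from the slides.
b0, b1, b2 = 300.0, -30.0, 15.0

gaps = []
for price in (6.0, 7.5):
    with_holiday = predict_sales(price, 1, b0, b1, b2)
    without_holiday = predict_sales(price, 0, b0, b1, b2)
    # The vertical gap between the two lines is b2 at every price.
    gaps.append(with_holiday - without_holiday)
    print(f"price {price}: holiday {with_holiday}, no holiday {without_holiday}")
```

Because the dummy enters additively, it shifts the intercept from b0 to b0 + b2 and leaves the price slope b1 unchanged.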
Interaction Between
Explanatory Variables
$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3$
$= b_0 + b_1 x_1 + b_2 x_2 + b_3 (x_1 x_2)$
Effect of Interaction
Given: $Y = \beta_0 + \beta_2 X_2 + (\beta_1 + \beta_3 X_2) X_1$
$= \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2$
Interaction Example
Suppose x2 is a dummy variable and the estimated
regression equation is $\hat{y} = 1 + 2x_1 + 3x_2 + 4x_1 x_2$
x2 = 1: $\hat{y} = 1 + 2x_1 + 3(1) + 4x_1(1) = 4 + 6x_1$
x2 = 0: $\hat{y} = 1 + 2x_1 + 3(0) + 4x_1(0) = 1 + 2x_1$
[Figure: the two lines plotted for x1 from 0 to 1.5, with y from 0 to 12]
The slopes differ when the effect of x1 on y depends on the value of x2
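The two lines in this example can be evaluated directly. A small sketch; the function name is illustrative, and the coefficients are the ones given in the example equation.

```python
def fitted_value(x1, x2):
    """Estimated equation from the example: y-hat = 1 + 2*x1 + 3*x2 + 4*x1*x2."""
    return 1 + 2 * x1 + 3 * x2 + 4 * x1 * x2


slopes = {}
for x2 in (0, 1):
    intercept = fitted_value(0, x2)
    # Change in y-hat for a one-unit increase in x1, holding x2 fixed.
    slopes[x2] = fitted_value(1, x2) - intercept
    print(f"x2 = {x2}: y-hat = {intercept} + {slopes[x2]} * x1")
```

The slope of x1 changes from 2 to 6 when x2 switches from 0 to 1, and the difference of 4 is the coefficient of the interaction term.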
Significance of Interaction Term
The coefficient b3 is an estimate of the difference
in the coefficient of x1 when x2 = 1 compared to
when x2 = 0
The t statistic for b3 can be used to test the
hypothesis
$H_0: \beta_3 = 0 \mid \beta_1 \neq 0, \beta_2 \neq 0$
$H_1: \beta_3 \neq 0 \mid \beta_1 \neq 0, \beta_2 \neq 0$
If we reject the null hypothesis we conclude that there is
a difference in the slope coefficient for the two
subgroups
Multiple Regression Assumptions
$e_i = y_i - \hat{y}_i$
Assumptions:
The errors are normally distributed
Errors have a constant variance
The model errors are independent
Analysis of Residuals
in Multiple Regression
Residuals vs. $\hat{y}_i$
Residuals vs. x1i
Residuals vs. x2i
Residuals vs. time (if time series data)
Use the residual plots to check for
violations of regression assumptions
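The quantities behind these plots can be computed for the pie sales data as follows. This is a sketch that only tabulates the fitted values and residuals; drawing the plots is left to whatever charting tool is at hand. `fit_two_predictors` is a hypothetical helper, not part of the textbook's Excel workflow.

```python
sales = [350, 460, 350, 430, 350, 380, 430, 470, 450, 490, 340, 300, 440, 450, 300]
price = [5.50, 7.50, 8.00, 8.00, 6.80, 7.50, 4.50, 6.40, 7.00, 5.00, 7.20, 7.90, 5.90, 5.00, 7.00]
advertising = [3.3, 3.3, 3.0, 4.5, 3.0, 4.0, 3.0, 3.7, 3.5, 4.0, 3.5, 3.2, 4.0, 3.5, 2.7]


def fit_two_predictors(y, x1, x2):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via centered normal equations."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (v - my) for a, v in zip(x1, y))
    s2y = sum((b - m2) * (v - my) for b, v in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return my - b1 * m1 - b2 * m2, b1, b2


b0, b1, b2 = fit_two_predictors(sales, price, advertising)
fitted = [b0 + b1 * p + b2 * a for p, a in zip(price, advertising)]
# Residual for each week: observed sales minus fitted sales.
residuals = [s - f for s, f in zip(sales, fitted)]
sse = sum(e ** 2 for e in residuals)

# Columns to plot: residuals against fitted values, price, and advertising.
print("fitted    price  adv  residual")
for f, p, a, e in zip(fitted, price, advertising, residuals):
    print(f"{f:8.2f}  {p:5.2f}  {a:3.1f}  {e:8.2f}")
print(f"SSE = {sse:.3f}")
```

With an intercept in the model the residuals sum to zero, so a residual plot centered away from zero would point to a computation error rather than a violated assumption.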