
The Population Model

Multiple regression model with k independent variables:

    Y = β₀ + β₁X₁ + β₂X₂ + … + βₖXₖ + ε,    ε ~ N(0, σ²)

where β₀ is the population intercept, β₁, …, βₖ are the population (partial) slopes, and ε is the error term.

The Sample Model: Least Squares Method

• The coefficients of the population model are estimated using sample data.
• b₀, …, bₖ are the values that minimize the sum of squared errors (SSE):

    SSE = Σ (Yᵢ − Ŷᵢ)² = Σ (Yᵢ − (b₀ + b₁X₁ᵢ + … + bₖXₖᵢ))²

• The fitted equation, with sample intercept b₀ and sample (partial) slopes b₁, …, bₖ, gives the predicted Y:

    Ŷ = b₀ + b₁X₁ + b₂X₂ + … + bₖXₖ

• Normal equations:

    Σ eᵢ = 0
    Σ X₁ᵢ eᵢ = 0
    ⋮
    Σ Xₖᵢ eᵢ = 0
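To make the least-squares idea concrete, here is a minimal Python sketch (my addition, not slide material) on a small hypothetical data set; numpy's lstsq minimizes the SSE, and the fitted residuals satisfy the normal equations:

```python
import numpy as np

# Hypothetical small data set (n = 6, k = 2), just to show the mechanics
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
Y  = np.array([6.8, 6.1, 5.2, 8.3, 7.1, 10.4])

# Design matrix: a column of ones for b0, then one column per predictor
X = np.column_stack([np.ones(len(Y)), X1, X2])

# lstsq returns the b that minimizes SSE = sum((Y - X @ b)**2)
b, *_ = np.linalg.lstsq(X, Y, rcond=None)

e = Y - X @ b
# Normal equations: the residuals are orthogonal to every column of X
print(X.T @ e)  # all entries ~ 0 up to rounding
```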
ANOVA for Multiple Regression

    SST = SSR + SSE

    Σ (Yᵢ − Ȳ)²  =  Σ (Ŷᵢ − Ȳ)²  +  Σ (Yᵢ − Ŷᵢ)²

    Total Sum of Squares = Regression Sum of Squares + Error Sum of Squares

• The degrees of freedom split the same way:

    df_SST = df_SSR + df_SSE,   i.e.   n − 1 = k + (n − k − 1)

Example: Two Independent Variables

• A distributor of frozen dessert pies wants to evaluate factors thought to influence demand.
• Y: pie sales (units/week)
• X's: price (in $) and advertising (in $100s)
• Data are collected for 15 weeks.
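A one-function sketch of this decomposition (an illustration, not slide material):

```python
import numpy as np

def anova_decomposition(Y, Y_hat):
    """Split the total variation in Y into explained and error parts."""
    Y_bar = Y.mean()
    SST = np.sum((Y - Y_bar) ** 2)       # total sum of squares
    SSR = np.sum((Y_hat - Y_bar) ** 2)   # regression sum of squares
    SSE = np.sum((Y - Y_hat) ** 2)       # error sum of squares
    return SST, SSR, SSE  # SST == SSR + SSE for OLS fits with an intercept
```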

Geometrical Representation

[Figure: sample observations Yᵢ plotted above the fitted plane Ŷ = b₀ + b₁X₁ + b₂X₂ in (X₁, X₂, Y) space; each residual eᵢ = Yᵢ − Ŷᵢ is the vertical distance from a point to the plane. The best plane is found by minimizing the sum of squared errors, Σe².]

Pie Sales Example

Week   Pie Sales   Price ($)   Advertising ($100s)
  1       350        5.50            3.3
  2       460        7.50            3.3
  3       350        8.00            3.0
  4       430        8.00            4.5
  5       350        6.80            3.0
  6       380        7.50            4.0
  7       430        4.50            3.0
  8       470        6.40            3.7
  9       450        7.00            3.5
 10       490        5.00            4.0
 11       340        7.20            3.5
 12       300        7.90            3.2
 13       440        5.90            4.0
 14       450        5.00            3.5
 15       300        7.00            2.7

Multiple regression equation:

    Ŷ = b₀ + b₁X₁ + b₂X₂
    Sales = b₀ + b₁(Price) + b₂(Advertising)
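The slides run this regression in SPSS; as a cross-check (my addition, using Python's statsmodels rather than SPSS), the same fit can be reproduced from the data table:

```python
import numpy as np
import statsmodels.api as sm

price  = np.array([5.5, 7.5, 8.0, 8.0, 6.8, 7.5, 4.5, 6.4,
                   7.0, 5.0, 7.2, 7.9, 5.9, 5.0, 7.0])
advert = np.array([3.3, 3.3, 3.0, 4.5, 3.0, 4.0, 3.0, 3.7,
                   3.5, 4.0, 3.5, 3.2, 4.0, 3.5, 2.7])
sales  = np.array([350, 460, 350, 430, 350, 380, 430, 470,
                   450, 490, 340, 300, 440, 450, 300])

# Intercept column plus the two predictors
X = sm.add_constant(np.column_stack([price, advert]))
fit = sm.OLS(sales, X).fit()
print(fit.params)  # ~ [306.526, -24.975, 74.131], matching the slides
```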
Regression Using SPSS

[SPSS dialogs and output screenshots for the pie sales regression.]

Coefficient of Multiple Determination

• Reports the proportion of total variation in Y explained by all X variables taken together:

    R² = SSR / SST

Adjusted R²

• Shows the proportion of variation in Y explained by all regressors, adjusted for the number of regressors used:

    R²ₐ = 1 − [SSE / (n − k − 1)] / [SST / (n − 1)] = 1 − (1 − R²) · (n − 1) / (n − k − 1)

  (n = sample size, k = number of regressors)

• Always smaller than R²
• Can be negative

Is the Model Useful?

• Does the model explain a significant proportion of the variation in the data?
• Is there a linear relationship between all of the X variables considered together and Y?
• Hypotheses:

    H₀: β₁ = β₂ = … = βₖ = 0   (no linear relationship)
    H₁: at least one βⱼ ≠ 0    (at least one independent variable affects Y)
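A small helper (my sketch) computing both R² from the previous slide and the adjusted version:

```python
import numpy as np

def r2_and_adjusted(Y, Y_hat, k):
    """R^2 and adjusted R^2 for a fit with k regressors (n observations)."""
    n = len(Y)
    SSE = np.sum((Y - Y_hat) ** 2)
    SST = np.sum((Y - Y.mean()) ** 2)
    r2 = 1 - SSE / SST                             # = SSR / SST
    r2_adj = 1 - (1 - r2) * (n - 1) / (n - k - 1)  # penalizes extra regressors
    return r2, r2_adj
```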
F Test for Overall Significance

• Test statistic:

    F = MSR / MSE = [SSR / k] / [SSE / (n − k − 1)]  ~  F(k, n − k − 1)

F Test for Overall Significance (continued): SPSS Output

Multiple regression model:

    Sales = 306.526 − 24.975(Price) + 74.131(Advertising)

• H₀: β₁ = β₂ = 0;  H₁: not both zero
• α = .05;  df₁ = 2, df₂ = 12
• Test statistic: F = MSR / MSE = 6.5386
• Critical value: F(2, 12, 0.05) = 3.885
• Decision: the F test statistic falls in the rejection region (6.5386 > 3.885), so reject H₀.
• Conclusion: there is evidence that at least one independent variable affects Y.
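The same test can be reproduced numerically from the pie sales ANOVA sums of squares shown in the partial-F slides below (a sketch using scipy):

```python
from scipy import stats

# Pie sales ANOVA quantities (k = 2 predictors, n = 15 weeks)
SSR, SSE, k, n = 29460.03, 27033.31, 2, 15

MSR = SSR / k                            # 14730.01
MSE = SSE / (n - k - 1)                  # 2252.78
F = MSR / MSE                            # ~ 6.5386
crit = stats.f.ppf(0.95, k, n - k - 1)   # ~ 3.885
p = stats.f.sf(F, k, n - k - 1)          # upper-tail p-value
print(F, crit, p)                        # F > crit, so reject H0
```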
The Multiple Regression Equation

    Sales = 306.526 − 24.975(Price) + 74.131(Advertising)

where Sales is in number of pies per week, Price is in $, and Advertising is in $100s.

• b₁ = −24.975: sales will decrease, on average, by 24.975 pies per week for each $1 increase in selling price, when advertising is fixed.
• b₂ = 74.131: sales will increase, on average, by 74.131 pies per week for each $100 increase in advertising, when price is fixed.

Using the Equation to Make Predictions

Predict sales for a week in which the selling price is $5.50 and advertising is $350:

    Sales = 306.526 − 24.975(5.50) + 74.131(3.5) = 428.62

Note that Advertising is in $100s, so $350 means X₂ = 3.5. Predicted sales is 428.62 pies.

Are Individual Variables Significant?

• Hypotheses:

    H₀: βⱼ = 0   (Xⱼ is useless in the presence of the other variables)
    H₁: βⱼ ≠ 0   (Xⱼ is useful)

• Test statistic:

    t = (bⱼ − 0) / se(bⱼ)    (df = n − k − 1)
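A sketch of the test as a function (my addition; it assumes the coefficient estimate and its standard error are already available, e.g. from the SPSS output):

```python
from scipy import stats

def slope_t_test(b_j, se_b_j, n, k):
    """Two-sided t test of H0: beta_j = 0 for one slope in a k-regressor model."""
    t = (b_j - 0) / se_b_j
    p = 2 * stats.t.sf(abs(t), df=n - k - 1)  # two-tailed p-value
    return t, p
```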
Are Individual Variables Significant? (continued)

[SPSS coefficients table showing the t statistic and p-value for each slope.]

Partial F Tests

• Used to assess the effect of adding or removing predictors from the model
• Form the basis of model building
Adding One Predictor: Example

Test at the α = .05 level to determine whether price (X₁) improves the model, given that advertising (X₂) is already included.

    H₀: adding X₁ does not improve the model when X₂ is already included (β₁ = 0)
    H₁: it does (β₁ ≠ 0)

ANOVA (X₂ only)                          ANOVA (X₁ and X₂)
              df      SS                               df      SS        MS
Regression     1   17484.22             Regression      2   29460.03  14730.01
Residual      13   39009.11             Residual       12   27033.31   2252.78
Total         14   56493.33             Total          14   56493.34
Adding One Predictor: Example (continued)

Using the two ANOVA tables above:

    F = [SSR(X₁, X₂) − SSR(X₂)] / MSE(X₁, X₂)        (numerator df = 1, one added predictor)
      = (29460.03 − 17484.22) / 2252.78
      = 5.316  >  F(1, 12, 0.05) = 4.75

Conclusion: reject H₀; adding X₁ does improve the model.

Removing One Predictor: Example

Test at the α = .05 level to determine whether price (X₁) can be removed from the model with price (X₁) and advertising (X₂) as predictors.
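Numerically (a sketch; the sums of squares come from the two ANOVA tables above):

```python
from scipy import stats

SSR_full, SSR_reduced = 29460.03, 17484.22  # SSR(X1,X2) and SSR(X2)
MSE_full, df_resid = 2252.78, 12            # from the full model

# One predictor is added, so the numerator degrees of freedom is 1
F = (SSR_full - SSR_reduced) / 1 / MSE_full  # ~ 5.316
crit = stats.f.ppf(0.95, 1, df_resid)        # ~ 4.75
print(F, crit, F > crit)                     # True: adding X1 helps
```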

Removing One Predictor: Example (continued)

    H₀: removing X₁ does not reduce the power of the model when X₂ is also included (β₁ = 0)
    H₁: it does (β₁ ≠ 0)

The ANOVA tables and the partial F statistic are the same as in the 'adding one predictor' test:

    F = [SSR(X₁, X₂) − SSR(X₂)] / MSE(X₁, X₂)        (numerator df = 1)
      = (29460.03 − 17484.22) / 2252.78
      = 5.316  >  F(1, 12, 0.05) = 4.75

Conclusion: reject H₀; dropping X₁ does reduce the power of the model.

What is Collinearity?

• Collinearity (or multicollinearity) exists if two or more independent variables have a perfect, or nearly perfect, linear relationship.
• Examples of exact collinearity:

    X₃ = 1 + 2X₂ + 3X₁
    X₂ = 4X₃

Hald Cement Data

Y: heat evolved;  X₁: tricalcium aluminate;  X₂: tricalcium silicate;
X₃: tetracalcium alumino ferrite;  X₄: dicalcium silicate;  n = 13

    Y      X1   X2   X3   X4
  78.5      7   26    6   60
  74.3      1   29   15   52
 104.3     11   56    8   20
  87.6     11   31    8   47
  95.9      7   52    6   33
 109.2     11   55    9   22
 102.7      3   71   17    6
  72.5      1   31   22   44
  93.1      2   54   18   22
 115.9     21   47    4   26
  83.8      1   40   23   34
 113.3     11   66    9   12
 109.4     10   68    8   12

Collinearity in Hald Cement Data

• Matrix scatter plots of X₁–X₄ show strong pairwise relationships between the independent variables.
• There are significant correlations between independent variables.

Pearson correlations (N = 13 for each pair; ** = significant at the 0.01 level, 2-tailed):

        X1        X2        X3        X4
X1    1         .229     -.824**   -.245
X2     .229    1         -.139     -.973**
X3    -.824**  -.139     1          .030
X4    -.245   -.973**     .030     1

Sig. (2-tailed): X1–X2 .453, X1–X3 .001, X1–X4 .419, X2–X3 .650, X2–X4 .000, X3–X4 .924
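The correlation matrix is easy to reproduce from the data table (a numpy sketch, my addition):

```python
import numpy as np

# Hald cement predictors X1..X4, one row per specimen (from the data table)
X = np.array([
    [ 7, 26,  6, 60], [ 1, 29, 15, 52], [11, 56,  8, 20],
    [11, 31,  8, 47], [ 7, 52,  6, 33], [11, 55,  9, 22],
    [ 3, 71, 17,  6], [ 1, 31, 22, 44], [ 2, 54, 18, 22],
    [21, 47,  4, 26], [ 1, 40, 23, 34], [11, 66,  9, 12],
    [10, 68,  8, 12],
])

# rowvar=False treats each column as a variable
R = np.corrcoef(X, rowvar=False)
print(np.round(R, 3))  # note r(X1,X3) ~ -.824 and r(X2,X4) ~ -.973
```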
Model Building

Given a list of potential predictors, models can be built using:

• Backward elimination
• Forward selection

Backward Elimination

• Start with the 'full' model.
• If all predictors are significant, we get the final answer.
• If some of them are not, the one with the largest p-value is removed. A new model is then fitted using the remaining predictors. This step is repeated until all remaining predictors are significant.
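A sketch of this loop in Python (my illustration of the procedure described above, not SPSS's exact implementation; it uses statsmodels t-test p-values):

```python
import numpy as np
import statsmodels.api as sm

def backward_elimination(X, y, names, alpha=0.05):
    """Drop the least significant predictor until every p-value is <= alpha."""
    keep = list(range(X.shape[1]))
    while keep:
        fit = sm.OLS(y, sm.add_constant(X[:, keep])).fit()
        pvals = fit.pvalues[1:]            # skip the intercept
        worst = int(np.argmax(pvals))
        if pvals[worst] <= alpha:          # all predictors significant: done
            return [names[j] for j in keep], fit
        del keep[worst]                    # remove the weakest predictor
    return [], None                        # nothing survives
```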

Hald Cement Data: Backward Elimination

[SPSS output showing the sequence of models as predictors are removed.]

Forward Selection

• Start with the 'null' model.
• The predictor that gives the largest and most significant increase in SSR is included. If the largest increase is non-significant, the final answer is the 'null' model.
• Each predictor not already in the model is tested. The most significant of these is added.
• Continue adding predictors until none of the remaining ones is significant.
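And a matching sketch of forward selection (again my illustration; for a single added predictor, the t-test p-value is equivalent to the partial F test):

```python
import numpy as np
import statsmodels.api as sm

def forward_selection(X, y, names, alpha=0.05):
    """Add the most significant remaining predictor until none qualifies."""
    chosen, remaining = [], list(range(X.shape[1]))
    while remaining:
        # p-value of each candidate when added to the current model
        pvals = [sm.OLS(y, sm.add_constant(X[:, chosen + [j]])).fit().pvalues[-1]
                 for j in remaining]
        best = int(np.argmin(pvals))
        if pvals[best] > alpha:            # no significant improvement: stop
            break
        chosen.append(remaining.pop(best))
    return [names[j] for j in chosen]
```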

Hald Cement Data: Forward Selection

[SPSS output showing the sequence of models as predictors are added.]

Criteria of Model Selection

• In some cases, 'backward elimination' and 'forward selection' give different answers. The final choice has to be based on performance criteria such as adjusted R² and MSE:

    Model     Adj R²    MSE
    X1, X2    0.974     5.79
    X1, X4    0.967     7.476

• For the Hald cement data, the final model is (X₁, X₂).
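As a check (my sketch), the table's adjusted R² and MSE values can be reproduced from the Hald data given earlier:

```python
import numpy as np
import statsmodels.api as sm

Y  = np.array([78.5, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7,
               72.5, 93.1, 115.9, 83.8, 113.3, 109.4])
X1 = np.array([7, 1, 11, 11, 7, 11, 3, 1, 2, 21, 1, 11, 10])
X2 = np.array([26, 29, 56, 31, 52, 55, 71, 31, 54, 47, 40, 66, 68])
X4 = np.array([60, 52, 20, 47, 33, 22, 6, 44, 22, 26, 34, 12, 12])

for label, cols in [("X1,X2", (X1, X2)), ("X1,X4", (X1, X4))]:
    fit = sm.OLS(Y, sm.add_constant(np.column_stack(cols))).fit()
    print(label, round(fit.rsquared_adj, 3), round(fit.mse_resid, 3))
# Expect roughly: X1,X2 0.974 5.79  and  X1,X4 0.967 7.476
```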
