
The Population Model

Multiple regression model with k independent variables:

    Y = β₀ + β₁X₁ + β₂X₂ + … + βₖXₖ + ε,    ε ~ N(0, σ²)

where β₀ is the population intercept, β₁, …, βₖ are the population (partial) slopes, and ε is the error term.

The Sample Model: Least Squares Method

• The coefficients of the population model are estimated using sample data.
• b₀, …, bₖ are the values that minimize the sum of squared errors (SSE):

    SSE = Σ (Yᵢ − Ŷᵢ)² = Σ (Yᵢ − (b₀ + b₁X₁ᵢ + … + bₖXₖᵢ))²

• The fitted equation, with sample intercept b₀ and sample (partial) slopes b₁, …, bₖ, gives the predicted Y:

    Ŷ = b₀ + b₁X₁ + b₂X₂ + … + bₖXₖ

• Normal equations:

    Σ eᵢ = 0
    Σ X₁ᵢ eᵢ = 0
    ⋮
    Σ Xₖᵢ eᵢ = 0
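To make the least-squares idea concrete, here is a minimal Python sketch (my addition, not slide material) on a small hypothetical data set; numpy's lstsq minimizes the SSE, and the fitted residuals satisfy the normal equations:

```python
import numpy as np

# Hypothetical small data set (n = 6, k = 2), just to show the mechanics
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
Y  = np.array([6.8, 6.1, 5.2, 8.3, 7.1, 10.4])

# Design matrix: a column of ones for b0, then one column per predictor
X = np.column_stack([np.ones(len(Y)), X1, X2])

# lstsq returns the b that minimizes SSE = sum((Y - X @ b)**2)
b, *_ = np.linalg.lstsq(X, Y, rcond=None)

e = Y - X @ b
# Normal equations: the residuals are orthogonal to every column of X
print(X.T @ e)  # all entries ~ 0 up to rounding
```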
ANOVA for Multiple Regression

    SST = SSR + SSE

    Σ (Yᵢ − Ȳ)²  =  Σ (Ŷᵢ − Ȳ)²  +  Σ (Yᵢ − Ŷᵢ)²

    Total Sum of Squares = Regression Sum of Squares + Error Sum of Squares

• The degrees of freedom split the same way:

    df_SST = df_SSR + df_SSE,   i.e.   n − 1 = k + (n − k − 1)

Example: Two Independent Variables

• A distributor of frozen dessert pies wants to evaluate factors thought to influence demand.
• Y: pie sales (units/week)
• X's: price (in $) and advertising (in $100s)
• Data are collected for 15 weeks.
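A one-function sketch of this decomposition (an illustration, not slide material):

```python
import numpy as np

def anova_decomposition(Y, Y_hat):
    """Split the total variation in Y into explained and error parts."""
    Y_bar = Y.mean()
    SST = np.sum((Y - Y_bar) ** 2)       # total sum of squares
    SSR = np.sum((Y_hat - Y_bar) ** 2)   # regression sum of squares
    SSE = np.sum((Y - Y_hat) ** 2)       # error sum of squares
    return SST, SSR, SSE  # SST == SSR + SSE for OLS fits with an intercept
```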

Geometrical Representation

[Figure: sample observations Yᵢ plotted above the fitted plane Ŷ = b₀ + b₁X₁ + b₂X₂ in (X₁, X₂, Y) space; each residual eᵢ = Yᵢ − Ŷᵢ is the vertical distance from a point to the plane. The best plane is found by minimizing the sum of squared errors, Σe².]

Pie Sales Example

Week   Pie Sales   Price ($)   Advertising ($100s)
  1       350        5.50            3.3
  2       460        7.50            3.3
  3       350        8.00            3.0
  4       430        8.00            4.5
  5       350        6.80            3.0
  6       380        7.50            4.0
  7       430        4.50            3.0
  8       470        6.40            3.7
  9       450        7.00            3.5
 10       490        5.00            4.0
 11       340        7.20            3.5
 12       300        7.90            3.2
 13       440        5.90            4.0
 14       450        5.00            3.5
 15       300        7.00            2.7

Multiple regression equation:

    Ŷ = b₀ + b₁X₁ + b₂X₂
    Sales = b₀ + b₁(Price) + b₂(Advertising)
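The slides run this regression in SPSS; as a cross-check (my addition, using Python's statsmodels rather than SPSS), the same fit can be reproduced from the data table:

```python
import numpy as np
import statsmodels.api as sm

price  = np.array([5.5, 7.5, 8.0, 8.0, 6.8, 7.5, 4.5, 6.4,
                   7.0, 5.0, 7.2, 7.9, 5.9, 5.0, 7.0])
advert = np.array([3.3, 3.3, 3.0, 4.5, 3.0, 4.0, 3.0, 3.7,
                   3.5, 4.0, 3.5, 3.2, 4.0, 3.5, 2.7])
sales  = np.array([350, 460, 350, 430, 350, 380, 430, 470,
                   450, 490, 340, 300, 440, 450, 300])

# Intercept column plus the two predictors
X = sm.add_constant(np.column_stack([price, advert]))
fit = sm.OLS(sales, X).fit()
print(fit.params)  # ~ [306.526, -24.975, 74.131], matching the slides
```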
Regression Using SPSS

[SPSS dialogs and output screenshots for the pie sales regression.]

Coefficient of Multiple Determination

• Reports the proportion of total variation in Y explained by all X variables taken together:

    R² = SSR / SST

Adjusted R²

• Shows the proportion of variation in Y explained by all regressors, adjusted for the number of regressors used:

    R²ₐ = 1 − [SSE / (n − k − 1)] / [SST / (n − 1)] = 1 − (1 − R²) · (n − 1) / (n − k − 1)

  (n = sample size, k = number of regressors)

• Always smaller than R²
• Can be negative

Is the Model Useful?

• Does the model explain a significant proportion of the variation in the data?
• Is there a linear relationship between all of the X variables considered together and Y?
• Hypotheses:

    H₀: β₁ = β₂ = … = βₖ = 0   (no linear relationship)
    H₁: at least one βⱼ ≠ 0    (at least one independent variable affects Y)
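A small helper (my sketch) computing both R² from the previous slide and the adjusted version:

```python
import numpy as np

def r2_and_adjusted(Y, Y_hat, k):
    """R^2 and adjusted R^2 for a fit with k regressors (n observations)."""
    n = len(Y)
    SSE = np.sum((Y - Y_hat) ** 2)
    SST = np.sum((Y - Y.mean()) ** 2)
    r2 = 1 - SSE / SST                             # = SSR / SST
    r2_adj = 1 - (1 - r2) * (n - 1) / (n - k - 1)  # penalizes extra regressors
    return r2, r2_adj
```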
F Test for Overall Significance

• Test statistic:

    F = MSR / MSE = [SSR / k] / [SSE / (n − k − 1)]  ~  F(k, n − k − 1)

F Test for Overall Significance (continued): SPSS Output

Multiple regression model:

    Sales = 306.526 − 24.975(Price) + 74.131(Advertising)

• H₀: β₁ = β₂ = 0;  H₁: not both zero
• α = .05;  df₁ = 2, df₂ = 12
• Test statistic: F = MSR / MSE = 6.5386
• Critical value: F(2, 12, 0.05) = 3.885
• Decision: the F test statistic falls in the rejection region (6.5386 > 3.885), so reject H₀.
• Conclusion: there is evidence that at least one independent variable affects Y.
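The same test can be reproduced numerically from the pie sales ANOVA sums of squares shown in the partial-F slides below (a sketch using scipy):

```python
from scipy import stats

# Pie sales ANOVA quantities (k = 2 predictors, n = 15 weeks)
SSR, SSE, k, n = 29460.03, 27033.31, 2, 15

MSR = SSR / k                            # 14730.01
MSE = SSE / (n - k - 1)                  # 2252.78
F = MSR / MSE                            # ~ 6.5386
crit = stats.f.ppf(0.95, k, n - k - 1)   # ~ 3.885
p = stats.f.sf(F, k, n - k - 1)          # upper-tail p-value
print(F, crit, p)                        # F > crit, so reject H0
```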
The Multiple Regression Equation

    Sales = 306.526 − 24.975(Price) + 74.131(Advertising)

where Sales is in number of pies per week, Price is in $, and Advertising is in $100s.

• b₁ = −24.975: sales will decrease, on average, by 24.975 pies per week for each $1 increase in selling price, when advertising is fixed.
• b₂ = 74.131: sales will increase, on average, by 74.131 pies per week for each $100 increase in advertising, when price is fixed.

Using the Equation to Make Predictions

Predict sales for a week in which the selling price is $5.50 and advertising is $350:

    Sales = 306.526 − 24.975(5.50) + 74.131(3.5) = 428.62

Note that Advertising is in $100s, so $350 means X₂ = 3.5. Predicted sales is 428.62 pies.

Are Individual Variables Significant?

• Hypotheses:

    H₀: βⱼ = 0   (Xⱼ is useless in the presence of the other variables)
    H₁: βⱼ ≠ 0   (Xⱼ is useful)

• Test statistic:

    t = (bⱼ − 0) / se(bⱼ)    (df = n − k − 1)
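A sketch of the test as a function (my addition; it assumes the coefficient estimate and its standard error are already available, e.g. from the SPSS output):

```python
from scipy import stats

def slope_t_test(b_j, se_b_j, n, k):
    """Two-sided t test of H0: beta_j = 0 for one slope in a k-regressor model."""
    t = (b_j - 0) / se_b_j
    p = 2 * stats.t.sf(abs(t), df=n - k - 1)  # two-tailed p-value
    return t, p
```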
Are Individual Variables Significant? (continued)

[SPSS coefficients table showing the t statistic and p-value for each slope.]

Partial F Tests

• Used to assess the effect of adding or removing predictors from the model
• Form the basis of model building
Adding One Predictor: Example

Test at the α = .05 level to determine whether price (X₁) improves the model, given that advertising (X₂) is already included.

    H₀: adding X₁ does not improve the model when X₂ is already included (β₁ = 0)
    H₁: it does (β₁ ≠ 0)

ANOVA (X₂ only)                          ANOVA (X₁ and X₂)
              df      SS                               df      SS        MS
Regression     1   17484.22             Regression      2   29460.03  14730.01
Residual      13   39009.11             Residual       12   27033.31   2252.78
Total         14   56493.33             Total          14   56493.34
Adding One Predictor: Example (continued)

Using the two ANOVA tables above:

    F = [SSR(X₁, X₂) − SSR(X₂)] / MSE(X₁, X₂)        (numerator df = 1, one added predictor)
      = (29460.03 − 17484.22) / 2252.78
      = 5.316  >  F(1, 12, 0.05) = 4.75

Conclusion: reject H₀; adding X₁ does improve the model.

Removing One Predictor: Example

Test at the α = .05 level to determine whether price (X₁) can be removed from the model with price (X₁) and advertising (X₂) as predictors.
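Numerically (a sketch; the sums of squares come from the two ANOVA tables above):

```python
from scipy import stats

SSR_full, SSR_reduced = 29460.03, 17484.22  # SSR(X1,X2) and SSR(X2)
MSE_full, df_resid = 2252.78, 12            # from the full model

# One predictor is added, so the numerator degrees of freedom is 1
F = (SSR_full - SSR_reduced) / 1 / MSE_full  # ~ 5.316
crit = stats.f.ppf(0.95, 1, df_resid)        # ~ 4.75
print(F, crit, F > crit)                     # True: adding X1 helps
```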

Removing One Predictor: Example (continued)

    H₀: removing X₁ does not reduce the power of the model when X₂ is also included (β₁ = 0)
    H₁: it does (β₁ ≠ 0)

The ANOVA tables and the partial F statistic are the same as in the 'adding one predictor' test:

    F = [SSR(X₁, X₂) − SSR(X₂)] / MSE(X₁, X₂)        (numerator df = 1)
      = (29460.03 − 17484.22) / 2252.78
      = 5.316  >  F(1, 12, 0.05) = 4.75

Conclusion: reject H₀; dropping X₁ does reduce the power of the model.

What is Collinearity?

• Collinearity (or multicollinearity) exists if two or more independent variables have a perfect, or nearly perfect, linear relationship.
• Examples of exact collinearity:

    X₃ = 1 + 2X₂ + 3X₁
    X₂ = 4X₃

Hald Cement Data

Y: heat evolved;  X₁: tricalcium aluminate;  X₂: tricalcium silicate;
X₃: tetracalcium alumino ferrite;  X₄: dicalcium silicate;  n = 13

    Y      X1   X2   X3   X4
  78.5      7   26    6   60
  74.3      1   29   15   52
 104.3     11   56    8   20
  87.6     11   31    8   47
  95.9      7   52    6   33
 109.2     11   55    9   22
 102.7      3   71   17    6
  72.5      1   31   22   44
  93.1      2   54   18   22
 115.9     21   47    4   26
  83.8      1   40   23   34
 113.3     11   66    9   12
 109.4     10   68    8   12

Collinearity in Hald Cement Data

• Matrix scatter plots of X₁–X₄ show strong pairwise relationships between the independent variables.
• There are significant correlations between independent variables.

Pearson correlations (N = 13 for each pair; ** = significant at the 0.01 level, 2-tailed):

        X1        X2        X3        X4
X1    1         .229     -.824**   -.245
X2     .229    1         -.139     -.973**
X3    -.824**  -.139     1          .030
X4    -.245   -.973**     .030     1

Sig. (2-tailed): X1–X2 .453, X1–X3 .001, X1–X4 .419, X2–X3 .650, X2–X4 .000, X3–X4 .924
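The correlation matrix is easy to reproduce from the data table (a numpy sketch, my addition):

```python
import numpy as np

# Hald cement predictors X1..X4, one row per specimen (from the data table)
X = np.array([
    [ 7, 26,  6, 60], [ 1, 29, 15, 52], [11, 56,  8, 20],
    [11, 31,  8, 47], [ 7, 52,  6, 33], [11, 55,  9, 22],
    [ 3, 71, 17,  6], [ 1, 31, 22, 44], [ 2, 54, 18, 22],
    [21, 47,  4, 26], [ 1, 40, 23, 34], [11, 66,  9, 12],
    [10, 68,  8, 12],
])

# rowvar=False treats each column as a variable
R = np.corrcoef(X, rowvar=False)
print(np.round(R, 3))  # note r(X1,X3) ~ -.824 and r(X2,X4) ~ -.973
```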
Model Building

Given a list of potential predictors, models can be built using:

• Backward elimination
• Forward selection

Backward Elimination

• Start with the 'full' model.
• If all predictors are significant, we get the final answer.
• If some of them are not, the one with the largest p-value is removed. A new model is then fitted using the remaining predictors. This step is repeated until all remaining predictors are significant.
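A sketch of this loop in Python (my illustration of the procedure described above, not SPSS's exact implementation; it uses statsmodels t-test p-values):

```python
import numpy as np
import statsmodels.api as sm

def backward_elimination(X, y, names, alpha=0.05):
    """Drop the least significant predictor until every p-value is <= alpha."""
    keep = list(range(X.shape[1]))
    while keep:
        fit = sm.OLS(y, sm.add_constant(X[:, keep])).fit()
        pvals = fit.pvalues[1:]            # skip the intercept
        worst = int(np.argmax(pvals))
        if pvals[worst] <= alpha:          # all predictors significant: done
            return [names[j] for j in keep], fit
        del keep[worst]                    # remove the weakest predictor
    return [], None                        # nothing survives
```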

Hald Cement Data: Backward Elimination

[SPSS output showing the sequence of models as predictors are removed.]

Forward Selection

• Start with the 'null' model.
• The predictor that gives the largest and most significant increase in SSR is included. If the largest increase is non-significant, the final answer is the 'null' model.
• Each predictor not already in the model is tested. The most significant of these is added.
• Continue adding predictors until none of the remaining ones is significant.
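And a matching sketch of forward selection (again my illustration; for a single added predictor, the t-test p-value is equivalent to the partial F test):

```python
import numpy as np
import statsmodels.api as sm

def forward_selection(X, y, names, alpha=0.05):
    """Add the most significant remaining predictor until none qualifies."""
    chosen, remaining = [], list(range(X.shape[1]))
    while remaining:
        # p-value of each candidate when added to the current model
        pvals = [sm.OLS(y, sm.add_constant(X[:, chosen + [j]])).fit().pvalues[-1]
                 for j in remaining]
        best = int(np.argmin(pvals))
        if pvals[best] > alpha:            # no significant improvement: stop
            break
        chosen.append(remaining.pop(best))
    return [names[j] for j in chosen]
```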

Hald Cement Data: Forward Selection

[SPSS output showing the sequence of models as predictors are added.]

Criteria of Model Selection

• In some cases, 'backward elimination' and 'forward selection' give different answers. The final choice has to be based on performance criteria such as adjusted R² and MSE:

    Model     Adj R²    MSE
    X1, X2    0.974     5.79
    X1, X4    0.967     7.476

• For the Hald cement data, the final model is (X₁, X₂).
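As a check (my sketch), the table's adjusted R² and MSE values can be reproduced from the Hald data given earlier:

```python
import numpy as np
import statsmodels.api as sm

Y  = np.array([78.5, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7,
               72.5, 93.1, 115.9, 83.8, 113.3, 109.4])
X1 = np.array([7, 1, 11, 11, 7, 11, 3, 1, 2, 21, 1, 11, 10])
X2 = np.array([26, 29, 56, 31, 52, 55, 71, 31, 54, 47, 40, 66, 68])
X4 = np.array([60, 52, 20, 47, 33, 22, 6, 44, 22, 26, 34, 12, 12])

for label, cols in [("X1,X2", (X1, X2)), ("X1,X4", (X1, X4))]:
    fit = sm.OLS(Y, sm.add_constant(np.column_stack(cols))).fit()
    print(label, round(fit.rsquared_adj, 3), round(fit.mse_resid, 3))
# Expect roughly: X1,X2 0.974 5.79  and  X1,X4 0.967 7.476
```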
