
Backward elimination and stepwise regression

(a) Backward elimination:


Assume the model with all possible covariates is
$Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_{r-1} X_{r-1} + \varepsilon$.
Backward elimination procedure:
Step 1:
At the beginning, the original model is set to be
$Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_{r-1} X_{r-1} + \varepsilon$.
Then, the following $r-1$ tests are carried out:
$H_{0j}: \beta_j = 0, \quad j = 1, 2, \ldots, r-1$.


The lowest partial F-test value $F_l$, corresponding to $H_{0l}: \beta_l = 0$, or the corresponding t-test value $t_l$, is compared with the preselected significance values $F_0$ and $t_0$. One of two possible steps (step 2a or step 2b) can be taken.
Step 2a:
If $F_l < F_0$ or $|t_l| < t_0$, then $X_l$ can be deleted and the new original model is
$Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_{l-1} X_{l-1} + \beta_{l+1} X_{l+1} + \cdots + \beta_{r-1} X_{r-1} + \varepsilon$.
Go back to step 1.
Step 2b:
If $F_l > F_0$ or $|t_l| > t_0$, the original model is the model we should choose.
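The partial F statistic used above compares the residual sum of squares with and without a single covariate. Below is a minimal numpy sketch, assuming the design matrix X carries an explicit intercept column; the helper names sse and partial_f are only illustrative.

```python
import numpy as np

def sse(X, y):
    """Residual sum of squares from an ordinary least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def partial_f(X, y, j):
    """Partial F statistic for H0: beta_j = 0, where j indexes a column of X
    (column 0 is assumed to be the intercept)."""
    n, p = X.shape
    sse_full = sse(X, y)                        # SSE of the full model
    sse_red = sse(np.delete(X, j, axis=1), y)   # SSE with column j removed
    # extra sum of squares on 1 df, divided by the full-model mean squared error
    return (sse_red - sse_full) / (sse_full / (n - p))
```

Since the partial F statistic on 1 numerator degree of freedom equals the square of the corresponding t statistic, comparing $F_l$ with $F_0$ and comparing $|t_l|$ with $t_0$ amount to the same rule.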
Example (continued):
Suppose the preselected significance level is $\alpha = 0.1$. Thus, the critical value is $F_0 = 3.14$.
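Such a cutoff is just the upper-$\alpha$ quantile of the appropriate F distribution. A short scipy check is sketched below; the error degrees of freedom are an assumed value chosen only to illustrate the call, not taken from the example's data.

```python
from scipy.stats import f

alpha = 0.10
df_error = 13                       # assumed error degrees of freedom, for illustration only
F0 = f.ppf(1 - alpha, 1, df_error)  # upper 10% point of F(1, df_error), about 3.14 here
```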

Step 1:
The original model is
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \varepsilon$.
$F_3 = 0.018$, corresponding to $H_{03}: \beta_3 = 0$, is the smallest partial F value.
Step 2a:
$F_3 = 0.018 < F_0 = 3.14$. Thus, $X_3$ can be deleted. Go back to step 1.
Step 1:
The new original model is
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_4 X_4 + \varepsilon$.
$F_4 = 1.86$, corresponding to $H_{04}: \beta_4 = 0$, is the smallest partial F value.
Step 2a:
$F_4 = 1.86 < F_0 = 3.14$. Thus, $X_4$ can be deleted. Go back to step 1.
Step 1:
The new original model is
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$.
$F_1 = 144$, corresponding to $H_{01}: \beta_1 = 0$, is the smallest partial F value.
Step 2b:
$F_1 = 144 > F_0 = 3.14$. Thus,
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$
is the selected model.
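A compact sketch of the whole backward-elimination loop is given below, again assuming a design matrix whose first column is the intercept; the function name backward_elimination and the cutoff F0 are illustrative, and no data from the example are hard-coded.

```python
import numpy as np

def sse(X, y):
    """Residual sum of squares from an ordinary least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def backward_elimination(X, y, F0):
    """Backward elimination; column 0 of X is the intercept.
    Returns the indices of the columns kept in the final model."""
    keep = list(range(X.shape[1]))
    while len(keep) > 1:
        n, p = len(y), len(keep)
        full = sse(X[:, keep], y)
        mse = full / (n - p)
        # step 1: partial F statistic for every covariate still in the model
        F = {j: (sse(X[:, [k for k in keep if k != j]], y) - full) / mse
             for j in keep[1:]}
        j_min = min(F, key=F.get)
        if F[j_min] < F0:
            keep.remove(j_min)   # step 2a: drop the weakest covariate and repeat
        else:
            break                # step 2b: every remaining covariate exceeds F0
    return keep

# Hypothetical usage with four covariates, mirroring the example above:
# X = np.column_stack([np.ones(len(y)), X1, X2, X3, X4])
# backward_elimination(X, y, F0=3.14)   # returns the retained column indices
```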
(b) Stepwise regression:
The stepwise regression procedure employs a statistical quantity, the partial correlation, to decide which covariate to add. We introduce the partial correlation first.
Partial correlation:
Assume the model is
$Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_{r-1} X_{r-1} + \varepsilon$.
The partial correlation of $X_j$ and $Y$, denoted by
$r_{Y X_j \cdot X_1 X_2 \cdots X_{j-1} X_{j+1} \cdots X_{r-1}}$,
can be obtained as follows:
1. Fit the model
$Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_{j-1} X_{j-1} + \beta_{j+1} X_{j+1} + \cdots + \beta_{r-1} X_{r-1} + \varepsilon$
and obtain the residuals $e_1^Y, e_2^Y, \ldots, e_n^Y$. Also, fit the model
$X_j = \alpha_0 + \alpha_1 X_1 + \cdots + \alpha_{j-1} X_{j-1} + \alpha_{j+1} X_{j+1} + \cdots + \alpha_{r-1} X_{r-1} + \varepsilon$
and obtain the residuals $e_1^{X_j}, e_2^{X_j}, \ldots, e_n^{X_j}$.
2. Compute
$r_{Y X_j \cdot X_1 \cdots X_{j-1} X_{j+1} \cdots X_{r-1}} = \dfrac{\sum_{i=1}^{n}\left(e_i^Y - \bar{e}^Y\right)\left(e_i^{X_j} - \bar{e}^{X_j}\right)}{\sqrt{\sum_{i=1}^{n}\left(e_i^Y - \bar{e}^Y\right)^2 \sum_{i=1}^{n}\left(e_i^{X_j} - \bar{e}^{X_j}\right)^2}}$,
where $\bar{e}^Y = \sum_{i=1}^{n} e_i^Y / n$ and $\bar{e}^{X_j} = \sum_{i=1}^{n} e_i^{X_j} / n$.
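In code, this two-regression recipe translates directly. A small numpy sketch follows, assuming as before a design matrix with an intercept column; residuals and partial_corr are illustrative names.

```python
import numpy as np

def residuals(X, y):
    """Residuals from an ordinary least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def partial_corr(X, y, j):
    """Partial correlation of column j of X with y, given all other columns of X
    (column 0 is assumed to be the intercept)."""
    others = np.delete(X, j, axis=1)
    e_y = residuals(others, y)         # step 1: residuals of y on the other covariates
    e_x = residuals(others, X[:, j])   # step 1: residuals of X_j on the other covariates
    e_y = e_y - e_y.mean()             # step 2: center and correlate the two residual series
    e_x = e_x - e_x.mean()
    return float((e_y @ e_x) / np.sqrt((e_y @ e_y) * (e_x @ e_x)))
```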
Stepwise regression procedure:
The original model is $Y = \beta_0 + \varepsilon$. There are $r-1$ covariates, $X_1, X_2, \ldots, X_{r-1}$.
Step 1:
Select the variable most correlated with $Y$, say $X_{i_1}$, based on the correlation coefficient. Fit the model
$Y = \beta_0 + \beta_{i_1} X_{i_1} + \varepsilon$
and check whether $X_{i_1}$ is significant. If not, then
$Y = \beta_0 + \varepsilon$
is the best model. Otherwise, the new original model is
$Y = \beta_0 + \beta_{i_1} X_{i_1} + \varepsilon$;
go to step 2.
Step 2:
Examine the partial correlations $r_{Y X_j \cdot X_{i_1}}$, $j \neq i_1$, $1 \leq j \leq r-1$. Find the covariate $X_{i_2}$ with the largest value of $|r_{Y X_j \cdot X_{i_1}}|$. Then, fit
$Y = \beta_0 + \beta_{i_1} X_{i_1} + \beta_{i_2} X_{i_2} + \varepsilon$
and obtain the partial F-values $F_{i_1}$, corresponding to $H_0: \beta_{i_1} = 0$, and $F_{i_2}$, corresponding to $H_0: \beta_{i_2} = 0$. Go to step 3.
Step 3:
The smallest partial F-value $F_l$ (one of $F_{i_1}$ and $F_{i_2}$) is compared with the preselected significance value $F_0$. There are two possibilities:
!a"
$f 0
F F
l
<
, then delete the covariate corresponding to l
F
. Go back to
step 2. 1ote that if
2
i l
F F =
, then e'amine the partial correlation
2 1
,
1
i i j r
i j
X YX

.
!b"
)
$f 0
F F
l
>
, then
+ + + =
2 2 1 1
0 i i i i
X X Y
,
is the new original model. Then, go back to step 2, but now examine the
partial correlation 2 1 " , !
,
2
1
i i j r
i i j
X X YX

.
The procedure stops automatically when no variable in the new original model can be removed and the next best candidate cannot be retained in the new original model. Then, the new original model is our selected model.
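A minimal sketch of the whole stepwise loop is given below, combining the partial-correlation entry rule with the partial-F removal rule described above. It assumes a design matrix whose column 0 is the intercept; the names stepwise and _fit are illustrative, and ties or degenerate fits are not handled.

```python
import numpy as np

def _fit(X, y):
    """SSE and residual vector from an ordinary least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r), r

def stepwise(X, y, F0):
    """Stepwise regression; column 0 of X is the intercept.
    Returns the indices of the selected columns."""
    selected, candidates = [0], set(range(1, X.shape[1]))
    while candidates:
        # step 2: partial correlation of each candidate with y, given the current model
        _, e_y = _fit(X[:, selected], y)
        def pcorr(j):
            _, e_x = _fit(X[:, selected], X[:, j])
            ey, ex = e_y - e_y.mean(), e_x - e_x.mean()
            return (ey @ ex) / np.sqrt((ey @ ey) * (ex @ ex))
        best = max(candidates, key=lambda j: abs(pcorr(j)))
        trial = selected + [best]
        n, p = len(y), len(trial)
        sse_full, _ = _fit(X[:, trial], y)
        mse = sse_full / (n - p)
        F = {j: (_fit(X[:, [k for k in trial if k != j]], y)[0] - sse_full) / mse
             for j in trial[1:]}
        j_min = min(F, key=F.get)
        # step 3: compare the smallest partial F with the cutoff F0
        if F[j_min] >= F0:
            selected, candidates = trial, candidates - {best}   # (b) retain the new term
        elif j_min == best:
            candidates.discard(best)   # (a) the new term cannot be retained; try the next one
        else:
            selected.remove(j_min)     # (a) an earlier term is removed; back to step 2
    return selected
```

Here a candidate that fails the F test is simply set aside, which matches the stopping rule stated above: the loop ends once nothing in the current model can be removed and no remaining candidate can be retained.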