Professional Documents
Culture Documents
Introduction
[I+ k(X'X)-']-l'3
(2)
0< k<
2oX2//*'/3*
(3)
tr MSE[/3(k)]< tr MSE(,B).
II.
(4)
Interpretation of ORR
[ 488 1
LoI
= [1
/3
+ [Vj
(6)
where 0 is a p x 1 vector of zeros. Let us compare this with the mixed estimator of /8 of the
model in (1) estimated with the restriction that
very likely
which, for b
a
b=
>
489
a, gives
-b
2f/ VT
-2cr/VXk1<3 j
+ 2,r/VT
(13)
Sample-based Selection of k
y = X:3 + E
(12)
a general
=XPP'f3+E
(14)
= X*a + E
where X* = XP, a = P',3, and P is an orthonormal matrix whose columns are normalized eigenvectors of X'X, that is,
PP' = I
(15)
490
P'X'XP=
(16)
X.
ZP \/(XA
= (P'X'XP)-'P'X'XPP'f,
(17)
= PI,3.
k)3
= k
i=l
a = (P'X'XP)-'P'X'y
= (P'X'XP)-'P'X'Xj8
(23)
(p - 2)6r2/13 3.
>P
a, 2/(X,
a(k) = P'/3(k).
(18)
(24)
aii2/of2[(1/k)
(1/X1)]
X 2
(25)
i=l1
(19
1)
x [I + k(X'X)-'],j(k).
(20)
a(k) = [I + k(X*'X*)-l]-la&
= [I + k(P'X'XP)-']-l&
(26)
i=l1
(21)
>
ai2/[(I/k)
+ (I/Xi)]
i=l
=p
k.k4
2)]
(27)
(y
(22)
'8
-
X8)'(y
(28)
f3jlP'}lIP'.
(p'
(p'=
{YjMSE[,3j(k)]}'12
L = maxfl1,31(k)4These
k)3
that
where
i=l
B34,
(p'
1)
2)
13P(k)- 8P11
= x).
The performance of a ridge regression estimator based on a given value of k depends on (i)
the number and the values of the regression
coefficients, (ii) the degree of multicollinearity,
and (iii) the value of the variance of the disturbances, o-2. It can be expected that the same
factors would also be relevant for the ORR estimation with unknown k. In the Monte Carlo experiment at hand we take the factors (i) and (ii)
into consideration but, following Thisted (1976),
leave the value of o-2 constant (equal to unity)
throughout the experiment in order to keep the
computer costs down.
In constructing the data sets and in determining the values of the regression coefficients we
followed, with some modifications, the approach
of Dempster et al. (1977). Two models, one with
4 explanatory variables and 20 observations and
one with 8 explanatory variables and 40 observations, were used in this study. The values of the
explanatory variables have been generated from
a standard normal distribution, modified to reflect a low, a medium, and a high degree of multicollinearity, and standardized to be used in a
correlation matrix form.5 For the 4-variable
model the values of the determinants of this correlation matrix were 0.394, 0.016, and 0.005. The
corresponding values of the highest R2 obtained
by regressing each explanatory variable on the
remaining explanatory variables were 0.597,
0.980, and 0.994.6
The selection of the values of the regression
coefficients was based on the consideration of
two factors, shape (pattern of similarity and dissimilarity) and noncentrality (relative distance
from the origin). A measure of noncentrality, 6,
is defined as
6=,=',f/tr(X'X).
B11
B12
B21
B22
B31
B32
=
=
=
=
=
(2.2361, 2.2361,
(0
,0
,
(4.4721, 4.4721,
,0
,
(O
(5.9161, 5.9161,
= (0
,0
,
491
2.2361, 2.2361)
4.4721,0
)
4.4721, 4.4721)
8.9443,0
)
5.9161, 5.9161)
11.8332,0
).
Evaluation of Results
In our experiment we considered three different degrees of multicollinearity and six different
sets of values of the regression coefficients, giving rise to 18 different designs for each of the two
models (the 4-variable and the 8-variable model).
The following statistics based on the 500 replications have been computed for all the estimators
for each of the three different loss structures:
(i) the average loss;
(ii) the standard deviation of loss;
(iii) the number of times that the ORR loss
exceeded the OLS loss (in 500 samples);
(iv) the ratio of ORR average loss to OLS
average loss.
(29)
7 The values of the explanatory variables and of the regression coefficients for the 8-variable model are similar to those
for the 4-variable model.
492
TABLE
1.-EFFECT
OF
(4-VARIABLE
HKB
Loss
Structure
HKB
p' = 2
.49789
.46734
p = x
p = 1
.67808
.67405
.64457
.63586
HKBM
Dempster
Wermuth
Sclove
.58506
.56788
p = <
p = 1
.73742
.73181
.71295
.70728
p'
p
p'=
HKBM
1
p' = 2
Dempster
Wermuth
Sclove
= 2
= X
.44479
.40002
= x
p = 1
.63018
.63536
.58556
.56936
= x
p = 1
p' = 2
p = x
1
p'=
p' = 2
p =
1
p'=
p' = 2
p =
1
p'=
0.90259
0.32480
0.22044
0.57984
0.44109
0.95599
0.91613
0.57172
0.45630
0.43715
0.35699
0.96558
0.66449
0.54548
0.95981
0.91959
0.65696
0.24270
0.54188
0.10493
0.97354
0.51950
0.33057
0.96684
0.98730
1.00091
0.99865
0.92989
0.97982
0.97241
0.51299
0.33727
0.64712
0.61619
0.23056
0.51335
0.50697
0.32726
0.11867
0.39094
0.36757
0.09349
0.32167
0.31858
pt = 2
.51281
.44927
= x
p = 1
.69222
.72732
.66708
.59429
p/ = 2
.44371
.39285
= <
p = 1
.62830
.63707
.58160
.56157
0.96306
TABLE
High
Shape
Loss
Structure
MODEL)
Multicollinearity
Medium
MODEL)
Estimator
MULTICOLLINEARITY
Low
COEFFICIENTS
OF
SHAPE
Average ORR
Loss/OLS Loss
THE
(4-VARIABLE
OF
2.-EFFECT
OF
THE
NONCENTRALITY
(4-VARIABLE
OF
MODEL)
HKBM
Dempster
Wermuth
Sclove
Loss
Structure
Low
Noncentrality
Medium
High
= 2
p = <
p = 1
p' = 2
p = <
p'= 1
p' = 2
p = <
p'= 1
p' = 2
p = x
p'= 1
p' = 2
p = x
p = 1
.39732
.58575
.57275
.50265
.66022
.65038
.33083
.51421
.50286
.32702
.51994
.49970
.32951
.51305
.50079
.50157
.67604
.67192
.59129
.73769
.73342
.43953
.62548
.62186
.50291
.70731
.68927
.43463
.62139
.61810
.54895
.72220
.72019
.63547
.77764
.77484
.49687
.68392
.68237
.61322
.81171
.79344
.49070
.68041
.67907
p'
493
4.-EFFECT
OF
DC
THE
NUMBER
OF
VARIABLES
494