Yazid Al-Hassan
University of Regina
Introduction
Consider the standard model for multiple linear regression
y = Xβ + e, …….………………………………….…………………..……….. (1)
where y is an n × 1 column vector of observations on the dependent variable, X is
an n × p fixed matrix of observations on the explanatory variables and is of full
rank p (p ≤ n), β is a p × 1 unknown column vector of regression coefficients,
and e is an n × 1 vector of random errors with E(e) = 0 and E(ee′) = σ²Iₙ, where Iₙ
denotes the n × n identity matrix. The variables are assumed to be standardized so
that X′X is in the form of a correlation matrix, and the vector X′y is the vector of
correlation coefficients of the dependent variable with each explanatory variable.
The least squares (LS) estimator of the parameters is given by
β̂ = (X′X)⁻¹X′y. ………………………………………………………………... (2)
In multiple linear regression models, we usually assume that the explanatory
variables are independent. In practice, however, there may be strong or nearly
strong linear relationships among the explanatory variables. In that case the
independence assumption is no longer valid, which causes the problem of
multicollinearity.
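As a hedged illustration (synthetic data, not from the paper), the following Python sketch shows how two nearly collinear explanatory variables inflate the condition number of X′X and destabilize the LS estimates of Eq. (2):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100

# Two nearly collinear explanatory variables: x2 is x1 plus tiny noise.
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.01, size=n)
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(size=n)

# Standardize so that X'X is a correlation matrix, as assumed in the text.
X = (X - X.mean(axis=0)) / (X.std(axis=0) * np.sqrt(n))

XtX = X.T @ X
cond = np.linalg.cond(XtX)  # a huge condition number signals multicollinearity
print(f"condition number of X'X: {cond:.1e}")

# LS estimates (Eq. (2)) are numerically unstable in this situation.
beta_ls = np.linalg.solve(XtX, X.T @ y)
print("LS estimates:", beta_ls)
```

With near-collinear columns the smallest eigenvalue of X′X is close to zero, so small perturbations of the data move the LS solution by large amounts, which is exactly the instability described above.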
In the presence of multicollinearity, it is impossible to estimate the unique effects
of individual variables in the regression equation. Moreover, the LS estimates are
likely to be too large in absolute value and, possibly, of the wrong sign. Therefore,
multicollinearity is one of the most serious problems in linear regression
analysis.
Several methods have been suggested to solve this problem. "Ridge regression" is
the most popular one, as it has proved useful in many practical applications. The
ridge regression method was proposed by Hoerl and Kennard,[1,2] and since then
numerous papers have been written, suggesting different ways of estimating the
ridge parameter, comparing ridge with LS, or evaluating the performance of
different ridge parameter estimates.
Hoerl and Kennard[1] suggested the use of X′X + kIₚ (k ≥ 0) rather than X′X in
the estimation of β (Eq. (2)). The resulting estimators of β are known in the
literature as the ridge regression estimators, given by
β̂(k) = (X′X + kIₚ)⁻¹X′y.
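A minimal Python sketch of the ridge estimator β̂(k) = (X′X + kIₚ)⁻¹X′y follows; the data and the helper name are illustrative, not from the paper:

```python
import numpy as np

def ridge_estimator(X, y, k):
    """Ridge estimate (X'X + k I_p)^{-1} X'y for a given ridge parameter k >= 0."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

# Synthetic illustration (data not from the paper).
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=50)

print(ridge_estimator(X, y, k=0.0))  # k = 0 reproduces the LS estimator of Eq. (2)
print(ridge_estimator(X, y, k=1.0))  # k > 0 shrinks the estimates toward zero
```

Setting k = 0 recovers the LS solution exactly, while any k > 0 trades a small bias for a reduction in variance, which is the rationale for ridge regression.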
J. J. Appl. Sci., Vol.10, No. 2 (2008)
regression.[1] It follows from Hoerl and Kennard[1] that the value of kᵢ which
minimizes MSE(β̂(k)), where

MSE(β̂(k)) = σ² ∑ᵢ₌₁ᵖ λᵢ/(λᵢ + kᵢ)² + ∑ᵢ₌₁ᵖ kᵢ²αᵢ²/(λᵢ + kᵢ)², ………………………………… (6)

(here λᵢ denotes the ith eigenvalue of X′X and αᵢ the ith canonical regression
coefficient), is

kᵢ = σ²/αᵢ², …………………………………………………………..…………… (7)

where σ² represents the error variance of model (1) and αᵢ is the ith element of α.
Equation (7) gives a value of kᵢ that depends entirely on the unknowns σ² and αᵢ,
which must be estimated from the observed data. Hoerl and Kennard[1] suggested
replacing σ² and αᵢ by their corresponding unbiased estimators, that is,

k̂ᵢ = σ̂²/α̂ᵢ², ……………………………………………………………………….. (8)

where σ̂² = ∑eᵢ²/(n − p) is the residual mean square, an unbiased estimator of σ²,
and α̂ᵢ is the ith element of α̂, which is an unbiased estimator of α.
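The Hoerl–Kennard quantities k̂ᵢ = σ̂²/α̂ᵢ² of Eq. (8) can be computed in the canonical form of the model. The sketch below is illustrative (synthetic data, helper names ours); it assumes λᵢ are the eigenvalues of X′X and α̂ = T′β̂, with T the matrix of eigenvectors:

```python
import numpy as np

def hk_ridge_parameters(X, y):
    """Hoerl-Kennard estimates k_i = sigma_hat^2 / alpha_hat_i^2 (Eq. (8)),
    using the canonical form X'X = T diag(lambda) T', alpha_hat = T' beta_hat."""
    n, p = X.shape
    lam, T = np.linalg.eigh(X.T @ X)            # eigenvalues/eigenvectors of X'X
    beta_ls = np.linalg.solve(X.T @ X, X.T @ y) # LS estimate (Eq. (2))
    alpha_hat = T.T @ beta_ls                   # canonical LS coefficients
    resid = y - X @ beta_ls
    sigma2_hat = resid @ resid / (n - p)        # residual mean square
    return sigma2_hat / alpha_hat ** 2          # one k_i per coefficient

# Synthetic illustration (data not from the paper).
rng = np.random.default_rng(2)
X = rng.normal(size=(40, 3))
y = X @ np.array([1.0, 0.5, -1.0]) + rng.normal(size=40)
print(hk_ridge_parameters(X, y))
```

Coefficients with small canonical estimates α̂ᵢ receive large kᵢ (heavy shrinkage), which is the intended behaviour of Eq. (8).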
In the following we present some methods for estimating the ridge parameter k.
k̂_HSL = σ̂² ∑ᵢ₌₁ᵖ (λᵢα̂ᵢ)² / (∑ᵢ₌₁ᵖ λᵢα̂ᵢ²)² …………………………………………………………. (12)

k̂_KS = max(λᵢ)σ̂² / ((n − p − 1)σ̂² + max(λᵢ)·max(α̂ᵢ)²) …………………………………….. (15)
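The two estimators above can be sketched in Python. The code below is illustrative (synthetic data, helper names ours), with the formulas reconstructed from Eqs. (12) and (15) as they appear in the surrounding text:

```python
import numpy as np

def hsl_ks_parameters(X, y):
    """Illustrative computation of the HSL (Eq. (12)) and KS (Eq. (15))
    ridge-parameter estimates in the canonical form of the model."""
    n, p = X.shape
    lam, T = np.linalg.eigh(X.T @ X)            # eigenvalues of X'X
    beta_ls = np.linalg.solve(X.T @ X, X.T @ y) # LS estimate (Eq. (2))
    alpha = T.T @ beta_ls                       # canonical coefficients
    resid = y - X @ beta_ls
    sigma2 = resid @ resid / (n - p)            # residual mean square

    # HSL: sigma^2 * sum_i (lambda_i alpha_i)^2 / (sum_i lambda_i alpha_i^2)^2
    k_hsl = sigma2 * np.sum((lam * alpha) ** 2) / np.sum(lam * alpha ** 2) ** 2

    # KS: lambda_max sigma^2 / ((n - p - 1) sigma^2 + lambda_max * max|alpha_i|^2)
    lam_max = lam.max()
    k_ks = lam_max * sigma2 / ((n - p - 1) * sigma2
                               + lam_max * np.max(np.abs(alpha)) ** 2)
    return k_hsl, k_ks

# Synthetic illustration (data not from the paper).
rng = np.random.default_rng(4)
X = rng.normal(size=(40, 3))
y = X @ np.array([1.0, 0.5, -1.0]) + rng.normal(size=40)
k_hsl, k_ks = hsl_ks_parameters(X, y)
print(k_hsl, k_ks)
```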
To compare the proposed estimators, a criterion for measuring the "goodness" of
an estimator is needed. Following Lawless and Wang,[5] Gibbons[14] and Kibria,[12]
the mean squared error (MSE) criterion is used throughout our study to measure the
goodness of an estimator.
From Eq. (6), the MSE depends heavily on λᵢ, αᵢ and σ². Since the
estimators in Eqs. (9)–(15) are very hard to compare theoretically, we compare them
through a simulation study, which is discussed in the following section.
MSE(β̂(k)) = (1/5000) ∑ᵣ₌₁⁵⁰⁰⁰ (β̂⁽ʳ⁾ − β)′(β̂⁽ʳ⁾ − β) ……………………………………….. (19)
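The Monte Carlo MSE of Eq. (19) is simply the average squared distance of the replicated estimates from the true coefficient vector. A minimal sketch (illustrative helper, synthetic replications rather than the paper's designs):

```python
import numpy as np

def simulated_mse(beta_true, estimates):
    """Monte Carlo MSE of Eq. (19): average of (beta_hat - beta)'(beta_hat - beta)
    over the rows (replications) of `estimates`."""
    diffs = estimates - beta_true            # shape (R, p)
    return float(np.mean(np.sum(diffs ** 2, axis=1)))

# Toy check: R = 5000 noisy replications of a 2-parameter estimate.
rng = np.random.default_rng(3)
beta = np.array([1.0, 2.0])
estimates = beta + rng.normal(scale=0.1, size=(5000, 2))
print(simulated_mse(beta, estimates))  # ≈ 0.02 (p times the 0.1² noise variance)
```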
The simulated MSEs and ridge parameters ( ks ) are summarized in Tables 1-3.
[Tables 1–3: simulated MSEs and estimated ridge parameters for the estimators HK, HKB, LW, HSL, KS, AM and GM.]
For given n and p, GM performs better than the other estimators when the
correlations between the explanatory variables are low or moderate, but for high
correlations HKB becomes better than GM, and for extremely high correlation
(i.e. ρ = 0.99) all estimators except AM perform better than or as well as GM.
All the other estimators perform substantially better than AM, especially for high correlations.
For given n, p and ρ, the MSEs of most of the estimators decrease as the
ratio 1 p increases.
2- Performance as a Function of k
For given n and p, k decreases for most of the estimators as the ratio 1 p
increases, which means that for most of the estimators there is a direct relation
between the MSE and k. But for given n, p and ρ, the best estimator, apart from AM
and GM, is the one that has the largest k.
Simulation Comparisons
Several extensive studies have been conducted to evaluate the performance of
ridge estimators. Some of these studies are Hoerl and Kennard,[1] Hoerl et al.,[3]
McDonald and Galarneau,[4] Lawless and Wang,[5] Hocking et al.,[6] Wichern
and Churchill,[7] Gibbons,[14] Saleh and Kibria,[9] Singh and Tracy,[10] Kibria[12]
and Khalaf and Shukur.[13] Each of these studies evaluated the estimators most
recently proposed at the time; likewise, the present study evaluates the recent
estimators available at the time it was conducted.
This section compares the results of some simulation studies and points out areas
of agreement (or disagreement) regarding the relative performance of the
estimators evaluated here.
As in the present study, the estimator GM did well in the simulation comparison
of Kibria.[12] The estimator HKB performed well in this study and in those of
Hoerl, Kennard and Baldwin,[3] Gibbons,[14] Kibria[12] and Al-Hassan.[16] Its
performance was criticized, however, by Lawless[17] and Wichern and
Churchill,[7] who could not recommend its use without further study.
The LW estimator was not singled out as one of the best estimators in this study,
which agrees with the results of Gibbons[14] and Al-Hassan.[16] However, most of
the time the LW estimator performed better than the estimators HK, HSL and AM.
This result is similar to that of Kibria.[12]
This result is similar to that of Kibria.[12]
The estimator HK was the first proposed ridge estimator, so it has been evaluated
in various studies. Most of the estimators proposed after HK perform better than
it (e.g. HKB, LW and GM) or almost equivalently to it (e.g. HSL
and KS for low error variance). As for HK, our results were almost identical to the
results of Hoerl, Kennard and Baldwin,[3] Wichern and Churchill,[7] Kibria[12]
and Al-Hassan.[16]
The HSL estimator was included in the simulation studies of Hocking, Speed and
Lynn,[6] Kibria[12] and Al-Hassan.[16] The present study followed the same
format as these studies in this regard.
In addition to the present study, the estimator AM was included in those of Kibria[12]
and Al-Hassan.[16] The performance of this estimator in our study is identical to
that in Al-Hassan.[16] In terms of achieving a smaller MSE, however, our results
for AM disagreed with those of Kibria.[12] In both studies, AM was a highly
biased estimator.
The estimator KS was included in the studies of Khalaf and Shukur[13] and Al-
Hassan.[16] With regard to this estimator, our results were identical to those of Al-
Hassan,[16] and in the case of low error variance, similar to those of
Khalaf and Shukur.[13]
References
[1] Hoerl, A. E. & Kennard, R. W. (1970a) Ridge regression: biased estimation
for nonorthogonal problems. Technometrics 12 (1), 55-67.
[2] Hoerl, A. E. & Kennard, R. W. (1970b) Ridge regression: applications to
nonorthogonal problems. Technometrics 12 (1), 69-82.
[3] Hoerl, A. E., Kennard, R. W. & Baldwin, K. F. (1975) Ridge regression:
some simulation. Communications in Statistics 4 (2), 105–123.
[4] McDonald, G. C. & Galarneau, D. I. (1975) A Monte Carlo evaluation of
some ridge-type estimators. Journal of the American Statistical Association
70 (350), 407-412.
[5] Lawless, J. F. & Wang, P. (1976) A simulation study of ridge and other
regression estimators. Communications in Statistics-Theory and Methods 5
(4), 307–323.
[6] Hocking, R. R., Speed, F. M. & Lynn, M. J. (1976) A class of biased
estimators in linear regression. Technometrics 18 (4), 425-437.
[7] Wichern, D. & Churchill, G. (1978) A comparison of ridge estimators.
Technometrics 20 (2), 301–311.
[8] Nordberg, L. (1982) A procedure for determination of a good ridge
parameter in linear regression. Communications in Statistics-Simulation and
Computation 11 (3), 285–309.
[9] Saleh, A. K. & Kibria, B. M. (1993) Performances of some new preliminary
test ridge regression estimators and their properties. Communications in
Statistics-Theory and Methods 22 (10), 2747–2764.