More Notes for Least Squares
Orlaith Burke
Michaelmas Term, 2010
0.1 Ordinary Least Squares

0.1.1 Assumptions

$$y_i = \beta_0 + \beta_1 x_{1,i} + \beta_2 x_{2,i} + \cdots + \beta_p x_{p,i} + \epsilon_i \qquad (1)$$

The $x_i$ terms are not random and are measured with negligible error. The model is said to be linear in the parameters, so transformation of the response and/or the independent variables is permitted. In matrix form,
$$y = X\beta + \epsilon \qquad (2)$$

where

$$y = (y_1, y_2, \ldots, y_n)^t, \qquad X = \begin{pmatrix} 1 & x_{1,1} & x_{2,1} & \cdots & x_{p,1} \\ 1 & x_{1,2} & x_{2,2} & \cdots & x_{p,2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{1,n} & x_{2,n} & \cdots & x_{p,n} \end{pmatrix},$$

$$\beta = (\beta_0, \beta_1, \beta_2, \ldots, \beta_p)^t, \qquad \epsilon = (\epsilon_1, \epsilon_2, \ldots, \epsilon_n)^t. \qquad (3)$$
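As a concrete illustration (not part of the original notes), a minimal numpy simulation of the model (2)-(3); the dimensions $n = 100$, $p = 2$ and all parameter values are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)

n, p = 100, 2                          # n observations, p independent variables
X = np.column_stack([np.ones(n),       # leading column of ones (intercept)
                     rng.normal(size=(n, p))])
beta = np.array([1.0, 2.0, -0.5])      # (beta_0, beta_1, beta_2), arbitrary
sigma = 0.3                            # residual standard deviation

eps = rng.normal(scale=sigma, size=n)  # residuals
y = X @ beta + eps                     # the model y = X beta + epsilon
```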
0.1.2 Parameter Estimation
"
2 #
1
1 y i Xi
fi =
exp
.
2
(4)
The likelihood of the sample is the product

$$L(\beta, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{y_i - X_i\beta}{\sigma}\right)^2\right] = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\left[-\frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(y_i - X_i\beta\right)^2\right]. \qquad (5)$$
The log-likelihood is therefore

$$l(\beta, \sigma^2) = \ln L(\beta, \sigma^2) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln(\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(y_i - X_i\beta\right)^2.$$

Differentiating with respect to $\beta$ gives the score function

$$\frac{\partial l}{\partial \beta} = -\frac{1}{2\sigma^2}\,\frac{\partial}{\partial \beta}\left[y'y - 2\beta'X'y + \beta'X'X\beta\right] = -\frac{1}{2\sigma^2}\left(-2X'y + 2X'X\beta\right) = \frac{1}{\sigma^2}\left(X'y - X'X\beta\right). \qquad (6)$$
To find the maximum likelihood estimator, set the score function equal to zero:

$$\frac{1}{\sigma^2}\left(X'y - X'X\hat{\beta}\right) = 0 \qquad (7)$$

$$X'X\hat{\beta} = X'y \qquad (8)$$

$$\hat{\beta} = (X'X)^{-1}X'y. \qquad (9)$$
Similarly, differentiating the log-likelihood with respect to $\sigma^2$ and setting the result to zero,

$$\frac{\partial l}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^{n}\left(y_i - X_i\beta\right)^2 = 0 \qquad (10)$$

$$\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - X_i\hat{\beta}\right)^2. \qquad (11)$$
An unbiased alternative divides by the degrees of freedom $n - p$:

$$s^2 = \frac{(y - X\hat{\beta})'(y - X\hat{\beta})}{n - p}. \qquad (12)$$
This parameter quantifies the variation of the residuals about the estimated regression line ($\hat{y} = X\hat{\beta}$). The residual mean square is an unbiased estimator of $\sigma^2$ (assuming that the proposed model is correct).
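Continuing the simulation above, a sketch of the estimator (9) and the residual mean square (12); `np.linalg.solve` is applied to the normal equations (8) rather than forming $(X'X)^{-1}$ explicitly, which is algebraically identical but numerically more stable:

```python
# Solve the normal equations X'X beta_hat = X'y (equation (8))
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Residual mean square (12): divide by n minus the number of
# estimated parameters (here the intercept plus p slopes)
resid = y - X @ beta_hat
s2 = (resid @ resid) / (n - X.shape[1])
```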
0.1.3 Unbiased Estimator
Under the assumption that $E(\epsilon) = 0$, it can be shown that the OLS estimator ($\hat{\beta}_{OLS}$) is an unbiased estimator:

$$E(\hat{\beta}) = E\left((X'X)^{-1}X'y\right) = (X'X)^{-1}X'E(y) \qquad (13)$$

$$= (X'X)^{-1}X'X\beta = \beta, \qquad (14)$$

since $E(y) = X\beta$ when $E(\epsilon) = 0$.
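A quick Monte Carlo check of the unbiasedness result (13)-(14), continuing the snippet above: holding $X$ and $\beta$ fixed and redrawing the residuals, the average of $\hat{\beta}$ should approach $\beta$:

```python
# Refit on many datasets that share X and beta but have fresh residuals;
# by (13)-(14) the mean of the estimates should be close to beta
reps = 2000
estimates = np.empty((reps, X.shape[1]))
for r in range(reps):
    y_sim = X @ beta + rng.normal(scale=sigma, size=n)
    estimates[r] = np.linalg.solve(X.T @ X, X.T @ y_sim)

print(estimates.mean(axis=0))  # close to [1.0, 2.0, -0.5]
```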
0.1.4 Properties of the OLS Estimator
The first four assumptions (Section 0.1.1) are needed for any estimation
of the regression parameters. Assumptions 5 to 7 give the OLS estimator
other favorable qualities.
If the expected value of the residuals is always zero (Assumption 5), then
the OLS estimator is unbiased (as shown above).
If the residuals have homogeneous variance (Assumption 6), then the OLS
estimator has the minimum variance of all linear unbiased estimators
(BLUE) by the Gauss-Markov Theorem.
Finally, if the residuals are also Normally distributed (Assumption 7),
then t and F tests can be used in inference. It should be noted that
the bias and dispersion properties of the estimator do not depend on the
Normality of the residuals.
However, if all seven of the assumptions are satisfied, the OLS estimator
has the uniformly minimum variance of all unbiased estimators (UMVU).
Of the seven OLS assumptions, the two which most often cause issues in practice are the requirements of homogeneous variance and of uncorrelated residuals.
The third assumption listed above (Section 0.1.1) can also raise issues in practical applications: multicollinearity in the data can cause serious problems (Section 0.1.5). OLS estimation does not allow for correlation within the residuals. Other approaches (discussed later) do not require the assumption of homogeneous, independent, Normal residuals.
0.1.5 Multicollinearity

Assumption 3 requires that the independent variables are not too strongly collinear. Multicollinearity is an issue often raised in multiple regression since it prohibits accurate statistical inference. This condition occurs when there are near-linear relationships between the independent variables which make up the columns of $X$. In mathematical terms, it implies that there exists a set of constants $d_j$ (which are not all zero) such that

$$\sum_{j=1}^{k} d_j x_j = 0. \qquad (16)$$
When such a near-linear relationship is present, the variances of the estimated coefficients are inflated and the associated $t$ statistics are driven towards zero, so that

$$P(|T| \geq c) \to 0, \qquad (17)$$

and genuinely important effects may fail to be detected as significant.
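A small sketch of this effect, continuing the earlier snippets (the scale `1e-3` is an arbitrary illustrative choice): when one column of $X$ is almost a linear combination of the others, $X'X$ becomes ill-conditioned and the diagonal of $(X'X)^{-1}$, which drives the coefficient variances, blows up:

```python
# Near-collinear design: x2 is almost an exact multiple of x1,
# i.e. 2*x1 - x2 ~ 0, a near-linear relationship as in (16)
x1 = rng.normal(size=n)
x2 = 2.0 * x1 + rng.normal(scale=1e-3, size=n)
Xc = np.column_stack([np.ones(n), x1, x2])

print(np.linalg.cond(Xc.T @ Xc))          # huge condition number
print(np.diag(np.linalg.inv(Xc.T @ Xc)))  # inflated coefficient variances
```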
0.2 Generalised Least Squares
0.2.1 Assumptions

$$y = X\beta + \epsilon \qquad (18)$$

where

$$\epsilon \sim N(0, \Sigma). \qquad (19)$$
0.2.2 Parameter Estimation

The log-likelihood contribution of the $i$th observation is

$$l_i(\beta, \Sigma) = -\frac{1}{2}\left[n\ln(2\pi) + \ln(\det\Sigma) + (y_i - X_i\beta)'\Sigma^{-1}(y_i - X_i\beta)\right] \qquad (20)$$

and the log-likelihood of the sample is

$$L(\beta, \Sigma) = \sum_{i=1}^{n} l_i(\beta, \Sigma). \qquad (21)$$
Differentiating with respect to $\beta$,

$$\frac{\partial}{\partial \beta} L(\beta, \Sigma) = \frac{\partial}{\partial \beta} \sum_{i=1}^{n} l_i(\beta, \Sigma) = \frac{\partial}{\partial \beta}\left[-\frac{1}{2}\sum_{i=1}^{n}(y_i - X_i\beta)'\Sigma^{-1}(y_i - X_i\beta)\right] = \sum_{i=1}^{n} X_i'\Sigma^{-1}(y_i - X_i\beta). \qquad (22)$$

Setting this score to zero gives the maximum likelihood estimator

$$\hat{\beta}_{MLE} = \left(\sum_{i=1}^{n} X_i'\Sigma^{-1}X_i\right)^{-1} \sum_{i=1}^{n} X_i'\Sigma^{-1}y_i. \qquad (23)$$
Estimation by Regression

The heterogeneous variance of the residuals is taken into account in the GLS estimation of the regression parameters ($\hat{\beta}$) and the standard errors of $\hat{\beta}$:

$$\hat{\beta}_{GLS} = (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}y \qquad (24)$$

and

$$V(\hat{\beta}_{GLS}) = (X'\Sigma^{-1}X)^{-1}. \qquad (25)$$
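Assuming $\Sigma$ is known and diagonal (a hypothetical heteroscedastic example, continuing the earlier snippets), a minimal sketch of (24)-(25):

```python
# Known heteroscedastic covariance: diagonal Sigma (illustrative choice)
var_i = 0.1 + rng.uniform(size=n)                  # per-observation variances
Sigma_inv = np.diag(1.0 / var_i)                   # Sigma^{-1}

y_g = X @ beta + rng.normal(scale=np.sqrt(var_i))  # heteroscedastic residuals

# GLS estimator (24) and its variance (25)
XtSiX = X.T @ Sigma_inv @ X
beta_gls = np.linalg.solve(XtSiX, X.T @ Sigma_inv @ y_g)
V_gls = np.linalg.inv(XtSiX)
```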
0.2.3 Properties of the GLS Estimator

GLS allows for heterogeneous variance in the residuals. The GLS estimator, $\hat{\beta}$, is an unbiased estimator. It is also the Maximum Likelihood estimator (under the assumption that $\epsilon \sim N(0, \Sigma)$). If the assumption of Normality is relaxed, the general Gauss-Markov theorem applies and $\hat{\beta}$ is the best (i.e. minimum variance) linear unbiased estimator.
0.3 Feasible Generalised Least Squares
0.3.1 Parameter Estimation

In practice $\Sigma$ is unknown, so GLS is made feasible by first estimating the residuals from an ordinary least squares fit:

$$\hat{\epsilon} = y - X\hat{\beta}. \qquad (26)$$

These residual values are also consistent and are used to estimate the variance-covariance matrix $\Sigma$:

$$\hat{\Sigma} = \hat{\epsilon}\hat{\epsilon}'. \qquad (27)$$

The GLS formulae (24) and (25) are then applied with $\hat{\Sigma}$ in place of $\Sigma$:

$$\hat{\beta}_{FGLS} = (X'\hat{\Sigma}^{-1}X)^{-1}X'\hat{\Sigma}^{-1}y \qquad (28)$$

and

$$V(\hat{\beta}_{FGLS}) = (X'\hat{\Sigma}^{-1}X)^{-1}. \qquad (29)$$
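A two-step sketch of the procedure, continuing the snippet above. Note that the unstructured estimate $\hat{\Sigma} = \hat{\epsilon}\hat{\epsilon}'$ of (27) has rank one, so in practice some structure must be imposed before it can be inverted; here a diagonal of squared residuals is assumed, one common choice under heteroscedasticity:

```python
# Step 1: initial OLS fit, ignoring the residual covariance
beta_ols = np.linalg.solve(X.T @ X, X.T @ y_g)
resid = y_g - X @ beta_ols              # epsilon_hat = y - X beta_hat, (26)

# Step 2: estimate Sigma from the residuals; a diagonal structure is
# imposed (squared residuals) so that Sigma_hat can be inverted
Sigma_hat_inv = np.diag(1.0 / resid**2)

# FGLS estimator (28): the GLS formula with Sigma replaced by Sigma_hat
XtSiX = X.T @ Sigma_hat_inv @ X
beta_fgls = np.linalg.solve(XtSiX, X.T @ Sigma_hat_inv @ y_g)
```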
0.3.2 Properties of the FGLS Estimator

FGLS allows for the practical application of GLS. It therefore enjoys the same advantages and suffers similar disadvantages to GLS (Section 0.2.3). FGLS has been shown to be equivalent to the Maximum Likelihood estimator in the limit, and it therefore also possesses the asymptotic properties of Maximum Likelihood.

One downside of FGLS is that while the procedure works well for large samples, little is known about its behaviour in small, finite samples.