of the regression of the first residual on the second. Notice, however, that neither of the coefficients of time in the separate trend analyses is an estimate of the β_3 shift parameter.^{11}

3.2.3 General Treatment of Partial Correlation and Multiple Regression Coefficients

Under the conditions shown in the next section, the normal equations solve for b = (X'X)^{-1}X'y. The residuals from the LS regression may then be expressed as

    e = y - Xb = y - X(X'X)^{-1}X'y = My        (3.17)

where M = I - X(X'X)^{-1}X'. It is readily seen that M is a symmetric, idempotent matrix. It also has the properties that MX = 0 and Me = e. Now write the general regression in partitioned form as

    y = \begin{bmatrix} x_2 & X_* \end{bmatrix} \begin{bmatrix} b_2 \\ b_{(2)} \end{bmatrix} + e

In this partitioning x_2 is the n × 1 vector of observations on X_2, with coefficient b_2, and X_* is the n × (k − 1) matrix of all the other variables (including the column of ones), with coefficient vector b_{(2)}.^{12} The normal equations for this setup are

    \begin{bmatrix} x_2'x_2 & x_2'X_* \\ X_*'x_2 & X_*'X_* \end{bmatrix} \begin{bmatrix} b_2 \\ b_{(2)} \end{bmatrix} = \begin{bmatrix} x_2'y \\ X_*'y \end{bmatrix}

We wish to solve for b_2 and interpret the result in terms of a simple regression slope. The solution is^{13}

    b_2 = (x_2'M_*x_2)^{-1}(x_2'M_*y)        (3.18)

where M_* = I - X_*(X_*'X_*)^{-1}X_*'

M_* is a symmetric, idempotent matrix with the properties M_*X_* = 0 and M_*e = e.

Now by analogy with Eq. (3.17) it follows that
M_*y is the vector of residuals when y is regressed on X_*
and M_*x_2 is the vector of residuals when x_2 is regressed on X_*.
Regressing the first vector on the second gives a slope coefficient, which, using the symmetry and idempotency of M_*, gives the b_2 coefficient already defined in Eq. (3.18). This general result has already been illustrated for the three-variable case.
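This partialling-out result (often called the Frisch–Waugh–Lovell theorem) is easy to check numerically. Below is a minimal NumPy sketch, with simulated data and variable names of our own choosing, confirming that the b_2 from the full regression equals the slope from regressing the residuals M_*y on the residuals M_*x_2:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50

# Design matrix: column of ones plus three regressors (simulated).
X = np.column_stack([np.ones(n), rng.standard_normal((n, 3))])
y = X @ np.array([1.0, 0.5, -2.0, 3.0]) + rng.standard_normal(n)

# Full LS fit: b = (X'X)^{-1} X'y.
b = np.linalg.solve(X.T @ X, X.T @ y)

# Partition: x2 is the second column, X_star collects all the others.
x2 = X[:, [1]]
X_star = np.delete(X, 1, axis=1)

# Residual-maker matrix M_star = I - X_star (X_star' X_star)^{-1} X_star'.
M_star = np.eye(n) - X_star @ np.linalg.solve(X_star.T @ X_star, X_star.T)

# Regress the residuals M_star y on the residuals M_star x2, as in Eq. (3.18).
e_y = M_star @ y
e_x2 = M_star @ x2
b2_partial = float((e_x2.T @ e_y) / (e_x2.T @ e_x2))

print(b[1], b2_partial)  # the two estimates coincide
```

Note that `np.linalg.solve` is used rather than forming an explicit inverse; solving the normal equations directly is the numerically preferred way to evaluate expressions of the form (X'X)^{-1}X'y.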
A simpler and more elegant way of proving the same result is as follows. Write the partitioned regression as

^{11} See Problem 3.4.

^{12} Note that this is a different use of the star subscript than in an earlier section, where it was used to indicate data in deviation form.
^{13} See Appendix 3.2.
CHAPTER 3: The k-Variable Linear Equation

X's look alike, the more imprecise is the attempt to estimate their relative effects. This situation is referred to as multicollinearity or collinearity. With perfect or exact collinearity the standard errors go to infinity. Exact collinearity means that the columns of X are linearly dependent, and so the LS vector cannot be estimated.
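A small numerical illustration of the exact case (simulated data and names are our own): when one column of X is an exact linear combination of the others, X'X loses rank and the normal equations have no unique solution.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 30
x1 = rng.standard_normal(n)

# Exact collinearity: the third column is a linear combination
# of the intercept column and x1.
X = np.column_stack([np.ones(n), x1, 2.0 * x1 - 3.0])

# X'X is rank-deficient, so b = (X'X)^{-1} X'y cannot be computed.
XtX = X.T @ X
print(np.linalg.matrix_rank(XtX))  # 2, not 3: columns are linearly dependent
print(np.linalg.cond(XtX) > 1e12)  # True: X'X is numerically singular
```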

3.4.3 Estimation of σ²

The variance-covariance matrix in Eq. (3.25) involves the disturbance variance σ², which is unknown. It is reasonable to base an estimate on the residual sum of squares from the fitted regression. Following Eq. (3.17), we write e = My = M(Xβ + u) = Mu, since MX = 0.
Thus,    E(e'e) = E(u'M'Mu) = E(u'Mu)

Utilizing the fact that the trace of a scalar is the scalar, we write

    E(u'Mu) = E[tr(u'Mu)]
            = E[tr(uu'M)]
            = σ² tr(M)
            = σ² tr(I_n) − σ² tr[X(X'X)^{-1}X']
            = σ² tr(I_n) − σ² tr[(X'X)^{-1}(X'X)]
            = σ²(n − k)

Thus,

    s² = e'e/(n − k)        (3.26)

defines an unbiased estimator of σ². The square root s is the standard deviation of the Y values about the regression plane. It is often referred to as the standard error of estimate or the standard error of the regression (SER).
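Both the trace result tr(M) = n − k and the unbiasedness of s² can be checked by simulation. The following NumPy sketch (the design matrix and parameter values are our own, with σ² = 4) averages s² over repeated samples:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 40, 3
X = np.column_stack([np.ones(n), rng.standard_normal((n, k - 1))])
beta = np.array([1.0, -1.0, 0.5])
sigma = 2.0  # so sigma^2 = 4

# Residual-maker matrix; its trace is n - k, as in the derivation above.
M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)
print(round(np.trace(M)))  # n - k = 37

# Average s^2 over many simulated samples approaches sigma^2.
s2_draws = []
for _ in range(2000):
    y = X @ beta + sigma * rng.standard_normal(n)
    e = M @ y                        # residuals e = My
    s2_draws.append((e @ e) / (n - k))
print(np.mean(s2_draws))  # close to 4.0
```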

3.4.4 Gauss-Markov Theorem

This is the fundamental least-squares theorem. It states that, conditional on the assumptions made, no other linear, unbiased estimator of the β coefficients can have smaller sampling variances than those of the least-squares estimator, given in Eq. (3.25). We prove a more general result relating to any linear combination of the β coefficients. Let c denote an arbitrary k-element vector of known constants, and define a scalar quantity μ as

    μ = c'β

If we choose c' = [0 1 0 ··· 0], then μ = β_2. Thus, we can pick out any single element in β. If we choose

    c' = [1 x_{2,n+1} x_{3,n+1} ··· x_{k,n+1}]

then μ = E(Y_{n+1})
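A brief numerical illustration of such selector vectors (the coefficient and regressor values below are made up for illustration): with c' = [0 1 0 0], c'b extracts the second coefficient, while a c' built from a new observation's regressor values yields the corresponding point prediction.

```python
import numpy as np

b = np.array([1.0, 0.5, -2.0, 3.0])  # an estimated coefficient vector (illustrative)

# c' = [0 1 0 0] picks out the second element, b2.
c = np.zeros(4)
c[1] = 1.0
print(c @ b)  # 0.5

# With c' = [1, x_{2,n+1}, ..., x_{k,n+1}], c'b is the point
# prediction for a new observation with those regressor values.
x_new = np.array([1.0, 0.2, -0.5, 1.5])
print(x_new @ b)  # 1 + 0.1 + 1.0 + 4.5 = 6.6
```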