# ECOM023 Econometrics

Lecture 7: Simultaneous Equations Models:
Identification, Estimation and Testing
R.G. Pierse
1. Introduction
So far this course has concentrated on models with a single dependent variable.
The only exception was in lecture 3 where we looked at the seemingly unrelated
regression equations (SURE) model. However, that model was essentially a set of single
equations related solely through the covariances between error terms. In this
lecture we look at systems of equations determining several dependent variables
jointly.
In economics we are often interested in the interaction of several equations,
simultaneously determining more than one variable. An example is the demand
and supply model. Here we have a demand function

$$Q = \alpha_1 + \alpha_2 P + \alpha_3 Y + u_1 \qquad (1.1)$$

where $Q$ is the quantity demanded, $P$ is the price, $Y$ is income and $u_1$ is a
disturbance term representing random shocks to demand. We expect that there
is a negative relationship between price and quantity demanded so that $\alpha_2$ should
be negative. There is also a supply function

$$P = \beta_1 + \beta_2 Q + \beta_3 W + u_2 \qquad (1.2)$$

relating price to the quantity supplied and an unspecified variable $W$ with
disturbance term $u_2$ representing shocks to supply. We expect a positive relationship
between price and quantity supplied so that $\beta_2$ should be positive.
Jointly, the demand and supply equations determine price and quantity. Note
that it is entirely arbitrary to write quantity as the dependent variable of the
demand function and price as the dependent variable of the supply function. We
could equally well have written price on the left-hand side of the demand function
and quantity on the left-hand side of the supply function, or even written both
equations with the same variable on the left-hand side. The important thing
is to distinguish the variables that are determined within the system (price and
quantity) from the variables ($Y$ and $W$) that appear in the equations but which
are assumed to be determined outside the system.
However, it is very much an assumption as to which variables can be treated
as being determined outside the system. In the context of demand and supply
for a single product it might be reasonable to treat income as outside the sys-
tem. In considering demand and supply at the aggregate level, this would not be
reasonable.
1.1. Endogenous, Exogenous and Predetermined Variables
The variables that are determined within a simultaneous equations system are
known as endogenous variables. Any other variables that appear in the system but
are determined outside it are known as exogenous variables. In the example, price
and quantity are both endogenous variables and $Y$ and $W$ are exogenous variables.
Note that there are the same number of endogenous variables as equations. When
this is the case we say that the system is complete. When there are fewer equations
than endogenous variables then the system is incomplete. When there are more
equations than endogenous variables, the system is said to be overdetermined. In
this case, one or more equations is redundant and can be dropped.
The dependent variables in a system (those that appear on the left-hand side
of an equation) are necessarily endogenous. Since they are functions of a random
disturbance term, they are also random variables. This applies equally to any
endogenous variable since it can always be written as the dependent variable of
one of the system equations. Therefore all endogenous variables must be treated
as random variables.
Furthermore, where endogenous variables appear on the right-hand side of
an equation, they will be correlated with the error term in that equation. This
follows from the simultaneity of the system which means that all equations are
determined jointly. Consider a one unit positive shock to the disturbance term
$u_1$ in the demand equation (1.1) leading to a one unit increase in quantity $Q$.
This in turn leads to an increase of $\beta_2$ in the price variable $P$ through the supply
equation (1.2). Thus $P$ is correlated with $u_1$ and this correlation is positive since
$\beta_2$ is positive. Similarly, a positive unit supply shock to $u_2$ leads to a change of
$\alpha_2$ in $Q$ through the demand equation. Since $\alpha_2$ is negative, it follows that $Q$ is
negatively correlated with $u_2$.
The exogenous variables in the system, by assumption, are independent of all
current, past and future values of the error term. This assumption is known as
strict exogeneity. It is important to stress, however, that this is just an assumption
and one that it is possible to test. Consider the case of a dynamic simultaneous
system which includes lags of the endogenous variables, such as

$$Q_t = \alpha_1 + \alpha_2 P_t + \alpha_3 Y_t + \alpha_4 Q_{t-1} + \alpha_5 P_{t-1} + u_{1t} \qquad (1.3)$$

and

$$P_t = \beta_1 + \beta_2 Q_t + \beta_3 W_t + \beta_4 P_{t-1} + \beta_5 Q_{t-1} + u_{2t} \qquad (1.4)$$

where subscript $t$ denotes the time period. As long as the error terms $u_1$ and
$u_2$ are not autocorrelated, lagged endogenous variables will be independent of all
current or future values of the error terms. Variables that satisfy this condition
are known as predetermined. Clearly, exogenous variables by assumption are also
predetermined, so that the lagged endogenous variables together with all current
and lagged exogenous variables form the set of predetermined variables.
Since the predetermined variables are independent of current and future values
of the error term, regression of the endogenous variables on the predetermined
variables alone satisfies the conditions of OLS.
2. The Simultaneous Equations Model
The general linear simultaneous equations model with $m$ equations can be written
formally as

$$B y_t + \Gamma z_t = u_t, \qquad t = 1, \dots, T \qquad (2.1)$$

where $y_t$ is an $m \times 1$ vector of observations on the $m$ current endogenous variables
at period $t$, $z_t$ is a $q \times 1$ vector of observations on the $q$ predetermined variables, $u_t$
is an $m \times 1$ vector of disturbances, $B$ is an $m \times m$ square matrix of coefficients on the
endogenous variables and $\Gamma$ is an $m \times q$ matrix of coefficients on the predetermined
variables. It is assumed that

$$E(u_t) = 0$$

and

$$\operatorname{var}(u_t) = \Sigma$$

where $\Sigma$ is a positive definite matrix. Thus disturbances in different equations in
the same time period $t$ are allowed to be correlated. However, the disturbances
are assumed not to be autocorrelated so that

$$\operatorname{cov}(u_t, u_s) = E(u_t u_s') = 0, \qquad \forall\, t \neq s.$$
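The dynamic demand–supply model (1.3) and (1.4) can be rewritten in the
form of (2.1) as

$$
\begin{bmatrix} 1 & -\alpha_2 \\ -\beta_2 & 1 \end{bmatrix}
\begin{bmatrix} Q_t \\ P_t \end{bmatrix}
+
\begin{bmatrix}
-\alpha_1 & -\alpha_4 & -\alpha_5 & -\alpha_3 & 0 \\
-\beta_1 & -\beta_5 & -\beta_4 & 0 & -\beta_3
\end{bmatrix}
\begin{bmatrix} 1 \\ Q_{t-1} \\ P_{t-1} \\ Y_t \\ W_t \end{bmatrix}
=
\begin{bmatrix} u_{1t} \\ u_{2t} \end{bmatrix}.
\qquad (2.2)
$$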
Equation (2.1) is known as the structural form of a simultaneous system. It
corresponds to the behavioural equations of the economic model and the coefficient
matrices $B$ and $\Gamma$ will typically contain zeros or other restrictions corresponding to
assumptions in the economic model. For example, in the demand–supply model,
the economic assumption is that variable $W_t$ does not affect the demand function
and that $Y_t$ does not affect the supply function so that the matrix $\Gamma$ contains two
zeros.
Assuming that the matrix $B$ in (2.1) is nonsingular, it is possible to
pre-multiply through by $B^{-1}$ giving

$$y_t = -B^{-1}\Gamma z_t + B^{-1}u_t \qquad (2.3)$$

or

$$y_t = \Pi z_t + v_t, \qquad (2.4)$$

where $\Pi = -B^{-1}\Gamma$ and $v_t = B^{-1}u_t$. This is known as the reduced form of the
system and it relates the endogenous variables $y$ solely to the predetermined
variables $z$, removing the simultaneity in the structural form. In this formulation,
the economic assumptions in the model are less obvious but are embodied in
the restriction that $\Pi = -B^{-1}\Gamma$. The reduced form disturbances $v_t$ no longer
correspond to disturbances on particular behavioural equations.
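As an illustration of the mapping from structural to reduced form, the following sketch computes $\Pi = -B^{-1}\Gamma$ for the dynamic demand–supply model. The numerical parameter values and the helper name `reduced_form` are assumptions made purely for the example; they do not come from the lecture.

```python
import numpy as np

def reduced_form(B, Gamma):
    """Return Pi = -B^{-1} Gamma for the structural form B y_t + Gamma z_t = u_t."""
    return -np.linalg.solve(B, Gamma)

# Hypothetical parameter values for the dynamic demand-supply model (1.3)-(1.4)
a = {1: 10.0, 2: -0.5, 3: 0.2, 4: 0.3, 5: 0.1}    # alpha_1 .. alpha_5
b = {1: 2.0, 2: 0.4, 3: 0.6, 4: 0.25, 5: 0.05}    # beta_1 .. beta_5

B = np.array([[1.0, -a[2]],
              [-b[2], 1.0]])
# Columns of Gamma follow z_t = (1, Q_{t-1}, P_{t-1}, Y_t, W_t)'
Gamma = np.array([[-a[1], -a[4], -a[5], -a[3], 0.0],
                  [-b[1], -b[5], -b[4], 0.0, -b[3]]])

Pi = reduced_form(B, Gamma)   # 2 x 5 matrix of reduced form coefficients
print(Pi)
```

Note that the zeros in $\Gamma$ generally do not survive in $\Pi$: every predetermined variable typically affects every endogenous variable once the simultaneity has been solved out.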
3. Identification in Simultaneous Equations
Consider pre-multiplying the simultaneous equation system (2.1) by the $m \times m$
nonsingular matrix $F$ to give

$$F B y_t + F \Gamma z_t = F u_t. \qquad (3.1)$$

The reduced form of this transformed model is

$$
\begin{aligned}
y_t &= -(FB)^{-1}F\Gamma z_t + (FB)^{-1}F u_t \\
    &= -B^{-1}F^{-1}F\Gamma z_t + B^{-1}F^{-1}F u_t \\
    &= -B^{-1}\Gamma z_t + B^{-1}u_t
\end{aligned}
$$

which is identical to the reduced form (2.3) of the original model (2.1). Considering
the observations $y_t$ as having been generated by the equation (2.3), there is clearly
a problem in determining whether the structural parameters are given by (2.1) or
by (3.1).
In general it is impossible to estimate the parameters of a simultaneous equation
system unless there are sufficient restrictions on the elements of $B$ and $\Gamma$
(or $\Sigma$) to uniquely identify the parameters of the model. This is known as the
problem of identification. Identification conditions can therefore be viewed as the
conditions under which it is possible to recover the structural form parameters
from the reduced form.
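The observational equivalence of (2.1) and (3.1) can be checked numerically. The sketch below uses arbitrary, made-up matrices $B$, $\Gamma$ and $F$ (all assumptions for illustration) and confirms that both structures imply the same reduced form coefficients.

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary illustrative structure: 2 equations, 3 predetermined variables
B = np.array([[1.0, 0.5],
              [-0.4, 1.0]])
Gamma = rng.normal(size=(2, 3))
F = np.array([[2.0, 1.0],
              [0.5, 3.0]])          # any nonsingular m x m matrix

Pi_original = -np.linalg.solve(B, Gamma)             # from (2.1)
Pi_transformed = -np.linalg.solve(F @ B, F @ Gamma)  # from (3.1)

same_reduced_form = np.allclose(Pi_original, Pi_transformed)
print(same_reduced_form)
```

Without restrictions that rule out all $F$ other than the identity, the data cannot distinguish between the two sets of structural parameters.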
3.1. Rank and Order Conditions for Identification
Identification can be ensured by restrictions involving any of the structural form
parameters $B$, $\Gamma$ and $\Sigma$. Consequently, the conditions for identification in the
most general case are complicated to state. Here we consider only the most common
form of restriction, namely zero restrictions on $B$ and $\Gamma$. These correspond to
the exclusion of some variables from particular equations and so these restrictions
are also known as exclusion restrictions.
It is possible that some equations in a model may be identified while others
are not. A model is identified only if all equations in the model are identified.
Therefore, identification needs to be checked separately for each equation in a
model.
Let the number of variables excluded from the $j$th equation be denoted by
$r_j$. Then the order condition for identification of the $j$th equation by exclusion
restrictions is that $r_j$ is greater than or equal to $m - 1$. This condition is necessary
but not sufficient. When the number of exclusion restrictions is strictly greater
than $m - 1$ then the equation is said to be over-identified whereas when $r_j = m - 1$
then the equation is said to be exactly identified. The order condition has the
advantage of being very easy to check.
A condition which is both necessary and sufficient is the rank condition for
identification of the $j$th equation. This considers the rank of the matrix formed
from the columns of the matrices $B$ and $\Gamma$ corresponding to the excluded variables
in the $j$th equation, but excluding the $j$th row. This matrix will be of dimension
$(m - 1) \times r_j$. The rank condition states that the rank of this matrix must be equal
to $m - 1$.
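The two conditions can be checked mechanically. The following is a minimal sketch, assuming that exclusion restrictions are read off as numerically zero entries of the combined coefficient matrix $[B \;\; \Gamma]$; the function name `check_identification` and the coefficient values are hypothetical.

```python
import numpy as np

def check_identification(A, j, tol=1e-10):
    """Order and rank conditions for equation j (zero-based) of A = [B Gamma].

    Assumption of this sketch: a variable is treated as excluded from
    equation j exactly when its entry in row j is (numerically) zero.
    """
    A = np.asarray(A, dtype=float)
    m = A.shape[0]
    excluded = np.flatnonzero(np.abs(A[j]) < tol)   # columns excluded from equation j
    r_j = int(excluded.size)
    order_ok = r_j >= m - 1
    # Columns of the excluded variables with row j removed: (m - 1) x r_j
    sub = np.delete(A[:, excluded], j, axis=0)
    rank = int(np.linalg.matrix_rank(sub)) if r_j > 0 else 0
    rank_ok = rank == m - 1
    if not rank_ok:
        status = "not identified"
    elif r_j == m - 1:
        status = "exactly identified"
    else:
        status = "over-identified"
    return order_ok, rank_ok, status

# Dynamic demand-supply model (2.2) with hypothetical coefficient values.
# Columns: Q_t, P_t, 1, Q_{t-1}, P_{t-1}, Y_t, W_t
A = np.array([[1.0, 0.5, -10.0, -0.3, -0.1, -0.2, 0.0],
              [-0.4, 1.0, -2.0, -0.05, -0.25, 0.0, -0.6]])
demand = check_identification(A, 0)
supply = check_identification(A, 1)
print(demand, supply)
```

A numerical check of this kind only verifies the rank condition at the chosen parameter values; as the examples in the next subsection show, the rank can drop when a particular coefficient happens to be zero.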
3.2. Some examples
Consider the two equation dynamic demand–supply model (2.2). In each equation,
a single variable is excluded ($W_t$ in the demand equation and $Y_t$ in the supply
equation). Thus both equations satisfy the order condition for identification.
Consider now the rank condition. For the first equation, the matrix to be
considered is the $1 \times 1$ matrix

$$[\,-\beta_3\,].$$

Clearly this will have rank $m - 1 = 1$ as long as $\beta_3 \neq 0$. If $\beta_3 = 0$ however, then
the equation will not be identified. Similarly, the rank condition for the second
equation is that the $1 \times 1$ matrix

$$[\,-\alpha_3\,]$$

has rank $m - 1 = 1$. Again, this will be the case except if $\alpha_3 = 0$.
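As a second example, consider a four equation IS-LM model based on Stewart
(1991) p 253

$$
\begin{aligned}
C_t &= -\beta_{11} - \alpha_{14} Y_t + u_{1t} \\
I_t &= -\beta_{21} - \alpha_{23} R_t - \alpha_{24} Y_t + u_{2t} \\
R_t &= -\alpha_{34} Y_t - \beta_{32} M_t + u_{3t} \\
Y_t &= C_t + I_t + Z_t
\end{aligned}
$$

where $C_t$ is consumption, $I_t$ is investment, $R_t$ is the rate of interest, $Y_t$ is income,
$M_t$ is the money stock and $Z_t$ is autonomous expenditure. This model can be
rewritten in the form (2.1) as

$$
\begin{bmatrix}
1 & 0 & 0 & \alpha_{14} \\
0 & 1 & \alpha_{23} & \alpha_{24} \\
0 & 0 & 1 & \alpha_{34} \\
-1 & -1 & 0 & 1
\end{bmatrix}
\begin{bmatrix} C_t \\ I_t \\ R_t \\ Y_t \end{bmatrix}
+
\begin{bmatrix}
\beta_{11} & 0 & 0 \\
\beta_{21} & 0 & 0 \\
0 & \beta_{32} & 0 \\
0 & 0 & -1
\end{bmatrix}
\begin{bmatrix} 1 \\ M_t \\ Z_t \end{bmatrix}
= u_t.
$$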
Consider the first equation. The number of excluded regressors is $4 > m - 1 = 3$
so that the order condition is satisfied. The rank condition is based on the rank
of the matrix

$$
\begin{bmatrix}
1 & \alpha_{23} & 0 & 0 \\
0 & 1 & \beta_{32} & 0 \\
-1 & 0 & 0 & -1
\end{bmatrix}
$$

which has rank 3 even if both parameters $\alpha_{23}$ and $\beta_{32}$ are equal to zero. This
equation is over-identified.
For the second equation, the number of excluded regressors is 3 and the rank
condition is based on the matrix

$$
\begin{bmatrix}
1 & 0 & 0 \\
0 & \beta_{32} & 0 \\
-1 & 0 & -1
\end{bmatrix}
$$

which has rank 3 as long as the parameter $\beta_{32}$ is not equal to zero. In this case,
the equation is exactly identified. If however, $\beta_{32} = 0$ then it is clear that the
rank of this matrix is only 2 in which case the equation is not identified.
The third equation has four exclusion restrictions and the rank condition
depends on the matrix

$$
\begin{bmatrix}
1 & 0 & \beta_{11} & 0 \\
0 & 1 & \beta_{21} & 0 \\
-1 & -1 & 0 & -1
\end{bmatrix}
$$

which has rank 3 so that the rank condition is satisfied. The equation is
over-identified.
Finally, the fourth equation has 3 exclusion restrictions and the rank condition
depends on the matrix

$$
\begin{bmatrix}
0 & \beta_{11} & 0 \\
\alpha_{23} & \beta_{21} & 0 \\
1 & 0 & \beta_{32}
\end{bmatrix}
$$

which has rank 3 as long as all coefficients are non-zero. Then the equation is
exactly identified. However, if $\beta_{32} = 0$ then the matrix only has rank 2 so that
the equation is then not identified.
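The ranks quoted for the IS-LM example can be confirmed numerically. In the sketch below the coefficient values passed to the hypothetical helper `islm_rank_matrices` are invented for illustration; only the pattern of zeros comes from the model above.

```python
import numpy as np

def islm_rank_matrices(a23, b11, b21, b32):
    """Rank-condition matrices for the four IS-LM equations (rows of excluded columns)."""
    eq1 = np.array([[1, a23, 0, 0], [0, 1, b32, 0], [-1, 0, 0, -1]], dtype=float)
    eq2 = np.array([[1, 0, 0], [0, b32, 0], [-1, 0, -1]], dtype=float)
    eq3 = np.array([[1, 0, b11, 0], [0, 1, b21, 0], [-1, -1, 0, -1]], dtype=float)
    eq4 = np.array([[0, b11, 0], [a23, b21, 0], [1, 0, b32]], dtype=float)
    return [eq1, eq2, eq3, eq4]

def ranks(mats):
    return [int(np.linalg.matrix_rank(M)) for M in mats]

# Hypothetical non-zero coefficients: every equation satisfies the rank condition
ranks_generic = ranks(islm_rank_matrices(a23=0.5, b11=-20.0, b21=-5.0, b32=0.3))

# Same values but with beta_32 = 0: equations 2 and 4 lose identification
ranks_b32_zero = ranks(islm_rank_matrices(a23=0.5, b11=-20.0, b21=-5.0, b32=0.0))

print(ranks_generic, ranks_b32_zero)
```

With $\beta_{32} = 0$ the money stock drops out of the system, and the second and fourth matrices fall to rank 2, matching the discussion above.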
In practice, it cannot be known whether any of the parameters have true value
zero. Thus in specifying a model, only the order condition can be guaranteed to
hold by construction. It is still possible that the model may not be identified
through failure of the rank condition. If this happens, then estimation will break
down.
4. Estimation: Single Equation Methods
In the estimation of simultaneous equations systems there are two basic
approaches. The first is to consider the estimation of each equation in isolation.
This approach ignores the information about the covariances between the
equations given by the covariance matrix $\Sigma$ and information about the exclusion
restrictions on all other equations. Consequently, this approach is called a limited
information approach. The second approach estimates the complete system jointly,
taking into account all the identifying restrictions in the model and the covariance
information provided by $\Sigma$. Since this approach uses all available information, it
is known as the full information approach.
When the model is correctly specified, then full information estimation is more
efficient than limited information estimation. However, because of its system
nature, any mistakes made in the specification of one equation will affect the
estimates of all the equations. Consequently, if there is uncertainty about the
specification of the system, there may be an argument in favour of using limited
information methods. They also have the advantage of being computationally
much cheaper to implement.
Consider the estimation of the $j$th equation from the system (2.1). This
equation can be written as

$$y_{jt} = x_j'\delta_j + u_{jt}, \qquad t = 1, \dots, T$$

where the $k \times 1$ vector $x_j$ represents all the variables in $y_t$ and $z_t$ that have
unrestricted coefficients. Note that the number of regressors $k$ is equal to
$m - 1 + q - r_j$ and the order condition for identification ensures that $k \leq q$.
Stacking all the $T$ observations together we can write

$$y_j = X_j\delta_j + u_j$$

where $y_j$ is a $T \times 1$ vector, $X_j$ is a $T \times k$ matrix, and $u_j$ is a $T \times 1$ vector satisfying
$E(u_j) = 0$ and $\operatorname{var}(u_j) = \sigma_{jj} I_T$.
5. Indirect Least Squares
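Consider the two equation demand and supply model

$$Q_t = \alpha_1 + \alpha_2 P_t + \alpha_3 Y_t + u_{1t}$$

and

$$P_t = \beta_1 + \beta_2 Q_t + \beta_3 W_t + \beta_4 R_t + u_{2t}$$

which differs from the model (1.1) and (1.2) in that the supply equation now
includes an extra exogenous variable $R$. In matrix form the model can be written
as

$$
\begin{bmatrix} 1 & -\alpha_2 \\ -\beta_2 & 1 \end{bmatrix}
\begin{bmatrix} Q_t \\ P_t \end{bmatrix}
+
\begin{bmatrix}
-\alpha_1 & -\alpha_3 & 0 & 0 \\
-\beta_1 & 0 & -\beta_3 & -\beta_4
\end{bmatrix}
\begin{bmatrix} 1 \\ Y_t \\ W_t \\ R_t \end{bmatrix}
=
\begin{bmatrix} u_{1t} \\ u_{2t} \end{bmatrix}
$$

and by inspection it can be verified that the first equation is overidentified while
the second equation is exactly identified.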
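The reduced form of the system is given by

$$
\begin{bmatrix} Q_t \\ P_t \end{bmatrix}
=
\begin{bmatrix} 1 & -\alpha_2 \\ -\beta_2 & 1 \end{bmatrix}^{-1}
\begin{bmatrix}
\alpha_1 & \alpha_3 & 0 & 0 \\
\beta_1 & 0 & \beta_3 & \beta_4
\end{bmatrix}
\begin{bmatrix} 1 \\ Y_t \\ W_t \\ R_t \end{bmatrix}
+
\begin{bmatrix} v_{1t} \\ v_{2t} \end{bmatrix}
=
\begin{bmatrix}
\pi_{11} & \pi_{12} & \pi_{13} & \pi_{14} \\
\pi_{21} & \pi_{22} & \pi_{23} & \pi_{24}
\end{bmatrix}
\begin{bmatrix} 1 \\ Y_t \\ W_t \\ R_t \end{bmatrix}
+
\begin{bmatrix} v_{1t} \\ v_{2t} \end{bmatrix}
$$

where $\pi_{ij}$ represent the reduced form parameters.
Noting that

$$
\begin{bmatrix} 1 & -\alpha_2 \\ -\beta_2 & 1 \end{bmatrix}^{-1}
=
\frac{1}{(1 - \alpha_2\beta_2)}
\begin{bmatrix} 1 & \alpha_2 \\ \beta_2 & 1 \end{bmatrix}
$$

we have the parameter correspondence

$$
\begin{bmatrix}
\pi_{11} & \pi_{12} & \pi_{13} & \pi_{14} \\
\pi_{21} & \pi_{22} & \pi_{23} & \pi_{24}
\end{bmatrix}
=
\frac{1}{(1 - \alpha_2\beta_2)}
\begin{bmatrix}
\alpha_1 + \alpha_2\beta_1 & \alpha_3 & \alpha_2\beta_3 & \alpha_2\beta_4 \\
\beta_1 + \alpha_1\beta_2 & \alpha_3\beta_2 & \beta_3 & \beta_4
\end{bmatrix}.
$$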
Can the structural form parameters $\alpha_i$ and $\beta_i$ be uniquely recovered from the
reduced form parameters $\pi_{ij}$? For the parameters of the second equation, the
answer is yes with the results

$$
\beta_1 = \pi_{21} - \pi_{11}\frac{\pi_{22}}{\pi_{12}}; \qquad
\beta_2 = \frac{\pi_{22}}{\pi_{12}}
$$

$$
\beta_3 = \pi_{23} - \pi_{13}\frac{\pi_{22}}{\pi_{12}}; \qquad
\beta_4 = \pi_{24} - \pi_{14}\frac{\pi_{22}}{\pi_{12}}.
$$

This unique correspondence is the result of the fact that the second equation is
exactly identified.
For the first equation, there is no unique way to recover all the structural
parameters since, for example

$$\alpha_2 = \frac{\pi_{13}}{\pi_{23}} = \frac{\pi_{14}}{\pi_{24}}$$

so that there are two alternative ways of defining $\alpha_2$. This is the meaning of the
fact that the first equation is overidentified.
For an exactly identified equation, one possible way of estimating the
parameters is to estimate the unrestricted reduced form parameters by OLS on the
reduced form equations

$$y_j = Z_j\pi_j + u_j, \qquad j = 1, \dots, m$$

where $Z_j$ is the matrix of predetermined variables appearing in the $j$th equation.
The structural parameters can then be recovered from $\hat{\pi}_j$ using the relationships

$$
\hat{\beta}_1 = \hat{\pi}_{21} - \hat{\pi}_{11}\frac{\hat{\pi}_{22}}{\hat{\pi}_{12}}; \qquad
\hat{\beta}_2 = \frac{\hat{\pi}_{22}}{\hat{\pi}_{12}}
$$

$$
\hat{\beta}_3 = \hat{\pi}_{23} - \hat{\pi}_{13}\frac{\hat{\pi}_{22}}{\hat{\pi}_{12}}; \qquad
\hat{\beta}_4 = \hat{\pi}_{24} - \hat{\pi}_{14}\frac{\hat{\pi}_{22}}{\hat{\pi}_{12}}.
$$

This is known as indirect least squares. It can be shown that the indirect least
squares (ILS) estimator is an instrumental variables (IV) estimator.
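The recovery formulae can be verified with a small numerical sketch. The structural values below are invented for illustration, and the helper names `structural_to_reduced` and `supply_from_reduced` are hypothetical. Here the exact reduced form matrix is used; in practice $\Pi$ would be replaced by OLS estimates $\hat{\Pi}$.

```python
import numpy as np

def structural_to_reduced(alpha, beta):
    """Pi = -B^{-1} Gamma for the demand-supply model with z_t = (1, Y_t, W_t, R_t)'."""
    a1, a2, a3 = alpha
    b1, b2, b3, b4 = beta
    B = np.array([[1.0, -a2], [-b2, 1.0]])
    Gamma = np.array([[-a1, -a3, 0.0, 0.0],
                      [-b1, 0.0, -b3, -b4]])
    return -np.linalg.solve(B, Gamma)

def supply_from_reduced(Pi):
    """Indirect least squares mapping for the exactly identified supply equation."""
    ratio = Pi[1, 1] / Pi[0, 1]            # pi_22 / pi_12 = beta_2
    b1 = Pi[1, 0] - Pi[0, 0] * ratio
    b3 = Pi[1, 2] - Pi[0, 2] * ratio
    b4 = Pi[1, 3] - Pi[0, 3] * ratio
    return np.array([b1, ratio, b3, b4])

# Hypothetical structural parameters
Pi = structural_to_reduced(alpha=(10.0, -0.5, 0.2), beta=(2.0, 0.4, 0.6, 0.3))
beta_recovered = supply_from_reduced(Pi)

# Two routes to alpha_2 in the overidentified demand equation
alpha2_candidates = (Pi[0, 2] / Pi[1, 2], Pi[0, 3] / Pi[1, 3])
print(beta_recovered, alpha2_candidates)
```

With the exact $\Pi$ the two candidate values for $\alpha_2$ coincide. With unrestricted estimates $\hat{\Pi}$ they would generally differ, which is why indirect least squares does not give a unique answer for an overidentified equation.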
5.1. The Method of Instrumental Variables
We have seen that the endogenous variables on the right-hand side of an equation
from a simultaneous equation system are correlated with the disturbance term.
This violates the fundamental assumption of the regression model that regressors
and error term are uncorrelated, or formally that

$$E(X'u) = 0$$

or, in large samples, that

$$\operatorname{plim} \frac{X'u}{n} = 0.$$

Consequently, since this assumption is violated, OLS estimates will be both biased
and inconsistent.
Consider the model

$$y = X\beta + u$$

where there are $k$ regressors and

$$E(X'u) \neq 0.$$

Suppose that we can find a set of $k$ variables $W$ satisfying the condition that

$$E(W'u) = 0$$

and

$$\operatorname{plim} \frac{W'u}{n} = 0$$

where the variables $W$ are correlated with $X$, then the variables $W$ are called
instruments for $X$ and the estimator

$$\tilde{\beta} = (W'X)^{-1}W'y \qquad (5.1)$$

is called an instrumental variables or IV estimator.
Note that

$$\tilde{\beta} = (W'X)^{-1}W'y = \beta + (W'X)^{-1}W'u$$

so that, as long as

$$\operatorname{plim}\left(\frac{W'X}{n}\right) = Q,$$

where $Q$ is a finite nonsingular matrix, then

$$
\operatorname{plim}(\tilde{\beta}) = \beta
+ \operatorname{plim}\left(\frac{W'X}{n}\right)^{-1}
\operatorname{plim}\left(\frac{W'u}{n}\right) = \beta
$$

so that the IV estimator is consistent.
Sometimes there may be more instruments $W$ available than there are
regressors $X$. In this case we can take a subset of any $k$ columns from $W$. This will
be a valid set of instruments. However, the higher the correlation between the
instrument set and the regressors $X$ the better, and so a better strategy is to use
the $T \times k$ linear combination of the instruments

$$\hat{X} = W(W'W)^{-1}W'X.$$

Note that $\hat{X}$ can be interpreted as the fitted values from a regression of $X$ on
the set of instruments $W$. This is the linear combination that maximises the
correlation with $X$ and is hence the best combination of instruments.
The IV estimator is then given by

$$
\begin{aligned}
\tilde{\beta} &= (\hat{X}'X)^{-1}\hat{X}'y = (\hat{X}'\hat{X})^{-1}\hat{X}'y \\
&= (X'W(W'W)^{-1}W'X)^{-1}X'W(W'W)^{-1}W'y \qquad (5.2)
\end{aligned}
$$

This estimator is known as the Two Stage Least Squares or 2SLS estimator. This
can be regarded as a generalised IV estimator or GIVE estimator since, when the
number of instruments and regressors is the same, it collapses to the standard
form (5.1).
In general, the problem with IV estimation is to find a set of variables that
satisfy the conditions for being valid instruments.
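The estimators (5.1) and (5.2) translate directly into matrix code. The sketch below uses simulated data with one endogenous regressor; the data-generating values and the function names `iv_estimator` and `tsls_estimator` are assumptions chosen only for illustration.

```python
import numpy as np

def iv_estimator(W, X, y):
    """Simple IV estimator (5.1): requires as many instruments as regressors."""
    return np.linalg.solve(W.T @ X, W.T @ y)

def tsls_estimator(W, X, y):
    """2SLS / generalised IV estimator (5.2)."""
    X_hat = W @ np.linalg.solve(W.T @ W, W.T @ X)   # fitted values from X on W
    return np.linalg.solve(X_hat.T @ X, X_hat.T @ y)

# Simulated example (hypothetical design)
rng = np.random.default_rng(1)
T = 500
W = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])   # constant + 2 instruments
u = rng.normal(size=T)
x = W[:, 1] + 0.5 * W[:, 2] + 0.8 * u + rng.normal(size=T)   # correlated with u
X = np.column_stack([np.ones(T), x])
y = X @ np.array([1.0, 2.0]) + u

beta_2sls = tsls_estimator(W, X, y)        # 3 instruments, 2 regressors

# With exactly k instruments, 2SLS collapses to the simple IV estimator
W_sub = W[:, :2]
beta_iv = iv_estimator(W_sub, X, y)
beta_2sls_sub = tsls_estimator(W_sub, X, y)
print(beta_2sls, beta_iv, beta_2sls_sub)
```

The last two estimates agree up to rounding error, illustrating the statement that (5.2) reduces to (5.1) when the number of instruments equals the number of regressors.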
5.2. Two Stage Least Squares Estimator
In the simultaneous equations context of estimating the equation

$$y_j = X_j\delta_j + u_j$$

there is a ready set of valid instruments available. This is the $T \times q$ matrix of
predetermined variables $Z$ from the system (2.1) defined by

$$Z = \begin{bmatrix} z_1' \\ \vdots \\ z_T' \end{bmatrix}.$$

These variables comprise lagged dependent variables plus exogenous variables that
by assumption satisfy the condition that

$$E(Z'u_j) = 0, \qquad j = 1, \dots, m$$

and

$$\operatorname{plim} \frac{Z'u_j}{T} = 0.$$

In general, $q > k_j$ so that there are more instruments than regressors.
Hence the application of IV leads to the 2SLS estimator

$$\tilde{\delta}_j = (X_j'Z(Z'Z)^{-1}Z'X_j)^{-1}X_j'Z(Z'Z)^{-1}Z'y_j \qquad (5.3)$$
This estimator can be interpreted as a two stage estimation procedure. In the
first stage, the regressors $X_j$ are regressed on the set of instruments $Z$. This is an
estimation of the parameters of the $j$th equation of the reduced form (2.3). Then
$y_j$ is regressed on the fitted values from this regression

$$\hat{X}_j = Z(Z'Z)^{-1}Z'X_j = P_z X_j$$

where $P_z = Z(Z'Z)^{-1}Z'$ is an idempotent projection matrix, to give the second
stage estimates

$$\tilde{\delta}_j = (\hat{X}_j'\hat{X}_j)^{-1}\hat{X}_j'y_j$$

which is formally identical to (5.3). The $\hat{X}_j$ are sometimes called constructed
regressors. They are defined so that, by construction, they are not correlated
with the error term $u_j$. In this interpretation of 2SLS, the original regressors are
replaced by the constructed regressors in the second stage of estimation. Although
this was the original rationale for the 2SLS estimator that was invented by Theil
(1958), it is generally more helpful to think of 2SLS as the instrumental variables
estimator

$$\tilde{\delta}_j = (\hat{X}_j'X_j)^{-1}\hat{X}_j'y_j$$

where the original regressors are not replaced but are instrumented by $\hat{X}_j$, even
though formally, the two expressions are identical in this case.
The variance of the 2SLS estimator is given by

$$\operatorname{var}(\tilde{\delta}_j) = \sigma_{jj}(X_j'Z(Z'Z)^{-1}Z'X_j)^{-1}$$

and a consistent estimator of $\sigma_{jj}$ can be obtained from the 2SLS residuals

$$e = y_j - X_j\tilde{\delta}_j \qquad (5.4)$$

using the expression

$$\tilde{\sigma}_{jj} = \frac{e'e}{T - k}.$$

The degrees of freedom correction in the denominator of this expression does not
affect the consistency. Note that the 2SLS residuals (5.4) are not the same as the
residuals obtained from the second stage of the two stage estimation procedure,
which would be given by

$$\hat{e} = y_j - \hat{X}_j\tilde{\delta}_j.$$

This is one reason why it can be unhelpful to think of 2SLS as a two stage
estimation procedure, since it would lead to an incorrect expression for the estimated
error variance $\tilde{\sigma}_{jj}$.
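The distinction between the two sets of residuals is easy to get wrong in code. The following sketch, on simulated data with invented parameter values and a hypothetical function name `tsls_with_variance`, computes the 2SLS estimate together with its estimated covariance matrix using the residuals (5.4), and also returns the second stage residuals for comparison.

```python
import numpy as np

def tsls_with_variance(Z, X, y):
    """2SLS estimate, estimated covariance matrix, and both residual vectors."""
    T, k = X.shape
    X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)      # P_z X
    delta = np.linalg.solve(X_hat.T @ X, X_hat.T @ y)
    e = y - X @ delta                   # 2SLS residuals (5.4): original regressors
    e_second_stage = y - X_hat @ delta  # second stage residuals: NOT used for sigma_jj
    sigma_jj = (e @ e) / (T - k)
    cov = sigma_jj * np.linalg.inv(X_hat.T @ X_hat)    # X_hat'X_hat = X'P_z X
    return delta, cov, e, e_second_stage

# Simulated example (hypothetical design)
rng = np.random.default_rng(2)
T = 500
Z = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])
u = rng.normal(size=T)
x = Z[:, 1] + 0.5 * Z[:, 2] + 0.8 * u + rng.normal(size=T)
X = np.column_stack([np.ones(T), x])
y = X @ np.array([1.0, 2.0]) + u

delta, cov, e, e_second_stage = tsls_with_variance(Z, X, y)
std_errors = np.sqrt(np.diag(cov))
print(delta, std_errors)
```

Running OLS of $y_j$ on $\hat{X}_j$ in a standard regression routine would report standard errors based on `e_second_stage`, which is the incorrect expression warned about above.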
In the special case of an exactly identified equation where $q = k_j$, then the
estimator (5.3) collapses to

$$\tilde{\delta}_j = (Z'X_j)^{-1}Z'y_j.$$

This is equivalent to the indirect least squares (ILS) estimator.
On the assumption that the plims of the moment matrices

$$\operatorname{plim} \frac{Z'Z}{T} = Q_{ZZ} \quad \text{and} \quad \operatorname{plim} \frac{X_j'Z}{T} = Q_{XZ}$$

exist and are finite, it can be shown that the 2SLS estimator is consistent since

$$
\begin{aligned}
\tilde{\delta}_j &= (X_j'Z(Z'Z)^{-1}Z'X_j)^{-1}X_j'Z(Z'Z)^{-1}Z'(X_j\delta_j + u_j) \\
&= \delta_j + (X_j'Z(Z'Z)^{-1}Z'X_j)^{-1}X_j'Z(Z'Z)^{-1}Z'u_j
\end{aligned}
$$

and

$$
\operatorname{plim} \tilde{\delta}_j = \delta_j +
\left[\operatorname{plim} \frac{X_j'Z}{T}\left(\frac{Z'Z}{T}\right)^{-1}\frac{Z'X_j}{T}\right]^{-1}
\operatorname{plim}\left[\frac{X_j'Z}{T}\left(\frac{Z'Z}{T}\right)^{-1}\frac{Z'u_j}{T}\right]
= \delta_j.
$$

Finally, we can apply a central limit theorem

$$T^{-1/2} Z'u_j \overset{a}{\sim} N(0, \sigma_{jj}Q_{ZZ})$$

to show that $\tilde{\delta}_j$ is asymptotically normally distributed with

$$\sqrt{T}(\tilde{\delta}_j - \delta_j) \overset{a}{\sim} N(0, \sigma_{jj}(Q_{XZ}Q_{ZZ}^{-1}Q_{ZX})^{-1}). \qquad (5.5)$$

This result can be used as the basis for hypothesis testing in the simultaneous
equations model, using the same large sample principles as those considered for
the classical model.
5.3. Limited Information Maximum Likelihood
The maximum likelihood principle can also be used to construct an estimator in
the simultaneous equations system using limited information. This estimator is
known as the LIML estimator. The details will not be presented here. However,
it can be shown that the LIML estimator is asymptotically equivalent to the 2SLS
estimator and so has the same asymptotic distribution (5.5).
5.4. Testing Overidentifying Restrictions
When an equation is overidentified, there are more restrictions on its parameters
than are necessary to identify it. Therefore it is possible to construct a test of
these extra restrictions. Such a test was developed by Sargan (1964). It is based
on the quantity

$$e'P_z e$$

where $e = y_j - X_j\tilde{\delta}_j$ is the vector of equation residuals (2SLS or LIML) and
$P_z = Z(Z'Z)^{-1}Z'$ is the instrument projection matrix. The test for overidentifying
restrictions is given by

$$\frac{e'P_z e}{\tilde{\sigma}_{jj}} \overset{a}{\sim} \chi^2_{q - k_j}$$

and is asymptotically distributed as a chi-squared variate with degrees of freedom
$q - k_j$ which is the degree of overidentification of the equation. This test is
sometimes called the Sargan validity of instruments test.
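A minimal sketch of the Sargan statistic based on 2SLS residuals follows. The simulated data, the degrees of freedom correction in $\tilde{\sigma}_{jj}$ and the function name `sargan_statistic` are assumptions of the example rather than part of the lecture.

```python
import numpy as np

def sargan_statistic(Z, X, y):
    """Return (e'P_z e / sigma_jj, q - k) using 2SLS residuals e."""
    T, k = X.shape
    q = Z.shape[1]
    X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
    delta = np.linalg.solve(X_hat.T @ X, X_hat.T @ y)
    e = y - X @ delta                                  # 2SLS residuals (5.4)
    Pz_e = Z @ np.linalg.solve(Z.T @ Z, Z.T @ e)       # projection of e on instruments
    sigma_jj = (e @ e) / (T - k)
    return (e @ Pz_e) / sigma_jj, q - k

# Simulated example (hypothetical design) with valid instruments
rng = np.random.default_rng(5)
T = 500
Z = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])
u = rng.normal(size=T)
x = Z[:, 1] + 0.5 * Z[:, 2] + 0.8 * u + rng.normal(size=T)
X = np.column_stack([np.ones(T), x])
y = X @ np.array([1.0, 2.0]) + u

stat_over, df_over = sargan_statistic(Z, X, y)            # q = 3, k = 2
stat_exact, df_exact = sargan_statistic(Z[:, :2], X, y)   # q = k = 2
print(stat_over, df_over, stat_exact, df_exact)
```

In the exactly identified case the residuals are orthogonal to the instruments by construction, so the statistic is zero up to rounding error and there is nothing to test.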
5.5. Testing Exogeneity
Consider the equation

$$y_j = X_j\delta_j + u_j = Y_1\beta_j + Z_1\gamma_j + u_j \qquad (5.6)$$

where the regressors $X_j$ have been partitioned into the set of $p$ current endogenous
variables $Y_1$ and the $k_j - p$ predetermined variables $Z_1$ appearing in the $j$th
equation. The special estimation methods for simultaneous equations are needed
because the assumption of exogeneity that

$$E(Y_1'u_j) = 0$$

will not be expected to hold. If however, this condition did hold, then OLS
estimation would be appropriate. It is possible to devise a test of the exogeneity
of the regressors $Y_1$. This is known as the Wu-Hausman test for exogeneity
and was developed independently by Wu (1973) and Hausman (1978). The test
is computed by first estimating the equation (5.6) by OLS and computing the
residual sum of squares $S_0$. Then the equation is re-estimated by OLS including
as additional regressors the instrumental variables

$$\hat{Y}_1 = Z(Z'Z)^{-1}Z'Y_1$$

to give residual sum of squares $S_1$. Then the test statistic is given by

$$\frac{S_0 - S_1}{\hat{\sigma}^2} \overset{a}{\sim} \chi^2_p$$

where $\hat{\sigma}^2$ is the estimated error variance from the first estimation. The test
statistic is asymptotically distributed as chi-squared with degrees of freedom equal
to $p$, the number of columns of $Y_1$ and so the number of potentially endogenous
regressors. Rejection of the null hypothesis of exogeneity would show that IV
estimation is needed.
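The steps of the test can be sketched as follows. The simulated data and the function names `ols_rss` and `wu_hausman` are hypothetical, and dividing $S_0$ by $T$ minus the number of regressors to obtain $\hat{\sigma}^2$ is an assumption of this sketch; the text only specifies that the variance comes from the first estimation.

```python
import numpy as np

def ols_rss(X, y):
    """Residual sum of squares from an OLS regression of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return resid @ resid

def wu_hausman(Y1, Z1, Z, y):
    """Return ((S0 - S1) / sigma2, p) for the equation y = Y1 b + Z1 g + u."""
    T = y.shape[0]
    X = np.column_stack([Y1, Z1])
    S0 = ols_rss(X, y)                                   # first OLS regression
    Y1_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ Y1)      # fitted endogenous regressors
    S1 = ols_rss(np.column_stack([X, Y1_hat]), y)        # augmented regression
    sigma2 = S0 / (T - X.shape[1])                       # variance from first estimation
    return (S0 - S1) / sigma2, Y1.shape[1]

# Simulated example (hypothetical design) with an endogenous regressor
rng = np.random.default_rng(6)
T = 500
Z = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])
u = rng.normal(size=T)
x = Z[:, 1] + 0.5 * Z[:, 2] + 0.8 * u + rng.normal(size=T)
y = 1.0 + 2.0 * x + u

stat, df = wu_hausman(Y1=x[:, None], Z1=Z[:, :1], Z=Z, y=y)
print(stat, df)
```

The statistic would be compared with a chi-squared critical value with `df` degrees of freedom; a large value indicates that the regressors in $Y_1$ should not be treated as exogenous.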
6. Estimation: System Methods
System methods of estimation in simultaneous equation systems use all the
information in the model to estimate the parameters of all equations jointly. They will
be more efficient than single equation methods but are liable to the problem that
misspecification of any one equation will affect the estimates in all the equations.
These methods are more costly in computational terms than single equation
methods and may not be feasible when the instrument set is large. However, many
econometric packages such as TSP, E-Views and Pc-FIML offer these estimation
techniques.
6.1. Three Stage Least Squares
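Consider stacking all the equations of the model

$$y_j = X_j\delta_j + u_j, \qquad j = 1, \dots, m$$

to form the stacked equation

$$
\begin{bmatrix} y_1 \\ \vdots \\ y_m \end{bmatrix}
=
\begin{bmatrix}
X_1 & 0 & 0 \\
0 & \ddots & 0 \\
0 & 0 & X_m
\end{bmatrix}
\begin{bmatrix} \delta_1 \\ \vdots \\ \delta_m \end{bmatrix}
+
\begin{bmatrix} u_1 \\ \vdots \\ u_m \end{bmatrix}
\qquad (6.1)
$$

or

$$y = X\delta + u$$

where

$$
\operatorname{var}(u) =
\begin{bmatrix}
\sigma_{11}I_T & \cdots & \sigma_{1m}I_T \\
\vdots & \ddots & \vdots \\
\sigma_{m1}I_T & \cdots & \sigma_{mm}I_T
\end{bmatrix}
= \Sigma \otimes I_T \qquad (6.2)
$$

The stacked system (6.1) has a non-scalar variance covariance matrix (6.2). It
also has the problem that the regressors $X$ are correlated with the error term $u$.
The solution is to apply a combination of instrumental variables estimation
and generalised least squares to correct these two problems. The instrument set
is the matrix

$$
\begin{bmatrix}
\hat{X}_1 & 0 & 0 \\
0 & \ddots & 0 \\
0 & 0 & \hat{X}_m
\end{bmatrix}
=
\begin{bmatrix}
P_z X_1 & 0 & 0 \\
0 & \ddots & 0 \\
0 & 0 & P_z X_m
\end{bmatrix}
= (I_m \otimes P_z)X \qquad (6.3)
$$

where $P_z = Z(Z'Z)^{-1}Z'$ is the instrument projection matrix.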
Applying both GLS using (6.2) and IV using instrument set (6.3) results in
the Three Stage Least Squares (3SLS) Estimator of Zellner and Theil (1962)

$$
\begin{aligned}
\tilde{\delta} &= \left[X'(I_m \otimes P_z)'(\Sigma \otimes I_T)^{-1}(I_m \otimes P_z)X\right]^{-1}
X'(I_m \otimes P_z)'(\Sigma \otimes I_T)^{-1}y \\
&= \left[X'(\Sigma^{-1} \otimes P_z)X\right]^{-1}X'(\Sigma^{-1} \otimes P_z)y. \qquad (6.4)
\end{aligned}
$$

In practice the unknown covariance matrix $\Sigma$ needs to be replaced by a consistent
estimator $\hat{\Sigma}$. Such an estimator can be based on the expression

$$\hat{\sigma}_{ij} = \frac{e_i'e_j}{T}$$

where $e_j$ is the vector of residuals

$$e_j = y_j - X_j\tilde{\delta}_j$$

from 2SLS regression on the $j$th equation.
Note that in the case where $\Sigma$ is a diagonal matrix, 3SLS estimation is identical
to 2SLS estimation on each equation. This is also the case if every equation is
exactly identified. The reason is that in these cases there is no informational gain
in considering all the equations together.
The 3SLS estimator is consistent and asymptotically efficient in the class of
full information models with

$$
\sqrt{T}(\tilde{\delta} - \delta) \overset{a}{\sim}
N\left(0, \left[\operatorname{plim} \frac{1}{T}X'(\Sigma^{-1} \otimes P_z)X\right]^{-1}\right). \qquad (6.5)
$$
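Expression (6.4) can be implemented directly with a Kronecker product when the sample is small. The sketch below simulates the two equation demand–supply model of section 5 with invented parameter values; the function names `two_sls` and `three_sls` are hypothetical, and forming the full $mT \times mT$ matrix is chosen for clarity rather than efficiency.

```python
import numpy as np

def two_sls(Z, X, y):
    """Single equation 2SLS estimator (5.3)."""
    X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
    return np.linalg.solve(X_hat.T @ X, X_hat.T @ y)

def three_sls(Z, X_list, y_list, Sigma):
    """3SLS estimator (6.4) for a given covariance matrix Sigma."""
    T, m = Z.shape[0], len(X_list)
    P_z = Z @ np.linalg.solve(Z.T @ Z, Z.T)
    widths = [X_j.shape[1] for X_j in X_list]
    X = np.zeros((m * T, sum(widths)))          # block diagonal regressor matrix
    start = 0
    for j, (X_j, k_j) in enumerate(zip(X_list, widths)):
        X[j * T:(j + 1) * T, start:start + k_j] = X_j
        start += k_j
    y = np.concatenate(y_list)
    A = np.kron(np.linalg.inv(Sigma), P_z)      # Sigma^{-1} (x) P_z
    return np.linalg.solve(X.T @ A @ X, X.T @ A @ y)

# Simulated demand-supply data (hypothetical parameter values)
rng = np.random.default_rng(3)
T = 200
Z = np.column_stack([np.ones(T), rng.normal(size=(T, 3))])   # 1, Y_t, W_t, R_t
B = np.array([[1.0, 0.5], [-0.4, 1.0]])
Gamma = np.array([[-10.0, -0.2, 0.0, 0.0], [-2.0, 0.0, -0.6, -0.3]])
U = rng.normal(size=(T, 2))
endog = (U - Z @ Gamma.T) @ np.linalg.inv(B).T   # rows solve B y_t + Gamma z_t = u_t
Q, P = endog[:, 0], endog[:, 1]
X_list = [np.column_stack([Z[:, 0], P, Z[:, 1]]),            # demand: 1, P_t, Y_t
          np.column_stack([Z[:, 0], Q, Z[:, 2], Z[:, 3]])]   # supply: 1, Q_t, W_t, R_t
y_list = [Q, P]

# Stages 1-2: 2SLS on each equation and residual covariance estimate
delta_2sls = [two_sls(Z, X_j, y_j) for X_j, y_j in zip(X_list, y_list)]
resid = np.column_stack([y_j - X_j @ d for X_j, y_j, d in zip(X_list, y_list, delta_2sls)])
Sigma_hat = resid.T @ resid / T

# Stage 3: system estimator, plus the diagonal-Sigma special case
delta_3sls = three_sls(Z, X_list, y_list, Sigma_hat)
delta_diag = three_sls(Z, X_list, y_list, np.diag(np.diag(Sigma_hat)))
print(delta_3sls)
```

Setting the off-diagonal elements of $\hat{\Sigma}$ to zero reproduces the equation by equation 2SLS estimates, in line with the remark above about a diagonal $\Sigma$.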
6.2. Full Information Maximum Likelihood
The maximum likelihood principle can also be used to construct an estimator
in the simultaneous equations system using full information. This estimator is
known as the Full Information Maximum Likelihood or FIML estimator.
Consider again the general linear simultaneous equations model with $m$
equations

$$B y_t + \Gamma z_t = u_t, \qquad t = 1, \dots, T$$

where it is now assumed that $u_t$ is distributed independently normally as

$$u_t \sim IN(0, \Sigma).$$

The probability density function for $u_t$ is given by

$$f(u_t) = (2\pi)^{-m/2}|\Sigma|^{-1/2}\exp\left(-\frac{1}{2}u_t'\Sigma^{-1}u_t\right)$$

and the likelihood function for $y_t$ is given by

$$
\begin{aligned}
L(y_t) &= \left|\frac{\partial u_t}{\partial y_t}\right| f(u_t) = \|B\|\, f(u_t) \\
&= (2\pi)^{-m/2}|\Sigma|^{-1/2}\,\|B\|\,\exp\left(-\frac{1}{2}u_t'\Sigma^{-1}u_t\right),
\end{aligned}
$$

where $\|\cdot\|$ denotes the absolute value of the determinant. $|\partial u_t / \partial y_t|$ is called the
Jacobian of the transformation from $u_t$ to $y_t$.
The likelihood of the whole sample is therefore given by

$$
\begin{aligned}
L(B, \Gamma, \Sigma; y, z) &= \prod_{t=1}^{T} L(y_t) \\
&= (2\pi)^{-mT/2}|\Sigma|^{-T/2}\,\|B\|^{T}\exp\left(-\frac{1}{2}\sum_{t=1}^{T}u_t'\Sigma^{-1}u_t\right)
\end{aligned}
$$

The FIML estimator is derived by maximising this likelihood function numerically
with respect to the unknown parameters $B$, $\Gamma$ and $\Sigma$, taking into account all the
identifying restrictions imposed on these matrices.
It can be shown that the FIML estimator is asymptotically equivalent to the
3SLS estimator and so has the same asymptotic distribution (6.5). FIML has the
advantage over 3SLS that all parameters are estimated jointly whereas in 3SLS,
$\Sigma$ is pre-estimated from 2SLS residuals. On the other hand, it requires iterative
numerical optimisation and so is computationally more costly than 3SLS.
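The objective that a FIML routine would maximise is the logarithm of the sample likelihood above. The following is a minimal sketch of that log-likelihood only; the function name `fiml_loglik` and the toy inputs are assumptions, and the numerical optimisation over the restricted parameters is not shown.

```python
import numpy as np

def fiml_loglik(B, Gamma, Sigma, Y, Z):
    """Log-likelihood of the sample for B y_t + Gamma z_t = u_t, u_t ~ IN(0, Sigma).

    Y is T x m (rows y_t'), Z is T x q (rows z_t').
    """
    T, m = Y.shape
    U = Y @ B.T + Z @ Gamma.T                      # rows are u_t'
    _, logdet_Sigma = np.linalg.slogdet(Sigma)     # log |Sigma|
    _, logabsdet_B = np.linalg.slogdet(B)          # log of absolute value of det(B)
    quad = np.sum(U * np.linalg.solve(Sigma, U.T).T)   # sum_t u_t' Sigma^{-1} u_t
    return (-0.5 * m * T * np.log(2.0 * np.pi)
            - 0.5 * T * logdet_Sigma
            + T * logabsdet_B
            - 0.5 * quad)

# Toy inputs (hypothetical): identity structure, so u_t = y_t
rng = np.random.default_rng(4)
Y = rng.normal(size=(50, 2))
Z = np.ones((50, 1))
I2 = np.eye(2)
Gamma0 = np.zeros((2, 1))
loglik = fiml_loglik(I2, Gamma0, I2, Y, Z)
print(loglik)
```

The term `T * logabsdet_B` is the Jacobian contribution. It is what distinguishes this likelihood from that of a set of seemingly unrelated regressions, and omitting it would ignore the simultaneity of the system.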
References
[1] Hausman, J. (1978), ‘Specification tests in econometrics’, Econometrica, 46,
1251–71.
[2] Sargan, J.D. (1964), ‘Wages and prices in the United Kingdom: a study in
econometric methodology’, in D.F. Hendry and K.F. Wallis (eds.), Economet-
rics and Quantitative Economics, Basil Blackwell, Oxford, 1984.
[3] Stewart, J. (1991), Econometrics, Philip Allan, Hemel Hempstead.
[4] Theil, H. (1958), Economic Forecasts and Policy, North-Holland, Amsterdam.
[5] Wu, D-M. (1973), ‘Alternative tests of independence between stochastic re-
gressors and disturbances’, Econometrica, 41, 733–750.
[6] Zellner, A. and H. Theil (1962), ‘Three-stage least squares: simultaneous es-
timation of simultaneous equations’, Econometrica, 30, 54–78.
19

demand function and price as the dependent variable of the supply function. We could equally well have written price on the left-hand side of the demand function and quantity on the left-hand side of the supply function, or even written both equations with the same variable on the left-hand side. The important thing is to distinguish the variables that are determined within the system (price and quantity) from the variables ($Y$ and $W$) that appear in the equations but which are assumed to be determined outside the system.

However, it is very much an assumption as to which variables can be treated as being determined outside the system. In the context of demand and supply for a single product it might be reasonable to treat income as outside the system. In considering demand and supply at the aggregate level, this would not be reasonable.

1.1. Endogenous, Exogenous and Predetermined Variables

The variables that are determined within a simultaneous equations system are known as endogenous variables. Any other variables that appear in the system but are determined outside it are known as exogenous variables. In the example, price and quantity are both endogenous variables and $Y$ and $W$ are exogenous variables. Note that there are the same number of endogenous variables as equations. When this is the case we say that the system is complete. When there are fewer equations than endogenous variables then the system is incomplete. When there are more equations than endogenous variables, the system is said to be overdetermined. In this case, one or more equations are redundant and can be dropped.

The dependent variables in a system (those that appear on the left-hand side of an equation) are necessarily endogenous. Since they are functions of a random disturbance term, they are also random variables. This applies equally to any endogenous variable since it can always be written as the dependent variable of one of the system equations. Therefore all endogenous variables must be treated as random variables. Furthermore, where endogenous variables appear on the right-hand side of an equation, they will be correlated with the error term in that equation. This follows from the simultaneity of the system which means that all equations are determined jointly.

Consider a one unit positive shock to the disturbance term $u_1$ in the demand equation (1.1), leading to a one unit increase in quantity $Q$. This in turn leads to an increase of $\beta_2$ in the price variable $P$ through the supply equation (1.2). Thus $P$ is correlated with $u_1$ and this correlation is positive since $\beta_2$ is positive. Similarly, a positive unit supply shock to $u_2$ leads to an increase of $\alpha_2$ in $Q$ through the demand equation. Since $\alpha_2$ is negative, it follows that $Q$ is negatively correlated with $u_2$.

The exogenous variables in the system, by assumption, are independent of all current, past and future values of the error term. This assumption is known as strict exogeneity. It is important to stress, however, that this is just an assumption and one that it is possible to test.

Consider the case of a dynamic simultaneous system which includes lags of the endogenous variables, such as

$$Q_t = \alpha_1 + \alpha_2 P_t + \alpha_3 Y_t + \alpha_4 Q_{t-1} + \alpha_5 P_{t-1} + u_{1t} \tag{1.3}$$

and

$$P_t = \beta_1 + \beta_2 Q_t + \beta_3 W_t + \beta_4 P_{t-1} + \beta_5 Q_{t-1} + u_{2t} \tag{1.4}$$

where subscript $t$ denotes the time period. As long as the error terms $u_1$ and $u_2$ are not autocorrelated, lagged endogenous variables will be independent of all current or future values of the error terms. Variables that satisfy this condition are known as predetermined. Clearly, exogenous variables by assumption are also predetermined, so that the lagged endogenous variables together with all current and lagged exogenous variables form the set of predetermined variables. Since the predetermined variables are independent of current and future values of the error term, regression of the endogenous variables on the predetermined variables alone satisfies the conditions of OLS.

2. The Simultaneous Equations Model

The general linear simultaneous equations model with $m$ equations can be written formally as

$$B y_t + \Gamma z_t = u_t, \qquad t = 1, \ldots, T \tag{2.1}$$

where $y_t$ is an $m \times 1$ vector of observations on the $m$ current endogenous variables at period $t$, $z_t$ is a $q \times 1$ vector of observations on the $q$ predetermined variables, $u_t$ is an $m \times 1$ vector of disturbances, $B$ is an $m \times m$ square matrix of coefficients on the endogenous variables and $\Gamma$ is an $m \times q$ matrix of coefficients on the predetermined variables. It is assumed that $E(u_t) = 0$ and $\mathrm{var}(u_t) = \Sigma$.

Here $\Sigma$ is a positive definite matrix. Thus disturbances in different equations in the same time period $t$ are allowed to be correlated. However, the disturbances are assumed not to be autocorrelated so that

$$\mathrm{cov}(u_t, u_s) = E(u_t u_s') = 0, \qquad \forall\, t \neq s.$$

The dynamic demand-supply model (1.3) and (1.4) can be rewritten in the form of (2.1) as

$$\begin{bmatrix} 1 & -\alpha_2 \\ -\beta_2 & 1 \end{bmatrix} \begin{bmatrix} Q_t \\ P_t \end{bmatrix} + \begin{bmatrix} -\alpha_1 & -\alpha_4 & -\alpha_5 & -\alpha_3 & 0 \\ -\beta_1 & -\beta_5 & -\beta_4 & 0 & -\beta_3 \end{bmatrix} \begin{bmatrix} 1 \\ Q_{t-1} \\ P_{t-1} \\ Y_t \\ W_t \end{bmatrix} = \begin{bmatrix} u_{1t} \\ u_{2t} \end{bmatrix}. \tag{2.2}$$

Equation (2.1) is known as the structural form of a simultaneous system. It corresponds to the behavioural equations of the economic model and the coefficient matrices $B$ and $\Gamma$ will typically contain zeros or other restrictions corresponding to assumptions in the economic model. For example, in the demand-supply model, the economic assumption is that variable $W_t$ does not affect the demand function and that $Y_t$ does not affect the supply function so that the matrix $\Gamma$ contains two zeros.

Assuming that the matrix $B$ in (2.1) is nonsingular, it is possible to premultiply through by $B^{-1}$ giving

$$y_t = -B^{-1}\Gamma z_t + B^{-1} u_t \tag{2.3}$$

or

$$y_t = \Pi z_t + v_t \tag{2.4}$$

where $\Pi = -B^{-1}\Gamma$ and $v_t = B^{-1} u_t$. This is known as the reduced form of the system and it relates the endogenous variables $y$ solely to the predetermined variables $z$, removing the simultaneity in the structural form. In this formulation, the economic assumptions in the model are less obvious but are embodied in the restriction that $\Pi = -B^{-1}\Gamma$. The reduced form disturbances $v_t$ no longer correspond to disturbances on particular behavioural equations.
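The mapping from structural form to reduced form can be sketched numerically. The following is a minimal illustration, assuming purely hypothetical parameter values for the dynamic demand-supply model (they are not taken from the lecture); it builds $B$ and $\Gamma$ as in (2.2) and computes $\Pi = -B^{-1}\Gamma$.

```python
import numpy as np

# Hypothetical parameter values, chosen only for illustration.
a1, a2, a3, a4, a5 = 10.0, -0.8, 0.5, 0.3, 0.1   # demand equation (1.3)
b1, b2, b3, b4, b5 = 2.0, 0.6, 0.4, 0.2, 0.1     # supply equation (1.4)

# Structural form (2.1): B y_t + Gamma z_t = u_t, with
# y_t = (Q_t, P_t)' and z_t = (1, Q_{t-1}, P_{t-1}, Y_t, W_t)'.
B = np.array([[1.0, -a2],
              [-b2, 1.0]])
Gamma = np.array([[-a1, -a4, -a5, -a3, 0.0],
                  [-b1, -b5, -b4, 0.0, -b3]])

# Reduced form coefficients Pi = -B^{-1} Gamma.
# np.linalg.solve avoids forming the inverse explicitly.
Pi = -np.linalg.solve(B, Gamma)
print(Pi)
```

Although $\Gamma$ contains two zeros, every element of $\Pi$ is non-zero for these values: the exclusion restrictions are no longer visible in the reduced form.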

3. Identification in Simultaneous Equations

Consider pre-multiplying the simultaneous equation system (2.1) by the $m \times m$ nonsingular matrix $F$ to give

$$FB y_t + F\Gamma z_t = F u_t. \tag{3.1}$$

The reduced form of this transformed model is

$$\begin{aligned} y_t &= -(FB)^{-1} F\Gamma z_t + (FB)^{-1} F u_t \\ &= -B^{-1} F^{-1} F \Gamma z_t + B^{-1} F^{-1} F u_t \\ &= -B^{-1}\Gamma z_t + B^{-1} u_t \end{aligned}$$

which is identical to the reduced form (2.3) of the original model (2.1). Considering the observations $y_t$ as having been generated by the equation (2.3), there is clearly a problem in determining whether the structural parameters are given by (2.1) or by (3.1). This is known as the problem of identification. In general it is impossible to estimate the parameters of a simultaneous equation system unless there are sufficient restrictions on the elements of $B$ and $\Gamma$ (or $\Sigma$) to uniquely identify the parameters of the model. Identification conditions can therefore be viewed as the conditions under which it is possible to recover the structural form parameters from the reduced form.

3.1. Rank and Order Conditions for Identification

Identification can be ensured by restrictions involving any of the structural form parameters $B$, $\Gamma$ and $\Sigma$. Consequently, the conditions for identification in the most general case are complicated to state. Here we consider only the most common form of restriction, namely zero restrictions on $B$ and $\Gamma$. These correspond to the exclusion of some variables from particular equations and so these restrictions are also known as exclusion restrictions.

It is possible that some equations in a model may be identified while others are not. Therefore, identification needs to be checked separately for each equation in a model. A model is identified only if all equations in the model are identified.

Let the number of variables excluded from the $j$th equation be denoted by $r_j$. Then the order condition for identification of the $j$th equation by exclusion restrictions is that $r_j$ is greater than or equal to $m - 1$.

The order condition $r_j \geq m - 1$ is necessary but not sufficient. When the number of exclusion restrictions is strictly greater than $m - 1$ then the equation is said to be over-identified whereas when $r_j = m - 1$ then the equation is said to be exactly identified. The order condition has the advantage of being very easy to check.

A condition which is both necessary and sufficient is the rank condition for identification of the $j$th equation. This considers the rank of the matrix formed from the columns of the matrices $B$ and $\Gamma$ corresponding to the excluded variables in the $j$th equation, but excluding the $j$th row. This matrix will be of dimension $(m-1) \times r_j$. The rank condition states that the rank of this matrix must be equal to $m - 1$.

3.2. Some examples

Consider the two equation dynamic demand-supply model (2.2). In each equation, a single variable is excluded ($W_t$ in the demand equation and $Y_t$ in the second). Thus both equations satisfy the order condition for identification. Consider now the rank condition. For the first equation, the matrix to be considered is the $1 \times 1$ matrix $[-\beta_3]$. Clearly this will have rank $m - 1 = 1$ as long as $\beta_3 \neq 0$. If $\beta_3 = 0$ however, then the equation will not be identified. Similarly, the rank condition for the second equation is that the $1 \times 1$ matrix $[-\alpha_3]$ has rank $m - 1 = 1$. Again, this will be the case except if $\alpha_3 = 0$.

As a second example, consider a four equation IS-LM model based on Stewart (1991) p 253

$$\begin{aligned} C_t &= -\gamma_{11} - \beta_{14} Y_t + u_{1t} \\ I_t &= -\gamma_{21} - \beta_{23} R_t - \beta_{24} Y_t + u_{2t} \\ R_t &= -\beta_{34} Y_t - \gamma_{32} M_t + u_{3t} \\ Y_t &= C_t + I_t + Z_t \end{aligned}$$

where $C_t$ is consumption, $I_t$ is investment, $R_t$ is the rate of interest, $Y_t$ is income, $M_t$ is the money stock and $Z_t$ is autonomous expenditure.

The IS-LM model can be rewritten in the form (2.1) as

$$\begin{bmatrix} 1 & 0 & 0 & \beta_{14} \\ 0 & 1 & \beta_{23} & \beta_{24} \\ 0 & 0 & 1 & \beta_{34} \\ -1 & -1 & 0 & 1 \end{bmatrix} \begin{bmatrix} C_t \\ I_t \\ R_t \\ Y_t \end{bmatrix} + \begin{bmatrix} \gamma_{11} & 0 & 0 \\ \gamma_{21} & 0 & 0 \\ 0 & \gamma_{32} & 0 \\ 0 & 0 & -1 \end{bmatrix} \begin{bmatrix} 1 \\ M_t \\ Z_t \end{bmatrix} = u_t.$$

Consider the first equation. The number of excluded regressors is $4 > m - 1 = 3$ so that the order condition is satisfied. The rank condition is based on the rank of the matrix

$$\begin{bmatrix} 1 & \beta_{23} & 0 & 0 \\ 0 & 1 & \gamma_{32} & 0 \\ -1 & 0 & 0 & -1 \end{bmatrix}$$

which has rank 3 even if both parameters $\beta_{23}$ and $\gamma_{32}$ are equal to zero. This equation is over-identified.

For the second equation, the number of excluded regressors is 3 and the rank condition is based on the matrix

$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & \gamma_{32} & 0 \\ -1 & 0 & -1 \end{bmatrix}$$

which has rank 3 as long as the parameter $\gamma_{32}$ is not equal to zero. In this case, the equation is exactly identified. If however, $\gamma_{32} = 0$ then it is clear that the rank of this matrix is only 2 in which case the equation is not identified.

The third equation has four exclusion restrictions and the rank condition depends on the matrix

$$\begin{bmatrix} 1 & 0 & \gamma_{11} & 0 \\ 0 & 1 & \gamma_{21} & 0 \\ -1 & -1 & 0 & -1 \end{bmatrix}$$

which has rank 3, so that the rank condition is satisfied and the equation is overidentified.

Finally, the fourth equation has 3 exclusion restrictions and the rank condition depends on the matrix

$$\begin{bmatrix} 0 & \gamma_{11} & 0 \\ \beta_{23} & \gamma_{21} & 0 \\ 1 & 0 & \gamma_{32} \end{bmatrix}.$$
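The order and rank calculations for the IS-LM example can be checked numerically. The sketch below is an illustration only: the helper names `islm_coefficients` and `check_identification` and all parameter values are hypothetical, and the sign convention assumed is $By_t + \Gamma z_t = u_t$ with $y_t = (C_t, I_t, R_t, Y_t)'$ and $z_t = (1, M_t, Z_t)'$. The exclusion pattern is fixed once from the model specification, so that setting $\gamma_{32} = 0$ afterwards changes the ranks but not the list of excluded variables.

```python
import numpy as np

def islm_coefficients(b14, b23, b24, b34, g11, g21, g32):
    """Return A = [B, Gamma] for the IS-LM model written as B y_t + Gamma z_t = u_t."""
    B = np.array([[1.0, 0.0, 0.0, b14],
                  [0.0, 1.0, b23, b24],
                  [0.0, 0.0, 1.0, b34],
                  [-1.0, -1.0, 0.0, 1.0]])
    Gamma = np.array([[g11, 0.0, 0.0],
                      [g21, 0.0, 0.0],
                      [0.0, g32, 0.0],
                      [0.0, 0.0, -1.0]])
    return np.hstack([B, Gamma])

def check_identification(A, excluded, j):
    """Return (r_j, order condition holds, rank condition holds) for equation j (0-based)."""
    m = A.shape[0]
    cols = np.flatnonzero(excluded[j])        # variables excluded from equation j
    sub = np.delete(A[:, cols], j, axis=0)    # their columns, with row j removed
    r_j = cols.size
    rank_ok = int(np.linalg.matrix_rank(sub)) == m - 1
    return r_j, r_j >= m - 1, rank_ok

# Hypothetical non-zero values; they only need to be non-zero for this check.
A = islm_coefficients(0.6, 0.3, 0.2, 0.5, 1.0, 2.0, 0.4)
excluded = (A == 0.0)                         # exclusion pattern from the specification
results = [check_identification(A, excluded, j) for j in range(4)]

# Same exclusion pattern, but the true value of gamma_32 is zero.
A0 = islm_coefficients(0.6, 0.3, 0.2, 0.5, 1.0, 2.0, 0.0)
results_g32_zero = [check_identification(A0, excluded, j) for j in range(4)]

for j, (res, res0) in enumerate(zip(results, results_g32_zero), start=1):
    print(f"equation {j}: {res}   with gamma_32 = 0: {res0}")
```

With $\gamma_{32} = 0$ the order condition still holds for every equation, but the rank condition fails for the second and fourth equations.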

The matrix for the fourth equation has rank 3 as long as all coefficients are non-zero. Then the equation is exactly identified. However, if $\gamma_{32} = 0$ then the matrix only has rank 2 so that the equation is then not identified.

In practice, it cannot be known whether any of the parameters have true value zero. Thus in specifying a model, only the order condition can be guaranteed to hold by construction. It is still possible that the model may not be identified through failure of the rank condition. If this happens, then estimation will break down.

4. Estimation: Single Equation Methods

In the estimation of simultaneous equations systems there are two basic approaches. The first is to consider the estimation of each equation in isolation. This approach ignores the information about the covariances between the equations given by the covariance matrix $\Sigma$ and information about the exclusion restrictions on all other equations. Consequently, this approach is called a limited information approach. The second approach estimates the complete system jointly, taking into account all the identifying restrictions in the model and the covariance information provided by $\Sigma$. Since this approach uses all available information, it is known as the full information approach.

When the model is correctly specified, then full information estimation is more efficient than limited information estimation. However, because of its system nature, any mistakes made in the specification of one equation will affect the estimates of all the equations. Consequently, if there is uncertainty about the specification of the system, there may be an argument in favour of using limited information methods. They also have the advantage of being computationally much cheaper to implement.

Consider the estimation of the $j$th equation from the system (2.1). This equation can be written as

$$y_{jt} = x_{jt}' \beta_j + u_{jt}, \qquad t = 1, \ldots, T$$

where the $k \times 1$ vector $x_{jt}$ represents all the variables in $y_t$ and $z_t$ that have unrestricted coefficients. Note that the number of regressors $k$ is equal to $m - 1 + q - r_j$ and the order condition for identification ensures that $k \leq q$.

Stacking all the $T$ observations together we can write

$$y_j = X_j \beta_j + u_j.$$

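Here $y_j$ is a $T \times 1$ vector, $X_j$ is a $T \times k$ matrix, and $u_j$ is a $T \times 1$ vector satisfying $E(u_j) = 0$ and $\mathrm{var}(u_j) = \sigma_{jj} I_T$.

5. Indirect Least Squares

Consider the two equation demand and supply model

$$Q_t = \alpha_1 + \alpha_2 P_t + \alpha_3 Y_t + u_{1t}$$

and

$$P_t = \beta_1 + \beta_2 Q_t + \beta_3 W_t + \beta_4 R_t + u_{2t}$$

which differs from the model (1.1) and (1.2) in that the supply equation now includes an extra exogenous variable $R$. In matrix form the model can be written as

$$\begin{bmatrix} 1 & -\alpha_2 \\ -\beta_2 & 1 \end{bmatrix} \begin{bmatrix} Q_t \\ P_t \end{bmatrix} + \begin{bmatrix} -\alpha_1 & -\alpha_3 & 0 & 0 \\ -\beta_1 & 0 & -\beta_3 & -\beta_4 \end{bmatrix} \begin{bmatrix} 1 \\ Y_t \\ W_t \\ R_t \end{bmatrix} = \begin{bmatrix} u_{1t} \\ u_{2t} \end{bmatrix}$$

and by inspection it can be verified that the first equation is overidentified while the second equation is exactly identified. The reduced form of the system is given by

$$\begin{aligned} \begin{bmatrix} Q_t \\ P_t \end{bmatrix} &= \begin{bmatrix} 1 & -\alpha_2 \\ -\beta_2 & 1 \end{bmatrix}^{-1} \begin{bmatrix} \alpha_1 & \alpha_3 & 0 & 0 \\ \beta_1 & 0 & \beta_3 & \beta_4 \end{bmatrix} \begin{bmatrix} 1 \\ Y_t \\ W_t \\ R_t \end{bmatrix} + \begin{bmatrix} v_{1t} \\ v_{2t} \end{bmatrix} \\ &= \begin{bmatrix} \pi_{11} & \pi_{12} & \pi_{13} & \pi_{14} \\ \pi_{21} & \pi_{22} & \pi_{23} & \pi_{24} \end{bmatrix} \begin{bmatrix} 1 \\ Y_t \\ W_t \\ R_t \end{bmatrix} + \begin{bmatrix} v_{1t} \\ v_{2t} \end{bmatrix} \end{aligned}$$

where $\pi_{ij}$ represent the reduced form parameters. Note that

$$\begin{bmatrix} 1 & -\alpha_2 \\ -\beta_2 & 1 \end{bmatrix}^{-1} = \frac{1}{1 - \alpha_2 \beta_2} \begin{bmatrix} 1 & \alpha_2 \\ \beta_2 & 1 \end{bmatrix}.$$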

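The parameter correspondence between the reduced form and the structural form is

$$\begin{bmatrix} \pi_{11} & \pi_{12} & \pi_{13} & \pi_{14} \\ \pi_{21} & \pi_{22} & \pi_{23} & \pi_{24} \end{bmatrix} = \frac{1}{1 - \alpha_2 \beta_2} \begin{bmatrix} \alpha_1 + \alpha_2 \beta_1 & \alpha_3 & \alpha_2 \beta_3 & \alpha_2 \beta_4 \\ \beta_1 + \beta_2 \alpha_1 & \beta_2 \alpha_3 & \beta_3 & \beta_4 \end{bmatrix}.$$

Can the structural form parameters $\alpha_i$ and $\beta_i$ be uniquely recovered from the reduced form parameters $\pi_{ij}$? For the parameters of the second equation, the answer is yes with the results

$$\beta_1 = \pi_{21} - \pi_{11}\pi_{22}/\pi_{12}, \qquad \beta_2 = \pi_{22}/\pi_{12},$$

$$\beta_3 = \pi_{23} - \pi_{13}\pi_{22}/\pi_{12}, \qquad \beta_4 = \pi_{24} - \pi_{14}\pi_{22}/\pi_{12}.$$

This unique correspondence is the result of the fact that the second equation is exactly identified. For the first equation, there is no unique way to recover all the structural parameters since, for example

$$\alpha_2 = \pi_{13}/\pi_{23} = \pi_{14}/\pi_{24}$$

so that there are two alternative ways of defining $\alpha_2$. This is the meaning of the fact that the first equation is overidentified.

For an exactly identified equation, one possible way of estimating the parameters is to estimate the unrestricted reduced form parameters by OLS on the reduced form equations

$$y_j = Z_j \pi_j + v_j, \qquad j = 1, \ldots, m$$

where $Z_j$ is the matrix of predetermined variables appearing in the $j$th reduced form equation. The structural parameters can then be recovered from $\hat{\pi}_j$ using the relationships

$$\hat{\beta}_1 = \hat{\pi}_{21} - \hat{\pi}_{11}\hat{\pi}_{22}/\hat{\pi}_{12}, \qquad \hat{\beta}_2 = \hat{\pi}_{22}/\hat{\pi}_{12},$$

$$\hat{\beta}_3 = \hat{\pi}_{23} - \hat{\pi}_{13}\hat{\pi}_{22}/\hat{\pi}_{12}, \qquad \hat{\beta}_4 = \hat{\pi}_{24} - \hat{\pi}_{14}\hat{\pi}_{22}/\hat{\pi}_{12}.$$

This is known as indirect least squares. It can be shown that the indirect least squares (ILS) estimator is an instrumental variables (IV) estimator.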
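The recovery formulas above can be verified with a small numerical sketch. The parameter values are hypothetical and serve only as an illustration; the code forms the population reduced form matrix $\Pi$ for the model with the extra supply variable $R$, recovers the supply parameters, and computes the two alternative definitions of $\alpha_2$. A perturbed copy of $\Pi$ stands in for an estimated reduced form (an assumption made here purely to mimic sampling error).

```python
import numpy as np

# Hypothetical structural parameters, for illustration only.
a1, a2, a3 = 10.0, -0.8, 0.5            # demand: Q = a1 + a2 P + a3 Y + u1
b1, b2, b3, b4 = 2.0, 0.6, 0.4, 0.3     # supply: P = b1 + b2 Q + b3 W + b4 R + u2

B = np.array([[1.0, -a2],
              [-b2, 1.0]])
C = np.array([[a1, a3, 0.0, 0.0],       # C = -Gamma, ordered as (1, Y, W, R)
              [b1, 0.0, b3, b4]])
Pi = np.linalg.solve(B, C)              # Pi[i-1, j-1] corresponds to pi_ij

def recover_supply(p):
    """Recover (beta_1, ..., beta_4) of the exactly identified supply equation."""
    beta2 = p[1, 1] / p[0, 1]
    return np.array([p[1, 0] - p[0, 0] * beta2,
                     beta2,
                     p[1, 2] - p[0, 2] * beta2,
                     p[1, 3] - p[0, 3] * beta2])

def alpha2_candidates(p):
    """The two alternative definitions of alpha_2 for the overidentified demand equation."""
    return p[0, 2] / p[1, 2], p[0, 3] / p[1, 3]

beta_recovered = recover_supply(Pi)
alpha2_a, alpha2_b = alpha2_candidates(Pi)

# Stand-in for an estimated reduced form: population values plus small noise.
rng = np.random.default_rng(0)
Pi_hat = Pi + 0.01 * rng.standard_normal(Pi.shape)
alpha2_hat_a, alpha2_hat_b = alpha2_candidates(Pi_hat)

print(beta_recovered, alpha2_a, alpha2_b)
print(alpha2_hat_a, alpha2_hat_b)
```

At the population values both ratios equal $\alpha_2$, whereas with an unrestricted estimate of $\Pi$ the two ratios generally differ, which is why indirect least squares is only well defined for an exactly identified equation.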

5.1. The Method of Instrumental Variables

We have seen that the endogenous variables on the right-hand side of an equation from a simultaneous equation system are correlated with the disturbance term. This violates the fundamental assumption of the regression model that regressors and error term are uncorrelated, or formally that

$$E(X'u) = 0$$

or, in large samples, that

$$\operatorname{plim} \frac{X'u}{n} = 0.$$

Consequently, since this assumption is violated, OLS estimates will be both biased and inconsistent.

Consider the model

$$y = X\beta + u$$

where there are $k$ regressors and $E(X'u) \neq 0$. Suppose that we can find a set of $k$ variables $W$ satisfying the condition that

$$E(W'u) = 0 \quad \text{and} \quad \operatorname{plim} \frac{W'u}{n} = 0$$

where the variables $W$ are correlated with $X$. Then the variables $W$ are called instruments for $X$ and the estimator

$$\tilde{\beta} = (W'X)^{-1} W'y \tag{5.1}$$

is called an instrumental variables or IV estimator. Note that

$$\tilde{\beta} = (W'X)^{-1} W'y = \beta + (W'X)^{-1} W'u$$

and suppose that

$$\operatorname{plim} \frac{W'X}{n} = Q, \qquad 0 < |Q| < \infty.$$

Then

$$\operatorname{plim}(\tilde{\beta}) = \beta + \left(\operatorname{plim} \frac{W'X}{n}\right)^{-1} \operatorname{plim} \frac{W'u}{n} = \beta$$

so that the IV estimator is consistent.

In general, the problem with IV estimation is to find a set of variables that satisfy the conditions for being valid instruments. Sometimes there may be more instruments $W$ available than there are regressors $X$. In this case we can take a subset of any $k$ columns from $W$. This will be a valid set of instruments. However, the higher the correlation between the instrument set and the regressors $X$ the better, and so a better strategy is to use the $T \times k$ linear combination of the instruments

$$\hat{X} = W(W'W)^{-1} W'X.$$

Note that $\hat{X}$ can be interpreted as the fitted values from a regression of $X$ on the set of instruments $W$. This is the linear combination that maximises the correlation with $X$ and is hence the best combination of instruments. The IV estimator is then given by

$$\tilde{\beta} = (\hat{X}'X)^{-1} \hat{X}'y = \left(X'W(W'W)^{-1}W'X\right)^{-1} X'W(W'W)^{-1}W'y = (\hat{X}'\hat{X})^{-1} \hat{X}'y. \tag{5.2}$$

This estimator is known as the Two Stage Least Squares or 2SLS estimator. This can be regarded as a generalised IV estimator or GIVE estimator since, when the number of instruments and regressors is the same, it collapses to the standard form (5.1).

5.2. Two Stage Least Squares Estimator

In the simultaneous equations context of estimating the equation

$$y_j = X_j \beta_j + u_j$$

there is a ready set of valid instruments available. This is the $T \times q$ matrix of predetermined variables $Z$ from the system (2.1) defined by

$$Z = \begin{bmatrix} z_1' \\ \vdots \\ z_T' \end{bmatrix}.$$

These variables comprise lagged endogenous variables plus exogenous variables that by assumption satisfy the condition that

$$E(Z'u_j) = 0, \qquad j = 1, \ldots, m$$

and

$$\operatorname{plim} \frac{Z'u_j}{T} = 0.$$

In general, $q > k_j$ so that there are more instruments than regressors. Hence the application of IV leads to the 2SLS estimator

$$\tilde{\beta}_j = \left(X_j'Z(Z'Z)^{-1}Z'X_j\right)^{-1} X_j'Z(Z'Z)^{-1}Z'y_j. \tag{5.3}$$

This estimator can be interpreted as a two stage estimation procedure. In the first stage, the regressors $X_j$ are regressed on the set of instruments $Z$. This is an estimation of the parameters of the $j$th equation of the reduced form (2.3). Then $y_j$ is regressed on the fitted values from this regression

$$\hat{X}_j = Z(Z'Z)^{-1}Z'X_j = P_z X_j$$

where $P_z = Z(Z'Z)^{-1}Z'$ is an idempotent projection matrix, to give the second stage estimates

$$\tilde{\beta}_j = (\hat{X}_j'\hat{X}_j)^{-1} \hat{X}_j' y_j$$

which is formally identical to (5.3). The $\hat{X}_j$ are sometimes called constructed regressors. They are defined so that, by construction, they are not correlated with the error term $u_j$. In this interpretation of 2SLS, the original regressors are replaced by the constructed regressors in the second stage of estimation. Although this was the original rationale for the 2SLS estimator that was invented by Theil (1958), it is generally more helpful to think of 2SLS as the instrumental variables estimator

$$\tilde{\beta}_j = (\hat{X}_j'X_j)^{-1} \hat{X}_j' y_j$$

where the original regressors are not replaced but are instrumented by $\hat{X}_j$, even though formally, the two expressions are identical in this case.

The variance of the 2SLS estimator is given by

$$\mathrm{var}(\tilde{\beta}_j) = \sigma_{jj} \left(X_j'Z(Z'Z)^{-1}Z'X_j\right)^{-1}$$

and a consistent estimator of $\sigma_{jj}$ can be obtained from the 2SLS residuals

$$e = y_j - X_j \tilde{\beta}_j. \tag{5.4}$$

The estimator of $\sigma_{jj}$ is given by the expression

$$\tilde{\sigma}_{jj} = \frac{e'e}{T - k}.$$

The degrees of freedom correction in the denominator of this expression does not affect the consistency. Note that the 2SLS residuals (5.4) are not the same as the residuals obtained from the second stage of the two stage estimation procedure, which would be given by

$$\hat{e} = y_j - \hat{X}_j \tilde{\beta}_j.$$

This is one reason why it can be unhelpful to think of 2SLS as a two stage estimation procedure, since it would lead to an incorrect expression for the estimated error variance $\tilde{\sigma}_{jj}$.

In the special case of an exactly identified equation where $q = k_j$, then the estimator (5.3) collapses to

$$\tilde{\beta}_j = (Z'X_j)^{-1} Z'y_j.$$

This is equivalent to the indirect least squares (ILS) estimator.

On the assumption that the plims of the moment matrices

$$\operatorname{plim} \frac{Z'Z}{T} = Q_{ZZ} \quad \text{and} \quad \operatorname{plim} \frac{X_j'Z}{T} = Q_{XZ}$$

exist and are finite, it can be shown that the 2SLS estimator is consistent since

$$\begin{aligned} \tilde{\beta}_j &= \left(X_j'Z(Z'Z)^{-1}Z'X_j\right)^{-1} X_j'Z(Z'Z)^{-1}Z'(X_j\beta_j + u_j) \\ &= \beta_j + \left(X_j'Z(Z'Z)^{-1}Z'X_j\right)^{-1} X_j'Z(Z'Z)^{-1}Z'u_j \end{aligned}$$

and

$$\operatorname{plim} \tilde{\beta}_j = \beta_j + \left[\operatorname{plim}\frac{X_j'Z}{T} \left(\operatorname{plim}\frac{Z'Z}{T}\right)^{-1} \operatorname{plim}\frac{Z'X_j}{T}\right]^{-1} \operatorname{plim}\frac{X_j'Z}{T} \left(\operatorname{plim}\frac{Z'Z}{T}\right)^{-1} \operatorname{plim}\frac{Z'u_j}{T} = \beta_j.$$

Finally, using the same large sample principles as those considered for the classical model, we can apply a central limit theorem

$$T^{-1/2} Z'u_j \overset{a}{\sim} N(0, \sigma_{jj} Q_{ZZ})$$

to show that $\tilde{\beta}_j$ is asymptotically normally distributed with

$$\sqrt{T}(\tilde{\beta}_j - \beta_j) \overset{a}{\sim} N\left(0, \sigma_{jj}\left(Q_{XZ} Q_{ZZ}^{-1} Q_{ZX}\right)^{-1}\right). \tag{5.5}$$

This result can be used as the basis for hypothesis testing in the simultaneous equations model.
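The 2SLS calculations can be illustrated on simulated data. The sketch below is a minimal example under stated assumptions: the static demand and supply model with exogenous variables $Y$, $W$ and $R$, hypothetical parameter values, standard normal exogenous variables and $\Sigma = I$. It estimates the overidentified demand equation by OLS, by the IV form of 2SLS and by the literal two stage procedure, and compares the correct residual variance based on (5.4) with the one obtained from the second stage residuals.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 20_000

# Hypothetical structural parameters, for illustration only.
a1, a2, a3 = 10.0, -0.8, 0.5            # demand: Q = a1 + a2 P + a3 Y + u1
b1, b2, b3, b4 = 2.0, 0.6, 0.4, 0.3     # supply: P = b1 + b2 Q + b3 W + b4 R + u2

# Predetermined variables Z = (1, Y, W, R) and disturbances with Sigma = I.
Z = np.column_stack([np.ones(T), rng.standard_normal((T, 3))])
U = rng.standard_normal((T, 2))

# Generate (Q, P) from the reduced form: y_t = B^{-1}(C z_t + u_t), with C = -Gamma.
B = np.array([[1.0, -a2], [-b2, 1.0]])
C = np.array([[a1, a3, 0.0, 0.0], [b1, 0.0, b3, b4]])
Ymat = (Z @ C.T + U) @ np.linalg.inv(B).T
Q, P = Ymat[:, 0], Ymat[:, 1]

# Demand equation: regressors (1, P, Y); P is endogenous.
y = Q
X = np.column_stack([np.ones(T), P, Z[:, 1]])
k = X.shape[1]

# First stage fitted values X_hat = P_z X, computed without forming the T x T matrix P_z.
X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)

beta_ols = np.linalg.solve(X.T @ X, X.T @ y)                     # inconsistent here
beta_iv = np.linalg.solve(X_hat.T @ X, X_hat.T @ y)              # IV form of 2SLS
beta_two_stage = np.linalg.solve(X_hat.T @ X_hat, X_hat.T @ y)   # second stage OLS

# Error variance: correct 2SLS residuals (5.4) versus second stage residuals.
e = y - X @ beta_iv
sigma_tilde = e @ e / (T - k)
e_second = y - X_hat @ beta_iv
sigma_second_stage = e_second @ e_second / (T - k)

print("OLS :", beta_ols)
print("2SLS:", beta_iv)
print("sigma (5.4 residuals):", sigma_tilde, " second stage residuals:", sigma_second_stage)
```

In this design the two 2SLS expressions coincide numerically, the OLS coefficient on price is far from $\alpha_2$, and only the residuals defined in (5.4) give an estimate close to the true error variance of one.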

5.3. Limited Information Maximum Likelihood

The maximum likelihood principle can also be used to construct an estimator in the simultaneous equations system using limited information. This estimator is known as the LIML estimator. The details will not be presented here. However, it can be shown that the LIML estimator is asymptotically equivalent to the 2SLS estimator and so has the same asymptotic distribution (5.5).

5.4. Testing Overidentifying Restrictions

When an equation is overidentified, there are more restrictions on its parameters than are necessary to identify it. Therefore it is possible to construct a test of these extra restrictions. Such a test was developed by Sargan (1964). It is based on the quantity

$$e'P_z e$$

where $e = y_j - X_j\tilde{\beta}_j$ is the vector of equation residuals (2SLS or LIML) and $P_z = Z(Z'Z)^{-1}Z'$ is the instrument projection matrix. The test for overidentifying restrictions is given by

$$\frac{e'P_z e}{\tilde{\sigma}_{jj}} \overset{a}{\sim} \chi^2_{q - k_j}$$

and is asymptotically distributed as a chi-squared variate with degrees of freedom $q - k_j$, which is the degree of overidentification of the equation. This test is sometimes called the Sargan validity of instruments test.

5.5. Testing Exogeneity

Consider the equation

$$y_j = X_j\beta_j + u_j = Y_1\gamma_j + Z_1\delta_j + u_j \tag{5.6}$$

where the regressors $X_j$ have been partitioned into the set of $p$ current endogenous variables $Y_1$ and the $k_j - p$ predetermined variables $Z_1$ appearing in the $j$th equation. The special estimation methods for simultaneous equations are needed because the assumption of exogeneity that

$$E(Y_1'u_j) = 0$$

will not be expected to hold. If however, this condition did hold, then OLS estimation would be appropriate. It is possible to devise a test of the exogeneity of the regressors $Y_1$.

The test of the exogeneity of the regressors $Y_1$ is computed by first estimating the equation (5.6) by OLS and computing the residual sum of squares $S_0$. Then the equation is re-estimated by OLS including as additional regressors the instrumental variables

$$\hat{Y}_1 = Z(Z'Z)^{-1}Z'Y_1$$

to give residual sum of squares $S_1$. The test statistic is then given by

$$\frac{S_0 - S_1}{\hat{\sigma}^2} \overset{a}{\sim} \chi^2_p$$

where $\hat{\sigma}^2$ is the estimated error variance from the first estimation. The test statistic is asymptotically distributed as chi-squared with degrees of freedom equal to $p$, the number of columns of $Y_1$ and so the number of potentially endogenous regressors. This is known as the Wu-Hausman test for exogeneity and was developed independently by Wu (1973) and Hausman (1978). Rejection of the null hypothesis of exogeneity would show that IV estimation is needed.

6. Estimation: System Methods

System methods of estimation in simultaneous equation systems use all the information in the model to estimate the parameters of all equations jointly. They will be more efficient than single equation methods but are liable to the problem that misspecification of any one equation will affect the estimates in all the equations. These methods are more costly in computational terms than single equation methods and may not be feasible when the instrument set is large. However, many econometric packages such as TSP, E-Views and Pc-FIML offer these estimation techniques.

6.1. Three Stage Least Squares

Consider stacking all the equations of the model

$$y_j = X_j\beta_j + u_j, \qquad j = 1, \ldots, m$$

to form the stacked equation

$$\begin{bmatrix} y_1 \\ \vdots \\ y_m \end{bmatrix} = \begin{bmatrix} X_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & X_m \end{bmatrix} \begin{bmatrix} \beta_1 \\ \vdots \\ \beta_m \end{bmatrix} + \begin{bmatrix} u_1 \\ \vdots \\ u_m \end{bmatrix}. \tag{6.1}$$

In compact notation the stacked system is

$$y = X\beta + u$$

where

$$\mathrm{var}(u) = \begin{bmatrix} \sigma_{11} I_T & \cdots & \sigma_{1m} I_T \\ \vdots & \ddots & \vdots \\ \sigma_{m1} I_T & \cdots & \sigma_{mm} I_T \end{bmatrix} = \Sigma \otimes I_T. \tag{6.2}$$

The stacked system (6.1) has a non-constant variance covariance matrix (6.2). It also has the problem that the regressors $X$ are correlated with the error term $u$. The solution is to apply a combination of instrumental variables estimation and generalised least squares to correct these two problems. The instrument set is the matrix

$$\hat{X} = \begin{bmatrix} \hat{X}_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \hat{X}_m \end{bmatrix} = \begin{bmatrix} P_z X_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & P_z X_m \end{bmatrix} = (I_m \otimes P_z) X \tag{6.3}$$

where $P_z = Z(Z'Z)^{-1}Z'$ is the instrument projection matrix. Applying both GLS using (6.2) and IV using instrument set (6.3) results in the Three Stage Least Squares (3SLS) estimator of Zellner and Theil (1962)

$$\begin{aligned} \tilde{\beta} &= \left(X'(I_m \otimes P_z)'(\Sigma \otimes I_T)^{-1}(I_m \otimes P_z)X\right)^{-1} X'(I_m \otimes P_z)'(\Sigma \otimes I_T)^{-1} y \\ &= \left(X'(\Sigma^{-1} \otimes P_z)X\right)^{-1} X'(\Sigma^{-1} \otimes P_z) y. \end{aligned} \tag{6.4}$$

In practice the unknown covariance matrix $\Sigma$ needs to be replaced by a consistent estimator $\hat{\Sigma}$. Such an estimator can be based on the expression

$$\hat{\sigma}_{ij} = \frac{e_i'e_j}{T}$$

where

$$e_j = y_j - X_j\tilde{\beta}_j$$

is the vector of residuals from 2SLS regression on the $j$th equation.

Note that in the case where $\Sigma$ is a diagonal matrix, 3SLS estimation is identical to 2SLS estimation on each equation. This is also the case if every equation is exactly identified.
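The 3SLS formula (6.4) can be sketched in a few lines. The following is an illustrative implementation under stated assumptions, not a definitive one: the function names `two_sls` and `three_sls`, the simulated demand and supply data and all parameter values are hypothetical. Because $X$ is block diagonal, the $(i,j)$ block of $X'(\Sigma^{-1} \otimes P_z)X$ is $\sigma^{ij} X_i'P_zX_j$, where $\sigma^{ij}$ is the $(i,j)$ element of $\Sigma^{-1}$, so the Kronecker product never has to be formed explicitly.

```python
import numpy as np

def two_sls(y, X, Z):
    """2SLS estimate (5.3) for a single equation with instrument matrix Z."""
    X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
    return np.linalg.solve(X_hat.T @ X, X_hat.T @ y)

def three_sls(ys, Xs, Z, Sigma=None):
    """3SLS estimate (6.4); Sigma is estimated from 2SLS residuals if not supplied."""
    T, m = Z.shape[0], len(ys)
    if Sigma is None:
        # sigma_ij = e_i'e_j / T using the 2SLS residuals of each equation
        E = np.column_stack([y - X @ two_sls(y, X, Z) for y, X in zip(ys, Xs)])
        Sigma = E.T @ E / T
    S_inv = np.linalg.inv(Sigma)
    X_hats = [Z @ np.linalg.solve(Z.T @ Z, Z.T @ X) for X in Xs]   # P_z X_j
    # Blocks of X'(Sigma^{-1} (x) P_z)X and X'(Sigma^{-1} (x) P_z)y
    lhs = np.block([[S_inv[i, j] * (X_hats[i].T @ Xs[j]) for j in range(m)]
                    for i in range(m)])
    rhs = np.concatenate([sum(S_inv[i, j] * (X_hats[i].T @ ys[j]) for j in range(m))
                          for i in range(m)])
    return np.linalg.solve(lhs, rhs)

# Simulated data from a hypothetical demand and supply model with correlated errors.
rng = np.random.default_rng(1)
T = 5_000
a1, a2, a3 = 10.0, -0.8, 0.5
b1, b2, b3, b4 = 2.0, 0.6, 0.4, 0.3
Z = np.column_stack([np.ones(T), rng.standard_normal((T, 3))])     # 1, Y, W, R
L = np.linalg.cholesky(np.array([[1.0, 0.5], [0.5, 1.0]]))
U = rng.standard_normal((T, 2)) @ L.T
B = np.array([[1.0, -a2], [-b2, 1.0]])
C = np.array([[a1, a3, 0.0, 0.0], [b1, 0.0, b3, b4]])
Ymat = (Z @ C.T + U) @ np.linalg.inv(B).T
Q, P = Ymat[:, 0], Ymat[:, 1]

X1 = np.column_stack([np.ones(T), P, Z[:, 1]])             # demand: 1, P, Y
X2 = np.column_stack([np.ones(T), Q, Z[:, 2], Z[:, 3]])    # supply: 1, Q, W, R

beta_3sls = three_sls([Q, P], [X1, X2], Z)
beta_2sls = np.concatenate([two_sls(Q, X1, Z), two_sls(P, X2, Z)])
beta_diag = three_sls([Q, P], [X1, X2], Z, Sigma=np.diag([2.0, 5.0]))
print(beta_3sls)
print(beta_2sls)
```

Supplying a diagonal `Sigma` reproduces the equation-by-equation 2SLS estimates exactly, which is the special case noted in the text.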

The reason is that when $\Sigma$ is diagonal or every equation is exactly identified, there is no informational gain in considering all the equations together.

The 3SLS estimator is consistent and asymptotically efficient in the class of full information models with

$$\sqrt{T}(\tilde{\beta} - \beta) \overset{a}{\sim} N\left(0, \left[\operatorname{plim} \frac{1}{T} X'(\Sigma^{-1} \otimes P_z)X\right]^{-1}\right). \tag{6.5}$$

6.2. Full Information Maximum Likelihood

The maximum likelihood principle can also be used to construct an estimator in the simultaneous equations system using full information. This estimator is known as the Full Information Maximum Likelihood or FIML estimator. Consider again the general linear simultaneous equations model with $m$ equations

$$B y_t + \Gamma z_t = u_t, \qquad t = 1, \ldots, T$$

where it is now assumed that $u_t$ is distributed independently normally as

$$u_t \sim IN(0, \Sigma).$$

The probability density function for $u_t$ is given by

$$f(u_t) = (2\pi)^{-m/2} |\Sigma|^{-1/2} \exp\left(-\tfrac{1}{2} u_t'\Sigma^{-1}u_t\right)$$

and the likelihood function for $y_t$ is given by

$$\begin{aligned} f(y_t) &= \left\| \frac{\partial u_t}{\partial y_t} \right\| f(u_t) = \|B\|\, f(u_t) \\ &= (2\pi)^{-m/2} |\Sigma|^{-1/2} \|B\| \exp\left(-\tfrac{1}{2} u_t'\Sigma^{-1}u_t\right) \end{aligned}$$

where $\|\cdot\|$ denotes the absolute value of the determinant. $\|\partial u_t / \partial y_t\|$ is called the Jacobian of the transformation from $u_t$ to $y_t$. The likelihood of the whole sample is therefore given by

$$L(B, \Gamma, \Sigma; y, z) = \prod_{t=1}^{T} f(y_t) = (2\pi)^{-mT/2} |\Sigma|^{-T/2} \|B\|^{T} \exp\left(-\tfrac{1}{2} \sum_{t=1}^{T} u_t'\Sigma^{-1}u_t\right).$$

The FIML estimator is derived by maximising this likelihood function numerically with respect to the unknown parameters $B$, $\Gamma$ and $\Sigma$, taking into account all the identifying restrictions imposed on these matrices. It can be shown that the FIML estimator is asymptotically equivalent to the 3SLS estimator and so has the same asymptotic distribution (6.5). FIML has the advantage over 3SLS that all parameters are estimated jointly whereas in 3SLS, $\Sigma$ is pre-estimated from 2SLS residuals. On the other hand, it requires iterative numerical optimisation and so is computationally more costly than 3SLS.

References

[1] Hausman, J.A. (1978), 'Specification tests in econometrics', Econometrica, 46, 1251–71.
[2] Sargan, J.D. (1964), 'Wages and prices in the United Kingdom: a study in econometric methodology', in D.F. Hendry and K.F. Wallis (eds.), Econometrics and Quantitative Economics, Basil Blackwell, Oxford, 1984.
[3] Stewart, J. (1991), Econometrics, Philip Allan, Hemel Hempstead.
[4] Theil, H. (1958), Economic Forecasts and Policy, North-Holland, Amsterdam.
[5] Wu, D-M. (1973), 'Alternative tests of independence between stochastic regressors and disturbances', Econometrica, 41, 733–750.
[6] Zellner, A. and H. Theil (1962), 'Three-stage least squares: simultaneous estimation of simultaneous equations', Econometrica, 30, 54–78.