
Lecture: Simultaneous Equation Model (Wooldridge’s Book Chapter 16)



Model

• Consider a system of two regressions

y1 = β1 y2 + u1 (1)
y2 = β2 y1 + u2 (2)

• This is a simultaneous equation model (SEM) since y1 and y2 are determined simultaneously.

• Both variables are determined within the model, so they are endogenous and denoted by the letter y.

Example

• The demand-supply model in microeconomics includes a demand function and a supply function

• y1 is the quantity of the good; y2 is the price

• If β1 < 0, β2 > 0, then (1) is the demand function while (2) is the supply function; u1 is the
demand shock and u2 is the supply shock.

• Another example is the Keynesian cross (45 degree line) model in which y1 is the national
income and y2 is total consumption.

Structural Form

• (1) and (2) are structural in the sense that they are directly implied by economic theory.

• We assume
E(u1 u2 ) = 0 (3)
So the structural errors are uncorrelated (orthogonal).

• Our goal is to estimate the structural coefficients β1 and β2, which measure the causal effect of one endogenous variable on the other endogenous variable.

Simultaneity Bias

• Plugging (1) into the right hand side of (2) leads to

y2 = β2 (β1 y2 + u1 ) + u2 .

• After collecting terms, we have

y2 = (β2 u1 + u2) / (1 − β2 β1),   (4)

which indicates that

E(y2 u1) = β2 E(u1²) / (1 − β2 β1) ≠ 0  ⇒  cov(y2, u1) ≠ 0.   (5)

• So structural equation (1) suffers from an endogeneity problem (simultaneity bias): the regressor in (1) is correlated with the error term. Consequently, OLS applied to (1) gives a biased and inconsistent estimate, and the same holds for (2).
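To see the simultaneity bias concretely, here is a minimal Python sketch (not from the slides; the structural values β1 = −0.5, β2 = 0.5 and standard-normal shocks are hypothetical). It generates data from the structural model via the reduced form (4) and shows that OLS applied to (1) does not recover β1.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
b1, b2 = -0.5, 0.5                    # hypothetical structural coefficients (demand, supply)

u1 = rng.normal(size=n)               # demand shock
u2 = rng.normal(size=n)               # supply shock

y2 = (b2 * u1 + u2) / (1 - b2 * b1)   # reduced form (4): y2 depends on both shocks
y1 = b1 * y2 + u1                     # structural demand equation (1)

# OLS of y1 on y2 (no intercept, as in the model): its probability limit differs from b1
b1_ols = (y2 @ y1) / (y2 @ y2)
print(b1_ols)                         # roughly 0 here, far from the true -0.5
```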

Example

• Suppose there is a demand shock u1

• u1 affects the quantity y1 through the demand function (1)

• Next, the quantity y1 affects the price y2 through the supply function (2). Some people call this reverse causation.

• So u1 affects y2, and therefore u1 and y2 are correlated. In other words, the regressor in (1) is endogenous.

Endogeneity

• Typically an economic theory implies an SEM, in which several variables are determined simultaneously within the model. Those variables are endogenous from the economics perspective.

• We just showed that an SEM suffers from simultaneity bias. Therefore those variables are correlated with the errors, so they are endogenous from the econometrics perspective.

• In short, economic endogeneity is closely related to econometric (statistical) endogeneity.



Graph

We cannot identify either the demand curve or the supply curve from a scatter plot of quantity versus price. However, if there is an exogenous variable, say input price, that shifts the supply curve, then we can identify the demand curve.

SEM with Exogenous Regressors

• Now augment the structural model with exogenous regressors

y1 = β1 y2 + c1 z1 + u1 (6)
y2 = β2 y1 + c2 z2 + u2 (7)

• For instance, z1 is income; z2 is input price

• z1 and z2 are determined outside the model, so they are exogenous (predetermined)

• Statistically, exogeneity means that

E(z1 u1 ) = 0; E(z1 u2 ) = 0; E(z2 u1 ) = 0; E(z2 u2 ) = 0 (8)



Reduced Form

The reduced form expresses the endogenous variables in terms of the exogenous variables only:

y1 = (c1 z1 + β1 c2 z2 + e1) / (1 − β2 β1)   (9)
y2 = (β2 c1 z1 + c2 z2 + e2) / (1 − β2 β1)   (10)
e1 = u1 + β1 u2   (11)
e2 = β2 u1 + u2   (12)

(9) and (10) are the reduced forms; only the exogenous variables z1 and z2 appear on the right hand side (RHS).
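As an optional check (my own sketch, not part of the slides), the reduced form can be verified symbolically: the Python snippet below solves the structural system (6)-(7) for y1 and y2 with sympy and reproduces (9)-(12) up to rearrangement.

```python
import sympy as sp

y1, y2, z1, z2, u1, u2, b1, b2, c1, c2 = sp.symbols('y1 y2 z1 z2 u1 u2 b1 b2 c1 c2')

# structural form: equations (6) and (7)
sol = sp.solve([sp.Eq(y1, b1 * y2 + c1 * z1 + u1),
                sp.Eq(y2, b2 * y1 + c2 * z2 + u2)],
               [y1, y2])

print(sp.simplify(sol[y1]))   # equivalent to reduced form (9)
print(sp.simplify(sol[y2]))   # equivalent to reduced form (10)
```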

Reduced Form Continued

• Let
π11 = c1 / (1 − β2 β1);   π12 = β1 c2 / (1 − β2 β1);   e∗1 = e1 / (1 − β2 β1)
π21 = β2 c1 / (1 − β2 β1);   π22 = c2 / (1 − β2 β1);   e∗2 = e2 / (1 − β2 β1)
• The reduced form can be rewritten as

y1 = π11 z1 + π12 z2 + e∗1 (13)


y2 = π21 z1 + π22 z2 + e∗2 (14)

where e∗ denotes the reduced-form error, which is a linear function of the structural errors.

• Note that the reduced-form errors are correlated across equations, cov(e∗1, e∗2) ≠ 0, whereas the structural errors are uncorrelated. For example, with E(u1 u2) = 0, cov(e1, e2) = β2 E(u1²) + β1 E(u2²), which is generally nonzero.

Reduced Form Continued

• Note that all exogenous variables appear on the right hand side of each reduced form; by contrast, the structural form has an endogenous variable and only some of the exogenous variables on the right hand side. As a result, OLS applied to the structural form is inconsistent, whereas OLS applied to the reduced form is consistent.

• Reduced form (14) is the first-stage regression if we want to use the 2SLS estimator to obtain the causal effect of y2 on y1. Notice that all exogenous variables are used as regressors in the first-stage regression.

Indirect Least Squares Estimator (ILS)

• OLS applied to (13) and (14) separately gives consistent estimates of the π's.

• However, we are interested in the coefficients in the structural form.

• The indirect least squares (ILS) estimator estimates the structural-form coefficients β from the estimated reduced-form coefficients π.

Indirect Least Squares Estimator Continued

• The ILS estimators of the structural-form coefficients are

β̂1^ILS = π̂12 / π̂22   (15)
β̂2^ILS = π̂21 / π̂11   (16)

provided that

c2 ≠ 0   (17)
c1 ≠ 0   (18)
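A minimal Python sketch of ILS on simulated data (not from the slides; the parameter values β1 = −0.5, β2 = 0.5, c1 = c2 = 1 are hypothetical): estimate the two reduced-form regressions by OLS and take the ratios (15)-(16).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
b1, b2, c1, c2 = -0.5, 0.5, 1.0, 1.0                 # hypothetical structural parameters

z1, z2 = rng.normal(size=n), rng.normal(size=n)
u1, u2 = rng.normal(size=n), rng.normal(size=n)

det = 1 - b2 * b1
y1 = (c1 * z1 + b1 * c2 * z2 + u1 + b1 * u2) / det   # reduced form (9)
y2 = (b2 * c1 * z1 + c2 * z2 + b2 * u1 + u2) / det   # reduced form (10)

# OLS on the reduced forms (13) and (14)
Z = np.column_stack([z1, z2])
pi1 = np.linalg.lstsq(Z, y1, rcond=None)[0]          # (pi11, pi12)
pi2 = np.linalg.lstsq(Z, y2, rcond=None)[0]          # (pi21, pi22)

b1_ils = pi1[1] / pi2[1]                             # (15): pi12 / pi22
b2_ils = pi2[0] / pi1[0]                             # (16): pi21 / pi11
print(b1_ils, b2_ils)                                # close to -0.5 and 0.5
```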

Identification and Exclusion Restrictions

• β1 cannot be identified if c2 = 0. So identification of β1 requires c2 ≠ 0

• c2 ≠ 0 indicates that there is an exogenous variable z2 that is excluded from the first structural equation (order condition) but appears in the second structural equation with a non-zero coefficient (rank condition)

• For the demand-and-supply example, the demand function can be identified if the input price is present in the supply function. Graphically, the demand curve can be traced out (identified) as the supply curve shifts due to the varying input price.

Remarks

• β1 is over-identified if more than one exogenous variable is excluded from the first equation and appears in the second equation with a non-zero coefficient

• In that case, the ILS estimator of β1 is not unique (Exercise), which is a big disadvantage of ILS.

• Another disadvantage of ILS is that β̂^ILS is a nonlinear function of π̂, so deriving its variance entails the delta method.

Delta Method

• For simplicity, let π̂ and β̂ = f (π̂ ) be scalars. Consider the first order Taylor expansion

β̂ = f (π̂ ) ≈ f (π ) + f ′ (π )(π̂ − π )

which implies that

var(β̂) = (f′(π))² var(π̂)

• More generally, for vectors we have

var-covariance(β̂) = (∂β̂/∂π̂)′ var-covariance(π̂) (∂β̂/∂π̂)

where ∂β̂/∂π̂ is called the gradient (column) vector.
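As an illustration (my own sketch, not from the slides), the delta method for the ratio β̂1 = π̂12/π̂22 uses the gradient g = (1/π̂22, −π̂12/π̂22²)′. The covariance numbers passed in below are hypothetical placeholders.

```python
import numpy as np

def delta_var_ratio(p12, p22, var_p12, var_p22, cov_12_22):
    """Delta-method variance of beta1_hat = p12 / p22.

    The gradient of f(p12, p22) = p12 / p22 is g = (1/p22, -p12/p22**2)',
    so var(beta1_hat) is approximately g' V g, with V the 2x2 covariance
    matrix of the reduced-form estimates (p12_hat, p22_hat).
    """
    g = np.array([1.0 / p22, -p12 / p22**2])
    V = np.array([[var_p12, cov_12_22],
                  [cov_12_22, var_p22]])
    return g @ V @ g

# hypothetical reduced-form estimates and covariance terms
print(delta_var_ratio(-0.4, 0.8, 1e-4, 1e-4, 2e-5))
```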

2SLS Estimator

• A nice by-product of the structural model is that instrumental variables are readily available.

• The reduced forms (9) and (10) clearly show that the exogenous variables z1 and z2 are correlated with the endogenous regressors y1 and y2. Moreover, we assume z1 and z2 are uncorrelated with the errors u; see (8)

• So z1 and z2 are instrumental variables for y1 and y2 , if the exogeneity assumption (8) holds

• Essentially the 2SLS estimator replaces the endogenous regressors with their exogenous
parts, and we use instrumental variables to isolate those exogenous parts.

2SLS Estimator Continued

• Step 1: Estimate the reduced form (13) and (14) (first stage) using OLS and keep the fitted
values

ŷ1 = π̂11 z1 + π̂12 z2 (19)


ŷ2 = π̂21 z1 + π̂22 z2 (20)

• Step 2: Replace the endogenous regressors with fitted values, and fit the second-stage
regressions using OLS

y1 = β1 ŷ2 + c1 z1 + u1 (21)
y2 = β2 ŷ1 + c2 z2 + u2 (22)

• ŷ1 is the exogenous part of y1; ŷ2 is the exogenous part of y2. Both are linear combinations of the exogenous variables z1 and z2. (Where are the endogenous parts?)
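A minimal Python sketch of the two steps for β1 (simulated data with the same hypothetical parameters as before). In practice a packaged 2SLS routine such as Stata's ivreg should be used, because the naive second-step OLS standard errors are not the correct 2SLS standard errors.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
b1, b2, c1, c2 = -0.5, 0.5, 1.0, 1.0                 # hypothetical parameters
z1, z2 = rng.normal(size=n), rng.normal(size=n)
u1, u2 = rng.normal(size=n), rng.normal(size=n)
det = 1 - b2 * b1
y1 = (c1 * z1 + b1 * c2 * z2 + u1 + b1 * u2) / det   # reduced form (9)
y2 = (b2 * c1 * z1 + c2 * z2 + b2 * u1 + u2) / det   # reduced form (10)

# Step 1: first stage (14) -- regress y2 on all exogenous variables, keep the fitted values
Z = np.column_stack([z1, z2])
y2_hat = Z @ np.linalg.lstsq(Z, y2, rcond=None)[0]

# Step 2: replace y2 by y2_hat in (6) and run OLS, as in (21)
X = np.column_stack([y2_hat, z1])
beta1_2sls, c1_2sls = np.linalg.lstsq(X, y1, rcond=None)[0]
print(beta1_2sls, c1_2sls)                           # close to -0.5 and 1.0
```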

2SLS Estimator in Stata

• For example, to get β̂1^2SLS in (6), use

ivreg y1 (y2 = z2) z1

where y1 is the dependent variable, y2 is the endogenous regressor, z2 is the excluded exogenous variable, and z1 is the included exogenous variable (control variable).

• Exercise: what is the Stata command to get β̂2^2SLS in (7)? You need to think carefully about which variable is which.

• This command will first run regression (20), then (21).



(Optional) Seemingly Unrelated Regression (SUR)

• Reduced forms (13) and (14) are an example of seemingly unrelated regressions.

• They have different LHS variables, so they seem unrelated.

• They are indeed related because the reduced-form errors are correlated across equations, i.e.,

cov(e∗1, e∗2) ≠ 0;

see (11) and (12).



(Optional) SUR Continued

• Generally the optimal estimator for the SUR model is the generalized least squares (GLS) estimator, due to the correlation of the errors across equations.

• However, if each equation in the SUR system has identical RHS variables, GLS becomes equation-by-equation OLS (see the numerical check below)

• The Stata command to estimate the SUR model using the GLS estimator is

sureg (y1 x1)(y2 x2)
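A numerical check of the identical-regressors case (my own Python sketch, with hypothetical coefficients and a known error covariance Σ): stacking the two equations and applying GLS with Ω = Σ ⊗ I gives exactly the equation-by-equation OLS estimates.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
X = np.column_stack([np.ones(n), rng.normal(size=n)])   # identical RHS in both equations

Sigma = np.array([[1.0, 0.6],                            # errors correlated across equations
                  [0.6, 1.0]])
E = rng.multivariate_normal(np.zeros(2), Sigma, size=n)
y1 = X @ np.array([1.0, 2.0]) + E[:, 0]
y2 = X @ np.array([-1.0, 0.5]) + E[:, 1]

# equation-by-equation OLS
b_ols = np.concatenate([np.linalg.lstsq(X, y1, rcond=None)[0],
                        np.linalg.lstsq(X, y2, rcond=None)[0]])

# stacked SUR-GLS with Omega = Sigma kron I_n (Sigma treated as known here)
D = np.block([[X, np.zeros_like(X)],
              [np.zeros_like(X), X]])
Y = np.concatenate([y1, y2])
Omega_inv = np.kron(np.linalg.inv(Sigma), np.eye(n))
b_gls = np.linalg.solve(D.T @ Omega_inv @ D, D.T @ Omega_inv @ Y)

print(np.allclose(b_ols, b_gls))                         # True: GLS equals OLS here
```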

(Optional) Gauss-Markov Theorem

• If the errors are homoskedastic and uncorrelated, then the OLS estimator is the best linear unbiased estimator (BLUE) conditional on the regressors.

• See theorem 10.4 in the textbook for details



(Optional) Generalized Least Squares Estimator I

• There is an estimator better than OLS if the error is heteroskedastic.

• Suppose the model is

yi = β xi + ui,   E(ui²) = σi²   (Heteroskedasticity)   (23)

• Consider the transformed model

y∗i = β x∗i + u∗i   (24)

y∗i = yi / σi,   x∗i = xi / σi,   u∗i = ui / σi   (25)

where the transformed error is homoskedastic: E(u∗i²) = 1

• The estimator better than OLS is GLS, which is OLS applied to the transformed regression (24), as in the sketch below.
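A minimal Python sketch of this weighting transformation (hypothetical data where σi is treated as known; in practice σi must be modeled or estimated, which leads to feasible GLS):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5_000
x = rng.uniform(1.0, 3.0, size=n)
sigma = 0.5 * x                                   # hypothetical known skedastic function
y = 2.0 * x + sigma * rng.normal(size=n)          # true beta = 2, heteroskedastic error

# divide every observation by sigma_i as in (25), then run OLS on the starred variables
y_star, x_star = y / sigma, x / sigma
beta_gls = (x_star @ y_star) / (x_star @ x_star)
print(beta_gls)                                   # close to 2
```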

(Optional) Generalized Least Squares Estimator II

• Consider a model with correlated error

yi = β xi + ui,   ui = ρ ui−1 + vi   (Correlation)   (26)

• Consider the transformed model

y∗i = β x∗i + u∗i   (27)

y∗i = yi − ρ yi−1,   x∗i = xi − ρ xi−1   (28)

where the transformed error is uncorrelated: E(u∗i u∗i−1 ) = 0

• GLS is OLS applied to the transformed regression (27), as in the sketch below.


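A sketch of the quasi-differencing transform (28), with ρ treated as known (hypothetical values; in practice ρ is estimated from OLS residuals, as in the Cochrane-Orcutt or Prais-Winsten procedures):

```python
import numpy as np

rng = np.random.default_rng(5)
n, rho, beta = 5_000, 0.7, 2.0
x = rng.normal(size=n)

# build an AR(1) error: u_i = rho * u_{i-1} + v_i, as in (26)
v = rng.normal(size=n)
u = np.zeros(n)
for i in range(1, n):
    u[i] = rho * u[i - 1] + v[i]
y = beta * x + u

# quasi-difference both sides as in (28), dropping the first observation
y_star = y[1:] - rho * y[:-1]
x_star = x[1:] - rho * x[:-1]
beta_gls = (x_star @ y_star) / (x_star @ x_star)
print(beta_gls)                                   # close to 2
```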

(Optional) Matrix Algebra for GLS

We use GLS when E(UU′ |X) = Ω ≠ σ²I. Because Ω is symmetric and positive definite, there is a decomposition Ω = AA′ (obtainable, for example, from the Cholesky or spectral decomposition). Now consider the transformed model

Y∗ = X∗ β + U∗

where Y∗ = A⁻¹Y, X∗ = A⁻¹X, U∗ = A⁻¹U. It follows that the GLS estimator is OLS applied to the transformed regression, which satisfies the conditions of the Gauss-Markov theorem. That is,

E(U∗U∗′ |X∗) = I.

In short,

β̂GLS = (X∗′X∗)⁻¹ X∗′Y∗ = (X′Ω⁻¹X)⁻¹ X′Ω⁻¹Y
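A compact Python sketch of the matrix formula (illustrative only: Ω is taken as known and diagonal, and A is obtained from a Cholesky factorization, one valid way to get Ω = AA′):

```python
import numpy as np

rng = np.random.default_rng(6)
n, k = 500, 2
X = rng.normal(size=(n, k))
beta = np.array([1.0, -2.0])
Omega = np.diag(rng.uniform(0.5, 2.0, size=n))    # known error covariance matrix
A = np.linalg.cholesky(Omega)                     # Omega = A A'
Y = X @ beta + A @ rng.normal(size=n)

# direct formula: (X' Omega^-1 X)^-1 X' Omega^-1 Y
Omega_inv = np.linalg.inv(Omega)
beta_gls = np.linalg.solve(X.T @ Omega_inv @ X, X.T @ Omega_inv @ Y)

# equivalent route: OLS on the transformed model Y* = A^-1 Y, X* = A^-1 X
A_inv = np.linalg.inv(A)
Xs, Ys = A_inv @ X, A_inv @ Y
beta_star = np.linalg.solve(Xs.T @ Xs, Xs.T @ Ys)

print(beta_gls, beta_star)                        # the two estimates coincide
```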

(Optional) STATA and GLS

• For instance, the Stata command

prais y x

reports the GLS estimator assuming the error is an AR(1) process, and hence serially correlated.

• Use the command sureg to obtain the GLS estimator for the SUR model.

• Alternatively, you can generate the transformed variables, and fit the transformed regression
using OLS

OLS is inconsistent when applied to a simultaneous equation model (SEM). The 2SLS estimator is consistent. The ILS estimator is also consistent, but requires the delta method to obtain its variance and standard error. The benefit of considering an SEM is tremendous in that instrumental variables are readily indicated by the structural model!
