
General Advice Introduction Theory Estimation and inference Implementation (Matlab) Conclusion

Generalized Method of Moments


GMM in Applied Settings

Ashvin Gandhi [1]

Harvard University

September 16, 2015 [2]

[1] agandhi@fas.harvard.edu
[2] Based on previous notes by Daniel Pollmann, Tom Wollmann, and Michael Sinkinson.
Harvard University Generalized Method of Moments September 16, 2015 1 / 31

Lectures

- Go to the lectures.
- Actively follow along in the notes.
- Read the notes beforehand.
- Review the notes afterwards.


Readings

- Do the reading.


Problem Sets

- Start early.
- Read the papers.
- Work together, but do not copy code or content.
- Show your work.
- Comment your code.
- Package your code.
- Include your code in your writeup (LaTeX package mcode).


Sections

- Not weekly.
- Not required.
- Potentially long.
- Hopefully helpful.
- Email me questions or topics beforehand.


Office Hours

- By appointment.
- Email: agandhi@fas.harvard.edu


Seminars

- Monday and Wednesday at 2:30 PM.
- Go to as many as possible.
- Familiarize yourself with the tools and how they are used.
- Learn what current IO research looks like.


Overview

- Objective:
  - What is GMM? How is it different from other estimation techniques?
  - Some technical details, and translating models into moments.
  - Implementing GMM in Matlab.
- For derivations and details, I highly recommend Gary Chamberlain's ECON 2120 Lecture Note 16 and Alberto Abadie's ECON 2140 Extremum Estimators Handout. There are also a number of great texts out there.


What is GMM?
- GMM is a framework for identifying parameters by leveraging relations the econometrician would like to hold in expectation:

  $$E[\psi(w_i; \theta_0)] = 0.$$

- Need at least as many identifying moments as parameters. It may not be possible to impose all of these identifying moments simultaneously.
- Solution: Weighted penalty for deviation from the moments.

  $$\theta_0 = \arg\min_\theta \; E[\psi(w_i; \theta)]' \, W \, E[\psi(w_i; \theta)],$$

  where $W$ is positive semidefinite.
- Heuristic: what parameters allow the data to best fit the identifying moments the econometrician wants to hold?


GMM and Other Methods

- This framework can encompass many techniques we are familiar with:
  - Regression:
    $$E[x_i \epsilon_i] = 0.$$
  - Instrumental Variables:
    $$E[z_i \epsilon_i] = 0.$$
  - Maximum likelihood:
    $$E\left[\frac{\partial \log f(Y_i \mid Z_i, \theta)}{\partial \theta}\right] = 0.$$
- Others, less so:
  - Non-parametric estimation. (Though GMM is used in semi-parametric analysis.)


GMM and Other Methods

- Heuristically, one can think of GMM as imposing structure that is somewhere between highly parametric techniques (like MLE) and highly non-parametric ones (like kernel density estimators). Sometimes people will even refer to GMM as semiparametric. (This is a less-common use of the term.)
  - MLE makes strong assumptions about the distributions (i.e., known up to a parameter).
  - GMM makes assumptions about the moments of the distributions (e.g., means, variances, covariances, etc.).
  - Non-parametrics make almost no assumptions about the distributions.
- Tradeoff between strength of assumptions and efficiency.


How Does GMM Relate to Structural Modeling?

- Structural models in IO aim to leverage theory and data to estimate policy-invariant parameters that can be used to assess counterfactuals.
- Often the theory may dictate equations we want to hold true in expectation (identifying moments).
  - Consumer optimality conditions.
  - Producer optimality conditions.


Moment restriction

- We frame the problem as:

  $$E[\psi(w_i; \theta)] = 0,$$

  where we call the $M$-vector $\psi(w_i; \theta)$ the moment function, $w_i$ is an observation in the data, and $\theta$ is the parameter vector we want to estimate ($\dim(\theta) = K$).
- Our moment function evaluates a vector of $M$ moments in the data, and our parameter vector $\theta$ contains $K$ parameters. If $M = K$, then we say we are just-identified. If $M > K$, then we are over-identified. Finally, if $M < K$, we are under-identified (and cannot recover point estimates of the parameters).


A Simple Example
- Suppose we have the following model:

  $$y_i = x_i'\beta + \epsilon_i,$$

  where $E(\epsilon_i \mid x_i) = 0$.
- Then $E(y_i - x_i'\beta \mid x_i) = 0 \Rightarrow E[(y_i - x_i'\beta)\, h(x_i)] = 0$ for any function $h(\cdot)$, in particular $h(x) = x$.
- Hence,

  $$E[\psi(w_i; \theta)] = 0,$$

  where $\psi(w_i; \theta) = (y_i - x_i'\beta)\, x_i$.
- In a more general problem, using optimal instruments means optimal choice of $h(\cdot)$. (See Chamberlain 1987.)
  - $h^*(z_t) = E\left[\frac{\partial \rho_{jt}(\theta)}{\partial \theta'} \,\middle|\, z_t\right] \Omega^{-1}$, where $E(\rho_{jt} \mid z_t) = 0$ and $\Omega = E[\rho_{jt}\rho_{jt}' \mid z_t]$.
  - See, e.g., Berry, Levinsohn, and Pakes (EMA, 1995) (BLP), Reynaert and Verboven (JoE, 2014).


Identification

- Recall that in Maximum Likelihood, a model is identified when

  $$\arg\max_\theta L(\theta; w) = \theta'$$

  holds if and only if $\theta' = \theta_0$, the true value. That is, the likelihood is uniquely maximized at the true value.
- The analog for the semi-parametric GMM case is that

  $$E[\psi(w_i; \theta)] = 0$$

  holds only at the true value $\theta = \theta_0$, and fails at all other values of the parameter vector.
- If we set our moment restrictions to be the gradient of the parametric log-likelihood (the score), we see that GMM nests Maximum Likelihood.
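As a small illustration of GMM nesting ML, the sketch below uses the score of an exponential likelihood as the single moment condition and compares the resulting estimate with the closed-form MLE. The exponential model, sample size, and variable names are assumptions chosen for illustration; they do not come from the notes.

```matlab
% Simulated exponential data with rate lambda (hypothetical example)
n = 2000;
lambda_true = 2;
y = -log(rand(n,1)) / lambda_true;   % inverse-CDF draws, no toolbox needed

% Score of the exponential log-likelihood: d/d(lambda) [log(lambda) - lambda*y]
score = @(lam) 1/lam - y;            % n x 1 vector of per-observation scores

% Just-identified GMM objective: squared sample mean of the score
Q = @(lam) mean(score(lam))^2;

opts = optimset('TolFun', 1e-16, 'TolX', 1e-10);
lambda_gmm = fminsearch(Q, 1, opts);

% Closed-form MLE for comparison
lambda_mle = 1/mean(y);

fprintf('GMM: %.6f   MLE: %.6f\n', lambda_gmm, lambda_mle);
```

Because the model is just-identified, the minimized objective should be essentially zero and the two estimates should agree up to optimizer tolerance.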


Consistency

- When asking whether an estimator is consistent, we want to know whether it converges to the true value in probability:

  $$\hat\theta \xrightarrow{p} \theta_0$$

- Formally, this is the same as saying

  $$\lim_{N \to \infty} \Pr\left[\, \lVert \hat\theta - \theta_0 \rVert > \varepsilon \,\right] = 0, \quad \forall \varepsilon > 0.$$

- Under appropriate assumptions, GMM and ML are consistent.


Efficiency

- We want to know whether our estimates are as precise as possible. In the parametric setting, the variance of any unbiased estimator is bounded below, and the ML estimator attains this bound asymptotically:

  $$\mathrm{Var}\left(\hat\theta(X)\right) \geq \mathcal{I}(\theta_0)^{-1}$$

  $$\mathcal{I}(\theta) = -E\left[\frac{\partial^2}{\partial\theta\,\partial\theta'} \ln f(X \mid \theta)\right]$$

- We call $\mathcal{I}(\theta)$ the Fisher Information matrix, and we call $\mathcal{I}(\theta_0)^{-1}$ the Cramér-Rao lower bound on variance.
- The GMM estimator attains the semi-parametric efficiency bound (Chamberlain, JoE, 1987), which is the lower bound on variance for an estimator using only the information contained in the moment restrictions. In practice, the over-identified case will require a two-step estimator (which we will discuss shortly) for efficiency.


Estimation

- We compute the empirical mean of the moment function, and select $\hat\theta = \arg\min_\theta Q_{C,n}(\theta)$, where

  $$Q_{C,n}(\theta) = \left[\frac{1}{n}\sum_{i=1}^{n} \psi(w_i, \theta)\right]' C \left[\frac{1}{n}\sum_{i=1}^{n} \psi(w_i, \theta)\right]$$

  for some positive definite $M \times M$ matrix $C$.
- The weighting matrix $C$ assigns importance to satisfying the different moment conditions.


Asymptotic variance
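- Under appropriate assumptions,

  $$\sqrt{n}\left(\hat\theta - \theta_0\right) \xrightarrow{d} N(0, V),$$

  where

  $$V = (\Gamma' C \Gamma)^{-1}\, \Gamma' C \Delta C \Gamma\, (\Gamma' C \Gamma)^{-1}.$$

  - $\Gamma = E\left[\frac{\partial \psi}{\partial \theta}(x, \theta_0)\right]$ ($M \times K$): gradient of the moment function with respect to the parameters
  - $\Delta = E\left[\psi(x, \theta_0)\, \psi(x, \theta_0)'\right]$ ($M \times M$): outer product of the moments
- For an extremely clear and concise derivation, see Gary Chamberlain's ECON 2120 Lecture Note 16.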
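The sandwich formula can be estimated by replacing Γ and Δ with sample analogs. Below is a minimal sketch for a linear IV moment function ψ = (y_i − x_i'β) z_i, for which the gradient is −E[z_i x_i']. The simulated design, the identity weighting matrix, and all variable names are assumptions made for illustration only.

```matlab
% Simulated over-identified linear IV data (hypothetical design)
n = 1000;
Z = [ones(n,1), randn(n,2)];                 % M = 3 instruments incl. constant
u = randn(n,1);                              % structural error
x = Z(:,2) + 0.5*Z(:,3) + 0.8*u + randn(n,1);% endogenous regressor
X = [ones(n,1), x];                          % K = 2 parameters
Y = X*[1; 2] + u;

C = eye(size(Z,2));                          % some weighting matrix
b = (X'*Z*C*Z'*X) \ (X'*Z*C*Z'*Y);           % linear GMM estimate (closed form)

% Sample analogs of Gamma (M x K) and Delta (M x M)
Gamma = -(Z'*X)/n;                           % gradient of psi is -z_i * x_i'
P     = Z .* repmat(Y - X*b, 1, size(Z,2));  % rows are psi(w_i, b)'
Delta = (P'*P)/n;

% Sandwich formula V = (G'CG)^(-1) G'C Delta C G (G'CG)^(-1)
bread = inv(Gamma'*C*Gamma);
V     = bread * (Gamma'*C*Delta*C*Gamma) * bread;
se    = sqrt(diag(V)/n);                     % standard errors of b

% With C = inv(Delta), the formula collapses to inv(G' inv(Delta) G)
V_opt = inv(Gamma'*(Delta\Gamma));
```

Dividing by n before taking square roots converts the variance of the limiting distribution of √n(θ̂ − θ0) into standard errors for θ̂ itself.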


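Just-identified case

$$
\begin{aligned}
V &= (\Gamma' C \Gamma)^{-1}\, \Gamma' C \Delta C \Gamma\, (\Gamma' C \Gamma)^{-1} \\
  &= \Gamma^{-1} C^{-1} \Gamma'^{-1}\, \Gamma' C \Delta C \Gamma\, \Gamma^{-1} C^{-1} \Gamma'^{-1} \\
  &= \Gamma^{-1} \Delta \Gamma'^{-1} \\
  &= (\Gamma' \Delta^{-1} \Gamma)^{-1},
\end{aligned}
$$

since $\Gamma$ is invertible in the just-identified case. Equivalently, $C$ drops out because we can set the moments to zero (at least in the limit), so we do not need to trade off different elements of the moment function vector.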


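Over-identified case

For $C = \Delta^{-1}$,

$$
\begin{aligned}
V &= (\Gamma' C \Gamma)^{-1}\, \Gamma' C \Delta C \Gamma\, (\Gamma' C \Gamma)^{-1} \\
  &= (\Gamma' \Delta^{-1} \Gamma)^{-1}\, \Gamma' \Delta^{-1} \Delta \Delta^{-1} \Gamma\, (\Gamma' \Delta^{-1} \Gamma)^{-1} \\
  &= (\Gamma' \Delta^{-1} \Gamma)^{-1}.
\end{aligned}
$$

The proof that $(\Gamma' C \Gamma)^{-1}\, \Gamma' C \Delta C \Gamma\, (\Gamma' C \Gamma)^{-1} - (\Gamma' \Delta^{-1} \Gamma)^{-1} \geq 0$ (positive semi-definite) can be found in virtually every econometrics text or set of lecture notes. This inequality shows that $C = \Delta^{-1}$ is indeed optimal.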
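One standard argument for this inequality can be sketched as follows, assuming $C$ is symmetric and $\Delta$ is positive definite. The shorthand $A$, $B$, $V_C$, and $V^*$ is introduced here for illustration and does not appear in the notes.

```latex
\begin{aligned}
A &\equiv (\Gamma' C \Gamma)^{-1}\Gamma' C, \qquad
B \equiv (\Gamma' \Delta^{-1} \Gamma)^{-1}\Gamma' \Delta^{-1}, \\
V_C &= A \Delta A', \qquad
V^* = B \Delta B' = (\Gamma' \Delta^{-1} \Gamma)^{-1}, \\
A \Delta B' &= (\Gamma' C \Gamma)^{-1}\,\Gamma' C \Gamma\,(\Gamma' \Delta^{-1} \Gamma)^{-1} = V^*, \\
(A - B)\,\Delta\,(A - B)' &= V_C - V^* - V^* + V^* = V_C - V^* \;\geq\; 0 .
\end{aligned}
```

The left-hand side of the last line is a quadratic form in the positive definite matrix $\Delta$, so it is positive semi-definite, which gives $V_C \geq V^*$.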


Choice of weighting matrix

- We would like $C \propto \Delta^{-1}$. Recall that $\Delta$ is the expectation of the outer product of the moments at $\theta_0$.
- Problem: we don't know $\theta_0$.
- Solution: Form a consistent estimate $\hat\Delta$ using a consistent though inefficient estimate of $\theta_0$.


Two-step GMM

- Step 1: Estimate $\hat\theta_{GMM1}$ by minimizing $Q_C(\theta)$ with an arbitrary choice of (positive definite) $C$, such as the identity matrix.
- Step 2: Estimate the optimal weighting matrix as:

  $$\hat\Delta^{-1} = \left( E_n\left[ \psi\left(w_i, \hat\theta_{GMM1}\right) \psi\left(w_i, \hat\theta_{GMM1}\right)' \right] \right)^{-1}$$

  and use this to then solve for $\hat\theta_{GMM2} = \arg\min_\theta Q_{\hat\Delta^{-1}}(\theta)$.
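The two steps can be sketched in Matlab as follows, again using a linear IV moment function on simulated data. This is one possible implementation under stated assumptions (the data-generating process, starting values, tolerances, and helper names psi, gbar, and Q are all illustrative), not the notes' own code.

```matlab
% Simulated over-identified linear IV data (hypothetical design)
n = 1000;
Z = [ones(n,1), randn(n,2)];                   % M = 3 instruments
u = randn(n,1) .* (1 + 0.5*abs(Z(:,2)));       % heteroskedastic error
x = Z(:,2) + 0.5*Z(:,3) + 0.8*u + randn(n,1);  % endogenous regressor
X = [ones(n,1), x];                            % K = 2 parameters
Y = X*[1; 2] + u;

% n x M matrix whose i-th row is psi(w_i, b)' = (y_i - x_i'b) * z_i'
psi  = @(b) Z .* repmat(Y - X*b, 1, size(Z,2));
gbar = @(b) mean(psi(b), 1)';                  % M x 1 sample moment vector
Q    = @(b, C) gbar(b)' * C * gbar(b);         % GMM objective

opts = optimset('TolFun', 1e-14, 'TolX', 1e-10, ...
                'MaxFunEvals', 1e5, 'MaxIter', 1e5);

% Step 1: identity weighting matrix
b1 = fminsearch(@(b) Q(b, eye(size(Z,2))), ones(2,1), opts);

% Step 2: estimate Delta at b1, then re-minimize with C = inv(Delta_hat)
P1        = psi(b1);
Delta_hat = (P1' * P1) / n;                    % E_n[psi * psi']
C2        = inv(Delta_hat);
b2 = fminsearch(@(b) Q(b, C2), b1, opts);

disp([b1, b2]);                                % first-step vs second-step estimates
```

Because the moments are linear in the parameters here, the second step also has the closed form (X'Z C2 Z'X)\(X'Z C2 Z'Y), which is a convenient check on the optimizer.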


Issues and alternatives

- GMM, just like IV, is generally biased in finite samples.
- CUE (Continuously Updating Estimator) has less bias, but large dispersion.
- For more details, see Newey's 14.385 notes (GMM II).


Return to simple example

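- For the linear regression model, we have $\psi(w_i; \theta) = (y_i - x_i'\beta)\, x_i$.
- Just-identified GMM sets $\frac{1}{n}\sum_{i=1}^{n} \psi\left(w_i; \hat\theta\right) = 0$.
- Solving this, we get $\hat\theta = \left(\frac{1}{n}\sum_{i=1}^{n} x_i x_i'\right)^{-1} \frac{1}{n}\sum_{i=1}^{n} x_i y_i$, which is the same as OLS.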


Linear IV example

- Now, suppose $E(\epsilon_i \mid x_i) \neq 0$, but we have a (relevant) instrument $z$ such that $E(\epsilon_i \mid z_i) = 0$ (exclusion restriction).
- The standard tool is TSLS.
- In the GMM framework, we can use the moment function $\psi(w_i, \theta) = (y_i - x_i'\beta)\, z_i$.
- If only some elements of the $K$-vector $x_i$ are endogenous, $z_i$ will also include the remaining subset. If $\dim(z_i) = \dim(x_i)$, the model is just-identified; for $\dim(z_i) > \dim(x_i)$, it is over-identified.
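To connect the GMM moment function with TSLS, the sketch below uses the fact that linear GMM has a closed-form solution for any weighting matrix C, and that the choice C = (Z'Z/n)^(-1) reproduces TSLS. The simulated data and the helper name gmm_linear are assumptions for illustration.

```matlab
% Simulated data with one endogenous regressor (hypothetical design)
n = 1000;
Z = [ones(n,1), randn(n,2)];                   % constant + 2 excluded instruments
u = randn(n,1);                                % structural error
x = Z(:,2) + 0.5*Z(:,3) + 0.8*u + randn(n,1);  % correlated with u: endogenous
X = [ones(n,1), x];                            % K = 2 < M = 3: over-identified
Y = X*[1; 2] + u;

% Linear GMM has a closed form for any weighting matrix C:
%   beta = (X'Z C Z'X)^(-1) X'Z C Z'Y
gmm_linear = @(C) (X'*Z*C*Z'*X) \ (X'*Z*C*Z'*Y);

% With C = (Z'Z/n)^(-1), the GMM estimator coincides with TSLS
beta_gmm = gmm_linear(inv(Z'*Z/n));

% TSLS computed the textbook way: regress X on Z, then Y on fitted X
Xhat      = Z*((Z'*Z)\(Z'*X));
beta_tsls = (Xhat'*X)\(Xhat'*Y);

% OLS for comparison (inconsistent here because x is endogenous)
beta_ols = (X'*X)\(X'*Y);

disp([beta_gmm, beta_tsls, beta_ols]);
```

The first two columns should match to numerical precision; the OLS column is shown only to make the effect of endogeneity visible in this simulated design.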


Typical moments in IO applications

- Look at the models we estimate for zero-correlation conditions. One obvious example is the unobserved heterogeneity term, $\xi$, in BLP. Look at what it is uncorrelated with and form a moment from that.
- Micro moments/aggregate information: Suppose you know the average of some function of the data and parameters.
- Nash conditions: This comes up in BLP's pricing equation.
- Optimality: Even in non-competitive environments, the producer is usually optimizing some objective function. Even without continuous controls, inequalities can be used.


Implementing GMM in Matlab


- The primary Matlab functions you should be familiar with are fminsearch and fminunc.
- Basic syntax example of how to use fminsearch (just-identified case):

    beta = fminsearch(@(b) (X'*(Y-X*b))'*(X'*(Y-X*b)), betastart, myopts)

- In our simple example:
  - Y is a column vector and X is a matrix where each row is an observation.
  - The answer will be stored in a variable beta.
  - @(b) means the routine will attempt to minimize the expression (X'*(Y-X*b))'*(X'*(Y-X*b)) with respect to b.
  - The starting guess for b will be the value held in the column vector betastart.
  - The routine will follow the specifications in the options set myopts, which is set before this using a command like

    myopts = optimset('TolFun',10^-12, 'MaxFunEvals',1000000,'MaxIter',1000)
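A self-contained version of the fminsearch approach described above, run on simulated data, might look like the sketch below. The sample size, coefficient values, and tolerances are assumptions for illustration; the result is compared against the closed-form OLS estimate, which solves the same sample moment condition.

```matlab
% Simulated regression data (hypothetical design, for illustration only)
n = 500;
X = [ones(n,1), randn(n,2)];         % each row is an observation
beta_true = [1; 2; -0.5];            % assumed "true" coefficients
Y = X*beta_true + randn(n,1);        % column vector of outcomes

% Sample moment vector (1/n) * sum_i x_i (y_i - x_i'b), K x 1
gbar = @(b) X'*(Y - X*b)/n;

% GMM objective with identity weighting matrix
Q = @(b) gbar(b)'*gbar(b);

myopts    = optimset('TolFun', 1e-14, 'TolX', 1e-10, ...
                     'MaxFunEvals', 1e5, 'MaxIter', 1e5);
betastart = ones(3,1);               % starting guess (column vector)
beta      = fminsearch(Q, betastart, myopts);

% Closed-form solution of the sample moment condition (OLS)
beta_ols = (X'*X)\(X'*Y);

disp([beta, beta_ols]);              % the two columns should agree closely
```

Dividing the moments by n does not change the minimizer, but it keeps the objective on a scale that does not grow with the sample size, which makes tolerances such as TolFun easier to interpret.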


Matlab: More complicated minimization


- The fminsearch command can also evaluate a named function. This is useful if your moments are hard to evaluate. In that case, you would create a separate .m file for the function. Here's an example of a moment_function.m file:

    function [val] = moment_function(beta, X, S, alpha, P)
    % Do manipulations with the input arguments beta, X, S, alpha, P.
    % Suppose you evaluate a moment condition for each observation
    % into a vector called "moment"
    ...
    % fminsearch needs a scalar to minimize, so return the squared sample moment
    val = mean(moment)^2;

- I could then call this function from fminsearch using:

    beta = fminsearch(@(b) moment_function(b, X, S, alpha, P), betastart, myopts)

- Question: How would you implement 2-step GMM?


Evaluating gradients

- Necessary for $\Gamma$ in the asymptotic variance.
- Exact differentiation (analytic derivatives) is always preferred to numerical differentiation because it avoids approximation error. It also runs much faster.
- If not practical, use finite differences with $h$ on the order of $10^{-6}$:
  - Forward difference formula:

    $$f'(x) \approx \frac{f(x + h) - f(x)}{h}$$

  - Symmetric difference formula (more accurate):

    $$f'(x) \approx \frac{f(x + h) - f(x - h)}{2h}$$

- See Judd (1998, Ch. 7) for details.
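The two formulas can be compared directly on a function with a known derivative, and the symmetric formula extends column by column to the Jacobian of a vector-valued moment function. The test functions f and g below are hypothetical examples chosen only because their derivatives are known exactly.

```matlab
% Scalar example with a known derivative: f(x) = exp(x), f'(x) = exp(x)
f  = @(x) exp(x);
x0 = 1;
h  = 1e-6;

fd_forward   = (f(x0 + h) - f(x0)) / h;          % truncation error of order h
fd_symmetric = (f(x0 + h) - f(x0 - h)) / (2*h);  % truncation error of order h^2

err_forward   = abs(fd_forward   - exp(x0));
err_symmetric = abs(fd_symmetric - exp(x0));
fprintf('forward error:   %.2e\n', err_forward);
fprintf('symmetric error: %.2e\n', err_symmetric);

% Symmetric-difference Jacobian (M x K) of a vector-valued function g
g     = @(t) [t(1)^2 + t(2); sin(t(2)); t(1)*t(2)];  % hypothetical moment vector
theta = [1; 2];
K = numel(theta);
M = numel(g(theta));
J = zeros(M, K);
for k = 1:K
    e    = zeros(K,1);
    e(k) = h;                                    % perturb one parameter at a time
    J(:,k) = (g(theta + e) - g(theta - e)) / (2*h);
end

% Analytic Jacobian for comparison
J_exact = [2*theta(1), 1; 0, cos(theta(2)); theta(2), theta(1)];
disp(max(abs(J(:) - J_exact(:))));
```

In a GMM application, g would be the sample moment vector evaluated at the estimate, and J would serve as the estimate of Γ when analytic derivatives are not practical.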


Conclusion

- If you want article references or notes for anything in this presentation or beyond, feel free to ask me.
- Questions?
