A Study of the Finite Sample Properties of

EMM, GMM, QMLE, and MLE for a
Square-Root Interest Rate Diffusion Model¹

Hao Zhou

Mail Stop 91
Federal Reserve Board
Washington, DC 20551

First Draft: April 1997
Last Revised: August 2000

¹ I would like to thank George Tauchen for his valuable advice and Ronald Gallant for his
insightful comments. The views expressed in this paper reflect those of the author and do not
represent those of the Board of Governors of the Federal Reserve System or other members of its
staff. I am grateful to an anonymous referee for his or her constructive suggestions for making the
study complete. This paper has benefited from discussions with Matthew Pritsker, Christopher
Downing, Mark Coppejans, Chien-Te Hsu, and the participants of the Duke financial economics
seminar. Please contact Hao Zhou with questions and comments: Trading Risk Analysis Section,
Division of Research and Statistics, Federal Reserve Board, Washington DC 20551 USA; Phone
1-202-452-3360; Fax 1-202-452-3819; e-mail hao.zhou@frb.gov. An earlier draft of the paper was
distributed under the title "Finite Sample Properties of Efficient Method of Moments and Maximum
Likelihood Estimations for a Square-Root Diffusion Process".

Abstract
This paper performs a Monte Carlo study on Efficient Method of Moments (EMM), Generalized
Method of Moments (GMM), Quasi-Maximum Likelihood Estimation (QMLE), and Maximum Likelihood
Estimation (MLE) for a continuous-time square-root model under two challenging scenarios (high
persistence in mean and strong conditional volatility) that are commonly found in estimating
the interest rate process. MLE turns out to be the most efficient of the four methods, but its
finite sample inference and convergence rate suffer severely from approximating the likelihood
function, especially in the scenario of highly persistent mean. QMLE comes second in terms of
estimation efficiency, but it is the most reliable in generating inferences. GMM with
lag-augmented moments has overall the lowest estimation efficiency, possibly due to the ad hoc
choice of moment conditions. EMM shows an accelerated convergence rate in the high volatility
scenario, while its overrejection bias in the mean persistence scenario is unacceptably large.
Finally, under a stylized alternative model of the US interest rates, the overidentification
test of EMM obtains the ultimate power for detecting misspecification, while the GMM J-test is
increasingly biased downward in finite samples.

Keywords: Monte Carlo Study, Efficient Method of Moments, Maximum Likelihood
Estimation, Square-Root Diffusion, Quasi-Maximum Likelihood, Generalized Method of Moments.
JEL classification: C15; C22; C52

1 Introduction
When estimating a continuous time model in finance, one often faces the difficulty of partial
observability. Usually the continuous time record is not available, since the data are discretely
sampled. A further complication is that the transitional density of the stochastic process does
not always have a closed-form solution. Due to the lack of a tractable likelihood function, much
of the interest among researchers has turned to nonlikelihood-based approaches. For instance, by
exploiting the analytical solutions of the first two moments, the Quasi-Maximum Likelihood
Estimation (QMLE), as discussed in Bollerslev and Wooldridge (1992), conveniently circumvents
the need to evaluate the density function, although the asymptotic validity of QMLE imposes
certain restrictions on the innovation density (Newey and Steigerwald 1997). Moreover, the
Generalized Method of Moments (GMM) by Hansen (1982) and Hansen and Scheinkman (1995) further
reduces the reliance on distribution assumptions by matching the empirical moments with the
theoretical ones. Meanwhile, the Simulated Method of Moments (SMM) in time-series application
(Ingram and Lee 1991, Duffie and Singleton 1993) minimizes the reliance on distribution
assumptions by matching the empirical moments with the simulated ones. Both GMM and SMM are
robust to the misspecification of likelihood functions, while retaining a parametric model to
conduct simulation or projection. However, these methods of moments suffer from the ad hoc
choice of moment conditions and must presume the existence of arbitrary population moments; and
the chi-square specification test of the overidentifying restrictions is subject to severe
overrejection bias (Hansen, Heaton, and Yaron 1996, Andersen and Sørensen 1996). Furthermore,
the efficiency loss of parameter estimates is closely related to the high cost in estimating the
weighting matrix, as the variance-covariance matrix is typically heteroskedastic and serially
correlated (Andersen and Sørensen 1996). The Wald test is also found to exceed its asymptotic
size due to the difficulty in estimating the residual spectral-density matrix (Burnside and
Eichenbaum 1996).
The Efficient Method of Moments (EMM), introduced by Bansal, Gallant, Hussey, and
Tauchen (1995) and Gallant and Tauchen (1996c), endogenously selects the moment conditions
during the procedure's first step. A seminonparametric score generator (SNP) uses the
Fourier-Hermite polynomial to approximate the underlying transitional density. As an
orthogonal series estimator, the SNP density has a fast uniform convergence, given the
smoothness of the underlying distribution function. A suitable model selection criterion,
e.g., Schwarz's Bayesian Information Criterion (BIC), is used to choose the direction
and complexity of the auxiliary model expansion. The second stage of EMM is simply an
SMM-type estimator, minimizing the quasi-maximum likelihood score functions that are chosen
appropriately in the first stage. Since the score functions are orthogonal, the weighting
matrix (i.e., the information matrix from the quasi-maximum likelihood estimator) should
be nearly serially uncorrelated. Hence the asymptotic variance estimator approaches the
minimum bound, and the parameter estimates are asymptotically as efficient as MLE. It
is proven that for an ergodic stochastic system with partially observed data, the efficiency
of EMM approaches that of MLE, as the number of moment conditions and the number of
lags entering each moment increase with the sample size (Gallant and Long 1997). Another
salient feature of EMM is the capability to detect a misspecified structural model, if the
auxiliary model is rich enough such that Hermite polynomial scores approximate the true
scores fairly well (Tauchen 1997). Under correct specification of the maintained model, the
normalized objective function value converges in distribution to a chi-square distribution.
Under misspecification, the unnormalized objective function converges almost surely to a
constant. For a particular choice of score generator, this constant may be zero, in which case
the chi-square test has little power against the alternatives. If the data generating process
is adequately captured by a more flexible nonparametric score generator, the constant is
positive and rejection of the misspecified model is almost certain.
Recent Monte Carlo studies documented significant efficiency gains of EMM over GMM
(Andersen, Chung, and Sørensen 1999), but with similar overrejection problems in
specification tests (Chumacero 1997). In more analytical fashion, Gallant and Tauchen (1998a)
show that EMM outperforms the conventional method of moments (CMM) for a representative
class of econometric models, given the same number of moments being selected. However,
there is no universal theory regarding the efficiency of EMM versus that of CMM, and the
comparison must be made case by case (Gallant and Tauchen 1998a). Therefore choosing
EMM over QMLE, as in Dai and Singleton (2000), should be accompanied by solid argument
or Monte Carlo evidence. This paper complements these comparative studies in several areas.
First, the Monte Carlo setup is a continuous time model, like many recent applications
of EMM, which have focused on stochastic differential equations. Second, I consider the
relative efficiency of EMM with respect to asymptotically efficient MLE, computationally
efficient QMLE, and empirically attractive GMM. Third, both blindfold and educated choices
of moment conditions are examined, so as to best imitate the realistic approaches taken
by researchers using EMM. Thus, the main contribution of this paper is to provide Monte
Carlo evidence which shows that when the analytical density or moments are unavailable,
EMM performs reasonably well compared to the infeasible MLE, QMLE, or GMM. Further,
under two challenging scenarios (mean persistence and volatility clustering), each method has
its own strengths and weaknesses in terms of estimation efficiency, parameter inference, and
specification testing.
A square-root diffusion process (Cox, Ingersoll, and Ross 1985) is chosen as the vehicle
for conducting the Monte Carlo study. On the one hand, the CIR model is simple enough
to give closed-form solutions for both the transitional density and the asset pricing formula.
On the other hand, it is rich enough to generate a highly persistent volatility and
non-Gaussian error distribution. The square-root process seems to be a good starting point
to model more complicated financial time series data. EMM estimation of the interest
rate diffusions is reported by Gallant and Tauchen (1998b), and the square-root model is
firmly rejected. With a closed-form transitional density, the dynamic maximum likelihood
estimation was implemented for the two-factor CIR model (Pearson and Sun 1994, Duffie and
Singleton 1997). Gibbons and Ramaswamy (1993) employed a GMM estimator, using the
stochastic Euler equations to generate the moment conditions. Their results favor the
square-root model. The most recent interest in affine term structure (Dai and Singleton 2000)
can be viewed as an immediate extension of the multifactor square-root model. The CIR model
also has an explicit marginal density in terms of the drift and volatility functions, which
motivated a nonparametric specification test (Aït-Sahalia 1996b). Conley, Hansen, Luttmer,
and Scheinkman (1997) implemented a GMM estimator for a subset of the parameters by
exploiting the reversibility of a stationary Markov chain. Fisher and Gilles (1996) proposed
a general QMLE estimator for the CIR type diffusion processes.
The remaining sections are organized as follows: Section 2 discusses some properties of
the square-root model and characterizes the implementations of MLE, QMLE, and GMM;
Section 3 introduces the relatively new Efficient Method of Moments estimator; Section 4
designs the Monte Carlo experiment and suggests the benchmark model choice; Section 5
reports the major findings; and Section 6 concludes.

2 Square-Root Model and Parametric Estimator
This section defines a maximum likelihood estimator for the square-root model, based on
a Poisson-mixing-Gamma characterization of the likelihood function. A quasi-maximum
likelihood estimator is also available with analytical solutions to the first two conditional
moments. Augmenting these two moments with instrumental variables gives a generalized
method of moments estimator found in the literature.

2.1 Probabilistic Solution to Square-Root Model
It is a well-known result that the square-root model,

$$ dr_t = (a_0 + a_1 r_t)\,dt + b_0 r_t^{1/2}\,dW_t, \tag{1} $$

satisfies the regularity conditions for both a strong solution (pathwise convergent) and a weak
solution (convergent in probability) (Karatzas and Shreve 1997). A strong solution obviously
implies a weak one, but not vice versa. If (1) $a_0 > 0$, (2) $b_0 > 0$, (3) $a_1 < 0$, and
(4) $b_0^2 \le 2a_0$, then the square-root model has a unique fundamental solution (Feller 1951).
The marginal density is a Gamma distribution, and the transitional density is a type I Bessel
function distribution or a noncentral chi-square distribution with a fractional order
(Cox et al. 1985). Intuitively, condition (3) gives mean reversion, and condition (4) ensures
the stationarity.
The marginal Gamma distribution is

$$ f(r_0 \mid \nu, \omega) = \frac{\omega^{\nu}}{\Gamma(\nu)}\, r_0^{\nu-1} e^{-\omega r_0}, \tag{2} $$

where $\nu = 2a_0/b_0^2$, $\omega = -2a_1/b_0^2$, and $\Gamma(\cdot)$ is the Gamma function.¹
The unconditional mean and variance are

$$ E(r_0) = -\frac{a_0}{a_1} = \frac{\nu}{\omega}, \tag{3} $$

$$ V(r_0) = \frac{b_0^2 a_0}{2a_1^2} = \frac{\nu}{\omega^2}. \tag{4} $$

¹ In some textbooks and software, the second parameter of the Gamma density is written in terms
of $1/\omega$ instead of $\omega$ (Johnson and Kotz 1970).


Notice that the first two moments merely identify the marginal distribution. Higher order
moments are simply nonlinear functions of the first two moments. The marginal density alone
cannot identify all three parameters in the diffusion process. Any GMM-type estimator must
add at least one lagged instrumental variable (Gibbons and Ramaswamy 1993). A rejection of
the marginal distribution can reject the square-root model; however, a non-rejection does not
provide enough information for judging a particular parameter setting (Aït-Sahalia 1996b).
Transitional information must be exploited to fully identify the dynamic structure.
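The conditional density² is

$$ f(r_1 \mid r_0; a_0, a_1, b_0) = c\, e^{-u-v} \left(\frac{v}{u}\right)^{q/2} I_q\!\left(2(uv)^{1/2}\right), \tag{5} $$

$$ q = \frac{2a_0}{b_0^2} - 1, \tag{6} $$

where $c = -2a_1/(b_0^2(1 - e^{a_1}))$, $u = c\, r_0 e^{a_1}$, and $v = c\, r_1$. $I_q(\cdot)$ is a
modified Bessel function of the first kind with a fractional order $q$ (Oliver 1972). The
conditional mean and variance are

$$ E(r_1 \mid r_0) = r_0 e^{a_1} - \frac{a_0}{a_1}\,(1 - e^{a_1}), \tag{7} $$

$$ V(r_1 \mid r_0) = r_0 \left(-\frac{b_0^2}{a_1}\right) e^{a_1}(1 - e^{a_1}) + \frac{b_0^2 a_0}{2a_1^2}\,(1 - e^{a_1})^2. \tag{8} $$
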
According to the stationary property, the limit of the transitional density, as the time
interval goes to infinity, is exactly the marginal density. Therefore any estimation strategy or
specification test that exploits the transitional density will naturally nest the ones that rely
on the marginal density.
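The nesting of the marginal moments in the conditional ones can be checked numerically. The following is a minimal Python sketch of equations 3, 4, 7, and 8; the function names are hypothetical, and the `dt` argument is an added assumption (the paper normalizes the sampling interval to one) so that the limit of a long time interval can be inspected.

```python
import math

def cir_conditional_moments(r0, a0, a1, b0, dt=1.0):
    """Conditional mean and variance of r_{t+dt} given r_t = r0 (equations 7 and 8).

    dt is an illustrative generalization; dt = 1 reproduces the formulas in the text.
    """
    e = math.exp(a1 * dt)
    mean = r0 * e - (a0 / a1) * (1.0 - e)
    var = (r0 * (-b0 ** 2 / a1) * e * (1.0 - e)
           + (b0 ** 2 * a0 / (2.0 * a1 ** 2)) * (1.0 - e) ** 2)
    return mean, var

def cir_unconditional_moments(a0, a1, b0):
    """Mean and variance of the marginal Gamma distribution (equations 3 and 4)."""
    return -a0 / a1, b0 ** 2 * a0 / (2.0 * a1 ** 2)

# High-volatility parameter values quoted later in the text (Scenario 8).
a0, a1, b0 = 2.491, -0.285, 1.1
long_run = cir_conditional_moments(5.0, a0, a1, b0, dt=200.0)
marginal = cir_unconditional_moments(a0, a1, b0)
```

With a long interval the starting value `r0` no longer matters, so `long_run` and `marginal` agree up to floating-point error.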
It is a common practice in the literature to call this distribution a "noncentral chi-square
distribution." However, the "integer order noncentral chi-square distribution" does
not naturally extend to the "fractional order noncentral chi-square distribution." The latter
arises commonly from the solution to a diffusion process (Feller 1971), while the former
arises from the sample standard deviation of independent, nonidentical, noncentered, normal
random variables (Johnson and Kotz 1970).

² The process is sampled with weekly observations and the time interval of a week is normalized
to be one.


2.2 Analytical Foundations for MLE, QMLE, and GMM
In industry and academics alike, one popular method in estimating the square-root model for
interest rates is the Discretized Maximum Likelihood Estimation (DMLE), i.e., a misspecified
QMLE based on the time discretization of the conditional mean
$E(r_1 \mid r_0) \approx a_0 + (1 + a_1) r_0$ and variance $V(r_1 \mid r_0) \approx b_0^2 r_0$.
As pointed out by Lo (1988), DMLE is generally not consistent. The parameter estimates are
asymptotically biased, since both moments are misspecified.
When implementing MLE for the square-root model, the Bessel function representation
of the likelihood function is not at all a convenient form (Pearson and Sun 1994, Duffie and
Singleton 1997). An alternative Poisson-mixing-Gamma characterization can be inferred
from the simulation strategy suggested by Devroye (1986). Within the admissible parameter
region, one can substitute the Bessel function with an infinite series expansion (Oliver 1972).
With appropriate transformations ($y = v$, $\lambda = q + 1$, and $\delta = \sqrt{2u}$), the
alternative mixing formula comes out nicely,

$$
\begin{aligned}
f(y) &= \sum_{j=0}^{\infty} \frac{y^{\,j+\lambda-1} e^{-y}}{\Gamma(j+\lambda)} \cdot
        \frac{(\delta^2/2)^{j}\, e^{-\delta^2/2}}{j!} \\
     &= \sum_{j=0}^{\infty} \mathrm{Gamma}(y \mid j+\lambda, 1) \cdot
        \mathrm{Poisson}(j \mid \delta^2/2).
\end{aligned} \tag{9}
$$

One needs to be cautious that the Poisson weights are not constant, but rather depend on
the previous realization $r_0$. This formula corresponds to the "Poisson driven Gamma process"
in Feller (1971). The only difference is that $\lambda = q + 1$ remains a fractional number,
not an integer. The evaluation of the log-likelihood function in MLE is greatly simplified when
using the Poisson-mixing-Gamma formula. It is fairly easy to achieve the single precision
$10^{-8}$ by truncating the Poisson distribution around 100. One can also avoid any
complication of complex value or non-convergence in evaluating the Bessel function.³
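To make the truncated mixture concrete, here is a minimal Python sketch of the log transition density implied by equation 9. It is an illustration only, not the paper's FORTRAN routine: the function name, the `n_terms` argument, and the log-sum-exp accumulation are assumptions added here. As the text cautions, a cutoff near 100 is only adequate when the Poisson mean $u$ is well below the cutoff, which holds in the high volatility case but not in the mean persistence case.

```python
import math

def cir_log_transition_density(r1, r0, a0, a1, b0, n_terms=100):
    """Log of f(r1 | r0) from the Poisson-mixing-Gamma series, truncated at n_terms."""
    c = -2.0 * a1 / (b0 ** 2 * (1.0 - math.exp(a1)))
    u = c * r0 * math.exp(a1)      # Poisson mean, equal to delta^2 / 2
    y = c * r1                     # transformed state, y = v = c * r1
    lam = 2.0 * a0 / b0 ** 2       # Gamma shape offset, lambda = q + 1
    log_terms = []
    for j in range(n_terms + 1):
        # log Gamma(y | j + lam, 1) + log Poisson(j | u)
        log_gamma_pdf = (j + lam - 1.0) * math.log(y) - y - math.lgamma(j + lam)
        log_poisson_pmf = j * math.log(u) - u - math.lgamma(j + 1.0)
        log_terms.append(log_gamma_pdf + log_poisson_pmf)
    # Log-sum-exp keeps the accumulation stable (an implementation choice made here).
    top = max(log_terms)
    log_f_y = top + math.log(sum(math.exp(t - top) for t in log_terms))
    # Change of variables from y back to r1 = y / c contributes log(c).
    return math.log(c) + log_f_y
```

Summing this function over consecutive pairs of observations gives the exact log-likelihood up to truncation error; a quick sanity check is that the implied density integrates to one over `r1`.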
³ This mixing approach to the likelihood function works well for the high volatility case, but
the approach fails for the mean persistence case, because the latter is close to the unit root
or nonstationary region (see Section 4.1 for details on choosing the benchmark scenario) and
because the evaluation of the likelihood function easily diverges. To implement MLE in the mean
persistence case, I adopt the asymptotic expansion formula (Oliver 1972) in the same fashion as
Pearson and Sun (1994).

The exact expressions for the conditional mean and variance (equations 7 and 8) suggest
a quasi-maximum likelihood estimator (QMLE) for the square-root model,

$$
\max_{\{a_0, a_1, b_0\}} \sum_{t=1}^{T-1} \log \left\{ \frac{1}{\sqrt{2\pi V(r_{t+1} \mid r_t)}}
\exp\!\left( -\frac{[r_{t+1} - E(r_{t+1} \mid r_t)]^2}{2V(r_{t+1} \mid r_t)} \right) \right\}. \tag{10}
$$


QMLE is shown to be root-n consistent (Bollerslev and Wooldridge 1992) and asymptotically
normal (Newey and Steigerwald 1997) under some mild regularity conditions.
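The QMLE criterion in equation 10 can be sketched in a few lines of Python. The names below are hypothetical and the sketch returns the negative of the criterion, so that it can be handed to any general-purpose minimizer (the paper itself uses NPSOL); no optimizer is included here.

```python
import math

def cir_conditional_moments(r0, a0, a1, b0):
    """Exact one-period conditional mean and variance (equations 7 and 8)."""
    e = math.exp(a1)
    mean = r0 * e - (a0 / a1) * (1.0 - e)
    var = (r0 * (-b0 ** 2 / a1) * e * (1.0 - e)
           + (b0 ** 2 * a0 / (2.0 * a1 ** 2)) * (1.0 - e) ** 2)
    return mean, var

def qmle_neg_loglik(params, rates):
    """Negative Gaussian quasi-log-likelihood of equation 10 for a series of rates."""
    a0, a1, b0 = params
    total = 0.0
    for r_prev, r_next in zip(rates[:-1], rates[1:]):
        mean, var = cir_conditional_moments(r_prev, a0, a1, b0)
        total += 0.5 * math.log(2.0 * math.pi * var) + (r_next - mean) ** 2 / (2.0 * var)
    return total
```

Because only the first two conditional moments enter, each evaluation is cheap, which is consistent with the short QMLE run times reported in Section 4.1.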
For comparison purposes, a generalized method of moments (GMM) estimator can be
constructed from the following moment vector

$$
f_t(a_0, a_1, b_0) =
\begin{bmatrix}
r_{t+1} - E(r_{t+1} \mid r_t) \\
r_t\,[r_{t+1} - E(r_{t+1} \mid r_t)] \\
V(r_{t+1} \mid r_t) - [r_{t+1} - E(r_{t+1} \mid r_t)]^2 \\
r_t\,\{V(r_{t+1} \mid r_t) - [r_{t+1} - E(r_{t+1} \mid r_t)]^2\}
\end{bmatrix}. \tag{11}
$$

The parameter is estimated by $\min g_T' W g_T$, where $g_T = \frac{1}{T}\sum_{t=1}^{T-1} f_t$ and
$W$ is the inverse of the asymptotic variance-covariance matrix of $g_T$ (Hansen 1982). A
two-step estimator is adopted here.
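The moment vector in equation 11 and the quadratic criterion can be sketched as follows. This is an illustrative NumPy version with hypothetical function names. The second-step weight shown is the inverse of the simple mean outer product of the moments; it ignores serial correlation, which is a simplifying assumption made here and not necessarily the estimator used in the paper.

```python
import numpy as np

def gmm_moment_matrix(params, rates):
    """Stack f_t from equation 11 into a (T-1) x 4 array."""
    a0, a1, b0 = params
    r_prev, r_next = rates[:-1], rates[1:]
    e = np.exp(a1)
    mean = r_prev * e - (a0 / a1) * (1.0 - e)                      # equation 7
    var = (r_prev * (-b0 ** 2 / a1) * e * (1.0 - e)
           + (b0 ** 2 * a0 / (2.0 * a1 ** 2)) * (1.0 - e) ** 2)    # equation 8
    eps = r_next - mean          # conditional mean error
    eta = var - eps ** 2         # conditional variance error
    return np.column_stack([eps, r_prev * eps, eta, r_prev * eta])

def gmm_objective(params, rates, weight):
    """Quadratic form g_T' W g_T with g_T = (1/T) * sum of f_t."""
    f = gmm_moment_matrix(params, rates)
    g = f.sum(axis=0) / len(rates)
    return float(g @ weight @ g)

def second_step_weight(params_first_step, rates):
    """Inverse mean outer product of f_t (assumption: no serial-correlation correction)."""
    f = gmm_moment_matrix(params_first_step, rates)
    return np.linalg.inv(f.T @ f / f.shape[0])
```

A two-step run would minimize `gmm_objective` with `weight = np.eye(4)`, rebuild the weight at the first-step estimate with `second_step_weight`, and minimize again.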

2.3 Inference
Since the true parameter value is known in the Monte Carlo setting, the likelihood ratio tests
for MLE can determine how often the confidence region centered at the estimated parameter value
contains the truth. The same principle may be extended to QMLE if the expected quasi-likelihood
function is uniquely maximized at the truth (satisfying the identification requirement). In
addition to the correct specification of the mean and variance, the remaining innovation error
must be centered at zero or have a symmetric distribution (Newey and Steigerwald 1997). QMLE
for the square-root model meets these conditions. Also, the sample-size normalized criterion
function value in GMM provides a specification test distributed as a $\chi^2(1)$, since four
moment conditions are used to estimate three parameters.
The standard error for an individual parameter is estimated by the inverted Hessian
formula in MLE, the OPG-Hessian-OPG formula in QMLE, and the Gradient-Weighting
Matrix-Gradient formula in GMM.

2.4 Simulation
This characterization (equation 9) also defines a composite method to simulate the
square-root process (Devroye 1986). First, one draws a random number $j$ from the
Poisson($j \mid \delta^2/2$), where $\delta^2/2 = u$. Then, one draws another random number $y$
from the Gamma($y \mid j + \lambda, 1$), where $\lambda = q + 1$. Finally, one calculates the
desired state variable $r_1$ by $r_1 = y/c$. Notice that the realized $r_1$ is the
conditioning value $r_0$ in the next draw. The initial value $r_0$, when starting a simulation
run, can be set to the theoretical unconditional mean. To remove the transient effect, the first
1000 realizations can be discarded.
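The three-step recipe above translates directly into code. Below is a minimal NumPy sketch; the function name, the `seed` argument, and the use of `numpy.random.Generator` are illustrative assumptions, while the burn-in of 1000 and the starting value at the unconditional mean follow the text.

```python
import numpy as np

def simulate_cir(n_obs, a0, a1, b0, burn_in=1000, seed=0):
    """Simulate the square-root process exactly at unit intervals via Poisson-mixing-Gamma."""
    rng = np.random.default_rng(seed)
    decay = np.exp(a1)
    c = -2.0 * a1 / (b0 ** 2 * (1.0 - decay))
    lam = 2.0 * a0 / b0 ** 2          # lambda = q + 1
    r = -a0 / a1                      # start at the unconditional mean
    path = np.empty(n_obs)
    for t in range(burn_in + n_obs):
        u = c * r * decay             # Poisson mean depends on the previous realization
        j = rng.poisson(u)            # step 1: Poisson mixing index
        y = rng.gamma(shape=j + lam, scale=1.0)   # step 2: Gamma draw given j
        r = y / c                     # step 3: map back to the interest rate
        if t >= burn_in:
            path[t - burn_in] = r     # keep only post-burn-in realizations
    return path

# Example with the high-volatility parameter values quoted later in the text.
sample = simulate_cir(20000, 2.491, -0.285, 1.1, seed=1)
```

Because each draw comes from the exact transition law, no discretization step size has to be chosen, and the sample mean of a long path should settle near $-a_0/a_1$.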

3 Efficient Method of Moments Estimation
This section describes the EMM estimator in a univariate case. For formal discussions, see
Gallant and Tauchen (1996c), Tauchen (1997), and Gallant and Long (1997).

3.1 Approximating True Density with Auxiliary Model
Denote the invariant probability measure implied by the underlying data generating process
as the p-model. It is assumed that the direct maximum likelihood estimation of the p-model
is not available. However, any smooth density function can be approximated arbitrarily closely
by a Hermite polynomial expansion.
Consider a scalar case. Let $y$ be the random variable, $x$ be the lagged $y$, and $\theta$ be
the parameter. The auxiliary f-model has a density function defined by a modified Hermite
polynomial,

$$ f(y \mid x; \theta) = C\,\{[P(z, x)]^2\, \phi(y \mid \mu_x, \sigma_x^2)\}, \tag{12} $$

where $P$ is a polynomial with degree $K_z$ in $z$ and thus the square of $P$ makes the density
positive. The argument of the polynomial is $z$, which is the transformation
$z = (y - \mu_x)/\sigma_x$. The coefficient of the polynomial is another polynomial of degree
$K_x$ in $x$. The constant in the polynomial is set to 1 for identification. $C$ is a
normalizing factor to make the density proper.⁴ $\phi(\cdot)$ is a normal density of $y$ with
conditional mean $\mu_x$ and conditional variance $\sigma_x^2$.
The length of the auxiliary model parameter is determined by the lag in mean $L_\mu$, lag in
variance $L_r$, lag in polynomial coefficient $L_p$, polynomial degree $K_z$, and polynomial
degree $K_x$. Let $\{\tilde y_t\}_{t=1}^{n}$ be the observed data and $\tilde x_{t-1}$ be the
lagged observations. The sample mean log-likelihood function is defined by

$$ L_n(\theta; \{\tilde y_t\}_{t=1}^{n}) = \frac{1}{n} \sum_{t=1}^{n} \log[f(\tilde y_t \mid \tilde x_{t-1}; \theta)]. \tag{13} $$

A quasi-maximum-likelihood estimator is obtained by

$$ \tilde\theta = \arg\max_{\theta} L_n(\theta; \{\tilde y_t\}_{t=1}^{n}). \tag{14} $$

⁴ In the case of multivariate density, the interaction terms in both the Hermite polynomial and
the coefficient polynomial can be set to zero, in order for a gradual expansion of the auxiliary
model. See above references for details.

The dimension of the auxiliary f-model, the length of $\theta$, is selected by Schwarz's Bayesian
Information Criterion (BIC). There are different choices of information criteria for optimal
model selection. For a finite dimensional stationary process, BIC with a larger penalty for
model complexity proves to be consistent, while Akaike's Information Criterion (AIC)
will overfit the model. On the other hand, if the true dimension is infinity or increases to
infinity with the sample size, AIC with a smaller penalty for model complexity is optimal
(Zheng and Loh 1995). The dimension of the f-model needs to be at least as large as that of the
p-model to meet the identification condition.⁵

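3.2 Matching Auxiliary Scores with Minimum Chi-Square
From the first-stage seminonparametric estimates, one obtains the fitted scores as the moment
conditions,

$$ m_n(\tilde\theta) = \frac{1}{n} \sum_{t=1}^{n} \frac{\partial}{\partial\theta} \log f(\tilde y_t \mid \tilde x_{t-1}; \tilde\theta). \tag{15} $$
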
In the second stage, an SMM-type estimator is implemented in the following way. Although
the direct MLE for the p-model is assumed to be impossible, the simulation from the structural
model (e.g., the stochastic differential equation) is readily available. Let
$\{\hat y_t\}_{t=1}^{N}$ be a long simulation from a candidate value of $\rho$, the parameter of
the maintained structural model. The auxiliary score functions can be reevaluated by numerical
integration of the score functions with the simulated data,

$$ \hat m_N(\rho, \tilde\theta) = \frac{1}{N} \sum_{t=1}^{N} \frac{\partial}{\partial\theta} \log f(\hat y_t \mid \hat x_{t-1}; \tilde\theta), \tag{16} $$

and the minimum chi-square estimator is simply

$$ \hat\rho = \arg\min_{\rho} \{ \hat m_N(\rho, \tilde\theta)'\, \tilde I^{-1}\, \hat m_N(\rho, \tilde\theta) \}, \tag{17} $$

where the weighting matrix $\tilde I^{-1}$ is estimated by the mean outer product of scores from
the auxiliary model,

$$ \tilde I = \frac{1}{n} \sum_{t=1}^{n} \left[\frac{\partial}{\partial\theta} \log f(\tilde y_t \mid \tilde x_{t-1}; \tilde\theta)\right] \left[\frac{\partial}{\partial\theta} \log f(\tilde y_t \mid \tilde x_{t-1}; \tilde\theta)\right]'. \tag{18} $$

⁵ Even when the underlying p-model has a fixed dimension (i.e., a fixed number of parameters),
the auxiliary f-model can still have an increasing order in finite sample sizes, because of the
approximation from auxiliary model to true model.


Remember that the finite sample efficiency loss of the GMM-type estimators is largely
attributed to the high cost and low accuracy in estimating the serially correlated weighting
matrix. In EMM the moment conditions (score functions) of the first step QMLE are orthogonal
by construction; hence the information matrix is diagonal or nearly diagonal (i.e.,
serially uncorrelated). EMM will be asymptotically as efficient as MLE if the following
conditions are met: the dimension of the auxiliary model is sufficiently large
($K \to \infty$), the lag in the auxiliary model is sufficiently long ($L \to \infty$), and the
simulation from the maintained structural model is sufficiently long ($N \to \infty$). This
encompasses both Markovian and non-Markovian cases (Gallant and Long 1997).
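The second-stage criterion reduces to simple linear algebra once the auxiliary scores have been evaluated. The NumPy sketch below assumes the per-observation score vectors are already available as arrays (rows are observations, columns are auxiliary parameters); how they are produced by the SNP model is not shown, and all function names are hypothetical.

```python
import numpy as np

def opg_information(scores_obs):
    """Mean outer product of auxiliary scores at the observed data (equation 18)."""
    s = np.asarray(scores_obs, dtype=float)
    return s.T @ s / s.shape[0]

def emm_objective(scores_sim, info):
    """Criterion m' I^{-1} m, with m the mean score over the simulated data (equations 16-17)."""
    m = np.asarray(scores_sim, dtype=float).mean(axis=0)
    return float(m @ np.linalg.solve(info, m))

def emm_test_statistic(scores_sim, info, n_obs):
    """Criterion scaled by the observed sample size n, used as the specification test."""
    return n_obs * emm_objective(scores_sim, info)

# Tiny artificial example with two auxiliary parameters.
info = opg_information([[1.0, 0.0], [0.0, 2.0], [-1.0, 0.0], [0.0, -2.0]])
value = emm_objective([[0.1, 0.2], [0.3, 0.0]], info)
```

In an actual estimation, `scores_sim` would be recomputed from a fresh long simulation at every candidate structural parameter, while `info` is computed once from the first-stage fit.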

3.3 Overrejection and Misspecification
The normalized criterion function value in the EMM estimation,

$$ \chi_n^2 = n\, \hat m_N(\hat\rho, \tilde\theta)'\, \tilde I^{-1}\, \hat m_N(\hat\rho, \tilde\theta), \tag{19} $$

forms a specification test for the overidentifying restrictions. Under the correct specification
of the maintained model (Tauchen 1997), we have
$\chi_n^2 \xrightarrow{D} \chi^2(l_\theta - l_\rho)$, where the degree of freedom equals the
parameter length of the auxiliary model minus that of the structural model. However, if the
maintained model is misspecified (Tauchen 1997), we have

$$ \frac{\chi_n^2}{n} \xrightarrow{a.s.} \kappa^2 = \hat m_N(\rho^{*}, \theta^{*})'\, (I^{*})^{-1}\, \hat m_N(\rho^{*}, \theta^{*}) > 0, $$

where $\rho^{*}$, $\theta^{*}$, and $I^{*}$ are the asymptotic pseudo-true values under the
maintained misspecification. As long as the sample size $n$ is large enough, the SNP score
generator will be rich enough such that the false passage $\kappa^2 = 0$ will not occur under
misspecification.


4 Monte Carlo Design and Benchmark Choice
Technical aspects of the Monte Carlo experiment are summarized here. Most importantly,
two demanding scenarios (mean persistence and high volatility) are chosen as the benchmarks.
Also, a maintained model of interest rate with stochastic volatility and level feedback
is designated as the data generating process for comparing the power of detecting
misspecification.

4.1 Experimental Design
All computations are performed on Sun Unix workstations (Sparc 10, Sparc 20, and Pentium
II-300). Programs for generating random samples and GMM, QMLE, and MLE estimations
of the square-root model are written in FORTRAN. The FORTRAN codes for
SNP and EMM are modified from SNP Version 8.5 (Gallant and Tauchen 1996b) and EMM
Version 1.3 (Gallant and Tauchen 1996a), incorporating automatic SNP search by BIC in
some of the EMM estimations. NPSOL (Gill, Murray, Saunders, and Wright 1991) is the
optimization routine used for all of the programs.
The number of Monte Carlo replications is chosen as 1000 for each scenario and each
estimator. Two finite sample sizes (500 and 1500 weekly observations) are used for contrasting
the asymptotic behavior of each estimator. To generate the pseudo-random samples, the
Poisson-mixing-Gamma formula in Section 2 is implemented, and the initial stretch of 1000
observations is discarded each time in order to remove the transient effect. The second stage
of EMM estimation is similar to SMM. The simulation size should be at least 30,000. EMM does
stabilize from 50,000 to 75,000 and could have more Monte Carlo errors at 100,000. Thus
50,000 turns out to be a conservative but economical choice (Gallant and Tauchen 1998b).
To choose an SNP score generator in the EMM estimation, one can either let BIC automatically
decide the SNP dimension in each replication, or use a "posterior" fixed
SNP specification. For the comparisons with MLE, QMLE, and GMM estimators, I use
the fixed SNP score in EMM estimation, so that the overidentification test statistic has the
same degrees of freedom for each sample size. However, in order to choose an appropriate
fixed SNP score, I first run EMM 1000 times with the automatic SNP score generator. I
then adopt one particular score for the 500 sample size and another for the 1500 sample
size. These specifications are at relatively higher dimensions, with abundant occurrence but
without severe rejection. A by-product of the EMM estimation with automatic SNP score is
that some light can be shed on how the overrejection bias varies with the number of moments
chosen by the SNP estimation.⁶
In terms of the computing time, each QMLE run takes about 3-5 seconds; each GMM
run takes about 5-10 seconds; each MLE run takes about 5-10 minutes; and each EMM run
takes about 1-2 hours. Note that MLE requires numerical approximation to the likelihood
function and also that EMM uses numerical integration to evaluate the moment conditions.
The EMM estimator is already programmed and ready to be implemented (Gallant and Tauchen
1996b, Gallant and Tauchen 1996a). One only needs to incorporate the square-root model
in the simulation code. The development time for QMLE and EMM is minimal, while GMM
and especially MLE require a lot of fine-tuning. Overall, EMM is still the most computationally
intensive method, although there are numerical techniques to speed up the estimation (e.g.,
parallel processing or antithetic simulation).

4.2 Benchmark Model
To select a suitable parameter setting, we start with the empirical result from Gallant and
Tauchen (1998b), $dr_t = (0.02491 - 0.00285 r_t)\,dt + 0.0275\, r_t^{1/2}\,dW_t$. Using
equations 3, 4, and 6-8, one can calculate the unconditional mean and variance, the Bessel
function order, and the conditional mean and variance. It is not difficult to see that this
original specification, Scenario 1 in Table 1, features low mean-reversion
($E(r_{t+1} \mid r_t)$ is nearly the unit-root) and low conditional volatility
($V(r_{t+1} \mid r_t)$ is close to zero). Also the unconditional variance is
unusually small, representing an abnormally quiet process. The order of the Bessel function,
twice of which corresponds to the degree of freedom for an integer order noncentral chi-square
distribution, is so large that the conditional density looks almost Gaussian. Notwithstanding
all these shortcomings, Scenario 1 is still the typical empirical result. It poses an important
challenge to researchers in fitting the highly persistent, nearly unit-root conditional mean
process, although its volatility structure and innovation density are not rich enough. Hence
it will only rarely generate the high order ARCH or the high degree polynomial in the SNP
score. Based on the empirical results in Gallant and Tauchen (1998b), I use the established
SNP score for Scenario 1 (s14140 for the 500 sample size and s14141 for the 1500 sample
size⁷), and I report only the fixed score EMM estimations. Scenario 1 in Table 1 is
termed an LMR-LCV specification (low-mean-reversion and low-conditional-volatility).

⁶ This automatic-plus-fixed SNP-EMM procedure mimics the realistic situation, when an empirical
researcher not only relies on BIC as an objective criterion in model selection but also
incorporates prior subjective information to expand the auxiliary model, e.g., the E-GARCH
score generator in Andersen et al. (1999).
If one increases only the variance parameter $b_0$ from Scenario 1 to 2-4 in Table 1, the Bessel
function order $q$ decreases gradually from the Gaussian-like specification, but the conditional
volatility is still negligible. Alternatively, one can increase both the mean parameters $a_0$
and $a_1$ by a factor of 100 and the variance parameter $b_0$ by a factor of 10 from Scenario 1
to 5 as in Table 1, while holding the unconditional mean and variance constant. This change
will increase the conditional volatility slightly, but the Bessel function order $q$ is still
quite large (resembling a Gaussian-like distribution). If one increases the variance parameter
$b_0$ from Scenario 5 to 6-8 as shown in Table 1, both high conditional volatility and small
Bessel function order are achieved. Scenario 8 is rich enough in both the conditional volatility
and non-Gaussian innovation, hence Scenario 8 is suitable for examining the automatic SNP
score EMM estimations. Strong volatility clustering is another challenge in fitting the short
interest rate, although the high persistence in mean is sacrificed somewhat. Scenario 8 in
Table 1, $dr_t = (2.491 - 0.285 r_t)\,dt + 1.1\, r_t^{1/2}\,dW_t$, is termed an HMR-HCV
specification (high-mean-reversion and high-conditional-volatility).
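The quantities that distinguish the two benchmark scenarios follow mechanically from equations 3, 4, 6, and 8. The short Python sketch below computes them for the two parameter settings quoted in the text; the function name and dictionary keys are hypothetical, and the results are computed directly from the formulas, not copied from Table 1.

```python
import math

def cir_summary(a0, a1, b0):
    """Derived quantities for a parameter setting of the square-root model."""
    uncond_var = b0 ** 2 * a0 / (2.0 * a1 ** 2)
    return {
        "condition_4_holds": b0 ** 2 <= 2.0 * a0,      # b0^2 <= 2 a0
        "uncond_mean": -a0 / a1,                         # equation 3
        "uncond_var": uncond_var,                        # equation 4
        "bessel_order_q": 2.0 * a0 / b0 ** 2 - 1.0,      # equation 6
        "ar1_coefficient": math.exp(a1),                 # slope of E(r1 | r0) in r0
        # Equation 8 evaluated at r0 = unconditional mean simplifies to this form.
        "cond_var_at_mean": uncond_var * (1.0 - math.exp(2.0 * a1)),
    }

scenario_1 = cir_summary(0.02491, -0.00285, 0.0275)   # LMR-LCV setting
scenario_8 = cir_summary(2.491, -0.285, 1.1)          # HMR-HCV setting
```

Both settings share the same unconditional mean, while Scenario 1 has an autoregressive coefficient very close to one and a much larger Bessel function order than Scenario 8.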
One should not arbitrarily relate either Scenario 1 or Scenario 8 alone to the real interest
rate, but put together they represent the most important features of the short rate process.
There is a fundamental concern with how flexible the square-root model can be. In order
to fit the real interest rate data, one would like to hold the unconditional mean
($-a_0/a_1$) constant without explosion ($b_0^2 \le 2a_0$), resembling a stationary interest
rate process. In addition one would like to achieve high persistence in both conditional mean
and variance. With only three parameters to manipulate, it seems impossible to satisfy all four
constraints simultaneously. This indicates that the square-root model may not be flexible enough
to model the interest rate dynamics. To find a more suitable model requires a fourth degree of
freedom, for example, a stochastic volatility component (Gallant and Tauchen 1998b).

⁷ s14140 means 1 lag in mean, 4 lags in variance, 1 lag in coefficient polynomial, a degree-4
Hermite polynomial, and a degree-0 coefficient polynomial. s14141 differs only in having a
degree-1 coefficient polynomial. There is more detailed discussion of the SNP structure in
Section 3.1.

4.3 Testing Misspecification
The goal is to find a true data generating process, of which an adequate auxiliary score will
not accommodate a misspecified model. As discussed in Section 1, the square-root model
is widely used in fitting the short rate process, but most serious studies have rejected this
specification. So it is natural to adopt the square-root process as a misspecified model and
use a non-rejected model as the true data generating process (for the short interest rate).
A recent study by Gallant and Tauchen (1998b) gave the most favorable evidence for the
following specification,

$$dr_t = (0.014 - 0.002\, r_t)\,dt + (0.043 - 0.018\, r_t)\, e^{u_t}\, dW_{1t}, \tag{20}$$

$$du_t = (-0.006\, r_t - 0.157\, u_t)\,dt + (0.593 - 0.052\, u_t)\, dW_{2t}. \tag{21}$$

The short rate process $r_t$ has a linear drift and a linear diffusion, with the diffusion multiplied
by an unobserved (exponential) stochastic volatility term $e^{u_t}$. The short rate $r_t$ is only
partially observable in discrete time. The latent stochastic volatility process $u_t$ also has
a linear drift and a linear diffusion, with the drift including a short rate level feedback.
This model adequately passed the specification test for various simulation sizes and greatly
outperformed the competing models in terms of reprojecting the conditional density and the
conditional volatility. It would be a very suitable choice for the true data generating process.
When we fit the misspecified square-root model $dr_t = (a_0 + a_1 r_t)\,dt + b_0 r_t^{1/2}\, dW_t$ to the
interest rate data simulated from the maintained true specification (equations 20 and 21),
the drift is correctly specified as a linear function, and the misspecification enters only through
the diffusion. From Itô's formula we know that the conditional mean is also correctly specified
as linear; therefore the drift parameters are consistent estimates (Aït-Sahalia 1996a). It
is equivalent to the case where Ordinary Least Squares is consistent, but an unaccounted
heteroskedastic and/or correlated error structure may cause very noisy and inefficient estimates.
The misspecified diffusion generates inconsistent estimates of the conditional variance, which
may cause serious distortions in pricing discount bond yields or other interest rate sensitive
derivatives. Therefore detecting the misspecification in diffusion or volatility is a critical
challenge to researchers and practitioners.

5 Monte Carlo Results
Tables 2 through 6 and Figures 1 through 9 summarize the major findings of this paper. The
discussions are organized along topics, and extensive comparisons are made across QMLE,
EMM, GMM, and MLE. The main focuses are finite sample efficiency, overrejection bias of
EMM under the null, and the detection of maintained misspecification in EMM and GMM.
Both the automatic score generator and the fixed score generator are used in EMM. The
likelihood ratio tests in MLE and QMLE provide joint inferences under the null.

5.1 Simulation Schemes
The Poisson-mixing-Gamma formula is a useful characterization of the transitional density
of the square-root model. The simulation accuracy based on the distribution function can
provide an independent check for the derivations in Section 2. The Monte Carlo study is
not alone in needing a reliable simulator; further applications of MLE with this formula
also require some justification. A cornerstone of the EMM estimator (a simulation-based
estimator) is the discretized approximation to stochastic differential equations. EMM uses
a weak-order 2 scheme (Kloeden and Platen 1992). To assess these simulation approaches,
some empirical statistics from long realizations (100,000) are compared to their theoretical
counterparts.
Table 2 lists the calculation of two moments and three quantiles. Clearly both schemes,
from the exact distribution and from time discretization, work reasonably well. In both the persistent
mean and strong volatility cases, the simulated moments and quantiles are very close to the
model implied ones. It is not surprising that the probabilistic method is slightly better than
the discretized method, although the difference is negligible.
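As a rough illustration of the two simulation approaches compared above, the following Python sketch draws the square-root process at unit sampling intervals in two ways. It is a minimal sketch under stated assumptions, not the code used in the study: the exact draw uses the standard noncentral chi-square transition law written as a Poisson mixture of Gammas (assumed here to correspond to the Poisson-mixing-Gamma formula), and the discretized draw uses a plain Euler scheme with the square root truncated at zero as a simpler stand-in for the weak-order 2 scheme. All function names, the unit sampling interval, and the number of substeps are hypothetical choices.

```python
import math
import random


def exact_step(r, a0, a1, b0, rng, dt=1.0):
    """Draw r(t+dt) given r(t) from a Poisson mixture of Gamma variates."""
    kappa = -a1                        # mean-reversion speed (requires a1 < 0)
    decay = math.exp(-kappa * dt)
    c = 2.0 * kappa / (b0 ** 2 * (1.0 - decay))
    # Poisson mixing variable with mean c * r * decay, drawn by counting
    # unit-rate exponential arrivals (practical for moderate means only).
    mean_n = c * r * decay
    n, clock = 0, rng.expovariate(1.0)
    while clock < mean_n:
        n += 1
        clock += rng.expovariate(1.0)
    shape = 2.0 * a0 / b0 ** 2 + n     # Bessel order q plus 1 plus n
    return rng.gammavariate(shape, 1.0 / c)


def euler_step(r, a0, a1, b0, rng, dt):
    """One Euler step (assumption: a stand-in for the weak-order 2 scheme)."""
    shock = rng.gauss(0.0, math.sqrt(dt))
    return r + (a0 + a1 * r) * dt + b0 * math.sqrt(max(r, 0.0)) * shock


def simulate(n_obs, a0, a1, b0, seed=0, scheme="exact", substeps=10):
    """Simulate n_obs observations sampled at unit time intervals."""
    rng = random.Random(seed)
    r = -a0 / a1                       # start at the unconditional mean
    path = []
    for _ in range(n_obs):
        if scheme == "exact":
            r = exact_step(r, a0, a1, b0, rng)
        else:
            for _sub in range(substeps):
                r = euler_step(r, a0, a1, b0, rng, 1.0 / substeps)
        path.append(r)
    return path


def mean_var(xs):
    """Sample mean and (unbiased) sample variance."""
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
```

Calling `mean_var(simulate(20000, 2.491, -0.285, 1.1))` for the strong volatility parameters gives sample moments that can be set against the theoretical mean and variance, in the spirit of Table 2. The exponential-arrival Poisson draw becomes slow when the mixing mean is very large, as it would be for the persistent mean parameters, so a different Poisson generator would be needed there.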

5.2 Score Generator
An important feature of EMM is the endogenous moment selection by a seminonparametric
score generator (SNP), which contrasts with the ad hoc choice of moment conditions in some
less sophisticated GMM or SMM estimators. The optimal SNP search and the inexpensive
weighting matrix estimate are key to the efficiency argument, and hopefully they also reduce
the overrejection of the specification test. It is worthwhile to check whether the SNP score captures the
distribution features of different dependent structures before launching the full-scale Monte
Carlo experiment. Table 3 (500 sample size) and Table 4 (1500 sample size) report the SNP
searches for the 8 scenarios in Table 1. For each setting, the frequencies of all kinds of model
choices among 100 replications are listed. Model dimension is represented by a five-digit
number, which stands for, consecutively, lag in mean, lag in variance, lag in polynomial,
degree of Hermite polynomial, and degree of Hermite coefficient polynomial.
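The five-digit convention can be made concrete with a small decoder. The sketch below is only an illustration of the labeling rule described above; the class name, field names, and the helper `is_gaussian_ar` are hypothetical and are not part of the SNP program.

```python
from typing import NamedTuple


class SnpSpec(NamedTuple):
    """Fields follow the digit order described in the text."""
    lag_mean: int
    lag_variance: int
    lag_polynomial: int
    hermite_degree: int
    coefficient_degree: int


def parse_snp_code(code: str) -> SnpSpec:
    """Decode a label such as '10100' or 's14140' into its five dimensions."""
    digits = code[1:] if code.startswith("s") else code
    if len(digits) != 5 or not digits.isdigit():
        raise ValueError(f"expected five digits, got {code!r}")
    return SnpSpec(*(int(d) for d in digits))


def is_gaussian_ar(spec: SnpSpec) -> bool:
    """True when there are no variance lags and no Hermite terms."""
    return spec.lag_variance == 0 and spec.hermite_degree == 0
```

For example, `parse_snp_code("s14140")` yields one lag in mean, four lags in variance, one lag in polynomial, a degree-four Hermite polynomial, and a degree-zero coefficient polynomial, while `is_gaussian_ar(parse_snp_code("10100"))` flags the Gaussian autoregression.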
Scenario 1 in Tables 3 and 4 is the LMR-LCV case (low-mean-reversion, low-conditional-volatility).
Not surprisingly the Gaussian auto-regression specification of 10100 dominates
other choices. This finding is consistent with the fact that the true density is nearly Gaussian
under this parameter setting (see Table 1). Moving from Scenario 1 to 2, 3, and 4, the
conditional volatility increases gradually, since the variance parameter $b_0$ is altered (see Table
1). The dominating choice is still Gaussian, and the chances of ARCH and/or non-Gaussian
specifications increase slightly. Moving toward Scenarios 5-8, both mean parameters $a_0$ and
$a_1$ as well as variance parameter $b_0$ are altered (see Table 1), and ultimately one reaches
the HMR-HCV case (high-mean-reversion, high-conditional-volatility). It is clear that the
SNP search favors the nonlinear, nonparametric AR-ARCH specification. This is consistent
with the low Bessel function order and high conditional variance (Table 1). Largely due to
this "distribution-dependent" or "data-dependent" score generator, the EMM estimator is
claimed to be asymptotically efficient and hopefully more reliable in the specification test.
Also evident from Tables 3 and 4 is that larger sample sizes enable the SNP to pick
up higher model dimensions. In fact, the asymptotic efficiency argument requires that the
number of moment conditions and the lags entering each moment increase with the sample
size (Gallant and Long 1997).
A salient question is whether the structural model can be identified when the SNP search
does pick the Gaussian-AR(1) score. This corresponds to a quasi-maximum likelihood estimator
based on the innovation assumption

$$z_t = (r_t - \beta_0 - \beta_1 r_{t-1})/\sqrt{\sigma^2} \sim N(0, 1). \tag{22}$$

Since the conditional mean is correctly specified, the QMLE of the intercept $\beta_0$ and the slope $\beta_1$ is a consistent
estimator of $-(a_0/a_1)(1 - e^{a_1})$ and $e^{a_1}$, respectively. It is just an Ordinary Least Squares regression with a heteroskedastic
and serially correlated error term (Aït-Sahalia 1996a). The conditional variance is misspecified
as the constant $\sigma^2$. However, according to the theory of misspecified maximum
likelihood estimation (White 1994), the estimator $\hat\sigma^2$ may converge to a pseudo-true value
$\sigma_*^2$. The key argument is that the misspecified asymptotic variance $\sigma_*^2$ must be a function
of the true variance parameter $b_0$, since both conditional variance and unconditional variance
are determined by $b_0$. These asymptotic relations, two explicit and one implicit, are
indeed the binding functions in the language of Indirect Inference (Gourieroux, Monfort, and
Renault 1993). Obviously the structural parameters $a_0$, $a_1$, and $b_0$ are exactly identified by
the auxiliary parameters $\beta_0$, $\beta_1$, and $\sigma^2$. EMM is thus a feasible first-order approximation
toward the Indirect Inference (Gallant and Long 1997).
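The two explicit relations in this argument can be written down and inverted directly. The Python sketch below does so under the assumption of a unit sampling interval, with the intercept and slope of the Gaussian AR(1) score named `beta0` and `beta1` for illustration only; the third, implicit relation between the auxiliary variance and $b_0$ is deliberately left out.

```python
import math


def auxiliary_mean_params(a0, a1):
    """Map the drift parameters to the AR(1) intercept and slope."""
    beta1 = math.exp(a1)                    # slope of the conditional mean
    beta0 = -(a0 / a1) * (1.0 - beta1)      # intercept of the conditional mean
    return beta0, beta1


def structural_drift_params(beta0, beta1):
    """Invert the mapping: recover (a0, a1) from the AR(1) coefficients."""
    a1 = math.log(beta1)
    a0 = -a1 * beta0 / (1.0 - beta1)
    return a0, a1
```

With the strong volatility drift, `auxiliary_mean_params(2.491, -0.285)` reproduces the conditional mean coefficients reported for Scenario 8 in Table 1 up to rounding, and feeding the result to `structural_drift_params` returns the original drift parameters. The inversion requires a slope strictly between zero and one.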

5.3 Estimation Bias and Finite Sample Efficiency
Table 5 (persistent mean case) and Table 6 (strong volatility case) report the mean bias,
median bias, and root-mean-squared error (RMSE) across MLE, QMLE, GMM, and EMM
for the 500 and 1500 sample sizes.
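The three summary statistics, and the root-n benchmark used when comparing the two sample sizes, can be computed as in the following sketch. The function names are hypothetical. The benchmark rests on the observation that, under root-n convergence, tripling the sample size from 500 to 1500 should divide the RMSE by about $\sqrt{3} \approx 1.73$.

```python
import math
import statistics


def bias_summary(estimates, true_value):
    """Mean bias, median bias, and RMSE of Monte Carlo estimates."""
    errors = [e - true_value for e in estimates]
    return {
        "mean_bias": statistics.fmean(errors),
        "median_bias": statistics.median(errors),
        "rmse": math.sqrt(statistics.fmean([err * err for err in errors])),
    }


def rmse_shrinkage(rmse_small, rmse_large, n_small=500, n_large=1500):
    """Observed RMSE ratio and the ratio implied by root-n convergence."""
    return rmse_small / rmse_large, math.sqrt(n_large / n_small)
```

An observed ratio above the benchmark corresponds to what the text calls converging faster than root-n, and a ratio near one indicates an RMSE that does not shrink with the sample size.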
First look at the case of persistent mean (Table 5). The biases are very large for the drift
estimates ($a_0$ and $a_1$) but quite small for the diffusion parameter ($b_0$). The biases shrink
with the sample size, except for MLE and GMM. MLE is the most efficient in achieving the
smallest RMSE; however, the drift parameter estimates diverge, in that the RMSE does not shrink with
the sample size.[8] EMM seems to be more efficient than GMM but less efficient than QMLE.
The diffusion parameter estimation in GMM does not converge, similar to the drift parameter
in MLE. The drift parameter estimates in QMLE and the intercept of drift in GMM seem
to converge faster than root-n. The convergence rate of EMM in these finite samples is very
close to $\sqrt{3}$, the root-n rate when the sample size triples from 500 to 1500. Overall the drift
estimates are more biased and noisy than the diffusion estimate,
and the close-to-unit-root mean persistence causes some unusual convergence problems in
finite samples. The rank in order of efficiency from highest to lowest would be MLE, QMLE,
EMM, GMM, as expected.
In the case of strong volatility (Table 6), all parameter estimates have very small biases
(relative to parameter value), and all biases shrink appropriately as the sample size increases.
In terms of efficiency, the RMSEs of MLE, QMLE, and GMM are shrinking approximately
at the rate $\sqrt{3}$, but the RMSE of EMM decreases faster than root-n. Recall that
EMM should be asymptotically as efficient as MLE, as the auxiliary SNP score generator
adopts an increasing dimension with the sample size. Overall MLE achieves the highest
efficiency, QMLE comes second at T = 500 and third at T = 1500, EMM is third at T = 500
and second at T = 1500, and GMM is best for the intercept parameter of drift but worst for
the slope parameter of drift and the diffusion parameter. The finding that some parameter
estimates in GMM and QMLE are slightly more efficient than MLE can be attributed to the
fact that the "exact" likelihood function in MLE needs to be numerically approximated while
the moment conditions in GMM or QMLE are in closed forms.
[8] When the square-root model is close to the non-stationary region, the mixing formula for approximating the
likelihood easily diverges in MLE estimations. One has to rely on the asymptotic expansion, which introduces
the asymptotic biases. It should be pointed out that the simulation scheme based on the mixing formula is
still sound, since the parameter is fixed in simulations.

5.4 Parameter Inference
The finite sample distributions of the standardized t-test statistics for individual parameters
are summarized in Figure 1 (persistent mean) and Figure 2 (strong volatility). Standard
Gaussian kernel smoothing is adopted here. QMLE works well in both cases; EMM is
reliable in the latter case, while MLE and GMM seem not to perform well in either case.
In the case of mean persistence (Figure 1), MLE seems to have asymptotic biases for the
drift parameters (underestimating $a_0$ and overestimating $a_1$), but the diffusion parameter is
perfectly approximated by its asymptotic distribution. QMLE works equally well for the
diffusion parameter, while its estimates for the drift parameters show some finite
sample bias ($a_0$ has upward bias and $a_1$ has downward bias) that dissipates asymptotically.
GMM is not biased in estimating the drift, but the standard error is too small (high peak in
the middle) and the estimation variation is too large (fat tails on both sides). Its inference
for the diffusion is scattered everywhere. EMM suffers from high variations for all the
parameters.
The case of strong volatility (Figure 2) looks much different. MLE has almost negligible
biases, but it still understates the standard error and produces fat tails. QMLE works
perfectly well, especially in the tails where the 1%, 5%, and 10% t-tests are usually conducted.
GMM suffers heavily from both understating and overstating the standard errors. EMM
comes close to QMLE, and its bias and variation shrink as the sample size
increases.
Several factors may contribute to the unusual t-test distributions: (1) high persistence
in the mean makes the simulated data look like a unit-root process, and the MLE, GMM, and EMM
estimators cannot distinguish it from a persistent yet stationary situation in finite samples;
(2) the so-called "exact" likelihood in MLE is approximated by its series expansion (in the strong
volatility case) or its asymptotic expansion (in the persistent mean case); (3) the t-test is a Wald
test and lacks invariance to nonlinear transformations.[9]

5.5 Overrejection Bias and Number of Moments
The Monte Carlo results on EMM with the automatic SNP score generator can reveal some
connections between the number of overidentifying moments and the overrejection rate.
Figures 3 and 4 report this experiment for Scenario 8 (the case of strong volatility), which
can generate richer SNP scores. "Number of Overidentified Moments" refers to the difference
between the number of moments chosen automatically by BIC (which varies each trial) and
the number of structural parameters in the square-root model (which is fixed at 3). The
occurrence curve is the percentage of the 1000 Monte Carlo replications in which BIC chooses
a particular number of moments. Further, the rejection curves (fixed at
the 5% level) give the percentage of times that EMM rejects the null square-root
specification at this particular "Number of Overidentified Moments," out of the number of
times that BIC picks this particular SNP score.[10]
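The occurrence and rejection curves defined above amount to a simple tabulation over the Monte Carlo trials. The following Python sketch shows one way to compute them from a list of (number of moments, J statistic) pairs. The input format and function name are assumptions, the 5% critical values are the usual chi-square quantiles for one to six degrees of freedom (larger counts would need an extended table), and the data in any usage would come from the EMM runs rather than from this snippet.

```python
from collections import defaultdict

# Upper 5% critical values of the chi-square distribution, by degrees of freedom.
CHI2_CRIT_5PCT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070, 6: 12.592}


def occurrence_and_rejection(trials, n_structural=3):
    """Return {overidentified moments: (occurrence %, rejection %)}.

    trials is a list of (n_moments, j_stat) pairs, one per replication.
    """
    counts = defaultdict(int)
    rejects = defaultdict(int)
    for n_moments, j_stat in trials:
        df = n_moments - n_structural        # number of overidentified moments
        counts[df] += 1
        # An exactly identified score (df == 0) offers no overidentifying test.
        if df > 0 and j_stat > CHI2_CRIT_5PCT[df]:
            rejects[df] += 1
    total = len(trials)
    return {
        df: (100.0 * counts[df] / total, 100.0 * rejects[df] / counts[df])
        for df in sorted(counts)
    }
```

The first element of each pair is the occurrence rate over all replications, and the second is the rejection rate conditional on BIC having picked a score with that many overidentified moments, matching the two curves described in the text.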
The asymptotic size of the specification test is fixed at 5%. The occurrence rates show the
frequencies of different numbers of moment conditions in 1000 replications. On average, the
gross overrejection rate at the 5% level in automatic score EMM is about 20% for T = 500 and about
25% for T = 1500. Some important features need to be mentioned. First, the rejection
curve does not uniformly shoot up when more moment conditions are included, since these
moments are optimally selected by the SNP score generator. Second, the rejection rates are
more stable at T = 1500 than at T = 500, as more moments and lags are included. Third,
the rejection rate could be remarkably small for certain low dimensions as well as for some
high dimensions. Since BIC tends to underfit the auxiliary model in small samples, the
higher level unrejected score is more likely capturing the true distribution. If the lower
level unrejected score did select the true specification, the rejection rate is likely to shoot
up beyond that level. The implication for empirical work is that an SNP search should go
beyond the first optimal choice by BIC.
[9] For example, the drift function in the square-root model $a_0 + a_1 r_t$ can be reparameterized as $\kappa(\theta - r_t)$;
then $\theta = -a_0/a_1$ becomes a nonlinear transformation of $a_0$ and $a_1$.
[10] Figures 7 and 8 are the same, except that the data is simulated from the alternative stochastic volatility
process while the EMM is carried out for a misspecified square-root process.

5.6 Specification Test
Based on the insights gathered from the EMM experiments with the automatic SNP score generator,
I choose some "educated" fixed SNP scores for the strong volatility case. For the
500 sample size I use s10111 (1 lag in mean, 0 lag in variance, 1 lag in polynomial coefficient,
1 degree in Hermite polynomial, and 1 degree in coefficient polynomial), and for the
1500 sample I use s10121 (1 lag in mean, 0 lag in variance, 1 lag in polynomial coefficient,
2 degrees in Hermite polynomial, and 1 degree in coefficient polynomial). In the case of
mean persistence, which is widely estimated in empirical studies, I use the established SNP
scores: s14140 for T = 500 and s14141 for T = 1500. In Figures 5 and 6, these EMM
J-test results are contrasted with GMM. With a knowledge of the true parameters, one
can also perform a likelihood ratio test in MLE or a quasi-likelihood ratio test in QMLE to
see whether the confidence ball centered at the estimated parameter contains the true
parameter as often as suggested by the chi-square (3) random variable. This approach is
not available to the empirical researcher, since no true parameter is known. However, in the
Monte Carlo setting, one can use these "infeasible" tests to judge the reliability of MLE or
QMLE.
When testing the mean persistence scenario (Figure 5), QMLE gives the best inference,
as the overrejection bias is small and shrinks as the sample size increases. GMM
and EMM have severe overrejection biases, although the biases shrink rapidly in the larger
sample. EMM seems to be worse than GMM, even with improved estimates of the
weighting matrix. The MLE LR-test diverges as the sample size increases, which can be
attributed to the asymptotic bias of parameter estimates introduced by approximating the
likelihood function. Turning to the case of strong volatility (Figure 6), one can see that the
QMLE LR-test and the EMM J-test have almost perfect size, and the overrejection bias is
negligibly small. On the contrary, the MLE LR-test has severe underrejection, which does
not shrink asymptotically. Even worse, the GMM J-test has a large underrejection bias,
which grows as the sample size increases.
There are at least four sources of overrejection or underrejection bias in GMM-type
estimators: inaccurate and costly estimates of the weighting matrix; unseasoned selection
of the moment conditions; an inadequate number of moments to capture the distribution
feature; and simply a small sample bias. A generic EMM approach overcomes the first two
problems by adopting a serially uncorrelated information matrix and an optimal SNP score
generator. The conservative BIC procedure in EMM may choose too few moments, but
one can rectify this problem by using additional information and extending the SNP search
beyond the BIC choice. The remaining small sample bias can be remedied by enlarging the
sample size. The above arguments carry through here, except in the case of persistent mean,
where the EMM estimator may mistake the data as coming from a unit-root process.

5.7 Detecting Misspecification
The power for detecting misspecification is examined in two aspects: (1) first, using the automatic
SNP score EMM procedure to study the power in relation to the number of moments,
which is optimally chosen by BIC; (2) second, using a fixed SNP-score EMM procedure to
compare the power with GMM, where EMM adopts an "educated" choice of the moment
conditions while GMM adopts the simple lag-augmented moment conditions.[11]
In the first stage, the benchmark stochastic volatility model (equations 20 and 21 in
Section 4.3) is used to simulate 1000 replications for the 500 and 1500 sample sizes, and
then a square-root diffusion process is fitted to the data. This time I let BIC automatically
choose the best SNP score generator. Since the drift is linear, the conditional mean with
lag one is correctly specified. For the 500 sample size, 91% of the trials select lag 1; and
for the 1500 sample size, 93% of the trials select lag 1. The choice of conditional standard
deviation is all over the place, due to the nature of nonlinear stochastic volatility. For T =
500, the selection is scattered mainly from lag 1 to lag 4, and for T = 1500, it is scattered
mainly from lag 3 to lag 6. The choices of $K_z$ and $K_x$ are predominantly zero. Figures 7 and
8 plot the 5% rejection rates against the number of overidentified moments along with the
occurrence rates of these moment choices. The highlight is that the probability of rejecting
a misspecified model does converge to one very quickly. At T = 500, the 5% level rejection
rate is around 80-90% for a range of overidentified moments between 1 and 6, and beyond
that the rejection rate is almost 100% (Figure 7). At T = 1500, the rejection rate is always
close to 100%, except in an exactly identified case (Figure 8).
[11] One should be cautioned that the alternative model of the interest rate process adopted here is only one
special case, and a full scale study of the misspecification issue is clearly outside the scope of this paper. Even
in the following limited example, the performance of EMM is not unrelated to the established ARCH
(Engle 1982) and GARCH (Bollerslev 1986) filtering of the stochastic volatility process. In fact, the power
of EMM will be optimal if the first stage SNP auxiliary model, with an ARCH (earlier version) or GARCH
(recent version) leading term, adequately captures the interest rate dynamics.
In the second stage, similar to the study of the overrejection issue, I fix the SNP score
generator and look at the rejection rate uniformly along the 1%-100% test level. The fixed
SNP score generator for the 500 sample size is s13100 (1 lag in mean and 3 lags in variance),
and the chi-square test has 3 degrees of freedom. When T = 1500, the score is
fixed at s15100 (1 lag in mean and 5 lags in variance), with 5 degrees of freedom. Figure 9
gives the rejection plot for the EMM J-test statistics, in comparison with GMM, which has 1
overidentified moment. The upshot is that EMM has the power to detect a misspecified
model, and the power quickly converges to one as the sample size increases from 500 to 1500.
In contrast, GMM has a serious underrejection problem for the maintained misspecification,
and the underrejection bias becomes larger as the sample size increases, ultimately losing
the power to detect misspecification. The explanation is quite simple: EMM carefully chooses
an SNP moment generator by BIC, while standard GMM simply uses the lag-augmented
instruments.

6 Conclusions
This paper performs a Monte Carlo study on Efficient Method of Moments (EMM), Generalized
Method of Moments (GMM), Quasi-Maximum Likelihood Estimation (QMLE), and
Maximum Likelihood Estimation (MLE) for a continuous-time square-root model under two
challenging scenarios, high persistence in mean and strong conditional volatility, that are
commonly encountered when estimating the empirical interest rate process.
MLE achieves the highest efficiency, while its inferences on individual parameters and
overall specification are not very reliable and are even misleading on some occasions. QMLE
is less efficient in comparison to MLE, but QMLE stands out as the best inference tool in
both the individual t-test and overall LR-test. EMM shows a convergence rate faster than
root-n, due to the expanding SNP score choice by BIC as the sample size increases. EMM also
provides better inference than GMM or MLE in a high volatility scenario. In the case of
persistent mean (close to unit root in small samples), some asymptotics of MLE and GMM
break down, as parameter estimates and test statistics diverge.
A number of lessons can be learned from this study: (1) MLE is not necessarily the best
choice if the numerical approximation to the density is complex and/or the approximation
tends to diverge near the non-stationary region; (2) QMLE is simple to implement and can
be very reliable when the specification information is easily incorporated in the closed-form
conditional mean and variance; (3) if there is no new information to be incorporated into
the moment conditions, GMM cannot be superior to QMLE; (4) when the true density or
moment functions are not known, EMM is the only choice; its small sample performance is
not necessarily inferior to the infeasible MLE or QMLE and is most likely superior to the
infeasible GMM.

References
Aït-Sahalia, Yacine (1996a), "Nonparametric Pricing of Interest Rate Derivatives," Econometrica, vol. 64, 527-560.
Aït-Sahalia, Yacine (1996b), "Testing Continuous-Time Models of the Spot Interest Rate," The Review of Financial Studies, vol. 9, 385-426.
Andersen, Torben G., Hyung-Jin Chung, and Bent E. Sørensen (1999), "Efficient Method of Moments Estimation of a Stochastic Volatility Model: A Monte Carlo Study," Journal of Econometrics, vol. 91, 61-87.
Andersen, Torben G. and Bent E. Sørensen (1996), "GMM Estimation of a Stochastic Volatility Model: A Monte Carlo Study," Journal of Business and Economic Statistics, vol. 14, 328-352.
Bansal, Ravi, A. Ronald Gallant, Robert Hussey, and George Tauchen (1995), "Nonparametric Estimation of Structural Models for High-Frequency Currency Market Data," Journal of Econometrics, vol. 66, 251-287.
Bollerslev, Tim (1986), "Generalized Autoregressive Conditional Heteroscedasticity," Journal of Econometrics, vol. 31, 307-327.
Bollerslev, Tim and Jeffrey Wooldridge (1992), "Quasi-Maximum Likelihood Estimators and Inference in Dynamic Models with Time-Varying Covariances," Econometric Reviews, vol. 11, 143-172.
Burnside, Craig and Martin Eichenbaum (1996), "Small-Sample Properties of GMM-Based Wald Tests," Journal of Business and Economic Statistics, vol. 14, 294-308.
Chumacero, Romulo A. (1997), "Finite Sample Properties of the Efficient Method of Moments," Studies in Nonlinear Dynamics and Econometrics, vol. 2, 35-51.
Conley, Tim, Lars Peter Hansen, Erzo Luttmer, and Jose Scheinkman (1997), "Short Term Interest Rates as Subordinated Diffusions," Review of Financial Studies, vol. 10, 525-578.
Cox, John C., Jonathan E. Ingersoll, and Stephen A. Ross (1985), "A Theory of the Term Structure of Interest Rates," Econometrica, vol. 53, 385-407.
Dai, Qiang and Kenneth J. Singleton (2000), "Specification Analysis of Affine Term Structure Models," Journal of Finance, forthcoming.
Devroye, Luc (1986), Non-Uniform Random Variate Generation, Springer-Verlag.
Duffie, Darrell and Kenneth Singleton (1993), "Simulated Moments Estimation of Markov Models of Asset Prices," Econometrica, vol. 61, 929-952.
Duffie, Darrell and Kenneth Singleton (1997), "An Econometric Model of the Term Structure of Interest-Rate Swap Yields," Journal of Finance, vol. 52, 1287-1321.
Engle, Robert F. (1982), "Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of U.K. Inflation," Econometrica, vol. 50, 987-1008.
Feller, William (1951), "Two Singular Diffusion Problems," Annals of Mathematics, vol. 54, 173-182.
Feller, William (1971), An Introduction to Probability Theory and Its Applications, vol. 2, John Wiley & Sons, Inc., Princeton University, 2nd ed.
Fisher, Mark and Christian Gilles (1996), "Estimating Exponential Affine Models of the Term Structure," Working Paper.
Gallant, A. Ronald and Jonathan R. Long (1997), "Estimating Stochastic Differential Equations Efficiently by Minimum Chi-Square," Biometrika, vol. 84.
Gallant, A. Ronald and George Tauchen (1996a), User's Guide for EMM: A Program for Efficient Method of Moments Estimation, 1st ed.
Gallant, A. Ronald and George Tauchen (1996b), User's Guide for SNP: A Program for Nonparametric Time Series Analysis, 8th ed.
Gallant, A. Ronald and George Tauchen (1996c), "Which Moments to Match?" Econometric Theory, vol. 12, 657-681.
Gallant, A. Ronald and George Tauchen (1998a), "The Relative Efficiency of Method of Moments Estimators," Working Paper.
Gallant, A. Ronald and George Tauchen (1998b), "Reprojecting Partially Observed Systems with Application to Interest Rate Diffusions," Journal of the American Statistical Association, vol. 93, 10-24.
Gibbons, Michael R. and Krishna Ramaswamy (1993), "A Test of the Cox, Ingersoll, and Ross Model of the Term Structure," Review of Financial Studies, vol. 6, 619-658.
Gill, Philip E., Walter Murray, Michael A. Saunders, and Margaret H. Wright (1991), "User's Guide for NPSOL (Version 4.06): A Fortran Package for Nonlinear Programming," Tech. rep., Stanford University.
Gourieroux, C., A. Monfort, and E. Renault (1993), "Indirect Inference," Journal of Applied Econometrics, vol. 8, s85-s118.
Hansen, Lars Peter (1982), "Large Sample Properties of Generalized Method of Moments Estimators," Econometrica, vol. 50, 1029-1054.
Hansen, Lars Peter, John Heaton, and Amir Yaron (1996), "Finite-Sample Properties of Some Alternative GMM Estimators," Journal of Business and Economic Statistics, vol. 14, 262-280.
Hansen, Lars Peter and Jose Alexandre Scheinkman (1995), "Back to the Future: Generalized Moment Implications for Continuous Time Markov Process," Econometrica, vol. 63, 767-804.
Ingram, Beth F. and B. S. Lee (1991), "Simulation Estimation of Time Series Models," Journal of Econometrics, vol. 47, 197-205.
Johnson, Norman L. and Samuel Kotz (1970), Distributions in Statistics: Continuous Univariate Distributions, vol. 2, John Wiley & Sons.
Karatzas, Ioannis and Steven E. Shreve (1997), Brownian Motion and Stochastic Calculus, Springer.
Kloeden, Peter E. and Eckhard Platen (1992), Numerical Solution of Stochastic Differential Equations, Applications of Mathematics, Springer-Verlag.
Lo, Andrew W. (1988), "Maximum Likelihood Estimation of Generalized Itô Process with Discretely Sampled Data," Econometric Theory, vol. 4, 231-247.
Newey, Whitney K. and Douglas G. Steigerwald (1997), "Asymptotic Bias for Quasi-Maximum-Likelihood Estimators in Conditional Heteroscedasticity Models," Econometrica, vol. 65, 587-599.
Oliver, F. W. J. (1972), Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, John Wiley & Sons.
Pearson, Neil D. and Tong-Sheng Sun (1994), "Exploiting the Conditional Density in Estimating the Term Structure: An Application to the Cox, Ingersoll, and Ross Model," Journal of Finance, vol. 49, 1279-1304.
Tauchen, George (1997), "New Minimum Chi-Square Methods in Empirical Finance," in "Advances in Econometrics, Seventh World Congress," (edited by Kreps, D. and K. Wallis), Cambridge University Press, Cambridge UK.
White, Halbert (1994), Estimation, Inference, and Specification Analysis, Cambridge University Press, University of California, San Diego.
Zheng, Xiaodong and Wei-Yin Loh (1995), "Consistent Variable Selection in Linear Models," Journal of the American Statistical Association, vol. 90, 1029-1054.

Table 1: Benchmark Model Choice
The square-root model is $dr_t = (a_0 + a_1 r_t)\,dt + b_0 r_t^{1/2}\, dW_t$. Scenario 1 is taken from Gallant
and Tauchen (1998b). In Scenarios 2-4, the variance parameter $b_0$ is increased by a factor
of 2, 3, and 4 respectively. From Scenario 1 to Scenario 5, the mean parameters $a_0$ and $a_1$
are multiplied by 100 and the variance parameter $b_0$ is multiplied by 10. From Scenario 5 to
Scenarios 6-8, the variance parameter $b_0$ is increased by a factor of 2, 3, and 4 respectively.
$E(r_t)$, $V(r_t)$, $q$, $E(r_{t+1}|r_t)$, and $V(r_{t+1}|r_t)$ are calculated using equations 3, 4, and 6-8.

                   Scenario 1          Scenario 2          Scenario 3          Scenario 4
a0                 0.02491             0.02491             0.02491             0.02491
a1                 -0.00285            -0.00285            -0.00285            -0.00285
b0                 0.0275              0.055               0.0825              0.11
E(r_t)             8.74                8.74                8.74                8.74
V(r_t)             1.16                4.64                10.44               18.55
Bessel q           64.88               15.47               6.32                3.12
E(r_{t+1}|r_t)     0.997 r_t + 0.025   0.997 r_t + 0.025   0.997 r_t + 0.025   0.997 r_t + 0.025
V(r_{t+1}|r_t)     0.001 r_t + 0.000   0.003 r_t + 0.000   0.007 r_t + 0.000   0.012 r_t + 0.000

                   Scenario 5          Scenario 6          Scenario 7          Scenario 8
a0                 2.491               2.491               2.491               2.491
a1                 -0.285              -0.285              -0.285              -0.285
b0                 0.275               0.55                0.825               1.1
E(r_t)             8.74                8.74                8.74                8.74
V(r_t)             1.16                4.64                10.44               18.55
Bessel q           64.88               15.47               6.32                3.12
E(r_{t+1}|r_t)     0.75 r_t + 2.17     0.75 r_t + 2.17     0.75 r_t + 2.17     0.75 r_t + 2.17
V(r_{t+1}|r_t)     0.05 r_t + 0.07     0.20 r_t + 0.29     0.45 r_t + 0.64     0.79 r_t + 1.14
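The entries of Table 1 can be cross-checked against the standard moment formulas of the square-root diffusion. The Python sketch below assumes a unit sampling interval and uses the textbook closed forms, which are presumed (not verified here) to coincide with the paper's equations 3, 4, and 6-8; the function name and the dictionary keys are hypothetical.

```python
import math


def square_root_moments(a0, a1, b0):
    """Moments of dr = (a0 + a1 r) dt + b0 sqrt(r) dW at unit sampling interval."""
    kappa = -a1                    # mean-reversion speed (a1 < 0)
    theta = -a0 / a1               # unconditional mean
    decay = math.exp(a1)           # slope of the one-step conditional mean
    return {
        "mean": theta,
        "variance": b0 ** 2 * theta / (2.0 * kappa),
        "bessel_q": 2.0 * a0 / b0 ** 2 - 1.0,
        # (slope, intercept) of E(r_{t+1} | r_t) as a linear function of r_t
        "cond_mean": (decay, theta * (1.0 - decay)),
        # (slope, intercept) of V(r_{t+1} | r_t) as a linear function of r_t
        "cond_var": (
            b0 ** 2 / kappa * (decay - decay ** 2),
            theta * b0 ** 2 / (2.0 * kappa) * (1.0 - decay) ** 2,
        ),
    }
```

Evaluating `square_root_moments(2.491, -0.285, 1.1)` and `square_root_moments(0.02491, -0.00285, 0.0275)` gives values that agree with the Scenario 8 and Scenario 1 columns to the reported rounding, which also supports reading $a_1$ as negative throughout the table.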


Table 2: Comparing Simulation Schemes (100,000 Length)
For the square-root model $dr_t = (a_0 + a_1 r_t)\,dt + b_0 r_t^{1/2}\, dW_t$, the marginal distribution is
a Gamma (equation 2), and the theoretical values are calculated accordingly. Simulation
by distribution is based on the Poisson-mixing-Gamma formula (equation 11) and is subsequently
implemented in both the Monte Carlo data generation and the Maximum Likelihood
Estimation. Simulation by discretization is based on the weak-order 2 scheme (Gallant and
Long 1997) and underlies the Efficient Method of Moments. Scenario 1 (LMR-LCV)
is the Low-Mean-Reversion Low-Conditional-Variance benchmark, and Scenario 8 (HMR-HCV)
is the High-Mean-Reversion High-Conditional-Variance alternative. Both scenarios
are defined in Table 1.

Scenario 1 (LMR-LCV)   Simulated by Distribution   Simulated by Discretization   Theoretical Value
Mean                   8.72                        8.79                          8.74
Variance               1.18                        1.09                          1.16
5% Quantile            6.91                        7.19                          7.05
Median                 8.72                        8.72                          8.70
95% Quantile           10.50                       10.57                         10.58

Scenario 8 (HMR-HCV)   Simulated by Distribution   Simulated by Discretization   Theoretical Value
Mean                   8.75                        8.70                          8.74
Variance               18.34                       18.22                         18.55
5% Quantile            3.04                        3.03                          3.05
Median                 8.08                        8.02                          8.05
95% Quantile           16.70                       16.67                         16.81

Table 3: SNP Search for the Sample Size of 500
(This note applies to both Table 3 and Table 4.) These results are from 100 replications
of each scenario with the sample sizes of 500 and 1500. The information criterion used in
moment selection is Schwarz's BIC. Scenarios 1-8 are the same as those in Table 1. Each
model specification is characterized by a 5-digit number. The digits stand, in order, for
lag in mean, lag in variance, lag in polynomial, degree of Hermite polynomial, and degree of
Hermite coefficient polynomial.
Scenario 1      Scenario 2      Scenario 3      Scenario 4
Model   %       Model   %       Model   %       Model   %
10100   95      10100   96      10100   92      10100   88
11100   3       20100   3       10110   4       11100   7
20100   1       10110   1       11100   2       10110   1
21100   1                       20100   2       10120   1
                                                20100   1
                                                21100   1
                                                11111   1

Scenario 5      Scenario 6      Scenario 7      Scenario 8
Model   %       Model   %       Model   %       Model   %
10100   90      10100   77      10100   46      10111   25
10110   5       10110   12      10110   25      10100   18
20100   3       11110   4       10111   8       10110   15
11100   1       11100   3       10121   7       10121   11
10120   1       10111   1       10120   6       11110   10
                10120   1       11110   3       11120   6
                12110   1       10130   1       10120   4
                20100   1       11100   1       12120   3
                                11120   1       10131   2
                                11130   1       11130   2
                                12110   1       10112   1
                                                11111   1
                                                15110   1
                                                21120   1
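The five-digit labels can be unpacked mechanically, which makes it easy to summarize how often the BIC search moves beyond a Gaussian score. The sketch below is illustrative: the class and function names are hypothetical, the field order follows the table note, and the Scenario 8 frequencies are copied from the T = 500 results in Table 3.

```python
from typing import NamedTuple

class SnpSpec(NamedTuple):
    """Tuning parameters in the order given by the table note (names are illustrative)."""
    lag_mean: int
    lag_variance: int
    lag_polynomial: int
    hermite_degree: int
    coefficient_degree: int

def decode_snp(code: str) -> SnpSpec:
    """Split a five-digit SNP label such as '10111' into its five tuning parameters."""
    if len(code) != 5 or not code.isdigit():
        raise ValueError(f"expected a five-digit label, got {code!r}")
    return SnpSpec(*(int(ch) for ch in code))

# Scenario 8, T = 500: model label -> percentage of the 100 replications.
SCENARIO_8_T500 = {
    "10111": 25, "10100": 18, "10110": 15, "10121": 11, "11110": 10,
    "11120": 6, "10120": 4, "12120": 3, "10131": 2, "11130": 2,
    "10112": 1, "11111": 1, "15110": 1, "21120": 1,
}

def share_with_hermite_terms(selection: dict) -> int:
    """Percentage of replications whose selected model has a nonzero Hermite degree."""
    return sum(pct for code, pct in selection.items()
               if decode_snp(code).hermite_degree > 0)

if __name__ == "__main__":
    print(decode_snp("10111"))
    print(share_with_hermite_terms(SCENARIO_8_T500))
```

In Scenario 1 the label 10100 (one lag in the mean, no Hermite terms) dominates, whereas in Scenario 8 most replications select a positive Hermite degree.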


Table 4: SNP Search for the Sample Size of 1500
Scenario 1      Scenario 2      Scenario 3      Scenario 4
Model   %       Model   %       Model   %       Model   %
10100   96      10100   96      10100   85      10100   57
10110   2       11100   4       11100   11      11100   15
20100   1                       20100   2       12100   9
11100   1                       12100   1       13100   5
                                13100   1       11111   2
                                                15100   2
                                                16100   2
                                                11121   1
                                                14100   1
                                                14110   1
                                                15110   1
                                                16110   1
                                                16111   1
                                                18100   1
                                                25100   1

Scenario 5      Scenario 6      Scenario 7      Scenario 8
Model   %       Model   %       Model   %       Model   %
10100   89      10111   43      10111   41      12120   12
10110   9       10100   30      10121   13      10131   11
10120   1       10110   10      11110   11      10121   10
11110   1       10120   6       11111   8       11110   8
                10121   4       11121   8       10121   7
                11110   2       10131   7       11140   7
                12120   2       10111   5       10111   5
                11111   1       12120   3       11130   5
                20111   1       10120   2       13120   5
                20120   1       10141   1       10122   4
                                12110   1       12130   4
                                                11111   1
                                                11141   1
                                                11150   1
                                                11160   1
                                                12110   1
                                                14120   1
                                                21120   1
                                                21140   1
                                                22120   1


Table 5: Finite Sample Bias and Efficiency Comparison for Scenario 1
For each sample size, 1000 Monte Carlo replications are generated from the square-root process,
and the model is estimated by MLE, QMLE, GMM, and EMM respectively. Scenario 1
is the Low-Mean-Reversion Low-Conditional-Variance (LMR-LCV) case. EMM estimation
in this table uses a fixed SNP score generator.

                  Mean Bias               Median Bias             RMSE
True Value        T = 500     T = 1500    T = 500     T = 1500    T = 500     T = 1500
Maximum Likelihood Estimation
a0 =  0.02491     -0.0123     -0.0130     -0.0119     -0.0126     0.0125      0.0131
a1 = -0.00285     0.0014      0.0015      0.0014      0.0014      0.0014      0.0015
b0 =  0.02750     -4.4e-5     2.5e-6      -4.6e-5     2.1e-5      0.0009      0.0005
Quasi-Maximum Likelihood Estimation
a0 =  0.02491     0.0994      0.0285      0.0803      0.0209      0.1343      0.0437
a1 = -0.00285     -0.0113     -0.0033     -0.0091     -0.0025     0.0153      0.0050
b0 =  0.02750     3.0e-5      1.2e-5      4.1e-5      1.9e-5      0.0009      0.0005
Generalized Method of Moments
a0 =  0.02491     0.0019      0.0023      4.7e-5      -5.1e-7     0.2418      0.0960
a1 = -0.00285     -0.0012     0.0022      -7.9e-5     -5.8e-5     0.0539      0.0481
b0 =  0.02750     0.0040      0.0075      -7.3e-6     6.1e-6      0.1256      0.1264
Efficient Method of Moments
a0 =  0.02491     0.0451      0.0407      2.6e-4      0.0085      0.1252      0.0944
a1 = -0.00285     -0.0054     -0.0048     -8.1e-5     -0.0012     0.0149      0.0112
b0 =  0.02750     -0.0015     -0.0003     -4.8e-6     -4.3e-7     0.0076      0.0041
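The three summary columns in this table can be computed from a vector of Monte Carlo estimates as follows. This is a minimal sketch under the usual definitions, which are assumed here rather than quoted from the paper: bias is the estimate minus the true value, and RMSE is the root of the mean squared error around the true value. The function name `bias_summary` and the toy inputs are hypothetical.

```python
import math
import statistics

def bias_summary(estimates, true_value):
    """Mean bias, median bias, and RMSE of estimates around a known true value."""
    errors = [estimate - true_value for estimate in estimates]
    squared = [error * error for error in errors]
    return {
        "mean_bias": statistics.fmean(errors),
        "median_bias": statistics.median(errors),
        "rmse": math.sqrt(statistics.fmean(squared)),
    }

if __name__ == "__main__":
    # Toy numbers only, not results from the paper.
    print(bias_summary([1.0, 2.0, 6.0], true_value=2.0))
```

A large gap between mean bias and median bias, as in the GMM and EMM rows, points to a skewed sampling distribution with a few extreme estimates rather than a uniform shift.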


Table 6: Finite Sample Bias and Efficiency Comparison for Scenario 8
For each sample size, 1000 Monte Carlo replications are generated from the square-root
process, and the model is estimated by MLE, QMLE, GMM, and EMM respectively. Scenario 8
is the High-Mean-Reversion High-Conditional-Variance (HMR-HCV) case. EMM
estimation in this table uses a fixed SNP score generator.

                  Mean Bias               Median Bias             RMSE
True Value        T = 500     T = 1500    T = 500     T = 1500    T = 500     T = 1500
Maximum Likelihood Estimation
a0 =  2.491       -0.0832     -0.0663     -0.0679     -0.0524     0.1337      0.0923
a1 = -0.285       0.0085      0.0058      0.0029      0.0010      0.0251      0.0161
b0 =  1.100       0.0024      -0.0016     0.0060      0.0000      0.0432      0.0263
Quasi-Maximum Likelihood Estimation
a0 =  2.491       0.0742      0.0224      0.0022      0.0006      0.3613      0.2111
a1 = -0.285       -0.0100     -0.0020     -0.0071     -0.0011     0.0448      0.0258
b0 =  1.100       0.0023      0.0003      0.0015      -0.0001     0.0430      0.0246
Generalized Method of Moments
a0 =  2.491       0.0003      0.0039      0.0001      0.0001      0.1023      0.0541
a1 = -0.285       0.0186      0.0110      0.0001      0.0005      0.0784      0.0561
b0 =  1.100       0.0097      0.0041      -0.0023     -0.0018     0.0838      0.0581
Efficient Method of Moments
a0 =  2.491       0.1323      -0.0067     0.0433      -0.0173     0.4891      0.2000
a1 = -0.285       -0.0310     -0.0022     -0.0199     -0.0000     0.0694      0.0257
b0 =  1.100       -0.0218     -0.0137     -0.0091     -0.0122     0.0618      0.0296


[Figure 1 panels, by estimator and parameter: MLE: a0, a1, b0; QMLE: a0, a1, b0; GMM: a0, a1, b0; EMM: a0, a1, b0.]

Figure 1: Sampling Distributions of t-Statistics for Scenario 1.
The notations are respectively: "- - -" t-test statistics for 500 sample size; "---" t-test
statistics for 1500 sample size; "-.-.-" Normal (0,1) density as the reference.

[Figure 2 panels, by estimator and parameter: MLE: a0, a1, b0; QMLE: a0, a1, b0; GMM: a0, a1, b0; EMM: a0, a1, b0.]

Figure 2: Sampling Distributions of t-Statistics for Scenario 8.
The notations are respectively: "- - -" t-test statistics for 500 sample size; "---" t-test
statistics for 1500 sample size; "-.-.-" Normal (0,1) density as the reference.

[Figure 3 plot. Vertical axis: Percentage of Rejection and of Occurrence (0 to 100). Horizontal axis: Number of Overidentified Moments (0 to 12). Curves: Rejection Curve, Occurrence Curve.]

Figure 3: 5% Overrejection Rate of EMM (T = 500) with Automatic Score Generator.
The occurrence rate is the frequency of the same moment choice divided by 1000. The
rejection rate is the frequency of rejections divided by the number of occurrences.
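The two rates defined in the caption can be tabulated from per-replication records. The sketch below assumes each replication is stored as a pair (number of overidentifying moments, whether the 5% J-test rejected). That record layout and the function name are assumptions made for illustration, and the divisor is the number of records supplied, which equals 1000 in the paper's design.

```python
from collections import Counter

def occurrence_and_rejection(results):
    """Map each moment count to (occurrence %, rejection %).

    Occurrence is relative to all replications; rejection is relative to
    the replications that selected that moment count."""
    results = list(results)
    total = len(results)
    occurrences = Counter(n_overid for n_overid, _ in results)
    rejections = Counter(n_overid for n_overid, rejected in results if rejected)
    return {
        n_overid: (100.0 * count / total, 100.0 * rejections[n_overid] / count)
        for n_overid, count in sorted(occurrences.items())
    }

if __name__ == "__main__":
    # Toy records only: 8 replications with 1 extra moment, 2 with 3 extra moments.
    toy = [(1, False)] * 6 + [(1, True)] * 2 + [(3, True), (3, False)]
    print(occurrence_and_rejection(toy))
```

Since the rejection rate is conditional on the moment choice, a point where the occurrence curve is low rests on only a few replications and is correspondingly noisy.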


[Figure 4 plot. Vertical axis: Percentage of Rejection and of Occurrence (0 to 100). Horizontal axis: Number of Overidentified Moments (0 to 12). Curves: Rejection Curve, Occurrence Curve.]

Figure 4: 5% Overrejection Rate of EMM (T = 1500) with Automatic Score Generator.
The occurrence rate is the frequency of the same moment choice divided by 1000. The
rejection rate is the frequency of rejections divided by the number of occurrences.


[Figure 5 panels: MLE LR-Test, QMLE LR-Test, GMM J-Test, and EMM J-Test, each for T = 500 and T = 1500. Both axes run from 0 to 100.]

Figure 5: Specification Test for Scenario 1.
The likelihood ratio tests for MLE and QMLE are against the true parameter values, and
the J-tests for GMM and EMM are against the overidentifying restrictions.

[Figure 6 panels: MLE LR-Test, QMLE LR-Test, GMM J-Test, and EMM J-Test, each for T = 500 and T = 1500. Both axes run from 0 to 100.]

Figure 6: Specification Test for Scenario 8.
The likelihood ratio tests for MLE and QMLE are against the true parameter values, and
the J-tests for GMM and EMM are against the overidentifying restrictions.

[Figure 7 plot. Vertical axis: Percentage of Rejection and of Occurrence (0 to 100). Horizontal axis: Number of Overidentified Moments (0 to 11). Curves: Rejection Curve, Occurrence Curve.]

Figure 7: 5% Rejection of Misspecified Model, T = 500.
The occurrence rate is the frequency of the same moment choice divided by 1000. The
rejection rate is the frequency of rejections divided by the number of occurrences.


[Figure 8 plot. Vertical axis: Percentage of Rejection and of Occurrence (0 to 100). Horizontal axis: Number of Overidentified Moments (0 to 16). Curves: Rejection Curve, Occurrence Curve.]

Figure 8: 5% Rejection of Misspecified Model, T = 1500.
The occurrence rate is the frequency of the same moment choice divided by 1000. The
rejection rate is the frequency of rejections divided by the number of occurrences.


[Figure 9 panels: GMM J-Test T = 500, GMM J-Test T = 1500, EMM J-Test T = 500, EMM J-Test T = 1500. Both axes run from 0 to 100.]

Figure 9: Power to Detect Misspecified Model.
The GMM J-test is a chi-square (1), and the EMM J-test is a chi-square (3) for T = 500
and chi-square (5) for T = 1500.
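To turn a J statistic into a rejection decision under these reference distributions, one needs the upper-tail chi-square probability. The sketch below uses the closed-form tail for integer degrees of freedom, so only the standard library is required. The function names and the toy statistics are hypothetical, and nothing here is taken from the paper's own code.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df = 2m: exp(-x/2) * sum_{j=0}^{m-1} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df = 2m + 1: erfc(sqrt(x/2)) + exp(-x/2) * sum_{j=1}^{m} (x/2)^(j - 1/2) / Gamma(j + 1/2)
    total = sum(half ** (j - 0.5) / math.gamma(j + 0.5)
                for j in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total

def rejection_rate(j_stats, df, level=0.05):
    """Share of J statistics whose chi-square(df) p-value falls below the level."""
    j_stats = list(j_stats)
    return sum(chi2_sf(j, df) < level for j in j_stats) / len(j_stats)

if __name__ == "__main__":
    for df in (1, 3, 5):
        print(df, round(chi2_sf(10.0, df), 4))
    # Toy statistics only, not values from the paper.
    print(rejection_rate([0.5, 4.2, 10.0], df=1))
```

The same statistic carries less evidence as the degrees of freedom grow, so the GMM test on 1 degree of freedom and the EMM tests on 3 or 5 are only comparable through their p-values.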
