Econometrica, Vol. 52, No. 3 (May, 1984)

PSEUDO MAXIMUM LIKELIHOOD METHODS: APPLICATIONS TO POISSON MODELS

BY C. GOURIEROUX, A. MONFORT, AND A. TROGNON

Pseudo maximum likelihood techniques are applied to basic Poisson models and to
Poisson models with specification errors. In the latter case it is shown that consistent and
asymptotically normal estimators can be obtained without specifying the p.d.f. of the
disturbances. These estimators are compared both from the finite sample and the asymp-
totic point of view. Quasi generalized PML estimators, which asymptotically dominate all
PML estimators, are also proposed. Finally, bivariate and panel data Poisson models are
discussed.

1. INTRODUCTION

THE ANALYSIS OF ECONOMIC BEHAVIOR often leads to the study of characteristics
taking a small number of positive values. The classical linear model is not
adapted to explain how such discrete variables depend on other quantitative or
qualitative variables. The reasons are similar to those usually given in the case of
an endogenous qualitative variable: the shape of the observation set does not
correspond to a linear model, the assumption of normality of the disturbances
cannot be made, since the endogenous variables take a small number of values
with strictly positive probabilities, and the prediction formulae which are de-
duced from a linear model give impossible values.
In the models considered in the literature to describe discrete variables (Cox
and Lewis [2], El Sayyad [3], Frome, Kutner, and Beauchamp [4], Gilbert [5],
Hausman, Hall, and Griliches [8]; see also Lancaster [12]) the endogenous
variable is assumed to have a Poisson distribution conditional upon the exoge-
nous variables. The parameter of this distribution is a function of the values of
the exogenous variables. The choice of such a model is justified if the dependent
variable counts the occurrence of a given event during a fixed period and if the
usual assumptions of the Poisson process are satisfied. For instance, the model is
adapted to describe daily numbers of oil tankers' arrivals in a port, the number
of accidents at work by factory, or the number of patents applied for and
received by firms (Hausman, Hall, and Griliches [8]).
This kind of model may also be used as a first approach for cases where the
usual assumptions are not exactly satisfied, in particular the assumption of
independence between the present and the past. For instance, such a model has
been used by Gilbert [5] to explain the number of jobs held during a year.
In Section 2 we give the definitions of the basic Poisson model and of the
Poisson model with specification errors. In the latter case, we show how the first
and second order moments of the endogenous variable depend on the parameters
(Section 3), and we derive pseudo maximum likelihood estimation methods only
based on these first and second moments (Gourieroux, Monfort, and Trognon
[7]). The pseudo maximum likelihood estimators are compared in Section 4.
Generalizations of the Poisson model for the multivariate case, in particu-
lar for models required by panel data, are described in Sections 5 and 6.

2. PRELIMINARIES

In this section, the basic Poisson model and the Poisson model with specifica-
tion errors are described.

2.a. The Basic Poisson Model


Let y_i, i = 1, ..., n, be the observations of a discrete variable. In the basic Poisson model, it is assumed that the y_i are independently distributed and that the distribution of y_i is a Poisson distribution, whose parameter is

    \lambda_i = \exp(x_i b) = \exp\Big( \sum_{k=1}^{K} x_{ik} b_k \Big),

where (x_{i1}, ..., x_{iK}) are K exogenous variables associated with the ith observation and (b_1, ..., b_K) are K unknown parameters.

The exponential function appearing in \lambda_i is mainly justified by the positivity of \lambda_i; a linear function \lambda_i = x_i b would imply possibly incompatible constraints on the parameters. The mean and the variance of y_i are equal to \lambda_i and the density function of y_i is

    l(y_i | x_i) = \exp(-\lambda_i) \frac{\lambda_i^{y_i}}{y_i!}.

The log-likelihood function is

    L(b) = -\sum_{i=1}^{n} \lambda_i + \sum_{i=1}^{n} y_i \log \lambda_i - \sum_{i=1}^{n} \log(y_i!)
         = \text{constant} - \sum_{i=1}^{n} \exp(x_i b) + \sum_{i=1}^{n} y_i x_i b.

In the maximization of L(b), the first order conditions are:¹

    \frac{dL}{db} = \sum_{i=1}^{n} x_i' [y_i - \exp(x_i b)] = 0, \qquad \text{or} \qquad \sum_{i=1}^{n} x_i' (y_i - \lambda_i) = 0.

¹If a component of x_i is constant, the corresponding likelihood equation is

    \sum_{i=1}^{n} (\lambda_i - y_i) = 0.

This equation implies that the sum of the residuals is equal to zero; therefore the comparison of the computed sum of the residuals with zero can be used as an indicator of the algorithm accuracy.


If a solution of the previous equations exists,² it is a unique maximum since the Hessian matrix

    \frac{d^2 L}{db\, db'} = -\sum_{i=1}^{n} x_i' x_i \exp(x_i b)

is negative definite if X'X = \sum_{i=1}^{n} x_i' x_i is of full rank; the log-likelihood function is, in this case, strictly concave and can be easily maximized by the usual algorithms.
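For readers who wish to reproduce this computation, the following Python sketch (not part of the original paper; the simulated data and the function name poisson_ml are illustrative) maximizes the log-likelihood by Newton's method, using the score and Hessian given above.

    import numpy as np

    def poisson_ml(x, y, tol=1e-10, max_iter=100):
        # Maximize L(b) = sum_i [y_i x_i b - exp(x_i b)] by Newton's method.
        # x: (n, K) matrix of exogenous variables, y: (n,) vector of counts.
        # The Hessian -sum_i x_i' x_i exp(x_i b) is negative definite when X'X
        # has full rank, so the iterations converge from b = 0.
        b = np.zeros(x.shape[1])
        for _ in range(max_iter):
            lam = np.exp(x @ b)                    # lambda_i = exp(x_i b)
            score = x.T @ (y - lam)                # first order conditions
            hessian = -(x * lam[:, None]).T @ x    # d2L / db db'
            step = np.linalg.solve(hessian, score)
            b = b - step
            if np.max(np.abs(step)) < tol:
                break
        return b

    # small simulated check: b_hat is close to b_true and, because the first
    # column of x is constant, the residuals y - exp(x b_hat) sum to roughly 0
    rng = np.random.default_rng(0)
    x = np.column_stack([np.ones(200), rng.normal(size=200)])
    b_true = np.array([0.5, 1.0])
    y = rng.poisson(np.exp(x @ b_true)).astype(float)
    print(poisson_ml(x, y))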

2.b. The Poisson Model with Specification Error


In the previous model the conditional expectation of the endogenous variable
given the exogenous variables and the corresponding conditional variance cannot
vary independently.3 To overcome this restriction it has been proposed to
introduce a disturbance in the definition of the parameter of the Poisson
distribution. In such models the parameter is written:

    \lambda_i^* = \exp(x_i b + \varepsilon_i),

where ε_i is a specification error due, for instance, to omitted explanatory variables independent from the exogenous variables x. This randomness of the model must be distinguished from that of the Poisson model, which is related to the random character of the endogenous variable considered. In this model the λ_i^* are random variables, and the conditional distribution of the y_i, i = 1, ..., n, given the x_i and ε_i (or λ_i^*), i = 1, ..., n, is given by \prod_{i=1}^{n} l(y_i | x_i, \varepsilon_i), where l(y_i | x_i, ε_i) is the density of the Poisson distribution P[exp(x_i b + ε_i)].

Since ε_i is an unobservable random variable, we must integrate it out to obtain
²These equations are not always solvable. Let us consider the case K = 1; the likelihood equation is \sum_{i=1}^{n} x_i \exp(x_i b) = \sum_{i=1}^{n} x_i y_i. The mapping b \mapsto \sum_{i=1}^{n} x_i \exp(x_i b) is continuous and its range is ]-\infty, +\infty[ if \min_i x_i < 0 < \max_i x_i, ]-\infty, 0[ if \max_i x_i < 0, and ]0, +\infty[ if \min_i x_i > 0; therefore, since the y_i are positive, the equation is solvable if \min_i x_i < 0 < \max_i x_i; otherwise a solution exists if, and only if, at least one y_i is non null. Also note that the maximum likelihood estimator \hat{b} is such that

    \frac{1}{n} \sum_{i=1}^{n} x_i' x_i \tilde{b} = \frac{1}{n} \sum_{i=1}^{n} x_i' \exp(x_i \hat{b}) = \psi_n(\hat{b}) \quad \text{(say)},

where \tilde{b} is the O.L.S. estimator of b. By inverting \psi_n, it is possible to deduce the asymptotic properties of \hat{b} from those of \tilde{b} (see Gourieroux and Monfort [6]).
³The fact that the sample variance of y is greater than the sample mean does not invalidate the basic Poisson model. In effect, if this model is true:

    V(y) = V(E(y|x)) + E V(y|x) = V(E(y|x)) + E E(y|x) \geq E E(y|x) = E(y).


the conditional distribution of y_i given x_i:

    l^*(y_i | x_i) = \int \exp(-\lambda_i^*) \frac{\lambda_i^{*\,y_i}}{y_i!}\, g(\varepsilon_i)\, d\varepsilon_i
                   = \int \frac{\exp[-\exp(x_i b + \varepsilon_i)]\, \exp[y_i (x_i b + \varepsilon_i)]}{y_i!}\, g(\varepsilon_i)\, d\varepsilon_i,

where g is the p.d.f. of ε_i.


If the disturbances ε_i, i = 1, ..., n, are i.i.d., the log-likelihood function is given by:

    L = \sum_{i=1}^{n} \log l^*(y_i | x_i).

The distribution l^* depends on the distribution of ε_i and, in general, l^* is a very complicated function. For convenience, it is sometimes assumed that u_i = exp(ε_i) is distributed as a \gamma(a, \beta), with p.d.f.

    \frac{u^{a-1} \exp(-u/\beta)}{\beta^{a}\, \Gamma(a)}.

This kind of model has been considered by Hausman, Hall, and Griliches [8]
and by Gilbert [5] (see also Lancaster [11]).
Assuming that there exists a constant term in x_i b, there is no loss of generality in setting E exp(ε_i) = 1. In that case \beta = a^{-1} = \eta^2 = V[\exp(\varepsilon_i)] and it is straightforward to show (see, for instance, Gilbert [5]) that the distribution of y_i given x_i is a negative binomial distribution:

    l^*(y_i | x_i; b, \eta^2) = \frac{\Gamma(1/\eta^2 + y_i)}{\Gamma(1/\eta^2)\, \Gamma(y_i + 1)}\, [\eta^2 \exp(x_i b)]^{y_i}\, [1 + \eta^2 \exp(x_i b)]^{-y_i - 1/\eta^2}.
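This mixture representation can be checked numerically; the short Python sketch below (not from the paper; the values of η² and x_i b are arbitrary) draws u_i = exp(ε_i) from a gamma distribution with mean 1 and variance η², draws y_i from a Poisson distribution with parameter exp(x_i b) u_i, and compares the empirical frequencies with the negative binomial probabilities above.

    import numpy as np
    from scipy.special import gammaln

    rng = np.random.default_rng(1)
    eta2, xb = 0.5, 0.8                       # arbitrary eta^2 and x_i b
    a, beta = 1.0 / eta2, eta2                # gamma(a, beta): mean a*beta = 1, variance eta^2
    u = rng.gamma(shape=a, scale=beta, size=500_000)    # u_i = exp(eps_i)
    y = rng.poisson(np.exp(xb) * u)           # y_i | u_i ~ Poisson(exp(x_i b) u_i)

    def negbin_pmf(k, xb, eta2):
        # negative binomial probability l*(k | x; b, eta^2) written above
        m = np.exp(xb)
        return np.exp(gammaln(1 / eta2 + k) - gammaln(1 / eta2) - gammaln(k + 1)
                      + k * np.log(eta2 * m) - (k + 1 / eta2) * np.log(1 + eta2 * m))

    for k in range(5):
        print(k, np.mean(y == k), negbin_pmf(k, xb, eta2))   # empirical vs. theoretical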

The ML estimation of b can be computed by numerical methods which necessitate expansions of gamma and digamma functions.

As shown later, this ML estimator may be inconsistent if exp(ε_i) does not have a gamma distribution. Therefore it is interesting to propose simple estimation methods leading to consistent estimators for all possible distributions g having second order moments.


3. ESTIMATION METHOD FOR THE POISSON MODEL WITH SPECIFICATION ERROR

Since E(exp ε_i) = 1 and V(exp ε_i) = η², we have:

    E(y_i | x_i) = E[E(y_i | \varepsilon_i, x_i)] = E \exp(x_i b + \varepsilon_i) = \exp(x_i b)\, E \exp(\varepsilon_i) = \exp(x_i b),

    V(y_i | x_i) = E[V(y_i | x_i, \varepsilon_i)] + V[E(y_i | x_i, \varepsilon_i)]
                 = E \exp(x_i b + \varepsilon_i) + V[\exp(x_i b + \varepsilon_i)]
                 = \exp(x_i b) + \eta^2 \exp(2 x_i b).

These results will be used to estimate the true values b_0, η_0² of b, η² without assumptions on g.
We shall consider three kinds of methods proposed in Gourieroux et al. [7],
which are pseudo maximum likelihood based on linear exponential families,
quasi-generalized pseudo maximum likelihood (QGPML), and pseudo maximum
likelihood based on quadratic exponential families.

3.a. Pseudo Maximum Likelihood Estimators Based on Linear Exponential Families

We shall consider the PML estimators associated with four linear exponential families: the normal family (with unit variance), the Poisson family, the negative binomial family (with η² taken equal to a given positive number a), and the gamma family.

The objective functions corresponding to these families are, respectively:

    (i)   -\sum_{i=1}^{n} (y_i - \exp(x_i b))^2 \qquad \text{(nonlinear least squares),}⁴

    (ii)  \sum_{i=1}^{n} [-\exp(x_i b) + y_i x_i b],

⁴It might have seemed natural to consider a generalized least squares estimator of b by minimizing the following objective function:

    \sum_{i=1}^{n} \frac{(y_i - \exp(x_i b))^2}{\exp(x_i b) + a \exp(2 x_i b)},

where a is a given number. However such a method, based on the criterion

    \psi(u, m) = \frac{(u - m)^2}{m + a m^2},

is not always consistent. This result is a consequence of the proof of Theorem 2 in Gourieroux et al. [7], which establishes a necessary condition for the consistency of the estimator obtained by minimizing \sum_i \psi(y_i, \exp(x_i b)). This condition is that \partial\psi/\partial m = \lambda(m)(u - m), which is not satisfied in the present case.


    (iii) \sum_{i=1}^{n} \Big\{ y_i x_i b - \Big(\frac{1}{a} + y_i\Big) \log(1 + a \exp(x_i b)) \Big\},

    (iv)  \sum_{i=1}^{n} [-x_i b - y_i \exp(-x_i b)].

The pseudo likelihood equations are:

    (i)   2 \sum_{i=1}^{n} x_i' (y_i - \exp(x_i b)) \exp(x_i b) = 0,

    (ii)  \sum_{i=1}^{n} x_i' (y_i - \exp(x_i b)) = 0,

    (iii) \sum_{i=1}^{n} x_i' \frac{y_i - \exp(x_i b)}{1 + a \exp(x_i b)} = 0,

    (iv)  \sum_{i=1}^{n} x_i' (y_i - \exp(x_i b)) \exp(-x_i b) = 0.

The Hessian matrices of the objective functions are:

    (i)   2 \sum_{i=1}^{n} x_i' x_i (y_i - 2\exp(x_i b)) \exp(x_i b),

    (ii)  -\sum_{i=1}^{n} x_i' x_i \exp(x_i b),

    (iii) -\sum_{i=1}^{n} x_i' x_i \frac{(1 + a y_i) \exp(x_i b)}{(1 + a \exp(x_i b))^2},

    (iv)  -\sum_{i=1}^{n} x_i' x_i\, y_i \exp(-x_i b).

Note that the last three objective functions are concave and can be easily
maximized by usual iterative methods, whereas the objective function associated
with nonlinear least squares is not concave.
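As an illustration, the Python sketch below (not from the paper; the data-generating step and the choice of optimizer are illustrative, and a denotes the fixed nuisance parameter of the negative binomial family) writes the four objective functions as losses to be minimized and computes the corresponding PML estimates on simulated data with a gamma specification error.

    import numpy as np
    from scipy.optimize import minimize

    def pml_estimate(x, y, family, a=1.0):
        # PML estimator of b based on one of the four linear exponential families;
        # the losses are the negatives of objective functions (i)-(iv) above.
        def loss(b):
            m = np.exp(x @ b)
            if family == "normal":      # (i)  nonlinear least squares
                return np.sum((y - m) ** 2)
            if family == "poisson":     # (ii)
                return np.sum(m - y * (x @ b))
            if family == "negbin":      # (iii) with nuisance parameter a
                return -np.sum(y * (x @ b) - (1.0 / a + y) * np.log(1 + a * m))
            if family == "gamma":       # (iv)
                return np.sum(x @ b + y * np.exp(-(x @ b)))
            raise ValueError(family)
        return minimize(loss, np.zeros(x.shape[1]), method="BFGS").x

    rng = np.random.default_rng(2)
    n = 2000
    x = np.column_stack([np.ones(n), rng.normal(size=n)])
    b0, eta2 = np.array([1.0, 1.0]), 0.5
    u = rng.gamma(1 / eta2, eta2, size=n)          # E u = 1, V u = eta^2
    y = rng.poisson(np.exp(x @ b0) * u)
    for fam in ["normal", "poisson", "negbin", "gamma"]:
        print(fam, pml_estimate(x, y, fam))        # all four should be close to b0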
We know from Gourieroux et al. [7, Theorems 2 and 3] that these four PML estimators \hat{b}_j, j = 1, ..., 4, are strongly consistent and asymptotically normal and that the asymptotic covariance matrix of \sqrt{n}(\hat{b}_j - b_0) has the general form J^{-1} I J^{-1}, where

    J = E_x \Big( \frac{\partial f}{\partial b}\, \Sigma_0^{-1}\, \frac{\partial f}{\partial b'} \Big),
    \qquad
    I = E_x \Big( \frac{\partial f}{\partial b}\, \Sigma_0^{-1}\, \Omega_0\, \Sigma_0^{-1}\, \frac{\partial f}{\partial b'} \Big),

f(x, b) = \exp(xb), \partial f/\partial b = (\partial f/\partial b)(x, b_0) = x' \exp(x b_0), \Sigma_0 = \Sigma(x, b_0) is the variance of the chosen linear exponential family, and \Omega_0 = \exp(x b_0) + \eta_0^2 \exp(2 x b_0) is the variance of the true distribution.
The asymptotic covariance matrices of the four PML estimators are:

(i) [E,(x'x exp 2xbo)] E' {x'x exp 2xbo(exp xbo + qoexp2xbo) }

x EX(X'x exp2xbo)1],

(ii) [ Ex(x'x expbo)] 'EX{x'x (expxb0 + 12exp2xbo)}

x [ Ex(x'x exp xbo) ]'

2
=Ex (x'x exp xbo)1 + [Ex (x'x exp xbo)

X EX(x'x exp 2xbo)[ E (x'x exp xbo)]'.

Note that, if the likelihood of the basic Poisson model is used, the estimator obtained is consistent and asymptotically normal; however the previous formula shows that the asymptotic covariance matrix obtained when it is (wrongly) assumed that the correct model is the basic Poisson model, i.e. [E_x(x'x \exp(xb_0))]^{-1}, is "smaller" than the true one.

    (iii) \Big[E_x\Big(\frac{x'x \exp(xb_0)}{1 + a\exp(xb_0)}\Big)\Big]^{-1} E_x\Big\{\frac{x'x\,(\exp(xb_0) + \eta_0^2 \exp(2xb_0))}{(1 + a\exp(xb_0))^2}\Big\} \Big[E_x\Big(\frac{x'x \exp(xb_0)}{1 + a\exp(xb_0)}\Big)\Big]^{-1},

    (iv)  (E_x x'x)^{-1}\, E_x\{x'x[\eta_0^2 + \exp(-xb_0)]\}\, (E_x x'x)^{-1}
          = \eta_0^2 (E_x x'x)^{-1} + (E_x x'x)^{-1} E_x[x'x \exp(-xb_0)](E_x x'x)^{-1}.

These asymptotic covariance matrices may be consistently estimated in different ways. If a consistent estimate of η² is known (see Section 3.b), consistent estimates of the previous covariance matrices are obtained by replacing b_0 and η_0² with consistent estimates and expectations with sample means. Another way of
estimating these matrices is to consider the sample means with respect to x and


y; in this case the estimator is

    \Big[\frac{1}{n}\sum_{i=1}^{n} \frac{x_i'x_i \exp(2x_i\hat{b}_j)}{\Sigma(x_i,\hat{b}_j)}\Big]^{-1}
    \Big[\frac{1}{n}\sum_{i=1}^{n} \frac{x_i'x_i \exp(2x_i\hat{b}_j)(y_i - \exp(x_i\hat{b}_j))^2}{\Sigma^2(x_i,\hat{b}_j)}\Big]
    \Big[\frac{1}{n}\sum_{i=1}^{n} \frac{x_i'x_i \exp(2x_i\hat{b}_j)}{\Sigma(x_i,\hat{b}_j)}\Big]^{-1}.
Does there exist one of the previous PML estimators uniformly better than the others, i.e. with a smaller asymptotic covariance matrix for all possible distributions of the error ε_i? Heuristically the answer is "no." For instance, if ε_i = 0, the PML estimator based on the Poisson family is exactly the ML estimator; it is asymptotically efficient and in particular better than the other three estimators. Whereas, if exp(ε_i) has a gamma distribution, the PML estimator based on the negative binomial family is asymptotically efficient and better than the others. However, as we shall see in the next section, it is possible to build estimators (QGPML estimators) which are asymptotically uniformly better than the four previous PML estimators.
Finally note that, in the same spirit as White [13] or Chesher [1], it would be
possible to perform a specification test for the distribution of the errors by
comparing to zero an estimate of J - I. However as noted by White [13], the
computation of the test statistic can be cumbersome.

3.b. Quasi-Generalized Pseudo Maximum Likelihood Estimators

To apply the QGPML approach (Gourieroux et al. [7, Section 5]), it is necessary to have consistent estimators of b and η². A consistent estimator of b can be obtained by any \hat{b}_j and a consistent estimator of η² may be derived along the following lines: since V(y_i) = \exp(x_i b) + \eta^2 \exp(2x_i b), we have (y_i - \exp(x_i b))^2 - \exp(x_i b) = \eta^2 \exp(2x_i b) + u_i with E u_i = 0. A strongly consistent estimator of η² is the estimated regression coefficient:

    \hat{\eta}^2 = \frac{\sum_{i=1}^{n} [(y_i - \exp(x_i\hat{b}_j))^2 - \exp(x_i\hat{b}_j)] \exp(2x_i\hat{b}_j)}{\sum_{i=1}^{n} \exp(4x_i\hat{b}_j)}.
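In code, this auxiliary regression reduces to a single ratio; a minimal sketch (not from the paper; b_hat is any of the first-step PML estimates):

    import numpy as np

    def eta2_hat(x, y, b_hat):
        # regression-based estimator of eta^2 from a first-step PML estimate
        m = np.exp(x @ b_hat)                    # exp(x_i b_hat)
        z = (y - m) ** 2 - m                     # dependent variable of the regression
        w = m ** 2                               # regressor exp(2 x_i b_hat)
        return max(np.sum(z * w) / np.sum(w ** 2), 0.0)   # truncated at 0, as suggested below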

QGPML estimators can be implemented when the linear exponential families depend on two parameters bijectively related with the mean and the variance. Therefore, the QGPML approach can be applied with the normal family (i), the negative binomial family (iii), or the gamma family (iv), but not with the Poisson

family (ii). In these cases the objective functions are:

    (i)   -\sum_{i=1}^{n} \frac{(y_i - \exp(x_i b))^2}{\exp(x_i\hat{b}_j) + \hat{\eta}^2 \exp(2x_i\hat{b}_j)},

    (iii) \sum_{i=1}^{n} \Big\{ y_i x_i b - \Big(\frac{1}{\hat{\eta}^2} + y_i\Big) \log(1 + \hat{\eta}^2 \exp(x_i b)) \Big\},

    (iv)  \sum_{i=1}^{n} \frac{\exp(x_i\hat{b}_j)}{1 + \hat{\eta}^2 \exp(x_i\hat{b}_j)} \big[-x_i b - y_i \exp(-x_i b)\big].

As previously mentioned, the two last objective functions are globally concave (if \hat{\eta}^2 < 0 we take \hat{\eta}^2 = 0 and, if \hat{\eta}^2 = 0, (iii) should be replaced by the function associated with the basic Poisson model).
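A minimal sketch of the QGPML step based on the gamma family (iv) (not from the paper; it assumes a first-step estimate b_first and the η̂² estimator sketched above, and the optimizer choice is illustrative):

    import numpy as np
    from scipy.optimize import minimize

    def qgpml_gamma(x, y, b_first, eta2):
        # QGPML estimator: maximize objective (iv) with the weights
        # exp(x_i b_first) / (1 + eta2 * exp(x_i b_first)) held fixed
        w = np.exp(x @ b_first) / (1.0 + eta2 * np.exp(x @ b_first))
        def loss(b):                             # minus the objective function (iv)
            return np.sum(w * (x @ b + y * np.exp(-(x @ b))))
        return minimize(loss, b_first, method="BFGS").x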


All these estimators are consistent and have the same asymptotic covariance matrix:

    \Big[ E_x \Big( \frac{x'x \exp(xb_0)}{1 + \eta_0^2 \exp(xb_0)} \Big) \Big]^{-1}.

This covariance matrix is smaller than the asymptotic covariance matrices of the PML estimators of the previous section.
If we expand the covariance matrix of the QGPML estimators in a neighborhood of η_0² = 0, we have:

    \Big[ E_x \Big( \frac{x'x \exp(xb_0)}{1 + \eta_0^2 \exp(xb_0)} \Big) \Big]^{-1}
    = \big[ E_x\big( x'x \exp(xb_0)(1 - \eta_0^2 \exp(xb_0)) \big) \big]^{-1} + o(\eta_0^2)
    = \big[ E_x(x'x \exp(xb_0)) - \eta_0^2 E_x(x'x \exp(2xb_0)) \big]^{-1} + o(\eta_0^2)
    = [E_x(x'x \exp(xb_0))]^{-1} + \eta_0^2 [E_x(x'x \exp(xb_0))]^{-1} E_x(x'x \exp(2xb_0)) [E_x(x'x \exp(xb_0))]^{-1} + o(\eta_0^2).

This shows that the PML estimator based on the Poisson family is second order efficient in a neighborhood of η_0² = 0.

3.c. Pseudo Maximum Likelihood Estimators Based on Quadratic Exponential Families

The parameters b and η² can be estimated simultaneously by using the PML approach based on quadratic exponential families. This method may be based
for instance on the normal family. The estimators are defined as the solutions of
the optimization problem:

    \max_{b, \eta^2} \psi(b, \eta^2) = -\sum_{i=1}^{n} \log[\exp(x_i b) + \eta^2 \exp(2x_i b)]
                                       - \sum_{i=1}^{n} \frac{[y_i - \exp(x_i b)]^2}{\exp(x_i b) + \eta^2 \exp(2x_i b)},

    \frac{\partial \psi(b, \eta^2)}{\partial b} = \sum_{i=1}^{n} x_i' \Big\{ -\frac{1 + 2\eta^2\exp(x_i b)}{1 + \eta^2\exp(x_i b)}
        + \frac{[y_i - \exp(x_i b)]^2 [1 + 2\eta^2\exp(x_i b)]}{[1 + \eta^2\exp(x_i b)]^2 \exp(x_i b)}
        + \frac{2[y_i - \exp(x_i b)]}{1 + \eta^2\exp(x_i b)} \Big\} = 0,

    \frac{\partial \psi(b, \eta^2)}{\partial (\eta^2)} = \sum_{i=1}^{n} \frac{\exp(x_i b)}{1 + \eta^2\exp(x_i b)}
        \Big\{ \frac{[y_i - \exp(x_i b)]^2}{\exp(x_i b)[1 + \eta^2\exp(x_i b)]} - 1 \Big\} = 0.
The asymptotic covariance matrix of the PML estimators is obtained by applying the result of Gourieroux et al. [7, Appendix 5], with

    C(m, \sigma^2) = \frac{m}{\sigma^2}, \qquad D(m, \sigma^2) = -\frac{1}{2\sigma^2},

    h(b, \eta^2) = \begin{bmatrix} \exp(xb) \\ \exp(xb) + \eta^2 \exp(2xb) \end{bmatrix},

    \frac{\partial h'(b, \eta^2)}{\partial (b, \eta^2)} = \begin{bmatrix} x'\exp(xb) & x'[\exp(xb) + 2\eta^2\exp(2xb)] \\ 0 & \exp(2xb) \end{bmatrix}.

Finally, we can remark that the objective function associated with the negative


binomial family, i.e.

    \sum_{i=1}^{n} \log\Big\{ \frac{\Gamma(1/\eta^2 + y_i)}{\Gamma(1/\eta^2)\,\Gamma(y_i + 1)}\, [\eta^2\exp(x_i b)]^{y_i}\, [1 + \eta^2\exp(x_i b)]^{-y_i - 1/\eta^2} \Big\},

does not satisfy the necessary condition of Theorem 7 in Gourieroux et al. [7] for the strong consistency of the PML estimator. Therefore the method described in Section 2.b is inconsistent for some true distributions of exp(ε_i) (different from the gamma distribution).

4. COMPARISON OF THE VARIOUS PML ESTIMATORS

In the previous section, we defined a family of consistent pseudo maximum likelihood estimators and we derived their asymptotic covariance matrices. It is
natural to compare the properties of these estimators and this is the purpose of
this section. This comparison will be performed in two ways: asymptotically by
specifying a convenient limit distribution for the explanatory variables, and in a
finite sample by using Monte-Carlo methods.

4.a. Asymptotic Comparison


Let us write xb_0 = a_0 + x^* b_0^*. The parameter of interest is b_0^* and we shall assume that the limit distribution of x^{*\prime} is multivariate normal N[\mu, \Omega], where \Omega is nonsingular. In fact we can assume without loss of generality that \mu = 0 and \Omega = I; in effect, if it is not the case, it is possible to define the new variable \Omega^{-1/2}(x^{*\prime} - \mu) and the new parameters a_0 + \mu' b_0^*, \Omega^{1/2} b_0^*; these parameters are bijectively related to the initial ones and the new parameter of interest is bijectively related to b_0^*.
The asymptotic covariance matrices of the PML estimators associated with the normal, Poisson, and gamma pseudo families are functions of expressions of the form E_x[x'x \exp(\theta x b_0)]. When x^* is standard multivariate normal, this matrix is equal to:

    E_x[x'x \exp(\theta x b_0)] = \exp\Big(\theta a_0 + \frac{\theta^2 \|b_0^*\|^2}{2}\Big)
      \begin{bmatrix} 1 & \theta b_0^{*\prime} \\ \theta b_0^* & I + \theta^2 b_0^* b_0^{*\prime} \end{bmatrix},

where \|b_0^*\|^2 = b_0^{*\prime} b_0^*. We deduce that:

    \{E_x[x'x \exp(\theta x b_0)]\}^{-1} = \exp\Big(-\theta a_0 - \frac{\theta^2 \|b_0^*\|^2}{2}\Big)
      \begin{bmatrix} 1 + \theta^2\|b_0^*\|^2 & -\theta b_0^{*\prime} \\ -\theta b_0^* & I \end{bmatrix}.
From these results, the asymptotic covariance matrices of the PML estimators of


b_0^* can be worked out; we obtain:

    Normal:  V_{as}(\hat{b}_1^*) = \exp\Big(-a_0 + \frac{\|b_0^*\|^2}{2}\Big)[I + b_0^* b_0^{*\prime}]
                                   + \eta_0^2 \exp(4\|b_0^*\|^2)[I + 4\, b_0^* b_0^{*\prime}];

    Poisson: V_{as}(\hat{b}_2^*) = \exp\Big(-a_0 - \frac{\|b_0^*\|^2}{2}\Big) I
                                   + \eta_0^2 \exp(\|b_0^*\|^2)[I + b_0^* b_0^{*\prime}];

    Gamma:   V_{as}(\hat{b}_4^*) = \exp\Big(-a_0 + \frac{\|b_0^*\|^2}{2}\Big)[I + b_0^* b_0^{*\prime}] + \eta_0^2 I.

Since V_{as}(\hat{b}_1^*) \gg V_{as}(\hat{b}_2^*) and V_{as}(\hat{b}_1^*) \gg V_{as}(\hat{b}_4^*), the PML estimator based on the normal pseudo family, i.e. the nonlinear least squares estimator, is always dominated by the other PML estimators. As far as \hat{b}_2^* and \hat{b}_4^* are concerned, none of these estimators is uniformly better than the other; more precisely, V_{as}(\hat{b}_4^*) \gg V_{as}(\hat{b}_2^*) if and only if

    \eta_0^2 < \exp\Big(-a_0 - \frac{\|b_0^*\|^2}{2}\Big).

In particular we find as expected that, if the variance of the disturbance is small enough, the PML estimator based on the Poisson distribution is better than the PML one based on the gamma distribution.
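A small numerical illustration of this threshold (not from the paper; the values of a_0, b_0^*, and η_0² are arbitrary and the covariate is taken to be scalar):

    import numpy as np

    a0, b_star = 1.0, np.array([1.0])
    nb2 = float(b_star @ b_star)                 # ||b0*||^2
    I_mat, bb = np.eye(1), np.outer(b_star, b_star)

    def v_poisson(eta2):                         # V_as of the Poisson PML estimator of b0*
        return np.exp(-a0 - nb2 / 2) * I_mat + eta2 * np.exp(nb2) * (I_mat + bb)

    def v_gamma(eta2):                           # V_as of the gamma PML estimator of b0*
        return np.exp(-a0 + nb2 / 2) * (I_mat + bb) + eta2 * I_mat

    threshold = np.exp(-a0 - nb2 / 2)            # below this value the Poisson PML dominates
    for eta2 in (threshold / 2, 2 * threshold):
        print(eta2, v_poisson(eta2)[0, 0], v_gamma(eta2)[0, 0])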

4.b. Finite Sample Comparison


In order to have some idea of the finite sample properties of the PML
estimators, we consider the following model:

    y_i \sim P\{\exp[a_0 + b_0 x_i^* + \varepsilon_i]\}, \qquad \text{where } a_0 = b_0 = 1.

The variables x_i^* and ε_i, i = 1, ..., 20, are chosen independent with respective distributions N[0, (0.88)^2] and N[-0.35, (0.84)^2]. The particular values appearing in these distributions are such that E exp(ε_i) = 1 and E y_i = 4. Forty samples have been independently drawn for each estimation method.
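This design is easy to reproduce; the sketch below (not from the paper) regenerates the forty samples and re-estimates the slope with the Poisson pseudo family as an example, the other families being handled in the same way; simulation noise means the figures will not match Table I exactly.

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(3)
    a0, b0, n, n_rep = 1.0, 1.0, 20, 40
    slopes = []
    for _ in range(n_rep):
        x_star = rng.normal(0.0, 0.88, size=n)
        eps = rng.normal(-0.35, 0.84, size=n)          # chosen so that E exp(eps) is close to 1
        y = rng.poisson(np.exp(a0 + b0 * x_star + eps))
        x = np.column_stack([np.ones(n), x_star])
        loss = lambda b: np.sum(np.exp(x @ b) - y * (x @ b))   # Poisson PML objective (sign flipped)
        slopes.append(minimize(loss, np.zeros(2), method="BFGS").x[1])
    print(np.mean(slopes), np.std(slopes))             # empirical mean and standard error of the slope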
The empirical means, the asymptotic theoretical means, and the standard errors of \hat{b}^* are given in Table I.

The method corresponding to the negative binomial pseudo family, which is usually chosen, yields the greatest bias in the present case. The empirical distributions are reported in Figure 1 and the comparison between the empirical and the asymptotic distributions of the PML estimator based on the gamma family is given in Figure 2.


TABLE I

Pseudo family        Empirical mean   Empirical std. error   Asymptotic mean   Asymptotic std. error
Normal                    0.98              0.31                    1                  2.47
Poisson                   0.97              0.31                    1                  0.52
Negative binomial         0.78              0.57                    1                  0.33
Gamma                     1.07              0.39                    1                  0.36

FIGURE 1-Empirical distributions of the PML estimator of the slope coefficient (normal, Poisson, negative binomial, and gamma pseudo families; horizontal axis from -1 to 2).


FIGURE 2-Comparison between the empirical and the asymptotic distributions of the PML estimator based on the gamma family.

5. THE BIVARIATE POISSON MODEL

5.a. The Statistical Model


In this section we propose a generalization of Poisson models which allows the simultaneous description of two count variables y_1 and y_2. The corresponding bivariate Poisson model is the same as the model presented by Johnson and Kotz [10, pp. 297-300]. This model can be obtained through the classical hypotheses of Poisson processes.

HYPOTHESIS H1: y_1(t) and y_2(t) are the numbers of events of types 1 and 2 which occur between 0 and t.

HYPOTHESIS H2: The occurrence of events between t and t + dt is independent of what occurred before t.

HYPOTHESIS H3: Between t and t + dt will occur: (i) one event of type 1 and no event of type 2 with probability

    \lambda\, dt + o(dt), \qquad \text{where } \lim_{dt \to 0} \frac{o(dt)}{dt} = 0;


(ii) one event of type 2 and no event of type 1 with probability \mu\, dt + o(dt); (iii) one event of type 1 and one event of type 2 with probability \gamma\, dt + o(dt); (iv) no event with probability 1 - \lambda\, dt - \mu\, dt - \gamma\, dt + o(dt).

The distribution of the process [y_1(t), y_2(t)] can be easily obtained. Let us denote by P_{n,m}(t) the probability that y_1(t) = n and y_2(t) = m, and by G(t, u, v) = E u^{y_1(t)} v^{y_2(t)} the moment generating function. By definition, we have the following difference equation:

    P_{n,m}(t + dt) = P_{n,m}(t)[1 - \lambda\, dt - \mu\, dt - \gamma\, dt + o(dt)]
                      + P_{n-1,m}(t)[\lambda\, dt + o(dt)] + P_{n,m-1}(t)[\mu\, dt + o(dt)]
                      + P_{n-1,m-1}(t)[\gamma\, dt + o(dt)].

As dt tends to zero, we can write:

    \frac{dP_{n,m}(t)}{dt} = (-\lambda - \mu - \gamma) P_{n,m}(t) + \lambda P_{n-1,m}(t) + \mu P_{n,m-1}(t) + \gamma P_{n-1,m-1}(t).

This relation implies the following differential equation for the moment generating function:

    \frac{\partial}{\partial t} G(t, u, v) = (-\lambda - \mu - \gamma + \lambda u + \mu v + \gamma u v)\, G(t, u, v).

Then G(t, u, v) = \exp\{(-\lambda - \mu - \gamma + \lambda u + \mu v + \gamma u v) t\}. The expression for G implies that y_1(t) and y_2(t) are respectively Poisson variables with parameters (\lambda + \gamma)t and (\mu + \gamma)t, setting v = 1 or u = 1. These variables are correlated. The covariance between y_1(t) and y_2(t) is obtained through the second cross partial derivative of G:

    E y_1(t) y_2(t) = \Big[\frac{\partial^2 G(t, u, v)}{\partial u\, \partial v}\Big]_{u=1, v=1} = \gamma t + (\lambda + \gamma)t \cdot (\mu + \gamma)t,

and Cov(y_1(t), y_2(t)) = E y_1(t) y_2(t) - E y_1(t) \cdot E y_2(t) = \gamma t.⁵ Note that this model can be obtained by setting:

    y_1(t) = \alpha(t) + \beta_1(t), \qquad y_2(t) = \alpha(t) + \beta_2(t),

⁵When γ = 0 the covariance between y_1(t) and y_2(t) is equal to zero. But in that case the moment generating function of [y_1(t), y_2(t)] is the product of the moment generating functions of y_1(t) and y_2(t). This result implies that in such models noncorrelation is equivalent to independence. In the general case γ is positive. This parameter measures the correlation between y_1(t) and y_2(t). It is positive since y_1(t) and y_2(t) simultaneously increase when t increases; this positivity condition on γ does not imply a similar condition on the empirical correlation computed from the observations y_{1i}(t) and y_{2i}(t), when t is fixed and i is varying.


where α(t), β_1(t), β_2(t) are independent Poisson variables with parameters γt, λt, and μt respectively. This fact will be used in the multivariate case (see Section 6). Note that the difference y_1(t) - y_2(t) is equal to β_1(t) - β_2(t); this means that, if only y_1(t) - y_2(t) is observable, γ is not identifiable and independence between y_1(t) and y_2(t) can be assumed. More importantly, the difference between two independent Poisson variables provides a natural generalization of the usual Poisson variable which can be used when the discrete variable considered can take negative values. The specifications and the estimation methods proposed above can be easily extended to this case.
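A simulation sketch of this construction (not from the paper; the parameter values are arbitrary):

    import numpy as np

    rng = np.random.default_rng(4)
    lam, mu, gam, t, n = 2.0, 1.5, 0.8, 1.0, 200_000
    alpha = rng.poisson(gam * t, size=n)        # common component alpha(t), parameter gamma*t
    beta1 = rng.poisson(lam * t, size=n)        # type-1 specific component beta_1(t)
    beta2 = rng.poisson(mu * t, size=n)         # type-2 specific component beta_2(t)
    y1, y2 = alpha + beta1, alpha + beta2
    print(np.mean(y1), (lam + gam) * t)         # marginal mean of y1(t): (lambda + gamma)t
    print(np.mean(y2), (mu + gam) * t)          # marginal mean of y2(t): (mu + gamma)t
    print(np.cov(y1, y2)[0, 1], gam * t)        # covariance approximately gamma*t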

5.b. The Econometric Model

Let us consider a sample of n independent observations of (y_{1i}, y_{2i}) distributed as bivariate Poisson variables (i = 1, 2, ..., n). These observations are counts of events of types 1 and 2 during a period of length 1 (by convention). The econometric model will be complete if we define the moments of (y_{1i}, y_{2i}) as functions of explanatory variables and disturbances; by analogy with the previous section we will assume:

    \lambda_i = \exp(x_i b + \varepsilon_{1i}), \qquad \mu_i = \exp(z_i c + \varepsilon_{2i}), \qquad \gamma_i = \exp(m_i d + \varepsilon_{3i}),

with

    E \exp\varepsilon_{1i} = E \exp\varepsilon_{2i} = E \exp\varepsilon_{3i} = 1,
    \qquad V\begin{bmatrix} \exp\varepsilon_{1i} \\ \exp\varepsilon_{2i} \\ \exp\varepsilon_{3i} \end{bmatrix} = \Omega.

The complexity of the elementary probabilities P_{n,m} does not allow one to integrate out the distribution of the disturbances (\varepsilon_{1i}, \varepsilon_{2i}, \varepsilon_{3i}). In this bivariate case the maximum likelihood method is not tractable, whereas the estimation methods based on first and second order moments are applicable. The expectations of the observed variables are:

    E y_{1i} = \exp(x_i b) + \exp(m_i d), \qquad E y_{2i} = \exp(z_i c) + \exp(m_i d).


And the covariance matrix is given by:

    V\begin{pmatrix} y_{1i} \\ y_{2i} \end{pmatrix}
      = \begin{bmatrix} \exp(x_i b) + \exp(m_i d) & \exp(m_i d) \\ \exp(m_i d) & \exp(z_i c) + \exp(m_i d) \end{bmatrix}
      + A_i\, \Omega\, A_i',
    \qquad
    A_i = \begin{bmatrix} \exp(x_i b) & 0 & \exp(m_i d) \\ 0 & \exp(z_i c) & \exp(m_i d) \end{bmatrix}.


The nonlinear least squares estimators of b, c, and d are obtained by minimizing:

    \sum_{i=1}^{n} \big[ (y_{1i} - \exp(x_i b) - \exp(m_i d))^2 + (y_{2i} - \exp(z_i c) - \exp(m_i d))^2 \big].

Let us denote by \hat{b}, \hat{c}, \hat{d} these estimators and by w_{1i} and w_{2i} the estimated residuals:

    w_{1i} = y_{1i} - \exp(x_i \hat{b}) - \exp(m_i \hat{d}),
    \qquad
    w_{2i} = y_{2i} - \exp(z_i \hat{c}) - \exp(m_i \hat{d}).

Consistent estimators of \omega_{jk}, j, k = 1, 2, 3, the elements of Ω, are obtained by applying ordinary least squares to:

    w_{1i}^2 - \exp(x_i \hat{b}) - \exp(m_i \hat{d}) = \omega_{11}\exp(2x_i \hat{b}) + 2\omega_{13}\exp(x_i \hat{b} + m_i \hat{d}) + \omega_{33}\exp(2m_i \hat{d}) + \text{disturbance},

    w_{2i}^2 - \exp(z_i \hat{c}) - \exp(m_i \hat{d}) = \omega_{22}\exp(2z_i \hat{c}) + 2\omega_{23}\exp(z_i \hat{c} + m_i \hat{d}) + \omega_{33}\exp(2m_i \hat{d}) + \text{disturbance},

    w_{1i} w_{2i} - \exp(m_i \hat{d}) = \omega_{12}\exp(x_i \hat{b} + z_i \hat{c}) + \omega_{23}\exp(z_i \hat{c} + m_i \hat{d}) + \omega_{13}\exp(x_i \hat{b} + m_i \hat{d}) + \omega_{33}\exp(2m_i \hat{d}) + \text{disturbance}.
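The following sketch (not from the paper) illustrates these two steps with scalar covariates x_i, z_i, m_i; the variable names, the optimizer, and the ordering of the elements of Ω are illustrative choices.

    import numpy as np
    from scipy.optimize import minimize

    def bivariate_nls(x, z, m, y1, y2):
        # nonlinear least squares for (b, c, d); x, z, m are 1-dimensional arrays
        def sse(theta):
            b, c, d = theta
            r1 = y1 - np.exp(x * b) - np.exp(m * d)
            r2 = y2 - np.exp(z * c) - np.exp(m * d)
            return np.sum(r1 ** 2 + r2 ** 2)
        return minimize(sse, np.zeros(3), method="Nelder-Mead").x

    def omega_ols(x, z, m, y1, y2, b, c, d):
        # OLS estimates of (w11, w22, w33, w12, w13, w23), stacking the three
        # moment regressions written above
        ex, ez, em = np.exp(x * b), np.exp(z * c), np.exp(m * d)
        w1, w2 = y1 - ex - em, y2 - ez - em                  # estimated residuals
        zero = np.zeros_like(ex)
        lhs = np.concatenate([w1 ** 2 - ex - em, w2 ** 2 - ez - em, w1 * w2 - em])
        rhs = np.vstack([
            np.column_stack([ex ** 2, zero,    em ** 2, zero,    2 * ex * em, zero       ]),
            np.column_stack([zero,    ez ** 2, em ** 2, zero,    zero,        2 * ez * em]),
            np.column_stack([zero,    zero,    em ** 2, ex * ez, ex * em,     ez * em    ]),
        ])
        return np.linalg.lstsq(rhs, lhs, rcond=None)[0]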

The estimator \hat{\Omega} of Ω may be used to apply the QGPML method. The QGPML estimators of b, c, d are solutions of:

    \min_{b,c,d} \sum_{i=1}^{n} (y_{1i} - E y_{1i},\; y_{2i} - E y_{2i})\, \hat{V}_i^{-1} \begin{pmatrix} y_{1i} - E y_{1i} \\ y_{2i} - E y_{2i} \end{pmatrix},

where \hat{V}_i is obtained from V(y_{1i}, y_{2i})' in which b, c, d, Ω are replaced by \hat{b}, \hat{c}, \hat{d}, \hat{\Omega}. This QGPML estimator has an asymptotic covariance matrix equal to the lower bound of the asymptotic covariance matrices of the PML estimators.


6. THE MULTIVARIATE CASE

6.a. A Statistical Model for Panel Data


The bivariate Poisson model can be directly generalized to the case of m count variables (see Johnson and Kotz [10]). In this section, we only present a generalization related to the problem of panel data.

The observed variables are y_{ij}(t), i = 1, ..., I; j = 1, ..., J. We assume that

    y_{ij}(t) = \alpha_i(t) + \beta_j(t) + \gamma_{ij}(t) \qquad (i = 1, ..., I;\; j = 1, ..., J),

where \alpha_i(t), \beta_j(t), \gamma_{ij}(t) are independent Poisson variables with parameters \lambda_i t, \mu_j t, \gamma_{ij} t. The previous formulation is an error component model for count data: \alpha_i(t) and \beta_j(t) are the specific effects, \gamma_{ij}(t) is the residual effect. The moment generating function of the y_{ij}(t), i = 1, ..., I; j = 1, ..., J, is

    G(t, u_{ij};\; i = 1, ..., I;\; j = 1, ..., J)
      = E\Big( \prod_{i,j} u_{ij}^{y_{ij}(t)} \Big)
      = E\Big( \prod_{i,j} u_{ij}^{\alpha_i(t) + \beta_j(t) + \gamma_{ij}(t)} \Big)
      = \prod_i E\Big( \Big(\prod_j u_{ij}\Big)^{\alpha_i(t)} \Big) \prod_j E\Big( \Big(\prod_i u_{ij}\Big)^{\beta_j(t)} \Big) \prod_{i,j} E\big( u_{ij}^{\gamma_{ij}(t)} \big)
      = \exp\Big\{ t \Big[ \sum_i \lambda_i \Big(\prod_j u_{ij} - 1\Big) + \sum_j \mu_j \Big(\prod_i u_{ij} - 1\Big) + \sum_{i,j} \gamma_{ij} (u_{ij} - 1) \Big] \Big\}.

The marginal distribution of y_{ij} is P[(\lambda_i + \mu_j + \gamma_{ij}) t] and the second order moments are given by:

    V y_{ij} = (\lambda_i + \mu_j + \gamma_{ij}) t,

    Cov(y_{ij}, y_{kl}) = 0 \quad \text{if } i \neq k,\; j \neq l,

    Cov(y_{ij}, y_{il}) = \lambda_i t \quad \text{if } j \neq l,

    Cov(y_{ij}, y_{kj}) = \mu_j t \quad \text{if } i \neq k.
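A simulation sketch of this error component structure (not from the paper; the parameter values and dimensions are arbitrary and t is set to 1):

    import numpy as np

    rng = np.random.default_rng(5)
    I_dim, J_dim, n_rep = 4, 3, 100_000
    lam = rng.uniform(0.5, 1.5, I_dim)               # row parameters lambda_i
    mu = rng.uniform(0.5, 1.5, J_dim)                # column parameters mu_j
    gam = rng.uniform(0.2, 0.8, (I_dim, J_dim))      # residual parameters gamma_ij

    alpha = rng.poisson(lam, size=(n_rep, I_dim))            # alpha_i(1)
    beta = rng.poisson(mu, size=(n_rep, J_dim))              # beta_j(1)
    resid = rng.poisson(gam, size=(n_rep, I_dim, J_dim))     # gamma_ij(1)
    y = alpha[:, :, None] + beta[:, None, :] + resid         # y_ij = alpha_i + beta_j + gamma_ij

    print(np.cov(y[:, 0, 0], y[:, 0, 1])[0, 1], lam[0])      # same i, j != l: close to lambda_i
    print(np.cov(y[:, 0, 0], y[:, 1, 0])[0, 1], mu[0])       # same j, i != k: close to mu_j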


6.b. The Associated Econometric Model

As previously stated, the exogenous variables and the disturbances are introduced in the parameters \lambda_i, \mu_j, \gamma_{ij}. This introduction is done in such a way that the interpretation of \lambda_i, \mu_j, \gamma_{ij} as main and specific effects is maintained:

    \lambda_i = \exp(x_i b + \varepsilon_{1i}), \qquad \mu_j = \exp(z_j c + \varepsilon_{2j}), \qquad \gamma_{ij} = \exp(m_{ij} d + \varepsilon_{3ij}),

where the disturbances \varepsilon_{1i}, \varepsilon_{2j}, \varepsilon_{3ij} are assumed independent, and such that:

    E(\exp \varepsilon_{k\cdot}) = 1, \qquad V(\exp \varepsilon_{k\cdot}) = \omega_k^2 \qquad (k = 1, 2, 3).

Integrating with respect to the ε's, and putting by convention t = 1, the first and second order moments of y_{ij} are:

    E y_{ij} = \exp(x_i b) + \exp(z_j c) + \exp(m_{ij} d),

    V y_{ij} = E V(y_{ij} | \varepsilon) + V E(y_{ij} | \varepsilon)
             = \exp(x_i b) + \omega_1^2 \exp(2x_i b) + \exp(z_j c) + \omega_2^2 \exp(2z_j c) + \exp(m_{ij} d) + \omega_3^2 \exp(2m_{ij} d),

    Cov(y_{ij}, y_{kl}) = 0 \quad \text{if } i \neq k,\; j \neq l,

    Cov(y_{ij}, y_{kj}) = \exp(z_j c) + \omega_2^2 \exp(2z_j c) \quad \text{if } i \neq k,

    Cov(y_{ij}, y_{il}) = \exp(x_i b) + \omega_1^2 \exp(2x_i b) \quad \text{if } j \neq l.

The previous approach to double-indexed count data provides an alternative formulation to that proposed by Hausman, Hall, and Griliches [8]. The parameters may be estimated by the QGPML method.

7. CONCLUDING REMARKS

The properties of the Poisson model presented in this paper are similar to those
of Gaussian linear models: the distribution of the observations is characterized
by its first and second order moments and, in the multidimensional case, the lack
of correlation between two endogenous variables is equivalent to the indepen-
dence between these variables. These models are also extended in order to define,
in the count data context, the analogue of seemingly unrelated equations models
and error components models. Simple estimation methods based on the general
frameworks proposed in Gourieroux et al. [7] are available for all these models. It
would also be interesting to propose Poisson models where the occurrence of
some events between t and t + dt depends on what occurred before t; such


models would be similar to the autoregressive model and would be of particular importance, for instance, for the problems related to job search (see Heckman [9]).

CEPREMAP
and
Ecole Nationale de la Statistique et de l'Administration Economique

Manuscript received January, 1982; final revision received March, 1983.

REFERENCES
[1] CHESHER, A.: "Testing for Neglected Heterogeneity," Econometrica, 52(1984), forthcoming.
[2] COX, D. R., AND P. A. W. LEWIS: The Statistical Analysis of Series of Events. London, 1966.
[3] EL SAYYAD, G. M.: "Bayesian and Classical Analysis of Poisson Regression," Journal of the Royal Statistical Society, Series B, 35(1973), 445-451.
[4] FROME, E., M. KUTNER, AND J. BEAUCHAMP: "Regression Analysis of Poisson Distributed Data," Journal of the American Statistical Association, 68(1973), 935-940.
[5] GILBERT, C. L.: "Econometric Models for Discrete Economic Processes," Discussion Paper, University of Oxford, presented at the Econometric Society European Meeting, Athens, 1979.
[6] GOURIEROUX, C., AND A. MONFORT: "Asymptotic Properties of the Maximum Likelihood Estimator in Dichotomous Logit Model," Journal of Econometrics, 17(1982), 83-97.
[7] GOURIEROUX, C., A. MONFORT, AND A. TROGNON: "Pseudo Maximum Likelihood Methods: Theory," Econometrica, 52(1984), 681-700.
[8] HAUSMAN, J., B. HALL, AND Z. GRILICHES: "Econometric Models for Count Data with an Application to the Patents-R&D Relationship," Econometrica, 52(1984), forthcoming.
[9] HECKMAN, J.: "Statistical Model for Discrete Data," in Structural Analysis of Discrete Panel Data, ed. by C. Manski and D. McFadden. Cambridge: M.I.T. Press, 1981.
[10] JOHNSON, N. L., AND S. KOTZ: Distributions in Statistics: Discrete Distributions. Boston: Houghton Mifflin Co., 1969.
[11] LANCASTER, T.: "Prediction of Poisson Variates," mimeo, University of Hull, 1976.
[12] ---: "Econometric Methods for the Duration of Unemployment," Econometrica, 47(1979), 939-956.
[13] WHITE, H.: "Maximum Likelihood Estimation of Misspecified Models," Econometrica, 50(1982), 1-25.
