The Econometric Society is collaborating with JSTOR to digitize, preserve and extend access to Econometrica.
Pseudo maximum likelihood techniques are applied to basic Poisson models and to
Poisson models with specification errors. In the latter case it is shown that consistent and
asymptotically normal estimators can be obtained without specifying the p.d.f. of the
disturbances. These estimators are compared both from the finite sample and the asymptotic
point of view. Quasi generalized PML estimators, which asymptotically dominate all
PML estimators, are also proposed. Finally, bivariate and panel data Poisson models are
discussed.
1. INTRODUCTION
2. PRELIMINARIES
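In this section, the basic Poisson model and the Poisson model with specification
errors are described.
In the basic Poisson model, the $y_i$, $i = 1, \ldots, n$, are independent Poisson variables with
parameter $\lambda_i = \exp(x_i b)$,
where $x_i = (x_{i1}, \ldots, x_{iK})$ are K exogenous variables associated with the ith observation
and $b = (b_1, \ldots, b_K)'$ are K unknown parameters.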
The exponential function appearing in $\lambda_i$ is mainly justified by the positivity of
$\lambda_i$; a linear function $\lambda_i = x_i b$ would imply possibly incompatible constraints on
the parameters. The mean and the variance of $y_i$ are equal to $\lambda_i$ and the density
function of $y_i$ is
$$\frac{\exp(-\lambda_i)\,\lambda_i^{y_i}}{y_i!}.$$
The log-likelihood function is
$$L(b) = -\sum_{i=1}^n \lambda_i + \sum_{i=1}^n y_i \log\lambda_i - \sum_{i=1}^n \log(y_i!)$$
$$= \text{constant} - \sum_{i=1}^n \exp(x_i b) + \sum_{i=1}^n y_i x_i b.$$
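$$-\frac{\partial L}{\partial b} = \sum_{i=1}^n x_i'[\exp(x_i b) - y_i] = 0 \quad\text{or}\quad \sum_{i=1}^n x_i'(\hat\lambda_i - y_i) = 0.$$
1 When $x_i b$ contains a constant term, one of these equations is
$$\sum_{i=1}^n (\hat\lambda_i - y_i) = 0.$$
This equation implies that the sum of the residuals is equal to zero; therefore the comparison of the
computed sum of the residuals with zero can be used as an indicator of the algorithm accuracy.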
The Hessian matrix
$$\frac{\partial^2 L}{\partial b\,\partial b'} = -\sum_{i=1}^n x_i' x_i \exp(x_i b)$$
is negative definite, if $X'X = \sum_{i=1}^n x_i' x_i$ is of full rank; the log-likelihood function
is, in this case, strictly concave and can be easily maximized by the usual
algorithms.
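As an illustration of "the usual algorithms", the following is a minimal Newton-Raphson sketch for a model with a constant and one regressor. The function names, the simulated data, the starting value and the tolerance are assumptions made for this example, not part of the paper. It also checks that the residuals sum to zero at the solution.

```python
import math
import random

def poisson_draw(rng, mean):
    """One Poisson(mean) draw by Knuth's product-of-uniforms method (small means only)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def fit_poisson_ml(x, y, tol=1e-10, max_iter=50):
    """Newton-Raphson for lambda_i = exp(b0 + b1 * x_i); returns (b0, b1)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0   # start from the constant-only fit
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            lam = math.exp(b0 + b1 * xi)
            g0 += yi - lam              # score with respect to the constant
            g1 += xi * (yi - lam)       # score with respect to the slope
            h00 += lam                  # entries of minus the Hessian,
            h01 += xi * lam             # i.e. sum of x_i' x_i exp(x_i b)
            h11 += xi * xi * lam
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Simulated example; the true values (0.5, 1.0) are arbitrary.
rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
y = [poisson_draw(rng, math.exp(0.5 + 1.0 * xi)) for xi in x]
b0_hat, b1_hat = fit_poisson_ml(x, y)

# With a constant term the residuals should sum to (numerically) zero.
residual_sum = sum(yi - math.exp(b0_hat + b1_hat * xi) for xi, yi in zip(x, y))
```

Because the log-likelihood is strictly concave, no step-size control is used here; a production routine would normally add one.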
2 These equations are not always solvable. Let us consider the case K = 1; the likelihood equation
is $\sum_{i=1}^n x_i \exp(x_i b) = \sum_{i=1}^n x_i y_i$. The mapping $b \mapsto \sum_{i=1}^n x_i \exp(x_i b)$ is continuous and its range is
$]-\infty, +\infty[$ if $\min_i x_i < 0 < \max_i x_i$, $]-\infty, 0[$ if $\max_i x_i < 0$ and $]0, +\infty[$ if $\min_i x_i > 0$; therefore,
since the $y_i$ are nonnegative, the equation is solvable if $\min_i x_i < 0 < \max_i x_i$; otherwise a solution exists if,
and only if, at least one $y_i$ is non null. Also note that the maximum likelihood estimator $\hat b$ is such that
$$\tilde b = \Big(\sum_{i=1}^n x_i' x_i\Big)^{-1} \sum_{i=1}^n x_i' \exp(x_i \hat b) = \psi(\hat b) \ \text{(say)},$$
where $\tilde b$ is the O.L.S. estimator of b. By inverting $\psi$, it is possible to deduce the asymptotic
properties of $\hat b$ from those of $\tilde b$ (see Gourieroux and Monfort [6]).
3 The fact that the sample variance of y is greater than the sample mean does not invalidate the
basic Poisson model. In effect, if this model is true:
$$V(y) = V(E(y \mid x)) + E\,V(y \mid x) = V(E(y \mid x)) + E\,E(y \mid x) > E\,E(y \mid x) = E(y).$$
$$\frac{u^{\alpha-1}\exp(-u/\beta)}{\beta^{\alpha}\,\Gamma(\alpha)}.$$
This kind of model has been considered by Hausman, Hall, and Griliches [8]
and by Gilbert [5] (see also Lancaster [11]).
Assuming that there exists a constant term in $x_i b$, there is no loss of generality
in setting $E\exp(\varepsilon_i) = 1$. In that case $\beta = \alpha^{-1} = \eta^2 = V[\exp(\varepsilon_i)]$ and it is straightforward
to show (see, for instance, Gilbert [5]) that the distribution of $y_i$ given $x_i$
is a negative binomial distribution:
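$$\frac{\Gamma(\eta^{-2} + y_i)}{\Gamma(\eta^{-2})\,\Gamma(y_i + 1)}\,[\eta^2\exp(x_i b)]^{y_i}\,[1 + \eta^2\exp(x_i b)]^{-(y_i + \eta^{-2})}.$$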
$$E(y_i \mid x_i) = \exp(x_i b),$$
$$V(y_i \mid x_i) = \exp(x_i b) + \eta^2\exp(2x_i b).$$
These results will be used to estimate the true values $b_0, \eta_0^2$ of $b, \eta^2$ without
assumptions on g.
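These two moments follow from conditioning on the disturbance. The derivation below is a sketch; it assumes only that $y_i$ is Poisson with parameter $\exp(x_i b + \varepsilon_i)$ given $(x_i, \varepsilon_i)$, that $\varepsilon_i$ is independent of $x_i$, and that $E\exp(\varepsilon_i) = 1$ and $V[\exp(\varepsilon_i)] = \eta^2$.

```latex
\begin{align*}
E(y_i \mid x_i) &= E[\exp(x_i b + \varepsilon_i) \mid x_i]
                 = \exp(x_i b)\, E\exp(\varepsilon_i) = \exp(x_i b), \\
V(y_i \mid x_i) &= E[V(y_i \mid x_i, \varepsilon_i) \mid x_i]
                 + V[E(y_i \mid x_i, \varepsilon_i) \mid x_i] \\
                &= \exp(x_i b) + \exp(2 x_i b)\, V[\exp(\varepsilon_i)]
                 = \exp(x_i b) + \eta^2 \exp(2 x_i b).
\end{align*}
```

Only the first two moments of $\exp(\varepsilon_i)$ enter, which is why no assumption on g is needed.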
We shall consider three kinds of methods proposed in Gourieroux et al. [7],
which are pseudo maximum likelihood based on linear exponential families,
quasi-generalized pseudo maximum likelihood (QGPML), and pseudo maximum
likelihood based on quadratic exponential families.
4 It might have seemed natural to consider a generalized least squares estimator of b by minimizing
the following objective function:
$$\sum_{i=1}^n \frac{(y_i - \exp(x_i b))^2}{\exp(x_i b) + a\exp(2x_i b)},$$
where a is a given number. However such a method, based on the criterion
$$\varphi(u, m) = \frac{(u - m)^2}{m + a m^2},$$
is not always consistent. This result is a consequence of the proof of Theorem 2 in Gourieroux et al.
[7], which establishes a necessary condition for the consistency of the estimator obtained by
minimizing $\sum_i \varphi(y_i, \exp(x_i b))$. This condition is that $\partial\varphi/\partial m = \lambda(m)(u - m)$, which is not satisfied in
the present case.
(iv) $$\max_b \sum_{i=1}^n [-x_i b - y_i\exp(-x_i b)].$$
(iii) $$\sum_{i=1}^n x_i'\,\frac{y_i - \exp(x_i b)}{1 + a\exp(x_i b)} = 0.$$
Note that the last three objective functions are concave and can be easily
maximized by usual iterative methods, whereas the objective function associated
with nonlinear least squares is not concave.
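The four estimators can be computed with one routine, since each first-order condition has the form $\sum_i x_i' m_i (y_i - m_i)/v(m_i) = 0$ with $m_i = \exp(x_i b)$. The pseudo-variance $v(m)$ is $1$ (normal), $m$ (Poisson), $m(1 + am)$ (negative binomial with fixed $a$) or $m^2$ (gamma). The sketch below uses a scoring-type iteration rather than any algorithm named in the paper. The simulated design, the value $a = 0.5$ and all function names are assumptions made for the example.

```python
import math
import random

# Pseudo-variance v(m) attached to each linear exponential family.
PSEUDO_VARIANCE = {
    "normal": lambda m: 1.0,                  # nonlinear least squares
    "poisson": lambda m: m,
    "negbin": lambda m: m * (1.0 + 0.5 * m),  # a = 0.5 is an arbitrary fixed value
    "gamma": lambda m: m * m,
}

def poisson_draw(rng, mean):
    """One Poisson(mean) draw by Knuth's product-of-uniforms method."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def pml_fit(x, y, family, start, tol=1e-9, max_iter=200):
    """Solve sum_i x_i' m_i (y_i - m_i) / v(m_i) = 0 for m_i = exp(b0 + b1 x_i)."""
    v = PSEUDO_VARIANCE[family]
    b0, b1 = start
    for _ in range(max_iter):
        g0 = g1 = j00 = j01 = j11 = 0.0
        for xi, yi in zip(x, y):
            m = math.exp(b0 + b1 * xi)
            w = m / v(m)                  # derivative of m over pseudo-variance
            g0 += w * (yi - m)
            g1 += w * xi * (yi - m)
            j00 += w * m                  # expected-information-type matrix
            j01 += w * m * xi
            j11 += w * m * xi * xi
        det = j00 * j11 - j01 * j01
        d0 = (j11 * g0 - j01 * g1) / det
        d1 = (j00 * g1 - j01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Simulated overdispersed counts: exp(eps) is gamma with mean 1 and variance eta2.
rng = random.Random(0)
n, eta2 = 5000, 0.5
x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
y = [poisson_draw(rng, math.exp(0.5 + xi) * rng.gammavariate(1.0 / eta2, eta2))
     for xi in x]

# The Poisson fit starts from the constant-only model; the others start from it.
start = pml_fit(x, y, "poisson", (math.log(sum(y) / n), 0.0))
estimates = {name: pml_fit(x, y, name, start) for name in PSEUDO_VARIANCE}
```

Starting the other three fits at the Poisson estimate is a convenience; it matters most for the least squares criterion, which is not concave.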
We know from Gourieroux et al. [7, Theorems 2 and 3], that these four PML
estimators $\hat b_j$, $j = 1, \ldots, 4$, are strongly consistent and asymptotically normal
and that the asymptotic covariance matrix of $\sqrt{n}(\hat b_j - b_0)$ has the general form
$J^{-1} I J^{-1}$, where
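$$J = E_x\Big(\frac{\partial f'}{\partial b}\,\Sigma^{-1}\,\frac{\partial f}{\partial b'}\Big), \qquad I = E_x\Big(\frac{\partial f'}{\partial b}\,\Sigma^{-1}\,\Omega\,\Sigma^{-1}\,\frac{\partial f}{\partial b'}\Big),$$
with $f = \exp(xb)$, $\Sigma$ the variance implied by the pseudo family and $\Omega = \exp(xb_0) + \eta_0^2\exp(2xb_0)$ the true conditional variance, all evaluated at $b_0$.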
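(i) $$[E_x(x'x\exp 2xb_0)]^{-1}\,E_x\{x'x\exp 2xb_0\,(\exp xb_0 + \eta_0^2\exp 2xb_0)\}\,[E_x(x'x\exp 2xb_0)]^{-1},$$
(ii) $$[E_x(x'x\exp xb_0)]^{-1} + \eta_0^2\,[E_x(x'x\exp xb_0)]^{-1}\,E_x(x'x\exp 2xb_0)\,[E_x(x'x\exp xb_0)]^{-1},$$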
Note that, if the likelihood of the basic Poisson model is used, the estimator
obtained is consistent and asymptotically normal; however the previous
formula shows that the asymptotic covariance matrix obtained when it is
(wrongly) assumed that the correct model is the basic Poisson model, i.e.
$[E_x(x'x\exp xb_0)]^{-1}$, is "smaller" than the true one.
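A short worked version of this comparison for the Poisson pseudo family is given below. It assumes the sandwich form $J^{-1} I J^{-1}$ and the conditional variance $\exp(xb_0) + \eta_0^2\exp(2xb_0)$; the shorthand $m = \exp(xb_0)$ is introduced here for brevity.

```latex
\begin{align*}
J &= E_x(x'x\, m), \qquad m = \exp(x b_0), \\
I &= E_x\big[x'x\,(m + \eta_0^2 m^2)\big] = J + \eta_0^2\, E_x(x'x\, m^2), \\
J^{-1} I J^{-1} &= J^{-1} + \eta_0^2\, J^{-1} E_x(x'x\, m^2)\, J^{-1}.
\end{align*}
```

The second term is positive semidefinite, so $J^{-1}$ understates the true covariance matrix whenever $\eta_0^2 > 0$.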
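(iii) $$\Big[E_x\Big(\frac{x'x\exp xb_0}{1 + a\exp xb_0}\Big)\Big]^{-1} E_x\Big(\frac{x'x\exp xb_0\,(1 + \eta_0^2\exp xb_0)}{(1 + a\exp xb_0)^2}\Big) \Big[E_x\Big(\frac{x'x\exp xb_0}{1 + a\exp xb_0}\Big)\Big]^{-1},$$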
(iv) $$\eta_0^2\,(E_x x'x)^{-1} + (E_x x'x)^{-1}\,E_x[x'x\exp(-xb_0)]\,(E_x x'x)^{-1}.$$
n~~~~
(_,i'ix 2iz(xi, bA)
Nil
Does there exist one of the previous PML estimators uniformly better than the
others, i.e. with a smaller asymptotic covariance matrix for all possible distributions
of the error $\varepsilon_i$? Heuristically the answer is "no." For instance, if $\varepsilon_i = 0$, the
PML estimator based on the Poisson family is exactly the ML estimator; it is
asymptotically efficient and in particular better than the other three estimators.
Whereas, if $\varepsilon_i$ has a gamma distribution, the PML estimator based on the
negative binomial family is asymptotically efficient and better than the others.
However, as we shall see in the next section, it is possible to build estimators
(QGPML estimators), which are asymptotically uniformly better than the four
previous PML estimators.
Finally note that, in the same spirit as White [13] or Chesher [1], it would be
possible to perform a specification test for the distribution of the errors by
comparing to zero an estimate of J - I. However as noted by White [13], the
computation of the test statistic can be cumbersome.
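$$\hat\eta^2 = \frac{\sum_{i=1}^n \big[(y_i - \exp(x_i\hat b_j))^2 - \exp(x_i\hat b_j)\big]\exp(2x_i\hat b_j)}{\sum_{i=1}^n \exp(4x_i\hat b_j)},$$
$$\min_b \sum_{i=1}^n \frac{(y_i - \exp(x_i b))^2}{\exp(x_i\hat b_j) + \hat\eta^2\exp(2x_i\hat b_j)}. \qquad (1)$$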
$$\Big\{E_x\Big[\frac{x'x\exp(xb_0)}{1 + \eta_0^2\exp(xb_0)}\Big]\Big\}^{-1}.$$
This covariance matrix is smaller than the asymptotic covariance matrices of the
PML estimators of the previous section.
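A two-step QGPML computation can be sketched as follows. The first step is a Poisson PML fit. The estimator of $\eta^2$ used here, a least squares fit of $(y_i - \hat m_i)^2 - \hat m_i$ on $\hat m_i^2$, is an assumption chosen for the illustration; it follows from $V(y_i \mid x_i) = m_i + \eta^2 m_i^2$. The second step is weighted nonlinear least squares with the variance held fixed at its first-step estimate. All names and the simulated design are hypothetical.

```python
import math
import random

def poisson_draw(rng, mean):
    """One Poisson(mean) draw by Knuth's product-of-uniforms method."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def weighted_fit(x, y, start, fixed_var=None, tol=1e-9, max_iter=200):
    """Solve sum_i x_i' m_i (y_i - m_i) / v_i = 0 with m_i = exp(b0 + b1 x_i).
    fixed_var=None means v_i = m_i (Poisson PML); otherwise v_i is held fixed."""
    b0, b1 = start
    for _ in range(max_iter):
        g0 = g1 = a00 = a01 = a11 = 0.0
        for i, (xi, yi) in enumerate(zip(x, y)):
            m = math.exp(b0 + b1 * xi)
            w = 1.0 if fixed_var is None else m / fixed_var[i]
            g0 += w * (yi - m)
            g1 += w * xi * (yi - m)
            a00 += w * m
            a01 += w * m * xi
            a11 += w * m * xi * xi
        det = a00 * a11 - a01 * a01
        d0 = (a11 * g0 - a01 * g1) / det
        d1 = (a00 * g1 - a01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

def qgpml(x, y):
    """Return ((b0, b1), eta2_hat) from the two-step procedure described above."""
    # Step 1: consistent first-stage estimate of b (Poisson PML).
    b0, b1 = weighted_fit(x, y, (math.log(sum(y) / len(y)), 0.0))
    m = [math.exp(b0 + b1 * xi) for xi in x]
    # Step 1b (assumed estimator): regress (y - m)^2 - m on m^2 without intercept.
    num = sum(((yi - mi) ** 2 - mi) * mi * mi for yi, mi in zip(y, m))
    den = sum(mi ** 4 for mi in m)
    eta2_hat = max(num / den, 0.0)
    # Step 2: weighted least squares with the variance frozen at its estimate.
    var_hat = [mi + eta2_hat * mi * mi for mi in m]
    return weighted_fit(x, y, (b0, b1), fixed_var=var_hat), eta2_hat

# Simulated example with gamma heterogeneity; true values are arbitrary.
rng = random.Random(0)
true_eta2 = 0.5
x = [rng.uniform(-1.0, 1.0) for _ in range(5000)]
y = [poisson_draw(rng, math.exp(0.5 + xi) * rng.gammavariate(1.0 / true_eta2, true_eta2))
     for xi in x]
(b0_q, b1_q), eta2_hat = qgpml(x, y)
```

Truncating the estimate of $\eta^2$ at zero keeps the second-step weights positive; it is a practical safeguard, not something the text prescribes.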
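If we expand the covariance matrix of the QGPML estimators in a neighborhood
of $\eta_0^2 = 0$, we have:
$$\Big\{E_x\Big[\frac{x'x\exp(xb_0)}{1 + \eta_0^2\exp(xb_0)}\Big]\Big\}^{-1} = \{E_x[x'x\exp(xb_0)(1 - \eta_0^2\exp(xb_0))]\}^{-1} + o(\eta_0^2)$$
$$= [E_x(x'x\exp(xb_0)) - \eta_0^2 E_x(x'x\exp(2xb_0))]^{-1} + o(\eta_0^2)$$
$$= [E_x(x'x\exp xb_0)]^{-1} + \eta_0^2\,[E_x(x'x\exp xb_0)]^{-1} E_x(x'x\exp 2xb_0)\,[E_x(x'x\exp xb_0)]^{-1} + o(\eta_0^2),$$
which is, up to $o(\eta_0^2)$, the asymptotic covariance matrix of the PML estimator based on the Poisson family.
This shows that the PML estimator based on the Poisson family is second order
efficient in a neighborhood of $\eta_0^2 = 0$.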
for instance on the normal family. The estimators are defined as the solutions of
the optimization problem:
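$$\max_{b,\eta^2} \Psi(b, \eta^2) = -\sum_{i=1}^n \log[\exp(x_i b) + \eta^2\exp(2x_i b)] - \sum_{i=1}^n \frac{[y_i - \exp(x_i b)]^2}{\exp(x_i b) + \eta^2\exp(2x_i b)},$$
$$\frac{\partial\Psi(b, \eta^2)}{\partial b} = -\sum_{i=1}^n x_i'\Big\{\frac{1 + 2\eta^2\exp(x_i b)}{1 + \eta^2\exp(x_i b)}\Big[1 - \frac{[y_i - \exp(x_i b)]^2}{[1 + \eta^2\exp(x_i b)]\exp(x_i b)}\Big] - 2\,\frac{y_i - \exp(x_i b)}{1 + \eta^2\exp(x_i b)}\Big\} = 0,$$
$$\frac{\partial\Psi(b, \eta^2)}{\partial(\eta^2)} = -\sum_{i=1}^n \frac{\exp(x_i b)}{1 + \eta^2\exp(x_i b)}\Big\{1 - \frac{[y_i - \exp(x_i b)]^2}{[1 + \eta^2\exp(x_i b)]\exp(x_i b)}\Big\} = 0.$$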
= -
2) =
,
C(m,2 D(m,a2) __I
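$$h(b, \eta^2) = \begin{bmatrix} \exp xb \\ \exp xb + \eta^2\exp 2xb \end{bmatrix},$$
$$\frac{\partial h'(b, \eta^2)}{\partial (b', \eta^2)'} = \begin{bmatrix} x'\exp(xb) & x'[\exp(xb) + 2\eta^2\exp(2xb)] \\ 0 & \exp(2xb) \end{bmatrix}.$$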
Finally, we can remark that the objective function associated with the negative
binomial family,
$$\sum_{i=1}^n \log\Big\{\frac{\Gamma(1/\eta^2 + y_i)}{\Gamma(1/\eta^2)\,\Gamma(y_i + 1)}\,[\eta^2\exp(x_i b)]^{y_i}\,[1 + \eta^2\exp(x_i b)]^{-(y_i + 1/\eta^2)}\Big\},$$
does not satisfy the necessary condition of Theorem 7 in Gourieroux et al. [7] for
the strong consistency of the PML estimator. Therefore the method described in
2.b. is inconsistent for some true distributions of $\exp\varepsilon_i$ (different from the gamma
distribution).
{EX[x'x exp(Oxb0)] } 1
From these results, the asymptotic covariance matrices of the PML estimators of
12)[I + 4bobo'];
+ 2exp(411bo
Since Vas(b*)>> Vas(b*) and Vas(b*)>> Vas(b:), the PML estimator based on the
normal pseudo family, i.e. the nonlinear least squares estimator is always domi-
nated by the other PML estimators. As far as b* and b* are concerned, none of
these estimators is uniformly better than the other; more precisely: VaSb >>
VasbZ if and only if
2
< exp -a- . l
TABLE I
FIGURE 1-Empirical distribution of the PML estimator of the slope coefficient (panels: Normal, Poisson, Negative Binomial, Gamma; horizontal axis from -1 to 2).
FIGURE 2-Comparison between the empirical and the asymptotic distributions of the
PML estimator based on the gamma family.
HYPOTHESIS H1: $y_1(t)$ and $y_2(t)$ are the number of events of types 1 and 2
which occur between 0 and t.
HYPOTHESIS H3: Between t and t + dt, will occur: (i) one event of type 1 and
no event of type 2 with probability
$$\lambda\,dt + o(dt) \quad \Big[\lim_{dt \to 0} \frac{o(dt)}{dt} = 0\Big];$$
(ii) one event of type 2 and no event of type 1 with probability $\mu\,dt + o(dt)$;
(iii) one event of type 1 and one event of type 2 with probability $\gamma\,dt + o(dt)$;
(iv) no event with probability $1 - \lambda\,dt - \mu\,dt - \gamma\,dt + o(dt)$.
The distribution of the process $[y_1(t), y_2(t)]$ can be easily obtained. Let us
denote by $p_{n,m}(t)$ the probability that $y_1(t) = n$ and $y_2(t) = m$ and by
$G(t, u, v) = E\,u^{y_1(t)} v^{y_2(t)}$ the moment generating function. By definition, we have the
following difference equation:
$$\frac{dp_{n,m}(t)}{dt} = (-\lambda - \mu - \gamma)\,p_{n,m}(t) + \lambda\,p_{n-1,m}(t) + \mu\,p_{n,m-1}(t) + \gamma\,p_{n-1,m-1}(t).$$
This relation implies the following differential equation for the moment
generating function:
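As a sketch of that step, derived here from Hypothesis H3 rather than quoted: multiplying the equation for $p_{n,m}(t)$ by $u^n v^m$ and summing over n and m gives a linear equation in G. The initial condition $G(0, u, v) = 1$, i.e. no event at time 0, is assumed.

```latex
\begin{align*}
\frac{\partial G(t,u,v)}{\partial t}
  &= \big[\lambda(u - 1) + \mu(v - 1) + \gamma(uv - 1)\big]\, G(t,u,v), \qquad G(0,u,v) = 1, \\
G(t,u,v)
  &= \exp\{\lambda t (u - 1)\}\, \exp\{\mu t (v - 1)\}\, \exp\{\gamma t (uv - 1)\}.
\end{align*}
```

The three factors are the generating functions of independent Poisson counts with parameters $\lambda t$, $\mu t$ and $\gamma t$, the last one entering both $y_1(t)$ and $y_2(t)$.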
5 When $\gamma = 0$ the covariance between $y_1(t)$ and $y_2(t)$ is equal to zero. But in that case the moment
generating function of $[y_1(t), y_2(t)]$ is the product of the moment generating functions of $y_1(t)$ and
$y_2(t)$. This result implies that in such models noncorrelation is equivalent to independence. In the
general case $\gamma$ is positive. This parameter measures the serial correlation between $y_1(t)$ and $y_2(t)$. It is
positive since $y_1(t)$ and $y_2(t)$ simultaneously increase, when t increases; this positivity condition on $\gamma$
does not imply a similar condition on the empirical correlation computed from the observations $y_{1i}(t)$
and $y_{2i}(t)$, when t is fixed and i is varying.
where $\alpha(t)$, $\beta_1(t)$, $\beta_2(t)$ are independent Poisson variables with parameters $\gamma t$, $\lambda t$
and $\mu t$ respectively. This fact will be used in the multivariate case (see Section 5).
Note that the difference $y_1(t) - y_2(t)$ is equal to $\beta_1(t) - \beta_2(t)$; this means that, if
only $y_1(t) - y_2(t)$ is observable, $\gamma$ is not identifiable and independence between
$y_1(t)$ and $y_2(t)$ can be assumed. More importantly, the difference between two
independent Poisson variables provides a natural generalization of the usual
Poisson variable which can be used when the discrete variable considered can
take negative values. The specifications and the estimation methods proposed
above can be easily extended to this case.
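The common-shock representation is easy to simulate. The sketch below draws $y_1 = \alpha + \beta_1$ and $y_2 = \alpha + \beta_2$ from three independent Poisson variables and computes the sample means and covariance, which should be close to $(\lambda + \gamma)t$, $(\mu + \gamma)t$ and $\gamma t$. The parameter values and function names are arbitrary choices for the illustration.

```python
import math
import random

def poisson_draw(rng, mean):
    """One Poisson(mean) draw by Knuth's product-of-uniforms method."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def bivariate_poisson(rng, lam, mu, gamma, t=1.0):
    """Return (y1, y2) with y1 = alpha + beta1 and y2 = alpha + beta2."""
    alpha = poisson_draw(rng, gamma * t)   # simultaneous events of both types
    beta1 = poisson_draw(rng, lam * t)     # events of type 1 only
    beta2 = poisson_draw(rng, mu * t)      # events of type 2 only
    return alpha + beta1, alpha + beta2

rng = random.Random(1)
draws = [bivariate_poisson(rng, lam=2.0, mu=3.0, gamma=1.0) for _ in range(20000)]

n = len(draws)
mean1 = sum(y1 for y1, _ in draws) / n
mean2 = sum(y2 for _, y2 in draws) / n
cov12 = sum((y1 - mean1) * (y2 - mean2) for y1, y2 in draws) / n
```

The difference $y_1 - y_2$ never involves the common component, which is the identification point made in the text.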
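$$\gamma_i = \exp(m_i d + \varepsilon_{3i})$$
with
$$E\exp\varepsilon_{1i} = E\exp\varepsilon_{2i} = E\exp\varepsilon_{3i} = 1, \qquad V\begin{bmatrix} \exp\varepsilon_{1i} \\ \exp\varepsilon_{2i} \\ \exp\varepsilon_{3i} \end{bmatrix} = \Omega.$$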
The complexity of the elementary probabilities $p_{n,m}$ does not allow one to
integrate out the distribution of the disturbances $(\varepsilon_{1i}, \varepsilon_{2i}, \varepsilon_{3i})$. In this bivariate
case the maximum likelihood method is not tractable, whereas the estimation
methods based on first and second order moments are applicable. The expectations
of the observed variables are:
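$$E\begin{bmatrix} y_{1i} \\ y_{2i} \end{bmatrix} = \begin{bmatrix} \exp(x_i b) + \exp(m_i d) \\ \exp(z_i c) + \exp(m_i d) \end{bmatrix},$$
and their covariance matrix is
$$\begin{bmatrix} \exp(x_i b) + \exp(m_i d) & \exp(m_i d) \\ \exp(m_i d) & \exp(z_i c) + \exp(m_i d) \end{bmatrix} + \begin{bmatrix} \exp(x_i b) & 0 & \exp(m_i d) \\ 0 & \exp(z_i c) & \exp(m_i d) \end{bmatrix} \Omega \begin{bmatrix} \exp(x_i b) & 0 \\ 0 & \exp(z_i c) \\ \exp(m_i d) & \exp(m_i d) \end{bmatrix}.$$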
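Let us denote by $\hat b, \hat c, \hat d$ these estimators and by $\hat w_{1i}$ and $\hat w_{2i}$ the estimated
residuals:
$$\hat w_{1i} = y_{1i} - \exp(x_i\hat b) - \exp(m_i\hat d),$$
$$\hat w_{2i} = y_{2i} - \exp(z_i\hat c) - \exp(m_i\hat d).$$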
The estimator $\hat\Omega$ of $\Omega$ may be used to apply the QGPML method. The QGPML
estimators of b, c, d are solutions of:
V~~~~~~~~~
Zii A)Z
where
is obtained from
where $\alpha_i(t)$, $\beta_j(t)$, $\gamma_{ij}(t)$ are independent Poisson variables with parameters $\lambda_i t$,
$\mu_j t$, $\gamma_{ij} t$. The previous formulation is an error component model for count data:
$\alpha_i(t)$ and $\beta_j(t)$ are the specific effects, $\gamma_{ij}(t)$ is the residual effect. The moment
generating function of $y_{ij}(t)$, $i = 1, \ldots, I$; $j = 1, \ldots, J$ is
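$$G(t, u_{ij};\ i = 1, \ldots, I;\ j = 1, \ldots, J) = E\Big(\prod_{i,j} u_{ij}^{y_{ij}(t)}\Big) = E\Big(\prod_{i,j} u_{ij}^{\alpha_i(t) + \beta_j(t) + \gamma_{ij}(t)}\Big)$$
$$= \prod_i E\Big(\Big(\prod_j u_{ij}\Big)^{\alpha_i(t)}\Big)\ \prod_j E\Big(\Big(\prod_i u_{ij}\Big)^{\beta_j(t)}\Big)\ \prod_{i,j} E\big(u_{ij}^{\gamma_{ij}(t)}\big)$$
$$= \exp\Big\{t\Big[\sum_i \lambda_i\Big(\prod_j u_{ij} - 1\Big) + \sum_j \mu_j\Big(\prod_i u_{ij} - 1\Big) + \sum_{i,j} \gamma_{ij}(u_{ij} - 1)\Big]\Big\}.$$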
Integrating with respect to the E's, and putting by convention t = 1, the first and
second order moments of Yi are
+ exp m d + co2exp 2m d,
7. CONCLUDING REMARKS
The properties of the Poisson model presented in this paper are similar to those
of Gaussian linear models: the distribution of the observations is characterized
by its first and second order moments and, in the multidimensional case, the lack
of correlation between two endogenous variables is equivalent to the independence
between these variables. These models are also extended in order to define,
in the count data context, the analogue of seemingly unrelated equations models
and error components models. Simple estimation methods based on the general
frameworks proposed in Gourieroux et al. [7] are available for all these models. It
would also be interesting to propose Poisson models where the occurrence of
some events between t and t + dt depends on what occurred before t; such
CEPREMAP
and
Ecole Nationale de la Statistique et de l'Administration Economique
REFERENCES
[1] CHESHER, A.: "Testing for Neglected Heterogeneity," Econometrica, 52(1984), forthcoming.
[2] COX, D. R., AND P. A. W. LEWIS: "The Statistical Analysis of Series of Events," London, 1966.
[3] EL SAYYAD, G. M.: "Bayesian and Classical Analysis of Poisson Regression," Journal of the
Royal Statistical Society, Series B, 35(1973), 445-451.
[4] FROME, E., M. KUTNER, AND J. BEAUCHAMP: "Regression Analysis of Poisson Distributed Data,"
Journal of the American Statistical Association, 68(1973), 935-940.
[5] GILBERT, C. L.: "Econometric Models for Discrete Economic Processes," Discussion Paper,
University of Oxford, presented at the Econometric Society European Meeting, Athens, 1979.
[6] GOURIEROUX, C., AND A. MONFORT: "Asymptotic Properties of the Maximum Likelihood
Estimator in Dichotomous Logit Model," Journal of Econometrics, 17(1982), 83-97.
[7] GOURIEROUX, C., A. MONFORT, AND A. TROGNON: "Pseudo Maximum Likelihood Methods:
Theory," Econometrica, 52(1984), 681-700.
[8] HAUSMAN, J., B. HALL, AND Z. GRILICHES: "Econometrics Models for Count Data with an
Application to the Patents R. & D. Relationship," Econometrica, 52(1984), forthcoming.
[9] HECKMAN, J.: "Statistical Model for Discrete Data," in Structural Analysis of Discrete Panel
Data, ed. by C. Manski & D. McFadden. Cambridge: M.I.T. Press, 1981.
[10] JOHNSON, N. L., AND S. KOTZ: Distributions in Statistics: Discrete Distributions. Boston:
Houghton Mifflin Co., 1969.
[11] LANCASTER, T.: "Prediction of Poisson Variates," mimeo, University of Hull, 1976.
[12] LANCASTER, T.: "Econometric Methods for the Duration of Unemployment," Econometrica, 47(1979),
939-956.
[13] WHITE, H.: "Maximum Likelihood Estimation of Misspecified Models," Econometrica, 50(1982),
1-25.