ABSTRACT. In this paper a class of multivariate dispersion models generated from the multivariate Gaussian copula is presented. Being a multivariate extension of Jørgensen's (1987a) dispersion models, this class of multivariate models is parametrized by marginal position, dispersion and dependence parameters, producing a large variety of multivariate discrete and continuous models including the multivariate normal as a special case. Properties of the multivariate distributions are investigated, some of which are similar to those of the multivariate normal distribution, which makes these models potentially useful for the analysis of correlated non-normal data in a way analogous to that of multivariate normal data. As an example, we illustrate an application of the models to the regression analysis of longitudinal data, and establish an asymptotic relationship between the likelihood equation and the generalized estimating equation of Liang & Zeger (1986).
Key words: copula, dependence, dispersion model, generalized estimating equation, general-
ized linear model, longitudinal data, regression, small-dispersion asymptotics
1. Introduction
Dispersion models, introduced first by Jørgensen (1987a) as the class of error distributions for generalized linear models, have drawn a lot of attention in the literature. The dispersion models contain many commonly used distributions such as the normal, Poisson, gamma, binomial, negative binomial, inverse Gaussian, compound Poisson, von Mises and simplex distributions (Barndorff-Nielsen & Jørgensen, 1991). The dispersion models have properties similar to those of the normal distribution, which makes the class of models useful in many statistical areas. See Jørgensen (1997) for details.
The attempt to identify a multivariate extension of the dispersion models has long been of interest in the literature. An early version of such an extension was proposed by Jørgensen (1987a), whose class of models has only a single parameter available for modelling the correlation structure, and hence very limited flexibility for use in multivariate analysis. A more recent multivariate extension was discussed by Jørgensen & Lauritzen (1998), in which the density of the multivariate dispersion model is defined in a form similar to that of the multivariate normal. However, their models are not marginally closed, in the sense that the marginal distributions may not, in general, lie in the given distribution class. This drawback limits the use of these distributions in, for instance, regression analysis.
For the regression analysis of multivariate data, the pattern of marginal means is often of interest and expected to be explicitly modelled as a function of covariates (explanatory variables). This naturally gives rise to the development of multivariate distributions that can provide the given margins. The focus of this paper is on the application of Sklar's (1959) copula approach to a multivariate construction of the dispersion models, which leads to a class of multivariate models that are marginally closed.
Constructing multivariate distributions by means of copulas has proved popular in recent
years; see for example Joe (1993, 1997), Hutchinson & Lai (1990), and Marshall & Olkin (1988).

306 P. X.-K. Song Scand J Statist 27

The motivation for the copula approach is probably rooted in the aim of forming
multivariate non-normal distributions by combining given non-normal marginal models, in
a certain way, with dependence patterns. Being a multivariate joint distribution which
contains only information regarding dependence, a copula produces new multivariate
distributions whenever new suitable margins are fed into it. Amongst several types of
copulas available in the literature, we are particularly interested in the multivariate
Gaussian copula in this paper, which is "extracted" from the multivariate normal distribution, say N_m(μ, Γ), where Γ = (γ_ij) is a Pearson correlation matrix. It is noted that in the Gaussian copula the correlation matrix Γ is responsible for the dependence, the entries of which may vary between −1 and 1, accommodating both negative and positive dependence. Combining the copula with dispersion model margins produces a large variety of multivariate distributions in a unified fashion, including both continuous and discrete distributions such as the multivariate gamma, multivariate binomial and multivariate Poisson.
The copula approach to generating multivariate distributions gives rise to the question of how to interpret the dependence matrix Γ in the new multivariate distributions. It turns out that in continuous models its (i, j)th entry, γ_ij, gives a non-linear dependence measurement for the (i, j)th pair of components in the sense of the normal scoring ν introduced in the present paper. This dependence measure can also be extended for use in discrete models, as shown in the binomial and Poisson models discussed in detail in this paper.
This Gaussian copula approach makes the multivariate models constructed appear to have many properties similar to the multivariate normal distribution. For instance, the model is "reproducible" in the sense that a subvector has a distribution of the same form as that of the full vector. This property is satisfied neither by the log-linear model of Bishop et al. (1975) nor by the quadratic exponential model of Zhao & Prentice (1990) in the binary variable case. Also, because the parametrization of our model is similar to that of the multivariate normal, the constraint between the correlations and the marginal means is rather simple, circumventing the drawback of Bahadur's representation (see Bahadur, 1961) for the binary case (see Diggle et al., 1994, p. 149).
It is shown that Jørgensen & Lauritzen's (1998) multivariate dispersion models effectively turn out to be the limiting distributions of our multivariate dispersion models for small dispersion parameters. This result links the two classes of multivariate extensions of dispersion models, and helps us to understand the multivariate generalization of dispersion models from a different point of view.
The class of models is finally applied to the longitudinal regression model, where response variables are assumed to follow multivariate exponential dispersion models. We study the maximum likelihood estimation of regression parameters under the assumed joint density, and show that Liang & Zeger's generalized estimating equation turns out to be an approximate version of our likelihood equation. This result might be regarded as a complement to the result obtained by Fitzmaurice et al. (1993), who proved that the likelihood equations for binary regression parameters are of exactly the same form as the generalized estimating equation (GEE) for binary longitudinal responses. Theoretically, our result gives an asymptotic justification for the GEE approach in a wider range of distributions than the binary case.
This paper is organized as follows. Section 2 discusses the construction of multivariate
dispersion models, and then gives three examples: binomial, Poisson and gamma. Some
properties are studied in section 3, and section 4 illustrates an application of the multivariate
models to a longitudinal regression analysis.
2.1. Copula
Let u_{−S} be a subvector of u = (u_1, ..., u_m)^T with those components indicated by the set S being omitted, where S is a subset of the indices {1, ..., m}. A mapping C: (0, 1)^m → (0, 1) is called a copula if (1) it is a continuous distribution function; and (2) each margin is a univariate uniform distribution, namely

lim_{u_{−i} → 1} C(u) = u_i,  u_i ∈ (0, 1),

where the limit is taken under u_j → 1 for all j ≠ i. Clearly, lim_{u_j → 0} C(u) = 0 for any j = 1, ..., m. It is easy to prove that for any subset S, the marginal obtained by lim_{u_{−S} → 1} C(u) is also a copula. Copulas are easy to construct from a given multivariate distribution.
If X = (X_1, ..., X_m)^T ~ H, where H is an m-dimensional distribution function with margins H_1, ..., H_m, then the copula is of the form

C_H(u_1, ..., u_m) = H{H_1^{-1}(u_1), ..., H_m^{-1}(u_m)},  u_i ∈ (0, 1), i = 1, ..., m,

provided that the marginal inverse distribution functions H_i^{-1} of H_i exist. An important special case (which is the focus of this paper) is obtained when X ~ N_m(0, Γ) with standardized margins and H_i = Φ. The m-dimensional Gaussian copula is denoted by C_Φ(u | Γ), and its density is given by

c_Φ(u | Γ) = |Γ|^{-1/2} exp{−(1/2) q^T Γ^{-1} q + (1/2) q^T q} = |Γ|^{-1/2} exp{(1/2) q^T (I_m − Γ^{-1}) q}   (1)

where q = (q_1, ..., q_m)^T with normal scores q_i = Φ^{-1}(u_i), i = 1, ..., m.
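As a concrete check on (1), for m = 2 the determinant and inverse of Γ have closed forms (|Γ| = 1 − γ² and the usual 2 × 2 inverse), so the density can be evaluated directly. A minimal Python sketch, using only the standard library; the function name is mine, not from the paper:

```python
import math
from statistics import NormalDist

_nd = NormalDist()  # standard normal: cdf, inv_cdf and pdf

def gaussian_copula_density(u1, u2, gamma):
    """Bivariate version of density (1): c_Phi(u | Gamma) with
    Gamma = [[1, gamma], [gamma, 1]], so |Gamma| = 1 - gamma^2."""
    q1, q2 = _nd.inv_cdf(u1), _nd.inv_cdf(u2)  # normal scores q_i = Phi^{-1}(u_i)
    g2 = gamma * gamma
    # exponent = (1/2) q^T (I - Gamma^{-1}) q, written out for m = 2
    expo = (2 * gamma * q1 * q2 - g2 * (q1 * q1 + q2 * q2)) / (2 * (1 - g2))
    return math.exp(expo) / math.sqrt(1 - g2)
```

At γ = 0 the exponent vanishes and the density is identically 1, recovering the independence copula.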
Figure 1 shows four contour plots of bivariate density functions of bivariate Gaussian copulas with different values of the correlation parameter γ, namely −0.9, −0.5, 0.5 and 0.9. The Gaussian copulas with negative values of γ turn out to be concentrated in the opposite direction to those with positive values of γ, reflecting, respectively, the negative correlation and positive correlation between the variables u_1 and u_2, as desired.
It is shown in Joe (1997, sect. 5.1) that the bivariate Gaussian copula attains the lower Fréchet bound max{0, u_1 + u_2 − 1}, independence, or the upper Fréchet bound min{u_1, u_2}, according to whether the corresponding correlation parameter equals −1, 0, or 1.
2.2. Construction
By complementing the copula C_H with given margins, say F_1, ..., F_m, a new multivariate distribution can be obtained by

G(y) = C_H{F_1(y_1), ..., F_m(y_m)}.   (2)

One of its properties is that the ith margin of G gives the original F_i; namely, the distribution is marginally closed.
A class of m-variate multivariate dispersion models is obtained from (2) when the copula C_H = C_Φ(· | Γ) and the margins F_i are dispersion models. The multivariate dispersion models, denoted by MDM_m(μ, σ², Γ), are parametrized by three sets of parameters, μ = (μ_1, ..., μ_m)^T,
Fig. 1. Four contours of bivariate Gaussian copula densities with different values of γ.
where d is the regular unit deviance. Exponential dispersion (ED) models and proper dispersion (PD) models are two important special classes of dispersion models, given respectively as follows. When d in (3) takes the form

d(y_i; μ_i) = y_i ζ_1(μ_i) + ζ_2(μ_i) + ζ_3(y_i)

for suitable functions ζ_1, ζ_2 and ζ_3, the dispersion model is indeed an exponential dispersion model with mean μ_i and dispersion parameter σ_i². If a in (3) is factorized in the form b(σ_i²)c(y_i), the dispersion model becomes a proper dispersion model. See Jørgensen (1997, sect. 1.2) for more details.
In parallel, we obtain the multivariate exponential dispersion model and the multivariate proper dispersion model, denoted by MED_m(μ, σ², Γ) and MPD_m(μ, σ², Γ), respectively, when the corresponding margins are used in the construction.
When the marginal models are continuous, a multivariate dispersion model can be equivalently defined by a density of the following form:

g(y; μ, σ², Γ) = c_Φ{F_1(y_1), ..., F_m(y_m) | Γ} f(y_1; μ_1, σ_1²) ··· f(y_m; μ_m, σ_m²).   (4)

When the margins are discrete, the probability function is given by

g(y) = P(Y_1 = y_1, ..., Y_m = y_m) = Σ_{j_1=1}^{2} ··· Σ_{j_m=1}^{2} (−1)^{j_1+···+j_m} C_Φ(u_{1 j_1}, ..., u_{m j_m} | Γ)   (5)

where u_{i1} = F_i(y_i) and u_{i2} = F_i(y_i − 1) for integer-valued margins.
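The rectangle sum (5) can be sketched for m = 2 with Poisson margins. The bivariate normal distribution function below is my own numerical helper (a one-dimensional integration, not part of the paper), and the boundary guards implement C(0, ·) = 0 and C(1, u) = u:

```python
import math
from statistics import NormalDist

_nd = NormalDist()

def _phi2(a, b, gamma):
    """P(Z1 <= a, Z2 <= b) for a standard bivariate normal with correlation
    gamma, by midpoint integration of phi(z) * Phi((b - gamma z)/sqrt(1-gamma^2))
    over z <= a. An assumed helper, not from the paper."""
    lo, hi = -8.5, min(a, 8.5)
    if hi <= lo:
        return 0.0
    s = math.sqrt(1 - gamma * gamma)
    n = 800
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        z = lo + (k + 0.5) * h
        total += _nd.pdf(z) * _nd.cdf((b - gamma * z) / s)
    return total * h

def copula_cdf(u1, u2, gamma):
    """C_gamma(u1, u2) with guards for the boundary values 0 and 1."""
    if u1 <= 0.0 or u2 <= 0.0:
        return 0.0
    if u1 >= 1.0:
        return min(u2, 1.0)
    if u2 >= 1.0:
        return u1
    return _phi2(_nd.inv_cdf(u1), _nd.inv_cdf(u2), gamma)

def poisson_cdf(y, mu):
    if y < 0:
        return 0.0
    return sum(math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))
               for k in range(y + 1))

def biv_pmf(y1, y2, mu1, mu2, gamma):
    """Probability function (5) for m = 2: the inclusion-exclusion formula
    with u_i = F_i(y_i) and v_i = F_i(y_i - 1)."""
    u1, v1 = poisson_cdf(y1, mu1), poisson_cdf(y1 - 1, mu1)
    u2, v2 = poisson_cdf(y2, mu2), poisson_cdf(y2 - 1, mu2)
    return (copula_cdf(u1, u2, gamma) - copula_cdf(u1, v2, gamma)
            - copula_cdf(v1, u2, gamma) + copula_cdf(v1, v2, gamma))
```

Because the four copula terms telescope across adjacent cells, summing biv_pmf over a large enough grid recovers C_γ(F_1(N), F_2(N)), which is close to 1.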
where γ_ij is the Pearson correlation between two normal scores, measuring the dependence between Y_i and Y_j based on a monotone non-linear transformation. We shall refer to this non-linear dependence measure as the normal scoring ν. It is similar to Spearman's ρ dependence measure, determined via the Pearson correlation of transformed variables that follow the uniform distribution on (0, 1). Precisely, ρ_ij = 12 E{G_i(Y_i) G_j(Y_j)} − 3, where the expectation is taken under the joint distribution G(y_i, y_j) of (Y_i, Y_j). Also, Kendall's τ dependence of the pair (Y_i, Y_j) equals τ_ij = 4 E{G(Y_i, Y_j)} − 1, where the expectation is taken under G(y_i, y_j) as well. See for example Kendall & Gibbons (1990) for ρ and τ. Clearly, both the ρ and τ dependence measurements are non-linear functions of γ_ij. For each fixed γ_ij, the Monte Carlo method may be employed to obtain ρ_ij and τ_ij numerically. It is found that in the copula setting (1), ν and ρ are effectively very close to each other, and ν and τ are positively "correlated" in the sense that increasing the value of the ν dependence results in a similar increase in the value of the τ dependence, and vice versa.
Note that we also use the measure ν for the dependence of discrete random variables, although the above interpretation is not applicable in the discrete case. Some supportive evidence for this extension can be drawn from examples 1 and 2, where the bivariate binomial and Poisson models are studied in detail. Hence in the following the measure ν is assumed well-defined in both the continuous and discrete cases.
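The Monte Carlo computation mentioned above can be sketched as follows. For the bivariate Gaussian copula the population values are in fact known in closed form, τ = (2/π) arcsin γ and Spearman's ρ = (6/π) arcsin(γ/2), so the simulation can be checked against them; the function names and sample sizes below are my own choices:

```python
import math, random

def sample_latent_normal(gamma, n, seed=1):
    """Draw n pairs from a standard bivariate normal with correlation gamma.
    Rank-based measures (Spearman's rho, Kendall's tau) are invariant under
    the probability transform u = Phi(z), so the latent pairs suffice."""
    rng = random.Random(seed)
    s = math.sqrt(1 - gamma * gamma)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0, 1)
        pairs.append((z1, gamma * z1 + s * rng.gauss(0, 1)))
    return pairs

def kendall_tau(pairs):
    """Naive O(n^2) concordance count (ties have probability zero here)."""
    n = len(pairs)
    c = 0
    for i in range(n):
        xi, yi = pairs[i]
        for j in range(i + 1, n):
            xj, yj = pairs[j]
            c += 1 if (xi - xj) * (yi - yj) > 0 else -1
    return 2 * c / (n * (n - 1))

def spearman_rho(pairs):
    """Pearson correlation of the ranks."""
    n = len(pairs)
    def ranks(vals):
        order = sorted(range(n), key=vals.__getitem__)
        r = [0] * n
        for pos, k in enumerate(order):
            r[k] = pos + 1
        return r
    rx = ranks([p[0] for p in pairs])
    ry = ranks([p[1] for p in pairs])
    mean = (n + 1) / 2
    num = sum((a - mean) * (b - mean) for a, b in zip(rx, ry))
    den = sum((a - mean) ** 2 for a in rx)
    return num / den
```

With γ = 0.6 and a couple of thousand draws, the estimates land close to the theoretical τ = (2/π) arcsin 0.6 and ρ = (6/π) arcsin 0.3, illustrating the monotone non-linear link between ν = γ and the two rank measures.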
Let S = {r_1, ..., r_s} be a subset of the indices {1, ..., m}. The marginal distribution function of Y_{r_1}, ..., Y_{r_s} is obtained by letting the components y_i → ∞, i ∈ S̄, in the joint distribution G(y; μ, σ², Γ), where S̄ is the complementary set of S. Clearly, the marginal distribution is C_S(F_{r_1}(y_{r_1}), ..., F_{r_s}(y_{r_s}) | Γ_S), where C_S(u_S) is the marginal distribution of C_Φ(u) and the dependence matrix Γ_S is the submatrix of Γ with entries corresponding to the set S. In particular, for both the continuous and discrete cases, the marginal density of one component Y_i is equal to the density of the univariate DM(μ_i, σ_i²), as in the normal case, not depending on the matrix Γ at all.
For the binary case, the marginal distribution function of Y_i with success probability p_i is

F_i(y_i) = 0,        y_i < 0,
           1 − p_i,  0 ≤ y_i < 1,
           1,        y_i ≥ 1.
The m-variate probability function is given by (5). In particular, when m = 2, the bivariate probability function is of the form

P(Y_1 = y_1, Y_2 = y_2) = C_γ(u_1, u_2) − C_γ(u_1, v_2) − C_γ(v_1, u_2) + C_γ(v_1, v_2)   (6)

where u_i = F_i(y_i) and v_i = F_i(y_i − 1). To make use of this model for the regression analysis of binary data, we may assume a generalized linear model structure for the marginal expectations, namely logit(p_i) = η_i or Φ^{-1}(p_i) = η_i, where η_i = x_i^T β and x_i is a vector of covariates. This leads to a multivariate logistic model or a multivariate probit model, respectively. In particular, the probit link results in 1 − p_i = Φ(−η_i), and therefore C_γ(1 − p_1, 1 − p_2) = Φ_2(−η_1, −η_2 | γ).
As a matter of fact, the multivariate probit model can be interpreted as a probit model with the latent variable representation. This may be seen through the bivariate case. Let (Z_1, Z_2) be the latent normal vector satisfying Z_i = x_i^T β + ε_i, i = 1, 2, where (ε_1, ε_2) ~ N(0, 0, 1, 1, γ), and define Y_i = 0 if Z_i ≤ 0, and 1 otherwise. Then the point probability P(Y_1 = 0, Y_2 = 0) = Φ_2(−x_1^T β, −x_2^T β | γ), identical to the first expression of (7). It is easy to prove that the other three point probabilities are the same as the rest in (7). In this case, the correlation parameter γ in (6) is identical to that from the latent normal distribution.
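The latent-variable reading of the bivariate probit model is easy to check by simulation: generate (Z_1, Z_2) as above, threshold at zero, and compare the cell probability P(Y_1 = 0, Y_2 = 0) with an analytic value. A sketch with illustrative values of η_i and γ (my own choices, not from the paper):

```python
import math, random
from statistics import NormalDist

def mc_p00(eta1, eta2, gamma, n=200000, seed=7):
    """Monte Carlo estimate of P(Y1 = 0, Y2 = 0) under the latent
    representation Z_i = eta_i + eps_i, Y_i = 1{Z_i > 0}, with
    corr(eps1, eps2) = gamma."""
    rng = random.Random(seed)
    s = math.sqrt(1 - gamma * gamma)
    hits = 0
    for _ in range(n):
        e1 = rng.gauss(0, 1)
        e2 = gamma * e1 + s * rng.gauss(0, 1)
        if eta1 + e1 <= 0 and eta2 + e2 <= 0:
            hits += 1
    return hits / n
```

At γ = 0 the target is simply Φ(−η_1)Φ(−η_2); for γ > 0 the cell probability exceeds this product, reflecting the positive quadrant dependence of the bivariate normal.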
For the bivariate binary model, the lower and upper Fréchet bounds are given, respectively, in the first and second lines of each cell of the following two-way table,

               0 ≤ y_1 < 1              y_1 ≥ 1
0 ≤ y_2 < 1    max{0, 1 − p_1 − p_2}    1 − p_2
               1 − max{p_1, p_2}        1 − p_2
y_2 ≥ 1        1 − p_1                  1
               1 − p_1                  1

and the bounds otherwise equal zero. It is easy to show that the bivariate binary model attains these two bounds when γ equals −1 and 1, respectively.
as a linear function in y_2, where γ is the Pearson correlation coefficient of (Y_1, Y_2), equal to λ_12 / √{(λ_1 + λ_12)(λ_2 + λ_12)}, and μ_i = λ_i + λ_12 are the given marginal means. For the copula-based construction, the conditional mean is

E(Y_1 | Y_2 = y_2) = Σ_{y_1=0}^{∞} y_1 P(Y_1 = y_1, Y_2 = y_2) / P(Y_2 = y_2),   (9)

where the joint point probability P(Y_1 = y_1, Y_2 = y_2) is the same as in (6). It is relatively easy to compute this function numerically, although a closed form expression is unavailable. A comparison between the two conditional means is illustrated in Fig. 2, where the two margins are set to be the same.
In Fig. 2, a linear approximation to the conditional mean (9) is also shown. This approximation takes a form similar to (8), given by

E(Y_1 | Y_2 = y_2) ≈ μ_1 + γ K(μ_1) ψ(y_2, μ_2),   (10)

Fig. 2. Two exact conditional means and a linear approximation represented, respectively, by solid line (---), dashed line (- - -) and dotted line (· · ·).

where K(μ_1) = Σ_{y_1=0}^{∞} φ{q_1(y_1)} and

ψ(y_2, μ_2) = [φ{q_2(y_2 − 1)} − φ{q_2(y_2)}] / [F_2(y_2) − F_2(y_2 − 1)],

where φ is the standard normal density. The approximation (10) is obtained simply by the Taylor expansion of (9) around γ = 0, given by, for u_i = F_i(y_i), i = 1, 2,

C_γ(u_1, u_2) = F_1(y_1) F_2(y_2) + φ(q_1) φ(q_2) γ + O(γ²).
The function difference √μ − K(μ) is found to be positive at μ = 1, 2, ..., and monotonically decreasing to zero as μ goes to infinity. For example, it equals 0.0225, 0.0127 and 0.0099 when μ = 10, 30, 50, respectively.
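The quantity K(μ) from (10) is a simple one-dimensional sum, so the behaviour described above can be reproduced numerically. The sketch below assumes Poisson margins with mean μ (consistent with the bivariate Poisson discussion) and computes the difference √μ − K(μ); treat the exact numerical values as unverified:

```python
import math
from statistics import NormalDist

def K(mu):
    """K(mu) = sum_{y >= 0} phi{q(y)} with q(y) = Phi^{-1}{F(y; mu)},
    for a Poisson margin with mean mu; the sum is truncated once the
    cdf is numerically 1 (the remaining terms are negligible)."""
    nd = NormalDist()
    total, cdf, logmu, y = 0.0, 0.0, math.log(mu), 0
    while cdf < 1 - 1e-14 and y < 10 * mu + 200:
        cdf += math.exp(-mu + y * logmu - math.lgamma(y + 1))
        total += nd.pdf(nd.inv_cdf(min(cdf, 1 - 1e-15)))
        y += 1
    return total
```

The differences √μ − K(μ) come out positive and shrink toward zero as μ grows, matching the decreasing sequence reported above.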
Figure 2 contains nine plots with all possible combinations of (γ, μ) for γ = 0.3, 0.6, 0.9 and μ = 5, 20, 40, and each graph consists of three lines corresponding to the linear conditional mean (8), the conditional mean (9) and the approximation (10), respectively represented by a solid line (---), a dashed line (- - -) and a dotted line (· · ·).
Clearly, when the marginal means are not small, say 20 or bigger as seen in the figure, the two exact conditional means are almost identical within reasonably large ranges around the means, and the approximation is also fairly close to the other two, although it shows some small departures near the tails.
For small marginal means (equal to 5 in the figure), the two exact conditional means are still close to each other, and the approximation almost overlaps with the other two at low y values but starts to move away from them when y is far from the mean μ (in this figure the departure begins approximately at 2μ).
This comparison also sheds light on the interpretation of the correlation parameter γ in the copula-based construction. At least numerically, we see the closeness between this parameter and the one appearing in the stochastic representation method, which has the traditional interpretation.
3. Properties
3.1. Moments
It is known from the previous section that the marginal distributions of MDM_m(μ, σ², Γ) are just DM(μ_i, σ_i²), and hence the marginal moments are always straightforwardly obtained. For the special case of the MED model, E(Y_i) = μ_i and var(Y_i) = σ_i² v(μ_i), where v(·) is the corresponding variance function.
Unlike the marginal moments, joint moments are in general unavailable in closed form. However, in some cases we may use the Monte Carlo approach to obtain them numerically. Suppose Y = (Y_1, ..., Y_m)^T follows a continuous MDM_m(μ, σ², Γ). Then for a function h such that h(Y) satisfies the conditions of the law of large numbers, when M is large,

E_G h(Y_1, ..., Y_m) ≈ (1/M) Σ_{i=1}^{M} h[F_1^{-1}{Φ(X_1^{(i)})}, ..., F_m^{-1}{Φ(X_m^{(i)})}]   (11)

where the X^{(i)} = (X_1^{(i)}, ..., X_m^{(i)})^T are independent draws from N_m(0, Γ).
cov(Y_i, Y_j) = (σ_i²/2) h_i″ h_j + σ_ij h_i′ h_j′ + (σ_j²/2) h_i h_j″ + O(σ_i² σ_j) + O(σ_i σ_j²) + O(σ_i³) + O(σ_j³)

where σ_ij = σ_i v^{1/2}(μ_i) σ_j v^{1/2}(μ_j) γ_ij and h_k = h_k(μ_k), h_k′ = h_k′(μ_k), h_k″ = h_k″(μ_k), k = i, j.
From the fact that the density function (4) appears in a product form of the copula density and the marginal densities, the moment generating function is easily obtained as φ(t) = Π_{i=1}^{m} φ(t_i) φ̃(t), where the ith marginal moment generating function of ED(μ_i, σ_i²) is given by

φ(t_i) = exp[λ_i{κ(θ_i + t_i/λ_i) − κ(θ_i)}],  with λ_i = 1/σ_i²,

and

φ̃(t) = E exp{(1/2) q^T (I_m − Γ^{-1}) q − (1/2) q̃^T (I_m − Γ^{-1}) q̃}.   (14)

The expectation in (14) is taken under the multivariate model MED_m(μ̃, σ², Γ), where the ith element of μ̃ corresponds to θ_i + t_i/λ_i under the mean mapping κ′(·). Obviously, φ̃(0) = 1.
Clearly, the moment generating function factorizes into two parts: the first factor is the product of the m marginal moment generating functions of ED random variables, and the second factor φ̃(t) is in general too complicated to have a closed form expression. But the following theorem describes a profile of this function; that is, the function φ̃(t) contains the covariance information of the multivariate distribution.
Theorem 1
If φ̃(t) exists in a neighbourhood of 0, then

∂φ̃(0)/∂t_i = 0,  ∂²φ̃(0)/∂t_i² = 0,  for i = 1, ..., m,

and

∂²φ̃(0)/∂t_i ∂t_j = cov(Y_i, Y_j),  for i ≠ j.
Proof. Since the marginal distributions of MED_m(μ, σ², Γ) are the univariate dispersion models ED(μ_i, σ_i²), we have

μ_i = E Y_i = ∂φ(t)/∂t_i |_{t=0} = κ′(θ_i) φ̃(0) + ∂φ̃(0)/∂t_i = μ_i + ∂φ̃(0)/∂t_i.

Hence ∂φ̃(0)/∂t_i = 0. A similar procedure results in ∂²φ̃(0)/∂t_i² = 0. A simple calculation gives, for i ≠ j,

∂²φ(t)/∂t_i ∂t_j |_{t=0} = E(Y_i Y_j) = μ_i μ_j + ∂²φ̃(0)/∂t_i ∂t_j,

which leads to

∂²φ̃(0)/∂t_i ∂t_j = cov(Y_i, Y_j).
This theorem implies that φ̃(t) is the contributor of the covariances; for this reason, φ̃(t) may be called the dependence generating function.
Theorem 2
Suppose φ̃(t) exists for t ∈ R^m. If the variance-covariance matrix Σ is positive definite, then

Σ^{-1/2}(Y − μ) →_d N_m(0, I_m),  as ‖Σ‖ → 0 or ‖Σ^{-1}‖ → 0.

Proof. We first give the proof for the case ‖Σ‖ → 0. By the Cauchy-Schwarz inequality, it is easy to see that σ²_max = max(σ_1², ..., σ_m²) → 0 is equivalent to ‖Σ‖ → 0. According to Jørgensen (1997, sect. 3.6), the marginal asymptotic normality takes the following form:

F(y; μ_i, σ_i²) = Φ(ξ_i/σ_i){1 + O(σ_i²)},  i = 1, ..., m,   (15)

where ξ_i is the Pearson residual given by

ξ_i = ξ_i(y) = (y − μ_i)/v^{1/2}(μ_i),  i = 1, ..., m.

That is, Y is asymptotically multivariate normal with mean μ and covariance matrix given by

diag[σ_1 v^{1/2}(μ_1), ..., σ_m v^{1/2}(μ_m)] Γ diag[σ_1 v^{1/2}(μ_1), ..., σ_m v^{1/2}(μ_m)],

as σ²_max → 0. As a matter of fact, this covariance matrix is equal to Σ because of (13).
We now prove the other case, ‖Σ^{-1}‖ → 0. Let Ω(t) = Σ_{i=1}^{m} λ_i{κ(θ_i + t_i/λ_i) − κ(θ_i)}. An application of the multivariate version of the Taylor expansion leads to

Ω(t) = μ^T t + (1/2) t^T diag[σ_1² v(μ_1), ..., σ_m² v(μ_m)] t + o(‖t‖²).

On the other hand, from theorem 1 and the Taylor expansion of the term φ̃(t), we obtain

φ̃(t) = 1 + (1/2) t^T Σ_0 t + o(‖t‖²),

where Σ_0 is a matrix with zero diagonal entries and the covariances as off-diagonal entries. Combining the two expansions above, we have

φ(t) = exp{μ^T t + (1/2) t^T Σ t + o(‖t‖²)}.   (16)

Furthermore, letting Z = Σ^{-1/2}(Y − μ), we obtain the moment generating function of Z as follows:

φ_Z(t) = exp{−t^T Σ^{-1/2} μ} φ_Y(Σ^{-1/2} t)
       = exp{−t^T Σ^{-1/2} μ} exp{μ^T Σ^{-1/2} t + (1/2) t^T Σ^{-1/2} Σ Σ^{-1/2} t + o(‖Σ^{-1/2} t‖²)}
       = exp{(1/2) t^T t + o(‖Σ^{-1/2} t‖²)}.

Note that

‖Σ^{-1/2} t‖² ≤ ‖Σ^{-1/2}‖² ‖t‖² = tr(Σ^{-1}) ‖t‖².

Therefore, for any t ∈ R^m,

φ_Z(t) → exp{(1/2) t^T t},  as tr(Σ^{-1}) → 0.

According to the uniqueness theorem,

Z = Σ^{-1/2}(Y − μ) →_d N_m(0, I_m).

Let 0 < ρ_1 ≤ ··· ≤ ρ_m be the eigenvalues of Σ. Clearly, ‖Σ^{-1}‖ = {Σ_{i=1}^{m} ρ_i^{-2}}^{1/2} → 0 is equivalent to tr(Σ^{-1}) = Σ_{i=1}^{m} ρ_i^{-1} → 0.
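Theorem 2's small-dispersion regime can be illustrated on a single margin. For a gamma margin with mean μ and dispersion σ² (shape 1/σ², hence skewness 2σ), the Pearson residual (y − μ)/{σ v^{1/2}(μ)} with v(μ) = μ² should lose its skewness as σ² → 0. A simulation sketch, with sample sizes and parameter values of my own choosing:

```python
import math, random

def pearson_residual_skewness(mu, sigma2, n=20000, seed=3):
    """Sample a gamma margin with mean mu and dispersion sigma2
    (shape k = 1/sigma2, scale mu*sigma2) and return the sample
    skewness of the Pearson residual (y - mu)/(sigma*mu)."""
    rng = random.Random(seed)
    k, scale = 1 / sigma2, mu * sigma2
    s = math.sqrt(sigma2) * mu
    r = [(rng.gammavariate(k, scale) - mu) / s for _ in range(n)]
    m1 = sum(r) / n
    m2 = sum((x - m1) ** 2 for x in r) / n
    m3 = sum((x - m1) ** 3 for x in r) / n
    return m3 / m2 ** 1.5
```

The theoretical skewness of the standardized gamma is 2σ, so shrinking σ shrinks the residual skewness proportionally; as σ → 0 the margin approaches the standard normal, in line with the theorem.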
where Σ = diag(σ_i) Γ diag(σ_i). It is noted that the density function obtained by normalizing the right-hand side of the approximation (17) coincides with the definition of the multivariate dispersion density of Jørgensen & Lauritzen (1998).
(1/2) Σ_{i=1}^{n} q_i^T(y_i; β, σ²) (I_m − Γ^{-1}) q_i(y_i; β, σ²)   (18)

where D_i^T = ∂μ_i^T/∂β, V_i = diag{v(μ_it)} and Q_i^T = σ² ∂q_i^T/∂β. The maximum likelihood estimate β̂_ML of β can be obtained as the solution to (19).
Note that Q_i is an m × p matrix whose tth column is given by

σ² ∂q_it/∂β = x_it {v(μ_it) h′(μ_it) φ(q_it)}^{-1} [∫_{−∞}^{y_it} u f_it(u) du − μ_it F_it(y_it)]
            = x_it {v(μ_it) h′(μ_it) φ(q_it)}^{-1} [(y_it − μ_it) F_it(y_it) − ∫_{−∞}^{y_it} F_it(u) du].
If the repeated observations from a subject are independent of one another, i.e. Γ = I_m, then (19) simplifies to the so-called score equation in the context of generalized linear models. This likelihood equation also indicates a different approach to incorporating the dependence feature in the estimating procedure in comparison to the GEE approach. Unlike the GEE approach of Liang & Zeger (1986), which in fact generalizes the score equation (namely the first term inside the curly brackets in (19)) by using a matrix Ṽ_i with non-zero off-diagonal entries to substitute the diagonal covariance matrix V_i, this likelihood equation introduces the dependence matrix Γ in a separate term (namely the second term inside the curly brackets in (19)) that may be regarded as a penalty term for correlation.
Under some mild regularity conditions, the standard large sample theory for likelihood estimates implies both consistency and asymptotic normality, where in particular the observed Fisher information matrix is Ψ̂Ψ̂^T with all parameters replaced by their corresponding estimates in (19).
In general the computation in solving (19) is not straightforward, because the normal scores q_it and their derivatives are non-linear functions involving both parameters β and σ², and software for such a numerical implementation is necessary. Some further study of this aspect is needed.
form of Σ_i. Also note that Γ, treated as a nuisance parameter in the GEE approach, may be interpreted, in our setting, as the matrix of normal scoring ν dependence measurements between the components of y_i. Also, by (13), the matrix Σ_i is indeed the approximate covariance matrix of y_i. Solving (21) to obtain β̂_a may be done by the GEE routine available in S-Plus.
Clearly the inference function U(β) is unbiased. Therefore, under some mild regularity conditions (see for example Godambe, 1991), we have the asymptotic normality

√n (β̂_a − β) →_d N_p(0, Λ(β)),  as n → ∞,

with Λ(β) = lim_n n J^{-1}(β), where J(β) is the Godambe information matrix given by

J(β) = S(β) V^{-1}(β) S^T(β).

Here S(β) = E_β U′(β) and V(β) = E_β U(β) U^T(β).
Applying the saddlepoint approximation to the a(·) function involved in (20), we can also obtain an estimate of σ²,

σ̂²_a = {1/(mn − p)} Σ_{i=1}^{n} (y_i − μ_i)^T Σ_i^{-1} (y_i − μ_i)   (22)

which is of exactly the same form as Liang & Zeger's (1986) estimate of σ². Similarly, the estimate of Γ is obtained by

Γ̂ = {1/(n σ̂²)} Σ_{i=1}^{n} V_i^{-1/2} (y_i − μ_i)(y_i − μ_i)^T V_i^{-1/2}.
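For m = 2 the two moment estimates above can be sketched directly. In the sketch I take v(μ) = 1 (normal-type margins, so V_i = I_2), plug a working correlation into the quadratic form of (22), and read Σ_i there without the dispersion factor so that (22) is not circular in σ²; these are my assumptions for illustration, not statements from the paper:

```python
import math, random

def estimate_sigma2_gamma(ys, mus, gamma, p=0):
    """Moment estimates in the spirit of (22): sigma2_hat from the
    Gamma-weighted quadratic form of the residuals, and gamma_hat as the
    off-diagonal entry of Gamma-hat, for bivariate data with V_i = I_2.
    'gamma' is the working correlation plugged into the quadratic form."""
    n = len(ys)
    det = 1 - gamma * gamma
    quad = 0.0    # sum of (y - mu)^T Gamma^{-1} (y - mu)
    cross = 0.0   # sum of r1 * r2
    for (y1, y2), (m1, m2) in zip(ys, mus):
        r1, r2 = y1 - m1, y2 - m2
        quad += (r1 * r1 - 2 * gamma * r1 * r2 + r2 * r2) / det
        cross += r1 * r2
    sigma2_hat = quad / (2 * n - p)        # mn - p with m = 2
    gamma_hat = cross / (n * sigma2_hat)   # off-diagonal of Gamma-hat
    return sigma2_hat, gamma_hat
```

On data simulated with known σ² and γ, both estimates recover the true values up to Monte Carlo error.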
Table 1. Asymptotic relative efficiency of β̂_ML and β̂_GEE using the gamma regression model

                           Dependence γ
Dispersion σ²     0.0      0.3      0.6      0.9
0.01              1.0000   1.0039   1.0034   1.0311
0.10              1.0000   1.0350   1.1014   1.2312
0.50              1.0000   1.1580   1.3749   1.8148
1.00              1.0000   1.2772   1.6083   2.0808
2.00              1.0000   1.4553   2.0590   3.2390
Acknowledgements
This work is a part of the author's PhD dissertation under the supervision of Professor B. Jørgensen at the University of British Columbia. This research was partially supported by a grant from the Natural Sciences and Engineering Research Council of Canada and by a grant from the Faculty of Arts, York University.
The author thanks the two referees and the associate editor for helpful comments which led to a better exposition of this paper.
References
Bahadur, R. R. (1961). A representation of the joint distribution of responses to n dichotomous items. In Studies in item analysis and prediction (ed. H. Solomon), 158-168. Stanford Mathematical Studies in the Social Sciences VI, Stanford University Press, Stanford, CA.
Barndorff-Nielsen, O. E. & Jørgensen, B. (1991). Some parametric models on the simplex. J. Multivariate Anal. 39, 106-116.
Bishop, Y. M. M., Fienberg, S. E. & Holland, P. W. (1975). Discrete multivariate analysis: theory and practice. MIT Press, Cambridge, MA.
Crowder, M. (1987). On linear and quadratic estimating functions. Biometrika 74, 591-597.
Diggle, P. J., Liang, K.-Y. & Zeger, S. L. (1994). The analysis of longitudinal data. Oxford University Press, Oxford.
Fitzmaurice, G. M., Laird, N. M. & Rotnitzky, A. G. (1993). Regression models for discrete longitudinal responses. Statist. Sci. 8, 284-309.
Godambe, V. P. (1991). Estimating functions: an overview. Oxford University Press, Oxford.
Hutchinson, T. P. & Lai, C. D. (1990). Continuous bivariate distributions, emphasising applications. Rumsby, Sydney.
Joe, H. (1993). Parametric families of multivariate distributions with given margins. J. Multivariate Anal. 46, 262-282.
Joe, H. (1997). Multivariate models and dependence concepts. Chapman & Hall, London.
Jørgensen, B. (1987a). Exponential dispersion models (with discussion). J. Roy. Statist. Soc. Ser. B 49, 127-162.
Jørgensen, B. (1987b). Small-dispersion asymptotics. Braz. J. Probab. Statist. 1, 59-90.
Jørgensen, B. (1997). The theory of dispersion models. Chapman & Hall, London.
Jørgensen, B. & Lauritzen, S. L. (1998). Multivariate dispersion models. Research Report 2, Department of Statistics and Demography, Odense University, Denmark.
Kendall, M. & Gibbons, J. D. (1990). Rank correlation methods, 5th edn. Edward Arnold, London.
Liang, K.-Y. & Zeger, S. L. (1986). Longitudinal data analysis using generalized linear models. Biometrika 73, 13-22.
Marshall, A. W. & Olkin, I. (1988). Families of multivariate distributions. J. Amer. Statist. Assoc. 83, 834-841.
McCullagh, P. & Nelder, J. A. (1989). Generalized linear models, 2nd edn. Chapman & Hall, London.
Sklar, A. (1959). Fonctions de répartition à n dimensions et leurs marges. Publ. Inst. Statist. Univ. Paris 8, 229-231.
Zhao, L. P. & Prentice, R. L. (1990). Correlated binary regression using a generalized quadratic model. Biometrika 77, 642-648.
P. X.-K. Song, Department of Mathematics and Statistics, York University, Toronto, Ontario, Canada M3J
1P3.