
Inferences on the Association Parameter in Copula Models for Bivariate Survival Data

Author(s): Joanna H. Shih and Thomas A. Louis


Source: Biometrics, Vol. 51, No. 4 (Dec., 1995), pp. 1384-1399
Published by: International Biometric Society
Stable URL: http://www.jstor.org/stable/2533269 .
Accessed: 25/09/2013 17:43

BIOMETRICS 51, 1384-1399
December 1995

Inferences on the Association Parameter in Copula Models for Bivariate Survival Data

Joanna H. Shih¹ and Thomas A. Louis²

¹National Heart, Lung, and Blood Institute, Two Rockledge Centre,
6701 Rockledge Drive, Bethesda, Maryland 20892-7938, U.S.A.
²Division of Biostatistics, School of Public Health,
University of Minnesota, Minneapolis, Minnesota 55455-0392, U.S.A.

SUMMARY

We investigate two-stage parametric and two-stage semi-parametric estimation procedures for the association parameter in copula models for bivariate survival data where censoring in either or both components is allowed. We derive asymptotic properties of the estimators and compare their performance by simulations. Both parametric and semi-parametric estimators of the association parameter are efficient at independence, and the parameter estimates in the margins have high efficiency and are robust to misspecification of dependency structures. In addition, we propose a consistent variance estimator for the semi-parametric estimator of the association parameter. We apply the proposed methods to an AIDS data set for illustration.

Key words: Association; Bivariate failure times; Copula models; Semi-parametric models; Time-dependent correlation coefficient.

1. Introduction
There has been growing interest in modeling bivariate and multivariate survival data. For example, in an AIDS study, human immunodeficiency virus (HIV)-infected patients are at risk of having various AIDS events, such as toxoplasmosis, wasting, cryptococcosis, etc. It is important to understand sequencing of these events so that clinicians can use the joint distribution to predict future episodes. In a bone marrow transplant study, after receiving bone marrow transplantation, a patient is at risk of acute graft versus host disease and cytomegalovirus. Physicians wish to study the association between time to cytomegalovirus infection and time to acute graft versus host disease. Furthermore, we need to know about the association of the correlated failure times to design studies and produce appropriate standard errors for parameter estimates.
There is a great deal of flexibility in modeling multivariate survival data. Fully parametric approaches are attractive, because they are generally identifiable and produce smooth survival functions. Fully nonparametric approaches, while providing flexibility, are inefficient and, in the presence of many types of censoring, can be inconsistent. Semi-parametric models combine the best features of both approaches and so we investigate this approach.
Two approaches are commonly used in modeling multivariate data: random effects models and the marginal approach. Random effects models with conditional independence generate a wide class of joint distributions. A common approach in the survival application assumes independence conditional on a scalar non-negative random variable, the so-called frailty, that multiplies the hazard. Mixing over the distribution for the frailty produces dependence. The marginal approach models the marginal distribution directly and then imposes a dependency structure. Two aspects of statistical interest arise from the marginal approach: robust estimation of the margins and estimation of association. For the former, consistency of parameter estimates is a major concern, and association among dependent failure times is treated as a nuisance. For the latter, estimating the association is the primary interest, and marginal distributions are treated as nuisance functions.
In this paper, we take the marginal approach and focus on estimation of association. We model the association of bivariate failure times by copula functions. These functions are continuous distributions on the unit square $[0, 1]^2$ with uniform marginal distributions. Since an absolutely continuous distribution function can be transformed to a uniform distribution, a generalized copula model can have arbitrary margins. Many researchers have studied bivariate distributions in copula models, including Plackett (1965), Mardia (1970), Clayton (1978), Cook and Johnson (1981), Genest and MacKay (1986a,b), Hougaard (1986), Marshall and Olkin (1988), and Nelsen (1986). One attractive feature of a copula model is that the margins do not depend on the choice of the dependency structure, and consequently we can model and estimate the dependency and the margins separately.
This paper is organized as follows. In Section 2, we present a few families of bivariate distributions in copula models, and compare their dependency structures by a time-dependent association measure. In Section 3, we present the two-stage parametric and semi-parametric estimation procedures and derive their asymptotic properties. In Section 4, we compare the performance of the two-stage estimation procedures to maximum likelihood estimation (MLE). In Section 5, we analyze an AIDS data set for illustration. We conclude and highlight future research in Section 6. All theorem proofs are in the Appendices.

2. Copula Models
Suppose that $C_\alpha$ is a distribution function with density $c_\alpha$ on $[0, 1]^2$ for $\alpha \in \mathbb{R}^1$. Let $(T_1, T_2)$ denote the paired failure times, and $(S_1, S_2)$, $(f_1, f_2)$ denote the corresponding marginal survival functions and density functions. If $(T_1, T_2)$ comes from the $C_\alpha$ copula for some $\alpha$, then the joint survival function and density function of $(T_1, T_2)$ are given by

$$S(t_1, t_2) = C_\alpha(S_1(t_1), S_2(t_2)), \quad t_1, t_2 > 0,$$

$$f(t_1, t_2) = c_\alpha(S_1(t_1), S_2(t_2))\, f_1(t_1)\, f_2(t_2), \quad t_1, t_2 > 0.$$
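The two displayed formulas can be sketched in a few lines of Python. This is only an illustration: the independence copula $C(u, v) = uv$, the exponential margins, and all function names are assumptions chosen for the example, not part of the paper.

```python
import math

def joint_survival(copula, s1, s2, t1, t2):
    """S(t1, t2) = C(S1(t1), S2(t2)) for a copula C and marginal survival functions."""
    return copula(s1(t1), s2(t2))

def joint_density(copula_density, s1, s2, f1, f2, t1, t2):
    """f(t1, t2) = c(S1(t1), S2(t2)) * f1(t1) * f2(t2)."""
    return copula_density(s1(t1), s2(t2)) * f1(t1) * f2(t2)

# Illustrative (hypothetical) ingredients: independence copula, exponential margins.
def independence(u, v):
    return u * v

def independence_density(u, v):
    return 1.0

def exp_surv(rate):
    return lambda t: math.exp(-rate * t)

def exp_dens(rate):
    return lambda t: rate * math.exp(-rate * t)

S_value = joint_survival(independence, exp_surv(1.0), exp_surv(2.0), 0.5, 0.25)
f_value = joint_density(independence_density, exp_surv(1.0), exp_surv(2.0),
                        exp_dens(1.0), exp_dens(2.0), 0.5, 0.25)
# Setting t2 = 0 recovers the first margin, whatever the copula.
margin_1 = joint_survival(independence, exp_surv(1.0), exp_surv(2.0), 0.5, 0.0)
```

Swapping in a different copula changes the dependence but leaves `margin_1` untouched, which is the separation of margins and dependency that the two-stage procedures rely on.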

2.1 Archimedean Copulas
A bivariate distribution belonging to the Archimedean copula model family has the representation

$$C_\alpha(u, v) = \phi_\alpha[\phi_\alpha^{-1}(u) + \phi_\alpha^{-1}(v)], \quad 0 \le u, v \le 1, \tag{1}$$

where $0 \le \phi_\alpha \le 1$, $\phi_\alpha(0) = 1$, $\phi_\alpha' < 0$, $\phi_\alpha'' > 0$. If $\phi_\alpha$ is a Laplace transform of some distribution, Archimedean copula models reduce to proportional frailty models (Marshall and Olkin, 1988; Oakes, 1989). Also, a bivariate survival function induced by a proportional frailty model can be expressed by

$$S(t_1, t_2) = E[e^{-Z(H_1(t_1) + H_2(t_2))}], \tag{2}$$

where the expectation is taken over the frailty $Z$, and $H_1$, $H_2$ are the cumulative hazard functions conditional on $Z$. Thus, the bivariate survival function is the Laplace transform of the frailty distribution evaluated at $H_1(t_1) + H_2(t_2)$. Equating (1) with (2), we can show that $H_i = \phi_\alpha^{-1}(S_i)$, $i = 1, 2$. Three examples are presented below.
Clayton's Family. The bivariate survival function belonging to Clayton's family (1978) has the form

$$C_\alpha(u, v) = \{u^{1-\alpha} + v^{1-\alpha} - 1\}^{1/(1-\alpha)}, \quad \alpha > 1. \tag{3}$$

Here $\phi_\alpha(s) = (1 + s)^{1/(1-\alpha)}$ is the Laplace transform of a gamma distribution. $T_1$ and $T_2$ are positively associated when $\alpha > 1$ and are independent when $\alpha \to 1$. Let $\lambda$ denote a hazard function. Clayton (1978) shows that $\lambda(t_2 \mid T_1 = t_1)/\lambda(t_2 \mid T_1 \ge t_1)$ equals $\alpha$ if and only if the bivariate survival function belongs to Clayton's family.
Positive Stable Frailties. The bivariate survival function induced from positive stable frailties takes the form

$$C_\alpha(u, v) = \exp[-\{(-\log u)^{1/\alpha} + (-\log v)^{1/\alpha}\}^{\alpha}], \tag{4}$$

where $0 < \alpha < 1$. Here $\phi_\alpha(s) = \exp(-s^\alpha)$ is the Laplace transform of a positive stable distribution. Small values of $\alpha$ provide large correlation, and $T_1$, $T_2$ are independent as $\alpha$ approaches 1. Hougaard (1986) advocates the use of positive stable frailties. He noted the important property that in univariate data the proportionality of the conditional hazard (given a frailty) is inherited by the marginal hazard.
Frank's Family. The bivariate survival function introduced by Frank (1979) has the representation

$$C_\alpha(u, v) = \log_\alpha\left\{1 + \frac{(\alpha^u - 1)(\alpha^v - 1)}{\alpha - 1}\right\}, \tag{5}$$

where $\alpha > 0$, and $\log_\alpha$ denotes logarithm to the base $\alpha$. $T_1$, $T_2$ are positively associated when $\alpha < 1$, negatively associated when $\alpha > 1$, and independent when $\alpha$ approaches 1. Here $\phi_\alpha(s) = \log_\alpha\{1 - (1 - \alpha)e^{-s}\}$ and is a Laplace transform for $0 < \alpha < 1$.
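As a check on representation (1), the following Python sketch builds each of the three families from its generator $\phi_\alpha$ and compares the result with the closed forms (3), (4), and (5). The inverse generators are derived here by solving $\phi_\alpha(s) = u$ for $s$; the function names are illustrative assumptions, not notation from the paper.

```python
import math

def archimedean(phi, phi_inv):
    """Build C(u, v) = phi(phi_inv(u) + phi_inv(v)) from a generator and its inverse."""
    return lambda u, v: phi(phi_inv(u) + phi_inv(v))

def clayton(alpha):                      # requires alpha > 1
    phi = lambda s: (1.0 + s) ** (1.0 / (1.0 - alpha))
    phi_inv = lambda u: u ** (1.0 - alpha) - 1.0
    return archimedean(phi, phi_inv)

def positive_stable(alpha):              # requires 0 < alpha < 1
    phi = lambda s: math.exp(-(s ** alpha))
    phi_inv = lambda u: (-math.log(u)) ** (1.0 / alpha)
    return archimedean(phi, phi_inv)

def frank(alpha):                        # requires alpha > 0 and alpha != 1
    log_alpha = math.log(alpha)
    phi = lambda s: math.log(1.0 - (1.0 - alpha) * math.exp(-s)) / log_alpha
    phi_inv = lambda u: -math.log((1.0 - alpha ** u) / (1.0 - alpha))
    return archimedean(phi, phi_inv)

# Closed forms (3), (4), (5) written out directly.
def clayton_closed(alpha, u, v):
    return (u ** (1 - alpha) + v ** (1 - alpha) - 1) ** (1 / (1 - alpha))

def stable_closed(alpha, u, v):
    inner = (-math.log(u)) ** (1 / alpha) + (-math.log(v)) ** (1 / alpha)
    return math.exp(-(inner ** alpha))

def frank_closed(alpha, u, v):
    inner = 1 + (alpha ** u - 1) * (alpha ** v - 1) / (alpha - 1)
    return math.log(inner) / math.log(alpha)

u0, v0 = 0.3, 0.7
diff_clayton = abs(clayton(1.5)(u0, v0) - clayton_closed(1.5, u0, v0))
diff_stable = abs(positive_stable(0.8)(u0, v0) - stable_closed(0.8, u0, v0))
diff_frank = abs(frank(0.157)(u0, v0) - frank_closed(0.157, u0, v0))
# Near alpha = 1 every family should be close to the independence copula u * v.
near_independence = [clayton(1.0001)(u0, v0),
                     positive_stable(0.9999)(u0, v0),
                     frank(0.9999)(u0, v0)]
```

All three differences should be at rounding-error level, and the entries of `near_independence` should be close to $0.3 \times 0.7 = 0.21$, consistent with the independence limits stated for each family.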
Note that these families may exhibit different association structures even with the same margins. For illustration, consider the time-dependent correlation coefficient defined by

$$r(t_1, t_2) = \frac{\operatorname{cov}(M_1(t_1), M_2(t_2))}{\{\operatorname{var}(M_1(t_1)) \operatorname{var}(M_2(t_2))\}^{1/2}}, \tag{6}$$

where for $i = 1, 2$, $M_i(t) = I\{T_i \le t\} - \Lambda_i(t \wedge T_i)$ and $\Lambda_i = -\log(S_i)$ is the marginal cumulative hazard function. The expected value of $M_i(t)$ equals 0 for $t \ge 0$, and $M_i$ is a martingale with respect to the filtration defined by $T_i$ in the absence of censoring. The correlation coefficient $r(t_1, t_2)$ is an appealing association measure, because it is bounded between $-1$ and 1, and gives a direct interpretation of the strength of association. Prentice and Cai (1992) express the covariance in the numerator of (6) in terms of the marginal and bivariate survival functions as

$$\operatorname{cov}(M_1(t_1), M_2(t_2)) = S(t_1, t_2) - 1 + \int_0^{t_1} S(s_1^-, t_2)\, \Lambda_1(ds_1) + \int_0^{t_2} S(t_1, s_2^-)\, \Lambda_2(ds_2) + \int_0^{t_1}\!\!\int_0^{t_2} S(s_1^-, s_2^-)\, \Lambda_1(ds_1)\, \Lambda_2(ds_2). \tag{7}$$
The above expression is useful for computing $r(t_1, t_2)$. Figure 1 displays the contours of $r(t_1, t_2)$ for the above three families of distributions with unit exponential margins and with $\alpha$ corresponding to Kendall's tau equal to .2. In Clayton's family, $r(t_1, t_2)$ increases when both $t_1$ and $t_2$ increase. In the positive stable frailties family, large correlation occurs at small values of $t_1$ and $t_2$, and then decreases only slightly when $t_1$ and $t_2$ increase. In Frank's family, $r(t_1, t_2)$ first increases when both $t_1$ and $t_2$ increase and then decreases slightly after it reaches a maximum. Figure 2 is a plot of the correlation curve $r(t, t)$ against $t_1 = t_2 = t$. Although $r(t, t)$ does not exhibit the full association structure as the contours do, it characterizes the distinctive correlation feature in each of the families presented. For the AIDS data analyzed in Section 5, $r(t, t)$ is useful in understanding how the strength of the association of the two groups of diseases changes during the course of the follow-up. A formal investigation of the properties and sensitivity of $r(t, t)$ for model assessment is beyond the scope of this paper.

Figure 1. Contours of $r(t_1, t_2)$ generated from Clayton's family, positive stable frailties, and Frank's family with Kendall's tau equal to .2. [Three panels: Clayton's family, Positive stable frailties, Frank's family; axes $t_1$ and $t_2$, each from 0.0 to 3.0.]

[Figure 2: plot of the correlation curve $r(t, t)$; vertical axis from 0.00 to 0.45.]
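To make the use of (7) concrete, here is a minimal numerical sketch of $r(t_1, t_2)$ for unit exponential margins, where $\Lambda_i(ds) = ds$. It uses a plain midpoint rule, and it relies on the standard identity $\operatorname{var}(M_i(t)) = 1 - S_i(t)$ for a continuous failure time, which is not stated in the text above. The grid size and function names are illustrative assumptions.

```python
import math

def correlation(S, t1, t2, n=200):
    """Approximate r(t1, t2) in (6) from (7), assuming unit exponential margins.

    S is the joint survival function. With unit exponential margins,
    Lambda_i(ds) = ds and var(M_i(t)) = 1 - exp(-t).
    """
    h1, h2 = t1 / n, t2 / n
    grid1 = [(i + 0.5) * h1 for i in range(n)]   # midpoints on [0, t1]
    grid2 = [(i + 0.5) * h2 for i in range(n)]   # midpoints on [0, t2]
    int1 = sum(S(s, t2) for s in grid1) * h1
    int2 = sum(S(t1, s) for s in grid2) * h2
    int12 = sum(S(a, b) for a in grid1 for b in grid2) * h1 * h2
    cov = S(t1, t2) - 1.0 + int1 + int2 + int12
    return cov / math.sqrt((1.0 - math.exp(-t1)) * (1.0 - math.exp(-t2)))

def independent_survival(t1, t2):
    return math.exp(-t1 - t2)

def clayton_survival(alpha):
    """Clayton's family (3) evaluated at unit exponential margins."""
    theta = alpha - 1.0
    return lambda t1, t2: (math.exp(theta * t1) + math.exp(theta * t2) - 1.0) ** (-1.0 / theta)

r_independent = correlation(independent_survival, 1.0, 1.0)
# alpha = 1.5 corresponds to Kendall's tau of .2 in Clayton's family.
r_clayton = correlation(clayton_survival(1.5), 1.0, 1.0)
```

Under independence the four terms of (7) cancel, so `r_independent` should be zero up to integration error, while `r_clayton` should be clearly positive.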

3. Estimation
The bivariate survival function $S = C_\alpha(S_1, S_2)$ in copula models is characterized by the dependence function $C_\alpha$ and the two marginal survival functions $S_1$, $S_2$. This special structure suggests that we may estimate the two margins and $\alpha$ separately. For example, Hougaard (1989) uses such a procedure to estimate $\alpha$. At stage 1, the two marginal cumulative hazards are estimated by Nelson's nonparametric procedure (Nelson, 1972) ignoring dependence. At stage 2, $\alpha$ is estimated with the two margins fixed at the Nelson estimates. Hougaard does not examine the statistical properties of the two-stage procedure. Clayton (1978) and Oakes (1982a, 1986) investigate a semi-parametric estimation procedure for $\alpha$ in Clayton's family by treating the conditional hazard functions $\lambda(t_1 \mid T_2 > t_2)$ and $\lambda(t_2 \mid T_1 > t_1)$ as nuisance functions. Clayton and Cuzick (1985) propose a two-step iterative procedure for estimating $\alpha$ in Clayton's family. Their approach involves considerable computation.
In this section, we investigate two-stage estimation. At stage 1, we estimate the two margins assuming independence. We can use any estimation procedures which produce consistent estimators for the margins. At stage 2, we estimate $\alpha$ by fixing the margins at the estimates from stage 1. We consider both parametric and nonparametric margins.

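3.1 Two-Stage Parametric Estimation
Two-stage parametric estimation assumes that functional forms of the margins are known and have a finite number of unknown parameters. For illustration purposes, let $\beta_1$, $\beta_2$ denote the vectors of parameters for margin 1 and margin 2, respectively. Let $(C_1, C_2)$ denote paired censoring times. For $j = 1, \ldots, n$, assume that $(T_{1j}, T_{2j})$ and $(C_{1j}, C_{2j})$ are independent and random samples with continuous survival functions $S$ and $G$, respectively. For each $j$, we observe $X_{ij} = T_{ij} \wedge C_{ij}$ and $\delta_{ij} = I\{X_{ij} = T_{ij}\}$, $i = 1, 2$. Write $\theta$ for $(\beta_1', \beta_2', \alpha)'$. Then the likelihood of $\theta$ is

$$\prod_{j=1}^{n} f(x_{1j}, x_{2j}; \theta)^{\delta_{1j}\delta_{2j}} \left[-\frac{\partial S(x_{1j}, x_{2j}; \theta)}{\partial x_{1j}}\right]^{\delta_{1j}(1-\delta_{2j})} \left[-\frac{\partial S(x_{1j}, x_{2j}; \theta)}{\partial x_{2j}}\right]^{\delta_{2j}(1-\delta_{1j})} S(x_{1j}, x_{2j}; \theta)^{(1-\delta_{1j})(1-\delta_{2j})}. \tag{8}$$

Write $U_{\beta_1}(\theta)$, $U_{\beta_2}(\theta)$, $U_\alpha(\theta)$ for the score functions, which are the derivatives of the log of (8) with respect to $\beta_1$, $\beta_2$, and $\alpha$. The maximum likelihood estimate $\hat\theta$ is the solution to $U_{\beta_1} = 0$, $U_{\beta_2} = 0$, $U_\alpha = 0$. Let $\theta_0 = (\beta_{10}', \beta_{20}', \alpha_0)'$ denote the vector of the true parameter values. Under regularity conditions (Cox and Hinkley, 1974, p. 281), $n^{1/2}(\hat\theta - \theta_0)$ converges to multivariate normal with mean vector zero and variance-covariance matrix $I^{-1}$, where $I$ is partitioned into blocks

$$I = \begin{pmatrix} I_{11} & I_{12} & I_{13} \\ I_{21} & I_{22} & I_{23} \\ I_{31} & I_{32} & I_{33} \end{pmatrix}.$$

For $i = 1, 2$, $j = 1, 2$, $nI_{ij}$ is the variance-covariance matrix between $U_{\beta_i}$ and $U_{\beta_j}$, $nI_{i3}$ is the covariance matrix between $U_{\beta_i}$ and $U_\alpha$, and $nI_{33}$ is the scalar variance of $U_\alpha$.
For two-stage parametric estimation, at stage 1 we estimate $(\beta_1, \beta_2)$ by $(\tilde\beta_1, \tilde\beta_2)$ assuming independence. Then for $i = 1, 2$, $\tilde\beta_i$ is the solution to the estimating equations

$$U^*_{\beta_i} = \sum_{j} \left[\delta_{ij} \frac{\partial \log f_i(X_{ij}; \beta_i)}{\partial \beta_i} + (1 - \delta_{ij}) \frac{\partial \log S_i(X_{ij}; \beta_i)}{\partial \beta_i}\right] = 0.$$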

Under regularity conditions, $n^{1/2}(\tilde\beta_1 - \beta_{10}, \tilde\beta_2 - \beta_{20})'$ converges to multivariate normal with mean vector zero and variance $(I^*)^{-1}$, where $nI^*_{ij}$ is the variance-covariance matrix between $U^*_{\beta_i}$ and $U^*_{\beta_j}$ for $i = 1, 2$, $j = 1, 2$, and $I^*$ is partitioned into blocks $I^*_{11}$, $I^*_{12}$, $I^*_{21}$, and $I^*_{22}$.
After the two margins are estimated, we estimate the association parameter $\alpha$ by solving the estimating equation

$$U_\alpha(\tilde\beta_1, \tilde\beta_2, \alpha) = 0. \tag{9}$$

Theorem 1. Let $\tilde\alpha$ denote the solution to (9). Under regularity conditions, $\sqrt{n}(\tilde\alpha - \alpha_0)$ converges to normal with mean zero and variance $\sigma^2$, where

$$\sigma^2 = \frac{1}{I_{33}} + \frac{1}{I_{33}^2}\left(I_{31} I^{*-1}_{11} I_{13} + I_{32} I^{*-1}_{22} I_{23} + I_{31} I^{*-1}_{11} I^*_{12} I^{*-1}_{22} I_{23} + I_{32} I^{*-1}_{22} I^*_{21} I^{*-1}_{11} I_{13}\right).$$

When $\beta_1$, $\beta_2$ are scalar, $\sigma^2$ has the following simpler expression:

$$\sigma^2 = \frac{1}{I_{33}} + \frac{1}{I_{33}^2}\left(\frac{I_{13}^2}{I^*_{11}} + \frac{I_{23}^2}{I^*_{22}} + \frac{2 I_{13} I_{23} I^*_{12}}{I^*_{11} I^*_{22}}\right). \tag{10}$$

The second term in $\sigma^2$ accounts for the loss in asymptotic efficiency in estimating $\alpha$ due to the lack of knowledge of $\beta_1$ and $\beta_2$. It is straightforward to show that $I_{13} = I_{23} = 0$ when $T_1$ and $T_2$ are independent, and thus $\tilde\alpha$ is efficient at independence.
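The two-stage parametric procedure can be sketched end to end for one concrete case. The example below is a sketch under stated assumptions, not the authors' implementation: it assumes Clayton's family with exponential margins, simulates censored data by conditional inversion, fits each exponential rate separately (stage 1), and then maximizes the $\alpha$-dependent part of likelihood (8) with the margins held fixed (stage 2) using a golden-section search. All function names, the search bounds, and the seed are illustrative choices.

```python
import math
import random

def simulate(n, alpha, cens_upper, rng):
    """Clayton-dependent unit exponential pairs with uniform(0, cens_upper) censoring."""
    theta = alpha - 1.0
    data = []
    for _ in range(n):
        u, w = 1.0 - rng.random(), 1.0 - rng.random()   # both in (0, 1]
        # Conditional inversion: solve dC/du (u, v) = w for v.
        v = ((w ** (-theta / (1.0 + theta)) - 1.0) * u ** (-theta) + 1.0) ** (-1.0 / theta)
        t1, t2 = -math.log(u), -math.log(v)
        c1, c2 = rng.uniform(0.0, cens_upper), rng.uniform(0.0, cens_upper)
        data.append((min(t1, c1), int(t1 <= c1), min(t2, c2), int(t2 <= c2)))
    return data

def clayton_loglik(alpha, pseudo):
    """Log of (8) up to terms free of alpha, for pseudo = [(u, d1, v, d2), ...]."""
    theta = alpha - 1.0
    total = 0.0
    for u, d1, v, d2 in pseudo:
        log_a = math.log(u ** (-theta) + v ** (-theta) - 1.0)
        if d1 and d2:        # both failures observed: copula density
            total += (math.log(1.0 + theta) - (theta + 1.0) * (math.log(u) + math.log(v))
                      - (1.0 / theta + 2.0) * log_a)
        elif d1:             # only the first failure observed: dC/du
            total += -(theta + 1.0) * math.log(u) - (1.0 / theta + 1.0) * log_a
        elif d2:             # only the second failure observed: dC/dv
            total += -(theta + 1.0) * math.log(v) - (1.0 / theta + 1.0) * log_a
        else:                # both censored: C itself
            total += -log_a / theta
    return total

def golden_max(f, lo, hi, tol=1e-6):
    """Golden-section search for the maximizer of a unimodal f on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = f(c)
        else:
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = f(d)
    return (a + b) / 2.0

def two_stage_parametric(data):
    # Stage 1: exponential-rate MLEs, one margin at a time, ignoring dependence.
    rate1 = sum(d1 for _, d1, _, _ in data) / sum(x1 for x1, _, _, _ in data)
    rate2 = sum(d2 for _, _, _, d2 in data) / sum(x2 for _, _, x2, _ in data)
    # Stage 2: plug the fitted margins in and maximize over alpha = 1 + exp(g).
    pseudo = [(math.exp(-rate1 * x1), d1, math.exp(-rate2 * x2), d2)
              for x1, d1, x2, d2 in data]
    g_hat = golden_max(lambda g: clayton_loglik(1.0 + math.exp(g), pseudo),
                       math.log(0.01), math.log(20.0))
    return rate1, rate2, 1.0 + math.exp(g_hat)

rate1, rate2, alpha_tilde = two_stage_parametric(simulate(2000, 3.0, 2.3, random.Random(1)))
```

With a true $\alpha$ of 3 and unit rates, the fitted values should land near those targets. The golden-section search assumes the stage 2 log-likelihood is unimodal in $\alpha$, which is an assumption of this sketch rather than a result from the paper.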

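3.2 Two-Stage Semi-parametric Estimation
3.2.1 A Semi-parametric Estimator. In this section we relax parametric assumptions on the two margins, and estimate $S_1$, $S_2$ by the Kaplan-Meier estimators denoted by $\hat S_1$, $\hat S_2$. For $j = 1, \ldots, n$, write $(u_j, v_j)$ for $(S_1(X_{1j}), S_2(X_{2j}))$. Then given $(u_j, v_j)$, $j = 1, \ldots, n$, the likelihood of $\alpha$ is

$$\prod_{j=1}^{n} L(\alpha, u_j, v_j) = \prod_{j=1}^{n} c_\alpha(u_j, v_j)^{\delta_{1j}\delta_{2j}} \left[\frac{\partial C_\alpha(u_j, v_j)}{\partial u_j}\right]^{\delta_{1j}(1-\delta_{2j})} \left[\frac{\partial C_\alpha(u_j, v_j)}{\partial v_j}\right]^{\delta_{2j}(1-\delta_{1j})} C_\alpha(u_j, v_j)^{(1-\delta_{1j})(1-\delta_{2j})}. \tag{11}$$

Let $l(\alpha, u, v)$ denote the log of $L(\alpha, u, v)$, and $U_\alpha(\alpha, S_1, S_2)$ the score function of $\alpha$, which is the derivative of the log of (11) with respect to $\alpha$. The semi-parametric estimator $\alpha^*$ is the solution to the estimating equation:

$$U_\alpha(\alpha, \hat S_1, \hat S_2) = \sum_{j=1}^{n} \frac{\partial l(\alpha, \hat S_1(X_{1j}), \hat S_2(X_{2j}))}{\partial \alpha} = 0.$$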
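Stage 1 of the semi-parametric procedure needs $\hat S_i(X_{ij})$ for every observation. A minimal Kaplan-Meier sketch in Python follows; the function name and the convention of evaluating the right-continuous estimate at each observed time are assumptions of this example, not details given in the paper.

```python
def kaplan_meier_at_observations(times, events):
    """Return the Kaplan-Meier estimate S_hat(X_j) at each observed time X_j.

    times  : observed times X_j = min(T_j, C_j)
    events : 1 if X_j is a failure time, 0 if it is censored
    """
    n = len(times)
    order = sorted(range(n), key=lambda j: times[j])
    surv_at = {}
    surv = 1.0
    i = 0
    while i < n:
        t = times[order[i]]
        at_risk = n - i                  # subjects with X >= t
        deaths = 0
        k = i
        while k < n and times[order[k]] == t:   # group tied observation times
            deaths += events[order[k]]
            k += 1
        surv *= 1.0 - deaths / at_risk
        surv_at[t] = surv
        i = k
    return [surv_at[t] for t in times]

u_hat = kaplan_meier_at_observations([3, 1, 4, 2], [1, 1, 1, 0])
```

The resulting values play the role of $(u_j, v_j)$ in (11). Note that the estimate is exactly zero at the largest time when that time is a failure, and the copula terms in (11) are not defined at zero, so a practical implementation needs some convention for that case; the paper's handling of it is not shown in this section.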

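Theorem 2. Assume that the joint distribution of $T_1$, $T_2$ belongs to a copula model family, and that the following regularity conditions hold:
(a). Standard regularity conditions for maximum likelihood estimate.
(b). $W_\alpha(\alpha, S_1(t_1), S_2(t_2))$, $V_\alpha(\alpha, S_1(t_1), S_2(t_2))$, $V_{\alpha,1}(\alpha, S_1(t_1), S_2(t_2))$, and $V_{\alpha,2}(\alpha, S_1(t_1), S_2(t_2))$ are continuous and bounded for $(t_1, t_2) \in \mathcal{S} = [0, t_{01}] \times [0, t_{02}]$, where

$$W_\alpha(\alpha, u, v) = \frac{\partial l(\alpha, u, v)}{\partial \alpha},$$

$$V_\alpha(\alpha, u, v) = \frac{\partial^2 l(\alpha, u, v)}{\partial \alpha^2},$$

$$V_{\alpha,1}(\alpha, u, v) = \frac{\partial^2 l(\alpha, u, v)}{\partial \alpha\, \partial u},$$

$$V_{\alpha,2}(\alpha, u, v) = \frac{\partial^2 l(\alpha, u, v)}{\partial \alpha\, \partial v},$$

$t_{01} = \sup\{t : P(T_1 > t, C_1 > t) > 0\}$, and $t_{02} = \sup\{t : P(T_2 > t, C_2 > t) > 0\}$.
Then $n^{1/2}(\alpha^* - \alpha_0)$ converges to normal with mean zero and variance $\nu^2 = [\tau^2 + \rho^2]/\tau^4$.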

The specification of $\mathcal{S}$ is required to ensure that $\hat S_1(t_1)$ is consistent for $S_1(t_1)$ with $t_1 \in [0, t_{01}]$ and $\hat S_2(t_2)$ for $S_2(t_2)$ with $t_2 \in [0, t_{02}]$.
The formulas for $\tau^2$ and $\rho^2$ are specified as

$$\tau^2 = E[-V_\alpha(\alpha_0, S_1(X_{11}), S_2(X_{21}))] = \int -V_\alpha(\alpha_0, S_1(t_1), S_2(t_2))\, dH_0(t_1, t_2, \delta_1, \delta_2), \tag{12}$$

$$\rho^2 = E[\{I_1(X_{11}, \delta_{11}, \alpha_0) + I_2(X_{21}, \delta_{21}, \alpha_0)\}^2] = \int [I_1(t_1, \delta_1, \alpha_0) + I_2(t_2, \delta_2, \alpha_0)]^2\, dH_0(t_1, t_2, \delta_1, \delta_2), \tag{13}$$

where $H_0$ is the joint distribution of $(X_{1j}, \delta_{1j})$ and $(X_{2j}, \delta_{2j})$. For $j = 1, \ldots, n$, $I_1$ and $I_2$ are defined by

$$I_1(X_{1j}, \delta_{1j}, \alpha_0) = \int V_{\alpha,1}(\alpha_0, S_1(t_1), S_2(t_2))\, I_1^0(X_{1j}, \delta_{1j})(t_1)\, dH_0(t_1, t_2, \delta_1, \delta_2),$$

$$I_2(X_{2j}, \delta_{2j}, \alpha_0) = \int V_{\alpha,2}(\alpha_0, S_1(t_1), S_2(t_2))\, I_2^0(X_{2j}, \delta_{2j})(t_2)\, dH_0(t_1, t_2, \delta_1, \delta_2),$$

where

$$I_i^0(X_{ij}, \delta_{ij})(t_i) = -S_i(t_i)\left\{\int_0^{t_i} \frac{dN_{ij}(u)}{P(T_i \ge u, C_i \ge u)} - \int_0^{t_i} \frac{I\{X_{ij} \ge u\}\, d\Lambda_i(u)}{P(T_i \ge u, C_i \ge u)}\right\},$$

$$N_{ij}(u) = I\{X_{ij} \le u, \delta_{ij} = 1\}, \quad i = 1, 2; \; j = 1, \ldots, n.$$

When $\alpha_0$ approaches the value corresponding to independence, by integration by parts and by noting that $E[W_\alpha(\alpha_0, u, v)\, \partial l(\alpha_0, u, v)/\partial u] = E[W_\alpha(\alpha_0, u, v)\, \partial l(\alpha_0, u, v)/\partial v] = 0$, it can be easily shown that $I_1$ and $I_2$ converge to zero. Thus at independence $\rho$ converges to 0 and $\alpha^*$ is efficient.
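3.2.2 Finding a Standard Error. A variance estimator of $\alpha^*$ may be obtained by replacing $H_0$ by its empirical distribution function $H_n$, and $S_1$, $S_2$, $\alpha$ by $\hat S_1$, $\hat S_2$, and $\alpha^*$. Specifically,

$$\hat\tau^2 = -\int V_\alpha(\alpha^*, \hat S_1(t_1), \hat S_2(t_2))\, dH_n(t_1, t_2, \delta_1, \delta_2) = -\frac{1}{n}\sum_{j=1}^{n} V_\alpha(\alpha^*, \hat S_1(X_{1j}), \hat S_2(X_{2j})).$$

For $j = 1, \ldots, n$, $I_1(X_{1j}, \delta_{1j}, \alpha_0)$ and $I_2(X_{2j}, \delta_{2j}, \alpha_0)$ are estimated by

$$\hat I_1(X_{1j}, \delta_{1j}, \alpha^*) = \int V_{\alpha,1}(\alpha^*, \hat S_1(t_1), \hat S_2(t_2))\, \hat I_1^0(X_{1j}, \delta_{1j})(t_1)\, dH_n(t_1, t_2, \delta_1, \delta_2) = \frac{1}{n}\sum_{k} V_{\alpha,1}(\alpha^*, \hat S_1(X_{1k}), \hat S_2(X_{2k}))\, \hat I_1^0(X_{1j}, \delta_{1j})(X_{1k}),$$

$$\hat I_2(X_{2j}, \delta_{2j}, \alpha^*) = \int V_{\alpha,2}(\alpha^*, \hat S_1(t_1), \hat S_2(t_2))\, \hat I_2^0(X_{2j}, \delta_{2j})(t_2)\, dH_n(t_1, t_2, \delta_1, \delta_2) = \frac{1}{n}\sum_{k} V_{\alpha,2}(\alpha^*, \hat S_1(X_{1k}), \hat S_2(X_{2k}))\, \hat I_2^0(X_{2j}, \delta_{2j})(X_{2k}),$$

where

$$\hat I_1^0(X_{1j}, \delta_{1j})(X_{1k}) = -\hat S_1(X_{1k})\left\{\frac{I[X_{1j} \le X_{1k}, \delta_{1j} = 1]}{\hat p_{1j}} - \sum_{X_{1l} \le X_{1j} \wedge X_{1k}} \frac{\Delta\hat\Lambda_1(X_{1l})}{\hat p_{1l}}\right\}$$

and

$$\hat I_2^0(X_{2j}, \delta_{2j})(X_{2k}) = -\hat S_2(X_{2k})\left\{\frac{I[X_{2j} \le X_{2k}, \delta_{2j} = 1]}{\hat p_{2j}} - \sum_{X_{2l} \le X_{2j} \wedge X_{2k}} \frac{\Delta\hat\Lambda_2(X_{2l})}{\hat p_{2l}}\right\},$$

$$\hat p_{ij} = \frac{\sum_{l} I\{X_{il} \ge X_{ij}\}}{n}, \quad i = 1, 2,$$

$$\Delta\hat\Lambda_i(t) = \text{Nelson's estimator} = \frac{I\{Y_i(t) > 0\}}{Y_i(t)}\, dN_i(t), \quad i = 1, 2,$$

$$Y_i(t) = \sum_{j} I\{X_{ij} \ge t\}, \quad N_i(t) = \sum_{j} N_{ij}(t).$$

Finally,

$$\hat\rho^2 = \int [\hat I_1(t_1, \delta_1, \alpha^*) + \hat I_2(t_2, \delta_2, \alpha^*)]^2\, dH_n(t_1, t_2, \delta_1, \delta_2) = \frac{1}{n}\sum_{j=1}^{n} [\hat I_1(X_{1j}, \delta_{1j}, \alpha^*) + \hat I_2(X_{2j}, \delta_{2j}, \alpha^*)]^2.$$

Theorem 3. Under regularity conditions (a), (b), $\hat\tau^2$, $\hat\rho^2$, and $\hat\nu^2$ are consistent estimators of $\tau^2$, $\rho^2$, and $\nu^2$.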
4. Simulations
4.1 Performance of Two-Stage Estimation
In this section, we compare by simulations the performance of the two-stage estimators to that of a maximum likelihood estimator. We assume unit exponential margins ($\beta_1 = \beta_2 = 1$), and choose three values of $\alpha$ in each of the three copula models described in Section 2 such that the corresponding Kendall's tau equals .2, .4, and .6. We choose Kendall's tau as a global measure of association, because it has rank-invariance properties and can be expressed by a simple function of $\phi_\alpha^{-1}$ (Genest and MacKay, 1986a,b).
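The link between $\alpha$ and Kendall's tau can be checked numerically. The sketch below uses the standard Genest-MacKay expression $\tau = 1 + 4\int_0^1 g(t)/g'(t)\,dt$ with $g = \phi_\alpha^{-1}$. That formula is quoted from the general copula literature rather than from this section, and the midpoint rule, grid size, and function names are assumptions of the example. It is applied to the values of $\gamma = \log(\alpha)$ that appear in the simulation tables.

```python
import math

def kendall_tau(g, g_prime, n=20000):
    """tau = 1 + 4 * integral over (0, 1) of g(t) / g'(t), by the midpoint rule.

    g is the inverse generator phi^{-1} and g_prime its derivative.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += g(t) / g_prime(t)
    return 1.0 + 4.0 * total * h

def tau_clayton(alpha):
    theta = alpha - 1.0
    return kendall_tau(lambda t: t ** (-theta) - 1.0,
                       lambda t: -theta * t ** (-theta - 1.0))

def tau_positive_stable(alpha):
    return kendall_tau(lambda t: (-math.log(t)) ** (1.0 / alpha),
                       lambda t: -(1.0 / alpha) * (-math.log(t)) ** (1.0 / alpha - 1.0) / t)

def tau_frank(alpha):
    log_alpha = math.log(alpha)
    return kendall_tau(lambda t: -math.log((1.0 - alpha ** t) / (1.0 - alpha)),
                       lambda t: log_alpha * alpha ** t / (1.0 - alpha ** t))

# gamma = log(alpha) values used in the simulations, grouped by target tau (.2, .4, .6).
taus_clayton = [tau_clayton(math.exp(g)) for g in (0.41, 0.85, 1.39)]
taus_stable = [tau_positive_stable(math.exp(g)) for g in (-0.22, -0.51, -0.92)]
taus_frank = [tau_frank(math.exp(g)) for g in (-1.85, -4.20, -7.90)]
```

For Clayton's family and positive stable frailties the integral has the closed forms $(\alpha - 1)/(\alpha + 1)$ and $1 - \alpha$, so the numerical values can be compared against those; Frank's family has no elementary closed form, which is why a numerical route is used for all three here.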
We consider both no censoring and 30% censoring. To achieve 30% censoring, we let the two censoring variables be independently and identically distributed (i.i.d.) uniformly over (0, 2.3). For each value of $\alpha$ we generated 1,000 simulated samples with $n = 50$ and $n = 200$. We used the algorithm provided by Ripley (1987) to generate gamma deviates and the algorithm of Marshall and Olkin (1988) to generate random pairs from proportional frailty models. For the positive stable frailty model, we used the transformation method of Lee (1979) and Oakes and Manatunga (1992) to generate random pairs. For Frank's family, we used the algorithm of Genest (1987) to generate random pairs. We estimated $\gamma = \log(\alpha)$ because estimation for $\gamma$ is more stable.
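The frailty-based generation step can be sketched for Clayton's family as follows. This is an illustration in the spirit of the Marshall-Olkin construction, not a transcription of the cited algorithms: it draws a gamma frailty with Python's `random.gammavariate` (shape, scale) rather than Ripley's routine, and the function names, sample size, and seed are assumptions.

```python
import math
import random

def sample_clayton_frailty(n, alpha, rng):
    """Draw n pairs with unit exponential margins from Clayton's family via a gamma frailty."""
    theta = alpha - 1.0
    # phi is the Laplace transform of a Gamma(shape 1/theta, scale 1) frailty.
    phi = lambda s: (1.0 + s) ** (-1.0 / theta)
    pairs = []
    for _ in range(n):
        z = rng.gammavariate(1.0 / theta, 1.0)          # shared frailty
        e1 = -math.log(1.0 - rng.random())              # independent unit exponentials
        e2 = -math.log(1.0 - rng.random())
        u, v = phi(e1 / z), phi(e2 / z)                 # (u, v) follows the copula
        pairs.append((-math.log(u), -math.log(v)))      # invert the exponential survival
    return pairs

def empirical_kendall_tau(pairs):
    """Kendall's tau from concordant minus discordant pairs (no ties expected)."""
    n = len(pairs)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (pairs[i][0] - pairs[j][0]) * (pairs[i][1] - pairs[j][1])
            score += 1 if product > 0 else -1
    return 2.0 * score / (n * (n - 1))

sample = sample_clayton_frailty(800, 3.0, random.Random(7))
tau_hat = empirical_kendall_tau(sample)
```

With $\alpha = 3$ the population Kendall's tau in Clayton's family is $(\alpha - 1)/(\alpha + 1) = 0.5$, so `tau_hat` should fall close to that value, which gives a quick sanity check on the sampler.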
The biases (not shown) of the two-stage estimators for uncensored as well as censored data are within 3% for all $\gamma$ values in Clayton's family and Frank's family. In the positive stable frailties, the biases are within 5% except in the $\gamma = -.22$, $n = 50$ scenario for the semi-parametric estimator, where the bias is 11% for uncensored data and 15% for censored data. Table 1 presents the variances of the estimators. It is striking that, for all three dependency structures, the two-stage parametric estimator $\tilde\gamma$ performs as well as the maximum likelihood estimator $\hat\gamma$ for both uncensored and censored data with $n = 200$. For $n = 50$, the performance of $\tilde\gamma$ is slightly better than that of $\hat\gamma$, but their difference is not statistically significant. Since Fisher's information can be calculated explicitly for Clayton's family and positive stable frailties (Oakes, 1982a, 1992), we can compare the asymptotic variances of MLE and the two-stage estimator directly. To this end, we need to calculate $I^*_{11}$, $I^*_{22}$, $I^*_{12}$. For unit exponential margins, $I^*_{11} = I^*_{22} = 1$, and $I^*_{12}$ is the covariance between $M_1(\infty)$ and $M_2(\infty)$, which can be obtained through numerical integration. We find that the variances of $\hat\alpha$ and $\tilde\alpha$ are identical for a wide range of $\alpha$ values in Clayton's family as well as positive stable frailties.
However, the two-stage marginal parameter estimators $\tilde\beta_1$ and $\tilde\beta_2$ are inefficient relative to $\hat\beta_1$ and $\hat\beta_2$, because they are estimated without using the information in the correlation. In Table 2 we present the simulation results for the performance of $\tilde\beta_1$ and $\tilde\beta_2$ for uncensored data. For Clayton's family and positive stable frailties, $\tilde\beta_1$ and $\tilde\beta_2$ perform almost as well as $\hat\beta_1$ and $\hat\beta_2$ whether the correlation is moderate or large. For Frank's family, the variances of $\tilde\beta_1$ and $\tilde\beta_2$ are close to those of MLEs when the correlation is moderate, but are about 15-20% larger when Kendall's tau equals .6.

Table 1
Variances of $\hat\gamma$, $\tilde\gamma$, $\gamma^*$ computed from 1,000 samples of size 50 and 200 from Clayton's family, positive stable frailty, and Frank's family. In each model, the first row is the variance of $\hat\gamma$, the second row is the ratio of the variance of $\hat\gamma$ to the variance of $\tilde\gamma$ multiplied by 100, and the third row is the ratio of the variance of $\hat\gamma$ to the variance of $\gamma^*$ multiplied by 100.

                                                  var(sqrt(n) x estimator)
                                           No censoring         30% censoring
Model                     γ = log(α)      n = 50   n = 200     n = 50   n = 200
Clayton's family             .41           1.56     1.29        2.51     2.29
                                           103      101         102      100
                                           84       92          93       99
                             .85           1.39     1.33        2.44     2.44
                                           101      100         104      100
                                           85       83          100      90
                             1.39          1.37     1.44        2.42     2.49
                                           99       100         103      98
                                           79       88          89       90
Positive stable frailties    -.22          .67      .54         .87      .69
                                           101      100         105      100
                                           88       84          86       84
                             -.51          .67      .65         .88      .75
                                           101      100         102      100
                                           70       68          68       68
                             -.92          .75      .70         1.01     .96
                                           101      100         102      100
                                           61       59          66       62
Frank's family               -1.85         45.8     41.2        60.2     55.0
                                           102      100         108      101
                                           101      97          104      99
                             -4.20         56.2     49.4        75.1     65.5
                                           102      101         105      100
                                           96       93          93       93
                             -7.90         98.0     85.2        153.0    122.1
                                           106      101         112      101
                                           91       85          101      87
In the following, we give a plausible explanation for the efficiency of $\tilde\alpha$, but we do not have a rigorous proof. In Appendix A, we show that the asymptotic distribution of $\tilde\alpha$ is determined by the linear combination of $U_\alpha$, $U^*_{\beta_1}$, $U^*_{\beta_2}$. Consider a single pair of uncensored observations. For $i = 1, 2$, $U^*_{\beta_i}$ is proportional to $M_i(\infty)$. Note that knowledge of $U^*_{\beta_i}$ is equivalent to that of $M_i(t)$, $t \ge 0$. Prentice and Cai (1992) show that $M_1$ and $M_2$ and their covariance function uniquely determine the bivariate survival function. Hence, $U^*_{\beta_1}$ and $U^*_{\beta_2}$ and their covariance also completely determine the correlation. As a result, the loss of information in the first stage is fully recovered from the second stage by the covariance $I^*_{12}$ of $U^*_{\beta_1}$ and $U^*_{\beta_2}$, leading to efficiency of $\tilde\alpha$.
For the two-stage semi-parametric estimation procedure, there is loss of efficiency in $\gamma^*$ relative to $\hat\gamma$. However, this loss of efficiency is primarily because the margins are estimated nonparametrically.
This simulation study demonstrates two advantages of the two-stage estimation procedure. First, we maintain high efficiency for the estimation of the association parameter while having flexibility in modeling the margins. Second, because of the special structure of Archimedean copula models, the marginal estimates not only are robust to misspecification of dependency structures but also preserve high efficiency.

Table 2
Variances of $\hat\beta_i$, $\tilde\beta_i$, $i = 1, 2$, computed from 1,000 samples of size 50 and 200 from Clayton's family, positive stable frailty, and Frank's family. In each model, the first row is the variance of $\hat\beta_i$, and the second row is the ratio of the variance of $\hat\beta_i$ to the variance of $\tilde\beta_i$ multiplied by 100.

                                          var(sqrt(n) x estimator)
Model                        γ        n = 50   n = 200     n = 50   n = 200
Clayton's family             .41       .96      .98         .94      1.03
                                       99       99          98       99
                             .85       1.08     .99         1.00     1.06
                                       96       99          98       97
                             1.39      .95      .99         .98      1.00
                                       96       98          97       99
Positive stable frailties    -.22      1.04     .98         1.05     .98
                                       99       98          98       97
                             -.51      1.01     .96         .94      .98
                                       96       95          96       96
                             -.92      1.02     .99         .97      .97
                                       95       98          97       95
Frank's family               -1.85     1.15     1.04        1.11     .97
                                       99       97          99       98
                             -4.20     1.01     .97         1.05     .86
                                       94       90          94       91
                             -7.90     .90      .86         .89      .88
                                       85       81          82       82
4.2 Performance of the Variance Estimator $\hat\nu^2$
In this section, we examine the performance of the variance estimator $\hat\nu^2$. We estimate its relative bias by

$$\text{Relative bias} = \frac{\sum_i \hat\nu^2_{(i)}/M - n\sum_i (\gamma^*_{(i)} - \bar\gamma^*)^2/(M - 1)}{n\sum_i (\gamma^*_{(i)} - \bar\gamma^*)^2/(M - 1)}, \tag{14}$$

where $\hat\nu^2_{(i)}$ and $\gamma^*_{(i)}$ are the estimates from the $i$th simulated sample, $\bar\gamma^* = \sum_i \gamma^*_{(i)}/M$ is the average of the $\gamma^*_{(i)}$, and $M$ (= 1,000) is the number of simulated samples. For large sample sizes, when $M \to \infty$, the denominator of (14) converges to $\nu^2$. In addition, we calculated the empirical coverage of the 95% confidence interval based on $\gamma^* \pm 1.96\,\hat\nu/\sqrt{n}$. Table 3 presents the simulation results for uncensored samples. The bias is fairly small for every $\gamma$ value. Furthermore, all the empirical coverages are close to the nominal 95% level. Table 4 presents the simulation results for censored samples. The bias is larger for $n = 50$, because the Kaplan-Meier estimate tends to be unstable for small and heavily censored data. Interestingly, the empirical coverage is still very close to the 95% level for most of the $\gamma$ values.

Table 3
Performance of the variance estimator of $\gamma^*$ from 1,000 simulated uncensored samples of size 50 and 200 from Clayton's family, positive stable frailty, and Frank's family.

                                            n = 50                      n = 200
                                     Relative    95% coverage    Relative    95% coverage
Model                        γ       bias (%)    probability     bias (%)    probability
Clayton's family             .41      4.4         95.4            -4.9        93.4
                             .85      5.2         94.9            5.0         95.0
                             1.39     4.7         95.5            .6          94.6
Positive stable frailties    -.22     -3.9        93.9            1.6         95.1
                             -.51     -3.2        93.8            -6.6        93.7
                             -.92     1.8         95.5            -2.9        94.8
Frank's family               -1.85    -5.0        95.5            7.8         95.2
                             -4.20    -4.3        94.9            -1.1        94.8
                             -7.90    1.4         94.4            -8.4        93.2

Table 4
Performance of the variance estimator of $\gamma^*$ from 1,000 simulated 30% censored samples of size 50 and 200 from Clayton's family, positive stable frailty, and Frank's family.

                                            n = 50                      n = 200
                                     Relative    95% coverage    Relative    95% coverage
Model                        γ       bias (%)    probability     bias (%)    probability
Clayton's family             .41      5.9         95.2            5.0         96.0
                             .85      19.7        97.2            -2.5        95.0
                             1.39     23.9        96.6            0.7         95.5
Positive stable frailties    -.22     -9.9        94.4            -6.0        93.5
                             -.51     -6.2        94.1            0.0         94.3
                             -.92     5.9         95.1            -8.5        93.3
Frank's family               -1.85    -3.3        94.6            -7.5        94.9
                             -4.20    3.6         96.8            4.1         95.5
                             -7.90    14.2        95.9            2.1         94.8
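The two summaries used in this section, the relative bias of the variance estimator and the empirical coverage of the 95% interval, can be computed from simulation output as sketched below. The inputs are hypothetical lists holding one estimate per simulated sample. The empirical variance divides by $M - 1$, which is an assumption of this sketch because the divisor is not legible in the extracted formula.

```python
import math

def relative_bias(variance_estimates, gamma_estimates, n):
    """Average variance estimate versus the empirical variance of sqrt(n) * gamma."""
    m = len(gamma_estimates)
    mean_gamma = sum(gamma_estimates) / m
    # Divisor m - 1 is an assumption of this sketch.
    empirical = n * sum((g - mean_gamma) ** 2 for g in gamma_estimates) / (m - 1)
    return (sum(variance_estimates) / m - empirical) / empirical

def coverage(variance_estimates, gamma_estimates, n, gamma_true):
    """Fraction of intervals gamma +/- 1.96 * sqrt(variance / n) that contain gamma_true."""
    hits = sum(1 for v, g in zip(variance_estimates, gamma_estimates)
               if abs(g - gamma_true) <= 1.96 * math.sqrt(v / n))
    return hits / len(gamma_estimates)

# Tiny hypothetical input, only to show the calling convention.
example_gammas = [0.9, 1.0, 1.1]
example_variances = [1.1, 1.1, 1.1]
example_bias = relative_bias(example_variances, example_gammas, n=100)
example_coverage = coverage(example_variances, example_gammas, n=100, gamma_true=1.25)
```

In this toy input the empirical variance of $\sqrt{n}\,\gamma$ is 1.0 and the average variance estimate is 1.1, so the relative bias is 10%, and only the third interval reaches 1.25.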

5. Illustration
5.1 Data
We analyzed a subset of data for patients enrolled into various protocols of the Terry Beirn Community Programs for Clinical Research on AIDS. This program is funded by the Division of AIDS in the National Institute of Allergy and Infectious Diseases, U.S.A., to utilize the knowledge and expertise of community physicians in treating HIV-infected patients. There are 17 units with over 200 clinics in the United States participating in the program. The CD4 count at the baseline visit of the patients included in this analysis is below 100. The patients are at risk for various diseases, and we group them into two levels of seriousness. Events belonging to group 1 (less serious) include cryptococcosis, candidiasis, herpes simplex, Kaposi's sarcoma, and pneumocystis carinii pneumonia. Events belonging to group 2 (more serious) include cryptosporidiosis, toxoplasmosis, mycobacterium avium complex infection, histoplasmosis, cytomegalovirus, AIDS dementia complex, and wasting. One goal of this analysis is to study sequencing of these diseases.
We analyze data up through April 30, 1992. Of 1,092 patients, 266 (24.4%) have developed diseases in group 1 and 261 (23.9%) have developed diseases in group 2, and 102 patients (9.3%) have experienced diseases in both groups with 44 (4.0%) having simultaneous diagnosis. The median time to the diseases in group 1 is 557 days and the median time to the diseases in group 2 is 538 days.

Figure 3. Nonparametric and semi-parametric estimates of the correlation curves from the AIDS data. [Curves: nonparametric; Clayton's family; positive stable; Frank's family. Horizontal axis: time (days), 0 to 700.]

Figure 4. Probability 1: probability of having a group 1 disease conditional on having had a group 2 disease; probability 2: probability of having a group 2 disease conditional on having had a group 1 disease. [Horizontal axis: time (days), 0 to 700.]

5.2 Results
We fit the two margins by Weibull models as well as the Kaplan-Meier estimates. We applied Oakes' concordance test of independence (Oakes, 1982b), leading to rejection of the hypothesis of independence. We fit the dependency structure by Clayton's family, positive stable frailty, and Frank's family. Kendall's tau equals .24, .10, and .16, respectively.
We examine the fit of the three dependency structures by comparing their semi-parametric correlation curves $r(t, t)$ with the nonparametric one. The semi-parametric estimate is based on parametric dependency and Kaplan-Meier margins, while the nonparametric estimate is based on the nonparametric estimate of the bivariate survival function of Dabrowska (1988). Figure 3 displays the estimates of the correlation curves. None of the three models is able to capture the association well in the first 60 days because of the large variation occurring in this period. But overall it appears that the positive stable frailty model fits better than the other two models. In particular, the positive stable frailty model captures the high correlation occurring in the first 200 days, while the other two models underestimate it.
Undertheestimatedpositivestable frailtymodel,we calculated theprobabilityof havinga group
2 disease by timet conditionalon havingia giroup1 disease by timet, as well as the probabilityof

serious disease than is the reverse. However, the sequencing of the two groups of diseases becomes
less apparent after 200 days, possibly because 20% died without having developed either a group 1
or group 2 disease.
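
As a rough illustration of the conditional probabilities discussed above, the sketch below computes P(T2 <= t | T1 <= t) from two marginal survival values and a positive stable copula. It assumes the survival copula S(t1, t2) = exp(-[(-log S1)^(1/a) + (-log S2)^(1/a)]^a) with index a in (0, 1]; the function names and the numerical inputs are hypothetical and are not taken from the AIDS data.

```python
import math


def stable_joint_survival(s1, s2, a):
    """Joint survival under the assumed positive stable copula with index a."""
    if not (0.0 < s1 <= 1.0 and 0.0 < s2 <= 1.0 and 0.0 < a <= 1.0):
        raise ValueError("need 0 < s1, s2 <= 1 and 0 < a <= 1")
    inner = (-math.log(s1)) ** (1.0 / a) + (-math.log(s2)) ** (1.0 / a)
    return math.exp(-(inner ** a))


def prob_2_given_1(s1, s2, a):
    """P(T2 <= t | T1 <= t), given S1(t) = s1 and S2(t) = s2."""
    if s1 >= 1.0:
        raise ValueError("conditioning event has probability zero when s1 = 1")
    joint = stable_joint_survival(s1, s2, a)
    # P(T1 <= t, T2 <= t) = 1 - S1 - S2 + S(t, t) by inclusion-exclusion.
    return (1.0 - s1 - s2 + joint) / (1.0 - s1)


if __name__ == "__main__":
    # Hypothetical marginal survival values at some time t.
    print(prob_2_given_1(0.8, 0.7, 0.9))
    print(prob_2_given_1(0.7, 0.8, 0.9))  # the reverse conditioning
```

With a = 1 the copula reduces to independence, so the conditional probability equals 1 - S2(t); smaller values of a give a larger conditional probability.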

6. Conclusion
We have investigated aspects of inferences on dependency structure of copula models. High
efficiency of the two-stage parametric estimator suggests that, with the two margins being unknown,
the two-stage semi-parametric estimator might also be highly efficient. Bickel et al. (1993) calculate
information bounds for estimation of the association parameter and two nonparametric margins in
the copula model family. The estimators are based on estimation of the efficient influence function.
Their approach is able to construct an efficient estimator of the association parameter when only one
margin is unknown, but becomes unwieldy in establishing the efficiency when both margins are
unknown.
The graphical comparison of the nonparametric and semi-parametric correlation curves appears
to be useful for assessment. Formal investigation on the properties and sensitivity of the correlation
curve is a topic of future research. In addition, for the subclass of proportional frailty models, one
may compare the posterior distribution of the frailty, given the data, with the assumed parametric
form. This approach is being developed.
Extension of the two-stage approach to accommodate covariates with margin-specific regression
coefficients is direct. We may choose any regression models to fit the margins at stage 1. At
stage 2, we fix the regression coefficients at the estimated ones and maximize the parametric
likelihood to estimate the association parameter. We expect that the asymptotic results presented in
this paper still hold.
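
The two-stage idea can be sketched in a deliberately simplified setting: uncensored data, no covariates, exponential margins, and a Clayton survival copula of the assumed form C(u, v) = (u^(-theta) + v^(-theta) - 1)^(-1/theta). Stage 1 fits each margin separately; stage 2 plugs the fitted marginal survival values into the copula log-likelihood and maximizes over theta alone. All function names, the golden-section search, and the simulation settings are illustrative assumptions, not the implementation used in this paper.

```python
import math
import random


def simulate_clayton_pairs(n, theta, rate1, rate2, rng):
    """Exponential failure-time pairs whose survival copula is Clayton."""
    pairs = []
    for _ in range(n):
        u = 1.0 - rng.random()  # uniform on (0, 1]
        w = 1.0 - rng.random()
        # Conditional inversion: solve dC/du (u, v) = w for v.
        v = ((w ** (-theta / (1.0 + theta)) - 1.0) * u ** (-theta) + 1.0) ** (-1.0 / theta)
        pairs.append((-math.log(u) / rate1, -math.log(v) / rate2))
    return pairs


def clayton_loglik(theta, log_s1, log_s2):
    """Copula log-likelihood evaluated at log marginal survival values."""
    total = 0.0
    for lu, lv in zip(log_s1, log_s2):
        a, b = -theta * lu, -theta * lv
        m = max(a, b)
        # Stable evaluation of log(u^-theta + v^-theta - 1).
        log_sum = m + math.log(math.exp(a - m) + math.exp(b - m) - math.exp(-m))
        total += math.log1p(theta) - (1.0 + theta) * (lu + lv) - (2.0 + 1.0 / theta) * log_sum
    return total


def golden_max(f, lo, hi, tol=1e-6):
    """Golden-section search for the maximizer of a unimodal function."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = hi - inv_phi * (hi - lo), lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc > fd:
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2.0


def two_stage_clayton(pairs):
    """Stage 1: exponential MLEs per margin. Stage 2: maximize over theta."""
    n = len(pairs)
    rate1 = n / sum(t1 for t1, _ in pairs)
    rate2 = n / sum(t2 for _, t2 in pairs)
    log_s1 = [-rate1 * t1 for t1, _ in pairs]  # log of fitted S1(t1)
    log_s2 = [-rate2 * t2 for _, t2 in pairs]  # log of fitted S2(t2)
    theta = golden_max(lambda th: clayton_loglik(th, log_s1, log_s2), 0.01, 20.0)
    return rate1, rate2, theta


if __name__ == "__main__":
    data = simulate_clayton_pairs(2000, 2.0, 1.0, 0.5, random.Random(1))
    print(two_stage_clayton(data))
```

Handling censoring or covariates would change the likelihood contributions in both stages; the sketch only shows the order of operations, with the margins held fixed while the association parameter is estimated.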
Finally, in this paper we assume that the marginals are not linked, so we estimate them separately.
In applications such as studies of twins, the failure times of each pair may be exchangeable, or they
might have some common parameter values. Thus, we should use the whole data set to estimate the
common marginal survival or parameters. Development of asymptotic properties of the parameter
estimates in this case is a topic of current research.

ACKNOWLEDGEMENTS

Partial support of this work was provided by contract NO1-AI-05073 from the National Institute of
Allergy and Infectious Diseases, U.S.A. Most of the work was done while the first author was at the
Division of Biostatistics, University of Minnesota. We thank Dr. Kathryn Chaloner for her
constructive comments. We appreciate the valuable comments of the two referees, which have
improved the presentation of this paper.
After this manuscript had been submitted, the work of Genest et al. (1995) was brought to our
attention. They investigate a semi-parametric estimator of the association parameter similar to ours
for uncensored data in Clayton's family.

RESUME
We develop two-stage parametric and semi-parametric estimation procedures for the association
parameter in copula models for bivariate survival data in the presence of censoring in one or both
components. We establish the asymptotic properties of the estimators and compare their
performance by simulation. Both the parametric and the semi-parametric estimators of the
association parameter are efficient at independence, and the parameter estimates in the margins
have high efficiency and are robust to misspecification of the dependency structures. In addition,
we propose a consistent estimator of the variance of the semi-parametric estimator of the
association parameter. We illustrate the proposed methods with an application to AIDS data.

REFERENCES

Bickel, P. J., Klaassen, C. A. J., Ritov, Y., and Wellner, J. A. (1993). Efficient and Adaptive
Estimation for Semiparametric Models. Baltimore, Maryland: Johns Hopkins University Press.
Clayton, D. G. (1978). A model for association in bivariate life tables and its application in
epidemiological studies of familial tendency in chronic disease incidence. Biometrika 65,
141-151.
Clayton, D. G. and Cuzick, J. (1985). Multivariate generalizations of the proportional hazards
model. Journal of the Royal Statistical Society, Series A 148, 82-108.

Cook, R. D. and Johnson, M. E. (1981). A family of distributions for modeling non-elliptically
symmetric multivariate data. Journal of the Royal Statistical Society, Series B 43, 210-218.
Cox, D. R. and Hinkley, D. V. (1974). Theoretical Statistics. London: Chapman & Hall.
Dabrowska, D. M. (1988). Kaplan-Meier estimate on the plane. Annals of Statistics 16, 1475-1489.
Fleming, T. R. and Harrington, D. P. (1991). Counting Processes and Survival Analysis. New York:
Wiley.
Frank, M. J. (1979). On the simultaneous associativity of F(x, y) and x + y - F(x, y).
Aequationes Mathematicae 19, 194-226.
Genest, C. (1987). Frank's family of bivariate distributions. Biometrika 74, 549-555.
Genest, C. and MacKay, R. J. (1986a). Archimedean copulas and bivariate families with continuous
marginals. Canadian Journal of Statistics 14, 145-159.
Genest, C. and MacKay, R. J. (1986b). The joy of copulas: Bivariate distributions with uniform
marginals. American Statistician 40, 280-283.
Genest, C., Ghoudi, K., and Rivest, L. P. (1995). A semiparametric estimation procedure for
dependence parameters in multivariate families of distributions. Biometrika 82, 543-552.
Hougaard, P. (1986). A class of multivariate failure time distributions. Biometrika 73, 671-678.
Hougaard, P. (1989). Fitting a multivariate failure time distribution. IEEE Transactions on
Reliability 38, 444-448.
Lee, L. (1979). Multivariate distributions having Weibull properties. Journal of Multivariate
Analysis 9, 267-277.
Mardia, K. V. (1970). Families of Bivariate Distributions. London: Charles W. Griffin.
Marshall, A. W. and Olkin, I. (1988). Families of multivariate distributions. Journal of the American
Statistical Association 83, 834-841.
Nelsen, R. B. (1986). Properties of a one-parameter family of bivariate distributions with specified
marginals. Communications in Statistics, Part A 15, 3277-3285.
Nelson, W. B. (1972). Theory and applications of hazard plotting for censored failure data.
Technometrics 14, 945-965.
Oakes, D. (1982a). A model for association in bivariate survival data. Journal of the Royal
Statistical Society, Series B 44, 414-422.
Oakes, D. (1982b). A concordance test for independence in the presence of censoring. Biometrics
38, 451-455.
Oakes, D. (1986). Semiparametric inference in a model for association in bivariate survival data.
Biometrika 73, 353-361.
Oakes, D. (1989). Bivariate survival models induced by frailties. Journal of the American Statistical
Association 84, 487-493.
Oakes, D. and Manatunga, A. K. (1992). Fisher information for a bivariate extreme value
distribution. Biometrika 79, 827-832.
Pepe, M. S. (1991). Inference for events with dependent risks in multiple endpoint studies. Journal
of the American Statistical Association 86, 770-778.
Plackett, R. L. (1965). A class of bivariate distributions. Journal of the American Statistical
Association 60, 516-522.
Prentice, R. L. and Cai, J. (1992). Covariance and survivor function estimation using censored
multivariate failure time data. Biometrika 79, 495-512.
Ripley, B. D. (1987). Stochastic Simulation. New York: Wiley.
Susarla, V. and Van Ryzin, J. (1976). Nonparametric Bayesian estimation of survival curves from
incomplete observations. Journal of the American Statistical Association 71, 897-902.

Received August 1993; revised August 1994; accepted January 1995.

APPENDIX A

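Proof of Theorem 1
Expanding the score function $U^*_{\beta_1}$ in a Taylor series around $\beta_{10}$ and evaluating it at
$\beta_1 = \hat\beta_1$, we get
$$U^*_{\beta_1}(\hat\beta_1) = 0 = U^*_{\beta_1}(\beta_{10}) - A^*_{\beta_1}(\beta_{10})(\hat\beta_1 - \beta_{10}) + o_p(n^{1/2}),$$
where $A^*_{\beta_1}(\beta_{10}) = -\partial U^*_{\beta_1}(\beta_1)/\partial\beta_1$ evaluated at $\beta_1 = \beta_{10}$. Similarly,
$$U^*_{\beta_2}(\hat\beta_2) = 0 = U^*_{\beta_2}(\beta_{20}) - A^*_{\beta_2}(\beta_{20})(\hat\beta_2 - \beta_{20}) + o_p(n^{1/2}),$$
where $A^*_{\beta_2}(\beta_{20}) = -\partial U^*_{\beta_2}(\beta_2)/\partial\beta_2$ evaluated at $\beta_2 = \beta_{20}$, and
$$U_\alpha(\hat\theta) = 0 = U_\alpha(\theta_0) - B_{\alpha,1}(\theta_0)(\hat\beta_1 - \beta_{10}) - B_{\alpha,2}(\theta_0)(\hat\beta_2 - \beta_{20}) - B_\alpha(\theta_0)(\hat\alpha - \alpha_0) + o_p(n^{1/2}),$$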

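where
$$B_{\alpha,1}(\theta_0) = -\left.\frac{\partial U_\alpha(\theta)}{\partial\beta_1}\right|_{\theta=\theta_0}, \qquad
B_{\alpha,2}(\theta_0) = -\left.\frac{\partial U_\alpha(\theta)}{\partial\beta_2}\right|_{\theta=\theta_0}, \qquad
B_\alpha(\theta_0) = -\left.\frac{\partial U_\alpha(\theta)}{\partial\alpha}\right|_{\theta=\theta_0}.$$

By the law of large numbers, as $n \to \infty$, $A^*_{\beta_1}/n$, $A^*_{\beta_2}/n$, $B_{\alpha,1}/n$, $B_{\alpha,2}/n$, and $B_\alpha/n$ converge to
$I^*_{11}$, $I^*_{22}$, $I_{31}$, $I_{32}$, and $I_{33}$, respectively. Hence, $(U^*_{\beta_1}(\beta_{10}), U^*_{\beta_2}(\beta_{20}), U_\alpha(\theta_0))'/\sqrt{n}$ is asymptotically
equivalent to $P\sqrt{n}(\hat\theta - \theta_0)$, where
$$P = \begin{pmatrix} I^*_{11} & 0 & 0 \\ 0 & I^*_{22} & 0 \\ I_{31} & I_{32} & I_{33} \end{pmatrix}.$$

Next we show that the covariance between $U^*_{\beta_1}$ and $U_\alpha$ equals 0. It suffices to show the covariance
equals 0 for $n = 1$. Let $h_\alpha(x_1, x_2, \delta_1, \delta_2)$ denote the joint density of $(X_1, \delta_1)$ and $(X_2, \delta_2)$. Then
$$\begin{aligned}
\operatorname{cov}(U^*_{\beta_1}, U_\alpha) &= \iint U^*_{\beta_1} U_\alpha\, h_\alpha(u_1, u_2, \delta_1, \delta_2)\, du_1\, du_2 \\
&= \int U^*_{\beta_1} \int \frac{\partial}{\partial\alpha} h_\alpha(u_1, u_2, \delta_1, \delta_2)\, du_2\, du_1 \\
&= \int U^*_{\beta_1} \cdot 0\, du_1 \\
&= 0.
\end{aligned}$$

Similarly, the covariance between $U^*_{\beta_2}$ and $U_\alpha$ also equals 0. By the central limit theorem,
$(U^*_{\beta_1}(\beta_{10}), U^*_{\beta_2}(\beta_{20}), U_\alpha(\theta_0))'/\sqrt{n}$ converges to multivariate normal with mean zero and
variance-covariance matrix $Q$, where
$$Q = \begin{pmatrix} I^*_{11} & I^*_{12} & 0 \\ I^*_{21} & I^*_{22} & 0 \\ 0 & 0 & I_{33} \end{pmatrix}.$$

Thus, $\sqrt{n}(\hat\theta - \theta_0)$ converges to multivariate normal with mean vector zero and variance-covariance
matrix $P^{-1}Q(P^{-1})'$. After matrix multiplications, one obtains the asymptotic variance of
$\sqrt{n}(\hat\alpha - \alpha_0)$ as the lower right corner element of $P^{-1}Q(P^{-1})'$. This completes the proof.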
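
The final matrix step can be checked numerically. The sketch below treats the scalar case (one parameter per margin), builds P and Q as in the proof from hypothetical information values, and reads off the lower right element of P^{-1} Q (P^{-1})'. The helper names and the input numbers are assumptions made only for illustration.

```python
def invert(m):
    """Gauss-Jordan inverse of a small square matrix given as nested lists."""
    n = len(m)
    aug = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matmul(a, b):
    """Plain matrix product of nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def alpha_variance(i11, i12, i22, i31, i32, i33):
    """Lower right element of P^{-1} Q (P^{-1})' for scalar margin parameters."""
    p = [[i11, 0.0, 0.0], [0.0, i22, 0.0], [i31, i32, i33]]
    q = [[i11, i12, 0.0], [i12, i22, 0.0], [0.0, 0.0, i33]]
    p_inv = invert(p)
    cov = matmul(matmul(p_inv, q), transpose(p_inv))
    return cov[2][2]


if __name__ == "__main__":
    # Hypothetical information values, for illustration only.
    print(alpha_variance(2.0, 0.5, 3.0, 0.4, -0.3, 1.5))
```

In this scalar case the result agrees with the closed form 1/I33 + (I31^2/I11 + I32^2/I22 + 2 I31 I32 I12/(I11 I22))/I33^2, so the variance reduces to 1/I33 when I31 = I32 = 0.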

APPENDIX B

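Proof of Theorem 2
Expanding the score function $U_\alpha(\alpha; \hat S_1, \hat S_2)$ in a Taylor series around $\alpha_0$ and evaluating it at
$\alpha = \alpha^*$, we get
$$U_\alpha(\alpha^*; \hat S_1, \hat S_2) = 0 = U_\alpha(\alpha_0; \hat S_1, \hat S_2) + (\alpha^* - \alpha_0) \sum_{j=1}^{n} V_\alpha(\alpha_0, \hat S_1(X_{1j}), \hat S_2(X_{2j})) + o_p(n^{1/2}).$$

Thus
$$\sqrt{n}(\alpha^* - \alpha_0) \approx \frac{-U_\alpha(\alpha_0; \hat S_1, \hat S_2)/\sqrt{n}}{\sum_{j=1}^{n} V_\alpha(\alpha_0, \hat S_1(X_{1j}), \hat S_2(X_{2j}))/n}.$$

Since $\hat S_1(\cdot)$ converges in probability to $S_1(\cdot)$ uniformly in $[0, t_{01}]$, $\hat S_2(\cdot)$ converges to $S_2(\cdot)$
uniformly in $[0, t_{02}]$ (Fleming and Harrington, 1991, p. 115), and $V_\alpha(\alpha, u, v)$ is a continuous
function of $u$ and $v$, $|V_\alpha(\alpha_0, \hat S_1(t_1), \hat S_2(t_2)) - V_\alpha(\alpha_0, S_1(t_1), S_2(t_2))|$ converges in probability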

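to zero for $(t_1, t_2) \in A$. Thus $\sum_j V_\alpha(\alpha_0, \hat S_1(X_{1j}), \hat S_2(X_{2j}))/n$ is asymptotically equivalent to
$\sum_j V_\alpha(\alpha_0, S_1(X_{1j}), S_2(X_{2j}))/n$, which by the law of large numbers converges to $-\sigma^2$.
Next, we decompose $U_\alpha(\alpha_0, \hat S_1, \hat S_2)/\sqrt{n}$ into two terms $R_n(\alpha_0, \hat S_1, \hat S_2)$ and $Z_n(\alpha_0, \hat S_1, \hat S_2)$:
$$\begin{aligned}
\frac{1}{\sqrt{n}} U_\alpha(\alpha_0, \hat S_1, \hat S_2)
&= \sqrt{n} \int_A W_\alpha(\alpha_0, \hat S_1(t_1), \hat S_2(t_2))\, dH_n(t_1, t_2, \delta_1, \delta_2) \\
&= \sqrt{n} \int_A W_\alpha(\alpha_0, \hat S_1(t_1), \hat S_2(t_2))\, dH_{\alpha_0}(t_1, t_2, \delta_1, \delta_2) \\
&\quad + \sqrt{n} \int_A W_\alpha(\alpha_0, \hat S_1(t_1), \hat S_2(t_2))\, (dH_n - dH_{\alpha_0})(t_1, t_2, \delta_1, \delta_2) \\
&= R_n(\alpha_0, \hat S_1, \hat S_2) + Z_n(\alpha_0, \hat S_1, \hat S_2).
\end{aligned} \tag{16}$$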

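We further decompose $Z_n$ into two terms,
$$\begin{aligned}
Z_n(\alpha_0, \hat S_1, \hat S_2)
&= \sqrt{n} \int_A \left[ W_\alpha(\alpha_0, \hat S_1(t_1), \hat S_2(t_2)) - W_\alpha(\alpha_0, S_1(t_1), S_2(t_2)) \right] (dH_n - dH_{\alpha_0})(t_1, t_2, \delta_1, \delta_2) \\
&\quad + \sqrt{n} \int_A W_\alpha(\alpha_0, S_1(t_1), S_2(t_2))\, (dH_n - dH_{\alpha_0})(t_1, t_2, \delta_1, \delta_2).
\end{aligned}$$

Since $\hat S_1 \to S_1$, $\hat S_2 \to S_2$, $\sqrt{n}(H_n - H) = O_p(1)$, and $W_\alpha$ is continuous and bounded, by the
Dominated Convergence Theorem, the first term in $Z_n$ converges to 0. The second term of $Z_n$ is a
sum of i.i.d. random variables of mean zero and variance $\sigma^2$. By the central limit theorem it converges
to normal with mean zero and variance $\sigma^2$.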
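We will now derive asymptotic properties of $R_n(\alpha_0, \hat S_1, \hat S_2)$. Using Von Mises expansion on
$R_n(\alpha_0, \hat S_1, \hat S_2)$ around $S_1$, $S_2$, we get
$$\begin{aligned}
R_n(\alpha_0, \hat S_1, \hat S_2)
&\approx R_n(\alpha_0, S_1, S_2) + \sqrt{n} \int IC_1(t_1)\, d(\hat S_1 - S_1)(t_1) + \sqrt{n} \int IC_2(t_2)\, d(\hat S_2 - S_2)(t_2) \\
&= 0 + \sqrt{n} \int IC_1(t_1)\, d(\hat S_1 - S_1)(t_1) + \sqrt{n} \int IC_2(t_2)\, d(\hat S_2 - S_2)(t_2),
\end{aligned} \tag{17}$$

where $IC_1$, $IC_2$ are obtained by differentiating
$R_n(\alpha_0, (1 - \varepsilon_1) S_1 + \varepsilon_1 \hat S_1, (1 - \varepsilon_2) S_2 + \varepsilon_2 \hat S_2)$
with respect to $\varepsilon_1$ and $\varepsilon_2$ and evaluating at $\varepsilon_1 = \varepsilon_2 = 0$:
$$IC_1(t_1) = -\int_0^{t_1} \int_0^{t_{02}} V_{\alpha,1}(\alpha_0, S_1(u), S_2(t_2))\, h_{\alpha_0}(u, t_2, \delta_1, \delta_2)\, dt_2\, du,$$
$$IC_2(t_2) = -\int_0^{t_2} \int_0^{t_{01}} V_{\alpha,2}(\alpha_0, S_1(t_1), S_2(u))\, h_{\alpha_0}(t_1, u, \delta_1, \delta_2)\, dt_1\, du.$$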

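From the standard methodology of counting processes (e.g., Pepe, 1991), for $t \in [0, t_{01}]$,
$\sqrt{n}(\hat S_1(t) - S_1(t))$ is asymptotically equivalent to a sum of $n$ i.i.d. random variables
$\sum_j I^0_1(X_{1j}, \delta_{1j})(t)/\sqrt{n}$. Similarly, for $t \in [0, t_{02}]$, $\sqrt{n}(\hat S_2(t) - S_2(t))$ is asymptotically equivalent to
$\sum_j I^0_2(X_{2j}, \delta_{2j})(t)/\sqrt{n}$. Substituting $\sum_j I^0_1(X_{1j}, \delta_{1j})/\sqrt{n}$ for $\sqrt{n}(\hat S_1 - S_1)$ and
$\sum_j I^0_2(X_{2j}, \delta_{2j})/\sqrt{n}$ for $\sqrt{n}(\hat S_2 - S_2)$ and integrating by parts in (17) gives
$$R_n(\alpha_0, \hat S_1, \hat S_2) \approx \frac{1}{\sqrt{n}} \sum_{j=1}^{n} \left[ I_1(X_{1j}, \delta_{1j}, \alpha_0) + I_2(X_{2j}, \delta_{2j}, \alpha_0) \right],$$

which is a sum of $n$ i.i.d. random variables. For $i = 1, 2$, since $I^0_i$ is a martingale and $IC_i(\cdot)$ is a
deterministic function, the expectation of $I_i$ is zero. By the central limit theorem, $R_n(\alpha_0, \hat S_1, \hat S_2)$
converges to normal with mean zero and variance $\rho^2$. Consequently, $n^{1/2}(\alpha^* - \alpha_0)$ converges
to $(Z_1 + Z_2)/\sigma^2$, where $Z_1$ and $Z_2$ are normal with mean zero and variance $\sigma^2$ and $\rho^2$, respectively.
Note that $Z_1$ is asymptotically equivalent to $\sum_{j=1}^{n} W_\alpha(\alpha_0, S_1(X_{1j}), S_2(X_{2j}))/\sqrt{n}$, and $Z_2$ is
asymptotically equivalent to $(1/\sqrt{n}) \sum_{j=1}^{n} [I_1(X_{1j}, \delta_{1j}, \alpha_0) + I_2(X_{2j}, \delta_{2j}, \alpha_0)]$. Using the approach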

similar to that in Appendix A, one can show that $Z_1$ and $Z_2$ are uncorrelated. Hence, $n^{1/2}(\alpha^* - \alpha_0)$
converges to normal with mean zero and variance $(\sigma^2 + \rho^2)/\sigma^4$. This completes the proof.
Remark 1. If $W_\alpha(\alpha, S_1, S_2)$ or its derivative is not continuous at $S_i = 0, 1$, $i = 1, 2$, we can
smooth the Kaplan-Meier estimates, for example, by the Nelson estimator for the marginal
cumulative hazards or by the Bayesian nonparametric estimator of the survival of Susarla and Van Ryzin
(1976).
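
One way to read the first suggestion in Remark 1 is to replace the Kaplan-Meier estimate by exp(-H(t)), where H is the Nelson cumulative hazard estimate; unlike the Kaplan-Meier estimate, this stays strictly positive even when the largest observation is a failure. The sketch below computes both estimates on a toy sample. The function name and the data are hypothetical, and this reading of the remark is an assumption.

```python
import math


def km_and_nelson(times, events):
    """Kaplan-Meier and exp(-Nelson cumulative hazard) at each distinct failure time.

    times  : observed times (failure or censoring)
    events : 1 if the time is a failure, 0 if it is censored
    Returns a list of (time, km_survival, nelson_based_survival).
    """
    data = sorted(zip(times, events))
    n = len(data)
    km, cum_hazard = 1.0, 0.0
    out = []
    i = 0
    while i < n:
        t = data[i][0]
        at_risk = n - i  # subjects with observed time >= t
        deaths = 0
        j = i
        while j < n and data[j][0] == t:
            deaths += data[j][1]
            j += 1
        if deaths > 0:
            km *= 1.0 - deaths / at_risk
            cum_hazard += deaths / at_risk
            out.append((t, km, math.exp(-cum_hazard)))
        i = j
    return out


if __name__ == "__main__":
    # Toy sample: the second observation is censored.
    for row in km_and_nelson([1, 2, 3, 4], [1, 0, 1, 1]):
        print(row)
```

On this toy sample the Kaplan-Meier estimate drops to zero at the last failure time, while the hazard-based estimate remains positive, which keeps the plugged-in margins away from the boundary value 0.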

APPENDIX C

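Proof of Theorem 3
We first show $\hat\sigma^2$ converges to $\sigma^2$ in probability.
$$\begin{aligned}
|\hat\sigma^2 - \sigma^2| &\le \left| \int_A \left[ V_\alpha(\alpha^*, \hat S_1(t_1), \hat S_2(t_2)) - V_\alpha(\alpha_0, S_1(t_1), S_2(t_2)) \right] dH_n(t_1, t_2, \delta_1, \delta_2) \right| \\
&\quad + \left| \int_A V_\alpha(\alpha_0, S_1(t_1), S_2(t_2))\, d(H_n - H)(t_1, t_2, \delta_1, \delta_2) \right|.
\end{aligned} \tag{18}$$

Since $\alpha^*$ converges to $\alpha_0$ and $\hat S_1(t_1)$, $\hat S_2(t_2)$ to $S_1(t_1)$, $S_2(t_2)$, and $V_\alpha$ is continuous,
$V_\alpha(\alpha^*, \hat S_1(t_1), \hat S_2(t_2))$ converges in probability to $V_\alpha(\alpha_0, S_1(t_1), S_2(t_2))$ for $(t_1, t_2) \in A$. Since $V_\alpha$ is
bounded, by the Dominated Convergence Theorem, the first term in (18) converges to zero. By the
Glivenko-Cantelli theorem $H_n - H$ converges to zero in probability, so the second term in (18)
converges to zero by applying the Dominated Convergence Theorem. Hence $\hat\sigma^2$ converges to $\sigma^2$ in
probability.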
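We next show the consistency of $\hat\rho^2$.
$$\begin{aligned}
|\hat\rho^2 - \rho^2| &\le \left| \int_A \left[ \{\hat I_1(t_1, \delta_1, \alpha^*) + \hat I_2(t_2, \delta_2, \alpha^*)\}^2 - \{I_1(t_1, \delta_1, \alpha_0) + I_2(t_2, \delta_2, \alpha_0)\}^2 \right] dH_n(t_1, t_2, \delta_1, \delta_2) \right| \\
&\quad + \left| \int_A \{I_1(t_1, \delta_1, \alpha_0) + I_2(t_2, \delta_2, \alpha_0)\}^2\, d(H_n - H)(t_1, t_2, \delta_1, \delta_2) \right|.
\end{aligned} \tag{19}$$

To show that the first term in (19) converges to zero, it suffices to show that $\hat I_1$ converges to $I_1$, and
$\hat I_2$ to $I_2$.
$$|\hat I_1(t_1, \delta_1, \alpha^*) - I_1(t_1, \delta_1, \alpha_0)| \le |\hat I_1(t_1, \delta_1, \alpha^*) - \tilde I_1(t_1, \delta_1, \alpha_0)| + |\tilde I_1(t_1, \delta_1, \alpha_0) - I_1(t_1, \delta_1, \alpha_0)|, \tag{20}$$

where
$$\tilde I_1(x_1, \delta_1, \alpha_0) = \int_A V_{\alpha,1}(\alpha_0, S_1(t_1), S_2(t_2))\, I^0_1(x_1, \delta_1)(t_1)\, dH_n(t_1, t_2, \delta_1, \delta_2). \tag{21}$$

Since $\alpha^* \to \alpha_0$, $\hat S_1 \to S_1$, $\hat S_2 \to S_2$, and $V_{\alpha,1}$ is continuous, $V_{\alpha,1}(\alpha^*, \hat S_1(t_1), \hat S_2(t_2))$ converges
in probability to $V_{\alpha,1}(\alpha_0, S_1(t_1), S_2(t_2))$ for $(t_1, t_2) \in A$. By similar arguments, one can also show
that $\hat I^0_1(x_1, \delta_1)(t_1)$ converges in probability to $I^0_1(x_1, \delta_1)(t_1)$. Since $V_{\alpha,1}$ and $I^0_1$ are bounded, by
the Dominated Convergence Theorem, the first term in (20) converges in probability to zero. The
second term converges to zero in probability by applying the Glivenko-Cantelli Theorem and the
Dominated Convergence Theorem to $H_n - H$. By similar arguments, $\hat I_2$ converges in
probability to $I_2$. Hence, the first term in (19) converges in probability to zero. By applying the
Glivenko-Cantelli Theorem and the Dominated Convergence Theorem again, the second term in
(19) also converges to zero in probability. This completes the proof.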
