
Statistics Seminar, 8th talk:

Nonparametric estimation for interval censored data


Martina Albers, Nanina Anderegg, Urs Müller
Monday, May 2, 2011

1 Interval Censoring

Current Status Censoring / Interval Censoring Case 1:

• X : the failure time, where X ∼ F


• T : observation time, where T ∼ G
• X is independent of T

• n observations which are iid copies of (T, ∆) = (T, 1{X ≤ T })


The goal is to estimate the distribution function of X , i.e. F (x) = P [X ≤ x].
Interval Censoring Case k : (here for k = 2)

• X : the failure time, where X ∼ F

• (T1 , T2 ): observation times, where (T1 , T2 ) ∼ G


• X is independent of (T1 , T2 )
• n observations which are iid copies of

(T, ∆) = ((T1 , T2 ) , 1{X ≤ T1 }, 1{T1 < X ≤ T2 }, 1{X > T2 })

The goal is again to estimate the distribution function of X .


Mixed Case Interval Censoring: Instead of having a fixed k, the number
of observation times may also vary from subject to subject; e.g., the first patient is
tested twice, the second patient three times, and the third patient only once. We
therefore define a random variable K denoting the number of observation times.
Bivariate Interval Censored Data:

• (X, Y ): the failure times, where (X, Y ) ∼ F


• U = (U1 , U2 ), V = (V1 , V2 ): observation times, where (U, V ) ∼ G

• (X, Y ) are independent of (U, V )


• n observations which are iid copies of (U, V, ∆), where

∆ = (∆11 , ∆12 , ∆13 , ∆21 , ∆22 , ∆23 , ∆31 , ∆32 , ∆33 )

The variable ∆ij is defined as ∆ij = 1{(X, Y ) ∈ Rij }, where the rectangles Rij are

R11 = (0, U1 ] × (0, V1 ]    R12 = (U1 , U2 ] × (0, V1 ]    R13 = (U2 , ∞) × (0, V1 ]
R21 = (0, U1 ] × (V1 , V2 ]  R22 = (U1 , U2 ] × (V1 , V2 ]  R23 = (U2 , ∞) × (V1 , V2 ]
R31 = (0, U1 ] × (V2 , ∞)    R32 = (U1 , U2 ] × (V2 , ∞)    R33 = (U2 , ∞) × (V2 , ∞)    (1)

The goal is again to estimate the distribution function of (X, Y ), i.e.
F (x, y) = P [X ≤ x, Y ≤ y].
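The indicator construction above can be sketched in code. The function name and the 3×3 matrix layout (rows indexed by the V-intervals, columns by the U-intervals, matching the Rij) are choices made only for this illustration.

```python
import numpy as np

def bivariate_delta(x, y, u1, u2, v1, v2):
    """Return the 3x3 indicator matrix (Delta_ij) = 1{(X, Y) in R_ij},
    where the R_ij are the rectangles formed by the cut points
    0 < U1 < U2 on the x-axis and 0 < V1 < V2 on the y-axis."""
    # Column index from X: (0, U1], (U1, U2], (U2, inf)
    col = 0 if x <= u1 else (1 if x <= u2 else 2)
    # Row index from Y: (0, V1], (V1, V2], (V2, inf)
    row = 0 if y <= v1 else (1 if y <= v2 else 2)
    delta = np.zeros((3, 3), dtype=int)
    delta[row, col] = 1   # (X, Y) lies in exactly one rectangle
    return delta
```

Exactly one entry of ∆ is 1, since the nine rectangles partition the positive quadrant.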

2 The Nonparametric MLE for Current Status Data

The likelihood of n iid observations (ti , δi ), i = 1, . . . , n, can be written as

∏_{i=1}^{n} F (ti )^{δi} (1 − F (ti ))^{1−δi} g(ti ),    (2)

and thus the nonparametric MLE F̂ maximizes

Ln (F ) = ∏_{i=1}^{n} F (ti )^{δi} (1 − F (ti ))^{1−δi}    (3)

and is well-defined. By using the notation of observed sets

Ri = (0, ti ] if δi = 1,   Ri = (ti , ∞) if δi = 0,   i = 1, . . . , n,    (4)

the likelihood of the nonparametric MLE F̂ can be rewritten as

Ln (F ) = ∏_{i=1}^{n} PF (Ri ),    (5)

where PF (Ri ) is the probability under the distribution F that X ∈ Ri .
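A minimal sketch of evaluating the log of (5) for current status data, assuming a candidate F supplied as a plain function; the names and the exponential example are illustrative only, not from the notes.

```python
import numpy as np

def log_likelihood(F, data):
    """Compute log Ln(F) = sum_i log P_F(R_i) for current status data,
    where F is a function t -> F(t) and data is a list of (t_i, delta_i).
    P_F(R_i) = F(t_i)     if delta_i = 1  (R_i = (0, t_i]),
             = 1 - F(t_i) if delta_i = 0  (R_i = (t_i, inf))."""
    ll = 0.0
    for t, d in data:
        p = F(t) if d == 1 else 1.0 - F(t)
        ll += np.log(p)
    return ll

# Hypothetical example: candidate F = Exp(1) cdf on three observations.
data = [(0.5, 1), (1.0, 0), (2.0, 1)]
F = lambda t: 1.0 - np.exp(-t)
ll = log_likelihood(F, data)
```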

3 Finite sample properties and computation of the MLE

Reducing the optimization problem:

The optimization problem

sup_{F ∈ F} ln (F )  with  ln (F ) = log Ln (F ) = Σ_{i=1}^{n} log PF (Ri ),    (6)

where F is the space of all distribution functions on the appropriate space, is an
infinite dimensional optimization problem. We can reduce it to a finite dimensional
optimization problem by looking at the maximal intersections A1 , . . . , Am of
the observed sets R1 , . . . , Rn , i.e. the areas where there is maximal overlap of
the observed sets. Let α1 , . . . , αm be the masses assigned to the corresponding
sets A1 , . . . , Am . It can be shown that

ln (α) = Σ_{i=1}^{n} log (C^T α)_i ,    (7)

where C is an m × n matrix, called the clique matrix, with entries Cji = 1{Aj ⊆
Ri }. We then get a finite dimensional convex optimization problem:

ln (α̂) = max_{α ∈ A} ln (α),    (8)

where A = {α ∈ R^m : αj ≥ 0, j = 1, . . . , m, Σ_{j=1}^{m} αj = 1}.    (9)

Existence and (non-)uniqueness of the MLE:

Theorem 1. The MLE α̂ defined by (8) exists.

Let PF (R) denote the vector (PF (R1 ), . . . , PF (Rn )).

Theorem 2. The log likelihood (8) is strictly concave in PF (R). Thus, the MLE
estimates the probabilities PF (R1 ), . . . , PF (Rn ) of the observation rectangles
uniquely.

However, the log likelihood is concave in F and α, but not strictly concave,
which means that two different functions F1 , F2 ∈ F can yield the same vector
PF (R). Similarly, two different α1 , α2 ∈ A can yield the same vector PF (R).
Thus, we cannot estimate F or α uniquely. See the slides for an example.

Theorem 3. The MLE α̂ is unique if the clique matrix C has rank m.

4 Characterization and convex minorants for Current Status Data

Let T(1) , . . . , T(n) denote the order statistics of T1 , . . . , Tn , and let ∆(1) , . . . , ∆(n)
be the corresponding ∆ values, i.e., ∆(i) = ∆j if T(i) = Tj . Furthermore, let
Y = {y ∈ R^n : 0 < y1 ≤ · · · ≤ yn < 1}, and define ŷ ∈ Y by ŷi ≡ F̂n (T(i) ).
Proposition 1. ([GW92], Proposition 1.1, page 39) The vector ŷ ∈ Y is the
MLE if and only if

Σ_{i≥j} ( ∆(i) /ŷi − (1 − ∆(i) )/(1 − ŷi ) ) ≤ 0,  for all j = 1, . . . , n,    (10)

Σ_{i=1}^{n} ( ∆(i) /ŷi − (1 − ∆(i) )/(1 − ŷi ) ) ŷi = 0.    (11)

Corollary 1. The vector ŷ ∈ Y is the MLE if and only if

Σ_{i<j} ( ∆(i) − ŷi ) ≥ 0,  for all j = 1, . . . , n + 1,    (12)

and equality holds if ŷj > ŷj−1 (with ŷ0 = 0 and ŷn+1 = 1).
Proposition 2. Let P = {Pi = (i, Σ_{j≤i} ∆(j) ), i = 0, . . . , n}. Let H be the
greatest convex minorant of P . Then ŷ is the MLE if and only if, for all i =
1, . . . , n, ŷi equals the left derivative of H at i.
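Proposition 2 translates directly into the pool-adjacent-violators algorithm (PAVA): the left derivative of the greatest convex minorant of the cumulative sum diagram equals the isotonic regression of the ordered ∆ values. This equivalence is standard, but the implementation below (and its names) is our sketch, not part of the notes.

```python
import numpy as np

def current_status_mle(t, delta):
    """Compute the MLE (y_1, ..., y_n) for current status data as the
    left derivative of the greatest convex minorant of the cumulative
    sum diagram P_i = (i, sum_{j<=i} Delta_(j)), via PAVA."""
    order = np.argsort(t)
    d = np.asarray(delta, dtype=float)[order]  # Delta values sorted by T
    # PAVA: keep blocks of (mean, size); pool while the means decrease.
    vals, sizes = [], []
    for di in d:
        vals.append(di)
        sizes.append(1)
        while len(vals) > 1 and vals[-2] >= vals[-1]:
            s = sizes[-2] + sizes[-1]
            v = (vals[-2] * sizes[-2] + vals[-1] * sizes[-1]) / s
            vals[-2:] = [v]
            sizes[-2:] = [s]
    return np.repeat(vals, sizes)  # MLE at the ordered times T_(1), ..., T_(n)
```

For example, ordered indicators (0, 1, 0, 1) yield the MLE (0, 1/2, 1/2, 1): the middle violation is pooled into one block of mean 1/2.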

5 Asymptotic Theory

• The MLE for current status data is globally and locally consistent.

• The MLE for current status data converges globally and locally with rate
n1/3 to F0 .
• n^{1/3} (F̂n − F0 ) converges in distribution to the slope at the point 0 of the
greatest convex minorant of a Brownian motion plus a parabola.
The likelihood ratio test statistic has asymptotic distribution D = ∫ (S^2 (t) − S0^2 (t)) dt,
where S is the slope process of the greatest convex minorant of a two-sided
Brownian motion plus a parabola, and S0 is the slope process of the greatest
convex minorant of a two-sided Brownian motion plus a parabola under the
constraint that the slopes are ≥ 0 for all t ≥ 0 and ≤ 0 for all t ≤ 0.

Let λn (θ) be the likelihood ratio test statistic for the null hypothesis H0 : F (t0 ) = θ
against the alternative H1 : F (t0 ) ≠ θ. Then, for 0 < α < 1 and dα such that
P (D > dα ) = α, we obtain the confidence sets Cn,α = {θ : 2 log λn (θ) ≤ dα }.
Proposition 3. Suppose that F and G have densities f and g which are positive
and continuous in a neighbourhood of t0 . Then
PF,G (F (t0 ) ∈ Cn,α ) → P (D ≤ dα ) = 1 − α, as n → ∞.

References

[GW92] P. Groeneboom and J. A. Wellner, Information bounds and nonparametric
maximum likelihood estimation, Birkhäuser Verlag, Basel, 1992.

[Maa07a] Marloes H. Maathuis, Survival analysis for interval censored data,
part 1, Seminar of Statistics, ETH Zurich, 2007,
http://stat.ethz.ch/∼maathuis/teaching/fall07/notes1a.pdf (last accessed
April 29, 2011).

[Maa07b] Marloes H. Maathuis, Survival analysis for interval censored data,
part 2, Seminar of Statistics, ETH Zurich, 2007,
http://stat.ethz.ch/∼maathuis/teaching/fall07/notes2b.pdf (last accessed
April 29, 2011).
