Professional Documents
Culture Documents
Survival - Notes (Lecture 4)
Survival - Notes (Lecture 4)
December 1, 2021
dj
λ̂j =
rj
Survival Models and Analysis December 1, 2021 2 / 29
Survival Models and Analysis
Estimating the Survival Function
Other Justifications for KM Estimator
Proof:
g
d
Y
L(λ) = λj j (1 − λj )rj −dj
j=1
d
X
LogL(λ) = [log λj j (1 − λj )rj −dj ]
X
= [dj log λj + (rj − dj )log (1 − λj )]
∂logL(λ) dj rj − dj
= − =0
∂λ λj 1 − λj
dj r j − dj
=
rj 1 − λj
dj (1 − λj ) = λj (rj − dj )
dj − λj rj = 0
dj
λj =
rj
Survival Models and Analysis December 1, 2021 3 / 29
Survival Models and Analysis
Estimating the Survival Function
Other Justifications for KM Estimator
• Step (3): Moving from left to right, each time you encounter a censored
observation, distribute its mass to all times to its right.
• Step (4): Calculate Ŝj by subtracting the final weight for time j from
Ŝj−1
Example of “redistribute to the right” algorithm
Consider the event times: 2, 2.5+, 3, 3, 4, 4.5+, 5, 6, 7. The algorithm
goes as follows:
0.11
0.25 = 0.22 + 2( )
7
0.11
0.13 = 0.11 +
7
0.13
0.17 = 0.13 + ...
3
Step (4):
Greenwood’s formula
λ̂j (1 − λ̂j )
Var (λ̂j ) =
rj
dj dj
rj 1 − rj
=
rj
dj (rj − dj )
=
rj2
(1 − λˆj )
Y
Var (Ŝ(t)) = Var
j:aj <t
Y
log [Ŝ(t)] = log (1 − λ̂j )
j
X
= log (1 − λ̂j )
j
dj
X λ̂j X rj
= = d
j
(1 − λ̂j )r j j (1 − rjj )rj
X dj
=
rj (rj − dj
j
h i
Now Ŝ(t) = exp log Ŝ(t)
Var (Ŝ(t)) = Var
h (exp(log Ŝ(t)))
i
By (B) Z ∼ N e µ , [e µ ]2 σ 2
h i2
Var (Ŝ(t)) = e log Ŝ(t) var (log Ŝ(t))
h i2 X dj
= Ŝ(t) Greenwood’s formula
rj (rj − dj
j
Confidence intervals
Problem:
This approach can yield values > 1 or < 0. Better approach: Get a 95%
confidence interval for
• We take the square root of the above to get se(L̂(t)), and then form the
confidence intervals as:
±1.96se(L̂(t))
Ŝ(t)e
This is the approach that Stata uses. R gives an option to calculate these
bounds (conf.type="log-log" in surv.fit
Survival Models and Analysis December 1, 2021 16 / 29
Survival Models and Analysis
Estimating the Survival Function
Properties of the KM Estimator
but R will also give either of the other two options if you request them.
Survival Models and Analysis December 1, 2021 18 / 29
Survival Models and Analysis
Estimating the Survival Function
Properties of the KM Estimator
R Commands
time<-c(6, 6, 6, 6, 7, 9, 10, 10, 11, 13, 16, 17, 19, 20, 22,
23, 25, 32, 32, 34, 35) Creating the time variable.
status<-c(0,1,1,1,1,0,0,1,0,1,1,0,0,0,1,1,0,0,0,0,0) Creating
the event variable
Survival Models and Analysis December 1, 2021 19 / 29
Survival Models and Analysis
Estimating the Survival Function
Properties of the KM Estimator
summary(fit)
Stata Commands
Create a file called “leukemia.dat” with the raw data, with a column for
treatment, weeks to relapse (i.e., duration of remission), and relapse
status:
If the dataset has already been created and loaded into Stata, then you
can substitute the following commands for initializing the data:
time failtime*censor(1);
time failtime*failind(0);
The first variable is the failure time, and the second is the failure or
censoring indicator. In parentheses you need to put the specific numeric
value that corresponds to censoring.
The upper and lower confidence limits on Ŝ(t) are included in the data set
“OUTSURV” when specified. The upper and lower limits are called:
sdf ucl, sdf lcl.
SAS Commands
Ŝ(22) = 0.5378
Ŝ(23) = 0.4482
The median is thus 23. This can also be seen visually on the graph to the
left.
Lower quartile (25th percentile):
the smallest time (LQ) such that Ŝ(LQ) ≤ 0.75
Upper quartile (75th percentile):
the smallest time (UQ) such that Ŝ(UQ) ≤ 0.25
Survival Models and Analysis December 1, 2021 29 / 29