
Kuntjoro Harimurti

Center for Clinical Epidemiology and Evidence-Based Medicine (CEEBM) Cipto Mangunkusumo Hospital / Faculty of Medicine UI, Jakarta kuntjoro.harimurti01@ui.ac.id

Center for Clinical Epidemiology and Evidence-Based Medicine (CEEBM) Faculty of Medicine, University of Indonesia Cipto Mangunkusumo Hospital

Let us start with an example


A cohort study was conducted to determine the survival of HIV(+) patients with CD4+ counts <100/µL who were treated with a new antiretroviral (ARV) combination. The event of interest is death. The study started on January 1st 2001 and ended on December 31st 2005. Results of the observation:


During the study period, 15 HIV(+) patients were enrolled:


Study period: 1/1/01 to 31/12/05

Patient   Length of observation (months)   Outcome
A         34                               died
B         57                               alive at study end
C         20                               died
D         47                               died
E          2                               died
F         38                               died
G         14                               lost to follow-up
H         23                               lost to follow-up
I         21                               died
J         23                               died
K         12                               alive at study end
L          3                               died
M          1                               lost to follow-up
N          3                               alive at study end
O          2                               alive at study end


Applying the usual statistical methods to the given data

Mean survival time
can only be calculated from the subjects who experienced the event

Median survival time
requires that 50% of the subjects have experienced the event

Survival rate
what are the numerator and the denominator?

Survival at a specific time
the problem is determining the denominator: those who died? those still alive? What about subjects who withdrew or were lost to follow-up?
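To make the difficulty concrete, the following minimal Python sketch applies the "usual" summaries to the 15 observation lengths from the example. The variable names are illustrative, and each record is encoded as (months, died), where died=False means the observation was censored.

```python
# Observation lengths (months) for patients A..O from the example.
# died=True: death observed; died=False: censored (lost or alive at study end).
records = [
    (34, True), (57, False), (20, True), (47, True), (2, True),
    (38, True), (14, False), (23, False), (21, True), (23, True),
    (12, False), (3, True), (1, False), (3, False), (2, False),
]

death_times = [t for t, died in records if died]
all_times = [t for t, _ in records]

# Mean over deaths only: ignores the 7 censored patients entirely.
mean_deaths_only = sum(death_times) / len(death_times)

# Mean over everyone: treats censored times as if they were death times,
# so it understates how long patients actually survived.
mean_all = sum(all_times) / len(all_times)

# Crude proportion dead: ignores how long each patient was observed.
crude_death_fraction = len(death_times) / len(records)

print(mean_deaths_only, mean_all, round(crude_death_fraction, 3))
```

None of the three numbers uses the censored observations correctly, which is the gap that the survival methods discussed later are meant to fill.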


Why use survival analysis?


The usual descriptive and analytic statistical methods cannot be used, or are unsatisfactory, for survival data because:
subjects do not enter the study at the same time;
not all study subjects experience the event;
some subjects are lost to follow-up or withdraw;
at the end of the study, some subjects are still alive.


What Is Survival Analysis?


A collection of statistical procedures for analyzing data in which the outcome variable of interest is the time until an event occurs (time-to-event analysis)

Start of follow-up -> TIME (days, weeks, months, years) -> Event (death, disease, relapse, recovery)

Goals of survival analysis


To estimate and interpret survivor and/or hazard functions from survival data
To compare survivor and/or hazard functions
To assess the relationship of explanatory variables to survival time


Time as an outcome
Survival time: the time from the beginning of follow-up until an event occurs, or the age of the individual when the event occurs
Leukemia patients / time in remission (weeks)
Diabetes patients / time until heart disease (years)
Elderly (60+) population / time until death (years)
Etc.


Event
Any designated experience of interest that may happen to an individual:
Death
Disease incidence
Relapse from remission
Recovery
Etc.

Typically refers to failure (a negative event, e.g. death or relapse), but may be a positive event (e.g. recovery)
Usually only one event is of designated interest; when more than one event is considered, the problem is one of competing risks

Censored data
A key analytical problem in survival analysis
Censoring occurs when we have some information about an individual's survival time but do not know the survival time exactly
Censored data arise because we cannot follow every subject until an event occurs
Three reasons why censoring may occur:
The study ends
Loss to follow-up
Withdrawal from the study
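The three censoring reasons can be sketched in code as a rule that turns a patient's dates into a follow-up time plus an event indicator. This is only an illustration: the function name, the status labels and the example dates are assumptions made here, and the study end date is the one from the HIV example.

```python
from datetime import date

STUDY_END = date(2005, 12, 31)  # study end date from the HIV example

def follow_up(start, last_seen, status):
    """Return (days of follow-up, event indicator).

    status is one of 'event', 'lost', 'withdrawn', 'alive'
    (labels assumed for this sketch). The indicator is 1 only when
    the event was observed on or before the study end; otherwise
    the observation is censored (indicator 0).
    """
    end = min(last_seen, STUDY_END)          # nothing is observed after study end
    days = (end - start).days
    observed = status == "event" and last_seen <= STUDY_END
    return days, 1 if observed else 0

# Hypothetical patients, one per situation.
print(follow_up(date(2001, 1, 1), date(2001, 1, 31), "event"))   # event observed
print(follow_up(date(2002, 1, 1), date(2002, 1, 11), "lost"))    # lost to follow-up
print(follow_up(date(2005, 1, 1), date(2006, 6, 1), "event"))    # event after study end
```

In every censored case the follow-up time is still recorded; only the event indicator changes.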

Example: Leukemia patients in remission


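[Figure: timeline for a leukemia patient in remission. Follow-up begins at the start of remission. If relapse (the event) is observed, the follow-up time equals the relapse time. If the study ends, or the patient is lost to follow-up or withdraws, before relapse is observed, the follow-up time is censored and the relapse time is not known exactly.]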


Example: Leukemia patients in remission


[Figure: follow-up of six subjects (A to F) on a time axis in weeks, from study start to study end at week 12. Each subject's follow-up ends either with the event (relapse) or with censoring (study end, withdrawal, or loss to follow-up). The follow-up times shown are T=5, T=12, T=3.5, T=8, T=6 and T=3.5 weeks.]

Why are censored data important?


They can be used in analyzing survival data
Even though censored observations are incomplete, we have information on a censored person up to the time we lose track of that person
In survival analysis every piece of information about survival is important, so do not throw information away: use the censored data

Common Techniques in Survival Analysis


Actuarial (Cutler-Ederer) method
Kaplan-Meier (product-limit) method
Log rank test
Cox's proportional hazards model (Cox regression)


Actuarial Method
Used to determine survival over specific time intervals
The time interval chosen depends on the characteristics of the disease or effect
Conditions and assumptions in actuarial analysis:
The beginning of the observation should be clearly defined
The effect studied should be clearly defined
Withdrawal and loss to follow-up should be independent of the effect
The risk of experiencing the effect does not depend on the calendar year
The risk of experiencing the effect should be constant within the chosen interval
Censored patients are assumed to be at risk of the effect for half of the interval in which they are censored


Follow-up data of 15 HIV(+) patients; event=death


[Figure: timelines of patients A to O over the study period (1/1/01 to 31/12/05), each labelled with its length of observation in months and its outcome, as listed in the patient data above.]


We can rearrange the length of observation as if all observations started at the beginning of the study
[Figure: the timelines of patients A to O shown twice: on the calendar axis (dates, 1/1/01 to 31/12/05) and shifted to a common origin (years, 0 to 5).]

Re-arranged data
[Figure: timelines of patients A to O drawn from a common starting point, with the same lengths of observation (months) and outcomes as before; x-axis: length of follow-up (years, 0 to 5).]

Calculating the survival function with the actuarial method

$r_x = l_x - c_x/2$
$q_x = d_x / r_x$
$p_x = 1 - q_x$
$S_x = p_1 \times p_2 \times \dots \times p_x$

l_x = number alive at beginning of interval; c_x = number censored during interval; r_x = number at risk during interval; d_x = number of deaths during interval; q_x = death rate during interval; p_x = survival during interval; S_x = cumulative survival

Interval (year)   l_x   c_x   r_x   d_x   q_x    p_x    S_x
0-                15    4     13    2     0.15   0.85   0.85
1-                 9    2      8    3     0.38   0.63   0.53
2-                 4    0      4    1     0.25   0.75   0.40
3-                 3    0      3    2     0.67   0.33   0.13
4-                 1    1      0.5  0     0.00   1.00   0.13
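The actuarial calculation can be reproduced with a short Python sketch. It uses the day-level follow-up times given for the same 15 patients in the ordered data and assumes one-year intervals of 365 days; the function and variable names are illustrative.

```python
# (days of follow-up, died) for the 15 HIV(+) patients, ordered by time.
records = [
    (31, False), (60, True), (62, False), (86, True), (92, False),
    (356, False), (410, False), (590, True), (610, True), (680, False),
    (700, True), (1050, True), (1130, True), (1400, True), (1704, False),
]

def actuarial_table(records, width=365, n_intervals=5):
    """Life table rows: (interval, l_x, c_x, r_x, d_x, q_x, p_x, S_x)."""
    rows = []
    alive = len(records)      # l_x for the first interval
    cumulative = 1.0
    for k in range(n_intervals):
        lo, hi = k * width, (k + 1) * width
        deaths = sum(1 for t, died in records if died and lo <= t < hi)
        censored = sum(1 for t, died in records if not died and lo <= t < hi)
        at_risk = alive - censored / 2          # r_x = l_x - c_x / 2
        q = deaths / at_risk if at_risk > 0 else 0.0
        p = 1 - q
        cumulative *= p                         # S_x = p_1 * ... * p_x
        rows.append((k, alive, censored, at_risk, deaths, q, p, cumulative))
        alive -= deaths + censored              # l_x for the next interval
    return rows

for k, l, c, r, d, q, p, s in actuarial_table(records):
    print(f"{k}-  l={l:2d} c={c} r={r:4.1f} d={d} q={q:.2f} p={p:.2f} S={s:.2f}")
```

The half-subtraction of censored patients is the only place where the actuarial method differs from a plain proportion; changing `width` shows how sensitive the estimate is to the choice of interval.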


Survival Curve (Actuarial Method)


[Figure: actuarial survival curve. y-axis: probability of survival; x-axis: survival time (years). Plotted cumulative survival values: 0.85, 0.53, 0.40, 0.13, 0.13.]


Some people talk in their sleep. Lecturers talk while other people sleep. (Albert Camus)

Introduction to Kaplan-Meier

Kaplan-Meier Method
The most common method for survival analysis is Kaplan-Meier (product-limit) estimation
The technique re-estimates the survival probability every time an event occurs
The rates are based on the number of individuals alive at the start of each time interval
These counts of people at risk vary with the number of censored records and the number of events
It is used to estimate the survival curve from observed survival times without assuming an underlying probability distribution

Kaplan-Meier Method
The probability of surviving k or more periods from entering the study is the product of the k observed survival rates for each period (i.e. the cumulative proportion surviving):

$S(k) = p_1 \times p_2 \times p_3 \times \dots \times p_k$

S = survival function; p = proportion surviving in a given period

Proportion surviving period i, having survived up to period i:

$p_i = \frac{r_i - d_i}{r_i}$

p_i = proportion surviving in period i; r_i = number alive at the beginning of the period; d_i = number of deaths within the period

Re-arranged data from HIV(+) study


[Figure: timelines of patients A to O drawn from a common starting point, with the same lengths of observation (months) and outcomes as before; x-axis: length of follow-up (months, 0 to 60).]

Ordered data
Patient   Length of observation (days)   Outcome
M           31                           lost to follow-up
E           60                           died
O           62                           alive at study end
L           86                           died
N           92                           alive at study end
K          356                           alive at study end
G          410                           lost to follow-up
C          590                           died
I          610                           died
H          680                           lost to follow-up
J          700                           died
A         1050                           died
F         1130                           died
D         1400                           died
B         1704                           alive at study end

[Figure: the same patients as timelines ordered by length of follow-up; x-axis: length of follow-up (days, 0 to 1825).]

Patient   Survival time (days)   Number known to be alive (r_i)   Deaths (d_i)   Proportion surviving (p_i = [r_i - d_i]/r_i)   Cumulative proportion surviving (S[t])
-            0                                                                                                                  1.000
M           31+
E           60                   14                               1              (14-1)/14 = 0.929                              0.929
O           62+
L           86                   12                               1              (12-1)/12 = 0.917                              0.917 x 0.929 = 0.852
N           92+
K          356+
G          410+
C          590                    8                               1              (8-1)/8 = 0.875                                0.875 x 0.852 = 0.746
I          610                    7                               1              (7-1)/7 = 0.857                                0.857 x 0.746 = 0.640
H          680+
J          700                    5                               1              (5-1)/5 = 0.800                                0.800 x 0.640 = 0.512
A         1050                    4                               1              (4-1)/4 = 0.750                                0.750 x 0.512 = 0.384
F         1130                    3                               1              (3-1)/3 = 0.667                                0.667 x 0.384 = 0.256
D         1400                    2                               1              (2-1)/2 = 0.500                                0.500 x 0.256 = 0.128
B         1704+

+ = censored observation
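The product-limit calculation can be sketched in a few lines of Python using the day-level data of the HIV(+) study. The function name and the (time, event) encoding, with event=1 for death and 0 for a censored observation, are choices made for this sketch.

```python
# (days, event) for patients M, E, O, L, N, K, G, C, I, H, J, A, F, D, B.
records = [
    (31, 0), (60, 1), (62, 0), (86, 1), (92, 0), (356, 0), (410, 0),
    (590, 1), (610, 1), (680, 0), (700, 1), (1050, 1), (1130, 1),
    (1400, 1), (1704, 0),
]

def kaplan_meier(records):
    """Return (time, r_i, d_i, p_i, S) for each distinct death time."""
    steps = []
    survival = 1.0
    for t in sorted({u for u, e in records if e == 1}):
        at_risk = sum(1 for u, _ in records if u >= t)             # r_i
        deaths = sum(1 for u, e in records if u == t and e == 1)   # d_i
        p = (at_risk - deaths) / at_risk                           # p_i
        survival *= p                                              # S(t)
        steps.append((t, at_risk, deaths, p, survival))
    return steps

for t, r, d, p, s in kaplan_meier(records):
    print(f"t={t:5d}  r={r:2d}  d={d}  p={p:.3f}  S={s:.3f}")
```

Censored patients never appear as rows of their own; they only reduce the number at risk for later death times. The cumulative values printed here can differ from a hand calculation in the third decimal place when the hand calculation rounds at every step.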


Kaplan-Meier Curve
[Figure: Kaplan-Meier curve for the HIV(+) study. y-axis: probability of survival; x-axis: survival time (years, 0 to 5). The curve steps down at days 60, 86, 590, 610, 700, 1050, 1130 and 1400.]

Example


Calculation of the Kaplan-Meier estimate of the survival function for treatment 1

Patient number   Survival time (days)   Number known to be alive (r_i)   Deaths (d_i)   Proportion surviving (p_i = [r_i - d_i]/r_i)   Cumulative proportion surviving (S[t])
-                 0
6                 8                     7                                1              (7-1)/7 = 0.857                                1 x 0.857 = 0.857
9                12                     6                                1              (6-1)/6 = 0.833                                0.857 x 0.833 = 0.714
10               15+
12               25+
13               37                     3                                1              (3-1)/3 = 0.667                                0.714 x 0.667 = 0.476
14               55                     2                                1              (2-1)/2 = 0.500                                0.500 x 0.476 = 0.238
15               72+


Plot of the survival curve for treatment 1


[Figure: Kaplan-Meier survival curve for treatment 1. y-axis: probability of survival (0 to 1.0); x-axis: survival time (days, 0 to 60).]


Calculation of the Kaplan-Meier estimate of the survival function for treatment 2

Patient number   Survival time (days)   Number known to be alive (r_i)   Deaths (d_i)   Proportion surviving (p_i = [r_i - d_i]/r_i)   Cumulative proportion surviving (S[t])
-                 0                     8
1, 2              1                     8                                2              (8-2)/8 = 0.750                                1 x 0.750 = 0.750
3                 4                     6                                1              (6-1)/6 = 0.833                                0.750 x 0.833 = 0.625
4                 5                     5                                1              (5-1)/5 = 0.800                                0.625 x 0.800 = 0.500
5                 6+
7                 9                     3                                1              (3-1)/3 = 0.667                                0.500 x 0.667 = 0.333
8                 9+
11               22                     1                                1              (1-1)/1 = 0                                    0.333 x 0 = 0


Plot of the survival curve for treatment 2


[Figure: Kaplan-Meier survival curve for treatment 2. y-axis: probability of survival (0 to 1.0); x-axis: survival time (days, 0 to 60).]


Estimating and comparing survival curves for the two treatment groups using the Kaplan-Meier method

[Figure: Kaplan-Meier survival curves for Treatment 1 and Treatment 2 on the same axes. y-axis: probability of survival (0 to 1.0); x-axis: survival time (days, 0 to 60). The median survival times are read off where each curve crosses 0.5.]

Median survival time for Treatment Group 1 = 37 days
Median survival time for Treatment Group 2 = 5 days
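Reading the median off a Kaplan-Meier curve can be sketched as follows. The survival times are the ones recoverable from the two treatment tables (event=1 for death, 0 for a censored "+" time), and the median is taken here as the earliest time at which the estimated survival is 0.5 or less; other conventions exist when the curve sits exactly at 0.5. Exact fractions are used so that the comparison with 0.5 is not affected by floating-point rounding.

```python
from fractions import Fraction

def km_curve(records):
    """Return [(time, S)] at each distinct death time."""
    survival = Fraction(1)
    curve = []
    for t in sorted({u for u, e in records if e == 1}):
        at_risk = sum(1 for u, _ in records if u >= t)
        deaths = sum(1 for u, e in records if u == t and e == 1)
        survival *= Fraction(at_risk - deaths, at_risk)
        curve.append((t, survival))
    return curve

def median_survival(records):
    """Earliest time with S(t) <= 0.5, or None if the curve never gets there."""
    for t, s in km_curve(records):
        if s <= Fraction(1, 2):
            return t
    return None

treatment_1 = [(8, 1), (12, 1), (15, 0), (25, 0), (37, 1), (55, 1), (72, 0)]
treatment_2 = [(1, 1), (1, 1), (4, 1), (5, 1), (6, 0), (9, 1), (9, 0), (22, 1)]

print(median_survival(treatment_1))  # 37
print(median_survival(treatment_2))  # 5
```

When fewer than half of the subjects are estimated to have died by the end of follow-up, the function returns None, which mirrors the fact that the median cannot be estimated from such data.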


Comparing survival curves of two groups using the log rank test
Log rank test: a statistical hypothesis test for comparing two survival curves
Null hypothesis: there is no difference between the population survival curves
It can be calculated manually or with a statistical software package


Calculation of log-rank test

O1 and O2 = total numbers of observed events in groups 1 and 2
E1 and E2 = total numbers of expected events in groups 1 and 2


            Event: Yes   Event: No
Group 1     a            b
Group 2     c            d

E(a) = (a+b)(a+c)/(a+b+c+d)
E(b) = (a+b)(b+d)/(a+b+c+d), etc.

The observed and expected counts lead to the P value.
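The expected-count formulas can be checked with a small Python helper. The function name and the example counts are hypothetical; the cell labels a, b, c, d follow the 2x2 layout above.

```python
def expected_counts(a, b, c, d):
    """Expected cell counts for a 2x2 table under no association.

    Each expected count is (row total x column total) / grand total.
    """
    n = a + b + c + d
    return {
        "a": (a + b) * (a + c) / n,
        "b": (a + b) * (b + d) / n,
        "c": (c + d) * (a + c) / n,
        "d": (c + d) * (b + d) / n,
    }

# Hypothetical table: group 1 has 4 events and 3 non-events,
# group 2 has 6 events and 2 non-events.
expected = expected_counts(4, 3, 6, 2)
print({cell: round(value, 2) for cell, value in expected.items()})
```

The expected counts always add up to the grand total, and each row and column keeps its observed total.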


Cox's proportional hazards model


Enables the difference between the survival times of particular groups of patients to be tested while allowing for other factors: it handles more than one variable
The response (dependent) variable is the hazard: the probability of dying at a given time, given survival up to that time
The hazard ratio does not depend on time (it is the same at any time point)
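For reference, the model is usually written in the following form. The notation is an assumption made here rather than something taken from the slides: h_0(t) is the baseline hazard, x_1, ..., x_p are the explanatory variables and beta_1, ..., beta_p are their coefficients.

```latex
h(t \mid x_1, \dots, x_p) = h_0(t)\,\exp(\beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p)

\frac{h(t \mid x_1 = 1, x_2, \dots, x_p)}{h(t \mid x_1 = 0, x_2, \dots, x_p)} = e^{\beta_1}
```

The second line shows why the hazard ratio for a binary variable does not depend on time: the baseline hazard h_0(t) cancels, leaving only the constant e^{beta_1}.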


An example from the literature


Survival of patients with bronchiectasis after the first ICU stay for respiratory failure
Dupont et al. Chest 2004;125:1815-20

Objectives of the study: to assess the long-term outcomes and to identify the factors associated with reduced survival in patients with bilateral bronchiectasis admitted for the first time to the ICU for respiratory failure
Study period: 10 years (January 1990 to March 2000), retrospective
Time variable: days after ICU admission
Event: death

Survival of patients with bronchiectasis after the first ICU stay for respiratory failure
Dupont et al. Chest 2004;125:1815-20

The Kaplan-Meier estimates of survival for (a) age > 65 years or ≤ 65 years, and (b) long-term oxygen therapy (LTOT) before intensive care unit admission (yes/no). The P values are for the log rank test.



Conclusions
Survival analysis provides the special techniques required to compare the risks of an event associated with different treatment groups, where the risk changes over time
In measuring survival time, the start and end points must be clearly defined and censored observations noted
The actuarial and Kaplan-Meier methods estimate the survival curve
The log rank test provides a statistical comparison of two groups
Cox's proportional hazards model allows additional covariates to be included