
Study of Prognosis

Darwin Amir
Teaching Staff, Master's Program in Midwifery
Faculty of Medicine, Universitas Andalas
Objectives:

1. Review definitions.
2. Understand the concept of natural history and inception
cohort studies.
3. Define commonly used measures of prognosis.
4. Understand origins of bias in follow-up studies.
5. Understand basic statistical methodology for survival
data.
6. Define characteristics of an ideal prognostic study.
7. Understand the rationale and development of clinical
decision/prediction rules.

“Prediction is very difficult, especially
about the future”

-Niels Bohr

I. Definitions
• Prognosis: the prediction of the future course of
events following the onset of disease.
• can include death, complications, remission/recurrence,
morbidity, disability and social or occupational function.

• Prognostic factors: factors associated with a
particular outcome among diseased subjects.
• examples include age, co-morbidities, tumor size, severity
of disease, etc.
• often different from disease risk factors, e.g., BMI and
pre-menopausal breast CA.

I. Definitions
• Natural history: the evolution of disease
without medical intervention.

• Clinical course: the evolution of disease in
response to medical intervention.

Natural History Studies

• The degree to which natural history can be
studied depends on the medical system and
the type of disease.

• The natural history of some diseases can be
studied because they:
– remain unrecognized (i.e., asymptomatic), e.g., anemia,
hypertension.
– are considered “normal” discomforts, e.g., arthritis, mild
depression.

Natural History Studies

• Natural history studies permit the
development of rational strategies for:
• early detection of disease
– e.g., CIN and invasive cervical CA.
• treatment of disease
– e.g., middle ear infections.
– DCIS
– Hypertension

Natural History of Disease

[Figure: timeline of the natural history of disease: disease onset, preclinical phase, diagnosis, clinical phase, prognosis.]
Identifying the Onset of Disease
Infectious diseases
• Exposure (bite, infection, etc.)
• Biological culture
• Presence of antibody responses, viral DNA and
RNA

Identifying the Onset of Disease
Cancer
• Initial damage from radiation or chemicals
• First cancer cell division
• Loss of control of cell replication
• Screening for pathologic changes during preclinical phase
• First evidence of signs and symptoms
• Medical diagnosis of disease

[Figure: timeline of disease. Preclinical phase: biological onset of disease, then pathologic evidence of disease. Clinical phase: signs and symptoms of disease, medical care sought, diagnosis, treatment.]

Identifying the Endpoints of Disease
• Death
• Cure
• Remission
- A decrease in, or disappearance of, signs
and symptoms of disease
• Recurrences
- A return of disease

Expressing Prognosis

• Case-fatality (rate) or CFR
• Five-year survival
• Observed survival rate
• Median survival time
• Relative survival rate

Expressing Prognosis: Case-Fatality Rate (CFR)
$$\text{Case-fatality (rate)} = \frac{\text{Number of people who die of a disease}}{\text{Number of people who have the disease}}$$

Example
200 people with the disease
20 deaths from the disease
$$\text{CFR} = \frac{20}{200} \times 100 = 10\%$$
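
The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch; the function name `case_fatality_rate` is introduced here for illustration and does not come from the slides.

```python
def case_fatality_rate(deaths: int, cases: int) -> float:
    """Return the case-fatality rate as a percentage of people with the disease."""
    if cases <= 0:
        raise ValueError("cases must be positive")
    if not 0 <= deaths <= cases:
        raise ValueError("deaths must be between 0 and cases")
    return 100.0 * deaths / cases


# Numbers from the slide: 200 people with the disease, 20 deaths from it.
cfr = case_fatality_rate(deaths=20, cases=200)
print(f"CFR = {cfr:.0f}%")  # CFR = 10%
```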

Expressing Prognosis: Five-Year Survival
• Five-year survival is the proportion of patients who are
alive five years after diagnosis.

$$\text{Five-year survival} = \frac{\text{Number of persons with the specified disease surviving 5 years}}{\text{Total number of persons with the specified disease}}$$

[Figure: disease timeline from onset through the preclinical phase, diagnosis, and the clinical phase; prognosis = alive 5 years after diagnosis.]

A. Study Designs

• Measuring the natural history or clinical course of
disease requires a cohort study:
• Most often a retrospective cohort is used.

• Exposed group = affected (diseased) patients
followed to measure outcomes of interest.

Cohort Study Design

• To determine whether an outcome is atypical, compare
with a non-exposed group, which
should be:
• obtained from the same source population that
gave rise to the patients.
• monitored with the same intensity.
• If an unaffected cohort is unavailable, use “outside” data,
e.g., standardized incidence or mortality ratios for
cancer studies.

B. The Inception Cohort

• Prognosis studies usually involve taking a
sample of diseased patients.
• Such cohort studies are very susceptible to
bias because the clinical course of disease can
be both prolonged and variable.
• A key factor in determining disease outcome is:
when did the time clock start?

Starting Point

• Starting point must be well defined and
clearly specified:
• The same point in the course of the disease for all
individuals.
• Ideal starting point is near the onset (“inception”)
of disease = inception cohort.
• Usually it is:
– onset of symptoms, time at first diagnosis, or beginning
of treatment.

Effect of Using a Defined Inception Cohort
on Average Survival Time

[Figure: two panels, each following 5 cases over years 0 to 5, with the start of the cohort, the pre-clinical phase, and the clinical phase marked.
a) No inception cohort: 5 cases identified at different points in time. Average duration of survival = 14.5/5 = 2.9 years.
b) Inception cohort: 5 cases identified at the same point in time. Average duration of survival = 11/5 = 2.2 years.]

II. Bias in Follow-up Studies

• A. Selection or Confounding Bias
• i) Assembly or susceptibility bias occurs when the
exposed and non-exposed groups differ in ways other than the
prognostic factors under study, and
the extraneous factor affects the outcome of the study.

• Examples:
• differences in starting point of disease (survival cohort)
• differences in stage or extent of disease, co-morbidities,
prior treatment, age, gender, or race.

Survival Cohorts
• Survival cohort (or available patient cohort) studies
can be very biased because:
• a convenience sample of current patients is likely to include patients at
various stages in the course of their disease.
• individuals not accounted for have different experiences
from those included, e.g., died soon after treatment.

• Not a true inception cohort, e.g., retrospective case
series.
• Very common!

Survival Cohorts (Fletcher)
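[Figure: observed versus true improvement in a true cohort and a survival cohort]

True cohort
• Assemble cohort: N = 150
• Measure outcomes: improved = 75, not improved = 75
• Observed improvement = 50%; true improvement = 50%

Survival cohort
• Assemble patients, of whom N = 100 are not observed (dropouts: improved = 35, not improved = 65)
• Begin follow-up: N = 50
• Measure outcomes: improved = 40, not improved = 10
• Observed improvement = 80%; true improvement = 50%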
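
The gap between observed and true improvement in the survival cohort follows directly from the counts on this slide. The sketch below reproduces that arithmetic; the helper name `percent_improved` is an illustrative assumption, while the counts are the ones shown in the figure.

```python
def percent_improved(improved: int, not_improved: int) -> float:
    """Percentage of patients who improved among those counted."""
    return 100.0 * improved / (improved + not_improved)


# Counts from the slide's survival cohort.
followed = {"improved": 40, "not_improved": 10}   # the 50 patients under follow-up
dropouts = {"improved": 35, "not_improved": 65}   # the 100 patients never observed

observed_pct = percent_improved(followed["improved"], followed["not_improved"])
true_pct = percent_improved(
    followed["improved"] + dropouts["improved"],
    followed["not_improved"] + dropouts["not_improved"],
)

print(f"Observed improvement: {observed_pct:.0f}%")  # Observed improvement: 80%
print(f"True improvement: {true_pct:.0f}%")          # True improvement: 50%
```

Only the 50 patients still available for follow-up enter the observed figure, which is why it overstates the improvement seen in all 150.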
A. Selection Bias
• ii) Migration bias
• occurs when patients drop out of the study
– (loss to follow-up).
• subjects usually drop out for a valid reason
– e.g., death, recovery, side effects, or loss of interest.
• these reasons are often related to prognosis.
• assess the extent of bias by using a best/worst case analysis.
• patients can also cross over from one exposure group to
another
– if cross-over occurs at random = non-differential
misclassification of exposure
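
A best/worst case analysis can be sketched as follows. The counts are hypothetical and chosen only for illustration, and the function name `best_worst_case` is an assumption rather than something taken from the slides. The idea is to recompute the event proportion twice: once assuming no lost patient had the event, and once assuming every lost patient did.

```python
def best_worst_case(events_observed: int, n_followed: int, n_lost: int):
    """Return (best, observed, worst) event proportions.

    best  -- assumes no patient lost to follow-up had the event
    worst -- assumes every patient lost to follow-up had the event
    """
    n_total = n_followed + n_lost
    observed = events_observed / n_followed
    best = events_observed / n_total
    worst = (events_observed + n_lost) / n_total
    return best, observed, worst


# Hypothetical cohort: 100 enrolled, 80 followed to the end (20 deaths), 20 lost.
best, observed, worst = best_worst_case(events_observed=20, n_followed=80, n_lost=20)
print(f"best case {best:.0%}, observed {observed:.0%}, worst case {worst:.0%}")
```

If the conclusions of the study hold across the whole best-to-worst range, loss to follow-up is unlikely to explain the result.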

A. Selection Bias
• iii) Generalizability bias
• related to the selective referral of patients to
tertiary (academic) medical centers.
• a highly selected patient pool has a different clinical
spectrum of disease.
• influences generalizability (see Motulsky, 1978;
Melton, 1985).

II. Bias in Follow-Up Studies

• B. Measurement bias
• Measurement (or assessment) bias occurs
when one group has a higher (or lower) probability
of having their outcome measured or detected.
– likely for softer outcomes, e.g.:
• side effects, mild disabilities, subclinical disease, or
the specific cause of death.

B. Measurement bias
• Measurement bias can be minimized by:

• ensuring observers are blinded to the exposure
status of the patients.

• using careful criteria (definitions) for all outcome
events.

• applying equally rigorous efforts to ascertain all
events in both exposure groups.

III. Commonly Used Measures of
Prognosis
• A. 5-year Survival Rate

$$\text{5-year survival rate} = \frac{\text{Number of individuals who survived from } t_0 \text{ to } t_0 + 5 \text{ years}}{\text{Number of individuals with disease at } t_0}$$

– typically $t_0$ is the point of initial diagnosis or treatment.
– a cumulative incidence measure (the complement of the risk of death at 5 years).
– measures the proportion of the original patient population alive
at 5 years.
– easy to interpret and remember, but fails to indicate the rate of
death, i.e., how quickly deaths occur within the 5 years.

Fig. Limitation of 5-year Survival Rates
(From Fletcher)

[Figure: survival curves (% surviving, 0 to 100, against years, 0 to 5) for four groups: age 100 years, aneurysm, AIDS, and CML.]

B. Case-fatality Rate (CFR)

$$\text{CFR} = \frac{\text{Number of individuals who die during } t_0 \text{ to } t_0 + 1}{\text{Number of individuals with disease at } t_0}$$

– a specific type of cumulative incidence (= a proportion,
not a rate).
– measures the risk of death among those individuals who
develop disease.
– must have an explicit (or implicit) time period that is
sufficiently long to ensure that all relevant events have
been observed, e.g., Legionnaires' disease, stroke, CML.
– The same principles apply to response, remission, and
recurrence rates.

C. Mortality or Death Rate (MR)

$$\text{MR} = \frac{\text{Number of individuals who die during } t_0 \text{ to } t_0 + 1}{\text{Total population time-at-risk during } t_0 \text{ to } t_0 + 1}$$

– defined as the incidence rate of death per unit of "population
time"
– denominator is population time-at-risk.
– measures the speed of death due to a specific disease.
– distinguish from the case-fatality rate! Example:
• Stroke 28-day CFR = 23%
• Stroke MR = 63/100,000
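
The difference between the two measures lies in the denominator, which a short sketch makes concrete. All numbers below are hypothetical and are not the stroke figures quoted above; the function names are likewise introduced only for illustration.

```python
def case_fatality(deaths: int, cases: int) -> float:
    """Proportion of people WITH the disease who die of it (a proportion)."""
    return deaths / cases


def mortality_rate(deaths: int, person_years: float, per: int = 100_000) -> float:
    """Deaths per `per` person-years at risk in the WHOLE population (a rate)."""
    return deaths * per / person_years


# Hypothetical population: 100,000 people followed for one year,
# 250 of whom develop the disease and 50 of whom die of it.
deaths, cases, person_years = 50, 250, 100_000

print(f"CFR = {case_fatality(deaths, cases):.0%}")
print(f"MR  = {mortality_rate(deaths, person_years):.0f} per 100,000 person-years")
```

The same 50 deaths give a large case-fatality figure and a small mortality rate because the first divides by the diseased and the second by everyone at risk.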

IV. Statistical Methods Used in
Prognosis Studies
• A. Analysis of Survival or Failure Time Data
• primary end point of prognosis studies is time until
event of interest occurs e.g., death or relapse.
• analysis of survival or failure time data requires
specific techniques:
– Kaplan-Meier estimator
– Log-rank test
– Cox proportional hazards regression model.

Censoring:
• Defn: when the event of interest is not
observed in all individuals because:
• the study was stopped before everyone in the study
had the event
• loss to follow-up
• death from other (competing) causes, e.g., road
traffic accidents

Censoring

• The standard survival methods assume censoring is
non-informative:

• the reason for incomplete observation is not related to
the underlying risk of failure.

• therefore, the survival experience of those “lost to
follow-up” is assumed to be the same as that of those
who remain.

Survival function (S(t))
• Defn: the probability of survival to a given point in
time (t)

• S(3) = 0.60 indicates that 60% of the population survived 3
years.
• Graphically displayed using a "life table" or "survival curve."

• Median survival time = the time at which half the
patients have "failed"
• a crude but common measure of survival

A Typical Survival Curve Showing the Survival
Function (S(t)) Plotted Against Time with a
Median Survival Time of 1.25 Years

[Figure: survival curve with S(t) on the vertical axis (0 to 0.8 shown) and years (0 to 5) on the horizontal axis.]

Hazard function (h(t))

• Defn: the instantaneous rate of an event at a specific
moment in time (t), given the patient has
already survived to that point in time.

• closely linked to the survival function.
• indicates the risk of the patient "failing"
during the next short time interval.
• a direct measure of prognosis.
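
The link between the hazard and survival functions can be written out explicitly. This is the standard textbook relationship rather than anything specific to these slides; $T$ (the time-to-event random variable) and $\lambda$ (a constant hazard) are symbols introduced here for illustration.

$$h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t} = -\frac{d}{dt}\ln S(t), \qquad S(t) = \exp\left(-\int_0^t h(u)\,du\right)$$

For example, under the simplifying assumption of a constant hazard $h(t) = \lambda$, the survival function is $S(t) = e^{-\lambda t}$ and the median survival time is $\ln 2 / \lambda$.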

Kaplan-Meier Estimator
• a widely accepted method of estimating S(t).
• S(t) is expressed as the product of conditional
probabilities e.g.,
• S(3)= S(1) x S(2|1) x S(3|2)

• where:
– S(1)= probability of surviving year 1
– S(2|1)= conditional probability of surviving year 2,
given survival to year 1.
– S(3|2)= conditional probability of surviving year 3,
given survival to year 2.

Kaplan-Meier Estimator
• estimators or "curves" begin at time zero with S(t) = 1 and
then decrease in a series of steps corresponding to
observed times of failure.

• censored observations contribute to the survival probability
estimates up until the time of censoring.

• no assumptions made about the shape of the survival or
hazard function (= a non-parametric technique).

• variability of estimates is greatest at the ends of the curves
(few subjects and few failures).
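
The product of conditional probabilities described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: the follow-up data are hypothetical, the function names are introduced here, and subjects censored at a failure time are treated as still at risk at that time (a common convention).

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed failure time.

    times  -- follow-up time for each subject
    events -- 1 if the subject failed at that time, 0 if censored
    """
    failures = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)      # everyone leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0                # the curve starts at S(0) = 1
    curve = []
    for t in sorted(leaving):
        d = failures.get(t, 0)
        if d:
            # conditional probability of surviving past t, given at risk at t
            survival *= (at_risk - d) / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]     # censored subjects drop out after contributing
    return curve


def median_survival(curve):
    """First failure time at which S(t) falls to 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical follow-up times in years; 0 marks a censored observation.
times = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"S({t}) = {s:.3f}")
print("median survival:", median_survival(curve))
```

The curve only steps down at failure times; the two censored subjects shrink the risk set without producing a step, which is how they contribute up to their time of censoring.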

Kaplan-Meier Estimators of the Survival Function
(S(t)) for Two Groups (Treatment and Control).

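[Figure: two Kaplan-Meier step curves, labeled Controls and Treatment, with S(t) on the vertical axis (0 to 0.8 shown) and years (0 to 5) on the horizontal axis.]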

Log-rank test
• a statistical test of the difference in survival distributions
(see Peto et al, 1977).

• at each observed time of failure, compares the observed
number of events in one group to the expected number
(based on identical hazard functions for the two groups).

• gives equal weight to differences at each point in time.

• if it makes sense to place greater emphasis on differences
at earlier time periods, then use the generalized Wilcoxon
test (Cox and Oakes, 1984).
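
The observed-versus-expected comparison can be sketched for two groups as below. This is an illustrative pure-Python version using the usual hypergeometric variance, with hypothetical data and a function name introduced here; it is not taken from the cited references.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi_square, p_value) on 1 degree of freedom.

    events_* -- 1 if the subject failed at that time, 0 if censored
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    failure_times = sorted({t for t, e, _ in data if e})

    obs_minus_exp = 0.0
    variance = 0.0
    for t in failure_times:
        n = sum(1 for u, _, _ in data if u >= t)                # at risk, both groups
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)   # at risk, group A
        d = sum(1 for u, e, _ in data if u == t and e)          # failures at t
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        # expected failures in group A if both groups share one hazard
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("no informative failure times to compare")
    chi_square = obs_minus_exp ** 2 / variance
    # upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value


# Hypothetical data: group A fails early, group B fails late; no censoring.
chi2, p = logrank_test([1, 2, 3, 4], [1, 1, 1, 1], [5, 6, 7, 8], [1, 1, 1, 1])
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

Every failure time enters the sum with the same weight, which is the "equal weight" property noted above; a weighted variant would multiply each term by the number at risk.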

Cox proportional hazards model

• Very powerful regression modeling technique based
on the hazard function (see Tibshirani 1982).
• allows the full flexibility of regression
analysis to be applied to survival data.
• in its simplest form it is an extension of the log-rank test.

• Advantages:
• ability to handle a large number of prognostic variables (both
discrete and continuous)
• can adjust for confounding variables, and
• can evaluate interaction effects
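
For reference, the model is usually written in the following standard form. The notation ($h_0$, $x_j$, $\beta_j$, $p$) is introduced here and does not appear on the slides.

$$h(t \mid x_1, \dots, x_p) = h_0(t)\,\exp(\beta_1 x_1 + \dots + \beta_p x_p)$$

Here $h_0(t)$ is a baseline hazard that is left unspecified, and $\exp(\beta_j)$ is the hazard ratio for a one-unit increase in the prognostic variable $x_j$ with the other variables held fixed. The ratio of hazards for any two patients does not depend on $t$, which is the "proportional hazards" assumption.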

B. Statistical Control of Common
(Selection) Biases
• Prognostic studies are essentially observational
studies that focus on survival (or some other
outcome).

• Techniques to control biases are therefore
the same as those used in observational
epidemiology.

Table. Methods for Controlling Selection Bias
(from Fletcher)

| Method | Description | Phase of study: Design | Phase of study: Analysis |
|---|---|---|---|
| Randomization | Random assignment ensures that known and unknown confounders are equally distributed between exposure groups (this is rarely feasible, however, unless a specific RCT designed to evaluate some aspect of prognosis is being conducted). | + | |
| Restriction | If a strong confounding factor is known, such as age or sex, limit the range of the characteristics of patients in the study. | + | |
| Matching | Match exposure groups on the basis of important prognostic variables, such as stage of disease, age, or sex. | + | |
| Stratification | Compare event rates within subgroups (strata) with otherwise similar probability of outcomes, e.g., sex- or age-group-specific rates. | | + |

Table. Adjustment Procedures to Control
Selection Bias

| Adjustment procedure | Description | Design | Analysis |
|---|---|---|---|
| Simple | Mathematically adjust crude rates for a characteristic known to be an important prognostic factor, e.g., age adjustment. | | + |
| Multiple | Use mathematical models to adjust risk estimates for several prognostic variables (Cox regression). | | + |
| Sensitivity analysis | Describe how the results could differ by changing the values of known prognostic factors over plausible ranges. Best/worst case analysis is an example. | | + |
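
Simple adjustment for age can be sketched with direct standardization, one common way of adjusting crude rates. The hospitals, counts, and standard weights below are hypothetical and chosen only to show the mechanics; the function names are introduced here for illustration.

```python
def crude_rate(strata):
    """strata: list of (deaths, patients) pairs, one per age group."""
    deaths = sum(d for d, _ in strata)
    patients = sum(n for _, n in strata)
    return deaths / patients


def directly_standardized_rate(strata, standard_weights):
    """Weight each stratum-specific rate by a shared standard age distribution."""
    if abs(sum(standard_weights) - 1.0) > 1e-9:
        raise ValueError("standard weights must sum to 1")
    return sum(w * d / n for (d, n), w in zip(strata, standard_weights))


# Hypothetical hospitals, strata ordered as (younger, older) patients.
hospital_a = [(5, 100), (60, 200)]   # mostly older patients
hospital_b = [(10, 200), (30, 100)]  # mostly younger patients
weights = [0.5, 0.5]                 # assumed standard age distribution

for name, strata in (("A", hospital_a), ("B", hospital_b)):
    print(
        f"Hospital {name}: crude {crude_rate(strata):.1%}, "
        f"age-adjusted {directly_standardized_rate(strata, weights):.1%}"
    )
```

In this made-up example both hospitals have identical death rates within each age group, so the difference in crude rates comes entirely from their age mix and disappears after adjustment.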
 

V. Ideal Characteristics of Prognostic
Studies

• 1. Was the sample well defined and representative
of a definable underlying population? Was the
referral pattern well described?

• 2. Was an inception cohort assembled? Were all the
study patients at a similar, well-defined point in the
course of their disease?

• 3. Was the follow-up complete and sufficiently long?

V. Ideal Characteristics of Prognostic
Studies

• 4. Were objective and unbiased outcome
criteria used?

• 5. Was the outcome assessment blind?

• 6. Was adjustment for extraneous prognostic
factors carried out?

Editorial Readings

• Melton
• What is selection bias, and how can it affect the
conclusions of studies?

• Motulsky
• Why did the author place such emphasis on
understanding the selection method?

VI. Clinical Decision Rules (CDR)

• clinical tools that combine history, physical
examination, and simple diagnostic tests to aid in
diagnostic, prognostic, or treatment decisions

• Outcomes:
• Probability of disease/event (risk)
– e.g., APGAR, APACHE, CVD Risk Prediction (Framingham),
colic prognosis
• Diagnostic/treatment decision
– e.g., breast biopsy decisions, breast CA risk (Gail model), colic
surgery

CDRs – 3 Step Development Process

• Step 1 - Derivation:
• Identify important (predictive) variables
• Use statistical methods (Logistic regression,
recursive partitioning, neural networks), or pick
variables based on expert opinion
• Initial statistical testing (validation)
– Split sample (development and test sets)
– Bootstrap techniques
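
As a rough illustration of a bootstrap technique, the sketch below computes a percentile bootstrap interval for the apparent accuracy of a rule. It is deliberately simpler than a full optimism-corrected internal validation, and the predictions, outcomes, and function name are all hypothetical.

```python
import random


def bootstrap_accuracy_interval(predicted, actual, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the proportion of correct predictions."""
    rng = random.Random(seed)            # fixed seed so the sketch is repeatable
    pairs = list(zip(predicted, actual))
    n = len(pairs)
    stats = []
    for _ in range(n_resamples):
        # resample patients with replacement, keeping prediction/outcome paired
        sample = [pairs[rng.randrange(n)] for _ in range(n)]
        stats.append(sum(p == a for p, a in sample) / n)
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical rule applied to 40 patients: 20 with the outcome, 20 without.
actual = [1] * 20 + [0] * 20
predicted = [1] * 18 + [0] * 2 + [1] * 6 + [0] * 14

accuracy = sum(p == a for p, a in zip(predicted, actual)) / len(actual)
lower, upper = bootstrap_accuracy_interval(predicted, actual)
print(f"apparent accuracy {accuracy:.2f}, 95% bootstrap interval {lower:.2f} to {upper:.2f}")
```

An interval this wide on a small derivation sample is one reason the later prospective validation step is still needed.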

CDRs – 3 Step Development Process

• Step 2 – Validation
• Usually prospective; validation is required because
– CDR accuracy may be specific to the development
population (because of severity and disease prevalence)
– CDR may not be applied in the same manner in other
populations
• Narrow: application to a similar patient population.
• Broad: application to different populations with
varying prevalence and disease spectrum.

CDRs – 3 Step Development Process
• Step 3 – Impact Analysis
• Required because reluctance to use CDRs is common. Why?
– Concern about different patient populations/settings
– Risk of false negatives (esp. legal concerns)
– Rules are complicated or take too long to use
– Rule doesn't provide a course of action (just a probability!)

• Test the effect of the CDR on physician behavior, clinical practice, and
patient outcomes
• Rarely done!
• Ideal = randomize individual patients (difficult) or practices.
• Or evaluate using a pre-post design

CDR - Hierarchy of Evidence

| Level | Minimum Evidence Required | Recommended Use |
|---|---|---|
| 1 | Accuracy and applicability demonstrated in BROAD prospective validation studies in different populations, PLUS impact analysis. | Widespread use. High confidence that it can change practice or outcomes. |
| 2 | Accuracy demonstrated in BROAD prospective validation studies in different populations. | Widespread use. High confidence. |
| 3 | Accuracy demonstrated in a single NARROW prospective validation study. | Use with caution in similar settings. |
| 4 | Statistical validation or retrospective validation only. | Further evaluation required before use. |

CDR – Methodological Standards
(McGinn, JAMA 2000)

• Were all important predictors included in the derivation
process?
– content validity
– were they collected prospectively in a blinded fashion for the
purposes of CDR development?
– was every patient included, with minimal missing data?
– were predictors present in a large proportion of the study population?

• Was the patient population and setting well defined?
– age, sex, referral filter

• Were all outcome events clearly defined?
– Are they of clinical importance?
– Were they determined blindly (independent of predictors)?

CDR – Methodological Standards
(McGinn, JAMA 2000)

• Were appropriate statistical methods used?
– Adequate sample size to avoid over-fitting (need 10 outcomes per variable)

• Were results of the CDR appropriate and clear?
– Sensitivity/specificity (Se/Sp) or ROC curves: usually want high Se (to avoid false negatives)
– Predictive values (PVs) of more use to clinicians (prevalence dependent)
– Likelihood ratios (LRs)?
– Prob (outcome) = survival curves

• Was reproducibility of the predictors and the rule itself assessed?
– Many signs and symptoms (S/S) are not very reliable
– Concerned with inter-observer variability (kappa, κ)
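
The accuracy and reproducibility measures named above can be computed from simple counts. The sketch below uses hypothetical 2x2 tables, and its function names are introduced here for illustration; it also shows why predictive values depend on prevalence while Se and Sp do not.

```python
def rule_performance(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy measures for a rule from a 2x2 table of counts."""
    se = tp / (tp + fn)
    sp = tn / (tn + fp)
    return {
        "sensitivity": se,
        "specificity": sp,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_positive": se / (1 - sp),
        "lr_negative": (1 - se) / sp,
    }


def ppv_at_prevalence(se: float, sp: float, prevalence: float) -> float:
    """Positive predictive value for a given prevalence (Bayes' theorem)."""
    true_pos = se * prevalence
    false_pos = (1 - sp) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def cohen_kappa(both_yes: int, a_only: int, b_only: int, both_no: int) -> float:
    """Chance-corrected agreement between two observers on a yes/no finding."""
    n = both_yes + a_only + b_only + both_no
    observed = (both_yes + both_no) / n
    a_yes = (both_yes + a_only) / n
    b_yes = (both_yes + b_only) / n
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    return (observed - expected) / (1 - expected)


# Hypothetical validation sample: 100 patients with the outcome, 900 without.
perf = rule_performance(tp=90, fp=180, fn=10, tn=720)
for name, value in perf.items():
    print(f"{name}: {value:.3f}")

# Same Se and Sp, different prevalence: the predictive value changes.
for prevalence in (0.1, 0.5):
    ppv = ppv_at_prevalence(perf["sensitivity"], perf["specificity"], prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.3f}")

# Two hypothetical observers rating the same 100 patients.
print(f"kappa = {cohen_kappa(both_yes=40, a_only=10, b_only=10, both_no=40):.2f}")
```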

Thank You
