Stratification in The Cox Model: Patrick Breheny

Stratification in the Cox model
Patrick Breheny
November 17
Patrick Breheny Survival Data Analysis (BIOS 7210) 1/20

Introduction
Today’s topic is the use of stratification in Cox regression

There are two main purposes of stratification:
It is useful as a diagnostic for checking the proportional
hazards assumption
It offers a way of extending the Cox model to allow for
non-proportionality with respect to some covariates

VA Lung Cancer data
To illustrate these concepts, we will look at a classic survival
data set, the VA lung cancer data (veteran in the survival
package)
The data comes from a clinical trial carried out by the
Veterans’ Administration on male veterans with advanced,
inoperable lung cancer
In the trial, patients were randomized to receive either a
standard chemotherapy or an experimental chemotherapy, and
the primary endpoint was the time until death

Covariates
A number of covariates which potentially affect survival were also
recorded:
karno: The Karnofsky score, a way of quantifying the
patient’s overall baseline status, with ≥ 70 denoting that the
patient is able to care for themselves, 40–60 meaning that
the patient requires assistance and regular medical care, and
10–30 meaning that the patient is hospitalized
diagtime: Time in months from diagnosis to randomization
age: Age in years at randomization
prior: Indicator for whether the patient had received prior
therapy
celltype: Type of tumor (small cell, large cell, squamous,
adenocarcinoma)
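
As a minimal sketch of how these variables look in R: the block below loads the veteran data and attaches labels to the coded variables. The coding used here (trt: 1 = standard, 2 = test; prior: 0 = no, 10 = yes) is taken from the survival package's documentation of the data set, not from the slides, and the name vet is just an illustrative working copy.

```r
library(survival)

# Work on a copy so the packaged data set is left untouched
vet <- veteran

# Attach labels to the numeric codes
# (assumed coding: trt 1 = standard, 2 = test; prior 0 = no, 10 = yes)
vet$trt   <- factor(vet$trt,   levels = c(1, 2),  labels = c("standard", "test"))
vet$prior <- factor(vet$prior, levels = c(0, 10), labels = c("no", "yes"))

str(vet)                      # one row per patient
table(vet$celltype, vet$trt)  # tumor type by treatment arm
```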

Kaplan-Meier
[Figure: Kaplan-Meier survival curves for the Standard and Test arms; x-axis: Time (years), 0 to 2.5; y-axis: Survival, 0 to 1]
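
A figure of this kind can be sketched roughly as follows. This is not the code used for the slide; it assumes that time in veteran is recorded in days (hence the division by 365.25) and that trt = 1 is the standard arm.

```r
library(survival)

# Kaplan-Meier estimate for each treatment arm, with time converted to years
km <- survfit(Surv(time / 365.25, status) ~ trt, data = veteran)

plot(km, lty = 1:2, xlab = "Time (years)", ylab = "Survival")
legend("topright", legend = c("Standard", "Test"), lty = 1:2, bty = "n")
```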

Cox results
[Figure: forest plot of hazard ratios with p-values; x-axis: Hazard ratio, log scale from 0.6 to 7.4]
Term                p-value
Adeno v squamous    < 0.0001
Small v squamous    < 0.01
Large v squamous    0.16
Treatment           0.16
Prior               0.76
Diag. Time          0.99
Age                 0.35
Karnofsky           < 0.0001
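
A model along these lines can be fit as sketched below. The exact specification behind the slide is not shown, so the formula here (all six covariates entered additively, with prior rescaled from 0/10 to 0/1) is an assumption, as are the object names vet, fit_cox and hr.

```r
library(survival)

# Rescale prior from 0/10 to a 0/1 indicator so its hazard ratio is per "yes"
vet <- transform(veteran, prior = prior / 10)

# Unstratified Cox model; squamous is the reference level of celltype
fit_cox <- coxph(Surv(time, status) ~ trt + celltype + karno + diagtime + age + prior,
                 data = vet)

# Hazard ratios, 95% confidence limits, and Wald p-values
hr <- cbind(HR = exp(coef(fit_cox)),
            exp(confint(fit_cox)),
            p = summary(fit_cox)$coefficients[, "Pr(>|z|)"])
round(hr, 3)
```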
Diagnostics for proportional hazards
Consider the following as a way to assess the proportional
hazards assumption: rather than including a term in the
model as a covariate, we will estimate separate cumulative
baseline hazards $\hat{\Lambda}_{01}, \hat{\Lambda}_{02}, \ldots$, one for each level of the covariate
If the baseline hazards appear proportional, then it is
reasonable to model the term in the regular manner
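
One way to obtain such stratum-specific estimates, sketched here for treatment, is to move the variable into strata() and extract the cumulative baseline hazards with basehaz(). The choice of adjustment covariates and the names fit_trt and H0 are assumptions made for illustration.

```r
library(survival)

# Treatment enters as a stratification variable rather than as a covariate
fit_trt <- coxph(Surv(time, status) ~ celltype + karno + diagtime + age + prior +
                   strata(trt), data = veteran)

# Cumulative baseline hazard for each stratum, evaluated at the covariate means
H0 <- basehaz(fit_trt, centered = TRUE)

head(H0)                            # columns: hazard, time, strata
tapply(H0$hazard, H0$strata, max)   # final cumulative hazard in each arm
```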

Diagnostic plot types
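Because proportionality is difficult to assess by visual
inspection, it is common to plot $\log \hat{\Lambda}_0$:
$$\Lambda_i(t) = \Lambda_0(t) \exp(\eta_i) \implies \log \Lambda_i(t) = \log \Lambda_0(t) + \eta_i,$$
so under proportional hazards the log cumulative hazard curves are
parallel, separated by a constant vertical shift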
An alternative, known as the Andersen plot, is to plot $\hat{\Lambda}_{01}$
versus $\hat{\Lambda}_{02}$; under proportional hazards this should be a
straight line with slope $\exp(\eta)$
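
Both plot types can be sketched from the output of basehaz(). The code below is an illustration under assumptions, not the slide's code: the adjustment covariates are chosen arbitrarily, time is assumed to be in days, and the dotted reference line in the Andersen plot is a simple least-squares line through the origin added only as a visual guide.

```r
library(survival)

fit_trt <- coxph(Surv(time, status) ~ karno + celltype + strata(trt), data = veteran)
H0 <- basehaz(fit_trt, centered = TRUE)
by_arm <- split(H0, H0$strata, drop = TRUE)

# Right-continuous step functions, so both arms can be evaluated on one time grid
steps <- lapply(by_arm, function(d) stepfun(d$time, c(0, d$hazard)))
grid <- sort(unique(H0$time))
L1 <- steps[[1]](grid)
L2 <- steps[[2]](grid)

op <- par(mfrow = c(1, 2))

# Log cumulative hazard versus time; parallel curves suggest proportionality
pos <- H0$hazard > 0
plot(H0$time[pos] / 365.25, log(H0$hazard[pos]), type = "n",
     xlab = "Time (years)", ylab = "log cumulative hazard")
for (k in seq_along(by_arm)) {
  d <- by_arm[[k]][by_arm[[k]]$hazard > 0, ]
  lines(d$time / 365.25, log(d$hazard), type = "s", lty = k)
}

# Andersen plot; a straight line through the origin suggests proportionality
plot(L1, L2, type = "s", xlab = "Cumulative hazard, arm 1",
     ylab = "Cumulative hazard, arm 2")
abline(0, sum(L1 * L2) / sum(L1^2), lty = 3)

par(op)
```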

Treatment (Version 1)
[Figure: $\log \Lambda(t)$ versus Time (years) for the Standard and Test arms]

Treatment (Version 2) ($\hat{\beta} = 0.29$)
[Figure: $\log \Lambda_1(t) - \log \Lambda_2(t)$ versus Time (years)]

Treatment (Version 3, the Andersen plot)
[Figure: $\Lambda_2(t)$ versus $\Lambda_1(t)$, both axes from 0 to 4]

Cell type
[Figure: $\log \Lambda(t)$ versus Time (years) for the squamous, smallcell, adeno, and large cell types]

Karnofsky
[Figure: Karnofsky score groups (0,30], (30,60], (60,100] compared by $\log \Lambda(t)$ versus Time (years)]
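
Because the Karnofsky score is continuous, it has to be binned before it can serve as a stratification variable. The sketch below uses the three intervals from the figure; the adjustment covariates and the names vet, karno_grp, fit_k and groups are illustrative assumptions.

```r
library(survival)

vet <- veteran
# Bin the continuous score into the three groups shown in the figure
vet$karno_grp <- cut(vet$karno, breaks = c(0, 30, 60, 100))
table(vet$karno_grp)

fit_k <- coxph(Surv(time, status) ~ trt + celltype + diagtime + age + prior +
                 strata(karno_grp), data = vet)

# Keep positive values only, since log(0) is not finite
H0 <- subset(basehaz(fit_k, centered = TRUE), hazard > 0)
groups <- split(H0, H0$strata, drop = TRUE)

plot(H0$time / 365.25, log(H0$hazard), type = "n",
     xlab = "Time (years)", ylab = "log cumulative hazard")
for (k in seq_along(groups)) {
  lines(groups[[k]]$time / 365.25, log(groups[[k]]$hazard), type = "s", lty = k)
}
legend("bottomright", legend = names(groups), lty = seq_along(groups), bty = "n")
```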

Remarks
Treatment appears broadly proportional except for very
short-term survival
Proportional hazards appears questionable with respect to cell
type
Karnofsky status also appears non-proportional, with the
variable losing relevance over time (which makes sense)

The stratified Cox model

What should we do in the presence of variables with
non-proportional effects?
One remedy is to allow for different baseline hazards for each
level of the variable:
$$\lambda_{ij}(t) = \lambda_{0j}(t) \exp(x_i^T \beta),$$
where $\lambda_{ij}(t)$ is the hazard function for the $i$th subject, who
belongs to the $j$th stratum
The model may seem complex, but is entirely straightforward
in the likelihood framework, as we can simply combine
likelihoods across strata:
$$L(\beta) = \prod_j L_j(\beta),$$
where $L_j(\beta)$ is the partial likelihood computed from the subjects
in stratum $j$ alone

Stratified Cox model: Details
Furthermore,
$$\ell(\beta) = \sum_j \ell_j(\beta)$$
$$u(\beta) = \sum_j u_j(\beta)$$
$$I(\beta) = \sum_j I_j(\beta),$$
so estimation, the Newton-Raphson algorithm, and inference are all
straightforward as well: we simply have to sum the contributions
from each stratum
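
The additivity of the log partial likelihood can be checked numerically. The sketch below fits a stratified model, then evaluates an unstratified Cox log partial likelihood within each stratum at the same coefficient values (init combined with iter.max = 0 stops coxph from updating them) and sums the results. The covariates and the names fit_s, beta_hat and loglik_by_stratum are assumptions made for illustration.

```r
library(survival)

# Stratified fit: one baseline hazard per cell type, shared coefficients
fit_s <- coxph(Surv(time, status) ~ trt + karno + age + strata(celltype),
               data = veteran)
beta_hat <- coef(fit_s)

# Log partial likelihood of each stratum on its own, held fixed at beta_hat
loglik_by_stratum <- sapply(split(veteran, veteran$celltype), function(d) {
  f <- coxph(Surv(time, status) ~ trt + karno + age, data = d,
             init = beta_hat, control = coxph.control(iter.max = 0))
  tail(f$loglik, 1)
})

sum(loglik_by_stratum)   # sum of the stratum contributions
tail(fit_s$loglik, 1)    # log partial likelihood of the stratified fit
```

The two printed values should agree up to numerical error, since each risk set in the stratified model contains only subjects from the same stratum.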

R code
The survival package makes it easy to fit stratified Cox
models through the use of the strata function:
fit <- coxph(S ~ trt + karno + ... + strata(celltype))
summary(fit) will then provide a summary for all the
parametric terms (trt, karno, ...), but not celltype
survfit(fit) will estimate K different baseline hazard
functions, one for each stratum (here, K = 4)
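
A runnable version of this call might look as follows. The slide elides part of the formula, so the covariates listed here (trt, karno, diagtime, age, prior) are an assumption, and Surv(time, status) is written out in place of the slide's S.

```r
library(survival)

# Cell type is stratified on; all other terms get regression coefficients
fit <- coxph(Surv(time, status) ~ trt + karno + diagtime + age + prior +
               strata(celltype), data = veteran)

summary(fit)         # coefficients for the parametric terms only

sf <- survfit(fit)   # one estimated curve per stratum
sf$strata            # number of time points in each of the K = 4 strata
```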

Predictions
Standard treatment, wait 12 months, age 40, no prior treatment
[Figure: predicted survival versus Time (years) for three patients: squamous with Karnofsky 80, squamous with Karnofsky 60, and large cell with Karnofsky 60]
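
Predictions of this kind can be sketched with survfit() and a newdata argument that includes the stratification variable, so that each row is matched to its own stratum's baseline. The model formula and the coding of the hypothetical patients (trt = 1 for standard, prior = 0 for none, diagtime = 12 for a 12-month wait) are assumptions, as is the name new_pat.

```r
library(survival)

fit <- coxph(Surv(time, status) ~ trt + karno + diagtime + age + prior +
               strata(celltype), data = veteran)

# Three hypothetical patients; celltype picks the stratum for each row
new_pat <- data.frame(
  trt = 1, karno = c(80, 60, 60), diagtime = 12, age = 40, prior = 0,
  celltype = factor(c("squamous", "squamous", "large"),
                    levels = levels(veteran$celltype))
)

pred <- survfit(fit, newdata = new_pat)

# xscale = 365.25 labels the time axis in years instead of days
plot(pred, lty = 1:3, xscale = 365.25, xlab = "Time (years)", ylab = "Survival")
legend("topright", lty = 1:3, bty = "n",
       legend = c("squ: krn=80", "squ: krn=60", "lar: krn=60"))
```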

Final remarks
Stratified Cox models are a useful extension of the standard
Cox model to allow for covariates with non-proportional
hazards
A minor drawback is that stratifying unnecessarily (i.e., even
though the PH assumption is met) reduces estimation
efficiency, although the loss is typically very small
A larger limitation of stratification is that it becomes messy
with continuous variables and with multiple stratification
variables, as there is no way to impose an additive structure
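
The point about multiple stratification variables can be illustrated by passing two variables to strata(), which creates one stratum for every observed combination of their levels. The sketch below is only an illustration; the covariates and the names fit2 and n_strata are assumptions.

```r
library(survival)

# Stratify on every observed combination of cell type and prior therapy
fit2 <- coxph(Surv(time, status) ~ trt + karno + age + strata(celltype, prior),
              data = veteran)

n_strata <- length(survfit(fit2)$strata)
n_strata                                 # number of separate baseline hazards

table(veteran$celltype, veteran$prior)   # patients available in each stratum
```

Each combination receives its own unrestricted baseline hazard, so the number of patients informing each one shrinks as stratification variables are added.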

Final remarks (cont’d)
The other primary limitation of stratified models is that there
is no way to carry out inference for the stratification variables
For example, stratification is commonly used to aggregate
results across multi-center studies, because comparing these
sites is typically not of interest
Stratification is less useful in dealing with non-proportionality
with respect to treatment – we are definitely interested in
estimating the effect of treatment, and although we can
obtain descriptive measures by estimating baseline
coefficients, confidence intervals and tests are lacking
In such cases, a more satisfying approach is to directly model
the changing effect of the predictor over time, a topic we will
cover in a future lecture
