
Research Article

Received 4 May 2011, Accepted 9 August 2011 Published online 3 November 2011 in Wiley Online Library

(wileyonlinelibrary.com) DOI: 10.1002/sim.4392

Relative survival multistate Markov model‡
Ella Huszti^{a,b}, Michal Abrahamowicz^{a,*,†}, Ahmadou Alioum^{c,d},
Christine Binquet^{e} and Catherine Quantin^{e}
Prognostic studies often have to deal with two important challenges: (i) separating effects of predictors on
different ‘competing’ events and (ii) uncertainty about cause of death. Multistate Markov models permit mul-
tivariable analyses of competing risks of, for example, mortality versus disease recurrence. On the other hand,
relative survival methods help estimate disease-specific mortality risks even in the absence of data on causes of
death. In this paper, we propose a new Markov relative survival (MRS) model that attempts to combine these two
methodologies. Our MRS model extends the existing multistate Markov piecewise constant intensities model to
relative survival modeling. The intensity of transitions leading to death in the MRS model is modeled as the sum
of an estimable excess hazard of mortality from the disease of interest and an ‘offset’ defined as the expected haz-
ard of all-cause ‘natural’ mortality obtained from relevant life-tables. We evaluate the new MRS model through
simulations, with a design based on registry-based prognostic studies of colon cancer. Simulation results show
almost unbiased estimates of prognostic factor effects for the MRS model. We also applied the new MRS model
to reassess the role of prognostic factors for mortality in a study of colorectal cancer. The MRS model con-
siderably reduces the bias observed with the conventional Markov model that does not permit accounting for
unknown causes of death, especially if the ‘true’ effects of a prognostic factor on the two types of mortality differ
substantially. Copyright © 2011 John Wiley & Sons, Ltd.

Keywords: multistate Markov model; relative survival; unknown cause of death; disease recurrence; prognostic
factor effect; simulations

1. Introduction

Prognostic studies are essential in understanding the role of particular determinants of disease progres-
sion and mortality and thus improve prognosis and ultimately help in selecting appropriate interventions.
However, such studies face important analytical challenges. One difficulty concerns separating the
effects of putative prognostic factors on alternative clinical endpoints or ‘competing events’, such as
disease recurrence versus recurrence-free death or death from cancer versus death from other causes.
This helps understand the disease evolution and develop targeted preventive interventions by identifying
which risk is most serious for a given patient [1].
In many cancers, one nonfatal event of considerable interest is disease recurrence, which increases the
mortality risk and thus should be modeled properly. In recent prognostic studies, recurrence is typically
modeled as a time-dependent covariate in Cox’s proportional hazards model [2]. However, recurrence
can also be considered an ‘intermediate’ event and the risk of recurrence itself depends on some prog-
nostic factors. Survival analytical methods, such as Cox’s model, are limited to a single ‘endpoint’ event
and thus do not allow modeling recurrence as an ‘intermediate’ event. The multistate models, which

a Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, Canada
b University of Washington – Harborview Center for Prehospital Emergency Care, Seattle, WA, USA
c Inserm, U897, Bordeaux F-33000, France
d ISPED-Université Victor Segalen Bordeaux 2, Bordeaux F-33000, France
e Medical Informatics Department, Dijon University Hospital, Dijon, France
*Correspondence to: Michal Abrahamowicz, McGill University Health Centre, 687 Pine Avenue West, V Building, Montreal,
Que. H3A 1A1, Canada.

† E-mail: michal.abrahamowicz@mcgill.ca
‡ Supporting information may be found in the online version of this article.

Copyright © 2011 John Wiley & Sons, Ltd. Statist. Med. 2012, 31 269–286
E. HUSZTI ET AL.

generalize classic time-to-event analyses to multiple outcomes, are particularly useful for dealing with
such multiple events that may occur in different sequences [3–5].
Another frequent limitation of prognostic studies is that many data sources, such as most population-
based cancer registries [6], often record only the date of death but not the cause of death or the latter
information is often missing or unreliable. Yet a proportion of patients is likely to die of causes not
related to the disease of primary interest, especially in cancers with lower case fatality and those that
affect older subjects. This may bias the estimated effects of prognostic factors whose impact
on the disease-specific mortality is quite different from their impact on all-cause mortality. For instance,
advanced cancer stage at diagnosis is a very strong predictor of cancer-related death, but it may have little
impact on, for example, cardiovascular mortality. In contrast, while men have a higher risk of all-cause
mortality, their cancer-related mortality may not differ from that of women [7]. In addition, biased estimation of
the effects of such factors may induce residual confounding of the effects of other variables, including,
for example, treatments, with which they are correlated [8].
To address such difficulties, relative survival is frequently used in population-based cancer survival
studies [9–11] to estimate net survival, that is, survival corrected for the effect of other causes of death
[10]. More recently, relative survival methods have been extended to multivariable regression model-
ing of the effects of putative prognostic factors on disease-related mortality [10, 12–14]. Simulations
have confirmed that, in the absence of data on individual causes of death, relative survival methods
considerably improve the accuracy of the estimation and inference [13, 15].
Analyzing data from many real-life prognostic studies requires dealing with both multiple types of
clinical events and unknown causes of death. However, we are not aware of any method that would
simultaneously address both challenges. Therefore, we propose a new Markov relative survival (MRS)
model that extends the Markov piecewise constant intensities (MKVPCI) multistate model, originally
developed by Alioum and Commenges [16], to incorporate relative survival modeling. To this end, we
adapt the additive relative survival model developed by Esteve et al. [12] to the modeling of transitions
between multiple states.
Section 2 first describes the relative survival model of Esteve et al. [12] and then the MKVPCI
model. Section 3 describes in detail our new MRS model. Section 4 presents the design of the simula-
tions and Section 5 summarizes the results. Section 6 presents the application of the new MRS model to
reassess the role of prognostic factors for mortality in a study of colorectal cancer. Finally, a discussion
in Section 7 concludes the paper.

2. Overview of relative survival and MKVPCI multistate model


This section provides an overview of the two separate models that can be considered as the precursors
of our MRS model described in detail in Section 3.

2.1. Additive relative survival model


Relative survival methods are increasingly used to assess the importance of putative prognostic factors
on disease related mortality in the absence of data on individual causes of death [10, 17]. Among several
relative survival multivariable regression models proposed in the last two decades [11, 13, 18, 19], the
additive relative survival regression model developed by Esteve et al. [12] represents the hazard of the
all-cause mortality ($\lambda$) as the sum of two components

$$\lambda(t; z) = \lambda_{\mathrm{pop}}(t; z_g) + \lambda_{\mathrm{exc}}(t; z) \qquad (1)$$

where $\lambda_{\mathrm{pop}}$ is the expected hazard of ‘natural’ mortality in the underlying general population, which
depends on a vector $z_g$ of ‘generic’ risk factors, such as age, sex, race, whereas $\lambda_{\mathrm{exc}}$ is the excess
hazard from the disease of interest and depends on a different vector $z$ of covariates, some or all of which
may belong to $z_g$ [12]. The expected hazard $\lambda_{\mathrm{pop}}$ is estimated from external data obtained from
nationwide population life-tables typically stratified by age, sex, calendar time and, where applicable, race
[10, 17, 20].
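As a minimal sketch of the additive decomposition in Equation (1), the Python fragment below adds a life-table-based expected hazard to a disease-specific excess hazard. The life-table values, the function names, the conversion from annual death probability to hazard (constant hazard within each one-year age band) and the proportional-hazards form of the excess hazard are all illustrative assumptions, not details taken from Esteve et al. [12].

```python
import math

# Hypothetical life-table excerpt: annual probability of death q,
# keyed by (sex, age in whole years). Values are illustrative only.
LIFE_TABLE_Q = {
    ("M", 70): 0.030,
    ("M", 71): 0.033,
    ("F", 70): 0.015,
    ("F", 71): 0.017,
}

def expected_hazard(sex, age):
    """Expected ('natural') hazard per year, assuming a constant hazard
    within each one-year age band, so that hazard = -ln(1 - q)."""
    q = LIFE_TABLE_Q[(sex, int(age))]
    return -math.log(1.0 - q)

def excess_hazard(baseline, beta, z):
    """Excess hazard under an assumed proportional-hazards form:
    baseline * exp(beta' z)."""
    return baseline * math.exp(sum(b * x for b, x in zip(beta, z)))

def all_cause_hazard(sex, age, baseline, beta, z):
    """Equation (1): all-cause hazard = expected hazard + excess hazard."""
    return expected_hazard(sex, age) + excess_hazard(baseline, beta, z)
```

Only the excess component carries parameters to be estimated; the expected component is read from external tables and treated as known.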

2.2. Markov piecewise constant intensities model



Multistate models generalize survival models, which can be viewed as two-state models with one tran-
sition from an initial state to a terminating state. In longitudinal studies, these models are useful for


describing multiple changes in a patient’s health condition over time. The states represent various health
conditions, and transitions between states correspond to changes in a patient’s health condition. An
absorbing state is a state that once entered cannot be left, such as death. The most popular multistate
model is a Markov model. A multistate Markov model is defined as a stochastic process $\{Y(t), t \ge 0\}$
that takes values in a finite state space $S = \{1, 2, \ldots, k\}$, such that $Y(t)$ is the state occupied by the
process at time $t$. A transition between two states $h$ and $j$ ($h$ and $j$ belong to $S$) is characterized by the
transition probability

$$p_{hj}(s, t) = \Pr[Y(t) = j \mid Y(s) = h], \quad \text{for } t \ge s$$

or equivalently by the transition intensity

$$\alpha_{hj}(t) = \lim_{\Delta t \to 0} \Pr[Y(t + \Delta t) = j \mid Y(t) = h] / \Delta t, \quad \text{for } h \ne j.$$

For a homogeneous Markov model, this transition intensity is constant over time t . In this partic-
ular case, the transition probabilities can be computed easily in terms of transition intensities [21].
A natural extension of the homogeneous Markov model is a nonhomogeneous Markov model with
piecewise constant transition intensities. The time axis is divided into consecutive disjoint intervals
$[\tau_{l-1}, \tau_l)$, $l = 1, 2, \ldots, r + 1$, where $\tau_{r+1} = +\infty$, and intensity for each type of transition is allowed to
vary from one interval to another, while remaining constant within each interval. Under the proportional
intensities assumption, covariates affect transition intensities according to the following model:

$$\alpha_{hj}(t \mid Z(t)) = \alpha_{hj0}(t) \exp\left[\beta_{hj}' Z(t)\right] \qquad (2)$$

where $Z(t)$ is a matrix whose columns are $q$ possibly time-dependent covariates, $\alpha_{hj0}(t)$ is the ‘baseline’
intensity of the transition from $h$ to $j$, corresponding to $Z(t) = 0$, and $\beta_{hj}$ is a vector of
constant-over-time regression coefficients, similar to log hazard ratios in the proportional hazard (PH) model,
describing covariate effects on the intensity of transition from $h$ to $j$. The methods discussed in this
paper are limited to time-fixed covariates.
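To illustrate how transition probabilities follow from constant transition intensities in the homogeneous case, the sketch below uses the closed-form solution for a three-state illness-death structure (transitions 1 to 2, 1 to 3 and 2 to 3, with state 3 absorbing). The restriction to three states, the function names and the numerical values are assumptions made for this example; the general k-state case requires a matrix exponential instead.

```python
import math

def intensity(baseline, beta, z):
    """Equation (2) with time-fixed covariates: alpha_hj0 * exp(beta_hj' z)."""
    return baseline * math.exp(sum(b * x for b, x in zip(beta, z)))

def transition_probs(a12, a13, a23, t):
    """Transition probability matrix over an interval of length t for a
    time-homogeneous three-state illness-death model (state 3 absorbing)."""
    total = a12 + a13                      # total intensity of leaving state 1
    p11 = math.exp(-total * t)             # still in state 1
    p22 = math.exp(-a23 * t)               # still in state 2
    if abs(total - a23) > 1e-12:
        p12 = a12 * (p22 - p11) / (total - a23)
    else:                                  # limiting case total == a23
        p12 = a12 * t * p22
    return [
        [p11, p12, 1.0 - p11 - p12],
        [0.0, p22, 1.0 - p22],
        [0.0, 0.0, 1.0],
    ]
```

Each row of the returned matrix sums to one, and the matrices satisfy the Chapman-Kolmogorov property, so the probabilities over two years equal the product of two one-year matrices.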
Alioum and Commenges proposed a method and a computer program, MKVPCI, for fitting Markov
models with piecewise constant intensities and for estimating the effects of covariates on transition
intensities under the proportional intensities assumption [16]. The basic idea is to introduce artificial
time-dependent indicators of prespecified time intervals $I(t)$ defined as

$$I_l(t) = \begin{cases} 0 & \text{if } 0 \le t < \tau_l \\ 1 & \text{if } t \ge \tau_l \end{cases} \quad \text{for } l = 1, 2, \ldots, r$$

Then, the following modification of the time-homogeneous Markov model is used to estimate both the
time-varying baseline transition intensities and the regression coefficients

$$\alpha_{hj}(t \mid I(t), Z(t)) = \alpha_{hj0} \exp\left[\gamma_{hj}' I(t) + \beta_{hj}' Z(t)\right] = \alpha_{hj0}(t) \exp\left[\beta_{hj}' Z(t)\right] \qquad (3)$$

where $\alpha_{hj0}(t) = \alpha_{hj0} \exp\left[\gamma_{hj}' I(t)\right]$ is the baseline transition intensity that depends on the follow-up
time and is described by a step-function defined on intervals $[\tau_{l-1}, \tau_l)$, $l = 1, 2, \ldots, r + 1$

$$\alpha_{hj0}(t) = \begin{cases}
\alpha_{hj0} & \text{if } 0 \le t < \tau_1 \\
\alpha_{hj0} \exp\left(\gamma_{hj}^{1}\right) & \text{if } \tau_1 \le t < \tau_2 \\
\quad \vdots & \\
\alpha_{hj0} \exp\left(\gamma_{hj}^{1} + \cdots + \gamma_{hj}^{r}\right) & \text{if } t \ge \tau_r
\end{cases} \qquad (4)$$

The particular case where $r = 0$ corresponds to the time-homogeneous proportional intensities model
with constant baseline transition intensities.
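The step-function baseline can be sketched in a few lines of Python. The function names and numerical values are hypothetical, and interval boundaries are treated as left-closed, matching the intervals on which the step-function is defined in the text.

```python
import math

def indicators(t, cuts):
    """Interval indicators: 1 if t has reached cut point tau_l, else 0,
    for cut points tau_1 < ... < tau_r."""
    return [1 if t >= tau else 0 for tau in cuts]

def baseline_intensity(t, alpha0, gammas, cuts):
    """Step-function baseline: alpha0 * exp(gamma' I(t)).
    With no cut points (r = 0) this is the constant alpha0."""
    step = sum(g * i for g, i in zip(gammas, indicators(t, cuts)))
    return alpha0 * math.exp(step)
```

For example, with a single cut point at 2 years and a coefficient of ln(0.5), the baseline intensity halves after the second year of follow-up.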
In practice, the occurrence of most transient states is not observed in continuous time, but can only
be established at discrete ‘assessment times’. For example, a recurrence of a disease may be established
only at times when a patient visits the clinic and thus the time of transition to recurrence is only known to

lie within an interval between the two consecutive visits, the length of which depends on the frequency
of the visits. Such a pattern of observations leads to interval-censored data. In contrast, in a multistate


model with one absorbing state k (e.g., death), the exact time of transition to the absorbing state is often
known. The MKVPCI can handle this type of hybrid data with exact times of transitions to the absorbing
state and interval-censored data for transitions to the nonabsorbing states [16].
The MKVPCI estimation process produces full maximum likelihood estimates of the baseline tran-
sition intensities and regression coefficients [16]. Asymptotic covariance matrix of the parameters is
obtained from the empirical information matrix [22]. Various hypotheses of interest, including tests of
no effect of a covariate $Z_u$ on the intensity of a specific transition ($\beta_{hj}^{u} = 0$) and tests of the equality of
the effects of the same covariate on two different transitions ($\beta_{hj}^{u} = \beta_{h'j'}^{u}$), are also proposed [16].

3. New Markov relative survival model

If individual causes of death are not known, for individual $i$, the transition intensity $\alpha_{hk}(t \mid Z_i(t))$ for
each transition towards the absorbing state k (death) will represent the all-cause mortality hazard rate,
and the estimated covariate effects will reflect their impact on the overall mortality rather than on disease-
specific mortality. Such effects of prognostic factors on all-cause mortality may be difficult to interpret
because the effects on disease-specific mortality may differ from their effects on other-causes mortality
[12–14].
To address this problem, we propose a new MRS model that extends the original MKVPCI model
of Alioum and Commenges [16] by incorporating relative survival. The goal is to estimate the effects
of prognostic factors on disease-specific mortality, while accounting for different patterns of transitions
between $p \ge 3$ alternative states and for unknown causes of death. To this end, we adapt the approach
of the additive relative survival model (1) proposed by Esteve et al. [12] to Markov multistate modeling.
In the new MRS model, the transition intensity from any state $h$ to any nonabsorbing state $j \ne k$
remains represented by Equation (3), as in the original MKVPCI model [16]. However, the intensity of
transition leading to death (state k) is rewritten consistently with the additive relative survival model
(1). Specifically, the observed all-cause mortality transition intensity becomes the sum of the expected
hazard for ‘natural’ mortality in the general population, $\alpha_i^{\mathrm{pop}}(\mathrm{age}_i + t, \mathrm{sex}_i)$, which depends on $\mathrm{sex}_i$ and
$(\mathrm{age}_i + t)$ at time $t$ of individual $i$, and the excess hazard of mortality attributed to the disease of interest.
Therefore, in the MRS model, the transition intensity from state $h$ to state $k$, for individual $i$ becomes:

$$\alpha_{hk}(t \mid I(t), Z(t)) = \alpha_i^{\mathrm{pop}}(\mathrm{age}_i + t, \mathrm{sex}_i) + \alpha_{hk0}(t) \exp\left[\beta_{hk}' Z(t)\right] \qquad (5)$$

where $\alpha_{hk0}(t) = \alpha_{hk0} \exp\left[\gamma_{hk}' I(t)\right]$ is defined by Equation (4) and represents the baseline excess
mortality hazard in the context of the MRS model, that is, the hazard of mortality specifically because of
the disease of interest and corresponding to $Z_i(t) = 0$. Similar to the relative survival methodology, the
expected hazard $\alpha_i^{\mathrm{pop}}$ is obtained from appropriate population life-tables, stratified by age and sex [12].
Thus, $\alpha_i^{\mathrm{pop}}(\mathrm{age}_i + t, \mathrm{sex}_i)$ are considered the known subject-specific constants, that is, ‘offsets’ to the
hazard in (5), and only $\alpha_{hk0}$, $\gamma_{hk}$, and $\beta_{hk}$ are estimated in expression (6).
Relevant formulas from the original MKVPCI likelihood maximization process are modified in accor-
dance with the above modification of the transition intensities towards the absorbing state k to obtain the
maximum likelihood estimates of the baseline transition intensities and regression coefficients.
Supplementary online material (Appendix A) contains details on the MRS implementation.
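The structure of the death-transition intensity in Equation (5) can be sketched as follows. This is not the MKVPCI or MRS implementation: the function names are hypothetical, and `toy_pop_hazard` is an invented stand-in for a life-table lookup with purely illustrative values. The point is only that the population hazard enters additively as a known offset, outside the exponential that carries the estimated parameters.

```python
import math

def mrs_death_intensity(t, age_at_dx, sex, z, alpha0, gammas, cuts, beta, pop_hazard):
    """Intensity of a transition to death under Equation (5):
    known population offset + piecewise-constant excess hazard."""
    # Offset: expected hazard at attained age (age at diagnosis + t); not estimated.
    offset = pop_hazard(age_at_dx + t, sex)
    # Log-baseline shift accumulated over the cut points already reached.
    step = sum(g for g, tau in zip(gammas, cuts) if t >= tau)
    # Excess hazard: alpha_hk0 * exp(gamma' I(t) + beta' z).
    excess = alpha0 * math.exp(step + sum(b * x for b, x in zip(beta, z)))
    return offset + excess

def toy_pop_hazard(age, sex):
    """Hypothetical stand-in for a life-table lookup (illustrative values)."""
    base = 0.02 if sex == "M" else 0.01
    return base * math.exp(0.09 * (age - 70.0))
```

Setting the offset to zero for every subject recovers the all-cause intensity of the original model, which is one way to see the MRS model as an extension of it.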

4. Simulation design

In a recent simulation study, we compared the performance of different multivariable regression
approaches for modeling multi-event disease progression processes, while assuming that either the
causes of death are known or the goal is to estimate the covariate effects on the hazard of all-cause
mortality [23]. Specifically, we have fitted: (i) separate Cox’s PH models for each transient or absorbing
event with right censoring on ‘competing’ event(s); (ii) the Lunn–McNeil extension of the Cox’s model
to competing risks analyses [24]; and (iii) the multistate MKVPCI model [16], described in Section 2.2.
Overall, the MKVPCI model was applicable to the wider range of simulated situations and yielded

more accurate covariate effects estimates than the two other models [23]. These simulation results cor-
roborated previous empirical findings and methodological arguments regarding the advantages of using


Markov multistate modeling to analyze multi-event disease progression processes, involving both tran-
sient events, such as disease recurrence and absorbing events, such as death [1]. Given these results, the
present simulations aimed at comparing the performance of two Markov multistate models, namely our
new MRS model versus the MKVPCI model [16], in modeling multi-event processes in the more com-
plex situations when the causes of death are not known. Specifically, we attempted to assess the potential
advantages of the new MRS model in the case when the effect(s) of some covariate(s) on the hazard of
disease-related mortality are different from their effects on other-causes ‘natural’ mortality.
To have realistic and clinically relevant assumptions for our simulations, we based them on data on
incident cases of colon cancer from the Côte d’Or administrative district (Burgundy, France) [1, 7], using a
design generally similar to the simulations reported by Le Teuff et al. [15]. During the follow-up, these
patients may develop a recurrence and die of cancer afterwards, may die of cancer without developing a
recurrence, may die of other causes before or after developing a cancer recurrence, or may get censored
either through drop-out or administrative censoring. Furthermore, the mortality data from population-
based cancer registries are unlikely to include individual causes of death. Thus, such studies encounter
both methodological challenges addressed in our manuscript.
To simulate a registry-based prognostic study of colon cancer, we generated the 10-year follow-up
data for a hypothetical cohort of N subjects who all started in state 1 (diagnosed with colon cancer
without recurrence). We then assumed that during follow-up, the patients could have ‘progressed’ to
one or two of the following states: 2 = recurrence, 3a = death from cancer or 3b = death from other
causes. The transition intensities were assumed to depend on three prognostic factors: (i) age at diag-
nosis; (ii) sex; and (iii) cancer stage at diagnosis. In all simulations, when generating times to death
from disease (cancer) and from recurrence, we assumed that prognostic factors effects conform with the
PH assumption. In the main simulations, event times were generated from an exponential distribution,
implying constant hazard intensities, and accordingly, data were analyzed with the time-homogeneous
models. In additional simulations, event times were generated from a Weibull distribution and analyzed
with piecewise models, assuming a priori a single change in the transition intensity at 2 years of follow-
up. We introduced administrative right censoring (at 10 years) and random drop-out at times uniformly
distributed throughout the follow-up interval. Also, note that in contrast to data generation, for analysis
purposes we did not distinguish between competing causes of death (‘cancer-related’ vs ‘other causes’).
Across simulations, we considered three sample sizes ($N = 500, 1000, 1500$), and three values of the
maximum number of repeated observations ($P = 5, 10, 20$) at which the transition to intermediate state
2 (recurrence) was assessed. In contrast, the time of death was assumed to be known exactly. For each
simulation scenario, 100 data sets were independently generated and analyzed. All simulation data were
analyzed with both the original MKVPCI model [16] and the new MRS model. Because in the analysis
the cause of death was considered unknown, all deaths observed during the follow-up were considered
as the same event 3, that is, death from any cause. Accordingly, we estimated Markov models with three
states. Figure 1 shows the logically possible transitions between the three states. In relative survival anal-
yses, based on our MRS model, the French mortality tables, used to generate times to death from other
causes, were employed to determine the offsets, that is, the individual values of the expected hazard of
‘natural’ mortality in Equation (9).
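The data-generating step for the exponential scenario can be sketched as a competing-risks draw of latent event times. This is a simplified illustration rather than the simulation code used in the study: the function names and intensity values are hypothetical, covariate effects and random drop-out are omitted, the other-cause hazard is taken as constant instead of being read from mortality tables, and the interval-censoring of recurrence at assessment times is not reproduced.

```python
import random

def simulate_subject(rng, a12, a13a, a13b, a23a, a23b, max_follow_up=10.0):
    """Return (recurrence_time or None, exit_time, exit_state) for one subject.
    exit_state is '3a' (cancer death), '3b' (other-cause death) or
    'censored' (administrative censoring at max_follow_up)."""
    # Latent exponential times for the three ways of leaving state 1.
    t12 = rng.expovariate(a12)
    t13a = rng.expovariate(a13a)
    t13b = rng.expovariate(a13b)
    first = min(t12, t13a, t13b)
    if first >= max_follow_up:
        return None, max_follow_up, "censored"
    if first == t13a:
        return None, first, "3a"
    if first == t13b:
        return None, first, "3b"
    # Recurrence occurred first: draw latent death times from state 2.
    t23a = first + rng.expovariate(a23a)
    t23b = first + rng.expovariate(a23b)
    death = min(t23a, t23b)
    if death >= max_follow_up:
        return first, max_follow_up, "censored"
    return first, death, ("3a" if death == t23a else "3b")

def collapse_cause(state):
    """For analysis, both causes of death are merged into a single state 3."""
    return "3" if state in ("3a", "3b") else state
```

Generating with separate states 3a and 3b and then collapsing them for analysis mirrors the design described above, in which the cause of death is known to the simulator but hidden from the fitted models.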
The results of the two models were compared using several standard criteria. Bias in the estimated
effect of a prognostic factor was quantified as the difference between the mean of the estimates from
each of the 100 simulated datasets and the corresponding true log hazard ratio ($\beta$). The relative bias
was the ratio of the bias to the true value of $\beta$. The root mean square error (RMSE) for each of the two
models was also calculated. Then, to compare the accuracy of the two estimates, we calculated the ratio
of the corresponding RMSEs of our new MRS model and the original MKVPCI model. The empirical

Figure 1. Multistate Markov model, employed for data analysis, with three states and three transitions for disease
progression.


coverage rates of the nominal 95% confidence intervals (95% CI) were estimated as the proportion of
samples, in which the 95% CI included the true $\beta$.
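The comparison criteria can be computed as in the sketch below. The function name and the use of Wald-type intervals (estimate plus or minus 1.96 standard errors) are assumptions made for illustration; the text does not state how the 95% CIs were constructed.

```python
import math

def performance(estimates, std_errors, true_beta):
    """Bias, relative bias, RMSE and empirical 95% CI coverage of a set of
    estimates of one log hazard ratio across simulated samples."""
    n = len(estimates)
    mean_est = sum(estimates) / n
    bias = mean_est - true_beta
    rmse = math.sqrt(sum((b - true_beta) ** 2 for b in estimates) / n)
    # Assumed Wald-type interval: estimate +/- 1.96 * standard error.
    covered = sum(
        1 for b, se in zip(estimates, std_errors)
        if b - 1.96 * se <= true_beta <= b + 1.96 * se
    )
    return {
        "bias": bias,
        "relative_bias": bias / true_beta,
        "rmse": rmse,
        "coverage": covered / n,
    }
```

The RMSE ratio of two models is then simply the ratio of their `rmse` values for the same effect.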

5. Simulations results

In all analyses of the simulated data, we estimated the effects of covariates (age, sex, and cancer stage) on
three transitions between the three states shown in Figure 1. Consistent with Figure 1, from here on, we
denote the transition Cancer Diagnosis → Recurrence as ‘1→2’, Cancer Diagnosis → Recurrence-Free
Death as ‘1→3’, and Recurrence → Death after Recurrence as ‘2→3’. Across simulation experiments
we varied: (i) the strength and direction of the effects of age and sex on death from cancer; (ii) sample
size ($N$); and (iii) maximum number of repeated observations ($P$).
Table I summarizes the results of 100 simulation experiments with the sample size fixed at $N = 1500$,
resulting in about 500 cancer deaths, 420 deaths from other causes and 360 cancer recurrences in each
simulated sample. The maximum number of repeated observations was fixed at $P = 20$, implying
assessment times were distributed at 6-month intervals. Column 3 of Table I shows the true effect ($\beta$) of
a prognostic factor on the intensity of each transition.
As expected, the lack of information on cause of death had little impact on the MKVPCI estimates of
covariate effects on recurrence, so that for both MKVPCI and MRS models, biases for transition 1→2
were only minor. On the other hand, Column 4 indicates that for most effects on transitions toward death,
the original MKVPCI model yields statistically significant bias, as the corresponding 95% CI exclude
0. The only exception concerns the practically unbiased estimate of the effect of age on transition 1→3.
Most importantly, for certain effects relative bias is very large. To illustrate such potential biases, in this
scenario we purposely assumed that men have lower risk of cancer-related mortality ($\beta_{\mathrm{sex},1\to3} = \ln 0.7$),
in contrast to their higher risk of ‘natural’ other-causes death. In the absence of cause of death, the
MKVPCI model was forced to estimate the covariate effects on all-cause mortality, which resulted in a
strong underestimation of the protective effect of male sex on cancer death with a relative bias towards
the null exceeding 65%. This bias towards the null occurs because the two opposite ‘true’ effects of sex
on cancer death versus natural death largely cancel each other out, because the model cannot distinguish
between the two types of death. Table I also shows a substantial underestimation of the impact of higher
cancer stage on mortality (transitions 1→3 and 2→3). Again, the lack of the ‘true’ effect of cancer
stage on natural mortality pushed towards the null its estimated effects on cancer death both before and
after recurrence with relative biases of about –22% and about –33%, respectively. This implies that, in
the absence of information on individual causes of death, the original model’s estimates may diverge
substantially from the ‘true’ effects of covariates on the risk of cancer-related death.
Column 5 of Table I indicates that, in contrast to the original MKVPCI model, for most effects the
new MRS model did not yield statistically significant biases. Also, for only one of the nine effects, the
relative bias in the MRS estimates exceeded 5% (11% for the effect of cancer stage on recurrence).
The fact that, in these cases, the 95% CIs for the relative biases for the corresponding estimates from the
two models did not overlap indicates that this bias reduction was statistically significant.
All the coverage rates of the 95% CIs obtained from the new MRS model range between 87% and
96%, with the majority above 90% (column 7 of Table I). In contrast, for strongly biased estimates, the
original model yielded coverage rates as low as 0% and 8% (column 6).
Column 8 of Table I indicates that the MRS model’s estimates have slightly higher variances than the
corresponding MKVPCI estimates, as the SD ratios (MRS/MKVPCI) are systematically above 1.0. This
is due to the fact that the MRS model attempts to use only information about deaths from cancer, whereas
the MKVPCI estimates are based on all observed deaths. A similar increase in variance was reported in
simulations that compared the relative survival model of Esteve et al. with ‘crude’ Cox’s model esti-
mates [15]. The last column of Table I shows the ratio of RMSE of the relative survival MRS-based
estimates to the RMSE of the corresponding MKVPCI estimates. Because in both models the covariate
effects on recurrence (transition 1→2) and the effects of age on mortality have only minor bias (columns
4 and 5), for these effects the increased variance (column 8) results in slightly higher RMSE of the MRS
estimates. In contrast, for effects of sex on transition 1→3 and cancer stage on both transitions leading
to death, the much larger bias of MKVPCI estimates implies a better bias–variance trade-off of MRS
estimates, as reflected by RMSE ratios well below 1.0 (column 9 of Table I).
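The bias–variance trade-off invoked here can be made explicit with the standard decomposition of the mean squared error. The symbols $b$ (bias) and $s$ (standard deviation of the estimates across simulated samples, computed with divisor equal to the number of samples) are introduced for illustration only and do not appear in the tables.

$$\mathrm{RMSE}^2 = b^2 + s^2, \qquad \frac{\mathrm{RMSE}_{\mathrm{MRS}}}{\mathrm{RMSE}_{\mathrm{MKVPCI}}} = \sqrt{\frac{b_{\mathrm{MRS}}^2 + s_{\mathrm{MRS}}^2}{b_{\mathrm{MKVPCI}}^2 + s_{\mathrm{MKVPCI}}^2}}$$

When both biases are negligible, the RMSE ratio is approximately the SD ratio; when $b_{\mathrm{MKVPCI}}$ is large relative to both standard deviations, the denominator dominates and the ratio falls below 1.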

Table II summarizes the results for a scenario similar to that presented in Table I, except that the
event times are now generated from a Weibull distribution with decreasing hazard. The results were


Table I. Comparison of estimated prognostic factor effects between MKVPCI and MRS models, N = 1500.

| Transition type^a (1) | Prognostic factor (2) | True effect β (3) | % Relative bias (95% CI), MKVPCI (4) | % Relative bias (95% CI), MRS (5) | Coverage rate (%), MKVPCI (6) | Coverage rate (%), MRS (7) | Standard deviation ratio, MRS/MKVPCI (8) | RMSE ratio, MRS/MKVPCI (9) |
|---|---|---|---|---|---|---|---|---|
| 1→2 | Age (years) | ln(1.2) | 4.4 (0.4; 8.4) | 4.4 (0.4; 8.5) | 91 | 91 | 0.97 | 1.01 |
| 1→2 | Sex (M/F) | ln(1.3) | 5.1 (0.8; 9.4) | 3.1 (–0.3; 6.6) | 94 | 95 | 1.01 | 1.00 |
| 1→2 | Cancer stage | ln(1.5) | 9.1 (3.4; 14.7) | 10.9 (4.8; 17.0) | 91 | 89 | 0.99 | 1.05 |
| 1→3a | Age (years) | ln(1.1) | –2.2 (–5.1; 0.7) | –2.0 (–4.8; 0.7) | 97 | 96 | 1.50 | 1.25 |
| 1→3a | Sex (M/F) | ln(0.7) | –67.5 (–76.6; –58.3) | –0.3 (–1.4; 0.8) | 8 | 95 | 1.30 | 0.38 |
| 1→3a | Cancer stage | ln(3) | –21.6 (–29.7; –13.5) | –0.1 (–0.7; 0.5) | 0 | 91 | 1.50 | 0.16 |
| 2→3a | Age (years) | ln(1.1) | –4.6 (–8.6; –0.5) | –3.1 (–6.5; 0.3) | 83 | 87 | 1.44 | 1.44 |
| 2→3a | Sex (M/F) | ln(1.2) | 48.7 (38.9; 58.5) | –2.3 (–5.2; 0.6) | 92 | 96 | 1.55 | 1.41 |
| 2→3a | Cancer stage | ln(1.5) | –33.4 (–42.6; –24.1) | 5.0 (0.7; 9.3) | 59 | 92 | 1.39 | 0.75 |

a Transition 1→2 represents the transition Disease→Recurrence; 1→3 represents transition Disease→Death and 2→3 represents transition Recurrence→Death.


Table II. Comparison of estimated prognostic factor effects between piecewise MKVPCI and piecewise MRS models, N = 1500. One cut point at 2 years.

| Transition type^a (1) | Prognostic factor (2) | True effect β (3) | % Relative bias (95% CI), MKVPCI (4) | % Relative bias (95% CI), MRS (5) | Coverage rate (%), MKVPCI (6) | Coverage rate (%), MRS (7) | Standard deviation ratio, MRS/MKVPCI (8) | RMSE ratio, MRS/MKVPCI (9) |
|---|---|---|---|---|---|---|---|---|
| 1→2 | Age (years) | ln(1.2) | 6.4 (2.2; 10.6) | 6.2 (2.0; 10.4) | 88 | 89 | 1.01 | 1.02 |
| 1→2 | Sex (M/F) | ln(1.3) | 6.8 (1.2; 12.4) | 4.4 (0.4; 10.4) | 90 | 92 | 1.02 | 1.01 |
| 1→2 | Cancer stage | ln(1.5) | 10.5 (3.7; 17.3) | 12.9 (6.7; 19.1) | 87 | 84 | 1.02 | 1.10 |
| 1→3a | Age (years) | ln(1.1) | –5.3 (–11.3; 0.7) | –4.7 (–9.3; –0.1) | 90 | 91 | 1.38 | 1.30 |
| 1→3a | Sex (M/F) | ln(0.7) | –71.0 (–80.5; –61.5) | –1.1 (–0.7; 0.8) | 3 | 97 | 1.35 | 0.32 |
| 1→3a | Cancer stage | ln(3) | –21.6 (–29.7; –13.5) | –0.1 (–0.9; 0.7) | 10 | 90 | 1.40 | 0.45 |
| 2→3a | Age (years) | ln(1.1) | –8.7 (–15.3; –2.1) | –8.5 (–16.1; –0.9) | 79 | 83 | 1.48 | 1.38 |
| 2→3a | Sex (M/F) | ln(1.2) | 55.8 (41.8; 69.5) | 6.7 (–0.1; 6.8) | 43 | 91 | 1.61 | 1.42 |
| 2→3a | Cancer stage | ln(1.5) | –41.0 (–52.2; –29.8) | 8.2 (0.5; 15.9) | 51 | 93 | 1.38 | 0.81 |

a Transition 1→2 represents the transition Disease→Recurrence; 1→3 represents transition Disease→Death and 2→3 represents transition Recurrence→Death.

obtained by using a piecewise constant version of the MKVPCI and MRS with a priori selected cut-point
at $T = 2$ years from diagnosis. The overall pattern of results and, in particular, the differences
in results between the original and the new MRS model are consistent and similar to those in the time-
homogeneous scenario. Because piecewise models with a single cut-point are not able to accurately
account for a continuous change in the baseline intensity, implied by the data-generating Weibull model,
the bias in Table II is slightly larger and the coverage rates are slightly lower for both models than in the
time-homogeneous scenario (Table I). However, the coverage rates for MRS remain close to 90% in all
scenarios (Table II). As in Table I, for most effects on transitions toward death, the original piecewise
MKVPCI model yields statistically significant bias, up to 71%, whereas the corresponding bias in the
piecewise MRS never exceeds 8% (Table II).
Table III presents the results of the second simulation scenario in which the sample size was decreased
to $N = 1000$, resulting in about 300 cancer deaths, 240 deaths from other causes, and 230 cancer recurrences
per simulated sample, while the event times were generated from the exponential distribution as
in Table I. The maximum number of repeated observations was kept at $P = 20$, as in the first scenario.
While most covariate effects were assumed to be the same as in Table I, there were important changes.
First, males were assumed to have a higher risk of cancer mortality than females ($\beta_{\mathrm{sex},1\to3} = 2.0$ in
Table III), so that effects of sex on both types of death were now similar. Second, we assumed the true
effect of older age for transition 1→3 to be protective, that is, in the opposite direction from its effect
on natural death. Such an assumption is not totally implausible, given the nonmonotonic impact of age
on the risk of recurrence and death in different cancers, where the risks tend to decrease with age at
diagnosis increasing from 30 to about 50 years [14, 25, 26].
This induced a very strong bias in estimates from the conventional MKVPCI model, which was unable
to discriminate between the two types of death, with opposite age effects. Indeed, the 98% bias towards
the null for the effect of age on transition 1→3 (column 4 in Table III) indicates that the two effects
practically cancelled each other out. Accordingly, in this situation, the conventional model would incor-
rectly suggest that age has no effect on the risk of death before recurrence. In contrast, the proposed
relative survival MRS model yielded only a minor relative bias of about 6.5%, that is, it was able to
correctly recover the ‘true’ protective effect of older age on cancer-related mortality, even in the absence
of the information on the cause of death. This very large difference in bias of the age effect estimates
obtained with the two models also implied a strong difference in the corresponding coverage rates (0%
for MKVPCI versus 97% for the MRS) and a much better overall accuracy of the relative survival estimates
(RMSE ratio of 0.19 in column 9 of Table III). Furthermore, MKVPCI estimates of the effects
of higher cancer stage on death before or after recurrence showed relative biases of 55% and 47%,
respectively.
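The performance measures reported in Tables I–III (relative bias, coverage rate, RMSE) can be illustrated with a minimal sketch. It assumes that relative bias is the mean estimation error divided by the true effect and that coverage refers to Wald-type 95% confidence intervals; the function name `performance` and the toy inputs are hypothetical and do not reproduce the simulated data.

```python
import math
from statistics import mean, stdev

def performance(estimates, std_errors, true_beta, z=1.96):
    """Summarize simulated estimates of one regression coefficient.

    Assumed definitions (not quoted from the article):
      relative bias (%) = 100 * (mean estimate - true value) / true value
      coverage (%)      = share of Wald intervals estimate +/- z*SE containing the true value
      RMSE              = square root of the mean squared estimation error
    """
    rel_bias = 100.0 * (mean(estimates) - true_beta) / true_beta
    covered = [
        b - z * se <= true_beta <= b + z * se
        for b, se in zip(estimates, std_errors)
    ]
    coverage = 100.0 * sum(covered) / len(covered)
    rmse = math.sqrt(mean([(b - true_beta) ** 2 for b in estimates]))
    return {
        "rel_bias_pct": rel_bias,
        "coverage_pct": coverage,
        "sd": stdev(estimates),
        "rmse": rmse,
    }

# Toy illustration: estimates stuck at the null when the true effect is ln(0.9)
toy = performance([0.0, 0.0], [0.01, 0.01], math.log(0.9))
print(toy)
```

With estimates at the null and a protective true effect, the sketch returns a relative bias of -100% and zero coverage, which is the pattern described above for the MKVPCI age effect on transition 1→3.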
On the other hand, in Scenario 2, conventional MKVPCI estimates of the effect of male sex on death
(transitions 1→3 and 2→3) showed only moderate bias. This was expected, as in this scenario men
were assumed to have a higher risk of cancer-related death, similar to their higher other-cause mortality.
Finally, while our MRS model substantially reduced large biases, its estimates in Table III were
somewhat more biased than in Table I. This may be because of the smaller sample size.
Figures 2 and 3 help assess the impact of increasing the sample size on the relative bias and coverage
rates of parameter estimates, respectively. Specifically, the two figures compare the estimated effects of
sex, obtained with the MKVPCI versus the MRS models, across the sample sizes (N = 500, 1000 or
1500) for the three transitions. Generally, the graphs suggest that results presented in Tables I and II are
robust with respect to different sample sizes. One exception is the MKVPCI estimate of the sex effect on
transition 1→3 (Figure 3(b)), for which the coverage rate decreases with increasing N. This is simply
because a smaller N implies a wider confidence interval. Furthermore, there is some trend for the improved
performance of MRS estimates with increasing sample size: bias tends to decrease and coverage rates get
closer to 95%.
Figures 4 and 5 assess the impact of the number of repeated observations (P = 5, 10, or 20) on bias
and coverage rates obtained from the two models. As in Figures 2 and 3, results are presented for the
effect of sex, but similar results are observed for other covariates (data not shown). Because recurrence
(state ‘2’) is observed at those P periodic examinations, the timing of recurrence becomes more accurate
as these examinations become more frequent. As a consequence, the estimates for transition 1→2,
leading to recurrence, become less biased as P increases (Figure 4(a)). For similar reasons, the coverage
rates for transition 2→3 improve with increasing P for both models. Most importantly, regardless of
the value of P, our new MRS model yields systematically lower bias and higher coverage rates than the
conventional model. As expected, P has no impact on the estimates for transition 1→3 (cancer diagnosis
to recurrence-free death), which does not involve recurrence.

Copyright © 2011 John Wiley & Sons, Ltd. Statist. Med. 2012, 31 269–286

Table III. Comparison of estimated prognostic factor effects between MKVPCI and MRS models, N = 1000.

Transition  Prognostic    True        % Relative bias (95% CI)                        Coverage rate (%)  SD ratio      RMSE ratio
type^a      factor        effect (β)  MKVPCI                  MRS                     MKVPCI   MRS       (MRS/MKVPCI)  (MRS/MKVPCI)
(1)         (2)           (3)         (4)                     (5)                     (6)      (7)       (8)           (9)
1→2         Age (years)   ln(1.2)     4.1 (0.2; 8.0)          1.9 (–0.8; 4.6)         85       94        0.01          0.01
1→2         Sex (M/F)     ln(1.3)     5.3 (0.9; 9.7)          8.2 (2.8; 13.5)         94       94        0.15          0.15
1→2         Cancer stage  ln(1.5)     –4.0 (–7.9; –0.2)       3.1 (–0.3; 6.4)         93       94        0.05          0.05
1→3^a       Age (years)   ln(0.9)     –98.6 (–100.9; –96.3)   6.6 (1.7; 11.4)         0        97        0.10          0.02
1→3^a       Sex (M/F)     ln(2)       –9.6 (–15.4; –3.8)      4.4 (0.4; 8.4)          90       96        0.13          0.20
1→3^a       Cancer stage  ln(3)       –54.8 (–64.6; –45.1)    1.5 (–0.9; 3.9)         0        93        0.60          0.10
2→3^a       Age (years)   ln(1.1)     –5.1 (–9.4; –0.8)       –10.1 (–16.0; –4.2)     84       83        0.01          0.02
2→3^a       Sex (M/F)     ln(1.2)     22.9 (14.6; 31.1)       –8.7 (–14.2; –3.2)      93       94        0.18          0.26
2→3^a       Cancer stage  ln(1.5)     –47.4 (–57.2; –37.6)    –7.7 (–13.0; –2.5)      16       87        0.20          0.10
^a Transition 1→2 represents the transition Disease→Recurrence; 1→3 represents the transition Disease→Death; and 2→3 represents the transition Recurrence→Death.

Figure 2. Comparison of % Relative Bias of estimated SEX effects between MKVPCI and MRS models. P = 20.
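The observation scheme discussed for Figures 4 and 5 (death times known exactly, recurrence detected only at the first of P periodic examinations after it occurs) can be sketched as follows. This is a simplified illustration, not the data-generating program used in the simulations: it assumes constant transition intensities without covariates, equally spaced examinations over a fixed follow-up, and hypothetical rate values and function names.

```python
import math
import random

def simulate_patient(rate_12, rate_13, rate_23, n_visits, follow_up, rng):
    """Simulate one patient in the three-state model (1: diagnosis, 2: recurrence, 3: death).

    Latent exponential times compete out of state 1; after a recurrence, the time
    to death is drawn from the 2->3 intensity. Recurrence is recorded at the first
    scheduled examination at or after its true time, provided the patient is still
    alive and under follow-up at that examination.
    """
    t_rec = rng.expovariate(rate_12)
    t_death = rng.expovariate(rate_13)
    if t_rec < t_death:
        t_death = t_rec + rng.expovariate(rate_23)
    else:
        t_rec = None  # death without recurrence (transition 1->3)

    gap = follow_up / n_visits  # spacing between examinations
    seen_rec = None
    if t_rec is not None:
        visit = math.ceil(t_rec / gap) * gap
        if visit <= min(t_death, follow_up):
            seen_rec = visit

    seen_death = t_death if t_death <= follow_up else None  # exact death time
    return {
        "true_recurrence": t_rec,
        "observed_recurrence": seen_rec,
        "observed_death": seen_death,
    }

# Hypothetical yearly rates; P = 5 examinations over 10 years (one every 2 years)
rng = random.Random(2011)
cohort = [simulate_patient(0.10, 0.05, 0.50, 5, 10.0, rng) for _ in range(1000)]
```

In this sketch the recorded recurrence time overshoots the true time by less than one examination interval, so increasing `n_visits` shrinks the timing error for transitions 1→2 and 2→3 while leaving transition 1→3 unaffected.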

6. Application to colorectal cancer

We applied our MRS model and compared its results with those of alternative models, to reassess the effects
of putative prognostic factors on colorectal cancer prognosis and cancer-related mortality in a study that
uses a population-based registry, in which causes of death are unknown. The cohort consisted of 874 consecutive
patients from the Registry of Digestive Tumors of the Côte d’Or, France [27] who underwent
surgery for colorectal cancer between 1976 and 1984. The main exclusion criteria were nonepithelial
cancers and short-term post-surgical mortality (deaths within 30 days after surgery) [1]. Baseline demo-
graphic and clinical data were obtained for all patients from medical records and recorded in the Registry,
while information on vital status and date of death was obtained from administrative and medical sources
[27]. Dates of first diagnosis of recurrence and/or metastasis (post-surgery) were established through a
retrospective chart review [28].
The data included two demographic factors: sex and age at diagnosis (in years), and baseline clinical
data on the date of diagnosis, cancer stage and tumor site at the time of cancer diagnosis. Tumor site was
grouped into two categories: colon and rectum [1, 29]. Baseline cancer stage was classified into three
categories: Dukes A, B and C tumors. Stage D patients were excluded because they were considered as
presenting a metastasis at baseline [1].
Data were analyzed using: (i) the conventional piecewise MKVPCI model [16]; (ii) the new piecewise
MRS model proposed in Section 3; (iii) when applicable, separate endpoint-specific Cox’s PH models
[30]; and (iv) the Lunn–McNeil competing risks model [24]. The analysis defined three states: the initial
state 1 corresponded to ‘cancer diagnosis’ with time 0 defined as the date of surgery, state 2 represented
local cancer recurrence, and state 3 represented death. The two Markov models simultaneously estimated the
effects of all four prognostic factors on the hazard of each of the three endpoints corresponding to transitions to:
(i) recurrence 1→2; (ii) death without recurrence 1→3; and (iii) death after recurrence 2→3. Three separate
Cox models were fitted, one for each of the three endpoints (recurrence, death without recurrence, and
death after recurrence). While fitting the (separate) Cox models for recurrence and death without recurrence,
subjects were censored at the time of the ‘competing event’ (respectively, death without recurrence
or recurrence). The third Cox model, death after recurrence, was estimated using only those subjects who
had a recurrence and defining the time 0 as the time of recurrence. Finally, the Lunn–McNeil model was
used to estimate the covariate effects on the hazard of the two competing events: (i) recurrence and (ii)
death without recurrence. In contrast to the other models, the competing risks Lunn–McNeil model [24]
could not be used to analyze a sequence of events and thus was not implemented for the analyses of
the transition 2→3 from recurrence to death after recurrence [23]. In all Markov model analyses, only
time to death was assumed to be known exactly. In the analyses relying on the Cox or Lunn–McNeil
models, the time of recurrence was assumed to correspond to the time of the first clinic visit when it
was recorded [1, 27]. Because in many cancers, including colorectal cancer, mortality is much higher
in the first year after diagnosis [14] when many patients die because of post-surgery complications, in
both Markov models, baseline hazards were assumed to be piecewise constant with the change point at
1 year (365 days). Recurrence and metastasis times were recorded in the data set as times of the first
clinic visit when they were detected [1, 25]. For the MRS model, the probability of natural all-cause
mortality was obtained from mortality tables for the general population of France [31] and was assigned
to each individual at each transition step based on gender, calendar year of death, and his/her updated age.

Figure 3. Comparison of Coverage Rate of estimated SEX effects between MKVPCI and MRS models. P = 20.
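As a rough illustration of how the intensity of a transition to death is assembled in a relative survival setting, the sketch below adds an expected ‘natural’ hazard, looked up by sex, calendar year and attained age, to an excess hazard with a piecewise-constant baseline that changes at 365 days. The log-linear covariate effect, the per-day units, the dictionary-based life table and all numeric values are assumptions made for the example; they are not taken from the French mortality tables or from the fitted model.

```python
import math

def excess_hazard(t_days, covariates, betas, base_early, base_late, cut_days=365.0):
    """Excess (disease-related) hazard: piecewise-constant baseline times exp(beta'x)."""
    baseline = base_early if t_days < cut_days else base_late
    linear_predictor = sum(betas[name] * value for name, value in covariates.items())
    return baseline * math.exp(linear_predictor)

def expected_hazard(life_table, sex, calendar_year, age_years):
    """Expected hazard of 'natural' death for a matched member of the general population."""
    return life_table[(sex, calendar_year, int(age_years))]

def death_intensity(t_days, covariates, betas, base_early, base_late,
                    life_table, sex, calendar_year, age_years):
    """Total intensity of a transition to death: expected hazard (offset) + excess hazard."""
    return (expected_hazard(life_table, sex, calendar_year, age_years)
            + excess_hazard(t_days, covariates, betas, base_early, base_late))

# Hypothetical toy inputs, chosen only to make the arithmetic easy to follow
toy_table = {("M", 1980, 70): 1.2e-4, ("M", 1981, 71): 1.3e-4}
toy_betas = {"stage_C": math.log(8.0)}
early = death_intensity(100, {"stage_C": 1}, toy_betas, 2e-4, 5e-5,
                        toy_table, "M", 1980, 70)
late = death_intensity(500, {"stage_C": 1}, toy_betas, 2e-4, 5e-5,
                       toy_table, "M", 1981, 71)
print(early, late)
```

Because the expected hazard enters as a fixed offset, only the baseline excess hazards and the regression coefficients would need to be estimated; the life-table lookup is updated as the patient ages and the calendar year advances.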
Secondary analyses included testing the null hypotheses of no difference between the effects of
advanced cancer stage and tumor site on the transition to recurrence (1→2) versus their effects on the
transition to death without recurrence (1→3). These tests were limited to the analyses that relied on the
two Markov models and the Lunn–McNeil model, as it was impossible to formally test the difference
between the estimates from two separate endpoint-specific Cox models.

Figure 4. Comparison of % Relative Bias of estimated SEX effects between MKVPCI and MRS models. N = 1500.

The cohort of 874 patients was followed for up to 11 years, with a median follow-up time of 3.7 years.
The number of patients who experienced specific transitions between different health states is shown in
Figure 6. Table IV summarizes the baseline characteristics of the cohort.
Table V compares the estimated effects of prognostic factors (rows), yielded by the four models
(columns), on the hazard of each of the three transitions. As expected, for the transition from surgery
to recurrence (1→2) that did not involve death, results obtained with all models were very similar. All
models indicated a significant increase of risk of recurrence for patients with more advanced cancer
stages, especially stage C, and for cancer of the rectum relative to colon (upper part of Table V). For
the transition from recurrence to death (2→3), where the Lunn–McNeil model could not be applied (see
above), the estimates obtained with the Cox model were also generally similar to those from the two
Markov models (bottom part of Table V). However, the Cox model-based estimated impact of advanced
cancer stage C on the mortality after recurrence was weaker and, in contrast to both Markov models,
did not reach statistical significance (second-to-last row of Table V). No other statistically significant
predictors of post-recurrence mortality were identified by any of the three models. This finding is consistent
with the literature and reflects the fact that survival after colorectal cancer recurrence is almost
uniformly very short, regardless of individual prognostic factors observed at the time of initial cancer
diagnosis.
For the transition from surgery to death without recurrence (1→3), the results yielded by the MRS
did differ significantly from those obtained with the three other ‘conventional’ models, which did not
account for unknown causes of death and thus estimated the effects of covariates on the hazard of all-cause
mortality. Specifically, in contrast to the two other transitions, the two Markov models produced
substantially different estimates. First, the baseline hazard for transition 1→3 decreased more than
two-fold when using the MRS model relative to the conventional MKVPCI model (data not shown).


Figure 5. Comparison of Coverage Rate of estimated SEX effects between MKVPCI and MRS models. N = 1500.

Figure 6. Number of patients in each state and transition.

Table IV. Baseline characteristics of colorectal cancer cohort subjects.

Variable                        N = 874
Age (years)
  Median (IQR)                  70 (61–77)
Male sex, number (%)            451 (51.6)
Cancer stage, number (%)
  Cancer stage A                186 (21.3)
  Cancer stage B                440 (50.3)
  Cancer stage C                248 (28.4)
Tumor site, number (%)
  Colon                         497 (56.9)
  Rectum                        377 (43.1)


Table V. Results of three-state models of progression of colorectal cancer (N = 874).

TR^a  Variable               Cox model           Lunn–McNeil         MKVPCI              MRS
                             HR    95% CI        HR    95% CI        HR    95% CI        HR    95% CI
1→2   Age (years)            1.01  1.00–1.03     1.02  1.00–1.03     1.02  1.00–1.03     1.02  1.00–1.03
      Male vs female         1.17  0.84–1.62     1.23  0.89–1.71     1.26  0.91–1.74     1.25  0.90–1.73
      Cancer stage: B vs A   1.94  1.13–3.34     1.98  1.15–3.42     2.02  1.17–3.49     2.02  1.17–3.49
      Cancer stage: C vs A   5.36  3.14–9.14     5.94  3.50–10.13    6.16  3.60–10.55    6.16  3.60–10.54
      Site: rectum vs colon  2.27  1.62–3.17     2.38  1.70–3.32     2.40  1.72–3.35     2.39  1.71–3.33

1→3   Age (years)            1.04  1.03–1.05     1.04  1.03–1.05     1.04  1.03–1.06     1.01  0.99–1.03
      Male vs female         1.46  1.16–1.84     1.44  1.14–1.81     1.47  1.16–1.85     1.34  0.89–2.00
      Cancer stage: B vs A   1.26  0.90–1.76     1.24  0.89–1.74     1.26  0.90–1.77     2.05  0.77–5.45
      Cancer stage: C vs A   2.91  2.06–4.12     2.80  1.98–3.95     2.91  2.06–4.13     8.03  3.14–20.56
      Site: rectum vs colon  1.16  0.91–1.47     1.13  0.89–1.44     1.15  0.91–1.47     1.36  0.91–2.04

2→3   Age (years)            1.01  1.00–1.03     NA^b  NA^b          1.01  1.00–1.03     1.01  0.99–1.03
      Male vs female         0.82  0.58–1.17     NA^b  NA^b          0.86  0.60–1.23     0.83  0.57–1.20
      Cancer stage: B vs A   1.17  0.64–2.14     NA^b  NA^b          1.22  0.67–2.22     1.29  0.68–2.45
      Cancer stage: C vs A   1.54  0.85–2.78     NA^b  NA^b          1.87  1.04–3.38     2.04  1.08–3.85
      Site: rectum vs colon  1.10  0.76–1.60     NA^b  NA^b          1.03  0.71–1.50     1.06  0.72–1.57
^a TR, transition type: 1→2 (surgery→local recurrence); 1→3 (surgery→death); 2→3 (local recurrence→death).
^b NA, the Lunn–McNeil method could be used only on the competing risks of recurrence (1→2) and death without recurrence (1→3).
Significance thresholds: P ≤ 0.05; P ≤ 0.001.


This substantial reduction in the baseline hazard after accounting for ‘natural’ mortality suggests that
many deaths observed among patients whose cancers did not recur are from causes other than colorectal
cancer. For this reason, the hazard for excess cancer-related mortality estimated by MRS is
considerably lower than the hazard of all-cause observed mortality estimated by MKVPCI. Second,
all results of the conventional models suggested a significantly increased risk of recurrence-free all-cause
mortality associated with both older age (Cox: hazard ratio (HR) = 1.04 [1.03–1.05]; Lunn–McNeil:
HR = 1.04 [1.03–1.05]; MKVPCI: HR = 1.04 [1.03–1.06] for a 1-year increase in
age) and male gender (Cox: HR = 1.46 [1.16–1.84]; Lunn–McNeil: HR = 1.44 [1.14–1.81]; MKVPCI:
HR = 1.47 [1.16–1.85]) (middle part of Table V). In contrast, after having accounted for the expected
risk of natural mortality, MRS yielded lower and statistically nonsignificant estimates of the effects for
both age (HR = 1.01 [0.99–1.03]) and male gender (HR = 1.34 [0.89–2.00]). This reduction of
the estimated impact of both factors in the relative survival-based MRS model can be explained by the
fact that both older age and male gender are strong prognostic factors for all-cause mortality. Our MRS
analyses were essential to demonstrate that, once these well-known effects were accounted for, neither
characteristic had a statistically significant association with cancer-related mortality among patients
who had no cancer recurrence.
On the other hand, the effect of cancer stage C on transition 1→3 estimated in the MRS model
(HR = 8.03 [3.14–20.56]) was almost three times higher than in any of the conventional models (Cox:
HR = 2.91 [2.06–4.12]; Lunn–McNeil: HR = 2.80 [1.98–3.95]), including the existing MKVPCI model
(HR = 2.91 [2.06–4.13]). Here, the MRS helped isolate the dramatic impact of advanced cancer stage,
which is unlikely to affect mortality from other causes, specifically on cancer-related mortality. Interestingly,
the conventional models, which were not able to separate the two types of mortality, suggested that
the impact of advanced cancer stage (C) on the risk of transition to death without recurrence (1→3) was
significantly weaker than its impact on the transition to recurrence (1→2) (Lunn–McNeil: χ² = 5.43,
df = 1, p = 0.02; MKVPCI: χ² = 6.15, df = 1, p = 0.013). In contrast, in the MRS model, the
difference between the two effects became completely nonsignificant (χ² = 0.20, df = 1, p = 0.66),
while the point estimate for recurrence-free death (HR = 8.03) was actually higher than for recurrence
(HR = 6.16).
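The p-values quoted for these one-degree-of-freedom tests can be checked directly from the reported χ² statistics. The sketch below uses only the identity that a χ² variable with 1 df is a squared standard normal; it does not reproduce how the test statistics themselves were computed, and the function name is hypothetical.

```python
import math

def chi2_pvalue_1df(statistic):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    # For df = 1, P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)), Z standard normal.
    return math.erfc(math.sqrt(statistic / 2.0))

# Chi-square statistics as reported in the text for stage C, 1->2 versus 1->3
reported = {"Lunn-McNeil": 5.43, "MKVPCI": 6.15, "MRS": 0.20}
p_values = {model: chi2_pvalue_1df(x) for model, x in reported.items()}
for model, p in p_values.items():
    print(f"{model}: chi2 = {reported[model]:.2f}, p = {p:.3f}")
```

The resulting values agree with the reported p-values up to the rounding of the χ² statistics.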
Overall, in these empirical analyses, our new relative survival MRS model yielded a substantially better
fit to the data compared with the existing MKVPCI model (deviance of 9074 vs 9150 for the same
number of degrees of freedom). In addition to improving the fit to data, the MRS model provided important new
insights about the effects of prognostic factors on the hazard of recurrence-free cancer-related death, and
eliminated some spurious differences between the effects of advanced cancer stage on the competing
risks of cancer recurrence versus recurrence-free mortality.

7. Discussion

We have proposed a new multistate MRS model to simultaneously address two important methodologi-
cal challenges that frequently arise in prognostic studies: (i) separating the effects of putative prognostic
factors on different endpoints and (ii) unknown causes of death. In a previous simulation study, we
already demonstrated the advantages of Markov multistate modeling in addressing the first challenge,
when assuming the causes of death are known [23]. In this context, when compared with either fitting
separate Cox’s PH models for each event or using the Lunn–McNeil competing risks model [24], the
multistate MKVPCI model [16] yielded, on average, more accurate covariate effects estimates and was
applicable in a wider range of scenarios than the survival-based models [23]. Accordingly, in the present
study, we evaluated the performance of our new MRS model that extends the MKVPCI model of Alioum
and Commenges [16] to the relative survival analyses in addressing the second challenge. To this end,
we simulated different clinically plausible scenarios in which individual causes of death were not known
and analyzed the simulated data with both the existing MKVPCI model and the new MRS model. How-
ever, it should be emphasized that the MKVPCI model was not originally developed to deal with such
incomplete mortality data. Accordingly, the focus of simulations was not on demonstrating biases in the
conventional estimates, but rather on exploring to what extent our new model could provide reasonably
unbiased estimates.
Our simulations showed that the MRS model is able to accurately estimate the effects of prognostic
factors on cancer-related mortality even if: (i) these effects are quite different from the effects on the ‘natural’
mortality and (ii) individual causes of death are not known. As expected, such gains in accuracy
were especially important in simulations where we assumed very different (even opposite) ‘true’ effects
of age and/or sex on disease-related mortality versus ‘natural’ mortality. In such cases, not accounting
properly for other-cause ‘natural’ mortality resulted in the two opposite effects cancelling out and the
conventional estimates being strongly biased toward the null. In real-life studies, this could lead to misleading
conclusions about the lack of increased risk of disease-related mortality for specific high-risk
subgroups of patients who in fact should be targeted by a more aggressive treatment. On the other hand,
in simulations where we assumed that age and sex effects were similar for deaths of different causes,
both the original MKVPCI model and the new MRS model performed relatively well, with similar, minor
biases for estimated effects of each of the two factors. However, even in this case, for prognostic factors
that affected only disease-related mortality but not ‘natural’ death, such as ‘cancer stage’, conventional
MKVPCI estimates were still biased towards the null compared with their ‘true’ effects on disease-
specific mortality. In contrast, our MRS model allowed practically unbiased estimation of such effects,
limited to only one type of mortality. In simulations, where the generated data did not conform to a
time-homogeneous intensities assumption, the estimates yielded by the new, piecewise-constant intensi-
ties version of the MRS showed similar advantages relative to the existing piecewise-constant MKVPCI
estimates. Furthermore, the piecewise-constant MRS considerably improved the fit to empirical data and
yielded new insights about the impact of different prognostic factors on mortality in
colorectal cancer, where the assumption of a constant baseline hazard did not hold.
The new MRS model attempts to use only information about ‘excess’ mortality from cancer, whereas
the MKVPCI estimates are based on all observed deaths. As a consequence, in simulations, we typi-
cally observed an increased variance for the new model estimates. This variance inflation was higher if
the proportion of deaths from ‘natural’ mortality increased. On the other hand, while a relatively large
number of unidentified ‘natural’ deaths would lead to a higher variance in the MRS model, this might
also lead to a stronger bias in conventional analyses. Indeed, our simulations showed that, in such situ-
ations, the bias-variance trade-off was better for the MRS model; in most cases where the conventional
MKVPCI model produced strongly biased results, the RMSE from our model was smaller (RMSE ratio
< 1 in Tables I–III).
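The trade-off described here follows from the standard decomposition of the mean squared error, written below for an estimator $\hat\beta$ of a true effect $\beta$. The notation is introduced only for this illustration; the RMSE ratio is the quantity reported in the last column of Tables I–III.

```latex
\mathrm{RMSE}(\hat\beta)^2
  = \bigl(\mathrm{E}[\hat\beta] - \beta\bigr)^2 + \mathrm{Var}(\hat\beta),
\qquad
\text{RMSE ratio}
  = \frac{\mathrm{RMSE}_{\mathrm{MRS}}}{\mathrm{RMSE}_{\mathrm{MKVPCI}}} .
```

A ratio below 1 therefore means that the reduction in squared bias achieved by the MRS model outweighs its larger variance.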
In all simulated scenarios, the coverage rates of the 95% confidence intervals for MRS estimates were
above 90%. In contrast, the conventional model produced very low coverage rates in scenarios where
the estimates were seriously biased. When we varied the sample size, the general pattern of results
remained similar (Figures 2 and 3), which suggests some robustness of the results and conclusions of
our simulations.
Our simulations assumed, realistically, that time to death is known exactly while time to recurrence
is not. Therefore, the accuracy of the latter improves with increasing frequency of repeated assessment
times. Interestingly, in our MRS model, even a large interval of two years (P = 5 over a 10-year
follow-up) produced effective estimates with reasonably low bias.
As in most simulation studies, the assumptions underlying our data generation schemes were rather
arbitrary. However, we aimed at simulating scenarios somewhat typical of registry-based cancer prognos-
tic studies. Future simulation evaluations of the MRS model should include a larger number of prognostic
factors and a higher variety of their ‘true’ effects on various events. Future work should also evaluate
the accuracy of testing if the effects of a prognostic factor on different transitions are equal (e.g.
$H_0\colon \beta^{\mathrm{sex}}_{1\to 2} = \beta^{\mathrm{sex}}_{1\to 3}$). Finally, similar to single-event relative and ‘crude’ survival models [13, 14],
incorporating flexible modeling of time-dependent and/or nonlinear covariate effects in our MRS will further
enhance the accuracy of real-life analyses of the complex relationships between prognostic factors and
the risks of various events.
We hope that our encouraging results will stimulate both further methodological research on the
refinement of the proposed MRS model and its applications in clinical prognostic studies.

Acknowledgements
We would like to thank Marie-Eve Beauchamp, Raluca Ionescu-Ittu and Nora Bohossian for their careful review
of the article. Michal Abrahamowicz is a James McGill Professor at McGill University and this research was
supported by his grants from the Natural Sciences and Engineering Research Council of Canada (# 228203) and
the Canadian Institutes of Health Research (#MOP-81275).


References
1. Dancourt V, Quantin C, Abrahamowicz M, Binquet C, Alioum A, Faivre J. Modeling recurrence in colorectal cancer.
Journal of Clinical Epidemiology 2004; 57:243–251.
2. Bebchuk JD, Betensky RA. Tests for treatment group differences in the hazards for survival, before and after the
occurrence of an intermediate event. Statistics in Medicine 2005; 24:359–378.
3. Andersen PK, Keiding N. Multi-state models for event history analysis. Statistical Methods in Medical Research 2002;
11:91–115.
4. Hougaard P. Multi-state models: a review. Lifetime Data Analysis 1999; 5:239–264.
5. Commenges D. Multi-state models in epidemiology. Lifetime Data Analysis 1999; 5:315–327.
6. Remontet L, Estève J, Bouvier AM, Grosclaude P, Launoy G, Menegoz F, Exbrayat C, Tretare B, Carli PM, Guizard
AV, Troussard X, Bercelli P, Colonna M, Halna JM, Hedelin G, Macé-Lesec’h J, Peng J, Buemi A, Velten M, Jougla E,
Arveux P, Le Bodic L, Michel E, Sauvage M, Schvartz C, Faivre J. Cancer incidence and mortality in France over the
period 1978-2000. Revue d’Epidémiologie et de Santé Publique 2003; 51:3–30.
7. Quantin C, Abrahamowicz M, Moreau T, Bartlett-Esquilant G, MacKenzie T, Tazi MA, Lalonde L, Faivre J. Variation
over time of the effects of prognostic factors in a population based study of colon cancer: Comparison of statistical models.
American Journal of Epidemiology 1999; 150:1188–1200.
8. Brenner H, Blettner M. Controlling for continuous confounders in epidemiologic research. Epidemiology 1997; 8:
429–434.
9. Berkson J. The Calculation of Survival Rates. In Carcinoma and Other Malignant Lesions of the Stomach, Walters W,
Gray K, Priestley JT (eds). Chap. XXII. W. B. Saunders Co.: Philadelphia, 1942; 467–484.
10. Dickman PW, Sloggett A, Hills M, Hakulinen T. Regression models for relative survival. Statistics in Medicine 2004;
23:51–64.
11. Bolard P, Quantin C, Esteve J, Faivre J, Abrahamowicz M. Modeling time dependent hazard ratios in relative survival:
application to colon cancer. Journal of Clinical Epidemiology 2001; 54:986–996.
12. Esteve J, Benhamou E, Croasdale M, Raymond L. Relative survival and the estimation of net survival: elements for further
discussion. Statistics in Medicine 1990; 9:529–538.
13. Giorgi R, Abrahamowicz M, Quantin C, Bolard P, Esteve J, Gouvernet J, Faivre J. A relative survival regression model
using B-spline functions to model non-proportional hazards. Statistics in Medicine 2003; 22:2767–2784.
14. Remontet L, Bossard N, Belot A, Estève J. French network of cancer registries FRANCIM. An overall strategy based
on regression models to estimate relative survival and model the effects of prognostic factors in cancer survival studies.
Statistics in Medicine 2007; 26:2214–2228.
15. Le Teuff G, Abrahamowicz M, Bolard P, Quantin C. Comparison of Cox’s and relative survival models when estimat-
ing the effects of prognostic factors on disease-specific mortality: a simulation study under proportional excess hazards.
Statistics in Medicine 2005; 24:3887–3909.
16. Alioum A, Commenges D. MKVPCI: A computer program for Markov models with piecewise constant intensities and
covariates. Computer Methods and Programs in Biomedicine 2001; 64:109–119.
17. Hakulinen T. On long-term relative survival rates. Journal of Chronic Diseases 1977; 30:431–443.
18. Stare J, Pohar M, Henderson R. Goodness of fit of relative survival models. Statistics in Medicine 2005; 24:3911–3925.
19. Sasieni PD. Proportional excess hazards. Biometrika 1996; 83:127–141.
20. Buckley JD. Additive and multiplicative models for relative survival rates. Biometrics 1984; 40:51–62.
21. Cox DR, Miller HD. The theory of stochastic processes. Wiley: New York, 1965.
22. Kalbfleisch JD, Lawless JF. The analysis of panel data under a Markov assumption. Journal of the American Statistical
Association 1985; 80:863–871.
23. Huszti E, Abrahamowicz M, Alioum A, Quantin C. Comparison of selected methods for modeling of multi-state dis-
ease progression processes: a simulation study. Communications in Statistics - Simulation and Computation 2011;
40:1402–1421.
24. Lunn M, McNeil D. Applying Cox regression to competing risks. Biometrics 1995; 51:524–532.
25. Abrahamowicz M, MacKenzie TA. Joint estimation of time-dependent and non-linear effects of continuous covariates on
survival. Statistics in Medicine 2007; 26:392–408.
26. Gray RJ. Flexible methods for analyzing survival data using splines, with applications to breast cancer prognosis. Journal
of the American Statistical Association 1992; 87:942–951.
27. Faivre J, Legoux JL, Faivre M, Martin R, Klepping C. A registry of tumors of the digestive tract in the Department of Cote
d’Or (France). Gastroenterologie Clinique et Biologique 1977; 1:983–993.
28. Faivre J, Milan C, Meny B. Risque de recidive loco-regionale apres exerese d’un cancer du rectum. Annales de Chirurgie
1994; 48:520–524.
29. International classification of diseases for oncology, ninth revision. World Health Organization: Geneva, 1975.
30. Cox DR. Regression models and life-tables (with discussion). Journal of the Royal Statistical Society 1972; B34:187–220.
31. InsermSC8. http://sc8.vesinet.inserm.fr:1080.
32. Bender R, Augustin T, Blettner M. Generating survival times to simulate Cox proportional hazards models. Statistics in
Medicine 2005; 24:1713–1723.