
Time-dependent covariates in instrumental variables analysis

Zhe Tian

Department of Epidemiology, Biostatistics and Occupational Health

McGill University
Montréal, Québec
August 2016

A thesis submitted to McGill University in partial fulfillment of the requirements of


the degree of Master of Science in Biostatistics


© Zhe Tian, 2016
Acknowledgment

Firstly, I would like to express my sincere gratitude to my supervisors Prof. Abrahamowicz
and Prof. Rochefort for their continuous support, patience and guidance
in my M.Sc. study and related research.
I would like to thank Dr. Paolo Dell’Oglio, Dr. Sami Leyh-Bannurah and Dr.
Vincent Trudeau from the CHUM as second readers of this thesis.
I would like to thank Dr. Vincent Trudeau from the CHUM and Dr. Marie-Eve
Beauchamp for their help in the French language abstract of this thesis.
I would like to thank Dr. Pierre Karakiewicz and his team from the CHUM for
their input on prostate cancer research as the motivating example for the simulation
studies of this thesis.
I would like to thank Prof. Abrahamowicz, Prof. Rochefort, and Dr. Marie-Eve
Beauchamp for the substantial revision help that they provided for this thesis.

Abstract

Unmeasured confounders are a major concern when analyzing non-randomized data


in comparative effectiveness or safety studies of different treatments. Given the right
conditions, instrumental variables (IV) methods have been proposed as one way of
controlling for such unmeasured confounders in regression analysis. Well-established
literature exists on the validity of two-stage IV methods in the linear and logistic
regression models. In contrast, applications of IV methods in time-to-event analyses
that typically rely on the Cox proportional hazards model (Cox model) are increasingly
frequent in recent clinical and epidemiological studies, although the methodological
literature on the topic is very scarce. Furthermore, it is common in Cox
model applications that, at least for some study subjects, the treatments/exposures
of interest are not delivered at the beginning of the follow-up and, thus, have to be
modelled using time-varying covariates. It is currently unknown how IV methods
perform in time-to-event analyses when the treatment/exposure variable of interest
is time-varying.
In this thesis, I examine the validity of IV methods in the Cox proportional
hazards setting. In Section 2, I review current literature and the development of
IV methods in the linear regression and, subsequently, in the Cox model setting.
In Section 4.2, I use simulation studies to empirically verify the validity of the two
alternative methods for the implementation of IV estimators for the Cox proportional
hazards model with a baseline treatment variable: (a) two-stage residual inclusion
(2SRI) and (b) a method recently proposed by Mackenzie et al. (2014) (MIV).
Finally, in Section 4.3, I examine the possibility of applying these IV methods in the
Cox model setting when the treatment variable is time-varying, under the simplifying
assumption that treatment status changes no more than once during the follow-up.
Results of my simulations indicate that both the 2SRI and MIV estimators can
be effective in reducing the bias due to unmeasured confounders when estimating
average treatment effect in the Cox model setting where treatment does not change
over time (i.e., does not require using a time-varying covariate). Furthermore, when
the treatment variable is quantitative, the 2SRI estimator can eliminate bias, even in
the presence of a moderately strong unmeasured confounder. In contrast, when the
treatment variable is time-varying, even if it only changes once during the follow-up,
the 2SRI estimator may fail completely due to the inability to observe the intended
treatment (e.g., for patients who were assigned to receive surgery at some point
during the follow-up, but died before the time of surgery). In this time-varying
setting, when the 2SRI estimator yields excessively biased results, the MIV estimator
can be used to reduce, but not eliminate, the estimation bias of the average treatment
effect.

Abrégé

Les facteurs de confusion non mesurés sont un problème majeur lors de l’analyse
de données non randomisées des études comparatives de l’efficacité ou la sécurité de
différents traitements. Sous les bonnes conditions, les méthodes du type variables
instrumentales (VI) ont été proposées comme un moyen de contrôler cette confusion
non mesurée dans l’analyse de régression. Actuellement, il existe de la littérature
bien établie sur la validité de la méthode de VI en deux étapes pour les modèles
de régression linéaire et logistique. Par contre, bien que la méthode de VI dans les
modèles de régression de Cox soit utilisée de plus en plus fréquemment dans les études
cliniques et épidémiologiques récentes, la littérature méthodologique sur le sujet est
très rare. De plus, un phénomène fréquent dans les modèles de régression Cox est
que les traitements/expositions ne débutent pas au début du suivi et doivent ainsi
être modélisés avec des variables qui varient dans le temps. Actuellement, nous
ne savons pas comment les méthodes de type VI performent lorsque la variable de
traitement/exposition est une variable qui varie dans le temps.
Dans cette étude, j’examine la validité des méthodes du type VI dans le cadre du
modèle de régression Cox lorsque la variable de traitement/exposition varie dans le
temps. Dans la Section 2, j’analyse la documentation existante sur les méthodes du
type VI dans les modèles de régression linéaires, et ensuite, dans le cadre du modèle
de Cox. Puis, dans la Section 4.2, j’utilise des études de simulation afin de
vérifier empiriquement la validité des deux méthodes alternatives d’estimateur VI
pour le modèle des risques proportionnels de Cox avec une variable de traitement
qui ne varie pas dans le temps: (a) la méthode d’inclusion des résidus en deux
étapes (2SRI) et (b) une méthode récemment proposée par Mackenzie et al. (2014)
que j’appellerai la méthode des variables instrumentales de Mackenzie (MIV). Enfin,
dans la Section 4.3, j’examine la possibilité d’appliquer ces méthodes VI dans le
cadre du modèle de Cox lorsque la variable de traitement varie dans le temps, sous
l’hypothèse simplificatrice que le statut de traitement change au plus une fois au
cours du suivi.
Les résultats de mes simulations ont démontré que les deux estimateurs VI, 2SRI
et MIV, peuvent être utilisés pour réduire le biais d’estimation de l’effet de traitement
moyen, causé par les facteurs de confusion non mesurés, dans le cadre du modèle de
régression Cox sans variable qui varie dans le temps. De plus, lorsque la variable de
traitement est de nature quantitative, l’estimateur 2SRI peut éliminer complètement
le biais même en présence d’un facteur de confusion non mesuré modérément fort. Par
contre, dans un cadre plus complexe où la variable de traitement varie dans le temps,
même si elle ne change qu’une fois au cours du suivi, je constate que l’estimateur 2SRI
peut échouer complètement en raison de l’incapacité d’observer le traitement prévu
(par exemple, un patient assigné à subir une chirurgie, mais qui décède avant de subir
sa chirurgie). Dans ce cadre de traitement variable dans le temps, l’estimateur MIV
peut être utilisé pour réduire, mais sans éliminer complètement, le biais d’estimation
de l’effet moyen de traitement.

Contents

Acknowledgment I

Abstract II

Abrégé IV

1 Introduction and background 1

2 Literature Review 9
2.1 Linear setting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Cox model setting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.3 Time-varying covariates . . . . . . . . . . . . . . . . . . . . . . . . . 17

3 Objective 19

4 Simulation Studies 20
4.1 Simulation design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
4.2 Cox model setting without time-dependent covariate . . . . . . . . . 30
4.2.1 The 2SRI Estimator . . . . . . . . . . . . . . . . . . . . . . . 30
4.2.2 The MIV estimator . . . . . . . . . . . . . . . . . . . . . . . . 35
4.3 Cox model setting with a time-dependent treatment variable . . . . . 39
4.3.1 The 2SRI estimator . . . . . . . . . . . . . . . . . . . . . . . . 39
4.3.2 The MIV estimator . . . . . . . . . . . . . . . . . . . . . . . . 46
4.4 Performance comparison . . . . . . . . . . . . . . . . . . . . . . . . . 52

5 Discussion 57

References 64

List of Figures

1 DAG representing a confounding situation with an instrumental variable 10

List of Tables

1 Simulation results for applying the naive Cox model to scenarios with
a time-invariant and quantitative treatment variable (datasets B1 ).
Each estimate is calculated over 1000 replicate, random datasets with
identical α, β, θ. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2 Simulation results for applying the naive Cox model to scenarios with
a time-invariant and binary treatment variable (datasets B2 ). Each
estimate is calculated over 1000 replicate, random datasets with iden-
tical α, β, θ. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3 Simulation results for applying the naive Cox model to scenarios with a
time-varying and quantitative treatment variable (datasets B3 ). Each
estimate is calculated over 1000 replicate, random datasets with iden-
tical α, β, θ. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4 Simulation results for applying the naive Cox model to scenarios with
a time-varying and binary treatment variable (datasets B4 ). Each esti-
mate is calculated over 1000 replicate, random datasets with identical
α, β, θ. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
5 Simulation results for applying the 2SRI estimator to scenarios with
a time-invariant and quantitative treatment variable (datasets B1 ).
Each estimate is calculated over 1000 replicate, random datasets with
identical α, β, θ. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

6 Simulation results for applying the 2SRI estimator to scenarios with a
time-invariant and binary treatment variable (datasets B2 ). Each esti-
mate is calculated over 1000 replicate, random datasets with identical
α, β, θ. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
7 Simulation results for applying the MIV estimator to scenarios with
a time-invariant and quantitative treatment variable (datasets B1 ).
Each estimate is calculated over 1000 replicate, random datasets with
identical α, β, θ. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
8 Simulation results for applying the MIV estimator to scenarios with a
time-invariant and binary treatment variable (datasets B2 ). Each esti-
mate is calculated over 1000 replicate, random datasets with identical
α, β, θ. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
9 Simulation results when applying the 2SRI estimator that uses the
intended treatment as the outcome in the first-stage regression for
scenarios with a time-varying and quantitative treatment variable
(datasets B3 ). Each estimate is calculated over 1000 replicate, random
datasets with identical α, β, θ. . . . . . . . . . . . . . . . . . . . . . . 41
10 Simulation results for applying the 2SRI estimator that uses the ob-
served treatment as the outcome of the first-stage regression to scenar-
ios with a time-varying and quantitative treatment variable (datasets
B3 ). Each estimate is calculated over 1000 replicate, random datasets
with identical α, β, θ. . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

11 Simulation results for applying the 2SRI estimator that uses the in-
tended treatment as the outcome of the first-stage regression to sce-
narios with a time-varying and binary treatment variable (datasets
B4 ). Each estimate is calculated over 1000 replicate, random datasets
with identical α, β, θ. . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
12 Simulation results for applying the 2SRI estimator that uses the ob-
served treatment as the outcome of the first-stage regression to sce-
narios with a time-varying and binary treatment variable (datasets
B4 ). Each estimate is calculated over 1000 replicate, random datasets
with identical α, β, θ. . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
13 Simulation results for applying the MIV estimator to scenarios with a
time-varying and quantitative treatment variable (datasets B3 ). Each
estimate is calculated over 1000 replicate, random datasets with iden-
tical α, β, θ. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
14 Simulation results for applying the MIV estimator to scenarios with a
time-varying and binary treatment variable (datasets B4 ). Each esti-
mate is calculated over 1000 replicate, random datasets with identical
α, β, θ. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
15 Empirical SD and RMSE for β from IV methods and the naive Cox
model applied to scenarios with a time-varying and quantitative treat-
ment. Each estimate is calculated over 1000 replicate, random datasets
with identical α, β, θ. . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

16 Empirical SD and RMSE for β from IV methods and the naive Cox
model applied to scenarios with a time-varying and binary treatment.
Each estimate is calculated over 1000 replicate, random datasets with
identical α, β, θ. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

1 Introduction and background

The goal of many epidemiological studies is to estimate the average effect of a treat-
ment/exposure (e.g., surgery) on an outcome (e.g., mortality) in a population. Ide-
ally, such an effect should be estimated via a well-designed randomized controlled trial
(RCT) (Jüni, Altman, & Egger, 2001). Given an ideal RCT, where there is full
compliance, no dropouts, and successful blinding of study subjects and study evalu-
ators when required, the difference in outcome measured between different treatment
groups can be interpreted as the causal effect of the treatment on the outcome (Bhatt,
2011; Jadad et al., 1996).
While RCTs are considered the gold standard in comparative effectiveness stud-
ies, one major drawback of RCTs is their staggering costs (Sertkaya, Wong, Jessup,
& Beleche, 2016; Emanuel, Schnipper, Kamin, Levinson, & Lichter, 2003); as of 2015,
a small phase I trial usually costs several million dollars, and a phase III trial can
cost up to several times more. Another major drawback involves concerns about the
external validity of RCTs (Rothwell, 2005), which are often carried out in settings
and/or populations that may differ significantly from those where the treatment
being evaluated is applied in routine clinical practice. Lastly, perhaps the most com-
pelling drawback is that it can take many years to follow newly recruited subjects
in a RCT before a sufficient number of clinical events of interest are accumulated,
thus making the study of slow progressing diseases and long-term outcomes imprac-
tical (e.g., in prognostic studies of newly diagnosed prostate cancer patients, about
15 years of follow-up are necessary to observe most of the disease-related mortality)
(Albertsen, Hanley, & Fine, 2005; Eggener et al., 2010). Given all these drawbacks,
observational cohort studies, which typically use large administrative databases, are
needed to supplement RCT results, especially where the latter are too expensive, or
impractical (Black, 1996; Concato, Shah, & Horwitz, 2000; Jadad et al., 1996).
In a cohort setting without randomization, the major concern about the internal
validity of the study is related to the problem of unmeasured confounders (Greenland,
Robins, & Pearl, 1999; Avorn, 2007). For example, an administrative database
study that examines the effect of surgery versus standard care (with no intervention)
on all-cause mortality in the elderly is unlikely to have data on the subjects’
overall health. Yet, not being able to control for such a powerful predictor of the
outcome may lead to a serious over-estimation of the true benefits of surgery if it is
generally given to healthier subjects. In other words, a failure to account for the over-
all health status will result in model misspecification, because patients with better
overall health experience both less mortality and a higher likelihood of undergoing
surgery (Austin, 2011). Overall health is very hard to measure, and many of its
components will likely remain unmeasured, especially in large databases of routinely
collected data. Thus, it will likely be impossible to know if the estimated treatment
effect of surgery represents a true ‘causal’ effect. While there is no general solution
to this fundamental problem (McMahon, 2003; Austin, 2011), some approaches have
been suggested in the methodological literature.
One ‘generic’ approach to unmeasured confounders involves bias sensitivity anal-
yses (McCandless, Gustafson, & Levy, 2007). This involves performing repeated
analyses by introducing artificially simulated confounders specified a priori, based
on clinical knowledge and literature, that are associated with (a) the observed
exposure or treatment, and (b) the outcome. By doing this, researchers can evaluate
how robust their results are to various patterns and strengths of potential unmea-
sured confounders (McCandless et al., 2007; Groenwold, Nelson, Nichol, Hoes, &
Hak, 2010).
An alternative approach to sensitivity analysis is to use instrumental variables
(IV) analysis that, under the right conditions, can produce consistent estimators
of average treatment effect. Instrumental variables analysis was first proposed and
developed in 1928 by Philip G. Wright to address the endogeneity of regressors in
statistical models in econometrics. More recently, IV analysis has received increasing
attention in epidemiology, where it is most often employed to account for unmea-
sured confounders in observational studies of potential associations between different
treatment modalities and different clinical outcomes (Terza, Basu, & Rathouz, 2008;
Greenland, 2000). To date, no consensus exists on how to employ IV methods in
the context of a Cox proportional hazards model (Cox model) analysis of right-
censored time-to-event data collected in prospective or retrospective cohort studies.
Moreover, several recently published papers in statistical methodology have noted
potential barriers that may make the use of IV methods unsuitable in epidemiological
studies (Vickers & Sjoberg, 2015; Terza, Bradford, & Dismuke, 2008). Nevertheless,
with no alternatives available, IV methods continue to be used extensively in
epidemiological studies to correct for unmeasured confounders (Gore et al., 2010;
Tan et al., 2012).
One of the greatest strengths of the Cox proportional hazards model is its ability
to model the effects of covariates that vary during the follow-up (Fisher & Lin,
1999). In most epidemiological cohort studies, variables for a single subject may
change over the course of the follow-up (e.g., weight, blood pressure, disease dura-
tion, time since transplant, current or cumulative drug dosage, etc.). By using only
baseline values of such time-varying variables, researchers fail to exploit the full in-
formation contained in repeated measurements that, if used correctly, often greatly
improves the predictive ability of the model and its goodness of fit (Karp, Abra-
hamowicz, Bartlett, & Pilote, 2004; Abrahamowicz, Beauchamp, & Sylvestre, 2012).
In addition, by modelling the value of an inherently time-varying variable measured
at some time during the follow-up as a fixed-in-time ‘baseline’ variable, researchers
may introduce numerous forms of bias and model misspecification in their analysis,
such as immortal time bias (Suissa, 2008; Zhou, Rahme, Abrahamowicz, & Pilote,
2005). To avoid such biases, it is essential to use time-dependent covariates in the
Cox proportional hazards model that allow us to capture the changes in the values
of these variables over time. Indeed, time-dependent covariates are ubiquitous in
current epidemiological research for modelling the effects of time-varying treatments
(Moura et al., 2015; Abrahamowicz, Bartlett, Tamblyn, & Berger, 2006).
While some recent efforts have been devoted to examining the use of IV meth-
ods in the Cox model setting, no attention has been given to how the relevant IV
methods may behave when employed in the more complex (but clinically important)
case in which treatment or exposure varies over time and thus, has to be modelled
with a time-varying variable. In this thesis, I first use simulations to evaluate the
performance of IV-based estimators in the Cox proportional hazards model setting
without a time-varying treatment variable. Then, I extend these methods to settings
where the treatment variable is time-varying.
This thesis assumes the reader has an understanding of the Cox proportional
hazards model and the fundamental risk set paradigm introduced by Sir David Cox
in his seminal 1972 paper (Cox, 1972).
Below are the notations used in this thesis:

• i ∈ {1, . . . , N } is a subscript that denotes the individual patient in a dataset.

• Wi is the instrumental variable (IV). Throughout the thesis I assume, consis-


tent with most IV applications in pharmaco-epidemiology (Brookhart, Wang,
Solomon, & Schneeweiss, 2006), that the IV is binary (e.g., a binary indica-
tor of the treatment received by a previous patient of the same physician)
(Ionescu-Ittu, Abrahamowicz, & Pilote, 2012).

• Ui is a (time-invariant) unmeasured confounder, which is assumed to be quanti-


tative (e.g., disease severity score). Furthermore, this variable is assumed to be
(a) associated with both the treatment and the outcome, and (b) unrecorded
in the study database.

• τi is the (possibly counterfactual, if the subject had an event or was censored


before τi ) time of treatment initiation when the treatment is time-varying, for
subject i, and corresponds to time elapsed between the subject’s entry into the
cohort (or start of his/her follow-up), i.e. ‘time 0’ and the date of treatment
initiation by the same subject. This variable is quantitative, and its values are
strictly non-negative.

• Xi is the observed baseline treatment variable, i.e., the value observed for
subject i at time 0 (start of his/her follow-up). This variable can be either
binary (e.g., surgery) or quantitative (e.g., drug dose).

• Xi (t) is the observed treatment status or intensity at time t. This variable is


time-varying and can be either binary (e.g., surgery) or quantitative (e.g., drug
dose). In my thesis, this variable is assumed to change its value not more than
once over the course of the follow-up, with change for subject i occurring at
t = τi , when the subject’s status will switch, e.g., from unexposed (untreated)
to exposed (treated).

• Yi is the outcome variable. This variable is continuous (e.g., blood pressure)


in the linear regression setting; in the Cox model setting, it is binary and
represents the subject’s survival status at the end of his/her follow-up, with 0
for censoring and 1 for the event of interest (e.g., death).

• φi is the assigned (but not necessarily received) treatment status (binary) or


treatment intensity (quantitative, e.g., dose) the patient is expected to receive
at time τi . Whether this variable is observable is very important for Section
4.3.

• Ti is the follow-up (‘survival’) time for patient i, corresponding to either (a)


event time (if Yi = 1) or (b) censoring time (if Yi = 0). By definition, this
variable is continuous and strictly non-negative.

• β is the effect of the treatment on the outcome, i.e., the slope (expected change
in Y associated with a unit increase in X in the linear regression setting) and
the log hazard ratio of treated versus untreated individuals in the Cox model
setting.

• α is the effect of the instrument W on the log odds of receiving a binary


treatment (X = 1 or X(t) = 1), or the expected change (slope) in the value of
treatment intensity (e.g., dose), associated with W = 1 relative to W = 0.

• θ is the effect of the unmeasured confounder U on the outcome. This variable


is a slope in the linear regression setting; in the Cox model setting, it is a log
hazard ratio for a one-unit increase in U .

• exp(x) is the natural exponential function e^x .

• expit(x) is the function e^x /(1 + e^x ).

• ε is the random error variable in the linear regression setting. It follows a


normal distribution, with mean 0.

• v is a random variable generated from a uniform U (0, 1) distribution for the


purpose of generating simulated times to event (see Section 4.1).

• λ0 (t) is the underlying baseline hazard function (for untreated subjects, with
X = 0 or X(t) = 0). For simplicity’s sake, in this thesis, this function is
assumed to be constant, implying an exponential distribution of event times,
and equal to 0.05 for all simulation studies.

• I[x] is an indicator function which equals 1 if the logical statement x is true


or 0 if the logical statement x is false.

• R(Tk ) is the set of observations still at risk at time Tk , i.e., the subset of those
members of the cohort who have not yet had the event and have not been censored before time Tk .
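To make this notation concrete, the following sketch generates survival data consistent with the definitions above. This is my own illustration with arbitrary parameter values and an arbitrary censoring distribution; the thesis's actual simulation design is specified in Section 4.1. With the constant baseline hazard λ0 = 0.05, the time to event given the covariates is exponentially distributed, so it can be drawn by inverse-transform sampling from v ∼ U (0, 1):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10_000
alpha, beta, theta = 1.0, 0.5, 0.5   # illustrative values only
lam0 = 0.05                          # constant baseline hazard lambda_0(t)

W = rng.binomial(1, 0.5, N)          # binary instrument W_i
U = rng.normal(0.0, 1.0, N)          # unmeasured quantitative confounder U_i
X = alpha * W + U + rng.normal(0.0, 1.0, N)  # quantitative baseline treatment X_i

# With hazard lam0 * exp(beta*X + theta*U) constant over time, the event time
# is exponential; inverse-transform sampling from v ~ U(0, 1):
v = rng.uniform(0.0, 1.0, N)
T_event = -np.log(v) / (lam0 * np.exp(beta * X + theta * U))

C = rng.uniform(0.0, 40.0, N)        # hypothetical uniform censoring times
Y = (T_event <= C).astype(int)       # outcome Y_i: 1 = event, 0 = censored
T = np.minimum(T_event, C)           # follow-up time T_i
```

Here T and Y follow the definitions of Ti and Yi above; the uniform censoring on (0, 40) is an arbitrary choice made only for this sketch.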

2 Literature Review

The Literature Review is divided into three sections. The first section focuses on
the development of two-stage instrumental variables methods in the linear regres-
sion setting. The second section provides an overview of current developments and
existing documentation on instrumental variables (IV) analysis in the Cox model
setting. The third section is a brief overview of time-varying covariates in the Cox
model setting.

2.1 Linear setting

The most common forms of instrumental variables (IV) methods in epidemiological


research are the two-stage IV methods, two-stage predictor substitution (2SPS) and
two-stage residual inclusion (2SRI). The existing methodological literature on two-
stage IV methods in the linear setting is extensive. Some of the most cited recent
publications on two-stage IV methods in epidemiology are Greenland (2000) and
Terza, Basu, and Rathouz (2008). Furthermore, publications such as Hernan and
Robins (2006), Stukel et al. (2007), and Thanassoulis and O’Donnell (2009) have
all shown many applications of two-stage IV methods in epidemiological research. I
describe the two-stage methods in this section based on these papers.
The typical situation where two-stage IV methods are applicable is described
by the directed acyclic graph (DAG) presented in Figure 1. Here, the effect of
the treatment variable, X, on the outcome, Y , is confounded by the unmeasured
confounder, U , and the variable W is an instrumental variable for this situation. For

[Figure: DAG with arrows U → X, U → Y , and W → X → Y ]

Figure 1: DAG representing a confounding situation with an instrumental variable

the variable W to be an instrumental variable for this confounding situation, it must


satisfy the following three conditions (Greenland, 2000):

1. It must be correlated with the treatment variable X.

2. It can only affect the outcome Y through the treatment variable X.

3. It must itself be independent of the unmeasured confounder U .

An excellent example of an IV is the presence of well-trained specialists for a spe-


cific treatment when comparing the effect of the said treatment versus no treatment
(Tan et al., 2012). Another good example is the prescription patterns of physicians
when comparing the effect of a drug (Chen & Briesacher, 2011).
In this situation, if I estimate the average treatment effect of X on Y using
linear regression, Equation (1), the resulting β̂ will be biased because the unmeasured
confounder U is not adjusted for. However, a consistent estimator is possible in this
situation because the instrumental variable W is available.

Y = βX (1)

Based on Figure 1, in the linear setting, the “real” relationship between the variables
X, Y , U and W can be described by Equations (2) and (3).

X = αW + U + 1 (2)

Y = βX + θU + 2 (3)

There are two different ways to make use of instrumental variable W to produce
a consistent estimator of the average treatment effect. Both ways will start by
fitting a linear regression model with the treatment variable X as outcome and the
instrumental variable W as a predictor.

X = αW (4)

From this first stage regression, an estimator of X is obtained, Equation (5), and an
estimator of U is also obtained, Equation (6).

X̂ = α̂W (5)

Û = X − α̂W (6)

It is important to note here that Û , from Equation (6), is a consistent estimator of


U because of the independence assumption between W and U .
The first two-stage IV method, 2SPS, is constructed by using X̂ instead of X in
the second stage regression described by Equation (7).

Y = β X̂ (7)

The second way, 2SRI, is constructed by including Û as a predictor in the second


stage regression described by Equation (8).

Y = βX + θÛ (8)
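Both two-stage estimators can be checked numerically. The sketch below is my own illustration (the parameter values and the use of numpy least squares are arbitrary choices, not the thesis's code): it generates data from Equations (2) and (3), then fits the naive regression, 2SPS via Equations (4), (5) and (7), and 2SRI via Equations (6) and (8):

```python
import numpy as np

rng = np.random.default_rng(1)
n, alpha, beta, theta = 200_000, 1.0, 2.0, 3.0   # illustrative values only

W = rng.normal(size=n)                     # instrument, independent of U
U = rng.normal(size=n)                     # unmeasured confounder
X = alpha * W + U + rng.normal(size=n)     # Eq. (2)
Y = beta * X + theta * U + rng.normal(size=n)  # Eq. (3)

def ols(Z, y):
    """Least-squares coefficients for design matrix Z (no intercept needed:
    all variables are mean zero by construction)."""
    return np.linalg.lstsq(Z, y, rcond=None)[0]

# Naive regression of Y on X: biased because U is omitted.
b_naive = ols(X[:, None], Y)[0]

# First stage, Eq. (4): regress X on W.
a_hat = ols(W[:, None], X)[0]
X_hat = a_hat * W            # Eq. (5)
U_hat = X - a_hat * W        # Eq. (6)

b_2sps = ols(X_hat[:, None], Y)[0]               # Eq. (7)
b_2sri = ols(np.column_stack([X, U_hat]), Y)[0]  # Eq. (8), first coefficient
```

With a large sample, the naive slope is pulled toward β + θ Cov(X, U )/Var(X) (about 3 here, while β = 2), whereas both IV estimators center on the true β.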

To see why the 2SPS estimator works in the linear setting, I will derive the 2SPS
estimator of β from Equation (7). Substituting X̂ = α̂W gives:

Y = β2SPS X̂ = β2SPS (α̂W )

Thus, the resulting least-squares estimator is equal to:

β̂2SPS = ((α̂W )′(α̂W ))⁻¹(α̂W )′Y

β̂2SPS = α̂⁻¹(W ′X)⁻¹(α̂W )′Y

β̂2SPS = (W ′X)⁻¹W ′Y

where the second line uses the identity (α̂W )′(α̂W ) = α̂ W ′X, which follows from
α̂ = (W ′W )⁻¹W ′X. Since the true model of Y is given by:

Y = βX + θU + ε2

β̂2SPS = (W ′X)⁻¹W ′(βX + θU + ε2 )

β̂2SPS = (W ′X)⁻¹W ′(βX) + (W ′X)⁻¹W ′(θU ) + (W ′X)⁻¹W ′(ε2 )

β̂2SPS = β + θ(W ′X)⁻¹W ′U + (W ′X)⁻¹W ′ε2    (9)

First, note that W ′ε2 converges to 0, since W is independent of ε2 . It is then clear
that if W ′U = 0 and W ′X ≠ 0, then β̂2SPS is a consistent estimator of β.
To see why the 2SRI estimator works in the linear setting, I consider the second
stage regression described by Equation (8) and the observation that Û from Equation
(6) is a consistent estimator of U . The resulting estimator of β from Equation (8)
would be computed by:

[β̂2SRI , θ̂] = ([X, Û ]′[X, Û ])⁻¹[X, Û ]′Y    (10)

Since Û is a consistent estimator of U , the resulting estimator of this step, β̂2SRI ,
is a consistent estimator of β, as the right hand side of Equation (10) asymptotically
converges to the expression that would be obtained with the true U :

([X, Û ]′[X, Û ])⁻¹[X, Û ]′Y → ([X, U ]′[X, U ])⁻¹[X, U ]′Y

The asymptotic elimination of bias of the 2SPS and 2SRI estimators comes at the
cost of having higher variance. Simulation studies in the methodological literature
(Greenland, 2000; Terza, Basu, & Rathouz, 2008) suggest that there is an inversely
proportional relationship between the variance of the IV estimators and the strength
of the correlation between the IV and the treatment variable. Therefore, using an IV
weakly associated with the treatment will result in a poor estimator due to a higher
variance. In practice, Terza, Basu, and Rathouz (2008) and Newey and McFadden
(1986) suggest using either bootstrapping or the sandwich estimator (Freedman,
2006) to estimate the variance of the two-stage IV estimators.
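As a sketch of the bootstrap option (my own illustration; the function names, parameter values, and number of replicates are invented for the example), the key point is that each bootstrap replicate re-runs both stages, so the uncertainty from estimating α̂ in the first stage propagates into the reported variance of β̂2SRI:

```python
import numpy as np

def two_stage_2sri(W, X, Y):
    """Run both 2SRI stages on one dataset (linear setting); return beta-hat."""
    a_hat = np.linalg.lstsq(W[:, None], X, rcond=None)[0][0]  # first stage
    U_hat = X - a_hat * W                                     # residuals, Eq. (6)
    Z = np.column_stack([X, U_hat])                           # second stage, Eq. (8)
    return np.linalg.lstsq(Z, Y, rcond=None)[0][0]

def bootstrap_se(W, X, Y, B=200, seed=0):
    """Nonparametric bootstrap SE: resample subjects, redo BOTH stages each time."""
    rng = np.random.default_rng(seed)
    n = len(Y)
    reps = []
    for _ in range(B):
        idx = rng.integers(0, n, n)        # one resample of subjects, reused for W, X, Y
        reps.append(two_stage_2sri(W[idx], X[idx], Y[idx]))
    return float(np.std(reps, ddof=1))

# Demo on data simulated from Equations (2) and (3):
rng = np.random.default_rng(2)
n = 5_000
W = rng.normal(size=n)
U = rng.normal(size=n)
X = W + U + rng.normal(size=n)               # alpha = 1
Y = 2.0 * X + 3.0 * U + rng.normal(size=n)   # beta = 2, theta = 3
se = bootstrap_se(W, X, Y)
```

Resampling whole subjects (rows), rather than residuals, is what lets the first-stage estimation error enter the variance estimate.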
As a cautionary note, according to Hernan and Robins (2006) and Greenland
(2000), it may be difficult to verify empirically the three conditions an instrumental
variable must satisfy. Thus, the existence and the identification of instrumental
variables can themselves be difficult tasks.

2.2 Cox model setting

Current methodological literature is lacking for IV methods in the Cox model setting:
a careful survey yielded no results on how to apply IV methods in this setting,
or on their analytical validity. Several clinical
and epidemiological studies (Tan et al., 2012; Lu-Yao, Albertsen, Moore, & al, 2008;
Kuo, Montie, & Shahinian, 2012) have cited Terza, Basu, and Rathouz (2008) as
the methodological basis for using the 2SRI estimator in the Cox model setting.
However, Terza, Basu, and Rathouz (2008) presented no analytical or empirical
evidence pertinent to the use of 2SRI estimator in the Cox model setting.
Based on the 2SRI estimator described by clinical and epidemiological studies
(Tan et al., 2012; Lu-Yao et al., 2008; Kuo et al., 2012), the confounding situation
with an instrumental variable presented in Figure 1 in the Cox model setting can be

described by Equations (11) and (12).

X = αW + U (11)

λ(t) = λ₀(t) exp(βX + θU )    (12)

The first stage regression of the 2SRI estimator in the Cox model setting is thus
given by Equation (13). Then, from the residuals of this first stage regression,
Equation (14), a consistent estimator of the unmeasured confounder U is obtained.

X = αW (13)

Û = X − α̂W (14)

Using Û from Equation (14) as a covariate alongside the treatment variable X, the
partial likelihood of the second stage Cox regression, given by Equation (15), is
then maximized with respect to β and θ. The resulting β̂ from the maximized partial
likelihood is the 2SRI estimator of the average treatment effect in the Cox model
setting. Surprisingly, I was unable to find any methodological literature offering
empirical or analytical evidence that the 2SRI estimator is a consistent estimator
of the average treatment effect in the Cox model setting. This gap in evidence will
be addressed in Section 4.2.1 of this thesis.

∏_{i=1}^{n} [ exp(βXi + θÛi) / Σ_{k∈R(Ti)} exp(βXk + θÛk) ]^{Yi}    (15)
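The second-stage maximization of Equation (15) can be sketched directly, without a survival library, by Newton-Raphson on the log partial likelihood (Breslow form, assuming no tied event times). This is an illustrative implementation, not the thesis code; numpy, the simulated quantities, and the absence of censoring are assumptions of the sketch.

```python
import numpy as np

def cox_newton(T, E, Z, n_iter=20):
    """Maximize the Cox log partial likelihood (no ties) by Newton-Raphson."""
    order = np.argsort(T)
    T, E, Z = T[order], E[order], Z[order]
    n, p = Z.shape
    b = np.zeros(p)
    for _ in range(n_iter):
        w = np.exp(Z @ b)
        grad = np.zeros(p)
        hess = np.zeros((p, p))
        for i in range(n):
            if not E[i]:
                continue
            wi, Zi = w[i:], Z[i:]         # risk set: subjects at risk at T[i]
            s0 = wi.sum()
            s1 = wi @ Zi
            grad += Z[i] - s1 / s0
            hess -= Zi.T @ (wi[:, None] * Zi) / s0 - np.outer(s1, s1) / s0**2
        b = b - np.linalg.solve(hess, grad)   # Newton step toward the maximum
    return b

rng = np.random.default_rng(2)
n, alpha, beta, theta, lam = 400, 1.0, np.log(2.0), np.log(2.0), 0.01
U = rng.normal(size=n)
W = rng.binomial(1, 0.5, size=n)
X = alpha * W + U                                 # quantitative, time-invariant
T = rng.exponential(1.0 / (lam * np.exp(beta * X + theta * U)))
E = np.ones(n, dtype=bool)                        # no censoring in this sketch

# Stage 1 residuals, then stage 2 Cox with covariates (X, U_hat), per Equation (15)
W1 = np.column_stack([np.ones(n), W])
U_hat = X - W1 @ np.linalg.lstsq(W1, X, rcond=None)[0]
beta_hat, theta_hat = cox_newton(T, E, np.column_stack([X, U_hat]))
```

With a quantitative treatment, `beta_hat` should land near the true log hazard ratio ln(2), anticipating the empirical results of Section 4.2.1.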

MacKenzie, Tosteson, Morden, Stukel, and O’Malley (2014) proposed a new IV
estimator for the Cox model setting, which I will refer to in this thesis as
MacKenzie’s instrumental variables (MIV) estimator. However, this newly proposed
IV estimator is consistent only if the effect of the unmeasured confounder is
additive, as described by Equation (16), where h(U, t) satisfies the assumption
given in Equation (17).

λ(t) = λ₀(t)(exp(βX) + h(U, t))    (16)

E[h(U, t)|T (X) ≥ t] = 0    (17)

In Equation (17), T (X) is the observed survival time given the observed treatment
X. MacKenzie et al. (2014) stated that this assumption is critical; if it is
violated, the hazard function no longer gives a proportional hazards model with
hazard ratio exp(β). A proposed specification for h(U, t) is θU + MGF_U(−θ), where
θ is the strength of the additive confounder and MGF_U is the moment generating
function of U . However, MacKenzie et al. (2014) provided no relatable real-life
example of this in their paper, and I am also unable to relate this type of
confounding to a real-life example.
The proposed estimator of MacKenzie et al. (2014) is implemented by setting the
function in Equation (18) to zero and solving for β. The resulting β̂ is the MIV
estimator of β.

g(β) = Σ_{i=1}^{n} Yi [ Wi − Σ_{j∈R(Ti)} Wj exp(βXj) / Σ_{j∈R(Ti)} exp(βXj) ]    (18)
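Since g(β) in Equation (18) is, in typical data, monotone decreasing in β when the instrument is positively associated with the treatment, its root can be located by simple bisection. A sketch under those assumptions (numpy assumed; the demonstration data, generated with no confounder at all, are only there to exercise the code):

```python
import numpy as np

def g(beta, T, E, X, Wv):
    """Estimating function of Equation (18) (Breslow risk sets, no ties)."""
    order = np.argsort(T)
    T, E, X, Wv = T[order], E[order], X[order], Wv[order]
    total = 0.0
    for i in range(len(T)):
        if E[i]:
            w = np.exp(beta * X[i:])            # weights over the risk set R(T_i)
            total += Wv[i] - (w @ Wv[i:]) / w.sum()
    return total

def miv_estimate(T, E, X, Wv, lo=-5.0, hi=5.0, tol=1e-6):
    """Bisection for g(beta) = 0, assuming a sign change on [lo, hi]."""
    glo = g(lo, T, E, X, Wv)
    assert glo * g(hi, T, E, X, Wv) < 0, "no sign change on the bracket"
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid, T, E, X, Wv) * glo > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Demonstration without any confounding, where the root should sit near beta
rng = np.random.default_rng(3)
n, beta_true = 800, np.log(2.0)
Wv = rng.binomial(1, 0.5, size=n)               # instrument
X = Wv + rng.normal(size=n)                     # treatment
T = rng.exponential(1.0 / (0.01 * np.exp(beta_true * X)))
E = np.ones(n, dtype=bool)
beta_miv = miv_estimate(T, E, X, Wv)
```

Note that the estimating equation uses W, not X, in the "observed minus expected" term, which is what makes the estimator robust to the additive confounder.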

MacKenzie et al. (2014) showed via simulations that if the unmeasured confounder
is indeed additive and satisfies the assumptions defined by Equations (16) and (17),
then a consistent estimator of the average treatment effect β is achieved. A major
limitation of this paper is that the authors were unable to relate this type of
additive confounding to a real-life scenario. Instead, they applied the MIV
estimator to the multiplicative confounding setting described by Equations (11)
and (12). The authors showed via simulation studies that the MIV estimator is
inherently biased when the unmeasured confounder is multiplicative, although it
remains less biased than the naive Cox model.

2.3 Time-varying covariates

One of the important strengths of the Cox proportional hazards model is its ability
to model time-dependent covariates (TDC). Ignoring TDC can result in biased
estimation of the average treatment effect. Walraven, Davis, Forster, and Wells
(2004) reviewed the literature and concluded that incorrectly specifying a
time-varying covariate as a baseline variable is one of the most prominent forms
of model misspecification in the published medical and epidemiological literature.
Ignoring TDC can produce several types of bias. In this thesis, I focus mainly on
the situation in which a treatment is delivered some time after the beginning of
the follow-up, resulting in a single change in the treatment variable's value
during the follow-up. In this scenario, if the time-varying treatment variable is
modelled as a baseline variable, the resulting bias is typically referred to as
immortal time bias (Suissa, 2008). The bias arises because observations that
underwent treatment cannot experience the event of interest between the beginning
of the follow-up and the time of treatment delivery, which artificially lowers the
estimated hazard ratio of the treatment effect.
Analytically, I illustrate this bias using the partial likelihood function when the
treatment variable X(t) is time-varying. By treating a time-varying treatment
variable as a baseline variable, some of the observations in the risk set, R(Ti),
will be incorrectly specified as treated when in reality they had not yet undergone
the treatment at time Ti . Assuming the treatment variable changes at most once
during follow-up, each Xi(Ti) in the numerator and Xk(Ti) in the denominator of
Equation (19) is replaced by Xi(∞) and Xk(∞), respectively. Thus, when this
incorrectly specified partial likelihood function is maximized, the resulting
estimator of β will be biased (Beyersmann, Wolkewitz, & Schumacher, 2008; Suissa,
2008).

∏_{i=1}^{n} [ exp(βXi(Ti)) / Σ_{k∈R(Ti)} exp(βXk(Ti)) ]^{Yi}    (19)
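In practice, immortal time bias is avoided by coding the exposure in counting-process, i.e. (start, stop], form: each treated subject's follow-up is split at the treatment time, so the pre-treatment person-time is correctly classified as untreated. A minimal pure-Python sketch (the row layout and function name are illustrative, not a standard API):

```python
def to_counting_process(T, Y, tau, phi):
    """Split follow-up at the treatment time tau (at most one change).

    Returns rows (id, start, stop, x, y): x is the treatment status over
    (start, stop], and y flags the event at `stop`. Subjects whose event or
    censoring precedes tau, or who were never assigned treatment, contribute
    a single untreated row, so their pre-treatment time is not misclassified.
    """
    rows = []
    for i, (t, y, tau_i, phi_i) in enumerate(zip(T, Y, tau, phi)):
        if phi_i and tau_i < t:
            rows.append((i, 0.0, tau_i, 0, 0))   # untreated person-time
            rows.append((i, tau_i, t, 1, y))     # treated person-time
        else:
            rows.append((i, 0.0, t, 0, y))
    return rows

# Subject 0 is treated at t=6 and dies at t=10; subject 1 dies at t=4,
# before the planned treatment at t=8, so it stays untreated throughout.
rows = to_counting_process(T=[10.0, 4.0], Y=[1, 1], tau=[6.0, 8.0], phi=[1, 1])
```

Fitting a Cox model on such rows reproduces the time-varying partial likelihood, whereas coding X at baseline would wrongly mark subject 0 as treated from time 0.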

No data are currently available on how IV methods perform when the treatment
variable is time-varying. A careful survey of the methodological literature in
JSTOR and Web of Science, using the Boolean keyword search “Instrumental AND
(Time-dependent OR (Time dependent) OR Time-varying OR (Time varying))”, returned
no hits. Thus, the topic of applying IV methods in the context of time-to-event
data with time-varying covariates, which is the main topic of this thesis, has
apparently not been considered previously.

3 Objective

Despite the availability of IV methods for the linear model setting, the literature
describing how to apply instrumental variables analysis in the Cox model setting is
scarce. Moreover, a careful survey of the current literature, as of May 2016,
reveals that not a single methodological publication addresses how to apply an IV
estimator in the Cox model setting with a time-varying treatment variable. This
absence of methodological guidance provides the rationale for the objective of this
thesis.
The objective of this thesis is to assess the performance and the validity of IV
analysis to control for unmeasured confounders in comparative effectiveness studies
using a Cox proportional hazards model, particularly when the treatment variable
is time-varying and potentially changes (no more than) once during the follow-up.
I first describe a real-life scenario that motivates the work of this thesis and
serves as the basis for its simulation studies. Then, I empirically verify the
validity of the 2SRI and the MIV estimators in the Cox model setting without
time-dependent covariates via simulation studies. Finally, I examine the
formulation of the 2SRI and MIV estimators in the Cox model setting when the
treatment variable is time-varying.

4 Simulation Studies

4.1 Simulation design

I rely heavily on simulation studies to empirically verify results and hypotheses made
in this thesis. To enhance both the clinical relevance and plausibility of the
simulations, and to increase the interpretability of their results, I have designed
the simulation studies of this thesis based on a real-life scenario adapted from prostate cancer
research. In studies that examine the survival of very elderly patients diagnosed with
a clinically localized prostate cancer, it is often important to compare the mortality of
those who received different types of treatment (e.g., surgery or hormone injections)
versus those who are untreated (Sun et al., 2014; Abdollah et al., 2012). The ratio-
nale is that some treatments may cause serious side effects that may lead to death,
so it may be better not to treat these patients at all (Abdollah et al., 2012; Wilt
et al., 2012). On the other hand, several studies on prostate cancer treatments that
used large administrative databases have suggested that treatment typically leads
to longer survival. However, many clinicians have cautioned that patients who are
not given any treatment also tend to be less healthy than patients undergoing active
treatment (Wilt et al., 2012; Jeldres et al., 2008). Clinical indicators of health status
are typically not recorded in administrative databases, even though these databases are often used
to assess the safety or effectiveness of various treatments. The unavailability of these
variables may result in unmeasured confounding bias, and lead to an overestimation
of the benefits of treatment (Abdollah et al., 2012; Lu-Yao et al., 2008). For these
reasons, a hypothetical study of the impact of treatment on the mortality of prostate

cancer patients provides an appropriate setting for this thesis, which focuses on the
effects of unmeasured confounding in survival analyses.
Based on this prostate cancer research scenario, the variables generated in the
simulation studies can be interpreted as follows. Yi is a binary indicator, equal to
one if the patient has died or zero if censored at the end of his/her follow-up. The
follow-up variable Ti represents the time to death (if Yi = 1) in months, or censoring
time (if Yi = 0). The binary instrumental variable Wi is the indicator of the avail-
ability of a specialist (well trained to administer the appropriate treatment) within
driving distance of a given prostate cancer patient’s residence. Parameter α
quantifies the effect of having a specialist within driving distance, relative to
not having one: it is the log odds ratio of undergoing treatment if the treatment
is binary, or the difference in treatment dosage if the treatment is quantitative.
The unmeasured confounder Ui represents the centered and standardized
(mean = 0, SD =1) value of the overall health score of the patient, at the start (time
0) of the follow-up, which is not recorded in the study database and, thus, cannot
be adjusted for in the analyses. θ is the log hazard ratio associated with a one unit
(i.e., SD) increase in the value of the standardized health score. τi is the time when
treatment is performed if the simulated patient is assigned to the treatment group.
φi is the assigned/intended treatment of the patient; this is important because some
patients do not survive until the time of treatment. Xi (t), Xi are indicator variables
for whether the patient actually underwent treatment at or before time t (for time-
varying Xi (t)) or at baseline (for time-fixed Xi ), respectively. β is the log hazard
ratio of mortality associated with being treated (at any time in the past), relative
to untreated.
Given a set of α, β and θ values, I generate the survival time Ti of observation i
in a simulated dataset as follows. First, I generate an overall health score Ui
from a normal distribution with mean 0 and standard deviation 1.

Ui ∼ N (0, 1) (20)

Then, I generate the instrumental variable Wi from a Bernoulli distribution with


mean parameter 0.5.
Wi ∼ Bernoulli(0.5) (21)

For a time-dependent treatment, τi is then generated from a uniform U (0, 24)
distribution; if the treatment is not time-dependent, I set τi to 0.

τi ∼ U (0, 24) (22)

or

τi = 0 (23)

The assigned/intended treatment φi is calculated as the sum of αWi and Ui if the


treatment is quantitative (e.g., drug dose) or sampled from a Bernoulli distribution
with mean parameter expit(αWi + Ui ) if the treatment is binary (e.g., surgery).

φi = αWi + Ui (24)

or

φi ∼ Bernoulli(expit(αWi + Ui )) (25)

Based on φi and τi , I set Xi = φi if the treatment variable is not time-varying, and


Xi (t) = φi I[t > τi , τi < Ti ] if the treatment variable is time-varying.

Xi = φi (26)

or

Xi (t) = φi I[t > τi , τi < Ti ] (27)

Finally, the survival time variable Ti is generated by inverse transform sampling,
as follows (Austin, 2012):

vi ∼ U (0, 1)    (28)

Ti = −ln(vi) / (λ exp(θUi)),   if −ln(vi) < λ exp(θUi)τi
Ti = [−ln(vi) − λ exp(θUi)τi + λ exp(βXi(τi) + θUi)τi] / (λ exp(βXi(τi) + θUi)),   if −ln(vi) ≥ λ exp(θUi)τi    (29)

where λ is the constant baseline hazard.

A final administrative censoring of 180 months (15 years) is introduced for all
patients that survive until then.
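The generation steps in Equations (20)-(29) can be collected into one function. The sketch below assumes numpy; `lam` plays the role of the constant baseline hazard λ, and its default value (like the seed) is illustrative rather than taken from the thesis.

```python
import numpy as np

def generate_cohort(n, alpha, beta, theta, lam=0.01, admin_censor=180.0,
                    time_varying=True, binary=False, seed=0):
    """One simulated cohort following Equations (20)-(29)."""
    rng = np.random.default_rng(seed)
    U = rng.normal(size=n)                                    # (20) health score
    W = rng.binomial(1, 0.5, size=n)                          # (21) instrument
    tau = rng.uniform(0, 24, size=n) if time_varying else np.zeros(n)  # (22)/(23)
    lin = alpha * W + U
    phi = rng.binomial(1, 1 / (1 + np.exp(-lin))) if binary else lin   # (25)/(24)
    v = rng.uniform(size=n)                                   # (28)
    h0 = lam * np.exp(theta * U)                # hazard before treatment delivery
    h1 = lam * np.exp(beta * phi + theta * U)   # hazard after treatment delivery
    T = np.where(-np.log(v) < h0 * tau,
                 -np.log(v) / h0,
                 (-np.log(v) - h0 * tau + h1 * tau) / h1)     # (29)
    E = (T < admin_censor).astype(int)    # administrative censoring at 180 months
    T = np.minimum(T, admin_censor)
    return U, W, tau, phi, T, E

U, W, tau, phi, T, E = generate_cohort(1000, alpha=np.log(5),
                                       beta=np.log(2), theta=np.log(2))
```

When `time_varying` is False, τi = 0 and the second branch of Equation (29) always applies, reducing to a standard exponential draw under the post-treatment hazard.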
Using the methods and specifications described above, four types of simulated
datasets were generated. The first type describes situations where the treatment is
a baseline and quantitative variable, where τi , φi and Xi are generated by Equations
(23), (24) and (26); I will refer to these as simulated datasets B1 . The second type,
B2 , describes situations where the treatment is a baseline and binary variable, where

τi , φi and Xi are generated by Equations (23), (25) and (26). The third type, B3 ,
describes situations where the treatment variable is time-varying and quantitative;
here τi , φi and Xi (t) are generated by Equations (22), (24) and (27). Finally, the
fourth type, B4 , describes situations where the treatment is a time-varying and binary
variable; thus, τi , φi and Xi (t) are generated by Equations (22), (25) and (27).
Each of the four simulated dataset types, Bi , comprises 72 sets of 1000 simulated
cohorts of 500 patients each, based on all permutations of the parameters α, θ, β
over the following possible values.
α = {ln(3), ln(5), ln(8)}

θ = {ln(0.33), ln(0.5), ln(0.66), ln(1.5), ln(2), ln(3)}

β = {ln(0.5), ln(0.66), ln(1.5), ln(2)}

Here, the values of α represent weak, moderate and strong associations, respectively,
between the instrument and the treatment.
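The 72 parameter combinations per dataset type are simply the full factorial of the three value lists; a quick sketch confirms the count:

```python
from itertools import product
from math import log

alphas = [log(3), log(5), log(8)]
thetas = [log(x) for x in (0.33, 0.5, 0.66, 1.5, 2, 3)]
betas = [log(x) for x in (0.5, 0.66, 1.5, 2)]

# One (alpha, theta, beta) tuple per simulated scenario: 3 * 6 * 4 = 72
grid = list(product(alphas, thetas, betas))
```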
To serve as a comparison for the IV methods examined in later sections of this
thesis, the appropriate naive Cox model (accounting for time-varying treatment
where applicable), without adjustment for U , was applied to all four types of
simulated datasets.
The results are presented in Tables 1-4.

Table 1: Simulation results for applying the naive Cox model to
scenarios with a time-invariant and quantitative treatment variable
(datasets B1 ). Each estimate is calculated over 1000 replicate, ran-
dom datasets with identical α, β, θ.
True Hazard Ratio Average Hazard Ratio (Empirical Variance)
exp(β) exp(θ) α=ln(3) α=ln(5) α=ln(8)
2 3 3.874 (0.06) 3.031 (0.024) 2.569 (0.012)
2 2 3.171 (0.033) 2.732 (0.018) 2.458 (0.011)
2 1.5 2.669 (0.021) 2.462 (0.015) 2.326 (0.011)
2 0.66 1.453 (0.004) 1.531 (0.004) 1.603 (0.003)
2 0.5 1.161 (0.002) 1.27 (0.002) 1.358 (0.002)
2 0.33 0.878 (0.001) 1.013 (0.001) 1.123 (0.001)
1.5 3 2.979 (0.026) 2.386 (0.012) 2.043 (0.006)
1.5 2 2.412 (0.016) 2.098 (0.008) 1.903 (0.005)
1.5 1.5 2.015 (0.009) 1.87 (0.006) 1.768 (0.005)
1.5 0.66 1.097 (0.002) 1.164 (0.002) 1.219 (0.001)
1.5 0.5 0.889 (0.001) 0.985 (0.001) 1.064 (0.001)
1.5 0.33 0.688 (0.001) 0.813 (0.001) 0.912 (0.001)
0.66 3 1.46 (0.004) 1.233 (0.002) 1.099 (0.001)
0.66 2 1.125 (0.002) 1.016 (0.001) 0.941 (0.001)
0.66 1.5 0.914 (0.001) 0.86 (0.001) 0.82 (0.001)
0.66 0.66 0.498 (0.001) 0.535 (0.001) 0.566 (0.001)
0.66 0.5 0.415 (0.001) 0.475 (0.001) 0.524 (0.001)
0.66 0.33 0.334 (0.001) 0.417 (0.001) 0.484 (0.001)
0.5 3 1.138 (0.002) 0.981 (0.001) 0.882 (0.001)
0.5 2 0.865 (0.001) 0.788 (0.001) 0.731 (0.001)
0.5 1.5 0.691 (0.001) 0.65 (0.001) 0.622 (0.001)
0.5 0.66 0.374 (0.001) 0.405 (0.001) 0.431 (0.001)
0.5 0.5 0.316 (0.001) 0.363 (0.001) 0.403 (0.001)
0.5 0.33 0.258 (0.001) 0.324 (0.001) 0.382 (0.001)
Note: The average hazard ratio is computed by exp((1/1000) Σ_{i=1}^{1000} β̂i ),
where β̂i is the estimated β of the ith simulated dataset within the set
of 1000 simulated datasets generated with the same α, β, θ parameters.

Table 2: Simulation results for applying the naive Cox model
to scenarios with a time-invariant and binary treatment variable
(datasets B2 ). Each estimate is calculated over 1000 replicate, ran-
dom datasets with identical α, β, θ.
True Hazard Ratio Average Hazard Ratio (Empirical Variance)
exp(β) exp(θ) α=ln(3) α=ln(5) α=ln(8)
2 3 2.754 (0.086) 2.642 (0.072) 2.545 (0.072)
2 2 2.68 (0.075) 2.594 (0.073) 2.518 (0.065)
2 1.5 2.513 (0.059) 2.47 (0.065) 2.422 (0.065)
2 0.66 1.401 (0.015) 1.422 (0.017) 1.444 (0.018)
2 0.5 1.129 (0.01) 1.148 (0.011) 1.171 (0.013)
2 0.33 0.899 (0.007) 0.918 (0.007) 0.948 (0.008)
1.5 3 2.281 (0.049) 2.208 (0.049) 2.115 (0.044)
1.5 2 2.139 (0.048) 2.067 (0.036) 2.013 (0.037)
1.5 1.5 1.928 (0.033) 1.899 (0.036) 1.863 (0.033)
1.5 0.66 1.088 (0.009) 1.105 (0.011) 1.113 (0.009)
1.5 0.5 0.898 (0.007) 0.914 (0.006) 0.933 (0.007)
1.5 0.33 0.749 (0.005) 0.768 (0.005) 0.79 (0.005)
0.66 3 1.349 (0.015) 1.302 (0.015) 1.264 (0.014)
0.66 2 1.118 (0.01) 1.093 (0.01) 1.062 (0.01)
0.66 1.5 0.925 (0.006) 0.917 (0.007) 0.902 (0.007)
0.66 0.66 0.519 (0.002) 0.53 (0.002) 0.539 (0.003)
0.66 0.5 0.473 (0.002) 0.483 (0.002) 0.496 (0.002)
0.66 0.33 0.442 (0.002) 0.454 (0.002) 0.467 (0.002)
0.5 3 1.128 (0.011) 1.095 (0.01) 1.055 (0.009)
0.5 2 0.893 (0.006) 0.879 (0.006) 0.858 (0.006)
0.5 1.5 0.716 (0.004) 0.706 (0.004) 0.694 (0.004)
0.5 0.66 0.401 (0.002) 0.41 (0.002) 0.41 (0.002)
0.5 0.5 0.377 (0.001) 0.382 (0.001) 0.392 (0.002)
0.5 0.33 0.364 (0.001) 0.373 (0.001) 0.383 (0.002)
Note: The average hazard ratio is computed by exp((1/1000) Σ_{i=1}^{1000} β̂i ),
where β̂i is the estimated β of the ith simulated dataset within the set
of 1000 simulated datasets generated with the same α, β, θ parameters.

Table 3: Simulation results for applying the naive Cox model to
scenarios with a time-varying and quantitative treatment variable
(datasets B3 ). Each estimate is calculated over 1000 replicate, ran-
dom datasets with identical α, β, θ.
True Hazard Ratio Average Hazard Ratio (Empirical Variance)
exp(β) exp(θ) α=ln(3) α=ln(5) α=ln(8)
2 3 2.385 (0.015) 2.226 (0.006) 2.227 (0.006)
2 2 2.344 (0.012) 2.189 (0.006) 2.189 (0.006)
2 1.5 2.254 (0.01) 2.199 (0.005) 2.104 (0.005)
2 0.66 1.679 (0.005) 1.791 (0.003) 1.793 (0.004)
2 0.5 1.485 (0.004) 1.649 (0.003) 1.646 (0.003)
2 0.33 1.294 (0.003) 1.479 (0.003) 1.48 (0.003)
1.5 3 1.887 (0.008) 1.607 (0.003) 1.606 (0.003)
1.5 2 1.809 (0.007) 1.614 (0.003) 1.616 (0.003)
1.5 1.5 1.721 (0.006) 1.599 (0.003) 1.596 (0.003)
1.5 0.66 1.259 (0.003) 1.347 (0.002) 1.345 (0.002)
1.5 0.5 1.13 (0.002) 1.25 (0.002) 1.252 (0.002)
1.5 0.33 1.015 (0.002) 1.159 (0.002) 1.157 (0.002)
0.66 3 1.022 (0.002) 0.889 (0.001) 0.889 (0.001)
0.66 2 0.913 (0.001) 0.825 (0.001) 0.822 (0.001)
0.66 1.5 0.813 (0.001) 0.759 (0.001) 0.76 (0.001)
0.66 0.66 0.57 (0.001) 0.619 (0.001) 0.618 (0.001)
0.66 0.5 0.537 (0.001) 0.61 (0.001) 0.61 (0.001)
0.66 0.33 0.516 (0.001) 0.609 (0.001) 0.61 (0.001)
0.5 3 0.825 (0.001) 0.713 (0.001) 0.715 (0.001)
0.5 2 0.712 (0.001) 0.642 (0.001) 0.642 (0.001)
0.5 1.5 0.621 (0.001) 0.58 (0.001) 0.579 (0.001)
0.5 0.66 0.430 (0.001) 0.468 (0.001) 0.468 (0.001)
0.5 0.5 0.407 (0.001) 0.467 (0.001) 0.467 (0.001)
0.5 0.33 0.398 (0.001) 0.473 (0.001) 0.473 (0.001)
Note: The average hazard ratio is computed by exp((1/1000) Σ_{i=1}^{1000} β̂i ),
where β̂i is the estimated β of the ith simulated dataset within the set
of 1000 simulated datasets generated with the same α, β, θ parameters.

Table 4: Simulation results for applying the naive Cox model to sce-
narios with a time-varying and binary treatment variable (datasets
B4 ). Each estimate is calculated over 1000 replicate, random
datasets with identical α, β, θ.
True Hazard Ratio Average Hazard Ratio (Empirical Variance)
exp(β) exp(θ) α=ln(3) α=ln(5) α=ln(8)
2 3 2.66 (0.082) 2.588 (0.081) 2.502 (0.072)
2 2 2.618 (0.077) 2.535 (0.07) 2.502 (0.073)
2 1.5 2.459 (0.062) 2.435 (0.064) 2.38 (0.059)
2 0.66 1.467 (0.022) 1.487 (0.023) 1.499 (0.025)
2 0.5 1.218 (0.016) 1.243 (0.018) 1.277 (0.018)
2 0.33 1.015 (0.013) 1.062 (0.013) 1.097 (0.016)
1.5 3 2.199 (0.057) 2.121 (0.054) 2.047 (0.052)
1.5 2 2.064 (0.045) 2.026 (0.049) 1.968 (0.043)
1.5 1.5 1.900 (0.037) 1.869 (0.041) 1.844 (0.035)
1.5 0.66 1.12 (0.014) 1.136 (0.015) 1.153 (0.015)
1.5 0.5 0.944 (0.01) 0.975 (0.011) 0.995 (0.012)
1.5 0.33 0.82 (0.008) 0.846 (0.01) 0.881 (0.011)
0.66 3 1.226 (0.021) 1.184 (0.018) 1.15 (0.019)
0.66 2 1.06 (0.014) 1.042 (0.013) 1.016 (0.014)
0.66 1.5 0.903 (0.011) 0.888 (0.009) 0.879 (0.010)
0.66 0.66 0.521 (0.003) 0.528 (0.004) 0.536 (0.004)
0.66 0.5 0.469 (0.003) 0.476 (0.003) 0.491 (0.003)
0.66 0.33 0.44 (0.003) 0.454 (0.003) 0.466 (0.003)
0.5 3 0.986 (0.014) 0.953 (0.013) 0.927 (0.013)
0.5 2 0.834 (0.010) 0.815 (0.009) 0.792 (0.009)
0.5 1.5 0.689 (0.006) 0.676 (0.006) 0.665 (0.006)
0.5 0.66 0.395 (0.002) 0.399 (0.002) 0.407 (0.002)
0.5 0.5 0.366 (0.002) 0.373 (0.002) 0.378 (0.002)
0.5 0.33 0.350 (0.002) 0.358 (0.002) 0.372 (0.002)
Note: The average hazard ratio is computed by exp((1/1000) Σ_{i=1}^{1000} β̂i ),
where β̂i is the estimated β of the ith simulated dataset within the set
of 1000 simulated datasets generated with the same α, β, θ parameters.

The results from Tables 1-4 show that the naive Cox model is biased in the
direction of the log hazard ratio of the unmeasured confounder, θ. The higher
variance observed when the treatment is binary (datasets B2 and B4 ) is due to the
variation in the parameters that generate the treatment probability. In simulated
datasets B2 and B4 , when the log odds ratio of the instrument on treatment, α, is
high, the result can be a skewed distribution of treated (φ = 1) versus untreated
(φ = 0) observations.

4.2 Cox model setting without time-dependent covariate

As mentioned in Section 2.2, the current methodological literature on IV methods
in the Cox model setting is lacking. Nevertheless, based on the work of Terza,
Basu, and Rathouz (2008), many clinical studies have used the 2SRI estimator,
assuming that the consistency observed in the linear and logistic models also holds
in the Cox model setting. In this section, I use simulated datasets to verify
empirically whether this is true. In addition, I use simulated datasets to assess
the results of applying the MIV estimator developed by MacKenzie et al. (2014) in
the multiplicative confounding setting. Finally, I briefly compare the performance
of both methods.

4.2.1 The 2SRI Estimator

Recall from Section 2.2 and Equations (13-15) that the 2SRI estimator relies on the
first stage regression to produce a consistent estimator of the unmeasured
confounder, U , from its residuals, Û . By including the residuals Û of the first
stage regression as a covariate, a consistent estimator of the average treatment
effect β is obtained in the second stage regression.
Analytically, the consistency of the 2SRI estimator in the Cox model can be
established using the same argument as in the linear setting. If Û is a consistent
estimator of U , then the partial likelihood with Û as an independent variable,
Equation (30), should converge to Equation (31) asymptotically.

∏_{i=1}^{n} [ exp(βXi + θÛi) / Σ_{k∈R(Ti)} exp(βXk + θÛk) ]^{Yi}    (30)

∏_{i=1}^{n} [ exp(βXi + θUi) / Σ_{k∈R(Ti)} exp(βXk + θUk) ]^{Yi}    (31)

Thus, I will verify this result empirically using simulated datasets B1 . Datasets
in B1 are generated with a quantitative treatment variable that is not time-varying,
so the residuals of the first stage regression should provide a consistent estimator
of U . Hence, the second stage regression should produce a consistent estimator of
the average treatment effect β.

Table 5: Simulation results for applying the 2SRI estimator to sce-
narios with a time-invariant and quantitative treatment variable
(datasets B1 ). Each estimate is calculated over 1000 replicate, ran-
dom datasets with identical α, β, θ.
True Hazard Ratio Average Hazard Ratio (Empirical Variance)
exp(β) exp(θ) α=ln(3) α=ln(5) α=ln(8)
2 3 2.018 (0.065) 2.003 (0.032) 2.012 (0.021)
2 2 1.998 (0.046) 2 (0.021) 2.005 (0.013)
2 1.5 2.004 (0.036) 1.996 (0.018) 2.005 (0.012)
2 0.66 2.024 (0.038) 2.007 (0.017) 2.015 (0.013)
2 0.5 2.037 (0.049) 2.011 (0.022) 2.013 (0.016)
2 0.33 2.047 (0.071) 2.025 (0.031) 2.015 (0.02)
1.5 3 1.507 (0.033) 1.508 (0.017) 1.502 (0.011)
1.5 2 1.503 (0.026) 1.501 (0.011) 1.501 (0.007)
1.5 1.5 1.508 (0.019) 1.505 (0.009) 1.505 (0.005)
1.5 0.66 1.517 (0.019) 1.506 (0.009) 1.507 (0.006)
1.5 0.5 1.524 (0.025) 1.509 (0.012) 1.504 (0.007)
1.5 0.33 1.538 (0.043) 1.516 (0.018) 1.517 (0.01)
0.66 3 0.665 (0.006) 0.666 (0.003) 0.667 (0.002)
0.66 2 0.665 (0.005) 0.665 (0.002) 0.665 (0.001)
0.66 1.5 0.669 (0.004) 0.666 (0.002) 0.666 (0.001)
0.66 0.66 0.672 (0.004) 0.668 (0.002) 0.666 (0.001)
0.66 0.5 0.671 (0.005) 0.671 (0.002) 0.67 (0.002)
0.66 0.33 0.675 (0.008) 0.675 (0.004) 0.67 (0.002)
0.5 3 0.499 (0.004) 0.498 (0.002) 0.502 (0.001)
0.5 2 0.499 (0.003) 0.5 (0.001) 0.499 (0.001)
0.5 1.5 0.5 (0.002) 0.497 (0.001) 0.499 (0.001)
0.5 0.66 0.503 (0.002) 0.502 (0.001) 0.501 (0.001)
0.5 0.5 0.507 (0.003) 0.5 (0.001) 0.501 (0.001)
0.5 0.33 0.506 (0.005) 0.502 (0.002) 0.503 (0.001)
Note: The average hazard ratio is computed by exp((1/1000) Σ_{i=1}^{1000} β̂i ),
where β̂i is the estimated β of the ith simulated dataset within the set
of 1000 simulated datasets generated with the same α, β, θ parameters.

From Table 5, it is clear that the 2SRI estimator is consistent when applied to
the scenarios described by simulated datasets B1 . In comparison to the naive Cox
model, the results reveal a higher empirical variance when the association between
the instrument and the treatment is weak (α = ln(3)).
In the above scenario, simulated datasets B1 have a quantitative treatment variable
for which the residual of the first stage regression is a consistent estimator of
the unmeasured confounder U . However, this is not always the case. For example,
when the treatment variable is binary and assigned via a Bernoulli process, the
first stage regression must be modelled as a logistic regression, which produces
the expected probability of treatment rather than the treatment value itself. As a
result, the residuals of the first stage logistic regression will, in general, fail
to be a consistent estimator of the unmeasured confounder.
To verify this suspicion, I applied the 2SRI estimator to simulated datasets B2 ,
which fit the scenario just described. The first stage regression is then a
logistic regression. While the residuals of this step correlate with the unmeasured
confounder U , they are not a consistent estimator of U because, with a binary
instrument and a binary treatment, the residuals can take only four possible values.
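This can be seen directly. With a single binary covariate the logistic first stage is saturated, so its fitted probabilities are exactly the within-group means of the treatment indicator; the sketch below (numpy assumed, values illustrative) uses that shortcut and counts the distinct residual values:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000
alpha = np.log(5)
U = rng.normal(size=n)                        # unmeasured confounder
W = rng.binomial(1, 0.5, size=n)              # binary instrument
X = rng.binomial(1, 1 / (1 + np.exp(-(alpha * W + U))))   # binary treatment

# Saturated logistic model: the MLE's fitted probabilities equal the
# within-group means of X, so no iterative fitting is needed here.
p_hat = np.where(W == 1, X[W == 1].mean(), X[W == 0].mean())
resid = X - p_hat                             # first-stage "residual" for U

n_distinct = len(np.unique(np.round(resid, 12)))   # at most 4 distinct values
```

A continuous confounder U cannot be recovered from a variable with four support points, which is why the second stage remains inconsistent in this setting.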

Table 6: Simulation results for applying the 2SRI estimator to sce-
narios with a time-invariant and binary treatment variable (datasets
B2 ). Each estimate is calculated over 1000 replicate, random
datasets with identical α, β, θ.
True Hazard Ratio Average Hazard Ratio (Empirical Variance)
exp(β) exp(θ) α=ln(3) α=ln(5) α=ln(8)
2 3 1.709 (0.568) 1.653 (0.237) 1.693 (0.163)
2 2 1.886 (0.827) 1.84 (0.284) 1.852 (0.169)
2 1.5 2.103 (0.991) 1.969 (0.338) 1.977 (0.183)
2 0.66 2.092 (1.643) 1.942 (0.351) 1.877 (0.174)
2 0.5 1.956 (1.173) 1.75 (0.299) 1.653 (0.147)
2 0.33 1.848 (1.248) 1.563 (0.231) 1.459 (0.129)
1.5 3 1.384 (0.449) 1.397 (0.199) 1.418 (0.095)
1.5 2 1.484 (0.469) 1.46 (0.183) 1.484 (0.109)
1.5 1.5 1.545 (0.459) 1.505 (0.175) 1.524 (0.106)
1.5 0.66 1.574 (0.525) 1.514 (0.215) 1.426 (0.102)
1.5 0.5 1.547 (0.613) 1.39 (0.167) 1.313 (0.086)
1.5 0.33 1.465 (0.637) 1.342 (0.186) 1.223 (0.096)
0.66 3 0.834 (0.148) 0.813 (0.059) 0.836 (0.033)
0.66 2 0.772 (0.122) 0.774 (0.054) 0.782 (0.03)
0.66 1.5 0.751 (0.12) 0.734 (0.047) 0.733 (0.027)
0.66 0.66 0.775 (0.163) 0.723 (0.053) 0.686 (0.024)
0.66 0.5 0.81 (0.152) 0.744 (0.055) 0.711 (0.027)
0.66 0.33 0.92 (0.397) 0.798 (0.071) 0.738 (0.033)
0.5 3 0.682 (0.081) 0.675 (0.042) 0.694 (0.025)
0.5 2 0.624 (0.075) 0.62 (0.036) 0.629 (0.02)
0.5 1.5 0.58 (0.069) 0.554 (0.023) 0.569 (0.016)
0.5 0.66 0.619 (0.118) 0.547 (0.031) 0.523 (0.014)
0.5 0.5 0.663 (0.123) 0.594 (0.053) 0.55 (0.017)
0.5 0.33 0.755 (0.216) 0.658 (0.053) 0.611 (0.024)
Note: The average hazard ratio is computed by exp((1/1000) Σ_{i=1}^{1000} β̂i ),
where β̂i is the estimated β of the ith simulated dataset within the set
of 1000 simulated datasets generated with the same α, β, θ parameters.

The 2SRI estimator, in this case, does not yield a consistent estimator of the
average treatment effect (Table 6). Nevertheless, the bias of the 2SRI estimator is
always lower than that of the naive Cox model (Table 2), and this bias is generally
toward the null. The bias also increases as the magnitude of exp(θ) increases.
Lastly, as in the quantitative treatment case, a higher empirical variance is
observed compared to the naive Cox model.
The empirical observations made in the simulation studies of this section suggest
that the 2SRI estimator in the Cox model setting requires an additional assumption
in order to be a consistent estimator of the average treatment effect β: the
residuals of the first stage regression must be a consistent estimator of the
unmeasured confounder. When this assumption is not met and the treatment variable
is binary, the 2SRI estimator is, in general, biased toward the null. Nonetheless,
in this case, it is still less biased than the naive Cox model.

4.2.2 The MIV estimator

MacKenzie et al. (2014) indicate that their proposed MIV estimator (see Section 2.2
above) is biased when applied to the multiplicative confounding setting described
by Equations (11) and (12). However, compared to the naive Cox model, the MIV
estimator is substantially less biased.
In this section, I empirically verify the performance of the MIV estimator using
simulated datasets B1 and B2 . The results of this section allow a comparison
between the MIV estimator and the 2SRI estimator.

Table 7: Simulation results for applying the MIV estimator to sce-
narios with a time-invariant and quantitative treatment variable
(datasets B1 ). Each estimate is calculated over 1000 replicate, ran-
dom datasets with identical α, β, θ.
True Hazard Ratio Average Hazard Ratio (Empirical Variance)
exp(β) exp(θ) α=ln(3) α=ln(5) α=ln(8)
2 3 1.454 (0.043) 1.452 (0.017) 1.459 (0.021)
2 2 1.571 (0.064) 1.58 (0.022) 1.581 (0.022)
2 1.5 1.731 (0.082) 1.71 (0.031) 1.726 (0.031)
2 0.66 2.212 (0.331) 2.201 (0.108) 2.23 (0.159)
2 0.5 1.868 (0.255) 2.051 (0.105) 2.053 (0.117)
2 0.33 1.725 (0.129) 1.716 (0.038) 1.708 (0.039)
1.5 3 1.257 (0.03) 1.26 (0.013) 1.253 (0.013)
1.5 2 1.323 (0.04) 1.324 (0.014) 1.323 (0.014)
1.5 1.5 1.399 (0.042) 1.395 (0.016) 1.396 (0.015)
1.5 0.66 1.542 (0.097) 1.513 (0.026) 1.518 (0.026)
1.5 0.5 1.449 (0.083) 1.43 (0.021) 1.423 (0.02)
1.5 0.33 1.342 (0.049) 1.324 (0.016) 1.329 (0.016)
0.66 3 0.752 (0.011) 0.758 (0.004) 0.76 (0.005)
0.66 2 0.691 (0.011) 0.697 (0.005) 0.699 (0.005)
0.66 1.5 0.661 (0.01) 0.662 (0.004) 0.662 (0.004)
0.66 0.66 0.721 (0.012) 0.718 (0.004) 0.716 (0.004)
0.66 0.5 0.753 (0.013) 0.755 (0.005) 0.757 (0.005)
0.66 0.33 0.796 (0.014) 0.799 (0.006) 0.797 (0.006)
0.5 3 0.578 (0.009) 0.582 (0.004) 0.587 (0.004)
0.5 2 0.437 (0.011) 0.489 (0.005) 0.49 (0.004)
0.5 1.5 0.441 (0.01) 0.446 (0.004) 0.451 (0.005)
0.5 0.66 0.579 (0.009) 0.578 (0.004) 0.58 (0.003)
0.5 0.5 0.631 (0.009) 0.624 (0.004) 0.627 (0.004)
0.5 0.33 0.685 (0.012) 0.679 (0.004) 0.684 (0.004)
Note: The average hazard ratio is computed by exp((1/1000) Σ_{i=1}^{1000} β̂i ),
where β̂i is the estimated β of the ith simulated dataset within the set
of 1000 simulated datasets generated with the same α, β, θ parameters.

Table 8: Simulation results for applying the MIV estimator to sce-
narios with a time-invariant and binary treatment variable (datasets
B2 ). Each estimate is calculated over 1000 replicate, random
datasets with identical α, β, θ.
True Hazard Ratio Average Hazard Ratio (Empirical Variance)
exp(β) exp(θ) α=ln(3) α=ln(5) α=ln(8)
2 3 1.348 (0.988) 1.495 (0.605) 1.482 (0.558)
2 2 1.399 (0.993) 1.642 (0.635) 1.619 (0.62)
2 1.5 1.545 (1.158) 1.753 (0.697) 1.711 (0.654)
2 0.66 1.546 (1.01) 1.841 (0.739) 1.919 (0.809)
2 0.5 1.492 (0.9) 1.705 (0.76) 1.687 (0.711)
2 0.33 1.368 (0.904) 1.542 (0.623) 1.521 (0.708)
1.5 3 1.127 (0.853) 1.268 (0.455) 1.269 (0.379)
1.5 2 1.192 (0.857) 1.337 (0.479) 1.323 (0.452)
1.5 1.5 1.263 (0.833) 1.384 (0.467) 1.396 (0.52)
1.5 0.66 1.277 (0.817) 1.486 (0.628) 1.47 (0.593)
1.5 0.5 1.216 (0.823) 1.386 (0.578) 1.361 (0.528)
1.5 0.33 1.174 (0.787) 1.336 (0.576) 1.298 (0.547)
0.66 3 0.721 (0.459) 0.726 (0.164) 0.725 (0.178)
0.66 2 0.653 (0.427) 0.692 (0.153) 0.684 (0.134)
0.66 1.5 0.634 (0.342) 0.666 (0.15) 0.656 (0.151)
0.66 0.66 0.661 (0.408) 0.709 (0.177) 0.72 (0.189)
0.66 0.5 0.715 (0.506) 0.731 (0.205) 0.763 (0.226)
0.66 0.33 0.739 (0.458) 0.813 (0.272) 0.803 (0.222)
0.5 3 0.577 (0.32) 0.592 (0.137) 0.578 (0.092)
0.5 2 0.542 (0.353) 0.546 (0.108) 0.542 (0.099)
0.5 1.5 0.509 (0.273) 0.486 (0.065) 0.493 (0.096)
0.5 0.66 0.572 (0.441) 0.537 (0.133) 0.543 (0.122)
0.5 0.5 0.598 (0.4) 0.577 (0.134) 0.588 (0.172)
0.5 0.33 0.655 (0.447) 0.662 (0.209) 0.668 (0.17)
Note: The average hazard ratio is computed as exp((1/1000) Σ_{i=1}^{1000} β̂_i), where β̂_i is the estimated β of the i-th simulated dataset within the set of 1000 simulated datasets generated with the same α, β, θ parameters.
Results of both simulation studies (Tables 7, 8) confirm that the estimated hazard ratio from the MIV estimator is biased, as MacKenzie et al. (2014) noted, but it is indeed less biased than the naive Cox model (Tables 1, 2) when the correlation between the instrument and the treatment is strong (i.e., α = ln(5) and α = ln(8)). In addition, similar to the 2SRI estimator, the bias of the MIV estimator seems to be, in general, towards the null. When the effect of the unmeasured confounder is small, exp(θ) = 0.66 or exp(θ) = 1.5, the MIV estimator produces estimates that are very close to the true hazard ratio (Tables 7, 8).
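For clarity, the summary statistics reported in these tables can be reproduced directly from the replicate estimates. The following is a minimal Python sketch (the thesis' own code is in R, and the β̂ values below are invented purely for illustration):

```python
import math

# Hypothetical estimated log hazard ratios (beta-hat) from replicate datasets
beta_hats = [0.40, 0.35, 0.42, 0.38, 0.45]

# Average hazard ratio: exponential of the mean of the beta-hat estimates
mean_b = sum(beta_hats) / len(beta_hats)
avg_hr = math.exp(mean_b)

# Empirical variance of the beta-hat estimates across replicates
emp_var = sum((b - mean_b) ** 2 for b in beta_hats) / (len(beta_hats) - 1)

print(round(avg_hr, 3), round(emp_var, 5))
```

Averaging on the log hazard-ratio scale and then exponentiating, as in the table notes, avoids the upward distortion that would result from averaging the hazard ratios themselves.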
In contrast, when the correlation between the instrument, W, and the treatment variable, X, is weak, the MIV estimator may produce very unstable estimates of the average treatment effect. Indeed, in Table 8, the column with α = ln(3) shows several empirical variances that are much larger than those of the naive Cox model (e.g., when exp(β) = 2).
In terms of relative performance, when the treatment is quantitative, the 2SRI estimator is consistent, so the MIV estimator is not competitive in that setting. Furthermore, the simulation studies of this section reveal that, when the treatment variable is binary, both the 2SRI and MIV estimators yield biased estimates of the average treatment effect β, with biases that are, in general, towards the null. Comparing the results of Tables 6 and 8, I observe that the 2SRI estimator has, in general, both lower bias and lower variance than the MIV estimator.
4.3 Cox model setting with a time-dependent treatment variable
In Section 4.2, I verified empirically the performance of the 2SRI and MIV estimators in Cox model settings with quantitative and binary treatment variables that are always delivered at the beginning of the follow-up. The simulation results suggest that the 2SRI estimator is consistent only if the residuals of its first-stage regression are a consistent estimator of the unmeasured confounder U. The MIV estimator, on the other hand, is, in general, biased towards the null in both situations.
In this section, I consider the situation in which the treatment variable varies with time. In this case, a new level of complexity is introduced by the possibility of some patients not surviving until the time of treatment delivery.
4.3.1 The 2SRI estimator
Whether the intended treatment, φ, is observable is an important consideration for the 2SRI estimator when the treatment is time-varying. Based on the motivating example from the prostate cancer study, the intended treatment is typically not observable in administrative databases based on billing claims. A patient undergoing observation (i.e., no treatment) is usually identified via the absence of any therapy in the billing claims. However, in reality, many patients undergoing apparent observation may have been assigned to undergo surgery but did not survive until the scheduled surgery date. In such cases, using the observed treatment, X_i(τ_i), as the outcome of the first-stage regression may result in a badly biased estimator of the unmeasured confounder.
To illustrate this point, I will first apply the 2SRI estimator to the simulated datasets B3, where the treatment is quantitative. By design, the simulated datasets B3 contain many observations that do not survive until the delivery of the treatment. In this first simulation, the first-stage regression is fitted with the intended treatment, φ, as the outcome.
Table 9: Simulation results when applying the 2SRI estimator that uses the intended treatment as the outcome in the first-stage regression for scenarios with a time-varying and quantitative treatment variable (datasets B3). Each estimate is calculated over 1000 replicate, random datasets with identical α, β, θ.
True Hazard Ratio Average Hazard Ratio (Empirical Variance)
exp(β) exp(θ) α=ln(3) α=ln(5) α=ln(8)
2 3 2.011 (0.015) 2.006 (0.008) 2.006 (0.007)
2 2 2.005 (0.01) 2.002 (0.006) 2.003 (0.006)
2 1.5 2.004 (0.01) 2.001 (0.005) 2.006 (0.005)
2 0.66 2.011 (0.009) 1.999 (0.005) 2.001 (0.005)
2 0.5 2.01 (0.011) 2.006 (0.006) 2.002 (0.005)
2 0.33 2.006 (0.015) 2.002 (0.006) 2.004 (0.007)
1.5 3 1.504 (0.008) 1.502 (0.004) 1.502 (0.004)
1.5 2 1.499 (0.006) 1.500 (0.003) 1.501 (0.003)
1.5 1.5 1.504 (0.006) 1.502 (0.003) 1.5 (0.003)
1.5 0.66 1.504 (0.005) 1.507 (0.003) 1.504 (0.003)
1.5 0.5 1.504 (0.006) 1.502 (0.003) 1.503 (0.003)
1.5 0.33 1.504 (0.008) 1.506 (0.004) 1.501 (0.004)
0.66 3 0.668 (0.002) 0.666 (0.001) 0.667 (0.001)
0.66 2 0.667 (0.001) 0.667 (0.001) 0.665 (0.001)
0.66 1.5 0.666 (0.001) 0.665 (0.001) 0.666 (0.001)
0.66 0.66 0.666 (0.002) 0.667 (0.001) 0.666 (0.001)
0.66 0.5 0.666 (0.002) 0.665 (0.001) 0.665 (0.001)
0.66 0.33 0.666 (0.003) 0.667 (0.001) 0.667 (0.001)
0.5 3 0.501 (0.001) 0.499 (0.001) 0.500 (0.001)
0.5 2 0.498 (0.001) 0.5 (0.001) 0.499 (0.001)
0.5 1.5 0.500 (0.001) 0.499 (0.001) 0.498 (0.001)
0.5 0.66 0.500 (0.001) 0.500 (0.001) 0.499 (0.001)
0.5 0.5 0.498 (0.001) 0.500 (0.001) 0.500 (0.001)
0.5 0.33 0.501 (0.002) 0.499 (0.001) 0.499 (0.001)
Note: The average hazard ratio is computed as exp((1/1000) Σ_{i=1}^{1000} β̂_i), where β̂_i is the estimated β of the i-th simulated dataset within the set of 1000 simulated datasets generated with the same α, β, θ parameters.
As expected, the results in Table 9 suggest that the 2SRI estimator is, in this case, a consistent estimator of the average treatment effect, as the unmeasured confounder U is consistently estimated from the residuals of the first-stage regression.
Now, for comparison, I repeat the same simulation using the simulated datasets B3; this time, however, the first-stage regression uses the observed treatment variable, X_i(τ_i), as described by Equation (27), as the outcome.
In Table 10, a clear bias is observed. Since many patients do not survive until the delivery of treatment, the first-stage regression is fitted with 0 as the outcome for those observations, resulting in a biased estimator of the unmeasured confounder U. Then, because the second-stage regression is fitted with this biased estimate of U, the resulting β̂ is also biased. Compared to the naive Cox model applied to B3 (Table 3), the 2SRI estimator is, in this case, more biased. Interestingly, the bias observed here is always an overestimation.
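The mechanism behind this first-stage distortion can be illustrated with a small, self-contained simulation. The following Python sketch is not the thesis' actual data-generating model (the parameter values and the survival rule below are illustrative assumptions); it only shows how recording a zero treatment for patients who die before delivery distorts the first-stage regression of treatment on the instrument:

```python
import random

random.seed(42)
n = 5000
alpha = 1.0   # assumed effect of the instrument on the intended treatment
theta = 1.0   # assumed effect of the unmeasured confounder

W = [random.gauss(0, 1) for _ in range(n)]          # instrument
U = [random.gauss(0, 1) for _ in range(n)]          # unmeasured confounder
phi = [alpha * w + theta * u + random.gauss(0, 0.5)
       for w, u in zip(W, U)]                       # intended treatment
# Patients with a high confounder value tend to die before treatment
# delivery; their *observed* treatment is recorded as 0.
survives = [u < 0.5 for u in U]
X_obs = [p if s else 0.0 for p, s in zip(phi, survives)]

def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

slope_intended = ols_slope(W, phi)    # close to alpha
slope_observed = ols_slope(W, X_obs)  # attenuated towards 0
print(round(slope_intended, 2), round(slope_observed, 2))
```

With the observed treatment as the outcome, the fitted first-stage relationship is attenuated, so its residuals no longer track the unmeasured confounder the way the residuals from the intended-treatment regression do.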
To further illustrate the problem of the intended treatment, φ, being unobservable, I apply the 2SRI estimator to the simulated datasets B4, where the treatment variable is binary and time-varying. Again, I first run the simulation with the intended treatment, φ, as the outcome of the first-stage regression, to provide a comparison.
Table 10: Simulation results for applying the 2SRI estimator that uses the observed treatment as the outcome of the first-stage regression to scenarios with a time-varying and quantitative treatment variable (datasets B3). Each estimate is calculated over 1000 replicate, random datasets with identical α, β, θ.
True Hazard Ratio Average Hazard Ratio (Empirical Variance)
exp(β) exp(θ) α=ln(3) α=ln(5) α=ln(8)
2 3 2.525 (0.025) 2.311 (0.012) 2.308 (0.012)
2 2 2.646 (0.025) 2.456 (0.015) 2.454 (0.014)
2 1.5 2.759 (0.028) 2.588 (0.015) 2.59 (0.015)
2 0.66 3.074 (0.061) 2.858 (0.029) 2.864 (0.029)
2 0.5 3.151 (0.079) 2.855 (0.034) 2.844 (0.039)
2 0.33 3.158 (0.124) 2.727 (0.043) 2.737 (0.043)
1.5 3 1.998 (0.015) 1.853 (0.007) 1.849 (0.006)
1.5 2 2.053 (0.016) 1.915 (0.007) 1.913 (0.007)
1.5 1.5 2.112 (0.017) 1.981 (0.008) 1.975 (0.008)
1.5 0.66 2.296 (0.035) 2.139 (0.018) 2.135 (0.016)
1.5 0.5 2.353 (0.046) 2.124 (0.019) 2.122 (0.02)
1.5 0.33 2.389 (0.067) 2.074 (0.025) 2.07 (0.023)
0.66 3 1.085 (0.004) 1.051 (0.002) 1.052 (0.002)
0.66 2 1.048 (0.004) 1.005 (0.002) 1.003 (0.002)
0.66 1.5 1.016 (0.004) 0.967 (0.002) 0.968 (0.002)
0.66 0.66 0.998 (0.007) 0.93 (0.004) 0.927 (0.004)
0.66 0.5 1.043 (0.011) 0.951 (0.005) 0.955 (0.005)
0.66 0.33 1.101 (0.017) 0.993 (0.006) 0.993 (0.007)
0.5 3 0.874 (0.003) 0.858 (0.002) 0.859 (0.002)
0.5 2 0.823 (0.003) 0.798 (0.002) 0.796 (0.002)
0.5 1.5 0.784 (0.003) 0.751 (0.002) 0.751 (0.002)
0.5 0.66 0.753 (0.005) 0.696 (0.003) 0.694 (0.003)
0.5 0.5 0.777 (0.007) 0.716 (0.003) 0.717 (0.004)
0.5 0.33 0.831 (0.011) 0.752 (0.004) 0.753 (0.005)
Note: The average hazard ratio is computed as exp((1/1000) Σ_{i=1}^{1000} β̂_i), where β̂_i is the estimated β of the i-th simulated dataset within the set of 1000 simulated datasets generated with the same α, β, θ parameters.
Table 11: Simulation results for applying the 2SRI estimator that uses the intended treatment as the outcome of the first-stage regression to scenarios with a time-varying and binary treatment variable (datasets B4). Each estimate is calculated over 1000 replicate, random datasets with identical α, β, θ.
True Hazard Ratio Average Hazard Ratio (Empirical Variance)
exp(β) exp(θ) α=ln(3) α=ln(5) α=ln(8)
2 3 1.615 (0.088) 1.644 (0.083) 1.651 (0.07)
2 2 1.825 (0.123) 1.805 (0.096) 1.825 (0.096)
2 1.5 1.955 (0.147) 1.96 (0.127) 1.928 (0.106)
2 0.66 1.984 (0.172) 1.975 (0.139) 1.936 (0.116)
2 0.5 1.966 (0.185) 1.928 (0.135) 1.884 (0.109)
2 0.33 1.973 (0.185) 1.942 (0.145) 1.862 (0.116)
1.5 3 1.303 (0.063) 1.31 (0.055) 1.297 (0.049)
1.5 2 1.395 (0.073) 1.409 (0.068) 1.387 (0.058)
1.5 1.5 1.486 (0.087) 1.477 (0.083) 1.49 (0.066)
1.5 0.66 1.517 (0.113) 1.5 (0.087) 1.514 (0.073)
1.5 0.5 1.544 (0.114) 1.519 (0.09) 1.482 (0.077)
1.5 0.33 1.595 (0.103) 1.52 (0.083) 1.500 (0.073)
0.66 3 0.666 (0.018) 0.668 (0.016) 0.672 (0.016)
0.66 2 0.676 (0.023) 0.689 (0.021) 0.685 (0.016)
0.66 1.5 0.685 (0.025) 0.683 (0.02) 0.691 (0.018)
0.66 0.66 0.711 (0.029) 0.706 (0.022) 0.7 (0.019)
0.66 0.5 0.76 (0.029) 0.752 (0.025) 0.744 (0.024)
0.66 0.33 0.865 (0.037) 0.837 (0.03) 0.816 (0.024)
0.5 3 0.521 (0.013) 0.525 (0.011) 0.536 (0.011)
0.5 2 0.525 (0.014) 0.529 (0.012) 0.528 (0.012)
0.5 1.5 0.523 (0.015) 0.52 (0.012) 0.520 (0.011)
0.5 0.66 0.539 (0.016) 0.535 (0.013) 0.531 (0.013)
0.5 0.5 0.596 (0.02) 0.589 (0.017) 0.577 (0.015)
0.5 0.33 0.679 (0.022) 0.666 (0.021) 0.656 (0.018)
Note: The average hazard ratio is computed as exp((1/1000) Σ_{i=1}^{1000} β̂_i), where β̂_i is the estimated β of the i-th simulated dataset within the set of 1000 simulated datasets generated with the same α, β, θ parameters.
Table 12: Simulation results for applying the 2SRI estimator that uses
the observed treatment as the outcome of the first-stage regression to
scenarios with a time-varying and binary treatment variable (datasets
B4 ). Each estimate is calculated over 1000 replicate, random datasets
with identical α, β, θ.
True Hazard Ratio Average Hazard Ratio (Empirical Variance)
exp(β) exp(θ) α=ln(3) α=ln(5) α=ln(8)
2 3 17.806 (39.372) 11.525 (9.512) 9.145 (4.34)
2 2 15.661 (29.26) 10.016 (6.946) 8.063 (3.17)
2 1.5 14.051 (24.037) 9.555 (6.858) 7.576 (2.902)
2 0.66 12.688 (29.155) 8.383 (5.793) 6.98 (2.792)
2 0.5 12.73 (29.69) 8.56 (6.732) 7.269 (3.657)
2 0.33 14.717 (48.604) 10.104 (11.901) 8.381 (6.091)
1.5 3 15.487 (31.883) 9.747 (6.149) 7.693 (3.137)
1.5 2 12.39 (19.562) 8.204 (4.753) 6.504 (2.267)
1.5 1.5 11.421 (17.384) 7.362 (3.497) 6.052 (1.915)
1.5 0.66 9.893 (15.972) 6.757 (4.165) 5.759 (2.06)
1.5 0.5 10.471 (20.607) 6.996 (5.057) 6.044 (2.618)
1.5 0.33 12.516 (34.561) 8.458 (8.746) 7.251 (4.403)
0.66 3 9.686 (12.771) 6.131 (2.941) 4.861 (1.401)
0.66 2 7.549 (8.626) 4.902 (1.79) 3.95 (0.924)
0.66 1.5 6.419 (5.912) 4.188 (1.5) 3.405 (0.73)
0.66 0.66 5.323 (4.999) 3.787 (1.349) 3.217 (0.787)
0.66 0.5 6.221 (9.212) 4.254 (1.988) 3.555 (1.018)
0.66 0.33 7.979 (16.12) 5.51 (4.141) 4.711 (2.232)
0.5 3 8.184 (9.634) 5.284 (2.325) 4.272 (1.212)
0.5 2 6.458 (6.409) 4.078 (1.284) 3.347 (0.686)
0.5 1.5 5.331 (5.066) 3.455 (1.064) 3.005 (1.021)
0.5 0.66 4.381 (3.594) 3.159 (1.125) 2.7 (1.077)
0.5 0.5 4.945 (4.665) 3.479 (1.384) 3 (0.832)
0.5 0.33 6.596 (11.138) 4.563 (2.462) 4.017 (1.712)
Note: The average hazard ratio is computed as exp((1/1000) Σ_{i=1}^{1000} β̂_i), where β̂_i is the estimated β of the i-th simulated dataset within the set of 1000 simulated datasets generated with the same α, β, θ parameters.
In Table 11, similar to the case in Section 4.2.1, biases towards the null are observed, with magnitudes that are substantially lower than those observed when applying the naive Cox model (Table 4) to the same datasets. I note here that, when the effect of the unmeasured confounder is small (i.e., exp(θ) = 0.66 and exp(θ) = 1.5), the estimates from the 2SRI estimator become very close to the true hazard ratio.
Repeating the simulation with Xi (τi ), the observed treatment, as the outcome
of the first stage regression reveals rather extreme overestimation biases (Table 12).
The bias seen in this simulation far exceeds that of the naive Cox model (Table 4).
The results of the simulation studies in this section show that the 2SRI estimator may fail catastrophically when (1) the intended treatment, φ, is not available as the outcome of the first-stage regression, and (2) a large number of patients do not survive until the time of treatment delivery. In this case, the dependent variable of the first-stage regression is misspecified, resulting in residuals that are not a consistent estimator of the unmeasured confounder, U. In turn, the second-stage regression produces a biased estimator of β.
4.3.2 The MIV estimator
The previous section revealed some severe potential issues with the 2SRI estimator, due to problems with the first-stage regression. The MIV estimator described by MacKenzie et al. (2014) does not require a first-stage regression; thus, whether the intended treatment is observable does not affect the estimation of the average treatment effect. However, Section 4.2.2 showed empirically that the MIV estimator is inherently biased when the confounder acts multiplicatively. Given the potentially catastrophic failure of the 2SRI estimator when the treatment is time-varying, this section examines how the MIV estimator performs as a possible alternative IV method to the 2SRI estimator.
First, I will go through the analytical derivations of the MIV estimator to verify whether it retains all its properties when the treatment variable is time-varying; since the treatment varies with time, I have X(t) instead of X. Second, in the analytical derivation, I will assume that the effect of the confounder on the hazard is additive, as MacKenzie et al. (2014) did in their original paper. Thus, I have the following parameterization for the hazard of an individual patient:

λ(t) = λ0(t) exp(βX(t)) + h(U, t)    (32)

Based on the derivations of MacKenzie et al. (2014), I also apply the following
condition on the unknown confounder in order for the estimated β to be interpreted
as a log hazard ratio:
E[h(U, t)|T (x) ≥ t] = 0 (33)

MacKenzie et al. (2014) stated in their paper that the MIV estimator is motivated by the risk-set paradigm. Now, consider an event at time t and every observation still at risk, R(t), just before time t; the probability that the event happens to a particular observation k, p_k(t), is given by:

p_k(t) = [λ0(t) exp(βX_k(t)) + h(U_k, t)] / Σ_{j∈R(t)} [λ0(t) exp(βX_j(t)) + h(U_j, t)]

p_k(t) = [λ0(t) exp(βX_k(t)) + h(U_k, t)] / [Σ_{j∈R(t)} λ0(t) exp(βX_j(t)) + Σ_{j∈R(t)} h(U_j, t)]    (34)

47
Using the assumption E[h(U, t)|T(x) ≥ t] = 0 from Equation (33), I can argue that Σ_{j∈R(t)} h(U_j, t) is asymptotically negligible. Thus, I can write Equation (34) as:

p_k(t) ≈ [λ0(t) exp(βX_k(t)) + h(U_k, t)] / Σ_{j∈R(t)} λ0(t) exp(βX_j(t)) = [exp(βX_k(t)) + h(U_k, t)/λ0(t)] / Σ_{j∈R(t)} exp(βX_j(t))    (35)

From here, the expected value of W for the subject who has the event at time t, given those at risk just before time t, is Σ_{j∈R(t)} W_j p_j(t). Using the expression for p_k(t) from Equation (35), I can write Σ_{j∈R(t)} W_j p_j(t) approximately as:

Σ_{j∈R(t)} W_j p_j(t) ≈ Σ_{j∈R(t)} W_j [exp(βX_j(t)) + h(U_j, t)/λ0(t)] / Σ_{j∈R(t)} exp(βX_j(t))
= [Σ_{j∈R(t)} W_j exp(βX_j(t))] / [Σ_{j∈R(t)} exp(βX_j(t))] + [Σ_{j∈R(t)} W_j h(U_j, t)] / [λ0(t) Σ_{j∈R(t)} exp(βX_j(t))]    (36)

Here, I use the assumption E[h(U, t)|T(x) ≥ t] = 0 from Equation (33), together with the IV assumption that W and U are independent, to deduce that the numerator Σ_{j∈R(t)} W_j h(U_j, t) of the second term vanishes asymptotically. Thus, the expected value of W for the subject having the event at t, given all those still at risk just before time t, can be estimated as:

Ŵ = Σ_{j∈R(t)} W_j p_j(t) ≈ [Σ_{j∈R(t)} W_j exp(βX_j(t))] / [Σ_{j∈R(t)} exp(βX_j(t))]    (37)

Based on Equation (37), letting i index all patients in the dataset, and setting each Ŵ_i equal to the observed W_i for subjects who had an event during the follow-up at t_i, I can write the following estimating equation:

g(β) ≈ Σ_{i=1}^{n} Y_i [ W_i − (Σ_{j∈R(t_i)} W_j exp(βX_j(t_i))) / (Σ_{j∈R(t_i)} exp(βX_j(t_i))) ]    (38)

Setting Equation (38) to zero and solving for β gives a β̂ that should be a consistent estimator of β.
Thus, the arguments and derivations made by MacKenzie et al. (2014) remain analytically valid if the time-invariant X is replaced by a time-dependent variable X(t). The R code for this adapted estimator can be found in Appendix 1.
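To make the estimating-equation approach concrete, the following is a minimal, self-contained Python sketch of solving Equation (38) by bisection (the thesis' actual implementation is the R code in Appendix 1; the simulated data, parameter values, and the simplification to a time-invariant binary treatment with no censoring, so that every Y_i = 1, are illustrative assumptions):

```python
import math
import random

random.seed(1)
n = 300
# Hypothetical data: binary treatment X, instrument W = X + noise,
# exponential event times with hazard exp(beta_true * X + theta * U).
beta_true, theta = 0.7, 0.4
X = [1.0 if random.random() < 0.5 else 0.0 for _ in range(n)]
W = [x + random.gauss(0, 0.3) for x in X]
U = [random.gauss(0, 1) for _ in range(n)]
T = [random.expovariate(math.exp(beta_true * x + theta * u))
     for x, u in zip(X, U)]

# Sort by event time; with no censoring, every subject has an event, and
# the risk set at the i-th sorted event is everyone with index >= i.
order = sorted(range(n), key=lambda i: T[i])
Xs = [X[i] for i in order]
Ws = [W[i] for i in order]

def g(beta):
    """Estimating function (38): sum over events of W_i minus the
    exp(beta*X)-weighted mean of W over the corresponding risk set."""
    dens = [0.0] * n
    nums = [0.0] * n
    den = num = 0.0
    for i in range(n - 1, -1, -1):  # reverse cumulative sums over risk sets
        e = math.exp(beta * Xs[i])
        den += e
        num += Ws[i] * e
        dens[i] = den
        nums[i] = num
    return sum(Ws[i] - nums[i] / dens[i] for i in range(n))

# g is approximately decreasing in beta here, since W and X are positively
# correlated, so bisection on a sign-changing interval locates a root.
lo, hi = -4.0, 4.0
assert g(lo) > 0.0 > g(hi)
for _ in range(80):
    mid = 0.5 * (lo + hi)
    if g(mid) > 0.0:
        lo = mid
    else:
        hi = mid
beta_hat = 0.5 * (lo + hi)
print(round(beta_hat, 3))
```

Replacing `Xs[i]` with the value of the time-dependent covariate at the i-th event time, X_j(t_i), gives the time-varying version of the estimator without changing the structure of the computation.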
Again, I must reiterate that the additive confounding parameterization of the individual hazard, Equation (32), is not intuitive and is very hard to relate to a real-life scenario. Furthermore, MacKenzie et al. (2014) did not provide an example in their paper either. Therefore, I will only test the MIV estimator in the multiplicative confounding setting. Based on the results in Section 4.2.2 for the MIV estimator in the time-invariant, multiplicative confounding setting, I expect to observe a similar bias towards the null when applying the MIV estimator in a scenario where the treatment variable is time-varying. To verify this empirically, I apply the MIV estimator to the simulated datasets B3, where the treatment variable is time-varying and quantitative, and the simulated datasets B4, where the treatment variable is time-varying and binary.
Table 13: Simulation results for applying the MIV estimator to scenarios with a time-varying and quantitative treatment variable (datasets B3). Each estimate is calculated over 1000 replicate, random datasets with identical α, β, θ.
True Hazard Ratio Average Hazard Ratio (Empirical Variance)
exp(β) exp(θ) α=ln(3) α=ln(5) α=ln(8)
2 3 1.583 (0.028) 1.599 (0.013) 1.591 (0.013)
2 2 1.701 (0.029) 1.732 (0.016) 1.725 (0.014)
2 1.5 1.821 (0.034) 1.845 (0.015) 1.846 (0.015)
2 0.66 2.024 (0.032) 1.942 (0.012) 1.953 (0.012)
2 0.5 1.923 (0.027) 1.866 (0.013) 1.86 (0.012)
2 0.33 1.741 (0.024) 1.713 (0.008) 1.716 (0.008)
1.5 3 1.308 (0.016) 1.311 (0.007) 1.305 (0.007)
1.5 2 1.359 (0.015) 1.371 (0.007) 1.368 (0.007)
1.5 1.5 1.424 (0.016) 1.427 (0.008) 1.424 (0.007)
1.5 0.66 1.492 (0.015) 1.49 (0.007) 1.487 (0.007)
1.5 0.5 1.424 (0.013) 1.428 (0.006) 1.427 (0.006)
1.5 0.33 1.346 (0.012) 1.351 (0.006) 1.348 (0.005)
0.66 3 0.752 (0.004) 0.748 (0.002) 0.745 (0.002)
0.66 2 0.705 (0.004) 0.706 (0.002) 0.704 (0.002)
0.66 1.5 0.672 (0.004) 0.67 (0.002) 0.671 (0.002)
0.66 0.66 0.708 (0.005) 0.709 (0.003) 0.707 (0.003)
0.66 0.5 0.739 (0.006) 0.736 (0.003) 0.736 (0.003)
0.66 0.33 0.769 (0.007) 0.766 (0.004) 0.767 (0.004)
0.5 3 0.584 (0.003) 0.579 (0.002) 0.581 (0.002)
0.5 2 0.516 (0.003) 0.522 (0.002) 0.52 (0.002)
0.5 1.5 0.486 (0.003) 0.488 (0.002) 0.488 (0.002)
0.5 0.66 0.564 (0.005) 0.561 (0.003) 0.558 (0.003)
0.5 0.5 0.593 (0.006) 0.593 (0.003) 0.593 (0.004)
0.5 0.33 0.636 (0.007) 0.627 (0.004) 0.628 (0.005)
Note: The average hazard ratio is computed as exp((1/1000) Σ_{i=1}^{1000} β̂_i), where β̂_i is the estimated β of the i-th simulated dataset within the set of 1000 simulated datasets generated with the same α, β, θ parameters.
Table 14: Simulation results for applying the MIV estimator to scenarios with a time-varying and binary treatment variable (datasets B4). Each estimate is calculated over 1000 replicate, random datasets with identical α, β, θ.
True Hazard Ratio Average Hazard Ratio (Empirical Variance)
exp(β) exp(θ) α=ln(3) α=ln(5) α=ln(8)
2 3 1.752 (0.86) 1.714 (0.475) 1.675 (0.294)
2 2 1.872 (0.769) 1.813 (0.466) 1.821 (0.357)
2 1.5 1.919 (0.83) 1.961 (0.542) 1.942 (0.355)
2 0.66 2.006 (0.963) 2.078 (0.771) 2.031 (0.543)
2 0.5 1.945 (1.032) 1.927 (0.731) 1.973 (0.619)
2 0.33 1.76 (1.066) 1.832 (0.797) 1.837 (0.743)
1.5 3 1.439 (0.643) 1.417 (0.326) 1.37 (0.227)
1.5 2 1.483 (0.568) 1.472 (0.351) 1.438 (0.235)
1.5 1.5 1.624 (0.753) 1.536 (0.347) 1.518 (0.247)
1.5 0.66 1.647 (0.841) 1.651 (0.602) 1.682 (0.469)
1.5 0.5 1.583 (0.842) 1.583 (0.63) 1.605 (0.546)
1.5 0.33 1.532 (0.859) 1.508 (0.666) 1.575 (0.609)
0.66 3 0.88 (0.376) 0.798 (0.146) 0.778 (0.088)
0.66 2 0.847 (0.374) 0.799 (0.155) 0.759 (0.067)
0.66 1.5 0.835 (0.295) 0.757 (0.114) 0.73 (0.071)
0.66 0.66 0.792 (0.3) 0.768 (0.172) 0.751 (0.123)
0.66 0.5 0.884 (0.471) 0.812 (0.224) 0.777 (0.141)
0.66 0.33 0.924 (0.505) 0.879 (0.367) 0.833 (0.239)
0.5 3 0.731 (0.275) 0.653 (0.107) 0.66 (0.082)
0.5 2 0.652 (0.239) 0.612 (0.089) 0.605 (0.068)
0.5 1.5 0.628 (0.205) 0.569 (0.079) 0.574 (0.049)
0.5 0.66 0.629 (0.231) 0.591 (0.117) 0.542 (0.053)
0.5 0.5 0.672 (0.342) 0.626 (0.14) 0.589 (0.089)
0.5 0.33 0.745 (0.451) 0.674 (0.201) 0.636 (0.102)
Note: The average hazard ratio is computed as exp((1/1000) Σ_{i=1}^{1000} β̂_i), where β̂_i is the estimated β of the i-th simulated dataset within the set of 1000 simulated datasets generated with the same α, β, θ parameters.
Tables 13 and 14 show results similar to the simulations run in Section 4.2.2 (Tables 7, 8). When the correlation between the IV and the treatment is strong (i.e., α = ln(5) and α = ln(8)), the MIV estimator can reduce, but not eliminate, the estimation bias of the average treatment effect. When the correlation between the instrument and the treatment is weak, the MIV estimator can produce unstable estimates (e.g., when α = ln(3) and exp(θ) = 0.33).
4.4 Performance comparison
Based on the results of the simulation studies in the previous sections, I observed that, with an appropriate instrument, under the necessary conditions, and assuming the intended treatment φ is observable, both the 2SRI estimator and the MIV estimator can reduce the bias of the treatment effect estimate, but at the cost of an increase in variance. To assess whether this bias-variance trade-off is worthwhile when all the conditions for applying IV methods are met, I compare the root mean-squared error (RMSE) of the 2SRI estimator, the MIV estimator, and the naive Cox model applied to the subset of simulated datasets B3 and B4 with the parameter α = ln(5).
Table 15: Empirical SD and RMSE for β from IV methods and the naive Cox model applied to scenarios with a time-varying and quantitative treatment. Each estimate is calculated over 1000 replicate, random datasets with identical α, β, θ.
Parameter Values Empirical SD (RMSE)
exp(β) exp(θ) Naive Cox 2SRI MIV
2 3 0.039 (0.041) 0.043 (0.043) 0.073 (0.238)
2 2 0.037 (0.056) 0.038 (0.038) 0.072 (0.163)
2 1.5 0.035 (0.059) 0.035 (0.035) 0.066 (0.106)
2 0.66 0.032 (0.115) 0.035 (0.035) 0.055 (0.063)
2 0.5 0.034 (0.197) 0.037 (0.037) 0.061 (0.093)
2 0.33 0.036 (0.305) 0.039 (0.039) 0.051 (0.164)
1.5 3 0.035 (0.077) 0.043 (0.043) 0.063 (0.151)
1.5 2 0.034 (0.08) 0.037 (0.037) 0.063 (0.111)
1.5 1.5 0.033 (0.072) 0.035 (0.035) 0.061 (0.080)
1.5 0.66 0.032 (0.112) 0.035 (0.036) 0.056 (0.057)
1.5 0.5 0.032 (0.185) 0.036 (0.036) 0.054 (0.074)
1.5 0.33 0.035 (0.261) 0.042 (0.042) 0.058 (0.121)
0.66 3 0.033 (0.289) 0.046 (0.046) 0.062 (0.13)
0.66 2 0.032 (0.215) 0.041 (0.041) 0.064 (0.085)
0.66 1.5 0.035 (0.134) 0.041 (0.041) 0.071 (0.071)
0.66 0.66 0.042 (0.086) 0.044 (0.044) 0.073 (0.094)
0.66 0.5 0.042 (0.099) 0.048 (0.048) 0.074 (0.121)
0.66 0.33 0.044 (0.101) 0.055 (0.055) 0.078 (0.157)
0.5 3 0.040 (0.357) 0.054 (0.054) 0.082 (0.165)
0.5 2 0.042 (0.253) 0.052 (0.052) 0.093 (0.100)
0.5 1.5 0.047 (0.154) 0.055 (0.055) 0.102 (0.107)
0.5 0.66 0.054 (0.087) 0.056 (0.056) 0.099 (0.148)
0.5 0.5 0.056 (0.090) 0.058 (0.058) 0.097 (0.192)
0.5 0.33 0.061 (0.084) 0.068 (0.068) 0.107 (0.245)
Table 15 provides a side-by-side comparison of the empirical SD and RMSE of the three methods applied to the subset of simulated datasets B3 with α = ln(5). Here, the treatment variable is time-varying and quantitative. As expected, the empirical SD is inflated for the IV methods relative to the naive Cox model. An interesting observation is that the MIV estimator always yields a higher empirical SD than the 2SRI estimator. In terms of RMSE, the 2SRI estimator is, in general, better than both the naive Cox model and the MIV estimator, as it can eliminate the bias completely, while the MIV estimator outperforms the naive Cox model only in scenarios where the confounder's effect on the outcome is relatively small (i.e., when exp(θ) = 1.5 and exp(θ) = 0.66).
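The bias-variance trade-off summarized by the RMSE can be made explicit through the decomposition RMSE² = bias² + variance. A minimal Python sketch with invented replicate estimates (using the population-variance form, dividing by m, for which the decomposition is exact):

```python
import math

beta_true = math.log(2)  # true log hazard ratio, exp(beta) = 2
# Hypothetical replicate estimates from a biased but stable estimator
beta_hats = [0.60, 0.58, 0.63, 0.61, 0.59, 0.62]

m = len(beta_hats)
mean_b = sum(beta_hats) / m
bias = mean_b - beta_true
variance = sum((b - mean_b) ** 2 for b in beta_hats) / m  # population form
rmse = math.sqrt(sum((b - beta_true) ** 2 for b in beta_hats) / m)

# RMSE^2 decomposes exactly into squared bias plus variance
assert abs(rmse ** 2 - (bias ** 2 + variance)) < 1e-12
print(round(bias, 4), round(math.sqrt(variance), 4), round(rmse, 4))
```

This decomposition makes precise why a biased but stable estimator (the naive Cox model) can beat a less biased but more variable one (the IV estimators) on RMSE.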
Table 16: Empirical SD and RMSE for β from IV methods and
the naive Cox model applied to scenarios with a time-varying and
binary treatment. Each estimate is calculated over 1000 replicate,
random datasets with identical α, β, θ.
Parameter Values Empirical SD (RMSE)
exp(β) exp(θ) Naive Cox 2SRI MIV
2 3 0.116 (0.196) 0.148 (0.307) 0.571 (0.662)
2 2 0.106 (0.198) 0.135 (0.196) 0.597 (0.658)
2 1.5 0.108 (0.179) 0.135 (0.147) 0.58 (0.626)
2 0.66 0.105 (0.268) 0.134 (0.138) 0.597 (0.673)
2 0.5 0.11 (0.403) 0.139 (0.149) 0.594 (0.719)
2 0.33 0.115 (0.52) 0.145 (0.156) 0.628 (0.792)
1.5 3 0.104 (0.262) 0.142 (0.249) 0.584 (0.599)
1.5 2 0.107 (0.245) 0.134 (0.168) 0.601 (0.621)
1.5 1.5 0.108 (0.201) 0.138 (0.144) 0.571 (0.582)
1.5 0.66 0.105 (0.258) 0.134 (0.134) 0.593 (0.617)
1.5 0.5 0.107 (0.365) 0.137 (0.138) 0.637 (0.687)
1.5 0.33 0.116 (0.463) 0.146 (0.146) 0.643 (0.729)
0.66 3 0.109 (0.494) 0.15 (0.152) 0.586 (0.624)
0.66 2 0.107 (0.393) 0.148 (0.148) 0.563 (0.605)
0.66 1.5 0.109 (0.273) 0.146 (0.146) 0.58 (0.605)
0.66 0.66 0.119 (0.226) 0.144 (0.149) 0.629 (0.649)
0.66 0.5 0.121 (0.266) 0.146 (0.189) 0.653 (0.682)
0.66 0.33 0.129 (0.282) 0.157 (0.278) 0.675 (0.705)
0.5 3 0.113 (0.588) 0.152 (0.16) 0.571 (0.68)
0.5 2 0.114 (0.45) 0.148 (0.154) 0.557 (0.659)
0.5 1.5 0.117 (0.307) 0.149 (0.154) 0.565 (0.636)
0.5 0.66 0.13 (0.216) 0.159 (0.172) 0.636 (0.725)
0.5 0.5 0.128 (0.242) 0.159 (0.22) 0.676 (0.787)
0.5 0.33 0.135 (0.241) 0.159 (0.33) 0.669 (0.77)
Table 16 provides a side-by-side comparison of the empirical SD and RMSE of the three methods applied to the subset of the simulated datasets B4 with α = ln(5). Here, the treatment variable is time-varying and binary. Again, the empirical SD is larger for the IV methods. Furthermore, in this setting, the performance of the IV methods is also less convincing from the RMSE point of view. Although the 2SRI estimator showed a better RMSE in general, when the effect of the confounder is large (i.e., exp(θ) = 0.33 or exp(θ) = 3), the naive Cox model has a lower RMSE. The MIV estimator performs very poorly in terms of RMSE in this setting, as its RMSE is always high compared to both the naive Cox model and the 2SRI estimator.
5 Discussion
The first methodological contribution of this thesis is the verification of the validity
of applying the 2SRI estimator in the Cox model setting in the presence of an un-
measured confounder, and with time-invariant treatment that does not require using
a time-varying covariate. Results from Section 4.2.1 suggest the 2SRI estimator is
only consistent if the treatment variable is quantitative. When this condition is not
satisfied (i.e., when the treatment is binary), the 2SRI estimates are, in general, bi-
ased towards the null. Nevertheless, the magnitude of its bias is consistently smaller
than the bias observed when using the naive Cox model that ignores the unmeasured
confounder. When comparing the 2SRI and MIV estimators in a setting with a bi-
nary time-invariant treatment, assigned at time 0 and constant during the follow-up,
the simulation results in Sections 4.2.1 and 4.2.2 suggest the 2SRI estimator has an
edge, with slightly lower values of both the variance and the bias of the estimates.
The second methodological contribution of this thesis involves the implementation
of the IV estimators in the Cox model setting where the treatment variable may
change its value at most once during the follow-up. The results of the simulation
studies in Section 4.3.1 established that if the intended treatment is observable,
the 2SRI estimator is a consistent estimator of the causal effect of a quantitative
treatment variable, such as a dose of medication. In contrast, in the same setting,
the 2SRI estimator will reduce, but not eliminate the bias if the treatment variable is
binary. However, if the intended treatment is not observable, the 2SRI estimator may
fail catastrophically, due to misspecification of the regression model used at the first
stage of the analyses to predict treatment received by individual subjects. Further
simulation results suggest that, in the situations where the 2SRI estimator may fail
catastrophically, the MIV estimator, initially proposed by MacKenzie et al. (2014),
and adapted to the Cox model setting with a time-varying treatment, as described
in Sections 2.2 and 4.3.2 of this thesis, can be used as a more accurate alternative
IV method, and will likely reduce bias, relative to both (i) the naive Cox model and
(ii) the 2SRI estimator.
In practice, the results of this thesis raise caution regarding the application of the IV methodology in time-to-event analyses with a binary treatment variable in the presence of unmeasured confounding.
lation studies suggest that a consistent estimator of the average treatment effect is
not possible with the IV estimators presented in this thesis. Further, the variance
of the IV estimators becomes inflated even if the correlation between the instrument
and the treatment is high. Thus, applying the IV estimators presented in this thesis
may not be warranted, especially in studies where the number of the events is low,
as the bias-variance trade-off may favour the biased but much more stable naive Cox
model-based estimator, confirming previous simulation results reported by Ionescu-Ittu, Delaney, and Abrahamowicz (2009) in a simpler, linear risk-difference regression analysis of binary outcomes.
The results of simulations presented in this thesis, regarding the performance of
the 2SRI and the MIV estimators in Cox model regression analyses of the average
effects of time-invariant or time-varying (with a single change of state) treatment
effects, are only useful if a valid instrument exists. Identifying and validating an instrument, measured for all study subjects, that can be assumed to meet all standard IV assumptions is a hard problem in itself. Indeed, for example, Hernan and Robins
(2006) have raised concerns that, in many real-life applications, fundamental assump-
tions requiring that the IV must (i) not have an independent effect on the outcome
(i.e., one not mediated through its association with the observed treatment), and (ii)
not be associated, conditional on the observed treatment, with either measured or
unmeasured confounders, may be impossible to test. Thus, identifying appropriate
instrument(s) will be an important challenge for researchers who wish to consider
using the estimators evaluated in this thesis in their substantive research projects
(McMahon, 2003).
This thesis has several limitations. First, it relies solely on empirical results
of simulation studies, which as for all simulations reported in the statistical litera-
ture, can consider only a limited subset of clinically relevant scenarios and restricted
ranges of crucial design parameters. Therefore, it may be difficult to extrapolate my results to future real-life time-to-event analyses, especially those whose data structure differs substantially from the scenarios I considered. Second, my analysis and
simulation results did not address the issue of comparisons between the variances of
the alternative estimators with the same level of detail as the issue of their biases,
which was the main focus of this thesis. Indeed, while the IV methods are known to
reduce, or in the ideal case (with all assumptions strictly met) eliminate bias due
to unmeasured confounders, it remains to be explored what conditions are necessary
for the IV estimators to have an acceptable variance (Ionescu-Ittu et al., 2012; Abra-
hamowicz, Beauchamp, Ionescu-Ittu, Delaney, & Pilote, 2011; Brookhart, Rassen, &
Schneeweiss, 2010). Lastly, my time-varying simulations were limited to a relatively
simple scenario, where the (essentially time-varying) treatment is allowed to change
its value, for a given observation, only once during the follow-up. This scenario is
of practical interest, because a similar single-change assumption is often relied on
(sometimes implicitly), e.g., in recent studies that use marginal structural models
(MSM) to estimate causal effects of anti-retroviral therapies in HIV (Hernán, Brum-
back, & Robins, 2000; Young, Hernán, Picciotto, & Robins, 2009; Robins, Hernan,
& Brumback, 2000). Indeed, in the MSM literature, it is often implicitly assumed
that once the previously untreated subject starts treatment, he or she will remain
treated until the end of follow-up, implying only a single change in the current value
of the time-varying treatment/exposure variable. However, future research should
investigate the validity of using alternative IV estimators in a more complex time-
varying setting where the treatment/exposure status may change multiple times dur-
ing follow-up, e.g., some subjects may re-start the active treatment after previous
interruption(s) or may permanently discontinue a treatment initiated at some ear-
lier time during the follow-up (Xiao, Abrahamowicz, Moodie, Weber, & Young, 2014;
Dixon et al., 2012). Finally, similar to most other studies of unmeasured confounders
(Stürmer, Schneeweiss, Rothman, Avorn, & Glynn, 2007; Ionescu-Ittu et al., 2009),
my simulations were limited to a simplified case where only a single unmeasured
confounder introduces bias. Further research should evaluate the performance of al-
ternative methods in a more complex, and arguably more clinically plausible setting,
where there are several potential unmeasured confounders. On the other hand, the
IV methodology avoids the need to identify particular unmeasured confounders or
even specify their number (Brookhart et al., 2006), so the results of this thesis should
be reasonably robust with respect to increasing number of unmeasured confounders.
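The single-change time-varying treatment scenario discussed above is usually analyzed by recoding each subject's follow-up into the counting-process (start, stop] format expected by standard Cox model software such as survival::coxph. A minimal sketch, assuming each subject is untreated until an observed switch time (NA if never treated); the function and column names are illustrative:

```r
## Sketch: a binary treatment changing value at most once, expressed in the
## counting-process (start, stop] format used by survival::coxph. Each subject
## is untreated until switch_time (NA if never treated); names are illustrative.
make_counting_process <- function(id, follow_up, event, switch_time) {
  rows <- lapply(seq_along(id), function(i) {
    if (is.na(switch_time[i]) || switch_time[i] >= follow_up[i]) {
      # never treated during follow-up: a single untreated interval
      data.frame(id = id[i], start = 0, stop = follow_up[i],
                 status = event[i], treated = 0)
    } else {
      # one untreated interval, then one treated interval until end of follow-up
      data.frame(id = id[i],
                 start  = c(0, switch_time[i]),
                 stop   = c(switch_time[i], follow_up[i]),
                 status = c(0, event[i]),
                 treated = c(0, 1))
    }
  })
  do.call(rbind, rows)
}
```

The resulting data frame can then be passed to a model of the form coxph(Surv(start, stop, status) ~ treated, ...).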
The results of this thesis suggest several directions for future methodological research and further developments in the challenging but practically important field of IV analyses of right-censored time-to-event data. First, the variances of the 2SRI and MIV estimators are not well understood in the Cox model
setting. Some authors suggest using bootstrapping to obtain robust estimates of
variance and confidence intervals for the average treatment effect, while accounting
for variance inflation due to a multiple-stage analysis (MacKenzie et al., 2014; Terza,
Basu, & Rathouz, 2008). However, analytical results could provide more insight
into the conditions where the IV estimators will be expected to yield better overall
bias-variance trade-off (in terms of lower root mean-squared error) than a simpler,
biased but likely much more stable naive Cox model-based estimator. Another direc-
tion for possible future work would be to extend the IV methods to the competing
risks setting (Fine & Gray, 1999). Also, further research should investigate methods
for model diagnostics in this complex setting. Finally, the results of this thesis show
that when the treatment is binary, both the 2SRI and MIV estimators are system-
atically biased towards the null. Thus, future statistical research should focus on
adapting these IV estimators to the analyses of the effects of binary time-varying
treatments/exposures, in order to circumvent this problem.
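The bootstrap approach to variance estimation mentioned above, which resamples whole subjects so that every analysis stage is repeated in each replicate, can be sketched generically. The function name, the `estimator` argument, and the defaults below are illustrative, not part of any cited implementation:

```r
## Generic nonparametric bootstrap for the standard error and percentile CI of a
## multi-stage IV estimate; 'estimator' must re-run the full (all-stage) analysis
## on each resampled data set. All names and defaults are illustrative.
bootstrap_se <- function(data, estimator, B = 200, seed = 123) {
  set.seed(seed)
  est <- replicate(B, {
    idx <- sample(nrow(data), replace = TRUE)   # resample whole subjects
    estimator(data[idx, , drop = FALSE])        # repeat every analysis stage
  })
  c(se = sd(est),
    lower = quantile(est, 0.025, names = FALSE),  # percentile 95% CI
    upper = quantile(est, 0.975, names = FALSE))
}
```

Resampling subjects, rather than bootstrapping only the second stage, is what lets the interval reflect the variance inflation due to the first-stage estimation.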
In conclusion, this thesis establishes the conditions necessary for the validity of
the 2SRI estimator and the MIV estimator in the Cox model settings, with either a
time-invariant or simple time-varying treatment/exposure variable. When an appro-
priate instrument can be identified and the treatment variable is quantitative, the
IV estimators, 2SRI and MIV, can be used to reduce the bias due to unmeasured
confounders, and in the ideal case, eliminate the bias completely. However, further
research is necessary to assess more systematically and compare these models in a
wider range of simulated scenarios and in real-life applications of the Cox model to
time-to-event analyses of cohort studies.

Appendix 1

R code for the MIV estimator adapted for the Cox regression model with time-varying
treatment variable that changes at most once during the follow-up.

library(rootSolve)  # provides multiroot() for solving the estimating equation

riskSetRemove <- function(x, setorder) {
  ## For each observation, sum the values of x over the observations that must
  ## be removed from its risk set because their follow-up had not yet started
  ## (delayed entry); setorder[[i]] holds the indices to remove for observation i
  b <- numeric(length(x))
  for (i in seq_along(x)) {
    b[i] <- sum(x[setorder[[i]]])
  }
  return(b)
}

IVCox <- function(Time1, Time2, Status, X, W, start = 0) {
  # number of observations
  n <- length(Time2)
  # sort observations by decreasing end of follow-up
  ord <- order(-Time2)
  Time2 <- Time2[ord]
  Status <- Status[ord]  # event indicator
  X <- X[ord]            # treatment
  W <- W[ord]            # instrumental variable
  # start of follow-up; not necessarily zero because of the
  # time-dependent covariate (delayed entry)
  Time1 <- Time1[ord]

  ### build the risk-set adjustment list ###
  # setorder[[i]] stores the indices of observations sorted before i (i.e.,
  # with a longer follow-up) whose own follow-up had not yet started at
  # time Time2[i]; they must be excluded from the risk set at that time
  setorder <- vector("list", n)
  for (i in seq_len(n)) {
    setorder[[i]] <- if (i > 1) which(Time1[seq_len(i - 1)] > Time2[i]) else integer(0)
  }

  # the estimating equation in beta
  Est.Equat <- function(beta) {
    HR <- exp(beta * X)
    # adjust the cumulative risk-set sums by removing observations
    # whose follow-up had not yet started
    S.0 <- cumsum(HR) - riskSetRemove(HR, setorder)
    S.W1 <- cumsum(W * HR) - riskSetRemove(W * HR, setorder)
    # observations contribute to the partial likelihood only if an event occurred
    sum(Status * (W - S.W1 / S.0))
  }

  # find the root of the estimating equation
  out.solution <- multiroot(Est.Equat, start = start)

  ## output ##
  beta.hat <- ifelse(out.solution$estim.precis < 0.00001, out.solution$root, NA)
  list(Est.log.HR = beta.hat)
}
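As an illustrative check, the estimating equation solved by IVCox reduces, in the special case where follow-up starts at zero for every subject (so no risk-set adjustment is needed), to a version that requires only base R. The simplified function below is hypothetical and its names are illustrative; on such data it solves the same equation as IVCox:

```r
## Self-contained check of the estimating equation solved by IVCox() in the
## special case with no delayed entry (all Time1 = 0), so no risk-set
## adjustment; uses base-R uniroot() instead of rootSolve::multiroot().
miv_simple <- function(Time2, Status, X, W) {
  ord <- order(-Time2)                 # sort by decreasing end of follow-up
  Time2 <- Time2[ord]; Status <- Status[ord]
  X <- X[ord]; W <- W[ord]
  ee <- function(beta) {               # the estimating equation in beta
    HR <- exp(beta * X)
    S0 <- cumsum(HR)                   # risk-set sums (cumulative after sorting)
    SW <- cumsum(W * HR)
    sum(Status * (W - SW / S0))
  }
  uniroot(ee, interval = c(-5, 5), extendInt = "yes")$root
}
```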

References

Abdollah, F., Sun, M., Schmitges, J., Thuret, R., Tian, Z., Shariat, S. F., et al.
(2012). Competing-risks mortality after radiotherapy vs. observation for lo-
calized prostate cancer: A population-based study. International Journal of
Radiation Oncology*Biology*Physics, 84 (1), 95 - 103.
Abrahamowicz, M., Bartlett, G., Tamblyn, R., & Berger, R. du. (2006). Modeling
cumulative dose and exposure duration provided insights regarding the associ-
ations between benzodiazepines and injuries. Journal of Clinical Epidemiology,
59 (4), 393 - 403.
Abrahamowicz, M., Beauchamp, M.-E., Ionescu-Ittu, R., Delaney, J. A. C.,
& Pilote, L. (2011). Reducing the variance of the prescribing
preference-based instrumental variable estimates of the treatment effect.
American Journal of Epidemiology, 174 (4), 494-502. Available from
http://aje.oxfordjournals.org/content/174/4/494.abstract
Abrahamowicz, M., Beauchamp, M.-E., & Sylvestre, M.-P. (2012). Com-
parison of alternative models for linking drug exposure with adverse ef-
fects. Statistics in Medicine, 31 (11-12), 1014–1030. Available from
http://dx.doi.org/10.1002/sim.4343
Albertsen, P., Hanley, J., & Fine, J. (2005). 20-year outcomes following conservative
management of clinically localized prostate cancer. JAMA, 293 (17), 2095-2101.
Available from http://dx.doi.org/10.1001/jama.293.17.2095
Austin, P. C. (2011, June). An introduction to propensity score
methods for reducing the effects of confounding in observational stud-
ies. Multivariate Behavioral Research, 46 (3), 399–424. Available from
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3144483/
Austin, P. C. (2012). Generating survival times to simulate Cox proportional hazards
models with time-varying covariates. Statistics in Medicine, 31 (29), 3946–3958.
Available from http://dx.doi.org/10.1002/sim.5452
Avorn, J. (2007). Keeping science on top in drug evaluation. New
England Journal of Medicine, 357 (7), 633-635. Available from
http://dx.doi.org/10.1056/NEJMp078134 (PMID: 17699813)
Beyersmann, J., Wolkewitz, M., & Schumacher, M. (2008). The impact of time-
dependent bias in proportional hazards modelling. Statistics in Medicine,
27 (30), 6439–6454. Available from http://dx.doi.org/10.1002/sim.3437
Bhatt, A. (2011). Quality of clinical trials: A moving target.
Perspectives in Clinical Research, 2 (4), 124–128. Available from
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3227329/
Black, N. (1996). Why we need observational studies to evaluate the effectiveness of
health care. BMJ: British Medical Journal , 312 (7040), 1215.
Brookhart, M. A., Rassen, J. A., & Schneeweiss, S. (2010). Instrumen-
tal variable methods in comparative safety and effectiveness research.
Pharmacoepidemiology and Drug Safety, 19 (6), 537–554. Available from
http://dx.doi.org/10.1002/pds.1908
Brookhart, M. A., Wang, P., Solomon, D. H., & Schneeweiss, S. (2006). Evalu-
ating short-term drug effects using a physician-specific prescribing preference
as an instrumental variable. Epidemiology, 17 (3), 268-275. Available from
http://www.jstor.org/stable/20486213
Chen, Y., & Briesacher, B. A. (2011, 06). Use of instrumental variable
in prescription drug research with observational data: A systematic re-
view. Journal of clinical epidemiology, 64 (6), 687–700. Available from
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3079803/
Concato, J., Shah, N., & Horwitz, R. I. (2000). Randomized, con-
trolled trials, observational studies, and the hierarchy of research designs.
New England Journal of Medicine, 342 (25), 1887-1892. Available from
http://dx.doi.org/10.1056/NEJM200006223422507 (PMID: 10861325)
Cox, D. R. (1972). Regression models and life-tables. Journal of the Royal Sta-
tistical Society. Series B (Methodological), 34 (2), 187-220. Available from
http://www.jstor.org/stable/2985181
Dixon, W. G., Abrahamowicz, M., Beauchamp, M.-E., Ray, D. W., Bernatsky, S.,
Suissa, S., et al. (2012). Immediate and delayed impact of oral glucocorticoid
therapy on risk of serious infection in older patients with rheumatoid arthritis:
a nested case–control analysis. Annals of the Rheumatic Diseases, 71 (7), 1128-
1133. Available from http://ard.bmj.com/content/71/7/1128.abstract
Eggener, S. E., Scardino, P. T., Walsh, P. C., Han, M., Partin, A. W., Trock,
B. J., et al. (2010, 2016/08/10). Predicting 15-year prostate cancer specific
mortality after radical prostatectomy. The Journal of Urology, 185 (3), 869–
875. Available from http://dx.doi.org/10.1016/j.juro.2010.10.057
Emanuel, E. J., Schnipper, L. E., Kamin, D. Y., Levinson, J., &
Lichter, A. S. (2003). The costs of conducting clinical research.
Journal of Clinical Oncology, 21 (22), 4145-4150. Available from
http://jco.ascopubs.org/content/21/22/4145.abstract
Fisher, L. D., & Lin, D. Y. (1999). Time-dependent covariates in the Cox
proportional-hazards regression model. Annual Review of Public Health,
20 (1), 145-157. Available from
http://dx.doi.org/10.1146/annurev.publhealth.20.1.145 (PMID:
10352854)
Freedman, D. A. (2006). On the so-called “Huber sandwich estimator” and “robust
standard errors”. The American Statistician, 60 (4), 299-302. Available from
http://www.jstor.org/stable/27643806
Gore, J. L., Litwin, M. S., Lai, J., Yano, E. M., Madison, R., Setodji, C., et al.
(2010). Use of radical cystectomy for patients with invasive bladder cancer.
Journal of the National Cancer Institute, 102 (11), 802-811. Available from
http://jnci.oxfordjournals.org/content/102/11/802.abstract
Greenland, S. (2000). An introduction to instrumental variables for epidemiologists.
International Journal of Epidemiology, 29 (4), 722–729.
Greenland, S., Robins, J. M., & Pearl, J. (1999, 02). Confounding and col-
lapsibility in causal inference. Statist. Sci., 14 (1), 29–46. Available from
http://dx.doi.org/10.1214/ss/1009211805
Groenwold, R. H. H., Nelson, D. B., Nichol, K. L., Hoes, A. W.,
& Hak, E. (2010). Sensitivity analyses to estimate the poten-
tial impact of unmeasured confounding in causal research. Inter-
national Journal of Epidemiology, 39 (1), 107-117. Available from
http://ije.oxfordjournals.org/content/39/1/107.abstract
Hernán, M. Á., Brumback, B., & Robins, J. M. (2000). Marginal structural models
to estimate the causal effect of zidovudine on the survival of HIV-positive men.
Epidemiology, 11 (5).
Hernan, M. A., & Robins, J. M. (2006). Instruments for causal inference: An
epidemiologist’s dream? Epidemiology, 17 (4), 360–372.
Ionescu-Ittu, R., Abrahamowicz, M., & Pilote, L. (2012). Treatment effect estimates
varied depending on the definition of the provider prescribing preference-based
instrumental variables. Journal of Clinical Epidemiology, 65 (2), 155 - 162.
Ionescu-Ittu, R., Delaney, J. A., & Abrahamowicz, M. (2009). Bias–variance trade-
off in pharmacoepidemiological studies using physician-preference-based instru-
mental variables: a simulation study. Pharmacoepidemiology and Drug Safety,
18 (7), 562–571. Available from http://dx.doi.org/10.1002/pds.1757
Jadad, A. R., Moore, R., Carroll, D., Jenkinson, C., Reynolds, D. M., Gavaghan,
D. J., et al. (1996). Assessing the quality of reports of randomized clinical
trials: Is blinding necessary? Controlled Clinical Trials, 17 (1), 1 - 12.
Fine, J. P., & Gray, R. J. (1999). A proportional hazards model for the subdistribution
of a competing risk. Journal of the American Statistical Association, 94 (446),
496-509. Available from http://www.jstor.org/stable/2670170
Jeldres, C., Suardi, N., Walz, J., Saad, F., Hutterer, G. C., Bhojani, N., et al.
(2008). Poor overall survival in septa- and octogenarian patients after radical
prostatectomy and radiotherapy for prostate cancer: A population-based study
of 6183 men. European Urology, 54 (1), 107 - 117.
Jüni, P., Altman, D. G., & Egger, M. (2001). Assessing the quality
of controlled clinical trials. BMJ , 323 (7303), 42–46. Available from
http://www.bmj.com/content/323/7303/42
Karp, I., Abrahamowicz, M., Bartlett, G., & Pilote, L. (2004). Updated risk factor
values and the ability of the multivariable risk score to predict coronary heart
disease. American Journal of Epidemiology, 160 (7), 707-716. Available from
http://aje.oxfordjournals.org/content/160/7/707.abstract
Kuo, Y.-F., Montie, J. E., & Shahinian, V. B. (2012). Reducing bias in the assessment
of treatment effectiveness: Androgen deprivation therapy for prostate cancer.
Medical Care, 50 (5).
Lu-Yao, G., Albertsen, P., Moore, D., et al. (2008). Survival following
primary androgen deprivation therapy among men with localized
prostate cancer. JAMA, 300 (2), 173-181. Available from
http://dx.doi.org/10.1001/jama.300.2.173
MacKenzie, T. A., Tosteson, T. D., Morden, N. E., Stukel, T. A., & O’Malley,
A. J. (2014, June). Using instrumental variables to estimate a Cox’s proportional
hazards regression subject to additive confounding. Health Services &
Outcomes Research Methodology, 14 (1-2), 54–68. Available from
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4261749/
McCandless, L. C., Gustafson, P., & Levy, A. (2007). Bayesian sensitivity analysis
for unmeasured confounding in observational studies. Statistics in Medicine,
26 (11), 2331–2347. Available from http://dx.doi.org/10.1002/sim.2711
McMahon, A. D. (2003). Approaches to combat with confounding by
indication in observational studies of intended drug effects. Phar-
macoepidemiology and Drug Safety, 12 (7), 551–558. Available from
http://dx.doi.org/10.1002/pds.883
Moura, C. S., Abrahamowicz, M., Beauchamp, M.-E., Lacaille, D., Wang, Y.,
Boire, G., et al. (2015). Early medication use in new-onset rheuma-
toid arthritis may delay joint replacement: results of a large population-
based study. Arthritis Research & Therapy, 17 (1), 1–9. Available from
http://dx.doi.org/10.1186/s13075-015-0713-3
Newey, W. K., & McFadden, D. (1986, January). Large sample estimation
and hypothesis testing. In R. F. Engle & D. McFadden (Eds.), Hand-
book of Econometrics (Vol. 4, p. 2111-2245). Elsevier. Available from
https://ideas.repec.org/h/eee/ecochp/4-36.html
Robins, J., Hernan, M., & Brumback, B. (2000). Marginal structural models and
causal inference in epidemiology. Epidemiology, 11 (5), 550–560.
Rothwell, P. M. (2005, 08). External validity of randomised controlled trials:
“To whom do the results of this trial apply?”. The Lancet, 365 (9453),
82–93. Available from
http://dx.doi.org/10.1016/S0140-6736(04)17670-8
Sertkaya, A., Wong, H.-H., Jessup, A., & Beleche, T. (2016). Key cost drivers of
pharmaceutical clinical trials in the United States. Clinical Trials, 13 (2), 117-126.
Available from http://ctj.sagepub.com/content/13/2/117.abstract
Stukel, T., Fisher, E., Wennberg, D., Alter, D., Gottlieb, D., & Vermeulen, M.
(2007). Analysis of observational studies in the presence of treatment selection
bias: Effects of invasive cardiac management on ami survival using propensity
score and instrumental variable methods. JAMA, 297 (3), 278-285. Available
from http://dx.doi.org/10.1001/jama.297.3.278
Stürmer, T., Schneeweiss, S., Rothman, K. J., Avorn, J., & Glynn, R. J.
(2007). Performance of propensity score calibration—a simulation study.
American Journal of Epidemiology, 165 (10), 1110-1118. Available from
http://aje.oxfordjournals.org/content/165/10/1110.abstract
Suissa, S. (2008). Immortal time bias in pharmacoepidemiology. Amer-
ican Journal of Epidemiology, 167 (4), 492-499. Available from
http://aje.oxfordjournals.org/content/167/4/492.abstract
Sun, M., Sammon, J. D., Becker, A., Roghmann, F., Tian, Z., Kim, S. P., et
al. (2014). Radical prostatectomy vs radiotherapy vs observation among
older patients with clinically localized prostate cancer: a comparative effec-
tiveness evaluation. BJU International , 113 (2), 200–208. Available from
http://dx.doi.org/10.1111/bju.12321
Tan, H., Norton, E., Ye, Z., Hafez, K., Gore, J., & Miller, D. (2012). Long-
term survival following partial vs radical nephrectomy among older patients
with early-stage kidney cancer. JAMA, 307 (15), 1629-1635. Available from
http://dx.doi.org/10.1001/jama.2012.475
Terza, J. V., Basu, A., & Rathouz, P. J. (2008). Two-stage residual inclusion
estimation: Addressing endogeneity in health econometric modeling. Journal
of Health Economics, 27 (3), 531 - 543.
Terza, J. V., Bradford, W. D., & Dismuke, C. E. (2008, June). The use of linear in-
strumental variables methods in health services research and health economics:
A cautionary note. Health Services Research, 43 (3), 1102–1120. Available from
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2442231/
Thanassoulis, G., & O’Donnell, C. (2009). Mendelian randomization: Nature’s
randomized trial in the post genome era. JAMA, 301 (22), 2386-2388. Available
from http://dx.doi.org/10.1001/jama.2009.812
Vickers, A. J., & Sjoberg, D. D. (2015). Guidelines for reporting of statistics in
european urology. European Urology, 67 (2), 181 - 187.
Walraven, C. van, Davis, D., Forster, A. J., & Wells, G. A. (2004). Time-dependent
bias was common in survival analyses published in leading clinical journals.
Journal of Clinical Epidemiology, 57 (7), 672 - 682.
Wilt, T. J., Brawer, M. K., Jones, K. M., Barry, M. J., Aronson, W. J., Fox, S.,
et al. (2012). Radical prostatectomy versus observation for localized prostate
cancer. New England Journal of Medicine, 367 (3), 203-213. Available from
http://dx.doi.org/10.1056/NEJMoa1113162 (PMID: 22808955)
Xiao, Y., Abrahamowicz, M., Moodie, E. E. M., Weber, R., & Young, J. (2014).
Flexible marginal structural models for estimating the cumulative effect of
a time-dependent treatment on the hazard: Reassessing the cardiovascular
risks of didanosine treatment in the Swiss HIV Cohort Study. Journal
of the American Statistical Association, 109 (506), 455-464. Available from
http://dx.doi.org/10.1080/01621459.2013.87260
Young, J. G., Hernán, M. A., Picciotto, S., & Robins, J. M. (2009). Relation
between three classes of structural models for the effect of a time-varying ex-
posure on survival. Lifetime Data Analysis, 16 (1), 71–84. Available from
http://dx.doi.org/10.1007/s10985-009-9135-3
Zhou, Z., Rahme, E., Abrahamowicz, M., & Pilote, L. (2005). Sur-
vival bias associated with time-to-treatment initiation in drug ef-
fectiveness evaluation: A comparison of methods. American
Journal of Epidemiology, 162 (10), 1016-1023. Available from
http://aje.oxfordjournals.org/content/162/10/1016.abstract