An Introduction to Dynamic Treatment Regimes

Marie Davidian

Department of Statistics
North Carolina State University
http://www4.stat.ncsu.edu/davidian

1/64 Dynamic Treatment Regimes Webinar


Outline

• What is a dynamic treatment regime, and why study them?


• Clinical trials to study dynamic treatment regimes
• Thinking in terms of dynamic treatment regimes
• Constructing dynamic treatment regimes
• Discussion


Hot topic

Personalized Medicine

Source of graphic: http://www.personalizedmedicine.com/

A perspective on personalized medicine

Clinical practice: Clinicians make (a series of) treatment
decision(s) over the course of a patient’s disease or disorder
• Key decision points in the disease process
• Fixed schedule, milestone in the disease process, event
necessitating a decision
• Several treatment options at each decision point
• Accruing information on the patient
• “Personalize” treatment to the patient

That is: Treatment in practice involves sequential
decision-making based on accruing information
• Suggests thinking about and studying treatment from this
perspective...

Clinical decision-making

How are these decisions made?


• Clinical judgment
• Practice guidelines based on study results, expert opinion
• Synthesize all information on a patient up to the point of
the decision to determine the next treatment action

Can clinical decision-making be formalized and made
“evidence-based”?


Dynamic treatment regime

Dynamic treatment regime:


• A set of sequential decision rules, each corresponding to a
key decision point
• Each rule dictates the treatment to be given from among
the available options based on the accrued information on
the patient to that point
• Taken together, the rules define an algorithm for making
treatment decisions
• Dynamic because the treatment action can vary depending
on the accrued information
• Ideally, provides an “evidence-based” approach to
personalized treatment


Treatment regime

Terminology/Convention:
• Often, treatment regime is used to refer generally to any
approach to deciding on treatment
• And dynamic treatment regime is reserved for the case
where patient information is used
• We will use these terms interchangeably

In fact: Many common situations can be cast as involving
(dynamic) treatment regimes


ADHD therapy

Sequential (scheduled) decision points


• Decision 1: Low-dose therapy – 2 options: medication or
behavior modification
• Subsequent monthly decisions:
  ◦ Responders – Continue initial therapy
  ◦ Non-responders – 2 options: add the other therapy or
    increase dose of current therapy
• Objective: Improved end-of-school-year performance

Example from Susan Murphy, University of Michigan


Cancer treatment

Two (milestone) decision points:


• Decision 1: Induction chemotherapy (options C1, C2)
• Decision 2:
  ◦ Maintenance treatment for patients who respond
    (options M1, M2)
  ◦ Salvage chemotherapy for those who don’t respond
    (options S1, S2)
• Objective: Maximize survival time

Possible treatment regimes
Possible rules at Decision 1:
• “Give C1” (non-dynamic)
• “If age < 50, progesterone receptor level < 10 fmol, and
RAD51 mutation, then give C1; else, give C2”
• “If patient is a Libra, Scorpio, or Sagittarius, give C1;
else, give C2”

Possible rules at Decision 2:
• “If patient responds, give maintenance M1; if does not
respond, give salvage S1” (dynamic)
• “If patient responds: if age < 60, CEA > 10 ng/mL, and
progesterone receptor level < 8 fmol, give M1; else, give
M2. If patient does not respond: if age > 65, P53 mutation,
and CA 15-3 > 25 units/mL, give S1; else, give S2”
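
As a rough illustration, the rules above can be written as functions that map patient information to a treatment. This is only a sketch: the `Patient` container, its field names, and the two example patients are assumptions made for the illustration, not part of the talk.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """Hypothetical container for the information available at Decision 1."""
    age: float
    pr_level: float        # progesterone receptor level, fmol
    rad51_mutation: bool


def rule1_static(patient: Patient) -> str:
    """Non-dynamic rule: ignores patient information and always gives C1."""
    return "C1"


def rule1_dynamic(patient: Patient) -> str:
    """Dynamic rule: C1 if all three conditions hold, otherwise C2."""
    if patient.age < 50 and patient.pr_level < 10 and patient.rad51_mutation:
        return "C1"
    return "C2"


def rule2_simple(responded: bool) -> str:
    """Decision 2 rule: maintenance M1 for responders, salvage S1 otherwise."""
    return "M1" if responded else "S1"


def apply_regime(patient: Patient, responded: bool) -> tuple[str, str]:
    """Apply the two rules in sequence; taken together they form one regime."""
    return rule1_dynamic(patient), rule2_simple(responded)


# Two made-up patients, for illustration only.
young = Patient(age=45, pr_level=8.0, rad51_mutation=True)
older = Patient(age=62, pr_level=8.0, rad51_mutation=True)
print(apply_regime(young, responded=True))   # ('C1', 'M1')
print(apply_regime(older, responded=False))  # ('C2', 'S1')
```

The regime is the pair of functions, not the printed treatment sequences; different patients can receive different treatments under the same regime.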


Possible treatment regimes

Result: Rules, and thus regimes, can be simple or complex
(or not realistic)
• More complex rules involve more “personalization” and
more closely mimic clinical practice
• There is an infinitude of possible rules at each decision
point, and thus an infinitude of possible regimes
• Ultimate goal: Find the “best” or “optimal” regime

Regimes of interest and “optimal” depend on the question
• For definiteness, assume larger outcomes are preferred

Classes of treatment regimes
1. Classical treatment comparison:
• Focus on a single decision point
• Cancer example: Decision 1
• Two regimes of interest: “Give C1” vs. “Give C2”
• Class of regimes of interest is D = {“Give C1”, “Give C2”}
• Usual question: “If all patients in the population were to be
given C1, would mean outcome (mean survival time) be
different from (better than) that if all patients in the
population were to be given C2?”
• Optimal regime in D: The regime such that, if all patients in
the population were to receive treatment according to it,
mean outcome would be the largest among all regimes in
D (here, “Give C1” or “Give C2”)

Classes of treatment regimes
2. Which is the “best” treatment sequence?
• Multiple decision points
• Cancer example: Eight dynamic regimes of interest:
1. Give C1 followed by (M1 if response, S1 if no response)
2. Give C1 followed by (M1 if response, S2 if no response)
3. Give C1 followed by (M2 if response, S1 if no response)
4. Give C1 followed by (M2 if response, S2 if no response)
5. Give C2 followed by (M1 if response, S1 if no response)
6. Give C2 followed by (M1 if response, S2 if no response)
7. Give C2 followed by (M2 if response, S1 if no response)
8. Give C2 followed by (M2 if response, S2 if no response)
• Class D of interest contains these 8 regimes
• Question: Comparison of mean outcomes if all patients in
the population were to follow each regime
• Optimal regime in D: The regime such that, if all patients
were to receive treatment according to it, mean outcome
would be the largest among all regimes in D
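
The eight regimes and the notion of "optimal in D" can be sketched in a few lines. The mean outcomes below are made up purely so that the `max` step has something to work on; they are not estimates from any study, and the tuple encoding of a regime is an assumption of this sketch.

```python
from itertools import product

# Each regime is encoded as (induction, maintenance if response, salvage if not).
regimes = list(product(["C1", "C2"], ["M1", "M2"], ["S1", "S2"]))


def describe(regime):
    """Render a regime in the wording used on the slide."""
    c, m, s = regime
    return f"Give {c} followed by ({m} if response, {s} if no response)"


# Made-up mean outcomes, for illustration only.
hypothetical_means = {r: 10.0 + i for i, r in enumerate(regimes)}

# The optimal regime in D is the one with the largest mean outcome.
optimal = max(regimes, key=lambda r: hypothetical_means[r])

for i, r in enumerate(regimes, start=1):
    print(i, describe(r))
print("Optimal in D (under the made-up means):", describe(optimal))
```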

Classes of treatment regimes
3. “Best” dynamic regime in a “feasible class?”
• Single or multiple decision points
• Cancer example: Decision 1
• X1 = (lots of) patient information available at Decision 1
• In a resource-limited setting, interested in rules depending
on a subset of X1 that is routinely collected, e.g., of the form
“If age < η1 and PR < η2, give C2; else give C1”
(PR = progesterone receptor level)
• Class D of interest consists of all regimes of this form
(so for all values of η1 and η2)
• Optimal regime in D: The regime defined by values $\eta_1^{\text{opt}}$,
$\eta_2^{\text{opt}}$ such that, if all patients in the population were to
receive treatment according to it, mean outcome would be
the largest among all regimes in D

Classes of treatment regimes
4. “Optimal” overall dynamic treatment regime:
• Single or multiple decision points
• Cancer example: Two decision points
• X1 = patient information available at Decision 1, X2 =
additional information collected between Decisions 1 and 2
• Accrued information at each decision (A1 = treatment given
at Decision 1)
  Decision 1: H1 = X1
  Decision 2: H2 = {X1, A1, X2}
• Class D of interest: All possible sets of rules
{d1(H1), d2(H2)}
• Each rule takes as input the accrued information and
outputs a treatment from among the available options
• Optimal regime in D: $\{d_1^{\text{opt}}(H_1), d_2^{\text{opt}}(H_2)\}$ such that, if all
patients were to receive treatment according to it, mean
outcome would be the largest among all regimes in D

Classes of treatment regimes
In all of Cases 1–4: A set of rules at each of K decision points,
K = 1 or 2, depending on accrued information
  Decision 1: H1 = X1
  Decision 2: H2 = {X1, A1, X2}
Dynamic treatment regime
  d = d1(H1) or d = {d1(H1), d2(H2)}
• Case 1: K = 1, rules of form (simple)
  d1(H1) = Cj for all H1, j = 1, 2
• Case 2: K = 2, rules of form (simple)
  d1(H1) = Cj for all H1, j = 1, 2
  X2 contains response status
  d2(H2) = Mk if response, Sℓ if no response, k, ℓ = 1, 2

• Case 3: K = 1, code {C1, C2} = {0, 1}, rules of form
  $d_1(H_1) = I(\text{age} < \eta_1, \text{PR} < \eta_2)$
• Case 4: K = 2, general rules {d1(H1), d2(H2)}; e.g., with
two options coded as {0, 1} at each decision
  $d_1(H_1) = I(\eta_1^T H_1 > 0)$, $d_2(H_2) = I(\eta_2^T H_2 > 0)$
Rules involve linear combinations of accrued information
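
The indicator-type rules of Cases 3 and 4 translate directly into code. The sketch below assumes the 0/1 coding from the slide (0 = C1, 1 = C2); the thresholds, the coefficient vector `eta_vec`, and the choice to put an intercept in the history vector are assumptions made for the illustration.

```python
def rule_case3(age, pr, eta1, eta2):
    """Case 3 rule I(age < eta1, PR < eta2): 1 (C2) if both hold, else 0 (C1)."""
    return int(age < eta1 and pr < eta2)


def rule_linear(eta, h):
    """Case 4-style rule I(eta^T h > 0) for a numeric history vector h."""
    score = sum(e * x for e, x in zip(eta, h))
    return int(score > 0)


# Hypothetical thresholds and patient values, for illustration only.
print(rule_case3(age=45, pr=8.0, eta1=50, eta2=10))   # 1, i.e., give C2
print(rule_case3(age=55, pr=8.0, eta1=50, eta2=10))   # 0, i.e., give C1

# History h1 = (1, age, PR); the leading 1 is an assumed intercept term.
eta_vec = [40.0, -1.0, 0.5]
h1 = [1.0, 45.0, 8.0]
print(rule_linear(eta_vec, h1))   # 40 - 45 + 4 = -1, not > 0, so 0
```

Each choice of η gives a different rule, so searching for the optimal regime in such a class amounts to searching over η.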


Studying dynamic treatment regimes

How do we find an optimal treatment regime within a class
of interest?
• Required: Appropriate data
• Case 1. Classical, single decision treatment comparison:
Data from a standard clinical trial comparing C1 and C2
• Case 2. Optimal treatment sequence for two decision
points (simple dynamic treatment regimes)
• We will return to Cases 3 and 4 later


Clinical trials for studying treatment regimes

Recall: In our example, D consists of eight regimes


1. Give C1 followed by (M1 if response, S1 if no response)
2. Give C1 followed by (M1 if response, S2 if no response)
3. Give C1 followed by (M2 if response, S1 if no response)
4. Give C1 followed by (M2 if response, S2 if no response)
5. Give C2 followed by (M1 if response, S1 if no response)
6. Give C2 followed by (M1 if response, S2 if no response)
7. Give C2 followed by (M2 if response, S1 if no response)
8. Give C2 followed by (M2 if response, S2 if no response)

How do we compare the regimes in D and identify the
“best”?

Clinical trials for studying treatment regimes
Can’t we base this on data from a series of previous trials?
• In one trial, C1 was compared against C2 in terms of
response rate
• In another trial, M1 and M2 were compared on the basis of
survival time in subjects who responded to their induction
chemotherapy
• In yet another, S1 and S2 were compared (survival) in
subjects for whom induction therapy did not induce
response
• Can’t we just “piece together” the results from these
separate trials to figure out the “best regime”?
• E.g., figure out the best “C” treatment for inducing
response and then the best “M” and “S” treatments for
prolonging survival?
• Wouldn’t the regime that uses these have to have the
“best” mean outcome?

Clinical trials for studying treatment regimes

One problem with this: Delayed effects


• E.g., C1 may yield a higher proportion of responders than
C2 but may also have other effects that render subsequent
maintenance treatments less effective in terms of mean
survival time
• Implication: Must study entire regimes in the same
patients

Data for doing this:
• Design a clinical trial expressly for this purpose (next)
• Use longitudinal observational data, where treatments
actually received at each decision point have been
recorded (with other information)


Clinical trials for studying treatment regimes

Clinical trials:
• An eight-arm trial – subjects randomized to the jth arm
follow the jth regime
• A Sequential, Multiple Assignment, Randomized Trial
(next slide...)
• How to analyze the data to compare regimes and find the
optimal regime? What else can be learned from such
trials?


Clinical trials for studying treatment regimes
SMART: Sequential, Multiple Assignment, Randomized Trial
(Randomization at •s)

Cancer •
├─ C1
│  ├─ Response    • → M1 or M2
│  └─ No response • → S1 or S2
└─ C2
   ├─ Response    • → M1 or M2
   └─ No response • → S1 or S2

Pioneered by Susan Murphy, Phil Lavori, and others
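
The two-stage randomization in this SMART can be mimicked with a small simulation. This is a sketch only: the function name `simulate_smart`, the equal randomization probabilities, and the response probabilities are all assumptions made for the illustration.

```python
import random


def simulate_smart(n, p_response, seed=0):
    """Simulate treatment paths through the two-stage cancer SMART.

    p_response maps each induction option to an assumed response probability.
    Returns a list of (first treatment, responded?, second treatment) tuples.
    """
    rng = random.Random(seed)
    subjects = []
    for _ in range(n):
        a1 = rng.choice(["C1", "C2"])                 # first randomization
        responded = rng.random() < p_response[a1]     # intermediate outcome
        options = ["M1", "M2"] if responded else ["S1", "S2"]
        a2 = rng.choice(options)                      # second randomization
        subjects.append((a1, responded, a2))
    return subjects


# Response probabilities are made up for illustration.
paths = simulate_smart(n=8, p_response={"C1": 0.6, "C2": 0.5})
for path in paths:
    print(path)
```

Note that the second randomization depends only on each subject's own response status, so the design itself never changes as data accumulate.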


Clinical trials for studying treatment regimes
Embedded regimes: The eight regimes in D are embedded in
the SMART
[Figure: SMART schematic. Cancer patients are randomized to C1 or C2; responders are then randomized to M1 or M2, and non-responders to S1 or S2.]

Clinical trials for studying treatment regimes
Examples of SMARTs: SMARTs have been carried out or are
ongoing, mainly in behavioral disorders; see
http://methodology.psu.edu/ra/smart/projects

• SMARTs have also been done in oncology (coming up. . . )

Remarks:
• There is really no conceptual difference between
randomizing up front or sequentially
• Advantages and disadvantages, e.g., consent, balance
• Important: Making efficient use of the data

Seminal reference: Murphy SA. (2005). An experimental
design for the development of adaptive treatment strategies,
Statistics in Medicine, 24, 1455–1481.

Clinical trials for studying treatment regimes
Remark 1: Individuals following the same regime can have
different realized treatment experiences, e.g.,
Give C1 followed by (M1 if response, S1 if no response)
• Subject 1: Receives C1, responds, receives M1
• Subject 2: Receives C1, does not respond, receives S1
• Both subjects’ experiences are consistent with following
this regime

Remark 2: Individuals following different regimes can have the
same realized treatment experience, e.g., experience
C1 ⇒ Response ⇒ M1
is consistent with having followed EITHER OF regimes
• C1 followed by (M1 if response, S1 if no response)
• C1 followed by (M1 if response, S2 if no response)
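
Remark 2 can be checked mechanically: given a realized experience, list every embedded regime that would have prescribed it. The sketch below assumes regimes are encoded as (induction, maintenance, salvage) tuples; that encoding and the function name are choices made for this illustration.

```python
from itertools import product

# The eight embedded regimes: (induction, maintenance if response, salvage if not).
REGIMES = list(product(["C1", "C2"], ["M1", "M2"], ["S1", "S2"]))


def consistent_regimes(a1, responded, a2):
    """Return the embedded regimes a realized experience is consistent with."""
    matches = []
    for c, m, s in REGIMES:
        prescribed = m if responded else s   # what this regime dictates at Decision 2
        if c == a1 and prescribed == a2:
            matches.append((c, m, s))
    return matches


# Experience C1 => Response => M1 from Remark 2:
print(consistent_regimes("C1", True, "M1"))
# [('C1', 'M1', 'S1'), ('C1', 'M1', 'S2')]
```

Every realized experience in this design matches exactly two of the eight regimes, because the option the subject never needed (salvage for a responder, maintenance for a non-responder) is left unconstrained.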

Clinical trials for studying treatment regimes
Remark 3: Do not confuse the regime with the possible
realized experiences that can result from following it
• “C1 followed by response followed by M1” and “C1 followed
by no response followed by S1” are not regimes but are
possible results of following the above regime
• The regime is the algorithm (set of rules)

Remark 4: Do not confuse dynamic treatment regimes
themselves or SMARTs with response-adaptive clinical trial
designs for classical treatment comparisons
• A dynamic treatment regime is an algorithm for treating a
single patient
• This has nothing to do with other patients in a study
• An adaptive trial is one in which the data are used to alter
the design (e.g., drop an arm, sample size)
• The design of a SMART does not change

Clinical trials for studying treatment regimes

Estimation of mean outcome (e.g., mean survival):


• Usual approach under up-front randomization: estimate
mean for regime j by sample average outcome based on
subjects randomized to regime j only
• However: Subjects will have realized experiences
consistent with more than one regime!
• This can be exploited to improve precision...


Estimating mean outcome for embedded regimes

Demonstration:
• A certain kind of SMART is common in oncology...
• ...but the way these trials are usually analyzed does not focus
on comparing the embedded dynamic treatment regimes
and finding the best treatment sequence
• We demonstrate the general principle of how to exploit
realized experiences consistent with more than one regime
to do this

Reference: Lunceford JK, Davidian M, Tsiatis AA. (2002).
Estimation of survival distributions of treatment policies in
two-stage randomization designs in clinical trials. Biometrics,
58, 48–57.

Estimating mean outcome for embedded regimes

Cancer and Leukemia Group B (CALGB) Protocol 8923:
Double-blind, placebo-controlled trial of 338 elderly subjects
with acute myelogenous leukemia (AML) with randomizations
at two key decision points
• Decision 1: Subjects randomized to either standard
induction chemotherapy (C1) OR standard induction therapy
+ granulocyte-macrophage colony-stimulating factor
(GM-CSF) (C2) (two options)
• Decision 2:
  ◦ If response, subjects randomized to M1, M2 =
    intensification/maintenance treatments I, II (two options)
  ◦ If no response, only one option: follow-up with physician
• All subjects followed for the outcome survival time


Estimating mean outcome for embedded regimes

Four possible regimes: The class D of interest comprises
1. C1 followed by (M1 if response, else follow-up) (C1M1)
2. C1 followed by (M2 if response, else follow-up) (C1M2)
3. C2 followed by (M1 if response, else follow-up) (C2M1)
4. C2 followed by (M2 if response, else follow-up) (C2M2)


Estimating mean outcome for embedded regimes
Schematic of CALGB 8923: Randomization at •s

AML •
├─ Chemo + Placebo
│  ├─ Response     • → Intensification I or Intensification II
│  └─ Non-response → Follow-up
└─ Chemo + GM-CSF
   ├─ Response     • → Intensification I or Intensification II
   └─ Non-response → Follow-up


Estimating mean outcome for embedded regimes

Standard analysis:
• Compare response rates to C1 and C2
• Compare survival between M1 and M2 among responders
• Compare survival between C1 and C2 regardless of
subsequent response
• Does not address the embedded regimes

Estimating mean outcome for embedded regimes

Goal: Find the regime in D such that, if all patients in the
population were to receive treatment according to it, mean
survival would be the largest
• Estimate mean survival if all patients followed each of the
four embedded regimes CjMk, j = 1, 2, k = 1, 2
• Use data from all subjects whose realized experience is
consistent with having followed CjMk
• I.e., subjects with either
  Cj ⇒ response ⇒ Mk
  Cj ⇒ no response ⇒ follow-up with physician
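
Picking out the subjects whose experience is consistent with a given regime CjMk is a simple filter. The sketch below only performs that selection step; the record layout and the five records are made up for the illustration, and nothing here is an estimator of mean survival.

```python
# Each record: (induction arm, responded?, second-stage treatment or None).
# The records are made up for illustration.
subjects = [
    ("C1", True, "M1"),
    ("C1", True, "M2"),
    ("C1", False, None),
    ("C2", True, "M1"),
    ("C2", False, None),
]


def consistent_with(record, j, k):
    """True if the realized experience is consistent with regime CjMk."""
    a1, responded, a2 = record
    if a1 != f"C{j}":
        return False
    # Non-responders have only one option (follow-up), so they are consistent
    # with both CjM1 and CjM2; responders must have received Mk.
    return (not responded) or a2 == f"M{k}"


c1m1 = [r for r in subjects if consistent_with(r, 1, 1)]
print(c1m1)  # [('C1', True, 'M1'), ('C1', False, None)]
```

The non-responder on C1 appears in the selections for both C1M1 and C1M2, which is the sense in which one subject's data can inform more than one regime.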


Estimating mean outcome for embedded regimes

Statistical framework: Causal inference perspective


• Characterize in terms of potential outcomes

Consider first: Classical single decision treatment comparison

Statistical framework

Case 1: Classical, single decision treatment comparison
• D = {“Give C1”, “Give C2”}
• Hypothesize potential outcomes under each regime in D
• $Y^{(1)}$ = outcome that would be achieved if a randomly
chosen patient from the population were to follow regime
“Give C1”; $Y^{(2)}$ defined analogously
• $E(Y^{(1)})$ = the mean outcome if all patients in the
population were to follow “Give C1”; $E(Y^{(2)})$ analogously
• Usual question: “If all patients in the population were to be
given C1, would mean outcome be different from (better
than) that if all patients were to be given C2?”
⇒ Compare $E(Y^{(1)})$ and $E(Y^{(2)})$

Statistical framework
Clinical trial: Do not observe both $Y^{(1)}$ and $Y^{(2)}$ on each subject
• With A = 1 (2) if the subject is randomized to “Give C1” (“Give C2”),
we do observe (Y, A), where
$$Y = Y^{(1)} I(A = 1) + Y^{(2)} I(A = 2)$$

• By randomization, $Y^{(1)}, Y^{(2)} \perp\!\!\!\perp A$
$$\Rightarrow\; E(Y^{(1)}) = E(Y^{(1)} \mid A = 1) = E(Y \mid A = 1)$$
and similarly for $E(Y^{(2)})$
• Thus, from observed data $(Y_i, A_i)$, i = 1, ..., n (iid), can
estimate $E(Y^{(1)})$ by
$$\frac{\sum_{i=1}^{n} Y_i I(A_i = 1)}{\sum_{i=1}^{n} I(A_i = 1)},$$
the usual sample average, and $E(Y^{(2)})$ similarly
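
The Case 1 estimator above can be sketched in a few lines of Python. This is only an illustration: the function name `arm_means` and the toy data are assumptions introduced here and are not part of the webinar.

```python
import numpy as np

def arm_means(y, a):
    """Estimate E(Y^(1)) and E(Y^(2)) by the sample average within each randomized arm."""
    y = np.asarray(y, dtype=float)
    a = np.asarray(a)
    means = {}
    for arm in (1, 2):
        in_arm = (a == arm)                      # indicator I(A_i = arm)
        if not in_arm.any():
            raise ValueError(f"no subjects randomized to arm {arm}")
        # sum_i Y_i I(A_i = arm) / sum_i I(A_i = arm)
        means[arm] = float((y * in_arm).sum() / in_arm.sum())
    return means

# Hypothetical toy data: outcomes and randomized arm (1 = "Give C1", 2 = "Give C2")
y = [10.0, 12.0, 7.0, 9.0]
a = [1, 1, 2, 2]
est = arm_means(y, a)
```

Comparing `est[1]` and `est[2]` corresponds to comparing the estimated $E(Y^{(1)})$ and $E(Y^{(2)})$.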


Statistical framework

Case 2: Optimal treatment sequence for two decision points


• D = { Cj Mk, j, k = 1, 2 }
• Hypothesize potential outcomes under each regime in D
• $Y^{(jk)}$ = survival time that would be achieved if a randomly
chosen patient from the population were to follow Cj Mk
• Question: Compare mean survival if all patients followed
each of Cj Mk, j, k = 1, 2
⇒ Compare (estimate) $E(Y^{(jk)})$, j, k = 1, 2
• Or survival probabilities

$$S_{jk}(t) = \mathrm{pr}(Y^{(jk)} > t) = E\{ I(Y^{(jk)} > t) \}, \quad j, k = 1, 2$$

• Assume no censoring (can be generalized)


Statistical framework
Clinical trial (e.g., SMART): Do not observe $Y^{(jk)}$, j, k = 1, 2
• Can we make a connection between potential outcomes
and observed data as we did in Case 1?
• Consider j = 1; j = 2 similar

Observed for each subject: (R, RZ, Y)

• Y = survival time
• R = 1 if subject responds to C1, R = 0 if not
• Z = k for responder randomized to Mk, k = 1, 2
(not defined if R = 0)
• Assume when R = 0, $Y^{(11)}$ and $Y^{(12)}$ are the same; then
$$Y = (1 - R) Y^{(11)} + R\, I(Z = 1) Y^{(11)} + R\, I(Z = 2) Y^{(12)}$$
• From observed data $(R_i, R_i Z_i, Y_i)$, i = 1, ..., n (iid),
estimate $E(Y^{(11)})$, $E(Y^{(12)})$, and similarly for j = 2



Estimating mean outcome for embedded regimes

Consider j = 1: Responders to C1 are randomized to M1 with
probability π = 1/2
• Nonresponders to C1 ⇒ follow up
• Half of responders get M1 , half get M2
• Estimate mean survival for C1 M1 by weighted average
• Nonresponders represent themselves ⇒ weight = 1
• Each responder who got M1 represents him/herself and
another similar subject who got randomized to M2 ⇒
weight = 2
• Estimator for C1 M2 , switch roles
• Note : Survival times from nonresponders are used to
estimate the means for both C1 M1 and C1 M2
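
As a small worked illustration of this weighting, with hypothetical numbers that are not from the webinar: suppose six subjects start on C1. Two nonresponders survive 4 and 6 months, two responders randomized to M1 survive 10 and 14 months, and two responders randomized to M2 survive 20 and 22 months. With π = 1/2, the weighted averages would be

\[
\text{C1 M1:}\quad \frac{1\cdot 4 + 1\cdot 6 + 2\cdot 10 + 2\cdot 14}{1 + 1 + 2 + 2} = \frac{58}{6} \approx 9.67,
\qquad
\text{C1 M2:}\quad \frac{1\cdot 4 + 1\cdot 6 + 2\cdot 20 + 2\cdot 22}{1 + 1 + 2 + 2} = \frac{94}{6} \approx 15.67 .
\]

The two nonresponders (4 and 6 months) enter both averages with weight 1, while each responder enters only the average for the regime he or she is consistent with, with weight 2.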


Estimating mean outcome for embedded regimes
Formally: For j = 1 (j = 2 similar), $(R_i, R_i Z_i, Y_i)$, i = 1, ..., n
$Y_i$ = survival time for subject i
$R_i$ = 1 if i responds to C1, $R_i$ = 0 if not
$Z_i$ = k for responder randomized to Mk, k = 1, 2
$\mathrm{pr}(Z_i = 1 \mid R_i = 1) = \pi$ (= 1/2 in previous)

Estimators for $E(Y^{(11)})$: with $Q_i = 1 - R_i + R_i I(Z_i = 1) \pi^{-1}$,

$$n^{-1} \sum_{i=1}^{n} Q_i Y_i \quad \text{or} \quad \left( \sum_{i=1}^{n} Q_i \right)^{-1} \sum_{i=1}^{n} Q_i Y_i$$

• $Q_i = 0$ if i is inconsistent with C1 M1 (i.e., $R_i = 1$ and $Z_i = 2$,
so i is consistent only with C1 M2)
• $Q_i = 1$ if $R_i = 0$
• $Q_i = \pi^{-1}$ if $R_i = 1$ and $Z_i = 1$
• Similarly for $E(Y^{(12)})$



Estimating mean outcome for embedded regimes

Estimators for $E(Y^{(11)})$: with $Q_i = 1 - R_i + R_i I(Z_i = 1) \pi^{-1}$,

$$n^{-1} \sum_{i=1}^{n} Q_i Y_i \quad \text{or} \quad \left( \sum_{i=1}^{n} Q_i \right)^{-1} \sum_{i=1}^{n} Q_i Y_i$$

• Can show: $E(QY) = E(Y^{(11)})$, $E(Q) = 1$
• And similarly for j, k = 1, 2
• ⇒ Consistent estimators for $E(Y^{(jk)})$ (Appendix)
• Estimators for $E(Y^{(jk)})$, k = 1, 2, are correlated
• Can derive statistics for comparison ⇒ identify optimal
regime in D
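
The weighted estimators for the embedded regimes can be sketched as follows. This is a minimal illustration under stated assumptions, not the webinar's implementation: the function name `embedded_regime_mean`, the convention of coding Z as 0 for nonresponders, and the toy data are all hypothetical.

```python
import numpy as np

def embedded_regime_mean(y, r, z, k=1, pi=0.5, normalize=True):
    """Weighted estimate of E(Y^(1k)) from subjects who started on C1.

    y  : observed outcomes (e.g., survival times, no censoring)
    r  : 1 if the subject responded to C1, 0 otherwise
    z  : second-stage arm (1 or 2) for responders; any placeholder (here 0) if r == 0
    pi : pr(Z = 1 | R = 1), known by design in a SMART
    """
    y = np.asarray(y, dtype=float)
    r = np.asarray(r, dtype=int)
    z = np.asarray(z)
    p_k = pi if k == 1 else 1.0 - pi          # randomization probability for M_k
    # Q_i = 1 - R_i + R_i I(Z_i = k) / p_k
    q = (1 - r) + r * (z == k) / p_k
    # Either divide by n or by the sum of the weights
    denom = q.sum() if normalize else len(y)
    return float((q * y).sum() / denom)

# Hypothetical toy data: 2 nonresponders, 2 responders on M1, 2 responders on M2
y = [4.0, 6.0, 10.0, 14.0, 20.0, 22.0]
r = [0, 0, 1, 1, 1, 1]
z = [0, 0, 1, 1, 2, 2]
m11 = embedded_regime_mean(y, r, z, k=1)
m12 = embedded_regime_mean(y, r, z, k=2)
```

Because the nonresponders contribute to both `m11` and `m12`, the two estimates are computed from overlapping data, which is why the estimators for k = 1, 2 are correlated.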



Estimating mean outcome for embedded regimes

Remarks:
• Subjects may die before having a chance to respond – such
subjects are nonresponders at the time of death (R = 0)
• Survival time may be right-censored – can incorporate
inverse probability of censoring weighting
• Randomization at each decision is key ⇒ subjects are
prognostically similar
• Can be generalized to arbitrary number of decisions,
numbers of options at each



Designing SMARTs
Considerations:
• Class of regimes should involve key decision points where
it is feasible to randomize
• And with more than one treatment option and no
consensus on choice among options
• Simplicity – small numbers of decision points and options
• Embedded regimes should have simple decision rules;
e.g., depending only on a few variables (response status)
• Criteria and methods for sample size determination are an
open problem
• Critical: Collect rich patient information at baseline and
between decision points to inform development of more
complex, optimal regimes (e.g., Cases 3 and 4)
• More shortly. . .



Designing SMARTs
Schematic of CALGB 8923: Randomization at •s
[Figure: AML patients are randomized to Chemo + Placebo or Chemo + GM-CSF. In each arm, nonresponders go to follow-up and responders are randomized to Intensification I or Intensification II.]


Thinking in terms of dynamic treatment regimes

Questions not addressed in a conventional clinical trial:


• If a treatment is effective, what should be the duration of
administration?
• How would the randomized treatments have compared if
no patients had discontinued their assigned treatments?

Such questions can be cast as questions about dynamic
treatment regimes
• Available data are almost always observational
• Databases from registries
• Databases from completed clinical trials


Thinking in terms of dynamic treatment regimes
Example: Optimal treatment duration
• ESPRIT trial – Integrilin vs. placebo in PCI/stent patients
• Primary analysis: Integrilin superior
• Protocol: Infusion duration of 18–24 hours with
mandatory stopping for adverse events
• Duration of infusion left to physician discretion
• What should be the “recommended” treatment duration?
• Data are observational with respect to this question

More precisely: Treatment duration of t hours means infuse
for t hours or until an adverse event requiring stopping,
whichever comes first
• This is a dynamic treatment regime for each t because
realized duration depends on the adverse event status
Johnson BA, Tsiatis AA. (2004). Estimating mean response as a function of treatment duration in an observational
study, where duration may be informatively censored. Biometrics, 60, 315–323.



Thinking in terms of dynamic treatment regimes
Duration regime of t hours:
[Figure: Start Integrilin infusion. If an AE occurs before t hours, stop infusion immediately; if no AE occurs before t hours, stop infusion at t hours.]

• D = { all regimes of the form “infuse for t hours or until an
adverse event requiring stopping, whichever comes first”
for 18 ≤ t ≤ 24 }

Objective: Find $t^{opt} \in [18, 24]$ leading to largest mean
outcome (probability of no CVD event in 30 days)


Thinking in terms of dynamic treatment regimes
Example: Treatment comparison in presence of treatment
discontinuation
• SYNERGY trial – enoxaparin (ENOX) vs. unfractionated
heparin (UFH) in ACS patients (open label)
• Primary (intent-to-treat) analysis: No difference
• Lots of treatment discontinuation (switching, stopping)
• Some mandatory due to adverse events, some at
clinician/patient discretion
• How would the treatments compare if there were no
discontinuation?

Objective: Compare the two dynamic treatment regimes
“Take ENOX (UFH) until completion or discontinuation for
mandatory reasons”
Zhang M, Tsiatis AA, Davidian M, Pieper KS, Mahaffey KW. (2011). Inference on treatment effects from a clinical
trial in the presence of premature treatment discontinuation: The SYNERGY trial. Biostatistics, 12, 258–269.


Studying regimes based on observational data
Again: Data are observational with respect to these questions
• Decisions on duration, treatment discontinuation were
not randomized
• Made at clinician/patient discretion

Difficulties for studying regimes:

• Confounding – subjects receiving one treatment or another
may not be prognostically similar
• E.g., subjects who discontinued may be sicker, older, etc.
• Standard methods are available to adjust for confounding,
e.g., regression, propensity scores, etc., assuming no
unmeasured confounders
• However, the time-dependent nature of treatment causes
additional complications



Studying regimes based on observational data

Time-dependent confounding: Treatments actually received
over time depend on accruing information
• Temptation: “Adjust” for such time-dependent confounding
• E.g., a Cox model for outcome including time-dependent
intermediate variables and treatments
• However: Part of the effect of treatment on outcome may
be mediated through intermediate variables
• ⇒ Adjustment would incorrectly remove this effect and
hence misrepresent the true treatment effect


Studying regimes based on observational data

Resolution:
• Requires a generalization of no unmeasured confounders
• Unverifiable from the observed data

Sequential randomization assumption: At any point where a
treatment decision is made, the treatment received (among the
options available) depends only on the accrued information on
the patient and not additionally on his/her future prognosis
• At some level, this must be true
• In a SMART, this is automatically true by randomization
• With observational data, this is tenable only if all accrued
information used to make decisions is available in the
database


Studying regimes based on observational data
Under sequential randomization: Inference on dynamic
treatment regimes
• Can use weighted methods similar to those discussed
earlier for Case 2, extended to multiple decision points
• Critical difference: Rather than weighting based on known
randomization probabilities, weighting is based on the
propensities of receiving treatment at each decision as a
function of accrued information
• Requires modeling/estimation of the propensities
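
A minimal sketch of such propensity-based weighting is given below. It assumes the propensities have already been estimated by some model (for example, a logistic regression on accrued information); how they are estimated is not specified here. The function name `ipw_regime_mean`, the input layout, and the toy numbers are hypothetical.

```python
import numpy as np

def ipw_regime_mean(y, consistent, propensities):
    """Inverse-propensity-weighted mean outcome for one regime.

    y            : observed outcomes, shape (n,)
    consistent   : True if the subject's treatment history is consistent
                   with the regime at every decision point, shape (n,)
    propensities : estimated probability of the treatment actually received
                   at each decision, given accrued information, shape (n, K);
                   use 1.0 where no decision was made (assumption of this sketch)
    """
    y = np.asarray(y, dtype=float)
    consistent = np.asarray(consistent, dtype=bool)
    p = np.asarray(propensities, dtype=float)
    if p.ndim == 1:
        p = p[:, None]                       # single decision point
    if np.any((p <= 0) | (p > 1)):
        raise ValueError("propensities must lie in (0, 1]")
    # Weight = I(consistent with regime) / product of propensities over decisions
    w = consistent / p.prod(axis=1)
    if w.sum() == 0:
        raise ValueError("no subject is consistent with the regime")
    return float((w * y).sum() / w.sum())

# Hypothetical toy data with K = 2 decision points
y = [5.0, 8.0, 3.0, 9.0]
consistent = [True, True, False, True]
propensities = [[0.5, 0.5], [0.8, 0.5], [0.6, 0.7], [0.5, 1.0]]
est = ipw_regime_mean(y, consistent, propensities)
```

With known randomization probabilities in place of estimated propensities, this reduces to the same kind of weighting used for Case 2.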

Moral:
• Many complex questions can be posed in terms of a class
of dynamic treatment regimes
• Methods are available for inference on regimes in the class


Constructing dynamic treatment regimes

Cases 3 and 4: More complex regimes focused on
personalizing treatment to the patient
• Case 3: D = specified class of feasible regimes
• Case 4: D = all possible regimes
• Rules involve accrued information on the patient

Can we estimate an optimal regime within these classes?

• From data from a SMART in which detailed accruing
information was collected?
• From data from an observational database?



Characterizing an optimal regime

Demonstration: Characterize an optimal regime $d^{opt}$ in the
class D of all possible regimes d (Case 4)
• Single decision point
• Two treatment options coded as {0, 1}
• d ∈ D is a single rule $d_1(X_1)$ taking values 0 or 1
• Data from a conventional clinical trial (simplest SMART)

$(X_{1i}, A_{1i}, Y_i)$, i = 1, ..., n (iid)

$X_1$ is baseline information; $A_1$ is treatment received, taking values {0, 1}

• Assume large outcomes are better


Characterizing an optimal regime
Potential outcome for a regime: For any regime d ∈ D
• $Y^{(0)}$ and $Y^{(1)}$ are potential outcomes if a randomly chosen
patient were to receive treatments 0 and 1, respectively
• Potential outcome if a randomly chosen patient were to
follow regime d
$$\begin{aligned}
Y^{(d)} &= Y^{(1)} I\{ d(X_1) = 1 \} + Y^{(0)} I\{ d(X_1) = 0 \} \\
&= Y^{(1)} d(X_1) + Y^{(0)} \{ 1 - d(X_1) \}
\end{aligned}$$
• $E(Y^{(d)})$ = mean outcome if all patients in the population
were to follow regime d

Optimal regime $d^{opt}$: $d^{opt}$ maximizes $E(Y^{(d)})$ among all d ∈ D

• Can we estimate $d^{opt}$ satisfying this from the trial data?


Estimating an optimal regime
Observed outcome:
$$Y = Y^{(1)} I(A_1 = 1) + Y^{(0)} I(A_1 = 0) = Y^{(1)} A_1 + Y^{(0)} (1 - A_1)$$
• By randomization, $Y^{(0)}, Y^{(1)} \perp\!\!\!\perp A_1 \mid X_1$
$$\Rightarrow\; E(Y^{(1)} \mid X_1) = E(Y^{(1)} \mid X_1, A_1 = 1) = E(Y \mid X_1, A_1 = 1)$$
and similarly for $Y^{(0)}$

Thus:
$$\begin{aligned}
E(Y^{(d)}) &= E\{ E(Y^{(d)} \mid X_1) \} \\
&= E\big[ E(Y^{(1)} \mid X_1)\, d(X_1) + E(Y^{(0)} \mid X_1) \{ 1 - d(X_1) \} \big] \\
&= E\big[ E(Y^{(1)} \mid X_1, A_1 = 1)\, d(X_1) + E(Y^{(0)} \mid X_1, A_1 = 0) \{ 1 - d(X_1) \} \big] \\
&= E\big[ E(Y \mid X_1, A_1 = 1)\, d(X_1) + E(Y \mid X_1, A_1 = 0) \{ 1 - d(X_1) \} \big]
\end{aligned}$$


Estimating an optimal regime
Recall: We wish to maximize
$$E(Y^{(d)}) = E\big[ E(Y \mid X_1, A_1 = 1)\, d(X_1) + E(Y \mid X_1, A_1 = 0) \{ 1 - d(X_1) \} \big]$$

• Clearly: $E(Y^{(d)})$ is maximized by
$$d^{opt}(X_1) = I\{ E(Y \mid X_1, A_1 = 1) > E(Y \mid X_1, A_1 = 0) \}$$
• $E(Y \mid X_1, A_1)$ is the regression of outcome on baseline
information and treatment received
Suggests: Posit a regression model $Q(X_1, A_1; \beta)$ for $E(Y \mid X_1, A_1)$
• Fit the model to trial data ⇒ $Q(X_1, A_1; \hat\beta)$
• Estimated optimal regime
$$\hat{d}^{opt}(X_1) = I\{ Q(X_1, 1; \hat\beta) > Q(X_1, 0; \hat\beta) \}$$
• Issue: What if the model $Q(X_1, A_1; \beta)$ is misspecified?
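
The regression-based approach above can be sketched for one simple case. The model form here is an assumption chosen for illustration, not one given in the webinar: a single scalar covariate x and a linear model with a treatment-by-covariate interaction, Q(x, a; β) = β0 + β1 x + a(β2 + β3 x), fitted by least squares. The function names and the noise-free toy data are hypothetical.

```python
import numpy as np

def fit_q_linear(x, a, y):
    """Least-squares fit of Q(x, a; beta) = b0 + b1*x + a*(b2 + b3*x)."""
    x = np.asarray(x, dtype=float)
    a = np.asarray(a, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones_like(x), x, a, a * x])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

def d_opt_hat(x, beta):
    """Estimated rule I{Q(x, 1) > Q(x, 0)}; for this model, I{b2 + b3*x > 0}."""
    x = np.asarray(x, dtype=float)
    return (beta[2] + beta[3] * x > 0).astype(int)

# Hypothetical noise-free data from Q(x, a) = 1 + 0.5*x + a*(-1 + 2*x),
# so treatment 1 is better exactly when x > 0.5
x = np.array([0.0, 1.0, 0.25, 0.75, 0.0, 1.0, 0.25, 0.75])
a = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y = 1 + 0.5 * x + a * (-1 + 2 * x)

beta_hat = fit_q_linear(x, a, y)
rule = d_opt_hat([0.2, 0.9], beta_hat)   # recommended treatment for two new patients
```

If the posited model is misspecified, the fitted rule need not be close to the true optimal regime, which is the issue raised in the last bullet.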


Estimating an optimal regime
Shameless promotion: Discussion of estimation of an optimal
regime within a broad class of regimes D with a focus on
personalized treatment as in Cases 3 and 4 merits its own
short course
• Robustness to misspecification of models?
• Alternative approaches?
• Extension to multiple decision points?
• Etc., etc.

Personalized Medicine and Dynamic Treatment Regimes
• Half-day short course at 2015 ENAR Spring Meeting
(Sunday, March 15, morning)

Forthcoming book: Kosorok, M. R. and Moodie, E. E. M.
(2015). Adaptive Treatment Strategies in Practice: Planning
Trials and Analyzing Data for Personalized Medicine. SIAM.



Discussion

• Dynamic treatment regimes formalize clinical
decision-making and provide a framework for personalized
treatment
• A broad range of problems can be cast in terms of dynamic
treatment regimes
• SMARTs are the “gold standard” data source for
estimation of dynamic treatment regimes
• Design considerations for SMARTs? Broader adoption?
Implications for how treatments are evaluated?
• Estimation of optimal treatment regimes is a wide open
area of research



Thought Leaders

2013 MacArthur Fellow Susan Murphy and Jamie Robins



Resources

Introductory material:
• http://methodology.psu.edu/
• http://www-personal.umich.edu/~dalmiral/
• http://www.huffingtonpost.com/american-statistical-association/being-smart-about-constru_b_4963862.html
• http://impact.unc.edu/Symposium2014Agenda

Literature: See the separate list of references



Appendix
Consistency of estimators for $E(Y^{(11)})$:
$$Q_i = 1 - R_i + R_i I(Z_i = 1) \pi^{-1}$$
$$n^{-1} \sum_{i=1}^{n} Q_i Y_i \quad \text{or} \quad \left( \sum_{i=1}^{n} Q_i \right)^{-1} \sum_{i=1}^{n} Q_i Y_i$$
$$Y = (1 - R) Y^{(11)} + R\, I(Z = 1) Y^{(11)} + R\, I(Z = 2) Y^{(12)}$$

Want to show: $E(QY) = E(Y^{(11)})$

• Using R(1 − R) = 0, I(Z = 1)I(Z = 2) = 0, etc.
$$\begin{aligned}
E(QY) &= E\big[ Y^{(11)} \{ (1 - R) + R\, I(Z = 1) \pi^{-1} \} \big] \\
&= E\big[ Y^{(11)} E\{ (1 - R) + R\, I(Z = 1) \pi^{-1} \mid R, Y^{(11)} \} \big]
\end{aligned}$$
• So equivalently want to show
$$E\{ (1 - R) + R\, I(Z = 1) \pi^{-1} \mid R, Y^{(11)} \} = 1$$



Appendix

$$\begin{aligned}
& E\{ (1 - R) + R\, I(Z = 1) \pi^{-1} \mid R, Y^{(11)} \} \\
&= E\{ (1 - R) + R\, I(Z = 1) \pi^{-1} \mid R = 0, Y^{(11)} \}\, P(R = 0 \mid Y^{(11)}) \\
&\quad + E\{ (1 - R) + R\, I(Z = 1) \pi^{-1} \mid R = 1, Y^{(11)} \}\, P(R = 1 \mid Y^{(11)}) \\
&= P(R = 0 \mid Y^{(11)}) + E\{ I(Z = 1) \mid R = 1, Y^{(11)} \}\, \pi^{-1} P(R = 1 \mid Y^{(11)}) \\
&= P(R = 0 \mid Y^{(11)}) + P(R = 1 \mid Y^{(11)}) = 1
\end{aligned}$$

Because: By randomization, assignment to M1 $\perp\!\!\!\perp Y^{(11)}$

$$E\{ I(Z = 1) \mid R = 1, Y^{(11)} \} = P(Z = 1 \mid R = 1, Y^{(11)}) = P(Z = 1 \mid R = 1) = \pi$$

For k = 2: Same argument, $Q = 1 - R + R\, I(Z = 2) (1 - \pi)^{-1}$
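
The two identities E(Q) = 1 and E(QY) = E(Y^(11)) can also be checked numerically. The simulation below is only an illustrative sketch: the response rate, the exponential distributions for the potential outcomes, and the seed are assumptions made here, not values from the webinar.

```python
import numpy as np

rng = np.random.default_rng(0)
n, pi = 200_000, 0.5

# Response to C1; responders are given longer potential survival (assumed)
r = rng.binomial(1, 0.4, size=n)
y11 = rng.exponential(scale=np.where(r == 1, 3.0, 1.0))
# Y^(12) equals Y^(11) for nonresponders, as assumed in the derivation
y12 = np.where(r == 1, rng.exponential(scale=2.0, size=n), y11)

# Responders are randomized to M1 with probability pi; Z is coded 0 if R = 0
z = np.where(r == 1, rng.choice([1, 2], size=n, p=[pi, 1 - pi]), 0)

# Observed outcome and weights, following the formulas in the Appendix
y = (1 - r) * y11 + r * (z == 1) * y11 + r * (z == 2) * y12
q = (1 - r) + r * (z == 1) / pi

mean_q = q.mean()            # should be close to 1
mean_qy = (q * y).mean()     # should be close to the mean of Y^(11)
target = y11.mean()
```

In a real trial `y11` is not observed for everyone, so `target` is available only in a simulation like this one.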
