
The Markov Decision Process

Sauré (2012)
The radiation therapy appointment scheduling problem is formulated as a
discounted infinite horizon MDP model.

Sauré et al. (2012) extended the ‘dynamic multi-priority’ patient scheduling
model developed by Patrick et al. (2008) by:
• introducing multiple appointment requests,
• multiple session durations,
• allowing parts of the appointments to be delivered using overtime.
The model assumes:
• Block scheduling: capacity is divided into fixed-length appointment slots
• Patients are classified into types according to their priorities, each priority
having a different medically acceptable wait time
• Patient types and demand distributions do not change over time and are
uncorrelated
• The booking agent may schedule patients at most N days in advance
• Neither rescheduling nor cancellations are considered
1-State space
Scheduling decisions are made once a day, at the end of each day.
At that point, the booking agent knows:
• $u_m$: the number of regular-hour appointment slots already booked on day m
• $v_m$: the number of overtime slots already booked on day m
• $w_i$: the number of patients of type i waiting to be booked
Hence, a state of the system, denoted $s \in S$, can be represented as follows:
\[ s = (u_1, \ldots, u_M, v_1, \ldots, v_M, w_1, \ldots, w_I) \]
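It is important to note that the planning horizon of M days is dynamic: it rolls
forward by one day at each decision epoch, so the last day M is always empty at
the beginning of each decision epoch:
\[ u_M = v_M = 0 \]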
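As a minimal sketch of how such a state could be held in code, the container below stores the three groups of variables and checks the empty-last-day condition. The class name `State`, the tuple layout, and the example numbers are illustrative assumptions, not part of the original model description.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class State:
    """Hypothetical container for s = (u_1..u_M, v_1..v_M, w_1..w_I)."""
    u: Tuple[int, ...]  # u[m-1]: regular-hour slots already booked on day m
    v: Tuple[int, ...]  # v[m-1]: overtime slots already booked on day m
    w: Tuple[int, ...]  # w[i-1]: patients of type i waiting to be booked

    def __post_init__(self):
        if len(self.u) != len(self.v):
            raise ValueError("u and v must both cover the M days of the planning horizon")
        # The last day of the planning horizon is always empty: u_M = v_M = 0.
        if self.u[-1] != 0 or self.v[-1] != 0:
            raise ValueError("the last day of the planning horizon must be empty")


# Illustrative values only: M = 3 days, I = 2 patient types.
example_state = State(u=(3, 2, 0), v=(1, 0, 0), w=(2, 1))
```

Making the state immutable and hashable (frozen dataclass of tuples) is a design choice that lets states be used as dictionary keys, for example when tabulating a value function.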
Figure 1: The booking horizon
Figure 2: Timeline associated with the patient scheduling decisions for a given day k
2-Action sets
At the end of each day, the booking agent must decide on which day to start each of
the treatments waiting to be scheduled.
Any action available to the booking agent can be represented as follows:

\[ a = (x, y) = (x_{11}, x_{12}, \ldots, x_{IN}, y_1, \ldots, y_M) \]

• $x_{in}$ represents the number of patients of type i whose first appointment is
booked on day n (from today)
• $y_m$ represents the number of overtime slots booked today on day m (patients
of type I being diverted)

The set of feasible actions must satisfy the following constraints:


Constraint 1

This constraint limits the number of patients of each type i booked today to at
most the number of patients of that type waiting to be booked:

\[ \sum_{n=1}^{N} x_{in} \le w_i \quad \forall i \]
Constraint 2
This constraint requires that the appointment slots booked on day m (those already
booked plus those booked today) do not exceed the regular-hour capacity of that
day plus the overtime slots booked today for that day.

This is equivalent to ensuring that the number $y_m$ of overtime slots booked
today for day m is sufficient to cover the part of the bookings for that day that
does not fit within regular hours.

\[ u_m + \sum_{i=1}^{I} \sum_{k=\max\{m-l_i+1,\,1\}}^{\min\{m,\,N\}} r_{i(m-k+1)} x_{ik} \le C_r + y_m \quad \forall m \]

• $r_i = (r_{ij})$ is a vector whose entry $r_{ij}$ is the duration, in number of
appointment slots, of the appointment on day j of a treatment of type i
• $l_i$ is the number of sessions of a treatment of type i
• N is the number of days in the booking horizon
• $C_r$ is the regular-hour capacity, in fixed-length appointment slots per day
• $M = N + \max_i \{l_i\} - 1$ is the number of days in the planning horizon,
which is defined large enough to allow the completion of any treatment
initiated on day N of the booking horizon
Constraint 3
This constraint ensures that the total overtime utilization on day m does not
exceed the overtime capacity:

\[ v_m + y_m \le C_o \quad \forall m \]

where $C_o$ is the fixed overtime capacity (appointment slots/day).

Finally, all action variables must be nonnegative integers:

\[ x_{in} \in \mathbb{Z}_+ \quad \forall i, n \]

\[ y_m \in \mathbb{Z}_+ \quad \forall m \]
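The feasibility conditions above (Constraints 1 to 3 plus nonnegativity) can be sketched as a small check. The function names `new_regular_load` and `is_feasible`, the list-based data layout, and all numbers are assumptions made for illustration; inputs are assumed to already be integers.

```python
def new_regular_load(m, x, r, N):
    """Slots required on day m (1-based) by treatments whose start is booked today.

    x[i][n-1]: patients of type i starting on day n; r[i][j-1]: slots of session j.
    """
    total = 0
    for i, durations in enumerate(r):
        l_i = len(durations)  # number of sessions of a type-i treatment
        for k in range(max(m - l_i + 1, 1), min(m, N) + 1):
            # A treatment started on day k is in its (m-k+1)-th session on day m.
            total += durations[m - k] * x[i][k - 1]
    return total


def is_feasible(u, v, w, x, y, r, N, C_r, C_o):
    M = len(u)
    # Nonnegativity of all action variables.
    if any(val < 0 for row in x for val in row) or any(val < 0 for val in y):
        return False
    # Constraint 1: cannot book more patients of a type than are waiting.
    if any(sum(x[i]) > w[i] for i in range(len(w))):
        return False
    for m in range(1, M + 1):
        # Constraint 2: regular-hour capacity, relaxed by overtime booked today.
        if u[m - 1] + new_regular_load(m, x, r, N) > C_r + y[m - 1]:
            return False
        # Constraint 3: overtime capacity.
        if v[m - 1] + y[m - 1] > C_o:
            return False
    return True


# Illustrative instance: one patient type with two sessions of 2 and 1 slots,
# N = 2 booking days, hence M = N + max(l_i) - 1 = 3 planning days.
r = [[2, 1]]
u, v, w = (3, 2, 0), (0, 0, 0), (2,)
print(is_feasible(u, v, w, [[1, 0]], (1, 0, 0), r, N=2, C_r=4, C_o=2))  # True
print(is_feasible(u, v, w, [[1, 0]], (0, 0, 0), r, N=2, C_r=4, C_o=2))  # False
```

In the second call, day 1 would need 3 + 2 = 5 regular-hour slots against a capacity of 4 with no overtime booked, so Constraint 2 fails.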
3-Transition probabilities
The only source of uncertainty in the transition to the next state of the system
is the number of new requests for each type of treatment i.

Hence, the probability of moving to state s' on the next day is determined as
follows:

\[ P(s' \mid s, a) = \prod_{i=1}^{I} \Pr(q_i) \]

The term $\Pr(q_i)$ corresponds to the probability of having $q_i$ new requests
for treatments of type i.
This expression holds if s' satisfies the following equations; otherwise the
transition probability is zero:
Constraint 4
This constraint determines the new number of regular-hour slots booked on day m
as the number of slots previously booked on day m+1, plus all new bookings
affecting day m+1, minus the overtime slots booked today for day m+1:

\[ u'_m = u_{m+1} + \sum_{i=1}^{I} \sum_{k=\max\{m+1-l_i+1,\,1\}}^{\min\{m+1,\,N\}} r_{i(m+1-k+1)} x_{ik} - y_{m+1} \quad \forall m < M \]
Constraint 5

This constraint determines the new number of overtime appointment slots booked
on day m as the number of overtime slots previously booked on day m+1 plus the
overtime slots booked today for day m+1:

\[ v'_m = v_{m+1} + y_{m+1} \quad \forall m < M \]
Constraint 6

This constraint determines the new number of treatments waiting to be booked as
the number of treatment requests that have not yet been booked plus new demand:

\[ w'_i = w_i - \sum_{n=1}^{N} x_{in} + q_i \quad \forall i \]
Constraint 7

The numbers of regular-hour and overtime appointment slots booked on the new last
day M of the planning horizon must be equal to zero:

\[ u'_M = v'_M = 0 \]
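The state update described by Constraints 4 to 7 can be sketched as follows. The function names, the list-based layout, and the demand table `demand_pmf` are hypothetical; the text does not specify the demand distribution, so the probabilities are simply passed in as per-type lookup tables.

```python
def next_state(u, v, w, x, y, q, r, N):
    """Return (u', v', w') after action (x, y) and new demand q (Constraints 4-7)."""
    M = len(u)
    u_next, v_next = [], []
    for m in range(1, M):          # m = 1, ..., M-1
        d = m + 1                  # tomorrow's day m is today's day m+1
        load = 0
        for i, durations in enumerate(r):
            l_i = len(durations)
            for k in range(max(d - l_i + 1, 1), min(d, N) + 1):
                load += durations[d - k] * x[i][k - 1]
        u_next.append(u[d - 1] + load - y[d - 1])   # Constraint 4
        v_next.append(v[d - 1] + y[d - 1])          # Constraint 5
    u_next.append(0)                                # Constraint 7
    v_next.append(0)
    # Constraint 6: unbooked requests carry over and new demand arrives.
    w_next = [w[i] - sum(x[i]) + q[i] for i in range(len(w))]
    return tuple(u_next), tuple(v_next), tuple(w_next)


def transition_probability(q, demand_pmf):
    """Product over types of Pr(q_i); demand_pmf[i] maps q_i to its probability."""
    p = 1.0
    for i, q_i in enumerate(q):
        p *= demand_pmf[i].get(q_i, 0.0)
    return p


# Illustrative instance: one type, sessions of 2 and 1 slots, N = 2, M = 3.
s_next = next_state(u=(3, 2, 0), v=(0, 0, 0), w=(2,),
                    x=[[1, 0]], y=(1, 0, 0), q=(1,), r=[[2, 1]], N=2)
print(s_next)  # ((3, 0, 0), (0, 0, 0), (2,))
print(transition_probability((1,), [{0: 0.5, 1: 0.3, 2: 0.2}]))  # 0.3
```

Given s and a, each demand vector q leads to exactly one next state, which is why the transition probability reduces to the product of the demand probabilities.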
4-Costs
The total cost associated with choosing action $a \in A_s$ in state $s \in S$
comes from three sources:
• The penalties associated with the resulting patient wait times
• The cost associated with the use of overtime
• The penalties associated with postponing some of the booking decisions
We represent the total cost as follows:

\[ c(s, a) = \sum_{i=1}^{I} \sum_{n=1}^{N} c_{in} x_{in} + \sum_{m=1}^{M} h_m y_m + \sum_{i=1}^{I} g_i \left( w_i - \sum_{n=1}^{N} x_{in} \right) \]

• $c_{in}$ is the penalty associated with booking the first appointment of a
patient of type i on day n (from today)
• $c_{in}$ is non-increasing in i (i.e. patients with a smaller index i are more
urgent)
• $c_{in}$ is non-decreasing in n and equal to zero if $n \le T_i$, where $T_i$ is
the wait time target (the medically acceptable wait) associated with a patient
of priority i

\[ c_{in} = \sum_{k=1}^{n} \lambda^{k-1} f_{ik} \quad \forall i, n \]

The values of $c_{in}$ are obtained by discounting the penalties $f_{ik}$
associated with each additional day of wait before the start of a treatment.

\[ h_m = \lambda^{m-1} h \quad \forall m \]

$h_m$ represents the discounted overtime cost associated with an overtime booking
on day m.

$g_i$ is the penalty for postponing to the next day the booking of a treatment of
type i.
h and $\lambda$ denote the overtime cost and the discount factor, respectively.
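The cost terms can be sketched as below, computing the discounted penalties from the per-day penalties and the overtime cost. The function names and all numeric values are illustrative assumptions; the discount factor 0.5 is chosen only to keep the arithmetic easy to follow.

```python
def booking_penalties(f, lam):
    """c[i][n-1] = sum over k = 1..n of lam**(k-1) * f[i][k-1]."""
    c = []
    for f_i in f:
        row, running = [], 0.0
        for k, f_ik in enumerate(f_i, start=1):
            running += lam ** (k - 1) * f_ik
            row.append(running)
        c.append(row)
    return c


def total_cost(w, x, y, f, g, h, lam):
    c = booking_penalties(f, lam)
    # Wait-time penalties for the treatments booked today.
    wait = sum(c[i][n] * x[i][n] for i in range(len(x)) for n in range(len(x[i])))
    # Discounted overtime cost: h_m = lam**(m-1) * h, with m 1-based (index m-1 here).
    overtime = sum(lam ** idx * h * y_m for idx, y_m in enumerate(y))
    # Penalties for requests left unbooked until tomorrow.
    postponed = sum(g[i] * (w[i] - sum(x[i])) for i in range(len(w)))
    return wait + overtime + postponed


# Illustrative instance: one type with wait target T_1 = 1 (no penalty on day 1).
f = [[0.0, 2.0]]   # per-day wait penalties f_ik (assumed values)
print(booking_penalties(f, lam=0.5))                                   # [[0.0, 1.0]]
print(total_cost(w=(2,), x=[[1, 0]], y=(1, 0, 0), f=f, g=[3.0], h=4.0, lam=0.5))  # 7.0
```

In the example, booking one patient on day 1 incurs no wait penalty, one overtime slot on day 1 costs 4, and the one patient left unbooked costs 3, for a total of 7.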
5-Optimality equations
To identify an optimal policy, we need to solve the following optimality
equations:

\[ v(s) = \min_{a \in A_s} \left\{ c(s, a) + \lambda \sum_{s' \in S} P(s' \mid s, a)\, v(s') \right\} \quad \forall s \in S \]

where v(s) represents the value function.
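To illustrate what solving these equations means, here is a minimal value iteration sketch on a toy two-state cost-minimizing MDP. This is a generic technique shown under stated assumptions, not the solution method of the source; the states, actions, costs, and the helper name `value_iteration` are all hypothetical, and enumerating states this way is only workable for very small instances.

```python
def value_iteration(states, actions, cost, prob, lam, tol=1e-8, max_iter=10_000):
    """Iterate v(s) <- min_a { c(s,a) + lam * sum_s' P(s'|s,a) v(s') } to convergence."""
    v = {s: 0.0 for s in states}
    for _ in range(max_iter):
        v_new = {
            s: min(
                cost(s, a) + lam * sum(p * v[s2] for s2, p in prob(s, a).items())
                for a in actions(s)
            )
            for s in states
        }
        gap = max(abs(v_new[s] - v[s]) for s in states)
        v = v_new
        if gap < tol:
            break
    return v


# Toy example: a "busy" system can postpone (cost 1, stays busy) or use
# overtime (cost 3, becomes idle); an "idle" system stays idle at no cost.
states = ["idle", "busy"]
actions = lambda s: ["stay"] if s == "idle" else ["postpone", "overtime"]
cost = lambda s, a: {"stay": 0.0, "postpone": 1.0, "overtime": 3.0}[a]
prob = lambda s, a: {"busy": 1.0} if a == "postpone" else {"idle": 1.0}

v = value_iteration(states, actions, cost, prob, lam=0.9)
print(v)  # {'idle': 0.0, 'busy': 3.0}
```

With a discount factor of 0.9, postponing forever would cost 1 / (1 - 0.9) = 10, so paying 3 once for overtime is optimal and v("busy") = 3.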
