
Operations Research

Chapter 3: Dynamic Programming

Sonia REBAI
Tunis Business School
University of Tunis
Introduction
✓ Dynamic programming (DP) is a recursive optimization approach for making
interdependent, sequential decisions.

✓ A recursive optimization approach optimizes over a number of steps so that
each step provides the next one with the information it needs.

✓ Unlike LP, DP has no standard mathematical formulation that leads directly
to a solution.

✓ DP is a solution method that must be adapted to each specific problem.


Introduction - continued
✓ Like "divide & conquer", DP combines the solutions of sub-problems to solve
the global problem.

✓ Unlike "divide & conquer", the sub-problems of DP are not independent.

✓ DP reduces the calculation effort by:
  ▪ Solving the sub-problems in a "bottom-up" fashion
  ▪ Storing in memory the solutions to the sub-problems already encountered
  ▪ Reusing the stored solutions whenever the related sub-problems come up
    again.
Foundations of DP
The basic features that characterize DP problems are:

✓ The problem can be divided into sub-problems, also called stages.

✓ Each stage has a number of states.

✓ A policy decision is identified recursively at each stage.

✓ The effect of the policy decision at each stage is to transform the current
state into a state associated with the next stage.


Foundations of DP - continued
✓ Given the current state, an optimal policy for the remaining stages is
independent of the policy decisions adopted in previous stages.

✓ Two calculation approaches may be adopted:

  ▪ Backward method: we start from the last stage and work back to the first.

  ▪ Forward method: we start from the first stage and move on to the last.
Consider an n-step sequential decision problem.
✓ We decompose the problem into n steps, each corresponding to a particular
decision.
✓ Each step feeds the next one: the output of one step serves as the input to
the next step.

[Diagram: a step receives an input and a decision, and produces an output and a result.]
Backward approach
✓ Xi: the decision at step i

✓ Si: the input, or state of the system, at step i

✓ Ri(Si,Xi): the immediate result of decision Xi given that the state is Si

✓ Fi(Si,Xi): the cumulative result of steps i to n given that, at step i,
decision Xi is chosen and the state is Si

In the backward approach, calculations are made from step n back to step 1.

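[Diagram: chain of steps 1, ..., n-1, n. Step i takes the state Si and the decision Xi, returns the immediate result Ri(Si,Xi), and hands the state Si+1 to step i+1. The cumulative results are F1(S1,X1), ..., Fn-1(Sn-1,Xn-1) and Fn(Sn,Xn) = Rn(Sn,Xn).]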


Example 1: Shortest path problem
We want to move from a city A to a city H. Several paths are possible.

Determine the shortest path.


[Figure: network of cities from A to H with arc lengths A-B 5, A-C 3, A-D 4, B-E 6, B-F 2, C-F 7, C-G 4, D-G 5, E-H 4, F-H 1, G-H 3.]
Example 1: Shortest path problem - continued
We denote by:
✓ Xi: the destination city at step i (i = 1, ..., 3).
✓ Si: the departure city at step i (i = 1, ..., 3).
✓ Ri(Si,Xi): the distance between cities Si and Xi (i = 1, ..., 3).
✓ Fi(Si,Xi): the distance traveled from city Si to city H given that Xi is the
destination at step i.
✓ Xi*: the destination that minimizes the distance Fi(Si,Xi).
✓ Fi*(Si) = Min over Xi of Fi(Si,Xi) = Fi(Si,Xi*).
Step 3: F3(S3,X3) = R3(S3,X3)
(the last two columns give the optimal decision)

S3 \ X3 | H | F3*(S3) | X3*
E       | 4 | 4       | H
F       | 1 | 1       | H
G       | 3 | 3       | H

Step 2: F2(S2,X2) = R2(S2,X2) + F3*(X2)

S2 \ X2 | E          | F         | G         | F2*(S2) | X2*
B       | 6 + 4 = 10 | 2 + 1 = 3 | -         | 3       | F
C       | -          | 7 + 1 = 8 | 4 + 3 = 7 | 7       | G
D       | -          | -         | 5 + 3 = 8 | 8       | G

Step 1: F1(S1,X1) = R1(S1,X1) + F2*(X1)

S1 \ X1 | B         | C          | D          | F1*(S1) | X1*
A       | 5 + 3 = 8 | 3 + 7 = 10 | 4 + 8 = 12 | 8       | B
Thus, the shortest path linking A to H is A-B-F-H with length 8.

The recursive function is written as follows:

Fi(Si,Xi) = Ri(Si,Xi) + Fi+1*(Xi) (i = 1,2)

F3(S3,X3) = R3(S3,X3)
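
The backward recursion above can be sketched in a few lines of Python. This is only an illustrative sketch: the names `ARCS`, `STAGES`, and `backward_shortest_path` are assumptions made for this example, and the arc lengths are the ones read from the tables of Example 1. Setting F*(H) = 0 at the destination makes step 3 reduce to F3(S3,X3) = R3(S3,X3).

```python
# Arc lengths R(S, X) taken from the tables of Example 1.
ARCS = {
    "A": {"B": 5, "C": 3, "D": 4},
    "B": {"E": 6, "F": 2},
    "C": {"F": 7, "G": 4},
    "D": {"G": 5},
    "E": {"H": 4},
    "F": {"H": 1},
    "G": {"H": 3},
}
# Departure cities S1, S2, S3 (hypothetical layout chosen for this sketch).
STAGES = [["A"], ["B", "C", "D"], ["E", "F", "G"]]


def backward_shortest_path(arcs, stages, target):
    """Return (length, path) by solving the stages from last to first."""
    best = {target: 0}   # F*(target) = 0: nothing left to travel
    choice = {}          # X*(S): optimal destination when leaving city S
    for cities in reversed(stages):
        for s in cities:
            # F_i(S_i, X_i) = R_i(S_i, X_i) + F*_{i+1}(X_i) for each arc out of s
            totals = {x: r + best[x] for x, r in arcs[s].items()}
            choice[s] = min(totals, key=totals.get)
            best[s] = totals[choice[s]]
    # Follow the stored optimal decisions from the first city to the target.
    path = [stages[0][0]]
    while path[-1] != target:
        path.append(choice[path[-1]])
    return best[path[0]], path


length, path = backward_shortest_path(ARCS, STAGES, "H")
print(length, "-".join(path))
```

Storing `best` and `choice` per city is what lets each stage reuse the results of the stage solved just before it instead of re-enumerating whole paths.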
Forward approach
✓ Xi: the decision at step i

✓ Si: the state of the system after step i

✓ Ri(Si,Xi): the immediate result of step i given decision Xi and state Si

✓ Fi(Si,Xi): the cumulative result of steps 1 to i given that, at step i,
decision Xi is made and the state is Si

Unlike the backward approach, the forward method runs from step 1 to step n.

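[Diagram: chain of steps 1, 2, ..., n. Step i takes the decision Xi, returns the immediate result Ri(Si,Xi), and outputs the state Si, which feeds step i+1. The cumulative results are F1(S1,X1) = R1(S1,X1), F2(S2,X2), ..., Fn(Sn,Xn).]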


Example 1: Shortest path problem - continued
We denote by:
✓ Xi: the departure city at step i (i = 1, ..., 3).
✓ Si: the destination city at step i (i = 1, ..., 3).
✓ Ri(Si,Xi): the distance between cities Xi and Si.
✓ Fi(Si,Xi): the distance traveled from city A to city Si given that Xi is the
departure city at step i.
✓ Xi*: the departure city that minimizes the distance Fi(Si,Xi).
✓ Fi*(Si) = Min over Xi of Fi(Si,Xi) = Fi(Si,Xi*).
Step 1: F1(S1,X1) = R1(S1,X1)
(the last two columns give the optimal decision)

S1 \ X1 | A | F1*(S1) | X1*
B       | 5 | 5       | A
C       | 3 | 3       | A
D       | 4 | 4       | A

Step 2: F2(S2,X2) = R2(S2,X2) + F1*(X2)

S2 \ X2 | B          | C          | D         | F2*(S2) | X2*
E       | 6 + 5 = 11 | -          | -         | 11      | B
F       | 2 + 5 = 7  | 7 + 3 = 10 | -         | 7       | B
G       | -          | 4 + 3 = 7  | 5 + 4 = 9 | 7       | C

Step 3: F3(S3,X3) = R3(S3,X3) + F2*(X3)

S3 \ X3 | E           | F         | G          | F3*(S3) | X3*
H       | 4 + 11 = 15 | 1 + 7 = 8 | 3 + 7 = 10 | 8       | F
Thus, once again we find the same shortest path linking A to H:

A-B-F-H with length 8.

The recursive function is written as follows:

Fi(Si,Xi) = Ri(Si,Xi) + Fi-1*(Xi) (i = 2, 3)

F1(S1,X1) = R1(S1,X1)
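
The forward recursion can be sketched the same way. Again this is a hedged illustration rather than a reference implementation: `ARCS`, `STAGES`, and `forward_shortest_path` are names assumed for this example, and the arc lengths are those of Example 1. Setting F*(A) = 0 at the origin makes step 1 reduce to F1(S1,X1) = R1(S1,X1).

```python
# Arc lengths taken from the tables of Example 1, stored as origin -> {destination: length}.
ARCS = {
    "A": {"B": 5, "C": 3, "D": 4},
    "B": {"E": 6, "F": 2},
    "C": {"F": 7, "G": 4},
    "D": {"G": 5},
    "E": {"H": 4},
    "F": {"H": 1},
    "G": {"H": 3},
}
# Destination cities S1, S2, S3 (hypothetical layout chosen for this sketch).
STAGES = [["B", "C", "D"], ["E", "F", "G"], ["H"]]


def forward_shortest_path(arcs, stages, source):
    """Return (length, path) by solving the stages from first to last."""
    best = {source: 0}   # F*(source) = 0: nothing traveled yet
    choice = {}          # X*(S): optimal departure city when arriving at S
    for cities in stages:
        for s in cities:
            # F_i(S_i, X_i) = R_i(S_i, X_i) + F*_{i-1}(X_i) for each arc into s
            totals = {
                x: out[s] + best[x]
                for x, out in arcs.items()
                if s in out and x in best
            }
            choice[s] = min(totals, key=totals.get)
            best[s] = totals[choice[s]]
    # Walk back from the final city to the source, then reverse.
    target = stages[-1][0]
    path = [target]
    while path[-1] != source:
        path.append(choice[path[-1]])
    path.reverse()
    return best[target], path


length, path = forward_shortest_path(ARCS, STAGES, "A")
print(length, "-".join(path))
```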
Which method to use?
✓ The method to adopt depends on what is known about the initial and final
states.
✓ If we know the initial state but not the final state, we use the backward
method.
✓ If we know the final state but not the initial state, we use the forward
method.
✓ If both states are known, either method applies.
Keep in mind
DP Characteristics
✓ The problem can be decomposed into a number of steps.

✓ At each step, a number of candidate states may apply.

✓ For each step and each corresponding state, we identify the possible
decisions that can be made.

✓ We use a recursive formula so that at least one decision is retained for
each state.


DP Characteristics - continued
✓ The recursive formula combines the immediate consequence of the decision,
Ri(Si,Xi), with the best cumulative result over the steps already solved:
from the first step to the previous one in a forward approach, F*i-1(Si-1),
and from the last step back to the next one in a backward approach,
F*i+1(Si+1).

✓ The general form of the recursive formula is:

Fi(Si,Xi) = f(direct result, optimal cumulative result over the steps already solved)
DP Characteristics - continued
More precisely:

Fi(Si,Xi) = g(Ri(Si,Xi), F*i-1(Si-1)) (forward)

Fi(Si,Xi) = g(Ri(Si,Xi), F*i+1(Si+1)) (backward)

✓ The function g may be additive, multiplicative, or of another form.

✓ The state vectors Si-1 and Si+1 are expressed in terms of Si and Xi.

✓ DP relies on the principle of optimality, or Bellman's principle: any
sub-policy of an optimal policy is itself optimal.
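
As a compact restatement, the two recursions can be written with an explicit state transition. The symbols t_i (the transition that expresses the neighboring state in terms of Si and Xi) and opt (Min or Max, depending on the problem) are notation assumed here for illustration; they do not appear in the slides.

```latex
% Backward recursion: solve steps n, n-1, ..., 1
\[
F_i^*(S_i) = \operatorname*{opt}_{X_i}\; g\bigl(R_i(S_i, X_i),\; F_{i+1}^*(S_{i+1})\bigr),
\qquad S_{i+1} = t_i(S_i, X_i), \qquad i = n-1, \dots, 1,
\]
\[
F_n(S_n, X_n) = R_n(S_n, X_n).
\]

% Forward recursion: solve steps 1, 2, ..., n
\[
F_i^*(S_i) = \operatorname*{opt}_{X_i}\; g\bigl(R_i(S_i, X_i),\; F_{i-1}^*(S_{i-1})\bigr),
\qquad S_{i-1} = t_i(S_i, X_i), \qquad i = 2, \dots, n,
\]
\[
F_1(S_1, X_1) = R_1(S_1, X_1).
\]
```

In the backward treatment of Example 1, for instance, g is additive, opt is Min, and the transition is simply t_i(S_i, X_i) = X_i, since the destination of one step is the departure city of the next.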
Example 2 : Budget Allocation
An industrial firm with a budget of 60,000 TD must allocate the entire budget
among its three plants in Tunis, Sousse, and Sfax. No plant can receive more
than 40,000 TD, and all amounts must be multiples of 10,000 TD. The expected
revenues for each level of investment, in thousands of TD, are given below:

Expected revenues (thousands of TD)

Investment | Tunis | Sousse | Sfax
0          | 0     | 0      | 0
10         | 30    | 45     | 35
20         | 50    | 60     | 75
30         | 90    | 70     | 95
40         | 100   | 90     | 110
Example 2 - continued
Given that the Tunis and Sfax plants must each receive at least 10,000 TD,
determine the optimal budget allocation.
✓ Xi: the budget allocated to plant i; i = 1 for Tunis, 2 for Sousse, and 3
for Sfax.
✓ Si: the budget still available for plants i, i+1, ..., 3 before the decision
for plant i is made (i = 1, 2, 3).
✓ Ri(Si,Xi): the revenue obtained from plant i when it is allocated Xi out of
the budget Si available for plants i, i+1, ..., 3 (i = 1, 2, 3).
Example 2 - continued
✓ Fi(Si,Xi): the maximum cumulative revenue of plants i, ..., 3 given that
these plants share a total budget of Si and that plant i is allocated Xi
(i = 1, 2, 3).

✓ Xi*: the budget allocated to plant i that maximizes the revenue Fi(Si,Xi).

✓ Fi*(Si) = Max over Xi of Fi(Si,Xi) = Fi(Si,Xi*).


Step 3 (Sfax): F3(S3,X3) = R3(S3,X3)
(the last two columns give the optimal decision)

S3 \ X3 | 10 | 20 | 30 | 40  | F3*(S3) | X3*
10      | 35 | -  | -  | -   | 35      | 10
20      | -  | 75 | -  | -   | 75      | 20
30      | -  | -  | 95 | -   | 95      | 30
40      | -  | -  | -  | 110 | 110     | 40

Step 2 (Sousse): F2(S2,X2) = R2(S2,X2) + F3*(S2-X2)

S2 \ X2 | 0             | 10             | 20            | 30            | 40            | F2*(S2) | X2*
20      | 0 + 75 = 75   | 45 + 35 = 80   | -             | -             | -             | 80      | 10
30      | 0 + 95 = 95   | 45 + 75 = 120  | 60 + 35 = 95  | -             | -             | 120     | 10
40      | 0 + 110 = 110 | 45 + 95 = 140  | 60 + 75 = 135 | 70 + 35 = 105 | -             | 140     | 10
50      | -             | 45 + 110 = 155 | 60 + 95 = 155 | 70 + 75 = 145 | 90 + 35 = 125 | 155     | 10 or 20

Step 1 (Tunis): F1(S1,X1) = R1(S1,X1) + F2*(S1-X1)

S1 \ X1 | 10             | 20             | 30             | 40             | F1*(S1) | X1*
60      | 30 + 155 = 185 | 50 + 140 = 190 | 90 + 120 = 210 | 100 + 80 = 180 | 210     | 30
Example 2 - continued
Consequently, the optimal allocation is 30,000 TD for the Tunis plant,
10,000 TD for the Sousse plant, and 20,000 TD for the Sfax plant. The total
revenue is 210,000 TD.

The recursive function is written as follows:

Fi(Si,Xi) = Ri(Si,Xi) + Fi+1*(Si - Xi) (i = 1, 2)

F3(S3,X3) = R3(S3,X3)
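
The recursion of Example 2 can be checked with a short Python sketch. The names `REVENUE`, `MINIMUM`, `PLANTS`, and `allocate` are assumptions made for this illustration. The requirement that the whole budget be spent is modeled here by a terminal stage that accepts only a leftover of 0, which is equivalent to forcing X3 = S3 in step 3.

```python
# Expected revenues in thousands of TD, from the table of Example 2.
REVENUE = {
    "Tunis":  {0: 0, 10: 30, 20: 50, 30: 90, 40: 100},
    "Sousse": {0: 0, 10: 45, 20: 60, 30: 70, 40: 90},
    "Sfax":   {0: 0, 10: 35, 20: 75, 30: 95, 40: 110},
}
MINIMUM = {"Tunis": 10, "Sousse": 0, "Sfax": 10}  # minimum allocation per plant
PLANTS = ["Tunis", "Sousse", "Sfax"]              # plants 1, 2, 3
BUDGET, STEP, CAP = 60, 10, 40                    # thousands of TD


def allocate(budget=BUDGET):
    """Return (max revenue, allocation per plant) by backward recursion."""
    n = len(PLANTS)
    best = [dict() for _ in range(n)] + [{0: 0}]  # terminal stage: leftover must be 0
    choice = [dict() for _ in range(n)]
    for i in range(n - 1, -1, -1):
        plant = PLANTS[i]
        for s in range(0, budget + STEP, STEP):
            options = {}
            for x in range(MINIMUM[plant], min(CAP, s) + STEP, STEP):
                if s - x in best[i + 1]:
                    # F_i(S_i, X_i) = R_i(S_i, X_i) + F*_{i+1}(S_i - X_i)
                    options[x] = REVENUE[plant][x] + best[i + 1][s - x]
            if options:  # states with no feasible decision are simply skipped
                choice[i][s] = max(options, key=options.get)
                best[i][s] = options[choice[i][s]]
    # Read the optimal allocation off the stored decisions.
    plan, s = {}, budget
    for i, plant in enumerate(PLANTS):
        plan[plant] = choice[i][s]
        s -= plan[plant]
    return best[0][budget], plan


revenue, plan = allocate()
print(revenue, plan)
```

The sketch evaluates a few states that the hand tables omit (for example S2 = 10 or S2 = 60) because it loops over every multiple of 10,000 TD; these extra states do not affect the result for S1 = 60.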


Example 3 : Production Planning Problem
We want to determine the production levels of a certain product over the next
4 months. A production run involves a fixed cost of 3 TD and a variable cost
of 1 TD per unit. At the end of each month, every unit left in stock incurs a
holding cost of 0.5 TD. In any month, the production capacity is 4 units and
the storage capacity is 2 units. The demand for the next 4 months is 1, 3, 2,
and 4 units, respectively. Given that the initial stock is empty, determine
the optimal production plan.


Example 3 - continued
✓ n = 4

✓ We use the backward method since the initial stock is known.

✓ Xi: the quantity to produce in month i (i = 1, ..., 4)

✓ Si: the stock level at the start of month i (i = 1, ..., 4)

✓ Ri(Si,Xi) = cost in month i (i = 1, ..., 4)
= production cost + holding cost
= production cost of Xi units + holding cost of (Si + Xi - Di) units
Example 3 - continued
where Di is the demand of month i (i = 1, ..., 4)

Ri(Si,Xi) = (3 + 1*Xi) + 0.5*(Si + Xi - Di)   if Xi ≠ 0
Ri(Si,Xi) = 0.5*(Si + Xi - Di)                if Xi = 0

✓ Fi(Si,Xi) = total minimum cost for months i, i+1, ..., 4 given that, at the
start of month i, the stock level is Si and Xi units are to be produced
(i = 1, ..., 4).
Example 3 - continued

✓ Fi*(Si) = Min over Xi of Fi(Si,Xi) (i = 1, ..., 4)

✓ Fi(Si,Xi) = g(Ri(Si,Xi), F*i+1(Si+1))
= g(Ri(Si,Xi), F*i+1(Si + Xi - Di))
= Ri(Si,Xi) + F*i+1(Si + Xi - Di) (i = 1, ..., 3)

✓ F4(S4,X4) = R4(S4,X4)
Step 4 (month 4), D4 = 4: F4(S4,X4) = R4(S4,X4)
(the last two columns give the optimal decision)

S4 \ X4 | 2             | 3                 | 4                 | F4*(S4) | X4*
0       | -             | -                 | 3 + 4 + 0 = 7     | 7       | 4
1       | -             | 3 + 3 + 0 = 6     | 3 + 4 + 0.5 = 7.5 | 6       | 3
2       | 3 + 2 + 0 = 5 | 3 + 3 + 0.5 = 6.5 | 3 + 4 + 1 = 8     | 5       | 2

Step 3 (month 3, followed by month 4), D3 = 2: F3(S3,X3) = R3(S3,X3) + F4*(S3 + X3 - D3)

S3 \ X3 | 0             | 1                      | 2                      | 3                      | 4                  | F3*(S3) | X3*
0       | -             | -                      | 3 + 2 + 0 + 7 = 12     | 3 + 3 + 0.5 + 6 = 12.5 | 3 + 4 + 1 + 5 = 13 | 12      | 2
1       | -             | 3 + 1 + 0 + 7 = 11     | 3 + 2 + 0.5 + 6 = 11.5 | 3 + 3 + 1 + 5 = 12     | -                  | 11      | 1
2       | 0 + 0 + 7 = 7 | 3 + 1 + 0.5 + 6 = 10.5 | 3 + 2 + 1 + 5 = 11     | -                      | -                  | 7       | 0

Step 2 (month 2, followed by months 3-4), D2 = 3: F2(S2,X2) = R2(S2,X2) + F3*(S2 + X2 - D2)

S2 \ X2 | 1                   | 2                       | 3                       | 4                       | F2*(S2) | X2*
0       | -                   | -                       | 3 + 3 + 0 + 12 = 18     | 3 + 4 + 0.5 + 11 = 18.5 | 18      | 3
1       | -                   | 3 + 2 + 0 + 12 = 17     | 3 + 3 + 0.5 + 11 = 17.5 | 3 + 4 + 1 + 7 = 15      | 15      | 4
2       | 3 + 1 + 0 + 12 = 16 | 3 + 2 + 0.5 + 11 = 16.5 | 3 + 3 + 1 + 7 = 14      | -                       | 14      | 3

Step 1 (month 1, followed by months 2-4), D1 = 1: F1(S1,X1) = R1(S1,X1) + F2*(S1 + X1 - D1)

S1 \ X1 | 1                   | 2                       | 3                   | F1*(S1) | X1*
0       | 3 + 1 + 0 + 18 = 22 | 3 + 2 + 0.5 + 15 = 20.5 | 3 + 3 + 1 + 14 = 21 | 20.5    | 2

Hence, the optimal production plan is X1* = 2, X2* = 4, X3* = 0, X4* = 4, with
a minimum cost of 20.5 TD.
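
The production-planning recursion can be reproduced with the following Python sketch. The function and constant names are assumptions made for this illustration. The terminal stage assigns zero further cost to any leftover stock within the storage capacity, which matches the step 4 table, where month 4 may end with 0, 1, or 2 units in stock.

```python
DEMAND = [1, 3, 2, 4]                 # D1, ..., D4
FIXED_COST, UNIT_COST, HOLDING_COST = 3, 1, 0.5
MAX_PRODUCTION, MAX_STOCK = 4, 2


def month_cost(stock, produced, demand):
    """R_i(S_i, X_i): setup cost (if producing) + variable cost + holding cost."""
    setup = FIXED_COST if produced > 0 else 0
    return setup + UNIT_COST * produced + HOLDING_COST * (stock + produced - demand)


def plan_production(demand=DEMAND, initial_stock=0):
    """Return (minimum cost, production plan) by backward recursion."""
    n = len(demand)
    # best[i][s] = F*_i(s); best[n] is the terminal stage (no cost after month n).
    best = [dict() for _ in range(n)] + [{s: 0.0 for s in range(MAX_STOCK + 1)}]
    choice = [dict() for _ in range(n)]
    for i in range(n - 1, -1, -1):
        for s in range(MAX_STOCK + 1):
            options = {}
            for x in range(MAX_PRODUCTION + 1):
                end_stock = s + x - demand[i]
                # Demand must be met and the storage capacity respected.
                if 0 <= end_stock <= MAX_STOCK and end_stock in best[i + 1]:
                    # F_i(S_i, X_i) = R_i(S_i, X_i) + F*_{i+1}(S_i + X_i - D_i)
                    options[x] = month_cost(s, x, demand[i]) + best[i + 1][end_stock]
            if options:
                choice[i][s] = min(options, key=options.get)
                best[i][s] = options[choice[i][s]]
    # Roll the stock forward from the known initial state to read off the plan.
    plan, s = [], initial_stock
    for i in range(n):
        x = choice[i][s]
        plan.append(x)
        s += x - demand[i]
    return best[0][initial_stock], plan


cost, plan = plan_production()
print(cost, plan)
```

The final loop illustrates why the backward method suits a known initial state: the tables are built from month 4 back to month 1, and the plan is then recovered by starting from the known stock S1 = 0.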
