My dear brothers, colleagues, and students,

Peace be upon you, and the mercy and blessings of God.

By the grace of God Almighty, I present to you an important lecture on "Dynamic Programming: Theory and Applications". God willing, it will be explained at the earliest possible opportunity. I ask God to accept it, and I hope it will be of benefit to everyone in teaching and research. He is our Protector, the best Protector and the best Helper. May God's blessings and peace be upon His beloved and ours, our master Muhammad, our noble Messenger, the adornment of the righteous and the best of the messengers, our example and role model, the best and greatest among us, our leader, our teacher, and our intercessor, and upon all his family and companions.

Dynamic Programming
(Theory and Applications)

By: Prof. Dr / M. Osman

Dynamic programming is not a mathematical discipline in its own right; it is a method
for solving optimization problems that share certain specific features. These features, as
presented in Hillier and Lieberman [3], are:

1- The problem can be divided into stages with a policy decision required at each
stage
2- Each stage has a number of states associated with it
3- The effect of the policy decision at each stage is to transform the current state
into a state associated with the next stage (possibly according to a probability
distribution)
4- Given the current state, an optimal policy for the remaining stages is independent
of the policies adopted in previous stages. This feature is known as the
principle of optimality, presented by Richard Bellman [1]
5- The solution procedure begins by finding the optimal policy for each state of the
last stage
6- A recursive relationship is available that identifies the optimal policy for each
state at stage $n$, given the optimal policy for each state at stage $n + 1$
7- Using this recursive relationship, the solution procedure moves backward stage
by stage each time finding the optimal policy for each state of that stage, until it
finds the optimal policy when starting at the initial stage.

Example 1: (Travelling Salesman problem)


A good prototype example that illustrates the features and introduces the terminology
of dynamic programming is the stagecoach problem presented in the same
reference [3], which can be stated simply as follows (see Fig. 1). A traveling salesman has
to travel from a certain town (1) to the final destination (10), passing through three
intermediate stages. The first stage has three possible stops (2), (3), (4); the second has
three possible stops (5), (6), (7); and the third has two possible stops (8), (9). The salesman
wants to travel along the shortest route, so we need to find the route that
minimizes the total distance. The distance of the stagecoach run from state $i$ to
state $j$, denoted by $d_{ij}$, is given in Table 1.

Distances $d_{ij}$ from state $i$ (row) to state $j$ (column):

 i \ j    2    3    4
   1      2    4    3

 i \ j    5    6    7
   2      7    4    6
   3      3    2    4
   4      4    1    5

 i \ j    8    9
   5      1    4
   6      6    3
   7      3    3

 i \ j   10
   8      3
   9      4

Table 1

[Fig. 1: route network from town 1 to destination 10 through the stops 2, 3, 4, then 5, 6, 7, then 8, 9]
Solution:
Let the decision variables $x_n$ ($n = 1, 2, 3, 4$) be the immediate destination on
stage $n$. Thus, the route selected would be $1 \to x_1 \to x_2 \to x_3 \to x_4$, where
$x_4 = 10$.

Let $f_n(s, x_n)$ be the total distance of the best overall policy for the remaining
stages, given that the salesman is in state $s$, ready to start stage $n$, and selects $x_n$ as
the immediate destination. Given $s$ and $n$, let $x_n^*$ denote the value of $x_n$ that
minimizes $f_n(s, x_n)$, and let $f_n^*(s)$ be the corresponding minimum value
of $f_n(s, x_n)$. Then $f_n^*(s) = \min_{x_n} f_n(s, x_n) = f_n(s, x_n^*)$. The objective is to
find $f_1^*(1)$ and the corresponding policy. Dynamic programming does this by successively
finding $f_4^*(s)$, $f_3^*(s)$, $f_2^*(s)$ and then $f_1^*(1)$. In this way we obtain
Tables 2, 3, 4, and 5, which give the solutions to the one-stage, two-stage,
three-stage, and four-stage problems, respectively.
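
The recursion behind these tables is not written out explicitly in the text above. A plausible reading, following the stagecoach formulation in [3] and using the notation $d_{ij}$, $f_n$, $f_n^*$ already introduced, is the following (the worked line uses the distances of Table 1):

```latex
\begin{align*}
f_4^*(s) &= d_{s,10}, \\
f_n(s, x_n) &= d_{s x_n} + f_{n+1}^*(x_n), \qquad n = 3, 2, 1, \\
f_n^*(s) &= \min_{x_n} f_n(s, x_n). \\[4pt]
\text{For example: } f_3(5, 8) &= d_{58} + f_4^*(8) = 1 + 3 = 4, \qquad
f_3(5, 9) = d_{59} + f_4^*(9) = 4 + 4 = 8,
\end{align*}
```

so that $f_3^*(5) = 4$ with $x_3^* = 8$.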

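   s    f_4^*(s)    x_4^*
   8       3         10
   9       4         10

Table 2

$f_3(s, x_3) = d_{s x_3} + f_4^*(x_3)$

   s    x_3 = 8    x_3 = 9    f_3^*(s)    x_3^*
   5       4          8          4          8
   6       9          7          7          9
   7       6          7          6          8

Table 3

$f_2(s, x_2) = d_{s x_2} + f_3^*(x_2)$

   s    x_2 = 5    x_2 = 6    x_2 = 7    f_2^*(s)    x_2^*
   2      11         11         12         11        5, 6
   3       7          9         10          7        5
   4       8          8         11          8        5, 6

Table 4

$f_1(s, x_1) = d_{s x_1} + f_2^*(x_1)$

   s    x_1 = 2    x_1 = 3    x_1 = 4    f_1^*(s)    x_1^*
   1      13         11         11         11        3, 4

Table 5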

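Thus, the optimal solution is any one of the routes
$1 \to 3 \to 5 \to 8 \to 10$, or $1 \to 4 \to 5 \to 8 \to 10$, or $1 \to 4 \to 6 \to 9 \to 10$,
with optimum value (total distance) of $f_1^*(1) = 11$.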
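
The backward computation can be checked with a short Python sketch. The distances are transcribed from Table 1; the names `DIST`, `STAGES`, `backward_dp`, and `optimal_routes` are illustrative choices, not part of the original lecture.

```python
# Distances d_ij transcribed from Table 1 (node i -> node j).
DIST = {
    1: {2: 2, 3: 4, 4: 3},
    2: {5: 7, 6: 4, 7: 6},
    3: {5: 3, 6: 2, 7: 4},
    4: {5: 4, 6: 1, 7: 5},
    5: {8: 1, 9: 4},
    6: {8: 6, 9: 3},
    7: {8: 3, 9: 3},
    8: {10: 3},
    9: {10: 4},
}
# States available at the start of stages 1, 2, 3, 4.
STAGES = [[1], [2, 3, 4], [5, 6, 7], [8, 9]]


def backward_dp(dist, stages, goal=10):
    """Return f_star[s] (shortest distance from s to goal) and the optimal next stops."""
    f_star = {goal: 0}
    best_next = {}
    for states in reversed(stages):          # stage 4 first, stage 1 last
        for s in states:
            totals = {x: d + f_star[x] for x, d in dist[s].items()}
            f_star[s] = min(totals.values())
            best_next[s] = sorted(x for x, t in totals.items() if t == f_star[s])
    return f_star, best_next


def optimal_routes(best_next, start=1, goal=10):
    """Expand the tables of optimal decisions into every optimal route."""
    if start == goal:
        return [[goal]]
    return [[start] + tail
            for x in best_next[start]
            for tail in optimal_routes(best_next, x, goal)]


f_star, best_next = backward_dp(DIST, STAGES)
print(f_star[1])                    # 11
print(optimal_routes(best_next))    # the three routes listed above
```

Each pass of the outer loop corresponds to one of Tables 2 to 5: `f_star` holds the column $f_n^*(s)$ and `best_next` the column $x_n^*$.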

From the presentation of the previous example and its solution, the following
remarks could be deduced keeping in mind that these remarks are still valid for any
other example which can be solved using the dynamic programming approach.

Remarks 1:
1) If at each stage the salesman simply chooses the nearest next stop, starting from
node 1, we get the route $1 \to 2 \to 6 \to 9 \to 10$ with total distance 13, which is
greater than the optimum value of 11. This means that a decision at any stage of a
given problem must be taken with the whole problem in view, even though the
solution is then applied stage by stage.
2) The method used for the solution is called backward computation.
However, the same problem could be solved by what is called forward
computation. In this case, the computation starts from node 1 and proceeds until
node 10 is reached.
3) Each problem has its own recurrence relation. In general, gaining experience in
defining the recurrence relation for a given example requires solving several
problems, and even then it is possible to face difficulties in defining the recurrence
relation for a given problem.
4) The dynamic programming approach to solving any problem is parametric
by nature. For example, in the previous problem, if we want to end at
node 8, the optimal path will be either the route $1 \to 3 \to 5 \to 8$ or
$1 \to 4 \to 5 \to 8$, with optimum distance 8.

Also, if we want to start from a place before node 1, additional tables are
needed, but the existing ones are reused as they are. In addition, if we want to move beyond
node 10, forward computation is recommended.
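
Remarks 1, 2, and 4 can be illustrated with a small sketch, again using the distances of Table 1. The function names `forward_dp` and `greedy_route` are hypothetical, and the forward pass relies on the assumption that nodes are numbered in stage order, as they are in this example.

```python
# Distances d_ij transcribed from Table 1 (node i -> node j).
DIST = {
    1: {2: 2, 3: 4, 4: 3},
    2: {5: 7, 6: 4, 7: 6},
    3: {5: 3, 6: 2, 7: 4},
    4: {5: 4, 6: 1, 7: 5},
    5: {8: 1, 9: 4},
    6: {8: 6, 9: 3},
    7: {8: 3, 9: 3},
    8: {10: 3},
    9: {10: 4},
}


def forward_dp(dist, start=1):
    """Forward computation: shortest distance from start to every node."""
    g = {start: 0}
    pred = {}
    for s in sorted(dist):              # assumes node numbers follow stage order
        for x, d in dist[s].items():
            if x not in g or g[s] + d < g[x]:
                g[x] = g[s] + d
                pred[x] = s             # one optimal predecessor of x
    return g, pred


def greedy_route(dist, start=1, goal=10):
    """Myopic policy: always move to the nearest next stop."""
    route, total, s = [start], 0, start
    while s != goal:
        x = min(dist[s], key=dist[s].get)
        total += dist[s][x]
        route.append(x)
        s = x
    return route, total


g, pred = forward_dp(DIST)
print(g[10])                  # 11, same optimum as the backward computation
print(g[8])                   # 8, shortest distance when the trip ends at node 8
print(greedy_route(DIST))     # ([1, 2, 6, 9, 10], 13)
```

The forward tables give the best distance to every intermediate node at once, which is why ending the trip at node 8 needs no new computation.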

Example 2: (Linear Programming problem with two variables)


Use the dynamic programming approach to solve the following linear programming problem:
,
subject to

Solution:
This problem is treated as a two-stage problem, and the states are the values of the
slack variables, which are continuous.

The second stage problem:


( ) ,
subject to
,
,

i.e.
,
subject to
( ).
The solution is obtained directly as:
( ) , then ( ) ( ),
if ( ) , we get , i.e. ,
Otherwise ( ) if

The first stage problem:
( ) ( ( ))
( ( ))
subject to

i.e.
( ( )
Subject to

Since , therefore,
( )
and the first stage problem takes the form
( )
subject to

i.e.,
( ),
subject to

Therefore, ( ) .
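
The two-stage idea can be sketched in Python on a small linear program. The data below are an illustrative assumption, not the data of Example 2: maximize $2x_1 + 3x_2$ subject to $x_1 + x_2 \le 4$, $x_1 + 3x_2 \le 6$, $x_1, x_2 \ge 0$. The function name `solve_two_var_lp` is hypothetical, and the sketch assumes two constraints with strictly positive coefficients. The second stage is solved in closed form as a function of the slack state; the first stage objective is then concave and piecewise linear in $x_1$, so only the end points and the single kink need to be compared.

```python
from fractions import Fraction as F


def solve_two_var_lp(c, A, b):
    """Two-stage DP for: max c1*x1 + c2*x2, A x <= b, x >= 0 (all data > 0)."""
    (a11, a12), (a21, a22) = A
    c1, c2 = map(F, c)
    b1, b2 = map(F, b)

    def stage2(s1, s2):
        # Second stage: x2 is limited by whichever slack runs out first.
        x2 = min(s1 / a12, s2 / a22)
        return c2 * x2, x2

    def stage1_value(x1):
        # First stage: the states passed on are the slacks left after choosing x1.
        value2, x2 = stage2(b1 - a11 * x1, b2 - a21 * x1)
        return c1 * x1 + value2, x1, x2

    upper = min(b1 / a11, b2 / a21)         # largest feasible x1
    candidates = {F(0), upper}
    denom = a21 * a12 - a11 * a22
    if denom != 0:
        kink = (b2 * a12 - b1 * a22) / denom    # both slacks bind x2 equally here
        if 0 <= kink <= upper:
            candidates.add(kink)
    return max(stage1_value(x1) for x1 in candidates)


value, x1, x2 = solve_two_var_lp(c=(2, 3), A=((1, 1), (1, 3)), b=(4, 6))
print(value, x1, x2)    # 9 3 1
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the comparison of candidate points is not affected by rounding.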

Example 3: (Linear programming problem with three variables)


Use dynamic programming approach to solve the following linear programming
problem.

subject to

Solution
This problem is treated as a three-stage problem, and the states are the values of the
slack variables, which are continuous.

The third stage problem:
( ) ,
subject to

i.e.
( ),
subject to
( )
The solution is obtained directly as:
( )
( ) ( )
if ( ) , we get , and
otherwise, ( ) when .
Now, we will have two situations:
(i)

In this case
( )

The second stage problem:


( ) ( ),
subject to

i.e.,
( )
subject to

( )
We will have two cases:
either ( ) , i.e., ,
or ( ) , i.e.,
(a) If , then the second stage problem takes the form:
( )
subject to

The problem is feasible when ( )


(b) If , then the second stage problem takes the form
( )
subject to

The problem is feasible if , therefore, for ,


( )
Therefore, in either cases, if ( )
(ii)
In this case, ( )

The second stage problem:


( ),
subject to
( )
i.e.
( )
subject to
( )
We will have also two cases,
(a) , then the second stage problem takes the form
( )
subject to

The problem is feasible when ( ) .
(b) , then the second stage problem takes the form
( )
subject to

The problem is feasible when therefore, for ,


( ) .
Therefore, in either cases if , then ( ) .
The first stage problem:
( ) ( )
subject to
,
i.e.,
( ( ))
subject to

Return to the two cases…


(i) , the first stage problem takes the form (for )
( )
subject to

i.e.,
( )
subject to

from which ( ) .
(ii) , the first stage problem takes the form (for ,)
( )
subject to

i.e.,
( ),
subject to

from which ( ) .
For case (i):
with optimum value ( ) which is matching
with
For case (ii):
and this is not matched with the condition , and
therefore, it is rejected. Then, the optimal solution of the problem is:
( )
Remarks 2:
1) The general linear programming problem which can be solved using the
dynamic programming approach must take the form:

subject to

where

In this case, the problem can be considered as an allocation problem for which
the resources are allocated to the activities .
2) To see the advantage of solving linear programming using the dynamic
programming approach consider the following parametric linear programming
problem:
,
Subject to

where . If , it is seen from example 2, that the


optimal solution is ̅ ̅ and the optimal value is ̅ .
In order to obtain the stability set of the first kind, ( ), for this problem,
which is defined by
( ) {( ) ̅ ̅ }.
Let us proceed as follows using dynamic programming approach (See Example
2).

The second stage problem:
( ) ( )

The first stage problem:

( ( ))
Subject to

i.e.,

( ( ))

subject to

In order that ̅ solves this problem, we must have , i.e.,


, and in this case ̅ .
Thus, ( ) *( ) +

Example 4: (Quadratic Programming problem)


Use the dynamic programming approach to solve the following quadratic
programming problem:

subject to

Solution:
The problem has two stages and the states are the slack variable values, which are
continuous.

The second stage problem:


( )
subject to
( )

The solution is clearly ( ) .

The first stage problem:
( ) ( )

subject to

The solution is clearly ( ) .


Therefore, the optimal solution for this problem is , with optimum
value ̅ ( ) .

Remarks 3:
1) The general nonlinear programming problem which can be solved using the
dynamic programming approach must take the form:

( )
subject to

where ( ) is a general nonlinear continuous function,


. In this case, the problem can be considered as an allocation problem
for which the resources are allocated to the activities .
2) To see the advantage of solving such nonlinear programming models using the
dynamic programming approach, consider the following parametric quadratic
programming problem.

subject to

If , it is seen from Example 4 that the optimal solution


is ̅ ̅ , and the optimum value is ̅ .
In order to obtain the stability set of the first kind ( ) for this problem,
which is defined by:
( ) *( ) ̅ ̅ +.
Then we proceed as follows:
The second stage problem:
( )
subject to
( )

In order that the solution should be , we must have , and in this
case ( ) .
The first stage problem:
( ) ( )
subject to
.
In order that the solution should be , it is clear that we must have ,
and in this case ( ) . Therefore,
( ) *( ) +.
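
The allocation view described in Remarks 3 can be sketched for the simplest case of a single resource. Everything in this sketch is an assumption made for illustration: one constraint $x_1 + \dots + x_n \le b$, decisions restricted to integer amounts, and two made-up quadratic return functions. The function name `allocate` is hypothetical.

```python
def allocate(returns, budget):
    """Maximize sum(returns[j](x_j)) with x_j >= 0 integer and sum(x_j) <= budget.

    f[j][s] is the best total return from activities j, j+1, ... when s units
    of the resource are still available (the state).
    """
    n = len(returns)
    f = [[0] * (budget + 1) for _ in range(n + 1)]
    choice = [[0] * (budget + 1) for _ in range(n)]
    for j in range(n - 1, -1, -1):              # last stage first
        for s in range(budget + 1):
            best_x = max(range(s + 1),
                         key=lambda x: returns[j](x) + f[j + 1][s - x])
            choice[j][s] = best_x
            f[j][s] = returns[j](best_x) + f[j + 1][s - best_x]
    # Recover the plan by walking forward through the stored decisions.
    plan, s = [], budget
    for j in range(n):
        plan.append(choice[j][s])
        s -= choice[j][s]
    return f[0][budget], plan


# Hypothetical data: g1(x) = 4x - x^2, g2(x) = 6x - x^2, three units available.
print(allocate([lambda x: 4 * x - x * x, lambda x: 6 * x - x * x], 3))   # (11, [1, 2])
```

Because the table `f[0]` is computed for every value of the state, the optimum for any smaller amount of the resource is available without further work, which is the parametric feature the remarks refer to.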

Example 5: The Transportation Problem [2] (Two sources, and Three destinations)
A firm has two stores $s_1$, $s_2$, which contain 3 and 4 units of a certain commodity,
respectively. It receives orders from three customers $c_1$, $c_2$, and $c_3$ for amounts
of 2, 3, and 2 units, respectively (Fig. 2).
[Fig. 2: stores $s_1$ (3 units) and $s_2$ (4 units) supplying customers $c_1$ (2 units), $c_2$ (3 units), and $c_3$ (2 units)]

The distances between stores and customers, in kilometers, are shown in Table 6.
If the cost of transporting one unit for 1 km is a fixed amount, use the dynamic
programming approach to find the optimal distribution of the commodity between
stores and customers.

 Stores \ Customers    c_1    c_2    c_3
        s_1            100    150    200
        s_2            300    200    100

Table 6

Solution:
The mathematical model for this problem takes the form
( )
subject to

.

This model can be written as:


( )
subject to

Using the dynamic programming approach to solve this problem, the problem has three
stages and the states are the availabilities at the stores, which are discrete. Ignoring
the constant unit cost factor in the objective function, we get the following three-stage problem.

The third stage problem:

( )
( )

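Table 7

[Fig. 3: third stage; store $s_1$ holds $\xi$ units and store $s_2$ holds $2 - \xi$ units, supplying customer $c_3$ (2 units)]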

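The second stage problem:

[Fig. 4: second stage; store $s_1$ holds $\eta$ units and store $s_2$ holds $5 - \eta$ units, supplying customers $c_2$ (3 units) and $c_3$ (2 units)]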
( ) ( )
⁄ ( )

Table 8

The first stage problem:

( ) ( )
⁄ ( )

Table 9

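[Fig. 5: optimal shipments from stores $s_1$ (3 units) and $s_2$ (4 units) to customers $c_1$ (2 units), $c_2$ (3 units), and $c_3$ (2 units)]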

Thus, the optimal solution of the problem is:


with minimum cost
( ) . 𝟐
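
The three-stage computation can be reproduced with a short sketch based only on the data of Table 6 and Fig. 2. The assumptions are mine: the stages are taken in customer order, the state is the number of units still held at $s_1$ (the balance of supply and demand then fixes what $s_2$ holds), and the constant unit cost is dropped, so costs are in unit-kilometers. The name `solve_transport` is hypothetical.

```python
from functools import lru_cache

# Distances from Table 6: DIST[i][j] is the distance from store i to customer j.
DIST = ((100, 150, 200),    # store s1
        (300, 200, 100))    # store s2
DEMAND = (2, 3, 2)          # customers c1, c2, c3


def solve_transport(supply1, supply2, dist=DIST, demand=DEMAND):
    """Return (least total unit-km, shipments from s1, shipments from s2)."""
    if supply1 + supply2 != sum(demand):
        raise ValueError("this sketch assumes total supply equals total demand")
    n = len(demand)

    @lru_cache(maxsize=None)
    def f(j, left1):
        """Best cost for customers j, ..., n-1 when s1 still holds left1 units."""
        if j == n:
            return (0, ()) if left1 == 0 else (float("inf"), ())
        left2 = sum(demand[j:]) - left1         # what s2 must still hold
        best = (float("inf"), ())
        for x1 in range(0, min(demand[j], left1) + 1):
            x2 = demand[j] - x1                 # the rest comes from s2
            if x2 > left2:
                continue
            tail_cost, tail_plan = f(j + 1, left1 - x1)
            cost = dist[0][j] * x1 + dist[1][j] * x2 + tail_cost
            if cost < best[0]:
                best = (cost, (x1,) + tail_plan)
        return best

    cost, from_s1 = f(0, supply1)
    from_s2 = tuple(d - x for d, x in zip(demand, from_s1))
    return cost, from_s1, from_s2


print(solve_transport(3, 4))    # (950, (2, 1, 0), (0, 2, 2))
```

Under these assumptions, $s_1$ sends 2 units to $c_1$ and 1 unit to $c_2$, while $s_2$ sends 2 units to $c_2$ and 2 units to $c_3$, for a total of 950 unit-kilometers (to be multiplied by the unit cost).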

Remarks 4:
1) This problem can be extended to any number of destinations. It is also possible
to handle the case of three sources, in which case we use two state
parameters. Extending the problem to more than three sources, however, is
extremely difficult.

2) The advantage of using the dynamic programming approach to solve such
problems is, as noted before, its general parametric nature. For example, if the
sources lie at equal distances from the firm, and the total quantity of the
commodity can be split in any way between the two stores, we can
determine the best choice by adding several rows to the first stage problem, as
shown in Table 10.

⁄ ( ) ( ) ( )

Table 10

It is clear that the best situation is when $s_1$ contains 5 units and $s_2$ contains 2
units. The solution in this case is:

and the minimum cost is ( ) .
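
The comparison in Remark 2 can be sketched by tabulating the first stage for every possible content of $s_1$, which mirrors the idea of adding rows to the first stage table. The data come from Table 6 and Fig. 2; the tabular layout, the names `stage_tables` and `best_split`, and the omission of the constant unit cost are assumptions of this sketch.

```python
INF = float("inf")

# Distances from Table 6: DIST[i][j] is the distance from store i to customer j.
DIST = ((100, 150, 200),    # store s1
        (300, 200, 100))    # store s2
DEMAND = (2, 3, 2)          # customers c1, c2, c3
TOTAL = sum(DEMAND)         # 7 units to be split between the two stores


def stage_tables():
    """f[j][s]: least unit-km for customers j, ..., n-1 when s1 holds s units."""
    n = len(DEMAND)
    f = [[INF] * (TOTAL + 1) for _ in range(n + 1)]
    f[n][0] = 0                                  # nothing left to ship
    for j in range(n - 1, -1, -1):
        remaining = sum(DEMAND[j:])              # s1 and s2 together hold this much
        for s in range(remaining + 1):
            for x1 in range(0, min(DEMAND[j], s) + 1):
                x2 = DEMAND[j] - x1
                if x2 > remaining - s:           # s2 cannot supply that much
                    continue
                cost = DIST[0][j] * x1 + DIST[1][j] * x2 + f[j + 1][s - x1]
                f[j][s] = min(f[j][s], cost)
    return f


def best_split():
    """Return (units to place in s1, units to place in s2, least unit-km)."""
    first_stage = stage_tables()[0]              # one row per content of s1
    s1 = min(range(TOTAL + 1), key=lambda s: first_stage[s])
    return s1, TOTAL - s1, first_stage[s1]


print(best_split())    # (5, 2, 850)
```

With 5 units in $s_1$ and 2 units in $s_2$, store $s_1$ serves $c_1$ and $c_2$ in full and $s_2$ serves $c_3$, giving 850 unit-kilometers under the stated assumptions, compared with 950 for the original split of 3 and 4 units.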

References
[1] Bellman, R., Dreyfus, S.: Applied Dynamic Programming, Princeton University
Press, Princeton, New Jersey, 1962.
[2] Hadley, G.: Nonlinear and Dynamic Programming, Addison-Wesley Publishing
Company, Inc., Reading, Mass., 1964.
[3] Hillier, F. S., Lieberman, G. J.: Introduction to Operations Research, 9th ed.,
McGraw-Hill Companies, Inc., 2010.
