Dynamic Programming
(Theory and Applications)
The dynamic programming approach rests on the following features:
1- The problem can be divided into stages, with a policy decision required at each
stage.
2- Each stage has a number of states associated with it.
3- The effect of the policy decision at each stage is to transform the current state
into a state associated with the next stage (possibly according to a probability
distribution).
4- Given the current state, an optimal policy for the remaining stages is independent
of the policies adopted in previous stages. This feature is known as the
principle of optimality, presented by Richard Bellman [1].
5- The solution procedure begins by finding the optimal policy for each state of the
last stage.
6- A recursive relationship is available that identifies the optimal policy for each
state at stage n, given the optimal policy for each state at stage n + 1.
7- Using this recursive relationship, the solution procedure moves backward stage
by stage, each time finding the optimal policy for each state of that stage, until it
finds the optimal policy when starting at the initial stage.
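In code, the backward solution procedure of points 5-7 can be sketched as follows. The function and variable names are illustrative (not from any particular library), and the decision at each stage is taken to be the next state itself, as in shortest-route problems:

```python
def backward_dp(last_stage, states, choices, step_cost, terminal_value):
    """Generic backward recursion (points 5-7 above): starting from the
    last stage, compute the optimal value f[n][s] for every state s of
    stage n, given that the table f[n + 1] has already been computed."""
    f = {last_stage: {s: terminal_value(s) for s in states(last_stage)}}
    policy = {}
    for n in range(last_stage - 1, 0, -1):
        f[n], policy[n] = {}, {}
        for s in states(n):
            # the decision x is the state of stage n + 1 we move to
            best = min(choices(n, s),
                       key=lambda x: step_cost(n, s, x) + f[n + 1][x])
            policy[n][s] = best
            f[n][s] = step_cost(n, s, best) + f[n + 1][best]
    return f, policy

# A tiny illustrative network: A -> B (1), A -> C (4), B -> D (5), C -> D (1).
arcs = {('A', 'B'): 1, ('A', 'C'): 4, ('B', 'D'): 5, ('C', 'D'): 1}
stage_states = {1: ['A'], 2: ['B', 'C'], 3: ['D']}
f, policy = backward_dp(
    3,
    lambda n: stage_states[n],
    lambda n, s: [t for (a, t) in arcs if a == s],
    lambda n, s, x: arcs[(s, x)],
    lambda s: 0,
)
print(f[1]['A'], policy[1]['A'])  # 5 C
```

Here the myopic choice at A would take the arc of length 1, but the backward recursion correctly prefers A → C → D of total length 5.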
Example 1: A salesman wants to travel from node 1 to node 10 through the network of Fig. 1, in which the distance between each pair of connected nodes is given in Table 1. Find the route with the minimum total distance.

  From \ To    2   3   4
      1        2   4   3

  From \ To    5   6   7
      2        7   4   6
      3        3   2   4
      4        4   1   5

  From \ To    8   9
      5        1   4
      6        6   3
      7        3   3

  From \ To   10
      8        3
      9        4
Table 1

Fig. 1: the network of the problem; the nodes are arranged in stages: node 1, then nodes 2-4, then nodes 5-7, then nodes 8-9, and finally node 10.
Solution:
Let the decision variable x_n (n = 1, 2, 3, 4) be the immediate destination on
stage n. Thus, the route selected would be 1 → x_1 → x_2 → x_3 → x_4, where x_4 = 10.
Let f_n(s, x_n) be the total distance of the best overall policy for the remaining
stages, given that the salesman is in state s, ready to start stage n, and selects x_n as
the immediate destination. Given s and n, let x_n* denote the value of x_n that
minimizes f_n(s, x_n), and let f_n*(s) be the corresponding minimum value
of f_n(s, x_n). Then

    f_n*(s) = min over x_n of f_n(s, x_n) = f_n(s, x_n*),

where

    f_n(s, x_n) = c(s, x_n) + f_{n+1}*(x_n),

c(s, x_n) is the distance from s to x_n (Table 1), and f_5*(10) = 0. The objective is
to find f_1*(1) and the corresponding policy. Dynamic programming does this by
successively finding f_4*(s), f_3*(s), f_2*(s), and then f_1*(1). In this way we obtain
the following Tables 2, 3, 4, and 5, representing the solutions to the one-stage,
two-stage, three-stage, and four-stage problems, respectively.
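The recursion can also be carried out mechanically. A minimal sketch in Python follows; the dictionary encodes the distances of Table 1, and since the decision is simply the next node, the state alone identifies the stage:

```python
from functools import lru_cache

# Distances of Table 1: dist[s][x] is the length of the arc from s to x.
dist = {
    1: {2: 2, 3: 4, 4: 3},
    2: {5: 7, 6: 4, 7: 6},
    3: {5: 3, 6: 2, 7: 4},
    4: {5: 4, 6: 1, 7: 5},
    5: {8: 1, 9: 4},
    6: {8: 6, 9: 3},
    7: {8: 3, 9: 3},
    8: {10: 3},
    9: {10: 4},
}

@lru_cache(maxsize=None)
def f_star(s):
    # Backward recursion: f*(s) = min over x of (c(s, x) + f*(x)), f*(10) = 0.
    if s == 10:
        return 0
    return min(c + f_star(x) for x, c in dist[s].items())

print(f_star(1))  # 11
```

The memoization cache plays the role of the stage tables: each f*(s) is computed once and reused by every earlier stage that needs it.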
  s     f_4*(s)   x_4*
  8        3       10
  9        4       10
Table 2
          f_3(s, x_3) = c(s, x_3) + f_4*(x_3)
  s      x_3 = 8   x_3 = 9     f_3*(s)   x_3*
  5         4         8           4        8
  6         9         7           7        9
  7         6         7           6        8
Table 3
          f_2(s, x_2) = c(s, x_2) + f_3*(x_2)
  s      x_2 = 5   x_2 = 6   x_2 = 7     f_2*(s)   x_2*
  2        11        11        12          11      5, 6
  3         7         9        10           7      5
  4         8         8        11           8      5, 6
Table 4
          f_1(s, x_1) = c(s, x_1) + f_2*(x_1)
  s      x_1 = 2   x_1 = 3   x_1 = 4     f_1*(s)   x_1*
  1        13        11        11          11      3, 4
Table 5

Tracing the optimal decisions back through the tables, the optimal routes are 1 → 3 → 5 → 8 → 10, 1 → 4 → 5 → 8 → 10, and 1 → 4 → 6 → 9 → 10, each of total distance 11.
From the previous example and its solution, the following remarks can be
deduced; they remain valid for any other example that can be solved using the
dynamic programming approach.
Remarks 1:
1) If the decision is taken myopically at each stage starting from node 1 (always
moving to the nearest next node), we get the route 1 → 2 → 6 → 9 → 10 with
total distance 13, which is greater than the optimum value 11. This means that
when decisions must be taken at stages for a given problem, they must be
determined by considering the whole problem at once, and the resulting solution
is then applied stage by stage.
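The myopic rule of this remark can be simulated directly. The short sketch below (distances again taken from Table 1) walks from node 1, always choosing the nearest next-stage node:

```python
# dist[s][x]: distance from node s to node x (Table 1).
dist = {
    1: {2: 2, 3: 4, 4: 3},
    2: {5: 7, 6: 4, 7: 6},
    3: {5: 3, 6: 2, 7: 4},
    4: {5: 4, 6: 1, 7: 5},
    5: {8: 1, 9: 4},
    6: {8: 6, 9: 3},
    7: {8: 3, 9: 3},
    8: {10: 3},
    9: {10: 4},
}

node, total, route = 1, 0, [1]
while node != 10:
    # myopic choice: nearest immediate destination, ignoring later stages
    node, step = min(dist[node].items(), key=lambda kv: kv[1])
    total += step
    route.append(node)

print(route, total)  # [1, 2, 6, 9, 10] 13
```

The cheap first arc 1 → 2 (length 2) forces expensive arcs later, which is exactly why the stage-by-stage greedy choice loses to the dynamic programming optimum of 11.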
2) The method used for the solution is called backward computations. However,
the same problem could be solved by what are called forward computations; in
that case, the computations start from node 1 and proceed until node 10 is
reached.
3) The problem has its own recurrence relation, and every other example has its
own recurrence relation. In general, to gain good experience in defining the
recurrence relation for a given example, several problems must be solved; even
then, it is possible to face difficulties in defining the recurrence relation for a
given problem.
4) The dynamic programming approach for solving any problem has a parametric
feature by nature. For example, in the previous problem, if we want to end at
node 8, the optimal path will be either the route 1 → 3 → 5 → 8 or the route
1 → 4 → 5 → 8, with optimum distance 8.
Also, if it is desired to start from a certain place before node 1, additional tables are
needed, but the old ones are used as they are. In addition, if it is desired to move
beyond node 10, then forward computations are recommended in this case.
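Both points can be illustrated with forward computations: a single pass from node 1 labels every node with its shortest distance from node 1, so the optimal value to any intermediate node (for instance node 8) comes out of the same table. A minimal sketch, with the distances of Table 1:

```python
# dist[s][x]: distance from node s to node x (Table 1).
dist = {
    1: {2: 2, 3: 4, 4: 3},
    2: {5: 7, 6: 4, 7: 6},
    3: {5: 3, 6: 2, 7: 4},
    4: {5: 4, 6: 1, 7: 5},
    5: {8: 1, 9: 4},
    6: {8: 6, 9: 3},
    7: {8: 3, 9: 3},
    8: {10: 3},
    9: {10: 4},
}

g = {1: 0}  # g(s): shortest distance from node 1 to node s
for s in sorted(dist):            # nodes are numbered stage by stage,
    for x, c in dist[s].items():  # so every predecessor of s comes first
        if x not in g or g[s] + c < g[x]:
            g[x] = g[s] + c

print(g[10], g[8])  # 11 8
```

One forward pass yields g(s) for every node at once, which is the parametric information discussed in this remark.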
Solution:
This problem is treated as a two-stage problem; the states are the values of the
slack variables, which are continuous.
i.e.
,
subject to
( ).
The solution is obtained directly as:
( ) , then ( ) ( ),
if ( ) , we get , i.e. ,
Otherwise ( ) if
The first stage problem:
( ) ( ( ))
( ( ))
subject to
i.e.
( ( )
Subject to
Since , therefore,
( )
and the first stage problem takes the form
( )
subject to
i.e.,
( ),
subject to
Therefore, ( ) .
subject to
Solution
This problem is treated as a three-stage problem; the states are the values of the
slack variables, which are continuous.
The third stage problem:
( ) ,
subject to
i.e.
( ),
subject to
( )
The solution is obtained directly as:
( )
( ) ( )
if ( ) , we get , and
otherwise, ( ) when .
Now, we will have two situations:
(i)
In this case
( )
i.e.,
( )
subject to
( )
We will have two cases:
either ( ) , i.e., ,
or ( ) , i.e.,
(a) If , then the second stage problem takes the form:
( )
subject to
The problem is feasible when ( ) .
(b) , then the second stage problem takes the form
( )
subject to
i.e.,
( )
subject to
from which ( ) .
(ii) , the first stage problem takes the form (for ,)
( )
subject to
i.e.,
( ),
subject to
from which ( ) .
For case (i):
with optimum value ( ) which is matching
with
For case (ii):
and this does not match the condition , and
therefore it is rejected. The optimal solution of the problem is then:
( )
Remarks 2:
1) The general linear programming problem which can be solved using the
dynamic programming approach must take the form:
subject to
where
In this case, the problem can be considered as an allocation problem for which
the resources are allocated to the activities .
2) To see the advantage of solving linear programming using the dynamic
programming approach consider the following parametric linear programming
problem:
,
Subject to
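As a concrete illustration of this allocation view, the sketch below solves a small two-variable LP by stages. The data (max 3x1 + 5x2 subject to x1 ≤ 4, 2x2 ≤ 12, 3x1 + 2x2 ≤ 18, x1, x2 ≥ 0, a standard textbook product-mix example [3]) is our own illustration, not taken from the examples above. Each variable is a stage, the state is the vector of remaining right-hand sides, and the variables are restricted to an integer grid only to keep the sketch short:

```python
# Illustrative LP (not from the notes): max 3 x1 + 5 x2
# s.t. x1 <= 4, 2 x2 <= 12, 3 x1 + 2 x2 <= 18, x1, x2 >= 0.
# Stage n fixes x_n; the state is the vector b of remaining resources.

def stage2(b):
    # Last stage: x2 has a positive objective coefficient and uses
    # resources 2 and 3, so the best choice spends whatever remains.
    x2 = min(b[1] // 2, b[2] // 2)
    return 5 * x2, x2

def stage1(b):
    best_z, best_x = -1, None
    for x1 in range(min(b[0], b[2] // 3) + 1):
        remaining = (b[0] - x1, b[1], b[2] - 3 * x1)  # state for stage 2
        z2, x2 = stage2(remaining)
        if 3 * x1 + z2 > best_z:
            best_z, best_x = 3 * x1 + z2, (x1, x2)
    return best_z, best_x

z, x = stage1((4, 12, 18))
print(z, x)  # 36 (2, 6)
```

Here the grid solution (x1, x2) = (2, 6) with z = 36 happens to coincide with the true LP optimum, since the LP has an integer optimal vertex.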
The second stage problem:
( ) ( )
( ( ))
Subject to
i.e.,
( ( ))
subject to
subject to
Solution:
The problem has two stages, and the states are the values of the slack variables,
which are continuous.
The first stage problem:
( ) ( )
subject to
Remarks 3:
1) The general nonlinear programming problem which can be solved using the
dynamic programming approach must take the form:
( )
subject to
subject to
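This separable form lends itself to a one-constraint allocation recursion. The sketch below distributes B = 5 indivisible units among three activities with nonlinear (tabulated) returns; the return tables g are illustrative numbers of our own, not data from the notes:

```python
from functools import lru_cache

# g[j][x]: illustrative nonlinear return of assigning x units to activity j.
g = (
    (0, 5, 9, 12, 14, 15),
    (0, 4, 8, 11, 13, 14),
    (0, 6, 10, 13, 15, 16),
)
B = 5  # total units available

@lru_cache(maxsize=None)
def best(j, b):
    # best(j, b): maximum return from activities j, j+1, ... given b units,
    # via the recursion best(j, b) = max over x of (g[j][x] + best(j+1, b-x)).
    if j == len(g):
        return 0
    return max(g[j][x] + best(j + 1, b - x) for x in range(b + 1))

print(best(0, B))  # 23
```

The stage index j and the remaining budget b are exactly the stage/state pair of the general formulation; no linearity of the returns is needed.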
In order that the solution should be , we must have , and in this
case ( ) .
The first stage problem:
( ) ( )
subject to
.
In order that the solution should be , it is clear that we must have ,
and in this case ( ) . Therefore,
( ) *( ) +.
Example 5: The Transportation Problem [2] (Two Sources and Three Destinations)
A firm has two stores s1 and s2, which contain 3 and 4 units of a certain
commodity, respectively. It receives orders from three customers c1, c2, and c3 for
amounts of 2, 3, and 2 units, respectively (Fig. 2).
Fig. 2: the transportation network: stores s1 (3 units) and s2 (4 units) on one side;
customers c1 (demand 2), c2 (demand 3), and c3 (demand 2) on the other.
The distances between the stores and the customers, in kilometres, are as shown in
Table 6. If the cost of transporting one unit for 1 km is a fixed amount α, use the
dynamic programming approach to find the optimal distribution of the commodity
between stores and customers.
  Stores \ Customers    c1     c2     c3
         s1            100    150    200
         s2            300    200    100
Table 6
Solution:
The mathematical model for this problem takes the form

    min z = α (100 x11 + 150 x12 + 200 x13 + 300 x21 + 200 x22 + 100 x23)

subject to

    x11 + x12 + x13 = 3,
    x21 + x22 + x23 = 4,
    x11 + x21 = 2,   x12 + x22 = 3,   x13 + x23 = 2,
    xij >= 0,  i = 1, 2;  j = 1, 2, 3,

where xij is the amount sent from store si to customer cj and α is the fixed cost of
transporting one unit for 1 km. Using the dynamic programming approach, the
problem has three stages (one for each customer) and the states are the
availabilities at the stores, which are discrete. Ignoring the constant factor α, we
get the following three-stage problem:

    min z' = 100 x11 + 150 x12 + 200 x13 + 300 x21 + 200 x22 + 100 x23

subject to the same constraints.
Table 7: solution of the third-stage problem (supplying customer c3).
Fig. 3: the third-stage problem: customer c3 (demand 2) supplied from stores s1 and s2.
Fig. 4: the two-stage problem: customers c2 (demand 3) and c3 (demand 2) supplied
from the stores, whose remaining availabilities (5 units in total) are described by the
state parameter η.

Table 8: solution of the two-stage problem as a function of the state η.
Table 9: solution of the complete three-stage problem.
Fig. 5: the optimal distribution: x11 = 2, x12 = 1, x22 = 2, x23 = 2, with total
cost 950 α.
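The stage computations of Tables 7–9 can be checked with a short script. A minimal sketch follows, where the state is the number of units still available at s1 (since total supply equals total demand, the availability at s2 is implied):

```python
from functools import lru_cache

supply = (3, 4)           # units available at stores s1 and s2
demand = (2, 3, 2)        # orders of customers c1, c2, and c3
dist = ((100, 150, 200),  # s1 -> c1, c2, c3   (Table 6)
        (300, 200, 100))  # s2 -> c1, c2, c3

@lru_cache(maxsize=None)
def best(j, r1):
    # Minimum distance-weighted shipment for customers j, j+1, ...,
    # when r1 units remain at s1 (s2 holds the rest of the demand).
    if j == len(demand):
        return 0, ()
    r2 = sum(demand[j:]) - r1
    d = demand[j]
    plans = []
    for x1 in range(max(0, d - r2), min(d, r1) + 1):
        cost = x1 * dist[0][j] + (d - x1) * dist[1][j]
        sub_cost, sub_plan = best(j + 1, r1 - x1)
        plans.append((cost + sub_cost, ((x1, d - x1),) + sub_plan))
    return min(plans)

cost, plan = best(0, supply[0])
print(cost)  # 950 -- multiply by alpha to obtain the money cost
print(plan)  # ((2, 0), (1, 2), (0, 2)) = (x11, x21), (x12, x22), (x13, x23)
```

Calling best(0, r1) with a different r1 evaluates an alternative split of the 7 units between the two stores, which is the parametric feature exploited in the remarks below.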
Remarks 4:
1) This problem could be extended to any number of destinations. It is also
possible to handle the case of three sources, in which case two state
parameters η1, η2 are used; extending the problem to a larger number of
sources, however, becomes extremely difficult.
2) If the firm may redistribute the 7 available units between its two stores, we can
determine the best choice by adding several rows to the first-stage problem, as
shown in Table 10.

Table 10: the first-stage computations for the alternative splits of the 7 units
between s1 and s2.

It is clear that the best situation is when s1 contains 5 units and s2 contains 2
units; the solution in this case is x11 = 2, x12 = 3, x23 = 2, with total cost 850 α.
References
[1] Bellman R. and Dreyfus S.: Applied Dynamic Programming, Princeton
University Press, Princeton, New Jersey, 1962.
[2] Hadley G.: Nonlinear and Dynamic Programming, Addison-Wesley Publishing
Company, Inc., Reading, Mass., 1964.
[3] Hillier F. S. and Lieberman G. J.: Introduction to Operations Research, 9th ed.,
McGraw-Hill Companies, Inc., 2010.