
Advanced Algorithms

Dynamic Programming

Professor: Dijana Capeska Bogatinoska, PhD


Teaching Assistant: Aleksandar Karadimce, MSc

2016/2017 Academic year


BOOK
Algorithm Design
by Jon Kleinberg,
Eva Tardos
2005

Chapter 6
Optimization
This, generally, refers to classes of problems that possess multiple solutions at one level, and where we have a real-valued function defined on the solutions.

Problem: find a solution that minimizes or maximizes the value of this function.

Note: there is no guarantee that such a solution will be unique and, moreover, there is no guarantee that you will find it (local maxima, anyone?) unless the search space is small enough or the function is restricted enough.
Optimization
Question: are there classes of problems for which you can guarantee that an optimizing solution can be found?

Answer: yes. BUT you also need to find such a solution in a "reasonable" amount of time.

We are going to look at two classes of problems, and the techniques that will succeed in constructing their solutions in a "reasonable" (i.e., low-degree polynomial in the size of the initial data) amount of time.
Two Algorithmic Models: Divide & Conquer vs. Dynamic Programming

Both view the problem as a collection of subproblems and have a "recursive" nature.

Divide & Conquer:
- Independent subproblems
- Number of subproblems depends on partitioning factors
- Characteristic running time typically a log function of n

Dynamic Programming:
- Overlapping subproblems
- Number of subproblems typically small
- Preprocessing
- Characteristic running time depends on the number and difficulty of subproblems
- Primarily for optimization problems
- Optimal substructure: an optimal solution to the problem contains within it optimal solutions to subproblems
Dynamic Programming
Example: Rod Cutting (text)

You are given a rod of length n ≥ 0 (n in inches).
A rod of length i inches will be sold for pi dollars.
Cutting is free (a simplifying assumption).
Problem: given a table of prices pi, determine the maximum revenue rn obtainable by cutting up the rod and selling the pieces.

Length i  1  2  3  4  5   6   7   8   9   10
Price pi  1  5  8  9  10  17  17  20  24  30
Example: Rod Cutting

We can see immediately (from the values in the table) that n ≤ rn ≤ 3n, since the pi/i ratios range from 1 to 3.
This is not very useful because:
- The range of potential revenue is still very large.
- Finding quick upper and lower bounds depends on finding quickly the minimum and maximum pi/i ratios (one pass through the table), but then we are back to the point above…
Example: Rod Cutting
Step 1: Characterizing an Optimal Solution
Question: in how many different ways can we cut a rod of length n?
For a rod of length 4:
2^(4-1) = 2^3 = 8
For a rod of length n: 2^(n-1). Exponential: we cannot try all possibilities for n "large". The obvious exhaustive approach won't work.
Example: Rod Cutting
Step 1: Characterizing an Optimal Solution
Question: in how many different ways can we cut a rod of length n?

Proof details: a rod of length n has exactly n-1 possible cut positions; choose 0 ≤ k ≤ n-1 actual cuts. We can choose the k cuts (without repetition) anywhere we want, so for each such k the number of different choices is

C(n-1, k) = (n-1)! / (k! (n-1-k)!)

When we sum up over all possibilities (k = 0 to k = n-1):

Σ_{k=0}^{n-1} C(n-1, k) = (1 + 1)^(n-1) = 2^(n-1).

For a rod of length n: 2^(n-1) ways.

Example: Rod Cutting
Characterizing an Optimal Solution

Let us find a way to solve the problem recursively (we might be able to modify the solution so that the maximum can actually be computed): assume we have cut a rod of length n into k pieces (1 ≤ k ≤ n) of lengths i1, …, ik, so that
n = i1 + … + ik,
with revenue
rn = pi1 + … + pik.
Assume further that this solution is optimal.

How can we construct it?

Advice: when you don't know what to do next, start with a simple example and hope something will occur to you…
Example: Rod Cutting
Characterizing an Optimal Solution
Length i 1 2 3 4 5 6 7 8 9 10
Price pi 1 5 8 9 10 17 17 20 24 30
We begin by constructing (by hand) the optimal solutions for i = 1, …, 10:
r1 = 1 from sln. 1 = 1 (no cuts)
r2 = 5 from sln. 2 = 2 (no cuts)
r3 = 8 from sln. 3 = 3 (no cuts)
r4 = 10 from sln. 4 = 2 + 2
r5 = 13 from sln. 5 = 2 + 3
r6 = 17 from sln. 6 = 6 (no cuts)
r7 = 18 from sln. 7 = 1 + 6 or 7 = 2 + 2 + 3
r8 = 22 from sln. 8 = 2 + 6
r9 = 25 from sln. 9 = 3 + 6
r10 = 30 from sln. 10 = 10 (no cuts)
Example: Rod Cutting
Characterizing an Optimal Solution
Notice that in some cases rn = pn, while in other cases the optimal revenue rn is
obtained by cutting the rod into smaller pieces.
In ALL cases we have the recursion
rn = max(pn, r1 + rn-1, r2 + rn-2, …, rn-1 + r1)
exhibiting optimal substructure (how?)
A slightly different way of stating the same recursion, which avoids repeating some
computations, is
rn = max1≤i≤n(pi + rn-i)
And this latter relation can be implemented as a simple top-down recursive
procedure:
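As a sketch of that top-down procedure, here is a minimal Python version of the recursion rn = max over 1 ≤ i ≤ n of (pi + rn-i); the function name cut_rod and the list layout are our own, not from the text:

```python
# Naive top-down rod cutting: r_n = max over 1 <= i <= n of (p_i + r_{n-i}).
# prices[i] is the price of a piece of length i (index 0 is unused padding).
prices = [0, 1, 5, 8, 9, 10, 17, 17, 20, 24, 30]

def cut_rod(p, n):
    """Return the maximum revenue r_n for a rod of length n (exponential time)."""
    if n == 0:
        return 0                      # a rod of length 0 brings no revenue
    return max(p[i] + cut_rod(p, n - i) for i in range(1, n + 1))

print(cut_rod(prices, 4))   # r_4 = 10 (cut as 2 + 2)
```

Running it on the lengths from the table reproduces the hand-computed values (r7 = 18, r10 = 30), but the running time grows exponentially with n, which is exactly the problem addressed next.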
Example: Rod Cutting
Characterizing an Optimal Solution
Summary: How to justify the step from:
rn = max(pn, r1 + rn-1, r2 + rn-2, …, rn-1 + r1)
to
rn = max1≤i≤n(pi + rn-i)

Note: every optimal partitioning of a rod of length n has a first cut – a segment of, say, length i. The optimal revenue, rn, must satisfy rn = pi + rn-i, where rn-i is the optimal revenue for a rod of length n – i. If the latter were not the case, there would be a better partitioning for a rod of length n – i, giving a revenue r’n–i > rn-i and a total revenue r’n = pi + r’n-i > pi + rn-i = rn, contradicting the optimality of rn.
Since we do not know which one of the leftmost cut positions provides the largest
revenue, we just maximize over all the possible first cut positions.
Example: Rod Cutting
Characterizing an Optimal Solution

We can also notice that all the items over which we choose the maximum are optimal in their own right: each substructure value (the maximum revenue for rods of lengths 1, …, n-1) is itself optimal (again, the optimal substructure property).

Nevertheless, we are still in trouble: computing the recursion leads to recomputing a number of values (= overlapping subproblems) – how many?
Example: Rod Cutting
Characterizing an Optimal Solution

Let’s call Cut-Rod(p, 4), to see the effects on a simple case:

[Figure: the recursion tree of Cut-Rod(p, 4).]

The number of nodes in the recursion tree for a rod of size n is:

T(0) = 1,   T(n) = 1 + Σ_{j=0}^{n-1} T(j) = 2^n,   n ≥ 1.
Example: Rod Cutting
Beyond Naïve Time Complexity

We have a problem: “reasonable size” problems are not solvable in “reasonable time” (but, in this case, they are solvable in “reasonable space”).

Specifically:
• Note that navigating the whole tree requires 2^n stack-frame activations.
• Note also that no more than n + 1 stack frames are active at any one time, and that no more than n + 1 different values need to be computed or used.

Can we exploit these observations?

A standard solution method involves saving the values associated with each T(j), so that we compute each value only once (called “memoizing” = writing yourself a memo).
Example: Rod Cutting
Naïve Caching
We introduce two procedures:
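The text’s two procedures are not reproduced in these notes; a rough Python equivalent of the driver-plus-auxiliary memoized scheme might look like this (the names memoized_cut_rod and memoized_cut_rod_aux are our own transcription):

```python
# Memoized top-down rod cutting: each r_j is computed once and cached.
prices = [0, 1, 5, 8, 9, 10, 17, 17, 20, 24, 30]

def memoized_cut_rod(p, n):
    """Driver: initialize the cache, then recurse (Theta(n^2) total work)."""
    cache = [None] * (n + 1)
    return memoized_cut_rod_aux(p, n, cache)

def memoized_cut_rod_aux(p, n, cache):
    if cache[n] is not None:          # memo hit: value already computed
        return cache[n]
    if n == 0:
        q = 0
    else:
        q = max(p[i] + memoized_cut_rod_aux(p, n - i, cache)
                for i in range(1, n + 1))
    cache[n] = q                      # write ourselves a "memo"
    return q

print(memoized_cut_rod(prices, 10))  # r_10 = 30
```

Each of the n + 1 subproblem values is now computed exactly once, matching the observation above that no more than n + 1 different values are ever needed.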
Example: Rod Cutting
More Sophisticated Caching
We now remove some unnecessary complications: when we solve the problem in a bottom-up manner, the asymptotic time is Θ(n^2).
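A bottom-up sketch in Python; as an extra (our addition, not from the text), a first-piece table is kept so that one optimal partition can be read off afterwards:

```python
# Bottom-up rod cutting: fill r[0..n] in increasing order of length.
# The two nested loops over lengths give the Theta(n^2) running time.
prices = [0, 1, 5, 8, 9, 10, 17, 17, 20, 24, 30]

def bottom_up_cut_rod(p, n):
    r = [0] * (n + 1)            # r[j] = best revenue for a rod of length j
    first = [0] * (n + 1)        # first[j] = size of an optimal first piece
    for j in range(1, n + 1):
        for i in range(1, j + 1):
            if p[i] + r[j - i] > r[j]:
                r[j] = p[i] + r[j - i]
                first[j] = i
    return r, first

def optimal_cuts(first, n):
    """Read off one optimal partition from the first-piece table."""
    pieces = []
    while n > 0:
        pieces.append(first[n])
        n -= first[n]
    return pieces

r, first = bottom_up_cut_rod(prices, 10)
print(r[7], optimal_cuts(first, 7))   # 18 [1, 6]
```

The values agree with the table constructed by hand earlier (r4 = 10, r7 = 18, r10 = 30).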
Algorithmic Paradigm Context: Divide & Conquer vs. Dynamic Programming vs. Greedy Algorithm

All three view the problem as a collection of subproblems and have a “recursive” nature.

Divide & Conquer:
- Independent subproblems
- Number of subproblems depends on partitioning factors
- Characteristic running time typically a log function of n

Dynamic Programming:
- Overlapping subproblems
- Number of subproblems typically small
- Preprocessing
- Characteristic running time depends on the number and difficulty of subproblems
- Primarily for optimization problems
- Optimal substructure: an optimal solution to the problem contains within it optimal solutions to subproblems

Greedy Algorithm:
- Typically sequential dependence among subproblems
- Preprocessing: typically sort
- Running time often dominated by an n log n sort
- Primarily for optimization problems
- Greedy choice property: locally optimal choices produce a globally optimal solution
- Heuristic version useful for bounding the optimal value
Dynamic Programming
Dynamic Programming is an algorithm design method that can be used when the solution to a problem may be viewed as the result of a sequence of decisions.
The General Dynamic Programming Technique
Applies to a problem that at first seems to require a lot of time (possibly exponential), provided we have:
- Subproblem optimality: the global optimum value can be defined in terms of optimal subproblems.
- Subproblem overlap: the subproblems are not independent, but instead they overlap (hence, should be constructed bottom-up).
Dynamic Programming: Example
Consider the problem of finding a shortest path between a pair
of vertices in an acyclic graph.
An edge connecting node i to node j has cost c(i,j).
The graph contains n nodes numbered 0,1,…, n-1, and has an
edge from node i to node j only if i < j. Node 0 is source and
node n-1 is the destination.
Let f(x) be the cost of the shortest path from node 0 to node x.
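Under these assumptions the natural recurrence is f(x) = min over edges (i, x) of { f(i) + c(i, x) }, with f(0) = 0. A small Python sketch; the 5-node edge costs below are invented for illustration, not taken from the text:

```python
# Shortest path in a DAG with nodes 0..n-1 and edges from i to j only if i < j.
# f(x) = min over incoming edges (i, x) of f(i) + c(i, x), with f(0) = 0.

# Hypothetical edge costs c[(i, j)] for a 5-node example graph.
c = {(0, 1): 2, (0, 2): 6, (1, 2): 3, (1, 3): 7, (2, 3): 1, (2, 4): 8, (3, 4): 2}

def dag_shortest_path(n, c):
    INF = float("inf")
    f = [INF] * n
    f[0] = 0                      # node 0 is the source
    for x in range(1, n):         # increasing order: every f(i), i < x, is ready
        f[x] = min((f[i] + c[(i, x)] for i in range(x) if (i, x) in c),
                   default=INF)
    return f

print(dag_shortest_path(5, c))   # f(4) is the cost of the shortest 0 -> 4 path
```

Because edges only go from lower-numbered to higher-numbered nodes, processing nodes in index order guarantees each subproblem f(i) is solved before it is needed.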
The shortest path
To find a shortest path in a multi-stage graph:

[Figure: a small multistage graph from S to T; the shown edge costs include 1, 2, 3, 4, 5, 5, 6, 7.]

Apply the greedy method: the shortest path from S to T obtained greedily is 1 + 2 + 5 = 8.
Dynamic Programming: Example

[Figure: a graph for which the shortest path between nodes 0 and 4 is to be computed.]
Dynamic Programming
The solution to a DP problem is typically expressed as a minimum (or maximum) of possible alternate solutions.
If r represents the cost of a solution composed of subproblems x1, x2, …, xl, then r can be written as

r = g(f(x1), f(x2), …, f(xl)).

Here, g is the composition function.

If the optimal solution to each problem is determined by composing optimal solutions to the subproblems and selecting the minimum (or maximum), the formulation is said to be a DP formulation.
Dynamic Programming: Example

[Figure: the computation and composition of subproblem solutions to solve problem f(x8).]
Shortest-Path Problem
- Special class of shortest-path problem where the graph is a weighted multistage graph of r + 1 levels.
- Each level is assumed to have n nodes, and every node at level i is connected to every node at level i + 1.
- Levels zero and r contain only one node: the source and destination nodes, respectively.
- The objective of this problem is to find the shortest path from S to R.
Shortest-Path Problem

[Figure: an example of a serial monadic DP formulation for finding the shortest path in a graph whose nodes can be organized into levels.]
Shortest-Path Problem
The ith node at level l in the graph is labeled v_i^l, and the cost of an edge connecting v_i^l to node v_j^{l+1} is labeled c_{i,j}^l.
The cost of reaching the goal node R from any node v_i^l is represented by C_i^l.
If there are n nodes at level l, the vector [C_0^l, C_1^l, …, C_{n-1}^l]^T is referred to as C^l. Note that C^0 = [C_0^0].
We have C_i^l = min { c_{i,j}^l + C_j^{l+1} : j is a node at level l + 1 }.
Shortest-Path Problem
Since all nodes v_j^{r-1} have only one edge connecting them to the goal node R at level r, the cost C_j^{r-1} is equal to c_{j,R}^{r-1}.
We have:

C_j^{r-1} = c_{j,R}^{r-1}.

Notice that this problem is serial and monadic.
Shortest-Path Problem
The cost of reaching the goal node R from any node at level l (0 < l < r - 1) is

C_i^l = min { c_{i,j}^l + C_j^{l+1} : j is a node at level l + 1 }.
Shortest-Path Problem
We can express the solution to the problem as a modified sequence of matrix-vector products.
Replacing the addition operation by minimization and the multiplication operation by addition, the preceding set of equations becomes:

C^l = M^{l,l+1} × C^{l+1},

where C^l and C^{l+1} are n × 1 vectors representing the cost of reaching the goal node from each node at levels l and l + 1.
Shortest-Path Problem
Matrix M^{l,l+1} is an n × n matrix in which entry (i, j) stores the cost of the edge connecting node i at level l to node j at level l + 1.

The shortest-path problem has thus been formulated as a sequence of r matrix-vector products.
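One such (min, +) product step can be sketched in Python; the 2 × 2 matrix and vector below are illustrative numbers for a hypothetical two-node level, not data from the text:

```python
# One step of the formulation: C^l = M^{l,l+1} "times" C^{l+1}, where
# multiplication is replaced by addition and addition by minimization.

def min_plus_matvec(M, C):
    """(min,+) product: result[i] = min over j of (M[i][j] + C[j])."""
    return [min(M[i][j] + C[j] for j in range(len(C))) for i in range(len(M))]

# Hypothetical level: M[i][j] = cost of the edge from node i at level l
# to node j at level l+1; C_next[j] = cost from node j at level l+1 to the goal.
M = [[4, 11],
     [9, 5]]
C_next = [18, 13]
print(min_plus_matvec(M, C_next))   # costs from each node at level l
```

Applying this step once per level, from the last level back to the source, yields the full sequence of r matrix-vector products described above.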
The shortest path in multistage graphs

[Figure: multistage graph with edges S→A = 1, S→B = 2, S→C = 5; A→D = 4, A→E = 11; B→D = 9, B→E = 5, B→F = 16; C→F = 2; D→T = 18, E→T = 13, F→T = 2.]

The greedy method cannot be applied to this case: it picks (S, A, D, T) with cost 1 + 4 + 18 = 23.
The real shortest path is (S, C, F, T) with cost 5 + 2 + 2 = 9.
Dynamic programming approach
Dynamic programming approach (forward approach):

d(S, T) = min{1 + d(A, T), 2 + d(B, T), 5 + d(C, T)}
d(A, T) = min{4 + d(D, T), 11 + d(E, T)} = min{4 + 18, 11 + 13} = 22
d(B, T) = min{9 + d(D, T), 5 + d(E, T), 16 + d(F, T)} = min{9 + 18, 5 + 13, 16 + 2} = 18
d(C, T) = min{2 + d(F, T)} = 2 + 2 = 4
d(S, T) = min{1 + d(A, T), 2 + d(B, T), 5 + d(C, T)} = min{1 + 22, 2 + 18, 5 + 4} = 9


This way of reasoning is called backward reasoning.
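The backward-reasoning computation can be sketched as a memoized recursion; the edge costs are transcribed from the example, while the successor-map layout and function name are our own:

```python
import functools

# Edges of the example multistage graph: succ[u] = [(v, cost), ...].
succ = {
    "S": [("A", 1), ("B", 2), ("C", 5)],
    "A": [("D", 4), ("E", 11)],
    "B": [("D", 9), ("E", 5), ("F", 16)],
    "C": [("F", 2)],
    "D": [("T", 18)],
    "E": [("T", 13)],
    "F": [("T", 2)],
}

@functools.lru_cache(maxsize=None)
def d(u):
    """d(u) = cost of the shortest path from u to T (backward reasoning)."""
    if u == "T":
        return 0
    return min(cost + d(v) for v, cost in succ[u])

print(d("A"), d("B"), d("C"), d("S"))   # 22 18 4 9
```

The cache plays the role of the hand-built table: each d(·) value is computed once and reused, exactly as in the derivation above.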
Backward approach (forward reasoning)

d(S, A) = 1
d(S, B) = 2
d(S, C) = 5
d(S, D) = min{d(S, A) + d(A, D), d(S, B) + d(B, D)} = min{1 + 4, 2 + 9} = 5
d(S, E) = min{d(S, A) + d(A, E), d(S, B) + d(B, E)} = min{1 + 11, 2 + 5} = 7
d(S, F) = min{d(S, B) + d(B, F), d(S, C) + d(C, F)} = min{2 + 16, 5 + 2} = 7
d(S, T) = min{d(S, D) + d(D, T), d(S, E) + d(E, T), d(S, F) + d(F, T)} = min{5 + 18, 7 + 13, 7 + 2} = 9
Principle of optimality
- Principle of optimality: suppose that in solving a problem, we have to make a sequence of decisions D1, D2, …, Dn. If this sequence is optimal, then the last k decisions, 1 ≤ k ≤ n, must also be optimal.
- E.g., the shortest-path problem: if i, i1, i2, …, j is a shortest path from i to j, then i1, i2, …, j must be a shortest path from i1 to j.
- In summary, if a problem can be described by a multistage graph, then it can be solved by dynamic programming.
Dynamic programming
- Forward approach and backward approach:
  - Note that if the recurrence relations are formulated using the forward approach, then the relations are solved backwards, i.e., beginning with the last decision.
  - On the other hand, if the relations are formulated using the backward approach, they are solved forwards.
- To solve a problem by using dynamic programming:
  - Find out the recurrence relations.
  - Represent the problem by a multistage graph.
The resource allocation problem

m resources, n projects.
Profit P(i, j): the profit when j resources are allocated to project i.
Goal: maximize the total profit.

          Resources (j)
Project    1   2   3
  1        2   8   9
  2        5   6   7
  3        4   4   4
  4        2   4   5

The multistage graph solution

[Figure: a multistage graph with source S, sink T, and nodes labeled (i, j), where node (i, j) means i resources allocated to projects 1, …, j.]

The resource allocation problem can be described as a multistage graph.
(i, j): i resources allocated to projects 1, 2, …, j.
E.g., node H = (3, 2): 3 resources allocated to projects 1, 2.

Assignment: Find the longest path from S to T in the multistage graph above.

The longest path from S to T: (S, C, H, L, T), 8 + 5 + 0 + 0 = 13.
Solution
The longest path from S to T :
(S, C, H, L, T), 8+5+0+0=13
2 resources allocated to project 1.
1 resource allocated to project 2.
0 resources allocated to projects 3 and 4.
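The profit table can also be solved directly, without drawing the graph. Let best(i, m) be the maximum profit obtainable from projects i, …, n with m resources still available; the recurrence tries every allocation j for project i. This is a sketch; the function name and table encoding are our own:

```python
# Resource allocation via DP: allocate up to 3 resources among projects 1..4.
# P[i][j] = profit when j resources go to project i (j = 0 yields profit 0).
P = {1: {0: 0, 1: 2, 2: 8, 3: 9},
     2: {0: 0, 1: 5, 2: 6, 3: 7},
     3: {0: 0, 1: 4, 2: 4, 3: 4},
     4: {0: 0, 1: 2, 2: 4, 3: 5}}

def best(i, m, P, n=4):
    """Max profit from projects i..n with m resources still available."""
    if i > n:
        return 0                   # no projects left: nothing more to earn
    return max(P[i][j] + best(i + 1, m - j, P, n)
               for j in range(m + 1))

print(best(1, 3, P))   # maximum total profit with 3 resources: 13
```

This reproduces the longest-path answer above: profit 13, from giving 2 resources to project 1 (profit 8) and 1 resource to project 2 (profit 5).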
Assignment:
Determine the shortest path from node A to node J with Dynamic Programming
Discussion of Parallel Dynamic Programming Algorithms

By representing the computation as a graph, we identify three sources of parallelism: parallelism within nodes, parallelism across nodes at a level, and pipelining nodes across multiple levels. The first two are available in serial formulations, and the third one in non-serial formulations.

Data locality is critical for performance. Different DP formulations, by the very nature of the problem instance, have different degrees of locality.