CS 331 Design and Analysis of Algorithms The Greedy Approach
• Definition:
– Feasible solution: any subset that satisfies some
constraints
– Optimal solution: a feasible solution that maximizes or
minimizes the objective function
2
Make Change Problem
• Problem: minimize total number of coins
returned as change by a sales clerk to a
customer
• Assumption: unlimited supply of coins
• Solution set: the set of coins handed to the
customer as change
3
Make Change Algorithm
while (there are more coins and the instance is not solved)
{
    grab the largest remaining coin;
    if (adding the coin makes the change exceed the amount owed)
        { reject the coin; }
    else
        { add the coin to the change; }
}
4
Optimal Solution? Prove
• Example 1:
– Amount owed: 36 cents
– Standard coins: quarter, dime, nickel, penny
• Example 2:
– Amount owed: 16 cents
– Coins: quarter, 12-cent, dime, nickel, penny
5
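The make-change loop above can be sketched in Python and run on the two examples from this slide (a minimal sketch; the function name is mine, and an unlimited coin supply is assumed as stated earlier):

```python
def greedy_change(amount, coins):
    """Repeatedly grab the largest coin that does not exceed the amount owed."""
    change = []
    for coin in sorted(coins, reverse=True):
        while amount >= coin:
            change.append(coin)
            amount -= coin
    return change

# Example 1: greedy is optimal with the standard coins
print(greedy_change(36, [25, 10, 5, 1]))       # [25, 10, 1]
# Example 2: greedy is NOT optimal once a 12-cent coin exists:
print(greedy_change(16, [25, 12, 10, 5, 1]))   # [12, 1, 1, 1, 1] -- five coins,
                                               # while 10 + 5 + 1 needs only three
```

The second call is the point of Example 2: the greedy choice is locally best but leads to a suboptimal solution for this coin system.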
General Greedy Procedure
procedure Greedy (A, n)
begin
    solution ← Ø;
    for i ← 1 to n do
        x ← Select (A);
        if Feasible (solution, x)
            then solution ← Union (solution, x);
end;
• Select: a greedy procedure that, based on a given objective
function, selects an input from A, removes it, and assigns its
value to x.
• Feasible: a Boolean function that decides whether x can be
included in the solution vector without violating any given
constraints.
6
When applying Greedy method…
• The n inputs are ordered by some selection
procedure that is based on some
optimization measure.
7
Problem 1: Minimum Spanning Tree
• Given an undirected graph G, find a minimum
spanning tree of G.
8
Example
[Figure: undirected graph G on vertices 1–8]
10
DFS (Depth First Search)
• Starting from v, recursively visit each unvisited
node adjacent to v.
[Figure: the graph G and its DFS spanning tree on vertices 1–8]
11
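As a sketch in Python (the adjacency lists below are a hypothetical 8-vertex graph, not necessarily the one in the figure):

```python
def dfs(graph, v, visited=None):
    """Starting from v, recursively visit each unvisited node adjacent to v."""
    if visited is None:
        visited = []
    visited.append(v)
    for w in graph[v]:
        if w not in visited:
            dfs(graph, w, visited)
    return visited

# hypothetical adjacency lists for an 8-vertex graph
G = {1: [2, 3], 2: [1, 4, 5], 3: [1, 6, 7], 4: [2],
     5: [2, 8], 6: [3], 7: [3], 8: [5]}
print(dfs(G, 1))   # [1, 2, 4, 5, 8, 3, 6, 7]
```

The edges (v, w) along which an unvisited w is first reached form the DFS spanning tree.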
BFS (Breadth First Search)
• Visit all neighbors of a vertex before visiting the
neighbors of those neighbors.
[Figure: the graph G and its BFS spanning tree on vertices 1–8]
12
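BFS replaces the recursion by a FIFO queue; a sketch on the same hypothetical graph as the DFS example:

```python
from collections import deque

def bfs(graph, v):
    """Visit all neighbors of v before the neighbors of those neighbors."""
    visited = [v]
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in graph[u]:
            if w not in visited:
                visited.append(w)
                queue.append(w)
    return visited

# hypothetical adjacency lists for an 8-vertex graph
G = {1: [2, 3], 2: [1, 4, 5], 3: [1, 6, 7], 4: [2],
     5: [2, 8], 6: [3], 7: [3], 8: [5]}
print(bfs(G, 1))   # [1, 2, 3, 4, 5, 6, 7, 8]
```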
Example 2: Edges with Different Weights
[Figure: weighted graph G on vertices 1–6 with edge weights
(1,2)=16, (1,5)=19, (1,6)=21, (2,3)=5, (2,4)=6, (2,6)=11,
(3,4)=10, (4,5)=18, (4,6)=14, (5,6)=33]
DFS
[Figure: DFS spanning tree of G rooted at vertex 1]
Cost = 16 + 5 + 10 + 18 + 33 = 82
13
BFS
[Figure: BFS spanning tree of G rooted at vertex 1]
Cost = 16 + 19 + 21 + 5 + 6 = 67
14
Prim’s Algorithm
• Basic idea: start from vertex 1 and let T ← Ø (T will
contain all edges in the spanning tree); the next edge
to be included in T is the minimum-cost edge (u, v)
such that u is in the tree and v is not.
[Figure: the weighted graph G from Example 2]
15
[Figure: step-by-step growth of Prim's tree:
T = Ø → {1,2} → {1,2,3} → {1,2,3,4} → {1,2,3,4,6} → {1,2,3,4,5,6}]
Cost = 16 + 5 + 6 + 11 + 18 = 56 (Minimum Spanning Tree)
18
Data Structures Used in Prim
• Cost adjacency matrix for G
Cost[i][j] =  weight of edge (vi, vj)   if there is an edge between vi and vj
              ∞                          if there is no edge between vi and vj
              0                          if i = j

Cost =      1    2    3    4    5    6
       1    0   16    ∞    ∞   19   21
       2   16    0    5    6    ∞   11
       3    ∞    5    0   10    ∞    ∞
       4    ∞    6   10    0   18   14
       5   19    ∞    ∞   18    0   33
       6   21   11    ∞   14   33    0
19
Data Structures Used in Prim
• 1-D array Near
Near(j) =  0                                                       if j is already in the S.T.
           a vertex in the tree s.t. Cost(j, Near(j)) is minimum   otherwise

So, the value of Near(j) is a vertex in the tree s.t. Cost(j, Near(j)) is minimum
among all choices for Near(j), assuming j is not in the tree.
Thus, Near(j) holds the closest tree node that j can connect to.
20
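Putting the cost matrix and the Near array together, Prim's algorithm can be sketched as follows (Python; the function name is mine, and ∞ is represented by `INF`). The matrix `C` is the one from the slide:

```python
INF = float('inf')

def prim(cost, n):
    """Prim's MST using the Near array described above.
    cost is an (n+1) x (n+1) matrix, 1-indexed; cost[i][j] = INF if no edge."""
    near = [1] * (n + 1)     # every vertex starts nearest to vertex 1
    near[1] = 0              # 0 means "already in the spanning tree"
    T, total = [], 0
    for _ in range(n - 1):
        # vertex j outside the tree with minimum Cost(j, Near(j))
        j = min((v for v in range(2, n + 1) if near[v] != 0),
                key=lambda v: cost[v][near[v]])
        T.append((j, near[j]))
        total += cost[j][near[j]]
        near[j] = 0
        for k in range(2, n + 1):   # j may now be the closest tree node for k
            if near[k] != 0 and cost[k][near[k]] > cost[k][j]:
                near[k] = j
    return T, total

# cost matrix from the slide (0 on the diagonal, INF where no edge exists)
C = [[0] * 7 for _ in range(7)]
for i in range(1, 7):
    for j in range(1, 7):
        if i != j:
            C[i][j] = INF
for u, v, w in [(1,2,16), (1,5,19), (1,6,21), (2,3,5), (2,4,6),
                (2,6,11), (3,4,10), (4,5,18), (4,6,14), (5,6,33)]:
    C[u][v] = C[v][u] = w
print(prim(C, 6))   # ([(2, 1), (3, 2), (4, 2), (6, 2), (5, 4)], 56)
```

The returned edge set matches the T shown on the trace slide, with total cost 56.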
Algorithm (result of the trace on the example graph):
[Figure: the weighted graph G from Example 2]
T = {(2,1),(3,2),(4,2),(6,2),(5,4)}
22
In-Class Exercise #1
• Use Prim’s algorithm to find a minimum
spanning tree T in the following weighted
graph. What is the total weight of T? Show
intermediate results of array Near.
Cost =      1    2    3    4    5    6
       1    0   10    ∞   30   45    ∞
       2   10    0   50    ∞   40   25
       3    ∞   50    0    ∞   35   15
       4   30    ∞    ∞    0    ∞   20
       5   45   40   35    ∞    0   55
       6    ∞   25   15   20   55    0
23
Kruskal’s Algorithm
• Basic idea:
– We don't care whether T is a tree at the intermediate stages;
as long as including a new edge does not create a cycle, we
include the minimum-cost edge.
[Figure: weighted graph on vertices 1–6 with edge weights
10, 15, 20, 25, 30, 35, 40, 45, 50, 55]
24
Sort all of the edges:        T (forest so far)
(1,2) 10  √                   {1,2}
(3,6) 15  √                   {1,2} {3,6}
(4,6) 20  √                   {1,2} {3,6,4}
(2,6) 25  √                   {1,2,3,4,6}
(3,5) 35  √                   {1,2,3,4,5,6}
(2,5) 40  …
25
Kruskal’s Algorithm
While (T contains fewer than n−1 edges) and (E ≠ Ø) do
{
    choose an edge (u,v) from E of the lowest cost;
    delete (u,v) from E;
    if (u,v) does not create a cycle in T
        then add (u,v) to T;
    else discard (u,v);
}
26
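The loop above becomes concrete once the cycle test is implemented with disjoint sets (discussed on the next slides). A sketch in Python, run on the exercise graph (function names are mine):

```python
def kruskal(n, edges):
    """Kruskal's MST with a simple disjoint-set array (Method 1 style).
    edges: list of (weight, u, v); vertices are numbered 1..n."""
    parent = list(range(n + 1))          # each vertex starts in its own set

    def find(x):
        while parent[x] != x:            # follow links up to the set label
            x = parent[x]
        return x

    T, total = [], 0
    for w, u, v in sorted(edges):        # consider edges by increasing cost
        ru, rv = find(u), find(v)
        if ru != rv:                     # different sets: no cycle is created
            parent[max(ru, rv)] = min(ru, rv)
            T.append((u, v, w))
            total += w
            if len(T) == n - 1:
                break
    return T, total

# edges of the exercise graph on vertices 1..6
E = [(10,1,2), (15,3,6), (20,4,6), (25,2,6), (30,1,4),
     (35,3,5), (40,2,5), (45,1,5), (50,2,3), (55,5,6)]
print(kruskal(6, E))
# ([(1, 2, 10), (3, 6, 15), (4, 6, 20), (2, 6, 25), (3, 5, 35)], 105)
```

Edge (1,4) of cost 30 is rejected because 1 and 4 are already in the same set when it is considered.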
YOUR EXERCISE
27
Question
• How do we check whether adding an edge will create a
cycle?
• We can maintain a set for each group (connected component).
• Initially, each set contains one node.
• Two sets are merged when we add an edge connecting them.
Ex: set1 = {1, 2}, set2 = {3, 6}, set3 = {4, 5}
29
Method 1: Straightforward
• Use an array to store the group # of each element, and use
the smallest # in each set as its label
• E.g., {1, 5}, {2, 4, 7, 10}, {3, 6, 8, 9}
• We create array A to store the group number of these 10
elements as follows:

Node: 1  2  3  4  5  6  7  8  9  10
A:    1  2  3  2  1  3  2  3  3  2

Node: 1  2  3  4  5  6  7  8  9  10
A:    1  2  3  2  1  3  4  3  3  4
function Find2(x)
{
    i = x;
    while A[i] ≠ i
        do i = A[i];
    return i;
}

procedure Merge2(a, b)   // merge sets labeled a and b
{
    if a < b then A[b] = a;
    else A[a] = b;
}

So, the complexity of Find2 is O(n) and Merge2 is O(1). 31
Method 3 (an improved version of Method 2; it reduces the
complexity of Find2 from O(n) to O(log n)):
33
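Method 3 is commonly implemented as a weighted union: each root stores the negative of its set's size, and the smaller tree is always attached under the larger, which keeps tree height O(log n). A sketch under that assumption (function names are mine):

```python
def make_sets(n):
    """parent[i] < 0 at a root (it stores -size); otherwise it points to the parent."""
    return [-1] * (n + 1)

def find3(parent, x):
    """Follow parent links to the root label; tree height is O(log n)."""
    while parent[x] >= 0:
        x = parent[x]
    return x

def merge3(parent, a, b):
    """Weighted union: attach the smaller tree under the larger one."""
    ra, rb = find3(parent, a), find3(parent, b)
    if ra == rb:
        return
    if parent[ra] <= parent[rb]:   # ra's set is at least as large
        parent[ra] += parent[rb]
        parent[rb] = ra
    else:
        parent[rb] += parent[ra]
        parent[ra] = rb

p = make_sets(4)
merge3(p, 1, 2); merge3(p, 3, 4); merge3(p, 1, 3)
print(find3(p, 4) == find3(p, 2))   # True: all four nodes share one root
```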
Problem 2: Single-Source Shortest Path
• Definition: Given a directed graph G = (V, E), a weight
for each edge in G, and a source node V0, to
determine the length of the shortest paths from V0 to
all the other vertices in G.
• Note: the length of a path is the sum of the weights of
the edges on that path.
34
Notation
Cost(i, j) =  cost of the edge from vertex i to vertex j   if there is an edge
              0                                             if i = j
              ∞                                             otherwise
35
Concept of Dijkstra’s Algorithm
1. Start from the source node v.
2. Find u s.t. u is the closest neighbor of v;
set s(u) = 1 (why?).
3. For every node w with s(w) = 0, update Dist(w)
and From(w) if the path v–u–w is better
than the direct path v–w.
[Figure: v with neighbors v1, v2, v3 illustrating the update]
36
To Be More General: Edge Relaxation
37
Dijkstra’s algorithm:
procedure Dijkstra (Cost, n, v, Dist, From) // Cost, n, v are input;
                                            // Dist, From are output
{
    for i ← 1 to n do
    {
        s(i) = 0;
        Dist(i) = Cost(v, i);
        From(i) = v;
    }
    s(v) = 1;
    for num ← 1 to (n − 1) do
    {
        choose u s.t. s(u) = 0 and Dist(u) is minimum;
        s(u) = 1;
        for each neighbor w of u with s(w) = 0 do
            if (Dist(u) + Cost(u, w) < Dist(w))
            {
                Dist(w) = Dist(u) + Cost(u, w);
                From(w) = u;
            }
    }
}
38
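A direct Python transcription of the procedure (a sketch; the 4-vertex digraph used to demonstrate it is hypothetical, not the example from the next slide):

```python
INF = float('inf')

def dijkstra(cost, n, v):
    """The procedure above: cost is an (n+1) x (n+1) matrix, 1-indexed;
    cost[i][j] = INF when there is no edge, 0 when i == j."""
    s = [0] * (n + 1)
    Dist = [cost[v][i] for i in range(n + 1)]
    From = [v] * (n + 1)
    s[v] = 1
    for _ in range(n - 1):
        # choose u with s(u) = 0 and Dist(u) minimum
        u = min((i for i in range(1, n + 1) if s[i] == 0), key=lambda i: Dist[i])
        s[u] = 1
        for w in range(1, n + 1):
            if s[w] == 0 and Dist[u] + cost[u][w] < Dist[w]:   # relax edge (u, w)
                Dist[w] = Dist[u] + cost[u][w]
                From[w] = u
    return Dist, From

# hypothetical 4-vertex digraph: 1->2 (1), 1->3 (4), 2->3 (2), 3->4 (1)
C = [[INF] * 5 for _ in range(5)]
for i in range(1, 5):
    C[i][i] = 0
C[1][2], C[1][3], C[2][3], C[3][4] = 1, 4, 2, 1
Dist, From = dijkstra(C, 4, 1)
print(Dist[1:], From[1:])   # [0, 1, 3, 4] [1, 1, 2, 3]
```

Note how Dist(3) is first set to 4 (the direct edge) and then relaxed to 3 via vertex 2, exactly the update in step 3 of the concept slide.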
Example:
[Figure: directed graph on vertices V0–V5 used in the Dijkstra trace below]
39
b) Steps in Dijkstra’s Algorithm
[Figure: steps 1–2 of Dijkstra's algorithm on the example graph]
40
3. Dist(v3) = 25, From(v3) = v2;    4. Dist(v1) = 45, From(v1) = v3;
   choose u = v3                       choose u = v1
[Figure: steps 3–4 of Dijkstra's algorithm on the example graph]
41
5. Dist(v4) = 45, From(v4) = v0;    6. Dist(v5) = ∞ (v5 is unreachable);
   choose u = v4                       choose u = v5
[Figure: steps 5–6 of Dijkstra's algorithm on the example graph]
42
c) Shortest paths from source v0
v1  v3  v2  v0    45
v2  v0            10
v3  v2  v0        25
v4  v0            45
v5  (no path from v0)
43
Exercise: Trace the Process, S = 1
[Figure: directed graph on vertices 1–5; the source is vertex 1]
a) Cost adjacency matrix
44
b) Steps in Dijkstra’s algorithm
[Figure: (Dist, From) labels on the graph after each step]
1.–2. Choose u = 5; labels become (10,1) for vertex 5, (20,5) for vertex 4,
(30,1) for vertex 3, (100,1) for vertex 2
45
3. Choose u = 4; Dist(2) improves to (40,4)
4. Choose u = 3; Dist(2) improves to (35,3)
5. Choose u = 2
Shortest paths from source 1:
2  3  1    35
3  1       30
4  5  1    20
5  1       10
46
In-Class Exercise #2
• Write an algorithm to print the shortest paths
from source node v to every other node in G.
For example, 2 4 3 1 should be
printed if the shortest path from node 1 to
node 2 is from 1 through 3, then 4, and finally
2.
47
Problem 3: Optimal Storage on Tapes
• Given n programs to be stored on tape, the lengths of
these n programs are l1, l2 , . . . , ln respectively.
• Suppose the programs are stored in the order
of i1, i2 , . . . , in ,
• Let tj be the time to retrieve program ij , which is the
sum of the lengths of the programs stored in front of it
plus its own length.
• The goal is to minimize the MRT (Mean Retrieval Time),
i.e. we want to minimize the total retrieval time.
48
Example: n = 3, (l1, l2, l3) = ( 5, 10, 3)
There are n! = 6 possible orderings for storing them.
50
Interchange ia and ib in I and call the new list I′:
I:   … ia  ia+1  ia+2  …  ib  …      (x = lia+1 + … + lib−1)
I′:  … ib  ia+1  ia+2  …  ia  …      (ia and ib swapped)
In I′, program ia+1 takes (lia − lib) less time to be retrieved than in I.
In fact, each of the programs ia+1 , …, ib−1 takes (lia − lib) less time.
For ib, the retrieval time decreases by x + lia.
For ia, the retrieval time increases by x + lib.
53
Greedy Strategy#2: Weights are ordered in nondecreasing order (3,2,1)
54
Greedy Solution, Complexity?
1. Calculate pi/wi for all i
2. Sort the items by decreasing pi/wi
3. Let m be the current weight limit (initially m
= M). In each iteration, we remove item i
from the head of the sorted items.
1. If m ≥ wi, we take item i, and m = m-wi, then
consider the next item.
2. Otherwise, we take a fraction f of item i, s.t. f =
m/wi, which weighs exactly m. Done!
Observation: the algorithm may take a fraction of an item, which can only be the last
selected item. 55
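The three steps above as a Python sketch (the function name is mine; the demo instance, with M = 20, profits (25, 24, 15) and weights (18, 15, 10), is a classic textbook example, not one from these slides):

```python
def fractional_knapsack(p, w, M):
    """Greedy: take items in nonincreasing p_i/w_i order; only the last
    selected item may be fractional. Returns (total profit, fractions x)."""
    n = len(p)
    order = sorted(range(n), key=lambda i: p[i] / w[i], reverse=True)
    x = [0.0] * n
    m = M                           # current weight limit
    total = 0.0
    for i in order:
        if w[i] <= m:               # the whole item fits
            x[i] = 1.0
            m -= w[i]
            total += p[i]
        elif m > 0:                 # take the fraction that weighs exactly m
            x[i] = m / w[i]
            total += p[i] * x[i]
            break
    return total, x

total, x = fractional_knapsack([25, 24, 15], [18, 15, 10], 20)
print(total, x)   # 31.5 [0.0, 1.0, 0.5]
```

Only the last chosen item (here, half of item 3) is fractional, as the observation states, and the knapsack is filled exactly.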
Theorems
• Lemma: every optimal solution fills the knapsack
exactly.
– If not, we could increase the contribution of
some object by a fractional amount until the total
weight is exactly M.
56
Analysis
Given some knapsack instance,
suppose the objects are ordered s.t. p1/w1 ≥ p2/w2 ≥ … ≥ pn/wn.
Proof by contradiction:
Case 1: obviously it’s optimal.
Case 2: …
57
So, if profit(z) > profit(y), we reach a contradiction.
60
In-Class Exercise #3
• Consider the knapsack problem. We now add the requirement
that xi = 1 or xi = 0, for all 1 ≤ i ≤ n. That is, an object is either
included or not included in the knapsack. We wish to solve
the problem:
61
Moral of Greedy Algorithms
• Greedy algorithms sometimes give the
optimal solution and sometimes do not,
depending on the problem.
62
Problem 5:
Job Sequencing with Deadlines
The problem:
Given n jobs, associated with job i is
an integer deadline di ≥ 1, and
an integer profit pi ≥ 0
For any job i, profit pi is earned iff
the job is completed by its deadline.
Assume each job needs one unit of execution time
and there is one machine available.
The goal is to find a job processing sequence to
maximize total profit.
63
Example
64
Consider jobs in order of nonincreasing profits p1 ≥ p2 ≥ … ≥ pn , and
maintain at each stage a set J of feasible jobs, i.e. jobs that can be run in
some sequence in which all jobs in J meet their deadlines. After all n
jobs have been considered, J is the set of jobs that maximizes total profit.
65
Question
• How to determine if a set J is feasible?
Claim:
Let J be a set of k jobs, i.e. | J | = k
S = i1 i2… ik is a permutation of the jobs in J
s.t. di1 ≤ di2 ≤ … ≤ dik
then J is feasible iff the jobs in J can be processed in order S
Proof:
(←) Obvious by definition.
(→) If J is feasible, then there exists an ordering S’ = r1 r2… rk in
which all jobs meet their deadlines, i.e. drj ≥ j for all j.
Let a be the smallest index s.t. ra ≠ ia, and let b be the position of ia
in S’ (i.e. rb = ia, b > a). Interchange ra and rb in S’.
All jobs (except ra, which was moved to a later slot) obviously still
meet their deadlines, since S’ is feasible.
Claim: dra ≥ drb
    ra = ic for some c with a+1 ≤ c ≤ k in S
    dia ≤ dic since S is arranged in nondecreasing deadline order
    so dra = dic ≥ dia = drb
Since rb meets its deadline in S’, ra is able to meet its deadline too. 67
Repeat this process:
S’ → S’’ → S’’’ → … → S
So, if J is feasible, J can be processed in order S.
This proof also implies that if S is not feasible, then J has no
feasible ordering.
Example: J = {1, 2, 3, 4, 5, 6}
(d1, d2, d3, d4, d5, d6) = (2, 4, 6, 7, 8, 10)
S’ = 6 1 3 2 4 5
S’’ = 1 6 3 2 4 5
S’’’ = 1 2 3 6 4 5
…
S= 1 2 3 4 5 6
68
Algorithm:
procedure JS (D, n, J, k) // D, n are input; J, k are output
// assume jobs have been sorted by profit in nonincreasing order,
// therefore no “profit” in the parameter list;
// D[1] is the deadline of the most profitable job
{
    D[0] = 0; // sentinel: we can never insert a job before here
    J[1] = 1;
    k = 1;
    for i ← 2 to n do
    {
        r = k;
        while (D[J[r]] > D[i] and D[J[r]] > r) do r = r − 1;
        // (r+1) is the earliest time slot where we can schedule this new job
        if D[i] ≥ r + 1 then // insert i into J
        {
            for l ← k downto (r+1) by −1 do J[l+1] = J[l];
            J[r+1] = i;
            k++;
        }
    }
}
The complexity of JS is O(n²). 69
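A Python sketch of procedure JS, using a list insertion in place of the explicit shifting loop (the demo instance, four jobs with profits 100 ≥ 27 ≥ 15 ≥ 10 and deadlines 2, 1, 2, 1, is a classic textbook example, not from these slides):

```python
def job_sequencing(D):
    """Procedure JS above: jobs 1..n are pre-sorted by nonincreasing profit,
    D[i] is the deadline of job i, and D[0] = 0 is the sentinel.
    Returns J, the selected jobs in nondecreasing deadline order."""
    n = len(D) - 1
    J = [0, 1]                      # slot 0 holds the sentinel job
    for i in range(2, n + 1):
        r = len(J) - 1              # r starts at k, the number of scheduled jobs
        while D[J[r]] > D[i] and D[J[r]] > r:
            r -= 1                  # walk left past jobs that can shift right
        if D[i] >= r + 1:           # slot r+1 is free before job i's deadline
            J.insert(r + 1, i)      # shifts J[r+1..k] right, as in the for loop
    return J[1:]

print(job_sequencing([0, 2, 1, 2, 1]))   # [2, 1]: job 2 in slot 1, job 1 in slot 2
```

Jobs 3 and 4 are rejected because both free slots before their deadlines are already taken by more profitable jobs.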
In-Class Exercise #4
• Suppose we have the following jobs, deadlines, and profits:
• What are the values of J and k at the end of algorithm JS? Show
steps.
70
In-Class Exercise #5
The greedy solution for the Job Sequencing with Deadlines
problem is to sort the jobs by their profits into non-increasing
order, and then consider one job at a time. It has been proved
that this greedy solution is optimal, i.e. it can maximize the total
profit.
71