You are on page 1of 28

CMPT405/705 Greedy Algorithms Qianping Gu

Greedy Algorithms (Ch 4)


• Greedy Algorithm for Interval Scheduling Problem
• Elements of Greedy Algorithms
• Greedy Algorithm for Minimizing the Maximum Lateness Problem
• Greedy Algorithm for Shortest Path Problem
• Greedy Algorithm for Clustering Problem
The lecture notes/slides are adapted from those associated with the text book by J. Kleinberg

and E. Tardos.

1
CMPT405/705 Greedy Algorithms Qianping Gu

Greedy Algorithm for Interval Scheduling Problem


Interval scheduling problem

• Given a set S = {a1 , .., an } of proposed jobs that wish to use a resource
which can serve one job at a time. Each ai has a start time si and a finish time
fi with 0 ≤ si < fi < ∞. If selected, ai takes place in time interval [si , fi ).
• Jobs ai and aj are compatible if [si , fi ) ∩ [sj , fj ) = ∅.
• Goal, Find a maximum subset of mutually compatible jobs from S .

2
CMPT405/705 Greedy Algorithms Qianping Gu

Greedy algorithm for interval scheduling problem

• Find an optimal solution in steps, each step optimizes a partial solution locally.

• Consider jobs in some order, select each job compatible with the selected ones.

• Possible selections:
Earliest-start-time-first, consider jobs in ascending order of si .

Earliest-finish-time-first, consider jobs in ascending order of fi .

Shortest-interval-first, consider jobs in ascending order of fi − si .


Fewest-incompatible-jobs-first, consider jobs in ascending order of the number of
incompatible jobs.

3
CMPT405/705 Greedy Algorithms Qianping Gu

Earliest-finish-time-first algorithm

• In each step, select a job ak with the earliest finish time.

• Intuition, select ak which leaves resource available for as many other jobs as
possible.

• Algorithm
Assume f1 ≤ .. ≤ fn . Let Sk = {ai ∈ S|si ≥ fk }.
Choose a1 ; then choose an ai with the smallest fi from S1 ;
repeat this selection, assume ak was chosen in the previous round of selection,
choose an ai from Sk , until no activity can be selected.

• Sort jobs in ascending order of finish time takes O(n log n) time, the rest steps
take O(n) time, total running time O(n log n).

4
CMPT405/705 Greedy Algorithms Qianping Gu

Example, set S of the following jobs:

i 1 2 3 4 5 6 7 8 9 10 11
si 1 3 0 5 3 5 6 8 8 2 12
fi 4 5 6 7 9 9 10 11 12 14 16

• Initially, partial solution S ∗ = {a1 }, then a job from S1 is selected.

• Since S1 = {a4 , a6 , a7 , a8 , a9 , a11 } and a4 has the smallest fi in S1 ,


S ∗ = {a1 } ∪ {a4 } and then select a job from S4 .

• Since S4 = {a8 , a9 , a11 } and a8 has the smallest fi in S4 ,


S ∗ = {a1 , a4 } ∪ {a8 } and then select a job from S8 .

• Since S8 = {a11 }, S ∗ = {a1 , a4 , a8 } ∪ {a11 } and then select a job from S11 .

• There is no job in S11 and the algorithm terminates.

5
CMPT405/705 Greedy Algorithms Qianping Gu

Theorem. For any nonempty Sk and am ∈ Sk with the smallest fm , am is included


in a maximum-size subset of mutually compatible jobs of Sk .

Proof. Let Ak be a maximum-size subset of mutually compatible jobs in Sk and aj be the


= am then the theorem is proved. Assume
job in Ak with the smallest fj . If aj
aj 6= am . Let A′k = (Ak \ {aj }) ∪ {am }. Since aj ∈ Ak and for any ai ∈ Ak with
ai 6= aj , fj ≤ fi and aj is compatible with ai , fj ≤ si . Therefore and from fm ≤ fj ,
fm ≤ si for every ai ∈ Ak . Thus, jobs in A′k are mutually compatible.

aj

am

6
CMPT405/705 Greedy Algorithms Qianping Gu

• Earliest-start-time-first selection may not give an optimal solution.

• Shortest-interval-first selection may not give an optimal solution.

• Fewest-incompatible-jobs-first selection may not give an optimal solution.

7
CMPT405/705 Greedy Algorithms Qianping Gu

Elements of Greedy Algorithms


• Major steps in designing a greedy algorithm
– Formulate the optimization problem as a sequence of subproblems in which
a local optimal (greedy) choice solves each subproblem.

– Prove that a greedy choice will give an optimal solution to each subproblem.

– Show optimal substructure: for each subproblem, a greedy optimal solution


to the subproblem combined with a greedy choice can give an optimal
solution to a remaining subproblem.

• Greedy-choice property: a global optimal solution can be assembled by making


locally optimal greedy choices.

• Optimal substructure: Optimal solution contains optimal solutions to


subproblems.

8
CMPT405/705 Greedy Algorithms Qianping Gu

Scheduling to minimize maximum lateness

• Let S = {a1 , .., an } be a set of proposed jobs that wish to use a resource
which can serve one job at a time. Each ai has deadline di and wishes a
contiguous time interval ti , before di to use the resource. For each ai , a
schedule S decides the time si at which ai starts to use the resource. The
finish time of ai is fi = si + ti .

The lateness of ai is defined as li = fi − di if fi > di otherwise 0.


Optimization goal: find a schedule S s.t. the maximum lateness of S
l(S) = max1≤i≤n li is minimized.

t1=2,d1=3 t2=3,d2=5 t3=3.5,d3=7

t1=2,d1=3 t2=3,d2=5 t3=3.5,d3=7


t
0 1 2 3 4 5 6 7 8 9
Maximum lateness=max{l1=0,l2=0,l3=1.5}=1.5

9
CMPT405/705 Greedy Algorithms Qianping Gu

A greedy algorithm

• Order the jobs in the order a1 , .., an s.t. d1 ≤ d2 ≤ ... ≤ dn . Let t be the
earliest available time of the resource (e.g., t = 0).

= t and f1 = s1 + t1 ;
For a1 , let s1
For ai with 2 ≤ i ≤ n, si = fi−1 and fi = si + ti ;
Assign each ai to resource at time interval [si , fi );

• No idle time property: The resource serves each job at any time point from the
start of the 1st served job until the finish time of the last served one.

• No inversion property: A schedule has an inversion if there are two jobs ai and
aj s.t. dj < di and sj > si . A schedule has no inversion if si < sj for
di < dj .

• The algorithm above has no idle time and no inversion.

10
CMPT405/705 Greedy Algorithms Qianping Gu

• Claim 1: There is an optimal schedule with no idle time.


For an optimal schedule S , assume that the resource is idle for a time interval t = [s, f )
before some jobs are served. Then for every job ai , if ai is served before time s then let
s′i = si , otherwise (f ≤ si ), let s′i = si − (f − s). Then we get a schedule S ′ with
l(S ′ ) ≤ l(S). Repeat this process, we get an optimal schedule with no idle time property.

• Claim 2: All schedules with no idle time and no inversions have the same maximum
lateness.

Let S and S ′ be schedules with no idle time and no inversions. Then two jobs ai and aj
are in different served orders in S and S ′ only if di
= dj = d. All jobs with deadline d
are served consecutively in S and S ′ (after jobs with deadline< d and before jobs with
deadline> d. The maximum lateness of all jobs with deadline d is independent of the
orders they served.

S lateness=d
deadline<d ai aj deadline>d
lateness
d
S’ lateness=d
deadline<d aj ai deadline>d
lateness
d

11
CMPT405/705 Greedy Algorithms Qianping Gu

• Claim 3: There is an optimal schedule that has no idle time and no inversions.
Let S : aj1 , .., ajn be an optimal schedule with no idle time (by Claim 1). If S has no
inversion then the claim holds. Assume S has an inversion. Then there is pair aji and
aji+1 s.t. dji > dji+1 . Let S ′ be the schedule obtained by exchanging the ordr of aji
and aji+1 . Then S ′ has one less inversion than S . We prove l(S ′ ) ≤ l(S).
Let ljr be the lateness and fjr be the finish time of ajr in S . Let lj′ be the lateness of
r
ajr in S ′ . Then
ljr = lj′ r for every r 6= i, i + 1,
lj′ i+1 < fji+1 − dji+1 = lji+1 and
lj′ i = fji+1 − dji < fji+1 − dji+1 = lji+1 .
We get max{lj′ , lj′ i+1 } ≤ max{lji , lji+1 }. Thus, l(S ′ ) ≤ l(S).
i
Repeat the exchange process above, we get the claim.
S aj aj
i i+1

lj
i lj
dj dj fj i+1 fj
i+1 i i i+1

S’ aj aj
i+1 i

l’j
l’j i
dj dj i+1 fj fj
i+1 i i i+1

12
CMPT405/705 Greedy Algorithms Qianping Gu

• The greedy algorithm finds an optimal schedule in t(n) + O(n) time, t(n) is the time to
sort n numbers.

Proof. By Claim 3, there is an optimal schedule with no idle time and no inversion. By Claim 2,
all schedules with no idle time and on inversion have the same maximum lateness. Hence, a
schedule with no idle time and no inversion is optimal. The algorithm finds a schedule with no idle
time and no inversion, an optimal one.

It takes t(n) time to sort the jobs in ascending order of their deadlines and O(n) time compute
si and fi . The running time of the algorithm is t(n) + O(n).

13
CMPT405/705 Greedy Algorithms Qianping Gu

Greedy Algorithm for Shortest Path Problem


• G(V, E), weighted digraph with each arc e assigned a real value length d(e)
(or cost or weight or ..).

• For a path P consists of arcs e1 , e2 , .., ek , the length of P is


P
d(P ) = 1≤i≤k d(ei ).
• For nodes s, t in G, the shortest path from s to t is a path from s to t with the
minimum length. The length of the shortest path from s to t is called the
distance d(s, t) from s to t.

50

14
CMPT405/705 Greedy Algorithms Qianping Gu

• Single source shortest paths (SSSP) problem, find the shortest paths
(distances) from a source node s to all other nodes.

• All pairs shortest paths (APSP) problem, for every pair of nodes u, v , find the
shortest path (distance) from u to v .

15
CMPT405/705 Greedy Algorithms Qianping Gu

Dijkstra’s algorithm for single source shortest paths problem

• Greedy approach: Let s be a source node, maintain a set S of nodes, for each u ∈ S ,
d(s, u) =length of a shortest path s u has been found.
– Initialize S = {s}, d(s, s) = 0;
– In each step, choose a node v 6∈ S which minimizes
˜ v) = mine=(u,v):u∈S d(s, u) + d(e);
d(s,
˜ v);
S = S ∪ {v}; d(s, v) = d(s,
– Repeat above until S = V (G).

S : solution to a subproblem (shortest distances to a subset of nodes found).

˜ v): (local) shortest distance of s


d(s, v paths passing through nodes of S only.

16
CMPT405/705 Greedy Algorithms Qianping Gu

Dijkstra Algorithm [Dijkstra 1956] for Shortest Distance


Input: Weighted digraph G with non-negative edge length and source s.
Output: Distance d(s, u) to every node u.
:= {s}; d(s, s) = 0;
Initially, S
for every node u 6= s,
˜ u) = d(e) else d(s,
if e = (s, u) ∈ E then d(s, ˜ u) = ∞;
while S 6= V do
v = arg minv∈V \S {d(s,˜ v)};
S := S ∪ {v}; d(s, v) = d(s,˜ v);
′ ˜ v ′ ) = min{d(s,
for every edge e = (v, v ), d(s, ˜ v ′ ), d(s, v) + d(e)};
end while

17
CMPT405/705 Greedy Algorithms Qianping Gu

Dijkstra Algorithm [Dijkstra 1956] for Shortest Distance and Path


Input: Weighted digraph G with non-negative edge length and source s.
Output: Distance d(s, u) to every node u and predecessor P (u).
:= {s}; d(s, s) = 0; P [s] = s;
Initially, S
for every node u 6= s,
P [u] = null;
˜ u) = d(e) else d(s,
if e = (s, u) ∈ E then d(s, ˜ u) = ∞;
while S 6= V do
v = arg minv∈V \S {d(s,˜ v)}; let e = (u, v) be the edge s.t. d(s,
˜ v) = d(s, u) + d(e);
S := S ∪ {v}; d(s, v) = d(s,˜ v); P [v] := u;
′ ˜ v ′ ) = min{d(s,
for every edge e = (v, v ), d(s, ˜ v ′ ), d(s, v) + d(e)};
end while

18
CMPT405/705 Greedy Algorithms Qianping Gu

Theorem. [Dijkstra 1956] Dijkstra’s Algorithm finds the distance d(s, u) and the shortest path
from s to every node in G.

Proof. Let Pu be the path s, .., P [P [u]], P [u], u. We prove the loop invariant: For each node
u ∈ S , d(s, u) =length of a shortest path s u and Pu is a shortest path s u. For |S| = 1,
S = {s}, d(s, s) = 0 and Ps = s, the invariant is true. Assume the invariant holds for |S| ≥ 1.
Let v be the next node added to S and e ˜ v) = d(s, u) + d(e). Let
= (u, v) be the edge s.t. d(s,
P be any other path s v , e1 = (x, y) be the 1st edge in P with x ∈ S and y 6∈ S , and P ′ be
the subpath s x of P . Then
˜ y) ≥ d(s,
d(P ) ≥ d(P ′ ) + d(e1 ) ≥ d(s, x) + d(e1 ) ≥ d(s, ˜ v).

Thus, d(s, v) =length of a shortest path s v and Pv is a shortest path s v.

u e v
s

P’ x y
e1

19
CMPT405/705 Greedy Algorithms Qianping Gu

Running time of Dijkstra’s Algorithm,

• For S = {s}, O(n) time for initialization.


n − 1 iterations of the while loop.
Each time a node v is added to S , O(outdeg(v)) updates
˜ v ′ ) = min{d(s,
d(s, ˜ v ′ ), d(s, v) + d(v, v ′ )}, total O(m) updates.

• Straightforward implementation
˜ v) = mine=(u,v),u∈S {d(s, u) + d(e)} minimal,
O(n) time to find v with d(s,
2
total O(n ) time.

• Improvements
˜ v) = mine=(u,v),u∈S {d(s, u) + d(e)} in a priority queue.
Keep d(s,

By a binary heap, O(1) time to find v and O(log n) time to adjust the heap after
˜ v
an update d(s,
′ ˜ v ′ ), d(s, v) + d(v, v ′ )}, total O(m log n) time.
) = min{d(s,

20
CMPT405/705 Greedy Algorithms Qianping Gu

Clustering
• Clustering instance: a set V of n points p1 , .., pn and a distance function
d(pi , pj ) for pi , pj ∈ V with d(pi , pi ) = 0, d(pi , pj ) ≥ 0 for i 6= j and
d(pi , pj ) = d(pj , pi ).
For subsets V1 and V2 of V , distance between V1 and V2 is defined as
d(V1 , V2 ) = minpi ∈V1 ,pj ∈V2 {d(pi , pj )}.
• k -Clustering problem
Given a clustering instance and integer k , partition V into k subsets V1 , .., Vk
s.t. mini6=j d(Vi , Vj ) is maximized. The problem can be solved by an algorithm
for the minimum spanning tree problem.

21
CMPT405/705 Greedy Algorithms Qianping Gu

Minimum spanning tree problem

• Let G(V, E) be a connected graph. A subgraph T of G is a spanning tree of G


if V (T ) = V (G) and T is a tree.

• If G is weighted (each edge e is assigned a non-negative weight d(e) then each


P
spanning tree T has a weight e∈E(T ) d(e).

Minimum spanning tree (MST) of G, a spanning tree T of G with the minimum


weight.

• Minimum spanning tree problem: Given G, find an MST of G.

22
CMPT405/705 Greedy Algorithms Qianping Gu

Kruskal’s Algorithm [Kruskal 1956]


Input: Weighted graph G.
Output: A MST of G.
A = ∅;
for each node v ∈ V create component c(v) = {v};
sort edges of G into e1 , .., em s.t. w(ei ) ≤ w(ej ) for i < j ;
for i = 1, 2, .., m do
if for edge ei = {u, v}, c(u) 6= c(v) then
A = A ∪ {ei }; c(u) = c(u) ∪ c(v); c(v) = c(u);
return A;

Disjoint-set data structure is used to maintain components of GA .

The running time of Kruskal’s algorithm is O(m log n).

8 7 8 7
b c d b c d
4 2 9 4 2 9
4 4
a 11 i 14 e a 11 i 14 e
7 6 7 6
8 10 8 10
(a) g (b) g
h 1 2 f h 1 2 f

23
CMPT405/705 Greedy Algorithms Qianping Gu

8 7 8 7
b c d b c d
4 2 9 4 2 9

11 4 11 4
a i 14 e a i 14 e
7 6 7 6
8 10 8 10
(a) g (b) g
h 1 2 f h 1 2 f

8 7 8 7
b c d b c d
4 2 9 4 2 9

11 4 11 4
a i 14 e a i 14 e
7 6 7 6
8 10 8 10
(c) g (d) g
h 1 2 f h 1 2 f

8 7 8 7
b c d b c d
4 2 9 4 2 9

11 4 11 4
a i 14 e a i 14 e
7 6 7 6
8 10 8 10
(e) g (f) g
h 1 2 f h 1 2 f

8 7 8 7
b c d b c d
4 2 9 4 2 9

11 4 11 4
a i 14 e a i 14 e
7 6 7 6
8 10 8 10
(g) g (h) g
h 1 2 f h 1 2 f

24
CMPT405/705 Greedy Algorithms Qianping Gu

8 7 8 7
b c d b c d
4 2 9 4 2 9
4 4
a 11 i 14 e a 11 i 14 e
7 6 7 6
8 10 8 10
(i) (j)
h 1 g 2 f h 1 g 2 f

8 7 8 7
b c d b c d
4 2 9 4 2 9
4 4
a 11 i 14 e a 11 i 14 e
7 6 7 6
8 10 8 10
(k) (l)
h 1 g 2 f h 1 g 2 f

8 7 8 7
b c d b c d
4 2 9 4 2 9
4 4
a 11 i 14 e a 11 i 14 e
7 6 7 6
8 10 8 10
(m) (n)
h 1 g 2 f h 1 g 2 f

25
CMPT405/705 Greedy Algorithms Qianping Gu

Prim’s Algorithm [Jarnik 1930, Dijkstra 1957, Prim 1959]


Input: Weighted graph G.
Output: A MST of G.
begin
S := {s}; E(T ) := ∅;
while S 6= V (G) do
Choose an edge e = {u, v} with u ∈ S, v ∈ V (G) \ S and minimum d(e);
S := S ∪ {v}; E(T ) := E(T ) ∪ {e};
end while

8 7 8 7 8 7
b c d b c d b c d
4 2 9 4 2 9 4 2 9
4 4 4
a 11 i 14 e a 11 i 14 e a 11 i 14 e
7 6 7 6 7 6
8 10 8 10 8 10
(a) g (b) g (c) g
h 1 2 f h 1 2 f h 1 2 f

26
CMPT405/705 Greedy Algorithms Qianping Gu

8 7 8 7 8 7
b c d b c d b c d
4 2 9 4 2 9 4 2 9
11 4 11 4 11 4
a i 14 e a i 14 e a i 14 e
7 6 7 6 7 6
8 10 8 10 8 10
(a) g (b) g (c) g
h 1 2 f h 1 2 f h 1 2 f

8 7 8 7 8 7
b c d b c d b c d
4 2 9 4 2 9 4 2 9
4 4 4
a 11 i 14 e a 11 i 14 e a 11 i 14 e
7 6 7 6 7 6
8 10 8 10 8 10
(d) (e) (f)
h 1 g 2 f h 1 g 2 f h 1 g 2 f

8 7 8 7 8 7
b c d b c d b c d
4 2 9 4 2 9 4 2 9
4 4 4
a 11 i 14 e a 11 i 14 e a 11 i 14 e
7 6 7 6 7 6
8 10 8 10 8 10
(g) g (h) g (i) g
h 1 2 f h 1 2 f h 1 2 f

27
CMPT405/705 Greedy Algorithms Qianping Gu

MST and k -Clustering

• Kruskal’s algorithm and k -clustering problem.


Represent a clustering instance by a weighted complete graph G. Run Kruskal’s
algorithm with a data structure keeping the connected components created in
the algorithm. Terminate the algorithm when the number of components
becomes k .

• MST and k -clustering problem.


Compute a MST T (e.g., by Prim’s algorithm). Remove k − 1 edges with the
largest distances from T .

28

You might also like