SECTION 9:
DISCUSSION:
In this section we will examine two important graph problems which arise in
modeling and solving certain “real world” problems, such as minimizing the cost of linking
the various nodes of a communications network or finding the cheapest way of going
from one city to another. These are the problems of finding a minimum cost
spanning tree for a graph, and finding shortest paths between its vertices.
We are interested in the algorithms to solve these problems not only for their
practical usefulness, but more for the lesson to be derived from them as prime examples
of a particular technique for algorithm design (the ‘greedy’ approach) and also for the
challenge of implementing them efficiently on the computer. The beauty of these
algorithms lies in their brevity and simplicity, and that in itself is a valuable lesson to learn.
In the previous section, we saw how depth-first search or breadth-first
search initiated from any vertex in a connected undirected graph G can be used to
generate a spanning tree for G. For the case in which costs are assigned to the edges in
the tree, we define the cost of a spanning tree T as the sum of the costs of its edges, or

cost(T) = Σ cost(u, v), taken over all edges (u, v) in T

A spanning tree of least cost is called a minimum cost spanning tree.
Jennifer Laraya-Llovido
Data Structures & Algorithms
The number of spanning trees which can be constructed for a given graph is
rather large. Specifically, for a complete graph on n vertices, the number of spanning
trees is n^(n-2). This result follows from the following theorem:

Cayley’s theorem: The number of spanning trees for n distinct vertices is n^(n-2).
Thus, for a complete graph on four vertices, the number of spanning trees is 16; for ten
vertices, it is 100 million! Even for a graph that is not complete, it is reasonable to expect
that the number of spanning trees is still quite large. Obviously, to find a minimum cost
spanning tree for a given undirected graph by enumeration, i.e., constructing all possible
spanning trees for the graph, computing the cost of each, and selecting one with minimum
cost, is definitely out of the question. (In any case, such an approach would be tedious
and totally uninteresting.)
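These counts are easy to check; a minimal sketch, assuming only Cayley's formula n^(n-2):

```python
def cayley(n):
    """Number of spanning trees of the complete graph on n labeled vertices,
    by Cayley's formula n^(n-2)."""
    return n ** (n - 2)

print(cayley(4))   # complete graph on four vertices
print(cayley(10))  # complete graph on ten vertices
```

For n = 4 this gives 16 and for n = 10 it gives 100,000,000, matching the figures quoted above.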
MST theorem: Let G = (V, E) be a connected, weighted, undirected graph, and let U be a
proper subset of V. If (u, v) is an edge of least cost such that u Є U and v Є V-U, then
there is a minimum cost spanning tree for G which includes edge (u, v).

Proof: Suppose T’ is a minimum cost spanning tree for G and edge (u, v) is not in
T’. Now add (u, v) to T’. Clearly a cycle is formed in T’ with (u, v) as one of the edges in
the cycle. Moreover, there must be some edge (p, q), with p Є U and q Є V-U, in the
resulting cycle [see figure below]. Since edge (u, v) is an edge of least cost among those
edges with one vertex in U and the other vertex in V-U, then cost (u, v) ≤ cost (p,
q). Hence, removing (p, q) from T’ + (u, v) yields a spanning tree whose cost does not
exceed that of T’, and which contains edge (u, v). This proves the theorem.

[Figure: the vertex sets U and V-U, with edge (u, v) of cost (u, v) and edge (p, q) of
cost (p, q) both crossing between them in the cycle formed.]
Prim’s algorithm
Let U denote the set of vertices already chosen and T denote the set of edges
already taken in at any instance of the algorithm. Initially, U and T are both empty. Prim’s
algorithm may be stated as follows:

1. Choose a start vertex s and place it in U.
2. Repeat the following n-1 times:
   a. Select the edge (u, v) of least cost such that u Є U and v Є V-U.
   b. Add vertex v to U and edge (u, v) to T.

We see from this description of Prim’s algorithm that it is a direct and straightforward
application of the MST theorem. To see how the algorithm actually works, and to assess
which steps are crucial in implementing the algorithm on a computer, consider the
following example.
[Figure: a connected, weighted, undirected graph on vertices 1-8, with edge costs
(1, 2) = 10, (1, 7) = 18, (1, 8) = 12, (2, 3) = 22, (2, 4) = 14, (4, 5) = 13,
(4, 7) = 15, (5, 3) = 23, (5, 6) = 8, (6, 7) = 17 and (8, 7) = 16.]
Step 1. U = {1}, V-U = {2, 3, 4, 5, 6, 7, 8}.
        Edges from U to V-U: (1, 2) -- 10, (1, 7) -- 18, (1, 8) -- 12.
        Least-cost edge: (1, 2); add vertex 2 to U.

Step 2. U = {1, 2}, V-U = {3, 4, 5, 6, 7, 8}.
        Edges: (1, 7) -- 18, (1, 8) -- 12, (2, 3) -- 22, (2, 4) -- 14.
        Least-cost edge: (1, 8); add vertex 8 to U.

Step 3. U = {1, 2, 8}, V-U = {3, 4, 5, 6, 7}.
        Edges: (1, 7) -- 18, (2, 3) -- 22, (2, 4) -- 14, (8, 7) -- 16.
        Least-cost edge: (2, 4); add vertex 4 to U.

Step 4. U = {1, 2, 4, 8}, V-U = {3, 5, 6, 7}.
        Edges: (1, 7) -- 18, (2, 3) -- 22, (4, 5) -- 13, (4, 7) -- 15, (8, 7) -- 16.
        Least-cost edge: (4, 5); add vertex 5 to U.

Step 5. U = {1, 2, 4, 5, 8}, V-U = {3, 6, 7}.
        Edges: (1, 7) -- 18, (2, 3) -- 22, (4, 7) -- 15, (5, 3) -- 23, (5, 6) -- 8, (8, 7) -- 16.
        Least-cost edge: (5, 6); add vertex 6 to U.

Step 6. U = {1, 2, 4, 5, 6, 8}, V-U = {3, 7}.
        Edges: (1, 7) -- 18, (2, 3) -- 22, (4, 7) -- 15, (5, 3) -- 23, (6, 7) -- 17, (8, 7) -- 16.
        Least-cost edge: (4, 7); add vertex 7 to U.

Step 7. U = {1, 2, 4, 5, 6, 7, 8}, V-U = {3}.
        Edges: (2, 3) -- 22, (5, 3) -- 23.
        Least-cost edge: (2, 3); add vertex 3 to U.

The resulting minimum cost spanning tree consists of the edges (1, 2), (1, 8), (2, 4),
(4, 5), (5, 6), (4, 7) and (2, 3). Cost = 94.
An efficient implementation of Prim’s algorithm maintains two vectors: for each vertex i
in V-U, CLOSEST(i) is the vertex in U nearest to i, and LOWCOST(i) is the cost of the
edge (CLOSEST(i), i).

Example: if vertex i in V-U is joined to vertices p, q and r in U by edges of cost 25, 30
and 15, respectively, then CLOSEST(i) = r and LOWCOST(i) = 15.
Initially, each element of CLOSEST is set equal to the starting vertex, say s
(since, initially, s is the only vertex in U). Correspondingly, LOWCOST is initialized to
contain the costs of the edges from each vertex in V-U to vertex s in U. LOWCOST(s) is
set to infinity to indicate that it is already in U. Subsequently, at each step, we find the
smallest element of LOWCOST, say LOWCOST(k), to give us the vertex k in V-U that is
connected to some vertex in U by an edge of least cost [by definition, this vertex is
CLOSEST(k)]. Vertex k is then added to U, edge (CLOSEST(k), k) attached to the
growing tree, and the vectors LOWCOST and CLOSEST accordingly updated.
[Figure: the LOWCOST and CLOSEST vectors before and after vertex k is transferred
from V-U to U.]
procedure PRIM(C, s)
//Generates a minimum cost spanning tree for a connected, weighted, undirected graph
on n vertices using Prim’s algorithm. Graph is represented by its full cost adjacency
matrix C. Vertex s is the start vertex.//
//Initializations//
for i ← 1 to n do
    CLOSEST(i) ← s
    LOWCOST(i) ← C(s, i)
endfor
LOWCOST(s) ← ∞
CLOSEST(s) ← 0    // CLOSEST(i) = 0 indicates that vertex i is already in U //
mincost ← 0    // cost of minimum cost spanning tree //
//Find the n-1 edges of the minimum cost spanning tree//
for j ← 1 to n-1 do
    k ← index of the smallest element of LOWCOST with CLOSEST(k) ≠ 0
    output edge (CLOSEST(k), k)
    mincost ← mincost + LOWCOST(k)
    LOWCOST(k) ← ∞
    CLOSEST(k) ← 0    // vertex k now joins U //
    for i ← 1 to n do    // update LOWCOST and CLOSEST //
        if CLOSEST(i) ≠ 0 and C(k, i) < LOWCOST(i) then [ LOWCOST(i) ← C(k, i)
                                                          CLOSEST(i) ← k ]
    endfor
endfor
end PRIM
For an undirected graph on n vertices and e edges, the time complexity of Prim’s
algorithm as implemented using the LOWCOST and CLOSEST vectors is clearly O(n^2).
If a heap is used to store the edges from U to V-U and to find the edge of least cost, an
O(e log2 e) version is possible.
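A sketch of this O(n^2) scheme in Python, mirroring the LOWCOST and CLOSEST vectors (assumed details: vertices are 0-indexed, the full cost matrix uses inf for absent edges, and a boolean array marks membership in U):

```python
INF = float('inf')

def prim(C, s):
    """Prim's algorithm on a full cost adjacency matrix C.
    Returns (mincost, tree_edges); O(n^2) overall."""
    n = len(C)
    in_u = [False] * n
    in_u[s] = True
    closest = [s] * n                       # CLOSEST: nearest vertex in U
    lowcost = [C[s][i] for i in range(n)]   # LOWCOST: cost of that edge
    mincost, tree = 0, []
    for _ in range(n - 1):
        # vertex in V-U joined to U by an edge of least cost
        k = min((i for i in range(n) if not in_u[i]), key=lambda i: lowcost[i])
        mincost += lowcost[k]
        tree.append((closest[k], k))
        in_u[k] = True                      # k enters U
        for i in range(n):                  # update the two vectors
            if not in_u[i] and C[k][i] < lowcost[i]:
                lowcost[i] = C[k][i]
                closest[i] = k
    return mincost, tree
```

On the eight-vertex example above (renumbered 0-7), this returns a tree of cost 94.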
Kruskal’s Algorithm
Initially, the edge of least cost is chosen. Subsequently, at each step, the edge of
least cost among the remaining edges in E is considered for inclusion in T. If including
this edge in T will create a cycle with the edges already in T, then it is rejected. The
algorithm terminates once n-1 edges have been included in T.
As with Prim’s algorithm, we see from this description of Kruskal’s algorithm that it
is also a straightforward application of the MST theorem. To see how the algorithm
actually works, and to assess what steps are critical in implementing the algorithm on a
computer, consider the following example. For the moment, ignore the FOREST column.
[Figure: a connected, weighted, undirected graph on vertices 1-7.] Its edges, sorted in
nondecreasing order of cost, are:

EDGE          COST
=====         =====
(1, 7)  --      1
(3, 4)  --      3
(2, 7)  --      4
(3, 7)  --      9
(2, 3)  --     15
(4, 7)  --     16
(4, 5)  --     17
(1, 2)  --     20
(1, 6)  --     23
(5, 7)  --     25
(5, 6)  --     28
(6, 7)  --     36
Initially, each vertex is a component by itself: {1} {2} {3} {4} {5} {6} {7}.

(1, 7) --  1: Accept.  Components: {1, 7} {2} {3} {4} {5} {6}
(3, 4) --  3: Accept.  Components: {1, 7} {3, 4} {2} {5} {6}
(2, 7) --  4: Accept.  Components: {1, 2, 7} {3, 4} {5} {6}
(3, 7) --  9: Accept.  Components: {1, 2, 3, 4, 7} {5} {6}
(2, 3) -- 15: Reject (vertices 2 and 3 are in the same component).
(4, 7) -- 16: Reject (vertices 4 and 7 are in the same component).
(4, 5) -- 17: Accept.  Components: {1, 2, 3, 4, 5, 7} {6}
(1, 2) -- 20: Reject (vertices 1 and 2 are in the same component).
(1, 6) -- 23: Accept.  Components: {1, 2, 3, 4, 5, 6, 7}

At this point n-1 = 6 edges have been accepted and the algorithm terminates.
Cost = 1 + 3 + 4 + 9 + 17 + 23 = 57.
The major computational effort incurred in using Kruskal’s algorithm is sorting the
edges in nondecreasing order of their cost. If e is the number of edges in the graph, this
takes O(e log2 e) time if heapsort, say, is used.
To determine efficiently whether a candidate edge forms a cycle, the vertices of G are
grouped into equivalence classes, one for each component of the growing forest T. An
edge incident on two vertices that belong to the same equivalence class
is rejected (because a cycle will otherwise be formed in this particular component of T),
while an edge incident on two vertices that belong to two different equivalence classes is
accepted, resulting in a union of the two classes into one. (In terms of the growing tree, this is
the equivalent of two components joined together.) The UNION and FIND procedures of
Session 10 are the key to an efficient implementation of Kruskal’s algorithm, as is
abundantly evident in the EASY procedure KRUSKAL.
//The EASY procedure UNION implements the weighting rule for the UNION operation//
procedure UNION(i, j)
// Merges the trees with roots i and j, i <> j, using the weighting rule for union.
A root r stores -(number of nodes in its tree) in FATHER(r). //
array FATHER(1:n)
count ← FATHER(i) + FATHER(j)
if |FATHER(i)| > |FATHER(j)| then [ FATHER(j) ← i
                                    FATHER(i) ← count ]
                             else [ FATHER(i) ← j
                                    FATHER(j) ← count ]
end UNION
// The EASY procedure FIND implements the collapsing rule for the FIND operation//
procedure FIND(i)
// Finds the root of the tree containing node i and compresses the path from node i to the
root //
array FATHER(1:n)
//Find root//
k ← i
while FATHER(k) > 0 do
    k ← FATHER(k)
endwhile
//Compress the path from node i to the root k//
while i <> k do
    j ← FATHER(i)
    FATHER(i) ← k
    i ← j
endwhile
return (k)
end FIND
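A Python sketch of these two procedures, together with Kruskal's algorithm built on top of them (assumed details: vertices are 0-indexed, and a root r stores -(size of its tree) in FATHER(r), which is what the weighting rule compares):

```python
def make_sets(n):
    # Every vertex starts as a singleton tree: a root stores -(tree size).
    return [-1] * n

def find(father, i):
    """Root of the tree containing i, applying the collapsing rule."""
    k = i
    while father[k] >= 0:          # climb to the root
        k = father[k]
    while father[i] >= 0:          # second pass: point each node on the path at k
        parent = father[i]
        father[i] = k
        i = parent
    return k

def union(father, i, j):
    """Merge the trees with roots i and j (i != j) using the weighting rule."""
    count = father[i] + father[j]          # negative of the combined size
    if abs(father[i]) > abs(father[j]):    # i's tree is larger: j hangs under i
        father[j] = i
        father[i] = count
    else:
        father[i] = j
        father[j] = count

def kruskal(n, edges):
    """Kruskal's algorithm; edges is a list of (cost, u, v) triples.
    Returns (mincost, tree_edges)."""
    father = make_sets(n)
    mincost, tree = 0, []
    for cost, u, v in sorted(edges):       # nondecreasing order of cost
        ru, rv = find(father, u), find(father, v)
        if ru != rv:                       # different components: accept the edge
            union(father, ru, rv)
            mincost += cost
            tree.append((u, v))
            if len(tree) == n - 1:
                break
    return mincost, tree
```

On the seven-vertex example above (renumbered 0-6), this yields a tree of cost 57.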
We now turn to the problem of finding shortest paths in a weighted digraph G = (V, E),
where the cost of a path is the sum of the costs of the edges in the path. Two versions
of the problem are:
(a) the single source shortest paths (SSSP) problem – determine the cost of the
    shortest path from a given vertex, called the source vertex, to every other
    vertex in V
(b) the all-pairs shortest paths (APSP) problem – determine the cost of the
    shortest path from each vertex to every other vertex in V.
The classical solution to the SSSP problem is called Dijkstra’s algorithm, which
is in the same greedy class as Prim’s and Kruskal’s algorithms. For a graph on n
vertices, we can view the APSP problem as n instances of the SSSP problem. To
solve the APSP problem, therefore, we can apply Dijkstra’s algorithm n times,
taking each vertex in turn as the source vertex. However, there is a more direct
solution to the APSP problem called Floyd’s algorithm.
The general idea behind Dijkstra’s algorithm may be stated as follows: Each
vertex is assigned a class and a value. A class 1 vertex is a vertex whose shortest
distance from the source vertex, say k, has already been found; a class 2 vertex is a
vertex whose shortest distance from k has yet to be found. The value of a class 1 vertex
is equal to its distance from vertex k along a shortest path; the value of a class 2 vertex is
its shortest distance from vertex k found thus far.
Now, the algorithm:

1. Place the source vertex k in class 1 and all other vertices in class 2.
2. Set the value of vertex k to zero and the value of all other vertices to ∞.
3. Repeat the following until the destination vertex l is placed in class 1:
   a. Define the pivot vertex as the vertex most recently placed in class 1.
   b. Adjust the value of each vertex in class 2 as follows:
      i. If a vertex is not connected to the pivot vertex, its value remains the
         same.
      ii. If a vertex is connected to the pivot vertex, replace its value by the
          minimum of its current value or the value of the pivot vertex plus
          the cost of the edge from the pivot vertex to the vertex in class 2.
   c. Choose a class 2 vertex with minimum value and place it in class 1.
The crucial step is 3.c: the class 2 vertex with minimum value can be placed in class 1
because the shortest path from k to it passes through class 1 vertices only.
Suppose, to the contrary, that there are class 2 vertices in such a path, as
shown in Figure 2. If the hypothetical path s2 + s3 is shorter than s1, then it
must be true that s2 < s1, since s3 cannot be negative if all costs are
nonnegative. Now, if s2 < s1, then vertex x must have been placed in class 1
ahead of vertex j. Since this is not in fact the case, then s1 must be shorter
than s2, which means that the shortest path from k to j passes through class 1
vertices only. Note that this argument hinges on the requirement that edge
costs are nonnegative; if there are negative costs, Dijkstra’s algorithm will not
work properly.
[Figure 2: the next class 1 vertex j is a class 2 vertex with minimum value; all
intermediate vertices in its shortest path from k, of cost s1, are in class 1. A
hypothetical shorter path to j, of cost s2 + s3, passes through an intermediate
vertex x not in class 1.]
In step 3.b.ii of Dijkstra’s algorithm, the cost of the shortest path from the source
vertex k to a class 2 vertex j is updated according to

value(j) ← minimum [ value(j), value(p) + cost(p, j) ]     (1)

where p is the current pivot vertex. Figure 3a shows graphically the meaning of this
formula. Note that in the new, possibly shorter, path from k to j which passes through p,
there is no intermediate vertex in the path from p to j. Could it be that there is a yet
shorter path from k to j passing through p and some intermediate vertex q between p and
j, as shown in Figure 3b? Such a path would in fact be shorter if s2 + s3 < s1. But this
cannot be, since q is an older class 1 node than p. Hence, the only possible shorter path
from k to j and passing through p is the one depicted in Figure 3a; hence Eq. (1) is
sufficient.
[Figure 3: (a) the path from k to j through the pivot vertex p, of cost
value(p) + cost(p, j); (b) a hypothetical shorter path from k to j through p and an
older class 1 vertex q, of cost s2 + s3 < s1.]
Dijkstra’s algorithm gives the cost of the shortest path from the source vertex k to
the destination vertex l, but it does not tell which edges in E comprise the path. To
construct the path, we can define a vector of size n, say PATH, such that PATH(i) = j if
vertex i changes value in step 3.b.ii when the pivot vertex is j. By Dijkstra’s algorithm,
vertex j is simply the predecessor of vertex i in the shortest path from vertex k to vertex l.
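The class/value/PATH scheme can be sketched in Python as follows (this is not the EASY procedure DIJKSTRA itself; assumed details: 0-indexed vertices and a full cost matrix with inf for absent edges):

```python
INF = float('inf')

def dijkstra(C, k, l):
    """Cost of the shortest path from source k to destination l on cost
    matrix C, plus the PATH (predecessor) vector."""
    n = len(C)
    in_class1 = [False] * n
    value = [INF] * n
    path = [-1] * n            # PATH(i): predecessor of i on the shortest path
    value[k] = 0
    in_class1[k] = True
    pivot = k
    while not in_class1[l]:
        for i in range(n):     # step 3.b: adjust the values of class 2 vertices
            if not in_class1[i] and value[pivot] + C[pivot][i] < value[i]:
                value[i] = value[pivot] + C[pivot][i]
                path[i] = pivot
        # step 3.c: a class 2 vertex of minimum value enters class 1
        pivot = min((i for i in range(n) if not in_class1[i]),
                    key=lambda i: value[i])
        in_class1[pivot] = True
    return value[l], path
```

On the seven-vertex example that follows (renumbered 0-6, so source 3 becomes index 2 and destination 5 becomes index 4), this returns a cost of 100.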
[Figure: a weighted digraph on vertices 1-7. The edges used in the trace below include
cost(3, 2) = 10, cost(3, 4) = 30, cost(3, 5) = 110, cost(2, 6) = 60, cost(4, 5) = 70,
cost(4, 6) = 20 and cost(4, 7) = 50. The source vertex is 3 and the destination vertex
is 5.]
VERTEX   CLASS   VALUE   PATH
   1       2       ∞       0
   2       2       ∞       0
   3       1       0       0      First pivot vertex is the source vertex 3.
   4       2       ∞       0
   5       2       ∞       0
   6       2       ∞       0
   7       2       ∞       0

VERTEX   CLASS   VALUE   PATH
   1       2       ∞       0
   2       2      10       3
   3       1       0       0      Next pivot vertex is vertex 2.
   4       2      30       3
   5       2     110       3
   6       2       ∞       0
   7       2       ∞       0

VERTEX   CLASS   VALUE   PATH
   1       2       ∞       0
   2       1      10       3
   3       1       0       0      Next pivot vertex is vertex 4.
   4       2      30       3
   5       2     110       3
   6       2      70       2
   7       2       ∞       0

VERTEX   CLASS   VALUE   PATH
   1       2       ∞       0
   2       1      10       3
   3       1       0       0      Next pivot vertex is vertex 6.
   4       1      30       3
   5       2     100       4
   6       2      50       4
   7       2      80       4
The path from vertex 3 to vertex 5 can be determined from the PATH vector by
tracing the predecessor vertices in reverse order starting at vertex 5, thus: PATH(5) = 4
and PATH(4) = 3, giving the shortest path 3, 4, 5 with cost 100.

Note that in the process of calculating the cost of the shortest path from the
source vertex k to the end vertex l, the procedure DIJKSTRA also finds the cost of the
shortest paths from k to the other vertices which entered class 1 ahead of vertex l. Thus, to
solve the SSSP problem in full, we need only modify the condition in the while loop such
that exit from the loop occurs when all vertices are in class 1.

The while loop in procedure DIJKSTRA will be executed n-1 times (if vertex l
enters class 1 last). The inner for loop is executed n times for each outer loop. Hence, the
time complexity of the procedure is O(n^2).
Floyd’s Algorithm

Floyd’s algorithm generates a sequence of n x n matrices A0, A1, A2, ..., An, where
Ak(i, j) is the cost of the shortest path from vertex i to vertex j with no intermediate
vertex numbered higher than k. By this definition, A0 is simply the cost adjacency matrix
for the graph. Subsequently, each successive Ak is generated using the iterative formula

Ak(i, j) = minimum [ Ak-1(i, j), Ak-1(i, k) + Ak-1(k, j) ]     (2)

For any given pair of vertices i and j, the iterative application of Eq. (2) is
equivalent to systematically considering the other vertices for inclusion in the path from
vertex i to vertex j. If, at the kth iteration, including vertex k in the path from i to j results in
a shorter path, then the cost of this shorter path becomes Ak(i, j). Clearly, the nth iteration
value of this cost is the cost of the shortest path from vertex i to vertex j.
Then, An(i, j) is the cost of the shortest path from vertex i to vertex j for any 1 ≤ i, j ≤ n.
Floyd’s algorithm gives the cost of the shortest path between every pair of vertices i
and j, but not the path itself. The intermediate vertices along this shortest path can be
found by maintaining an n x n matrix, say PATH, such that

PATH(i, j) = 0 initially, indicating that, initially, the shortest path between i and j is the
               edge (i, j), if it exists
           = k if including k in the path from i to j at the kth iteration yields a shorter
               path
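Eq. (2), together with the PATH bookkeeping, can be sketched in Python as follows (assumed details: 0-indexed vertices, inf for absent edges, and -1 rather than 0 marking "no intermediate vertex", since 0 is a valid vertex here):

```python
INF = float('inf')

def floyd(C):
    """Floyd's algorithm on cost adjacency matrix C. Returns (A, path):
    A[i][j] is the cost of the shortest path from i to j, and path[i][j] is the
    intermediate vertex recorded at the last improving iteration (-1 if none)."""
    n = len(C)
    A = [row[:] for row in C]              # A0 is the cost adjacency matrix
    path = [[-1] * n for _ in range(n)]
    for k in range(n):                     # consider k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                if A[i][k] + A[k][j] < A[i][j]:
                    A[i][j] = A[i][k] + A[k][j]   # Eq. (2)
                    path[i][j] = k
    return A, path
```

Applied to the five-vertex example below (0-indexed), A reproduces the matrix A5 shown there.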
Example: Floyd’s algorithm at work

[Figure: a weighted digraph on vertices 1-5, whose cost adjacency matrix A0 is shown
below.]
A0 =                          PATH =
      1   2   3   4   5              1   2   3   4   5
  1   0  10   2   4   ∞          1   0   0   0   0   0
  2  11   0   8   ∞   3          2   0   0   0   0   0
  3   2   8   0   1   4          3   0   0   0   0   0
  4   2   ∞   1   0   2          4   0   0   0   0   0
  5   ∞   2   4   4   0          5   0   0   0   0   0

A1 =                          PATH =
      1   2   3   4   5              1   2   3   4   5
  1   0  10   2   4   ∞          1   0   0   0   0   0
  2  11   0   8  15   3          2   0   0   0   1   0
  3   2   8   0   1   4          3   0   0   0   0   0
  4   2  12   1   0   2          4   0   1   0   0   0
  5   ∞   2   4   4   0          5   0   0   0   0   0

A2 =                          PATH =
      1   2   3   4   5              1   2   3   4   5
  1   0  10   2   4  13          1   0   0   0   0   2
  2  11   0   8  15   3          2   0   0   0   1   0
  3   2   8   0   1   4          3   0   0   0   0   0
  4   2  12   1   0   2          4   0   1   0   0   0
  5  13   2   4   4   0          5   2   0   0   0   0

A3 =                          PATH =
      1   2   3   4   5              1   2   3   4   5
  1   0  10   2   3   6          1   0   0   0   3   3
  2  10   0   8   9   3          2   3   0   0   3   0
  3   2   8   0   1   4          3   0   0   0   0   0
  4   2   9   1   0   2          4   0   3   0   0   0
  5   6   2   4   4   0          5   3   0   0   0   0

A4 =                          PATH =
      1   2   3   4   5              1   2   3   4   5
  1   0  10   2   3   5          1   0   0   0   3   4
  2  10   0   8   9   3          2   3   0   0   3   0
  3   2   8   0   1   3          3   0   0   0   0   4
  4   2   9   1   0   2          4   0   3   0   0   0
  5   6   2   4   4   0          5   3   0   0   0   0

A5 =                          PATH =
      1   2   3   4   5              1   2   3   4   5
  1   0   7   2   3   5          1   0   5   0   3   4
  2   9   0   7   7   3          2   5   0   5   5   0
  3   2   5   0   1   3          3   0   5   0   0   4
  4   2   4   1   0   2          4   0   5   0   0   0
  5   6   2   4   4   0          5   3   0   0   0   0
It is clear that the time complexity of Floyd’s algorithm is O(n^3). Calling DIJKSTRA
n times solves the APSP problem also in O(n^3) time, but FLOYD involves less
computational effort.
The EASY procedure PRINTPATH constructs the shortest path for any given pair
of vertices i and j from the PATH matrix generated by FLOYD.
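The procedure PRINTPATH itself is not reproduced in this excerpt; a recursive Python sketch of the same idea, using the 1-indexed PATH convention of the text (row and column 0 of the matrix below are dummy padding), is:

```python
def intermediates(path, i, j):
    """Intermediate vertices of the shortest path from i to j, given the PATH
    matrix of Floyd's algorithm (PATH[i][j] = 0 means the edge (i, j) itself)."""
    k = path[i][j]
    if k == 0:
        return []
    # k was the last vertex whose inclusion shortened the i-to-j path;
    # recursively expand the two halves around it.
    return intermediates(path, i, k) + [k] + intermediates(path, k, j)

def shortest_path(path, i, j):
    return [i] + intermediates(path, i, j) + [j]

# Final PATH matrix from the example above (index 0 is a dummy row/column).
P = [[0, 0, 0, 0, 0, 0],
     [0, 0, 5, 0, 3, 4],
     [0, 5, 0, 5, 5, 0],
     [0, 0, 5, 0, 0, 4],
     [0, 0, 5, 0, 0, 0],
     [0, 3, 0, 0, 0, 0]]
print(shortest_path(P, 1, 2))   # → [1, 3, 4, 5, 2], of cost A5(1, 2) = 7
```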
Consider the digraph G = (V, E) and its corresponding adjacency matrix, S. Recall
that the length of a path in G is the number of edges in the path.

[Figure: a digraph on vertices 1-4 with edges (1, 2), (1, 4), (2, 3), (3, 4), (4, 2)
and (4, 3).] Its adjacency matrix is

         1  2  3  4
     1   0  1  0  1
S =  2   0  0  1  0
     3   0  0  0  1
     4   0  1  1  0

Define the matrix T such that

T(i, j) = 1 (true) if there is a path of length at least 1 from vertex i to vertex j in G
        = 0 (false) otherwise
T is called the transitive closure of the adjacency matrix, S. It simply indicates the
existence, or nonexistence, of a path of length at least 1 for every pair of vertices i and j
in G. The problem of generating T from S is similar to the problem of generating the least
cost matrix A from the cost adjacency matrix COST in Floyd’s algorithm. The algorithm to
generate T from S is called Warshall’s algorithm, and is, in fact, an older algorithm than
Floyd’s. Warshall’s algorithm may be stated as follows:

T(i, j) ← S(i, j) for all 1 ≤ i, j ≤ n
for k ← 1 to n do
    for i ← 1 to n do
        for j ← 1 to n do
            T(i, j) ← T(i, j) or [ T(i, k) and T(k, j) ]
The iterative step in Warshall’s algorithm simply tests whether there is a path from
vertex i to vertex j as vertices 1, 2, ..., n are successively considered for inclusion as
intermediate vertices in the path, and sets T(i, j) to true once such a path is found.
Clearly, if there is a path from i to j which contains no intermediate vertices of index
greater than k-1, or if there are paths from i to k and from k to j which contain no
intermediate vertices of index greater than k-1, then there must be a path from i to j which
contains no intermediate vertices of index greater than k. The algorithm may repeatedly
set T(i, j) to true in the course of the iterations.
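A Python sketch of Warshall's algorithm on a 0/1 adjacency matrix (0-indexed vertices):

```python
def warshall(S):
    """Transitive closure T of the boolean adjacency matrix S:
    T[i][j] is true iff there is a path of length >= 1 from i to j."""
    n = len(S)
    T = [row[:] for row in S]      # T starts out as a copy of S
    for k in range(n):             # consider k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                # a path i -> j exists if one already does, or if paths
                # i -> k and k -> j both exist
                T[i][j] = T[i][j] or (T[i][k] and T[k][j])
    return T
```

On the four-vertex digraph above (renumbered 0-3), every vertex can reach vertices 2, 3 and 4 but none can reach vertex 1, and the diagonal entries for vertices 2, 3 and 4 are true because they lie on a cycle.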