10/27/20
Minimum Spanning Tree
Spanning Tree
Def. Given a connected, undirected graph ! = ($, &), an acyclic subset
of the edges ( ⊆ & that connects all of the vertices of ! is called a
spanning tree (of !).
• A given graph ! = ($, &) can have many possible spanning trees
• If |$| = +, there are + − 1 edges in a spanning tree
• We specify the spanning tree by indicating the edges.
1
10/27/20
Minimum Spanning Tree
Minimum spanning tree: Given a connected graph ! = ($, &, ') with
edge weights ') , an MST is a subset of the edges * ⊆ & such that * is a
spanning tree whose sum of edge weights is minimized.
4 24
4
23 9
6 6 9
18
5 5
16 11 11
8 8
7 7
10 14
21
G = (V, E, w) T, SeÎT we = 50
Greedy Algorithms
Strategy 1. Start with ! = ∅ Consider edges in ascending order of
weight. Insert edge $ in ! unless doing so would create a cycle.
Strategy 2. Start with some root node % and greedily grow a tree
! from % outward. At each step, add the cheapest edge $ to ! that
connects the current tree ! to an isolated vertex.
Strategy 3. Start with ! = &. Consider edges in descending order of
weight. Delete edge $ from ! unless doing so would disconnect !.
2
10/27/20
Greedy Algorithms
Kruskal's algorithm. Start with ! = ∅ Consider edges in ascending order of
weight. Insert edge $ in ! unless doing so would create a cycle.
Prim's algorithm. Start with some root node % and greedily grow a tree
! from % outward. At each step, add the cheapest edge $ to ! that connects
the current tree ! to an isolated vertex.
Reverse-Delete algorithm. Start with ! = &. Consider edges in descending
order of weight. Delete edge $ from ! unless doing so would disconnect !.
Remark. All three algorithms produce an MST!
Kruskal’s Algorithm
Kruskal's algorithm. Start with ! = ∅ 1
Consider edges in ascending order of 15
weight. Insert edge $ in ! unless doing 8
7
so would create a cycle. 4
3
5 7
1 2
Kruskal ( G=(V,E,w) ) 1
1. Let T = Ø; 6
2. Sort the edges in increasing order of weight
3. For edge e do 2
4. If T U e does not contain a cycle then
5. Add e to T
6. Return T
3
10/27/20
The Cut Property
Simplifying assumption. All edge weights !" are distinct.
Cut property. Let # be any subset of nodes, and let " be the min weight
edge with exactly one endpoint in #. Then the MST contains ".
Note: the cut property does not rule
out the possibility that other edges
that have exactly one endpoint in S
S are part of the MST.
e
It just argues that the cheapest such
edge MUST be part of the MST.
e is in the MST
The Cut Property
Simplifying assumption. All edge weights !" are distinct.
Cut property. Let # be any subset of nodes, let " be the min weight edge with
exactly one endpoint in #. Then the MST contains ". Let’s call this MST $ ∗ .
Pf. (exchange argument) f
• Suppose " does not belong to $ ∗ .
S
• Adding " to $ ∗ creates a cycle C in $ ∗ .
e
• $ ' = $ ∗ ∪ {"} − {-} is also a spanning tree.
• Since !" < !- , !"/0ℎ2($′) < !"/0ℎ2($ ∗ ). T*
• This is a contradiction.
4
10/27/20
Kruskal – Proof of Correctness
Claim. Kruskal’s Algorithm produces an MST of G = (V,E)
Pf.
• First, every edge added by Kruskal’s Algorithm belongs to the MST.
• Consider any edge ! = ($, &) being added.
• Define ( as the set of nodes reachable from $ just before ! is added. Then $ ∈ (,
and & ∈ * − (. (Else ! would create a cycle).
• Edge ! must be cheapest among all edges from ( to * − (, else the algorithm would
have considered another such edge earlier.
• By the cut property, ! is part of the MST.
• Second, the output - of Kruskal’s Algorithm includes enough edges to be a
spanning tree of ..
• By definition of Kruskal Algorithm, - has no cycles.
• Also, - is connected, else the algorithm would not have ended yet.
Implementation: Kruskal's Algorithm
Implementation: Basic Version.
# ≤ -2 so this is !(# log -)
• Sort the edges by weight. !(# log #)
• Process the edges one at a time
• Check to see if adding the edge ( = (*, ,) results in a cycle
More accurately in this case !(- + -) = !(-)
• Use BFS from starting node *
because the graph being searched never has more than
• BFS is !(- + #) - − 1 edges.
• Overall complexity: ! #(- + #) • 34567-8: ! # log -
• Process each edge: !(- + #), in total #
• Can we do better? edges.
• Processing edges will dominate.
5
10/27/20
Implementation: Kruskal's Algorithm
Implementation: Union-Find Data Structure.
• Basic idea: don’t re-compute the connected components (to check
for a cycle) each time an edge is added.
• Instead, keep track of what connected component any node belongs
to.
• For edge ! = #, % :
• If # and % belong to different connected components, then edge ! can be
added without creating a cycle
• If # and % belong to the same connected component, there already exists a
path from # to %, so adding ! would create a cycle.
Implementation: Kruskal's Algorithm
Implementation: Union-Find Data Structure.
Support 3 basic operations:
• Init(S): for a set S returns an initialized Union-Find data structure in
which all elements are in separate subsets (their own connected
component).
• Find(!): returns the subset containing !
• Union(A,B): changes the data structure by merging the subsets A and
B into a single subset.
6
10/27/20
Kruskal's Algorithm
Kruskal's algorithm. Start with ! = ∅ Consider edges in ascending
order of weight. Insert edge $ in ! unless doing so would create a
cycle.
Kruskal ( G=(V,E,w) ) using Union-Find data structure
1. Let T = Ø;
2. Sort the edges in increasing order of weight
3. Init(V) //initialize union-find arrays
4. For edge e = (u,v) do
5. If Find(u) ≠ Find(v) //no cycle in T ∪ e
6. Add e to T
7. Union(u,v) //union of sets containing u, v
8. Return T
Union-Find Data Structure
Init (V)
1. for every vertex v do
2. boss[v]=v
3. size[v]=1
4. set[v]={v}
Find (u)
1. Return boss[u]
Union (u,v)
1. if size[boss[u]]>=size[boss[v]] then
2. set[boss[u]]= set[boss[u]]∪ set[boss[v]]
3. size[boss[u]]+=size[boss[v]]
4. for every z in set[boss[v]] do
5. boss[z]=boss[u]
6. else do steps 2.-5. with u,v switched
7
10/27/20
Complexity Analysis
! = ($, &), |$| = ), |&| = *
• Sort the edges: +(* log *) = +(* log )2) = +(* log )) Total Complexity: +(* log ))
Complexity dominated by
• Init. Union-Find Data Structures: +()) initial sorting of edges by
• All 01)2(3) operations together: +(*) weight.
• All 4)15)(3, 6) operations together: +() log )) * see next slide
Kruskal ( G=(V,E,w) ) using Union-Find data structure
1. Let T = Ø;
2. Sort the edges in increasing order of weight
3. Init(V) //initialize union-find arrays
4. For edge e = (u,v) do
5. If Find(u) ≠ Find(v) //no cycle in T ∪ e
6. Add e to T
7. Union(u,v) //union of sets containing u, v
8. Return T
Complexity Analysis
• All !"#$"(&, () operations together: *(" log ")
• Cost of union operation dominated by updating boss[]
• How many total boss updates are done across all unions?
• we always choose to update the smaller set, it guarantees that any time a vertex has its
boss updated, the set it belongs to at least doubles in size.
• Each vertex has its boss updated at most log n times
Union (u,v)
1. if size[boss[u]]>size[boss[v]] then
2. set[boss[u]]=set[boss[u]] ∪ set[boss[v]]
3. size[boss[u]]+=size[boss[v]]
4. for every z in set[boss[v]] do
5. boss[z]=boss[u]
6. else do steps 2.-5. with u,v switched
8
10/27/20
Prim’s Algorithm
Prim's algorithm. Start with some root node ! and greedily grow a tree
" from ! outward. At each step, add the cheapest edge # to " that
connects the current tree " to an isolated vertex.
1
15
7
8
4
3
5 7
1 2
1
6
1
Prim ( G=(V,E,w) ) 15
7
1. Let T = Ø; H = V – {s} 8
2. For every vertex v in H do 4
3. cost[v]= ∞, parent[v]=null 3
4. cost[s]=0, parent[s] = none 5 7
5. Update (s) 2
6. For i=1 to n-1 do 1 1
6
7. u=vertex from H of
smallest cost //extract_min
• Add (u,parent[u]) to T 2
• Update(u)
• Return T
Update (u)
1. For every neighbor v of u (such that v in H)
2. If cost[v]>w(u,v) then
3. cost[v]=w(u,v), parent[v]=u //change_key
9
10/27/20
Prim – Proof of Correctness
Claim. Prim’s Algorithm produces an MST of ! = ($, &).
Pf. (assuming distinct edge weights)
• First, every edge added belongs to the MST.
• Each edge added is the minimum weight edge ) between the current partial
spanning tree S and the remaining vertices $– +
• By the cut property, ) is part of the MST.
• Second, the output ($, ,) includes enough edges to be a spanning
tree of G.
• ($, ,) has no cycles
• ($, ,) is connected, else the algorithm would not have ended yet.
Complexity Analysis
Complexity Analysis for ! = ($, &), |$| = ), |&| = *
• We store attachment weight information for each vertex in a heap data
structure
• Each heap operation requires O(log )) time
• ) − 1 iterations in which an EXTRACT_MIN heap operation is performed
• 0() log ))
• Each edge can result in at most one CHANGE_KEY heap operation
• 0(* log ))
• Overall complexity: 0(* log ))
10