Graphs
INTRODUCTION
There are many real world problems that can be modelled using a graph. Several are
described.
We will start by giving a standard set of definitions for graphs. The data structures most
commonly used will then be described and compared.
We are going to study several standard graph algorithms which can be used to solve a
range of problems. Shortest path problems for both unweighted and weighted graphs will
be discussed. Algorithms for graphs containing cycles will be treated, as well as ones for
graphs that do not contain cycles. Specifically, Dijkstra’s algorithm will be discussed in
detail. The topological sort for acyclic graphs will be described.
The concept of a minimum spanning tree will be introduced and the solutions of Prim
and Kruskal will be described and illustrated.
In all these examples the basic problem is to find the minimum path from A to B in the
graph. One may also want to know if there are alternate routes and what they are.
Figure: Computer network configuration
DEFINITIONS
Above is an example of a directed graph. This graph will be used to illustrate various
graph definitions. It has 6 vertices & 9 edges.
ACYCLIC DIRECTED GRAPH: A directed graph with NO cycles. The graph above
is NOT an acyclic directed graph. If the arc E-D were removed then it would
be.
REPRESENTATION OF GRAPHS
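Name          A   B   C   D   E   F
      Index   0   1   2   3   4   5
A     0       -   8   6   -   -   -
B     1       -   -   -   2   4   -
C     2       -   -   -   -   -   -
D     3       -   3   7   -   -   -
E     4       -   -   -   5   -   -
F     5       2   -   -   -   7   -
(Rows are the "from" node, columns the "to" node; - means there is no arc.)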
From this table it is simple to extract an arc value. Take D-C, say. That translates to
row 3, column 2, giving the value 7.
While it is very fast to find an arc cost, this representation is wasteful of space as the
graph becomes less dense.
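The table lookup described above can be sketched in Java as follows. This is only an illustration: the class name `AdjacencyMatrixDemo`, the `"AB8"` arc notation and the use of 0 for "no arc" are assumptions of the sketch, not part of the notes.

```java
import java.util.Arrays;

// Sketch of the adjacency-matrix representation for the six-node example.
final class AdjacencyMatrixDemo {
    static final int NO_ARC = 0;            // assumption: 0 marks "no arc"
    static final String NAMES = "ABCDEF";   // index of a node = position of its letter

    // Build the matrix from arcs written as "<from><to><cost>", e.g. "AB8".
    static int[][] build(String... arcs) {
        int n = NAMES.length();
        int[][] cost = new int[n][n];
        for (int[] row : cost) {
            Arrays.fill(row, NO_ARC);
        }
        for (String arc : arcs) {
            int from = NAMES.indexOf(arc.charAt(0));
            int to = NAMES.indexOf(arc.charAt(1));
            cost[from][to] = Integer.parseInt(arc.substring(2));
        }
        return cost;
    }

    // Constant-time lookup of the cost of the arc from -> to.
    static int arcCost(int[][] cost, char from, char to) {
        return cost[NAMES.indexOf(from)][NAMES.indexOf(to)];
    }

    public static void main(String[] args) {
        int[][] cost = build("AB8", "AC6", "BD2", "BE4", "DB3", "DC7", "ED5", "FA2", "FE7");
        System.out.println("D-C = " + arcCost(cost, 'D', 'C'));
    }
}
```

The matrix always holds n x n entries, however few arcs there are, which is the space cost mentioned above.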
GRAPH CLASS STRUCTURE
I will briefly describe the structure of the graph class when the adjacency list is
used.
Using the graph above, the first action is to input all the data relating to the graph. This
consists of all the arcs of the graph. One needs the names of the two nodes and the value
of the arc. Let’s assume that they are input in the following order:
Input:
AB8
FA2
FE7
BD2
AC6
BE4
ED5
DB3
DC7
A Dictionary is maintained and updated as this input process proceeds. This dictionary
keeps track of the Name of the node and the line that that node occupies in the graph
table. The first node (A) is line 0, the second (B) is line 1, the third (F) is line 2, the
fourth (E) is line 3 and so on.
Dictionary:
A 0
B 1
F 2
E 3
D 4
C 5
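The input stage above can be sketched in Java as follows. The names `GraphTableDemo`, `Arc`, `lineOf` and `addArc` are hypothetical, and single-letter node names are assumed so that an input such as `AB8` can be split by position.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of the input stage: a dictionary from node name to table line,
// plus one adjacency list per line of the graph table.
final class GraphTableDemo {
    // One entry of an adjacency list: the line of the target node and the arc cost.
    static final class Arc {
        final int to;
        final int cost;

        Arc(int to, int cost) {
            this.to = to;
            this.cost = cost;
        }
    }

    final Map<String, Integer> dictionary = new LinkedHashMap<>();
    final List<List<Arc>> adjacency = new ArrayList<>();

    // Return the line of a node, giving it the next free line the first time it is seen.
    int lineOf(String name) {
        Integer line = dictionary.get(name);
        if (line == null) {
            line = dictionary.size();
            dictionary.put(name, line);
            adjacency.add(new ArrayList<>());
        }
        return line;
    }

    // Arcs are written "<from><to><cost>", e.g. "AB8".
    void addArc(String arc) {
        int from = lineOf(arc.substring(0, 1));
        int to = lineOf(arc.substring(1, 2));
        adjacency.get(from).add(new Arc(to, Integer.parseInt(arc.substring(2))));
    }

    static GraphTableDemo read(String... arcs) {
        GraphTableDemo g = new GraphTableDemo();
        for (String arc : arcs) {
            g.addArc(arc);
        }
        return g;
    }

    public static void main(String[] args) {
        GraphTableDemo g = read("AB8", "FA2", "FE7", "BD2", "AC6", "BE4", "ED5", "DB3", "DC7");
        System.out.println(g.dictionary);
    }
}
```

Because a line is handed out the first time a name is seen, the input order above yields the dictionary shown: A 0, B 1, F 2, E 3, D 4, C 5.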
When a graph algorithm is run the purpose is to find the minimum path from some
specified node to all the other nodes in the graph. The graph table is an arrangement of
data to do this. Using our graph, let’s find the minimum distance from A to all the other
nodes. The final graph table would be as follows:
Please note that there is no path from A to F. While you can easily work out the
minimum paths for this graph only the final result is shown here. In a later section the
algorithm will be discussed in full detail.
In this table:
Dist: is the minimum distance from this node to the starting node (A in this
case)
Prev: is the number of the previous node in the path from this node to the
starting node (Keeping this pointer allows us to find the path from any
node back to the starting node)
Name: is the name of the node. This information is redundant because the row
number of the table can be used to reference the Dictionary to find the
name of the node. Keeping the name here does speed up the process.
Adj list: This is the pointer to the adjacency list of this node. This
information is needed because you need to know which nodes are its
neighbours.
First the unweighted shortest path problem will be explained informally then more
formally. Thereafter the weighted shortest path problem will be done because it is an
extension of the unweighted problem.
In an unweighted graph the weight of each arc is the same and is taken as 1.
We wish to find the shortest paths from node A to all the other nodes in the graph.
Very informally:
1) Place yourself at the starting node. Give it a distance of 0.
2) For every node that you can reach:
   a. Give the node a value of 1 + the value of the node at which you are
   b. Put every such node on a queue
3) In turn, place yourself at each of the nodes on the queue and repeat steps 2a &
   2b until you are finished.
More formally:
1) Mark the starting node with distance = 0. Put this node on the Priority Queue
   (PQ). The PQ is ordered on distance.
2) While the PQ is NOT empty, remove the minimum node from the PQ.
3) Mark all the nodes adjacent to this node with distance = (distance of this node) + 1
   and insert these nodes into the PQ. Only do this for nodes that have not already
   been marked.
Example:
Step 1:
The starting node, A, is marked with distance 0. PQ = A
Step 2:
Remove A from the PQ.
Mark all nodes adjacent to A with distance 1. They are B & C.
Also put B & C on the priority queue. PQ = B, C
Step 3:
Remove node from priority queue.
It is B. PQ = C
Mark all neighbours of B (D & F) with distance 2 and add them to the priority queue.
PQ = C, D, F
Step 4:
Remove node from priority queue.
It is C. The priority queue becomes PQ = D, F
Mark all neighbours of C with distance 2 and add them to the priority queue. There are
none so PQ = D, F
Step 5:
Remove node from priority queue.
It is D. The priority queue becomes PQ = F
Mark all unmarked neighbours of D with distance 3 and add them to the priority queue.
D has three neighbours: G, E & C. C has already been marked with distance 1, so only
G & E are marked and added. PQ = F, G, E
Step 6:
Remove node from priority queue.
It is F. The priority queue becomes PQ = G, E
Mark all unmarked neighbours of F with distance 3 and add them to the priority queue.
F has one neighbour, G. It is not added because G has already been marked with distance 3.
PQ = G, E
Step 7:
Remove node from priority queue.
It is G. The priority queue becomes PQ = E
Mark all unmarked neighbours of G with distance 4 and add them to the priority queue.
G has one neighbour, E. It is not added because E has already been marked with distance 3.
PQ = E
Step 8:
Remove node from priority queue.
It is E. The priority queue becomes PQ = empty
Mark all unmarked neighbours of E with distance 4 and add them to the priority queue.
E has one neighbour, C. It is not added because C has already been marked with distance 1.
PQ = empty
Step 9:
Remove node from priority queue. The queue is empty. STOP
The Java-like pseudocode for this algorithm is given below. This is followed by the graph
table at each stage of the process. (The adjacency list for each node has been omitted to
simplify the description.)
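q = new Queue( );
q.enqueue( s ); s.dist = 0;
while( !q.isEmpty( ) )
{
    v = q.dequeue( ); v.known = true;
    for each w adjacent to v
        if( w.dist == INFINITY ) { // w has not been marked yet
            w.dist = v.dist + 1;
            w.path = v; // w’s path points to v
            q.enqueue( w );
        }
}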
The Graph Table is displayed below. It starts with the initial state and shows each
intermediate state as the algorithm progresses. (~ means infinity, i.e. not yet reached;
a Prev of 0 means no previous node.)

      Initial state     A dequeued        B dequeued        C dequeued
Node  Known Dist Prev   Known Dist Prev   Known Dist Prev   Known Dist Prev
A     F     0    0      T     0    0      T     0    0      T     0    0
B     F     ~    0      F     1    A      T     1    A      T     1    A
C     F     ~    0      F     1    A      F     1    A      T     1    A
D     F     ~    0      F     ~    0      F     2    B      F     2    B
E     F     ~    0      F     ~    0      F     ~    0      F     ~    0
F     F     ~    0      F     ~    0      F     2    B      F     2    B
G     F     ~    0      F     ~    0      F     ~    0      F     ~    0
PQ    A                 B, C              C, D, F           D, F

      D dequeued        F dequeued        G dequeued        E dequeued
Node  Known Dist Prev   Known Dist Prev   Known Dist Prev   Known Dist Prev
A     T     0    0      T     0    0      T     0    0      T     0    0
B     T     1    A      T     1    A      T     1    A      T     1    A
C     T     1    A      T     1    A      T     1    A      T     1    A
D     T     2    B      T     2    B      T     2    B      T     2    B
E     F     3    D      F     3    D      F     3    D      T     3    D
F     F     2    B      T     2    B      T     2    B      T     2    B
G     F     3    D      F     3    D      T     3    D      T     3    D
PQ    F, G, E           G, E              E                 Empty
Using the “Prev” column from the final state of the table the shortest path from any node
to the starting node can be deduced.
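A runnable Java sketch of the search and of the path recovery is given here. The arcs in `exampleGraph` are read off the walk-through (for example D reaches G, E and C), and the class name `BfsDemo`, the numbering A..G = 0..6 and the value -1 for "no previous node" are assumptions of the sketch. Because every arc costs 1, a plain first-in first-out queue already delivers nodes in distance order, so it stands in for the priority queue.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Queue;

// Sketch: unweighted shortest paths by breadth first search, plus path recovery.
final class BfsDemo {
    static final int INFINITY = Integer.MAX_VALUE;  // plays the role of "~" in the table
    static final int NONE = -1;                     // "no previous node"

    final int[] dist;
    final int[] prev;

    // adj[v] lists the nodes reachable from v by one arc; start is the source node.
    BfsDemo(int[][] adj, int start) {
        dist = new int[adj.length];
        prev = new int[adj.length];
        Arrays.fill(dist, INFINITY);
        Arrays.fill(prev, NONE);
        Queue<Integer> queue = new ArrayDeque<>();
        dist[start] = 0;
        queue.add(start);
        while (!queue.isEmpty()) {
            int v = queue.remove();
            for (int w : adj[v]) {
                if (dist[w] == INFINITY) {          // only nodes not yet marked
                    dist[w] = dist[v] + 1;
                    prev[w] = v;
                    queue.add(w);
                }
            }
        }
    }

    // Follow the Prev entries back from target, then reverse to read start -> target.
    List<Integer> pathTo(int target) {
        List<Integer> path = new ArrayList<>();
        if (dist[target] == INFINITY) {
            return path;                            // unreachable: empty path
        }
        for (int v = target; v != NONE; v = prev[v]) {
            path.add(v);
        }
        Collections.reverse(path);
        return path;
    }

    // Arcs read off the worked example; nodes A..G are numbered 0..6.
    static int[][] exampleGraph() {
        return new int[][] {
            {1, 2},     // A -> B, C
            {3, 5},     // B -> D, F
            {},         // C has no outgoing arcs
            {6, 4, 2},  // D -> G, E, C
            {2},        // E -> C
            {6},        // F -> G
            {4}         // G -> E
        };
    }

    public static void main(String[] args) {
        BfsDemo bfs = new BfsDemo(exampleGraph(), 0);
        System.out.println(Arrays.toString(bfs.dist));
        System.out.println(bfs.pathTo(4));
    }
}
```

For node E (index 4) the recovered path is A, B, D, E, which agrees with the Prev column of the final table.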
The algorithm given in the previous section is an example of a breadth first search.
This works by processing vertices in layers. The vertices closest to the starting node are
evaluated first, then the vertices closest to this layer are processed next & so on with the
vertices furthest away being processed last.
A breadth first search can be contrasted with a depth first search, which always goes down
before going across. A preorder traversal of a tree is an example of a depth first search.
The breadth first shortest path search can also be described as a greedy algorithm because
it finds a global optimum by making a sequence of locally greedy choices. A locally greedy
choice is the best choice amongst several local alternatives at a particular stage of the
solution process.
When the graph has weighted arcs the problem becomes more difficult. Simply put, the
first “shortest path” found to a node may subsequently be replaced by an even shorter
path. A simple illustration is given below:
Dijkstra developed a clever algorithm to solve this. The essence of his solution has
already been described. It is based on the breadth first search for the unweighted case,
where DIST(w) = DIST(v) + 1. There are two differences in the weighted case:
1) DIST(w) = DIST(v) + COST(v,w)
2) DIST(w) is updated ONLY if the new value is an improvement on its previous value.
All nodes start with an infinite distance.
Dijkstra’s algorithm deals with this and the pseudocode for this algorithm is:
decrease( w.dist to v.dist + cvw );
w.path = v; // w’s path points to v
}
}
}
Find the minimum paths from A to all the nodes in the following graph:
      Initial           A known           C known
Node  Known Dist Prev   Known Dist Prev   Known Dist Prev
A     F     0    0      T     0    0      T     0    0
B     F     ~    0      F     4    A      F     4    A
C     F     ~    0      F     3    A      T     3    A
D     F     ~    0      F     6    A      F     6    A
E     F     ~    0      F     ~    0      F     9    C
F     F     ~    0      F     ~    0      F     ~    0
PQ    A(0)              C(3), B(4), D(6)  B(4), D(6), E(9)
Commentary:
1) Initially the starting node is placed on the Priority Queue, i.e.
   PQ = A(0)                                  (node A with distance 0)
2) Remove the minimum node from the priority queue. It is A. Mark it as known.
3) For all the neighbours of A calculate the distance, note their predecessor
   and put them on the PQ, i.e.
   a. C is 3 away & predecessor is A          PQ = C(3)
   b. B is 4 away & predecessor is A          PQ = C(3), B(4)
   c. D is 6 away & predecessor is A          PQ = C(3), B(4), D(6)
4) Remove the minimum node from the PQ. It is C. Mark it as known.
5) For all the neighbours of C calculate the distance, note their predecessor
   and put them on the PQ, i.e.
   a. E is 6 away from C & predecessor is C   PQ = B(4), D(6), E(9)
   b. E was given a distance of 6 + the distance to C, i.e. 6 + 3 = 9
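      B known           D known           E known
Node  Known Dist Prev   Known Dist Prev   Known Dist Prev
A     T     0    0      T     0    0      T     0    0
B     T     4    A      T     4    A      T     4    A
C     T     3    A      T     3    A      T     3    A
D     F     5    B      T     5    B      T     5    B
E     F     9    C      F     8    D      T     8    D
F     F     12   B      F     11   D      F     10   E
PQ (B known): D(5)*, D(6)*, E(9), F(12)
PQ (D known): E(8)*, E(9)*, F(11)*, F(12)*
PQ (E known): F(10)*, F(11)*, F(12)*
(* marks a node that appears more than once in the PQ. The D known and E known queues
are shown after the stale duplicate at the front, D(6) and then E(9), has been discarded.)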
Commentary:
6) Remove the minimum node from the priority queue. It is B. Mark it as known.
7) For all the neighbours of B calculate the distance & note their predecessor
and put them on the PQ ie
a. D is 1 away & predecessor is B PQ = D(5), D(6), E(9)
b. F is 8 away & predecessor is B PQ = D(5), D(6), E(9), F(12)
8) NOTICE that D now appears twice in the Priority Queue. This has
   happened because a new LOWER value of the distance to D has been found.
   Strictly it is not correct to have any node appear more than once in the PQ.
   There are two different ways of solving this problem:
   a. Before inserting any node in the priority queue check that it is not already
      there. If it is, delete the old entry because its value will be greater. It is
      costly to do the search on the priority queue: O(n).
   b. The preferred method is to insert duplicates into the PQ. An extra field is
      added to each node. When a node is removed from the PQ the first time
      this field is set. Every time a node is removed from the PQ this field is
      checked. If it is already set then just disregard the node because it is a
      duplicate. While this method strictly breaks the priority queue data structure
      it is preferred because of its efficiency. This is the method that will be used
      in this example.
9) Remove the minimum node from the PQ. It is D. Mark it as known.
10) For all the neighbours of D calculate the distance, note their predecessor
    and put them on the PQ, i.e.
    a. E is 3 away & predecessor is D         PQ = D(6), E(8), E(9), F(12)
    b. F is 6 away & predecessor is D         PQ = D(6), E(8), E(9), F(11), F(12)
11) Remove the minimum node from the PQ. It is D. Disregard it because it has
    already been dealt with.                  PQ = E(8), E(9), F(11), F(12)
12) Remove the minimum node from the PQ. It is E. Mark it as known.
13) For all the neighbours of E calculate the distance, note their predecessor
    and put them on the PQ, i.e.
    a. F is 2 away & predecessor is E         PQ = E(9), F(10), F(11), F(12)
14) Remove the minimum node from the PQ. It is E. Disregard it because it has
    already been dealt with.                  PQ = F(10), F(11), F(12)
F Known
Node Known Dist Prev
A T 0 0
B T 4 A
C T 3 A
D T 5 B
E T 8 D
F T 10 E
PQ = Null
Commentary:
15) Remove the minimum node from the PQ. It is F. Mark it as known.
    It has no neighbours.                     PQ = F(11), F(12)
16) Remove the minimum node from the PQ. It is F. Disregard it because it has
    already been dealt with.                  PQ = F(12)
17) Remove the minimum node from the PQ. It is F. Disregard it because it has
    already been dealt with.                  PQ = Null
18) Remove the minimum node from the PQ. The PQ is empty. QUIT
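To read off the shortest path from A to F, follow the Prev column back from F:
Predecessor of F is E
Predecessor of E is D
Predecessor of D is B
Predecessor of B is A
The path is thus F - E - D - B - A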
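The whole run can be reproduced with the following Java sketch, which uses the "insert duplicates and disregard them later" method. It is an illustration under stated assumptions rather than the notes' own code: the class name `DijkstraDemo` and the numbering A..F = 0..5 are invented here, and the arc costs are read off the distance tables (for example E-F is taken as 2, so that F ends at distance 10).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Sketch: Dijkstra's algorithm with duplicates left in the priority queue.
final class DijkstraDemo {
    static final int INFINITY = Integer.MAX_VALUE;
    static final int NONE = -1;     // "no previous node" (the tables print 0)

    final int[] dist;
    final int[] prev;

    // arcs[i] = {from, to, cost}; n is the number of nodes; costs must be non-negative.
    DijkstraDemo(int n, int[][] arcs, int start) {
        List<List<int[]>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            adj.add(new ArrayList<>());
        }
        for (int[] arc : arcs) {
            adj.get(arc[0]).add(arc);
        }
        dist = new int[n];
        prev = new int[n];
        boolean[] known = new boolean[n];
        Arrays.fill(dist, INFINITY);
        Arrays.fill(prev, NONE);

        // Queue entries are {node, distance}, ordered on distance.
        PriorityQueue<int[]> pq = new PriorityQueue<>((a, b) -> Integer.compare(a[1], b[1]));
        dist[start] = 0;
        pq.add(new int[] {start, 0});
        while (!pq.isEmpty()) {
            int v = pq.remove()[0];
            if (known[v]) {
                continue;           // a stale duplicate: disregard it
            }
            known[v] = true;
            for (int[] arc : adj.get(v)) {
                int w = arc[1];
                int candidate = dist[v] + arc[2];
                if (!known[w] && candidate < dist[w]) {   // only if it is an improvement
                    dist[w] = candidate;
                    prev[w] = v;
                    pq.add(new int[] {w, candidate});     // may leave an older entry behind
                }
            }
        }
    }

    // Follow the Prev entries back from target; the result reads target, ..., start.
    List<Integer> pathBack(int target) {
        List<Integer> path = new ArrayList<>();
        if (dist[target] == INFINITY) {
            return path;            // unreachable: empty path
        }
        for (int v = target; v != NONE; v = prev[v]) {
            path.add(v);
        }
        return path;
    }

    // Arc costs read off the worked example; nodes A..F are numbered 0..5.
    static int[][] exampleArcs() {
        return new int[][] {
            {0, 2, 3}, {0, 1, 4}, {0, 3, 6},   // A-C 3, A-B 4, A-D 6
            {2, 4, 6},                          // C-E 6
            {1, 3, 1}, {1, 5, 8},               // B-D 1, B-F 8
            {3, 4, 3}, {3, 5, 6},               // D-E 3, D-F 6
            {4, 5, 2}                           // E-F 2
        };
    }

    public static void main(String[] args) {
        DijkstraDemo d = new DijkstraDemo(6, exampleArcs(), 0);
        System.out.println(Arrays.toString(d.dist));
        System.out.println(d.pathBack(5));
    }
}
```

`java.util.PriorityQueue` has no operation to lower the key of an existing entry, which is one practical reason for preferring the duplicate method in Java.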
A final strength of Dijkstra’s algorithm is that it correctly deals with cyclic graphs. The
example above does not illustrate this useful feature of the algorithm.
A Minimum Spanning Tree (MST) of a graph is a tree formed from graph edges that
connects all the vertices of the graph at lowest total cost. Most commonly one is interested
in the MST of undirected graphs. There are problems where one wishes to find the MST
of a directed graph but this is a more difficult problem & will not be covered here.
A simple example of an MST problem is the following: Consider a house with electrical
points. Each electrical point is a node of the graph. Each arc is weighted by the distance
between the two nodes it joins. How can we wire the house with a minimum of wire?
KRUSKAL’S ALGORITHM
Example:
Edge  Weight  Action
AF    2       Accept
DE    5       Accept
EF    7       Accept
DF    8       Reject   Cycle DEF
AE    11      Reject   Cycle AFE
CD    13      Accept
EC    15      Reject   Cycle DEC
CB    16      Accept
Kruskal’s algorithm is easy to implement manually. A computer algorithm is more
complex because of the problem of recognising when a cycle has occurred. An algorithm
to do this does exist. It is not described here because it is fairly advanced.
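For the curious reader, one common way of recognising cycles is a disjoint-set (union-find) structure: two vertices are in the same tree exactly when they have the same root, and an edge joining two such vertices would close a cycle. The notes do not say which method they have in mind, so the Java sketch below is an assumption, as are the names `KruskalDemo`, `find` and `minimumSpanningTree`. It uses the edges of the example above with A..F numbered 0..5.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: Kruskal's algorithm, detecting cycles with a disjoint-set (union-find) forest.
final class KruskalDemo {
    // Walk up the parent links to the root of v's tree; a root is its own parent.
    static int find(int[] parent, int v) {
        while (parent[v] != v) {
            parent[v] = parent[parent[v]];   // path halving keeps the trees shallow
            v = parent[v];
        }
        return v;
    }

    // edges[i] = {u, v, weight} for an undirected graph on vertices 0..n-1.
    // Returns the accepted edges in the order in which they were accepted.
    static List<int[]> minimumSpanningTree(int n, int[][] edges) {
        int[][] sorted = edges.clone();
        Arrays.sort(sorted, (a, b) -> Integer.compare(a[2], b[2]));
        int[] parent = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i;                   // every vertex starts in its own tree
        }
        List<int[]> accepted = new ArrayList<>();
        for (int[] e : sorted) {
            int rootU = find(parent, e[0]);
            int rootV = find(parent, e[1]);
            if (rootU != rootV) {            // different trees: no cycle, so accept
                parent[rootU] = rootV;
                accepted.add(e);
            }                                // same tree: the edge would close a cycle
        }
        return accepted;
    }

    static int totalWeight(List<int[]> edges) {
        int sum = 0;
        for (int[] e : edges) {
            sum += e[2];
        }
        return sum;
    }

    // The example edges: AF 2, DE 5, EF 7, DF 8, AE 11, CD 13, EC 15, CB 16.
    static int[][] exampleEdges() {
        return new int[][] {
            {0, 5, 2}, {3, 4, 5}, {4, 5, 7}, {3, 5, 8},
            {0, 4, 11}, {2, 3, 13}, {4, 2, 15}, {2, 1, 16}
        };
    }

    public static void main(String[] args) {
        List<int[]> mst = minimumSpanningTree(6, exampleEdges());
        System.out.println(mst.size() + " edges, total weight " + totalWeight(mst));
    }
}
```

On the example it accepts AF, DE, EF, CD and CB and rejects DF, AE and EC, matching the table.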
PRIM’S ALGORITHM
Prim’s algorithm is the same as Dijkstra’s algorithm with 3 very simple changes. These
are:
1) Each arc MUST be entered twice (Arc E-D must be entered as E-D 5 and D-E 5).
This is because the graph is undirected.
2) The DISTance is different. It is the weight of the shortest edge connecting “v” to
a known vertex. (“v” is a vertex that was not in the MST before this iteration).
3) The UPDATE rule is different: After vertex “v” is chosen, for each unknown “w”
adjacent to “v” DIST(w) = min( DIST(w), COST(w,v))
Prim’s algorithm needs only these minor changes from Dijkstra’s algorithm, so it is easy
to implement on a computer.
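The three changes can be seen in the following Java sketch. It keeps the shape of a Dijkstra loop with duplicates in the priority queue; the class name `PrimDemo`, the numbering A..F = 0..5 and the choice of A as the starting vertex are assumptions, and the edge list is the one from the Kruskal example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Sketch: Prim's algorithm written as Dijkstra's algorithm with the three changes.
final class PrimDemo {
    static final int INFINITY = Integer.MAX_VALUE;
    static final int NONE = -1;

    final int[] dist;   // weight of the tree edge that attaches each vertex
    final int[] prev;   // the tree neighbour of each vertex (NONE for the start)

    // edges[i] = {u, v, weight}, undirected, vertices 0..n-1; the graph must be connected.
    PrimDemo(int n, int[][] edges, int start) {
        List<List<int[]>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            adj.add(new ArrayList<>());
        }
        for (int[] e : edges) {                          // change 1: enter each arc twice
            adj.get(e[0]).add(new int[] {e[1], e[2]});
            adj.get(e[1]).add(new int[] {e[0], e[2]});
        }
        dist = new int[n];
        prev = new int[n];
        boolean[] known = new boolean[n];
        Arrays.fill(dist, INFINITY);
        Arrays.fill(prev, NONE);

        PriorityQueue<int[]> pq = new PriorityQueue<>((a, b) -> Integer.compare(a[1], b[1]));
        dist[start] = 0;
        pq.add(new int[] {start, 0});
        while (!pq.isEmpty()) {
            int v = pq.remove()[0];
            if (known[v]) {
                continue;                                // stale duplicate
            }
            known[v] = true;
            for (int[] arc : adj.get(v)) {
                int w = arc[0];
                int cost = arc[1];
                if (!known[w] && cost < dist[w]) {       // changes 2 and 3: DIST is an edge weight
                    dist[w] = cost;
                    prev[w] = v;
                    pq.add(new int[] {w, cost});
                }
            }
        }
    }

    // Total weight of the spanning tree: the sum of the attaching edge weights.
    int totalWeight() {
        int sum = 0;
        for (int d : dist) {
            sum += d;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[][] edges = {
            {0, 5, 2}, {3, 4, 5}, {4, 5, 7}, {3, 5, 8},
            {0, 4, 11}, {2, 3, 13}, {4, 2, 15}, {2, 1, 16}
        };
        PrimDemo prim = new PrimDemo(6, edges, 0);
        System.out.println(Arrays.toString(prim.prev) + " weight " + prim.totalWeight());
    }
}
```

The only line that really differs from a Dijkstra loop is the test `cost < dist[w]`, which compares the edge weight alone instead of `dist[v] + cost`.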
ACYCLIC GRAPHS
There are many graphs that contain no cycles and they are known as acyclic. Typical
examples are: prerequisite courses of some course and activity graphs. The latter are
essentially used to calculate the critical path through a network. Such planning is usually
used in any medium or large scale project especially in the construction business.
To find the minimum path for such graphs Dijkstra’s algorithm can be used. A simpler
way of doing this is by a Topological Sort which will now be described.
TOPOLOGICAL SORT
1) Find any vertex with no incoming edges. Print it. Remove it & its edges from the
   graph.
2) Repeat step 1 until there are no vertices left.
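The two steps can be sketched in Java as follows. Rather than physically deleting edges, the sketch keeps a count of incoming edges per vertex and lowers it when a predecessor is removed, which has the same effect. The class name `TopoSortDemo` and the small prerequisite graph in `exampleEdges` are hypothetical and are not the graph of the figure.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Sketch: topological sort by repeatedly removing a vertex with no incoming edges.
final class TopoSortDemo {
    // edges[i] = {from, to} on vertices 0..n-1.
    // Returns the vertices in removal order; throws if the graph contains a cycle.
    static List<Integer> sort(int n, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        int[] incoming = new int[n];
        for (int i = 0; i < n; i++) {
            adj.add(new ArrayList<>());
        }
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            incoming[e[1]]++;
        }
        Queue<Integer> ready = new ArrayDeque<>();   // vertices with no incoming edges
        for (int v = 0; v < n; v++) {
            if (incoming[v] == 0) {
                ready.add(v);
            }
        }
        List<Integer> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            int v = ready.remove();
            order.add(v);                            // "print" v
            for (int w : adj.get(v)) {               // removing v removes its outgoing edges
                if (--incoming[w] == 0) {
                    ready.add(w);
                }
            }
        }
        if (order.size() != n) {
            throw new IllegalArgumentException("graph has a cycle");
        }
        return order;
    }

    // Hypothetical prerequisite graph: 0 -> 2, 1 -> 3, 2 -> 4, 3 -> 4.
    static int[][] exampleEdges() {
        return new int[][] {{0, 2}, {1, 3}, {2, 4}, {3, 4}};
    }

    public static void main(String[] args) {
        System.out.println(sort(5, exampleEdges()));
    }
}
```

Whenever the `ready` queue holds more than one vertex, any of them could validly be removed next; taking them in a different order gives a different but equally valid ordering.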
For the above graph the final ordering is: CS1, M1, CS2, M2, CS3
(CS2 was arbitrarily removed before M2)
This ordering is not unique. Basically different orderings will occur when it is possible to
validly remove more than one node at any point in time. The other 3 possible orderings
are:
ACTIVITY GRAPHS
I re-iterate that there are NO cycles in such a graph. Typically an activity graph is used
when constructing a building. The foundations have to be built first, then the walls &
finally the roof. Plumbing & electrical wiring need to be done after the walls are built.
The completion date of constructing such a building is very important. Specifically one
wishes to know what effect a hold up in some specific activity will have on the final
completion date.
A critical path analysis is illustrated below. An edge (v, w) means that the activity v must
be completed before activity w can start. If the edge (v, w) has weight x this means that w
can only complete x time units after v has been completed.
If any activity on the critical path takes longer than predicted then the final completion
time is affected.
It is very easy to adapt our shortest path algorithm to calculate the earliest completion
time for any node. The equations are:
$EC_1 = 0$
$EC_w = \max_{(v,w) \in E} (EC_v + c_{v,w})$
The graph with the Earliest completion time is given below:
One can also calculate the latest completion time that each event can finish at without
affecting the final completion time of the project.
One starts at the end of the graph & works backwards to the start. Again this is easy to do
although one would need to reverse the direction of every arc in the graph.
$LC_n = EC_n$
$LC_v = \min_{(v,w) \in E} (LC_w - c_{v,w})$
Examples: $LC_F = 14$
$LC_D = \min_{(v,w) \in E} (LC_w - c_{v,w}) = 14 - 5 = 9$
$LC_E = \min_{(v,w) \in E} (LC_w - c_{v,w}) = 14 - 2 = 12$
$LC_C = \min_{(v,w) \in E} (LC_w - c_{v,w}) = \min(9 - 3, 12 - 4) = \min(6, 8) = 6$
Here you must take the minimum of the 2 possible paths
$LC_B = \min_{(v,w) \in E} (LC_w - c_{v,w}) = 9 - 2 = 7$
Finally the SLACK time for each arc can be calculated. This is the time that that activity
can be delayed without the final completion time being affected.
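The forward pass, the backward pass and the slack calculation can be sketched together in Java. Several things here are assumptions: the class name `CriticalPathDemo`; the requirement that vertices are already numbered in topological order; the costs of the arcs A-B (5) and A-C (6), which are invented because the figure is not reproduced here; and the slack formula $LC_w - EC_v - c_{v,w}$, which is the usual definition but is not printed in these notes. The remaining arc costs are taken from the worked examples.

```java
import java.util.Arrays;

// Sketch: earliest completion, latest completion and slack on an activity graph.
final class CriticalPathDemo {
    final int[] earliest;   // EC for each vertex
    final int[] latest;     // LC for each vertex
    final int[] slack;      // slack[i] belongs to arcs[i]

    // arcs[i] = {v, w, cost}. Assumption: vertices 0..n-1 are numbered in topological
    // order (v < w for every arc), with 0 the start and n-1 the finish.
    CriticalPathDemo(int n, int[][] arcs) {
        int[][] byTail = arcs.clone();
        Arrays.sort(byTail, (a, b) -> Integer.compare(a[0], b[0]));

        earliest = new int[n];                           // EC of the start is 0
        for (int[] a : byTail) {                         // forward pass: take the maximum
            earliest[a[1]] = Math.max(earliest[a[1]], earliest[a[0]] + a[2]);
        }

        latest = new int[n];
        Arrays.fill(latest, earliest[n - 1]);            // LC of the finish = its EC
        for (int i = byTail.length - 1; i >= 0; i--) {   // backward pass: take the minimum
            int[] a = byTail[i];
            latest[a[0]] = Math.min(latest[a[0]], latest[a[1]] - a[2]);
        }

        slack = new int[arcs.length];
        for (int i = 0; i < arcs.length; i++) {
            slack[i] = latest[arcs[i][1]] - earliest[arcs[i][0]] - arcs[i][2];
        }
    }

    // Nodes A..F are numbered 0..5. The first two costs are assumed, not from the notes.
    static int[][] exampleArcs() {
        return new int[][] {
            {0, 1, 5}, {0, 2, 6},   // A-B 5, A-C 6 (assumed)
            {1, 3, 2},              // B-D 2
            {2, 3, 3}, {2, 4, 4},   // C-D 3, C-E 4
            {3, 5, 5},              // D-F 5
            {4, 5, 2}               // E-F 2
        };
    }

    public static void main(String[] args) {
        CriticalPathDemo cp = new CriticalPathDemo(6, exampleArcs());
        System.out.println("EC    " + Arrays.toString(cp.earliest));
        System.out.println("LC    " + Arrays.toString(cp.latest));
        System.out.println("slack " + Arrays.toString(cp.slack));
    }
}
```

With these arcs the finish is reached at time 14 and the sketch reproduces the values 9, 6 and 7 for D, C and B. The arcs with zero slack are A-C, C-D and D-F.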
This activity graph has been presented to show how algorithms that have already been
developed can be adapted and expanded to solve other useful graph problems.
The critical path is the path with all arcs having zero slack time. It is shown below:
QUESTIONS
Q1: What is :
Q5: Use a Topological sort to get the paths (orderings) through the following graph. For
this sort what are the constraints on the graph?
Q6: Use Kruskal’s algorithm to find the minimum spanning tree of the following graph:
ANSWERS
Ans Q1:
Ans Q2:
What is the adjacency list and why is it useful? An adjacency list is a list of the nodes that
can be reached from some specific node. It is useful because when Dijkstra’s algorithm is
being used one requires exactly this information to determine the distance to
neighbouring nodes.
Ans Q3:
      Initial           A known           D known
Node  Known Dist Prev   Known Dist Prev   Known Dist Prev
A     F     0    0      T     0    0      T     0    0
B     F     ~    0      F     2    A      F     2    A
C     F     ~    0      F     ~    0      F     3    D
D     F     ~    0      F     1    A      T     1    A
E     F     ~    0      F     ~    0      F     3    D
F     F     ~    0      F     ~    0      F     9    D
G     F     ~    0      F     ~    0      F     5    D
PQ (Initial): A(0)
PQ (A known): D(1), B(2)
PQ (D known): B(2), C(3), E(3), G(5), F(9)
      B known           C known
Node  Known Dist Prev   Known Dist Prev
A     T     0    0      T     0    0
B     T     2    A      T     2    A
C     F     3    D      T     3    D
D     T     1    A      T     1    A
E     F     3    D      F     3    D
F     F     9    D      F     8    C
G     F     5    D      F     5    D
PQ (B known): C(3), E(3), G(5), F(9)
PQ (C known): E(3), G(5), F(8)*, F(9)*
Note that F occurs twice in the priority queue. The first time it is removed it is marked
as done. When it is removed again it is disregarded because it has already been done.
      E known           G known           F known
Node  Known Dist Prev   Known Dist Prev   Known Dist Prev
A     T     0    0      T     0    0      T     0    0
B     T     2    A      T     2    A      T     2    A
C     T     3    D      T     3    D      T     3    D
D     T     1    A      T     1    A      T     1    A
E     T     3    D      T     3    D      T     3    D
F     F     8    C      F     6    G      T     6    G
G     F     5    D      T     5    D      T     5    D
PQ (E known): G(5), F(8)*, F(9)*
PQ (G known): F(6)*, F(8)*, F(9)*
PQ (F known): F(8)*, F(9)*
Following the “Prev” column the shortest path from F to A is:
F-G, G-D, D-A
Ans Q4: A Minimum Spanning Tree (MST) of a graph is a tree formed from graph edges
that connects all the vertices of the graph at lowest total cost.
Ans Q5:
ADBCEF
ADBECF
DABCEF
DABECF
Ans Q6:
A-B 1 Accept
H-E 2 Accept
A-G 2 Accept
H-G 3 Accept
B-H 4 Reject Cycle A-B-H-G-A
E-F 5 Accept
F-G 6 Reject Cycle G-H-E-F-G
D-E 8 Accept
C-D 9 Accept
C-B 13 Reject Cycle C-B-H-E-D-C