
University of Dar es Salaam

IS239- Algorithms and Complexity (8 Credits)

Facilitator: Dr Godfrey Justo

Pre-Requisites: IS143 and IS237


University of Dar es Salaam
IS239: Course Outline
• Preliminaries: Data Structures & Algorithms Analysis Review (Refer IS143 + IS237/IS137)
• Maths for Algorithm Analysis
• Data structures: Binary tree, Hashing
• Sorting Algorithms: Insertion sort, selection sort, Quick-sort, merge-sort, heap-sort
• Comparison of sorting algorithms
• Application of Binary tree: Huffman coding
• Algorithmic Strategies
• Graphs & Graph Algorithms: Graph definition, implementations, traversal(search), Connected components,
Topological sort, path problems, spanning trees, network flow, matching
• Fundamental Algorithms: Greedy, divide and conquer, dynamic programming, randomized algorithms
• Distributed Algorithms: Introduction and applications, Distributed computations algorithms, merge and sort;
Complexity measures: time, space and message complexity
• Basic Computability
• Concepts in computability, computable and un-computable sets and functions, halting problem
• Automata and Turing machine, Turing computability
• Modern models of computation, computing with objects, interaction-based model
• P versus NP: Classes of algorithms; NP-complete problems; NP-completeness and the classes P and NP
University of Dar es Salaam
Friendship Network
University of Dar es Salaam
Transportation Networks
University of Dar es Salaam
Internet
University of Dar es Salaam
Introduction to Graphs
• Informally a graph is a set of nodes joined by a set of lines or arrows
• Representing a problem as a graph can provide the appropriate tools
(Graph Theoretic methods) for solving the problem
• Graph theory provides a set of techniques for analysing graphs

[Figure: two example graphs on vertices 1–6]
University of Dar es Salaam
What makes a problem graph-like?
• There are two components to a graph
Nodes and edges
• In graph-like problems, these components have natural correspondences to
problem elements
Entities are nodes and interactions between entities are edges
• Most complex systems are graph-like
Friendship Network - Facebook
Transportation Networks – Road/Flight/Railway
Internet
University of Dar es Salaam
Definition: Graph
• G is an ordered triple G:=(V, E, f)
• V is a set of nodes, points, or vertices
• E is a set, whose elements are known as edges or lines
• f is a function - maps each element of E to an unordered pair of vertices in V
• Vertex
• Basic Element, drawn as a node or a dot
• Vertex set of G is usually denoted by V(G), or V
• Edge
• A set of two elements, drawn as a line connecting two vertices, called end vertices, or endpoints
• An edge of a graph that joins a node to itself is called a loop or self-loop
• Some graphs may have a pair of nodes joined by more than one edge, called multiple or parallel edges
• The edge set of G is usually denoted by E(G), or E
• Simple graph - a graph without multiple edges or self-loops
• Multigraph - a graph with some multiple edges but no self-loops
• Pseudograph - a graph with some loops and multiple edges
University of Dar es Salaam
Example: A Simple Graph with Six Nodes & Seven Edges

• V:={1,2,3,4,5,6}
• E:={{1,2},{1,5},{2,3},{2,5},{3,4},{4,5},{4,6}}
University of Dar es Salaam
Directed Graph (Digraph)
• Edges have directions
• An edge is an ordered pair of nodes
University of Dar es Salaam
Weighted Graphs
• Graph for which each edge has an associated weight, usually given by
a weight function w: E → R

[Figure: two example weighted graphs on vertices 1–6]
University of Dar es Salaam
Some Graph Structures and Structural Metrics
• Connectivity
A graph is connected if you can get from any node to any other by following a sequence
of edges OR any two nodes are connected by a path
• Strong Connectedness
A directed graph is strongly connected if there is a directed path from any node to any
other node
• Components
 Every disconnected graph can be split up into a number of connected components
University of Dar es Salaam
Degree of a Graph Node
• Number of edges incident on a node
• For directed Graphs (digraph) : degree = indeg + outdeg
In-degree: Number of edges entering
Out-degree: Number of edges leaving
outdeg(1)=2
indeg(1)=0

outdeg(2)=2
indeg(2)=2

outdeg(3)=1
indeg(3)=4

The degree of 5 is 3
University of Dar es Salaam
Degree: Simple Facts
• If G is a graph with m edges, then Σv∈V deg(v) = 2m = 2|E|
• If G is a digraph, then Σv∈V indeg(v) = Σv∈V outdeg(v) = |E|
• The number of odd degree nodes is even

University of Dar es Salaam
Walk, Path and Cycle in Graph
• A walk of length k in a graph is a succession of k (not necessarily different) edges of the
form uv, vw, wx, …, yz
• This walk is denoted by uvwx … yz, and is referred to as a walk between u and z
• A walk is closed if u = z
• A path is a walk in which all the edges and all the nodes are different
• A cycle is a closed walk in which all the edges are different
Example:
• Walk of length 5: 1,2,5,2,3,4
• Closed walk of length 6: 1,2,5,2,3,2,1
• Path of length 4: 1,2,3,4,6
• 3-Cycle 1,2,5,1
• 4-Cycle 2,3,4,5,2
University of Dar es Salaam
Additional Definitions
• A graph with no cycle is called acyclic graph. A directed acyclic graph is
called a DAG
• In unweighted graphs, a path length is the number of edges on the path
The distance between two vertices is the length of the shortest path
between them
• A weighted path length is the sum of weights (costs or lengths) on the path
• If |E| = Θ(|V|²), then the graph G is called a dense graph
• If |E| = Θ(|V|), then the graph G is called a sparse graph
• Let G be a simple graph with n vertices and m edges. If G is undirected, then
m ≤ n(n - 1)/2. If G is directed, then m ≤ n(n -1)
University of Dar es Salaam
Special Types of Graphs
• Null graph – has no nodes, so obviously no edges
• Empty graph / edgeless graph – has one or more nodes but no edges
• Tree – A connected Acyclic (without cycles) Graph, i.e., every two nodes have
exactly one path between them
• Regular Graph - Connected Graph, with all nodes having the same degree
• Bipartite graph - V can be partitioned into 2 sets V1 and V2 such that (u,v) ∈ E
implies either u ∈ V1 and v ∈ V2, or v ∈ V1 and u ∈ V2
• Complete Graph – A graph with every pair of vertices adjacent. Has n(n-1)/2
edges and commonly denoted as Kn, where n is the number of nodes
University of Dar es Salaam
Special Types of Graphs…
• Complete bipartite graph – a bipartite variation of the complete graph, in which
every node of one set is connected to every node of the other set
• Planar graph – a graph that can be drawn on a plane such that no two edges
intersect. K4 is the largest complete graph that is planar
• Subgraph of G – a graph whose vertex and edge sets are subsets of graph G
• Supergraph - a supergraph of G contains G as a subgraph
• Clique - a complete subgraph of G (every two of its vertices are adjacent); a maximum clique is one of largest possible size
• Spanning subgraph – a graph H spans G, if H is a subgraph of G that has the
same vertex set as G but possibly not all the edges
• Spanning tree – a spanning tree of a connected graph G is a spanning subgraph of G that is a tree, i.e., it contains all the vertices of G, is connected and has no cycles
University of Dar es Salaam
Graph Representation
• Incidence Matrix : V x E -> [vertex, edges] contains the edge's data
• Adjacency Matrix : V x V -> Boolean values (adjacent or not) Or Edge
Weights
Incidence Matrix (rows = vertices, columns = edges)
      {1,2} {1,5} {2,3} {2,5} {3,4} {4,5} {4,6}
  1     1     1     0     0     0     0     0
  2     1     0     1     1     0     0     0
  3     0     0     1     0     1     0     0
  4     0     0     0     0     1     1     1
  5     0     1     0     1     0     1     0
  6     0     0     0     0     0     0     1

Adjacency Matrix
      1  2  3  4  5  6
  1   0  1  0  0  1  0
  2   1  0  1  0  1  0
  3   0  1  0  1  0  0
  4   0  0  1  0  1  1
  5   1  1  0  1  0  0
  6   0  0  0  1  0  0
University of Dar es Salaam
Graph Representation…
• Edge List : pairs (ordered if directed) of vertices. May optionally have
weight and other data
• Adjacency List : an array of |V | with list for each vertex in V. For
each u  V , ADJ [ u ] points to all its adjacent vertices
Edge List: (1,2), (1,2), (2,3), (2,5), (3,3), (4,3), (4,5), (5,3), (5,4)
Adjacency List:
  1: 2, 2
  2: 3, 5
  3: 3
  4: 3, 5
  5: 3, 4
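• To make these representations concrete, here is a minimal C++ sketch (not from the original slides) that stores the example multigraph above as an edge list and builds its adjacency list; the vertex range 1..5 and the printing loop are illustrative choices only.

#include <iostream>
#include <utility>
#include <vector>
using namespace std;

int main() {
    int n = 5;                                   // number of vertices (1..n)
    // edge list of the example digraph (parallel edges and a self-loop allowed)
    vector<pair<int,int>> edges = {{1,2},{1,2},{2,3},{2,5},{3,3},{4,3},{4,5},{5,3},{5,4}};

    vector<vector<int>> adj(n + 1);              // adjacency list, index 0 unused
    for (auto [u, v] : edges) adj[u].push_back(v);

    for (int u = 1; u <= n; ++u) {               // print each vertex's out-neighbours
        cout << u << ":";
        for (int v : adj[u]) cout << " " << v;
        cout << "\n";
    }
}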
University of Dar es Salaam
Example: Edge Lists for Weighted Graphs

Edge List
1 2 1.2
2 4 0.2
4 5 0.3
4 1 0.5
5 4 0.5
6 3 1.5

Note: Often, dense and sparse graphs are represented by using adjacency matrix
and an adjacency list, respectively
University of Dar es Salaam
Topological Distance
• A shortest path is a path of minimum length connecting two nodes
• The number of edges in the shortest path connecting p and q is the
topological distance between the two nodes, dp,q
• Distance Matrix – a |V| x |V| matrix D = (dij)
such that dij is the topological distance between i and j
1 2 3 4 5 6
1 0 1 2 2 1 3
2 1 0 1 2 1 3
3 2 1 0 1 2 2
4 2 2 1 0 1 1
5 1 1 2 1 0 2
6 3 3 2 1 2 0
University of Dar es Salaam
Graph Traversal (Search)
• Breadth First Search (BFS)
 Considers neighbors of a vertex first in the search, before any outgoing edges of the
vertex
 The BFS algorithm traverses a graph in a breadth-ward motion and uses a queue to
remember which vertex to continue the search from when a dead end occurs in an
iteration
• Depth First Search (DFS)
 Considers outgoing edges of a vertex first in the search before any of the outgoing edges
of its predecessor, i.e., extremes are searched first.
 The DFS algorithm starts from the root or any arbitrary node, marks it and
moves to an adjacent unmarked node, continuing this loop until there is no unmarked
adjacent node. It then backtracks, checks for other unmarked nodes and traverses them.
Finally it prints the nodes in the path
University of Dar es Salaam
Breadth First Search (BFS)
• Example
BFS algorithm traverses from A to B to C first then to D to E
and F, lastly to G. It employs the following steps:
• Step 1
 Visit the adjacent unvisited vertex
 Mark it as visited
 Process it
 Insert it in a queue
• Step 2
 If no adjacent vertex is found, remove the first vertex from
the queue
• Step 3
 Repeat Step 1 and Step 2 until the queue is empty
University of Dar es Salaam
The BFS Algorithm
Algorithm BFS
Input: A directed or undirected graph G = (V, E)
Output: Numbering of the vertices in BFS order
1. bfn ←1 //Initialize breadth-first-number
2. for each vertex v ϵ V
3. mark v unvisited
4. end for
5. for each vertex v ϵ V
6. if v is marked unvisited then bfs(v) // starting vertex
7. end for
University of Dar es Salaam
The BFS Algorithm
Procedure bfs(v) // v is starting vertex, using queue
1. Q ← {v} // insert v into queue
2. mark v visited
3. while Q ≠ {}
4. v ←dequeue(Q) // v is current vertex
5. for each edge (v, w) ϵ E
6. if w is marked unvisited then
7. enqueue(w, Q)
8. mark w visited
9. bfn ←bfn + 1
10. end if
11. end for
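• A possible C++ rendering of the BFS procedure above, using an adjacency list and a queue; the example graph is the six-node graph from the earlier slide, and assigning bfn numbers in visiting order is one reading of the pseudocode.

#include <iostream>
#include <queue>
#include <utility>
#include <vector>
using namespace std;

// Breadth-first search from vertex s, assigning BFS numbers in visiting order
void bfs(int s, const vector<vector<int>>& adj, vector<int>& bfn, int& counter) {
    queue<int> Q;
    Q.push(s);
    bfn[s] = counter++;                 // mark s visited by giving it a BFS number
    while (!Q.empty()) {
        int v = Q.front(); Q.pop();
        for (int w : adj[v])
            if (bfn[w] == 0) {          // w still unvisited
                bfn[w] = counter++;
                Q.push(w);
            }
    }
}

int main() {
    int n = 6;                                           // vertices 1..6
    vector<vector<int>> adj(n + 1);
    // undirected example graph from the earlier slide
    vector<pair<int,int>> edges = {{1,2},{1,5},{2,3},{2,5},{3,4},{4,5},{4,6}};
    for (auto [u, v] : edges) { adj[u].push_back(v); adj[v].push_back(u); }

    vector<int> bfn(n + 1, 0);                           // 0 means unvisited
    int counter = 1;
    for (int v = 1; v <= n; ++v)                         // cover disconnected graphs too
        if (bfn[v] == 0) bfs(v, adj, bfn, counter);

    for (int v = 1; v <= n; ++v)                         // visiting order: 1, 2, 5, 3, 4, 6
        cout << "vertex " << v << " has bfn " << bfn[v] << "\n";
}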
University of Dar es Salaam
The BFS Algorithm
The queue contents during BFS traversal
that starts from vertex a. Assume that
we choose to visit adjacent vertices in
alphabetical order
University of Dar es Salaam
The BFS Algorithm
• The order in which the nodes are visited by a BFS that starts from
vertex a: a, b, g, c, f, h, d, e, i, j
• The resulting tree – The BFS tree:
Tree edges
Edges in the BFS-tree
 An edge (v, w) is a tree edge if w
was first visited when exploring
the edge (v, w)
Cross edges: All other edges
University of Dar es Salaam
The BFS Algorithm Time and Space Complexity
• The time complexity can be expressed as O(|V|+|E|)
• Every vertex and every edge will be explored in the worst case
• O(|E|) may vary between O(1) and O(|V|²), depending on how sparse the input graph is
• When the number of vertices in the graph is known ahead of time and additional
data structures are used to determine which vertices have already been added to
the queue, the space complexity can be expressed as O(|V|)
This is in addition to the space required for the graph itself, which may vary depending on
the graph representation used by an implementation of the algorithm
• NB: For graphs that are too large to store explicitly (or infinite), it is more practical
to describe the complexity of breadth-first search in different terms:
 To find the nodes that are at distance d from the start node (measured in number of edge
traversals), BFS takes O(b^(d+1)) time and space, where b is the "branching factor" of the graph
(the average out-degree)
University of Dar es Salaam
Depth First Search (DFS)
• Use stack instead of queue to hold discovered vertices:
 Goes “as deep as possible”, then back until finds first unexplored adjacent
vertex
Useful to compute “start time” and “finish time” of vertex u
Start time: time when a vertex is first visited -- predfn
 Finish time: time when all adjacent vertices of u have been visited -- postdfn
• Can write DFS iteratively using the same algorithm as for BFS but with a
STACK instead of a QUEUE, or, can write a recursive DFS procedure
University of Dar es Salaam
Depth-First Search (DFS) --- Iterative
Iterative DFS Algorithm
Input: A directed or undirected graph G = (V, E)
Output: Numbering of the vertices in depth-first search order
1. predfn ←1; postdfn ←1 //Initialize pre-depth-first number & post-depth-first number
2. for each vertex v ϵ V
3. mark v unvisited
4. end for
5. for each vertex v ϵ V
6. if v is marked unvisited then dfs(v) // starting vertex
7. end for
University of Dar es Salaam
Depth-First Search (DFS)...
Procedure dfs(v) // v is starting vertex, using stack
1. S ← {v} // insert v into stack
2. mark v visited
3. while S ≠ {}
4. v ←Peek(S) // v is current vertex
5. find an unvisited neighbor w of v
6. if w exists then
7. push(w, S)
8. mark w visited
9. predfn ←predfn + 1
10. else
11. pop(S); postdfn ←postdfn + 1
12. end if
University of Dar es Salaam
Depth-First Search (DFS)...
The stack contents during DFS
traversal that starts from vertex a.
Assume that we choose to visit
adjacent vertices in alphabetical
order
University of Dar es Salaam
Depth-First Search (DFS)...
• The order in which the nodes are visited by
a DFS that starts from vertex a: a, b, c, d, e,
f, g, h, i, j
• The resulting tree - depth-first search tree:
Tree edges: edges in the depth-first search tree
An edge (v, w) is a tree edge if w was first
visited when exploring the edge (v, w)
Back edges: All other edges
University of Dar es Salaam
Depth-First Search (DFS)...
A DFS traversal on a directed graph classifies the edges of G into four types:
• Tree edges:
edges in the depth-first search tree
An edge (v, w) is a tree edge if w was first visited when exploring the edge (v, w)
• Back edges:
 edges of the form (v, w) such that w is an ancestor of v in the depth-first search tree
(constructed so far) and vertex w was marked visited when (v, w) was explored
• Forward edges:
edges of the form (v, w) such that w is a descendant of v in the depth-first search tree
(constructed so far) and vertex w was marked visited when (v, w) was explored
• Cross edges: All other edges
University of Dar es Salaam
Depth-First Search (DFS) --- Recursion
Recursive DFS Algorithm
Procedure dfs(v) // v is starting vertex, using recursion
1. mark v visited
2. predfn ←predfn + 1
3. for each edge (v, w) ϵ E
4. if w is marked unvisited then dfs(w)
5. end for
6. postdfn ←postdfn + 1
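• A possible C++ rendering of the recursive DFS with the predfn/postdfn counters above; the small directed example graph is an assumption for illustration only.

#include <iostream>
#include <utility>
#include <vector>
using namespace std;

vector<vector<int>> adj;            // adjacency list, vertices 1..n
vector<bool> visited;
vector<int> pre, post;              // predfn / postdfn numbers per vertex
int predfn = 1, postdfn = 1;

// Recursive DFS assigning pre- and post-numbers, as in the pseudocode
void dfs(int v) {
    visited[v] = true;
    pre[v] = predfn++;
    for (int w : adj[v])
        if (!visited[w]) dfs(w);
    post[v] = postdfn++;
}

int main() {
    int n = 6;
    adj.assign(n + 1, {});
    visited.assign(n + 1, false);
    pre.assign(n + 1, 0);
    post.assign(n + 1, 0);
    // a small directed example graph (an assumption, not the slide's figure)
    vector<pair<int,int>> edges = {{1,2},{1,3},{2,4},{3,4},{4,5},{5,6}};
    for (auto [u, v] : edges) adj[u].push_back(v);

    for (int v = 1; v <= n; ++v)
        if (!visited[v]) dfs(v);

    for (int v = 1; v <= n; ++v)
        cout << "vertex " << v << ": predfn=" << pre[v] << " postdfn=" << post[v] << "\n";
}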
University of Dar es Salaam
The DFS Algorithm Properties and Complexity
• DFS(s) reaches all vertices reachable from s
On undirected graphs, DFS(s) visits all vertices in CC(s) – Connected component
rooted at s - and the DFS-tree obtained is a spanning tree of G
• Analysis:
DFS(s) runs in O(|Vc|+|Ec|), where Vc, Ec are the number of vertices and edges in
CC(s) (reachable from s, for directed graphs)
When run on the entire graph, DFS(G) runs in O(|V| + |E|) time
Put differently, DFS runs in linear time in the size of the graph
 As was with BFS(G,s), it forms a tree - the DFS-tree
University of Dar es Salaam
Topological Sorting
• Given a directed acyclic graph G = (V, E):
The problem of topological sorting is to find
a linear ordering of its vertices in such a way
that if (v, w) ϵ E, then v appears before w in
the ordering
• Example:
 One possible topological sorting of the
vertices in the DAG shown on the right is b,
d, a, c, f, e, g (or a, b, d, c, e, f, g)
University of Dar es Salaam
Topological Sorting
• Generally, can assume that the DAG has only
one vertex, say s, of in-degree 0
 If not, simply add a new vertex s and edges
from s to all vertices of in-degree 0
• Next, carry out a depth-first search on G
starting at vertex s
When the traversal is complete, the values of
the counter postdfn define a reverse
topological ordering of the vertices in the DAG
To obtain the ordering, add an output step to
Algorithm DFS just after the counter postdfn is
incremented
The resulting output is reversed to obtain the
topological ordering
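• A minimal C++ sketch of this DFS-based topological sort: collect vertices in postdfn order, then reverse. The example DAG is an assumption, not the figure from the slide.

#include <algorithm>
#include <iostream>
#include <utility>
#include <vector>
using namespace std;

// DFS that records each vertex when its postdfn would be assigned
void dfs(int v, const vector<vector<int>>& adj, vector<bool>& visited, vector<int>& order) {
    visited[v] = true;
    for (int w : adj[v])
        if (!visited[w]) dfs(w, adj, visited, order);
    order.push_back(v);
}

int main() {
    int n = 6;
    vector<vector<int>> adj(n + 1);
    // a small example DAG (an assumption, not the slide's figure)
    vector<pair<int,int>> edges = {{1,2},{1,3},{2,4},{3,4},{4,5},{4,6}};
    for (auto [u, v] : edges) adj[u].push_back(v);

    vector<bool> visited(n + 1, false);
    vector<int> order;
    for (int v = 1; v <= n; ++v)
        if (!visited[v]) dfs(v, adj, visited, order);

    reverse(order.begin(), order.end());     // reverse of postdfn order = topological order
    for (int v : order) cout << v << " ";    // prints 1 3 2 4 6 5, a valid topological order
    cout << "\n";
}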
University of Dar es Salaam
Exercise Set - 1
1. Write a complete program to implement the DFS
2. Write a complete program to implement the BFS
3. Modify the DFS program to find the topologically sorted
order of a given DAG
4. List the order in which the nodes of the undirected graph
shown on the right are visited by a breadth-first traversal
that starts from vertex a. Repeat this exercise for a
breadth-first traversal starting from vertex d
5. List the order in which the nodes of the undirected graph
shown on the right are visited by a depth-first traversal
that starts from vertex a. Repeat this exercise for a depth-
first traversal starting from vertex d
University of Dar es Salaam
Shortest Path Problem
• Shortest Path Problem
A problem of finding the shortest path(s) between vertices of a given graph
Shortest path between two vertices is a path that has the least cost as compared to all
other existing paths
• Shortest path algorithms
A family of algorithms used for solving the shortest path problem
• Shortest path algorithms have a wide range of applications:
Google Maps
Road Networks
Logistics Research
University of Dar es Salaam
Types of Shortest Path Problem
• Single-pair shortest path problem
A shortest path problem where the shortest path between a given pair of
vertices is computed
A* Search Algorithm is a famous algorithm used for solving single-pair
shortest path problem
• Single-source shortest path problem
A shortest path problem where the shortest path from a given source vertex
to all other remaining vertices is computed
Dijkstra’s Algorithm and Bellman Ford Algorithm are the famous algorithms
used for solving single-source shortest path problem
University of Dar es Salaam
Types of Shortest Path Problem
• Single-destination shortest path problem
A shortest path problem where the shortest path from all the vertices to a single
destination vertex is computed
By reversing the direction of each edge in the graph, this problem reduces to
single-source shortest path problem
Dijkstra’s Algorithm is a famous algorithm adapted for solving single-destination
shortest path problem
• All pairs shortest path problem
A shortest path problem where the shortest path between every pair of vertices is
computed
Floyd-Warshall Algorithm and Johnson’s Algorithm are the famous algorithms
used for solving All pairs shortest path problem.
University of Dar es Salaam
Dijkstra Algorithm
• Famous greedy algorithm
• Solves the single source shortest path problem
• Computes the shortest path from one particular source node to all other
remaining nodes of the graph
• Note:
Works only for connected graphs
Works only for graphs that do not contain any negative weight edge
Does not output the shortest paths:
Only provides the value or cost of the shortest paths
By making minor modifications in the actual algorithm, the shortest paths can be easily obtained
Works for directed as well as undirected graphs
University of Dar es Salaam
Dijkstra Algorithm
1. Z←∅ // The set of vertices that have been visited ‘Z' is initially empty
2. Q←V // The queue 'Q' initially contains all the vertices
3. dist[S] ← 0 // The distance to source vertex is set to 0
4. Π[S] ← NIL // The predecessor of source vertex is set as NIL
5. for all v ∈ V - {S} // For all other vertices
6. do dist[v] ← ∞ // All other distances are set to ∞
7. Π[v] ← NIL // The predecessor of all other vertices is set as NIL
8. while Q ≠ ∅ // While loop executes till the queue is not empty
9. do u ← min-distance (Q, dist) // A vertex from Q with the least distance is selected
10. Z ← Z ∪ {u} // Vertex 'u' is added to ‘Z', the set of vertices that have been visited
11. for all v ∈ neighbors[u] // For all the neighboring vertices of vertex 'u'
12. do if dist[v] > dist[u] + w(u,v) // if any new shortest path is discovered
13. then dist[v] ← dist[u] + w(u,v); Π[v] ← u // The new shortest-path value is recorded and the predecessor updated
14. return dist
Source: https://www.gatevidyalay.com/dijkstras-algorithm-shortest-path-algorithm/
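• A hedged C++ sketch of Dijkstra's algorithm using a binary-heap priority queue (Case 2 of the complexity analysis below). The example graph is reconstructed from the worked example that follows, with S, a, b, c, d, e mapped to 0..5, and should be treated as an assumption.

#include <functional>
#include <iostream>
#include <queue>
#include <utility>
#include <vector>
using namespace std;

const long long INF = 1e18;

// Dijkstra from source s on a non-negatively weighted digraph given as
// adjacency lists of (neighbour, weight) pairs; fills dist[] and predecessor pi[]
void dijkstra(int s, const vector<vector<pair<int,long long>>>& adj,
              vector<long long>& dist, vector<int>& pi) {
    int n = adj.size();
    dist.assign(n, INF);
    pi.assign(n, -1);                                // -1 plays the role of NIL
    priority_queue<pair<long long,int>,
                   vector<pair<long long,int>>,
                   greater<>> pq;                    // min-heap keyed by distance
    dist[s] = 0;
    pq.push({0, s});
    while (!pq.empty()) {
        auto [d, u] = pq.top(); pq.pop();
        if (d > dist[u]) continue;                   // skip stale queue entries
        for (auto [v, w] : adj[u])
            if (dist[u] + w < dist[v]) {             // edge relaxation
                dist[v] = dist[u] + w;
                pi[v] = u;
                pq.push({dist[v], v});
            }
    }
}

int main() {
    // graph of the worked example: S=0, a=1, b=2, c=3, d=4, e=5 (an assumption)
    int n = 6;
    vector<vector<pair<int,long long>>> adj(n);
    auto add = [&](int u, int v, long long w) { adj[u].push_back({v, w}); };
    add(0,1,1); add(0,2,5); add(1,3,2); add(1,4,1); add(1,2,2);
    add(4,5,2); add(2,4,2); add(3,5,1);

    vector<long long> dist; vector<int> pi;
    dijkstra(0, adj, dist, pi);
    for (int v = 0; v < n; ++v)                      // dist = 0,1,3,3,2,4 as in the example
        cout << "dist[" << v << "] = " << dist[v] << "\n";
}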
University of Dar es Salaam
Dijkstra Algorithm
1. First step (1-2) define two sets:
A set Z for all vertices which are included in the shortest path tree ---Initially empty
A set Q for all vertices which are yet to be included in the shortest path tree –
Initially contains all the vertices of the given graph
2. In second step (3-7), for each vertex of the given graph, two variables are
defined:
Π[v] which denotes the predecessor of vertex ‘v’
d[v] which denotes the shortest path estimate of vertex ‘v’ from the source vertex
Initially, the value of these variables is set as:
The value of variable ‘Π’ for each vertex is set to NIL, i.e. , Π[v] = NIL
The value of variable ‘d’ for source vertex is set to 0, i.e., d[S] = 0
The value of variable ‘d’ for remaining vertices is set to ∞, i.e., d[v] = ∞
University of Dar es Salaam
Dijkstra Algorithm
3. In Third step(8-13), the following procedure is
repeated until all the vertices of the graph are
processed:
Among unprocessed vertices, a vertex with minimum value
of variable ‘d’ is chosen
Its outgoing edges are relaxed
After relaxing the edges for that vertex, the sets created in
first step are updated
Edge Relaxation:
Consider the edge (a,b) in the graph shown on the right
Here, d[a] and d[b] denotes the shortest path estimate for
vertices a and b respectively from the source vertex ‘S’.
Now, if d[a] + w < d[b], then d[b] = d[a] + w and Π[b] = a
This is called edge relaxation
University of Dar es Salaam
Time Complexity Analysis
• Case 1:
 Valid when
The given graph G is represented as an adjacency matrix A
Priority queue Q is represented as an unordered list
 Here:
A[i,j] stores the information about edge (i,j)
Time taken for selecting i with the smallest dist is O(|V|)
For each neighbour of i, time taken for updating dist[j] is O(1) and there will
be maximum |V| neighbours
Time taken for each iteration of the loop is O(|V|) and one vertex is deleted
from Q
Thus, total time complexity becomes O(|V|2)
University of Dar es Salaam
Time Complexity Analysis
• Case 2:
 Valid when
The given graph G is represented as an adjacency list
Priority queue Q is represented as a binary heap
 Here:
With adjacency list representation, all vertices of the graph can be
traversed using BFS in O(|V|+|E|) time
In min heap, operations like extract-min and decrease-key value
takes O(log|V|) time.
So, the overall time complexity becomes O(|E|+|V|) x O(log|V|), which is
O((|E| + |V|) x log|V|) = O(|E| log|V|)
This time complexity can be reduced to O(|E| + |V| log|V|) using a Fibonacci heap
University of Dar es Salaam
Dijkstra Algorithm : Example
• Consider the graph shown on the right:
 Use Dijkstra’s algorithm to find the
shortest distance from source vertex ‘S’
to remaining vertices.
 Write the order in which the vertices
are visited.
Solution
• In step 1, the following two sets are
created:
Unvisited set : {S , a , b , c , d , e}
Visited set : { }
University of Dar es Salaam
Dijkstra Algorithm : Example ...
• In step 2, the two variables Π and d are created for each vertex and initialized as:
Π[S] = Π[a] = Π[b] = Π[c] = Π[d] = Π[e] = NIL
d[S] = 0
d[a] = d[b] = d[c] = d[d] = d[e] = ∞
• In step 3, vertex ‘S’ is chosen. This is because shortest path estimate for vertex ‘S’ is
least. The outgoing edges of vertex ‘S’ are relaxed
Now,
d[S] + 1 = 0 + 1 = 1 < ∞  ∴ d[a] = 1 and Π[a] = S
d[S] + 5 = 0 + 5 = 5 < ∞  ∴ d[b] = 5 and Π[b] = S
Before Edge Relaxation
University of Dar es Salaam
Dijkstra Algorithm : Example ...
• After edge relaxation, our shortest path tree is as a
per top snapshot graph on the right:
• Now, the sets are updated as:
Unvisited set : {a , b , c , d , e}
Visited set : {S}
• In step 4, vertex ‘a’ is chosen. This is because shortest
path estimate for vertex ‘a’ is least. The outgoing
edges of vertex ‘a’ (see bottom snapshot graph) are
relaxed
Before Edge Relaxation-
University of Dar es Salaam
Dijkstra Algorithm : Example ...
• Now,
d[a] + 2 = 1 + 2 = 3 < ∞ ∴ d[c] = 3 and Π[c] = a
d[a] + 1 = 1 + 1 = 2 < ∞ ∴ d[d] = 2 and Π[d] = a
d[a] + 2 = 1 + 2 = 3 < 5 ∴ d[b] = 3 and Π[b] = a
After edge relaxation, our shortest path tree is on
the top snapshot graph on the right
• Now, the sets are updated as-
Unvisited set : {b , c , d , e}
Visited set : {S , a}
• In step 5, vertex ‘d’ is chosen. This is because the
shortest path estimate for vertex ‘d’ is least. The
outgoing edges of vertex ‘d’ are relaxed (Before Edge Relaxation)
University of Dar es Salaam
Dijkstra Algorithm : Example ...
• Now,
 d[d] + 2 = 2 + 2 = 4 < ∞ ∴ d[e] = 4 and Π[e] = d
• After edge relaxation, our shortest path tree is
as per the snapshot graph on the right
• Now, the sets are updated as:
 Unvisited set : {b , c , e}
 Visited set : {S , a , d}
• In step 6, vertex ‘b’ is chosen. This is because
shortest path estimate for vertex ‘b’ is least.
Vertex ‘c’ may also be chosen since for both
vertices the shortest path estimate is least. The
outgoing edges of vertex ‘b’ are relaxed (Before Edge Relaxation)
University of Dar es Salaam
Dijkstra Algorithm : Example ...
• Now,
d[b] + 2 = 3 + 2 = 5 > 2 ∴ No change
• After edge relaxation, our shortest path tree remains
the same as in step 5
• Now, the sets are updated as:
Unvisited set : {c , e}
Visited set : {S , a , d , b}
• In step 7, vertex ‘c’ is chosen. This is because the shortest
path estimate for vertex ‘c’ is least. The outgoing
edges of vertex ‘c’ are relaxed (Before Edge Relaxation)
University of Dar es Salaam
Dijkstra Algorithm : Example ...
• Now,
 d[c] + 1 = 3 + 1 = 4 = 4 ∴ No change
• After edge relaxation, our shortest path tree remains the same as in step 5
• Now, the sets are updated as:
 Unvisited set : {e}
 Visited set : {S , a , d , b , c}
• In step 8, vertex ‘e’ is chosen. This is because shortest path estimate for vertex ‘e’ is least.
The outgoing edges of vertex ‘e’ are relaxed. There are no outgoing edges for vertex ‘e’. So,
our shortest path tree remains the same as in step 5
• Now, the sets are updated as:
 Unvisited set : { }
 Visited set : {S , a , d , b , c , e}
University of Dar es Salaam
Dijkstra Algorithm : Example ...
• Now, all vertices of the graph are processed.
• Final shortest path tree is as shown on
the right
• It represents the shortest path from
source vertex ‘S’ to all other remaining
vertices.
• The order in which all the vertices are
processed is : S , a , d , b , c , e
Dijkstra Algo Demo Video: https://youtu.be/smHnz2RHJBY
University of Dar es Salaam
Spanning Trees
• A tree is a connected graph with no cycles
• A forest is a graph with no cycles but may or may not be connected - i.e., a
forest is a graph whose components are trees
• Figure (a) shows a tree, while Figure (b) shows a forest
University of Dar es Salaam
Spanning Tree
• A spanning tree of a graph G is a tree T which is a spanning subgraph of G
That is, T has the same vertex set as G
• Computing a spanning tree:
Many algorithms exist:
 Vertex-centric
 Edge-centric
University of Dar es Salaam
Finding a Spanning Tree
Vertex-centric algorithm
1. Pick an arbitrary node and mark it as being in the tree
2. Repeat the following until all nodes are marked as in the tree:
a) Pick an arbitrary node u in the tree with an edge e to a node w not in the tree
b) Add e to the spanning tree and mark w as in the tree
 Step 2 iterates n−1 times, because n−1 edges (one for each remaining vertex) have to
be added to the tree
 The efficiency of the algorithm is determined by how efficiently one can find
a qualifying w
University of Dar es Salaam
Finding a Spanning Tree..
Edge-centric Algorithm
1. Start with the collection of singleton trees (each with exactly one node)
2. As long as there are more than one tree, connect two trees together with
an edge in the graph
 The algorithm performs n − 1 steps, because it has to add n − 1 edges to the
trees to obtain a spanning tree
 Its efficiency is determined by how quickly we can tell if an edge would
connect two trees or would connect two nodes already in the same tree
University of Dar es Salaam
Minimum Spanning Trees
• A minimum spanning tree (MST) or minimum weight spanning tree is
 A subset of the edges of a connected, edge-weighted undirected graph that connects all the
vertices together, without any cycles and with the minimum possible total edge weight
 A spanning tree that represents the minimum cost
• An MST is not necessarily unique
• Greedy algorithms that find a minimum spanning tree:
Prim’s Algorithm:
1. Choose any start vertex to form the initial partial tree (vi)
2. Add the cheapest edge, ei, to a new vertex to form a new partial tree
3. Repeat step 2 until all vertices have been included in the tree
Kruskal’s Algorithm:
No start vertex; start adding edges to the MST from the edge with the
smallest weight that doesn't form a cycle, until all vertices are included
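• A minimal C++ sketch of Kruskal's algorithm, using a union-find (disjoint-set) structure to reject edges that would close a cycle; the example graph and its weights are assumptions for illustration.

#include <algorithm>
#include <iostream>
#include <numeric>
#include <vector>
using namespace std;

// Union-find used to detect whether an edge would connect two nodes already in the same tree
struct DSU {
    vector<int> parent;
    DSU(int n) : parent(n) { iota(parent.begin(), parent.end(), 0); }
    int find(int x) { return parent[x] == x ? x : parent[x] = find(parent[x]); }
    bool unite(int a, int b) {
        a = find(a); b = find(b);
        if (a == b) return false;        // already in the same tree
        parent[a] = b;
        return true;
    }
};

struct Edge { int u, v, w; };

int main() {
    int n = 5;                           // vertices 0..4 (an assumed example graph)
    vector<Edge> edges = {{0,1,2},{1,2,3},{0,3,3},{3,4,2},{2,4,4},{1,3,6}};

    // Kruskal: sort edges by weight, add each edge that does not form a cycle
    sort(edges.begin(), edges.end(), [](const Edge& a, const Edge& b) { return a.w < b.w; });
    DSU dsu(n);
    int total = 0;
    for (const Edge& e : edges)
        if (dsu.unite(e.u, e.v)) {       // edge joins two different trees
            cout << "take edge (" << e.u << "," << e.v << ") weight " << e.w << "\n";
            total += e.w;
        }
    cout << "total MST weight = " << total << "\n";   // prints 10 for this example
}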
University of Dar es Salaam
Minimum Spanning Trees Case Study
• Suppose we have a group of offices
which need to be connected by a
network of communication lines
• The offices may communicate with each
other directly or through another office
• In order to decide on which offices to
build links between we need to work
out the cost of all possible connections
• This gives us a weighted complete graph
University of Dar es Salaam
Minimum Spanning Trees...
• Example: Find a minimum spanning tree for the graph
representing communication links between offices:
 Start with any vertex and choose the one marked a
 Add the edge (a,b) which is the cheapest edge of those
incident to a
 Add a new edge in order to form a partial tree and choose
(b,c), which is one of the cheapest remaining edges incident
either with a or b
 Now add edge (a,d) which is the cheapest remaining edge
of those incident with a or b or c
 Continuing in this manner finds a minimum spanning tree!
E.g., one whose total cost of the communication links is
2 + 3 + 3 + 2 + 4 = 14
University of Dar es Salaam
Network Flow
Water flowing through a pipe-work system. The values on the pipes are the capacities of water that they can carry
University of Dar es Salaam
Network Flow
• Suppose we turn on the tap, so that water flows along
the path SACT
 What is the maximum flow along this path?
 The maximum flow is governed by the minimum capacity along
the path----In this case 2!!!
• Now consider the path SBDT
 What is the maximum flow along this path?
 The minimum capacity along this path, hence the maximum
flow is 4
• A flow can be increased if we can find a path from S to T with no
saturated edges
 The flow can then be increased by the minimum excess capacity along that path
 Consider the path SBCT
University of Dar es Salaam
Network Flow
• Is there another path from S to T that consists only of edges that are not saturated?
• Yes, the path SABCDT. What is the minimum excess capacity along this route?
• The minimum capacity is 1 along BC, therefore can increase the flow by 1
University of Dar es Salaam
Network Flow
• Now we have a total flow of 9 out of the source S and into the sink T
• Need a way of finding out whether this is the maximum possible flow!
University of Dar es Salaam
The Network Flow Problem
• A type of network optimization problem
• Arise in many different contexts
 Networks: routing as many packets as possible on a given network
 Transportation: sending as many trucks as possible, where roads have limits on the number of
trucks per unit time
 Bridges: destroying (?!) some bridges to disconnect s from t, while minimizing the cost of
destroying the bridges
• Settings:
 Given a directed graph G = (V, E), where each edge e is associated with its capacity c(e) > 0, and
two special nodes source s and sink t (s ≠ t)
• Problem:
 Maximize the total amount of flow from s to t, subject to two constraints:
– Flow on edge e doesn’t exceed c(e)
– For every node v ≠ s, t, incoming flow is equal to outgoing flow
University of Dar es Salaam
The Network Flow Problem
Alternate Formulation: Minimum Cut
• We want to remove some edges from the graph such that after removing the
edges, there is no path from s to t
 The cost of removing e is equal to its capacity c(e)
 The minimum cut problem is to find a cut with minimum total cost
• Theorem:
(maximum flow) = (minimum cut)
University of Dar es Salaam
The Network Flow Problem
[Figure: capacities (left) and a maximum flow of 23 total units (right)]
University of Dar es Salaam
The Network Flow Problem: Minimum Cut Example
[Figure: capacities (left) and a minimum cut, with the red edges removed (right)]
University of Dar es Salaam
Ford-Fulkerson Algorithm
• A simple and practical max-flow algorithm
• Main idea:
Find valid flow paths until there is none left, and add them up
• How do we know if this gives a maximum flow?
Proof sketch:
Suppose not
Take a maximum flow f⋆ and “subtract” our flow f
It is a valid flow of positive total flow
By the flow decomposition, it can be decomposed into flow paths and circulations
These flow paths must have been found by Ford-Fulkerson
 Contradiction
University of Dar es Salaam
Ford-Fulkerson Algorithm
• Back edges:
We don’t need to maintain the amount of flow on each edge but work with capacity
values directly
 If f amount of flow goes through u → v, then:
 Decrease c(u → v) by f
 Increase c(v → u) by f
Why do we need to do this?
Sending flow to both directions is equivalent to cancelling flow
University of Dar es Salaam
Ford-Fulkerson Algorithm : Pseudo-code
• Set ftotal = 0
• Repeat until there is no path from s to t:
 Run DFS from s to find a flow path to t
Let f be the minimum capacity value on the path
Add f to ftotal
For each edge u → v on the path:
 Decrease c(u → v) by f
 Increase c(v → u) by f
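• A compact C++ sketch of the Ford-Fulkerson pseudo-code above, keeping only residual capacities in a matrix and finding each flow path by DFS; the example network, node count and capacities are assumptions.

#include <algorithm>
#include <climits>
#include <cstring>
#include <iostream>
using namespace std;

// Residual capacities; cap[u][v] is decreased and cap[v][u] increased as flow is sent
const int N = 6;
long long cap[N][N];
bool seen[N];

// DFS for a flow path from u to t; returns the flow pushed along it (0 if none)
long long dfs(int u, int t, long long pushed) {
    if (u == t) return pushed;
    seen[u] = true;
    for (int v = 0; v < N; ++v)
        if (!seen[v] && cap[u][v] > 0) {
            long long f = dfs(v, t, min(pushed, cap[u][v]));
            if (f > 0) { cap[u][v] -= f; cap[v][u] += f; return f; }
        }
    return 0;
}

int main() {
    // a small assumed example network: s = 0, t = 5
    int s = 0, t = 5;
    cap[0][1] = 4; cap[0][2] = 6; cap[1][3] = 4; cap[2][3] = 1;
    cap[2][4] = 5; cap[3][5] = 5; cap[4][5] = 4;

    long long total = 0, f;
    do {                                  // repeat until no augmenting path remains
        memset(seen, false, sizeof(seen));
        f = dfs(s, t, LLONG_MAX);
        total += f;
    } while (f > 0);
    cout << "max flow = " << total << "\n";   // prints 9 for this example
}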
University of Dar es Salaam
Ford-Fulkerson Algorithm : Analysis
• Assumption:
capacities are integer-valued
• Finding a flow path takes Θ(n + m) time
• We send at least 1 unit of flow through the path
If the max-flow is f⋆, the time complexity is O((n + m)f⋆)
 “Bad” in that it depends on the output of the algorithm
 Nonetheless, easy to code and works well in practice!!
University of Dar es Salaam
Computing Min-Cut
• We know that max-flow is equal to min-cut
And we now know how to find the max-flow
• Question:
how do we find the min-cut?
• Answer:
 use the residual graph
“Subtract” the max-flow from the original graph
University of Dar es Salaam
[Figure: the original graph, the max-flow graph, and the residual graph]
University of Dar es Salaam
Computing Min-Cut from the Residual Graph
• Mark all nodes reachable from s
Call the set of reachable nodes A
• Separate the nodes in A from the others,
i.e., V − A
Cut edges go from A to V − A in the original
graph
 Look at the original graph and find the cut:
Why isn’t b → c cut?
University of Dar es Salaam
Bipartite Matching
• A Bipartite Graph G = (V, E) is a graph in which the vertex set V
can be divided into two disjoint subsets X and Y such that every
edge e ∈ E has one end point in X and the other end point in Y
• A matching M is a subset of edges such that each node in V
appears in at most one edge in M
Interested in matching of large size
• Maximal Matching:
 A matching to which no more edges can be added without increasing
the degree of one of the nodes to two - it is a local maximum
• Maximum Matching:
 A matching with the largest possible number of edges - it is globally
optimal
University of Dar es Salaam
Bipartite Matching
• Goal:
Find a maximum matching in a graph
• Note:
A maximal matching can be found very easily
Keep adding edges to a matching until no more can be added
Can be shown that for any maximal matching M, we have that |M|≥ ½ |M*| where
M* is the maximum matching
Therefore one can easily construct a “2-approximation” to a maximum matching
University of Dar es Salaam
Bipartite Matching and Network Flow
• The problem of finding a maximum matching can be
reduced to maximum flow in the following manner:
 Let G(V,E) be the bipartite graph where V is divided
into X and Y
 Construct a directed graph G’(V’,E’), in which V’
contains all the nodes of V along with a source node s
and a sink node t
 For every edge in E, we add a directed edge in E’ from
X to Y
 Finally add a directed edge from s to all nodes in X
and from all nodes of Y to t
 Each edge is given unit capacity
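• A self-contained C++ sketch of this reduction: build the unit-capacity network (source → X, X → Y along the given edges, Y → sink) and run the same DFS-based augmenting-path routine as in the earlier Ford-Fulkerson sketch. The instance sizes and the preference edges are assumptions.

#include <algorithm>
#include <climits>
#include <iostream>
#include <utility>
#include <vector>
using namespace std;

int n1 = 3, n2 = 3;                       // |X|, |Y| (an assumed small instance)
int V;                                    // total nodes: s, X, Y, t
vector<vector<long long>> cap;            // residual capacities
vector<bool> seen;

long long dfs(int u, int t, long long pushed) {
    if (u == t) return pushed;
    seen[u] = true;
    for (int v = 0; v < V; ++v)
        if (!seen[v] && cap[u][v] > 0) {
            long long f = dfs(v, t, min(pushed, cap[u][v]));
            if (f > 0) { cap[u][v] -= f; cap[v][u] += f; return f; }
        }
    return 0;
}

int main() {
    V = n1 + n2 + 2;
    int s = 0, t = V - 1;                 // X is 1..n1, Y is n1+1..n1+n2
    cap.assign(V, vector<long long>(V, 0));
    for (int i = 1; i <= n1; ++i) cap[s][i] = 1;              // s -> X, unit capacity
    for (int j = 1; j <= n2; ++j) cap[n1 + j][t] = 1;         // Y -> t, unit capacity
    // assumed matching edges X -> Y, each of unit capacity
    vector<pair<int,int>> pref = {{1,1},{1,2},{2,1},{3,3}};
    for (auto [i, j] : pref) cap[i][n1 + j] = 1;

    long long matching = 0, f;
    do {                                  // max-flow value = maximum matching size
        seen.assign(V, false);
        f = dfs(s, t, LLONG_MAX);
        matching += f;
    } while (f > 0);
    cout << "maximum matching size = " << matching << "\n";   // prints 3 here
}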
University of Dar es Salaam
Bipartite Matching and Network Flow
• Let f be an integral flow of G’ of value k
Make the following observations:
1. There is no node in X which has more than one outgoing edge where there is a flow
2. There is no node in Y which has more than one incoming edge where there is a flow
3. The number of edges between X and Y which carry flow is k
• By these observations:
It is straightforward to conclude that the set of edges carrying flow in f forms a matching
of size k for the graph G
Likewise, given a matching of size k in G, we can construct a flow of size k in G’
Therefore, solving for maximum flow in G’ gives a maximum matching in G
 Note that we used the fact that when edge capacities are integral, Ford-Fulkerson
produces an integral flow
University of Dar es Salaam
Bipartite Matching : Example
• Scenario:
n students and d dorms
Each student wants to live in one of the dorms of their choice
Each dorm can accommodate at most one student (?!)
 A more reasonable variant of this problem: dorm j can accommodate cj students
 Make an edge with capacity cj from dorm j to the sink
• Problem:
Find an assignment that maximizes the number of students who get a housing
University of Dar es Salaam
Flow Network Construction
• Add source and sink
• Make edges between students
and dorms
All the edge weights are 1
University of Dar es Salaam
Flow Network Construction
• Find max-flow
Find the optimal assignment
from the chosen edges
University of Dar es Salaam
Bipartite Matching: Analysis
• The running time of this algorithm:
Constructing the graph G’ takes O(n+m) time where n = |V| and m = |E|
The running time for the Ford-Fulkerson algorithm is O(m’F) where m’ is the number
of edges in E’ and F = Σe∈δ(s) c(e)
In case of bipartite matching problem, F ≤ |V| since there can be only |V| possible
edges coming out from source node
So the total running time is O(m’n) = O((m + n)n)
• An interesting thing to note is that at any iteration of the algorithm
 Any s-t path in the residual graph will have alternating matched and unmatched
edges
Such paths are called alternating paths
This property can be used to find maximum matchings even in general graphs
University of Dar es Salaam
Perfect Matching
• A perfect matching is a matching in which each node has exactly one edge
incident on it
One possible way of finding out if a given bipartite graph has a perfect matching is to
use the above algorithm to find a maximum matching and checking if the size of the
matching equals the number of nodes in each partition
• There is another way of determining this: use Hall's Theorem
• Theorem
A Bipartite graph G(V,E) has a Perfect Matching iff (if and only if) for every subset S ⊆
X or S ⊆ Y, the size of the neighbours of S is at least as large as S, i.e., |Γ(S) | ≥ |S|
This theorem can be proven using induction - We do not discuss details of the proof in
here!!
University of Dar es Salaam
Other Algorithmic Techniques
• In an algorithm design there is no one 'silver bullet' that is a cure for all
computation problems
• Different problems may require use of different kinds of techniques
• A good programmer should make use of all these techniques based on the
type of problem
• Some commonly-used techniques are:
Greedy algorithms (This is not an algorithm, it is a technique)
Divide and conquer
Dynamic programming
Randomized algorithms
University of Dar es Salaam
Greedy Algorithms
• Always makes the choice that seems to be the best at that moment
 Makes a locally-optimal choice in the hope that this choice leads to a globally-
optimal solution
 Works step-by-step and always chooses the steps which provide immediate
profit/benefit
 May not always lead to the optimal global solution, because it does not consider
the entire data
• How to decide which choice is optimal?
Given an objective function that needs to be optimized (maximized or
minimized)
Makes greedy choices at each step to ensure that the objective function is optimized
Has only one shot to compute the optimal solution - never goes back and reverses the
decision
University of Dar es Salaam
Greedy Algorithms Basic Structure
getOptimal(item_arr[], int n)
1) Initialize empty result: result = {}
2) while (all items are not considered)
       i = SelectAnItem()   // We make a greedy choice to select an item
       if (feasible(i))     // If i is feasible, add i to the result
           result = result ∪ {i}
3) return result
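• As one concrete instance of this template, a C++ sketch of the fractional knapsack problem mentioned among the applications below: the greedy choice is always the item with the largest value/weight ratio. The item data are assumptions.

#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;

struct Item { double value, weight; };

// Fractional knapsack: greedily take items in decreasing value/weight order,
// taking a fraction of the last item if the remaining capacity does not suffice
double fractionalKnapsack(vector<Item> items, double capacity) {
    sort(items.begin(), items.end(), [](const Item& a, const Item& b) {
        return a.value / a.weight > b.value / b.weight;       // greedy ordering
    });
    double total = 0.0;
    for (const Item& it : items) {
        if (capacity <= 0) break;
        double take = min(it.weight, capacity);               // whole item or the remainder
        total += it.value * (take / it.weight);
        capacity -= take;
    }
    return total;
}

int main() {
    vector<Item> items = {{60, 10}, {100, 20}, {120, 30}};    // assumed example data
    cout << fractionalKnapsack(items, 50) << "\n";            // prints 240
}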
University of Dar es Salaam
Greedy Approach
• Characteristics
1. There is an ordered list of resources (profit, cost, value, etc.)
2. Maximum/minimum of all the resources (max profit, max value, min cost, etc.) are
taken
3. For example, in the fractional knapsack problem, the item with the maximum
value/weight ratio is taken first, according to the available capacity
• Applications
 Finding an optimal solution (Activity selection, Fractional Knapsack, Job Sequencing
, Huffman Coding)
 Finding close to the optimal solution for NP-Hard problems like TSP (Travelling
Salesman Problem)
University of Dar es Salaam
Advantages and Disadvantages of Greedy Approach
• Advantages
Greedy approach is easy to design and implement
Analyzing the run time for greedy algorithms will generally be much easier than for
other techniques (e.g. Divide and conquer)
Typically have lower time complexity
Greedy algorithms can be used for optimization purposes or finding close to
optimization in case of NP Hard problems
• Disadvantages
Have to work much harder to understand correctness issues
 Even with the correct algorithm, it is hard to prove why it is correct - Proving that a greedy
algorithm is correct is more of an art than a science (It involves a lot of creativity)
The local optimal solution may not always be global optimal
University of Dar es Salaam
Standard Greedy Algorithms
Examples of Greedy Algorithms:
• Activity Selection Problem
• Egyptian Fraction
• Job Sequencing Problem
• Job Sequencing Problem (Using Disjoint Set)
• Huffman Coding
• Water Connection Problem
• Policemen catch thieves
• Minimum Swaps for Bracket Balancing
• Fitting Shelves Problem
• Kruskal’s Minimum Spanning Tree
• Prim’s Minimum Spanning Tree
• Boruvka’s Minimum Spanning Tree
• Reverse delete algorithm for MST
• Dijkstra’s Shortest Path Algorithm
University of Dar es Salaam
Divide and Conquer
• In the divide and conquer approach, the problem at hand is divided into
smaller sub-problems and then each sub-problem is solved independently
• When we keep on dividing the sub-problems into even smaller sub-
problems:
 Eventually we reach a stage where no more division is possible
The "atomic" smallest possible sub-problem (fractions) are solved
The solution of all sub-problems is finally merged in order to obtain the solution of
an original problem
University of Dar es Salaam
Three-step Process of Divide-and-Conquer Approach
• Divide/Break
Breaks the problem into smaller sub-problems
Sub-problems represent a part of the original problem
This step generally takes a recursive approach to divide the problem until no sub-problem is further
divisible
 At this stage, sub-problems become atomic in nature but still represent some part of the actual problem
• Conquer/Solve
This step receives a lot of smaller sub-problems to be solved
 Generally, at this level, the problems are considered 'solved' on their own
• Merge/Combine
When the smaller sub-problems are solved, this stage recursively combines them until they formulate a
solution of the original problem
This algorithmic approach works recursively - the conquer & merge steps work so closely that they appear as
one
University of Dar es Salaam
Divide and Conquer Algorithms
• Algorithms based on divide-and-conquer programming approach include:
Merge Sort 
Quick Sort 
Binary Search 
Strassen's Matrix Multiplication
Closest pair (points)
• Note:
There are various ways available to solve any computer problem, but the mentioned
are a good example of divide and conquer approach
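• To illustrate the three-step process, a short C++ merge sort sketch (divide the range, conquer each half recursively, combine by merging the sorted halves); the sample array is an assumption.

#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;

// Merge sort on a[lo..hi): divide the range in half, conquer each half
// recursively, then combine the two sorted halves
void mergeSort(vector<int>& a, int lo, int hi) {
    if (hi - lo < 2) return;                               // atomic sub-problem
    int mid = lo + (hi - lo) / 2;
    mergeSort(a, lo, mid);                                 // conquer left half
    mergeSort(a, mid, hi);                                 // conquer right half
    vector<int> merged;                                    // combine step
    int i = lo, j = mid;
    while (i < mid && j < hi)
        merged.push_back(a[i] <= a[j] ? a[i++] : a[j++]);
    while (i < mid) merged.push_back(a[i++]);
    while (j < hi)  merged.push_back(a[j++]);
    copy(merged.begin(), merged.end(), a.begin() + lo);
}

int main() {
    vector<int> a = {5, 2, 9, 1, 5, 6};
    mergeSort(a, 0, (int)a.size());
    for (int x : a) cout << x << " ";                      // 1 2 5 5 6 9
    cout << "\n";
}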
University of Dar es Salaam
Dynamic Programming (DP)
• Is repeating the things for which you already have the answer, a good thing ?
• A programmer would disagree
• That's what Dynamic Programming is about
• To always remember answers to the sub-problems you've already solved
• In DP, one needs to break up a problem into a series of overlapping sub-problems
and build up solutions to larger and larger sub-problems
 Suppose a given problem can be broken down into smaller sub-problems, and the smaller sub-
problems can still be broken into smaller ones
 If we find that there are some overlapping sub-problems, then we have encountered a DP
problem!
• The core idea of DP is to avoid repeated work by remembering partial results -- this
concept finds its application in a lot of real-life situations!!!
University of Dar es Salaam
Dynamic Programming and Recursion
• Dynamic programming is basically recursion plus using common sense
 Recursion allows one to express the value of a function in terms of other values of that function
 Common sense tells us that if one implements a function in such a way that the recursive results are
computed in advance and stored for easy access, it will make the program faster
 Memoization - memorizing the results of some specific states, which can then later be accessed to solve
other sub-problems
• DP trades space for time
• Instead of calculating all the states, which takes a lot of time but no space, we take up space to store the
results of all the sub-problems to save time later
• Example: Fibonacci numbers
• Fibonacci (n) = 1; if n = 0
Fibonacci (n) = 1; if n = 1
Fibonacci (n) = Fibonacci(n-1) + Fibonacci(n-2)
University of Dar es Salaam
Dynamic Programming and Recursion
A code using pure recursion:

int fib(int n) {
    if (n < 2)
        return 1;
    return fib(n - 1) + fib(n - 2);
}

Using the Dynamic Programming approach with memoization (fibresult[] is a global array of size n):

void fib(int n) {
    fibresult[0] = 1;
    fibresult[1] = 1;
    for (int i = 2; i < n; i++)
        fibresult[i] = fibresult[i - 1] + fibresult[i - 2];
}
University of Dar es Salaam
Dynamic Programming and Recursion
• Are we using a different recurrence
relation in the two codes? No
• Are we doing anything different in
the two codes? Yes
In the recursive code, a lot of values
are being recalculated multiple times
We could do better by calculating
each unique quantity only once
Take a look at the Figure to the right
to understand that how certain
values were being recalculated in the
recursive way:
University of Dar es Salaam
Dynamic Programming Problems Categories
• Optimization problems
The optimization problems expect you to select a feasible solution, so that the
value of the required function is minimized or maximized
• Combinatorial problems
Combinatorial problems expect you to figure out the number of ways to do
something, or the probability of some event happening
• A Dynamic Programming problem follows this schema:
Show that the problem can be broken down into optimal sub-problems
Recursively define the value of the solution by expressing it in terms of optimal
solutions for smaller sub-problems
Compute the value of the optimal solution in bottom-up fashion
Construct an optimal solution from the computed information
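• To make the schema concrete, a small bottom-up C++ sketch for the 0/1 knapsack optimization problem (an assumed example, not taken from the slides): each sub-problem dp[c] is solved once and reused.

#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;

// 0/1 knapsack solved bottom-up: dp[c] = best value achievable with capacity c
// using the items considered so far (overlapping sub-problems solved only once)
int main() {
    vector<int> weight = {1, 3, 4, 5};            // assumed example data
    vector<int> value  = {1, 4, 5, 7};
    int capacity = 7;

    vector<int> dp(capacity + 1, 0);
    for (size_t i = 0; i < weight.size(); ++i)
        for (int c = capacity; c >= weight[i]; --c)   // go downwards so each item is used once
            dp[c] = max(dp[c], dp[c - weight[i]] + value[i]);

    cout << "best value = " << dp[capacity] << "\n";  // 9 here (items of weight 3 and 4)
}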
University of Dar es Salaam
Randomized Algorithms
• Deterministic Algorithm
The solution produced by the algorithm is correct
The number of computational steps is the same for different runs of the algorithm with
the same input
University of Dar es Salaam
Randomized Algorithms
Deterministic Algorithm Realities
• Given a computational problem
It may be difficult to formulate an algorithm with good running time, or
the running time of an algorithm for that problem may explode with the number of
inputs
• Remedies
Efficient heuristics
Approximation algorithms
Randomized algorithms
University of Dar es Salaam
Randomized Algorithms
A randomized algorithm employs a degree of randomness as part of its logic!
• In addition to the input, the algorithm uses a source of pseudo random numbers
• During execution, it takes random choices depending on the random numbers
• The behaviour (output) can vary if the algorithm is run multiple times on the same input
University of Dar es Salaam
Advantage and Disadvantage of Randomized Algorithms
• Advantage
The algorithm is usually simple and easy to implement
The algorithm is fast with very high probability, and/or
It produces optimum output with very high probability
• Disadvantage
There is a finite probability of getting incorrect answer, however, the probability of
getting a wrong answer can be made arbitrarily small by the repeated employment of
randomness
Analysis of running time or probability of getting a correct answer is usually difficult
Getting truly random numbers is Impossible!!
 One needs to depend on pseudo random numbers - So, the result highly depends on the quality of
the random numbers
University of Dar es Salaam
Quick Sort
The Problem: Deterministic Quick Sort
• Given an array A[1 . . . n] containing n (comparable) elements, sort them in
increasing/decreasing order
University of Dar es Salaam
Randomized Quick Sort
• A Useful Concept - The Central Splitter
It is an index s such that the number of elements less (resp. greater) than A[s] is at
least n/4
• The algorithm randomly chooses a key, and checks whether it is a central
splitter or not
• If it is a central splitter, then the array is split with that key as was done in
the QSORT algorithm
• It can be shown that the expected number of trials needed to get a central
splitter is constant
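• A hedged C++ sketch of randomized quicksort with the central-splitter test described above; it assumes distinct keys for simplicity, and the array contents and random seed are arbitrary.

#include <algorithm>
#include <iostream>
#include <random>
#include <vector>
using namespace std;

mt19937 rng(12345);                       // pseudo-random source (seed is arbitrary)

// Randomized quicksort on a[lo..hi): pick random keys until one is a central
// splitter (both sides have at least a quarter of the elements), then recurse.
// Assumes (for simplicity) that the keys are distinct.
void randQSort(vector<int>& a, int lo, int hi) {
    int n = hi - lo;
    if (n <= 1) return;
    while (true) {
        int pivot = a[lo + (int)(rng() % n)];             // randomly chosen key
        auto mid1 = partition(a.begin() + lo, a.begin() + hi,
                              [&](int x) { return x < pivot; });
        auto mid2 = partition(mid1, a.begin() + hi,
                              [&](int x) { return x == pivot; });
        int less = (int)(mid1 - (a.begin() + lo));
        int greater = (int)((a.begin() + hi) - mid2);
        if (n < 4 || (less >= n / 4 && greater >= n / 4)) {   // central splitter found
            randQSort(a, lo, lo + less);
            randQSort(a, (int)(mid2 - a.begin()), hi);
            return;
        }                                                  // otherwise try another key
    }
}

int main() {
    vector<int> a = {7, 3, 9, 1, 4, 8, 2, 6, 5};
    randQSort(a, 0, (int)a.size());
    for (int x : a) cout << x << " ";                      // 1 2 3 4 5 6 7 8 9
    cout << "\n";
}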
University of Dar es Salaam
Randomized Quick Sort
University of Dar es Salaam
Analysis of RandQSORT
• Fact:
Step 2 needs O(q − p) time
• Question:
How many times Step 2 is executed for finding a central splitter ?
• Result:
The probability that a randomly chosen element is a central splitter is ½
Therefore, the expected number of times Step 2 needs to be repeated to get a
central splitter is 2 (a geometric random variable with success probability 1/2)
Thus, the expected time complexity of Step 2 is O(n)
University of Dar es Salaam
Analysis of RandQSORT ...
• Time Complexity
Worst-case size of each partition at the jth level of recursion is n × (3/4)^j
Number of levels of recursion = log4/3 n = O(log n)
Recurrence relation of the time complexity:
T(n) = T(n/4) + T(3n/4) + O(n) = O(n log n)
University of Dar es Salaam
Types of Randomized Algorithms
• Las Vegas:
 A randomized algorithm that always returns a correct result, but the running
time may vary between executions
Example: Randomized QUICKSORT Algorithm
• Monte Carlo:
A randomized algorithm that terminates in polynomial time, but might
produce erroneous result
Example: Randomized MINCUT Algorithm
University of Dar es Salaam
Types of Randomized Algorithms
Example:
Consider the problem of finding an ‘a’ in an array of n elements
Input:
An array of n≥2 elements, in which half are ‘a’s and the other half are ‘b’s
Output:
Find an ‘a’ in the array
University of Dar es Salaam
Las Vegas algorithm
findingA_LV(array A, n)
begin
repeat
Randomly select one element out of n elements
until 'a' is found
end
University of Dar es Salaam
Las Vegas algorithm : Analysis
• The algorithm succeeds with probability 1
• The number of iterations varies and can be arbitrarily large, but the
expected number of iterations is 2, since each trial finds an ‘a’ with probability 1/2
• Since this is constant, the expected run time over many calls is Θ(1)
University of Dar es Salaam
Monte Carlo algorithm
findingA_MC(array A, n, k)
begin
i=0
repeat
Randomly select one element out of n elements
i=i+1
until i=k or 'a' is found
end
University of Dar es Salaam
Monte Carlo algorithm: Analysis
• If an ‘a’ is found, the algorithm succeeds, else the algorithm fails
• After k iterations, the probability of finding an ‘a’ is 1 − (1/2)^k
• This algorithm does not guarantee success, but the run time is bounded
The number of iterations is always less than or equal to k
 Taking k to be constant, the run time (expected and absolute) is Θ(1)
University of Dar es Salaam
Randomized Algorithms: Applications
• Randomized algorithms are particularly useful when faced with a
malicious "adversary" or attacker who deliberately tries to feed a
bad input to the algorithm such as in the Prisoner's dilemma
• It is for this reason that randomness is ubiquitous in cryptography
 In cryptographic applications, pseudo-random numbers cannot be used,
since the adversary can predict them, making the algorithm effectively
deterministic
 Therefore, either a source of truly random numbers or a
cryptographically secure pseudo-random number generator is required
 Another area in which randomness is inherent is quantum computing
University of Dar es Salaam
Las Vegas Algorithm vs Monte Carlo algorithm
• A Las Vegas algorithm can be converted into a Monte Carlo algorithm (via
Markov's inequality)
By having it output an arbitrary, possibly incorrect answer if it fails to complete
within a specified time
• Conversely, if an efficient verification procedure exists to check whether an
answer is correct, then a Monte Carlo algorithm can be converted into a Las
Vegas algorithm
By running the Monte Carlo algorithm repeatedly till a correct answer is obtained
University of Dar es Salaam
Distributed / Parallel Algorithms
Distributed Computing vs Parallel Computing
University of Dar es Salaam
Distributed Computing vs Parallel Computing
• Parallel computing system consists of multiple processors that communicate
with each other using a shared memory (processors are tightly coupled!)
As the number of processors increases, with enough parallelism available in
applications, such systems easily beat sequential systems in performance through the
shared memory
In such systems, the processors can also contain their own locally allocated memory,
which is not available to any other processors
• Distributed computing system contains multiple processors connected by a
communication network (processors are loosely coupled!)
Multiple system processors can communicate with each other using messages that are
sent over the network
University of Dar es Salaam
Why a system should be built distributed, not just parallel
• Scalability – As distributed systems do not have the problems associated with shared memory, with an increased
number of processors they are obviously regarded as more scalable than parallel systems
• Data sharing – Data sharing provided by distributed systems is similar to the data sharing provided by distributed
databases. Thus, multiple organizations can have distributed systems with integrated applications for data exchange
• Resource sharing – If there exists an expensive, special-purpose resource or processor which cannot be dedicated to
each processor in the system, such a resource can be easily shared across distributed systems
• Heterogeneity and modularity – A system should be flexible enough to accept a new heterogeneous processor and to
allow one of the processors to be replaced or removed without affecting the overall system processing capability.
Distributed systems are observed to be more flexible in this respect
• Geographic construction – The geographic placement of different subsystems of an application may be inherently
distributed. Local processing may be forced by low communication bandwidth, more specifically within a wireless
network
• Economic – With the evolution of modern computers, high-bandwidth networks and workstations are available at low
cost, which also favors distributed computing for economic reasons
University of Dar es Salaam
Parallel Merge Sort Algorithm
• The algorithm assumes that the sequence to be sorted is distributed and so
generates a distributed sorted sequence
• For simplicity, we assume that N is an integer multiple of P and that the N data
items are distributed evenly among the P tasks
• Recall that the sequential merge sort requires O(N log N) time to sort N
elements
• To implement the merge sort within a parallel processing environment:
Repeatedly split the sub-lists down to the point where you have single element lists
Merge these in parallel back up the processing tree until you obtain the fully merged
list at the top of the tree
While of theoretical interest, you probably don't have the massively parallel processor
that this would require
University of Dar es Salaam
The Parallel Algorithm
• The algorithm uses master slave model in the form of tree for parallel
sorting
• Each process receives the list of elements from its predecessor process, then
divides it into two halves, keeps one half for itself and sends the second half
to its successor
• To address the corresponding predecessor & successor we use the concept
of ‘myrank_multiple’
For a process having odd rank it is calculated as Myrank_multiple=2*Myrank+1; and
for the process having even rank it is calculated as Myrank_multiple=2*Myrank+2
University of Dar es Salaam
The Parallel Algorithm
• It uses recursive calls both to emulate the transmission of the right
halves of the arrays and to process the left halves
• When the processors in the system are exhausted, each processor will sort its
remaining data
• After that it will receive the sorted data from its successor and merge
the two sub-lists, then it sends the result to its predecessor
• This process continues up to the root node
University of Dar es Salaam
The Parallel Algorithm
Procedure parallel_mergesort(DataArray,SizeofData)
Begin
MyData=LeftHalfof[DataArray]
TempData=RightHalfof[DataArray]
Send(TempData)
MyData = Mergesort(MyData,i,j)
Receive(TempData)
DataArray=MergeResult(MyData,TempData)
End
University of Dar es Salaam
The Parallel Algorithm ...
procedure Mergesort(MyData, i, j)
Begin
    if (j - i > 16)
    {
        Mergesort(MyData, i, (i+j)/2)
        Mergesort(MyData, (i+j)/2, j)
        Merge(MyData, i, (i+j)/2, j)   // merge the two sorted halves
    }
    else
        insertionSort(MyData, i, j)
End

procedure insertionSort(MyData, i, j)
Begin
    // Sequential insertion sort
End
University of Dar es Salaam
Parallel Merge Sort Algorithm : Divide
University of Dar es Salaam
Parallel Merge Sort Algorithm : Conquer
University of Dar es Salaam
Computational Complexity
• Sequential Merge Sort - O(n log n)
• In parallel, we have n processors
log n time is required to divide the sequence
log n time is required to merge the sorted subsequences
log n + log n = 2 log n, i.e., O(log n)
University of Dar es Salaam
Common Distributed Complexity Measures
• Space complexity
How much space is needed per process to run an algorithm? (measured in terms of N,
the size of the network)
• Time complexity
What is the maximum time (number of steps) needed to complete the execution of
the algorithm?
• Message complexity
How many messages are needed to complete the execution of the algorithm?
University of Dar es Salaam
Exercises
1. For each of the following determine a graph with the required property,
and give its adjacency matrix, adjacency list and a drawing.
a) A 3-regular graph of at least 5 vertices
b) A complete graph of 6 vertices
c) A bipartite graph of 6 vertices
d) A complete bipartite graph of 7 vertices
e) A star graph of 7 vertices
2. Design a simple example of a directed graph with negative-weight edges
but no negative cycles for which Dijkstra's algorithm produces incorrect
answers. Demonstrate why Dijkstra's algorithm fails on your example.
University of Dar es Salaam
Exercises
3. Show the ordering of vertices produced by the
topological sort algorithm on the DAG in Figure
1. Assume that DFS considers vertices in
alphabetical order, and that all adjacency lists
are also given in alphabetical order.
4. Design a linear-time algorithm that takes as
input a directed acyclic graph G and two
vertices x and z, and returns the number of paths
from x to z in G. For example, in the DAG in
Figure 1, there are exactly four paths from
vertex p to vertex v: p-o-v, p-o-r-y-v, p-o-s-r-y-
v and p-s-r-y-v. (Your algorithm only needs Figure 1
to count the paths, not list them.)
University of Dar es Salaam
Exercises
5. Let G be the weighted graph whose vertex
set is {x, a, b, c, d, e, f} and whose edges
and weights are given by Table 1. Find all Table 1
minimum spanning trees for G.
6. Find the shortest route from A to F in the
weighted graph specified in Table 2.
Table 2
University of Dar es Salaam
Exercises
7. The goal of this exercise is to show an application of flows to organize the
presentation defences of Final Year Projects (FYP) of some students. Assume
that the students {S1, · · · , Sn} have to present their work to some panellists at
the end of their projects. There are q panellists P = {P1, · · · , Pq}. Each student Si
has a project Qi , i ≤ n. For any project, each panellist is either a specialist of the
subject or not. That is, for any i ≤ n, P is partitioned into Spi and NSpi , respectively
the subset of the panellists that are specialist of the project Qi , and the
panellists that are not. Finally, each panellist Pj , j ≤ q, can attend at most aj
defences. Each student Si must present his work to x panellists, y of them are
specialists of Qi and z = x − y of them are not. Use a flow-model to organize the
juries (i.e., which panellist will attend which presentation).
University of Dar es Salaam
Exercises
8. Figure 2 shows a flow network on which an 𝑠 − 𝑡
flow has been computed. The capacity of each
edge appears as a label next to the edge, and
the numbers in boxes give the amount of flow
sent on each edge (Edges without boxed
numbers—specifically, the four edges of
capacity 3—have no flow being sent on them).
a) What is the value of this flow?
b) Is this a maximum (s,t) flow in this graph? Figure 2
c) Find a minimum s-t cut in this flow network, and also
state what is its capacity.