UNIT I
Basic Traversal & Search Techniques: Techniques for Graphs, connected components and
Spanning Trees, Bi-connected components and DFS.
ALGORITHM
An algorithm is any well-defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values, as output. An algorithm is thus a sequence of computational steps that transforms the input into the output.
Definition:
“An Algorithm is a finite set of instructions that, if followed, accomplishes a particular task.”
If we trace out the instructions of an algorithm, then for all cases the algorithm terminates after a finite number of steps.
Every instruction must be sufficiently basic that it can, in principle, be carried out by a person using only pencil and paper. It is not enough that each operation be definite; it must also be feasible.
The study of algorithms includes many important and active areas of research.
Profiling is the process of executing a correct program on data sets and measuring the time and space it takes to compute the results.
An algorithm can be studied in two ways:
Theoretical analysis: the running time is characterized mathematically, as a function of the input size, by examining the algorithm itself, independent of any particular hardware or software environment.
Experimental Study: the algorithm is implemented and run on a range of inputs while its actual running time and memory use are measured (i.e., profiling).
Algorithm Specification:
An algorithm can be described in three ways: in a natural language such as English, graphically as a flowchart, or in pseudo-code.
3. Pseudo-code Method:
This method describes an algorithm as a program that resembles a language like Pascal or Algol. Some of the conventions used are:
3. An identifier begins with a letter. The data types of variables are not explicitly declared.
While Loop:
While <condition> do
{
<statement-1>
.
.
<statement-n>
}
For Loop:
For variable := value-1 to value-2 step step do
{
<statement-1>
.
.
<statement-n>
}
repeat-until:
repeat
<statement-1>
.
<statement-n>
until <condition>
Case statement:
Case
{
:<condition-1> :<statement-1>
.
.
:<condition-n> :<statement-n>
:else :<statement-n+1>
}
9. Input and output are done using the instructions read & write.
Examples:
Algorithm to find the maximum of n numbers stored in an array:
Algorithm Max(A,n)
// A is an array of size n
{
Result := A[1];
for I:= 2 to n do
if A[I] > Result then
Result :=A[I];
return Result;
}
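The same algorithm can be sketched in Python (the function name `array_max` is ours, not from the notes):

```python
def array_max(a):
    """Return the largest element of a non-empty list, mirroring Algorithm Max."""
    result = a[0]            # Result := A[1]
    for x in a[1:]:          # for I := 2 to n do
        if x > result:       # if A[I] > Result then
            result = x       # Result := A[I]
    return result
```

The algorithm examines each of the n elements once, so it performs n-1 comparisons.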
Recursive Algorithms:
1. Towers of Hanoi:
Towers of Hanoi is a problem in which a number of disks of decreasing size are stacked on a tower in decreasing order of size from bottom to top. Besides this source tower there are two other towers (B and C), one acting as the destination tower and the other as an intermediate tower. The task is to move the disks from the source tower to the destination tower. The conditions of the problem are: only one disk may be moved at a time, and a larger disk may never be placed on top of a smaller one.
Consider three towers A, B, C with three disks present on tower A. Take C as the destination tower and B as the intermediate tower. The steps involved in moving the disks from A to C are:
Step 1: Move the smaller disk which is present at the top of the tower A to C.
Step 2: Then move the next smallest disk present at the top of the tower A to B.
Step 3: Now move the smallest disk present at tower C to tower B
Step 4: Now move the largest disk present at tower A to tower C
Step 5: Move the smallest disk present at the top of tower B to tower A.
Step 6: Move the disk present at tower B to tower C.
Step 7: Move the smallest disk present at tower A to tower C
In this way disks are moved from source tower to destination tower.
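The recursive solution behind these steps can be sketched in Python (the helper name `hanoi` and the move list are ours):

```python
def hanoi(n, source, dest, aux, moves=None):
    """Move n disks from source to dest using aux; record each move as a pair."""
    if moves is None:
        moves = []
    if n > 0:
        hanoi(n - 1, source, aux, dest, moves)   # clear the top n-1 disks out of the way
        moves.append((source, dest))             # move the largest remaining disk
        hanoi(n - 1, aux, dest, source, moves)   # stack the n-1 disks back on top
    return moves
```

For n disks this makes 2^n - 1 moves; for the three-disk example above, the 7 moves it produces match Steps 1-7.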
Performance Analysis:
The performance of a program is the amount of computer memory and time needed to run a
program. We use two approaches to determine the performance of a program. One is analytical,
and the other experimental. In performance analysis we use analytical methods, while in
performance measurement we conduct experiments.
1. Space Complexity:
The space complexity of an algorithm is the amount of memory it needs to run to completion.
2. Time Complexity:
The time complexity of an algorithm is the amount of computer time it needs to run to completion.
Space Complexity:
Space complexity is the amount of memory used by the algorithm (including the input values to
the algorithm) to execute and produce the result.
Auxiliary space is the extra or temporary space used by the algorithm during its execution.
The space needed by each of these algorithms is the sum of the following components:
1. A fixed part that is independent of the characteristics (e.g. number, size) of the inputs and outputs. This part typically includes the instruction space (i.e. space for the code), space for simple variables and fixed-size component variables (also called aggregates), space for constants, and so on.
2. A variable part that consists of the space needed by component variables whose size depends on the particular problem instance being solved, together with the recursion stack space.
The space requirement S(P) of any algorithm P may therefore be written as
S(P) = c + Sp(instance characteristics),
where c is a constant.
Example 1:
Consider an algorithm that uses only three simple variables a, b and c (for instance, one that computes and returns a+b*c). Here Sp = 0; assuming each variable occupies one word,
S(P) >= 3
Example 2:
Algorithm sum(a,n)
{
s := 0.0;
for I := 1 to n do
s := s + a[I];
return s;
}
In the above algorithm n, s and I occupy one word each, and the array 'a' occupies n words, so S(P) >= n + 3.
Example 3:
ALGORITHM FOR SUM OF NUMBERS USING RECURSION:
Algorithm RSum (a, n)
{
if(n<=0) then
return 0.0;
else
return RSum(a,n-1)+a[n];
}
The space complexity of the above algorithm: each recursive call needs space for the value of n, the return address and a pointer to the array, and the depth of the recursion is (n+1). Since each of the (n+1) stack frames requires these three words, the total space occupied by the algorithm is S(P) >= 3(n+1).
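The recursive sum can be sketched in Python (zero-based indexing, so the call sums the first n elements; the name `rsum` is ours):

```python
def rsum(a, n):
    """Recursively sum the first n elements of list a.

    Each call adds one stack frame, so the recursion depth is n+1,
    which is where the S(P) >= 3(n+1) space bound comes from.
    """
    if n <= 0:
        return 0.0
    return rsum(a, n - 1) + a[n - 1]
```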
Time Complexity:
The time T(P) taken by a program P is the sum of the compile time and the run time (execution time).
The compile time does not depend on the instance characteristics. We may also assume that a compiled program will be run several times without recompilation, so the run time is what matters; it is denoted tp(instance characteristics).
The number of steps assigned to any program statement depends on the kind of statement. For example:
Comments count as 0 steps.
An assignment statement (which does not involve any calls to other algorithms) counts as 1 step.
For iterative statements such as for, while and repeat-until, steps are counted only for the control part of the statement.
We can determine the number of steps needed by a program to solve a particular problem instance in two ways.
1. We introduce a variable, count, into the program with initial value 0. Statements that increment count by the appropriate amount are inserted into the program, so that each time a statement in the original program is executed, count is incremented by the step count of that statement.
Example1:
Algorithm:
Original algorithm:
Algorithm sum(a,n)
{
 s := 0.0;
 for I := 1 to n do
 {
  s := s + a[I];
 }
 return s;
}
With count statements added:
Algorithm sum(a,n)
{
 s := 0.0;
 count := count + 1;    // for the assignment
 for I := 1 to n do
 {
  count := count + 1;   // for the for statement
  s := s + a[I];
  count := count + 1;   // for the assignment
 }
 count := count + 1;    // for the last time of the for statement
 count := count + 1;    // for the return
 return s;
}
If count is zero to start with, then it will be 2n+3 on termination. So each invocation of sum executes a total of 2n+3 steps.
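The instrumented algorithm can be sketched in Python to confirm the 2n+3 count (the function name `sum_with_count` is ours):

```python
def sum_with_count(a):
    """Iterative sum instrumented with a step counter; returns (sum, count)."""
    count = 0
    s = 0.0
    count += 1            # for the assignment s := 0.0
    for x in a:
        count += 1        # for one test of the for-loop condition
        s += x
        count += 1        # for the assignment inside the loop
    count += 1            # for the final (failing) loop test
    count += 1            # for the return statement
    return s, count
```

For a list of n elements the counter ends at 1 + 2n + 1 + 1 = 2n+3.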
Example 2:
Algorithm RSum(a,n)
{
count:=count+1;// For the if conditional
if(n<=0)then
{
count:=count+1; //For the return
return 0.0;
}
else
{
count:=count+1; //For the addition,function invocation and return
return RSum(a,n-1)+a[n];
}
}
Example 3:
Algorithm Add(a,b,c,m,n)
{
for i:=1 to m do
{
count:=count+1; //For 'for i'
for j:=1 to n do
{
count:=count+1; //For 'for j'
c[i,j]=a[i,j]+b[i,j];
count:=count+1; //For the assignment
}
count:=count+1;//For loop initialization and last time of 'for j'
}
count:=count+1; //For loop initialization and last time of 'for i'
}
If count is zero to start with, then it will be 2mn+2m+1 on termination. So each invocation of Add executes a total of 2mn+2m+1 steps.
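The counting scheme for matrix addition can be sketched in Python to confirm the 2mn+2m+1 total (the function name `add_with_count` is ours):

```python
def add_with_count(a, b):
    """Matrix addition instrumented with a step counter; returns (c, count)."""
    m, n = len(a), len(a[0])
    count = 0
    c = [[0] * n for _ in range(m)]
    for i in range(m):
        count += 1                       # for one test of 'for i'
        for j in range(n):
            count += 1                   # for one test of 'for j'
            c[i][j] = a[i][j] + b[i][j]
            count += 1                   # for the assignment
        count += 1                       # for the last time of 'for j'
    count += 1                           # for the last time of 'for i'
    return c, count
```

Each outer iteration contributes 1 + 2n + 1 steps, so the total is m(2n+2) + 1 = 2mn + 2m + 1.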
Asymptotic Notations:
The best algorithm can be chosen by measuring efficiency, and the efficiency of an algorithm is measured by computing its time complexity. Asymptotic notations are used to express the time complexity of an algorithm: they describe the fastest possible, the slowest possible and the average running time of the algorithm.
Example 1: Show that f(n) = 2n + 5 is Ω(n).
Sol:
We must find constants c and n0 such that 2n + 5 >= c*n for all n >= n0. Let us take c = 2 and n0 = 1.
If n=1:
2n+5 >= 2n => 2(1) + 5 >= 2(1) => 7 >= 2 (true)
If n=2:
2n+5 >= 2n => 2(2) + 5 >= 2(2) => 9 >= 4 (true)
In general 2n + 5 >= 2n for every n >= 1, so f(n) = Ω(n).
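The inequality 2n+5 >= 2n used above can be checked numerically over a range of n, which is a handy sanity test when picking the constants c and n0 (the helper name is ours):

```python
def is_big_omega_witness(f, g, c, n0, upto=1000):
    """Check empirically that f(n) >= c * g(n) for all n in [n0, upto].

    This does not prove the bound for all n, but a failure disproves
    the chosen constants immediately.
    """
    return all(f(n) >= c * g(n) for n in range(n0, upto + 1))

# f(n) = 2n + 5 is Omega(n): the constants c = 2, n0 = 1 work
ok = is_big_omega_witness(lambda n: 2 * n + 5, lambda n: n, c=2, n0=1)
```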
4. Little-oh (o) Notation:
5. Little-omega (ω) Notation:
RANDOMIZED ALGORITHMS:
A randomized algorithm is an algorithm that employs a degree of randomness as part of its logic.
The algorithm typically uses uniformly random bits as an auxiliary input to guide its behavior, in
the hope of achieving good performance in the "average case" over all possible choices of
random bits.
The output or the running time is a function of the input and the random bits chosen.
Probability theory has the goal of characterizing the outcomes of natural or conceptual "experiments." Examples of such experiments include tossing a coin ten times, rolling a die three times, playing a lottery, gambling, and picking a ball from an urn containing white and red balls.
Each possible outcome of an experiment is called a sample point, and the set of all possible outcomes is known as the sample space S. In this text we assume that S is finite (such a sample space is called a discrete sample space). An event E is a subset of the sample space S. If the sample space consists of n sample points, then there are 2^n possible events.
In a Las Vegas randomized algorithm, the answer is guaranteed to be correct no matter which
random choices the algorithm makes. But the running time is a random variable, making the
expected running time the most important parameter. Our first algorithm today, Quicksort, is a
Las Vegas algorithm.
Quicksort is one of the basic sorting algorithms normally encountered in a data structures course.
The idea is simple:
• Choose a pivot element p
• Divide the elements into those less than p, those equal to p, and those greater than p
• Recursively sort the “less than” and “greater than” groups
• Assemble the results into a single list
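The four steps above can be sketched as a randomized quicksort in Python; this is the list-based formulation, not an in-place implementation:

```python
import random

def quicksort(items):
    """Las Vegas randomized quicksort: the output is always correctly sorted;
    only the running time depends on the random pivot choices."""
    if len(items) <= 1:
        return items
    pivot = random.choice(items)                   # choose a pivot element p
    less = [x for x in items if x < pivot]         # those less than p
    equal = [x for x in items if x == pivot]       # those equal to p
    greater = [x for x in items if x > pivot]      # those greater than p
    # recursively sort the two groups and assemble the results
    return quicksort(less) + equal + quicksort(greater)
```

Because the pivot is random, the expected running time is O(n log n) on every input, even though a worst-case run can take O(n^2) time.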
A Monte Carlo randomized algorithm, in contrast, always has the same time complexity, but may sometimes produce incorrect outputs depending on the random choices made.
Set:
A set is a collection of distinct elements. The Set can be represented, for examples, as
S1={1,2,5,10}.
Disjoint Sets:
The disjoints sets are those do not have any common element.
For example S1={1,7,8,9} and S2={2,5,10}, then we can say that S1 and S2 are two disjoint
sets.
Union: S1 U S2 is the set of all elements that are in S1 or S2.
Example:
S1={1,7,8,9} S2={2,5,10}
S1 U S2={1,2,5,7,8,9,10}
Find:
Given an element i, find the set containing i.
Example:
S1={1,7,8,9} Then, S2={2,5,10} s3={3,4,6}
Find(4)= S3 , Find(5)=S2, Find(7)=S1
UNION operation:
Union(i,j) requires that the two trees with roots i and j be joined. S1 U S2 is obtained by making one of the sets a subtree of the other.
Set Representation:
The set will be represented as the tree structure where all children will store the address of parent
/ root node. The root node will store null at the place of parent address. In the given set of
elements any element can be selected as the root node, generally we select the first node as the
root node.
Example:
S1={1,7,8,9} S2={2,5,10} s3={3,4,6}
Then these sets can be represented as
Disjoint Union:
To perform disjoint set union between two sets Si and Sj can take any one root and make it sub-
tree of the other. Consider the above example sets S1 and S2 then the union of S1 and S2 can be
represented as any one of the following.
Find:
To perform find operation, along with the tree structure we need to maintain the name of each
set. So, we require one more data structure to store the set names. The data structure contains two
fields. One is the set name and the other one is the pointer to root.
Example:
For the following sets the array representation is as shown below.
To perform a union, the SimpleUnion(i,j) function takes as input the set roots i and j and makes j the parent of i, i.e., the second root becomes the parent of the first.
Algorithm SimpleUnion(i,j)
{
P[i]:=j;
}
The SimpleFind(i) algorithm takes an element i and finds the root of the tree containing i. It follows parent links starting at i until it reaches a node whose parent value is -1.
Algorithms SimpleFind(i)
{
while (P[i] >= 0) do
 i := P[i];
return i;
}
Although the SimpleUnion(i,j) and SimpleFind(i) algorithms are easy to state, their performance characteristics are not very good. For example, starting from the singleton sets {1}, {2}, ..., {n}, the sequence of unions Union(1,2), Union(2,3), ..., Union(n-1,n) produces a degenerate tree of height n, so a subsequent Find can take O(n) time.
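The two algorithms can be sketched in Python over a parent array (zero-based elements; the names `simple_union` and `simple_find` follow the pseudocode):

```python
def simple_union(p, i, j):
    """Make root j the parent of root i."""
    p[i] = j

def simple_find(p, i):
    """Follow parent links from i up to the root (marked by parent value -1)."""
    while p[i] >= 0:
        i = p[i]
    return i

# parent array for elements 0..4, each initially a singleton set (root = -1)
p = [-1] * 5
simple_union(p, 0, 1)    # join the trees rooted at 0 and 1: 1 becomes 0's parent
simple_union(p, 1, 2)    # join the trees rooted at 1 and 2: 2 becomes 1's parent
```

After these unions, 0, 1 and 2 form one chain-shaped tree rooted at 2, illustrating how repeated unions can degrade Find to linear time.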
Search means finding a path or traversal between a start node and one of a set of goal
nodes. Search is a study of states and their transitions.
Search involves visiting nodes in a graph in a systematic manner, and may or may not result in a visit to all nodes. When the search necessarily involves the examination of every vertex in the tree, it is called a traversal.
Inorder Traversal:
In this traversal method, the left subtree is visited first, then the root and later the right sub-tree.
We should always remember that every node may represent a subtree itself.
The steps for traversing a binary tree in inorder traversal are:
1. Visit the left subtree, using inorder.
2. Visit the root.
3. Visit the right subtree, using inorder.
We start from A, and following in-order traversal, we move to its left subtree B. B is also
traversed in-order. The process goes on until all the nodes are visited. The output of inorder
traversal of this tree will be −
D→B→E→A→F→C→G
Preorder Traversal:
In this traversal method, the root node is visited first, then the left subtree and finally the right subtree. The steps for traversing a binary tree in preorder traversal are:
1. Visit the root.
2. Visit the left subtree, using preorder.
3. Visit the right subtree, using preorder.
A→B→D→E→C→F→G
Postorder Traversal:
In this traversal method, the root node is visited last, hence the name. First we traverse the left
subtree, then the right subtree and finally the root node.
The steps for traversing a binary tree in postorder traversal are:
1. Visit the left subtree, using postorder.
2. Visit the right subtree, using postorder
3. Visit the root.
D→E→B→F→G→C→A
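The three traversals can be sketched in Python on the example tree (A at the root, B and C its children, D, E under B and F, G under C; the `Node` class is ours):

```python
class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def inorder(n):
    """Left subtree, root, right subtree."""
    return inorder(n.left) + [n.val] + inorder(n.right) if n else []

def preorder(n):
    """Root, left subtree, right subtree."""
    return [n.val] + preorder(n.left) + preorder(n.right) if n else []

def postorder(n):
    """Left subtree, right subtree, root."""
    return postorder(n.left) + postorder(n.right) + [n.val] if n else []

# the example tree from the notes
root = Node('A', Node('B', Node('D'), Node('E')),
                 Node('C', Node('F'), Node('G')))
```

Running the three functions on `root` reproduces the orders listed above: D B E A F C G, A B D E C F G and D E B F G C A.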
Graph traversal is a technique to visit each node of a graph G. It also determines the order in which the vertices are visited. We visit all the nodes reachable from a starting node, without going into a loop. The edges may be directed or undirected. The graph is represented as G(V, E).
A graph search (or traversal) technique visits every node exactly once in a systematic fashion. Two standard graph search techniques are widely used:
1. Breadth-First Search (BFS)
2. Depth-First Search (DFS)
BFS is a traversing algorithm where we start traversing from a selected source node layer
wise by exploring the neighboring nodes.
The data structure used in BFS is a queue. The algorithm makes sure that every node is visited at most once.
BFS follows the following 4 steps:
1. Begin with the key (element) to be searched; the search starts at the root (source) node.
2. Visit an adjacent unvisited vertex, mark it as visited, and display it (if needed). If this is the required key, stop; otherwise add it to the queue.
3. If no adjacent unvisited vertex is found, remove the first vertex from the queue.
4. Repeat steps 2 and 3 until the queue is empty.
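The BFS steps can be sketched in Python using a FIFO queue (the small example graph is ours):

```python
from collections import deque

def bfs(graph, start):
    """Breadth-first traversal from start; returns the order nodes are visited."""
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()       # remove the first vertex from the queue
        order.append(node)
        for nbr in graph[node]:      # enqueue each adjacent unvisited vertex
            if nbr not in visited:
                visited.add(nbr)
                queue.append(nbr)
    return order

# a small undirected graph as an adjacency list
graph = {'A': ['B', 'C'], 'B': ['A', 'D'], 'C': ['A', 'D'], 'D': ['B', 'C']}
```

Starting from A, the traversal explores layer by layer: first A, then its neighbours B and C, and finally D.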
Time Complexity
The time complexity of BFS if the entire tree is traversed is O(V) where V is the number
of nodes.
If the graph is represented as adjacency list:
o Here, each node maintains a list of all its adjacent edges. Let's assume that there are V nodes and E edges in the graph.
o For each node, we discover all its neighbors by traversing its adjacency list just
once in linear time.
o For a directed graph, the sum of the sizes of the adjacency lists of all the nodes is
E. So, the time complexity in this case is O(V) + O(E) = O(V + E).
o For an undirected graph, each edge appears twice, once in the adjacency list of either end of the edge. The time complexity for this case will be O(V) + O(2E) ≈ O(V + E).
If the graph is represented as an adjacency matrix (a V x V array):
o For each node, we will have to traverse an entire row of length V in the matrix to
discover all its outgoing edges.
o Note that each row in an adjacency matrix corresponds to a node in the graph, and that row stores information about the edges emerging from the node. Hence, the time complexity of BFS in this case is O(V * V) = O(V^2).
The time complexity of BFS actually depends on the data structure being used to
represent the graph.
Space Complexity
Since we are maintaining a FIFO queue to keep track of the nodes to be visited, in the worst case the queue could grow to the number of nodes (vertices) in the graph. Hence, the space complexity is O(V).
The DFS algorithm is the search algorithm which begins the searching from the root node
and goes down till the leaf of a branch at a time looking for a particular key. If the key is
not found, the searching retraces its steps back to the point from where the other branch
was left unexplored and the same procedure is repeated for that branch.
DFS follows the following 3 steps:
1. Visit a node S, mark it as visited, and push it onto a stack.
2. Find an unvisited vertex adjacent to the node on top of the stack, visit it, and push it onto the stack.
3. If no unvisited adjacent vertex is found, pop the stack; repeat until a node with an unvisited adjacent vertex is found or the stack is empty.
For example, when a node such as C has no unvisited adjacent node, we keep popping the stack until we find a node that does have one; if there is none, we keep popping until the stack is empty and the traversal ends.
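The DFS steps can be sketched in Python with an explicit stack (the small example graph is ours):

```python
def dfs(graph, start):
    """Depth-first traversal from start; returns the order nodes are visited."""
    visited = set()
    order = []
    stack = [start]
    while stack:
        node = stack.pop()                  # take the most recently pushed node
        if node not in visited:
            visited.add(node)
            order.append(node)
            # push neighbours in reverse so the first-listed one is visited first
            for nbr in reversed(graph[node]):
                if nbr not in visited:
                    stack.append(nbr)
    return order

# a small undirected graph as an adjacency list
graph = {'A': ['B', 'C'], 'B': ['A', 'D'], 'C': ['A', 'D'], 'D': ['B', 'C']}
```

Starting from A, the traversal goes down one branch (A, B, D) before backtracking to cover C.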
Time Complexity
The time complexity of DFS if the entire tree is traversed is O(V) where V is the number
of nodes.
The analysis is the same as for BFS. With an adjacency list, each node's neighbours are discovered by scanning its list once, giving O(V) + O(E) = O(V + E) for a directed graph, and O(V) + O(2E) ≈ O(V + E) for an undirected graph (each edge appears in two lists). With an adjacency matrix (a V x V array), discovering the outgoing edges of each node requires scanning an entire row of length V, giving O(V * V) = O(V^2).
The time complexity of DFS thus depends on the data structure used to represent the graph.
Space Complexity
Since we are maintaining a stack to keep track of the last visited node, in worst case, the
stack could take upto the size of the nodes(or vertices) in the graph. Hence, the space
complexity is O(V).
A spanning tree is a subgraph of graph G that covers all the vertices with the minimum possible number of edges. Hence, a spanning tree has no cycles and cannot be disconnected. From this definition we can conclude that every connected undirected graph G has at least one spanning tree. A disconnected graph does not have any spanning tree, since no single tree can span all of its vertices.
We found three spanning trees of one complete graph. A complete undirected graph can have a maximum of n^(n-2) spanning trees, where n is the number of nodes. In the above example, n is 3; hence 3^(3-2) = 3 spanning trees are possible.
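The n^(n-2) count can be verified by brute force for tiny graphs: enumerate all subsets of n-1 edges and keep those that form a tree (the helper names are ours; this is only feasible for very small n):

```python
from itertools import combinations

def count_spanning_trees(n, edges):
    """Count spanning trees by trying every subset of n-1 edges.

    A subset of n-1 edges on n vertices is a spanning tree exactly
    when it contains no cycle (checked with a tiny union-find).
    """
    def is_tree(chosen):
        parent = list(range(n))
        def find(x):
            while parent[x] != x:
                x = parent[x]
            return x
        for u, v in chosen:
            ru, rv = find(u), find(v)
            if ru == rv:
                return False      # this edge would close a cycle
            parent[ru] = rv
        return True               # n-1 edges, acyclic => spanning tree
    return sum(is_tree(c) for c in combinations(edges, n - 1))

# the complete graph K3 from the example
k3 = [(0, 1), (0, 2), (1, 2)]
```

For K3 this gives 3^(3-2) = 3 trees, and for K4 it gives 4^(4-2) = 16, matching Cayley's formula.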
Example:
Consider the following undirected, simple and connected graph G having 4 vertices: A, B, C
and D.
There are many spanning trees possible for this graph. Some of them are shown below:
An undirected graph is said to be biconnected if there are two vertex-disjoint paths between every pair of vertices. In other words, for any two vertices there is a cycle passing through both of them.
An articulation point in a connected graph is a vertex that, if deleted (together with the removal of any incident edges), would break the graph into two or more pieces.
Bridge: an edge whose removal disconnects the graph.
Biconnected: a connected graph that has no articulation points.
A Depth First Search (DFS) based algorithm can find all the articulation points in a graph. Given a graph, the algorithm first constructs a DFS tree.
Using DFS, we try to determine whether any articulation point is present. We also check whether all vertices are visited by the DFS; if not, we can say that the graph is not connected.
Initially, the algorithm chooses any random vertex to start the algorithm and marks its status as
visited. The next step is to calculate the depth of the selected vertex. The depth of each vertex
is the order in which they are visited.
Next, we need to calculate the lowest discovery number (low value) of each vertex. This is the smallest depth of any vertex reachable from the current vertex's subtree using tree edges and at most one back edge. An edge (u, v) is a back edge if v is an ancestor of u in the DFS tree but (u, v) is not a tree edge; the edge is still part of the original graph.
After calculating the depth and lowest discovery number for the first picked vertex, the
algorithm then searches for its adjacent vertices. It checks whether the adjacent vertices are
already visited or not. If not, then the algorithm marks it as the current vertex and calculates its
depth and lowest discovery number.
Depth first search provides a linear time algorithm to find all articulation points in a connected
graph.
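The depth/low-value procedure can be sketched in Python (a Tarjan-style formulation; the function name and the small path graph are ours):

```python
def articulation_points(graph):
    """Find articulation points via DFS depth and low values.

    A non-root vertex u is an articulation point if some DFS child v
    has low[v] >= depth[u] (v's subtree cannot reach above u).
    A root is an articulation point if it has two or more DFS children.
    """
    depth, low = {}, {}
    points = set()

    def visit(u, parent, d):
        depth[u] = low[u] = d
        children = 0
        for v in graph[u]:
            if v not in depth:                       # tree edge
                children += 1
                visit(v, u, d + 1)
                low[u] = min(low[u], low[v])
                if parent is not None and low[v] >= depth[u]:
                    points.add(u)
            elif v != parent:                        # back edge
                low[u] = min(low[u], depth[v])
        if parent is None and children > 1:
            points.add(u)

    for u in graph:
        if u not in depth:
            visit(u, None, 0)
    return points

# a path 0 - 1 - 2: the middle vertex is the only articulation point
path = {0: [1], 1: [0, 2], 2: [1]}
```

A single DFS computes all depth and low values, so the whole algorithm runs in linear time, as stated above.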