INTRODUCTION:
WHAT IS AN ALGORITHM?
The word algorithm comes from the name of the Persian author Abu Ja'far Mohammed ibn Musa al Khowarizmi (c. 825 A.D.), who wrote a textbook on mathematics. The word "algorithm" has come to refer to a method that can be used by a computer for the solution of a problem. This is what makes an algorithm different from words such as process, technique, or method.
ALGORITHM DEFINITION
An algorithm is a finite set of instructions that, if followed, accomplishes a particular task. In addition, all algorithms must satisfy the following criteria:
1. Input. Zero or more quantities are externally supplied.
2. Output. At least one quantity is produced.
3. Definiteness. Each instruction is clear and unambiguous.
4. Finiteness. If we trace out the instructions of an algorithm, then for all cases the algorithm terminates after a finite number of steps.
5. Effectiveness. Every instruction must be very basic so that it can be carried out, in principle, by a person using only pencil and paper. It is not enough that each operation be definite as in criterion 3; it must also be feasible.
The study of algorithms includes many important and active areas of research. Four distinct areas of study can be identified: how to devise algorithms, how to validate algorithms, how to analyze algorithms, and how to test a program.
Advantages of Algorithms:
1. It is a step-wise representation of a solution to a given problem, which makes it easy to understand.
2. An algorithm uses a definite procedure.
3. It is not dependent on any programming language, so it is easy to understand even for anyone without programming knowledge.
4. Every step in an algorithm has its own logical sequence, so it is easy to debug.
5. By using an algorithm, the problem is broken down into smaller pieces or steps; hence, it is easier for the programmer to convert it into an actual program.
ALGORITHM SPECIFICATION
An algorithm can be described in three ways.
1. Natural language like English:
When this way is used, care should be taken to ensure that each and every statement is definite.
2. Graphic representation called flowchart:
This method works well when the algorithm is small and simple.
3. Pseudo-code Method:
This method describes algorithms as programs, which resemble languages like Pascal and ALGOL.
Here link is a pointer to the record type node. Individual data items of a record can be accessed with -> and a period.
4. Assignment of values to variables is done using the assignment statement.
<variable> := <expression>;
5. There are two Boolean values, true and false.
Logical operators: and, or, not
Relational operators: <, <=, >, >=, =, !=
6. Elements of multidimensional arrays are accessed using [ and ]. For example, if A is a two-dimensional array, the (i, j)th element of the array is denoted as A[i, j]. Array indices start at zero.
7. The following looping statements are employed:
for, while, and repeat-until
While Loop:
while <condition> do
{
    <statement-1>
    .
    .
    <statement-n>
}
As long as <condition> is true, the statements get executed. When <condition> becomes false, the loop is exited. The value of <condition> is evaluated at the top of the loop.

For Loop:
for variable := value1 to value2 step step do
{
    <statement-1>
    .
    .
    <statement-n>
}
Here value1, value2, and step are arithmetic expressions. A variable of type integer or real or a numerical constant is a simple form of an arithmetic expression. The clause "step step" is optional and is taken as +1 if it does not occur; step could be either positive or negative.
Y.Jyothi, Assistant Professor, IT Department, Dhanekula Institute of Engineering & Technology
R19 IT III YEAR 1 SEMESTER DESIGN AND ANALYSIS OF ALGORITHMS UNIT 1
repeat-until:
repeat
    <statement-1>
    .
    .
    <statement-n>
until <condition>
The statements are executed as long as <condition> is false. The value of <condition> is computed after executing the statements. The instruction break; can be used within any of the above looping instructions to force exit. In case of nested loops, break; results in the exit of the innermost loop that it is a part of. A return statement within any of the above also results in exiting the loops; a return statement results in the exit of the function itself.
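As an illustration, the three looping constructs above can be mirrored in Python. Python has no built-in repeat-until, so it is emulated here with "while True ... break"; the variable names are this sketch's own choices.

```python
# while loop: the condition is evaluated at the top of the loop
i, total = 1, 0
while i <= 5:
    total += i
    i += 1

# for loop with a step: Python's range(start, stop, step) plays the
# role of "for variable := value1 to value2 step step do"
squares = [x * x for x in range(1, 6)]

# repeat-until: the body runs at least once; the condition is
# checked after the statements (emulated with "while True ... break")
n = 0
while True:
    n += 1
    if n >= 3:   # "until <condition>"
        break
```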
9. Input and output are done using the instructions read and write. No format is used to specify the size of input or output quantities.
10. There is only one type of procedure: Algorithm. The heading takes the form
Algorithm Name(<parameter list>)
where Name is the name of the procedure and <parameter list> is a listing of the procedure parameters. The body has one or more (simple or compound) statements enclosed within braces { and }. An algorithm may or may not return any values.
As an example, the following algorithm finds and returns the maximum of n given numbers.
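The algorithm referred to here did not survive extraction; a minimal Python sketch of finding the maximum of n numbers (the function name find_max is this sketch's choice) is:

```python
def find_max(a):
    """Return the maximum of the n numbers in list a (assumes a is non-empty)."""
    result = a[0]
    for x in a[1:]:
        if x > result:   # keep the larger of the two
            result = x
    return result

print(find_max([3, 9, 2, 7]))  # prints 9
```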
Recursive Algorithms:
-->A recursive function is a function that is defined in terms of itself.
-->Similarly, an algorithm is said to be recursive if the same algorithm is invoked in its body.
-->An algorithm that calls itself is direct recursive.
-->Algorithm "A" is said to be indirect recursive if it calls another algorithm which in turn calls "A".
The recurrence relation capturing the optimal time of the Tower of Hanoi problem with n disks is T(n) = 2T(n - 1) + 1, where n is the number of disks.
Step 1: Move the smallest disk, which is at the top of tower A, to C.
Step 2: Then move the next smallest disk, now at the top of tower A, to B.
Step 3: Now move the smallest disk from tower C to tower B.
Step 4: Now move the largest disk from tower A to tower C.
Step 5: Move the smallest disk from the top of tower B to tower A.
Step 6: Move the disk at tower B to tower C.
Step 7: Move the smallest disk from tower A to tower C. In this way the disks are moved from the source tower to the destination tower.
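The seven steps above are exactly what the recursion behind T(n) = 2T(n - 1) + 1 produces for n = 3 disks moving from A to C with B as the auxiliary; a Python sketch:

```python
def tower_of_hanoi(n, source, auxiliary, destination, moves=None):
    """Recursively move n disks from source to destination,
    recording each move as a (from, to) pair."""
    if moves is None:
        moves = []
    if n >= 1:
        # move the top n-1 disks out of the way, onto the auxiliary peg
        tower_of_hanoi(n - 1, source, destination, auxiliary, moves)
        # move the largest remaining disk directly to the destination
        moves.append((source, destination))
        # move the n-1 disks from the auxiliary peg onto the destination
        tower_of_hanoi(n - 1, auxiliary, source, destination, moves)
    return moves

moves = tower_of_hanoi(3, 'A', 'B', 'C')
# T(n) = 2T(n-1) + 1 solves to 2^n - 1 moves; for n = 3 that is 7 moves,
# and the sequence matches Steps 1-7 above.
```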
PERFORMANCE ANALYSIS:
1. Space Complexity: The space complexity of an algorithm is the amount of memory it needs to run to completion.
2. Time Complexity: The time complexity of an algorithm is the amount of computer time it needs to run to completion.
Space Complexity:
The space needed by each of these algorithms is seen to be the sum of the following components.
1. A fixed part that is independent of the characteristics (e.g., number, size) of the inputs and outputs. This part typically includes the instruction space (i.e., space for the code), space for simple variables and fixed-size component variables, space for constants, and so on.
2. A variable part that consists of the space needed by component variables whose size depends on the particular problem instance being solved, the space needed by referenced variables, and the recursion stack space.
The space requirement S(P) of any algorithm P may therefore be written as S(P) = c + Sp(instance characteristics), where c is a constant.
Time Complexity:
The time T(P) taken by a program P is the sum of the compile time and the run time (execution time). The compile time does not depend on the instance characteristics.
T(P) = compile time + execution time
Since the compile time is the same for all instances, we concern ourselves only with the execution time: T(P) = tP(instance characteristics)
Step Count
For the algorithm heading -> 0 steps
For comments -> 0 steps
For braces -> 0 steps
For expressions -> 1 step each
For any loop, if the condition check is of the form i <= n, the time taken is [(top index + 1) - bottom index] + 1; if the condition check is of the form i < n, the time taken is [top index - bottom index] + 1.
Example 1:
Algorithm Sum(a, n)
{
    s := 0.0;
    count := count + 1;      // count is a global variable, initially zero
    for i := 1 to n do
    {
        count := count + 1;  // for the for statement
        s := s + a[i];
        count := count + 1;  // for the assignment
    }
    count := count + 1;      // for the last time of the for statement
    count := count + 1;      // for the return
    return s;
}
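Running the instrumented algorithm above yields a step count of 2n + 3; a Python transcription (the function name and the use of a local counter instead of the global count are this sketch's choices):

```python
def instrumented_sum(a):
    """Transcription of Algorithm Sum with an explicit step counter."""
    count = 0
    s = 0.0
    count += 1              # for the assignment s := 0.0
    for x in a:
        count += 1          # for each evaluation of the for condition
        s += x
        count += 1          # for the assignment inside the loop
    count += 1              # for the final (failing) for-condition check
    count += 1              # for the return statement
    return s, count

s, steps = instrumented_sum([1, 2, 3, 4])   # n = 4
# steps = 2*4 + 3 = 11
```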
Amortized Analysis:
->Amortized analysis means finding the average running time per operation over a worst-case sequence of operations.
->Suppose a sequence I1, I2, D1, I3, I4, I5, I6, D2, I7 of insert and delete operations is performed on a set.
->Assume that the actual cost of each of the seven inserts is one, and that the delete operations D1 and D2 have actual costs of 8 and 10; the total cost of the sequence of operations is then 25.
->In the amortized scheme we charge some of the actual cost of an operation to other operations.
->This reduces the charged cost of some operations and increases the cost of others.
->The amortized cost of an operation is the total cost charged to it.
->The only requirement is that the sum of the amortized complexities of all operations in any sequence of operations be greater than or equal to the sum of their actual complexities.
->That is, sum of amortized(i) >= sum of actual(i), or equivalently sum of (amortized(i) - actual(i)) >= 0.
->Here amortized(i) and actual(i) denote the amortized and actual complexities of the ith operation.
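The insert/delete sequence above can be checked numerically. The charging scheme below (each insert charged 4, each delete charged 2) is one possible choice assumed for this sketch, not a value given in the text:

```python
# Sequence I1, I2, D1, I3, I4, I5, I6, D2, I7 from the text
ops    = ['I', 'I', 'D', 'I', 'I', 'I', 'I', 'D', 'I']
actual = [ 1,   1,   8,   1,   1,   1,   1,  10,   1]   # total = 25

# One possible charging scheme (an assumption of this sketch):
# each insert is charged 4 (its own cost plus units banked for
# future deletes), and each delete is charged only 2.
amortized = [4 if op == 'I' else 2 for op in ops]

# The requirement: over any sequence (here, every prefix of ops),
# the amortized total must be >= the actual total.
prefix_ok = all(
    sum(amortized[:k]) >= sum(actual[:k])
    for k in range(1, len(ops) + 1)
)
```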
Two standard techniques for carrying out amortized analysis are:
Accounting Method
Potential Method
Asymptotic Notations
->The best algorithm can be judged by its efficiency.
->The efficiency of an algorithm is measured by computing its time complexity.
->Asymptotic notations are used to express the time complexity of an algorithm.
->Asymptotic notations give the fastest possible, slowest possible, and average time of the algorithm.
The basic asymptotic notations are
Big-oh (O)
Omega (Ω)
Theta (Θ)
Little-oh (o)
Little omega (ω)
1. BIG-OH (O) NOTATION:
->It is denoted by 'O'.
->It is used to find the upper-bound time of an algorithm, that is, the maximum time taken by the algorithm.
Definition: Let f(n) and g(n) be two non-negative functions. If there exist two positive constants c and n0 such that f(n) <= c*g(n) for all n >= n0, then we say that f(n) = O(g(n)).
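The definition can be checked numerically for a concrete pair of functions; here f(n) = 3n + 2, g(n) = n, and the witnesses c = 4, n0 = 2 are this sketch's choices, not values from the text:

```python
def f(n):
    return 3 * n + 2

def g(n):
    return n

# Witnesses for f(n) = O(g(n)): c = 4, n0 = 2.
# Check f(n) <= c*g(n) over a finite range of n >= n0.
c, n0 = 4, 2
ok = all(f(n) <= c * g(n) for n in range(n0, 1000))
# 3n + 2 <= 4n  <=>  2 <= n, so the inequality holds for every n >= 2,
# while it fails at n = 1 -- which is exactly why n0 is needed.
```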
4. LITTLE-O (o) NOTATION:
-> It is denoted by 'o'.
-> Little-o notation is used to describe an upper bound that is not asymptotically tight; in other words, a loose upper bound on f(n).
-> Let f(n) and g(n) be functions that map positive integers to positive real numbers. We say that f(n) is o(g(n)) if for any real positive constant c there exists an integer constant n0 >= 1 such that 0 <= f(n) < c*g(n) for all n >= n0.
The relationship between Big Omega (Ω) and Little Omega (ω) is similar to that between Big-O and Little-o, except that now we are looking at lower bounds. Little omega (ω) is a rough estimate of the order of growth, whereas Big Omega (Ω) may represent the exact order of growth. We use ω notation to denote a lower bound that is not asymptotically tight.
And, f(n) ∈ ω(g(n)) if and only if g(n) ∈ o(f(n)).
In mathematical relation,
if f(n) ∈ ω(g(n)) then
lim (n→∞) f(n)/g(n) = ∞
Example:
Prove that 4n + 6 ∈ ω(1).
The little-omega (ω) claim can be proven by applying the limit formula given below.
If lim (n→∞) f(n)/g(n) = ∞, then f(n) is ω(g(n)).
Here we have the functions f(n) = 4n + 6 and g(n) = 1:
lim (n→∞) (4n + 6)/1 = ∞
Also, for any c we can find an n0 satisfying the inequality 0 <= c*g(n) < f(n), i.e., 0 <= c*1 < 4n + 6.
Hence proved.
RANDOMIZED ALGORITHMS
Basics of Probability Theory
->Probability theory has the goal of characterizing the outcomes of natural or conceptual "experiments".
->Examples of such experiments include tossing a coin ten times, rolling a die three times, playing a lottery, gambling, picking a ball from an urn containing white and red balls, and so on.
->Each possible outcome of an experiment is called a sample point, and the set of all possible outcomes is known as the sample space S.
->An event E is a subset of the sample space; if the sample space consists of n sample points, then there are 2^n possible events.
->When all sample points are equally likely, the probability of an event E is defined to be |E|/|S|, where S is the sample space.
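As a quick check of the |E|/|S| definition with equally likely outcomes, consider rolling one fair die and the event "the outcome is even":

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}                 # S: outcomes of one die roll
event = {x for x in sample_space if x % 2 == 0}   # E: the even outcomes

prob = Fraction(len(event), len(sample_space))    # |E| / |S|
# prob = 3/6 = 1/2
```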
Deterministic algorithm:
->A deterministic algorithm is an algorithm which, for a given particular input, will always produce the same output.
->The goal of a deterministic algorithm is to always solve a problem correctly and quickly (in polynomial time).
Randomized algorithms:
->In addition to the input, the algorithm uses random numbers to make random choices during execution.
->For the same input there can be different outputs, i.e., its behavior changes across different runs.
->These algorithms depend upon random number generators.
->Randomized algorithms rely on the statistical properties of random numbers. An example of a randomized algorithm is randomized quicksort.
->It tries to achieve good performance in the average case.
->A randomized algorithm is also called a probabilistic algorithm.
->Randomized algorithms are used to overcome the problem of a deterministic algorithm's running time being blown up on worst-case inputs.
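The quicksort example mentioned above becomes randomized by picking the pivot uniformly at random, so no fixed input ordering can consistently force the worst case; a Python sketch:

```python
import random

def randomized_quicksort(a):
    """Sort a list; the pivot is chosen uniformly at random, so the
    expected running time is O(n log n) for every input ordering."""
    if len(a) <= 1:
        return a
    pivot = random.choice(a)                       # the random choice
    less    = [x for x in a if x < pivot]
    equal   = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    return randomized_quicksort(less) + equal + randomized_quicksort(greater)

print(randomized_quicksort([5, 2, 9, 1, 5, 6]))   # prints [1, 2, 5, 5, 6, 9]
```

Note that the output is the same on every run; only the sequence of internal choices (and hence the running time) varies.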
Primality Testing:
->Any integer greater than one is said to be prime if its only divisors are 1 and the integer itself.
->Given an integer n, the problem of deciding whether n is prime is known as primality testing. It has a number of applications, including cryptology.
->Assuming that it takes Θ(1) time to determine whether one integer divides another, the naive primality-testing algorithm has a run time of O(√n). The input size for this problem is ⌈log n + 1⌉, since n can be represented in binary form with this many bits. Thus the run time of this simple algorithm is exponential in the input size (notice that √n = 2^((1/2) log n)). We can devise a Monte Carlo randomized algorithm for primality testing that runs in O((log n)^2) time. The output of this algorithm is correct with high probability. If the input is prime, the algorithm never gives an incorrect answer. However, if the input number is composite (i.e., non-prime), then there is a small probability that the answer may be incorrect. Algorithms of this kind are said to have one-sided error.
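A minimal Monte Carlo primality test with exactly this one-sided error is the Fermat test, used here purely as an illustration: the text's O((log n)^2) algorithm is not spelled out, so this sketch mirrors the error behavior, not that exact bound.

```python
import random

def fermat_is_prime(n, trials=20):
    """Monte Carlo primality test with one-sided error:
    a prime is never rejected; a composite may (rarely,
    for certain numbers) be accepted."""
    if n < 2:
        return False
    if n in (2, 3):
        return True
    for _ in range(trials):
        a = random.randrange(2, n - 1)
        # By Fermat's little theorem, a^(n-1) ≡ 1 (mod n) when n is prime,
        # so any witness a with a^(n-1) ≢ 1 proves n composite.
        if pow(a, n - 1, n) != 1:
            return False        # definitely composite
    return True                 # probably prime
```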
Advantages:
->The algorithm is usually simple and easy to implement.
->The algorithm is fast with very high probability, and/or it produces an optimum output with very high probability.
Disadvantages:
->The use of a randomizer within an algorithm does not always result in better performance.
->A small probability of error cannot be tolerated in some critical applications, such as nuclear reactor control.
INTRODUCTION
To obtain the union of two sets, all that has to be done is to set the parent field of one of the roots to the other root. This can be accomplished easily if, with each set name, we keep a pointer to the root of the tree representing that set. If, in addition, each root has a pointer to the set name, then to determine which set an element is currently in, we follow parent links to the root of its tree and use the pointer to the set name. The data representation for S1, S2, and S3 may then take the form shown in Figure 2.19.
->In presenting the union and find algorithms, we ignore the set names and identify sets just by the roots of the trees representing them.
->FindPointer is a function that takes a set name and determines the root of the tree that represents it. This is done by an examination of the [set name, pointer] table.
->Since the set elements are numbered 1 through n, we represent the tree nodes using an array p[1:n], where n is the maximum number of elements.
->The ith element of this array represents the tree node that contains element i. This array element gives the parent pointer of the corresponding tree node. Figure 2.20 shows this representation of the sets S1, S2, and S3 of Figure 2.17. Notice that root nodes have a parent of -1.
->We can now implement Find(i) by following the indices, starting at i, until we reach a node with parent value -1. For example, Find(6) starts at 6 and then moves to 6's parent, 3. Since p[3] is negative, we have reached the root.
->The operation Union(i, j) is equally simple. We pass in two trees with roots i and j. Adopting the convention that the first tree becomes a subtree of the second, the statement p[i] := j; accomplishes the union.
Collapsing Rule :
->To find the root, we replace the parent pointer of the given node with a pointer to the root.
-> In fact, we can go one step further and replace the parent pointer of every node along the search path to
the root. This is called a collapsing find operation.
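The array representation and the collapsing find described above can be sketched in Python, using 1-based indexing with a dummy slot 0 and -1 marking a root, as in the text:

```python
def find(p, i):
    """Return the root of element i, collapsing the search path:
    every node visited is re-pointed directly at the root."""
    root = i
    while p[root] >= 0:         # roots are marked with parent -1
        root = p[root]
    while p[i] >= 0:            # collapsing pass over the same path
        parent = p[i]
        p[i] = root             # point this node directly at the root
        i = parent
    return root

def union(p, i, j):
    """Merge the tree rooted at i into the tree rooted at j (p[i] := j)."""
    p[i] = j

# p[1:5]: elements 1..5, each initially its own root; index 0 is unused.
p = [0, -1, -1, -1, -1, -1]
union(p, 1, 2)      # tree rooted at 1 becomes a subtree of 2
union(p, 2, 3)      # now 3 is the root of the set {1, 2, 3}
# find(p, 1) follows 1 -> 2 -> 3, returns 3, and re-points 1 at 3.
```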
BFS ALGORITHM
DFS ALGORITHM
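The two traversal algorithms named above did not survive extraction; minimal adjacency-list sketches follow (the dictionary graph representation and the vertex labels are this sketch's assumptions):

```python
from collections import deque

def bfs(graph, start):
    """Breadth-first search: visit vertices level by level,
    in order of distance from start, using a FIFO queue."""
    visited, order, queue = {start}, [], deque([start])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in graph[u]:
            if v not in visited:
                visited.add(v)
                queue.append(v)
    return order

def dfs(graph, start, visited=None):
    """Depth-first search: follow one path as deep as possible
    before backtracking."""
    if visited is None:
        visited = []
    visited.append(start)
    for v in graph[start]:
        if v not in visited:
            dfs(graph, v, visited)
    return visited

g = {1: [2, 3], 2: [1, 4], 3: [1], 4: [2]}
# bfs(g, 1) -> [1, 2, 3, 4]; dfs(g, 1) -> [1, 2, 4, 3]
```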
Connected graph:
->A connected graph is a graph in which a path always exists from any vertex to any other vertex.
->Equivalently, a graph is connected if we can reach any vertex from any other vertex by following edges.
Connected Component Definition:
->A connected component (or simply component) of an undirected graph is a subgraph in which each pair of nodes is connected to each other via a path.
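Counting components follows directly from this definition: repeatedly run a traversal from any unvisited vertex, and each sweep covers exactly one component. A Python sketch (the example graph is this sketch's assumption):

```python
def count_components(graph):
    """Count the connected components of an undirected graph
    given as an adjacency dict {vertex: [neighbors]}."""
    visited = set()
    components = 0
    for start in graph:
        if start in visited:
            continue
        components += 1          # a new, unvisited component begins here
        stack = [start]          # iterative DFS over this one component
        while stack:
            u = stack.pop()
            if u in visited:
                continue
            visited.add(u)
            stack.extend(graph[u])
    return components

g = {1: [2], 2: [1], 3: [4], 4: [3], 5: []}
# components: {1, 2}, {3, 4}, {5} -> 3
```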
SPANNING TREES
->If we have a graph containing V vertices and E edges, then the graph can be represented as G(V, E).
->If we create a spanning tree from the graph, the spanning tree has the same number of vertices as the graph, but not the same number of edges.
->The number of edges in the spanning tree is equal to the number of vertices in the graph minus 1.
Articulation point:
->A vertex v in a connected graph G is an articulation point if and only if the deletion of vertex v, together with all edges incident to v, disconnects the graph into two or more nonempty components.
->Articulation points are sometimes called cut vertices.
Biconnected Graph:
A graph is said to be biconnected if:
1) It is connected, i.e., it is possible to reach every vertex from every other vertex by a simple path.
2) Even after removing any one vertex, the graph remains connected.
Consider the problem of identifying the articulation points and biconnected components of a connected graph G with n > 2 vertices. The problem is efficiently solved by considering a depth-first spanning tree of G. Figures 6.9(a) and (b) show a depth-first spanning tree of the graph of Figure 6.6(a). In each figure there is a number outside each vertex. These numbers correspond to the order in which a depth-first search visits the vertices and are referred to as the depth-first numbers (dfns) of the vertices. Thus dfn[1] = 1, dfn[4] = 2, dfn[6] = 8, and so on. In Figure 6.9(b) solid edges form the depth-first spanning tree. These edges are called tree edges. Broken edges (i.e., all the remaining edges) are called back edges.
->Initially, the algorithm chooses any vertex to start from and marks its status as visited. The next step is to calculate the depth of the selected vertex. The depth of each vertex is the order in which it is visited.
->Next, we need to calculate the lowest discovery number, L. This is the smallest depth reachable from a vertex by going down tree edges and then following at most one back edge. An edge (u, w) is a back edge if w is an ancestor of u in the DFS tree but the edge is not part of the DFS tree; the edge is, however, part of the original graph.
->Let's say we want to calculate the lowest discovery number for vertex 4. The neighboring vertex is 3, but it doesn't have any back edge. So the search goes on and reaches vertex 2. Vertex 2 has a back edge, and it connects to vertex 1. The lowest discovery number of vertex 4 is therefore equal to the depth of vertex 1, which is 1.
->L[u] = min {dfn[u], min {L[w] | w is a child of u}, min {dfn[w] | (u, w) is a back edge}}
L[1:10] = {1, 1, 1, 1, 6, 8, 6, 6, 5, 4}
->After calculating the depth and lowest discovery number for all the vertices, the algorithm takes a pair of vertices and checks whether one of them is an articulation point.
->Let's take two vertices u and w, where u is the parent and w is the child. If the lowest discovery number of w is greater than or equal to the depth of u, then u is an articulation point.
->There is one special case, when the root is an articulation point. The root is an articulation point if and only if it has more than one child in the tree. That's why we put a checking condition in our algorithm to identify the root of the tree.
->Checking condition for an articulation point: L[w] >= dfn[u]
L[5] >= dfn[2]: 6 >= 6 (True)
L[10] >= dfn[3]: 4 >= 3 (True)
L[6] >= dfn[5]: 8 >= 7 (True)
Hence, the vertices 2, 3, and 5 are articulation points in the graph.
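The dfn/L computation and the L[w] >= dfn[u] check can be sketched in Python. The course's Figure 6.6(a) graph is not reproduced here, so the small example graph below is this sketch's assumption:

```python
def articulation_points(graph, root):
    """Find articulation points by DFS, computing depth-first numbers
    (dfn) and lowest discovery numbers (low), per
    L[u] = min(dfn[u], L over children, dfn over back edges)."""
    dfn, low, points = {}, {}, set()
    counter = [1]

    def dfs(u, parent):
        dfn[u] = low[u] = counter[0]
        counter[0] += 1
        children = 0
        for w in graph[u]:
            if w not in dfn:                  # (u, w) is a tree edge
                children += 1
                dfs(w, u)
                low[u] = min(low[u], low[w])
                # non-root u is an articulation point if L[w] >= dfn[u]
                if u != root and low[w] >= dfn[u]:
                    points.add(u)
            elif w != parent:                 # (u, w) is a back edge
                low[u] = min(low[u], dfn[w])
        # special case: the root is an articulation point iff
        # it has more than one child in the DFS tree
        if u == root and children > 1:
            points.add(u)

    dfs(root, None)
    return points

# Hypothetical graph: 1-2, 2-3, 1-3 form a cycle; vertex 4 dangles off 3.
g = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
# Removing vertex 3 disconnects 4, so 3 is the only articulation point.
```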
->Time Complexity
The algorithm mainly uses DFS with some additional steps, so the time complexity of this algorithm is equal to the time complexity of the DFS algorithm.
In case the given graph uses an adjacency list, the time complexity of the DFS algorithm is O(V + E). Therefore, the overall time complexity of this algorithm is O(V + E).
Applications
->The concept of articulation points is crucial in a network. The presence of articulation points in a connected network is an indication of high risk and vulnerability: if a connected network has an articulation point, a single point of failure can disconnect the whole network.
->In computer science, one popular problem is to find the maximum amount of flow passing from a source vertex to a sink vertex. This problem is governed by the max-flow min-cut theorem, and the concept of articulation points is used here in finding the solution.
[FIG1: example graph for the biconnected-components walkthrough below]
->So simply check whether the given graph has any articulation point. If it has no articulation point, then it is biconnected; otherwise it is not.
->Clearly, for vertex 4 and its child 1, low[1] >= disc[4], which means 4 is an articulation point. The algorithm returns false as soon as it discovers that 4 is an articulation point, and will not go on to check low[] for vertices 0, 2, and 3.
->Then with 3 it finds the edge 3-2 and pushes that onto the stack.
->It then discovers that low[1] >= disc[4], i.e., that 4 is an articulation point. So all the edges inserted after the edge 4-1, along with edge 4-1 itself, form the first biconnected component. So it pops and prints the edges until the last edge is 4-1, then prints that too and pops it from the stack.
->Then it discovers the edge 4-5 and pushes that onto the stack.
->For 5 it discovers the back edge 5-2 and pushes that onto the stack.
->After that, no more edges are connected to 5, so it goes back to 4. For 4 also no more edges are connected, and low[5] is not >= disc[4].
->Then it goes back to 2 and discovers that low[4] >= disc[2], which means 2 is an articulation point. So all the edges inserted after the edge 4-2, along with the edge 4-2 itself, form the second biconnected component. So it prints and pops all the edges until the last edge is 4-2, then prints and pops that too.
->Then it goes to 3 and discovers that 3 is an articulation point, as low[2] >= disc[3], so it prints and pops until the last edge is 3-2, then prints and pops that too. That forms the third biconnected component.
->Now finally it discovers that for edge 0-3 also low[3] >= disc[0], so it pops it from the stack and prints it as the fourth biconnected component.
->Then it checks the visited[] value of the other vertices, and as it is true for all vertices, the algorithm terminates.
->So ultimately the algorithm discovers all 4 biconnected components shown.
->The time complexity of the algorithm is the same as that of DFS. If V is the number of vertices and E is the number of edges, then the complexity is O(V + E).
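The stack-of-edges procedure walked through above can be sketched in Python. The graph used below is a small hypothetical example, not the FIG1 graph:

```python
def biconnected_components(graph, root):
    """Tarjan-style BCC: push each tree/back edge on a stack; when
    low[w] >= disc[u] for a child w of u, pop edges down to (u, w) --
    those popped edges form one biconnected component."""
    disc, low = {}, {}
    stack, comps = [], []
    counter = [1]

    def dfs(u, parent):
        disc[u] = low[u] = counter[0]
        counter[0] += 1
        for w in graph[u]:
            if w not in disc:                 # tree edge
                stack.append((u, w))
                dfs(w, u)
                low[u] = min(low[u], low[w])
                if low[w] >= disc[u]:         # u separates w's subtree
                    comp = []
                    while stack[-1] != (u, w):
                        comp.append(stack.pop())
                    comp.append(stack.pop())  # pop (u, w) itself
                    comps.append(comp)
            elif w != parent and disc[w] < disc[u]:   # back edge
                stack.append((u, w))
                low[u] = min(low[u], disc[w])

    dfs(root, None)
    return comps

# Hypothetical graph: triangle 0-1-2 with a pendant edge 1-3.
g = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}
comps = biconnected_components(g, 0)
# Two biconnected components: the single edge 1-3, and the triangle's 3 edges.
```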