
Design and Analysis of Algorithms


KCS503
Instructor: Md. Shahid

Preface

Dear AKTU Students,


I am excited to present these comprehensive notes for the "Design
and Analysis of Algorithms" course tailored specifically for your
academic journey. These notes aim to serve as a valuable companion,
offering clear explanations, insightful examples, and practical insights
to aid your understanding of algorithmic principles. Whether you're
preparing for exams or deepening your grasp of key concepts, these
notes are crafted to enhance your learning experience. Wishing you a
successful and enriching exploration of algorithm design.

Best regards,
Md. Shahid (Assistant Professor, MIET, Meerut)
2nd edition
Year-2023
The author can be reached at shahid.55505@gmail.com.

Unit-01

Algorithm—It is a finite sequence of well-defined steps to solve a computational problem.

Program—A program, on the other hand, is a concrete implementation of an algorithm using a programming language.

Properties of Algorithm
 It should terminate after a finite time.
 It should produce at least one output.
 It should take zero or more inputs externally.
 It should be deterministic (unambiguous).
 It should be language independent.

Example to differentiate between an algorithm and a program

Add()
{
1. input two numbers a and b
2. sum a and b and store the result in c
3. return c
}

The above example is an algorithm, as it follows the properties listed above.

while(1)
{
    printf("MIET");
}
The above example is not an algorithm, as it never terminates, though it is a valid program.

The main algorithm design techniques


1. Divide and conquer technique
2. Greedy technique
3. Dynamic programming
4. Branch and bound
5. Randomized
6. Backtracking

Note— The most basic approach to designing algorithms is the brute force technique, where one attempts all possible solutions to a problem and opts for the successful one. All computational problems can be solved through the brute force method, though often without noteworthy efficiency in terms of space and time complexity.

For example, searching for an element in a sorted array using linear search (rather than binary search) is a brute force approach.
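As a minimal illustration, a brute-force linear search in C simply scans every element until the key is found (the array and key below are illustrative):

#include <stdio.h>

/* Returns the index of key in arr[0..n-1], or -1 if it is absent.
   Worst case: n comparisons, i.e., O(n). */
int linear_search(const int arr[], int n, int key)
{
    for (int i = 0; i < n; i++)
        if (arr[i] == key)
            return i;
    return -1;
}

int main(void)
{
    int arr[] = {2, 5, 8, 12, 16, 23};
    int n = sizeof(arr) / sizeof(arr[0]);
    printf("%d\n", linear_search(arr, n, 12)); /* prints 3 */
    return 0;
}

Binary search (covered later) exploits the sorted order to do the same job in O(log n).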

Steps required to design an algorithm


1. Problem Definition: Clearly understand the problem you need to solve. Define the input, output, constraints, and objectives of the algorithm.
2. Design Algorithm: Choose an appropriate algorithmic technique based on the nature of the problem, such as greedy, divide and conquer, dynamic programming, etc.
3. Draw Flowchart: Create a visual representation of your algorithm using a flowchart. The flowchart helps to visualize the logical flow of steps.
4. Validation: Mentally or manually walk through your algorithm with various inputs to verify its correctness. Ensure it produces the expected results.
5. Analyze the Algorithm: Evaluate the efficiency of the algorithm in terms of time complexity (how long it takes to run) and space complexity (how much memory it uses).
6. Implementation (Coding): Translate your algorithm into actual code using a programming language of your choice. Write clean, well-organized code that follows best practices.

Q. Define 'algorithm,' discuss its main properties, and outline the steps
for designing it.
Q. What do you mean by an algorithm, and what are the main
algorithm design techniques?

Analysis of Algorithms
The efficiency of an algorithm can be analyzed at two different stages:
before implementation (A Priori Analysis) and after implementation (A
Posteriori Analysis).

A Priori Analysis— This is a theoretical analysis of an algorithm's efficiency before it is actually implemented. The analysis assumes that certain factors, such as processor speed and memory, remain constant and do not affect the algorithm's performance. It involves evaluating an algorithm based on its mathematical characteristics, such as time complexity (Big O notation) and space complexity. It provides insights into how an algorithm will perform under ideal conditions and helps in comparing different algorithms in terms of their theoretical efficiency.

A Posteriori Analysis— This is an empirical analysis that occurs after an algorithm has been implemented in a programming language and executed on an actual computer. The algorithm is tested and executed on a target machine, and actual statistics like running time and space required are collected. A posteriori analysis provides a more realistic view of how an algorithm performs in a real-world setting, considering hardware characteristics, compiler optimizations, and other factors. It helps validate the theoretical analysis and may reveal unexpected performance issues or bottlenecks.

Note—Writing a computer program that handles a small set of data is entirely different from writing a program that takes a large amount of input data. A program written to handle a large amount of input data must be algorithmically efficient in order to produce the result in reasonable time and space.

Asymptotic Analysis of Algorithms


Asymptotic analysis of algorithms is a method used to analyze the
efficiency and performance of algorithms as the input size grows
towards infinity. It focuses on understanding how an algorithm's
performance scales with larger inputs and provides a way to express
the upper (worst-case) and lower bounds (best-case) on its execution
time or space complexity. The primary goal of asymptotic analysis is
to identify the algorithm's growth rate, which helps in making
comparisons between different algorithms and determining their
suitability for various problem sizes.
Asymptotic Notations:
1. O [Big-oh] (upper bound)
2. Ω [Big-omega] (lower bound)
3. Θ [Big-theta] (tight bound)
4. o [little-oh] (upper bound that is not asymptotically tight)
5. ω [little-omega] (lower bound that is not asymptotically tight)

1. Big-oh Notation (O): It is used to describe the upper bound or worst-case performance of an algorithm in terms of its time complexity or space complexity. It provides an estimate of the maximum amount of time or space an algorithm can require for its execution as the input size increases.

Note: Most of the time, we are interested in finding only the worst-case scenario of an algorithm (worst-case time complexity). Big O notation allows for a high-level understanding of how an algorithm's efficiency scales without getting into specific constants or lower-order terms.

We say that
f(n) = O(g(n)) if and only if
f(n) <= c·g(n) for some constant c > 0 and for all n >= n0 >= 0
Question: Find an upper bound for the function f(n) = 3n + 2.
Solution:
Steps:
1. We know that the definition of the upper bound is f(n) <= c·g(n).
2. f(n) = 3n + 2 (given)
3. We need to find c and g(n).
4. If we choose c = 5 and g(n) = n, then 3n + 2 <= 5n for n >= 1.
5. For c = 5 and n0 = 1 (the starting value of n), f(n) <= c·g(n).
6. Therefore, f(n) = O(n)

Note: Other functions that grow faster than f(n), such as n^2, n^3, n log n, etc., will also serve as upper bounds for the function f(n). However, we typically consider only the closest upper bound among all possible candidates, as the remaining upper bounds are not as useful.
1. Given f(n) = 2n^2 + 3n + 2 and g(n) = n^2, prove that f(n) = O(g(n)).

Solution:
Steps:

1. We have to show f(n) <= c·g(n), where c and n0 are some positive constants, for all n >= n0.
2. Now find values of c and n0 such that equation (1) holds:

2n^2 + 3n + 2 <= c·n^2 ………(1)

If we put c = 7 (note: any sufficiently large positive value of c will work), then
2n^2 + 3n + 2 <= 7n^2
Now put n = 1, which is n0 (the starting value for input n):
7 <= 7 [True]
Hence, with c = 7 and n0 = 1, f(n) = O(g(n)) for all n >= n0.

2. Given f(n) = 5n^2 + 6n + 4 and g(n) = n^2, prove that f(n) is O(n^2).

Solution:

f(n) will be O(n^2) if and only if the following condition holds:

f(n) <= c·g(n), where c is some constant and n >= n0 >= 0

5n^2 + 6n + 4 <= c·n^2
If we put c = 15 and n0 = 1, then we get
15 <= 15 (which is true)

It means f(n) is O(n^2) with c = 15 and n0 = 1.

Note: We have to find c and n0 (the starting value of input n) to solve such a question.

3. Solve the function f(n) = 2^n + 6n^2 + 3n and find its big-oh (O) notation.
Steps:
1. Find the fastest-growing term of f(n); this gives the big-oh.
2. Prove it using the definition f(n) <= c·g(n).

Solution:
The big-oh (upper bound) of f(n) = 2^n + 6n^2 + 3n will be 2^n iff
f(n) <= c·2^n for some constant c > 0 and n >= n0 >= 0
2^n + 6n^2 + 3n <= c·2^n
If we put c = 11 and n0 = 1, then we get
11 <= 22 (it is true)
It means the big-oh of f(n) is O(2^n) with c = 11 and n0 = 1.

2. Big-omega Notation (Ω): It is used to describe the lower bound or best-case performance of an algorithm in terms of its time complexity or space complexity. It provides an estimate of the minimum amount of time or space an algorithm can require for its execution as the input size increases.

We say that
f(n) = Ω(g(n)) if and only if
f(n) >= c·g(n) for some constant c > 0 and for all n >= n0 >= 0

For example :
1. Given f(n) = 3n + 2 and g(n) = n, prove that f(n) = Ω(g(n)).

Solution
1. We have to show that f(n) >= c·g(n), where c and n0 are some positive constants, for all n >= n0.
2. 3n + 2 >= c·n
3. Put c = 1:
4. 3n + 2 >= n
5. Put n = 1:
6. 5 >= 1 [True]
7. Hence, with c = 1 and n0 = 1, f(n) = Ω(g(n)).

2. Solve the function f(n) = 3n + 5n^2 + 8n and find its big-omega (lower bound) notation.

Solution:

Steps:
1. Find the slowest-growing term of f(n). This gives the value for the lower bound (the best case for the function f(n)).
2. Use the definition to find c and n0 to prove the claim.

f(n) = Ω(n) iff f(n) >= c·n, where c is some constant and n >= n0 >= 0

3n + 5n^2 + 8n >= c·n
If we put c = 16 and n0 = 1, then we get
16 >= 16 (holds good)
It means the lower bound (Ω) for the given function f(n) = 3n + 5n^2 + 8n is n.

3. Big Theta Notation (Θ): It combines the characteristics of both Big-O and Big-Omega notations, as it represents both the lower and the upper bound of an algorithm.

We say that
f(n) = Θ(g(n)) if and only if
c1·g(n) <= f(n) <= c2·g(n) for all n >= n0 >= 0 and some constants c1, c2 > 0

For example
1. Given f(n) = 3n + 2 and g(n) = n, prove that f(n) = Θ(g(n)).

Solution
1. We have to show that c1·g(n) <= f(n) <= c2·g(n) for all n >= n0 >= 0 and some constants c1, c2 > 0.
2. c1·g(n) <= f(n) <= c2·g(n)
3. c1·n <= 3n + 2 <= c2·n
4. Put c1 = 1, c2 = 4 and n = 2; then 2 <= 8 <= 8 [True]

5. Hence, f(n) = Θ(g(n)) with c1 = 1, c2 = 4 and n0 = 2.

3. Solve the function f(n) = 27n^2 + 16 and find its tight bound (Θ).

Solution:
1. If the upper bound (big-oh) and the lower bound (big-omega) of f(n) use the same g(n), we can call g(n) the Theta notation of f(n).
2. Use the definition c1·g(n) <= f(n) <= c2·g(n).

Let's check whether n^2 is the tight bound or not.

27n^2 + 16 <= c2·n^2 (for the upper bound)
If we put c2 = 43 and n0 = 1, then we get
43 <= 43 (holds good)

Now check for the lower bound:

27n^2 + 16 >= c1·n^2
If we put c1 = 27 and n0 = 1, then we get
43 >= 27 (holds good for all n >= 1)
Since both bounds hold with the same g(n) = n^2, n^2 is the tight bound (Θ) for the function f(n) = 27n^2 + 16.

4. Little-oh [o]: We use o-notation to denote an upper bound that is not asymptotically tight, whereas big-oh (the asymptotic upper bound) may or may not be asymptotically tight.

We say that
f(n) = o(g(n)) if and only if
0 <= f(n) < c·g(n) for all constants c > 0 and all n >= n0 > 0
Or, equivalently,
lim (n→∞) f(n)/g(n) = 0

For example
1. Given f(n) = 2n and g(n) = n^2, prove that f(n) = o(g(n)).

Solution

lim (n→∞) 2n/n^2 = lim (n→∞) 2/n = 0

Hence, f(n) = o(g(n)).


5. Little-omega [ω]: We use ω-notation to denote a lower bound that is not asymptotically tight.
We say that
f(n) = ω(g(n)) if and only if
0 <= c·g(n) < f(n) for all constants c > 0 and all n >= n0 > 0
Or, equivalently,
lim (n→∞) f(n)/g(n) = ∞

For example
1. Given f(n) = n^2/2 and g(n) = n, prove that f(n) = ω(g(n)).

Solution

lim (n→∞) (n^2/2)/n = lim (n→∞) n/2 = ∞

Hence, f(n) = ω(g(n)).

Question. Why should we do asymptotic analysis of algorithms?

It is crucial for several reasons:
Efficiency Comparison: It allows us to compare and evaluate different
algorithms based on their efficiency. By analyzing how an algorithm's
performance scales with input size, we can select the most suitable
algorithm for a given problem.
Algorithm Design: Asymptotic analysis guides the design of new
algorithms. It helps in making informed design choices to optimize
algorithms for various use cases.
Resource Management: Understanding the resource requirements of
algorithms as input size grows helps allocate computational resources
efficiently, preventing bottlenecks.
Scalability: It provides insights into how algorithms will perform as
data sizes increase, ensuring that systems can handle larger inputs
efficiently.

Question. Order the following functions by their asymptotic growth, and justify your answer: f1 = 2^n, f2 = n^(3/2), f3 = n log n, f4 = n^(log n).

Complexity of Algorithms

1. Time complexity
2. Space complexity

Algorithms can be broadly categorized into two main groups based on their structure and approach:
1. Iterative algorithms ( having loop(s) )
2. Recursive algorithms (having recursion)

Note: In the 'a priori' analysis of algorithms, the RAM (Random Access Machine) model is used for analyzing algorithms without running them on a physical machine.
The RAM model has the following properties:

 A simple operation (+, /, *, -, =, &&, ||, if) takes one time step.
 Loops are composed of simple/primitive/basic operations.
 Memory is unlimited, and each access takes one time step.
Note: In the last step of the ‘a priori’ analysis of algorithms,
asymptotic notation is commonly used to characterize the time
complexity of an algorithm. Asymptotic notation provides a concise
way to describe how the performance of an algorithm scales as the
input size becomes very large. It allows us to focus on the most
significant factors affecting the algorithm's efficiency and ignore
constant factors and lower-order terms.
Q. What are the key characteristics of the RAM model?

Time complexity (running time)

The running time of the algorithm is the sum of running times for
each statement executed; a statement that takes Ci steps to execute
and executes “n” times will contribute (Ci * n) to the total running
time.

Note: “Ci (i=1,2,3 …, n)” indicates constant unit of time.

A(n)
{                                    Cost    Times
    int i;                           C1      1
    for (i = 1; i <= n; i++)         C2      n+1
        printf("MIET");              C3      n
}

T(n) = C1*1 + C2*(n+1) + C3*n

After eliminating the constant terms, we get the time complexity in terms of n:
O(n), i.e., linear time.

Q. With a suitable example, define the term "running time" of an algorithm.



Time Complexity of Iterative Algorithms

Note— When an algorithm contains an iterative control construct such as a while or for loop, we can express its running time as the sum of the times spent on each execution of the body of the loop.
Note— If a program has neither loops nor recursion, it takes O(1), i.e., constant running time.
A()
{
    pf("MIET"); // one time step
    pf("MIET"); // one time step
    pf("MIET"); // one time step
}

1 + 1 + 1 = 3 (a constant)

Note— We can denote a constant running time by either O(1) or O(C).
Pattern-01: One loop, and the increment/decrement is by 1
A(n)
{
    for (i = 1; i <= n; i++)   → executes n+1 times
        pf("MIET");            → executes n times
}

T(n) = (n+1) + n
     = 2n + 1
T(n) = O(n) [We drop the constant terms and keep only the highest-order term of n for the running time.]

Pattern-02: One loop, and the increment/decrement is not by 1

In this case, we need to count the number of iterations carefully.

A(n)
{
    int i;
    for (i = 1; i < n; i = i*2) {
        pf("MIET");
    }
}
As the loop variable is not incremented by one, we have to carefully count the number of times "MIET" is printed.
i = 1, 2, 4, …, n
After k iterations, i reaches n:
Values of i = 2^0, 2^1, 2^2, …, 2^k
2^k = n
Convert it into logarithmic form… [If a^b = c, we can write log_a c = b]
k = log2 n
Time complexity is O(log n).

Pattern-03: Nested independent loops

A(n)
{
    int i, j;
    for (i = 1 to n)
        for (j = 1 to n)
            pf("MIET"); // It will be printed n^2 times.
}

Time complexity is O(n^2).

A(n)
{
    int i, j, k;
    for (i = 1 to n)
        for (j = 1 to n)
            for (k = 1 to n)
                pf("MIET"); // It will be printed n^3 times.
}

Time complexity is O(n^3).

Pattern-04: When there is a dependency between the loop and the statements in the body of the loop.

A(n)
{
1.  int i = 1, j = 1;
2.  while (j <= n)
    {
3.      i++;
4.      j = j + i;
5.      pf("MIET"); // We need to know the number of times it is printed.
    }
}

Solution:
We have to find the number of times "MIET" is printed to know the time complexity of the above program. We can see that there is a dependency between line number 2 (the while loop) and line number 4 (the value of "j", which in turn depends on "i").

i = 1, 2, 3, 4, …, k
j = 1, 3, 6, 10, …, k(k+1)/2 [sum of the first k natural numbers]

k(k+1)/2 = n+1 [the loop stops once j exceeds n]
k^2 ≈ n [we drop constant factors and lower-order terms]
k = √n, so the time complexity is O(√n).
Pattern-05: When there is a dependency between loops (having more than one loop).

Note – We have to unroll the loops in order to find the number of times a particular statement gets executed.

A(n)
{
    int i, j, k;
    for (i = 1; i <= n; i++)
    {
        for (j = 1; j <= i; j++)
        {
            for (k = 1; k <= 100; k++)
            {
                pf("MIET");
            }
        }
    }
}

There is a dependency between the second loop and the first loop; therefore, we unroll the loops to count the number of times "MIET" is printed.

i = 1:  j = 1,      k runs 1*100 times
i = 2:  j = 1..2,   k runs 2*100 times
i = 3:  j = 1..3,   k runs 3*100 times
…
i = n:  j = 1..n,   k runs n*100 times

Total = 1*100 + 2*100 + 3*100 + … + n*100
      = 100(1 + 2 + 3 + … + n)
      = 100(n(n+1)/2) = 50n^2 + 50n

Time complexity = O(n^2) [We remove constant factors and lower-order terms.]

Q. Write a function to compute x^n in logarithmic time complexity using an iterative approach.
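One possible answer, as a hedged sketch: iterative fast exponentiation (binary exponentiation) squares the base and halves the exponent at every step, so the loop runs O(log n) times.

long long power(long long x, unsigned int n)
{
    long long result = 1;
    while (n > 0) {
        if (n & 1)            /* lowest bit of n is 1: include this power of x */
            result = result * x;
        x = x * x;            /* square the base */
        n = n >> 1;           /* halve the exponent */
    }
    return result;
}

For example, power(2, 10) returns 1024 after only four iterations of the loop.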

Time Complexity of Recursive Algorithms


To find out the time complexity of recursive programs, we have to write
a recurrence relation for the given program and then solve it using one
of the following methods:
1. Iteration or Back substitution method
2. Recursion-tree method
3. Master method
4. Forward substitution method (Substitution Method)
5. Changing variable method

Let’s learn how to write a recurrence relation for the given program
having a recursive call/function.

Note: Each recursive algorithm or program must have a stopping condition (also known as an anchor condition or base condition) to halt the recursive calls.

A(n)
{
    if (n > 0) // stopping condition for the recursive calls
    {
        pf("MIET");
        A(n-1); // calling itself (recursive call)
    }
}

We assume that T(n) is the total time taken to solve A(n), where n is the input. This T(n) is split among all statements inside the function, i.e., the time taken by all instructions inside the function adds up to T(n).
Note: "if" and "print" take a constant amount of time per the RAM model, and we can use either 1 or C to denote it. When the if-condition becomes false, that final check again takes a constant amount of time (one time step).

The recurrence relation for the above program is given below:

T(n) = T(n-1) + 1   when n > 0
T(n) = 1            when n = 0 (stopping condition)

#Mixed (iterative + recursive)

A(n)
{
    if (n > 0)                    …… 1
    {
        for (i = 1; i <= n; i++)  …… n+1
        {
            pf("MIET");           …… n
        }
        A(n-1);                   …… T(n-1)
    }
}

T(n) = T(n-1) + n   when n > 0
T(n) = 1            when n = 0

#Factorial of a number

fact(n)
{
    if (n <= 1)
        return 1;
    else
        return n * fact(n-1); // here "*" takes a constant time step
}

Note: The multiplication and the comparison take a constant amount of time; the recursive call fact(n-1) contributes T(n-1) to the recurrence.

T(n) = T(n-1) + c   when n > 1
T(n) = 1            when n <= 1

#Fibonacci number Fn
fib(n)
{
    if (n == 0 || n == 1)
        return n;
    else
        return fib(n-1) + fib(n-2);
}

T(n) = T(n-1) + T(n-2) + 1   when n > 1
T(n) = 1 unit of time        when n <= 1

1. Iteration method (backward substitution) for solving recurrences

The iteration method, also known as backward substitution, is a technique used to solve recurrence relations and determine the time complexity of algorithms. This method involves repeatedly substituting the recurrence relation into itself, moving backward toward the initial conditions or base cases, until a pattern or closed-form solution emerges.

T(n) = T(n-1) + 1   when n > 0
T(n) = 1            when n = 0

Note— When solving a recurrence relation to determine the time complexity, our goal is to reduce the leading term T(…), which represents a sub-problem, until we can apply the base condition to simplify it.

T(n) = T(n-1) + 1 ………………….(1)

T(n-1)= T(n-2) + 1

T(n-2) = T(n-3) +1

Back substitute the value of T(n-1) into (1)


T(n) = [ T(n-2) + 1] + 1

T(n) = T(n-2) + 2 …………………(2)

Now, substitute the value of T(n-2) into (2)


T(n) = [T(n-3)+1]+2
T(n) = T(n-3)+3 …………………………(3)
.
.
.
= T(n-k)+k [ Assume n-k = 0 so, n= k ]
= T(n-n)+n
= T(0) + n [ T(0) = 1 is given ]
= 1+ n

T(n) = O(n)

T(n) = T(n-1) + n   when n > 0
T(n) = 1            when n = 0

T(n) = T(n-1) + n …………………………(1)


T(n-1)= T(n-2)+ n-1
T(n-2)= T(n-3) + n-2
Substituting the value of T(n-1) into (1)
T(n)= [T(n-2)+(n-1)]+n

T(n)= T(n-2) + (n-1) + n …………………….(2)


Substituting the value of T(n-2) into (2)
T(n) = [ T(n-3) +(n-2) ] + (n-1)+n
T(n) = T(n-3) + (n-2) + (n-1) +n ………………….(3)
.
.
.
T(n) = T(n-k)+(n-(k-1))+ (n-(k-2))+. . . +(n-1) + n …………(4)
Assume that n-k = 0
Therefore n = k
In place of k, substitute “n” in the equation (4)
T(n) = T(n-n) + (n-(n-1)) + (n-(n-2)) + … + (n-1) + n
T(n) = T(0) + 1 + 2 + … + (n-1) + n
T(n) = 1 + n(n+1)/2
     = O(n^2)

Solve the recurrence using the back substitution method:

T(n) = 2T(n/2) + n [previous year question]

The base condition is not given in the question; therefore, we assume that when n = 1, it takes 1 unit of time.

T(n) = 2T(n/2) + n ……………… (1)

T(n/2) = 2T(n/4) + n/2
T(n/4) = 2T(n/8) + n/4

Substituting the value of T(n/2) into (1), we get

T(n) = 2(2T(n/4) + n/2) + n
T(n) = 2^2 T(n/4) + 2n ……………… (2)

Substituting the value of T(n/4) into (2), we get

T(n) = 2^2 (2T(n/8) + n/4) + 2n
     = 2^3 T(n/8) + n + 2n
     = 2^3 T(n/2^3) + 3n ……………… (3)
.
.
.
= 2^k T(n/2^k) + k·n ……………… (4)
Assume that n/2^k = 1; then
2^k = n
k = log2 n
T(n) = 2^(log2 n) T(1) + n·log2 n

T(n) = n·T(1) + n log n [since 2^(log2 n) = n]

= n + n log n = O(n log n)


2. Recursion-Tree Method for Solving Recurrences

Type- 01 (Reducing function)


Steps:
1. Make T(n) the root node.
2. Draw the tree for two to three levels to calculate the cost and
height.
3. If the cost at each level is the same, multiply it by the height of
the tree to determine the time complexity.
4. If the cost at each level is not the same, try to identify a sequence.
The sequence is typically in an arithmetic progression (A.P.) or
geometric progression (G.P.).

Type-02 (Dividing function: when there is more than one sub-problem, and the size of each sub-problem is the same.)
Steps:
Steps:
1. Make the last term the root node.
2. Draw the tree for two to three levels to calculate the cost and
height.
3. If the cost at each level is the same, multiply it by the height of
the tree to determine the time complexity.
4. If the cost at each level is not the same, try to identify a sequence.
The sequence is typically in an arithmetic progression (A.P.) or
geometric progression (G.P.).

Note: If there is only one sub-problem, follow the Type-01 approach only.

Q. Solve the following recurrences:

i. T(n) = T(n-1) + n^4 using the iteration method
ii. T(n) = 3T(n/4) + cn^2 using the recursion tree method

Q. Explain the Binary Search algorithm. Also solve its recurrence relation.

It is a searching algorithm used on a sorted array; it repeatedly divides the search interval in half. The idea of binary search is to use the information that the array is sorted to reduce the time complexity to O(log n).
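A minimal iterative sketch in C; the recurrence for the recursive version is T(n) = T(n/2) + 1, which solves to O(log n):

int binary_search(const int arr[], int n, int key)
{
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2; /* avoids overflow of (low + high)/2 */
        if (arr[mid] == key)
            return mid;          /* found */
        else if (arr[mid] < key)
            low = mid + 1;       /* search the right half */
        else
            high = mid - 1;      /* search the left half */
    }
    return -1;                   /* not present */
}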

3. Master Theorem to solve recurrences

Note—Effective for the university exam

Question. State the Master Theorem and find the time complexity for the recurrence relation T(n) = 9T(n/3) + n.
Solution— Let a >= 1 and b > 1 be constants, let f(n) be a function, and let T(n) be defined on the nonnegative integers by the recurrence
T(n) = aT(n/b) + f(n)
where we interpret n/b to mean either floor(n/b) or ceiling(n/b). Then T(n) has the following asymptotic bounds:
1. If f(n) = O(n^(log_b a - ϵ)) for some constant ϵ > 0, then T(n) = Θ(n^(log_b a)).
2. If f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a) · log n).
3. If f(n) = Ω(n^(log_b a + ϵ)) for some constant ϵ > 0, and if a·f(n/b) <= c·f(n) for some constant c < 1 and all sufficiently large n, then T(n) = Θ(f(n)).

Given: a = 9, b = 3 and f(n) = n

Now we need to calculate n^(log_b a), as it is the common term in all three cases of the Master Theorem.
n^(log_b a) = n^(log_3 9) = n^2 (it is clearly bigger than f(n), which is n)
Case 1 applies; therefore, T(n) = Θ(n^2).
Question. State Master Theorem and find time complexity for the
recurrence relation T(n) = T(2n/3) +1.

Solution— Given: a = 1, b = 3/2 and f(n) = 1

Now we need to calculate n^(log_b a), as it is the common term in all three cases of the Master Theorem.
n^(log_b a) = n^(log_{3/2} 1) = n^0 = 1 [since log 1 = 0]
As n^(log_b a) is equal to f(n), we can apply the second case (tight bound).

T(n) = Θ(n^(log_b a) · log n)
     = Θ(log n)

Question. State the Master Theorem and find the time complexity for the recurrence relation T(n) = 3T(n/4) + n log n.
Solution— Given: a = 3, b = 4 and f(n) = n log n
Now we need to calculate n^(log_b a), as it is the common term in all three cases of the Master Theorem.
n^(log_b a) = n^(log_4 3) ≈ n^0.793
Since n log n = Ω(n^(log_4 3 + ϵ)), where ϵ ≈ 0.2,

Case 3 applies if we can show that
a·f(n/b) <= c·f(n)

3·(n/4)·log(n/4) <= (3/4)·n·log n for c = 3/4

By Case 3, T(n) = Θ(n log n).

Note—Effective for the competitive exam



4. Substitution method for solving recurrences [Forward Substitution method]

Q. Solve by the substitution method (forward substitution method):

a. T(n) = n·T(n-1) if n > 1; T(1) = 1 …… [A]

Solution:
Step 1: Guess the solution.
T(n) = O(n^n) ……. [1] [You can easily obtain this guess using the iteration method.]

Step 2: Now we have to prove that our guess is true using mathematical induction.

From equation [1], we must show T(n) <= c·n^n.

Put n = 1:
T(1) <= c·1^1
1 <= c [True for c >= 1, n0 = 1]

Assume the claim holds for 1, 2, 3, …, k:
T(k) <= c·k^k [1 <= k <= n]
Moving forward from 1 towards n, at some point we reach k = n-1:
T(n-1) <= c·(n-1)^(n-1) …………………… [2]

Now substitute the value of T(n-1) into equation [A]:

T(n) <= n · c·(n-1)^(n-1)
     <= c · n · n^(n-1)   [since (n-1)^(n-1) <= n^(n-1)]
     = c · n^n
Hence, T(n) = O(n^n). Proved.

Sorting Algorithms and their Analysis



Shell-Sort Algorithm

Quick Sort Algorithm




Apply Merge Sort on the array {9, 6, 5, 0, 8, 5} and also write down its time complexity.
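A hedged C sketch of the standard top-down merge sort (divide, conquer, combine); its recurrence T(n) = 2T(n/2) + n gives O(n log n). The fixed-size buffers assume the small inputs of this exercise:

#include <stdio.h>

/* Merge the two sorted halves arr[l..m] and arr[m+1..r]. */
void merge(int arr[], int l, int m, int r)
{
    int n1 = m - l + 1, n2 = r - m;
    int L[64], R[64];                 /* assumes n <= 64 for this sketch */
    for (int i = 0; i < n1; i++) L[i] = arr[l + i];
    for (int j = 0; j < n2; j++) R[j] = arr[m + 1 + j];
    int i = 0, j = 0, k = l;
    while (i < n1 && j < n2)
        arr[k++] = (L[i] <= R[j]) ? L[i++] : R[j++];
    while (i < n1) arr[k++] = L[i++];
    while (j < n2) arr[k++] = R[j++];
}

void merge_sort(int arr[], int l, int r)
{
    if (l < r) {
        int m = l + (r - l) / 2;
        merge_sort(arr, l, m);        /* divide: left half */
        merge_sort(arr, m + 1, r);    /* divide: right half */
        merge(arr, l, m, r);          /* combine */
    }
}

int main(void)
{
    int a[] = {9, 6, 5, 0, 8, 5};
    merge_sort(a, 0, 5);
    for (int i = 0; i < 6; i++) printf("%d ", a[i]); /* 0 5 5 6 8 9 */
    return 0;
}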

Heap Sort Algorithm



Q. What do you understand by a stable sort? Name two stable sorting algorithms.

A sorting algorithm is said to be stable if two objects having equal keys appear in the same order in the sorted output as they appear in the input data set. For example, Insertion Sort and Counting Sort.

Q. Define an in-place sorting algorithm.

It is a type of sorting algorithm that rearranges the elements of an array or list without requiring additional memory space proportional to the size of the input.

Q Describe the difference between the average-case and the worst-case analysis
of algorithm, and give an example of an algorithm whose average-case running
time is different from worst case analysis.

Q. Compare sorting algorithms in a tabular form.



Sorting in linear time O(n)


[Non-comparison-based sorting algorithms]
Non-comparison-based sorting algorithms do not rely on pairwise comparisons
between elements to determine their relative order. Instead, they exploit
properties of the input data, such as the range of values, to achieve efficient
sorting. These algorithms are often used when the range of possible input values is
known and limited, making them more efficient than comparison-based algorithms
in certain scenarios. Here are some common non-comparison-based sorting
algorithms:

1. Count sort
2. Radix sort
3. Bucket sort

Counting-Sort Algorithm

Counting Sort assumes that each of the n input elements is an integer in the range 0 to k, where k is an integer representing the range of values.
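A hedged C sketch of the textbook counting sort (stable, O(n + k)); the array names follow the usual A/B/C convention:

/* Sorts A[0..n-1] into B[0..n-1]; every key is an integer in 0..k. */
void counting_sort(const int A[], int B[], int n, int k)
{
    int C[k + 1];                                  /* C99 variable-length array */
    for (int i = 0; i <= k; i++) C[i] = 0;
    for (int j = 0; j < n; j++) C[A[j]]++;         /* C[i] = count of key i */
    for (int i = 1; i <= k; i++) C[i] += C[i - 1]; /* C[i] = # of keys <= i */
    for (int j = n - 1; j >= 0; j--) {             /* backward pass keeps it stable */
        B[C[A[j]] - 1] = A[j];
        C[A[j]]--;
    }
}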

Q. Write down the Counting-Sort algorithm and illustrate the operation of Counting Sort on the array A = {6, 4, 8, 4, 5, 1}.

Solution:

Radix-Sort Algorithm

Bucket-Sort Algorithm

Unit-02

Red-Black Tree
It is a Binary Search Tree (BST) with one extra bit of storage per node: its
color, which can be either red or black.
Properties of RB Tree:
1. Every node is either red or black.
2. The root is black.
3. Every leaf node (NIL) is black.
4. If a node is red, then both its children are black. It means we need
to avoid red-red (RR) conflict.
5. For each node, all simple paths from the node to descendant
leaves contain the same number of black nodes.
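As a hedged illustration, a red-black tree node can be declared in C roughly as follows (the field names are illustrative, not from the notes):

typedef enum { RED, BLACK } Color;

typedef struct RBNode {
    int key;
    Color color;             /* the one extra bit of storage per node */
    struct RBNode *left;
    struct RBNode *right;
    struct RBNode *parent;
} RBNode;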

Fig – Red Black Tree


Q1. Explain various rotations in an RB Tree.
We have four types of rotations in an RB tree, as in an AVL tree:
1. Left-Left (LL) problem: needs a single right rotation.
2. Right-Right (RR) problem: needs a single left rotation.
3. Left-Right (LR) problem: needs one left rotation and then one right rotation.
4. Right-Left (RL) problem: needs one right rotation and then one left rotation.

Note: We have to recolor only those nodes which are involved in the rotation. When dealing with an RL or LR rotation, we recolor only the last rotated nodes.

Q2. Compare the properties of AVL tree with RB Tree.

Q3. What are advantages of an RB Tree?


Advantages:
Balanced Structure: Ensures efficient operations (O(log n)) by self-balancing the tree.
Predictable Performance: The rules maintain consistent performance regardless of the data.
Fast Insertions/Deletions: Efficient for frequent changes.
Sorted Order: Supports sorted data and range queries.
Search Efficiency: Quick search operations due to the balanced structure.

Q4. Write an algorithm for insertion of keys into an RB Tree, and also insert the keys <5, 16, 22, 25, 2, 10, 18, 30, 50, 12, 1> into an empty RB Tree.

Algorithm for insertion

1. If the tree is empty, create the new node as the root node with the color black.
2. If the tree is not empty, insert the new node with the color red.
3. If the parent of the new node is black, exit.
4. If the parent of the new node is red, check the color of the parent's sibling (the uncle of the newly inserted node).
   4(a) If its color is black or NULL (no uncle), perform suitable rotations and recolor only the last rotated nodes.
   4(b) If the uncle's color is red, recolor the following nodes:
        1. the uncle,
        2. the parent,
        3. the grandfather (if it is not the root of the tree).
Check whether the tree is an RB tree or not. If not, treat the grandfather as a newly inserted node and repeat step 4.

Q5. Explain the double-black node problem in an RB tree.

When a black node is deleted and replaced by a black child, the child is marked as double black. The main task then becomes converting this double black into a single black.

Q6. Construct an RB Tree, and let h be the height of the tree and n be the number of internal nodes in the tree. Show that h <= 2·log2(n+1).

B-Trees
A B-tree is a self-balancing m-way search tree with the following
restrictions:
1. The root node must have at least two children.
2. Every node, except for the root and leaf nodes, must have at least ⌈m/2⌉
children.
3. All leaf nodes must be at the same level.
4. The creation process is performed bottom-up.

Properties of m-way search tree:


1. m (degree/order): Maximum number of children (or child pointers)
2. m-1 keys : Maximum keys per node
3. All keys are arranged in ascending order.

Note— An m-way search tree, also known as an m-ary search tree, does not have
strict height control during its construction. Depending on the order (m) and the
specific keys inserted, an m-way search tree can grow to a height of "n," where "n"
is the number of keys inserted into the tree. To address this issue and maintain a
more balanced structure, B-trees were introduced. B-trees are a type of m-way tree
with specific restrictions and properties designed to keep their height under control
and ensure balanced branching.

Insertion of keys in a B-tree

Type-01: We are given some keys with a degree.
For example: Insert the keys 12, 21, 41, 50, 60, 70, 80, 30, 36, 6, 16 into an empty B-tree with degree 4.

Create a table following the properties of the B-tree.

Type-02: We are given some keys with a maximum degree.

For example: Using the maximum degree m = 4, insert the following sequence of integers 10, 25, 20, 35, 30, 55, 40, 45, 50, 60, 75, 70, 65, 80, 85, 90 into an initially empty B-Tree.

Create a table following the properties of the B-tree.



Type-03: We are given some keys with a minimum degree.

For example: Using the minimum degree t = 3, insert the following sequence of integers 10, 25, 20, 35, 30, 55, 40, 45, 50, 60, 75, 70, 65, 80, 85, 90 into an initially empty B-Tree.

Create a table following the properties of the B-tree.

Note— The maximum number of keys that can be stored in a particular node of a B-tree with a minimum degree of t is 2t-1. Therefore, based on the question, the maximum number of keys per node is 5.

Note—The simplest B-tree occurs when t=2. Every internal node then
has either 2, 3, or 4 children, and we have a 2-3-4 tree.

Q. Using the minimum degree t = 3, insert the following sequence of integers 10, 25, 20, 35, 30, 55, 40, 45, 50, 60, 75, 70, 65, 80, 85, 90, 100, 110, 120, 112, 114, 120 into an initially empty B-Tree.
Solution—

Q. Using the minimum degree t = 2, insert the following sequence of integers 12, 25, 42, 50, 60, 70, 80, 28, 36, 14, 18 into an initially empty B-Tree.
Solution—


Deletion of keys in a B-tree



Q. Using the minimum degree t = 2, insert the following sequence of integers 12, 25, 42, 50, 60, 70, 80, 28, 36, 14, 18 into an initially empty B-Tree. Now, delete 60, 18, and 14.

Q. Why don’t we allow a minimum degree of t=1 in B-trees?


Allowing a minimum degree of t=1 in B-trees would fundamentally change their
structure and behavior in ways that make them less efficient and undermine some
of their key advantages. The choice of a minimum degree of t>=2 in B-trees is a
deliberate design decision to maintain the desirable properties and performance
characteristics that make B-trees valuable for a wide range of applications involving
sorted data storage and retrieval.

Q. Show all legal B-Trees of minimum degree 2 that represent <1,2,3,4,5>


Q7. Define Binomial Tree and mention its properties.



Q8. Define Binomial heap, write an algorithm for union of two Binomial
heaps and also write its time complexity.

Time complexity: O(log n)

Q9. Design a Binomial heap for the given array A = [7, 2, 4, 17, 1, 11, 6, 8, 15, 10, 20].
Steps:
1. Create a Binomial heap H1 containing the new element (key).
2. Apply the union operation between the two Binomial min-heaps H and H1.

Q. Write down the algorithm for the Decrease-Key operation in a Binomial Heap and also write its time complexity.

BINOMIAL-HEAP-DECREASE-KEY(H, x, k)
1   if k > key[x]
2       then error "new key is greater than current key"
3   key[x] = k
4   y = x
5   z = p[y]
6   while z != NIL and key[y] < key[z]
7       do exchange key[y] and key[z]
8          if y and z have satellite fields, exchange them, too
9          y = z
10         z = p[y]
Running time: T(n) = O(log n)

Q10. Define Fibonacci heap and also compare the complexities of Binary
heap, Binomial heap and Fibonacci heap.

Q11. Explain the extract-minimum-node operation of a Fibonacci heap with an example.

This operation is accomplished by deleting the minimum-key node and then moving all its children to the root list. It uses a process called "consolidate" to merge the trees having the same degree.

Steps:
1. Delete the minimum node.
2. Join the child list of the deleted node to the original root list.
3. Traverse the root list from left to right:
   3.a Find the new minimum.
   3.b Merge trees having the same degree.
4. Stop once every tree in the root list has a unique degree.


Example-

Q12. Define Skip List and Trie with example.



Q. Explain the Search operation in Skip list with a suitable example. Also
write its algorithm.

Q. Insert the following strings into the initially empty trie: DOG, DONE, CAT, CAN, RIGHT, DO,
JUG, DAA, CA, CAME. Then delete the strings DO, CAME, and RIGHT from it.
160

Q14. Insert strings < ten, test, car, card, nest, next, tea, tell, park, part,
see, seek, seen> in an empty Trie data structure and also compress the
Trie.

Unit-03

Divide and Conquer: It is one of the algorithm design techniques, in which the problem is solved using the divide, conquer, and combine strategy.
Divide: This involves dividing the problem into smaller sub-problems.
This step generally takes a recursive approach to divide the problem
until no sub-problem is further divisible.
Conquer: This involves solving sub-problems by calling them recursively
until solved.
Combine: When the smaller sub-problems are solved, this stage
recursively combines them until they formulate a solution to the original
problem.

Following are some standard algorithms that follow the Divide and Conquer approach:

1. Quick sort
2. Merge sort
3. Strassen’s algorithm for matrix multiplication
4. Convex Hull algorithm
5. Closest pair of points

Q1. Describe the Convex-Hull problem with a suitable example.

Given a set of points, a Convex Hull is the smallest convex polygon containing all the given points.

Quickhull Algorithm [a Divide and Conquer algorithm]

Let P[0…n-1] be the input array of points. Following are the steps for finding the convex hull of the points.
1. Find the point with the minimum X-coordinate. Let’s say, min_x and
similarly the point with the maximum X-coordinate, max_x.
2. Make a line joining these two points, say L. This line divides the
whole set into two parts. Take both parts one by one and proceed
further.

3. For each part, find the point P with the maximum distance from line L. P forms a triangle with the points min_x and max_x.
4. The above step divides the problem into two sub-problems, which are solved recursively. Now the line joining the points P and min_x and the line joining the points P and max_x are new lines, and the points residing outside the triangle are the new sets of points. Repeat step 3 until there is no point left outside a line. Add the endpoints of each such line to the convex hull.

Example:
Find a convex hull of a set of points given below.

Matrix Multiplication


Q2. Compare and contrast BFS and DFS. How do they fit into the
decrease and conquer strategy?

Decrease and Conquer Strategy: The name ‘divide and conquer’ is used
only when each problem may generate two or more sub-problems. The
name ‘decrease and conquer’ is used for the single sub-problem class.
The Binary search rather comes under decrease and conquer because
there is one sub-problem. Other examples are BFS and DFS.

BFS (Breadth First Search )


1. It is a traversal approach in which we first walk through all nodes on the same
level before moving on to the next level.
2. It uses Queue data structure.
3. It is more suitable for searching vertices closer to the given source.
4. It requires more memory.
5. It considers all neighbors first and therefore not suitable for decision-making
trees used in games or puzzles.

DFS (Depth First Search)


1. It is also a traversal approach in which the traversal begins at the root node and
proceeds through the nodes as far as possible until we reach the node with no
unvisited nearby nodes.

2. It uses stack data structure.

3. It is more suitable when there are solutions away from source.

4. It requires less memory.



5. It is more suitable for game or puzzle problems. We make a decision, and then explore all paths through this decision. If this decision leads to a win situation, we stop.

Greedy Method of Algorithm Design


As the name suggests, it builds up a solution piece by piece, always choosing the next piece that offers the most immediate benefit. The defining feature of this approach is that each decision is taken on the basis of the currently available information.
Note— A greedy algorithm always makes the choice that looks best at
the moment. That is, it makes a locally optimal choice in the hope that
this choice will lead to a globally optimal solution.

Pseudo code of Greedy Algorithm

Greedy(arr[], n)
{
    solution = 0;
    for i = 1 to n
    {
        x = select(arr[i]);
        if feasible(x)
        {
            solution = solution + x;
        }
    }
}

Initially, the solution is assigned a zero value. We pass the array and the number of elements to the greedy algorithm. Inside the for loop, we select the elements one by one and check whether including each one keeps the solution feasible. If it does, we add it to the solution.

Applications of Greedy Algorithm

o It is used in finding the shortest path.
o It is used to find the minimum spanning tree using Prim's algorithm or Kruskal's algorithm.
o It is used in job sequencing with a deadline.
o This algorithm is also used to solve the fractional knapsack problem.

Let's try to understand some terms used in optimization problems. Suppose we want to travel from Delhi to Mumbai, as shown below:
Problem (P): Delhi (D) → Mumbai (M)
There are multiple solutions to go from D to M. We can go on foot, or by bus, train, airplane, etc., but there is a constraint on the journey: we have to complete it within 16 hours. Only going by train or airplane allows us to cover this distance within 16 hours. Therefore, we have multiple solutions to this problem, but only two solutions satisfy the constraint; these are called feasible solutions.
If we say that we have to cover the journey at minimum cost, then this problem is known as a minimization problem.
So far we have two feasible solutions, i.e., one by train and another one by air. Since travelling by train costs us the minimum, it is an optimal solution. A problem that requires either a minimum or a maximum result is known as an optimization problem. The greedy method is one of the strategies used for solving optimization problems. A greedy algorithm makes good local choices in the hope that the resulting solution will be feasible or optimal.

Q3. Describe optimization problem, feasible solution, and optimal solution.

Optimization problem – An optimization problem refers to a computational problem where the goal is to find the best solution from a set of possible choices (called feasible solutions). The objective is to either maximize or minimize a specific criterion while adhering to a set of constraints. We have the following methods to solve optimization problems:
1. Greedy
2. Dynamic programming
3. Branch and bound

Feasible solution – Most problems have n inputs and require us to obtain a subset that satisfies some constraints. Any subset that satisfies these constraints is called a feasible solution. A problem can have many feasible solutions.
"A feasible solution satisfies all constraints of the problem."
Optimal solution – It is the best solution out of all possible feasible solutions.

Q4. What is the principle of optimality?

A problem is said to satisfy the principle of optimality if the sub-solutions of an optimal solution of the problem are themselves optimal solutions for their subproblems. For example, the shortest path problem satisfies the principle of optimality.

Q5. Differentiate between the Greedy approach and the Dynamic Programming approach.

Greedy Approach:

1. We make the choice that seems best at the moment, in the hope that it will lead to a globally optimal solution.
2. It does not guarantee an optimal solution.
3. It takes less memory.
4. Fractional Knapsack is an example of the Greedy approach.

Dynamic Programming Approach:

1. We make a decision at each step considering the current problem and the solutions to previously solved subproblems to calculate the optimal solution.
2. It guarantees an optimal solution.
3. It takes more memory.
4. 0/1 Knapsack is an example of the Dynamic Programming approach.

Q6. Find the optimal solution for the fractional knapsack problem with n = 7 and a knapsack capacity of m = 15, where the profits and weights of the items are as follows: (p1, p2, …, p7) = (10, 5, 15, 7, 6, 18, 3) and (w1, w2, …, w7) = (2, 3, 5, 7, 1, 4, 1), respectively.

Capacity of the knapsack (bag) = 15

We have to put objects into the bag (knapsack) such that we get the maximum profit.

An object can be selected entirely (x = 1), in fraction (0 <= x <= 1), or not selected (x = 0).

Object   Profit   Weight   P/W
1        10       2        5
2        5        3        1.66
3        15       5        3
4        7        7        1
5        6        1        6
6        18       4        4.5
7        3        1        3

We select objects according to their P/W ratio. The object with the maximum P/W ratio is selected first, then the one with the second maximum P/W ratio, and so on.

Remaining capacity = capacity of knapsack - weight of the selected object

Final table, in decreasing order of P/W ratio:

Object   Profit(P)   Weight   P/W    Remaining   Selection(X)
5        6           1        6      15-1=14     1
1        10          2        5      14-2=12     1
6        18          4        4.5    12-4=8      1
3        15          5        3      8-5=3       1
7        3           1        3      3-1=2       1
2        5           3        1.66   2-2=0       2/3
4        7           7        1      0           0

Object 4 is not selected; therefore, its value of x is zero. Only 2 units of object 2 are selected out of 3 units, so its value of x is 2/3.

Profit = Σ Xi*Pi

Profit = 1*6 + 1*10 + 1*18 + 1*15 + 1*3 + (2/3)*5 + 0*7
       = 6 + 10 + 18 + 15 + 3 + 3.33 + 0
       = 55.33
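A hedged C sketch of the same greedy strategy; it assumes the items are already sorted by decreasing P/W ratio, as in the table above:

typedef struct { double profit, weight; } Item;

/* items[] must be sorted by decreasing profit/weight ratio. */
double fractional_knapsack(const Item items[], int n, double capacity)
{
    double total = 0.0;
    for (int i = 0; i < n && capacity > 0; i++) {
        if (items[i].weight <= capacity) {   /* take the whole item */
            total += items[i].profit;
            capacity -= items[i].weight;
        } else {                             /* take only a fraction of it */
            total += items[i].profit * (capacity / items[i].weight);
            capacity = 0;
        }
    }
    return total;   /* 55.33 for the instance above */
}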

Job Sequencing with Deadlines Problem

We are given a set of n jobs. Associated with job i is an integer deadline di >= 0 and a profit pi > 0. For any job i, the profit pi is earned if and only if the job is completed by its deadline. The objective is to maximize the total profit by scheduling jobs in a way that they meet their deadlines.

The constraints in this problem include:

 Each job has a specific deadline by which it must be completed.
 Each job takes one unit of time to complete. This one unit of time may be equal to one hour, one day, one week, or one month.
 Only one machine is available for processing jobs.
 We can only work on one job at a time.
 We cannot exceed the specified deadlines.
 No preemption.
A greedy sketch for this problem is given below.
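A hedged C sketch of the standard greedy: sort the jobs by decreasing profit, then place each job in the latest free time slot on or before its deadline (identifiers are illustrative):

/* Jobs are assumed sorted by decreasing profit; deadline[i] >= 1. */
int job_sequencing(const int profit[], const int deadline[], int n, int max_d)
{
    int slot[max_d + 1];                      /* slot[t] = job occupying slot t */
    for (int t = 0; t <= max_d; t++) slot[t] = -1;
    int total = 0;
    for (int i = 0; i < n; i++) {
        /* latest free slot at or before this job's deadline */
        for (int t = deadline[i]; t >= 1; t--) {
            if (slot[t] == -1) {
                slot[t] = i;
                total += profit[i];
                break;
            }
        }
    }
    return total;
}

For the instance in the question below (profits 100, 10, 15, 27 with deadlines 2, 1, 2, 1), sorting by profit gives (100, 27, 15, 10); the greedy schedules job 4 (p = 27) in slot 1 and job 1 (p = 100) in slot 2, for a total profit of 127.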

Q. Using a greedy method, find the optimal solution for the “job
sequencing problem with deadlines” with n = 4, where (p1, p2, p3,
p4) = (100, 10, 15, 27) and (d1, d2, d3, d4) = (2, 1, 2, 1).

Q. Identify all solutions satisfying the constraints for the "job sequencing with deadlines" problem with n = 4, where (p1, p2, p3, p4) = (100, 10, 15, 27) and (d1, d2, d3, d4) = (2, 1, 2, 1).

Solution— We know that we can have multiple solutions for an optimization problem, but feasible solutions are only those that satisfy all constraints of the problem. Therefore, we need to identify all feasible solutions for this problem.

Activity Selection Problem

The activity selection problem involves selecting a maximum number of non-overlapping activities from a given set of activities, each with a start time and a finish time. The goal is to choose a set of activities that do not overlap in time and, therefore, can all be completed.

The constraints in this problem include:

 You can only perform one activity at a time.
 The activities must be selected in a way that none of them overlap in time.

Q. Using the Greedy method, find an optimal solution to the activity selection problem with the following information:

Activity   A1  A2  A3  A4  A5  A6  A7  A8
Start       1   0   1   4   2   5   3   4
Finish      3   4   2   6   9   8   5   5

Solution—
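The greedy choice is to sort the activities by finish time and repeatedly pick the first activity whose start time is not before the finish time of the last selected activity. A hedged C sketch:

/* start[] and finish[] are assumed sorted by increasing finish time. */
int activity_selection(const int start[], const int finish[], int n)
{
    int count = 1;                      /* always take the first activity */
    int last_finish = finish[0];
    for (int i = 1; i < n; i++) {
        if (start[i] >= last_finish) {  /* compatible with the last pick */
            count++;
            last_finish = finish[i];
        }
    }
    return count;   /* maximum number of non-overlapping activities */
}

For the instance above, sorting by finish time gives A3(1,2), A1(1,3), A2(0,4), A7(3,5), A8(4,5), A4(4,6), A6(5,8), A5(2,9); the greedy selects A3, A7 and A6, so three activities can be performed.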

Q7. Define spanning tree and minimum spanning tree with an example.

Spanning Tree – It is a subgraph of a graph G that covers all its vertices using the minimum possible number of edges. If there are n vertices in an undirected connected graph, then every possible spanning tree of this graph has n-1 edges. It does not contain a cycle.
Minimum Spanning Tree (MST) – An MST of a weighted, connected, undirected graph is a spanning tree whose weight is less than or equal to the weight of every other possible spanning tree.

Example:

Prim’s Algorithm for finding an MST



Q8. Give an example of an MST using Prim's algorithm for a connected graph.

Kruskal’s Algorithm for finding an MST

Algorithm:

1. Sort all the edges in increasing order of their weight.

2. Pick the smallest edge. Check if it forms a cycle with the spanning tree
formed so far. If cycle is not formed, include this edge. Else, discard it.

3. Repeat step#2 until there are (V-1) edges in the spanning tree.
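A hedged C sketch of the three steps above, using a simple union-find to detect cycles (the Edge type is illustrative):

#include <stdlib.h>

typedef struct { int u, v, w; } Edge;

static int find(int parent[], int x)           /* representative of x's tree */
{
    while (parent[x] != x) x = parent[x];
    return x;
}

static int cmp(const void *a, const void *b)   /* order edges by weight */
{
    return ((const Edge *)a)->w - ((const Edge *)b)->w;
}

/* Returns the total weight of an MST of a connected graph with V vertices. */
int kruskal(Edge edges[], int E, int V)
{
    qsort(edges, E, sizeof(Edge), cmp);        /* step 1: sort the edges */
    int parent[V];
    for (int i = 0; i < V; i++) parent[i] = i;
    int total = 0, taken = 0;
    for (int i = 0; i < E && taken < V - 1; i++) {
        int ru = find(parent, edges[i].u);
        int rv = find(parent, edges[i].v);
        if (ru != rv) {                        /* step 2: no cycle is formed */
            parent[ru] = rv;                   /* union the two trees */
            total += edges[i].w;
            taken++;                           /* step 3: stop at V-1 edges */
        }
    }
    return total;
}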

Single Source Shortest-Paths Problem

1. Dijkstra’s Algorithm (Greedy )


2. Bellman-Ford Algorithm (Dynamic)

Given a graph G = (V, E), we want to find a shortest path from a given source vertex s ∈ V to each vertex v ∈ V.

1. Dijkstra's Algorithm for Single-Source Shortest Paths

Dijkstra's algorithm solves the single-source shortest-paths problem on a weighted graph G = (V, E) for the case in which all edge weights are nonnegative. It works for directed as well as undirected graphs. It is not guaranteed to work correctly when negative edge weights are present.
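A hedged C sketch using an adjacency matrix, which gives O(V^2) time (a binary-heap priority queue would give O(E log V)); V and the matrix encoding are assumptions of the sketch:

#include <limits.h>

#define V 5   /* illustrative number of vertices */

/* dist[] receives the shortest distances from src; graph[u][v] = 0 means no edge. */
void dijkstra(const int graph[V][V], int src, int dist[V])
{
    int visited[V] = {0};
    for (int i = 0; i < V; i++) dist[i] = INT_MAX;
    dist[src] = 0;
    for (int count = 0; count < V - 1; count++) {
        /* greedy step: pick the unvisited vertex with the smallest distance */
        int u = -1;
        for (int i = 0; i < V; i++)
            if (!visited[i] && (u == -1 || dist[i] < dist[u]))
                u = i;
        visited[u] = 1;
        /* relax every edge leaving u */
        for (int v = 0; v < V; v++)
            if (graph[u][v] && dist[u] != INT_MAX &&
                dist[u] + graph[u][v] < dist[v])
                dist[v] = dist[u] + graph[u][v];
    }
}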

Q. Apply the greedy single source shortest path algorithm on the graph
given below.

2. Bellman-Ford Algorithm (Dynamic)



Unit-04

Dynamic Programming

It is one of the algorithm design techniques used to solve optimization


problems. It is mainly an optimization over plain recursion. Wherever we
encounter a recursive solution with repeated calls for the same inputs, we
can optimize it using Dynamic Programming. The idea is to store the
results of subproblems so that we do not have to re-compute them when
needed later. This simple optimization reduces time complexities from
exponential to polynomial.

Principle of Optimality: The principle of optimality, developed by Richard Bellman, is the basic principle of dynamic programming. It states that in an optimal sequence of decisions, each subsequence must also be optimal.

Memoization: It is the top-down approach (we start solving the given problem by breaking it down). We can use this approach in dynamic programming as well, but we generally use the iterative (tabulation) method, which is the bottom-up approach, in dynamic programming.

Let's try to understand the Dynamic Programming approach with a suitable example.

Find a Fibonacci term using plain recursion (a recursive program).

Fibonacci series: 0 1 1 2 3 5 …
Index n:          0 1 2 3 4 5 … (Fibonacci terms)
The F3 term = 2, the F1 term = 1, the F4 term = 3, etc.

int fib(int n)
{
    if (n <= 1)
        return n;
    return fib(n-2) + fib(n-1);
}

T(n) = 1                       if n <= 1
T(n) = T(n-2) + T(n-1) + 1     if n > 1

Time complexity (upper bound):

T(n) <= 2T(n-1) + 1 [since T(n-1) >= T(n-2)]
Using the master method for decreasing functions, we get the time complexity O(2^n), which is exponential.

Now, try to observe the repeated recursive calls for the same argument (input value) using a recursion tracing tree.

Count of repeated recursive calls in Fig. 1:

fib(3) – called 2 times, fib(2) – 3 times, fib(1) – 5 times, and fib(0) – 3 times.
We get repeated recursive calls for the same input. This is what gives the approach its exponential running time. This is where the Dynamic Programming approach comes into the picture: it reduces the time complexity drastically by avoiding repeated computation of the same recursive call.
Find a Fibonacci term using memoization (Dynamic Programming approach).

int F[20]; // global memo table

int fib(int n)
{
    if (n <= 1)
        return n;
    if (F[n] != -1)      // already computed: reuse the stored result
        return F[n];
    F[n] = fib(n-2) + fib(n-1); // recursive call
    return F[n];
}

void main(void)
{
    int i, result = 0;
    for (i = 0; i < 20; i++)
        F[i] = -1;       // -1 marks "not computed yet"
    result = fib(5);
    printf("%d", result);
}
From the above example, we can observe the following points:
If we use the memoization method to solve the same problem, we do not have to recompute the same recursive calls. For fib(5), we have to compute only 6 distinct recursive calls (fib(5), fib(4), fib(3), fib(2), fib(1) and fib(0)).
If we generalize this for fib(n), the number of distinct recursive calls will be n+1. It means the time complexity will be O(n), i.e., linear.

Note – We generally do not use the memoization method in dynamic programming, as it consumes more space due to recursion.
Note – Memoization follows the top-down approach.

Iterative method (tabulation method) for the same problem [bottom-up approach]

int F[20];
int fib(int n)
{
    if (n <= 1)
    {
        return n;
    }
    F[0] = 0;
    F[1] = 1;
    for (int i = 2; i <= n; i++)
    {
        F[i] = F[i-2] + F[i-1];  // build the table bottom-up
    }
    return F[n];
}

1. 0/1 Knapsack Problem


The knapsack problem deals with putting items in the knapsack based
on the value/profit of the items. Its aim is to maximize the value inside
the bag. In 0-1 Knapsack, you can either put the item or discard it; there
is no concept of putting some part of an item in the knapsack like
fractional knapsack.

Q. Find an optimal solution to the 0/1 Knapsack instance with n = 4 and knapsack capacity m = 8, where the profits and weights are as follows: p = {1, 2, 5, 6} and w = {2, 3, 4, 5}.

Note – If the weights are not given in increasing order, then arrange them in increasing order and rearrange the profits accordingly.

The matrix (mat[5][9]) will contain 9 columns (as capacity m = 8 is given) and 5 rows (as n = 4 is given).

Pi = profits
Wi = weights
i = objects

Formula to fill the cells: mat[i, w] = max( mat[i-1, w], mat[i-1, w - weight[i]] + p[i] )

Shortcut to fill the table:

1. Fill the first row and the first column with zeros.

2. For the first object, check its weight (w1), which is 2. We have capacity w = 2, so place the profit of this object in the cell having a capacity of 2 units (mat[1][2] = 1). So far we have only one object to consider, and the first object (i = 1), having 2 units of weight, also fits in a knapsack of capacity 3, 4, 5, 6, 7 and 8 units. Therefore, fill mat[1][3], mat[1][4], mat[1][5], mat[1][6], mat[1][7] and mat[1][8] with 1.

3. For the remaining cells, we take the maximum of the cell to the left and the cell above the current cell. For example, for mat[1][1] we pick max(mat[1][0], mat[0][1]), which is 0. Therefore, place zero in mat[1][1].

4. For the second object, the weight is 3 units. Now we can consider two objects (1 and 2) together. The second object, having 3 units of weight, can be placed in the cell [2][3], having 3 units of capacity. Both objects together have 5 units of weight, which can be placed in the cells [2][5], [2][6], [2][7] and [2][8], having at least 5 units of capacity. For the cell [2][2], pick max(mat[2][1], mat[1][2]), which is 1. Follow the same rule for the cell [2][4].

5. For the third object, the weight is 4 units. Now we can consider three objects (1, 2 and 3) together. The weight of the third object is 4 units, so we can place its profit (5) in the cell [3][4], having 4 units of capacity. Objects 2 and 3 together have 7 units of weight and 7 units of profit (5 + 2), so we can place them in the cell [3][7], having 7 units of capacity. Objects 1 and 3 together have 6 units of weight, so we can place them in the cell [3][6], having 6 units of capacity. To fill the remaining cells, follow the steps above.

Maximum profit = 8 (placed in the last cell of the matrix)

Selection of objects Xi = (X1, X2, X3, X4) = (0, 1, 0, 1)
Only objects 2 and 4 are placed in the knapsack to gain the maximum profit.

                          Capacity (w)
Objects (i)   Pi   Wi   0  1  2  3  4  5  6  7  8
0             -    -    0  0  0  0  0  0  0  0  0
1             1    2    0  0  1  1  1  1  1  1  1
2             2    3    0  0  1  2  2  3  3  3  3
3             5    4    0  0  1  2  5  5  6  7  7
4             6    5    0  0  1  2  5  6  6  7  8
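A hedged C sketch of the same tabulation; p[] and wt[] are 1-indexed so the indices match the table above:

/* p[1..n], wt[1..n]; returns mat[n][m], the maximum profit for capacity m. */
int knapsack01(const int p[], const int wt[], int n, int m)
{
    int mat[n + 1][m + 1];
    for (int w = 0; w <= m; w++) mat[0][w] = 0;      /* row 0: no objects */
    for (int i = 1; i <= n; i++) {
        for (int w = 0; w <= m; w++) {
            mat[i][w] = mat[i - 1][w];               /* skip object i */
            if (wt[i] <= w) {                        /* or take object i */
                int take = mat[i - 1][w - wt[i]] + p[i];
                if (take > mat[i][w]) mat[i][w] = take;
            }
        }
    }
    return mat[n][m];
}

With p[] = {0, 1, 2, 5, 6}, wt[] = {0, 2, 3, 4, 5}, n = 4 and m = 8, it returns 8, matching the table.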

2. Single Source Shortest Path using the Bellman-Ford Algorithm (Dynamic Programming)
Kindly refer to the Unit-03 notes.

3. All Pairs Shortest Path (Floyd-Warshall Algorithm)

Apply the Floyd-Warshall algorithm on the graph below:

[Figure: a weighted directed graph on vertices 1, 2, 3, 4 with edge weights 3, 8, 7, 2, 5, 2, 1.]
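A hedged C sketch of the Floyd-Warshall dynamic program (O(V^3)); N and INF are assumptions of the sketch, with INF standing for a missing edge:

#define N 4
#define INF 99999   /* large enough to act as "no edge", small enough not to overflow */

/* dist[][] starts as the weight matrix (0 on the diagonal) and ends as the
   all-pairs shortest-path matrix. */
void floyd_warshall(int dist[N][N])
{
    for (int k = 0; k < N; k++)          /* allow vertex k as an intermediate */
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                if (dist[i][k] + dist[k][j] < dist[i][j])
                    dist[i][j] = dist[i][k] + dist[k][j];
}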

4. Matrix Chain Multiplication Problem

We are given n matrices A1, A2, …. , An and asked in what order these
matrices should be multiplied so that it would take a minimum number of
computations to derive the result.

Two matrices are called compatible only if the number of columns in the first
matrix and the number of rows in the second matrix are the same. Matrix
multiplication is possible only if they are compatible. Let A and B be two
compatible matrices of dimensions p x q and q x r. Each element of each row
of the first matrix is multiplied with corresponding elements of the appropriate
column in the second matrix.

The total number of multiplications required to multiply matrix A and matrix B is p x q x r.

Suppose dimension of two matrices are :

A1 = 5 x 4

A2 = 4 x 3

Resultant matrix will have 15 elements ( 5 rows and 3 columns ), and each
element in the resultant matrix is derived using 4 multiplications. It means 60
(5 x 4 x 3) multiplications are required.

We cannot multiply A2 = (4 x 3) and A1 = (5 x 4) as the number of columns of A2 and the number of rows of A1 are different. Therefore, we can parenthesize A1 and A2 in one way only, i.e., (A1 x A2).

Suppose dimension of three matrices are :


A1 = 5 x 4
A2 = 4 x 6
A3 = 6 x 2

1. In how many ways can we parenthesize them?


2. How many multiplications are required to derive the resultant matrix?
Formula to find out the number of valid parenthesizations (the Catalan number):
(1/n) x 2(n-1)C(n-1)   [ i.e., (1/n) x C(2(n-1), n-1) ]

For n = 3:
(1/3) x 4C2
= (1/3) x 4! / (2! x (4 - 2)!)
= (1/3) x (4 x 3 x 2!) / (2! x 2!)
= (1/3) x (4 x 3) / 2!
= 2 (We can parenthesize these three matrices in only two ways.)

A. A1 (A2 x A3) [ first possible order of multiplication ]

(5 x 4) { (4 x 6) (6 x 2) } [ here the last two matrices require 48 multiplications ]
(5 x 4) (4 x 2) [ here these two matrices require 40 multiplications ]
In total, 88 multiplications are required.

B. (A1 x A2) A3 [ second possible order of multiplication ]

{ (5 x 4) (4 x 6) } (6 x 2) [ here the first two matrices require 120 multiplications ]
(5 x 6) (6 x 2) [ here these two matrices require 60 multiplications ]
In total, 180 multiplications are required.

Both multiplication orders produce the same resultant matrix, having 5 rows and 2 columns, but the numbers of multiplications differ. This leads to the question: what order should be selected for a chain of matrices to minimize the number of multiplications and thus reduce the time complexity?
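
Dynamic programming answers this by trying every split point k for every subchain Ai .. Aj. A minimal Python sketch of the standard recurrence m[i, j] = min over i ≤ k < j of ( m[i, k] + m[k+1, j] + d(i-1) x d(k) x d(j) ), with d taken from the dimension list; the names are my own:

import sys

def matrix_chain_order(dims):
    # dims[i-1] x dims[i] is the dimension of matrix Ai, for i = 1 .. n.
    n = len(dims) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]   # m[i][j]: min cost for Ai..Aj
    for length in range(2, n + 1):              # chain length
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = sys.maxsize
            for k in range(i, j):               # split between Ak and Ak+1
                cost = m[i][k] + m[k + 1][j] + dims[i - 1] * dims[k] * dims[j]
                m[i][j] = min(m[i][j], cost)
    return m[1][n]

# A1 = 5x4, A2 = 4x6, A3 = 6x2: prints 88, matching order A above.
print(matrix_chain_order([5, 4, 6, 2]))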

Hamiltonian Cycle (using backtracking)


A Hamiltonian Cycle is a cycle in a graph that visits every vertex exactly once and returns to
the starting vertex.
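
A minimal backtracking sketch in Python (the names and the example graph are my own): fix a start vertex, extend the path one unvisited adjacent vertex at a time, and backtrack whenever the partial path cannot be extended:

def hamiltonian_cycle(adj):
    # Backtracking search; adj is an n x n 0/1 adjacency matrix.
    # Returns one Hamiltonian cycle as a list of vertices, or None.
    n = len(adj)
    path = [0]                                   # fix vertex 0 as the start

    def extend():
        if len(path) == n:                       # all vertices placed:
            return adj[path[-1]][path[0]] == 1   # can we close the cycle?
        for v in range(1, n):
            if v not in path and adj[path[-1]][v] == 1:
                path.append(v)                   # choose v
                if extend():
                    return True
                path.pop()                       # backtrack
        return False

    return path + [0] if extend() else None

g = [[0, 1, 0, 1],        # the 4-cycle 0-1-2-3-0
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [1, 0, 1, 0]]
print(hamiltonian_cycle(g))   # [0, 1, 2, 3, 0]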

Graph Coloring Problem (using backtracking)

The goal is to assign colors to the vertices of a graph in such a way that no two adjacent
vertices (connected by an edge) have the same color, while using the minimum number of
colors.
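
Below is a backtracking sketch in Python for the decision version ("can the graph be properly colored with m colors?"); trying m = 1, 2, 3, ... in turn then gives the minimum number of colors. The names and the example graph are my own:

def m_coloring(adj, m):
    # Backtracking: color vertices 0 .. n-1 with colors 1 .. m so that
    # no edge joins two vertices of the same color. Returns a coloring or None.
    n = len(adj)
    color = [0] * n                              # 0 = not yet colored

    def safe(v, c):
        return all(not (adj[v][u] and color[u] == c) for u in range(n))

    def solve(v):
        if v == n:
            return True                          # every vertex is colored
        for c in range(1, m + 1):
            if safe(v, c):
                color[v] = c                     # choose
                if solve(v + 1):
                    return True
                color[v] = 0                     # backtrack
        return False

    return color if solve(0) else None

g = [[0, 1, 1, 0],        # a triangle 0-1-2 plus the edge 2-3
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]]
print(m_coloring(g, 2))   # None: two colors are not enough
print(m_coloring(g, 3))   # [1, 2, 3, 1]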

Travelling Salesman Problem using Least Cost Branch and Bound

In this problem, a salesman must visit n cities. We can say that the salesman wishes to make
a tour, visiting each city exactly once and finishing at the city he starts from. The goal is
to find a tour of minimum cost. We assume that every two cities are connected, so we can
model the cities as a complete graph of n vertices, where each vertex represents a city.
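
As a rough illustration of the branch-and-bound idea in code, here is a simplified depth-first version in Python: it prunes a branch as soon as the partial tour cost reaches the best complete tour found so far, rather than using the reduced-cost-matrix lower bound of the least-cost method. The names and the cost matrix are my own:

import sys

def tsp_branch_and_bound(cost):
    # Simplified DFS branch and bound for TSP on a complete graph.
    # cost[i][j] is the edge cost; the tour starts and ends at city 0.
    n = len(cost)
    best = [sys.maxsize]                         # best complete-tour cost so far

    def search(city, visited, so_far):
        if so_far >= best[0]:                    # bound: kill this branch
            return
        if len(visited) == n:                    # all cities visited: close tour
            best[0] = min(best[0], so_far + cost[city][0])
            return
        for nxt in range(1, n):
            if nxt not in visited:
                visited.add(nxt)                 # branch on the next city
                search(nxt, visited, so_far + cost[city][nxt])
                visited.remove(nxt)              # backtrack

    search(0, {0}, 0)
    return best[0]

c = [[0, 10, 15, 20],     # a symmetric 4-city instance
     [10, 0, 35, 25],
     [15, 35, 0, 30],
     [20, 25, 30, 0]]
print(tsp_branch_and_bound(c))   # 80 (tour 0-1-3-2-0)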

Unit-05

Problems are divided into two categories: solvable and unsolvable.

Solvable problems – A problem is said to be solvable if we either know a
solution or are able to prove mathematically that the problem cannot be
solved.

Unsolvable problems – A problem is said to be unsolvable if we neither know
a solution nor are able to prove mathematically that the problem cannot be
solved. It means that a problem currently in the unsolvable domain moves
into the solvable domain as soon as either a solution or an impossibility
proof is found. For example, the exact time complexity of Shell sort is
still an open problem.

Solvable problems are further divided into two categories: decidable and undecidable.

Decidable problems – A problem is said to be decidable if we are able to
predict the time needed to solve it. It means that we have an algorithm, as
well as a procedure, to solve the problem. For example, the sorting problem.

Undecidable problems – A problem is said to be undecidable if we are not
able to predict the time needed to solve it. It means that we have only a
procedure to solve the problem but not an algorithm, which is what lets us
predict the time. For example, if I ask, "Is it possible to become the PM of
India?", the answer is yes, as there is a certain procedure to become the PM
of India, but the time this takes cannot be predicted.
Q 1. Explain the complexity classes P, NP, NPC and NP-hard. How are
they related to each other?

P class – P stands for polynomial. It is the set of problems which can be
solved, as well as verified, in polynomial time. Linear Search O(n), Binary
Search O(log n), Merge Sort O(n log n), Heap Sort O(n log n), etc., are
examples of algorithms which solve their problems in polynomial time.
Note – All the algorithms we studied before dynamic programming
belong to class P.

NP class – NP stands for non-deterministic polynomial. It is the set of
decision problems for which there exists a polynomial-time verification
algorithm. For example, for TSP, so far (we don't know about the future)
we have been unable to find any polynomial-time solution; but, given a
solution to a TSP instance, we can verify it in polynomial time.

Note – If a problem belongs to P, then by default it also belongs to NP,
because its solutions can be verified in polynomial time; but the converse
is not known to hold.

[Figure: a circle labelled NP, with P contained inside it.]

As of now, the problems in NP minus P (NP − P) have not been solved in
polynomial time. We don't know whether these problems (TSP, 0/1 Knapsack,
etc.) can be solved in polynomial time in the future or not.

NP-hard class – If every problem in NP is polynomial-time reducible to a
problem A, then A is called NP-hard. If A could be solved in polynomial
time, then, by default, every problem in NP would become P.

NP-complete class – A problem is said to be NP-complete if it is NP as well as NP-hard.
[Figure: Venn diagram showing P inside NP, and NPC as the intersection of NP and NP-hard.]

Fig - How P, NP, NP hard and NP complete are related to each other.
Q2. What are approximation algorithms? What is meant by P (n)
approximation algorithms? Discuss approximation algorithm for Vertex
cover problem.
An approximation algorithm is a way of dealing with NP-completeness for
an optimization problem. This technique does not guarantee the best
solution. The goal of the approximation algorithm is to come as close as
possible to the optimal solution in polynomial time.

Some examples of the Approximation algorithm :


1. The Vertex Cover Problem
2. Travelling Salesman Problem
3. The Set Covering Problem
4. The Subset Sum Problem

If an algorithm reaches an approximation ratio of P(n), then we call it a
P(n)-approximation algorithm.

C = cost of the solution produced by the algorithm
C* = cost of an optimal solution

 For a maximization problem, 0 < C ≤ C*, and the ratio C*/C (the approximation
ratio) gives the factor by which the cost of an optimal solution is larger than
the cost of the approximate solution.
 For a minimization problem, 0 < C* ≤ C, and the ratio C/C* gives the factor
by which the cost of the approximate solution is larger than the cost of an
optimal solution.

Vertex Cover Problem – Given an undirected graph, the vertex cover problem
is to find a minimum-size vertex cover. Although the name is Vertex Cover, the
set covers all edges of the given graph. It is a minimization problem because we
have to find a set containing the minimum number of vertices that covers all
edges of the given undirected graph.

Approximation algorithm for vertex cover problem

[ Important for your university exam ]

Example –

[Figure: an undirected graph with vertices a, b, c, d and e, and edges (a, b), (a, c), (c, d), (b, d) and (d, e).]

Solution –
Line 1. C = solution set, which is empty in the beginning.
Line 2. E′ = { (a, b), (a, c), (c, d), (b, d), (d, e) } // the set of edges
Line 3. while E′ != empty
Line 4. pick an arbitrary edge from E′; it is added to the solution set on line 5.
Suppose we pick (a, b) from E′; then

Line 5. C = C U { a, b }
C = { a, b } // as of now, the solution set contains two vertices
Line 6. Remove every edge incident on either vertex a or vertex b. Therefore,
remove the following edges from E′:
(a, b), (a, c) and (b, d) [ these three edges are incident on a or b ]
Now, E′ = { (c, d), (d, e) }
As E′ is not empty, add another arbitrary edge from E′ to the solution
set C. Let's take the edge (c, d) now.
C = { a, b, c, d }
Using line number 6, remove every edge incident on either vertex c or d.
Therefore, remove (c, d) and (d, e) from E′. E′ is now empty, so return C
using line number 7; it contains the four vertices a, b, c and d.
C covers all edges of the graph. It is not necessarily a minimum cover
(here { a, d } is an optimal cover of size 2), but its size (4) is at most
twice the optimal, which is exactly the guarantee of this 2-approximation.
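
The same procedure can be written as a short Python sketch (the names are my own); on the edge set above it reproduces the trace just given:

def approx_vertex_cover(edges):
    # 2-approximation: repeatedly pick an arbitrary remaining edge (u, v),
    # put both endpoints into the cover, then delete every edge touching u or v.
    remaining = list(edges)
    cover = set()
    while remaining:
        u, v = remaining[0]                      # an arbitrary edge
        cover |= {u, v}
        remaining = [(a, b) for (a, b) in remaining
                     if a not in (u, v) and b not in (u, v)]
    return cover

edges = [('a', 'b'), ('a', 'c'), ('c', 'd'), ('b', 'd'), ('d', 'e')]
print(approx_vertex_cover(edges))   # {'a', 'b', 'c', 'd'}, as in the trace above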
Randomized algorithm – An algorithm that uses random numbers to decide
what to do next anywhere in its logic is called a randomized algorithm. For
example, in Randomized Quick Sort, we use a random number to pick the next
pivot (or we randomly shuffle the array). Typically, this randomness is used
to reduce the time or space complexity compared with other standard
algorithms.

String Matching Algorithms


String matching algorithms, sometimes called string searching
algorithms, are an important class of string algorithms that try to find
a place where one or several strings (also called the pattern) occur
within a larger string or text. For example,
if the text array is T[1 . . . n] and the pattern array is P[1 . . . m], then we have
to find P in T. Therefore, the length of P must be less than or equal to the
length of T. Both the pattern and the searched text are built over ∑ (a
finite alphabet), which may consist of, e.g., the English letters or the
binary digits 0 and 1.

We have the following algorithms to search a pattern in the given text:


1. Naive String-Matching Algorithm.
2. Rabin-Karp String Matching Algorithm
3. Finite Automata String Matching Algorithm
4. Knuth-Morris-Pratt (KMP) String Matching Algorithm

Naive String-Matching Algorithm

The naive approach tests all possible placements of the pattern P[1 . . . m]
relative to the text T[1 . . . n]. We try the shifts s = 0, 1, . . ., n − m
successively, and for each shift s we compare T[s + 1 . . . s + m] with
P[1 . . . m].

Q. Write an algorithm for the naive string matcher.

Algorithm
NAIVE-STRING-MATCHER (T, P)
1. n ← length [T]
2. m ← length [P]
3. for s ← 0 to n -m
4. do if P [1...m] = T [s + 1....s + m]
5. then print "Pattern occurs with shift" s
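
A direct Python rendering of this pseudocode, 0-indexed so shifts are printed starting from 0 (the example text is my own):

def naive_string_matcher(T, P):
    # Try every shift s = 0 .. n-m and compare P with T[s : s+m].
    n, m = len(T), len(P)
    for s in range(n - m + 1):
        if T[s:s + m] == P:
            print("Pattern occurs with shift", s)

naive_string_matcher("acaabc", "aab")   # Pattern occurs with shift 2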


Rabin-Karp String Matching Algorithm

The Rabin–Karp algorithm is a string-searching algorithm created by Richard
M. Karp and Michael O. Rabin (1987) that uses hashing to find an exact match
of a pattern string in a text. They suggested a hash function based on
choosing a random prime number q and computing P[1 . . . m] mod q.

Algorithm
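
As a rough sketch for strings of decimal digits, the usual rolling hash t(s+1) = ( 10 ( t(s) − 10^(m−1) T[s+1] ) + T[s+m+1] ) mod q can be coded in Python as follows; the names are my own, so treat the details as an assumption rather than the exact pseudocode:

def rabin_karp(T, P, q):
    # Rolling-hash matcher for strings of decimal digits; q is a prime modulus.
    n, m = len(T), len(P)
    h = pow(10, m - 1, q)                        # weight of the leading digit
    p_hash = int(P) % q
    t_hash = int(T[:m]) % q
    for s in range(n - m + 1):
        if t_hash == p_hash:                     # hash hit: verify characters
            if T[s:s + m] == P:
                print("valid hit at shift", s)
            else:
                print("spurious hit at shift", s)
        if s < n - m:                            # roll the window one digit right
            t_hash = (10 * (t_hash - int(T[s]) * h) + int(T[s + m])) % q

rabin_karp("3141592653589793", "26", 11)
# spurious hits at shifts 3, 4 and 5; valid hit at shift 6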

Q 3. For q = 11, how many valid and spurious hits are found for
the given Text and Pattern:
T = 3141592653589793
P = 26

Solution –
Step 1 – Find p mod q:
26 mod 11 = 4

Step 2 – As p is a 2-digit number, find mod 11 of each 2-digit window of T
as follows:

T=3141592653589793

31 mod 11 = 9
14 mod 11 = 3
41 mod 11 = 8
15 mod 11 = 4
59 mod 11 = 4
92 mod 11= 4
26 mod 11 = 4
65 mod 11 = 10
53 mod 11 = 9
35 mod 11 = 2
58 mod 11 = 3
89 mod 11 = 1
97 mod 11 = 9
79 mod 11 = 2
93 mod 11 = 5

We consider only the windows which give us 4 after performing mod 11.
Therefore, we have:

15 mod 11 = 4 [ spurious hit: the hash matches, but 15 ≠ 26 ]
59 mod 11 = 4 [ spurious hit ]
92 mod 11 = 4 [ spurious hit ]
26 mod 11 = 4 [ 26 is equal to P (the pattern), so it is a valid hit ]

In total: three spurious hits (15, 59 and 92) and one valid hit (26).



Finite Automata Based String Matching Algorithm

For a given pattern P, we construct a string-matching automaton in a


preprocessing step before using it to search the text string.

Algorithm

For the pattern p = abcd


Prefixes of P = a, ab, abc, abcd [ started from the left side ]
Suffixes of P = d, cd, bcd, abcd [ started from the right side]

In order to specify the string-matching automaton corresponding to a given
pattern p[1 . . . m], we first define an auxiliary function σ, called the
suffix function. σ(x) is the length of the longest prefix of p that is also
a suffix of x. For example:

For the pattern p = abab and x = aba:
σ(x) = ?

“a” is a suffix of x as well as a prefix of p.
“aba” is a suffix of x as well as a prefix of p.
Length of “a” = 1
Length of “aba” = 3
Therefore, σ(x) = 3 (the longest such prefix counts).

Example – T = {abababacaba} and p ={ababaca}

Solution – Pattern length (m) = 7


Number of states = m+1 = 8
Q = { q0, . . ., q7 }
∑ = { a, b, c }

Prefixes of P ={ a, ab, aba, abab, ababa, ababac, ababaca}

Now, we can create a transition table for P using suffix function σ.

Transition function δ : Q x ∑  Q

δ(q0 , a) = σ(a) = 1 ( since “a” is a prefix of P and its length is 1 )

δ(q0 , b) = σ(b) = 0 ( no transition, as “b” is not a prefix of P )
δ(q0 , c) = σ(c) = 0 ( no transition, as “c” is not a prefix of P )

δ(q1 , a) = σ(aa) = 1 (only single “a” is a prefix in P)


δ(q1 , b) = σ(ab) = 2 ( “ab” is a prefix in P and its length is 2)
δ(q1 , c) = σ(ac) = 0 ( No transition as “ac” is not a prefix in P)

δ(q2 , a) = σ(aba) = 3 ( “aba” is a prefix in P and its length is 3)


δ(q2 , b) = σ(abb) = 0 ( None of its suffixes ( b, bb and abb) is in P)
δ(q2 , c) = σ(abc) = 0 ( None of its suffixes ( c, bc and abc) is in p)

δ(q3 , a) = σ(abaa) = 1 (Among suffixes (a,aa,baa,abaa) only “a” is a


prefix in P and its length is 1)
δ(q3 , b) = σ(abab) = 4 ( “abab” is a prefix in p and its length is 4)
δ(q3 , c) = σ(abac) = 0(among all suffixes ( c, ac, bac, abac), none is
there in p; therefore, no transition)

δ(q4 , a) = σ(ababa) = 5 (“ababa” is a prefix in p and its length is 5)


δ(q4 , b) = σ(ababb) = 0 (among all suffixes , none is there in p)
δ(q4 , c) = σ(ababc) = 0 (among all suffixes , none is there in p)

δ(q5 , a) = σ(ababaa) = 1( among all suffixes ( a, aa, baa, abaa,


babaa, ababaa) only “a” is prefix in p and its length is 1)
δ(q5 , b) = σ(ababab) = 4 ( among all suffixes ( b, ab, bab, abab,
babab, ababab) the longest prefix “abab” is in p and its length is 4)
δ(q5 , c) = σ(ababac) = 6 (“ababac” is a prefix in p and its length is
6)

δ(q6 , a) = σ(ababaca) = 7 (“ababaca” is a prefix in p and its length


is 7)
δ(q6 , b) = σ(ababacb) = 0
δ(q6 , c) = σ(ababacc) = 0

δ(q7 , a) = σ(ababacaa) = 1( among all suffixes( a, aa, caa,….) only


“a” is in p)
δ(q7 , b) = σ(ababacab) =2 (among all suffixes ( b, ab, cab, …) only
ab is in p)
δ(q7 , c) = σ(ababacac) = 0 ( none of its suffixes is a prefix of p )
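
The σ computation above can be checked mechanically with a short Python sketch (the names are my own; the while-loop simply shrinks k until the prefix-equals-suffix condition of σ holds):

def build_transition_table(P, alphabet):
    # delta[(q, c)] = sigma(P[:q] + c): the length of the longest prefix of P
    # that is also a suffix of the matched part P[:q] extended by character c.
    m = len(P)
    delta = {}
    for q in range(m + 1):
        for c in alphabet:
            k = min(m, q + 1)
            while k > 0 and not (P[:q] + c).endswith(P[:k]):
                k -= 1                           # shrink until prefix == suffix
            delta[(q, c)] = k
    return delta

def fa_matcher(T, P, delta):
    q, m = 0, len(P)
    for i, c in enumerate(T):
        q = delta[(q, c)]                        # one table lookup per character
        if q == m:
            print("Pattern occurs with shift", i - m + 1)

delta = build_transition_table("ababaca", "abc")
print(delta[(5, 'c')], delta[(5, 'b')])          # 6 4, as computed by hand above
fa_matcher("abababacaba", "ababaca", delta)      # Pattern occurs with shift 2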

Knuth-Morris-Pratt (KMP) String Matching Algorithm


KMP is a linear-time string matching algorithm. It uses the concept of
prefixes and suffixes to generate the Π (prefix-function) table.

KMP-MATCHER (T, P)
1. n ← length [T]
2. m ← length [P]
3. Π ← COMPUTE-PREFIX-FUNCTION (P)
4. q ← 0 // number of characters matched
5. for i ← 1 to n // scan T from left to right
6.     do while q > 0 and P [q + 1] ≠ T [i]
7.         do q ← Π [q] // next character does not match
8.     if P [q + 1] = T [i]
9.         then q ← q + 1 // next character matches
10.    if q = m // is all of P matched?
11.        then print "Pattern occurs with shift" i − m
12.        q ← Π [q] // look for the next match

COMPUTE-PREFIX-FUNCTION (P)
1. m ← length [P] // P is the pattern to be matched
2. Π [1] ← 0
3. k ← 0
4. for q ← 2 to m
5.     do while k > 0 and P [k + 1] ≠ P [q]
6.         do k ← Π [k]
7.     if P [k + 1] = P [q]
8.         then k ← k + 1
9.     Π [q] ← k
10. return Π
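
A 0-indexed Python rendering of the two procedures above (the names are my own); it also reproduces the Π table asked for in Q4 below:

def compute_prefix_function(P):
    # pi[q] = length of the longest proper prefix of P[:q+1] that is also
    # its suffix (a 0-indexed version of the pseudocode above).
    m = len(P)
    pi = [0] * m
    k = 0
    for q in range(1, m):
        while k > 0 and P[k] != P[q]:
            k = pi[k - 1]                        # fall back to a shorter border
        if P[k] == P[q]:
            k += 1
        pi[q] = k
    return pi

def kmp_matcher(T, P):
    pi, q = compute_prefix_function(P), 0
    for i, c in enumerate(T):
        while q > 0 and P[q] != c:
            q = pi[q - 1]                        # mismatch: reuse the Pi table
        if P[q] == c:
            q += 1
        if q == len(P):                          # whole pattern matched
            print("Pattern occurs with shift", i - len(P) + 1)
            q = pi[q - 1]

kmp_matcher("abababacaba", "ababaca")            # Pattern occurs with shift 2
print(compute_prefix_function("ababbabbabbababbabb"))
# [0, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 3, 4, 5, 6, 7, 8]  (the Q4 table below)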

Prefix:

Definition: A prefix of a string is any substring that starts at the
beginning of the string (read from left to right) and includes zero or more
characters, but does not extend to the end of the string.
Example: For the string "abcd": "a", "ab" and "abc" are all prefixes.

Suffix:

Definition: A suffix of a string is any substring that ends at the last
position of the string (read from right to left) and includes zero or more
characters, but does not extend to the beginning of the string.
Example: For the string "abcd": "d", "cd" and "bcd" are all suffixes.

Q 4. Compute the prefix function Π for the pattern ababbabbabbababbabb
when the alphabet is ∑ = {a, b}.

Π – It is also called the longest prefix which is also a suffix (LPS).

Index  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19
p      a  b  a  b  b  a  b  b  a  b  b  a  b  a  b  b  a  b  b
Π      0  0  1  2  0  1  2  0  1  2  0  1  2  3  4  5  6  7  8

Note – Use any short-cut trick to prepare LPS or Π table in the exam.

Fast Fourier Transform (FFT)

An FFT algorithm computes the discrete Fourier transform (DFT) of a
sequence, or its inverse DFT, in O(n log n) time.
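
A minimal recursive radix-2 sketch in Python (the names are my own; the input length must be a power of two):

import cmath

def fft(a):
    # Recursive radix-2 FFT; len(a) must be a power of two.
    n = len(a)
    if n == 1:
        return a
    even, odd = fft(a[0::2]), fft(a[1::2])       # divide: even / odd indices
    out = [0] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * cmath.pi * k / n)    # twiddle factor
        out[k] = even[k] + w * odd[k]            # conquer: butterfly combine
        out[k + n // 2] = even[k] - w * odd[k]
    return out

print([round(abs(x), 3) for x in fft([1, 1, 1, 1])])   # [4.0, 0.0, 0.0, 0.0]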

All the best 😊
