
Abstract Data Types

An Abstract Data Type (ADT) is a type (or class) for objects whose behaviour
is defined by a set of values and a set of operations. The definition of an ADT
only mentions what operations are to be performed, not how these
operations will be implemented. It does not specify how the data will be
organized in memory or what algorithms will be used to implement
the operations. It is called “abstract” because it gives an implementation-
independent view. The process of providing only the essentials and
hiding the details is known as abstraction.
The user of a data type does not need to know how that data type is implemented. For
example, we have been using the int, float and char data types knowing only
the values they can take and the operations that can be
performed on them, without any idea of how these types are
implemented. So a user only needs to know what a data type can do, not
how it does it. We can think of an ADT as a black box which hides the
inner structure and design of the data type. Now we’ll define three ADTs,
namely the List ADT, Stack ADT and Queue ADT.

List ADT
A list contains elements of the same type arranged in sequential order, and the
following operations can be performed on the list.

get() – Return an element from the list at any given position.


insert() – Insert an element at any position of the list.
remove() – Remove the first occurrence of any element from a non-
empty list.
removeAt() – Remove the element at a specified location from a non-
empty list.
replace() – Replace an element at any position by another element.
size() – Return the number of elements in the list.
isEmpty() – Return true if the list is empty, otherwise return false.
isFull() – Return true if the list is full, otherwise return false.
Stack ADT
A Stack contains elements of the same type arranged in sequential order. All
operations take place at a single end, called the top of the stack, and the
following operations can be performed:
push() – Insert an element at one end of the stack called top.
pop() – Remove and return the element at the top of the stack, if it is not
empty.
peek() – Return the element at the top of the stack without removing it, if
the stack is not empty.
size() – Return the number of elements in the stack.
isEmpty() – Return true if the stack is empty, otherwise return false.
isFull() – Return true if the stack is full, otherwise return false.

Queue ADT
A Queue contains elements of the same type arranged in sequential order.
Operations take place at both ends: insertion is done at the rear and deletion
is done at the front. The following operations can be performed:
enqueue() – Insert an element at the end of the queue.
dequeue() – Remove and return the first element of queue, if the queue
is not empty.
peek() – Return the element at the front of the queue without removing it, if the
queue is not empty.
size() – Return the number of elements in the queue.
isEmpty() – Return true if the queue is empty, otherwise return false.
isFull() – Return true if the queue is full, otherwise return false.
From these definitions, we can clearly see that the definitions do not
specify how these ADTs will be represented and how the operations will
be carried out. There can be different ways to implement an ADT, for
example, the List ADT can be implemented using arrays, or singly linked
list or doubly linked list. Similarly, stack ADT and Queue ADT can be
implemented using arrays or linked lists.
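To make the idea of an ADT concrete, here is a minimal sketch (not from the original text) of a Stack ADT in C: the function prototypes form the abstract interface, while the array and top index form one possible hidden implementation. The names and the fixed capacity are illustrative choices.

#include <stdio.h>
#include <stdbool.h>

#define MAX 100                      /* illustrative capacity */

/* The interface: what the Stack ADT can do, independent of representation. */
bool push(int value);
bool pop(int *value);
bool isEmpty(void);
bool isFull(void);

/* One possible implementation: a fixed-size array with a top index. */
static int items[MAX];
static int top = -1;

bool isEmpty(void) { return top == -1; }
bool isFull(void)  { return top == MAX - 1; }

bool push(int value) {
   if (isFull()) return false;
   items[++top] = value;             /* insert at the top */
   return true;
}

bool pop(int *value) {
   if (isEmpty()) return false;
   *value = items[top--];            /* remove and return the top element */
   return true;
}

int main() {
   int v;
   push(10);
   push(20);
   while (pop(&v)) printf("%d ", v); /* prints: 20 10 */
   return 0;
}

The same interface could instead be backed by a linked list without changing any code that uses only push, pop, isEmpty and isFull.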

Asymptotic Analysis:
Asymptotic analysis of an algorithm refers to defining the mathematical
bounds of its run-time performance. Using asymptotic
analysis, we can conclude the best-case, average-case, and
worst-case scenarios of an algorithm.

Asymptotic analysis is input-bound, i.e., if there is no input to the
algorithm, it is concluded to work in constant time. Other than the
"input", all other factors are considered constant.

Asymptotic analysis refers to computing the running time of any
operation in mathematical units of computation. For example, the
running time of one operation may be computed as f(n) and for
another operation as g(n²). This means the running time of the first operation
will increase linearly with the increase in n, while the running
time of the second operation will increase quadratically (as the square of n)
when n increases. Similarly, the running times of both operations will be
nearly the same if n is significantly small.

Usually, the time required by an algorithm falls under three types −

 Best Case − Minimum time required for program execution.

 Average Case − Average time required for program execution.

 Worst Case − Maximum time required for program execution.

Asymptotic Notations
Following are the commonly used asymptotic notations to calculate the
running time complexity of an algorithm.

 Ο Notation

 Ω Notation

 θ Notation

Big Oh Notation, Ο
The notation Ο(n) is the formal way to express the upper bound of an
algorithm's running time. It measures the worst-case time complexity, or
the longest amount of time an algorithm can possibly take to complete.

For example, for a function f(n) −

Ο(f(n)) = { g(n) : there exist constants c > 0 and n0 such that g(n) ≤ c·f(n) for all n > n0 }

Omega Notation, Ω
The notation Ω(n) is the formal way to express the lower bound of an
algorithm's running time. It measures the best-case time complexity, or
the minimum amount of time an algorithm can possibly take to complete.
For example, for a function f(n) −
Ω(f(n)) = { g(n) : there exist constants c > 0 and n0 such that g(n) ≥ c·f(n) for all n > n0 }

Theta Notation, θ
The notation θ(n) is the formal way to express both the lower bound and
the upper bound of an algorithm's running time. It is represented as
follows −

θ(f(n)) = { g(n) : g(n) = Ο(f(n)) and g(n) = Ω(f(n)) }
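A short worked example: for f(n) = 3n + 2 we have 3n ≤ 3n + 2 ≤ 4n for all n ≥ 2, so choosing c1 = 3, c2 = 4 and n0 = 2 shows that 3n + 2 = Ω(n) and 3n + 2 = Ο(n), and therefore 3n + 2 = θ(n).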

Common Asymptotic Notations


Following is a list of some common asymptotic notations −

constant − Ο(1)

logarithmic − Ο(log n)

linear − Ο(n)

n log n − Ο(n log n)

quadratic − Ο(n²)

cubic − Ο(n³)

polynomial − n^Ο(1)

exponential − 2^Ο(n)

Sorting and Searching Algorithms − Time Complexity Cheat Sheet

[Figure omitted: time-complexity cheat sheet and Big-O growth-rate graph.]

Interpolation Search:

Interpolation search is an improved variant of binary search. This search
algorithm works on the probing position of the required value. For this
algorithm to work properly, the data collection should be in sorted form
and equally distributed.

Binary search has a huge advantage of time complexity over linear
search. Linear search has a worst-case complexity of Ο(n) whereas binary
search has Ο(log n).

There are cases where the location of the target data may be known in
advance. For example, in the case of a telephone directory, if we want to
search for the telephone number of Morphius, linear search and even
binary search will seem slow, as we can directly jump to the memory space
where the names starting with 'M' are stored.
Positioning in Binary Search
In binary search, if the desired data is not found, the rest of the list
is divided in two parts, lower and higher, and the search is carried out in
either of them.

Even when the data is sorted, binary search does not take advantage of it to
probe the position of the desired data.

Position Probing in Interpolation Search


Interpolation search finds a particular item by computing the probe
position. Initially, the probe position is the position of the middle most
item of the collection.

If a match occurs, then the index of the item is returned. To split the list
into two parts, we use the following method −
mid = Lo + ((Hi - Lo) / (A[Hi] - A[Lo])) * (X - A[Lo])

where −
A = list
Lo = Lowest index of the list
Hi = Highest index of the list
A[n] = Value stored at index n in the list
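As a quick check using the sorted list from the C example below (10, 14, 19, 26, 27, 31, 33, 35, 42, 44), probing for X = 33 with Lo = 0 and Hi = 9 gives mid = 0 + ((9 − 0) / (44 − 10)) × (33 − 10) ≈ 6.09, which truncates to index 6, and A[6] = 33, so a single probe finds the target.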

If the middle item is greater than the item being searched, then the probe position is
again calculated in the sub-array to the left of the middle item.
Otherwise, the item is searched in the sub-array to the right of the middle
item. This process continues on the sub-array until the size of the
sub-array reduces to zero.
The runtime complexity of the interpolation search algorithm is Ο(log(log n)),
as compared to Ο(log n) for binary search, in favorable situations.

Algorithm
As it is an improvement on the existing binary search algorithm, we list
the steps to search for the 'target' data value index, using
position probing −
Step 1 − Start searching data from the middle of the list.
Step 2 − If it is a match, return the index of the item, and exit.
Step 3 − If it is not a match, probe the position.
Step 4 − Divide the list using the probing formula and find the new middle.
Step 5 − If the data is greater than the middle, search in the higher sub-list.
Step 6 − If the data is smaller than the middle, search in the lower sub-list.
Step 7 − Repeat until a match is found.

Pseudocode
A → Array list
N → Size of A
X → Target Value

Procedure Interpolation_Search()
   Set Lo → 0
   Set Mid → -1
   Set Hi → N-1

   While X does not match
      if Lo equals to Hi OR A[Lo] equals to A[Hi]
         EXIT: Failure, Target not found
      end if

      Set Mid = Lo + ((Hi - Lo) / (A[Hi] - A[Lo])) * (X - A[Lo])

      if A[Mid] = X
         EXIT: Success, Target found at Mid
      else
         if A[Mid] < X
            Set Lo to Mid+1
         else if A[Mid] > X
            Set Hi to Mid-1
         end if
      end if
   End While
End Procedure


Implementation in C

#include <stdio.h>

#define MAX 10

// array of items on which the interpolation search will be conducted
int list[MAX] = { 10, 14, 19, 26, 27, 31, 33, 35, 42, 44 };

int find(int data) {
   int lo = 0;
   int hi = MAX - 1;
   int mid = -1;
   int comparisons = 1;
   int index = -1;

   while(lo <= hi) {
      printf("\nComparison %d \n", comparisons);
      printf("lo : %d, list[%d] = %d\n", lo, lo, list[lo]);
      printf("hi : %d, list[%d] = %d\n", hi, hi, list[hi]);

      comparisons++;

      // probe the mid point using the interpolation formula
      mid = lo + (((double)(hi - lo) / (list[hi] - list[lo])) * (data - list[lo]));
      printf("mid = %d\n", mid);

      // data found
      if(list[mid] == data) {
         index = mid;
         break;
      } else {
         if(list[mid] < data) {
            // if data is larger, it lies in the upper half
            lo = mid + 1;
         } else {
            // if data is smaller, it lies in the lower half
            hi = mid - 1;
         }
      }
   }

   printf("\nTotal comparisons made: %d", --comparisons);
   return index;
}

int main() {
   // find location of 33
   int location = find(33);

   // if element was found
   if(location != -1)
      printf("\nElement found at location: %d", (location + 1));
   else
      printf("\nElement not found.");

   return 0;
}

If we compile and run the above program, it will produce the following
result −

Output
Comparison 1
lo : 0, list[0] = 10
hi : 9, list[9] = 44
mid = 6

Total comparisons made: 1


Element found at location: 7

Binary tree sort:


#include <stdio.h>
#include <stdlib.h>

struct node
{
    int data;
    struct node *left, *right;
};

struct node *root;

void ins(struct node *, int, int);
void inser(struct node *, int);
void display(struct node *);

int main()
{
    int choice, no = 0, parentnode;

    root = (struct node *) malloc(sizeof(struct node));
    printf("\n Enter a number for parent node : ");
    scanf("%d", &parentnode);
    root->data = parentnode;
    root->left = root->right = NULL;

    do {
        printf("\n 1.Add element");
        printf("\n 2.Sort");
        printf("\n 3.Exit");
        printf("\n Enter your choice : ");
        scanf("%d", &choice);

        switch(choice)
        {
            case 1:
                printf("Enter the element to insert : \n");
                scanf("%d", &no);
                inser(root, no);
                break;
            case 2:
                printf("\n Sorted elements are : \n");
                display(root);
                break;
            default:
                printf("\n Invalid choice...");
                exit(0);
        }
    } while(choice < 3);

    return 0;
}

/* Create a new node with the given value and attach it to the left
   (opt == 1) or right (opt == 2) of node n. */
void ins(struct node *n, int value, int opt)
{
    struct node *t;

    t = (struct node *) malloc(sizeof(struct node));
    t->data = value;
    t->left = t->right = NULL;

    if(opt == 1)
        n->left = t;
    else
        n->right = t;

    printf("%d is inserted", value);

    if(opt == 1)
        printf(" at the left \n");
    else
        printf(" at the right \n");
}

/* Recursively find the correct position for x in the binary search tree. */
void inser(struct node *t, int x)
{
    if(t->data > x)
        if(t->left == NULL)
            ins(t, x, 1);
        else
            inser(t->left, x);
    else if(t->data < x)
        if(t->right == NULL)
            ins(t, x, 2);
        else
            inser(t->right, x);
    else
        printf(" Element already exists in the list ");
}

/* In-order traversal prints the elements in sorted order. */
void display(struct node *p)
{
    if(p != NULL)
    {
        display(p->left);
        printf("%5d", p->data);
        display(p->right);
    }
}

Time Complexity: The worst-case time complexity of the search and insert operations is
O(h), where h is the height of the Binary Search Tree. In the worst case, we may have to travel from
the root to the deepest leaf node. The height of a skewed tree may become n, and the time
complexity of the search and insert operations may then become O(n).

Dictionary Abstract Data Type:

A Dictionary ADT stores elements so that they can be quickly located using keys. Typically, an
item carries useful additional information in addition to its key. Examples: bank accounts with the
SSN as key, or student records with the UMID or uniqname as key.

Dictionary ADT implementations: Log File, Ordered Dictionary, Hash Table and Skip List.

Operations: Search, Insertion, Removal.

Items are stored as key–element pairs (k, e); k and e may be of any type, and k and e may even be
the same. In general, items with the same key may be stored in the same dictionary.

Unordered vs Ordered:

Ordered: the relative order is determined by a comparator between the keys; a total order relation is
defined on the keys. Unordered: no order relation is assumed on the keys; only equality testing
between keys is used.

Dictionary ADT: Log File

Definition: an implementation of the Dictionary ADT using a sequence that stores items in arbitrary
order (hence an unordered dictionary). It is a useful implementation for the case with many
insertions and few searches, and is implemented as an array (vector) or a linked list.

Applications of log files: database systems, file systems and security audit trails.
Dictionary ADT: Ordered Dictionary

Definition: an implementation of the Dictionary ADT in which the usual operations may be used and
there exists an order relationship between the keys. It is a useful implementation for few insertions
and removals, but many searches.
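As an illustration (not from the original notes), a log-file dictionary can be sketched in C as an unsorted array of key–element pairs: insertion appends in O(1), while search scans linearly. The struct and function names are illustrative.

#include <stdio.h>
#include <string.h>

#define MAX_ITEMS 100

/* A key–element pair; here the key is an int and the element a short string. */
struct item {
    int key;
    char value[32];
};

static struct item log_file[MAX_ITEMS];
static int count = 0;

/* Insertion: simply append at the end (items stay in arbitrary order). */
int dict_insert(int key, const char *value) {
    if (count == MAX_ITEMS) return -1;
    log_file[count].key = key;
    strncpy(log_file[count].value, value, sizeof(log_file[count].value) - 1);
    log_file[count].value[sizeof(log_file[count].value) - 1] = '\0';
    return count++;
}

/* Search: linear scan over the unsorted sequence. */
const char *dict_search(int key) {
    for (int i = 0; i < count; i++)
        if (log_file[i].key == key)
            return log_file[i].value;
    return NULL;
}

int main() {
    dict_insert(1234, "bank account A");
    dict_insert(5678, "bank account B");
    const char *found = dict_search(5678);
    printf("%s\n", found ? found : "not found");
    return 0;
}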

Priority Queue:
Overview
A Priority Queue is a more specialized data structure than a Queue. Like an
ordinary queue, a priority queue has the same methods, but with a major
difference: in a priority queue, items are ordered by key value so that the item
with the lowest key value is at the front and the item with the highest key value
is at the rear, or vice versa. So a priority is assigned to each item based on
its key value: the lower the value, the higher the priority. Following are the
principal methods of a Priority Queue.

Basic Operations
 insert / enqueue − add an item to the rear of the queue.

 remove / dequeue − remove an item from the front of the queue.

Priority Queue Representation


We're going to implement the priority queue using an array in this article. A few
more operations are supported by the queue, which are the following.

 Peek − get the element at front of the queue.

 isFull − check if queue is full.

 isEmpty − check if queue is empty.

Insert / Enqueue Operation


Whenever an element is inserted into the queue, the priority queue inserts the
item according to its order. Here we are assuming that data with a higher
value has lower priority.

void insert(int data) {
   int i = 0;

   if(!isFull()) {
      // if queue is empty, insert the data
      if(itemCount == 0) {
         intArray[itemCount++] = data;
      } else {
         // start from the right end of the queue
         for(i = itemCount - 1; i >= 0; i--) {
            // if data is larger, shift the existing item towards the right end
            if(data > intArray[i]) {
               intArray[i+1] = intArray[i];
            } else {
               break;
            }
         }

         // insert the data
         intArray[i+1] = data;
         itemCount++;
      }
   }
}

Remove / Dequeue Operation


Whenever an element is to be removed from the queue, the queue gets the
element using the item count. Once the element is removed, the item count is
reduced by one.

int removeData() {
   return intArray[--itemCount];
}

Demo Program
PriorityQueueDemo.c

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <stdbool.h>

#define MAX 6

int intArray[MAX];
int itemCount = 0;

int peek() {
   return intArray[itemCount - 1];
}

bool isEmpty() {
   return itemCount == 0;
}

bool isFull() {
   return itemCount == MAX;
}

int size() {
   return itemCount;
}

void insert(int data) {
   int i = 0;

   if(!isFull()) {
      // if queue is empty, insert the data
      if(itemCount == 0) {
         intArray[itemCount++] = data;
      } else {
         // start from the right end of the queue
         for(i = itemCount - 1; i >= 0; i--) {
            // if data is larger, shift the existing item towards the right end
            if(data > intArray[i]) {
               intArray[i+1] = intArray[i];
            } else {
               break;
            }
         }

         // insert the data
         intArray[i+1] = data;
         itemCount++;
      }
   }
}

int removeData() {
   return intArray[--itemCount];
}

int main() {
   /* insert 5 items */
   insert(3);
   insert(5);
   insert(9);
   insert(1);
   insert(12);

   // ------------------
   // index : 0 1 2 3 4
   // ------------------
   // queue : 12 9 5 3 1

   insert(15);

   // ---------------------
   // index : 0 1 2 3 4 5
   // ---------------------
   // queue : 15 12 9 5 3 1

   if(isFull()) {
      printf("Queue is full!\n");
   }

   // remove one item
   int num = removeData();
   printf("Element removed: %d\n", num);

   // ---------------------
   // index : 0 1 2 3 4
   // ---------------------
   // queue : 15 12 9 5 3

   // insert one more item
   insert(16);

   // ----------------------
   // index : 0 1 2 3 4 5
   // ----------------------
   // queue : 16 15 12 9 5 3

   // As the queue is full, these elements will not be inserted.
   insert(17);
   insert(18);

   // ----------------------
   // index : 0 1 2 3 4 5
   // ----------------------
   // queue : 16 15 12 9 5 3

   printf("Element at front: %d\n", peek());
   printf("----------------------\n");
   printf("index : 5 4 3 2 1 0\n");
   printf("----------------------\n");
   printf("Queue: ");

   while(!isEmpty()) {
      int n = removeData();
      printf("%d ", n);
   }

   return 0;
}
If we compile and run the above program then it would produce following
result −
Queue is full!
Element removed: 1
Element at front: 3
----------------------
index : 5 4 3 2 1 0
----------------------
Queue: 3 5 9 12 15 16

Tower of Hanoi:
Tower of Hanoi is a mathematical puzzle which consists of three towers
(pegs) and more than one ring, as depicted in the (omitted) figure.
These rings are of different sizes and stacked in ascending
order, i.e. the smaller one sits over the larger one. There are other
variations of the puzzle where the number of disks increases, but the
tower count remains the same.

Rules
The mission is to move all the disks to some other tower without
violating the sequence of arrangement. A few rules to be followed for
Tower of Hanoi are −

 Only one disk can be moved among the towers at any given time.

 Only the "top" disk can be removed.

 No large disk can sit over a small disk.

Solving a Tower of Hanoi puzzle with three disks is usually shown as an
animated sequence of moves (omitted here).

A Tower of Hanoi puzzle with n disks can be solved in a minimum of
2^n − 1 steps. A puzzle with 3 disks therefore takes 2^3 − 1 = 7 steps.

Algorithm
To write an algorithm for Tower of Hanoi, first we need to learn how to
solve this problem with a smaller number of disks, say 1 or 2. We mark the
three towers with the names source, destination and aux (the latter only to help in
moving the disks). If we have only one disk, then it can easily be moved
from the source to the destination peg.

If we have 2 disks −

 First, we move the smaller (top) disk to aux peg.


 Then, we move the larger (bottom) disk to destination peg.

 And finally, we move the smaller disk from aux to destination peg.

So now we are in a position to design an algorithm for Tower of Hanoi
with more than two disks. We divide the stack of disks into two parts: the
largest disk (the nth disk) is in one part, and all the other (n−1) disks are in the
second part.

Our ultimate aim is to move disk n from source to destination and then
put all the other (n−1) disks onto it. We can imagine applying the same approach
recursively for every given set of disks.

The steps to follow are −


Step 1 − Move n-1 disks from source to aux
Step 2 − Move nth disk from source to dest
Step 3 − Move n-1 disks from aux to dest

A recursive algorithm for Tower of Hanoi can be derived as follows −

START

Procedure Hanoi(disk, source, dest, aux)

IF disk == 1, THEN

move disk from source to dest

ELSE

Hanoi(disk - 1, source, aux, dest) // Step 1

move disk from source to dest // Step 2

Hanoi(disk - 1, aux, dest, source) // Step 3

END IF
END Procedure

STOP
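A minimal C sketch of this recursive procedure (not part of the original notes); the function name hanoi and the peg labels 'S', 'D' and 'A' are illustrative, and the recursion follows the three steps of the pseudocode directly.

#include <stdio.h>

// Move `disk` disks from peg `source` to peg `dest`, using `aux` as the spare peg.
void hanoi(int disk, char source, char dest, char aux) {
   if(disk == 1) {
      printf("Move disk 1 from %c to %c\n", source, dest);
      return;
   }
   hanoi(disk - 1, source, aux, dest);                           // Step 1
   printf("Move disk %d from %c to %c\n", disk, source, dest);   // Step 2
   hanoi(disk - 1, aux, dest, source);                           // Step 3
}

int main() {
   hanoi(3, 'S', 'D', 'A');   // 3 disks: prints the 2^3 - 1 = 7 moves
   return 0;
}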

Bubble Sort Program:
#include <stdio.h>
#include <stdbool.h>

#define MAX 10

int list[MAX] = {1,8,4,6,0,3,5,2,7,9};

void display() {
   int i;
   printf("[");

   // navigate through all items
   for(i = 0; i < MAX; i++) {
      printf("%d ", list[i]);
   }

   printf("]\n");
}

void bubbleSort() {
   int temp;
   int i, j;
   bool swapped = false;

   // loop through all numbers
   for(i = 0; i < MAX-1; i++) {
      swapped = false;

      // loop through the numbers falling ahead
      for(j = 0; j < MAX-1-i; j++) {
         printf("Items compared: [ %d, %d ] ", list[j], list[j+1]);

         // if the next number is smaller than the current one,
         // swap the numbers (bubble up the highest number)
         if(list[j] > list[j+1]) {
            temp = list[j];
            list[j] = list[j+1];
            list[j+1] = temp;

            swapped = true;
            printf(" => swapped [%d, %d]\n", list[j], list[j+1]);
         } else {
            printf(" => not swapped\n");
         }
      }

      // if no number was swapped, the array is sorted; break the loop
      if(!swapped) {
         break;
      }

      printf("Iteration %d#: ", (i+1));
      display();
   }
}

int main() {
   printf("Input Array: ");
   display();
   printf("\n");

   bubbleSort();

   printf("\nOutput Array: ");
   display();

   return 0;
}

If we compile and run the above program, it will produce the following
result −

Output
Input Array: [1 8 4 6 0 3 5 2 7 9 ]

Items compared: [ 1, 8 ] => not swapped


Items compared: [ 8, 4 ] => swapped [4, 8]
Items compared: [ 8, 6 ] => swapped [6, 8]
Items compared: [ 8, 0 ] => swapped [0, 8]
Items compared: [ 8, 3 ] => swapped [3, 8]
Items compared: [ 8, 5 ] => swapped [5, 8]
Items compared: [ 8, 2 ] => swapped [2, 8]
Items compared: [ 8, 7 ] => swapped [7, 8]
Items compared: [ 8, 9 ] => not swapped
Iteration 1#: [1 4 6 0 3 5 2 7 8 9 ]
Items compared: [ 1, 4 ] => not swapped
Items compared: [ 4, 6 ] => not swapped
Items compared: [ 6, 0 ] => swapped [0, 6]
Items compared: [ 6, 3 ] => swapped [3, 6]
Items compared: [ 6, 5 ] => swapped [5, 6]
Items compared: [ 6, 2 ] => swapped [2, 6]
Items compared: [ 6, 7 ] => not swapped
Items compared: [ 7, 8 ] => not swapped
Iteration 2#: [1 4 0 3 5 2 6 7 8 9 ]
Items compared: [ 1, 4 ] => not swapped
Items compared: [ 4, 0 ] => swapped [0, 4]
Items compared: [ 4, 3 ] => swapped [3, 4]
Items compared: [ 4, 5 ] => not swapped
Items compared: [ 5, 2 ] => swapped [2, 5]
Items compared: [ 5, 6 ] => not swapped
Items compared: [ 6, 7 ] => not swapped
Iteration 3#: [1 0 3 4 2 5 6 7 8 9 ]
Items compared: [ 1, 0 ] => swapped [0, 1]
Items compared: [ 1, 3 ] => not swapped
Items compared: [ 3, 4 ] => not swapped
Items compared: [ 4, 2 ] => swapped [2, 4]
Items compared: [ 4, 5 ] => not swapped
Items compared: [ 5, 6 ] => not swapped
Iteration 4#: [0 1 3 2 4 5 6 7 8 9 ]
Items compared: [ 0, 1 ] => not swapped
Items compared: [ 1, 3 ] => not swapped
Items compared: [ 3, 2 ] => swapped [2, 3]
Items compared: [ 3, 4 ] => not swapped
Items compared: [ 4, 5 ] => not swapped
Iteration 5#: [0 1 2 3 4 5 6 7 8 9 ]
Items compared: [ 0, 1 ] => not swapped
Items compared: [ 1, 2 ] => not swapped
Items compared: [ 2, 3 ] => not swapped
Items compared: [ 3, 4 ] => not swapped

Output Array: [0 1 2 3 4 5 6 7 8 9 ]

ADS Array: The array data type is the simplest structured data type. It is such a useful
data type because it gives you, as a programmer, the ability to assign a single name to a
homogeneous collection of instances of one abstract data type and provide integer names
for the individual elements of the collection.

What are dangling pointers?


Generally, dangling pointers arise when the referenced object is deleted or
deallocated without changing the value of the pointers. This creates a problem because
the pointer is still pointing to memory that is no longer available. When the user tries to
dereference a dangling pointer, it results in undefined behaviour and can be the
cause of a segmentation fault.

In simple words, we can say that a dangling pointer is a pointer that does not point to a valid
object of the appropriate type, and it can be the cause of undefined behaviour.

Consider the following situation (illustrated by an image that is omitted here):
Pointer1 and Pointer2 point to a valid memory object, but Pointer3 points to a memory
object that has already been deallocated. So Pointer3 becomes a dangling pointer; when you
try to access Pointer3, you will get an undefined result or a segmentation fault.
Important causes of dangling pointers in the C language
There are many causes that give rise to dangling pointers, but here I am describing some
common causes that create dangling pointers.

When a variable goes out of scope

A local variable’s scope and lifetime belong to the block where it is declared.
Whenever control enters the block, memory is allocated for the
local variable, and it is freed automatically upon exit from the block.

If a local variable is referred to outside of its lifetime, the behaviour is undefined. The
value of a pointer becomes indeterminate when the variable it points to reaches the
end of its lifetime.

See the code below for a better understanding.

In this code, we try to read the value of Data (an integer variable) outside of
its block (scope) through the pointer piData, so the value of piData is
indeterminate.
#include <stdio.h>

int main(void)
{
    int *piData;

    {   // block
        int Data = 27;
        piData = &Data;
    }

    printf("piData = %d\n", *piData);   // piData is a dangling pointer here

    return 0;
}

After destroying the stack frame

The stack frame allocated to a function is destroyed after control returns
from the function. A common mistake made by developers is to return
the address of a stack-allocated variable from a function. If you try to access the
returned address through the pointer, you may get an unpredictable result, or you might get
the same value, but it is very dangerous and needs to be avoided.

See the programming example below.

In this code, Data has no scope beyond the function. If you try to read the value of
Data after calling Fun() using the pointer, you might get the correct value (5),
but any function called thereafter will overwrite the stack storage allocated for Data
with other values, and the pointer would no longer work correctly.

So in the code below, piData is a dangling pointer that points to memory which is
no longer available.

#include <stdio.h>

int *Fun()
{
    int Data = 5;      // local variable

    return &Data;      // address of a local variable
}

int main()
{
    int *piData = Fun();   // returning address of the local variable

    printf("%d", *piData);

    return 0;
}

Difference between Top-down and Bottom-up Approach:

Algorithms are designed using two approaches: the top-down
and the bottom-up approach. In the top-down approach, the complex module is
divided into submodules. On the other hand, the bottom-up approach begins
with elementary modules and then combines them further. The primary purpose
of an algorithm is to operate on the data comprised in the data structure. In
other words, an algorithm is used to perform operations on the data
inside the data structures.

A complicated algorithm is split into small parts called modules, and the
process of splitting is known as modularization. Modularization significantly
reduces the complications of designing an algorithm and makes the process
easier to design and implement. Modular programming is the technique
of designing and writing a program in the form of functions, where each
function is distinct from the others and works independently. The contents of
the functions are cohesive, and there exists low coupling
between the modules.

Comparison of the top-down and bottom-up approaches:

Basic − Top-down: breaks the massive problem into smaller subproblems. Bottom-up: solves the
fundamental low-level problems and integrates them into a larger one.

Process − Top-down: submodules are analysed in isolation. Bottom-up: examines what data is to be
encapsulated, and implies the concept of information hiding.

Communication − Top-down: not required. Bottom-up: needs a specific amount of communication.

Redundancy − Top-down: may contain redundant information. Bottom-up: redundancy can be
eliminated.

Programming languages − Top-down: structure/procedural-oriented programming languages
(e.g. C) follow the top-down approach. Bottom-up: object-oriented programming languages
(like C++, Java, etc.) follow the bottom-up approach.

Mainly used in − Top-down: module documentation, test case creation, code implementation and
debugging. Bottom-up: testing.

Definition of Top-down Approach

The top-down approach basically divides a complex problem or algorithm
into multiple smaller parts (modules). These modules are further
decomposed until each resulting module is a fundamental piece of the program that can be
understood on its own and cannot be decomposed further. After
achieving a certain level of modularity, the decomposition of modules is
stopped. The top-down approach is the stepwise process of breaking a
large program module into simpler and smaller modules to organise and code the
program in an efficient way. The flow of control in this approach is always in
the downward direction. The top-down approach is implemented in the “C”
programming language by using functions.

Thus, the top-down method begins with an abstract design, and then this design
is sequentially refined to create more concrete levels until there is
no requirement for additional refinement.
Definition of Bottom-up Approach

The bottom-up approach works in just the opposite manner to the top-down
approach. Initially, it includes designing the most fundamental parts,
which are then combined to make the higher-level module. This integration
of submodules and modules into a higher-level module is repeatedly
performed until the required complete algorithm is obtained.

The bottom-up approach works with layers of abstraction. The primary
application of the bottom-up approach is testing, as each fundamental module
is first tested before merging it into the bigger one. The testing is
accomplished using certain low-level functions.

Key Differences Between Top-down and Bottom-up Approach

1. The top-down approach decomposes the large task into smaller subtasks,
whereas the bottom-up approach first chooses to solve the different
fundamental parts of the task directly and then combines those parts into a
whole program.
2. Each submodule is processed separately in the top-down approach. In
contrast, the bottom-up approach implements the concept of information
hiding by examining the data to be encapsulated.
3. The different modules in the top-down approach don’t require much
communication. On the contrary, the bottom-up approach needs
interaction between the separate fundamental modules in order to combine
them later.
4. The top-down approach can produce redundancy, while the bottom-up approach
does not include redundant information.
5. Procedural programming languages such as Fortran, COBOL and C
follow a top-down approach. In contrast, object-oriented programming
languages like C++, Java, C#, Perl and Python follow the bottom-up
approach.
6. The bottom-up approach is primarily used in testing. Conversely, the top-down
approach is utilized in module documentation, test case creation,
debugging, etcetera.

Breadth First Traversal:


Breadth First Search (BFS) algorithm traverses a graph in a breadthward
motion and uses a queue to remember the next vertex to start a
search from when a dead end occurs in any iteration.
In the example referred to above (figure omitted), the BFS algorithm traverses from A to B to E
to F first, then to C and G, and lastly to D. It employs the following rules.

 Rule 1 − Visit the adjacent unvisited vertex. Mark it as visited. Display it.
Insert it in a queue.

 Rule 2 − If no adjacent vertex is found, remove the first vertex from the
queue.

 Rule 3 − Repeat Rule 1 and Rule 2 until the queue is empty.

Step-by-step traversal (figures omitted) −

1. Initialize the queue.

2. We start by visiting S (the starting node) and mark it as visited.

3. We then see an unvisited adjacent node from S. In this example, we have three such nodes,
but alphabetically we choose A, mark it as visited and enqueue it.

4. Next, the unvisited adjacent node from S is B. We mark it as visited and enqueue it.

5. Next, the unvisited adjacent node from S is C. We mark it as visited and enqueue it.

6. Now, S is left with no unvisited adjacent nodes. So, we dequeue and find A.

7. From A we have D as an unvisited adjacent node. We mark it as visited and enqueue it.

At this stage, we are left with no unmarked (unvisited) nodes. But as per
the algorithm we keep on dequeuing in order to get all unvisited nodes.
When the queue gets emptied, the program is over.
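A minimal sketch of BFS in C (not from the original notes), following the rules above on an adjacency-matrix graph; the vertex count V, the matrix adj and the example graph are illustrative choices.

#include <stdio.h>
#include <stdbool.h>

#define V 5   // number of vertices (illustrative)

int adj[V][V] = {          // adjacency matrix of an example undirected graph
   {0,1,1,1,0},
   {1,0,0,0,1},
   {1,0,0,0,1},
   {1,0,0,0,1},
   {0,1,1,1,0}
};

void bfs(int start) {
   bool visited[V] = {false};
   int queue[V], front = 0, rear = 0;

   visited[start] = true;            // Rule 1: mark, display, enqueue
   printf("%d ", start);
   queue[rear++] = start;

   while(front < rear) {             // Rule 3: repeat until the queue is empty
      int u = queue[front++];        // Rule 2: dequeue when no unvisited neighbour is left
      for(int v = 0; v < V; v++) {
         if(adj[u][v] && !visited[v]) {
            visited[v] = true;
            printf("%d ", v);
            queue[rear++] = v;
         }
      }
   }
}

int main() {
   bfs(0);    // traverse starting from vertex 0
   return 0;
}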

Depth First Traversal:


Depth First Search (DFS) algorithm traverses a graph in a depthward
motion and uses a stack to remember the next vertex to start a
search from when a dead end occurs in any iteration.
In the example referred to above (figure omitted), the DFS algorithm traverses from S to A to D
to G to E to B first, then to F, and lastly to C. It employs the following
rules.

 Rule 1 − Visit the adjacent unvisited vertex. Mark it as visited. Display it.
Push it in a stack.

 Rule 2 − If no adjacent vertex is found, pop up a vertex from the stack. (It
will pop up all the vertices from the stack, which do not have adjacent
vertices.)

 Rule 3 − Repeat Rule 1 and Rule 2 until the stack is empty.

Step-by-step traversal (figures omitted) −

1. Initialize the stack.

2. Mark S as visited and put it onto the stack. Explore any unvisited adjacent node from S.
We have three such nodes and we can pick any of them. For this example, we shall take the
nodes in alphabetical order.

3. Mark A as visited and put it onto the stack. Explore any unvisited adjacent node from A.
Both S and D are adjacent to A, but we are concerned with unvisited nodes only.

4. Visit D, mark it as visited and put it onto the stack. Here we have the nodes B and C,
which are adjacent to D, and both are unvisited. However, we shall again choose in
alphabetical order.

5. We choose B, mark it as visited and put it onto the stack. Here B does not have any
unvisited adjacent node. So, we pop B from the stack.

6. We check the stack top to return to the previous node and check whether it has any
unvisited nodes. Here, we find D to be on the top of the stack.

7. The only unvisited adjacent node from D is now C. So we visit C, mark it as visited and
put it onto the stack.

As C does not have any unvisited adjacent node so we keep popping the
stack until we find a node that has an unvisited adjacent node. In this
case, there's none and we keep popping until the stack is empty.
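A minimal recursive DFS sketch in C (not from the original notes); here the call stack plays the role of the explicit stack, backtracking corresponds to popping, and the example graph is illustrative.

#include <stdio.h>
#include <stdbool.h>

#define V 5   // number of vertices (illustrative)

int adj[V][V] = {          // adjacency matrix of an example undirected graph
   {0,1,1,1,0},
   {1,0,0,0,1},
   {1,0,0,0,1},
   {1,0,0,0,1},
   {0,1,1,1,0}
};

bool visited[V];

void dfs(int u) {
   visited[u] = true;              // Rule 1: mark and display the vertex
   printf("%d ", u);
   for(int v = 0; v < V; v++)      // visit each unvisited adjacent vertex in turn
      if(adj[u][v] && !visited[v])
         dfs(v);                   // returning from the call acts like popping the stack
}

int main() {
   dfs(0);    // traverse starting from vertex 0
   return 0;
}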
Spanning Tree

A spanning tree is a subset of a graph G which has all the vertices
covered with the minimum possible number of edges. Hence, a spanning tree
does not have cycles and it cannot be disconnected.

By this definition, we can draw the conclusion that every connected and
undirected graph G has at least one spanning tree. A disconnected graph
does not have any spanning tree, as it cannot be spanned to all its
vertices.

Several different spanning trees can be obtained from one complete graph
(figure omitted). A complete undirected graph can have a maximum of n^(n−2)
spanning trees, where n is the number of nodes. In the example addressed above,
n is 3, hence 3^(3−2) = 3 spanning trees are possible.

General Properties of Spanning Tree


We now understand that one graph can have more than one spanning
tree. Following are a few properties of the spanning tree connected to
graph G −

 A connected graph G can have more than one spanning tree.


 All possible spanning trees of graph G, have the same number of edges and
vertices.

 The spanning tree does not have any cycle (loops).

 Removing one edge from the spanning tree will make the graph
disconnected, i.e. the spanning tree is minimally connected.

 Adding one edge to the spanning tree will create a circuit or loop, i.e. the
spanning tree is maximally acyclic.

Mathematical Properties of Spanning Tree


 A spanning tree has n−1 edges, where n is the number of nodes (vertices).

 From a complete graph, by removing a maximum of e − n + 1 edges, we can
construct a spanning tree.

 A complete graph can have a maximum of n^(n−2) spanning trees.

Thus, we can conclude that spanning trees are a subset of a connected
graph G, and disconnected graphs do not have a spanning tree.

Application of Spanning Tree


A spanning tree is basically used to find a minimum path to connect all
nodes in a graph. Common applications of spanning trees are −

 Civil Network Planning

 Computer Network Routing Protocol

 Cluster Analysis

Let us understand this through a small example. Consider a city network
as a huge graph, and a plan to deploy telephone lines in such a way
that with the minimum number of lines we can connect all the city nodes. This is
where the spanning tree comes into the picture.

Minimum Spanning Tree (MST)


In a weighted graph, a minimum spanning tree is a spanning tree that
has minimum weight among all spanning trees of the same graph. In
real-world situations, this weight can be measured as distance,
congestion, traffic load or any arbitrary value assigned to the edges.
Minimum Spanning-Tree Algorithm
We shall learn about the two most important spanning tree algorithms here
−
 Kruskal's Algorithm

 Prim's Algorithm

Both are greedy algorithms.

Kruskal's Spanning Tree Algorithm:


Kruskal's algorithm to find the minimum cost spanning tree uses the
greedy approach. This algorithm treats the graph as a forest and every
node it has as an individual tree. A tree connects to another if and
only if it has the least cost among all available options and does not
violate the MST properties.

To understand Kruskal's algorithm let us consider the following example


Step 1 - Remove all loops and Parallel Edges


Remove all loops and parallel edges from the given graph.
In case of parallel edges, keep the one which has the least cost
associated and remove all others.

Step 2 - Arrange all edges in increasing order of weight
The next step is to create a set of edges and weights, and arrange them
in ascending order of weight (cost).

Step 3 - Add the edge which has the least weight
Now we start adding edges to the graph, beginning with the one which
has the least weight. Throughout, we keep checking that the
spanning-tree properties remain intact. If adding an edge would violate the
spanning-tree property, we do not include that edge in the graph.
The least cost is 2 and edges involved are B,D and D,T. We add them.
Adding them does not violate spanning tree properties, so we continue to
our next edge selection.

Next cost is 3, and associated edges are A,C and C,D. We add them
again −

Next cost in the table is 4, and we observe that adding it will create a
circuit in the graph. −

We ignore it. In the process we shall ignore/avoid all edges that create a
circuit.
We observe that edges with cost 5 and 6 also create circuits. We ignore
them and move on.

Now we are left with only one node to be added. Between the two least
cost edges available 7 and 8, we shall add the edge with cost 7.

By adding edge S,A we have included all the nodes of the graph and we
now have minimum cost spanning tree.

The steps for implementing Kruskal's algorithm are as follows:

1. Sort all the edges from low weight to high


2. Take the edge with the lowest weight and add it to the spanning
tree. If adding the edge creates a cycle, then reject this edge.
3. Keep adding edges until we reach all vertices, as sketched in the code below.
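A compact sketch of these steps in C (not from the original notes), using qsort for step 1 and a simple union-find structure to detect cycles in step 2; the edge list and vertex count are illustrative.

#include <stdio.h>
#include <stdlib.h>

#define V 4   // number of vertices (illustrative)
#define E 5   // number of edges   (illustrative)

struct edge { int u, v, w; };

// example edge list (u, v, weight)
struct edge edges[E] = { {0,1,10}, {0,2,6}, {0,3,5}, {1,3,15}, {2,3,4} };

int parent[V];

// union-find: find the root of the set containing x (with path compression)
int find(int x) {
   while (parent[x] != x) {
      parent[x] = parent[parent[x]];
      x = parent[x];
   }
   return x;
}

int cmp(const void *a, const void *b) {
   return ((const struct edge *)a)->w - ((const struct edge *)b)->w;
}

int main() {
   for (int i = 0; i < V; i++) parent[i] = i;

   // Step 1: sort all the edges from low weight to high
   qsort(edges, E, sizeof(struct edge), cmp);

   int total = 0, taken = 0;
   // Steps 2 and 3: take the lowest-weight edge that does not create a cycle,
   // until V-1 edges have been added
   for (int i = 0; i < E && taken < V - 1; i++) {
      int ru = find(edges[i].u), rv = find(edges[i].v);
      if (ru != rv) {                       // no cycle: accept the edge
         parent[ru] = rv;                   // union the two trees
         printf("edge (%d,%d) weight %d\n", edges[i].u, edges[i].v, edges[i].w);
         total += edges[i].w;
         taken++;
      }
   }
   printf("MST total weight: %d\n", total);
   return 0;
}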

Prim's Spanning Tree Algorithm:


Prim's algorithm to find a minimum cost spanning tree (like Kruskal's
algorithm) uses the greedy approach. Prim's algorithm shares a similarity
with the shortest-path-first algorithms.

Prim's algorithm, in contrast with Kruskal's algorithm, treats the nodes
as a single tree and keeps on adding new nodes to the spanning tree
from the given graph.

To contrast with Kruskal's algorithm and to understand Prim's algorithm
better, we shall use the same example −

Step 1 - Remove all loops and parallel edges

Remove all loops and parallel edges from the given graph. In case of
parallel edges, keep the one which has the least cost associated and
remove all others.
Step 2 - Choose any arbitrary node as root node
In this case, we choose node S as the root node of Prim's spanning tree.
This node is arbitrarily chosen, so any node can be the root node. One
may wonder why any vertex can be a root node. The answer is that in the
spanning tree all the nodes of a graph are included, and because the graph is
connected, there must be at least one edge which joins the chosen node to the
rest of the tree.

Step 3 - Check outgoing edges and select the one with the least cost
After choosing the root node S, we see that S,A and S,C are two edges
with weights 7 and 8, respectively. We choose the edge S,A as it is smaller
than the other.

Now, the tree S-7-A is treated as one node and we check for all edges
going out from it. We select the one which has the lowest cost and
include it in the tree.
After this step, S-7-A-3-C tree is formed. Now we'll again treat it as a
node and will check all the edges again. However, we will choose only
the least cost edge. In this case, C-3-D is the new edge, which is less
than other edges' cost 8, 6, 4, etc.

After adding node D to the spanning tree, we now have two edges going
out of it having the same cost, i.e. D-2-T and D-2-B. Thus, we can add
either one. But the next step will again yield edge 2 as the least cost.
Hence, we are showing a spanning tree with both edges included.

We may find that the output spanning tree of the same graph using two
different algorithms is the same.

Steps for implementing Prim’s Algorithm-


Step-01:

 Randomly choose any vertex.


 We usually select and start with a vertex that connects to the edge having least
weight.

Step-02:

 Find all the edges that connect the tree to new vertices, then find the least weight
edge among those edges and include it in the existing tree.
 If including that edge creates a cycle, then reject that edge and look for the next least
weight edge.

Step-03:

 Keep repeating step-02 until all the vertices are included and the Minimum Spanning Tree
(MST) is obtained, as sketched in the code below.
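A minimal adjacency-matrix sketch of these steps in C (not from the original notes); it runs in O(V²), and the vertex count and example graph are illustrative.

#include <stdio.h>
#include <stdbool.h>
#include <limits.h>

#define V 4   // number of vertices (illustrative)

// adjacency matrix of an example weighted undirected graph; 0 means "no edge"
int g[V][V] = {
   { 0, 10,  6,  5},
   {10,  0,  0, 15},
   { 6,  0,  0,  4},
   { 5, 15,  4,  0}
};

int main() {
   bool inTree[V] = {false};
   int key[V], parent[V];

   for (int i = 0; i < V; i++) { key[i] = INT_MAX; parent[i] = -1; }
   key[0] = 0;                        // Step-01: start from an arbitrary vertex (vertex 0)

   for (int count = 0; count < V; count++) {
      // Step-02: pick the cheapest edge that connects the tree to a new vertex
      int u = -1;
      for (int v = 0; v < V; v++)
         if (!inTree[v] && (u == -1 || key[v] < key[u]))
            u = v;
      inTree[u] = true;

      // update the candidate edges of the remaining vertices
      for (int v = 0; v < V; v++)
         if (g[u][v] && !inTree[v] && g[u][v] < key[v]) {
            key[v] = g[u][v];
            parent[v] = u;
         }
   }

   // Step-03: all vertices included; print the MST edges
   for (int v = 1; v < V; v++)
      printf("edge (%d,%d) weight %d\n", parent[v], v, g[v][parent[v]]);
   return 0;
}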

Time Complexity-

Worst-case time complexity of Prim’s Algorithm:

= O(E log V) using a binary heap
= O(E + V log V) using a Fibonacci heap

DAA - Shortest Paths


Dijkstra’s Algorithm
Dijkstra’s algorithm solves the single-source shortest-paths problem on a
directed weighted graph G = (V, E), where all the edge weights are
non-negative (i.e., w(u, v) ≥ 0 for each edge (u, v) Є E).

In the following algorithm, we will use one function Extract-Min(),


which extracts the node with the smallest key.
Algorithm: Dijkstra’s-Algorithm (G, w, s)
   for each vertex v Є G.V
      v.d := ∞
      v.∏ := NIL
   s.d := 0
   S := Ф
   Q := G.V
   while Q ≠ Ф
      u := Extract-Min (Q)
      S := S U {u}
      for each vertex v Є G.adj[u]
         if v.d > u.d + w(u, v)
            v.d := u.d + w(u, v)
            v.∏ := u

Analysis
The complexity of this algorithm depends entirely on the
implementation of the Extract-Min function. If the Extract-Min function is
implemented using linear search, the complexity of this algorithm
is O(V² + E).

In this algorithm, if we use a min-heap on which the Extract-Min() function
works to return the node from Q with the smallest key, the complexity of
this algorithm can be reduced further.
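A minimal C sketch of this pseudocode (not from the original notes), using an adjacency matrix and a linear-search Extract-Min, which gives the O(V² + E) behaviour discussed above; the example graph is illustrative.

#include <stdio.h>
#include <stdbool.h>
#include <limits.h>

#define V 5   // number of vertices (illustrative)

// adjacency matrix of an example weighted directed graph; 0 means "no edge"
int w[V][V] = {
   {0, 10, 3, 0, 0},
   {0,  0, 1, 2, 0},
   {0,  4, 0, 8, 2},
   {0,  0, 0, 0, 7},
   {0,  0, 0, 9, 0}
};

int main() {
   int d[V], pred[V];
   bool done[V] = {false};     // the set S of finished vertices

   for (int v = 0; v < V; v++) { d[v] = INT_MAX; pred[v] = -1; }
   d[0] = 0;                   // source vertex s = 0

   for (int i = 0; i < V; i++) {
      // Extract-Min by linear search over the unfinished vertices
      int u = -1;
      for (int v = 0; v < V; v++)
         if (!done[v] && (u == -1 || d[v] < d[u]))
            u = v;
      if (d[u] == INT_MAX) break;   // remaining vertices are unreachable
      done[u] = true;

      // relax every edge (u, v)
      for (int v = 0; v < V; v++)
         if (w[u][v] && !done[v] && d[u] + w[u][v] < d[v]) {
            d[v] = d[u] + w[u][v];
            pred[v] = u;
         }
   }

   for (int v = 0; v < V; v++)
      printf("distance of vertex %d from the source: %d\n", v, d[v]);
   return 0;
}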

Example
Let us consider vertex 1 and 9 as the start and destination vertex
respectively. Initially, all the vertices except the start vertex are marked
by ∞ and the start vertex is marked by 0.

Vertex   Initial   Step1   Step2   Step3   Step4   Step5   Step6   Step7   Step8
                   (V1)    (V3)    (V2)    (V4)    (V5)    (V7)    (V8)    (V6)
  1         0        0       0       0       0       0       0       0       0
  2         ∞        5       4       4       4       4       4       4       4
  3         ∞        2       2       2       2       2       2       2       2
  4         ∞        ∞       ∞       7       7       7       7       7       7
  5         ∞        ∞       ∞      11       9       9       9       9       9
  6         ∞        ∞       ∞       ∞       ∞      17      17      16      16
  7         ∞        ∞      11      11      11      11      11      11      11
  8         ∞        ∞       ∞       ∞       ∞      16      13      13      13
  9         ∞        ∞       ∞       ∞       ∞       ∞       ∞       ∞      20

Hence, the minimum distance of vertex 9 from vertex 1 is 20. And the
path is

1→ 3→ 7→ 8→ 6→ 9

This path is determined based on predecessor information.

Bellman Ford Algorithm


This algorithm solves the single-source shortest path problem of a
directed graph G = (V, E) in which the edge weights may be negative.
Moreover, this algorithm can be applied to find the shortest path if there
does not exist any negative-weight cycle.
Algorithm: Bellman-Ford-Algorithm (G, w, s)
   for each vertex v Є G.V
      v.d := ∞
      v.∏ := NIL
   s.d := 0
   for i = 1 to |G.V| - 1
      for each edge (u, v) Є G.E
         if v.d > u.d + w(u, v)
            v.d := u.d + w(u, v)
            v.∏ := u
   for each edge (u, v) Є G.E
      if v.d > u.d + w(u, v)
         return FALSE
   return TRUE
Analysis
The first for loop is used for initialization, which runs in O(V) time. The
next for loop makes |V| − 1 passes over the edges, and each pass
takes O(E) time.

Hence, the Bellman-Ford algorithm runs in O(V·E) time.
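A minimal edge-list sketch of this pseudocode in C (not from the original notes); the example edge list is illustrative and contains a negative edge but no negative cycle.

#include <stdio.h>
#include <stdbool.h>
#include <limits.h>

#define V 5   // number of vertices (illustrative)
#define E 6   // number of edges   (illustrative)

struct edge { int u, v, w; };

// example directed edge list (u, v, weight) with one negative weight
struct edge edges[E] = { {0,1,6}, {0,2,7}, {1,3,5}, {2,3,-3}, {1,4,2}, {3,4,4} };

int main() {
   int d[V], pred[V];

   // initialization
   for (int v = 0; v < V; v++) { d[v] = INT_MAX; pred[v] = -1; }
   d[0] = 0;   // source vertex s = 0

   // |V| - 1 passes of relaxation over all edges
   for (int i = 1; i <= V - 1; i++)
      for (int j = 0; j < E; j++) {
         int u = edges[j].u, v = edges[j].v, w = edges[j].w;
         if (d[u] != INT_MAX && d[u] + w < d[v]) {
            d[v] = d[u] + w;
            pred[v] = u;
         }
      }

   // one more pass over the edges detects a negative-weight cycle
   bool negative_cycle = false;
   for (int j = 0; j < E; j++) {
      int u = edges[j].u, v = edges[j].v, w = edges[j].w;
      if (d[u] != INT_MAX && d[u] + w < d[v]) negative_cycle = true;
   }

   if (negative_cycle)
      printf("Graph contains a negative-weight cycle\n");
   else
      for (int v = 0; v < V; v++)
         printf("distance of vertex %d from the source: %d\n", v, d[v]);
   return 0;
}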

Example
The following example shows how Bellman-Ford algorithm works step by
step. This graph has a negative edge but does not have any negative
cycle, hence the problem can be solved using this technique.

At the time of initialization, all the vertices except the source are marked
by ∞ and the source is marked by 0.

In the first step, all the vertices which are reachable from the source are
updated by minimum cost. Hence, vertices a and h are updated.

In the next step, vertices a, b, f and e are updated.


Following the same logic, in this step vertices b, f, c and g are updated.

Here, vertices c and d are updated.

Hence, the minimum distance between vertex s and vertex d is 20.

Based on the predecessor information, the path is s→ h→ e→ g→ c→ d


Floyd–Warshall's Algorithm

Floyd–Warshall's algorithm is used to find the shortest paths between all pairs of
vertices in a graph, where each edge in the graph has a weight which may be positive or negative. The
biggest advantage of using this algorithm is that all the shortest distances between any two vertices
can be calculated in O(V³), where V is the number of vertices in the graph.

The Algorithm Steps:

For a graph with N vertices:

 Initialize the shortest paths between any 2 vertices with Infinity.

 Find all pair shortest paths that use 0 intermediate vertices, then find the shortest paths
that use 1 intermediate vertex, and so on, until using all N vertices as intermediate nodes.
 Minimize the shortest paths between any 2 pairs in the previous operation.
 For any 2 vertices (i, j), one should actually minimize the distances between this pair
using the first K nodes, so the shortest path will be: min(dist[i][k] + dist[k][j], dist[i][j]).

dist[i][k] represents the shortest path that only uses the first K vertices, and dist[k][j] represents the
shortest path between the pair k, j. The shortest path will be a concatenation of the shortest
path from i to k and the shortest path from k to j.

for(int k = 1; k <= n; k++){


for(int i = 1; i <= n; i++){
for(int j = 1; j <= n; j++){
dist[i][j] = min( dist[i][j], dist[i][k] + dist[k][j] );
}
}
}

The time complexity of Floyd–Warshall's algorithm is O(V³), where V is the number of vertices in the
graph.

Floyd–Warshall's Algorithm

 Floyd-Warshall Algorithm is an algorithm for solving All Pairs Shortest path


problem which gives the shortest path between every pair of vertices of the given
graph.
 Floyd-Warshall Algorithm is an example of dynamic programming.
 The main advantage of Floyd-Warshall Algorithm is that it is extremely simple and
easy to implement.

Algorithm-

Create a |V| x |V| matrix M   // it represents the distance between every pair of vertices as given
For each cell (i, j) in M do
   if i == j
      M[ i ][ j ] = 0              // for all diagonal elements, value = 0
   else if (i, j) is an edge in E
      M[ i ][ j ] = weight(i, j)   // if there exists a direct edge between the vertices, value = weight of the edge
   else
      M[ i ][ j ] = infinity       // if there is no direct edge between the vertices, value = ∞
for k from 1 to |V|
   for i from 1 to |V|
      for j from 1 to |V|
         if M[ i ][ j ] > M[ i ][ k ] + M[ k ][ j ]
            M[ i ][ j ] = M[ i ][ k ] + M[ k ][ j ]

Time Complexity-

 The Floyd-Warshall algorithm consists of three loops over all nodes.

 The innermost loop consists only of operations of constant complexity.
 Hence, the asymptotic complexity of the Floyd-Warshall algorithm is O(n³), where n is the
number of nodes in the given graph.

When Floyd- Warshall Algorithm is used?

 Floyd-Warshall Algorithm is best suited for dense graphs since its complexity depends
only on the number of vertices in the graph.
 For sparse graphs, Johnson’s Algorithm is more suitable.

PRACTICE PROBLEM BASED ON FLOYD-WARSHALL ALGORITHM-

Problem-
Consider the following directed weighted graph (figure omitted).
Using the Floyd-Warshall algorithm, find the shortest path distance between every pair of
vertices.

Solution-

Step-01:

 Remove all the self-loops and parallel edges (keeping the edge with the lowest weight)
from the graph, if any.
 In our case, we don’t have any self-loop or parallel edge.

Step-02:

Now, write the initial distance matrix representing the distance between every pair of
vertices as mentioned in the given graph in the form of weights.

 For diagonal elements (representing self-loops), value = 0


 For vertices having a direct edge between them, value = weight of that edge
 For vertices having no direct edges between them, value = ∞
Step-03:

From step-03, we will start our actual solution.

NOTE
 Since we have a total of 4 vertices in the given graph, we will have a total of 4 matrices of
order 4 x 4 in our solution (excluding the initial distance matrix).
 Diagonal elements of each matrix will always be 0.
The last matrix, D4, represents the shortest path distance between every pair of vertices.
