Dictionaries: Advanced Data Structures 1

CHAPTER
1
DICTIONARIES
Syllabus:
Sets, Dictionaries, Hash Tables, Open Hashing, Closed Hashing(Rehashing
Methods), Hashing Functions(Division Method, Multiplication Method, Universal
Hashing), Analysis of Closed Hashing Result (Unsuccessful Search, Insertion,
Successful Search, Deletion), Hash Table Restructuring, Skip Lists, Analysis of Skip
Lists.
Y. Ramesh Kumar, HOD of CSE, Avanthi Institute of Engg. & Technology

1. Sets: -
1.1 Definition: A set is a collection of objects. Set A is a subset of set B if all
elements of A are also in B. Subsets are also sets. For example, if set A consists of
(3,5,7,8,9) and set B consists of (5,8), then B is a subset of A. The union of two sets
A and B is a set C which consists of all elements that are in A or in B.
1.1 Representation of Sets Using Linked List:
A singly linked list can be used to represent a set. This data structure is well
suited because it allows dynamic storage allocation.
For example, a set A={10,20,40,50,60}
10 -> 20 -> 40 -> 50 -> 60 -> NULL
1.2 Operations of sets using linked lists:
The set operations are Union, Intersection, Difference and Equality.

Union: - Suppose two sets A and B are given. We find C, where C = A U B. Initially
C is empty. First we copy all the elements of A into C. Then, for each element in B,
we search A to check whether the element is in A or not. If the element is not in A,
we insert that element at the end of C.
A: 1 -> 8 -> 9 -> 6 -> 4 -> NULL
B: 5 -> 9 -> 6 -> 7 -> 3 -> NULL
C: 1 -> 8 -> 9 -> 6 -> 4 -> 5 -> 7 -> 3 -> NULL
Intersection: - The intersection can be found by searching one list for each element
of the other list. Suppose A and B are two sets in the form of linked lists, and we
intend to find C = A ∩ B.
Initially C is empty. For each element of A, we search whether the element
is in B or not. If the element is in B, we insert a node corresponding to the
element in C.

A: 1 -> 8 -> 9 -> 6 -> 4 -> NULL
B: 5 -> 9 -> 6 -> 7 -> 3 -> NULL
C: 9 -> 6 -> NULL
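Difference: - Two sets A and B in the form of linked lists are given. To find A-B, the
difference of B from A, we keep each element of A that is not in B.
A: 1 -> 8 -> 9 -> 6 -> 4 -> NULL
B: 5 -> 9 -> 6 -> 7 -> 3 -> NULL
A-B: 1 -> 8 -> 4 -> NULL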
Equality: - Two sets A and B are equal if they contain exactly the same elements.
Therefore, the equality test of two given sets first checks whether the number of
elements in both sets is equal or not. If the numbers of elements are equal, we next
test whether every element in one set is also present in the other set.
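The union, intersection and difference operations described above can be sketched in C as follows. This is only a minimal illustration: the names SetNode, set_contains, set_append, set_union, set_intersection and set_difference are assumptions made for this sketch, and each set is assumed to hold distinct integers.

```c
#include <stdlib.h>

/* One node of a singly linked list holding one set element. */
typedef struct SetNode {
    int data;
    struct SetNode *next;
} SetNode;

/* Return 1 if value occurs in the list, 0 otherwise. */
int set_contains(const SetNode *head, int value)
{
    for (; head != NULL; head = head->next)
        if (head->data == value)
            return 1;
    return 0;
}

/* Append value at the end of the list; returns the (possibly new) head. */
SetNode *set_append(SetNode *head, int value)
{
    SetNode *node = malloc(sizeof *node);
    SetNode *p;
    if (node == NULL)
        exit(EXIT_FAILURE);
    node->data = value;
    node->next = NULL;
    if (head == NULL)
        return node;
    for (p = head; p->next != NULL; p = p->next)
        ;
    p->next = node;
    return head;
}

/* C = A U B: copy A, then append every element of B that is not in A. */
SetNode *set_union(const SetNode *a, const SetNode *b)
{
    SetNode *c = NULL;
    const SetNode *p;
    for (p = a; p != NULL; p = p->next)
        c = set_append(c, p->data);
    for (p = b; p != NULL; p = p->next)
        if (!set_contains(a, p->data))
            c = set_append(c, p->data);
    return c;
}

/* C = A n B: keep every element of A that is also in B. */
SetNode *set_intersection(const SetNode *a, const SetNode *b)
{
    SetNode *c = NULL;
    for (; a != NULL; a = a->next)
        if (set_contains(b, a->data))
            c = set_append(c, a->data);
    return c;
}

/* C = A - B: keep every element of A that is not in B. */
SetNode *set_difference(const SetNode *a, const SetNode *b)
{
    SetNode *c = NULL;
    for (; a != NULL; a = a->next)
        if (!set_contains(b, a->data))
            c = set_append(c, a->data);
    return c;
}
```

Each operation searches one list for every element of the other, so for lists of lengths m and n the cost is proportional to m*n. Appending at the end walks the result list each time; a tail pointer would avoid that, but it is left out here to keep the sketch short.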
1.3 Applications of Sets: Different applications require a particular
representation of sets.
1. Hash table representation of sets in spell checking.
2. Bit array representation of sets in information systems.
3. Tree representation of sets in client-server information.
2. Dictionaries:
A dictionary is a data structure that stores a collection of (key, value) pairs.
The basic operations that can be performed on a dictionary are
1. Insertion of an element into the dictionary
2. Deletion of a particular element from the dictionary
3. Searching for a specific element with the help of its key

2.1 Dictionary ADT Operations
· Dictionary create()
creates empty dictionary
· boolean isEmpty(Dictionary d)
tells whether the dictionary d is empty
· put(Dictionary d, Key k, Value v)
associates key k with a value v;
if key k is already present in the dictionary,
the old value is replaced by v
· Value get(Dictionary d, Key k)
returns a value, associated with key k
or null, if dictionary contains no such key
· remove(Dictionary d, Key k)
removes key k and associated value
· destroy(Dictionary d)
destroys dictionary d
2.2 Linear List Representation:
The dictionary can be represented as a linear list. The linear list is a collection of
key and value pairs. There are two methods of representing the linear list.
1. Sorted Array – An array data structure is used to implement the dictionary.
2. Sorted Chain – A linked list data structure is used to implement the dictionary.
The operations on the list are
1. Insertion of a record into the list
2. Deletion of any record from the list
3. Finding the length/size of the list
4. Display of the list
3. Skip List Representation:
A skip list is a data structure for storing a sorted list of items, using a hierarchy
of linked lists that connect increasingly sparse subsequences of the items. There
are two special nodes in the skip list: the head node, which is the starting node of
the list, and the tail node, which is the last node of the list.

Each link of the sparser lists skips over many items of the full list in one step,
hence the structure's name. The skip list is an efficient implementation of a
dictionary using a sorted chain. This is because in a skip list each node can hold
forward pointers to more than one node at a time.
3.1 Skip List Creation:
Step 1: First, all the data items are linked in sorted order between the head and tail nodes.
Step 2: In the whole list the middle element is 40. A pointer to the middle element is
added, and the skip list becomes as follows.
For example, if we want to search for node 50 in the above chain, we will require
comparatively less time. This search can be made more efficient again if we add a few
more forward pointers.
Step 3: The final skip list is
3.2 Searching of an Element:

The search for a particular element in a skip list is as follows.
For example, the search for element 70 is
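The search procedure can be sketched in C as follows. This is a minimal sketch under stated assumptions: the names SkipNode, skip_build and skip_search and the value MAX_LEVEL = 3 are hypothetical, and the levels are assigned deterministically (every node on level 0, every 2nd node on level 1, every 4th node on level 2) to mirror the "pointer to the middle element" construction above. Practical skip lists usually choose the level of each node at random.

```c
#include <stdlib.h>

#define MAX_LEVEL 3   /* assumed number of levels for this sketch */

typedef struct SkipNode {
    int key;
    struct SkipNode *forward[MAX_LEVEL]; /* forward[i] = next node on level i */
} SkipNode;

/* Allocate a node whose forward pointers all point to the tail (NULL). */
static SkipNode *skip_new_node(int key)
{
    SkipNode *node = malloc(sizeof *node);
    int lvl;
    if (node == NULL)
        exit(EXIT_FAILURE);
    node->key = key;
    for (lvl = 0; lvl < MAX_LEVEL; lvl++)
        node->forward[lvl] = NULL;
    return node;
}

/* Build a skip list from keys given in ascending order; returns the head. */
SkipNode *skip_build(const int *keys, int n)
{
    SkipNode *head = skip_new_node(0);   /* the head node stores no key */
    SkipNode *last[MAX_LEVEL];           /* last node linked on each level */
    int i, lvl, step;

    for (lvl = 0; lvl < MAX_LEVEL; lvl++)
        last[lvl] = head;
    for (i = 0; i < n; i++) {
        SkipNode *node = skip_new_node(keys[i]);
        /* Node number i+1 joins level lvl when (i+1) is a multiple of 2^lvl. */
        for (lvl = 0, step = 1; lvl < MAX_LEVEL; lvl++, step *= 2) {
            if ((i + 1) % step == 0) {
                last[lvl]->forward[lvl] = node;
                last[lvl] = node;
            }
        }
    }
    return head;
}

/* Search from the top level down: move forward while the next key is
   smaller than the target, then drop one level. *steps counts the
   forward moves made. Returns the node, or NULL if key is absent. */
SkipNode *skip_search(SkipNode *head, int key, int *steps)
{
    SkipNode *cur = head;
    int lvl;

    *steps = 0;
    for (lvl = MAX_LEVEL - 1; lvl >= 0; lvl--) {
        while (cur->forward[lvl] != NULL && cur->forward[lvl]->key < key) {
            cur = cur->forward[lvl];
            (*steps)++;
        }
    }
    cur = cur->forward[0];
    return (cur != NULL && cur->key == key) ? cur : NULL;
}
```

With the keys 10, 20, 30, 40, 50, 60, 70, a search for 70 moves from the head to 40 on the top level and from 40 to 60 on the middle level, and then finds 70 as the next node on the bottom level.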

4 Hashing:
4.1 Hash Table Representation:
A hash table is a data structure used for storing and retrieving data very quickly.
Insertion of data into the hash table is based on a key. For example, for storing a
student record in the hash table, the student ID will work as the key.
A dictionary can be represented effectively using a hash table. We can
place the dictionary entries (key and value pairs) in the hash table using a hash
function.
4.2 Hash Function:
A hash function is a function that maps a key to the position in the hash table where
the data is stored. The same hash function is then used to retrieve the data from the
hash table. Thus the hash function is used to implement the hash table. The common
hash functions are
1. Division Method
2. Mid square method
3. Multiplicative hash function
4. Digit folding
1. Division method:
The division method returns the remainder after division. The divisor is the table
length.
H(key) = key % table length
For example, the elements 55, 67, 88 and 34 are to be placed in the hash table and
the table size is 10.
H(55) = 55 % 10 = 5
H(67) = 67 % 10 = 7
H(88) = 88 % 10 = 8
H(34) = 34 % 10 = 4

0
1
2
3
4 34
5 55
6
7 67
8 88
9
2. Mid square method:
In the mid square method the key is squared and the middle part of the
result is used as the index.
The key k is squared, then the hash function H is defined by H(k) = I,
where I is obtained by deleting digits from both ends of k^2. We emphasize that the
same positions of k^2 must be used for all of the keys.
Ex: the record no is 2991
2991^2 = 8946081
The hash table size is 1000
H(2991) = 460
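3. Multiplicative hash function:
The given key is multiplied by some constant value and the fractional part of the
product is scaled to the table range. The formula for computing the hash key is –
H(key) = floor(p * (fractional part of (key * A))), where p is an integer constant and
A is a constant real number with 0 < A < 1.
Donald Knuth suggested using the constant A = 0.6180339887
If key = 201 and p = 50 then
H(key) = floor(50 * (fractional part of (201 * 0.6180339887)))
= floor(50 * (fractional part of 124.2248317287))
= floor(50 * 0.2248317287)
= floor(11.2415864) = 11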

4. Digit folding:
The key is divided into parts, similar to folding a paper into three parts, and the
parts are added together to produce the hash key.
For example, the element is 53634641
The parts are 536, 346 and 41, and 536+346+41 = 923
H(key)=923
The record will be placed at location 923 in the hash table.
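The four hash functions above can be sketched in C as follows. The function names are assumptions made for this sketch. The mid square version assumes a seven-digit square, as in the 2991 example, and always keeps the same three middle positions. The folding version assumes that the key is split into three-digit parts from the left.

```c
#include <stdio.h>

/* Division method: remainder of the key divided by the table length. */
int hash_division(int key, int table_size)
{
    return key % table_size;
}

/* Mid square method: square the key, drop the two rightmost digits and
   keep the next three. Assumes the square has seven digits. */
int hash_mid_square(long key)
{
    long long square = (long long)key * key;
    return (int)((square / 100) % 1000);
}

/* Multiplicative method: floor(p * fractional part of (key * A)).
   Assumes key >= 0 so that truncation acts as floor. */
int hash_multiplicative(int key, int p)
{
    const double A = 0.6180339887;           /* constant suggested by Knuth */
    double product = key * A;
    double fraction = product - (long)product;
    return (int)(p * fraction);
}

/* Digit folding: add the three-digit parts of the key, taken from the left. */
int hash_fold(unsigned long key)
{
    char digits[32];
    int len = snprintf(digits, sizeof digits, "%lu", key);
    int sum = 0, i, j;

    for (i = 0; i < len; i += 3) {
        int part = 0;
        for (j = i; j < i + 3 && j < len; j++)
            part = part * 10 + (digits[j] - '0');
        sum += part;
    }
    return sum;
}
```

For the worked examples in this section, hash_division(55, 10) gives 5, hash_mid_square(2991) gives 460, hash_multiplicative(201, 50) gives 11 and hash_fold(53634641) gives 923.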
4.3 Collision: The situation in which the hash function returns the same hash key for
more than one record is called a collision.
Collision Resolution Techniques:
If a collision occurs, it should be handled by applying some technique.
The collision handling techniques are
1. Separate Chaining (Open Hashing)
2. Open Addressing (Linear Probing)
3. Quadratic Probing
4. Double Hashing
5. Rehashing
6. Extendible Hashing
1. Separate Chaining: When a collision occurs, a linked list (chain) of all the keys
that hash to the same index is maintained at that index.
For example:
The keys are 121,2,4,31,41,64,7,87,9.
The hash function is
H(key) = key%T
where T is the size of the table. The hash table size (T) is 10.

Index Chain
0 NULL
1 121 -> 31 -> 41 -> NULL
2 2 -> NULL
3 NULL
4 4 -> 64 -> NULL
5 NULL
6 NULL
7 7 -> 87 -> NULL
8 NULL
9 9 -> NULL
Advantages of separate chaining: an unlimited number of elements and an unlimited
number of collisions can be handled. Disadvantage of separate chaining: the overhead
of maintaining multiple linked lists.
2. Open Addressing-Linear Probing:
One of the simplest rehashing functions is linear probing. When a collision
occurs, we simply find the next free cell and store the key there.
For example:
The keys are 121,2,4,31,41,64,7,87,9.
The hash function is
H(key) = key%T
where T is the size of the table. The hash table size (T) is 10.
The element 121 can be placed at
H(key) = 121%10 = 1
Index 1 will be the home bucket for 121. Continuing in this fashion we will place
2, 4 and 9.

Index Key
0 Null
1 121
2 2
3 Null
4 4
5 Null
6 Null
7 Null
8 Null
9 9
Now the next key to be inserted is 31. According to the hash function
H(key) = 31%10
H(key) = 1.
But the index 1 location is already occupied by 121, i.e. a collision occurs. To resolve
this collision we move linearly down the table and place the element at the next
empty location. Index 2 is occupied by 2, therefore 31 will be placed at index 3. The
same process is applied to all the remaining elements.
Index Key
0 Null
1 121
2 2
3 31
4 4
5 41
6 64
7 7
8 87
9 9
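The linear probing insertion traced above can be sketched in C as follows. The names TABLE_SIZE, EMPTY, table_init and linear_insert are assumptions made for this sketch, and keys are assumed to be non-negative so that -1 can mark a free cell.

```c
#define TABLE_SIZE 10
#define EMPTY (-1)   /* marks a free cell; assumes keys are non-negative */

/* Mark every cell of the table as free. */
void table_init(int table[])
{
    int i;
    for (i = 0; i < TABLE_SIZE; i++)
        table[i] = EMPTY;
}

/* Insert key using linear probing: start at the home bucket and step
   forward one cell at a time, wrapping around the end of the table.
   Returns the index used, or -1 if the table is full. */
int linear_insert(int table[], int key)
{
    int home = key % TABLE_SIZE;
    int i;

    for (i = 0; i < TABLE_SIZE; i++) {
        int index = (home + i) % TABLE_SIZE;
        if (table[index] == EMPTY) {
            table[index] = key;
            return index;
        }
    }
    return -1;
}
```

Inserting the keys 121, 2, 4, 31, 41, 64, 7, 87, 9 in that order reproduces the final table shown above, with 31 at index 3, 41 at index 5, 64 at index 6 and 87 at index 8.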

3. Quadratic Probing:
Quadratic probing operates by taking the original hash value and adding successive
values of a quadratic polynomial to the starting value.
This method uses the following formula-
Hi(key) = (Hash(key) + i^2) % k, for i = 0, 1, 2, ...
where k is the table size.
For example:
The keys are 121,2,4,31,41,64,7,87,9.
H(key) = 121%10
=1
Index 1 will be the home bucket for 121. Continuing in this fashion we will place
2, 4, 7 and 9.
Index Key
0 Null
1 121
2 2
3 Null
4 4
5 Null
6 Null
7 7
8 Null
9 9

The next key to be inserted is 31. According to the hash function
H(key) = 31%10
H(key) = 1.

But the index 1 location is already occupied by 121, i.e. a collision occurs. Hence we
will apply quadratic probing to insert this record in the hash table.
Starting from i=0:
(31+0^2)%10 =1 (occupied)
(31+1^2)%10 =2 (occupied)
(31+2^2)%10 =5
The index position 5 is empty, hence we will place the element at index 5.
Index Key
0 Null
1 121
2 2
3 Null
4 4
5 31
6 Null
7 7
8 Null
9 9
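The next key to be inserted is 41. According to the hash function
H(key) = 41%10
H(key) = 1.
But the index 1 location is already occupied by 121, i.e. a collision occurs. Hence we
will apply quadratic probing to insert this record in the hash table.
Starting from i=0:
(41+0^2)%10 =1 (occupied)
(41+1^2)%10 =2 (occupied)
(41+2^2)%10 =5 (occupied)
(41+3^2)%10 =0
The index position 0 is empty, hence we will place the element at index 0.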
Index Key
0 41
1 121
2 2
3 Null
4 4
5 31
6 Null
7 7
8 Null
9 9
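The next key to be inserted is 64. According to the hash function
H(key) = 64%10
H(key) = 4.
But the index 4 location is already occupied by 4, i.e. a collision occurs. Hence we will
apply quadratic probing to insert this record in the hash table.
Starting from i=0:
(64+0^2)%10 =4 (occupied)
(64+1^2)%10 =5 (occupied)
(64+2^2)%10 =8
The index position 8 is empty, hence we will place the element at index 8.

Index Key
0 41
1 121
2 2
3 Null
4 4
5 31
6 Null
7 7
8 64
9 9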
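The next key to be inserted is 87. According to the hash function
H(key) = 87%10
H(key) = 7.
But the index 7 location is already occupied by 7, i.e. a collision occurs. Hence we will
apply quadratic probing to insert this record in the hash table.
Starting from i=0:
(87+0^2)%10 =7 (occupied)
(87+1^2)%10 =8 (occupied)
(87+2^2)%10 =1 (occupied)
(87+3^2)%10 =6
The index position 6 is empty, hence we will place the element at index 6.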

Index Key
0 41
1 121
2 2
3 Null
4 4
5 31
6 87
7 7
8 64
9 9
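The quadratic probing steps traced above can be sketched in C as follows. The names TABLE_SIZE, EMPTY, table_init and quadratic_insert are assumptions made for this sketch, and keys are assumed to be non-negative.

```c
#define TABLE_SIZE 10
#define EMPTY (-1)   /* marks a free cell; assumes keys are non-negative */

/* Mark every cell of the table as free. */
void table_init(int table[])
{
    int i;
    for (i = 0; i < TABLE_SIZE; i++)
        table[i] = EMPTY;
}

/* Insert key using quadratic probing: probe (home + i*i) % TABLE_SIZE
   for i = 0, 1, 2, ... Returns the index used, or -1 if no free cell
   was found within TABLE_SIZE probes. */
int quadratic_insert(int table[], int key)
{
    int home = key % TABLE_SIZE;
    int i;

    for (i = 0; i < TABLE_SIZE; i++) {
        int index = (home + i * i) % TABLE_SIZE;
        if (table[index] == EMPTY) {
            table[index] = key;
            return index;
        }
    }
    return -1;
}
```

Unlike linear probing, the probe sequence does not visit every cell, so an insertion can fail even though free cells remain. This is one reason why the table is usually restructured before it becomes too full.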
4. Double hashing:
Double hashing is a technique in which a second hash function is applied to the key
when a collision occurs. The second hash function gives the number of positions to
jump from the point of collision to find the slot for insertion.
There are two important rules to be followed for the second function:
· It must never evaluate to zero.
· It must make sure that all cells can be probed.
The formulas to be used for double hashing are
H1(key) = key % tablesize
H2(key) = M - (key % M)
where M is a prime number smaller than the size of the table.
For example:
The keys are 121,2,4,32,64,7,88,9, the table size is 10 and M = 7.

Index Key
0 Null
1 121
2 2
3 Null
4 4
5 Null
6 Null
7 7
8 88
9 9

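Using the first hash function H1(key) = key % 10:
H1(key) = 121%10 = 1
H1(key) = 2%10 = 2
H1(key) = 4%10 = 4
H1(key) = 7%10 = 7
H1(key) = 9%10 = 9
H1(key) = 88%10 = 8
The next key to be inserted is 32.
H1(key) = 32%10
H1(key) = 2.
But the index 2 location is already occupied by 2, i.e. a collision occurs. Hence we
will apply double hashing to insert this record in the hash table, using the second
hash function H2(key) = M-(key%M) with M = 7.
H2(key) = M-(key%M)
H2(key) = 7-(32%7)
= 7-4=3
That means we have to insert the element 32 at 3 places from index 2, i.e. we have to
take 3 jumps. Then 32 will be placed at index 5.
The next key to be inserted is 64.
H1(key) = 64%10
H1(key) = 4.

But the index 4 location is already occupied by 4, i.e. a collision occurs. Hence we will
apply double hashing to insert this record in the hash table.
H2(key) = M-(key%M)
H2(key) = 7-(64%7)
= 7-1=6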
Index Key
0 64
1 121
2 2
3 Null
4 4
5 32
6 Null
7 7
8 88
9 9
That means we have to insert the element 64 at 6 places from index 4, i.e. we have to
take 6 jumps, wrapping around the end of the table: (4+6)%10 = 0. Then 64 will be
placed at index 0.
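The double hashing insertion traced above can be sketched in C as follows. The names TABLE_SIZE, PRIME_M, EMPTY, table_init and double_hash_insert are assumptions made for this sketch, and keys are assumed to be non-negative.

```c
#define TABLE_SIZE 10
#define PRIME_M 7    /* prime number smaller than the table size */
#define EMPTY (-1)   /* marks a free cell; assumes keys are non-negative */

/* Mark every cell of the table as free. */
void table_init(int table[])
{
    int i;
    for (i = 0; i < TABLE_SIZE; i++)
        table[i] = EMPTY;
}

/* Insert key using double hashing. The first hash gives the home
   bucket and the second hash gives the jump length, which lies in
   1..PRIME_M and is therefore never zero. Returns the index used, or
   -1 if no free cell was found within TABLE_SIZE probes. */
int double_hash_insert(int table[], int key)
{
    int h1 = key % TABLE_SIZE;
    int h2 = PRIME_M - (key % PRIME_M);
    int i;

    for (i = 0; i < TABLE_SIZE; i++) {
        int index = (h1 + i * h2) % TABLE_SIZE;
        if (table[index] == EMPTY) {
            table[index] = key;
            return index;
        }
    }
    return -1;
}
```

Inserting the keys 121, 2, 4, 32, 64, 7, 88, 9 in that order places 32 at index 5 and 64 at index 0, as in the worked example.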
5. Rehashing (Hash Restructuring):
Rehashing is a technique in which the table is resized, i.e., the size of the table is
doubled by creating a new table and reinserting every key using the new table size.
It is preferable if the size of the new table is a prime number. There are situations
in which rehashing is required-
When the table is completely full.
With quadratic probing, when the table is half full.
When insertions fail due to overflow.

Applications of Hashing:
1. In compilers to keep track of declared variables.
2. For online spelling checking the hashing functions are used.
3. Hashing helps in Game playing programs to store the moves made.
4. For browser program while caching the web pages, hashing is used.
Comparison of Hashing and Skip List:
1. Hashing: This method is used to carry out dictionary operations using a randomized process.
   Skip list: Skip lists are used to implement dictionary operations using randomized processes.
2. Hashing: It is based on a hash function.
   Skip list: It does not require a hash function.
3. Hashing: If sorted data is given, then hashing is not an effective method to implement the dictionary.
   Skip list: Sorted data improves the performance of the skip list.
4. Hashing: The space requirement in hashing is for the hash table, and a forward pointer is required per node.
   Skip list: Forward pointers are required for every level of the skip list.
5. Hashing: Hashing is a more efficient method than skip lists.
   Skip list: Skip lists are not that efficient.
6. Hashing: Skip lists are more versatile than hash tables.
   Skip list: The worst case space requirement is larger for skip lists than for hashing.
#include<stdio.h>
#include<stdlib.h>
typedef struct node
{
    int data;
    struct node *next;
}s;
s *root[10];   /* root[i] is the head of the chain for bucket i */
int max=10;    /* number of buckets; keys are assumed to be non-negative */
/* insert key at the end of the chain of its home bucket */
void insert(int key)
{
    int index=key%max;
    s *temp;
    s *ptr=(s*)malloc(sizeof(s));
    if(ptr==NULL)
    {
        printf("\nOut of memory");
        exit(1);
    }
    ptr->data=key;
    ptr->next=NULL;
    if(root[index]==NULL)
        root[index]=ptr;
    else
    {
        temp=root[index];
        while(temp->next!=NULL)
            temp=temp->next;
        temp->next=ptr;
    }
}
/* walk the chain of the home bucket looking for key */
void search(int key)
{
    int flag=0;
    int index=key%max;
    s *temp=root[index];
    while(temp!=NULL)
    {
        if(temp->data==key)
        {
            printf("\nSearch key is found!!");
            flag=1;
            break;
        }
        else
            temp=temp->next;
    }
    if (flag==0)
        printf("\nsearch key not found.......");
}
/* unlink and free the first node that holds key, if there is one */
void delete_ele(int key)
{
    int index=key%max;
    s *temp=root[index];
    s *prev=NULL;
    while(temp!=NULL && temp->data!=key)
    {
        prev=temp;
        temp=temp->next;
    }
    if(temp==NULL)
    {
        printf("\n %d is not present.",key);
        return;
    }
    if(prev==NULL)
        root[index]=temp->next;   /* key was at the head of the chain */
    else
        prev->next=temp->next;
    printf("\n %d has been deleted.",temp->data);
    free(temp);
}
int main(void)
{
    int ch,n,num;
    int i;
    for(i=0;i<max;i++)
        root[i]=NULL;
    while(1)
    {
        printf("\nMENU:\n1.Create");
        printf("\n2.Search for a value\n3.Delete an value");
        printf("\nEnter your choice:");
        if(scanf("%d",&ch)!=1)
            break;
        switch(ch)
        {
        case 1:printf("\nEnter the number of elements to be inserted:");
            scanf("%d",&n);
            printf("\nEnter the elements to be inserted:");
            for(i=0;i<n;i++)
            {
                scanf("%d",&num);
                insert(num);
            }
            break;
        case 2:printf("\nEnter the element to be searched:");
            scanf("%d",&n);
            search(n);
            break;
        case 3:printf("\nEnter the element to be deleted:");
            scanf("%d",&n);
            delete_ele(n);
            break;
        default:printf("\nInvalid choice....");
            exit(0);
        }
    }
    return 0;
}

CHAPTER
2
BALANCED
TREES
Syllabus:
AVL Trees: Maximum Height of an AVL Tree, Insertions and Deletions. 2-3 Trees :
Insertion, Deletion.

Balanced Trees: - A balanced tree is a rooted tree in which, at every node, the heights
of the sub trees differ by at most a small constant. The height of such a binary tree
with n nodes is O(log n)
Various types of balanced trees are
1. Height balance tree (AVL Tree)
2. Red-Black tree
3. Splay Tree
4. B-tree
1. AVL Trees: - An AVL tree is a special type of binary search tree that is based on
height balancing. AVL trees are also known as balanced search trees. The name AVL
comes from Adelson-Velskii and Landis, who in 1962 introduced a binary tree
structure that is balanced with respect to the heights of its sub trees. An AVL tree
can be defined as follows: “An AVL tree is a binary search tree T such that for every
internal node v of T, the heights of the children of v can differ by at most 1”. This
property is called the height-balance property.
The height of a tree is defined as follows
A tree with no elements has height -1
A tree with 1 element has height 0
A tree with more than 1 element has height equal to 1 + the height of its tallest sub tree.
For every node with left sub tree TL and right sub tree TR: |height(TL) – height(TR)| <= 1
The balance property ensures that the depth of the tree is O(log n)
The height of an AVL tree storing n keys is O(log n)
The following diagrams show examples of AVL trees and non-AVL trees
An AVL tree is a binary search tree in which the difference between the heights of the
right and left sub trees of any node is never more than one. At the time of insertion
or deletion of an element there is scope for violating this rule. In that case the tree
must restore the AVL property by restructuring.
Restructuring requires rotation of the trees. These rotations are of two types: one
is single rotation and the other is double rotation. Each type again comes with left
rotations and right rotations.
Difference between an AVL tree and a binary search tree: in an AVL tree, for every
node in the tree the heights of the left and right sub trees can differ by at most 1.
Insertion: - There are four different cases when rebalancing is
required after insertion of a new node –
1. An insertion of a new node into the left subtree of the left child (LL)

2. An insertion of a new node into the right subtree of the left child (LR)
3. An insertion of a new node into the left subtree of the right child (RL)
4. An insertion of a new node into the right subtree of the right child (RR)
There are two types of rotations.
Single rotations: Left-Left (LL rotation) and Right-Right (RR rotation)
Double rotations: Left-Right (LR rotation) and Right-Left (RL rotation)
Single Rotation: - Consider A, B and C to be any three sub trees and p, q to be two
nodes. The relation between p and q is always p<q.
[Figure: tree before single rotation]
In the above diagram, if we insert an element into sub tree A, there is a chance of
violating the AVL tree property at node q. In this case
p is made the parent node and q is made the right child of p. Sub tree B is assigned
as the left sub tree of node q. This is called single rotation. In these diagrams
neither p nor q need be the root node.
The following diagram shows this single rotation applied to the above diagram.

So, clearly a violation occurred at node 45. Applying the single rotation principle,
p is 41 and q is 45. Sub tree A is rooted at 38 and there are no sub trees like B
and C. After applying the single rotation, the new AVL tree is shown in the
following diagram. Here the partial sub tree rotation for 45 is shown in the first
diagram and the complete diagram is shown next.
Double Rotation: - Unlike single rotation, in double rotation two cases
exist. One is the right-left double rotation and the other is the left-right double
rotation. The rotation to apply depends on the relationship between the nodes
involved. Consider A, B, C and D to be any four sub trees and p, q and r to be
three nodes.
Right Left Double Rotation: - In the ideal case of right-left double rotation, the tree
appears as follows:
[Figure: tree before right-left double rotation]
In the above diagram, if we insert an element into a sub tree, there is a
chance of violating the AVL tree property at a node. In this case q is made the parent
node of r and p. Node r is added as the left child of q and node p is added as the
right child of q. Sub trees A and B are assigned to node r, and C and D are assigned
to node p. Here q need not be the root node.
The following diagram shows this right-left double rotation applied to the above
diagram.
[Figure: tree after applying right-left double rotation]

For example, observe the following diagram, which shows the insertion of an element
without applying double rotation, resulting in a violation of the AVL balance property
(the inserted element is shown with dotted lines).

[Figure: AVL tree after insertion, before right-left double rotation]

After applying right-left double rotation, the following balanced AVL tree appears as
the result.
[Figure: AVL tree after right-left double rotation]

Left Right Double Rotation:
In the ideal case of left-right double rotation, the tree appears as follows:

In the above diagram, if we insert an element into a sub tree, there is a
chance of violating the AVL tree property at a node. In this case q is made the parent
node of p and r. Node p is added as the left child of q and node r is added as the
right child of q. Sub trees A and B are assigned to node p, and C and D are assigned
to node r. Here q need not be the root node.
The following diagram shows this left-right double rotation applied to the above
diagram.
For example, observe the following diagram, which shows the insertion of an element
without applying double rotation, resulting in a violation of the AVL balance property
(the inserted element is shown with dotted lines).
[Figure: AVL tree after insertion, before left-right double rotation]

After applying left-right double rotation, the following balanced AVL tree appears as
the result.

[Figure: AVL tree after left-right double rotation]

Removing an element from AVL tree: -
Removing an element from an AVL tree begins in the same way as in a binary search
tree. That means the node removed will become an empty external node. Its removal
may cause an imbalance at its parent or at another ancestor. Observe the following
diagrams, which show an AVL tree before and after the deletion of an element.
[Figure: AVL tree before deletion]
After removing element 17 the new diagram is as follows
Rebalancing After Removing: -

Here a case exists that needs a right-left double rotation at the unbalanced node.
After applying right-left double rotation the new AVL diagram is as follows:






Examples:
Construct an AVL tree for
20, 11, 5, 32, 40, 2, 4, 27, 23, 28, 50

Sun, Mon, Tue, Wed, Thu, Fri, Sat
Jan, Feb, Mar, Apr, May, June, July, Aug, Sept, Oct, Nov, Dec
Construct an AVL tree using the following data entered in sequence
7,14,2,5,10,33,56,30,15,25,66,70,4
Program: AVL tree insertion
#include <stdio.h>
#include <stdlib.h>
#define max(a,b) (((a) > (b)) ? (a) : (b))
typedef struct node
{
int data;
struct node *left, *right;
}s;
s *root;
s* rotate_LL(s *p)
{
s *p1 = p->left;
p->left = p1->right;
p1->right = p;
return p1;
}
s* rotate_RR(s *p)
{
s *p1 = p->right;
p->right = p1->left;
p1->left = p;
return p1;
}
s* rotate_RL(s *p)
{
s *p1 = p->right;
p->right = rotate_LL(p1);
return rotate_RR(p);
}
s* rotate_LR(s *p)
{
s *p1 = p->left;
p->left = rotate_RR(p1);
return rotate_LL(p);
}
int get_height(s *p)
{
int height=0;

if(p != NULL)
height = 1+max(get_height(p->left),get_height(p->right));
return height;
}
int get_balance(s *p)
{
if(p == NULL)
return 0;
return get_height(p->left) - get_height(p->right);
}
s* balance_tree(s **p)
{
int height_diff= get_balance(*p);
if(height_diff > 1)
{
if(get_balance((*p)->left) > 0)
*p = rotate_LL(*p);
else
*p = rotate_LR(*p);
}
else if(height_diff < -1)
{
if(get_balance((*p)->right) < 0)
*p = rotate_RR(*p);
else
*p = rotate_RL(*p);
}
return *p;
}
s* avl_add(s **root,int key)
{
if(*root == NULL)
{
*root = (s*)malloc(sizeof(s));
(*root)->data = key;
(*root)->left = (*root)->right = NULL;
}
else if(key < (*root)->data)
{
(*root)->left = avl_add(&((*root)->left),key);
(*root) = balance_tree(root);
}
else if(key > (*root)->data)
{
(*root)->right = avl_add(&((*root)->right), key);

(*root) = balance_tree(root);
}
else
{
printf("fail! - duplicated key\n");
exit(-1);
}
return *root;
}
void inorder(s *p)
{
if(p!=NULL)
{
inorder(p->left);
printf("%3d",p->data);
inorder(p->right);
}
else
return;
}
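int main(void)
{
    int data;
    /* clrscr() and getch() belong to conio.h, which is not included
       and is not part of standard C, so they are not used here */
    root = NULL;
    while(1)
    {
        printf("enter the data (-1 to stop) ");
        if(scanf("%d",&data)!=1 || data==-1)
            break;
        avl_add(&root,data);
    }
    printf("\n Inorder traversal : \n");
    inorder(root);
    printf("\n");
    return 0;
}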
2. 2-3 Trees: A 2-3 tree is either an empty tree, a single node tree, or a tree with
multiple nodes that has the following properties.
1. Each interior node has either two or three children
2. Each path from root to leaf has the same length
A 2-3 tree is always height balanced. That is, every path from the root to a leaf has
the same length.

A node with two children holds one key (Key):
left sub tree: elements that are < Key
right sub tree: elements that are > Key

A node with three children holds two keys (Key 1 and Key 2):
left sub tree: elements that are < Key 1
middle sub tree: elements between Key 1 and Key 2
right sub tree: elements that are > Key 2
Insertion: To insert an item, find a leaf to put the item in and then split nodes, if
necessary. There are three cases for insertion of a node.
Case 1:
i) Left branch
ii) Right branch
Case 2:
i) Left branch

ii) Right branch
Case 3: Splitting the root node
Example: 2-3 tree
Root: 44
Internal nodes: [20] [60 70]
Leaves: [11 12] [30] [50] [65] [90]
Insert 29, 28, 27, 26, 25, 24, 23, 22

Step 1: Insert 29
Locate leaf to insert 29
Step 2: Insert 28
Determine smallest=28, Middle=29, Largest=30
Step 3: Insert 27
There is a leaf that contains only 1 data value, insert the value 27

Step 4: Insert 26
Conceptually there cannot be 4 children

Hence split internal node 20 27 29
Step 5: Insert 25

Step 6: Insert 24
Step 7: Insert 23
Step 8: Insert 22
Deletion: When we insert a node, we split nodes. But when we delete a node,
we may need to merge nodes.
The deletion operation is carried out in one of two ways – either remove and merge,
or remove and redistribute.

Example: 2-3 tree
Root: 44
Internal nodes: [20] [50 95]
Leaves: [11 12] [30] [48] [75 88] [97]
Delete 50, 75, 97, 88 in order

Step 1: Delete 50. For this deletion we find the inorder successor of node 50. It is
75. Swap 50 and 75. Then delete 50 from the leaf.
Step 2: Delete 75. The node 75 is an internal node. Swap it with its inorder successor.
The inorder successor will always be in a leaf node.
Step 3: Delete 97
The 97 is already in a leaf, so just remove it from the leaf.

Step 4: Delete 88
Swap 88 with its inorder successor. Then delete 88.
We cannot redistribute, hence merge nodes.
As the root is empty, set the new pointer to the root.

Hence

CHAPTER
3
PRIORITY
QUEUES
Syllabus:
Binary Heaps : Implementation of Insert and Delete min, Creating Heap.
Binomial Queues : Binomial Queue Operations, Binomial Amortized Analysis, Lazy
Binomial Queues

Priority Queue: The priority queue is a data structure having a collection of
elements which are associated with specific ordering or priorities.
There are two types of priority queues.
1. Ascending priority queue.
2. Descending priority queue.
1.Ascending priority queue – It is a collection of items in which the items can be
inserted arbitrarily but only smallest element can be removed.
2. Descending priority queue – It is a collection of items in which insertion of items
can be in any order but only largest element can be removed.
The implementation of priority queue can be done using arrays or linked list.
The data structure heap is used to implement the priority queue effectively.
ADT for Priority Queue:
Various operations that can be performed on priority queue are
1. Insertion
2. Deletion
3. Display
Insertion operation:
While implementing the priority queue we will apply a simple logic. That is
while inserting the element we will insert the element in the array at the proper
position. For example, if the elements are placed in the queue as
9 12
Q[0] Q[1] Q[2] Q[3] Q[4]
And now if an element 8 is to be inserted in the queue then it will be at 0th
location as –
8 9 12
Q[0] Q[1] Q[2] Q[3] Q[4]
If the next element comes as 11 then the queue will be
8 9 11 12
Q[0] Q[1] Q[2] Q[3] Q[4]
Deletion operation:
In the deletion operation we are simply removing the element at the front.
For example, if queue is created
8 9 11 12
Q[0] Q[1] Q[2] Q[3] Q[4]
Then the element at q[0] will be deleted first.
8 9 11 12
Q[0] Q[1] Q[2] Q[3] Q[4]
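The insertion and deletion logic described above can be sketched in C as follows. This is a minimal sketch of an ascending priority queue kept in a sorted array. The names PriorityQueue, pq_insert and pq_delete and the capacity of five slots (Q[0] to Q[4]) are assumptions made for this sketch.

```c
#define PQ_CAPACITY 5   /* the five slots Q[0]..Q[4] of the example */

typedef struct {
    int items[PQ_CAPACITY];  /* kept in ascending order */
    int count;               /* number of slots in use */
} PriorityQueue;

/* Insert value at its sorted position. Returns 0 if the queue is full. */
int pq_insert(PriorityQueue *q, int value)
{
    int i;

    if (q->count == PQ_CAPACITY)
        return 0;
    /* Shift every larger element one slot to the right. */
    for (i = q->count - 1; i >= 0 && q->items[i] > value; i--)
        q->items[i + 1] = q->items[i];
    q->items[i + 1] = value;
    q->count++;
    return 1;
}

/* Remove the front (smallest) element into *out. Returns 0 if empty. */
int pq_delete(PriorityQueue *q, int *out)
{
    int i;

    if (q->count == 0)
        return 0;
    *out = q->items[0];
    /* Shift the remaining elements one slot to the left. */
    for (i = 1; i < q->count; i++)
        q->items[i - 1] = q->items[i];
    q->count--;
    return 1;
}
```

Starting from an empty queue, inserting 9, 12, 8 and 11 leaves the array as 8 9 11 12, and the first deletion removes 8. Both operations shift elements, so each costs time proportional to the number of elements. The heap described next reduces this to O(log n).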
Binary Heap: A heap is a complete binary tree in which every parent node is either
greater than or equal to, or less than or equal to, its child nodes.
A heap can be a min heap or a max heap.

A max heap is a tree in which the value of each node is greater than or equal to the
value of its children nodes.
A min heap is a tree in which the value of each node is less than or equal to the value
of its children nodes.
Ex: Construct heap structure for given list of elements
8, 7, 4, 5, 6, 2, 3, 11, 9
Step 1:
Step 2: Now we will construct the min heap structure. That means a binary tree in
which each parent node is less than its children.
We will start scanning in a bottom-up manner.
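The bottom-up construction, together with the insert and delete min operations of a min heap, can be sketched in C as follows. The heap is stored in an array in which the children of the node at index i are at indices 2i+1 and 2i+2. The names MinHeap, heap_build, heap_insert and heap_delete_min and the capacity of 32 are assumptions made for this sketch.

```c
#define HEAP_CAPACITY 32

typedef struct {
    int data[HEAP_CAPACITY];  /* data[0] is the root (the minimum) */
    int size;
} MinHeap;

static void heap_swap(int *a, int *b)
{
    int t = *a;
    *a = *b;
    *b = t;
}

/* Move the element at index i down until both children are not smaller. */
static void sift_down(MinHeap *h, int i)
{
    for (;;) {
        int left = 2 * i + 1, right = 2 * i + 2, smallest = i;
        if (left < h->size && h->data[left] < h->data[smallest])
            smallest = left;
        if (right < h->size && h->data[right] < h->data[smallest])
            smallest = right;
        if (smallest == i)
            break;
        heap_swap(&h->data[i], &h->data[smallest]);
        i = smallest;
    }
}

/* Creating a heap: copy the values, then scan the internal nodes from
   the bottom up, sifting each one down. Assumes n <= HEAP_CAPACITY. */
void heap_build(MinHeap *h, const int *values, int n)
{
    int i;
    h->size = n;
    for (i = 0; i < n; i++)
        h->data[i] = values[i];
    for (i = n / 2 - 1; i >= 0; i--)
        sift_down(h, i);
}

/* Insert: place the value in the next free leaf and move it up while
   it is smaller than its parent. Returns 0 if the heap is full. */
int heap_insert(MinHeap *h, int value)
{
    int i;
    if (h->size == HEAP_CAPACITY)
        return 0;
    i = h->size++;
    h->data[i] = value;
    while (i > 0 && h->data[(i - 1) / 2] > h->data[i]) {
        heap_swap(&h->data[(i - 1) / 2], &h->data[i]);
        i = (i - 1) / 2;
    }
    return 1;
}

/* Delete min: take the root, move the last leaf to the root and sift
   it down. Returns 0 if the heap is empty. */
int heap_delete_min(MinHeap *h, int *out)
{
    if (h->size == 0)
        return 0;
    *out = h->data[0];
    h->data[0] = h->data[--h->size];
    sift_down(h, 0);
    return 1;
}
```

For the list 8, 7, 4, 5, 6, 2, 3, 11, 9 the construction leaves 2 at the root, and repeated calls to heap_delete_min return the elements in ascending order.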

Insertion:


Deletion:


Binomial Queues:
A binomial queue is a forest of heap-ordered trees.
A binomial heap is a collection of binomial trees. Hence let us define binomial trees
first.
Binomial tree: The ith binomial tree, B_i with i >= 0, has a root with i children, B_(i-1), ..., B_0.
Representation: A binomial heap is a collection of binomial trees of distinct sizes,

each of which satisfies the heap property.
The heap property can be a min heap property or a max heap property.
In the min heap property the key of a node is greater than or equal to the key of its
parent node.
Operations:
Various operations that can be performed on a binomial heap are
1. Merging of two binomial heaps.
2. Union of two binomial heaps.

3. Insertion of an element into the binomial heap.
4. Deletion of an element from the binomial heap.
Union of two binomial heaps:
1st stage: Merge the two root lists of the binomial heaps into a single linked list,
ordered by degree.
2nd stage: Apply four cases for uniting the two binomial heaps.
Example:
Consider that there are two binomial heaps
Merged binomial heap
Four Cases: Here m is the current root, next_m is the root that follows it and
sibling[next_m] is the root after next_m.
Case 1: If degree[m] != degree[next_m] then move the pointers ahead.

Case 2: If degree[m] = degree[next_m] = degree[sibling[next_m]] then move the pointers
ahead.
Case 3: If degree[m] = degree[next_m] != degree[sibling[next_m]] and key[m] <=
key[next_m] then we attach the next_m node to the m node in order to create a B_(k+1) tree.
Case 4: If degree[m] = degree[next_m] != degree[sibling[next_m]] and key[m] >
key[next_m] then we attach the m node to the next_m node in order to create a B_(k+1) tree.
Now let us apply these cases on a merged tree as follows.

The above situation resembles case 3. Hence we will apply the transition of case 3.
That is, node 20 will be the child of node 13. Hence we will get
The above situation resembles case 2. Then according to case 2 we will simply
move the pointers ahead. Hence we will get

Here degree[m] = degree[next_m] != degree[sibling[next_m]] i.e. degree[node 8] =
degree[node 2] != degree[node 17] and key[m] > key[next_m] i.e. 8 > 2. Hence case 4
is applicable: attach the list of node 8 as a child to node 2. So we get
Here degree[m] = degree[next_m] != degree[sibling[next_m]] i.e. degree[node 2] =
degree[node 17] != degree[node 5] and key[m] <= key[next_m] i.e. 2<17. Hence case
3 is applicable.
Here degree[m] != degree[next_m] i.e. degree[node 2] != degree[node 5]. Hence case 1
is applicable, which tells us to simply move the prev_m, m and next_m pointers
ahead.
Thus finally we get the united binomial heap as follows.


CHAPTER
4
GRAPHS
Syllabus:
Operations on Graphs: Vertex insertion, vertex deletion, find vertex, edge addition,
edge deletion, Graph Traversals- Depth First Search and Breadth First Search(Non
recursive) .Graph storage Representation- Adjacency matrix, adjacency lists.

4.1 Graphs: - A graph is a set of nodes (or vertices) and edges (or arcs) which
connect them. We write a graph G as: G = (V,E)
where V is the set of nodes and E is the set of edges.
[Figure: example graph with nodes V1 to V5 and edges E1 to E7]
If two nodes are connected by an edge, those two nodes are said to be
adjacent or neighbors. An edge which has a direction is called a directed edge. An
edge which has no specific direction is called an undirected edge.
If a node does not have any adjacent nodes, then it is said to be an isolated node.
A graph which contains only isolated nodes is called a null graph.
Types of Graphs: Graphs are of two types
1. Directed graphs 2. Undirected graphs
A directed graph (or digraph) is a graph in which all edges are directed
edges.
An undirected graph is a graph in which all edges are undirected edges.
If a graph contains both directed and undirected edges, it is said to be a mixed
graph.
An edge in a graph which starts and ends on the same node is called a loop.
The degree of a node is the number of edges connected directly to that node, i.e. the
number of edges incident on it.
In a directed graph, the indegree of a node is the number of edges terminating
at that node. The outdegree of a node is the number of edges beginning from that
node. The sum of the indegree and outdegree of a node is called its total degree. The
concepts of indegree and outdegree do not apply to an undirected graph. A node whose
indegree is 0 is called a source node and a node whose outdegree is 0 is called a sink
node. For isolated nodes, the degree is 0.
Weighted Graph: - A weighted graph is a graph which has weights
along its edges.
[Figure: example weighted graph with nodes A, B, C, D, E and edge weights 5, 10, 20, 30, 40, 50]

4.2 Graph Representation: - To represent a graph we have to represent two things:
nodes and edges. Graphs are generally represented either in sequential
representation or linked representation. Sequential representation uses a two-dimensional
array whereas linked representation uses linked lists.
Adjacency Matrix: - A graph is conveniently represented by a matrix (two
dimensional array) called the adjacency matrix. A graph containing
n nodes can be represented by a matrix containing n rows and n columns.
The adjacency matrix A of a graph G=(V,E) with n nodes is an n X n matrix such
that Aij = 1 if there is an edge between vi and vj
         = 0 otherwise
An adjacency matrix has certain disadvantages:

a) Graphs with few edges would have a lot of wasteful Zeros in the adjacency
matrix. That is the corresponding adjacency matrix is sparse.
b) Insertion and deletion of nodes is difficult.
Adjacency List: - In this representation we maintain an adjacency list. In the
adjacency list, for each node we keep a list of all its adjacent nodes.
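A minimal sketch of the linked representation, assuming at most MAXV vertices numbered from 0. The structure and function names (node, add_neighbour, add_edge, degree) are hypothetical and chosen only for this illustration; head[] plays the role of the array of head nodes.

```c
#include <stdlib.h>

#define MAXV 10  /* assumed upper bound on the number of vertices */

/* One entry in a vertex's list of neighbours. */
struct node {
    int vertex;
    struct node *next;
};

struct node *head[MAXV];  /* head[i] starts the list of vertices adjacent to i */

/* Add v to the front of u's list; returns 0 on success, -1 on failure. */
int add_neighbour(int u, int v)
{
    struct node *p;
    if (u < 0 || u >= MAXV || v < 0 || v >= MAXV)
        return -1;
    p = malloc(sizeof *p);
    if (p == NULL)
        return -1;
    p->vertex = v;
    p->next = head[u];
    head[u] = p;
    return 0;
}

/* Undirected edge: each endpoint appears in the other's list. */
int add_edge(int u, int v)
{
    if (add_neighbour(u, v) != 0)
        return -1;
    return add_neighbour(v, u);
}

/* Number of entries in u's list, i.e. the degree of u. */
int degree(int u)
{
    int count = 0;
    struct node *p;
    for (p = head[u]; p != NULL; p = p->next)
        count++;
    return count;
}
```

Unlike the matrix, the lists take space proportional to the number of edges, which is why this form suits graphs with few edges.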
4.3 Graph Traversals: - Graph traversals are of two types.

1. Depth-first and 2. Breadth-first
Depth-First Traversal: Depth-first traversal of a graph is roughly analogous to preorder
traversal of an ordered tree. Suppose that the traversal has just visited a vertex V,
and let w1, w2, ..., wk be the vertices adjacent to V. Then we shall next visit w1
and keep w2, ..., wk waiting. After visiting w1, we traverse all the vertices to
which it is adjacent before returning to traverse w2, ..., wk.
Breadth-First Traversal: - Breadth-first traversal of a graph is roughly analogous to
level-by-level traversal of an ordered tree. If the traversal has just visited a vertex V, then
it next visits all the vertices adjacent to V, putting the vertices adjacent to these in a
waiting list to be traversed after all the vertices adjacent to V have been visited.
Depth-First Search Algorithm:
1. Initialize all nodes to the ready state and initialize the stack to empty.
2. Begin with any node which is in the ready state and push it onto the stack.
Mark the status of that node as waiting.
3. While stack is not empty do
Begin
4. Pop the top node K of the stack and process it. Mark the status of that node as
visited.
5. Push all the adjacent nodes of K which are in the ready state onto the stack and mark
the status of those nodes as waiting.
End
6. If the graph still contains nodes which are in the ready state then
Go to step 2
7. return
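The numbered steps can be followed literally with an explicit status per node. The sketch below is an illustration only: the names MAXV, status and dfs_order are assumptions, the graph is an adjacency matrix, and step 6 (restarting from a node that is still ready) is left to the caller. Because a node is marked waiting when it is pushed, each node enters the stack at most once.

```c
#define MAXV 20  /* assumed upper bound on the number of vertices */

enum status { READY, WAITING, VISITED };

/* Non-recursive traversal from start over an n-vertex adjacency matrix.
   Vertices are appended to order[] as they are processed (step 4);
   the return value is the number of vertices reached. */
int dfs_order(int g[MAXV][MAXV], int n, int start, int order[MAXV])
{
    enum status st[MAXV];
    int stack[MAXV];
    int top = -1, count = 0, k, w;

    for (k = 0; k < n; k++)          /* step 1 */
        st[k] = READY;
    stack[++top] = start;            /* step 2 */
    st[start] = WAITING;

    while (top != -1) {              /* step 3 */
        k = stack[top--];            /* step 4: pop and process */
        order[count++] = k;
        st[k] = VISITED;
        for (w = 0; w < n; w++)      /* step 5: push ready neighbours */
            if (g[k][w] && st[w] == READY) {
                stack[++top] = w;
                st[w] = WAITING;
            }
    }
    return count;
}
```

For a graph with edges 0-1, 0-2 and 1-3, starting at 0, the nodes are processed in the order 0, 2, 1, 3: node 2 is pushed after node 1 and is therefore popped first.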
Breadth-First Search Algorithm:-
1. Initialize all nodes to the ready state and initialize the queue to empty.
2. Begin with any node which is in the ready state and put it into the queue.
Mark the status of that node as waiting.
3. While queue is not empty do
Begin
4. Delete the first node K from the queue and process it. Mark the status of that node
as visited.
5. Add all the adjacent nodes of K which are in the ready state to the rear of the
queue and mark the status of those nodes as waiting.
End
6. If the graph still contains nodes which are in the ready state then go to step 2
7. return.

Program: Depth First Search (non-recursive)
#include<stdio.h>
#include<conio.h>
# define MAX 20
# define TRUE 1
# define FALSE 0
int g[MAX][MAX];
int v[MAX];
int n;
int top=-1;
int st[MAX*MAX]; /* the same vertex may be pushed more than once, so allow extra room */
void create();
void dfs(int);
void push(int);
int pop();
void main()
{
int v1,v2;
char ans;
clrscr();
create();
getch();
do
{
for(v1=0;v1<n;v1++)
v[v1]=FALSE;
printf("\nEnter the vertex from which to traverse\n");
scanf("%d",&v1);
if(v1<0||v1>=n)
printf("\nInvalid vertex");
else
{
printf("\nThe depth first search is:\n");
dfs(v1);
}
printf("\nDo you want to traverse by any other node\n");
ans=getch();
}
while(ans=='y');
}
void dfs(int v1)
{
int v2;
push(v1);
while(top!=-1)
{

v1=pop();
if(v[v1]==TRUE)
continue; /* an older copy of an already printed vertex: skip it */
printf("\n%d",v1);
v[v1]=TRUE;
for(v2=0;v2<n;v2++)
if(g[v1][v2]==TRUE&&v[v2]==FALSE)
push(v2);
}
}
void push(int item)
{
st[++top]=item;
}
int pop()
{
int item;
item=st[top];
top--;
return item;
}
void create()
{
int ch,v1,v2,flag;
char ans='y';
printf("\nEnter the number of nodes\n");
scanf("%d",&n);
/* clear the matrix only after n is known; vertices are numbered from 0 */
for(v1=0;v1<n;v1++)
for(v2=0;v2<n;v2++)
g[v1][v2]=FALSE;
printf("\nEnter the vertices starting from 0:\n");
do
{
printf("\nEnter the vertices v1 & v2:\n");
scanf("%d%d",&v1,&v2);
if(v1>=n||v2>=n)
printf("\nInvalid vertex value\n");
else
{
g[v1][v2]=TRUE;
g[v2][v1]=TRUE;
}
printf("\nAdd more edges?\n");
ans=getch();
}

while(ans=='y');
getch();
}
Program: Breadth First Search (non-recursive)
#include<stdio.h>
#include<conio.h>
#include<stdlib.h>
#define size 20
#define TRUE 1
#define FALSE 0
int g[size][size];
int visit[size];
int Q[size];
int front,rear;
int n;
void create();
void bfs(int);
void main()
{
int v1,v2;
char ans='y';
clrscr();
create();
getch();
do
{
for(v1=0;v1<n;v1++)
visit[v1]=FALSE;
clrscr();
printf("enter the vertex from which you want to traverse");
scanf("%d",&v1);
if(v1>=n)
printf("invalid vertex\n");
else
{
printf("the breadth first search of the graph is \n");
bfs(v1);
getch();
}
printf(" \n do you want to traverse from any other node?");
ans=getche();
}
while(ans=='y');
exit(0);
}

void create()
{
int v1,v2;
char ans='y';
printf("\n enter number of nodes");
scanf("%d",&n);
for(v1=0;v1<n;v1++)
for(v2=0;v2<n;v2++)
g[v1][v2]=FALSE;
printf("\n enter the vertices no. starting from 0");
do
{
printf("\n enter the vertices v1 and v2 ");
scanf("%d%d",&v1,&v2); /* read the edge before validating it */
if(v1<0||v2<0||v1>=n||v2>=n)
printf("Invalid vertex value\n");
else
{
g[v1][v2]=TRUE;
g[v2][v1]=TRUE;
}
printf("\n\n add more edges??(y/n)");
ans=getche();
}while(ans=='y');
}
void bfs(int v1)
{
int v2;
visit[v1]=TRUE;
front=rear=-1;
Q[++rear]=v1;
while(front!=rear)
{
v1=Q[++front];
printf("%d\n ", v1);
for(v2=0;v2<n;v2++)
{
if(g[v1][v2]==TRUE && visit[v2]==FALSE)
{
Q[++rear]=v2;
visit[v2]=TRUE;
}
}
}
}

4.4 Operations on Graphs:
Various operation on graphs are
1. creation of a graph
2. vertex insertion
3. vertex deletion
4. edge insertion
5. edge deletion
6.Finding the vertex
1. Creation of a Graph:
1. Graph creation using adjacency matrix
2. Graph creation using adjacency list
1. Creation using Adjacency Matrix:
Creating a graph using an adjacency matrix is quite a simple task. The adjacency
matrix is nothing but a two dimensional array. The algorithm for creation of a graph
using an adjacency matrix is as follows:
1. Declare an array M[size][size] which will store the graph
2. Enter how many nodes you want in the graph
3. Enter the edges of the graph, each as two vertices Vi, Vj that indicate the edge.
4. If the graph is directed set M[i][j]=1. If the graph is undirected set M[i][j]=1 and
M[j][i]=1 as well.
5. When all the edges of the desired graph have been entered, print the matrix M.
2. Creation using Adjacency List:
1. Declare node structure for creating adjacency list
2. Initialize an array of nodes. This array will act as head nodes. Say *head[10]. The
index of head[ ] will be the starting vertex.
3. The create function will create the adjacency list for given graph G as follows:
A B
0 A D
B C
1 B D C
2 C A
D E
3 D C A
4 E C A

2. Vertex Insertion: - A vertex can be inserted in a graph by giving the name of the new
vertex and the name of the existing vertex to which it is attached.
Ex: Consider a graph G with vertices V0, V1, V2 and V3 whose adjacency matrix is given below
    0 1 2 3
0   0 1 1 0
1   1 0 0 1
2   1 0 0 1
3   0 1 1 0
Now if we insert vertex V4 by attaching it to V3, the adjacency matrix becomes
    0 1 2 3 4
0   0 1 1 0 0
1   1 0 0 1 0
2   1 0 0 1 0
3   0 1 1 0 1
4   0 0 0 1 0
3. Vertex Deletion: When a particular vertex is deleted, all the edges associated
with it get deleted. All these edges are attached to neighbouring vertices.
Hence while deleting a particular vertex, ask for the names of its neighbouring
vertices.
Before deletion (graph on vertices V0 to V4)
    0 1 2 3 4
0   0 1 1 1 0
1   1 0 0 0 1
2   1 0 0 0 1
3   1 0 0 0 1
4   0 1 1 1 0
To delete V4: the neighbouring vertices are V1, V2, V3
G[V4] [V1]= G[V1] [V4]=0
G[V4] [V2]= G[V2] [V4]=0
G[V4] [V3]= G[V3] [V4]=0
After deletion
    0 1 2 3 4
0   0 1 1 1 0
1   1 0 0 0 0
2   1 0 0 0 0
3   1 0 0 0 0
4   0 0 0 0 0
Thus vertex V4 gets deleted.

4. Edge Insertion: We can insert an edge in the graph. An edge is always
represented by two vertices. Following are the steps to be carried out for adding an
edge to the existing graph.
Step 1: Enter the new edge by specifying its two vertices, say V1, V2
Step 2: Set G[V1][V2]=G[V2][V1]=1
Before addition of edge (graph on vertices V0 to V3)
    0 1 2 3
0   0 1 0 1
1   1 0 1 0
2   0 1 0 1
3   1 0 1 0
After addition of the edge V0-V2
    0 1 2 3
0   0 1 1 1
1   1 0 1 0
2   1 1 0 1
3   1 0 1 0
Thus a new edge gets added in the graph.

5. Edge Deletion: Deletion of any desired edge is the simplest operation. Following
are the steps to be followed for deletion of an edge.
Step 1: Specify the edge by its two vertices, say V1 and V2
Step 2: Set G[V1][V2]=G[V2][V1]=0
Step 3: Display the message that the edge is deleted
Before deletion (graph on vertices V0 to V3)
    0 1 2 3
0   0 1 1 1
1   1 0 1 0
2   1 1 0 1
3   1 0 1 0
If we want to delete the edge V2-V3 then the matrix becomes
    0 1 2 3
0   0 1 1 1
1   1 0 1 0
2   1 1 0 0
3   1 0 0 0
Thus the required edge is deleted from the graph.

6. Find Vertex: For finding out the desired vertex we need to scan the complete
graph. If vertex is present, then we will display its adjacent or neighbouring
vertices.
For example
V0
V1 V3
V2
V4
If we search for vertex V4 then the vertices V1, V2 and V3 are neighbouring
vertices.
If the given vertex is not present in the graph then there will not be the
neighbouring vertices. Hence we can simply display a message-“vertex not found”.
Program: Graph operations using adjacency matrix

#include<stdio.h>
#include<conio.h>
#include<stdlib.h>
#define MAX 10
int choice,n;
int G[MAX][MAX];
void insert_vertex();
void delete_vertex();
void find_vertex();
void insert_edge();
void delete_edge();
void display();
void create();
int i,j;
void main()
{
int choice;
char ch='y';
clrscr();
for(i=0;i<MAX;i++)
for(j=0;j<MAX;j++)
G[i][j]=0;
printf("\n Program for graph creation");
create();
display();
do
{
printf("\n Enter your choices");
printf("\n 1. Insertion for vertex");
printf("\n 2. Deletion for vertex");
printf("\n 3. Finding the vertex");
printf("\n 4. Edge Addition");
printf("\n 5. Edge Deletion");
printf("\n 6. Exit");
scanf("%d",&choice);
switch(choice)
{
case 1: insert_vertex();
display();
break;
case 2: delete_vertex();
display();
break;
case 3: find_vertex();
display();
break;

case 4: insert_edge();
display();
break;
case 5: delete_edge();
display();
break;
case 6: exit(0);
}
printf("Do you want to go to main menu?");
ch=getch();
}while(ch=='y');
}
void create()
{
int v1,v2;
char ans='y';
do
{
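printf("\n Enter vertex v1 & v2");
scanf("%d%d",&v1,&v2); /* read the edge; ignore it if a vertex is out of range */
if(v1>=0&&v2>=0&&v1<MAX&&v2<MAX)
G[v1][v2]=G[v2][v1]=1;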
printf("\n Do you want to insert more node?");
ans=getch();
}while(ans=='y');
}
void insert_vertex()
{
int v1,v2;
char ans='y';
printf("Enter the vertex to be inserted");
scanf("%d",&v1);
do
{
printf("\n Enter neighbouring vertex ");
scanf("%d",&v2);
G[v1][v2]=1;
G[v2][v1]=1;
printf("\n More neighbouring vertex?");
ans=getch();
}while(ans=='y');
}
void display()
{
printf("\n");
for(i=0;i<MAX;i++)

{
for(j=0;j<MAX;j++)
{
printf("%3d",G[i][j]);
}
printf("\n");
}
}
void delete_vertex()
{
int i,v;
printf("\n Enter the vertex to be deleted");
scanf("%d",&v);
for(i=0;i<MAX;i++)
{
G[v][i]=0;
G[i][v]=0;
}
printf("\n The vertex is deleted");
}
void find_vertex()
{
int v,i;
int flag=1;
printf("\n Enter the vertex to be searched in the Graph");
scanf("%d",&v);
for(i=0;i<MAX;i++)
{
if(G[v][i]==1)
{
flag=0;
printf("\n Neighbouring vertex is %d",i);
}
}
if(flag==1)
printf("\n Vertex is not present in the Graph");
}
void insert_edge()
{
int v1,v2;
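printf("\n Enter the edge to be inserted by v1 & v2");
scanf("%d%d",&v1,&v2); /* read the edge; ignore it if a vertex is out of range */
if(v1>=0&&v2>=0&&v1<MAX&&v2<MAX)
G[v1][v2]=G[v2][v1]=1;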
}

void delete_edge()
{
int v1,v2;
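printf("\n Enter the edge to be deleted by v1 & v2");
scanf("%d%d",&v1,&v2); /* read the edge; ignore it if a vertex is out of range */
if(v1>=0&&v2>=0&&v1<MAX&&v2<MAX)
G[v1][v2]=G[v2][v1]=0;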
}

CHAPTER
5
GRAPH
ALGORITHMS
Syllabus:
Minimum-Cost Spanning Trees- Prim's Algorithm, Kruskal's Algorithm Shortest
Path Algorithms: Dijkstra's Algorithm, All Pairs Shortest Paths Problem: Floyd's
Algorithm, Warshall's Algorithm

5.1 Minimum Cost Spanning Tree:
A spanning tree of a graph G is a subgraph which is a tree (it contains no
circuit) and which contains all the vertices of G.
A minimum spanning tree of a weighted connected graph G is a spanning tree
with minimum or smallest weight.
The cost of a spanning tree of a weighted graph is
simply the sum of the weights of the tree’s edges. A minimal cost spanning tree is
formed when the edges are picked to minimize the total cost.
Given a connected weighted graph G, it is often desired to create a spanning
tree T for G such that the sum of the weights of the tree edges in T is as small as
possible. Such a tree is called a minimum spanning tree and represents the cheapest
way of connecting all the nodes of G.
1. Spanning trees are very important in designing efficient routing algorithms.
2. Spanning trees have wide applications in many areas such as network design.
There are a number of techniques for creating a minimum spanning tree for a
weighted graph.
1. Prim’s algorithm
2. Kruskal’s algorithm
5.2 Prim’s Algorithm:

A minimum spanning tree can be generated using Prim’s algorithm. In this
approach the tree grows by one edge in each step, always adding the edge of minimal
cost that connects a new node to the tree. The outcome is
optimal. In Prim’s algorithm all the nodes are first created as a forest. Select the
source vertex as the processing node and enter the weights of all its edges
into the matrix. Choose the unknown node that has the
minimum value, draw its edge, take that node as the next processing node and continue the
process.

[Figure: weighted graph on vertices a to h with edges a-b 34, a-c 12, b-d 7, b-g 44, c-d 15, c-e 20, d-e 13, d-f 10, e-h 11, f-g 9, f-h 8, g-h 5]
First create a forest for the above graph. A forest here means the nodes without any
edges. The initialized matrix for the forest is as follows:
V(Vertex)  K(Known)  D(Distance)  P(Path)
a          0         0            0
b          0         ∞            0
c          0         ∞            0
d          0         ∞            0
e          0         ∞            0
f          0         ∞            0
g          0         ∞            0
h          0         ∞            0
Step 1: In the above diagram a is considered as the source vertex. The edges of a
lead to b and c, and their values are filled in the matrix.
V(Vertex)  K(Known)  D(Distance)  P(Path)
a          1         0            0
b          0         34           a
c          0         12           a
d          0         ∞            0
e          0         ∞            0
f          0         ∞            0
g          0         ∞            0
h          0         ∞            0
Step 2: The minimum of the edge values of a is 12, so draw the edge between a and c. Next c is
the processing node. The edges of c lead to d and e, and their values are filled in the matrix.
V(Vertex)  K(Known)  D(Distance)  P(Path)
a          1         0            0
b          0         34           a
c          1         12           a
d          0         15           c
e          0         20           c
f          0         ∞            0
g          0         ∞            0
h          0         ∞            0
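Step 3: The minimum of the edge values of c is 15, so draw the edge between c and d. Next d is the processing
node. The edges of d lead to b, e and f. The current value of b
(34) is greater than the weight of the edge between b and d (7), so update the distance of b to 7
and its path to d. In the same way the value of e (20) is greater than the weight of the
edge between d and e (13), so row e is updated to 13 with path d. Row f is filled with 10 and path d.
V(Vertex)  K(Known)  D(Distance)  P(Path)
a          1         0            0
b          0         7            d
c          1         12           a
d          1         15           c
e          0         13           d
f          0         10           d
g          0         ∞            0
h          0         ∞            0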
Step 4: The minimum of the edge values of d is 7 (to b), so draw the edge between d and b. Next b is the
processing node. The only new edge of b leads to g, and its value is filled in the matrix.
V(Vertex)  K(Known)  D(Distance)  P(Path)
a          1         0            0
b          1         7            d
c          1         12           a
d          1         15           c
e          0         13           d
f          0         10           d
g          0         44           b
h          0         ∞            0
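Step 5: The next selected edge is 10 between d and f, so draw the edge between d and f. Now f is the
processing node. The edges of f lead to g and h. The value of g (44) is greater than the weight
of the edge between f and g (9), so row g is updated to 9 with path f. Row h is filled with 8 and path f.
V(Vertex)  K(Known)  D(Distance)  P(Path)
a          1         0            0
b          1         7            d
c          1         12           a
d          1         15           c
e          0         13           d
f          1         10           d
g          0         9            f
h          0         8            f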
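Step 6: The minimum of g and h is h (8), so draw the edge between f and h. Now h is the
processing node. The edges of h lead to g and e. The existing g value 9 is greater than
the weight of the edge between h and g (5), so row g is updated with 5 and path h. The existing e
value 13 is greater than the weight of the edge between h and e (11), so row e is updated with 11 and path h.
V(Vertex)  K(Known)  D(Distance)  P(Path)
a          1         0            0
b          1         7            d
c          1         12           a
d          1         15           c
e          0         11           h
f          1         10           d
g          0         5            h
h          1         8            f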
Step 7: The next minimum value is 5, so draw the edge between h and g. For node g all edges are
already processed.
V(Vertex)  K(Known)  D(Distance)  P(Path)
a          1         0            0
b          1         7            d
c          1         12           a
d          1         15           c
e          0         11           h
f          1         10           d
g          1         5            h
h          1         8            f
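Step 8: The only remaining node is e with value 11, so draw the edge between e and h. Prim’s algorithm
has now generated the final minimum spanning tree, with edges a-c (12), c-d (15), d-b (7),
d-f (10), f-h (8), h-g (5) and h-e (11) and total cost 12+15+7+10+8+5+11 = 68.
V(Vertex)  K(Known)  D(Distance)  P(Path)
a          1         0            0
b          1         7            d
c          1         12           a
d          1         15           c
e          1         11           h
f          1         10           d
g          1         5            h
h          1         8            f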
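The K, D and P columns used in the steps above map directly onto three arrays. The following is a minimal sketch, assuming the vertices a to h are numbered 0 to 7 and the graph is stored as a weight matrix in which 0 means "no edge"; the names load_example and prim are hypothetical and not part of the original notes.

```c
#include <limits.h>

#define NV 8          /* vertices a..h are numbered 0..7 (assumption) */
#define INF INT_MAX

/* Fill g with the example graph's edge weights (0 means no edge). */
void load_example(int g[NV][NV])
{
    static const int e[12][3] = {
        {0,1,34}, {0,2,12}, {1,6,44}, {2,3,15}, {2,4,20}, {3,1,7},
        {3,5,10}, {4,3,13}, {4,7,11}, {5,6,9},  {5,7,8},  {7,6,5}
    };
    int i, j;
    for (i = 0; i < NV; i++)
        for (j = 0; j < NV; j++)
            g[i][j] = 0;
    for (i = 0; i < 12; i++) {
        g[e[i][0]][e[i][1]] = e[i][2];
        g[e[i][1]][e[i][0]] = e[i][2];
    }
}

/* Prim's algorithm from vertex src. parent[v] receives the P column
   (-1 for the source); the return value is the total tree weight. */
int prim(int g[NV][NV], int src, int parent[NV])
{
    int known[NV], dist[NV];
    int i, u, v, total = 0;

    for (v = 0; v < NV; v++) {
        known[v] = 0;
        dist[v] = INF;
        parent[v] = -1;
    }
    dist[src] = 0;

    for (i = 0; i < NV; i++) {
        /* pick the unknown vertex with the smallest D value */
        u = -1;
        for (v = 0; v < NV; v++)
            if (!known[v] && (u == -1 || dist[v] < dist[u]))
                u = v;
        if (dist[u] == INF)
            break;              /* remaining vertices are unreachable */
        known[u] = 1;
        total += dist[u];

        /* lower D for every unknown neighbour that u reaches more cheaply */
        for (v = 0; v < NV; v++)
            if (g[u][v] != 0 && !known[v] && g[u][v] < dist[v]) {
                dist[v] = g[u][v];
                parent[v] = u;
            }
    }
    return total;
}
```

Run from vertex a (index 0) on the example graph, this sketch selects c, d, b, f, h, g, e in that order and returns a total weight of 68, with e and g both attached to h.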
5.3 Kruskal's Algorithm: - Kruskal's algorithm is used to construct the minimum
spanning tree. First all the edges of the graph are arranged in increasing
order (sorted order) of weight. Add the next smallest weight edge to the forest if it will not cause a
cycle. Stop the process after n-1 edges have been added. Otherwise repeat these
steps.
Kruskal's algorithm creates a forest of trees. Initially the forest consists of n
single node trees and no edges. At each step we add the smallest remaining edge so
that it joins two trees together, always checking whether a cycle would be formed. If the edge were
to form a cycle, it means that its two nodes are already part of a single
connected tree, so this edge need not be added.
[Figure: weighted graph on vertices a to h with edges a-b 34, a-c 12, b-d 7, b-g 44, c-d 15, c-e 20, d-e 13, d-f 10, e-h 11, f-g 9, f-h 8, g-h 5]
Step 1: - First create a forest for the above graph. Forest means nodes without any
edges.
a b
g
d f
e h
Step 2: - Arrange all the weighted values into increasing order (sorted order)
5 7 8 9 10 11 12 13 15 20 34 44
a b
g
d f
e h
Step 3: - Pick the smallest weight from the sorted weights that is 5, draw edge for g
and h.
5 7 8 9 10 11 12 13 15 20 34 44

a b
g
5
d f
e h
Step 4: - Remove 5 from the Queue of sorted weights. Next minimum is 7 so draw
edge for b and d. (no cycle)
7 8 9 10 11 12 13 15 20 34 44
a b
g
7
5
d f
e h
Step 5: - Remove 7 from the Queue of sorted weights. Next minimum is 8, draw
edge between h and f(no cycle)
8 9 10 11 12 13 15 20 34 44
a b
g
7
5
d f
c
8
e h
Step 6: - Remove 8 from the Queue of sorted weights. Next minimum is 9, the
edge between f and g, but drawing it creates the cycle f-g-h.
According to Kruskal's algorithm, there must be no cycle in the minimum
spanning tree, so this edge is rejected.
Remove 9 from the Queue of sorted weights. Next minimum is 10, so draw
edge between d and f (no cycle)
10 11 12 13 15 20 34 44
[Figure: the forest now contains the edges g-h (5), b-d (7), f-h (8) and d-f (10); the edge f-g (9) is rejected]
Step 7: - Remove 10 from Queue of sorted weights. Next minimum value is 11 , so

draw edge between e and h (no cycle)
11 12 13 15 20 34 44
a b
g
7
10 5
d f
c
8
e h
11
Step 8: - Remove 11 from Queue of sorted weights. Next minimum value is 12, so
draw edge between a and c (No cycle)
12 13 15 20 34 44
a b
g
7
12
10 5
d f
c
8
e h
11
Step 9: - Remove 12 from Queue of sorted weights. Next minimum value is 13, the
edge between d and e, but drawing it creates the cycle d-f-h-e.
According to Kruskal's algorithm, there must be no cycle in the minimum
spanning tree, so this edge is rejected.
Remove 13 from the Queue of sorted weights. Next minimum is 15, so draw
edge between c and d (no cycle)
15 20 34 44
[Figure: the final tree contains the edges g-h (5), b-d (7), f-h (8), d-f (10), e-h (11), a-c (12) and c-d (15); the edge d-e (13) is rejected]
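All the vertices are now connected by n-1 = 7 edges, so Kruskal's minimum
spanning tree is complete. Its total weight is 5+7+8+10+11+12+15 = 68.
The actions of Kruskal's algorithm are shown in the following table (the edges of weight
20, 34 and 44 are never examined because the tree is already complete; the table lists them as rejected)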
Edge Weight Action
(a,b) 34 Rejected
(a,c) 12 Connected
(b,g) 44 Rejected
(c,d) 15 Connected
(c,e) 20 Rejected
(d,b) 7 Connected
(d,f) 10 Connected
(e,d) 13 Rejected
(e,h) 11 Connected
(f,g) 9 Rejected
(f,h) 8 Connected
(h,g) 5 Connected
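The accept/reject decisions in the table can be reproduced with a simple forest-of-trees check. The sketch below is one possible way to do it, not the only one: it assumes vertices a to h are numbered 0 to 7, sorts the edges with the standard library qsort, and detects a cycle by comparing tree roots. The names edge, find_root, by_weight and kruskal are hypothetical.

```c
#include <stdlib.h>

#define NV 8   /* vertices a..h are numbered 0..7 (assumption) */
#define NE 12  /* number of edges in the example graph */

struct edge { int u, v, w; };

static int parent[NV];  /* parent[x] == x marks the root of a tree */

/* Follow parent links to the root of the tree containing x. */
static int find_root(int x)
{
    while (parent[x] != x)
        x = parent[x];
    return x;
}

/* qsort comparator: ascending edge weight. */
static int by_weight(const void *a, const void *b)
{
    const struct edge *x = a, *y = b;
    return (x->w > y->w) - (x->w < y->w);
}

/* Sorts e[] by weight, writes the accepted edges to tree[] (room for
   NV-1 entries) and returns the total weight of the accepted edges. */
int kruskal(struct edge e[], int ne, struct edge tree[])
{
    int i, taken = 0, total = 0;

    for (i = 0; i < NV; i++)
        parent[i] = i;          /* forest of single-node trees */
    qsort(e, ne, sizeof e[0], by_weight);

    for (i = 0; i < ne && taken < NV - 1; i++) {
        int ru = find_root(e[i].u);
        int rv = find_root(e[i].v);
        if (ru == rv)
            continue;           /* same tree: the edge would close a cycle */
        parent[ru] = rv;        /* join the two trees */
        tree[taken++] = e[i];
        total += e[i].w;
    }
    return total;
}
```

Given the twelve edges of the example graph, the accepted weights come out in the order 5, 7, 8, 10, 11, 12, 15, for a total of 68; the edges of weight 9 and 13 are skipped because both endpoints already share a root.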

5.4 Dijkstra’s Algorithm (Shortest Path Algorithm):
The distance of a vertex v from a source vertex a is the length of a shortest path
between a and v. Dijkstra’s algorithm computes the distances of all the vertices in a
graph from a given vertex a. The assumptions made here are that the graph is
connected, the edges are undirected and the edge weights are non-negative. The
algorithm begins with ‘a’ and has covered all the vertices when it completes. For each
vertex v it stores a label d(v) representing the distance of v from a in the sub graph
consisting of the vertices selected so far and their adjacent vertices.
At each step we find the unselected vertex with the smallest distance, select it, and update the labels
of the vertices adjacent to it. Dijkstra’s algorithm therefore needs a
priority queue that stores the vertices outside the selected set.
[Figure: weighted graph on vertices a to h with edges a-b 34, a-c 12, b-d 7, b-g 44, c-d 15, c-e 20, d-e 13, d-f 10, e-h 11, f-g 9, f-h 8, g-h 5]

All the vertices are initialized with known value 0, distance value ∞ and path value
0; the source vertex a has distance 0. The table is as follows when a is considered as the source vertex.

V(Vertex)  K(Known)  D(Distance)  P(Path)
a          0         0            0
b          0         ∞            0
c          0         ∞            0
d          0         ∞            0
e          0         ∞            0
f          0         ∞            0
g          0         ∞            0
h          0         ∞            0
Step 1: - Select vertex a: set the known value of a to 1, keep its distance 0 and
path 0, and fill the entries of the vertices
which are adjacent to vertex a. Here b and c are adjacent to vertex a. The distance between a
and b is 34, so fill the b row with known value 0, distance 34 and path a.
Similarly fill row c with known value 0, distance 12 and path a.
Step 2: - Of the two available distances, 12 is the smallest, so c becomes the next
selected node. Vertex c has two edges, c -> d and c -> e, with weights 15 and
20 respectively. Each weight is added to the distance of the selected path,
which is 12, so the matrix is updated with distance 12+15 = 27 in row d and
12+20 = 32 in row e, both with path c.

Step 3:
Step 4:

Step 5:
Step 6:

Step 7:
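Since the remaining steps are not written out, the procedure can be sketched in code using the same K, D and P columns. This is an illustrative sketch under stated assumptions: vertices a to h are numbered 0 to 7, the graph is a weight matrix with 0 meaning "no edge", and a linear scan stands in for the priority queue. The names load_example and dijkstra are hypothetical.

```c
#include <limits.h>

#define NV 8          /* vertices a..h are numbered 0..7 (assumption) */
#define INF INT_MAX

/* Fill g with the example graph's edge weights (0 means no edge). */
void load_example(int g[NV][NV])
{
    static const int e[12][3] = {
        {0,1,34}, {0,2,12}, {1,6,44}, {2,3,15}, {2,4,20}, {3,1,7},
        {3,5,10}, {4,3,13}, {4,7,11}, {5,6,9},  {5,7,8},  {7,6,5}
    };
    int i, j;
    for (i = 0; i < NV; i++)
        for (j = 0; j < NV; j++)
            g[i][j] = 0;
    for (i = 0; i < 12; i++) {
        g[e[i][0]][e[i][1]] = e[i][2];
        g[e[i][1]][e[i][0]] = e[i][2];
    }
}

/* Shortest distances from src: dist[] is the D column and path[] the
   P column (-1 where there is no predecessor). */
void dijkstra(int g[NV][NV], int src, int dist[NV], int path[NV])
{
    int known[NV];
    int i, u, v;

    for (v = 0; v < NV; v++) {
        known[v] = 0;
        dist[v] = INF;
        path[v] = -1;
    }
    dist[src] = 0;

    for (i = 0; i < NV; i++) {
        /* select the unknown vertex with the smallest distance */
        u = -1;
        for (v = 0; v < NV; v++)
            if (!known[v] && (u == -1 || dist[v] < dist[u]))
                u = v;
        if (dist[u] == INF)
            break;              /* remaining vertices are unreachable */
        known[u] = 1;

        /* relax every edge leaving u */
        for (v = 0; v < NV; v++)
            if (g[u][v] != 0 && !known[v] && dist[u] + g[u][v] < dist[v]) {
                dist[v] = dist[u] + g[u][v];
                path[v] = u;
            }
    }
}
```

From source a the sketch yields the distances a 0, b 34, c 12, d 27, e 32, f 37, g 46 and h 43. The only difference from the Prim's sketch is the update rule: the new label is the distance of the selected vertex plus the edge weight, not the edge weight alone.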

CHAPTER
6
SORTING
METHODS
Syllabus:
Order Statistics: Lower Bound on Complexity for Sorting Methods: Lower Bound on
Worst Case Complexity, Lower Bound on Average Case Complexity, Heap Sort,
Quick Sort, Radix Sorting, Merge Sort.

6.1 Quick Sort:- This method is also called partition exchange sort. Quick
sort was developed by C.A.R. Hoare and it has the best average behavior among all
the sorting methods. It is an efficient internal sorting algorithm which is
based on the DIVIDE-AND-CONQUER approach. According to this, the problem of sorting a
set of elements is reduced to the problem of sorting two smaller sets.
In quick sort, we divide the array of items to be sorted into two partitions and
then call the quick sort procedure recursively to sort the two partitions. To partition
the data elements, a pivot element is selected such that all the items in the
lower part are less than the pivot and all those in the upper part are greater than it.
Consider an array A of N elements to be sorted. Select a pivot element among
the N elements. The selection of the pivot element is somewhat arbitrary; the first
element is a convenient one.
The array is divided into two partitions so that the pivot (or partitioning)
element is placed into its proper position satisfying the following properties:
1. All elements to the left of the pivot are less than the pivot element.
2. All elements to the right of the pivot are greater than or equal to the pivot
element.
In this method the elements to be sorted are divided into three groups A, B, C. B
contains exactly one element (the pivot). No element in group A is larger than B and
no element in group C is smaller than B. As a result the elements of A and the elements of C can be sorted
independently.
The algorithm for Quick sort is as shown below:
Procedure Quick sort(a,n)
Begin
1. Select a splitting value B (the pivot)
2. Partition the remaining elements into two sub lists A and C
3. Quick Sort A
4. Quick Sort C
5. The answer is A, followed by B, followed by C
End
Example
i<j condition true then swap i position element and j position element
i<j condition true then swap i position element and j position element
i < j condition false then terminate the loop then swap pivot element and j
position element

Ex 1: Trace the Quick sort algorithm for the following list of numbers:
90, 77, 60, 99, 55, 88, 62
Ex 2: Sort the following list of elements using quick sort
50, 30, 10, 90, 80, 20, 40, 70
Program: Quick Sort
#include<stdio.h>
void qsort(int[],int,int);
void main()
{
int n,i,a[10];
printf("enter n value ");
scanf("%d",&n);
printf("enter the elements ");
for(i=0;i<n;i++)
scanf("%d",&a[i]);
qsort(a,0,n-1);
printf(" output sorted list is \n");
for(i=0;i<n;i++)
printf("%3d",a[i]);
}
void qsort(int a[10],int left,int right)
{
int p,i,j,t;
if(left<right)
{
p=a[left];
i=left;
j=right;
do
{
do
{
i++;
}while(i<=right&&a[i]<=p); /* test the bound first so a[right+1] is never read */
while(a[j]>p&&j>left)

{
j--;
}
if(i<j)
{
t=a[i];
a[i]=a[j];
a[j]=t;
}
}while(i<j);
t=a[j];
a[j]=a[left];
a[left]=t;
qsort(a,left,j-1);
qsort(a,j+1,right);
}
else
return;
}
Input: enter n value 10
enter the elements
5 7 2 10 4 9 6 3 8 1
Output:
1 2 3 4 5 6 7 8 9 10
Time Complexity: -
a) Worst Case: The worst case occurs when the list of elements is already sorted.
Then the first element requires ‘n’ comparisons to recognize that it remains in the
first position.
The list is then sub divided into two sub lists: the first sub list gets
no elements and the second sub list gets n-1 elements. Accordingly the second
element requires n-1 comparisons to recognize that it remains in the second
position, and so on.
The time complexity function f(n) in the worst case is
f(n) = n+(n-1)+(n-2)+...+1
= n(n+1)/2 = O(n^2)
b) Average Case: On average each reduction step produces two sub lists of
roughly equal size.
Step 1: Reduces the initial list, places one element in its correct position and
produces two sub lists.
Step 2: Reduces the two sub lists, places two elements and produces four sub lists.
In general the reduction step at the kth level finds the location of 2^(k-1) elements,
so there are about log2 n levels. Each level needs at most n comparisons, hence
f(n) = O(n log n)

6.2 Merge Sort: Merge sort is based on the divide-and-conquer approach.
Merge sort splits the list to be sorted into two equal halves and places them in two
separate arrays. Each array is recursively sorted, and then the two are merged back together to
form the final sorted list. An elementary implementation of merge sort is based on
three arrays: two arrays are used for the two sub lists and a third array for the combined
list. Merge sort is most naturally written with recursive
procedures.
There are three important steps in merge sort. They
are
Divide step: If the given array A has zero or one element then there is no need to sort; the
elements are already in sorted order. Otherwise divide the array into two arrays,
each containing half of the elements of the given array.
Recursion step: Recursively sort the elements of each half array.
Conquer Step: Finally combine the elements back into the array by merging the sorted
sub lists (half arrays) into one sorted sequence.
Example:
The following diagram shows merge step as combine technique
In the above diagram we have seen process of merge sort but execution of merge
sort not done as said above. Because recursive calls made for each sub list, observe
the following diagrams shows actual process of merge sort call.
Advantages and Disadvantages: -
Merge sort is slightly faster than heap sort for large sets. It requires twice
the memory of heap sort because of the second array. This additional memory
requirement makes it unattractive for most purposes. In those cases quick sort
is a better choice most of the time, and heap sort is a better choice for very large
sets.
Ex 1: Sort the following elements using merge sort.
10, 5, 7, 6, 1, 4, 8, 3, 2
Program: Merge Sort
#include<stdio.h>
#include<conio.h>
int a[10],n;
void getdata();
void display();
void msort(int,int);
void merge(int,int);
void main()
{
clrscr();
getdata();
display();
msort(0,n-1);
display();
}
void getdata()
{
int i;
printf("enter the n value");
scanf("%d",&n);
printf("enter the array elements");
for(i=0;i<n;i++)
scanf("%d",&a[i]);
}
void display()
{
int i;
printf("\n the array elements \n");
for(i=0;i<n;i++)
printf("%3d",a[i]);
}
void msort(int left,int right)
{
int mid;
if(left<right)
{
mid=(left+right)/2;
msort(left,mid);
msort(mid+1,right);
merge(left,right);

}
else
return;
}
void merge(int first,int last)
{
int b[10],mid,i1,i2,i3,k,k1;
mid=(first+last)/2;
i1=first;
i2=mid+1;
i3=0;
while(i1<=mid&&i2<=last)
{
if(a[i1]<a[i2])
b[i3++]=a[i1++];
else
b[i3++]=a[i2++];
}
for(;i1<=mid;)
b[i3++]=a[i1++];
for(;i2<=last;)
b[i3++]=a[i2++];
for(k=first,k1=0;k<=last;k++,k1++)
a[k]=b[k1];
}
Input:
5 7 2 10 4 9 6 3 8 1
Output:
1 2 3 4 5 6 7 8 9 10
Time Complexity: -
The time complexity of merge sort is O(n log n). This follows from its recursive
nature: the list is halved at each level, so there are about log2 n levels of
recursion, and the merging done at each level takes O(n) steps in total. Hence
the result is O(n log n).
6.3 Bucket (or) Radix sort:-

In this method sorting is done digit by digit until all the elements
are sorted.
Bucket sort, or radix sort, is a method that can be used to sort a list of names
alphabetically. Here the base or radix is 26 (the 26 letters of the alphabet).
First of all the list of names is sorted according to the first letter of each name; thus
the names are arranged in 26 buckets.
In the second pass, the names are arranged according to the second letter of each name,
and so on. The number of passes depends on the length of the longest name.
Suppose no name contains more than 15 letters; then the names are alphabetized in
at most 15 passes.
To sort decimal numbers, where the radix or base is 10, we need ten buckets. These
buckets are numbered 0,1,2,3,4,5,6,7,8,9. Unlike names, decimal numbers
are sorted from right to left, starting with the units digit.
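Input list: 121, 70, 965, 432, 12, 577, 683
a) Pass-I (units digit)
Buckets: 0: 70; 1: 121; 2: 432, 12; 3: 683; 5: 965; 7: 577 (buckets 4, 6, 8 and 9 are empty)
List after Pass-I: 70, 121, 432, 12, 683, 965, 577
b) Pass-II (tens digit)
Buckets: 1: 12; 2: 121; 3: 432; 6: 965; 7: 70, 577; 8: 683 (buckets 0, 4, 5 and 9 are empty)
List after Pass-II: 12, 121, 432, 965, 70, 577, 683
c) Pass-III (hundreds digit)
Buckets: 0: 12, 70; 1: 121; 4: 432; 5: 577; 6: 683; 9: 965 (buckets 2, 3, 7 and 8 are empty)

After Pass-III, when the numbers are collected, they are in ascending order.
12, 70, 121, 432, 577, 683, 965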
Algorithm:
Let a be an array with n elements in memory
Step 1: Find the largest element of the array.
Step 2: Find the total no. of digits num in the largest element
Step 3: Repeat steps 4,5 for pass=1 to num
Step 4: Initialize the buckets
For i=0 to (n-1)
Set digit=digit number pass (counted from the right) of a[i]
Put a[i] in bucket number digit
[ end for loop ]
Step 5: Collect all the numbers from the buckets in order back into a
Step 6: exit
Program: Radix Sort
#include<stdio.h>
#include<conio.h>
int a[10],n,bucket[10][10],b[10];
void getdata();
void display();
void rsort();
void main()
{
clrscr();
getdata();
display();
rsort();
display();
}
void getdata()
{
int i;
printf("enter the n value");
scanf("%d",&n);
printf("enter the array elements");
for(i=0;i<n;i++)
scanf("%d",&a[i]);
}
void display()
{
int i;
printf("\n the array elements \n");

for(i=0;i<n;i++)
printf("%8d",a[i]);
}
void rsort()
{
int l,i,num,div,k,j,p,x;
l=a[0];
for(i=1;i<n;i++)
{
if(a[i]>l)
l=a[i];
}
num=0;
while(l!=0)
{
num++;
l=l/10;
}
div=1;
for(i=0;i<num;i++)
{
for(k=0;k<10;k++)
b[k]=0;
for(j=0;j<n;j++)
{
p=(a[j]/div)%10;
bucket[p][b[p]]=a[j];
b[p]=b[p]+1;
}
x=0;
for(j=0;j<10;j++)
{
for(k=0;k<b[j];k++)
a[x++]=bucket[j][k];
}
div=div*10;
}
}

Input:
5 7 2 10 4 9 6 3 8 1
Output:
1 2 3 4 5 6 7 8 9 10





Program: Heap Sort
#include<stdio.h>
void read(int*,int);
void display(int*,int);
void heapsort(int*,int);
void formheap(int*,int,int);
void swap(int*,int*);
void main()
{
int arr[10],n;
printf("enter the n value ");
scanf("%d",&n);
printf("enter the array elements \n");
read(arr,n-1);
heapsort(arr,n-1);
printf("the sorted array elements \n");
display(arr,n-1);
}
void read(int *a,int n)
{
int i;
for(i=0;i<=n;i++)

scanf("%d",a+i);
}
void display(int *a,int n)
{
int i;
for(i=0;i<=n;i++)
printf("%3d",*(a+i));
}
void heapsort(int *a,int n)
{
int i;
for(i=(n-1)/2;i>=0;i--)
formheap(a,i,n);
for(i=n;i>0;i--)
{
swap(a,a+i);
display(a,n);
printf("\n");
formheap(a,0,i-1);
}
}
void formheap(int *a,int i,int n)
{
int j;
j=(2*i)+1;
while(j<=n)
{
if(j+1<=n&&*(a+j+1)>*(a+(j-1)/2))
if(*(a+j+1)>*(a+j))
j++;
if(*(a+j)>*(a+(j-1)/2))
swap(a+j,a+(j-1)/2);
j=(2*j)+1;
}
}
void swap(int *c,int *d)
{
int t;
t=*c;
*c=*d;
*d=t;
}

Time Complexities of Sorting Algorithms:
Sorting Technique   Average    Best       Worst
Bubble Sort         n^2        n^2        n^2
Selection Sort      n^2        n^2        n^2
Insertion Sort      n^2        n          n^2
Radix Sort          d*n        d*n        d*n     (d = number of digits in the largest element)
Merge Sort          n log n    n log n    n log n
Quick Sort          n log n    n log n    n^2
Among these, quick sort is usually the fastest in practice for internal sorting, although its worst case is n^2.

CHAPTER
7
PATTERN
MATCHING &
TRIES
Syllabus:
Pattern matching algorithms- the Boyer –Moore algorithm, the Knuth-Morris-Pratt
algorithm Tries: Definitions and concepts of digital search tree, Binary trie, Patricia
, Multi-way trie

String Operations: -
A string is a sequence of characters. String objects are useful for handling data
items. Examples of strings are
• A C++ program
• An HTML document containing tags
• A DNA sequence, useful in bioinformatics
• A digitized image, useful in image processing
Definition:
An alphabet Σ is the set of possible characters for a set of strings. Some examples
for alphabets are
ASCII characters
Binary Alphabet {0,1}
Hexa-decimal Alphabet {0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F}
DNA Alphabet {A,C,G,T,a,c,g,t}
RNA Alphabet {A,C,G,U,a,c,g,u}
Basic Operations on Strings:
Let S be a string of size n,
Substring: A substring S[i..j] of S is the part of S consisting of the
characters with ranks between i and j. For example, if “DATASTRUCTURES” is a string, then
“ATA”, “STRUCT” and “TURE” are all substrings of it.
Prefix: A prefix of S is a substring of the type S[0..i] where i<n. For example,
“DATAS” is a prefix of “DATASTRUCTURES”.
Suffix: A suffix of S is a substring of the type S[i..n-1] where i>=0. For example,
“TURES” is a suffix of “DATASTRUCTURES”.
Pattern Match: Let T be the text and P the pattern. The pattern matching problem
consists of finding a substring of the text equal to the pattern. For example, the pattern “STRUCT” is
found in “DATASTRUCTURES” starting at index 4 (the 5th character).
Applications of Strings:
The following areas are some of major applications for strings.
Text Editors
Search Engines
Biological research (for example, searching DNA sequences)
Pattern Matching Algorithms:

They are
Brute force Algorithm
Boyer-Moore Algorithm
Knuth-Morris-Pratt Algorithm
Brute-Force Algorithm: -
The brute-force algorithmic design pattern is a simple and general technique for
algorithm design: try every possible candidate and check each one. It needs no
clever preparation, but it is usually not the most efficient solution.
For pattern matching, the brute force algorithm checks, at every position in the
text between 0 and n-m, whether an occurrence of the pattern starts there or not.
After each attempt, it shifts the pattern by exactly one position to the right. The brute
force algorithm requires no preprocessing phase, and only constant extra space in
addition to the pattern and the text. During the searching phase the text character
comparisons can be done in any order. The time complexity of this searching
phase is O(mn). The expected number of text character comparisons is 2n.
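Shift Process in Brute Force:

Consider the string ABABBABABABA and the search pattern ABABA. The
searching in brute-force is done as follows
Text:    ABABBABABABA
Step 1:  ABABA          mismatch at the 5th pattern character
Step 2:   ABABA         mismatch at the 1st pattern character
Step 3:    ABABA        mismatch at the 3rd pattern character
Step 4:     ABABA       mismatch at the 1st pattern character
Step 5:      ABABA      mismatch at the 1st pattern character
Step 6:       ABABA     all characters match
Pattern matched, so return the index 5 (the shift used in Step 6)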
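The shift process above can be sketched in C++ as follows. This is only a minimal illustration under assumptions: the function name bruteForceMatch is hypothetical (it does not come from these notes), and the function reports the 0-based index of the first match, or -1 when the pattern does not occur.

```cpp
#include <string>

// Brute-force pattern matching: try every shift s from 0 to n-m and
// compare the pattern with the text one character at a time.
// Returns the index of the first occurrence, or -1 if there is none.
// (bruteForceMatch is a hypothetical name used only for this sketch.)
int bruteForceMatch(const std::string& text, const std::string& pattern)
{
    const int n = static_cast<int>(text.size());
    const int m = static_cast<int>(pattern.size());
    for (int s = 0; s + m <= n; ++s) {          // one shift per attempt
        int j = 0;
        while (j < m && text[s + j] == pattern[j])
            ++j;                                // extend the partial match
        if (j == m)
            return s;                           // whole pattern matched at shift s
    }
    return -1;                                  // no shift produced a match
}
```

Calling bruteForceMatch("ABABBABABABA", "ABABA") tries the six shifts 0 to 5 and returns 5.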
Boyer-Moore Algorithm:-
The Boyer-Moore algorithm was invented by Robert S. Boyer and J Strother Moore. It
is considered to be one of the most efficient string matching algorithms. One of
the applications that makes use of this algorithm is the Find and Replace command
of text editors. The Boyer-Moore algorithm scans the characters of the pattern from
right to left, beginning with the rightmost character. In case of a mismatch (or a
complete match of the whole pattern) it uses two precomputed functions to shift the
search window to the right. These two shift functions are called the good-suffix
shift and the bad-character shift. The good-suffix shift is also called the matching
shift and the bad-character shift is also called the occurrence shift.
The version of Boyer-Moore pattern matching described here is based on two
heuristics. One is the looking-glass heuristic and the second is the character-jump
heuristic.
Looking-Glass Heuristic: Compare Ptrn with a substring of String moving
backwards, that is, from the last character of Ptrn to the first.
Character-Jump Heuristic: When a mismatch occurs at String[i]=c, then if Ptrn
contains c, shift Ptrn to align the last occurrence of c in Ptrn with String[i]; else
shift Ptrn to align Ptrn[0] with String[i+1].
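Shift Operation: -
There are two cases for the shift operation. Here j is the index in Ptrn at which the
mismatch occurred and Last(c) is the index of the last occurrence of the mismatched
text character c in Ptrn (-1 if c does not occur in Ptrn).
Shift Case 1: if 1+Last(c)<=j then shift the pattern by j-Last(c) units

Shift Case 2: if 1+Last(c)>j then shift the pattern by one unit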
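The two shift cases can be combined into one update of the text index, as in the sketch below. It is a simplified Boyer-Moore that uses only the character-jump (last occurrence) heuristic, not the good-suffix shift. The names buildLast and boyerMooreMatch are assumptions made for this illustration, and characters are assumed to be single bytes.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// last[c] = index of the last occurrence of character c in the pattern,
// or -1 if c does not occur. (buildLast is a hypothetical helper name.)
std::vector<int> buildLast(const std::string& pattern)
{
    std::vector<int> last(256, -1);
    for (int k = 0; k < static_cast<int>(pattern.size()); ++k)
        last[static_cast<unsigned char>(pattern[k])] = k;
    return last;
}

// Simplified Boyer-Moore: looking-glass comparison plus character jump.
// Returns the index of the first match, or -1 if there is none.
int boyerMooreMatch(const std::string& text, const std::string& pattern)
{
    const int n = static_cast<int>(text.size());
    const int m = static_cast<int>(pattern.size());
    if (m == 0) return 0;
    if (m > n) return -1;
    const std::vector<int> last = buildLast(pattern);
    int i = m - 1;                       // current index in the text
    int j = m - 1;                       // current index in the pattern
    while (i < n) {
        if (text[i] == pattern[j]) {
            if (j == 0) return i;        // matched the whole pattern
            --i;                         // keep comparing right to left
            --j;
        } else {
            const int l = last[static_cast<unsigned char>(text[i])];
            // Case 1 (1+l <= j): the pattern moves by j-l.
            // Case 2 (1+l >  j): the pattern moves by one unit.
            i += m - std::min(j, 1 + l);
            j = m - 1;                   // restart at the pattern's right end
        }
    }
    return -1;
}
```

For the search string ABACAABADCABACABAABB and Ptrn = ABACAB, this sketch returns 10.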
General Application : Example

Search String is “ABACAABADCABACABAABB”
Consider Σ = {A, B, C, D} and Ptrn = ABACAB
The last-occurrence function Last(c) for Ptrn is as follows
c        A  B  C  D
Last(c)  4  5  3  -1


Examples:
1. Consider a text T= XYXZXXYXTZXYXZXXYXXYXXYY
To match against the pattern P = XYXZXY
Knuth, Morris and Pratt (KMP) Algorithm: -

The Knuth-Morris-Pratt (KMP) algorithm compares the pattern to the text from left
to right, but shifts the pattern more intelligently than the previous two algorithms.
The worst case performance of the Brute Force and Boyer-Moore algorithms is improved
in the KMP algorithm. KMP has a running time of O(n+m), which is optimal in the
worst case. The main theme of the algorithm is the shifting operation in the case of
a mismatch. KMP makes use of a failure function F for the proper shift operation:
F(j) is the length of the largest prefix of P[0..j] that is also a suffix of P[1..j].
When a mismatch occurs at P[j] with j>0, the comparison resumes at P[F(j-1)] with the
same text character, so redundant comparisons are avoided.
The KMP algorithm has features similar to the finite automaton matching
model. The computation of the prefix (failure) function needs O(m) time, while the
computation of the automaton's transition function needs O(m|Σ|) time. Amortized
analysis shows that the KMP matcher is, up to a constant factor, as fast as the
finite-automaton matcher. This gives a running time of O(m+n) for KMP.
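The failure function and the matching loop can be sketched as below. This is a minimal illustration assuming 0-based indexing; computeFailure and kmpMatch are hypothetical names chosen for the example, not names from these notes.

```cpp
#include <string>
#include <vector>

// fail[j] = length of the longest prefix of p that is also a suffix of p[1..j].
// (computeFailure is a hypothetical name for this sketch.)
std::vector<int> computeFailure(const std::string& p)
{
    const int m = static_cast<int>(p.size());
    std::vector<int> fail(m, 0);
    int i = 1;                           // position whose value is being computed
    int j = 0;                           // length of the current matched prefix
    while (i < m) {
        if (p[i] == p[j]) {
            fail[i] = j + 1;             // prefix of length j+1 ends at position i
            ++i;
            ++j;
        } else if (j > 0) {
            j = fail[j - 1];             // fall back to a shorter prefix
        } else {
            fail[i] = 0;                 // no prefix ends at position i
            ++i;
        }
    }
    return fail;
}

// Returns the index of the first occurrence of p in t, or -1 if none.
int kmpMatch(const std::string& t, const std::string& p)
{
    const int n = static_cast<int>(t.size());
    const int m = static_cast<int>(p.size());
    if (m == 0) return 0;
    const std::vector<int> fail = computeFailure(p);
    int i = 0;                           // index in the text (never moves back)
    int j = 0;                           // index in the pattern
    while (i < n) {
        if (t[i] == p[j]) {
            if (j == m - 1) return i - m + 1;   // full match found
            ++i;
            ++j;
        } else if (j > 0) {
            j = fail[j - 1];             // shift the pattern using the failure function
        } else {
            ++i;                         // mismatch at p[0]: move on in the text
        }
    }
    return -1;
}
```

For the pattern ABACAB the failure function is 0 0 1 0 1 2, and kmpMatch finds it at index 10 of ABACAABADCABACABAABB. The text index i only moves forward, which is where the O(n+m) bound comes from.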


Trie: - A trie is a tree-based data structure for storing strings in order to search
patterns quickly. A trie is an indexed search tree. Tries can be used to perform
prefix queries for information retrieval. These queries search for the longest prefix of
a given string that matches a prefix of some string in the trie. The name trie comes
from the word retrieval, because data stored in a trie structure can be retrieved
quickly.
Consider a set of strings S = {a, an, and, ant, any, at}. These words are
represented in a trie diagram as follows.

Advantages of tries:
1. In tries the keys are searched using common prefixes, so the lookup time depends
on the length of the key. In a binary search tree the lookup of keys depends upon the
height of the tree.
2. Tries take less space when they contain a large number of short strings, as
nodes are shared between the keys.
3. Tries help with longest prefix matching, when we want to find the key that shares
the longest prefix with a given string.
Types of Tries:
Standard Tries      Single character per node
Compressed Tries    Eliminating chains of nodes
Compact Tries       Stores indices into the strings
Suffix Tries        Stores all suffixes of a string
Standard Tries: -
A standard trie representing a set of strings S is an ordered tree that is based on
the following rules.
1. All nodes except the root node are labeled with a character.
2. The children of a node are ordered alphabetically.
3. Combining the characters along the path from the root to an external node yields a
string in the set of strings S.
4. All strings in the set S are encoded within the standard trie.
Consider the set of strings S={bat,ball,bear,bell,bid,bull,buy,may,me,men}

Performance of Standard Tries: - A standard trie uses O(n) space, where n is the total
size of the strings in S. Searching, insertion and deletion take O(dm) time, where d
stands for the size of the alphabet and m is the size of the string used in the
operation. The space is O(n) because strings with a common prefix share the nodes of
that prefix. For a fixed alphabet, a standard trie can be constructed in O(n) time.
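A standard trie with insertion and search can be sketched as follows. This is an illustration under assumptions, not the notes' own program: the names TrieNode, Trie, insert, contains and hasPrefix are hypothetical, children are kept in a std::map so that they stay in alphabetical order, and an isWord flag marks nodes where a stored string ends (needed for sets such as {a, an, and}, where one string is a prefix of another).

```cpp
#include <map>
#include <memory>
#include <string>

// One node per character; the root carries no character.
struct TrieNode {
    bool isWord = false;                                   // a stored string ends here
    std::map<char, std::unique_ptr<TrieNode>> children;    // ordered alphabetically
};

class Trie {
public:
    // Walk down from the root, creating missing nodes: O(m) steps for |word| = m.
    void insert(const std::string& word)
    {
        TrieNode* node = &root_;
        for (char c : word) {
            std::unique_ptr<TrieNode>& child = node->children[c];
            if (!child)
                child = std::make_unique<TrieNode>();
            node = child.get();
        }
        node->isWord = true;
    }

    // True only if the exact word was inserted.
    bool contains(const std::string& word) const
    {
        const TrieNode* node = walk(word);
        return node != nullptr && node->isWord;
    }

    // True if some stored string starts with the given prefix.
    bool hasPrefix(const std::string& prefix) const
    {
        return walk(prefix) != nullptr;
    }

private:
    // Follow the characters of s; return the node reached or nullptr.
    const TrieNode* walk(const std::string& s) const
    {
        const TrieNode* node = &root_;
        for (char c : s) {
            auto it = node->children.find(c);
            if (it == node->children.end())
                return nullptr;
            node = it->second.get();
        }
        return node;
    }

    TrieNode root_;
};
```

After inserting a, an, and, ant, any and at, contains("ant") is true, contains("am") is false and hasPrefix("an") is true. Using std::map makes each step cost O(log d) rather than the O(d) scan assumed in the analysis above.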
Compressed Tries: - A compressed trie is like a standard trie, but each internal node
has a degree of at least 2. Chains of single-child nodes are compressed into a single
edge. This ensures that each internal node in the trie has at least two children. A
compressed trie is obtained from a standard trie by compressing chains of redundant
nodes. A critical node is a node v such that v is labeled with a string from S, v has
at least 2 children, or v is the root.
Each internal node in a compressed trie has at least two children and each
external node is associated with a string. The compression reduces the number of
nodes in the trie from O(m), where m is the sum of the lengths of the strings in set S,
to O(n), where n is the number of strings in S.
The standard trie shown above has height 6. After compression, the height of the
trie is 4.

Compact Tries: -
Compact tries are the compact representation of compressed tries. For an array of
strings S, that is S[0], S[1],… S[s-1], each node stores a range of indices instead of
a substring. Each node label X is represented as a triplet of integers (i,j,k) such
that X=S[i][j..k].
First consider all strings in S
S = {SELL, STOCK, BALL, BEARD, BELL, BULL, ROCK, STOP}
All the above strings are referred to by index as shown in the following table
Triplet representation is shown in the following diagram

(i j k)
i indicates the index of the string in S
j indicates the starting character position in the ith string of S
k indicates the ending character position in the ith string of S
For example, (6,1,2) refers to the 6th string, from index 1 (the 2nd character) to
index 2 (the 3rd character). According to the above array table this results
in ID. The following diagram shows the compact representation for the given trie.
It uses O(s) space, where s is the number of strings in the array. The compact version
serves as an auxiliary index structure.
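Decoding a triplet back into its substring takes one line, as the sketch below shows. The names Triplet and decodeLabel are hypothetical. The array in the usage note is also an assumption made for illustration (the array table of these notes is not reproduced here); it is chosen so that S[6] is BID.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// A node label in a compact trie: the substring S[i][j..k].
struct Triplet {
    int i;   // index of the string in the array S
    int j;   // starting character position (0-based)
    int k;   // ending character position (0-based, inclusive)
};

// Recover the label that a triplet stands for.
// (decodeLabel is a hypothetical name used only for this sketch.)
std::string decodeLabel(const std::vector<std::string>& S, const Triplet& t)
{
    return S[static_cast<std::size_t>(t.i)].substr(
        static_cast<std::size_t>(t.j),
        static_cast<std::size_t>(t.k - t.j + 1));
}
```

With the assumed array {SEE, BEAR, SELL, STOCK, BULL, BUY, BID, HEAR, BELL, STOP}, decodeLabel(S, {6,1,2}) gives ID. Each node then stores three integers regardless of how long its label is.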

SUFFIX TRIES: - A suffix trie is a trie of all suffixes of a single string. Suffix tries
are used in pattern matching, because every substring is the prefix of some suffix.
This changes the linear search for the beginning of the pattern, as in KMP, into a
tree search. With the compact representation the space used is O(n) instead of
O(n^2), because the characters only need to be stored once, in the string itself.
Consider the string MINIMIZE. It has eight suffixes:
MINIMIZE, INIMIZE, NIMIZE, IMIZE, MIZE, IZE, ZE, E
In the above list, the suffixes that begin with I (INIMIZE, IMIZE, IZE) and with MI
(MINIMIZE, MIZE) share those prefixes, so I and MI become internal nodes with
branches below them. The remaining suffixes (NIMIZE, ZE, E) act as single nodes
directly under the root. The following diagram represents the suffix trie for the
above suffixes.


CHAPTER
8
FILE STRUCTURES
Syllabus:
Fundamental File Processing Operations: opening files, closing files, Reading and
Writing file contents, Special characters in files.
Fundamental File Structure Concepts: Field and record organization, Managing
fixed-length, fixed-field buffers.

8.1 STREAMS: A stream is a flow of data (a sequence of bytes). The data can flow
in two directions, either in or out.
[Figure: input sources (keyboard, mouse, network, memory) feed the program, and the
program feeds output destinations (monitor, printer, network, memory)]
The source stream that provides data to the program is called the input stream;
the destination stream that receives output from the program is called the output
stream. A program extracts bytes from an input stream and inserts bytes into an
output stream. The data in the input stream can come from the keyboard or any other
storage device. The data in the output stream can go to the screen or any other
storage device. A stream acts as an interface between the program and the
input/output device.
[Figure: input device -> input stream (>>) -> program -> output stream (<<) -> output
device]
Streams are divided into 3 types: Console Streams, File Streams and String Streams.
Console streams are used for reading data from the standard input device (the
keyboard) and displaying data on the standard output device, that is the
monitor. File input and output streams are used for handling input and output
operations to and from files. String input and output streams are used for handling
string operations. In C++, a stream is represented by an object, e.g. cin and cout.
C++ contains several pre-defined I/O streams that are automatically opened
before a program begins its execution. There are four standard I/O streams, these
are cin, cout, cerr, clog
cin is an input stream that is connected to the standard input device (the keyboard)
cout is an output stream that is connected to the standard output device (the screen)
cerr is a standard unbuffered output stream that provides error messages to
the standard error device.
clog is a standard buffered output stream that also provides error messages
to the standard error device.
These objects are declared in the header file iostream.h (named iostream in
standard C++).
8.2 HIERARCHY OF CONSOLE STREAM CLASSES: -
There are many classes involved in input-output stream operations. The classes
used for console and disk file operations are called stream classes.
There are different classes for handling console and disk files.
[Figure: hierarchy of console stream classes. ios, which holds a pointer to streambuf,
is the base class of istream and ostream; iostream is derived from both;
istream_withassign, iostream_withassign and ostream_withassign are derived from
istream, iostream and ostream]

ios class: This class provides the facilities common to both input and output
operations. The ios class holds a pointer to a streambuf (buffer object). Two
important classes derived from the ios class are istream and ostream.
istream class: This class is the input stream class and does formatted input. It
contains input functions such as get(), getline() and read(), and an overloaded member
function, the stream extraction operator >>. This extraction operator is used to read
data from the standard input device into memory items.
ostream class: This class is the output stream class and does formatted output. It
contains important output functions such as put() and write(). The insertion operator
<< of this class is used to write data to the standard output device.
streambuf: This class provides the buffer; the ios class holds a pointer to it.
iostream class: It provides both input and output functions. It is derived from
the multiple base classes istream and ostream, so it inherits the input and output
functions of both.
istream_withassign: This class adds the assignment operator to the istream class.
ostream_withassign: This class adds the assignment operator to the ostream class.
iostream_withassign: This class adds the assignment operator to the iostream class.
8.2.1 istream class:- The istream class performs all the activities related to
input. It is derived from the base class ios. The most important member function
of this class is the overloaded operator >>. For example: cin>>name;
The istream class contains several member functions; the most important functions
are get(), getline() and read().
1.get() : This function takes several forms, which are as follows:

get() : extracts one character and returns its value.
get(ch) : extracts one character into ch.
get(str, size) : extracts up to size-1 characters into the str array and appends a
null character.
get(str, size, delim) : extracts characters into the str array until size-1 characters
are extracted or the delim character is reached. The delim character itself is not
extracted.
get(ch) example:-
char ch;
cout<<"enter the character";
cin.get(ch);
cout<<ch;
Output (for the input ramesh):
r
get(str,size) example:-
char ch[10];
cout<<"enter the character";
cin.get(ch,3);
cout<<ch;
Output (for the input ramesh):
ra
get(str,size,delim) example:-
char ch[10];
cout<<"enter the character";
cin.get(ch,5,'e');
cout<<ch;
Output (for the input ramesh):
ram
2.getline(): This function extracts characters from the stream into a character array
until size-1 characters are extracted, the delim character is found or the end of
file is reached. Unlike get(), it removes the delim character from the stream.
syntax: istream & getline(char *str,streamsize n);
istream & getline(char *str,streamsize n,char delim);
ex:- char name[20];
cin.getline(name,10);
3.read(): This function reads a block of data of the length specified by the argument
'n'. This function reads data sequentially, so when the EOF is reached before the
whole block is read, the buffer will contain the elements read until EOF.
syntax: istream & read (char *str,streamsize n);
8.2.2 ostream class:- The ostream class handles all the activities related to
output. The ostream class is derived from the ios class. The ostream class contains
several member functions. The most important function of this class is the overloaded
<< operator function.
for example: cout<<name;
The other member functions of this class are put() and write().
1.put(): The put() function inserts the character specified by 'ch' into the output
stream if it is ready; otherwise the bad bit is set.
syntax: put(char ch);
2.write(): This function outputs a sequence of characters, starting with the character
pointed to by str, until the number of characters specified by 'n' has been
successfully written to the output stream. This function does not check for the
terminating null character.
syntax: write(const char *str,streamsize n);
8.3 UNFORMATTED I/O OPERATIONS:-
The objects cin and cout (predefined in the iostream header) are used for taking input
and displaying output for various data types. The operators >> (used for
input) and << (used for output) are overloaded to recognize all the basic C++ types.
The istream class overloads operator>> and the ostream class overloads
operator<<.
syntax: cin>>v1>>v2.........>>vn;
cout<<i1<<i2<<......<<in;
where v1,v2,...... are any valid C++ variable names and
i1,i2,...... may be variables or constants.
8.3.1 Character Oriented Input/Output Operations:-
The two functions get() and put() from the istream and ostream classes respectively
are used to handle single character oriented input/output operations. The get()
function gets a character from the keyboard and the put() function outputs a
character on the screen.
get():- There are two forms of the get() function
1. get(char &) gets a character, including blank space, tab and newline
characters, and assigns it to its argument.
2. get(void) is the same as the above function but returns the input character.
put(): The put() function outputs one character; a line of text can be displayed
character by character by calling it in a loop. The argument to
this function may be a single character variable or a number. If the argument is a
number (eg: 68) then this integer value is converted to a character value and
the character is displayed.
Example:
#include<iostream>
using namespace std;
int main()
{
int c=0;
char ch;
cout<<"enter the text \n";
while(cin.get(ch)&&ch!='\n') // stop at the end of the line or of the input
{
cout.put(ch);
c++;
}
cout<<endl<<"no. of characters ="<<c;
return 0;
}
Output : enter the text : suresh is good boy
suresh is good boy
no. of characters =18
8.3.2 Line Oriented Input/Output Operations:
The two line oriented I/O functions getline() and write() can read and display a line
of text more efficiently.
getline():- This function reads a line of text which ends with a newline character. It
can be called using the cin object as follows.
cin.getline(line,size);
where line is a character array that receives the line of text and
size is the maximum number of characters to be stored. The getline() function reads
the input until it encounters '\n' or size-1 characters are read. When '\n' is read it
is not saved; instead it is replaced by the null character.
A string can also be read using the >> operator, but the string should not contain any
white spaces. After reading a string, cin adds a null character at the end of the
string automatically.
write(): This function displays a line on the screen. It is called using the cout
object as follows: cout.write(line,size);
where line is the string to be displayed and size is the number of characters to be
displayed.
If size is greater than the length of line (i.e., the text to be displayed) then the
write() function does not stop displaying on encountering the null character; it
displays beyond the bounds of line.
Example of getline() and write() functions:
#include<iostream>
#include<cstring>
using namespace std;
int main()
{
const int size=10;
char text[size]; // room for size-1 characters and the null character
cout<<"enter some text\n";
cin.getline(text,size);
cout<<"your text is: \n";
cout.write(text,strlen(text)); // write only the characters actually read
return 0;
}
Output : enter some text : ramesh kumar
your text is : ramesh ku
8.3 FORMATTED CONSOLE I/O OPERATIONS:
C++ supports a number of features to format the output of a program. These are
1. ios class functions and flags
2. manipulators
3. user defined manipulators
i. ios class functions and flags:- The ios class includes a number of functions that
are used to format the output of a program in several ways.
width():- This function specifies the required width (or field size) for displaying
the output of an item. This function is invoked using an object of the ios class as
follows:
cout.width(w);
where w is the width (i.e., number of columns in a field)
for example
cout.width(5);
cout<<543<<12<<"\n";
Output (two leading blanks, then the digits):
  54312
The value 543 is printed right justified in the first five columns. The
specification width(5) is not retained for printing the number 12.
cout.width(5);
cout<<543;
cout.width(5);
cout<<12;
Output:
  543   12
If the width specified is smaller than the size of the value to be printed then C++
expands the field width to fit the value. The field width for each item should be
specified separately, because after printing one item the width reverts back to the
default value.
precision():- This function is used to set the number of digits to be displayed after
the decimal point while printing floating point numbers in fixed (ios::fixed) or
scientific notation. In the default notation it sets the total number of significant
digits. The default precision is six digits. This function is invoked as follows,
cout.precision(d);
e.g; cout.precision(2);
where d specifies the number of digits to be displayed.
fill():- This function is used to fill the unused positions in a field with any desired
character. When the field width is more than actually required by the value, these
unused positions are filled with white spaces by default. This function is called as
follows:
cout.fill(ch);
e.g; cout.fill('*');
Example:
#include<iostream>
#include<cmath>
using namespace std;
int main()
{
int n;
cout<<"enter the n value ";
cin>>n;
cout.setf(ios::fixed,ios::floatfield); // precision now counts digits after the point
cout.precision(2);
for(int i=0;i<n;i++)
{
cout.width(5);
cout.fill('*');
cout<<i;
cout.width(8);
cout.fill('#');
cout<<sqrt(double(i))<<endl;
}
return 0;
}
Output (for n=6):
****0####0.00
****1####1.00
****2####1.41
****3####1.73
****4####2.00
****5####2.24
setf(): - setf() (set flags) is used to set the format flags that control the output.
setf() is an overloaded function in the ios class. Therefore it has two forms as shown
below
cout.setf(argument);
cout.setf(argument1,argument2);
where the first argument in both forms is the flag to be set and argument2 in the
second form specifies the group to which argument1 belongs.
There are two types of flags.
1. On/off flags
2. Flags that work in a group.
The first type of flags can be set or unset using setf() and unsetf() respectively.
These flags take the first form of the setf() function.
For example:
cout.setf(ios::showpos);
cout<<375;
The flags that can be set/unset in this manner are as follows.

Flag             Meaning
ios::showpos     prints + sign before positive numbers.
ios::showpoint   shows trailing decimal point and zeros for floating outputs.
ios::uppercase   uses uppercase letters for hex values.
ios::skipws      skips white spaces on input.
ios::unitbuf     flushes all streams after each insertion.
ios::stdio       flushes stdout and stderr after each insertion (pre-standard
                 iostream library only).
The second type of formatting flags works in a group. These flags take the second
form of the setf() function.
for example: cout.setf(ios::hex,ios::basefield);
cout.setf(ios::left,ios::adjustfield);
There are three groups to which a flag can belong. These are: adjustfield, floatfield
and basefield. The table below shows the flags (argument1), bit fields (argument2)
and their format actions.
Format                        Flag (Argument 1)   Bit-field (Argument 2)
Left justification            ios::left           ios::adjustfield
Right justification           ios::right          ios::adjustfield
Padding after sign or base    ios::internal       ios::adjustfield
Scientific notation           ios::scientific     ios::floatfield
Fixed point notation          ios::fixed          ios::floatfield
Decimal base                  ios::dec            ios::basefield
Octal base                    ios::oct            ios::basefield
Hexadecimal base              ios::hex            ios::basefield
unsetf(): This function is used to clear the flag specified. It is called as follows:
cout.unsetf(flag);
for example: cout.unsetf(ios::left);
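The two forms of setf() and the effect of unsetf() can be tried out with the sketch below. It writes into a string stream instead of cout so that the result is easy to inspect; the function name formatFlagsDemo is an assumption made for this illustration, and standard headers are used in place of iostream.h.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Builds one line of output using on/off flags and group flags.
// (formatFlagsDemo is a hypothetical name used only for this sketch.)
std::string formatFlagsDemo()
{
    std::ostringstream out;

    out.setf(std::ios::showpos);                      // on/off flag: first form
    out << 375 << ' ';
    out.unsetf(std::ios::showpos);                    // clear the flag again

    out.setf(std::ios::hex, std::ios::basefield);     // group flag: second form
    out.setf(std::ios::uppercase);                    // uppercase hex digits
    out << 255 << ' ';

    out.setf(std::ios::dec, std::ios::basefield);     // back to decimal
    out.setf(std::ios::left, std::ios::adjustfield);  // left justification
    out.width(6);
    out.fill('*');
    out << 42;

    return out.str();
}
```

The returned string is "+375 FF 42****": the + sign comes from showpos, FF from hex with uppercase, and the trailing stars from left justification in a field of width 6.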

8.3.2 Manipulators: - The output of a program can also be formatted using a set of
functions called manipulators. These functions are simpler and clearer
compared to the member functions of the ios class, because with manipulators the
formatting instructions can be inserted directly into a stream.
There are two types of manipulators: those that take an argument and those that don’t
take an argument.
Manipulators that don’t take arguments are declared in the <iostream.h> header file
and the ones that take arguments are declared in the <iomanip.h> header file.
These manipulators are shown in the tables below.

Manipulators that don’t take arguments
Manipulator   Meaning
skipws        Skips white space on input.
noskipws      Don’t skip white space on input.
dec           Convert to decimal.
oct           Convert to octal.
hex           Convert to hexadecimal.
left          Left justification.
right         Right justification.
internal      Padding is placed between the sign or base indicator and the value.
endl          Inserts newline and flushes the output stream.
showpos       Shows + sign before positive numbers.
noshowpos     Don’t show + sign before positive numbers.
ends          Terminates an output string by inserting a null character.
flush         Flushes the output stream.

Manipulators that take an argument
Manipulator             Meaning                                    Equivalent ios class function
setw(int w)             Sets field width to w.                     width()
setfill(char ch)        Sets the fill character to ch.             fill()
setprecision(int d)     Sets the precision to d for floating       precision()
                        point values.
setiosflags(long f)     Sets the format flag specified by f.       setf()
resetiosflags(long f)   Clears the format flag specified by f.     unsetf()
For example we can set the width of a field as follows.

cout<<setw(5);
cout<<1350;
The output of this statement is right justified in a field width of 5. The output can
be made left justified with setiosflags(ios::left).
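Several manipulators can be chained in one statement, as sketched below. The function formatRow and its column widths are assumptions made for this illustration; a string stream is used in place of cout so that the formatted text can be examined, and the standard headers iomanip and sstream replace the older .h headers.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Formats a name in a left-justified column of width 8 padded with dots,
// followed by a price in a right-justified column of width 8 with two
// digits after the decimal point.
// (formatRow is a hypothetical name used only for this sketch.)
std::string formatRow(const std::string& name, double price)
{
    std::ostringstream out;
    out << std::left << std::setw(8) << std::setfill('.') << name
        << std::right << std::setw(8) << std::setfill(' ')
        << std::fixed << std::setprecision(2) << price;
    return out.str();
}
```

formatRow("pen", 3.5) yields "pen.....    3.50". Note that setw() applies only to the next item, while setfill(), fixed and setprecision() stay in effect until changed.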
8.3.2 User defined Manipulators:- User defined manipulators are the
manipulators created by the user. A user can create both manipulators with
arguments and manipulators without arguments.
Syntax: ostream & manipulator(ostream & output)
{
//code;
return output;
}
where manipulator is the name of the user defined manipulator.
Ex: ostream & tab(ostream & output)
{
return output<<"\t";
}
In this example 'tab' is a user defined manipulator. Whenever this manipulator
is called it inserts '\t' into the output stream.
The same general form is used in the next example, a manipulator that prints a unit
of measurement.
Example: ostream & measure(ostream & output)
{
output<<" centimeters";
return output;
}
If the statement is as follows
cout<<100<<measure;
then the output of this statement is 100 centimeters.
Let’s see another example of a user defined manipulator that does a sequence of
operations when it gets called.
Example: ostream & display(ostream & output)
{
output.setf(ios::showpos);
output.setf(ios::showpoint);
output<<setw(7);
return output;
}
The user defined manipulator "display" sets the showpos and showpoint flags
of the ios class and also sets the field width to 7.
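A complete version of the measure manipulator, together with one possible way to write a manipulator that takes arguments, is sketched below. The helper type Repeat, the function repeat and the function manipulatorDemo are assumptions introduced only for this example; the approach shown (a small object plus an overloaded operator<<) is one common technique, not the only one.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Manipulator without arguments: takes and returns the stream.
std::ostream& measure(std::ostream& output)
{
    return output << " centimeters";
}

// Manipulator with arguments: repeat(ch, count) returns a small object,
// and operator<< for that object does the actual work.
// (Repeat and repeat are hypothetical names for this sketch.)
struct Repeat {
    char ch;
    int count;
};

Repeat repeat(char ch, int count)
{
    return Repeat{ch, count};
}

std::ostream& operator<<(std::ostream& output, const Repeat& r)
{
    for (int i = 0; i < r.count; ++i)
        output.put(r.ch);                // insert the character count times
    return output;
}

// Uses both manipulators on a string stream so the result can be checked.
std::string manipulatorDemo()
{
    std::ostringstream out;
    out << 100 << measure << repeat('-', 3);
    return out.str();
}
```

manipulatorDemo() returns "100 centimeters---". The same statements work with cout in place of the string stream.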
Difference between istream and ostream

istream                                      ostream
1. The istream class manages the input       1. The ostream class manages the output
   stream.                                      stream.
2. It provides the facilities for both       2. It provides the facilities for both
   formatted and unformatted input.             formatted and unformatted output.
3. It overloads the extraction operator      3. It overloads the insertion operator
   >>                                           <<
4. The cin object of the istream class is    4. The cout object of the ostream class is
   used to read data from an input device.      used to write data to an output device.
5. The member functions of this class        5. The member functions of this class
   are get(), getline() and read()              are put() and write()
6. Its file-handling derived class is        6. Its file-handling derived class is
   ifstream, whose open() opens a file          ofstream, whose open() opens a file
   for reading.                                 for writing. precision(), inherited
                                                from ios, sets the precision of
                                                floating point output.
8.4 File Streams:

A file is a collection of related information. Generally a file represents
programs or data. A common definition of a file is a collection of characters or a
sequence of bytes. A file is referred to through a name called the file name. The
file manipulation operations are as follows.
Creation of a file
Opening a file
Writing to a file
Reading from a file
Closing a file
Three important classes for handling files are
ofstream: Stream class for writing data to files and for handling output files.
ifstream: Stream class for reading data from files and for handling input files.
fstream: Stream class to both read from and write to files.
To perform file I/O, we need to include <fstream.h> (<fstream> in standard C++)
Open File Operation: A file can be opened by the function called open(). The
syntax of file open is
open(filename,mode)
File Modes are
ios::in       Open a file for input operations
ios::out      Open a file for output operations
ios::app      Open a file for appending output at the end of the file
ios::binary   Open a file in binary mode
Ex:
fstream fs;
fs.open("raja.txt",ios::in|ios::out);
Close File Operation: To close the file the member function close() is used. The
close() function takes no parameter and returns no value.
fs.close();
Read and Write Operations:
#include<iostream>
#include<fstream>
using namespace std;
int main()
{
char name[20];
int sno,marks;
cout<<"enter the name,sno,marks";
cin>>name>>sno>>marks;
fstream fs;
fs.open("raja.txt",ios::out); // write mode: creates or empties the file
fs<<sno<<endl;
fs<<name<<endl;
fs<<marks<<endl;
fs.close();
fs.open("raja.txt",ios::in); // read mode: start again from the beginning
fs>>sno;
fs>>name;
fs>>marks;
fs.close();
cout<<sno<<" "<<name<<" "<<marks;
return 0;
}
fs.open("raja.txt",ios::out): Opens the file raja.txt in output mode for writing data
to it. All the existing contents are deleted and the output pointer is located at the
beginning of the file (Write Mode)
fs.open("raja.txt",ios::in): Opens the file raja.txt in input mode for reading data
from the file. The input pointer is located at the beginning of the file (Read Mode)
fs.open("raja.txt",ios::app): Opens the file raja.txt in append mode for writing data
at the end of the file. The output pointer is located at the end of the file (Append
Mode)
8.5 Positioning the Pointer in the File: While performing file operations, we must
be able to reach any desired position inside the file. For this purpose there are
two commonly used operations:
1. seek: The seek operation is done using two functions, seekg and seekp
seekg: moves the get pointer to a specific location for reading of a record
seekp: moves the put pointer to a specific location for writing of a record
syntax:
seekg(offset, reference-position);
seekp(offset, reference-position);
where offset is a number of bytes, counted from the reference position.
reference-position specifies the beginning, end or current position. It can
be specified as

ios::beg for beginning location
ios::end for end of file
ios::cur for current location
2. tell: This operation tells us the current position
seqfile.tellg() - gives current position of get pointer
(for reading the record)
seqfile.tellp() - gives current position of put pointer
(for writing the record)
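The use of seekg() and tellg() can be sketched as below. The functions readAt and streamSize are hypothetical helpers written for this illustration. They accept any std::istream, so they work with an ifstream opened on a disk file as well as with the in-memory string stream used in the usage note.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Reads up to 'count' characters starting 'offset' bytes from the beginning.
// (readAt is a hypothetical name used only for this sketch.)
std::string readAt(std::istream& in, std::streamoff offset, std::streamsize count)
{
    in.clear();                               // forget an earlier end-of-file state
    in.seekg(offset, std::ios::beg);          // move the get pointer
    std::string buffer(static_cast<std::size_t>(count), '\0');
    in.read(&buffer[0], count);
    buffer.resize(static_cast<std::size_t>(in.gcount()));  // keep what was read
    return buffer;
}

// Size of the stream in bytes: seek to the end and ask for the position.
// (streamSize is a hypothetical name used only for this sketch.)
std::streamoff streamSize(std::istream& in)
{
    in.clear();
    in.seekg(0, std::ios::end);
    return static_cast<std::streamoff>(in.tellg());
}
```

For a stream holding the two lines "1111 Aaaaa" and "2222 Bbbbb" (each followed by a newline), streamSize() gives 22 and readAt(stream, 11, 4) gives "2222", the start of the second line.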
8.6 Sequential Access to Files:

A file can be accessed either sequentially or randomly. A sequential file means
that the data in the file is accessed sequentially: to read a data item from a
sequential file, the preceding data items must be processed first. A random file
allows data to be accessed randomly. In a random file there is no need to access the
preceding elements. A random file can also be accessed sequentially. The type of media
used for storing/accessing decides whether the file is accessed randomly or
sequentially. A file on tape can be accessed only sequentially, and a file on disk can
be accessed randomly.
File streams in C++ support various kinds of functions for performing I/O
operations on files. For handling data character by character you can use the put()
and get() functions. To manipulate a set of characters at a time you can use the
write() and read() functions.
8.7 Special Characters in File:

When we handle files, there are situations in which some extra and unexpected
characters may appear in the file. For example
1. Sometimes we can see ^Z appear at the end of the file. It is the end of file
character. This is most likely to happen on MS-DOS systems.
2. Some systems indicate the end of a line with a pair of characters, a carriage
return followed by a line feed.
8.8 Fundamental File Structure Concepts:

Field and Record Organization:
1. A file is a persistent data structure. That means once we store the data in the
file, it can be read by another program, and the contents of an existing file can also
be modified.
2. The basic unit of a file is the field. Each field contains some data. In a file
there can be many entries of data of the same data type. For example, a file may
contain multiple roll numbers.
3. Similarly, a file may contain data of mixed data types. That is, a file may
contain the student’s roll number, name, address and marks. A group of various
fields that belong together is called a record.
SNo     Name      Place     Address
1111    Aaaaa     Kkkk      Pune
2222    Bbbbb     Yyyyy     Mumbai
3333    Cccccc    Yyyyy     Chennai
4444    Ddddd     Dddd      Pune

4. In C we can define these records as a structure type. In C++ or Java we can create
a class for defining records. The fields of these records are called members. We
can store or refer to these records in memory by means of objects.
5. The records in the file can be organized as a stream of bytes.
Field Structure:
There are various ways to create a field structure. These are
1. Create a field structure of fixed length.
2. Add the field length at the start of each field.
3. Place a delimiter at the end of each field.
4. Use the expression “keyword=value” for defining the fields
Record Structure:
A record is a collection of various fields. There are four methods of defining
the structure of the record.
1. Create a record with a predictable number of bytes in length.
2. Create a record with a predictable number of fields.
3. Begin each record with a length indicator which gives the number of bytes in the
record.
4. Store the starting address of each record in an index.
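The first field structure and the first record structure (fixed-length fields in a fixed-length record) can be sketched as follows. The field widths, the Student type and the function names are all assumptions chosen for this illustration; they are not taken from these notes.

```cpp
#include <cstddef>
#include <string>

// Assumed field widths for this sketch.
const std::size_t SNO_LEN = 4;
const std::size_t NAME_LEN = 10;
const std::size_t PLACE_LEN = 10;
const std::size_t RECORD_LEN = SNO_LEN + NAME_LEN + PLACE_LEN;

struct Student {
    std::string sno;
    std::string name;
    std::string place;
};

// Pad with blanks (or cut) so that the field is exactly 'width' characters.
std::string fixedField(const std::string& value, std::size_t width)
{
    std::string field = value.substr(0, width);
    field.append(width - field.size(), ' ');
    return field;
}

// Pack the fields one after another into a buffer of RECORD_LEN bytes.
std::string pack(const Student& s)
{
    return fixedField(s.sno, SNO_LEN) + fixedField(s.name, NAME_LEN) +
           fixedField(s.place, PLACE_LEN);
}

// Remove the blank padding at the right end of a field.
std::string trimRight(const std::string& field)
{
    const std::size_t end = field.find_last_not_of(' ');
    return end == std::string::npos ? std::string() : field.substr(0, end + 1);
}

// Unpack a buffer of RECORD_LEN bytes: each field sits at a known offset.
Student unpack(const std::string& record)
{
    return Student{trimRight(record.substr(0, SNO_LEN)),
                   trimRight(record.substr(SNO_LEN, NAME_LEN)),
                   trimRight(record.substr(SNO_LEN + NAME_LEN, PLACE_LEN))};
}
```

Because every record has the same length, record number r (counting from 0) starts at byte r * RECORD_LEN, so it can be reached directly with seekg(). The price is wasted space for short values and truncation of long ones.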
