DATA STRUCTURE
Second Year Computer Science and Engineering, 3rd Semester
LINEAR STRUCTURES
Example:
Find routine
Example:
Find Previous
The next routine is the insertion routine. We pass an element to be inserted along with a
list L and a position P. The insertion routine inserts the element after position P.
Find routine
Deletion Routine
Insertion Routine
void Insert(ElementType X, List L, Position P)
{
Position TmpCell;
TmpCell = CursorAlloc();
if (TmpCell == 0)
FatalError("Out of space");
CursorSpace[TmpCell].Element = X;
CursorSpace[TmpCell].Next = CursorSpace[P].Next;
CursorSpace[P].Next = TmpCell;
}
In the above example, if the value of L is 5 and the value of M is 3, then L represents the
list a, b, c and M represents the list c, d, f.
The ‘L’ List is as follows
A Doubly linked list is a linked list in which each node has three fields namely,
o Data field
o Forward Link(FLINK)
o Backward Link(BLINK)
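The three-field node, and the pointer surgery an insertion requires, can be sketched in C. This is an illustrative sketch with int data; the struct name and the InsertAfter helper are not from the notes.

```c
#include <stdlib.h>

/* Node of a doubly linked list, using the field names above. */
struct DNode {
    int Data;
    struct DNode *FLINK;   /* forward link: next node */
    struct DNode *BLINK;   /* backward link: previous node */
};

/* Insert a new node carrying X after node P. */
struct DNode *InsertAfter(int X, struct DNode *P) {
    struct DNode *N = malloc(sizeof *N);
    N->Data = X;
    N->FLINK = P->FLINK;
    N->BLINK = P;
    if (P->FLINK != NULL)
        P->FLINK->BLINK = N;   /* old successor points back at N */
    P->FLINK = N;
    return N;
}
```

Note that four links change: the new node's FLINK and BLINK, P's FLINK, and the BLINK of P's old successor.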
Advantages
Disadvantage
1. Polynomial ADT
o An Abstract Data Type for single _variable polynomials (with non negative
exponents) can be defined by using a list.
o Let F(X) = ∑ AiX^i, for i = 0 to N. If most of the coefficients Ai are non-zero, then a
simple array is used to store the coefficients.
o Then write the routines to perform addition, subtraction, multiplication and other
operations on these polynomials.
Linked list representation of polynomials
P1(X) = 10X^1000 + 5X^14 + 1
P2(X) = 3X^1990 - 2X^1492 + 11X + 5
typedef struct
{
int CoeffArray[MaxDegree + 1];
int HighPower;
}*Polynomial;
2. Radix sort
The first pass bucket-sorts by the least significant digit. So, the list sorted by the least
significant digit is: 0, 1, 512, 343, 64, 125, 216, 27, 8, 729.
Buckets after the second pass (sorted by the tens digit; each bucket listed bottom to top):
0: 0, 1, 8    1: 512, 216    2: 125, 27, 729    4: 343    6: 64
The list is now sorted with respect to the two least significant digits.
The final pass, bucket-sorts by the most significant digit.
The final list is 0,1,8,27,64,125,216,343,512,729.
Buckets after the final pass (sorted by the hundreds digit):
0: 0, 1, 8, 27, 64    1: 125    2: 216    3: 343    5: 512    7: 729
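The three passes above can be sketched in C. This is a sketch assuming at most 64 keys, each below 1000; every pass distributes the keys into ten buckets by one digit and then collects the buckets in order 0..9, which preserves the ordering produced by earlier passes.

```c
/* Three-pass radix (bucket) sort for keys below 1000:
   sort by units, then tens, then hundreds digit. */
void RadixSort(int a[], int n) {
    int bucket[10][64], count[10];
    int pass, i, b, j, divisor = 1;
    for (pass = 0; pass < 3; pass++, divisor *= 10) {
        for (b = 0; b < 10; b++)
            count[b] = 0;
        for (i = 0; i < n; i++) {            /* distribute into buckets */
            b = (a[i] / divisor) % 10;
            bucket[b][count[b]++] = a[i];
        }
        for (b = 0, i = 0; b < 10; b++)      /* collect buckets 0..9 */
            for (j = 0; j < count[b]; j++)
                a[i++] = bucket[b][j];
    }
}
```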
Example:
A university with 40,000 students and 2,500 courses needs to be able to generate
two types of reports.
1. The first report lists the registration for each class.
2. The second report lists, by student the classes that each student is
registered for.
The implementation might use a two-dimensional array. Such an array would
have 100 million entries. The average student registers for about three courses, so
only 120,000 of these entries would have meaningful data.
This problem can be easily solved using linked lists. Two lists are needed for this
implementation:
a list for each class containing the students in the class, and a list for each student
containing the classes the student is registered for.
The following figure shows the implementation.
As the figure shows, two lists are combined into one.
All lists use a header and are circular.
Figure: Multilist implementation for registration problem
To list all of the students in class C3, start at C3 and traverse its list by going right.
The first cell belongs to student S1.
int IsEmpty(Stack S)
{
return S->Next == NULL;
}
Stack CreateStack(void)
{
Stack S;
S=malloc(sizeof(struct Node));
if(S==NULL)
Fatal Error (“Out of space”);
MakeEmpty(S);
return S;
}
void MakeEmpty(Stack S)
{
if(S==NULL)
Error("Must use CreateStack first");
else
while(!IsEmpty(S))
Pop(S);
}
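The notes show IsEmpty, CreateStack, MakeEmpty, Top and Pop for the linked implementation, but not Push. A matching Push might look like this; it is a sketch using int elements and the same header-node convention, with the node and stack declarations repeated so the fragment is self-contained.

```c
#include <stdlib.h>

typedef struct Node *PtrToNode;
typedef PtrToNode Stack;

struct Node {
    int Element;
    PtrToNode Next;
};

/* Push X onto stack S (S points to a header node; the top of the
   stack is the first cell after the header). */
void Push(int X, Stack S) {
    PtrToNode TmpCell = malloc(sizeof(struct Node));
    if (TmpCell == NULL)
        return;                  /* out of space */
    TmpCell->Element = X;
    TmpCell->Next = S->Next;     /* new cell points at the old top */
    S->Next = TmpCell;           /* header now points at the new cell */
}
```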
Insert 5
ElementType Top(Stack S)
{
if (!IsEmpty(S))
return S->Next->Element;
Error("Empty stack");
return 0;
}
Example:
It returns 10
void Pop(Stack S)
{
PtrToNode FirstCell;
if (IsEmpty(S))
Error("Empty stack");
else
{
FirstCell = S->Next;
S->Next = S->Next->Next;
free(FirstCell);
}
}
Example:
Each stack has a TopOfStack field whose value is set to -1 for an empty stack.
PUSH
To push some element X onto the stack, increment TopOfStack and then set
Stack[TopOfStack] = X, where Stack is the array representing the actual stack.
POP:
To pop, set the return value to Stack[TopOfStack] and then decrement TopOfStack.
The Push and Pop operations are very fast; on some machines each takes only one machine
instruction.
#define EmptyTOS (-1)
#define MinStackSize (5)
struct StackRecord
{
int Capacity;
int TopOfStack;
ElementType *Array;
};
Since the maximum size of the stack is known in advance, the stack can be dynamically allocated.
Stack Creation
int IsEmpty(Stack S)
{
return S->TopOfStack == EmptyTOS;
}
Routine to create an empty stack
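A creation routine in the style of the surrounding declarations might be filled in as follows. This is a sketch: error handling is simplified to returning NULL, and ElementType is taken to be int.

```c
#include <stdlib.h>

#define EmptyTOS (-1)
#define MinStackSize (5)

typedef int ElementType;

typedef struct StackRecord {
    int Capacity;
    int TopOfStack;
    ElementType *Array;
} *Stack;

/* Allocate a stack able to hold MaxElements and mark it empty. */
Stack CreateStack(int MaxElements) {
    Stack S;
    if (MaxElements < MinStackSize)
        return NULL;              /* stack size is too small */
    S = malloc(sizeof(struct StackRecord));
    if (S == NULL)
        return NULL;              /* out of space */
    S->Array = malloc(sizeof(ElementType) * MaxElements);
    if (S->Array == NULL)
        return NULL;
    S->Capacity = MaxElements;
    S->TopOfStack = EmptyTOS;     /* the stack starts out empty */
    return S;
}
```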
First, the symbol a is read, so it is passed through to the output. Then '+' is read and
pushed onto the stack. Next b is read and passed through to the output.
Next '*' is read. The top entry on the operator stack has lower precedence than '*', so nothing is
output and '*' is put on the stack. Next, c is read and output. Thus far, we have
The next symbol is a '+'. Checking the stack, we pop the '*' and place it on the output; we then
pop the other '+', which is of equal (not lower) priority, and place it on the output; finally we
push the new '+'.
The next symbol read is an '(', which, being of highest precedence, is placed on the stack. Then d
is read and output.
We continue by reading a '*'. Since open parentheses do not get removed except when a closed
parenthesis is being processed, there is no output. Next, e is read and output.
The next symbol read is a '+'. We pop and output '*' and then push '+'. Then we read and
output f.
Now we read a ')', so the stack is emptied back to the '('. We output a '+'.
We read a '*' next; it is pushed onto the stack. Then g is read and output.
The input is now empty, so we pop and output symbols from the stack until it is empty.
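The whole conversion can be sketched as one C routine. This is an illustrative implementation, not the notes' own code: it handles only one-letter operands, '+', '*' and parentheses, and gives '(' the lowest in-stack precedence so that only a ')' removes it.

```c
/* Precedence while sitting on the stack: '(' lowest. */
static int Prec(char op) {
    switch (op) {
    case '+': return 1;
    case '*': return 2;
    default:  return 0;   /* '(' */
    }
}

/* Convert an infix expression with one-letter operands, +, * and
   parentheses to postfix.  out must be large enough. */
void InfixToPostfix(const char *in, char *out) {
    char stack[64];
    int top = -1, j = 0;
    for (; *in; in++) {
        char c = *in;
        if (c >= 'a' && c <= 'z')            /* operand: straight to output */
            out[j++] = c;
        else if (c == '(')
            stack[++top] = c;
        else if (c == ')') {                 /* empty stack back to '(' */
            while (top >= 0 && stack[top] != '(')
                out[j++] = stack[top--];
            top--;                           /* discard the '(' */
        } else {                             /* operator */
            while (top >= 0 && Prec(stack[top]) >= Prec(c))
                out[j++] = stack[top--];
            stack[++top] = c;
        }
    }
    while (top >= 0)                         /* input empty: drain the stack */
        out[j++] = stack[top--];
    out[j] = '\0';
}
```

Running it on the example expression a+b*c+(d*e+f)*g reproduces the output derived step by step above.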
TREE STRUCTURES
Inorder Traversal
Traverse the left subtree
Process the node
Traverse the right subtree
Preorder Traversal
process the node
process left subtree in preorder
process right subtree in preorder
Postorder Traversal:
process the left subtree
process the right subtree
process the root
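The three traversals can be written as short recursive C routines. In this sketch each routine appends the visited labels to a string instead of printing them, so the three orders are easy to compare; the node carries a single character.

```c
#include <string.h>

struct TreeNode {
    char Element;
    struct TreeNode *Left, *Right;
};

/* Inorder: left subtree, node, right subtree. */
void Inorder(struct TreeNode *T, char *Out) {
    if (T == NULL) return;
    Inorder(T->Left, Out);
    strncat(Out, &T->Element, 1);
    Inorder(T->Right, Out);
}

/* Preorder: node first, then both subtrees. */
void Preorder(struct TreeNode *T, char *Out) {
    if (T == NULL) return;
    strncat(Out, &T->Element, 1);
    Preorder(T->Left, Out);
    Preorder(T->Right, Out);
}

/* Postorder: both subtrees first, then the node. */
void Postorder(struct TreeNode *T, char *Out) {
    if (T == NULL) return;
    Postorder(T->Left, Out);
    Postorder(T->Right, Out);
    strncat(Out, &T->Element, 1);
}
```

On the expression tree for a + b * c, these produce a+b*c, +a*bc and abc*+ respectively.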
Expression tree is a binary tree in which the leaf nodes are operands and other nodes
contain operators.
For eg:
Expression tree for (a+b*c)+((d*e+f)*g)
If the operator is unary minus operator, the node can have only one child.
Postorder Traversal
Recursively print left subtree
Then right subtree
Operator
Eg: abc*+de*f+g*+
Preorder Traversal
Print the operator
Left subtree
Right subtree
There are many applications for trees. One of the popular uses is the directory structure in
many common operating systems, including UNIX, VAX/VMS, and DOS.
The popular methods for traversing a tree are preorder, postorder and inorder. In a
preorder traversal, work at a node is performed before (pre) its children are processed. In a
postorder traversal, the work at a node is performed after (post) its children are evaluated.
Make Empty
This operation is mainly for initialization, used when a programmer prefers to initialize
the first element as a one-node tree.
SearchTree MakeEmpty(SearchTree T)
{
if (T != NULL)
{
MakeEmpty(T->Left);
MakeEmpty(T->Right);
free(T);
}
return NULL;
}
Find:
This operation returns a pointer to the node in tree T that has a key X or NULL if there is
no such node.
Position Find(ElementType X, SearchTree T)
{
if (T == NULL)
return NULL;
if (X < T->Element)
return Find(X, T->Left);
else if (X > T->Element)
return Find(X, T->Right);
else
return T;
}
10 is checked with 10
Element is found
Position FindMin(SearchTree T)
{
if (T == NULL)
return NULL;
else if (T->Left == NULL)
return T;
else
return FindMin(T->Left);
}
Insert 5:
Insert 10:
Insert 15:
Insert 20:
Insert 18:
Insert 3:
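The sequence of insertions above can be reproduced with the usual recursive Insert routine. This is a sketch with int elements; duplicates are simply ignored.

```c
#include <stdlib.h>

typedef int ElementType;
typedef struct TreeNode {
    ElementType Element;
    struct TreeNode *Left, *Right;
} *SearchTree;

/* Insert X, returning the (possibly new) root of the subtree. */
SearchTree Insert(ElementType X, SearchTree T) {
    if (T == NULL) {               /* create a one-node tree */
        T = malloc(sizeof(struct TreeNode));
        T->Element = X;
        T->Left = T->Right = NULL;
    } else if (X < T->Element)
        T->Left = Insert(X, T->Left);
    else if (X > T->Element)
        T->Right = Insert(X, T->Right);
    /* else X is already in the tree; do nothing */
    return T;
}
```

Inserting 5, 10, 15, 20, 18, 3 in that order makes 5 the root, with 3 as its left child and 10, 15, 20 chained down the right spine (18 hanging off 20's left).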
The node structure for a threaded binary tree varies a bit, and looks like this:
struct NODE
{
struct NODE *leftchild;
int node_value;
struct NODE *rightchild;
struct NODE *thread;
};
Let's make the Threaded Binary tree out of a normal binary tree...
The INORDER traversal for the above tree is -- D B A E C. So, the respective Threaded Binary
tree will be –
B has no right child and its inorder successor is A, so a thread is made between
them. Similarly for D and E. C has no right child and no inorder successor either, so it has a
hanging thread.
As this is a non-recursive method for traversal, it has to be an iterative procedure; meaning, all
the steps for the traversal of a node have to be under a loop so that the same can be applied to all
the nodes in the tree.
consider the INORDER traversal again. Here, for every node, we'll visit the left sub-tree (if it
exists) first (if and only if we haven't visited it earlier); then we visit (i.e print its value, in our
case) the node itself and then the right sub-tree (if it exists). If the right sub-tree is not there, we
check for the threaded link and make the threaded node the current node in consideration. Please,
follow the example given below.
step-1:
'A' has a left child i.e B, which has not been visited. So, we put B in our "list of visited nodes"
and B becomes our current node in consideration.
Inorder
step-2:
'B' also has a left child, 'D', which is not there in our list of visited nodes. So, we put 'D' in that
list and make it our current node in consideration.
BD
Inorder
step-3:
'D' has no left child, so we print 'D'. Then we check for its right child. 'D' has no right child and
thus we check for its thread-link. It has a thread going till node 'B'. So, we make 'B' as our
current node in consideration.
BD
Inorder
D
step-4:
'B' certainly has a left child but its already in our list of visited nodes. So, we print 'B'. Then we
check for its right child but it doesn't exist. So, we make its threaded node (i.e 'A') as our current
node in consideration.
BD
Inorder
DB
step-5:
'A' has a left child, 'B', but its already there in the list of visited nodes. So, we print 'A'. Then we
check for its right child. 'A' has a right child, 'C' and its not there in our list of visited nodes. So,
we add it to that list and we make it our current node in consideration.
BDC
Inorder
DBA
step-6:
'C' has 'E' as the left child and its not there in our list of visited nodes even. So, we add it to that
list and make it our current node in consideration.
BDCE
Inorder
DBA
and finally.....
DBAEC
6. Explain the C implementation of a threaded binary tree.
Algorithm:-
Step-1: For the current node check whether it has a left child which is not there in the visited list.
If it has then go to step-2 or else step-3.
Step-2: Put that left child in the list of visited nodes and make it your current node in
consideration. Go to step-6.
Step-3: For the current node check whether it has a right child. If it has then go to step-4 else go
to step-5
Step-4: Make that right child as your current node in consideration. Go to step-6.
Step-5: Check for the threaded node and, if it is there, make it your current node.
Step-6: Go to step-1 if all the nodes are not over otherwise quit
struct NODE
{
struct NODE *left;
int value;
struct NODE *right;
struct NODE *thread;
};
Deletion Routine:
SearchTree Delete( ElementType X, SearchTree T )
{
Position TmpCell;
if( T == NULL )
Error( "Element not found" );
else
if( X < T->Element )
T->Left = Delete( X, T->Left );
else
if( X > T->Element )
T->Right = Delete( X, T->Right );
else
if( T->Left && T->Right )
{
TmpCell = FindMin( T->Right );
T->Element = TmpCell->Element;
T->Right = Delete( T->Element, T->Right );
}
else
{
TmpCell = T;
if( T->Left == NULL )
T = T->Right;
else if( T->Right == NULL )
T = T->Left;
free( TmpCell );
}
return T;
}
Example 2:
Delete 25
8. Construct an expression tree with a neat example:
An expression tree can be constructed from a postfix expression.
Algorithm for construction of expression tree:
1. If infix expression is given, convert it to postfix expression
2. Read the expression one symbol at a time
3. If the symbol is an operand, create a one node tree and push a pointer onto a stack
4. If the symbol is an operator, pop pointers of two trees T1 and T2 from the stack and
form a new tree whose root is the operator, T2 is the left child and T1 is the right
child. A pointer to this new tree is then pushed onto the stack.
Next, a '+' is read, so two pointers to trees are popped, a new tree is formed, and a pointer
to it is pushed onto the stack.
Next, c, d, and e are read, and for each a one-node tree is created and a pointer to the
corresponding tree is pushed onto the stack.
Now a '+' is read, so two trees are merged.
Now, a '*' is read, so we pop two tree pointers and form a new tree with a '*' as root.
Finally, the last symbol is read, two trees are merged, and a pointer to the final tree is left
on the stack.
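The algorithm can be sketched in C. This illustrative version handles one-letter operands and the operators '+' and '*', and uses an array as the stack of tree pointers; the first pop becomes the right child (T1) and the second the left child (T2), as in step 4 above.

```c
#include <stdlib.h>
#include <string.h>

struct TreeNode {
    char Element;
    struct TreeNode *Left, *Right;
};

/* Build an expression tree from a postfix string. */
struct TreeNode *BuildExpressionTree(const char *postfix) {
    struct TreeNode *stack[64];
    int top = -1;
    for (; *postfix; postfix++) {
        struct TreeNode *t = malloc(sizeof *t);
        t->Element = *postfix;
        t->Left = t->Right = NULL;
        if (*postfix == '+' || *postfix == '*') {
            t->Right = stack[top--];   /* first pop: T1, the right child */
            t->Left = stack[top--];    /* second pop: T2, the left child */
        }
        stack[++top] = t;
    }
    return stack[top];                 /* pointer to the final tree */
}

/* Postorder print into Out: should reproduce the postfix input. */
void Postorder(struct TreeNode *T, char *Out) {
    if (T == NULL) return;
    Postorder(T->Left, Out);
    Postorder(T->Right, Out);
    strncat(Out, &T->Element, 1);
}
```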
UNIT III
BALANCED TREE
An AVL (Adelson-Velskii and Landis) tree is a binary search tree with a balance
condition. The balance condition
Must be easy to maintain
Ensures that the depth of the tree is O(log N)
(Simply requiring that the left and right subtrees have the same height would be too restrictive.)
Definition:
An AVL tree is a binary search tree, except that for every node in the tree, the heights of
the left and right subtrees can differ by at most 1.
The height of empty tree is defined to be -1.
Balance Factor:
The balance factor is the height of the left subtree minus the height of the right subtree.
BF = height of left subtree - height of right subtree
For an AVL tree, every balance factor must be +1, 0 or -1.
Single Rotation:
Case (i): An insertion into the left subtree of left child of k2.
Single rotation to fix case 1
Case (iv): An insertion into the right subtree of the right child of K1
Single Rotation to fix case 4:
Eg: Insertion of 10 to the AVL tree
Double Rotation:
General Representation
Declaration
struct AVLNode
{
ElementType Element;
AVLTree Left;
AVLTree Right;
int Height;
};
Eg: Insert the following elements into the AVL tree 2,1,4,5,9,3,6,7
4.Explain binary heap?
Structure Property:
A heap is a binary tree that is completely filled, with the possible exception of the bottom
level, which is filled from left to right. Such a tree is known as a complete binary tree.
A complete binary tree of height h has between 2^h and 2^(h+1) - 1 nodes.
For example, if the height is 3 then the number of nodes is between 8 and 15 (i.e., 2^3 and 2^4 - 1).
Position: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
Element:  - A B C D E F G H I J  -  -  -  -
(Position 0 of the array is left unused.)
For any element in array position i, the left child is in position 2i, the right child is in
position (2i + 1) and parent is in position (i/2).
Advantage:
It doesn’t require pointers
Operations required to traverse the tree is simple & fast.
Disadvantage:
Estimation of maximum heap size is required in advance.
Heap Order Property:
It allows operations to be performed quickly.
In order to find the minimum quickly, the smallest element should be the root.
Every node should be smaller than all of its descendants.
For every node X, the key in the parent of X is smaller than or equal to the key in X, with
the exception of the root.
Insert
To insert an element X into the heap, we create a hole in the next available location; otherwise
the tree will not be complete. If X can be placed in the hole without violating the heap order,
then we place X there. Otherwise, we slide the element that is in the hole's parent node into the
hole, thus bubbling the hole up toward the root. This process continues until X can be placed in
the hole. This strategy is known as percolate up (the new element is percolated up the heap until
the correct location is found).
Routine to insert into a binary heap:
void Insert(ElementType X, PriorityQueue H)
{
int i;
if (IsFull(H)) {
Error("Priority queue is full");
return;
}
for (i = ++H->Size; H->Elements[ i / 2 ] > X; i /= 2)
H->Elements[ i ] = H->Elements[ i / 2 ];
H->Elements[ i ] = X;
}
Routine to perform DeleteMin on a binary heap:
ElementType DeleteMin(PriorityQueue H)
{
int i, Child;
ElementType MinElement, LastElement;
if (IsEmpty(H))
{
Error("Priority queue is empty");
return H->Elements[ 0 ];
}
MinElement = H->Elements[ 1 ];
LastElement = H->Elements[ H->Size-- ];
for (i = 1; i * 2 <= H->Size; i = Child) {
/* Find smaller child */
Child = i * 2;
if (Child != H->Size && H->Elements[ Child + 1 ]
< H->Elements[ Child ])
Child++;
/*Percolate one level */
if (LastElement > H->Elements[ Child ])
H->Elements[ i ] = H->Elements[ Child ];
else
break;
}
H->Elements[ i ] = LastElement;
return MinElement;
}
Eg. Remove 13
6.Explain B Tree?
Definition of a B-tree
• A B-tree of order m is an m-way tree (i.e., a tree where each node may have up to m
children) in which:
1. the number of keys in each non-leaf node is one less than the number of its children and
these keys partition the keys in the children in the fashion of a search tree
2. all leaves are on the same level
An example B-Tree
Constructing a B-tree
• Suppose we start with an empty B-tree and keys arrive in the following order:1 12 8 2
25 5 14 28 17 7 52 16 48 68 3 26 29 53 55 45
• Therefore, when 25 arrives, pick the middle key to make a new root
Adding 17 to the right leaf node would over-fill it, so we take the middle key, promote it (to the
root) and split the leaf
• If this would result in that leaf becoming too big, split the leaf into two, promoting the
middle key to the leaf’s parent
• If this would result in the parent becoming too big, split the parent into two, promoting
the middle key
• This strategy might have to be repeated all the way to the top
• If necessary, the root is split in two and the middle key is promoted to a new root, making
the tree one level higher
7.Explain the steps to insert into B Tree?
Exercise in Inserting a B-Tree
• 3, 7, 9, 23, 45, 1, 5, 14, 25, 24, 13, 11, 8, 19, 4, 31, 35, 56
• During insertion, the key always goes into a leaf. For deletion we wish to remove from a
leaf. There are three possible ways we can do this:
• 1 - If the key is already in a leaf node, and removing it doesn’t cause that leaf node to
have too few keys, then simply remove the key to be deleted.
• 2 - If the key is not in a leaf then it is guaranteed (by the nature of a B-tree) that its
predecessor or successor will be in a leaf -- in this case we can delete the key and
promote the predecessor or successor key to the non-leaf deleted key’s position.
• If (1) or (2) lead to a leaf node containing less than the minimum number of keys then we
have to look at the siblings immediately adjacent to the leaf in question:
• 3: if one of them has more than the min. number of keys then we can promote one
of its keys to the parent and take the parent key into our lacking leaf
4: if neither of them has more than the minimum number of keys, then the lacking leaf and one of
its neighbours can be combined with their shared parent key (the opposite of promoting a key) and
the new leaf will have the correct number of keys; if this step leaves the parent with too few keys
then we repeat the process up to the root itself, if required
Enough siblings
• 3, 7, 9, 23, 45, 1, 5, 14, 25, 24, 13, 11, 8, 19, 4, 31, 35, 56
Analysis of B-Trees
Level      Maximum number of keys
root       m - 1
level 1    m(m - 1)
level 2    m^2(m - 1)
. . .
level h    m^h(m - 1)
• When searching tables held on disc, the cost of each disc transfer is high but doesn't
depend much on the amount of data transferred, especially if consecutive items are
transferred
– If we use a B-tree of order 101, say, we can transfer each node in one disc read
operation
– A B-tree of order 101 and height 3 can hold 101^4 - 1 items (approximately 100
million) and any item can be accessed with 3 disc reads (assuming we hold the
root in memory)
• If we take m = 3, we get a 2-3 tree, in which non-leaf nodes have two or three children
(i.e., one or two keys)
– B-Trees are always balanced (since the leaves are all at the same level), so 2-3
trees make a good type of balanced tree
Comparing Trees
• Binary trees
– Can become unbalanced and lose their good time complexity (big O)
– AVL trees are strict binary trees that overcome the balance problem
– Heaps remain balanced but only prioritise (not order) the keys
• Multi-way trees
– B-Trees can be m-way; they can have any (odd) number of children
– One B-Tree, the 2-3 (or 3-way) B-Tree, approximates a permanently balanced
binary tree, exchanging the AVL tree’s balancing operations for insertion and
(more complex) deletion operations
• Assuming that a disk spins at 3600 RPM, one revolution occurs in 1/60 of a second, or
16.7ms
• Crudely speaking, one disk access takes about the same time as 200,000 instructions
• We know we can’t improve on the log n lower bound on search for a binary tree
• But, the solution is to use more branches and thus reduce the height of the tree!
SEPARATE CHAINING
Separate chaining is a technique that keeps a list of elements that hash to the same value.
This is called separate chaining because each hash table element is a separate chain(linked list).
A separate chaining hash table
Type declaration
#define MinTableSize (10)
struct ListNode
{
ElementType Element;
Position Next;
};
Find routine:
Insertion
Traverse down the list to check whether the element is already available.
If element is new, it is inserted either in front of the list or at end of the list.
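Both routines can be sketched together. This is an illustrative fixed-size table with int keys, hashing by Key mod TableSize, and front-of-list insertion; the real declarations above use a dynamically sized table.

```c
#include <stdlib.h>

typedef int ElementType;

struct ListNode {
    ElementType Element;
    struct ListNode *Next;
};

#define TableSize 10

static struct ListNode *TheLists[TableSize];   /* one chain per slot */

static int Hash(ElementType Key) { return Key % TableSize; }

/* Return the node holding Key, or NULL if it is not present. */
struct ListNode *Find(ElementType Key) {
    struct ListNode *P = TheLists[Hash(Key)];
    while (P != NULL && P->Element != Key)
        P = P->Next;
    return P;
}

/* Traverse the chain; if the key is new, insert it at the front. */
void Insert(ElementType Key) {
    if (Find(Key) == NULL) {
        struct ListNode *N = malloc(sizeof *N);
        N->Element = Key;
        N->Next = TheLists[Hash(Key)];
        TheLists[Hash(Key)] = N;
    }
}
```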
LINEAR PROBING
Example: Insert the keys {89, 18, 49, 58, 69} into a hash table.
The first collision occurs when 49 is inserted; it is put in the next available spot, spot 0. The
key 58 collides with 18, 89 and then 49 before an empty cell is found. The collision for
69 is handled in a similar manner.
Primary Clustering:
Any key that hashes into the cluster will require several attempts to resolve the collision,
and then it will add to the cluster.
Advantage:
It does not require pointers.
Disadvantage:
It forms clusters which degrades the performance of the Hash table for storing and
retrieving data.
Quadratic Probing
Quadratic probing is a collision resolution method that eliminates the primary clustering
problem of linear probing. The collision function is quadratic, the popular choice is (i) = i2.
When 49 collides with 89, the next position attempted is one cell away. This cell is
empty, so 49 is placed there. Next, 58 collides at position 8. The cell one away is tried, but
another collision occurs. A vacant cell is found at the next cell tried, which is 2^2 = 4 away. 58 is
thus placed in cell 2. The same thing happens for 69.
Disadvantage:
There is no guarantee of finding an empty cell once the table gets more than half full, or
even before the table gets half full if the table size is not prime.
If quadratic probing is used, and the table size is prime, then a new element can always
be inserted if the table is at least half empty.
type declarations
enum KindOfEntry { Legitimate, Empty, Deleted };
struct HashEntry
{
ElementType Element;
enum KindOfEntry Info;
};
typedef struct HashEntry Cell;
struct HashTbl
{
int TableSize;
Cell *TheCells;
};
Position Find( ElementType Key, HashTable H )
{
Position CurrentPos;
int CollisionNum;
CollisionNum = 0;
CurrentPos = Hash( Key, H->TableSize );
while( H->TheCells[CurrentPos].Info != Empty &&
H->TheCells[CurrentPos].Element != Key )
{
CurrentPos += 2 * ++CollisionNum - 1;
if( CurrentPos >= H->TableSize )
CurrentPos -= H->TableSize;
}
return CurrentPos;
}
Secondary Clustering
Although quadratic probing eliminates primary clustering, elements that hash to the
same position will probe the same alternate cells. This is known as secondary clustering.
3.Explain rehashing?
Rehashing
If the table gets too full, the running time for the operations will start taking too long and inserts
might fail for closed hashing with quadratic resolution. This can happen if there are too many
deletions intermixed with insertions. A solution, then, is to build another table that is about twice
as big and scan down the entire original hash table, computing the new hash value for each (non-
deleted) element and inserting it in the new table.
As an example, suppose the elements 13, 15, 24, and 6 are inserted into a closed hash
table of size 7. The hash function is h(x) = x mod 7.
If 23 is inserted into the table, the resulting table will be over 70 percent full. Because the
table is so full, a new table is created. The size of this table is 17, because this is the first prime
which is twice as large as the old table size. The new hash function is then h(x) = x mod 17.
The old table is scanned, and elements 6, 15, 23, 24, and 13 are inserted into the new table. The
result after rehashing is given below.
This entire operation is called rehashing. This is a very expensive operation.
OldCells = H->TheCells;
OldSize = H->TableSize;
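The two assignments above are the beginning of a rehashing routine. A fuller self-contained sketch is given below; it is illustrative only, using linear probing rather than the quadratic resolution of the notes, and using 0 to mark an empty cell (so keys are assumed to be nonzero).

```c
#include <stdlib.h>

/* A minimal probing table, just enough to show rehashing. */
typedef struct {
    int *TheCells;     /* 0 marks an empty cell */
    int TableSize;
} HashTbl;

static int IsPrime(int n) {
    int d;
    if (n < 2) return 0;
    for (d = 2; d * d <= n; d++)
        if (n % d == 0) return 0;
    return 1;
}
static int NextPrime(int n) { while (!IsPrime(n)) n++; return n; }

HashTbl *InitializeTable(int Size) {
    HashTbl *H = malloc(sizeof *H);
    H->TableSize = Size;
    H->TheCells = calloc(Size, sizeof(int));
    return H;
}

/* Insert with linear probing. */
void Insert(int Key, HashTbl *H) {
    int Pos = Key % H->TableSize;
    while (H->TheCells[Pos] != 0)
        Pos = (Pos + 1) % H->TableSize;
    H->TheCells[Pos] = Key;
}

/* Build a table of the first prime at least twice as large, and
   reinsert every occupied cell, as the text describes. */
HashTbl *Rehash(HashTbl *H) {
    int i, OldSize = H->TableSize;
    int *OldCells = H->TheCells;
    HashTbl *NewH = InitializeTable(NextPrime(2 * OldSize));
    for (i = 0; i < OldSize; i++)
        if (OldCells[i] != 0)
            Insert(OldCells[i], NewH);
    free(OldCells);
    free(H);
    return NewH;
}
```

For the example in the text, NextPrime(2 × 7) gives 17, the first prime twice as large as the old table size.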
Given an equivalence relation ~, the natural problem is to decide, for any a and b, if a ~ b.
Suppose the equivalence relation is defined over the five-element set {a1, a2, a3, a4, a5}. Then
there are 25 pairs of elements, each of which is either related or not. However, the information a1
~ a2, a3 ~ a4, a5 ~ a1, a4 ~ a2 implies that all pairs are related.
The equivalence class of an element a ∈ S is the subset of S that contains all the elements
that are related to a.
Every member of S appears in exactly one equivalence class. To decide if a ~ b, we need
only to check whether a and b are in the same equivalence class. This provides our strategy to
solve the equivalence problem.
The input is initially a collection of n sets, each with one element. This initial
representation is that all relations (except reflexive relations) are false. Each set contains a
different element, so that Si ∩ Sj = ∅ for i ≠ j; this makes the sets disjoint.
There are two permissible operations.
1. The first is find, which returns the name of the set (that is, the equivalence class)
containing a given element.
2. The second operation is Union which merges the two equivalence classes containing
a and b into a new equivalence class.
This algorithm is dynamic because, during the course of the algorithm, the sets can
change via the union operation.
An on-line algorithm must give an answer for each find before continuing; an off-line
algorithm may see the entire sequence of unions and finds before answering.
To perform a union of two sets, we merge the two trees by making the root of one tree point to
the root of the other.
Figure After union (5, 6)
The unions are performed by making the second tree a subtree of the first.
1) Union-By-Size
In this method the smaller tree is made a subtree of the larger one.
The three unions in the preceding example were all ties, and so we can consider that they were
performed by size.
Result of union-by-size
To implement this strategy, we need to keep track of the size of each tree. Since we are
really just using an array, we can have the array entry of each root contain the negative of the
size of its tree.
Thus, initially the array representation of the tree is all -1s. When a union is performed,
check the sizes; the new size is the sum of the old.
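The strategy can be sketched in C. This is an illustrative version with elements numbered 1 to 8; both arguments must already be roots, and a root's array entry holds the negative of its tree's size, so every entry starts at -1.

```c
#define NumSets 8

typedef int DisjSet[NumSets + 1];

/* Every element starts as a one-element tree of size 1 (stored as -1). */
void Initialize(DisjSet S) {
    int i;
    for (i = 1; i <= NumSets; i++)
        S[i] = -1;
}

/* Union by size: attach the smaller tree's root to the larger one. */
void SetUnion(DisjSet S, int Root1, int Root2) {
    if (S[Root2] < S[Root1]) {      /* tree 2 is larger (more negative) */
        S[Root2] += S[Root1];       /* new size is the sum of the old */
        S[Root1] = Root2;
    } else {
        S[Root1] += S[Root2];
        S[Root2] = Root1;
    }
}
```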
2) Union-By-Height
An alternative implementation, which also guarantees that all the trees will have depth at
most O(log n), is union-by-height.
We keep track of the height, instead of the size, of each tree and perform unions by
making the shallow tree a subtree of the deeper tree.
If there are many more find operations than unions, this running time is worse than that
of the quick-find algorithm. Moreover, no further improvement is possible for the
union algorithm.
Therefore, the only way to speed the algorithm up, without reworking the data structure
entirely, is to do something clever on the find operation.
The only change to the find routine is that S[x] is made equal to the value returned by
find; thus, after the root of the set is found recursively, x is made to point directly to it. This
occurs recursively to every node on the path to the root, so this implements path compression.
Code for disjoint set find with path compression
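A sketch of the routine the heading refers to is given below, for the size version of the array where a root's entry is negative (the structure used in the union-by-size discussion above).

```c
#define NumSets 8
typedef int DisjSet[NumSets + 1];

/* Find with path compression: after the root is located recursively,
   every node on the search path is made to point directly at it. */
int Find(int X, DisjSet S) {
    if (S[X] <= 0)                 /* negative entry: X is a root */
        return X;
    else
        return S[X] = Find(S[X], S);
}
```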
Extendible hashing:
Extendible hashing is used where the amount of data is too large to fit in main memory.
The main consideration then is the number of disk accesses required to retrieve data.
We assume that at any point we have n records to store, and at most m records fit in one
disk block so assume m = 4. Extendible hashing, allows a find to be performed in two disk
accesses. Insertions also require few disk accesses. Let us suppose, for the moment, that our data
consists of several six-bit integers.
The root of the "tree" contains four pointers determined by the leading two bits of the
data. Each leaf has up to m = 4 elements. D will represent the number of bits used by the root,
which is sometimes known as the directory. The number of entries in the directory is thus 2^D. dl
is the number of leading bits that all the elements of some leaf l have in common. dl will depend
on the particular leaf, and dl ≤ D.
Suppose that we want to insert the key 100100. This would go into the third leaf, but as
the third leaf is already full, there is no room. We thus split this leaf into two leaves, which are
now determined by the first three bits. This requires increasing the directory size to 3.
If the key 000000 is now inserted, then the first leaf is split, generating two leaves with dl
= 3. Since D = 3, the only change required in the directory is the updating of the 000 and 001
pointers.
This very simple strategy provides quick access times for insert and find operations on
large databases.
Double Hashing
For double hashing, one popular choice is F(i) = i · h2(x).
This formula says that we apply a second hash function to x and probe at a distance
h2(x), 2h2(x), . . ., and so on.
The second hash function such as h2(x) = R - (x mod R), with R a prime smaller than
TableSize, will work well. If we choose R = 7, then the results of inserting is shown below.
The first collision occurs when 49 is inserted. h2(49) = 7 - 0 = 7, so 49 is inserted in
position 6. h2(58) = 7 - 2 = 5, so 58 is inserted at location 3. Finally, 69 collides and is inserted at
a distance h2(69) = 7 - 6 = 1 away.
The table size should be prime when double hashing is used. If we attempt to insert 23 into
the table, it would collide with 58. Since h2(23) = 7 - 2 = 5, and the table size is 10, we
essentially have only one alternate location, and it is already taken. Thus, if the table size is not
prime, it is possible to run out of alternate locations prematurely.
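The probing described above can be sketched in C with the same numbers as the example (table size 10, R = 7). This is an illustrative sketch: 0 marks an empty cell, so keys are assumed to be nonzero.

```c
#define TableSize 10
#define R 7                     /* a prime smaller than the table size */

static int Hash(int X)  { return X % TableSize; }
static int Hash2(int X) { return R - (X % R); }

/* Insert with double hashing: on a collision, probe at distances
   h2(x), 2*h2(x), ... from the home position. */
void Insert(int Key, int Table[]) {
    int Pos = Hash(Key);
    while (Table[Pos] != 0)
        Pos = (Pos + Hash2(Key)) % TableSize;
    Table[Pos] = Key;
}
```

Inserting 89, 18, 49, 58, 69 in that order reproduces the placements worked out above: 89 at 9, 18 at 8, 49 at 6, 58 at 3 and 69 at 0.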
UNIT V
GRAPHS
Step 1:
Number of 1’s present in each column of adjacency matrix represents the Indegree of the
corresponding vertex.
In the figure: Indegree[a] = 0
Indegree[b] = 2
Indegree[c] =1
Indegree[d] = 2
Step 2:
Enqueue the vertex whose Indegree is 0.
The Indegree of vertex ‘a’ is 0, so place it on the queue.
Step 3:
Dequeue the vertex ‘a’ from the queue and decrement the Indegrees of its adjacent
vertices ‘b’ & ‘c’.
Hence, Indegree[b] = 1
Indegree[c] = 0
Now, enqueue the vertex ‘c’ as its Indegree becomes zero.
Step 4:
Dequeue the vertex ‘c’ from Q and decrement the Indegrees of its adjacent vertices ‘b’ and ‘d’.
Hence, Indegree[b] = 0
Indegree[d] =1
Now, enqueue the vertex ‘b’ as its Indegree falls to zero.
Step 5:
Dequeue the vertex ‘b’ from Q and decrement the Indegree of its adjacent vertex ‘d’.
Hence, Indegree[d]=0
Now, enqueue the vertex ‘d’ as its Indegree falls to zero.
Step 6:
Dequeue the vertex ‘d’.
Step 7:
As the queue becomes empty, the topological ordering is complete; the ordering is simply
the order in which the vertices were dequeued.
Result of applying Topological sort to the graph, in the above figure.
Vertex 1 2 3 4
a 0 0 0 0
b 2 1 0 0
c 1 0 0 0
d 2 2 1 0
Enqueue a c b d
Dequeue a c b d
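The whole procedure can be sketched in C for the four-vertex example, with a, b, c, d numbered 0 to 3. The adjacency matrix below is an assumption consistent with the indegrees computed above (edges a→b, a→c, c→b, c→d, b→d).

```c
#define NV 4   /* vertices a, b, c, d = 0, 1, 2, 3 */

/* Topological sort: repeatedly dequeue a vertex of indegree 0 and
   decrement the indegrees of its adjacent vertices.  Returns the
   number of vertices output; fewer than NV means there was a cycle. */
int TopSort(int adj[NV][NV], int order[NV]) {
    int indegree[NV] = {0}, queue[NV];
    int head = 0, tail = 0, count = 0, v, w;
    for (v = 0; v < NV; v++)          /* column sums of the matrix */
        for (w = 0; w < NV; w++)
            indegree[w] += adj[v][w];
    for (v = 0; v < NV; v++)          /* enqueue indegree-0 vertices */
        if (indegree[v] == 0)
            queue[tail++] = v;
    while (head < tail) {
        v = queue[head++];            /* dequeue */
        order[count++] = v;
        for (w = 0; w < NV; w++)
            if (adj[v][w] && --indegree[w] == 0)
                queue[tail++] = w;
    }
    return count;
}
```

On the example graph this outputs a, c, b, d, matching the dequeue row above.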
Initial Configuration:
1) The source vertex ‘a’ is initially assigned a path length of 0.
V Known dv Pv
a 0 0 0
b 0 ∞ 0
c 0 ∞ 0
d 0 ∞ 0
Queue a
‘a’ is dequeued
V Known dv Pv
a 1 0 0
b 0 1 a
c 0 1 a
d 0 ∞ 0
Queue b,c
3) After finding all vertices whose path length from ‘a’ is 2.
‘b’ is dequeued
V Known dv Pv
a 1 0 0
b 1 1 a
c 0 1 a
d 0 2 b
Queue c,d
‘c’ is dequeued
V Known dv Pv
a 1 0 0
b 1 1 a
c 1 1 a
d 0 2 b
Queue d
‘d’ is dequeued
V Known dv Pv
a 1 0 0
b 1 1 a
c 1 1 a
d 1 2 b
Queue empty
Initial Configuration:
V Known dv Pv
a 0 0 0
b 0 ∞ 0
c 0 ∞ 0
d 0 ∞ 0
Vertex a is chosen as the source and is declared a known vertex. Then the adjacent vertices
of a are found and their distances are updated as follows:
T [ b ].Dist = Min[ T [ b ].Dist, T[ a ].Dist + Ca,b ]
= Min[ ∞, 0+2 ]
= 2
T[ d ].Dist = Min[ T[ d ].Dist,T[ a ].Dist + Ca,d ]
= Min[ ∞, 0+1 ]
= 1
After ‘a’ is declared known
V Known dv Pv
a 1 0 0
b 0 2 a
c 0 ∞ 0
d 0 1 a
Now select the vertex with the minimum distance that is not yet known, and mark that vertex
as visited. Here ‘d’ is the next minimum-distance vertex. The vertex adjacent to ‘d’ is ‘c’;
therefore, the distance of c is updated as follows:
T[ c ].Dist = Min[ T[ c ].Dist, T[ d ].Dist + Cd,c ]
= Min[ ∞, 1 + 1 ]
= 2
After ‘d’ is declared known
V Known dv Pv
a 1 0 0
b 0 2 a
c 0 2 d
d 1 1 a
The next minimum vertex is b; mark it as visited. Since the adjacent vertex d is
already visited, select the next minimum vertex ‘c’ and mark it as visited.
After ‘b’ is declared known
V Known dv Pv
a 1 0 0
b 1 2 a
c 0 2 d
d 1 1 a
V Known dv Pv
a 1 0 0
b 1 2 a
c 1 2 d
d 1 1 a
[Figures: candidate spanning trees of the example graph, with costs 8, 9, 8 and 5; the
tree of cost 5 is the minimum spanning tree.]
EXAMPLE 2:
Minimum Spanning Tree( MST )
[Figure: the example graph and its minimum spanning tree, Cost = 16.]
Applications of MST in real world are:
1. Wiring a house with a minimum of cable
2. Cheapest cost tour of traveling salesman
3. Networking the PC’s with low cost
V Known dv Pv
V1 0 0 0
V2 0 ∞ 0
V3 0 ∞ 0
V4 0 ∞ 0
V5 0 ∞ 0
V6 0 ∞ 0
V7 0 ∞ 0
Consider V1 as the source vertex and proceed from there. Vertex V1 is marked as visited
and then the distances of its adjacent vertices are updated as follows.
T[ V2 ].Dist = Min[ T[ V2 ].Dist, Cv1, v2 ]
= Min[ ∞, 2 ]
=2
T[ V3 ].dist = Min[ T[ V3 ].Dist, Cv1, v3 ]
= Min[ ∞, 4 ]
= 4
The table after V1 is declared known
V     Known   dv   Pv
V1    1       0    0
V2    0       2    V1
V3    0       4    V1
V4    0       1    V1
V5    0       ∞    0
V6    0       ∞    0
V7    0       ∞    0
Vertex V4 is marked as visited and then the distances of its adjacent vertices are updated.
V     Known   dv   Pv
V1    1       0    0
V2    1       2    V1
V3    0       2    V4
V4    1       1    V1
V5    0       7    V4
V6    0       8    V4
V7    0       4    V4
Next, vertex V3 is declared known and the distance of its adjacent vertex V6 is updated:
T[ V6 ].Dist = Min[ T[ V6 ].Dist, Cv3,v6 ]
             = Min[ 8, 5 ]
             = 5
The table after V3 is declared known
V Known dv Pv
V1 1 0 0
V2 1 2 V1
V3 1 2 V4
V4 1 1 V1
V5 0 7 V4
V6 0 5 V3
V7 0 4 V4
Next, V7 is declared known and the distances of its adjacent vertices V5 and V6 are
updated:
T[ V6 ].Dist = Min[ T [ V6 ].Dist, Cv7, v6 ]
= Min[ 5, 1 ]
=1
T[ V5 ].Dist = Min[T [ V5 ].Dist, Cv7, v5 ]
= Min[ 7, 6 ] = 6
V Known dv Pv
V1 1 0 0
V2 1 2 V1
V3 1 2 V4
V4 1 1 V1
V5 0 6 V7
V6 0 1 V7
V7 1 4 V4
The table after V6 is declared known
V Known dv Pv
V1 1 0 0
V2 1 2 V1
V3 1 2 V4
V4 1 1 V1
V5 0 6 V7
V6 1 1 V7
V7 1 4 V4
The final table after V5 is declared known
V Known dv Pv
V1 1 0 0
V2 1 2 V1
V3 1 2 V4
V4 1 1 V1
V5 1 6 V7
V6 1 1 V7
V7 1 4 V4
Depth-first search generalizes preorder tree traversal to graphs. The starting vertex may
be determined by the problem or chosen arbitrarily. The analogy with tree traversal is
easiest to see on directed graphs, where explored edges behave like tree edges (one-way
direction).
The two important key points of depth-first search are:
1. If an edge leads from the current node to an undiscovered node, walk across that edge -
exploring the edge.
2. If no such edge exists from the current node, return to the node visited just before it -
backtracking.
The theme of depth-first search is: explore if possible, otherwise backtrack.
Example :
Given directed graph G = ( V, E), where V={ A, B, C, D, E, F, G }.
For simplicity, assume the start vertex is A and exploration is done in alphabetical order. From
start vertex A explores to B, now AB is explored edge.
Algorithm: Depth First Search Or Traversal (DFS)
dfs( G, v )
    mark v as "discovered"
    for each vertex w such that edge vw is in G:
        if w is undiscovered:
            dfs( G, w )
            /* explore vw, visit w, explore from there as much as
               possible, and backtrack from w to v */
        otherwise:
            "check" vw without visiting w
    mark v as "finished"
The routine dfsSweep drives dfs so that every vertex is visited even when the graph is
not connected:
dfssweep( G )
    initialise all vertices of G to "undiscovered"
    for each vertex v ∈ G, in some order:
        if v is undiscovered:
            dfs( G, v )
Breadth-first search performs simultaneous explorations: starting from a common point, it
spreads outward level by level. Assume the start vertex is A. First explore all edges out
of A: the explored edges are AB and AF. Next explore all edges out of B and F. From B the
explored edges are BC and BD. From F, the edges FA and FC lead to vertices that were
previously discovered (shown as dashed lines), so they are checked rather than explored.
Likewise, from D the edge DA reaches the already-discovered vertex A, so DA is checked
rather than explored.
Algorithm: Breadth First Search Or Traversal (BFS)
bfs( G, v )
    Queue Q = create( n )
    mark v as "discovered"
    enqueue( Q, v )
    while Q is non-empty:
        v = front( Q )
        dequeue( Q )
        for each vertex w adjacent to v:
            if w is undiscovered:
                mark w as "discovered"
                parent[ w ] = v
                enqueue( Q, w )
7. Explain biconnectivity.
For each vertex v, Low(v) is the lowest Num reachable from v by taking zero or more tree
edges and then at most one back edge. Low can be computed by performing a postorder
traversal of the depth-first spanning tree, i.e.
Low(F)=Min(Num(F),Num(D))
/*F has no tree edge below it and only one back edge, F-D*/
=Min(6,4)
=4
Low(E)=Min(Num(E),Low(F))
/*there is no back edge*/
=Min(5,4)
=4
Low(D)=Min(Num(D),Low(E),Num(A))
=Min(4,4,1)
=1
Low(G)=Min(Num(G))
=Min(7)
=7
Low(C)=Min(Num(C),Low(D),Low(G))
=Min(3,1,7)
=1
Low(B)=Min(Num(B),Low(C))
=Min(2,1)
=1
Low(A)=Min(Num(A),Low(B))
=Min(1,1)
=1
From the figure it is clear that Low(G) >= Num(C), i.e. 7 >= 3. A non-root vertex V is an
articulation point if it has a child W with Low(W) >= Num(V); therefore ‘C’ is an
articulation point.
Similarly Low(E) = Num(D), i.e. 4 >= 4; hence ‘D’ is also an articulation point.