
IT T33- Data Structures

UNIT V
Hashing: Introduction – Hash function – methods - Hash table implementation - rehashing.
Graph: Directed and undirected graph – representation of graphs – graph traversals: Depth first
search – Breadth first search – transitive closure – spanning trees – application - topological
sorting.
Objective:

 This course is aimed to cover a variety of different problems in Graph Theory.


 To learn concept and design of graph, traversals and its different applications

Outcomes:
 The student will have a strong background of graph theory which has diverse applications
in the areas of computer science, biology, chemistry, physics, sociology, and engineering.

5.1 Hashing:
Hashing is used for storing relatively large amounts of data in a table called a hash table.
Hashing is a technique used to perform insertions, deletions, and finds in constant average
time.

Hash table:
 Hash Table is a data structure in which keys are mapped to array positions by a hash
function.
 The hash table usually has a fixed size M, which is larger than the amount of data that we want
to store.

Hash function:
A hash function is a mathematical formula that produces an integer, which can be used as an index for
the key in the hash table.

 Perfect Hash Function- Each key is transformed into a unique storage location
 Imperfect hash Function- Maps more than one key to the same storage location

Methods of hash function:


 Division Method
 Multiplication Method
 Mid Square Method
 Folding Method

5.1.1 Division-method:

In this method we use modular arithmetic: the key value is divided by some
integer divisor m, and the remainder gives the location where the element can be placed.

L = (k mod m)
L -> location in table
k -> key value
m -> table size

SMVEC- Department of Information Technology 1


(eg) k=23, m=10 then

L=(23 mod 10)=3

The key whose value is 23 is placed in the 3rd location.
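
The division method can be sketched in a few lines of Python. The function name division_hash is only an illustrative choice, not something defined in these notes.

```python
def division_hash(key: int, table_size: int) -> int:
    """Division method: the location is the remainder of key / table_size."""
    return key % table_size

# The worked example above: k = 23, m = 10
print(division_hash(23, 10))  # 3
```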

5.1.2 Mid square method:


In this method we square the value of the key and take the number of digits required to form an
address from the middle of the squared value.

(eg) key = 16

The square is 256. The address is then 56 (two digits starting from the middle of 256).

5.1.3 Folding method:

In this method the key is partitioned into a number of parts, each part having the same length
as the required address. Add the values of the parts, ignoring the final carry, to get the
required address.

(eg) key is: 12345678

Partitioned:12,34,56,78

Add = 12 +34 + 56+ 78 = 180 (ignore 1 in 180) so the address is 80.
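
A minimal sketch of the folding method, assuming the key is a non-negative integer and the parts are cut from left to right. The name folding_hash and the parameter part_len are hypothetical.

```python
def folding_hash(key: int, part_len: int = 2) -> int:
    """Split the key into parts of part_len digits, add them, drop the final carry."""
    digits = str(key)
    parts = [int(digits[i:i + part_len]) for i in range(0, len(digits), part_len)]
    # Keeping only part_len digits of the sum is the same as ignoring the carry.
    return sum(parts) % (10 ** part_len)

# The worked example above: 12 + 34 + 56 + 78 = 180, carry ignored
print(folding_hash(12345678))  # 80
```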

5.1.4 Digit analysis:

This hashing function is distribution-dependent. Here we make a statistical analysis of the
digits of the key, and select those digits which occur quite frequently. Then reverse or shift the
digits to get the address.

(eg) key = 9861234. Suppose the statistical analysis shows that 6 and 2 occur frequently.

Reversing 6 and 2 gives the address 26.

5.1.5 Multiplication Method

In the multiplication method we compute the hash value in 3 steps:
1. Fix a constant A from (0,1)
2. Multiply the key k by A and take the fractional part
3. Multiply the fractional part by m, and take the floor of the result
In summary: h(k) = ⌊m · {kA}⌋, where {x} denotes the fractional part of x and m is the table size.
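
The three steps can be sketched as follows. The default constant A = (sqrt(5) - 1) / 2 is an assumption made here for illustration (it is a commonly suggested choice); the notes only require A to lie in (0,1).

```python
import math

def multiplication_hash(key: int, m: int, A: float = (math.sqrt(5) - 1) / 2) -> int:
    """Multiplication method: floor(m * fractional part of key * A)."""
    fractional = (key * A) % 1.0      # step 2: fractional part of k * A
    return math.floor(m * fractional) # step 3: scale by table size and take the floor

# Every key lands in a valid slot 0 .. m-1
print([multiplication_hash(k, 10) for k in (23, 16, 12345678)])
```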

5.2 Collision Resolution Strategies

If the element to be inserted is mapped to the same location where an element has already
been inserted, then we have a collision and it must be resolved.
 Separate chaining – used with open hashing
 Open addressing – used with closed hashing

5.2.1 Separate chaining:
 Chaining is a collision handling method that adds an additional field, the chain (a link), to
each data item.
 A separate chain is maintained for colliding data. When a collision occurs, a linked
list (chain) is maintained at the home bucket.

For example: Consider the keys to be placed in their home buckets are
131, 3, 4, 21, 61, 24, 7, 97, 8, 9
Then we will apply a hash function as
H(key) = key mod D

where D is the size of the table. Here D = 10. The hash table will be:

Separate chaining hash table.
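
The chained table for the keys above can be reproduced with a short sketch. Python lists stand in for the linked lists here, which is a simplification; build_chained_table is a hypothetical name.

```python
def build_chained_table(keys, size):
    """Separate chaining: each bucket holds the chain of keys that hash to it."""
    table = [[] for _ in range(size)]
    for key in keys:
        table[key % size].append(key)  # append to the chain at the home bucket
    return table

keys = [131, 3, 4, 21, 61, 24, 7, 97, 8, 9]
table = build_chained_table(keys, 10)
for index, chain in enumerate(table):
    print(index, chain)
```

Buckets 1, 4 and 7 end up with chains of more than one key; buckets 0, 2, 5 and 6 stay empty.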

5.2.2 Open Addressing:


In open addressing, if a collision occurs, alternate cells are tried until an empty cell is found.
Because all the data elements are stored inside the table itself, a larger table is needed for
open addressing.
There are three commonly used collision resolution strategies:

 Linear probing
 Quadratic probing
 Double hashing
5.2.2.1 Linear Probing

Example: Consider a hash table with size = 10. Using linear probing insert the keys 72, 27, 36, 24,
63, 81 and 92 into the table.

Let h'(k) = k mod m, m = 10. The i-th probe (i = 0, 1, 2, ...) examines the location
h(k, i) = (h'(k) + i) mod m.

Initially every location of the hash table is vacant.



Step1: Key = 72
h(72, 0) = (72 mod 10 + 0) mod 10
= (2) mod 10
=2
Since, T[2] is vacant, insert key 72 at this location

Step2: Key = 27
h(27, 0) = (27 mod 10 + 0) mod 10
= (7) mod 10
=7
Since, T[7] is vacant, insert key 27 at this location

Step3: Key = 36
h(36, 0) = (36 mod 10 + 0) mod 10
= (6) mod 10
=6
Since, T[6] is vacant, insert key 36 at this location

Step4: Key = 24
h(24, 0) = (24 mod 10 + 0) mod 10
= (4) mod 10
=4
Since, T[4] is vacant, insert key 24 at this location

Step5: Key = 63
h(63, 0) = (63 mod 10 + 0) mod 10
= (3) mod 10
=3
Since, T[3] is vacant, insert key 63 at this location

Step6: Key = 81
h(81, 0) = (81 mod 10 + 0) mod 10= (1) mod 10 = 1
Since, T[1] is vacant, insert key 81 at this location

Step7: Key = 92
h(92, 0) = (92 mod 10 + 0) mod 10= (2) mod 10 = 2

Now, T[2] is occupied, so we cannot store the key 92 in T[2]. Therefore, try again for next location. Thus
probe, i = 1, this time.

Key = 92

h(92, 1) = (92 mod 10 + 1) mod 10= (2 + 1) mod 10 = 3

Now, T[3] is occupied, so we cannot store the key 92 in T[3]. Therefore, try again for next location. Thus
probe, i = 2, this time.

Key = 92

h(92, 2) = (92 mod 10 + 2) mod 10 = (2 + 2) mod 10 = 4

Now, T[4] is occupied, so we cannot store the key 92 in T[4]. Therefore, try again for next location. Thus
probe, i = 3, this time.

Key = 92

h(92, 3) = (92 mod 10 + 3) mod 10 = (2 + 3) mod 10= 5

Since, T[5] is vacant, insert key 92 at this location
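
The seven insertion steps above can be replayed with a small sketch. The function name linear_probe_insert and the use of None for a vacant cell are illustrative assumptions.

```python
def linear_probe_insert(table, key):
    """Try (key mod m + i) mod m for i = 0, 1, 2, ... until a vacant cell is found."""
    m = len(table)
    for i in range(m):
        slot = (key % m + i) % m
        if table[slot] is None:
            table[slot] = key
            return slot
    raise OverflowError("hash table is full")

table = [None] * 10
for key in [72, 27, 36, 24, 63, 81, 92]:
    linear_probe_insert(table, key)
print(table)  # [None, 81, 72, 63, 24, 92, 36, 27, None, None]
```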

5.2.2.2 Quadratic probing

• It eliminates the primary clustering problem of linear probing.

f(i) = i^2
• If quadratic probing is used and the table size is prime, then a new element can always be
inserted if the table is at least half empty.
• If the table is even one more than half full, the insertion could fail, even when the table size is prime.
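
A minimal sketch of insertion with f(i) = i^2. The keys and the table size of 10 are chosen only for illustration (10 is not prime, so the guarantee stated above does not apply to this table); quadratic_probe_insert is a hypothetical name.

```python
def quadratic_probe_insert(table, key):
    """Try (key mod m + i*i) mod m for i = 0, 1, 2, ... until a vacant cell is found."""
    m = len(table)
    for i in range(m):
        slot = (key % m + i * i) % m
        if table[slot] is None:
            table[slot] = key
            return slot
    raise OverflowError("no vacant cell found on the probe sequence")

table = [None] * 10
for key in [89, 18, 49, 58, 69]:
    quadratic_probe_insert(table, key)
print(table)  # [49, None, 58, 69, None, None, None, None, 18, 89]
```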

5.2.2.3 Double Hashing:

Double hashing uses the collision resolution function

F(i) = i * hash2(x)
where hash2(x) = R - (x mod R), and R is a prime smaller than the table size.

Example: Let us consider the following key values: 89, 18, 49, 58, 69
For the first probe apply the normal hash function, i.e. key mod tablesize (tablesize = 10)
hash(89) = 89 % 10 = 9
hash(18) = 18 % 10 = 8
The key values 89 and 18 are stored at the corresponding locations. When inserting the 3rd
element, 49,
hash(49) = 49 % 10 = 9, a collision occurs,
so we apply the double hashing technique. Choose R, a prime number smaller than the table size; here we
choose R = 7. Then
hash2(49) = 7 - (49 % 7) = 7 - 0 = 7, so 49 is placed at (9 + 7) % 10 = 6
hash2(58) = 7 - (58 % 7) = 7 - 2 = 5, so 58 (which collides at 8) is placed at (8 + 5) % 10 = 3
hash2(69) = 7 - (69 % 7) = 7 - 6 = 1, so 69 (which collides at 9) is placed at (9 + 1) % 10 = 0
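
The example can be checked with a short sketch of double hashing. The function name double_hash_insert and the use of None for a vacant cell are assumptions for illustration; R = 7 and the table size 10 come from the example.

```python
def double_hash_insert(table, key, R=7):
    """Probe (key mod m + i * hash2(key)) mod m, with hash2(key) = R - (key mod R)."""
    m = len(table)
    step = R - (key % R)  # second hash function; never zero
    for i in range(m):
        slot = (key % m + i * step) % m
        if table[slot] is None:
            table[slot] = key
            return slot
    raise OverflowError("no vacant cell found on the probe sequence")

table = [None] * 10
for key in [89, 18, 49, 58, 69]:
    double_hash_insert(table, key)
print(table)  # [69, None, None, 58, None, None, 49, None, 18, 89]
```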

5.3 Rehashing: Rehashing is a technique in which the table is resized, i.e., the size of the table is
roughly doubled by creating a new table. It is preferable for the new table size to be a prime number.
Rehashing is required in the following situations:

 When the table is completely full.
 With quadratic probing, when the table is half full.
 When insertions fail due to overflow.
In such situations, we have to transfer the entries from the old table to the new table by re-computing their
positions using a suitable hash function.

Consider that we have to insert the elements 37, 90, 55, 22, 17, 49 and 87. The table size is 10 and we will
use the hash function
H(key) = key mod tablesize

 After these insertions the table is almost full, and if we try to insert more elements collisions will occur and
eventually further insertions will fail. Hence we rehash by doubling the table size.

 The old table size is 10, so doubling it gives 20 for the new table. But
20 is not a prime number, so we prefer to make the table size 23. The new hash function
will be

H(key) = key mod 23

Rehashing Example
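
A sketch of the rehashing step for these keys. The notes do not say which collision strategy the old table uses, so linear probing is assumed here; insert and rehash are hypothetical names.

```python
def insert(table, key):
    """Insert with linear probing (an assumption; the notes do not fix the strategy)."""
    m = len(table)
    for i in range(m):
        slot = (key % m + i) % m
        if table[slot] is None:
            table[slot] = key
            return slot
    raise OverflowError("hash table is full")

def rehash(old_table, new_size):
    """Scan the old table and re-insert every key using the new table size."""
    new_table = [None] * new_size
    for key in old_table:
        if key is not None:
            insert(new_table, key)
    return new_table

old_table = [None] * 10
for key in [37, 90, 55, 22, 17, 49, 87]:
    insert(old_table, key)
new_table = rehash(old_table, 23)
print({key: new_table.index(key) for key in old_table if key is not None})
```

With table size 23 the seven keys map to seven different locations (for instance 37 mod 23 = 14 and 87 mod 23 = 18), so no probing is needed in the new table.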


5.3 Graph
• A data structure that consists of a set of nodes (vertices) and a set of edges that relate
the nodes to each other
• The set of edges describes relationships among the vertices
• Trees are special cases of graphs.
Definition of graph:

A graph G is defined as follows: G = (V, E)

V(G): a finite, nonempty set of vertices
E(G): a set of edges, i.e. pairs of vertices (v, w) where v and w belong to V
The edges are sometimes referred to as arcs.
The vertices of the graph G are numbered 1, 2, 3, 4.
Therefore,
V(G) = {1, 2, 3, 4}
E(G) = {(1,2),(1,3),(1,4),(2,3),(2,4)}

Figure: Graph G

5.3.1 Types of graph

Undirected graph- When the edges in a graph have no direction, the graph is called undirected

Directed graph- When the edges in a graph have a direction, the graph is called directed (or
digraph)

Undirected graph Directed graph

Adjacent nodes: two nodes are adjacent if they are connected by an edge

Path: a sequence of vertices that connect two nodes in a graph

Length of path of graph: The length of a path in a graph is the number of edges in the path

In-degree and Out-degree in graph:
Let G be a directed graph
– The in-degree of a node x in G is the number of edges coming to x
– The out-degree of x is the number of edges leaving x.

Degree and Neighbor:


Let G be an undirected graph
– The degree of a node x is the number of edges that have x as one of their end nodes
– The neighbors of x are the nodes adjacent to x

Sub Graph: A sub-graph of G is a graph G' such that V(G') ⊆ V(G) and E(G') ⊆ E(G). Some of
the sub-graphs are as follows:

Complete graph: a graph in which every vertex is directly connected to every other vertex. A
complete graph can be directed or undirected.

Weighted graph: a graph in which each edge carries a cost for traveling between the nodes.

Cyclic and acyclic graph:

• A cycle is a path that begins and ends at the same node.


• An acyclic graph is one that has no cycles.
• An acyclic, connected graph is also called an un-rooted tree

• Directed Acyclic Graph: A directed graph is acyclic if it has no cycles.

Graph Connectivity: An undirected graph is said to be connected if there is a path between every
pair of nodes. Otherwise, the graph is disconnected

Strongly connected and weakly connected graph:

 If a directed graph has a path from every vertex to every other vertex, then it is
called strongly connected.
 If a directed graph is not strongly connected, but the underlying graph (without direction on the
arcs) is connected, then the graph is said to be weakly connected.

Forest in graph: A forest is an acyclic undirected graph (not necessarily connected), i.e., each
connected component is a tree.

5.4 Representation of graphs


There are two representations of graphs:
 Adjacency matrix representation
 Adjacency lists representation
Representing the graph as adjacency matrix
In this representation, each graph of n nodes is represented by an n x n matrix A, that is, a two-
dimensional array A. The nodes are labeled 1,2,…,n.
• A[i][j] = 1 if (i,j) is an edge
• A[i][j] = 0 if (i,j) is not an edge
Data Structures for Graphs as Adjacency Matrix
• A two-dimensional matrix or array that has one row and one column for each node in the
graph
• For each edge of the graph (Vi, Vj), the location of the matrix at row i and column j is 1

• All other locations are 0
• For an undirected graph, the matrix will be symmetric along the diagonal
• For a weighted graph, the adjacency matrix would have the weight for edges in the graph,
zeros along the diagonal, and infinity (∞) every place else

Example for adjacency matrix – Undirected Graph:

Example for adjacency matrix – Directed Graph:

Advantage:
– Simple to implement
– Easy and fast to tell if a pair (i,j) is an edge: simply check if A[i][j] is 1 or 0
Disadvantage:
Even if there are few edges, the matrix takes O(n^2) memory
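
A minimal sketch of the adjacency matrix representation, using the four-vertex graph G defined earlier in this unit. The function name adjacency_matrix is hypothetical; row and column 0 are left unused so that nodes can be labeled 1..n as in the text.

```python
def adjacency_matrix(n, edges, directed=False):
    """Build an (n+1) x (n+1) matrix; A[i][j] = 1 if (i, j) is an edge, else 0."""
    A = [[0] * (n + 1) for _ in range(n + 1)]  # index 0 unused
    for i, j in edges:
        A[i][j] = 1
        if not directed:
            A[j][i] = 1  # an undirected edge makes the matrix symmetric
    return A

edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4)]
A = adjacency_matrix(4, edges)
print(A[1][2], A[2][1], A[3][4])  # 1 1 0
```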

Representing the graph as adjacency list:


• A graph of n nodes is represented by a one-dimensional array L of linked lists, where
– A list of pointers, one for each node of the graph
– L[i] is the linked list containing all the nodes adjacent from node i.
– The nodes in the list L[i] are in no particular order

Data Structures for Graphs as Adjacency List:


• A list of pointers, one for each node of the graph
• These pointers are the start of a linked list of nodes that can be reached by one edge of the
graph
• For a weighted graph, this list would also include the weight for each edge
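
The same four-vertex graph G can be sketched as an adjacency list. A Python dictionary of lists stands in for the array of linked lists, which is a simplification; adjacency_list is a hypothetical name.

```python
def adjacency_list(n, edges, directed=False):
    """L[i] lists the nodes adjacent from node i (nodes are labeled 1..n)."""
    L = {v: [] for v in range(1, n + 1)}
    for i, j in edges:
        L[i].append(j)
        if not directed:
            L[j].append(i)
    return L

edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4)]
L = adjacency_list(4, edges)
for node, neighbors in L.items():
    print(node, "->", neighbors)
```

For an undirected graph with e edges the lists hold 2e entries in total, since each edge appears once in the list of each endpoint.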

Example for adjacency list – Undirected Graph:

Example for adjacency list – Directed Graph:

5.4 Graph traversals


 Travel to every node in the graph
 During Traversals we will visit each node exactly once
 This can be used if we want to search for information held in the nodes or if we want to
distribute information to each node
Two types of Traversal:
• Depth first Search(or) Traversal (DFS)
• Breadth first search(or) Traversal (BFS)

5.4.1 Depth First Search


Depth-first search (DFS) is an algorithm for traversing or searching tree or graph data
structures. One starts at the root (selecting any node as the root in the case of a graph) and explores as far as
possible along each branch before backtracking.

Backtracking is a general technique for finding all (or some) solutions to a
computational problem by incrementally building candidate solutions and abandoning a candidate
as soon as it cannot be extended further. DFS uses a stack data structure.


Steps involved in Depth-First Traversal:


Step 1: Select an unvisited node x, visit it, and treat it as the current node.
Step 2: Find an unvisited neighbor of the current node, visit it, and make it the new current node.
Step 3: If the current node has no unvisited neighbors, backtrack to its parent, and make that
parent the new current node.
Step 4: Repeat steps 2 and 3 until no more nodes can be visited.
Step 5: If there are still unvisited nodes, repeat from step 1.

Note: DFS can be implemented efficiently using a stack


Algorithm
// Given an undirected graph G = (V, E) with n vertices and an array VISITED(n) initially set to zero,
// this algorithm visits all vertices reachable from v.
// G and VISITED are global.

procedure DFS(G, v):
    VISITED(v) ← 1               // label v as discovered
    for all edges from v to w in G.adjacentEdges(v) do
        if VISITED(w) = 0 then   // w is not yet labeled as discovered
            recursively call DFS(G, w)
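
The recursive procedure can be sketched in Python as follows. The graph is given as a dictionary of adjacency lists; the five-node example graph and the name dfs are illustrative assumptions, not taken from the notes.

```python
def dfs(graph, v, visited=None, order=None):
    """Recursive depth-first search; returns the nodes in the order they are visited."""
    if visited is None:
        visited, order = set(), []
    visited.add(v)           # label v as discovered
    order.append(v)
    for w in graph[v]:       # every node adjacent to v
        if w not in visited:
            dfs(graph, w, visited, order)
    return order

# Hypothetical undirected graph, stored as adjacency lists
graph = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3, 5], 5: [4]}
print(dfs(graph, 1))  # [1, 2, 4, 3, 5]
```

The recursion uses the call stack implicitly, which is why DFS is described as a stack-based traversal.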

Computing time
 If G is represented by adjacency lists, then the vertices w adjacent to v can be
determined by following a chain of links. Since the algorithm DFS examines each
node in the adjacency lists at most once and there are 2e list nodes, the time to complete
the search is O(e).
 If G is represented by its adjacency matrix, then the time to determine all vertices adjacent to
v is O(n). Since at most n vertices are visited, the total time is O(n^2).

Output of a depth-first search: The depth first search of a graph outputs a spanning tree of the
vertices reached during the search.

Example for DFS- Directed Graph:


5.4.2 Breadth First Search


Breadth-first search (BFS) is a strategy for searching in a graph when search is limited to
essentially two operations: (a) visit and inspect a node of a graph; (b) gain access to visit the
nodes that are neighbor to the currently visited node.

Note: The BFS algorithm uses a queue data structure to store intermediate results as it
traverses the graph

Procedure BFS(v)
// A breadth first search of G is carried out beginning at vertex v. All vertices visited are marked as
// VISITED(i) = 1. The graph G and array VISITED are global and VISITED is initialised to 0.
    VISITED(v) ← 1
    Initialise Q to be empty            // Q is a queue
    loop
        for all vertices w adjacent to v do
            if VISITED(w) = 0
                then [call ADDQ(w, Q); VISITED(w) ← 1]   // add w to queue and mark w as VISITED
        end
        if Q is empty then return
        call DELETEQ(v, Q)
    forever
end BFS
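
A Python sketch of the same procedure, using collections.deque as the queue. The five-node example graph and the name bfs are illustrative assumptions.

```python
from collections import deque

def bfs(graph, start):
    """Breadth-first search; returns the nodes in the order they leave the queue."""
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        v = queue.popleft()          # DELETEQ
        order.append(v)
        for w in graph[v]:
            if w not in visited:     # VISITED(w) = 0
                visited.add(w)       # mark w as visited
                queue.append(w)      # ADDQ
    return order

# Hypothetical undirected graph, stored as adjacency lists
graph = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3, 5], 5: [4]}
print(bfs(graph, 1))  # [1, 2, 3, 4, 5]
```

Nodes are visited level by level: both neighbors of node 1 come out of the queue before node 4.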

Computing Time
 Each vertex visited gets into the queue exactly once, so the loop forever is iterated at most n
times. If an adjacency matrix is used, then the for loop takes O(n) time for each vertex visited.
The total time is, therefore, O(n^2).
 If adjacency lists are used, the for loop has a total cost of d1 + ... + dn = O(e), where di =
degree(vi). Again, all vertices visited, together with all edges incident to them, form a connected
component of G.


5.5 Transitive Closure- Checking connectivity of the graph by Warshall’s algorithm

Warshall's algorithm is an efficient method for computing the transitive closure of a relation.
Warshall's algorithm takes as input the matrix MR representing the relation R and outputs the matrix
MR* of the relation R*, the transitive closure of R.

Warshall's algorithm determines whether there is a path between any two nodes in the graph.
It does not give the number of paths between two nodes.

Example of transitive closure:

Algorithm Warshall(A[1..n, 1..n])
{
    R(0) = A
    for k = 1 to n
    {
        for i = 1 to n
        {
            for j = 1 to n
            {
                R(k)[i,j] = R(k-1)[i,j] or (R(k-1)[i,k] and R(k-1)[k,j])
            }
        }
    }
    return R(n)
}
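
A Python sketch of Warshall's algorithm. It updates a single matrix in place instead of keeping R(0), ..., R(n) separately, which is a common simplification; the intermediate vertex k must be the outermost loop. The three-node chain used as input is a hypothetical example, and indices start at 0.

```python
def warshall(adj):
    """Return the transitive closure of a 0/1 adjacency matrix."""
    n = len(adj)
    reach = [row[:] for row in adj]   # copy so the input matrix is not modified
    for k in range(n):                # allowed intermediate vertex
        for i in range(n):
            for j in range(n):
                # path i -> j exists already, or via k: i -> k and k -> j
                reach[i][j] = reach[i][j] or (reach[i][k] and reach[k][j])
    return reach

# Hypothetical chain 0 -> 1 -> 2
adj = [[0, 1, 0],
       [0, 0, 1],
       [0, 0, 0]]
print(warshall(adj))  # [[0, 1, 1], [0, 0, 1], [0, 0, 0]]
```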

Example:

Time Efficiency: O(n^3)

5.6 Spanning Tree:

A Minimum Spanning Tree (MST) is a sub-graph of an undirected graph such that the sub-
graph spans (includes) all nodes, is connected, is acyclic, and has minimum total edge weight.

Characteristics of Minimum Spanning Tree (MST)
• A minimum spanning tree connects all nodes in a given graph
• A MST must be a connected and undirected graph
• A MST can have weighted edges
• Multiple MSTs can exist within a given undirected graph

Note: The minimum spanning tree may not be unique. However, if the weights of all the edges are
pairwise distinct, it is indeed unique.

There are two popular techniques for constructing a minimum cost spanning tree.

 Prim’s algorithm
 Kruskal’s algorithm

5.6.1 Prim’s Algorithm

Prim’s Algorithm:
 Prim's algorithm for finding an MST is a greedy algorithm.
 Start by picking any vertex r to be the root of the tree.
 While the tree does not contain all vertices in the graph, find the shortest edge leaving the
tree and add it to the tree.

Prim’s Selection rule

• Select the minimum weight edge between a tree node and a non-tree node and add it to
the tree
• A group of edges that connects two sets of vertices in a graph is called a cut in graph theory
Algorithm Prim(E, cost, n, t)
// E = set of edges
// cost[1:n, 1:n] = cost adjacency matrix
// cost[i,j] = positive real number if edge (i,j) exists
// cost[i,j] = infinity if no edge exists
{
    Let (k,l) be an edge of minimum cost in E;
    mincost := cost[k,l];
    t[1,1] := k;
    t[1,2] := l;
    for i := 1 to n do   // initialization of near
    {
        if cost[i,l] < cost[i,k] then
            near[i] := l;
        else
            near[i] := k;
    }
    near[k] := near[l] := 0;
    for i := 2 to n-1 do
    {
        Let j be an index such that near[j] ≠ 0 and cost[j, near[j]] is minimum;
        t[i,1] := j;
        t[i,2] := near[j];
        mincost := mincost + cost[j, near[j]];
        near[j] := 0;
        for k := 1 to n do   // update near
            if ((near[k] ≠ 0) and (cost[k, near[k]] > cost[k,j]))
                then near[k] := j;
    }
    return mincost;
}
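
A Python sketch of Prim's selection rule. Unlike the pseudocode above, which scans a near[] array, this version keeps candidate edges in a binary heap (the heapq module); it assumes a connected, undirected graph given as adjacency lists of (neighbor, weight) pairs. The four-node graph and the name prim are hypothetical.

```python
import heapq

def prim(graph, root):
    """Grow a tree from root, always adding the cheapest edge that leaves the tree."""
    in_tree = {root}
    tree_edges, total = [], 0
    heap = [(w, root, v) for v, w in graph[root]]
    heapq.heapify(heap)
    while heap and len(in_tree) < len(graph):
        w, u, v = heapq.heappop(heap)      # cheapest candidate edge
        if v in in_tree:
            continue                       # both ends already in the tree
        in_tree.add(v)
        tree_edges.append((u, v, w))
        total += w
        for x, wx in graph[v]:
            if x not in in_tree:
                heapq.heappush(heap, (wx, v, x))
    return tree_edges, total

graph = {
    "A": [("B", 1), ("C", 3)],
    "B": [("A", 1), ("C", 2)],
    "C": [("A", 3), ("B", 2), ("D", 4)],
    "D": [("C", 4)],
}
print(prim(graph, "A"))  # ([('A', 'B', 1), ('B', 'C', 2), ('C', 'D', 4)], 7)
```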

Example for Prim’s Algorithm:

Tables (not reproduced here): initial configuration; vertex v1 known; vertex v4 known;
vertices v2 and v3 known; vertex v7 known; vertices v5 and v6 visited.

The edges in the spanning tree can be read
from the table: (v2, v1), (v3, v4), (v4, v1), (v5, v7), (v6, v7), (v7, v4). The total cost is 16.

Algorithm Complexity: The running time is O(|V|^2) without heaps, which is optimal for dense
graphs, and O(|E| log |V|) using binary heaps, which is good for sparse graphs.

5.6.2 Kruskal’s algorithm


Kruskal's algorithm uses a greedy technique to compute a minimum spanning tree. An
MST can be grown from a forest of trees by repeatedly adding the smallest edge connecting
two different trees.
 In general, Kruskal's algorithm maintains a forest, i.e. a collection of trees. Initially, there are |V|
single-node trees.
 Adding an edge merges two trees into one. When the algorithm terminates, there is only
one tree, which is the minimum spanning tree.
Data Structures used in Kruskal’s Algorithm:
 Find(U) returns the root of the tree that contains the vertex U.
 Union(S, U, V) merges the two trees by making the root pointer of one tree point to the root
node of the other tree.

Algorithm kruskal(E, cost, n, t)
// E – set of all the edges in a graph G
// G – graph consisting of n vertices
// cost[u,v] – cost of the edge (u,v)
{
    Construct a heap out of the edge costs using heapify;   // construct heap tree
    for i = 1 to n do
    {
        parent[i] = -1;
    }
    i = 0;
    mincost = 0.0;
    while ((i < n-1) and (heap not empty)) do
    {
        Delete a minimum cost edge (u,v) from the heap and reheapify using adjust;
        j = find(u);
        k = find(v);
        if (j ≠ k) then   // edge does not form a cycle
        {
            i = i + 1;
            t[i,1] = u;
            t[i,2] = v;
            mincost = mincost + cost[u,v];
            union(j, k);
        }
    }
    if (i ≠ n-1) then
        write ("No spanning tree");
    else return mincost;
}

Algorithm find(i)
{
    while (parent[i] ≥ 0) do
    {
        i = parent[i];
    }
    return i;
}

Algorithm union(i, j)
{
    parent[i] = j;
}
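
A Python sketch of Kruskal's algorithm with the simple find and union shown above. It sorts the edge list instead of building a heap, which selects edges in the same order; vertices are numbered 0..n-1 and edges are (cost, u, v) tuples. The example graph and the name kruskal are hypothetical.

```python
def kruskal(n, edges):
    """Return (tree_edges, mincost); the tree has n-1 edges if the graph is connected."""
    parent = [-1] * n                 # -1 marks the root of a tree

    def find(i):
        while parent[i] >= 0:         # follow parent pointers up to the root
            i = parent[i]
        return i

    tree, mincost = [], 0
    for cost, u, v in sorted(edges):  # cheapest edge first
        j, k = find(u), find(v)
        if j != k:                    # edge does not form a cycle
            parent[j] = k             # union: attach one root under the other
            tree.append((u, v, cost))
            mincost += cost
            if len(tree) == n - 1:
                break
    return tree, mincost

edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]
print(kruskal(4, edges))  # ([(0, 1, 1), (1, 2, 2), (2, 3, 4)], 7)
```

If the returned tree has fewer than n-1 edges, the graph is disconnected and no spanning tree exists.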


5.9 Topological sorting


Topological sort (an application of breadth first search):
A topological sort or topological ordering of a directed graph is a linear ordering of its
vertices such that for every directed edge (v, w) from vertex v to vertex w, v comes before w in the
ordering.

Graphs in which topological sorting is performed:


A topological ordering is possible if and only if the graph has no directed cycles, that is, if it is a
directed acyclic graph (DAG)
Definition of Topological sort:
Given a digraph G = (V, E), find a linear ordering of its vertices such that:
for any edge (v, w) in E, v precedes w in the ordering
Example for Topological Sorting:
Topological sort of a directed graph. Here F is independent, so it can be placed at any
position in the ordering.

• Topological sort is not unique.
• The following are all topological sort of the graph below:

A directed graph with a cycle cannot be topologically sorted. The figure below shows an example of a
cyclic graph on which topological sorting cannot be performed.

Condition for cycle in a graph:


• Given a digraph G = (V,E), a cycle is a sequence of vertices v1, v2, ..., vk such that
k > 1, v1 = vk, and (vi, vi+1) is in E for 1 ≤ i < k.
• G is acyclic if it has no cycles

Steps followed in topological sorting algorithm:


Step 1: Identify vertices that have no incoming edges.
The "in-degree" of these vertices is zero.
(a) If there are no such vertices, the remaining graph contains a cycle (cyclic graph). A topological sort is not possible,
so stop.
(b) If such vertices are available, select one of them.
Step 2: Delete this vertex of in-degree 0 and all its outgoing edges from the graph.
Place it in the output.
Step 3: Repeat steps 1 and 2 until the graph is empty.
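
The steps above can be sketched in Python. Instead of physically deleting vertices, the sketch keeps an in-degree count per vertex and a queue of vertices whose in-degree has dropped to zero. The four-node DAG and the name topological_sort are hypothetical.

```python
from collections import deque

def topological_sort(graph):
    """graph maps each vertex to the list of vertices its outgoing edges point to."""
    indegree = {v: 0 for v in graph}
    for v in graph:
        for w in graph[v]:
            indegree[w] += 1
    queue = deque(v for v in graph if indegree[v] == 0)   # step 1
    order = []
    while queue:
        v = queue.popleft()
        order.append(v)               # step 2: place the vertex in the output
        for w in graph[v]:            # "delete" its outgoing edges
            indegree[w] -= 1
            if indegree[w] == 0:
                queue.append(w)
    if len(order) != len(graph):      # some vertices never reached in-degree 0
        raise ValueError("graph has a cycle; no topological order exists")
    return order

dag = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(topological_sort(dag))  # ['A', 'B', 'C', 'D']
```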

Topological Ordering Algorithm: Example
Figures (not reproduced here): Step 1 to Step 7, ending with the final topological sorted list.

Two Marks

1. Define Hashing.
Hashing is the transformation of a string of characters into a usually shorter fixed-length value or
key that represents the original string. Hashing is used to index and retrieve items in a database
because it is faster to find the item using the short hashed key than to find it using the original
value.

2. What do you mean by hash table? (November 2015)


The hash table data structure is merely an array of some fixed size, containing the keys. A
key is a string with an associated value. Each key is mapped into some number in the range 0 to
tablesize-1 and placed in the appropriate cell.

3. What do you mean by hash function? (May 2014)


A hash function is a key-to-address transformation which acts upon a given key to compute
the relative position of the key in an array. The hash function should be simple to compute and must
distribute the data evenly. A simple hash function is hash_key = key mod tablesize.

4. Write the importance of hashing.


• Maps a key to the corresponding value using a hash function.
• Hash tables support the efficient addition of new entries, and the time spent on searching
for the required data is, on average, independent of the number of items stored.

5. What do you mean by collision in hashing?


When an element being inserted hashes to the same value as an already
inserted element, a collision is produced.

6. What are the collision resolution methods? (December 2014)


• Separate chaining or External hashing
• Open addressing or Closed hashing

7. What do you mean by separate chaining?


Separate chaining is a collision resolution technique that keeps a list of all elements that hash
to the same value. This is called separate chaining because each hash table element is a separate
chain (linked list). Each linked list contains all the elements whose keys hash to the same index.

8. Write the advantage of separate chaining.


• More number of elements can be inserted as it uses linked lists.

9. Write the disadvantages of separate chaining.


• The elements may not be evenly distributed. Some lists may have many
elements and some may not have any.
• It requires pointers. This tends to slow the algorithm down a bit because of
the time required to allocate new cells, and also essentially requires the
implementation of a second data structure.

10. What do you mean by open addressing?
Open addressing is a collision resolving strategy in which, if a collision occurs, alternative cells are
tried until an empty cell is found. The cells h0(x), h1(x), h2(x), ... are tried in succession, where
hi(x) = (Hash(x) + F(i)) mod Tablesize with F(0) = 0. The function F is the collision resolution strategy.

11. What are the types of collision resolution strategies in open addressing?
• Linear probing
• Quadratic probing
• Double hashing

12. What do you mean by Probing?


Probing is the process of getting next available hash table array cell.

13. What do you mean by linear probing?


Linear probing is an open addressing collision resolution strategy in which F is a linear function
of i, F(i) = i. This amounts to trying cells sequentially in search of an empty cell. If the table is big enough,
a free cell can always be found, but the time to do so can get quite large.

14. What do you mean by primary clustering?


In the linear probing collision resolution strategy, even if the table is relatively
empty, blocks of occupied cells start forming. This effect is known as primary
clustering. It means that any key that hashes into the cluster will require several attempts
to resolve the collision, and then it will add to the cluster.

15. What do you mean by quadratic probing?


Quadratic probing is an open addressing collision resolution strategy in which F(i) = i^2. There is
no guarantee of finding an empty cell once the table gets more than half full, or even before that if
the table size is not prime. This
is because at most half of the table can be used as alternative locations to resolve collisions.

16. What do you mean by secondary clustering?


Although quadratic probing eliminates primary clustering, elements that hash to the same
position will probe the same alternative cells. This is known as secondary clustering.

17. What do you mean by double hashing?


Double hashing is an open addressing collision resolution strategy in which F(i) = i * hash2(X).
This formula says that we apply a second hash function to X and probe at a distance hash2(X),
2 hash2(X), ..., and so on. A commonly used function is hash2(X) = R - (X mod R), with R a prime smaller than
Tablesize.

18. What do you mean by rehashing? (April 2015)


If the table gets too full, the running time for the operations will start taking too long and inserts
might fail for open addressing with quadratic resolution. A solution to this is to build another table
that is about twice as big, with an associated new hash function, and scan down the entire original
hash table, computing the new hash value for each element and inserting it in the new table. This
entire operation is called rehashing.

19. What is the need for extendible hashing?


If either open addressing hashing or separate chaining hashing is used, the major problem is
that collisions could cause several blocks to be examined during a Find, even for a well-distributed
hash table. Extendible hashing allows a find to be performed in two disk accesses. Insertions also
require few disk accesses.

20. List the limitations of linear probing.


• Time taken for finding the next available cell is large.
• In linear probing, we come across a problem known as clustering.

21. Mention one advantage and disadvantage of using quadratic probing.


Advantage: The problem of primary clustering is eliminated.
Disadvantage: There is no guarantee of finding an unoccupied cell once the table is nearly half
full.

22.Classify the Hashing Functions based on the various methods by which the key value is
found.
 Direct method,
 Subtraction method,
 Modulo-Division method,
 Digit-Extraction method,
 Mid-Square method,
 Folding method,
 Pseudo-random method.

23. Define Graph.


A graph G consists of a nonempty set V which is the set of nodes of the graph, a set E which is
the set of edges of the graph, and a mapping from the set of edges E to a set of pairs of elements of
V. It can also be represented as G = (V, E).

24. Define adjacent nodes.


Any two nodes which are connected by an edge in a graph are called adjacent nodes. For
example, if an edge x ∈ E is associated with a pair of nodes (u,v) where u, v ∈ V, then we say that
the edge x connects the nodes u and v.

25. What is a directed graph and an undirected graph?


A graph in which every edge is directed is called a directed graph. A graph in which every
edge is undirected is called an undirected graph.

26. What is a loop?


An edge of a graph which connects a node to itself is called a loop or sling.

27. What is a simple graph and weighted graph?


A simple graph is a graph which has not more than one edge between any pair of nodes.
A graph in which weights are assigned to every edge is called a weighted graph.

28. Define outdegree and indegree of a graph?


In a directed graph, for any node v, the number of edges which have v as their initial node
is called the outdegree of the node v.
In a directed graph, for any node v, the number of edges which have v as their terminal
node is called the indegree of the node v.

29. Define path in a graph?


The path in a graph is the route taken to reach terminal node from a starting node.

30. What is a cycle or a circuit?


A path which originates and ends in the same node is called a cycle or circuit.

31. What is an acyclic graph?


A simple digraph which does not have any cycles is called an acyclic graph.

32. What is meant by strongly connected and weakly connected in a graph?


An undirected graph is connected, if there is a path from every vertex to every
other vertex. A directed graph with this property is called strongly connected.
When a directed graph is not strongly connected but the underlying graph is
connected, then the graph is said to be weakly connected.

33. Name the different ways of representing a graph. (April 2015)


a.Adjacency matrix
b. Adjacency list

34. What is an undirected acyclic graph?


When every edge in an acyclic graph is undirected, it is called an undirected
acyclic graph. It is also called as undirected forest.

35. What is a minimum spanning tree and list two algorithms to find minimum spanning
tree?
A minimum spanning tree of an undirected graph G is a tree formed from graph
edges that connects all the vertices of G at the lowest total cost.
Two algorithms to find a minimum spanning tree:
Kruskal's algorithm
Prim's algorithm

36. Define graph traversals and write two graph traversal techniques
Traversing a graph is an efficient way to visit each vertex and edge exactly once.
The two graph traversal techniques are
 DFS
 BFS

37. List the two important key points of depth first search.
i) If path exists from one node to another node, walk across the edge – exploring
the edge.
ii) If path does not exist from one specific node to any other node, return to the previous
node where we have been before – backtracking.

38. What do you mean by breadth first search (BFS)? (November 2015)
BFS performs simultaneous explorations starting from a common point and
spreading out independently.

39. Differentiate BFS and DFS. (November 2015)

No.  DFS                                          BFS
1.   Backtracking is possible from a dead end     Backtracking is not possible
2.   Vertices from which exploration is           The vertices to be explored are
     incomplete are processed in LIFO order       organized as a FIFO queue
3.   Search is done in one particular direction   The vertices in the same level are
                                                  maintained in parallel

40. Define biconnectivity.
A connected graph G is said to be biconnected if it remains connected after the removal of any
one vertex and the edges that are incident upon that vertex. A connected graph is biconnected if it
has no articulation points.

41. What do you mean by articulation point?


If a graph is not biconnected, the vertices whose removal would disconnect the
graph are known as articulation points.

42. Define adjacency list.


An adjacency list is an array, indexed by vertex number, containing linked lists. For each node Vi,
the i-th array entry contains a list with information on all edges of G that leave Vi. It is used to
represent graphs in graph-related problems.

Assignment Questions

1. Given input {4371, 1323, 6173, 4199, 4344, 9679, 1989} and a hash function h(x) = x mod 10,
show the resulting
a. separate chaining hash table
b. hash table using linear probing
c. hash table using quadratic probing
d. hash table with second hash function h2(x) = 7 − (x mod 7)
What are the advantages and disadvantages of the various collision resolution strategies?

2. a. Find a minimum spanning tree for the graph in the figure using both Prim's and Kruskal's
algorithms.
b. Is this minimum spanning tree unique? Why?


3. Find the strongly connected components in the graph

4. Perform the BFS and DFS graph traversal on the following graph

5. Consider a directed acyclic graph D given in the figure. Sort the nodes of D by applying
topological sort on D.
