Module 5


Data Structures and its Application
Subject code: BCS304

1
4/5/2024 Dept of CSE
MODULE-5
HASHING TECHNIQUES

HASHING TECHNIQUES

HASHING
SYMBOL TABLE ABSTRACT DATA TYPE
STATIC HASHING
Hash Tables
Hashing Functions
Overflow Handling
Theoretical Evaluation Of Overflow Techniques
DYNAMIC HASHING
PRIORITY QUEUE
Single Ended Priority Queue
Double Ended Priority Queue
Min-Max Priority Queue

LEFTIST TREES
Height Biased Leftist Tree
Weight Biased Leftist Tree
OPTIMAL BINARY SEARCH TREES

SYMBOL TABLE AS ADT
A symbol table is defined as a set of name-attribute pairs, where the
name and the attributes vary according to the application.

For example, in a thesaurus, the name is a word and the attribute is a
list of synonyms for the word; in a symbol table for a compiler, the
name is an identifier, and the attributes might include an initial value
and a list of lines that use the identifier.
On a symbol table, we generally want to perform the following
operations:

(1) determine if a particular name is in the table

(2) retrieve the attributes of that name

(3) modify the attributes of that name

(4) insert a new name and its attributes

(5) delete a name and its attributes
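
The five operations above can be sketched as a small class. This is only an illustrative sketch in Python, built on the language's built-in dictionary; the class name and method names are assumptions and do not come from the slides.

```python
class SymbolTable:
    """A set of name-attribute pairs (illustrative sketch only)."""

    def __init__(self):
        self._entries = {}

    def contains(self, name):
        # (1) determine if a particular name is in the table
        return name in self._entries

    def retrieve(self, name):
        # (2) retrieve the attributes of that name (None if absent)
        return self._entries.get(name)

    def modify(self, name, attributes):
        # (3) modify the attributes of an existing name
        if name not in self._entries:
            raise KeyError(name)
        self._entries[name] = attributes

    def insert(self, name, attributes):
        # (4) insert a new name and its attributes
        if name in self._entries:
            raise KeyError(f"{name!r} is already in the table")
        self._entries[name] = attributes

    def delete(self, name):
        # (5) delete a name and its attributes
        del self._entries[name]
```

For a compiler-style table, the attributes could be a small record such as an initial value plus the list of lines that use the identifier.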

Symbol Table Abstract Data Type

HASHING

HASH TABLE ORGANIZATION
Hash Table:
A hash table is a data structure used for storing and retrieving data very
quickly.
Insertion, deletion, and retrieval take place with the help of a hash
value.
Hence, every entry in the hash table is associated with some key.
Using the hash key, the required piece of data can be found in the hash
table with only a few key comparisons. The search time depends on the size
of the hash table relative to the number of entries stored in it.

HASH FUNCTION
Hash Function:
A hash function is a mathematical formula which, when applied to a key,
produces an integer that can be used as an index for the key in the hash
table.
The main aim of a hash function is to distribute the elements relatively
randomly and uniformly. It produces a set of integers within some suitable
range in order to reduce the number of collisions.
A good hash function cannot eliminate collisions altogether; it can only
minimize their number by spreading the elements uniformly throughout the
array.


Properties of Hash Function:


Low cost: The cost of executing a hash function must be small, so that
hashing remains preferable to other approaches.
Determinism: A hash procedure must be deterministic. This means that
the same hash value must always be generated for a given input value.
Uniformity: A good hash function must map the keys as evenly as possible
over its output range. This means that the probability of generating each
hash value in the output range should be roughly the same.

TYPES OF HASH FUNCTION
The types of hash function are:
Division Method
Multiplication Method
Mid-Square Method
Folding Method
Converting Keys to Integers

1. Division Method:

The division method is the simplest method of hashing an integer x. It
divides x by M and uses the remainder as the hash value, so the hash
function can be given as h(x) = x mod M. The method works very fast.
However, extra care should be taken to select a suitable value for M.

Example: Calculate the hash values of keys 1234 and 5642.
Solution: Setting M = 97, the hash values can be calculated as:
h(1234) = 1234 % 97 = 70
h(5642) = 5642 % 97 = 16
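
The calculation above can be checked with a one-line function. This is a minimal sketch in Python; the function name division_hash is an assumption used only for illustration.

```python
def division_hash(x, M):
    """Division method: h(x) = x mod M."""
    return x % M

print(division_hash(1234, 97))  # 70
print(division_hash(5642, 97))  # 16
```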

2. Multiplication Method: The steps involved in the multiplication method are
as follows:
Step 1: Choose a constant A such that 0 < A < 1.
Step 2: Multiply the key k by A.
Step 3: Extract the fractional part of kA.
Step 4: Multiply the result of Step 3 by the size of the hash table (m) and
take the floor of the product.

Hence, the hash function can be given as: h(k) = ⌊m (kA mod 1)⌋,
where (kA mod 1) gives the fractional part of kA and m is the total number of
indices in the hash table.
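
The steps of the multiplication method can be sketched as follows. The value A = 0.618033 (close to (√5 − 1)/2, a commonly suggested choice), the key 12345, and the table size m = 1000 are assumptions picked for illustration; the slides do not fix them.

```python
import math

def multiplication_hash(k, m, A=0.618033):
    """Multiplication method: h(k) = floor(m * (k*A mod 1))."""
    fractional = (k * A) % 1.0          # Step 3: fractional part of kA
    return math.floor(m * fractional)   # Step 4: scale by the table size

print(multiplication_hash(12345, 1000))  # 12345 * A = 7629.617385 -> 617
```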

3. Mid-Square Method: The mid-square method is a good hash function which
works in two steps:
Step 1: Square the value of the key k. That is, find k^2.
Step 2: Extract the middle r digits of the result obtained in Step 1.
In the mid-square method, the same r digit positions must be chosen for all
the keys. Therefore, the hash function can be given as h(k) = s, where s is
obtained by selecting r digits from k^2.
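
A short sketch of the mid-square method follows. It assumes a table of 100 locations (so r = 2) and takes the third and fourth digits from the right of k^2 as the "middle" digits; both choices, and the parameter name skip, are assumptions for illustration.

```python
def mid_square_hash(k, r=2, skip=2):
    """Mid-square method: take r digits of k^2 after skipping `skip` digits from the right."""
    squared = k * k
    return (squared // 10 ** skip) % 10 ** r

print(mid_square_hash(1234))  # 1234^2 = 1522756  -> 27
print(mid_square_hash(5642))  # 5642^2 = 31832164 -> 21
```

Because the same digit positions are used for every key, two equal keys always hash to the same location.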

4. Folding Method: The folding method works in the following two steps:
Step 1: Divide the key value into a number of parts. That is, divide k into parts
k1, k2, ..., kn, where each part has the same number of digits except the last
part, which may have fewer digits than the other parts.
Step 2: Add the individual parts. That is, obtain the sum k1 + k2 + ... + kn.
The hash value is produced by ignoring the last carry, if any.

Note: The number of digits in each part of the key varies depending
upon the size of the hash table. For example, if the hash table has a size of
1000, then there are 1000 locations in the hash table. To address these 1000
locations, we need at least three digits; therefore, each part of the key must
have three digits except the last part, which may have fewer digits.
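
The two folding steps can be sketched as below. The keys and the choice of two digits per part (a table of 100 locations) are assumptions chosen for illustration.

```python
def folding_hash(k, digits):
    """Folding method: split k into `digits`-digit parts, add them, drop the carry."""
    text = str(k)
    parts = [int(text[i:i + digits]) for i in range(0, len(text), digits)]
    return sum(parts) % 10 ** digits   # the modulo discards any carry

print(folding_hash(5678, 2))   # 56 + 78 = 134 -> 34
print(folding_hash(34567, 2))  # 34 + 56 + 7 = 97
```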


TYPES OF HASH TECHNIQUES
There are two types of Hashing Techniques:

Static Hashing
Dynamic Hashing

Example of Static Hashing
Consider the hash table ht with b = 26 buckets and s = 2 slots per bucket. We have n = 10 distinct
identifiers, each representing a C library function. This table has a loading factor, α, of
10/52 = 0.19. The hash function must map each of the possible identifiers onto one of the numbers
0-25. We can construct a fairly simple hash function by associating the letters a to z with the
numbers 0-25, respectively, and then defining the hash function, f(x), as the first character of x.
Using this scheme, the library functions acos, define, float, exp, char, atan, ceil, floor, clock,
and ctime hash into buckets 0, 3, 5, 4, 2, 0, 2, 5, 2, and 2, respectively.

The identifiers acos and atan are synonyms, as are float and floor, and ceil and char. The next
identifier, clock, hashes into bucket 2. Since this bucket is full, we have an overflow.
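
The example can be replayed in a few lines. This is a sketch only: the list-of-lists layout for ht and the overflows list are assumptions made for illustration, while b, s, f(x), and the identifiers come from the example above.

```python
def f(identifier):
    """Hash on the first character: 'a' -> 0, ..., 'z' -> 25."""
    return ord(identifier[0]) - ord("a")

b, s = 26, 2                         # buckets and slots per bucket
ht = [[] for _ in range(b)]
overflows = []
names = ["acos", "define", "float", "exp", "char",
         "atan", "ceil", "floor", "clock", "ctime"]

for name in names:
    bucket = ht[f(name)]
    if len(bucket) < s:
        bucket.append(name)
    else:
        overflows.append(name)       # bucket already holds s identifiers

print([f(name) for name in names])     # [0, 3, 5, 4, 2, 0, 2, 5, 2, 2]
print(overflows)                       # ['clock', 'ctime']
print(round(len(names) / (b * s), 2))  # loading factor 0.19
```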

OVERFLOW HANDLING/ COLLISION
Collisions: A collision occurs when the hash function maps two different keys
to the same location. Two records cannot be stored in the same location.
Therefore, a method is needed to solve the problem of collisions; such a
method is called a collision resolution technique.
The two most popular methods of resolving collisions are:
1. Open addressing
2. Chaining

1. Collision Resolution by Open Addressing: Once a collision takes place,
open addressing (also called closed hashing) computes new positions using a
probe sequence, and the record is stored in the first free position found. In
this technique, all the values are stored in the hash table itself.

The hash table contains two types of values: sentinel values (e.g., –1) and data
values.
The presence of a sentinel value indicates that the location contains no data
value at present but can be used to hold a value.
When a key is mapped to a particular memory location, the value held there is
checked. If it is a sentinel value, then the location is free and the data
value can be stored in it. However, if the location already has some data value
stored in it, then other slots are examined systematically in the forward
direction to find a free slot. If no free location is found, then we have an
OVERFLOW condition.

The process of examining memory locations in the hash table is called
probing.

Open addressing can be implemented using linear probing, quadratic probing,
double hashing, and rehashing.
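
The forward examination of slots described above corresponds to linear probing, which can be sketched as follows. The probe formula (k mod m + i) mod m, the table size of 10, and the keys are assumptions used for illustration, since the worked example slides are not reproduced in the text.

```python
EMPTY = -1  # sentinel value marking a free location

def insert_linear(table, key):
    """Store key in the first free slot at or after key mod m, wrapping around."""
    m = len(table)
    for i in range(m):                   # i is the probe number
        pos = (key % m + i) % m
        if table[pos] == EMPTY:
            table[pos] = key
            return pos
    raise OverflowError("no free location: hash table overflow")

table = [EMPTY] * 10
positions = [insert_linear(table, k) for k in (72, 27, 36, 24, 63, 81, 92)]
print(positions)  # [2, 7, 6, 4, 3, 1, 5]
```

Key 92 first maps to location 2, finds locations 2, 3, and 4 occupied, and is stored at location 5.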

QUADRATIC PROBING
Quadratic Probing: In this technique, if a value is already stored at a location
generated by h'(k), then the following hash function is used to resolve the
collision:

h(k, i) = [h'(k) + c1 i + c2 i^2] mod m

where m is the size of the hash table, h'(k) = (k mod m), i is the probe number
that varies from 0 to m–1, and c1 and c2 are constants such that c1 and c2 ≠ 0.
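
A minimal sketch of quadratic probing is given below. The constants c1 = 1 and c2 = 3, the table size of 10, and the keys are assumptions chosen for illustration.

```python
EMPTY = -1  # sentinel value marking a free location

def insert_quadratic(table, key, c1=1, c2=3):
    """Probe positions (k mod m + c1*i + c2*i^2) mod m for i = 0 .. m-1."""
    m = len(table)
    for i in range(m):
        pos = (key % m + c1 * i + c2 * i * i) % m
        if table[pos] == EMPTY:
            table[pos] = key
            return pos
    raise OverflowError("no free location found within m probes")

table = [EMPTY] * 10
positions = [insert_quadratic(table, k) for k in (72, 27, 36, 24, 63, 81, 101)]
print(positions)  # [2, 7, 6, 4, 3, 1, 5]
```

Key 101 collides with 81 at location 1 and is placed at (1 + 1 + 3) mod 10 = 5 on the next probe. Note that quadratic probing may fail to find a free slot even when one exists, because the probe sequence does not necessarily visit every location.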

DOUBLE HASHING
To start with, double hashing uses one hash value and then repeatedly steps
forward by an interval until an empty location is reached. The interval is
decided using a second, independent hash function, hence the name double
hashing. In double hashing, we use two hash functions rather than a single
function. The hash function in the case of double hashing can be given as:

h(k, i) = [h1(k) + i h2(k)] mod m

where m is the size of the hash table, h1(k) and h2(k) are the two hash
functions, and i is the probe number that varies from 0 to m–1.

EXAMPLE OF DOUBLE HASHING

Now T[6] is occupied, so we cannot store the key 101 in T[6]. Therefore, try
again for the next location with probe i = 2. Repeat the entire process until a
vacant location is found. You will see that we have to probe many times to
insert the key 101 in the hash table. Although double hashing is a very
efficient algorithm, it works best when m is a prime number, because then any
step size that is not a multiple of m is relatively prime to m and the probe
sequence can reach every location. In our case m = 10, which is not a prime
number; hence the degradation in performance. Had m been equal to 11, the
algorithm would have worked very efficiently. Thus, we can say that the
performance of the technique is sensitive to the value of m.
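
The sensitivity to m can be seen by listing the probe sequence for key 101. This is a minimal sketch: h1(k) = k mod m follows the text's use of m = 10, while h2(k) = k mod 8 is an assumption, chosen only because it sends key 101 to T[6] at probe i = 1 as described above. The function name and the parameter m2 are likewise hypothetical.

```python
def probe_sequence(key, m, m2=8):
    """Positions (h1(k) + i*h2(k)) mod m for i = 0 .. m-1.

    h2(k) = k mod m2 is an assumed second hash function; a key with
    h2(k) == 0 would never move, so a real table must avoid a zero step.
    """
    h1 = key % m
    h2 = key % m2
    return [(h1 + i * h2) % m for i in range(m)]

print(probe_sequence(101, 10))  # [1, 6, 1, 6, 1, 6, 1, 6, 1, 6]
print(probe_sequence(101, 11))  # [2, 7, 1, 6, 0, 5, 10, 4, 9, 3, 8]
```

Under these assumptions, with m = 10 and a step of 5 only two locations are ever examined, whereas with m = 11 the same step visits all eleven locations.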

COLLISION RESOLVED BY CHAINING
In chaining, each location in a hash table stores a pointer to a linked list
that contains all the key values that were hashed to that location. That is,
location l in the hash table points to the head of the linked list of all the
key values that hashed to l. However, if no key value hashes to l, then
location l in the hash table contains NULL. The accompanying figure shows how
the key values are mapped to a location in the hash table and stored in a
linked list that corresponds to that location.
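
Chaining can be sketched with a small linked-list node type. The hash function k mod m, the table size of 9, the keys, and the class names are assumptions for illustration; None plays the role of NULL.

```python
class Node:
    def __init__(self, key, next_node=None):
        self.key = key
        self.next = next_node

class ChainedHashTable:
    def __init__(self, m):
        self.m = m
        self.table = [None] * m          # None plays the role of NULL

    def insert(self, key):
        loc = key % self.m               # location the key hashes to
        self.table[loc] = Node(key, self.table[loc])  # new node becomes the head

    def search(self, key):
        node = self.table[key % self.m]
        while node is not None and node.key != key:
            node = node.next
        return node is not None

t = ChainedHashTable(9)
for key in (7, 24, 18, 52, 36, 54, 11, 23):
    t.insert(key)
print(t.search(52), t.search(99))  # True False
```

Inserting at the head of the list keeps insertion at constant cost; searching walks only the list for the key's own location.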

DYNAMIC HASHING

PRIORITY QUEUE
Priority Queue: A priority queue is a collection of elements such that
each element has an associated priority. In this queue, elements with
higher priority are dequeued before the elements with lower priority. If
two elements carry the same priority, they are served as per their
ordering in the queue.
Examples:
Traffic Management: In this case, vehicles marked 'Emergency' have the
highest priority, followed by VIP vehicles, and finally, regular vehicles.
Operating System Processes: Certain processes are crucial for the
operation of the system and so are given priority over less critical tasks.
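
The behaviour described above, including serving equal priorities in arrival order, can be sketched with Python's standard heapq module. The class name, the numeric priority values, and the vehicle names are assumptions for illustration.

```python
import heapq
import itertools

class PriorityQueue:
    def __init__(self):
        self._heap = []
        self._order = itertools.count()   # arrival number breaks ties

    def enqueue(self, item, priority):
        # heapq is a min-heap, so the priority is negated to pop the highest first
        heapq.heappush(self._heap, (-priority, next(self._order), item))

    def dequeue(self):
        return heapq.heappop(self._heap)[2]

q = PriorityQueue()
for vehicle, priority in [("car", 1), ("ambulance", 3),
                          ("VIP convoy", 2), ("fire engine", 3)]:
    q.enqueue(vehicle, priority)
print([q.dequeue() for _ in range(4)])
# ['ambulance', 'fire engine', 'VIP convoy', 'car']
```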

HEAP DATA STRUCTURE
A heap is a complete binary tree that satisfies the heap property, where any
given node is
always greater than or equal to its child node(s), so that the key of the root
node is the largest among all nodes. This property is called the max heap
property; or
always smaller than or equal to its child node(s), so that the key of the root
node is the smallest among all nodes. This property is called the min heap
property.

Figure: a max heap and a min heap.

SINGLE ENDED PRIORITY QUEUE
NOTE: Heaps are frequently used to implement priority queues.
Single Ended Priority Queue: A single-ended priority queue is a data
structure from which elements are removed in order of priority, from one end
only (either the minimum or the maximum).

The single-ended priority queue is further classified as:
min-priority queue
max-priority queue

min-priority queue: The operations supported by a min-priority queue are:
SP1: Return an element with minimum priority
SP2: Insert an element with arbitrary priority
SP3: Delete an element with minimum priority

max-priority queue: The operations supported by a max-priority queue are:
Return an element with maximum priority
Insert an element with arbitrary priority
Delete an element with maximum priority

Example: Construct the max heap and min heap data structures for the
following sequence: 25, 23, 32, 20, 14, 19, 24, 34, 26, 21, 9
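
One way to work this example is to insert the values one at a time and sift each new value up. The slides do not state which construction method is intended, so insertion in the given order is an assumption, as are the function and variable names.

```python
def build_heap(values, max_heap=True):
    """Insert values one by one, sifting each new value up to its place."""
    heap = []
    for value in values:
        heap.append(value)
        child = len(heap) - 1
        while child > 0:
            parent = (child - 1) // 2
            if max_heap:
                misplaced = heap[child] > heap[parent]
            else:
                misplaced = heap[child] < heap[parent]
            if not misplaced:
                break
            heap[child], heap[parent] = heap[parent], heap[child]
            child = parent
    return heap

sequence = [25, 23, 32, 20, 14, 19, 24, 34, 26, 21, 9]
max_heap = build_heap(sequence, max_heap=True)
min_heap = build_heap(sequence, max_heap=False)
print(max_heap[0], min_heap[0])  # 34 9
```

The heap is stored as an array in level order, so the children of the node at index i sit at indices 2i + 1 and 2i + 2.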

Extension of Single Ended Priority Queue
1. Meldable (single-ended) priority queue
2. Delete arbitrary element

Meldable (single-ended) priority queue: Augments the operations SP1
through SP3 with a meld operation that melds together two priority
queues.

One application of the meld operation arises when the server for one
priority queue shuts down. At this time, it is necessary to meld its
priority queue with that of a functioning server.

Two data structures for meldable priority queues are:
Leftist trees
Binomial heaps
2. Delete arbitrary element

Delete arbitrary element: A further extension of the meldable priority
queue adds operations to delete an arbitrary element (given its location
in the data structure) and to decrease the key/priority (or to increase
the key, in the case of a max priority queue) of an arbitrary element
(given its location in the data structure).

Two data structures developed for these operations are:
Fibonacci heaps
Pairing heaps

DOUBLE ENDED PRIORITY QUEUE
Double Ended Priority Queue: A double-ended priority queue is a data
structure that supports the following operations on a collection of elements:
Return an element with minimum priority
Return an element with maximum priority
Insert an element with arbitrary priority
Delete an element with minimum priority
Delete an element with maximum priority
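
The five operations can be illustrated with a deliberately simple sketch that keeps the elements in a sorted Python list. This is not an efficient double-ended priority queue; it only shows the interface, and the class and method names are assumptions.

```python
import bisect

class SortedListDEPQ:
    def __init__(self):
        self._items = []                 # kept in ascending order of priority

    def insert(self, priority):
        bisect.insort(self._items, priority)

    def get_min(self):
        return self._items[0]

    def get_max(self):
        return self._items[-1]

    def delete_min(self):
        return self._items.pop(0)

    def delete_max(self):
        return self._items.pop()

d = SortedListDEPQ()
for p in (25, 23, 32, 20, 14):
    d.insert(p)
print(d.get_min(), d.get_max())        # 14 32
print(d.delete_min(), d.delete_max())  # 14 32
print(d.get_min(), d.get_max())        # 20 25
```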

LEFTIST TREE
Leftist Tree: A leftist tree is defined using the concept of an extended binary
tree. An extended binary tree is a binary tree in which all empty binary
subtrees have been replaced by a square node.
Figure 1 shows example binary trees. Their corresponding extended binary
trees are shown in Figure 2. The square nodes in an extended binary tree are
called external nodes. The original (circular) nodes of the binary tree are called
internal nodes.

Figure 1: example binary trees. Figure 2: the corresponding extended binary trees.

Let x be a node in an extended binary tree. Let left-child(x) and
right-child(x), respectively, denote the left and right children of the
internal node x. Define shortest(x) to be the length of a shortest path from x
to an external node. It is easy to see that shortest(x) satisfies the
following recurrence:

shortest(x) = 0, if x is an external node
shortest(x) = 1 + min{shortest(left-child(x)), shortest(right-child(x))}, otherwise


Definition: A leftist tree is a binary tree such that if it is not empty, then
shortest(left-child(x)) ≥ shortest(right-child(x))
for every internal node x.

Figure (a) is not a leftist tree, as shortest(left-child(C)) = 0 while
shortest(right-child(C)) = 1. The binary tree of Figure (b) is a leftist tree.
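
The definition can be checked mechanically. In this sketch a child of None stands for an external node; the Node class, the function names, and the two small test trees are assumptions for illustration and are not the trees from the figures.

```python
class Node:
    """Internal node; a child of None stands for an external (square) node."""
    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right

def shortest(x):
    if x is None:                        # external node
        return 0
    return 1 + min(shortest(x.left), shortest(x.right))

def is_leftist(x):
    if x is None:
        return True
    return (shortest(x.left) >= shortest(x.right)
            and is_leftist(x.left) and is_leftist(x.right))

right_heavy = Node(left=None, right=Node())  # left child external, right internal
left_heavy = Node(left=Node(), right=None)
print(is_leftist(right_heavy), is_leftist(left_heavy))  # False True
```

The first tree fails for the same reason as node C in Figure (a): its left child has shortest value 0 while its right child has shortest value 1.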

Min/Max LEFTIST TREE
Min leftist Tree: A min leftist tree is a leftist tree in which the key value in each
node is no larger than the key values in its children.

Max leftist Tree: A max leftist tree is a leftist tree in which the key value in each
node is no smaller than the key values in its children.

TYPES OF LEFTIST TREE
Height Biased Leftist Tree
Weight Biased Leftist Tree

Height Biased Leftist Tree: A binary tree is a Height Biased Leftist Tree
(HBLT) if and only if, at every internal node, the s value of the left child is
greater than or equal to the s value of the right child. Here the s value of a
node x is shortest(x).
Example: The tree shown in the figure is not an HBLT, as the s value of the left
child of internal node Q is 0 while the s value of its right child is 1.

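Weight Biased Leftist Tree: A binary tree is a Weight Biased Leftist Tree
(WBLT) if and only if, at every internal node, the w value of the left child is
greater than or equal to the w value of the right child. Here the w value of a
node is the number of internal nodes in the subtree rooted at that node.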
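
The weight-biased condition can be sketched in the same style. The w value is taken here to be the number of internal nodes in a subtree, with None standing for an external node; the Node class, the function names, and the sample trees are assumptions for illustration.

```python
class Node:
    """Internal node; a child of None stands for an external node."""
    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right

def w(x):
    """Weight: number of internal nodes in the subtree rooted at x."""
    if x is None:                        # external node
        return 0
    return 1 + w(x.left) + w(x.right)

def is_wblt(x):
    if x is None:
        return True
    return w(x.left) >= w(x.right) and is_wblt(x.left) and is_wblt(x.right)

tree = Node(left=Node(left=Node()), right=Node())
print(w(tree), is_wblt(tree))                  # 4 True
print(is_wblt(Node(left=None, right=Node())))  # False
```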