
SUBJECT: DATA STRUCTURES AND ALGORITHMS

UNIT : 1 (HASHING)

Presented by

Ratnakar S Jagale
ASSISTANT PROFESSOR
Dept. of Computer Engg.
GESCOE Nashik

EXAMINATION SCHEME
Theory :
In-Sem Exam : 30
End-Sem Exam : 70
Total : 100

HASHING
Hashing is a technique that uses a special function,
called the hash function, to map a given key to a
particular location in a table for faster access to
elements. The efficiency of the mapping depends on the
efficiency of the hash function used.

Time Complexity : O(1) on average (O(n) in the worst case, when many keys collide)

HASHING EXAMPLE

HASH TABLE

A hash table is a data structure used for storing and
retrieving data quickly. Every entry in the hash table is
placed using a hash function.

HASH FUNCTION
A hash function is a function used to place data in the
hash table.
Similarly, the hash function is used to retrieve data from
the hash table.

BUCKET
The hash function H(key) is used to map dictionary
entries to positions in the hash table.
Each position of the hash table is called a bucket.
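
As a minimal sketch of how a hash function assigns keys to buckets, the snippet below assumes the division rule H(key) = key mod 10 that the worked examples of this unit use. The names TABLE_SIZE, hash_key and table are hypothetical and chosen only for illustration.

```python
TABLE_SIZE = 10  # assumed table size, matching the "mod 10" examples in this unit


def hash_key(key: int) -> int:
    """Map an integer key to a bucket index (division rule, assumed here)."""
    return key % TABLE_SIZE


# One bucket per table position; None marks an empty bucket.
table = [None] * TABLE_SIZE

for key in (131, 3, 4):
    table[hash_key(key)] = key  # these three keys land in different buckets

print(table)
```

Retrieval uses the same function: the bucket to inspect for a key is table[hash_key(key)].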

COLLISION
A collision is a situation in which the hash function returns
the same address for more than one record.

PROBE, SYNONYM

Probe : Each calculation of an address and test for
success is known as a probe.

Synonym : The set of keys that hash to the same location
are called synonyms.
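
To make the terms collision and synonym concrete, the following sketch groups a list of keys by their hash address. It assumes the rule key mod 10 and borrows the key list from the chaining example of this unit; the function name synonym_groups is hypothetical.

```python
from collections import defaultdict

TABLE_SIZE = 10  # assumed table size


def synonym_groups(keys, size=TABLE_SIZE):
    """Group keys by hash address; a group with two or more keys is a set of synonyms."""
    groups = defaultdict(list)
    for key in keys:
        groups[key % size].append(key)
    # Keep only the addresses where a collision would occur.
    return {address: group for address, group in groups.items() if len(group) > 1}


print(synonym_groups([131, 3, 4, 21, 61, 6, 71, 8, 9]))
```

Every key after the first in such a group causes a collision when it is inserted.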

PROBE, SYNONYM, OVERFLOW
Overflow : When the hash table becomes full and a new record
needs to be inserted, it is called overflow.

PERFECT HASH FUNCTION
Perfect hash function : A perfect hash function is a
function that maps distinct keys into the hash table
with no collisions.

Advantages of perfect hash function

1. A perfect hash function over a limited set of elements
can be used for efficient lookup operations.
2. There is no need to apply a collision resolution
technique.
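
One simple way to see what "no collision" means for a fixed set of keys is to search for a table size at which the rule key mod m is collision-free. This is only an illustrative sketch under that assumption, not a general perfect hashing construction; the function name find_perfect_modulus is hypothetical.

```python
def find_perfect_modulus(keys):
    """Return the smallest m >= len(keys) for which key % m has no collisions.

    Assumes the keys are distinct integers; the search then always terminates.
    """
    keys = list(keys)
    if len(set(keys)) != len(keys):
        raise ValueError("keys must be distinct")
    m = len(keys)
    while True:
        addresses = {key % m for key in keys}
        if len(addresses) == len(keys):  # every key got its own address
            return m
        m += 1


keys = [131, 21, 31, 4, 5, 2]
m = find_perfect_modulus(keys)
print(m, sorted(key % m for key in keys))
```

The price of avoiding collisions this way is a table that may be larger than the number of keys.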

HASH FUNCTIONS

The following types of hash functions are available.
1. Division method.
2. Multiplication method.
3. Mid square method.
4. Extraction method.
5. Folding method.
6. Universal method.
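
The sketch below shows common textbook formulations of four of the listed methods. They are assumptions for illustration and are not necessarily the exact variants meant in this unit. Keys are assumed to be non-negative integers, and all function and parameter names are hypothetical. Extraction and universal hashing are left out.

```python
import math


def division_hash(key: int, m: int) -> int:
    """Division method: remainder of the key divided by the table size m."""
    return key % m


def multiplication_hash(key: int, m: int, a: float = (math.sqrt(5) - 1) / 2) -> int:
    """Multiplication method: floor(m * fractional part of key * a), with 0 < a < 1."""
    fractional = (key * a) % 1.0
    return int(m * fractional)


def mid_square_hash(key: int, digits: int) -> int:
    """Mid square method: square the key and keep `digits` digits from the middle."""
    squared = str(key * key)
    start = max(0, (len(squared) - digits) // 2)
    return int(squared[start:start + digits])


def folding_hash(key: int, part_len: int, m: int) -> int:
    """Folding method: split the decimal digits into parts, add them, reduce mod m."""
    text = str(key)
    parts = [int(text[i:i + part_len]) for i in range(0, len(text), part_len)]
    return sum(parts) % m


print(division_hash(131, 10))         # 131 mod 10
print(mid_square_hash(3111, 3))       # middle three digits of 3111 squared
print(folding_hash(123456, 2, 1000))  # (12 + 34 + 56) mod 1000
```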

1. DIVISION METHOD

2. MULTIPLICATION METHOD

3. MID SQUARE METHOD

4. EXTRACTION METHOD

5. FOLDING METHOD

6. UNIVERSAL HASHING METHOD

PROPERTIES OF GOOD HASH FUNCTION
The hash function should be simple to compute.
The number of collisions should be small while placing
records in the hash table. Ideally no collision should
occur. Such a function is called a perfect hash function.
The hash function should produce addresses that are
distributed uniformly over the array.
The hash function should depend on every bit of the
key. Thus a hash function that simply extracts a
portion of the key is not suitable.

COLLISION RESOLUTION STRATEGIES
If collisions occur, they should be handled by
applying some techniques. Such techniques are called
collision handling techniques.

OPEN AND CLOSED HASHING
Open hashing is also called separate chaining.
Closed hashing is also called open
addressing.
In open hashing the colliding records are stored outside the
table.
In closed hashing the colliding records are stored in the same
table at some other slot.
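
As a rough sketch of open hashing (separate chaining), each table position below holds a Python list that plays the role of the chain stored outside the table. The rule key mod 10 and the names build_chained_table and search are assumptions made for illustration.

```python
TABLE_SIZE = 10  # assumed table size


def build_chained_table(keys, size=TABLE_SIZE):
    """Open hashing: every slot owns a separate chain (here a Python list)."""
    table = [[] for _ in range(size)]
    for key in keys:
        table[key % size].append(key)  # colliding keys share one chain
    return table


def search(table, key):
    """Only the chain at the key's home slot has to be scanned."""
    return key in table[key % len(table)]


table = build_chained_table([131, 3, 4, 21, 61, 6, 71, 8, 9])
print(table[1])           # chain of all keys whose address is 1
print(search(table, 61))  # key that was inserted
print(search(table, 11))  # key that was never inserted
```

In closed hashing, by contrast, the colliding key would be written into another slot of the same array.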

CHAINING WITHOUT REPLACEMENT
A separate chain table is maintained for colliding data.
When a collision occurs, we store the second colliding element
in the next free slot found by linear probing.
The address of this colliding element is stored with the first
colliding element in the chain table, without replacement.
For Example : 131, 3, 4, 21, 61, 6, 71, 8, 9

The table after each insertion is shown below. A chain value of -1 means that no further element is linked.

131 mod 10 = 1

Index  Data  Chain
0            -1
1      131   -1
2            -1
3            -1
4            -1
5            -1
6            -1
7            -1
8            -1
9            -1

3 mod 10 = 3

Index  Data  Chain
0            -1
1      131   -1
2            -1
3      3     -1
4            -1
5            -1
6            -1
7            -1
8            -1
9            -1

4 mod 10 = 4

Index  Data  Chain
0            -1
1      131   -1
2            -1
3      3     -1
4      4     -1
5            -1
6            -1
7            -1
8            -1
9            -1

21 mod 10 = 1 (Collision)

Index  Data  Chain
0            -1
1      131   2
2      21    -1
3      3     -1
4      4     -1
5            -1
6            -1
7            -1
8            -1
9            -1

61 mod 10 = 1 (Collision)

Index  Data  Chain
0            -1
1      131   2
2      21    5
3      3     -1
4      4     -1
5      61    -1
6            -1
7            -1
8            -1
9            -1

6 mod 10 = 6

Index  Data  Chain
0            -1
1      131   2
2      21    5
3      3     -1
4      4     -1
5      61    -1
6      6     -1
7            -1
8            -1
9            -1

71 mod 10 = 1 (Collision)

Index  Data  Chain
0            -1
1      131   2
2      21    5
3      3     -1
4      4     -1
5      61    7
6      6     -1
7      71    -1
8            -1
9            -1

8 mod 10 = 8

Index  Data  Chain
0            -1
1      131   2
2      21    5
3      3     -1
4      4     -1
5      61    7
6      6     -1
7      71    -1
8      8     -1
9            -1

9 mod 10 = 9

Index  Data  Chain
0            -1
1      131   2
2      21    5
3      3     -1
4      4     -1
5      61    7
6      6     -1
7      71    -1
8      8     -1
9      9     -1

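
The insertion steps of chaining without replacement can be sketched as follows. This is one possible reading of the method, assuming the rule key mod 10, a data array, and a parallel chain array in which -1 means "no link". The function names are hypothetical.

```python
def insert_without_replacement(data, chain, key):
    """Insert key; on collision use the next free slot and link it to the chain."""
    size = len(data)
    home = key % size
    if data[home] is None:
        data[home] = key
        return home
    # Collision: look for the next free slot by linear probing.
    slot = (home + 1) % size
    while data[slot] is not None:
        slot = (slot + 1) % size
        if slot == home:
            raise OverflowError("hash table is full")
    data[slot] = key
    # Follow the chain that starts at the home slot and link the new slot at its end.
    tail = home
    while chain[tail] != -1:
        tail = chain[tail]
    chain[tail] = slot
    return slot


def build_without_replacement(keys, size=10):
    data = [None] * size
    chain = [-1] * size
    for key in keys:
        insert_without_replacement(data, chain, key)
    return data, chain


data, chain = build_without_replacement([131, 3, 4, 21, 61, 6, 71, 8, 9])
for index in range(len(data)):
    print(index, data[index], chain[index])
```

For the key list 131, 3, 4, 21, 61, 6, 71, 8, 9 the sketch ends with 131, 21, 61 and 71 linked through slots 1, 2, 5 and 7.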
CHAINING WITH REPLACEMENT
Chaining without replacement loses the
meaning of the hash function.
To overcome this drawback, the method
known as chaining with replacement is
introduced.
The advantage of this method is that the
meaning of the hash function is preserved.
Each time, some logic is needed to test whether
the element is at its proper position.
For Example : 131, 21, 31, 4, 5, 2

131 mod 10 = 1

Index  Data  Chain
0            -1
1      131   -1
2            -1
3            -1
4            -1
5            -1
6            -1
7            -1
8            -1
9            -1

21 mod 10 = 1 (Collision)

Index  Data  Chain
0            -1
1      131   2
2      21    -1
3            -1
4            -1
5            -1
6            -1
7            -1
8            -1
9            -1

31 mod 10 = 1 (Collision)

Index  Data  Chain
0            -1
1      131   2
2      21    3
3      31    -1
4            -1
5            -1
6            -1
7            -1
8            -1
9            -1

4 mod 10 = 4

Index  Data  Chain
0            -1
1      131   2
2      21    3
3      31    -1
4      4     -1
5            -1
6            -1
7            -1
8            -1
9            -1

5 mod 10 = 5

Index  Data  Chain
0            -1
1      131   2
2      21    3
3      31    -1
4      4     -1
5      5     -1
6            -1
7            -1
8            -1
9            -1

2 mod 10 = 2

Slot 2 currently holds 21, which does not belong at that
position (21 mod 10 = 1). Hence we replace 21 by 2, move 21
to the next free slot (6), and update the chain table
accordingly.

Index  Data  Chain
0            -1
1      131   6
2      2     -1
3      31    -1
4      4     -1
5      5     -1
6      21    3
7            -1
8            -1
9            -1

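
A possible sketch of chaining with replacement is given below. It assumes the rule key mod 10, a data array and a parallel chain array with -1 as "no link". When the home slot is held by an element that does not belong there, that element is moved to the next free slot and its chain is repaired. All names are hypothetical.

```python
def find_free(data, start):
    """Next free slot after `start`, found by linear probing with wrap-around."""
    size = len(data)
    for step in range(1, size):
        slot = (start + step) % size
        if data[slot] is None:
            return slot
    raise OverflowError("hash table is full")


def insert_with_replacement(data, chain, key):
    size = len(data)
    home = key % size
    if data[home] is None:
        data[home] = key
        return
    occupant_home = data[home] % size
    slot = find_free(data, home)
    if occupant_home == home:
        # The occupant belongs here: append the new key to the end of its chain.
        data[slot] = key
        tail = home
        while chain[tail] != -1:
            tail = chain[tail]
        chain[tail] = slot
    else:
        # The occupant is out of place: move it, keeping its own link.
        data[slot], chain[slot] = data[home], chain[home]
        # Redirect the link that pointed at the home slot to the new slot.
        pred = occupant_home
        while chain[pred] != home:
            pred = chain[pred]
        chain[pred] = slot
        data[home], chain[home] = key, -1


data = [None] * 10
chain = [-1] * 10
for key in [131, 21, 31, 4, 5, 2]:
    insert_with_replacement(data, chain, key)
for index in range(len(data)):
    print(index, data[index], chain[index])
```

With the keys 131, 21, 31, 4, 5, 2 the last insertion moves 21 from slot 2 to slot 6, so that 2 sits at its proper position.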
OPEN ADDRESSING
1. Linear Probing
2. Quadratic Probing
3. Double Hashing

1. LINEAR PROBING
When a collision occurs, it can be
resolved by placing the second record
linearly down, in the first empty
location that is found.

LINEAR PROBING EXAMPLE
For Example : 131, 21, 31, 4, 5, 2

131 mod 10 = 1

Index  Data
0
1      131
2
3
4
5
6
7
8
9

21 mod 10 = 1 (Collision)

Index  Data
0
1      131
2      21
3
4
5
6
7
8
9

31 mod 10 = 1 (Collision)

Index  Data
0
1      131
2      21
3      31
4
5
6
7
8
9

4 mod 10 = 4

Index  Data
0
1      131
2      21
3      31
4      4
5
6
7
8
9

5 mod 10 = 5

Index  Data
0
1      131
2      21
3      31
4      4
5      5
6
7
8
9

2 mod 10 = 2 (Collision)

Index  Data
0
1      131
2      21
3      31
4      4
5      5
6      2
7
8
9

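
The linear probing example can be reproduced with a short sketch. It assumes the rule key mod 10 and wrap-around from the last slot to slot 0; the function name linear_probe_insert is hypothetical.

```python
def linear_probe_insert(table, key):
    """Place key in its home slot or in the first free slot after it."""
    size = len(table)
    home = key % size
    for i in range(size):
        slot = (home + i) % size  # move linearly down, wrapping around
        if table[slot] is None:
            table[slot] = key
            return slot
    raise OverflowError("hash table is full")


table = [None] * 10
for key in [131, 21, 31, 4, 5, 2]:
    linear_probe_insert(table, key)
print(table)
```

Key 2 ends up in slot 6 because slots 2 to 5 are already occupied, which is the kind of block formation linear probing tends to produce.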
PROBLEM WITH LINEAR PROBING
One problem with linear probing is primary clustering.
Primary clustering is a process in which a block of
occupied slots is formed in the hash table when
collisions are resolved.

PRIMARY CLUSTERING EXAMPLE
For Example : 19, 18, 39, 29, 8

19 mod 10 = 9
18 mod 10 = 8
39 mod 10 = 9
29 mod 10 = 9
8 mod 10 = 8

Index  Data
0      39
1      29
2      8
3
4
5
6
7
8      18
9      19

The primary clustering problem can be reduced by quadratic probing.

2. QUADRATIC PROBING
Quadratic probing operates by taking the original hash
value and adding successive values of an arbitrary
quadratic polynomial to the starting value.
Formula :
Hi(key) = (Hash(key) + i^2) % m
where i = 0, 1, 2, ... is the probe number and m is the table size.
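
The formula can be tried out with the sketch below. It assumes Hash(key) = key mod m and reuses the keys of the primary clustering example; the function name quadratic_probe_insert is hypothetical.

```python
def quadratic_probe_insert(table, key):
    """Probe slots (Hash(key) + i*i) % m for i = 0, 1, 2, ... until one is free."""
    m = len(table)
    home = key % m
    for i in range(m):
        slot = (home + i * i) % m
        if table[slot] is None:
            table[slot] = key
            return slot
    # Unlike linear probing, the sequence need not visit every slot.
    raise OverflowError("no free slot found on the quadratic probe sequence")


table = [None] * 10
for key in [19, 18, 39, 29, 8]:
    quadratic_probe_insert(table, key)
print(table)
```

Compared with linear probing, the colliding keys 29 and 8 are spread to slots 3 and 2 instead of extending one contiguous block.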

3. DOUBLE HASHING
Double hashing is a technique in which a second
hash function is applied to the key when a collision
occurs.
By applying the second hash function we get
the number of positions to move from the point of
collision to insert.

H1 (Key) = Key mod table size

H2 (Key) = M - (Key mod M)

Here M is usually taken as a prime number smaller than the table size.
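
A small sketch of the two functions in use follows. It assumes a table of size 10, M = 7 (a prime smaller than the table size, chosen here only for illustration) and the probe sequence (H1 + i * H2) mod table size. The function name double_hash_insert is hypothetical.

```python
def double_hash_insert(table, key, prime=7):
    """Insert key using H1 = key mod size and H2 = M - (key mod M) as step size."""
    size = len(table)
    h1 = key % size
    h2 = prime - (key % prime)  # always between 1 and M, so never zero
    for i in range(size):
        slot = (h1 + i * h2) % size
        if table[slot] is None:
            table[slot] = key
            return slot
    # With a non-prime table size the step may not reach every slot.
    raise OverflowError("no free slot found on the double hashing probe sequence")


table = [None] * 10
for key in [131, 21, 31, 4, 5, 2]:
    double_hash_insert(table, key)
print(table)
```

Here 21 and 31 both collide at slot 1 but use different steps (7 and 4), so they land in slots 8 and 5 rather than next to each other.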

REHASHING
1. Rehashing is a technique in which the table is resized.
2. The size of the table is doubled by creating a new table.
3. It is preferable if the total size of the table is a prime number.
4. Rehashing is required in the following situations:
I. When the table is completely full.
II. With quadratic probing, when the table is half full.
III. When an insertion fails due to overflow.
5. In such situations, we have to transfer the entries from the old table to
the new table by recalculating their positions.
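
The points above can be sketched as follows. The sketch assumes linear probing for collisions, the rule key mod table size, and a new size equal to the first prime at or above twice the old size. The function names are hypothetical.

```python
def next_prime(n):
    """Smallest prime number greater than or equal to n."""
    def is_prime(x):
        if x < 2:
            return False
        divisor = 2
        while divisor * divisor <= x:
            if x % divisor == 0:
                return False
            divisor += 1
        return True

    while not is_prime(n):
        n += 1
    return n


def insert(table, key):
    """Linear probing insert, used for both the old and the new table."""
    size = len(table)
    for i in range(size):
        slot = (key % size + i) % size
        if table[slot] is None:
            table[slot] = key
            return slot
    raise OverflowError("hash table is full")


def rehash(old_table):
    """Create a table of about double the size and re-insert every entry."""
    new_table = [None] * next_prime(2 * len(old_table))
    for key in old_table:
        if key is not None:
            insert(new_table, key)  # position is recalculated with the new size
    return new_table


old = [None] * 10
for key in [131, 3, 4, 21, 61, 6, 71, 8, 9]:
    insert(old, key)
new = rehash(old)
print(len(new), new)
```

In the old table 21, 61 and 71 were displaced from slot 1; after rehashing to size 23 every key of this example sits at its own address.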

REHASHING
For Example : 131, 3, 4, 21, 61, 6, 71, 8, 9

131 mod 10 = 1

ADVANTAGES OF REHASHING
1. This technique gives the programmer the flexibility
to enlarge the table size if required.

2. Only the space gets doubled, while the hash function
stays simple, which reduces the occurrence of collisions.

APPLICATIONS OF HASHING
1. In compilers, to keep track of declared variables.
2. Hashing functions are used for online spelling
checking.
3. Hashing helps game playing programs to store the
moves made.
4. Browser programs use hashing while caching web
pages.

APPLICATIONS OF DICTIONARY

1. Student registration application.
2. Telephone directory.
3. Word dictionary.
4. Symbol table used in a compiler.
