
HASH TABLE

Hash tables are an extremely effective and practical way of implementing dictionaries.

A hash table or hash map is a data structure that uses a hash function to map
identifying values, known as keys (e.g., a person's name), to their
associated values (e.g., their telephone number).

The hash function is used to transform the key into the index of the slot or bucket
where the corresponding value is to be stored.

Hashing:

The process of mapping keys to their respective positions is called hashing.

It is a technique used to perform insertion, deletion and search in constant average time.

Hash function, h(k):

A hash function h(k) maps each key k to a slot index in the range 0 to (m-1),

where m is the table size.

Types of Hash functions:

1. Division method
2. Multiplication method
3. Universal method

Division Method:

Here, h(k)=k mod m,

where m is the table size and k is the key to be inserted.

Example:

Table size, m=20
Key, k=91
Now, h(k)=k mod m
= 91 mod 20
= 11.
Thus, key 91 has to be placed in slot 11.
Take one more example:

Insert Keys= 42,10,22,33


Table size, m = 7.
Hash function h(k)= k mod m
h(42)= 42 mod 7=0
h(10)= 10 mod 7=3
h(22)= 22 mod 7=1
h(33)= 33 mod 7=5.

Note:

Suppose we have another key to insert, 7.

Then, h(7)= 7 mod 7=0.
Slot 0 is already occupied by key 42, so this leads to a collision.
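
The division method and the collision in the Note can be checked with a few lines of Python. This is only a minimal sketch; the function names division_hash and find_collisions are chosen here for illustration and are not part of the original notes.

```python
def division_hash(key: int, m: int) -> int:
    """Division-method hash: h(k) = k mod m."""
    return key % m


def find_collisions(keys, m):
    """Return (key, slot, earlier_key) for every key whose home slot is already taken."""
    owner = {}        # slot -> first key that hashed to it
    collisions = []
    for k in keys:
        slot = division_hash(k, m)
        if slot in owner:
            collisions.append((k, slot, owner[slot]))
        else:
            owner[slot] = k
    return collisions


print(division_hash(91, 20))                             # 11
print([division_hash(k, 7) for k in (42, 10, 22, 33)])   # [0, 3, 1, 5]
print(find_collisions([42, 10, 22, 33, 7], 7))           # [(7, 0, 42)]
```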
Advantage of division method:

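It is fast, since it requires just one division operation.

Disadvantage:

Certain values of m have to be avoided. Powers of 2 are bad: if m = 2^p, then h(k) is just the p lowest-order bits of k, so the remaining bits of the key are ignored.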
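
The following sketch illustrates why a power of 2 is a poor table size for the division method. The key set (multiples of 8) is an assumption picked to make the effect visible: with m = 8 every key lands in the same slot, while the prime m = 7 spreads them out.

```python
def division_hash(key: int, m: int) -> int:
    """Division-method hash: h(k) = k mod m."""
    return key % m


keys = [8, 16, 24, 32, 40, 48]      # illustrative keys: all multiples of 8

slots_pow2 = [division_hash(k, 8) for k in keys]    # m = 2^3
slots_prime = [division_hash(k, 7) for k in keys]   # m = 7 (prime)

# k mod 8 only looks at the lowest 3 bits of k.
low_bits_only = all(k % 8 == (k & 0b111) for k in keys)

print(slots_pow2)    # [0, 0, 0, 0, 0, 0]
print(slots_prime)   # [1, 2, 3, 4, 5, 6]
print(low_bits_only) # True
```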

There are two main issues in hashing.

Collision:

If two or more keys demand the same slot in the hash table, a collision occurs,
i.e., h(k1)=h(k2) for two different keys k1 and k2. (Shown in the Note of the above example.)
To resolve collisions there are 2 types of collision resolution techniques.
1. Open hashing.
2. Closed hashing.

Choosing a hash function:

Choosing a hash function that minimizes the number of collisions and
hashes keys uniformly is the other critical issue.
1) Open hashing (Separate Chaining):

Open hashing is also called separate chaining.

In the strategy known as separate chaining, direct chaining, or simply chaining, each
slot of the table is a pointer to a linked list that contains the key-value pairs that
hashed to the same location.

Insertion requires adding a new entry record to the end of the list belonging
to the hashed slot (index of the table).

Deletion requires searching the list and removing the element.

The next pointer of the last key in each bucket or slot is set to null.

Example:

Keys : 0,1,4,2,16,25,36,49,64,81,100

Table size m= 7

Hash function h(k)= k mod m.

h(0)=0 mod 7=0

h(1)=1 mod 7=1

h(4)=4 mod 7=4

h(2)=2 mod 7=2

h(16)=16 mod 7=2

h(25)=25 mod 7=4

h(36)=36 mod 7=1

h(49)=49 mod 7=0

h(64)=64 mod 7=1

h(81)=81 mod 7=4

h(100)=100 mod 7=2


We perform operations like insertion, deletion and searching of an element in the hash
table.

The worst case complexity of all these operations is O(n).

The best case of all these operations is O(1).

The average case of all the operations is O(1 + α),

where α = load factor n/m.

n -> number of elements stored.

m -> number of buckets (table size).

Disadvantage: Additional space for pointers.
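
A minimal sketch of separate chaining for the example keys is given below. It is an illustration under stated assumptions, not a reference implementation: the class name ChainedHashTable and its methods are hypothetical, and each chain is a Python list standing in for the linked list described above.

```python
class ChainedHashTable:
    """Separate chaining with h(k) = k mod m; Python lists stand in for linked lists."""

    def __init__(self, m):
        self.m = m
        self.buckets = [[] for _ in range(m)]   # one chain per slot
        self.n = 0                              # number of stored keys

    def _slot(self, key):
        return key % self.m

    def insert(self, key, value=None):
        chain = self.buckets[self._slot(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:                        # key already present: update it
                chain[i] = (key, value)
                return
        chain.append((key, value))              # otherwise add to the end of the chain
        self.n += 1

    def search(self, key):
        for k, v in self.buckets[self._slot(key)]:
            if k == key:
                return v
        raise KeyError(key)

    def delete(self, key):
        chain = self.buckets[self._slot(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                self.n -= 1
                return
        raise KeyError(key)

    def load_factor(self):
        return self.n / self.m                  # alpha = n / m


table = ChainedHashTable(7)
for k in [0, 1, 4, 2, 16, 25, 36, 49, 64, 81, 100]:
    table.insert(k, str(k))

chains = [[k for k, _ in chain] for chain in table.buckets]
print(chains)
# [[0, 49], [1, 36, 64], [2, 16, 100], [], [4, 25, 81], [], []]
```

Here 11 keys are stored in 7 buckets, so the load factor is 11/7, and a search only walks the one chain the key hashes to.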


2) Closed Hashing (Open Addressing):

In closed hashing all elements are stored in the hash table itself.

It avoids pointers; it only computes the sequence of slots to be examined.

Here two keys that hash to the same bucket cannot both be placed there.

The key which gets the bucket first stays there, and the other must be placed in an
empty bucket below the home bucket.

Note: Here the search for the next bucket is done through the hash table in a
circular manner.

Example:

Let the keys be 42, 10, 22, 34, 17, 20

Table size, m =7

Then h(k)=k mod m

h(42)= 42 mod 7=0

h(10)= 10 mod 7=3

h(22)=22 mod 7=1

h(34)=34 mod 7=6

h(17)= 17 mod 7=3

h(20)= 20 mod 7=6

Resulting hash table:

Index   Key
0       42
1       22
2       20
3       10
4       17
5       (empty)
6       34

In the above example, 42, 10, 22 and 34 are placed without any problem.

But while inserting 17, its slot (3) is already filled. So we search for an empty slot below
it and place 17 there (slot 4).

Again for 20, its slot (6) is the last slot and is already filled, so we continue the search in a circular
manner from slot 0 and place it in the first empty slot we find (slot 2).
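
The circular search for an empty bucket can be sketched as follows. The names EMPTY, insert_circular and search_circular are hypothetical, and the search routine assumes that no keys have been deleted from the table.

```python
EMPTY = None


def insert_circular(table, key):
    """Put key in its home slot, or in the next free slot below it, wrapping around."""
    m = len(table)
    home = key % m
    for step in range(m):
        slot = (home + step) % m
        if table[slot] is EMPTY:
            table[slot] = key
            return slot
    raise RuntimeError("hash table is full")


def search_circular(table, key):
    """Return the slot holding key, or -1 (assumes no deletions have happened)."""
    m = len(table)
    home = key % m
    for step in range(m):
        slot = (home + step) % m
        if table[slot] is EMPTY:      # an empty slot ends the search
            return -1
        if table[slot] == key:
            return slot
    return -1


table = [EMPTY] * 7
for k in [42, 10, 22, 34, 17, 20]:
    insert_circular(table, k)

print(table)                       # [42, 22, 20, 10, 17, None, 34]
print(search_circular(table, 20))  # 2
```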

Disadvantage:

Performance degrades as the table fills up, because it becomes harder to find an empty bucket.

So we implement some rehashing methods.

Rehashing methods:
Collisions are handled by generating a sequence of rehash values.

Types of rehashing methods:

Let h(k,0) be h(k). Here we use the division method as the hash function.

1. Linear probing
h(k,i)=(h(k)+i) mod m

2. Quadratic probing
h(k,i)=(h(k)+c1 i+c2 i^2) mod m
where c1, c2 are constants and c2 != 0.

So, if c1=0 and c2=1 then
h(k,i)=(h(k)+ i^2) mod m

3. Double hashing
h(k,i)=(h(k)+i g(k)) mod m.

Here g(k)=R - (k mod R),
where R is the largest prime smaller than the table size m.
Linear Probing:

Here we insert the keys 76, 93, 40, 47, 10, 55.

Table size is 7.

Probes are the number of attempts made for inserting a key.

Linear probing, h(k,i)=(h(k)+i) mod m

Now see,

h(76,0)=(h(76)+0) mod 7 = 6 mod 7 = 6 (inserted)

h(93,0)=(h(93)+0) mod 7 = 2 mod 7 = 2 (inserted)

h(40,0)=(h(40)+0) mod 7 = 5 mod 7 = 5 (inserted)

h(47,0)=(h(47)+0) mod 7 = 5 mod 7 = 5 (collision)

So, h(47,1)=(h(47)+1) mod 7 = (5+1) mod 7 = 6 mod 7 = 6 (collision)

h(47,2)=(h(47)+2) mod 7 = (5+2) mod 7 = 7 mod 7 = 0 (inserted)

h(10,0)=(h(10)+0) mod 7 = 3 mod 7 = 3 (inserted)

h(55,0)=(h(55)+0) mod 7 = 6 mod 7 = 6 (collision)

So, h(55,1)=(h(55)+1) mod 7 = (6+1) mod 7 = 7 mod 7 = 0 (collision)

h(55,2)=(h(55)+2) mod 7 = (6+2) mod 7 = 8 mod 7 = 1 (inserted)

In this manner we fill up the hash table using linear probing technique.
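
The linear probing trace above can be reproduced with a short sketch that also counts the probes (attempts) needed for each key. The function name linear_probe_insert is an assumption made for this illustration.

```python
def linear_probe_insert(table, key):
    """Insert key using h(k, i) = (h(k) + i) mod m; return (slot, number of probes)."""
    m = len(table)
    for i in range(m):
        slot = (key % m + i) % m
        if table[slot] is None:
            table[slot] = key
            return slot, i + 1        # i collisions plus the successful attempt
    raise RuntimeError("hash table is full")


table = [None] * 7
probes = {}
for k in [76, 93, 40, 47, 10, 55]:
    _, probes[k] = linear_probe_insert(table, k)

print(table)    # [47, 55, 93, 10, None, 40, 76]
print(probes)   # {76: 1, 93: 1, 40: 1, 47: 3, 10: 1, 55: 3}
```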

Quadratic Probing:

Here we insert the keys 76,40,48,5,55.

Table size is 7
Probes are the number of attempts made for inserting a key.

Quadratic probing, h(k,i)=(h(k)+i^2) mod m

Now,

h(76,0)=(h(76)+0) mod 7 = 6 mod 7 = 6 (inserted)

h(40,0)=(h(40)+0) mod 7 = 5 mod 7 = 5 (inserted)

h(48,0)=(h(48)+0) mod 7 = 6 mod 7 = 6 (collision)

so, h(48,1)=(h(48)+1^2) mod 7 = 7 mod 7 = 0 (inserted)

h(5,0)=(h(5)+0) mod 7 = 5 mod 7 = 5 (collision)

so, h(5,1)=(h(5)+1^2) mod 7 = 6 mod 7 = 6 (collision)

h(5,2)=(h(5)+2^2) mod 7 = 9 mod 7 = 2 (inserted)

h(55,0)=(h(55)+0) mod 7 = 6 mod 7 = 6 (collision)

so, h(55,1)=(h(55)+1^2) mod 7 = 7 mod 7 = 0 (collision)

h(55,2)=(h(55)+2^2) mod 7 = 10 mod 7 = 3 (inserted)

In this manner we fill up the hash table using Quadratic probing technique.
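
A small sketch of quadratic probing with c1=0 and c2=1, applied to the same keys, is shown below. The function name quadratic_probe_insert and the choice to give up after m probes are assumptions for this illustration.

```python
def quadratic_probe_insert(table, key, c1=0, c2=1):
    """Insert key using h(k, i) = (h(k) + c1*i + c2*i^2) mod m.

    Returns the slot used, or None if no free slot is found within m probes.
    """
    m = len(table)
    home = key % m
    for i in range(m):
        slot = (home + c1 * i + c2 * i * i) % m
        if table[slot] is None:
            table[slot] = key
            return slot
    return None


table = [None] * 7
placed = {k: quadratic_probe_insert(table, k) for k in [76, 40, 48, 5, 55]}

print(placed)   # {76: 6, 40: 5, 48: 0, 5: 2, 55: 3}
print(table)    # [48, None, 5, 55, None, 40, 76]
```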

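If the table size is prime and the table is at most half full, quadratic probing will find an empty slot in size/2 probes or
fewer.

Beyond size/2 probes the sequence only revisits slots it has already examined, so if the table is more than half full, quadratic probing may fail to find an empty slot.

See the example below.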


Here, we insert keys 76, 93, 40, 35, 47.

Table size is 7.

h(76,0)=(h(76)+0) mod 7 = 6 mod 7 = 6 (inserted)

h(93,0)=(h(93)+0) mod 7 = 2 mod 7 = 2 (inserted)

h(40,0)=(h(40)+0) mod 7 = 5 mod 7 = 5 (inserted)

h(35,0)=(h(35)+0) mod 7 = 0 mod 7 = 0 (inserted)

But for 47, h(47,0)=(h(47)+0) mod 7 = 5 mod 7 = 5 (collision)

so, h(47,1)=(h(47)+1^2) mod 7 = 6 mod 7 = 6 (collision)

h(47,2)=(h(47)+2^2) mod 7 = 9 mod 7 = 2 (collision)

h(47,3)=(h(47)+3^2) mod 7 = 14 mod 7 = 0 (collision)

h(47,4)=(h(47)+4^2) mod 7 = 21 mod 7 = 0 (collision)

...and so on. The probe sequence only ever visits slots 5, 6, 2 and 0, which are all occupied, so 47 cannot be inserted using quadratic probing.
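
The failure can be confirmed by listing the slots that the quadratic probe sequence for 47 actually reaches. This is a sketch only; the helper quadratic_probe_slots and the cut-off of 14 probes are assumptions made for the illustration.

```python
def quadratic_probe_slots(key, m, max_probes):
    """Slots examined by h(k, i) = (h(k) + i^2) mod m for i = 0 .. max_probes-1."""
    home = key % m
    return [(home + i * i) % m for i in range(max_probes)]


occupied = {6, 2, 5, 0}                      # slots holding 76, 93, 40, 35
visited = quadratic_probe_slots(47, 7, 14)   # two full cycles of i

print(visited[:5])                             # [5, 6, 2, 0, 0]
print(sorted(set(visited)))                    # [0, 2, 5, 6]
print(sorted(set(range(7)) - set(visited)))    # [1, 3, 4] are never examined
```

The free slots 1, 3 and 4 are never reached, which is why the insertion fails even though the table is not full.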

So we go for double hashing.

Double hashing:

Double hashing is done in the following manner.

h(k,i)=(h(k)+i g(k)) mod m.

Here g(k)=R - (k mod R),
where R is the largest prime smaller than the table size m.

After computing g(k), we place the key g(k) buckets or slots after the
home bucket h(k), proceeding in a circular manner through the hash table. If that slot is also occupied, we move a further g(k) slots, and so on.

Example:
Here the keys (k) are 76, 93, 40, 47, 10, 55.

Table size is 7.

Here the value of R will be 5.

h(76,0)=(h(76)+0) mod 7 = 6 mod 7 = 6 (inserted)

h(93,0)=(h(93)+0) mod 7 = 2 mod 7 = 2 (inserted)

h(40,0)=(h(40)+0) mod 7 = 5 mod 7 = 5 (inserted)

h(47,0)=(h(47)+0) mod 7 = 5 mod 7 = 5 (collision)

so, g(47)=R - (47 mod R)

=5-(47 mod 5)= 5-2=3

Now, 47 should be placed 3 buckets after the home bucket, proceeding in a circular
manner in the hash table:

h(47,1)=(h(47)+1*g(47)) mod 7 = (5+3) mod 7 = 8 mod 7 = 1 (inserted)

So, 47 will be placed in the bucket at index 1. (5 -> 6, 0, 1)

h(10,0)=(h(10)+0) mod 7 = 3 mod 7 = 3 (inserted)

h(55,0)=(h(55)+0) mod 7 = 6 mod 7 = 6 (collision)

so, g(55)=R - (55 mod R)

=5-(55 mod 5)= 5-0=5

Now, 55 should be placed 5 buckets after the home bucket, proceeding in a circular
manner in the hash table:

h(55,1)=(h(55)+1*g(55)) mod 7 = (6+5) mod 7 = 11 mod 7 = 4 (inserted)

So, 55 will be placed in the bucket at index 4. (6 -> 0, 1, 2, 3, 4)
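
The double hashing example can be sketched in Python as below, with R = 5 passed in explicitly. The function name double_hash_insert is hypothetical and the code is only an illustration of the formula h(k,i)=(h(k)+i g(k)) mod m.

```python
def double_hash_insert(table, key, R):
    """Insert key using h(k, i) = (h(k) + i*g(k)) mod m, with g(k) = R - (k mod R)."""
    m = len(table)
    home = key % m
    step = R - (key % R)              # g(k); always between 1 and R
    for i in range(m):
        slot = (home + i * step) % m
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("no free slot found")


table = [None] * 7
placed = {k: double_hash_insert(table, k, R=5) for k in [76, 93, 40, 47, 10, 55]}

print(placed)   # {76: 6, 93: 2, 40: 5, 47: 1, 10: 3, 55: 4}
print(table)    # [None, 47, 93, 10, 55, 40, 76]
```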


Hash Table Restructuring:
If the hash table is not large enough to perform the insertion operation, then we need to
restructure the hash table.

This is done by increasing the table size. If m is the current size of the hash table, then
m is replaced with R,

where R is the first prime number after 2m, i.e.,

R>2m.

For example, let 7 be the current size of the hash table.

Then 2m = 14 and the R value will be 17. So 17 is the new hash table size.

After restructuring the hash table, the keys which are already stored must be
mapped once again with respect to the new table size.
Example:

Index   Key
0       42
1       22
2       (empty)
3       10
4       (empty)
5       33
6       (empty)

This is the hash table to be restructured for inserting more values.

So, table size m=7.

For the new hash table, the table size will be 17 (17 is the first prime number after 14).

Then the values will be inserted as below.

h(42)= 42 mod 17=8

h(10)= 10 mod 17=10

h(22)= 22 mod 17=5

h(33)= 33 mod 17=16.

New hash table (all other slots are empty):

Index   Key
5       22
8       42
10      10
16      33
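
The restructuring step can be sketched as follows. The helper names is_prime, next_prime_after and restructure are assumptions made for this illustration, and linear probing is used in the new table only as a fallback in case two rehashed keys collide (which does not happen in this example).

```python
def is_prime(n):
    """Trial-division primality test, adequate for small table sizes."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def next_prime_after(n):
    """First prime strictly greater than n."""
    candidate = n + 1
    while not is_prime(candidate):
        candidate += 1
    return candidate


def restructure(old_table):
    """Build a table of size R (first prime after 2m) and re-insert every stored key."""
    new_size = next_prime_after(2 * len(old_table))
    new_table = [None] * new_size
    for key in old_table:
        if key is None:
            continue
        for i in range(new_size):             # linear probing as a fallback
            slot = (key % new_size + i) % new_size
            if new_table[slot] is None:
                new_table[slot] = key
                break
    return new_table


old = [42, 22, None, 10, None, 33, None]      # the table of size 7 shown above
new = restructure(old)

print(len(new))                                               # 17
print({i: k for i, k in enumerate(new) if k is not None})     # {5: 22, 8: 42, 10: 10, 16: 33}
```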
