
Assignment No. 1

Problem Statement:
Consider a telephone book database of N clients. Make use of a hash table implementation to
quickly look up a client's telephone number. Make use of two collision handling techniques
and compare them using the number of comparisons required to find a set of telephone numbers.

Theory:
Hashing
Hashing is a method of directly computing the address at which a record is to be stored, and
later located, from its key by using a suitable mathematical function called the hash
function.

Hash Function
A hash function is a mathematical function that converts an input key into a compressed
numerical value. The input to the hash function may be of arbitrary length, but the output is
always of fixed length. The address generated by the hash function is called the home address.

Fig. Hash Function
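
As a minimal sketch of the idea, the two functions below map a key to a home address in a fixed range. The names `hash_number` and `hash_name`, the default table size of 10, and the multiplier 31 are illustrative assumptions, not part of the assignment.

```python
def hash_number(key: int, table_size: int = 10) -> int:
    """Division-method hash for integer keys: home address = key % table_size."""
    return key % table_size


def hash_name(name: str, table_size: int = 10) -> int:
    """Map a name of any length to a home address in 0..table_size-1."""
    h = 0
    for ch in name:
        # Fold each character code into the running value and keep it bounded.
        h = (h * 31 + ord(ch)) % table_size
    return h


print(hash_number(35, 13))  # 9
```

Whatever the length of the input, the result always falls inside the table, which is the "fixed length output" property described above.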


Collision in Hashing
When two keys hash to the same array location, this is called a collision. Collisions are
normally treated as “first come, first served”: the first value that hashes to the location gets
it. We have to find something to do with the second and subsequent values that hash to this
same location.
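
A small sketch of how collisions can be detected, assuming integer keys and the division hash key % table_size. The helper name `find_collisions` is hypothetical.

```python
from collections import defaultdict


def find_collisions(keys, table_size):
    """Group keys by home address and keep only addresses hit more than once."""
    buckets = defaultdict(list)
    for key in keys:
        buckets[key % table_size].append(key)
    return {addr: ks for addr, ks in buckets.items() if len(ks) > 1}


# 35 % 13 and 9 % 13 are both 9, so these two keys collide.
print(find_collisions([18, 26, 35, 9, 64, 47], 13))  # {9: [35, 9]}
```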

Collision Resolution Strategies


• Separate chaining (or linked list)
• Open addressing
- Linear probing
- Quadratic probing
- Double hashing

Collision Resolution: Separate Chaining


When a collision occurs, elements with the same hash address are chained together. A chain
is simply a linked list of all the elements with the same hash address. The hash table slots no
longer hold a table element; each slot now holds the address of the first element of its chain.
Example:
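
A minimal sketch of separate chaining for the telephone book, assuming names as keys and a simple character-sum hash. The class names `Node` and `ChainedPhoneBook` and the sample records are made up for illustration. `find` also reports how many key comparisons it needed.

```python
class Node:
    """One element of a chain: a (name, number) record plus a link."""

    def __init__(self, name, number, next_node=None):
        self.name = name
        self.number = number
        self.next = next_node


class ChainedPhoneBook:
    def __init__(self, size=10):
        self.size = size
        self.slots = [None] * size  # each slot holds the head of a chain

    def _home(self, name):
        # Illustrative hash: sum of character codes modulo the table size.
        return sum(ord(ch) for ch in name) % self.size

    def insert(self, name, number):
        idx = self._home(name)
        node = self.slots[idx]
        while node is not None:
            if node.name == name:  # key already present: update in place
                node.number = number
                return
            node = node.next
        # Prepend the new record to the chain for this slot.
        self.slots[idx] = Node(name, number, self.slots[idx])

    def find(self, name):
        """Return (number, comparisons); number is None if the name is absent."""
        comparisons = 0
        node = self.slots[self._home(name)]
        while node is not None:
            comparisons += 1
            if node.name == name:
                return node.number, comparisons
            node = node.next
        return None, comparisons


book = ChainedPhoneBook(size=10)
book.insert("ab", "111")
book.insert("ba", "222")  # same character sum as "ab", so both share one chain
print(book.find("ab"))    # ('111', 2): "ba" was prepended, so "ab" is second
print(book.find("zz"))    # (None, 0): the home slot is empty
```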

Advantages
1) Simple to implement
2) The hash table never fills up; we can always add more elements to a chain.
3) Less sensitive to the hash function or load factor.
4) It is mostly used when it is unknown how many keys may be inserted or deleted, and how
frequently.

Disadvantages
1) Cache performance of chaining is not good, as keys are stored in linked lists. Open
addressing provides better cache performance, as everything is stored in the same table.
2) Wastage of space (some parts of the hash table are never used).
3) If a chain becomes long, search time can become O(n) in the worst case.
4) Uses extra space for links.

Open Addressing (Closed Hashing)


Open addressing is an array-based method for handling collisions in which all items are
stored in the hash table itself. So at any point, the size of the table must be greater than or
equal to the total number of keys. In addition to the cell data (if any), each cell keeps one of
three states: EMPTY, OCCUPIED, DELETED. The common open addressing techniques are:
i. Linear probing (linear search)
ii. Quadratic probing (nonlinear search)
iii. Double hashing (uses two hash functions)

Open addressing
In open addressing, when a collision occurs, it is resolved by finding an available empty
location other than the home address. If the location Hash(key) is not empty, further positions
are probed in a fixed sequence until an empty location is found.

When we reach the end of the table, the search wraps around to the start and continues until
it returns to the location where the collision occurred. The most important factors in
reducing collisions are the table size and the choice of hash function.

Probe sequence
A probe sequence is the sequence of array indexes that is followed in searching for an empty
cell during an insertion, or in searching for a key during find or delete operations. The most
common probe sequences are of the form:
h_i(key) = [h(key) + c(i)] % n, for i = 0, 1, 2, …, n-1,
where h is a hash function, n is the size of the hash table, and c(0) = 0, so that the first
probe is the home address.

Linear probing
A hash table in which a collision is resolved by putting the item in the next empty place
following the occupied place is said to use linear probing. This strategy examines successive
locations until a free one is found. The function that we can use for probing linearly from the
next location is as follows:
h_i(key) = (h(key) + i) % n

Note: For a given hash function h(key), the only difference in the open addressing collision
resolution techniques (linear probing, quadratic probing and double hashing) is in the
definition of the function p(i), written c(i) in the general probe-sequence formula. Common
definitions of p(i) are:

Collision resolution technique    p(i)
Linear probing                    i
Quadratic probing                 ±i^2
Double hashing                    i * hp(key)

where hp(key) is another hash function.
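
The three definitions of p(i) can be sketched as one function that lists the first few indexes probed for a key. This is an illustration under stated assumptions: h(key) = key % n, only the +i^2 variant of quadratic probing, and a secondary hash hp(key) = 7 - (key % 7), which is a common textbook choice and not taken from this assignment. The name `probe_sequence` is hypothetical.

```python
def probe_sequence(key, n, technique, limit=5):
    """Return the first `limit` indexes probed for `key` in a table of size n."""
    h = key % n          # primary hash h(key)
    hp = 7 - (key % 7)   # assumed secondary hash hp(key); never zero
    seq = []
    for i in range(limit):
        if technique == "linear":
            p = i
        elif technique == "quadratic":
            p = i * i    # the +i^2 variant only
        elif technique == "double":
            p = i * hp
        else:
            raise ValueError(f"unknown technique: {technique}")
        seq.append((h + p) % n)
    return seq


print(probe_sequence(35, 13, "linear"))     # [9, 10, 11, 12, 0]
print(probe_sequence(35, 13, "quadratic"))  # [9, 10, 0, 5, 12]
print(probe_sequence(35, 13, "double"))     # [9, 3, 10, 4, 11]
```

All three sequences start at the home address 9, because p(0) = 0 in each case; they differ only in how far each later probe jumps.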

Example:
Perform the operations given below, in the given order, on an initially empty hash table of
size 13 using linear probing with c(i) = i and the hash function: h(key) = key % 13:
insert(18), insert(26), insert(35), insert(9), find(15), find(48), delete(35), delete(40), find(9),
insert(64), insert(47), find(35)
The required probe sequences are given by:
h_i(key) = (h(key) + i) % 13, for i = 0, 1, 2, ..., 12
The final state of the table (O = OCCUPIED, E = EMPTY, D = DELETED) is:

Index   Status   Value
0       O        26
1       E
2       E
3       E
4       E
5       O        18
6       E
7       E
8       O        47
9       D        35
10      O        9
11      E
12      O        64
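
The example can be traced with a short sketch of linear probing that keeps the three cell states. The class name `LinearProbingTable` and its method names are illustrative assumptions; the hash function and the sequence of operations are those of the example.

```python
EMPTY, OCCUPIED, DELETED = "E", "O", "D"


class LinearProbingTable:
    def __init__(self, size=13):
        self.size = size
        self.status = [EMPTY] * size
        self.value = [None] * size

    def _probe(self, key):
        # h_i(key) = (key % size + i) % size, for i = 0 .. size-1
        home = key % self.size
        for i in range(self.size):
            yield (home + i) % self.size

    def find(self, key):
        """Return the index holding key, or None if it is absent."""
        for idx in self._probe(key):
            if self.status[idx] == EMPTY:
                return None  # an EMPTY cell ends the search
            if self.status[idx] == OCCUPIED and self.value[idx] == key:
                return idx
            # DELETED cells and cells holding other keys are skipped
        return None

    def insert(self, key):
        if self.find(key) is not None:
            return False  # duplicate key
        for idx in self._probe(key):
            if self.status[idx] != OCCUPIED:  # EMPTY or DELETED cells are reusable
                self.status[idx] = OCCUPIED
                self.value[idx] = key
                return True
        return False  # table is full

    def delete(self, key):
        idx = self.find(key)
        if idx is None:
            return False
        self.status[idx] = DELETED  # old value is left in place, as in the table
        return True


t = LinearProbingTable(13)
for k in (18, 26, 35, 9):
    t.insert(k)
print(t.find(15), t.find(48))  # None None
t.delete(35)
t.delete(40)                   # 40 is absent, so nothing changes
print(t.find(9))               # 10: the search passes the DELETED cell at index 9
t.insert(64)
t.insert(47)
print(t.find(35))              # None
for i in range(t.size):
    print(i, t.status[i], "" if t.value[i] is None else t.value[i])
```

Marking a cell DELETED instead of EMPTY is what lets find(9) continue past index 9 and still reach the key stored at index 10.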

Conclusion:
Thus we implemented two collision resolution strategies: separate chaining and open
addressing (linear probing).
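
The comparison asked for in the problem statement can be sketched by counting key comparisons for the same set of lookups under both strategies. This is a rough illustration only: the records are made-up sample data, the hash is a simple character sum, chains are held in Python lists rather than linked nodes, and all function names are hypothetical.

```python
def home(name, size):
    # Illustrative hash: sum of character codes modulo the table size.
    return sum(ord(ch) for ch in name) % size


def build_chained(records, size):
    table = [[] for _ in range(size)]
    for name, number in records:
        table[home(name, size)].append((name, number))
    return table


def find_chained(table, name):
    comparisons = 0
    for stored_name, number in table[home(name, len(table))]:
        comparisons += 1
        if stored_name == name:
            return number, comparisons
    return None, comparisons


def build_linear(records, size):
    table = [None] * size
    for name, number in records:
        idx = home(name, size)
        while table[idx] is not None:  # assumes fewer records than slots
            idx = (idx + 1) % size
        table[idx] = (name, number)
    return table


def find_linear(table, name):
    size = len(table)
    idx = home(name, size)
    comparisons = 0
    for _ in range(size):
        if table[idx] is None:
            break  # an empty cell ends the search
        comparisons += 1
        if table[idx][0] == name:
            return table[idx][1], comparisons
        idx = (idx + 1) % size
    return None, comparisons


# Made-up records chosen so that several names share a home address.
records = [("ab", "111"), ("ba", "222"), ("ac", "333"), ("ca", "444")]
size = 7
chained = build_chained(records, size)
linear = build_linear(records, size)
total_chained = sum(find_chained(chained, name)[1] for name, _ in records)
total_linear = sum(find_linear(linear, name)[1] for name, _ in records)
print(total_chained, total_linear)  # 6 8
```

For this particular data, linear probing needs more comparisons because keys displaced from one home address spill into the cells of another; other data sets and table sizes can give different totals.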

A. Write short answers to the following questions:


1. What is a hash function? What are the characteristics of a good hash function?
2. Explain the different types of hash functions.
3. Explain the terms: Probe, Collision, Bucket, Overflow and Synonym.
4. What are the advantages and disadvantages of separate chaining?
5. What is the drawback of linear probing?
