Unit – III – Chapter 5 of Data Structures and Algorithm Analysis in C++ - Mark
Allen Weiss
Array as table
One ‘stupid’ way is to store the records in a huge array (index 0..9999999). The index is used as the student id, i.e. the record of the student with studid 0012345 is stored at A[12345].

index     name      score
0         :         :
:         :         :
12345     Kuzhali   81.5
:         :         :
33333     Ezhil     90
:         :         :
56789     Arasi     56.8
:         :         :
9908080   Begham    49
:         :         :
9999999   :         :
Store the records in a huge array where the index
corresponds to the key
add - very fast O(1)
delete - very fast O(1)
search - very fast O(1)
But it wastes a lot of memory! Not feasible.
So we need hash tables and Hash Functions
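The array-as-table idea can be sketched as follows. This is a minimal illustration, assuming C++17; the names Student and DirectTable are hypothetical, and the test range is kept small because a table for ids up to 9999999 would need ten million slots, most of them unused.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical record type, used only for illustration.
struct Student {
    std::string name;
    double score;
};

// Direct-address table: slot i holds the record whose id is i.
class DirectTable {
public:
    // Allocates one slot per possible id, whether or not it is ever used.
    explicit DirectTable(std::size_t maxId) : slots_(maxId + 1) {}

    void add(std::size_t id, const Student& s) { slots_.at(id) = s; }   // O(1)
    void remove(std::size_t id) { slots_.at(id).reset(); }              // O(1)
    const std::optional<Student>& search(std::size_t id) const {        // O(1)
        return slots_.at(id);
    }
    std::size_t capacity() const { return slots_.size(); }

private:
    std::vector<std::optional<Student>> slots_;
};
```

Every operation is a single array access, but the memory cost is proportional to the largest possible id rather than to the number of records stored.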
Hash Table
A hash table is a data structure that stores items and allows
insertions, lookups, and deletions to be performed in O(1) average time.
A hash function converts an object, typically a string, to a number.
Then the number is compressed according to the size of the table
and used as an index.
Distinct items may be mapped to the same index. This is called
a collision and must be resolved.
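The two steps (object to number, then number to index) might look like this for string keys. This is only a sketch: the function names and the multiplier 37 are assumptions chosen for illustration, not something the slides prescribe.

```cpp
#include <cstddef>
#include <string>

// Step 1: convert a string to a number by polynomial accumulation.
// The multiplier 37 is an assumed choice; unsigned overflow simply wraps around.
unsigned long long stringToNumber(const std::string& key) {
    unsigned long long h = 0;
    for (unsigned char c : key)
        h = 37 * h + c;
    return h;
}

// Step 2: compress the number to a valid index in 0..tableSize-1.
std::size_t hashIndex(const std::string& key, std::size_t tableSize) {
    return static_cast<std::size_t>(stringToNumber(key) % tableSize);
}
```

Two different strings can still produce the same index, which is exactly the collision case described above.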
Hash Table Example
• The simplest kind of hash table is an array of records.
• This example has 701 records.
[Figure: an array of records]
Each record has a special field, called its key.
In this example, the key is a long integer field called Number, for example Number 506643548 (stored at [4]).
The number might be a person's ...
When a hash table is in use, some spots contain valid records, and other spots are "empty".
[Figure: array containing records 281942902, 233667136, 506643548, ..., 155778322]
Inserting a new record
In order to insert a new record (here, Number 580625685), the key must somehow be converted to an array index.
The index is called the hash value of the key.
Hash Function h(k)
h(k) = k mod m, where m is the hash table length.
Typical way to create a hash value:
(Number mod 701)
For Number 580625685: 580625685 mod 701 = 3
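As a quick check of the arithmetic, the division-method hash can be written as a one-line function. The name hashValue is hypothetical; the keys in the usage note are the record numbers shown in this example.

```cpp
#include <cstddef>

// Division-method hash: h(k) = k mod m, where m is the table length.
std::size_t hashValue(unsigned long long key, std::size_t m) {
    return static_cast<std::size_t>(key % m);
}
```

With m = 701, hashValue(580625685, 701) evaluates to 3 and hashValue(506643548, 701) evaluates to 4.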
Inserting a new record
The hash value is used for the
location of the new record.
[Figure: the new record 580625685 is stored at [3], between 233667136 and 506643548]
Collisions
Here is another new record to insert, Number 701466868, with a hash value of 2.
This is called a collision, because there is already another valid record at [2].
When a collision occurs, the new record moves forward from [2] until it finds an empty spot, and it is stored there.
[Figure: array after the insertion: 281942902, 233667136, 580625685, 506643548, 701466868, ..., 155778322]
Searching for a key
Calculate the hash value (Number 701466868 has hash value [2]).
Check that location of the array for the key.
Keep moving forward until you find the key, or you reach an empty spot.
[Figure: the search for 701466868 starts at [2] ("Not me") and moves forward until the key is found ("Yes")]
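The insert and search steps walked through above can be sketched as a small class. This is an illustration under stated assumptions: it uses C++17, the name RecordTable is hypothetical, only keys are stored (not full records), duplicates are not checked, and deletion is left out.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

class RecordTable {
public:
    explicit RecordTable(std::size_t m) : slots_(m) {}

    // Insert: start at the hash value and move forward to the first empty spot.
    // Returns the index used, or nothing if the table is full.
    std::optional<std::size_t> insert(unsigned long long key) {
        const std::size_t m = slots_.size();
        const std::size_t start = static_cast<std::size_t>(key % m);
        for (std::size_t step = 0; step < m; ++step) {
            const std::size_t i = (start + step) % m;
            if (!slots_[i]) {
                slots_[i] = key;
                return i;
            }
        }
        return std::nullopt;
    }

    // Search: move forward until the key is found or an empty spot is reached.
    std::optional<std::size_t> search(unsigned long long key) const {
        const std::size_t m = slots_.size();
        const std::size_t start = static_cast<std::size_t>(key % m);
        for (std::size_t step = 0; step < m; ++step) {
            const std::size_t i = (start + step) % m;
            if (!slots_[i]) return std::nullopt;  // empty spot: the key is absent
            if (*slots_[i] == key) return i;
        }
        return std::nullopt;
    }

private:
    std::vector<std::optional<unsigned long long>> slots_;
};
```

With a table of 701 slots holding the records from the figures, inserting 701466868 lands at [5], and searching for it visits [2], [3] and [4] before finding it there.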
Solutions to Collision
The problem arises because two keys hash to the same array entry, a collision. There are two ways to resolve collisions:
Hashing with Chaining: every hash table entry contains a pointer to a linked list of the keys that hash to that entry.
Hashing with Open Addressing: every hash table entry contains only one key. If a new key hashes to a table entry that is already filled, systematically examine other table entries until you find an empty entry in which to place the new key. The probing strategies are:
Linear Probing
Quadratic Probing
Double Hashing
Chained Hash Table
One way to handle collision is to store the collided records in a linked list. The array now stores pointers to such lists. If no key maps to a certain hash value, that array entry points to nil.
[Figure: array entries 0 to 5; entries 1, 2 and 4 are nil, the other entries shown point to linked lists]
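A chained table along these lines can be sketched as below. The class name ChainedTable is hypothetical, keys are assumed to be non-negative integers hashed with key mod m, and an empty std::list stands in for a nil entry.

```cpp
#include <cstddef>
#include <list>
#include <vector>

class ChainedTable {
public:
    explicit ChainedTable(std::size_t m) : buckets_(m) {}

    // Add the key to the list of its slot unless it is already there.
    void insert(unsigned key) {
        if (!contains(key)) buckets_[index(key)].push_front(key);
    }

    // Only the one list selected by the hash value is scanned.
    bool contains(unsigned key) const {
        for (unsigned k : buckets_[index(key)])
            if (k == key) return true;
        return false;
    }

    void remove(unsigned key) { buckets_[index(key)].remove(key); }

    // Number of keys currently chained at a slot (0 means "nil").
    std::size_t chainLength(std::size_t slot) const { return buckets_[slot].size(); }

private:
    std::size_t index(unsigned key) const { return key % buckets_.size(); }
    std::vector<std::list<unsigned>> buckets_;
};
```

With 10 slots and the keys 89, 18, 49, 58, 69, the list at slot 9 holds three keys, the list at slot 8 holds two, and every other slot stays empty.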
Hash Table without Linked List
Separate chaining has the disadvantage of using linked lists.
Normal hash function: hash(x) = x mod TableSize
With open addressing, the cells h0(x), h1(x), h2(x), ... are tried in succession,
where hi(x) = (hash(x) + f(i)) mod TableSize, with f(0) = 0
The function f is the collision resolution strategy.
There are 3 collision resolution strategies:
Linear probing
Quadratic probing
Double hashing
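The formula hi(x) = (hash(x) + f(i)) mod TableSize can be sketched with the strategy f passed in as a parameter. The names insertWith, linear and quadratic are hypothetical, keys are assumed to be non-negative, and C++17 is assumed for std::optional.

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <vector>

using Table = std::vector<std::optional<int>>;

// Try cells h_0(x), h_1(x), ... where h_i(x) = (hash(x) + f(i)) mod TableSize.
// Returns the slot used, or nothing if no empty cell turns up in TableSize probes.
std::optional<std::size_t> insertWith(Table& table, int x,
                                      const std::function<std::size_t(std::size_t)>& f) {
    const std::size_t size = table.size();
    const std::size_t home = static_cast<std::size_t>(x) % size;  // hash(x) = x mod TableSize
    for (std::size_t i = 0; i < size; ++i) {
        const std::size_t slot = (home + f(i)) % size;
        if (!table[slot]) {
            table[slot] = x;
            return slot;
        }
    }
    return std::nullopt;
}

std::size_t linear(std::size_t i) { return i; }         // f(i) = i
std::size_t quadratic(std::size_t i) { return i * i; }  // f(i) = i^2
```

For the keys 89, 18, 49, 58, 69 and TableSize 10, the linear strategy places them in slots 9, 8, 0, 1, 2, while the quadratic strategy places them in slots 9, 8, 0, 2, 3.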
Linear Probing
Linear probing: Given an auxiliary hash function h, the probe sequence starts at slot h(k) and continues sequentially through the table, wrapping after slot m − 1 to slot 0. Given key k and probe number i (0 ≤ i < m),
h(k, i) = (h(k) + i) mod m.
In linear probing, collisions are resolved by sequentially scanning an array (with wraparound) until an empty cell is found.
Example
Key k: [89, 18, 49, 58, 69]
m = 10, table initially empty (slots 0 to 9)
h(89) = 89 mod 10 = 9
h(18) = 18 mod 10 = 8
h(49) = 49 mod 10 = 9
h(58) = 58 mod 10 = 8
h(69) = 69 mod 10 = 9
Hash table with linear probing
Example
Key k: [89, 18, 49, 58, 69], m = 10, i (0 ≤ i < m)
After 89: h(89, 0) = 9, so 89 is stored in slot 9.
After 18: h(18, 0) = 8, so 18 is stored in slot 8.
After 49: h(49, 0) = 9 is occupied; h(49, 1) = (9 + 1) mod 10 = 0, so 49 is stored in slot 0.
Clustering
The position of the initial mapping i0 of key k is called the home
position of k.
When several insertions map to the same home position, they end up
placed contiguously in the table. This collection of keys with the
same home position is called a cluster.
Primary clustering: any key that hashes into the cluster will require several attempts to resolve the collision, and then it will be added to the cluster.
Example
Key k: [89, 18, 49, 58, 69], m = 10, i (0 ≤ i < m)
[Figures: table after inserting 89 and after inserting 18]
Quadratic Probing
Quadratic probing is a collision resolution method that eliminates the primary
clustering problem of linear probing.
In this probing, the function f is quadratic, i.e., f(i) = i^2
Probing sequence is hi(x) = (hash(x) + i^2) mod TableSize
Quadratic Probing - Example
Example
Key k: [89, 18, 49, 58, 69], m = 10, i (0 ≤ i < m)
After 89: 89 is stored in slot 9.
After 18: 18 is stored in slot 8.
Double Hashing
The last collision resolution method we will examine is double hashing.
Problem with quadratic probing: secondary clustering, i.e. elements that hash to the same position will probe the same alternative cells.
The probe sequence of double hashing combines two different hash functions: f(i) = i · hash2(x)
Double Hashing
Example
Key k: [89, 18, 49, 58, 69], m = 10, i (0 ≤ i < 10)
h(k, i) = (h(k) + i · hash2(k)) mod m, where hash2(k) = 7 − (k mod 7)
h(89, 0) = 9: no collision
h(18, 0) = 8: no collision
h(49, 0) = 9: collision
h(49, 1) = (h(49) + 1 · hash2(49)) mod 10
         = (9 + 1 · (7 − (49 mod 7))) mod 10
         = (9 + 1 · (7 − 0)) mod 10
         = (9 + 7) mod 10
         = 16 mod 10 = 6
h(58) = ? h(69) = ?

Hash table
slot  key
0     69
1
2
3     58
4
5
6     49
7
8     18
9     89
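The remaining two insertions can be checked with a short sketch of double hashing. It assumes hash2(x) = 7 − (x mod 7), as used in the calculation for 49, non-negative keys, and C++17; the names insertDouble and Table are hypothetical.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

using Table = std::vector<std::optional<int>>;

// Second hash function from the worked example: hash2(x) = 7 - (x mod 7).
// It is never 0, so every probe moves to a different cell than the previous one.
std::size_t hash2(int x) { return 7 - static_cast<std::size_t>(x) % 7; }

// Probe h(x, i) = (h(x) + i * hash2(x)) mod m for i = 0, 1, 2, ...
// Returns the slot used, or nothing if no empty cell is found in m probes.
std::optional<std::size_t> insertDouble(Table& table, int x) {
    const std::size_t m = table.size();
    const std::size_t home = static_cast<std::size_t>(x) % m;
    const std::size_t step = hash2(x);
    for (std::size_t i = 0; i < m; ++i) {
        const std::size_t slot = (home + i * step) % m;
        if (!table[slot]) {
            table[slot] = x;
            return slot;
        }
    }
    return std::nullopt;
}
```

Inserting 89, 18, 49, 58, 69 in that order into a table of 10 slots gives slots 9, 8, 6, 3 and 0: hash2(58) = 5 so 58 goes to (8 + 5) mod 10 = 3, and hash2(69) = 1 so 69 goes to (9 + 1) mod 10 = 0, in agreement with the hash table shown.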
Rehashing
Increase the size of the hash table when the load factor becomes too high.
Typically expand the table to about twice its size (the first prime at or above double the old size)
Reinsert existing elements into new hash table
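The three steps above might be sketched as follows for a linear-probing table of non-negative integers. The helper names isPrime, nextPrime, place and rehash are hypothetical, and deciding when the load factor is "too high" is left to the caller.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

using Table = std::vector<std::optional<int>>;

bool isPrime(std::size_t n) {
    if (n < 2) return false;
    for (std::size_t d = 2; d * d <= n; ++d)
        if (n % d == 0) return false;
    return true;
}

// Smallest prime that is greater than or equal to n.
std::size_t nextPrime(std::size_t n) {
    while (!isPrime(n)) ++n;
    return n;
}

// Linear-probing insert without resizing; assumes the table has an empty slot.
void place(Table& table, int x) {
    std::size_t slot = static_cast<std::size_t>(x) % table.size();
    while (table[slot]) slot = (slot + 1) % table.size();
    table[slot] = x;
}

// Build a table whose size is the first prime >= twice the old size,
// then reinsert every existing element using the new table size.
Table rehash(const Table& old) {
    Table bigger(nextPrime(2 * old.size()));
    for (const auto& cell : old)
        if (cell) place(bigger, *cell);
    return bigger;
}
```

For example, a table of size 7 holding 13, 15, 24, 6 and 23 is rebuilt with size 17, and every key is hashed again with mod 17 rather than copied to its old position.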
Rehashing Example
Problem with large tables
Extensible Hashing
Extensible Hashing Example
Unit – III Completed