
Hashing

Data Structures
Basic idea

A data structure that allows insertion, deletion and search in O(1) time on average.
A data structure that requires limited or no search in order to find a record.
The location of the record is calculated from the value of its key.
The stored records are kept in no particular order.
…Basic idea

Consider records with integer key values: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
Create a table of 10 cells: the index of each cell is in the range [0..9].
Each record is stored in the cell whose index corresponds to its key value.

[Figure: a table with cells indexed 0 to 9; the record with key 2 is stored in cell 2 and the record with key 8 is stored in cell 8.]
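The table described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `TABLE_SIZE`, `insert`, `search` and `delete` and the string records are hypothetical and do not come from the slides.

```python
TABLE_SIZE = 10                  # one cell per possible key, 0..9
table = [None] * TABLE_SIZE      # None marks an empty cell


def check_key(key):
    # Keys outside [0..9] have no cell in this table.
    if not 0 <= key < TABLE_SIZE:
        raise IndexError(f"key {key} is outside the range [0..{TABLE_SIZE - 1}]")


def insert(key, record):
    check_key(key)
    table[key] = record          # the key itself is the cell index


def search(key):
    check_key(key)
    return table[key]            # no searching: one array access


def delete(key):
    check_key(key)
    table[key] = None


insert(2, "record with key 2")
insert(8, "record with key 8")
```

Each operation is a single array access, which is where the O(1) cost comes from.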

Example applications

- On-line dictionary. After pre-hashing the entire dictionary, one can find the meaning of each word in constant time.
  - So a word like ‘subtle’ can be converted into an index value in a table, and the meaning of the word ‘subtle’ can be found there.
- Compilers use hash tables (symbol tables) to keep track of declared variables.
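As a rough illustration of the dictionary example, Python's built-in `dict` is itself a hash table, so it can stand in for the pre-hashed dictionary. The entries, the table size of 1000 and the variable names below are made up for the sketch.

```python
# Made-up entries; a real dictionary would hold many more words.
meanings = {
    "subtle": "not immediately obvious",
    "hash": "to chop into small pieces",
}

# Illustration only: turning a word into an index for a table of 1000 cells.
# Python randomizes string hashes per process, so this index differs between runs.
TABLE_SIZE = 1000
index = hash("subtle") % TABLE_SIZE

# Lookup by key; constant time on average.
meaning = meanings["subtle"]
```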
Definitions

Hashing
The process of accessing a record, stored in a table,
by mapping the value of its key to a position in the
table.
Hash function
A function that maps key values to table positions.
Hash table
The array where the records are stored.
Hash value
The value returned by the hash function. It usually
corresponds to a position in the hash table.
Example of Hashing
Perfect hashing
Hash function: H(key) = key

[Figure: a record with key 8 is passed to the hash function; H(8) = 8, so the record is stored in cell 8 of a hash table with cells 0 to 9.]
…Perfect hashing

Each key value maps to a different position in the table, i.e. no two keys ever map to the same location in the hash table.
All the keys need to be known before the table is created.
Problem: what if the keys are neither contiguous nor in the range of the indices of the table?
Solution: find a hash function that allows perfect hashing! Is this always possible?
…Perfect hashing

Example: a company has 100 employees. The Social Insurance Number (SIN) is used as the key for each record.
But the SIN has 9 digits, so should we create a table of 1,000,000,000 cells for only 100 employees?
Also, even if the SINs of all 100 employees are known in advance, there is no guarantee that we can find a perfect hash function.
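The two difficulties in this example can be made concrete with a small experiment. Everything below is an assumption for illustration: the keys are randomly generated 9-digit numbers, not real SINs, and `key mod 100` is just one candidate function for a 100-cell table.

```python
import random

NUM_EMPLOYEES = 100
DIRECT_TABLE_SIZE = 10 ** 9      # one cell per possible 9-digit key

# Made-up 9-digit keys standing in for SINs (not real data).
random.seed(0)
keys = random.sample(range(100_000_000, 1_000_000_000), NUM_EMPLOYEES)

# Fraction of a direct-address table that would actually be used.
used_fraction = NUM_EMPLOYEES / DIRECT_TABLE_SIZE

# Try a small table instead, with the candidate function key mod 100.
SMALL_TABLE_SIZE = 100
positions = [key % SMALL_TABLE_SIZE for key in keys]
distinct_positions = len(set(positions))

# The candidate is perfect for these keys only if every key got its own cell.
is_perfect = distinct_positions == NUM_EMPLOYEES

print(f"fraction of direct table used: {used_fraction:.7%}")
print(f"distinct positions with key mod 100: {distinct_positions} of {NUM_EMPLOYEES}")
print(f"perfect for these keys: {is_perfect}")
```

Knowing the keys in advance makes it possible to test a candidate function like this one, but it does not by itself produce a function that passes the test.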
…Perfect hashing

Hash functions that allow perfect hashing are so rare that it is worth looking for them only in special circumstances.
- An imperfect hash function is one where there are ‘collisions’, i.e. two or more records (keys) map to the same location (the same index) in the hash table.
In addition, the collection of records is often not known in advance.
- Remember, all the records are needed in advance to find out whether there is a perfect hash function for them.
Collisions

What if we cannot find a perfect hash function?
Collision: more than one key will map to the same location in the table!
- Can we avoid collisions? No, except in the case of perfect hashing (rare).
- Solution: select a “good” hash function and use a collision-resolution strategy.
…Collisions

Example: The keys are integers and the hash function is hashValue = key mod tableSize
- If tableSize = 10, all records whose keys have the same rightmost digit have the same hash value.

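Insert 13 and 23: 13 mod 10 = 3 and 23 mod 10 = 3, so both keys map to cell 3.

[Figure: a table with cells 0 to 9; 13 and 23 both point to cell 3.]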
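The collision between 13 and 23 can be reproduced with a minimal sketch. The helper `try_insert` is hypothetical and only detects the collision; it does not resolve it.

```python
TABLE_SIZE = 10
table = [None] * TABLE_SIZE      # None marks an empty cell


def hash_value(key):
    # hashValue = key mod tableSize
    return key % TABLE_SIZE


def try_insert(key):
    """Store key if its cell is free; return False on a collision."""
    index = hash_value(key)
    if table[index] is not None:
        return False             # another key already occupies this cell
    table[index] = key
    return True


first = try_insert(13)           # cell 3 is free, so 13 is stored
second = try_insert(23)          # cell 3 is taken by 13: collision
```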
Collision Resolution

Open-addressing vs. chaining

Open-addressing: Storing the record directly in the table.
Deal with collisions using collision-resolution strategies.

Chaining: Each cell of the hash table points to a linked list.
…Chaining

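H(key) = key mod tableSize
Insert 13
Insert 23
Insert 18
Collision is resolved by inserting the elements in a linked list.

[Figure: a hash table with cells 0 to 9; cell 3 points to the list 13 -> 23, cell 8 points to the list 18, and all other cells are empty.]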
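A minimal sketch of chaining follows, assuming integer keys and tableSize = 10 as in the example. The class name `ChainedHashTable` and its methods are hypothetical, and plain Python lists stand in for the linked lists to keep the sketch short.

```python
class ChainedHashTable:
    def __init__(self, table_size=10):
        self.table_size = table_size
        # One chain per cell; Python lists stand in for linked lists here.
        self.cells = [[] for _ in range(table_size)]

    def _hash(self, key):
        # H(key) = key mod tableSize
        return key % self.table_size

    def insert(self, key):
        chain = self.cells[self._hash(key)]
        if key not in chain:     # ignore duplicate keys
            chain.append(key)    # colliding keys share the same chain

    def search(self, key):
        # Only the one chain selected by the hash value is scanned.
        return key in self.cells[self._hash(key)]

    def delete(self, key):
        chain = self.cells[self._hash(key)]
        if key in chain:
            chain.remove(key)
            return True
        return False


table = ChainedHashTable()
for key in (13, 23, 18):
    table.insert(key)
```

After these three insertions, cell 3 holds the chain 13, 23 and cell 8 holds 18. A search costs one hash computation plus a scan of a single chain, so it stays fast as long as the chains stay short.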
