
CHAPTER 8

Hashing

Definition:
In hashing the dictionary pairs are stored in a table, ht, called the
hash table. The hash table is partitioned into b buckets, ht[0],
…,ht[b-1]. The address or location of a pair is determined by a
hash function, h, which maps keys into buckets. Thus, for any key
k, h(k) is an integer in the range 0 through b-1.

Hash Table

[Figure: a key x is mapped by H(x) to one of the buckets 0 through b-1; each bucket is divided into slots (Slot 0 and Slot 1 are shown).]

Terminologies

Identifier density and loading density (utilization):

The identifier density of a hash table is the ratio n/T, where n is the
number of identifiers in the table and T is the number of distinct
possible identifier values. The loading density or loading factor of a
hash table is α = n/(s·b), where s is the number of slots per bucket
and b is the number of buckets.
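As a rough numeric illustration of the two ratios, the sketch below evaluates them for made-up values. The function names and the sample numbers are assumptions for illustration only and do not come from the slides.

```python
def identifier_density(n: int, T: int) -> float:
    """n identifiers in the table out of T distinct possible identifiers."""
    return n / T


def loading_density(n: int, s: int, b: int) -> float:
    """n identifiers stored in b buckets of s slots each."""
    return n / (s * b)


# Illustrative numbers only: 10 identifiers, 26 buckets, 2 slots per bucket.
alpha = loading_density(n=10, s=2, b=26)
print(f"loading density = {alpha:.3f}")
```

A loading density close to 1 means nearly every slot is occupied, so overflows become more likely.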

Overflow:
An overflow occurs when the home bucket for a new dictionary pair is
already full at the time we wish to insert the pair. This is possible
because many keys typically have the same home bucket.

Collision:
A collision occurs when the home bucket for the new pair is not empty
at the time of insertion.

A collision does not necessarily cause an overflow, but an overflow
necessarily implies a collision. When each bucket has exactly one slot
(s = 1), collisions and overflows occur together.

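The difference between a collision and an overflow can be sketched for a single home bucket with s slots. The function name and the keys below are hypothetical; the bucket is modeled as a plain list, which is an assumption of this sketch.

```python
def classify_insert(bucket: list, s: int) -> str:
    """Describe what inserting into a home bucket with s slots would cause."""
    if not bucket:
        return "no collision"
    if len(bucket) < s:
        return "collision, no overflow"
    return "collision and overflow"


s = 2          # slots per bucket
bucket = []    # one home bucket shared by all three keys below
for key in ("A", "A2", "A3"):  # hypothetical keys with the same home bucket
    print(key, "->", classify_insert(bucket, s))
    if len(bucket) < s:
        bucket.append(key)
```

With s = 2 the second key collides without overflowing; only the third key overflows.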
Hashing Function

Hashing Function Design :

(1) Easy to Compute

(2) Minimize the Number of Collisions

(3) Uniform Hash function

Hashing Functions:

• Mid-Square

• Division (Modulus)

• Folding

• Digit Analysis

Mid-square
The mid-square hash function determines the home bucket for a key
by squaring the key and then using an appropriate number of bits
(or digits) from the middle of the square as the bucket address.

ex)
Assume the key is 8125 and the hash table has 1000 buckets, so three
decimal digits are needed.
8125² = 66015625

The square has eight digits, so the middle three can be taken as
"156" or "015".
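The example can be checked with a minimal sketch that works on decimal digits, as the example does, rather than on bits. The function name and the `offset` parameter are assumptions introduced here to select between the two possible middle windows of an even-length square.

```python
def mid_square(key: int, digits: int = 3, offset: int = 0) -> int:
    """Return `digits` decimal digits taken from the middle of key squared."""
    squared = str(key * key)
    start = (len(squared) - digits) // 2 + offset
    return int(squared[start:start + digits])


print(mid_square(8125))            # window "015"
print(mid_square(8125, offset=1))  # window "156"
```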

Division

The home bucket is obtained by using the modulo (%) operator. The
key x is divided by some number M, and the remainder is used as the
home bucket for x.

f(x) = x mod M

M is usually chosen to be a prime number.
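A small sketch of the division method follows. The keys are made up for illustration; they share the factor 10, which shows one common argument for a prime divisor: with M = 10 they all land in one bucket, while with the prime M = 11 they spread out.

```python
def division_hash(x: int, M: int) -> int:
    """Home bucket of key x in a table with M buckets."""
    return x % M


keys = [10, 20, 30, 40, 50]  # illustrative keys, all multiples of 10
print([division_hash(k, 10) for k in keys])  # every key maps to bucket 0
print([division_hash(k, 11) for k in keys])  # keys map to distinct buckets
```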

Folding
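In this method the key k is partitioned into several parts, all but
possibly the last being of the same length. These partitions are then
added together to obtain the hash address for k.

There are two ways of carrying out this addition:
(1) Shift folding: the parts are added as they are.
(2) Boundary folding: every other part is reversed before adding, as
if the key were folded at the part boundaries.

ex1)
Assume the key is 12320324111220 and the hash table has 1000 buckets.
The key is split into three-digit parts: 123|203|241|112|20
(1) Shift:    123+203+241+112+020 = 699
(2) Boundary: 123+302+241+211+020 = 897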
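Both sums in ex1 can be reproduced with the sketch below. The function name and parameters are assumptions; the key is handled as a string of decimal digits, and a short last part is padded on the left, as the example does with 20 written as 020.

```python
def fold(key: str, part_len: int, boundary: bool = False) -> int:
    """Add the fixed-length parts of key; optionally reverse every other part."""
    parts = [key[i:i + part_len] for i in range(0, len(key), part_len)]
    total = 0
    for index, part in enumerate(parts):
        part = part.zfill(part_len)       # pad a short last part: "20" -> "020"
        if boundary and index % 2 == 1:   # reverse the 2nd, 4th, ... part
            part = part[::-1]
        total += int(part)
    return total


key = "12320324111220"
print(fold(key, 3))                 # shift folding
print(fold(key, 3, boundary=True))  # boundary folding
```

If a sum exceeded the table size, it would still have to be reduced, for example modulo the number of buckets; that step is not needed for this key.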

Digit Analysis
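This method is used when all the keys in the table are known in
advance. Each key is interpreted as a number using some radix r. The
same radix is used for all the keys in the table. Using this radix,
the digits of each key are examined. Digits with the most skewed
distributions are deleted until the remaining digits are few enough
to form a bucket address.

ex) [Figure: phone numbers mapped to addresses]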
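One possible way to carry out digit analysis is sketched below. The keys are invented phone-like strings, and measuring skew by the count of the most frequent digit in a position is an assumption of this sketch, not a rule stated on the slide.

```python
from collections import Counter


def choose_positions(keys: list, keep: int) -> list:
    """Pick the `keep` digit positions whose digits are most evenly spread."""
    width = len(keys[0])

    def skew(pos: int) -> int:
        counts = Counter(key[pos] for key in keys)
        return max(counts.values())  # 1 means every key differs at this position

    ranked = sorted(range(width), key=skew)  # stable sort: ties keep left-to-right order
    return sorted(ranked[:keep])


def digit_hash(key: str, positions: list) -> int:
    """Build the address from the digits at the chosen positions."""
    return int("".join(key[p] for p in positions))


keys = ["5551234", "5552345", "5553456", "5554567"]  # hypothetical keys
positions = choose_positions(keys, keep=2)
print(positions)
print([digit_hash(k, positions) for k in keys])
```

The leading digits 555 are identical in every key, so they carry no information and are the first to be dropped.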

Overflow Handling

• Linear Open Addressing (Linear Probing)

• Quadratic Probing

• Rehashing

• Chaining

Linear Probing
When an overflow occurs, we search the hash table buckets in the
order H(x)+1, H(x)+2, …, wrapping around at the end of the table,
until we reach the first unfilled bucket or find that the table is
full.

ex)
bucket   initial   after inserting 55   after inserting 25
0        10        10                   10
1
2        75        75                   75
3                  55                   55
4                                       25
5        43        43                   43
6

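A minimal sketch of linear-probing insertion follows. The slide does not state the hash function H, so the home bucket is passed in directly; the value 2 for both 55 and 25 is an assumption chosen to agree with the example, and the table size of 7 is taken from the rows shown.

```python
def linear_probe_insert(table: list, key: int, home: int) -> int:
    """Place key in the first free bucket at or after home, wrapping around."""
    b = len(table)
    for i in range(b):
        index = (home + i) % b
        if table[index] is None:
            table[index] = key
            return index
    raise OverflowError("hash table is full")


# Initial state from the example: 10 in bucket 0, 75 in bucket 2, 43 in bucket 5.
table = [10, None, 75, None, None, 43, None]
print(linear_probe_insert(table, 55, home=2))  # assumed home bucket 2
print(linear_probe_insert(table, 25, home=2))  # assumed home bucket 2
print(table)
```

A search follows the same probe order and stops at the key or at the first empty bucket.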
Linear Probing

Advantages:
Simple and easy to implement.

Disadvantages:
When clustering occurs, the search time increases rapidly.

Quadratic Probing

When an overflow occurs, we search the hash table buckets by using a
quadratic function of the probe number.

ex) Key k, hash function H, b buckets

1st search : H(k)
2nd search : (H(k) + 1²) % b
3rd search : (H(k) - 1²) % b
4th search : (H(k) + 2²) % b
5th search : (H(k) - 2²) % b
…
Last search : (H(k) ± ((b-1)/2)²) % b
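The search order can be generated with the short sketch below. The function name is an assumption. Python's % operator returns a non-negative result for a positive modulus, so the minus terms need no extra adjustment here; in a language such as C the negative case would have to be handled explicitly.

```python
def quadratic_probe_sequence(home: int, b: int) -> list:
    """Buckets examined for a key whose home bucket is `home`, in search order."""
    sequence = [home % b]
    for i in range(1, (b - 1) // 2 + 1):
        sequence.append((home + i * i) % b)
        sequence.append((home - i * i) % b)
    return sequence


print(quadratic_probe_sequence(2, 11))
```

For home bucket 2 and b = 11 the sequence visits every bucket exactly once.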

Rehashing

Rehashing uses a series of hash functions h1, h2, …, hm.

Buckets hi(k), 1 ≤ i ≤ m, are examined in that order.
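Rehashing might be sketched as follows. The three hash functions, the table size, and the keys are all hypothetical choices made for this illustration; the slide does not prescribe any particular series.

```python
def rehash_insert(table: list, key: int, hash_functions: list) -> int:
    """Try each hash function in order and use the first free bucket."""
    for h in hash_functions:
        index = h(key)
        if table[index] is None:
            table[index] = key
            return index
    raise OverflowError("every hash function led to an occupied bucket")


b = 7  # illustrative table size
hash_functions = [          # hypothetical series h1, h2, h3
    lambda k: k % b,
    lambda k: (k // b) % b,
    lambda k: (k * k) % b,
]

table = [None] * b
for key in (10, 24, 31):
    print(key, "->", rehash_insert(table, key, hash_functions))
print(table)
```

Here 10 is placed by h1, 24 needs h3 because h1 and h2 both give the occupied bucket 3, and 31 is placed by h2.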

Chaining

Many of the comparisons can be saved if we maintain lists of keys,
one list per bucket, each list containing all the synonyms for that
bucket.

ex)
bucket   initial   after inserting 55 and 25
0        10        10
1
2        75        75 → 55 → 25
3
4
5        43        43
6

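Chaining can be sketched with one Python list per bucket standing in for a linked list. The slide does not state the hash function, so the home buckets in `assumed_home` are assumptions; appending new keys at the end of the chain is also a choice of this sketch.

```python
def chain_insert(table: list, key: int, home: int) -> None:
    """Append key to the chain of its home bucket."""
    table[home].append(key)


def chain_search(table: list, key: int, home: int) -> bool:
    """Only the chain of the home bucket has to be examined."""
    return key in table[home]


assumed_home = {10: 0, 75: 2, 43: 5, 55: 2, 25: 2}  # assumed, not from the slide
table = [[] for _ in range(7)]
for key in (10, 75, 43, 55, 25):
    chain_insert(table, key, assumed_home[key])

print(table)
print(chain_search(table, 25, assumed_home[25]))
```

Unlike linear probing, inserting 55 and 25 leaves buckets 3 and 4 free for keys whose home bucket is 3 or 4.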
Question:
Assume that a hash function has the following
characteristics:
keys 257 and 567 hash to 3
keys 987 and 313 hash to 6
keys 734, 189 and 575 hash to 5
keys 122 and 391 hash to 8
Assume that insertions are done in order 257, 987, 122,
575, 189, 734, 567, 313, 391
(1) Indicate the position of the data if open probe addressing is used
to resolve collisions.
(2) Indicate the position of the data if chaining with separate
lists is used to resolve collisions.

Hash Table
0 1 2 3 4 5 6 7 8 9 10

Question:

If H(x) = x mod 7 and separate chaining resolves collisions, what
does the hash table look like after the following insertions occur:
8, 10, 24, 15, 32, 17?
Assume that each table item contains only a search key.

Question:
Suppose the hashing function f(x) = x mod 11 is used to hash a
list of input values (in the given order) into a hash table
implemented by the array bucket[0], bucket[1], …, bucket[10]. The
inputs are 10, 100, 32, 45, 126, 3, 24, 200, and 53. Each bucket can
hold only one number. Overflow is resolved by quadratic probing, which
examines buckets f(x), (f(x) + i²) mod 11, and (f(x) - i²) mod 11, for
i = 1 to 5. Show the final contents of bucket[0] to bucket[10].

Ans:
bucket   content
0        32
1        100
2        45
3        3
4
5        126
6        24
7
8        53
9        200
10       10
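The answer can be reproduced with the sketch below, which follows the probe order given in the question: f(x), then (f(x) + i²) mod 11 and (f(x) - i²) mod 11 for i = 1 to 5. The function name is an assumption, and empty buckets are represented by None.

```python
def quadratic_insert(table: list, key: int) -> int:
    """Insert key using f(x) = x mod b and quadratic probing on overflow."""
    b = len(table)
    home = key % b
    candidates = [home]
    for i in range(1, (b - 1) // 2 + 1):
        candidates += [(home + i * i) % b, (home - i * i) % b]
    for index in candidates:
        if table[index] is None:
            table[index] = key
            return index
    raise OverflowError("no free bucket on the probe sequence")


bucket = [None] * 11
for value in (10, 100, 32, 45, 126, 3, 24, 200, 53):
    quadratic_insert(bucket, value)
print(bucket)
```

For instance, 200 has home bucket 2 and finds buckets 2, 3, 1, and 6 occupied before landing in bucket 9.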

Question:
For each hash table below, show the result of inserting the
following sequence of key values, in the given order, into an
initially empty hash table of that type: 26, 17, 20, 9, 34, 32, 15, 21.
In both cases, assume a hash table size of 11 and a hash function
h(x) = x mod 11.
(1) Static hash table that uses chaining
(2) Hash table that uses linear probing

Ans:
bucket   (1) chaining   (2) linear probing
0                       32
1        →34            34
2                       21
3
4        →26→15         26
5                       15
6        →17            17
7
8
9        →20→9          20
10       →32→21         9
