
ADS RV 1

Basic Ideas
• Basic ideas:
– Hash the key of each record into a reasonably long
integer to avoid collision
• adding 0’s to the left so they have the same length
– Build a directory
• The directory is stored in the primary memory
• Each entry in the directory points to a leaf
• Directory is extensible
– Each leaf contains up to M records
• Stored in one disk block
• Share the same leading digits (at most D of them)
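A minimal sketch of the padding idea above, assuming 8-bit hash values for illustration. The helper names `to_bits` and `leading_bits` are hypothetical and do not come from the slides.

```python
def to_bits(hash_value: int, width: int = 8) -> str:
    """Render a hash value as a fixed-width bit string, left-padded with 0's."""
    if hash_value < 0 or hash_value >= (1 << width):
        raise ValueError("hash value does not fit in the chosen width")
    return format(hash_value, f"0{width}b")


def leading_bits(bits: str, d: int) -> str:
    """Return the first d digits, which select the directory entry."""
    return bits[:d]


# Both keys end up with the same length, so their leading digits are comparable.
small = to_bits(2)      # "00000010"
large = to_bits(158)    # "10011110"
print(small, leading_bits(small, 2))
print(large, leading_bits(large, 2))
```

The width of 8 is only an assumption for the example. In practice the hashed integer would be long enough that distinct keys rarely produce the same bit string.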

Directory and Leaves
• Directory: also called “root”, stored in the main memory
– D: the number of bits for each entry in the directory
– Size of the directory: 2^D

• Leaf: Each leaf stores up to M elements
– M = block size / record size
– dL: number of leading digits in common for all elements in leaf L.
– dL <= D. (This will become clear shortly)
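The quantities above can be checked with a few lines of arithmetic. This is a sketch only: the block and record sizes are made-up values, and the function names are hypothetical. The last helper relies on the fact that a leaf with dL common leading digits is shared by 2^(D - dL) directory entries.

```python
def records_per_leaf(block_size: int, record_size: int) -> int:
    """M = block size / record size, rounded down to whole records."""
    return block_size // record_size


def directory_size(d: int) -> int:
    """The directory has 2^D entries."""
    return 1 << d


def entries_pointing_to_leaf(d: int, d_leaf: int) -> int:
    """A leaf with d_leaf common leading digits is shared by 2^(D - dL) entries."""
    if d_leaf > d:
        raise ValueError("dL must not exceed D")
    return 1 << (d - d_leaf)


# Assumed sizes, for illustration only: 4096-byte blocks, 512-byte records.
print(records_per_leaf(4096, 512))     # M
print(directory_size(3))               # directory entries for D = 3
print(entries_pointing_to_leaf(3, 2))  # entries sharing a leaf with dL = 2
```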

find operation
1. Use the first D digits of the key to find the entry in the
directory;
2. Find the address of the leaf
3. Read the leaf

• Time performance:
– O(1) disk access
– Time for searching the record in the leaf in the main memory is
negligible.
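The three steps of find can be sketched as follows. The "disk" here is just a Python dict keyed by block address, an assumption made so the example runs. The names `find`, `directory`, and `disk` are illustrative, not taken from the slides.

```python
def find(directory, disk, d, key_bits):
    """Look up a key given as a bit string; returns True if it is stored."""
    entry = int(key_bits[:d], 2)   # step 1: first D digits pick the directory entry
    leaf_address = directory[entry]  # step 2: the entry holds the leaf's address
    leaf = disk[leaf_address]        # step 3: the single (simulated) disk read
    return key_bits in leaf          # in-memory search of the leaf


D = 2
directory = [0, 1, 2, 3]  # entries 00, 01, 10, 11 -> block addresses
disk = {
    0: ["00000010", "00101011", "00001011"],
    1: ["01010001", "01111111", "01101111"],
    2: ["10111101", "10011011", "10111110", "10010110"],
    3: ["11001111", "11011011", "11110000"],
}

print(find(directory, disk, D, "10011011"))
print(find(directory, disk, D, "10011110"))
```

Only the `disk[leaf_address]` lookup stands in for a disk access, which is why the cost is counted as O(1) disk accesses.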

insert operation
1. Find and read the leaf
2. If the leaf has room, insert the record, write
back;
3. Else
– split the leaf into two;
– update the directory if necessary;
– write back the leaf or leaves
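The insert steps above, including the leaf split and the directory update, can be sketched in memory as below. This is one possible reading under stated assumptions: keys are fixed-width bit strings, leaves are Python objects rather than disk blocks, and the class and method names (`Leaf`, `ExtendibleHash`, `_split`) are hypothetical.

```python
class Leaf:
    def __init__(self, local_depth):
        self.local_depth = local_depth  # dL: leading digits shared by all keys
        self.keys = []


class ExtendibleHash:
    def __init__(self, leaf_capacity=4, key_width=8):
        self.m = leaf_capacity      # M: records per leaf
        self.width = key_width      # length of every key's bit string
        self.d = 0                  # D: digits used to index the directory
        self.directory = [Leaf(0)]  # 2^D entries, each pointing to a leaf

    def _index(self, key):
        return int(key[:self.d], 2) if self.d else 0

    def find(self, key):
        return key in self.directory[self._index(key)].keys

    def insert(self, key):
        while True:
            leaf = self.directory[self._index(key)]  # 1. find and read the leaf
            if key in leaf.keys:
                return
            if len(leaf.keys) < self.m:              # 2. room: insert, done
                leaf.keys.append(key)
                return
            if leaf.local_depth == self.width:
                raise OverflowError("leaf full and no digits left to split on")
            self._split(leaf)                        # 3. else split and retry

    def _split(self, leaf):
        if leaf.local_depth == self.d:
            # Directory update: double it, so old entry i becomes 2i and 2i+1.
            self.directory = [l for l in self.directory for _ in (0, 1)]
            self.d += 1
        depth = leaf.local_depth + 1
        zero, one = Leaf(depth), Leaf(depth)
        for k in leaf.keys:  # redistribute on the next leading digit
            (one if k[depth - 1] == "1" else zero).keys.append(k)
        for i, entry in enumerate(self.directory):
            if entry is leaf:
                bit = (i >> (self.d - depth)) & 1
                self.directory[i] = one if bit else zero


table = ExtendibleHash(leaf_capacity=4, key_width=8)
for k in ["00000010", "00101011", "00001011", "01010001", "01111111",
          "01101111", "10111101", "10011011", "10111110", "10010110",
          "11001111", "11011011", "11110000"]:
    table.insert(k)
depth_before = table.d
table.insert("10011110")  # lands in a full leaf, forcing a split
print(depth_before, table.d, sorted(table.directory[0b100].keys))
```

The sketch retries after each split because one split may leave all keys on the same side. It also doubles the directory only when the full leaf already uses all D digits, which is the "if necessary" in the step list.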

Points to remember - Hash tables
• Table size should be prime
• Table size much larger than the number of inputs (to keep the
load factor λ close to 0, or at least < 0.5)
• Tradeoffs between chaining vs. probing
• Rehashing is required to resize the hash table once λ
exceeds 0.5
• Hashing is good for searching. Not good if there is some
order implied by data.
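A small sketch of the load-factor and rehashing points, using linear probing as one example of a probing scheme. The class name `ProbingTable`, the starting size of 11, and the "next prime at least twice the old size" growth rule are assumptions chosen for illustration. `None` marks an empty slot, so it cannot be stored as a key.

```python
def is_prime(n):
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def next_prime(n):
    """Smallest prime >= n."""
    while not is_prime(n):
        n += 1
    return n


class ProbingTable:
    def __init__(self, size=11):
        self.slots = [None] * next_prime(size)  # prime table size
        self.count = 0

    def load_factor(self):
        return self.count / len(self.slots)     # λ = items / table size

    def _probe(self, key):
        i = hash(key) % len(self.slots)
        while self.slots[i] is not None and self.slots[i] != key:
            i = (i + 1) % len(self.slots)       # linear probing
        return i

    def contains(self, key):
        return self.slots[self._probe(key)] == key

    def insert(self, key):
        i = self._probe(key)
        if self.slots[i] is None:
            self.slots[i] = key
            self.count += 1
            if self.load_factor() > 0.5:        # rehash once λ exceeds 0.5
                self._rehash()

    def _rehash(self):
        old = [k for k in self.slots if k is not None]
        self.slots = [None] * next_prime(2 * len(self.slots))
        self.count = 0
        for k in old:
            self.insert(k)


t = ProbingTable(11)
for k in range(6):
    t.insert(k)
print(len(t.slots), t.load_factor())
```

Because λ never stays above 0.5, the probe loop always reaches an empty slot and terminates.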

D = 2: directory has 2^2 = 4 entries (00, 01, 10, 11)
• Leaf 00: 00000010

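D = 2: directory has 2^2 = 4 entries (00, 01, 10, 11)
• Leaf 00: 00000010, 00101011, 00001011
• Leaf 01: 01010001, 01111111, 01101111
• Leaf 10: 10111101, 10011011, 10111110, 10010110
• Leaf 11: 11001111, 11011011, 11110000
• Insert 10011110: X (leaf 10 is full)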

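D = 3: directory has 2^3 = 8 entries
• Leaf 00: 00000010, 00101011, 00001011
• Leaf 01: 01010001, 01111111, 01101111
• Leaf 10: 10111101, 10011011, 10111110, 10010110
• Leaf 11: 11001111, 11011011, 11110000
• Insert 10011110: X (leaf 10 is still full)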

D = 3: directory has 2^3 = 8 entries
• Leaf 00: 00000010, 00101011, 00001011
• Leaf 01: 01010001, 01111111, 01101111
• Leaf 100: 10011011, 10011110, 10010110
• Leaf 101: 10111101, 10111110
• Leaf 11: 11001111, 11011011, 11110000
