Search vs. Hashing
• Search tree methods: key comparisons
– Time complexity: O(n) in the worst case, or O(log n) when the tree is balanced
• Hashing methods: hash functions
– Expected time: O(1)
• Effective way to reduce the number of comparisons
• Types
– Static hashing: the hash function maps search-key values to a fixed
set of locations.
– Dynamic hashing: the hash table can grow to handle more items, so the
associated hash function must change as the table grows.
Hashing
• Hashing is a search scheme that computes a record's
location from its key using a function.
• It is another approach to storing and
searching for values.
• The function that converts the key into an array
position is called the hash function.
• It is an effective way to reduce the number of
comparisons.
• The idea is to provide the direct address
at which the record is likely to be stored.
Basic terminologies
• Hash table
A data structure used for storing and retrieving data very
quickly.
Data is inserted into the hash table according to its key value,
so the position of every entry is based on its key.
• Hash function
A function used to place data in the hash
table, for example:
h(key) = key % 1000
• Hash key
The integer returned by the hash function is called the hash
key.
Types of hash function
A hash function is used to place a record in the hash table.
• Division method
The hash key is the remainder of dividing the key by the table size.
h(key) = key % table_size
• Mid square
The key is squared and the middle part of the result is used as
the index.
• Multiplicative hash function
The key is multiplied by a constant value. The formula
for computing the hash key is h(key) = floor(p * (fractional part of (key * A))),
where p is an integer constant and A is a constant real number.
• Digit folding
The key is divided into separate parts, and these parts are combined
with a simple operation to produce the hash key.
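The four methods above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function names, the choice of two middle digits for mid square, the value of A, and the use of addition to combine the folded parts are all assumptions made for the example.

```python
import math

def division_hash(key: int, table_size: int) -> int:
    # Division method: remainder of the key divided by the table size.
    return key % table_size

def mid_square_hash(key: int, digits: int = 2) -> int:
    # Mid square: square the key and keep `digits` digits from the middle.
    squared = str(key * key)
    start = max((len(squared) - digits) // 2, 0)
    return int(squared[start:start + digits])

def multiplicative_hash(key: int, p: int, a: float = 0.6180339887) -> int:
    # Multiplicative method: floor(p * fractional part of (key * A)).
    # The value of A here is an assumed constant between 0 and 1.
    fractional = (key * a) % 1.0
    return math.floor(p * fractional)

def folding_hash(key: int, part_len: int, table_size: int) -> int:
    # Digit folding: split the decimal digits into parts and add them
    # (addition is the assumed "simple operation").
    digits = str(key)
    parts = [int(digits[i:i + part_len]) for i in range(0, len(digits), part_len)]
    return sum(parts) % table_size

print(division_hash(23, 7))           # 23 % 7
print(mid_square_hash(3111))          # middle digits of 3111 * 3111
print(multiplicative_hash(123, 100))  # index in the range 0..99
print(folding_hash(123456, 2, 100))   # (12 + 34 + 56) % 100
```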
Collision
• A collision occurs when the hash function returns the same
address (hash key) for more than one record.
• Frequent collisions indicate a poorly designed hash
function.
• Choosing a hash function
A good hash function should satisfy two criteria:
1. It should be quick to compute
2. It should minimize the number of collisions
• The load factor of a hash table is the ratio of the number of keys
in the table to the size of the hash table.
• When a collision occurs, it must be handled by a
collision handling technique.
Collision handling techniques - Separate chaining
• Array of linked lists implementation
• Keep a list of all elements that hash to the
same value.
• Create an array of linked lists, so that an
item can be inserted into the corresponding list when a collision
occurs.
• Disadvantages
Parts of the array might never be used.
Constructing new chain nodes is relatively expensive.
Separate Chaining (cont’d)
• Example: Load the keys 23, 13, 21, 14, 7, 8, and 15 , in this order, in a hash table of
size 7 using separate chaining with the hash function: h(key) = key % 7
h(23) = 23 % 7 = 2
h(13) = 13 % 7 = 6
h(21) = 21 % 7 = 0
h(14) = 14 % 7 = 0 collision
h(7) = 7 % 7 = 0 collision
h(8) = 8 % 7 = 1
h(15) = 15 % 7 = 1 collision
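The example above can be reproduced with a short sketch. It assumes Python lists stand in for the linked lists and that new keys are appended to the end of a chain; the function names are hypothetical.

```python
def build_chained_table(keys, table_size):
    # One chain (a list standing in for a linked list) per table slot.
    table = [[] for _ in range(table_size)]
    for key in keys:
        # Keys that collide simply share the same chain.
        table[key % table_size].append(key)
    return table

def chained_search(table, key):
    # Only the chain at the key's hash address needs to be scanned.
    return key in table[key % len(table)]

table = build_chained_table([23, 13, 21, 14, 7, 8, 15], 7)
for index, chain in enumerate(table):
    print(index, chain)
```

Slot 0 ends up holding 21, 14 and 7, slot 1 holds 8 and 15, and slots 3 to 5 stay empty, which illustrates the first disadvantage listed earlier.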
Separate Chaining with String Keys (cont’d)
• Use the hash function hash to load the following commodity items into a hash table of size 13 using separate chaining:
onion 1 10.0
tomato 1 8.50
cabbage 3 3.50
carrot 1 5.50
okra 1 6.50
mellon 2 10.0
potato 2 7.50
Banana 3 4.00
olive 2 15.0
salt 2 2.50
cucumber 3 4.50
mushroom 3 5.50
orange 2 3.00
• Solution:
Item      Qty  Price  h(key)
onion     1    10.0   1
tomato    1    8.50   10
cabbage   3    3.50   4
carrot    1    5.50   1
okra      1    6.50   0
mellon    2    10.0   10
potato    2    7.50   0
Banana    3    4.0    11
olive     2    15.0   10
salt      2    2.50   7
cucumber  3    4.50   9
mushroom  3    5.50   6
orange    2    3.00   12

Resulting hash table (index, then chain):
0   okra, potato
1   onion, carrot
2
3
4   cabbage
5
6   mushroom
7   salt
8
9   cucumber
10  tomato, mellon, olive
11  banana
12  orange
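• Open hashing (separate chaining) has the disadvantage of requiring
pointers.
• Open addressing avoids pointers by storing every item in the array itself.
Its collision resolution strategies are:
– Linear Probing
– Quadratic Probing
– Double Hashing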
Collision handling techniques - Open addressing
• Array-based implementation
• When a collision occurs, search the array in some
systematic way for an empty cell and insert the new
item there.
• Example: insert the keys {89, 18, 49, 58, 9}
into a closed table of size 10 using the hash function
Hash(key, 10) = key % 10 and the collision resolution strategy f(i)
= i.
Linear Probing
Hash(89, 10) = 9
Hash(18, 10) = 8
Hash(49, 10) = 9
Hash(58, 10) = 8
Hash(9, 10) = 9
Table contents after each insertion:
Index  After 89  After 18  After 49  After 58  After 9
0                          49        49        49
1                                    58        58
2                                              9
3
4
5
6
7
8                18        18        18        18
9      89        89        89        89        89
Probe sequence: H + 1, H + 2, H + 3, H + 4, ..., H + i
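A minimal sketch of linear probing for the keys used above, assuming a table of size 10, the hash function key % 10, and wrap-around at the end of the array. The function name and the use of None for an empty cell are choices made for this example.

```python
def linear_probe_insert(table, key):
    size = len(table)
    home = key % size                  # H, the home address
    for i in range(size):              # try H, H + 1, H + 2, ...
        slot = (home + i) % size       # wrap around past the last cell
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("hash table is full")

table = [None] * 10
slots = [linear_probe_insert(table, key) for key in [89, 18, 49, 58, 9]]
print(slots)
print(table)
```

The keys 49, 58 and 9 all collide and land in the consecutive cells 0, 1 and 2.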
Problem with Linear Probing
• When several different keys hash to the
same location, the result is a small cluster of
elements, one after another.
• As the table approaches its capacity, these
clusters tend to merge into larger and larger
clusters.
• Quadratic probing is a common
technique for reducing this clustering.
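Quadratic Probing
Hash(89, 10) = 9
Hash(18, 10) = 8
Hash(49, 10) = 9
Hash(58, 10) = 8
Hash(9, 10) = 9
Probe sequence: H + 1*1, H + 2*2, H + 3*3, ..., H + i*i
After inserting the five keys (positions taken modulo the table size 10):
49 is at index 0, 58 at index 2, 9 at index 3, 18 at index 8 and 89 at index 9.
Double Hashing
• f(i) = i*hash2(x)
• hash2(x) = R - (x mod R)
R: prime, smaller than table size.
• E.g.: hash2(x) = 7 - (x % 7)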
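Quadratic probing and double hashing differ from linear probing only in the offset f(i) added to the home address, so both can be sketched with one insert routine that takes the offset function as a parameter. This is an illustrative sketch: the names probe_insert, quadratic, double_hash and load are hypothetical, the table size of 10 and the keys come from the example above, and R = 7 follows the hash2 example.

```python
def probe_insert(table, key, offset):
    size = len(table)
    home = key % size
    for i in range(size):                      # bounded number of probes
        slot = (home + offset(i, key)) % size  # H + f(i), modulo table size
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("no free slot found within the probe limit")

def quadratic(i, key):
    # Quadratic probing: f(i) = i * i
    return i * i

def double_hash(i, key, r=7):
    # Double hashing: f(i) = i * hash2(x), with hash2(x) = R - (x mod R)
    return i * (r - key % r)

def load(keys, offset, size=10):
    # Insert all keys into a fresh table and report where each one landed.
    table = [None] * size
    return [probe_insert(table, key, offset) for key in keys]

keys = [89, 18, 49, 58, 9]
print(load(keys, quadratic))
print(load(keys, double_hash))
```

With quadratic probing the colliding keys are spread over indexes 0, 2 and 3 instead of forming one run. With double hashing the step size depends on the key, so 49 (step 7) and 9 (step 5) follow different probe paths even though both start at index 9.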
2. Priority queue (PQueue) operations
– insert
– deleteMin
Inserting a Value
Insert 3 into the heap below. Trees are listed level by level.
4
5 12
26 25 14 15
29 45 35 31 21
Array representation (index 0 is unused), with 3 waiting in the new last position:
i     0  1 2 3  4  5  6  7  8  9  10 11 12 13 14 15
array __ 4 5 12 26 25 14 15 29 45 35 31 21 3  __ __
currentsize = 13
Step 1: copy 14 down because 14 > 3 (tmp = 3)
4
5 12
26 25 14 15
29 45 35 31 21 14
Step 2: copy 12 down because 12 > 3 (tmp = 3)
4
5 12
26 25 12 15
29 45 35 31 21 14
Step 3: copy 4 down because 4 > 3 (tmp = 3)
4
5 4
26 25 12 15
29 45 35 31 21 14
Step 4: insert 3 at the root
3
5 4
26 25 12 15
29 45 35 31 21 14
Binary Heap Properties
1. Structure Property
2. Ordering Property
Some Definitions:
A Perfect binary tree – A binary tree with all
leaf nodes at the same depth. All internal
nodes have 2 children.
A perfect binary tree of height h has:
2^(h+1) - 1 nodes
2^h - 1 non-leaves
2^h leaves
Example (height 3), listed level by level:
11
5 21
2 9 16 25
1 3 7 10 13 19 22 30
Heap Structure Property
• A binary heap is a complete binary tree.
Complete binary tree – binary tree that is
completely filled, with the possible exception of
the bottom level, which is filled left to right.
Examples:
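Representing Complete Binary Trees in an Array
Number the nodes level by level, left to right, starting at 1:
1:A
2:B 3:C
4:D 5:E 6:F 7:G
8:H 9:I 10:J 11:K 12:L
From node i:
left child: 2i
right child: 2i + 1
parent: floor(i / 2)
Array (index 0 is unused):
index 0  1 2 3 4 5 6 7 8 9 10 11 12 13
value __ A B C D E F G H I J  K  L  __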
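The index arithmetic for a complete binary tree stored from index 1 can be checked with a few helper functions. This is a small sketch using the twelve-node tree A to L from this slide; the helper names are hypothetical, and the formulas match the hole/2 and 2*hole expressions used in the heap code later in the deck.

```python
def left_child(i):
    return 2 * i

def right_child(i):
    return 2 * i + 1

def parent(i):
    return i // 2  # integer division is floor(i / 2)

# Index 0 is left unused so that the root sits at index 1.
tree = [None] + list("ABCDEFGHIJKL")

print(tree[left_child(2)], tree[right_child(2)])  # children of B
print(tree[parent(12)])                           # parent of L
```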
Heap Order Property
Heap order property: For every non-root
node X, the value in the parent of X is less
than (or equal to) the value in X.
Example of a heap, listed level by level:
10
20 80
40 60 85 99
50 700
Example that is not a heap:
10
20 80
30 15
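The heap order property can be tested directly on the array form by comparing each non-root node with its parent. A minimal sketch, assuming the array starts at index 1 and, for the second example, that 30 and 15 are the children of 20 (the slide does not make their positions explicit); is_min_heap is a hypothetical name.

```python
def is_min_heap(heap):
    # heap[0] is unused; nodes occupy indexes 1..size.
    size = len(heap) - 1
    # Every non-root node i must be no smaller than its parent at i // 2.
    return all(heap[i // 2] <= heap[i] for i in range(2, size + 1))

print(is_min_heap([None, 10, 20, 80, 40, 60, 85, 99, 50, 700]))
print(is_min_heap([None, 10, 20, 80, 30, 15]))
```

The second call fails the check because 15 is smaller than its parent.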
Heap Operations
• findMin: return the value at the root.
• insert(val): place val at the next leaf position, then percolate up.
• deleteMin: remove the root, then percolate down.
10
20 80
40 60 85 99
50 700 65
Heap – Insert(val)
Basic Idea:
1. Put val at the “next” leaf position
2. Percolate up by repeatedly exchanging the node with its parent
until heap order is restored
Insert: percolate up
Before (15 placed at the next leaf position):
10
20 80
40 60 85 99
50 700 65 15
After percolating up:
10
15 80
40 20 85 99
50 700 65 60
Insert Code (optimized)
void insert(Object o) {
    assert(!isFull());
    size++;
    // find the final position of the hole, then store o there
    int newPos = percolateUp(size, o);
    Heap[newPos] = o;
}

int percolateUp(int hole, Object val) {
    // move the hole up while its parent is larger than val
    while (hole > 1 && val < Heap[hole/2]) {
        Heap[hole] = Heap[hole/2];
        hole /= 2;
    }
    return hole;
}
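A runnable Python sketch of the same insert logic, under the assumptions that the heap is a list with index 0 unused and that the list grows on demand (so there is no isFull check). The names percolate_up and heap_insert are chosen for this example.

```python
def percolate_up(heap, hole, val):
    # Slide parents down into the hole while they are larger than val.
    while hole > 1 and val < heap[hole // 2]:
        heap[hole] = heap[hole // 2]
        hole //= 2
    return hole

def heap_insert(heap, val):
    heap.append(val)                    # open a hole at the next leaf position
    pos = percolate_up(heap, len(heap) - 1, val)
    heap[pos] = val                     # drop val into its final position

heap = [None, 4, 5, 12, 26, 25, 14, 15, 29, 45, 35, 31, 21]
heap_insert(heap, 3)
print(heap[1:])
```

Inserting 3 copies 14, 12 and 4 down one level each and places 3 at the root, matching the step-by-step "Inserting a Value" slides.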
Heap – Deletemin
Basic Idea:
1. Remove root (that is always the min!)
2. Put “last” leaf node at root
3. Find smallest child of node
4. Swap node with its smallest child if needed.
5. Repeat steps 3 & 4 until no swaps needed.
DeleteMin: percolate down
Before (10 is removed and the last leaf, 65, moves to the root):
10
20 15
40 60 85 99
50 700 65
After percolating down:
15
20 65
40 60 85 99
50 700
DeleteMin Code (Optimized)
Object deleteMin() {
    assert(!isEmpty());
    returnVal = Heap[1];
    size--;
    // the old last element, Heap[size + 1], is re-inserted from the root
    newPos = percolateDown(1, Heap[size + 1]);
    Heap[newPos] = Heap[size + 1];
    return returnVal;
}

int percolateDown(int hole, Object val) {
    while (2*hole <= size) {
        left = 2*hole;
        right = left + 1;
        // pick the smaller of the two children
        if (right <= size && Heap[right] < Heap[left])
            target = right;
        else
            target = left;
        if (Heap[target] < val) {
            Heap[hole] = Heap[target];
            hole = target;
        }
        else
            break;
    }
    return hole;
}
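The same deleteMin logic as a runnable Python sketch, assuming a list-based heap with index 0 unused. The names percolate_down and delete_min are chosen for this example, and the size is passed explicitly instead of being kept in a field.

```python
def percolate_down(heap, hole, val, size):
    while 2 * hole <= size:
        left = 2 * hole
        right = left + 1
        # Choose the smaller child; the right child may not exist.
        if right <= size and heap[right] < heap[left]:
            target = right
        else:
            target = left
        if heap[target] < val:
            heap[hole] = heap[target]   # move the smaller child up
            hole = target
        else:
            break
    return hole

def delete_min(heap):
    if len(heap) <= 1:
        raise IndexError("deleteMin on an empty heap")
    minimum = heap[1]
    last = heap.pop()                   # remove the last leaf
    size = len(heap) - 1
    if size >= 1:
        pos = percolate_down(heap, 1, last, size)
        heap[pos] = last                # place the old last leaf in the hole
    return minimum

heap = [None, 10, 20, 15, 40, 60, 85, 99, 50, 700, 65]
removed = delete_min(heap)
print(removed, heap[1:])
```

For the heap on the "DeleteMin: percolate down" slide, 10 is returned, 15 moves up to the root, and 65 settles as the right child of the root.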
Exercise
25, 57, 48, 38, 10, 91, 84, 33
Linear and Quadratic probing problems