DS Unit - 5 Hashing

Government Polytechnic, Porbandar
Data Structures (3330704)
Unit-5 Hashing (CO-5)

1. Define Hashing. Marks [2]

Ans: Hashing is a technique for mapping keys and their values into a hash table by using a hash function. It is done for faster access to elements. The efficiency of the mapping depends on the efficiency of the hash function used.
* Best case time for searching using hashing = O(1).
* Worst case time for searching using hashing = O(n).

2. Explain any 2 hash table methods. Marks [4]

Ans: Two hash table methods are described below:

a) Division Method:
* The key value of a record is divided by an appropriate number, and the remainder of the division is used as the relative address of the record. Such a hash function is known as the division remainder method, or simply the division method.
* The hash function is defined as H(X) = (X mod m) + 1, where X = key of the record and m = table size.
* Example: X = 23, m = 10
  H(X) = (23 mod 10) + 1
  H(X) = 3 + 1
  H(X) = 4

b) Mid Square Method:
* Another hash function is the mid square method. The key is multiplied by itself (the square of the key is obtained) and the middle few digits of the square are used as the index.
* If the square is treated as a decimal number, the table size must be a power of 10 (e.g. 10, 100, 1000, and so on). If it is treated as a binary number, the table size must be a power of 2 (e.g. 2, 4, 8, 16, 32, and so on).
* Example:

  Key value    Squared value         Relative address
  123456789    15241578750190521     8750
  987654321    975461057789971041    5778

3. Explain different methods of hashing. Marks [4]

Ans: There are 4 methods of hashing:
1. The Division Method
2. The Mid Square Method
3. The Folding Method
4. Multiplication Hashing Method

a) Division Method:
* The key value of a record is divided by an appropriate number, and the remainder of the division is used as the relative address of the record. Such a hash function is known as the division remainder method, or simply the division method.
* The hash function is defined as H(X) = (X mod m) + 1, where X = key of the record and m = table size.
* Example: X = 23, m = 10
  H(X) = (23 mod 10) + 1
  H(X) = 3 + 1
  H(X) = 4

b) Mid Square Method:
* Another hash function is the mid square method. The key is multiplied by itself (the square of the key is obtained) and the middle few digits of the square are used as the index.
* If the square is treated as a decimal number, the table size must be a power of 10 (e.g. 10, 100, 1000, and so on). If it is treated as a binary number, the table size must be a power of 2 (e.g. 2, 4, 8, 16, 32, and so on).
* Example:

  Key value    Squared value         Relative address
  123456789    15241578750190521     8750
  987654321    975461057789971041    5778

c) Folding Method:
* The folding method is used when the key value has a large number of digits. It divides the key into n parts of equal size (the last part may be shorter), adds the parts together, and then performs a modulo operation on the sum.
* For example, if the key has 8 digits and we take n = 3, we get 3 numbers of 3, 3 and 2 digits. We add these parts and then perform the modulo operation; the result is the value of the hash function.
* Example: fold the key 12345678 into a hash table of ten spaces (0 through 9).
  * The key X is 12345678 and the table size is M = 10.
  * Break X into parts of at most 3 digits each: a = 123, b = 456, c = 78 (last part).
  * H(X) = (a + b + c) mod M, i.e. H(12345678) = (123 + 456 + 78) mod 10 = 657 mod 10 = 7.
  * Hence, 12345678 is inserted into the table at address 7.

d) Multiplication Hashing Method:
* In the multiplication method, the key k is first multiplied by a constant C, where 0 < C < 1, and the fractional part of the product kC is extracted.
* In the second step, this fractional part is multiplied by N and the floor of the result is taken as the hash value.
* The floor of a value x, denoted ⌊x⌋, is the largest integer less than or equal to x.
* That is, the hash function is h(k) = ⌊N × (kC mod 1)⌋, where kC mod 1 is the fractional part of kC, calculated as kC − ⌊kC⌋.

4. What is collision in hashing? Explain any one collision resolution technique. Marks [4]

Ans:
* The situation where a newly inserted key maps to an already occupied slot in the hash table is called a collision.
* It is resolved by finding some other location in which to insert the new key.
* This process of finding another location is called a collision resolution technique.

1. Random Probing:
* The effect of clustering can be reduced or minimized by using random probing.
* This method generates a random sequence of positions rather than an ordered sequence as in linear probing.

5. Explain collision resolution techniques. Marks [4]

Ans: There are various collision resolution techniques:
1. Linear Probing
2. Random Probing
3. Quadratic Probing
4. Re-hashing (Double Hashing)

1. Linear Probing:
* This is one of the simplest re-hash methods. When a collision occurs, it looks in the neighbouring slot of the table; if that slot is empty, the key is stored there. The new address is calculated extremely quickly.
* A modern RISC processor performs this operation quickly because of efficient cache utilization.
* This technique does a sequential search for a new address for the colliding record. Because it searches in a straight line, it is called linear probing.
* The major drawback of linear probing is that, as the table becomes about half full, there is a tendency towards clustering.
* Clustering means that two keys that hash to different values compete with each other in successive rehashes. Clustering leads to instability.

Example: Let the table size be N = 5, the hash function be H(key) = key mod 5, and the keys be 10, 11, 12, 17. The probe function is h(key, i) = (key mod 5 + i) mod 5, where i is the probe number starting from 0.

* Step 1: The key value 10 hashes to slot 0:
  h(10, 0) = (10 mod 5 + 0) mod 5 = (0 + 0) mod 5 = 0 mod 5 = 0
  Since slot 0 is empty, 10 is inserted at location 0:

  Index:  0    1    2    3    4
  Key:    10

* Step 2: The key value 11 hashes to slot 1:
  h(11, 0) = (11 mod 5 + 0) mod 5 = (1 + 0) mod 5 = 1 mod 5 = 1
  Since slot 1 is empty, 11 is inserted at location 1:

  Index:  0    1    2    3    4
  Key:    10   11

* Step 3: The key value 12 hashes to slot 2:
  h(12, 0) = (12 mod 5 + 0) mod 5 = (2 + 0) mod 5 = 2 mod 5 = 2
  Since slot 2 is empty, 12 is inserted at location 2:

  Index:  0    1    2    3    4
  Key:    10   11   12

* Step 4: The key value 17 hashes to slot 2:
  h(17, 0) = (17 mod 5 + 0) mod 5 = (2 + 0) mod 5 = 2 mod 5 = 2
  Since slot 2 is not empty, the next probe is computed:
  h(17, 1) = (17 mod 5 + 1) mod 5 = (2 + 1) mod 5 = 3 mod 5 = 3
  Since slot 3 is empty, 17 is inserted at location 3:

  Index:  0    1    2    3    4
  Key:    10   11   12   17

2. Random Probing:
* The effect of clustering can be reduced or minimized by using random probing.
* This method generates a random sequence of positions rather than an ordered sequence as in linear probing.

3. Quadratic Probing:
* Quadratic probing eliminates primary clustering.
* When the hash value of incoming data indicates that it should be stored in an already occupied slot, quadratic probing takes the original hash index and adds successive square values until an open slot is found.
* If there is a collision at hash address H, this method probes the table at locations H + 1, H + 4, H + 9, ..., H + i^2 for i = 1, 2, 3, ... (each taken modulo the table size).
* This method also reduces clustering.
* Quadratic probing suffers from a different and more subtle clustering problem (secondary clustering). It occurs because all the keys that hash to a particular cell follow the same sequence when trying to find a vacant space.

4. Re-hashing (Double Hashing):
* The re-hashing technique uses a second hashing operation when there is a collision.
* If there is a further collision, we re-hash until an empty slot or address in the table is found.
* The re-hashing function can either be a new function or a re-application of the original one.
* Using re-hashing, we can eliminate primary as well as secondary clustering.
* The performance of double hashing is very close to the performance of the ideal scheme of uniform hashing.
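
The four hash functions described in Question 3 can be sketched in Python as below. This is only an illustrative sketch, not part of the original notes. The function names are hypothetical. The rule used to pick the "middle" four digits in the mid square method is an assumption chosen so that it agrees with the example table, and the default constant C in the multiplication method is a commonly used value that the notes do not specify.

```python
import math


def division_hash(key: int, m: int) -> int:
    """Division method as written in the notes: H(X) = (X mod m) + 1."""
    return key % m + 1


def mid_square_hash(key: int, digits: int = 4) -> int:
    """Mid square method: square the key and keep `digits` middle digits.

    Assumption: when the square has no exact centre, the window leans
    one digit to the right, which matches the table in the notes.
    """
    squared = str(key * key)
    start = (len(squared) - digits + 1) // 2
    return int(squared[start:start + digits])


def folding_hash(key: int, m: int, part_len: int = 3) -> int:
    """Folding method: split the key into parts, add them, take mod m."""
    text = str(key)
    parts = [int(text[i:i + part_len]) for i in range(0, len(text), part_len)]
    return sum(parts) % m


def multiplication_hash(key: int, n: int, c: float = 0.6180339887) -> int:
    """Multiplication method: h(k) = floor(N * (kC mod 1)), with 0 < C < 1.

    Assumption: the default C is (sqrt(5) - 1) / 2, a popular choice;
    the notes only require 0 < C < 1.
    """
    fractional = (key * c) % 1.0
    return math.floor(n * fractional)


print(division_hash(23, 10))        # 4
print(mid_square_hash(123456789))   # 8750
print(folding_hash(12345678, 10))   # 7
```

The first three printed values reproduce the worked examples in the notes. The multiplication method has no worked example in the notes, so no expected value is shown for it.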
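
The linear probing example from Question 5 (table size 5, keys 10, 11, 12, 17) can be traced with a short Python sketch. The function name `linear_probe_insert` and the use of `None` for an empty slot are assumptions made for illustration; the probe formula is the one used in the example, h(key, i) = (key mod N + i) mod N.

```python
def linear_probe_insert(table: list, key: int) -> int:
    """Insert `key` using linear probing and return the slot used.

    Empty slots are represented by None (an assumption of this sketch).
    """
    n = len(table)
    home = key % n                      # H(key) = key mod N
    for i in range(n):                  # probe number i = 0, 1, 2, ...
        slot = (home + i) % n           # h(key, i) = (key mod N + i) mod N
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("hash table is full")


table = [None] * 5
for k in (10, 11, 12, 17):
    linear_probe_insert(table, k)
print(table)  # [10, 11, 12, 17, None]
```

Key 17 first probes slot 2, finds it occupied by 12, and settles in slot 3, exactly as in the last step of the worked example.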
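
Quadratic probing and double hashing can be sketched in the same way. The notes do not give a concrete second hash function, so the one used here, h2(key) = 1 + (key mod (N - 1)), is an assumption picked only because it never returns 0. The table size 7 is likewise an assumption (a prime size lets every slot be reached). All function names are hypothetical.

```python
def quadratic_probe_sequence(home: int, n: int, attempts: int) -> list:
    """Slots probed after a collision at `home`: home + 1, home + 4, home + 9, ..."""
    return [(home + i * i) % n for i in range(1, attempts + 1)]


def double_hash_insert(table: list, key: int) -> int:
    """Insert `key` using double hashing and return the slot used.

    Assumptions: empty slots are None, and the second hash is
    h2(key) = 1 + (key mod (N - 1)), which is never zero.
    """
    n = len(table)
    h1 = key % n
    h2 = 1 + key % (n - 1)
    for i in range(n):
        slot = (h1 + i * h2) % n        # step size depends on the key itself
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("hash table is full")


print(quadratic_probe_sequence(2, 10, 3))  # [3, 6, 1]

table = [None] * 7
for k in (10, 17, 24):                     # all three keys hash to slot 3
    double_hash_insert(table, k)
print(table)  # [None, None, 17, 10, 24, None, None]
```

Keys 10, 17 and 24 all share the home slot 3, yet 17 and 24 move away with different step sizes (6 and 1). This is how a second hash function avoids the shared probe sequence that causes secondary clustering in quadratic probing.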
