M.Tech I Sem CSE ADS &A Unit-3 (R19)    Prepared by: Dr. Md. Umar khan, KHIT, Guntur

Syllabus: Dictionaries, ADT, The List ADT, Stack ADT, Queue ADT, Hash Table Representation, Hash Functions, Collision Resolution - Separate Chaining, Open Addressing - Linear Probing, Double Hashing.

UNIT-III
DICTIONARIES

(1) ADT (Abstract Data Type):
> A data type is a collection of values and a set of operations on those values.
> An abstract data type (ADT) is a collection of values and a set of operations, specified without reference to any implementation. The purpose of an ADT is to hide the implementation details of a data structure, thus improving software maintenance, reuse and portability.
> The abstract data type NaturalNumber is given by
    class NaturalNumber {
        NaturalNumber Zero();                     // returns zero
        bool IsZero();                            // if the number is 0 return true, else return false
        NaturalNumber Add(a, b);                  // returns the sum of two natural numbers a and b
        bool Equal(a, b);                         // returns true if the natural numbers a and b are equal, else false
        NaturalNumber Subtract(NaturalNumber y);  // returns the difference of two natural numbers
    }
> The different ADTs include
    + List ADT
    + Stack (last-in, first-out) ADT
    + Queue (first-in, first-out) ADT
    + Binary Search Tree ADT, etc.

(2) The List ADT:
> The List ADT is also known as the linked list abstract data type.
> A linked list is an abstract data type that holds a collection of nodes, which can be accessed sequentially. A linked list does not provide random access to a node. Usually each node is connected to the next node and/or the previous one, which gives the linked effect.
> When the nodes are connected only by a next pointer, the list is called a singly linked list; when they are connected by both next and previous pointers, it is called a doubly linked list.
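The singly linked arrangement described above can be sketched in C++ roughly as follows. This is only an illustration under stated assumptions: the names Node, SinglyList and toVector are hypothetical, the list stores int values, and removing from an empty list is treated as a caller error.

```cpp
#include <cassert>
#include <vector>

// Hypothetical names: Node and SinglyList are illustrative, not from the notes.
struct Node {
    int value;
    Node* next;   // link to the following node; nullptr marks the end
};

class SinglyList {
    Node* first = nullptr;
public:
    SinglyList() = default;
    SinglyList(const SinglyList&) = delete;             // keep ownership simple
    SinglyList& operator=(const SinglyList&) = delete;

    ~SinglyList() {
        while (first != nullptr) {
            Node* dead = first;
            first = first->next;
            delete dead;
        }
    }

    // Add a node at the beginning: constant time.
    void prepend(int v) { first = new Node{v, first}; }

    // Add a node at the end: walks the whole chain of next pointers.
    void append(int v) {
        Node* fresh = new Node{v, nullptr};
        if (first == nullptr) { first = fresh; return; }
        Node* cur = first;
        while (cur->next != nullptr) cur = cur->next;
        cur->next = fresh;
    }

    // Remove the first node and return its value.
    int popFirst() {
        assert(first != nullptr);   // empty list is a caller error in this sketch
        Node* dead = first;
        int v = dead->value;
        first = first->next;
        delete dead;
        return v;
    }

    // Sequential access only: reaching the i-th node means following i links.
    std::vector<int> toVector() const {
        std::vector<int> out;
        for (Node* cur = first; cur != nullptr; cur = cur->next)
            out.push_back(cur->value);
        return out;
    }
};
```

A doubly linked variant would add a prev pointer to Node and update it in prepend, append and popFirst; that is left out here to keep the sketch short.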
> The abstract data type LinkedList is given by
    class LinkedList {
        void prepend(value);   // add a node at the beginning
        void append(value);    // add a node at the end
        int pop();             // remove a node from the end
        int popFirst();        // remove a node from the beginning
        int head();            // return the first node
        int tail();            // return the last node
        int remove(node);      // remove the given node from the list
    }

(3) The Stack:
(i) Stack structure representation:
> A stack is a data structure in which additions and deletions are both made at one end, the top of the stack.
> That is, we can perform two operations on a stack:
    1. Adding an element to the stack, known as push.
    2. Deleting an element from the stack, known as pop.
> For example, if we add the elements A, B, C and D to the stack, then D is the first element deleted from the stack. This is shown in the following figure.
[Figure: an array-based stack with positions -1 to max-1; top moves up as A, B, C and D are pushed, and moves back to C after one pop removes D.]
> Here the last element inserted into the stack is the first element removed from the stack.
> So stacks are also known as LAST-IN-FIRST-OUT (LIFO) lists.
> There are two special situations in a stack:
    1. The stack is empty (i.e. no elements in the stack); a pop attempted in this state is a stack underflow.
    2. The stack is full (i.e.
maximum elements in the stack); a push attempted in this state is a stack overflow.
> Both overflow and underflow are treated as error situations while implementing a stack.
> The stack representation in C++ is given below.
    class singlestack {
        int top, stack[max];
    public:
        singlestack() { top = -1; }
        void push(int);
        void pop();
        void display();
    };

(ii) The Stack ADT Algorithm:
> The stack ADT algorithm is given by
    class Stack {
        Stack create(maxStackSize);  // create an empty stack whose maximum size is maxStackSize
        bool isEmpty();              // if the number of elements in the stack is 0, return true, else return false
        void push(stack, item);      // insert item at the top of the stack
        Element pop();               // delete the top element of the stack
        void display();              // display the elements in the stack
    }

(iii) The different stack operations:
> The stack operations are given by
(a) The push() operation:
    void singlestack::push(int x)
    {
        if (top == max - 1) {
            cout << "\nStack is Full, Insertion is not possible!!";
            return;
        } else {
            top++;
            stack[top] = x;
            cout << "\nInsertion success!!!";
        }
    }
(b) The pop() operation:
    void singlestack::pop()
    {
        int y;
        if (top == -1) {
            cout << "\nStack is Empty, Deletion is not possible";
            return;
        } else {
            y = stack[top];
            top--;
            cout << "\nDeleted element is: " << y;
        }
    }
(c) The display() operation:
    void singlestack::display()
    {
        int i;
        for (i = top; i >= 0; i--)   // print from the top down to the bottom
            cout << stack[i] << " ";
    }

(4) The Queue:
(i) Queue structure representation:
> A queue is a data structure in which additions are made at one end and deletions are made at the other end. That is, we can perform two operations on a queue:
    1. Adding an element to the queue at the rear, known as addq.
    2. Deleting an element from the queue at the front, known as deleteq.
> For example, if we insert A, B, C and D, then A is the first element deleted from the queue.
> This is shown in the following figure.
[Figure: an array-based queue with positions up to max-1; rear (r) advances as A, B, C and D are added with addq, and front (f) advances when deleteq removes A. f = front, r = rear.]
> Here the first element inserted into the queue is the first element removed from the queue.
> So queues are also known as FIRST-IN-FIRST-OUT (FIFO) lists.
> There are two special situations in a queue:
    1. The queue is empty (i.e. no elements in the queue); a deleteq attempted in this state is a queue underflow.
    2. The queue is full (i.e.
maximum elements in the queue); an addq attempted in this state is a queue overflow.
> Both overflow and underflow are treated as error situations while implementing a queue.
> The queue representation in C++ is given below.
    class simplequeue {
        int front, rear, queue[5];
    public:
        simplequeue() { front = rear = -1; }   // empty queue: front == rear
        void addq(int);
        void deleteq();
        void display();
    };

(ii) The Queue ADT Algorithm:
> The queue ADT algorithm is given by
    class Queue {
        Queue create(maxQueueSize);  // create an empty queue whose maximum size is maxQueueSize
        bool isEmpty();              // if the number of elements in the queue is 0, return true, else return false
        void addq(queue, item);      // insert item at the rear of the queue
        Element deleteq();           // delete the element at the front of the queue
        void display();              // display the elements in the queue
    }

(iii) The different queue operations:
> The queue operations are given by
(a) The addq() operation:
    void ordqueue::addq(int x)
    {
        if (rear == max - 1) {
            cout << "\nQueue is Full, So insertion is not possible";
            return;
        } else {
            rear++;
            queue[rear] = x;
            cout << "\nInsertion success!!!";
        }
    }
(b) The deleteq() operation:
    void ordqueue::deleteq()
    {
        int y;
        if (front == rear) {
            cout << "\nQueue is Empty, Deletion is not possible";
            return;
        } else {
            front++;
            y = queue[front];
            cout << "\nDeleted element is: " << y;
        }
    }
(c) The display() operation:
    void ordqueue::display()
    {
        int i;
        if (front == rear) {
            cout << "\nQueue is Empty. No elements";
            return;
        }
        cout << "\nQueue elements are: ";
        for (i = front + 1; i <= rear; i++)   // front marks the slot just before the first element
            cout << queue[i] << " ";
    }

(5) Hash Table:
> A hash table is a data structure that stores data in an array format, where each data item has its own index value. This index value is known as a bucket.
> Using a hash function we can compute an integer value (or index value) that maps to a bucket of the hash table, and store the data in the data field of that bucket.
> So in a hashing system the keys are stored in an array, which is called the hash table.
> A perfectly implemented hash table gives an average insertion/retrieval time of O(1).
> Thus, a hash table is a data structure in which insertion and search operations are very fast, irrespective of the size of the data.
> The maximum number of elements that we can insert in a hash table is called the size of the hash table.
> If there are m buckets available, then the hash key is computed using the following formula:
    Hash key (or hash or home address) = key % size of hash table
    i.e. h(k) = k % m, where m is the number of buckets (i.e. the size of the hash table)

Example (i):
> Suppose we have the list 10, 9, 8, 5 and we want to store these values in the hash table.
> Let the number of buckets be 8, as shown in the following figure.
    Key   Hash function   Bucket
    10    10 % 8 = 2      [2]
    9     9 % 8 = 1       [1]
    8     8 % 8 = 0       [0]
    5     5 % 8 = 5       [5]

    Bucket   Data field
    [0]      8
    [1]      9
    [2]      10
    [3]
    [4]
    [5]      5
    [6]
    [7]
> The data stored in a hash table are called dictionary pairs.
> If the hash table is represented by ht and there are m buckets, then the buckets are ht[0], ht[1], ..., ht[m-1].
> A bucket consists of s slots. When s = 1, each bucket can hold exactly one pair.
> The key density of a hash table is the ratio n/T, where n is the number of pairs in the table and T is the total number of possible keys.
> The loading density or loading factor of a hash table is α = n/(sm).
> If many keys have the same hash key, the home bucket of a new dictionary pair may already be full.
> This is an overflow. Inserting a pair whose home bucket is already occupied is a collision.

Example (ii):
> Consider a hash table with m = 26 buckets and s = 2 slots per bucket. Assume that there are n = 10 distinct keys: GA, D, A, G, F, A2, A1, A3, A4, E.
> The loading factor α for this table is 10/52 = 0.19.
> The letters A to Z correspond to the numbers 0 to 25 respectively.
> The hash function h maps each of the possible keys into one of the numbers 0 to 25.
> The function h is defined by h(k) = the first character of k. Then the hash keys for GA, D, A, G, F, A2, A1, A3, A4, E are 6, 3, 0, 6, 5, 0, 0, 0, 0, 4 respectively. The hash table is shown in the following figure.
    Bucket   Slot 1   Slot 2
    [0]      A        A2
    [1]
    [2]
    [3]      D
    [4]      E
    [5]      F
    [6]      GA       G
    ...
    [25]
> In the above table, inserting A1, A3 and A4 causes a conflict because both slots of bucket [0] are already full.
> The cost of the various operations on a hash table is shown below.
    Operation   Average case   Worst case
    Search      O(1)           O(n)
    Insert      O(1)           O(n)
    Delete      O(1)           O(n)

(6) Hash Functions:
> Hashing is a technique through which given data can be stored in and retrieved from a hash table.
> A hash function is required which can generate an integer value for the given input.
> This integer value is used to store the given value in a particular bucket of the hash table.
> For example, if we have 100 buckets available in the hash table, then the integer value should lie between 0 and 100 (0 inclusive, 100 exclusive), i.e. from 0 to 99.
> If k is a key chosen at random, then we want the probability that h(k) = i to be 1/m for all buckets i.
> In this case the random key has an equal chance of hashing into any of the buckets.
> Such a hash function is called a uniform hash function. To find a hash function for a string, first convert the string into a non-negative integer and then apply the hash function.

Types of Hash Functions:
a) Division Method:
> If there are m buckets available, then the hash key is computed and mapped to the m buckets using the following formula:
    h(k) = k mod m, where k is the key and m is the size of the hash table
> For example, apply the division method to the values 10, 9, 8, 5 with 8 buckets, as shown below.
    Key   Hash function   Bucket
    10    10 % 8 = 2      [2]
    9     9 % 8 = 1       [1]
    8     8 % 8 = 0       [0]
    5     5 % 8 = 5       [5]
b) Folding Method:
> In this method, the given key k is partitioned into subparts k1, k2, k3, ..., kn, each of the same length as the required address.
> Now add all these parts together to obtain the hash address for key k.
> For example, let the key be 123456789; then a 3-digit address can be calculated as 123 + 456 + 789 = 1368. To keep the address to 3 digits, the leading carry can be discarded, giving 368.
c) Mid-square Method:
> In this method, the hash key for a given key is obtained by squaring the key and taking the required number of digits from the middle of the square as the bucket address.
> For example, if the key is 12345, the square of this key is 152399025. If a 2-digit address is required, then the 4th and 5th digits can be chosen, giving the address 39.
> It is possible that two keys get the same hash key.
> This is known as a collision. A hash function that maps each key to a distinct bucket (index) is known as a perfect hash function.
d) Digit Analysis:
> This method forms the address by selecting and shifting digits of the original key.
> For a given key set, the same positions in the key and the same rearrangement must be used.
> For example, the key 7654321 is transformed to the address 1247 by selecting the digits in positions 1, 2, 4 and 7 (counted from the right), which appear in the key as 7, 4, 2, 1, and then reversing their order.

Converting Strings to Integers:
> To convert a string into a non-negative integer, use the following algorithm.
    unsigned int StringToInt(char *key)
    {
        unsigned int hashvalue = 0;
        int i;
        for (i = 0; key[i] != '\0'; i++)
            hashvalue += (unsigned char)key[i];   // add the integer code of each character
        return hashvalue;
    }
> The above function converts each character into its integer code and sums these integers.

(7) Collision & Collision Resolution Techniques:
(i) Collision:
> A situation in which two or more data elements map to the same location in the hash table is called a collision.
> In such a situation, two or more data elements qualify to be stored at the same location in the hash table.
> For example, let the table size be 17; the keys 18 and 35 hash to the same value: 18 mod 17 = 1 and 35 mod 17 = 1.
(ii) Collision Resolution Techniques:
> When a collision occurs, any of the following collision resolution techniques can be used to resolve it:
    1. Open Addressing (closed hashing)
    2.
Chaining or Separate Chaining (open hashing)

1. Open Addressing:
> Open addressing can be of three types:
1. Linear Probing:
> When a collision occurs, the data is stored in the next available empty slot in the table.
> If there is no empty slot, the insertion fails.
Example: Insert 19, 22, 29, 26, 15 into a table of size 7, using key % table size.
    19 % 7 = 5 (insert)
    22 % 7 = 1 (insert)
    29 % 7 = 1 (collision), next empty location = 2 (insert)
    26 % 7 = 5 (collision), next empty location = 6 (insert)
    15 % 7 = 1 (collision), next empty location = 3 (insert)

    [0]
    [1]  22
    [2]  29
    [3]  15
    [4]
    [5]  19
    [6]  26
> In linear probing, colliding elements tend to form contiguous groups of occupied slots. This effect is called primary clustering.
> Clustering increases the time taken by searches and insertions.
> This is a drawback of linear probing.
> If there are no empty locations, the insertion fails. This is another drawback of linear probing.
2. Quadratic Probing:
> To resolve the primary clustering problem, quadratic probing can be used.
> With quadratic probing, rather than always moving one spot, the i-th attempt moves i² spots from the point of collision, where i is the number of attempts made to resolve the collision.
> When the computed address is greater than or equal to the table size, address % size is used to place the element.
Example: Insert 19, 22, 29, 26, 15 into a table of size 7.
    h(19) = 19 % 7 = 5 (insert)
    h(22) = 22 % 7 = 1 (insert)
    h(29) = 29 % 7 = 1 (collision)
        => 1 + 1² = 2 (insert)
    h(26) = 26 % 7 = 5 (collision)
        => 5 + 1² = 6 (insert)
    h(15) = 15 % 7 = 1 (collision)
        => 1 + 1² = 2 (collision)
        => 1 + 2² = 5 (collision)
        => 1 + 3² = 10; 10 is outside the table, so take 10 % 7 = 3 (insert)

    [0]
    [1]  22
    [2]  29
    [3]  15
    [4]
    [5]  19
    [6]  26
> Once the table is more than half full, quadratic probing may fail to find an empty spot.
> In addition, keys that hash to the same home address follow the same probe sequence. This new problem is known as the secondary clustering problem.
3.
Double Hashing:
> When a collision occurs, a second hash function is used to resolve it; this technique is called double hashing.
> The result of the second hash function gives the number of positions to move from the point of collision.
    h(key) = hash1(key) = key % size                      - when no collision occurs
    h(key) = hash1(key) + i*hash2(key), for i = 1 to n    - when a collision occurs
    Here hash1(key) = key % size and hash2(key) = R - (key % R), where R is a prime number smaller than the size of the table.
> When the computed address is greater than or equal to the table size, address % size is used to place the element.
Example: Insert 15, 19, 22, 26, 29 into a table of size 7 (R = prime number before 7 = 5).
    h(15) = 15 % 7 = 1 (insert)
    h(19) = 19 % 7 = 5 (insert)
    h(22) = 22 % 7 = 1 (collision)
        For i = 1: h(22) = hash1(key) + i*hash2(key)
                         = (22 % 7) + 1*[5 - (22 % 5)]
                         = 1 + [1*(5 - 2)]
                         = 1 + 3 = 4 (insert)
    h(26) = 26 % 7 = 5 (collision)
        For i = 1: h(26) = (26 % 7) + 1*[5 - (26 % 5)]
                         = 5 + [1*(5 - 1)]
                         = 5 + 4 = 9 (out of range), so take 9 % 7 = 2 (insert)
    h(29) = 29 % 7 = 1 (collision)
        For i = 1: h(29) = (29 % 7) + 1*[5 - (29 % 5)]
                         = 1 + [1*(5 - 4)]
                         = 1 + 1 = 2 (collision)
        For i = 2: h(29) = (29 % 7) + 2*[5 - (29 % 5)]
                         = 1 + [2*(5 - 4)]
                         = 1 + 2 = 3 (insert)

    [0]
    [1]  15
    [2]  26
    [3]  29
    [4]  22
    [5]  19
    [6]

2. Chaining (or) Separate Chaining:
> Separate chaining maintains a linked list for each slot, in which the elements mapped to that slot are placed.
> When a collision occurs, elements with the same hash key are chained together.
> A chain is simply a linked list of all the elements with the same hash key. The hash table slots no longer hold table elements; instead, each slot holds the address of a table element.
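The chaining idea described above can be sketched in C++ as follows. This is a minimal illustration, not a definitive implementation: the class name ChainedTable and its member names are hypothetical, keys are assumed to be non-negative integers, and the hash function is the division method h(k) = k % m.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical class name; assumes non-negative integer keys and h(k) = k % m.
class ChainedTable {
    std::vector<std::list<int>> buckets;   // one chain (linked list) per bucket

    std::size_t home(int key) const {
        return static_cast<std::size_t>(key) % buckets.size();
    }
public:
    explicit ChainedTable(std::size_t m) : buckets(m) { assert(m > 0); }

    // Insertion never fails: a colliding key joins the chain of its home bucket.
    void insert(int key) { buckets[home(key)].push_back(key); }

    // Search only the chain of the home bucket.
    bool contains(int key) const {
        for (int k : buckets[home(key)])
            if (k == key) return true;
        return false;
    }

    // Returns true if the key was present and has been removed.
    bool remove(int key) {
        std::list<int>& chain = buckets[home(key)];
        for (auto it = chain.begin(); it != chain.end(); ++it) {
            if (*it == key) { chain.erase(it); return true; }
        }
        return false;
    }

    // Number of keys chained at bucket b.
    std::size_t chainLength(std::size_t b) const { return buckets[b].size(); }
};
```

With a table of 7 buckets, inserting 7, 15, 22, 25, 28, 39 and 40 leaves chains of length 2 at buckets 0, 1 and 4 and a chain of length 1 at bucket 5. Search cost depends on the length of one chain rather than on the whole table.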
Example: Insert 7, 15, 22, 25, 28, 39, 40 into a table of size 7.
    7 % 7 = 0   (insert)
    15 % 7 = 1  (insert)
    22 % 7 = 1  (collision)
    25 % 7 = 4  (insert)
    28 % 7 = 0  (collision)
    39 % 7 = 4  (collision)
    40 % 7 = 5  (insert)

    [0] -> 7 -> 28
    [1] -> 15 -> 22
    [2]
    [3]
    [4] -> 25 -> 39
    [5] -> 40
    [6]

Differences between Chaining and Open addressing:
    S.No | Chaining | Open addressing
    1 | It is known as closed addressing. | It is known as open addressing.
    2 | Keys are stored in linked lists attached to the cells of the hash table. | Keys are stored directly at an index in the hash table. It does not use any linked list.
    3 | Each index maintains a list of the keys that are to be stored at that index, and the keys are linked to each other. | The index at which a key is actually stored may depend on the keys already stored in the hash table.
    4 | For example, if two keys hash to the same index, they are both inserted in the list maintained at that index. | For example, if two keys hash to the same index, the colliding key is inserted at some other free location.
    5 | Efficient for a large number of records. | Efficient for a small number of records.
    6 | Uses space efficiently. | Expensive on space.
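For comparison with chaining, open addressing keeps every key in the table itself. The sketch below shows linear probing and double hashing under stated assumptions: the function names insertLinear and insertDouble and the EMPTY marker are hypothetical, keys are non-negative integers, the home address is key % size, and the second hash is R - (key % R) as in the double hashing formula of this unit.

```cpp
#include <cassert>
#include <vector>

const int EMPTY = -1;   // marker for an unused slot; assumes keys are non-negative

// Linear probing: try home, home + 1, home + 2, ... (mod size).
// Returns the slot used, or -1 if the table is full.
int insertLinear(std::vector<int>& table, int key) {
    const int size = static_cast<int>(table.size());
    assert(size > 0 && key >= 0);
    for (int i = 0; i < size; ++i) {
        int slot = (key % size + i) % size;
        if (table[slot] == EMPTY) { table[slot] = key; return slot; }
    }
    return -1;
}

// Double hashing: step = R - (key % R), with R a prime smaller than the table size.
// Tries home + i * step (mod size) for i = 0 .. size - 1.
// Returns the slot used, or -1 if no empty slot was reached.
int insertDouble(std::vector<int>& table, int key, int R) {
    const int size = static_cast<int>(table.size());
    assert(size > 0 && key >= 0 && R > 0 && R < size);
    const int step = R - (key % R);
    for (int i = 0; i < size; ++i) {
        int slot = (key % size + i * step) % size;
        if (table[slot] == EMPTY) { table[slot] = key; return slot; }
    }
    return -1;
}
```

Starting from a table of 7 EMPTY slots, calling insertLinear with 19, 22, 29, 26, 15 in that order uses slots 5, 1, 2, 6, 3, and calling insertDouble with 15, 19, 22, 26, 29 and R = 5 on a fresh table uses slots 1, 5, 4, 2, 3, which agrees with the worked examples in these notes.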
