
UNIT 5

Sorting and Searching: Objective and properties of different sorting algorithms: Selection
Sort, Insertion Sort, Quick Sort, Merge Sort, Heap Sort, Linear and Binary Search
algorithms, Hashing (linear probing, random probing, quadratic probing, rehashing, double
hashing), Dictionaries

Graph: Basic Terminologies and Representations, Graph traversal techniques.

5.1 Sorting Algorithms

Sorting is a process of arranging elements in a certain order. Numeric data may be sorted in
ascending or descending order. Alphabets or strings may be sorted in lexicographical order.

Sorting is a topic that has been extensively researched and investigated in the field of computer
science. Sorting is a fundamental need in many applications. Sorting data makes it easier to
search through a given data sample, efficiently and quickly.

A large class of sorting algorithms is available. Each has trade-offs and suits a particular
application domain; no single algorithm fulfils every objective. However, the most often used
measure for selecting the optimal algorithm is running time.

5.1.1 Properties of Sorting Algorithm

Following are the properties of sorting algorithms:

▪ In place: Requires only constant additional space to sort the data. Sometimes
O(log n) space is allowed.
▪ Stable: Does not alter the relative order of equal elements after sorting
▪ Online: Sorts the data as it arrives
▪ Adaptive: Performs better when the input is already partially sorted
▪ Incremental: Builds the sorted sequence one element at a time

5.1.2 Complexity of Sorting Algorithm

The complexity of a sorting algorithm measures its running time as a function of the number
of items 'n' to sort. Which sorting technique is appropriate for a problem depends on several
factors. The following are the most important considerations:
• The amount of time taken by a programmer in developing a certain sorting program.
• The amount of machine time required to run the program
• The amount of memory required to run the program

5.1.3 Efficiency of Sorting Algorithm

The efficiency of an algorithm is determined by how many comparisons it performs to sort the
given data. Sorting algorithms are sensitive to the arrangement of the input data.

Algorithms are analyzed based on following cases:

• Best case
• Worst case
• Average case

5.1.4 Types of Sorting Algorithm

a) Comparison Based Sorting

In comparison based sorting methods, data are sorted by comparing two data elements. The
comparator function is defined to compare and sort the data. Example: Selection sort, Bubble
sort, Insertion sort

b) Counting Based Sorting

These sorting algorithms make no comparisons between elements; instead they rely on
assumptions about the data, such as a known range of key values. Example: Radix sort, Bucket
sort, Counting sort

c) In-place vs Not in-place Sorting

In data structures, in-place sorting algorithms change the ordering of array items within the
original array. Not-in-place sorting methods, on the other hand, sort the original array using an
auxiliary data structure. Examples of in-place algorithms are Quick sort, Insertion sort and
Selection sort. An example of a not-in-place algorithm is Merge sort.

5.2 Selection Sort

Selection sort is conceptually the simplest sorting algorithm. This algorithm will first find the
smallest element in the array and swap it with the element in the first position, then it will find
the second smallest element and swap it with the element in the second position, and it will
keep on doing this until the entire array is sorted. It is called selection sort because it repeatedly
selects the next-smallest element and swaps it into the right place.

Selection sort is generally used when

• A small array is to be sorted


• Swapping cost doesn't matter
• It is compulsory to check all elements
The selection sort algorithm is performed using the following steps:

Step 1: Start with the first position in the list as the current position.
Step 2: Find the smallest element (for ascending order) among the elements from the current
position to the end of the list.
Step 3: Swap that smallest element with the element at the current position.
Step 4: Move the current position one place forward and repeat steps 2 and 3 until the entire
list is sorted.

Selection Sort Logic:

for(i = 0; i < size - 1; i++){
    min = i;                        // assume the current position holds the minimum
    for(j = i + 1; j < size; j++){
        if(list[j] < list[min])     // a smaller element is found
            min = j;
    }
    if(min != i)                    // swap the minimum into the current position
    {
        temp = list[i];
        list[i] = list[min];
        list[min] = temp;
    }
}

Example:
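For instance, on the list 29, 10, 14, 37, 13 the passes proceed as follows (a worked trace of
the logic above):

Pass 1: smallest is 10, swap with 29 → 10, 29, 14, 37, 13
Pass 2: smallest of the rest is 13, swap with 29 → 10, 13, 14, 37, 29
Pass 3: smallest of the rest is 14, already in place → 10, 13, 14, 37, 29
Pass 4: smallest of the rest is 29, swap with 37 → 10, 13, 14, 29, 37
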
Complexity of the Selection Sort Algorithm

To sort an unsorted list with 'n' number of elements, we need to make ((n-1)+(n-2)+(n-
3)+......+1) = (n (n-1))/2 number of comparisons. The count does not drop even when the list
is already sorted, since the minimum must still be located on every pass.

Worst Case: O(n²)

Best Case: Ω(n²)
Average Case: Θ(n²)

5.3 Insertion Sort

Insertion sort algorithm arranges a list of elements in a particular order. In insertion sort
algorithm, every iteration moves an element from unsorted portion to sorted portion until all
the elements are sorted in the list.

Characteristics of Insertion Sort:

• It is efficient for smaller data sets, but very inefficient for larger lists.
• Insertion sort is adaptive, meaning it reduces its total number of steps if a partially
sorted array is provided as input, making it efficient in such cases.
• In practice it usually outperforms the Selection Sort and Bubble Sort algorithms.

The insertion sort algorithm is performed using the following steps:

Step 1: Assume that first element in the list is in sorted portion and all the remaining elements
are in unsorted portion.
Step 2: Take first element from the unsorted portion and insert that element into the sorted
portion in the order specified.
Step 3: Repeat the above process until all the elements from the unsorted portion are moved
into the sorted portion.

Insertion Sort Logic

//Insertion sort logic


for (i = 1; i < size; i++) {
    temp = list[i];              // element to insert into the sorted portion
    j = i - 1;
    while (j >= 0 && list[j] > temp) {
        list[j + 1] = list[j];   // shift larger elements one position right
        j = j - 1;
    }
    list[j + 1] = temp;          // insert at its correct position
}

Example
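For instance, on the list 12, 3, 9, 7 the passes proceed as follows (a worked trace of the logic
above):

i = 1: insert 3 → 3, 12, 9, 7
i = 2: insert 9 → 3, 9, 12, 7
i = 3: insert 7 → 3, 7, 9, 12
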
Complexity of the Insertion Sort Algorithm

To sort an unsorted list with 'n' number of elements, we need to make (1+2+3+......+(n-1)) = (n
(n-1))/2 number of comparisons in the worst case. If the list is already sorted, then only n-1
comparisons are required.

• Worst Case : O(n²)

• Best Case : Ω(n)
• Average Case : Θ(n²)

5.4 Quick Sort

Quick Sort is a sorting technique based on the concept of Divide and Conquer, just like merge
sort. But in quick sort all the heavy lifting (major work) is done while dividing the array into
subarrays, while in merge sort all the real work happens during merging of the subarrays. In
quick sort, the combine step does absolutely nothing.

It is also called partition-exchange sort. This algorithm divides the list into three main parts:

i. Elements less than the Pivot element


ii. Pivot element(Central element)
iii. Elements greater than the pivot element

Pivot element can be any element from the array, it can be the first element, the last element
or any random element. In this tutorial, we will take the rightmost element or the last element
as pivot.

Following are the steps involved in quick sort algorithm:


Step 1: After selecting an element as pivot, which is the last index of the array in our case, we
divide the array for the first time.
Step 2: In quick sort, we call this partitioning. It is not a simple breaking down of the array
into 2 subarrays; in partitioning, the array elements are positioned so that all the
elements smaller than the pivot are on the left side of the pivot and all the elements
greater than the pivot are on the right side of it.
Step 3: And the pivot element will be at its final sorted position.
Step 4: The elements to the left and right, may not be sorted.
Step 5: Then we pick subarrays, elements on the left of pivot and elements on the right of
pivot, and we perform partitioning on them by choosing a pivot in the subarrays.

Quick Sort Logic

//Quick Sort Logic


void quickSort(int list[], int first, int last){
    int pivot, i, j, temp;

    if(first < last){

        pivot = list[last];          // rightmost element as pivot
        i = first - 1;

        for(j = first; j < last; j++){
            if(list[j] <= pivot){    // move smaller elements to the left side
                i++;
                temp = list[i];
                list[i] = list[j];
                list[j] = temp;
            }
        }

        temp = list[i + 1];          // place the pivot at its final sorted position
        list[i + 1] = list[last];
        list[last] = temp;

        quickSort(list, first, i);       // sort elements left of the pivot
        quickSort(list, i + 2, last);    // sort elements right of the pivot
    }
}
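
A minimal driver for the function above (a sketch; it uses the example array discussed below):

#include <stdio.h>

int main(void) {
    int list[] = {9, 7, 5, 11, 12, 2, 14, 3, 10, 6};
    int n = sizeof(list) / sizeof(list[0]);

    quickSort(list, 0, n - 1);

    for (int i = 0; i < n; i++)
        printf("%d ", list[i]);    /* prints: 2 3 5 6 7 9 10 11 12 14 */
    return 0;
}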

Example

Let's consider an array with values

{9, 7, 5, 11, 12, 2, 14, 3, 10, 6}

Below, we have a pictorial representation of how quick sort will sort the given array.
In step 1, we select the last element as the pivot, which is 6 in this case, and call for partitioning,
hence re-arranging the array in such a way that 6 will be placed in its final position and to its
left will be all the elements less than it and to its right, we will have all the elements greater
than it.

Then we pick the subarray on the left and the subarray on the right and select a pivot for them,
in the above diagram, we chose 3 as pivot for the left subarray and 11 as pivot for the right
subarray.

And we again call for partitioning.

Complexity of the Quick Sort Algorithm

In the worst case, quick sort makes ((n-1)+(n-2)+(n-3)+......+1) = (n (n-1))/2 number of
comparisons. With the last element as pivot, the worst case occurs when the list is already
sorted, because every partition is maximally unbalanced. On average the partitions are
roughly balanced and only about n log n comparisons are needed.

• Worst Case: O(n²)

• Best Case: O(n log n)
• Average Case: O(n log n)
5.5 Merge Sort

In Merge Sort, the given unsorted array with n elements, is divided into n subarrays, each
having one element, because a single element is always sorted in itself. Then, it repeatedly
merges these subarrays, to produce new sorted subarrays, and in the end, one complete sorted
array is produced.

The concept of Divide and Conquer involves three steps:

• Divide the problem into multiple small problems.


• Conquer the subproblems by solving them. The idea is to break down the problem into
atomic subproblems, where they are actually solved.
• Combine the solutions of the subproblems to find the solution of the actual problem.

Following are the steps involved in Merge sort algorithm:

Step 1: Split the given list into two halves (roughly equal halves in case of a list with an odd
number of elements).
Step 2: Continue dividing the subarrays in the same manner until you are left with only single
element arrays.
Step 3: Starting with the single element arrays, merge the subarrays so that each merged
subarray is sorted.
Step 4: Repeat step 3 until we end up with a single sorted array.

Merge Sort Logic

void merge(int *a, int low, int high, int mid);   // forward declaration

void mergesort(int *a, int low, int high)
{
    int mid;
    if (low < high)
    {
        mid = (low + high) / 2;
        mergesort(a, low, mid);        // sort the left half
        mergesort(a, mid + 1, high);   // sort the right half
        merge(a, low, high, mid);      // merge the two sorted halves
    }
    return;
}
// merging of the two sorted halves starts here
void merge(int *a, int low, int high, int mid)
{
    int i, j, k, c[50];                // temporary array; assumes at most 50 elements
    i = low;
    k = low;
    j = mid + 1;
    while (i <= mid && j <= high)      // pick the smaller element from either half
    {
        if (a[i] < a[j])
        {
            c[k] = a[i];
            k++;
            i++;
        }
        else
        {
            c[k] = a[j];
            k++;
            j++;
        }
    }
    while (i <= mid)                   // copy any leftovers from the left half
    {
        c[k] = a[i];
        k++;
        i++;
    }
    while (j <= high)                  // copy any leftovers from the right half
    {
        c[k] = a[j];
        k++;
        j++;
    }
    for (i = low; i < k; i++)          // copy the merged result back
    {
        a[i] = c[i];
    }
}

Example

Let's consider an array with values {14, 7, 3, 12, 9, 11, 6, 12}
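
The array splits and then merges as follows (a worked pass derived from the list above):

Split:  {14, 7, 3, 12}  {9, 11, 6, 12}
        {14, 7} {3, 12} {9, 11} {6, 12}
        {14} {7} {3} {12} {9} {11} {6} {12}
Merge:  {7, 14} {3, 12} {9, 11} {6, 12}
        {3, 7, 12, 14}  {6, 9, 11, 12}
        {3, 6, 7, 9, 11, 12, 12, 14}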


Complexity of the Merge Sort Algorithm

• Best Case: O(n log n)

• Average Case: O(n log n)
• Worst Case: O(n log n)
5.6 Heap Sort

A heap is a complete binary tree. A binary tree is a tree in which each node can have at most
two children, and a complete binary tree is a binary tree in which all the levels except the last
are completely filled, and all the nodes in the last level are left-justified.

Heapsort is a popular and efficient sorting algorithm. The concept of heap sort is to eliminate
the elements one by one from the heap part of the list, and then insert them into the sorted part
of the list.

Working of the Heapsort Algorithm:

In heap sort, there are basically two phases involved in the sorting of elements. They are as
follows:

Step 1: The first step includes the creation of a max heap by adjusting the elements of the array.
Step 2: After the creation of the heap, repeatedly remove the root element of the heap by
shifting it to the end of the array, and then restore the heap structure with the remaining
elements.

Heap Sort Logic

void swap(int *a, int *b)    // helper to exchange two array elements
{
    int temp = *a;
    *a = *b;
    *b = temp;
}

void heapify(int arr[], int n, int i)
{
    int largest = i;
    int l = 2*i + 1;
    int r = 2*i + 2;

    // if left child is larger than root
    if (l < n && arr[l] > arr[largest])
        largest = l;

    // if right child is larger than largest so far
    if (r < n && arr[r] > arr[largest])
        largest = r;

    // if largest is not root
    if (largest != i)
    {
        swap(&arr[i], &arr[largest]);

        // recursively heapify the affected sub-tree
        heapify(arr, n, largest);
    }
}

void heapSort(int arr[], int n)
{
    // build max heap (rearrange array)
    for (int i = n / 2 - 1; i >= 0; i--)
        heapify(arr, n, i);

    // one by one extract an element from the heap
    for (int i = n - 1; i > 0; i--)
    {
        // move current root to end
        swap(&arr[0], &arr[i]);

        // call max heapify on the reduced heap
        heapify(arr, i, 0);
    }
}
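
A minimal driver for the functions above (a sketch; the initial ordering of the values is
hypothetical, chosen to match the values used in the walkthrough below):

#include <stdio.h>

int main(void) {
    /* hypothetical initial ordering of the walkthrough's values */
    int arr[] = {81, 89, 9, 11, 14, 76, 54, 22};
    int n = sizeof(arr) / sizeof(arr[0]);

    heapSort(arr, n);

    for (int i = 0; i < n; i++)
        printf("%d ", arr[i]);    /* prints: 9 11 14 22 54 76 81 89 */
    return 0;
}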

Example

Let's take an unsorted array and try to sort it using heap sort.

First, we have to construct a heap from the given array and convert it into max heap.

After converting the given heap into max heap, the array elements are -

Next, we have to delete the root element (89) from the max heap. To delete this node, we have
to swap it with the last node, i.e. (11). After deleting the root element, we again have to heapify
it to convert it into max heap.
After swapping the array element 89 with 11, and converting the heap into max-heap, the
elements of array are -

In the next step, again, we have to delete the root element (81) from the max heap. To delete
this node, we have to swap it with the last node, i.e. (54). After deleting the root element, we
again have to heapify it to convert it into max heap.

After swapping the array element 81 with 54 and converting the heap into max-heap, the
elements of array are -

In the next step, we have to delete the root element (76) from the max heap again. To delete
this node, we have to swap it with the last node, i.e. (9). After deleting the root element, we
again have to heapify it to convert it into max heap.
After swapping the array element 76 with 9 and converting the heap into max-heap, the
elements of array are -

In the next step, again we have to delete the root element (54) from the max heap. To delete
this node, we have to swap it with the last node, i.e. (14). After deleting the root element, we
again have to heapify it to convert it into max heap.

After swapping the array element 54 with 14 and converting the heap into max-heap, the
elements of array are -

In the next step, again we have to delete the root element (22) from the max heap. To delete
this node, we have to swap it with the last node, i.e. (11). After deleting the root element, we
again have to heapify it to convert it into max heap.

After swapping the array element 22 with 11 and converting the heap into max-heap, the
elements of array are -

In the next step, again we have to delete the root element (14) from the max heap. To delete
this node, we have to swap it with the last node, i.e. (9). After deleting the root element, we
again have to heapify it to convert it into max heap.
After swapping the array element 14 with 9 and converting the heap into max-heap, the
elements of array are -

In the next step, again we have to delete the root element (11) from the max heap. To delete
this node, we have to swap it with the last node, i.e. (9). After deleting the root element, we
again have to heapify it to convert it into max heap.

After swapping the array element 11 with 9, the elements of array are -

Now, heap has only one element left. After deleting it, heap will be empty.

After completion of sorting, the array elements are -

Now, the array is completely sorted.

Complexity of the Heap Sort Algorithm

To sort an unsorted list with 'n' number of elements, following are the complexities...

• Worst Case: O(n log n)


• Best Case: O(n log n)
• Average Case: O(n log n)
Algorithm        Best           Average        Worst          Space (Worst)

Selection Sort   O(n²)          O(n²)          O(n²)          O(1)

Insertion Sort   O(n)           O(n²)          O(n²)          O(1)

Merge Sort       O(n log n)     O(n log n)     O(n log n)     O(n)

Quick Sort       O(n log n)     O(n log n)     O(n²)          O(n)

Heap Sort        O(n log n)     O(n log n)     O(n log n)     O(1)

5.7 Linear Search Algorithm


Linear search is also called the sequential search algorithm. It is the simplest searching
algorithm. In linear search, we simply traverse the list completely and match each element of
the list with the item whose location is to be found. If a match is found, then the location of
the item is returned; otherwise, the algorithm returns -1 (not found). It is widely used to search
an element in an unordered list, i.e., a list in which the items are not sorted. The worst-case
time complexity of linear search is O(n).

Linear search is implemented using following steps:

Step 1: Read the search element from the user.


Step 2: Compare the search element with the first element in the list.
Step 3: If both are matched, then display "Given element is found!!!" and terminate the
function
Step 4: If both are not matched, then compare search element with the next element in the list.
Step 5: Repeat steps 3 and 4 until search element is compared with last element in the list.
Step 6: If last element in the list also doesn't match, then display "Element is not found!!!" and
terminate the function.

Linear Search Logic

int linearSearch(int values[], int target, int n)


{
for(int i = 0; i < n; i++)
{
if (values[i] == target)
{
return i;
}
}
return -1;
}
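
A minimal driver for the function above (a sketch; the array values are illustrative and include
the key 41 used in the walkthrough below):

#include <stdio.h>

int main(void) {
    int values[] = {70, 40, 30, 11, 57, 41, 25, 14, 52};
    int n = sizeof(values) / sizeof(values[0]);

    int index = linearSearch(values, 41, n);
    if (index >= 0)
        printf("Given element is found at index %d\n", index);
    else
        printf("Element is not found\n");
    return 0;
}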
Example:

To understand the working of the linear search algorithm, let's take an unsorted array.
Let the elements of the array be as shown.

Let the element to be searched be K = 41


Now, start from the first element and compare K with each element of the array.

The value of K, i.e., 41, is not matched with the first element of the array. So, move to the next
element. And follow the same process until the respective element is found.

Now, the element to be searched is found. So algorithm will return the index of the element
matched.

Time Complexity:

Case Time Complexity


Best Case O(1)
Average Case O(n)
Worst Case O(n)
Space Complexity:

The space complexity of linear search is O(1).

5.8 Binary Search algorithms

Binary search is a search technique that works efficiently on sorted lists. Hence, to search for
an element in a list using the binary search technique, we must ensure that the list is
sorted.

Binary search follows the divide and conquer approach in which the list is divided into two
halves, and the item is compared with the middle element of the list. If the match is found then,
the location of the middle element is returned. Otherwise, we search into either of the halves
depending upon the result produced through the match.

Implementing Binary Search Algorithm:

1. Start with the middle element in the given list:


o If the target value is equal to the middle element of the array, then return the
index of the middle element.
o If not, then compare the middle element with the target value,
▪ If the target value is greater than the number in the middle index, then
pick the elements to the right of the middle index, and start with Step 1.
▪ If the target value is less than the number in the middle index, then pick
the elements to the left of the middle index, and start with Step 1.
2. When a match is found, return the index of the element matched.
3. If no match is found, then return -1

Binary Search Logic

int binarySearch(int values[], int len, int target)


{
int max = (len - 1);
int min = 0;
int guess; // this will hold the index of middle elements
int step = 0; // to find out in how many steps we completed the search

while(max >= min)


{
guess = (max + min) / 2;
// we made the first guess, incrementing step by 1
step++;

if(values[guess] == target)
{
printf("Number of steps required for search: %d \n", step);
return guess;
}
else if(values[guess] > target)
{
// target would be in the left half
max = (guess - 1);
}
else
{
// target would be in the right half
min = (guess + 1);
}
}
// We reach here when element is not
// present in array
return -1;
}

Example:

Consider the following list of elements and the element to be searched:
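
For instance (illustrative values, since any sorted list works): let the list be 11, 14, 25, 30,
40, 41, 52, 57, 70 and the element to be searched be 25. The first guess is index (0+8)/2 = 4,
where the value 40 is greater than 25, so the search continues in the left half with max = 3.
The second guess is index (0+3)/2 = 1, where 14 is less than 25, so min becomes 2. The third
guess is index (2+3)/2 = 2, where 25 is found; the search finishes in 3 steps.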


Time Complexity:

Case Time Complexity

Best Case O(1)
Average Case O(log n)
Worst Case O(log n)

Space Complexity:

The space complexity of binary search is O(1).


5.9 Hashing

Hashing is a technique that is used for storing and extracting information in a faster way. It
helps to perform searching in an optimal way. Hashing is used in databases, encryptions,
symbol tables, etc.

Hashing is needed to execute search, insert, and delete operations in constant time on average.
In other data structures, such as arrays and linked lists, these operations take linear time, O(n).
A better alternative is a self-balanced tree such as an AVL tree, where the time complexity is
of order O(log n). But hashing allows us to perform the operations in constant time, O(1), on
average.

Components of hashing:

• Hash table
• Hash functions
• Collisions
• Collision resolution techniques

5.9.1 Hash table

The hash table data structure stores elements in key-value pairs, where

Key - a unique integer that is used for indexing the values

Value - the data associated with the key.

5.9.2 Hash Function

A hash function is used for mapping each element of a dataset to an index in the table. Hash
functions convert a key into an index of the hash table (a location). A hash function should
generate unique locations, but that is difficult to achieve since the number of indexes is much
smaller than the number of keys. Collisions are therefore common when the hash function is
not perfect.
A good hash function may not prevent the collisions completely however it can reduce the
number of collisions. Here, we will look into different methods to find a good hash function
a. Division Method
If k is a key and m is the size of the hash table, the hash function h() is calculated as:
h(k) = k mod m
For example, if the size of the hash table is 10 and k = 112, then h(k) = 112 mod 10 = 2. The
value of m should not be a power of 2. This is because the powers of 2 in binary format are 10,
100, 1000, .... When we find k mod m for m = 2^p, we always get just the p lower-order bits of k:
if m = 2² and k = 17, then h(k) = 17 mod 4 = 10001 mod 100 = 01
if m = 2³ and k = 17, then h(k) = 17 mod 8 = 10001 mod 1000 = 001
if m = 2⁴ and k = 17, then h(k) = 17 mod 16 = 10001 mod 10000 = 0001
In general, if m = 2^p, then h(k) is simply the p lower-order bits of k.
b. Multiplication Method
h(k) = ⌊m (kA mod 1)⌋
where,
• kA mod 1 gives the fractional part of kA,
• ⌊ ⌋ gives the floor value,
• A is a constant with 0 < A < 1. An optimal choice, suggested by Knuth, is A ≈
(√5-1)/2 ≈ 0.618.
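
As a sketch, both methods can be written in C as follows (the table size m = 10 and the numeric
value of A are assumptions for illustration):

#include <math.h>

#define M 10    /* table size m (assumption for illustration) */

/* division method: h(k) = k mod m */
int hash_division(int k) {
    return k % M;
}

/* multiplication method: h(k) = floor(m * (kA mod 1)), A ~ (sqrt(5)-1)/2 */
int hash_multiplication(int k) {
    double A = 0.6180339887;             /* Knuth's suggested constant */
    double frac = k * A - floor(k * A);  /* fractional part of kA */
    return (int)floor(M * frac);
}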
c. Universal Hashing
In Universal hashing, the hash function is chosen at random independent of keys.

5.9.3 Collisions

When two different keys are mapped to the same index by the hash function, a collision occurs.
For example, with h(k) = k mod 10, the key 16 is stored at index 6. If the key 26 then arrives,
its index would be:

h(26) = 26%10 = 6

Therefore, two values would have to be stored at the same index, i.e., 6, and this leads to the
collision problem. To resolve these collisions, we have some techniques known as collision
resolution techniques.
5.9.4 Collision resolution techniques

The following are the collision techniques:

o Open Hashing: It is also known as closed addressing.


o Closed Hashing: It is also known as open addressing.

Open Hashing

In Open Hashing, one of the methods used to resolve the collision is known as a chaining
method.

Let's first understand the chaining to resolve the collision.

Suppose we have a list of key values

A = 3, 2, 9, 6, 11, 13, 7, 12 where m = 10, and h(k) = 2k+3

In this case the index is computed as h(k) % m = (2k+3) % 10, rather than by the plain division
method k mod m.

o The index of key value 3 is:

index = h(3) = (2(3)+3)%10 = 9

The value 3 would be stored at the index 9.

o The index of key value 2 is:

index = h(2) = (2(2)+3)%10 = 7

The value 2 would be stored at the index 7.


o The index of key value 9 is:

index = h(9) = (2(9)+3)%10 = 1

The value 9 would be stored at the index 1.

o The index of key value 6 is:

index = h(6) = (2(6)+3)%10 = 5

The value 6 would be stored at the index 5.

o The index of key value 11 is:

index = h(11) = (2(11)+3)%10 = 5

The value 11 would be stored at the index 5. Now, we have two values (6, 11) stored at the
same index, i.e., 5. This leads to the collision problem, so we will use the chaining method to
avoid the collision. We will create one more list and add the value 11 to this list. After the
creation of the new list, the newly created list will be linked to the list having value 6.

o The index of key value 13 is:

index = h(13) = (2(13)+3)%10 = 9

The value 13 would be stored at index 9. Now, we have two values (3, 13) stored at the same
index, i.e., 9. This leads to the collision problem, so we will use the chaining method to avoid
the collision. We will create one more list and add the value 13 to this list. After the creation
of the new list, the newly created list will be linked to the list having value 3.

o The index of key value 7 is:

index = h(7) = (2(7)+3)%10 = 7

The value 7 would be stored at index 7. Now, we have two values (2, 7) stored at the same
index, i.e., 7. This leads to the collision problem, so we will use the chaining method to avoid
the collision. We will create one more list and add the value 7 to this list. After the creation of
the new list, the newly created list will be linked to the list having value 2.

o The index of key value 12 is:

index = h(12) = (2(12)+3)%10 = 7

According to the above calculation, the value 12 must be stored at index 7, but the value 2
exists at index 7. So, we will create a new list and add 12 to the list. The newly created list will
be linked to the list having a value 7.

The calculated index value associated with each key value is shown in the below table:
key Location(u)
3 ((2*3)+3)%10 = 9
2 ((2*2)+3)%10 = 7
9 ((2*9)+3)%10 = 1
6 ((2*6)+3)%10 = 5
11 ((2*11)+3)%10 = 5
13 ((2*13)+3)%10 = 9
7 ((2*7)+3)%10 = 7
12 ((2*12)+3)%10 = 7
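
A minimal sketch of chained insertion in C, assuming the example's hash function h(k) = 2k+3
and table size m = 10 (the Node type and helper names are illustrative):

#include <stdlib.h>

#define M 10

/* node in the chain for one table slot */
struct Node {
    int key;
    struct Node *next;
};

struct Node *table[M];    /* each slot heads a linked list (chain) */

int hash(int k) {
    return (2 * k + 3) % M;    /* the hash function used in the example */
}

void insert(int k) {
    struct Node *node = malloc(sizeof(struct Node));
    node->key = k;
    node->next = table[hash(k)];   /* prepend to the chain at this slot */
    table[hash(k)] = node;
}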

Closed Hashing

In Closed hashing, three techniques are used to resolve the collision:

1. Linear probing
2. Quadratic probing
3. Double Hashing technique

Linear Probing

Linear probing is one of the forms of open addressing. Each cell in the hash table holds a
key-value pair, so when a collision occurs because a new key maps to a cell already occupied
by another key, the linear probing technique searches for the closest free location and places
the new key in that empty cell. The search is performed sequentially, starting from the position
where the collision occurred, until an empty cell is found.

Let's understand the linear probing through an example.

Consider the above example for the linear probing:

A = 3, 2, 9, 6, 11, 13, 7, 12 where m = 10, and h(k) = 2k+3

The key values 3, 2, 9, 6 are stored at the indexes 9, 7, 1, 5 respectively. The calculated index
value of 11 is 5 which is already occupied by another key value, i.e., 6. When linear probing is
applied, the nearest empty cell to the index 5 is 6; therefore, the value 11 will be added at the
index 6.

The next key value is 13. The index value associated with this key value is 9 when hash function
is applied. The cell is already filled at index 9. When linear probing is applied, the nearest
empty cell to the index 9 is 0; therefore, the value 13 will be added at the index 0.

The next key value is 7. The index value associated with the key value is 7 when hash function
is applied. The cell is already filled at index 7. When linear probing is applied, the nearest
empty cell to the index 7 is 8; therefore, the value 7 will be added at the index 8.

The next key value is 12. The index value associated with the key value is 7 when hash function
is applied. The cell is already filled at index 7. When linear probing is applied, the nearest
empty cell to the index 7 is 2; therefore, the value 12 will be added at the index 2.
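
A minimal sketch of insertion with linear probing, under the same assumptions (h(k) = 2k+3,
m = 10; the names are illustrative):

#define M 10
#define EMPTY (-1)

int table[M];    /* every slot must be set to EMPTY before use */

void init_table(void) {
    for (int i = 0; i < M; i++)
        table[i] = EMPTY;
}

int hash(int k) {
    return (2 * k + 3) % M;
}

/* probe sequentially from the home slot until a free cell is found */
int insert_linear(int k) {
    int u = hash(k);
    for (int i = 0; i < M; i++) {
        int idx = (u + i) % M;
        if (table[idx] == EMPTY) {
            table[idx] = k;
            return idx;      /* index where the key was placed */
        }
    }
    return -1;               /* table is full */
}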
Quadratic Probing

In linear probing, the search for a free cell is performed linearly. In contrast, quadratic probing
is an open addressing technique that uses a quadratic polynomial for searching until an empty
slot is found.

It can also be defined as allowing the insertion of key k at the first free location from
(u+i²)%m, where i = 0 to m-1 and u is the home index of the key.

Let's understand the quadratic probing through an example.

Consider the same example which we discussed in the linear probing.

A = 3, 2, 9, 6, 11, 13, 7, 12 where m = 10, and h(k) = 2k+3

The key values 3, 2, 9, 6 are stored at the indexes 9, 7, 1, 5, respectively. We do not need to
apply the quadratic probing technique on these key values as there is no occurrence of the
collision.

The index value of 11 is 5, but this location is already occupied by the 6. So, we apply the
quadratic probing technique.

When i = 0
Index = (5+0²)%10 = 5
When i = 1
Index = (5+1²)%10 = 6

Since location 6 is empty, so the value 11 will be added at the index 6.

The next element is 13. When the hash function is applied on 13, then the index value comes
out to be 9, which we already discussed in the chaining method. At index 9, the cell is occupied
by another value, i.e., 3. So, we will apply the quadratic probing technique to calculate the free
location.

When i = 0
Index = (9+0²)%10 = 9
When i = 1
Index = (9+1²)%10 = 0
Since location 0 is empty, so the value 13 will be added at the index 0.

The next element is 7. When the hash function is applied on 7, then the index value comes out
to be 7, which we already discussed in the chaining method. At index 7, the cell is occupied by
another value, i.e., 7. So, we will apply the quadratic probing technique to calculate the free
location.

When i = 0
Index = (7+0²)%10 = 7
When i = 1
Index = (7+1²)%10 = 8
Since location 8 is empty, so the value 7 will be added at the index 8.

The next element is 12. When the hash function is applied on 12, then the index value comes
out to be 7. When we observe the hash table then we will get to know that the cell at index 7 is
already occupied by the value 2. So, we apply the Quadratic probing technique on 12 to
determine the free location.

When i = 0
Index = (7+0²)%10 = 7
When i = 1
Index = (7+1²)%10 = 8
When i = 2
Index = (7+2²)%10 = 1
When i = 3
Index = (7+3²)%10 = 6
When i = 4
Index = (7+4²)%10 = 3
Since the location 3 is empty, so the value 12 would be stored at the index 3.

The final hash table would be:

Therefore, the order of the elements is 13, 9, _, 12, _, 6, 11, 2, 7, 3.

Double Hashing

Double hashing is an open addressing technique which is used to avoid the collisions. When
the collision occurs then this technique uses the secondary hash of the key. It uses one hash
value as an index to move forward until the empty location is found.

In double hashing, two hash functions are used. Suppose h1(k) is one of the hash functions used
to calculate the locations whereas h2(k) is another hash function. It can be defined as "insert
key k at the first free place from (u+v*i)%m where i = 0 to m-1". In this case, u is the location
computed using the first hash function and v is equal to (h2(k)%m).

Consider the same example that we use in quadratic probing.

A = 3, 2, 9, 6, 11, 13, 7, 12 where m = 10, and

h1(k) = 2k+3

h2(k) = 3k+1

key   Location (u)          v                    Probes

3     ((2*3)+3)%10 = 9      -                    1
2     ((2*2)+3)%10 = 7      -                    1
9     ((2*9)+3)%10 = 1      -                    1
6     ((2*6)+3)%10 = 5      -                    1
11    ((2*11)+3)%10 = 5     (3(11)+1)%10 = 4     3
13    ((2*13)+3)%10 = 9     (3(13)+1)%10 = 0     -
7     ((2*7)+3)%10 = 7      (3(7)+1)%10 = 2      -
12    ((2*12)+3)%10 = 7     (3(12)+1)%10 = 7     2

As we know that no collision would occur while inserting the keys (3, 2, 9, 6), so we will not
apply double hashing on these key values.

On inserting the key 11 in a hash table, collision will occur because the calculated index value
of 11 is 5 which is already occupied by some another value. Therefore, we will apply the double
hashing technique on key 11. When the key value is 11, the value of v is 4.

Now, substituting the values of u and v in (u+v*i)%m


When i=0
Index = (5+4*0)%10 =5
When i=1
Index = (5+4*1)%10 = 9
When i=2
Index = (5+4*2)%10 = 3
Since the location 3 is empty in a hash table; therefore, the key 11 is added at the index 3.

The next element is 13. The calculated index value of 13 is 9 which is already occupied by
some another key value. So, we will use double hashing technique to find the free location.
The value of v is 0.

Now, substituting the values of u and v in (u+v*i)%m


When i=0
Index = (9+0*0)%10 = 9
We will get the value 9 in every iteration from 0 to m-1, since the value of v is zero. Therefore,
we cannot insert 13 into the hash table.

The next element is 7. The calculated index value of 7 is 7 which is already occupied by some
another key value. So, we will use double hashing technique to find the free location. The value
of v is 2.
Now, substituting the values of u and v in (u+v*i)%m
When i=0
Index = (7 + 2*0)%10 = 7
When i=1
Index = (7+2*1)%10 = 9
When i=2
Index = (7+2*2)%10 = 1
When i=3
Index = (7+2*3)%10 = 3
When i=4
Index = (7+2*4)%10 = 5
When i=5
Index = (7+2*5)%10 = 7
When i=6
Index = (7+2*6)%10 = 9
When i=7
Index = (7+2*7)%10 = 1
When i=8
Index = (7+2*8)%10 = 3
When i=9
Index = (7+2*9)%10 = 5

Since we checked all the cases of i (from 0 to 9) and did not find a suitable place to insert 7,
key 7 cannot be inserted in the hash table.
The next element is 12. The calculated index value of 12 is 7 which is already occupied by
some another key value. So, we will use double hashing technique to find the free location.
The value of v is 7.

Now, substituting the values of u and v in (u+v*i)%m


When i=0
Index = (7+7*0)%10 = 7
When i=1
Index = (7+7*1)%10 = 4
Since the location 4 is empty; therefore, the key 12 is inserted at the index 4.

The final hash table would be:

The order of the elements is _, 9, _, 11, 12, 6, _, 2, _, 3.


5.10 Dictionaries

A dictionary is defined as a general-purpose data structure for storing a group of objects. A


dictionary is associated with a set of keys and each key has a single associated value. When
presented with a key, the dictionary will simply return the associated value.

For example, the results of a classroom test could be represented as a dictionary with student's
names as keys and their scores as the values:

results = {'Anik': 75,

           'Aftab': 80,
           'James': 85,
           'Manisha': 77,
           'Suhana': 87,
           'Margaret': 82}

The various operations that are performed on a Dictionary or associative array are:

• Add or Insert: In the Add or Insert operation, a new pair of keys and values is added
in the Dictionary or associative array object.
• Replace or reassign: In the Replace or reassign operation, the already existing value
that is associated with a key is changed or modified. In other words, a new value is
mapped to an already existing key.
• Delete or remove: In the Delete or remove operation, the already present element is
unmapped from the Dictionary or associative array object.
• Find or Lookup: In the Find or Lookup operation, the value associated with a key is
searched by passing the key as a search argument.

5.11 Graph

Graph is a non-linear data structure. Graph is a collection of nodes (or vertices) and edges (or
arcs) in which nodes are connected with edges. Generally, a graph G is represented as G = ( V
, E ), where V is set of vertices and E is set of edges.

Example

The following is a graph with 5 vertices and 7 edges.


This graph G can be defined as G = ( V , E )
Where V = {A,B,C,D,E} and E = {(A,B),(A,C),(A,D),(B,D),(C,D),(B,E),(E,D)}.
5.11.1 Graph Terminology

a) Vertex
Individual data element of a graph is called as Vertex. Vertex is also known as node. In above
example graph, A, B, C, D & E are known as vertices.
b) Edge
An edge is a connecting link between two vertices. Edge is also known as Arc. An edge is
represented as (startingVertex, endingVertex). For example, in above graph the link between
vertices A and B is represented as (A,B). In above example graph, there are 7 edges (i.e., (A,B),
(A,C), (A,D), (B,D), (B,E), (C,D), (D,E)).
Edges are of three types:
1. Undirected Edge - An undirected edge is a bidirectional edge. If there is an undirected
edge between vertices A and B, then edge (A , B) is equal to edge (B , A).
2. Directed Edge - A directed edge is a unidirectional edge. If there is a directed edge
between vertices A and B, then edge (A , B) is not equal to edge (B , A).
3. Weighted Edge - A weighted edge is an edge with a value (cost) on it.
c) Undirected Graph
A graph with only undirected edges is said to be undirected graph.
d) Directed Graph
A graph with only directed edges is said to be directed graph.
e) Mixed Graph
A graph with both undirected and directed edges is said to be mixed graph.
f) Adjacent
If there is an edge between vertices A and B then both A and B are said to be adjacent. In other
words, vertices A and B are said to be adjacent if there is an edge between them.
g) Outgoing Edge
A directed edge is said to be outgoing edge on its origin vertex.
h) Incoming Edge
A directed edge is said to be incoming edge on its destination vertex.
i) Degree
Total number of edges connected to a vertex is said to be degree of that vertex.
j) Indegree
Total number of incoming edges connected to a vertex is said to be indegree of that vertex.
k) Outdegree
Total number of outgoing edges connected to a vertex is said to be outdegree of that vertex.
l) Self-loop
Edge (undirected or directed) is a self-loop if its two endpoints coincide with each other.
m) Path
A path is a sequence of alternating vertices and edges that starts at one vertex and ends at
another vertex, such that each edge is incident to its predecessor and successor vertex.

5.11.2 Graph Representation

Graph data structure is represented using following representations...

1. Adjacency Matrix
2. Incidence Matrix
3. Adjacency List
Adjacency Matrix

In this representation, the graph is represented using a matrix of size total number of vertices
by a total number of vertices. That means a graph with V vertices is represented using a matrix
of size VxV. In this matrix, both rows and columns represent vertices. This matrix is filled with
either 1 or 0. Here, 1 represents that there is an edge from row vertex to column vertex and 0
represents that there is no edge from row vertex to column vertex.

For example, consider the following undirected graph representation:
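
For the example graph G given earlier (V = {A,B,C,D,E}), the undirected adjacency matrix is:

      A  B  C  D  E
  A   0  1  1  1  0
  B   1  0  0  1  1
  C   1  0  0  1  0
  D   1  1  1  0  1
  E   0  1  0  1  0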

Directed graph representation...

Incidence Matrix

In this representation, the graph is represented using a matrix of size total number of vertices
by a total number of edges. That means a graph with 4 vertices and 6 edges is represented using
a matrix of size 4x6. In this matrix, rows represent vertices and columns represent edges. This
matrix is filled with 0, 1 or -1. Here, 0 represents that the column edge is not incident on the
row vertex, 1 represents that the column edge is an outgoing edge of the row vertex, and -1
represents that the column edge is an incoming edge of the row vertex.

For example, consider the following directed graph representation:


Adjacency List

In this representation, every vertex of a graph contains list of its adjacent vertices.

For example, consider the following directed graph representation implemented using linked
list:

This representation can also be implemented using an array as follows:
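
For instance, the undirected example graph given earlier has the following adjacency lists:

  A → B → C → D
  B → A → D → E
  C → A → D
  D → A → B → C → E
  E → B → D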

5.11.3 Graph Traversal

Graph traversal is a technique used for searching a vertex in a graph. The graph traversal also
decides the order in which the vertices are visited during the search. A graph traversal finds
the edges to be used in the search process without creating loops. That means using graph
traversal we visit all the vertices of the graph without getting into a looping path.

There are two graph traversal techniques and they are as follows...

1. DFS (Depth First Search)


2. BFS (Breadth First Search)

DFS (Depth First Search)

DFS traversal of a graph produces a spanning tree as final result. Spanning Tree is a graph
without loops. We use Stack data structure with maximum size of total number of vertices in
the graph to implement DFS traversal.

We use the following steps to implement DFS traversal...

• Step 1 - Define a Stack of size total number of vertices in the graph.


• Step 2 - Select any vertex as starting point for traversal. Visit that vertex and push it
on to the Stack.
• Step 3 - Visit any one of the non-visited adjacent vertices of a vertex which is at the
top of stack and push it on to the stack.
• Step 4 - Repeat step 3 until there is no new vertex to be visited from the vertex which
is at the top of the stack.
• Step 5 - When there is no new vertex to visit then use back tracking and pop one
vertex from the stack.
• Step 6 - Repeat steps 3, 4 and 5 until stack becomes Empty.
• Step 7 - When stack becomes Empty, then produce final spanning tree by removing
unused edges from the graph

Back tracking is coming back to the vertex from which we reached the current vertex.
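
A minimal sketch of these steps in C, assuming an adjacency-matrix representation with
vertices numbered 0 to V-1 (the caller is expected to fill adj[][] first):

#include <stdio.h>

#define V 5          /* number of vertices (assumption for illustration) */

int adj[V][V];       /* adjacency matrix of the graph */
int visited[V];
int stack[V], top = -1;

void dfs(int start) {
    visited[start] = 1;
    stack[++top] = start;              /* Step 2: visit and push the start vertex */
    printf("%d ", start);

    while (top >= 0) {
        int v = stack[top], w, found = 0;
        for (w = 0; w < V; w++) {
            if (adj[v][w] && !visited[w]) {   /* Step 3: non-visited adjacent vertex */
                visited[w] = 1;
                printf("%d ", w);
                stack[++top] = w;
                found = 1;
                break;
            }
        }
        if (!found)
            top--;                     /* Step 5: backtrack by popping */
    }
}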

Example
BFS (Breadth First Search)

BFS traversal of a graph produces a spanning tree as final result. Spanning Tree is a graph
without loops. We use Queue data structure with maximum size of total number of vertices
in the graph to implement BFS traversal.

We use the following steps to implement BFS traversal...

• Step 1 - Define a Queue of size total number of vertices in the graph.


• Step 2 - Select any vertex as starting point for traversal. Visit that vertex and insert it
into the Queue.
• Step 3 - Visit all the non-visited adjacent vertices of the vertex which is at front of the
Queue and insert them into the Queue.
• Step 4 - When there is no new vertex to be visited from the vertex which is at front of
the Queue then delete that vertex.
• Step 5 - Repeat steps 3 and 4 until queue becomes empty.
• Step 6 - When queue becomes empty, then produce final spanning tree by removing
unused edges from the graph
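
A minimal sketch of these steps in C, under the same adjacency-matrix assumptions as the
DFS sketch:

#include <stdio.h>

#define V 5

int adj[V][V];
int visited[V];
int queue[V], front = 0, rear = -1;

void bfs(int start) {
    visited[start] = 1;
    queue[++rear] = start;             /* Step 2: visit and enqueue the start vertex */

    while (front <= rear) {
        int v = queue[front++];        /* Step 4: dequeue once fully explored */
        printf("%d ", v);
        for (int w = 0; w < V; w++) {
            if (adj[v][w] && !visited[w]) {   /* Step 3: enqueue non-visited neighbours */
                visited[w] = 1;
                queue[++rear] = w;
            }
        }
    }
}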

Example
Important Questions

1. Discuss the characteristics of linear search.


2. Which sorting algorithm is best if the list is already sorted? Why?
3. State the logic of bubble sort algorithm.
4. What are the problems in hashing?
5. Write the Applications of Graphs
6. What is meant by strongly connected in a graph?
7. Make distinction between bubble sort and quick sort.
8. Discuss the features of Hashing.
9. Explain the steps included in performing binary search.
10. Explain the algorithm for selection sort and give a suitable example.
11. Explain the algorithm for insertion sort and give a suitable example.
12. Explain the algorithm for bubble sort and give a suitable example.
13. Explain the algorithm for QUICK sort and give a suitable example.
14. Explain the algorithm for Merge sort and give a suitable example.
15. Explain various graph traversals with examples.
16. Given the list of numbers
20 1 4 16 20 9 0 11 7
Use quick sort algorithm to sort them. Show different passes indicating the pivot and
partitions formed
