Unit-6 Searching and Sorting
In sequential searching, a given ‘item’ is found in the list (array) by examining its elements
one by one. Initially the process starts with comparing the ‘item’ with the first element. If both
do not match then it proceeds to the next element. This process continues until either the
desired item is found or the list is exhausted.
Let us assume that ‘A’ is an array of 10 elements (A[0], A[1], through A[9]), as shown in
figure 6.1(a), and suppose we want to find the item ‘16’. Sequential searching works as follows:
1. The first element of the list A[0] is compared with 16. Since A[0] is not equal to 16,
the next element A[1] is taken (as shown in figure-6.1(b))
2. A[1] is compared with 16. Again A[1] is not equal to 16, therefore the next element,
A[2] is taken (as shown in figure-6.1(c))
3. A[2] is compared with 16. Again A[2] is not equal to 16, therefore the next element,
A[3] is taken (as shown in figure-6.1(c))
4. This process goes on until the item is found or the entire list is searched.
Algorithm : SequentialSearch
SequentialSearch(A, N, Item)
Here A is an array of ‘N’ number of elements and ‘Item’ represents the
item to be searched in array ‘A’
1. Set Flag = 0
2. Initialize I = 0
3. Repeat through Step-4 while (I < N)
4. Compare A[I] and Item
If (A[I] == Item) Then set Flag = 1 and go to Step-5
Otherwise Increment ‘I’ as: I = I +1
5. Check Flag.
If Flag = 1 Then Display – “Item found”;
Otherwise Display – “Item not found”;
6. Exit
Implementation : SequentialSearch
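The steps of the SequentialSearch algorithm can be sketched in C as follows. This is a minimal sketch, not the original listing: the element type int and the convention of returning the index of the item (or -1 when it is absent) instead of displaying a message are assumptions made here.

```c
/* Sequential search over a[0..n-1].
   Returns the index of the first element equal to item, or -1 if the
   item is not present (the -1 convention is an assumption of this sketch). */
int SequentialSearch(const int a[], int n, int item)
{
    int i;
    for (i = 0; i < n; i++)
    {
        if (a[i] == item)
            return i;       /* item found: stop immediately */
    }
    return -1;              /* list exhausted without a match */
}
```

A caller can print “Item found” when the result is not -1 and “Item not found” otherwise, which corresponds to the Flag check in Step-5.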
The limitation of the sequential search technique is that, in the worst case, we have to examine the
entire list, for instance when the item is not in the list at all. This problem is solved by a more
efficient technique called binary searching. The condition for binary search is that all the data
should be in sorted order.
In binary search, we first compare the key with the item in the middle position of the sorted
array. If there's a match, we can return immediately. If the key is less than the middle key,
then the item sought must lie in the lower half of the array; if it's greater then the item sought
must lie in the upper half of the array. So we repeat the procedure on the lower (or upper)
half of the array.
Binary Searching – a process of finding an item from a sorted array using divide
and conquer method.
For this we take three variables – Start, End and Middle, which keep track of the first,
last and middle positions of the portion of the array still being searched. The value of Middle is
calculated as:
Middle = (Start + End) / 2
Thus the length of the list to be searched is reduced by half at each step. This process continues
until we find the element or there is no portion left to search.
Suppose we have the list of numbers shown in figure 6.2(a) and we want to find item 62. For
this we use two integer variables – Start and End. Initially Start = 0, End = 9 and Middle = 5
(the figure rounds (0 + 9) / 2 = 4.5 up to 5).
The binary searching works as:
(i) The middle element A[5] is compared with ‘62’, as shown in figure 6.2(b). Since A[5]
is less than item ‘62’, the portion A[0] through A[5] of the list is ignored.
(ii) For the remaining portion, Start is set to 6 (i.e. Middle + 1) and End remains the same
(i.e. 9). With these new values of Start and End, the new Middle value is 8.
(iii) Now A[8] is compared with ‘62’. Since A[8] is greater than 62, the portion A[8] through
A[9] of this subdivided list is ignored.
This process goes on until the item is found or there is no portion left to
search. Figure-6.2 shows that the item is found at index 6, i.e. A[6].
Figure 6.2: Binary search for Item = 62
(a) The sorted array
Index   0   1   2   3   4   5   6   7   8   9
Value  21  33  39  46  55  60  62  73  81  89
Pass             Start  End  Middle  A[Middle]
(b) First Pass     0     9     5       60
(c) Second Pass    6     9     8       81
(d) Third Pass     6     7     7       73
(e) Fourth Pass    6     6     6       62 (Item found)
Algorithm : BinarySearch
BinarySearch(A, N, Item)
Here ‘A’ is an array of ‘N’ number of elements and ‘Item’ represents the
item to be searched in array ‘A’
1. Set Flag = 0
2. Set Start = 0 and End = N-1
3. Repeat through Step-5 While (Start <= End)
4. Calculate Middle as: Middle = (Start + End) / 2
5. Compare A[Middle] and Item
If (A[Middle] == Item) then set Flag = 1 and go to step-6
if (A[Middle] > Item ) then update as End = Middle -1
Else update Start as Start = Middle +1
6. Check Flag.
If Flag = 1 Then Display – “Item found”;
Otherwise Display – “Item not found”;
7. Exit
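The BinarySearch algorithm above can be sketched in C as follows. This is an illustrative sketch under a few assumptions: the elements are of type int, the array is sorted in ascending order, and the function returns the index of the item (or -1) instead of setting a Flag. Because C integer division rounds down, the positions visited for the array of figure 6.2 differ slightly from the figure, although the item is still found at index 6.

```c
/* Binary search over the sorted array a[0..n-1].
   Returns the index of item, or -1 if it is not present
   (the return convention is an assumption of this sketch). */
int BinarySearch(const int a[], int n, int item)
{
    int start = 0;
    int end = n - 1;
    while (start <= end)
    {
        /* same value as (start + end) / 2, written to avoid overflow */
        int middle = start + (end - start) / 2;
        if (a[middle] == item)
            return middle;
        if (a[middle] > item)
            end = middle - 1;    /* continue in the part before middle */
        else
            start = middle + 1;  /* continue in the part after middle */
    }
    return -1;                   /* nothing left to search */
}
```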
Analysis – In binary searching, after each comparison either the search terminates
successfully or the size of the portion still to be searched is reduced to about one half. It means
that each comparison reduces the number of candidate elements by a factor of 2. Thus
after ‘k’ key comparisons, the remaining portion is of size at most n/2^k. So in the worst case the
search requires O(log₂ n) comparisons to find the desired ‘item’, even if the
search is unsuccessful.
(i) In first pass A[0] is compared with A[1]. If A[0] > A[1], their values are interchanged
(ii) After this A[1] is compared with A[2]. If A[1] > A[2], their values are interchanged
(iii) Next A[2] is compared with A[3]. If A[2] > A[3], their values are interchanged
(iv) And this process continues until we compare A[4] with A[5].
Algorithm: BubbleSort
BubbleSort(A, N)
Here ‘A’ is an array of elements and ‘N’ represents the number of
elements in the array ‘A’.
1. Set Flag=1
2. Initialize I = 0
3. Repeat through Step-9 While (I < N) and Flag=1
4. Reset Flag = 0
5. Initialize J = 0
6. Repeat through Step-8 While (J < N-I-1)
7. Compare A[J] and A[J+1].
If (A[J] > A[J+1]) then set the Flag = 1 and
interchange A[J] and A[J+1]
8. Increment ‘J’ as J = J+1
9. Increment ‘I’ as I = I+1
10. Exit
Implementation : BubbleSort
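A possible C rendering of the BubbleSort algorithm is sketched below. The element type int is an assumption; the Flag variable plays the same role as in the algorithm, stopping the sort as soon as a complete pass makes no interchange.

```c
/* Bubble sort of a[0..n-1] into ascending order. */
void BubbleSort(int a[], int n)
{
    int i, j, temp;
    int flag = 1;                    /* 1 while the last pass made a swap */
    for (i = 0; i < n && flag == 1; i++)
    {
        flag = 0;
        for (j = 0; j < n - i - 1; j++)
        {
            if (a[j] > a[j + 1])
            {
                temp = a[j];         /* interchange adjacent elements */
                a[j] = a[j + 1];
                a[j + 1] = temp;
                flag = 1;
            }
        }
    }
}
```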
Selection Sorting – a sorting technique that starts by finding the minimum value
in the array and moving it to the first position and this step is then repeated for the
second lowest value, then the third, and so on until the array is sorted.
A selection sort starts from the first element and compares it with all the remaining
elements of the list one by one to find the smallest element. Next, the second element is
taken and the search is repeated for the second smallest element. This process continues until the
complete list is sorted. Suppose we have an array of 6 numbers:
A[0] 29
A[1] 36
A[2] 11
A[3] 24
A[4] 55
A[5] 22
(i) A[0] is compared with A[1]. If A[0] > A[1] then interchange them.
(ii) After this A[0] is compared with A[2]. If A[0] > A[2] then interchange them.
(iii) A[0] is compared with A[3]. If A[0] > A[3] then interchange them.
(iv) And this process continues until we compare A[0] with A[5].
From this pass it is clear that the smallest element is placed in the proper position within the
array. Similarly in second pass, it places the second smallest element in its proper position.
Figure-6.5 illustrates the complete sequence of operations performed on the above array.
Algorithm : SelectionSort
SelectionSort(A, N)
Here ‘A’ is an array of elements and ‘N’ represents the number of
elements in the array ‘A’.
1. Initialize I = 0
2. Repeat through Step-7 while (I < N-1)
3. Set J = I+1
4. Repeat through Step-6 while (J < N)
5. Compare A[I] and A[J].
If (A[I] > A[J]) then interchange A[I] and A[J]
6. Increment ‘J’ as J = J+1
7. Increment ‘I’ as I = I+1
8. Exit
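The SelectionSort algorithm above can be written in C roughly as follows. This sketch follows the algorithm as stated, interchanging A[I] with A[J] whenever a smaller element is met; the element type int is an assumption. A common variant, not shown here, remembers only the index of the minimum and interchanges once per pass.

```c
/* Selection sort of a[0..n-1] into ascending order,
   following the interchange-based steps of the algorithm above. */
void SelectionSort(int a[], int n)
{
    int i, j, temp;
    for (i = 0; i < n - 1; i++)
    {
        for (j = i + 1; j < n; j++)
        {
            if (a[i] > a[j])
            {
                temp = a[i];     /* a[i] now holds the smallest value seen so far */
                a[i] = a[j];
                a[j] = temp;
            }
        }
    }
}
```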
Analysis – In Selection sorting, there are N-1 passes and pass I (counting from 0) makes N-I-1
comparisons. The first pass makes (n-1) comparisons; the second pass makes (n-2)
comparisons and so on. Therefore the total number of comparisons required to
sort an array of ‘n’ elements by this method is (n-1) + (n-2) + … + 1 = n*(n-1)/2. We can say
that the total number of comparisons is O(n²).
Insertion Sorting – a sorting technique which moves elements one at a time into
the correct position, i.e. one element at a time into the previously sorted part of the
array, moving higher ranked elements down as necessary
Suppose ‘A’ is an array of ‘N’ numbers. The insertion sort technique scans the array from
A[0] to A[N-1], inserting each element A[I] into its proper position in the previously sorted
subarray A[0], A[1], A[2], …., A[I-1].
In other words, the insertion sort is carried out with the help of following sequences:
(i) Since A[0] is the very first element, it is already sorted.
(ii) A[1] is inserted before or after A[0] in such a way that A[0] and A[1] are sorted.
(iii) In the same way A[2] is inserted in such a way that A[0], A[1] and A[2] are sorted.
(iv) This process continues until A[0], A[1], A[2], …., A[N-1] are sorted.
Algorithm – InsertionSort
InsertionSort(A, N)
Here ‘A’ is an array of elements and ‘N’ represents the number of
elements in the array ‘A’.
1. Initialize J = 1
2. Repeat through Step-9 while (J < N)
3. Set Item = A[J]
4. Set I = J-1
5. Repeat through Step-7 while (I >= 0 && Item < A[I])
6. A[I+1] = A[I];
7. Decrement ‘I’ as I = I - 1
8. Set A[I+1] = Item
9. Increment ‘J’ as J = J + 1
10. Exit
Implementation : InsertionSort
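The InsertionSort algorithm can be sketched in C as shown below. The element type int is an assumption. Note that after the shifting loop ends, the saved Item is stored at position I+1; without that assignment the element being inserted would be lost.

```c
/* Insertion sort of a[0..n-1] into ascending order. */
void InsertionSort(int a[], int n)
{
    int i, j, item;
    for (j = 1; j < n; j++)
    {
        item = a[j];                 /* element to insert into a[0..j-1] */
        i = j - 1;
        while (i >= 0 && item < a[i])
        {
            a[i + 1] = a[i];         /* shift larger elements one place right */
            i = i - 1;
        }
        a[i + 1] = item;             /* drop the item into the gap */
    }
}
```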
Analysis - The best case for insertion sort occurs when the list is already sorted. In such a
case the insertion sort makes (n-1) comparisons of keys, so the sort is
O(n) in the best case. On the other hand, in the worst case, when the list is initially sorted in
reverse order, the sort is O(n²). Similarly in the average case, the insertion sort is O(n²).
Merge sort is a sorting algorithm that sorts data items into ascending or descending order,
which comes under the category of comparison-based sorting. Here we apply the divide-
and-conquer strategy to sort a given sequence of data items, which can be described as
follows:
1. Recursively split the sequence into two halves (i.e. subsequences) until the subsequence
contains only a single data item (i.e. singleton subsequence)
2. Now, recursively merge these subsequences back together preserving their required
order (i.e. ascending or descending order)
(29 36 61 24 55 22 67 42)
(i) Initially lower = 1, upper = 8 and the value of mid is 4. Thus the above array is split
into two sub-lists: (29 36 61 24) and (55 22 67 42).
(ii) Take the first sublist (29 36 61 24) and apply the same procedure again, which gives
(29 36) and (61 24).
(iii) Once again the list (29 36) is divided into two sublists ((29) (36)).
(iv) Now the list is not further subdivided and they are individually sorted, therefore these
two sublists (29) and (36) are merged into one. On merging the resultant list becomes
(29, 36).
(v) So the sorting on sublist (29 36) is over. Next we take (61 24).
(vi) To do the same so we subdivide it into two sublists ((61) (24)). These sublists are
already sorted because they contain only one element, therefore they are merged to
form (24 61).
(vii) Now the task is to merge these two sublists (29 36) and (24 61) and to form a new
list (24 29 36 61). This completes the first task of sorting the original sublist.
(viii) Now the same procedure is applied to second original sublist (55 22 67 42).
This sublist becomes (22 42 55 67). Now the merging operation is applied on
these two original sublists (24 29 36 61) and (22 42 55 67) to form a new list (22
24 29 36 42 55 61 67).
Algorithm : MergeSort
This algorithm calls another algorithm Merge() whose task is to merge two sublists A[Lower
: Mid] and A[Mid+1 : Upper] into one new sorted list. The Merge() algorithm maintains two
counter variables ‘I’ and ‘J’ to mark the beginning of two arrays. The Merge() algorithm
works as:
Algorithm : Merge
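One way to realize MergeSort() and Merge() in C is sketched below. It is not the original listing: the element type int, the use of a temporary buffer obtained with malloc, and 0-based inclusive bounds (so the 8-element example is sorted with Lower = 0 and Upper = 7 rather than 1 and 8) are all assumptions of this sketch.

```c
#include <stdlib.h>

/* Merge the sorted runs a[lower..mid] and a[mid+1..upper] (inclusive). */
void Merge(int a[], int lower, int mid, int upper)
{
    int n = upper - lower + 1;
    int *temp = malloc(n * sizeof *temp);
    int i = lower;      /* marks the beginning of the first run */
    int j = mid + 1;    /* marks the beginning of the second run */
    int k = 0;
    if (temp == NULL)
        return;         /* allocation failed: leave the array unchanged */
    while (i <= mid && j <= upper)
        temp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i <= mid)
        temp[k++] = a[i++];          /* copy what is left of the first run */
    while (j <= upper)
        temp[k++] = a[j++];          /* copy what is left of the second run */
    for (k = 0; k < n; k++)
        a[lower + k] = temp[k];      /* copy the merged run back */
    free(temp);
}

/* Sort a[lower..upper] by splitting, sorting each half and merging. */
void MergeSort(int a[], int lower, int upper)
{
    if (lower < upper)
    {
        int mid = lower + (upper - lower) / 2;
        MergeSort(a, lower, mid);
        MergeSort(a, mid + 1, upper);
        Merge(a, lower, mid, upper);
    }
}
```

Taking the left element when the two are equal keeps equal keys in their original order, so this version is stable.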
6.4.5 Quick Sort
Quick sorting is a natural example of a recursive sorting technique, developed by C.A.R. Hoare
in 1962. Quick Sort is a sorting algorithm that sorts data items into ascending or descending
order, which comes under the category of comparison-based sorting. This algorithm works
as follows:
1. Reorder by splitting the sequence into left and right halves with a pivot data item in
between. Here, pivot data item is identified by comparison i.e. a data item must be greater
than or equal to every data item in the left half and less than or equal to every data item in
the right half.
2. Now, recursively sort the two halves separately.
Quick Sorting – a sorting algorithm that picks an element from the list and
reorders the list so that all elements which are less than the chosen element come
before it and all elements which are greater come after it.
In Quick sorting, let A[First : Last] be an array to be sorted. Here we will assume two counter
variables ‘I’ and ‘J’ as: I = First+1 and J = Last. The variable ‘I’ moves towards the right
searching for an element which is greater than A[First], and the counter variable ‘J’ moves
towards the left searching for an element which is smaller than A[First]. This process ends
whenever the counter variables ‘I’ and ‘J’ meet or cross over. Consider an array of 12 elements
as shown below:
0 1 2 3 4 5 6 7 8 9 10 11
22 29 11 18 34 7 14 46 35 8 56 26
List of 12 numbers
(i) Here the first element A[First] is 22, and the two counter variables ‘I’ and ‘J’ are set
to I = 1 and J = 11.
0 1 2 3 4 5 6 7 8 9 10 11
22 29 11 18 34 7 14 46 35 8 56 26
First element = A[0], I = 1, J = 11
(ii) Now we move ‘I’ towards right until A[I] > 22. Since A[1] = 29 is already greater than 22,
‘I’ does not move and stops immediately. Now we move ‘J’ towards left until
A[J] < 22. Here ‘J’ moves 2 units to the left, and the situation is as shown below:
0 1 2 3 4 5 6 7 8 9 10 11
22 29 11 18 34 7 14 46 35 8 56 26
First element = A[0], I = 1, J = 9
(iii) Now A[I] and A[J] are exchanged as shown below:
0 1 2 3 4 5 6 7 8 9 10 11
22 8 11 18 34 7 14 46 35 29 56 26
First element = A[0], I = 1, J = 9
(iv) The movement of ‘I’ and ‘J’ resumes until elements with A[I] > 22 and A[J] < 22 are found.
Now ‘I’ moves 3 units to the right and ‘J’ moves 3 units to the left as shown below:
0 1 2 3 4 5 6 7 8 9 10 11
22 8 11 18 34 7 14 46 35 29 56 26
First element = A[0], I = 4, J = 6
(v) Once again A[I] and A[J] are exchanged as shown below:
0 1 2 3 4 5 6 7 8 9 10 11
22 8 11 18 14 7 34 46 35 29 56 26
First element = A[0], I = 4, J = 6
(vi) The movement of ‘I’ and ‘J’ resumes until elements with A[I] > 22 and A[J] < 22 are found.
Now ‘I’ moves 2 units to the right and ‘J’ moves 1 unit to the left as shown below:
0 1 2 3 4 5 6 7 8 9 10 11
22 8 11 18 14 7 34 46 35 29 56 26
First element = A[0], J = 5, I = 6
(vii) Since the index of ‘I’ becomes greater than ‘J’, the process ends after exchanging
A[J] and A[First]. The resultant array looks like the following:
0 1 2 3 4 5 6 7 8 9 10 11
7 8 11 18 14 22 34 46 35 29 56 26
Now the element 22 is placed at index ‘5’. All elements of array A[ ] to the left of index ‘5’ are
less than 22 and all elements to the right of index ‘5’ are greater than 22. Thus the element 22
is placed correctly. Now the above procedure is applied to the two subarrays A[First : J-1]
and A[J+1 : Last].
Algorithm : QuickSort
Implementation : QuickSort
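/* Function header and set-up reconstructed around the surviving loop;
   the element type int is an assumption. */
void QuickSort (int a[], int first, int last)
{
int i, j, temp;
if (first < last)
{
/* i and j are stepped before each test, so they start one place outside */
i = first;
j = last + 1;
do
{
do
{
i++;
}
while (i <= last && a[i] < a[first] ) ;  /* guard keeps i inside the array */
do
{
j--;
}
while (a[j] > a[first]) ;
if (i < j)
{
temp = a[i];
a[i] = a[j] ;
a[j] = temp;
}
} while (i < j) ;
/* put the first element into its final position j */
temp =a[first];
a[first] = a[j];
a[j]=temp;
QuickSort (a, first, j-1);
QuickSort (a, j+1, last);
}
}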
Analysis – In quick sorting, the worst case behavior is O(n²). However, suppose that each time an
element is positioned in the middle of the array, so that the subarray to its left is
of the same size as the one to its right. Then an array of size ‘n’ is split into two
subarrays each of size approximately n/2, each needing approximately n/2 comparisons. This
process goes on until there are ‘n’ subarrays of size ‘1’. Thus the total number of
comparisons for the entire list is
n+2*(n/2)+4*(n/4)+….+n*(n/n) or (n+n+n+…..+n)
with about log₂ n terms, i.e. about n*log₂ n comparisons. It has been shown that the average
computing time for this sort is O(n*log₂ n). As far as the average time complexity is concerned,
quick sort is the best of the internal sorting methods we shall be studying.
6.4.7 Heaps and Heap Sorting
Before studying heap sorting, we must know what a heap is and how it is created.
Heap
A heap is defined as a complete binary tree ‘H’ in which each node has the following
property – the value at any node, say ‘i’, is greater than or equal to the value of any of its
descendants. Such a heap is called a maxheap. Conversely, if the value at any node
‘i’ is less than or equal to the value of any of its descendants, then it is called a minheap.
Heap – a balanced, left-justified binary tree in which no node has a value greater than
the value in its parent
Max-heap (level by level):
91
82 65
74 43 28 59
Min-heap (level by level):
28
43 59
74 82 65 91
In an array representation of a binary tree, the left and right children of node A[I] are at A[2I]
and A[2I+1] respectively. Conversely, the parent of any node A[J] is at A[J/2] (integer division).
Note that the nodes of a heap H on the same level appear one after the other in the array A[].
Here is the sequential representation of the above maxheap by the array A[], with indices
starting at 1.
Index 1 2 3 4 5 6 7
Value 91 82 65 74 43 28 59
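The parent and child index rules can be exercised with a small check. The function below is an illustrative helper that does not appear in the original text (the name IsMaxHeap and the int element type are assumptions); it verifies the maxheap property for an array stored with indices starting at 1, leaving a[0] unused.

```c
/* Returns 1 if a[1..n] satisfies the maxheap property, 0 otherwise.
   a[0] is unused so that the children of a[i] are a[2*i] and a[2*i+1]
   and the parent of a[j] is a[j/2]. */
int IsMaxHeap(const int a[], int n)
{
    int j;
    for (j = 2; j <= n; j++)
    {
        if (a[j / 2] < a[j])
            return 0;    /* a child is larger than its parent */
    }
    return 1;
}
```

Checking each node against its parent is enough, because a value that is no larger than its parent is also no larger than any of its ancestors.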
(i) In first step, ‘item’ is inserted at the end of ‘H’ so that ‘H’ is still a complete binary tree
but not necessarily a heap.
(ii) And in second step, ‘item’ is placed at its appropriate place in the heap so that the ‘H’
is finally a heap.
Let us understand this procedure by using the heap as shown in figure-6.11
91
82 65
74 43 28 59
Figure 6.11
Suppose we want to insert a new item ‘89’. Initially it is placed in the last position to keep the
tree complete, that is we set a[8]=89. Figure 6.12(a) shows that 89 is the left child of a[4]. Now
the item 89 is compared with its parent at a[4], that is 74. Since 89 is greater than 74, we
interchange them. The new complete tree is as shown in figure-6.12(b). Now the item 89 is
again compared with its parent 82. Since 89 is greater than 82, we interchange them.
The new complete tree is as shown in figure-6.12 (c). Now again 89 is compared with its
parent 91. Since 89 is not greater than 91, item 89 is now in its appropriate place
in the heap. The final tree is as shown in figure-6.12 (d).
91 91
82 65 82 65
74 43 28 59 89 43 28 59
89 74
(a) (b)
91 91
82 65 89 65
89 43 28 59 82 43 28 59
74 74
(c) (d)
Figure 6.12
Here is this insertion algorithm.
Algorithm: InsertHeap
InsertHeap(A, N, Item)
Here ‘A’ is an array of ‘N’ number of elements and ‘Item’ represents the
item to be inserted into a heap
1. Set I = N + 1;
2. Repeat through Step-5 while (I > 1)
3. Assign Parent = I/2
4. Compare Item with A[Parent]
5. If (Item <= A[Parent]) then A[I] = Item and go to Step-7;
otherwise A[I] = A[Parent] and I = Parent;
6. Set A[1] = Item;
7. Exit
Here is the C implementation of this insertion algorithm.
Implementation : InsertHeap
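The insertion procedure can be sketched in C as follows. This is a minimal sketch rather than the original listing. It assumes int elements, a maxheap stored in a[1..n] with a[0] unused, and enough room in the array for position n+1, where the new item starts out (as in the a[8]=89 example).

```c
/* Insert item into the maxheap a[1..n]; afterwards a[1..n+1] is a heap.
   The caller must make sure the array has room for index n+1. */
void InsertHeap(int a[], int n, int item)
{
    int i = n + 1;                   /* first free position */
    int parent;
    while (i > 1)
    {
        parent = i / 2;
        if (item <= a[parent])
            break;                   /* proper place found */
        a[i] = a[parent];            /* move the smaller parent down */
        i = parent;
    }
    a[i] = item;                     /* i is 1 if the item became the root */
}
```

With the heap of figure 6.11 and item 89, the values 74 and 82 move down one level each and 89 ends up at a[2].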
Whenever a deletion operation is made on a heap, it is always the root node which is
deleted. Let H be a heap with ‘n’ elements. When we delete the root node of a heap we divide
the process into three steps:
(i) In first step we assign the root node R to some variable, say Item.
(ii) In second step we replace the deleted node R by the last node L of H so that it is still
a complete binary tree, but not necessarily a heap.
(iii) Finally we will reheap by placing the L to its appropriate place in H so that H is finally
a heap.
91
82 65
74 43 28 51
Figure 6.13
Here A[R] = 91 is the root node and A[L] = 51 is the last node of the tree. First we access the
root node as:
Item = A[1];
51 51
82 65 82 65
74 43 28 91 74 43 28
(a) (b)
82 82
51 65 74 65
74 43 28 51 43 28
(c) (d)
Figure 6.14
Now we will reheap this complete binary tree. First we compare 51 with its two children.
Since 51 is less than the larger child 82, we interchange them. The new binary tree
is as shown in figure-6.14(c). Now again 51 is compared with its two children. Since 51
is less than the larger (left) child 74, we interchange them. The new binary tree is as shown
in figure-6.14(d). Since 51 now has no children, it has been placed
in its appropriate place in H.
Algorithm : DeleteHeap
DeleteHeap(A, N)
Here ‘A’ is an array of ‘N’ number of elements
Implementation : DeleteHeap
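The three deletion steps can be sketched in C as below. This is not the original listing. It assumes int elements, a maxheap stored in a[1..n] with a[0] unused and n >= 1, and that the function returns the deleted root while the caller decrements its own count of elements, which matches the call Item = DeleteHeap(A, N) used by HeapSort.

```c
/* Remove and return the root of the maxheap a[1..n].
   On return a[1..n-1] is again a heap; the caller decrements n. */
int DeleteHeap(int a[], int n)
{
    int item = a[1];                 /* step (i): save the root */
    int last = a[n];                 /* step (ii): last node replaces it */
    int i = 1;
    int child = 2;
    n = n - 1;                       /* local copy only */
    while (child <= n)               /* step (iii): reheap */
    {
        if (child < n && a[child + 1] > a[child])
            child = child + 1;       /* pick the larger child */
        if (last >= a[child])
            break;                   /* proper place found */
        a[i] = a[child];             /* move the larger child up */
        i = child;
        child = 2 * i;
    }
    a[i] = last;
    return item;
}
```

For the tree of figure 6.14, whose last node is 51, the call returns 91 and leaves 82, 74, 65, 51, 43, 28 in a[1..6].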
Heap Sorting
The process of heap sorting is divided into two parts:
(i) In first part we build a heap H out of the elements of array using an InsertHeap()
algorithm.
(ii) In second part we repeatedly delete the root element of H using DeleteHeap()
algorithm and move it to the last position.
Heap Sorting – a sorting technique in which a list can be sorted by first building it
into a heap and then iteratively deleting the root node from the heap until the heap is
empty. If the deleted roots are stored in reverse order in an array they will be sorted in
ascending order (if a max heap is used)
The algorithm of heap sorting is as:
Algorithm : HeapSort
HeapSort(A, N)
Here ‘A’ is an array of ‘N’ number of elements
1. Set J = 1
2. Repeat through step-4 while (J <= N-1)
3. Call InsertHeap(A, J, A[J+1])
4. Increment the value of ‘J’ as J = J+1
5. Repeat through step-8 while (N >= 1)
6. Call DeleteHeap() as
Item = DeleteHeap(A, N)
7. Set A[N] = item
8. Decrement ‘N’ as N = N-1
9. Exit
Implementation : HeapSort
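/* Function header and first phase reconstructed from the HeapSort algorithm;
   a[1..n] holds the data (a[0] unused) and int elements are assumed.
   InsertHeap() and DeleteHeap() are the routines described earlier. */
void HeapSort(int a[], int n)
{
int j, item;
/* first phase: build the heap by inserting a[2], a[3], ..., a[n] in turn */
for (j = 1; j <= n - 1; j++)
InsertHeap(a, j, a[j + 1]);
/* second phase: move the current maximum to the end of the heap */
while (n >= 1)
{
item = DeleteHeap(a, n);
a[n] = item;
n--;
}
}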
Analysis – As stated earlier, the heap sorting process is divided into two phases, therefore
we analyze the complexity of each phase separately. In the first phase we assume that there are
‘n’ elements in the array. The number of comparisons needed to find the appropriate place
of a new element in H cannot exceed the depth of H. The depth of a complete binary tree is
bounded by log₂ n, where n is the number of elements in H. So the running time
of the first phase of heap sort is at most proportional to (n log₂ n).