
Algorithm: Design and Analysis

Sorting Problem
Dr. Khaled W. Mahmoud

1- Quick Sort
 Idea:
◦ Partition the given elements into two sets, so that the smaller
numbers go into one set and the larger numbers into another.
 This partition is done with respect to an element called the pivot.
◦ Repeat the process for both sets until each partition
contains only one element.

Quick Sort: Algorithm
void QuickSort (int A[], int low, int high){
int pivotPoint = partition(A, low, high);
if (low < pivotPoint ) QuickSort(A, low, pivotPoint-1);
if (high > pivotPoint) QuickSort(A, pivotPoint+1, high);
}
Quick Sort: Algorithm
int partition(int A[], int low, int high){
int pivot=A[low];
while (low < high) {
while ((A[high]>pivot) && (low < high)) high=high-1;
if (low != high) {
A[low]=A[high];
low=low+1;
}
while ((A[low]<pivot)&& (low < high)) low=low+1;
if (low != high){
A[high]=A[low];
high=high-1;
}
}// end while
A[high]=pivot;
return high;
}

Quick Sort: Example

(worked trace omitted; for this example the number of DC (data comparisons) is 21)
Quick Sort: Partition Analysis
 The partition algorithm compares each key to the pivot, so if
there are n positions in the range of the array it is
working on, it does n-1 key comparisons.
◦ → O(n): blind.
Quick Sort: Analysis
 Worst case (DC): when partition splits
the range into
1. an empty sub-range (no key smaller than the pivot), and
2. a sub-range with k-1 elements.
◦ This happens when the array is sorted or in
reversed order.
 W(n) = n(n-1)/2 ∈ O(n^2)
 A(n) ≈ 1.386 n lg n - 2.846 n

Quick Sort: Analysis
 Best case (DC): Suppose that
each partition splits the range
into two equal sub-ranges; then
◦ there are log2(N) levels of recursive
calls.
◦ The whole array is scanned at each
level of calls, so the total work
done is O(N*log(N)).
 This O(N*log(N)) is the upper limit for the best case.

How many DC?

 Do the analysis based on DM (data movements).

Quick Sort: Space usage
 Explicitly:
◦ It is an "in-place" sort, as no extra array is needed.
 Implicitly:
◦ There is hidden space usage: the stack for the recursive calls.
◦ The memory needed is n for the original array plus Θ(depth of the
call tree) for the stack frames.
◦ The size of the recursion stack depends on the number of sub-ranges
into which the range is split.
 The worst case:
◦ If partition splits off one entry at a time, the depth of recursion is n,
and each stack frame reserves space for its range, i.e. W(n) ∈ Θ(n).
 The average case is Θ(log n).
Quick Sort: Improvements
1- Choice of Pivot
 QSort works well if the pivot belongs near the middle of the segment.
 Choosing E[first] as the pivot causes QSort to do poorly in cases where
sorting should be easy.
 Several strategies for choosing the pivot:
◦ Choose a random key.
◦ Let the pivot be the median of E[first], E[last] and E[(first+last)/2].
 In either case the selected pivot should be moved to E[first].
 The main idea is to split the range into two equal sub-ranges.
Quick Sort: Improvements
2- Small Sort
 Whenever the size of a subset is small, QSort becomes inefficient.
 This can be improved by using a simple, non-recursive sort algorithm for
small subsets, e.g.:
if (l - f >= smallSize)
--
else
smallSort(E, f, l)
3- Stack-space Optimization:
 The second recursive call in the last statement can be converted into
iteration.
 Avoid making the recursive call on the larger sub-range.
2- Merge Sort
 Idea: It splits the input array into two halves, calls itself for
the two halves, and then merges the two sorted halves.
Merge Sort: Merging Sorted Sequences
 Problem: Given two sequences A and B sorted in non-
decreasing order, merge them to create one sorted
sequence C.
 Strategy:
◦ Determine the first item in C: it is the minimum between the
first items of A and B.
◦ Suppose it is the first item of A. Then the rest of C consists of
merging the rest of A with B.
Merge Sort: Merging Sorted Sequences
MergeSS_Rec(A, B, C)
if (A is empty)
rest of C = rest of B
else if (B is empty)
rest of C = rest of A
else if (first of A <= first of B)
first of C = first of A
MergeSS_Rec (rest of A, B, rest of C)
else
first of C = first of B
MergeSS_Rec (A, rest of B, rest of C)
return
Merge Sort: Merging Sorted Sequences
MergeSS (A, B, C) // iterative
i = j = k = 0;
while (i < len(A) && j < len(B)) {
if (A[i] <= B[j]) {
C[k] = A[i];
i++; k++;
} else {
C[k] = B[j];
j++; k++;
}
}
if (i >= len(A))
copy the remaining of B to C
else
copy the remaining of A to C
Merge Sorted Sequence: Analysis
 Given A with k elements, B with m elements, and n = k+m
 Worst Case (DC):
◦ Note: whenever a comparison of keys is done, at least one
element is moved. The remaining elements are moved with no
DC needed (i.e. by the copy statement).
◦ To get the worst case for DC, we need to minimize the number of
elements moved to C by the copy statement. This can happen if:
 the last elements of both arrays belong in the last two positions of C;
then after the last comparison, at least two elements have not yet been
moved to C, and the smaller one is moved.
◦ C then has at most n-1 elements, and no more comparisons will be done.
 W(n) = n - 1 comparisons
Merge Sorted Sequence: Analysis
Best Case (DC):
 When the last element in the first array is less than or equal to the first element
in the second array.
 Assuming that both arrays are of equal size (i.e. n/2 each), then
◦ B(n) = n/2 - 1 comparisons

Space Usage:
 Merging sorted sequences with a total of n entries (k+m) requires enough memory
locations for 2n entries:
◦ all entries are copied to C.
 Merge Sorted Sequence is not an in-place algorithm.

What about DM? In all cases you need n DM (data movements).


Merge Sort: Algorithm
void mergeSort(Element[] E, int first, int last){
if (first < last)
{ int mid = (first+last)/2;
mergeSort(E, first, mid);
mergeSort(E, mid+1, last);
MergeSS (E, first, mid, last);
}
}

Note: MergeSS(E, first, mid, last) merges adjacent sub-ranges
of one array, putting the result back into the same array (you
need to define two temporary arrays).
Merge Sort: Analysis

Worst case (DC): test the above analysis using an array of size 9.
Merge Sort: Analysis
 Merge Sort is not an in-place sort:
◦ Extra space is used in merge: W(n) ∈ Θ(n)
◦ A stack is used for recursion: W(n) ∈ Θ(log n)
◦ Total: Θ(n)
 However, the amount of extra space needed can be
decreased:
 If A and B are linked lists and they are not needed after the
merge is completed, then the nodes can be recycled as C is
created.
 If k >= m and A has room for n = k + m entries, then only the extra m
locations in A are needed (i.e. do the merging from the right end).
Merge Sort: Analysis
Note:
 Merge Sort does about 30% fewer comparisons in the
worst case than quick sort does on average.
 However, merge sort does more element movements than
QSort.

 Quick sort: A(n) ≈ 1.386 n lg n - 2.846 n
 Merge sort: W(n) ≈ n lg n - 0.914 n
3- Heap Sort
 Heap Definition: A heap is a binary tree
with two special properties:
1. Heap structure
2. Partial order tree property
 Heap Structure Definition:
◦ A binary tree T is a heap structure if and only if it satisfies the
following conditions (h = height of the tree):
1. T is complete at least through depth h-1
2. All leaves are at depth h or h-1
3. All paths to a leaf of depth h are to the left of all paths to a leaf of
depth h-1
 Such a tree is also called a left-complete binary tree.
Heap Sort
 Partial order tree property Definition:
◦ A tree T is a (maximizing) partial order tree if and only if the key at any
node is greater than or equal to the keys at each of its children (if it has
any).
Heap Sort
Implementation issue: storing a tree in an array
 Given an array E with index range 0, ..., n-1
 If the index i of a node is given, then
◦ the left child has index 2i+1
◦ the right child has index 2i+2
◦ the parent has index floor((i-1)/2)
 Note, for a heap structure:
◦ Each complete level has 2^d elements, where d is the level
number (0-based).
◦ A complete tree with levels 0..d has 2^(d+1) - 1 elements in the whole
tree.

 Height of a node = the number of edges on a longest simple
path from the node down to a leaf
 Height = floor(lg n); n is number of elements
 Depth of a node = the length of a path from the root to the
node

Heap Sort: fixHeap (heapify)
 Input: an array A and an index i.
◦ Assumption: the sub-trees rooted at Left(i) and Right(i) are proper
max-heaps, but A[i] may be smaller than its children.
 Output: a heap that fulfills both requirements.
void fixHeap(int A[],int index,int n){
int j=2*index+1;
while ( j < n ){
if ( j+1 < n ){if ( A[j] < A[j+1] )j++; }
if ( A[index] < A[j] ){
int t=A[index]; A[index]=A[j]; A[j]=t;
index=j; j=2*index+1;
}
else
break;
}}

FixHeap (Recursive)
void fixHeapR(int E[], int root, int heapSize) {
int k=E[root];int largerSubHeap;
int left=2*root+1, right = 2*root+2;
if (left>= heapSize)// Root is a leaf
E[root] = k;
else{ //Root has left child
if (right>=heapSize)
largerSubHeap=left; //Root has left child only
else if (E[left] > E[right]) //Root has both left and right children
largerSubHeap = left;
else
largerSubHeap = right;
if (k>= E[largerSubHeap])
E[root]=k;
else{
E[root] = E[largerSubHeap];
E[largerSubHeap]=k;
fixHeapR(E, largerSubHeap, heapSize);
} }}
Heap Sort: Construct (build) Heap
 Input: A heap structure H that does not necessarily
have the partial order tree property
 Output: H with the same nodes rearranged to satisfy
the partial order tree property
void construct(int A[], int n ){
int y=(n-1 -1)/2;
while ( y>=0 ){
fixHeap(A,y,n);
y--;
}
}

Heap Sort: Construct (build) Heap (Recursive)

void constructR(int E[], int root, int heapsize)


{
int left=2*root+1, right = 2*root+2;
if (left< heapsize) // Not a leaf
{
constructR (E,left,heapsize);
constructR (E,right,heapsize);
fixHeap(E, root, heapsize);
}
}
Heap Sort: The main Algorithm
void HeapSort(int B[], int n){
construct(B,n);
sortH(B,n); // delete Max
}

Flow: unsorted array B → construct → heap H → sortH → sorted array S
Heap Sort: Example
 Given an array of 6 elements: 15, 19, 10, 7, 17, 16, sort
it in ascending order using heap sort.

 After construct, the array is 19, 17, 16, 7, 15, 10 (tree figure omitted).

 Delete 19:
◦ Swap 19 with 10 (last
element)
◦ Exclude 19 from the array
(n--)
◦ Fix the root (10)
 Delete 17 …
 Delete 16 …
 Delete 15 …
 Delete 10 ….

Heap Sort: delete Max (sort heap)
void sortH(int A[], int n){
int y=n-1;
while( y>0 ){
int t=A[0];
A[0]=A[y];
A[y]=t;
y--;
n--;
fixHeap(A,0,n);
}
}
Heap Sort: Complexity Analysis
For Space:
 In-place algorithm in the iterative version.
 The depth of recursion (if used) is about lg n.

For Time:
First: fixHeap ... DC
 Requires 2h comparisons of keys in the worst case on a
heap with height h (two comparisons per level: find the larger child,
then compare it with the key being sifted down).
 h = floor(lg n) // for example: if n = 15, then h = floor(lg 15) = 3
 W(n) ≈ 2 lg(n)
Heap Sort: Complexity Analysis
Second: construct Heap ... DC
 Let r be the number of nodes in the right sub-tree
 W(n) = W(construct left sub-tree) + W(construct right sub-tree) + fixHeap
 W(n) = W(n-r-1) + W(r) + 2 lg(n)

 To simplify the solution, let N = 2^(d+1) - 1, where d = number of levels,
i.e. a complete tree (the right and left sub-trees are equal):
◦ W(N) = 2W((N-1)/2) + 2 lg(N), for N > 1
◦ Using the Master theorem, we get W(N) ∈ Θ(N):
◦ the heap is constructed in linear time.
 Since n is less than or equal to N,
◦ W(n) <= W(N)
◦ i.e. W(N) is an upper bound (worst case).
Heap Sort: Complexity Analysis
Third: HeapSort Analysis
 Heap construction is a linear algorithm (neglected).
 The number of comparisons done by fixHeap on a heap with k
nodes is at most 2 lg(k), so the total for all deletions is at most

∑ for k = 1 to n-1 of 2 lg(k) ∈ Θ(n lg n)

 Exact solution: W(n) = 2(n lg n - 1.443n)

 Theorem: The number of comparisons of keys done by Heapsort in the
worst case is 2n lg(n) + O(n).

 Heapsort does Θ(n lg(n)) comparisons on average as well.
Heap Sort: Complexity Analysis
 What about DM?

Finally: Optimality
Theorem:
 Any algorithm to sort n items by comparisons must do
at least
◦ ceil(lg(n!)) ≈ n lg n - 1.443n comparisons in the worst case, and
◦ about n lg n - 1.443n comparisons in the average case.

Finally: Comparison of Sorting Algorithms

QSort:
• Rearranges elements in the original array.
• Cannot be sure of making an even subdivision.
• It has a bad worst case.

Merge Sort:
• Can be sure of making an even subdivision.
• Has a nearly optimal worst case.
• Cannot rearrange the elements in the original array.

Heap Sort:
• Rearranges in the same array.
• Optimal worst case, but it has higher constant factors than the others.
Divide & Conquer Algorithms:
 A divide and conquer algorithm repeatedly reduces an
instance of a problem to one or more smaller instances
of the same problem (usually recursively), until the
instances are small enough to solve easily.
1. Binary search
2. Quick sort: hard division, easy combination.
3. Merge sort: easy division, hard combination.
 It is often easier to solve several small instances of a
problem than one large instance.
Divide & Conquer Algorithms:
1. Divide the problem into smaller instances of the same problem
2. Solve (conquer) the smaller instances recursively

3. Combine the solutions to obtain the solution for original input

 Solve(I)
if (size(I) <= small-size)
solution = directly-Solve(I);
else
divide I into I1, …, Ik.
for each i in {1, …, k}
Si = solve(Ii);
solution = combine(S1, …, Sk);
 return solution;
Divide & Conquer Algorithms:
To describe the amount of work done by the algorithm:
 for n > small-size:
◦ T(n) = D(n) + ∑T(size(Ii)) + C(n), where
 D(n): steps done by divide
 T(size(Ii)): steps done to solve sub-instance Ii
 C(n): steps done by combine
 for n <= small-size:
◦ T(n) = B(n): done by directly-Solve
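As a sketch, instantiating this recurrence for merge sort (the Θ-terms below are the standard ones, not stated on this slide):

```latex
% Merge sort: divide is O(1), two sub-instances of size n/2,
% combine (merge) does at most n-1 comparisons.
T(n) = D(n) + \sum_i T(\mathrm{size}(I_i)) + C(n)
     = 2\,T(n/2) + \Theta(n)
\;\Rightarrow\; T(n) \in \Theta(n \log n)
```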
Example: Quicksort
 Divide
◦ Partition the array A into 2 subarrays, such that each
element of A[p..q-1] is smaller than each element of
A[q+1..r]
◦ The pivot index q is computed as part of partitioning
 Conquer
◦ Recursively sort A[p..q-1] and A[q+1..r] using Quicksort
 Combine
◦ Trivial: the arrays are sorted in place, so no work is needed to
combine them: the entire array is now sorted
Example: Merge sort
 Divide
◦ Divide the n-element sequence to be sorted into
two subsequences of n/2 elements each
 Conquer
◦ Sort the subsequences recursively using merge sort
◦ When the size of the sequences is 1 there is
nothing more to do
 Combine
◦ Merge the two sorted subsequences
