Professional Documents
Culture Documents
CSE 332 Data Abstractions: Sorting It All Out: Kate Deibel Summer 2012
CSE 332 Data Abstractions: Sorting It All Out: Kate Deibel Summer 2012
Kate Deibel
Summer 2012
Input:
An array A of data records
A key value in each data record (maybe many fields)
A comparison function (must be consistent and
total): Given keys a and b is a<b, a=b, a>b?
Effect:
Reorganize the elements of A such that for any i and
j such that if i < j then A[i] A[j]
Array A must have all the data it started with
In-place sorting:
Uses at most O(1) auxiliary space beyond initial array
Non-Comparison Sorting:
Redefining the concept of comparison to improve speed
Other concepts:
External Sorting: Too much data to fit in main memory
Parallel Sorting: When you have multiple processors
STANDARD COMPARISON
SORT ALGORITHMS
Loop invariant:
When loop index is i, the first i elements are the i
smallest elements in sorted order
Time?
Best: _____ Worst: _____ Average: _____
Loop invariant:
When loop index is i, the first i elements are the i
smallest elements in sorted order
Loop invariant:
When loop index is i, first i elements are sorted
Time?
Best: _____ Worst: _____ Average: _____
Loop invariant:
When loop index is i, first i elements are sorted
Already or
Nearly Sorted Reverse Sorted See Book
Time: Best: O(n) Worst: O(n ) 2
Average: O(n2)
Stable and In-Place
5 7 6 9 8 10 4 3 2 1
arr[n-i] = deleteMin()
heap part sorted part
July 16, 2012 CSE 332 Data Abstractions, Summer 2012 22
In-Place Heap Sort
But this reverse sorts… how to fix?
Build a maxHeap instead
4 7 5 9 8 6 10 3 2 1
5 7 6 9 8 10 4 3 2 1
arr[n-i] = deleteMax()
deleteMin()
heap part sorted part
July 16, 2012 CSE 332 Data Abstractions, Summer 2012 23
"Dictionary Sorts"
We can also use a balanced tree to:
insert each element: total time O(n log n)
Repeatedly deleteMin: total time O(n log n)
After recursion: a 2 4 8 9 1 3 5 6
Merge:
Use 3 “fingers” and aux
1 more array
After merge, we
will copy back to
the original array
After recursion: a 2 4 8 9 1 3 5 6
Merge:
Use 3 “fingers” and aux 1
1 more array
After merge, we
will copy back to
the original array
After recursion: a 2 4 8 9 1 3 5 6
Merge:
Use 3 “fingers” and aux 1 2
1 more array
After merge, we
will copy back to
the original array
After recursion: a 2 4 8 9 1 3 5 6
Merge:
Use 3 “fingers” and aux 1 2 3
1 more array
After merge, we
will copy back to
the original array
After recursion: a 2 4 8 9 1 3 5 6
Merge:
Use 3 “fingers” and aux 1 2 3 4
1 more array
After merge, we
will copy back to
the original array
After recursion: a 2 4 8 9 1 3 5 6
Merge:
Use 3 “fingers” and aux 1 2 3 4 5
1 more array
After merge, we
will copy back to
the original array
After recursion: a 2 4 8 9 1 3 5 6
Merge:
Use 3 “fingers” and aux 1 2 3 4 5 6
1 more array
After merge, we
will copy back to
the original array
After recursion: a 2 4 8 9 1 3 5 6
Merge:
Use 3 “fingers” and aux 1 2 3 4 5 6 8
1 more array
After merge, we
will copy back to
the original array
After recursion: a 2 4 8 9 1 3 5 6
Merge:
Use 3 “fingers” and aux 1 2 3 4 5 6 8 9
1 more array
After merge, we
will copy back to
the original array
After recursion: a 2 4 8 9 1 3 5 6
Merge:
Use 3 “fingers” and aux 1 2 3 4 5 6 8 9
1 more array
After merge, we
will copy back to a 1 2 3 4 5 6 8 9
the original array
2 4 5 6 1 3 8 9 Main array
1 2 3 4 5 6 Auxiliary array
copy
second
Better Implementation
Use a new auxiliary array of size n for every merge
Best Implementation:
Do not copy back after merge
Swap usage of the original and auxiliary array (i.e.,
even levels move to auxiliary array, odd levels move
back to original array)
Will need one copy at end if number of stages is odd
Merge by 1
Merge by 2
Merge by 4
Merge by 8
Merge by 16
Copy if Needed
Performance:
To sort n elements, we
Return immediately if n=1
Else do 2 subproblems of size n/2 and then
an O(n) merge
Recurrence relation:
T(1) = c1
T(n) = 2T(n/2) + c2n
S1 0
S2
43 31 75
13 65
81 partition S
26 57 92
S1 S2
QuickSort(S1) and
0 13 26 31 43 57 65 75 81 92
QuickSort(S2)
S 0 13 26 31 43 57 65 75 81 92
Worst pivot?
Greatest/least element 8 2 9 4 5 3 1 6
Problem of size n - 1 1
8 2 9 4 5 3 6
O(n2)
0 1 2 3 4 5 6 7 8 9
6 1 4 9 0 3 5 2 7 8
6 1 4 9 0 3 5 2 7 8
Move fingers
6 1 4 2 0 3 5 9 7 8
Swap
6 1 4 2 0 3 5 9 7 8
Move fingers
5 1 4 2 0 3 6 9 7 8
Move pivot
This is a short example—you typically have
more than one swap during partition
July 16, 2012 CSE 332 Data Abstractions, Summer 2012 55
Quicksort Analysis
Best-case: Pivot is always the median
T(0)=T(1)=1
T(n)=2T(n/2) + n linear-time partition
Same recurrence as Mergesort: O(n log n)
Notes:
Could also use a cutoff for merge sort
Cutoffs are also the norm with parallel algorithms
(Switch to a sequential algorithm)
None of this affects asymptotic complexity, just
real-world performance
Example, n=3
a[0]<a[1]<a[2] a[0]<a[2]<a[1] a[1]<a[0]<a[2]
a[1]<a[2]<a[0] a[2]<a[0]<a[1] a[2]<a[1]<a[0]
Now: Show a binary tree with n! leaves has height Ω(n log n)
n log n is the lower bound, the height must be at least this
It could be more (in other words, a comparison sorting
algorithm could take longer but can not be faster)
Factorial function grows very quickly
Idea:
Bucket sort on one digit at a time
Number of buckets = radix
Starting with least significant digit, sort with
Bucket Sort
Keeping sort stable
Do one pass per digit
After k passes, the last k digits are sorted
Invariant:
After k passes the low order k digits are sorted.
Non-comparison sorts
Bucket sort good for small number of key values
Radix sort uses fewer buckets and more phases
It depends!
July 16, 2012 CSE 332 Data Abstractions, Summer 2012 83