Sorting
Previously
Sorting Defined; Sorting Algorithms for runtime complexity analysis
Algorithms for Sorting
Selection Sort and Insertion Sort: O(n^2)
Shell Sort: O(n log^2 n)
Merge Sort and Heap Sort: O(n log n)
Quick Sort: O(n log n) average case, O(n^2) worst case
Master Theorem as recipe for analyzing many recursive algorithms
Limitation of comparison-based sorting: worst case is Ω(n log n)
n! possible permutations of input array
log(n!) is Θ(n log n)
Breaking the O(n log n) Barrier
Do not make comparisons (and swaps) the fundamental operation
Create containers with implied order
Bucket/Bin Sort
Radix Sort
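The Ω(n log n) limit recalled above can be sketched as follows. This assumes the usual decision-tree model, which these slides do not spell out: a comparison sort is viewed as a binary tree whose leaves are the n! possible orderings, so its height h (the worst-case number of comparisons) must satisfy h ≥ log2(n!).

```latex
\begin{align*}
h \;\ge\; \log_2(n!) &= \sum_{k=1}^{n} \log_2 k \\
  &\ge \sum_{k=\lceil n/2 \rceil}^{n} \log_2 k
   \;\ge\; \frac{n}{2}\,\log_2\frac{n}{2}
   \;=\; \Omega(n \log n), \\
\log_2(n!) &\le \sum_{k=1}^{n} \log_2 n \;=\; n \log_2 n \;=\; O(n \log n).
\end{align*}
```

Together the two bounds give log(n!) = Θ(n log n), so no comparison-based sort can have a worst case better than Ω(n log n).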
Bucket Sort
MAIN IDEA:
Partition input set into subsets having some implied order
Like the L+E+G (less, equal, greater) partitions in quicksort, but with (normally) more partitions
These containers are usually called buckets/bins
Place input elements into the appropriate buckets/bins
Sort the smaller instances within the containers
Having more buckets implies a greater probability that each bucket will
contain fewer elements
Merge the (sorted) contents of these containers
Bin Sort
INPUT: Array A of n distinct elements
from the set {1, 2,…,m}, m > n
OUTPUT: Sorted array
1. For Index ← 1 to m
2.    B[Index] ← 0
3. For i ← 1 to n
4.    Index ← A[i]
5.    B[Index] ← Index
6. i ← 1
7. For Index ← 1 to m
8.    If B[Index] > 0
9.       A[i] ← Index
10.      i ← i + 1
11. Return A
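The pseudocode above can be sketched in Python as below. This is a minimal illustration, not a reference implementation: the function name `bin_sort` and the unused slot 0 (kept so that bin numbers match the 1-based pseudocode) are choices made here, and the input is assumed, not checked, to hold distinct integers from 1 to m.

```python
def bin_sort(a, m):
    """Sort a list of distinct integers from {1, ..., m} in place.

    Assumption (not verified): every element is unique and lies in 1..m.
    """
    # Steps 1-2: one bin per possible value; slot 0 is unused.
    bins = [0] * (m + 1)

    # Steps 3-5: drop each element into the bin named after its value.
    for value in a:
        bins[value] = value

    # Steps 6-10: scan the bins in order and write occupied ones back to a.
    i = 0
    for index in range(1, m + 1):
        if bins[index] > 0:
            a[i] = index
            i += 1

    # Step 11
    return a


print(bin_sort([8, 2, 9, 5, 4], 10))  # [2, 4, 5, 8, 9]
```

The two loops touch n and m entries respectively, which is where the O(n+m) bound comes from.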
Bin Sort: Trace
A 8 2 9 5 4 (n = 5, m = 10)
Steps 1 to 2: clear B
B 0 0 0 0 0 0 0 0 0 0
1 2 3 4 5 6 7 8 9 10
Steps 3 to 5: copy each A[i] into B[A[i]]
After A[1] = 8: B 0 0 0 0 0 0 0 8 0 0
After A[2] = 2: B 0 2 0 0 0 0 0 8 0 0
After A[3] = 9: B 0 2 0 0 0 0 0 8 9 0
After A[4] = 5: B 0 2 0 0 5 0 0 8 9 0
After A[5] = 4: B 0 2 0 4 5 0 0 8 9 0
Steps 6 to 10: scan B from 1 to m, writing each nonzero entry back to A
After Index = 2: A 2 2 9 5 4
After Index = 4: A 2 4 9 5 4
After Index = 5: A 2 4 5 5 4
After Index = 8: A 2 4 5 8 4
After Index = 9: A 2 4 5 8 9
Step 11: return A
Bin Sort
Runtime Complexity
O(n) to read contents of A
O(m) to read contents of B
O(n+m) overall
Linear if m is O(n)
What if m is O(n^2) or O(n^3)?
Other issue:
What if there are duplicates?
Counting Sort
Variant of Bin Sort
Handles duplicate entries
By counting the number of occurrences
A 8 5 8 4 5
B before counting:
B 0 0 0 0 0 0 0 0 0 0
1 2 3 4 5 6 7 8 9 10
B after counting:
B 0 0 0 1 2 0 0 2 0 0
1 2 3 4 5 6 7 8 9 10
A after writing the counted values back:
A 4 5 5 8 8
Counting Sort
Try This!
Write algorithm for counting sort
What is the runtime complexity of your solution?
Note: time and space complexities involving sparse inputs over extremely large range
A=(5, 1E300, 1)
Even Selection Sort will run faster
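One possible answer to the exercise, as a minimal sketch rather than the intended solution. The name `counting_sort` and the decision to return a new list are choices made here; elements are assumed, not checked, to be integers from 1 to m.

```python
def counting_sort(a, m):
    """Return a sorted copy of a, whose elements are integers in 1..m.

    Duplicates are allowed. Assumption (not verified): 1 <= x <= m for all x.
    """
    # counts[v] = number of times v occurs in a; slot 0 is unused.
    counts = [0] * (m + 1)
    for value in a:
        counts[value] += 1

    # Emit each value as many times as it was counted, in increasing order.
    result = []
    for value in range(1, m + 1):
        result.extend([value] * counts[value])
    return result


print(counting_sort([8, 5, 8, 4, 5], 10))  # [4, 5, 5, 8, 8]
```

Counting takes O(n) and the scan over the counts takes O(m) plus O(n) for output, so the sketch runs in O(n+m) time with O(m) extra space for the counts. That is why an input such as A=(5, 1E300, 1) is hopeless for it: m dominates even though n = 3.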
General Bucket Sort
MAIN IDEA:
Each container (bucket) can have more than 1 unique value
Implies fewer buckets
EXAMPLE:
Let each bucket hold at most 10 distinct values
Bucket 0: 0-9, Bucket 1: 10-19,…
For an element k, Bucket Index is ⌊k/10⌋
A 88 53 47 81 55
Buckets 0 to 9 start empty
After partitioning (elements listed in arrival order):
Bucket 4: 47
Bucket 5: 53, 55
Bucket 8: 88, 81
All other buckets: empty
General Bucket Sort
After Partitioning:
Sort each bucket
Can use Insertion Sort
“Merge” sorted buckets
A 88 53 47 81 55
B (after partitioning): Bucket 4: 47; Bucket 5: 53, 55; Bucket 8: 88, 81
B’ (each bucket sorted): Bucket 4: 47; Bucket 5: 53, 55; Bucket 8: 81, 88
A’ 47 53 55 81 88
Discuss:
Guaranteed “uniform” distribution?
Runtime complexity for n items over m buckets?
Space complexity?
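The partition, sort, and merge steps above can be sketched as follows. This is an illustrative sketch under stated assumptions: the names `bucket_sort`, `insertion_sort`, `width` and `num_buckets` are introduced here, and every element is assumed to be an integer with 0 <= k < width * num_buckets.

```python
def insertion_sort(items):
    """Sort a list in place; cheap when the list is short."""
    for j in range(1, len(items)):
        key = items[j]
        i = j - 1
        # Shift larger elements one slot right to open a gap for key.
        while i >= 0 and items[i] > key:
            items[i + 1] = items[i]
            i -= 1
        items[i + 1] = key


def bucket_sort(a, width=10, num_buckets=10):
    """Return a sorted copy of a.

    Assumption (not verified): 0 <= k < width * num_buckets for every k.
    """
    # Partition: element k goes to bucket floor(k / width).
    buckets = [[] for _ in range(num_buckets)]
    for k in a:
        buckets[k // width].append(k)

    # Sort each bucket, then "merge" by concatenating in bucket order.
    result = []
    for bucket in buckets:
        insertion_sort(bucket)
        result.extend(bucket)
    return result


print(bucket_sort([88, 53, 47, 81, 55]))  # [47, 53, 55, 81, 88]
```

Concatenation is enough for the merge because every element of bucket i is smaller than every element of bucket i+1. If all elements land in one bucket, the work degenerates to a single insertion sort, which is what the "uniform distribution" question is probing.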
Radix Sort
Radix – refers to the base of a number system
Each digit in a number system is from 0 to (r−1), where r is the radix
The actual value of a digit is digit × place_value
Radix sort is a sorting algorithm that takes advantage of the known
possible digits and their place values in order to avoid comparison
operations in sorting a set of numbers.
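To make "digit × place_value" concrete, the short sketch below pulls out the digit at a given position of a non-negative integer in radix r. The helper name `digit_at` and its parameters are hypothetical, introduced only for this illustration.

```python
def digit_at(number, position, r=10):
    """Digit of a non-negative integer at `position` in radix r.

    position 0 is the least significant digit, whose place value is r**0.
    """
    return (number // r ** position) % r


# 748 in base 10: digits 8, 4, 7 at positions 0, 1, 2.
ones, tens, hundreds = (digit_at(748, p) for p in range(3))
print(ones, tens, hundreds)               # 8 4 7
print(ones * 1 + tens * 10 + hundreds * 100)  # 748

# 5 in base 2 is 101: digits 1, 0, 1 at positions 0, 1, 2.
print([digit_at(5, p, r=2) for p in range(3)])  # [1, 0, 1]
```

Each digit is always in the range 0 to r−1, so it can serve directly as a bucket index without comparing numbers to one another.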
Radix Sort
MAIN IDEA:
Prepare r buckets with indices
0 to r−1
Sort the numbers by the least
significant digit (e.g., ones, for
base 10)
Then sort by the next
significant digit (e.g., tens, for
base 10)
Elements in the same bucket keep
their arrival order in every pass
Repeat this until you have
finished sorting by the most
significant digit.
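The passes described above can be sketched as a least-significant-digit radix sort. This is a minimal sketch, assuming non-negative integers; the name `radix_sort` and the use of Python lists as buckets are choices made here, not taken from the slides.

```python
def radix_sort(a, r=10):
    """Return a sorted copy of a list of non-negative integers.

    Assumption (not verified): every element is an integer >= 0.
    """
    result = list(a)
    if not result:
        return result

    largest = max(result)
    place = 1  # place value of the digit used in the current pass
    while place <= largest:
        # One bucket per possible digit, 0 to r-1.
        buckets = [[] for _ in range(r)]
        for number in result:
            buckets[(number // place) % r].append(number)

        # Collect buckets in index order. Appending keeps arrival order
        # inside each bucket, so the ordering from earlier passes survives.
        result = [number for bucket in buckets for number in bucket]
        place *= r
    return result


print(radix_sort([188, 53, 748, 381, 155]))  # [53, 155, 188, 381, 748]
```

The loop stops once the place value exceeds the largest element, so the number of passes equals the number of digits of the longest number.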
Radix Sort: Example (r = 10, buckets 0 to 9)
Input: 188 53 748 381 155
Pass 1, ones digit:
Bucket 1: 381
Bucket 3: 53
Bucket 5: 155
Bucket 8: 188, 748
Collected: 381 53 155 188 748
Pass 2, tens digit:
Bucket 4: 748
Bucket 5: 53, 155
Bucket 8: 381, 188
Collected: 748 53 155 381 188
Pass 3, hundreds digit (53 is treated as 053):
Bucket 0: 053
Bucket 1: 155, 188
Bucket 3: 381
Bucket 7: 748
Collected: 53 155 188 381 748
Radix Sort
Discuss:
What’s the runtime for n elements over radix r, with the longest element having length d?
Space complexity?
Can it be used to sort mail by zip code?
Can it be used to sort strings of length < 20?
Can it be used to sort real numbers in general?
Sorting Issues
Does O(n log n) always beat O(n^2)?
Which should be considered: Best Case, Average Case or Worst Case?
Recursive vs Iterative
Recursive is normally easier to code (more intuitive most of the time)
Stack Overflow issue for recursive algorithms
Iterative normally runs faster
In-place sorting is usually faster, but normally requires more complex code
Memory/Secondary Storage Paging
Merge Sort accesses contiguous array elements: potentially fewer page requests
Heap Sort accesses array elements that are not contiguous (e.g., node k may be
swapped with its parent, node ⌊k/2⌋)
How about Quick sort? Radix sort?
Closing
Many sorting algorithms are available; most have several variants
For our course, the discussion of algorithms serves 2 main purposes:
Appreciation of different ways of accomplishing the same goal
Examples on analysis of runtime (and space) complexities, which is the
main basis for comparing different algorithms that solve the same problem
Bonus: you were “forced” to do programming