
Sorting

Previously
 Sorting Defined; Sorting Algorithms for run-time complexity analysis
 Algorithms for Sorting
 Selection Sort and Insertion Sort: O(n^2)
 Shell Sort: O(n log^2 n)
 Merge Sort and Heap Sort: O(n log n)
 Quick Sort: O(n log n) average case, O(n^2) worst case

 Master Theorem as recipe for analyzing many recursive algorithms


 Limitation of Comparison-based sorting: worst case Ω(n log n)
 n! possible permutations of input array
 log(n!) is Θ(n log n)
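The last bullet can be checked with two simple bounds on the sum of logarithms. The sketch below assumes n ≥ 2 and uses the standard fact that a binary decision tree with n! leaves has height at least log(n!).

```latex
\begin{align*}
\log(n!) &= \sum_{k=1}^{n} \log k \;\le\; n \log n
  && \text{so } \log(n!) = O(n \log n) \\
\log(n!) &\ge \sum_{k=\lceil n/2 \rceil}^{n} \log k \;\ge\; \frac{n}{2} \log \frac{n}{2}
  && \text{so } \log(n!) = \Omega(n \log n)
\end{align*}
```

Together the two bounds give log(n!) = Θ(n log n).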
Breaking the O(n log n) Barrier
 Do not make comparisons (and swaps) the fundamental operation
 Create containers with implied order
 Bucket/Bin Sort
 Radix Sort
Bucket Sort
 MAIN IDEA:
 Partition input set into sub sets having some implied order
 Like L+E+G in quicksort, but with (normally) more partitions
 These containers are usually called buckets/bins
 Place input elements into the appropriate buckets/bins
 Sort smaller instances within containers
 Having more buckets implies greater probability that each bucket will contain fewer elements
 Merge the (sorted) contents of these containers
Bin Sort
 INPUT: Array A of n distinct elements from the set {1, 2, …, m}, m > n
 OUTPUT: Sorted array
1. For Index ← 1 to m
2.     B[Index] ← 0
3. For i ← 1 to n
4.     Index ← A[i]
5.     B[Index] ← Index
6. i ← 1
7. For Index ← 1 to m
8.     If B[Index] > 0
9.         A[i] ← Index
10.        i ← i + 1
11. Return A
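The numbered steps above can be sketched in Python roughly as follows. This is a minimal illustration, not a reference implementation: the function name `bin_sort` and the explicit `m` parameter are assumptions, and the list `bins` carries one unused slot at index 0 so that its indices match the 1-based pseudocode.

```python
def bin_sort(a, m):
    """Sort a list of distinct integers drawn from {1, ..., m} in place."""
    # Steps 1-2: clear every bin (index 0 is unused padding).
    bins = [0] * (m + 1)

    # Steps 3-5: drop each element into the bin matching its value.
    for value in a:
        bins[value] = value

    # Steps 6-10: scan the bins in order and copy non-empty ones back.
    i = 0
    for index in range(1, m + 1):
        if bins[index] > 0:
            a[i] = index
            i += 1

    # Step 11: the input list is now sorted.
    return a


print(bin_sort([8, 2, 9, 5, 4], 10))
```

Like the pseudocode, this sketch relies on the elements being distinct; a repeated value would overwrite its own bin and be lost.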
Bin Sort: Example
 A = 8 2 9 5 4 (n = 5, m = 10)
 Steps 1–2: every bin starts at 0
 Steps 3–5: each element is placed in the bin matching its value

Index        1 2 3 4 5 6 7 8 9 10
B (initial)  0 0 0 0 0 0 0 0 0 0
after 8      0 0 0 0 0 0 0 8 0 0
after 2      0 2 0 0 0 0 0 8 0 0
after 9      0 2 0 0 0 0 0 8 9 0
after 5      0 2 0 0 5 0 0 8 9 0
after 4      0 2 0 4 5 0 0 8 9 0

 Steps 6–10: scan B from 1 to m and copy each non-empty bin back into A

A (initial)  8 2 9 5 4
after bin 2  2 2 9 5 4
after bin 4  2 4 9 5 4
after bin 5  2 4 5 5 4
after bin 8  2 4 5 8 4
after bin 9  2 4 5 8 9

 Step 11: return A = 2 4 5 8 9
Bin Sort
 Run-time Complexity
 O(n) to read contents of A
 O(m) to read contents of B
 O(n+m) overall
 Linear if m is O(n)
 What if m is O(n^2) or O(n^3)?
 Other issue:
 What if there are duplicates?
Counting Sort
 Variant of Bin Sort
 Handles duplicate entries
 By counting the number of occurrences
 EXAMPLE: A = 8 5 8 4 5

Index        1 2 3 4 5 6 7 8 9 10
B (initial)  0 0 0 0 0 0 0 0 0 0
B (counts)   0 0 0 1 2 0 0 2 0 0

 A (sorted) = 4 5 5 8 8
Counting Sort
 Try This!
 Write algorithm for counting sort
 What is the run-time complexity of your solution?
 Note: consider time and space complexities for sparse inputs over an extremely large range
 e.g., A = (5, 1E300, 1)
 Even Selection Sort will run faster
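One possible answer to the exercise, offered as a sketch rather than the intended solution. It assumes the inputs are integers from 1 to m and that only the values matter (no attached records); the name `counting_sort` and the `m` parameter are illustrative choices.

```python
def counting_sort(a, m):
    """Return a sorted copy of a list of integers drawn from {1, ..., m}."""
    # counts[v] holds how many times v occurs (index 0 is unused padding).
    counts = [0] * (m + 1)
    for value in a:
        counts[value] += 1

    # Emit each value as many times as it was counted, in increasing order.
    result = []
    for value in range(1, m + 1):
        result.extend([value] * counts[value])
    return result


print(counting_sort([8, 5, 8, 4, 5], 10))
```

Under these assumptions the first loop is O(n) and the second is O(n+m), with O(m) extra space for `counts`. That is why a sparse input over a huge range, such as the one in the note above, makes this approach impractical.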
General Bucket Sort
 MAIN IDEA:
 Each container (bucket) can have more than 1 unique value
 Implies fewer buckets
 EXAMPLE:
 Let each bucket have max size of 10 distinct values
 Bucket 0: 0-9, Bucket 1: 10-19, …
 For an element k, Bucket Index is ⌊k/10⌋
 A = 88 53 47 81 55, buckets indexed 0 to 9
 After partitioning: Bucket 4: 47; Bucket 5: 53, 55; Bucket 8: 88, 81; all other buckets empty
General Bucket Sort
 MAIN IDEA:
 Each container (bucket) can have more than 1 unique value
 Implies fewer buckets
 After Partitioning:
 Sort each bucket
 Can use Insertion Sort
 “Merge” sorted buckets
 EXAMPLE: A = 88 53 47 81 55
 B (partitioned): Bucket 4: 47; Bucket 5: 53, 55; Bucket 8: 88, 81
 B’ (each bucket sorted): Bucket 4: 47; Bucket 5: 53, 55; Bucket 8: 81, 88
 A’ (merged): 47 53 55 81 88
 Discuss:
 Guaranteed “uniform” distribution?
 Run-time complexity for n items over m buckets?
 Space complexity?
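The partition, sort, and merge phases can be sketched in Python as below. The names `bucket_sort`, `insertion_sort`, `bucket_width`, and `num_buckets` are assumptions made for illustration, and the sketch assumes non-negative integers smaller than `bucket_width * num_buckets`.

```python
def insertion_sort(items):
    """Sort a list in place with insertion sort."""
    for j in range(1, len(items)):
        key = items[j]
        i = j - 1
        # Shift larger elements right to open a slot for key.
        while i >= 0 and items[i] > key:
            items[i + 1] = items[i]
            i -= 1
        items[i + 1] = key


def bucket_sort(a, bucket_width=10, num_buckets=10):
    """Return a sorted copy of a, assuming 0 <= k < bucket_width * num_buckets."""
    # Partition: element k goes to bucket floor(k / bucket_width).
    buckets = [[] for _ in range(num_buckets)]
    for k in a:
        buckets[k // bucket_width].append(k)

    # Sort each bucket, then "merge" by concatenating buckets in index order.
    result = []
    for bucket in buckets:
        insertion_sort(bucket)
        result.extend(bucket)
    return result


print(bucket_sort([88, 53, 47, 81, 55]))
```

If every element lands in the same bucket, this sketch degrades to a single insertion sort over all n items, which is one way to approach the "uniform distribution" question.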
Radix Sort
 Radix – refers to the base of a number system
 Each digit in a number system is from 0 to (r-1), where r is the radix
 The actual value of a digit is digit × place_value
 Radix sort is a sorting algorithm that takes advantage of the known possible digits and their values in order to avoid comparison operations in sorting a set of numbers.
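To make the digit and place-value idea concrete, here is a small sketch that pulls one digit out of a non-negative integer. The helper name `digit_at` and its parameters are hypothetical; position 0 is taken to mean the least significant digit.

```python
def digit_at(number, position, radix=10):
    """Return the digit of a non-negative integer at the given position.

    Position 0 is the least significant digit; the place value of
    position p is radix ** p.
    """
    return (number // radix ** position) % radix


# 748 in base 10 is 7 * 100 + 4 * 10 + 8 * 1.
digits = [digit_at(748, p) for p in range(3)]
print(digits)
print(sum(d * 10 ** p for p, d in enumerate(digits)))
```

A number with fewer digits simply yields 0 for the missing positions, so `digit_at(53, 2)` is 0.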
Radix Sort
 MAIN IDEA:
 Prepare r buckets with indices
0 to r-1
 Sort the numbers by the least
significant digit (e.g., ones, for
base-10)
 Then sort by the next
significant digit (e.g., tens, for
base-10)
 Repeat this until you have
finished sorting by most
significant digit.
Radix Sort: Example
 Input: 188 53 748 381 155 (r = 10, buckets 0 to 9)
 Pass 1, by ones digit:
 Bucket 1: 381; Bucket 3: 53; Bucket 5: 155; Bucket 8: 188, 748
 Collected: 381 53 155 188 748
 Pass 2, by tens digit:
 Bucket 4: 748; Bucket 5: 53, 155; Bucket 8: 381, 188
 Collected: 748 53 155 381 188
 Pass 3, by hundreds digit (53 is treated as 053):
 Bucket 0: 053; Bucket 1: 155, 188; Bucket 3: 381; Bucket 7: 748
 Collected: 53 155 188 381 748
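The least-significant-digit procedure can be sketched in Python as follows. This is a minimal version assuming non-negative integers; the name `radix_sort` and the `radix` parameter are illustrative. Each bucket is a list that keeps elements in arrival order, which is what lets the ordering from earlier passes survive later ones.

```python
def radix_sort(a, radix=10):
    """Return a sorted copy of a list of non-negative integers."""
    result = list(a)
    if not result:
        return result

    largest = max(result)
    place = 1  # place value of the digit examined in the current pass
    while place <= largest:
        # Distribute into r buckets by the current digit; appending keeps
        # arrival order inside each bucket.
        buckets = [[] for _ in range(radix)]
        for number in result:
            buckets[(number // place) % radix].append(number)

        # Collect the buckets in index order, then move to the next digit.
        result = [number for bucket in buckets for number in bucket]
        place *= radix
    return result


print(radix_sort([188, 53, 748, 381, 155]))
```

For this input the loop runs three passes (ones, tens, hundreds), one per digit of the largest element.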
Radix Sort
 Discuss:
 What’s the run-time for n elements over radix r, with longest element having length d?
 Space complexity?
 Can be used to sort mails by zip codes?
 Can be used to sort strings of length < 20?
 Can be used to sort real numbers in general?
Sorting Issues
 Does O(n log n) always beat O(n^2)?
 Which should be considered: Best Case, Average Case or Worst Case?
 Recursive vs Iterative
 Recursive is normally easier to code (more intuitive most of the time)
 Stack Overflow issue for recursive algorithms
 Iterative normally runs faster

 In-place sorting is usually faster, but normally requires more complex code
 Memory/Secondary Storage Paging
 Merge Sort works on contiguous array elements: potentially fewer page requests
 Heap Sort accesses array elements that are not contiguous (e.g., node k may be swapped with its parent, node ⌊k/2⌋)
 How about Quick sort? Radix sort?
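As a starting point for the first question on this slide, the sketch below compares two made-up cost functions. The constant factor 100 is purely hypothetical and is not a measurement of any real algorithm; it only illustrates that Big-O hides constants, so the asymptotically slower algorithm can win on small inputs.

```python
import math

C = 100  # hypothetical constant factor for the n log n algorithm


def cost_n_log_n(n):
    """Made-up operation count for an O(n log n) algorithm."""
    return C * n * math.log2(n)


def cost_quadratic(n):
    """Made-up operation count for an O(n^2) algorithm with constant 1."""
    return n * n


# Report which hypothetical cost is smaller at a few input sizes.
for n in (16, 512, 1024, 1_000_000):
    cheaper = "n log n" if cost_n_log_n(n) < cost_quadratic(n) else "n^2"
    print(n, cheaper)
```

With these assumed constants the quadratic cost is lower at n = 512 and the n log n cost is lower from n = 1024 upward.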
Closing
 Many sorting algorithms available, most have several variants
 For our course, the discussion of algorithms serves 2 main purposes:
 Appreciation of different ways of accomplishing the same goal
 Examples on analysis of run-time (and space) complexities, which is the
main basis for comparing different algorithms that solve the same problem

 Bonus: you were “forced” to do programming