
Presentation on

Sorting In Linear Time

ANUJ DUBEY
IIIrd B.TECH.(C.S.E.)
H.B.T.I. KANPUR
How fast can we sort?
All the sorting algorithms we have seen so
far are comparison sorts: only use
comparisons to determine the relative
order of elements.
The best worst-case running time that we’ve
seen for comparison sorting is O(n lg n) .
Prove a Lower Bound for
any comparison based
algorithm for the Sorting
Problem

How?
Decision trees help us.
Decision-tree example
Sort 〈a1, a2, …, an〉 (shown for n = 3):

                 1:2
              /       \
          2:3           1:3
         /   \         /   \
      123     1:3   213     2:3
             /   \         /   \
          132     312   231     321

Each internal node is labeled i:j for i, j ∈ {1, 2, …, n}.
• The left subtree shows subsequent comparisons if ai ≤ aj.
• The right subtree shows subsequent comparisons if ai > aj.
Decision-tree example
Sort 〈a1, a2, a3〉 = 〈9, 4, 6〉:
Step 1, node 1:2: a1 = 9 > a2 = 4, so take the right branch to node 1:3.
Decision-tree example
Sort 〈a1, a2, a3〉 = 〈9, 4, 6〉:
Step 2, node 1:3: a1 = 9 > a3 = 6, so take the right branch to node 2:3.
Decision-tree example
Sort 〈a1, a2, a3〉 = 〈9, 4, 6〉:
Step 3, node 2:3: a2 = 4 ≤ a3 = 6, so take the left branch to leaf 231.
Decision-tree example
Sort 〈a1, a2, a3〉 = 〈9, 4, 6〉:
Step 4, leaf 231: the ordering a2 ≤ a3 ≤ a1, i.e. 4 ≤ 6 ≤ 9, has been established.
Each leaf contains a permutation 〈π(1), π(2), …, π(n)〉 to indicate that the ordering aπ(1) ≤ aπ(2) ≤ ⋯ ≤ aπ(n) has been established.
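The n = 3 tree can be written out as nested comparisons. The sketch below is only an illustration: the function name `sort3_permutation` and the recorded `path` list are assumptions made for this example, not part of the slides.

```python
def sort3_permutation(a1, a2, a3):
    """Walk the n = 3 decision tree; return (leaf permutation, comparisons made)."""
    path = []

    def leq(i, j, x, y):
        # Record the comparison i:j and report whether a_i <= a_j (left branch).
        path.append(f"{i}:{j}")
        return x <= y

    if leq(1, 2, a1, a2):
        if leq(2, 3, a2, a3):
            leaf = (1, 2, 3)
        elif leq(1, 3, a1, a3):
            leaf = (1, 3, 2)
        else:
            leaf = (3, 1, 2)
    else:
        if leq(1, 3, a1, a3):
            leaf = (2, 1, 3)
        elif leq(2, 3, a2, a3):
            leaf = (2, 3, 1)
        else:
            leaf = (3, 2, 1)
    return leaf, path


leaf, path = sort3_permutation(9, 4, 6)
print(leaf, path)  # (2, 3, 1) ['1:2', '1:3', '2:3']
```

The path has length 3, which is the height of the tree and hence the worst case for this input size.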
Decision-tree model
A decision tree can model the execution of
any comparison sort:
• One tree for each input size n.
• View the algorithm as splitting whenever
it compares two elements.
• The tree contains the comparisons along
all possible instruction traces.
• The running time of the algorithm = the
length of the path taken.
• Worst-case running time = height of tree.
Lower bound for decision-tree sorting
Theorem. Any decision tree that can sort n elements must have height Ω(n lg n).
Proof. The tree must contain at least n! leaves, since there are n! possible permutations. A height-h binary tree has at most 2^h leaves. Thus, n! ≤ 2^h.
∴ h ≥ lg(n!)              (lg is monotonically increasing)
    ≥ lg((n/e)^n)         (Stirling's formula)
    = n lg n – n lg e
    = Ω(n lg n).
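To get a feel for the bound, the two quantities in the proof can be evaluated numerically. This is a small sketch; the function names are chosen for this example only.

```python
import math


def comparison_lower_bound(n):
    """Smallest integer h with n! <= 2^h, i.e. ceil(lg(n!))."""
    return math.ceil(math.log2(math.factorial(n)))


def stirling_estimate(n):
    """The weaker bound used in the proof: n lg n - n lg e."""
    return n * math.log2(n) - n * math.log2(math.e)


for n in (3, 10, 100):
    print(n, comparison_lower_bound(n), round(stirling_estimate(n), 1))
```

For n = 3 the bound is 3 comparisons, which matches the height of the decision tree in the example.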
Lower Bound for Sorting

• Any sorting algorithm based on comparisons between elements requires Ω(N log N) comparisons in the worst case.
Sorting in linear time
Counting sort: No comparisons between elements.
• Input: A[1 . . n], where A[ j]∈{1, 2, …, k} .
• Output: B[1 . . n], sorted.
• Auxiliary storage: C[1 . . k] .
Example 1. Counting Sort

• Counting Sort applies when the elements to be sorted come from a finite (and preferably small) set.
• For example, the elements to be sorted
are integers in the range [0…k-1], for
some fixed integer k.
• This allows us to create an array V[0…k-1]
which we use to count the number of
elements with each value [0…k-1].
• This then allows us to place each input
element in exactly the right place in the
output array in constant time.
Counting sort
for i ← 1 to k
    do C[i] ← 0
for j ← 1 to n
    do C[A[j]] ← C[A[j]] + 1        ⊳ C[i] = |{key = i}|
for i ← 2 to k
    do C[i] ← C[i] + C[i–1]         ⊳ C[i] = |{key ≤ i}|
for j ← n downto 1
    do B[C[A[j]]] ← A[j]
       C[A[j]] ← C[A[j]] – 1
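A direct Python transcription of the pseudocode might look like the sketch below. It assumes keys in 1..k as on the slide; the function name and the unused slot `c[0]` are choices made for this example.

```python
def counting_sort(a, k):
    """Stable counting sort of a list of integers in 1..k."""
    n = len(a)
    c = [0] * (k + 1)                 # c[0] is unused so indices match keys 1..k
    for key in a:
        c[key] += 1                   # c[i] = number of keys equal to i
    for i in range(2, k + 1):
        c[i] += c[i - 1]              # c[i] = number of keys <= i
    b = [None] * n
    for j in range(n - 1, -1, -1):    # right to left keeps equal keys in input order
        b[c[a[j]] - 1] = a[j]         # -1 converts the 1-based slot to a Python index
        c[a[j]] -= 1
    return b


print(counting_sort([4, 1, 3, 4, 3], 4))  # [1, 3, 3, 4, 4]
```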
CountingSort
Input: 1 0 0 1 3 1 1 3 1 0 2 1 0 1 1 2 2 1 0
Output: 0 0 0 0 0 1 1 1 1 1 1 1 1 1 2 2 2 3 3

Input: N records, each labeled with a digit between 0 and k-1.
Output: The records, stably sorted by digit.

Algorithm: Count to determine where records go.
Then go through the records in order, putting them where they go.
CountingSort
Input: 1 0 0 1 3 1 1 3 1 0 2 1 0 1 1 2 2 1 0
Output: 0 0 0 0 0 1 1 1 1 1 1 1 1 1 2 2 2 3 3
Index: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

Stable sort: If two records are the same for that digit, their order does not change.

Therefore, the 4th record in input with digit 1 must be the 4th record in output with digit 1.
It belongs in output index 8, because 8 records go before it, i.e. 5 records with a smaller digit and 3 records with the same digit.
Count these!
CountingSort
Input: 1 0 0 1 3 1 1 3 1 0 2 1 0 1 1 2 2 1 0
Output:
Index: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

Value v: 0 1 2 3
# of records with digit v: 5 9 3 2

N records. Time to count? Θ(N)


CountingSort
Input: 1 0 0 1 3 1 1 3 1 0 2 1 0 1 1 2 2 1 0
Output:
Index: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

Value v: 0 1 2 3
# of records with digit v: 5 9 3 2
# of records with digit < v: 0 5 14 17

N records, k different values. Time to count? Θ(k)


CountingSort
Input: 1 0 0 1 3 1 1 3 1 0 2 1 0 1 1 2 2 1 0
Output: 0 0 0 0 0 1 1 1 1 1 1 1 1 1 2 2 2 3 3
Index: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

Value v: 0 1 2 3
# of records with digit < v: 0 5 14 17
This is the location of the first record with digit v.
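The two counting passes can be sketched as follows, using the input from the slides. The function name `first_positions` is an assumption made for this example.

```python
def first_positions(digits, k):
    """Return (count, start): count[v] = records with digit v,
    start[v] = records with digit < v = output index of the first record with digit v."""
    count = [0] * k
    for d in digits:                  # Theta(N)
        count[d] += 1
    start = [0] * k
    for v in range(1, k):             # Theta(k)
        start[v] = start[v - 1] + count[v - 1]
    return count, start


digits = [1, 0, 0, 1, 3, 1, 1, 3, 1, 0, 2, 1, 0, 1, 1, 2, 2, 1, 0]
count, start = first_positions(digits, 4)
print(count)  # [5, 9, 3, 2]
print(start)  # [0, 5, 14, 17]
```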
CountingSort
Input: 1 0 0 1 3 1 1 3 1 0 2 1 0 1 1 2 2 1 0
Index: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

Algorithm: Go through the records in order putting them where they go.
Start with the location of the first record with digit v = 0, 1, 2, 3: 0 5 14 17.
After each placement, advance the location of the next record with that digit.

Input index   Digit   Placed at output index   Location of next record with digit v = 0 1 2 3
0             1       5                        0 6 14 17
1             0       0                        1 6 14 17
2             0       1                        2 6 14 17
3             1       6                        2 7 14 17
4             3       17                       2 7 14 18
5             1       7                        2 8 14 18
6             1       8                        2 9 14 18
7             3       18                       2 9 14 19
8             1       9                        2 10 14 19
9             0       2                        3 10 14 19
10            2       14                       3 10 15 19
11            1       10                       3 11 15 19

The remaining records are placed the same way.
CountingSort
Input: 1 0 0 1 3 1 1 3 1 0 2 1 0 1 1 2 2 1 0
Output: 0 0 0 0 0 1 1 1 1 1 1 1 1 1 2 2 2 3 3
Index: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

Value v: 0 1 2 3
Location of next record 5 14 17 19
with digit v.

Time = Θ(N)
Total = Θ(N + k) (counting ops) × (log N + log k) (bit ops)
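The forward pass walked through above can be sketched in Python as below. The function name, the `digit` callback, and the (digit, input index) record layout are assumptions made for this example.

```python
def stable_counting_sort(records, k, digit):
    """Stable sort of records by digit(record), a value in 0..k-1."""
    nxt = [0] * (k + 1)
    for r in records:
        nxt[digit(r) + 1] += 1        # counts, shifted one slot to the right
    for v in range(k):
        nxt[v + 1] += nxt[v]          # nxt[v] = number of records with digit < v
    out = [None] * len(records)
    for r in records:                 # input order, so equal digits keep their order
        v = digit(r)
        out[nxt[v]] = r
        nxt[v] += 1                   # location of the next record with digit v
    return out


digits = [1, 0, 0, 1, 3, 1, 1, 3, 1, 0, 2, 1, 0, 1, 1, 2, 2, 1, 0]
records = [(d, i) for i, d in enumerate(digits)]   # (digit, input index)
out = stable_counting_sort(records, 4, lambda r: r[0])
print(out[8])  # (1, 6): the 4th record with digit 1 lands at output index 8
```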
Analysis
for i ← 1 to k                          Θ(k)
    do C[i] ← 0
for j ← 1 to n                          Θ(n)
    do C[A[j]] ← C[A[j]] + 1
for i ← 2 to k                          Θ(k)
    do C[i] ← C[i] + C[i–1]
for j ← n downto 1                      Θ(n)
    do B[C[A[j]]] ← A[j]
       C[A[j]] ← C[A[j]] – 1
Total: Θ(n + k)
Radix Sort
• Algorithm
– sort by the least significant digit first (counting sort)
– sort by the next least significant digit
– continue this process until the numbers have been sorted on all d digits
Radix Sort
• Does it work?
• Clearly, if the most significant digits of a and b are different and a < b, then in the end a comes before b.
• If the most significant digits of a and b are the same, and the second most significant digit of b is less than that of a, then b comes before a.
To sort d digits:

Correctness:
• Induction on number of passes ( i in pseudocode).
• Assume digits 1, 2, . . . , i-1 are sorted.
• Show that a stable sort on digit i leaves digits 1, ..., i sorted:
• If 2 digits in position i are different, ordering by position i is correct,
and positions 1, . . . , i-1 are irrelevant.
• If 2 digits in position i are equal, numbers are already in the right
order (by inductive hypothesis). The stable sort on digit i leaves them
in the right order.
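A minimal sketch of the passes described above. As a simplification it uses one Python list per digit value as the stable intermediate sort instead of counting sort; the function name and the `base` parameter are assumptions for this example.

```python
def radix_sort(nums, d, base=10):
    """Sort non-negative integers that have at most d digits in the given base."""
    for i in range(d):                        # digit 0 is the least significant
        buckets = [[] for _ in range(base)]
        for x in nums:                        # appending in order keeps the pass stable
            buckets[(x // base**i) % base].append(x)
        nums = [x for bucket in buckets for x in bucket]
    return nums


print(radix_sort([329, 457, 657, 839, 436, 720, 355], d=3))
# [329, 355, 436, 457, 657, 720, 839]
```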
Analysis of Radix Sort
Given n d-digit numbers in which each digit can take on up to k possible values, RADIX-SORT correctly sorts these numbers in Θ(d(n + k)) time.

Assume that we use counting sort as the intermediate sort.
– Θ(n + k) per pass (digits in range 0, ..., k-1)
– d passes
– Θ(d(n + k)) total
– If k = O(n), time = Θ(d⋅n).
Operation of radix sort

Input   After sort on   After sort on   After sort on
        ones digit      tens digit      hundreds digit
329     720             720             329
457     355             329             355
657     436             436             436
839     457             839             457
436     657             355             657
720     329             457             720
355     839             657             839
RadixSort
Sort wrt which digit first? The most significant.
Sort wrt which digit second? The next most significant.

Input   After sort on   After sort on
        first digit     second digit
344     125             125
125     134             224
333     143             225
134     224             325
224     225             134
334     243             333
143     344             334
225     333             143
325     334             243
243     325             344

All meaning in first sort lost.
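The effect can be checked with a short experiment. This sketch relies on Python's built-in `sorted`, which is stable; the helper name `sort_by_digit` is an assumption for this example.

```python
def sort_by_digit(nums, i):
    """Stable sort of numbers by decimal digit i (0 = least significant)."""
    return sorted(nums, key=lambda x: (x // 10**i) % 10)   # sorted() is stable


nums = [344, 125, 333, 134, 224, 334, 143, 225, 325, 243]

# Most significant digit first, then the next most significant.
msd_first = sort_by_digit(sort_by_digit(nums, 2), 1)

# Least significant digit first: ones, tens, hundreds.
lsd_first = nums
for i in range(3):
    lsd_first = sort_by_digit(lsd_first, i)

print(msd_first == sorted(nums))  # False
print(lsd_first == sorted(nums))  # True
```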
RadixSort
Sort wrt which digit first? The least significant.
Sort wrt which digit second? The next least significant.

Input   After sort on   After sort on
        last digit      middle digit
344     333             2 24
125     143             1 25
333     243             2 25
134     344             3 25
224     134             3 33
334     224             1 34
143     334             3 34
225     125             1 43
325     225             2 43
243     325             3 44

After the second pass the numbers are sorted wrt their last two digits.
RadixSort
Sort wrt i+1st digit.

Sorted wrt        Sorted wrt
first i digits    first i+1 digits
2 24              1 25
1 25              1 34
2 25              1 43
3 25              2 24
3 33              2 25
1 34              2 43
3 34              3 25
1 43              3 33
2 43              3 34
3 44              3 44

• Numbers that differ in the i+1st digit are in the correct order because they are sorted wrt the high order digit.
• Numbers with the same i+1st digit are in the correct order because they were sorted before and the stable sort left them sorted.
Bucket Sort

Assumes the input is generated by a random process that distributes elements uniformly over [0, 1).
Idea:
• Divide [0, 1) into n equal-sized buckets.
• Distribute the n input values into the buckets.
• Sort each bucket.
• Then go through buckets in order, listing elements in each
one.
Input: A[1 .. n], where 0 ≤ A[i] < 1 for all i.
Auxiliary array: B[0 .. n-1] of linked lists, each list initially empty.
Bucket Sort
• Applicable if input is constrained to finite
interval, e.g., [0…1).
• If input is random and uniformly
distributed, expected run time is Θ(n).
BUCKET-SORT(A)
n ← length[A]
for i ← 1 to n
    do insert A[i] into list B[⌊n·A[i]⌋]
for i ← 0 to n – 1
    do sort list B[i] with insertion sort
concatenate the lists B[0], B[1], . . ., B[n – 1] together in order
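A Python sketch of BUCKET-SORT follows. It uses Python lists in place of linked lists, and the `min(n - 1, ...)` guard against floating-point rounding is an addition for this example, not part of the pseudocode.

```python
def insertion_sort(lst):
    """In-place insertion sort, used on each (expected short) bucket."""
    for i in range(1, len(lst)):
        x = lst[i]
        j = i - 1
        while j >= 0 and lst[j] > x:
            lst[j + 1] = lst[j]
            j -= 1
        lst[j + 1] = x


def bucket_sort(a):
    """Sort a list of values in [0, 1) using n = len(a) buckets."""
    n = len(a)
    buckets = [[] for _ in range(n)]              # B[0 .. n-1]
    for x in a:
        # Bucket index floor(n * x); the min() guards against n * x rounding up to n.
        buckets[min(n - 1, int(n * x))].append(x)
    for b in buckets:
        insertion_sort(b)
    return [x for b in buckets for x in b]        # concatenate B[0], ..., B[n-1]


print(bucket_sort([0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68]))
```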
Correctness:

Consider A[i], A[j].
Assume without loss of generality that A[i] ≤ A[j].
Then n·A[i] ≤ n·A[j], and therefore ⌊n·A[i]⌋ ≤ ⌊n·A[j]⌋.
So A[i] is placed into the same bucket as A[j] or into a bucket with a lower index.
– If same bucket, insertion sort fixes up.
– If earlier bucket, concatenation of lists fixes up.
Analysis:
• Relies on no bucket getting too many values.
• All lines of algorithm except insertion sorting
take Θ(n) altogether.
• Intuitively, if each bucket gets a constant number
of elements, it takes O(1) time to sort each
bucket ⇒O(n) sort time for all buckets.
• We “expect” each bucket to have few elements,
since the average is 1 element per bucket.
THANK YOU
VERY MUCH
