You are on page 1of 6

# Sorting revisited Lower bounds for sorting

## I In order to prove a lower bound for a problem we need to somehow

show that no algorithm can be faster than our bound in the worst case.
I In COMS12100 we looked at various sorting algorithms and analysed This is different from seeing how fast a particular algorithm takes to
their time complexities. run.
I Bubble sort - Θ(n2 ) I To prove a lower bound we need to show that for any program P that
I Quicksort - Θ(n log n) on average but Θ(n2 ) in the worst case correctly sorts all inputs, we can find an input such that P takes time
I Merge Sort, Heap Sort - Θ(n log n) in the worst case ≥ cn log n for some c (and for all n larger than some constant).
I The question we want to answer is "is it possible to do any better?" I We have to define carefully which operations we want to count. We
I What do we mean by “do better?" have to define the computational model.
I An algorithm gives an “upper bound" for a problem but tells us nothing
about whether another faster algorithm might exist.
I All the sorting algorithms we have seen so far have worked by
I We will look at how to prove a lower bound for sorting. comparing pairs of elements in the input. We will count only
I Then we will show how to “beat" this lower bound. comparison operations and show that Ω(n log n) comparisons are
always needed in the worst case.
I But is there any alternative to comparison sorting? We will see the
answer to this question is yes.

## Clifford, Harrow and Page Clifford, Harrow and Page

COMS21102 : Software Engineering Slide 1 COMS21102 : Software Engineering Slide 2

## Decision Trees Decision Trees

Sort < a1 , . . . , an > (assume w.l.o.g. that all ai are distinct) Sort < 7, 4, 6 >

Each internal node is labelled i : j for i, j ∈ {1, 2, . . . , n} Each internal node is labelled i : j for i, j ∈ {1, 2, . . . , n}

I The left subtree shows subsequent comparisons if ai ≤ aj I The left subtree shows subsequent comparisons if ai ≤ aj
I The right subtree shows subsequent comparisons if ai > aj I The right subtree shows subsequent comparisons if ai > aj

## Clifford, Harrow and Page Clifford, Harrow and Page

COMS21102 : Software Engineering Slide 3 COMS21102 : Software Engineering Slide 4
Decision Trees Decision Trees
Sort < 7, 4, 6 > Sort < 7, 4, 6 >

Each internal node is labelled i : j for i, j ∈ {1, 2, . . . , n} Each internal node is labelled i : j for i, j ∈ {1, 2, . . . , n}

I The left subtree shows subsequent comparisons if ai ≤ aj I The left subtree shows subsequent comparisons if ai ≤ aj
I The right subtree shows subsequent comparisons if ai > aj I The right subtree shows subsequent comparisons if ai > aj

## Clifford, Harrow and Page Clifford, Harrow and Page

COMS21102 : Software Engineering Slide 5 COMS21102 : Software Engineering Slide 6

## A decision tree can model the execution of any comparison sort.

I One tree for each input size n
I Tree contains all possible sequences of comparisons needed to sort
the input
I The running time of the algorithm on a particular input is the length of
the path taken
I Worst case is the height of the tree

## Clifford, Harrow and Page Clifford, Harrow and Page

COMS21102 : Software Engineering Slide 7 COMS21102 : Software Engineering Slide 8
Lower Bound for Decision Tree Model Lower Bound for Decision Tree Model

Theorem
Any decision tree that can sort n elements must have height Ω(n log n).

Proof. Corollary
The tree must contain at least n! leaves as there are n! different
Merge sort and heap sort are asymptotically optimal comparison sorting
permutations to choose from. A binary tree of height h has ≤ 2h leaves.
algorithms.
Therefore n! ≤ number of leaves ≤ 2h .
n n
h ≥ log n! ≥ log( )
2 2
h ∈ Ω(n log n)

## Clifford, Harrow and Page Clifford, Harrow and Page

COMS21102 : Software Engineering Slide 9 COMS21102 : Software Engineering Slide 10

## Linear Time Sorting Counting sort

for i ← 0 to k do
C[i] ← 0;
end
for j ← 1 to n do
Counting sort: No comparisons used at all C[A[j]] ← C[A[j]] + 1;
end
Input: A[1, . . . , n] where A[j] ∈ {1, 2, . . . , k} . C[i] now contains the number of elements equal to i;
Output: B[1, . . . , n], sorted and a permutation of A for i ← 2 to k do
Auxiliary storage: C[0, . . . , k] C[i] ← C[i] + C[i − 1];
end
. C[i] now contains the number of elements less than or equal
to i;
for j ← n downto 1 do
B[C[A[j]]] ← A[j];
C[A[j]] ← C[A[j]] − 1;
end

## Clifford, Harrow and Page Clifford, Harrow and Page

COMS21102 : Software Engineering Slide 11 COMS21102 : Software Engineering Slide 12
Counting sort example Counting sort analysis
Sum the 4 different loops giving Θ(n + k) time in total.
for i ← 0 to k do
C[i] ← 0;
end
The initialisation takes Θ(k) time;
for j ← 1 to n do
C[A[j]] ← C[A[j]] + 1;
On blackboard... end
Building the array of element counts takes Θ(n) time;
for i ← 1 to k do
C[i] ← C[i] + C[i − 1];
end
Accumulating the element count takes Θ(k) time;
for j ← n downto 1 do
B[C[A[j]]] ← A[j];
C[A[j]] ← C[A[j]] − 1;
end
“Distribution" takes Θ(n) time;

## Clifford, Harrow and Page Clifford, Harrow and Page

COMS21102 : Software Engineering Slide 13 COMS21102 : Software Engineering Slide 14

## Running time Stable sorting

If k ∈ Θ(n) then the total running time of counting sort is Θ(n). A crucial property of counting sort is that it is stable.
I But we proved an Ω(n log n) lower bound! What happened? I Why did we bother with the last two loops of counting sort?
Answer: I We want the sort to be stable.
I We proved a lower bound for comparison sorts I It preserves the order of equal elements
I Counting sort is not a comparison sort What other sorting algorithms are stable?
I In fact there is not a single comparison in it! See blackboard...

## Clifford, Harrow and Page Clifford, Harrow and Page

COMS21102 : Software Engineering Slide 15 COMS21102 : Software Engineering Slide 16

## We can prove the correctness of radix sort by induction on the digit

position.
I Base case. Radix sort clearly is correct for single digit numbers
Radix sort is possibly the oldest implemented sorting algorithm. It operates
digit by digit from the least significant digit to the most significant one. I Inductive hypothesis. Assume the numbers are sorted by their t − 1
See blackboard... lowest order digits
I Inductive step. Sort using digit t
I Two numbers that have the same digit t preserve their original order by
stability
I Two numbers that differ at digit t are placed in the correct order

## Clifford, Harrow and Page Clifford, Harrow and Page

COMS21102 : Software Engineering Slide 17 COMS21102 : Software Engineering Slide 18

## Remember that counting sort takes Θ(n + k) time.

I Assume counting sort is used as the auxiliary sorting algorithm I Each call to counting sort takes Θ(n + 2r ) time
I Sort n numbers of b bits each I b/r calls are made to counting sort
I A “digit" is r ≤ b bits long I The total time is therefore
I If the numbers are 32 bits long and r = 8 then we can think of the b
T (n, b) ∈ Θ( (n + 2r ))
input as having 4 digits in base 2r r
I What should we set r to? I How can we set r to minimise this?
See blackboard... I Increasing r means fewer passes but then the time for counting sort
grows exponentially

## Clifford, Harrow and Page Clifford, Harrow and Page

COMS21102 : Software Engineering Slide 19 COMS21102 : Software Engineering Slide 20
To help us get a feel for the problem we plot r against the running time
(b/r )(n + 2r ) (setting all constant factors to 1). We want to minimise
b
Radix sort running time for b=32, n=1000 and r=1...15 T (n, b) ∈ Θ( (n + 2r ))
80000
r
Running time

70000
I

## 60000 minimise the function.

50000
I However, we can get a good guess (which turns out to be correct) by
remembering that n + 2r ∈ Θ(max(n, 2r )). We set r = log(n) so that
2r = n.
Time

40000

30000
I Therefore
T (n, b) ∈ Θ(nb/ log n)
20000
I For numbers in the range from 0 to nd −1 , we have b = d log n ⇒ radix
10000 sort runs in Θ(dn) time.
0
2 4 6 8 10 12 14
r

## Clifford, Harrow and Page Clifford, Harrow and Page

COMS21102 : Software Engineering Slide 21 COMS21102 : Software Engineering Slide 22

Summary

I Counting sort runs in Θ(n) time when the values to be sorted are less
than n
I Radix is fast for 32-bit numbers, for example, where only 4 passes of
count sort and an auxiliary array of size 28 = 256 are needed if we set
r =8
I Merge sort or quicksort will need at least 11 linear time passes if there
are, say, ≥ 2000 values to be sorted
I However, radix sort has poor memory locality and so a well tuned
quicksort may be faster in practice

## Clifford, Harrow and Page

COMS21102 : Software Engineering Slide 23