
Bucket Sort and Radix Sort

Several sorting algorithms have been discussed. The best ones so far:

Heap sort and Merge sort: O( n log n )

Quick sort (best one in practice): O( n log n ) on average, O( n^2 ) worst case

Comparison-based sorting cannot do better than this: it can be proven that any comparison-based sorting algorithm must carry out Ω( n log n ) comparisons in the worst case.

10/02/05
BucketSort

Suppose the values in the list to be sorted can repeat but the values have a limit (e.g., values are digits from 0 to 9).

Sorting, in this case, appears easier.

Is it possible to come up with an algorithm better than O( n log n )?

Yes. The strategy will not involve comparisons.

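Decision-tree example

Sort $\langle a_1, a_2, \ldots, a_n \rangle$. The decision tree for n = 3:

1:2
├── left: 2:3
│   ├── left: 123
│   └── right: 1:3
│       ├── left: 132
│       └── right: 312
└── right: 1:3
    ├── left: 213
    └── right: 2:3
        ├── left: 231
        └── right: 321

Each internal node is labeled i:j for $i, j \in \{1, 2, \ldots, n\}$.

The left subtree shows subsequent comparisons if $a_i \le a_j$.

The right subtree shows subsequent comparisons if $a_i > a_j$.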

Decision-tree example

Sort $\langle a_1, a_2, a_3 \rangle = \langle 9, 4, 6 \rangle$ by following one root-to-leaf path in the same tree:

At node 1:2, $9 \ge 4$, so go right.
At node 1:3, $9 \ge 6$, so go right.
At node 2:3, $4 \le 6$, so go left and reach leaf 231, that is, $4 \le 6 \le 9$.

Each leaf contains a permutation $\langle \pi(1), \pi(2), \ldots, \pi(n) \rangle$ to indicate that the ordering $a_{\pi(1)} \le a_{\pi(2)} \le \cdots \le a_{\pi(n)}$ has been established.

Decision-tree model
A decision tree can model the execution of
any comparison sort:
One tree for each input size n.
View the algorithm as splitting whenever
it compares two elements.
The tree contains the comparisons along
all possible instruction traces.
The running time of the algorithm = the
length of the path taken.
Worst-case running time = height of tree.

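Any comparison sort

Can be turned into a decision tree

class InsertionSortAlgorithm {
    static void sort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int j = i;
            int B = a[i];  // element being inserted
            while ((j > 0) && (a[j-1] > B)) {
                a[j] = a[j-1];  // shift the larger element right
                j--;
            }
            a[j] = B;
        }
    }
}

For n = 3, insertion sort yields the tree from the decision-tree example: root 1:2, second level 2:3 and 1:3, third level 1:3 and 2:3, with leaves 123 and 213 at depth 2 and leaves 132, 312, 231, 321 at depth 3.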
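To make the link between a comparison sort and its decision tree concrete, the sketch below counts the element comparisons insertion sort performs on a given input, which is the length of the root-to-leaf path it follows. The class name `ComparisonCounter` and the method `countComparisons` are illustrative assumptions, not part of the slides.

```java
// Sketch: count the element comparisons made by insertion sort, i.e. the
// length of the path it follows in its decision tree.
// Class and method names are illustrative, not taken from the slides.
class ComparisonCounter {
    static int countComparisons(int[] input) {
        int[] a = input.clone();       // leave the caller's array untouched
        int comparisons = 0;
        for (int i = 1; i < a.length; i++) {
            int j = i;
            int b = a[i];              // element being inserted
            while (j > 0) {
                comparisons++;         // one comparison of a[j-1] against b
                if (a[j - 1] <= b) {
                    break;             // insertion point found
                }
                a[j] = a[j - 1];       // shift the larger element right
                j--;
            }
            a[j] = b;
        }
        return comparisons;
    }

    public static void main(String[] args) {
        System.out.println(countComparisons(new int[] {9, 4, 6}));
        System.out.println(countComparisons(new int[] {1, 2, 3}));
    }
}
```

For the input 9, 4, 6 the count is 3 (nodes 1:2, 1:3, 2:3), and for 1, 2, 3 it is 2, matching the depths of leaves 231 and 123.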

Lower bound for decision-tree sorting

Theorem. Any decision tree that can sort n elements must have height $\Omega(n \lg n)$.

Proof. The tree must contain at least $n!$ leaves, since there are $n!$ possible permutations. A height-$h$ binary tree has at most $2^h$ leaves. Thus, $n! \le 2^h$.

$$
\begin{aligned}
\therefore\ h &\ge \lg(n!) && \text{(lg is monotonically increasing)} \\
&\ge \lg\left((n/e)^n\right) && \text{(Stirling's formula)} \\
&= n \lg n - n \lg e \\
&= \Omega(n \lg n).
\end{aligned}
$$
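As a numeric check on the bound, the following sketch computes the smallest height h satisfying n! ≤ 2^h, that is, the ceiling of lg(n!). The class name `DecisionTreeBound` and the method `minHeight` are hypothetical names chosen for this illustration.

```java
// Sketch: smallest h with n! <= 2^h, i.e. ceil(lg(n!)).
// Names are illustrative assumptions, not from the slides.
class DecisionTreeBound {
    static int minHeight(int n) {
        java.math.BigInteger factorial = java.math.BigInteger.ONE;
        for (int k = 2; k <= n; k++) {
            factorial = factorial.multiply(java.math.BigInteger.valueOf(k));
        }
        // For an integer x >= 1, ceil(lg x) equals the bit length of x - 1.
        return factorial.subtract(java.math.BigInteger.ONE).bitLength();
    }

    public static void main(String[] args) {
        for (int n : new int[] {3, 4, 5, 10}) {
            System.out.println(n + " elements: height >= " + minHeight(n));
        }
    }
}
```

For n = 3 this gives 3, which agrees with the height of the three-element decision tree.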

Bucket sort

Idea: suppose the values are in the range 0 to m-1. Scan the list and place element s[i] in bucket s[i], and then output the buckets in order.

We will need an array of buckets, and the values in the list to be sorted will be the indexes to the buckets.

No comparisons will be necessary.

Example

Input:  4 2 1 2 0 3 2 1 4 0 2 3 0

Bucket 0: 0 0 0
Bucket 1: 1 1
Bucket 2: 2 2 2 2
Bucket 3: 3 3
Bucket 4: 4 4

Output: 0 0 0 1 1 2 2 2 2 3 3 4 4

Bucket sort algorithm

Algorithm BucketSort( S )
( values in S are between 0 and m-1 )
    for j ← 0 to m-1 do            // initialize m buckets
        b[j] ← 0
    for i ← 0 to n-1 do            // place elements in their
        b[S[i]] ← b[S[i]] + 1      // appropriate buckets
    i ← 0
    for j ← 0 to m-1 do            // place elements in buckets
        for r ← 1 to b[j] do       // back in S
            S[i] ← j
            i ← i + 1
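The pseudocode above translates fairly directly into Java. This is a minimal sketch, assuming the caller supplies m and that every value really lies in 0..m-1; the class name `CountingBucketSort` is an illustrative choice.

```java
// Sketch of the counter-based bucket sort from the pseudocode.
// The class name is illustrative; no range checking is performed.
class CountingBucketSort {
    // Sorts s in place; assumes every value is in the range 0..m-1.
    static void sort(int[] s, int m) {
        int[] b = new int[m];              // m counters, all initially 0
        for (int value : s) {
            b[value]++;                    // count each value in its bucket
        }
        int i = 0;
        for (int j = 0; j < m; j++) {      // write values back in bucket order
            for (int r = 0; r < b[j]; r++) {
                s[i++] = j;
            }
        }
    }

    public static void main(String[] args) {
        int[] s = {4, 2, 1, 2, 0, 3, 2, 1, 4, 0, 2, 3, 0};
        sort(s, 5);
        System.out.println(java.util.Arrays.toString(s));
    }
}
```

The `main` method uses the 13 values from the example with m = 5.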

If we were sorting values, each bucket is just a counter that we increment whenever a value matching the bucket's number is encountered.

If we were sorting entries according to keys, then each bucket is a queue:

Entries are enqueued into a matching bucket.

Entries will be dequeued back into the array after the scan.

Bucket sort algorithm

Algorithm BucketSort( S )
( S is an array of entries whose keys are between 0..m-1 )
    for j ← 0 to m-1 do                      // initialize m buckets
        initialize queue b[j]
    for i ← 0 to n-1 do                      // place in buckets
        b[S[i].getKey()].enqueue( S[i] )
    i ← 0
    for j ← 0 to m-1 do                      // place elements in
        while not b[j].isEmpty() do          // buckets back in S
            S[i] ← b[j].dequeue()
            i ← i + 1
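One possible Java rendering of the queue-based version is sketched below. The `Entry` class is a hypothetical stand-in for whatever entry type the course uses, and `java.util.ArrayDeque` plays the role of the queue, with `addLast` and `removeFirst` as enqueue and dequeue.

```java
// Sketch of bucket sort with one FIFO queue per key.
// Entry is a hypothetical type introduced only for this illustration.
class QueueBucketSort {
    static final class Entry {
        final int key;
        final String value;
        Entry(int key, String value) { this.key = key; this.value = value; }
    }

    // Sorts s in place by key; assumes every key is in the range 0..m-1.
    static void sort(Entry[] s, int m) {
        java.util.List<java.util.ArrayDeque<Entry>> b = new java.util.ArrayList<>();
        for (int j = 0; j < m; j++) {
            b.add(new java.util.ArrayDeque<>());    // initialize m buckets
        }
        for (Entry e : s) {
            b.get(e.key).addLast(e);                // enqueue into matching bucket
        }
        int i = 0;
        for (int j = 0; j < m; j++) {
            while (!b.get(j).isEmpty()) {
                s[i++] = b.get(j).removeFirst();    // dequeue back into the array
            }
        }
    }

    public static void main(String[] args) {
        Entry[] s = { new Entry(2, "a"), new Entry(0, "b"), new Entry(2, "c"), new Entry(1, "d") };
        sort(s, 3);
        for (Entry e : s) System.out.println(e.key + " " + e.value);
    }
}
```

Because each bucket is first-in first-out, entries with equal keys leave in the order they arrived: in the `main` example, "a" stays ahead of "c".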

Time complexity

Bucket initialization: O( m )
From array to buckets: O( n )
From buckets to array: O( n )

Even though this last stage is a nested loop, notice that all we do is dequeue from each bucket until they are all empty, so there are n dequeue operations in all.

The total is O( n + m ). Since m will likely be small compared to n, bucket sort is O( n ).

Sorting integers

Can we perform bucket sort on any array of (non-negative) integers?

Yes, but note that the number of buckets will depend on the maximum integer value.

If you are sorting 1000 integers and the maximum value is 999999, you will need 1 million buckets!

Time complexity is not really O( n ) because m is much greater than n. Actual time complexity is O( m ).

Can we do better?

Radix sort

Idea: repeatedly sort by digit, that is, perform multiple bucket sorts on S starting with the rightmost digit.

If the maximum value is 999999, only ten buckets (not 1 million) will be necessary.

Use this strategy when the keys are integers, and there is a reasonable limit on their values.

The number of passes (bucket sort stages) will depend on the number of digits in the maximum value.

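Example: first pass

Input: 12 58 37 64 52 36 99 63 18 9 20 88 47

Bucket 0: 20
Bucket 1: (empty)
Bucket 2: 12 52
Bucket 3: 63
Bucket 4: 64
Bucket 5: (empty)
Bucket 6: 36
Bucket 7: 37 47
Bucket 8: 58 18 88
Bucket 9: 99 9

Output: 20 12 52 63 64 36 37 47 58 18 88 99 9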

Example: second pass

Input: 20 12 52 63 64 36 37 47 58 18 88 99 9

Bucket 0: 9
Bucket 1: 12 18
Bucket 2: 20
Bucket 3: 36 37
Bucket 4: 47
Bucket 5: 52 58
Bucket 6: 63 64
Bucket 7: (empty)
Bucket 8: 88
Bucket 9: 99

Output: 9 12 18 20 36 37 47 52 58 63 64 88 99

Example: 1st and 2nd passes

12 58 37 64 52 36 99 63 18 9 20 88 47

sort by rightmost digit

20 12 52 63 64 36 37 47 58 18 88 99 9

sort by leftmost digit

9 12 18 20 36 37 47 52 58 63 64 88 99
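The two passes can be sketched in Java as repeated stable bucket sorts over decimal digits. This is an illustration under the assumption that all values are non-negative ints; the class name `RadixSortSketch` and the helper `pass` are hypothetical names, and the buckets are `java.util.ArrayDeque` queues.

```java
// Sketch of radix sort as repeated bucket-sort passes, rightmost digit first.
// Names are illustrative; assumes non-negative values.
class RadixSortSketch {
    // One stable bucket-sort pass on the decimal digit selected by divisor
    // (1 = rightmost digit, 10 = next digit to the left, and so on).
    static void pass(int[] s, int divisor) {
        java.util.List<java.util.ArrayDeque<Integer>> b = new java.util.ArrayList<>();
        for (int j = 0; j < 10; j++) {
            b.add(new java.util.ArrayDeque<>());       // ten buckets, one per digit
        }
        for (int v : s) {
            b.get((v / divisor) % 10).addLast(v);      // enqueue by current digit
        }
        int i = 0;
        for (java.util.ArrayDeque<Integer> q : b) {
            while (!q.isEmpty()) {
                s[i++] = q.removeFirst();              // dequeue in bucket order
            }
        }
    }

    // Sorts s in place, using one pass per digit of the maximum value.
    static void sort(int[] s) {
        int max = 0;
        for (int v : s) max = Math.max(max, v);
        // divisor is a long so that multiplying by 10 cannot overflow
        for (long divisor = 1; max / divisor > 0; divisor *= 10) {
            pass(s, (int) divisor);
        }
    }

    public static void main(String[] args) {
        int[] s = {12, 58, 37, 64, 52, 36, 99, 63, 18, 9, 20, 88, 47};
        sort(s);
        System.out.println(java.util.Arrays.toString(s));
    }
}
```

With queue buckets, 99 is enqueued before 9 in the first pass because it occurs earlier in the input; the second pass then moves 9 into bucket 0 and 99 into bucket 9.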

Radix sort works as long as the bucket sort stages are stable sorts.

Stable sort: in case of ties, the relative order of elements is preserved in the resulting array.

Suppose there are two elements whose first digit is the same; for example, 52 & 58. If 52 occurs before 58 in the array prior to the sorting stage, 52 should occur before 58 in the resulting array.

This way, the work carried out in the previous bucket sort stages is preserved.

Time complexity

If there is a fixed number p of bucket sort stages (six stages in the case where the maximum value is 999999), then radix sort is O( n ), since each stage takes O( n ) time.

Strictly speaking, time complexity is O( pn ), where p is the number of digits (note that p is about log10 m, where m is the maximum value in the list; exactly, p = floor( log10 m ) + 1 for m ≥ 1).

Note that only 10 buckets are needed regardless of the number of stages, since the buckets are reused at each stage.

Radix sort can apply to words:

Set a limit to the number of letters in a word.

Use 27 buckets (or more, depending on the letters/characters allowed), one for each letter plus a blank character.

The word-length limit is exactly the number of bucket sort stages needed.
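A sketch of the word variant follows, assuming words consist only of the lowercase letters a to z and are no longer than a caller-supplied limit. Positions past the end of a shorter word are treated as the blank character, which gets bucket 0 so that it sorts before every letter. The names `WordRadixSort` and `bucketOf` are hypothetical.

```java
// Sketch of radix sort on words with 27 buckets (blank plus 'a'..'z').
// Names are illustrative; assumes lowercase ASCII letters only.
class WordRadixSort {
    // Bucket for the character at position pos: 0 for blank (word too short),
    // 1..26 for 'a'..'z'.
    static int bucketOf(String word, int pos) {
        return pos < word.length() ? word.charAt(pos) - 'a' + 1 : 0;
    }

    // Sorts words in place; assumes no word is longer than maxLen.
    static void sort(String[] words, int maxLen) {
        java.util.List<java.util.ArrayDeque<String>> b = new java.util.ArrayList<>();
        for (int j = 0; j < 27; j++) {
            b.add(new java.util.ArrayDeque<>());        // buckets reused by every stage
        }
        for (int pos = maxLen - 1; pos >= 0; pos--) {   // rightmost letter first
            for (String w : words) {
                b.get(bucketOf(w, pos)).addLast(w);
            }
            int i = 0;
            for (java.util.ArrayDeque<String> q : b) {
                while (!q.isEmpty()) {
                    words[i++] = q.removeFirst();
                }
            }
        }
    }

    public static void main(String[] args) {
        String[] words = {"cat", "car", "ca", "bat", "a"};
        sort(words, 3);
        System.out.println(java.util.Arrays.toString(words));
    }
}
```

The number of stages equals `maxLen`, the word-length limit, and the same 27 queues serve every stage because each one is emptied before the next stage begins.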

Summary

Bucket sort and Radix sort are O( n ) algorithms only because we have imposed restrictions on the input list to be sorted.

Sorting, in general, can be done in O( n log n ) time.