Introduction to Algorithms

6.046J/18.401J/SMA5503
Lecture 6
Prof. Erik Demaine
Introduction to Algorithms Day 9 L6.2 © 2001 by Charles E. Leiserson
Order statistics
Select the ith smallest of n elements (the
element with rank i).
• i = 1: minimum;
• i = n: maximum;
• i = (n+1)/2 or (n+1)/2: median.
Naive algorithm: Sort and index ith element.
Worst-case running time = Θ(n lg n) + Θ(1)
= Θ(n lg n),
using merge sort or heapsort (not quicksort).
Introduction to Algorithms Day 9 L6.3 © 2001 by Charles E. Leiserson
Randomized divide-and-
conquer algorithm
RAND-SELECT(A, p, q, i) ⊳ ith smallest of A[ p . . q]
if p = q then return A[ p]
r ←RAND-PARTITION(A, p, q)
k ←r – p + 1 ⊳ k = rank(A[r])
if i = k then return A[ r]
if i < k
then return RAND-SELECT(A, p, r – 1, i )
else return RAND-SELECT(A, r + 1, q, i – k)
≤ A[r]
≤ A[r]
≥ A[r]
≥ A[r]
r p q
k
Introduction to Algorithms Day 9 L6.4 © 2001 by Charles E. Leiserson
Example
pivot
i = 7
6
6
10
10
13
13
5
5
8
8
3
3
2
2
11
11
k = 4
Select the 7 – 4 = 3rd smallest recursively.
Select the i = 7th smallest:
2
2
5
5
3
3
6
6
8
8
13
13
10
10
11
11
Partition:
Introduction to Algorithms Day 9 L6.5 © 2001 by Charles E. Leiserson
Intuition for analysis
Lucky:
1
0 1 log
9 / 10
= = n n
CASE 3
T(n) = T(9n/10) + Θ(n)
= Θ(n)
Unlucky:
T(n) = T(n – 1) + Θ(n)
= Θ(n
2
)
arithmetic series
Worse than sorting!
(All our analyses today assume that all elements
are distinct.)
Introduction to Algorithms Day 9 L6.6 © 2001 by Charles E. Leiserson
Analysis of expected time
Let T(n) = the random variable for the running
time of RAND-SELECT on an input of size n,
assuming random numbers are independent.
For k = 0, 1, …, n–1, define the indicator
random variable
X
k
=
1 if PARTITION generates a k : n–k–1 split,
0 otherwise.
The analysis follows that of randomized
quicksort, but it’s a little different.
Introduction to Algorithms Day 9 L6.7 © 2001 by Charles E. Leiserson
Analysis (continued)
T(n) =
T(max{0, n–1}) + Θ(n) if 0 : n–1 split,
T(max{1, n–2}) + Θ(n) if 1 : n–2 split,
M
T(max{n–1, 0}) + Θ(n) if n–1 : 0 split,
( )


=
Θ + − − =
1
0
) ( }) 1 , (max{
n
k
k
n k n k T X .
To obtain an upper bound, assume that the ith
element always falls in the larger side of the
partition:
Introduction to Algorithms Day 9 L6.8 © 2001 by Charles E. Leiserson
Calculating expectation
( )

Θ + − − =


=
1
0
) ( }) 1 , (max{ )] ( [
n
k
k
n k n k T X E n T E
Take expectations of both sides.
Introduction to Algorithms Day 9 L6.9 © 2001 by Charles E. Leiserson
Calculating expectation
( )
( ) | |



=

=
Θ + − − =

Θ + − − =
1
0
1
0
) ( }) 1 , (max{
) ( }) 1 , (max{ )] ( [
n
k
k
n
k
k
n k n k T X E
n k n k T X E n T E
Linearity of expectation.
Introduction to Algorithms Day 9 L6.10 © 2001 by Charles E. Leiserson
Calculating expectation
( )
( ) | |
| | | |




=

=

=
Θ + − − ⋅ =
Θ + − − =

Θ + − − =
1
0
1
0
1
0
) ( }) 1 , (max{
) ( }) 1 , (max{
) ( }) 1 , (max{ )] ( [
n
k
k
n
k
k
n
k
k
n k n k T E X E
n k n k T X E
n k n k T X E n T E
Independence of X
k
from other random
choices.
Introduction to Algorithms Day 9 L6.11 © 2001 by Charles E. Leiserson
Calculating expectation
( )
( ) | |
| | | |
| |
∑ ∑




=

=

=

=

=
Θ + − − =
Θ + − − ⋅ =
Θ + − − =

Θ + − − =
1
0
1
0
1
0
1
0
1
0
) (
1
}) 1 , (max{
1
) ( }) 1 , (max{
) ( }) 1 , (max{
) ( }) 1 , (max{ )] ( [
n
k
n
k
n
k
k
n
k
k
n
k
k
n
n
k n k T E
n
n k n k T E X E
n k n k T X E
n k n k T X E n T E
Linearity of expectation; E[X
k
] = 1/n.
Introduction to Algorithms Day 9 L6.12 © 2001 by Charles E. Leiserson
Calculating expectation
( )
( ) | |
| | | |
| |
| |

) ( ) (
2
) (
1
}) 1 , (max{
1
) ( }) 1 , (max{
) ( }) 1 , (max{
) ( }) 1 , (max{ )] ( [
1
2 /
1
0
1
0
1
0
1
0
1
0
n k T E
n
n
n
k n k T E
n
n k n k T E X E
n k n k T X E
n k n k T X E n T E
n
n k
n
k
n
k
n
k
k
n
k
k
n
k
k
Θ + ≤
Θ + − − =
Θ + − − ⋅ =
Θ + − − =

Θ + − − =

∑ ∑




=

=

=

=

=

=
Upper terms
appear twice.
Introduction to Algorithms Day 9 L6.13 © 2001 by Charles E. Leiserson
Hairy recurrence
| |

) ( ) (
2
)] ( [
1
2 /
n k T E
n
n T E
n
n k
Θ + =


=
Prove: E[T(n)] ≤ cn for constant c > 0.
Use fact:

2
1
2 /
8
3
n k
n
n k


=

(exercise).
• The constant c can be chosen large enough
so that E[T(n)] ≤ cn for the base cases.
(But not quite as hairy as the quicksort one.)
Introduction to Algorithms Day 9 L6.14 © 2001 by Charles E. Leiserson
Substitution method
| |

) (
2
) (
1
2 /
n ck
n
n T E
n
n k
Θ + ≤


=
Substitute inductive hypothesis.
Introduction to Algorithms Day 9 L6.15 © 2001 by Charles E. Leiserson
Substitution method
| |

) (
8
3 2
) (
2
) (
2
1
2 /
n n
n
c
n ck
n
n T E
n
n k
Θ +
|
.
|

\
|

Θ + ≤


=
Use fact.
Introduction to Algorithms Day 9 L6.16 © 2001 by Charles E. Leiserson
Substitution method
Express as desired – residual.
| |

|
.
|

\
|
Θ − − =
Θ +
|
.
|

\
|

Θ + ≤


=
) (
4
) (
8
3 2
) (
2
) (
2
1
2 /
n
cn
cn
n n
n
c
n ck
n
n T E
n
n k
Introduction to Algorithms Day 9 L6.17 © 2001 by Charles E. Leiserson
Substitution method
| |

cn
n
cn
cn
n n
n
c
n ck
n
n T E
n
n k

|
.
|

\
|
Θ − − =
Θ +
|
.
|

\
|

Θ + ≤


=
) (
4
) (
8
3 2
) (
2
) (
2
1
2 /
if c is chosen large enough so
that cn/4 dominates the Θ(n).
,
Introduction to Algorithms Day 9 L6.18 © 2001 by Charles E. Leiserson
Summary of randomized
order-statistic selection
• Works fast: linear expected time.
• Excellent algorithm in practice.
• But, the worst case is very bad: Θ(n
2
).
Q. Is there an algorithm that runs in linear
time in the worst case?
IDEA: Generate a good pivot recursively.
A. Yes, due to Blum, Floyd, Pratt, Rivest,
and Tarjan [1973].
Introduction to Algorithms Day 9 L6.19 © 2001 by Charles E. Leiserson
Worst-case linear-time order
statistics
if i = k then return x
elseif i < k
then recursively SELECT the ith
smallest element in the lower part
else recursively SELECT the (i–k)th
smallest element in the upper part
SELECT(i, n)
1. Divide the n elements into groups of 5. Find
the median of each 5-element group by rote.
2. Recursively SELECT the median x of the n/5
group medians to be the pivot.
3. Partition around the pivot x. Let k = rank(x).
4.
Same as
RAND-
SELECT
Introduction to Algorithms Day 9 L6.20 © 2001 by Charles E. Leiserson
Choosing the pivot
Introduction to Algorithms Day 9 L6.21 © 2001 by Charles E. Leiserson
Choosing the pivot
1. Divide the n elements into groups of 5.
Introduction to Algorithms Day 9 L6.22 © 2001 by Charles E. Leiserson
Choosing the pivot
lesser
greater
1. Divide the n elements into groups of 5. Find
the median of each 5-element group by rote.
Introduction to Algorithms Day 9 L6.23 © 2001 by Charles E. Leiserson
Choosing the pivot
lesser
greater
1. Divide the n elements into groups of 5. Find
the median of each 5-element group by rote.
2. Recursively SELECT the median x of the n/5
group medians to be the pivot.
x
Introduction to Algorithms Day 9 L6.24 © 2001 by Charles E. Leiserson
Analysis
lesser
greater
x
At least half the group medians are ≤ x, which
is at least n/5 /2 = n/10 group medians.
Introduction to Algorithms Day 9 L6.25 © 2001 by Charles E. Leiserson
Analysis
lesser
greater
x
At least half the group medians are ≤ x, which
is at least n/5 /2 = n/10 group medians.
• Therefore, at least 3n/10 elements are ≤ x.
(Assume all elements are distinct.)
Introduction to Algorithms Day 9 L6.26 © 2001 by Charles E. Leiserson
Analysis
lesser
greater
x
At least half the group medians are ≤ x, which
is at least n/5 /2 = n/10 group medians.
• Therefore, at least 3n/10 elements are ≤ x.
• Similarly, at least 3n/10 elements are ≥ x.
(Assume all elements are distinct.)
Introduction to Algorithms Day 9 L6.27 © 2001 by Charles E. Leiserson
Minor simplification
• For n ≥ 50, we have 3n/10 ≥ n/4.
• Therefore, for n ≥ 50 the recursive call to
SELECT in Step 4 is executed recursively
on ≤ 3n/4 elements.
• Thus, the recurrence for running time
can assume that Step 4 takes time
T(3n/4) in the worst case.
• For n < 50, we know that the worst-case
time is T(n) = Θ(1).
Introduction to Algorithms Day 9 L6.28 © 2001 by Charles E. Leiserson
Developing the recurrence
if i = k then return x
elseif i < k
then recursively SELECT the ith
smallest element in the lower part
else recursively SELECT the (i–k)th
smallest element in the upper part
SELECT(i, n)
1. Divide the n elements into groups of 5. Find
the median of each 5-element group by rote.
2. Recursively SELECT the median x of the n/5
group medians to be the pivot.
3. Partition around the pivot x. Let k = rank(x).
4.
T(n)
Θ(n)
T(n/5)
Θ(n)
T(3n/4)
Introduction to Algorithms Day 9 L6.29 © 2001 by Charles E. Leiserson
Solving the recurrence
) (
4
3
5
1
) ( n n T n T n T Θ +
|
.
|

\
|
+
|
.
|

\
|
=
if c is chosen large enough to handle both the
Θ(n) and the initial conditions.
cn
n cn cn
n cn
n cn cn n T

|
.
|

\
|
Θ − − =
Θ + =
Θ + + ≤
) (
20
1
) (
20
19
) (
4
3
5
1
) (
,
Substitution:
T(n) ≤ cn
Introduction to Algorithms Day 9 L6.30 © 2001 by Charles E. Leiserson
Conclusions
• Since the work at each level of recursion
is a constant fraction (19/20) smaller, the
work per level is a geometric series
dominated by the linear work at the root.
• In practice, this algorithm runs slowly,
because the constant in front of n is large.
• The randomized algorithm is far more
practical.
Exercise: Why not divide into groups of 3?

Master your semester with Scribd & The New York Times

Special offer for students: Only $4.99/month.

Master your semester with Scribd & The New York Times

Cancel anytime.