
LOWER BOUND FOR SORTING

I A comparison sort uses only comparisons between elements to
gain order information about an input sequence ⟨a1, a2, . . . , an⟩.
I That is, given two elements ai and aj , it performs one of the tests
ai < aj , ai ≤ aj , ai = aj , ai ≥ aj , or ai > aj to determine their relative
order. It may not inspect the values of the elements or gain order
information about them in any other way.
I We assume without loss of generality that all the input elements are
distinct: a lower bound for distinct elements also applies when elements
may or may not be distinct. Consequently, we disregard equality
comparisons ai = aj.
I The comparisons ai ≤ aj , ai ≥ aj , ai > aj , and ai < aj are all
equivalent in that they yield identical information about the relative
order of ai and aj . So we assume that all comparisons have the form
ai ≤ aj .

The decision-tree model


I We view comparison sorts abstractly in terms of decision trees.
I A decision tree is a full binary tree (each node is either a leaf or has
both children) that represents the comparisons between elements
that are performed by a particular sorting algorithm operating on an
input of a given size.
I Control, data movement, and all other aspects of the algorithm are
ignored. The figure shows the decision tree corresponding to the
insertion sort algorithm operating on an input sequence of three
elements.

I A decision tree has each internal node annotated by i : j for some i
and j in the range 1 ≤ i, j ≤ n, where n is the number of elements in the
input sequence. We also annotate each leaf by a permutation
⟨π(1), π(2), . . . , π(n)⟩.
I Indices in the internal nodes and the leaves always refer to the
original positions of the array elements at the start of the sorting
algorithm.
I The execution of the comparison sorting algorithm corresponds to
tracing a simple path from the root of the decision tree down to a leaf.
I Each internal node indicates a comparison ai ≤ aj . The left subtree
then dictates subsequent comparisons once we know that ai ≤ aj , and
the right subtree dictates subsequent comparisons when ai > aj .
I Arriving at a leaf, the sorting algorithm has established the ordering
aπ(1) ≤ aπ(2) ≤ · · · ≤ aπ(n) .
I Because any correct sorting algorithm must be able to produce
each permutation of its input, each of the n ! permutations on n
elements must appear as at least one of the leaves of the decision tree
for a comparison sort to be correct.
I Furthermore, each of these leaves must be reachable from the root
by a downward path corresponding to an actual execution of the
comparison sort. Thus, we consider only decision trees in which each
permutation appears as a reachable leaf.

A lower bound for the worst case

I The length of the longest (simple) path from the root of a decision
tree to any of its reachable leaves represents the worst-case number of
comparisons that the corresponding sorting algorithm performs.
I Consequently, the worst-case number of comparisons for a given
comparison sort algorithm equals the height of its decision tree.

A lower bound on the heights of all decision trees in which each
permutation appears as a reachable leaf is therefore a lower bound on
the running time of any comparison sort algorithm.

Theorem. Any comparison sort algorithm requires Ω(n lg n)
comparisons in the worst case.

Proof: From the discussion, it suffices to bound from below the height
of any decision tree in which each permutation appears as a reachable
leaf.
Consider a decision tree of height h with ℓ reachable leaves
corresponding to a comparison sort on n elements.
Because each of the n! permutations of the input appears as one or
more leaves, we have

n! ≤ ℓ.

Since a binary tree of height h has no more than 2^h leaves, we have

n! ≤ ℓ ≤ 2^h,

which, by taking logarithms, implies

h ≥ lg(n!).

Since lg(n!) = Θ(n lg n) (for example, by Stirling's approximation),

h = Ω(n lg n).
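
For example, for n = 3 there are 3! = 6 permutations, so any decision
tree for 3 elements has height at least ⌈lg 6⌉ = 3: every comparison
sort needs at least 3 comparisons in the worst case on 3 elements, and
insertion sort on 3 elements achieves exactly this bound.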

Corollary. Heapsort and merge sort are asymptotically optimal
comparison sorts.

Proof: The O(n lg n) upper bounds on the running times for heapsort
and merge sort match the Ω(n lg n) worst-case lower bound.

SORTING IN LINEAR TIME

COUNTING SORT

Idea:
1. Count the number of occurrences of each key in the input array A
2. Determine the accumulated sums of the counts.
If C[i] is the count of occurrences of key i, then the corresponding
accumulated sum (from left to right) is

S[i] = C[0] + C[1] + · · · + C[i].

Claim. S[i] is the position in the output array in which the last
occurrence of key i must be placed.
(In the pseudocode below, these accumulated sums are stored back into
C itself, and B denotes the output array.)
3. Use a right to left scan of the input to place each input entry in its
correct position in the output

Example:
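
Consider (as a small illustration) the input A = ⟨2, 5, 3, 0, 2, 3, 0, 3⟩
with k = 5. The counts are C = ⟨2, 0, 2, 3, 0, 1⟩ and the accumulated
sums are S = ⟨2, 2, 4, 7, 7, 8⟩. Since S[3] = 7, the last occurrence of
key 3 is placed at position 7 of the output, and the right-to-left scan
produces B = ⟨0, 0, 2, 2, 3, 3, 3, 5⟩.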

Pseudocode:

COUNTING-SORT(A, n, k)
 1  let B[1 : n] and C[0 : k] be new arrays
 2  for i ← 0 to k                    // set the counters to 0
 3      C[i] ← 0
 4  for j ← 1 to n                    // count the keys
 5      C[A[j]] ← C[A[j]] + 1
 6  for i ← 1 to k                    // compute the accumulated sums
 7      C[i] ← C[i] + C[i − 1]
 8  for j ← n downto 1                // scan the input from right to left
 9      B[C[A[j]]] ← A[j]             // place A[j] in its output position
10      C[A[j]] ← C[A[j]] − 1         // decrease the accumulated counter
11  return B

Lemma
The running time of Counting Sort is Θ(n + k).
Counting Sort is stable.

I Counting sort requires Θ(n + k) additional storage (for B and C).
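
For concreteness, here is a Python transcription of the pseudocode above
(a sketch using 0-indexed lists, so the counter is decremented before
each placement; it is not the CLRS code verbatim):

def counting_sort(A, k):
    """Stably sort a list A of integer keys in the range 0..k."""
    n = len(A)
    C = [0] * (k + 1)                 # C[i] = number of occurrences of key i
    for key in A:                     # count the keys
        C[key] += 1
    for i in range(1, k + 1):         # accumulated sums: C[i] = # keys <= i
        C[i] += C[i - 1]
    B = [0] * n                       # output array
    for j in range(n - 1, -1, -1):    # right-to-left scan keeps the sort stable
        C[A[j]] -= 1                  # 0-indexed output position of this occurrence
        B[C[A[j]]] = A[j]
    return B

# counting_sort([2, 5, 3, 0, 2, 3, 0, 3], 5) returns [0, 0, 2, 2, 3, 3, 3, 5].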

RADIX SORT

Idea:
I Input keys consist of digits (or columns, or fields)
I The input is sorted according to each of the digits, from least to
most significant.

Example:

Pseudocode:

RADIX-SORT(A, n, d)
1 for i ← 1 to d
2 Use a stable sort to sort array A[1 : n] on digit i

Lemma
Given n d-digit numbers in which each digit can take on up to k
possible values, RADIX-SORT correctly sorts these numbers in
Θ(d(n + k)) time if the stable sort it uses takes Θ(n + k) time.
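
A possible Python sketch of RADIX-SORT, using a stable counting sort on
each digit (base-10 digits here; the helper name stable_sort_by_digit is
ours):

def radix_sort(A, d, base=10):
    """Sort non-negative integers with at most d base-`base` digits each."""
    for i in range(d):                      # least significant digit first
        A = stable_sort_by_digit(A, i, base)
    return A

def stable_sort_by_digit(A, i, base):
    """Stable counting sort of A keyed on digit i (0 = least significant)."""
    digit = lambda x: (x // base ** i) % base
    C = [0] * base                          # counts per digit value
    for x in A:
        C[digit(x)] += 1
    for r in range(1, base):                # accumulated sums
        C[r] += C[r - 1]
    B = [0] * len(A)
    for x in reversed(A):                   # right-to-left scan keeps it stable
        C[digit(x)] -= 1
        B[C[digit(x)]] = x
    return B

# radix_sort([329, 457, 657, 839, 436, 720, 355], d=3) returns
# [329, 355, 436, 457, 657, 720, 839].

Each pass takes Θ(n + base) time and there are d passes, matching the
Θ(d(n + k)) bound above with k = base.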

Lemma
Given n b-bit numbers and any positive integer r ≤ b, RADIX-SORT
correctly sorts these numbers in Θ((b/r)(n + 2^r)) time if the stable
sort it uses takes Θ(n + k) time for inputs in the range 0 to k.

Proof:
I For a value r ≤ b, view each key as having d = ⌈b/r⌉ digits of r bits
each.
I Each digit is an integer in the range 0 to 2^r − 1, so that we can use
counting sort with k = 2^r − 1.
I (For example, we can view a 32-bit word as having four 8-bit digits,
so that b = 32, r = 8, k = 2^r − 1 = 255, and d = b/r = 4.)
I Each pass of counting sort takes Θ(n + k) = Θ(n + 2^r) time and
there are d passes, for a total running time of

Θ(d(n + 2^r)) = Θ((b/r)(n + 2^r)).

How to choose r?
I Given n and b, what value of r ≤ b minimizes (b/r)(n + 2^r)?
I As r increases, the factor b/r decreases, but 2^r increases.

For b < ⌊lg n⌋:
Then r ≤ b implies n + 2^r = Θ(n).
Thus, choosing r = b yields a running time of

(b/b)(n + 2^b) = Θ(n),

which is asymptotically optimal.

For b ≥ ⌊lg n⌋: Choosing r = ⌊lg n⌋ gives the best running time of

Θ(bn / lg n),

because if r > ⌊lg n⌋, then

(n + 2^r)/r = Ω(n / lg n),

and if r < ⌊lg n⌋, then b/r increases while n + 2^r remains Θ(n).
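
For instance, to sort n = 2^20 numbers of b = 32 bits each, we have
⌊lg n⌋ = 20 ≤ b, so the analysis suggests r = 20: then d = ⌈32/20⌉ = 2
passes of counting sort, each over digit values in the range 0 to
2^20 − 1 and hence taking Θ(n + 2^20) = Θ(n) time. By contrast, r = 8
would give 4 passes and r = 1 would give 32 passes.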

SELECTION: DETERMINISTIC ALGORITHM

I Recall that in the selection problem, given an input array A of
numbers and an index i, we want to determine the entry x in A with
rank i, that is, such that the set {j | A[j] ≤ x} has size i. (This definition
assumes the entries of A are all distinct; if equal entries are possible,
any tie-breaking order among equal entries yields the same value as the
answer, though possibly a different entry.)
I We have already seen a randomized algorithm for the selection
problem that runs in expected linear time.
I Now, we will see that the same time can be achieved with a
deterministic algorithm.
I The main outline is the same as the randomized algorithm, with a
new deterministic algorithm for choosing the pivot:
the input entries are split arbitrarily into groups of 5 (this parameter
could be changed), and the median of each group is determined (any
algorithm will do), then the median of these group medians is
computed recursively and taken as the pivot.

Outline:
SELECT(A, i) : where n = |A|
1. Divide the n elements of A into ⌊n/5⌋ groups of 5 elements each.
The remaining n mod 5 elements (if any) are placed in their own,
smaller group.
2. Find the median of each of the groups:
insertion-sort the elements in each group and then pick the median
of each sorted list.
3. Recursively SELECT the median x of the medians found in step 2 .
4. Partition the input array around the median-of-medians x.
Let k = rank(x ).
Let L and R be the elements smaller and greater than x respectively.
5. If i = k: return x.
If i < k: recursively SELECT the i-th element in L.
Else (i > k): recursively SELECT the (i − k)-th element in R.
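
A compact Python sketch of the outline above (it simply lets the last
group be smaller, rather than using the minimum-extraction handling
described below; sorted() stands in for insertion sort on the tiny
groups, and distinct elements are assumed):

def select(A, i):
    """Return the element of rank i (1-indexed) of a list A of distinct numbers."""
    n = len(A)
    if n <= 5:                                   # small inputs: sort directly
        return sorted(A)[i - 1]
    # Steps 1-2: medians of the groups of 5 (the last group may be smaller).
    medians = [sorted(A[j:j + 5])[(len(A[j:j + 5]) - 1) // 2]
               for j in range(0, n, 5)]
    # Step 3: median of medians, computed recursively.
    x = select(medians, (len(medians) + 1) // 2)
    # Step 4: partition around the pivot x.
    L = [a for a in A if a < x]
    R = [a for a in A if a > x]
    k = len(L) + 1                               # rank of x (elements are distinct)
    # Step 5: answer found, or recurse on one side only.
    if i == k:
        return x
    if i < k:
        return select(L, i)
    return select(R, i - k)

# select([9, 1, 7, 3, 5, 8, 2, 6, 4], 5) returns 5 (the median).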

I In forming the groups of 5 entries, in general, there can be an
incomplete group (with 1 to 4 elements). The more detailed
implementation in CLRS handles this by iteratively (up to 4 times)
finding the minimum and either returning it as the answer (if i = 1) or
discarding it and decreasing i by 1.
I In this way, we may assume that all groups have 5 elements and that
the number of groups g satisfies g ≤ n/5, where n is the number of
elements.
I The analysis is based on the following two claims.

Claim 1: The recursion in step 5 is on at most 7n/10 elements.
Proof. In the figure, nodes are entries, groups correspond to columns,
red nodes are the medians of the groups, and x is the median of the
medians. The yellow nodes are all ≥ x and the blue ones are ≤ x (x is
both blue and yellow). The yellow region includes 3 elements in at
least g/2 columns, so it has at least 3g/2 elements/nodes. The same is
true for the blue region. Therefore, either of the two recursions in
step 5 includes at most

5g − 3g/2 = 7g/2

elements. Since g ≤ n/5, either of the two recursions in step 5
includes at most

(7/2) · (n/5) = 7n/10

elements.
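
For instance, with n = 100 and g = 20 full groups, at least
3 · (g/2) = 30 elements are ≥ x, so the recursion on the elements
smaller than x involves at most 100 − 30 = 70 = 7n/10 elements
(and symmetrically for the elements larger than x).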
Claim 2: SELECT finds the i-th order statistic of A in O(n) worst-case
time.
Proof. We must evaluate the recurrence T(n) for SELECT.
Steps 1, 2, and 4 are non-recursive steps that take O(n) time.
Step 3 is a recursive call over ⌈n/5⌉ elements, which takes time
T(⌈n/5⌉).
Step 5 is a recursive call over at most 7n/10 elements, which takes time
T(7n/10).
Thus,

T(n) ≤ T(⌈n/5⌉) + T(7n/10) + Θ(n).

We use the substitution method to verify the O(n) running time.

I We will verify by induction that, for some sufficiently large constant c,

T(n) ≤ c · n     (⋆)

for n ≥ 1.
I For 1 ≤ n < 20 the algorithm requires O(1) time, so certainly this
time satisfies T(n) ≤ c0·n (for 1 ≤ n < 20) for an appropriate constant c0.
I We take 1 ≤ n < 20 as the induction basis and show (⋆) for n ≥ 20
(using the induction hypothesis).
I Write the Θ(n) term of the recurrence as at most an for a constant a.
Substituting the inductive hypothesis into the recurrence gives:

T(n) ≤ c⌈n/5⌉ + c(7n/10) + an
     ≤ cn/5 + c + 7cn/10 + an
     = 9cn/10 + c + an
     = cn + (−cn/10 + c + an).

This is ≤ cn if

−cn/10 + c + an ≤ 0,

which holds as long as

c ≥ 10a · n/(n − 10).

Because n ≥ 20,

n/(n − 10) ≤ 2,

so choosing c ≥ 20a will satisfy this inequality.

Thus, if we choose c = max(c0, 20a), then both the basis and the
induction step hold.

Lemma
SELECT runs in O(n) time.

Pseudocode in CLRS 4th ed:

I Lines 1-10 in that pseudocode handle the possible extra 1 to 4
elements.
I The remaining 5g elements are viewed as 5 contiguous parts of size g:
the j-th group consists of the j-th element of each of these 5 parts.
I With each group sorted in place (to get its median), the set of group
medians ends up in the middle part.

DIVIDE-AND-CONQUER (MORE EXAMPLES)

MATRIX MULTIPLICATION

I The product of two n × n matrices X and Y is a third n × n matrix
Z = XY, with (i, j)-th entry

Zij = ∑_{k=1}^{n} Xik Ykj.

I The preceding formula implies an O(n^3) algorithm for matrix
multiplication: there are n^2 entries to be computed, and each takes
O(n) time.
I For quite a while, this was widely believed to be the best possible
running time.
I In 1969, the mathematician Volker Strassen announced a
significantly more efficient algorithm, based upon divide-and-conquer.
I Matrix multiplication is particularly easy to break into subproblems,
because it can be performed blockwise. For this, divide X into four
n/2 × n/2 blocks, and do the same for Y:

    X = | A  B |        Y = | E  F |
        | C  D |            | G  H |

Then their product can be expressed in terms of these blocks, exactly
as if the blocks were single elements:

    XY = | A  B | | E  F |  =  | AE + BG   AF + BH |
         | C  D | | G  H |     | CE + DG   CF + DH |

I We have a divide-and-conquer strategy to compute the size-n
product XY:
1. recursively compute eight size-n/2 products

AE, BG, AF, BH, CE, DG, CF, DH

2. then do a few O(n^2)-time additions.

The total running time is described by the recurrence relation

T(n) = 8T(n/2) + O(n^2).

The solution is T(n) = O(n^3).


I It turns out XY can be computed from just seven n/2 × n/2
subproblems, via a very ingenious decomposition:

    XY = | P5 + P4 − P2 + P6     P1 + P2           |
         | P3 + P4               P1 + P5 − P3 − P7 |

where

P1 = A(F − H)        P5 = (A + D)(E + H)
P2 = (A + B)H        P6 = (B − D)(G + H)
P3 = (C + D)E        P7 = (A − C)(E + F)
P4 = D(G − E)

The new running time is

T(n) = 7T(n/2) + O(n^2),

whose solution works out to

T(n) = O(n^{log_2 7}) ≈ O(n^{2.81}).
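
A possible Python sketch of this recursion on plain nested lists
(assuming n is a power of 2; a practical implementation would switch to
the classical algorithm below some cutoff size, and the helper names
blocks, add, sub are ours):

def strassen(X, Y):
    """Multiply two n x n matrices (nested lists) via Strassen's seven products."""
    n = len(X)
    if n == 1:
        return [[X[0][0] * Y[0][0]]]
    m = n // 2
    A, B, C, D = blocks(X, m)          # split X into n/2 x n/2 blocks
    E, F, G, H = blocks(Y, m)          # split Y into n/2 x n/2 blocks
    P1 = strassen(A, sub(F, H))
    P2 = strassen(add(A, B), H)
    P3 = strassen(add(C, D), E)
    P4 = strassen(D, sub(G, E))
    P5 = strassen(add(A, D), add(E, H))
    P6 = strassen(sub(B, D), add(G, H))
    P7 = strassen(sub(A, C), add(E, F))
    top_left = add(sub(add(P5, P4), P2), P6)       # AE + BG
    top_right = add(P1, P2)                        # AF + BH
    bottom_left = add(P3, P4)                      # CE + DG
    bottom_right = sub(sub(add(P1, P5), P3), P7)   # CF + DH
    top = [r + s for r, s in zip(top_left, top_right)]
    bottom = [r + s for r, s in zip(bottom_left, bottom_right)]
    return top + bottom

def blocks(M, m):
    return ([row[:m] for row in M[:m]], [row[m:] for row in M[:m]],
            [row[:m] for row in M[m:]], [row[m:] for row in M[m:]])

def add(M, N):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(M, N)]

def sub(M, N):
    return [[x - y for x, y in zip(r, s)] for r, s in zip(M, N)]

# strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]]) returns [[19, 22], [43, 50]].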


 

CLOSEST PAIR OF POINTS

I We consider the problem of finding a closest pair of points in a given
set of points in the plane.
I More precisely, we are given a set P of n points in R^2 (each a pair
x = (x1, x2)), and we are interested in a pair of points p, q ∈ P such
that

d(p, q) = min{ d(p′, q′) : p′, q′ ∈ P, p′ ≠ q′ },

where d(p, q) = √((p1 − q1)^2 + (p2 − q2)^2) is the Euclidean distance
between p = (p1, p2) and q = (q1, q2).
I The 1-dimensional version of this problem is equivalent to finding
the smallest interval determined by n numbers, which is obviously
determined by two consecutive numbers.
I To solve this 1-dimensional problem, it suffices to sort the numbers,
which results in a running time of O(n log n).
I Can this running time be improved? It turns out that if an algorithm
uses only certain computation primitives (to which most algorithms are
restricted), then O(n log n) is the best possible running time.

Divide-and-Conquer Solution
I To obtain an O(n log n) running time, the divide step is not arbitrary:
the set of points is split into two halves by a line (vertical, for simplicity).
I The outline is

CLOSEST-PAIR-DC(P)
Partition P with a vertical halving line ℓ into
the points Q to its left and the points R to its right
(q, q′) ← CLOSEST-PAIR-DC(Q)
(r, r′) ← CLOSEST-PAIR-DC(R)
(s, s′) ← CP-MERGE(Q, R)
return the closest pair among (q, q′), (r, r′), (s, s′)

CP-MERGE in linear time

From the recursion, we know a closest pair with both points on the left
of ℓ and one with both points on the right of ℓ. It remains to check for
a closest pair with one point on the left and one on the right. In
principle, we would have to check every such pair, and that would be
very time consuming. However, it is observed that:
(i) Only points within a strip bounded by lines ℓ− and ℓ+, parallel
to ℓ and at distance δ to its left and right, need to be considered:
a point outside this strip is at distance at least δ from every
point on the opposite side of ℓ. Let Pstrip denote the subset of P in
this strip.
[Figure: the strip around the halving line ℓ, bounded by ℓ− and ℓ+ at
distance δ on either side, separating the left points PL from the right
points PR.]

(ii) Each point in this strip only needs to be checked against four other
points, which can be easily determined:
To see this, consider the lowest point q of a possible closest pair.
Suppose q is on the right.
Then a candidate p to be closest to q = (q1, q2) must lie in the
square determined by the lines ℓ−, ℓ, x2 = q2 and x2 = q2 + δ.
This square contains at most 4 points: if there were more than 4,
then there would be 2 in one of its 4 "subsquares" (of size δ/2 × δ/2),
and so the distance between them would be at most

√((δ/2)^2 + (δ/2)^2) = δ/√2 < δ,

a contradiction.
Pstrip is computed by simply scanning P and selecting the points within
distance δ from ℓ. If Pstrip is sorted in increasing order of x2-coordinate,
then (ii) implies that each point in Pstrip only needs to be compared
with the next 7 points in that list (at most 4 points on the opposite
side are relevant, but 3 others on the same side may appear between
them in the list).
I The final algorithm sorts the initial set P of points according to the
x1 and x2 coordinates. These sorted sets are denoted by P1 and P2 .
These sets can be stored in arrays, but it seems more convenient to use
lists. The not so detailed pseudocode below is not precise about how
P1 , P2 are stored.
I Below, CLOSEST-PAIR(P) simply sorts P according to x1 and x2 to
obtain P1 and P2 and then calls CLOSEST-PAIR-REC(P1 , P2 ).
Pseudocode:

CLOSEST-PAIR(P)
1 Construct P1 and P2 in O(n log n) time
2 return CLOSEST-PAIR-REC(P1 , P2 )

CLOSEST-PAIR-REC(P1, P2)
    if |P| ≤ 3
 1      find the closest pair by measuring all pairwise distances and return it
 2  Determine the halving line ℓ (using the sorting by x1) and the sets Q, R
 3  Construct the sorted versions Q1, Q2, R1, R2 of Q, R
        from the sorted sets P1, P2
 4  (q, q′) ← CLOSEST-PAIR-REC(Q1, Q2)
 5  (r, r′) ← CLOSEST-PAIR-REC(R1, R2)
 6  δ ← min(d(q, q′), d(r, r′))
    // Merge
 7  construct Pstrip = points in P within distance δ of ℓ, sorted by x2,
        by scanning P2
 8  for each point x ∈ Pstrip
 9      compute the distance from x to each of the next 7 points in Pstrip
10  let (s, s′) be the pair achieving the minimum over all of these pairs
11  return the closest pair among (q, q′), (r, r′), (s, s′)
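
A possible Python sketch of this pseudocode (assuming at least two
distinct points given as (x1, x2) tuples; for simplicity, the x2-sorted
lists of each half are obtained by filtering P2):

from math import dist

def closest_pair(P):
    """Return a closest pair from a list P of at least 2 distinct (x1, x2) tuples."""
    P1 = sorted(P)                              # sorted by x1 (ties by x2)
    P2 = sorted(P, key=lambda p: (p[1], p[0]))  # sorted by x2 (ties by x1)
    return closest_pair_rec(P1, P2)

def closest_pair_rec(P1, P2):
    n = len(P1)
    if n <= 3:                                  # base case: all pairwise distances
        return min(((p, q) for i, p in enumerate(P1) for q in P1[i + 1:]),
                   key=lambda pq: dist(*pq))
    mid = n // 2
    x_mid = P1[mid - 1][0]                      # halving line: x1 = x_mid
    Q1, R1 = P1[:mid], P1[mid:]
    in_Q = set(Q1)
    Q2 = [p for p in P2 if p in in_Q]           # x2-sorted versions of each half
    R2 = [p for p in P2 if p not in in_Q]
    best = min(closest_pair_rec(Q1, Q2), closest_pair_rec(R1, R2),
               key=lambda pq: dist(*pq))
    delta = dist(*best)
    # Merge: only the strip around the halving line matters, and each strip
    # point only needs to be compared with the next 7 points in x2-order.
    strip = [p for p in P2 if abs(p[0] - x_mid) < delta]
    for i, p in enumerate(strip):
        for q in strip[i + 1:i + 8]:
            if dist(p, q) < dist(*best):
                best = (p, q)
    return best

# closest_pair([(0, 0), (5, 4), (1, 1), (9, 9), (2, 3)]) returns the pair
# (0, 0), (1, 1) (in some order).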

MASTER THEOREM

We discuss the solution of a (somewhat) general recurrence equation
for divide-and-conquer: the so-called master theorem.
Let T(n) be a function on the nonnegative integers determined by the
following recurrence, where a > 0 and b > 1 are constants (often
integers, but b could also be a fraction in applications) and n0 > 0 is
an integer (constant):

T(n) = a T(n/b) + f(n)    if n ≥ n0,
T(n) = C                  if n < n0.
The function f(n) is called the driving function, and the function

n^{log_b a},

which appears in the solution, is called the watershed function.

Theorem (Master theorem)


Let a > 0 and b > 1 be constants, and let f (n) be a driving function
that is defined and nonnegative on all sufficiently large reals. Define
the recurrence T(n) on n ∈ N by

T(n) = a T(n/b) + f(n),

where a T(n/b) actually means

a′ T(⌊n/b⌋) + a″ T(⌈n/b⌉)

for some constants a′ ≥ 0 and a″ ≥ 0 satisfying a = a′ + a″.


Then the asymptotic behavior of T(n) can be characterized as
follows:

1. If there exists a constant ε > 0 such that f(n) = O(n^{log_b a − ε}), then

T(n) = Θ(n^{log_b a}).

2. If there exists a constant k ≥ 0 such that f(n) = Θ(n^{log_b a} lg^k n),
then

T(n) = Θ(n^{log_b a} lg^{k+1} n).

3. If there exists a constant ε > 0 such that f(n) = Ω(n^{log_b a + ε}), and if
f(n) additionally satisfies the regularity condition

a f(n/b) ≤ c f(n)

for some constant c < 1 and all sufficiently large n, then

T(n) = Θ(f(n)).
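
For example: merge sort satisfies T(n) = 2T(n/2) + Θ(n), so a = b = 2,
the watershed function is n^{log_2 2} = n, and case 2 (with k = 0) gives
T(n) = Θ(n lg n). Binary search satisfies T(n) = T(n/2) + Θ(1), the
watershed function is n^0 = 1, and case 2 again gives Θ(lg n).
Strassen's recurrence T(n) = 7T(n/2) + Θ(n^2) has watershed function
n^{log_2 7} ≈ n^{2.81}; since n^2 = O(n^{log_2 7 − ε}) for any
ε < log_2 7 − 2, case 1 gives T(n) = Θ(n^{log_2 7}).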

The proof is divided in two parts.
The first part analyzes the master recurrence under the simplifying
assumption that T(n) is defined only on exact powers of b > 1, that is,
for n = 1, b, b^2, . . ..
The second part shows how to extend the analysis to all positive
integers n.
Warning: We slightly abuse asymptotic notation by using it to
describe the behavior of functions that are defined only over exact
powers of b.

THE PROOF FOR EXACT POWERS


Lemma 1
Let a ≥ 1 and b > 1 be constants, and let f(n) be a nonnegative
function defined on exact powers of b. Define T(n) on exact powers of
b by the recurrence

T(n) = B                    if n = 1,
T(n) = a T(n/b) + f(n)      if n = b^i,

where i is a positive integer. Then

T(n) = B n^{log_b a} + ∑_{j=0}^{log_b n − 1} a^j f(n/b^j).

Proof. Iterating the recurrence equation, with n = b^k,

T(n) = a T(n/b) + f(n)
     = a ( a T(n/b^2) + f(n/b) ) + f(n)
     = a^2 T(n/b^2) + a f(n/b) + f(n)
     = a^2 ( a T(n/b^3) + f(n/b^2) ) + a f(n/b) + f(n)
     = a^3 T(n/b^3) + a^2 f(n/b^2) + a f(n/b) + f(n)
     ...
     = a^ℓ T(n/b^ℓ) + a^{ℓ−1} f(n/b^{ℓ−1}) + · · · + a^2 f(n/b^2) + a f(n/b) + f(n)
     ...
     = a^k T(n/b^k) + a^{k−1} f(n/b^{k−1}) + · · · + a^2 f(n/b^2) + a f(n/b) + f(n)
     = B n^{log_b a} + ∑_{j=0}^{log_b n − 1} a^j f(n/b^j),

since T(n/b^k) = T(1) = B, k = log_b n, and a^{log_b n} = n^{log_b a}.

This can be visualized with the recursion tree in the figure.


The root of the tree has cost f(n), and it has a children, each with cost
f(n/b). (It is convenient to think of a as being an integer, especially
when visualizing the recursion tree, but it is not required.)
Each of these children has a children, making a^2 nodes at depth 2, and
each node at depth 2 has cost f(n/b^2).
In general, there are a^j nodes at depth j, and each has cost f(n/b^j).
The cost of each leaf is T(1) = B, and each leaf is at depth log_b n, since
n/b^{log_b n} = 1.
There are a^{log_b n} = n^{log_b a} leaves in the tree.

We can obtain the lemma by summing the costs of the nodes at each
depth in the tree.
In the underlying divide-and-conquer algorithm, the sum represents
the costs of dividing problems into subproblems and then
recombining the subproblems.
The cost of all the leaves, which is the cost of doing all n^{log_b a}
subproblems of size 1, is Θ(n^{log_b a}).
The next lemma provides asymptotic bounds on the summation's
growth.

Lemma 2
Let a ≥ 1 and b > 1 be constants, and let f(n) be a nonnegative
function defined on exact powers of b. A function g(n) defined over
exact powers of b by

g(n) = ∑_{j=0}^{log_b n − 1} a^j f(n/b^j)

has the following asymptotic bounds for exact powers of b:

1. If f(n) = O(n^{log_b a − ε}) for some constant ε > 0, then

g(n) = O(n^{log_b a}).

2. If f(n) = Θ(n^{log_b a}), then

g(n) = Θ(n^{log_b a} lg n).

3. If f(n) = Ω(n^{log_b a + ε}) for some constant ε > 0, and if f(n)
satisfies the regularity condition a f(n/b) ≤ c f(n) for some constant
c < 1 and all sufficiently large n, then

g(n) = Θ(f(n)).

In terms of the recursion tree, the three cases of the master theorem
correspond to cases in which the total cost of the tree is:
(1) dominated by the costs in the leaves, because the per-level cost
increases geometrically from the root down to the leaves;
(2) evenly distributed, because the per-level cost is the same (up to
constant factors) over the levels of the tree; or
(3) dominated by the cost of the root, because the per-level cost
decreases geometrically from the root down to the leaves.
Proof.
Case 1: We have f(n) = O(n^{log_b a − ε}), which implies that
f(n/b^j) = O((n/b^j)^{log_b a − ε}). Substituting into the sum yields

g(n) = O( ∑_{j=0}^{log_b n − 1} a^j (n/b^j)^{log_b a − ε} ).

The summation is bounded as follows:

∑_{j=0}^{log_b n − 1} a^j (n/b^j)^{log_b a − ε}
   = n^{log_b a − ε} ∑_{j=0}^{log_b n − 1} ( a b^ε / b^{log_b a} )^j
   = n^{log_b a − ε} ∑_{j=0}^{log_b n − 1} (b^ε)^j
   = n^{log_b a − ε} ( (b^{ε log_b n} − 1) / (b^ε − 1) )
   = n^{log_b a − ε} ( (n^ε − 1) / (b^ε − 1) )
   ≤ (1/(b^ε − 1)) n^{log_b a}
   = O(n^{log_b a}).

Case 2: Since f(n) = Θ(n^{log_b a}), we have f(n/b^j) = Θ((n/b^j)^{log_b a}),
so

g(n) = ∑_{j=0}^{log_b n − 1} a^j f(n/b^j)
     = Θ( ∑_{j=0}^{log_b n − 1} a^j (n/b^j)^{log_b a} )
     = Θ( n^{log_b a} ∑_{j=0}^{log_b n − 1} ( a / b^{log_b a} )^j )
     = Θ( n^{log_b a} ∑_{j=0}^{log_b n − 1} 1 )
     = Θ( n^{log_b a} log_b n )
     = Θ( n^{log_b a} lg n ),

since log_b n = lg n / lg b = Θ(lg n).

Case 3: Since f(n) appears in the defining sum of g(n) (as its j = 0
term), we have that

g(n) = Ω(f(n))

for exact powers of b.


It remains to show g(n) = O(f(n)). Proceeding as in case 1 would only
use the lower bound f(n) = Ω(n^{log_b a + ε}); for instance, for
f(n) = n^{log_b a + ε} itself the summation is again geometric:

∑_{j=0}^{log_b n − 1} a^j (n/b^j)^{log_b a + ε}
   = n^{log_b a + ε} ∑_{j=0}^{log_b n − 1} ( a b^{−ε} / b^{log_b a} )^j
   = n^{log_b a + ε} ∑_{j=0}^{log_b n − 1} (b^{−ε})^j
   = n^{log_b a + ε} ( (1 − b^{−ε log_b n}) / (1 − b^{−ε}) )
   = n^{log_b a + ε} ( (1 − n^{−ε}) / (1 − b^{−ε}) )
   ≤ (1/(1 − b^{−ε})) n^{log_b a + ε}
   = O(n^{log_b a + ε}),

that is, the sum decreases geometrically and is dominated by its first
term. For a general f, the upper bound instead follows from the
regularity condition, as follows.

It is assumed that a f(n/b) ≤ c f(n) for some constant c < 1 and all
sufficiently large n.
From this condition, f(n/b) ≤ (c/a) f(n), and iterating j times gives
f(n/b^j) ≤ (c/a)^j f(n) or, equivalently,

a^j f(n/b^j) ≤ c^j f(n),

where we assume that the values we iterate on are sufficiently large.
Since the last, and smallest, such value is n/b^{j−1}, it is enough to
assume that n/b^{j−1} is sufficiently large.
Substituting into the sum, we obtain

g(n) = ∑_{j=0}^{log_b n − 1} a^j f(n/b^j)
     ≤ ∑_{j=0}^{log_b n − 1} c^j f(n) + O(1)
     ≤ f(n) ∑_{j=0}^{∞} c^j + O(1)
     = f(n) ( 1/(1 − c) ) + O(1)
     = O(f(n)),

since 0 < c < 1 is a constant.
Thus, we can conclude that g(n) = Θ(f(n)) for exact powers of b.
