

Dr. Yingwu Zhu

Reading: pp. 65-74, pp. 83-88, pp. 93-96

Divide-and-conquer: the most well-known algorithm design technique
1. Divide an instance of the problem into two or more smaller instances (the recursive case)
2. Solve the smaller instances independently (when to stop? the base case)
3. Obtain a solution to the original (larger) instance by combining these solutions

Divide-and-Conquer Technique
[Diagram: a problem of size n is divided into subproblem 1 of size n/2 and subproblem 2 of size n/2; the solutions to subproblems 1 and 2 are combined into a solution to the original problem]

A General Template
// S is a large problem with input size n
Algorithm divide_and_conquer(S)
    if (S is small enough to handle)
        solve it directly            // base case: conquer
    else
        split S into two (equally-sized) subproblems S1 and S2
        divide_and_conquer(S1)
        divide_and_conquer(S2)
        combine the solutions to S1 and S2
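As a sanity check of the template, here is a minimal Python sketch (the function name and the choice of problem are illustrative, not from the slides) that instantiates it to find the maximum of a list:

```python
def solve_divide_and_conquer(s):
    """Template instance: find the maximum of a non-empty list."""
    if len(s) == 1:                         # base case: small enough to handle
        return s[0]
    mid = len(s) // 2                       # split into two equally-sized subproblems
    left_max = solve_divide_and_conquer(s[:mid])
    right_max = solve_divide_and_conquer(s[mid:])
    return max(left_max, right_max)         # combine the two sub-solutions
```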

General Divide-and-Conquer
Recursive algorithms are a natural fit for divide-and-conquer
Distinguish it from dynamic programming (here the subproblems are solved independently)
Recall from Lecture 1: efficiency analysis for recursive algorithms
Key: set up a recurrence relation
Solve it: often by backward substitution

General Divide-and-Conquer Recurrence
T(n) = aT(n/b) + f(n)
where f(n) ∈ Θ(n^d), d ≥ 0; f(n) accounts for the time spent on dividing the problem into smaller ones and combining their solutions
Master Theorem:
If a < b^d, T(n) ∈ Θ(n^d)
If a = b^d, T(n) ∈ Θ(n^d log n)
If a > b^d, T(n) ∈ Θ(n^(log_b a))
Note: The same results hold with O instead of Θ.
Examples:
T(n) = 4T(n/2) + n    →  T(n) ∈ ?
T(n) = 4T(n/2) + n²   →  T(n) ∈ ?
T(n) = 4T(n/2) + n³   →  T(n) ∈ ?
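The three example recurrences can be classified mechanically; a small Python helper (a hypothetical name, not from the slides) that applies the Master Theorem by comparing a with b^d:

```python
import math

def master_theorem(a, b, d):
    """Classify T(n) = a*T(n/b) + Theta(n^d) per the Master Theorem."""
    if a < b ** d:
        return f"Theta(n^{d})"
    if a == b ** d:
        return f"Theta(n^{d} log n)"
    return f"Theta(n^{math.log(a, b):.3f})"   # exponent is log_b(a)
```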

Sorting: mergesort and quicksort
Maximum subarray problem
Multiplication of large integers
Closest-pair problem
Matrix multiplication: Strassen's algorithm

Question #1: What makes mergesort distinct from many other sorting algorithms? (It works as both an internal and an external sorting algorithm.)
Question #2: How do we design mergesort using divide-and-conquer?

Split array A[0..n-1] into two about-equal halves and make copies of each half in arrays B and C
Sort arrays B and C recursively
Q: when to stop? (base case: a one-element array is already sorted)
Merge the sorted arrays B and C into array A as follows:
Repeat the following until no elements remain in one of the arrays:
compare the first elements in the remaining unprocessed portions of the arrays
copy the smaller of the two into A, while incrementing the index indicating the unprocessed portion of that array
Once all elements in one of the arrays are processed, copy the remaining unprocessed elements from the other array into A.

Pseudocode of Mergesort

Pseudocode of Merge
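The pseudocode figures are not reproduced in these notes; a Python sketch that follows the steps above (names are mine, not the textbook's):

```python
def mergesort(a):
    """Sort list a by divide-and-conquer; returns a new sorted list."""
    if len(a) <= 1:                 # base case: a single element is sorted
        return a
    mid = len(a) // 2
    b = mergesort(a[:mid])          # sort a copy of the left half
    c = mergesort(a[mid:])          # sort a copy of the right half
    return merge(b, c)              # combine the two sorted halves

def merge(b, c):
    """Merge two sorted lists into one sorted list."""
    result, i, j = [], 0, 0
    while i < len(b) and j < len(c):    # until one list is exhausted
        if b[i] <= c[j]:                # copy the smaller front element
            result.append(b[i]); i += 1
        else:
            result.append(c[j]); j += 1
    result.extend(b[i:])                # copy any remaining elements
    result.extend(c[j:])
    return result
```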

Example: mergesort of 8 3 2 9 7 1 5 4

Divide:  8 3 2 9 | 7 1 5 4
         8 3 | 2 9 | 7 1 | 5 4
Sort:    3 8 | 2 9 | 1 7 | 4 5
Merge:   2 3 8 9 | 1 4 5 7
Merge:   1 2 3 4 5 7 8 9

Analysis of Mergesort
Time efficiency by recurrence relation:
T(n) = 2T(n/2) + f(n), where f(n) = n - 1 comparisons in the merge operation in the worst case
T(n) ∈ Θ(n log n)
The number of comparisons in the worst case is close to the theoretical minimum for comparison-based sorting:
⌈log₂ n!⌉ ≈ n log₂ n - 1.44n
(Section 11.2)
Space requirement: Θ(n) extra space for the copies at any one time (not in-place); a naive version that keeps all copies allocates Θ(n log n) in total
Can be implemented without recursion (bottom-up)
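The bottom-up (non-recursive) variant mentioned above can be sketched as follows; this is an illustrative Python version, not the textbook's code. It merges adjacent sorted runs of width 1, 2, 4, ... until one run covers the whole array:

```python
def mergesort_bottom_up(a):
    """Non-recursive mergesort: merge runs of width 1, 2, 4, ..."""
    a = list(a)
    width = 1
    while width < len(a):
        for lo in range(0, len(a), 2 * width):
            mid = min(lo + width, len(a))
            hi = min(lo + 2 * width, len(a))
            # merge the two adjacent sorted runs a[lo:mid] and a[mid:hi]
            left, right = a[lo:mid], a[mid:hi]
            i = j = 0
            for k in range(lo, hi):
                if j >= len(right) or (i < len(left) and left[i] <= right[j]):
                    a[k] = left[i]; i += 1
                else:
                    a[k] = right[j]; j += 1
        width *= 2
    return a
```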

Mergesort: A big picture
Problem: Assume you want to sort X terabytes (even petabytes) of data using a cluster of M machines (even thousands of them)
Similar to search engines in some aspects: data partitioning!

Exercises #1
a. Write pseudocode for a divide-and-conquer algorithm for finding the position of the largest element in an array of n numbers.
b. Set up and solve a recurrence relation for the number of key comparisons made by your algorithm.
c. How does this algorithm compare with the brute-force algorithm for this problem?
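One possible solution sketch for part (a), in Python (assuming a non-empty array). For part (b), the recurrence is C(n) = C(⌈n/2⌉) + C(⌊n/2⌋) + 1 with C(1) = 0, which solves to C(n) = n - 1 comparisons, the same as brute force:

```python
def max_position(a, lo, hi):
    """Return the index of the largest element in a[lo..hi] (inclusive);
    ties go to the leftmost occurrence."""
    if lo == hi:                          # base case: one element
        return lo
    mid = (lo + hi) // 2
    left = max_position(a, lo, mid)       # largest in the left half
    right = max_position(a, mid + 1, hi)  # largest in the right half
    return left if a[left] >= a[right] else right
```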

Select a pivot (partitioning element); here, the first element for simplicity!
Rearrange the list so that all the elements in the first s positions are smaller than or equal to the pivot and all the elements in the remaining n-s positions are larger than the pivot (see the next slide for an algorithm)
Exchange the pivot with the last element in the first (i.e., ≤) subarray; the pivot is now in its final position
Sort the two subarrays recursively

Basic operation: split/divide
How does it differ from the divide operation in mergesort? What is the major difference?
Each split places the pivot in its final position, and every element of the left sublist is ≤ every element of the right sublist
No explicit merge step is needed

Partitioning Algorithm
Scan from the left until A[i] > p; scan from the right until A[j] <= p; swap A[i] and A[j] and continue; stop when the two scans cross.
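The two-scan partition and the resulting quicksort can be sketched in Python (an illustrative implementation using the first element as the pivot; names are mine, not the textbook's):

```python
def hoare_partition(a, l, r):
    """Partition a[l..r] around pivot a[l]; return the pivot's final index."""
    p = a[l]
    i, j = l, r + 1
    while True:
        i += 1
        while i <= r and a[i] < p:    # left-to-right scan: stop on a[i] >= p
            i += 1
        j -= 1
        while a[j] > p:               # right-to-left scan: stop on a[j] <= p
            j -= 1
        if i >= j:                    # scans crossed: partitioning is done
            break
        a[i], a[j] = a[j], a[i]
    a[l], a[j] = a[j], a[l]           # put the pivot in its final position
    return j

def quicksort(a, l=0, r=None):
    """Sort a[l..r] in place."""
    if r is None:
        r = len(a) - 1
    if l < r:
        s = hoare_partition(a, l, r)
        quicksort(a, l, s - 1)        # left sublist: elements <= pivot
        quicksort(a, s + 1, r)        # right sublist: elements >= pivot
```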

Quicksort Example
8, 2, 13, 5, 14, 3, 7

Analysis of Quicksort
Best case: split in the middle, Θ(n log n)
Worst case: an already-sorted array! Θ(n²)
Average case: random arrays, Θ(n log n)
Assume the split can happen at each position with equal probability! See the textbook for details!

better pivot selection: median-of-three partitioning
switch to insertion sort on small sublists
elimination of recursion
Combined, these give a 20-25% improvement
Quicksort is considered the method of choice for internal sorting of large files (n ≥ 10000)

Randomized Quicksort
Many regard the randomized
quicksort as the sorting algorithm of
choice for large inputs
Random sampling:
Select a randomly-chosen element from
the subarray as the pivot

Expected running time: O(n log n)

Chapter 7.4, p180

Randomized Quicksort
int pos = l + rand() % (r - l + 1);  // pick a random index in [l, r]
swap(A[l], A[pos]);                  // move the random pivot to the front
partition(A[l..r]);                  // then call the default partition

Questions for Quicksort

Q1: How do we implement the median-of-three rule by reusing the previous implementation of the split?
Q2: How do we implement a non-recursive quicksort?


Sorting: mergesort and quicksort

Binary search (a degenerate case)

Binary tree traversals

Multiplication of large integers

Matrix multiplication: Strassen's algorithm

Closest-pair algorithm

Maximum subarray problem

Maximum Subarray Problem


Given an array of n numbers, find the (a) contiguous subarray whose sum has the largest value.
Application: an unrealistic stock market game, in which you decide when to buy and sell a stock, with full knowledge of the past and future. The restriction is that you can perform just one buy followed by one sell. The buy and sell both occur right after the close of the market.
The interpretation of the numbers: each number represents the stock value at closing on a particular day.
Maximum Subarray Problem

Another example: buying low and selling high, even with perfect knowledge, is not trivial.

A brute-force solution
An O(n²) solution: consider all C(n,2) pairs (i, j)
Not a pleasant prospect if we are rummaging through long time series (who told you it was easy to get rich?), even if you are allowed to post-date your stock options
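The O(n²) brute force can be written compactly by keeping a running sum for each starting index, so each pair (i, j) costs O(1); a Python sketch:

```python
def max_subarray_brute(a):
    """Best sum over all contiguous subarrays: O(n^2) pairs (i, j)."""
    best = a[0]
    for i in range(len(a)):
        s = 0
        for j in range(i, len(a)):
            s += a[j]                # s is now the sum of a[i..j]
            best = max(best, s)
    return best
```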

A Better Solution: Max Subarray

Transformation: Instead of the daily price, let us consider the daily change: A[i] is the difference between the closing value on day i and that on day i-1.
The problem becomes that of finding a contiguous subarray the sum of whose values is maximum.
At first look this seems even worse: roughly the same number of intervals (one fewer, to be precise), and the requirement to add the values in the subarray rather than just computing a difference.

Max Subarray
How do we divide?
We observe that a maximum contiguous subarray A[i..j] must be located as follows:
1. It lies entirely in the left half of the original array: [low..mid];
2. It lies entirely in the right half of the original array: [mid+1..high];
3. It straddles the midpoint of the original array: i ≤ mid < j.

Max Subarray: Divide & Conquer

The left and right subproblems are smaller versions of the original problem, so they are part of the standard divide-and-conquer recursion.
The middle subproblem is not, so we will need to count its cost as part of the combine (or divide) step.
The crucial observation (and it may not be entirely obvious) is that we can find the maximum crossing subarray in time linear in the length of the A[low..high] subarray.
How? A[i..j] must be made up of A[i..mid] and A[mid+1..j], so we find the largest-sum A[i..mid] and the largest-sum A[mid+1..j] and combine them.

The middle subproblem

Max Subarray: Algorithm
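The algorithm slides are not reproduced here; a Python sketch of the divide-and-conquer maximum subarray, including the linear-time crossing case described above (names are mine):

```python
def max_crossing(a, lo, mid, hi):
    """Best sum of a subarray that straddles the midpoint: one linear
    scan leftward from mid plus one rightward from mid+1."""
    left_best, s = a[mid], 0
    for i in range(mid, lo - 1, -1):
        s += a[i]
        left_best = max(left_best, s)
    right_best, s = a[mid + 1], 0
    for j in range(mid + 1, hi + 1):
        s += a[j]
        right_best = max(right_best, s)
    return left_best + right_best

def max_subarray(a, lo=0, hi=None):
    """Best sum over all contiguous subarrays of a[lo..hi]."""
    if hi is None:
        hi = len(a) - 1
    if lo == hi:                                # base case: one element
        return a[lo]
    mid = (lo + hi) // 2
    return max(max_subarray(a, lo, mid),        # entirely in the left half
               max_subarray(a, mid + 1, hi),    # entirely in the right half
               max_crossing(a, lo, mid, hi))    # straddles the midpoint
```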

We finally have:

T(n) = Θ(1)            if n = 1,
T(n) = 2T(n/2) + Θ(n)  if n > 1.

The recurrence has the same form as that for MERGESORT, and thus we should expect it to have the same solution, T(n) = Θ(n lg n).
This algorithm is clearly substantially faster than any of the brute-force methods. It required some cleverness, and the programming is a little more complicated, but the payoff is large.

Multiplication of Large Integers

Consider the problem of multiplying

two (large) n-digit integers represented
by arrays of their digits such as:
A = 12345678901357986429
B = 87654321284820912836

Multiplication of Large Integers

Consider the problem of multiplying two (large) n-digit integers represented by arrays of their digits, such as:
A = 12345678901357986429  B = 87654321284820912836
The grade-school algorithm:
          a1 a2 ... an
      ×   b1 b2 ... bn
      ----------------
      (d10) d11 d12 ... d1n
     (d20) d21 d22 ... d2n
      ...
   (dn0) dn1 dn2 ... dnn

Efficiency: n² one-digit multiplications

Multiplication of Large Integers

Discussion: How do we apply divide-and-conquer to this problem?

First Cut
A small example: A * B where A = 2135 and B = 4014
A = (21*10^2 + 35), B = (40*10^2 + 14)
So, A * B = (21*10^2 + 35) * (40*10^2 + 14)
= 21*40*10^4 + (21*14 + 35*40)*10^2 + 35*14
In general, if A = A1A2 and B = B1B2 (where A and B are n-digit numbers, and A1, A2, B1, B2 are n/2-digit numbers),
A * B = A1*B1*10^n + (A1*B2 + A2*B1)*10^(n/2) + A2*B2
Recurrence for the number of one-digit multiplications T(n):
T(n) = 4T(n/2), T(1) = 1
Solution: T(n) = n²

Second Cut
A * B = A1*B1*10^n + (A1*B2 + A2*B1)*10^(n/2) + A2*B2

The idea is to decrease the number of multiplications from 4 to 3:
(A1 + A2) * (B1 + B2) = A1*B1 + (A1*B2 + A2*B1) + A2*B2,
i.e., (A1*B2 + A2*B1) = (A1 + A2) * (B1 + B2) - A1*B1 - A2*B2,
which requires only 3 multiplications, at the expense of a few extra additions/subtractions.
Recurrence for the number of multiplications T(n):
T(n) = 3T(n/2), T(1) = 1
Solution: T(n) = 3^(log₂ n) = n^(log₂ 3) ≈ n^1.585
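This three-multiplication scheme is Karatsuba's algorithm; a Python sketch (splitting each operand at half the larger decimal length, which also handles odd n; names are mine):

```python
def karatsuba(x, y):
    """Multiply two non-negative integers using 3 recursive multiplications."""
    if x < 10 or y < 10:                    # base case: a one-digit operand
        return x * y
    n = max(len(str(x)), len(str(y)))
    half = n // 2
    shift = 10 ** half
    x1, x2 = divmod(x, shift)               # x = x1*10^half + x2
    y1, y2 = divmod(y, shift)               # y = y1*10^half + y2
    p1 = karatsuba(x1, y1)                  # A1*B1
    p2 = karatsuba(x2, y2)                  # A2*B2
    p3 = karatsuba(x1 + x2, y1 + y2)        # (A1+A2)*(B1+B2)
    # middle term recovered as p3 - p1 - p2, per the Second Cut identity
    return p1 * shift * shift + (p3 - p1 - p2) * shift + p2
```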

Integer Multiplication
To multiply two n-digit integers:
Add two n/2-digit integers.
Multiply three n/2-digit integers (recursively).
Add, subtract, and shift n/2-digit integers to obtain the result.

Large-Integer Multiplication
What if the two large numbers have different numbers of digits?
What if n is an odd number?

Closest-Pair Problem
S is a set of n points Pi = (xi, yi) in the plane
For simplicity, n is a power of two
Without loss of generality, we assume the points are ordered by their x coordinates
Discussion: How do we apply divide-and-conquer?

Closest-Pair Problem by Divide-and-Conquer

Step 1: Divide the points given into two subsets S1 and S2 by a vertical line x = c so that half the points lie to the left or on the line and half the points lie to the right or on the line.

Closest Pair by Divide-and-Conquer

Step 2: Find recursively the closest pairs for the left and right subsets (with distances d1 and d2).
Step 3: Set d = min{d1, d2}.
We can limit our attention to the points in the vertical strip of width 2d around the dividing line as candidates for the closest pair. Let C1 and C2 be the subsets of points in the left subset S1 and of the right subset S2, respectively, that lie in this strip. The points in C1 and C2 are stored in increasing order of their y coordinates, which is maintained by merging during the execution of the next step.
Step 4: For every point P(x, y) in C1, we inspect the points in C2 that may be closer to P than d. There can be no more than 6 such points (because d ≤ d2)!

Closest Pair by Divide-and-Conquer: Worst Case

[Figure: the worst-case scenario for Step 4]

Efficiency of the Closest-Pair Algorithm

The running time of the algorithm is described by
T(n) = 2T(n/2) + f(n), where f(n) ∈ Θ(n)
By the Master Theorem (with a = 2, b = 2, d = 1),
T(n) ∈ O(n log n)