# Stanford University — CS161: Algorithms Handout 2

Luca Trevisan, April 3, 2013

**Lecture 2.** In which we analyze the running time of mergesort and of a recursive algorithm that multiplies large integers.
## 1 Analysis of mergesort
Let T(n) be the worst-case running time of mergesort on inputs of length n. Then T(n) satisfies the equations

$$T(1) = O(1)$$
$$T(n) = 2 \cdot T\left(\frac{n}{2}\right) + O(n)$$

(The second equation is true as written only if n is even; otherwise, instead of recursing on two instances of size n/2, we recurse on two instances of size (n+1)/2 and (n-1)/2. We will deal with the general case later in this section.)
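For concreteness, the recursion being analyzed can be sketched in Python as follows (a minimal mergesort; the merge loop at the end is the source of the O(n) term in the recurrence):

```python
def mergesort(a):
    """Sort a list by recursively sorting its two halves and merging them."""
    n = len(a)
    if n <= 1:
        return a[:]
    mid = n // 2
    left, right = mergesort(a[:mid]), mergesort(a[mid:])
    # Merge step: O(n) work, the additive term in the recurrence.
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out
```

Note that when n is odd the two recursive calls receive inputs of size (n+1)/2 and (n-1)/2, exactly the caveat mentioned above.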
There are constants $c_1, c_2$ such that the equations can be written as the inequalities

$$T(1) \leq c_1$$
$$T(n) \leq 2 \cdot T\left(\frac{n}{2}\right) + c_2 \cdot n$$
and, if we define $c := \max\{c_1, c_2\}$, we can also write them as

$$T(1) \leq c$$
$$T(n) \leq 2 \cdot T\left(\frac{n}{2}\right) + c \cdot n$$
Now define a function F via the equations

$$F(1) = 1$$
$$F(n) = 2 \cdot F\left(\frac{n}{2}\right) + n$$
It is easy to prove by strong induction on n that, if F(n) is defined, then T(n) ≤ c · F(n). (Note that F(n) is defined only if n is a power of two.)
Let us study F(n). By expanding the definition, we see that

$$F(n) = 2F\left(\frac{n}{2}\right) + n = 4F\left(\frac{n}{4}\right) + 2n = \cdots = 2^k F\left(\frac{n}{2^k}\right) + kn$$
More precisely, we can prove by induction on k that for every n that is a power of two, and for every k such that $2^k \leq n$, we have

$$F(n) = 2^k F\left(\frac{n}{2^k}\right) + kn$$
Then we apply the above equation to $k := \log_2 n$, that is, to the value of k such that $n = 2^k$. Then we get

$$F(n) = n + n \log_2 n$$

and, from $T(n) \leq c \cdot F(n)$, we have $T(n) = O(n \log n)$.
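The closed form is easy to check numerically. A small sketch, following the definition of F above (defined for powers of two only):

```python
def F(n):
    # F from the text; defined only when n is a power of two.
    if n == 1:
        return 1
    return 2 * F(n // 2) + n

# Closed form: F(2^k) = 2^k + 2^k * k, i.e. F(n) = n + n*log2(n).
for k in range(11):
    n = 2 ** k
    assert F(n) == n + n * k
```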
The above analysis applies only to the case in which n is a power of two. What do we do in general? For general n, the running time T(·) satisfies

$$T(1) = O(1)$$
$$T(n) = T\left(\left\lceil \frac{n}{2} \right\rceil\right) + T\left(\left\lfloor \frac{n}{2} \right\rfloor\right) + O(n)$$
which, for the same constants $c_1, c_2, c$ used above, can be written as

$$T(1) \leq c_1$$
$$T(n) \leq T\left(\left\lceil \frac{n}{2} \right\rceil\right) + T\left(\left\lfloor \frac{n}{2} \right\rfloor\right) + c_2 \cdot n$$
and as

$$T(1) \leq c$$
$$T(n) \leq T\left(\left\lceil \frac{n}{2} \right\rceil\right) + T\left(\left\lfloor \frac{n}{2} \right\rfloor\right) + c \cdot n$$
Now define a function F, for every positive integer n, as

$$F(1) = 1$$
$$F(n) = F\left(\left\lceil \frac{n}{2} \right\rceil\right) + F\left(\left\lfloor \frac{n}{2} \right\rfloor\right) + n$$
We can prove by strong induction on n that for every n we have $T(n) \leq c \cdot F(n)$. Also, the values of T(·) and F(·) that we just defined are equal to the ones that we defined above when n is a power of two, so it remains true that $F(n) = n + n \log_2 n$ when n is a power of two.
What remains, which we can again prove by strong induction on n, is to argue that for every n we have F(n + 1) > F(n). This is easily verified for n = 1, because

$$F(2) = F(1) + F(1) + 2 = 4 > 1 = F(1)$$

and, assuming F(m + 1) > F(m) for m = 1, . . . , n - 1, we have

$$F(n+1) = F\left(\left\lceil \frac{n+1}{2} \right\rceil\right) + F\left(\left\lfloor \frac{n+1}{2} \right\rfloor\right) + n + 1 > F\left(\left\lceil \frac{n}{2} \right\rceil\right) + F\left(\left\lfloor \frac{n}{2} \right\rfloor\right) + n = F(n)$$

where the inequality uses the inductive hypothesis: since $\lceil (n+1)/2 \rceil \geq \lceil n/2 \rceil$ and $\lfloor (n+1)/2 \rfloor \geq \lfloor n/2 \rfloor$, and all four arguments are at most n, neither F term decreases.
Let us now consider the value of F(n) for an arbitrary n. Let k be such that $2^k$ is the smallest power of two that is at least n, that is, $k = \lceil \log_2 n \rceil$. Note that $n \leq 2^k \leq 2n - 1$. Then we have

$$F(n) \leq F(2^k) \qquad \text{because } n \leq 2^k \text{ and } F \text{ is increasing}$$
$$= 2^k + k \cdot 2^k \qquad \text{based on our analysis of } F(n) \text{ when } n \text{ is a power of } 2$$
$$< 2n + (\log_2 2n) \cdot 2n = 4n + 2n \log_2 n$$

This means that T(n) = O(n log n) for every n.
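Both the monotonicity of F and the bound just derived can be checked numerically. A sketch, with the recursion following the general-n definition of F:

```python
import math

def F(n):
    # F(n) = F(ceil(n/2)) + F(floor(n/2)) + n, for every positive integer n.
    if n == 1:
        return 1
    return F((n + 1) // 2) + F(n // 2) + n

# F is strictly increasing, and F(n) < 4n + 2n*log2(n) for n >= 2.
prev = F(1)
for n in range(2, 300):
    cur = F(n)
    assert cur > prev
    assert cur < 4 * n + 2 * n * math.log2(n)
    prev = cur
```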
## 2 Integer Multiplication
Consider the problem of multiplying two large integers. We assume that a large integer is stored as an array a = a[0], . . . , a[n - 1] of decimal digits, with the understanding that the array stands for the integer

$$a[0] + 10 \cdot a[1] + \cdots + 10^{n-1} \cdot a[n-1]$$

For example, the integer 345225 is stored as the array [5, 2, 2, 5, 4, 3].¹ The grade-school algorithm clearly computes sums of n-digit integers in time O(n). What about multiplication?
If we analyze the grade-school multiplication algorithm, we see that it executes $n^2$ multiplications between digits, plus several sums, and the overall running time can be analyzed as $O(n^2)$. Can we do better?
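For reference, the grade-school algorithm can be sketched on the digit-array representation above (the name `gradeschool_multiply` is illustrative; digits are stored low-order first, as in the text):

```python
def gradeschool_multiply(a, b):
    """O(n^2) grade-school multiplication of little-endian decimal digit arrays."""
    out = [0] * (len(a) + len(b))
    # The n^2 digit products, accumulated column by column.
    for i, da in enumerate(a):
        for j, db in enumerate(b):
            out[i + j] += da * db
    # One carry-propagation pass turns the columns back into decimal digits.
    carry = 0
    for k in range(len(out)):
        total = out[k] + carry
        out[k] = total % 10
        carry = total // 10
    # Trim leading zeros (stored at the end), keeping at least one digit.
    while len(out) > 1 and out[-1] == 0:
        out.pop()
    return out
```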
A first attempt at using divide-and-conquer is the following. Suppose we are given two n-digit integers a and b, which we want to multiply. Then we can reduce this problem to the problem of multiplying integers with n/2 digits as follows. Write

$$a = a_L + 10^{n/2} \cdot a_M$$

where $a_L$ is the number whose digits are the n/2 lower-order digits of a, and $a_M$ is the number whose digits are the n/2 higher-order digits of a. For example, if a = 345225, then $a_L = 225$, $a_M = 345$, and $a = a_L + 1000 \cdot a_M$. Let us similarly write

$$b = b_L + 10^{n/2} \cdot b_M$$
Then we have

$$a \cdot b = 10^n \cdot a_M b_M + 10^{n/2} \cdot (a_L b_M + a_M b_L) + a_L b_L$$

Multiplication by a power of 10 is just a shift, which can be computed in time O(n), and sums can also be computed in time O(n). This means that a · b can be computed in time O(n) plus the time it takes to recursively compute the four products $a_M b_M$, $a_L b_M$, $a_M b_L$, $a_L b_L$. Note that each of the four products involves integers with n/2 digits, so if we let T(n) be the worst-case running time of the above recursive algorithm, where n is the number of digits of the two input integers, we have

$$T(1) = O(1)$$
$$T(n) = 4 \cdot T\left(\frac{n}{2}\right) + O(n)$$
¹ In an actual implementation, we would use base $2^{32}$ or base $2^{64}$, so that each digit fills up an entire memory word and each operation between digits can be executed as one machine-language operation.
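A sketch of this four-product recursion in Python. The splitting is done with arithmetic rather than digit arrays, the base case delegates to the built-in product once an operand is a single digit, and the names are illustrative:

```python
def multiply(a, b):
    """Recursive multiplication via the identity
    a*b = 10^(2h) * aM*bM + 10^h * (aL*bM + aM*bL) + aL*bL,
    where h = n // 2 and n is the larger digit count."""
    if a < 10 or b < 10:
        return a * b  # base case: at least one operand is a single digit
    n = max(len(str(a)), len(str(b)))
    shift = 10 ** (n // 2)
    a_L, a_M = a % shift, a // shift
    b_L, b_M = b % shift, b // shift
    # The four recursive products counted by the recurrence above.
    return (multiply(a_M, b_M) * shift * shift
            + (multiply(a_L, b_M) + multiply(a_M, b_L)) * shift
            + multiply(a_L, b_L))
```

The identity holds for odd n as well, since $a = a_L + 10^h a_M$ and $b = b_L + 10^h b_M$ exactly, whatever $h$ is.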
By the argument that we have already seen twice, if we define the function F as

$$F(1) = 1$$
$$F(n) = 4 \cdot F\left(\frac{n}{2}\right) + n$$
and we solve for F, then we have T(n) = O(F(n)). How do we solve for F? After applying the definition a few times we see that

$$F(n) = 4F\left(\frac{n}{2}\right) + n = 16F\left(\frac{n}{4}\right) + 2n + n = 64F\left(\frac{n}{8}\right) + 4n + 2n + n = \cdots = 4^k F\left(\frac{n}{2^k}\right) + n \cdot (1 + 2 + \cdots + 2^{k-1})$$
That is, we can prove by induction on k that for every n for which F is defined (that is, for every n that is a power of 2) and for every k such that $2^k \leq n$, we have

$$F(n) = 4^k F\left(\frac{n}{2^k}\right) + n \cdot (1 + 2 + \cdots + 2^{k-1})$$
If we substitute $k := \log_2 n$, then we see that $4^k = n^2$, that $2^k = n$, and

$$1 + \cdots + 2^{k-1} = 2^k - 1 = n - 1$$

so that we get

$$F(n) = n^2 + n \cdot (n - 1) = 2n^2 - n$$

and so

$$T(n) = O(n^2)$$
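As a sanity check of this closed form, a sketch (powers of two only, following the definition of F for this recurrence):

```python
def F(n):
    # F(n) = 4*F(n/2) + n, defined when n is a power of two.
    if n == 1:
        return 1
    return 4 * F(n // 2) + n

# Closed form: F(n) = 2*n^2 - n.
for k in range(11):
    n = 2 ** k
    assert F(n) == 2 * n * n - n
```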
meaning that we have not improved over the grade-school algorithm. In the next lecture we will see a more efficient algorithm.