
Herbert S. Wilf

University of Pennsylvania

Philadelphia, PA 19104-6395

Copyright Notice

Copyright 1994 by Herbert S. Wilf. This material may be reproduced for any educational purpose; multiple copies may be made for classes, etc. Charges, if any, for reproduced copies must be just enough to recover reasonable costs of reproduction. Reproduction for commercial purposes is prohibited. This cover page must be included in all distributed copies.

Internet Edition, Summer, 1994

This edition of Algorithms and Complexity is available at the web site http://www.math.upenn.edu. It may be taken at no charge by all interested persons. Comments and corrections are welcome, and should be sent to wilf@math.upenn.edu.

CONTENTS

Chapter 0: What This Book Is About
  0.1 Background
  0.2 Hard vs. easy problems
  0.3 A preview

Chapter 1: Mathematical Preliminaries
  1.1 Orders of magnitude
  1.2 Positional number systems
  1.3 Manipulations with series
  1.4 Recurrence relations
  1.5 Counting
  1.6 Graphs

Chapter 2: Recursive Algorithms
  2.1 Introduction
  2.2 Quicksort
  2.3 Recursive graph algorithms
  2.4 Fast matrix multiplication
  2.5 The discrete Fourier transform
  2.6 Applications of the FFT
  2.7 A review

Chapter 3: The Network Flow Problem
  3.1 Introduction
  3.2 Algorithms for the network flow problem
  3.3 The algorithm of Ford and Fulkerson
  3.4 The max-flow min-cut theorem
  3.5 The complexity of the Ford-Fulkerson algorithm
  3.6 Layered networks
  3.7 The MPM Algorithm
  3.8 Applications of network flow

Chapter 4: Algorithms in the Theory of Numbers
  4.1 Preliminaries
  4.2 The greatest common divisor
  4.3 The extended Euclidean algorithm
  4.4 Primality testing
  4.5 Interlude: the ring of integers modulo n
  4.6 Pseudoprimality tests
  4.7 Proof of goodness of the strong pseudoprimality test
  4.8 Factoring and cryptography
  4.9 Factoring large integers
  4.10 Proving primality

Chapter 5: NP-completeness
  5.1 Introduction
  5.2 Turing machines
  5.3 Cook's theorem
  5.4 Some other NP-complete problems
  5.5 Half a loaf ...
  5.6 Backtracking (I): independent sets
  5.7 Backtracking (II): graph coloring
  5.8 Approximate algorithms for hard problems

Preface

For the past several years mathematics majors in the computing track at the University of Pennsylvania have taken a course in continuous algorithms (numerical analysis) in the junior year, and in discrete algorithms in the senior year. This book has grown out of the senior course as I have been teaching it recently. It has also been tried out on a large class of computer science and mathematics majors, including seniors and graduate students, with good results.

Selection by the instructor of topics of interest will be very important, because I have found that I normally can't cover anywhere near all of this material in a semester. A reasonable choice for a first try might be to begin with Chapter 2 (recursive algorithms), which contains lots of motivation. Then, as new ideas are needed in Chapter 2, one might delve into the appropriate sections of Chapter 1 to get the concepts and techniques well in hand. After Chapter 2, Chapter 4, on number theory, discusses material that is extremely attractive, and surprisingly pure and applicable at the same time. Chapter 5 would be next, since the foundations would then all be in place. Finally, material from Chapter 3, which is rather independent of the rest of the book, but is strongly connected to combinatorial algorithms in general, might be studied as time permits.

Throughout the book there are opportunities to ask students to write programs and get them running. These are not mentioned explicitly, with a few exceptions, but will be obvious when encountered. Students should all have the experience of writing, debugging, and using a program that is nontrivially recursive, for example. The concept of recursion is subtle and powerful, and is helped a lot by hands-on practice. Any of the algorithms of Chapter 2 would be suitable for this purpose. The recursive graph algorithms are particularly recommended since they are usually quite foreign to students' previous experience and therefore have great learning value.

In addition to the exercises that appear in this book, then, student assignments might consist of writing occasional programs, as well as delivering reports in class on assigned readings. The latter might be found among the references cited in the bibliographies in each chapter.

I am indebted first of all to the students on whom I worked out these ideas, and second to a number of colleagues for their helpful advice and friendly criticism. Among the latter I will mention Richard Brualdi, Daniel Kleitman, Albert Nijenhuis, Robert Tarjan and Alan Tucker. For the no-doubt-numerous shortcomings that remain, I accept full responsibility.

This book was typeset in TeX. To the extent that it's a delight to look at, thank TeX. For the deficiencies in its appearance, thank my limitations as a typesetter. It was, however, a pleasure for me to have had the chance to typeset my own book. My thanks to the Computer Science department of the University of Pennsylvania, and particularly to Aravind Joshi, for generously allowing me the use of TeX facilities.

Herbert S. Wilf


Chapter 0: What This Book Is About

0.1 Background

An algorithm is a method for solving a class of problems on a computer. The complexity of an algorithm is the cost, measured in running time, or storage, or whatever units are relevant, of using the algorithm to solve one of those problems.

This book is about algorithms and complexity, and so it is about methods for solving problems on computers and the costs (usually the running time) of using those methods.

Computing takes time. Some problems take a very long time, others can be done quickly. Some problems seem to take a long time, and then someone discovers a faster way to do them (a 'faster algorithm'). The study of the amount of computational effort that is needed in order to perform certain kinds of computations is the study of computational complexity.

Naturally, we would expect that a computing problem for which millions of bits of input data are required would probably take longer than another problem that needs only a few items of input. So the time complexity of a calculation is measured by expressing the running time of the calculation as a function of some measure of the amount of data that is needed to describe the problem to the computer.

For instance, think about this statement: 'I just bought a matrix inversion program, and it can invert an n × n matrix in just 1.2n^3 minutes.' We see here a typical description of the complexity of a certain algorithm. The running time of the program is being given as a function of the size of the input matrix.

A faster program for the same job might run in 0.8n^3 minutes for an n × n matrix. If someone were to make a really important discovery (see section 2.4), then maybe we could actually lower the exponent, instead of merely shaving the multiplicative constant. Thus, a program that would invert an n × n matrix in only 7n^2.8 minutes would represent a striking improvement of the state of the art.
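To see what lowering the exponent buys, here is a small sketch using the two hypothetical running times above (1.2n^3 and 7n^2.8 are the figures from the text; the search routine is my own). It finds the matrix size beyond which the n^2.8 program's guarantee beats the cubic one, despite its much worse constant factor:

```python
def crossover(old=lambda n: 1.2 * n**3, new=lambda n: 7 * n**2.8):
    """Return the smallest n at which `new` undercuts `old`."""
    n = 1
    while new(n) >= old(n):
        n += 1
    return n

n_star = crossover()
# 7n^2.8 < 1.2n^3 exactly when n^0.2 > 7/1.2, i.e. n > (7/1.2)^5,
# so for every n >= n_star (in the thousands) the smaller exponent wins,
# even though the constant 7 is far worse than 1.2.
```

This is the sense in which the exponent, not the constant, governs the asymptotic comparison.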

For the purposes of this book, a computation that is guaranteed to take at most cn^3 time for input of size n will be thought of as an 'easy' computation. One that needs at most n^10 time is also easy. If a certain calculation on an n × n matrix were to require 2^n minutes, then that would be a 'hard' problem. Naturally some of the computations that we are calling 'easy' may take a very long time to run, but still, from our present point of view the important distinction to maintain will be the polynomial time guarantee or lack of it.

The general rule is that if the running time is at most a polynomial function of the amount of input data, then the calculation is an easy one, otherwise it's hard.

Many problems in computer science are known to be easy. To convince someone that a problem is easy, it is enough to describe a fast method for solving that problem. To convince someone that a problem is hard is hard, because you will have to prove to them that it is impossible to find a fast way of doing the calculation. It will not be enough to point to a particular algorithm and to lament its slowness. After all, that algorithm may be slow, but maybe there's a faster way.

Matrix inversion is easy. The familiar Gaussian elimination method can invert an n × n matrix in time at most cn^3.

To give an example of a hard computational problem we have to go far afield. One interesting one is called the 'tiling problem.' Suppose* we are given infinitely many identical floor tiles, each shaped like a regular hexagon. Then we can tile the whole plane with them, i.e., we can cover the plane with no empty spaces left over. This can also be done if the tiles are identical rectangles, but not if they are regular pentagons.

In Fig. 0.1 we show a tiling of the plane by identical rectangles, and in Fig. 0.2 is a tiling by regular hexagons.

That raises a number of theoretical and computational questions. One computational question is this. Suppose we are given a certain polygon, not necessarily regular and not necessarily convex, and suppose we have infinitely many identical tiles in that shape. Can we or can we not succeed in tiling the whole plane?

That elegant question has been proved* to be computationally unsolvable. In other words, not only do we not know of any fast way to solve that problem on a computer, it has been proved that there isn't any way to do it, so even looking for an algorithm would be fruitless. That doesn't mean that the question is hard for every polygon. Hard problems can have easy instances. What has been proved is that no single method exists that can guarantee that it will decide this question for every polygon.

Fig. 0.1: Tiling with rectangles

Fig. 0.2: Tiling with hexagons

* See, for instance, Martin Gardner's article in Scientific American, January 1977, pp. 110-121.

* R. Berger, The undecidability of the domino problem, Memoirs Amer. Math. Soc. 66 (1966), Amer. Math. Soc., Providence, RI.

The fact that a computational problem is hard doesn't mean that every instance of it has to be hard. The problem is hard because we cannot devise an algorithm for which we can give a guarantee of fast performance for all instances.

Notice that the amount of input data to the computer in this example is quite small. All we need to input is the shape of the basic polygon. Yet not only is it impossible to devise a fast algorithm for this problem, it has been proved impossible to devise any algorithm at all that is guaranteed to terminate with a Yes/No answer after finitely many steps. That's really hard!

0.2 Hard vs. easy problems

Let's take a moment more to say in another way exactly what we mean by an 'easy' computation vs. a 'hard' one.

Think of an algorithm as being a little box that can solve a certain class of computational problems. Into the box goes a description of a particular problem in that class, and then, after a certain amount of time, or of computational effort, the answer appears.

A 'fast' algorithm is one that carries a guarantee of fast performance. Here are some examples.

Example 1. It is guaranteed that if the input problem is described with B bits of data, then an answer will be output after at most 6B^3 minutes.

Example 2. It is guaranteed that every problem that can be input with B bits of data will be solved in at most 0.7B^15 seconds.

A performance guarantee, like the two above, is sometimes called a 'worst-case complexity estimate,' and it's easy to see why. If we have an algorithm that will, for example, sort any given sequence of numbers into ascending order of size (see section 2.2), it may find that some sequences are easier to sort than others. For instance, the sequence 1, 2, 7, 11, 10, 15, 20 is nearly in order already, so our algorithm might, if it takes advantage of the near-order, sort it very rapidly. Other sequences might be a lot harder for it to handle, and might therefore take more time.
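The text doesn't commit to a particular sorting method here; as an illustration, a simple insertion sort (one algorithm that does exploit near-order) can be instrumented to count comparisons, and it does far less work on the nearly sorted sequence above than on a reversed sequence of the same length:

```python
def insertion_sort_comparisons(seq):
    """Insertion sort that also reports how many comparisons it made."""
    a = list(seq)
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if a[j] > key:      # shift larger elements one slot to the right
                a[j + 1] = a[j]
                j -= 1
            else:
                break
        a[j + 1] = key
    return a, comparisons

_, easy = insertion_sort_comparisons([1, 2, 7, 11, 10, 15, 20])  # nearly in order
_, hard = insertion_sort_comparisons([20, 15, 11, 10, 7, 2, 1])  # reversed
# `easy` is much smaller than `hard`; a worst-case guarantee must cover `hard`.
```

A warranty card for this algorithm would have to quote the cost of the reversed input, not the lucky one.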


So in some problems whose input bit string has B bits the algorithm might operate in time 6B, and on others it might need, say, 10B log B time units, and for still other problem instances of length B bits the algorithm might need 5B^2 time units to get the job done.

Well then, what would the warranty card say? It would have to pick out the worst possibility, otherwise the guarantee wouldn't be valid. It would assure a user that if the input problem instance can be described by B bits, then an answer will appear after at most 5B^2 time units. Hence a performance guarantee is equivalent to an estimation of the worst possible scenario: the longest possible calculation that might ensue if B bits are input to the program.

Worst-case bounds are the most common kind, but there are other kinds of bounds for running time. We might give an average case bound instead (see section 5.7). That wouldn't guarantee performance no worse than so-and-so; it would state that if the performance is averaged over all possible input bit strings of B bits, then the average amount of computing time will be so-and-so (as a function of B).

Now let's talk about the difference between easy and hard computational problems and between fast and slow algorithms.

A warranty that would not guarantee 'fast' performance would contain some function of B that grows faster than any polynomial. Like e^B, for instance, or like 2^√B, etc. It is the polynomial time vs. not necessarily polynomial time guarantee that makes the difference between the easy and the hard classes of problems, or between the fast and the slow algorithms.

It is highly desirable to work with algorithms such that we can give a performance guarantee for their running time that is at most a polynomial function of the number of bits of input.

An algorithm is slow if, whatever polynomial P we think of, there exist arbitrarily large values of B, and input data strings of B bits, that cause the algorithm to do more than P(B) units of work.

A computational problem is tractable if there is a fast algorithm that will do all instances of it.

A computational problem is intractable if it can be proved that there is no fast algorithm for it.

Example 3. Here is a familiar computational problem and a method, or algorithm, for solving it. Let's see if the method has a polynomial time guarantee or not.

The problem is this. Let n be a given integer. We want to find out if n is prime. The method that we choose is the following. For each integer m = 2, 3, ..., ⌊√n⌋ we ask if m divides (evenly into) n. If all of the answers are 'No,' then we declare n to be a prime number, else it is composite.

We will now look at the computational complexity of this algorithm. That means that we are going to find out how much work is involved in doing the test. For a given integer n the work that we have to do can be measured in units of divisions of a whole number by another whole number. In those units, we obviously will do about √n units of work.

It seems as though this is a tractable problem, because, after all, √n is of polynomial growth in n. For instance, we do less than n units of work, and that's certainly a polynomial in n, isn't it? So, according to our definition of fast and slow algorithms, the distinction was made on the basis of polynomial vs. faster-than-polynomial growth of the work done with the problem size, and therefore this problem must be easy. Right? Well no, not really.

Reference to the distinction between fast and slow methods will show that we have to measure the amount of work done as a function of the number of bits of input to the problem. In this example, n is not the number of bits of input. For instance, if n = 59, we don't need 59 bits to describe n, but only 6. In general, the number of binary digits in the bit string of an integer n is close to log_2 n.

So in the problem of this example, testing the primality of a given integer n, the length of the input bit string B is about log_2 n. Seen in this light, the calculation suddenly seems very long. A string consisting of a mere log_2 n 0's and 1's has caused our mighty computer to do about √n units of work.

If we express the amount of work done as a function of B, we find that the complexity of this calculation is approximately 2^(B/2), and that grows much faster than any polynomial function of B.

Therefore, the method that we have just discussed for testing the primality of a given integer is slow.
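The trial-division test just described is easy to code; the sketch below (function name and test value are my own) also counts the divisions performed and reports the true input size in bits, making the 2^(B/2) behavior visible:

```python
import math

def is_prime_trial_division(n):
    """Trial division by m = 2, ..., floor(sqrt(n)); returns (verdict, divisions)."""
    divisions = 0
    for m in range(2, math.isqrt(n) + 1):
        divisions += 1
        if n % m == 0:
            return False, divisions
    return True, divisions

n = 10**9 + 7                  # a prime, so the loop runs all the way up
prime, work = is_prime_trial_division(n)
bits = n.bit_length()          # B, the honest input size: about log_2 n
# bits is 30, yet work is about sqrt(n), i.e. roughly 2^(bits/2) = 2^15 divisions:
# a 30-bit input string has caused tens of thousands of units of work.
```

Doubling the input length B squares the amount of work, which is exactly the exponential behavior the text warns about.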

See chapter 4 for further discussion of this problem. At the present time no one has found a fast way to test for primality, nor has anyone proved that there isn't a fast way. Primality testing belongs to the (well-populated) class of seemingly, but not provably, intractable problems.

In this book we will deal with some easy problems and some seemingly hard ones. It's the 'seemingly' that makes things very interesting. These are problems for which no one has found a fast computer algorithm, but also, no one has proved the impossibility of doing so. It should be added that the entire area is vigorously being researched because of the attractiveness and the importance of the many unanswered questions that remain.

Thus, even though we just don't know many things that we'd like to know in this field, it isn't for lack of trying!

0.3 A preview

Chapter 1 contains some of the mathematical background that will be needed for our study of algorithms. It is not intended that reading this book or using it as a text in a course must necessarily begin with Chapter 1. It's probably a better idea to plunge into Chapter 2 directly, and then, when particular skills or concepts are needed, to read the relevant portions of Chapter 1. Otherwise the definitions and ideas that are in that chapter may seem to be unmotivated, when in fact motivation in great quantity resides in the later chapters of the book.

Chapter 2 deals with recursive algorithms and the analyses of their complexities.

Chapter 3 is about a problem that seems as though it might be hard, but turns out to be easy, namely the network flow problem. Thanks to quite recent research, there are fast algorithms for network flow problems, and they have many important applications.

In Chapter 4 we study algorithms in one of the oldest branches of mathematics, the theory of numbers. Remarkably, the connections between this ancient subject and the most modern research in computer methods are very strong.

In Chapter 5 we will see that there is a large family of problems, including a number of very important computational questions, that are bound together by a good deal of structural unity. We don't know if they're hard or easy. We do know that we haven't found a fast way to do them yet, and most people suspect that they're hard. We also know that if any one of these problems is hard, then they all are, and if any one of them is easy, then they all are.

We hope that, having found out something about what people know and what people don't know, the reader will have enjoyed the trip through this subject and may be interested in helping to find out a little more.


Chapter 1: Mathematical Preliminaries

1.1 Orders of magnitude

In this section we're going to discuss the rates of growth of different functions and to introduce the five symbols of asymptotics that are used to describe those rates of growth. In the context of algorithms, the reason for this discussion is that we need a good language for the purpose of comparing the speeds with which different algorithms do the same job, or the amounts of memory that they use, or whatever other measure of the complexity of the algorithm we happen to be using.

Suppose we have a method of inverting square nonsingular matrices. How might we measure its speed? Most commonly we would say something like 'if the matrix is n × n then the method will run in time 16.8n^3.' Then we would know that if a 100 × 100 matrix can be inverted, with this method, in 1 minute of computer time, then a 200 × 200 matrix would require 2^3 = 8 times as long, or about 8 minutes. The constant '16.8' wasn't used at all in this example; only the fact that the labor grows as the third power of the matrix size was relevant.
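That the constant cancels out of the comparison is worth seeing once in the arithmetic; a minimal sketch, using the hypothetical 16.8n^3 running time from the text:

```python
def predicted_time(n, c=16.8):
    """Hypothetical running time c*n^3 from the text, in arbitrary units."""
    return c * n**3

# The constant c cancels when two sizes are compared:
ratio = predicted_time(200) / predicted_time(100)
# (200/100)^3 = 8: doubling the matrix size multiplies the time by 8,
# so a 1-minute 100 x 100 inversion predicts about 8 minutes at 200 x 200,
# whatever the value of c.
```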

Hence we need a language that will allow us to say that the computing time, as a function of n, grows 'on the order of n^3,' or 'at most as fast as n^3,' or 'at least as fast as n^5 log n,' etc.

The new symbols that are used in the language of comparing the rates of growth of functions are the following five: 'o' (read 'is little oh of'), 'O' (read 'is big oh of'), 'Θ' (read 'is theta of'), '∼' (read 'is asymptotically equal to' or, irreverently, as 'twiddles'), and 'Ω' (read 'is omega of').

Now let’s explain what each of them means.

Let f(x) and g(x) be two functions of x. Each of the five symbols above is intended to compare the rapidity of growth of f and g. If we say that f(x) = o(g(x)), then informally we are saying that f grows more slowly than g does when x is very large. Formally, we state the

Definition. We say that f(x) = o(g(x)) (x → ∞) if lim_{x→∞} f(x)/g(x) exists and is equal to 0.

Here are some examples:

(a) x^2 = o(x^5)
(b) sin x = o(x)
(c) 14.709√x = o(x/2 + 7 cos x)
(d) 1/x = o(1) (?)
(e) 23 log x = o(x^0.02)

We can see already from these few examples that sometimes it might be easy to prove that a 'o' relationship is true and sometimes it might be rather difficult. Example (e), for instance, requires the use of L'Hospital's rule.

If we have two computer programs, and if one of them inverts n × n matrices in time 635n^3 and if the other one does so in time o(n^2.8), then we know that for all sufficiently large values of n the performance guarantee of the second program will be superior to that of the first program. Of course, the first program might run faster on small matrices, say up to size 10,000 × 10,000. If a certain program runs in time n^2.03 and if someone were to produce another program for the same problem that runs in o(n^2 log n) time, then that second program would be an improvement, at least in the theoretical sense. The reason for the 'theoretical' qualification, once more, is that the second program would be known to be superior only if n were sufficiently large.

The second symbol of the asymptotics vocabulary is the 'O.' When we say that f(x) = O(g(x)) we mean, informally, that f certainly doesn't grow at a faster rate than g. It might grow at the same rate or it might grow more slowly; both are possibilities that the 'O' permits. Formally, we have the next

Definition. We say that f(x) = O(g(x)) (x → ∞) if ∃C, x_0 such that |f(x)| < Cg(x) (∀x > x_0).

The qualifier 'x → ∞' will usually be omitted, since it will be understood that we will most often be interested in large values of the variables that are involved.

For example, it is certainly true that sin x = O(x), but even more can be said, namely that sin x = O(1). Also x^3 + 5x^2 + 77 cos x = O(x^5) and 1/(1 + x^2) = O(1). Now we can see how the 'o' gives more precise information than the 'O,' for we can sharpen the last example by saying that 1/(1 + x^2) = o(1). This is sharper because not only does it tell us that the function is bounded when x is large, we learn that the function actually approaches 0 as x → ∞.

This is typical of the relationship between O and o. It often happens that a 'O' result is sufficient for an application. However, that may not be the case, and we may need the more precise 'o' estimate.

The third symbol of the language of asymptotics is the ‘Θ.’

Definition. We say that f(x) = Θ(g(x)) if there are constants c_1 ≠ 0, c_2 ≠ 0, x_0 such that for all x > x_0 it is true that c_1 g(x) < f(x) < c_2 g(x).

We might then say that f and g are of the same rate of growth, only the multiplicative constants are uncertain. Some examples of the 'Θ' at work are

(x + 1)^2 = Θ(3x^2)
(x^2 + 5x + 7)/(5x^3 + 7x + 2) = Θ(1/x)
√(3 + √(2x)) = Θ(x^(1/4))
(1 + 3/x)^x = Θ(1).

The 'Θ' is much more precise than either the 'O' or the 'o.' If we know that f(x) = Θ(x^2), then we know that f(x)/x^2 stays between two nonzero constants for all sufficiently large values of x. The rate of growth of f is established: it grows quadratically with x.

The most precise of the symbols of asymptotics is the '∼.' It tells us that not only do f and g grow at the same rate, but that in fact f/g approaches 1 as x → ∞.

Definition. We say that f(x) ∼ g(x) if lim_{x→∞} f(x)/g(x) = 1.

Here are some examples.

x^2 + x ∼ x^2
(3x + 1)^4 ∼ 81x^4
sin 1/x ∼ 1/x
(2x^3 + 5x + 7)/(x^2 + 4) ∼ 2x
2^x + 7 log x + cos x ∼ 2^x

Observe the importance of getting the multiplicative constants exactly right when the '∼' symbol is used. While it is true that 2x^2 = Θ(x^2), it is not true that 2x^2 ∼ x^2. It is, by the way, also true that 2x^2 = Θ(17x^2), but to make such an assertion is to use bad style since no more information is conveyed with the '17' than without it.

The last symbol in the asymptotic set that we will need is the 'Ω.' In a nutshell, 'Ω' is the negation of 'o.' That is to say, f(x) = Ω(g(x)) means that it is not true that f(x) = o(g(x)). In the study of algorithms for computers, the 'Ω' is used when we want to express the thought that a certain calculation takes at least so-and-so long to do. For instance, we can multiply together two n × n matrices in time O(n^3). Later on in this book we will see how to multiply two matrices even faster, in time O(n^2.81). People know of even faster ways to do that job, but one thing that we can be sure of is this: nobody will ever be able to write a matrix multiplication program that will multiply pairs of n × n matrices with fewer than n^2 computational steps, because whatever program we write will have to look at the input data, and there are 2n^2 entries in the input matrices.

Thus, a computing time of cn^2 is certainly a lower bound on the speed of any possible general matrix multiplication program. We might say, therefore, that the problem of multiplying two n × n matrices requires Ω(n^2) time.

The exact definition of the 'Ω' that was given above is actually rather delicate. We stated it as the negation of something. Can we rephrase it as a positive assertion? Yes, with a bit of work (see exercises 6 and 7 below). Since 'f = o(g)' means that f/g → 0, the symbol f = Ω(g) means that f/g does not approach zero. If we assume that g takes positive values only, which is usually the case in practice, then to say that f/g does not approach 0 is to say that ∃ε > 0 and an infinite sequence of values of x, tending to ∞, along which |f|/g > ε. So we don't have to show that |f|/g > ε for all large x, but only for infinitely many large x.


Definition. We say that f(x) = Ω(g(x)) if there is an ε > 0 and a sequence x_1, x_2, x_3, ... → ∞ such that ∀j: |f(x_j)| > εg(x_j).

Now let's introduce a hierarchy of functions according to their rates of growth when x is large. Among commonly occurring functions of x that grow without bound as x → ∞, perhaps the slowest growing ones are functions like log log x or maybe (log log x)^1.03 or things of that sort. It is certainly true that log log x → ∞ as x → ∞, but it takes its time about it. When x = 1,000,000, for example, log log x has the value 2.6.

Just a bit faster growing than the 'snails' above is log x itself. After all, log(1,000,000) = 13.8. So if we had a computer algorithm that could do n things in time log n and someone found another method that could do the same job in time O(log log n), then the second method, other things being equal, would indeed be an improvement, but n might have to be extremely large before you would notice the improvement.

Next on the scale of rapidity of growth we might mention the powers of x. For instance, think about x^0.01. It grows faster than log x, although you wouldn't believe it if you tried to substitute a few values of x and to compare the answers (see exercise 1 at the end of this section).

How would we prove that x^0.01 grows faster than log x? By using L'Hospital's rule.

Example. Consider the limit of x

.01

/logx for x → ∞. As x → ∞ the ratio assumes the indeterminate form

∞/∞, and it is therefore a candidate for L’Hospital’s rule, which tells us that if we want to ﬁnd the limit

then we can diﬀerentiate the numerator, diﬀerentiate the denominator, and try again to let x → ∞. If we

do this, then instead of the original ratio, we ﬁnd the ratio

.01x

−.99

/(1/x) = .01x

.01

which obviously grows without bound as x → ∞. Therefore the original ratio x

.01

/log x also grows without

bound. What we have proved, precisely, is that log x = o(x

.01

), and therefore in that sense we can say that

x

.01

grows faster than log x.
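The point of exercise 1 can also be seen numerically. A small Python sketch (an illustration only, not part of the text, and no substitute for the L'Hospital argument): writing x = e^t turns the comparison of x^{.01} with log x into the comparison of .01t with log t.

```python
import math

# For moderate x, log x is still far ahead of x^.01:
x = 10**6
print(x**0.01, math.log(x))    # about 1.148 versus about 13.8

# Substituting x = e^t, the race x^.01 vs log x becomes .01*t vs log t,
# so the power finally wins once .01*t > log t (around t = 650, x = e^650).
t = 1000.0                     # corresponds to the astronomically large x = e^1000
print(0.01 * t, math.log(t))   # 10.0 versus about 6.9: the power has won
```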

To continue up the scale of rates of growth, we meet x^{.2}, x, x^{15}, x^{15} log^2 x, etc., and then we encounter functions that grow faster than every fixed power of x, just as log x grows slower than every fixed power of x.

Consider e^{log^2 x}. Since this is the same as x^{log x} it will obviously grow faster than x^{1000}; in fact it will be larger than x^{1000} as soon as log x > 1000, i.e., as soon as x > e^{1000} (don't hold your breath!).

Hence e^{log^2 x} is an example of a function that grows faster than every fixed power of x. Another such example is e^{√x} (why?).

Definition. A function that grows faster than x^a, for every constant a, but grows slower than c^x for every constant c > 1 is said to be of moderately exponential growth. More precisely, f(x) is of moderately exponential growth if for every a > 0 we have f(x) = Ω(x^a) and for every ε > 0 we have f(x) = o((1 + ε)^x).

Beyond the range of moderately exponential growth are the functions that grow exponentially fast. Typical of such functions are (1.03)^x, 2^x, x^9 7^x, and so forth. Formally, we have the

Definition. A function f is of exponential growth if there exists c > 1 such that f(x) = Ω(c^x) and there exists d such that f(x) = O(d^x).

If we clutter up a function of exponential growth with smaller functions then we will not change the fact that it is of exponential growth. Thus e^{√x + 2x}/(x^{49} + 37) remains of exponential growth, because e^{2x} is, all by itself, and it resists the efforts of the smaller functions to change its mind.

Beyond the exponentially growing functions there are functions that grow as fast as you might please.

Like n!, for instance, which grows faster than c^n for every fixed constant c, and like 2^{n^2}, which grows much faster than n!. The growth ranges that are of the most concern to computer scientists are ‘between’ the very

slowly, logarithmically growing functions and the functions that are of exponential growth. The reason is

simple: if a computer algorithm requires more than an exponential amount of time to do its job, then it will

probably not be used, or at any rate it will be used only in highly unusual circumstances. In this book, the

algorithms that we will deal with all fall in this range.

Now we have discussed the various symbols of asymptotics that are used to compare the rates of growth

of pairs of functions, and we have discussed the pecking order of rapidity of growth, so that we have a small

catalogue of functions that grow slowly, medium-fast, fast, and super-fast. Next let’s look at the growth of

sums that involve elementary functions, with a view toward discovering the rates at which the sums grow.


Chapter 1: Mathematical Preliminaries

Think about this one:

f(n) = ∑_{j=0}^{n} j^2 = 1^2 + 2^2 + 3^2 + ··· + n^2.    (1.1.1)

Thus, f(n) is the sum of the squares of the ﬁrst n positive integers. How fast does f(n) grow when n is

large?

Notice at once that among the n terms in the sum that defines f(n), the biggest one is the last one, namely n^2. Since there are n terms in the sum and the biggest one is only n^2, it is certainly true that f(n) = O(n^3), and even more, that f(n) ≤ n^3 for all n ≥ 1.

Suppose we wanted more precise information about the growth of f(n), such as a statement like f(n) ∼?.

How might we make such a better estimate?

The best way to begin is to visualize the sum in (1.1.1) as shown in Fig. 1.1.1.

Fig. 1.1.1: How to overestimate a sum

In that figure we see the graph of the curve y = x^2, in the x-y plane. Further, there is a rectangle drawn over every interval of unit length in the range from x = 1 to x = n. The rectangles all lie under the curve. Consequently, the total area of all of the rectangles is smaller than the area under the curve, which is to say that

∑_{j=1}^{n−1} j^2 ≤ ∫_1^n x^2 dx = (n^3 − 1)/3.    (1.1.2)

If we compare (1.1.2) and (1.1.1) we notice that we have proved that f(n) ≤ ((n + 1)^3 − 1)/3.

Now we're going to get a lower bound on f(n) in the same way. This time we use the setup in Fig. 1.1.2, where we again show the curve y = x^2, but this time we have drawn the rectangles so they lie above the curve.

From the picture we see immediately that

1^2 + 2^2 + ··· + n^2 ≥ ∫_0^n x^2 dx = n^3/3.    (1.1.3)

Now our function f(n) has been bounded on both sides, rather tightly. What we know about it is that

∀n ≥ 1 : n^3/3 ≤ f(n) ≤ ((n + 1)^3 − 1)/3.

From this we have immediately that f(n) ∼ n^3/3, which gives us quite a good idea of the rate of growth of

f(n) when n is large. The reader will also have noticed that the ‘∼’ gives a much more satisfying estimate

of growth than the ‘O’ does.
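The two-sided bound is easy to confirm numerically. A quick Python check (an illustration, not part of the text):

```python
# Verify n^3/3 <= f(n) <= ((n+1)^3 - 1)/3 for f(n) = 1^2 + ... + n^2,
# and watch the ratio f(n)/n^3 settle down to 1/3.
for n in (1, 10, 1000):
    f = sum(j * j for j in range(1, n + 1))
    assert n**3 / 3 <= f <= ((n + 1)**3 - 1) / 3
    print(n, f / n**3)   # ratio approaches 1/3, so f(n) ~ n^3/3
```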


Fig. 1.1.2: How to underestimate a sum

Let’s formulate a general principle, for estimating the size of a sum, that will make estimates like the

above for us without requiring us each time to visualize pictures like Figs. 1.1.1 and 1.1.2. The general idea

is that when one is faced with estimating the rates of growth of sums, then one should try to compare the

sums with integrals because they’re usually easier to deal with.

Let a function g(n) be deﬁned for nonnegative integer values of n, and suppose that g(n) is nondecreasing.

We want to estimate the growth of the sum

G(n) = ∑_{j=1}^{n} g(j)    (n = 1, 2, . . .).    (1.1.4)

Consider a diagram that looks exactly like Fig. 1.1.1 except that the curve that is shown there is now the curve y = g(x). The sum of the areas of the rectangles is exactly G(n − 1), while the area under the curve between 1 and n is ∫_1^n g(t) dt. Since the rectangles lie wholly under the curve, their combined areas cannot exceed the area under the curve, and we have the inequality

G(n − 1) ≤ ∫_1^n g(t) dt    (n ≥ 1).    (1.1.5)

On the other hand, if we consider Fig. 1.1.2, where the graph is once more the graph of y = g(x), the fact that the combined areas of the rectangles is now not less than the area under the curve yields the inequality

G(n) ≥ ∫_0^n g(t) dt    (n ≥ 1).    (1.1.6)

If we combine (1.1.5) and (1.1.6) we find that we have completed the proof of

Theorem 1.1.1. Let g(x) be nondecreasing for nonnegative x. Then

∫_0^n g(t) dt ≤ ∑_{j=1}^{n} g(j) ≤ ∫_1^{n+1} g(t) dt.    (1.1.7)

The above theorem is capable of producing quite satisfactory estimates with rather little labor, as the

following example shows.

Let g(n) = log n and substitute in (1.1.7). After doing the integrals, we obtain

n log n − n ≤ ∑_{j=1}^{n} log j ≤ (n + 1) log(n + 1) − n.    (1.1.8)


We recognize the middle member above as log n!, and therefore by exponentiation of (1.1.8) we have

(n/e)^n ≤ n! ≤ (n + 1)^{n+1}/e^n.    (1.1.9)

This is rather a good estimate of the growth of n!, since the right member is only about ne times as large as the left member (why?), when n is large.

By the use of slightly more precise machinery one can prove a better estimate of the size of n! that is called Stirling's formula, which is the statement that

x! ∼ (x/e)^x √(2πx).    (1.1.10)
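Both (1.1.9) and (1.1.10) can be checked numerically. A Python sketch (illustration only; it works in logarithms to avoid overflow, using the standard-library fact that log n! = lgamma(n + 1)):

```python
import math

# Check the bounds (1.1.9) and the quality of Stirling's formula (1.1.10),
# all in logarithms: log n! is math.lgamma(n + 1).
for n in (10, 100, 1000):
    log_fact = math.lgamma(n + 1)
    lower = n * math.log(n) - n                 # log of (n/e)^n
    upper = (n + 1) * math.log(n + 1) - n       # log of (n+1)^{n+1}/e^n
    assert lower <= log_fact <= upper
    stirling = n * math.log(n) - n + 0.5 * math.log(2 * math.pi * n)
    print(n, log_fact - stirling)   # gap shrinks toward 0 as n grows
```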

Exercises for section 1.1

1. Calculate the values of x^{.01} and of log x for x = 10, 1000, 1,000,000. Find a single value of x > 10 for which x^{.01} > log x, and prove that your answer is correct.

2. Some of the following statements are true and some are false. Which are which?
(a) (x^2 + 3x + 1)^3 ∼ x^6
(b) (√x + 1)^3/(x^2 + 1) = o(1)
(c) e^{1/x} = Θ(1)
(d) 1/x ∼ 0
(e) x^3 (log log x)^2 = o(x^3 log x)
(f) √(log x) + 1 = Ω(log log x)
(g) sin x = Ω(1)
(h) cos x/x = O(1)
(i) ∫_4^x dt/t ∼ log x
(j) ∫_0^x e^{−t^2} dt = O(1)
(k) ∑_{j≤x} 1/j^2 = o(1)
(l) ∑_{j≤x} 1 ∼ x

3. Each of the three sums below defines a function of x. Beneath each sum there appears a list of five assertions about the rate of growth, as x → ∞, of the function that the sum defines. In each case state which of the five choices, if any, are true (note: more than one choice may be true).

h_1(x) = ∑_{j≤x} {1/j + 3/j^2 + 4/j^3}
(i) ∼ log x  (ii) = O(x)  (iii) ∼ 2 log x  (iv) = Θ(log x)  (v) = Ω(1)

h_2(x) = ∑_{j≤√x} {log j + j}
(i) ∼ x/2  (ii) = O(√x)  (iii) = Θ(√x log x)  (iv) = Ω(√x)  (v) = o(√x)

h_3(x) = ∑_{j≤√x} 1/√j
(i) = O(√x)  (ii) = Ω(x^{1/4})  (iii) = o(x^{1/4})  (iv) ∼ 2x^{1/4}  (v) = Θ(x^{1/4})

4. Of the ﬁve symbols of asymptotics O, o, ∼, Θ, Ω, which ones are transitive (e.g., if f = O(g) and g = O(h),

is f = O(h)?)?

5. The point of this exercise is that if f grows more slowly than g, then we can always find a third function h whose rate of growth is between that of f and of g. Precisely, prove the following: if f = o(g) then there is a function h such that f = o(h) and h = o(g). Give an explicit construction for the function h in terms of f and g.

6. {This exercise is a warmup for exercise 7.} Below there appear several mathematical propositions. In each case, write a proposition that is the negation of the given one. Furthermore, in the negation, do not use the word ‘not’ or any negation symbols. In each case the question is, ‘If this isn't true, then what is true?’
(a) ∃x > 0 ∋ f(x) ≠ 0
(b) ∀x > 0, f(x) > 0
(c) ∀x > 0, ∃ε > 0 ∋ f(x) < ε
(d) ∃x ≠ 0 ∋ ∀y < 0, f(y) < f(x)
(e) ∀x ∃y ∋ ∀z : g(x) < f(y)f(z)
(f) ∀ε > 0 ∃x ∋ ∀y > x : f(y) < ε

Can you formulate a general method for negating such propositions? Given a proposition that contains ‘∀,’ ‘∃,’ ‘∋,’ what rule would you apply in order to negate the proposition and leave the result in positive form (containing no negation symbols or ‘not’s)?

7. In this exercise we will work out the definition of the ‘Ω.’
(a) Write out the precise definition of the statement ‘lim_{x→∞} h(x) = 0’ (use ‘ε’s).
(b) Write out the negation of your answer to part (a), as a positive assertion.
(c) Use your answer to part (b) to give a positive definition of the assertion ‘f(x) ≠ o(g(x)),’ and thereby justify the definition of the ‘Ω’ symbol that was given in the text.

8. Arrange the following functions in increasing order of their rates of growth, for large n. That is, list them so that each one is ‘little oh’ of its successor:

2^{√n}, e^{log n^3}, n^{3.01}, 2^{n^2}, n^{1.6}, log n^3 + 1, √(n!), n^{3 log n}, n^3 log n, (log log n)^3, n^{.5} 2^n, (n + 4)^{12}.

9. Find a function f(x) such that f(x) = O(x^{1+ε}) is true for every ε > 0, but for which it is not true that f(x) = O(x).

10. Prove that the statement ‘f(n) = O((2 + ε)^n) for every ε > 0’ is equivalent to the statement ‘f(n) = o((2 + ε)^n) for every ε > 0.’

1.2 Positional number systems

This section will provide a brief review of the representation of numbers in diﬀerent bases. The usual

decimal system represents numbers by using the digits 0, 1, . . ., 9. For the purpose of representing whole

numbers we can imagine that the powers of 10 are displayed before us like this:

. . . , 100000, 10000, 1000, 100, 10, 1.

Then, to represent an integer we can specify how many copies of each power of 10 we would like to have. If

we write 237, for example, then that means that we want 2 100’s and 3 10’s and 7 1’s.

In general, if we write out the string of digits that represents a number in the decimal system, as d_m d_{m−1} ··· d_1 d_0, then the number that is being represented by that string of digits is

n = ∑_{i=0}^{m} d_i 10^i.

Now let’s try the binary system. Instead of using 10’s we’re going to use 2’s. So we imagine that the

powers of 2 are displayed before us, as

. . . , 512, 256, 128, 64, 32, 16, 8, 4, 2, 1.


To represent a number we will now specify how many copies of each power of 2 we would like to have. For instance, if we write 1101, then we want an 8, a 4 and a 1, so this must be the decimal number 13. We will write

(13)_{10} = (1101)_2

to mean that the number 13, in the base 10, is the same as the number 1101, in the base 2.

In the binary system (base 2) the only digits we will ever need are 0 and 1. What that means is that if

we use only 0’s and 1’s then we can represent every number n in exactly one way. The unique representation

of every number, is, after all, what we must expect and demand of any proposed system.

Let's elaborate on this last point. If we were allowed to use more digits than just 0's and 1's then we would be able to represent the number (13)_{10} as a binary number in a whole lot of ways. For instance, we might make the mistake of allowing digits 0, 1, 2, 3. Then 13 would be representable by 3·2^2 + 1·2^0 or by 2·2^2 + 2·2^1 + 1·2^0, etc.

So if we were to allow too many diﬀerent digits, then numbers would be representable in more than one

way by a string of digits.

If we were to allow too few diﬀerent digits then we would ﬁnd that some numbers have no representation

at all. For instance, if we were to use the decimal system with only the digits 0, 1, . . ., 8, then inﬁnitely many

numbers would not be able to be represented, so we had better keep the 9’s.

The general proposition is this.

Theorem 1.2.1. Let b > 1 be a positive integer (the ‘base’). Then every positive integer n can be written in one and only one way in the form

n = d_0 + d_1 b + d_2 b^2 + d_3 b^3 + ···

if the digits d_0, d_1, . . . lie in the range 0 ≤ d_i ≤ b − 1, for all i.

Remark: The theorem says, for instance, that in the base 10 we need the digits 0, 1, 2, . . . , 9, in the base

2 we need only 0 and 1, in the base 16 we need sixteen digits, etc.

Proof of the theorem: If b is fixed, the proof is by induction on n, the number being represented. Clearly the number 1 can be represented in one and only one way with the available digits (why?). Suppose, inductively, that every integer 1, 2, . . . , n − 1 is uniquely representable. Now consider the integer n. Define d = n mod b. Then d is one of the b permissible digits. By induction, the number n′ = (n − d)/b is uniquely representable, say

(n − d)/b = d_0 + d_1 b + d_2 b^2 + ···

Then clearly,

n = d + ((n − d)/b)·b = d + d_0 b + d_1 b^2 + d_2 b^3 + ···

is a representation of n that uses only the allowed digits.

Finally, suppose that n has some other representation in this form also. Then we would have

n = a_0 + a_1 b + a_2 b^2 + ··· = c_0 + c_1 b + c_2 b^2 + ···

Since a_0 and c_0 are both equal to n mod b, they are equal to each other. Hence the number n′ = (n − a_0)/b has two different representations, which contradicts the inductive assumption, since we have assumed the truth of the result for all n′ < n.
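The proof is effectively an algorithm: peel off the digit d = n mod b, replace n by (n − d)/b, and repeat. A short Python sketch of that procedure (an illustration; the function name is ours, not the book's):

```python
def digits(n, b):
    """Return the digits d_0, d_1, ... of n in base b, least significant first,
    exactly as in the inductive step of Theorem 1.2.1."""
    ds = []
    while n > 0:
        d = n % b            # d = n mod b, a permissible digit
        ds.append(d)
        n = (n - d) // b     # pass to n' = (n - d)/b
    return ds

print(digits(13, 2))     # [1, 0, 1, 1], i.e. (13)_10 = (1101)_2
print(digits(477, 8))    # [5, 3, 7],  i.e. (477)_10 = (735)_8
```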

The bases b that are the most widely used are, aside from 10, 2 (‘binary system’), 8 (‘octal system’)

and 16 (‘hexadecimal system’).

The binary system is extremely simple because it uses only two digits. This is very convenient if you’re

a computer or a computer designer, because the digits can be determined by some component being either

‘on’ (digit 1) or ‘oﬀ’ (digit 0). The binary digits of a number are called its bits or its bit string.


The octal system is popular because it provides a good way to remember and deal with the long bit

strings that the binary system creates. According to the theorem, in the octal system the digits that we

need are 0, 1, . . . , 7. For instance,

(735)_8 = (477)_{10}.

The captivating feature of the octal system is the ease with which we can convert between octal and binary.

If we are given the bit string of an integer n, then to convert it to octal, all we have to do is to group the

bits together in groups of three, starting with the least signiﬁcant bit, then convert each group of three bits,

independently of the others, into a single octal digit. Conversely, if the octal form of n is given, then the

binary form is obtainable by converting each octal digit independently into the three bits that represent it

in the binary system.

For example, given (1101100101)_2. To convert this binary number to octal, we group the bits in threes,

(1)(101)(100)(101)

starting from the right, and then we convert each triple into a single octal digit, thereby getting

(1101100101)_2 = (1545)_8.

If you’re a working programmer it’s very handy to use the shorter octal strings to remember, or to write

down, the longer binary strings, because of the space saving, coupled with the ease of conversion back and

forth.

The hexadecimal system (base 16) is like octal, only more so. The conversion back and forth to binary

now uses groups of four bits, rather than three. In hexadecimal we will need, according to the theorem

above, 16 digits. We have handy names for the ﬁrst 10 of these, but what shall we call the ‘digits 10 through

15’ ? The names that are conventionally used for them are ‘A,’ ‘B,’...,‘F.’ We have, for example,

(A52C)_{16} = 10(4096) + 5(256) + 2(16) + 12
            = (42284)_{10}
            = (1010)_2 (0101)_2 (0010)_2 (1100)_2
            = (1010010100101100)_2
            = (1)(010)(010)(100)(101)(100)
            = (122454)_8.
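The chain of conversions above is easy to confirm with Python's base-aware integer parsing and formatting (an illustration only; `int(s, base)` parses, and format codes `b`, `o`, `x` print binary, octal, and hexadecimal):

```python
# Re-check the worked example (A52C)_16 in decimal, binary, and octal.
n = int("A52C", 16)
print(n)                 # 42284
print(format(n, "b"))    # 1010010100101100
print(format(n, "o"))    # 122454
```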

Exercises for section 1.2

1. Prove that conversion from octal to binary is correctly done by converting each octal digit to a binary

triple and concatenating the resulting triples. Generalize this theorem to other pairs of bases.

2. Carry out the conversions indicated below.

(a) (737)_{10} = (?)_3
(b) (101100)_2 = (?)_{16}
(c) (3377)_8 = (?)_{16}
(d) (ABCD)_{16} = (?)_{10}
(e) (BEEF)_{16} = (?)_8

3. Write a procedure convert (n, b:integer, digitstr:string), that will ﬁnd the string of digits that represents

n in the base b.


1.3 Manipulations with series

In this section we will look at operations with power series, including multiplying them and ﬁnding their

sums in simple form. We begin with a little catalogue of some power series that are good to know. First we

have the ﬁnite geometric series

(1 − x^n)/(1 − x) = 1 + x + x^2 + ··· + x^{n−1}.    (1.3.1)

This equation is valid certainly for all x ≠ 1, and it remains true when x = 1 also if we take the limit indicated on the left side.

Why is (1.3.1) true? Just multiply both sides by 1 − x to clear of fractions. The result is

1 − x^n = (1 + x + x^2 + x^3 + ··· + x^{n−1})(1 − x)
        = (1 + x + x^2 + ··· + x^{n−1}) − (x + x^2 + x^3 + ··· + x^n)
        = 1 − x^n

and the proof is finished.

Now try this one. What is the value of the sum

∑_{j=0}^{9} 3^j ?

Observe that we are looking at the right side of (1.3.1) with x = 3. Therefore the answer is (3^{10} − 1)/2. Try

to get used to the idea that a series in powers of x becomes a number if x is replaced by a number, and if

we know a formula for the sum of the series then we know the number that it becomes.
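For instance, the little evaluation above takes one line to confirm (a Python illustration, not part of the text):

```python
# Geometric series check: sum of 3^j for j = 0..9 equals (3^10 - 1)/2.
s = sum(3**j for j in range(10))
print(s, (3**10 - 1) // 2)   # both are 29524
```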

Here are some more series to keep in your zoo. A parenthetical remark like ‘(|x| < 1)’ shows the set of values of x for which the series converges.

∑_{k=0}^{∞} x^k = 1/(1 − x)    (|x| < 1)    (1.3.2)

e^x = ∑_{m=0}^{∞} x^m/m!    (1.3.3)

sin x = ∑_{r=0}^{∞} (−1)^r x^{2r+1}/(2r + 1)!    (1.3.4)

cos x = ∑_{s=0}^{∞} (−1)^s x^{2s}/(2s)!    (1.3.5)

log(1/(1 − x)) = ∑_{j=1}^{∞} x^j/j    (|x| < 1)    (1.3.6)

Can you find a simple form for the sum (the logarithms are ‘natural’)

1 + log 2 + (log 2)^2/2! + (log 2)^3/3! + ··· ?

Hint: Look at (1.3.3), and replace x by log 2.

Aside from merely substituting values of x into known series, there are many other ways of using known series to express sums in simple form. Let's think about the sum

1 + 2·2 + 3·4 + 4·8 + 5·16 + ··· + N·2^{N−1}.    (1.3.7)


We are reminded of the ﬁnite geometric series (1.3.1), but (1.3.7) is a little diﬀerent because of the multipliers

1, 2, 3, 4, . . ., N.

The trick is this. When confronted with a series that is similar to, but not identical with, a known

series, write down the known series as an equation, with the series on one side and its sum on the other.

Even though the unknown series involves a particular value of x, in this case x = 2, keep the known series

with its variable unrestricted. Then reach for an appropriate tool that will be applied to both sides of that

equation, and whose result will be that the known series will have been changed into the one whose sum we

needed.

In this case, since (1.3.7) reminds us of (1.3.1), we'll begin by writing down (1.3.1) again,

(1 − x^n)/(1 − x) = 1 + x + x^2 + ··· + x^{n−1}.    (1.3.8)

Don't replace x by 2 yet, just walk up to the equation (1.3.8) carrying your tool kit and ask what kind of surgery you could do to both sides of (1.3.8) that would be helpful in evaluating the unknown (1.3.7).

We are going to reach into our tool kit and pull out ‘d/dx.’ In other words, we are going to differentiate (1.3.8). The reason for choosing differentiation is that it will put the missing multipliers 1, 2, 3, . . . , N into (1.3.8). After differentiation, (1.3.8) becomes

1 + 2x + 3x^2 + 4x^3 + ··· + (n − 1)x^{n−2} = (1 − nx^{n−1} + (n − 1)x^n)/(1 − x)^2.    (1.3.9)

Now it's easy. To evaluate the sum (1.3.7), all we have to do is to substitute x = 2, n = N + 1 in (1.3.9), to obtain, after simplifying the right-hand side,

1 + 2·2 + 3·4 + 4·8 + ··· + N·2^{N−1} = 1 + (N − 1)2^N.    (1.3.10)
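The identity (1.3.10) is exact and integer-valued, so it can be spot-checked directly (a Python illustration, not part of the text):

```python
# Verify (1.3.10): 1 + 2*2 + 3*4 + ... + N*2^(N-1) equals 1 + (N - 1)*2^N.
for N in range(1, 20):
    lhs = sum(j * 2**(j - 1) for j in range(1, N + 1))
    assert lhs == 1 + (N - 1) * 2**N
print("identity (1.3.10) holds for N = 1..19")
```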

Next try this one:

1/(2·3^2) + 1/(3·3^3) + ···    (1.3.11)

If we rewrite the series using summation signs, it becomes

∑_{j=2}^{∞} 1/(j 3^j).

Comparison with the series zoo shows great resemblance to the species (1.3.6). In fact, if we put x = 1/3 in (1.3.6) it tells us that

∑_{j=1}^{∞} 1/(j 3^j) = log(3/2).    (1.3.12)

The desired sum (1.3.11) is the result of dropping the term with j = 1 from (1.3.12), which shows that the sum in (1.3.11) is equal to log(3/2) − 1/3.

In general, suppose that f(x) = ∑ a_n x^n is some series that we know. Then ∑ n a_n x^{n−1} = f′(x) and ∑ n a_n x^n = x f′(x). In other words, if the nth coefficient is multiplied by n, then the function changes from f to (x d/dx)f. If we apply the rule again, we find that multiplying the nth coefficient of a power series by n^2 changes the sum from f to (x d/dx)^2 f. That is,

∑_{j=0}^{∞} j^2 x^j/j! = (x d/dx)(x d/dx) e^x = (x d/dx)(x e^x) = (x^2 + x) e^x.


Similarly, multiplying the nth coefficient of a power series by n^p will change the sum from f(x) to (x d/dx)^p f(x), but that's not all. What happens if we multiply the coefficient of x^n by, say, 3n^2 + 2n + 5? If the sum previously was f(x), then it will be changed to {3(x d/dx)^2 + 2(x d/dx) + 5} f(x). The sum

∑_{j=0}^{∞} (2j^2 + 5) x^j

is therefore equal to {2(x d/dx)^2 + 5}{1/(1 − x)}, and after doing the differentiations we find the answer in the form (7x^2 − 8x + 5)/(1 − x)^3.

Here is the general rule: if P(x) is any polynomial then

∑_j P(j) a_j x^j = P(x d/dx) {∑_j a_j x^j}.    (1.3.13)
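The worked example of the rule can be checked numerically at any point inside the circle of convergence. A Python sketch (illustration only; the cutoff 200 is an arbitrary truncation, harmless because the tail at x = 1/2 is negligible):

```python
# Check that sum (2j^2 + 5) x^j equals (7x^2 - 8x + 5)/(1 - x)^3 at x = 1/2.
x = 0.5
lhs = sum((2 * j * j + 5) * x**j for j in range(200))   # partial sum; tiny tail
rhs = (7 * x * x - 8 * x + 5) / (1 - x)**3
print(lhs, rhs)   # both come out to 22.0 (up to rounding)
```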

Exercises for section 1.3

1. Find simple, explicit formulas for the sums of each of the following series.

(a) ∑_{j≥3} (log 6)^j/j!
(b) ∑_{m>1} (2m + 7)/5^m
(c) ∑_{j=0}^{19} j/2^j
(d) 1 − x/2! + x^2/4! − x^3/6! + ···
(e) 1 − 1/3^2 + 1/3^4 − 1/3^6 + ···
(f) ∑_{m=2}^{∞} (m^2 + 3m + 2)/m!

2. Explain why ∑_{r≥0} (−1)^r π^{2r+1}/(2r + 1)! = 0.

3. Find the coefficient of t^n in the series expansion of each of the following functions about t = 0.
(a) (1 + t + t^2) e^t
(b) (3t − t^2) sin t
(c) (t + 1)^2/(t − 1)^2

1.4 Recurrence relations

A recurrence relation is a formula that permits us to compute the members of a sequence one after

another, starting with one or more given values.

Here is a small example. Suppose we are to find an infinite sequence of numbers x_0, x_1, . . . by means of

x_{n+1} = c x_n    (n ≥ 0; x_0 = 1).    (1.4.1)

This relation tells us that x_1 = c x_0, and x_2 = c x_1, etc., and furthermore that x_0 = 1. It is then clear that

x_1 = c, x_2 = c^2, . . . , x_n = c^n, . . .

We say that the solution of the recurrence relation (= ‘difference equation’) (1.4.1) is given by x_n = c^n for all n ≥ 0. Equation (1.4.1) is a first-order recurrence relation because a new value of the sequence is computed from just one preceding value (i.e., x_{n+1} is obtained solely from x_n, and does not involve x_{n−1} or any earlier values).

Observe the format of the equation (1.4.1). The parenthetical remarks are essential. The first one ‘n ≥ 0’ tells us for what values of n the recurrence formula is valid, and the second one ‘x_0 = 1’ gives the starting value. If one of these is missing, the solution may not be uniquely determined. The recurrence relation

x_{n+1} = x_n + x_{n−1}    (1.4.2)

needs two starting values in order to ‘get going,’ but it is missing both of those starting values and the range of n. Consequently (1.4.2) (which is a second-order recurrence) does not uniquely determine the sequence.


The situation is rather similar to what happens in the theory of ordinary diﬀerential equations. There,

if we omit initial or boundary values, then the solutions are determined only up to arbitrary constants.

Beyond the simple (1.4.1), the next level of difficulty occurs when we consider a first-order recurrence relation with a variable multiplier, such as

x_{n+1} = b_{n+1} x_n    (n ≥ 0; x_0 given).    (1.4.3)

Now {b_1, b_2, . . .} is a given sequence, and we are being asked to find the unknown sequence {x_1, x_2, . . .}. In an easy case like this we can write out the first few x's and then guess the answer. We find, successively, that x_1 = b_1 x_0, then x_2 = b_2 x_1 = b_2 b_1 x_0 and x_3 = b_3 x_2 = b_3 b_2 b_1 x_0, etc. At this point we can guess that the solution is

x_n = {∏_{i=1}^{n} b_i} x_0    (n = 0, 1, 2, . . .).    (1.4.4)

Since that wasn't hard enough, we'll raise the ante a step further. Suppose we want to solve the first-order inhomogeneous (because x_n = 0 for all n is not a solution) recurrence relation

x_{n+1} = b_{n+1} x_n + c_{n+1}    (n ≥ 0; x_0 given).    (1.4.5)

Now we are being given two sequences b_1, b_2, . . . and c_1, c_2, . . ., and we want to find the x's. Suppose we follow the strategy that has so far won the game, that is, writing down the first few x's and trying to guess the pattern. Then we would find that x_1 = b_1 x_0 + c_1, x_2 = b_2 b_1 x_0 + b_2 c_1 + c_2, and we would probably tire rapidly.

Here is a somewhat more orderly approach to (1.4.5). Though no approach will avoid the unpleasant

form of the general answer, the one that we are about to describe at least gives a method that is much

simpler than the guessing strategy, for many examples that arise in practice. In this book we are going to

run into several equations of the type of (1.4.5), so a uniﬁed method will be a deﬁnite asset.

The first step is to define a new unknown function as follows. Let

x_n = b_1 b_2 ··· b_n y_n    (n ≥ 1; x_0 = y_0)    (1.4.6)

define a new unknown sequence y_1, y_2, . . . Now substitute for x_n in (1.4.5), getting

b_1 b_2 ··· b_{n+1} y_{n+1} = b_{n+1} (b_1 b_2 ··· b_n) y_n + c_{n+1}.

We notice that the coefficients of y_{n+1} and of y_n are the same, and so we divide both sides by that coefficient. The result is the equation

y_{n+1} = y_n + d_{n+1}    (n ≥ 0; y_0 given)    (1.4.7)

where we have written d_{n+1} = c_{n+1}/(b_1 ··· b_{n+1}). Notice that the d's are known.

We haven't yet solved the recurrence relation. We have only changed to a new unknown function that satisfies a simpler recurrence (1.4.7). Now the solution of (1.4.7) is quite simple, because it says that each y is obtained from its predecessor by adding the next one of the d's. It follows that

y_n = y_0 + ∑_{j=1}^{n} d_j    (n ≥ 0).

We can now use (1.4.6) to reverse the change of variables to get back to the original unknowns x_0, x_1, . . ., and find that

x_n = (b_1 b_2 ··· b_n){x_0 + ∑_{j=1}^{n} d_j}    (n ≥ 1).    (1.4.8)
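The closed form (1.4.8) can be checked against direct iteration of (1.4.5). A Python sketch (illustration only; the particular b's, c's, and x_0 below are arbitrary choices, and exact rational arithmetic avoids any rounding question):

```python
from fractions import Fraction

b = [None, 2, 3, 5, 7, 11]     # b_1..b_5 (index 0 unused), chosen arbitrarily
c = [None, 1, 4, 1, 5, 9]      # c_1..c_5, chosen arbitrarily
x0 = Fraction(3)

# Direct iteration of x_{n+1} = b_{n+1} x_n + c_{n+1}.
xs = [x0]
for n in range(5):
    xs.append(b[n + 1] * xs[n] + c[n + 1])

# Formula (1.4.8): x_n = (b_1...b_n)(x_0 + sum d_j), with d_j = c_j/(b_1...b_j).
for n in range(1, 6):
    prod, total = Fraction(1), x0
    for j in range(1, n + 1):
        prod *= b[j]
        total += Fraction(c[j]) / prod
    assert prod * total == xs[n]
print("formula (1.4.8) agrees with direct iteration")
```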

It is not recommended that the reader memorize the solution that we have just obtained. It is recommended that the method by which the solution was found be mastered. It involves

(a) make a change of variables that leads to a new recurrence of the form (1.4.7), then
(b) solve that one by summation and
(c) go back to the original unknowns.

As an example, consider the first-order equation

x_{n+1} = 3x_n + n    (n ≥ 0; x_0 = 0).    (1.4.9)

The winning change of variable, from (1.4.6), is to let x_n = 3^n y_n. After substituting in (1.4.9) and simplifying, we find

y_{n+1} = y_n + n/3^{n+1}    (n ≥ 0; y_0 = 0).

Now by summation,

y_n = ∑_{j=1}^{n−1} j/3^{j+1}    (n ≥ 0).

Finally, since x_n = 3^n y_n we obtain the solution of (1.4.9) in the form

x_n = 3^n ∑_{j=1}^{n−1} j/3^{j+1}    (n ≥ 0).    (1.4.10)

This is quite an explicit answer, but the summation can, in fact, be completely removed by the same method

that you used to solve exercise 1(c) of section 1.3 (try it!).
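Before removing the summation, one can at least confirm that (1.4.10) matches the recurrence itself. A Python sketch (illustration only, using exact rationals):

```python
from fractions import Fraction

# Check (1.4.10) against direct iteration of x_{n+1} = 3 x_n + n, x_0 = 0.
x = 0
for n in range(12):
    closed = 3**n * sum(Fraction(j, 3**(j + 1)) for j in range(1, n))
    assert closed == x          # closed form equals the iterated value
    x = 3 * x + n               # advance the recurrence
print("closed form (1.4.10) agrees with the recurrence for n = 0..11")
```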

That pretty well takes care of first-order recurrence relations of the form x_{n+1} = b_{n+1} x_n + c_{n+1}, and it's time to move on to linear second-order (homogeneous) recurrence relations with constant coefficients. These are of the form

x_{n+1} = a x_n + b x_{n−1}    (n ≥ 1; x_0 and x_1 given).    (1.4.11)

If we think back to differential equations of second order with constant coefficients, we recall that there are always solutions of the form y(t) = e^{αt} where α is constant. Hence the road to the solution of such a differential equation begins by trying a solution of that form and seeing what the constant or constants α turn out to be.

Analogously, equation (1.4.11) calls for a trial solution of the form x_n = α^n. If we substitute x_n = α^n in (1.4.11) and cancel a common factor of α^{n−1} we obtain a quadratic equation for α, namely

α^2 = aα + b.    (1.4.12)

‘Usually’ this quadratic equation will have two distinct roots, say α_+ and α_−, and then the general solution of (1.4.11) will look like

x_n = c_1 α_+^n + c_2 α_−^n    (n = 0, 1, 2, . . .).    (1.4.13)

The constants c_1 and c_2 will be determined so that x_0, x_1 have their assigned values.

Example. The recurrence for the Fibonacci numbers is

F_{n+1} = F_n + F_{n−1}    (n ≥ 1; F_0 = F_1 = 1).    (1.4.14)

Following the recipe that was described above, we look for a solution in the form F_n = α^n. After substituting in (1.4.14) and cancelling common factors we find that the quadratic equation for α is, in this case, α^2 = α + 1. If we denote the two roots by α_+ = (1 + √5)/2 and α_− = (1 − √5)/2, then the general solution to the Fibonacci recurrence has been obtained, and it has the form (1.4.13). It remains to determine the constants c_1, c_2 from the initial conditions F_0 = F_1 = 1.

From the form of the general solution we have F_0 = 1 = c_1 + c_2 and F_1 = 1 = c_1 α_+ + c_2 α_−. If we solve these two equations in the two unknowns c_1, c_2 we find that c_1 = α_+/√5 and c_2 = −α_−/√5. Finally, we substitute these values of the constants into the form of the general solution, and obtain an explicit formula for the nth Fibonacci number,

F_n = (1/√5) [ ((1 + √5)/2)^{n+1} − ((1 − √5)/2)^{n+1} ]    (n = 0, 1, . . .).    (1.4.15)


The Fibonacci numbers are in fact 1, 1, 2, 3, 5, 8, 13, 21, 34, . . . It isn't even obvious that the formula (1.4.15) gives integer values for the F_n's. The reader should check that the formula indeed gives the first few F_n's correctly.
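That check can be carried out by machine as well. A Python sketch (illustration only; floating point is accurate enough here that rounding recovers the exact integers, at least for small n):

```python
import math

# Evaluate (1.4.15) in floating point and round; compare with the recurrence
# (Wilf's convention: F_0 = F_1 = 1).
def fib_formula(n):
    s5 = math.sqrt(5)
    return round((((1 + s5) / 2)**(n + 1) - ((1 - s5) / 2)**(n + 1)) / s5)

fibs = [1, 1]
while len(fibs) < 20:
    fibs.append(fibs[-1] + fibs[-2])
assert [fib_formula(n) for n in range(20)] == fibs
print(fibs[:9])   # [1, 1, 2, 3, 5, 8, 13, 21, 34]
```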

Just to exercise our newly acquired skills in asymptotics, let’s observe that since (1 +

√

5)/2 > 1 and

[(1 −

√

5)/2[ < 1, it follows that when n is large we have

F

n

∼ ((1 +

√

5)/2)

n+1

/

√

5.
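As a quick check on (1.4.15), here is a small Python experiment (mine, not the book's) comparing the closed form against the recurrence; with the convention $F_0 = F_1 = 1$, the exponent $n+1$ is what makes the indices line up.

```python
from math import sqrt

def fib_recurrence(n):
    """F_0 = F_1 = 1; F_{n+1} = F_n + F_{n-1}, computed by iteration."""
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def fib_formula(n):
    """Closed form (1.4.15); rounding removes the floating-point error."""
    r = sqrt(5.0)
    ap, am = (1 + r) / 2, (1 - r) / 2
    return round((ap ** (n + 1) - am ** (n + 1)) / r)

# The formula gives integers that match the recurrence.
for n in range(30):
    assert fib_formula(n) == fib_recurrence(n)
print([fib_recurrence(n) for n in range(9)])  # [1, 1, 2, 3, 5, 8, 13, 21, 34]
```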

The process of looking for a solution in a certain form, namely in the form $\alpha^n$, is subject to the same kind of special treatment, in the case of repeated roots, that we find in differential equations. Corresponding to a double root $\alpha$ of the associated quadratic equation $\alpha^2 = a\alpha + b$ we would find two independent solutions $\alpha^n$ and $n\alpha^n$, so the general solution would be in the form $\alpha^n(c_1 + c_2 n)$.

Example. Consider the recurrence

$$x_{n+1} = 2x_n - x_{n-1} \qquad (n \ge 1;\ x_0 = 1;\ x_1 = 5). \qquad (1.4.16)$$

If we try a solution of the type $x_n = \alpha^n$, then we find that $\alpha$ satisfies the quadratic equation $\alpha^2 = 2\alpha - 1$. Hence the 'two' roots are 1 and 1. The general solution is $x_n = 1^n(c_1 + nc_2) = c_1 + c_2 n$. After inserting the given initial conditions, we find that

$$x_0 = 1 = c_1; \qquad x_1 = 5 = c_1 + c_2.$$

If we solve for $c_1$ and $c_2$ we obtain $c_1 = 1$, $c_2 = 4$, and therefore the complete solution of the recurrence (1.4.16) is given by $x_n = 4n + 1$.
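The two cases, distinct roots and a double root, can be folded into one small solver. The following Python sketch (my illustration, written under the simplifying assumption that the characteristic roots are real) reproduces the answer $x_n = 4n + 1$ for (1.4.16).

```python
from math import sqrt

def solve_recurrence(a, b, x0, x1, n):
    """Return x_n for x_{m+1} = a*x_m + b*x_{m-1}, by the trial solution
    x_m = alpha**m; a double root gives alpha**m * (c1 + c2*m) instead."""
    disc = a * a + 4 * b              # discriminant of alpha^2 = a*alpha + b
    if disc != 0:                     # two distinct (assumed real) roots
        r = sqrt(disc)
        ap, am = (a + r) / 2, (a - r) / 2
        c2 = (x1 - x0 * ap) / (am - ap)
        c1 = x0 - c2
        return c1 * ap ** n + c2 * am ** n
    alpha = a / 2                     # double root: x_m = alpha^m (c1 + c2 m)
    c1 = x0
    c2 = x1 / alpha - c1
    return alpha ** n * (c1 + c2 * n)

# Example (1.4.16): a=2, b=-1, x0=1, x1=5 gives x_n = 4n + 1.
assert round(solve_recurrence(2, -1, 1, 5, 10)) == 41
# Fibonacci: a=b=1, x0=x1=1.
assert round(solve_recurrence(1, 1, 1, 1, 9)) == 55
```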

Now let's look at recurrent inequalities, like this one:

$$x_{n+1} \le x_n + x_{n-1} + n^2 \qquad (n \ge 1;\ x_0 = 0;\ x_1 = 0). \qquad (1.4.17)$$

The question is, what restriction is placed on the growth of the sequence $\{x_n\}$ by (1.4.17)?

By analogy with the case of difference equations with constant coefficients, the thing to try here is $x_n \le K\alpha^n$. So suppose it is true that $x_n \le K\alpha^n$ for all $n = 0, 1, 2, \ldots, N$. Then from (1.4.17) with $n = N$ we find

$$x_{N+1} \le K\alpha^N + K\alpha^{N-1} + N^2.$$

Let $c$ be the positive real root of the equation $c^2 = c + 1$, and suppose that $\alpha > c$. Then $\alpha^2 > \alpha + 1$ and so $\alpha^2 - \alpha - 1 = t$, say, where $t > 0$. Hence

$$x_{N+1} \le K\alpha^{N-1}(1+\alpha) + N^2 = K\alpha^{N-1}(\alpha^2 - t) + N^2 = K\alpha^{N+1} - (tK\alpha^{N-1} - N^2). \qquad (1.4.18)$$

In order to insure that $x_{N+1} < K\alpha^{N+1}$ what we need is for $tK\alpha^{N-1} > N^2$. Hence as long as we choose

$$K > \max_{N \ge 2}\left\{\frac{N^2}{t\alpha^{N-1}}\right\}, \qquad (1.4.19)$$

in which the right member is clearly finite, the inductive step will go through.

The conclusion is that (1.4.17) implies that for every fixed $\epsilon > 0$, $x_n = O((c+\epsilon)^n)$, where $c = (1+\sqrt{5})/2$.
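A numerical experiment (not a substitute for the induction above) makes this conclusion plausible: iterating (1.4.17) with equality, the ratio $x_n/c^n$ settles down to a constant, consistent with growth no faster than $(c+\epsilon)^n$.

```python
from math import sqrt

c = (1 + sqrt(5)) / 2                 # positive root of c^2 = c + 1

# Iterate x_{n+1} = x_n + x_{n-1} + n^2 (the extreme case of (1.4.17)).
x = [0, 0]
for n in range(1, 60):
    x.append(x[n] + x[n - 1] + n * n)

# The ratios x_n / c^n are bounded and nearly constant for large n,
# because the n^2 forcing term is o(c^n).
ratios = [x[n] / c ** n for n in range(40, 60)]
assert max(ratios) / min(ratios) < 1.01
```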

The same argument applies to the general situation that is expressed in


Theorem 1.4.1. Let a sequence $\{x_n\}$ satisfy a recurrent inequality of the form

$$x_{n+1} \le b_0 x_n + b_1 x_{n-1} + \cdots + b_p x_{n-p} + G(n) \qquad (n \ge p)$$

where $b_i \ge 0\ (\forall i)$, $\sum_i b_i > 1$. Further, let $c$ be the positive real root* of the equation $c^{p+1} = b_0 c^p + \cdots + b_p$. Finally, suppose $G(n) = o(c^n)$. Then for every fixed $\epsilon > 0$ we have $x_n = O((c + \epsilon)^n)$.

Proof: Fix $\epsilon > 0$, and let $\alpha = c + \epsilon$, where $c$ is the root of the equation shown in the statement of the theorem. Since $\alpha > c$, if we let

$$t = \alpha^{p+1} - b_0\alpha^p - \cdots - b_p$$

then $t > 0$. Finally, define

$$K = \max\left\{|x_0|,\ \frac{|x_1|}{\alpha},\ \ldots,\ \frac{|x_p|}{\alpha^p},\ \max_{n \ge p}\left\{\frac{G(n)}{t\alpha^{n-p}}\right\}\right\}.$$

Then $K$ is finite, and clearly $|x_j| \le K\alpha^j$ for $j \le p$. We claim that $|x_n| \le K\alpha^n$ for all $n$, which will complete the proof.

Indeed, if the claim is true for $0, 1, 2, \ldots, n$, then

$$\begin{aligned}
|x_{n+1}| &\le b_0|x_n| + \cdots + b_p|x_{n-p}| + G(n)\\
&\le b_0 K\alpha^n + \cdots + b_p K\alpha^{n-p} + G(n)\\
&= K\alpha^{n-p}\{b_0\alpha^p + \cdots + b_p\} + G(n)\\
&= K\alpha^{n-p}\{\alpha^{p+1} - t\} + G(n)\\
&= K\alpha^{n+1} - \{tK\alpha^{n-p} - G(n)\}\\
&\le K\alpha^{n+1}.
\end{aligned}$$

Exercises for section 1.4

1. Solve the following recurrence relations

(i) $x_{n+1} = x_n + 3 \quad (n \ge 0;\ x_0 = 1)$

(ii) $x_{n+1} = x_n/3 + 2 \quad (n \ge 0;\ x_0 = 0)$

(iii) $x_{n+1} = 2nx_n + 1 \quad (n \ge 0;\ x_0 = 0)$

(iv) $x_{n+1} = ((n+1)/n)x_n + n + 1 \quad (n \ge 1;\ x_1 = 5)$

(v) $x_{n+1} = x_n + x_{n-1} \quad (n \ge 1;\ x_0 = 0;\ x_1 = 3)$

(vi) $x_{n+1} = 3x_n - 2x_{n-1} \quad (n \ge 1;\ x_0 = 1;\ x_1 = 3)$

(vii) $x_{n+1} = 4x_n - 4x_{n-1} \quad (n \ge 1;\ x_0 = 1;\ x_1 = \xi)$

2. Find $x_1$ if the sequence $x$ satisfies the Fibonacci recurrence relation and if furthermore $x_0 = 1$ and $x_n = o(1)\ (n \to \infty)$.

3. Let $x_n$ be the average number of trailing 0's in the binary expansions of all integers $0, 1, 2, \ldots, 2^n - 1$. Find a recurrence relation satisfied by the sequence $\{x_n\}$, solve it, and evaluate $\lim_{n\to\infty} x_n$.

4. For what values of $a$ and $b$ is it true that no matter what the initial values $x_0$, $x_1$ are, the solution of the recurrence relation $x_{n+1} = ax_n + bx_{n-1}\ (n \ge 1)$ is guaranteed to be $o(1)\ (n \to \infty)$?

5. Suppose $x_0 = 1$, $x_1 = 1$, and for all $n \ge 2$ it is true that $x_{n+1} \le x_n + x_{n-1}$. Is it true that $\forall n: x_n \le F_n$? Prove your answer.

6. Generalize the result of exercise 5, as follows. Suppose $x_0 = y_0$ and $x_1 = y_1$, where $y_{n+1} = ay_n + by_{n-1}\ (\forall n \ge 1)$. If furthermore, $x_{n+1} \le ax_n + bx_{n-1}\ (\forall n \ge 1)$, can we conclude that $\forall n: x_n \le y_n$? If not, describe conditions on $a$ and $b$ under which that conclusion would follow.

7. Find the asymptotic behavior in the form $x_n \sim\ ?\ (n \to \infty)$ of the right side of (1.4.10).

* See exercise 10, below.


8. Write out a complete proof of theorem 1.4.1.

9. Show by an example that the conclusion of theorem 1.4.1 may be false if the phrase 'for every fixed $\epsilon > 0$ …' were replaced by 'for every fixed $\epsilon \ge 0$ ….'

10. In theorem 1.4.1 we find the phrase '… the positive real root of …' Prove that this phrase is justified, in that the equation shown always has exactly one positive real root. Exactly what special properties of that equation did you use in your proof?

1.5 Counting

For a given positive integer $n$, consider the set $\{1, 2, \ldots, n\}$. We will denote this set by the symbol $[n]$, and we want to discuss the number of subsets of various kinds that it has. Here is a list of all of the subsets of $[2]$: $\emptyset$, $\{1\}$, $\{2\}$, $\{1, 2\}$. There are 4 of them.

We claim that the set $[n]$ has exactly $2^n$ subsets.

To see why, notice that we can construct the subsets of $[n]$ in the following way. Either choose, or don't choose, the element '1,' then either choose, or don't choose, the element '2,' etc., finally choosing, or not choosing, the element '$n$.' Each of the $n$ choices that you encountered could have been made in either of 2 ways. The totality of $n$ choices, therefore, might have been made in $2^n$ different ways, so that is the number of subsets that a set of $n$ objects has.
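The choose-or-don't-choose construction can be carried out literally. Here is a Python sketch (my illustration, not from the text) that builds every subset of $[n]$ by making the $n$ binary decisions, and confirms the count $2^n$.

```python
def subsets(n):
    """All subsets of [n] = {1, ..., n}, built by deciding, for each
    element in turn, whether or not to include it."""
    result = [[]]
    for element in range(1, n + 1):
        # each partial subset splits into two: without and with the element
        result = result + [s + [element] for s in result]
    return result

assert len(subsets(2)) == 4          # [], [1], [2], [1, 2]
assert all(len(subsets(n)) == 2 ** n for n in range(8))
```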

Next, suppose we have $n$ distinct objects, and we want to arrange them in a sequence. In how many ways can we do that? For the first object in our sequence we may choose any one of the $n$ objects. The second element of the sequence can be any of the remaining $n-1$ objects, so there are $n(n-1)$ possible ways to make the first two decisions. Then there are $n-2$ choices for the third element, and so we have $n(n-1)(n-2)$ ways to arrange the first three elements of the sequence. It is no doubt clear now that there are exactly $n(n-1)(n-2)\cdots 3\cdot 2\cdot 1 = n!$ ways to form the whole sequence.

Of the $2^n$ subsets of $[n]$, how many have exactly $k$ objects in them? The number of elements in a set is called its cardinality. The cardinality of a set $S$ is denoted by $|S|$, so, for example, $|[6]| = 6$. A set whose cardinality is $k$ is called a '$k$-set,' and a subset of cardinality $k$ is, naturally enough, a '$k$-subset.' The question is, for how many subsets $S$ of $[n]$ is it true that $|S| = k$?

We can construct $k$-subsets $S$ of $[n]$ (written '$S \subseteq [n]$') as follows. Choose an element $a_1$ ($n$ possible choices). Of the remaining $n-1$ elements, choose one ($n-1$ possible choices), etc., until a sequence of $k$ different elements have been chosen. Obviously there were $n(n-1)(n-2)\cdots(n-k+1)$ ways in which we might have chosen that sequence, so the number of ways to choose an (ordered) sequence of $k$ elements from $[n]$ is

$$n(n-1)(n-2)\cdots(n-k+1) = n!/(n-k)!.$$

But there are more sequences of $k$ elements than there are $k$-subsets, because any particular $k$-subset $S$ will correspond to $k!$ different ordered sequences, namely all possible rearrangements of the elements of the subset. Hence the number of $k$-subsets of $[n]$ is equal to the number of $k$-sequences divided by $k!$. In other words, there are exactly $n!/k!(n-k)!$ $k$-subsets of a set of $n$ objects.

The quantities $n!/k!(n-k)!$ are the famous binomial coefficients, and they are denoted by

$$\binom{n}{k} = \frac{n!}{k!(n-k)!} \qquad (n \ge 0;\ 0 \le k \le n). \qquad (1.5.1)$$

Some of their special values are

$$\binom{n}{0} = 1\ (\forall n \ge 0); \quad \binom{n}{1} = n\ (\forall n \ge 0); \quad \binom{n}{2} = \frac{n(n-1)}{2}\ (\forall n \ge 0); \quad \binom{n}{n} = 1\ (\forall n \ge 0).$$

It is convenient to define $\binom{n}{k}$ to be 0 if $k < 0$ or if $k > n$.

We can summarize the developments so far with


Theorem 1.5.1. For each $n \ge 0$, a set of $n$ objects has exactly $2^n$ subsets, and of these, exactly $\binom{n}{k}$ have cardinality $k$ ($\forall k = 0, 1, \ldots, n$). There are exactly $n!$ different sequences that can be formed from a set of $n$ distinct objects.

Since every subset of $[n]$ has some cardinality, it follows that

$$\sum_{k=0}^{n} \binom{n}{k} = 2^n \qquad (n = 0, 1, 2, \ldots). \qquad (1.5.2)$$

In view of the convention that we adopted, we might have written (1.5.2) as $\sum_k \binom{n}{k} = 2^n$, with no restriction on the range of the summation index $k$. It would then have been understood that the range of $k$ is from $-\infty$ to $\infty$, and that the binomial coefficient $\binom{n}{k}$ vanishes unless $0 \le k \le n$.

In Table 1.5.1 we show the values of some of the binomial coefficients $\binom{n}{k}$. The rows of the table are thought of as labelled '$n = 0$,' '$n = 1$,' etc., and the entries within each row refer, successively, to $k = 0, 1, \ldots, n$. The table is called 'Pascal's triangle.'

1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
1 7 21 35 35 21 7 1
1 8 28 56 70 56 28 8 1
…

Table 1.5.1: Pascal's triangle

Here are some facts about the binomial coefficients:

(a) Each row of Pascal's triangle is symmetric about the middle. That is,

$$\binom{n}{k} = \binom{n}{n-k} \qquad (0 \le k \le n;\ n \ge 0).$$

(b) The sum of the entries in the $n$th row of Pascal's triangle is $2^n$.

(c) Each entry is equal to the sum of the two entries that are immediately above it in the triangle.

The proof of (c) above can be interesting. What it says about the binomial coefficients is that

$$\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k} \qquad ((n, k) \ne (0, 0)). \qquad (1.5.3)$$

There are (at least) two ways to prove (1.5.3). The hammer-and-tongs approach would consist of expanding each of the three binomial coefficients that appears in (1.5.3), using the definition (1.5.1) in terms of factorials, and then cancelling common factors to complete the proof.

That would work (try it), but here's another way. Contemplate (this proof is by contemplation) the totality of $k$-subsets of $[n]$. The number of them is on the left side of (1.5.3). Sort them out into two piles: those $k$-subsets that contain '1' and those that don't. If a $k$-subset of $[n]$ contains '1,' then its remaining $k-1$ elements can be chosen in $\binom{n-1}{k-1}$ ways, and that accounts for the first term on the right of (1.5.3). If a $k$-subset does not contain '1,' then its $k$ elements are all chosen from $[n-1]$, and that completes the proof of (1.5.3).
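Identity (1.5.3) is also the cheap way to tabulate binomial coefficients, since it needs only additions. A short Python illustration (mine, not the book's), checked against the factorial formula (1.5.1):

```python
from math import factorial

def pascal_row(n):
    """Row n of Pascal's triangle, built from (1.5.3):
    C(n, k) = C(n-1, k-1) + C(n-1, k)."""
    row = [1]
    for _ in range(n):
        # pad with zeros on both ends, matching the convention C(n,k)=0
        # for k < 0 or k > n, then add shifted copies of the previous row
        row = [a + b for a, b in zip([0] + row, row + [0])]
    return row

assert pascal_row(8) == [1, 8, 28, 56, 70, 56, 28, 8, 1]
# and it agrees with the factorial formula (1.5.1):
assert all(pascal_row(10)[k] == factorial(10) // (factorial(k) * factorial(10 - k))
           for k in range(11))
```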


The binomial theorem is the statement that $\forall n \ge 0$ we have

$$(1+x)^n = \sum_{k=0}^{n} \binom{n}{k} x^k. \qquad (1.5.4)$$

Proof: By induction on $n$. Eq. (1.5.4) is clearly true when $n = 0$, and if it is true for some $n$ then multiply both sides of (1.5.4) by $(1+x)$ to obtain

$$\begin{aligned}
(1+x)^{n+1} &= \sum_k \binom{n}{k} x^k + \sum_k \binom{n}{k} x^{k+1}\\
&= \sum_k \binom{n}{k} x^k + \sum_k \binom{n}{k-1} x^k\\
&= \sum_k \left\{ \binom{n}{k} + \binom{n}{k-1} \right\} x^k\\
&= \sum_k \binom{n+1}{k} x^k
\end{aligned}$$

which completes the proof.

Now let's ask how big the binomial coefficients are, as an exercise in asymptotics. We will refer to the coefficients in row $n$ of Pascal's triangle, that is, to

$$\binom{n}{0}, \binom{n}{1}, \ldots, \binom{n}{n}$$

as the coefficients of order $n$. Then, by (1.5.2) (or by (1.5.4) with $x = 1$), the sum of all of the coefficients of order $n$ is $2^n$. It is also fairly apparent, from an inspection of Table 1.5.1, that the largest one(s) of the coefficients of order $n$ is (are) the one(s) in the middle.

More precisely, if $n$ is odd, then the largest coefficients of order $n$ are $\binom{n}{(n-1)/2}$ and $\binom{n}{(n+1)/2}$, whereas if $n$ is even, the largest one is uniquely $\binom{n}{n/2}$.

It will be important, in some of the applications to algorithms later on in this book, for us to be able to pick out the largest term in a sequence of this kind, so let's see how we could prove that the biggest coefficients are the ones cited above.

For $n$ fixed, we will compute the ratio of the $(k+1)$st coefficient of order $n$ to the $k$th. We will see then that the ratio is larger than 1 if $k < (n-1)/2$ and is $< 1$ if $k > (n-1)/2$. That, of course, will imply that the $(k+1)$st coefficient is bigger than the $k$th, for such $k$, and therefore that the biggest one(s) must be in the middle.

The ratio is

$$\frac{\binom{n}{k+1}}{\binom{n}{k}} = \frac{n!/\{(k+1)!\,(n-k-1)!\}}{n!/\{k!\,(n-k)!\}} = \frac{k!\,(n-k)!}{(k+1)!\,(n-k-1)!} = \frac{n-k}{k+1}$$

and is $> 1$ iff $k < (n-1)/2$, as claimed.

OK, the biggest coefficients are in the middle, but how big are they? Let's suppose that $n$ is even, just to keep things simple. Then the biggest binomial coefficient of order $n$ is

$$\binom{n}{n/2} = \frac{n!}{(n/2)!^2} \sim \frac{(n/e)^n \sqrt{2n\pi}}{\left\{ (n/2e)^{n/2} \sqrt{n\pi} \right\}^2} = \sqrt{\frac{2}{n\pi}}\, 2^n \qquad (1.5.5)$$

where we have used Stirling's formula (1.1.10).

Equation (1.5.5) shows that the single biggest binomial coefficient accounts for a very healthy fraction of the sum of all of the coefficients of order $n$. Indeed, the sum of all of them is $2^n$, and the biggest one is $\sim \sqrt{2/n\pi}\, 2^n$. When $n$ is large, therefore, the largest coefficient contributes a fraction $\sim \sqrt{2/n\pi}$ of the total.

If we think in terms of the subsets that these coefficients count, what we will see is that a large fraction of all of the subsets of an $n$-set have cardinality $n/2$; in fact, $\Theta(n^{-0.5})$ of them do. This kind of probabilistic thinking can be very useful in the design and analysis of algorithms. If we are designing an algorithm that deals with subsets of $[n]$, for instance, we should recognize that a large percentage of the customers for that algorithm will have cardinalities near $n/2$, and make every effort to see that the algorithm is fast for such subsets, even at the expense of possibly slowing it down on subsets whose cardinalities are very small or very large.
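It is easy to watch (1.5.5) converge numerically; this small check (an illustration of mine, not part of the text) shows the relative error shrinking roughly like $1/(4n)$.

```python
from math import comb, pi, sqrt

# For even n, C(n, n/2) is close to sqrt(2/(n*pi)) * 2^n, as in (1.5.5),
# and the approximation improves as n grows.
for n in (10, 100, 1000):
    exact = comb(n, n // 2)
    approx = sqrt(2 / (n * pi)) * 2 ** n
    assert abs(exact / approx - 1) < 1 / n   # error is roughly 1/(4n)
```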

Exercises for section 1.5

1. How many subsets of even cardinality does $[n]$ have?

2. By observing that $(1+x)^a (1+x)^b = (1+x)^{a+b}$, prove that the sum of the squares of all binomial coefficients of order $n$ is $\binom{2n}{n}$.

3. Evaluate the following sums in simple form.

(i) $\sum_{j=0}^{n} j \binom{n}{j}$

(ii) $\sum_{j=3}^{n} \binom{n}{j} 5^j$

(iii) $\sum_{j=0}^{n} (j+1) 3^{j+1}$

4. Find, by direct application of Taylor's theorem, the power series expansion of $f(x) = 1/(1-x)^{m+1}$ about the origin. Express the coefficients as certain binomial coefficients.

5. Complete the following twiddles.

(i) $\binom{2n}{n} \sim\ ?$

(ii) $\binom{n}{\log_2 n} \sim\ ?$

(iii) $\binom{n}{\theta n} \sim\ ?$

(iv) $\binom{n^2}{n} \sim\ ?$

6. How many ordered pairs of unequal elements of $[n]$ are there?

7. Which one of the numbers $\left\{ 2^j \binom{n}{j} \right\}_{j=0}^{n}$ is the biggest?

1.6 Graphs

A graph is a collection of vertices, certain unordered pairs of which are called its edges. To describe a

particular graph we ﬁrst say what its vertices are, and then we say which pairs of vertices are its edges. The

set of vertices of a graph G is denoted by V (G), and its set of edges is E(G).

If v and w are vertices of a graph G, and if (v, w) is an edge of G, then we say that vertices v, w are

adjacent in G.

Consider the graph $G$ whose vertex set is $\{1, 2, 3, 4, 5\}$ and whose edges are the set of pairs (1,2), (2,3),

(3,4), (4,5), (1,5). This is a graph of 5 vertices and 5 edges. A nice way to present a graph to an audience

is to draw a picture of it, instead of just listing the pairs of vertices that are its edges. To draw a picture of

a graph we would ﬁrst make a point for each vertex, and then we would draw an arc between two vertices v

and w if and only if (v, w) is an edge of the graph that we are talking about. The graph G of 5 vertices and

5 edges that we listed above can be drawn as shown in Fig. 1.6.1(a). It could also be drawn as shown in

Fig. 1.6.1(b). They’re both the same graph. Only the pictures are diﬀerent, but the pictures aren’t ‘really’

the graph; the graph is the vertex list and the edge list. The pictures are helpful to us in visualizing and

remembering the graph, but that’s all.

Fig. 1.6.1(a)    Fig. 1.6.1(b)

The number of edges that contain ('are incident with') a particular vertex $v$ of a graph $G$ is called the degree of that vertex, and is usually denoted by $\rho(v)$. If we add up the degrees of every vertex $v$ of $G$ we will have counted exactly two contributions from each edge of $G$, one at each of its endpoints. Hence, for every graph $G$ we have

$$\sum_{v \in V(G)} \rho(v) = 2|E(G)|. \qquad (1.6.1)$$

Since the right-hand side is an even number, there must be an even number of odd numbers on the left side of (1.6.1). We have therefore proved that every graph has an even number of vertices whose degrees are odd.*

In Fig. 1.6.1 the degrees of the vertices are $\{2, 2, 2, 2, 2\}$ and the sum of the degrees is $10 = 2|E(G)|$.
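Equation (1.6.1) is easy to verify mechanically. The snippet below (my illustration, not the book's) checks it, and the even-number-of-odd-degrees corollary, on the 5-cycle of Fig. 1.6.1.

```python
from collections import defaultdict

# The 5-vertex, 5-edge graph described above.
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (1, 5)]

degree = defaultdict(int)
for v, w in edges:
    degree[v] += 1     # each edge contributes to the degree
    degree[w] += 1     # of both of its endpoints

assert sum(degree.values()) == 2 * len(edges)                  # 10 = 2|E(G)|
assert sum(1 for d in degree.values() if d % 2 == 1) % 2 == 0  # evenly many odd
```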

Next we’re going to deﬁne a number of concepts of graph theory that will be needed in later chapters.

A fairly large number of terms will now be deﬁned, in rather a brief space. Don’t try to absorb them all

now, but read through them and look them over again when the concepts are actually used, in the sequel.

A path $T$ in a graph $G$ is a walk from one vertex of $G$ to another, where at each step the walk uses an edge of the graph. More formally, it is a sequence $\{v_1, v_2, \ldots, v_k\}$ of vertices of $G$ such that $\forall i = 1, \ldots, k-1:\ (v_i, v_{i+1}) \in E(G)$.

A graph is connected if there is a path between every pair of its vertices.

A path T is simple if its vertices are all distinct, Hamiltonian if it is simple and visits every vertex of G

exactly once, Eulerian if it uses every edge of G exactly once.

A subgraph of a graph G is a subset S of its vertices together with a subset of just those edges of G

both of whose endpoints lie in S. An induced subgraph of G is a subset S of the vertices of G together with

all edges of G both of whose endpoints lie in S. We would then speak of ‘the subgraph induced by S.’

In a graph G we can deﬁne an equivalence relation on the vertices as follows. Say that v and w are

equivalent if there is a path of G that joins them. Let S be one of the equivalence classes of vertices of G

under this relation. The subgraph of G that S induces is called a connected component of the graph G. A

graph is connected if and only if it has exactly one connected component.

A cycle is a closed path, i.e., one in which $v_k = v_1$. A cycle is a circuit if $v_1$ is the only repeated vertex in it. We may say that a circuit is a simple cycle. We speak of Hamiltonian and Eulerian circuits of $G$ as circuits of $G$ that visit, respectively, every vertex, or every edge, of a graph $G$.

Not every graph has a Hamiltonian path. The graph in Fig. 1.6.2(a) has one and the graph in Fig.

1.6.2(b) doesn’t.

Fig. 1.6.2(a) Fig. 1.6.2(b)

* Did you realize that the number of people who shook hands an odd number of times yesterday is an

even number of people?


Fig. 1.6.3(a) Fig. 1.6.3(b)

Likewise, not every graph has an Eulerian path. The graph in Fig. 1.6.3(a) has one and the graph in

Fig. 1.6.3(b) doesn’t.

There is a world of diﬀerence between Eulerian and Hamiltonian paths, however. If a graph G is given,

then thanks to the following elegant theorem of Euler, it is quite easy to decide whether or not G has an

Eulerian path. In fact, the theorem applies also to multigraphs, which are graphs except that they are allowed

to have several diﬀerent edges joining the same pair of vertices.

Theorem 1.6.1. A (multi-)graph has an Eulerian circuit (resp. path) if and only if it is connected and has

no (resp. has exactly two) vertices of odd degree.

Proof: Let G be a connected multigraph in which every vertex has even degree. We will ﬁnd an Eulerian

circuit in G. The proof for Eulerian paths will be similar, and is omitted.

The proof is by induction on the number of edges of G, and the result is clearly true if G has just one

edge.

Hence suppose the theorem is true for all such multigraphs of fewer than m edges, and let G have m

edges. We will construct an Eulerian circuit of G.

Begin at some vertex $v$ and walk along some edge to a vertex $w$. Generically, having arrived at a vertex $u$, depart from $u$ along an edge that hasn't been used yet, arriving at a new vertex, etc. The process halts when we arrive for the first time at a vertex $v'$ such that all edges incident with $v'$ have previously been walked on, so there is no exit.

We claim that $v' = v$, i.e., we're back where we started. Indeed, if not, then we arrived at $v'$ one more time than we departed from it, each time using a new edge, and finding no edges remaining at the end. Thus there were an odd number of edges of $G$ incident with $v'$, a contradiction.

Hence we are indeed back at our starting point when the walk terminates. Let $W$ denote the sequence of edges along which we have so far walked. If $W$ includes all edges of $G$ then we have found an Euler tour and we are finished.

Else there are edges of $G$ that are not in $W$. Erase all edges of $W$ from $G$, thereby obtaining a (possibly disconnected multi-) graph $G'$. Let $C_1, \ldots, C_k$ denote the connected components of $G'$. Each of them has only vertices of even degree because that was true of $G$ and of the walk $W$ that we subtracted from $G$. Since each of the $C_i$ has fewer edges than $G$ had, there is, by induction, an Eulerian circuit in each of the connected components of $G'$.

We will thread them all together to make such a circuit for $G$ itself.

Begin at the same $v$ and walk along 0 or more edges of $W$ until you arrive for the first time at a vertex $q$ of component $C_1$. This will certainly happen because $G$ is connected. Then follow the Euler tour of the edges of $C_1$, which will return you to vertex $q$. Then continue your momentarily interrupted walk $W$ until you reach for the first time a vertex of $C_2$, which will surely happen because $G$ is connected, etc., and the proof is complete.
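The walk-and-splice argument in this proof is, in effect, Hierholzer's algorithm. Here is a compact iterative Python version (my sketch, not the book's; it assumes the input is a connected multigraph in which every vertex has even degree, as in the theorem).

```python
from collections import defaultdict

def euler_circuit(edges):
    """Eulerian circuit of a connected multigraph with all degrees even,
    by 'walk until stuck, then splice in side tours' as in the proof."""
    adj = defaultdict(list)                # vertex -> list of (neighbor, edge id)
    for i, (v, w) in enumerate(edges):
        adj[v].append((w, i))
        adj[w].append((v, i))
    used = [False] * len(edges)
    stack, circuit = [edges[0][0]], []
    while stack:
        v = stack[-1]
        while adj[v] and used[adj[v][-1][1]]:
            adj[v].pop()                   # discard edges already walked
        if adj[v]:
            w, i = adj[v].pop()
            used[i] = True
            stack.append(w)                # keep walking
        else:
            circuit.append(stack.pop())    # stuck: back up, recording the tour
    return circuit[::-1]

# Two triangles sharing vertex 3: every degree is even.
edges = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5), (5, 3)]
tour = euler_circuit(edges)
assert len(tour) == len(edges) + 1 and tour[0] == tour[-1]
```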

It is extremely diﬃcult computationally to decide if a given graph has a Hamilton path or circuit. We

will see in Chapter 5 that this question is typical of a breed of problems that are the main subject of that

chapter, and are perhaps the most (in-)famous unsolved problems in theoretical computer science. Thanks

to Euler’s theorem (theorem 1.6.1) it is easy to decide if a graph has an Eulerian path or circuit.

Next we'd like to discuss graph coloring, surely one of the prettier parts of graph theory. Suppose that there are $K$ colors available to us, and that we are presented with a graph $G$. A proper coloring of the vertices of $G$ is an assignment of a color to each vertex of $G$ in such a way that $\forall e \in E(G)$ the colors of the two endpoints of $e$ are different. Fig. 1.6.4(a) shows a graph $G$ and an attempt to color its vertices properly in 3 colors ('R,' 'Y' and 'B'). The attempt failed because one of the edges of $G$ has had the same color assigned to both of its endpoints. In Fig. 1.6.4(b) we show the same graph with a successful proper coloring of its vertices in 4 colors.

Fig. 1.6.4(a) Fig. 1.6.4(b)

The chromatic number χ(G) of a graph G is the minimum number of colors that can be used in a proper

coloring of the vertices of G. A bipartite graph is a graph whose chromatic number is ≤ 2, i.e., it is a graph

that can be 2-colored. That means that the vertices of a bipartite graph can be divided into two classes ‘R’

and ‘Y’ such that no edge of the graph runs between two ‘R’ vertices or between two ‘Y’ vertices. Bipartite

graphs are most often drawn, as in Fig. 1.6.5, in two layers, with all edges running between layers.

Fig. 1.6.5: A bipartite graph

The complement $\overline{G}$ of a graph $G$ is the graph that has the same vertex set that $G$ has and has an edge exactly where $G$ does not have its edges. Formally,

$$E(\overline{G}) = \{ (v, w) \mid v, w \in V(G);\ v \ne w;\ (v, w) \notin E(G) \}.$$

Here are some special families of graphs that occur so often that they rate special names. The complete graph $K_n$ is the graph of $n$ vertices in which every possible one of the $\binom{n}{2}$ edges is actually present. Thus $K_2$ is a single edge, $K_3$ looks like a triangle, etc. The empty graph $\overline{K_n}$ consists of $n$ isolated vertices, i.e., has no edges at all.

The complete bipartite graph $K_{m,n}$ has a set $S$ of $m$ vertices and a set $T$ of $n$ vertices. Its edge set is $E(K_{m,n}) = S \times T$. It has $|E(K_{m,n})| = mn$ edges. The $n$-cycle, $C_n$, is a graph of $n$ vertices that are connected to form a single cycle. A tree is a graph that (a) is connected and (b) has no cycles. A tree is shown in Fig. 1.6.6.

Fig. 1.6.6: A tree


It is not hard to prove that the following are equivalent descriptions of a tree.

(a) A tree is a graph that is connected and has no cycles.

(b) A tree is a graph $G$ that is connected and for which $|E(G)| = |V(G)| - 1$.

(c) A tree is a graph $G$ with the property that between every pair of distinct vertices there is a unique path.

If G is a graph and S ⊆ V (G), then S is an independent set of vertices of G if no two of the vertices in

S are adjacent in G. An independent set S is maximal if it is not a proper subset of another independent set

of vertices of G. Dually, if a vertex subset S induces a complete graph, then we speak of a complete subgraph

of G. A maximal complete subgraph of G is called a clique.

A graph might be labeled or unlabeled. The vertices of a labeled graph are numbered 1, 2, . . . , n. One

diﬀerence that this makes is that there are a lot more labeled graphs than there are unlabeled graphs. There

are, for example, 3 labeled graphs that have 3 vertices and 1 edge. They are shown in Fig. 1.6.7.

Fig. 1.6.7: Three labeled graphs...

There is, however, only 1 unlabeled graph that has 3 vertices and 1 edge, as shown in Fig. 1.6.8.

Fig. 1.6.8: ... but only one unlabeled graph

Most counting problems on graphs are much easier for labeled than for unlabeled graphs. Consider the

following question: how many graphs are there that have exactly n vertices? Suppose ﬁrst that we mean

labeled graphs. A graph of $n$ vertices has a maximum of $\binom{n}{2}$ edges. To construct a graph we would decide which of these possible edges would be used. We can make each of these $\binom{n}{2}$ decisions independently, and for every way of deciding where to put the edges we would get a different graph. Therefore the number of labeled graphs of $n$ vertices is $2^{\binom{n}{2}} = 2^{n(n-1)/2}$.
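Since a labeled graph is just a subset of the $\binom{n}{2}$ possible edges, the count $2^{n(n-1)/2}$ can be confirmed by brute force for small $n$ (an illustration of mine, not from the text):

```python
from itertools import chain, combinations

def labeled_graphs(n):
    """Enumerate all labeled graphs on [n]: one graph per subset of the
    C(n,2) possible edges."""
    possible = list(combinations(range(1, n + 1), 2))
    return chain.from_iterable(combinations(possible, k)
                               for k in range(len(possible) + 1))

assert sum(1 for _ in labeled_graphs(3)) == 8    # 2^(3*2/2) = 2^3
assert sum(1 for _ in labeled_graphs(4)) == 64   # 2^(4*3/2) = 2^6
```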

If we were to ask the corresponding question for unlabeled graphs we would ﬁnd it to be very hard.

The answer is known, but the derivation involves Burnside’s lemma about the action of a group on a set,

and some fairly delicate counting arguments. We will state the approximate answer to this question, which

is easy to write out, rather than the exact answer, which is not. If $g_n$ is the number of unlabeled graphs of $n$ vertices then

$$g_n \sim 2^{\binom{n}{2}}/n!.$$

Exercises for section 1.6

1. Show that a tree is a bipartite graph.

2. Find the chromatic number of the n-cycle.

3. Describe how you would ﬁnd out, on a computer, if a given graph G is bipartite.

4. Given a positive integer K. Find two diﬀerent graphs each of whose chromatic numbers is K.

5. Exactly how many labeled graphs of n vertices and E edges are there?

6. In how many labeled graphs of $n$ vertices do vertices $\{1, 2, 3\}$ form an independent set?

7. How many cliques does an n-cycle have?

8. True or false: a Hamilton circuit is an induced cycle in a graph.

9. Which graph of n vertices has the largest number of independent sets? How many does it have?

10. Draw all of the connected, unlabeled graphs of 4 vertices.

11. Let $G$ be a bipartite graph that has $q$ connected components. Show that there are exactly $2^q$ ways to properly color the vertices of $G$ in 2 colors.

12. Find a graph G of n vertices, other than the complete graph, whose chromatic number is equal to 1 plus

the maximum degree of any vertex of G.


13. Let $n$ be a multiple of 3. Consider a labeled graph $G$ that consists of $n/3$ connected components, each of them a $K_3$. How many maximal independent sets does $G$ have?

14. Describe the complement of the graph G in exercise 13 above. How many cliques does it have?

15. In how many labeled graphs of $n$ vertices is the subgraph that is induced by vertices $\{1, 2, 3\}$ a triangle?

16. Let $H$ be a labeled graph of $L$ vertices. In how many labeled graphs of $n$ vertices is the subgraph that is induced by vertices $\{1, 2, \ldots, L\}$ equal to $H$?

17. Devise an algorithm that will decide if a given graph, of $n$ vertices and $m$ edges, does or does not contain a triangle, in time $O(\max(n^2, mn))$.

18. Prove that the number of labeled graphs of n vertices all of whose vertices have even degree is equal to

the number of all labeled graphs of n −1 vertices.


Chapter 2: Recursive Algorithms

2.1 Introduction

Here are two diﬀerent ways to deﬁne n!, if n is a positive integer. The ﬁrst deﬁnition is nonrecursive,

the second is recursive.

(1) ‘n! is the product of all of the whole numbers from 1 to n, inclusive.’

(2) ‘If n = 1 then n! = 1, else n! = n · (n − 1)!.’

Let’s concentrate on the second deﬁnition. At a glance, it seems illegal, because we’re deﬁning something,

and in the deﬁnition the same ‘something’ appears. Another glance, however, reveals that the value of n! is

deﬁned in terms of the value of the same function at a smaller value of its argument, viz. n −1. So we’re

really only using mathematical induction in order to validate the assertion that a function has indeed been

deﬁned for all positive integers n.

What is the practical import of the above? It’s monumental. Many modern high-level computer

languages can handle recursive constructs directly, and when this is so, the programmer’s job may be

considerably simpliﬁed. Among recursive languages are Pascal, PL/C, Lisp, APL, C, and many others.

Programmers who use these languages should be aware of the power and versatility of recursive methods

(conversely, people who like recursive methods should learn one of those languages!).

A formal ‘function’ module that would calculate n! nonrecursively might look like this.

function fact(n);
{computes n! for given n > 0}
fact := 1;
for i := 1 to n do fact := i · fact;
end.

On the other hand a recursive n! module is as follows.

function fact(n);
if n = 1 then fact := 1
else fact := n · fact(n − 1);
end.
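The same recursive module is nearly a one-liner in a language with native recursion; here it is in Python (the book's examples use a Pascal-like pseudocode).

```python
def fact(n):
    """n! for n >= 1, computed recursively: the trivial case n = 1 stops
    the recursion, exactly as in the pseudocode above."""
    if n == 1:
        return 1
    return n * fact(n - 1)

assert fact(1) == 1
assert fact(5) == 120
```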

The hallmark of a recursive procedure is that it calls itself, with arguments that are in some sense

smaller than before. Notice that there are no visible loops in the recursive routine. Of course there will

be loops in the compiled machine-language program, so in eﬀect the programmer is shifting many of the

bookkeeping problems to the compiler (but it doesn’t mind!).

Another advantage of recursiveness is that the thought processes are helpful. Mathematicians have

known for years that induction is a marvellous method for proving theorems, making constructions, etc.

Now computer scientists and programmers can proﬁtably think recursively too, because recursive compilers

allow them to express such thoughts in a natural way, and as a result many methods of great power are being

formulated recursively, methods which, in many cases, might not have been developed if recursion were not

readily available as a practical programming tool.

Observe next that the ‘trivial case,’ where n = 1, is handled separately, in the recursive form of the n!

program above. This trivial case is in fact essential, because it’s the only thing that stops the execution of

the program. In eﬀect, the computer will be caught in a loop, reducing n by 1, until it reaches 1, then it will

actually know the value of the function fact, and after that it will be able to climb back up to the original

input value of n.

The overall structure of a recursive routine will always be something like this:

procedure calculate(list of variables);

if {trivial case} then do {trivial thing}
else do
{call calculate(smaller values of the variables)};
{maybe do a few more things}

end.

In this chapter we’re going to work out a number of examples of recursive algorithms, of varying

sophistication. We will see how the recursive structure helps us to analyze the running time, or complexity,

of the algorithms. We will also ﬁnd that there is a bit of art involved in choosing the list of variables that

a recursive procedure operates on. Sometimes the ﬁrst list we think of doesn’t work because the recursive

call seems to need more detailed information than we have provided for it. So we try a larger list, and then

perhaps it works, or maybe we need a still larger list ..., but more of this later.

Exercises for section 2.1

1. Write a recursive routine that will ﬁnd the digits of a given integer n in the base b. There should be no

visible loops in your program.

2.2 Quicksort

Suppose that we are given an array x[1], . . ., x[n] of n numbers. We would like to rearrange these

numbers as necessary so that they end up in nondecreasing order of size. This operation is called sorting the

numbers.

For instance, if we are given {9, 4, 7, 2, 1}, then we want our program to output the sorted array
{1, 2, 4, 7, 9}.

There are many methods of sorting, but we are going to concentrate on methods that rely on only

two kinds of basic operations, called comparisons and interchanges. This means that our sorting routine is

allowed to

(a) pick up two numbers (‘keys’) from the array, compare them, and decide which is larger.

(b) interchange the positions of two selected keys.

Here is an example of a rather primitive sorting algorithm:

(i) ﬁnd, by successive comparisons, the smallest key

(ii) interchange it with the ﬁrst key

(iii) ﬁnd the second smallest key

(iv) interchange it with the second key, etc. etc.

Here is a more formal algorithm that does the job above.

procedure slowsort(x : array[1..n]);
{sorts a given array into nondecreasing order}
for r := 1 to n − 1 do
for j := r + 1 to n do
if x[j] < x[r] then swap(x[j], x[r])
end.{slowsort}
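Here is a Python sketch of slowsort, instrumented with a comparison counter (the counter is our addition, not part of the pseudocode) so that the cost analysis that follows can be checked directly:

```python
def slowsort(x):
    """Sort the list x in place into nondecreasing order.

    Returns the number of key comparisons performed, which is
    always n(n-1)/2 regardless of the input.
    """
    n = len(x)
    comparisons = 0
    for r in range(n - 1):
        for j in range(r + 1, n):
            comparisons += 1
            if x[j] < x[r]:
                x[j], x[r] = x[r], x[j]   # interchange the two keys
    return comparisons
```

On the sample array {9, 4, 7, 2, 1} the list is sorted in place and the call reports exactly 10 = 5 · 4/2 comparisons.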

If you are wondering why we called this method ‘primitive,’ ‘slowsort,’ and other pejorative names, the

reason will be clearer after we look at its complexity.

What is the cost of sorting n numbers by this method? We will look at two ways to measure that cost.

First let’s choose our unit of cost to be one comparison of two numbers, and then we will choose a diﬀerent

unit of cost, namely one interchange of position (‘swap’) of two numbers.

How many paired comparisons does the algorithm make? Reference to procedure slowsort shows that it

makes one comparison for each value of j = r +1, . . . , n in the inner loop. This means that the total number

of comparisons is

f_1(n) = \sum_{r=1}^{n-1} \sum_{j=r+1}^{n} 1 = \sum_{r=1}^{n-1} (n - r) = (n - 1)n/2.

The number of comparisons is Θ(n^2), which is quite a lot of comparisons for a sorting method to do. Not
only that, but the method does that many comparisons regardless of the input array, i.e., its best case and
its worst case are equally bad.

The Quicksort* method, which is the main object of study in this section, does a maximum of cn^2
comparisons, but on the average it does far fewer, a mere O(n log n) comparisons. This economy is much
appreciated by those who sort, because sorting applications can be immense and time consuming. One
popular sorting application is in alphabetizing lists of names. It is easy to imagine that some of those lists
are very long, and that the replacement of Θ(n^2) by an average of O(n log n) comparisons is very welcome.
An insurance company that wants to alphabetize its list of 5,000,000 policyholders will gratefully notice the
difference between n^2 = 25,000,000,000,000 comparisons and n log n = 77,124,740 comparisons.
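The arithmetic of the insurance example can be checked in a couple of lines (log here is the natural logarithm, which is the convention behind the figure quoted above):

```python
import math

n = 5_000_000
quadratic = n * n                 # a Theta(n^2) method's comparison count
linearithmic = n * math.log(n)    # the n log n scale (natural log)

# quadratic is 25 trillion; linearithmic is on the order of 77 million,
# smaller by a factor of several hundred thousand.
ratio = quadratic / linearithmic
```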

If we choose as our unit of complexity the number of swaps of position, then the running time may
depend strongly on the input array. In the ‘slowsort’ method described above, some arrays will need no
swaps at all while others might require the maximum number of (n − 1)n/2 (which arrays need that many
swaps?). If we average over all n! possible arrangements of the input data, assuming that the keys are
distinct, then it is not hard to see that the average number of swaps that slowsort needs is Θ(n^2).

Now let’s discuss Quicksort. In contrast to the sorting method above, the basic idea of Quicksort is

sophisticated and powerful. Suppose we want to sort the following list:

26, 18, 4, 9, 37, 119, 220, 47, 74 (2.2.1)

The number 37 in the above list is in a very intriguing position. Every number that precedes it is smaller

than it is and every number that follows it is larger than it is. What that means is that after sorting the list,

the 37 will be in the same place it now occupies, the numbers to its left will have been sorted but will still be

on its left, and the numbers on its right will have been sorted but will still be on its right.

If we are fortunate enough to be given an array that has a ‘splitter,’ like 37, then we can

(a) sort the numbers to the left of the splitter, and then

(b) sort the numbers to the right of the splitter.

Obviously we have here the germ of a recursive sorting routine.
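A splitter is easy to test for; the helper is_splitter below is purely illustrative (it is not part of the Quicksort algorithm itself):

```python
def is_splitter(x, i):
    """True if every entry before x[i] is smaller than x[i]
    and every entry after it is larger."""
    return (all(v < x[i] for v in x[:i]) and
            all(v > x[i] for v in x[i + 1:]))

lst = [26, 18, 4, 9, 37, 119, 220, 47, 74]
# 37, at index 4, is the one splitter of the list in (2.2.1).
```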

The ﬂy in the ointment is that most arrays don’t have splitters, so we won’t often be lucky enough to

ﬁnd the state of aﬀairs that exists in (2.2.1). However, we can make our own splitters, with some extra work,

and that is the idea of the Quicksort algorithm. Let’s state a preliminary version of the recursive procedure

as follows (look carefully for how the procedure handles the trivial case where n=1).

procedure quicksortprelim(x : an array of n numbers);
{sorts the array x into nondecreasing order}
if n ≥ 2 then
permute the array elements so as to create a splitter;
let x[i] be the splitter that was just created;
quicksortprelim(the subarray x[1], . . . , x[i − 1]) in place;
quicksortprelim(the subarray x[i + 1], . . . , x[n]) in place;
end.{quicksortprelim}

* C. A. R. Hoare, Comp. J., 5 (1962), 10-15.

This preliminary version won’t run, though. It looks like a recursive routine. It seems to call itself twice

in order to get its job done. But it doesn’t. It calls something that’s just slightly diﬀerent from itself in

order to get its job done, and that won’t work.

Observe the exact purpose of Quicksort, as described above. We are given an array of length n, and
we want to sort it, all of it. Now look at the two ‘recursive calls,’ which really aren’t quite. The first one
of them sorts the array to the left of x[i]. That is indeed a recursive call, because we can just change the ‘n’
to ‘i − 1’ and call Quicksort. The second recursive call is the problem. It wants to sort a portion of the
array that doesn’t begin at the beginning of the array. The routine Quicksort as written so far doesn’t have
enough flexibility to do that. So we will have to give it some more parameters.

Instead of trying to sort all of the given array, we will write a routine that sorts only the portion of the

given array x that extends from x[left] to x[right], inclusive, where left and right are input parameters.

This leads us to the second version of the routine:

procedure qksort(x : array; left, right : integer);
{sorts the subarray x[left], . . . , x[right]}
if right − left ≥ 1 then
create a splitter for the subarray in the i-th array position;
qksort(x, left, i − 1);
qksort(x, i + 1, right)
end.{qksort}

Once we have qksort, of course, Quicksort is no problem: we call qksort with left := 1 and right := n.

The next item on the agenda is the little question of how to create a splitter in an array. Suppose we

are working with a subarray

x[left], x[left +1], . . . , x[right].

The ﬁrst step is to choose one of the subarray elements (the element itself, not the position of the element)

to be the splitter, and the second step is to make it happen. The choice of the splitter element in the

Quicksort algorithm is done very simply: at random. We just choose, using our favorite random number

generator, one of the entries of the given subarray, let’s call it T, and declare it to be the splitter. To repeat

the parenthetical comment above, T is the value of the array entry that was chosen, not its position in the

array. Once the value is selected, the position will be what it has to be, namely to the right of all smaller

entries, and to the left of all larger entries.

The reason for making the random choice will become clearer after the smoke of the complexity discussion
has cleared, but briefly it’s this: the analysis of the average case complexity is relatively easy if we use the
random choice, so that’s a plus, and there are no minuses.

Second, we have now chosen T to be the value around which the subarray will be split. The entries of

the subarray must be moved so as to make T the splitter. To do this, consider the following algorithm.*

* Attributed to Nico Lomuto by Jon Bentley, CACM 27 (April 1984).

procedure split(x, left, right, i)
{chooses at random an entry T of the subarray
x[left], . . . , x[right], and splits the subarray around T}
{the output integer i is the position of T in the
output array: x[i] = T};
1 L := a random integer in [left, right];
2 swap(x[left], x[L]);
3 {now the splitter is first in the subarray}
4 T := x[left];
5 i := left;
6 for j := left + 1 to right do
begin
7 if x[j] < T then
begin
8 i := i + 1;
swap(x[i], x[j])
end;
end
9 swap(x[left], x[i])
10 end.{split}
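Here is a transcription of split into Python, with two departures from the pseudocode noted in the comments: indices are 0-based, and i is returned rather than passed as an output parameter.

```python
import random

def split(x, left, right):
    """Choose a random entry T of x[left..right], split the
    subarray around T, and return the final position i of T."""
    L = random.randint(left, right)       # line 1
    x[left], x[L] = x[L], x[left]         # line 2: splitter moves to front
    T = x[left]                           # line 4
    i = left                              # line 5
    for j in range(left + 1, right + 1):  # line 6
        if x[j] < T:                      # line 7
            i += 1                        # line 8
            x[i], x[j] = x[j], x[i]
    x[left], x[i] = x[i], x[left]         # line 9
    return i
```

After the call, every entry left of position i is smaller than x[i] and every entry right of it is at least x[i], exactly the split condition proved below.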

We will now prove the correctness of split.

Theorem 2.2.1. Procedure split correctly splits the array x around the chosen value T.

Proof: We claim that as the loop in lines 7 and 8 is repeatedly executed for j := left + 1 to right, the

following three assertions will always be true just after each execution of lines 7, 8:

(a) x[left] = T and

(b) x[r] < T for all left < r ≤ i and

(c) x[r] ≥ T for all i < r ≤ j

Fig. 2.2.1 illustrates the claim.

Fig. 2.2.1: Conditions (a), (b), (c)

To see this, observe first that (a), (b), (c) are surely true at the beginning, when j = left + 1. Next, if
for some j they are true, then the execution of lines 7, 8 guarantees that they will be true for the next value
of j.

Now look at (a), (b), (c) when j = right. It tells us that just prior to the execution of line 9 the

condition of the array will be

(a) x[left] = T and

(b) x[r] < T for all left < r ≤ i and

(c) x[r] ≥ T for all i < r ≤ right.

When line 9 executes, the array will be in the correctly split condition.

Now we can state a ‘ﬁnal’ version of qksort (and therefore of Quicksort too).

procedure qksort(x : array; left, right : integer);
{sorts the subarray x[left], . . . , x[right]};
if right − left ≥ 1 then
split(x, left, right, i);
qksort(x, left, i − 1);
qksort(x, i + 1, right)
end.{qksort}

procedure Quicksort(x : array; n : integer)
{sorts an array of length n};
qksort(x, 1, n)
end.{Quicksort}
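Assembled as a self-contained Python sketch (0-based indices, with the splitting step written inline so the block stands alone):

```python
import random

def qksort(x, left, right):
    """Sort the slice x[left..right] (inclusive, 0-based) in place."""
    if right - left >= 1:
        # create a splitter: random choice, then one splitting pass
        L = random.randint(left, right)
        x[left], x[L] = x[L], x[left]
        T, i = x[left], left
        for j in range(left + 1, right + 1):
            if x[j] < T:
                i += 1
                x[i], x[j] = x[j], x[i]
        x[left], x[i] = x[i], x[left]
        qksort(x, left, i - 1)       # sort to the left of the splitter
        qksort(x, i + 1, right)      # sort to the right of the splitter

def quicksort(x):
    """Sort the whole list x in place: qksort with left = 0, right = n - 1."""
    qksort(x, 0, len(x) - 1)
```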

Now let’s consider the complexity of Quicksort. How long does it take to sort an array? Well, the

amount of time will depend on exactly which array we happen to be sorting, and furthermore it will depend

on how lucky we are with our random choices of splitting elements.

If we want to see Quicksort at its worst, suppose we have a really unlucky day, and that the random

choice of the splitter element happens to be the smallest element in the array. Not only that, but suppose

this kind of unlucky choice is repeated on each and every recursive call.

If the splitter element is the smallest array entry, then it won’t do a whole lot of splitting. In fact, if

the original array had n entries, then one of the two recursive calls will be to an array with no entries at all,

and the other recursive call will be to an array of n−1 entries. If L(n) is the number of paired comparisons

that are required in this extreme scenario, then, since the number of comparisons that are needed to carry

out the call to split an array of length n is n −1, it follows that

L(n) = L(n −1) +n −1 (n ≥ 1; L(0) = 0).

Hence,

L(n) = 1 + 2 + · · · + (n − 1) = Θ(n^2).

The worst-case behavior of Quicksort is therefore quadratic in n. In its worst moods, then, it is as bad
as ‘slowsort’ above.
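The unlucky-day recurrence can be checked numerically — a routine verification, not part of the text’s argument:

```python
def L(n):
    """Unroll L(n) = L(n-1) + (n-1), with L(0) = 0."""
    total = 0
    for k in range(1, n + 1):
        total += k - 1
    return total

# Closed form: L(n) = (n-1)n/2, i.e. Theta(n^2).
```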

Whereas the performance of slowsort is pretty much always quadratic, no matter what the input is,

Quicksort is usually a lot faster than its worst case discussed above.

We want to show that on the average the running time of Quicksort is O(nlogn).

The ﬁrst step is to get quite clear about what the word ‘average’ refers to. We suppose that the entries

of the input array x are all distinct. Then the performance of Quicksort can depend only on the sequence of

size relationships in the input array and the choices of the random splitting elements.

The actual numerical values that appear in the input array are not in themselves important, except that,

to simplify the discussion we will assume that they are all diﬀerent. The only thing that will matter, then,

will be the set of outcomes of all of the paired comparisons of two elements that are done by the algorithm.

Therefore, we will assume, for the purposes of analysis, that the entries of the input array are exactly the

set of numbers 1, 2, . . ., n in some order.

There are n! possible orders in which these elements might appear, so we are considering the action of

Quicksort on just these n! inputs.

Second, for each particular one of these inputs, the choices of the splitting elements will be made by

choosing, at random, one of the entries of the array at each step of the recursion. We will also average over

all such random choices of the splitting elements.

Therefore, when we speak of the function F(n), the average complexity of Quicksort, we are speaking of

the average number of pairwise comparisons of array entries that are made by Quicksort, where the averaging

is done ﬁrst of all over all n! of the possible input orderings of the array elements, and second, for each such

input ordering, we average also over all sequences of choices of the splitting elements.

Now let’s consider the behavior of the function F(n). What we are going to show is that F(n) =

O(nlogn).

The labor that F(n) estimates has two components. First there are the pairwise comparisons involved

in choosing a splitting element and rearranging the array about the chosen splitting value. Second there are

the comparisons that are done in the two recursive calls that follow the creation of a splitter.

As we have seen, the number of comparisons involved in splitting the array is n −1. Hence it remains

to estimate the number of comparisons in the recursive calls.

For this purpose, suppose we have rearranged the array about the splitting element, and that it has

turned out that the splitting entry now occupies the i

th

position in the array.

Our next remark is that each value of i = 1, 2, . . . , n is equally likely to occur. The reason for this is that

we chose the splitter originally by choosing a random array entry. Since all orderings of the array entries are

equally likely, the one that we happened to have chosen was just as likely to have been the largest entry as

to have been the smallest, or the 17

th

-from-largest, or whatever.

Since each value of i is equally likely, each i has probability 1/n of being chosen as the residence of the

splitter.

If the splitting element lives in the i

th

array position, the two recursive calls to Quicksort will be on

two subarrays, one of which has length i −1 and the other of which has length n −i. The average numbers

of pairwise comparisons that are involved in such recursive calls are F(i −1) and F(n −i), respectively. It

follows that our average complexity function F satisﬁes the relation

F(n) = n − 1 + (1/n) \sum_{i=1}^{n} {F(i − 1) + F(n − i)}   (n ≥ 1),   (2.2.2)

together with the initial value F(0) = 0.

How can we ﬁnd the solution of the recurrence relation (2.2.2)? First let’s simplify it a little by noticing

that

\sum_{i=1}^{n} F(n − i) = F(n − 1) + F(n − 2) + · · · + F(0) = \sum_{i=1}^{n} F(i − 1)   (2.2.3)

and so (2.2.2) can be written as

F(n) = n − 1 + (2/n) \sum_{i=1}^{n} F(i − 1).   (2.2.4)

We can simplify (2.2.4) a lot by getting rid of the summation sign. This next step may seem like a trick

at ﬁrst (and it is!), but it’s a trick that is used in so many diﬀerent ways that now we call it a ‘method.’

What we do is ﬁrst to multiply (2.2.4) by n, to get

nF(n) = n(n − 1) + 2 \sum_{i=1}^{n} F(i − 1).   (2.2.5)

Next, in (2.2.5) we replace n by n −1, yielding

(n − 1)F(n − 1) = (n − 1)(n − 2) + 2 \sum_{i=1}^{n-1} F(i − 1).   (2.2.6)

Finally we subtract (2.2.6) from (2.2.5), and the summation sign obligingly disappears, leaving behind just

nF(n) −(n −1)F(n −1) = n(n −1) −(n −1)(n −2) + 2F(n −1). (2.2.7)

After some tidying up, (2.2.7) becomes

F(n) = (1 + 1/n) F(n − 1) + (2 − 2/n),   (2.2.8)

which is exactly in the form of the general first-order recurrence relation that we discussed in section 1.4.

In section 1.4 we saw that to solve (2.2.8) the winning tactic is to change to a new variable y_n that is
defined, in this case, by

F(n) = [(n + 1)/n] · [n/(n − 1)] · [(n − 1)/(n − 2)] · · · [2/1] · y_n = (n + 1) y_n.   (2.2.9)

If we make the change of variable F(n) = (n + 1) y_n in (2.2.8), then it takes the form

y_n = y_{n-1} + 2(n − 1)/(n(n + 1))   (n ≥ 1)   (2.2.10)

as an equation for the y_n’s (y_0 = 0).

The solution of (2.2.10) is obviously

y_n = 2 \sum_{j=1}^{n} (j − 1)/(j(j + 1)) = 2 \sum_{j=1}^{n} {2/(j + 1) − 1/j} = 2 \sum_{j=1}^{n} 1/j − 4n/(n + 1).

Hence from (2.2.9),

F(n) = 2(n + 1) \sum_{j=1}^{n} 1/j − 4n   (2.2.11)

is the average number of pairwise comparisons that we do if we Quicksort an array of length n. Evidently
F(n) ∼ 2n log n (n → ∞) (see (1.1.7) with g(t) = 1/t), and we have proved

Theorem 2.2.2. The average number of pairwise comparisons of array entries that Quicksort makes when
it sorts arrays of n elements is exactly as shown in (2.2.11), and is ∼ 2n log n (n → ∞).

Quicksort is, on average, a very quick sorting method, even though its worst case requires a quadratic

amount of labor.
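As a numerical check (ours, not the text’s), iterating the averaged recurrence in its summed form (2.2.4) reproduces the closed form (2.2.11) exactly:

```python
def F_recurrence(nmax):
    """Return [F(0), ..., F(nmax)] from (2.2.4):
    F(n) = n - 1 + (2/n) * (F(0) + ... + F(n-1))."""
    F = [0.0]
    running_sum = 0.0
    for n in range(1, nmax + 1):
        running_sum += F[-1]          # now holds F(0) + ... + F(n-1)
        F.append(n - 1 + 2.0 * running_sum / n)
    return F

def F_closed(n):
    """Closed form (2.2.11): F(n) = 2(n+1) H_n - 4n."""
    H = sum(1.0 / j for j in range(1, n + 1))
    return 2 * (n + 1) * H - 4 * n
```

For instance F(2) = 1 (two elements need one comparison on average), and the two computations agree to floating-point accuracy for every n.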

Exercises for section 2.2

1. Write out an array of 10 numbers that contains no splitter. Write out an array of 10 numbers that

contains 10 splitters.

2. Write a program that does the following. Given a positive integer n, choose 100 random permutations
of [1, 2, . . . , n],* and count how many of the 100 had at least one splitter. Execute your program for n =
5, 6, . . . , 12 and tabulate the results.

3. Think of some method of sorting n numbers that isn’t in the text. In the worst case, how many comparisons

might your method do? How many swaps?

* For a fast and easy way to do this see A. Nijenhuis and H. S. Wilf, Combinatorial Algorithms, 2nd ed. (New

York: Academic Press, 1978), chap. 6.

4. Consider the array

x = {2, 4, 1, 10, 5, 3, 9, 7, 8, 6}

with left = 1 and right = 10. Suppose that the procedure split is called, and suppose the random integer

L in step 1 happens to be 5. Carry out the complete split algorithm (not on a computer; use pencil and

paper). Particularly, record the condition of the array x after each value of j is processed in the for j = . . .

loop.

5. Suppose H(0) = 1 and H(n) ≤ 1 + (1/n) \sum_{i=1}^{n} H(i − 1)   (n ≥ 1). How big might H(n) be?

6. If Q(0) = 0 and Q(n) ≤ n^2 + \sum_{i=1}^{n} Q(i − 1)   (n ≥ 1), how big might Q(n) be?

7. (Research problem) Find the asymptotic behavior, for large n, of the probability that a randomly chosen

permutation of n letters has a splitter.

2.3 Recursive graph algorithms

Algorithms on graphs are another rich area of applications of recursive thinking. Some of the problems

are quite diﬀerent from the ones that we have so far been studying in that they seem to need exponential

amounts of computing time, rather than the polynomial times that were required for sorting problems.

We will illustrate the dramatically increased complexity with a recursive algorithm for the ‘maximum

independent set problem,’ one which has received a great deal of attention in recent years.

Suppose a graph G is given. By an independent set of vertices of G we mean a set of vertices no two of

which are connected by an edge of G. In the graph in Fig. 2.3.1 the set {1, 2, 6} is an independent set and so
is the set {1, 3}. The largest independent set of vertices in the graph shown there is the set {1, 2, 3, 6}. The

problem of ﬁnding the size of the largest independent set in a given graph is computationally very diﬃcult.

All algorithms known to date require exponential amounts of time, in their worst cases, although no one has

proved the nonexistence of fast (polynomial time) algorithms.

If the problem itself seems unusual, and maybe not deserving of a lot of attention, be advised that in

Chapter 5 we will see that it is a member in good standing of a large family of very important computational

problems (the ‘NP-complete’ problems) that are tightly bound together, in that if we can ﬁgure out better

ways to compute any one of them, then we will be able to do all of them faster.

Fig. 2.3.1

Here is an algorithm for the independent set problem that is easy to understand and to program,

although, of course, it may take a long time to run on a large graph G.

We are looking for the size of the largest independent set of vertices of G. Suppose we denote that number
by maxset(G). Fix some vertex of the graph, say vertex v∗. Let’s distinguish two kinds of independent sets
of vertices of G. There are those that contain vertex v∗ and those that don’t contain vertex v∗.

If an independent set S of vertices contains vertex v∗, then what does the rest of the set S consist of?
The remaining vertices of S are an independent set in a smaller graph, namely the graph that is obtained
from G by deleting vertex v∗ as well as all vertices that are connected to vertex v∗ by an edge. This latter
set of vertices is called the neighborhood of vertex v∗, and is written Nbhd(v∗).

The set S consists, therefore, of vertex v∗ together with an independent set of vertices from the graph
G − {v∗} − Nbhd(v∗).

Now consider an independent set S that doesn’t contain vertex v∗. In that case the set S is simply an
independent set in the smaller graph G − {v∗}.

We now have all of the ingredients of a recursive algorithm. Suppose we have found the two numbers
maxset(G − {v∗}) and maxset(G − {v∗} − Nbhd(v∗)). Then, from the discussion above, we have the relation

maxset(G) = max { maxset(G − {v∗}), 1 + maxset(G − {v∗} − Nbhd(v∗)) }.

We obtain the following recursive algorithm.

function maxset1(G);
{returns the size of the largest independent set of
vertices of G}
if G has no edges
then maxset1 := |V(G)|
else
choose some nonisolated vertex v∗ of G;
n1 := maxset1(G − {v∗});
n2 := maxset1(G − {v∗} − Nbhd(v∗));
maxset1 := max(n1, 1 + n2)
end.{maxset1}
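Here is a Python sketch of maxset1. Representing the graph as a dictionary from each vertex to its set of neighbors is an assumption of this example; the text does not fix a representation.

```python
def maxset1(adj):
    """Size of the largest independent set of the graph adj,
    where adj maps each vertex to the set of its neighbors."""
    if all(not nbrs for nbrs in adj.values()):   # trivial case: no edges
        return len(adj)
    # choose some nonisolated vertex v*
    v = next(u for u, nbrs in adj.items() if nbrs)

    def delete(graph, doomed):
        """The graph with the vertex set 'doomed' removed."""
        return {u: nbrs - doomed for u, nbrs in graph.items() if u not in doomed}

    n1 = maxset1(delete(adj, {v}))               # best set avoiding v*
    n2 = maxset1(delete(adj, {v} | adj[v]))      # best set containing v*
    return max(n1, 1 + n2)

# the 5-cycle of the example: edges (1,2), (2,3), (3,4), (4,5), (1,5)
cycle5 = {1: {2, 5}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {1, 4}}
```

On the 5-cycle it returns 2, the size of the largest independent set of that graph.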

Example:

Here is an example of a graph G and the result of applying the maxset1 algorithm to it. Let the graph

G be a 5-cycle. That is, it has 5 vertices and its edges are (1, 2), (2, 3), (3, 4), (4, 5), (1, 5). What are the two

graphs on which the algorithm calls itself recursively?

Suppose we select vertex number 1 as the chosen vertex v∗ in the algorithm. Then G − {1} and
G − {1} − Nbhd(1) are respectively the two graphs shown in Fig. 2.3.2.

Fig. 2.3.2: G − {1} and G − {1} − Nbhd(1)

The reader should now check that the size of the largest independent set of G is equal to the larger of
the two numbers maxset1(G − {1}), 1 + maxset1(G − {1} − Nbhd(1)) in this example.

Of course the creation of these two graphs from the original input graph is just the beginning of the

story, as far as the computation is concerned. Unbeknownst to the programmer, who innocently wrote the

recursive routine maxset1 and then sat back to watch, the compiler will go ahead with the computation by

generating a tree-full of graphs. In Fig. 2.3.3 we show the collection of all of the graphs that the compiler

might generate while executing a single call to maxset1 on the input graph of this example. In each case,

the graph that is below and to the left of a given one is the one obtained by deleting a single vertex, and the

one below and to the right of each graph is obtained by deleting a single vertex and its entire neighborhood.

Now we are going to study the complexity of maxset1. The results will be suﬃciently depressing that

we will then think about how to speed up the algorithm, and we will succeed in doing that to some extent.

To open the discussion, let’s recall that in Chapter 0 it was pointed out that the complexity of a

calculation is usefully expressed as a function of the number of bits of input data. In problems about graphs,

however, it is more natural to think of the amount of labor as a function of n, the number of vertices of the

graph. In problems about matrices it is more natural to use n, the size of the matrix, and so forth.

Do these distinctions alter the classiﬁcation of problems into ‘polynomial time do-able’ vs. ‘hard’ ? Take

the graph problems, for instance. How many bits of input data does it take to describe a graph? Well,

certainly we can march through the entire list of n(n−1)/2 pairs of vertices and check oﬀ the ones that are

actually edges in the input graph to the problem. Hence we can describe a graph to a computer by making

Fig. 2.3.3: A tree-full of graphs is created

a list of n(n −1)/2 0’s and 1’s. Each 1 represents a pair that is an edge, each 0 represents one that isn’t an

edge.

Thus Θ(n^2) bits describe a graph. Since n^2 is a polynomial in n, any function of the number of input
data bits that can be bounded by a polynomial in same, can also be bounded by a polynomial in n itself.
Hence, in the case of graph algorithms, the ‘easiness’ vs. ‘hardness’ judgment is not altered if we base the
distinction on polynomials in n itself, rather than on polynomials in the number of bits of input data.

Hence, with a clear conscience, we are going to estimate the running time or complexity of graph

algorithms in terms of the number of vertices of the graph that is input.

Now let’s do this for algorithm maxset1 above.

The first step is to find out if G has any edges. To do this we simply have to look at the input data.
In the worst case we might look at all of the input data, all Θ(n^2) bits of it. Then, if G actually has some
edges, the additional labor needed to process G consists of two recursive calls on smaller graphs and one
computation of the larger of two numbers.

If F(G) denotes the total amount of computational labor that we do in order to find maxset1(G), then
we see that

F(G) ≤ cn^2 + F(G − {v∗}) + F(G − {v∗} − Nbhd(v∗)).   (2.3.1)

Next, let f(n) = max_{|V(G)| = n} F(G), and take the maximum of (2.3.1) over all graphs G of n vertices. The
result is that

f(n) ≤ cn^2 + f(n − 1) + f(n − 2)   (2.3.2)

because the graph G − {v∗} − Nbhd(v∗) might have as many as n − 2 vertices, and would have that many
if v∗ had exactly one neighbor.

Now it’s time to ‘solve’ the recurrent inequality (2.3.2). Fortunately the hard work has all been done,
and the answer is in theorem 1.4.1. That theorem was designed expressly for the analysis of recursive
algorithms, and in this case it tells us that f(n) = O(1.619^n). Indeed the number c in that theorem is
(1 + √5)/2 = 1.61803...; we chose the ‘1.619’ that appears in the conclusion of the theorem simply by rounding
c upwards.
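The growth rate can also be observed numerically: iterating f(n) = n^2 + f(n − 1) + f(n − 2) (the inequality (2.3.2) taken with equality and c = 1, an illustrative choice) gives successive ratios f(n)/f(n − 1) that converge to (1 + √5)/2:

```python
import math

def growth_ratio(nmax):
    """Iterate f(n) = n^2 + f(n-1) + f(n-2) from positive seeds
    and return the last ratio f(nmax)/f(nmax-1)."""
    f0, f1 = 1.0, 1.0
    for n in range(2, nmax + 1):
        f0, f1 = f1, n * n + f1 + f0
    return f1 / f0

golden = (1 + math.sqrt(5)) / 2   # 1.61803...
```

The n^2 forcing term is eventually swamped by the exponential growth, so the ratio settles at the golden ratio, matching the theorem’s prediction.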

What have we learned? Algorithm maxset1 will find the answer in a time of no more than O(1.619^n)
units if the input graph G has n vertices. This is a little improvement of the most simple-minded possible

algorithm that one might think of for this problem, which is to examine every single subset of the vertices
of G and ask if it is an independent set or not. That algorithm would take Θ(2^n) time units because there
are 2^n subsets of vertices to look at. Hence we have traded in a 2^n for a 1.619^n by being a little bit cagey
about the algorithm. Can we do still better?

There have in fact been a number of improvements of the basic maxset1 algorithm worked out. Of

these the most successful is perhaps the one of Tarjan and Trojanowski that is cited in the bibliography at

the end of this chapter. We are not going to work out all of those ideas here, but instead we will show what

kind of improvements on the basic idea will help us to do better in the time estimate.

We can obviously do better if we choose v∗ in such a way as to be certain that it has at least two
neighbors. If we were to do that then although we wouldn’t affect the number of vertices of G − {v∗} (always
n − 1) we would at least reduce the number of vertices of G − {v∗} − Nbhd(v∗) as much as possible.

So, as our next thought, we might replace the instruction ‘choose some nonisolated vertex v∗ of G’ in
maxset1 by an instruction ‘choose some vertex v∗ of G that has at least two neighbors.’ Then we could be
quite certain that G − {v∗} − Nbhd(v∗) would have at most n − 3 vertices.

What if there isn’t any such vertex in the graph G? Then G would contain only vertices with 0 or 1
neighbors. Such a graph G would be a collection of E disjoint edges together with a number m of isolated
vertices. The size of the largest independent set of vertices in such a graph is easy to find. A maximum
independent set contains one vertex from each of the E edges and it contains all m of the isolated vertices.
Hence in this case, maxset = E + m = |V(G)| − |E(G)|, and we obtain a second try at a good algorithm in
the following form.

procedure maxset2(G);
{returns the size of the largest independent set of
vertices of G}
if G has no vertex of degree ≥ 2
then maxset2 := |V(G)| − |E(G)|
else
choose a vertex v∗ of degree ≥ 2;
n1 := maxset2(G − {v∗});
n2 := maxset2(G − {v∗} − Nbhd(v∗));
maxset2 := max(n1, 1 + n2)
end.{maxset2}
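The corresponding Python sketch differs from the maxset1 sketch only in its trivial case and its choice of v∗; the adjacency-dictionary representation is again an assumption of this example.

```python
def maxset2(adj):
    """Size of the largest independent set of the graph adj
    (adj maps each vertex to the set of its neighbors)."""
    if all(len(nbrs) < 2 for nbrs in adj.values()):
        # trivial case: E disjoint edges plus isolated vertices,
        # so the answer is |V(G)| - |E(G)|
        edges = sum(len(nbrs) for nbrs in adj.values()) // 2
        return len(adj) - edges
    # choose a vertex v* of degree >= 2
    v = next(u for u, nbrs in adj.items() if len(nbrs) >= 2)

    def delete(graph, doomed):
        return {u: nbrs - doomed for u, nbrs in graph.items() if u not in doomed}

    n1 = maxset2(delete(adj, {v}))
    n2 = maxset2(delete(adj, {v} | adj[v]))
    return max(n1, 1 + n2)
```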

How much have we improved the complexity estimate? If we apply to maxset2 the reasoning that led
to (2.3.2) we find

f(n) ≤ cn^2 + f(n − 1) + f(n − 3)   (f(0) = 0; n = 2, 3, . . .),   (2.3.3)

where f(n) is once more the worst-case time bound for graphs of n vertices.

Just as before, (2.3.3) is a recurrent inequality of the form that was studied at the end of section 1.4,
in theorem 1.4.1. Using the conclusion of that theorem, we find from (2.3.3) that f(n) = O((c + ε)^n) where
c = 1.46557... is the positive root of the equation c^3 = c^2 + 1.
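The constant 1.46557... can be recovered by bisection on c^3 − c^2 − 1 = 0 (a routine numerical check, not part of the text’s derivation):

```python
def root_of(g, lo, hi, tol=1e-12):
    """Bisection: find a root of g in [lo, hi], assuming g changes sign there."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

# the positive root of c^3 = c^2 + 1 lies between 1 and 2
c = root_of(lambda t: t**3 - t**2 - 1, 1.0, 2.0)
```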

The net result of our effort to improve maxset1 to maxset2 has been to reduce the running-time bound
from O(1.619^n) to O(1.47^n), which isn’t a bad day’s work. In the exercises below we will develop maxset3,
whose running time will be O(1.39^n). The idea will be that since in maxset2 we were able to insure that v∗
had at least two neighbors, why not try to insure that v∗ has at least 3 of them?

As long as we have been able to reduce the time bound more and more by insuring that the selected vertex has lots of neighbors, why don’t we keep it up, and insist that v∗ should have 4 or more neighbors? Regrettably the method runs out of steam precisely at that moment. To see why, ask what the ‘trivial case’ would then look like. We would be working on a graph G in which no vertex has more than 3 neighbors. Well, what ‘trivial’ thing shall we do, in this ‘trivial case’?

The fact is that there isn’t any way of ﬁnding the maximum independent set in a graph where all

vertices have ≤ 3 neighbors that’s any faster than the general methods that we’ve already discussed. In fact,

if one could ﬁnd a fast method for that restricted problem it would have extremely important consequences,

because we would then be able to do all graphs rapidly, not just those special ones.


We will learn more about this phenomenon in Chapter 5, but for the moment let’s leave just the

observation that the general problem of maxset turns out to be no harder than the special case of maxset

in which no vertex has more than 3 neighbors.

Aside from the complexity issue, the algorithm maxset has shown how recursive ideas can be used to

transform questions about graphs to questions about smaller graphs.

Here’s another example of such a situation. Suppose G is a graph, and that we have a certain supply

of colors available. To be exact, suppose we have K colors. We can then attempt to color the vertices of G

properly in K colors (see section 1.6).

If we don’t have enough colors, and G has lots of edges, this will not be possible. For example, suppose

G is the graph of Fig. 2.3.4, and suppose we have just 3 colors available. Then there is no way to color the

vertices without ever ﬁnding that both endpoints of some edge have the same color. On the other hand, if

we have four colors available then we can do the job.

Fig. 2.3.4

There are many interesting computational and theoretical problems in the area of coloring of graphs.

Just for its general interest, we are going to mention the four-color theorem, and then we will turn to a study

of some of the computational aspects of graph coloring.

First, just for general cultural reasons, let’s slow down for a while and discuss the relationship between

graph colorings in general and the four-color problem, even though it isn’t directly relevant to what we’re

doing.

The original question was this. Suppose that a delegation of Earthlings were to visit a distant planet

and ﬁnd there a society of human beings. Since that race is well known for its squabbling habits, you can

be sure that the planet will have been carved up into millions of little countries, each with its own ruling

class, system of government, etc., and of course, all at war with each other. The delegation wants to escape

quickly, but before doing so it draws a careful map of the 5,000,000 countries into which the planet has

been divided. To make the map easier to read, the countries are then colored in such a way that whenever

two countries share a stretch of border they are of two diﬀerent colors. Surprisingly, it was found that the

coloring could be done using only red, blue, yellow and green.

It was noticed over 100 years ago that no matter how complicated a map is drawn, and no matter how

many countries are involved, it seems to be possible to color the countries in such a way that

(a) every pair of countries that have a common stretch of border have diﬀerent colors and

(b) no more than four colors are used in the entire map.

It was then conjectured that four colors are always suﬃcient for the proper coloring of the countries

of any map at all. Settling this conjecture turned out to be a very hard problem. It was ﬁnally solved in

1976 by K. Appel and W. Haken* by means of an extraordinary proof with two main ingredients. First they

showed how to reduce the general problem to only a ﬁnite number of cases, by a mathematical argument.

Then, since the ‘ﬁnite number’ was over 1800, they settled all of those cases with quite a lengthy computer

calculation. So now we have the ‘Four Color Theorem,’ which asserts that no matter how we carve up the

plane or the sphere into countries, we will always be able to color those countries with at most four colors

so that countries with a common frontier are colored diﬀerently.

We can change the map coloring problem into a graph coloring problem as follows. Given a map, we will construct from it a graph G. There will be a vertex of G corresponding to each country on the

map. Two of these vertices will be connected by an edge of the graph G if the two countries that they

correspond to have a common stretch of border (we keep saying ‘stretch of border’ to emphasize that if the

two countries have just a single point in common they are allowed to have the same color). As an illustration

* Every planar map is four colorable, Bull. Amer. Math. Soc., 82 (1976), 711-712.


Fig. 2.3.5(a) Fig. 2.3.5(b)

of this construction, we show in Fig. 2.3.5(a) a map of a distant planet, and in Fig. 2.3.5(b) the graph that

results from the construction that we have just described.

By a ‘planar graph’ we mean a graph G that can be drawn in the plane in such a way that two edges

never cross (except that two edges at the same vertex have that vertex in common). The graph that results

from changing a map of countries into a graph as described above is always a planar graph. In Fig. 2.3.6(a)

we show a planar graph G. This graph doesn’t look planar because two of its edges cross. However, that isn’t

the graph’s fault, because with a little more care we might have drawn the same graph as in Fig. 2.3.6(b), in

which its planarity is obvious. Don’t blame the graph if it doesn’t look planar. It might be planar anyway!

Fig. 2.3.6(a) Fig. 2.3.6(b)

The question of recognizing whether a given graph is planar is itself a formidable problem, although the

solution, due to J. Hopcroft and R. E. Tarjan,* is an algorithm that makes the decision in linear time, i.e.

in O(V ) time for a graph of V vertices.

Although every planar graph can be properly colored in four colors, there are still all of those other

graphs that are not planar to deal with. For any one of those graphs we can ask, if a positive integer K is

given, whether or not its vertices can be K-colored properly.

As if that question weren’t hard enough, we might ask for even more detail, namely about the number

of ways of properly coloring the vertices of a graph. For instance, if we have K colors to work with, suppose

G is the empty graph K̄_n, that is, the graph of n vertices that has no edges at all. Then G has quite a large number of proper colorings, K^n of them, to be exact. Other graphs of n vertices have fewer proper colorings than that, and an interesting computational question is to count the proper colorings of a given graph.

We will now ﬁnd a recursive algorithm that will answer this question. Again, the complexity of the

algorithm will be exponential, but as a small consolation we note that no polynomial time algorithm for this

problem is known.

Choose an edge e of the graph, and let its endpoints be v and w. Now delete the edge e from the graph, and let the resulting graph be called G − {e}. Then we will distinguish two kinds of proper colorings of G − {e}: those in which vertices v and w have the same color and those in which v and w have different colors. Obviously the number of proper colorings of G − {e} in K colors is the sum of the numbers of colorings of each of these two kinds.

* Eﬃcient planarity testing, J. Assoc. Comp. Mach. 21 (1974), 549-568.


Consider the proper colorings in which vertices v and w have the same color. We claim that the number of such colorings is equal to the number of all colorings of a certain new graph G/{e}, whose construction we now describe:

The vertices of G/{e} consist of the vertices of G other than v or w and one new vertex that we will call ‘vw’ (so G/{e} will have one less vertex than G has).

Now we describe the edges of G/{e}. First, if a and b are two vertices of G/{e} neither of which is the new vertex ‘vw’, then (a, b) is an edge of G/{e} if and only if it is an edge of G. Second, (vw, b) is an edge of G/{e} if and only if either (v, b) or (w, b) (or both) is an edge of G.

We can think of this as ‘collapsing’ the graph G by imagining that the edges of G are elastic bands, and that we squeeze vertices v and w together into a single vertex. The result is G/{e} (anyway, it is if we replace any resulting double bands by single ones!).

In Fig. 2.3.7(a) we show a graph G of 7 vertices and a chosen edge e. The two endpoints of e are v and w. In Fig. 2.3.7(b) we show the graph G/{e} that is the result of the construction that we have just described.

Fig. 2.3.7(a) Fig. 2.3.7(b)

The point of the construction is the following

Lemma 2.3.1. Let v and w be two vertices of G such that e = (v, w) ∈ E(G). Then the number of proper K-colorings of G − {e} in which v and w have the same color is equal to the number of all proper colorings of the graph G/{e}.

Proof: Suppose G/{e} has a proper K-coloring. Color the vertices of G − {e} itself in K colors as follows. Every vertex of G − {e} other than v or w keeps the same color that it has in the coloring of G/{e}. Vertex v and vertex w each receive the color that vertex vw has in the coloring of G/{e}. Now we have a K-coloring of the vertices of G − {e}.

It is a proper coloring because if f is any edge of G − {e} then the two endpoints of f have different colors. Indeed, this is obviously true if neither endpoint of f is v or w, because the coloring of G/{e} was a proper one. There remains only the case where one endpoint of f is, say, v and the other one is some vertex x other than v or w. But then the colors of v and x must be different because vw and x were joined in G/{e} by an edge, and therefore must have gotten different colors there.
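The lemma is easy to test by exhaustion on small graphs. Here is a brute-force counter of proper colorings (our own helper, not one of the text's algorithms), which one can use to compare the two sides of the lemma on any small example:

```python
from itertools import product

def proper_colorings(vertices, edges, k):
    """Count proper k-colorings by trying every assignment (brute force)."""
    vs = sorted(vertices)
    count = 0
    for colors in product(range(k), repeat=len(vs)):
        col = dict(zip(vs, colors))
        # proper: the two endpoints of every edge get different colors
        if all(col[a] != col[b] for a, b in (sorted(e) for e in edges)):
            count += 1
    return count
```

For the triangle on {1, 2, 3} with e = (1, 2), for instance, the colorings of G − {e} in which 1 and 2 agree match the colorings of the two-vertex graph G/{e}, namely k(k − 1) of them.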

To get back to the main argument, we were trying to compute the number of proper K-colorings of G − {e}. We observed that in any K-coloring v and w have either the same or different colors. We have shown that the number of colorings in which they receive the same color is equal to the number of all proper colorings of a certain smaller (one less vertex) graph G/{e}. It remains to look at the case where vertices v and w receive different colors.

Lemma 2.3.2. Let e = (v, w) be an edge of G. Then the number of proper K-colorings of G − {e} in which v and w have different colors is equal to the number of all proper K-colorings of G itself.

Proof: Obvious (isn’t it?).

Now let’s put together the results of the two lemmas above. Let P(K; G) denote the number of ways of properly coloring the vertices of a given graph G. Then lemmas 2.3.1 and 2.3.2 assert that

P(K; G − {e}) = P(K; G/{e}) + P(K; G)


or if we solve for P(K; G), then we have

P(K; G) = P(K; G − {e}) − P(K; G/{e}).    (2.3.4)

The quantity P(K; G), the number of ways of properly coloring the vertices of a graph G in K colors,

is called the chromatic polynomial of G.

We claim that it is, in fact, a polynomial in K of degree |V(G)|. For instance, if G is the complete graph of n vertices then obviously P(K; G) = K(K − 1) · · · (K − n + 1), and that is indeed a polynomial in K of degree n.

Proof of claim: The claim is certainly true if G has just one vertex. Next suppose the assertion is true for graphs of < V vertices; then we claim it is true for graphs of V vertices also. This is surely true if G has V vertices and no edges at all. Hence, suppose it is true for all graphs of V vertices and fewer than E edges, and let G have V vertices and E edges. Then (2.3.4) implies that P(K; G) is a polynomial of the required degree V, because G − {e} has fewer edges than G does, so its chromatic polynomial is a polynomial of degree V, while G/{e} has fewer vertices than G has, and so P(K; G/{e}) is a polynomial of lower degree. The claim is proved, by induction.

Equation (2.3.4) gives a recursive algorithm for computing the chromatic polynomial of a graph G, since

the two graphs that appear on the right are both ‘smaller’ than G, one in the sense that it has fewer edges

than G has, and the other in that it has fewer vertices. The algorithm is the following.

function chrompoly(G: graph): polynomial;
{computes the chromatic polynomial of a graph G}
if G has no edges then chrompoly := K^|V(G)|
else
  choose an edge e of G;
  chrompoly := chrompoly(G − {e}) − chrompoly(G/{e})
end.{chrompoly}
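The delete-and-identify recursion translates directly into code. A Python sketch (our naming; it evaluates P(k; G) at an integer k rather than building the polynomial symbolically):

```python
def chrompoly(vertices, edges, k):
    """Evaluate P(k; G) by equation (2.3.4): delete e, contract e, subtract.

    vertices: a set of comparable labels; edges: a set of 2-element frozensets.
    """
    if not edges:
        return k ** len(vertices)      # no edges: k^|V(G)| proper colorings
    e = min(edges, key=sorted)         # any edge will do; pick deterministically
    v, w = sorted(e)
    deleted = edges - {e}
    # Contract e: merge w into v. The set silently drops doubled edges, and
    # the len(g) == 2 test drops the loop created by the edge (v, w) itself.
    contracted = set()
    for f in deleted:
        g = frozenset(v if x == w else x for x in f)
        if len(g) == 2:
            contracted.add(g)
    return (chrompoly(vertices, deleted, k)
            - chrompoly(vertices - {w}, contracted, k))
```

On the triangle this gives k(k − 1)(k − 2), for instance 6 at k = 3.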

Next we are going to look at the complexity of the algorithm chrompoly (we will also refer to it as ‘the

delete-and-identify’ algorithm). The graph G can be input in any one of a number of ways. For example,

we might input the full list of edges of G, as a list of pairs of vertices.

The first step of the computation is to choose the edge e and to create the edge list of the graph G − {e}. The latter operation is trivial, since all we have to do is to ignore one edge in the list.

Next we call chrompoly on the graph G − {e}.

The third step is to create the edge list of the collapsed graph G/{e} from the edge list of G itself. That involves some work, but it is rather routine, and its cost is linear in the number of edges of G, say c|E(G)|.

Finally we call chrompoly on the graph G/{e}.

Let F(V, E) denote the maximum cost of calling chrompoly on any graph of at most V vertices and at

most E edges. Then we see at once that

F(V, E) ≤ F(V, E −1) +cE +F(V −1, E −1) (2.3.5)

together with F(V, 0) = 0. If we put, successively, E = 1, 2, 3, we ﬁnd that F(V, 1) ≤ c, F(V, 2) ≤ 4c, and

F(V, 3) ≤ 11c. Hence we seek a solution of (2.3.5) in the form F(V, E) ≤ f(E)c, and we quickly ﬁnd that if

f(E) = 2f(E −1) +E (f(0) = 0) (2.3.6)

then we will have such a solution.

Since (2.3.6) is a first-order difference equation of the form (1.4.5), we find that

f(E) = 2^E ∑_{j=1}^{E} j 2^{−j} ∼ 2^{E+1}.    (2.3.7)


The last ‘∼’ follows from the evaluation ∑_{j≥1} j 2^{−j} = 2 that we discussed in section 1.3.
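The recurrence and its closed form can be compared directly; a small check (our code, not the text's), with (2.3.7) rewritten in integer arithmetic:

```python
def f(E):
    """Cost recurrence (2.3.6): f(E) = 2 f(E - 1) + E, f(0) = 0."""
    return 0 if E == 0 else 2 * f(E - 1) + E

def closed_form(E):
    """Closed form (2.3.7): 2^E * sum_{j=1}^{E} j 2^{-j},
    i.e. sum_{j=1}^{E} j 2^{E-j}, so it stays an integer."""
    return sum(j * 2 ** (E - j) for j in range(1, E + 1))
```

The ratio f(E) / 2^(E+1) tends to 1, which is the ‘∼’ in (2.3.7).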

To summarize the developments so far, then, we have found out that the chromatic polynomial of a graph can be computed recursively by an algorithm whose cost is O(2^E) for graphs of E edges. This is exponential cost, and such computations are prohibitively expensive except for graphs of very modest numbers of edges.

Of course the mere fact that our proved time estimate is O(2^E) doesn’t necessarily mean that the algorithm can be that slow, because maybe our complexity analysis wasn’t as sharp as it might have been. However, consider the graph G(s, t) that consists of s disjoint edges and t isolated vertices, for a total of 2s + t vertices altogether. If we choose an edge of G(s, t) and delete it, we get G(s − 1, t + 2), whereas the graph G/{e} is G(s − 1, t + 1). Each of these two new graphs has s − 1 edges.

We might imagine arranging the computation so that the extra isolated vertices will be ‘free,’ i.e., will not cost any additional labor. Then the work that we do on G(s, t) will depend only on s, and will be twice as much as the work we do on G(s − 1, ·). Therefore G(s, t) will cost at least 2^s operations, and our complexity estimate wasn’t a mirage; there really are graphs that make the algorithm do an amount 2^|E(G)| of work.

Considering the above remarks it may be surprising that there is a slightly different approach to the complexity analysis that leads to a time bound (for the same algorithm) that is a bit sharper than O(2^E) in many cases (the work of the complexity analyst is never finished!). Let’s look at the algorithm chrompoly in another way.

For a graph G we can define a number γ(G) = |V(G)| + |E(G)|, which is rather an odd kind of thing to define, but it has a nice property with respect to this algorithm, namely that whatever G we begin with, we will find that

γ(G − {e}) = γ(G) − 1;   γ(G/{e}) ≤ γ(G) − 2.    (2.3.8)

Indeed, if we delete the edge e then γ must drop by 1, and if we collapse the graph on the edge e then we will have lost one vertex and at least one edge, so γ will drop by at least 2.

Hence, if h(γ) denotes the maximum amount of labor that chrompoly does on any graph G for which

|V(G)| + |E(G)| ≤ γ    (2.3.9)

then we claim that

h(γ) ≤ h(γ − 1) + h(γ − 2)    (γ ≥ 2).    (2.3.10)

Indeed, if G is a graph for which (2.3.9) holds, then if G has any edges at all we can do the delete-and-identify

step to prove that the labor involved in computing the chromatic polynomial of G is at most the quantity

on the right side of (2.3.10). Else, if G has no edges then the labor is 1 unit, which is again at most equal

to the right side of (2.3.10), so the result (2.3.10) follows.

With the initial conditions h(0) = h(1) = 1, the solution of the recurrent inequality (2.3.10) is obviously the relation h(γ) ≤ F_γ, where F_γ is the Fibonacci number. We have thereby proved that the time complexity of the algorithm chrompoly is

O(F_{|V(G)|+|E(G)|}) = O( ((1 + √5)/2)^{|V(G)|+|E(G)|} ) = O(1.62^{|V(G)|+|E(G)|}).    (2.3.11)

This analysis does not, of course, contradict the earlier estimate, but complements it. What we have shown is that the labor involved is always

O( min(2^{|E(G)|}, 1.62^{|V(G)|+|E(G)|}) ).    (2.3.12)

On a graph with ‘few’ edges relative to its number of vertices (how few?) the ﬁrst quantity in the parentheses

in (2.3.12) will be the smaller one, whereas if G has more edges, then the second term is the smaller one. In

either case the overall judgment about the speed of the algorithm (it’s slow!) remains.


Exercises for section 2.3

1. Let G be a cycle of n vertices. What is the size of the largest independent set of vertices in V (G)?

2. Let G be a path of n vertices. What is the size of the largest independent set of vertices in V (G)?

3. Let G be a connected graph in which every vertex has degree 2. What must such a graph consist of?

Prove.

4. Let G be a connected graph in which every vertex has degree ≤ 2. What must such a graph look like?

5. Let G be a not-necessarily-connected graph in which every vertex has degree ≤ 2. What must such a

graph look like? What is the size of the largest independent set of vertices in such a graph? How long would

it take you to calculate that number for such a graph G? How would you do it?

6. Write out algorithm maxset3, which finds the size of the largest independent set of vertices in a graph. Its trivial case will occur if G has no vertex of degree ≥ 3. Otherwise, it will choose a vertex v∗ of degree ≥ 3 and proceed as in maxset2.

7. Analyze the complexity of your algorithm maxset3 from exercise 6 above.

8. Use (2.3.4) to prove by induction that P(K; G) is a polynomial in K of degree |V(G)|. Then show that if G is a tree then P(K; G) = K(K − 1)^{|V(G)|−1}.

9. Write out an algorithm that will change the vertex adjacency matrix of a graph G to the vertex adjacency matrix of the graph G/{e}, where e is a given edge of G.

10. How many edges must G have before the second quantity inside the ‘O’ in (2.3.12) is the smaller of the

two?

11. Let α(G) be the size of the largest independent set of vertices of a graph G, let χ(G) be its chromatic number, and let n = |V(G)|. Show that, for every G, α(G) ≥ n/χ(G).

2.4 Fast matrix multiplication

Everybody knows how to multiply two 2 × 2 matrices. If we want to calculate

( c11  c12 )     ( a11  a12 ) ( b11  b12 )
( c21  c22 )  =  ( a21  a22 ) ( b21  b22 )        (2.4.1)

then, ‘of course,’

c_{i,j} = ∑_{k=1}^{2} a_{i,k} b_{k,j}    (i, j = 1, 2).    (2.4.2)

Now look at (2.4.2) a little more closely. In order to calculate each one of the 4 c_{i,j}’s we have to do 2 multiplications of numbers. The cost of multiplying two 2 × 2 matrices is therefore 8 multiplications of numbers. If we measure the cost in units of additions of numbers, the cost is 4 such additions. Hence, the matrix multiplication method that is shown in (2.4.1) has a complexity of 8 multiplications of numbers and 4 additions of numbers.

This may seem rather unstartling, but the best ideas often have humble origins.

Suppose we could find another way of multiplying two 2 × 2 matrices in which the cost was only 7

multiplications of numbers, together with more than 4 additions of numbers. Would that be a cause for

dancing in the streets, or would it be just a curiosity, of little importance? In fact, it would be extremely

important, and the consequences of such a step were fully appreciated only in 1969 by V. Strassen, to whom

the ideas that we are now discussing are due.*

What we’re going to do next in this section is the following:

(a) describe another way of multiplying two 2 × 2 matrices in which the cost will be only 7 multiplications of numbers plus a bunch of additions of numbers, and

(b) convince you that it was worth the trouble.

The usefulness of the idea stems from the following amazing fact: if two 2 × 2 matrices can be multiplied with only 7 multiplications of numbers, then two N × N matrices can be multiplied using only O(N^{2.81...})

* V. Strassen, Gaussian elimination is not optimal, Numerische Math. 13 (1969), 354-6.


multiplications of numbers instead of the N³ such multiplications that the usual method involves (the number ‘2.81...’ is log₂ 7).

In other words, if we can reduce the number of multiplications of numbers that are needed to multiply two 2 × 2 matrices, then that improvement will show up in the exponent of N when we measure the complexity of multiplying two N × N matrices. The reason, as we will see, is that the little improvement will be pyramided by numerous recursive calls to the 2 × 2 procedure; but we get ahead of the story.

Now let’s write out another way to do the 2 × 2 matrix multiplication that is shown in (2.4.1). Instead of doing it à la (2.4.2), try the following 11-step approach.

First compute, from the input 2 × 2 matrices shown in (2.4.1), the following 7 quantities:

I = (a12 − a22) × (b21 + b22)
II = (a11 + a22) × (b11 + b22)
III = (a11 − a21) × (b11 + b12)
IV = (a11 + a12) × b22
V = a11 × (b12 − b22)
VI = a22 × (b21 − b11)
VII = (a21 + a22) × b11        (2.4.3)

and then calculate the 4 entries of the product matrix C = AB from the 4 formulas

c11 = I + II − IV + VI
c12 = IV + V
c21 = VI + VII
c22 = II − III + V − VII.        (2.4.4)
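Formulas (2.4.3)–(2.4.4) can be checked mechanically. A Python sketch (the function name is ours) that multiplies two 2 × 2 matrices with exactly these seven products:

```python
def strassen_2x2(a, b):
    """Multiply 2x2 matrices a, b (nested lists), using the 7 products (2.4.3)."""
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b
    I   = (a12 - a22) * (b21 + b22)
    II  = (a11 + a22) * (b11 + b22)
    III = (a11 - a21) * (b11 + b12)
    IV  = (a11 + a12) * b22
    V   = a11 * (b12 - b22)
    VI  = a22 * (b21 - b11)
    VII = (a21 + a22) * b11
    # assemble C = AB from (2.4.4)
    return [[I + II - IV + VI, IV + V],
            [VI + VII, II - III + V - VII]]
```

Expanding the seven products by hand confirms, for example, that c11 = a11 b11 + a12 b21, as in (2.4.2).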

The first thing to notice about this seemingly overelaborate method of multiplying 2 × 2 matrices is that only 7 multiplications of numbers are used (count the ‘×’ signs in (2.4.3)). ‘Well yes,’ you might reply, ‘but 18 additions are needed, so where is the gain?’

It will turn out that multiplications are more important than additions, not because computers can do them faster, but because when the routine is called recursively each ‘×’ operation will turn into a multiplication of two big matrices whereas each ‘±’ will turn into an addition or subtraction of two big matrices, and that’s much cheaper.

Next we’re going to describe how Strassen’s method (equations (2.4.3), (2.4.4)) of multiplying 2 × 2 matrices can be used to speed up multiplications of N × N matrices. The basic idea is that we will partition each of the large matrices into four smaller ones and multiply them together using (2.4.3), (2.4.4).

Suppose that N is a power of 2, say N = 2^n, and let there be given two N × N matrices, A and B. We imagine that A and B have each been partitioned into four 2^{n−1} × 2^{n−1} matrices, and that the product matrix C is similarly partitioned. Hence we want to do the matrix multiplication that is indicated by

( C11  C12 )     ( A11  A12 ) ( B11  B12 )
( C21  C22 )  =  ( A21  A22 ) ( B21  B22 )        (2.4.5)

where now each of the capital letters represents a 2^{n−1} × 2^{n−1} matrix.

To do the job in (2.4.5) we use exactly the 11 formulas that are shown in (2.4.3) and (2.4.4), except that the lower-case letters are now all upper case. Suddenly we very much appreciate the reduction of the number of ‘×’ signs because it means one less multiplication of large matrices, and we don’t so much mind that it has been replaced by 10 more ‘±’ signs, at least not if N is very large.

This yields the following recursive procedure for multiplying large matrices.


function MatrProd(A, B: matrix; N: integer): matrix;
{MatrProd is AB, where A and B are N × N}
{uses Strassen method}
if N is not a power of 2 then
  border A and B by rows and columns of 0’s until
  their size is the next power of 2 and change N;
if N = 1 then MatrProd := AB
else
  partition A and B as shown in (2.4.5);
  I := MatrProd(A11 − A22, B21 + B22, N/2);
  II := MatrProd(A11 + A22, B11 + B22, N/2);
  etc. etc., through all 11 of the formulas
  shown in (2.4.3), (2.4.4), ending with . . .
  C22 := II − III + V − VII
end.{MatrProd}
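A complete executable rendering of this recursion, in Python (our own sketch: the names, the padding helper, and the base case at N = 1 are ours; the seven recursive products follow (2.4.3)–(2.4.4)):

```python
def _add(x, y, sign=1):
    # entrywise x + sign*y for equal-size square matrices
    return [[xi + sign * yi for xi, yi in zip(rx, ry)] for rx, ry in zip(x, y)]

def _strassen(a, b):
    # a, b are n x n nested lists with n a power of 2
    n = len(a)
    if n == 1:
        return [[a[0][0] * b[0][0]]]
    h = n // 2
    def quad(x, i, j):  # the (i, j) quadrant of x, as in (2.4.5)
        return [row[j * h:(j + 1) * h] for row in x[i * h:(i + 1) * h]]
    a11, a12, a21, a22 = quad(a, 0, 0), quad(a, 0, 1), quad(a, 1, 0), quad(a, 1, 1)
    b11, b12, b21, b22 = quad(b, 0, 0), quad(b, 0, 1), quad(b, 1, 0), quad(b, 1, 1)
    I   = _strassen(_add(a12, a22, -1), _add(b21, b22))
    II  = _strassen(_add(a11, a22), _add(b11, b22))
    III = _strassen(_add(a11, a21, -1), _add(b11, b12))
    IV  = _strassen(_add(a11, a12), b22)
    V   = _strassen(a11, _add(b12, b22, -1))
    VI  = _strassen(a22, _add(b21, b11, -1))
    VII = _strassen(_add(a21, a22), b11)
    c11 = _add(_add(I, II), _add(VI, IV, -1))    # I + II - IV + VI
    c12 = _add(IV, V)
    c21 = _add(VI, VII)
    c22 = _add(_add(II, V), _add(III, VII), -1)  # II - III + V - VII
    top = [r1 + r2 for r1, r2 in zip(c11, c12)]
    bot = [r1 + r2 for r1, r2 in zip(c21, c22)]
    return top + bot

def matr_prod(a, b):
    """Strassen multiply for any square size: border with 0's up to a power of 2."""
    n = len(a)
    m = 1
    while m < n:
        m *= 2
    def pad(x):
        return [row + [0] * (m - n) for row in x] + [[0] * m for _ in range(m - n)]
    c = _strassen(pad(a), pad(b))
    return [row[:n] for row in c[:n]]
```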

Note that this procedure calls itself recursively 7 times. The plus and minus signs in the program each

represent an addition or subtraction of two matrices, and therefore each one of them involves a call to a

matrix addition or subtraction procedure (just the usual method of adding, nothing fancy!). Therefore the

function MatrProd makes 25 calls, 7 of which are recursively to itself, and 18 of which are to a matrix

addition/subtraction routine.

We will now study the complexity of the routine in two ways. We will count the number of multiplications of numbers that are needed to multiply two 2^n × 2^n matrices using MatrProd (call that number f(n)), and then we will count the number of additions of numbers (call it g(n)) that MatrProd needs in order to multiply two 2^n × 2^n matrices.

The multiplications of numbers are easy to count. MatrProd calls itself 7 times, in each of which it does exactly f(n − 1) multiplications of numbers, hence f(n) = 7f(n − 1) and f(0) = 1 (why?). Therefore we see that f(n) = 7^n for all n ≥ 0. Hence MatrProd does 7^n multiplications of numbers in order to do one multiplication of 2^n × 2^n matrices.

Let’s take the last sentence in the above paragraph and replace ‘2^n’ by N throughout. It then tells us that MatrProd does 7^{log N/log 2} multiplications of numbers in order to do one multiplication of N × N matrices. Since 7^{log N/log 2} = N^{log 7/log 2} = N^{2.81...}, we see that Strassen’s method uses only O(N^{2.81}) multiplications of numbers, in place of the N³ such multiplications that are required by the usual formulas.
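The counts are simple to verify numerically; a quick check (our code) that f(n) = 7^n and that the resulting exponent of N, namely log₂ 7, is about 2.807:

```python
import math

def mult_count(n):
    """f(n) = 7 f(n-1), f(0) = 1: multiplications used on 2^n x 2^n matrices."""
    return 1 if n == 0 else 7 * mult_count(n - 1)
```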

It remains to count the additions/subtractions of numbers that are needed by MatrProd.

In each of its 7 recursive calls to itself MatrProd does g(n − 1) additions of numbers. In each of its 18 calls to the procedure that adds or subtracts matrices it does a number of additions of numbers that is equal to the square of the size of the matrices that are being added or subtracted. That size is 2^{n−1}, so each of the 18 such calls does 2^{2n−2} additions of numbers. It follows that g(0) = 0 and for n ≥ 1 we have

g(n) = 7g(n − 1) + 18 · 4^{n−1} = 7g(n − 1) + (9/2) · 4^n.

We follow the method of section 1.4 on this first-order linear difference equation. Hence we make the change of variable g(n) = 7^n y_n (n ≥ 0) and we find that y_0 = 0 and, for n ≥ 1,

y_n = y_{n−1} + (9/2)(4/7)^n.

If we sum over n we obtain

y_n = (9/2) ∑_{j=1}^{n} (4/7)^j ≤ (9/2) ∑_{j=0}^{∞} (4/7)^j = 21/2.


Finally, g(n) = 7^n y_n ≤ (10.5) · 7^n = O(7^n), and this is O(N^{2.81}) as before. This completes the proof of

Theorem 2.4.1. In Strassen’s method of fast matrix multiplication the number of multiplications of numbers, of additions of numbers and of subtractions of numbers that are needed to multiply together two N × N matrices are each O(N^{2.81}) (in contrast to the Θ(N³) of the conventional method).
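The addition count in the proof can also be checked against the bound g(n) ≤ (10.5) · 7^n; a quick numeric sketch (our code):

```python
def add_count(n):
    """g(n) = 7 g(n-1) + 18 * 4^(n-1), g(0) = 0: additions of numbers."""
    return 0 if n == 0 else 7 * add_count(n - 1) + 18 * 4 ** (n - 1)
```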

In the years that have elapsed since Strassen’s original paper many researchers have been whittling away at the exponent of N in the complexity bounds. Several new, and more elaborate, algorithms have been developed, and the exponent, which was originally 3, has progressed downwards through 2.81 to values below 2.5. It is widely believed that the true minimum exponent is 2 + ε, i.e., that two N × N matrices can be multiplied in time O(N^{2+ε}), but there seems to be a good deal of work to be done before that result can be achieved.

Exercises for section 2.4

1. Suppose we could multiply together two 3 × 3 matrices with only 22 multiplications of numbers. How fast, recursively, would we then be able to multiply two N × N matrices?

2. (cont.) With what would the ‘22’ in problem 1 above have to be replaced in order to achieve an

improvement over Strassen’s algorithm given in the text?

3. (cont.) Still more generally, with how few multiplications would we have to be able to multiply two M × M matrices in order to insure that recursively we would then be able to multiply two N × N matrices faster than the method given in this section?

4. We showed in the text that if N is a power of 2 then two N × N matrices can be multiplied in at most time CN^{log₂ 7}, where C is a suitable constant. Prove that if N is not a power of 2 then two N × N matrices can be multiplied in time at most 7CN^{log₂ 7}.

2.5 The discrete Fourier transform

It is a lot easier to multiply two numbers than to multiply two polynomials.

If you should want to multiply two polynomials f and g, of degrees 77 and 94, respectively, you are in for a lot of work. To calculate just one coefficient of the product is already a lot of work. Think about the calculation of the coefficient of x^50 in the product, for instance, and you will see that about 50 numbers must be multiplied together and added in order to calculate just that one coefficient of fg, and there are 171 other coefficients to calculate!

Instead of calculating the coeﬃcients of the product fg it would be much easier just to calculate the

values of the product at, say, 172 points. To do that we could just multiply the values of f and of g at each

of those points, and after a total cost of 172 multiplications we would have the values of the product.

The values of the product polynomial at 172 distinct points determine that polynomial completely, so

that sequence of values is the answer. It’s just that we humans prefer to see polynomials given by means of

their coeﬃcients instead of by their values.

The Fourier transform, that is the subject of this section, is a method of converting from one representa-

tion of a polynomial to another. More exactly, it converts from the sequence of coeﬃcients of the polynomial

to the sequence of values of that polynomial at a certain set of points. Ease of converting between these two

representations of a polynomial is vitally important for many reasons, including multiplication of polynomi-

als, high precision integer arithmetic in computers, creation of medical images in CAT scanners and NMR

scanners, etc.

Hence, in this section we will study the discrete Fourier transform of a ﬁnite sequence of numbers,

methods of calculating it, and some applications.

This is a computational problem which at ﬁrst glance seems very simple. What we’re asked to do,

basically, is to evaluate a polynomial of degree n − 1 at n diﬀerent points. So what could be so diﬃcult

about that?

If we just calculate the n values by brute force, we certainly won’t need to do more than n multiplications of numbers to find each of the n values of the polynomial that we want, so we surely don’t need more than O(n²) multiplications altogether.
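The brute-force method is the natural baseline. A sketch in Python (the function name is ours), using Horner's rule at each of the n points, hence about n multiplications per point and O(n²) in all:

```python
def eval_at_points(coeffs, points):
    """Evaluate the polynomial sum_j coeffs[j] x^j at each point (brute force)."""
    values = []
    for x in points:
        acc = 0
        for c in reversed(coeffs):  # Horner's rule: about n multiplications
            acc = acc * x + c
        values.append(acc)
    return values
```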


The interesting thing is that this particular problem is so important, and turns up in so many different applications, that it really pays to be very efficient about how the calculation is done. We will see in this section that if we use a fairly subtle method of doing this computation instead of the obvious method, then the work can be cut down from O(n^2) to O(n log n). In view of the huge arrays on which this program is often run, the saving is very much worthwhile.

One can think of the Fourier transform as being a way of changing the description, or coding, of a polynomial, so we will introduce the subject by discussing it from that point of view.

Next we will discuss the obvious way of computing the transform.

Then we will describe the ‘Fast Fourier Transform’, which is a rather un-obvious, but very fast, method of computing the same creature.

Finally we will discuss an important application of the subject, to the fast multiplication of polynomials.

There are many different ways that one might choose to describe (‘encode’) a particular polynomial. Take the polynomial f(t) = t(6 − 5t + t^2), for instance. This can be uniquely described in any of the following ways (and a lot more).

It is the polynomial whose
(i) coefficients are 0, 6, −5, 1, or whose
(ii) roots are 0, 2 and 3, and whose highest coefficient is 1, or whose
(iii) values at t = 0, 1, 2, 3 are 0, 2, 0, 0, respectively, or whose
(iv) values at the fourth roots of unity 1, i, −1, −i are 2, 5+5i, −12, 5−5i, etc.

We want to focus on two of these ways of representing a polynomial. The first is by its coefficient sequence; the second is by its sequence of values at the n-th roots of unity, where n is 1 more than the degree of the polynomial. The process by which we pass from the coefficient sequence to the sequence of values at the roots of unity is called forming the Fourier transform of the coefficient sequence. To use the example above, we would say that the Fourier transform of the sequence

    0, 6, −5, 1    (2.5.1)

is the sequence

    2, 5+5i, −12, 5−5i.    (2.5.2)

In general, if we are given a sequence

    x_0, x_1, ..., x_{n−1}    (2.5.3)

then we think of the polynomial

    f(t) = x_0 + x_1 t + x_2 t^2 + ... + x_{n−1} t^{n−1}    (2.5.4)

and we compute its values at the n-th roots of unity. These roots of unity are the numbers

    ω_j = e^{2πij/n}    (j = 0, 1, ..., n−1).    (2.5.5)

Consequently, if we calculate the values of the polynomial (2.5.4) at the n numbers (2.5.5), we find the Fourier transform of the given sequence (2.5.3) to be the sequence

    f(ω_j) = Σ_{k=0}^{n−1} x_k ω_j^k = Σ_{k=0}^{n−1} x_k e^{2πijk/n}    (j = 0, 1, ..., n−1).    (2.5.6)

Before proceeding, the reader should pause for a moment and make sure that the fact that (2.5.1)-(2.5.2) is a special case of (2.5.3)-(2.5.6) is clearly understood. The Fourier transform of a sequence of n numbers is another sequence of n numbers, namely the sequence of values at the n-th roots of unity of the very same polynomial whose coefficients are the members of the original sequence.


The Fourier transform moves us from coefficients to values at roots of unity. Some good reasons for wanting to make that trip will appear presently, but for the moment, let's consider the computational side of the question, namely how to compute the Fourier transform efficiently.

We are going to derive an elegant and very speedy algorithm for the evaluation of Fourier transforms. The algorithm is called the Fast Fourier Transform (FFT) algorithm. In order to appreciate how fast it is, let's see how long it would take us to calculate the transform without any very clever procedure.

What we have to do is to compute the values of a given polynomial at n given points. How much work is required to calculate the value of a polynomial at one given point? If we want to calculate the value of the polynomial x_0 + x_1 t + x_2 t^2 + ... + x_{n−1} t^{n−1} at exactly one value of t, then we can do (think how you would do it, before looking)

    function value(x: coeff array; n: integer; t: complex);
    {computes value := x_0 + x_1 t + ... + x_{n−1} t^{n−1}}
    value := 0;
    for j := n−1 to 0 step −1 do
        value := t · value + x_j
    end. {value}

This well-known algorithm (= ‘synthetic division’) for computing the value of a polynomial at a single point t obviously runs in time O(n).
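In Python, the same one-point evaluation scheme might be sketched as follows (the function name and list representation are our illustrative choices, not from the text):

```python
def value(x, t):
    """Evaluate x[0] + x[1]*t + ... + x[n-1]*t**(n-1) by the
    synthetic-division scheme above, using n-1 multiplications."""
    v = 0
    for coeff in reversed(x):   # j runs from n-1 down to 0
        v = t * v + coeff
    return v
```

For the example polynomial f(t) = t(6 − 5t + t^2) of this section, value([0, 6, -5, 1], 2) returns 0, matching the root at t = 2, and value([0, 6, -5, 1], 1j) returns 5+5j, matching item (iv) above.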

If we calculate the Fourier transform of a given sequence of n points by calling the function value n times, once for each point of evaluation, then obviously we are looking at a simple algorithm that requires Θ(n^2) time.

With the FFT we will see that the whole job can be done in time O(n log n), and we will then look at some implications of that fact. To put it another way, the cost of calculating all n of the values of a polynomial f at the n-th roots of unity is much less than n times the cost of one such calculation.

First we consider the important case where n is a power of 2, say n = 2^r. Then the values of f, a polynomial of degree 2^r − 1, at the (2^r)-th roots of unity are, from (2.5.6),

    f(ω_j) = Σ_{k=0}^{n−1} x_k exp{2πijk/2^r}    (j = 0, 1, ..., 2^r − 1).    (2.5.7)

Let's break up the sum into two sums, containing respectively the terms where k is even and those where k is odd. In the first sum write k = 2m and in the second put k = 2m+1. Then, for each j = 0, 1, ..., 2^r − 1,

    f(ω_j) = Σ_{m=0}^{2^{r−1}−1} x_{2m} e^{2πijm/2^{r−1}} + Σ_{m=0}^{2^{r−1}−1} x_{2m+1} e^{2πij(2m+1)/2^r}
           = Σ_{m=0}^{2^{r−1}−1} x_{2m} e^{2πijm/2^{r−1}} + e^{2πij/2^r} Σ_{m=0}^{2^{r−1}−1} x_{2m+1} e^{2πijm/2^{r−1}}.    (2.5.8)

Something special just happened. Each of the two sums that appear in the last member of (2.5.8) is itself a Fourier transform, of a shorter sequence. The first sum is the transform of the array

    x[0], x[2], x[4], ..., x[2^r − 2]    (2.5.9)

and the second sum is the transform of

    x[1], x[3], x[5], ..., x[2^r − 1].    (2.5.10)

The stage is set (well, almost set) for a recursive program.


There is one small problem, though. In (2.5.8) we want to compute f(ω_j) for 2^r values of j, namely for j = 0, 1, ..., 2^r − 1. However, the Fourier transform of the shorter sequence (2.5.9) is defined for only 2^{r−1} values of j, namely for j = 0, 1, ..., 2^{r−1} − 1. So if we calculate the first sum by a recursive call, then we will need its values for j's that are outside the range for which it was computed.

This problem is no sooner recognized than solved. Let Q(j) denote the first sum in (2.5.8). Then we claim that Q(j) is a periodic function of j, of period 2^{r−1}, because

    Q(j + 2^{r−1}) = Σ_{m=0}^{2^{r−1}−1} x_{2m} exp{2πim(j + 2^{r−1})/2^{r−1}}
                   = Σ_{m=0}^{2^{r−1}−1} x_{2m} exp{2πimj/2^{r−1}} e^{2πim}
                   = Σ_{m=0}^{2^{r−1}−1} x_{2m} exp{2πimj/2^{r−1}}
                   = Q(j)    (2.5.11)

for all integers j. If Q(j) has been computed only for 0 ≤ j ≤ 2^{r−1} − 1 and if we should want its value for some j ≥ 2^{r−1} then we can get that value by asking for Q(j mod 2^{r−1}).

Now we can state the recursive form of the Fast Fourier Transform algorithm in the (most important) case where n is a power of 2. In the algorithm we will use the type complexarray to denote an array of complex numbers.

    function FFT(n: integer; x: complexarray): complexarray;
    {computes fast Fourier transform of n = 2^k numbers x}
    if n = 1 then FFT[0] := x[0]
    else
        evenarray := {x[0], x[2], ..., x[n−2]};
        oddarray := {x[1], x[3], ..., x[n−1]};
        {u[0], u[1], ..., u[n/2 − 1]} := FFT(n/2, evenarray);
        {v[0], v[1], ..., v[n/2 − 1]} := FFT(n/2, oddarray);
        for j := 0 to n−1 do
            τ := exp{2πij/n};
            FFT[j] := u[j mod n/2] + τ · v[j mod n/2]
    end. {FFT}
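For readers who want to experiment, here is one possible Python transcription of the procedure above (the name fft_pow2 and the list representation are ours; the pseudocode is the authority):

```python
import cmath

def fft_pow2(x):
    """Recursive FFT of a sequence whose length n is a power of 2,
    following the pseudocode above: transform the even- and odd-indexed
    subarrays, then combine with
    FFT[j] = u[j mod n/2] + exp(2*pi*i*j/n) * v[j mod n/2]."""
    n = len(x)
    if n == 1:
        return [x[0]]
    u = fft_pow2(x[0::2])   # transform of the even-indexed entries, (2.5.9)
    v = fft_pow2(x[1::2])   # transform of the odd-indexed entries, (2.5.10)
    return [u[j % (n // 2)] + cmath.exp(2j * cmath.pi * j / n) * v[j % (n // 2)]
            for j in range(n)]
```

On the section's example sequence 0, 6, −5, 1 this returns (up to roundoff) 2, 5+5i, −12, 5−5i, i.e. the sequence (2.5.2).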

Let y(k) denote the number of multiplications of complex numbers that will be done if we call FFT on an array whose length is n = 2^k. The call to FFT(n/2, evenarray) costs y(k − 1) multiplications, as does the call to FFT(n/2, oddarray). The ‘for j := 0 to n−1’ loop requires n more multiplications. Hence

    y(k) = 2y(k−1) + 2^k    (k ≥ 1; y(0) = 0).    (2.5.12)

If we change variables by writing y(k) = 2^k z_k, then we find that z_k = z_{k−1} + 1, which, together with z_0 = 0, implies that z_k = k for all k ≥ 0, and therefore that y(k) = k·2^k. This proves

Theorem 2.5.1. The Fourier transform of a sequence of n complex numbers is computed using only O(n log n) multiplications of complex numbers by means of the procedure FFT, if n is a power of 2.

Next* we will discuss the situation when n is not a power of 2.

The reader may observe that by ‘padding out’ the input array with additional 0's we can extend the length of the array until it becomes a power of 2, and then call the FFT procedure that we have already discussed. In a particular application, that may or may not be acceptable. The problem is that the original question asked for the values of the input polynomial at the n-th roots of unity, but after the padding, we will find the values at the N-th roots of unity, where N is the next power of 2. In some applications, such as the multiplication of polynomials that we will discuss later in this section, that change is acceptable, but in others the substitution of N-th roots for n-th roots may not be permitted.

* The remainder of this section can be omitted at a first reading.

We will suppose that the FFT of a sequence of n numbers is wanted, where n is not a power of 2, and where the padding operation is not acceptable. If n is a prime number we will have nothing more to say, i.e., we will not discuss any improvements to the obvious method for calculating the transform, one root of unity at a time.

Suppose that n is not prime (n is ‘composite’). Then we can factor the integer n in some nontrivial way, say n = r_1 r_2, where neither r_1 nor r_2 is 1.

We claim, then, that the Fourier transform of a sequence of length n can be computed by recursively finding the Fourier transforms of r_1 different sequences, each of length r_2. The method is a straightforward generalization of the idea that we have already used in the case where n was a power of 2.

In the following we will write ξ_n = e^{2πi/n}. The train of ‘=’ signs in the equation below shows how the question on an input array of length n is changed into r_1 questions about input arrays of length r_2. We have, for the value of the input polynomial f at the j-th one of the n n-th roots of unity, the relations

    f(e^{2πij/n}) = Σ_{s=0}^{n−1} x_s ξ_n^{js}
                  = Σ_{k=0}^{r_1−1} Σ_{t=0}^{r_2−1} x_{t r_1 + k} ξ_n^{j(t r_1 + k)}
                  = Σ_{k=0}^{r_1−1} Σ_{t=0}^{r_2−1} x_{t r_1 + k} ξ_n^{t j r_1} ξ_n^{kj}
                  = Σ_{k=0}^{r_1−1} ( Σ_{t=0}^{r_2−1} x_{t r_1 + k} ξ_{r_2}^{tj} ) ξ_n^{kj}
                  = Σ_{k=0}^{r_1−1} a_k(j) ξ_n^{kj}.    (2.5.13)

We will discuss (2.5.13) line by line. The first ‘=’ sign is the definition of the j-th entry of the Fourier transform of the input array x. The second equality uses the fact that every integer s such that 0 ≤ s ≤ n−1 can be uniquely written in the form s = t r_1 + k, where 0 ≤ t ≤ r_2 − 1 and 0 ≤ k ≤ r_1 − 1. The next ‘=’ is just a rearrangement, but the next one uses the all-important fact that ξ_n^{r_1} = ξ_{r_2} (why?), and in the last equation we are simply defining a set of numbers

    a_k(j) = Σ_{t=0}^{r_2−1} x_{t r_1 + k} ξ_{r_2}^{tj}    (0 ≤ k ≤ r_1 − 1; 0 ≤ j ≤ n − 1).    (2.5.14)

The important thing to notice is that for a fixed k the numbers a_k(j) are periodic in j, of period r_2, i.e., that a_k(j + r_2) = a_k(j) for all j. Hence, even though the values of the a_k(j) are needed for j = 0, 1, ..., n−1, they need be computed only for j = 0, 1, ..., r_2 − 1.

Now the entire job can be done recursively, because for fixed k the set of values of a_k(j) (j = 0, 1, ..., r_2 − 1) that we must compute is itself a Fourier transform, namely of the sequence

    {x_{t r_1 + k}}    (t = 0, 1, ..., r_2 − 1).    (2.5.15)

Let g(n) denote the number of complex multiplications that are needed to compute the Fourier transform of a sequence of n numbers. Then, for k fixed, we can recursively compute the r_2 values of a_k(j) that we need with g(r_2) multiplications of complex numbers. There are r_1 such fixed values of k for which we must do the computation, hence all of the necessary values of a_k(j) can be found with r_1 g(r_2) complex multiplications. Once the a_k(j) are all in hand, then the computation of each one value of the transform from (2.5.13) will require an additional r_1 − 1 complex multiplications. Since n = r_1 r_2 values of the transform have to be computed, we will need r_1 r_2 (r_1 − 1) complex multiplications for this stage.

The complete computation needs r_1 g(r_2) + r_1^2 r_2 − r_1 r_2 multiplications if we choose a particular factorization n = r_1 r_2. The factorization that should be chosen is the one that minimizes the labor, so we have the recurrence

    g(n) = min_{n = r_1 r_2} {r_1 g(r_2) + r_1^2 r_2} − n.    (2.5.16)

If n = p is a prime number then there are no factorizations to choose from and our algorithm is no help at all. There is no recourse but to calculate the p values of the transform directly from the definition (2.5.6), and that will require p − 1 complex multiplications to be done in order to get each of those p values. Hence we have, in addition to the recurrence formula (2.5.16), the special values

    g(p) = p(p − 1)    (if p is prime).    (2.5.17)

The recurrence formula (2.5.16) together with the starting values that are shown in (2.5.17) completely determine the function g(n). Before proceeding, the reader is invited to calculate g(12) and g(18).

We are going to work out the exact solution of the interesting recurrence (2.5.16), (2.5.17), and when we are finished we will see which factorization of n is the best one to choose. If we leave that question in abeyance for a while, though, we can summarize by stating the (otherwise) complete algorithm for the fast Fourier transform.

    function FFT(x: complexarray; n: integer): complexarray;
    {computes Fourier transform of a sequence x of length n}
    if n is prime
    then
        for j := 0 to n−1 do
            FFT[j] := Σ_{k=0}^{n−1} x[k] ξ_n^{jk}
    else
        let n = r_1 r_2 be some factorization of n;
        {see below for best choice of r_1, r_2}
        for k := 0 to r_1 − 1 do
            {a_k[0], a_k[1], ..., a_k[r_2 − 1]}
                := FFT({x[k], x[k + r_1], ..., x[k + (r_2 − 1) r_1]}, r_2);
        for j := 0 to n−1 do
            FFT[j] := Σ_{k=0}^{r_1−1} a_k[j mod r_2] ξ_n^{kj}
    end. {FFT}
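The general procedure can also be sketched in Python (our own transcription; taking r_1 to be the smallest prime factor of n anticipates the best-choice result proved below):

```python
import cmath

def fft_general(x):
    """Fourier transform of a sequence of any length n, following the
    pseudocode above.  For prime n, compute directly from the definition
    (2.5.6); otherwise write n = r1*r2 and recurse on r1 subsequences,
    each of length r2, combining them according to (2.5.13)."""
    n = len(x)
    if n == 1:
        return [x[0]]
    xi = lambda num, j, k: cmath.exp(2j * cmath.pi * j * k / num)
    r1 = next(d for d in range(2, n + 1) if n % d == 0)  # smallest prime factor
    if r1 == n:
        # n is prime: the obvious method, one root of unity at a time
        return [sum(x[k] * xi(n, j, k) for k in range(n)) for j in range(n)]
    r2 = n // r1
    # a[k] is the transform of the subsequence x[k], x[k+r1], ... of (2.5.15)
    a = [fft_general(x[k::r1]) for k in range(r1)]
    # combine via (2.5.13), using the period-r2 behavior of a_k(j)
    return [sum(a[k][j % r2] * xi(n, k, j) for k in range(r1)) for j in range(n)]
```

For lengths such as n = 6 or n = 9 this agrees, up to roundoff, with the direct evaluation of (2.5.6).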

Our next task will be to solve the recurrence relations (2.5.16), (2.5.17), and thereby to learn the best choice of the factorization of n.

Let g(n) = n·h(n), where h is a new unknown function. Then the recurrence that we have to solve takes the form

    h(n) = { min_d {h(n/d) + d} − 1,  if n is composite;
           { n − 1,                   if n is prime.    (2.5.18)

In (2.5.18), the ‘min’ is taken over all d that divide n other than d = 1 and d = n.

The above relation determines the value of h for all positive integers n. For example,

    h(15) = min_d (h(15/d) + d) − 1
          = min(h(5) + 3, h(3) + 5) − 1
          = min(7, 7) − 1 = 6

and so forth.
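The recurrence (2.5.18) is easy to evaluate mechanically. The following memoized Python sketch (the names h and g are ours) computes h and g directly from (2.5.18) and (2.5.17), and in particular settles the invitation above to compute g(12) and g(18):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def h(n):
    """Solve (2.5.18): h(n) = min over proper divisors d of h(n/d) + d,
    minus 1, for composite n, and h(p) = p - 1 for prime p."""
    divisors = [d for d in range(2, n) if n % d == 0]
    if not divisors:            # n is prime (or n = 1)
        return n - 1
    return min(h(n // d) + d for d in divisors) - 1

def g(n):
    """g(n) = n * h(n), the multiplication count of (2.5.16)-(2.5.17)."""
    return n * h(n)
```

This gives g(12) = 48 and g(18) = 90.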

To find the solution in a pleasant form, let

    n = p_1^{a_1} p_2^{a_2} ··· p_s^{a_s}    (2.5.19)

be the canonical factorization of n into primes. We claim that the function

    h(n) = a_1(p_1 − 1) + a_2(p_2 − 1) + ··· + a_s(p_s − 1)    (2.5.20)

is the solution of (2.5.18) (this claim is obviously (?) correct if n is prime).

To prove the claim in general, suppose it to be true for 1, 2, ..., n−1, and suppose that n is not prime. Then every divisor d of n must be of the form d = p_1^{b_1} p_2^{b_2} ··· p_s^{b_s}, where the primes p_i are the same as those that appear in (2.5.19) and each b_i is ≤ a_i. Hence from (2.5.18) we get

    h(n) = min_b {(a_1 − b_1)(p_1 − 1) + ··· + (a_s − b_s)(p_s − 1) + p_1^{b_1} ··· p_s^{b_s}} − 1    (2.5.21)

where now the ‘min’ extends over all admissible choices of the b's, namely exponents b_1, ..., b_s such that 0 ≤ b_i ≤ a_i (for all i = 1, ..., s) and not all b_i are 0 and not all b_i = a_i.

One such admissible choice would be to take, say, b_j = 1 and all other b_i = 0. If we let H(b_1, ..., b_s) denote the quantity in braces in (2.5.21), then with this choice the value of H would be a_1(p_1 − 1) + ··· + a_s(p_s − 1) + 1, exactly what we need to prove our claim (2.5.20). Hence what we have to show is that the above choice of the b_i's is the best one. We will show that if one of the b_i is larger than 1 then we can reduce it without increasing the value of H.

To prove this, observe that for each i = 1, ..., s we have

    H(b_1, ..., b_i + 1, ..., b_s) − H(b_1, ..., b_s) = −(p_i − 1) + d(p_i − 1)
                                                      = (d − 1)(p_i − 1),

where d = p_1^{b_1} ··· p_s^{b_s} is the divisor that corresponds to the exponents b. Since the divisor d ≥ 2 and the prime p_i ≥ 2, the last difference is nonnegative. Hence H doesn't increase if we decrease one of the b's by 1 unit, as long as not all b_i = 0. It follows that the minimum of H occurs among the prime divisors d of n. Further, if d is prime, then we can easily check from (2.5.21) that it doesn't matter which prime divisor of n we choose to be d; the function h(n) is always given by (2.5.20). If we recall the change of variable g(n) = n·h(n) we find that we have proved

Theorem 2.5.2. (Complexity of the Fast Fourier Transform) The best choice of the factorization n = r_1 r_2 in algorithm FFT is to take r_1 to be a prime divisor of n. If that is done, then algorithm FFT requires

    g(n) = n(a_1(p_1 − 1) + a_2(p_2 − 1) + ··· + a_s(p_s − 1))

complex multiplications in order to do its job, where n = p_1^{a_1} ··· p_s^{a_s} is the canonical factorization of the integer n.

Table 2.5.1 shows the number g(n) of complex multiplications required by FFT as a function of n. The saving over the straightforward algorithm that uses n(n − 1) multiplications for each n is apparent.

If n is a power of 2, say n = 2^q, then the formula of Theorem 2.5.2 reduces to g(n) = n log n / log 2, in agreement with Theorem 2.5.1. What does the formula say if n is a power of 3? If n is a product of distinct primes?

2.6 Applications of the FFT

Finally, we will discuss some applications of the FFT. A family of such applications begins with the observation that the FFT provides the fastest game in town for multiplying two polynomials together. Consider a multiplication like

    (1 + 2x + 7x^2 − 2x^3 − x^4) · (4 − 5x − x^2 − x^3 + 11x^4 + x^5).


     n   g(n)        n   g(n)
     2     2        22    242
     3     6        23    506
     4     8        24    120
     5    20        25    200
     6    18        26    338
     7    42        27    162
     8    24        28    224
     9    36        29    812
    10    50        30    210
    11   110        31    930
    12    48        32    160
    13   156        33    396
    14    98        34    578
    15    90        35    350
    16    64        36    216
    17   272        37   1332
    18    90        38    722
    19   342        39    546
    20   120        40    280
    21   168        41   1640

    Table 2.5.1: The complexity of the FFT

We will study the amount of labor that is needed to do this multiplication by the straightforward algorithm, and then we will see how the FFT can help.

If we do this multiplication in the obvious way then there is quite a bit of work to do. The coefficient of x^4 in the product, for instance, is 1·11 + 2·(−1) + 7·(−1) + (−2)·(−5) + (−1)·4 = 8, and 5 multiplications are needed to compute just that single coefficient of the product polynomial.

In the general case, we want to multiply

    (Σ_{i=0}^{n} a_i x^i) · (Σ_{j=0}^{m} b_j x^j).    (2.6.1)

In the product polynomial, the coefficient of x^k is

    Σ_{r=max(0, k−m)}^{min(k, n)} a_r b_{k−r}.    (2.6.2)

For k fixed, the number of terms in the sum (2.6.2) is min(k, n) − max(0, k − m) + 1. If we sum this amount of labor over k = 0, ..., m+n we find that the total amount of labor for multiplication of two polynomials of degrees m and n is Θ(mn). In particular, if the polynomials are of the same degree n then the labor is Θ(n^2).

By using the FFT the amount of labor can be reduced from Θ(n^2) to Θ(n log n).

To understand how this works, let's recall the definition of the Fourier transform of a sequence. It is the sequence of values of the polynomial whose coefficients are the given numbers, at the n-th roots of unity, where n is the length of the input sequence.

Imagine two universes, one in which the residents are used to describing polynomials by means of their coefficients, and another one in which the inhabitants are fond of describing polynomials by their values at roots of unity. In the first universe the locals have to work fairly hard to multiply two polynomials because they have to carry out the operations (2.6.2) in order to find each coefficient of the product.


In the second universe, multiplying two polynomials is a breeze. If we have in front of us the values f(ω) of the polynomial f at the roots of unity, and the values g(ω) of the polynomial g at the same roots of unity, then what are the values (fg)(ω) of the product polynomial fg at the roots of unity? To find each one requires only a single multiplication of two complex numbers, because the value of fg at ω is simply f(ω)g(ω).

Multiplying values is easier than finding the coefficients of the product.

Since we live in a universe where people like to think about polynomials as being given by their coefficient arrays, we have to take a somewhat roundabout route in order to do an efficient multiplication.

Given: A polynomial f, of degree n, and a polynomial g, of degree m, each given by its coefficient array. Wanted: The coefficients of the product polynomial fg, of degree m + n.

Step 1. Let N be the smallest integer that is a power of 2 and is greater than m + n.

Step 2. Think of f and g as polynomials each of whose degrees is N − 1. This means that we should adjoin N − n − 1 more coefficients, all = 0, to the coefficient array of f, and N − m − 1 more coefficients, all = 0, to the coefficient array of g. Now both input coefficient arrays are of length N.

Step 3. Compute the FFT of the array of coefficients of f. Now we are looking at the values of f at the N-th roots of unity. Likewise compute the FFT of the array of coefficients of g to obtain the array of values of g at the same N-th roots of unity. The cost of this step is O(N log N).

Step 4. For each of the N-th roots of unity ω, multiply the number f(ω) by the number g(ω). We now have the numbers f(ω)g(ω), which are exactly the values of the unknown product polynomial fg at the N-th roots of unity. The cost of this step is N multiplications of numbers, one for each ω.

Step 5. We now are looking at the values of fg at the N-th roots of unity, and we want to get back to the coefficients of fg because that was what we were asked for. To go backwards, from values at roots of unity to coefficients, calls for the inverse Fourier transform, which we will describe in a moment. Its cost is also O(N log N).

The answer to the original question has been obtained at a total cost of O(N log N) = O((m+n) log(m+n)) arithmetic operations. It's true that we did have to take a walk from our universe to the next one and back again, but the round trip was a lot cheaper than the O((m+n)^2) cost of a direct multiplication.

It remains to discuss the inverse Fourier transform. Perhaps the neatest way to do that is to juxtapose the formulas for the Fourier transform and for the inverse transform, so as to facilitate comparison of the two, so here they are. If we are given a sequence x_0, x_1, ..., x_{n−1} then the Fourier transform of the sequence is the sequence (see (2.5.6))

    f(ω_j) = Σ_{k=0}^{n−1} x_k e^{2πijk/n}    (j = 0, 1, ..., n−1).    (2.6.3)

Conversely, if we are given the numbers f(ω_j) (j = 0, ..., n−1), then we can recover the coefficient sequence x_0, ..., x_{n−1} by the inverse formulas

    x_k = (1/n) Σ_{j=0}^{n−1} f(ω_j) e^{−2πijk/n}    (k = 0, 1, ..., n−1).    (2.6.4)

The differences between the inverse formulas and the original transform formulas are, first, the appearance of ‘1/n’ in front of the summation and, second, the ‘−’ sign in the exponential. We leave it as an exercise for the reader to verify that these formulas really do invert each other.

We observe that if we are already in possession of a computer program that will find the FFT, then we can use it to calculate the inverse Fourier transform as follows:

(i) Given a sequence {f(ω)} of values of a polynomial at the n-th roots of unity, form the complex conjugate of each member of the sequence.

(ii) Input the conjugated sequence to your FFT program.

(iii) Form the complex conjugate of each entry of the output array, and divide by n. You now have the inverse transform of the input sequence.

The cost is obviously equal to the cost of the FFT plus a linear number of conjugations and divisions by n.
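As a quick check of recipe (i)-(iii), here is a small Python fragment (our own sketch) that recovers the coefficient sequence (2.5.1) from the value sequence (2.5.2):

```python
import cmath

def fft(x):
    """Forward power-of-2 FFT of section 2.5."""
    n = len(x)
    if n == 1:
        return [x[0]]
    u, v = fft(x[0::2]), fft(x[1::2])
    return [u[j % (n // 2)] + cmath.exp(2j * cmath.pi * j / n) * v[j % (n // 2)]
            for j in range(n)]

def inverse_fft(vals):
    conj = [complex(z).conjugate() for z in vals]   # step (i)
    y = fft(conj)                                   # step (ii)
    n = len(vals)
    return [z.conjugate() / n for z in y]           # step (iii)

coeffs = inverse_fft([2, 5 + 5j, -12, 5 - 5j])
# coeffs is, up to roundoff, the sequence 0, 6, -5, 1 of (2.5.1)
```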


An outgrowth of the rapidity with which we can now multiply polynomials is a rethinking of the methods by which we do ultrahigh-precision arithmetic. How fast can we multiply two integers, each of which has ten million bits? By using ideas that developed directly (though not at all trivially) from the ones that we have been discussing, Schönhage and Strassen found the fastest known method for doing such large-scale multiplications of integers. The method relies heavily on the FFT, which may not be too surprising since an integer n is given in terms of its bits b_0, b_1, ..., b_m by the relation

    n = Σ_{i≥0} b_i 2^i.    (2.6.5)

However, the sum in (2.6.5) is seen at once to be the value of a certain polynomial at x = 2. Hence in asking for the bits of the product of two such integers we are asking for something very similar to the coefficients of the product of two polynomials, and indeed the fastest known algorithms for this problem depend upon the Fast Fourier Transform.
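To see the polynomial connection concretely, here is a toy Python version (entirely our illustration; the real Schönhage-Strassen algorithm works with modular arithmetic and is far subtler). The convolution below is exactly the step that the FFT would accelerate:

```python
def multiply_via_bits(a, b):
    """Multiply nonnegative integers by treating their bit strings as
    polynomial coefficient arrays, convolving, and evaluating at x = 2."""
    abits = [(a >> i) & 1 for i in range(a.bit_length())] or [0]
    bbits = [(b >> i) & 1 for i in range(b.bit_length())] or [0]
    # coefficient array of the product polynomial: a convolution as in (2.6.2)
    prod = [0] * (len(abits) + len(bbits) - 1)
    for i, ai in enumerate(abits):
        for j, bj in enumerate(bbits):
            prod[i + j] += ai * bj
    # evaluating the product polynomial at x = 2 propagates the carries
    return sum(c << k for k, c in enumerate(prod))
```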

Exercises for section 2.6

1. Let ω be an n-th root of unity, and let k be a fixed integer. Evaluate

    1 + ω^k + ω^{2k} + ··· + ω^{k(n−1)}.

2. Verify that the relations (2.6.3) and (2.6.4) indeed are inverses of each other.

3. Let f = Σ_{j=0}^{n−1} a_j x^j. Show that

    (1/n) Σ_{ω^n = 1} |f(ω)|^2 = |a_0|^2 + ··· + |a_{n−1}|^2.

4. The values of a certain cubic polynomial at 1, i, −1, −i are 1, 2, 3, 4, respectively. Find its value at 2.

5. Write a program that will do the FFT in the case where the number of data points is a power of 2. Organize your program so as to minimize additional array storage beyond the input and output arrays.

6. Prove that a polynomial of degree n is uniquely determined by its values at n + 1 distinct points.

2.7 A review

Here is a quick review of the algorithms that we studied in this chapter.

Sorting is an easy computational problem. The most obvious way to sort n array elements takes time Θ(n^2). We discussed a recursive algorithm that sorts in an average time of Θ(n log n).

Finding a maximum independent set in a graph is a hard computational problem. The most obvious way to find one might take time Θ(2^n) if the graph G has n vertices. We discussed a recursive method that runs in time Θ(1.39^n). The best known methods run in time Θ(2^{n/3}).

Finding out if a graph is K-colorable is a hard computational problem. The most obvious way to do it takes time Θ(K^n) if G has n vertices. We discussed a recursive method that runs in time O(1.62^{n+E}) if G has n vertices and E edges. One recently developed method* runs in time O((1 + 3^{1/3})^n). We will see in section 5.7 that this problem can be done in an average time that is O(1) for fixed K.

Multiplying two matrices is an easy computational problem. The most obvious way to do it takes time Θ(n^3) if the matrices are n × n. We discussed a recursive method that runs in time O(n^{2.82}). A recent method** runs in time O(n^γ) for some γ < 2.5.

* E. Lawler, A note on the complexity of the chromatic number problem, Information Processing Letters 5 (1976), 66-67.

** D. Coppersmith and S. Winograd, On the asymptotic complexity of matrix multiplication, SIAM J. Comp. 11 (1982), 472-492.


Finding the discrete Fourier transform of an array of n elements is an easy computational problem. The most obvious way to do it takes time Θ(n^2). We discussed a recursive method that runs in time O(n log n) if n is a power of 2.

When we write a program recursively we are making life easier for ourselves and harder for the compiler and the computer. A single call to a recursive program can cause it to execute a tree-full of calls to itself before it is able to respond to our original request.

For example, if we call Quicksort to sort the array

    {5, 8, 13, 9, 15, 29, 44, 71, 67}

then the tree shown in Fig. 2.7.1 might be generated by the compiler.

Fig. 2.7.1: A tree of calls to Quicksort

Again, if we call maxset1 on the 5-cycle, the tree of calls in Fig. 2.3.3 may be created.

A single invocation of chrompoly, where the input graph is a 4-cycle, for instance, might generate the tree of recursive calls that appears in Fig. 2.7.2.

Fig. 2.7.2: A tree of calls to chrompoly

Finally, if we call the ‘power of 2’ version of the FFT algorithm on the sequence {1, i, −i, 1} then FFT will proceed to manufacture the tree shown in Fig. 2.7.3.

Fig. 2.7.3: The recursive call tree for FFT

It must be emphasized that the creation of the tree of recursions is done by the compiler without any further effort on the part of the programmer. As long as we're here, how does a compiler go about making such a tree?

It does it by using an auxiliary stack. It adopts the philosophy that if it is asked to do two things at once, well, after all, it can't do that, so it does one of those two things and drops the other request on top of a stack of unfinished business. When it finishes executing the first request it goes to the top of the stack to find out what to do next.
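To make the stack discipline concrete, here is one possible Python rendering for the maxset1 recursion (our own illustrative representation, with vertices as a set and edges as pairs; this is not the formal algorithm the text asks for). Rather than storing partial results, each pending subproblem carries the number of vertices already committed to the independent set, so the combining rule max(n_1, 1 + n_2) unrolls into a maximum over the edgeless graphs reached at the leaves:

```python
def maxset_stack(vertices, edges):
    """Maximum independent set size computed with an explicit stack of
    unfinished subproblems; each entry is (vertices, edges, #chosen)."""
    best = 0
    stack = [(frozenset(vertices), frozenset(edges), 0)]
    while stack:
        verts, eds, chosen = stack.pop()
        if not eds:                        # trivial case: edgeless graph
            best = max(best, chosen + len(verts))
            continue
        v = next(iter(eds))[0]             # a vertex of positive degree
        nbhd = {w for e in eds for w in e if v in e and w != v}
        # drop G - {v} - Nbhd(v) (v joins the set) as unfinished business
        gone = nbhd | {v}
        stack.append((verts - gone,
                      frozenset(e for e in eds if not (set(e) & gone)),
                      chosen + 1))
        # and work on G - {v} (v stays out of the set) next
        stack.append((verts - {v},
                      frozenset(e for e in eds if v not in e),
                      chosen))
    return best
```

On the 5-cycle of the example below, this returns 2, the size of a maximum independent set in that graph.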

Example

Let’s follow the compiler through its tribulations as it attempts to deal with our request for maximum

independent set size that appears in Fig. 2.3.3. We begin by asking for the maxset1 of the 5-cycle. Our

program immediately makes two recursive calls to maxset1, on each of the two graphs that appear on the

second level of the tree in Fig. 2.3.3. The stack is initially empty.

The compiler says to itself ‘I can’t do these both at once’, and it puts the right-hand graph (involving

vertices 3,4) on the stack, and proceeds to call itself on the left hand graph (vertices 2,3,4,5).

When it tries to do that one, of course, two more graphs are generated, of which the right-hand one

(4,5) is dropped onto the stack, on top of the graph that previously lived there, so now two graphs are on

the stack, awaiting processing, and the compiler is dealing with the graph (3,4,5).

This time the graph of just one vertex (5) is dropped onto the stack, which now holds three graphs, as

the compiler works on (4,5).

Next, that graph is broken up into (5), and an empty graph, which is dutifully dropped onto the stack,

so the compiler can work on (5).

Finally, something fruitful happens: the graph (5) has no edges, so the program maxset1 gives, in its

trivial case, very speciﬁc instructions as to how to deal with this graph. We now know that the graph that

consists of just the single vertex (5) has a maxset1 values of 1.

The compiler next reaches for the graph on top of the stack, ﬁnds that it is the empty graph, which has

no edges at all, and therefore its maxset size is 0.

It now knows the n

1

= 1 and the n

2

= 0 values that appear in the algorithm maxset1, and therefore it

can execute the instruction maxset1 := max(n

1

, 1 + n

2

), from which it ﬁnds that the value of maxset1 for

the graph (4,5) is 1, and it continues from there, to dig itself out of the stack of unﬁnished business.

In general, if it is trying to execute maxset1 on a graph that has edges, it will drop the graph G − {v*} − Nbhd(v*) on the stack and try to do the graph G − {v*}.

The reader should try to write out, as a formal algorithm, the procedure that we have been describing,

whereby the compiler deals with a recursive computation that branches into two sub-computations until a

trivial case is reached.
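As a sketch of such a formal algorithm (an illustration of the idea, not code from the text; the graph representation and the function name are invented here), the following Python routine evaluates the maxset1 recursion with an explicit stack of pending subgraphs, exactly mimicking the compiler bookkeeping described above:

```python
def maxset_size(vertices, edges):
    """Evaluate the maxset1 recursion iteratively, keeping unfinished
    subproblems on an explicit stack, the way a compiler would."""
    edge_set = {frozenset(e) for e in edges}

    def pick_endpoint(vs):
        # return one endpoint v* of some edge of the subgraph, or None
        for u in vs:
            for w in vs:
                if u != w and frozenset((u, w)) in edge_set:
                    return u
        return None

    def nbhd(v, vs):
        return {w for w in vs if frozenset((v, w)) in edge_set}

    # Each frame is [subgraph, v*, results of completed recursive calls].
    stack = [[frozenset(vertices), None, []]]
    answer = 0
    while stack:
        vs, vstar, results = stack[-1]
        if vstar is None:
            vstar = pick_endpoint(vs)
            if vstar is None:               # trivial case: no edges left
                answer = len(vs)
                stack.pop()
                if stack:
                    stack[-1][2].append(answer)
                continue
            stack[-1][1] = vstar
            stack.append([vs - {vstar}, None, []])             # first: G - {v*}
        elif len(results) == 1:
            # second subproblem dropped onto the stack: G - {v*} - Nbhd(v*)
            stack.append([vs - {vstar} - nbhd(vstar, vs), None, []])
        else:
            n1, n2 = results
            answer = max(n1, 1 + n2)        # maxset1 := max(n1, 1 + n2)
            stack.pop()
            if stack:
                stack[-1][2].append(answer)
    return answer
```

On the 5-cycle of Fig. 2.3.3, `maxset_size([1, 2, 3, 4, 5], [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)])` returns 2, the maximum independent set size.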


Exercise for section 2.7

1. In Fig. 2.7.3, add to the picture the output that each of the recursive calls gives back to the box above it

that made the call.

Bibliography

A deﬁnitive account of all aspects of sorting is in

D. E. Knuth, The art of computer programming, Vol. 3: Sorting and searching, Addison Wesley, Reading

MA, 1973.

All three volumes of the above reference are highly recommended for the study of algorithms and discrete

mathematics.

An O(2^{n/3}) algorithm for the maximum independent set problem can be found in

R. E. Tarjan and A. Trojanowski, Finding a maximum independent set, SIAM J. Computing 6 (1977), 537-546.

Recent developments in fast matrix multiplication are traced in

Victor Pan, How to multiply matrices faster, Lecture notes in computer science No. 179, Springer-Verlag,

1984.

The realization that the Fourier transform calculation can be speeded up has been traced back to

C. Runge, Zeits. Math. Phys., 48 (1903) p. 443.

and also appears in

C. Runge and H. König, Die Grundlehren der math. Wissensch., 11, Springer-Verlag, Berlin 1924.

The introduction of the method in modern algorithmic terms is generally credited to

J. M. Cooley and J. W. Tukey, An algorithm for the machine calculation of complex Fourier series, Mathe-

matics of Computation, 19 (1965), 297-301.

A number of statistical applications of the method are in

J. M. Cooley, P. A. W. Lewis and P. D. Welch, The Fast Fourier Transform and its application to time series

analysis, in Statistical Methods for Digital Computers, Enslein, Ralston and Wilf eds., John Wiley & Sons,

New York, 1977, 377-423.

The use of the FFT for high precision integer arithmetic is due to

A. Schönhage and V. Strassen, Schnelle Multiplikation grosser Zahlen, Computing, 7 (1971), 281-292.

An excellent account of the above as well as of applications of the FFT to polynomial arithmetic is by

A. V. Aho, J. E. Hopcroft and J. D. Ullman, The design and analysis of computer algorithms, Addison

Wesley, Reading, MA, 1974 (chap. 7).


Chapter 3: The Network Flow Problem

3.1 Introduction

The network ﬂow problem is an example of a beautiful theoretical subject that has many important

applications. It also has generated algorithmic questions that have been in a state of extremely rapid

development in the past 20 years. Altogether, the fastest algorithms that are now known for the problem

are much faster, and some are much simpler, than the ones that were in use a short time ago, but it is still

unclear how close to the ‘ultimate’ algorithm we are.

Deﬁnition. A network is an edge-capacitated directed graph, with two distinguished vertices called the

source and the sink.

To repeat that, this time a little more slowly, suppose ﬁrst that we are given a directed graph (digraph)

G. That is, we are given a set of vertices, and a set of ordered pairs of these vertices, these pairs being

the edges of the digraph. It is perfectly OK to have an edge from u to v, an edge from v to u, both, or neither, for all u ≠ v. No edge (u, u) is permitted. If an edge e is directed from vertex v to vertex

w, then v is the initial vertex of e and w is the terminal vertex of e. We may then write v = Init(e) and

w = Term(e).

Next, in a network there is associated with each directed edge e of the digraph a positive real number

called its capacity, and denoted by cap(e).

Finally, two of the vertices of the digraph are distinguished. One, s, is the source, and the other, t, is

the sink of the network.

We will let X denote the resulting network. It consists of the digraph G, the given set of edge capacities,

the source, and the sink. A network is shown in Fig. 3.1.1.

Fig. 3.1.1: A network

Now roughly speaking, we can think of the edges of G as conduits for a ﬂuid, the capacity of each edge

being the carrying-capacity of the edge for that ﬂuid. Imagine that the ﬂuid ﬂows in the network from the

source to the sink, in such a way that the amount of ﬂuid in each edge does not exceed the capacity of that

edge.

We want to know the maximum net quantity of ﬂuid that could be ﬂowing from source to sink.

That was a rough description of the problem; here it is more precisely.

Deﬁnition. A ﬂow in a network X is a function f that assigns to each edge e of the network a real number

f(e), in such a way that

(1) For each edge e we have 0 ≤ f(e) ≤ cap(e) and

(2) For each vertex v other than the source and the sink, it is true that

    ∑_{Init(e)=v} f(e)  =  ∑_{Term(e)=v} f(e).        (3.1.1)


The condition (3.1.1) is a ﬂow conservation condition. It states that the outﬂow from v (the left side

of (3.1.1)) is equal to the inﬂow to v (the right side) for all vertices v other than s and t. In the theory of

electrical networks such conservation conditions are known as Kirchhoﬀ’s laws. Flow cannot be manufactured

anywhere in the network except at s or t. At other vertices, only redistribution or rerouting takes place.

Since the source and the sink are exempt from the conservation conditions there may, and usually will,

be a nonzero net ﬂow out of the source, and a nonzero net ﬂow into the sink. Intuitively it must already be

clear that these two are equal, and we will prove it below, in section 3.4. If we let Q be the net outﬂow from

the source, then Q is also the net inﬂow to the sink.

The quantity Q is called the value of the ﬂow.
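Conditions (1) and (2) and the value Q are easy to check mechanically. Here is a small Python sketch (the representation of a network as a dict mapping directed edges (u, v) to capacities is our own, not the text's):

```python
def flow_value(cap, f, source, sink):
    """Verify that f is a flow in the network -- capacity condition (1)
    and conservation condition (3.1.1) -- and return its value Q."""
    verts = {v for e in cap for v in e}

    def net_outflow(v):
        out = sum(f[e] for e in cap if e[0] == v)   # sum over Init(e) = v
        inn = sum(f[e] for e in cap if e[1] == v)   # sum over Term(e) = v
        return out - inn

    for e in cap:
        assert 0 <= f[e] <= cap[e], "capacity violated on %s" % (e,)
    for v in verts - {source, sink}:
        assert net_outflow(v) == 0, "conservation violated at %s" % (v,)
    # the net outflow from the source equals the net inflow to the sink
    assert net_outflow(source) == -net_outflow(sink)
    return net_outflow(source)
```

For instance, on the two-edge network s → a → t with 3 units flowing on each edge, `flow_value` returns Q = 3.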

In Fig. 3.1.2 there is shown a ﬂow in the network of Fig. 3.1.1. The amounts of ﬂow in each edge are

shown in the square boxes. The other number on each edge is its capacity. The letter inside the small circle

next to each vertex is the name of that vertex, for the purposes of the present discussion. The value of the

ﬂow in Fig. 3.1.2 is Q = 32.

Fig. 3.1.2: A ﬂow in a network

The network ﬂow problem, the main subject of this chapter, is: given a network X, ﬁnd the maximum

possible value of a ﬂow in X, and ﬁnd a ﬂow of that value.

3.2 Algorithms for the network ﬂow problem

The ﬁrst algorithm for the network ﬂow problem was given by Ford and Fulkerson. They used that

algorithm not only to solve instances of the problem, but also to prove theorems about network ﬂow, a

particularly happy combination. In particular, they used their algorithm to prove the ‘max-ﬂow-min-cut’

theorem, which we state below as theorem 3.4.1, and which occupies a central position in the theory.

The speed of their algorithm, it turns out, depends on the edge capacities in the network as well as on

the numbers V of vertices, and E of edges, of the network. Indeed, for certain (irrational) values of edge

capacities they found that their algorithm might not converge at all (see section 3.5).

In 1969 Edmonds and Karp gave the ﬁrst algorithm for the problem whose speed is bounded by a

polynomial function of E and V only. In fact that algorithm runs in time O(E²V). Since then there has been a steady procession of improvements in the algorithms, culminating, at the time of this writing anyway, with an O(EV log V) algorithm. The chronology is shown in Table 3.2.1.

The maximum number of edges that a network of V vertices can have is Θ(V²). A family of networks might be called dense if there is a K > 0 such that |E(X)| > K|V(X)|² for all networks in the family. The reader should check that for dense networks, all of the time complexities in Table 3.2.1, beginning with Karzanov's algorithm, are in the neighborhood of O(V³). On the other hand, for sparse networks (networks with relatively few edges), the later algorithms in the table will give significantly better performances than the earlier ones.


Author(s)               Year    Complexity
Ford, Fulkerson         1956    -----
Edmonds, Karp           1969    O(E²V)
Dinic                   1970    O(EV²)
Karzanov                1973    O(V³)
Cherkassky              1976    O(√E·V²)
Malhotra, et al.        1978    O(V³)
Galil                   1978    O(V^{5/3}E^{2/3})
Galil and Naamad        1979    O(EV log²V)
Sleator and Tarjan      1980    O(EV log V)
Goldberg and Tarjan     1985    O(EV log(V²/E))

Table 3.2.1: Progress in network flow algorithms

Exercise 3.2.1. Given K > 0. Consider the family of all possible networks X for which |E(X)| = K|V(X)|.

In this family, evaluate all of the complexity bounds in Table 3.2.1 and ﬁnd the fastest algorithm for the

family.

Among the algorithms in Table 3.2.1 we will discuss just two in detail. The ﬁrst will be the original

algorithm of Ford and Fulkerson, because of its importance and its simplicity, if not for its speed.

The second will be the 1978 algorithm of Malhotra, Pramodh-Kumar and Maheshwari (MPM), for three reasons: it uses the idea, introduced by Dinic in 1970 and common to all later algorithms, of layered networks; it is fast; and it is extremely simple and elegant in its conception, so it represents a good choice for those who may wish to program one of these algorithms for themselves.

3.3 The algorithm of Ford and Fulkerson

The basic idea of the Ford-Fulkerson algorithm for the network ﬂow problem is this: start with some

ﬂow function (initially this might consist of zero ﬂow on every edge). Then look for a ﬂow augmenting path

in the network. A ﬂow augmenting path is a path from the source to the sink along which we can push some

additional ﬂow.

In Fig. 3.3.1 below we show a flow augmenting path for the network of Fig. 3.1.2. The capacities of the

edges are shown on each edge, and the values of the ﬂow function are shown in the boxes on the edges.

Fig. 3.3.1: A ﬂow augmenting path

Fig. 3.3.2: The path above, after augmentation.

An edge can get elected to a ﬂow augmenting path for two possible reasons. Either


(a) the direction of the edge is coherent with the direction of the path from source to sink and the present

value of the ﬂow function on the edge is below the capacity of that edge, or

(b) the direction of the edge is opposed to that of the path from source to sink and the present value of the

ﬂow function on the edge is strictly positive.

Indeed, on all edges of a ﬂow augmenting path that are coherently oriented with the path we can increase

the ﬂow along the edge, and on all edges that are incoherently oriented with the path we can decrease the

ﬂow on the edge, and in either case we will have increased the value of the ﬂow (think about that one until

it makes sense).

It is, of course, necessary to maintain the conservation of ﬂow, i.e., to respect Kirchhoﬀ’s laws. To do

this we will augment the ﬂow on every edge of an augmenting path by the same amount. If the conservation

conditions were satisﬁed before the augmentation then they will still be satisﬁed after such an augmentation.

It may be helpful to remark that an edge is coherently or incoherently oriented only with respect to a

given path from source to sink. That is, the coherence, or lack of it, is not only a property of the directed

edge, but depends on how the edge sits inside a chosen path.

Thus, in Fig. 3.3.1 the ﬁrst edge is directed towards the source, i.e., incoherently with the path. Hence

if we can decrease the ﬂow in that edge we will have increased the value of the ﬂow function, namely the net

ﬂow out of the source. That particular edge can indeed have its ﬂow decreased, by at most 8 units. The

next edge carries 10 units of ﬂow towards the source. Therefore if we decrease the ﬂow on that edge, by up

to 10 units, we will also have increased the value of the ﬂow function. Finally, the edge into the sink carries

12 units of ﬂow and is oriented towards the sink. Hence if we increase the ﬂow in this edge, by at most 3

units since its capacity is 15, we will have increased the value of the ﬂow in the network.

Since every edge in the path that is shown in Fig. 3.3.1 can have its ﬂow altered in one way or the

other so as to increase the ﬂow in the network, the path is indeed a ﬂow augmenting path. The most that

we might accomplish with this path would be to push 3 more units of ﬂow through it from source to sink.

We couldn’t push more than 3 units through because one of the edges (the edge into the sink) will tolerate

an augmentation of only 3 ﬂow units before reaching its capacity.

To augment the ﬂow by 3 units we would diminish the ﬂow by 3 units on each of the ﬁrst two edges and

increase it by 3 units on the last edge. The resulting ﬂow in this path is shown in Fig. 3.3.2. The ﬂow in

the full network, after this augmentation, is shown in Fig. 3.3.3. Note carefully that if these augmentations

are made then ﬂow conservation at each vertex of the network will still hold (check this!).

Fig. 3.3.3: The network, after augmentation of ﬂow

After augmenting the ﬂow by 3 units as we have just described, the resulting ﬂow will be the one that

is shown in Fig. 3.3.3. The value of the ﬂow in Fig. 3.1.2 was 32 units. After the augmentation, the ﬂow

function in Fig. 3.3.3 has a value of 35 units.

We have just described the main idea of the Ford-Fulkerson algorithm. It ﬁrst ﬁnds a ﬂow augmenting

path. Then it augments the ﬂow along that path as much as it can. Then it ﬁnds another ﬂow augmenting


path, etc. etc. The algorithm terminates when no ﬂow augmenting paths exist. We will prove that when

that happens, the ﬂow will be at the maximum possible value, i.e., we will have found the solution of the

network ﬂow problem.

We will now describe the steps of the algorithm in more detail.

Deﬁnition. Let f be a ﬂow function in a network X. We say that an edge e of X is usable from v to w if

either e is directed from v to w and the ﬂow in e is less than the capacity of the edge, or e is directed from

w to v and the ﬂow in e is > 0.

Now, given a network and a ﬂow in that network, how do we ﬁnd a ﬂow augmenting path from the

source to the sink? This is done by a process of labelling and scanning the vertices of the network, beginning

with the source and proceeding out to the sink. Initially all vertices are in the conditions ‘unlabeled’ and

‘unscanned.’ As the algorithm proceeds, various vertices will become labeled, and if a vertex is labeled, it

may become scanned. To scan a vertex v means, roughly, that we stand at v and look around at all neighbors

w of v that haven’t yet been labeled. If e is some edge that joins v with a neighbor w, and if the edge e

is usable from v to w as deﬁned above, then we will label w, because any ﬂow augmenting path that has

already reached from the source to v can be extended another step, to w.

The label that every vertex v gets is a triple (u, ±, z), and here is what the three items mean.

The ‘u’ part of the label of v is the name of the vertex that was being scanned when v was labeled.

The ‘±’ will be ‘+’ if v was labeled because the edge (u, v) was usable from u to v (i.e., if the ﬂow from

u to v was less than the capacity of (u, v)) and it will be ‘−’ if v was labeled because the edge (v, u) was

usable from u to v (i.e., if the ﬂow from v to u was > 0).

Finally, the ‘z’ component of the label represents the largest amount of ﬂow that can be pushed from

the source to the present vertex v along any augmenting path that has so far been found. At each step the

algorithm will replace the current value of z by the amount of new flow that could be pushed through to v

along the edge that is now being examined, if that amount is smaller than z.

So much for the meanings of the various labels. As the algorithm proceeds, the labels that get attached

to the diﬀerent vertices form a record of how much ﬂow can be pushed through the network from the source

to the various vertices, and by exactly which routes.

To begin with, the algorithm labels the source with (−∞, +, ∞). The source now has the label-status

labeled and the scan-status unscanned.

Next we will scan the source. Here is the procedure for scanning any vertex u.

procedure scan(u: vertex; X: network; f: flow);
for every 'unlabeled' vertex v that is connected
to u by an edge in either or both directions, do
if the flow in (u, v) is less than cap(u, v)
then
label v with (u, +, min{z(u), cap(u, v) − flow(u, v)})
else if the flow in (v, u) is > 0
then
label v with (u, −, min{z(u), flow(v, u)}) and
change the label-status of v to 'labeled';
change the scan-status of u to 'scanned'
end. {scan}

We can use the above procedure to describe the complete scanning and labelling of the vertices of the

network, as follows.


procedure labelandscan(X: network; f: flow; whyhalt: reason);
give every vertex the scan-status 'unscanned'
and the label-status 'unlabeled';
u := source;
label source with (−∞, +, ∞);
label-status of source := 'labeled';
while {there is a 'labeled' and 'unscanned' vertex v
and sink is 'unlabeled'}
do scan(v, X, f);
if sink is unlabeled
then whyhalt := 'flow is maximum'
else whyhalt := 'it's time to augment'
end. {labelandscan}

Obviously the labelling and scanning process will halt for one of two reasons: either the sink t acquires

a label, or the sink never gets labeled but no more labels can be given. In the ﬁrst case we will see that a

ﬂow augmenting path from source to sink has been found, and in the second case we will prove that the ﬂow

is at its maximum possible value, so the network ﬂow problem has been solved.

Suppose the sink does get a label, for instance the label (u, ±, z). Then we claim that the value of the

ﬂow in the network can be augmented by z units.

To prove this we will construct a ﬂow augmenting path, using the labels on the vertices, and then we

will change the ﬂow by z units on every edge of that path in such a way as to increase the value of the ﬂow

function by z units. This is done as follows.

If the sign part of the label of t is ‘+,’ then increase the ﬂow function by z units on the edge (u, t), else

decrease the ﬂow on edge (t, u) by z units.

Then move back one step away from the sink, to vertex u, and look at its label, which might be (w, ±, z₁). If the sign is '+' then increase the flow on edge (w, u) by z units (not by z₁ units!), while if the sign is '−' then decrease the flow on edge (u, w) by z units. Next replace u by w, etc., until the source s has been reached.

reached.

A little more formally, the ﬂow augmentation algorithm is the following.

procedure augmentflow(X: network; f: flow; amount: real);
{assumes that labelandscan has just been done}
v := sink;
amount := the 'z' part of the label of sink;
repeat
(previous, sign, z) := label(v);
if sign = '+'
then
increase f(previous, v) by amount
else
decrease f(v, previous) by amount;
v := previous
until v = source
end. {augmentflow}

The value of the ﬂow in the network has now been increased by z units. The whole process of labelling

and scanning is now repeated, to search for another ﬂow augmenting path. The algorithm halts only when

we are unable to label the sink. The complete Ford-Fulkerson algorithm is shown below.


procedure fordfulkerson(X: network; f: flow; maxflowvalue: real);
{finds maximum flow in a given network X}
set f := 0 on every edge of X;
maxflowvalue := 0;
repeat
labelandscan(X, f, whyhalt);
if whyhalt = 'it's time to augment' then
augmentflow(X, f, amount);
maxflowvalue := maxflowvalue + amount
until whyhalt = 'flow is maximum'
end. {fordfulkerson}
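The three procedures combine into the complete algorithm. As a sketch (our own compact Python rendering, not the book's code: the network is a dict mapping directed edges (u, v) to capacities, and the 'labeled'/'unscanned' bookkeeping is handled with a queue), the whole labelling-scanning-augmenting cycle might look like this:

```python
from collections import deque

def ford_fulkerson(cap, source, sink):
    """Repeatedly label-and-scan, then augment along the labels,
    until the sink can no longer be labeled.  Returns the max flow
    value (terminates when all capacities are integers)."""
    f = {e: 0 for e in cap}
    verts = {v for e in cap for v in e}

    def label_and_scan():
        label = {source: (None, '+', float('inf'))}
        queue = deque([source])
        while queue and sink not in label:
            u = queue.popleft()                  # scan u
            for v in verts - set(label):
                z = label[u][2]
                if cap.get((u, v), 0) - f.get((u, v), 0) > 0:
                    label[v] = (u, '+', min(z, cap[(u, v)] - f[(u, v)]))
                    queue.append(v)
                elif f.get((v, u), 0) > 0:
                    label[v] = (u, '-', min(z, f[(v, u)]))
                    queue.append(v)
        return label

    while True:
        label = label_and_scan()
        if sink not in label:                    # whyhalt = 'flow is maximum'
            return (sum(f[e] for e in cap if e[0] == source)
                    - sum(f[e] for e in cap if e[1] == source))
        amount = label[sink][2]                  # the 'z' part of the sink label
        v = sink
        while v != source:                       # walk the labels back to s
            prev, sign, _ = label[v]
            if sign == '+':
                f[(prev, v)] += amount
            else:
                f[(v, prev)] -= amount
            v = prev
```

Because the labelling here happens to proceed breadth-first, this sketch is in fact closer to the Edmonds-Karp variant, which is guaranteed to terminate even with irrational capacities.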

Let’s look at what happens if we apply the labelling and scanning algorithm to the network and ﬂow

shown in Fig. 3.1.2. First vertex s gets the label (−∞, +, ∞). We then scan s. Vertex A gets the label

(s, −, 8), B cannot be labeled, and C gets labeled with (s, +, 10), which completes the scan of s.

Next we scan vertex A, during which D acquires the label (A, +, 8). Then C is scanned, which results

in E getting the label (C, −, 10). Finally, the scan of D results in the label (D, +, 3) for the sink t.

From the label of t we see that there is a ﬂow augmenting path in the network along which we can push

3 more units of ﬂow from s to t. We ﬁnd the path as in procedure augmentflow above, following the labels

backwards from t to D, A and s. The path in question will be seen to be exactly the one shown in Fig.

3.3.1, and further augmentation proceeds as we have discussed above.

3.4 The max-ﬂow min-cut theorem

Now we are going to look at the state of aﬀairs that holds when the ﬂow augmentation procedure

terminates because it has not been able to label the sink. We want to show that then the ﬂow will have a

maximum possible value.

Let W ⊂ V(X), and suppose that W contains the source and W does not contain the sink. Let W̄ denote all other vertices of X, i.e., W̄ = V(X) − W.

Definition. By the cut (W, W̄) we mean the set of all edges of X whose initial vertex is in W and whose terminal vertex is in W̄.

For example, one cut in a network consists of all edges whose initial vertex is the source.

Now, every unit of flow that leaves the source and arrives at the sink must at some moment flow from a vertex of W to a vertex of W̄, i.e., must flow along some edge of the cut (W, W̄). If we define the capacity of a cut to be the sum of the capacities of all edges in the cut, then it seems clear that the value of a flow can never exceed the capacity of any cut, and therefore that the maximum value of a flow cannot exceed the minimum capacity of any cut.
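On a small network this can be checked by brute force: enumerate every vertex set W with s ∈ W and t ∉ W and take the smallest cut capacity. A sketch (the edge-dict representation is our own):

```python
from itertools import combinations

def min_cut_capacity(cap, source, sink):
    """Minimum, over all cuts (W, W-bar), of the sum of the capacities
    of edges directed from W to W-bar.  Exponential in the number of
    vertices, so only suitable for checking small examples."""
    verts = {v for e in cap for v in e}
    middle = sorted(verts - {source, sink})
    best = float('inf')
    for k in range(len(middle) + 1):
        for chosen in combinations(middle, k):
            W = {source, *chosen}
            c = sum(c_e for (u, v), c_e in cap.items()
                    if u in W and v not in W)
            best = min(best, c)
    return best
```

For integer-capacity examples small enough to enumerate, this quantity agrees with the maximum flow value, which is exactly what Theorem 3.4.1 below asserts in general.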

The main result of this section is the ‘max-ﬂow min-cut’ theorem of Ford and Fulkerson, which we state

as

Theorem 3.4.1. The maximum possible value of any ﬂow in a network is equal to the minimum capacity

of any cut in that network.

Proof: We will ﬁrst do a little computation to show that the value of a ﬂow can never exceed the capacity of

a cut. Second, we will show that when the Ford-Fulkerson algorithm terminates because it has been unable

to label the sink, then at that moment there is a cut in the network whose edges are saturated with ﬂow,

i.e., such that the ﬂow in each edge of the cut is equal to the capacity of that edge.

Let U and V be two (not necessarily disjoint) sets of vertices of the network X, and let f be a flow function for X. By f(U, V) we mean the sum of the values of the flow function along all edges whose initial vertex lies in U and whose terminal vertex lies in V. Similarly, by cap(U, V) we mean the sum of the capacities of all of those edges. Finally, by the net flow out of U we mean f(U, Ū) − f(Ū, U).


Lemma 3.4.1. Let f be a flow of value Q in a network X, and let (W, W̄) be a cut in X. Then

    Q = f(W, W̄) − f(W̄, W) ≤ cap(W, W̄).        (3.4.1)

Proof of lemma: The net flow out of s is Q. The net flow out of any other vertex w ∈ W is 0. Hence, if V(X) denotes the vertex set of the network X, we obtain

    Q = ∑_{w∈W} {f(w, V(X)) − f(V(X), w)}
      = f(W, V(X)) − f(V(X), W)
      = f(W, W ∪ W̄) − f(W ∪ W̄, W)
      = f(W, W) + f(W, W̄) − f(W, W) − f(W̄, W)
      = f(W, W̄) − f(W̄, W).

This proves the '=' part of (3.4.1), and the '≤' part is obvious, completing the proof of lemma 3.4.1.

We now know that the maximum value of the ﬂow in a network cannot exceed the minimum of the

capacities of the cuts in the network.

To complete the proof of the theorem we will show that a ﬂow of maximum value, which surely exists,

must saturate the edges of some cut.

Hence, let f be a flow in X of maximum value, and call procedure labelandscan(X, f, whyhalt). Let W be the set of vertices of X that have been labeled when the algorithm terminates. Clearly s ∈ W. Equally clearly, t ∉ W, for suppose the contrary. Then we would have termination with 'whyhalt' = 'it's time to augment,' and if we were then to call procedure augmentflow we would find a flow of higher value, contradicting the assumed maximality of f.

Since s ∈ W and t ∉ W, the set W defines a cut (W, W̄).

We claim that every edge of the cut (W, W̄) is saturated. Indeed, if (x, y) is in the cut, x ∈ W, y ∈ W̄, then edge (x, y) is saturated, else y would have been labeled when we were scanning x and we would have y ∈ W, a contradiction. Similarly, if (y, x) is an edge where y ∈ W̄ and x ∈ W, then the flow f(y, x) = 0, else again y would have been labeled when we were scanning x, another contradiction.

Therefore, every edge from W to W̄ is carrying as much flow as its capacity permits, and every edge from W̄ to W is carrying no flow at all. Hence the sign of equality holds in (3.4.1), the value of the flow is equal to the capacity of the cut (W, W̄), and the proof of theorem 3.4.1 is finished.

3.5 The complexity of the Ford-Fulkerson algorithm

The algorithm of Ford and Fulkerson terminates if and when it arrives at a stage where the sink is not

labeled but no more vertices can be labeled. If at that time we let W be the set of vertices that have been

labeled, then we have seen that (W, W) is a minimum cut of the network, and the present value of the ﬂow

is the desired maximum for the network.

The question now is, how long does it take to arrive at that stage, and indeed, is it guaranteed that we

will ever get there? We are asking if the algorithm is ﬁnite, surely the most primitive complexity question

imaginable.

First consider the case where every edge of the given network X has integer capacity. Then during the

labelling and ﬂow augmentation algorithms, various additions and subtractions are done, but there is no

way that any nonintegral ﬂows can be produced.

It follows that the augmented flow is still integral. The value of the flow therefore increases by an integer amount during each augmentation. On the other hand if, say, C* denotes the combined capacity of all edges that are outbound from the source, then it is eminently clear that the value of the flow can never exceed C*. Since the value of the flow increases by at least 1 unit per augmentation, we see that no more than C* flow augmentations will be needed before a maximum flow is reached. This yields


Theorem 3.5.1. In a network with integer capacities on all edges, the Ford-Fulkerson algorithm terminates

after a ﬁnite number of steps with a ﬂow of maximum value.

This is good news and bad news. The good news is that the algorithm is finite. The bad news is that the complexity estimate that we have proved depends not only on the numbers of edges and vertices in X, but on the edge capacities. If the bound C* represents the true behavior of the algorithm, rather than some weakness in our analysis of the algorithm, then even on very small networks it will be possible to assign edge capacities so that the algorithm takes a very long time to run.

And it is possible to do that.

We will show below an example due to Ford and Fulkerson in which the situation is even worse than

the one envisaged above: not only will the algorithm take a very long time to run; it won’t converge at all!

Consider the network X that is shown in Fig. 3.5.1. It has 10 vertices s, t, x₁, …, x₄, y₁, …, y₄. There are directed edges (xᵢ, xⱼ) ∀i ≠ j, (xᵢ, yⱼ) ∀i, j, (yᵢ, yⱼ) ∀i ≠ j, (yᵢ, xⱼ) ∀i, j, (s, xᵢ) ∀i, and (yⱼ, t) ∀j.

Fig. 3.5.1: How to give the algorithm a hard time

In this network, the four edges Aᵢ = (xᵢ, yᵢ) (i = 1, …, 4) will be called the special edges.

Next we will give the capacities of the edges of X. Write r = (−1 + √5)/2, and let

    S = (3 + √5)/2 = ∑_{n=0}^{∞} rⁿ.

Then to every edge of X except the four special edges we assign the capacity S. The special edges A₁, A₂, A₃, A₄ are given capacities 1, r, r², r², respectively (you can see that this is going to be interesting).

Suppose, for our first augmentation step, we find the flow augmenting path s → x₁ → y₁ → t, and that we augment the flow by 1 unit along that path. The four special edges will then have residual capacities (excesses of capacity over flow) of 0, r, r², r², respectively.

Inductively, suppose we have arrived at a stage of the algorithm where the four special edges, taken in some rearrangement A₁, A₂, A₃, A₄, have residual capacities 0, rⁿ, rⁿ⁺¹, rⁿ⁺¹. We will now show that the algorithm might next do two flow augmentation steps the net result of which would be that the inductive state of affairs would again hold, with n replaced by n + 1.

Indeed, choose the flow augmenting path

    s → x₂ → y₂ → x₃ → y₃ → t.


The only special edges that are on this path are A₂ and A₃. Augment the flow along this path by rⁿ⁺¹ units (the maximum possible amount).

Next, choose the flow augmenting path

    s → x₂ → y₂ → y₁ → x₁ → y₃ → x₃ → y₄ → t.

Notice that with respect to this path the special edges A₁ and A₃ are incoherently directed. Augment the flow along this path by rⁿ⁺² units, once more the largest possible amount.

The reader may now verify that the residual capacities of the four special edges are rⁿ⁺², 0, rⁿ⁺², rⁿ⁺¹. In the course of doing this verification it will be handy to use the fact that

    rⁿ⁺² = rⁿ − rⁿ⁺¹        (∀n ≥ 0).
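Since r = (−1 + √5)/2 is the positive root of x² + x − 1 = 0, the identity rⁿ⁺² = rⁿ − rⁿ⁺¹ and the closed form for S are easy to confirm numerically:

```python
import math

r = (math.sqrt(5) - 1) / 2           # the positive root of x^2 + x - 1 = 0
S = (3 + math.sqrt(5)) / 2

# r^{n+2} = r^n - r^{n+1} for all n >= 0 (multiply r^2 = 1 - r by r^n)
for n in range(20):
    assert abs(r ** (n + 2) - (r ** n - r ** (n + 1))) < 1e-12

# S = sum_{n>=0} r^n = 1/(1 - r), the geometric series
assert abs(S - 1 / (1 - r)) < 1e-12
```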

These two augmentation steps together have increased the flow value by rⁿ⁺¹ + rⁿ⁺² = rⁿ units. Hence the flow in an edge will never exceed ∑_{n=0}^{∞} rⁿ = S units.

The algorithm converges to a ﬂow of value S. Now comes the bad news: the maximum ﬂow in this

network has the value 4S (ﬁnd it!).

Hence, for this network

(a) the algorithm does not halt after ﬁnitely many steps even though the edge capacities are ﬁnite and

(b) the sequence of ﬂow values converges to a number that is not the maximum ﬂow in the network.

The irrational capacities on the edges may at first make this example seem 'cooked up.' But the implication is that even with a network whose edge capacities are all integers, the algorithm might take a very long time to run.

Motivated by the importance and beauty of the theory of network ﬂows, and by the unsatisfactory time

complexity of the original algorithm, many researchers have attacked the question of ﬁnding an algorithm

whose success is guaranteed within a time bound that is independent of the edge capacities, and depends

only on the size of the network.

We turn now to the consideration of one of the main ideas on which further progress has depended, that

of layering a network with respect to a ﬂow function. This idea has triggered a whole series of improved

algorithms. Following the discussion of layering we will give a description of one of the algorithms, the MPM

algorithm, that uses layered networks and guarantees fast operation.

3.6 Layered networks

Layering a network is a technique that has the eﬀect of replacing a single max-ﬂow problem by several

problems, each a good deal easier than the original. More precisely, in a network with V vertices we will ﬁnd

that we can solve a max-flow problem by solving at most V slightly different problems, each on a layered network. We will then discuss an O(V²) method for solving each such problem on a layered network, and the result will be an O(V³) algorithm for the original network flow problem.

Now we will discuss how to layer a network with respect to a given ﬂow function. The purpose of the

italics is to emphasize the fact that one does not just ‘layer a network.’ Instead, there is given a network X

and a ﬂow function f for that network, and together they induce a layered network Y = Y(X, f), as follows.

First let us say that an edge e of X is helpful from u to v if either e is directed from u to v and f(e) is

below capacity or e is directed from v to u and the ﬂow f(e) is positive.

Next we will describe the layered network Y. Recall that in order to describe a network one must

describe the vertices of the network, the directed edges, give the capacities of those edges, and designate the

source and the sink. The network Y will be constructed one layer at a time from the vertices of X, using

the ﬂow f as a guide. For each layer, we will say which vertices of X go into that layer, then we will say

which vertices of the previous layer are connected to each vertex of the new layer. All of these edges will be

directed from the earlier layer to the later one. Finally we will give the capacities of each of these new edges.

The 0th layer of Y consists only of the source s. The vertices that comprise layer 1 of Y will be every vertex v of X such that in X there is a helpful edge from s to v. We then draw an edge in Y directed from s to v for each such vertex v. We assign to that edge in Y a capacity cap(s, v) − f(s, v) + f(v, s).


The set of all such v will be called layer 1 of Y. Next we construct layer 2 of Y. The vertex set of layer

2 consists of all vertices w that do not yet belong to any layer, and such that there is a helpful edge in X

from some vertex v of layer 1 to w.

Next we draw the edges from layer 1 to layer 2: for each vertex v in layer 1 we draw a single edge in Y

directed from v to every vertex w in layer 2 for which there is a helpful edge in X from v to w.

Note that the edge always goes from v to w regardless of the direction of the helpful edge in X. Note

also that in contrast to the Ford-Fulkerson algorithm, even after an edge has been drawn from v to w in Y,

additional edges may be drawn to the same w from other vertices v′, v′′ in layer 1.

Assign capacities to the edges from layer 1 to layer 2 in the same way as described above, that is, the

capacity in Y of the edge from v to w is cap(v, w) −f(v, w) +f(w, v). This latter quantity is, of course, the

total residual (unused) ﬂow-carrying capacity of the edges in both directions between v and w.

The layering continues until we reach a layer L such that there is a helpful edge from some vertex of

layer L to the sink t, or else until no additional layers can be created (to say that no more layers can be

created is to say that among the vertices that haven’t yet been included in the layered network that we are

building, there aren’t any that are adjacent to a vertex that is in the layered network, by a helpful edge).

In the former case, we then create a layer L+1 that consists solely of the sink t, we connect t by edges

directed from the appropriate vertices of layer L, assign capacities to those edges, and the layering process

is complete. Observe that not all vertices of X need appear in Y.

In the latter case, where no additional layers can be created but the sink hasn't been reached, the present flow function f in X is maximum, and the network flow problem in X has been solved.

Here is a formal statement of the procedure for layering a given network X with respect to a given ﬂow

function f in X. Input are the network X and the present ﬂow function f in that network. Output are the

layered network Y, and a logical variable maxflow that will be True, on output, if the ﬂow is at a maximum

value, False otherwise.

procedure layer(X, f, Y, maxflow);
{forms the layered network Y with respect to the flow f in X}
{maxflow will be 'True' if the input flow f already has the
maximum possible value for the network, else it will be 'False'}
L := 0; layer(L) := {source}; maxflow := false;
repeat
    layer(L + 1) := ∅;
    for each vertex u in layer(L) do
        for each vertex v such that {layer(v) = L + 1 or v is
                not in any layer} do
            q := cap(u, v) − f(u, v) + f(v, u);
            if q > 0 then do
                draw edge u → v in Y;
                assign capacity q to that edge;
                assign vertex v to layer(L + 1);
    L := L + 1;
    if layer(L) is empty then exit with maxflow := true;
until sink is in layer(L);
delete from layer(L) of Y all vertices other than sink,
    and remove their incident edges from Y
end. {layer}
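In a conventional language the layering procedure is essentially a breadth-first search driven by residual capacities. Here is one possible sketch in Python; the representation of X (dicts `cap` and `flow` keyed by ordered vertex pairs) is an assumption made for illustration, not part of the book's pseudocode.

```python
def layer_network(vertices, cap, flow, s, t):
    """Sketch of the layering procedure: build Y = Y(X, f).

    cap and flow are dicts keyed by (u, v); missing keys mean 0.
    Returns (layers, edges, maxflow) where edges maps (u, v) to the
    residual capacity q = cap(u,v) - f(u,v) + f(v,u), and maxflow is
    True when the sink cannot be reached, i.e. f is already maximum.
    """
    c = lambda u, v: cap.get((u, v), 0)
    f = lambda u, v: flow.get((u, v), 0)
    layers, placed, edges = [{s}], {s}, {}
    while True:
        nxt = set()
        for u in layers[-1]:
            for v in vertices:
                if v in placed:                    # already in an earlier layer
                    continue
                q = c(u, v) - f(u, v) + f(v, u)    # 'helpful' in either direction
                if q > 0:
                    edges[u, v] = q
                    nxt.add(v)
        if not nxt:
            return layers, edges, True             # no more layers: f is maximum
        layers.append(nxt)
        placed |= nxt
        if t in nxt:
            break
    extra = layers[-1] - {t}                       # last layer keeps only the sink
    layers[-1] = {t}
    edges = {e: q for e, q in edges.items() if e[1] not in extra}
    return layers, edges, False
```

For the two-edge network s → a → t with capacities 2 and 1 and zero flow, this produces the layers {s}, {a}, {t}, with residual capacities 2 and 1 on the two edges of Y.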

In Fig. 3.6.1 we show the typical appearance of a layered network. In contrast to a general network, in

a layered network every path from the source to some ﬁxed vertex v has the same number of edges in it (the

number of the layer of v), and all edges on such a path are directed the same way, from the source towards


Chapter 3: The Network Flow Problem

Fig. 3.6.1: A general layered network

v. These properties of layered networks are very friendly indeed, and make them much easier to deal with

than general networks.

In Fig. 3.6.2 we show speciﬁcally the layered network that results from the network of Fig. 3.1.2 with

the ﬂow shown therein.

Fig. 3.6.2: A layering of the network in Fig. 3.1.2

The next question is this: exactly what problem would we like to solve on the layered network Y, and

what is the relationship of that problem to the original network ﬂow problem in the original network X?

The answer is that in the layered network Y we are looking for a blocking ﬂow g. By a blocking ﬂow we

mean a ﬂow function g in Y such that every path from source to sink in Y has at least one saturated edge.

This immediately raises two questions: (a) what can we do if we ﬁnd a blocking ﬂow in Y? (b) how can

we ﬁnd a blocking ﬂow in Y? The remainder of this section will be devoted to answering (a). In the next

section we will give an elegant answer to (b).

Suppose that we have somehow found a blocking ﬂow function, g, in Y. What we do with it is that we

use it to augment the ﬂow function f in X, as follows.

procedure augment(f, X; g, Y);
{augment flow f in X by using a blocking flow g
in the corresponding layered network Y}
for every edge e : u → v of the layered network Y do
    increase the flow f in the edge u → v of the
        network X by the amount
        min{g(e), cap(u → v) − f(u → v)};
    if not all of g(e) has been used
        then decrease the flow in edge v → u by
        the unused portion of g(e)
end. {augment}
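The augmentation step translates directly. The sketch below (Python, with the same hypothetical dict representation of the flow and capacities of X as before) pushes each edge's share of the blocking flow forward while forward capacity remains, and cancels reverse flow with whatever is left over.

```python
def augment(flow, cap, g):
    """Augment the flow f in X by a blocking flow g in Y.

    flow and cap are dicts keyed by (u, v); g maps each edge u -> v of
    the layered network Y to the amount of blocking flow it carries.
    """
    for (u, v), amount in g.items():
        # push as much as fits into the forward edge u -> v of X ...
        forward = min(amount, cap.get((u, v), 0) - flow.get((u, v), 0))
        if forward > 0:
            flow[u, v] = flow.get((u, v), 0) + forward
        # ... and cancel the unused portion against the flow on v -> u
        leftover = amount - forward
        if leftover > 0:
            flow[v, u] = flow[v, u] - leftover
    return flow
```

The second branch is what makes a Y-edge built from a reverse helpful edge of X decrease, rather than increase, the flow there.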


After augmenting the ﬂow in the original network X, what then? We construct a new layered network,

from X and the newly augmented ﬂow function f on X.

The various activities that are now being described may sound like some kind of thinly disguised repackaging of the Ford-Fulkerson algorithm, but they aren't just that, because here is what can be proved to happen:

First, if we start with zero ﬂow in X, make the layered network Y, ﬁnd a blocking ﬂow in Y, augment

the ﬂow in X, make a new layered network Y, ﬁnd a blocking ﬂow, etc. etc., then after at most V phases

(‘phase’ = layer + block + augment) we will have found the maximum ﬂow in X and the process will halt.

Second, each phase can be done very rapidly. The MPM algorithm, to be discussed in section 3.7, finds a blocking flow in a layered network in time O(V²).

By the height of a layered network Y we will mean the number of edges in any path from source to sink.

The network of Fig. 3.6.1 has height 3. Let’s now show

Theorem 3.6.1. The heights of the layered networks that occur in the consecutive phases of the solution

of a network ﬂow problem form a strictly increasing sequence of positive integers. Hence, for a network X

with V vertices, there can be at most V phases before a maximum ﬂow is found.

Let Y(p) denote the layered network that is constructed at the pth phase of the computation and let H(p) denote the height of Y(p). We will first prove

Lemma 3.6.1. If

v_0 → v_1 → v_2 → ··· → v_m        (v_0 = source)

is a path in Y(p + 1), and if every vertex v_i (i = 1, . . . , m) of that path also appears in Y(p), then for every a = 0, . . . , m it is true that if vertex v_a was in layer b of Y(p) then a ≥ b.

Proof of lemma: The result is clearly true for a = 0. Suppose it is true for v_0, v_1, . . . , v_a, and suppose v_{a+1} was in layer c of network Y(p). We will show that a + 1 ≥ c. Indeed, if not, then c > a + 1. Since v_a, by induction, was in a layer ≤ a, it follows that the edge

e∗ : v_a → v_{a+1}

was not present in network Y(p), since its two endpoints were not in two consecutive layers. Hence the flow in X between v_a and v_{a+1} could not have been affected by the augmentation procedure of phase p. But edge e∗ is in Y(p + 1). Therefore it represented an edge of X that was helpful from v_a to v_{a+1} at the beginning of phase p + 1, was unaffected by phase p, but was not helpful at the beginning of phase p. This contradiction establishes the lemma.

Now we will prove the theorem. Let

s → v_1 → v_2 → ··· → v_{H(p+1)−1} → t

be a path from source to sink in Y(p + 1).

Consider first the case where every vertex of the path also lies in Y(p), and apply the lemma to v_m = t (m = H(p + 1)), a = m. We conclude at once that H(p + 1) ≥ H(p). Now we want to exclude the '=' sign. If H(p + 1) = H(p) then the entire path is in Y(p) and in Y(p + 1), and so all of the edges of X that the edges of the path represent were helpful both before and after the augmentation step of phase p, contradicting the fact that the blocking flow that was used for the augmentation saturated some edge of the chosen path. The theorem is now proved for the case where the path had all of its vertices in Y(p) also.

Now suppose that this was not the case. Let e∗ : v_a → v_{a+1} be the first edge of the path whose terminal vertex v_{a+1} was not in Y(p). Then the corresponding edge(s) of X was unaffected by the augmentation in phase p. It was helpful from v_a to v_{a+1} at the beginning of phase p + 1 because e∗ ∈ Y(p + 1), and it was unaffected by phase p, yet e∗ ∉ Y(p). The only possibility is that vertex v_{a+1} would have entered into Y(p) in the layer H(p) that contains the sink, but that layer is special, and contains only t. Hence, if v_a was in layer b of Y(p), then b + 1 = H(p). By the lemma once more, a ≥ b, so a + 1 ≥ b + 1 = H(p), and therefore H(p + 1) > H(p), completing the proof of theorem 3.6.1.


To summarize, if we want to find a maximum flow in a given network X by the method of layered networks, we carry out

procedure maxflow(X, Y, f);
set the flow function f to zero on all edges of X;
repeat
    (i) construct the layered network Y = Y(X, f)
        if possible, else exit with flow at maximum
        value;
    (ii) find a blocking flow g in Y;
    (iii) augment the flow f in X with the blocking
        flow g, by calling procedure augment above
until exit occurs in (i) above
end. {maxflow}
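Putting the pieces together gives a complete, runnable illustration of this phase structure. The sketch below is Dinic's scheme in Python: it layers the residual network by breadth-first search, then finds a blocking flow, here by repeated depth-first search rather than by the MPM algorithm of the next section, so a phase may cost more than O(V²); the data layout (paired residual arcs in flat arrays) is an implementation choice, not the book's.

```python
from collections import deque

def network_maxflow(n, edge_list, s, t):
    """Max flow on vertices 0..n-1; edge_list holds (u, v, capacity)."""
    to, cap, head = [], [], [[] for _ in range(n)]
    for u, v, c in edge_list:            # paired forward/residual arcs
        head[u].append(len(to)); to.append(v); cap.append(c)
        head[v].append(len(to)); to.append(u); cap.append(0)
    flow = 0
    while True:
        # one phase: layer the residual network by BFS from the source
        level = [-1] * n
        level[s] = 0
        q = deque([s])
        while q:
            u = q.popleft()
            for e in head[u]:
                if cap[e] > 0 and level[to[e]] < 0:
                    level[to[e]] = level[u] + 1
                    q.append(to[e])
        if level[t] < 0:
            return flow                  # sink unreachable: flow is maximum
        it = [0] * n                     # per-vertex edge pointer
        def dfs(u, pushed):
            """Push flow along layered paths until u's edges are spent."""
            if u == t:
                return pushed
            while it[u] < len(head[u]):
                e = head[u][it[u]]
                v = to[e]
                if cap[e] > 0 and level[v] == level[u] + 1:
                    got = dfs(v, min(pushed, cap[e]))
                    if got:
                        cap[e] -= got
                        cap[e ^ 1] += got    # paired arc is e XOR 1
                        return got
                it[u] += 1
            return 0
        while True:                      # accumulate a blocking flow
            pushed = dfs(s, float('inf'))
            if not pushed:
                break
            flow += pushed
```

Each pass of the outer loop is one 'phase' (layer + block + augment), and by theorem 3.6.1 there are at most V of them.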

According to theorem 3.6.1, the procedure will repeat steps (i), (ii), (iii) at most V times because the

height of the layered network increases each time around, and it certainly can never exceed V . The labor

involved in step (i) is certainly O(E), and so is the labor in step (iii). Hence if BFL denotes the labor involved

in some method for ﬁnding a blocking ﬂow in a layered network, then the whole network ﬂow problem can

be done in time O(V (E + BFL)).

The idea of layering networks is due to Dinic. Since his work was done, all eﬀorts have been directed at

the problem of reducing BFL as much as possible.

3.7 The MPM algorithm

Now we suppose that we are given a layered network Y and we want to ﬁnd a blocking ﬂow in Y. The

following ingenious suggestion is due to Malhotra, Pramodh-Kumar and Maheshwari.

Let v be some vertex of Y. The in-potential of v is the sum of the capacities of all edges directed into v, and the out-potential of v is the total capacity of all edges directed out from v. The potential of v is the smaller of these two.

(A) Find a vertex v of smallest potential, say P∗. Now we will push P∗ more units of flow from source to sink, as follows.

(B) (Pushout) Take the edges that are outbound from v in some order, and saturate each one with flow, unless and until saturating one more would lift the total flow used over P∗. Then assign all remaining flow to the next outbound edge (not necessarily saturating it), so the total outflow from v becomes exactly P∗.

(C) Follow the flow to the next higher layer of Y. That is, for each vertex v′ of the next layer, let h(v′) be the flow into v′. Now saturate all except possibly one outbound edge of v′, to pass through v′ the h(v′) units of flow. When all vertices v′ in that layer have been done, repeat for the next layer, etc. We never find a vertex with insufficient capacity, in or out, to handle the flow that is thrust upon it, because we began by choosing a vertex of minimum potential.

(D) (Pullback) When all layers 'above' v have been done, then follow the flow to the next layer 'below' v. For each vertex v′ of that layer, let h(v′) be the flow out of v′ to v. Then saturate all except possibly one incoming edge of v′, to pass through v′ the h(v′) units of flow. When all v′ in that layer have been done, proceed to the next layer below v, etc.

(E) (Update capacities) The ﬂow function that has just been created in the layered network must be

stored somewhere. A convenient way to keep it is to carry out the augmentation procedure back in the

network X at this time, thereby, in eﬀect ‘storing’ the contributions to the blocking ﬂow in Y in the ﬂow

array for X. This can be done concurrently with the MPM algorithm as follows: Every time we increase the

ﬂow in some edge u → v of Y we do it by augmenting the ﬂow from u to v in X, and then decreasing the

capacity of edge u → v in Y by the same amount. In that way the capacities of the edges in Y will always

be the updated residual capacities, and the ﬂow function f in X will always reﬂect the latest augmentation

of the ﬂow in Y.

(F) (Prune) We have now pushed the original P∗ units of flow through the whole layered network. We intend to repeat the operation on some other vertex v of minimum potential, but first we can prune off of the network some vertices and edges that are guaranteed never to be needed again.


The vertex v itself has either all incoming edges or all outgoing edges, or both, at zero residual capacities.

Hence no more flow will ever be pushed through v. Therefore we can delete v from the network Y together

with all of its incident edges, incoming or outgoing. Further, we can delete from Y all of the edges that were

saturated by the ﬂow pushing process just completed, i.e., all edges that now have zero residual capacity.

Next, we may now ﬁnd that some vertex w has had all of its incoming or all of its outgoing edges deleted.

That vertex will never be used again, so delete it and any other incident edges it may still have. Continue

the pruning process until only vertices remain that have nonzero potential. If the source and the sink are

still connected by some path, then repeat from (A) above.

Else the algorithm halts. The blocking ﬂow function g that we have just found is the following: if e is

an edge of the input layered network Y, then g(e) is the sum of all of the ﬂows that were pushed through

edge e at all stages of the above algorithm.
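Steps (A)-(F) lean on fast access to vertex potentials. The small Python sketch below (the dict-of-residual-capacities representation is again an illustrative assumption) shows how the potentials of step (A) could be computed and the minimum located; in the full algorithm these values would be maintained incrementally, as described below, rather than recomputed.

```python
def potentials(vertices, edges, s, t):
    """In-, out-, and overall potentials for a layered network.

    edges maps (u, v) to residual capacity.  The source has no incoming
    edges and the sink no outgoing ones, so their potentials are taken
    to be the out- and in-potential alone.
    """
    inp = {v: 0 for v in vertices}
    outp = {v: 0 for v in vertices}
    for (u, v), c in edges.items():
        outp[u] += c
        inp[v] += c
    return {v: outp[v] if v == s else inp[v] if v == t
            else min(inp[v], outp[v]) for v in vertices}

def min_potential_vertex(pot):
    """Step (A): a vertex through which the least flow can be forced."""
    return min(pot, key=pot.get)
```

In the layered network with edges s→a (3), s→b (2), a→t (1), b→t (5), vertex a has in-potential 3 and out-potential 1, so it is the minimum-potential vertex and P∗ = 1.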

It is obviously a blocking ﬂow: since no path between s and t remains, every path must have had at

least one of its edges saturated at some step of the algorithm. What is the complexity of this algorithm?

Certainly we delete at least one vertex from the network at every pruning stage, because the vertex v that

had minimum potential will surely have had either all of its incoming or all of its outgoing edges (or both)

saturated.

It follows that steps (A)–(E) can be executed at most V times before we halt with a blocking ﬂow.

The cost of saturating all edges that get saturated, since every edge has but one saturation to give to its network, is O(E). The number of partial edge-saturation operations is at most two per vertex visited. For each minimal-potential vertex v we visit at most V other vertices, and we use at most V minimal-potential vertices altogether. So the partial edge-saturation operations cost O(V²) and the total edge saturations cost O(E).

The operation of finding a vertex of minimum potential is 'free,' in the following sense. Initially we compute and store the in- and out-potentials of every vertex. Thereafter, each time the flow in some edge

is increased, the outpotential of its initial vertex and the inpotential of its terminal vertex are reduced by

the same amount. It follows that the cost of maintaining these arrays is linear in the number of vertices, V .

Hence it aﬀects only the constants implied by the ‘big oh’ symbols above, but not the orders of magnitude.

The total cost is therefore O(V²) for the complete MPM algorithm that finds a blocking flow in a layered network. Hence a maximum flow in a network can be found in O(V³) time, since at most V layered networks need to be looked at in order to find a maximum flow in the original network.

In contrast to the nasty example network of section 3.5, with its irrational edge capacities, that made the Ford-Fulkerson algorithm into an infinite process that converged to the wrong answer, the time bound O(V³) that we have just proved for the layered-network MPM algorithm is totally independent of the edge capacities.

3.8 Applications of network ﬂow

We conclude this chapter by mentioning some applications of the network ﬂow problem and algorithm.

Certainly, among these, one most often mentions ﬁrst the problem of maximum matching in a bipartite

graph. Consider a set of P people and a set of J jobs, such that not all of the people are capable of doing

all of the jobs.

We construct a graph of P + J vertices to represent this situation, as follows. Take P vertices to

represent the people, J vertices to represent the jobs, and connect vertex p to vertex j by an undirected edge

if person p can do job j. Such a graph is called bipartite. In general a graph G is bipartite if its vertices can

be partitioned into two classes in such a way that no edge runs between two vertices of the same class (see

section 1.6).

In Fig. 3.8.1 below we show a graph that might result from a certain group of 8 people and 9 jobs.

The maximum matching problem is just this: assuming that each person can handle at most one of

the jobs, and that each job needs only one person, assign people to the jobs in such a way that the largest

possible number of people are employed. In terms of the bipartite graph G, we want to ﬁnd a maximum

number of edges, no two incident with the same vertex.

To solve this problem by the method of network ﬂows we construct a network Y. First we adjoin two

new vertices s, t to the bipartite graph G. If we let P, J denote the two classes of vertices in the graph G,

then we draw an edge from s to each p ∈ P and an edge from each j ∈ J to t. Each edge in the network is


Fig. 3.8.1: Matching people to jobs

Fig. 3.8.2: The network for the matching problem

given capacity 1. The result for the graph of Fig. 3.8.1 is shown in Fig. 3.8.2.

Consider a maximum integer-valued ﬂow in this network, of value Q. Since each edge has capacity 1, Q

edges of the type (s, p) each contain a unit of ﬂow. Out of each vertex p that receives some of this ﬂow there

will come one unit of ﬂow (since inﬂow equals outﬂow at such vertices), which will then cross to a vertex j of

J. No such j will receive more than one unit because at most one unit can leave it for the sink t. Hence the

ﬂow deﬁnes a matching of Q edges of the graph G. Conversely, any matching in G deﬁnes a ﬂow, hence a

maximum ﬂow corresponds to a maximum matching. In Fig. 3.8.3 we show a maximum ﬂow in the network

of Fig. 3.8.2 and therefore a maximum matching in the graph of Fig. 3.8.1.

Fig. 3.8.3: A maximum ﬂow
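The construction just described is mechanical. Here is a sketch in Python that builds the list of unit-capacity edges for the matching network; the vertex labels 'source' and 'sink' and the pair-set representation of the bipartite graph are illustrative assumptions.

```python
def matching_network(people, jobs, can_do):
    """Unit-capacity network whose max flow is a maximum matching.

    can_do is a set of (p, j) pairs meaning person p can do job j.
    Returns a list of (u, v, capacity) triples, every capacity 1.
    """
    edges = [('source', p, 1) for p in people]
    edges += [(p, j, 1) for (p, j) in sorted(can_do)]
    edges += [(j, 'sink', 1) for j in jobs]
    return edges
```

Any integer-valued max-flow routine applied to these edges then yields the matching, by the correspondence argued above.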

For a second application of network flow methods, consider an undirected graph G. The edge-connectivity

of G is deﬁned as the smallest number of edges whose removal would disconnect G. Certainly, for instance,


if we remove all of the edges incident to a single vertex v, we will disconnect the graph. Hence the edge

connectivity cannot exceed the minimum degree of vertices in the graph. However the edge connectivity

could be a lot smaller than the minimum degree, as the graph of Fig. 3.8.4 shows, in which the minimum degree is large, but the removal of just one edge will disconnect the graph.

Fig. 3.8.4: Big degree, low connectivity

Finding the edge connectivity is quite an important combinatorial problem, and it is by no means

obvious that network ﬂow methods can be used on it, but they can, and here is how.

Given a graph G of V vertices, we solve not just one but V − 1 network flow problems, one for each vertex j = 2, . . . , V.

Fix such a vertex j. Then consider vertex 1 of G to be the source and vertex j to be the sink of a network X_j. Replace each edge of G by two edges of X_j, one in each direction, each with capacity 1. Now solve the network flow problem in X_j, obtaining a maximum flow Q(j). Then the smallest of the numbers Q(j), for j = 2, . . . , V, is the edge connectivity of G. We will not prove this here.∗
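A self-contained sketch of this procedure follows. Since all capacities are 1, a plain breadth-first augmenting-path max-flow suffices (each augmentation adds exactly one unit); this helper is an illustrative stand-in for any of the chapter's max-flow algorithms, and the 0..n−1 vertex numbering is an assumption.

```python
from collections import deque

def unit_maxflow(n, arcs, s, t):
    """Max flow by BFS augmenting paths; every arc has capacity 1,
    so each augmentation pushes exactly one unit."""
    cap = {}
    for u, v, c in arcs:
        cap[u, v] = cap.get((u, v), 0) + c
        cap.setdefault((v, u), 0)                  # residual arc
    adj = {i: set() for i in range(n)}
    for (u, v) in cap:
        adj[u].add(v)
    flow = 0
    while True:
        prev = {s: None}                           # BFS for an augmenting path
        q = deque([s])
        while q and t not in prev:
            u = q.popleft()
            for v in adj[u]:
                if v not in prev and cap[u, v] > 0:
                    prev[v] = u
                    q.append(v)
        if t not in prev:
            return flow
        v = t                                      # push one unit back along the path
        while prev[v] is not None:
            u = prev[v]
            cap[u, v] -= 1
            cap[v, u] += 1
            v = u
        flow += 1

def edge_connectivity(n, graph_edges):
    """Smallest number of edges whose removal disconnects G
    (vertices 0..n-1), via n-1 max-flow computations."""
    arcs = ([(u, v, 1) for u, v in graph_edges]
            + [(v, u, 1) for u, v in graph_edges])
    return min(unit_maxflow(n, arcs, 0, j) for j in range(1, n))
```

On two triangles joined by a single bridge edge, the minimum degree is 2 but the routine reports edge connectivity 1, the situation of Fig. 3.8.4 in miniature.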

As a ﬁnal application of network ﬂow we discuss the beautiful question of determining whether or not

there is a matrix of 0's and 1's that has given row and column sums. For instance, is there a 6 × 8 matrix whose row sums are respectively (5, 5, 4, 3, 5, 6) and whose column sums are (3, 4, 4, 4, 3, 3, 4, 3)? Of

course the phrase ‘row sums’ means the same thing as ‘number of 1’s in each row’ since we have said that

the entries are only 0 or 1.

Hence in general, let there be given a row sum vector (r_1, . . . , r_m) and a column sum vector (s_1, . . . , s_n). We ask if there exists an m × n matrix A of 0's and 1's that has exactly r_i 1's in the ith row and exactly s_j 1's in the jth column, for each i = 1, . . . , m, j = 1, . . . , n. The reader will no doubt have noticed that for such a matrix to exist it must surely be true that

r_1 + ··· + r_m = s_1 + ··· + s_n        (3.8.1)

since each side counts the total number of 1’s in the matrix. Hence we will suppose that (3.8.1) is true.

Now we will construct a network Y of m + n + 2 vertices named s, x_1, . . . , x_m, y_1, . . . , y_n, and t. There is an edge of capacity r_i drawn from the source s to vertex x_i, for each i = 1, . . . , m, and an edge of capacity s_j drawn from vertex y_j to the sink t, for each j = 1, . . . , n. Finally, there are mn edges of capacity 1 drawn from each vertex x_i to each vertex y_j.

Next find a maximum flow in this network. Then there is a 0-1 matrix with the given row and column sum vectors if and only if a maximum flow saturates every edge outbound from the source, that is, if and only if a maximum flow has value equal to the right or left side of equation (3.8.1). If such a flow exists then a matrix A of the desired kind is constructed by putting a_{i,j} equal to the flow in the edge from x_i to y_j.
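To make the construction concrete, here is a sketch in Python that assembles the edge list of the network Y from given row and column sum vectors; the vertex labels are illustrative. Feeding these edges to any max-flow routine and comparing the flow value with the common value in (3.8.1) then answers the existence question.

```python
def row_column_network(r, s):
    """Edges of the network Y that decides whether a 0-1 matrix with
    row sums r and column sums s exists.  Assumes sum(r) == sum(s)."""
    m, n = len(r), len(s)
    edges = [('s', ('x', i), r[i]) for i in range(m)]      # source edges
    edges += [(('x', i), ('y', j), 1)                      # unit-capacity grid
              for i in range(m) for j in range(n)]
    edges += [(('y', j), 't', s[j]) for j in range(n)]     # sink edges
    return edges
```

For the 6 × 8 example above the network has 6 source edges, 48 unit edges, and 8 sink edges, and both sides of (3.8.1) equal 28.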

∗ S. Even and R. E. Tarjan, Network flow and testing graph connectivity, SIAM J. Computing 4 (1975), 507-518.


Exercises for section 3.8

1. Apply the max-ﬂow min-cut theorem to the network that is constructed in order to solve the bipartite

matching problem. Precisely what does a cut correspond to in this network? What does the theorem tell

you about the matching problem?

2. Same as question 1 above, but applied to the question of discovering whether or not there is a 0-1 matrix

with a certain given set of row and column sums.

Bibliography

The standard reference for the network ﬂow problem and its variants is

L. R. Ford and D. R. Fulkerson, Flows in Networks, Princeton University Press, Princeton, NJ, 1974.

The algorithm, the example of irrational capacities and lack of convergence to maximum ﬂow, and many

applications are discussed there. The chronology of accelerated algorithms is based on the following papers.

The ﬁrst algorithms with a time bound independent of the edge capacities are in

J. Edmonds and R. M. Karp, Theoretical improvements in algorithmic eﬃciency for network ﬂow prob-

lems, JACM 19, 2 (1972), 248-264.

E. A. Dinic, Algorithm for solution of a problem of maximal ﬂow in a network with power estimation,

Soviet Math. Dokl., 11 (1970), 1277-1280.

The paper of Dinic, above, also originated the idea of a layered network. Further accelerations of the

network flow algorithms are found in the following.

A. V. Karzanov, Determining the maximal ﬂow in a network by the method of preﬂows, Soviet Math.

Dokl. 15 (1974), 434-437.

B. V. Cherkassky, Algorithm of construction of maximal flow in networks with complexity of O(V²√E) operations, Akad. Nauk. USSR, Mathematical methods for the solution of economical problems 7 (1977), 117-126.

The MPM algorithm, discussed in the text, is due to

V. M. Malhotra, M. Pramodh-Kumar and S. N. Maheshwari, An O(V³) algorithm for finding maximum flows in networks, Information Processing Letters 7, (1978), 277-278.

Later algorithms depend on refined data structures that save fragments of partially constructed augmenting paths. These developments were initiated in

Z. Galil, A new algorithm for the maximal ﬂow problem, Proc. 19th IEEE Symposium on the Founda-

tions of Computer Science, Ann Arbor, October 1978, 231-245.

Andrew V. Goldberg and Robert E. Tarjan, A new approach to the maximum ﬂow problem, 1985.

A number of examples that show that the theoretical complexity estimates for the various algorithms

cannot be improved are contained in

Z. Galil, On the theoretical eﬃciency of various network ﬂow algorithms, IBM report RC7320, September

1978.

The proof given in the text, of theorem 3.6.1, leans heavily on the one in

Shimon Even, Graph Algorithms, Computer Science Press, Potomac, MD, 1979.

If edge capacities are all 0’s and 1’s, as in matching problems, then still faster algorithms can be given,

as in

S. Even and R. E. Tarjan, Network ﬂow and testing graph connectivity, SIAM J. Computing 4, (1975),

507-518.

If every pair of vertices is to act, in turn, as source and sink, then considerable economies can be realized,

as in

R. E. Gomory and T. C. Hu, Multiterminal network flows, SIAM Journal, 9 (1961), 551-570.

Matching in general graphs is much harder than in bipartite graphs. The pioneering work is due to

J. Edmonds, Paths, trees, and flowers, Canadian J. Math. 17 (1965), 449-467.


Chapter 4: Algorithms in the Theory of Numbers

Number theory is the study of the properties of the positive integers. It is one of the oldest branches of

mathematics, and one of the purest, so to speak. It has immense vitality, however, and we will see in this

chapter and the next that parts of number theory are extremely relevant to current research in algorithms.

Part of the reason for this is that number theory enters into the analysis of algorithms, but that isn’t

the whole story.

Part of the reason is that many famous problems of number theory, when viewed from an algorithmic

viewpoint (like, how do you decide whether or not a positive integer n is prime?) present extremely deep

and attractive unsolved algorithmic problems. At least, they are unsolved if we regard the question as not

just how to do these problems computationally, but how to do them as rapidly as possible.

But that’s not the whole story either.

There are close connections between algorithmic problems in the theory of numbers, and problems

in other ﬁelds, seemingly far removed from number theory. There is a unity between these seemingly

diverse problems that enhances the already considerable beauty of any one of them. At least some of these

connections will be apparent by the end of study of Chapter 5.

4.1 Preliminaries

We collect in this section a number of facts about the theory of numbers, for later reference.

If n and m are positive integers then to divide n by m is to ﬁnd an integer q ≥ 0 (the quotient) and an

integer r (the remainder) such that 0 ≤ r < m and n = qm + r.

If r = 0, we say that 'm divides n,' or 'm is a divisor of n,' and we write m|n. In any case the remainder r is also called 'n modulo m,' and we write r = n mod m. Thus 4 = 11 mod 7, for instance.

If n has no divisors other than m = n and m = 1, then n is prime, else n is composite. Every positive integer n can be factored into primes, uniquely apart from the order of the factors. Thus 120 = 2³·3·5, and in general we will write

n = p_1^{a_1} p_2^{a_2} ··· p_l^{a_l} = ∏_{i=1}^{l} p_i^{a_i}.        (4.1.1)

We will refer to (4.1.1) as the canonical factorization of n.

Many interesting and important properties of an integer n can be calculated from its canonical factor-

ization. For instance, let d(n) be the number of divisors of the integer n. The divisors of 6 are 1, 2, 3, 6, so

d(6) = 4.

Can we find a formula for d(n)? A small example may help to clarify the method. Since 120 = 2³·3·5, a divisor of 120 must be of the form m = 2^a 3^b 5^c, in which a can have the values 0, 1, 2, 3, b can be 0 or 1, and c can be 0 or 1. Thus there are 4 choices for a, 2 for b and 2 for c, so there are 16 divisors of 120.

In general, the integer n in (4.1.1) has exactly

d(n) = (1 + a_1)(1 + a_2) ··· (1 + a_l)        (4.1.2)

divisors.
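Formula (4.1.2) turns directly into code. In the Python sketch below the canonical factorization is supplied as a dict of prime: exponent pairs, an assumed input format.

```python
def divisor_count(factorization):
    """d(n) computed from the canonical factorization of n,
    as in formula (4.1.2): the product of the (1 + a_i)."""
    d = 1
    for exponent in factorization.values():
        d *= 1 + exponent
    return d
```

For 120 = 2³·3·5 this gives (1 + 3)(1 + 1)(1 + 1) = 16 divisors, agreeing with the count above.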

If m and n are nonnegative integers then their greatest common divisor, written gcd(n, m), is the integer

g that

(a) divides both m and n and

(b) is divisible by every other common divisor of m and n.

Thus gcd(12, 8) = 4, gcd(42, 33) = 3, etc. If gcd(n, m) = 1 then we say that n and m are relatively

prime. Thus 27 and 125 are relatively prime (even though neither of them is prime).

If n > 0 is given, then φ(n) will denote the number of positive integers m such that m ≤ n and

gcd(n, m) = 1. Thus φ(6) = 2, because there are only two positive integers ≤ 6 that are relatively prime to

6 (namely 1 and 5). φ(n) is called the Euler φ-function, or the Euler totient function.

Let’s ﬁnd a formula that expresses φ(n) in terms of the canonical factorization (4.1.1) of n.


We want to count the positive integers m for which m ≤ n, and m is not divisible by any of the primes p_i that appear in (4.1.1). There are n possibilities for such an integer m. Of these we throw away n/p_1 of them because they are divisible by p_1. Then we discard n/p_2 multiples of p_2, etc. This leaves us with

n − n/p_1 − n/p_2 − ··· − n/p_l        (4.1.3)

possible m's.

But we have thrown away too much. An integer m that is a multiple of both p_1 and p_2 has been discarded at least twice. So let's correct these errors by adding

n/(p_1 p_2) + n/(p_1 p_3) + ··· + n/(p_1 p_l) + ··· + n/(p_{l−1} p_l)

to (4.1.3).

The reader will have noticed that we added back too much, because an integer that is divisible by p_1 p_2 p_3, for instance, would have been re-entered at least twice. The 'bottom line' of counting too much, then too little, then too much, etc., is the messy formula

φ(n) = n − n/p_1 − n/p_2 − ··· − n/p_l
         + n/(p_1 p_2) + ··· + n/(p_{l−1} p_l)
         − n/(p_1 p_2 p_3) − ··· − n/(p_{l−2} p_{l−1} p_l)
         + ··· + (−1)^l n/(p_1 p_2 ··· p_l).        (4.1.4)

Fortunately (4.1.4) is identical with the much simpler expression

φ(n) = n(1 − 1/p_1)(1 − 1/p_2) ··· (1 − 1/p_l)        (4.1.5)

which the reader can check by beginning with (4.1.5) and expanding the product.

To calculate φ(120), for example, we first find the canonical factorization 120 = 2³·3·5. Then we apply (4.1.5) to get

φ(120) = 120(1 − 1/2)(1 − 1/3)(1 − 1/5) = 32.

Thus, among the integers 1, 2, . . . , 120, there are exactly 32 that are relatively prime to 120.
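Formula (4.1.5) also computes well, because each factor (1 − 1/p_i) can be applied in integer arithmetic as result −= result // p. The sketch below factors n by trial division; that choice is only for illustration, since fast factoring is a much harder problem.

```python
def phi(n):
    """Euler's totient function, via (4.1.5) and trial-division factoring."""
    result, m = n, n
    p = 2
    while p * p <= m:
        if m % p == 0:
            result -= result // p        # multiply result by (1 - 1/p)
            while m % p == 0:            # strip out this prime entirely
                m //= p
        p += 1
    if m > 1:                            # a single prime factor remains
        result -= result // m
    return result
```

A quick check reproduces the values in the text: phi(120) is 32 and phi(6) is 2.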

Exercises for section 4.1

1. Find a formula for the sum of the divisors of an integer n, expressed in terms of its prime divisors and

their multiplicities.

2. How many positive integers are ≤ 10^10 and have an odd number of divisors? Find a simple formula for the number of such integers that are ≤ n.

3. If φ(n) = 2 then what do you know about n?

4. For which n is φ(n) odd?

4.2 The greatest common divisor

Let m and n be two positive integers. Suppose we divide n by m, to obtain a quotient q and a remainder

r, with, of course, 0 ≤ r < m. Then we have

n = qm+r. (4.2.1)

If g is some integer that divides both n and m then obviously g divides r also. Thus every common divisor

of n and m is a common divisor of m and r. Conversely, if g is a common divisor of m and r then (4.2.1)

shows that g divides n too.

It follows that gcd(n, m) = gcd(m, r). If r = 0 then n = qm, and clearly, gcd(n, m) = m.


If we use the customary abbreviation ‘n mod m’ for r, the remainder in the division of n by m, then

what we have shown is that

gcd(n, m) = gcd(m, n mod m).

This leads to the following recursive procedure for computing the g.c.d.

function gcd(n, m);

{finds gcd of given nonnegative integers n and m}

if m = 0 then gcd := n else gcd := gcd(m, n mod m)

end.

The above is the famous ‘Euclidean algorithm’ for the g.c.d. It is one of the oldest algorithms known.

The reader is invited to write the Euclidean algorithm as a recursive program, and get it working on

some computer. Use a recursive language, write the program more or less as above, and try it out with some

large, healthy integers n and m.
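One possible transcription is the following Python sketch, which follows the Pascal-style pseudocode above nearly word for word:

```python
def gcd(n, m):
    # Trivial case first: it is the only way the recursion can stop.
    if m == 0:
        return n
    # Otherwise use gcd(n, m) = gcd(m, n mod m).
    return gcd(m, n % m)

print(gcd(13, 21))          # 1, via the chain of calls shown in (4.2.2)
```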

The gcd program exhibits all of the symptoms of recursion. It calls itself with smaller values of its

variable list. It begins with ‘if trivialcase then do trivialthing’ (m = 0), and this case is all-important

because it’s the only way the procedure can stop itself.

If, for example, we want the g.c.d. of 13 and 21, we call the program with n = 13 and m = 21, and it

then recursively calls itself with the following arguments:

(21, 13), (13, 8), (8, 5), (5, 3), (3, 2), (2, 1), (1, 0) (4.2.2)

When it arrives at a call in which the ‘m’ is 0, then the ‘n,’ namely 1 in this case, is the desired g.c.d.

What is the input to the problem? The two integers n, m whose g.c.d. we want are the input, and the number of bits that are needed to input those two integers is Θ(log n) + Θ(log m), namely Θ(log mn). Hence c log mn is the length of the input bit string. Now let’s see how long the algorithm might run with an input string of that length.∗

To measure the running time of the algorithm we need ﬁrst to choose a unit of cost or work. Let’s

agree that one unit of labor is the execution of a single ‘a mod b’ operation. In this problem, an equivalent

measure of cost would be the number of times the algorithm calls itself recursively. In the example (4.2.2)

the cost was 7 units.

Lemma 4.2.1. If 1 ≤ b ≤ a then a mod b ≤ (a −1)/2.

Proof: Clearly a mod b ≤ b − 1. Further,

a mod b = a − ⌊a/b⌋ b ≤ a − b.

Thus a mod b ≤ min(a − b, b − 1). Now we distinguish two cases.

First suppose b ≤ (a + 1)/2. Then b − 1 ≤ a − b and so

a mod b ≤ b − 1 ≤ (a + 1)/2 − 1 = (a − 1)/2

in this case.

Next, suppose b > (a + 1)/2. Then a − b ≤ b − 1 and

a mod b ≤ a − b < a − (a + 1)/2 = (a − 1)/2

so the result holds in either case.

∗ In Historia Mathematica 21 (1994), 401–419, Jeffrey Shallit traces this analysis back to Pierre-Joseph-Étienne Finck, in 1841.


Theorem 4.2.1. (A worst-case complexity bound for the Euclidean algorithm) Given two positive integers a, b. The Euclidean algorithm will find their greatest common divisor after a cost of at most ⌊2 log_2 M⌋ + 1 integer divisions, where M = max(a, b).

Before we prove the theorem, let’s return to the example (a, b) = (13, 21) of the display (4.2.2). In that case M = 21 and 2 log_2 M + 1 = 9.78 . . .. The theorem asserts that the g.c.d. will be found after at most 9 operations. In fact it was found after 7 operations in that case.
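The bound is easy to test empirically. The sketch below (names ours) counts the ‘a mod b’ operations performed by an iterative version of the algorithm and compares the count with ⌊2 log_2 M⌋ + 1:

```python
from math import log2

def gcd_cost(n, m):
    # Return (gcd, number of 'a mod b' operations), iteratively.
    cost = 0
    while m != 0:
        n, m = m, n % m
        cost += 1
    return n, cost

g, ops = gcd_cost(13, 21)
print(g, ops)                        # 1 7: seven operations, as in (4.2.2)
print(ops <= int(2 * log2(21)) + 1)  # True: 7 <= 9
```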

Proof of theorem: Suppose first that a ≥ b. The algorithm generates a sequence a_0, a_1, . . . where a_0 = a, a_1 = b, and

a_{j+1} = a_{j−1} mod a_j (j ≥ 1).

By lemma 4.2.1,

a_{j+1} ≤ (a_{j−1} − 1)/2 ≤ a_{j−1}/2.

Then, by induction on j it follows that

a_{2j} ≤ a_0/2^j (j ≥ 0)
a_{2j+1} ≤ a_1/2^j (j ≥ 0)

and so,

a_r ≤ 2^{−r/2} M (r = 0, 1, 2, . . .).

Obviously the algorithm has terminated if a_r < 1, and this will have happened when r is large enough so that 2^{−r/2} M < 1, i.e., if r > 2 log_2 M. If a < b then after 1 operation we will be in the case ‘a ≥ b’ that we have just discussed, and the proof is complete.

The upper bound in the statement of theorem 4.2.1 can be visualized as follows. The number log_2 M is almost exactly the number of bits in the binary representation of M (what is ‘exactly’ that number of bits?).

bits?). Theorem 4.2.1 therefore asserts that we can ﬁnd the g.c.d. of two integers in a number of operations

that is at most a linear function of the number of bits that it takes to represent the two numbers. In brief,

we might say that ‘Time = O(bits),’ in the case of Euclid’s algorithm.

Exercises for section 4.2

1. Write a nonrecursive program, in Basic or Fortran, for the g.c.d. Write a recursive program, in Pascal or

a recursive language of your choice, for the g.c.d.

2. Choose 1000 pairs of integers (n, m), at random between 1 and 1000. For each pair, compute the g.c.d.

using a recursive program and a nonrecursive program.

(a) Compare the execution times of the two programs.

(b) There is a theorem to the effect that the probability that two random integers have g.c.d. = 1 is 6/π^2. What, precisely, do you think that this theorem means by ‘the probability that ...’? What percentage of the 1000 pairs that you chose had g.c.d. = 1? Compare your observed percentage with 100 · (6/π^2).

3. Find out when Euclid lived, and with exactly what words he described his algorithm.

4. Write a program that will light up a pixel in row m and column n of your CRT display if and only if

gcd(m, n) = 1. Run the program with enough values of m and n to ﬁll your screen. If you see any interesting

visual patterns, try to explain them mathematically.

5. Show that if m and n have a total of B bits, then Euclid’s algorithm will not need more than 2B + 3

operations before reaching termination.


6. Suppose we have two positive integers m, n, and we have factored them completely into primes, in the form

m = ∏_i p_i^{a_i}; n = ∏_i q_i^{b_i}.

How would you calculate gcd(m, n) from the above information? How would you calculate the least common multiple (lcm) of m and n from the above information? Prove that gcd(m, n) = mn/lcm(m, n).

7. Calculate gcd(102131, 56129) in two ways: use the method of exercise 6 above, then use the Euclidean

algorithm. In each case count the total number of arithmetic operations that you had to do to get the

answer.

8. Let F_n be the n-th Fibonacci number. How many operations will be needed to compute gcd(F_n, F_{n−1}) by the Euclidean algorithm? What is gcd(F_n, F_{n−1})?

4.3 The extended Euclidean algorithm

Again suppose n, m are two positive integers whose g.c.d. is g. Then we can always write g in the form

g = tn + um (4.3.1)

where t and u are integers. For instance, gcd(14, 11) = 1, so we can write 1 = 14t + 11u for integers t, u.

Can you spot integers t, u that will work? One pair that does the job is (4, −5), and there are others (can

you ﬁnd all of them?).

The extended Euclidean algorithm ﬁnds not only the g.c.d. of n and m, it also ﬁnds a pair of integers t,

u that satisfy (4.3.1). One ‘application’ of the extended algorithm is that we will obtain an inductive proof

of the existence of t, u, that is not immediately obvious from (4.3.1) (see exercise 1 below). While this hardly

rates as a ‘practical’ application, it represents a very important feature of recursive algorithms. We might

say, rather generally, that the following items go hand-in-hand:

Recursive algorithms

Inductive proofs

Complexity analyses by recurrence formulas

If we have a recursive algorithm, then it is natural to prove the validity of the algorithm by mathematical

induction. Conversely, inductive proofs of theorems often (not always, alas!) yield recursive algorithms for

the construction of the objects that are being studied. The complexity analysis of a recursive algorithm will

use recurrence formulas, in a natural way. We saw that already in the analysis that proved theorem 4.2.1.

Now let’s discuss the extended algorithm. Input to it will be two integers n and m. Output from it will

be g = gcd(n, m) and two integers t and u for which (4.3.1) is true.

A single step of the original Euclidean algorithm took us from the problem of finding gcd(n, m) to gcd(m, n mod m). Suppose, inductively, that we not only know g = gcd(m, n mod m) but we also know the coefficients t′, u′ for the equation

g = t′m + u′(n mod m). (4.3.2)

Can we get out, at the next step, the corresponding coefficients t, u for (4.3.1)? Indeed we can, by substituting in (4.3.2) the fact that

n mod m = n − ⌊n/m⌋ m (4.3.3)

we find that

g = t′m + u′(n − ⌊n/m⌋ m)
  = u′n + (t′ − u′⌊n/m⌋)m. (4.3.4)

Hence the rule by which t′, u′ for equation (4.3.2) transform into t, u for equation (4.3.1) is that

t = u′
u = t′ − ⌊n/m⌋ u′. (4.3.5)


We can now formulate recursively the extended Euclidean algorithm.

procedure gcdext(n, m, g, t, u);
{computes g.c.d. of n and m, and finds
integers t, u that satisfy (4.3.1)}
if m = 0 then
g := n; t := 1; u := 0
else
gcdext(m, n mod m, g, t, u);
s := u;
u := t − ⌊n/m⌋ u;
t := s
end.{gcdext}

It is quite easy to use the algorithm above to make a proof of the main mathematical result of this

section (see exercise 1), which is

Theorem 4.3.1. Let m and n be given integers, and let g be their greatest common divisor. Then there

exist integers t, u such that g = tm+un.

An immediate consequence of the algorithm and the theorem is the fact that ﬁnding inverses modulo a

given integer is an easy computational problem. We will need to refer to that fact in the sequel, so we state

it as

Corollary 4.3.1. Let m and n be given positive integers, and let g be their g.c.d. Then m has a multi-

plicative inverse modulo n if and only if g = 1. In that case, the inverse can be computed in polynomial

time.

Proof: By the extended Euclidean algorithm we can find, in linear time, integers t and u such that g = tm + un. But this last equation says that tm ≡ g (mod n). If g = 1 then it is obvious that t is the inverse mod n of m. If g > 1 then there exists no t such that tm ≡ 1 (mod n), since tm = 1 + rn implies that the g.c.d. of m and n is 1.

We will now trace the execution of gcdext if it is called with (n, m) = (14, 11). The routine ﬁrst replaces

(14,11) by (11,3) and calls itself. Then it calls itself successively with (3,2), (2,1) and (1,0). When it executes

with (n, m) = (1, 0) it encounters the ‘if m = 0’ statement, so it sets g := 1, t := 1, u := 0.

Now it can complete the execution of the call with (n, m) = (2, 1), which has so far been pending. To

do this it sets

u := t − ⌊n/m⌋ u = 1
t := 0.

The call with (n, m) = (2, 1) is now complete. The call to the routine with (n, m) = (3, 2) has been in

limbo until just this moment. Now that the (2,1) call is ﬁnished, the (3,2) call executes and ﬁnds

u := 0 − ⌊3/2⌋ · 1 = −1
t := 1.

The call to the routine with (n, m) = (11, 3) has so far been languishing, but its turn has come. It

computes

u := 1 − ⌊11/3⌋ · (−1) = 4
t := −1.

Finally, the original call to gcdext from the user, with (n, m) = (14, 11), can be processed. We ﬁnd

u := (−1) − ⌊14/11⌋ · 4 = −5
t := 4.


Therefore, to the user, gcdext returns the values g = 1, u = −5, t = 4, and we see that the procedure has

found the representation (4.3.1) in this case. The importance of the ‘trivial case’ where m = 0 is apparent.
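The recursion is easy to experiment with in a language that can return several values at once. Here is a sketch of gcdext in Python (returning the tuple (g, t, u) instead of using output parameters); it reproduces the trace above, and, as corollary 4.3.1 promises, it also yields modular inverses:

```python
def gcdext(n, m):
    # Trivial case: gcd(n, 0) = n = 1*n + 0*m.
    if m == 0:
        return n, 1, 0
    # Inductive step: g = t'*m + u'*(n mod m); then apply rule (4.3.5).
    g, t1, u1 = gcdext(m, n % m)
    return g, u1, t1 - (n // m) * u1

print(gcdext(14, 11))       # (1, 4, -5): 1 = 4*14 + (-5)*11

# Inverse of 49 modulo 73 (exercise 7): since g = 1, t*49 ≡ 1 (mod 73).
g, t, u = gcdext(49, 73)
print(t % 73)
```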

Exercises for section 4.3

1. Give a complete formal proof of theorem 4.3.1. Your proof should be by induction (on what?) and should

use the extended Euclidean algorithm.

2. Find integers t, u such that

(a) 1 = 4t +7u

(b) 1 = 24t + 35u

(c) 5 = 65t + 100u

3. Let a_1, . . . , a_n be positive integers.

(a) How would you compute gcd(a_1, . . . , a_n)?

(b) Prove that there exist integers t_1, . . . , t_n such that

gcd(a_1, . . . , a_n) = t_1 a_1 + t_2 a_2 + ··· + t_n a_n.

(c) Give a recursive algorithm for the computation of t_1, . . . , t_n in part (b) above.

4. If r = ta + ub, where r, a, b, t, u are all integers, must r = gcd(a, b)? What, if anything, can be said about the relationship of r to gcd(a, b)?

5. Let (t_0, u_0) be one pair of integers t, u for which gcd(a, b) = ta + ub. Find all such pairs of integers, a and b being given.

6. Find all solutions to exercises 2(a)-(c) above.

7. Find the multiplicative inverse of 49 modulo 73, using the extended Euclidean algorithm.

8. If gcdext is called with (n, m) = (98, 30), draw a picture of the complete tree of calls that will occur

during the recursive execution of the program. In your picture show, for each recursive call in the tree, the

values of the input parameters to that call and the values of the output variables that were returned by that

call.

4.4 Primality testing

In Chapter 1 we discussed the important distinction between algorithms that run in polynomial time vs. those that may require exponential time. Since then we have seen some fast algorithms and some slow ones. In the network flow problem the complexity of the MPM algorithm was O(V^3), a low power of the size of the input data string, and the same holds true for the various matching and connectivity problems that are special cases of the network flow algorithm.

Likewise, the Fast Fourier Transform is really Fast. It needs only O(n log n) time to find the transform of a sequence of length n if n is a power of two, and only O(n^2) time in the worst case, where n is prime.

In both of those problems we were dealing with computational situations near the low end of the

complexity scale. It is feasible to do a Fast Fourier Transform on, say, 1000 data points. It is feasible to

calculate maximum ﬂows in networks with 1000 vertices or so.

On the other hand, the recursive computation of the chromatic polynomial in section 2.3 of Chapter 2

was an example of an algorithm that might use exponential amounts of time.

In this chapter we will meet another computational question for which, to date, no one has ever been

able to provide a polynomial-time algorithm, nor has anyone been able to prove that such an algorithm does

not exist.

The problem is just this: Given a positive integer n. Is n prime?


The reader should now review the discussion in Example 3 of section 0.2. In that example we showed

that the obvious methods of testing for primality are slow in the sense of complexity theory. That is, we

do an amount of work that is an exponentially growing function of the length of the input bit string if we

use one of those methods. So this problem, which seems like a ‘pushover’ at ﬁrst glance, turns out to be

extremely diﬃcult.

Although it is not known if a polynomial-time primality testing algorithm exists, remarkable progress

on the problem has been made in recent years.

One of the most important of these advances was made independently and almost simultaneously by

Solovay and Strassen, and by Rabin, in 1976-7. These authors took the imaginative step of replacing

‘certainly’ by ‘probably,’ and they devised what should be called a probabilistic compositeness (an integer

is composite if it is not prime) test for integers, that runs in polynomial time.

Here is how the test works. First choose a number b uniformly at random, 1 ≤ b ≤ n − 1. Next,

subject the pair (b, n) to a certain test, called a pseudoprimality test, to be described below. The test has

two possible outcomes: either the number n is correctly declared to be composite or the test is inconclusive.

If that were the whole story it would scarcely have been worth the telling. Indeed the test ‘Does b divide n?’ already would perform the function stated above. However, it has a low probability of success even if n is composite, and if the answer is ‘No,’ we would have learned virtually nothing.

The additional property that the test described below has, not shared by the more naive test ‘Does b

divide n?,’ is that if n is composite, the chance that the test will declare that result is at least 1/2.

In practice, for a given n we would apply the test 100 times using 100 numbers b_i that are independently chosen at random in [1, n − 1]. If n is composite, the probability that it will be declared composite at least once is at least 1 − 2^{−100}, and these are rather good odds. Each test would be done in quick polynomial time.

If n is not found to be composite after 100 trials, and if certainty is important, then it would be worthwhile

to subject n to one of the nonprobabilistic primality tests in order to dispel all doubt.

It remains to describe the test to which the pair (b, n) is subjected, and to prove that it detects com-

positeness with probability ≥ 1/2.

Before doing this we mention another important development. A more recent primality test, due to

Adleman, Pomerance and Rumely in 1983, is completely deterministic. That is, given n it will surely decide

whether or not n is prime. The test is more elaborate than the one that we are about to describe, and it

runs in tantalizingly close to polynomial time. In fact it was shown to run in time

O((log n)^{c log log log n})

for a certain constant c. Since the number of bits of n is a constant multiple of log n, this latter estimate is of the form

O((Bits)^{c log log Bits}).

The exponent of ‘Bits,’ which would be constant in a polynomial time algorithm, in fact grows extremely

slowly as n grows. This is what was referred to as ‘tantalizingly close’ to polynomial time, earlier.

It is important to notice that in order to prove that a number is not prime, it is certainly suﬃcient to

ﬁnd a nontrivial divisor of that number. It is not necessary to do that, however. All we are asking for is a

‘yes’ or ‘no’ answer to the question ‘is n prime?.’ If you should ﬁnd it discouraging to get only the answer

‘no’ to the question ‘Is 7122643698294074179 prime?,’ without getting any of the factors of that number,

then what you want is a fast algorithm for the factorization problem.

In the test that follows, the decision about the compositeness of n will be reached without a knowledge

of any of the factors of n. This is true of the Adleman, Pomerance, Rumely test also. The question of

ﬁnding a factor of n, or all of them, is another interesting computational problem that is under active

investigation. Of course the factorization problem is at least as hard as ﬁnding out if an integer is prime,

and so no polynomial-time algorithm is known for it either. Again, there are probabilistic algorithms for the

factorization problem just as there are for primality testing, but in the case of the factorization problem,

even they don’t run in polynomial-time.

In section 4.9 we will discuss a probabilistic algorithm for factoring large integers, after some motivation

in section 4.8, where we remark on the connection between computationally intractable problems and cryp-

tography. Speciﬁcally, we will describe one of the ‘Public Key’ data encryption systems whose usefulness

stems directly from the diﬃculty of factoring large integers.


Isn’t it amazing that in this technologically enlightened age we still don’t know how to ﬁnd a divisor of

a whole number quickly?

4.5 Interlude: the ring of integers modulo n

In this section we will look at the arithmetic structure of the integers modulo some ﬁxed integer n.

These results will be needed in the sequel, but they are also of interest in themselves and have numerous

applications.

Consider the ring whose elements are 0, 1, 2, . . . , n − 1 and in which we do addition, subtraction, and multiplication modulo n. This ring is called Z_n. For example, in Table 4.5.1 we show the addition and multiplication tables of Z_6.

+ 0 1 2 3 4 5        ∗ 0 1 2 3 4 5
0 0 1 2 3 4 5        0 0 0 0 0 0 0
1 1 2 3 4 5 0        1 0 1 2 3 4 5
2 2 3 4 5 0 1        2 0 2 4 0 2 4
3 3 4 5 0 1 2        3 0 3 0 3 0 3
4 4 5 0 1 2 3        4 0 4 2 0 4 2
5 5 0 1 2 3 4        5 0 5 4 3 2 1

Table 4.5.1: Arithmetic in the ring Z_6

Notice that while Z_n is a ring, it certainly need not be a field, because there will usually be some noninvertible elements. Reference to Table 4.5.1 shows that 2, 3, 4 have no multiplicative inverses in Z_6, while 1, 5 do have such inverses. The difference, of course, stems from the fact that 1 and 5 are relatively prime to the modulus 6 while 2, 3, 4 are not. We learned, in corollary 4.3.1, that an element m of Z_n is invertible if and only if m and n are relatively prime.

The invertible elements of Z_n form a multiplicative group. We will call that group the group of units of Z_n and will denote it by U_n. It has exactly φ(n) elements, by lemma 4.5.1, where φ is the Euler function of (4.1.5).

The multiplication table of the group U_18 is shown in Table 4.5.2.

 ∗  1  5  7 11 13 17
 1  1  5  7 11 13 17
 5  5  7 17  1 11 13
 7  7 17 13  5  1 11
11 11  1  5 13 17  7
13 13 11  1 17  7  5
17 17 13 11  7  5  1

Table 4.5.2: Multiplication modulo 18

Notice that U_18 contains φ(18) = 6 elements, that each of them has an inverse and that each row (column) of the multiplication table contains a permutation of all of the group elements.

Let’s look at the table a little more closely, with a view to finding out if the group U_18 is cyclic. In a cyclic group there is an element a whose powers 1, a, a^2, a^3, . . . run through all of the elements of the group. If we refer to the table again, we see that in U_18 the powers of 5 are 1, 5, 7, 17, 13, 11, 1, . . .. Thus the order of the group element 5 is equal to the order of the group, and the powers of 5 exhaust all group elements. The group U_18 is indeed cyclic, and 5 is a generator of U_18.


A number (like 5 in the example) whose powers run through all elements of U_n is called a primitive root modulo n. Thus 5 is a primitive root modulo 18. The reader should now find, from Table 4.5.2, all of the primitive roots modulo 18.
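This search is easy to do by machine. The sketch below (function names ours) builds U_n and tests each element by listing its powers; for n = 18 it confirms that 5 is a generator and finds the one other primitive root:

```python
from math import gcd

def units(n):
    # The group U_n: residues in [1, n-1] that are relatively prime to n.
    return [x for x in range(1, n) if gcd(x, n) == 1]

def is_primitive_root(a, n):
    # a is a primitive root mod n when its powers exhaust all of U_n.
    group = units(n)
    powers, x = {1}, 1
    for _ in range(len(group) - 1):
        x = (x * a) % n
        powers.add(x)
    return powers == set(group)

print([a for a in units(18) if is_primitive_root(a, 18)])   # [5, 11]
```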

Alternatively, since the order of a group element must always divide the order of the group, every element of U_n has an order that divides φ(n). The primitive roots are exactly the elements, if they exist, of maximum possible order φ(n).

We pause to note two corollaries of these remarks, namely

Theorem 4.5.1 (‘Fermat’s theorem’). For every integer b that is relatively prime to n we have

b^{φ(n)} ≡ 1 (mod n). (4.5.1)

In particular, if n is a prime number then φ(n) = n − 1, and we have

Theorem 4.5.2 (‘Fermat’s little theorem’). If n is prime, then for all b ≢ 0 (mod n) we have b^{n−1} ≡ 1 (mod n).

It is important to know which groups U_n are cyclic, i.e., which integers n have primitive roots. The answer is given by

Theorem 4.5.3. An integer n has a primitive root if and only if n = 2 or n = 4 or n = p^a (p an odd prime) or n = 2p^a (p an odd prime). Hence, the groups U_n are cyclic for precisely such values of n.

The proof of theorem 4.5.3 is a little lengthy and is omitted. It can be found, for example, in the book

of LeVeque that is cited at the end of this chapter.

According to theorem 4.5.3, for example, U_18 is cyclic, which we have already seen, and U_12 is not cyclic, which the reader should check.

Further, we state as an immediate consequence of theorem 4.5.3,

Corollary 4.5.3. If n is an odd prime, then U_n is cyclic, and in particular the equation x^2 = 1, in U_n, has only the solutions x = ±1.

Next we will discuss the fact that if the integer n can be factored in the form n = p_1^{a_1} p_2^{a_2} ··· p_r^{a_r} then the full ring Z_n can also be factored, in a certain sense, as a ‘product’ of the rings Z_{p_i^{a_i}}.

Let’s take Z_6 as an example. Since 6 = 2 · 3, we expect that somehow Z_6 = Z_2 × Z_3. What this means is that we consider ordered pairs (x_1, x_2), where x_1 ∈ Z_2 and x_2 ∈ Z_3.

Here is how we do the arithmetic with the ordered pairs.

First, (x_1, x_2) + (y_1, y_2) = (x_1 + y_1, x_2 + y_2), in which the two ‘+’ signs on the right are different: the first ‘x_1 + y_1’ is done in Z_2 while the ‘x_2 + y_2’ is done in Z_3.

Second, (x_1, x_2)(y_1, y_2) = (x_1 y_1, x_2 y_2), in which the two multiplications on the right side are different: the ‘x_1 y_1’ is done in Z_2 and the ‘x_2 y_2’ in Z_3.

Therefore the 6 elements of Z_6 are

(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2).

A sample of the addition process is

(0, 2) + (1, 1) = (0 + 1, 2 + 1) = (1, 0)

where the addition of the first components was done modulo 2 and of the second components was done modulo 3.

A sample of the multiplication process is

(1, 2) · (1, 2) = (1 · 1, 2 · 2) = (1, 1)

in which multiplication of the first components was done modulo 2 and of the second components was done modulo 3.
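The componentwise rules, and the correspondence x → (x mod 2, x mod 3), can be checked mechanically. A minimal sketch in Python:

```python
# Arithmetic in Z_2 x Z_3 done componentwise: first coordinate mod 2,
# second coordinate mod 3. pair() is the map x -> (x mod 2, x mod 3).
def pair(x):
    return (x % 2, x % 3)

def add(p, q):
    return ((p[0] + q[0]) % 2, (p[1] + q[1]) % 3)

def mul(p, q):
    return ((p[0] * q[0]) % 2, (p[1] * q[1]) % 3)

print(add((0, 2), (1, 1)))  # (1, 0), the addition sample from the text
print(mul((1, 2), (1, 2)))  # (1, 1), the multiplication sample
```

Checking that pair((x + y) mod 6) equals add(pair(x), pair(y)) for all x, y in Z_6, and likewise for products, verifies that the map respects both ring operations, which is the content of theorem 4.5.4 for n = 6.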

In full generality we can state the factorization of Z_n as


Theorem 4.5.4. Let n = p_1^{a_1} p_2^{a_2} ··· p_r^{a_r}. The mapping which associates with each x ∈ Z_n the r-tuple (x_1, x_2, . . . , x_r), where x_i = x mod p_i^{a_i} (i = 1, r), is a ring isomorphism of Z_n with the ring of r-tuples (x_1, x_2, . . . , x_r) in which

(a) x_i ∈ Z_{p_i^{a_i}} (i = 1, r) and
(b) (x_1, . . . , x_r) + (y_1, . . . , y_r) = (x_1 + y_1, . . . , x_r + y_r) and
(c) (x_1, . . . , x_r) · (y_1, . . . , y_r) = (x_1 y_1, . . . , x_r y_r)
(d) In (b), the i-th ‘+’ sign on the right side is the addition operation of Z_{p_i^{a_i}} and in (c) the i-th ‘·’ sign is the multiplication operation of Z_{p_i^{a_i}}, for each i = 1, 2, . . . , r.

The proof of theorem 4.5.4 follows at once from the famous

Theorem 4.5.5 (‘The Chinese Remainder Theorem’). Let m_i (i = 1, r) be pairwise relatively prime positive integers, and let

M = m_1 m_2 ··· m_r.

Then the mapping that associates with each integer x (0 ≤ x ≤ M − 1) the r-tuple (b_1, b_2, . . . , b_r), where b_i = x mod m_i (i = 1, r), is a bijection between Z_M and Z_{m_1} × ··· × Z_{m_r}.

A good theorem deserves a good proof. An outstanding theorem deserves two proofs, at least, one

existential, and one constructive. So here are one of each for the Chinese Remainder Theorem.

Proof 1: We must show that each r-tuple (b_1, . . . , b_r) such that 0 ≤ b_i < m_i (i = 1, r) occurs exactly once. There are obviously M such vectors, and so it will be sufficient to show that each of them occurs at most once as the image of some x.

In the contrary case we would have x and x′ both corresponding to (b_1, b_2, . . . , b_r), say. But then x − x′ ≡ 0 modulo each of the m_i. Hence x − x′ is divisible by M = m_1 m_2 ··· m_r. But |x − x′| < M, hence x = x′.

Proof 2: Here’s how to compute a number x that satisfies the simultaneous congruences x ≡ b_i mod m_i (i = 1, r). First, by the extended Euclidean algorithm we can quickly find t_1, . . . , t_r, u_1, . . . , u_r, such that t_j(M/m_j) + u_j m_j = 1 for j = 1, . . . , r. Then we claim that the number x = Σ_j b_j t_j (M/m_j) satisfies all of the given congruences. Indeed, for each k = 1, 2, . . . , r we have

x = Σ_{j=1}^{r} b_j t_j (M/m_j)
  ≡ b_k t_k (M/m_k) (mod m_k)
  ≡ b_k (mod m_k)

where the first congruence holds because each M/m_j (j ≠ k) is divisible by m_k, and the second congruence follows since

t_k(M/m_k) = 1 − u_k m_k ≡ 1 mod m_k,

completing the second proof of the Chinese Remainder Theorem.
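Proof 2 is an algorithm, and a short one. Here is a sketch in Python; the function name crt is ours, and the t_j are found with the extended Euclidean algorithm (Python's built-in pow(a, -1, m), available in Python 3.8+, computes a modular inverse in exactly that way):

```python
def crt(residues, moduli):
    # Solve x ≡ b_i (mod m_i) for pairwise relatively prime m_i,
    # using x = sum of b_j * t_j * (M/m_j) as in Proof 2.
    M = 1
    for m in moduli:
        M *= m
    x = 0
    for b, m in zip(residues, moduli):
        Mj = M // m
        t = pow(Mj, -1, m)      # t_j with t_j*(M/m_j) ≡ 1 (mod m_j)
        x += b * t * Mj
    return x % M

x = crt([1, 7, 11], [5, 11, 17])   # exercise 9 at the end of this section
print(x, x % 5, x % 11, x % 17)
```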

Now the proof of theorem 4.5.4 follows easily, and is left as an exercise for the reader.

The factorization that is described in detail in theorem 4.5.4 will be written symbolically as

Z_n ≅ ∏_{i=1}^{r} Z_{p_i^{a_i}}. (4.5.2)

The factorization (4.5.2) of the ring Z_n induces a factorization

U_n ≅ ∏_{i=1}^{r} U_{p_i^{a_i}} (4.5.3)


of the group of units. Since U_n is a group, (4.5.3) is an isomorphism of the multiplicative structure only. In Z_12, for example, we find

U_12 ≅ U_4 × U_3

where U_4 = {1, 3}, U_3 = {1, 2}. So U_12 can be thought of as the set {(1, 1), (1, 2), (3, 1), (3, 2)}, together with the componentwise multiplication operation described above.

Exercises for section 4.5

1. Give a complete proof of theorem 4.5.4.
2. Find all primitive roots modulo 18.
3. Find all primitive roots modulo 27.
4. Write out the multiplication table of the group U_27.
5. Which elements of Z_11 are squares?
6. Which elements of Z_13 are squares?
7. Find all x ∈ U_27 such that x^2 = 1. Find all x ∈ U_15 such that x^2 = 1.
8. Prove that if there is a primitive root modulo n then the equation x^2 = 1 in the group U_n has only the solutions x = ±1.
9. Find a number x that is congruent to 1, 7 and 11 to the respective moduli 5, 11 and 17. Use the method in the second proof of the remainder theorem 4.5.5.
10. Write out the complete proof of the ‘immediate’ corollary 4.5.3.

4.6 Pseudoprimality tests

In this section we will discuss various tests that might be used for testing the compositeness of integers

probabilistically.

By a pseudoprimality test we mean a test that is applied to a pair (b, n) of integers, and that has the

following characteristics:

(a) The possible outcomes of the test are ‘n is composite’ or ‘inconclusive.’

(b) If the test reports ‘n is composite’ then n is composite.

(c) The test runs in a time that is polynomial in log n.

If the test result is ‘inconclusive’ then we say that n is pseudoprime to the base b (which means that n

is so far acting like a prime number, as far as we can tell).

The outcome of the test of the primality of n depends on the base b that is chosen. In a good pseu-

doprimality test there will be many bases b that will give the correct answer. More precisely, a good

pseudoprimality test will, with high probability (i.e., for a large number of choices of the base b) declare

that a composite n is composite. In more detail, we will say that a pseudoprimality test is ‘good’ if there

is a ﬁxed positive number t such that every composite integer n is declared to be composite for at least tn

choices of the base b, in the interval 1 ≤ b ≤ n.

Of course, given an integer n, it is silly to say that ‘there is a high probability that n is prime.’ Either

n is prime or it isn’t, and we should not blame our ignorance on n itself. Nonetheless, the abuse of language

is suﬃciently appealing that we will deﬁne the problem away: we will say that a given integer n is very

probably prime if we have subjected it to a good pseudoprimality test, with a large number of diﬀerent bases

b, and have found that it is pseudoprime to all of those bases.

Here are four examples of pseudoprimality tests, only one of which is ‘good.’

Test 1. Given b, n. Output ‘n is composite’ if b divides n, else ‘inconclusive.’

This isn’t the good one. If n is composite, the probability that it will be so declared is the probability that we happen to have found a b that divides n, where b is not 1 or n. The probability of this event, if b is chosen uniformly at random from [1, n], is

p_1 = (d(n) − 2)/n

where d(n) is the number of divisors of n. Certainly p_1 is not bounded from below by a positive constant t, if n is composite.


Test 2. Given b, n. Output ‘n is composite’ if gcd(b, n) ,= 1, else output ‘inconclusive.’

This one is a little better, but not yet good. If n is composite, the number of bases b ≤ n for which

Test 2 will produce the result ‘composite’ is n −φ(n), where φ is the Euler totient function, of (4.1.5). This

number of useful bases will be large if n has some small prime factors, but in that case it’s easy to ﬁnd

out that n is composite by other methods. If n has only a few large prime factors, say if n = p^2, then the

proportion of useful bases is very small, and we have the same kind of ineﬃciency as in Test 1 above.

Now we can state the third pseudoprimality test.

Test 3. Given b, n. (If b and n are not relatively prime or) if b^(n−1) ≢ 1 (mod n) then output ‘n is composite,’ else output ‘inconclusive.’

Regrettably, the test is still not ‘good,’ but it’s a lot better than its predecessors. To cite an extreme

case of its un-goodness, there exist composite numbers n, called Carmichael numbers, with the property that

the pair (b, n) produces the output ‘inconclusive’ for every integer b in [1, n −1] that is relatively prime to

n. An example of such a number is n = 1729, which is composite (1729 = 7 · 13 · 19), but for which Test

3 gives the result ‘inconclusive’ on every integer b < 1729 that is relatively prime to 1729 (i.e., that is not

divisible by 7 or 13 or 19).
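The Carmichael behavior of n = 1729 is easy to confirm by machine. The short Python check below (our own illustration, not from the text) verifies that no base coprime to 1729 ever triggers the ‘composite’ output of Test 3:

```python
from math import gcd

n = 1729  # composite: 7 * 13 * 19
# bases coprime to n for which Test 3 would answer 'composite'
witnesses = [b for b in range(1, n) if gcd(b, n) == 1 and pow(b, n - 1, n) != 1]
assert witnesses == []  # 1729 fools Test 3 on every coprime base
```

So Test 3 can detect 1729 only through a base that shares a factor with it.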

Despite such misbehavior, the test usually seems to perform quite well. When n = 169 (a difficult integer for tests 1 and 2) it turns out that there are 157 different b’s in [1, 168] that produce the ‘composite’ outcome from Test 3, namely every such b except for 19, 22, 23, 70, 80, 89, 99, 146, 147, 150, 168.

Finally, we will describe a good pseudoprimality test. The familial resemblance to Test 3 will be

apparent.

Test 4. (the strong pseudoprimality test): Given (b, n). Let n − 1 = 2^q · m, where m is an odd integer. If either
(a) b^m ≡ 1 (mod n), or
(b) there is an integer i in [0, q − 1] such that b^(m·2^i) ≡ −1 (mod n),
then return ‘inconclusive’ else return ‘n is composite.’

First we validate the test by proving the

Proposition. If the test returns the message ‘n is composite,’ then n is composite.

Proof: Suppose not. Then n is an odd prime. We claim that
b^(m·2^i) ≡ 1 (mod n)
for all i = q, q − 1, . . . , 0. If so then the case i = 0 will contradict the outcome of the test, and thereby complete the proof. To establish the claim, it is clearly true when i = q, by Fermat’s theorem. If true for i, then it is true for i − 1 also, because
(b^(m·2^(i−1)))^2 = b^(m·2^i) ≡ 1 (mod n)
implies that the quantity being squared is +1 or −1. Since n is an odd prime, by corollary 4.5.3 U_n is cyclic, and so the equation x^2 = 1 in U_n has only the solutions x = ±1. But −1 is ruled out by the outcome of the test, and the proof of the claim is complete.

What is the computational complexity of the test? Consider first the computational problem of raising a number to a power. We can calculate, for example, b^m mod n with O(log m) integer multiplications, by successive squaring. More precisely, we compute b, b^2, b^4, b^8, . . . by squaring, and reducing modulo n immediately after each squaring operation, rather than waiting until the final exponent is reached. Then we use the binary expansion of the exponent m to tell us which of these powers of b we should multiply together in order to compute b^m. For instance,
b^337 = b^256 · b^64 · b^16 · b.


Chapter 4: Algorithms in the Theory of Numbers

The complete power algorithm is recursive and looks like this:

function power(b, m, n);
{returns b^m mod n}
if m = 0
  then power := 1
  else
    t := sqr(power(b, ⌊m/2⌋, n));
    if m is odd then t := t · b;
    power := t mod n
end.{power}
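As an illustration (ours, not the book’s), the same recursion in Python; Python’s built-in pow(b, m, n) performs the identical computation:

```python
def power(b, m, n):
    """Return b**m mod n, by recursive successive squaring."""
    if m == 0:
        return 1
    t = power(b, m // 2, n) ** 2   # square the result for exponent floor(m/2)
    if m % 2 == 1:                 # low bit of m set: multiply in one more factor of b
        t = t * b
    return t % n
```

For example, power(b, 337, n) in effect computes b^337 = b^256 · b^64 · b^16 · b, reducing mod n at every step.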

Hence part (a) of the strong pseudoprimality test can be done in O(log m) = O(log n) multiplications of integers of at most O(log n) bits each. Similarly, in part (b) of the test there are O(log n) possible values of i to check, and for each of them we do a single multiplication of two integers each of which has O(log n) bits (this argument, of course, applies to Test 3 above also).
The entire test requires, therefore, some low power of log n bit operations. For instance, if we were to use the most obvious way to multiply two B-bit numbers we would do O(B^2) bit operations, and then the above test would take O((log n)^3) time. This is a polynomial in the number of bits of input.

In the next section we are going to prove that Test 4 is a good pseudoprimality test in that if n is

composite then at least half of the integers b, 1 ≤ b ≤ n −1 will give the result ‘n is composite.’

For example, if n = 169, then it turns out that for 157 of the possible 168 bases b in [1,168], Test 4 will

reply ‘169 is composite.’ The only bases b that 169 can fool are 19, 22, 23, 70, 80, 89, 99, 146, 147, 150,

168. For this case of n = 169 the performances of Test 4 and of Test 3 are identical. However, there are no

analogues of the Carmichael numbers for Test 4.
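To make the n = 169 counts concrete, here is a sketch of Test 4 in Python (the function and variable names are ours). Running it over all bases recovers exactly the eleven fooling bases listed above, together with the trivial base b = 1:

```python
def strong_test(b, n):
    """Test 4: 'inconclusive' or 'n is composite' for the pair (b, n); n odd."""
    q, m = 0, n - 1
    while m % 2 == 0:              # write n - 1 = 2^q * m with m odd
        q, m = q + 1, m // 2
    x = pow(b, m, n)
    if x == 1:                     # condition (a): b^m ≡ 1 (mod n)
        return 'inconclusive'
    for _ in range(q):             # condition (b): b^(m*2^i) ≡ -1, some i in [0, q-1]
        if x == n - 1:
            return 'inconclusive'
        x = x * x % n
    return 'n is composite'

fooled = sorted(b for b in range(1, 169) if strong_test(b, 169) == 'inconclusive')
print(fooled)  # prints [1, 19, 22, 23, 70, 80, 89, 99, 146, 147, 150, 168]
```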

Exercises for section 4.6

1. Given an odd integer n. Let T(n) be the set of all b ∈ [1, n] such that gcd(b, n) = 1 and b^(n−1) ≡ 1 (mod n). Show that |T(n)| divides φ(n).

2. Let H be a cyclic group of order n. How many elements of each order r are there in H (r divides n)?

3. If n = p^a, where p is an odd prime, then the number of x ∈ U_n such that x has exact order r is φ(r), for all divisors r of φ(n). In particular, the number of primitive roots modulo n is φ(φ(n)).

4. If n = p_1^(a_1) · · · p_m^(a_m), and if r divides φ(n), then the number of x ∈ U_n such that x^r ≡ 1 (mod n) is
∏_{i=1}^{m} gcd(φ(p_i^(a_i)), r).

5. In a group G suppose f_m and g_m are, respectively, the number of elements of order m and the number of solutions of the equation x^m = 1, for each m = 1, 2, . . .. What is the relationship between these two sequences? That is, how would you compute the g’s from the f’s? the f’s from the g’s? If you have never seen a question of this kind, look in any book on the theory of numbers, find ‘Möbius inversion,’ and apply it to this problem.

4.7 Proof of goodness of the strong pseudoprimality test

In this section we will show that if n is composite, then at least half of the integers b in [1, n −1] will

yield the result ‘n is composite’ in the strong pseudoprimality test. The basic idea of the proof is that a

subgroup of a group that is not the entire group can consist of at most half of the elements of that group.

Suppose n has the factorization
n = p_1^(a_1) · · · p_s^(a_s)
and let n_i = p_i^(a_i) (i = 1, s).


Lemma 4.7.1. The order of each element of U_n is a divisor of e* = lcm{φ(n_i); i = 1, s}.

Proof: From the product representation (4.5.3) of U_n we find that an element x of U_n can be regarded as an s-tuple of elements from the cyclic groups U_{n_i} (i = 1, s). The order of x is equal to the lcm of the orders of the elements of the s-tuple. But for each i = 1, . . . , s the order of the i-th of those elements is a divisor of φ(n_i), and therefore the order of x divides the lcm shown above.

Lemma 4.7.2. Let n > 1 be odd. For each element u of U_n let C(u) = {1, u, u^2, . . . , u^(e−1)} denote the cyclic group that u generates, e being the order of u. Let B be the set of all elements u of U_n for which C(u) either contains −1 or has odd order (e odd). If B generates the full group U_n then n is a prime power.

Proof: Let e* = 2^t · m, where m is odd and e* is as shown in lemma 4.7.1. Then there is a j such that φ(n_j) is divisible by 2^t.
Now if n is a prime power, we are finished. So we can suppose that n is divisible by more than one prime number. Since φ(n) is an even number for all n > 2 (proof?), the number e* is even. Hence t > 0 and we can define a mapping ψ of the group U_n to itself by
ψ(x) = x^(2^(t−1)·m)    (x ∈ U_n)
(note that ψ(x) is its own inverse).

This is in fact a group homomorphism:
∀x, y ∈ U_n : ψ(xy) = ψ(x)ψ(y).
Let B be as in the statement of lemma 4.7.2. For each x ∈ B, ψ(x) is in C(x) and
ψ(x)^2 = ψ(x^2) = 1
(the last because, by lemma 4.7.1, the order of x divides e* = 2^t · m). Since ψ(x) is an element of C(x) whose square is 1, ψ(x) has order 1 or 2. Hence if ψ(x) ≠ 1, it is of order 2. If the cyclic group C(x) is of odd order then it contains no element of even order. Hence C(x) is of even order and contains −1. Then it can contain no other element of order 2, so ψ(x) = −1 in this case.
Hence for every x ∈ B, ψ(x) = ±1.

Suppose B generates the full group U_n. Then not only for every x ∈ B but for every x ∈ U_n it is true that ψ(x) = ±1 (for ψ is a homomorphism, and {1, −1} is a subgroup of U_n).

Suppose n is not a prime power. Then s > 1 in the factorization (4.5.2) of U_n. Consider the element v of U_n which, when written out as an s-tuple according to that factorization, is of the form
v = (1, 1, 1, . . . , 1, y, 1, . . . , 1)
where the ‘y’ is in the j-th component, y ∈ U_{n_j} (recall that j is as described above, in the second sentence of this proof). We can suppose y to be an element of order exactly 2^t in U_{n_j}, since U_{n_j} is cyclic.

Consider ψ(v). Clearly ψ(v) is not 1, for otherwise the order of y, namely 2^t, would divide 2^(t−1)·m, which is impossible because m is odd.
Also, ψ(v) is not −1, because the element −1 of U_n is represented uniquely by the s-tuple all of whose entries are −1. Thus ψ(v) is neither 1 nor −1 in U_n, which contradicts the assertion above that ψ(x) = ±1 for every x ∈ U_n. Hence s = 1 and n is a prime power, completing the proof.

Now we can prove the main result of Solovay, Strassen and Rabin, which asserts that Test 4 is good.

Theorem 4.7.1. Let B′ be the set of integers b mod n such that (b, n) returns ‘inconclusive’ in Test 4.
(a) If B′ generates U_n then n is prime.
(b) If n is composite then B′ consists of at most half of the integers in [1, n − 1].
Proof: Suppose b ∈ B′ and let m be the odd part of n − 1. Then either b^m ≡ 1 or b^(m·2^i) ≡ −1 for some i ∈ [0, q − 1]. In the former case the cyclic subgroup C(b) has odd order, since m is odd, and in the latter case C(b) contains −1.


Hence in either case B′ ⊆ B, where B is the set defined in the statement of lemma 4.7.2 above. If B′ generates the full group U_n then B does too, and by lemma 4.7.2, n is a prime power, say n = p^k.
Also, in either of the above cases we have b^(n−1) ≡ 1, so the same holds for all b ∈ B′, and so for all x ∈ U_n we have x^(n−1) ≡ 1, since B′ generates U_n.

Now U_n is cyclic of order
φ(n) = φ(p^k) = p^(k−1)(p − 1).
By theorem 4.5.3 there are primitive roots modulo n = p^k. Let g be one of these. The order of g is, on the one hand, p^(k−1)(p − 1), since the set of all of its powers is identical with U_n, and on the other hand is a divisor of n − 1 = p^k − 1, since x^(n−1) ≡ 1 for all x, and in particular for x = g.
Hence p^(k−1)(p − 1) (which, if k > 1, is a multiple of p) divides p^k − 1 (which is one less than a multiple of p), and so k = 1, which completes the proof of part (a) of the theorem.

In part (b), n is composite and so B′ cannot generate all of U_n, by part (a). Hence B′ generates a proper subgroup of U_n, and so can contain at most half as many elements as U_n contains, and the proof is complete.

Another application of the same circle of ideas to computer science occurs in the generation of random

numbers on a computer. A good way to do this is to choose a primitive root modulo the word size of your

computer, and then, each time the user asks for a random number, output the next higher power of the

primitive root. The fact that you started with a primitive root insures that the number of ‘random numbers’

generated before repetition sets in will be as large as possible.

Now we’ll summarize the way in which the primality test is used. Suppose there is given a large integer

n, and we would like to determine if it is prime.

We would do

function testn(n, outcome);
times := 0;
repeat
  choose an integer b uniformly at random in [2, n − 1];
  apply the strong pseudoprimality test (Test 4) to the pair (b, n);
  times := times + 1
until {result is ‘n is composite’ or times = 100};
if times = 100 then outcome := ‘n probably prime’
else outcome := ‘n is composite’
end{testn}

If the procedure exits with ‘n is composite,’ then we can be certain that n is not prime. If we want to

see the factors of n then it will be necessary to use some factorization algorithm, such as the one described

below in section 4.9.

On the other hand, if the procedure halts because it has been through 100 trials without a conclusive result, then the integer n is very probably prime. More precisely, the chance that a composite integer n would have behaved like that is less than 2^(−100). If we want certainty, however, it will be necessary to apply a test whose outcome will prove primality, such as the algorithm of Adleman, Rumely and Pomerance, referred to earlier.

In section 4.9 we will discuss a probabilistic factoring algorithm. Before doing so, in the next section

we will present a remarkable application of the complexity of the factoring problem, to cryptography. Such

applications remind us that primality and factorization algorithms have important applications beyond pure

mathematics, in areas of vital public concern.

Exercises for section 4.7

1. For n = 9 and for n = 15 ﬁnd all of the cyclic groups C(u), of lemma 4.7.2, and ﬁnd the set B.

2. For n = 9 and n = 15 find the set B′ of theorem 4.7.1.


4.8 Factoring and cryptography

A computationally intractable problem can be used to create secure codes for the transmission of infor-

mation over public channels of communication. The idea is that those who send the messages to each other

will have extra pieces of information that will allow them to solve the intractable problem rapidly, whereas

an aspiring eavesdropper would be faced with an exponential amount of computation.

Even if we don’t have a provably computationally intractable problem, we can still take a chance that

those who might intercept our messages won’t know any polynomial-time algorithms if we don’t know any.

Since there are precious few provably hard problems, and hordes of apparently hard problems, it is scarcely

surprising that a number of sophisticated coding schemes rest on the latter rather than the former. One

should remember, though, that an adversary might discover fast algorithms for doing these problems and

keep that fact secret while deciphering all of our messages.

A remarkable feature of a family of recently developed coding schemes, called ‘Public Key Encryption

Systems,’ is that the ‘key’ to the code lies in the public domain, so it can be easily available to sender and

receiver (and eavesdropper), and can be readily changed if need be. On the negative side, the most widely

used Public Key Systems lean on computational problems that are only presumed to be intractable, like

factoring large integers, rather than having been proved so.

We are going to discuss a Public Key System called the RSA scheme, after its inventors: Rivest, Shamir

and Adleman. This particular method depends for its success on the seeming intractability of the problem

of ﬁnding the factors of large integers. If that problem could be done in polynomial time, then the RSA

system could be ‘cracked.’

In this system there are three centers of information: the sender of the message, the receiver of the

message, and the Public Domain (for instance, the ‘Personals’ ads of the New York Times). Here is how the

system works.

(A) Who knows what and when

Here are the items of information that are involved, and who knows each item:

p, q: two large prime numbers, chosen by the receiver, and told to nobody else (not even to the sender!).

n : the product pq is n, and this is placed in the Public Domain.

E : a random integer, placed in the Public Domain by the receiver, who has ﬁrst made sure that E is

relatively prime to (p −1)(q −1) by computing the g.c.d., and choosing a new E at random until the g.c.d.

is 1. This is easy for the receiver to do because p and q are known to him, and the g.c.d. calculation is fast.

P : a message that the sender would like to send, thought of as a string of bits whose value, when

regarded as a binary number, lies in the range [0, n −1].

In addition to the above, one more item of information is computed by the receiver, and that is the

integer D that is the multiplicative inverse mod (p −1)(q −1) of E, i.e.,

DE ≡ 1 (mod (p −1)(q −1)).

Again, since p and q are known, this is a fast calculation for the receiver, as we shall see.

To summarize,

The receiver knows p, q, D

The sender knows P

Everybody knows n and E

In Fig. 4.8.1 we show the interiors of the heads of the sender and receiver, as well as the contents of the

Public Domain.


Fig. 4.8.1: Who knows what

(B) How to send a message

The sender takes the message P, looks at the public keys E and n, computes C ≡ P^E (mod n), and transmits C over the public airwaves.

Note that the sender has no private codebook or anything secret other than the message itself.

(C) How to decode a message

The receiver receives C, and computes C^D mod n. Observe, however, that (p − 1)(q − 1) is φ(n), and so we have
C^D ≡ P^(DE) = P^(1+tφ(n))    (t is some integer)
    ≡ P (mod n)
where the last congruence is by Fermat’s theorem (4.5.1). The receiver has now recovered the original message P.
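Here is a complete toy run of the scheme in Python. The tiny parameters (p = 61, q = 53, E = 17) are our own choices for illustration; real systems use primes of hundreds of digits:

```python
p, q = 61, 53                # receiver's secret primes
n = p * q                    # public modulus n = pq = 3233
phi = (p - 1) * (q - 1)      # (p - 1)(q - 1) = phi(n) = 3120
E = 17                       # public exponent, gcd(E, phi) = 1
D = pow(E, -1, phi)          # receiver's secret: D * E ≡ 1 (mod phi)

P = 65                       # the message, 0 <= P < n
C = pow(P, E, n)             # sender computes and transmits C = P^E mod n
recovered = pow(C, D, n)     # receiver computes C^D mod n
assert recovered == P
```

(The call pow(E, -1, phi) computes the modular inverse; it is available in Python 3.8 and later.)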

If the receiver suspects that the code has been broken, i.e., that the adversaries have discovered the

primes p and q, then the receiver can change them without having to send any secret messages to anyone else.

Only the public numbers n and E would change. The sender would not need to be informed of any other

changes.

Before proceeding, the reader is urged to construct a little scenario. Make up a short (very short!) message. Choose values for the other parameters that are needed to complete the picture. Send the message as

the sender would, and decode it as the receiver would. Then try to intercept the message, as an eavesdropper

would, and see what the diﬃculties are.

(D) How to intercept the message

An eavesdropper who receives the message C would be unable to decode it without (inventing some

entirely new decoding scheme or) knowing the inverse D of E (mod (p − 1)(q − 1)). The eavesdropper,

however, does not even know the modulus (p − 1)(q − 1) because p and q are unknown (only the receiver

knows them), and knowing the product pq = n alone is insuﬃcient. The eavesdropper is thereby compelled

to derive a polynomial-time factoring algorithm for large integers. May success attend those eﬀorts!

The reader might well remark here that the receiver has a substantial computational problem in creating

two large primes p and q. To a certain extent this is so, but two factors make the task a good deal easier.

First, p and q will need to have only half as many bits as n has, so the job is of smaller size. Second, there


are methods that will produce large prime numbers very rapidly as long as one is not too particular about

which primes they are, as long as they are large enough. We will not discuss those methods here.

The elegance of the RSA cryptosystem prompts a few more remarks that are intended to reinforce the

distinction between exponential- and polynomial-time complexities.

How hard is it to factor a large integer? At this writing, integers of up to perhaps a couple of hundred

digits can be approached with some conﬁdence that factorization will be accomplished within a few hours of

the computing time of a very fast machine. If we think in terms of a message that is about the length of one

typewritten page, then that message would contain about 8000 bits, equivalent to about 2400 decimal digits.

This is in contrast to the largest feasible length that can be handled by contemporary factoring algorithms of

about 200 decimal digits. A one-page message is therefore well into the zone of computational intractability.

How hard is it to find the multiplicative inverse, mod (p − 1)(q − 1)? If p and q are known then it’s easy to find the inverse, as we saw in corollary 4.3.1. Finding an inverse mod n is no harder than carrying out the extended Euclidean algorithm, i.e., it’s a linear-time job.
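The inverse computation itself can be sketched with the extended Euclidean algorithm (function names are ours):

```python
def egcd(a, b):
    """Return (g, x, y) with a*x + b*y = g = gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = egcd(b, a % b)
    return g, y, x - (a // b) * y

def modinv(E, M):
    """Multiplicative inverse of E mod M; requires gcd(E, M) = 1."""
    g, x, _ = egcd(E, M)
    assert g == 1
    return x % M
```

For the RSA setup of the previous section this yields D directly, e.g. modinv(E, (p − 1) * (q − 1)).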

4.9 Factoring large integers

The problem of ﬁnding divisors of large integers is in a much more primitive condition than is primality

testing. For example, we don’t even know a probabilistic algorithm that will return a factor of a large

composite integer, with probability > 1/2, in polynomial time.

In this section we will discuss a probabilistic factoring algorithm that ﬁnds factors in an average time

that is only moderately exponential, and that’s about the state of the art at present.

Let n be an integer whose factorization is desired.

Definition. By a factor base B we will mean a set of distinct nonzero integers {b_0, b_1, . . . , b_h}.

Definition. Let B be a factor base. An integer a will be called a B-number if the integer c that is defined by the conditions
(a) c ≡ a^2 (mod n) and
(b) −n/2 ≤ c < n/2
can be written as a product of factors from the factor base B.

If we let e(a, i) denote the exponent of b_i in that product, then we have
a^2 ≡ ∏_{i=0}^{h} b_i^(e(a,i)) (mod n).
Hence, for each B-number we get an (h + 1)-vector of exponents e(a).

Suppose we can ﬁnd enough B-numbers so that the resulting collection of exponent vectors is a linearly

dependent set, mod 2. For instance, a set of h + 2 B-numbers would certainly have that property.

Then we could nontrivially represent the zero vector as a sum of a certain set A of exponent vectors, say
∑_{a∈A} e(a) ≡ (0, 0, . . . , 0) (mod 2).

Now define the integers
r_i = (1/2) ∑_{a∈A} e(a, i)    (i = 0, 1, . . . , h)
u = ∏_{a∈A} a (mod n)
v = ∏_{i} b_i^(r_i).
It then would follow, after an easy calculation, that u^2 ≡ v^2 (mod n). Hence either u − v or u + v has a factor in common with n. It may be, of course, that u ≡ ±v (mod n), in which case we would have


learned nothing. However if neither u ≡ v (mod n) nor u ≡ −v (mod n) is true then we will have found

a nontrivial factor of n, namely gcd(u −v, n) or gcd(u +v, n).

Example:
Take as a factor base B = {−2, 5}, and let it be required to find a factor of n = 1729. Then we claim that 186 and 267 are B-numbers. To see that 186 is a B-number, note that 186^2 = 20 · 1729 + (−2)^4, and similarly, since 267^2 = 41 · 1729 + (−2)^4 · 5^2, we see that 267 is a B-number, for this factor base B.
The exponent vectors of 186 and 267 are (4, 0) and (4, 2) respectively, and these sum to (0, 0) (mod 2), hence we find that
u = 186 · 267 ≡ 1250 (mod 1729)
r_0 = 4; r_1 = 1
v = (−2)^4 · 5^1 = 80
gcd(u − v, n) = gcd(1170, 1729) = 13
and we have found the factor 13 of 1729.
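The arithmetic of this example is quickly verified in Python:

```python
from math import gcd

n = 1729
u = (186 * 267) % n              # product of the two B-numbers, mod n: 1250
v = (-2) ** 4 * 5 ** 1           # v = (-2)^4 * 5^1 = 80
assert (u * u - v * v) % n == 0  # u^2 ≡ v^2 (mod n)
factor = gcd(u - v, n)
print(factor)  # prints 13
```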

There might have seemed to be some legerdemain involved in plucking the B-numbers 186 and 267 out

of the air, in the example above. In fact, as the algorithm has been implemented by its author, J. D. Dixon,

one simply chooses integers uniformly at random from [1, n −1] until enough B-numbers have been found

so their exponent vectors are linearly dependent modulo 2. In Dixon’s implementation the factor base that

is used consists of −1 together with the ﬁrst h prime numbers.

It can then be proved that if n is not a prime power then with a correct choice of h relative to n, if we repeat the random choices until a factor of n is found, the average running time will be
exp{(2 + o(1))(log n log log n)^(1/2)}.

This is not polynomial time, but it is moderately exponential only. Nevertheless, it is close to being about

the best that we know how to do on the elusive problem of factoring a large integer.

4.10 Proving primality

In this section we will consider a problem that sounds a lot like primality testing, but is really a little

diﬀerent because the rules of the game are diﬀerent. Basically the problem is to convince a skeptical audience

that a certain integer is prime, requiring them to do only a small amount of computation in order to be so

persuaded.

First, though, suppose you were writing a 100-decimal-digit integer n on the blackboard in front of a

large audience and you wanted to prove to them that n was not a prime.

If you simply wrote down two smaller integers whose product was n, the job would be done. Anyone

who wished to be certain could spend a few minutes multiplying the factors together and verifying that their

product was indeed n, and all doubts would be dispelled.

Indeed*, a speaker at a mathematical convention in 1903 announced the result that 2^67 − 1 is not a prime number, and to be utterly convincing all he had to do was to write
2^67 − 1 = 193707721 · 761838257287.

We note that the speaker probably had to work very hard to ﬁnd those factors, but having found them

it became quite easy to convince others of the truth of the claimed result.

A pair of integers r, s for which r ≠ 1, s ≠ 1, and n = rs constitutes a certificate attesting to the compositeness of n. With this certificate C(n) and an auxiliary checking algorithm, viz.
(1) Verify that r ≠ 1, and that s ≠ 1
(2) Verify that rs = n
we can prove, in polynomial time, that n is not a prime number.

* We follow the account given in V. Pratt, Every prime has a succinct certiﬁcate, SIAM J. Computing, 4

(1975), 214-220.


Now comes the hard part. How might we convince an audience that a certain integer n is a prime

number? The rules are that we are allowed to do any immense amount of calculation beforehand, and the

results of that calculation can be written on a certificate C(n) that accompanies the integer n. The audience,

however, will need to do only a polynomial amount of further computation in order to convince themselves

that n is prime.

We will describe a primality-checking algorithm A with the following properties:
(1) Inputs to A are the integer n and a certain certificate C(n).
(2) If n is prime then the action of A on the inputs (n, C(n)) results in the output ‘n is prime.’
(3) If n is not prime then for every possible certificate C(n) the action of A on the inputs (n, C(n)) results in the output ‘primality of n is not verified.’
(4) Algorithm A runs in polynomial time.

Now the question is, does such a procedure exist for primality veriﬁcation? The answer is aﬃrmative,

and we will now describe one. The fact that primality can be quickly veriﬁed, if not quickly discovered, is

of great importance for the developments of Chapter 5. In the language of section 5.1, what we are about

to do is to show that the problem ‘Is n prime?’ belongs to the class NP.

The next lemma is a kind of converse to ‘Fermat’s little theorem’ (theorem 4.5.2).

Lemma 4.10.1. Let p be a positive integer. Suppose there is an integer x such that x^(p−1) ≡ 1 (mod p) and such that for all divisors d of p − 1, d < p − 1, we have x^d ≢ 1 (mod p). Then p is prime.

Proof: First we claim that gcd(x, p) = 1, for let g = gcd(x, p). Then x = gg′, p = gg″. Since x^(p−1) ≡ 1 (mod p) we have x^(p−1) = 1 + tp, and x^(p−1) − tp = (gg′)^(p−1) − tgg″ = 1. The left side is a multiple of g. The right side is not, unless g = 1.

It follows that x ∈ U_p, the group of units of Z_p. Thus x is an element of order p − 1 in a group of order φ(p). Hence (p − 1) | φ(p). But always φ(p) ≤ p − 1. Hence φ(p) = p − 1 and p is prime.

Lemma 4.10.1 is the basis for V. Pratt’s method of constructing certificates of primality. The construction of the certificate is actually recursive, since step 3° below calls for certificates of smaller primes. We suppose that the certificate of the prime 2 is the trivial case, and that it can be verified at no cost.

Here is a complete list of the information that is on the certificate C(p) that accompanies an integer p whose primality is to be attested to:
1°: a list of the primes p_i and the exponents a_i for the canonical factorization p − 1 = ∏_{i=1}^{r} p_i^(a_i)
2°: the certificates C(p_i) of each of the primes p_1, . . . , p_r
3°: a positive integer x.

To verify that p is prime we could execute the following algorithm B:
(B1) Check that p − 1 = ∏ p_i^(a_i).
(B2) Check that each p_i is prime, using the certificates C(p_i) (i = 1, r).
(B3) For each divisor d of p − 1, d < p − 1, check that x^d ≢ 1 (mod p).
(B4) Check that x^(p−1) ≡ 1 (mod p).

This algorithm B is correct, but it might not operate in polynomial time. In step B3 we are looking at every divisor of p − 1, and there may be a lot of them.
Fortunately, it isn’t necessary to check every divisor of p − 1. The reader will have no trouble proving that there is a divisor d of p − 1 (d < p − 1) for which x^d ≡ 1 (mod p) if and only if there is such a divisor that has the special form d = (p − 1)/p_i.

The primality checking algorithm A now reads as follows.
(A1) Check that p − 1 = ∏ p_i^(a_i).
(A2) Check that each p_i is prime, using the certificates C(p_i) (i = 1, r).
(A3) For each i := 1 to r, check that x^((p−1)/p_i) ≢ 1 (mod p).
(A4) Check that x^(p−1) ≡ 1 (mod p).
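As a sketch, algorithm A can be realized recursively in Python. The certificate layout, a triple (factors, x, subcerts), is our own packaging of items 1°, 3° and 2° respectively; the prime 2 needs no certificate:

```python
def verify(p, cert):
    """Pratt check of algorithm A.  cert = (factors, x, subcerts), where
    factors maps each prime p_i of p - 1 to its exponent a_i."""
    if p == 2:
        return True                            # the trivial base case
    factors, x, subcerts = cert
    prod = 1                                   # step A1: p - 1 = prod p_i^(a_i)
    for pi, ai in factors.items():
        prod *= pi ** ai
    if prod != p - 1:
        return False
    for pi in factors:                         # step A2: each p_i is prime
        if pi != 2 and not verify(pi, subcerts[pi]):
            return False
    for pi in factors:                         # step A3: x^((p-1)/p_i) not ≡ 1
        if pow(x, (p - 1) // pi, p) == 1:
            return False
    return pow(x, p - 1, p) == 1               # step A4: x^(p-1) ≡ 1

# certificate for 7: 7 - 1 = 2 * 3, x = 3, with a sub-certificate for 3
cert3 = ({2: 1}, 2, {})
cert7 = ({2: 1, 3: 1}, 3, {3: cert3})
```

Here verify(7, cert7) succeeds, while no certificate can make verify accept a composite such as 9.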

Now let’s look at the complexity of algorithm A.
We will measure its complexity by the number of times that we have to do a computation of either of the types (a) ‘is m = ∏ q_j^(b_j)?’ or (b) ‘is y^s ≡ 1 (mod p)?’

Let f(p) be that number. Then we have (remembering that the algorithm calls itself r times)
f(p) = 1 + ∑_{i=2}^{r} f(p_i) + r + 1    (4.10.1)
in which the four terms, as written, correspond to the four steps in the checking algorithm. The sum begins with ‘i = 2’ because the prime 2, which is always a divisor of p − 1, is ‘free.’
Now (4.10.1) can be written as
g(p) = ∑_{i=2}^{r} g(p_i) + 4    (4.10.2)
where g(p) = 1 + f(p). We claim that g(p) ≤ 4 log_2 p for all p.

This is surely true if p = 2. If true for primes less than p then from (4.10.2),
g(p) ≤ ∑_{i=2}^{r} 4 log_2 p_i + 4
     = 4 log_2 (∏_{i=2}^{r} p_i) + 4
     ≤ 4 log_2 ((p − 1)/2) + 4
     = 4 log_2 (p − 1)
     ≤ 4 log_2 p.
Hence f(p) ≤ 4 log_2 p − 1 for all p ≥ 2.

Since the number of bits in p is Θ(log p), the number f(p) is a number of executions of steps that is

a polynomial in the length of the input bit string. We leave to the exercises the veriﬁcation that each of

the steps that f(p) counts is also executed in polynomial time, so the entire primality-veriﬁcation procedure

operates in polynomial time. This yields

Theorem 4.10.1. (V. Pratt, 1975) There exist a checking algorithm and a certiﬁcate such that primality

can be veriﬁed in polynomial time.

Exercises for section 4.10

1. Show that two positive integers of b bits each can be multiplied with at most O(b^2) bit operations (multiplications and carries).

2. Prove that step A1 of algorithm A can be executed in polynomial time, where time is now measured by

the number of bit operations that are implied by the integer multiplications.

3. Same as exercise 2 above, for steps A3 and A4.

4. Write out the complete certiﬁcate that attests to the primality of 19.

5. Find an upper bound for the total number of bits that are in the certiﬁcate of the integer p.

6. Carry out the complete checking algorithm on the certiﬁcate that you prepared in exercise 4 above.

7. Let p = 15. Show that there is no integer x as described in the hypotheses of lemma 4.10.1.

8. Let p = 17. Find all integers x that satisfy the hypotheses of lemma 4.10.1.


Bibliography

The material in this chapter has made extensive use of the excellent review article

John D. Dixon, Factorization and primality tests, The American Mathematical Monthly, 91 (1984), 333-352.

A basic reference for number theory, Fermat’s theorem, etc. is

G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers, Oxford University Press, Oxford,

1954

Another is

W. J. LeVeque, Fundamentals of Number Theory, Addison-Wesley, Reading, MA, 1977

The probabilistic algorithm for compositeness testing was found by

M. O. Rabin, Probabilistic algorithms, in Algorithms and Complexity, New Directions and Recent Results,

J. Traub ed., Academic Press, New York, 1976

and at about the same time by

R. Solovay and V. Strassen, A fast Monte Carlo test for primality, SIAM Journal of Computing, 6 (1977),

pp. 84-85; erratum ibid., 7 (1978), 118.

Some empirical properties of that algorithm are in

C. Pomerance, J. L. Selfridge and S. Wagstaff Jr., The pseudoprimes to 25 · 10^9, Mathematics of Computation, 35 (1980), 1003-1026.

The fastest nonprobabilistic primality test appeared ﬁrst in

L. M. Adleman, On distinguishing prime numbers from composite numbers, IEEE Abstracts, May 1980,

387-406.

A more complete account, together with the complexity analysis, is in

L. M. Adleman, C. Pomerance and R. S. Rumely, On distinguishing prime numbers from composite numbers,

Annals of Mathematics 117 (1983), 173-206.

A streamlined version of the above algorithm was given by

H. Cohen and H. W. Lenstra Jr., Primality testing and Jacobi sums, Report 82-18, Math. Inst. U. of

Amsterdam, Amsterdam, 1982.

The idea of public key data encryption is due to

W. Diﬃe and M. E. Hellman, New directions in cryptography, IEEE Transactions on Information Theory,

IT-22, 6 (1976), 644-654.

An account of the subject is contained in

M. E. Hellman, The mathematics of public key cryptography, Scientiﬁc American, 241, 2 (August 1979),

146-157.

The use of factoring as the key to the code is due to

R. L. Rivest, A. Shamir and L. M. Adleman, A method for obtaining digital signatures and public key

cryptosystems, Communications of the A.C.M., 21, 2 (February 1978), 120-126

The probabilistic factoring algorithm in the text is that of

John D. Dixon, Asymptotically fast factorization of integers, Mathematics of Computation, 36 (1981), 255-

260.


Chapter 5: NP-completeness

5.1 Introduction

In the previous chapter we met two computational problems for which fast algorithms have never been

found, but neither have such algorithms been proved to be unattainable. Those were the primality-testing

problem, for which the best-known algorithm is delicately poised on the brink of polynomial time, and the

integer-factoring problem, for which the known algorithms are in a more primitive condition.

In this chapter we will meet a large family of such problems (hundreds of them now!). This family is not

just a list of seemingly diﬃcult computational problems. It is in fact bound together by strong structural

ties. The collection of problems, called the NP-complete problems, includes many well known and important

questions in discrete mathematics, such as the following.

The travelling salesman problem (‘TSP’): Given n points in the plane (‘cities’), and a distance D. Is

there a tour that visits all n of the cities, returns to its starting point, and has total length ≤ D?

Graph coloring: Given a graph G and an integer K. Can the vertices of G be properly colored in K or

fewer colors?

Independent set: Given a graph G and an integer K. Does V (G) contain an independent set of K vertices?

Bin packing: Given a finite set S of positive integers, an integer N (the number of bins), and an integer K (the bin capacity). Does there exist a partition of S into N or fewer subsets such that the sum of the integers in each subset is ≤ K? In other words, can we ‘pack’ the integers of S into at most N ‘bins,’ where the ‘capacity’ of each bin is K?

These are very diﬃcult computational problems. Take the graph coloring problem, for instance. We

could try every possible way of coloring the vertices of G in K colors to see if any of them work. There are K^n such possibilities, if G has n vertices. Hence a very large amount of computation will be done, enough so that if G has 50 vertices and we have 10 colors at our disposal, the problem would lie far beyond the capabilities of the fastest computers that are now available.
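The exhaustive search just described is easy to write down, and writing it down makes its cost vivid. A small illustrative sketch (ours, not the text's):

```python
from itertools import product

# Illustrative brute force for the graph coloring decision problem:
# try all K^n assignments of K colors to the n vertices.  This is the
# exponential search described in the text, feasible only for tiny n.
def colorable(n, edges, K):
    for coloring in product(range(K), repeat=n):  # K^n candidates
        if all(coloring[u] != coloring[v] for u, v in edges):
            return True   # a proper coloring was found
    return False

triangle = [(0, 1), (1, 2), (0, 2)]
```

With n = 50 and K = 10 the loop would face 10^50 candidates, which is the point of the paragraph above.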

Hard problems can have easy instances. If the graph G happens to have no edges at all, or very few of

them, then it will be very easy to ﬁnd out if a coloring is possible, or if an independent set of K vertices is

present.

The real question is this (let’s use ‘Independent Set’ as an illustration). Is it possible to design an

algorithm that will come packaged with a performance guarantee of the following kind:

The seller warrants that if a graph G, of n vertices, and a positive integer K are input to this program, then it will correctly determine if there is an independent set of K or more vertices in V (G), and it will do so in an amount of time that is at most 1000n^8 minutes.

Hence there is no contradiction between the facts that the problem is hard and that there are easy cases. The hardness of the problem stems from the seeming impossibility of producing such an algorithm accompanied by such a manufacturer’s warranty card. Of course the ‘1000n^8’ didn’t have to be exactly that. But some quite specific polynomial in the length of the input bit string must appear in the performance guarantee. Hence ‘357n^9’ might have appeared in the guarantee, and so might ‘23n^3,’ but ‘n^K’ would not be allowed.

Let’s look carefully at why n^K would not be an acceptable worst-case polynomial time performance bound. In the ‘Independent Set’ problem the input must describe the graph G and the integer K. How many bits are needed to do that? The graph can be specified, for example, by its vertex adjacency matrix A. This is an n × n matrix in which the entry in row i and column j is 1 if (i, j) ∈ E(G) and is 0 else. Evidently n^2 bits of input will describe the matrix A. The integers K and n can be entered with just O(log n) bits, so the entire input bit string for the ‘Independent Set’ problem is ∼ n^2 bits long. Let B denote the number of bits in the input string. Suppose that on the warranty card the program was guaranteed to run in a time that is < n^K.

Is this a guarantee of polynomial time performance? That question means ‘Is there a polynomial P such

that for every instance of ‘Independent Set’ the running time T will be at most P(B)?’ Well, is T bounded


by a polynomial in B if T = n^K and B ∼ n^2? It would seem so; in fact obviously T = O(B^(K/2)), and that’s a polynomial, isn’t it?

The key point resides in the order of the qualifiers. We must give the polynomial that works for every instance of the problem first. Then that one single polynomial must work on every instance. If the ‘polynomial’ that we give is B^(K/2), well that’s a different polynomial in B for different instances of the problem, because K is different for different instances. Therefore if we say that a certain program for ‘Independent Set’ will always get an answer before B^(K/2) minutes, where B is the length of the input bit string, then we would not have provided a polynomial-time guarantee in the form of a single polynomial in B that applies uniformly to all problem instances.
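A small computation makes the point concrete. If B ∼ n^2 then n^K = B^(K/2), so the exponent of B needed to bound the running time grows with the instance instead of staying fixed (an illustrative sketch; the function name is ours):

```python
import math

# For the 'Independent Set' input, B ~ n^2 bits, and a running time of
# n^K equals B^(K/2).  The exponent K/2 depends on the instance, so no
# single polynomial B^c can bound the time for every instance.
def exponent_in_B(n, K):
    B = n * n                     # input length ~ n^2 bits
    return math.log(n ** K, B)    # solves n^K = B^e for e, giving K/2

# the exponent grows with K instead of staying fixed
exps = [exponent_in_B(10, K) for K in (2, 4, 8)]
```

For K = 2, 4, 8 the exponents come out 1, 2, 4: a different "polynomial" for each instance, which is exactly what the warranty card forbids.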

The distinction is a little thorny, but is worthy of careful study because it’s of fundamental importance.

What we are discussing is usually called a worst-case time bound, meaning a bound on the running time that

applies to every instance of the problem. Worst-case time bounds aren’t the only possible interesting ones.

Sometimes we might not care if an algorithm is occasionally very slow as long as it is almost always fast. In

other situations we might be satisﬁed with an algorithm that is fast on average. For the present, however,

we will stick to the worst-case time bounds and study some of the theory that applies to that situation. In

sections 5.6 and 5.7 we will study some average time bounds.

Now let’s return to the properties of the NP-complete family of problems. Here are some of them.

1°: The problems all seem to be computationally very difficult, and no polynomial time algorithms have been found for any of them.

2°: It has not been proved that polynomial time algorithms for these problems do not exist.

3°: But this is not just a random list of hard problems. If a fast algorithm could be found for one NP-complete problem then there would be fast algorithms for all of them.

4°: Conversely, if it could be proved that no fast algorithm exists for one of the NP-complete problems, then there could not be a fast algorithm for any other of those problems.

The above properties are not intended to be a deﬁnition of the concept of NP-completeness. We’ll get to

that later on in this section. They are intended as a list of some of the interesting features of these problems,

which, when coupled with their theoretical and practical importance, accounts for the intense worldwide

research eﬀort that has gone into understanding them in recent years.

The question of the existence or nonexistence of polynomial-time algorithms for the NP-complete prob-

lems probably rates as the principal unsolved problem that faces theoretical computer science today.

Our next task will be to develop the formal machinery that will permit us to give precise deﬁnitions

of all of the concepts that are needed. In the remainder of this section we will discuss the additional ideas

informally, and then in section 5.2 we’ll state them quite precisely.

What is a decision problem? First, the idea of a decision problem. A decision problem is one that asks

only for a yes-or-no answer: Can this graph be 5-colored? Is there a tour of length ≤ 15 miles? Is there a

set of 67 independent vertices?

Many of the problems that we are studying can be phrased as decision problems or as optimization

problems: What is the smallest number of colors with which G can be colored? What is the length of the

shortest tour of these cities? What is the size of the largest independent set of vertices in G?

Usually if we ﬁnd a fast algorithm for a decision problem then with just a little more work we will

be able to solve the corresponding optimization problem. For instance, suppose we have an algorithm that

solves the decision problem for graph coloring, and what we want is the solution of the optimization problem

(the chromatic number).

Let a graph G be given, say of 100 vertices. Ask: can the graph be 50-colored? If so, then the chromatic

number lies between 1 and 50. Then ask if it can be colored in 25 colors. If not, then the chromatic

number lies between 26 and 50. Continue in this way, using bisection of the interval that is known to

contain the chromatic number. After O(log n) steps we will have found the chromatic number of a graph of

n vertices. The extra multiplicative factor of log n will not alter the polynomial vs. nonpolynomial running

time distinction. Hence if there is a fast way to do the decision problem then there is a fast way to do the

optimization problem. The converse is obvious.
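The bisection just described can be sketched as follows, assuming a decision subroutine can_color(G, k) is available (the hard part, of course, is that subroutine; this rendering is ours, not the text's):

```python
# Bisection from the text: find the chromatic number of an n-vertex
# graph with O(log n) calls to a decision oracle can_color(G, k) that
# answers "can G be properly colored in k colors?".
def chromatic_number(G, n, can_color):
    lo, hi = 1, n                 # the chromatic number lies in [1, n]
    while lo < hi:
        mid = (lo + hi) // 2
        if can_color(G, mid):
            hi = mid              # mid colors suffice; search lower
        else:
            lo = mid + 1          # need more than mid colors
    return lo

# toy oracle for a triangle, whose chromatic number is 3
triangle_oracle = lambda G, k: k >= 3
```

Each iteration halves the interval that contains the answer, so about log₂ n oracle calls suffice.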

Hence we will restrict our discussion to decision problems.


What is a language?

Since every decision problem can have only the two answers ‘Y/N,’ we can think of a decision problem

as asking if a given word (the input string) does or does not belong to a certain language. The language is

the totality of words for which the answer is ‘Y.’

The graph 3-coloring language, for instance, is the set of all symmetric, square matrices of 0,1 entries, with zeroes on the main diagonal (these are the vertex adjacency matrices of graphs) such that the graph that the matrix represents is 3-colorable. We can imagine that somewhere there is a vast dictionary of all of the words in this language. A 3-colorability computation is therefore nothing but an attempt to discover whether a given word belongs to the dictionary.

What is the class P?

We say that a decision problem belongs to the class P if there is an algorithm A and a number c such that for every instance I of the problem the algorithm A will produce a solution in time O(B^c), where B is the number of bits in the input string that represents I.

To put it more brieﬂy, P is the set of easy decision problems.

Examples of problems in P are most of the ones that we have already met in this book: Are these two

integers relatively prime? Is this integer divisible by that one? Is this graph 2-colorable? Is there a ﬂow of

value greater than K in this network? Can this graph be disconnected by the removal of K or fewer edges?

Is there a matching of more than K edges in this bipartite graph? For each of these problems there is a fast

(polynomial time) algorithm.
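For instance, the 2-colorability question can be settled by a simple breadth-first search; here is a hedged sketch in Python (the representation of G as an adjacency list is our choice, not the text's):

```python
from collections import deque

# Illustrative polynomial-time test for "is this graph 2-colorable?".
# G is given as an adjacency list (dict mapping vertex -> set of
# neighbors); a graph is 2-colorable exactly when it has no odd cycle.
def two_colorable(adj):
    color = {}
    for start in adj:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]   # opposite color
                    queue.append(v)
                elif color[v] == color[u]:    # odd cycle found
                    return False
    return True

# a 4-cycle is 2-colorable; a triangle is not
square = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
triangle = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
```

The search touches each edge a bounded number of times, so the running time is polynomial in the size of the input.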

What is the class NP?

The class NP is a little more subtle. A decision problem Q belongs to NP if there is an algorithm A that does the following:

(a) Associated with each word of the language Q (i.e., with each instance I for which the answer is ‘Yes’) there is a certificate C(I) such that when the pair (I, C(I)) are input to algorithm A it recognizes that I belongs to the language Q.

(b) If I is some word that does not belong to the language Q then there is no choice of certificate C(I) that will cause A to recognize I as a member of Q.

(c) Algorithm A operates in polynomial time.

To put this one more brieﬂy, NP is the class of decision problems for which it is easy to check the

correctness of a claimed answer, with the aid of a little extra information. So we aren’t asking for a way to

ﬁnd a solution, but only to verify that an alleged solution really is correct.

Here is an analogy that may help to clarify the distinction between the classes P and NP. We have all

had the experience of reading through a truly ingenious and diﬃcult proof of some mathematical theorem,

and wondering how the person who found the proof in the ﬁrst place ever did it. Our task, as a reader, was

only to verify the proof, and that is a much easier job than the mathematician who invented the proof had.

To pursue the analogy a bit farther, some proofs are extremely time consuming even to check (see the proof

of the four-color theorem!), and similarly, some computational problems are not even known to belong to

NP, let alone to P.

In P are the problems where it’s easy to ﬁnd a solution, and in NP are the problems where it’s easy to

check a solution that may have been very tedious to ﬁnd.

Here’s another example. Consider the graph coloring problem to be the decision problem Q. Certainly

this problem is not known to be in P. It is, however, in NP, and here is an algorithm, and a method of

constructing certiﬁcates that proves it.

Suppose G is some graph that is K-colorable. The certiﬁcate of G might be a list of the colors that get

assigned to each vertex in some proper K-coloring of the vertices of G. Where did we get that list, you ask?

Well, we never said it was easy to construct a certiﬁcate. If you actually want to ﬁnd one then you will have

to solve a hard problem. But we’re really only talking about checking the correctness of an alleged answer.

To check that a certain graph G really is K-colorable we can be convinced if you will show us the color of

each vertex in a proper K-coloring.

If you do provide that certificate, then our checking algorithm A is very simple. It checks first that every vertex has a color and only one color. It then checks that no more than K colors have been used altogether.


It ﬁnally checks that for each edge e of G it is true that the two endpoints of e have diﬀerent colors.

Hence the graph coloring problem belongs to NP.
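The three checks can be written out in a few lines; here is an illustrative Python version of such a checking algorithm (the names are ours, not the text's):

```python
# Certificate check for K-colorability, following the three steps in
# the text: every vertex gets exactly one color, at most K colors are
# used in all, and the endpoints of every edge differ.  This runs in
# polynomial time even though *finding* the coloring may be hard.
def check_coloring(vertices, edges, K, color):
    if any(v not in color for v in vertices):   # every vertex colored
        return False
    if len(set(color.values())) > K:            # at most K colors used
        return False
    return all(color[u] != color[v] for u, v in edges)

path = ([1, 2, 3], [(1, 2), (2, 3)])
```

Each step is a single sweep over the vertices or the edges, so the whole check is polynomial in the input size.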

For the travelling salesman problem we would provide a certificate that contains a tour, whose total length is ≤ K, of all of the cities. The checking algorithm A would then verify that the tour really does visit all of the cities and really does have total length ≤ K.

The travelling salesman problem, therefore, also belongs to NP.
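The corresponding check for the travelling salesman certificate is equally simple. In this sketch (ours, not the text's) we take the cities to be points in the plane with Euclidean distances:

```python
import math

# Certificate check for the TSP decision problem: given the cities,
# the bound, and a claimed tour (a permutation of city indices),
# verify that the tour visits every city exactly once and that the
# closed tour has total length <= bound.
def check_tour(cities, bound, tour):
    if sorted(tour) != sorted(range(len(cities))):
        return False              # must visit each city exactly once
    length = sum(math.dist(cities[tour[i]],
                           cities[tour[(i + 1) % len(tour)]])
                 for i in range(len(tour)))   # includes return leg
    return length <= bound

unit_square = [(0, 0), (0, 1), (1, 1), (1, 0)]
```

Again, verifying the alleged tour is a single polynomial-time sweep; producing a short tour in the first place is the hard part.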

‘Well,’ you might reply, ‘if we’re allowed to look at the answers, how could a problem fail to belong to

NP?’

Try this decision problem: an instance I of the problem consists of a set of n cities in the plane and a

positive number K. The question is ‘Is it true that there is not a tour of all of these cities whose total length

is less than K?’ Clearly this is a kind of a negation of the travelling salesman problem. Does it belong to

NP? If so, there must be an algorithm A and a way of making a certificate C(I) for each instance I such

that we can quickly verify that no such tour exists of the given cities. Any suggestions for the certiﬁcate?

The algorithm? No one else knows how to do this either.

It is not known if this negation of the travelling salesman problem belongs to NP.

Are there problems that do belong to NP but for which it isn’t immediately obvious that this is so? Yes. In fact that’s one of the main reasons that we studied the algorithm of Pratt, in section 4.10. Pratt’s

algorithm is exactly a method of producing a certiﬁcate with the aid of which we can quickly check that a

given integer is prime. The decision problem ‘Given n, is it prime?’ is thereby revealed to belong to NP,

although that fact wasn’t obvious at a glance.

It is very clear that P⊆NP. Indeed if Q ∈ P is some decision problem then we can verify membership in

the language Q with the empty certiﬁcate. That is, we don’t even need a certiﬁcate in order to do a quick

calculation that checks membership in the language because the problem itself can be quickly solved.

It seems natural to suppose that NP is larger than P. That is, one might presume that there are problems

whose solutions can be quickly checked with the aid of a certiﬁcate even though they can’t be quickly found

in the ﬁrst place.

No example of such a problem has ever been produced (and proved), nor has it been proved that no

such problem exists. The question of whether or not P=NP is the one that we cited earlier as being perhaps

the most important open question in the subject area today.

It is fairly obvious that the class P is called ‘the class P’ because ‘P’ is the ﬁrst letter of ‘Polynomial

Time.’ But what does ‘NP’ stand for? Stay tuned. The answer will appear in section 5.2.

What is reducibility?

Suppose that we want to solve a system of 100 simultaneous linear equations in 100 unknowns, of the

form Ax = b. We run down to the local software emporium and quickly purchase a program for $49.95 that

solves such systems. When we get home and read the ﬁne print on the label we discover, to our chagrin,

that the program works only on systems where the matrix A is symmetric, and the coefficient matrix in the

system that we want to solve is, of course, not symmetric.

One possible response to this predicament would be to look for the solution to the system AᵀAx = Aᵀb, in which the coefficient matrix AᵀA is now symmetric.

What we would have done would be to have reduced the problem that we really are interested in to an

instance of a problem for which we have an algorithm.
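That little reduction is easy to carry out explicitly. In the sketch below (ours, not the text's) a hypothetical symmetric_solver is handed the system AᵀAx = Aᵀb; note, as a caution not raised in the text, that forming AᵀA can worsen the numerical conditioning of the problem:

```python
# Sketch of the reduction in the text: to use a solver that accepts
# only symmetric systems, replace A x = b by (A^T A) x = (A^T b).
# Caution (ours, not the text's): A^T A can be much worse conditioned
# than A, so this trick is used carefully in numerical practice.

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col))
             for col in zip(*B)] for row in A]

def matvec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def solve_via_symmetric(A, b, symmetric_solver):
    At = transpose(A)
    M = matmul(At, A)            # A^T A is symmetric
    c = matvec(At, b)            # right-hand side A^T b
    return symmetric_solver(M, c)

# toy "purchased" symmetric solver for 2x2 systems (Cramer's rule)
def solve2(M, c):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(c[0] * M[1][1] - c[1] * M[0][1]) / det,
            (M[0][0] * c[1] - M[1][0] * c[0]) / det]
```

The nonsymmetric instance has been converted, with a polynomial amount of extra work, into an instance that the purchased program accepts, which is the pattern of every reduction in this chapter.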

More generally, let Q and Q′ be two decision problems. We will say that Q′ is quickly reducible to Q if whenever we are given an instance I′ of the problem Q′ we can convert it, with only a polynomial amount of labor, into an instance I of Q, in such a way that I′ and I both have the same answer (‘Yes’ or ‘No’).

Thus if we buy a program to solve Q, then we can use it to solve Q′, with just a small amount of extra work.

What is NP-completeness?

How would you like to buy one program, for $49.95, that can solve 500 diﬀerent kinds of problems?

That’s what NP-completeness is about.

To state it a little more carefully, a decision problem is NP-complete if it belongs to NP and every

problem in NP is quickly reducible to it.


The implications of NP-completeness are numerous. Suppose we could prove that a certain decision

problem Q is NP-complete. Then we could concentrate our eﬀorts to ﬁnd polynomial-time algorithms on

just that one problem Q. Indeed if we were to succeed in ﬁnding a polynomial time algorithm to do instances

of Q then we would automatically have found a fast algorithm for doing every problem in NP. How does

that work?

Take an instance I′ of some problem Q′ in NP. Since Q′ is quickly reducible to Q we could transform the instance I′ into an instance I of Q. Then use the super algorithm that we found for problems in Q to decide I. Altogether only a polynomial amount of time will have been used from start to finish.

Let’s be more specific. Suppose that tomorrow morning we prove that the graph coloring problem is

NP-complete, and that on the next morning you ﬁnd a fast algorithm for solving it. Then consider some

instance of the bin packing problem. Since graph coloring is NP-complete, the instance of bin packing can

be quickly converted into an instance of graph coloring for which the ‘Yes/No’ answer is the same. Now use

the fast graph coloring algorithm that you found (congratulations, by the way!) on the converted problem.

The answer you get is the correct answer for the original bin packing problem.

So, a fast algorithm for some NP-complete problem implies a fast algorithm for every problem in NP.

Conversely suppose we can prove that it is impossible to find a fast algorithm for some particular problem Q in NP. Then we can’t find a fast algorithm for any NP-complete problem Q′ either. For if we could then we would be able to solve instances of Q by quickly reducing them to instances of Q′ and solving them.
**

If we could prove that there is no fast way to test the primality of a given integer then we would have

proved that there is no fast way to decide if graphs are K-colorable, because, as we will see, the graph coloring

problem is NP-complete and primality testing is in NP. Think about that one for a few moments, and the

extraordinary beauty and structural unity of these computational problems will begin to reveal itself.

To summarize: quick for one NP-complete problem implies quick for all of NP; provably slow for one

problem in NP implies provably slow for all NP-complete problems.

There’s just one small detail to attend to. We’ve been discussing the economic advantages of keeping

ﬂocks of unicorns instead of sheep. If there aren’t any unicorns then the discussion is a little silly.

NP-complete problems have all sorts of marvellous properties. It’s lovely that every problem in NP can

be quickly reduced to just that one NP-complete problem. But are there any NP-complete problems? Why,

after all, should there be a single computational problem with the property that every one of the diverse

creatures that inhabit NP should be quickly reducible to it?

Well, there are NP-complete problems, hordes of them, and proving that will occupy our attention for

the next two sections. Here’s the plan.

In section 5.2 we are going to talk about a simple computer, called a Turing machine. It is an idealized

computer, and its purpose is to standardize ideas of computability and time of computation by referring all

problems to the one standard machine.

A Turing machine is an extremely simple ﬁnite-state computer, and when it performs a computation,

a unit of computational labor will be very clearly and unambiguously describable. It turns out that the

important aspects of polynomial time computability do not depend on the particular computer that is

chosen as the model. The beauty of the Turing machine is that it is at once a strong enough concept that it

can in principle perform any calculation that any other ﬁnite state machine can do, while at the same time

it is logically clean and simple enough to be useful for proving theorems about complexity.

The microcomputer on your desktop might have been chosen as the standard against which polynomial

time computability is measured. If that had been done then the class P of quickly solvable problems would

scarcely have changed at all (the polynomials would be diﬀerent but they would still be polynomials), but

the proofs that we humans would have to give in order to establish the relevant theorems would have gotten

much more complicated because of the variety of diﬀerent kinds of states that modern computers have.

Next, in section 5.3 we will prove that there is an NP-complete problem. It is called the satisﬁability

problem. Its status as an NP-complete problem was established by S. Cook in 1971, and from that work all

later progress in the ﬁeld has ﬂowed. The proof uses the theory of Turing machines.

The ﬁrst NP-complete problem was the hardest one to ﬁnd. We will ﬁnd, in section 5.4, a few more

NP-complete problems, so the reader will get some idea of the methods that are used in identifying them.

Since nobody knows a fast way to solve these problems, various methods have been developed that give

approximate solutions quickly, or that give exact solutions in fast average time, and so forth. The beautiful


book of Garey and Johnson (see references at the end of the chapter) calls this ‘coping with NP-completeness,’

and we will spend the rest of this chapter discussing some of these ideas.

Exercises for section 5.1

1. Prove that the following decision problem belongs to P: Given integers K and a_1, . . . , a_n. Is the median of the a’s smaller than K?

2. Prove that the following decision problem is in NP: given an n × n matrix A of integer entries. Is det A = 0?

3. For which of the following problems can you prove membership in P?

(a) Given a graph G. Does G contain a circuit of length 4?

(b) Given a graph G. Is G bipartite?

(c) Given n integers. Is there a subset of them whose sum is an even number?

(d) Given n integers. Is there a subset of them whose sum is divisible by 3?

(e) Given a graph G. Does G contain an Euler circuit?

4. For which of the following problems can you prove membership in NP?

(a) Given a set of integers and another integer K. Is there a subset of the given integers whose sum is

K?

(b) Given a graph G and an integer K. Does G contain a path of length ≥ K?

(c) Given a set of K integers. Is it true that not all of them are prime?

(d) Given a set of K integers. Is it true that all of them are prime?

5.2 Turing Machines

A Turing machine consists of

(a) a doubly infinite tape, that is marked off into squares that are numbered as shown in Fig. 5.2.1 below. Each square can contain a single character from the character set that the machine recognizes. For simplicity we can assume that the character set contains just three symbols: ‘0,’ ‘1,’ and ‘ ’ (blank).

(b) a tape head that is capable of either reading a single character from a square on the tape or writing a

single character on a square, or moving its position relative to the tape by an increment of one square in

either direction.

(c) a finite list of states such that at every instant the machine is in exactly one of those states. The possible states of the machine are, first of all, the regular states q_1, . . . , q_s, and second, three special states:

q_0: the initial state
q_Y: the final state in a problem to which the answer is ‘Yes’
q_N: the final state in a problem to which the answer is ‘No’

(d) a program (or program module, if we think of it as a pluggable component) that directs the machine

through the steps of a particular task.

Fig. 5.2.1: A Turing machine tape

Let’s describe the program module in more detail. Suppose that at a certain instant the machine is in state q (other than q_Y or q_N) and that the symbol that has just been read from the tape is ‘symbol.’ Then from the pair (q, symbol) the program module will decide

(i) to what state q′ the machine shall next go, and

(ii) what single character the machine will now write on the tape in the square over which the head is now positioned, and

(iii) whether the tape head will next move one square to the right or one square to the left.


One step of the program, therefore, goes from

    (state, symbol) to (newstate, newsymbol, increment).   (5.2.1)

If and when the state reaches q_Y or q_N the computation is over and the machine halts.

The machine should be thought of as part hardware and part software. The programmer’s job is, as

usual, to write the software. To write a program for a Turing machine, what we have to do is to tell it how

to make each and every one of the transitions (5.2.1). A Turing machine program looks like a table in which,

for every possible pair (state, symbol) that the machine might ﬁnd itself in, the programmer has speciﬁed

what the newstate, the newsymbol and the increment shall be.

To begin a computation with a Turing machine we take the input string x, of length B, say, that describes the problem that we want to solve, and we write x in squares 1, 2, . . . , B of the tape. The tape head is then positioned over square 1, the machine is put into state q_0, the program module that the programmer prepared is plugged into its slot, and the computation begins.

The machine reads the symbol in square 1. It now is in state q_0 and has read symbol, so it can consult the program module to find out what to do. The program instructs it to write at square 1 a newsymbol, to move the head either to square 0 or to square 2, and to enter a certain newstate, say q′. The whole process is then repeated, possibly forever, but hopefully after finitely many steps the machine will enter the state q_Y or state q_N, at which moment the computation will halt with the decision having been made.

If we want to watch a Turing machine in operation, we don’t have to build it. We can simulate one.

Here is a pidgin-Pascal simulation of a Turing machine that can easily be turned into a functioning program.

It is in two principal parts.

The procedure turmach has for input a string x of length B, and for output it sets the Boolean variable accept to True or False, depending on whether the outcome of the computation is that the machine halted in state q_Y or q_N respectively. This procedure is the ‘hardware’ part of the Turing machine. It doesn’t vary from one job to the next.

Procedure gonextto is the program module of the machine, and it will be diﬀerent for each task. Its

inputs are the present state of the machine and the symbol that was just read from the tape. Its outputs

are the newstate into which the machine goes next, the newsymbol that the tape head now writes on the

current square, and the increment (±1) by which the tape head will now move.

procedure turmach(B: integer; x: array[1..B]; accept: Boolean);
{simulates Turing machine action on input string x of length B}
{write input string on tape in first B squares}
for square := 1 to B do
    tape[square] := x[square];
{record boundaries of written-on part of tape}
leftmost := 1; rightmost := B;
{initialize tape head and state}
state := 0; square := 1;
while state <> 'Y' and state <> 'N' do
    {read symbol at current tape square}
    if square < leftmost or square > rightmost
        then symbol := ' ' else symbol := tape[square];
    {ask program module for state transition}
    gonextto(state, symbol, newstate, newsymbol, increment);
    state := newstate;
    {update boundaries and write new symbol}
    if square < leftmost then leftmost := square;
    if square > rightmost then rightmost := square;
    tape[square] := newsymbol;
    {move tape head}
    square := square + increment
end; {while}
accept := (state = 'Y')
end. {turmach}


Now let’s try to write a particular program module gonextto. Consider the following problem: given an input string x, consisting of 0’s and 1’s, of length B. Find out if it is true that the string contains an odd number of 1’s.

We will write a program that will scan the input string from left to right, and at each moment the machine will be in state 0 if it has so far scanned an even number of 1’s, in state 1 otherwise. In Fig. 5.2.2 we show a program that will get the job done.

state   symbol   newstate   newsymbol   increment
  0       0          0          0          +1
  0       1          1          1          +1
  0     blank       q_N       blank        −1
  1       0          1          0          +1
  1       1          0          1          +1
  1     blank       q_Y       blank        −1

Fig. 5.2.2: A Turing machine program for bit parity

Exercise. Program the above as procedure gonextto, run it for some input string, and print out the state of

the machine, the contents of the tape, and the position of the tape head after each step of the computation.
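As a start on this exercise, here is one possible rendering in Python rather than pidgin-Pascal. The function names mirror turmach and gonextto from the text; the dictionary-based tape (so that unwritten squares read as blank) and the use of the characters ‘Y’ and ‘N’ for the halting states are our own encoding choices.

```python
# A minimal Python rendering of turmach plus the parity program
# module (gonextto) of Fig. 5.2.2.

def gonextto(state, symbol):
    """Parity program: state 0 = even number of 1's seen so far,
    state 1 = odd.  Returns (newstate, newsymbol, increment)."""
    table = {
        (0, '0'): (0, '0', +1),
        (0, '1'): (1, '1', +1),
        (0, ' '): ('N', ' ', -1),   # blank: halt rejecting (even parity)
        (1, '0'): (1, '0', +1),
        (1, '1'): (0, '1', +1),
        (1, ' '): ('Y', ' ', -1),   # blank: halt accepting (odd parity)
    }
    return table[(state, symbol)]

def turmach(x):
    """Simulate the machine on input string x; True iff it halts in
    state 'Y'.  The tape is a dict, so squares outside the written-on
    part read as blank."""
    tape = {i + 1: ch for i, ch in enumerate(x)}
    state, square = 0, 1
    while state not in ('Y', 'N'):
        symbol = tape.get(square, ' ')
        state, tape[square], increment = gonextto(state, symbol)
        square += increment
    return state == 'Y'

print(turmach('1101'))  # three 1's, odd parity: True
print(turmach('1001'))  # two 1's, even parity: False
```

Printing the state, tape contents, and head position after each step, as the exercise asks, is a matter of adding one print statement inside the while loop.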

In the next section we are going to use the Turing machine concept to prove Cook’s theorem, which

is the assertion that a certain problem is NP-complete. Right now let’s review some of the ideas that have

already been introduced from the point of view of Turing machines.

We might immediately notice that some terms that were just a little bit fuzzy before are now much more

sharply in focus. Take the notion of polynomial time, for example. To make that idea precise one needs a

careful deﬁnition of what ‘the length of the input bit string’ means, and what one means by the number of

‘steps’ in a computation.

But on a Turing machine both of these ideas come through with crystal clarity. The input bit string x

is what we write on the tape to get things started, and its length is the number of tape squares it occupies.

A ‘step’ in a Turing machine calculation is obviously a single call to the program module. A Turing machine

calculation was done ‘in time P(B)’ if the input string occupied B tape squares and the calculation took

P(B) steps.

Another word that we have been using without ever nailing down precisely is ‘algorithm.’ We all

understand informally what an algorithm is. But now we understand formally too. An algorithm for a

problem is a program module for a Turing machine that will cause the machine to halt after ﬁnitely many

steps in state ‘Y’ for every instance whose answer is ‘Yes,’ and after ﬁnitely many steps in state ‘N’ for every

instance whose answer is ‘No.’

A Turing machine and an algorithm deﬁne a language. The language is the set of all input strings x

that lead to termination in state ‘Y,’ i.e., to an accepting calculation.

Now let’s see how the idea of a Turing machine can clarify the description of the class NP. This is

the class of problems for which the decisions can be made quickly if the input strings are accompanied by

suitable certiﬁcates.

By a certiﬁcate we mean a ﬁnite strip of Turing machine tape, consisting of 0 or more squares, each

of which contains a symbol from the character set of the machine. A certiﬁcate can be loaded into a

Turing machine as follows. If the certiﬁcate contains m > 0 tape squares, then replace the segment from

square number −m to square number −1, inclusive, of the Turing machine tape with the certiﬁcate. The

information on the certiﬁcate is then available to the program module just as any other information on the

tape is available.

To use a Turing machine as a checking or verifying computer, we place the input string x that describes

the problem instance in squares 1, 2, . . ., B of the tape, and we place the certiﬁcate C(x) of x in squares

−m, −m + 1, . . . , −1 of the tape. We then write a verifying program for the program module in which the

program veriﬁes that the string x is indeed a word in the language of the machine, and in the course of the

veriﬁcation the program is quite free to examine the certiﬁcate as well as the problem instance.

A Turing machine that is being used as a verifying computer is called a nondeterministic machine. The

hardware is the same, but the manner of input and the question that is being asked are diﬀerent from the


Chapter 5: NP-completeness

situation with a deterministic Turing machine, in which we decide whether or not the input string is in the

language, without using any certiﬁcates.

The class NP (‘Nondeterministic Polynomial’) consists of those decision problems for which there exists

a fast (polynomial time) algorithm that will verify, given a problem instance string x and a suitable certiﬁcate

C(x), that x belongs to the language recognized by the machine, and for which, if x does not belong to the

language, no certiﬁcate would cause an accepting computation to ensue.

5.3 Cook’s Theorem

The NP-complete problems are the hardest problems in NP, in the sense that if Q′ is any decision problem in NP and Q is an NP-complete problem, then every instance of Q′ is polynomially reducible to an instance of Q. As we have already remarked, the surprising thing is that there is an NP-complete problem at all, since it is not immediately clear why any single problem should hold the key to the polynomial time solvability of every problem in the class NP. But there is one. As soon as we see why there is one, then we’ll be able to see more easily why there are hundreds of them, including many computational questions about discrete structures such as graphs, networks and games, about optimization problems, about algebraic structures, formal logic, and so forth.

Here is the satisﬁability problem, the ﬁrst problem that was proved to be NP-complete, by Stephen Cook

in 1971.

We begin with a list of (Boolean) variables x_1, ..., x_n. A literal is either one of the variables x_i or the negation of one of the variables, as x̄_i. There are 2n possible literals.

A clause is a set of literals.

The rules of the game are these. We assign the value ‘True’ (T) or ‘False’ (F) to each one of the variables. Having done that, each one of the literals inherits a truth value, namely a literal x_i has the same truth or falsity as the corresponding variable x_i, and a literal x̄_i has the opposite truth value from that of the variable x_i.

Finally each of the clauses also inherits a truth value from this process, and it is determined as follows.

A clause has the value ‘T’ if and only if at least one of the literals in that clause has the value ‘T,’ and

otherwise it has the value ‘F.’

Hence starting with an assignment of truth values to the variables, some true and some false, we end

up with a determination of the truth values of each of the clauses, some true and some false.

Deﬁnition. A set of clauses is satisﬁable if there exists an assignment of truth values to the variables that

makes all of the clauses true.

Think of the word ‘or’ as being between each of the literals in a clause, and the word ‘and’ as being

between the clauses.

The satisﬁability problem (SAT). Given a set of clauses. Does there exist a set of truth values (=T or

F), one for each variable, such that every clause contains at least one literal whose value is T (i.e., such that

every clause is satisﬁed)?

Example: Consider the set x_1, x_2, x_3 of variables. From these we might manufacture the following list of four clauses:

{x_1, x̄_2}, {x_1, x_3}, {x_2, x̄_3}, {x̄_1, x_3}.

If we choose the truth values (T, T, F) for the variables, respectively, then the four clauses would

acquire the truth values (T, T, T, F), and so this would not be a satisfying truth assignment for the set

of clauses. There are only eight possible ways to assign truth values to three variables, and after a little

more experimentation we might ﬁnd out that these clauses would in fact be satisﬁed if we were to make the

assignments (T, T, T) (how can we recognize a set of clauses that is satisﬁed by assigning to every variable

the value ‘T’ ?).
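A little experimentation of this kind is easy to mechanize. The sketch below tries all eight assignments for the example clauses; the encoding of a literal as a signed integer (+i for x_i, −i for x̄_i) is our own convention, not anything from the text.

```python
from itertools import product

# Each clause is a set of literals; literal +i means x_i, -i means
# the negated literal on x_i.
clauses = [{1, -2}, {1, 3}, {2, -3}, {-1, 3}]

def satisfies(assignment, clauses):
    """assignment maps a variable index to True/False; a clause is
    satisfied when at least one of its literals is true."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

# Exhaustive search over the 2^3 possible assignments.
for values in product([True, False], repeat=3):
    assignment = {i + 1: v for i, v in enumerate(values)}
    if satisfies(assignment, clauses):
        print(assignment)   # only {1: True, 2: True, 3: True} is printed
```

The search confirms that (T, T, T) is the unique satisfying assignment for these four clauses.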

The example already leaves one with the feeling that SAT might be a tough computational problem, because there are 2^n possible sets of truth values that we might have to explore if we were to do an exhaustive search.

It is quite clear, however, that this problem belongs to NP. Indeed, it is a decision problem. Furthermore

we can easily assign a certiﬁcate to every set of clauses for which the answer to SAT is ‘Yes, the clauses


are satisﬁable.’ The certiﬁcate contains a set of truth values, one for each variable, that satisfy all of the

clauses. A Turing machine that receives the set of clauses, suitably encoded, as input, along with the above

certiﬁcate, would have to verify only that if the truth values are assigned to the variables as shown on the

certiﬁcate then indeed every clause does contain at least one literal of value ‘T.’ That veriﬁcation is certainly

a polynomial time computation.

Now comes the hard part. We want to show

Theorem 5.3.1. (S. Cook, 1971): SAT is NP-complete.

Before we carry out the proof, it may be helpful to give a small example of the reducibility ideas that

we are going to use.

Fig. 5.3.1: A 3-coloring problem

Example. Reducing graph-coloring to SAT

Consider the graph G of four vertices that is shown in Fig. 5.3.1, and the decision problem ‘Can the

vertices of G be properly colored in 3 colors?’

Let’s see how that decision problem can be reduced to an instance of SAT. We will use 12 Boolean variables: the variable x_{i,j} corresponds to the assertion that ‘vertex i has been colored in color j’ (i = 1, 2, 3, 4; j = 1, 2, 3).

The instance of SAT that we construct has 31 clauses. The first 16 of these are

C(i) := {x_{i,1}, x_{i,2}, x_{i,3}}   (i = 1, 2, 3, 4)
T(i) := {x̄_{i,1}, x̄_{i,2}}   (i = 1, 2, 3, 4)
U(i) := {x̄_{i,1}, x̄_{i,3}}   (i = 1, 2, 3, 4)
V(i) := {x̄_{i,2}, x̄_{i,3}}   (i = 1, 2, 3, 4).   (5.3.1)

In the above, the four clauses C(i) assert that each vertex has been colored in at least one color. The

clauses T(i) say that no vertex has both color 1 and color 2. Similarly the clauses U(i) (resp. V (i)) guarantee

that no vertex has been colored 1 and 3 (resp. 2 and 3).

All 16 of the clauses in (5.3.1) together amount to the statement that ‘each vertex has been colored in

one and only one of the three available colors.’

Next we have to construct the clauses that will assure us that the two endpoints of an edge of the graph are never the same color. For this purpose we define, for each edge e of the graph G and color j (= 1, 2, 3), a clause D(e, j) as follows. Let u and v be the two endpoints of e; then D(e, j) := {x̄_{u,j}, x̄_{v,j}}, which asserts that not both endpoints of the edge e have the same color j.

The original instance of the graph coloring problem has now been reduced to an instance of SAT. In

more detail, there exists an assignment of values T, F to the 12 Boolean variables x_{1,1}, ..., x_{4,3} such that each of the 31 clauses contains at least one literal whose value is T if and only if the vertices of the graph G can be properly colored in three colors. The graph is 3-colorable if and only if the clauses are satisfiable.

It is clear that if we have an algorithm that will solve SAT, then we can also solve graph coloring

problems. A few moments of thought will convince the reader that the transformation of one problem to the

other that was carried out above involves only a polynomial amount of computation, despite the seemingly

large number of variables and clauses. Hence graph coloring is quickly reducible to SAT.
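The construction of the 31 clauses is entirely mechanical, as the following sketch shows. Since Fig. 5.3.1 is not reproduced here, the five-edge graph in the edge list below is only an assumed example (any graph on 4 vertices with 5 edges yields 16 + 3 · 5 = 31 clauses). A literal is encoded as a triple: (i, j, True) stands for x_{i,j}, and (i, j, False) for its negation.

```python
# Build the clauses (5.3.1) plus the edge clauses D(e, j) for a
# 3-coloring problem on 4 vertices.

vertices = [1, 2, 3, 4]
colors = [1, 2, 3]
edges = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4)]  # assumed example graph

clauses = []
for i in vertices:
    # C(i): vertex i gets at least one color
    clauses.append([(i, j, True) for j in colors])
    # T(i), U(i), V(i): vertex i does not get two colors at once
    clauses.append([(i, 1, False), (i, 2, False)])
    clauses.append([(i, 1, False), (i, 3, False)])
    clauses.append([(i, 2, False), (i, 3, False)])

for (u, v) in edges:
    for j in colors:
        # D(e, j): the endpoints of edge e are not both colored j
        clauses.append([(u, j, False), (v, j, False)])

print(len(clauses))  # 31
```

The loop structure makes the polynomial cost of the transformation visible: the clause count is linear in the number of vertices plus the number of edges.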


Proof of Cook’s theorem

We want to prove that SAT is NP-complete, i.e., that every problem in NP is polynomially reducible

to an instance of SAT. Hence let Q be some problem in NP and let I be an instance of problem Q. Since Q

is in NP there exists a Turing machine that recognizes encoded instances of problem Q, if accompanied by

a suitable certiﬁcate, in polynomial time.

Let TMQ be such a Turing machine, and let P(n) be a polynomial in its argument n with the property

that TMQ recognizes every pair (x, C(x)), where x is a word in the language Q and C(x) is its certiﬁcate,

in time ≤ P(n), where n is the length of x.

We intend to construct, corresponding to each word I in the language Q, an instance f(I) of SAT for

which the answer is ‘Yes, the clauses are all simultaneously satisﬁable.’ Conversely, if the word I is not in

the language Q, the clauses will not be satisﬁable.

The idea can be summarized like this: the instance of SAT that will be constructed will be a collection

of clauses that together express the fact that there exists a certiﬁcate that causes Turing machine TMQ to do

an accepting calculation. Therefore, in order to test whether or not the word I belongs to the language, it

suﬃces to check that the collection of clauses is satisﬁable.

To construct an instance of SAT means that we are going to deﬁne a number of variables, of literals,

and of clauses, in such a way that the clauses are satisﬁable if and only if x is in the language Q, i.e., the

machine TMQ accepts x and its certiﬁcate.

What we must do, then, is to express the accepting computation of the Turing machine as the simul-

taneous satisfaction of a number of logical propositions. It is precisely here that the relative simplicity of a

Turing machine allows us to enumerate all of the possible paths to an accepting computation in a way that

would be quite unthinkable with a ‘real’ computer.

Now we will describe the Boolean variables that will be used in the clauses under construction.

Variable Q_{i,k} is true if after step i of the checking calculation it is true that the Turing machine TMQ is in state q_k, false otherwise.

Variable S_{i,j,a} = {after step i, symbol a is in tape square j}.

Variable T_{i,j} = {after step i, the tape head is positioned over square j}.

Let’s count the variables that we’ve just introduced. Since the Turing machine TMQ does its accepting

calculation in time ≤ P(n) it follows that the tape head will never venture more than ±P(n) squares away

from its starting position. Therefore the subscript j, which runs through the various tape squares that are

scanned during the computation, can assume only O(P(n)) diﬀerent values.

Index a runs over the letters in the alphabet that the machine can read, so it can assume at most some

ﬁxed number A of values.

The index i runs over the steps of the accepting computation, and so it takes at most O(P(n)) diﬀerent

values.

Finally, k indexes the states of the Turing machine, and there is only some fixed finite number, K, say, of states that TMQ might be in. Hence there are altogether O(P(n)^2) variables, a polynomial number of them.

Is it true that every random assignment of true or false values to each of these variables corresponds to an accepting computation on (x, C(x))? Certainly not. For example, if we aren’t careful we might assign true values to both T_{9,4} and T_{10,33}, thereby burning out the bearings on the tape transport mechanism! (why?)

Our remaining task, then, will be to describe precisely the conditions under which a set of values assigned

to the variables listed above actually deﬁnes a possible accepting calculation for (x, C(x)). Then we will be

sure that whatever set of satisfying values of the variables might be found by solving the SAT problem, they

will determine a real accepting calculation of the machine TMQ.

This will be done by requiring that a number of clauses be all true (‘satisfied’) at once, where each clause will express one necessary condition. In the following, the boldface type will describe, in words, the condition that we want to express, and it will be followed by the formal set of clauses that actually expresses the condition on input to SAT.

At each step, the machine is in at least one state.

Hence at least one of the K available state variables must be true. This leads to the ﬁrst set of clauses,

one for each step i of the computation:

{Q_{i,1}, Q_{i,2}, ..., Q_{i,K}}.

Since i assumes O(P(n)) values, these are O(P(n)) clauses.

At each step, the machine is not in more than one state.

Therefore, for each step i, and each pair j′, j″ of distinct states, the clause

{Q̄_{i,j′}, Q̄_{i,j″}}

must be true. These are O(P(n)) additional clauses to add to the list, but still more are needed.
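These first two clause families can be generated by a very short routine; a sketch in Python (the tuple encoding of a variable Q_{i,k}, and of a literal as a pair (variable, negated), is our own):

```python
from itertools import combinations

# Generate the first two clause families of the proof, for a machine
# with K states and a computation of P steps.  A literal is a pair
# (variable, negated); the variable Q_{i,k} is encoded as ('Q', i, k).

def state_clauses(P, K):
    clauses = []
    for i in range(P + 1):
        # at each step, the machine is in at least one state
        clauses.append([(('Q', i, k), False) for k in range(K)])
        # ... and is not in more than one state
        for k1, k2 in combinations(range(K), 2):
            clauses.append([(('Q', i, k1), True), (('Q', i, k2), True)])
    return clauses

# For P steps and K states: (P + 1) * (1 + K*(K-1)/2) clauses in all.
print(len(state_clauses(3, 4)))  # 4 * (1 + 6) = 28
```

The remaining clause families of the proof are produced in exactly the same mechanical style.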

At each step, each tape square contains exactly one symbol from the alphabet of the machine.

This leads to two lists of clauses which require, first, that there is at least one symbol in each square at each step, and second, that there are not two symbols in any square at any step. The clauses that do this are

{S_{i,j,1}, S_{i,j,2}, ..., S_{i,j,A}}

where A is the number of letters in the machine’s alphabet, and

{S̄_{i,j,k′}, S̄_{i,j,k″}}

for each step i, square j, and pair k′, k″ of distinct symbols in the alphabet of the machine.

The reader will by now have gotten the idea of how to construct the clauses, so for the next three

categories we will simply list the functions that must be performed by the corresponding lists of clauses, and

leave the construction of the clauses as an exercise.

At each step, the tape head is positioned over a single square.

Initially the machine is in state 0, the head is over square 1, the input string x is in squares 1

to n, and C(x) (the input certiﬁcate of x) is in squares 0, -1, ..., −P(n).

At step P(n) the machine is in state q_Y.

The last set of restrictions is a little trickier:

At each step the machine moves to its next conﬁguration (state, symbol, head position) in

accordance with the application of its program module to its previous (state, symbol).

To find the clauses that will do this job, consider first the following condition: the symbol in square j of the tape cannot change during step i of the computation if the tape head isn’t positioned there at that moment. This translates into the collection

{T_{i,j}, S̄_{i,j,k}, S_{i+1,j,k}}

of clauses, one for each triple (i, j, k) = (step, square, symbol). These clauses express the condition in the following way: either (at time i) the tape head is positioned over square j (T_{i,j} is true), or else the head is not positioned there, in which case either symbol k is not in the jth square before the step or else symbol k is (still) in the jth square after the step is executed.

It remains to express the fact that the transitions from one configuration of the machine to the next are the direct results of the operation of the program module. The three sets of clauses that do this are

{T̄_{i,j}, Q̄_{i,k}, S̄_{i,j,l}, T_{i+1,j+INC}}
{T̄_{i,j}, Q̄_{i,k}, S̄_{i,j,l}, Q_{i+1,k′}}
{T̄_{i,j}, Q̄_{i,k}, S̄_{i,j,l}, S_{i+1,j,l′}}.

In each case the format of the clause is this: ‘either the tape head is not positioned at square j, or the present state is not q_k, or the symbol just read is not l; but if they are, then ...’ There is a clause as above for each step i = 0, ..., P(n) of the computation, for each square j = −P(n), ..., P(n) of the tape, for each symbol l in the alphabet, and for each possible state q_k of the machine, a polynomial number of clauses in all. The new configuration triple (INC, k′, l′) is, of course, as computed by the program module.

Now we have constructed a set of clauses with the following property. If we execute a recognizing

computation on a string x and its certiﬁcate, in time at most P(n), then this computation determines a

set of (True, False) values for all of the variables listed above, in such a way that all of the clauses just

constructed are simultaneously satisﬁed.

Conversely if we have a set of values of the SAT variables that satisfy all of the clauses at once, then

that set of values of the variables describes a certiﬁcate that would cause TMQ to do a computation that

would recognize the string x and it also describes, in minute detail, the ensuing accepting computation that

TMQ would do if it were given x and that certiﬁcate.

Hence every language in NP can be reduced to SAT. It is not difficult to check through the above

construction and prove that the reduction is accomplishable in polynomial time. It follows that SAT is

NP-complete.

5.4 Some other NP-complete problems

Cook’s theorem opened the way to the identiﬁcation of a large number of NP-complete problems. The

proof that Satisﬁability is NP-complete required a demonstration that every problem in NP is polynomially

reducible to SAT. To prove that some other problem X is NP-complete it will be suﬃcient to prove that

SAT reduces to problem X. For if that is so then every problem in NP can be reduced to problem X by

ﬁrst reducing to an instance of SAT and then to an instance of X.

In other words, life after Cook’s theorem is a lot easier. To prove that some problem is NP-complete

we need show only that SAT reduces to it. We don’t have to go all the way back to the Turing machine

computations any more. Just prove that if you can solve your problem then you can solve SAT. By Cook’s

theorem you will then know that by solving your problem you will have solved every problem in NP.

For the honor of being ‘the second NP-complete problem,’ consider the following special case of SAT,

called 3-satisﬁability, or 3SAT. An instance of 3SAT consists of a number of clauses, just as in SAT, except

that the clauses are permitted to contain no more than three literals each. The question, as in SAT, is ‘Are

the clauses simultaneously satisﬁable by some assignment of T, F values to the variables?’

Interestingly, though, the general problem SAT is reducible to the apparently more special problem

3SAT, which will show us

Theorem 5.4.1. 3-satisﬁability is NP-complete.

Proof. Let an instance of SAT be given. We will show how to transform it quickly to an instance of 3SAT

that is satisﬁable if and only if the original SAT problem was satisﬁable.

More precisely, we are going to replace clauses that contain more than three literals with collections

of clauses that contain exactly three literals and that have the same satisﬁability as the original. In fact,

suppose our instance of SAT contains a clause

{x_1, x_2, ..., x_k}   (k ≥ 4).   (5.4.1)

Then this clause will be replaced by k − 2 new clauses, utilizing k − 3 new variables z_i (i = 1, ..., k − 3) that are introduced just for this purpose. The k − 2 new clauses are

{x_1, x_2, z_1}, {x_3, z̄_1, z_2}, {x_4, z̄_2, z_3}, ..., {x_{k−1}, x_k, z̄_{k−3}}.   (5.4.2)

We now make the following

Claim. If x*_1, ..., x*_k is an assignment of truth values to the x’s for which the clause (5.4.1) is true, then there exist assignments z*_1, ..., z*_{k−3} of truth values to the z’s such that all of the clauses (5.4.2) are simultaneously satisfied by (x*, z*). Conversely, if (x*, z*) is some assignment that satisfies all of (5.4.2), then x* alone satisfies (5.4.1).

To prove the claim, first suppose that (5.4.1) is satisfied by some assignment x*. Then one, at least, of the k literals x_1, ..., x_k, say x_r, has the value ‘T.’ Then we can satisfy all k − 2 of the transformed clauses (5.4.2) by assigning z*_s := ‘T’ for s ≤ r − 2 and z*_s := ‘F’ for s > r − 2. It is easy to check that each one of the k − 2 new clauses is satisfied.

Conversely, suppose that all of the new clauses are satisﬁed by some assignment of truth values to the

x’s and the z’s. We will show that at least one of the x’s must be ‘True,’ so that the original clause will be

satisﬁed.

Suppose, to the contrary, that all of the x’s are false. Since, in the new clauses none of the x’s are

negated, the fact that the new clauses are satisﬁed tells us that they would remain satisﬁed without any of

the x’s. Hence the clauses

{z_1}, {z̄_1, z_2}, {z̄_2, z_3}, ..., {z̄_{k−4}, z_{k−3}}, {z̄_{k−3}}

are satisfied by the values of the z’s. If we scan the list from left to right we discover, in turn, that z_1 is true, z_2 is true, ..., and finally, much to our surprise, that z_{k−3} is true, and z_{k−3} is also false, a contradiction which establishes the truth of the claim made above.

The observation that the transformations just discussed can be carried out in polynomial time completes

the proof of theorem 5.4.1.
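The replacement of (5.4.1) by (5.4.2) is a purely mechanical rewriting, as the sketch below shows. The signed-integer encoding of literals (+i for x_i, −i for the negated literal) is our own convention.

```python
# The clause-splitting step of Theorem 5.4.1.  For a clause of
# k >= 4 literals we emit the k - 2 clauses of (5.4.2), using k - 3
# fresh variables numbered from next_var upward.

def split_clause(clause, next_var):
    """Return (list of 3-literal clauses, next unused variable index)."""
    k = len(clause)
    if k <= 3:
        return [clause], next_var
    z = list(range(next_var, next_var + k - 3))   # the new z-variables
    new_clauses = [[clause[0], clause[1], z[0]]]  # {x_1, x_2, z_1}
    for i in range(k - 4):                        # {x_{i+3}, -z_i, z_{i+1}}
        new_clauses.append([clause[2 + i], -z[i], z[i + 1]])
    new_clauses.append([clause[k - 2], clause[k - 1], -z[-1]])
    return new_clauses, next_var + k - 3

# Split {x_1, x_2, x_3, x_4, x_5}, with z-variables starting at 6:
new, nxt = split_clause([1, 2, 3, 4, 5], 6)
print(new)  # [[1, 2, 6], [3, -6, 7], [4, 5, -7]]
```

Each original clause of length k costs only k − 2 new clauses and k − 3 new variables, which is where the polynomial bound on the transformation comes from.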

We remark, in passing, that the problem ‘2SAT’ is in P.

Our collection of NP-complete problems is growing. Now we have two, and a third is on the way. We

will show next how to reduce 3SAT to a graph coloring problem, thereby proving

Theorem 5.4.2. The graph vertex coloring problem is NP-complete.

Proof: Given an instance of 3SAT, that is to say, given a collection of k clauses, involving n variables and

having at most three literals per clause, we will construct, in polynomial time, a graph G with the property

that its vertices can be properly colored in n + 1 colors if and only if the given clauses are satisﬁable. We

will assume that n > 4, the contrary case being trivial.

The graph G will have 3n + k vertices:

{x_1, ..., x_n}, {x̄_1, ..., x̄_n}, {y_1, ..., y_n}, {C_1, ..., C_k}.

Now we will describe the set of edges of G. First, each vertex x_i is joined to x̄_i (i = 1, ..., n). Next, every vertex y_i is joined to every other vertex y_j (j ≠ i), to every vertex x_j (j ≠ i), and to every vertex x̄_j (j ≠ i).

Vertex x_i is connected to C_j if x_i is not one of the literals in clause C_j. Finally, x̄_i is connected to C_j if x̄_i is not one of the literals in C_j.

May we interrupt the proceedings to say again why we’re doing all of this? You have just read the

description of a certain graph G. The graph is one that can be drawn as soon as someone hands us a 3SAT

problem. We described the graph by listing its vertices and then listing its edges. What does the graph do

for us?

Well, suppose that we have just bought a computer program that can decide if graphs are colorable in a given number of colors. We paid $49.95 for it, and we’d like to use it. But the first problem that needs solving happens to be a 3SAT problem, not a graph coloring problem. We aren’t so easily discouraged, though. We convert the 3SAT problem into a graph that is (n+1)-colorable if and only if the original 3SAT problem was satisfiable. Now we can get our money’s worth by running the graph coloring program, even though what we really wanted to do was to solve a 3SAT problem.


In Fig. 5.4.1 we show the graph G of 11 vertices that corresponds to the following instance of 3SAT:

Fig. 5.4.1: The graph for a 3SAT problem

Now we claim that this graph is n + 1 colorable if and only if the clauses are satisfiable.

Clearly G cannot be colored in fewer than n colors, because the n vertices y_1, ..., y_n are all connected to each other and therefore they alone already require n different colors for a proper coloration. Suppose that y_i is assigned color i (i = 1, ..., n).

Do we need new colors in order to color the x_i vertices? Since vertex y_i is connected to every x vertex and every x̄ vertex except x_i, x̄_i, if color i is going to be used on the x’s or the x̄’s, it will have to be assigned to one of x_i, x̄_i, but not to both, since they are connected to each other. Hence a new color, color n + 1, will have to be introduced in order to color the x’s and x̄’s.

Further, if we are going to color the vertices of G in only n + 1 colors, the only way to do it will be to assign color n + 1 to exactly one member of each pair (x_i, x̄_i), and color i to the other one, for each i = 1, ..., n. That one of the pair that gets color n + 1 will be called the False vertex; the other one is the True vertex of the pair (x_i, x̄_i), for each i = 1, ..., n.

It remains to color the vertices C_1, ..., C_k. The graph will be n + 1 colorable if and only if we can do this without using any new colors. Since each clause contains at most three literals, and n > 4, every vertex C_i must be adjacent to both x_j and x̄_j for at least one value of j. Therefore no vertex C_i can be colored in the color n + 1 in a proper coloring of G, and therefore every C_i must be colored in one of the colors 1, ..., n.

Since C_i is connected by an edge to every vertex x_j or x̄_j that is not in the clause C_i, it follows that C_i cannot be colored in the same color as any x_j or x̄_j that is not in the clause C_i.

Hence the color that we assign to C_i must be the same as the color of some ‘True’ vertex x_j or x̄_j that corresponds to a literal that is in clause C_i. Therefore the graph is n + 1 colorable if and only if there is a ‘True’ vertex for each C_i, and this means exactly that the clauses are satisfiable.

It is easy to verify that the transformation from the 3SAT problem to the graph coloring problem can

be carried out in polynomial time, and the proof is ﬁnished.
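The graph G of Theorem 5.4.2 can be written down directly from the edge description above. A sketch in Python (the labeled-tuple vertex names and the signed-integer clause encoding are our own):

```python
# From a 3SAT instance with n variables, build the graph on 3n + k
# vertices whose (n+1)-colorability is equivalent to satisfiability.
# A literal is a signed integer: +i for x_i, -i for its negation.

def coloring_graph(n, clauses):
    """Return (vertices, edges).  Vertices are labeled tuples; edges
    are frozensets of two vertices."""
    V = ([('x', i) for i in range(1, n + 1)]
         + [('xbar', i) for i in range(1, n + 1)]
         + [('y', i) for i in range(1, n + 1)]
         + [('C', j) for j in range(1, len(clauses) + 1)])
    E = set()
    for i in range(1, n + 1):
        E.add(frozenset([('x', i), ('xbar', i)]))      # x_i -- x̄_i
        for j in range(1, n + 1):
            if j != i:
                E.add(frozenset([('y', i), ('y', j)]))     # y's: a clique
                E.add(frozenset([('y', i), ('x', j)]))     # y_i -- x_j
                E.add(frozenset([('y', i), ('xbar', j)]))  # y_i -- x̄_j
    for j, clause in enumerate(clauses, start=1):
        for i in range(1, n + 1):
            if i not in clause:        # x_i is not a literal of C_j
                E.add(frozenset([('x', i), ('C', j)]))
            if -i not in clause:       # x̄_i is not a literal of C_j
                E.add(frozenset([('xbar', i), ('C', j)]))
    return V, E

# A small assumed instance with n = 5 variables and k = 2 clauses:
V, E = coloring_graph(5, [[1, -2, 3], [2, 4, -5]])
print(len(V))  # 3*5 + 2 = 17 vertices
```

Both loops run over polynomially many pairs, which is the polynomial-time bound claimed at the end of the proof.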

By means of many, often quite ingenious, transformations of the kind that we have just seen, the list of NP-complete problems has grown rapidly since the first example, and the 21 additional problems found by R. Karp. Hundreds of such problems are now known. Here are a few of the more important ones.


Maximum clique: We are given a graph G and an integer K. The question is to determine whether or

not there is a set of K vertices in G, each of which is joined, by an edge of G, to all of the others.

Edge coloring: Given a graph G and an integer K. Can we color the edges of G in K colors, so that whenever two edges meet at a vertex, they will have different colors?

Let us refer to an edge coloring of this kind as a proper coloring of the edges of G.

A beautiful theorem of Vizing* deals with this question. If ∆ denotes the largest degree of any vertex in the given graph, then Vizing’s theorem asserts that the edges of G can be properly colored in either ∆ or ∆ + 1 colors. Since it is obvious that at least ∆ colors will be needed, this means that the edge chromatic number is in doubt by only one unit, for every graph G! Nevertheless the decision as to whether the correct answer is ∆ or ∆ + 1 is NP-complete.

Hamilton path: In a given graph G, is there a path that visits every vertex of G exactly once?

Target sum: Given a ﬁnite set of positive integers whose sum is S. Is there a subset whose sum is S/2?

The above list, together with SAT, 3SAT, Travelling Salesman and Graph Coloring, constitutes a modest

sampling of the class of these seemingly intractable problems. Of course it must not be assumed that every

problem that ‘sounds like’ an NP-complete problem is necessarily so hard. If for example we ask for an Euler

path instead of a Hamilton path (i.e., if we want to traverse edges rather than vertices) the problem would

no longer be NP-complete, and in fact it would be in P, thanks to theorem 1.6.1.
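Indeed, the Euler path test is just a degree count: a connected graph has a path that traverses every edge exactly once if and only if it has zero or two vertices of odd degree. A sketch for a simple graph given as an edge list (parallel edges are not handled):

```python
from collections import defaultdict

# A polynomial-time Euler path test: connectivity plus a count of
# the vertices of odd degree.

def has_euler_path(edges):
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # connectivity check by depth-first search from any vertex
    start = next(iter(adj))
    seen, stack = set(), [start]
    while stack:
        u = stack.pop()
        if u not in seen:
            seen.add(u)
            stack.extend(adj[u] - seen)
    if seen != set(adj):
        return False
    odd = sum(1 for u in adj if len(adj[u]) % 2 == 1)
    return odd in (0, 2)

print(has_euler_path([(1, 2), (2, 3), (3, 1)]))  # a triangle: True
print(has_euler_path([(1, 2), (1, 3), (1, 4),
                      (2, 3), (2, 4), (3, 4)]))  # K4, four odd vertices: False
```

Contrast this few-line test with the Hamilton path problem, for which no polynomial-time algorithm is known.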

As another example, the fact that one can ﬁnd the edge connectivity of a given graph in polynomial

time (see section 3.8) is rather amazing considering the quite difficult appearance of the problem. One

of our motivations for including the network ﬂow algorithms in this book was, indeed, to show how very

sophisticated algorithms can sometimes prove that seemingly hard problems are in fact computationally

tractable.

Exercises for section 5.4

1. Is the claim that we made and proved above (just after (5.4.2)) identical with the statement that the

clause (5.4.1) is satisﬁable if and only if the clauses (5.4.2) are simultaneously satisﬁable? Discuss.

2. Is the claim that we made and proved above (just after (5.4.2)) identical with the statement that the

Boolean expression (5.4.1) is equal to the product of the Boolean expressions (5.4.2) in the sense that their

truth values are identical on every set of inputs? Discuss.

3. Let it be desired to ﬁnd out if a given graph G, of V vertices, can be vertex colored in K colors. If we

transform the problem into an instance of 3SAT, exactly how many clauses will there be?

5.5 Half a loaf ...

If we simply have to solve an NP-complete problem, then we are faced with a very long computation. Is

there anything that can be done to lighten the load? In a number of cases various kinds of probabilistic and

approximate algorithms have been developed, some very ingenious, and these may often be quite serviceable,

as we have already seen in the case of primality testing. Here are some of the strategies of ‘near’ solutions

that have been developed.

Type I: ‘Almost surely ...’

Suppose we have an NP-complete problem that asks if there is a certain kind of substructure embedded

inside a given structure. Then we may be able to develop an algorithm with the following properties:

(a) It always runs in polynomial time

(b) When it ﬁnds a solution then that solution is always a correct one

(c) It doesn’t always ﬁnd a solution, but it ‘almost always’ does, in the sense that the ratio of successes to

total cases approaches unity as the size of the input string grows large.

An example of such an algorithm is one that will ﬁnd a Hamilton path in almost all graphs, failing to

do so sometimes, but not often, and running always in polynomial time. We will describe such an algorithm

below.

* V. G. Vizing, On an estimate of the chromatic class of a p-graph (Russian), Diskret. Analiz. 3 (1964), 25-30.


Chapter 5: NP-completeness

Type II: ‘Usually fast ...’

In this category of quasi-solution are algorithms in which the uncertainty lies not in whether a solution

will be found, but in how long it will take to ﬁnd one. An algorithm of this kind will

(a) always ﬁnd a solution and the solution will always be correct, and

(b) operate in an average of subexponential time, although occasionally it may require exponential time.

The averaging is over all input strings of a given size.

An example of this sort is an algorithm that will surely find a maximum independent set in a graph,
will on the average require ‘only’ O(n^{c log n}) time to do so, but will occasionally, i.e., for some graphs,
require nearly 2^n time to get an answer. We will outline such an algorithm below, in section 5.6. Note that
O(n^{c log n}) is not a polynomial time estimate, but it’s an improvement over 2^n.

Type III: ‘Approximately ...’

In this kind of an algorithm we don’t even get the right answer, but it’s close. Since this means giving

up quite a bit, people like these algorithms to be very fast. Of course we are going to drop our insistence

that the questions be posed as decision problems, and instead they will be asked as optimization problems:

ﬁnd the shortest tour through these cities, or, ﬁnd the size of the maximum clique in this graph, or, ﬁnd a

coloring of this graph in the fewest possible colors, etc.

In response these algorithms will

(a) run in polynomial time

(b) always produce some output

(c) provide a guarantee that the output will not deviate from the optimal solution by more than such-and-

such.

An example of this type is the approximate algorithm for the travelling salesman problem that is given

below, in section 5.8. It quickly yields a tour of the cities that is guaranteed to be at most twice as long as

the shortest possible tour.

Now let’s look at examples of each of these kinds of approximation algorithms.

An example of an algorithm of Type I is due to Angluin and Valiant. It tries to ﬁnd a Hamilton path

(or circuit) in a graph G. It doesn’t always ﬁnd such a path, but in theorem 5.5.1 below we will see that it

usually does, at least if the graph is from a class of graphs that are likely to have Hamilton paths at all.

Input to the algorithm are the graph G and two distinguished vertices s, t. It looks for a Hamilton path

between the vertices s, t (if s = t on input then we are looking for a Hamilton circuit in G).

The procedure maintains a partially constructed Hamilton path P, from s to some vertex ndp, and it
attempts to extend P by adjoining an edge to a new, previously unvisited vertex. In the process of doing
so it will delete from the graph G, from time to time, an edge, so we will also maintain a variable graph G′,
that is initially set to G, but which is acted upon by the program.

To do its job, the algorithm chooses at random an edge (ndp, v) that is incident with the current endpoint
of the partial path P, and it deletes the edge (ndp, v) from the graph G′, so it will never be chosen again. If
v is a vertex that is not on the path P then the path is extended by adjoining the new edge (ndp, v).

So much is fairly clear. However if the new vertex v is already on the path P, then we short circuit the

path by deleting an edge from it and drawing in a new edge, as is shown below in the formal statement of

the algorithm, and in Fig. 5.5.1. In that case the path does not get longer, but it changes so that it now has


enhanced chances of ultimate completion.

Fig. 5.5.1: The short circuit

Here is a formal statement of the algorithm of Angluin and Valiant for ﬁnding a Hamilton path or circuit

in an undirected graph G.

procedure uhc(G: graph; s, t: vertex);
{finds a Hamilton path (if s ≠ t) or a Hamilton
circuit (if s = t) P in an undirected graph G
and returns ‘success’, or fails, and returns ‘failure’}
G′ := G; ndp := s; P := empty path;
repeat
    if ndp is an isolated point of G′
        then return ‘failure’
    else
        choose uniformly at random an edge (ndp, v) from
            among the edges of G′ that are incident with ndp,
            and delete that edge from G′;
        if v ≠ t and v ∉ P
            then adjoin the edge (ndp, v) to P; ndp := v
        else if v ≠ t and v ∈ P
            then
                {this is the short-circuit of Fig. 5.5.1}
                u := neighbor of v in P that is closer to ndp;
                delete edge (u, v) from P;
                adjoin edge (ndp, v) to P;
                ndp := u
            end; {then}
        end {else}
until P contains every vertex of G (except t, if
    s ≠ t) and edge (ndp, t) is in G but not in G′;
adjoin edge (ndp, t) to P and return ‘success’
end. {uhc}
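For readers who want to experiment, here is a runnable sketch of the procedure in Python. The graph representation, the attempt limit, and the helper names are choices made here, not part of the text, and the success test is slightly simplified (it checks that edge (ndp, t) is in G):

```python
import random

def uhc(graph, s, t, attempts=10000):
    """Sketch of procedure uhc: grow a path P from s by random edges of a
    working copy G' (each examined edge is deleted from G'), using the
    short-circuit of Fig. 5.5.1 when an edge leads back into P.
    graph maps each vertex to the set of its neighbors.  Returns a
    Hamilton path (or circuit, if s == t) as a vertex list, or None."""
    gprime = {v: set(ns) for v, ns in graph.items()}   # G', consumed below
    path = [s]                                         # P; path[-1] is ndp
    pos = {s: 0}                                       # position of each vertex on P
    goal = len(graph) - (0 if t == s else 1)           # vertices P must contain
    for _ in range(attempts):
        ndp = path[-1]
        if len(path) == goal and t in graph[ndp]:
            return path + [t]                          # close with edge (ndp, t)
        if not gprime[ndp]:
            return None                                # ndp isolated in G': failure
        v = random.choice(sorted(gprime[ndp]))
        gprime[ndp].discard(v)                         # delete edge (ndp, v) from G'
        gprime[v].discard(ndp)
        if v == t:
            continue
        if v not in pos:                               # extend P by the new edge
            pos[v] = len(path)
            path.append(v)
        else:                                          # short-circuit: reverse the
            i = pos[v]                                 # tail of P after v
            path[i + 1:] = reversed(path[i + 1:])
            pos = {w: j for j, w in enumerate(path)}
    return None
```

On a dense graph such as the complete graph K6 the procedure almost always returns a Hamilton path within a few trials.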

As stated above, the algorithm makes only a very modest claim: either it succeeds or it fails! Of course

what makes it valuable is the accompanying theorem, which asserts that in fact the procedure almost always

succeeds, provided the graph G has a good chance of having a Hamilton path or circuit.


What kind of graph has such a ‘good chance’ ? A great deal of research has gone into the study of how

many edges a graph has to have before almost surely it must contain certain given structures. For instance,

how many edges must a graph of n vertices have before we can be almost certain that it will contain a

complete graph of 4 vertices?

To say that graphs have a property ‘almost certainly’ is to say that the ratio of the number of graphs

on n vertices that have the property to the number of graphs on n vertices approaches 1 as n grows without

bound.

For the Hamilton path problem, an important dividing line, or threshold, turns out to be at the level
of cn log n edges. That is to say, a graph of n vertices that has o(n log n) edges has relatively little chance
of being even connected, whereas a graph with > cn log n edges is almost certainly connected, and almost
certainly has a Hamilton path.

We now state the theorem of Angluin and Valiant, which asserts that the algorithm above will almost

surely succeed if the graph G has enough edges.

Theorem 5.5.1. Fix a positive real number a. There exist numbers M and c such that if we choose a graph
G at random from among those of n vertices and at least cn log n edges, and we choose arbitrary vertices s, t
in G, then the probability that algorithm uhc returns ‘success’ before making a total of Mn log n attempts
to extend partially constructed paths is 1 − O(n^{−a}).

5.6 Backtracking (I): independent sets

In this section we are going to describe an algorithm that is capable of solving some NP-complete

problems fast, on the average, while at the same time guaranteeing that a solution will always be found, be

it quickly or slowly.

The method is called backtracking, and it has long been a standard method in computer search problems

when all else fails. It has been common to think of backtracking as a very long process, and indeed it can be.

But recently it has been shown that the method can be very fast on average, and that in the graph coloring

problem, for instance, it functions in an average of constant time, i.e., the time is independent of the number

of vertices, although to be sure, the worst-case behavior is very exponential.

We ﬁrst illustrate the backtrack method in the context of a search for the largest independent set of

vertices (a set of vertices no two of which are joined by an edge) in a given graph G, an NP-complete

problem. In this case the average time behavior of the method is not constant, or even polynomial, but is

subexponential. The method is also easy to analyze and to describe in this case.

Hence consider a graph G of n vertices, in which the vertices have been numbered 1, 2, . . . , n. We want

to ﬁnd, in G, the size of the largest independent set of vertices. In Fig. 5.6.1 below, the graph G has 6

vertices.

Fig. 5.6.1: Find the largest independent set

Begin by searching for an independent set S that contains vertex 1, so let S := {1}. Now attempt to
enlarge S. We cannot enlarge S by adjoining vertex 2 to it, but we can add vertex 3. Our set S is now
{1, 3}.

Now we cannot adjoin vertex 4 (joined to 1) or vertex 5 (joined to 1) or vertex 6 (joined to 3), so we are
stuck. Therefore we backtrack, by replacing the most recently added member of S by the next choice that
we might have made for it. In this case, we delete vertex 3 from S, and the next choice would be vertex 6.
The set S is {1, 6}. Again we have a dead end.

If we backtrack again, there are no further choices with which to replace vertex 6, so we backtrack even

further, and not only delete 6 from S but also replace vertex 1 by the next possible choice for it, namely

vertex 2.


To speed up the discussion, we will now show the list of all sets S that turn up from start to ﬁnish of

the algorithm:

{1}, {13}, {16}, {2}, {24}, {245}, {25}, {3},
{34}, {345}, {35}, {4}, {45}, {5}, {6}

A convenient way to represent the search process is by means of the backtrack search tree T. This is
a tree whose vertices are arranged on levels L = 0, 1, 2, . . . , n for a graph of n vertices. Each vertex of T
corresponds to an independent set of vertices in G. Two vertices of T, corresponding to independent sets
S′, S″ of vertices of G, are joined by an edge in T if S′ ⊆ S″ and S″ − S′ consists of a single element: the
highest-numbered vertex in S″. On level L we find a vertex S of T for every independent set of exactly L
vertices of G. Level 0 consists of a single root vertex, corresponding to the empty set of vertices of G.

The complete backtrack search tree for the problem of ﬁnding a maximum independent set in the graph

G of Fig. 5.6.1 is shown in Fig. 5.6.2 below.

Fig. 5.6.2: The backtrack search tree

The backtrack algorithm amounts just to visiting every vertex of the search tree T, without actually

having to write down the tree explicitly, in advance.

Observe that the list of sets S above, or equivalently, the list of nodes of the tree T, consists of exactly

every independent set in the graph G. A reasonable measure of the complexity of the searching job, therefore,

is the number of independent sets that G has. In the example above, the graph G had 19 independent sets

of vertices, including the empty set.
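The traversal just described is easy to put into code. Since Fig. 5.6.1 is not shown here, the edge set below is a hypothetical reconstruction, chosen to be consistent with the narrative (2, 4 and 5 are joined to 1; 6 is joined to 3) and with the list of sets S above:

```python
# Hypothetical edge set for the 6-vertex graph of Fig. 5.6.1, inferred
# from the surrounding narrative and from the printed list of sets S.
EDGES = {(1, 2), (1, 4), (1, 5), (2, 3), (2, 6), (3, 6), (4, 6), (5, 6)}

def independent_with(v, S):
    """True if vertex v is joined to no vertex of the independent set S."""
    return all((min(u, v), max(u, v)) not in EDGES for u in S)

def backtrack_sets(n=6):
    """Yield every nonempty independent set, in backtrack (DFS) order:
    each set is extended only by vertices larger than its largest member."""
    def extend(S, start):
        for v in range(start, n + 1):
            if independent_with(v, S):
                T = S + [v]
                yield T
                yield from extend(T, v + 1)
    yield from extend([], 1)
```

With this edge set the generator reproduces the printed list exactly, in the same order.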

The question of the complexity of backtrack search is therefore the same as the question of determining

the number of independent sets of the graph G.

Some graphs have an enormous number of independent sets. The empty graph K̄_n of n vertices and no
edges whatever has 2^n independent sets of vertices. The backtrack tree will have 2^n nodes, and the search
will be a long one indeed.

The complete graph K_n of n vertices and every possible edge, n(n − 1)/2 in all, has just n + 1 independent
sets of vertices.

Any other graph G of n vertices will have a number of independent sets that lies between these two
extremes of n + 1 and 2^n. Sometimes backtracking will take an exponentially long time, and sometimes it
will be fairly quick. Now the question is, on the average how fast is the backtrack method for this problem?

What we are asking for is the average number of independent sets that a graph of n vertices has. But
that is the sum, over all vertex subsets S ⊆ {1, . . . , n}, of the probability that S is an independent set. If
S has k vertices, then the probability that S is independent is the probability that, among the k(k − 1)/2
possible edges that might join a pair of vertices in S, exactly zero of these edges actually live in the random
graph G. Since each of these C(k, 2) edges has a probability 1/2 of appearing in G, the probability that
none of them appear is 2^{−k(k−1)/2}. Hence the average number of independent sets in a graph of n
vertices is

    I_n = Σ_{k=0}^{n} C(n, k) 2^{−k(k−1)/2}.    (5.6.1)


Hence in (5.6.1) we have an exact formula for the average number of independent sets in a graph of n
vertices. A short table of values of I_n is shown below, in Table 5.6.1, along with values of 2^n, for comparison.
Clearly the average number of independent sets in a graph is a lot smaller than the maximum number that
graphs of that size might have.

     n      I_n                2^n
     2      3.5                  4
     3      5.6                  8
     4      8.5                 16
     5     12.3                 32
    10     52                 1024
    15    149.8              32768
    20    350.6            1048576
    30   1342.5         1073741824
    40   3862.9      1099511627776

Table 5.6.1: Independent sets and all sets
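Equation (5.6.1) is easy to evaluate directly; a few lines of Python (the function name is a choice made here) reproduce Table 5.6.1:

```python
from math import comb

def avg_independent_sets(n):
    """I_n of equation (5.6.1): the average number of independent
    sets in a random graph of n vertices."""
    return sum(comb(n, k) * 2.0 ** (-k * (k - 1) / 2) for k in range(n + 1))
```

For example, avg_independent_sets(4) returns 8.515625, which rounds to the table entry 8.5.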

In the exercises it will be seen that the rate of growth of I_n as n grows large is O(n^{log n}). Hence the
average amount of labor in a backtrack search for the largest independent set in a graph grows
subexponentially, although faster than polynomially. It is some indication of how hard this problem is that
even on the average the amount of labor needed is not of polynomial growth.

Exercises for section 5.6

1. What is the average number of independent sets of size k that are in graphs of V vertices and E edges?

2. Let t_k denote the kth term in the sum (5.6.1).
(a) Show that t_k/t_{k−1} = (n − k + 1)/(k 2^{k−1}).
(b) Show that t_k/t_{k−1} is > 1 when k is small, then is < 1 after k passes a certain critical value k_0. Hence
show that the terms in the sum (5.6.1) increase in size until k = k_0 and then decrease.

3. Now we will estimate the size of k_0 in the previous problem.
(a) Show that t_k/t_{k−1} < 1 when k = ⌈log_2 n⌉ and t_k/t_{k−1} > 1 when k = ⌈log_2 n − log_2 log_2 n⌉. Hence
the index k_0 of the largest term in (5.6.1) satisfies

    ⌈log_2 n − log_2 log_2 n⌉ ≤ k_0 ≤ ⌈log_2 n⌉.

(b) The entire sum in (5.6.1) is at most n + 1 times as large as its largest single term. Use Stirling's formula
(1.1.10) and 3(a) above to show that the k_0th term is O(n^{log n}) and therefore the same is true of the
whole sum, i.e., of I_n.
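The ratio in exercise 2(a), as reconstructed here, can be spot-checked with exact rational arithmetic (the choice n = 12 is arbitrary):

```python
from fractions import Fraction
from math import comb

def term(n, k):
    """The kth term of the sum (5.6.1), as an exact rational number."""
    return comb(n, k) * Fraction(1, 2 ** (k * (k - 1) // 2))

# t_k / t_{k-1} = (n - k + 1) / (k * 2^(k-1)) for every admissible k
n = 12
for k in range(1, n + 1):
    assert term(n, k) / term(n, k - 1) == Fraction(n - k + 1, k * 2 ** (k - 1))
```

The ratios also behave as exercise 2(b) claims: they start above 1 and decrease steadily, crossing 1 exactly once.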

5.7 Backtracking (II): graph coloring

In another NP-complete problem, that of graph-coloring, the average amount of labor in a backtrack

search is O(1) (bounded) as n, the number of vertices in the graph, grows without bound. More precisely,

for ﬁxed K, if we ask ‘Is the graph G, of V vertices, properly vertex-colorable in K colors?,’ then the average

labor in a backtrack search for the answer is bounded. Hence not only is the average of polynomial growth,

but the polynomial is of degree 0 (in V ).

To be even more speciﬁc, consider the case of 3 colors. It is already NP-complete to ask if the vertices of

a given graph can be colored in 3 colors. Nevertheless, the average number of nodes in the backtrack search

tree for this problem is about 197, averaged over all graphs of all sizes. This means that if we input a random

graph of 1,000,000 vertices, and ask if it is 3-colorable, then we can expect an answer (probably ‘No’) after

only about 197 steps of computation.

To prove this we will need some preliminary lemmas.


Lemma 5.7.1. Let s_1, . . . , s_K be nonnegative numbers whose sum is L. Then the sum of their squares is
at least L^2/K.

Proof: We have

    0 ≤ Σ_{i=1}^{K} (s_i − L/K)^2 = Σ_{i=1}^{K} (s_i^2 − 2Ls_i/K + L^2/K^2)
      = Σ_{i=1}^{K} s_i^2 − 2L^2/K + L^2/K
      = Σ_{i=1}^{K} s_i^2 − L^2/K.
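A quick random spot check of the inequality (the magnitudes and trial count are arbitrary choices):

```python
import random

def sum_of_squares_bound_holds(s):
    """Check lemma 5.7.1 for one list s: sum s_i^2 >= (sum s_i)^2 / K,
    with a small tolerance for floating-point roundoff."""
    K, L = len(s), sum(s)
    return sum(x * x for x in s) >= L * L / K - 1e-9

random.seed(1)
assert all(sum_of_squares_bound_holds([random.uniform(0, 10) for _ in range(5)])
           for _ in range(1000))
```

Equality holds exactly when all the s_i are equal to L/K.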

The next lemma deals with a kind of inside-out chromatic polynomial question. Instead of asking ‘How

many proper colorings can a given graph have?,’ we ask ‘How many graphs can have a given proper coloring?’

Lemma 5.7.2. Let C be one of the K^L possible ways to color in K colors a set of L abstract vertices
1, 2, . . . , L. Then the number of graphs G whose vertex set is that set of L colored vertices and for which C
is a proper coloring of G is at most 2^{L^2(1−1/K)/2}.

Proof: In the coloring C, suppose s_1 vertices get color 1, . . . , s_K vertices get color K, where, of course,
s_1 + · · · + s_K = L. If a graph G is to admit C as a proper vertex coloring then its edges can be drawn only
between vertices of different colors. The number of edges that G might have is therefore

    s_1 s_2 + s_1 s_3 + · · · + s_1 s_K + s_2 s_3 + · · · + s_2 s_K + · · · + s_{K−1} s_K,

for which we have the following estimate:

    Σ_{1≤i<j≤K} s_i s_j = (1/2) Σ_{i≠j} s_i s_j
                        = (1/2) { Σ_{i,j=1}^{K} s_i s_j − Σ_{i=1}^{K} s_i^2 }
                        = (1/2) (Σ_i s_i)^2 − (1/2) Σ_i s_i^2
                        ≤ L^2/2 − L^2/(2K)    (by lemma 5.7.1)
                        = (L^2/2)(1 − 1/K).    (5.7.1)

The number of possible graphs G is therefore at most 2^{L^2(1−1/K)/2}.

Lemma 5.7.3. The total number of proper colorings in K colors of all graphs of L vertices is at most
K^L 2^{L^2(1−1/K)/2}.

Proof: We are counting the pairs (G, C), where the graph G has L vertices and C is a proper coloring of
G. If we keep C fixed and sum on G, then by lemma 5.7.2 the sum is at most 2^{L^2(1−1/K)/2}. Since there
are K^L such C's, the proof is finished.

Now let’s think about a backtrack search for a K-coloring of a graph. Begin by using color 1 on vertex

1. Then use color 1 on vertex 2 unless (1, 2) is an edge, in which case use color 2. As the coloring progresses

through vertices 1, 2, . . . , L we color each new vertex with the lowest available color number that does not

cause a conﬂict with some vertex that has previously been colored.
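The node-counting that the lemmas below carry out analytically can also be done empirically. In the sketch that follows, the 4-vertex adjacency list is a hypothetical stand-in for Fig. 5.7.1 (a triangle on vertices 2, 3, 4 with the extra edge 1-2), chosen to match the 2-coloring trace just described:

```python
def count_tree_nodes(adj, K):
    """Count the nodes of the backtrack search tree for K-coloring the
    graph, root included: vertex v may receive any color different from
    that of every already-colored neighbor, exactly as in the text."""
    n = len(adj)
    nodes = 1                       # the root of the search tree
    def grow(colors):
        nonlocal nodes
        v = len(colors)
        if v == n:
            return
        for c in range(K):
            if all(colors[u] != c for u in adj[v] if u < v):
                nodes += 1
                grow(colors + [c])
    grow([])
    return nodes

# vertices 0..3 stand for the text's 1..4; edges 1-2, 2-3, 2-4, 3-4
adj = [[1], [0, 2, 3], [1, 3], [1, 2]]
```

For this graph count_tree_nodes(adj, 2) visits 7 nodes and reaches no complete coloring, while count_tree_nodes(adj, 3) visits 34 nodes, of which the 12 at the bottom level are the successful colorings (cf. Fig. 5.7.3).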


At some stage we may reach a dead end: out of colors, but not out of vertices to color. In the graph of

Fig. 5.7.1 if we try to 2-color the vertices we can color vertex 1 in color 1, vertex 2 in color 2, vertex 3 in

color 1 and then we’d be stuck because neither color would work on vertex 4.

Fig. 5.7.1: Color this graph

When a dead end is reached, back up to the most recently colored vertex for which other color choices

are available, replace its color with the next available choice, and try again to push forward to the next

vertex.

The (futile) attempt to color the graph in Fig. 5.7.1 with 2 colors by the backtrack method can be

portrayed by the backtrack search tree in Fig. 5.7.2.

The search is thought of as beginning at ‘Root.’ The label at each node of the tree describes the

colors of the vertices that have so far been colored. Thus ‘212’ means that vertices 1,2,3 have been colored,

respectively, in colors 2,1,2.

Fig. 5.7.2: A frustrated search tree

Fig. 5.7.3: A happy search tree


If instead we use 3 colors on the graph of Fig. 5.7.1 then we get a successful coloring; in fact we get 12

of them, as is shown in Fig. 5.7.3.

Let’s concentrate on a particular level of the search tree. Level 2, for instance, consists of the nodes of

the search tree that are at a distance 2 from ‘Root.’ In Fig. 5.7.3, level 2 contains 6 nodes, corresponding

to the partial colorings 12, 13, 21, 23, 31, 32 of the graph. When the coloring reaches vertex 2 it has seen

only the portion of the graph G that is induced by vertices 1 and 2.

Generally, a node at level L of the backtrack search tree corresponds to a proper coloring in K colors

of the subgraph of G that is induced by vertices 1, 2, . . . , L.

Let H_L(G) denote that subgraph. Then we see the truth of

Lemma 5.7.4. The number of nodes at level L of the backtrack search tree for coloring a graph G in K
colors is equal to the number of proper colorings of H_L(G) in K colors, i.e., to P(K, H_L(G)), where P is
the chromatic polynomial.

We are now ready for the main question of this section: what is the average number of nodes in a
backtrack search tree for K-coloring graphs of n vertices? This is

    A(n, K) = (1 / no. of graphs) Σ_{graphs G_n} {no. of nodes in tree for G}
            = 2^{−C(n,2)} Σ_{G_n} Σ_{L=0}^{n} {no. of nodes at level L}
            = 2^{−C(n,2)} Σ_{G_n} Σ_{L=0}^{n} P(K, H_L(G))    (by lemma 5.7.4)
            = 2^{−C(n,2)} Σ_{L=0}^{n} Σ_{G_n} P(K, H_L(G)).    (5.7.2)

Fix some value of L and consider the inner sum. As G runs over all graphs of n vertices, H_L(G) selects
the subgraph of G that is induced by vertices 1, 2, . . . , L. Now lots of graphs G of n vertices have the same
H_L(G) sitting at vertices 1, 2, . . . , L. In fact exactly 2^{C(n,2)−C(L,2)} different graphs G of n vertices all
have the same graph H of L vertices in residence at vertices 1, 2, . . . , L (see exercise 15 of section 1.6).
Hence (5.7.2) gives

    A(n, K) = 2^{−C(n,2)} Σ_{L=0}^{n} 2^{C(n,2)−C(L,2)} { Σ_{H_L} P(K, H) }
            = Σ_{L=0}^{n} 2^{−C(L,2)} { Σ_{H_L} P(K, H) }.

The inner sum is exactly the number that is counted by lemma 5.7.3, and so

    A(n, K) ≤ Σ_{L=0}^{n} 2^{−C(L,2)} K^L 2^{L^2(1−1/K)/2} ≤ Σ_{L=0}^{∞} K^L 2^{L/2} 2^{−L^2/(2K)}.

The infinite series actually converges! Hence A(n, K) is bounded, for all n. This proves

Theorem 5.7.1. Let A(n, K) denote the average number of nodes in the backtrack search trees for K-
coloring the vertices of all graphs of n vertices. Then there is a constant h = h(K), that depends on the
number of colors, K, but not on n, such that A(n, K) ≤ h(K) for all n.
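The convergence of the bounding series, and hence a numerical value for a bound of the kind h(K), can be seen directly (the cutoff of 400 terms is an arbitrary choice made here; later terms underflow to zero):

```python
from math import log2

def series_bound(K, terms=400):
    """Partial sum of  sum_{L>=0} K^L 2^(L/2) 2^(-L^2/(2K)),
    the series that bounds A(n, K) for every n."""
    return sum(2.0 ** (L * log2(K) + L / 2 - L * L / (2 * K))
               for L in range(terms))
```

Because of the −L^2/(2K) in the exponent, the terms eventually die off extremely fast, so the partial sums stabilize.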


5.8 Approximate algorithms for hard problems

Finally we come to Type III of the three kinds of ‘half-a-loaf-is-better-than-none’ algorithms that were

described in section 5.5. In these algorithms we don’t ﬁnd the exact solution of the problem, only an

approximate one. As consolation we have an algorithm that runs in polynomial time as well as a performance

guarantee to the eﬀect that while the answer is approximate, it can certainly deviate by no more than such-

and-such from the exact answer.

An elegant example of such a situation is in the Travelling Salesman Problem, which we will now express

as an optimization problem rather than as a decision problem.

We are given n points (‘cities’) in the plane, as well as the distances between every pair of them, and we

are asked to ﬁnd a round-trip tour of all of these cities that has minimum length. We will assume throughout

the following discussion that the distances satisfy the triangle inequality. This restriction of the TSP is often

called the ‘Euclidean’ Travelling Salesman Problem.

The algorithm that we will discuss for this problem has the properties

(a) it runs in polynomial time and

(b) the round-trip tour that it ﬁnds will never be more than twice as long as the shortest possible tour.

The ﬁrst step in carrying out the algorithm is to ﬁnd a minimum spanning tree (MST) for the n given

cities. A MST is a tree whose nodes are the cities in question, and which, among all possible trees on that

vertex set, has minimum possible length.

It may seem that ﬁnding a MST is just as hard as solving the TSP, but NIN (No, It’s Not). The MST

problem is one of those all-too-rare computational situations in which it pays to be greedy.

Generally speaking, in a greedy algorithm,

(i) we are trying to construct some optimal structure by adding one piece at a time, and

(ii) at each step we make the decision about which piece will be added next by choosing, among all

available pieces, the single one that will carry us as far as possible in the desirable direction (be

greedy!).

The reason that greedy algorithms are not usually the best possible ones is that it may be better not

to take the single best piece at each step, but to take some other piece, in the hope that at a later step we

will be able to improve things even more. In other words, the global problem of ﬁnding the best structure

might not be solvable by the local procedure of being as greedy as possible at each step.

In the MST problem, though, the greedy strategy works, as we see in the following algorithm.

procedure mst(x: array of n points in the plane);
{constructs a spanning tree T of minimum length, on the
vertices {x_1, . . . , x_n} in the plane}
let T consist of a single vertex x_1;
while T has fewer than n vertices do
    for each vertex v that is not yet in T, find the
        distance d(v) from v to the nearest vertex of T;
    let v* be a vertex of smallest d(v);
    adjoin v* to the vertex set of T;
    adjoin to T the edge from v* to the nearest
        vertex w ≠ v* of T;
end {while}
end. {mst}
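In Python, with the cities given by coordinates, procedure mst becomes the following (the quadratic-time scan mirrors the pseudocode; a faster version would keep the d(v) values in a priority queue):

```python
from math import dist

def mst(points):
    """Prim-style greedy MST, as in procedure mst: repeatedly adjoin the
    vertex v* nearest to the growing tree T.  Returns the tree edges
    as pairs of point indices."""
    n = len(points)
    in_tree = [False] * n
    in_tree[0] = True                                   # T starts as {x_1}
    d = [dist(points[0], p) for p in points]            # d(v) of the procedure
    nearest = [0] * n                                   # nearest tree vertex to v
    edges = []
    for _ in range(n - 1):
        v = min((i for i in range(n) if not in_tree[i]), key=d.__getitem__)
        in_tree[v] = True
        edges.append((nearest[v], v))
        for u in range(n):                              # update the d(v) values
            if not in_tree[u] and dist(points[v], points[u]) < d[u]:
                d[u] = dist(points[v], points[u])
                nearest[u] = v
    return edges
```

For four points forming an L-shape, the tree uses the three unit-length edges, total length 3.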

Proof of correctness of mst: Let T be the tree that is produced by running mst, and let e_1, . . . , e_{n−1} be
its edges, listed in the same order in which the algorithm mst produced them.

Let T′ be a minimum spanning tree for x. Let e_r be the first edge of T that does not appear in T′. In
the minimum tree T′, edges e_1, . . . , e_{r−1} all appear, and we let S be the union of their vertex sets. In T′
let f be the edge that joins the subtree on S to the subtree on the remaining vertices of x.

Suppose f is shorter than e_r. Then f was one of the edges that was available to the algorithm mst
at the instant that it chose e_r, and since e_r was the shortest edge available at that moment, we have a
contradiction.

Suppose f is longer than e_r. Then T′ would not be minimal, because the tree that we would obtain by
exchanging f for e_r in T′ (why is it still a tree if we do that exchange?) would be shorter, contradicting the
minimality of T′.

Hence f and e_r have the same length. In T′ exchange f for e_r. Then T′ is still a tree, and is still a
minimum spanning tree.

The index of the first edge of T that does not appear in T′ is now at least r + 1, one unit larger than
before. The process of replacing edges of T that do not appear in T′ without affecting the minimality of T′
can be repeated until every edge of T appears in T′, i.e., until T = T′. Hence T was a minimum spanning
tree.

That ﬁnishes one step of the process that leads to a polynomial time travelling salesman algorithm that

ﬁnds a tour of at most twice the minimum length.

The next step involves ﬁnding an Euler circuit. Way back in theorem 1.6.1 we learned that a connected

graph has an Euler circuit if and only if every vertex has even degree. Recall that the proof was recursive

in nature, and immediately implies a linear time algorithm for ﬁnding Euler circuits recursively. We also

noted that the proof remains valid even if we are dealing with a multigraph, that is, with a graph in which

several edges are permitted between single pairs of vertices. We will in fact need that extra ﬂexibility for

the purpose at hand.
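The recursive proof of theorem 1.6.1 translates into a short linear-time construction (Hierholzer's method, written here for the multigraph case that we need; a doubled edge simply appears twice in each adjacency list):

```python
from collections import defaultdict

def euler_circuit(adj):
    """Return an Euler circuit of a connected multigraph in which every
    vertex has even degree.  adj maps a vertex to a list of neighbors;
    parallel edges are repeated entries in the lists."""
    remaining = defaultdict(list, {v: list(ns) for v, ns in adj.items()})
    stack = [next(iter(adj))]
    circuit = []
    while stack:
        v = stack[-1]
        if remaining[v]:                 # follow any unused edge out of v
            u = remaining[v].pop()
            remaining[u].remove(v)       # consume one copy of the edge (v, u)
            stack.append(u)
        else:                            # v exhausted: splice it into the circuit
            circuit.append(stack.pop())
    return circuit[::-1]
```

For the doubled path 0-1-2, for instance, the result is a closed walk that uses each of the four edge copies exactly once.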

Now we have the ingredients for a quick near-optimal travelling salesman tour.

Theorem 5.8.1. There is an algorithm that operates in polynomial time and which will return a travelling

salesman tour whose length is at most twice the length of a minimum tour.

Here is the algorithm. Given the n cities in the plane:

(1) Find a minimum spanning tree T for the cities.

(2) Double each edge of the tree, thereby obtaining a ‘multitree’ T^{(2)} in which between each pair of
vertices there are 0 or 2 edges.

(3) Since every vertex of the doubled tree has even degree, there is an Eulerian tour W of the edges of
T^{(2)}; find one, as in the proof of theorem 1.6.1.

(4) Now we construct the output tour of the cities. Begin at some city and follow the walk W. However,
having arrived at some vertex v, go from v directly (via a straight line) to the next vertex of the walk
W that you haven’t visited yet. This means that you will often short-circuit portions of the walk W
by going directly from some vertex to another one that is several edges ‘down the road.’
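Steps (1)-(4) can be sketched compactly. In the sketch below (the function names and the test points are choices made here), the Euler walk of the doubled tree together with the shortcutting of step (4) collapses into a depth-first preorder traversal of the MST, which visits the cities in the same shortcut order:

```python
from math import dist

def approx_tsp(points):
    """Return a tour (as an index list) of length at most twice optimal,
    by walking a minimum spanning tree and shortcutting repeated visits."""
    n = len(points)
    # step (1): Prim's minimum spanning tree, recorded as child lists
    in_tree = [False] * n
    in_tree[0] = True
    d = [dist(points[0], p) for p in points]
    nearest = [0] * n
    children = [[] for _ in range(n)]
    for _ in range(n - 1):
        v = min((i for i in range(n) if not in_tree[i]), key=d.__getitem__)
        in_tree[v] = True
        children[nearest[v]].append(v)
        for u in range(n):
            if not in_tree[u] and dist(points[v], points[u]) < d[u]:
                d[u], nearest[u] = dist(points[v], points[u]), v
    # steps (2)-(4): the Euler walk of the doubled tree, with every repeat
    # shortcut, is exactly a preorder traversal of the tree
    tour, stack = [], [0]
    while stack:
        v = stack.pop()
        tour.append(v)
        stack.extend(reversed(children[v]))
    return tour

def tour_length(points, tour):
    m = len(tour)
    return sum(dist(points[tour[i]], points[tour[(i + 1) % m]]) for i in range(m))
```

On small instances the guarantee can be checked against a brute-force optimum over all orderings.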

The tour Z′ that results from (4) above is indeed a tour of all of the cities in which each city is visited
once and only once. We claim that its length is at most twice optimal.

Let Z be an optimum tour, and let e be some edge of Z. Then Z − e is a path that visits all of the cities.
Since a path is a tree, Z − e is a spanning tree of the cities, hence Z − e is at least as long as T is, and so Z
is surely at least as long as T is.

Next consider the length of the tour Z′. A step of Z′ that walks along an edge of the walk W has length
equal to the length of that edge of W. A step of Z′ that short-circuits several edges of W has length at most
equal to the sum of the lengths of the edges of W that were short-circuited. If we sum these inequalities
over all steps of Z′ we find that the length of Z′ is at most equal to the length of W, which is in turn twice
the length of the tree T.

If we put all of this together we find that

    length(Z) > length(Z − e) ≥ length(T) = (1/2) length(W) ≥ (1/2) length(Z′),

as claimed (!)

More recently it has been proved (Cristoﬁdes, 1976) that in polynomial time we can ﬁnd a TSP tour

whose total length is at most 3/2 as long as the minimum tour. The algorithm makes use of Edmonds’s

algorithm for maximum matching in a general graph (see the reference at the end of Chapter 3). It will be

interesting to see if the factor 3/2 can be further reﬁned.

Polynomial time algorithms are known for other NP-complete problems that guarantee that the answer

obtained will not exceed, by more than a constant factor, the optimum answer. In some cases the guarantees

apply to the diﬀerence between the answer that the algorithm gives and the best one. See the references

below for more information.


Exercises for section 5.8

1. Consider the following algorithm:

procedure mst2(x: array of n points in the plane);
{allegedly finds a tree of minimum total length that
visits every one of the given points}
if n = 1
    then T := {x_1}
else
    T := mst2(n − 1, x − x_n);
    let u be the vertex of T that is nearest to x_n;
    mst2 := T plus vertex x_n plus edge (x_n, u)
end. {mst2}

Is this algorithm a correct recursive formulation of the minimum spanning tree greedy algorithm? If so then

prove it, and if not then give an example of a set of points where mst2 gets the wrong answer.

Bibliography

Before we list some books and journal articles it should be mentioned that research in the area of

NP-completeness is moving rapidly, and the state of the art is changing all the time. Readers who would

like updates on the subject are referred to a series of articles that have appeared in issues of the Journal

of Algorithms in recent years. These are called ‘NP-completeness: An ongoing guide.’ They are written

by David S. Johnson, and each of them is a thorough survey of recent progress in one particular area of

NP-completeness research. They are written as updates of the ﬁrst reference below.

Journals that contain a good deal of research on the areas of this chapter include the Journal of Algo-

rithms, the Journal of the Association for Computing Machinery, the SIAM Journal of Computing, Infor-

mation Processing Letters, and SIAM Journal of Discrete Mathematics.

The most complete reference on NP-completeness is

M. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, W.

H. Freeman and Co., San Francisco, 1979.

The above is highly recommended. It is readable, careful and complete.

The earliest ideas on the computational intractability of certain problems go back to

Alan Turing, On computable numbers, with an application to the Entscheidungsproblem, Proc. London

Math. Soc., Ser. 2, 42 (1936), 230-265.

Cook’s theorem, which originated the subject of NP-completeness, is in

S. A. Cook, The complexity of theorem proving procedures, Proc. Third Annual ACM Symposium on the

Theory of Computing, ACM, New York, 1971, 151-158.

After Cook’s work was done, a large number of NP-complete problems were found by

Richard M. Karp, Reducibility among combinatorial problems, in R. E. Miller and J. W. Thatcher, eds.,

Complexity of Computer Computations, Plenum, New York, 1972, 85-103.

The above paper is recommended both for its content and its clarity of presentation.

The approximate algorithm for the travelling salesman problem is in

D. J. Rosenkrantz, R. E. Stearns and P. M. Lewis, An analysis of several heuristics for the travelling salesman

problem, SIAM J. Comp. 6, 1977, 563-581.

Another approximate algorithm for the Euclidean TSP, which guarantees that the solution found is no more

than 3/2 times as long as the optimum tour, was found by

N. Christofides, Worst case analysis of a new heuristic for the travelling salesman problem, Technical Report,

Graduate School of Industrial Administration, Carnegie-Mellon University, Pittsburgh, 1976.

The minimum spanning tree algorithm is due to

R. C. Prim, Shortest connection networks and some generalizations, Bell System Tech. J. 36 (1957), 1389-

1401.

The probabilistic algorithm for the Hamilton path problem can be found in


D. Angluin and L. G. Valiant, Fast probabilistic algorithms for Hamilton circuits and matchings, Proc. Ninth

Annual ACM Symposium on the Theory of Computing, ACM, New York, 1977.

The result that the graph coloring problem can be done in constant average time is due to

H. Wilf, Backtrack: An O(1) average time algorithm for the graph coloring problem, Information Processing

Letters 18 (1984), 119-122.

Further reﬁnements of the above result can be found in

E. Bender and H. S. Wilf, A theoretical analysis of backtracking in the graph coloring problem, Journal of

Algorithms 6 (1985), 275-282.

If you enjoyed the average numbers of independent sets and average complexity of backtrack, you might

enjoy the subject of random graphs. An excellent introduction to the subject is

Edgar M. Palmer, Graphical Evolution: An Introduction to the Theory of Random Graphs, Wiley-Interscience,

New York, 1985.


Index


adjacent 40

Adleman, L. 149, 164, 165, 176

Aho, A. V. 103

Angluin, D. 208-211, 227

Appel, K. 69

average complexity 57, 211ﬀ.

backtracking 211ﬀ.

Bender, E. 227

Bentley, J. 54

Berger, R. 3

big oh 9

binary system 19

bin-packing 178

binomial theorem 37

bipartite graph 44, 182

binomial coeﬃcients 35

—, growth of 38

blocking ﬂow 124

Burnside’s lemma 46

cardinality 35

canonical factorization 138

capacity of a cut 115

Carmichael numbers 158

certiﬁcate 171, 182, 193

Cherkassky, B. V. 135

Chinese remainder theorem 154

chromatic number 44

chromatic polynomial 73

Cohen, H. 176

coloring graphs 43

complement of a graph 44

complexity 1

—, worst-case 4

connected 41

Cook, S. 187, 194-201, 226

Cook’s theorem 195ﬀ.

Cooley, J. M. 103

Coppersmith, D. 99

cryptography 165

Christofides, N. 224, 227

cut in a network 115

—, capacity of 115

cycle 41

cyclic group 152

decimal system 19

decision problem 181

degree of a vertex 40

deterministic 193

Diﬃe, W. 176

digraph 105

Dinic, E. 108, 134

divide 137

Dixon, J. D. 170, 175, 177

domino problem 3

‘easy’ computation 1

edge coloring 206

edge connectivity 132


Edmonds, J. 107, 134, 224

Enslein, K. 103

Euclidean algorithm 140, 168

—, complexity 142

—, extended 144ﬀ.

Euler totient function 138, 157

Eulerian circuit 41

Even, S. 135

exponential growth 13

factor base 169

Fermat’s theorem 152, 159

FFT, complexity of 93

—, applications of 95ﬀ.

Fibonacci numbers 30, 76, 144

ﬂow 106

—, value of 106

—, augmentation 109

—, blocking 124

ﬂow augmenting path 109

Ford-Fulkerson algorithm 108ﬀ.

Ford, L. 107ﬀ.

four-color theorem 68

Fourier transform 83ﬀ.

—, discrete 83

—, inverse 96

Fulkerson, D. E. 107ﬀ.

Galil, Z. 135

Gardner, M. 2

Garey, M. 188

geometric series 23

Gomory, R. E. 136

graphs 40ﬀ.

—, coloring of 43, 183, 216ﬀ.

—, connected 41

—, complement of 44

—, complete 44

—, empty 44

—, bipartite 44

—, planar 70

greatest common divisor 138

group of units 151

Haken, W. 69

Hamiltonian circuit 41, 206, 208ﬀ.

Hardy, G. H. 175

height of network 125

Hellman, M. E. 176

hexadecimal system 21

hierarchy of growth 11

Hoare, C. A. R. 51

Hopcroft, J. 70, 103

Hu, T. C. 136

independent set 61, 179, 211ﬀ.

intractable 5

Johnson, D. S. 188, 225, 226

Karp, R. 107, 134, 205, 226

Karzanov, A. 134

Knuth, D. E. 102

König, H. 103


k-subset 35

language 182

Lawler, E. 99

layered network 120ﬀ.

Lenstra, H. W., Jr. 176

LeVeque, W. J. 175

Lewis, P. A. W. 103

Lewis, P. M. 227

L’Hospital’s rule 12

little oh 8

Lomuto, N. 54

Maheshwari, S. N. 108ﬀ., 135

Malhotra, V. M. 108ﬀ., 135

matrix multiplication 77ﬀ.

max-ﬂow-min-cut 115

maximum matching 130

minimum spanning tree 221

moderately exponential growth 12

MPM algorithm 108, 128ﬀ.

MST 221

multigraph 42

network 105

— ﬂow 105ﬀ.

—, dense 107

—, layered 108, 120ﬀ.

—, height of 125

Nijenhuis, A. 60

nondeterministic 193

NP 182

NP-complete 61, 180

NP-completeness 178ﬀ.

octal system 21

optimization problem 181

orders of magnitude 6ﬀ.

P 182

Palmer, E. M. 228

Pan, V. 103

Pascal’s triangle 36

path 41

periodic function 87

polynomial time 2, 179, 185

polynomials, multiplication of 96

Pomerance, C. 149, 164, 176

positional number systems 19ﬀ.

Pramodh-Kumar, M. 108ﬀ., 135

Pratt, V. 171, 172

Prim, R. C. 227

primality, testing 6, 148ﬀ., 186

—, proving 170

prime number 5

primitive root 152

pseudoprimality test 149, 156ﬀ.

—, strong 158

public key encryption 150, 165

Quicksort 50ﬀ.

Rabin, M. O. 149, 162, 175

Ralston, A. 103


recurrence relations 26ﬀ.

recurrent inequality 31

recursive algorithms 48ﬀ.

reducibility 185

relatively prime 138

ring Z_n 151ﬀ.

Rivest, R. 165, 176

roots of unity 86

Rosenkrantz, D. 227

RSA system 165, 168

Rumely, R. 149, 164, 176

Runge, C. 103

SAT 195

satisﬁability 187, 195

scanned vertex 111

Sch¨ onhage, A. 103

Selfridge, J. 176

Shamir, A. 165, 176

slowsort 50

Solovay, R. 149, 162, 176

splitter 52

Stearns, R. E. 227

Stirling’s formula 16, 216

Strassen, V. 78, 103, 149, 162, 176

synthetic division 86

3SAT 201

target sum 206

Tarjan, R. E. 66, 70, 103, 135

Θ (‘Theta of’) 10

tiling 2

tractable 5

travelling salesman problem 178, 184, 221

tree 45

Trojanowski, A. 66, 103

‘TSP’ 178, 221

Tukey, J. W. 103

Turing, A. 226

Turing machine 187ﬀ.

Ullman, J. D. 103

usable edge 111

Valiant, L. 208-211, 227

vertices 40

Vizing, V. 206

Wagstaﬀ, S. 176

Welch, P. D. 103

Wilf, H. 60, 103, 227, 228

Winograd, S. 99

worst-case 4, 180

Wright, E. M. 175


CONTENTS

Chapter 0: What This Book Is About 0.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 0.2 Hard vs. easy problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 0.3 A preview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 Chapter 1: Mathematical Preliminaries 1.1 1.2 1.3 1.4 1.5 1.6 Orders of magnitude . . Positional number systems Manipulations with series Recurrence relations . . . Counting . . . . . . . Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 . 11 . 14 . 16 . 21 . 24

Chapter 2: Recursive Algorithms 2.1 2.2 2.3 2.4 2.5 2.6 2.7 Introduction . . . . . . . . Quicksort . . . . . . . . . Recursive graph algorithms . . Fast matrix multiplication . . The discrete Fourier transform Applications of the FFT . . . A review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 31 38 47 50 56 60

Chapter 3: The Network Flow Problem 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 Introduction . . . . . . . . . . . . . . . . Algorithms for the network ﬂow problem . . . . The algorithm of Ford and Fulkerson . . . . . The max-ﬂow min-cut theorem . . . . . . . . The complexity of the Ford-Fulkerson algorithm Layered networks . . . . . . . . . . . . . . The MPM Algorithm . . . . . . . . . . . . Applications of network ﬂow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 64 65 69 70 72 76 77

Chapter 4: Algorithms in the Theory of Numbers 4.1 4.2 4.3 4.4 4.5 4.6 4.7 4.8 4.9 4.10 Preliminaries . . . . . . . . . . . . . . . . . The greatest common divisor . . . . . . . . . . The extended Euclidean algorithm . . . . . . . Primality testing . . . . . . . . . . . . . . . Interlude: the ring of integers modulo n . . . . . Pseudoprimality tests . . . . . . . . . . . . . Proof of goodness of the strong pseudoprimality test Factoring and cryptography . . . . . . . . . . Factoring large integers . . . . . . . . . . . . Proving primality . . . . . . . . . . . . . . . iii . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 . 82 . 85 . 87 . 89 . 92 . 94 . 97 . 99 . 100

Chapter 5: NP-completeness 5.1 5.2 5.3 5.4 5.5 5.6 5.7 5.8 Introduction . . . . . . . . . . . . . Turing machines . . . . . . . . . . . Cook’s theorem . . . . . . . . . . . . Some other NP-complete problems . . . Half a loaf ... . . . . . . . . . . . . . Backtracking (I): independent sets . . . Backtracking (II): graph coloring . . . . Approximate algorithms for hard problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104 109 112 116 119 122 124 128

iv

It was. for example. but will be obvious when encountered. Chapter 4. a pleasure for me to have had the chance to typeset my own book. including seniors and graduate students. This book was typeset in TEX. Herbert S. but is strongly connected to combinatorial algorithms in general. The recursive graph algorithms are particularly recommended since they are usually quite foreign to students’ previous experience and therefore have great learning value. might be studied as time permits. In addition to the exercises that appear in this book. These are not mentioned explicitly. then. Any of the algorithms of Chapter 2 would be suitable for this purpose. Throughout the book there are opportunities to ask students to write programs and get them running. one might delve into the appropriate sections of Chapter 1 to get the concepts and techniques well in hand. however. Wilf v . Finally. material from Chapter 3. For the deﬁciencies in its appearance. for generously allowing me the use of TEX facilities. Students should all have the experience of writing. After Chapter 2. as new ideas are needed in Chapter 2. with a few exceptions. Chapter 5 would be next. Selection by the instructor of topics of interest will be very important. For the no-doubt-numerous shortcomings that remain. and using a program that is nontrivially recursive. The latter might be found among the references cited in the bibliographies in each chapter. discusses material that is extremely attractive. To the extent that it’s a delight to look at. and in discrete algorithms in the senior year. Daniel Kleitman. It has also been tried out on a large class of computer science and mathematics majors. and surprisingly pure and applicable at the same time. and is helped a lot by hands-on practice. which is rather independent of the rest of the book. debugging. with good results. and particularly to Aravind Joshi. Then. The concept of recursion is subtle and powerful. on number theory. 
because normally I’ve found that I can’t cover anywhere near all of this material in a semester. I am indebted ﬁrst of all to the students on whom I worked out these ideas. I accept full responsibility. thank TEX. student assignments might consist of writing occasional programs. thank my limitations as a typesetter. My thanks to the Computer Science department of the University of Pennsylvania. and second to a number of colleagues for their helpful advice and friendly criticism. Among the latter I will mention Richard Brualdi. Robert Tarjan and Alan Tucker. This book has grown out of the senior course as I have been teaching it recently. as well as delivering reports in class on assigned readings. Albert Nijenhuis. A reasonable choice for a ﬁrst try might be to begin with Chapter 2 (recursive algorithms) which contains lots of motivation. since the foundations would then all be in place.Preface For the past several years mathematics majors in the computing track at the University of Pennsylvania have taken a course in continuous algorithms (numerical analysis) in the junior year.

One computational question is this. or whatever units are relevant.1 we show a tiling of the plane by identical rectangles. For instance. Can we or can we not succeed in tiling the whole plane? That elegant question has been proved* to be computationally unsolvable. it has been proved that there isn’t any * See. it is enough to describe a fast method for solving that problem. If a certain calculation on an n × n matrix were to require 2n minutes. not necessarily regular and not necessarily convex.2 is a tiling by regular hexagons. . In other words. then that would be a ‘hard’ problem. then the calculation is an easy one.4). The running time of the program is being given as a function of the size of the input matrix. 0. A faster program for the same job might run in 0.e. a computation that is guaranteed to take at most cn3 time for input of size n will be thought of as an ‘easy’ computation. This book is about algorithms and complexity. measured in running time. because you will have to prove to them that it is impossible to ﬁnd a fast way of doing the calculation. To convince someone that a problem is hard is hard. each shaped like a regular hexagon.Chapter 0: What This Book Is About 0. Berger. Then we can tile the whole plane with them. Some problems take a very long time. think about this statement: ‘I just bought a matrix inversion program. and it can invert an n × n matrix in just 1. otherwise it’s hard. but still. Amer. from our present point of view the important distinction to maintain will be the polynomial time guarantee or lack of it.’ Suppose* we are given inﬁnitely many identical ﬂoor tiles. The complexity of an algorithm is the cost. that algorithm may be slow. Thus. Matrix inversion is easy. instead of merely shaving the multiplicative constant. and then someone discovers a faster way to do them (a ‘faster algorithm’). To convince someone that a problem is easy. 
and in Fig.8 minutes would represent a striking improvement of the state of the art.1 Background An algorithm is a method for solving a class of problems on a computer. not only do we not know of any fast way to solve that problem on a computer. Naturally. pp. That raises a number of theoretical and computational questions. Naturally some of the computations that we are calling ‘easy’ may take a very long time to run. Suppose we are given a certain polygon. Computing takes time..2n3 minutes. and so it is about methods for solving problems on computers and the costs (usually the running time) of using those methods. After all. For the purposes of this book. 0. If someone were to make a really important discovery (see section 2. In Fig. Many problems in computer science are known to be easy. The study of the amount of computational eﬀort that is needed in order to perform certain kinds of computations is the study of computational complexity. others can be done quickly. Math. but maybe there’s a faster way. Some problems seem to take a long time. * R. Memoirs Amer. The familiar Gaussian elimination method can invert an n × n matrix in time at most cn3 . One interesting one is called the ‘tiling problem. for instance. i. This can also be done if the tiles are identical rectangles. we can cover the plane with no empty spaces left over. 66 (1966).’ We see here a typical description of the complexity of a certain algorithm. and suppose we have inﬁnitely many identical tiles in that shape. One that needs at most n10 time is also easy. So the time complexity of a calculation is measured by expressing the running time of the calculation as a function of some measure of the amount of data that is needed to describe the problem to the computer. January 1977.8n3 minutes for an n × n matrix. of using the algorithm to solve one of those problems. Soc. It will not be enough to point to a particular algorithm and to lament its slowness. 
then maybe we could actually lower the exponent. but not if they are regular pentagons. The general rule is that if the running time is at most a polynomial function of the amount of input data. a program that would invert an n × n matrix in only 7n2. To give an example of a hard computational problem we have to go far aﬁeld. we would expect that a computing problem for which millions of bits of input data are required would probably take longer than another problem that needs only a few items of input. 110-121. The undecidability of the domino problem. or storage. Martin Gardner’s article in Scientiﬁc American.

for example. and then.’ and it’s easy to see why. 15. What has been proved is that no single method exists that can guarantee that it will decide this question for every polygon. A performance guarantee. Example 2. so our algorithm might. sort any given sequence of numbers into ascending order of size (see section 2. after a certain amount of time. Math. so even looking for an algorithm would be fruitless.2 Hard vs.. All we need to input is the shape of the basic polygon. Into the box goes a description of a particular problem in that class. the sequence 1. a ‘hard’ one. Notice that the amount of input data to the computer in this example is quite small. That’s really hard! 0. Soc. the answer appears. If we have an algorithm that will. The problem is hard because we cannot devise an algorithm for which we can give a guarantee of fast performance for all instances. 7. 0. Providence. is sometimes called a ‘worst-case complexity estimate. easy problems Let’s take a moment more to say in another way exactly what we mean by an ‘easy’ computation vs. It is guaranteed that every problem that can be input with B bits of data will be solved in at most 0. and might therefore take more time. sort it very rapidly.2: Tiling with hexagons way to do it. 11. It is guaranteed that if the input problem is described with B bits of data. if it takes advantage of the near-order.2) it may ﬁnd that some sequences are easier to sort than others. Example 1. Yet not only is it impossible to devise a fast algorithm for this problem. 0.1: Tiling with rectangles Fig. 20 is nearly in order already. RI 2 . or of computational eﬀort.Chapter 0: What This Book Is About Fig.7B 15 seconds. For instance. The fact that a computational problem is hard doesn’t mean that every instance of it has to be hard. Hard problems can have easy instances. it has been proved impossible to devise any algorithm at all that is guaranteed to terminate with a Yes/No answer after ﬁnitely many steps. 2. 
then an answer will be output after at most 6B 3 minutes. A ‘fast’ algorithm is one that carries a guarantee of fast performance. Think of an algorithm as being a little box that can solve a certain class of computational problems. That doesn’t mean that the question is hard for every polygon. Here are some examples. like the two above. 10. Other sequences might be a lot harder for it to handle.

and that’s certainly a polynomial in n. fasterthan-polynomial growth of the work done with the problem size. or like 2 B . nor has anyone proved that there isn’t a fast way. if n = 59. . These are problems for which no one has found a fast computer algorithm. we obviously √ will do about n units of work. testing the primality of a given integer n. So in the problem of this example. . then an answer will appear after at most 5B 2 time units. then the average amount of computing time will be so-and-so (as a function of B). 3. else it is composite. Like eB . we ﬁnd that the complexity of this calculation is approximately 2B/2 .7). easy problems So in some problems whose input bit string has B bits the algorithm might operate in time 6B. At the present time no one has found a fast way to test for primality. For instance. Well then. Worst-case bounds are the most common kind. but not provably. the distinction was made on the basis of polynomial vs. and therefore this problem must be easy. for solving it. that cause the algorithm to do more than P (B) units of work. A computational problem is tractable if there is a fast algorithm that will do all instances of it. there exist arbitrarily large values of B. That wouldn’t guarantee performance no worse than so-and-so. but there are other kinds of bounds for running time. Reference to the distinction between fast and slow methods will show that we have to measure the amount of work done as a function of the number of bits of input to the problem. 3 . we don’t need 59 bits to describe n. Therefore. We √ want to ﬁnd out if n is prime. and that grows much faster than any polynomial function of B. n is not the number of bits of input. Primality testing belongs to the (well-populated) class of seemingly. Let n be a given integer.’ then we declare n to be a prime number. Example 3.0. For each integer m = 2. It would assure a user that if the input problem instance can be described by B bits. 
The method that we choose is the following. An algorithm is slow if. and input data strings of B bits. A computational problem is intractable if it can be proved that there is no fast algorithm for it. Right? Well no. If we express the amount of work done as a function of B. we do less than n units of work. n is of polynomial growth in n. according to our deﬁnition of fast and slow algorithms. the length of the input bit string B is about log2 n. and for still other problem instances of length B bits the algorithm might need 5B 2 time units to get the job done. We will now look at the computational complexity of this algorithm. it would state that if the performance is averaged over all possible input bit strings of B bits. etc. intractable problems. the number of binary digits in the bit string of an integer n is close to log2 n. 10B log B time units. Seen in this light. Hence a performance guarantee is equivalent to an estimation of the worst possible scenario: the longest possible calculation that might ensue if B bits are input to the program. but only 6. isn’t it? So. In this example. otherwise the guarantee wouldn’t be valid. for instance. what would the warranty card say? It would have to pick out the worst possibility. n we ask if m divides (evenly into) n. after all. Now let’s talk about the diﬀerence between easy and hard computational problems and between fast and slow algorithms. . The problem is this. the calculation suddenly seems very long. . It’s the ‘seemingly’ that makes things very interesting. In those units. the method that we have just discussed for testing the primality of a given integer is slow. If all of the answers are ‘No.2 Hard vs. See chapter 4 for further discussion of this problem. It is the polynomial time vs. and on others it might need. whatever polynomial P we think of. For a given integer n the work that we have to do can be measured in units of divisions of a whole number by another whole number. 
That means that we are going to ﬁnd out how much work is involved in doing the test. say. because. In this book we will deal with some easy problems and some seemingly hard ones. A string consisting of √ a mere log2 n 0’s and 1’s has caused our mighty computer to do about n units of work. not necessarily polynomial time guarantee that makes the diﬀerence between the easy and the hard classes of problems. For instance. or between the fast and the slow algorithms. Here is a familiar computational problem and a method. A warranty that would not guarantee ‘fast’ performance would contain some function of B that grows √ faster than any polynomial. We might give an average case bound instead (see section 5. not really. or algorithm. In general. √ It seems as though this is a tractable problem. It is highly desirable to work with algorithms such that we can give a performance guarantee for their running time that is at most a polynomial function of the number of bits of input. Let’s see if the method has a polynomial time guarantee or not.

Chapter 0: What This Book Is About but also, no one has proved the impossibility of doing so. It should be added that the entire area is vigorously being researched because of the attractiveness and the importance of the many unanswered questions that remain. Thus, even though we just don’t know many things that we’d like to know in this ﬁeld , it isn’t for lack of trying!

0.3 A preview Chapter 1 contains some of the mathematical background that will be needed for our study of algorithms. It is not intended that reading this book or using it as a text in a course must necessarily begin with Chapter 1. It’s probably a better idea to plunge into Chapter 2 directly, and then when particular skills or concepts are needed, to read the relevant portions of Chapter 1. Otherwise the deﬁnitions and ideas that are in that chapter may seem to be unmotivated, when in fact motivation in great quantity resides in the later chapters of the book. Chapter 2 deals with recursive algorithms and the analyses of their complexities. Chapter 3 is about a problem that seems as though it might be hard, but turns out to be easy, namely the network ﬂow problem. Thanks to quite recent research, there are fast algorithms for network ﬂow problems, and they have many important applications. In Chapter 4 we study algorithms in one of the oldest branches of mathematics, the theory of numbers. Remarkably, the connections between this ancient subject and the most modern research in computer methods are very strong. In Chapter 5 we will see that there is a large family of problems, including a number of very important computational questions, that are bound together by a good deal of structural unity. We don’t know if they’re hard or easy. We do know that we haven’t found a fast way to do them yet, and most people suspect that they’re hard. We also know that if any one of these problems is hard, then they all are, and if any one of them is easy, then they all are. We hope that, having found out something about what people know and what people don’t know, the reader will have enjoyed the trip through this subject and may be interested in helping to ﬁnd out a little more.

4

1.1 Orders of magnitude Chapter 1: Mathematical Preliminaries

1.1 Orders of magnitude In this section we’re going to discuss the rates of growth of diﬀerent functions and to introduce the ﬁve symbols of asymptotics that are used to describe those rates of growth. In the context of algorithms, the reason for this discussion is that we need a good language for the purpose of comparing the speeds with which diﬀerent algorithms do the same job, or the amounts of memory that they use, or whatever other measure of the complexity of the algorithm we happen to be using. Suppose we have a method of inverting square nonsingular matrices. How might we measure its speed? Most commonly we would say something like ‘if the matrix is n × n then the method will run in time 16.8n3.’ Then we would know that if a 100 × 100 matrix can be inverted, with this method, in 1 minute of computer time, then a 200 × 200 matrix would require 23 = 8 times as long, or about 8 minutes. The constant ‘16.8’ wasn’t used at all in this example; only the fact that the labor grows as the third power of the matrix size was relevant. Hence we need a language that will allow us to say that the computing time, as a function of n, grows ‘on the order of n3 ,’ or ‘at most as fast as n3 ,’ or ‘at least as fast as n5 log n,’ etc. The new symbols that are used in the language of comparing the rates of growth of functions are the following ﬁve: ‘o’ (read ‘is little oh of’), ‘O’ (read ‘is big oh of’), ‘Θ’ (read ‘is theta of’), ‘∼’ (read ‘is asymptotically equal to’ or, irreverently, as ‘twiddles’), and ‘Ω’ (read ‘is omega of’). Now let’s explain what each of them means. Let f(x) and g(x) be two functions of x. Each of the ﬁve symbols above is intended to compare the rapidity of growth of f and g. If we say that f(x) = o(g(x)), then informally we are saying that f grows more slowly than g does when x is very large. Formally, we state the Deﬁnition. We say that f(x) = o(g(x)) (x → ∞) if limx→∞ f(x)/g(x) exists and is equal to 0. 
Here are some examples: (a) x2 = o(x5 ) (b) sin x = o(x) √ (c) 14.709 x = o(x/2 + 7 cos x) (d) 1/x = o(1) (?) (e) 23 log x = o(x.02 ) We can see already from these few examples that sometimes it might be easy to prove that a ‘o’ relationship is true and sometimes it might be rather diﬃcult. Example (e), for instance, requires the use of L’Hospital’s rule. If we have two computer programs, and if one of them inverts n × n matrices in time 635n3 and if the other one does so in time o(n2.8 ) then we know that for all suﬃciently large values of n the performance guarantee of the second program will be superior to that of the ﬁrst program. Of course, the ﬁrst program might run faster on small matrices, say up to size 10, 000 × 10, 000. If a certain program runs in time n2.03 and if someone were to produce another program for the same problem that runs in o(n2 log n) time, then that second program would be an improvement, at least in the theoretical sense. The reason for the ‘theoretical’ qualiﬁcation, once more, is that the second program would be known to be superior only if n were suﬃciently large. The second symbol of the asymptotics vocabulary is the ‘O.’ When we say that f(x) = O(g(x)) we mean, informally, that f certainly doesn’t grow at a faster rate than g. It might grow at the same rate or it might grow more slowly; both are possibilities that the ‘O’ permits. Formally, we have the next Deﬁnition. We say that f(x) = O(g(x)) (x → ∞) if ∃C, x0 such that |f(x)| < Cg(x) (∀x > x0 ).

The qualifier 'x → ∞' will usually be omitted, since it will be understood that we will most often be interested in large values of the variables that are involved. For example, it is certainly true that sin x = O(x), but even more can be said, namely that sin x = O(1). Also x^3 + 5x^2 + 77 cos x = O(x^5) and 1/(1 + x^2) = O(1). Now we can see how the 'o' gives more precise information than the 'O,' for we can sharpen the last example by saying that 1/(1 + x^2) = o(1). This is sharper because not only does it tell us that the function is bounded when x is large, we learn that the function actually approaches 0 as x → ∞.

This is typical of the relationship between O and o. It often happens that an 'O' result is sufficient for an application. However, that may not be the case, and we may need the more precise 'o' estimate.

The third symbol of the language of asymptotics is the 'Θ.'

Definition. We say that f(x) = Θ(g(x)) if there are constants c1 ≠ 0, c2 ≠ 0, x0 such that for all x > x0 it is true that c1 g(x) < f(x) < c2 g(x).

We might then say that f and g are of the same rate of growth, only the multiplicative constants are uncertain. Some examples of the 'Θ' at work are

(x + 1)^2 = Θ(3x^2)
(x^2 + 5x + 7)/(5x^3 + 7x + 2) = Θ(1/x)
√(3 + 2√x) = Θ(x^{1/4})
(1 + 3/x)^x = Θ(1).

The 'Θ' is much more precise than either the 'O' or the 'o.' If we know that f(x) = Θ(x^2), then we know that f(x)/x^2 stays between two nonzero constants for all sufficiently large values of x. The rate of growth of f is established: it grows quadratically with x.

The most precise of the symbols of asymptotics is the '∼.' It tells us that not only do f and g grow at the same rate, but that in fact f/g approaches 1 as x → ∞.

Definition. We say that f(x) ∼ g(x) if lim_{x→∞} f(x)/g(x) = 1.

Here are some examples:

x^2 + x ∼ x^2
(3x + 1)^4 ∼ 81x^4
sin 1/x ∼ 1/x
(2x^3 + 5x + 7)/(x^2 + 4) ∼ 2x
2x + 7 log x + cos x ∼ 2x

Observe the importance of getting the multiplicative constants exactly right when the '∼' symbol is used. While it is true that 2x^2 = Θ(x^2), it is not true that 2x^2 ∼ x^2. It is, by the way, also true that 2x^2 = Θ(17x^2), but to make such an assertion is to use bad style, since no more information is conveyed with the '17' than without it.

The last symbol in the asymptotic set that we will need is the 'Ω.' In a nutshell, 'Ω' is the negation of 'o.' That is to say, f(x) = Ω(g(x)) means that it is not true that f(x) = o(g(x)).
In the study of algorithms for computers, the 'Ω' is used when we want to express the thought that a certain calculation takes at least so-and-so long to do. For instance, we can multiply together two n × n matrices in time O(n^3). Later on in this book we will see how to multiply two matrices even faster, in time O(n^2.81). People know of even faster ways to do that job, but one thing that we can be sure of is this: nobody will ever be able to write a matrix multiplication program that will multiply pairs of n × n matrices with fewer than n^2 computational steps, because whatever program we write will have to look at the input data, and there are 2n^2 entries in the input matrices. Thus, a computing time of cn^2 is certainly a lower bound on the speed of any possible general matrix multiplication program. We might say, therefore, that the problem of multiplying two n × n matrices requires Ω(n^2) time.

The exact definition of the 'Ω' that was given above is actually rather delicate. We stated it as the negation of something. Can we rephrase it as a positive assertion? Yes, with a bit of work (see exercises 6 and 7 below). Since 'f = o(g)' means that f/g → 0, the symbol f = Ω(g) means that f/g does not approach zero. If we assume that g takes positive values only, which is usually the case in practice, then to say that f/g does not approach 0 is to say that there exist ε > 0 and an infinite sequence of values of x, tending to ∞, along which |f|/g > ε. So we don't have to show that |f|/g > ε for all large x, but only for infinitely many large x.

Definition. We say that f(x) = Ω(g(x)) if there is an ε > 0 and a sequence x1, x2, x3, ... → ∞ such that ∀j: |f(xj)| > ε g(xj).

Now let's introduce a hierarchy of functions according to their rates of growth when x is large. Among commonly occurring functions of x that grow without bound as x → ∞, perhaps the slowest growing ones are functions like log log x or maybe (log log x)^1.03 or things of that sort. It is certainly true that log log x → ∞ as x → ∞, but it takes its time about it. When x = 1,000,000, for example, log log x has the value 2.6.

Just a bit faster growing than the 'snails' above is log x itself. After all, log (1,000,000) = 13.8. So if we had a computer algorithm that could do n things in time log n, and someone found another method that could do the same job in time O(log log n), then the second method, other things being equal, would indeed be an improvement, but n might have to be extremely large before you would notice the improvement.

Next on the scale of rapidity of growth we might mention the powers of x. For instance, think about x^.01. It grows faster than log x, although you wouldn't believe it if you tried to substitute a few values of x and to compare the answers (see exercise 1 at the end of this section). How would we prove that x^.01 grows faster than log x? By using L'Hospital's rule.

Example. Consider the limit of x^.01/log x for x → ∞. As x → ∞ the ratio assumes the indeterminate form ∞/∞, and it is therefore a candidate for L'Hospital's rule, which tells us that if we want to find the limit then we can differentiate the numerator, differentiate the denominator, and try again to let x → ∞. If we do this, then instead of the original ratio, we find the ratio .01x^{-.99}/(1/x) = .01x^.01, which obviously grows without bound as x → ∞. Therefore the original ratio x^.01/log x also grows without bound. What we have proved, precisely, is that log x = o(x^.01), and therefore in that sense we can say that x^.01 grows faster than log x.

To continue up the scale of rates of growth, we meet x, x^2, x^3, ..., x^15 log^2 x, ..., and then we encounter functions that grow faster than every fixed power of x, just as log x grows slower than every fixed power of x. Consider e^{log^2 x}. Since this is the same as x^{log x} it will obviously grow faster than x^1000; in fact it will be larger than x^1000 as soon as log x > 1000, i.e., as soon as x > e^1000 (don't hold your breath!). Hence e^{log^2 x} is an example of a function that grows faster than every fixed power of x. Another such example is e^{√x} (why?).

Definition. A function that grows faster than x^a, for every constant a, but grows slower than c^x for every constant c > 1, is said to be of moderately exponential growth.
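The claim that x^.01 eventually beats log x, while losing badly at every 'reasonable' value of x, is easy to watch numerically. Here is a small sketch (our own illustration; the variable names are ours):

```python
import math

# x^0.01 versus log x (natural log): the fixed power eventually wins,
# but at x = 10^6 it is still far behind.
samples = [(x, x ** 0.01, math.log(x)) for x in (10.0, 1000.0, 1.0e6)]
for x, power, logarithm in samples:
    print(x, power, logarithm)

# The crossover is astronomically far out.  Working with logarithms to
# avoid overflow: at x = e^t we compare log(x^0.01) = 0.01*t against
# log(log x) = log t.  Already at t = 1000 (that is, x = e^1000) the
# power is ahead, since 0.01*1000 = 10 exceeds log(1000) = 6.9...
print(0.01 * 1000 > math.log(1000))  # True
```

The logarithmic comparison in the last lines is the practical way to test such statements, since e^1000 itself overflows ordinary floating-point arithmetic.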

Think about this one:

f(n) = Σ_{j=1}^{n} j^2 = 1^2 + 2^2 + 3^2 + ··· + n^2. (1.1.1)

Thus, f(n) is the sum of the squares of the first n positive integers. How fast does f(n) grow when n is large? Notice at once that among the n terms in the sum that defines f(n), the biggest one is the last one, namely n^2. Since there are n terms in the sum and the biggest one is only n^2, it is certainly true that f(n) = O(n^3), and even more, that f(n) ≤ n^3 for all n ≥ 1.

Suppose we wanted more precise information about the growth of f(n), such as a statement like f(n) ∼ ?. How might we make such a better estimate? The best way to begin is to visualize the sum in (1.1.1), as shown in Fig. 1.1.1.

Fig. 1.1.1: How to overestimate a sum

In that figure we see the graph of the curve y = x^2, in the x-y plane. Further, there is a rectangle drawn over every interval of unit length in the range from x = 1 to x = n. The rectangles all lie under the curve. Consequently, the total area of all of the rectangles is smaller than the area under the curve, which is to say that

Σ_{j=1}^{n-1} j^2 ≤ ∫_1^n x^2 dx = (n^3 − 1)/3. (1.1.2)

If we compare (1.1.2) and (1.1.1) we notice that we have proved that f(n) ≤ ((n + 1)^3 − 1)/3.

Now we're going to get a lower bound on f(n) in the same way. This time we use the setup in Fig. 1.1.2, where we again show the curve y = x^2, but this time we have drawn the rectangles so they lie above the curve. From the picture we see immediately that

1^2 + 2^2 + ··· + n^2 ≥ ∫_0^n x^2 dx = n^3/3. (1.1.3)

Now our function f(n) has been bounded on both sides, rather tightly. What we know about it is that

∀n ≥ 1: n^3/3 ≤ f(n) ≤ ((n + 1)^3 − 1)/3.

From this we have immediately that f(n) ∼ n^3/3, which gives us quite a good idea of the rate of growth of f(n) when n is large. The reader will also have noticed that the '∼' gives a much more satisfying estimate of growth than the 'O' does.
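The two integral bounds, and the conclusion that the sum of squares grows like n^3/3, can be checked directly on the machine. A quick sketch (our own illustration, not part of the text):

```python
def f(n):
    # the sum of the squares of the first n positive integers
    return sum(j * j for j in range(1, n + 1))

n = 100
lower = n ** 3 / 3            # lower integral bound from the text
upper = ((n + 1) ** 3 - 1) / 3  # upper integral bound from the text
assert lower <= f(n) <= upper

print(f(n) / (n ** 3 / 3))  # close to 1, reflecting f(n) ~ n^3/3
```

As n grows, the printed ratio approaches 1, which is exactly what the '∼' statement promises.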

9 (1.1.1. 1.4) Consider a diagram that looks exactly like Fig. while the area under the curve n between 1 and n is 1 g(t)dt. if we consider Fig.1.1. then one should try to compare the sums with integrals because they’re usually easier to deal with.1. where the graph is once more the graph of y = g(x). (1. their combined areas cannot exceed the area under the curve.2. 2. (1. .1. The general idea is that when one is faced with estimating the rates of growth of sums. that will make estimates like the above for us without requiring us each time to visualize pictures like Figs. and we have the inequality n G(n − 1) ≤ 1 g(t)dt (n ≥ 1).1.1. .8) .). and suppose that g(n) is nondecreasing. .2.2: How to underestimate a sum Let’s formulate a general principle.1. Let a function g(n) be deﬁned for nonnegative integer values of n. (1. The sum of the areas of the rectangles is exactly G(n − 1).5) and (1.1. We want to estimate the growth of the sum n G(n) = j=1 g(j) (n = 1. Since the rectangles lie wholly under the curve. the fact that the combined areas of the rectangles is now not less than the area under the curve yields the inequality n G(n) ≥ 0 g(t)dt (n ≥ 1).7).1.6) we ﬁnd that we have completed the proof of Theorem 1.1.1. After doing the integrals. 1.6) If we combine (1.1.1 and 1. as the following example shows.5) On the other hand.1. for estimating the size of a sum. we obtain n n log n − n ≤ j=1 log j ≤ (n + 1) log (n + 1) − n. Let g(n) = log n and substitute in (1. (1.1. Let g(x) be nondecreasing for nonnegative x. Then n n n+1 g(t)dt ≤ 0 j=1 g(j) ≤ 1 g(t)dt.7) The above theorem is capable of producing quite satisfactory estimates with rather little labor.1 Orders of magnitude Fig.1 except that the curve that is shown there is now the curve y = g(x). 1. 1.

We recognize the middle member above as log n!, and therefore by exponentiation of (1.1.8) we have

(n/e)^n ≤ n! ≤ (n + 1)^{n+1}/e^n. (1.1.9)

This is rather a good estimate of the growth of n!, since the right member is only about ne times as large as the left member (why?). By the use of slightly more precise machinery one can prove a better estimate of the size of n! that is called Stirling's formula, which is the statement that

x! ∼ (x/e)^x √(2xπ). (1.1.10)

Exercises for section 1.1

1. Calculate the values of x^.01 and of log x for x = 10, 1000, 100000. Find a single value of x > 10 for which x^.01 > log x, and prove that your answer is correct.

2. Some of the following statements are true and some are false. Which are which?
(a) (x^2 + 3x + 1)^3 ∼ x^6
(b) (√x + 1)^3/(x^2 + 1) = o(1)
(c) e^{1/x} = Θ(1)
(d) 1/x ∼ 0
(e) x^3 (log log x)^2 = o(x^3 log x)
(f) √(log x + 1) = Ω(log log x)
(g) sin x = Ω(1)
(h) cos x/x = O(1)
(i) ∫_4^x dt/t ∼ log x
(j) ∫_0^x e^{-t^2} dt = O(1)
(k) Σ_{j≤x} 1/j^2 = o(1)
(l) Σ_{j≤x} 1 ∼ x

3. Each of the three sums below defines a function of x. Beneath each sum there appears a list of five assertions about the rate of growth, as x → ∞, of the function that the sum defines. In each case state which of the five choices, if any, are true (note: more than one choice may be true).

h1(x) = Σ_{j≤x} {1/j + 3/j^2 + 4/j^3}
(i) ∼ log x (ii) = O(x) (iii) ∼ 2 log x (iv) = Θ(log x) (v) = Ω(1)

h2(x) = Σ_{j≤√x} {log j + j}
(i) ∼ x/2 (ii) = O(√x) (iii) = Θ(√x log x) (iv) = Ω(√x) (v) = o(√x)

h3(x) = Σ_{j≤√x} 1/√j
(i) = O(√x) (ii) = Ω(x^{1/4}) (iii) = o(x^{1/4}) (iv) ∼ 2x^{1/4} (v) = Θ(x^{1/4})

4. Of the five symbols of asymptotics O, o, ∼, Θ, Ω, which ones are transitive (e.g., if f = O(g) and g = O(h), is f = O(h))?

5. The point of this exercise is that if f grows more slowly than g, then we can always find a third function h whose rate of growth is between that of f and of g. Precisely, prove the following: if f = o(g) then there

100000. 1000.. 2. 7. 10. then what is true?’ (a) ∃x > 0 f(x) = 0 (b) ∀x > 0. to represent an integer we can specify how many copies of each power of 10 we would like to have. (log log n)3 . 8.’ √ n 3 2 1. Arrange the following functions in increasing order of their rates of growth. then the number that is being represented by that string of digits is m n= i=0 di 10i . list them so that each one is ‘little oh’ of its successor: . log n3 + 1. (c) Use your answer to part (b) to give a positive deﬁnition of the assertion ‘f(x) = o(g(x)). n3 log n . elog n . as a positive assertion. n3.6 . 100. . 128.’ (a) Write out the precise deﬁnition of the statement ‘limx→∞ h(x) = 0’ (use ‘ ’s). ‘If this isn’t true. 2n . 16. (b) Write out the negation of your answer to part (a). In each case. In each case the question is. . Furthermore. The usual decimal system represents numbers by using the digits 0. . .5 2n . 10. ∃ > 0 f(x) < (d) ∃x = 0 ∀y < 0. 6. In this exercise we will work out the deﬁnition of the ‘Ω. . Find a function f(x) such that f(x) = O(x1+ ) is true for every > 0. for large n. 1. Prove that the statement ‘f(n) = O((2 + )n ) for every > 0’ is equivalent to the statement ‘f(n) = o((2 + )n ) for every > 0. as . 64. (n + 4)12 9.’ what rule would you apply in order to negate the proposition and leave the result in positive form (containing no negation symbols or ‘not’s). . n!. Instead of using 10’s we’re going to use 2’s. That is. So we imagine that the powers of 2 are displayed before us. 10000.’ and thereby justify the deﬁnition of the ‘Ω’ symbol that was given in the text. 11 . . For the purpose of representing whole numbers we can imagine that the powers of 10 are displayed before us like this: .} Below there appear several mathematical propositions. 2 n3 log n. n. 32. f(y) < f(x) (e) ∀x ∃y ∀z : g(x) < f(y)f(z) (f) ∀ > 0 ∃x ∀y > x : f(y) < Can you formulate a general method for negating such propositions? Given a proposition that contains ‘∀. 512.01. 
Now let’s try the binary system. . 256.’ ‘∃. 9.2 Positional number systems is a function h such that f = o(h) and h = o(g). f(x) > 0 (c) ∀x > 0. Then. {This exercise is a warmup for exercise 7.2 Positional number systems This section will provide a brief review of the representation of numbers in diﬀerent bases. as dm dm−1 · · · d1 d0 . in the negation. . do not use the word ‘not’ or any negation symbols. for example. 1. 8. 4. if we write out the string of digits that represents a number in the decimal system. If we write 237.’ ‘ . In general.1. √ n1. write a proposition that is the negation of the given one. but for which it is not true that f(x) = O(x). 1. Give an explicit construction for the function h in terms of f and g. then that means that we want 2 100’s and 3 10’s and 7 1’s.

To represent a number we will now specify how many copies of each power of 2 we would like to have. In the binary system (base 2) the only digits we will ever need are 0 and 1. What that means is that if we use only 0's and 1's then we can represent every number n in exactly one way. We will write (13)10 = (1101)2 to mean that the number 13, in the base 10, is the same as the number 1101, in the base 2.

If we were allowed to use more digits than just 0's and 1's then we would be able to represent the number (13)10 as a binary number in a whole lot of ways. For instance, we might make the mistake of allowing the digits 0, 1, 2, 3. Then 13 would be representable by 3·2^2 + 1·2^0 or by 2·2^2 + 2·2^1 + 1·2^0, etc. So if we were to allow too many different digits, then numbers would be representable in more than one way by a string of digits.

If we were to allow too few different digits then we would find that some numbers have no representation at all. For instance, if we were to use the decimal system with only the digits 0, 1, ..., 8, then infinitely many numbers would not be able to be represented, so we had better keep the 9's.

The unique representation of every number is, after all, what we must expect and demand of any proposed system. The general proposition is this.

Theorem 1.2.1. Let b > 1 be a positive integer (the 'base'). Then every positive integer n can be written in one and only one way in the form

n = d0 + d1 b + d2 b^2 + d3 b^3 + ···

if the digits d0, d1, ... lie in the range 0 ≤ di ≤ b − 1, for all i.

Remark: The theorem says, for instance, that in the base 10 we need the digits 0, 1, ..., 9, in the base 2 we need only 0 and 1, and in the base 16 we need sixteen digits.

Proof of the theorem: If b is fixed, the proof is by induction on n, the number being represented. Clearly the number 1 can be represented in one and only one way with the available digits (why?). Suppose, inductively, that every integer 1, 2, ..., n − 1 is uniquely representable. Now consider the integer n. Define d = n mod b. Then d is one of the b permissible digits. By induction, the number n' = (n − d)/b is uniquely representable, say

n' = (n − d)/b = d0 + d1 b + d2 b^2 + ···.

Then clearly

n = d + b·n' = d + d0 b + d1 b^2 + d2 b^3 + ···

is a representation of n that uses only the allowed digits.

Finally, suppose that n has some other representation in this form also. Then we would have

n = a0 + a1 b + a2 b^2 + ··· = c0 + c1 b + c2 b^2 + ···.

Since a0 and c0 are both equal to n mod b, they are equal to each other. Hence the number n' = (n − a0)/b has two different representations, which contradicts the inductive assumption, since we have assumed the truth of the result for all n' < n.

The bases b that are the most widely used are, aside from 10, 2 ('binary system'), 8 ('octal system') and 16 ('hexadecimal system'). The binary digits of a number are called its bits or its bit string. The binary system is extremely simple because it uses only two digits. This is very convenient if you're a computer or a computer designer, because the digits can be determined by some component being either 'on' (digit 1) or 'off' (digit 0).
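The proof is effectively an algorithm: peel off d = n mod b, replace n by (n − d)/b, and repeat. A sketch of that procedure in Python (the exercises below ask for a procedure convert(n, b, digitstr); the function here is our own illustrative version, not the book's):

```python
def to_base(n, b):
    """Return the digits of n in base b, least significant digit first,
    by repeatedly taking d = n mod b, exactly as in the proof."""
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        n, d = divmod(n, b)  # n <- (n - d)/b, d <- n mod b
        digits.append(d)
    return digits

# (13)_10 = (1101)_2, so least-significant-first the digits are [1, 0, 1, 1]
print(to_base(13, 2))
```

Note that the loop terminates because n strictly decreases, mirroring the induction in the proof.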

The octal system is popular because it provides a good way to remember and deal with the long bit strings that the binary system creates. In the octal system the digits that we need are 0, 1, ..., 7. For example,

(735)8 = (477)10.

The captivating feature of the octal system is the ease with which we can convert between octal and binary. If we are given the bit string of an integer n, then to convert it to octal, all we have to do is to group the bits together in groups of three, starting with the least significant bit, and then convert each group of three bits, independently of the others, into a single octal digit. For example, given (1101100101)2, we group the bits in threes,

(1)(101)(100)(101)

starting from the right, thereby getting (1101100101)2 = (1545)8. Conversely, if the octal form of n is given, then the binary form is obtainable by converting each octal digit independently into the three bits that represent it in the binary system.

The hexadecimal system (base 16) is like octal, only more so. The conversion back and forth to binary now uses groups of four bits, rather than three. In hexadecimal we will need, according to the theorem above, 16 digits. We have handy names for the first 10 of these, but what shall we call the 'digits 10 through 15'? The names that are conventionally used for them are 'A,' 'B,' ..., 'F.' We have, for example,

(A52C)16 = 10(4096) + 5(256) + 2(16) + 12 = (42284)10
         = (1010)2 (0101)2 (0010)2 (1100)2 = (1010010100101100)2
         = (1)(010)(010)(100)(101)(100) = (122454)8.

If you're a working programmer it's very handy to use the shorter octal or hexadecimal strings to remember, or to write down, the longer binary strings, because of the space saving, coupled with the ease of conversion back and forth.

Exercises for section 1.2

1. Prove that conversion from octal to binary is correctly done by converting each octal digit to a binary triple and concatenating the resulting triples. Generalize this theorem to other pairs of bases.

2. Carry out the conversions indicated below.
(a) (737)10 = (?)3
(b) (101100)2 = (?)16
(c) (3377)8 = (?)16
(d) (ABCD)16 = (?)10
(e) (BEEF)16 = (?)8

3. Write a procedure convert(n, b: integer; digitstr: string), that will find the string of digits that represents n in the base b.
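The group-the-bits conversions described in this section are mechanical enough to script. A small sketch (the helper name is our own, not the book's):

```python
def bits_to_base(bits, k):
    """Regroup a bit string k bits at a time (k = 3 for octal, k = 4 for
    hexadecimal), starting from the least significant end, and return
    the resulting digits, most significant first."""
    digits = []
    while bits:
        bits, group = bits[:-k], bits[-k:]
        digits.append(int(group, 2))  # each group becomes one digit
    return digits[::-1]

print(bits_to_base("1101100101", 3))        # [1, 5, 4, 5], i.e. (1545)_8
print(bits_to_base("1010010100101100", 4))  # [10, 5, 2, 12], i.e. (A52C)_16
```

The two printed examples reproduce the octal and hexadecimal conversions worked out in the text.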

1.3 Manipulations with series

In this section we will look at operations with power series, including multiplying them and finding their sums in simple form. We begin with a little catalogue of some power series that are good to know. First we have the finite geometric series

(1 − x^n)/(1 − x) = 1 + x + x^2 + ··· + x^{n−1}. (1.3.1)

This equation is valid certainly for all x ≠ 1. Why is (1.3.1) true? Just multiply both sides by 1 − x to clear of fractions. The result is

1 − x^n = (1 + x + x^2 + x^3 + ··· + x^{n−1})(1 − x)
        = (1 + x + x^2 + ··· + x^{n−1}) − (x + x^2 + x^3 + ··· + x^n)
        = 1 − x^n

and the proof is finished.

Here are some more series to keep in your zoo. A parenthetical remark like '(|x| < 1)' shows the set of values of x for which the series converges.

Σ_{k=0}^{∞} x^k = 1/(1 − x) (|x| < 1) (1.3.2)
e^x = Σ_{m=0}^{∞} x^m/m! (1.3.3)
sin x = Σ_{r=0}^{∞} (−1)^r x^{2r+1}/(2r + 1)! (1.3.4)
cos x = Σ_{s=0}^{∞} (−1)^s x^{2s}/(2s)! (1.3.5)
log (1/(1 − x)) = Σ_{j=1}^{∞} x^j/j (|x| < 1) (1.3.6)

Try to get used to the idea that a series in powers of x becomes a number if x is replaced by a number, and if we know a formula for the sum of the series then we know the number that it becomes. For example, what is the value of the sum

Σ_{j=0}^{9} 3^j ?

Observe that we are looking at the right side of (1.3.1) with x = 3. Therefore the answer is (3^{10} − 1)/2.

Can you find a simple form for the sum (the logarithms are 'natural')

1 + log 2 + (log 2)^2/2! + (log 2)^3/3! + ···?

Hint: Look at (1.3.3), and replace x by log 2.

Now try this one. Let's think about the sum

1 + 2·2 + 3·4 + 4·8 + 5·16 + ··· + N 2^{N−1}. (1.3.7)

(1. In general. (1. after simplifying the right-hand side.8). if the nth coeﬃcient is multiplied by n.9).8) becomes 1 + 2x + 3x2 + 4x3 + · · · + (n − 1)xn−2 = 1 − nxn−1 + (n − 1)xn .3.1).3.1). since (1..10) (1.3.3. keep the known series with its variable unrestricted. (1 − x)2 (1.3. it becomes ∞ j=2 (1. 3.7) reminds us of (1.6) it tells us that ∞ 1 = log (3/2).6).3.1) again. a known series. and whose result will be that the known series will have been changed into the one whose sum we needed. The trick is this. 1 + 2 · 2 + 3 · 4 + 4 · 8 + · · · + N 2N−1 = 1 + (N − 1)2N . Next try this one: 1 1 + +··· 2 · 32 3 · 33 If we rewrite the series using summation signs. .1.3. . That is.11) is the result of dropping the term with j = 1 from (1. we ﬁnd that multiplying the nth coeﬃcient of a power series by n2 d changes the sum from f to (x dx )2 f. If we apply the rule again.3.3.3. 2. .11) 1 . When confronted with a series that is similar to.3. which shows that the sum in (1. .3.3 Manipulations with series We are reminded of the ﬁnite geometric series (1.3. 2.3. n = N + 1 in (1.8) Don’t replace x by 2 yet.. j · 3j Comparison with the series zoo shows great resemblance to the species (1. N into (1.8). just walk up to the equation (1.7). .3. Then reach for an appropriate tool that will be applied to both sides of that equation. we’ll begin by writing down (1. 4.7). (1 − xn )/(1 − x) = 1 + x + x2 + · · · + xn−1 (1.3. but (1. then the function changes from d f to (x dx )f.3.12).3. with the series on one side and its sum on the other. . in this case x = 2. To evaluate the sum (1. N . ∞ j 2 xj /j! = (x j=0 d d )(x )ex dx dx d )(xex ) dx = (x2 + x)ex . to obtain.8) carrying your tool kit and ask what kind of surgery you could do to both sides of (1. write down the known series as an equation. In this case.7) is a little diﬀerent because of the multipliers 1. if we put x = 1/3 in (1. = (x 15 . In other words.12) j · 3j j=1 The desired sum (1. 
all we have to do is to substitute x = 2, n = N + 1 in (1.3.9), to obtain, after simplifying the right-hand side,

1 + 2·2 + 3·4 + 4·8 + ··· + N 2^{N−1} = 1 + (N − 1)2^N. (1.3.10)

Next try this one:

1/(2·3^2) + 1/(3·3^3) + ···

If we rewrite the series using summation signs, it becomes

Σ_{j=2}^{∞} 1/(j·3^j). (1.3.11)

Comparison with the series zoo shows great resemblance to the species (1.3.6). In fact, if we put x = 1/3 in (1.3.6) it tells us that

Σ_{j=1}^{∞} 1/(j·3^j) = log (3/2). (1.3.12)

The desired sum (1.3.11) is the result of dropping the term with j = 1 from (1.3.12), which shows that the sum in (1.3.11) is equal to log (3/2) − 1/3.

In general, suppose that f(x) = Σ a_n x^n is some series that we know. Then Σ n a_n x^{n−1} = f'(x) and Σ n a_n x^n = x f'(x). In other words, if the nth coefficient is multiplied by n, then the function changes from f to (x d/dx)f. If we apply the rule again, we find that multiplying the nth coefficient of a power series by n^2 changes the sum from f to (x d/dx)^2 f. Thus, for example,

Σ_{j=0}^{∞} j^2 x^j/j! = (x d/dx)(x d/dx) e^x
                      = (x d/dx)(x e^x)
                      = (x^2 + x)e^x.
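Identities obtained by this differentiate-then-substitute trick are easy to spot-check numerically. A sketch (our own illustration) testing the identity 1 + 2·2 + 3·4 + ··· + N·2^{N−1} = 1 + (N − 1)·2^N:

```python
def left(N):
    # the sum 1 + 2*2 + 3*4 + ... + N*2^(N-1), term by term
    return sum(j * 2 ** (j - 1) for j in range(1, N + 1))

def right(N):
    # the closed form obtained from the differentiated geometric series
    return 1 + (N - 1) * 2 ** N

for N in range(1, 12):
    assert left(N) == right(N)
print(left(10), right(10))
```

Checks of this kind are no substitute for the derivation, but they catch algebra slips in the 'surgery' very quickly.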

Similarly, multiplying the nth coefficient of a power series by n^p will change the sum from f(x) to (x d/dx)^p f(x). But that's not all. What happens if we multiply the coefficient of x^n by, say, 3n^2 + 2n + 5? If the sum previously was f(x), then it will be changed to {3(x d/dx)^2 + 2(x d/dx) + 5}f(x). Here is the general rule: if P(x) is any polynomial then

Σ_j P(j) a_j x^j = P(x d/dx) { Σ_j a_j x^j }. (1.3.13)

The sum

Σ_{j=0}^{∞} (2j^2 + 5)x^j

is therefore equal to {2(x d/dx)^2 + 5}{1/(1 − x)}, and after doing the differentiations we find the answer in the form (7x^2 − 8x + 5)/(1 − x)^3.

Exercises for section 1.3

1. Find simple, explicit formulas for the sums of each of the following series.
(a) Σ_{j≥3} (log 6)^j/j!
(b) Σ_{m>1} (2m + 7)/5^m
(c) Σ_{j=0}^{19} j/2^j
(d) 1 − x/2! + x^2/4! − x^3/6! + ···
(e) 1 − 1/3^2 + 1/3^4 − 1/3^6 + ···
(f) Σ_{m=2}^{∞} (m^2 + 3m + 2)/m!

2. Explain why Σ_{r≥0} (−1)^r π^{2r+1}/(2r + 1)! = 0.

3. Find the coefficient of t^n in the series expansion of each of the following functions about t = 0.
(a) (1 + t + t^2)e^t
(b) (3t − t^2) sin t
(c) (t + 1)^2/(t − 1)^2

1.4 Recurrence relations

A recurrence relation is a formula that permits us to compute the members of a sequence one after another, starting with one or more given values. Here is a small example. Suppose we are to find an infinite sequence of numbers x0, x1, ... by means of

x_{n+1} = c x_n (n ≥ 0; x_0 = 1). (1.4.1)

This relation tells us that x_1 = c x_0, x_2 = c x_1, and so forth. It is then clear that x_1 = c, x_2 = c^2, ..., and in general x_n = c^n. We say that the solution of the recurrence relation (= 'difference equation') (1.4.1) is given by x_n = c^n for all n ≥ 0.

Equation (1.4.1) is a first-order recurrence relation because a new value of the sequence is computed from just one preceding value (i.e., x_{n+1} is obtained solely from x_n, and does not involve x_{n−1} or any earlier values).

Observe the format of the equation (1.4.1). The parenthetical remarks are essential. The first one, 'n ≥ 0,' tells us for what values of n the recurrence formula is valid, and the second one, 'x_0 = 1,' gives the starting value. If one of these is missing, the solution may not be uniquely determined. The recurrence relation

x_{n+1} = x_n + x_{n−1} (1.4.2)

(which is a second-order recurrence) needs two starting values in order to 'get going,' but it is missing both of those starting values and the range of n, so (1.4.2) does not uniquely determine the sequence.

Though no approach will avoid the unpleasant form of the general answer. y0 given) (1. successively. Then we would ﬁnd that x1 = b1 x0 + c1 .4.} is a given sequence. and we want to ﬁnd the x’s. 2.8) It is not recommended that the reader memorize the solution that we have just obtained. x0 given). (1. . x2 = b2 b1 x0 + b2 c1 + c2 . There.5). We ﬁnd. we’ll raise the ante a step further.4) Since that wasn’t hard enough.1. .4. such as xn+1 = bn+1 xn (n ≥ 0.1).5). We can now use (1. y2 . It involves (a) make a change of variables that leads to a new recurrence of the form (1.4 Recurrence relations The situation is rather similar to what happens in the theory of ordinary diﬀerential equations. In this book we are going to run into several equations of the type of (1. . that x1 = b1 x0 . Suppose we follow the strategy that has so far won the game.4. Here is a somewhat more orderly approach to (1.4. . .4.4.5) Now we are being given two sequences b1 . (1. .4.4. then x2 = b2 x1 = b2 b1 x0 and x3 = b3 x2 = b3 b2 b1 x0 etc. and we are being asked to ﬁnd the unknown sequence {x1 . (1. then 17 . . The result is the equation yn+1 = yn + dn+1 (n ≥ 0. .}. if we omit initial or boundary values.7) where we have written dn+1 = cn+1 /(b1 · · · bn+1 ).6) deﬁne a new unknown sequence y1 . In an easy case like this we can write out the ﬁrst few x’s and then guess the answer. b2. . and c1 .7). Now substitute for xn in (1. We haven’t yet solved the recurrence relation. the next level of diﬃculty occurs when we consider a ﬁrst-order recurrence relation with a variable multiplier. At this point we can guess that the solution is n xn = { i=1 bi}x0 (n = 0. x1 .4. . Let xn = b1 b2 · · · bn yn (n ≥ 1.4.7) is quite simple. (1. b2 .4. . Notice that the d’s are known. . getting b1 b2 · · · bn+1 yn+1 = bn+1 b1 b2 · · · bn yn + cn+1 . x2. We have only changed to a new unknown function that satisﬁes a simpler recurrence (1. 
Suppose we want to solve the ﬁrst-order inhomogeneous (because xn = 0 for all n is not a solution) recurrence relation xn+1 = bn+1 xn + cn+1 (n ≥ 0. . It follows that n yn = y0 + j=1 dj (n ≥ 0). and ﬁnd that n xn = (b1 b2 · · · bn ){x0 + j=1 dj } (n ≥ 1). x0 given). 1. . and we would probably tire rapidly. .6) to reverse the change of variables to get back to the original unknowns x0 . It is recommended that the method by which the solution was found be mastered.6). . because it says that each y is obtained from its predecessor by adding the next one of the d’s. . The ﬁrst step is to deﬁne a new unknown function as follows.4. the one that we are about to describe at least gives a method that is much simpler than the guessing strategy. Now the solution of (1. for many examples that arise in practice.3) Now {b1 . We notice that the coeﬃcients of yn+1 and of yn are the same.4. . . ..). . that is. x0 = y0 ) (1. c2 . writing down the ﬁrst few x’s and trying to guess the pattern. and so we divide both sides by that coeﬃcient. Beyond the simple (1.4..5). so a uniﬁed method will be a deﬁnite asset. then the solutions are determined only up to arbitrary constants.
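The change-of-variables solution can be checked against direct iteration of the recurrence. A sketch with arbitrary made-up sequences b and c (our own illustration):

```python
# x_{n+1} = b_{n+1} x_n + c_{n+1}, with x_0 given.
b = [None, 2.0, 3.0, 5.0, 7.0]  # b_1..b_4 (index 0 unused)
c = [None, 1.0, 4.0, 1.0, 5.0]  # c_1..c_4
x0 = 1.0

# Direct iteration of the recurrence.
x = x0
xs = [x0]
for n in range(1, 5):
    x = b[n] * x + c[n]
    xs.append(x)

# Closed form: x_n = (b_1...b_n) * (x_0 + sum_{j<=n} d_j),
# where d_j = c_j / (b_1...b_j).
prod, s = 1.0, 0.0
for n in range(1, 5):
    prod *= b[n]
    s += c[n] / prod
    assert abs(xs[n] - prod * (x0 + s)) < 1e-9
print(xs)
```

Both routes produce the same sequence, which is a useful confidence check when applying the method to a new recurrence.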

Example. As an example, consider the first-order equation

    x_{n+1} = 3x_n + n    (n ≥ 0; x_0 = 0).    (1.4.9)

The winning change of variable, from (1.4.6), is to let x_n = 3^n y_n. After substituting in (1.4.9) and simplifying, we find

    y_{n+1} = y_n + n/3^{n+1}    (n ≥ 0; y_0 = 0).

Now by summation,

    y_n = Σ_{j=1}^{n−1} j/3^{j+1}    (n ≥ 0).

Finally, since x_n = 3^n y_n, we obtain the solution of (1.4.9) in the form

    x_n = 3^n Σ_{j=1}^{n−1} j/3^{j+1}    (n ≥ 0).    (1.4.10)

This is quite an explicit answer, but the summation can, in fact, be completely removed by the same method that you used to solve exercise 1(c) of section 1.3 (try it!).

That pretty well takes care of first-order recurrence relations of the form x_{n+1} = b_{n+1} x_n + c_{n+1}, and it's time to move on to linear second-order (homogeneous) recurrence relations with constant coefficients. These are of the form

    x_{n+1} = a x_n + b x_{n−1}    (n ≥ 1; x_0 and x_1 given).    (1.4.11)

If we think back to differential equations of second order with constant coefficients, we recall that there are always solutions of the form y(t) = e^{αt}, where α is constant. Hence the road to the solution of such a differential equation begins by trying a solution of that form and seeing what the constant or constants α turn out to be. Analogously, equation (1.4.11) calls for a trial solution of the form x_n = α^n. If we substitute x_n = α^n in (1.4.11) and cancel a common factor of α^{n−1}, we obtain a quadratic equation for α, namely

    α² = aα + b.    (1.4.12)

'Usually' this quadratic equation will have two distinct roots, say α_+ and α_−, and then the general solution of (1.4.11) will look like

    x_n = c_1 α_+^n + c_2 α_−^n    (n = 0, 1, 2, ...).    (1.4.13)

The constants c_1 and c_2 will be determined so that x_0, x_1 have their assigned values.

Example. The recurrence for the Fibonacci numbers is

    F_{n+1} = F_n + F_{n−1}    (n ≥ 1; F_0 = F_1 = 1).    (1.4.14)

Following the recipe that was described above, we look for a solution in the form F_n = α^n. After substituting in (1.4.14) and cancelling common factors, we find that the quadratic equation for α is, in this case, α² = α + 1. If we denote the two roots by α_+ = (1 + √5)/2 and α_− = (1 − √5)/2, then the general solution to the Fibonacci recurrence has been obtained, and it has the form (1.4.13). It remains to determine the constants c_1, c_2 from the initial conditions F_0 = F_1 = 1.

From the form of the general solution we have F_0 = 1 = c_1 + c_2 and F_1 = 1 = c_1 α_+ + c_2 α_−. If we solve these two equations in the two unknowns c_1, c_2 we find that c_1 = α_+/√5 and c_2 = −α_−/√5. Finally, we substitute these values of the constants into the form of the general solution, and obtain an explicit formula for the nth Fibonacci number:

    F_n = (1/√5) { ((1 + √5)/2)^{n+1} − ((1 − √5)/2)^{n+1} }    (n = 0, 1, 2, ...).    (1.4.15)
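Formula (1.4.15) can be checked against the recurrence directly. The short Python sketch below (an illustration, not part of the book's text) compares the two, using the book's convention F_0 = F_1 = 1; since the formula is evaluated in floating point, the values are rounded back to integers.

```python
from math import sqrt

def fib_recurrence(n):
    """Iterate F_{n+1} = F_n + F_{n-1} with F_0 = F_1 = 1."""
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def fib_formula(n):
    """Formula (1.4.15): F_n = (a+^(n+1) - a-^(n+1)) / sqrt(5)."""
    s5 = sqrt(5.0)
    ap, am = (1 + s5) / 2, (1 - s5) / 2
    return (ap ** (n + 1) - am ** (n + 1)) / s5

# Floating point makes the formula slightly inexact, but since the true
# values are integers, rounding recovers them exactly for modest n.
for n in range(30):
    assert round(fib_formula(n)) == fib_recurrence(n)
```

The check also illustrates the asymptotic remark that follows: the α_− term is tiny, so F_n is the nearest integer to ((1 + √5)/2)^{n+1}/√5.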

The Fibonacci numbers are in fact 1, 1, 2, 3, 5, 8, 13, 21, 34, ... The reader should check that the formula indeed gives the first few F_n's correctly. It isn't even obvious that the formula (1.4.15) gives integer values for the F_n's.

Just to exercise our newly acquired skills in asymptotics, let's observe that since (1 + √5)/2 > 1 and |(1 − √5)/2| < 1, it follows that when n is large we have F_n ∼ ((1 + √5)/2)^{n+1}/√5.

The process of looking for a solution in a certain form, namely in the form α^n, is subject to the same kind of special treatment, in the case of repeated roots, that we find in differential equations. Corresponding to a double root α of the associated quadratic equation α² = aα + b we would find two independent solutions α^n and nα^n, so the general solution would be in the form α^n(c_1 + c_2 n).

Example. Consider the recurrence

    x_{n+1} = 2x_n − x_{n−1}    (n ≥ 1; x_0 = 1, x_1 = 5).    (1.4.16)

If we try a solution of the type x_n = α^n, then we find that α satisfies the quadratic equation α² = 2α − 1. Hence the 'two' roots are 1 and 1. The general solution is x_n = 1^n(c_1 + nc_2) = c_1 + c_2 n. After inserting the given initial conditions, we find that

    x_0 = 1 = c_1;  x_1 = 5 = c_1 + c_2.

If we solve for c_1 and c_2 we obtain c_1 = 1, c_2 = 4, and therefore the complete solution of the recurrence (1.4.16) is given by x_n = 4n + 1.

Now let's look at recurrent inequalities, like this one:

    x_{n+1} ≤ x_n + x_{n−1} + n²    (n ≥ 1; x_0 and x_1 given).    (1.4.17)

The question is, what restriction is placed on the growth of the sequence {x_n} by (1.4.17)? By analogy with the case of difference equations with constant coefficients, the thing to try here is x_n ≤ Kα^n. So suppose it is true that x_n ≤ Kα^n for all n = 0, 1, 2, ..., N. Then from (1.4.17) with n = N we find

    x_{N+1} ≤ Kα^N + Kα^{N−1} + N².

Let c be the positive real root of the equation c² = c + 1, and suppose that α > c. Then α² > α + 1, and so α² − α − 1 = t, where t > 0. Hence

    x_{N+1} ≤ Kα^{N−1}(1 + α) + N² = Kα^{N−1}(α² − t) + N² = Kα^{N+1} − (tKα^{N−1} − N²).    (1.4.18)

In order to insure that x_{N+1} < Kα^{N+1}, what we need is for tKα^{N−1} > N². Hence as long as we choose

    K > max_{N≥2} { N²/(tα^{N−1}) },    (1.4.19)

in which the right member is clearly finite, the inductive step will go through.

The conclusion is that (1.4.17) implies that, for every fixed ε > 0, x_n = O((c + ε)^n), where c = (1 + √5)/2. The same argument applies to the general situation that is expressed in Theorem 1.4.1 below.
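The bound just derived can be checked numerically. The Python sketch below (an illustration; the initial values x_0 = x_1 = 1 and the choice α = 1.7 > c ≈ 1.618 are assumptions) takes the worst case of (1.4.17), the sequence with equality, chooses K as in (1.4.19), and verifies x_n ≤ Kα^n over a long range.

```python
def growth_bound_check(alpha=1.7, N=200):
    """Verify x_n <= K * alpha^n for the extreme case of (1.4.17):
    x_{n+1} = x_n + x_{n-1} + n^2, with assumed x_0 = x_1 = 1."""
    t = alpha * alpha - alpha - 1       # positive, since alpha > (1+sqrt(5))/2
    assert t > 0
    # K from (1.4.19), enlarged to cover the initial values as well.
    K = max([1.0, 1.0 / alpha] +
            [n * n / (t * alpha ** (n - 1)) for n in range(1, N)])
    x = [1, 1]
    for n in range(1, N):
        x.append(x[n] + x[n - 1] + n * n)   # take (1.4.17) with equality
    return all(x[n] <= K * alpha ** n for n in range(N))

assert growth_bound_check()
```

Any α strictly larger than the golden ratio works; the needed K grows as α shrinks toward c, which is exactly why the conclusion is O((c + ε)^n) for every fixed ε > 0 rather than O(c^n).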

Theorem 1.4.1. Let a sequence {x_n} satisfy a recurrent inequality of the form

    x_{n+1} ≤ b_0 x_n + b_1 x_{n−1} + ··· + b_p x_{n−p} + G(n)    (n ≥ p)

where b_i ≥ 0 (∀i) and Σ_i b_i > 1, let c be the positive real root of* the equation c^{p+1} = b_0 c^p + ··· + b_p, and suppose G(n) = o(c^n). Then for every fixed ε > 0 we have x_n = O((c + ε)^n).

Proof: Fix ε > 0, and let α = c + ε, where c is the root of the equation shown in the statement of the theorem. Since α > c, if we let t = α^{p+1} − b_0 α^p − ··· − b_p, then t > 0. Finally, define

    K = max { |x_0|, |x_1|/α, ..., |x_p|/α^p, max_{n≥p} G(n)/(tα^{n−p}) }.

Then K is finite, and clearly |x_j| ≤ Kα^j for j ≤ p. We claim that |x_n| ≤ Kα^n for all n, which will complete the proof. Indeed, if the claim is true for 0, 1, 2, ..., n, then

    |x_{n+1}| ≤ b_0|x_n| + ··· + b_p|x_{n−p}| + G(n)
             ≤ b_0 Kα^n + ··· + b_p Kα^{n−p} + G(n)
             = Kα^{n−p}{b_0 α^p + ··· + b_p} + G(n)
             = Kα^{n−p}{α^{p+1} − t} + G(n)
             = Kα^{n+1} − {tKα^{n−p} − G(n)}
             ≤ Kα^{n+1},

so the inductive step goes through.

* See exercise 10, below.

Exercises for section 1.4

1. Solve the following recurrence relations:
   (i) x_{n+1} = x_n + 3 (n ≥ 0; x_0 = 1)
   (ii) x_{n+1} = x_n/3 + 2 (n ≥ 0; x_0 = 0)
   (iii) x_{n+1} = 2n x_n + 1 (n ≥ 0; x_0 = 0)
   (iv) x_{n+1} = ((n + 1)/n) x_n + n + 1 (n ≥ 1; x_1 = 5)
   (v) x_{n+1} = x_n + x_{n−1} (n ≥ 1; x_0 = 1, x_1 = 1)
   (vi) x_{n+1} = 3x_n − 2x_{n−1} (n ≥ 1; x_0 = 1, x_1 = 3)
   (vii) x_{n+1} = 4x_n − 4x_{n−1} (n ≥ 1; x_0 = 1, x_1 = ξ)
2. Suppose x_0 = 1, x_1 = 1, and for all n ≥ 2 it is true that x_{n+1} ≤ x_n + x_{n−1}. Is it true that ∀n: x_n ≤ F_n? Prove your answer.
3. Find the asymptotic behavior, in the form x_n ∼ ? (n → ∞), of the right side of (1.4.10).
4. Find x_1 if the sequence x satisfies the Fibonacci recurrence relation and if, furthermore, x_0 = 1 and x_n = o(1) (n → ∞).
5. For what values of a and b is it true that, no matter what the initial values x_0, x_1 are, the solution of the recurrence relation x_{n+1} = ax_n + bx_{n−1} (n ≥ 1) is guaranteed to be o(1) (n → ∞)?
6. Generalize the result of exercise 5, as follows. Suppose x_0 = y_0 and x_1 = y_1, that x_{n+1} ≤ ax_n + bx_{n−1} (∀n ≥ 1), and that y_{n+1} = ay_n + by_{n−1} (∀n ≥ 1). Can we conclude that ∀n: x_n ≤ y_n? If not, describe conditions on a and b under which that conclusion would follow.
7. Let x_n be the average number of trailing 0's in the binary expansions of all integers 0, 1, 2, ..., 2^n − 1. Find a recurrence relation satisfied by the sequence {x_n}, solve it, and evaluate lim_{n→∞} x_n.

8. Show, by an example, that the conclusion of theorem 1.4.1 may be false if the phrase 'for every fixed ε > 0 ...' were replaced by 'for every fixed ε ≥ 0 ...'.
9. Write out a complete proof of theorem 1.4.1.
10. In theorem 1.4.1 we find the phrase '... the positive real root of ...'. Prove that this phrase is justified, in that the equation shown always has exactly one positive real root. Exactly what special properties of that equation did you use in your proof?

1.5 Counting

For a given positive integer n, consider the set {1, 2, ..., n}. We will denote this set by the symbol [n], and we want to discuss the number of subsets of various kinds that it has. Here is a list of all of the subsets of [2]: ∅, {1}, {2}, {1, 2}. There are 4 of them.

We claim that the set [n] has exactly 2^n subsets. To see why, notice that we can construct the subsets of [n] in the following way. Either choose, or don't choose, the element '1'; then either choose, or don't choose, the element '2'; etc.; finally choosing, or not choosing, the element 'n.' Each of the n choices that you encountered could have been made in either of 2 ways. The totality of n choices, therefore, might have been made in 2^n different ways, so that is the number of subsets that a set of n objects has.

Next, suppose we have n distinct objects, and we want to arrange them in a sequence. In how many ways can we do that? For the first object in our sequence we may choose any one of the n objects. The second element of the sequence can be any of the remaining n − 1 objects, so there are n(n − 1) possible ways to make the first two decisions. Then there are n − 2 choices for the third element, and so we have n(n − 1)(n − 2) ways to arrange the first three elements of the sequence. It is no doubt clear now that there are exactly n(n − 1)(n − 2) ··· 3 · 2 · 1 = n! ways to form the whole sequence.

Of the 2^n subsets of [n], how many have exactly k objects in them? The number of elements in a set is called its cardinality. The cardinality of a set S is denoted by |S|; so, for example, |[6]| = 6. A set whose cardinality is k is called a 'k-set,' and a subset of cardinality k is, naturally enough, a 'k-subset.' The question is, for how many subsets S of [n] is it true that |S| = k?

We can construct k-subsets S of [n] (written 'S ⊆ [n]') as follows. Choose an element a_1 (n possible choices); of the remaining n − 1 elements, choose one (n − 1 possible choices); etc., until a sequence of k different elements has been chosen. Obviously there were n(n − 1)(n − 2) ··· (n − k + 1) ways in which we might have chosen that sequence, so the number of ways to choose an (ordered) sequence of k elements from [n] is

    n(n − 1)(n − 2) ··· (n − k + 1) = n!/(n − k)!.

But there are more sequences of k elements than there are k-subsets, because any particular k-subset S will correspond to k! different ordered sequences, namely all possible rearrangements of the elements of the subset. Hence the number of k-subsets of [n] is equal to the number of k-sequences divided by k!. In other words, there are exactly n!/(k!(n − k)!) k-subsets of a set of n objects.

The quantities n!/(k!(n − k)!) are the famous binomial coefficients; they are denoted (and here written, for the plain-text edition) by

    C(n, k) = n!/(k!(n − k)!)    (n ≥ 0; 0 ≤ k ≤ n),    (1.5.1)

read 'n choose k.' Some of their special values are

    C(n, 0) = 1 (∀n ≥ 0);  C(n, 1) = n (∀n ≥ 0);  C(n, 2) = n(n − 1)/2 (∀n ≥ 0);  C(n, n) = 1 (∀n ≥ 0).

It is convenient to define C(n, k) to be 0 if k < 0 or if k > n.
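The counting argument above is small enough to verify by brute force. This Python sketch (an illustration added for this edition; it uses the standard library only) computes n!/(k!(n − k)!) and compares it with a direct enumeration of the k-subsets of [n].

```python
from itertools import combinations
from math import factorial

def num_k_subsets(n, k):
    """The count derived above: n(n-1)...(n-k+1) ordered sequences,
    each k-subset counted k! times, hence n!/(k!(n-k)!) subsets."""
    if k < 0 or k > n:
        return 0
    return factorial(n) // (factorial(k) * factorial(n - k))

# Brute-force check: list the k-subsets of [n] directly.
for n in range(8):
    total = 0
    for k in range(n + 1):
        exact = len(list(combinations(range(1, n + 1), k)))
        assert exact == num_k_subsets(n, k)
        total += exact
    assert total == 2 ** n   # a set of n objects has 2^n subsets in all
```

The final assertion is the summary statement proved next: summing the k-subset counts over all k recovers the total number of subsets, 2^n.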

.. . Sort them out into two piles: those k-subsets that contain ‘1’ and those that don’t...5. There are exactly n! diﬀerent sequences that can be formed from a set of n distinct objects.’ ‘n = 1.1 we show the values of some of the binomial coeﬃcients n .. we might have written (1..3). and the entries within each row refer..... exactly n have k cardinality k ( ∀k = 0. using the deﬁnition (1. but here’s another way. . If a k−1 k-subset does not contain ‘1.. (1. 1. . ... 1. and then cancelling common factors to complete the proof. .)..’ then its remaining k − 1 elements can be chosen in n−1 ways. That is.5.. The table is called ‘Pascal’s triangle. That would work (try it).5..5. and of these..’ etc.5. (b) The sum of the entries in the nth row of Pascal’s triangle is 2n .... n ≥ 0).3). For each n ≥ 0. 0)).1: Pascal’s triangle Here are some facts about the binomial coeﬃcients: (a) Each row of Pascal’s triangle is symmetric about the middle.. ...1..5... ...3).. Table 1.. Contemplate (this proof is by contemplation) the totality of k-subsets of [n]. (c) Each entry is equal to the sum of the two entries that are immediately above it in the triangle.Chapter 1: Mathematical Preliminaries Theorem 1... to k = 0.2) In view of the convention that we adopted. (1. The proof of (c) above can be interesting.. 2. n)..3) There are (at least) two ways to prove (1.5.1) in terms of factorials.5.. a set of n objects has exactly 2n subsets.2) as k n = 2n .. and that completes the proof of (1. The number of them is on the left side of (1.’ 1 1 1 1 3 1 4 2 3 6 1 1 1 4 1 1 5 10 10 5 1 1 6 15 20 15 6 1 1 7 21 35 35 21 7 1 1 8 28 56 70 56 28 8 1 .. n. successively. It would then have been understood that the range of k is from −∞ to ∞.. . and that accounts for the ﬁrst term on the right of (1.5... k In Table 1. . The hammer-and-tongs approach would consist of expanding each of the three binomial coeﬃcients that appears in (1.. Since every subset of [n] has some cardinality. 
We can summarize the developments so far with

Theorem 1.5.1. For each n ≥ 0, a set of n objects has exactly 2^n subsets, and of these, exactly C(n, k) have cardinality k (∀k = 0, 1, ..., n). There are exactly n! different sequences that can be formed from a set of n distinct objects.

Since every subset of [n] has some cardinality, it follows that

    Σ_{k=0}^{n} C(n, k) = 2^n    (n = 0, 1, 2, ...).    (1.5.2)

In view of the convention that we adopted, we might have written (1.5.2) as Σ_k C(n, k) = 2^n, with no restriction on the range of the summation index k. It would then have been understood that the range of k is from −∞ to ∞, and that the binomial coefficient C(n, k) vanishes unless 0 ≤ k ≤ n.

In Table 1.5.1 we show the values of some of the binomial coefficients. The rows of the table are thought of as labelled 'n = 0,' 'n = 1,' etc., and the entries within each row refer, successively, to k = 0, 1, ..., n. The table is called 'Pascal's triangle.'

    1
    1  1
    1  2  1
    1  3  3  1
    1  4  6  4  1
    1  5 10 10  5  1
    1  6 15 20 15  6  1
    1  7 21 35 35 21  7  1
    1  8 28 56 70 56 28  8  1
    ...........................

    Table 1.5.1: Pascal's triangle

Here are some facts about the binomial coefficients:

(a) Each row of Pascal's triangle is symmetric about the middle. That is,

    C(n, k) = C(n, n − k)    (0 ≤ k ≤ n; n ≥ 0).

(b) The sum of the entries in the nth row of Pascal's triangle is 2^n.

(c) Each entry is equal to the sum of the two entries that are immediately above it in the triangle.

The proof of (c) above can be interesting. What it says about the binomial coefficients is that

    C(n, k) = C(n − 1, k − 1) + C(n − 1, k)    ((n, k) ≠ (0, 0)).    (1.5.3)

There are (at least) two ways to prove (1.5.3). The hammer-and-tongs approach would consist of expanding each of the three binomial coefficients that appear in (1.5.3), using the definition (1.5.1) in terms of factorials, and then cancelling common factors to complete the proof. That would work (try it), but here's another way. Contemplate (this proof is by contemplation) the totality of k-subsets of [n]. The number of them is on the left side of (1.5.3). Sort them out into two piles: those k-subsets that contain '1' and those that don't. If a k-subset of [n] contains '1,' then its remaining k − 1 elements can be chosen in C(n − 1, k − 1) ways, and that accounts for the first term on the right of (1.5.3). If a k-subset does not contain '1,' then its k elements are all chosen from [n − 1], which accounts for the second term, and that completes the proof of (1.5.3).
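The addition rule (1.5.3) is all one needs to generate the triangle, and facts (a) and (b) can then be confirmed mechanically. A short Python sketch (illustrative only):

```python
def pascal_rows(m):
    """Rows 0..m-1 of Pascal's triangle, built using only the
    addition rule (1.5.3): C(n,k) = C(n-1,k-1) + C(n-1,k)."""
    rows = [[1]]
    for n in range(1, m):
        prev = rows[-1]
        rows.append([1] + [prev[k - 1] + prev[k] for k in range(1, n)] + [1])
    return rows

rows = pascal_rows(10)
for n, row in enumerate(rows):
    assert row == row[::-1]       # fact (a): each row is symmetric
    assert sum(row) == 2 ** n     # fact (b): the row sum is 2^n
```

Building the triangle this way uses no factorials at all, which is also the preferred method in practice when a whole table of coefficients is needed: it avoids the huge intermediate values of n!.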

The binomial theorem is the statement that ∀n ≥ 0 we have

    (1 + x)^n = Σ_{k=0}^{n} C(n, k) x^k.    (1.5.4)

Proof: By induction on n. Eq. (1.5.4) is clearly true when n = 0, and if it is true for some n, then multiply both sides of (1.5.4) by (1 + x) to obtain

    (1 + x)^{n+1} = Σ_k C(n, k) x^k + Σ_k C(n, k) x^{k+1}
                  = Σ_k C(n, k) x^k + Σ_k C(n, k − 1) x^k
                  = Σ_k {C(n, k) + C(n, k − 1)} x^k
                  = Σ_k C(n + 1, k) x^k,

which completes the proof.

Now let's ask how big the binomial coefficients are. We will refer to the coefficients in row n of Pascal's triangle, that is, to C(n, 0), C(n, 1), ..., C(n, n), as the coefficients of order n. Then, by (1.5.2) (or by (1.5.4) with x = 1), the sum of all of the coefficients of order n is 2^n. It is also fairly apparent, from an inspection of Table 1.5.1, that the largest one(s) of the coefficients of order n is (are) the one(s) in the middle. More precisely, if n is odd, then the largest coefficients of order n are C(n, (n − 1)/2) and C(n, (n + 1)/2), whereas if n is even, the largest one is uniquely C(n, n/2).

It will be important, in some of the applications to algorithms later on in this book, for us to be able to pick out the largest term in a sequence of this kind, so let's see how we could prove that the biggest coefficients are the ones cited above. For n fixed, we will compute the ratio of the (k + 1)st coefficient of order n to the kth. We will see that the ratio is larger than 1 if k < (n − 1)/2 and is < 1 if k > (n − 1)/2. That, of course, will imply that the (k + 1)st coefficient is bigger than the kth for such k, and therefore that the biggest one(s) must be in the middle. The ratio is

    C(n, k + 1)/C(n, k) = {n!/((k + 1)!(n − k − 1)!)} / {n!/(k!(n − k)!)} = (n − k)/(k + 1),

and this is > 1 iff k < (n − 1)/2, as claimed.

OK, the biggest coefficients are in the middle, but how big are they? Let's suppose that n is even, just to keep things simple. Then, as an exercise in asymptotics, the biggest binomial coefficient of order n is

    C(n, n/2) = n!/((n/2)!)² ∼ {(n/e)^n √(2nπ)} / {((n/2e)^{n/2} √(nπ))²} = 2^n √(2/(nπ)),    (1.5.5)

where we have used Stirling's formula (1.1.10).
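The estimate (1.5.5) is easy to test numerically. The Python sketch below (illustrative only) computes the ratio of the exact central coefficient to the Stirling estimate 2^n √(2/(nπ)); the ratio should approach 1 as n grows through even values.

```python
from math import comb, pi, sqrt

def central_over_estimate(n):
    """Ratio of C(n, n/2) to the asymptotic estimate in (1.5.5),
    for even n; it tends to 1 as n grows."""
    return comb(n, n // 2) / (2 ** n * sqrt(2 / (n * pi)))

assert abs(central_over_estimate(10) - 1) < 0.03
assert abs(central_over_estimate(1000) - 1) < 0.001
```

Already at n = 10 the estimate is within a few percent, and the error shrinks like a constant over n, which is typical of Stirling-based approximations.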

Equation (1.5.5) shows that the single biggest binomial coefficient accounts for a very healthy fraction of the sum of all of the coefficients of order n. When n is large, the largest coefficient contributes a fraction ∼ √(2/(nπ)) of the total. If we think in terms of the subsets that these coefficients count, what we will see is that a large fraction of all of the subsets of an n-set have cardinality near n/2 (in fact, a fraction Θ(n^{−.5}) of them have cardinality exactly n/2).

This kind of probabilistic thinking can be very useful in the design and analysis of algorithms. If we are designing an algorithm that deals with subsets of [n], we should recognize that a large percentage of the customers for that algorithm will have cardinalities near n/2, and make every effort to see that the algorithm is fast for such subsets, even at the expense of possibly slowing it down on subsets whose cardinalities are very small or very large.

Exercises for section 1.5

1. How many subsets of even cardinality does [n] have?
2. By observing that (1 + x)^a (1 + x)^b = (1 + x)^{a+b}, prove that the sum of the squares of all binomial coefficients of order n is C(2n, n).
3. Evaluate the following sums in simple form:
   (i) Σ_{j=0}^{n} j C(n, j)
   (ii) Σ_{j=3}^{n} C(n, j) 5^j
   (iii) Σ_{j=0}^{n} (j + 1) 3^{j+1}
4. Find, by direct application of Taylor's theorem, the power series expansion of f(x) = 1/(1 − x)^{m+1} about the origin. Express the coefficients as certain binomial coefficients.
5. Complete the following twiddles:
   (i) C(2n, n) ∼ ?
   (ii) log C(n, n/2) ∼ ?
   (iii) C(n, θn) ∼ ?
   (iv) C(n², n) ∼ ?
6. How many ordered pairs of unequal elements of [n] are there?
7. Which one of the numbers {2^j C(n, j)}_{j=0}^{n} is the biggest?

1.6 Graphs

A graph is a collection of vertices, certain unordered pairs of which are called its edges. To describe a particular graph we first say what its vertices are, and then we say which pairs of vertices are its edges. The set of vertices of a graph G is denoted by V(G), and its set of edges is E(G).

Consider the graph G whose vertex set is {1, 2, 3, 4, 5} and whose edges are the set of pairs (1, 2), (2, 3), (3, 4), (4, 5), (1, 5). This is a graph of 5 vertices and 5 edges.

A nice way to present a graph to an audience is to draw a picture of it, instead of just listing the pairs of vertices that are its edges. To draw a picture of a graph we would first make a point for each vertex, and then we would draw an arc between two vertices v and w if and only if (v, w) is an edge of the graph. The graph G of 5 vertices and 5 edges that we listed above can be drawn as shown in Fig. 1.6.1(a). It could also be drawn as shown in Fig. 1.6.1(b). They're both the same graph; only the pictures are different. The pictures are helpful to us in visualizing and remembering the graph, but the pictures aren't 'really' the graph; the graph is the vertex list and the edge list.

    [Fig. 1.6.1(a)    Fig. 1.6.1(b)]

If v and w are vertices of a graph G, and if (v, w) is an edge of the graph that we are talking about, then we say that vertices v, w are adjacent in G. The number of edges that contain ('are incident with') a particular vertex v of a graph G is called the degree of that vertex, and is usually denoted by ρ(v). If we add up the degrees of every vertex v of G we will have counted exactly two contributions from each edge of G, one at each of its endpoints. Hence, for every graph G we have

    Σ_{v∈V(G)} ρ(v) = 2|E(G)|.    (1.6.1)

Since the right-hand side is an even number, there must be an even number of odd numbers on the left side of (1.6.1). We have therefore proved that every graph has an even number of vertices whose degrees are odd.* In Fig. 1.6.1 the degrees of the vertices are {2, 2, 2, 2, 2}, and the sum of the degrees is 10 = 2|E(G)|.

Next we're going to define a number of concepts of graph theory that will be needed in later chapters. A fairly large number of terms will now be defined, in rather a brief space. Don't try to absorb them all now, but read through them and look them over again when the concepts are actually used, in the sequel.

A path P in a graph G is a walk from one vertex of G to another, where at each step the walk uses an edge of the graph. More formally, it is a sequence {v_1, v_2, ..., v_k} of vertices of G such that ∀i = 1, ..., k − 1: (v_i, v_{i+1}) ∈ E(G). A path P is simple if its vertices are all distinct; Hamiltonian if it is simple and visits every vertex of G exactly once; Eulerian if it uses every edge of G exactly once.

A cycle is a closed path, i.e., one in which v_k = v_1. A cycle is a circuit if v_1 is the only repeated vertex in it. We may say that a circuit is a simple cycle. We speak of Hamiltonian and Eulerian circuits of G as circuits of G that visit, respectively, every vertex, or every edge, of G.

Not every graph has a Hamiltonian path. The graph in Fig. 1.6.2(a) has one and the graph in Fig. 1.6.2(b) doesn't.

    [Fig. 1.6.2(a)    Fig. 1.6.2(b)]

A graph is connected if there is a path between every pair of its vertices.

A subgraph of a graph G is a subset S of its vertices together with a subset of just those edges of G both of whose endpoints lie in S. An induced subgraph of G is a subset S of the vertices of G together with all edges of G both of whose endpoints lie in S. We would then speak of 'the subgraph induced by S.'

In a graph G we can define an equivalence relation on the vertices as follows. Say that v and w are equivalent if there is a path of G that joins them. Let S be one of the equivalence classes of vertices of G under this relation. The subgraph of G that S induces is called a connected component of the graph G. A graph is connected if and only if it has exactly one connected component.

* Did you realize that the number of people who shook hands an odd number of times yesterday is an even number of people?
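Both the degree-sum identity (1.6.1) and the equivalence-class definition of connected components are easy to mechanize. Here is a Python sketch (illustrative; the edge list below is an assumed 5-cycle standing in for the example graph, since any edge list satisfies the identity) that computes degrees and finds components by breadth-first search.

```python
from collections import deque

def degrees(vertices, edges):
    """rho(v): the number of edges incident with v."""
    deg = {v: 0 for v in vertices}
    for v, w in edges:
        deg[v] += 1
        deg[w] += 1
    return deg

def connected_components(vertices, edges):
    """The equivalence classes of vertices under 'joined by a path',
    found by breadth-first search from each unvisited vertex."""
    adj = {v: [] for v in vertices}
    for v, w in edges:
        adj[v].append(w)
        adj[w].append(v)
    seen, comps = set(), []
    for s in vertices:
        if s not in seen:
            comp, queue = [], deque([s])
            seen.add(s)
            while queue:
                u = queue.popleft()
                comp.append(u)
                for w in adj[u]:
                    if w not in seen:
                        seen.add(w)
                        queue.append(w)
            comps.append(sorted(comp))
    return comps

V = [1, 2, 3, 4, 5]
E = [(1, 2), (2, 3), (3, 4), (4, 5), (1, 5)]       # assumed example edges
deg = degrees(V, E)
assert sum(deg.values()) == 2 * len(E)             # identity (1.6.1)
assert sum(1 for v in V if deg[v] % 2) % 2 == 0    # evenly many odd-degree vertices
assert connected_components(V, E) == [[1, 2, 3, 4, 5]]   # one component: connected
```

The last assertion restates the remark above: a graph is connected exactly when the search from any vertex reaches all of them, i.e., there is exactly one component.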

Likewise, not every graph has an Eulerian path. The graph in Fig. 1.6.3(a) has one and the graph in Fig. 1.6.3(b) doesn't.

    [Fig. 1.6.3(a)    Fig. 1.6.3(b)]

There is a world of difference between Eulerian and Hamiltonian paths, however. If a graph G is given, then thanks to the following elegant theorem of Euler, it is quite easy to decide whether or not G has an Eulerian path. In fact, the theorem applies also to multigraphs, which are graphs except that they are allowed to have several different edges joining the same pair of vertices.

Theorem 1.6.1. A (multi-)graph has an Eulerian circuit (resp. path) if and only if it is connected and has no (resp. has exactly two) vertices of odd degree.

Proof: Let G be a connected multigraph in which every vertex has even degree. We will construct an Eulerian circuit of G. The proof for Eulerian paths will be similar, and is omitted. The proof is by induction on the number of edges of G, and the result is clearly true if G has just one edge. Hence suppose the theorem is true for all such multigraphs of fewer than m edges, and let G have m edges.

Begin at some vertex v and walk along some edge to a vertex w. Generically, having arrived at a vertex u, depart from u along an edge that hasn't been used yet, arriving at a new vertex, etc. The process halts when we arrive for the first time at a vertex v′ such that all edges incident with v′ have previously been walked on, so there is no exit.

We claim that v′ = v, i.e., that we're back where we started. Indeed, if not, then we arrived at v′ one more time than we departed from it, each time using a new edge, and finding no edges remaining at the end. Thus there were an odd number of edges of G incident with v′, a contradiction. Hence we are indeed back at our starting point when the walk terminates.

Let W denote the sequence of edges along which we have so far walked. If W includes all edges of G, then we have found an Euler tour and we are finished. Else there are edges of G that are not in W. Erase all edges of W from G, thereby obtaining a (possibly disconnected multi-) graph G′. Let C_1, ..., C_k denote the connected components of G′. Each of them has only vertices of even degree, because that was true of G and of the walk W that we subtracted from G. Since each of the C_i has fewer edges than G had, there is, by induction, an Eulerian circuit in each of the connected components of G′. We will thread them all together to make such a circuit for G itself.

Begin at the same v, and walk along 0 or more edges of W until you arrive for the first time at a vertex q of component C_1. Then follow the Euler tour of the edges of C_1, which will return you to vertex q. Then continue your momentarily interrupted walk W until you reach for the first time a vertex of C_2, etc. This will surely happen because G is connected. Continuing in this way, we traverse every edge of G exactly once, and the proof is complete.

Thanks to Euler's theorem (theorem 1.6.1) it is easy to decide if a graph has an Eulerian path or circuit. It is extremely difficult computationally to decide if a given graph has a Hamilton path or circuit. We will see in Chapter 5 that this question is typical of a breed of problems that are the main subject of that chapter, and are perhaps the most (in-)famous unsolved problems in theoretical computer science.

Next we'd like to discuss graph coloring, surely one of the prettier parts of graph theory. Suppose that there are K colors available to us, and that we are presented with a graph G. A proper coloring of the vertices of G is an assignment of a color to each vertex of G in such a way that ∀e ∈ E(G) the colors of the two endpoints of e are different.
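The proof of Euler's theorem is constructive: walk until stuck (necessarily back at the start, since all degrees are even), then splice in the side tours. The Python sketch below (an illustration; this iterative formulation is usually credited to Hierholzer, and the test multigraph is made up) implements that idea with an explicit stack instead of recursion.

```python
def euler_circuit(adjacency):
    """Build an Eulerian circuit of a connected (multi)graph in which
    every vertex has even degree.  adjacency: dict vertex -> list of
    neighbors, with each edge listed in both directions; parallel
    edges may simply be listed more than once."""
    adj = {v: list(ns) for v, ns in adjacency.items()}   # mutable copy
    start = next(iter(adj))
    stack, circuit = [start], []
    while stack:
        v = stack[-1]
        if adj[v]:                 # an unused edge remains at v: keep walking
            w = adj[v].pop()
            adj[w].remove(v)       # consume the edge (v, w)
            stack.append(w)
        else:                      # stuck at v: v joins the finished circuit
            circuit.append(stack.pop())
    return circuit[::-1]

# Two triangles sharing vertex 1: every degree is even (4, 2, 2, 2, 2).
G = {1: [2, 3, 4, 5], 2: [1, 3], 3: [2, 1], 4: [1, 5], 5: [4, 1]}
tour = euler_circuit(G)
assert tour[0] == tour[-1]                     # a closed walk
assert len(tour) == 6 + 1                      # uses each of the 6 edges once
edges_used = sorted(tuple(sorted(p)) for p in zip(tour, tour[1:]))
assert edges_used == sorted([(1, 2), (2, 3), (1, 3), (1, 4), (4, 5), (1, 5)])
```

The splicing in the proof happens implicitly here: when the walk gets stuck, finished portions are peeled off the stack into the circuit, and any detours still pending on the stack are threaded in at the vertex where they branched off.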

In Fig. 1.6.4(a) we show a graph G and an attempt to color its vertices properly in 3 colors ('R,' 'Y' and 'B'). The attempt failed because one of the edges of G has had the same color assigned to both of its endpoints. In Fig. 1.6.4(b) we show the same graph with a successful proper coloring of its vertices in 4 colors.

    [Fig. 1.6.4(a)    Fig. 1.6.4(b)]

The chromatic number χ(G) of a graph G is the minimum number of colors that can be used in a proper coloring of the vertices of G. A bipartite graph is a graph whose chromatic number is ≤ 2; i.e., it is a graph that can be 2-colored. That means that the vertices of a bipartite graph can be divided into two classes 'R' and 'Y' such that no edge of the graph runs between two 'R' vertices or between two 'Y' vertices. Bipartite graphs are most often drawn in two layers, with all edges running between layers, as in Fig. 1.6.5.

    [Fig. 1.6.5: A bipartite graph]

The complement G̅ of a graph G is the graph that has the same vertex set that G has, and has an edge exactly where G does not have its edges. Formally,

    E(G̅) = {(v, w) | v, w ∈ V(G); v ≠ w; (v, w) ∉ E(G)}.

Here are some special families of graphs that occur so often that they rate special names. The complete graph K_n is the graph of n vertices in which every possible one of the C(n, 2) edges is actually present. Thus K_2 is a single edge, K_3 looks like a triangle, etc. The empty graph K̅_n consists of n isolated vertices, i.e., it has no edges at all.

The complete bipartite graph K_{m,n} has a set S of m vertices and a set T of n vertices, and its edge set is E(K_{m,n}) = S × T. It has |E(K_{m,n})| = mn edges.

The n-cycle, C_n, is a graph of n vertices that are connected to form a single cycle.

A tree is a graph that (a) is connected and (b) has no cycles. A tree is shown in Fig. 1.6.6.

    [Fig. 1.6.6: A tree]

It is not hard to prove that the following are equivalent descriptions of a tree.
(a) A tree is a graph that is connected and has no cycles.
(b) A tree is a graph G that is connected and for which |E(G)| = |V(G)| − 1.
(c) A tree is a graph G with the property that between every pair of distinct vertices there is a unique path.

A graph might be labeled or unlabeled. The vertices of a labeled graph are numbered 1, 2, ..., n. One difference that this makes is that there are a lot more labeled graphs than there are unlabeled graphs. There are, for example, 3 labeled graphs that have 3 vertices and 1 edge, but only 1 unlabeled graph that has 3 vertices and 1 edge. They are shown in Fig. 1.6.7.

    [Fig. 1.6.7: Three labeled graphs, but only one unlabeled graph]

Consider the following question: how many graphs are there that have exactly n vertices? Suppose first that we mean labeled graphs. A graph of n vertices has a maximum of C(n, 2) edges. To construct a graph we would decide which of these possible edges would be used. We can make each of these C(n, 2) decisions independently, and for every way of deciding where to put the edges we would get a different graph. Therefore the number of labeled graphs of n vertices is 2^{C(n,2)} = 2^{n(n−1)/2}.

If we were to ask the corresponding question for unlabeled graphs we would find it to be very hard. The answer is known, but the derivation involves Burnside's lemma about the action of a group on a set, and some fairly delicate counting arguments. We will state the approximate answer to this question, which is easy to write out, rather than the exact answer, which is not. If g_n is the number of unlabeled graphs of n vertices, then

    g_n ∼ 2^{C(n,2)}/n!.

Most counting problems on graphs are much easier for labeled than for unlabeled graphs, as shown in Fig. 1.6.8.

If G is a graph and S ⊆ V(G), then S is an independent set of vertices of G if no two of the vertices in S are adjacent in G. An independent set S is maximal if it is not a proper subset of another independent set of vertices of G. Dually, if a vertex subset S induces a complete graph, then we speak of a complete subgraph of G. A maximal complete subgraph of G is called a clique.

Exercises for section 1.6

1. Show that a tree is a bipartite graph.
2. Find the chromatic number of the n-cycle.
3. Describe how you would find out, on a computer, if a given graph G is bipartite.
4. Given a positive integer K, find two different graphs each of whose chromatic numbers is K.
5. Exactly how many labeled graphs of n vertices and E edges are there?
6. In how many labeled graphs of n vertices do vertices {1, 2, 3} form an independent set?
7. How many cliques does an n-cycle have?
8. True or false: a Hamilton circuit is an induced cycle in a graph.
9. Which graph of n vertices has the largest number of independent sets? How many does it have?
10. Draw all of the connected, unlabeled graphs of 4 vertices.
11. Let G be a bipartite graph that has q connected components. Show that there are exactly 2^q ways to properly color the vertices of G in 2 colors.
12. Find a graph G of n vertices, other than the complete graph, whose chromatic number is equal to 1 plus the maximum degree of any vertex of G.
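The labeled count is small enough to verify by exhaustive construction. The Python sketch below (illustrative only) generates every labeled graph on [n] as a subset of the C(n, 2) possible edges and checks that there are 2^{n(n−1)/2} of them; it also recovers the '3 labeled graphs with 3 vertices and 1 edge' remark above.

```python
from itertools import combinations

def labeled_graphs(n):
    """All labeled graphs on vertex set [n]: one graph for each subset
    of the n(n-1)/2 possible edges, represented as a tuple of edges."""
    possible = list(combinations(range(1, n + 1), 2))
    graphs = []
    for r in range(len(possible) + 1):
        graphs.extend(combinations(possible, r))
    return graphs

assert len(labeled_graphs(3)) == 2 ** 3    # 2^C(3,2) = 8 labeled graphs
assert len(labeled_graphs(4)) == 2 ** 6    # 2^C(4,2) = 64 labeled graphs
# Exactly 3 of the 8 graphs on 3 vertices have a single edge:
assert sum(1 for g in labeled_graphs(3) if len(g) == 1) == 3
```

Enumerating unlabeled graphs is far harder, as the text says: one would have to group these 2^{C(n,2)} edge subsets into classes under relabeling of the vertices, which is where Burnside's lemma comes in.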

13. Let n be a multiple of 3. Consider a labeled graph G that consists of n/3 connected components, each of them a K_3. How many maximal independent sets does G have?
14. Describe the complement of the graph G in exercise 13 above. How many cliques does it have?
15. In how many labeled graphs of n vertices is the subgraph that is induced by vertices {1, 2, 3} a triangle?
16. Let H be a labeled graph of L vertices. In how many labeled graphs of n vertices is the subgraph that is induced by vertices {1, 2, ..., L} equal to H?
17. Devise an algorithm that will decide if a given graph, of n vertices and m edges, does or does not contain a triangle, in time O(max(n², mn)).
18. Prove that the number of labeled graphs of n vertices all of whose vertices have even degree is equal to the number of all labeled graphs of n − 1 vertices.

Chapter 2: Recursive Algorithms

2.1 Introduction

Here are two different ways to define n!, if n is a positive integer:

(1) 'n! is the product of all of the whole numbers from 1 to n, inclusive.'
(2) 'If n = 1 then n! = 1; else n! = n · (n − 1)!.'

The first definition is nonrecursive, the second is recursive. Let's concentrate on the second definition. At a glance, it seems illegal, because we're defining something, and in the definition the same 'something' appears. Another glance, however, reveals that the value of n! is defined in terms of the value of the same function at a smaller value of its argument, viz. n − 1. So we're really only using mathematical induction in order to validate the assertion that a function has indeed been defined for all positive integers n.

What is the practical import of the above? It's monumental. Many modern high-level computer languages can handle recursive constructs directly, and when this is so, the programmer's job may be considerably simplified. Among recursive languages are Pascal, PL/C, Lisp, APL, C, and many others. Programmers who use these languages should be aware of the power and versatility of recursive methods (conversely, people who like recursive methods should learn one of those languages!).

A formal 'function' module that would calculate n! nonrecursively might look like this.

    function fact(n);
    {computes n! for given n > 0}
    fact := 1;
    for i := 1 to n do fact := i · fact;
    end.

On the other hand, a recursive n! module is as follows.

    function fact(n);
    if n = 1 then fact := 1 else fact := n · fact(n − 1);
    end.

The hallmark of a recursive procedure is that it calls itself, with arguments that are in some sense smaller than before. Notice that there are no visible loops in the recursive routine. Of course there will be loops in the compiled machine-language program, so in effect the programmer is shifting many of the bookkeeping problems to the compiler (but it doesn't mind!).

Observe next that the 'trivial case,' where n = 1, is handled separately, in the recursive form of the n! program above. This trivial case is in fact essential, because it's the only thing that stops the execution of the program. Without it, the computer will be caught in a loop: in effect, the program keeps reducing n by 1, successively, until it reaches 1, and after that it will be able to climb back up to the original input value of n.

Another advantage of recursiveness is that the thought processes are helpful. Mathematicians have known for years that induction is a marvellous method for proving theorems, making constructions, etc. Now computer scientists and programmers can profitably think recursively too, because recursive compilers allow them to express such thoughts in a natural way, and as a result many methods of great power are being formulated recursively, methods which, in many cases, might not have been developed if recursion were not readily available as a practical programming tool.

The overall structure of a recursive routine will always be something like this:

    procedure calculate(list of variables);
    if {trivial case} then do {trivial thing}
    else do
       {call calculate(smaller values of the variables)};
       {maybe do a few more things}
    end.

procedure calculate(list of variables);
if {trivialcase} then do {trivialthing}
else do
{call calculate(smaller values of the variables)};
{maybe do a few more things}
end.

In this chapter we're going to work out a number of examples of recursive algorithms, of varying sophistication. We will see how the recursive structure helps us to analyze the running time, or complexity, of the algorithms. We will also find that there is a bit of art involved in choosing the list of variables that a recursive procedure operates on. Sometimes the first list we think of doesn't work because the recursive call seems to need more detailed information than we have provided for it. So we try a larger list, and then perhaps it works, or maybe we need a still larger list . . ., but more of this later.

Exercises for section 2.1

1. Write a recursive routine that will find the digits of a given integer n in the base b. There should be no visible loops in your program.

2.2 Quicksort

Suppose that we are given an array x[1], . . . , x[n] of n numbers. We would like to rearrange these numbers as necessary so that they end up in nondecreasing order of size. This operation is called sorting the numbers. For instance, if we are given {9, 4, 7, 2, 1}, then we want our program to output the sorted array {1, 2, 4, 7, 9}.

There are many methods of sorting, but we are going to concentrate on methods that rely on only two kinds of basic operations, called comparisons and interchanges. This means that our sorting routine is allowed to

(a) pick up two numbers ('keys') from the array, compare them, and decide which is larger;
(b) interchange the positions of two selected keys.

Here is an example of a rather primitive sorting algorithm:

(i) find, by successive comparisons, the smallest key;
(ii) interchange it with the first key;
(iii) find the second smallest key;
(iv) interchange it with the second key, etc., etc.

Here is a more formal algorithm that does the job above.

procedure slowsort(X: array[1..n]);
{sorts a given array into nondecreasing order}
for r := 1 to n − 1 do
for j := r + 1 to n do
if x[j] < x[r] then swap(x[j], x[r])
end.{slowsort}

If you are wondering why we called this method 'primitive,' 'slowsort,' and other pejorative names, the reason will be clearer after we look at its complexity. What is the cost of sorting n numbers by this method? We will look at two ways to measure that cost. First let's choose our unit of cost to be one comparison of two numbers, and then we will choose a different unit of cost, namely one interchange of position ('swap') of two numbers.
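As a sketch, the slowsort pseudocode above translates directly into Python (0-based indices rather than the book's 1-based; the function name is ours):

```python
def slowsort(x):
    # sorts the list x in place into nondecreasing order,
    # using only comparisons and swaps, as in steps (i)-(iv)
    n = len(x)
    for r in range(n - 1):
        for j in range(r + 1, n):
            if x[j] < x[r]:
                x[j], x[r] = x[r], x[j]   # interchange ('swap')
    return x
```

After the inner loop finishes for a given r, position r holds the (r+1)st smallest key, so the array is sorted when the outer loop ends.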

How many paired comparisons does the algorithm make? Reference to procedure slowsort shows that it makes one comparison for each value of j = r + 1, . . . , n in the inner loop. This means that the total number of comparisons is

f1(n) = Σ_{r=1}^{n−1} Σ_{j=r+1}^{n} 1 = Σ_{r=1}^{n−1} (n − r) = (n − 1)n/2.

The number of comparisons is Θ(n^2), which is quite a lot of comparisons for a sorting method to do. Not only that, but the method does that many comparisons regardless of the input array: its best case and its worst case are equally bad.

The Quicksort* method, which is the main object of study in this section, does a maximum of cn^2 comparisons, but on the average it does far fewer, a mere O(n log n) comparisons. This economy is much appreciated by those who sort, because sorting applications can be immense and time consuming. One popular sorting application is in alphabetizing lists of names. It is easy to imagine that some of those lists are very long, and that the replacement of Θ(n^2) by an average of O(n log n) comparisons is very welcome. An insurance company that wants to alphabetize its list of 5,000,000 policyholders will gratefully notice the difference between n^2 = 25,000,000,000,000 comparisons and n log n = 77,000,000 comparisons.

* C. A. R. Hoare, Quicksort, Computer J., 5 (1962), 10-15.

If we choose as our unit of complexity the number of swaps of position, then the running time may depend strongly on the input array. In the 'slowsort' method described above, some arrays will need no swaps at all while others might require the maximum number of (n − 1)n/2 (which arrays need that many swaps?). If we average over all n! possible arrangements of the input data, assuming that the keys are distinct, then it is not hard to see that the average number of swaps that slowsort needs is Θ(n^2).

Now let's discuss Quicksort. In contrast to the sorting method above, the basic idea of Quicksort is sophisticated and powerful. Suppose we want to sort the following list:

26, 18, 4, 9, 37, 119, 124, 220, 47, 74 (2.2.1)

The number 37 in the above list is in a very intriguing position. Every number that precedes it is smaller than it is and every number that follows it is larger than it is. What that means is that after sorting the list, the 37 will be in the same place it now occupies, the numbers to its left will have been sorted but will still be on its left, and the numbers on its right will have been sorted but will still be on its right.

If we are fortunate enough to be given an array that has a 'splitter,' like 37, then we can (a) sort the numbers to the left of the splitter, and then (b) sort the numbers to the right of the splitter. Obviously we have here the germ of a recursive sorting routine. The fly in the ointment is that most arrays don't have splitters, so we won't often be lucky enough to find the state of affairs that exists in (2.2.1). However, we can make our own splitters, with some extra work, and that is the idea of the Quicksort algorithm. Let's state a preliminary version of the recursive procedure as follows (look carefully for how the procedure handles the trivial case where n = 1).

procedure quicksortprelim(x: an array of n numbers);
{sorts the array x into nondecreasing order}
if n ≥ 2 then
permute the array elements so as to create a splitter;
let x[i] be the splitter that was just created;
quicksortprelim(the subarray x1, . . . , xi−1) in place;
quicksortprelim(the subarray xi+1, . . . , xn) in place
end.{quicksortprelim}
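The splitter property is easy to test mechanically. As a small sketch (the helper name splitters is ours, not the book's), the following finds every position whose entry is larger than everything before it and smaller than everything after it:

```python
def splitters(x):
    # 0-based positions i such that x[i] is a splitter:
    # every entry before x[i] is smaller, every entry after it is larger
    positions = []
    for i, v in enumerate(x):
        if all(u < v for u in x[:i]) and all(u > v for u in x[i+1:]):
            positions.append(i)
    return positions
```

In the list (2.2.1) the entry 37 is the only splitter; a reverse-sorted array has none, and a sorted array of distinct keys has nothing but splitters.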

This preliminary version won't run, though. It looks like a recursive routine, and it seems to call itself twice in order to get its job done. But it doesn't. It calls something that's just slightly different from itself in order to get its job done, and that won't work.

Observe the exact purpose of Quicksort: we are given an array of length n, and we want to sort it, all of it. Now look at the two 'recursive calls,' which really aren't quite. The first one of them sorts the array to the left of xi. That is indeed a recursive call, because we can just change the 'n' to 'i − 1' and call Quicksort, so that's a plus, and there are no minuses. The second recursive call is the problem. It wants to sort a portion of the array that doesn't begin at the beginning of the array. The routine Quicksort as written so far doesn't have enough flexibility to do that. So we will have to give it some more parameters.

Instead of trying to sort all of the given array, we will write a routine that sorts only the portion of the given array x that extends from x[left] to x[right], inclusive, where left and right are input parameters. This leads us to the second version of the routine:

procedure qksort(x: array; left, right: integer);
{sorts the subarray x[left], x[left + 1], . . . , x[right]}
if right − left ≥ 1 then
create a splitter for the subarray in the ith array position;
qksort(x, left, i − 1);
qksort(x, i + 1, right)
end.{qksort}

Once we have qksort, of course, Quicksort is no problem: we call qksort with left := 1 and right := n.

The next item on the agenda is the little question of how to create a splitter in an array. Suppose we are working with a subarray x[left], . . . , x[right]. The first step is to choose one of the subarray elements (the element itself, not its position in the array) to be the splitter, and the second step is to make it happen.

The choice of the splitter element in the Quicksort algorithm is done very simply: at random. We just choose, using our favorite random number generator, one of the entries of the given subarray, let's call it T, and declare it to be the splitter. To repeat the parenthetical comment above, we have now chosen T to be the value around which the subarray will be split, not its position in the array. The reason for making the random choice will become clearer after the smoke of the complexity discussion has cleared, but briefly it's this: the analysis of the average case complexity is relatively easy if we use the random choice.

Second, once the value T is selected, the entries of the subarray must be moved so as to make T the splitter, and the position of T will then be what it has to be, namely to the right of all smaller entries, and to the left of all larger entries. To do this, consider the following algorithm.*

* Attributed to Nico Lomuto by Jon Bentley, CACM 27 (April 1984).

procedure split(x, left, right, i)
{chooses at random an entry T of the subarray [xleft, . . . , xright], and splits the subarray around T}
{the output integer i is the position of T in the output array: x[i] = T}
1 L := a random integer in [left, right];
2 swap(x[left], x[L]);
3 {now the splitter is first in the subarray}
4 T := x[left];
5 i := left;
6 for j := left + 1 to right do begin
7 if x[j] < T then begin
8 i := i + 1; swap(x[i], x[j]) end
end;
9 swap(x[left], x[i])
10 end.{split}

We will now prove the correctness of split.

Theorem 2.2.1. Procedure split correctly splits the array x around the chosen value T.

Proof: We claim that as the loop in lines 7 and 8 is repeatedly executed for j := left + 1 to right, the following three assertions will always be true just after each execution of lines 7, 8:

(a) x[left] = T and
(b) x[r] < T for all left < r ≤ i and
(c) x[r] ≥ T for all i < r ≤ j

Fig. 2.2.1 illustrates the claim.

[Fig. 2.2.1: Conditions (a), (b), (c)]

To see this, observe first that (a), (b), (c) are surely true at the beginning, when j = left + 1. Next, if for some j they are true, then the execution of lines 7, 8 guarantee that they will be true for the next value of j. Now look at (a), (b), (c) when j = right. It tells us that just prior to the execution of line 9 the condition of the array will be

(a) x[left] = T and
(b) x[r] < T for all left < r ≤ i and
(c) x[r] ≥ T for all i < r ≤ right.

When line 9 executes, the array will be in the correctly split condition.

Now we can state a 'final' version of qksort (and therefore of Quicksort too).
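The ten-line split procedure just proved correct translates into Python like this (a sketch with 0-based indices; the output position i is returned rather than passed as a parameter):

```python
import random

def split(x, left, right):
    # choose a random entry T of x[left..right] and split the subarray around it;
    # returns the final position of T, with smaller entries to its left
    L = random.randint(left, right)       # line 1
    x[left], x[L] = x[L], x[left]         # line 2: splitter now first
    T = x[left]                           # line 4
    i = left                              # line 5
    for j in range(left + 1, right + 1):  # line 6
        if x[j] < T:                      # line 7
            i += 1                        # line 8
            x[i], x[j] = x[j], x[i]
    x[left], x[i] = x[i], x[left]         # line 9
    return i
```

Whatever the random choice on line 1, conditions (a), (b), (c) of the theorem guarantee that after the call every entry before position i is smaller than x[i] and every entry after it is at least x[i].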

procedure qksort(x: array; left, right: integer);
{sorts the subarray x[left], . . . , x[right]}
if right − left ≥ 1 then
split(x, left, right, i);
qksort(x, left, i − 1);
qksort(x, i + 1, right)
end.{qksort}

procedure Quicksort(x: array; n: integer);
{sorts an array of length n}
qksort(x, 1, n)
end.{Quicksort}

Now let's consider the complexity of Quicksort. How long does it take to sort an array? Well, the amount of time will depend on exactly which array we happen to be sorting, and furthermore it will depend on how lucky we are with our random choices of splitting elements.

If we want to see Quicksort at its worst, suppose we have a really unlucky day, and that the random choice of the splitter element happens to be the smallest element in the array. If the splitter element is the smallest array entry, then it won't do a whole lot of splitting. In fact, if the original array had n entries, then one of the two recursive calls will be to an array with no entries at all, and the other recursive call will be to an array of n − 1 entries. Not only that, but suppose this kind of unlucky choice is repeated on each and every recursive call.

If L(n) is the number of paired comparisons that are required in this extreme scenario, then, since the number of comparisons that are needed to carry out the call to split an array of length n is n − 1, it follows that

L(n) = L(n − 1) + n − 1 (n ≥ 1; L(0) = 0).

Hence

L(n) = 1 + 2 + · · · + (n − 1) = Θ(n^2).

The worst-case behavior of Quicksort is therefore quadratic in n. In its worst moods, therefore, it is as bad as 'slowsort' above. Whereas the performance of slowsort is pretty much always quadratic, no matter what the input is, Quicksort is usually a lot faster than its worst case discussed above.

We want to show that on the average the running time of Quicksort is O(n log n). The first step is to get quite clear about what the word 'average' refers to. The actual numerical values that appear in the input array are not in themselves important, except that, to simplify the discussion, we will assume that they are all different. Therefore, for the purposes of analysis, we will assume that the entries of the input array are exactly the set of numbers 1, 2, . . . , n in some order. There are n! possible orders in which these elements might appear, so we are considering the action of Quicksort on just these n! inputs.

Second, for each particular one of these inputs, the choices of the splitting elements will be made by choosing, at random, one of the entries of the array at each step of the recursion. We will also average over all such random choices of the splitting elements. Then the performance of Quicksort can depend only on the sequence of size relationships in the input array and the choices of the random splitting elements; the only thing that will matter, as far as the computation is concerned, will be the set of outcomes of all of the paired comparisons of two elements that are done by the algorithm.

Therefore, when we speak of the function F(n), the average complexity of Quicksort, we are speaking of the average number of pairwise comparisons of array entries that are made by Quicksort.
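Putting the pieces together, a runnable Python sketch of the whole routine (the split step inlined, 0-based indices rather than the book's 1-based) looks like this:

```python
import random

def quicksort(x):
    # sorts the list x in place by recursive splitting around a random entry
    def qksort(left, right):
        if right - left >= 1:
            # create a splitter for x[left..right]: random choice, then rearrange
            L = random.randint(left, right)
            x[left], x[L] = x[L], x[left]
            T, i = x[left], left
            for j in range(left + 1, right + 1):
                if x[j] < T:
                    i += 1
                    x[i], x[j] = x[j], x[i]
            x[left], x[i] = x[i], x[left]
            # recursive calls on the two sides of the splitter
            qksort(left, i - 1)
            qksort(i + 1, right)
    qksort(0, len(x) - 1)
    return x
```

Calling quicksort is the analogue of calling qksort with left := 1 and right := n.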

The averaging is done first of all over all n! of the possible input orderings of the array elements and, second, for each such input ordering, over all sequences of choices of the splitting elements.

Now let's consider the behavior of the function F(n). What we are going to show is that F(n) = O(n log n). The labor that F(n) estimates has two components. First there are the pairwise comparisons involved in choosing a splitting element and rearranging the array about the chosen splitting value. Second there are the comparisons that are done in the two recursive calls that follow the creation of a splitter.

As we have seen, the number of comparisons involved in splitting the array is n − 1. Hence it remains to estimate the number of comparisons in the recursive calls. For this purpose, suppose we have rearranged the array about the splitting element, and that it has turned out that the splitting entry now occupies the ith position in the array. Our next remark is that each value of i = 1, 2, . . . , n is equally likely to occur. The reason for this is that we chose the splitter originally by choosing a random array entry. Since all orderings of the array entries are equally likely, the one that we happened to have chosen was just as likely to have been the largest entry as to have been the smallest, or the 17th-from-largest, or whatever.

If the splitting element lives in the ith array position, the two recursive calls to Quicksort will be on two subarrays, one of which has length i − 1 and the other of which has length n − i. The average numbers of pairwise comparisons that are involved in such recursive calls are F(i − 1) and F(n − i), respectively. Since each value of i is equally likely, each i has probability 1/n of being chosen as the residence of the splitter.

It follows that our average complexity function F satisfies the relation

F(n) = n − 1 + (1/n) Σ_{i=1}^{n} {F(i − 1) + F(n − i)}   (n ≥ 1)   (2.2.2)

together with the initial value F(0) = 0.

How can we find the solution of the recurrence relation (2.2.2)? First let's simplify it a little by noticing that

Σ_{i=1}^{n} {F(n − i)} = F(n − 1) + F(n − 2) + · · · + F(0) = Σ_{i=1}^{n} {F(i − 1)}   (2.2.3)

and so (2.2.2) can be written as

F(n) = n − 1 + (2/n) Σ_{i=1}^{n} F(i − 1).   (2.2.4)

We can simplify (2.2.4) a lot by getting rid of the summation sign. This next step may seem like a trick at first (and it is!), but it's a trick that is used in so many different ways that now we call it a 'method.' What we do is first to multiply (2.2.4) by n, to get

nF(n) = n(n − 1) + 2 Σ_{i=1}^{n} F(i − 1).   (2.2.5)

Next, in (2.2.5) we replace n by n − 1, yielding

(n − 1)F(n − 1) = (n − 1)(n − 2) + 2 Σ_{i=1}^{n−1} F(i − 1).   (2.2.6)

Finally we subtract (2.2.6) from (2.2.5), and the summation sign obligingly disappears, leaving behind just

nF(n) − (n − 1)F(n − 1) = n(n − 1) − (n − 1)(n − 2) + 2F(n − 1).   (2.2.7)
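As a check on the algebra, one can compute F(n) with exact rational arithmetic from the simplified recurrence (2.2.4) and compare it with the closed form 2(n + 1)(Σ_{j=1}^{n} 1/j) − 4n of (2.2.11). This is a sketch of ours, not part of the text:

```python
from fractions import Fraction

def F_values(nmax):
    # F(n) = n - 1 + (2/n) * sum_{i=1}^{n} F(i-1), with F(0) = 0  -- eq. (2.2.4)
    vals = [Fraction(0)]
    for n in range(1, nmax + 1):
        vals.append(n - 1 + Fraction(2, n) * sum(vals[:n]))
    return vals

def closed_form(n):
    # 2(n+1) * H_n - 4n, with H_n the nth harmonic number  -- eq. (2.2.11)
    H = sum(Fraction(1, j) for j in range(1, n + 1))
    return 2 * (n + 1) * H - 4 * n
```

For example, F(1) = 0, F(2) = 1 and F(3) = 8/3, and both computations agree for every n.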

After some tidying up, (2.2.7) becomes

F(n) = (1 + 1/n) F(n − 1) + (2 − 2/n),   (2.2.8)

which is exactly in the form of the general first-order recurrence relation that we discussed in section 1.4. There we saw that to solve (2.2.8) the winning tactic is to change to a new variable yn that is defined, in this case, by

F(n) = ((n + 1)/n) · (n/(n − 1)) · · · (2/1) · yn = (n + 1) yn.   (2.2.9)

If we make the change of variable F(n) = (n + 1) yn in (2.2.8), then it takes the form

yn = yn−1 + 2(n − 1)/(n(n + 1))   (n ≥ 1; y0 = 0)   (2.2.10)

as an equation for the yn's. The solution of (2.2.10) is obviously

yn = 2 Σ_{j=1}^{n} (j − 1)/(j(j + 1)) = 2 Σ_{j=1}^{n} { 2/(j + 1) − 1/j } = 2 Σ_{j=1}^{n} 1/j − 4n/(n + 1)   (n ≥ 1).

Hence, from (2.2.9),

F(n) = 2(n + 1) { Σ_{j=1}^{n} 1/j } − 4n.   (2.2.11)

This, then, is the average number of pairwise comparisons that we do if we Quicksort an array of length n. Evidently F(n) ∼ 2n log n (n → ∞) (see (1.1.7) with g(t) = 1/t), and we have proved

Theorem 2.2.2. The average number of pairwise comparisons of array entries that Quicksort makes when it sorts arrays of n elements is exactly as shown in (2.2.11), and is ∼ 2n log n (n → ∞). Quicksort is, on the average, a very quick sorting method, even though its worst case requires a quadratic amount of labor.

Exercises for section 2.2

1. Write out an array of 10 numbers that contains no splitter.

2. Write out an array of 10 numbers that contains 10 splitters.

3. Write a program that does the following. Given a positive integer n, choose 100 random permutations of [1, 2, . . . , n],* and count how many of the 100 had at least one splitter. Execute your program for n = 5, 6, . . . , 12 and tabulate the results.

* For a fast and easy way to do this see A. Nijenhuis and H. S. Wilf, Combinatorial Algorithms, 2nd ed. (New York: Academic Press, 1978), chap. 6.

4. Think of some method of sorting n numbers that isn't in the text. In the worst case, how many comparisons might your method do? How many swaps?

5. Suppose that the procedure split is called on the array x = {2, . . . , 6} with left = 1 and right = 10, and suppose the random integer L in step 1 happens to be 5. Carry out the complete split algorithm (not on a computer; use pencil and paper), and record the condition of the array x after each value of j is processed in the for j = . . . loop.

6. Suppose H(0) = 1 and H(n) ≤ 1 + (1/n) Σ_{i=1}^{n} H(i − 1) (n ≥ 1). How big might H(n) be?

7. If Q(0) = 0 and Q(n) ≤ n + (2/n) Σ_{i=1}^{n} Q(i − 1) (n ≥ 1), how big might Q(n) be?

8. (Research problem) Find the asymptotic behavior, for large n, of the probability that a randomly chosen permutation of n letters has a splitter.

2.3 Recursive graph algorithms

Algorithms on graphs are another rich area of applications of recursive thinking. Some of the problems are quite different from the ones that we have so far been studying in that they seem to need exponential amounts of computing time, in their worst cases, rather than the polynomial times that were required for sorting problems.

We will illustrate the dramatically increased complexity with a recursive algorithm for the 'maximum independent set problem,' one which has received a great deal of attention in recent years. Suppose a graph G is given. By an independent set of vertices of G we mean a set of vertices no two of which are connected by an edge of G. In the graph in Fig. 2.3.1 the set {1, 2, 6} is an independent set and so is the set {1, 2, 3}. The largest independent set of vertices in the graph shown there is the set {1, 2, 3, 6}.

[Fig. 2.3.1]

The problem of finding the size of the largest independent set in a given graph is computationally very difficult, although no one has proved the nonexistence of fast (polynomial time) algorithms. All algorithms known to date require exponential amounts of time, in their worst cases. If the problem itself seems unusual, and maybe not deserving of a lot of attention, be advised that in Chapter 5 we will see that it is a member in good standing of a large family of very important computational problems (the 'NP-complete' problems) that are tightly bound together, in that if we can figure out better ways to compute any one of them, then we will be able to do all of them faster.

Here is an algorithm for the independent set problem that is easy to understand and to program, although it may take a long time to run on a large graph G. Let's distinguish two kinds of independent sets of vertices of G. Fix some vertex of the graph, say vertex v∗. There are those independent sets that contain vertex v∗ and those that don't contain vertex v∗.

If an independent set S of vertices contains vertex v∗, then what does the rest of the set S consist of? The remaining vertices of S are an independent set in a smaller graph, namely the graph that is obtained from G by deleting vertex v∗ as well as all vertices that are connected to vertex v∗ by an edge. This latter set of vertices is called the neighborhood of vertex v∗, and is written Nbhd(v∗). The set S consists, of course, of vertex v∗ together with an independent set of vertices from the graph G − {v∗} − Nbhd(v∗).

Now consider an independent set S that doesn't contain vertex v∗. In that case the set S is simply an independent set in the smaller graph G − {v∗}.

We are looking for the size of the largest independent set of vertices of G. Suppose we denote that number by maxset(G).
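Later in this section the text mentions the most simple-minded algorithm for this problem: examine every one of the 2^n subsets of the vertices and test each for independence. As a sketch (the helper names and the edge representation are ours), that baseline looks like this:

```python
from itertools import combinations

def is_independent(vertices, edges):
    # a set of vertices is independent if no edge joins two of its members
    return not any((u, v) in edges or (v, u) in edges
                   for u, v in combinations(vertices, 2))

def maxset_brute(n, edges):
    # examine subsets of {1,...,n} from largest to smallest and return the
    # size of the first independent one found -- Theta(2^n) work in general
    for k in range(n, 0, -1):
        for s in combinations(range(1, n + 1), k):
            if is_independent(s, edges):
                return k
    return 0
```

On a 5-cycle, for example, the routine reports that the largest independent set has 2 vertices.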

We now have all of the ingredients of a recursive algorithm. Suppose we have found the two numbers maxset(G − {v∗}) and maxset(G − {v∗} − Nbhd(v∗)). Then, from the discussion above, we have the relation

maxset(G) = max( maxset(G − {v∗}), 1 + maxset(G − {v∗} − Nbhd(v∗)) ).

We obtain the following recursive algorithm.

function maxset1(G);
{returns the size of the largest independent set of vertices of G}
if G has no edges
then maxset1 := |V(G)|
else
choose some nonisolated vertex v∗ of G;
n1 := maxset1(G − {v∗});
n2 := maxset1(G − {v∗} − Nbhd(v∗));
maxset1 := max(n1, 1 + n2)
end.{maxset1}

Example: Here is an example of a graph G and the result of applying the maxset1 algorithm to it. Let the graph G be a 5-cycle: it has 5 vertices and its edges are (1, 2), (2, 3), (3, 4), (4, 5), (1, 5). What are the two graphs on which the algorithm calls itself recursively? Suppose we select vertex number 1 as the chosen vertex v∗ in the algorithm. Then G − {1} and G − {1} − Nbhd(1) are respectively the two graphs shown in Fig. 2.3.2.

[Fig. 2.3.2: G − {1} (vertices 2, 3, 4, 5) and G − {1} − Nbhd(1) (vertices 3, 4)]

The reader should now check that the size of the largest independent set of G is equal to the larger of the two numbers maxset1(G − {1}) and 1 + maxset1(G − {1} − Nbhd(1)) in this example.

Of course the creation of these two graphs from the original input graph is just the beginning of the story, as far as the computation is concerned. Unbeknownst to the programmer, who innocently wrote the recursive routine maxset1 and then sat back to watch, the compiler will go ahead with the computation by generating a tree-full of graphs. In Fig. 2.3.3 we show the collection of all of the graphs that the compiler might generate while executing a single call to maxset1 on the input graph of this example. In each case, the graph that is below and to the left of a given one is obtained by deleting a single vertex, and the one below and to the right of each graph is obtained by deleting a single vertex and its entire neighborhood.

Now we are going to study the complexity of maxset1. The results will be sufficiently depressing that we will then think about how to speed up the algorithm, and we will succeed in doing that to some extent.

To open the discussion, let's recall that in Chapter 0 it was pointed out that the complexity of a calculation is usefully expressed as a function of the number of bits of input data. In problems about graphs, however, it is more natural to think of the amount of labor as a function of n, the number of vertices of the graph; in problems about matrices it is more natural to use n, the size of the matrix; and so forth. Do these distinctions alter the classification of problems into 'polynomial time do-able' vs. 'hard'? Take the graph problems, for instance. How many bits of input data does it take to describe a graph? Well, certainly we can march through the entire list of n(n − 1)/2 pairs of vertices and check off the ones that are actually edges in the input graph to the problem.
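Before carrying out the timing estimate, here is the maxset1 routine in runnable form. This is a sketch in Python, with the graph represented as a dict from vertex to neighbor set (a representation we chose for the sketch; the book leaves it abstract):

```python
def maxset1(adj):
    # adj: dict mapping each vertex to the set of its neighbors;
    # returns the size of the largest independent set of vertices
    nonisolated = [v for v, nbrs in adj.items() if nbrs]
    if not nonisolated:            # no edges: all vertices are independent
        return len(adj)
    v = nonisolated[0]             # choose some nonisolated vertex v*

    def delete(g, gone):
        # induced subgraph on the vertices not in `gone`
        return {u: g[u] - gone for u in g if u not in gone}

    n1 = maxset1(delete(adj, {v}))            # best set avoiding v*
    n2 = maxset1(delete(adj, {v} | adj[v]))   # best set containing v*
    return max(n1, 1 + n2)
```

For the 5-cycle of the example the routine returns 2, the larger of maxset1(G − {1}) = 2 and 1 + maxset1(G − {1} − Nbhd(1)) = 2.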

[Fig. 2.3.3: A tree-full of graphs is created]

Hence we can describe a graph to a computer by making a list of n(n − 1)/2 0's and 1's. Each 1 represents a pair that is an edge, each 0 represents one that isn't an edge. Thus Θ(n^2) bits describe a graph. Since n^2 is a polynomial in n, any function of the number of input data bits that can be bounded by a polynomial in same can also be bounded by a polynomial in n itself. Hence, in the case of graph algorithms, the 'easiness' vs. 'hardness' judgment is not altered if we base the distinction on polynomials in n itself, rather than on polynomials in the number of bits of input data. Therefore, with a clear conscience, we are going to estimate the running time or complexity of graph algorithms in terms of the number of vertices of the graph that is input.

Now let's do this for algorithm maxset1 above. The first step is to find out if G has any edges. To do this we simply have to look at the input data; in the worst case we might look at all of the input data, all Θ(n^2) bits of it. Then, if G actually has some edges, the additional labor needed to process G consists of two recursive calls on smaller graphs and one computation of the larger of two numbers. If F(G) denotes the total amount of computational labor that we do in order to find maxset1(G), then we see that

F(G) ≤ cn^2 + F(G − {v∗}) + F(G − {v∗} − Nbhd(v∗)).   (2.3.1)

Next, let f(n) = max_{|V(G)|=n} F(G), and take the maximum of (2.3.1) over all graphs G of n vertices. The result is that

f(n) ≤ cn^2 + f(n − 1) + f(n − 2)   (2.3.2)

because the graph G − {v∗} − Nbhd(v∗) might have as many as n − 2 vertices, and would have that many if v∗ had exactly one neighbor.

Now it's time to 'solve' the recurrent inequality (2.3.2). Fortunately the hard work has all been done, and the answer is in Theorem 1.4.1. That theorem was designed expressly for the analysis of recursive algorithms, and in this case it tells us that f(n) = O((1.619)^n). Indeed the number c in that theorem is (1 + √5)/2 = 1.61803 . . ., and we chose the '1.619' that appears in the conclusion of the theorem simply by rounding c upwards.

What have we learned? Algorithm maxset1 will find the answer in a time of no more than O(1.619^n) units if the input graph G has n vertices.
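The growth rate claimed for (2.3.2) can be sanity-checked numerically: iterating the recurrence makes the ratio f(n)/f(n − 1) approach the golden ratio (1 + √5)/2, the positive root of c^2 = c + 1. The constants below (c = 1, f(0) = f(1) = 1) are illustrative choices of ours; they affect only lower-order terms, not the limiting ratio:

```python
import math

def f_bound(nmax, c=1.0):
    # iterate f(n) = c*n^2 + f(n-1) + f(n-2), matching the shape of (2.3.2)
    f = [1.0, 1.0]
    for n in range(2, nmax + 1):
        f.append(c * n * n + f[n - 1] + f[n - 2])
    return f

phi = (1 + math.sqrt(5)) / 2     # 1.61803..., root of c^2 = c + 1
f = f_bound(200)
ratio = f[200] / f[199]          # converges to phi as n grows
```

The polynomial forcing term cn^2 is swamped by the Fibonacci-like homogeneous part, which is why the bound is exponential with base about 1.619.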

2.3 Recursive graph algorithms

Compare this with the most simple-minded algorithm that one might think of for this problem, which is to examine every single subset of the vertices of G and ask if it is an independent set or not. That algorithm would take Θ(2^n) time units, because there are 2^n subsets of vertices to look at. Hence we have traded in a 2^n for a 1.619^n by being a little bit cagey about the algorithm.

We can obviously do better if we choose v* in such a way as to be certain that it has at least two neighbors. If we were to do that, then although we wouldn't affect the number of vertices of G − {v*} (always n − 1), we would at least reduce the number of vertices of G − {v*} − Nbhd(v*) as much as possible. So, as our next thought, we might replace the instruction 'choose some nonisolated vertex v* of G' in maxset1 by an instruction 'choose some vertex v* of G that has at least two neighbors.' Then we could be quite certain that G − {v*} − Nbhd(v*) would have at most n − 3 vertices.

What if there isn't any such vertex in the graph G? Then G would contain only vertices with 0 or 1 neighbors. Such a graph G would be a collection of E disjoint edges together with a number m of isolated vertices. The size of the largest independent set of vertices in such a graph is easy to find: a maximum independent set contains one vertex from each of the E edges and it contains all m of the isolated vertices. Hence in this case, maxset = E + m = |V(G)| − |E(G)|, and we obtain a second try at a good algorithm in the following form.

procedure maxset2(G);
{returns the size of the largest independent set of vertices of G}
if G has no vertex of degree ≥ 2
  then maxset2 := |V(G)| − |E(G)|
  else
    choose a vertex v* of degree ≥ 2;
    n1 := maxset2(G − {v*});
    n2 := maxset2(G − {v*} − Nbhd(v*));
    maxset2 := max(n1, 1 + n2)
end.{maxset2}

How much have we improved the complexity estimate? If we apply to maxset2 the reasoning that led to (2.3.2), we find

    f(n) ≤ cn^2 + f(n − 1) + f(n − 3)    (f(0) = 0)    (2.3.3)

where f(n) is once more the worst-case time bound for graphs of n vertices. Just as before, (2.3.3) is a recurrent inequality of the form that was studied at the end of section 1.4. Using the conclusion of that theorem, we find from (2.3.3) that f(n) = O((c + ε)^n), where c = 1.46557... is the positive root of the equation c^3 = c^2 + 1. The net result of our effort to improve maxset1 to maxset2 has been to reduce the running-time bound from O(1.619^n) to O(1.47^n), which isn't a bad day's work.

Can we do still better? There have in fact been a number of improvements of the basic maxset1 algorithm worked out. Of these the most successful is perhaps the one of Tarjan and Trojanowski that is cited in the bibliography at the end of this chapter. We are not going to work out all of those ideas here, but instead we will show what kind of improvements on the basic idea will help us to do better in the time estimate.

The idea will be that since in maxset2 we were able to insure that v* had at least two neighbors, why not try to insure that v* has at least 3 of them? In the exercises below we will develop maxset3, whose running time will be O(1.39^n). As long as we have been able to reduce the time bound more and more by insuring that the selected vertex has lots of neighbors, why don't we keep it up, as our next thought, and insist that v* should have 4 or more neighbors?

Regrettably the method runs out of steam precisely at that moment. To see why, ask what the 'trivial case' would then look like: we would be working on a graph G in which no vertex has more than 3 neighbors. Well, what 'trivial thing' shall we do, in this 'trivial case'? The fact is that there isn't any way of finding the maximum independent set in a graph where all vertices have ≤ 3 neighbors that's any faster than the general methods that we've already discussed. In fact, if one could find a fast method for that restricted problem it would have extremely important consequences, because we would then be able to do all graphs rapidly, not just those special ones.
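The maxset2 recursion can be sketched concretely. The following Python rendering is our own illustration (the adjacency-dictionary representation and the helper `induced` are our choices, not the book's); it returns the size of a largest independent set.

```python
def maxset2(adj):
    """Size of a largest independent set; adj maps vertex -> set of neighbors."""
    # Trivial case: every vertex has degree <= 1, so G is a collection of
    # disjoint edges plus isolated vertices, and the answer is |V(G)| - |E(G)|.
    if all(len(nbrs) <= 1 for nbrs in adj.values()):
        num_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
        return len(adj) - num_edges
    # Otherwise pick a vertex v* of degree >= 2 and branch on it.
    v = next(u for u, nbrs in adj.items() if len(nbrs) >= 2)

    def induced(removed):
        # Subgraph induced on the vertices outside `removed`.
        return {u: nbrs - removed for u, nbrs in adj.items() if u not in removed}

    n1 = maxset2(induced({v}))            # largest set that avoids v
    n2 = maxset2(induced({v} | adj[v]))   # largest set that contains v
    return max(n1, 1 + n2)
```

For example, on a 5-cycle the routine returns 2, the size of a largest independent set of C5.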

We will learn more about this phenomenon in Chapter 5, but for the moment let's leave just the observation that the general problem of maxset turns out to be no harder than the special case of maxset in which no vertex has more than 3 neighbors.

Aside from the complexity issue, the algorithm maxset has shown how recursive ideas can be used to transform questions about graphs to questions about smaller graphs. Here's another example of such a situation.

There are many interesting computational and theoretical problems in the area of coloring of graphs. Suppose G is a graph, and that we have a certain supply of colors available; to be exact, suppose we have K colors. We can then attempt to color the vertices of G properly in K colors (see section 1.6). If we don't have enough colors, and G has lots of edges, this will not be possible. For example, suppose G is the graph of Fig. 2.3.4, and suppose we have just 3 colors available. Then there is no way to color the vertices without ever finding that both endpoints of some edge have the same color. On the other hand, if we have four colors available then we can do the job.

Fig. 2.3.4

Just for its general interest, even though it isn't directly relevant to what we're doing, we are going to mention the four-color theorem, and then we will turn to a study of some of the computational aspects of graph coloring. First, just for general cultural reasons, let's slow down for a while and discuss the relationship between graph colorings in general and the four-color problem.

The original question was this. Suppose that a delegation of Earthlings were to visit a distant planet and find there a society of human beings, all at war with each other. Since that race is well known for its squabbling habits, you can be sure that the planet will have been carved up into millions of little countries, each with its own ruling class, system of government, etc. The delegation wants to escape quickly, but before doing so it draws a careful map of the 5,000,000 countries into which the planet has been divided. To make the map easier to read, the countries are then colored in such a way that whenever two countries share a stretch of border they are of two different colors. Surprisingly, it was found that the coloring could be done using only red, blue, yellow and green.

Indeed, it was noticed over 100 years ago that no matter how complicated a map is drawn, and no matter how many countries are involved, it seems to be possible to color the countries in such a way that (a) every pair of countries that have a common stretch of border have different colors and (b) no more than four colors are used in the entire map. It was then conjectured that four colors are always sufficient for the proper coloring of the countries of any map at all. Settling this conjecture turned out to be a very hard problem. It was finally solved in 1976 by K. Appel and W. Haken* by means of an extraordinary proof with two main ingredients. First they showed how to reduce the general problem to only a finite number of cases, by a mathematical argument. Then, since the 'finite number' was over 1800, they settled all of those cases with quite a lengthy computer calculation. So now we have the 'Four Color Theorem,' which asserts that no matter how we carve up the plane or the sphere into countries, we will always be able to color those countries with at most four colors so that countries with a common frontier are colored differently.

We can change the map coloring problem into a graph coloring problem as follows. Given a map, from the map we will construct a graph G. There will be a vertex of G corresponding to each country on the map. Two of these vertices will be connected by an edge of the graph G if the two countries that they correspond to have a common stretch of border (we keep saying 'stretch of border' to emphasize that if two countries have just a single point in common they are allowed to have the same color).

* Every planar map is four colorable, Bull. Amer. Math. Soc. 82 (1976), 711-712.
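To make the notion of 'attempting to color the vertices of G properly in K colors' concrete, here is a small brute-force backtracking test of K-colorability. It is our own illustration, not an algorithm from the text, and like everything else in this section it is exponential in the worst case.

```python
def colorable(adj, K):
    """Can the graph be properly colored with K colors?
    adj maps vertex -> set of neighbors; simple backtracking search."""
    vertices = list(adj)
    color = {}

    def place(i):
        if i == len(vertices):
            return True          # every vertex received a legal color
        v = vertices[i]
        for c in range(K):
            # c is legal for v if no already-colored neighbor has it
            if all(color.get(u) != c for u in adj[v]):
                color[v] = c
                if place(i + 1):
                    return True
                del color[v]     # backtrack
        return False

    return place(0)
```

On the complete graph of 4 vertices, four colors suffice and three do not, matching the discussion of Fig. 2.3.4.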

As an illustration, we show in Fig. 2.3.5(a) a map of a distant planet, and in Fig. 2.3.5(b) the graph that results from the construction that we have just described.

Fig. 2.3.5(a)    Fig. 2.3.5(b)

By a 'planar graph' we mean a graph G that can be drawn in the plane in such a way that two edges never cross (except that two edges at the same vertex have that vertex in common). The graph that results from changing a map of countries into a graph as described above is always a planar graph. In Fig. 2.3.6(a) we show a planar graph G. This graph doesn't look planar because two of its edges cross. However, that isn't the graph's fault, because with a little more care we might have drawn the same graph as in Fig. 2.3.6(b), in which its planarity is obvious. Don't blame the graph if it doesn't look planar. It might be planar anyway!

Fig. 2.3.6(a)    Fig. 2.3.6(b)

The question of recognizing whether a given graph is planar is itself a formidable problem, although the solution, due to J. Hopcroft and R. Tarjan,* is an algorithm that makes the decision in linear time, i.e., in O(V) time for a graph of V vertices.

Although every planar graph can be properly colored in four colors, there are still all of those other graphs that are not planar to deal with. For any one of those graphs we can ask, if a positive integer K is given, whether or not its vertices can be K-colored properly. As if that question weren't hard enough, we might ask for even more detail, namely about the number of ways of properly coloring the vertices of a graph. For instance, suppose G is the empty graph of n vertices, the graph that has no edges at all. Then G has quite a large number of proper colorings, K^n of them, to be exact, if we have K colors to work with. Other graphs of n vertices have fewer proper colorings than that, and an interesting computational question is to count the proper colorings of a given graph. We will now find a recursive algorithm that will answer this question. Again, the complexity of the algorithm will be exponential, but as a small consolation we note that no polynomial time algorithm for this problem is known.

Choose an edge e of the graph, and let its endpoints be v and w. Now delete the edge e from the graph, and let the resulting graph be called G − {e}. Then we will distinguish two kinds of proper colorings of G − {e}: those in which vertices v and w have the same color and those in which v and w have different colors. Obviously the number of proper colorings of G − {e} in K colors is the sum of the numbers of colorings of each of these two kinds.

* Efficient planarity testing, J. Assoc. Comp. Mach. 21 (1974), 549-568.
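The quantity being discussed, the number of proper K-colorings, can be computed by sheer enumeration for tiny graphs. This sketch is our own (not the book's algorithm); it simply tries all K^n color assignments, which makes the K^n count for the empty graph visible directly.

```python
from itertools import product

def count_colorings(vertices, edges, K):
    """Number of proper K-colorings, by brute force over all K^n
    assignments (exponential; for illustration on tiny graphs only)."""
    total = 0
    for assign in product(range(K), repeat=len(vertices)):
        color = dict(zip(vertices, assign))
        # proper: no edge has both endpoints the same color
        if all(color[v] != color[w] for v, w in edges):
            total += 1
    return total
```

With no edges at all, every one of the K^n assignments is proper; for a triangle with 3 colors the count is 3·2·1 = 6.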

Consider the proper colorings in which vertices v and w have the same color. We claim that the number of such colorings is equal to the number of all colorings of a certain new graph G/{e}, whose construction we now describe.

The vertices of G/{e} consist of the vertices of G other than v or w, together with one new vertex that we will call 'vw' (so G/{e} will have one less vertex than G has). Now we describe the edges of G/{e}. First, if a and b are two vertices of G/{e} neither of which is the new vertex 'vw', then (a, b) is an edge of G/{e} if and only if it is an edge of G. Second, (vw, b) is an edge of G/{e} if and only if either (v, b) or (w, b) (or both) is an edge of G.

We can think of this as 'collapsing' the graph G, by imagining that the edges of G are elastic bands, and that we squeeze vertices v and w together into a single vertex. The result is G/{e} (anyway, it is if we replace any resulting double bands by single ones!). In Fig. 2.3.7(a) we show a graph G of 7 vertices and a chosen edge e. The two endpoints of e are v and w. In Fig. 2.3.7(b) we show the graph G/{e} that is the result of the construction that we have just described.

Fig. 2.3.7(a)    Fig. 2.3.7(b)

The point of the construction is the following

Lemma 2.3.1. Let e = (v, w) be an edge of G. Then the number of proper K-colorings of G − {e} in which v and w have the same color is equal to the number of all proper colorings of the graph G/{e}.

Proof: Suppose G/{e} has a proper K-coloring. Color the vertices of G − {e} itself in K colors as follows. Every vertex of G − {e} other than v or w keeps the same color that it has in the coloring of G/{e}. Vertex v and vertex w each receive the color that vertex vw has in the coloring of G/{e}. Now we have a K-coloring of the vertices of G − {e}. It is a proper coloring, because if f is any edge of G − {e}, then the two endpoints of f have different colors. Indeed, this is obviously true if neither endpoint of f is v or w, because the coloring of G/{e} was a proper one. There remains only the case where one endpoint of f is, say, v and the other one is some vertex x other than v or w. But then the colors of v and x must be different, because vw and x were joined in G/{e} by an edge, and therefore must have gotten different colors there.

It remains to look at the case where vertices v and w receive different colors.

Lemma 2.3.2. Let (v, w) ∈ E(G). Then the number of proper K-colorings of G − {e} in which v and w have different colors is equal to the number of all proper K-colorings of G itself.

Proof: Obvious (isn't it?).

Now let's put together the results of the two lemmas above. To get back to the main argument, we were trying to compute the number of proper K-colorings of G − {e}. We observed that in any K-coloring v and w have either the same or different colors. We have shown that the number of colorings in which they receive the same color is equal to the number of all proper colorings of a certain smaller (one less vertex) graph G/{e}. Let P(K, G) denote the number of ways of properly coloring the vertices of a given graph G. Then lemmas 2.3.1 and 2.3.2 assert that

    P(K, G − {e}) = P(K, G/{e}) + P(K, G)
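The collapsing construction G/{e} is easy to carry out on an edge list. Here is a sketch of our own (the merged vertex 'vw' is represented by the pair (v, w), a naming choice of ours); note how the `frozenset` deduplication plays the role of replacing double elastic bands by single ones.

```python
def contract(edges, e):
    """Edge list of G/{e}: merge the endpoints of e = (v, w) into a single
    new vertex, dropping any duplicate edges that result."""
    v, w = e
    merged = (v, w)                      # the new vertex 'vw'
    new_edges = set()
    for a, b in edges:
        if {a, b} == {v, w}:
            continue                     # the contracted edge itself disappears
        a2 = merged if a in (v, w) else a
        b2 = merged if b in (v, w) else b
        new_edges.add(frozenset((a2, b2)))   # frozenset kills duplicates
    return [tuple(f) for f in new_edges]
```

Contracting one edge of a triangle leaves a single edge: the two remaining edges both become (vw, third vertex) and are merged into one.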

or, if we solve for P(K, G), then we have

    P(K, G) = P(K, G − {e}) − P(K, G/{e}).    (2.3.4)

The quantity P(K, G), the number of ways of properly coloring the vertices of a graph G in K colors, is called the chromatic polynomial of G. We claim that it is, in fact, a polynomial in K of degree |V(G)|. For instance, if G is the complete graph of n vertices then obviously P(K, G) = K(K − 1)···(K − n + 1), and that is indeed a polynomial in K of degree n.

Proof of claim: The claim is certainly true if G has just one vertex. Next suppose the assertion is true for graphs of < V vertices; suppose also that it is true for all graphs of V vertices and fewer than E edges, and let G have V vertices and E edges; then we claim it is true for G also. This is surely true if G has V vertices and no edges at all, since then the chromatic polynomial is K^V, a polynomial of degree V. Else, P(K, G − {e}) is, by induction, a polynomial of the required degree V, because G − {e} has fewer edges than G does, and G/{e} has fewer vertices than G has, so P(K, G/{e}) is a polynomial of lower degree. Then (2.3.4) implies that P(K, G) is a polynomial in K of degree V. The claim is proved.

Equation (2.3.4) gives a recursive algorithm for computing the chromatic polynomial of a graph G, since the two graphs that appear on the right are both 'smaller' than G: one in the sense that it has fewer edges than G has, and the other in that it has fewer vertices.

function chrompoly(G: graph): polynomial;
{computes the chromatic polynomial of a graph G}
if G has no edges
  then chrompoly := K^|V(G)|
  else
    choose an edge e of G;
    chrompoly := chrompoly(G − {e}) − chrompoly(G/{e})
end.{chrompoly}

Next we are going to look at the complexity of the algorithm chrompoly (we will also refer to it as 'the delete-and-identify' algorithm). The graph G can be input in any one of a number of ways; for example, we might input the full list of edges of G, as a list of pairs of vertices. The first step of the computation is to choose the edge e and to create the edge list of the graph G − {e}. The latter operation is trivial, since all we have to do is to ignore one edge in the list. Next we call chrompoly on the graph G − {e}. The third step is to create the edge list of the collapsed graph G/{e} from the edge list of G itself. That involves some work, but it is rather routine, and its cost is linear in the number of edges of G, say c|E(G)|. Finally we call chrompoly on the graph G/{e}.

Let F(V, E) denote the maximum cost of calling chrompoly on any graph of at most V vertices and at most E edges. Then we see at once that

    F(V, E) ≤ F(V, E − 1) + cE + F(V − 1, E − 1)    (2.3.5)

together with F(V, 0) = 0. If we put, successively, E = 1, 2, 3 in (2.3.5), we find that F(V, 1) ≤ c, F(V, 2) ≤ 4c, F(V, 3) ≤ 11c. Hence we seek a solution of (2.3.5) in the form F(V, E) ≤ f(E)c, and we quickly find that if

    f(E) = 2f(E − 1) + E    (f(0) = 0)    (2.3.6)

then we will have such a solution. Since (2.3.6) is a first-order difference equation of the form studied in section 1.4, we find that

    f(E) = 2^E Σ_{j=0}^{E} j·2^(−j) ∼ 2^(E+1).    (2.3.7)
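The delete-and-identify recursion of chrompoly can be rendered directly. This sketch is our own simplification: it evaluates P(K, G) at a numeric value of K rather than returning the polynomial itself, with the contraction step done inline on the edge list.

```python
def P(K, vertices, edges):
    """Evaluate the chromatic polynomial by delete-and-identify, (2.3.4):
    P(K, G) = P(K, G - {e}) - P(K, G/{e})."""
    if not edges:
        return K ** len(vertices)        # no edges: every assignment is proper
    (v, w), rest = edges[0], edges[1:]
    # Deletion: same vertex set, one edge fewer.
    deleted = P(K, vertices, rest)
    # Contraction: merge v and w into the new vertex vw, dedupe edges.
    vw = (v, w)
    cv = [u for u in vertices if u not in (v, w)] + [vw]
    ce = set()
    for a, b in rest:
        a = vw if a in (v, w) else a
        b = vw if b in (v, w) else b
        if a != b:                       # drop loops (can't arise in simple graphs)
            ce.add(frozenset((a, b)))
    contracted = P(K, cv, [tuple(f) for f in ce])
    return deleted - contracted
```

On a triangle this gives K(K − 1)(K − 2), as the complete-graph formula predicts, and on a path of 3 vertices it gives the tree value K(K − 1)^2.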

The last '∼' follows from the evaluation Σ j·2^(−j) = 2 that we discussed in section 1.3.

To summarize the developments so far, we have found out that the chromatic polynomial of a graph can be computed recursively by an algorithm whose cost is O(2^E) for graphs of E edges. This is exponential cost, and such computations are prohibitively expensive except for graphs of very modest numbers of edges.

Of course the mere fact that our proved time estimate is O(2^E) doesn't necessarily mean that the algorithm can be that slow, because maybe our complexity analysis wasn't as sharp as it might have been. However, there really are graphs that make the algorithm do an amount 2^|E(G)| of work. For instance, consider the graph G(s, t) that consists of s disjoint edges and t isolated vertices, for a total of 2s + t vertices altogether. We might imagine arranging the computation so that the extra isolated vertices will be 'free,' i.e., will not cost any additional labor. Then the work that we do on G(s, t) will depend only on s. If we choose an edge of G(s, t) and delete it, we get G(s − 1, t + 2), whereas the graph G/{e} is G(s − 1, t + 1). Each of these two new graphs has s − 1 edges, so the work that we do on G(s, t) will be twice as much as the work we do on G(s − 1, ·). Therefore G(s, t) will cost at least 2^s operations, and our complexity estimate wasn't a mirage.

Considering the above remarks it may be surprising that there is a slightly different approach to the complexity analysis that leads to a time bound (for the same algorithm) that is a bit sharper than O(2^E) in many cases (the work of the complexity analyst is never finished!). Let's look at the algorithm chrompoly in another way. For a graph G we can define a number γ(G) = |V(G)| + |E(G)|, which is rather an odd kind of thing to define, but it has a nice property with respect to this algorithm, namely that whatever G we begin with, if we delete the edge e then γ must drop by 1, and if we collapse the graph on the edge e then we will have lost one vertex and at least one edge, so γ will drop by at least 2. That is, we will find that γ(G − {e}) = γ(G) − 1, and

    γ(G/{e}) ≤ γ(G) − 2.    (2.3.9)

Hence, if h(γ) denotes the maximum amount of labor that chrompoly does on any graph G for which |V(G)| + |E(G)| ≤ γ, then we claim that

    h(γ) ≤ h(γ − 1) + h(γ − 2)    (γ ≥ 2).    (2.3.10)

Indeed, if G has no edges then the labor is 1 unit, whereas if G has any edges at all we can do the delete-and-identify step; by (2.3.9), the labor involved in computing the chromatic polynomial of G is then at most the quantity on the right side of (2.3.10), so the result (2.3.10) follows.

With the initial conditions h(0) = h(1) = 1, the solution of the recurrent inequality (2.3.10) is obviously the relation h(γ) ≤ F_γ, where F_γ is the Fibonacci number. We have thereby proved that the time complexity of the algorithm chrompoly is

    O(F_{|V(G)|+|E(G)|}) = O( ((1 + √5)/2)^(|V(G)|+|E(G)|) ) = O(1.62^(|V(G)|+|E(G)|)).    (2.3.11)

This analysis does not, of course, contradict the earlier estimate, but complements it. What we have shown is that the labor involved is always

    O( min( 2^|E(G)|, 1.62^(|V(G)|+|E(G)|) ) ).    (2.3.12)

On a graph with 'few' edges relative to its number of vertices (how few?) the first quantity in the parentheses in (2.3.12) will be the smaller one, whereas if G has more edges, then the second term is the smaller one. In either case the overall judgment about the speed of the algorithm (it's slow!) remains.
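The numbers in this analysis are easy to check by direct computation; the helper names below are ours. The first function tabulates the edge-count recurrence f(E) = 2f(E − 1) + E, the second the Fibonacci numbers with h(0) = h(1) = 1, and the final assertion exhibits a graph (the complete graph on 6 vertices, with V = 6 and E = 15) for which the Fibonacci-style bound of (2.3.12) is the smaller one.

```python
def f(E):
    """f(E) = 2 f(E-1) + E with f(0) = 0; the O(2^E) bound's constant."""
    val = 0
    for e in range(1, E + 1):
        val = 2 * val + e
    return val

def fib(n):
    """F_n with the normalization F_0 = F_1 = 1, matching h(0) = h(1) = 1."""
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a
```

The values f(1) = 1, f(2) = 4, f(3) = 11 reproduce the bounds F(V, 1) ≤ c, F(V, 2) ≤ 4c, F(V, 3) ≤ 11c obtained from (2.3.5).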

Exercises for section 2.3

1. Let G be a cycle of n vertices. What is the size of the largest independent set of vertices in V(G)?
2. Let G be a path of n vertices. What is the size of the largest independent set of vertices in V(G)?
3. Let G be a connected graph in which every vertex has degree 2. What must such a graph look like?
4. Let G be a not-necessarily-connected graph in which every vertex has degree ≤ 2. What must such a graph consist of? Prove your answer.
5. Let G be a connected graph in which every vertex has degree ≤ 2. What must such a graph look like? What is the size of the largest independent set of vertices in such a graph? How long would it take you to calculate that number for such a graph G? How would you do it?
6. Write out algorithm maxset3, which finds the size of the largest independent set of vertices in a graph. Its trivial case will occur if G has no vertex of degree ≥ 3. Otherwise, it will choose a vertex v* of degree ≥ 3 and proceed as in maxset2.
7. Analyze the complexity of your algorithm maxset3 from exercise 6 above.
8. Use (2.3.4) to prove by induction that P(K, G) is a polynomial in K of degree |V(G)|. Then show that if G is a tree then P(K, G) = K(K − 1)^(|V(G)|−1).
9. Let α(G) be the size of the largest independent set of vertices of a graph G, let χ(G) be its chromatic number, and let n = |V(G)|. Show that, for every G, α(G) ≥ n/χ(G).
10. Write out an algorithm that will change the vertex adjacency matrix of a graph G to the vertex adjacency matrix of the graph G/{e}, where e is a given edge of G.
11. How many edges must G have before the second quantity inside the 'O' in (2.3.12) is the smaller of the two?

2.4 Fast matrix multiplication

Everybody knows how to multiply two 2 × 2 matrices, 'of course':

    [ c11 c12 ]   [ a11 a12 ] [ b11 b12 ]
    [ c21 c22 ] = [ a21 a22 ] [ b21 b22 ]    (2.4.1)

where

    c_{i,j} = Σ_{k=1}^{2} a_{i,k} b_{k,j}    (i, j = 1, 2).    (2.4.2)

Now look at (2.4.2) a little more closely. In order to calculate each one of the 4 c_{i,j}'s we have to do 2 multiplications of numbers. The cost of multiplying two 2 × 2 matrices is therefore 8 multiplications of numbers. If we measure the cost in units of additions of numbers, the cost is 4 such additions. Hence the matrix multiplication method that is shown in (2.4.1) has a complexity of 8 multiplications of numbers and 4 additions of numbers.

Suppose we could find another way of multiplying two 2 × 2 matrices in which the cost was only 7 multiplications of numbers, together with more than 4 additions of numbers. Would that be a cause for dancing in the streets, or would it be just a curiosity, of little importance? In fact, it would be extremely important. The usefulness of the idea stems from the following amazing fact: if two 2 × 2 matrices can be multiplied with only 7 multiplications of numbers, then two N × N matrices can be multiplied using only O(N^2.81)

What we're going to do next in this section is the following: (a) describe another way of multiplying two 2 × 2 matrices in which the cost will be only 7 multiplications of numbers plus a bunch of additions of numbers, and (b) convince you that it was worth the trouble. This may seem rather unstartling, but the best ideas often have humble origins, and the consequences of such a step were fully appreciated only in 1969 by V. Strassen, to whom the ideas that we are now discussing are due.*

* V. Strassen, Gaussian elimination is not optimal, Numerische Math. 13 (1969), 354-6.
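The ordinary method of (2.4.1)-(2.4.2), with its 8 multiplications and 4 additions of numbers, looks like this in Python (our rendering, for comparison with what follows):

```python
def mat2_mult_naive(a, b):
    """Textbook 2x2 product, (2.4.2): 8 multiplications, 4 additions."""
    return [
        [a[0][0] * b[0][0] + a[0][1] * b[1][0],
         a[0][0] * b[0][1] + a[0][1] * b[1][1]],
        [a[1][0] * b[0][0] + a[1][1] * b[1][0],
         a[1][0] * b[0][1] + a[1][1] * b[1][1]],
    ]
```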

multiplications of numbers, instead of the N^3 such multiplications that the usual method involves (the number '2.81...' is log_2 7). The reason, as we will see, is that the little improvement will be pyramided by numerous recursive calls to the 2 × 2 procedure, but we get ahead of the story.

Now let's write out another way to do the 2 × 2 matrix multiplication that is shown in (2.4.1). Instead of doing it à la (2.4.2), try the following 11-step approach. First compute, from the input 2 × 2 matrices shown in (2.4.1), the following 7 quantities:

    I   = (a12 − a22) × (b21 + b22)
    II  = (a11 + a22) × (b11 + b22)
    III = (a11 − a21) × (b11 + b12)
    IV  = (a11 + a12) × b22
    V   = a11 × (b12 − b22)
    VI  = a22 × (b21 − b11)
    VII = (a21 + a22) × b11    (2.4.3)

and then calculate the 4 entries of the product matrix C = AB from the 4 formulas

    c11 = I + II − IV + VI
    c12 = IV + V
    c21 = VI + VII
    c22 = II − III + V − VII.    (2.4.4)

The first thing to notice about this seemingly overelaborate method of multiplying 2 × 2 matrices is that only 7 multiplications of numbers are used (count the '×' signs in (2.4.3)). 'Well yes,' you might reply, 'but 18 additions are needed, so where is the gain?' It will turn out that multiplications are more important than additions, not because computers can do them faster, but because when the routine is called recursively each '×' operation will turn into a multiplication of two big matrices, whereas each '±' will turn into an addition or subtraction of two big matrices, and that's much cheaper, at least if N is very large. Suddenly we very much appreciate the reduction of the number of '×' signs, because it means one less multiplication of large matrices, and we don't so much mind that it has been replaced by 10 more '±' signs.

Next we're going to describe how Strassen's method (equations (2.4.3)-(2.4.4)) of multiplying 2 × 2 matrices can be used to speed up multiplications of N × N matrices. The basic idea is that we will partition each of the large matrices into four smaller ones and multiply them together using (2.4.3)-(2.4.4).

Suppose that N is a power of 2, say N = 2^n, and let there be given two N × N matrices, A and B. We imagine that A and B have each been partitioned into four 2^(n−1) × 2^(n−1) matrices, and that the product matrix C is similarly partitioned. Hence we want to do the matrix multiplication that is indicated by

    [ C11 C12 ]   [ A11 A12 ] [ B11 B12 ]
    [ C21 C22 ] = [ A21 A22 ] [ B21 B22 ]    (2.4.5)

where now each of the capital letters represents a 2^(n−1) × 2^(n−1) matrix. To do the job in (2.4.5) we use exactly the 11 formulas that are shown in (2.4.3) and (2.4.4), except that the lower-case letters are now all upper case. In other words, if we can reduce the number of multiplications of numbers that are needed to multiply two 2 × 2 matrices, then that improvement will show up in the exponent of N when we measure the complexity of multiplying two N × N matrices. This yields the following recursive procedure for multiplying large matrices.
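The seven products and the four recombination formulas can be transcribed literally from (2.4.3)-(2.4.4). On numbers (rather than matrix blocks) the result agrees with the ordinary product; the function name below is ours.

```python
def strassen2(a, b):
    """2x2 product via the 7 multiplications I..VII of (2.4.3)-(2.4.4)."""
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b
    I   = (a12 - a22) * (b21 + b22)
    II  = (a11 + a22) * (b11 + b22)
    III = (a11 - a21) * (b11 + b12)
    IV  = (a11 + a12) * b22
    V   = a11 * (b12 - b22)
    VI  = a22 * (b21 - b11)
    VII = (a21 + a22) * b11
    # Recombination, (2.4.4): only additions and subtractions from here on.
    return [[I + II - IV + VI, IV + V],
            [VI + VII, II - III + V - VII]]
```

Expanding the seven products symbolically confirms, for example, that c11 = I + II − IV + VI collapses to a11·b11 + a12·b21, exactly the (1,1) entry of (2.4.2).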

function MatrProd(A, B: matrix; N: integer): matrix;
{MatrProd is AB, where A and B are N × N}
{uses Strassen method}
if N is not a power of 2 then border A and B by rows and columns of 0's until their size is the next power of 2 and change N;
if N = 1 then MatrProd := AB
else
  partition A and B as shown in (2.4.5);
  I := MatrProd(A12 − A22, B21 + B22, N/2);
  II := MatrProd(A11 + A22, B11 + B22, N/2);
  ...
  (etc., through all 11 of the formulas shown in (2.4.3), (2.4.4), ending with)
  C22 := II − III + V − VII
end.{MatrProd}

Note that this procedure calls itself recursively 7 times. The plus and minus signs in the program each represent an addition or subtraction of two matrices, and therefore each one of them involves a call to a matrix addition or subtraction procedure (just the usual method of adding, nothing fancy!). Therefore the function MatrProd makes 25 calls, 7 of which are recursively to itself, and 18 of which are to a matrix addition/subtraction routine.

We will now study the complexity of the routine in two ways. First we will count the number of multiplications of numbers that are needed to multiply two 2^n × 2^n matrices using MatrProd (call that number f(n)), and then we will count the number of additions of numbers (call it g(n)) that MatrProd needs in order to multiply two 2^n × 2^n matrices.

The multiplications of numbers are easy to count. MatrProd calls itself 7 times, in each of which it does exactly f(n − 1) multiplications of numbers; hence f(n) = 7f(n − 1) and f(0) = 1 (why?). Therefore we see that f(n) = 7^n for all n ≥ 0. Hence MatrProd does 7^n multiplications of numbers in order to do one multiplication of 2^n × 2^n matrices.

Let's take the last sentence in the above paragraph and replace '2^n' by N throughout. It then tells us that MatrProd does 7^(log N/log 2) multiplications of numbers in order to do one multiplication of N × N matrices. Since 7^(log N/log 2) = N^(log 7/log 2) = N^2.81..., we see that Strassen's method uses only O(N^2.81) multiplications of numbers, in place of the N^3 such multiplications that are required by the usual formulas.

It remains to count the additions/subtractions of numbers that are needed by MatrProd. In each of its 7 recursive calls to itself MatrProd does g(n − 1) additions of numbers. In each of its 18 calls to the procedure that adds or subtracts matrices, it does a number of additions of numbers that is equal to the square of the size of the matrices that are being added or subtracted. That size is 2^(n−1), so each of the 18 such calls does 2^(2n−2) additions of numbers. It follows that g(0) = 0 and for n ≥ 1 we have

    g(n) = 7g(n − 1) + 18·4^(n−1) = 7g(n − 1) + (9/2)·4^n.

We follow the method of section 1.4 on this first-order linear difference equation. Hence we make the change of variable g(n) = 7^n y_n (n ≥ 0), and we find that y_0 = 0 and, for n ≥ 1,

    y_n = y_(n−1) + (9/2)(4/7)^n.

If we sum over n we obtain

    y_n = (9/2) Σ_{j=1}^{n} (4/7)^j ≤ (9/2) Σ_{j=0}^{∞} (4/7)^j = 21/2.
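The recurrences f(n) = 7f(n − 1), f(0) = 1 and g(n) = 7g(n − 1) + 18·4^(n−1), g(0) = 0 can be tabulated directly (the helper below is our own); the tabulation confirms both f(n) = 7^n and the bound g(n) ≤ (21/2)·7^n.

```python
def strassen_counts(n):
    """Multiplications f(n) and additions g(n) used on 2^n x 2^n matrices:
    f(n) = 7 f(n-1), f(0) = 1;  g(n) = 7 g(n-1) + 18 * 4**(n-1), g(0) = 0."""
    f, g = 1, 0
    for k in range(1, n + 1):
        g = 7 * g + 18 * 4 ** (k - 1)   # 18 block additions of size 2^(k-1)
        f = 7 * f                       # 7 recursive block multiplications
    return f, g
```

Incidentally, the exact solution of the addition recurrence is g(n) = 6(7^n − 4^n), comfortably inside the 10.5·7^n bound derived in the text.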

Finally,

    g(n) = 7^n y_n ≤ (10.5)·7^n = O(7^n),

and this is O(N^2.81) as before. This completes the proof of

Theorem 2.4.1. In Strassen's method of fast matrix multiplication, the numbers of multiplications of numbers, of additions of numbers and of subtractions of numbers that are needed to multiply together two N × N matrices are each O(N^2.81) (in contrast to the Θ(N^3) of the conventional method).

In the years that have elapsed since Strassen's original paper, many researchers have been whittling away at the exponent of N in the complexity bounds. Several new, more elaborate algorithms have been developed, and the exponent, which was originally 3, has progressed downwards through 2.81 to values below 2.5. It is widely believed that the true minimum exponent is 2 + ε, i.e., that two N × N matrices can be multiplied in time O(N^(2+ε)), but there seems to be a good deal of work to be done before that result can be achieved.

Exercises for section 2.4

1. Suppose we could multiply together two 3 × 3 matrices with only 22 multiplications of numbers. How fast, recursively, would we then be able to multiply two N × N matrices?
2. (cont.) With what would the '22' in problem 1 above have to be replaced in order to achieve an improvement over Strassen's algorithm given in the text?
3. (cont.) Still more generally, with how few multiplications would we have to be able to multiply two M × M matrices in order to insure that recursively we would then be able to multiply two N × N matrices faster than the method given in this section?
4. We showed in the text that if N is a power of 2 then two N × N matrices can be multiplied in time at most CN^(log_2 7), where C is a suitable constant. Prove that if N is not a power of 2 then two N × N matrices can be multiplied in time at most 7CN^(log_2 7).

2.5 The discrete Fourier transform

It is a lot easier to multiply two numbers than to multiply two polynomials. If you should want to multiply two polynomials f and g, of degrees 77 and 94, respectively, you are in for a lot of work. To calculate just one coefficient of the product is already a lot of work. Think about the calculation of the coefficient of x^50 in the product, and you will see that about 50 numbers must be multiplied together and added in order to calculate just that one coefficient of fg, and there are 171 other coefficients to calculate!

Instead of calculating the coefficients of the product fg, it would be much easier just to calculate the values of the product at, say, 172 points. The values of the product polynomial at 172 distinct points determine that polynomial completely. To do that we could just multiply the values of f and of g at each of those points, and after a total cost of 172 multiplications we would have the values of the product, so that sequence of values is the answer. It's just that we humans prefer to see polynomials given by means of their coefficients instead of by their values.

Ease of converting between these two representations of a polynomial is vitally important for many reasons. The Fourier transform, basically, is a method of converting from one representation of a polynomial to another. More exactly, it converts from the sequence of coefficients of the polynomial to the sequence of values of that polynomial at a certain set of points. That is the subject of this section. Hence, in this section we will study the discrete Fourier transform of a finite sequence of numbers: methods of calculating it, and some applications, including multiplication of polynomials, high precision integer arithmetic in computers, creation of medical images in CAT scanners and NMR scanners, etc.

What we're asked to do, basically, is to evaluate a polynomial of degree n − 1 at n different points. This is a computational problem which at first glance seems very simple. If we just calculate the n values by brute force, we certainly won't need to do more than n multiplications of numbers to find each of the n values of the polynomial that we want, so we surely don't need more than O(n^2) multiplications altogether. So what could be so difficult about that?
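The point that values at enough points both determine the product and multiply cheaply can be illustrated in a few lines (our own sketch, with our function names): `convolve` is the coefficient-by-coefficient method the text describes as laborious, while `values_of_product` needs just one multiplication per point once the values of f and g are known.

```python
def convolve(f, g):
    """Coefficients of f*g directly: the laborious coefficient method."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def values_of_product(f, g, xs):
    """Values of f*g at the points xs: one multiplication per point."""
    def ev(p, x):
        v = 0
        for c in reversed(p):   # Horner's rule
            v = v * x + c
        return v
    return [ev(f, x) * ev(g, x) for x in xs]
```

For instance, with f = 1 + x and g = (1 + x)^2 the product is (1 + x)^3, whose values at 0, 1, 2, 3 are 1, 8, 27, 64; four values suffice to pin down its four coefficients.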

2.5 The discrete Fourier transform

The interesting thing is that this particular problem is so important, and turns up in so many different applications, that it really pays to be very efficient about how the calculation is done. We will see in this section that if we use a fairly subtle method of doing this computation instead of the obvious method, then the work can be cut down from O(n²) to O(n log n). In view of the huge arrays on which this program is often run, the saving is very much worthwhile.

One can think of the Fourier transform as being a way of changing the description, or coding, of a polynomial, so we will introduce the subject by discussing it from that point of view. Next we will discuss the obvious way of computing the transform. Then we will describe the 'Fast Fourier Transform,' which is a rather un-obvious, but very fast, method of computing the same creature. Finally we will discuss an important application of the subject, to the fast multiplication of polynomials.

There are many different ways that we might choose to describe ('encode') a particular polynomial. Take the polynomial f(t) = t(6 − 5t + t²), for instance. This can be uniquely described in any of the following ways (and a lot more). It is the polynomial whose

(i) coefficients are 0, 6, −5, 1, or whose
(ii) roots are 0, 2 and 3, and whose highest coefficient is 1, or whose
(iii) values at t = 0, 1, 2, 3 are 0, 2, 0, 0, respectively, or whose
(iv) values at the fourth roots of unity 1, i, −1, −i are 2, 5 + 5i, −12, 5 − 5i, respectively,

or etc. We want to focus on two of these ways of representing a polynomial. The first is by its coefficient sequence; the second is by its sequence of values at the nth roots of unity, where n is 1 more than the degree of the polynomial. These roots of unity are the numbers

    ω_j = e^{2πij/n}    (j = 0, 1, . . . , n − 1).    (2.5.5)

The process by which we pass from the coefficient sequence to the sequence of values at the roots of unity is called forming the Fourier transform of the coefficient sequence. To use the example above, we would say that the Fourier transform of the sequence 0, 6, −5, 1 is the sequence 2, 5 + 5i, −12, 5 − 5i.

The Fourier transform of a sequence of n numbers is another sequence of n numbers, namely the sequence of values at the nth roots of unity of the very same polynomial whose coefficients are the members of the original sequence. In general, if we are given a sequence

    x₀, x₁, . . . , x_{n−1}    (2.5.3)

then we think of the polynomial

    f(t) = x₀ + x₁t + x₂t² + · · · + x_{n−1}t^{n−1}.    (2.5.4)

Consequently, if we calculate the values of the polynomial (2.5.4) at the n numbers (2.5.5), we find the Fourier transform of the given sequence (2.5.3) to be the sequence

    f(ω_j) = Σ_{k=0}^{n−1} x_k ω_j^k = Σ_{k=0}^{n−1} x_k e^{2πijk/n}    (j = 0, 1, . . . , n − 1).    (2.5.6)

Before proceeding, the reader should pause for a moment and make sure that the fact that the example in (iv) above is a special case of the general definition (2.5.6) is clearly understood.
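The example is easy to check by machine. The following short sketch (in Python; the function name is ours, not from the text) computes the transform of the sequence 0, 6, −5, 1 directly from the definition:

```python
import cmath

def dft(x):
    """Direct discrete Fourier transform, as in the definition: the
    j-th output is the value of the polynomial with coefficient
    sequence x at the j-th n-th root of unity e^(2*pi*i*j/n)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(2j * cmath.pi * j * k / n)
                for k in range(n))
            for j in range(n)]

# Coefficient sequence of f(t) = t(6 - 5t + t^2) = 6t - 5t^2 + t^3:
print(dft([0, 6, -5, 1]))  # approximately [2, 5+5j, -12, 5-5j]
```

The output agrees with (iv) above: the values of f at 1, i, −1, −i are 2, 5 + 5i, −12, 5 − 5i.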

The Fourier transform moves us from coefficients to values at roots of unity. Some good reasons for wanting to make that trip will appear presently, but for the moment, let's consider the computational side of the question, namely how to compute the Fourier transform efficiently. We are going to derive an elegant and very speedy algorithm for the evaluation of Fourier transforms. The algorithm is called the Fast Fourier Transform (FFT) algorithm. In order to appreciate how fast it is, let's see how long it would take us to calculate the transform without any very clever procedure.

What we have to do is to compute the values of a given polynomial at n given points. How much work is required to calculate the value of a polynomial at one given point? If we want to calculate the value of the polynomial x₀ + x₁t + x₂t² + · · · + x_{n−1}t^{n−1} at exactly one value of t, then we can do (think how you would do it, before looking)

    function value(x :coeff array; n:integer; t:complex);
    {computes value := x₀ + x₁t + · · · + x_{n−1}t^{n−1}}
    value := 0;
    for j := n − 1 to 0 step −1 do
        value := t · value + x_j
    end.{value}

This well-known algorithm (= 'synthetic division') for computing the value of a polynomial at a single point t obviously runs in time O(n). If we calculate the Fourier transform of a given sequence of n points by calling the function value n times, once for each point of evaluation, then obviously we are looking at a simple algorithm that requires Θ(n²) time. With the FFT we will see that the whole job can be done in time O(n log n), and we will then look at some implications of that fact. To put it another way, the cost of calculating all n of the values of a polynomial f at the nth roots of unity is much less than n times the cost of one such calculation.

First we consider the important case where n is a power of 2, say n = 2^r. Then the values of f, a polynomial of degree 2^r − 1, at the (2^r)th roots of unity are, from (2.5.6),

    f(ω_j) = Σ_{k=0}^{n−1} x_k exp{2πijk/2^r}    (j = 0, 1, . . . , 2^r − 1).    (2.5.7)

Let's break up the sum into two sums, containing respectively the terms where k is even and those where k is odd. In the first sum write k = 2m and in the second put k = 2m + 1. Then, for each j = 0, 1, . . . , 2^r − 1,

    f(ω_j) = Σ_{m=0}^{2^{r−1}−1} x_{2m} e^{2πijm/2^{r−1}} + Σ_{m=0}^{2^{r−1}−1} x_{2m+1} e^{2πij(2m+1)/2^r}
           = Σ_{m=0}^{2^{r−1}−1} x_{2m} e^{2πijm/2^{r−1}} + e^{2πij/2^r} Σ_{m=0}^{2^{r−1}−1} x_{2m+1} e^{2πijm/2^{r−1}}.    (2.5.8)

Something special just happened. Each of the two sums that appear in the last member of (2.5.8) is itself a Fourier transform, of a shorter sequence. The first sum is the transform of the array x[0], x[2], x[4], . . . , x[2^r − 2], and the second sum is the transform of the array x[1], x[3], x[5], . . . , x[2^r − 1]. The stage is set (well, almost set) for a recursive program.
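The 'synthetic division' loop translates directly into code. As a sketch (Python; the names and the argument layout are ours), here is Horner evaluation and the obvious Θ(n²) transform built from it:

```python
import cmath

def value(x, t):
    """Horner's rule ('synthetic division'): evaluate
    x[0] + x[1]*t + ... + x[n-1]*t^(n-1) with O(n) multiplications."""
    v = 0
    for coeff in reversed(x):   # j := n-1 downto 0
        v = t * v + coeff
    return v

def slow_transform(x):
    """Call value() once per root of unity: Theta(n^2) work in all."""
    n = len(x)
    return [value(x, cmath.exp(2j * cmath.pi * j / n)) for j in range(n)]

print(value([0, 6, -5, 1], 2))  # f(2) = 0, since 2 is a root of f
```

Calling slow_transform on 0, 6, −5, 1 reproduces the values 2, 5 + 5i, −12, 5 − 5i found above, at a cost of n multiplications for each of the n roots of unity.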

There is one small problem, though. In (2.5.8) we want to compute f(ω_j) for 2^r values of j, namely for j = 0, 1, . . . , 2^r − 1, while the Fourier transform of the shorter sequence, say the first sum, is defined for only 2^{r−1} values of j, namely for j = 0, 1, . . . , 2^{r−1} − 1. So if we calculate the first sum by a recursive call, then we will need its values for j's that are outside the range for which it was computed.

This problem is no sooner recognized than solved. Let Q(j) denote the first sum in (2.5.8). Then we claim that Q(j) is a periodic function of j, of period 2^{r−1}, because

    Q(j + 2^{r−1}) = Σ_{m=0}^{2^{r−1}−1} x_{2m} exp{2πim(j + 2^{r−1})/2^{r−1}}
                   = Σ_{m=0}^{2^{r−1}−1} x_{2m} exp{2πimj/2^{r−1}} e^{2πim}
                   = Σ_{m=0}^{2^{r−1}−1} x_{2m} exp{2πimj/2^{r−1}}
                   = Q(j)    (2.5.11)

for all integers j. Hence, if Q(j) has been computed only for 0 ≤ j ≤ 2^{r−1} − 1 and if we should want its value for some j ≥ 2^{r−1}, then we can get that value by asking for Q(j mod 2^{r−1}).

Now we can state the recursive form of the Fast Fourier Transform algorithm in the (most important) case where n is a power of 2. In the algorithm we will use the type complexarray to denote an array of complex numbers.

    function FFT(n:integer; x:complexarray):complexarray;
    {computes fast Fourier transform of n = 2^k numbers x}
    if n = 1 then FFT[0] := x[0]
    else
        evenarray := {x[0], x[2], . . . , x[n − 2]};
        oddarray := {x[1], x[3], . . . , x[n − 1]};
        {u[0], u[1], . . . , u[n/2 − 1]} := FFT(n/2, evenarray);
        {v[0], v[1], . . . , v[n/2 − 1]} := FFT(n/2, oddarray);
        for j := 0 to n − 1 do
            τ := exp{2πij/n};
            FFT[j] := u[j mod (n/2)] + τ v[j mod (n/2)]
    end.{FFT}

Let y(k) denote the number of multiplications of complex numbers that will be done if we call FFT on an array whose length is n = 2^k. The call to FFT(n/2, evenarray) costs y(k − 1) multiplications, as does the call to FFT(n/2, oddarray). The 'for j := 0 to n − 1' loop requires n more multiplications. Hence

    y(k) = 2y(k − 1) + 2^k    (k ≥ 1; y(0) = 0).    (2.5.12)

If we change variables by writing y(k) = 2^k z_k, then we find that z_k = z_{k−1} + 1, which, together with z_0 = 0, implies that z_k = k for all k ≥ 0, and therefore that y(k) = k 2^k. This proves

Theorem 2.5.1. The Fourier transform of a sequence of n complex numbers is computed using only O(n log n) multiplications of complex numbers by means of the procedure FFT, if n is a power of 2.

Next* we will discuss the situation when n is not a power of 2. The reader may observe that by 'padding out' the input array with additional 0's we can extend the length of the array until it becomes a power of 2, and then call the FFT procedure that we have already discussed.

* The remainder of this section can be omitted at a first reading.
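The pseudocode above can be sketched in a few lines of Python (the function name is ours); the combining loop uses j mod (n/2) exactly as in the text, relying on the periodicity (2.5.11) of the half-length transforms:

```python
import cmath

def fft(x):
    """Recursive FFT for len(x) a power of 2: transform the even- and
    odd-indexed subarrays, then combine with powers of e^(2*pi*i/n)."""
    n = len(x)
    if n == 1:
        return [x[0]]
    u = fft(x[0::2])          # transform of the even-indexed entries
    v = fft(x[1::2])          # transform of the odd-indexed entries
    out = []
    for j in range(n):
        tau = cmath.exp(2j * cmath.pi * j / n)
        out.append(u[j % (n // 2)] + tau * v[j % (n // 2)])
    return out
```

On the four-term sequence 0, 6, −5, 1 it returns (up to roundoff) the same values 2, 5 + 5i, −12, 5 − 5i as the direct Θ(n²) computation, but with only O(n log n) multiplications.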

In some applications, such as the multiplication of polynomials that we will discuss later in this section, that change is acceptable, but in others the substitution of Nth roots for nth roots may not be permitted. The problem is that the original question asked for the values of the input polynomial at the nth roots of unity; after the padding, we will find the values at the Nth roots of unity instead, where N is the next power of 2. In a particular application, that may or may not be acceptable. We will suppose that the FFT of a sequence of n numbers is wanted, where n is not a power of 2, and where the padding operation is not acceptable.

The method is a straightforward generalization of the idea that we have already used in the case where n was a power of 2. If n is a prime number we will have nothing more to say; in that case we will not discuss any improvements to the obvious method for calculating the transform, one root of unity at a time. Suppose then that n is not prime (n is 'composite'). Then we can factor the integer n in some nontrivial way, say n = r1 r2, where neither r1 nor r2 is 1.

In the following we will write ξ_n = e^{2πi/n}. The train of '=' signs in the equation below shows how the question on an input array of length n is changed into r1 questions about input arrays of length r2. We have

    f(e^{2πij/n}) = Σ_{s=0}^{n−1} x_s ξ_n^{js}
                  = Σ_{k=0}^{r1−1} Σ_{t=0}^{r2−1} x_{t r1 + k} ξ_n^{j(t r1 + k)}
                  = Σ_{k=0}^{r1−1} Σ_{t=0}^{r2−1} x_{t r1 + k} ξ_n^{t j r1} ξ_n^{kj}
                  = Σ_{k=0}^{r1−1} { Σ_{t=0}^{r2−1} x_{t r1 + k} ξ_{r2}^{tj} } ξ_n^{kj}
                  = Σ_{k=0}^{r1−1} a_k(j) ξ_n^{kj}.    (2.5.13)

We will discuss (2.5.13) line by line. The first '=' sign is the definition of the jth entry of the Fourier transform of the input array x, i.e., of the value of the input polynomial f at the jth one of the n nth roots of unity. The second equality uses the fact that every integer s such that 0 ≤ s ≤ n − 1 can be uniquely written in the form s = t r1 + k, where 0 ≤ t ≤ r2 − 1 and 0 ≤ k ≤ r1 − 1. The next '=' is just a rearrangement, but the next one uses the all-important fact that ξ_n^{r1} = ξ_{r2} (why?). In the last equation we are simply defining a set of numbers

    a_k(j) = Σ_{t=0}^{r2−1} x_{t r1 + k} ξ_{r2}^{tj}    (0 ≤ k ≤ r1 − 1; 0 ≤ j ≤ n − 1).    (2.5.14)

The important thing to notice is that, for fixed k, the numbers a_k(j) are periodic in j, of period r2, i.e., that a_k(j + r2) = a_k(j) for all j. Hence, even though the values of the a_k(j) are needed for j = 0, 1, . . . , n − 1, they must be computed only for j = 0, 1, . . . , r2 − 1. Further, for fixed k, the set of values of a_k(j) (j = 0, 1, . . . , r2 − 1) that we must compute is itself a Fourier transform, namely of the sequence {x_{t r1 + k}} (t = 0, 1, . . . , r2 − 1). Let g(n) denote the number of complex multiplications that are needed to compute the Fourier transform of a sequence of n numbers. Then, for fixed k, we can recursively compute the r2 values of a_k(j) that we need with g(r2) multiplications of complex numbers. There are r1 such fixed values of k for which we must do the computation; hence all of the necessary values of a_k(j) can be found with r1 g(r2) complex multiplications. Now the entire job can be done recursively. We claim, then, that the Fourier transform of a sequence of length n can be computed by recursively finding the Fourier transforms of r1 different sequences, each of length r2.
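The chain of '=' signs can be checked numerically. The sketch below (Python; the function names are ours, and the choice n = 6 = 2 · 3 is just an example) computes the inner transforms a_k(j) from the subsequences and reassembles f(ω_j), comparing the result with the definition of the transform:

```python
import cmath

def direct_dft(x):
    """The transform straight from the definition (Theta(n^2) work)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(2j * cmath.pi * j * k / n)
                for k in range(n))
            for j in range(n)]

def split_dft(x, r1, r2):
    """Mixed-radix reassembly: a_k(j) is the transform of the
    subsequence x[k], x[k+r1], ..., and is periodic in j with
    period r2, so only its first r2 values are ever computed."""
    n = r1 * r2
    xi_n = cmath.exp(2j * cmath.pi / n)
    a = [direct_dft([x[t * r1 + k] for t in range(r2)])
         for k in range(r1)]
    return [sum(a[k][j % r2] * xi_n ** (k * j) for k in range(r1))
            for j in range(n)]
```

Running split_dft on a sequence of length 6 with either factorization 6 = 2 · 3 or 6 = 3 · 2 agrees with direct_dft up to roundoff, which is exactly the content of (2.5.13)–(2.5.14).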

Once the a_k(j) are all in hand, the computation of each one of the n values of the transform from (2.5.13) will require an additional r1 − 1 complex multiplications. Since n = r1 r2 values of the transform have to be computed, we will need r1 r2 (r1 − 1) complex multiplications for this phase. The complete computation therefore needs r1 g(r2) + r1² r2 − r1 r2 multiplications if we choose a particular factorization n = r1 r2. The factorization that should be chosen is the one that minimizes the labor, so we have the recurrence

    g(n) = min_{n = r1 r2} { r1 g(r2) + r1² r2 } − n,    if n is composite.    (2.5.16)

If n = p is a prime number then there are no factorizations to choose from and our algorithm is no help at all. There is no recourse but to calculate the p values of the transform directly from the definition (2.5.6), and that will require p − 1 complex multiplications to be done in order to get each of those p values. Hence we have, in addition to the recurrence formula (2.5.16), the special values

    g(p) = p(p − 1)    (if p is prime).    (2.5.17)

The recurrence formula (2.5.16), together with the starting values that are shown in (2.5.17), completely determines the function g(n). We are going to work out the exact solution of the interesting recurrence (2.5.16), and thereby to learn the best choice of the factorization of n. Before proceeding, though, the reader is invited to calculate g(12) and g(18).

If we leave that question in abeyance for a while, we can summarize by stating the (otherwise) complete algorithm for the fast Fourier transform.

    function FFT(x:complexarray; n:integer):complexarray;
    {computes Fourier transform of a sequence x of length n}
    if n is prime then
        for j := 0 to n − 1 do
            FFT[j] := Σ_{k=0}^{n−1} x[k] ξ_n^{jk}
    else
        let n = r1 r2 be some factorization of n;
        {see below for best choice of r1, r2}
        for k := 0 to r1 − 1 do
            {a_k[0], a_k[1], . . . , a_k[r2 − 1]} := FFT({x[k], x[k + r1], . . . , x[k + (r2 − 1)r1]}, r2);
        for j := 0 to n − 1 do
            FFT[j] := Σ_{k=0}^{r1−1} a_k[j mod r2] ξ_n^{kj}
    end.{FFT}

Our next task will be to solve the recurrence relations (2.5.16)-(2.5.17), and when we are finished we will see which factorization of n is the best one to choose. Let g(n) = n h(n), where h is a new unknown function. Then the recurrence that we have to solve takes the form

    h(n) = min_d { h(n/d) + d } − 1.    (2.5.18)

In (2.5.18), the 'min' is taken over all d that divide n other than d = 1 and d = n. The above relation determines the value of h for all positive integers n. For example,

    h(15) = min_d ( h(15/d) + d ) − 1
          = min( h(5) + 3, h(3) + 5 ) − 1
          = min(7, 7) − 1 = 6
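One way to answer the invitation to calculate g(12) and g(18) is to program the recurrence directly. A minimal sketch (Python; names ours):

```python
from functools import lru_cache

def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

@lru_cache(maxsize=None)
def g(n):
    """Multiplication count of the recursive FFT:
    g(p) = p(p-1) for prime p, and otherwise the best factorization
    n = r1*r2 in  g(n) = min( r1*g(r2) + r1^2 * r2 ) - n."""
    if is_prime(n):
        return n * (n - 1)
    return min(r1 * g(n // r1) + r1 * r1 * (n // r1)
               for r1 in range(2, n) if n % r1 == 0) - n

print(g(12), g(18))  # 48 90
```

It reports g(12) = 48 and g(18) = 90, in agreement with Table 2.5.1 below.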

To find the solution in a pleasant form, let

    n = p1^{a1} p2^{a2} · · · ps^{as}

be the canonical factorization of n into primes. We claim that the function

    h(n) = a1(p1 − 1) + a2(p2 − 1) + · · · + as(ps − 1)    (2.5.19)

is the solution of (2.5.18) (this claim is obviously (?) correct if n is prime). To prove the claim in general, suppose it to be true for 1, 2, . . . , n − 1, and suppose that n is not prime. Every divisor d of n must be of the form d = p1^{b1} p2^{b2} · · · ps^{bs}, where the primes pi are the same as those that appear in the canonical factorization of n and each bi is ≤ ai. Hence from (2.5.18) we get

    h(n) = min { (a1 − b1)(p1 − 1) + · · · + (as − bs)(ps − 1) + p1^{b1} · · · ps^{bs} } − 1,    (2.5.20)

where now the 'min' extends over all admissible choices of the b's, namely exponents b1, . . . , bs such that 0 ≤ bi ≤ ai (∀i = 1, . . . , s), not all bi are 0 and not all bi = ai.

One such admissible choice would be to take d prime, say bj = 1 and all other bi = 0. If we let H(b1, . . . , bs) denote the quantity in braces in (2.5.20), then with this choice the value of H would be a1(p1 − 1) + · · · + as(ps − 1) + 1. Hence what we have to show is that the above choice of the bi's is the best one, i.e., that H does not fall below this value for any admissible choice. We will show that if one of the bi is larger than allowed by such a choice then we can reduce it without increasing the value of H. To prove this, observe that for each i = 1, 2, . . . , s we have

    H(b1, . . . , bi + 1, . . . , bs) − H(b1, . . . , bs) = −(pi − 1) + d(pi − 1) = (d − 1)(pi − 1),    (2.5.21)

where d = p1^{b1} · · · ps^{bs}. Since the divisor d ≥ 1 and the prime pi ≥ 2, the last difference is nonnegative. Hence H doesn't increase if we decrease one of the b's by 1 unit, as long as not all bi = 0. It follows that the minimum of H occurs among the prime divisors d of n, which is exactly what we need to prove our claim. If that is done, then we can easily check from (2.5.21) that it doesn't matter which prime divisor of n we choose to be d: the function h(n) is always given by (2.5.19).

If we recall the change of variable g(n) = n h(n), we find that we have proved

Theorem 2.5.2. (Complexity of the Fast Fourier Transform) The best choice of the factorization n = r1 r2 in algorithm FFT is to take r1 to be a prime divisor of n. If that is done, then algorithm FFT requires

    g(n) = n ( a1(p1 − 1) + a2(p2 − 1) + · · · + as(ps − 1) )

complex multiplications in order to do its job, where n = p1^{a1} · · · ps^{as} is the canonical factorization of the integer n.

If n is a power of 2, say n = 2^q, then the formula of Theorem 2.5.2 reduces to g(n) = n log n / log 2, in agreement with Theorem 2.5.1. What does the formula say if n is a power of 3? If n is a product of distinct primes?

Table 2.5.1 shows the number g(n) of complex multiplications required by FFT as a function of n. The saving over the straightforward algorithm that uses n(n − 1) multiplications for each n is apparent.

2.6 Applications of the FFT

Finally, we will discuss some applications of the FFT. A family of such applications begins with the observation that the FFT provides the fastest game in town for multiplying two polynomials together. Consider a multiplication like

    (1 + 2x + 7x² − 2x³ − x⁴) · (4 − 5x − x² − x³ + 11x⁴ + x⁵).
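The closed form is easy to evaluate from the prime factorization. The following sketch (Python; names ours) computes h(n), and hence g(n) = n · h(n), by trial division:

```python
def h(n):
    """Closed form h(n) = a1*(p1-1) + ... + as*(ps-1) read off from
    the canonical factorization n = p1^a1 * ... * ps^as."""
    total, p = 0, 2
    while p * p <= n:
        while n % p == 0:      # each repeated prime factor p
            total += p - 1     # contributes p - 1 to h(n)
            n //= p
        p += 1
    if n > 1:                  # leftover prime factor > sqrt(n)
        total += n - 1
    return total

print(12 * h(12), 18 * h(18))  # 48 90
```

The values n · h(n) reproduce the recurrence's answers, e.g. g(12) = 48 and g(18) = 90, and match the entries of Table 2.5.1.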

     n    g(n)        n    g(n)
     2      2        22    242
     3      6        23    506
     4      8        24    120
     5     20        25    200
     6     18        26    338
     7     42        27    162
     8     24        28    224
     9     36        29    812
    10     50        30    210
    11    110        31    930
    12     48        32    160
    13    156        33    396
    14     98        34    578
    15     90        35    350
    16     64        36    216
    17    272        37   1332
    18     90        38    722
    19    342        39    546
    20    120        40    280
    21    168        41   1640

    Table 2.5.1: The complexity of the FFT

If we do this multiplication in the obvious way then there is quite a bit of work to do. We will study the amount of labor that is needed to do this multiplication by the straightforward algorithm, and then we will see how the FFT can help. In the general case, we want to multiply

    { Σ_{i=0}^{n} a_i x^i } · { Σ_{j=0}^{m} b_j x^j }.    (2.6.1)

In the product polynomial, the coefficient of x^k is

    Σ_{r=max(0,k−m)}^{min(k,n)} a_r b_{k−r}.    (2.6.2)

The coefficient of x⁴ in the product above, for instance, is

    1 · 11 + 2 · (−1) + 7 · (−1) + (−2) · (−5) + (−1) · 4 = 8,

and 5 multiplications are needed to compute just that single coefficient of the product polynomial. For k fixed, the number of terms in the sum (2.6.2) is min(k, n) − max(0, k − m) + 1. If we sum this amount of labor over k = 0, 1, . . . , m + n we find that the total amount of labor for the multiplication of two polynomials of degrees m and n is Θ(mn). In particular, if the polynomials are of the same degree n then the labor is Θ(n²). By using the FFT the amount of labor can be reduced from Θ(n²) to Θ(n log n), where n is the length of the input sequence.

To understand how this works, let's recall the definition of the Fourier transform of a sequence: it is the sequence of values, at the nth roots of unity, of the polynomial whose coefficients are the given numbers. Imagine two universes: one in which the residents are used to describing polynomials by means of their coefficients, and another one in which the inhabitants are fond of describing polynomials by their values at roots of unity.
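The straightforward Θ(mn) algorithm is nothing but the convolution (2.6.2). A minimal sketch (Python; names ours):

```python
def poly_mult_naive(a, b):
    """Schoolbook product of two coefficient arrays: the coefficient
    of x^k in the product is sum over r of a[r]*b[k-r], which costs
    Theta(m*n) multiplications in all."""
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

f = [1, 2, 7, -2, -1]            # 1 + 2x + 7x^2 - 2x^3 - x^4
g = [4, -5, -1, -1, 11, 1]       # 4 - 5x - x^2 - x^3 + 11x^4 + x^5
print(poly_mult_naive(f, g)[4])  # coefficient of x^4: 8
```

Running it on the example above does indeed give 8 for the coefficient of x⁴, as computed by hand in the text.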

In the first universe the locals have to work fairly hard to multiply two polynomials, because they have to carry out the operations (2.6.2) in order to find each coefficient of the product. In the second universe, multiplying two polynomials is a breeze. If we have in front of us the values f(ω) of the polynomial f at the roots of unity, and the values g(ω) of the polynomial g at the same roots of unity, then what are the values (fg)(ω) of the product polynomial fg at the roots of unity? To find each one requires only a single multiplication of two complex numbers, because the value of fg at ω is simply f(ω)g(ω). Multiplying values is easier than finding the coefficients of the product.

Since we live in a universe where people like to think of polynomials as being given by their coefficient arrays, we have to take a somewhat roundabout route in order to do an efficient multiplication.

Given: a polynomial f, of degree n, and a polynomial g, of degree m, by their coefficient arrays.
Wanted: the coefficients of the product polynomial fg, of degree m + n.

Step 1. Let N be the smallest integer that is a power of 2 and is greater than m + n + 1. Think of f and g as polynomials each of whose degrees is N − 1. This means that we should adjoin N − n − 1 more coefficients, all = 0, to the coefficient array of f, and N − m − 1 more coefficients, all = 0, to the coefficient array of g. Now both input coefficient arrays are of length N.

Step 2. Compute the FFT of the array of coefficients of f. Now we are looking at the values of f at the Nth roots of unity. Likewise compute the FFT of the array of coefficients of g, to obtain the array of values of g at the same Nth roots of unity. The cost of this step is O(N log N).

Step 3. For each of the Nth roots of unity ω, multiply the number f(ω) by the number g(ω). We now have the numbers f(ω)g(ω), one for each ω, which are exactly the values of the unknown product polynomial fg at the Nth roots of unity. The cost of this step is N multiplications of numbers.

Step 4. We now are looking at the values of fg at the Nth roots, and we want to get back to the coefficients of fg, because that was what we were asked for. Hence this step calls for the inverse Fourier transform, which we will describe in a moment. Its cost is also O(N log N).

Step 5. The answer to the original question has been obtained at a total cost of O(N log N) = O((m + n) log (m + n)) arithmetic operations. It's true that we did have to take a walk from our universe to the next one and back again, but the round trip was a lot cheaper than the O((m + n)²) cost of a direct multiplication.

It remains to discuss the inverse Fourier transform, which takes us backwards, from values at roots of unity to coefficients. Perhaps the neatest way to do that is to juxtapose the formulas for the Fourier transform and for the inverse transform, so as to facilitate comparison of the two. So here they are. If we are given a sequence {x₀, x₁, . . . , x_{n−1}}, then the Fourier transform of the sequence is the sequence (see (2.5.6))

    f(ω_j) = Σ_{k=0}^{n−1} x_k e^{2πijk/n}    (j = 0, 1, . . . , n − 1).    (2.6.3)

Conversely, if we are given the numbers f(ω_j) (j = 0, 1, . . . , n − 1), then we can recover the coefficient sequence x₀, x₁, . . . , x_{n−1} by the inverse formulas

    x_k = (1/n) Σ_{j=0}^{n−1} f(ω_j) e^{−2πijk/n}    (k = 0, 1, . . . , n − 1).    (2.6.4)

The differences between the inverse formulas and the original transform formulas are, first, the appearance of '1/n' in front of the summation and, second, the '−' sign in the exponential. We leave it as an exercise for the reader to verify that these formulas really do invert each other.

We observe that if we are already in possession of a computer program that will find the FFT, then we can use it to calculate the inverse Fourier transform as follows:

(i) Given a sequence {f(ω)} of values of a polynomial at the nth roots of unity, form the complex conjugate of each member of the sequence.
(ii) Input the conjugated sequence to your FFT program.
(iii) Form the complex conjugate of each entry of the output array, and divide by n. You now have the inverse transform of the input sequence.

The cost is obviously equal to the cost of the FFT plus a linear number of conjugations and divisions by n.
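Steps 1-5, together with the conjugation trick for the inverse transform, can be sketched as follows (Python; names ours, and the final rounding step assumes integer input coefficients):

```python
import cmath

def fft(x):
    """Recursive power-of-2 FFT, as in section 2.5."""
    n = len(x)
    if n == 1:
        return [x[0]]
    u, v = fft(x[0::2]), fft(x[1::2])
    return [u[j % (n // 2)] + cmath.exp(2j * cmath.pi * j / n) * v[j % (n // 2)]
            for j in range(n)]

def inverse_fft(y):
    """Inverse transform via the conjugation trick: conjugate,
    run the forward FFT, conjugate again, divide by n."""
    n = len(y)
    z = fft([w.conjugate() for w in y])
    return [w.conjugate() / n for w in z]

def poly_mult_fft(a, b):
    """Pad to a power of 2, transform both coefficient arrays,
    multiply the values pointwise, and transform back."""
    size = len(a) + len(b) - 1        # number of product coefficients
    N = 1
    while N < size:
        N *= 2
    fa = fft(a + [0] * (N - len(a)))
    fb = fft(b + [0] * (N - len(b)))
    prod = inverse_fft([p * q for p, q in zip(fa, fb)])
    return [round(c.real) for c in prod[:size]]   # integer inputs assumed
```

On the example of this section, poly_mult_fft reproduces the same coefficient array as the schoolbook convolution, with the coefficient of x⁴ equal to 8.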

An outgrowth of the rapidity with which we can now multiply polynomials is a rethinking of the methods by which we do ultrahigh-precision arithmetic. How fast can we multiply two integers, each of which has ten million bits? The connection with the present subject may not be too surprising, since an integer n is given in terms of its bits b0, b1, . . . , bm by the relation

    n = Σ_{i≥0} b_i 2^i.    (2.6.5)

However, the sum in (2.6.5) is seen at once to be the value of a certain polynomial at x = 2. Hence in asking for the bits of the product of two such integers we are asking for something very similar to the coefficients of the product of two polynomials. By using ideas that developed directly (though not at all trivially) from the ones that we have been discussing, Schönhage and Strassen found the fastest known method for doing such large-scale multiplications of integers. The method relies heavily on the FFT, and indeed the fastest known algorithms for this problem depend upon the Fast Fourier Transform.

Exercises for section 2.6

1. Verify that the relations (2.6.3) and (2.6.4) indeed are inverses of each other.
2. Let ω be an nth root of unity, and let k be a fixed integer. Evaluate 1 + ω^k + ω^{2k} + · · · + ω^{k(n−1)}.
3. Let f = Σ_{j=0}^{n−1} a_j x^j. Show that

    (1/n) Σ_{ω^n=1} |f(ω)|² = |a₀|² + · · · + |a_{n−1}|².

4. Write a program that will do the FFT in the case where the number of data points is a power of 2. Organize your program so as to minimize additional array storage beyond the input and output arrays.
5. Prove that a polynomial of degree n is uniquely determined by its values at n + 1 distinct points.
6. The values of a certain cubic polynomial at 1, i, −1, −i are 1, 2, 3, i, respectively. Find its value at 2.

2.7 A review

Here is a quick review of the algorithms that we studied in this chapter.

1. Sorting is an easy computational problem. The most obvious way to sort n array elements takes time Θ(n²). We discussed a recursive algorithm that sorts in an average time of Θ(n log n).

2. Multiplying two matrices is an easy computational problem. The most obvious way to do it takes time Θ(n³) if the matrices are n × n. We discussed a recursive method that runs in time O(n^{2.82}). A recent method** runs in time O(n^γ) for some γ < 2.5.

3. Finding a maximum independent set in a graph is a hard computational problem. The most obvious way to find one might take time Θ(2ⁿ) if the graph G has n vertices. We discussed a recursive method that runs in time O(1.39ⁿ). The best known methods run in time Θ(2^{n/3}).

4. Finding out if a graph is K-colorable is a hard computational problem. The most obvious way to do it takes time Θ(Kⁿ). We discussed a recursive method that runs in time Θ(1.62^{n+E}) if G has n vertices and E edges. One recently developed method* runs in time O((1 + 3^{1/3})ⁿ), if G has n vertices. We will see in section 5.7 that this problem can be done in an average time that is O(1) for fixed K.

* E. Lawler, A note on the complexity of the chromatic number problem, Information Processing Letters 5 (1976), 66-67.
** D. Coppersmith and S. Winograd, On the asymptotic complexity of matrix multiplication, SIAM J. Comp. 11 (1982), 472-492.

5. Finding the discrete Fourier transform of an array of n elements is an easy computational problem. The most obvious way to do it takes time Θ(n²). We discussed a recursive method that runs in time O(n log n) if n is a power of 2.

When we write a program recursively we are making life easier for ourselves and harder for the compiler and the computer. A single call to a recursive program can cause it to execute a tree-full of calls to itself before it is able to respond to our original request. For example, if we call Quicksort to sort the array

    {5, 8, 13, 9, 15, 29, 44, 71, 67}

then the tree shown in Fig. 2.7.1 might be generated by the compiler.

    Fig. 2.7.1: A tree of calls to Quicksort

Again, a single invocation of chrompoly, where the input graph is a 4-cycle, might generate the tree of recursive calls that appears in Fig. 2.7.2, and if we call maxset1 on the 5-cycle, a similar tree of calls may be created.

    Fig. 2.7.2: A tree of calls to chrompoly

Finally, if we call the 'power of 2' version of the FFT algorithm on a sequence of four numbers {1, …, 1}, then FFT will proceed to manufacture the tree shown in Fig. 2.7.3.

    Fig. 2.7.3: The recursive call tree for FFT

It must be emphasized that the creation of the tree of recursions is done by the compiler without any further effort on the part of the programmer. In general, how does a compiler go about making such a tree? It does it by using an auxiliary stack. It adopts the philosophy that if it is asked to do two things at once, well, after all, it can't do that, so it does one of those two things and drops the other request on top of a stack of unfinished business. When it finishes executing the first request it goes to the top of the stack to find out what to do next. In the case of maxset1, if the compiler is trying to execute maxset1 on a graph that has edges, it will drop the graph G − {v*} − Nbhd(v*) on the stack and try to do the graph G − {v*}, and it continues from there.

Example. Let's follow the compiler through its tribulations as it attempts to deal with our request for the maximum independent set size of the 5-cycle. The stack is initially empty. We begin by asking for the maxset1 of the 5-cycle. Our program immediately makes two recursive calls to maxset1, on each of the two graphs that appear on the second level of the tree of calls. The compiler says to itself 'I can't do these both at once,' so it puts the right-hand graph (involving vertices 3, 4) on the stack, awaiting processing, and proceeds to call itself on the left-hand graph (vertices 2, 3, 4, 5). Next, as the compiler works on that graph, two more graphs are generated, of which the right-hand one (4, 5) is dropped onto the stack, so now two graphs are on the stack, and the compiler is dealing with the graph (3, 4, 5). That graph is broken up into (5), which is dutifully dropped onto the stack, on top of the graphs that previously lived there, and (4, 5). As the compiler works on (4, 5), that graph is broken up into (5) and an empty graph; the empty graph goes onto the stack, which now holds four items, and the compiler can work on (5).

Finally, something fruitful happens: the graph (5) has no edges, so the program maxset1 gives, in its trivial case, very specific instructions as to how to deal with this graph. We now know that the graph that consists of just the single vertex (5) has a maxset1 value of 1. The compiler next reaches for the graph on top of the stack, finds that it is the empty graph, which has no edges at all, and therefore its maxset size is 0. It now knows the n1 = 1 and the n2 = 0 values that appear in the algorithm maxset1 as applied to the graph (4, 5), and it can therefore execute the instruction

    maxset1 := max(n1, 1 + n2),

from which it finds that the value of maxset1 for the graph (4, 5) is 1. This time the graph of just one vertex (5) is next reached for on top of the stack, and it continues from there.

The reader should try to write out, as a formal algorithm, the procedure that we have been describing, whereby the compiler deals with a recursive computation that branches into two sub-computations until a trivial case is reached.

Bibliography

A definitive account of all aspects of sorting is in

D. E. Knuth, The art of computer programming, Vol. 3: Sorting and searching, Addison Wesley, Reading, MA, 1973.

All three volumes of the above reference are highly recommended for the study of algorithms and discrete mathematics.

The realization that the Fourier transform calculation can be speeded up has been traced back to

C. Runge, Zeits. Math. Phys. 48 (1903), p. 443,

and also appears in

C. Runge and H. König, Die Grundlehren der math. Wissensch., Vol. 11, Springer Verlag, Berlin, 1924.

The introduction of the method in modern algorithmic terms is generally credited to

J. W. Cooley and J. W. Tukey, An algorithm for the machine calculation of complex Fourier series, Mathematics of Computation 19 (1965), 297-301.

A number of statistical applications of the method are in

J. W. Cooley, P. A. W. Lewis and P. D. Welch, The Fast Fourier Transform and its application to time series analysis, in Statistical Methods for Digital Computers, Enslein, Ralston and Wilf eds., John Wiley & Sons, New York, 1977, 377-423.

An excellent account of the above, as well as of applications of the FFT to polynomial arithmetic, is by

A. V. Aho, J. E. Hopcroft and J. D. Ullman, The design and analysis of computer algorithms, Addison Wesley, Reading, MA, 1974 (chap. 7).

The use of the FFT for high precision integer arithmetic is due to

A. Schönhage and V. Strassen, Schnelle Multiplikation grosser Zahlen, Computing 7 (1971), 281-292.

Recent developments in fast matrix multiplication are traced in

Victor Pan, How to multiply matrices faster, Lecture notes in computer science No. 179, Springer-Verlag, 1984.

A O(2^{n/3}) algorithm for the maximum independent set problem can be found in

R. Tarjan and A. Trojanowski, Finding a maximum independent set, SIAM J. Computing 6 (1977), 537-546.

Chapter 3: The Network Flow Problem

3.1 Introduction

The network flow problem is an example of a beautiful theoretical subject that has many important applications. It also has generated algorithmic questions that have been in a state of extremely rapid development in the past 20 years. Altogether, the fastest algorithms that are now known for the problem are much faster, and some are much simpler, than the ones that were in use a short time ago, but it is still unclear how close to the 'ultimate' algorithm we are.

Roughly speaking, suppose first that we are given a directed graph (digraph) G. That is, we are given a set of vertices, and a set of ordered pairs of these vertices, these pairs being the edges of the digraph. For two distinct vertices u and v it is perfectly OK to have both an edge from u to v and an edge from v to u, or one of them, or neither. No edge (u, u) is permitted. If an edge e is directed from vertex v to vertex w, then v is the initial vertex of e and w is the terminal vertex of e. We may then write v = Init(e) and w = Term(e).

Next, in a network there is associated with each directed edge e of the digraph a positive real number called its capacity, and denoted by cap(e). Finally, two of the vertices of the digraph are distinguished. One, s, is the source, and the other, t, is the sink of the network.

Definition. A network is an edge-capacitated directed graph, with two distinguished vertices called the source and the sink.

We will let X denote the resulting network. It consists of the digraph G, the given set of edge capacities, the source, and the sink. A network is shown in Fig. 3.1.1.

Fig. 3.1.1: A network

Now roughly speaking, we can think of the edges of G as conduits for a fluid, the capacity of each edge being the carrying-capacity of the edge for that fluid. Imagine that the fluid flows in the network from the source to the sink, in such a way that the amount of fluid in each edge does not exceed the capacity of that edge. We want to know the maximum net quantity of fluid that could be flowing from source to sink.

That was a rough description of the problem; here it is again, more precisely, and this time a little more slowly.

Definition. A flow in a network X is a function f that assigns to each edge e of the network a real number f(e), in such a way that

(1) For each edge e we have 0 ≤ f(e) ≤ cap(e), and
(2) For each vertex v other than the source and the sink, it is true that

    ∑_{e: Init(e)=v} f(e) = ∑_{e: Term(e)=v} f(e).    (3.1.1)
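The two defining conditions can be checked mechanically. Here is a minimal Python sketch (ours, not the book's) that tests whether a proposed function f is a flow: condition (1) on every edge, and the conservation condition (3.1.1) at every vertex other than s and t. The dictionary encoding of edges as ordered vertex pairs is an assumption of this sketch.

```python
def is_flow(vertices, cap, f, source, sink):
    """cap and f map directed edges (v, w) to numbers."""
    # Condition (1): every edge carries between 0 and cap(e) units.
    if any(not (0 <= f[e] <= cap[e]) for e in cap):
        return False
    # Condition (2): outflow equals inflow at each vertex other than s and t.
    for v in vertices:
        if v in (source, sink):
            continue
        out_v = sum(f[e] for e in cap if e[0] == v)
        in_v = sum(f[e] for e in cap if e[1] == v)
        if out_v != in_v:
            return False
    return True

# A tiny example of ours: s -> a -> t carrying 2 units.
cap = {('s', 'a'): 3, ('a', 't'): 2}
f = {('s', 'a'): 2, ('a', 't'): 2}
print(is_flow({'s', 'a', 't'}, cap, f, 's', 't'))  # True
```

Note that a function violating conservation at the internal vertex a (for instance, 2 units in and only 1 unit out) is rejected.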

The condition (3.1.1) is a flow conservation condition. It states that the outflow from v (the left side of (3.1.1)) is equal to the inflow to v (the right side) for all vertices v other than s and t. Flow cannot be manufactured anywhere in the network except at s or t; at other vertices, only redistribution or rerouting takes place. In the theory of electrical networks such conservation conditions are known as Kirchhoff's laws.

Since the source and the sink are exempt from the conservation conditions there may, and usually will, be a nonzero net flow out of the source, and a nonzero net flow into the sink. Intuitively it must already be clear that these two are equal: if we let Q be the net outflow from the source, then Q is also the net inflow to the sink. In fact that is true, and we will prove it below. The quantity Q is called the value of the flow.

In Fig. 3.1.2 there is shown a flow in the network of Fig. 3.1.1. The amounts of flow in each edge are shown in the square boxes. The other number on each edge is its capacity. The letter inside the small circle next to each vertex is the name of that vertex. The value of the flow in Fig. 3.1.2 is Q = 32.

Fig. 3.1.2: A flow in a network

The network flow problem, the main subject of this chapter, is: given a network X, find the maximum possible value of a flow in X, and find a flow of that value.

3.2 Algorithms for the network flow problem

The first algorithm for the network flow problem was given by Ford and Fulkerson. They used that algorithm not only to solve instances of the problem, but also to prove theorems about network flow. In particular, they used their algorithm to prove the 'max-flow-min-cut' theorem, which we state below as theorem 3.4.1, and which occupies a central position in the theory.

The speed of their algorithm, it turns out, depends on the edge capacities in the network as well as on the numbers V of vertices and E of edges of the network. Indeed, for certain (irrational) values of edge capacities they found that their algorithm might not converge at all (see section 3.5). In 1969 Edmonds and Karp gave the first algorithm for the problem whose speed is bounded by a polynomial function of E and V only. In fact that algorithm runs in time O(E²V). Since then there has been a steady procession of improvements in the algorithms, culminating, at the time of this writing anyway, with an O(EV log V) algorithm. The chronology is shown in Table 3.2.1.

The maximum number of edges that a network of V vertices can have is Θ(V²). A family of networks might be called dense if there is a K > 0 such that |E(X)| > K|V(X)|² for all networks in the family. The reader should check that for dense networks, all of the time complexities in Table 3.2.1, beginning with Karzanov's algorithm, are in the neighborhood of O(V³). On the other hand, for sparse networks (networks with relatively few edges), the later algorithms in the table will give significantly better performances than the earlier ones.

Exercise. Given K > 0, consider the family of all possible networks X for which |E(X)| = K|V(X)|. In this family, evaluate all of the complexity bounds in Table 3.2.1 and find the fastest algorithm for the family.
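The dense/sparse comparison can be made concrete numerically. The sketch below (ours, not the book's) substitutes E = V² for a dense family and E = 8V for a sparse one into a few of the bounds from Table 3.2.1; the constant 8 and the sample size V = 1024 are arbitrary choices for illustration.

```python
from math import log2

# A few of the bounds of Table 3.2.1, as functions of V and E.
bounds = {
    'E^2 V   (Edmonds-Karp)':   lambda V, E: E**2 * V,
    'E V^2   (Dinic)':          lambda V, E: E * V**2,
    'V^3     (Karzanov)':       lambda V, E: V**3,
    'E V lgV (Sleator-Tarjan)': lambda V, E: E * V * log2(V),
}

V = 1024
for name, cost in bounds.items():
    # dense: E = V^2; sparse: E = 8V
    print(name, f'dense: {cost(V, V * V):.1e}', f'sparse: {cost(V, 8 * V):.1e}')
```

For the dense family the V³ bound beats E V², exactly as the text remarks, while for the sparse family the later O(EV log V) bound is far smaller than V³.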

Author(s)                                       Year    Complexity
Ford, Fulkerson                                 1956    -----
Edmonds, Karp                                   1969    O(E²V)
Dinic                                           1970    O(EV²)
Karzanov                                        1973    O(V³)
Cherkassky                                      1976    O(√E V²)
Malhotra, Pramodh-Kumar and Maheshwari (MPM)    1978    O(V³)
Galil                                           1978    O(V^{5/3} E^{2/3})
Galil and Naamad                                1979    O(EV log² V)
Sleator and Tarjan                              1980    O(EV log V)
Goldberg and Tarjan                             1985    O(EV log (V²/E))

Table 3.2.1: Progress in network flow algorithms

Among the algorithms in Table 3.2.1 we will discuss just two in detail. The first will be the original algorithm of Ford and Fulkerson, because of its importance and its simplicity, if not for its speed. The second will be the 1978 algorithm of Malhotra, Pramodh-Kumar and Maheshwari (MPM), for three reasons. First, it is fast. Second, it uses the idea, introduced by Dinic in 1970 and common to all later algorithms, of layered networks. Third, it is extremely simple and elegant in its conception, and so it represents a good choice for those who may wish to program one of these algorithms for themselves.

3.3 The algorithm of Ford and Fulkerson

The basic idea of the Ford-Fulkerson algorithm for the network flow problem is this: start with some flow function (initially this might consist of zero flow on every edge). Then look for a flow augmenting path in the network, augment the flow along that path as much as possible, and repeat. A flow augmenting path is a path from the source to the sink along which we can push some additional flow.

In Fig. 3.3.1 below we show a flow augmenting path for the network of Fig. 3.1.2. The capacities of the edges are shown on each edge, and the values of the flow function are shown in the boxes on the edges.

Fig. 3.3.1: A flow augmenting path

Fig. 3.3.2: The path above, after augmentation

An edge can get elected to a flow augmenting path for two possible reasons. Either

(a) the direction of the edge is coherent with the direction of the path from source to sink and the present value of the flow function on the edge is below the capacity of that edge, or
(b) the direction of the edge is opposed to that of the path from source to sink and the present value of the flow function on the edge is strictly positive.

That is, on all edges of a flow augmenting path that are coherently oriented with the path we can increase the flow along the edge, and on all edges that are incoherently oriented with the path we can decrease the flow on the edge, and in either case we will have increased the value of the flow (think about that one until it makes sense).

It may be helpful to remark that an edge is coherently or incoherently oriented only with respect to a given path from source to sink. That is, the coherence, or lack of it, is not only a property of the directed edge, but depends on how the edge sits inside a chosen path.

Indeed, in Fig. 3.3.1 the first edge is directed towards the source, i.e., incoherently with the path. Hence if we decrease the flow in that edge, we will have increased the value of the flow function, namely the net flow out of the source. That particular edge can indeed have its flow decreased, by at most 8 units. The next edge carries 10 units of flow towards the source. Therefore if we decrease the flow on that edge, we will also have increased the value of the flow, by up to 10 units. Finally, the edge into the sink carries 12 units of flow and is oriented towards the sink, i.e., coherently with the path. Hence if we increase the flow in this edge, we will have increased the value of the flow in the network, by at most 3 units since its capacity is 15.

The most that we might accomplish with this path would be to push 3 more units of flow through it from source to sink. We couldn't push more than 3 units through because one of the edges (the edge into the sink) will tolerate an augmentation of only 3 flow units before reaching its capacity.

It is, of course, necessary to maintain the conservation of flow, i.e., to respect Kirchhoff's laws. To do this we will augment the flow on every edge of an augmenting path by the same amount. To augment the flow by 3 units we would diminish the flow by 3 units on each of the first two edges and increase it by 3 units on the last edge. Note carefully that if these augmentations are made then flow conservation at each vertex of the network will still hold (check this!): if the conservation conditions were satisfied before the augmentation then they will still be satisfied after such an augmentation.

Since every edge in the path that is shown in Fig. 3.3.1 can have its flow altered in one way or the other so as to increase the flow in the network, the path is indeed a flow augmenting path. The resulting flow in this path, after augmentation, is shown in Fig. 3.3.2. The flow in the full network, after this augmentation, is shown in Fig. 3.3.3.

Fig. 3.3.3: The network, after augmentation of flow

The value of the flow in Fig. 3.1.2 was 32 units. After augmenting the flow by 3 units as we have just described, the flow function in Fig. 3.3.3 has a value of 35 units.

We have just described the main idea of the Ford-Fulkerson algorithm. It first finds a flow augmenting path. Then it augments the flow along that path as much as it can. Then it finds another flow augmenting
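The bookkeeping just described can be sketched in a few lines of Python (ours, not the book's; the triple encoding of flow, capacity and orientation is an assumption). The numbers mirror the example in the text: two incoherent edges carrying 8 and 10 units, and a coherent final edge with flow 12 and capacity 15, so the path admits min(8, 10, 3) = 3 additional units. The capacities 9 and 13 on the first two edges are made up, since only their flows matter here.

```python
def augmentable(path):
    # Coherent edges have cap - flow units of headroom; incoherent edges can
    # give up at most their current flow.
    return min((cap - flow) if coherent else flow
               for flow, cap, coherent in path)

def augment(path, z):
    # Push z more units source-to-sink: raise the flow on coherent edges and
    # lower it on incoherent ones, preserving conservation at each vertex.
    return [(flow + z if coherent else flow - z, cap, coherent)
            for flow, cap, coherent in path]

path = [(8, 9, False), (10, 13, False), (12, 15, True)]
z = augmentable(path)          # 3, limited by the edge into the sink
print(z, augment(path, z))
```

After the augmentation the edge into the sink is saturated at 15 = cap, exactly as in Fig. 3.3.2.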

path, etc. The algorithm terminates when no flow augmenting paths exist. We will prove that when that happens, the flow will be at the maximum possible value, i.e., we will have found the solution of the network flow problem.

Now, given a network and a flow in that network, how do we find a flow augmenting path from the source to the sink? This is done by a process of labelling and scanning the vertices of the network, beginning with the source and proceeding out to the sink. Initially all vertices are in the conditions 'unlabeled' and 'unscanned.' As the algorithm proceeds, various vertices will become labeled, and if a vertex is labeled, it may become scanned.

To scan a vertex v means, roughly, that we stand at v and look around at all neighbors w of v that haven't yet been labeled. If e is some edge that joins v with a neighbor w, and if the edge e is usable from v to w in the sense of the following definition, then we will label w, because any flow augmenting path that has already reached from the source to v can be extended another step, to w.

Definition. Let f be a flow function in a network X. We say that an edge e of X is usable from v to w if either e is directed from v to w and the flow in e is less than the capacity of the edge, or e is directed from w to v and the flow in e is > 0.

We will now describe the steps of the algorithm in more detail.

The label that every vertex v gets is a triple (u, ±, z), and here is what the three items mean. The 'u' part of the label of v is the name of the vertex that was being scanned when v was labeled. The '±' will be '+' if v was labeled because the edge (u, v) was usable from u to v (i.e., if the flow from u to v was less than the capacity of (u, v)), and it will be '−' if v was labeled because the edge (v, u) was usable from u to v (i.e., if the flow from v to u was > 0).

Finally, the 'z' component of the label represents the largest amount of flow that can be pushed from the source to the present vertex v along any augmenting path that has so far been found. At each step the algorithm will replace the current value of z by the amount of new flow that could be pushed through to v along the edge that is now being examined, if that amount is smaller than z. As the algorithm proceeds, the labels that get attached to the different vertices form a record of how much flow can be pushed through the network from the source to the various vertices, and by exactly which routes.

To begin with, the algorithm labels the source with (−∞, +, ∞). The source now has the label-status labeled and the scan-status unscanned. Next we will scan the source. Here is the procedure for scanning any vertex u.

procedure scan(u: vertex; X: network; f: flow);
{scans vertex u}
for every 'unlabeled' vertex v that is connected to u by an edge in either or both directions do
  if the flow in (u, v) is less than cap(u, v)
    then label v with (u, +, min{z(u), cap(u, v) − flow(u, v)})
  else if the flow in (v, u) is > 0
    then label v with (u, −, min{z(u), flow(v, u)}),
  and in either case change the label-status of v to 'labeled';
change the scan-status of u to 'scanned'
end.{scan}

We can use the above procedure to describe the complete scanning and labelling of the vertices of the network, as follows.
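Procedure scan transcribes directly into Python. The sketch below is ours, not the book's code; the dictionaries keyed by ordered vertex pairs are an assumed encoding, and z(u) is read from the third component of u's label.

```python
def scan(u, vertices, cap, flow, label, scanned):
    """Label every unlabeled neighbor of u reachable by a usable edge."""
    z_u = label[u][2]
    for v in vertices:
        if v in label:
            continue  # only 'unlabeled' vertices are considered
        if cap.get((u, v), 0) - flow.get((u, v), 0) > 0:
            # forward edge with spare capacity: usable from u to v
            label[v] = (u, '+', min(z_u, cap[(u, v)] - flow.get((u, v), 0)))
        elif flow.get((v, u), 0) > 0:
            # backward edge carrying flow: also usable from u to v
            label[v] = (u, '-', min(z_u, flow[(v, u)]))
    scanned.add(u)

# Scan the source of a small two-edge network s -> a -> t (our toy data).
cap = {('s', 'a'): 5, ('a', 't'): 4}
flow = {}
label = {'s': (None, '+', float('inf'))}
scan('s', ['s', 'a', 't'], cap, flow, label, set())
print(label['a'])  # ('s', '+', 5)
```

The sink is not labeled by this single scan, since no edge joins s to t directly; a subsequent scan of a would label it.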

procedure labelandscan(X: network; f: flow; whyhalt: reason);
give every vertex of X the scan-status 'unscanned' and the label-status 'unlabeled';
label-status of source := 'labeled';
label source with (−∞, +, ∞);
while {there is a 'labeled' and 'unscanned' vertex v and sink is 'unlabeled'}
  do scan(v, X, f);
if sink is unlabeled then whyhalt := 'flow is maximum'
  else whyhalt := 'it's time to augment'
end.{labelandscan}

Obviously the labelling and scanning process will halt for one of two reasons: either the sink t acquires a label, or the sink never gets labeled but no more labels can be given. In the first case we will see that a flow augmenting path from source to sink has been found, and in the second case we will prove that the flow is at its maximum possible value, so the network flow problem has been solved.

Suppose the sink does get a label, for instance the label (u, ±, z). Then we claim that the value of the flow in the network can be augmented by z units. To prove this we will construct a flow augmenting path, using the labels on the vertices, and then we will change the flow by z units on every edge of that path in such a way as to increase the value of the flow function by z units. This is done as follows.

If the sign part of the label of t is '+,' then increase the flow function by z units on the edge (u, t); else decrease the flow on edge (t, u) by z units. Then move back one step away from the sink, to vertex u, and look at its label, which might be (w, ±, z1). If the sign is '+' then increase the flow on edge (w, u) by z units (not by z1 units!), while if the sign is '−' then decrease the flow on edge (u, w) by z units. Next replace u by w, etc., and continue until the source s has been reached.

A little more formally, the flow augmentation algorithm is the following.

procedure augmentflow(X: network; f: flow; amount: real);
{assumes that labelandscan has just been done}
v := sink;
amount := the 'z' part of the label of sink;
repeat
  (previous, sign, z) := label(v);
  if sign = '+' then increase f(previous, v) by amount
    else decrease f(v, previous) by amount;
  v := previous
until v = source
end.{augmentflow}

The value of the flow in the network has now been increased by z units. The whole process of labelling and scanning is now repeated, to search for another flow augmenting path. The algorithm halts only when we are unable to label the sink. The complete Ford-Fulkerson algorithm is shown below.
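The walk-back in procedure augmentflow can be sketched in Python as follows (ours, not the book's code). The labels in the example are hypothetical ones, of the kind labelandscan would produce on a two-edge network s → a → t with 2 units of headroom on each edge.

```python
def augmentflow(label, flow, s, t):
    """Walk the labels back from the sink, adjusting the flow by 'amount'."""
    amount = label[t][2]            # the 'z' part of the sink's label
    v = t
    while v != s:
        prev, sign, _ = label[v]
        if sign == '+':
            # coherent edge: raise its flow
            flow[prev, v] = flow.get((prev, v), 0) + amount
        else:
            # incoherent edge: lower its flow
            flow[v, prev] -= amount
        v = prev
    return amount

# Hypothetical labels after a labelandscan pass (our toy data).
label = {'s': (None, '+', float('inf')),
         'a': ('s', '+', 2),
         't': ('a', '+', 2)}
flow = {('s', 'a'): 0, ('a', 't'): 0}
print(augmentflow(label, flow, 's', 't'), flow)
```

Every edge of the path changes by the same amount, 2 units here, so conservation still holds at the interior vertex a.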

procedure fordfulkerson(X: network; f: flow; maxflowvalue: real);
{finds maximum flow in a given network X}
set f := 0 on every edge of X;
maxflowvalue := 0;
repeat
  labelandscan(X, f, whyhalt);
  if whyhalt = 'it's time to augment' then
    augmentflow(X, f, amount);
    maxflowvalue := maxflowvalue + amount
until whyhalt = 'flow is maximum'
end.{fordfulkerson}

Let's look at what happens if we apply the labelling and scanning algorithm to the network and flow shown in Fig. 3.1.2. First vertex s gets the label (−∞, +, ∞). We then scan s: vertex A gets the label (s, −, 8), and C gets labeled with (s, +, 10), which completes the scan of s. Next we scan vertex A, during which D acquires the label (A, −, 8). Then C is scanned, which results in E getting the label (C, +, 8). B cannot be labeled. Finally, the scan of D results in the label (D, +, 3) for the sink t.

From the label of t we see that there is a flow augmenting path in the network along which we can push 3 more units of flow from s to t. We find the path as in procedure augmentflow above, following the labels backwards from t to D, to A, and to s. The path in question will be seen to be exactly the one shown in Fig. 3.3.1, and further augmentation proceeds as we have discussed above.

3.4 The max-flow min-cut theorem

Now we are going to look at the state of affairs that holds when the flow augmentation procedure terminates because it has not been able to label the sink. We want to show that then the flow will have a maximum possible value.

Definition. Let U and V be two (not necessarily disjoint) sets of vertices of the network X, and let f be a flow function for X. By f(U, V) we mean the sum of the values of the flow function along all edges whose initial vertex lies in U and whose terminal vertex lies in V. Similarly, by cap(U, V) we mean the sum of the capacities of all of those edges. Finally, by the net flow out of U we mean f(U, U̅) − f(U̅, U), where U̅ = V(X) − U.

Definition. Let W ⊂ V(X), with W̅ = V(X) − W, and suppose that W contains the source and W does not contain the sink. By the cut (W, W̅) we mean the set of all edges of X whose initial vertex is in W and whose terminal vertex is in W̅.

For example, one cut in a network consists of all edges whose initial vertex is the source.

If we define the capacity of a cut to be the sum of the capacities of all edges in the cut, then it seems clear that the value of a flow can never exceed the capacity of any cut. Indeed, every unit of flow that leaves the source and arrives at the sink must at some moment flow from a vertex of W to a vertex of W̅, i.e., must flow along some edge of the cut (W, W̅). It follows intuitively that the maximum value of a flow cannot exceed the minimum capacity of any cut.

The main result of this section is the 'max-flow min-cut' theorem of Ford and Fulkerson, which we state as

Theorem 3.4.1. The maximum possible value of any flow in a network is equal to the minimum capacity of any cut in that network.

Proof: We will first do a little computation to show that the value of a flow can never exceed the capacity of a cut. Second, we will show that when the Ford-Fulkerson algorithm terminates because it has been unable to label the sink, then at that moment there is a cut in the network whose edges are saturated with flow, i.e., such that the flow in each edge of the cut is equal to the capacity of that edge.
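On a small example one can enumerate every cut and watch both halves of the theorem at work: the net flow across each cut equals the value of the flow, and the minimum cut capacity equals the maximum flow value. The network and its maximum flow below are ours, found by inspection.

```python
from itertools import combinations

def across(func, W, Wbar):
    """Sum func over all edges from W to Wbar."""
    return sum(val for (u, v), val in func.items() if u in W and v in Wbar)

cap = {('s', 'a'): 3, ('s', 'b'): 2, ('a', 't'): 2, ('b', 't'): 3}
# A maximum flow for this network (value 4), found by inspection.
f = {('s', 'a'): 2, ('s', 'b'): 2, ('a', 't'): 2, ('b', 't'): 2}

all_verts = {'s', 'a', 'b', 't'}
cuts = []
for k in range(3):
    for extra in combinations(sorted({'a', 'b'}), k):
        W = {'s'} | set(extra)            # s in W, t never in W
        Wbar = all_verts - W
        net = across(f, W, Wbar) - across(f, Wbar, W)
        cuts.append((across(cap, W, Wbar), net))

print(cuts)  # every net flow across a cut is 4; the minimum capacity is 4
```

The cut {s, a} is the saturated one here: its edges (a, t) and (s, b) carry flow equal to their capacities 2 and 2.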

We will first prove

Lemma 3.4.1. Let f be a flow of value Q in a network X, and let (W, W̅) be a cut in X. Then

    Q = f(W, W̅) − f(W̅, W) ≤ cap(W, W̅).    (3.4.1)

Proof of lemma: The net flow out of s is Q, and the net flow out of any other vertex w ∈ W is 0, since t ∉ W. Hence, summing over all w ∈ W, we obtain

    Q = ∑_{w∈W} { f(w, V(X)) − f(V(X), w) }
      = f(W, V(X)) − f(V(X), W)
      = f(W, W ∪ W̅) − f(W ∪ W̅, W)
      = f(W, W) + f(W, W̅) − f(W, W) − f(W̅, W)
      = f(W, W̅) − f(W̅, W).

This proves the '=' part of (3.4.1), and the '≤' part is obvious, completing the proof of lemma 3.4.1.

We now know that the maximum value of the flow in a network cannot exceed the minimum of the capacities of the cuts in the network. To complete the proof of the theorem we will show that a flow of maximum value, which surely exists, must saturate the edges of some cut.

Indeed, let f be a flow in X of maximum value, and call procedure labelandscan(X, f, whyhalt). The process cannot terminate with the sink labeled, for suppose the contrary. Then we would have termination with whyhalt = 'it's time to augment,' and if we were then to call procedure augmentflow we would find a flow of higher value, contradicting the assumed maximality of f.

Let W be the set of vertices of X that have been labeled when the algorithm terminates. Clearly s ∈ W, and equally clearly, t ∉ W. Hence the set W defines a cut (W, W̅).

We claim that every edge of the cut (W, W̅) is saturated. Indeed, if (x, y), with x ∈ W and y ∈ W̅, is an edge of the cut, then edge (x, y) is saturated, else y would have been labeled when we were scanning x and we would have y ∈ W, a contradiction. Similarly, if (y, x) is an edge where y ∈ W̅ and x ∈ W, then the flow f(y, x) = 0, else again y would have been labeled when we were scanning x, another contradiction.

Therefore, every edge from W to W̅ is carrying as much flow as its capacity permits, and every edge from W̅ to W is carrying no flow at all. Hence the sign of equality holds in (3.4.1), i.e., the value of the flow is equal to the capacity of the cut (W, W̅), and the proof of theorem 3.4.1 is finished.

In other words, if at termination we let W be the set of vertices that have been labeled, then (W, W̅) is a minimum cut of the network, and the present value of the flow is the desired maximum for the network.

3.5 The complexity of the Ford-Fulkerson algorithm

The algorithm of Ford and Fulkerson terminates if and when it arrives at a stage where the sink is not labeled but no more vertices can be labeled. The question now is, how long does it take to arrive at that stage, and is it guaranteed that we will ever get there? We are asking if the algorithm is finite, surely the most primitive complexity question imaginable.

First consider the case where every edge of the given network X has integer capacity. Then during the labelling and flow augmentation algorithms, various additions and subtractions are done, but there is no way that any nonintegral flows can be produced. It follows that the augmented flow is still integral, and therefore that the value of the flow increases by an integer amount, hence by at least 1 unit, during each augmentation. On the other hand, if, for the purposes of the present discussion, C* denotes the combined capacity of all edges that are outbound from the source, then it is eminently clear that the value of the flow can never exceed C*. Hence we see that no more than C* flow augmentations will be needed before a maximum flow is reached. This yields
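The counting argument can be watched in action. The sketch below (ours, not the book's code) is a straightforward implementation of the labelling/augmentation loop on an integer-capacity network; it counts augmentation rounds and confirms that the count never exceeds C*, the combined capacity of the source's outbound edges.

```python
def ford_fulkerson(vertices, cap, s, t):
    """Repeated labelandscan + augmentflow; returns (max flow value, rounds)."""
    flow = {e: 0 for e in cap}
    value, rounds = 0, 0
    while True:
        # labelandscan: grow labels outward from the source
        label = {s: (None, '+', float('inf'))}
        queue = [s]
        while queue and t not in label:
            u = queue.pop(0)
            for v in vertices:
                if v in label:
                    continue
                z = label[u][2]
                if cap.get((u, v), 0) > flow.get((u, v), 0):
                    label[v] = (u, '+', min(z, cap[u, v] - flow[u, v]))
                    queue.append(v)
                elif flow.get((v, u), 0) > 0:
                    label[v] = (u, '-', min(z, flow[v, u]))
                    queue.append(v)
        if t not in label:
            return value, rounds        # whyhalt = 'flow is maximum'
        # augmentflow: walk the labels back from the sink
        amount, v = label[t][2], t
        while v != s:
            prev, sign, _ = label[v]
            if sign == '+':
                flow[prev, v] += amount
            else:
                flow[v, prev] -= amount
            v = prev
        value += amount
        rounds += 1

cap = {('s','a'): 4, ('s','b'): 2, ('a','b'): 3, ('a','t'): 1, ('b','t'): 4}
c_star = sum(c for (u, _), c in cap.items() if u == 's')   # C* = 6
value, rounds = ford_fulkerson(['s', 'a', 'b', 't'], cap, 's', 't')
print(value, rounds <= c_star)  # maximum flow 5, in at most C* = 6 rounds
```

The network here is our own toy example; its maximum flow is 5, matching the capacity 5 of the cut consisting of the two edges into t.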

Theorem 3.5.1. In a network with integer capacities on all edges, the Ford-Fulkerson algorithm terminates after a finite number of steps with a flow of maximum value.

This is good news and bad news. The good news is that the algorithm is finite. The bad news is that the complexity estimate that we have proved depends not only on the numbers of edges and vertices in X, but on the edge capacities as well. If the bound C* represents the true behavior of the algorithm, rather than some weakness in our analysis of the algorithm, then even on very small networks it will be possible to assign edge capacities so that the algorithm takes a very long time to run. And it is possible to do that. We will show below an example due to Ford and Fulkerson in which the situation is even worse than the one envisaged above: not only will the algorithm take a very long time to run, it won't converge at all!

Consider the network X that is shown in Fig. 3.5.1. It has 10 vertices s, t, x1, ..., x4, y1, ..., y4. There are directed edges (s, xi) ∀i, (yj, t) ∀j, (xi, yj) ∀i, j, (xi, xj) ∀i ≠ j, (yi, yj) ∀i ≠ j, and (yi, xj) ∀i, j. The four edges Ai = (xi, yi) (i = 1, ..., 4) will be called the special edges.

Fig. 3.5.1: How to give the algorithm a hard time

Next we will give the capacities of the edges of X. Write r = (−1 + √5)/2, and let

    S = (3 + √5)/2 = ∑_{n=0}^{∞} rⁿ.

Then to every edge of X except the four special edges we assign the capacity S. The special edges A1, A2, A3, A4 are given capacities 1, r, r², r², respectively (you can see that this is going to be interesting).

Suppose, for our first augmentation step, that we find the flow augmenting path s → x1 → y1 → t, and that we augment the flow by 1 unit along that path. The four special edges will then have residual capacities (excesses of capacity over flow) of 0, r, r², r², respectively.

Inductively, suppose we have arrived at a stage of the algorithm where the four special edges, taken in some rearrangement A1, A2, A3, A4, have residual capacities 0, rⁿ, rⁿ⁺¹, rⁿ⁺¹, respectively. We will now show that the algorithm might next do two flow augmentation steps the net result of which would be that the inductive state of affairs would again hold, with n replaced by n + 1. Indeed, for the first of the two steps, choose the flow augmenting path s → x2 → y2 → x3 → y3 → t.

The only special edges that are on this path are A2 and A3. Augment the flow along this path by rⁿ⁺¹ units (the maximum possible amount). For the second step, choose the flow augmenting path s → x2 → y2 → y1 → x1 → y3 → x3 → y4 → t. Notice that with respect to this path the special edges A1 and A3 are incoherently directed. Augment the flow along this path by rⁿ⁺² units, once more the largest possible amount. These two augmentation steps together have increased the flow value by rⁿ⁺¹ + rⁿ⁺² = rⁿ units. The reader may now verify that the residual capacities of the four special edges are now 0, rⁿ⁺¹, rⁿ⁺², rⁿ⁺², in some rearrangement, so the inductive state of affairs again holds, with n replaced by n + 1. In the course of doing this verification it will be handy to use the fact that rⁿ⁺² = rⁿ − rⁿ⁺¹ (∀n ≥ 0).

Hence the flow in an edge will never exceed S units, and the algorithm converges to a flow of value S. Now comes the bad news: the maximum flow in this network has the value 4S (find it!). Hence, for this network, (a) the algorithm does not halt after finitely many steps even though the edge capacities are finite, and (b) the sequence of flow values converges to a number that is not the maximum flow in the network.

The irrational capacities on the edges may at first make this example seem 'cooked up.' But the implication is that even with a network whose edge capacities are all integers, the algorithm might take a very long time to run.

Motivated by the importance and beauty of the theory of network flows, and by the unsatisfactory time complexity of the original algorithm, many researchers have attacked the question of finding an algorithm whose success is guaranteed within a time bound that is independent of the edge capacities, and depends only on the size of the network. We turn now to the consideration of one of the main ideas on which further progress has depended, that of layering a network with respect to a flow function. This idea has triggered a whole series of improved algorithms. Following the discussion of layering we will give a description of one of the algorithms that uses layered networks and guarantees fast operation, the MPM algorithm.

3.6 Layered networks

Layering a network is a technique that has the effect of replacing a single max-flow problem by several problems, each a good deal easier than the original. More precisely, in a network with V vertices we will find that we can solve a max-flow problem by solving at most V slightly different problems, each on a layered network. We will then discuss an O(V²) method for solving each such problem on a layered network, and the result will be an O(V³) algorithm for the original network flow problem.

Now we will discuss how to layer a network with respect to a given flow function. The purpose of the italics is to emphasize the fact that one does not just 'layer a network.' Instead, there is given a network X and a flow function f for that network, and together they induce a layered network Y = Y(X, f), as follows.

First let us say that an edge e of X is helpful from u to v if either e is directed from u to v and f(e) is below capacity, or e is directed from v to u and the flow f(e) is positive.

Now we will describe the layered network Y. Recall that in order to describe a network one must describe the vertices of the network, the directed edges, give the capacities of those edges, and designate the source and the sink. The network Y will be constructed one layer at a time from the vertices of X, using the flow f as a guide. For each layer, we will say which vertices of X go into that layer, then we will say which vertices of the previous layer are connected to each vertex of the new layer, and finally we will give the capacities of each of these new edges.

The 0th layer of Y consists only of the source s. Layer 1 of Y will consist of every vertex v of X such that in X there is a helpful edge from s to v. We then draw an edge in Y directed from s to v for each such vertex v, and we assign to that edge in Y a capacity cap(s, v) − f(s, v) + f(v, s).
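The arithmetic facts that drive this example are easy to check numerically: r = (−1 + √5)/2 satisfies r² = 1 − r (hence rⁿ⁺² = rⁿ − rⁿ⁺¹ for all n ≥ 0), and the geometric series ∑ rⁿ sums to S = (3 + √5)/2. The check below is ours.

```python
from math import sqrt, isclose

r = (sqrt(5) - 1) / 2
S = (3 + sqrt(5)) / 2

# r^2 = 1 - r, which gives r^{n+2} = r^n - r^{n+1} for every n >= 0.
for n in range(10):
    assert isclose(r**(n + 2), r**n - r**(n + 1))

# The augmentation amounts 1, r, r^2, ... sum to S = 1/(1 - r).
geometric_sum = sum(r**n for n in range(200))  # r^200 is negligibly small
print(isclose(geometric_sum, S), isclose(4 * S, 6 + 2 * sqrt(5)))
```

So the flow values produced by the bad schedule of augmentations converge to S, while the true maximum, 4S = 6 + 2√5, is roughly four times larger.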

The set of all such v constitutes layer 1 of Y. Next we construct layer 2 of Y. The vertex set of layer 2 consists of all vertices w that do not yet belong to any layer, and such that there is a helpful edge in X from some vertex v of layer 1 to w. For each vertex w in layer 2, and for each vertex v in layer 1 for which there is a helpful edge in X from v to w, we draw a single edge in Y directed from v to w. Note that the edge always goes from v to w regardless of the direction of the helpful edge in X. Note also that, in contrast to the Ford-Fulkerson algorithm, additional edges may be drawn to the same w from other vertices v' in layer 1, even after an edge has been drawn from v to w in Y.

Assign capacities to the edges from layer 1 to layer 2 in the same way as described above; that is, the capacity in Y of the edge from v to w is cap(v, w) − f(v, w) + f(w, v). This latter quantity is, of course, the total residual (unused) flow-carrying capacity of the edges in both directions between v and w.

The layering continues either until we reach a layer L such that there is a helpful edge from some vertex of layer L to the sink t, or else until no additional layers can be created (to say that no more layers can be created is to say that among the vertices that haven't yet been included in the layered network that we are building, there aren't any that are adjacent to a vertex that is in the layered network, by a helpful edge). In the former case, we then create a layer L + 1 that consists solely of the sink t, and we connect t by edges directed from the appropriate vertices of layer L. In the latter case, where no additional layers can be created but the sink hasn't been reached, the present flow function f in X is maximum, and the network flow problem in X has been solved. Observe that not all vertices of X need appear in Y.

Here is a formal statement of the procedure for layering a given network X with respect to a given flow function f in X. Input are the network X and the present flow function f in that network. Output are the layered network Y, and a logical variable maxflow that will be True, on output, if the flow is at a maximum value, False otherwise.

procedure layer(X, f, Y, maxflow);
{forms the layered network Y with respect to the flow f in X; maxflow will be 'True' if the input flow f already has the maximum possible value for the network, else it will be 'False'}
L := 0; layer(L) := {source}; maxflow := false;
repeat
  layer(L + 1) := ∅;
  for each vertex u in layer(L) do
    for each vertex v such that {layer(v) = L + 1 or v is not in any layer} do
      q := cap(u, v) − f(u, v) + f(v, u);
      if q > 0 then
        do draw edge u → v in Y;
           assign capacity q to that edge;
           assign vertex v to layer(L + 1)
      end;
  L := L + 1;
  if layer(L) is empty then exit with maxflow := true
until sink is in layer(L);
delete from layer(L) of Y all vertices other than sink, and remove their incident edges from Y
end.{layer}

In Fig. 3.6.1 we show the typical appearance of a layered network. In contrast to a general network, in a layered network every path from the source to some fixed vertex v has the same number of edges in it (the number of the layer of v), and all edges on such a path are directed the same way, from the source towards v.
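Procedure layer is essentially a breadth-first search over helpful edges, carrying the residual quantity cap(u, v) − f(u, v) + f(v, u) along as the capacity in Y. The Python sketch below is ours, not the book's code; the dict-of-pairs encoding is an assumption.

```python
def layer(vertices, cap, f, s, t):
    """Build the layered network Y of X with respect to flow f.
    Returns (layers, edges, maxflow)."""
    residual = lambda u, v: cap.get((u, v), 0) - f.get((u, v), 0) + f.get((v, u), 0)
    layers, edges = [{s}], {}
    placed = {s}
    while True:
        nxt = set()
        for u in layers[-1]:
            for v in vertices:
                if v in placed and v not in nxt:
                    continue          # already in an earlier layer
                q = residual(u, v)
                if q > 0:             # helpful edge: draw u -> v in Y
                    edges[u, v] = q
                    nxt.add(v)
        if not nxt:
            return None, None, True   # no new layer: f is already maximum
        placed |= nxt
        layers.append(nxt)
        if t in nxt:
            # keep only the sink in the last layer, dropping stray vertices
            for w in list(nxt - {t}):
                layers[-1].discard(w)
                for e in [e for e in edges if w in e]:
                    del edges[e]
            return layers, edges, False

cap = {('s', 'a'): 2, ('a', 't'): 1}
layers, edges, maxflow = layer(['s', 'a', 't'], cap, {}, 's', 't')
print(layers)  # [{'s'}, {'a'}, {'t'}]
```

With the saturating flow f(s, a) = f(a, t) = 1 the same call instead reports maxflow = True, since the edge into t has no residual capacity left.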

These properties of layered networks are very friendly indeed, and make them much easier to deal with than general networks. In Fig. 3.6.2 we show specifically the layered network that results from the network of Fig. 3.1.2, with the flow shown therein.

Fig. 3.6.1: A general layered network

Fig. 3.6.2: A layering of the network in Fig. 3.1.2

The next question is this: exactly what problem would we like to solve on the layered network Y, and what is the relationship of that problem to the original network flow problem in the original network X? The answer is that in the layered network Y we are looking for a blocking flow g. By a blocking flow we mean a flow function g in Y such that every path from source to sink in Y has at least one saturated edge.

This immediately raises two questions: (a) what can we do if we find a blocking flow in Y? (b) how can we find a blocking flow in Y? The remainder of this section will be devoted to answering (a). In the next section we will give an elegant answer to (b).

Suppose that we have somehow found a blocking flow function g in Y. What we do with it is that we use it to augment the flow function f in X, as follows.

procedure augment(f, X; g, Y);
{augment flow f in X by using a blocking flow g in the corresponding layered network Y}
for every edge e: u → v of the layered network Y do
  increase the flow f in the edge u → v of the network X by the amount min{g(e), cap(u → v) − f(u → v)};
  if not all of g(e) has been used then decrease the flow in edge v → u of X by the unused portion of g(e)
end.{augment}
After augmenting the flow in the original network X, what then? We construct a new layered network, from X and the newly augmented flow function f on X, find a blocking flow in the new layered network, augment the flow in X again, make a new layered network, etc. The various activities that are now being described may sound like some kind of thinly disguised repackaging of the Ford-Fulkerson algorithm, but they aren't just that, because here is what can be proved to happen. First, if we start with zero flow in X, then after at most V phases ('phase' = layer + block + augment) we will have found the maximum flow in X and the process will halt. Second, each phase can be done very rapidly: the MPM algorithm, to be discussed in section 3.7, finds a blocking flow in a layered network in time O(V²).

By the height of a layered network Y we will mean the number of edges in any path from the source to the sink. The network of Fig. 3.6.1 has height 3.

Theorem 3.6.1. The heights of the layered networks that occur in the consecutive phases of the solution of a network flow problem form a strictly increasing sequence of positive integers. Hence, for a network X with V vertices, there can be at most V phases before a maximum flow is found.

Let Y(p) denote the layered network that is constructed at the pth phase of the computation, and let H(p) denote the height of Y(p). We will first prove

Lemma 3.6.1. If v0 → v1 → v2 → · · · → vm (v0 = source) is a path in Y(p + 1), and if every vertex vi (i = 1, ..., m) of that path also appears in Y(p), then for every a = 0, ..., m it is true that if vertex va was in layer b of Y(p) then a ≥ b.

Proof of lemma: The result is clearly true for a = 0. Suppose it is true for v0, v1, ..., va, and suppose va+1 was in layer c of network Y(p). We will show that a + 1 ≥ c; if not, then c > a + 1. Since va, by induction, was in a layer ≤ a of Y(p), it follows that the edge e∗ : va → va+1 was not present in network Y(p), since its two endpoints were not in two consecutive layers. Hence the flow in Y between va and va+1 could not have been affected by the augmentation procedure of phase p. But e∗ ∈ Y(p + 1), so e∗ represented an edge that was helpful from va to va+1 at the beginning of phase p + 1; since it was unaffected by phase p, it was helpful from va to va+1 at the beginning of phase p as well, and so va+1 would have entered Y(p) in a layer no higher than a + 1, contradicting c > a + 1. This contradiction establishes the lemma.

Now we will prove the theorem. Let s → v1 → v2 → · · · → vH(p+1)−1 → t be a path from source to sink in Y(p + 1). Consider first the case where every vertex of the path also lies in Y(p), and apply the lemma to vm = t (m = H(p + 1)): since t was in layer H(p) of Y(p), we conclude at once that H(p + 1) ≥ H(p). Now we want to exclude the '=' sign. If H(p + 1) = H(p) then the entire path is in Y(p) and in Y(p + 1), and so all of the edges in Y that the edges of the path represent were helpful both before and after the augmentation step of phase p, contradicting the fact that the blocking flow that was used for the augmentation saturated some edge of the chosen path. The theorem is now proved for the case where the path had all of its vertices in Y(p) also.

Now suppose that this was not the case. Let e∗ : va → va+1 be the first edge of the path whose terminal vertex va+1 was not in Y(p). The only possibility is that vertex va+1 would have entered into Y(p) in the layer H(p) that contains the sink, but that layer is special, and contains only t. Hence, if va was in layer b of Y(p), then b + 1 = H(p). By the lemma once more, a ≥ b, so a + 1 ≥ b + 1 = H(p); and since va+1 is not the sink, the path in Y(p + 1) continues beyond it, and therefore H(p + 1) > H(p), completing the proof of theorem 3.6.1.

To summarize: if we want to find a maximum flow in a given network X by the method of layered networks, we carry out procedure maxflow(X, f):

procedure maxflow(X, f);
{finds a maximum flow f in the network X}
set the flow function f to zero on all edges of X;
repeat
(i) construct the layered network Y = Y(X, f) if possible, else exit with flow at maximum value;
(ii) find a blocking flow g in Y;
(iii) augment the flow f in X with the blocking flow g, by calling procedure augment above
until exit occurs in (i)
end.{maxflow}

According to theorem 3.6.1, the procedure will repeat steps (i), (ii), (iii) at most V times, because the height of the layered network increases each time around, and it certainly can never exceed V. The labor involved in step (i) is certainly O(E), and so is the labor in step (iii). Hence, if BFL denotes the labor involved in some method of finding a blocking flow in a layered network, then the whole network flow problem can be done in time O(V · (E + BFL)). The idea of layering networks is due to Dinic. Since his work was done, all efforts have been directed at the problem of reducing BFL as much as possible.

3.7 The MPM algorithm

Now we suppose that we are given a layered network Y and we want to find a blocking flow in Y. The following ingenious suggestion is due to Malhotra, Pramodh-Kumar and Maheshwari. Let v be some vertex of Y. The in-potential of v is the sum of the capacities of all edges directed into v, and the out-potential of v is the total capacity of all edges directed out from v. The potential of v is the smaller of these two. The algorithm proceeds as follows.

(A) Find a vertex v of smallest potential, say P∗. Now we will push P∗ more units of flow from the source to the sink, as follows.

(B) (Pushout) Take the edges that are outbound from v in some order, and saturate each one with flow, unless and until saturating one more would lift the total flow used over P∗. Then assign all remaining flow to the next outbound edge (not necessarily saturating it), so the total outflow from v becomes exactly P∗.

(C) Follow the flow to the next higher layer of Y: for each vertex v′ of that layer, let h(v′) be the flow into v′. Now saturate all except possibly one outbound edge of v′, to pass through v′ the h(v′) units of flow. When all vertices v′ in that layer have been done, repeat for the next layer, etc. We never find a vertex with insufficient capacity, in or out, to handle the flow that is thrust upon it, because we began by choosing a vertex of minimum potential.

(D) (Pullback) When all layers 'above' v have been done, then follow the flow to the next layer 'below' v: for each vertex v′ of the next layer, let h(v′) be the flow out of v′ to v. Then saturate all except possibly one incoming edge of v′, to pass through v′ the h(v′) units of flow. When all v′ in that layer have been done, proceed to the next layer below v, etc.

(E) (Update capacities) The flow function that has just been created in the layered network must be stored somewhere. A convenient way to keep it is to carry out the augmentation procedure back in the network X at this time, thereby in effect 'storing' the contributions to the blocking flow in Y in the flow array for X. This can be done concurrently with the MPM algorithm as follows: every time we increase the flow in some edge u → v of Y, we do it by augmenting the flow from u to v in X and then decreasing the capacity of edge u → v in Y by the same amount. In that way the capacities of the edges in Y will always be the updated residual capacities, and the flow function f in X will always reflect the latest augmentation of the flow in Y.

(F) (Prune) We have now pushed the original P∗ units of flow through the whole layered network. We intend to repeat the operation on some other vertex v of minimum potential, but first we can prune off of the network some vertices and edges that are guaranteed never to be needed again.
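The potentials that drive step (A) are simple sums over the incident edges. The sketch below (our own small example, not one of the text's figures) tabulates the in-potential, out-potential, and overall potential of each vertex of a layered network given as an edge list:

```python
# Each edge is (u, v, capacity).  For an interior vertex the potential
# is min(in-potential, out-potential); the source has only an
# out-potential and the sink only an in-potential.
edges = [('s', 'a', 3), ('s', 'b', 2), ('a', 't', 1), ('b', 't', 4)]

def potentials(edges, source, sink):
    inp, outp = {}, {}
    for u, v, c in edges:
        outp[u] = outp.get(u, 0) + c
        inp[v] = inp.get(v, 0) + c
    pot = {}
    for v in set(inp) | set(outp):
        if v == source:
            pot[v] = outp.get(v, 0)
        elif v == sink:
            pot[v] = inp.get(v, 0)
        else:
            pot[v] = min(inp.get(v, 0), outp.get(v, 0))
    return pot

pot = potentials(edges, 's', 't')
print(min(pot, key=pot.get))   # the vertex chosen in step (A)
```

Here vertex a has in-potential 3 but out-potential 1, so its potential is 1 and it is the first vertex selected.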

Certainly we can delete from Y all of the edges that were saturated by the flow-pushing process just completed, since every edge has but one saturation to give to its network. Further, the vertex v itself now has either all of its incoming edges or all of its outgoing edges, or both, at zero residual capacity, because the full P∗ units of flow were pushed through v. Hence no more flow will ever be pushed through v, and therefore we can delete v from the network Y together with all of its incident edges. Having done so, we may now find that some vertex w has had all of its incoming or all of its outgoing edges deleted. Such a vertex will never be used again, so delete it and any other incident edges it may still have. Continue the pruning process until only vertices remain that have nonzero potential. If the source and the sink are still connected by some path, then repeat from (A) above; else the algorithm halts.

The blocking flow function g that we have just found is the following: if e is an edge of the input layered network Y, then g(e) is the sum of all of the flows that were pushed through edge e at all stages of the above algorithm. It is obviously a blocking flow: since no path between s and t remains, every path must have had at least one of its edges saturated at some step of the algorithm.

It follows that steps (A)–(E) can be executed at most V times before we halt with a blocking flow, because the vertex v that had minimum potential will surely have had either all of its incoming or all of its outgoing edges (or both) saturated, and so is deleted at the pruning stage.

What is the complexity of this algorithm? Certainly we delete at least one vertex from the network at every pruning stage, so we use at most V minimal-potential vertices altogether. Initially we compute and store the in- and out-potentials of every vertex. Thereafter, each time the flow in some edge is increased, the out-potential of its initial vertex and the in-potential of its terminal vertex are reduced by the same amount. It follows that the cost of maintaining these arrays is linear in the number of vertices, so the operation of finding a vertex of minimum potential is 'free,' in the sense that it affects only the constants implied by the 'big oh' symbols, but not the orders of magnitude. For each minimal-potential vertex v we visit at most V other vertices, and the number of partial edge-saturation operations is at most two per vertex visited. So the partial edge-saturation operations cost O(V²), and the cost of saturating all edges that get saturated is O(E), since every edge has but one saturation to give to its network. The total cost is therefore O(V²) for the complete MPM algorithm that finds a blocking flow in a layered network. Hence, since at most V layered networks need to be looked at in order to find a maximum flow in the original network, a maximum flow in a network can be found in O(V³) time. In contrast to the nasty example network of section 3.5, with its irrational edge capacities, that made the Ford-Fulkerson algorithm into an infinite process that converged to the wrong answer, the time bound O(V³) that we have just proved for the layered-network MPM algorithm is totally independent of the edge capacities.

3.8 Applications of network flow

We conclude this chapter by mentioning some applications of the network flow problem and algorithm. One most often mentions first the problem of maximum matching in a bipartite graph. Consider a set of P people and a set of J jobs, such that not all of the people are capable of doing all of the jobs, and each job needs only one person. We construct a graph of P + J vertices to represent this situation: take P vertices to represent the people, J vertices to represent the jobs, and connect vertex p to vertex j by an undirected edge if person p can do job j. In general, a graph G is bipartite if its vertices can be partitioned into two classes in such a way that no edge runs between two vertices of the same class (see section 1.6); such is the graph just constructed. In Fig. 3.8.1 below we show a graph that might result from a certain group of 8 people and 9 jobs.

The maximum matching problem is just this: assuming that each person can handle at most one of the jobs, assign people to the jobs in such a way that the largest possible number of people are employed. In terms of the bipartite graph G, we want to find a maximum number of edges, no two incident with the same vertex.

To solve this problem by the method of network flows we construct a network Y, as follows. If we let P, J denote the two classes of vertices in the graph G, first we adjoin two new vertices s, t to the bipartite graph G; then we draw an edge from s to each p ∈ P and an edge from each j ∈ J to t. Each edge in the network is

given capacity 1. The result of this construction, applied to the graph of Fig. 3.8.1, is shown in Fig. 3.8.2.

Fig. 3.8.1: Matching people to jobs
Fig. 3.8.2: The network for the matching problem

Consider a maximum integer-valued flow in this network, of value Q. Since each edge has capacity 1, Q edges of the type (s, p) each contain a unit of flow. Out of each vertex p that receives some of this flow there will come one unit of flow (since inflow equals outflow at such vertices), which will then cross to a vertex j of J. No such j will receive more than one unit, because at most one unit can leave it for the sink t. Hence the flow defines a matching of Q edges of the graph G. Conversely, any matching in G defines a flow; hence a maximum flow corresponds to a maximum matching. In Fig. 3.8.3 we show a maximum flow in the network of Fig. 3.8.2, and therefore a maximum matching in the graph of Fig. 3.8.1.

Fig. 3.8.3: A maximum flow

For a second application of network flow methods, consider an undirected graph G. The edge-connectivity of G is defined as the smallest number of edges whose removal would disconnect G.
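Returning to the matching construction for a moment: since every edge of the network has capacity 1, a maximum flow can be grown one unit-augmenting path at a time, which is exactly the classical augmenting-path matching algorithm. A compact Python sketch (our own illustration; the people, jobs, and skills table are invented, not those of Fig. 3.8.1):

```python
def max_bipartite_matching(can_do):
    """can_do[p] lists the jobs person p is able to do.  Each successful
    call to augment() corresponds to pushing one more unit of flow from
    s through the people/job network to t."""
    job_of = {}                      # job -> person currently assigned

    def augment(p, seen):
        for j in can_do.get(p, []):
            if j in seen:
                continue
            seen.add(j)
            # Job j is free, or its current holder can be moved elsewhere.
            if j not in job_of or augment(job_of[j], seen):
                job_of[j] = p
                return True
        return False

    matched = sum(augment(p, set()) for p in can_do)
    return matched, job_of

# Hypothetical skills table (an assumption made for the example).
skills = {'Al': ['cook'], 'Bo': ['cook', 'drive'], 'Cy': ['drive']}
print(max_bipartite_matching(skills)[0])
```

Here Al can only cook and Cy can only drive, so at most two of the three people can be employed, and the sketch finds a matching of that size.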

Certainly, if we remove all of the edges incident to a single vertex v, we will disconnect the graph. Hence the edge-connectivity cannot exceed the minimum degree of vertices in the graph. However, the edge-connectivity could be a lot smaller than the minimum degree, as the graph of Fig. 3.8.4 shows: there the minimum degree is large, but the removal of just one edge will disconnect the graph.

Fig. 3.8.4: Big degree, low connectivity

Finding the edge-connectivity is quite an important combinatorial problem, and it is by no means obvious that network flow methods can be used on it. But they can, and here is how.∗ Given G, a graph of V vertices, we solve not just one, but V − 1 network flow problems, one for each vertex j = 2, ..., V. Fix such a vertex j. Then consider vertex 1 of G to be the source and vertex j to be the sink of a network Xj. Replace each edge of G by two edges of Xj, one in each direction, each with capacity 1. Now solve the network flow problem in Xj, obtaining a maximum flow Q(j). Then the smallest of the numbers Q(j), for j = 2, ..., V, is the edge-connectivity of G. We will not prove this here.

∗ S. Even and R. E. Tarjan, Network flow and testing graph connectivity, SIAM J. Computing 4 (1975), 507-518.

As a final application of network flow we discuss the beautiful question of determining whether or not there is a matrix of 0's and 1's that has given row and column sums. Let there be given a row sum vector (r1, ..., rm) and a column sum vector (s1, ..., sn). We ask if there exists an m × n matrix A of 0's and 1's that has exactly ri 1's in the ith row, for each i = 1, ..., m, and exactly sj 1's in the jth column, for each j = 1, ..., n. For instance, is there a 6 × 8 matrix whose row sums are respectively (5, 4, 4, 3, 5, 6) and whose column sums are (3, 4, 4, 4, 3, …)? Of course the phrase 'row sums' means the same thing as 'number of 1's in each row,' since we have said that the entries are only 0 or 1.

The reader will no doubt have noticed that for such a matrix to exist it must surely be true that

r1 + · · · + rm = s1 + · · · + sn    (3.8.1)

since each side counts the total number of 1's in the matrix. Hence we will suppose that (3.8.1) is true.

Now we will construct a network Y of m + n + 2 vertices named s, x1, ..., xm, y1, ..., yn, and t. There is an edge of capacity ri drawn from the source s to vertex xi, for each i = 1, ..., m, and an edge of capacity sj drawn from vertex yj to the sink t, for each j = 1, ..., n. Further, there are mn edges of capacity 1, drawn from each vertex xi to each vertex yj. Next, find a maximum flow in this network. Then there is a 0-1 matrix with the given row and column sum vectors if and only if a maximum flow saturates every edge outbound from the source, that is, if and only if a maximum flow has value equal to the right or left side of equation (3.8.1). If such a flow exists, then a matrix A of the desired kind is constructed by putting ai,j equal to the flow in the edge from xi to yj.
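The construction just described is easy to test in code. The sketch below (our own illustration) builds the network for small row- and column-sum vectors, finds a maximum flow with a plain BFS augmenting-path (Edmonds-Karp) routine, and reads the matrix off the x → y edge flows.

```python
from collections import deque

def max_flow(cap, s, t):
    """Edmonds-Karp on a dict-of-dicts capacity table (modified in place
    to hold residual capacities); returns the value of a maximum flow."""
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap.get(u, {}).items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        path, v = [], t                      # recover the augmenting path
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        delta = min(cap[u][v] for u, v in path)
        for u, v in path:                    # update residual capacities
            cap[u][v] -= delta
            cap.setdefault(v, {})[u] = cap.get(v, {}).get(u, 0) + delta
        flow += delta

def zero_one_matrix(r, s):
    """Return a 0-1 matrix with row sums r and column sums s, or None."""
    m, n = len(r), len(s)
    cap = {'src': {('x', i): r[i] for i in range(m)}}
    for i in range(m):
        cap[('x', i)] = {('y', j): 1 for j in range(n)}
    for j in range(n):
        cap[('y', j)] = {'snk': s[j]}
    if max_flow(cap, 'src', 'snk') != sum(r):
        return None                          # source edges not saturated
    # residual capacity 0 on x->y means that unit was used: a[i][j] = 1
    return [[1 - cap[('x', i)][('y', j)] for j in range(n)]
            for i in range(m)]

A = zero_one_matrix([2, 1], [1, 1, 1])
print(A)
```

For row sums (2, 1) and column sums (1, 1, 1) a valid matrix exists and is returned; if the flow cannot saturate the source edges, `None` comes back.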

Exercises for section 3.8

1. Apply the max-flow min-cut theorem to the network that is constructed in order to solve the bipartite matching problem. Precisely what does a cut correspond to in this network? What does the theorem tell you about the matching problem?
2. Same as question 1 above, but applied to the question of discovering whether or not there is a 0-1 matrix with a certain given set of row and column sums.

Bibliography

The standard reference for the network flow problem and its variants is
L. R. Ford and D. R. Fulkerson, Flows in Networks, Princeton University Press, Princeton, NJ, 1962.
The algorithm, the example of irrational capacities and lack of convergence to maximum flow, and many applications are discussed there. The chronology of accelerated algorithms is based on the following papers. The first algorithms with a time bound independent of the edge capacities are in
J. Edmonds and R. M. Karp, Theoretical improvements in algorithmic efficiency for network flow problems, JACM 19, 2 (1972), 248-264.
E. A. Dinic, Algorithm for solution of a problem of maximal flow in a network with power estimation, Soviet Math. Dokl. 11 (1970), 1277-1280.
The paper of Dinic, above, also originated the idea of a layered network. Further accelerations of the network flow algorithms are found in the following.
A. V. Karzanov, Determining the maximal flow in a network by the method of preflows, Soviet Math. Dokl. 15 (1974), 434-437.
B. V. Cherkassky, Algorithm of construction of maximal flow in networks with complexity of O(V²√E) operations, Akad. Nauk. USSR, Mathematical methods for the solution of economical problems 7 (1977), 117-126.
The MPM algorithm, discussed in the text, is due to
V. M. Malhotra, M. Pramodh-Kumar and S. N. Maheshwari, An O(V³) algorithm for finding maximum flows in networks, Information Processing Letters 7 (1978), 277-278.
Later algorithms depend on refined data structures that save fragments of partially constructed augmenting paths. These developments were initiated in
Z. Galil, A new algorithm for the maximal flow problem, Proc. 19th IEEE Symposium on the Foundations of Computer Science, Ann Arbor, October 1978, 231-245.
Andrew V. Goldberg and Robert E. Tarjan, A new approach to the maximum flow problem, 1985.
A number of examples that show that the theoretical complexity estimates for the various algorithms cannot be improved are contained in
Z. Galil, On the theoretical efficiency of various network flow algorithms, IBM report RC7320, September 1978.
The proof given in the text, of theorem 3.6.1, leans heavily on the one in
Shimon Even, Graph Algorithms, Computer Science Press, Potomac, MD, 1979.
If edge capacities are all 0's and 1's, as in matching problems, then still faster algorithms can be given, as in
S. Even and R. E. Tarjan, Network flow and testing graph connectivity, SIAM J. Computing 4 (1975), 507-518.
If every pair of vertices is to act, in turn, as source and sink, then considerable economies can be realized, as in
R. E. Gomory and T. C. Hu, Multiterminal network flows, SIAM Journal 9 (1961), 551-570.
Matching in general graphs is much harder than in bipartite graphs. The pioneering work is due to
J. Edmonds, Paths, trees, and flowers, Canadian J. Math. 17 (1965), 449-467.


Chapter 4: Algorithms in the Theory of Numbers

Number theory is the study of the properties of the positive integers. It is one of the oldest branches of mathematics, and one of the purest, so to speak. It has immense vitality, however, and we will see in this chapter and the next that parts of number theory are extremely relevant to current research in algorithms. Part of the reason for this is that number theory enters into the analysis of algorithms, but that isn’t the whole story. Part of the reason is that many famous problems of number theory, when viewed from an algorithmic viewpoint (like, how do you decide whether or not a positive integer n is prime?) present extremely deep and attractive unsolved algorithmic problems. At least, they are unsolved if we regard the question as not just how to do these problems computationally, but how to do them as rapidly as possible. But that’s not the whole story either. There are close connections between algorithmic problems in the theory of numbers, and problems in other ﬁelds, seemingly far removed from number theory. There is a unity between these seemingly diverse problems that enhances the already considerable beauty of any one of them. At least some of these connections will be apparent by the end of study of Chapter 5.

4.1 Preliminaries We collect in this section a number of facts about the theory of numbers, for later reference. If n and m are positive integers then to divide n by m is to ﬁnd an integer q ≥ 0 (the quotient) and an integer r ( the remainder) such that 0 ≤ r < m and n = qm + r. If r = 0, we say that ‘m divides n,’ or ‘m is a divisor of n,’ and we write m|n. In any case the remainder r is also called ‘n modulo m,’ and we write r = n mod m. Thus 4 = 11 mod 7, for instance. If n has no divisors other than m = n and m = 1, then n is prime, else n is composite. Every positive integer n can be factored into primes, uniquely apart from the order of the factors. Thus 120 = 23 · 3 · 5, and in general we will write

n = p1^{a1} p2^{a2} · · · pl^{al} = ∏_{i=1}^{l} pi^{ai}    (4.1.1)

We will refer to (4.1.1) as the canonical factorization of n. Many interesting and important properties of an integer n can be calculated from its canonical factorization. For instance, let d(n) be the number of divisors of the integer n. The divisors of 6 are 1, 2, 3, 6, so d(6) = 4. Can we ﬁnd a formula for d(n)? A small example may help to clarify the method. Since 120 = 23 · 3 · 5, a divisor of 120 must be of the form m = 2a 3b 5c, in which a can have the values 0,1,2,3, b can be 0 or 1, and c can be 0 or 1. Thus there are 4 choices for a, 2 for b and 2 for c, so there are 16 divisors of 120. In general, the integer n in (4.1.1) has exactly d(n) = (1 + a1 )(1 + a2 ) · · · (1 + al ) (4.1.2)

divisors. If m and n are nonnegative integers then their greatest common divisor, written gcd(n, m), is the integer g that (a) divides both m and n and (b) is divisible by every other common divisor of m and n. Thus gcd(12, 8) = 4, gcd(42, 33) = 3, etc. If gcd(n, m) = 1 then we say that n and m are relatively prime. Thus 27 and 125 are relatively prime (even though neither of them is prime). If n > 0 is given, then φ(n) will denote the number of positive integers m such that m ≤ n and gcd(n, m) = 1. Thus φ(6) = 2, because there are only two positive integers ≤ 6 that are relatively prime to 6 (namely 1 and 5). φ(n) is called the Euler φ-function, or the Euler totient function. Let’s ﬁnd a formula that expresses φ(n) in terms of the canonical factorization (4.1.1) of n. 81
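Before deriving the formula for φ(n), it may help to see the earlier definitions in executable form. The short Python sketch below (our own illustration) computes the canonical factorization (4.1.1) by trial division, and the divisor count via formula (4.1.2); it checks the examples d(120) = 16 and d(6) = 4 from the text.

```python
from functools import reduce

def factorize(n):
    """Canonical factorization (4.1.1), as a list of (prime, exponent)."""
    factors, p = [], 2
    while p * p <= n:
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        if e:
            factors.append((p, e))
        p += 1
    if n > 1:
        factors.append((n, 1))
    return factors

def d(n):
    """Number of divisors, via formula (4.1.2): (1+a1)(1+a2)...(1+al)."""
    return reduce(lambda acc, pe: acc * (pe[1] + 1), factorize(n), 1)

print(factorize(120))   # [(2, 3), (3, 1), (5, 1)]
print(d(120), d(6))     # 16 4
```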

We want to count the positive integers m for which m ≤ n, and m is not divisible by any of the primes pi that appear in (4.1.1). There are n possibilities for such an integer m. Of these we throw away n/p1 of them because they are divisible by p1. Then we discard n/p2 multiples of p2, etc. This leaves us with

n − n/p1 − n/p2 − · · · − n/pl    (4.1.3)

possible m's. But we have thrown away too much. An integer m that is a multiple of both p1 and p2 has been discarded at least twice. So let's correct these errors by adding

n/(p1 p2) + n/(p1 p3) + · · · + n/(p1 pl) + · · · + n/(pl−1 pl)

to (4.1.3). The reader will have noticed that we added back too much, because an integer that is divisible by p1 p2 p3, for instance, would have been re-entered at least twice. The 'bottom line' of counting too much, then too little, then too much, etc., is the messy formula

φ(n) = n − n/p1 − n/p2 − · · · − n/pl
         + n/(p1 p2) + · · · + n/(pl−1 pl)
         − n/(p1 p2 p3) − · · · − n/(pl−2 pl−1 pl)
         + · · · + (−1)^l n/(p1 p2 · · · pl).    (4.1.4)

Fortunately (4.1.4) is identical with the much simpler expression φ(n) = n(1 − 1/p1 )(1 − 1/p2 ) · · · (1 − 1/pl ) (4.1.5)

which the reader can check by beginning with (4.1.5) and expanding the product. To calculate φ(120), for example, we ﬁrst ﬁnd the canonical factorization 120 = 23 · 3 · 5. Then we apply (4.1.5) to get φ(120) = 120(1 − 1/2)(1 − 1/3)(1 − 1/5) = 32. Thus, among the integers 1, 2, . . ., 120, there are exactly 32 that are relatively prime to 120.
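Formula (4.1.5) translates directly into code. The sketch below (our own illustration) uses exact rational arithmetic so that the product formula stays exact, and checks the values φ(120) = 32 and φ(6) = 2 computed in the text.

```python
from fractions import Fraction
from math import gcd

def phi(n):
    """Euler's totient via the product formula (4.1.5)."""
    result, m, p = Fraction(n), n, 2
    while p * p <= m:
        if m % p == 0:
            result *= 1 - Fraction(1, p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:                       # one prime factor may remain
        result *= 1 - Fraction(1, m)
    return int(result)

# sanity check against the definition of phi
assert phi(10) == sum(1 for k in range(1, 11) if gcd(10, k) == 1)

print(phi(120), phi(6))   # 32 2
```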

Exercises for section 4.1 1. Find a formula for the sum of the divisors of an integer n, expressed in terms of its prime divisors and their multiplicities. 2. How many positive integers are ≤ 1010 and have an odd number of divisors? Find a simple formula for the number of such integers that are ≤ n. 3. If φ(n) = 2 then what do you know about n? 4. For which n is φ(n) odd?

4.2 The greatest common divisor Let m and n be two positive integers. Suppose we divide n by m, to obtain a quotient q and a remainder r, with, of course, 0 ≤ r < m. Then we have n = qm + r. (4.2.1)

If g is some integer that divides both n and m then obviously g divides r also. Thus every common divisor of n and m is a common divisor of m and r. Conversely, if g is a common divisor of m and r then (4.2.1) shows that g divides n too. It follows that gcd(n, m) = gcd(m, r). If r = 0 then n = qm, and clearly, gcd(n, m) = m. 82

If we use the customary abbreviation 'n mod m' for r, the remainder in the division of n by m, then what we have shown is that gcd(n, m) = gcd(m, n mod m). This leads to the following recursive procedure for computing the g.c.d.:

function gcd(n, m);
{finds gcd of given nonnegative integers n and m}
if m = 0 then gcd := n else gcd := gcd(m, n mod m)
end.

The above is the famous 'Euclidean algorithm' for the g.c.d. It is one of the oldest algorithms known. The gcd program exhibits all of the symptoms of recursion. It begins with 'if trivialcase then do trivialthing' (m = 0), and this case is all-important because it's the only way the procedure can stop itself. It calls itself with smaller values of its variable list.

If, for example, we want the g.c.d. of 13 and 21, then we call the program with n = 13 and m = 21, and it then recursively calls itself with the following arguments:

(21, 13), (13, 8), (8, 5), (5, 3), (3, 2), (2, 1), (1, 0).    (4.2.2)

When it arrives at a call in which the 'm' is 0, then the 'n,' namely 1 in this case, is the desired g.c.d.

The reader is invited to write the Euclidean algorithm as a recursive program, and get it working on some computer. Use a recursive language, write the program more or less as above, and try it out with some large, healthy integers n and m.

To measure the running time of the algorithm we need first to choose a unit of cost or work. Let's agree that one unit of labor is the execution of a single 'a mod b' operation. An equivalent measure of cost would be the number of times the algorithm calls itself recursively. In the example (4.2.2) the cost was 7 units.

What is the input to the problem? The two integers n, m whose g.c.d. we want are the input, and the number of bits that are needed to input those two integers is Θ(log n) + Θ(log m), namely Θ(log mn). Hence c log mn is the length of the input bit string. Now let's see how long the algorithm might run with an input string of that length.∗

Lemma 4.2.1. If 1 ≤ b ≤ a then a mod b ≤ (a − 1)/2.

Proof: Clearly a mod b ≤ b − 1. Further,

a mod b = a − ⌊a/b⌋ b ≤ a − b.

Thus a mod b ≤ min(a − b, b − 1). Now we distinguish two cases. First suppose b ≤ (a + 1)/2. Then b − 1 ≤ a − b, and so

a mod b ≤ b − 1 ≤ (a + 1)/2 − 1 = (a − 1)/2

in this case. Next suppose b > (a + 1)/2. Then a − b ≤ b − 1 and

a mod b ≤ a − b < a − (a + 1)/2 = (a − 1)/2,

so the result holds in either case.

∗ In Historia Mathematica 21 (1994), 401-419, Jeffrey Shallit traces this analysis back to Pierre-Joseph-Étienne Finck, in 1841.
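Taking up the invitation above: a direct Python transcription (ours) of the recursive procedure, instrumented with an optional counter for the 'mod' operations that serve as the unit of cost. Run on the example of the text it reproduces the 7 units of labor counted in (4.2.2).

```python
def gcd(n, m, ops=None):
    """Euclid's algorithm; ops (if given) counts the 'mod' operations."""
    if m == 0:
        return n
    if ops is not None:
        ops[0] += 1
    return gcd(m, n % m, ops)

ops = [0]
g = gcd(13, 21, ops)
print(g, ops[0])   # 1 7
```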
It calls itself with smaller values of its variable list.d. What is the input to the problem? The two integers n. The above is the famous ‘Euclidean algorithm’ for the g. for example. Next.c.2. is the desired g. (1.2) When it arrives at a call in which the ‘m’ is 0. and it then recursively calls itself with the following arguments: (21. 0) (4. we want the g.d.1. ∗ In Historia Mathematica 21 (1994). a+1 a−1 = 2 2 83 .d. The reader is invited to write the Euclidean algorithm as a recursive program. and this case is all-important because it’s the only way the procedure can stop itself.

Theorem 4.2.1. (A worst-case complexity bound for the Euclidean algorithm) Given two positive integers a, b. The Euclidean algorithm will find their greatest common divisor after a cost of at most 2 log2 M + 1 integer divisions, where M = max(a, b).

Before we prove the theorem, let's return to the example (a, b) = (13, 21) of the display (4.2.2). In that case M = 21 and 2 log2 M + 1 = 9.78..., so the theorem asserts that the g.c.d. will be found after at most 9 operations. In fact it was found after 7 operations in that case.

The upper bound in the statement of theorem 4.2.1 can be visualized as follows. The number log2 M is almost exactly the number of bits in the binary representation of M (what is 'exactly' that number of bits?). Theorem 4.2.1 therefore asserts that we can find the g.c.d. of two integers in a number of operations that is at most a linear function of the number of bits that it takes to represent the two numbers. In brief, we might say that 'Time = O(bits),' in the case of Euclid's algorithm.

Proof of theorem: Suppose first that a ≥ b. The algorithm generates a sequence a0, a1, ..., where a0 = a, a1 = b, and

aj+1 = aj−1 mod aj    (j ≥ 1).

By lemma 4.2.1,

aj+1 ≤ (aj−1 − 1)/2 ≤ aj−1/2    (j ≥ 1).

Then, by induction on j, it follows that

a2j ≤ a0/2^j,    a2j+1 ≤ a1/2^j    (j ≥ 0)

and so

ar ≤ 2^{−⌊r/2⌋} M    (r = 0, 1, 2, ...).

Obviously the algorithm has terminated if ar < 1, and this will have happened when r is large enough so that 2^{−⌊r/2⌋} M < 1, that is, if r > 2 log2 M. If a < b, then after 1 operation we will be in the case 'a ≥ b' that we have just discussed, and the proof is complete.

Exercises for section 4.2

1. Write a nonrecursive program, in Basic or Fortran, for the g.c.d. of two given integers n, m. Write a recursive program, in Pascal or a recursive language of your choice, for the g.c.d. Compare the execution times of the two programs.
2. Show that if m and n have a total of B bits, then Euclid's algorithm will not need more than 2B + 3 operations before reaching termination.
3. Find out when Euclid lived, and with exactly what words he described his algorithm.
4. Write a program that will light up a pixel in row m and column n of your CRT display if and only if gcd(m, n) = 1. Run the program with enough values of m and n to fill your screen. If you see any interesting visual patterns, try to explain them mathematically.
5. Choose 1000 pairs of integers (n, m), at random between 1 and 1000. For each pair, compute the g.c.d. using a recursive program and a nonrecursive program. (a) Compare the execution times of the two programs. (b) There is a theorem to the effect that the probability that two random integers have g.c.d. = 1 is 6/π². What, precisely, do you think that this theorem means by 'the probability that ...'? What percentage of the 1000 pairs that you chose had g.c.d. = 1? Compare your observed percentage with 100 · (6/π²).

6. Suppose we have two positive integers m, n, and we have factored them completely into primes, in the form

    m = ∏ p_i^{a_i},      n = ∏ q_i^{b_i}.

How would you calculate gcd(m, n) from the above information? How would you calculate the least common multiple (lcm) of m and n from the above information? Prove that gcd(m, n) = mn/lcm(m, n).

7. Let F_n be the nth Fibonacci number. How many operations will be needed to compute gcd(F_n, F_{n−1}) by the Euclidean algorithm? What is gcd(F_n, F_{n−1})?

8. Calculate gcd(102131, 56129) in two ways: use the method of exercise 6 above, then use the Euclidean algorithm. In each case count the total number of arithmetic operations that you had to do to get the answer.

4.3 The extended Euclidean algorithm

Again suppose n, m are two positive integers whose g.c.d. is g. Then we can always write g in the form

    g = tn + um      (4.3.1)

where t and u are integers. For instance, gcd(14, 11) = 1, so we can write 1 = 14t + 11u for integers t, u. Can you spot integers t, u that will work? One pair that does the job is (4, −5), and there are others (can you find all of them?).

The extended Euclidean algorithm finds not only the g.c.d. of n and m, it also finds a pair of integers t, u that satisfy (4.3.1). One 'application' of the extended algorithm is that we will obtain an inductive proof of the existence of t, u, a fact that is not immediately obvious from (4.3.1) (see exercise 1 below). While this hardly rates as a 'practical' application, it represents a very important feature of recursive algorithms. We might say that the following items go hand-in-hand:

    Recursive algorithms
    Inductive proofs
    Complexity analyses by recurrence formulas

If we have a recursive algorithm, then it is natural to prove the validity of the algorithm by mathematical induction. Conversely, inductive proofs of theorems often (not always, alas!) yield recursive algorithms for the construction of the objects that are being studied. The complexity analysis of a recursive algorithm will use recurrence formulas, in a natural way. We saw that already in the analysis that proved theorem 4.2.1.

Now let's discuss the extended algorithm. Input to it will be two integers n and m. Output from it will be g = gcd(n, m) and two integers t and u for which (4.3.1) is true.

A single step of the original Euclidean algorithm took us from the problem of finding gcd(n, m) to gcd(m, n mod m). Suppose, inductively, that we not only know g = gcd(m, n mod m) but we also know the corresponding coefficients t', u' for the equation

    g = t'm + u'(n mod m).      (4.3.2)

Can we get out, at the next step, the coefficients t, u for equation (4.3.1)? Indeed we can. By substituting in (4.3.2) the fact that

    n mod m = n − ⌊n/m⌋m      (4.3.3)

we find that

    g = t'm + u'(n − ⌊n/m⌋m) = u'n + (t' − ⌊n/m⌋u')m.      (4.3.4)

Hence the rule by which t', u' for equation (4.3.2) transform into t, u for equation (4.3.1) is

    t = u';      u = t' − ⌊n/m⌋u'.      (4.3.5)
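The transformation rule (4.3.5) translates directly into a short recursive routine. Here is a sketch in Python (the name `ext_gcd` is mine); it returns g, t, u with g = t·n + u·m:

```python
def ext_gcd(n, m):
    """Return (g, t, u) with g = gcd(n, m) = t*n + u*m, using rule (4.3.5)."""
    if m == 0:
        return n, 1, 0                 # trivial case: gcd(n, 0) = n = 1*n + 0*m
    g, t1, u1 = ext_gcd(m, n % m)      # inductively, g = t1*m + u1*(n mod m)
    return g, u1, t1 - (n // m) * u1   # rule (4.3.5): t = u', u = t' - floor(n/m)*u'

g, t, u = ext_gcd(14, 11)
assert (g, t, u) == (1, 4, -5)         # the representation 1 = 4*14 - 5*11
```

Note how the inductive proof and the recursive program are literally the same object here: the recursion hypothesis of the proof is the recursive call of the program.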

d. and let g be their greatest common divisor. u that satisfy (4. u such that g = tm + un. 1) is now complete. Let m and n be given positive integers. Now that the (2.1)} if m = 0 then g := n.{gcdext} It is quite easy to use the algorithm above to make a proof of the main mathematical result of this section (see exercise 1). u).1) and (1. Let m and n be given integers.1. 2) has been in limbo until just this moment.1. If g > 1 then there exists no t such that tm ≡ 1 (mod n) since tm = 1 + rn implies that the g. It computes u := 1 − 11/3 (−1) = 4 t := −1. procedure gcdext(n. We ﬁnd u := (−1) − 14/11 4 = −5 t := 4. with (n.11) by (11. (2. so it sets g := 1. g. m. An immediate consequence of the algorithm and the theorem is the fact that ﬁnding inverses modulo a given integer is an easy computational problem. u := 0.c. t := 1. We will need to refer to that fact in the sequel.0). in linear time.3. so we state it as Corollary 4. u := t − n/m u. of n and m.3. m) = (1. and let g be their g. 3) has so far been languishing. Proof: By the extended Euclidean algorithm we can ﬁnd. When it executes with (n. 1). the original call to gcdext from the user. of m and n is 1. but its turn has come.c. m) = (11. t := s end. Then there exist integers t.3. The call to the routine with (n. The call with (n. 11). m) = (14. m) = (3.d.c. Now it can complete the execution of the call with (n. {computes g.d. u := 0 else gcdext(m. which has so far been pending. 11).2). If g = 1 then it is obvious that t is the inverse modn of m. 0) it encounters the ‘if m = 0’ statement. n mod m. g. and ﬁnds integers t. To do this it sets u := t − n/m u = 1 t := 0.2) call executes and ﬁnds u := 0 − 3/2 1 = 1 t := 1. 86 . But this last equation says that tm ≡ g (mod n). the (3. Then it calls itself successively with (3.3) and calls itself. integers t and u such that g = tm + un. m) = (14. The routine ﬁrst replaces (14. We will now trace the execution of gcdext if it is called with (n.1) call is ﬁnished. t. 
Then m has a multiplicative inverse modulo n if and only if g = 1. m) = (2. t := 1. which is Theorem 4. Finally. The call to the routine with (n.Chapter 4: Algorithms in the Theory of Numbers We can now formulate recursively the extended Euclidean algorithm. the inverse can be computed in polynomial time. m) = (2. can be processed. t. In that case. u). s := u.
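Corollary 4.3.1 in executable form: to invert m modulo n, run the extended algorithm and reduce the coefficient of m modulo n. A sketch in Python (`ext_gcd` and `inverse_mod` are my own names, not the book's):

```python
def ext_gcd(n, m):
    # recursive extended Euclidean algorithm; returns (g, t, u) with g = t*n + u*m
    if m == 0:
        return n, 1, 0
    g, t1, u1 = ext_gcd(m, n % m)
    return g, u1, t1 - (n // m) * u1

def inverse_mod(m, n):
    """Return the inverse of m modulo n, or None if gcd(m, n) > 1."""
    g, _, u = ext_gcd(n, m)          # g = t*n + u*m, so u*m ≡ g (mod n)
    return u % n if g == 1 else None

assert inverse_mod(11, 14) == 9      # 9*11 = 99 ≡ 1 (mod 14), i.e. u = -5 mod 14
assert inverse_mod(4, 6) is None     # gcd(4, 6) = 2 > 1: no inverse exists
```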

gcdext returns the values g = 1, t = 4, u = −5 to the user, and we see that the procedure has found the representation (4.3.1) in this case. The importance of the 'trivial case' where m = 0 is apparent.

Exercises for section 4.3

1. Give a complete formal proof of theorem 4.3.1. Your proof should be by induction (on what?) and should use the extended Euclidean algorithm.

2. Find integers t, u such that
(a) 1 = 4t + 7u
(b) 1 = 24t + 35u
(c) 5 = 65t + 100u

3. Find all solutions to exercises 2(a)-(c) above.

4. If r = ta + ub, where r, t, u, a, b are all integers, must r = gcd(a, b)? What, if anything, can be said about the relationship of r to gcd(a, b)?

5. Let (t_0, u_0) be one pair of integers t, u for which gcd(a, b) = ta + ub, a and b being given. Find all such pairs of integers.

6. Find the multiplicative inverse of 49 modulo 73, using the extended Euclidean algorithm.

7. Let a_1, ..., a_n be positive integers.
(a) How would you compute gcd(a_1, ..., a_n)?
(b) Prove that there exist integers t_1, ..., t_n such that gcd(a_1, ..., a_n) = t_1 a_1 + t_2 a_2 + ··· + t_n a_n.
(c) Give a recursive algorithm for the computation of t_1, ..., t_n in part (b) above.

8. If gcdext is called with (n, m) = (98, 30), draw a picture of the complete tree of calls that will occur during the recursive execution of the program. In your picture show, for each recursive call in the tree, the values of the input parameters to that call and the values of the output variables that were returned by that call.

4.4 Primality testing

In Chapter 1 we discussed the important distinction between algorithms that run in polynomial time vs. those that may require exponential time. Since then we have seen some fast algorithms and some slow ones. In the network flow problem the complexity of the MPM algorithm was O(V^3), a low power of the size of the input data string, and it is feasible to calculate maximum flows in networks with 1000 vertices or so. The same holds true for the various matching and connectivity problems that are special cases of the network flow algorithm. Likewise, the Fast Fourier Transform is really Fast: it needs only O(n log n) time to find the transform of a sequence of length n if n is a power of two, and only O(n^2) time in the worst case, so it is feasible to do a Fast Fourier Transform on, say, 1000 data points. In both of those problems we were dealing with computational situations near the low end of the complexity scale. On the other hand, the recursive computation of the chromatic polynomial in section 2.3 of Chapter 2 was an example of an algorithm that might use exponential amounts of time.

In this chapter we will meet another computational question for which, to date, no one has ever been able to provide a polynomial-time algorithm, nor has anyone been able to prove that such an algorithm does not exist. The problem is just this: given a positive integer n, is n prime?

The reader should now review the discussion in Example 3 of section 0.2. In that example we showed that the obvious methods of testing for primality are slow in the sense of complexity theory: we do an amount of work that is an exponentially growing function of the length of the input bit string if we use one of those methods.

It is important to notice that in order to prove that a number is not prime, it is certainly sufficient to find a nontrivial divisor of that number. It is not necessary to do that, however. All we are asking for is a 'yes' or 'no' answer to the question 'is n prime?'. In the test that follows, the decision about the compositeness of n will be reached without a knowledge of any of the factors of n. This is true of the Adleman, Pomerance, Rumely test also. If you should find it discouraging to get only the answer 'no' to the question 'Is 7122643698294074179 prime?,' without getting any of the factors of that number, then what you want is a fast algorithm for the factorization problem. Of course the factorization problem is at least as hard as finding out if an integer is prime, and so no polynomial-time algorithm is known for it either. So this problem, which seems like a 'pushover' at first glance, turns out to be extremely difficult, and the question of finding a factor of n is another interesting computational problem that is under active investigation. In practice, there are probabilistic algorithms for the factorization problem just as there are for primality testing, but in the case of the factorization problem even they don't run in polynomial-time. In section 4.9 we will discuss a probabilistic algorithm for factoring large integers, and, after some motivation in section 4.8, where we remark on the connection between computationally intractable problems and cryptography, we will describe one of the 'Public Key' data encryption systems whose usefulness stems directly from the difficulty of factoring large integers.

Although it is not known if a polynomial-time primality testing algorithm exists, remarkable progress on the problem has been made in recent years. One of the most important of these advances was made independently and almost simultaneously by Solovay and Strassen, and by Rabin, in 1976-7. These authors took the imaginative step of replacing 'certainly' by 'probably,' and they devised what should be called a probabilistic compositeness test (an integer is composite if it is not prime) for integers.

Here is how the test works. Given n, first choose a number b uniformly at random, 1 ≤ b ≤ n − 1. Next, subject the pair (b, n) to a certain test, called a pseudoprimality test, to be described below. The test has two possible outcomes: either the number n is correctly declared to be composite or the test is inconclusive.

Indeed the test 'Does b divide n?' already would perform the function stated above. However, it has a low probability of success even if n is composite, and if the answer is 'No,' we would have learned virtually nothing. If that were the whole story it would scarcely have been worth the telling. The additional property that the test described below has, not shared by the more naive test 'Does b divide n?,' is that if n is composite, the chance that the test will declare that result is at least 1/2.

Specifically, for a given n we would apply the test 100 times, using 100 numbers b_i that are independently chosen at random in [1, n − 1]. If n is composite, the probability that it will be declared composite at least once is at least 1 − 2^{−100}, and these are rather good odds. Each test would be done in quick polynomial time. If n is not found to be composite after 100 trials, and if certainty is important, then it would be worthwhile to subject n to one of the nonprobabilistic primality tests in order to dispel all doubt.

It remains to describe the test to which the pair (b, n) is subjected, and to prove that it detects compositeness with probability ≥ 1/2. Before doing this we mention another important development. A more recent primality test, due to Adleman, Pomerance and Rumely in 1983, is completely deterministic. That is, given n it will surely decide whether or not n is prime. The test is more elaborate than the one that we are about to describe, and it runs in tantalizingly close to polynomial time. In fact it was shown to run in time O((log n)^{c log log log n}) for a certain constant c. Since the number of bits of n is a constant multiple of log n, this latter estimate is of the form O((Bits)^{c log log Bits}). The exponent of 'Bits,' which would be constant in a polynomial time algorithm, in fact grows extremely slowly as n grows. This is what was referred to as 'tantalizingly close' to polynomial time.

Isn't it amazing that in this technologically enlightened age we still don't know how to find a divisor of a whole number quickly?

4.5 Interlude: the ring of integers modulo n

In this section we will look at the arithmetic structure of the integers modulo some fixed integer n. These results will be needed in the sequel, but they are also of interest in themselves and have numerous applications.

Consider the ring whose elements are 0, 1, ..., n − 1, and in which we do addition, subtraction, and multiplication modulo n. This ring is called Z_n. In Table 4.5.1 we show the addition and multiplication tables of Z_6.

     + | 0 1 2 3 4 5         * | 0 1 2 3 4 5
    ---+------------        ---+------------
     0 | 0 1 2 3 4 5         0 | 0 0 0 0 0 0
     1 | 1 2 3 4 5 0         1 | 0 1 2 3 4 5
     2 | 2 3 4 5 0 1         2 | 0 2 4 0 2 4
     3 | 3 4 5 0 1 2         3 | 0 3 0 3 0 3
     4 | 4 5 0 1 2 3         4 | 0 4 2 0 4 2
     5 | 5 0 1 2 3 4         5 | 0 5 4 3 2 1

    Table 4.5.1: Arithmetic in the ring Z_6

Notice that while Z_n is a ring, it certainly need not be a field, because there will usually be some noninvertible elements. Reference to Table 4.5.1 shows that 2, 3, 4 have no multiplicative inverses in Z_6, while 1, 5 do have such inverses. The difference, of course, stems from the fact that 1 and 5 are relatively prime to the modulus 6 while 2, 3, 4 are not. We learned, in corollary 4.3.1, that an element m of Z_n is invertible if and only if m and n are relatively prime.

The invertible elements of Z_n form a multiplicative group. We will call that group the group of units of Z_n and will denote it by U_n. It has exactly φ(n) elements, where φ is the Euler φ-function. The multiplication table of the group U_18 is shown in Table 4.5.2.

     *  |  1  5  7 11 13 17
    ----+------------------
     1  |  1  5  7 11 13 17
     5  |  5  7 17  1 11 13
     7  |  7 17 13  5  1 11
    11  | 11  1  5 13 17  7
    13  | 13 11  1 17  7  5
    17  | 17 13 11  7  5  1

    Table 4.5.2: Multiplication modulo 18

Notice that U_18 contains φ(18) = 6 elements. Since U_n is a group, each of them has an inverse, and each row (column) of the multiplication table contains a permutation of all of the group elements.

Let's look at the table a little more closely, with a view to finding out if the group U_18 is cyclic. In a cyclic group there is an element a whose powers 1, a, a^2, a^3, ... run through all of the elements of the group. If we refer to the table again, we see that in U_18 the successive powers of 5 are 5, 7, 17, 13, 11, 1, so the powers of 5 exhaust all group elements. The group U_18 is indeed cyclic, and 5 is a generator of U_18. Thus the order of the group element 5 is equal to the order of the group.
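The table inspection can be mechanized. This Python sketch (the names `units` and `powers` are mine) builds U_n as the residues prime to n and lists the successive powers of 5 in U_18:

```python
from math import gcd

def units(n):
    """The group U_n: the residues in [1, n-1] that are relatively prime to n."""
    return [x for x in range(1, n) if gcd(x, n) == 1]

def powers(a, n):
    """Successive powers a, a^2, a^3, ... mod n, stopping when 1 is reached."""
    out, x = [], a
    while True:
        out.append(x)
        if x == 1:
            return out
        x = (x * a) % n

assert units(18) == [1, 5, 7, 11, 13, 17]       # the phi(18) = 6 group elements
assert powers(5, 18) == [5, 7, 17, 13, 11, 1]   # 5 generates all of U_18
assert powers(7, 18) == [7, 13, 1]              # 7 has order 3: not a primitive root
```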

A number (like 5 in the example) whose powers run through all elements of U_n is called a primitive root modulo n. Thus 5 is a primitive root modulo 18. The reader should now find, from Table 4.5.2, all of the primitive roots modulo 18.

Since the order of a group element must always divide the order of the group, every element of U_n has an order that divides φ(n). The primitive roots, if they exist, are exactly the elements of maximum possible order φ(n). We pause to note two corollaries of these remarks, namely

Theorem 4.5.1 ('Fermat's theorem'). For every integer b that is relatively prime to n we have b^{φ(n)} ≡ 1 (mod n).

In particular, if n is a prime number then φ(n) = n − 1, and we have

Corollary 4.5.2 ('Fermat's little theorem'). If n is prime, then for all b ≢ 0 (mod n) we have b^{n−1} ≡ 1 (mod n).

It is important to know which groups U_n are cyclic, i.e., which integers n have primitive roots. The answer is given by

Theorem 4.5.3. An integer n has a primitive root if and only if n = 2 or n = 4 or n = p^a (p an odd prime) or n = 2p^a (p an odd prime). Hence, the groups U_n are cyclic for precisely such values of n.

The proof of theorem 4.5.3 is a little lengthy and is omitted. It can be found, for example, in the book of LeVeque that is cited at the end of this chapter. According to theorem 4.5.3, U_18 is cyclic, which we have already seen, and U_12 is not cyclic. In particular, if n is an odd prime, then U_n is cyclic, and the equation x^2 = 1 in U_n has only the solutions x = ±1.

Next we will discuss the fact that if the integer n can be factored in the form n = p_1^{a_1} p_2^{a_2} ··· p_r^{a_r}, then the full ring Z_n can also be factored, in a certain sense, as a 'product' of the rings Z_{p_i^{a_i}}.

Let's take Z_6 as an example. Since 6 = 2·3, we expect that somehow Z_6 = Z_2 × Z_3. What this means is that we consider ordered pairs (x_1, x_2), where x_1 ∈ Z_2 and x_2 ∈ Z_3. Therefore the 6 elements of Z_6 are (0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2).

Here is how we do the arithmetic with the ordered pairs. First,

    (x_1, x_2) + (y_1, y_2) = (x_1 + y_1, x_2 + y_2),

in which the two '+' signs on the right are different: the first, 'x_1 + y_1,' is done in Z_2, while the 'x_2 + y_2' is done in Z_3. A sample of the addition process is

    (0, 2) + (1, 1) = (0 + 1, 2 + 1) = (1, 0),

where the addition of the first components was done modulo 2 and of the second components was done modulo 3. Second,

    (x_1, x_2) · (y_1, y_2) = (x_1·y_1, x_2·y_2),

in which the two multiplications on the right side are different: the 'x_1·y_1' is done in Z_2 and the 'x_2·y_2' in Z_3. A sample of the multiplication process is

    (1, 2) · (1, 2) = (1·1, 2·2) = (1, 1),

in which multiplication of the first components was done modulo 2 and of the second components was done modulo 3, which the reader should check.

In full generality we can state the factorization of Z_n as follows.
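The componentwise arithmetic above is a three-line program. Here is a sketch in Python (the function names are mine) that checks both samples from the text and verifies that the map x ↦ (x mod 2, x mod 3) respects addition and multiplication:

```python
def to_pair(x):
    # the map Z_6 -> Z_2 x Z_3 described in the text
    return (x % 2, x % 3)

def add(p, q):
    # first components add mod 2, second components add mod 3
    return ((p[0] + q[0]) % 2, (p[1] + q[1]) % 3)

def mul(p, q):
    # first components multiply mod 2, second components multiply mod 3
    return ((p[0] * q[0]) % 2, (p[1] * q[1]) % 3)

# the samples from the text
assert add((0, 2), (1, 1)) == (1, 0)
assert mul((1, 2), (1, 2)) == (1, 1)

# the map is a bijection that respects + and *:
pairs = [to_pair(x) for x in range(6)]
assert len(set(pairs)) == 6
for x in range(6):
    for y in range(6):
        assert to_pair((x + y) % 6) == add(to_pair(x), to_pair(y))
        assert to_pair((x * y) % 6) == mul(to_pair(x), to_pair(y))
```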

Theorem 4.5.4. Let n = p_1^{a_1} p_2^{a_2} ··· p_r^{a_r}. The mapping which associates with each x ∈ Z_n the r-tuple (x_1, x_2, ..., x_r), where x_i = x mod p_i^{a_i} (i = 1, ..., r), is a ring isomorphism of Z_n with the ring of r-tuples (x_1, x_2, ..., x_r) in which

(a) x_i ∈ Z_{p_i^{a_i}} (i = 1, ..., r), and
(b) (x_1, ..., x_r) + (y_1, ..., y_r) = (x_1 + y_1, ..., x_r + y_r), and
(c) (x_1, ..., x_r) · (y_1, ..., y_r) = (x_1·y_1, ..., x_r·y_r), and
(d) in (b), the ith '+' sign on the right side is the addition operation of Z_{p_i^{a_i}}, and in (c) the ith '·' sign is the multiplication operation of Z_{p_i^{a_i}}, for each i = 1, 2, ..., r.

The factorization that is described in detail in theorem 4.5.4 will be written symbolically as

    Z_n ≅ ∏_{i=1}^{r} Z_{p_i^{a_i}}.      (4.5.2)

The proof of theorem 4.5.4 follows at once from the famous

Theorem 4.5.5 ('The Chinese Remainder Theorem'). Let m_i (i = 1, ..., r) be pairwise relatively prime positive integers, and let M = m_1 m_2 ··· m_r. Then the mapping that associates with each integer x (0 ≤ x ≤ M − 1) the r-tuple (b_1, b_2, ..., b_r), where b_i = x mod m_i (i = 1, ..., r), is a bijection between Z_M and Z_{m_1} × ··· × Z_{m_r}.

A good theorem deserves a good proof. An outstanding theorem deserves two proofs, at least: one existential, and one constructive. So here are one of each for the Chinese Remainder Theorem.

Proof 1: We must show that each r-tuple (b_1, ..., b_r) such that 0 ≤ b_i < m_i (i = 1, ..., r) occurs exactly once. There are obviously M such vectors, and so it will be sufficient to show that each of them occurs at most once as the image of some x. In the contrary case we would have x and x' both corresponding to (b_1, b_2, ..., b_r), say. But then x − x' ≡ 0 modulo each of the m_i. Hence x − x' is divisible by M = m_1 m_2 ··· m_r. But |x − x'| < M, hence x = x'.

Proof 2: Here's how to compute a number x that satisfies the simultaneous congruences x ≡ b_i (mod m_i) (i = 1, ..., r). First, by the extended Euclidean algorithm we can quickly find t_1, ..., t_r, u_1, ..., u_r such that

    t_j(M/m_j) + u_j m_j = 1      (j = 1, ..., r).

Then we claim that the number x = Σ_j b_j t_j (M/m_j) satisfies all of the given congruences. Indeed, for each k = 1, ..., r we have

    x = Σ_{j=1}^{r} b_j t_j (M/m_j) ≡ b_k t_k (M/m_k) (mod m_k) ≡ b_k (mod m_k),

where the first congruence holds because each M/m_j (j ≠ k) is divisible by m_k, and the second congruence follows since t_k(M/m_k) = 1 − u_k m_k ≡ 1 (mod m_k), completing the second proof of the Chinese Remainder Theorem.

Now the proof of theorem 4.5.4 follows easily, and is left as an exercise for the reader.

The factorization (4.5.2) of the ring Z_n induces a factorization

    U_n ≅ ∏_{i=1}^{r} U_{p_i^{a_i}}      (4.5.3)
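The constructive Proof 2 is itself an algorithm. The sketch below (Python; `ext_gcd` and `crt` are my names) computes x = Σ b_j t_j (M/m_j) exactly as in the proof, with t_j found by the extended Euclidean algorithm:

```python
def ext_gcd(n, m):
    # returns (g, t, u) with g = gcd(n, m) = t*n + u*m
    if m == 0:
        return n, 1, 0
    g, t1, u1 = ext_gcd(m, n % m)
    return g, u1, t1 - (n // m) * u1

def crt(bs, ms):
    """Solve x ≡ b_i (mod m_i) for pairwise relatively prime moduli,
    by the construction in Proof 2: x = sum of b_j * t_j * (M/m_j) mod M."""
    M = 1
    for m in ms:
        M *= m
    x = 0
    for b, m in zip(bs, ms):
        _, t, _ = ext_gcd(M // m, m)   # t*(M/m_j) + u*m_j = 1
        x += b * t * (M // m)
    return x % M

x = crt([2, 3, 2], [3, 5, 7])          # sample system, moduli pairwise coprime
assert x == 23                         # 23 ≡ 2 (mod 3), ≡ 3 (mod 5), ≡ 2 (mod 7)
```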

of the group of units. In more detail, if n = 12 we find

    U_12 ≅ U_4 × U_3,

where U_4 = {1, 3} and U_3 = {1, 2}. So U_12 can be thought of as the set {(1, 1), (1, 2), (3, 1), (3, 2)}, together with the componentwise multiplication operation described above.

Exercises for section 4.5

1. Give a complete proof of theorem 4.5.4.
2. Write out the complete proof of the 'immediate' corollary 4.5.2.
3. Use the method in the second proof of the remainder theorem 4.5.5 to find a number x that is congruent to 1, 7 and 11 to the respective moduli 5, 11 and 17.
4. Find all primitive roots modulo 18.
5. Which elements of Z_11 are squares?
6. Which elements of Z_13 are squares?
7. Find all x ∈ U_15 such that x^2 = 1.
8. Find all x ∈ U_27 such that x^2 = 1.
9. Write out the multiplication table of the group U_27, and find all primitive roots modulo 27.
10. Prove that if there is a primitive root modulo n then the equation x^2 = 1 in the group U_n has only the solutions x = ±1.

4.6 Pseudoprimality tests

In this section we will discuss various tests that might be used for testing the compositeness of integers probabilistically. By a pseudoprimality test we mean a test that is applied to a pair (b, n) of integers, and that has the following characteristics:

(a) The possible outcomes of the test are 'n is composite' or 'inconclusive.'
(b) If the test reports 'n is composite' then n is composite.
(c) The test runs in a time that is polynomial in log n.

If the test result is 'inconclusive' then we say that n is pseudoprime to the base b (which means that n is so far acting like a prime number, as far as we can tell).

The outcome of the test of the primality of n depends on the base b that is chosen. In a good pseudoprimality test there will be many bases b that will give the correct answer; more precisely, a good pseudoprimality test will, with high probability (i.e., for a large number of choices of the base b), declare that a composite n is composite. We will say that a pseudoprimality test is 'good' if there is a fixed positive number t such that every composite integer n is declared to be composite for at least tn choices of the base b in the interval 1 ≤ b ≤ n.

Of course, it is silly to say that 'there is a high probability that n is prime.' Either n is prime or it isn't, and we should not blame our ignorance on n itself. Nonetheless, the abuse of language is sufficiently appealing that we will define the problem away: we will say that a given integer n is very probably prime if we have subjected it to a good pseudoprimality test with a large number of different bases b, and have found that it is pseudoprime to all of those bases.

Here are four examples of pseudoprimality tests, only one of which is 'good.'

Test 1. Given (b, n): output 'n is composite' if b divides n, where b is not 1 or n; else output 'inconclusive.'

This isn't the good one. If n is composite, the probability that it will be so declared is the probability that we happen to have found a b that divides n. The probability of this event, if b is chosen uniformly at random from [1, n], is

    p_1 = (d(n) − 2)/n,

where d(n) is the number of divisors of n. Certainly p_1 is not bounded from below by a positive constant t.
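To see concretely how weak Test 1 is, here is a small Python sketch (the names `divisor_count` and `p1` are mine) that evaluates p_1 = (d(n) − 2)/n for a few composite n:

```python
def divisor_count(n):
    """d(n): the number of divisors of n (naive count, fine for small n)."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)

def p1(n):
    # probability that a uniformly random b in [1, n] is a nontrivial divisor
    return (divisor_count(n) - 2) / n

# even for quite composite n the probability of success is tiny:
assert p1(169) == 1 / 169          # divisors of 169 are 1, 13, 169; only 13 helps
assert p1(2 * 3 * 5 * 7) < 0.07    # 210 has 16 divisors, so p1 = 14/210
```

As n grows, d(n) grows far more slowly than n, so p_1 tends to 0; this is exactly why no constant t can bound it from below.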

Test 2. Given (b, n): output 'n is composite' if gcd(b, n) ≠ 1, else output 'inconclusive.'

This one is a little better. If n is composite, the number of bases b ≤ n for which Test 2 will produce the result 'composite' is n − φ(n), where φ is the Euler totient function. This number of useful bases will be large if n has some small prime factors, but in that case it's easy to find out that n is composite by other methods. If n has only a few large prime factors, say if n = p^2, then the proportion of useful bases is very small, and we have the same kind of inefficiency as in Test 1 above. So the test is still not 'good.'

Now we can state the third pseudoprimality test.

Test 3. Given (b, n): (if b and n are not relatively prime or) if b^{n−1} ≢ 1 (mod n) then output 'n is composite,' else output 'inconclusive.'

Regrettably, the test is still not 'good.' To cite an extreme case of its un-goodness: there exist composite numbers n, called Carmichael numbers, with the property that the pair (b, n) produces the output 'inconclusive' for every integer b in [1, n − 1] that is relatively prime to n. An example of such a number is n = 1729, which is composite (1729 = 7 · 13 · 19), but for which Test 3 gives the result 'inconclusive' on every integer b < 1729 that is relatively prime to 1729 (i.e., that is not divisible by 7 or 13 or 19).

Despite such misbehavior, the test usually seems to perform quite well. When n = 169 (a difficult integer for tests 1 and 2) it turns out that there are 158 different b's in [1, 168] that produce the 'composite' outcome from Test 3, namely every such b except for 19, 22, 23, 70, 80, 89, 99, 146, 147, 150, 168.

Finally, we will describe a good pseudoprimality test. The familial resemblance to Test 3 will be apparent.

Test 4 (the strong pseudoprimality test). Given (b, n): let n − 1 = 2^q·m, where m is an odd integer. If either

(a) b^m ≡ 1 (mod n), or
(b) there is an integer i in [0, q − 1] such that b^{m·2^i} ≡ −1 (mod n),

then return 'inconclusive,' else return 'n is composite.'

First we validate the test by proving the

Proposition. If the test returns the message 'n is composite,' then n is composite.

Proof: Suppose not. Then n is an odd prime. We claim that then b^{m·2^i} ≡ 1 (mod n) for all i = q, q − 1, ..., 1, 0. If so, then the case i = 0 will contradict the outcome of the test, and the proof will be complete. To establish the claim: it is clearly true when i = q, by Fermat's theorem. If true for i, then it is true for i − 1 also, because

    (b^{m·2^{i−1}})^2 = b^{m·2^i} ≡ 1 (mod n)

implies that the quantity being squared is +1 or −1. Indeed, since n is an odd prime, U_n is cyclic (theorem 4.5.3), and so the equation x^2 = 1 in U_n has only the solutions x = ±1. But −1 is ruled out by the outcome of the test, and the proof of the claim is complete.

What is the computational complexity of the test? Consider first the computational problem of raising a number to a power. We can calculate, for example, b^m mod n with O(log m) integer multiplications: we compute b, b^2, b^4, b^8, ... by successive squaring, reducing modulo n immediately after each squaring operation. Then we use the binary expansion of the exponent m to tell us which of these powers of b we should multiply together in order to compute b^m. For instance,

    b^337 = b^256 · b^64 · b^16 · b.
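Test 4 is short to implement. The following Python sketch (the function name `strong_test` is mine) follows the statement above, using Python's built-in three-argument `pow` for modular exponentiation; the claim that 19 is a base that fools n = 169 can be checked directly:

```python
def strong_test(b, n):
    """Test 4: returns 'n is composite' or 'inconclusive' for odd n > 2."""
    q, m = 0, n - 1
    while m % 2 == 0:              # write n - 1 = 2^q * m with m odd
        q, m = q + 1, m // 2
    x = pow(b, m, n)
    if x == 1:                     # case (a): b^m ≡ 1 (mod n)
        return 'inconclusive'
    for _ in range(q):             # case (b): b^(m*2^i) ≡ -1 for some 0 <= i <= q-1
        if x == n - 1:
            return 'inconclusive'
        x = (x * x) % n
    return 'n is composite'

assert strong_test(2, 169) == 'n is composite'
assert strong_test(19, 169) == 'inconclusive'   # one of the 11 bases that fool 169
assert strong_test(3, 7) == 'inconclusive'      # primes are never declared composite
```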

The complete power algorithm is recursive and looks like this:

    function power(b, m, n);
    {returns b^m mod n}
    if m = 0
      then power := 1
      else
        t := sqr(power(b, ⌊m/2⌋, n));
        if m is odd then t := t·b;
        power := t mod n
    end.{power}

Hence part (a) of the strong pseudoprimality test can be done in O(log m) = O(log n) multiplications of integers of at most O(log n) bits each. Similarly, in part (b) of the test there are O(log n) possible values of i to check, and for each of them we do a single multiplication of two integers each of which has O(log n) bits (this argument, of course, applies to Test 3 above also). If we were to use the most obvious way to multiply two B-bit numbers we would do O(B^2) bit operations, and then the above test would take O((log n)^3) time. The entire test requires, therefore, some low power of log n bit operations. This is a polynomial in the number of bits of input.

In the next section we are going to prove that Test 4 is a good pseudoprimality test in that if n is composite then at least half of the integers b, 1 ≤ b ≤ n − 1, will give the result 'n is composite.' For instance, if n = 169, then it turns out that for 157 of the possible 168 bases b in [1, 168], Test 4 will reply '169 is composite.' The only bases b that 169 can fool are 19, 22, 23, 70, 80, 89, 99, 146, 147, 150, 168. For this case of n = 169 the performances of Test 4 and of Test 3 are identical. However, there are no analogues of the Carmichael numbers for Test 4.

Exercises for section 4.6

1. In a group G suppose f_m and g_m are, respectively, the number of elements of order m and the number of solutions of the equation x^m = 1, for each m = 1, 2, .... What is the relationship between these two sequences? That is, how would you compute the g's from the f's? The f's from the g's? If you have never seen a question of this kind, find 'Möbius inversion' (look in any book on the theory of numbers) and apply it to this problem.

2. Let H be a cyclic group of order n. How many elements of each order r are there in H (r divides n)?

3. Show that if n = p^a, where p is an odd prime, and r divides φ(n), then the number of x ∈ U_n such that x has exact order r is φ(r). In particular, the number of primitive roots modulo n is φ(φ(n)).

4. Show that if n = p_1^{a_1} ··· p_m^{a_m}, then the number of x ∈ U_n such that x^r ≡ 1 (mod n) is ∏_{i=1}^{m} gcd(φ(p_i^{a_i}), r).

5. Given an odd integer n, let T(n) be the set of all b ∈ [1, n] such that gcd(b, n) = 1 and b^{n−1} ≡ 1 (mod n). Show that |T(n)| divides φ(n).

4.7 Proof of goodness of the strong pseudoprimality test

In this section we will show that if n is composite, then at least half of the integers b in [1, n − 1] will yield the result 'n is composite' in the strong pseudoprimality test. The basic idea of the proof is that a subgroup of a group that is not the entire group can consist of at most half of the elements of that group. Suppose n has the factorization n = p_1^{a_1} ··· p_s^{a_s}, and let n_i = p_i^{a_i} (i = 1, ..., s).
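The recursive power routine transcribes directly into Python (the name `power` follows the pseudocode; the reduction mod n happens once per level of the recursion):

```python
def power(b, m, n):
    """Return b^m mod n by recursive halving of the exponent."""
    if m == 0:
        return 1
    t = power(b, m // 2, n) ** 2   # square the half-power ...
    if m % 2 == 1:                 # ... and multiply in one b when m is odd
        t = t * b
    return t % n

# b^337 is assembled from b^256, b^64, b^16, b^1, i.e. O(log m) multiplications
assert power(7, 337, 1000) == pow(7, 337, 1000)
assert power(2, 10, 1000) == 24    # 1024 mod 1000
```

In production one would reduce mod n before the squaring as well, as the text recommends, to keep every intermediate result below n^2; the sketch above keeps the structure of the pseudocode instead.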

Lemma 4.7.1. The order of each element of U_n is a divisor of e* = lcm{φ(n_i), i = 1, ..., s}.

Proof: From the product representation (4.5.3) of U_n we find that an element x of U_n can be regarded as an s-tuple of elements from the cyclic groups U_{n_i} (i = 1, ..., s). The order of x is equal to the lcm of the orders of the elements of the s-tuple. But for each i = 1, ..., s the order of the ith of those elements is a divisor of φ(n_i), and therefore the order of x divides the lcm shown above.

For each element u of U_n let C(u) = {1, u, u^2, ..., u^{e−1}} denote the cyclic group that u generates, e being the order of u.

Lemma 4.7.2. Let B' be the set of all elements u of U_n for which C(u) either contains −1 or has odd order (e odd). If B' generates the full group U_n, then n is a prime power.

Proof: Let e* = 2^t·m, where m is odd and e* is as shown in lemma 4.7.1. Since φ(n) is an even number for all n > 2 (proof?), the number e* is even. Hence t > 0, and we can define a mapping ψ of the group U_n to itself by

    ψ(x) = x^{2^{t−1}·m}      (x ∈ U_n)

(note that ψ(x) is its own inverse, since ψ(x)^2 = x^{2^t·m} = 1, by lemma 4.7.1). This is in fact a group homomorphism: for all x, y ∈ U_n we have ψ(xy) = ψ(x)ψ(y). For each x ∈ B', ψ(x) is an element of C(x) whose square is 1, because ψ(x)^2 = ψ(x^2) = 1; hence ψ(x) has order 1 or 2. If the cyclic group C(x) is of odd order then it contains no element of even order, so ψ(x) = 1 in this case. In the other case C(x) contains −1; since a cyclic group can contain no other element of order 2, ψ(x) = ±1 in this case. Hence for every x ∈ B', ψ(x) = ±1.

Suppose B' generates the full group U_n. Then, since ψ is a homomorphism, not only for every x ∈ B' but for every x ∈ U_n it is true that ψ(x) = ±1.

So we can suppose, on the contrary, that n is divisible by more than one prime number, i.e., that s > 1 in the factorization (4.5.3) of U_n. Since 2^t divides e*, there is a j such that φ(n_j) is divisible by 2^t. We can suppose y to be an element of order exactly 2^t in U_{n_j}, since U_{n_j} is cyclic. Consider the element v of U_n which, when written out as an s-tuple according to that factorization, is of the form v = (1, 1, ..., y, ..., 1, 1), where the 'y' is in the jth component.

Consider ψ(v). Clearly ψ(v) is not 1, for otherwise the order of y, namely 2^t, would divide 2^{t−1}·m, which is impossible because m is odd. Also, ψ(v) is not −1, because the element −1 of U_n is represented uniquely by the s-tuple all of whose entries are −1. Thus ψ(v) is neither 1 nor −1 in U_n, which contradicts the italicized assertion above. Hence s = 1 and n is a prime power, completing the proof.

Now we can prove the main result of Solovay, Strassen and Rabin, which asserts that Test 4 is good.

Theorem 4.7.1. Let n > 1 be odd, and let B be the set of integers b mod n such that the pair (b, n) returns 'inconclusive' in Test 4. Then:
(a) If B generates U_n then n is prime.
(b) If n is composite then B consists of at most half of the integers in [1, n − 1].

Proof: Let B' be as in the statement of lemma 4.7.2. Suppose b ∈ B, and let m be the odd part of n − 1. Then either b^m ≡ 1 or b^{m·2^i} ≡ −1 (mod n) for some i ∈ [0, q − 1]. In the former case the cyclic subgroup C(b) has odd order, and in the latter case C(b) contains −1.

Proof: (a) Suppose B' generates the full group Un. By Lemma 4.7.2, B' ⊆ B, so B generates Un too, and by Lemma 4.7.1, n is a prime power, say n = p^k. There are primitive roots modulo n = p^k; let g be one of these. Now Un is cyclic of order φ(n) = φ(p^k) = p^(k-1)(p - 1).

In either of the two cases of Test 4 we have b^(n-1) ≡ 1, so the same holds for all b ∈ B', and since B' generates Un, for all x ∈ Un we have x^(n-1) ≡ 1, and in particular for x = g. The order of g is, on the one hand, p^(k-1)(p - 1), since the set of all of its powers is identical with Un, and on the other hand is a divisor of n - 1 = p^k - 1, since x^(n-1) ≡ 1 for all x. Hence p^(k-1)(p - 1) (which, if k > 1, is a multiple of p) divides p^k - 1 (which is one less than a multiple of p). Hence k = 1, and n is prime, which completes the proof of part (a).

(b) If n is composite then, by part (a), B' cannot generate all of Un. Hence B' generates a proper subgroup of Un, and so can contain at most half as many elements as Un contains, i.e., at most half of the integers in [1, n - 1], and the proof is complete.

Exercises for section 4.7

1. For n = 9 and for n = 15 find all of the cyclic groups C(u), and find the set B.
2. For n = 9 and n = 15 find the set B'.

Now we'll summarize the way in which the primality test is used. Suppose there is given a large integer n, and we would like to determine if it is prime. We would do

    function testn(n, outcome);
    times := 0;
    repeat
        choose an integer b uniformly at random in [2, n - 1];
        apply the strong pseudoprimality test (Test 4) to the pair (b, n);
        times := times + 1
    until {result is 'n is composite'} or {times = 100};
    if result is 'inconclusive' then outcome := 'n probably prime'
        else outcome := 'n is composite'
    end.{testn}

If the procedure exits with 'n is composite,' then we can be certain that n is not prime. If, on the other hand, the procedure halts because it has been through 100 trials without a conclusive result, then the integer n is very probably prime. More precisely, the chance that a composite integer n would have behaved like that is less than 2^(-100). Such an n is very probably prime, rather than having been proved so. If we want certainty, it will be necessary to apply a test whose outcome will prove primality, such as the algorithm of Adleman, Rumely and Pomerance referred to earlier. If we want to see the factors of n then it will be necessary to use some factorization algorithm; in section 4.9 we will discuss a probabilistic one. Before doing so, in the next section we will present a remarkable application of the complexity of the factoring problem, to cryptography.

Another application of the same circle of ideas to computer science occurs in the generation of random numbers on a computer. A good way to do this is to choose a primitive root modulo the word size of your computer, and then, each time the user asks for a random number, output the next higher power of the primitive root. The fact that you started with a primitive root insures that the number of 'random numbers' generated before repetition sets in will be as large as possible. Such applications remind us that primality and factorization algorithms have important applications beyond pure mathematics, in areas of vital public concern.

4.8 Factoring and cryptography

A computationally intractable problem can be used to create secure codes for the transmission of information over public channels of communication. A remarkable feature of a family of recently developed coding schemes, called 'Public Key Encryption Systems,' is that the 'key' to the code lies in the public domain, so it can be easily available to sender and receiver (and eavesdropper), and can be readily changed if need be.

The idea is that those who send the messages to each other will have extra pieces of information that will allow them to solve the intractable problem rapidly, whereas an aspiring eavesdropper would be faced with an exponential amount of computation. Since there are precious few provably hard problems, and hordes of apparently hard problems, like factoring large integers, it is scarcely surprising that a number of sophisticated coding schemes rest on the latter rather than the former. That is, the most widely used Public Key Systems lean on computational problems that are only presumed to be intractable, rather than having been proved so. One should remember, though, that an adversary might discover fast algorithms for doing these problems and keep that fact secret while deciphering all of our messages. Even if we don't have a provably computationally intractable problem, we can still take a chance that those who might intercept our messages won't know any polynomial-time algorithms if we don't know any.

We are going to discuss a Public Key System called the RSA scheme, after its inventors: Rivest, Shamir and Adleman. This particular method depends for its success on the seeming intractability of the problem of finding the factors of large integers. If that problem could be done in polynomial time, then the RSA system could be 'cracked.'

In this system there are three centers of information: the sender of the message, the receiver of the message, and the Public Domain (for instance, the 'Personals' ads of the New York Times). Here is how the system works.

(A) Who knows what and when

Here are the items of information that are involved, and who knows each item:

p, q: two large prime numbers, chosen by the receiver, and told to nobody else (not even to the sender!).

n: the product pq; this is placed in the Public Domain.

E: a random integer, placed in the Public Domain by the receiver, who has first made sure that E is relatively prime to (p - 1)(q - 1) by computing the g.c.d., and choosing a new E at random until the g.c.d. is 1. This is a fast calculation for the receiver, since p and q are known.

P: a message that the sender would like to send, thought of as a string of bits whose value, when regarded as a binary number, lies in the range [0, n - 1].

In addition to the above, one more item of information is computed by the receiver, and that is the integer D that is the multiplicative inverse mod (p - 1)(q - 1) of E, i.e.,

DE ≡ 1 (mod (p - 1)(q - 1)).

This is easy for the receiver to do because p and q are known to him. Again, calculation is fast.

To summarize:
    The receiver knows p, q, D.
    The sender knows P.
    Everybody knows n and E.

In Fig. 4.8.1 we show the interiors of the heads of the sender and receiver, as well as the contents of the Public Domain.
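The receiver's setup just described can be sketched as follows. This is a toy illustration with a function name and tiny primes of my own choosing; real choices of p and q are hundreds of bits long.

```python
from math import gcd
import random

def make_keys(p, q, seed=1):
    """Receiver's side of the setup: n = pq and E go into the Public
    Domain; D (the inverse of E mod (p-1)(q-1)) stays secret."""
    rng = random.Random(seed)        # seeded only to make the sketch repeatable
    n = p * q
    phi = (p - 1) * (q - 1)          # known only to the receiver
    while True:
        E = rng.randrange(2, phi)    # draw E at random ...
        if gcd(E, phi) == 1:         # ... until gcd(E, (p-1)(q-1)) = 1
            break
    D = pow(E, -1, phi)              # D*E is congruent to 1 mod (p-1)(q-1)
    return n, E, D
```

Both the g.c.d. check and the computation of D are fast (extended Euclidean algorithm), exactly as claimed in the text.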

Fig. 4.8.1: Who knows what

(B) How to send a message

The sender takes the message P, looks at the public keys E and n, computes C ≡ P^E (mod n), and transmits C over the public airwaves. Note that the sender has no private codebook or anything secret other than the message itself.

(C) How to decode a message

The receiver receives C, and computes C^D mod n. But DE ≡ 1 (mod (p - 1)(q - 1)), and observe that (p - 1)(q - 1) is φ(n). Hence we have

C^D ≡ P^(DE) = P^(1 + tφ(n)) ≡ P (mod n)    (t is some integer),

where the last congruence is by Fermat's theorem. The receiver has now recovered the original message P.

(D) How to intercept the message

An eavesdropper who receives the message C would be unable to decode it without (inventing some entirely new decoding scheme or) knowing the inverse D of E (mod (p - 1)(q - 1)). The eavesdropper, however, does not even know the modulus (p - 1)(q - 1), because p and q are unknown (only the receiver knows them), and knowing the product pq = n alone is insufficient. The eavesdropper is thereby compelled to derive a polynomial-time factoring algorithm for large integers. May success attend those efforts!

If the receiver suspects that the code has been broken, i.e., that the adversaries have discovered the primes p and q, then the receiver can change them without having to send any secret messages to anyone else. Only the public numbers n and E would change; the sender would not need to be informed of any other changes.

Before proceeding, the reader is urged to construct a little scenario. Make up a short (very short!) message. Choose values for the other parameters that are needed to complete the picture. Send the message as the sender would, and decode it as the receiver would. Then try to intercept the message, as an eavesdropper would, and see what the difficulties are.

The reader might well remark here that the receiver has a substantial computational problem in creating two large primes p and q. To a certain extent this is so, but two factors make the task a good deal easier. First, p and q will need to have only half as many bits as n has, so the job is of smaller size. Second, there are methods that will produce large prime numbers very rapidly, as long as one is not too particular about which primes they are, provided they are large enough.
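Steps (B) and (C) are each a single modular exponentiation. Here is a toy round trip; the parameters are small illustrative values of my own choosing, not from the text.

```python
def encode(P, E, n):
    """Sender's computation: C = P^E mod n, using only public information."""
    return pow(P, E, n)

def decode(C, D, n):
    """Receiver's computation: C^D = P^(DE) = P (mod n)."""
    return pow(C, D, n)

# Illustrative (completely insecure) parameters:
p, q = 61, 53
n = p * q                     # 3233, public
phi = (p - 1) * (q - 1)       # 3120 = phi(n), known only to the receiver
E = 17                        # public; gcd(17, 3120) = 1
D = pow(E, -1, phi)           # 2753, the receiver's secret inverse
C = encode(1234, E, n)        # transmitted over the public airwaves
```

The eavesdropper sees n, E and C, but recovering 1234 without D requires, in effect, factoring n.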

The elegance of the RSA cryptosystem prompts a few more remarks that are intended to reinforce the distinction between exponential- and polynomial-time complexities.

How hard is it to find the multiplicative inverse D of E, mod (p - 1)(q - 1)? If p and q are known then it's easy to find the inverse, in polynomial time. Finding an inverse mod n is no harder than carrying out the extended Euclidean algorithm, as we saw earlier; it's a linear-time job.

How hard is it to factor a large integer? At this writing, integers of up to perhaps a couple of hundred digits can be approached with some confidence that factorization will be accomplished within a few hours of the computing time of a very fast machine, and that's about the state of the art at present. If we think in terms of a message that is about the length of one typewritten page, then that message would contain about 8000 bits, equivalent to about 2400 decimal digits. This is in contrast to the largest feasible length that can be handled by contemporary factoring algorithms, of about 200 decimal digits. A one-page message is therefore well into the zone of computational intractability.

4.9 Factoring large integers

The problem of finding divisors of large integers is in a much more primitive condition than is primality testing. At this writing, we don't even know a probabilistic algorithm that will, with probability > 1/2, return a factor of a large composite integer in polynomial time. In this section we will discuss a probabilistic factoring algorithm that finds factors in an average time that is only moderately exponential.

Let n be an integer whose factorization is desired.

Definition. By a factor base B we will mean a set of distinct nonzero integers {b_0, b_1, ..., b_h}.

Definition. An integer a will be called a B-number if the integer c that is defined by the conditions (a) c ≡ a^2 (mod n) and (b) -n/2 ≤ c < n/2 can be written as a product of factors from the factor base B.

Let B be a factor base. If a is a B-number and we let e(a, i) denote the exponent of b_i in that product, then we have

a^2 ≡ ∏_{i=0}^{h} b_i^e(a,i) (mod n),

so for each B-number we get an (h + 1)-vector of exponents e(a).

Suppose we can find enough B-numbers so that the resulting collection of exponent vectors is a linearly dependent set, mod 2. For example, a set of h + 2 B-numbers would certainly have that property. Then we could nontrivially represent the zero vector as a sum of a certain set A of exponent vectors, say

∑_{a∈A} e(a) ≡ (0, 0, ..., 0) (mod 2).

Now define the integers

r_i = (1/2) ∑_{a∈A} e(a, i)    (i = 0, ..., h),
u = ∏_{a∈A} a (mod n),
v = ∏_{i=0}^{h} b_i^{r_i} (mod n).

It then would follow, after an easy calculation, that u^2 ≡ v^2 (mod n). It may be, of course, that u ≡ ±v (mod n), in which case we would have learned nothing.
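The B-number test and the gcd step can be sketched in Python (function names mine). The sketch replays the small example that is worked by hand below: n = 1729 with factor base {-2, 5}.

```python
from math import gcd

def b_number_exponents(a, n, base):
    """If a is a B-number for the factor base `base`, return its exponent
    vector e(a); otherwise return None.  The residue c = a^2 mod n is first
    shifted into the interval [-n/2, n/2)."""
    c = a * a % n
    if c > n // 2:
        c -= n                 # now -n/2 <= c < n/2
    if c == 0:
        return None
    e = []
    for b in base:
        k = 0
        while c % b == 0:      # exact division; works for negative b too
            c //= b
            k += 1
        e.append(k)
    return e if c == 1 else None

# Replaying the example from the text: n = 1729, base {-2, 5}
n, base = 1729, [-2, 5]
e186 = b_number_exponents(186, n, base)    # 186^2 = (-2)^4       (mod n)
e267 = b_number_exponents(267, n, base)    # 267^2 = (-2)^4 * 5^2 (mod n)
r = [(e186[i] + e267[i]) // 2 for i in range(len(base))]
u = 186 * 267 % n                          # u = 1250
v = base[0] ** r[0] * base[1] ** r[1]      # v = (-2)^4 * 5 = 80
g = gcd(u - v, n)                          # a nontrivial factor of 1729
```

Since u is not congruent to ±v here, gcd(u - v, n) yields a proper divisor, as the argument above promises.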

However, if neither u ≡ v (mod n) nor u ≡ -v (mod n) is true, then we will have found a nontrivial factor of n. For then n divides u^2 - v^2 = (u - v)(u + v) but divides neither factor, so either u - v or u + v has a factor in common with n, namely gcd(u - v, n) or gcd(u + v, n).

In Dixon's implementation the factor base that is used consists of -1 together with the first h prime numbers.

Example: Take as a factor base B = {-2, 5}, and let it be required to find a factor of n = 1729. Then we claim that 186 and 267 are B-numbers. To see that 186 is a B-number, note that 186^2 = 20·1729 + (-2)^4, and similarly, since 267^2 = 41·1729 + (-2)^4 5^2, we see that 267 is a B-number. The exponent vectors of 186 and 267 are (4, 0) and (4, 2) respectively, and these sum to (0, 0) (mod 2). Hence we find that

u = 186 × 267 ≡ 1250 (mod 1729)
r_1 = 4, r_2 = 1
v = (-2)^4 (5)^1 = 80
gcd(u - v, n) = gcd(1170, 1729) = 13,

and we have found the factor 13 of 1729.

There might have seemed to be some legerdemain involved in plucking the B-numbers 186 and 267 out of the air. In fact, as the algorithm has been implemented by its author, one simply chooses integers uniformly at random from [1, n - 1] until enough B-numbers have been found so that their exponent vectors are linearly dependent modulo 2. It can then be proved that if n is not a prime power then, with a correct choice of h relative to n, if we repeat the random choices until a factor of n is found, the average running time will be

exp{(2 + o(1)) (log n log log n)^(1/2)}.

This is not polynomial time, but it is moderately exponential only. Nevertheless, it is close to being about the best that we know how to do on the elusive problem of factoring a large integer.

4.10 Proving primality

In this section we will consider a problem that sounds a lot like primality testing, but is really a little different because the rules of the game are different. Basically the problem is to convince a skeptical audience that a certain integer is prime, requiring them to do only a small amount of computation in order to be so persuaded.

First, though, suppose you were writing a 100-decimal-digit integer n on the blackboard in front of a large audience and you wanted to prove to them that n was not a prime. If you simply wrote down two smaller integers whose product was n, the job would be done. Anyone who wished to be certain could spend a few minutes multiplying the factors together and verifying that their product was indeed n, and all doubts would be dispelled. A pair of integers r, s for which r ≠ 1, s ≠ 1, and n = rs constitute a certificate attesting to the compositeness of n. With this certificate C(n) and an auxiliary checking algorithm, viz.

(1) Verify that r ≠ 1 and that s ≠ 1;
(2) Verify that rs = n,

we can prove, in polynomial time, that n is not a prime number.

Indeed*, a speaker at a mathematical convention in 1903 announced the result that 2^67 - 1 is not a prime number, and to be utterly convincing all he had to do was to write

2^67 - 1 = 193707721 × 761838257287.

We note that the speaker probably had to work very hard to find those factors, but having found them it became quite easy to convince others of the truth of the claimed result.

* We follow the account given in V. Pratt, Every prime has a succinct certificate, SIAM J. Computing 4 (1975), 214-220.

Now comes the hard part. How might we convince an audience that a certain integer n is a prime number? The rules are that we are allowed to do any immense amount of calculation beforehand, and the results of that calculation can be written on a certificate C(n) that accompanies the integer n. The audience, however, will need to do only a polynomial amount of further computation in order to convince themselves that n is prime. The fact that primality can be quickly verified, if not quickly discovered, is of great importance for the developments of Chapter 5. In the language of section 5.1, what we are about to do is to show that the problem 'Is n prime?' belongs to the class NP.

We will describe a primality-checking algorithm A with the following properties:

(1) Inputs to A are the integer n and a certain certificate C(n).
(2) If n is prime then the action of A on the inputs (n, C(n)) results in the output 'n is prime.'
(3) If n is not prime then for every possible certificate C(n) the action of A on the inputs (n, C(n)) results in the output 'primality of n is not verified.'
(4) Algorithm A runs in polynomial time.

Now the question is, does such a procedure exist for primality verification? The answer is affirmative, and we will now describe one.

The next lemma is a kind of converse to 'Fermat's little theorem.'

Lemma 4.10.1. Let p be a positive integer. Suppose there is an integer x such that x^(p-1) ≡ 1 (mod p) and such that for all divisors d of p - 1, d < p - 1, we have x^d ≢ 1 (mod p). Then p is prime.

Proof: First we claim that gcd(x, p) = 1, for let g = gcd(x, p). Then x = gg', p = gg''. Since x^(p-1) ≡ 1 (mod p) we have x^(p-1) = 1 + tp, and so

x^(p-1) - tp = (gg')^(p-1) - tgg'' = 1.

The left side is a multiple of g; the right side is not, unless g = 1. It follows that x ∈ Up. Thus x is an element of order p - 1 in a group of order φ(p). Hence (p - 1) | φ(p). But always φ(p) ≤ p - 1. Hence φ(p) = p - 1 and p is prime.

Lemma 4.10.1 is the basis for V. Pratt's method of constructing certificates of primality. To verify that p is prime we could execute the following algorithm B:

(B1) Check that p - 1 = ∏ p_i^{a_i};
(B2) Check that each p_i is prime;
(B3) For each divisor d of p - 1, d < p - 1, check that x^d ≢ 1 (mod p);
(B4) Check that x^(p-1) ≡ 1 (mod p).

This algorithm B is correct, but it might not operate in polynomial time. In step B3 we are looking at every divisor of p - 1, and there may be a lot of them. Fortunately, it isn't necessary to check every divisor of p - 1. The reader will have no trouble proving that there is a divisor d of p - 1 (d < p - 1) for which x^d ≡ 1 (mod p) if and only if there is such a divisor that has the special form d = (p - 1)/p_i, which is always a divisor of p - 1.

The construction of the certificate is actually recursive, since step 3° below calls for certificates of smaller primes. We suppose that the certificate of the prime 2 is the trivial case. Here is a complete list of the information that is on the certificate C(p) that accompanies an integer p whose primality is to be attested to:

1°: a list of the primes p_i and the exponents a_i for the canonical factorization p - 1 = ∏_{i=1}^{r} p_i^{a_i};
2°: the certificates C(p_i) of each of the primes p_1, ..., p_r;
3°: a positive integer x.

The primality-checking algorithm A now reads as follows:

(A1) Check that p - 1 = ∏ p_i^{a_i};
(A2) Check that each p_i is prime, using the certificates C(p_i) (i = 1, ..., r);
(A3) For each i := 1 to r, check that x^((p-1)/p_i) ≢ 1 (mod p);
(A4) Check that x^(p-1) ≡ 1 (mod p).
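Algorithm A can be sketched compactly in Python. The nested encoding of the certificate below is my own, not Wilf's or Pratt's notation; steps A1-A4 are marked in the comments.

```python
def verify_certificate(p, cert):
    """Check a Pratt-style certificate that p is prime.
    cert is None for the prime 2 (the trivial case); otherwise it is a
    pair (x, factors) where factors maps each prime q dividing p-1 to a
    pair (a, cert_q): the exponent of q and q's own certificate."""
    if p == 2:
        return cert is None
    x, factors = cert
    # A1: the q^a must multiply out to exactly p - 1
    prod = 1
    for q, (a, _) in factors.items():
        prod *= q ** a
    if prod != p - 1:
        return False
    # A2: recursively check that each q is prime, via its certificate
    if not all(verify_certificate(q, c) for q, (_, c) in factors.items()):
        return False
    # A3: x^((p-1)/q) must not be 1 for any prime divisor q of p - 1
    if any(pow(x, (p - 1) // q, p) == 1 for q in factors):
        return False
    # A4: x^(p-1) must be 1, so x has order exactly p - 1 in U_p
    return pow(x, p - 1, p) == 1

# Example certificates: 3 - 1 = 2 with witness 2; 7 - 1 = 2 * 3 with witness 3.
cert3 = (2, {2: (1, None)})
cert7 = (3, {2: (1, None), 3: (1, cert3)})
```

By Lemma 4.10.1, a True answer really does prove primality; no certificate can make a composite p pass.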

Now let's look at the complexity of algorithm A. We will measure its complexity by the number of times that we have to do a computation of either of the types (a) 'is m = ∏ q_j^{b_j}?' or (b) 'is y^s ≡ 1 (mod p)?'. Let f(p) be that number. Then we have (remembering that the algorithm calls itself r times)

f(p) = 1 + ∑_{i=2}^{r} f(p_i) + r + 1,    (4.10.1)

in which the four terms correspond to the four steps in the checking algorithm. The sum begins with 'i = 2' because the prime 2, as written, is 'free.' Now (4.10.1) can be written as

g(p) = ∑_{i=2}^{r} g(p_i) + 4,    (4.10.2)

where g(p) = 1 + f(p). We claim that g(p) ≤ 4 log_2 p for all p. This is surely true if p = 2. If true for primes less than p, then from (4.10.2),

g(p) ≤ ∑_{i=2}^{r} {4 log_2 p_i} + 4
     = 4 log_2 {∏_{i=2}^{r} p_i} + 4
     ≤ 4 log_2 {(p - 1)/2} + 4
     = 4 log_2 (p - 1)
     ≤ 4 log_2 p.

Hence f(p) ≤ 4 log_2 p - 1 for all p ≥ 2. Since the number of bits in p is Θ(log p), the number f(p) is a number of executions of steps that is a polynomial in the length of the input bit string. We leave to the exercises the verification that each of the steps that f(p) counts is also executed in polynomial time, where time is now measured by the number of bit operations that are implied by the integer multiplications. So the entire primality-verification procedure operates in polynomial time. This yields

Theorem 4.10.1. (V. Pratt, 1975) There exist a checking algorithm and a certificate such that primality can be verified in polynomial time.

Exercises for section 4.10

1. Let p = 17. Find all integers x that satisfy the hypotheses of lemma 4.10.1.
2. Let p = 15. Show that there is no integer x as described in the hypotheses of lemma 4.10.1.
3. Same as exercise 2 above.
4. Write out the complete certificate that attests to the primality of 19.
5. Carry out the complete checking algorithm on the certificate that you prepared in exercise 4 above.
6. Show that two positive integers of b bits each can be multiplied with at most O(b^2) bit operations (multiplications and carries).
7. Prove that step A1 of algorithm A can be executed in polynomial time.
8. Find an upper bound for the total number of bits that are in the certificate of the integer p.

Bibliography

The material in this chapter has made extensive use of the excellent review article

John D. Dixon, Factorization and primality tests, The American Mathematical Monthly 91 (1984), 333-352.

A basic reference for number theory is

G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers, Oxford University Press, Oxford, 1954.

Another is

W. J. LeVeque, Fundamentals of Number Theory, Addison-Wesley, Reading, MA, 1977.

The probabilistic algorithm for compositeness testing was found by

R. Solovay and V. Strassen, A fast Monte Carlo test for primality, SIAM Journal of Computing 6 (1977), 84-85; erratum ibid. 7 (1978), 118,

and at about the same time by

M. O. Rabin, Probabilistic algorithms, in Algorithms and Complexity, New Directions and Recent Results, J. Traub ed., Academic Press, New York, 1976.

Some empirical properties of that algorithm are in

C. Pomerance, J. Selfridge and S. Wagstaff Jr., The pseudoprimes to 25·10^9, Mathematics of Computation 35 (1980), 1003-1026.

The fastest nonprobabilistic primality test appeared first in

L. Adleman, On distinguishing prime numbers from composite numbers, IEEE Abstracts, May 1980,

and in

L. M. Adleman, C. Pomerance and R. Rumely, On distinguishing prime numbers from composite numbers, Annals of Mathematics 117 (1983), 173-206.

A streamlined version of the above algorithm was given by

H. Cohen and H. Lenstra Jr., Primality testing and Jacobi sums, Report 82-18, Math. Inst., U. of Amsterdam, Amsterdam, 1982.

The probabilistic factoring algorithm in the text is that of John D. Dixon. A more complete account, together with the complexity analysis, is in

John D. Dixon, Asymptotically fast factorization of integers, Mathematics of Computation 36 (1981), 255-260.

The idea of public key data encryption is due to

W. Diffie and M. Hellman, New directions in cryptography, IEEE Transactions on Information Theory IT-22, 6 (1976), 644-654.

The use of factoring as the key to the code is due to

R. Rivest, A. Shamir and L. Adleman, A method for obtaining digital signatures and public key cryptosystems, Communications of the A.C.M. 21, 2 (February 1978), 120-126.

An account of the subject is contained in

M. Hellman, The mathematics of public key cryptography, Scientific American 241, 2 (August 1979), 146-157.

Chapter 5: NP-completeness

5.1 Introduction

In the previous chapter we met two computational problems for which fast algorithms have never been found, but neither have such algorithms been proved to be unattainable. Those were the primality-testing problem, for which the best-known algorithm is delicately poised on the brink of polynomial time, and the integer-factoring problem, for which the known algorithms are in a more primitive condition. In this chapter we will meet a large family of such problems (hundreds of them now!). This family is not just a list of seemingly difficult computational problems. It is in fact bound together by strong structural ties. The collection of problems, called the NP-complete problems, includes many well known and important questions in discrete mathematics, such as the following.

The travelling salesman problem ('TSP'): Given n points in the plane ('cities'), and a distance D. Is there a tour that visits all n of the cities, returns to its starting point, and has total length ≤ D?

Graph coloring: Given a graph G and an integer K. Can the vertices of G be properly colored in K or fewer colors?

Independent set: Given a graph G and an integer K. Does V(G) contain an independent set of K vertices?

Bin packing: Given a finite set S of positive integers, an integer K, and an integer N (the number of bins). Does there exist a partition of S into N or fewer subsets such that the sum of the integers in each subset is ≤ K? In other words, can we 'pack' the integers of S into at most N 'bins,' where the 'capacity' of each bin is K?

These are very difficult computational problems. Take the graph coloring problem, for example. We could try every possible way of coloring the vertices of G in K colors to see if any of them work. There are K^n such possibilities if G has n vertices. Hence a very large amount of computation will be done, enough so that if G has 50 vertices and we have 10 colors at our disposal, the problem would lie far beyond the capabilities of the fastest computers that are now available.

Hard problems can have easy instances, of course. If the graph G happens to have no edges at all, or very few of them, then it will be very easy to find out if a coloring is possible. Hence there is no contradiction between the facts that the problem is hard and that there are easy cases.

The real question is this (let's use 'Independent Set' as an illustration). Is it possible to design an algorithm that will come packaged with a performance guarantee of the following kind: The seller warrants that if a graph G, of n vertices, and a positive integer K are input to this program, then it will correctly determine if there is an independent set of K or more vertices in V(G), and it will do so in an amount of time that is at most 1000n^8 minutes.

The hardness of the problem stems from the seeming impossibility of producing such an algorithm accompanied by such a manufacturer's warranty card. Of course the '1000n^8' didn't have to be exactly that. Hence '357n^9' might have appeared in the guarantee, and so might '23n^3.' But some quite specific polynomial in the length of the input bit string must appear in the performance guarantee; 'n^K' would not be allowed.

Let's look carefully at why n^K would not be an acceptable worst-case polynomial time performance bound. In the 'Independent Set' problem the input must describe the graph G and the integer K. How many bits are needed to do that? The graph can be specified, for instance, by its vertex adjacency matrix A. This is an n × n matrix in which the entry in row i and column j is 1 if (i, j) ∈ E(G) and is 0 else. Evidently n^2 bits of input will describe the matrix A. The integers K and n can be entered with just O(log n) bits, so the entire input bit string for the 'Independent Set' problem is ∼ n^2 bits long. Let B denote the number of bits in the input string.

Suppose that on the warranty card the program was guaranteed to run in a time that is < n^K. Is this a guarantee of polynomial time performance? That question means 'Is there a polynomial P such that for every instance of 'Independent Set' the running time T will be at most P(B)?' Well, is T bounded by a polynomial in B if T = n^K and B ∼ n^2?

It would seem so; in fact obviously T = O(B^(K/2)), and that's a polynomial, isn't it? The key point resides in the order of the qualifiers. We must give the polynomial that works for every instance of the problem first. Then that one single polynomial must work on every instance. If the 'polynomial' that we give is B^(K/2), well, that's a different polynomial in B for different instances of the problem, because K is different for different instances. Therefore if we say that a certain program for 'Independent Set' will always get an answer before B^(K/2) minutes, then we would not have provided a polynomial-time guarantee in the form of a single polynomial in B that applies uniformly to all problem instances.

Worst-case time bounds aren't the only possible interesting ones. What we are discussing is usually called a worst-case time bound, meaning a bound on the running time that applies to every instance of the problem. Sometimes we might not care if an algorithm is occasionally very slow as long as it is almost always fast. In other situations we might be satisfied with an algorithm that is fast on average. In sections 5.6 and 5.7 we will study some average time bounds. For the present, however, we will stick to the worst-case time bounds and study some of the theory that applies to that situation.

Now let's return to the properties of the NP-complete family of problems. Here are some of them.

1°: The problems all seem to be computationally very difficult, and no polynomial time algorithms have been found for any of them.
2°: It has not been proved that polynomial time algorithms for these problems do not exist.
3°: But this is not just a random list of hard problems. If a fast algorithm could be found for one NP-complete problem then there would be fast algorithms for all of them.
4°: Conversely, if it could be proved that no fast algorithm exists for one of the NP-complete problems, then there could not be a fast algorithm for any other of those problems.

How can that be? We'll get to that later on in this section. The above properties are not intended to be a definition of the concept of NP-completeness. They are intended as a list of some of the interesting features of these problems, which, when coupled with their theoretical and practical importance, accounts for the intense worldwide research effort that has gone into understanding them in recent years. The question of the existence or nonexistence of polynomial-time algorithms for the NP-complete problems probably rates as the principal unsolved problem that faces theoretical computer science today.

Our next task will be to develop the formal machinery that will permit us to give precise definitions of all of the concepts that are needed. In section 5.2 we'll state them quite precisely. In the remainder of this section we will discuss the additional ideas informally.

What is a decision problem?

First, the idea of a decision problem. A decision problem is one that asks only for a yes-or-no answer: Can this graph be 5-colored? Is there a tour of length ≤ 15 miles? Is there a set of 67 independent vertices?

Many of the problems that we are studying can be phrased as decision problems or as optimization problems: What is the smallest number of colors with which G can be colored? What is the length of the shortest tour of these cities? What is the size of the largest independent set of vertices in G?

Usually if we find a fast algorithm for a decision problem then with just a little more work we will be able to solve the corresponding optimization problem. The distinction is a little thorny, but is worthy of careful study because it's of fundamental importance. For instance, suppose we have an algorithm that solves the decision problem for graph coloring, and what we want is the solution of the optimization problem (the chromatic number).

Let a graph G be given, say of 100 vertices. Ask: can the graph be 50-colored? If so, then the chromatic number lies between 1 and 50. Then ask if it can be colored in 25 colors. If not, then the chromatic number lies between 26 and 50. Continue in this way, using bisection of the interval that is known to contain the chromatic number. After O(log n) steps we will have found the chromatic number of a graph of n vertices. The extra multiplicative factor of log n will not alter the polynomial vs. nonpolynomial running time distinction. Hence if there is a fast way to do the decision problem then there is a fast way to do the optimization problem. The converse is obvious. Hence we will restrict our discussion to decision problems.

What is a language?

Since every decision problem can have only the two answers 'Y/N,' we can think of a decision problem as asking if a given word (the input string) does or does not belong to a certain language. The language is the totality of words for which the answer is 'Y.'

Here's an example: the graph 3-coloring language is the set of all symmetric, square matrices of 0,1 entries, with zeroes on the main diagonal (these are the vertex adjacency matrices of graphs), such that the graph that the matrix represents is 3-colorable. We can imagine that somewhere there is a vast dictionary of all of the words in this language. A 3-colorability computation is therefore nothing but an attempt to discover whether a given word belongs to the dictionary.

What is the class P?

We say that a decision problem belongs to the class P if there is an algorithm A and a number c such that for every instance I of the problem the algorithm A will produce a solution in time O(B^c), where B is the number of bits in the input string that represents I. To put it more briefly, P is the set of easy decision problems.

Examples of problems in P are most of the ones that we have already met in this book: Are these two integers relatively prime? Is this integer divisible by that one? Is this graph 2-colorable? Is there a flow of value greater than K in this network? Can this graph be disconnected by the removal of K or fewer edges? Is there a matching of more than K edges in this bipartite graph? For each of these problems there is a fast (polynomial time) algorithm.

What is the class NP?

The class NP is a little more subtle. A decision problem Q belongs to NP if there is an algorithm A that does the following:

(a) Associated with each word of the language Q (i.e., with each instance I for which the answer is 'Yes') there is a certificate C(I) such that when the pair (I, C(I)) are input to algorithm A it recognizes that I belongs to the language Q.
(b) If I is some word that does not belong to the language Q then there is no choice of certificate C(I) that will cause A to recognize I as a member of Q.
(c) Algorithm A operates in polynomial time.

To put this one more briefly, NP is the class of decision problems for which it is easy to check the correctness of a claimed answer, with the aid of a little extra information. So we aren't asking for a way to find a solution, but only to verify that an alleged solution really is correct.

Here is an analogy that may help to clarify the distinction between the classes P and NP. We have all had the experience of reading through a truly ingenious and difficult proof of some mathematical theorem, and wondering how the person who found the proof in the first place ever did it. Our task, as a reader, was only to verify the proof, and that is a much easier job than the mathematician who invented the proof had. To pursue the analogy a bit farther, some proofs are extremely time consuming even to check (see the proof of the four-color theorem!). In P are the problems where it's easy to find a solution, and in NP are the problems where it's easy to check a solution that may have been very tedious to find.

Consider the graph coloring problem to be the decision problem Q. Certainly this problem is not known to be in P. It is, however, in NP. (Some computational problems, as we will see in a moment, are not even known to belong to NP, let alone to P.) Suppose G is some graph that is K-colorable. The certificate of G might be a list of the colors that get assigned to each vertex in some proper K-coloring of the vertices of G. To check that a certain graph G really is K-colorable, we can be convinced if you will show us the color of each vertex in a proper K-coloring. If you do provide that certificate, then our checking algorithm A is very simple. It checks first that every vertex has a color and only one color. It then checks that no more than K colors have been used altogether.

Of course, we never said it was easy to construct a certificate; if you actually want to find one then you will have to solve a hard problem.
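The checking algorithm A sketched above is genuinely simple; here is one way to write it (an illustrative sketch, not the book's code). The graph has vertices 0..n-1, and the certificate maps each vertex to a color.

```python
def verify_coloring_certificate(n, edges, k, cert):
    """Check a claimed K-coloring certificate in polynomial time.
    cert is a dict mapping each vertex of 0..n-1 to a color."""
    # every vertex has a color, and only one (cert is a single mapping)
    if sorted(cert) != list(range(n)):
        return False
    # no more than K colors have been used altogether
    if len(set(cert.values())) > k:
        return False
    # finally, the two endpoints of every edge have different colors
    return all(cert[u] != cert[v] for u, v in edges)
```

Note that every step is a cheap scan of the input; nothing here searches for a coloring, it only checks one.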

It finally checks that for each edge e of G it is true that the two endpoints of e have different colors. Hence the graph coloring problem belongs to NP.

It is very clear that P ⊆ NP. Indeed, if Q ∈ P is some decision problem then we can verify membership in the language Q with the empty certificate: we don't even need a certificate in order to do a quick calculation that checks membership in the language, because the problem itself can be quickly solved.

It seems natural to suppose that NP is larger than P. That is, one might presume that there are problems whose solutions can be quickly checked with the aid of a certificate even though they can't be quickly found in the first place. No example of such a problem has ever been produced (and proved), nor has it been proved that no such problem exists. The question of whether or not P = NP is the one that we cited earlier as being perhaps the most important open question in the subject area today.

Are there problems that do belong to NP, but for which it isn't immediately obvious that this is so? Yes; primality testing is an example. Pratt's algorithm is exactly a method of producing a certificate with the aid of which we can quickly check that a given integer is prime. The decision problem 'Given n, is it prime?' is thereby revealed to belong to NP, although that fact wasn't obvious at a glance. In fact that's one of the main reasons that we studied the algorithm of Pratt, in section 4.10.

The travelling salesman problem also belongs to NP. For the travelling salesman problem we would provide a certificate that contains a tour, of all of the cities, whose total length is ≤ K. The checking algorithm A would then verify that the tour really does visit all of the cities and really does have total length ≤ K.

'Well,' you might reply, 'if we're allowed to look at the answers, how could a problem fail to belong to NP?' Try this decision problem: an instance I of the problem consists of a set of n cities in the plane and a positive number K, and the question is 'Is it true that there is not a tour of all of these cities whose total length is less than K?' Clearly this is a kind of negation of the travelling salesman problem. Does it belong to NP? If so, there must be an algorithm A and a way of making a certificate C(I) for each instance I such that we can quickly verify that no such tour exists of the given cities. Any suggestions for the certificate? The algorithm? No one else knows how to do this either. It is not known if this negation of the travelling salesman problem belongs to NP.

It is fairly obvious that the class P is called 'the class P' because 'P' is the first letter of 'Polynomial Time.' But what does 'NP' stand for? Stay tuned; the answer will appear in section 5.2.

What is reducibility?

Suppose that we want to solve a system of 100 simultaneous linear equations in 100 unknowns, of the form Ax = b. We run down to the local software emporium and quickly purchase a program for $49.95 that solves such systems. When we get home and read the fine print on the label we discover, to our chagrin, that the program works only on systems where the matrix A is symmetric, and the coefficient matrix in the system that we want to solve is, of course, not symmetric. One possible response to this predicament would be to look for the solution of the system A^T Ax = A^T b, in which the coefficient matrix A^T A is now symmetric. What we would have done would be to have reduced the problem that we really are interested in to an instance of a problem for which we have an algorithm.

More generally, let Q and Q' be two decision problems. We will say that Q' is quickly reducible to Q if, whenever we are given an instance I' of the problem Q', we can convert it, with only a polynomial amount of labor, into an instance I of Q, in such a way that I' and I both have the same answer ('Yes' or 'No'). Thus if we buy a program to solve Q, then we can use it to solve Q', with just a small amount of extra work.

What is NP-completeness?

How would you like to buy one program, for $49.95, that can solve 500 different kinds of problems? That's what NP-completeness is about. To state it a little more carefully, a decision problem is NP-complete if it belongs to NP and every problem in NP is quickly reducible to it.
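The linear-algebra reduction can be made concrete. In the sketch below (illustrative only; exact rational arithmetic, with plain Gaussian elimination standing in for the store-bought symmetric-only solver), a nonsymmetric system Ax = b is converted into the symmetric system A^T Ax = A^T b and solved in that form.

```python
from fractions import Fraction

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def solve_symmetric(S, c):
    """Gauss-Jordan elimination; stands in for the purchased program
    that only accepts symmetric coefficient matrices."""
    n = len(S)
    M = [row[:] + [c[i]] for i, row in enumerate(S)]
    for i in range(n):
        p = next(r for r in range(i, n) if M[r][i] != 0)   # pivot row
        M[i], M[p] = M[p], M[i]
        for r in range(n):
            if r != i:
                f = M[r][i] / M[i][i]
                M[r] = [x - f * y for x, y in zip(M[r], M[i])]
    return [M[i][n] / M[i][i] for i in range(n)]

# A nonsymmetric system Ax = b ...
A = [[Fraction(1), Fraction(2)], [Fraction(3), Fraction(4)]]
b = [Fraction(5), Fraction(6)]
# ... reduced to the symmetric system (A^T A) x = (A^T) b
At = transpose(A)
S = matmul(At, A)                                  # A^T A is symmetric
c = [row[0] for row in matmul(At, [[v] for v in b])]
x = solve_symmetric(S, c)
```

(The trick is legitimate here because A is invertible, so A^T Ax = A^T b has the same unique solution as Ax = b; the conversion costs only a polynomial amount of extra arithmetic, which is the point of the analogy.)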

The implications of NP-completeness are numerous. Suppose we could prove that a certain decision problem Q is NP-complete. Then we could concentrate our efforts to find polynomial-time algorithms on just that one problem Q. Indeed, if we were to succeed in finding a polynomial time algorithm to do instances of Q, then we would automatically have found a fast algorithm for doing every problem in NP. How does that work? Take an instance I' of some problem Q' in NP. Since Q' is quickly reducible to Q, we could transform the instance I' into an instance I of Q. Then use the super algorithm that we found for problems in Q to decide I. The answer for I is the answer for I', and altogether only a polynomial amount of time will have been used from start to finish.

Let's be more specific. Suppose that tomorrow morning we prove that the graph coloring problem is NP-complete, and that on the next morning you find a fast algorithm for solving it. Then consider some instance of the bin packing problem. Since graph coloring is NP-complete, the instance of bin packing can be quickly converted into an instance of graph coloring for which the 'Yes/No' answer is the same. Now use the fast graph coloring algorithm that you found (congratulations, by the way!) on the converted problem. The answer you get is the correct answer for the original bin packing problem.

To summarize: a fast algorithm for some NP-complete problem implies a fast algorithm for every problem in NP. Quick for one NP-complete problem implies quick for all of NP.

Conversely, suppose we can prove that it is impossible to find a fast algorithm for some particular problem Q' in NP. Then we can't find a fast algorithm for any NP-complete problem Q either, for if we could, then we would be able to solve instances of Q' by quickly reducing them to instances of Q and solving them. To summarize: provably slow for one problem in NP implies provably slow for all NP-complete problems. For example, the graph coloring problem is NP-complete and primality testing is in NP; if we could prove that there is no fast way to test the primality of a given integer, then we would have proved that there is no fast way to decide whether graphs are K-colorable. Think about that one for a few moments. NP-complete problems have all sorts of marvellous properties.

But are there any NP-complete problems? Why, after all, should there be a single computational problem with the property that every one of the diverse creatures that inhabit NP should be quickly reducible to it? We've been discussing the economic advantages of keeping flocks of unicorns instead of sheep; if there aren't any unicorns then the discussion is a little silly. Well, there are NP-complete problems, hordes of them, and proving that will occupy our attention for the next two sections. It's lovely that every problem in NP can be quickly reduced to just one NP-complete problem.

The first NP-complete problem was the hardest one to find. It is called the satisfiability problem, and its status as an NP-complete problem was established by S. Cook in 1971; from that work all later progress in the field has flowed.

Here's the plan. In section 5.2 we are going to talk about a simple computer, called a Turing machine. It is an idealized computer, and its purpose is to standardize ideas of computability and time of computation by referring all problems to the one standard machine. A Turing machine is an extremely simple finite-state computer, and when it performs a computation, a unit of computational labor will be very clearly and unambiguously describable. The beauty of the Turing machine is that it is at once a strong enough concept that it can in principle perform any calculation that any other finite state machine can do, while at the same time it is logically clean and simple enough to be useful for proving theorems about complexity.

It turns out that the important aspects of polynomial time computability do not depend on the particular computer that is chosen as the model. The microcomputer on your desktop might have been chosen as the standard against which polynomial time computability is measured. If that had been done then the class P of quickly solvable problems would scarcely have changed at all (the polynomials would be different but they would still be polynomials), but the proofs that we humans would have to give in order to establish the relevant theorems would have gotten much more complicated because of the variety of different kinds of states that modern computers have.

The proof that there is an NP-complete problem uses the theory of Turing machines, and in section 5.3 we will give that proof. Next, in section 5.4, a few more NP-complete problems, so the reader will get some idea of the methods that are used in identifying them; the extraordinary beauty and structural unity of these computational problems will begin to reveal itself there.

Since nobody knows a fast way to solve these problems, various methods have been developed that give approximate solutions quickly, or that give exact solutions in fast average time.

The beautiful book of Garey and Johnson (see references at the end of the chapter) calls this 'coping with NP-completeness,' and we will spend the rest of this chapter discussing some of these ideas.

Exercises for section 5.1

1. Prove that the following decision problem belongs to P: Given integers K and a1, ..., an. Is the median of the a's smaller than K?

2. For which of the following problems can you prove membership in P?
(a) Given a graph G. Does G contain a circuit of length 4?
(b) Given a graph G. Is G bipartite?
(c) Given n integers. Is there a subset of them whose sum is an even number?
(d) Given n integers. Is there a subset of them whose sum is divisible by 3?
(e) Given a graph G. Does G contain an Euler circuit?

3. For which of the following problems can you prove membership in NP?
(a) Given a set of integers and another integer K. Is there a subset of the given integers whose sum is K?
(b) Given a graph G and an integer K. Does G contain a path of length ≥ K?
(c) Given a set of K integers. Is it true that all of them are prime?
(d) Given a set of K integers. Is it true that not all of them are prime?

4. Prove that the following decision problem is in NP: Given an n × n matrix A of integer entries. Is det A = 0?

5.2 Turing Machines

A Turing machine consists of
(a) a doubly infinite tape, marked off into squares that are numbered as shown in Fig. 5.2.1. Each square can contain a single character from the character set that the machine recognizes. For simplicity we can assume that the character set contains just three symbols: '0,' '1,' and ' ' (blank).
(b) a tape head that is capable of either reading a single character from a square on the tape, or writing a single character on a square, or moving its position relative to the tape by an increment of one square in either direction.
(c) a finite list of states, such that at every instant the machine is in exactly one of those states. The possible states of the machine are, first of all, the regular states q1, ..., qs, and second, three special states:
q0: the initial state
qY: the final state in a problem to which the answer is 'Yes'
qN: the final state in a problem to which the answer is 'No'
(d) a program (or program module, if we think of it as a pluggable component) that directs the machine through the steps of a particular task.

Fig. 5.2.1: A Turing machine tape

Let's describe the program module in more detail. Suppose that at a certain instant the machine is in state q (other than qY or qN) and that the symbol that has just been read from the tape is 'symbol.' Then from the pair (q, symbol) the program module will decide (i) to what state the machine shall next go, (ii) what single character the machine will now write on the tape in the square over which the head is now positioned, and (iii) whether the tape head will next move one square to the right or one square to the left.

. Its outputs are the newstate into which the machine goes next. The program instructs it to write at square 1 a newsymbol. possibly forever. and to enter a certain newstate. The machine reads the symbol in square 1. square:=1. Procedure gonextto is the program module of the machine. say. The whole process is then repeated.. as usual.Chapter 5: N P -completeness One step of the program. newsymbol . so it can colsult the program module to ﬁnd out what to do. therefore. . the machine is put into state q0 . state:=newstate. to write the software. and it will be diﬀerent for each task. . Here is a pidgin-Pascal simulation of a Turing machine that can easily be turned into a functioning program. but hopefully after ﬁnitely many steps the machine will enter the state qY or state qN . If we want to watch a Turing machine in operation. for every possible pair (state. . The procedure turmach has for input a string x of length B. To write a program for a Turing machine. increment). and for output it sets the Boolean variable accept to True or False. rightmost := B. The machine should be thought of as part hardware and part software. if square> rightmost then leftmost:= square. {simulates Turing machine action on input string x of length B} {write input string on tape in ﬁrst B squares} for square := 1 to B do tape[square] :=x[square]. the program module that the programmer prepared is plugged into its slot. It doesn’t vary from one job to the next. say q . {move tape head} square := square+increment end. of length B.2. 2. at which moment the computation will halt with the decision having been made.newsybol. symbol ) to (newstate. symbol) that the machine might ﬁnd itself in. goes from (state. the newsymbol and the increment shall be.B]. It now is in state q0 and has read symbol. It is in two principal parts. accept:Boolean). x :array[1. {update boundaries and write new symbol}. {initialize tape head and state} state:=0.newstate.1). We can simulate one. 
depending on whether the outcome of the computation is that the machine halted in state qY or qN respectively. {record boundaries of written-on part of tape} leftmost:=1.symbol. tape[square]:=newsymbol. while state = ‘Y’ and state = ‘N’ do {read symbol at current tape square} if square< leftmost or square> rightmost then symbol:=‘’ else symbol:= tape[square] {ask program module for state transition} gonnextto(state. This procedure is the ‘hardware’ part of the Turing machine. The tape head is then positioned over square 1. Its inputs are the present state of the machine and the symbol that was just read from the tape. that describes the problem that we want to solve. The programmer’s job is. what we have to do is to tell it how to make each and every one of the transitions (5.increment). the newsymbol that the tape head now writes on the current square. procedure turmach(B:integer.{while} accept:={ state=‘Y’} end. to move the head either to square 0 or to square 2. B of the tape. A Turing machine program looks like a table in which. (5.2. and the computation begins.1) If and when the state reaches qY or qN the computation is over and the machine halts. To begin a computation with a Turing machine we take the input string x. and the increment (±1) by which the tape head will now move.{turmach} 110 . and we write x in squares 1. the programmer has speciﬁed what the newstate. we don’t have to build it.

Now let's try to write a particular program module gonextto. Consider the following problem: given an input string x, consisting of 0's and 1's, find out if it is true that the string contains an odd number of 1's. We will write a program that will scan the input string from left to right, and at each moment the machine will be in state 0 if it has so far scanned an even number of 1's, in state 1 otherwise. In Fig. 5.2.2 we show a program that will get the job done.

state   symbol   newstate   newsymbol   increment
0       0        0          0           +1
0       1        1          1           +1
0       blank    qN         blank       -1
1       0        1          0           +1
1       1        0          1           +1
1       blank    qY         blank       -1

Fig. 5.2.2: A Turing machine program for bit parity

Exercise. Program the above as procedure gonextto, run it for some input string, and print out the state of the machine, the contents of the tape, and the position of the tape head after each step of the computation.

In the next section we are going to use the Turing machine concept to prove Cook's theorem, which is the assertion that a certain problem is NP-complete. Right now let's review some of the ideas that have already been introduced, from the point of view of Turing machines. We might immediately notice that some terms that were just a little bit fuzzy before are now much more sharply in focus.

Take the notion of polynomial time, for example. To make that idea precise one needs a careful definition of what 'the length of the input bit string' means, and what one means by the number of 'steps' in a computation. But on a Turing machine both of these ideas come through with crystal clarity. The input bit string x is what we write on the tape to get things started, and its length is the number of tape squares it occupies. A 'step' in a Turing machine calculation is obviously a single call to the program module. A Turing machine calculation was done 'in time P(B)' if the input string occupied B tape squares and the calculation took P(B) steps.

Another word that we have been using without ever nailing down precisely is 'algorithm.' We all understand informally what an algorithm is; but now we understand formally too. An algorithm for a problem is a program module for a Turing machine that will cause the machine to halt after finitely many steps in state 'Y' for every instance whose answer is 'Yes,' and after finitely many steps in state 'N' for every instance whose answer is 'No.' A Turing machine and an algorithm define a language: the language is the set of all input strings x that lead to termination in state 'Y,' i.e., to an accepting calculation.

Now let's see how the idea of a Turing machine can clarify the description of the class NP. This is the class of problems for which the decisions can be made quickly if the input strings are accompanied by suitable certificates. By a certificate we mean a finite strip of Turing machine tape, consisting of 0 or more squares, each of which contains a symbol from the character set of the machine. A certificate can be loaded into a Turing machine as follows: if the certificate contains m > 0 tape squares, then replace the segment from square number -m to square number -1, inclusive, of the Turing machine tape with the certificate. That is, we place the input string x that describes the problem instance in squares 1, 2, ..., B of the tape, and we place the certificate C(x) of x in squares -m, -m + 1, ..., -1. The information on the certificate is then available to the program module just as any other information on the tape is available.

To use a Turing machine as a checking or verifying computer, we write a verifying program for the program module, in which the program verifies that the string x is indeed a word in the language of the machine; in the course of the verification the program is quite free to examine the certificate as well as the problem instance. A Turing machine that is being used as a verifying computer is called a nondeterministic machine. The hardware is the same, but the manner of input and the question that is being asked are different from the
5. 5. .2 we show a program that will get the job done. Consider the following problem: given an input string x.

situation with a deterministic Turing machine, in which we decide whether or not the input string is in the language, without using any certificates.

Definition. The class NP ('Nondeterministic Polynomial') consists of those decision problems for which there exists a fast (polynomial time) algorithm that will verify, given a problem instance string x and a suitable certificate C(x), that x belongs to the language recognized by the machine, and for which, if x does not belong to the language, no certificate would cause an accepting computation to ensue.

5.3 Cook's Theorem

The NP-complete problems are the hardest problems in NP, in the sense that if Q' is any decision problem in NP and Q is an NP-complete problem, then every instance of Q' is polynomially reducible to an instance of Q. As we have already remarked, the surprising thing is that there is an NP-complete problem at all, since it is not immediately clear why any single problem should hold the key to the polynomial time solvability of every problem in the class NP. But there is one, and as soon as we see why there is one, then we'll be able to see more easily why there are hundreds of them, including many computational questions about discrete structures such as graphs, networks and games, about optimization problems, about algebraic structures, formal logic, and so forth.

Here is the satisfiability problem, the first problem that was proved to be NP-complete, by Stephen Cook in 1971.

The rules of the game are these. We begin with a list of (Boolean) variables x1, ..., xn. A literal is either one of the variables xi or the negation of one of the variables, written x̄i; there are 2n possible literals. A clause is a set of literals.

We assign the value 'True' (T) or 'False' (F) to each one of the variables. Having done that, each one of the literals inherits a truth value, determined as follows: a literal xi has the same truth or falsity as the corresponding variable xi, and a literal x̄i has the opposite truth value from that of the variable xi. Finally, each of the clauses also inherits a truth value from this process, and it is determined as follows: a clause has the value 'T' if and only if at least one of the literals in that clause has the value 'T,' and otherwise it has the value 'F.' Think of the word 'or' as being between each of the literals in a clause, and the word 'and' as being between the clauses. Hence, starting with an assignment of truth values to the variables, some true and some false, we end up with a determination of the truth values of each of the clauses, some true and some false. A set of clauses is satisfiable if there exists an assignment of truth values to the variables that makes all of the clauses true.

The satisfiability problem (SAT). Given a set of clauses. Does there exist a set of truth values (= T or F), one for each variable, such that every clause contains at least one literal whose value is T (i.e., such that every clause is satisfied)?

Example: Consider the set x1, x2, x3 of variables. From these we might manufacture the following list of four clauses:

{x1, x2}, {x2, x3}, {x1, x̄2}, {x̄1, x3}.

If we choose the truth values (T, T, F) for the variables, respectively, then the four clauses would acquire the truth values (T, T, T, F), and so this would not be a satisfying truth assignment for the set of clauses. There are only eight possible ways to assign truth values to three variables, and after a little more experimentation we might find out that these clauses would in fact be satisfied if we were to make the assignments (T, T, T) (how can we recognize a set of clauses that is satisfied by assigning to every variable the value 'T'?).

The example already leaves one with the feeling that SAT might be a tough computational problem, because there are 2^n possible sets of truth values that we might have to explore if we were to do an exhaustive search. It is quite clear, however, that this problem belongs to NP. Indeed, it is a decision problem. Furthermore, we can easily assign a certificate to every set of clauses for which the answer to SAT is 'Yes,' i.e., for which the clauses are satisfiable.
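Exhaustive search over all 2^n truth assignments is easy to write down, which makes the exponential growth vivid. In this sketch (not from the book) a literal is a nonzero integer: +i stands for x_i and -i for its negation; the sample clause set is a small instance in the spirit of the example above.

```python
from itertools import product

def satisfiable(clauses, n):
    """Decide SAT by trying all 2**n truth assignments.
    A literal is a nonzero int: +i means x_i, -i means the negation of x_i.
    Returns a satisfying assignment (tuple of bools) or None."""
    for values in product([False, True], repeat=n):
        # xor flips the value of negated literals
        truth = lambda lit: values[abs(lit) - 1] ^ (lit < 0)
        if all(any(truth(lit) for lit in clause) for clause in clauses):
            return values
    return None

# a small instance over x1, x2, x3: {x1,x2}, {x2,x3}, {x1,-x2}, {-x1,x3}
clauses = [[1, 2], [2, 3], [1, -2], [-1, 3]]
```

The search space doubles with every added variable, which is exactly why the example leaves one feeling that SAT might be hard.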

The certificate contains a set of truth values, one for each variable, that satisfy all of the clauses. A Turing machine that receives the set of clauses, suitably encoded, along with the above certificate as input, would have to verify only that if the truth values are assigned to the variables as shown on the certificate, then indeed every clause does contain at least one literal of value 'T.' That verification is certainly a polynomial time computation.

We want to show

Theorem 5.3.1 (S. Cook, 1971): SAT is NP-complete.

Before we carry out the proof, it may be helpful to give a small example of the reducibility ideas that we are going to use.

Example. Reducing graph-coloring to SAT.

Consider the graph G of four vertices that is shown in Fig. 5.3.1, and the decision problem 'Can the vertices of G be properly colored in 3 colors?' Let's see how that decision problem can be reduced to an instance of SAT.

Fig. 5.3.1: A 3-coloring problem

We will use 12 Boolean variables: the variable xi,j corresponds to the assertion that 'vertex i has been colored in color j' (i = 1, 2, 3, 4; j = 1, 2, 3). The instance of SAT that we construct has 31 clauses. The first 16 of these are

C(i) := {xi,1, xi,2, xi,3}   (i = 1, 2, 3, 4)
T(i) := {x̄i,1, x̄i,2}         (i = 1, 2, 3, 4)
U(i) := {x̄i,1, x̄i,3}         (i = 1, 2, 3, 4)
V(i) := {x̄i,2, x̄i,3}         (i = 1, 2, 3, 4).        (5.3.1)

In the above, the four clauses C(i) assert that each vertex has been colored in at least one color. The clauses T(i) say that no vertex has both color 1 and color 2. Similarly, the clauses U(i) (resp. V(i)) guarantee that no vertex has been colored 1 and 3 (resp. 2 and 3). All 16 of the clauses in (5.3.1) together amount to the statement that 'each vertex has been colored in one and only one of the three available colors.'

Now comes the hard part: we have to construct the clauses that will assure us that the two endpoints of an edge of the graph are never the same color. For this purpose we define, for each edge e of the graph G and each color j (= 1, 2, 3), a clause D(e, j) as follows. Let u and v be the two endpoints of e; then

D(e, j) := {x̄u,j, x̄v,j},

which asserts that not both endpoints of the edge e have the same color j.

The graph is 3-colorable if and only if the clauses are satisfiable. In more detail: there exists an assignment of values T, F to the 12 Boolean variables x1,1, x1,2, ..., x4,3 such that each of the 31 clauses contains at least one literal whose value is T, if and only if the vertices of the graph G can be properly colored in three colors.

The original instance of the graph coloring problem has now been reduced to an instance of SAT. A few moments of thought will convince the reader that the transformation of one problem to the other involves only a polynomial amount of computation, despite the seemingly large number of variables and clauses. Hence graph coloring is quickly reducible to SAT. It is clear that if we have an algorithm that will solve SAT, then we can also solve graph coloring problems.
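The construction generalizes mechanically to any graph and any number of colors. The sketch below is mine, and since Fig. 5.3.1 is not reproduced here, the edge list is an assumed 4-vertex, 5-edge graph (a 4-cycle plus one diagonal), which is what it takes to get 16 + 3·5 = 31 clauses; variable x_{i,j} is encoded as the integer 3(i-1)+j, with a minus sign for negation.

```python
def coloring_to_sat(n, edges, colors=3):
    """Encode 'can this n-vertex graph be properly colored in
    `colors` colors?' as a list of SAT clauses."""
    var = lambda i, j: (i - 1) * colors + j      # x_{i,j}, with i, j >= 1
    clauses = []
    for i in range(1, n + 1):
        # C(i): vertex i gets at least one color
        clauses.append([var(i, j) for j in range(1, colors + 1)])
        # T(i), U(i), V(i): vertex i gets at most one color
        for j in range(1, colors + 1):
            for k in range(j + 1, colors + 1):
                clauses.append([-var(i, j), -var(i, k)])
    # D(e, j): the endpoints u, v of each edge never share a color
    for u, v in edges:
        for j in range(1, colors + 1):
            clauses.append([-var(u, j), -var(v, j)])
    return clauses

# an assumed 4-vertex, 5-edge graph: a 4-cycle plus one diagonal
edges = [(1, 2), (2, 3), (3, 4), (4, 1), (1, 3)]
clauses = coloring_to_sat(4, edges)   # 16 + 3*5 = 31 clauses
```

Feeding the result to any SAT solver (even the brute-force one sketched earlier) then answers the 3-colorability question, which is the whole point of a quick reduction.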

the clauses are all simultaneously satisﬁable.Chapter 5: N P -completeness Proof of Cook’s theorem We want to prove that SAT is NP-complete. can assume only O(P (n)) diﬀerent values. For example. in time ≤ P (n). K. which runs through the various tape squares that are scanned during the computation.k is true if after step i of the checking calculation it is true that the Turing machine TMQ is in state qk . the condition that we want to express. i. a polynomial number of them. thereby burning out the bearings on the tape transport mechanism! (why?) Our remaining task.33. Index a runs over the letters in the alphabet that the machine can read. What we must do. where each clause will exprss one necessary condition. and let P (n) be a polynomial in its argument n with the property that TMQ recognizes every pair (x.j = {after step i. in such a way that the clauses are satisﬁable if and only if x is in the language Q. Is it true that every random assignment of true or false values to each of these variables corresponds to an accepting computation on (x. of states that TMQ might be in. and of clauses. the bold face type will describe. Then we will be sure that whatever set of satisfying values of the variables might be found by solving the SAT problem. the machine is in at least one state. if the word I is not in the language Q.4 and to T10.e. To construct an instance of SAT means that we are going to deﬁne a number of variables. symbol a is in tape square j}. Variable Si.j. will be to describe precisely the conditions under which a set of values assigned to the variables listed above actually deﬁnes a possible accepting calculation for (x. in words.a = {after step i. Since Q is in NP there exists a Turing machine that recognizes encoded instances of problem Q. and so it takes at most O(P (n)) diﬀerent values. 
It is precisely here that the relative simplicity of a Turing machine allows us to enumerate all of the possible paths to an accepting computation in a way that would be quite unthinkable with a ‘real’ computer. Let’s count the variables that we’ve just introduced. say. At each step. it suﬃces to check that the collection of clauses is satisﬁable. so it can assume at most some ﬁxed number A of values. Therefore the subscript j. of literals. This leads to the ﬁrst set of clauses. Variable Qi. k indexes the states of the Turing machine.e.

k . one for each step i of the computation: {Qi.j. j.k .j.k .j. or the present state is not qk or the symbol just read is not l. Initially the machine is in state 0. and pair k . Si. head position) in accordance with the application of its program module to its previous (state. that there are not two symbols in each square at each step. the input string x is in squares 1 to n. Qi. these are O(P (n)) clauses. square. ﬁrst. . and leave the construction of the clauses as an exercise.j is true) or else the head is not positioned there. the clause ¯ ¯ {Qi.j.l }.j+INC } ¯ ¯ ¯ {Ti. The reader will by now have gotten the idea of how to construct the clauses. one for each triple (i. -1.k . . At step P (n) the machine is in state qY . Si+1.Qi+1. . In each case the format of the clause is this: ‘either the tape head is not positioned at square j.j . consider ﬁrst the following condition: the symbol in square j of the tape cannot change during step i of the computation if the tape head isn’t positioned there at that moment.j. Si.1 .Si+1. Qi.’ There is a clause as above for 115 .j . the tape head is positioned over a single square.j. that there is at least one symbol in each square at each step.Ti+1.. . Si. so for the next three categories we will simply list the functions that must be performed by the corresponding lists of clauses.k } ¯ ¯ ¯ {Ti. At each step. the machine is not in more than one state Therefore. symbol. . .. but if they are then .1 .k . The last set of restrictions is a little trickier: At each step the machine moves to its next conﬁguration (state. . The clauses that do this are {Si.K Since i assumes O(P (n)) values. This leeads to two lists of clauses which require.. and second.l .k } for each step i. −P (n). the head is over square 1.j. but still more are needed.l . and each pair j .A} where A is the number of letters in the machine’s alphabet. for each step i.j } must be true.j .l . k) = (state.j . Si.At step P (n) the machine is in state qY . 
The three sets of clauses that do this are ¯ ¯ ¯ ¯ {Ti. in which case either symbol k is not in the jth square before the step or else symbol k is (still) in the jth square after the step is executed. k of distinct symbols in the alphabet of the machine. and ¯ ¯ {Si. each tape square contains exactly one symbol from the alphabet of the machine. Si.k } of clauses.j. To ﬁnd the clauses that will do this job. These are O(P (n)) additonal clauses to add to the list. At each step.j.2. symbol).2. and C(x) (the input certiﬁcate of x) is in squares 0. j of distinct states. square j. symbol). This translates into the collection ¯ {Ti.j. . Si.j. Si. Qi... Qi. Qi. Qi. At each step. . These clauses express the condition in the following way: either (at time i) the tape head is positioned over square j (Ti.j . It remains to express the fact that the transitions from one conﬁguration of the machine to the next are the direct results of the operation of the program module.

each step i = 0, . . . , P(n) of the computation, for each square j = −P(n), . . . , P(n) of the tape, for each symbol l in the alphabet, and for each possible state qk of the machine, a polynomial number of clauses in all. The new configuration triple (INC, k′, l′) is, of course, as computed by the program module.

Now we have constructed a set of clauses with the following property. If we execute a recognizing computation on a string x and its certificate, in time at most P(n), then this computation determines a set of (True, False) values for all of the variables listed above, in such a way that all of the clauses just constructed are simultaneously satisfied. Conversely, if we have a set of values of the SAT variables that satisfy all of the clauses at once, then that set of values of the variables describes a certificate that would cause TMQ to do a computation that would recognize the string x, and it also describes, in minute detail, the ensuing accepting computation that TMQ would do if it were given x and that certificate.

Hence every language in NP can be reduced to SAT. It is not difficult to check through the above construction and prove that the reduction is accomplishable in polynomial time. It follows that SAT is NP-complete.

5.4 Some other NP-complete problems

Cook's theorem opened the way to the identification of a large number of NP-complete problems. The proof that Satisfiability is NP-complete required a demonstration that every problem in NP is polynomially reducible to SAT. To prove that some other problem X is NP-complete, it will be sufficient to prove that SAT reduces to problem X. For if that is so, then every problem in NP can be reduced to problem X by first reducing to an instance of SAT and then to an instance of X. In other words, life after Cook's theorem is a lot easier. To prove that some problem is NP-complete we need show only that SAT reduces to it.
We don't have to go all the way back to the Turing machine computations any more. Just prove that if you can solve your problem then you can solve SAT. By Cook's theorem you will then know that by solving your problem you will have solved every problem in NP. For the honor of being 'the second NP-complete problem,' consider the following special case of SAT, called 3-satisfiability, or 3SAT. An instance of 3SAT consists of a number of clauses, just as in SAT, except that the clauses are permitted to contain no more than three literals each. The question, as in SAT, is 'Are the clauses simultaneously satisfiable by some assignment of T, F values to the variables?' Interestingly, though, the general problem SAT is reducible to the apparently more special problem 3SAT, which will show us

Theorem 5.4.1. 3-satisfiability is NP-complete.

Proof. Let an instance of SAT be given. We will show how to transform it quickly to an instance of 3SAT that is satisfiable if and only if the original SAT problem was satisfiable. More precisely, we are going to replace clauses that contain more than three literals with collections of clauses that contain exactly three literals and that have the same satisfiability as the original. In fact, suppose our instance of SAT contains a clause

{x1, x2, . . . , xk}  (k ≥ 4).   (5.4.1)

Then this clause will be replaced by k − 2 new clauses, utilizing k − 3 new variables zi (i = 1, . . . , k − 3) that are introduced just for this purpose. The k − 2 new clauses are

{x1, x2, z1}, {x3, z̄1, z2}, {x4, z̄2, z3}, . . . , {xk−1, xk, z̄k−3}.   (5.4.2)
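The clause-splitting trick is easy to mechanize. The sketch below is ours, not the book's: a literal is a nonzero integer in the common DIMACS style (+v for the variable v, −v for its negation), the names split_clause and satisfiable are invented for illustration, and the brute-force satisfiability test is there only so that the claim can be checked on tiny cases.

```python
from itertools import product

def split_clause(clause, next_var):
    """Replace a clause {x1,...,xk}, k >= 4, by k-2 clauses of size 3,
    using the fresh variables next_var, next_var+1, ..., as in (5.4.2)."""
    k = len(clause)
    assert k >= 4
    z = list(range(next_var, next_var + k - 3))          # the k-3 new z's
    new = [[clause[0], clause[1], z[0]]]                 # {x1, x2, z1}
    for i in range(2, k - 2):                            # {x_{i+1}, ~z_{i-1}, z_i}
        new.append([clause[i], -z[i - 2], z[i - 1]])
    new.append([clause[k - 2], clause[k - 1], -z[-1]])   # {x_{k-1}, x_k, ~z_{k-3}}
    return new

def satisfiable(clauses, num_vars):
    """Brute-force satisfiability test, for checking tiny cases only."""
    for bits in product([False, True], repeat=num_vars):
        def val(lit):
            return bits[abs(lit) - 1] if lit > 0 else not bits[abs(lit) - 1]
        if all(any(val(l) for l in c) for c in clauses):
            return True
    return False

print(split_clause([1, 2, 3, 4, 5], 6))  # → [[1, 2, 6], [3, -6, 7], [4, 5, -7]]
```

Forcing all five x's false makes the transformed clauses unsatisfiable, while making any one x true leaves them satisfiable, which is exactly the equisatisfiability the proof needs.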

We now make the following

Claim. If x1*, . . . , xk* is an assignment of truth values to the x's for which the clause (5.4.1) is true, then there exist assignments z1*, . . . , zk−3* of truth values to the z's such that all of the clauses (5.4.2) are simultaneously satisfied by (x*, z*). Conversely, if (x*, z*) is some assignment that satisfies all of (5.4.2), then x* alone satisfies (5.4.1).

To prove the claim, first suppose that (5.4.1) is satisfied by some assignment x*. Then one, at least, of the k literals x1, . . . , xk, say xr, has the value 'T.' Then we can satisfy all k − 2 of the transformed clauses


(5.4.2) by assigning zs* := 'T' for s ≤ r − 2 and zs* := 'F' for s > r − 2. It is easy to check that each one of the k − 2 new clauses is satisfied.

Conversely, suppose that all of the new clauses are satisﬁed by some assignment of truth values to the x’s and the z’s. We will show that at least one of the x’s must be ‘True,’ so that the original clause will be satisﬁed. Suppose, to the contrary, that all of the x’s are false. Since, in the new clauses none of the x’s are negated, the fact that the new clauses are satisﬁed tells us that they would remain satisﬁed without any of the x’s. Hence the clauses

{z1}, {z̄1, z2}, {z̄2, z3}, . . . , {z̄k−4, zk−3}, {z̄k−3}

are satisﬁed by the values of the z’s. If we scan the list from left to right we discover, in turn, that z1 is true, z2 is true, . . . , and ﬁnally, much to our surprise, that zk−3 is true, and zk−3 is also false, a contradiction which establishes the truth of the claim made above. The observation that the transformations just discussed can be carried out in polynomial time completes the proof of theorem 5.4.1. We remark, in passing, that the problem ‘2SAT’ is in P. Our collection of NP-complete problems is growing. Now we have two, and a third is on the way. We will show next how to reduce 3SAT to a graph coloring problem, thereby proving Theorem 5.4.2. The graph vertex coloring problem is NP-complete. Proof: Given an instance of 3SAT, that is to say, given a collection of k clauses, involving n variables and having at most three literals per clause, we will construct, in polynomial time, a graph G with the property that its vertices can be properly colored in n + 1 colors if and only if the given clauses are satisﬁable. We will assume that n > 4, the contrary case being trivial. The graph G will have 3n + k vertices:

{x1, . . . , xn}, {x̄1, . . . , x̄n}, {y1, . . . , yn}, {C1, . . . , Ck}.

Now we will describe the set of edges of G. First, each vertex xi is joined to x̄i (i = 1, . . . , n). Next, every vertex yi is joined to every other vertex yj (j ≠ i), to every vertex xj (j ≠ i), and to every vertex x̄j (j ≠ i). Vertex xi is connected to Cj if xi is not one of the literals in clause Cj. Finally, x̄i is connected to Cj if x̄i is not one of the literals in Cj.

May we interrupt the proceedings to say again why we're doing all of this? You have just read the description of a certain graph G. The graph is one that can be drawn as soon as someone hands us a 3SAT problem. We described the graph by listing its vertices and then listing its edges. What does the graph do for us? Well, suppose that we have just bought a computer program that can decide if graphs are colorable in a given number of colors. We paid $49.95 for it, and we'd like to use it. But the first problem that needs solving happens to be a 3SAT problem, not a graph coloring problem. We aren't so easily discouraged, though. We convert the 3SAT problem into a graph that is (n + 1)-colorable if and only if the original 3SAT problem was satisfiable. Now we can get our money's worth by running the graph coloring program even though what we really wanted to do was to solve a 3SAT problem.
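The edge list just described can be generated mechanically. The sketch below is our own rendering, not the book's: the vertex labels ('x', i, sign), ('y', i), ('C', j) and the name build_graph are invented, a clause is a set of literals (i, True) for xi or (i, False) for x̄i, and edges are unordered pairs.

```python
def build_graph(n, clauses):
    """Edges of the graph G built from a 3SAT instance on n variables."""
    edges = set()

    def add(a, b):
        edges.add(frozenset((a, b)))        # an unordered edge {a, b}

    for i in range(1, n + 1):
        add(('x', i, True), ('x', i, False))        # x_i -- x̄_i
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            if i != j:
                add(('y', i), ('y', j))             # y_i -- y_j  (j != i)
                add(('y', i), ('x', j, True))       # y_i -- x_j  (j != i)
                add(('y', i), ('x', j, False))      # y_i -- x̄_j  (j != i)
    for j, clause in enumerate(clauses, start=1):
        for i in range(1, n + 1):
            for sign in (True, False):
                if (i, sign) not in clause:         # C_j joined to every
                    add(('x', i, sign), ('C', j))   # literal vertex NOT in C_j
    return edges
```

On n = 5 variables with one clause of three literals this produces a graph on 3n + k = 16 vertices: 5 edges xi–x̄i, C(5,2) = 10 edges among the y's, 40 edges from y's to literal vertices, and 2n − 3 = 7 edges at the clause vertex, 62 edges in all.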

In Fig. 5.4.1 we show the graph G of 11 vertices that corresponds to the following instance of 3SAT:

Fig. 5.4.1: The graph for a 3SAT problem

Now we claim that this graph is (n + 1)-colorable if and only if the clauses are satisfiable. Clearly G cannot be colored in fewer than n colors, because the n vertices y1, . . . , yn are all connected to each other and therefore they alone already require n different colors for a proper coloration. Suppose that yi is assigned color i (i = 1, . . . , n). Do we need new colors in order to color the xi vertices? Since vertex yi is connected to every x vertex and every x̄ vertex except xi, x̄i, if color i is going to be used on the x's or the x̄'s, it will have to be assigned to one of xi, x̄i, but not to both, since they are connected to each other. Hence a new color, color n + 1, will have to be introduced in order to color the x's and x̄'s.

Further, if we are going to color the vertices of G in only n + 1 colors, the only way to do it will be to assign color n + 1 to exactly one member of each pair (xi, x̄i), and color i to the other one, for each i = 1, . . . , n. That one of the pair that gets color n + 1 will be called the False vertex; the other one is the True vertex of the pair (xi, x̄i), for each i = 1, . . . , n.

It remains to color the vertices C1, . . . , Ck. The graph will be (n + 1)-colorable if and only if we can do this without using any new colors. Since each clause contains at most three literals, and n > 4, every vertex Ci must be adjacent to both xj and x̄j for at least one value of j. Therefore no vertex Ci can be colored in the color n + 1 in a proper coloring of G, and therefore every Ci must be colored in one of the colors 1, . . . , n. Since Ci is connected by an edge to every vertex xj or x̄j that is not in the clause Ci, it follows that Ci cannot be colored in the same color as any xj or x̄j that is not in the clause Ci. Hence the color that we assign to Ci must be the same as the color of some 'True' vertex xj or x̄j that corresponds to a literal that is in clause Ci.
Therefore the graph is (n + 1)-colorable if and only if there is a 'True' vertex for each Ci, and this means exactly that the clauses are satisfiable. It is easy to verify that the transformation from the 3SAT problem to the graph coloring problem can be carried out in polynomial time, and the proof is finished.

By means of many, often quite ingenious, transformations of the kind that we have just seen, the list of NP-complete problems has grown rapidly since the first example, and the 21 additional problems found by R. Karp. Hundreds of such problems are now known. Here are a few of the more important ones.

Maximum clique: We are given a graph G and an integer K. The question is to determine whether or not there is a set of K vertices in G, each of which is joined, by an edge of G, to all of the others.

Edge coloring: Given a graph G and an integer K. Can we color the edges of G in K colors, so that whenever two edges meet at a vertex, they will have different colors? Let us refer to an edge coloring of this kind as a proper coloring of the edges of G. A beautiful theorem of Vizing* deals with this question. If ∆ denotes the largest degree of any vertex in the given graph, then Vizing's theorem asserts that the edges of G can be properly colored in either ∆ or ∆ + 1 colors. Since it is obvious that at least ∆ colors will be needed, this means that the edge chromatic number is in doubt by only one unit, for every graph G! Nevertheless the decision as to whether the correct answer is ∆ or ∆ + 1 is NP-complete.

Hamilton path: In a given graph G, is there a path that visits every vertex of G exactly once?

Target sum: Given a finite set of positive integers whose sum is S. Is there a subset whose sum is S/2?

The above list, together with SAT, 3SAT, Travelling Salesman and Graph Coloring, constitutes a modest sampling of the class of these seemingly intractable problems. Of course it must not be assumed that every problem that 'sounds like' an NP-complete problem is necessarily so hard; sometimes, but not often, such a problem turns out to be quite tractable. If for example we ask for an Euler path instead of a Hamilton path (i.e., if we want to traverse edges rather than vertices) the problem would no longer be NP-complete, and in fact it would be in P. As another example, the fact that one can find the edge connectivity of a given graph in polynomial time (see section 3.8) is rather amazing considering the quite difficult appearance of the problem. One of our motivations for including the network flow algorithms in this book was, indeed, to show how very sophisticated algorithms can sometimes prove that seemingly hard problems are in fact computationally tractable.

Exercises for section 5.4

1. Is the claim that we made and proved above (just after (5.4.2)) identical with the statement that the clause (5.4.1) is satisfiable if and only if the clauses (5.4.2) are simultaneously satisfiable? Discuss.
2. Is the claim that we made and proved above (just after (5.4.2)) identical with the statement that the Boolean expression (5.4.1) is equal to the product of the Boolean expressions (5.4.2), in the sense that their truth values are identical on every set of inputs? Discuss.
3. Let it be desired to find out if a given graph G, of V vertices, can be vertex colored in K colors. If we transform the problem into an instance of 3SAT, exactly how many clauses will there be?

5.5 Half a loaf ...

If we simply have to solve an NP-complete problem, then we are faced with a very long computation. Is there anything that can be done to lighten the load? In a number of cases various kinds of probabilistic and approximate algorithms have been developed, some very ingenious, and these may often be quite serviceable, as we have already seen in the case of primality testing. Here are some of the strategies of 'near' solutions that have been developed.

Type I: 'Almost surely ...' Suppose we have an NP-complete problem that asks if there is a certain kind of substructure embedded inside a given structure. Then we may be able to develop an algorithm with the following properties:
(a) It always runs in polynomial time;
(b) When it finds a solution then that solution is always a correct one;
(c) It doesn't always find a solution, but it 'almost always' does.

* V. G. Vizing, On an estimate of the chromatic class of a p-graph (Russian), Diskret. Analiz. 3 (1964), 25-30.
We will describe such an algorithm below; it succeeds 'almost always' in the sense that the ratio of successes to total cases approaches unity as the size of the input string grows large.

Type II: 'Usually fast ...' In this category of quasi-solution are algorithms in which the uncertainty lies not in whether a solution will be found, but in how long it will take to find one. An algorithm of this kind will (a) always find a solution, and the solution will always be correct, and (b) operate in an average of subexponential time, although occasionally it may require exponential time. The averaging is over all input strings of a given size. An example of this sort is an algorithm that will surely find a maximum independent set in a graph, and will on the average require 'only' O(n^(c log n)) time to do so, but will occasionally, for some graphs, require nearly 2^n time to get an answer. Note that O(n^(c log n)) is not a polynomial time estimate, but it's an improvement over 2^n. We will outline such an algorithm below, in section 5.6.

Type III: 'Approximately correct ...' In this kind of an algorithm we don't even get the right answer, but it's close. Since this means giving up quite a bit, people like these algorithms to be very fast. Of course we are going to drop our insistence that the questions be posed as decision problems, and instead they will be asked as optimization problems: find the shortest tour through these cities, or find the size of the maximum clique in this graph, or find a coloring of this graph in the fewest possible colors, etc. In response these algorithms will (a) run in polynomial time, (b) always produce some output, and (c) provide a guarantee that the output will not deviate from the optimal solution by more than such-and-such. An example of this type is the approximate algorithm for the travelling salesman problem that is given below. It quickly yields a tour of the cities that is guaranteed to be at most twice as long as the shortest possible tour.

Now let's look at examples of each of these kinds of approximation algorithms.

An example of an algorithm of Type I is due to Angluin and Valiant. It tries to find a Hamilton path (or circuit) in a graph G. It doesn't always find such a path, but in theorem 5.5.1 below we will see that it usually does, at least if the graph is from a class of graphs that are likely to have Hamilton paths at all.

Input to the algorithm are the graph G and two distinguished vertices s, t. It looks for a Hamilton path between the vertices s, t (if s = t on input then we are looking for a Hamilton circuit in G). The procedure maintains a partially constructed Hamilton path P, from s to some vertex ndp, and it attempts to extend P by adjoining an edge to a new, previously unvisited vertex. In the process of doing so it will delete from the graph G, from time to time, an edge, so we will also maintain a variable graph G′, that is initially set to G, but which is acted upon by the program.

To do its job, the algorithm chooses at random an edge (ndp, v) that is incident with the current endpoint ndp of the partial path P, and it deletes the edge (ndp, v) from the graph G′, so it will never be chosen again. If v is a vertex that is not on the path P then the path is extended by adjoining the new edge (ndp, v). However, if the new vertex v is already on the path P, then we short-circuit the path by deleting an edge from it and drawing in a new edge, as is shown below in the formal statement of the algorithm, and in Fig. 5.5.1. In that case the path does not get longer, but it changes so that it now has

enhanced chances of ultimate completion.

Fig. 5.5.1: The short circuit

Here is a formal statement of the algorithm of Angluin and Valiant for finding a Hamilton path or circuit in an undirected graph G.

procedure uhc(G: graph; s, t: vertex);
{finds a Hamilton path (if s ≠ t) or a Hamilton circuit (if s = t) P in an undirected graph G and returns 'success', or fails, and returns 'failure'}
G′ := G; ndp := s; P := empty path;
repeat
    if ndp is an isolated point of G′
        then return 'failure'
    else choose uniformly at random an edge (ndp, v) from among the edges of G′ that are incident with ndp and delete that edge from G′;
        if v ≠ t and v ∉ P
            then adjoin the edge (ndp, v) to P; ndp := v
        else if v ≠ t and v ∈ P then
            {This is the short-circuit of Fig. 5.5.1}
            u := neighbor of v in P that is closer to ndp;
            delete edge (u, v) from P;
            adjoin edge (ndp, v) to P;
            ndp := u
        {then}
    end {else}
until P contains every vertex of G (except t, if s ≠ t) and edge (ndp, t) is in G but not in G′;
adjoin edge (ndp, t) to P and return 'success'
end. {uhc}

As stated above, the algorithm makes only a very modest claim: either it succeeds or it fails! Of course what makes it valuable is the accompanying theorem, which asserts that in fact the procedure almost always succeeds, provided the graph G has a good chance of having a Hamilton path or circuit.
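The pseudocode for uhc translates readily into executable form. The following is our Python sketch of the procedure (an illustration, not the authors' code): the graph is a dictionary mapping each vertex to the set of its neighbors, the path P is a list whose last element is ndp, and the function returns the path as a vertex list or None for 'failure'.

```python
import random

def uhc(G, s, t, rng=random):
    """Angluin-Valiant heuristic: Hamilton path from s to t (circuit if s == t)."""
    Gp = {v: set(nb) for v, nb in G.items()}        # the working copy G'
    path, on_path = [s], {s}                        # P, with ndp = path[-1]
    need = set(G) - ({t} if s != t else set())      # vertices P must contain
    while True:
        ndp = path[-1]
        # success test: P covers what it must, and edge (ndp, t) is in G
        # but has already been deleted from G'
        if on_path == need and t in G[ndp] and t not in Gp[ndp]:
            return path + [t]                       # adjoin (ndp, t)
        if not Gp[ndp]:
            return None                             # ndp isolated in G'
        v = rng.choice(sorted(Gp[ndp]))             # random edge (ndp, v) ...
        Gp[ndp].discard(v); Gp[v].discard(ndp)      # ... deleted from G'
        if v != t and v not in on_path:
            path.append(v); on_path.add(v)          # extend P
        elif v != t:                                # short-circuit (Fig. 5.5.1)
            i = path.index(v)
            path = path[:i + 1] + path[:i:-1]       # new endpoint is old p_{i+1}
```

A single call either succeeds or fails; in practice one simply retries a few times, and the theorem that follows says that on graphs with enough edges very few retries are needed.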

We now state the theorem of Angluin and Valiant, which asserts that the algorithm above will almost surely succeed if the graph G has enough edges.

Theorem 5.5.1. Fix a positive real number a. There exist numbers M and c such that if we choose a graph G at random from among those of n vertices and at least cn log n edges, and we choose arbitrary vertices s, t in G, then the probability that algorithm UHC returns 'success' before making a total of Mn log n attempts to extend partially constructed paths is 1 − O(n^(−a)).

What kind of graph has such a 'good chance'? A great deal of research has gone into the study of how many edges a graph has to have before almost surely it must contain certain given structures. For instance, how many edges must a graph of n vertices have before we can be almost certain that it will contain a complete graph of 4 vertices? To say that graphs have a property 'almost certainly' is to say that the ratio of the number of graphs on n vertices that have the property to the number of graphs on n vertices approaches 1 as n grows without bound. For the Hamilton path problem, an important dividing line, or threshold, turns out to be at the level of cn log n edges: a graph of n vertices that has o(n log n) edges has relatively little chance of being even connected, whereas a graph with > cn log n edges is almost certainly connected, and almost certainly has a Hamilton path.

5.6 Backtracking (I): independent sets

In this section we are going to describe an algorithm that is capable of solving some NP-complete problems fast, on the average, while at the same time guaranteeing that a solution will always be found, be it quickly or slowly. The method is called backtracking, and it has long been a standard method in computer search problems when all else fails. It has been common to think of backtracking as a very long process, and indeed it can be. But recently it has been shown that the method can be very fast on average. In this case, it functions in an average of constant time, i.e., the time is independent of the number of vertices, although to be sure, the worst-case behavior is very exponential. The method is also easy to analyze and to describe in this case.

We first illustrate the backtrack method in the context of a search for the largest independent set of vertices (a set of vertices no two of which are joined by an edge) in a given graph G, an NP-complete problem. Hence consider a graph G of n vertices, in which the vertices have been numbered 1, . . . , n. We want to find, in G, the size of the largest independent set of vertices. In Fig. 5.6.1 below, for instance, the graph G has 6 vertices.

Fig. 5.6.1: Find the largest independent set

Begin by searching for an independent set S that contains vertex 1, so let S := {1}. Now attempt to enlarge S. We cannot enlarge S by adjoining vertex 2 to it, but we can add vertex 3. Our set S is now {1, 3}. Now we cannot adjoin vertex 4 (joined to 1) or vertex 5 (joined to 1) or vertex 6 (joined to 3), so we are stuck. Therefore we backtrack, by replacing the most recently added member of S by the next choice that we might have made for it. In this case, we delete vertex 3 from S, and the next choice would be vertex 6. The set S is {1, 6}. Again we have a dead end. If we backtrack again, there are no further choices with which to replace vertex 6, so we backtrack even further, and not only delete 6 from S but also replace vertex 1 by the next possible choice for it, namely vertex 2.
{13}. Some graphs have an enormous number of independent sets. therefore. n for a graph of n vertices. the probability that none of 2 them appear is 2−k(k−1)/2. {45}. 5. S of vertices of G. {3}. 1. or equivalently. A reasonable measure of the complexity of the searching job. Now the question is. But that is the sum. consists of exactly every independent set in the graph G. 5. Observe that the list of sets S above. {5}. without actually having to write down the tree explicitly. on the average how fast is the backtrack method for this problem? What we are asking for is the average number of independent sets that a graph of n vertices has. In the example above. Any other graph G of n vertices will have a number of independent sets that lies between these two extremes of n + 1 and 2n . the list of nodes of the tree T .6. Sometimes backtracking will take an exponentially long time. Since each of these k edges has a probability 1/2 of appearing in G. in advance. of the probability that S is an independent set. 2.2 below. and S − S consists of a single element: the highest-numbered vertex in S .2: The backtrack search tree The backtrack algorithm amounts just to visiting every vertex of the search tree T . Two vertices of T . . {34}. Level 0 consists of a single root vertex. exactly zero of these edges actually live in the random graph G. The question of the complexity of backtrack search is therefore the same as the question of determining the number of independent sets of the graph G. are joined by an edge in T if S ⊆ S . n}. This is a tree whose vertices are arranged on levels L := 0. k 123 (5. including the empty set.6. . Each vertex of T corresponds to an independent set of vertices in G. {35}. The graph K n of n vertices and no edges whatever has 2n independent sets of vertices.6.1 is shown in Fig. . The backtrack tree will have 2n nodes. has just n+1 independent sets of vertices. 
On level L we ﬁnd a vertex S of T for every independent set of exactly L vertices of G. corresponding to independent sets S . . {345}. {6} A convenient way to represent the search process is by means of the backtrack search tree T . Hence the average number of independent sets in a graph of n vertices is n In = k=0 n −k(k−1)/2 2 . The complete graph Kn of n vertices and every possible edge. {245}. .6 Backtracking (I): independent sets To speed up the discussion. corresponding to the empty set of vertices of G. . 5. The complete backtrack search tree for the problem of ﬁnding a maximum independent set in the graph G of Fig. over all vertex subsets S ⊆ {1.. and sometimes it will be fairly quick. {4}.6. Fig. . If S has k vertices. the graph G had 19 independent sets of vertices.1) . and the search will be a long one indeed. {2}. {25}. among the k(k − 1)/2 possible edges that might join a pair of vertices in S. is the number of independent sets that G has. then the probability that S is independent is the probability that. we will now show the list of all sets S that turn up from start to ﬁnish of the algorithm: {1}. {24}. n(n−1)/2 in all.5. {16}.

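The sum (5.6.1) is easy to evaluate numerically, and doing so reproduces the entries of Table 5.6.1; the function name I below is ours, not the book's:

```python
from math import comb

def I(n):
    """The average number of independent sets, formula (5.6.1)."""
    return sum(comb(n, k) * 2.0 ** (-(k * (k - 1) // 2)) for k in range(n + 1))

print(I(2), round(I(5), 1))  # → 3.5 12.3
```

The terms of the sum die off doubly exponentially in k, which is why In grows so much more slowly than 2^n.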
A short table of values of In is shown below, in Table 5.6.1, along with values of 2^n, for comparison.

n : In : 2^n
2 : 3.5 : 4
3 : 5.6 : 8
4 : 8.5 : 16
5 : 12.3 : 32
10 : 52 : 1024
15 : 149.8 : 32768
20 : 350.6 : 1048576
30 : 1342.5 : 1073741824
40 : 3862.9 : 1099511627776

Table 5.6.1: Independent sets and all sets

Clearly the average number of independent sets in a graph is a lot smaller than the maximum number that graphs of that size might have. In the exercises it will be seen that the rate of growth of In as n grows large is O(n^(log n)). Hence the average amount of labor in a backtrack search for the largest independent set in a graph grows subexponentially, although faster than polynomially. It is some indication of how hard this problem is that even on the average the amount of labor needed is not of polynomial growth.

Exercises for section 5.6

1. What is the average number of independent sets of size k that are in graphs of V vertices and E edges?
2. Let tk denote the kth term in the sum (5.6.1).
(a) Show that tk/tk−1 = (n − k + 1)/(k 2^(k−1)).
(b) Show that tk/tk−1 is > 1 when k is small, and is < 1 after k passes a certain critical value k0. Hence show that the terms in the sum (5.6.1) increase in size until k = k0 and then decrease.
3. Now we will estimate the size of k0 in the previous problem.
(a) Show that tk/tk−1 < 1 when k = log2 n and tk/tk−1 > 1 when k = log2 n − log2 log2 n. Hence the index k0 of the largest term in (5.6.1) satisfies log2 n − log2 log2 n ≤ k0 ≤ log2 n.
(b) The entire sum in (5.6.1) is at most n + 1 times as large as its largest single term. Use Stirling's formula (1.10) and 3(a) above to show that the k0th term is O(n^(log n)), and therefore the same is true of the whole sum, i.e., of In.

5.7 Backtracking (II): graph coloring

In another NP-complete problem, that of graph-coloring, the average amount of labor in a backtrack search is O(1) (bounded) as n, the number of vertices in the graph, grows without bound. More precisely, if we ask 'Is the graph G, of V vertices, properly vertex-colorable in K colors?', then, for fixed K, the average labor in a backtrack search for the answer is bounded. Hence not only is the average of polynomial growth, but the polynomial is of degree 0 (in V).

To be even more specific, consider the case of 3 colors. It is already NP-complete to ask if the vertices of a given graph can be colored in 3 colors. Nevertheless, the average number of nodes in the backtrack search tree for this problem is about 197, averaged over all graphs of all sizes. This means that if we input a random graph of 1,000,000 vertices, and ask if it is 3-colorable, then we can expect an answer (probably 'No') after only about 197 steps of computation. To prove this we will need some preliminary lemmas.

Lemma 5.7.1. Let s_1, ..., s_K be nonnegative numbers whose sum is L. Then the sum of their squares is at least L^2/K.

Proof: We have

    0 <= sum_{i=1}^{K} (s_i - L/K)^2
       = sum_i s_i^2 - (2L/K) sum_i s_i + K (L/K)^2
       = sum_i s_i^2 - 2L^2/K + L^2/K
       = sum_i s_i^2 - L^2/K.

The next lemma deals with a kind of inside-out chromatic polynomial question. Instead of asking 'How many proper colorings can a given graph have?', we ask 'How many graphs can have a given proper coloring?'

Lemma 5.7.2. Let C be one of the K^L possible ways to color in K colors a set of L abstract vertices 1, 2, ..., L. Then the number of graphs G whose vertex set is that set of L colored vertices and for which C is a proper coloring of G is at most 2^(L^2 (1-1/K)/2).

Proof: In the coloring C, suppose s_1 vertices get color 1, s_2 get color 2, ..., s_K get color K, where, of course, s_1 + ... + s_K = L. If a graph G is to admit C as a proper vertex coloring, then its edges can be drawn only between vertices of different colors. The number of edges that G might have is therefore

    s_1 s_2 + s_1 s_3 + ... + s_1 s_K + s_2 s_3 + ... + s_2 s_K + ... + s_{K-1} s_K,

for which we have the following estimate:

    sum_{1<=i<j<=K} s_i s_j = (1/2) sum_{i != j} s_i s_j
        = (1/2) ( sum_{i,j=1}^{K} s_i s_j - sum_{i=1}^{K} s_i^2 )        (5.7.1)
        = (1/2) (sum_i s_i)^2 - (1/2) sum_i s_i^2
        <= L^2/2 - (1/2)(L^2/K)        (by lemma 5.7.1)
        = (L^2/2)(1 - 1/K).

The number of possible graphs G is therefore at most 2^(L^2 (1-1/K)/2).

Lemma 5.7.3. The total number of proper colorings in K colors of all graphs of L vertices is at most K^L 2^(L^2 (1-1/K)/2).

Proof: We are counting the pairs (G, C), where the graph G has L vertices and C is a proper coloring of G. If we keep C fixed and sum on G, then by lemma 5.7.2 the sum is at most 2^(L^2 (1-1/K)/2). Since there are K^L such C's, the proof is finished.

Now let's think about a backtrack search for a K-coloring of a graph. Begin by using color 1 on vertex 1. Then use color 1 on vertex 2 unless (1, 2) is an edge, in which case use color 2. As the coloring progresses through vertices 1, 2, ..., L we color each new vertex with the lowest available color number that does not cause a conflict with some vertex that has previously been colored.
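The count of pairs (G, C) in lemma 5.7.3 can be checked by brute force for very small cases; in the following sketch the sizes L and K are chosen tiny purely for illustration, and every graph and every coloring is enumerated directly:

```python
from itertools import combinations, product

def total_proper_colorings(L: int, K: int) -> int:
    """Number of pairs (G, C): G a graph on L labeled vertices, C a proper
    K-coloring of G -- the quantity bounded in lemma 5.7.3."""
    vertex_pairs = list(combinations(range(L), 2))
    total = 0
    for edges in product([0, 1], repeat=len(vertex_pairs)):   # each graph G
        for coloring in product(range(K), repeat=L):          # each coloring C
            # C is proper for G iff every present edge joins two colors.
            if all(coloring[a] != coloring[b]
                   for (a, b), present in zip(vertex_pairs, edges) if present):
                total += 1
    return total

L, K = 4, 2
count = total_proper_colorings(L, K)
bound = K ** L * 2 ** (L * L * (1 - 1 / K) / 2)
print(count, bound)   # 162 pairs, comfortably below the bound 256.0
```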

At some stage we may reach a dead end: out of colors, but not out of vertices to color. In the graph of Fig. 5.7.1, if we try to 2-color the vertices, we can color vertex 1 in color 1, vertex 2 in color 2, vertex 3 in color 1, and then we'd be stuck, because neither color would work on vertex 4. When a dead end is reached, back up to the most recently colored vertex for which other color choices are available, replace its color with the next available choice, and try again to push forward to the next vertex.

The (futile) attempt to color the graph in Fig. 5.7.1 with 2 colors by the backtrack method can be portrayed by the backtrack search tree in Fig. 5.7.2. The search is thought of as beginning at 'Root.' The label at each node of the tree describes the colors of the vertices that have so far been colored. Thus '212' means that vertices 1, 2, 3 have been colored in colors 2, 1, 2, respectively.

Fig. 5.7.1: Color this graph

Fig. 5.7.2: A frustrated search tree

Fig. 5.7.3: A happy search tree
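The dead-end-and-back-up process is easy to simulate. The little graph below is only a stand-in for Fig. 5.7.1 (an assumption, since the figure itself does not survive in this copy): vertices 0, 1, 3 form a triangle, so every 2-coloring attempt dead-ends while 3 colors succeed.

```python
def backtrack_coloring(adj, K):
    """Backtrack K-coloring.  adj[v] lists the neighbors of v that come
    earlier in the vertex order.  Returns (number of search-tree nodes,
    whether a proper coloring exists)."""
    n, nodes, found = len(adj), 0, False

    def extend(colors):
        nonlocal nodes, found
        v = len(colors)
        if v == n:
            found = True
            return
        for c in range(K):                        # lowest colors first
            if all(colors[u] != c for u in adj[v]):
                nodes += 1                        # one more node of the tree
                extend(colors + [c])

    extend([])
    return nodes, found

# Stand-in graph: edges 01, 12, 03, 13 -- a triangle 0-1-3 plus vertex 2.
adj = [[], [0], [1], [0, 1]]
print(backtrack_coloring(adj, 2))   # (6, False): a small frustrated tree
print(backtrack_coloring(adj, 3))   # (33, True): a happy tree
```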

If instead we use 3 colors on the graph of Fig. 5.7.1, then we get a successful coloring; in fact we get 12 of them, as is shown in Fig. 5.7.3.

Let's concentrate on a particular level of the search tree. Level 2, for instance, consists of the nodes of the search tree that are at a distance 2 from 'Root.' In Fig. 5.7.3, level 2 contains 6 nodes, corresponding to the partial colorings 12, 13, 21, 23, 31, 32 of the graph. Generally, a node at level L of the backtrack search tree corresponds to a proper coloring in K colors of the subgraph of G that is induced by vertices 1, 2, ..., L (see exercise 15 of section 1.6). Let H_L(G) denote that subgraph: H_L(G) selects the subgraph of G that is induced by vertices 1, 2, ..., L. When the coloring reaches vertex 2, for example, it has seen only the portion of the graph G that is induced by vertices 1 and 2. Then we see the truth of

Lemma 5.7.4. The number of nodes at level L of the backtrack search tree for coloring a graph G in K colors is equal to the number of proper colorings of H_L(G) in K colors, i.e., to P(K, H_L(G)), where P is the chromatic polynomial.

We are now ready for the main question of this section: what is the average number of nodes in a backtrack search tree for K-coloring graphs of n vertices? Let A(n, K) denote the average number of nodes in the backtrack search trees for K-coloring the vertices of all graphs of n vertices. This is

    A(n, K) = (1/no. of graphs G_n) sum over graphs G_n of {no. of nodes in tree for G}
            = 2^(-C(n,2)) sum_{G_n} sum_{L=0}^{n} {no. of nodes at level L}
            = 2^(-C(n,2)) sum_{G_n} sum_{L=0}^{n} P(K, H_L(G_n)).        (by lemma 5.7.4)

Now lots of graphs G of n vertices have the same subgraph H_L(G) sitting at vertices 1, 2, ..., L. In fact, exactly 2^(C(n,2) - C(L,2)) different graphs G of n vertices all have the same graph H of L vertices in residence at vertices 1, 2, ..., L. Fix some value of L and consider the inner sum. As G runs over all graphs of n vertices, therefore,

    A(n, K) = 2^(-C(n,2)) sum_{L=0}^{n} 2^(C(n,2) - C(L,2)) sum_{H_L} P(K, H_L)
            = sum_{L=0}^{n} 2^(-C(L,2)) sum_{H_L} P(K, H_L).        (5.7.2)

The inner sum is exactly the number that is counted by lemma 5.7.3. Hence (5.7.2) gives

    A(n, K) <= sum_{L=0}^{n} 2^(-C(L,2)) K^L 2^(L^2 (1-1/K)/2)
            <= sum_{L=0}^{infinity} K^L 2^(L/2) 2^(-L^2/2K).

The infinite series actually converges! Hence A(n, K) is bounded, for all n. This proves

Theorem 5.7.1. Let A(n, K) denote the average number of nodes in the backtrack search trees for K-coloring the vertices of all graphs of n vertices. Then there is a constant h = h(K), that depends on the number of colors K, but not on n, such that A(n, K) <= h(K) for all n.
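The convergence of the series bounding A(n, K) is easy to see numerically. For K = 3 this crude bound comes out below 500; the figure of about 197 quoted earlier is the actual average, which is smaller than this bound. A sketch:

```python
def coloring_bound(K: int, terms: int = 200) -> float:
    """Partial sum of the series sum over L of K^L * 2^(L/2) * 2^(-L*L/(2K)),
    which bounds A(n, K) independently of n."""
    return sum(K ** L * 2 ** (L / 2) * 2 ** (-L * L / (2 * K))
               for L in range(terms))

# The terms die off like 2^(-L^2/2K), so the partial sums stabilize quickly.
print(coloring_bound(3))
assert coloring_bound(3, 50) == coloring_bound(3, 200)
```

The Gaussian-type factor 2^(-L^2/2K) eventually crushes the exponential factor K^L 2^(L/2), which is the whole point of the theorem: the bound does not depend on n at all.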

5.8 Approximate algorithms for hard problems

Finally we come to Type III of the three kinds of 'half-a-loaf-is-better-than-none' algorithms that were described earlier in this chapter. In these algorithms we don't find the exact solution of the problem, only an approximate one. As consolation we have an algorithm that runs in polynomial time, as well as a performance guarantee to the effect that while the answer is approximate, it can certainly deviate by no more than such-and-such from the exact answer.

An elegant example of such a situation is in the Travelling Salesman Problem, which we will now express as an optimization problem rather than as a decision problem. We are given n points ('cities') in the plane, as well as the distances between every pair of them, and we are asked to find a round-trip tour of all of these cities that has minimum length. We will assume throughout the following discussion that the distances satisfy the triangle inequality. This restriction of the TSP is often called the 'Euclidean' Travelling Salesman Problem.

The algorithm that we will discuss for this problem has the properties that (a) it runs in polynomial time and (b) the round-trip tour that it finds will never be more than twice as long as the shortest possible tour.

The first step in carrying out the algorithm is to find a minimum spanning tree (MST) for the n given cities. A MST is a tree whose nodes are the cities in question and which, among all possible trees on that vertex set, has minimum possible length. It may seem that finding a MST is just as hard as solving the TSP, but NIN (No, It's Not). The MST problem is one of those all-too-rare computational situations in which it pays to be greedy.

Generally speaking, in a greedy algorithm, (i) we are trying to construct some optimal structure by adding one piece at a time, and (ii) at each step we make the decision about which piece will be added next by choosing, among all available pieces, the single one that will carry us as far as possible in the desirable direction (be greedy!). The reason that greedy algorithms are not usually the best possible ones is that it may be better not to take the single best piece at each step, but to take some other piece, in the hope that at a later step we will be able to improve things even more. In other words, the global problem of finding the best structure might not be solvable by the local procedure of being as greedy as possible at each step. In the MST problem, though, the greedy strategy works, as we see in the following algorithm.

procedure mst(x: array of n points in the plane);
{constructs a spanning tree T of minimum length, on the vertices {x1, ..., xn} in the plane}
let T consist of a single vertex x1;
while T has fewer than n vertices do
    for each vertex v that is not yet in T, find the distance d(v) from v to the nearest vertex of T;
    let v* be a vertex of smallest d(v);
    adjoin to T the edge from v* to the nearest vertex w != v* of T;
    adjoin v* to the vertex set of T
end{while}
end.{mst}

Proof of correctness of mst: Let T be the tree that is produced by running mst, and let e1, ..., e_{n-1} be its edges, listed in the same order in which the algorithm mst produced them. Let T' be a minimum spanning tree for x. Let e_r be the first edge of T that does not appear in T'. In the minimum tree T', the edges e1, ..., e_{r-1} all appear, and we let S be the union of their vertex sets. In T' let f be the edge that joins the subtree on S to the subtree on the remaining vertices of x. Suppose f is shorter than e_r. Then f was one of the edges that was available to the algorithm mst at the instant that it chose e_r, and since e_r was the shortest edge available at that moment, we have a contradiction.

Suppose f is longer than e_r. Then the minimum spanning tree T' would not be minimal, because the tree that we would obtain by exchanging f for e_r in T' (why is it still a tree if we do that exchange?) would be shorter, contradicting the minimality of T'. Hence f and e_r have the same length. In T' exchange f for e_r. Then T' is still a tree, and is still a minimum spanning tree. The index of the first edge of T that does not appear in T' is now at least r + 1, one unit larger than before. The process of replacing edges of T that do not appear in T', without affecting the minimality of T', can be repeated until every edge of T appears in T', i.e., until T' = T. Hence T was a minimum spanning tree. That finishes one step of the process that leads to a polynomial time travelling salesman algorithm that finds a tour of at most twice the minimum length.

The next step involves finding an Euler circuit. Way back in theorem 1.6.1 we learned that a connected graph has an Euler circuit if and only if every vertex has even degree. Recall that the proof was recursive in nature, and immediately implies a linear time algorithm for finding Euler circuits recursively. We also noted that the proof remains valid even if we are dealing with a multigraph, that is, with a graph in which several edges are permitted between single pairs of vertices. We will in fact need that extra flexibility for the purpose at hand.

Now we have the ingredients for a quick near-optimal travelling salesman tour.

Theorem 5.8.1. There is an algorithm that operates in polynomial time and which will return a travelling salesman tour whose length is at most twice the length of a minimum tour.

Here is the algorithm. Given the n cities in the plane:

(1) Find a minimum spanning tree T for the cities.
(2) Double each edge of the tree, thereby obtaining a 'multitree' T(2) in which between each pair of vertices there are 0 or 2 edges.
(3) Since every vertex of the doubled tree has even degree, there is an Eulerian tour W of the edges of T(2); find one, as in the proof of theorem 1.6.1.
(4) Now we construct the output tour of the cities. Begin at some city and follow the walk W. However, having arrived at some vertex v, go from v directly (via a straight line) to the next vertex of the walk W that you haven't visited yet. This means that you will often short-circuit portions of the walk W by going directly from some vertex to another one that is several edges 'down the road.'

The tour Z' that results from (4) above is indeed a tour of all of the cities in which each city is visited once and only once. We claim that its length is at most twice optimal. A step of Z' that walks along an edge of the walk W has length equal to the length of that edge of W. A step of Z' that short-circuits several edges of W has length at most equal to the sum of the lengths of the edges of W that were short-circuited, by the triangle inequality. If we sum these inequalities over all steps of Z' we find that the length of Z' is at most equal to the length of W, which is in turn twice the length of the tree T.

Next consider the length of an optimum tour Z, and let e be some edge of Z. Then Z - e is a path that visits all of the cities. Since a path is a tree, Z - e is a spanning tree of the cities, hence Z - e is at least as long as T is, and so Z is surely at least as long as T is. If we put all of this together we find that

    length(Z) > length(Z - e) >= length(T) = (1/2) length(W) >= (1/2) length(Z')

as claimed (!)

More recently it has been proved (Christofides, 1976) that in polynomial time we can find a TSP tour whose total length is at most 3/2 as long as the minimum tour. The algorithm makes use of Edmonds's algorithm for maximum matching in a general graph (see the reference at the end of Chapter 3). It will be interesting to see if the factor 3/2 can be further refined.

Polynomial time algorithms are known for other NP-complete problems that guarantee that the answer obtained will not exceed, by more than a constant factor, the optimum answer. In some cases the guarantees apply to the difference between the answer that the algorithm gives and the best one. See the references below for more information.
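Steps (1)-(4) above can be sketched end to end. Two simplifications in this sketch are assumptions, not the text's wording: the Euler-circuit step is folded into a preorder traversal of the doubled tree, which produces the same short-circuited tour, and the cities are a small made-up example.

```python
from math import dist, inf
from collections import defaultdict

def mst(points):
    """Greedy (Prim-style) minimum spanning tree, as in procedure mst:
    repeatedly adjoin the outside point nearest to the tree."""
    n, in_tree, edges = len(points), {0}, []
    while len(edges) < n - 1:
        best = (inf, None, None)
        for v in range(n):             # v ranges over vertices not yet in T
            if v in in_tree:
                continue
            for w in in_tree:          # w ranges over vertices of T
                d = dist(points[v], points[w])
                if d < best[0]:
                    best = (d, v, w)
        d, v, w = best
        edges.append((v, w, d))
        in_tree.add(v)
    return edges

def shortcut_tour(edges):
    """Walk the doubled tree and short-circuit already-visited cities,
    which amounts to a preorder traversal of the tree from city 0."""
    nbr = defaultdict(list)
    for v, w, _ in edges:
        nbr[v].append(w)
        nbr[w].append(v)
    tour, stack = [], [0]
    while stack:
        v = stack.pop()
        if v not in tour:
            tour.append(v)
            stack.extend(nbr[v])
    return tour

points = [(0, 0), (1, 0), (2, 0), (0, 1)]
tree = mst(points)
tree_len = sum(e[2] for e in tree)
tour = shortcut_tour(tree)
tour_len = sum(dist(points[tour[i]], points[tour[i - 1]])
               for i in range(len(tour)))
print(tree_len, tour, tour_len)   # the tour is at most twice the tree length
```

Since an optimum tour minus one edge is itself a spanning tree, the printed tour length is also at most twice optimal, exactly as in the proof of theorem 5.8.1.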

Exercises for section 5.8

1. Consider the following algorithm:

procedure mst2(x: array of n points in the plane);
{allegedly finds a tree of minimum total length that visits every one of the given points}
if n = 1 then T := {x1}
else
    T := mst2(n - 1, x - xn);
    let u be the vertex of T that is nearest to xn;
    mst2 := T plus vertex xn plus edge (xn, u)
end.{mst2}

Is this algorithm a correct recursive formulation of the minimum spanning tree greedy algorithm? If so, then prove it, and if not, then give an example of a set of points where mst2 gets the wrong answer.

Bibliography

Before we list some books and journal articles it should be mentioned that research in the area of NP-completeness is moving rapidly, and the state of the art is changing all the time. Readers who would like updates on the subject are referred to a series of articles that have appeared in issues of the Journal of Algorithms in recent years. These are called 'NP-completeness: An ongoing guide.' They are written by David S. Johnson, and each of them is a thorough survey of recent progress in one particular area of NP-completeness research. They are written as updates of the first reference below.

The most complete reference on NP-completeness is

M. Garey and D. S. Johnson, Computers and Intractability; A guide to the theory of NP-completeness, W. H. Freeman and Co., San Francisco, 1979.

The above is highly recommended. It is readable, careful and complete.

The earliest ideas on the computational intractability of certain problems go back to

A. M. Turing, On computable numbers, with an application to the Entscheidungsproblem, Proc. London Math. Soc., Ser. 2, 42 (1936), 230-265.

Cook's theorem, which originated the subject of NP-completeness, is in

S. Cook, The complexity of theorem proving procedures, Proc. Third Annual ACM Symposium on the Theory of Computing, ACM, New York, 1971, 151-158.

The above paper is recommended both for its content and its clarity of presentation. After Cook's work was done, a large number of NP-complete problems were found by

R. M. Karp, Reducibility among combinatorial problems, in R. E. Miller and J. W. Thatcher, eds., Complexity of Computer Computations, Plenum, New York, 1972, 85-103.

The minimum spanning tree algorithm is due to

R. C. Prim, Shortest connection networks and some generalizations, Bell System Tech. J. 36 (1957), 1389-1401.

The approximate algorithm for the travelling salesman problem is in

D. Rosenkrantz, R. Stearns and P. Lewis, An analysis of several heuristics for the travelling salesman problem, SIAM J. Comp. 6 (1977), 563-581.

Another approximate algorithm for the Euclidean TSP, which guarantees that the solution found is no more than 3/2 as long as the optimum tour, was found by

N. Christofides, Worst case analysis of a new heuristic for the travelling salesman problem, Technical Report, Graduate School of Industrial Administration, Carnegie-Mellon University, Pittsburgh, 1976.

Journals that contain a good deal of research on the areas of this chapter include the Journal of Algorithms, the Journal of the Association for Computing Machinery, the SIAM Journal of Computing, Information Processing Letters, and the SIAM Journal of Discrete Mathematics.

The probabilistic algorithm for the Hamilton path problem can be found in

D. Angluin and L. Valiant, Fast probabilistic algorithms for Hamilton circuits and matchings, Proc. Ninth Annual ACM Symposium on the Theory of Computing, ACM, New York, 1977.

The result that the graph coloring problem can be done in constant average time is due to

H. S. Wilf, Backtrack: An O(1) average time algorithm for the graph coloring problem, Information Processing Letters 18 (1984), 119-122.

Further refinements of the above result can be found in

E. Bender and H. S. Wilf, A theoretical analysis of backtracking in the graph coloring problem, Journal of Algorithms 6 (1985), 275-282.

If you enjoyed the average numbers of independent sets and average complexity of backtrack, you might enjoy the subject of random graphs. An excellent introduction to the subject is

Edgar M. Palmer, Graphical Evolution: An introduction to the theory of random graphs, Wiley-Interscience, New York, 1985.

W. 227 Bentley. 149. 54 Berger. Bender. K. Cooley. E. 175. 99 cryptography 165 Cristoﬁdes. 165. 170. 193 Cherkassky. 134 divide 137 Dixon. 69 average complexity 57. 227 cut in a network 115 —. 103 Coppersmith. J. 227 Appel. 182 binomial coeﬃcients 35 —. L. J. 176 digraph 105 Dinic. S. growth of 38 blocking ﬂow 124 Burnside’s lemma 46 cardinality 35 canonical factorization 138 capacity of a cut 115 Carmichael numbers 158 certiﬁcate 171. 3 big oh 9 binary system 19 bin-packing 178 binomial theorem 37 bipartite graph 44. 177 domino problem 3 ‘easy’ computation 1 edge coloring 206 edge connectivity 132 132 . 211ﬀ. 176 coloring graphs 43 complement of a graph 44 complexity 1 —. 187. V. 226 Cook’s theorem 195ﬀ. capacity of 115 cycle 41 cyclic group 152 decimal system 19 decision problem 181 degree of a vertex 40 deterministic 193 Diﬃe. 176 Aho. M. 194-201. 108. 182. V. worst-case 4 connected 41 Cook. 224. D. N. D. J. D. H. R. 135 Chinese remainder theorem 154 chromatic number 44 chromatic polynomial 73 Cohen. 208-211. B.Index Index adjacent 40 Adleman. 103 Angluin. 164. E. A. backtracking 211ﬀ.

Hardy. 135 exponential growth 13 factor base 169 Fermat’s theorem 152. 134. 226 Karzanov. M. —. E. intractable 5 Johnson. 157 Eulerian circuit 41 Even. E. G. 2 Garey. 69 Hamiltonian circuit 41. M. 226 Karp. R. 183. S. D. 205. 76. R. Galil. augmentation 109 —. Z. complement of 44 —. E. 103 o 133 . 70. four-color theorem 68 Fourier transform 83ﬀ. complexity of 93 —. 206. discrete 83 —. 168 —. L. 103 Hu. 175 height of network 125 Hellman. 188. A. D. extended 144ﬀ. 188 geometric series 23 Gomory. value of 106 —. inverse 96 Fulkerson. complete 44 —. 144 ﬂow 106 —. 225. T. J. complexity 142 —. 211ﬀ. applications of 95 ﬀ. 216ﬀ. 51 Hopcroft. planar 70 greatest common divisor 138 group of units 151 Haken. C. 135 Gardner. H. 179. 107. M. Ford. S. 134. C. 208ﬀ. empty 44 —. D. 102 K¨nig. coloring of 43. 136 independent set 61. J. 224 Enslein. —. 103 Euclidean algorithm 140. bipartite 44 —. R. 159 FFT.Index Edmonds. 176 hexadecimal system 21 hierarchy of growth 11 Hoare. —. A. Fibonacci numbers 30. 107. W. 107ﬀ. connected 41 —. blocking 124 ﬂow augmenting path 109 Ford-Fulkerson algorithm 108ﬀ. K. 136 graphs 40ﬀ. 134 Knuth. Euler totient function 138. 107ﬀ. H. E.

228 Pan. M. max-ﬂow-min-cut 115 maximum matching 130 minimum spanning tree 221 moderately exponential growth 12 MPM algorithm 108. 162. 108ﬀ. 99 layered network 120ﬀ. V. 176 positional number systems 19ﬀ. 227 L’Hospital’s rule 12 little oh 8 Lomuto. V. strong 158 public key encryption 150. N. 135 matrix multiplication 77ﬀ. N. 172 Prim. A. dense 107 —. . Lenstra. 185 polynomials. J. A. 165 Quicksort 50ﬀ.. 103 134 . M. 149. Jr. C. P. W. 108ﬀ. H. O. 148ﬀ. 149. 156ﬀ. M. 128ﬀ. 135 Pratt. 108ﬀ. . 175 Lewis. . 171. 60 nondeterministic 193 NP 182 NP-complete 61. 176 LeVeque. octal system 21 optimization problem 181 orders of magnitude 6ﬀ. W. —. 103 Lewis. M. 103 Pascal’s triangle 36 path 41 periodic function 87 polynomial time 2. P 182 Palmer. P. C. —. E. S. —. R. 164.Index k-subset 35 language 182 Lawler. 180 NP-completeness 178ﬀ. 179. 227 primality. 120ﬀ. Rabin. A. W. Pramodh-Kumar. MST 221 multigraph 42 network 105 — ﬂow 105ﬀ. multiplication of 96 Pomerance. E. V. layered 108. 135 Malhotra. 175 Ralston. 186 —. 54 Maheshwari. height of 125 Nijenhuis. . proving 170 prime number 5 primitive root 152 pseudoprimality test 149. M. testing 6.

226 Turing machine 187ﬀ. 149. 176 roots of unity 86 Rosenkrantz. 221 tree 45 Trojanowski. V. R. 103 Turing. 180 Wright. 162. E. 176 Shamir. recurrent inequality 31 recursive algorithms 48ﬀ. A. 103 ‘TSP’ 178. V. 70. S. C. 221 Tukey. E. 66. 165. 103 Wilf. 216 Strassen. 227. D. 227 vertices 40 Vizing. 135 Θ (‘Theta of’) 10 tiling 2 tractable 5 travelling salesman problem 178. J. 60. R. H. 206 Wagstaﬀ. 78. 208-11. reducibility 185 relatively prime 138 ring Zn 151ﬀ. 227 RSA system 165. 176 splitter 52 Stearns. 103. A. 103. S. A. 164. 103 o Selfridge. Ullman. A. W. E. 149. 176 Runge. R. Rivest. 99 worst-case 4. 176 Welch. 176 synthetic division 86 3SAT 201 target sum 206 Tarjan. 175 135 . D. 162. 66. R. 184. 103. J. M. 228 Winograd. 165. D. 168 Rumely. 176 slowsort 50 Solovay. 195 scanned vertex 111 Sch¨nhage. J. 149. 103 usable edge 111 Valiant.Index recurrence relations 26ﬀ. L. 103 SAT 195 satisﬁability 187. 227 Stirling’s formula 16. R. P.
