CAMBRIDGE STUDIES IN
ADVANCED MATHEMATICS 73
EDITORIAL BOARD
B. BOLLOBÁS, W. FULTON, A. KATOK, F. KIRWAN,
P. SARNAK
RANDOM GRAPHS
Already published
1 W.M.L. Holcombe Algebraic automata theory
2 K. Petersen Ergodic theory
3 P.T. Johnstone Stone spaces
4 W.H. Schikhof Ultrametric calculus
5 J.-P. Kahane Some random series of functions, 2nd edition
6 H. Cohn Introduction to the construction of class fields
7 J. Lambek & P.J. Scott Introduction to higher-order categorical logic
8 H. Matsumura Commutative ring theory
9 C.B. Thomas Characteristic classes and the cohomology of finite groups
10 M. Aschbacher Finite group theory
11 J.L. Alperin Local representation theory
12 P. Koosis The logarithmic integral I
13 A. Pietsch Eigenvalues and s-numbers
14 S.J. Patterson An introduction to the theory of the Riemann zeta-function
15 H.J. Baues Algebraic homotopy
16 V.S. Varadarajan Introduction to harmonic analysis on semisimple Lie groups
17 W. Dicks & M. Dunwoody Groups acting on graphs
18 L.J. Corwin & F.P. Greenleaf Representations of nilpotent Lie groups and their applications
19 R. Fritsch & R. Piccinini Cellular structures in topology
20 H. Klingen Introductory lectures on Siegel modular forms
21 P. Koosis The logarithmic integral II
22 M.J. Collins Representations and characters of finite groups
24 H. Kunita Stochastic flows and stochastic differential equations
25 P. Wojtaszczyk Banach spaces for analysts
26 J.E. Gilbert & M.A.M. Murray Clifford algebras and Dirac operators in harmonic analysis
27 A. Fröhlich & M.J. Taylor Algebraic number theory
28 K. Goebel & W.A. Kirk Topics in metric fixed point theory
29 J.E. Humphreys Reflection groups and Coxeter groups
30 D.J. Benson Representations and cohomology I
31 D.J. Benson Representations and cohomology II
32 C. Allday & V. Puppe Cohomological methods in transformation groups
33 C. Soulé et al. Lectures on Arakelov geometry
34 A. Ambrosetti & G. Prodi A primer of nonlinear analysis
35 J. Palis & F. Takens Hyperbolicity, stability and chaos at homoclinic bifurcations
37 Y. Meyer Wavelets and operators I
38 C. Weibel An introduction to homological algebra
39 W. Bruns & J. Herzog Cohen-Macaulay rings
40 V. Snaith Explicit Brauer induction
41 G. Laumon Cohomology of Drinfeld modular varieties I
42 E.B. Davies Spectral theory and differential operators
43 J. Diestel, H. Jarchow & A. Tonge Absolutely summing operators
44 P. Mattila Geometry of sets and measures in euclidean spaces
45 R. Pinsky Positive harmonic functions and diffusion
46 G. Tenenbaum Introduction to analytic and probabilistic number theory
47 C. Peskine An algebraic introduction to complex projective geometry I
48 Y. Meyer & R. Coifman Wavelets and operators II
49 R. Stanley Enumerative combinatorics I
50 I. Porteous Clifford algebras and the classical groups
51 M. Audin Spinning tops
52 V. Jurdjevic Geometric control theory
53 H. Voelklein Groups as Galois groups
54 J. Le Potier Lectures on vector bundles
55 D. Bump Automorphic forms
56 G. Laumon Cohomology of Drinfeld modular varieties II
57 D.M. Clark & B.A. Davey Natural dualities for the working algebraist
59 P. Taylor Practical foundations of mathematics
60 M. Brodmann & R. Sharp Local cohomology
61 J.D. Dixon, M.P.F. Du Sautoy, A. Mann & D. Segal Analytic pro-p groups, 2nd edition
62 R. Stanley Enumerative combinatorics II
64 J. Jost & X. Li-Jost Calculus of variations
68 Ken-iti Sato Lévy processes and infinitely divisible distributions
71 R. Blei Analysis in integer and fractional dimensions
Random Graphs
Second Edition
BÉLA BOLLOBÁS
University of Memphis
and
Trinity College, Cambridge
CAMBRIDGE
UNIVERSITY PRESS
PUBLISHED BY THE PRESS SYNDICATE OF THE UNIVERSITY OF CAMBRIDGE
The Pitt Building, Trumpington Street, Cambridge, United Kingdom
http://www.cambridge.org
A catalogue record for this book is available from the British Library
QA166.17.B66 2001
511/.5-dc21 00-068952
Preface page xi
Notation xvii
1 Probability Theoretic Preliminaries 1
1.1 Notation and Basic Facts 1
1.2 Some Basic Distributions 5
1.3 Normal Approximation 9
1.4 Inequalities 15
1.5 Convergence in Distribution 25
2 Models of Random Graphs 34
2.1 The Basic Models 34
2.2 Properties of Almost All Graphs 43
2.3 Large Subsets of Vertices 46
2.4 Random Regular Graphs 50
3 The Degree Sequence 60
3.1 The Distribution of an Element of the Degree Sequence 60
3.2 Almost Determined Degrees 65
3.3 The Shape of the Degree Sequence 69
3.4 Jumps and Repeated Values 72
3.5 Fast Algorithms for the Graph Isomorphism Problem 74
4 Small Subgraphs 78
4.1 Strictly Balanced Graphs 79
4.2 Arbitrary Subgraphs 85
4.3 Poisson Approximation 91
5 The Evolution of Random Graphs-Sparse Components 96
5.1 Trees of Given Sizes As Components 96
5.2 The Number of Vertices on Tree Components 102
Preface
The theory of random graphs was founded by Erdős and Rényi (1959,
1960, 1961a, b) after Erdős (1947, 1959, 1961) had discovered that proba-
bilistic methods were often useful in tackling extremal problems in graph
theory. Erdős proved, amongst other things, that for all natural numbers
g ≥ 3 and k ≥ 3 there exist graphs with girth g and chromatic number
k. Erdős did not construct such graphs explicitly but showed that most
graphs in a certain class could be altered slightly to give the required
examples.
This phenomenon was not entirely new in mathematics, although it was
certainly surprising that probabilistic ideas proved to be so important
in the study of such a simple finite structure as a graph. In analysis,
Paley and Zygmund (1930a, b, 1932) had investigated random series of
functions.
One of their results was that if the real numbers c_n satisfy
Σ_{n=0}^∞ c_n² = ∞, then Σ_{n=0}^∞ ±c_n cos nx fails to be a Fourier-Lebesgue series
for almost all choices of the signs. To exhibit a sequence of signs with this
property is surprisingly difficult: indeed, no algorithm is known which
constructs an appropriate sequence of signs from any sequence (c_n) with
Σ c_n² = ∞. Following the initial work of Paley and Zygmund, random
functions were investigated in great detail by Steinhaus (1930), Paley,
Wiener and Zygmund (1932), Kac (1949), Hunt (1951), Ryll-Nardzewski
(1953), Salem and Zygmund (1954), Dvoretzky and Erdős (1959) and
many others. An excellent account of these investigations can be found in
Kahane (1963, 1968). Probabilistic methods were also used by Littlewood
and Offord (1938) to study the zeros of random polynomials and analytic
functions. Some decades later, a simple but crucial combinatorial lemma
from their work greatly influenced the study of random finite sets in
vector spaces.
The first combinatorial structures to be studied probabilistically were
tournaments, chiefly because random tournaments are intrinsically related
to statistics. The study began with Kendall and Babington-Smith (1940)
and a concise account of many of the results is given by Moon (1968b).
Szele (1943) was, perhaps, the first to apply probabilistic ideas to extremal
problems in combinatorics. He observed that some tournament of order n
must have at least n!/2^{n-1} Hamilton paths, because the expected number
of Hamilton paths is n!/2^{n-1}. Once again, it is not easy to construct a
tournament of order n with this many Hamilton paths. A little later,
Erdős (1947) used a similar argument, based on the expected number of
k-cliques in a graph of order n, to show that the Ramsey number R(k) is
greater than 2^{k/2}.
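Szele's expectation argument can be verified by exhaustive enumeration for small n; the following sketch (an illustration, not part of the original text) checks n = 4, where n!/2^{n-1} = 3, by averaging the number of Hamilton paths over all 2^6 tournaments on 4 vertices.

```python
from itertools import permutations, product
from math import factorial

def hamilton_paths(orient, n):
    # orient maps each pair (i, j), i < j, to True if the edge points i -> j
    count = 0
    for perm in permutations(range(n)):
        if all(orient[(min(a, b), max(a, b))] == (a < b)
               for a, b in zip(perm, perm[1:])):
            count += 1
    return count

n = 4
pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
total = 0
num_tournaments = 0
for bits in product([True, False], repeat=len(pairs)):
    orient = dict(zip(pairs, bits))
    total += hamilton_paths(orient, n)
    num_tournaments += 1

average = total / num_tournaments        # expected number of Hamilton paths
expected = factorial(n) / 2 ** (n - 1)   # Szele's formula n!/2^(n-1)
print(average, expected)
```

Each of the n! vertex orderings is a directed Hamilton path in exactly 2^{C(n,2)-(n-1)} tournaments, which is what the expectation computation exploits.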
Existence results based on probabilistic ideas can now be found in
many branches of mathematics, especially in analysis, the geometry of
Banach spaces, number theory, graph theory, combinatorics and com-
puter science. Probabilistic methods have become an important part of
the arsenal of a great many mathematicians. Nevertheless, this is only
a beginning: in the next decade or two probabilistic methods are likely
to become even more prominent. It is also likely that in the not too
distant future it will be possible to carry out statistical analyses of more
complicated systems. Mathematicians who are not interested in graphs
for their own sake should view the theory of random graphs as a modest
beginning from which we can learn a variety of techniques and can find
out what kind of results we should try to prove about more complicated
random structures.
As often happens in mathematics, the study of statistical aspects of
graphs was begun independently and almost simultaneously by several
authors, namely Ford and Uhlenbeck (1956), Gilbert (1957), Austin,
Fagen, Penney and Riordan (1959) and Erdős and Rényi (1959). Occa-
sionally all these authors are credited with the foundation of the theory
of random graphs. However, this is to misconceive the nature of the
subject. Only Erdős and Rényi introduced the methods which underlie
the probabilistic treatment of random graphs. The other authors were all
concerned with enumeration problems and their techniques were essen-
tially deterministic.
There are two natural ways of estimating the proportion of graphs
having a certain property. One may obtain exact formulae, using Pólya's
enumeration theorem, generating functions, differential operators and the
whole theory of combinatorial enumeration, and then either consider the
task completed or else proceed to investigate the asymptotic behaviour
of the exact but complicated formulae, which is often a daunting task.
The greatest discovery of Erdős and Rényi was that many important
properties of graphs appear quite suddenly. If we pick a function M =
M(n) then, in many cases, either almost every graph GM has property
Q or else almost every graph fails to have property Q. In this vague
sense, we have a 0-1 law. The transition from a property being very
unlikely to it being very likely is usually very swift. To make this more
precise, consider a monotone (increasing) property Q, i.e. one for which
a graph has Q whenever one of its subgraphs has Q. For many such
properties there is a threshold function Mo(n). If M(n) grows somewhat
slower than Mo(n), then almost every GM fails to have Q. If M(n) grows
somewhat faster than Mo(n), then almost every GM has the property Q.
For example, M_0(n) = ½n log n is a threshold function for connectedness
in the following sense: if ω(n) → ∞, no matter how slowly, then almost
every G_M is disconnected for M(n) = ½n(log n − ω(n)) and almost every G_M
is connected for M(n) = ½n(log n + ω(n)).
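A small simulation illustrates the sharpness of this threshold; the sketch below (not from the text, with arbitrary parameters and seed) samples graphs G_M just below and just above ½n log n edges and tests connectivity with a union-find.

```python
import math
import random

def random_graph_connected(n, M, rng):
    # sample a uniform graph G_M with n vertices and M edges, test connectivity
    all_edges = [(i, j) for i in range(n) for j in range(i + 1, n)]
    edges = rng.sample(all_edges, M)
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in range(n)}) == 1

rng = random.Random(0)
n, trials, omega = 400, 30, 3.0  # omega plays the role of omega(n)
M_low = int(n * (math.log(n) - omega) / 2)
M_high = int(n * (math.log(n) + omega) / 2)
frac_low = sum(random_graph_connected(n, M_low, rng) for _ in range(trials)) / trials
frac_high = sum(random_graph_connected(n, M_high, rng) for _ in range(trials)) / trials
print(frac_low, frac_high)
```

Below the threshold isolated vertices abound and essentially no sampled graph is connected; above it nearly every sample is connected.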
As the proportion M/N of edges increases, where, as always, N = n(n−1)/2
is the total number of possible edges, the shape of a typical graph
GM passes through several clearly identifiable stages, in which many of
the important parameters of a typical graph are practically determined.
When M is neither too small nor too large, then the most important
property is that, for every fixed k, any k vertices in a typical graph have
about the same number of neighbours. Thus a typical random graph
is rather similar to an ideal regular graph whose automorphism group
is transitive on small sets of vertices. Of course, there is no such non-
trivial regular graph, and in many applications random graphs are used
precisely because they approximate an ideal regular graph.
This book is the first systematic and extensive account of a substantial
body of results from the theory of random graphs. Considerably shorter
treatments of random graphs can be found in various parts of Erdős
and Spencer (1974), in Chapter 8 of Marshall (1971), in Chapter 7 of
Bollobás (1979a), and in the review papers by Grimmett (1980), Bollobás
(1981f) and Karoński (1982). Despite over 750 references, several topics
are not covered in any detail and we can make no claim to completeness.
Perhaps the greatest omission is the extensive theory of random trees,
about which the reader could get some idea by consulting the beautiful
book by Moon (1970a) and the papers by Moon and Meir listed in the
references. I might justify this particular omission because the tools used
in its study are often those of enumeration rather than probability theory.
However, here and in general, the choice of topics is mainly a reflection
of my own interests.
The audience I had in mind when writing this book consisted mainly of
research students and professional mathematicians. The volume should
also be of interest to computer scientists. Random graphs are of ever
increasing importance in this field and several of the sections have been
written expressly for computer scientists.
The monograph was planned to be considerably shorter and was to be
completed by early 1981. However, I soon realized that I had little control
over the length which was dictated by the subject matter I intended to
explore. Temptations to ease my load by adopting short-cuts have been
rife but, for the reader's sake, I have tried to resist them. During its
preparation I have given several courses on the material, notably for Part
III at the University of Cambridge in 1980/81 and 1983/84. The first
seven chapters were complete by the summer of 1982 and were circulated
quite widely. I gave a series of lectures on these at the Waterloo Silver
Jubilee Conference. These lectures were published only recently (1984d).
I have written the book wherever I happened to be: the greatest part at
Louisiana State University, Baton Rouge; a few chapters in Cambridge,
Waterloo and Sao Paulo; and several sections in Tokyo and Budapest. I
hope never again to travel with 200 lb of paper!
My main consideration in selecting and presenting the material was to
write a book which I would like to read myself. In spite of this, the book
is essentially self-contained, although familiarity with the basic concepts
of graph theory and probability theory would be helpful. There is little
doubt that many readers will use this monograph as a compendium of
results. This is a pity, partly because a book suitable for that purpose
could have been produced with much less effort than has gone into this
volume, and also because proofs are often more important than results:
not infrequently the reader will derive more benefit from knowing the
methods used than from familiarity with the theorems. The list of contents
describes fairly the material presented in the book.
The graph theoretic notation and terminology used in the book are
standard. For undefined concepts and symbols the reader may consult
Bollobás (1978a, 1979a).
The exercises at the end of each chapter vary greatly in importance
and difficulty. Most are designed to clarify and complete certain points
but a few are important results. The end of a proof, or its absence, is
indicated by the symbol □; the floor of x (i.e. the greatest integer less
than, or equal to, x) is denoted by ⌊x⌋; and the ceiling of x by ⌈x⌉.
With very few exceptions, the various parameters and random variables
depend on the number n of vertices of graphs under consideration and
the inequalities are only claimed to hold when n is sufficiently large.
This is often stated but it is also implicit on many other occasions.
The symbols c_1, c_2, ..., which appear without any explanation, are always
independent of n. They may be absolute constants or may depend on
other quantities which are independent of n. To assist the reader, it will
often be stated which of these is the case.
It is a great pleasure to acknowledge the debt I owe to many people
for their help. Several of my friends kindly read large sections of the
manuscript. Andrew Thomason from Exeter, Istvan Simon from Sao
Paulo, Masao Furuyama from Kyoto and Boris Pittel from Columbus
were especially generous with their help. They corrected many errors
and frequently improved the presentation. In addition, I benefited from
the assistance of Andrew Barbour, Keith Carne, Geoff Eagleson, Alan
Frieze and Jonathan Partington. Many research students, including Keith
Ball, Graham Brightwell and Colin Wright, helped me find some of the
mistakes; for the many which undoubtedly remain, I apologize. I am
convinced that without the very generous help of Andrew Harris in
using the computer the book would still be far from finished.
It was Paul Erdős and Alfréd Rényi who, just over 20 years ago,
introduced me to the subject of random graphs. My active interest in
the field was aroused by Paul Erdős some years later, whose love for it
I found infectious; I am most grateful to him for firing my enthusiasm
for the subject whose beauty has given me so much pleasure ever since.
Finally, I am especially grateful to Gabriella, my wife, for her constant
support and encouragement. Her enthusiasm and patience survived even
when mine failed.
Baton Rouge B.B.
December, 1984
Notation
P_k            property 44
P_λ            Poisson distribution 8
P_M            probability in G(n, M) 34
P_p            probability in G(n, p) 35
P_q            Paley graph 357
Q^n            n-dimensional cube 383
Q_p            random subgraph of Q^n 383
R(s), R(s, t)  Ramsey numbers 320
S_{n,p}        binomial r.v. 5
τ = τ_Q = τ_Q(G) = τ(G; Q)  hitting time 42
t(c)           function concerning the number
               of vertices on tree components 102
U_M = U_{n,M}  number of unlabelled graphs 229
u(c)           function concerning the number
               of tree components 109
χ(G)           chromatic number 296
X_n →d X       convergence in distribution 2
ω(n)           function tending to ∞ 43
1 Probability Theoretic Preliminaries
The aim of this chapter is to present the definitions, formulae and re-
sults of probability theory we shall need in the main body of the book.
Although we assume that the reader has had only a rather limited ex-
perience with probability theory and, if somewhat vaguely, we do define
almost everything, this chapter is not intended to be a systematic in-
troduction to probability theory. The main purpose is to identify the
facts we shall rely on, so only the most important-and perhaps not
too easily accessible-results will be proved. Since the book is primarily
for mathematicians interested in graph theory, combinatorics and com-
puting, some of the results will not be presented in full generality. It is
inevitable that for the reader who is familiar with probability theory this
introduction contains too many basic definitions and familiar facts, while
the reader who has not studied probability before will find the chapter
rather difficult.
There are many excellent introductions to probability theory: Feller
(1966), Breiman (1968), K. L. Chung (1974) and H. Bauer (1981), to
name only four. The interested reader is urged to consult one of these
texts for a thorough introduction to the subject.
A real-valued random variable (r.v.) X is a measurable real-valued function
on a probability space, X : Ω → ℝ.
Given a real-valued r.v. X, its distribution function is F(x) = P(X <
x), −∞ < x < ∞. Thus F(x) is monotone increasing, continuous from
the left, lim_{x→−∞} F(x) = 0 and lim_{x→∞} F(x) = 1. If there is a function
f(t) such that F(x) = ∫_{−∞}^{x} f(t) dt, then f(t) is the density function of
X. We say that a sequence of r.vs (Y_n) tends to X in distribution if
lim_{n→∞} P(Y_n < x) = P(X < x) = F(x), whenever x is a point of continuity
of F(x). The notation for convergence in distribution is X_n →d X. Of
course, convergence in distribution depends only on the distributions of
the r.vs in question.
If h is any real-valued function on ℝ, the expectation of h(X) is
E{h(X)} = ∫_{−∞}^{∞} h(x) dF(x).
In fact, one can do a little better, for by the Cauchy inequality with
Ω_0 = {ω : X(ω) ≠ 0} we have E(X)² ≤ P(Ω_0)E(X²).
If X takes values in {0, 1, ..., N}, then
    E_r(X) = Σ_{k=r}^{N} (k)_r P(X = k).
Note that if X denotes the number of objects in a certain class then Er(X)
is the expected number of ordered r-tuples of elements of that class.
The r.vs X_1, X_2, ... are said to be independent if for each n
    P(X_i = k_i, i = 1, ..., n) = Π_{i=1}^{n} P(X_i = k_i).
For independent r.vs the variance of the sum is the sum of the variances:
    σ²(Σ_i X_i) = E[{Σ_i (X_i − μ_i)}²] = Σ_i σ_i².
In our calculations we shall often need the following rather sharp form
of Stirling's formula proved by Robbins (1955):
    (2πn)^{1/2}(n/e)^n e^{1/(12n+1)} < n! < (2πn)^{1/2}(n/e)^n e^{1/(12n)}.   (1.4)
Hence, for 0 < m < n,
    e^{−1/(6m)} {n/(2πm(n−m))}^{1/2} n^n/{m^m(n−m)^{n−m}} < (n choose m)
        < {n/(2πm(n−m))}^{1/2} n^n/{m^m(n−m)^{n−m}}.   (1.5)
On putting p = m/n and q = 1 − p we find that if m → ∞ and n − m → ∞,
then
    (n choose m) ∼ (2π)^{−1/2}(p^p q^q)^{−n}(pqn)^{−1/2}   (1.6)
and, for 0 ≤ y < b ≤ a,
    {(b − y)/(a − y)}^y ≤ (b)_y/(a)_y ≤ (b/a)^y ≤ e^{−(1−b/a)y}.   (1.7)
and we say that S_{n,p} has a binomial distribution with parameters n and
p. By definition b(k; n, p) is the probability that we get k heads when
tossing a coin n times, provided the probability of getting a head is
p. Since E(X^{(i)}) = p and σ²(X^{(i)}) = E{(X^{(i)} − p)²} = pq, the binomial
distribution with parameters n and p has mean E(S_{n,p}) = pn and variance
σ²(S_{n,p}) = pqn.
It is easily seen that b(k; n, p) is largest when k approximates to pn, the
expectation of S_{n,p}. Indeed, note that
    b(k; n, p)/b(k − 1; n, p) = 1 + {(n + 1)p − k}/(kq).
Hence, if m is the unique integer satisfying p(n + 1) − 1 < m ≤ p(n + 1), then
the terms b(k; n, p) strictly increase up to k = m − 1 and strictly decrease
from k = m; furthermore, b(m − 1; n, p) < b(m; n, p) unless m = p(n + 1),
when equality holds. Consequently from Stirling's formula (1.6) it follows
that if p is fixed then, as n → ∞,
    max_k b(k; n, p) ∼ (2πpqn)^{−1/2}.
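As a numerical illustration (mine, not the book's), the largest binomial term can be compared with (2πpqn)^{−1/2} using exact log-probabilities, which avoids floating-point underflow for large n.

```python
from math import lgamma, log, exp, pi, sqrt

def log_b(k, n, p):
    # log of the binomial point probability b(k; n, p)
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + k * log(p) + (n - k) * log(1 - p))

n, p = 10_000, 0.3
q = 1 - p
m = int(p * (n + 1))  # the mode of b(k; n, p)
max_term = max(exp(log_b(k, n, p)) for k in range(m - 50, m + 51))
approx = 1 / sqrt(2 * pi * p * q * n)
ratio = max_term / approx
print(ratio)
```

The ratio is 1 + O(1/n), so for n = 10 000 it is already very close to 1.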
Relations (1.11) and (1.5) imply the following simple but useful bound
on the probability in the tail of the binomial distribution.
    b(k; n, p)/b(k − 1; n, p) ≤ (n − m)p/{(m + 1)q} = 1/u,
so
Then
d 1--p
dx log f(x) = -p log x - p(1 - px) log I p < 0,
1.2 Some Basic Distributions
    b(m; n, p) ≤ {n/(2πm(n − m))}^{1/2} f(u)^n. □   (1.12)
where p = R/N and q = 1 - p. As we are interested mostly in the
upper bound for qk, we show the second inequality. Consider the second
expression for qk and note that
N-(nN - n)R-k (R -N
since
N-R N-R N (k<
N-k N N-k - N1 .
A r.v. Y is said to have the Poisson distribution with mean λ > 0 if
    P(Y = k) = p(k; λ) = e^{−λ}λ^k/k!,  k = 0, 1, ....
This distribution, which we shall denote by P_λ or P(λ), is also closely
related to the binomial distribution. Indeed, if λ = pn and 0 < p < 1,
then
    b(k; n, p) = {(n)_k/k!} p^k (1 − p)^{n−k} ≤ p(k; λ)e^{pk}.   (1.13)
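The inequality in (1.13) and the closeness of b(k; n, p) to p(k; λ) are easy to check numerically; the sketch below (with arbitrarily chosen n and p, not from the text) uses exact binomial probabilities.

```python
from math import comb, exp, factorial

def b(k, n, p):
    # binomial point probability b(k; n, p)
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson(k, lam):
    # Poisson point probability p(k; lam)
    return exp(-lam) * lam ** k / factorial(k)

n, p = 2000, 0.001
lam = n * p  # lambda = 2
bound_holds = all(b(k, n, p) <= poisson(k, lam) * exp(p * k) + 1e-15
                  for k in range(30))
max_diff = max(abs(b(k, n, p) - poisson(k, lam)) for k in range(30))
print(bound_holds, max_diff)
```

For fixed λ = pn the pointwise difference is of order p, which is why the Poisson distribution approximates the binomial so well when p is small.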
    φ(t) = (2π)^{−1/2} e^{−t²/2}  and  Φ(x) = ∫_{−∞}^{x} φ(t) dt.
Our first aim is to give upper bounds for the terms b(k; n, p) of the
binomial distribution. The restrictions in the theorem below are for the
sake of convenience.
Theorem 1.2 Suppose pn ≥ 1 and 1 ≤ h ≤ qn/3. Then if k ≥ pn + h, we have
    b(k; n, p) < (2πpqn)^{−1/2} exp{−h²/(2pqn) + h/(qn) + h³/(p²n²)}.
Proof
Proof
We may and shall assume that k = pn + h. From inequality (1.5) we
find
+ (qn - h + 1) h + 2q 2 n 2 -+2q3) }
The required inequality follows by expanding and simplifying the expres-
sion on the right and taking into account the conditions pn ≥ 1 and
1 ≤ h ≤ qn/3. □
    P(S_{n,p} ≥ pn + h) < (pqn)^{1/2} h^{−1} (2π)^{−1/2} exp{−h²/(2pqn) + h/(qn) + h³/(p²n²)}.
Hence
    P(|S_{n,p} − pn| ≥ h) < (pqn)^{1/2} h^{−1} e^{−h²/(2pqn)}.
(ii) Suppose 0 < p < 1 and (pn)-/ 2 < < 6. Then
Proof (i) To deduce this from Theorem 1.2 we have to note simply that
h h3 1 71
pY~n2
pqn +_2log <- 2 o 2 0.22579.
2
<e-e~ pn/3q exP2 q {.1 -- 66n(-6 < •
Another upper bound for the probability in the tail is due to Chernoff
(1952). See also Okamoto (1958). For 0 < p < 1 and q = 1 − p, write H_p(x)
for the weighted entropy function: H_p(x) = x log(p/x) + (1 − x) log{q/(1 −
x)}, 0 < x < 1. Note that H_p(x) = H_q(1 − x). Then we have
and, equivalently,
    P(X ≥ k) ≤ exp{nH_p(k/n)}.
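As a sanity check (an illustration with arbitrary parameters, not from the text), the bound P(X ≥ k) ≤ exp{nH_p(k/n)} can be compared with exact binomial tails; note that H_p(k/n) ≤ 0 once k/n ≥ p.

```python
from math import comb, exp, log

def H(p, x):
    # the weighted entropy function H_p(x)
    q = 1 - p
    return x * log(p / x) + (1 - x) * log(q / (1 - x))

def upper_tail(k, n, p):
    # exact P(X >= k) for X ~ Bin(n, p)
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j)
               for j in range(k, n + 1))

n, p = 100, 0.3
ok = all(upper_tail(k, n, p) <= exp(n * H(p, k / n)) + 1e-12
         for k in range(31, n))  # k/n above p; k < n so H is defined
print(ok)
```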
Janson (2001) pointed out that from this it is easy to deduce that if
X_1, ..., X_n are independent Bernoulli random variables, X = Σ_{i=1}^{n} X_i
and E(X) = λ, then for t ≥ 0 we have
    P(X ≥ λ + t) ≤ exp{−t²/(2(λ + t/3))}
and
    P(X ≤ λ − t) ≤ exp{−t²/(2λ)}.
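These two inequalities can likewise be checked against exact binomial probabilities; an ad hoc sketch (not from the book) with n = 200, p = 0.05, so λ = 10:

```python
from math import comb, exp

def pmf(k, n, p):
    # exact binomial point probability
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p, lam = 200, 0.05, 10  # lam = n * p
upper_ok = lower_ok = True
for t in range(1, 40):
    upper = sum(pmf(k, n, p) for k in range(lam + t, n + 1))
    lower = sum(pmf(k, n, p) for k in range(0, lam - t + 1))
    upper_ok = upper_ok and upper <= exp(-t * t / (2 * (lam + t / 3))) + 1e-12
    lower_ok = lower_ok and lower <= exp(-t * t / (2 * lam)) + 1e-12
print(upper_ok, lower_ok)
```

The extra t/3 in the upper-tail denominator reflects the skewness of the distribution: the upper tail is heavier than the lower one.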
As our next result shows, the bound in Theorem 1.2 is essentially best
possible, provided h = o{(pqn)^{2/3}}.
    b(k; n, p) > (2πpqn)^{−1/2} exp{−h²/(2pqn) − h³/(2q²n²) − h⁴/(3p³n³) − h/(2pn)}.
1.3 Normal Approximation
Hence
    (2πpqn)^{1/2} b(k; n, p) > e^{−β} (pn/k)^{k+1/2} {qn/(n − k)}^{n−k+1/2}
        = e^{−β} {1 + h/(pn)}^{−k−1/2} {1 − h/(qn)}^{k−n−1/2}.
The estimates in Theorems 1.2 and 1.5 can be put together to give an
approximation of the binomial distribution by the normal distribution.
From the formulae so far it is clear that the most convenient measure of
k − pn is in multiples of the standard deviation σ = (pqn)^{1/2}.
(i) Suppose h_1 < h_2, h_2 − h_1 → ∞ as n → ∞ and |h_1| + h_2 = o{(pqn)^{2/3}}.
Put h_i = x_i(pqn)^{1/2}, i = 1, 2. Then
    P(S_{n,p} ≥ pn + h) ∼ {(2π)^{1/2} x}^{−1} e^{−x²/2}.
Proof Let us assume for the moment that 1 ≤ h_1 < h_2. Then
    P(pn + h_1 ≤ S_{n,p} ≤ pn + h_2) = Σ_{h_1 ≤ h ≤ h_2} b(pn + h; n, p)
        ≤ (2πpqn)^{−1/2} Σ_{h_1 ≤ h ≤ h_2} e^{−h²/(2pqn)} {1 + o(1)},
where in the sums we assume that h runs over the values making pn + h
an integer. An analogous lower bound is obtained by Theorem 1.5, so (i)
is proved. Now to lift the restriction 1 ≤ h_1, note simply that the omission
of two terms from Σ_{h_1 ≤ h ≤ h_2} b(pn + h; n, p) leaves a sum asymptotic to the
original.
The second part can be proved analogously. It can also be deduced
from the first part by putting h_1 = h and h_2 = h_1 + (pqn)^{5/8}, say. The last
assertion is a consequence of (1.15′). □
Theorem 1.7 (i) Suppose 0 < p ≤ 1, ε²pqn ≥ 12 and 0 < ε ≤ 1/12. Then
    P(|S_{n,p}/(pn) − 1| ≥ ε) ≤ e^{−ε²pn/3}.
log v)
    g_J(A_1, ..., A_n) = Π_{i∈J} A_i Π_{i∉J} (1 − A_i),  J ⊆ {1, ..., n},
where the summation is over all 2^n subsets J ⊆ {1, ..., n} and c_J depends
only on J and the b_i.
Given a set J_0 ⊆ {1, ..., n}, choose B_1, ..., B_n such that P(B_i) = 1 if
i ∈ J_0 and P(B_i) = 0 if i ∉ J_0. Then P{g_J(B_1, ..., B_n)} = 0 unless J = J_0,
and P{g_{J_0}(B_1, ..., B_n)} = 1. Thus with this choice of the B_i the left-hand
side of (1.16′) is c_{J_0}. Consequently c_J ≥ 0 for every set J, so (1.16′) does hold
for every choice of the events B_1, ..., B_n. □
    A_J = ∩_{j∈J} A_j  and  A_∅ = Ω.
Furthermore, set
    p_J = P(A_J)
and
    S_r = Σ_{|J|=r} p_J,  r = 0, 1, ..., n.
    P(∪_{i=1}^{n} A_i) = Σ_{r=1}^{n} (−1)^{r+1} S_r.   (1.17)
Equivalently,
To see (1.17) note that if P(A_1) = ... = P(A_l) = 1 and P(A_{l+1}) = ... =
P(A_n) = 0, then S_r = (l choose r) and
    P(∪ A_i) − Σ_{r=1}^{n} (−1)^{r+1} S_r = Σ_{r=0}^{l} (−1)^r (l choose r) = (1 − 1)^l = 0.
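Identity (1.17) is easy to verify mechanically on a small explicit probability space; the events below are arbitrary choices, and exact rational arithmetic keeps the check free of rounding error.

```python
from itertools import combinations
from fractions import Fraction

# a small explicit probability space: uniform measure on {0, ..., 11}
prob = Fraction(1, 12)
events = [
    {0, 1, 2, 3, 4},   # A_1
    {3, 4, 5, 6},      # A_2
    {0, 4, 6, 7, 8},   # A_3
]

def P(event):
    return prob * len(event)

n = len(events)
union = set().union(*events)
S = [None] * (n + 1)
for r in range(1, n + 1):
    # S_r = sum of P(A_J) over all r-element index sets J
    S[r] = sum(P(set.intersection(*combo))
               for combo in combinations(events, r))
rhs = sum((-1) ** (r + 1) * S[r] for r in range(1, n + 1))
print(P(union) == rhs)
```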
Given sets J ⊆ L ⊆ N = {1, ..., n}, let us apply (1.17′) to the sets
A_i ∩ A_J, i ∈ M = L − J:
    P{∪_{i∈M} (A_i ∩ A_J)} =
        Σ_{I⊆M, I≠∅} (−1)^{|I|+1} P(A_I ∩ A_J) = Σ_{I≠∅} (−1)^{|I|+1} p_{I∪J}.
Since the probability above is non-negative and at most p_J, we have
    (−1)^l {s + Σ_{k=1}^{n} (−1)^k x_k} ≥ 0
for every l, l = 0, 1, ..., n. (Note that in this case we have x_k ≥ 0.)
Proof By Theorem 1.8 it suffices to prove the assertions in the case when,
for some l, 0 ≤ l ≤ n, l of the sets A_i have probability 1 and the others
have probability 0. Then S_r = (l choose r) and p_k = δ_{lk}. If l < k then all terms in
(1.19) are 0 and if l = k then (1.19) is of the form 1 = 1 − 0 + 0 − 0 ....
Assume that l > k. Then p_k = 0 and
    Σ_{r=k}^{k+m} (−1)^{r−k} (r choose k) S_r = Σ_{r=k}^{k+m} (−1)^{r−k} (r choose k)(l choose r)
        = (l choose k) Σ_{r=k}^{k+m} (−1)^{r−k} (l − k choose r − k).
Corollary 1.11 Let X be a r.v. with values in {0, 1, ..., n} and let E_r(X)
be the rth factorial moment of X. Then
Proof Let A_i be the event {X ≥ i}, i = 1, ..., n, and define the numbers
S_0, S_1, ..., S_n as above. Then clearly
    S_r = E_r(X)/r!   (1.20)
and {X = k} is the event that exactly k of the A_i occur. Hence the result
follows from Theorem 1.10. □
In the proof above the value of X is the number of Ai's that occur. In
fact (1.20) holds whenever X is defined from an arbitrary set of Ai's in
this way.
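The relation S_r = E_r(X)/r! and the resulting inclusion-exclusion expansion of P(X = k) in terms of the S_r can be checked for any small distribution; a sketch (with an arbitrary pmf, not from the book):

```python
from fractions import Fraction
from math import comb, factorial

# a small integer-valued distribution on {0, 1, ..., 4}
pmf = [Fraction(1, 10), Fraction(3, 10), Fraction(2, 10),
       Fraction(3, 10), Fraction(1, 10)]
n = len(pmf) - 1

def falling(k, r):
    # the falling factorial (k)_r
    out = 1
    for i in range(r):
        out *= k - i
    return out

def E_r(r):
    # rth factorial moment E((X)_r)
    return sum(falling(k, r) * pmf[k] for k in range(n + 1))

S = [E_r(r) / factorial(r) for r in range(n + 1)]
ok = all(pmf[k] == sum((-1) ** (r - k) * comb(r, k) * S[r]
                       for r in range(k, n + 1))
         for k in range(n + 1))
print(ok)
```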
Corollary 1.12 Let X be a non-negative integer valued r.v. whose rth fac-
torial moment Er(X) = E{(X)r} is finite for r = 1,... R. Suppose s,
1.4 Inequalities
Corollary 1.13 Let X be a non-negative integer valued r.v. with finite mo-
ments. If
    lim_{r→∞} E_r(X) r^m/r! = 0
for all m, then
One has similar results for probabilities of the form P(X >_ k). The
following is the analogue of Corollary 1.11. The straightforward proof is
omitted.
S X"Z
....,iý
i . a xil...,'a -
is at least 0 if for each j either c_j = b_j or a_j ≤ c_j < b_j and c_j − a_j is even,
and it is at most 0 if for each j either c_j = b_j or a_j ≤ c_j < b_j and c_j − a_j
is odd.
Corollary 1.15 Let X_1, ..., X_m be random variables with values in {0, 1, ..., n}
and let
    E(r_1, ..., r_m) = E{(X_1)_{r_1} ... (X_m)_{r_m}}.
Then
and the sum satisfies the alternating inequalities. Here r = Σ_i r_i and
k = Σ_i k_i.
so
    1 − p_0 = (3m_1 − m_2)/2 + (1/2) Σ_{i≥1} (i² − 3i + 2) p_i.
Since
    2i(i − 2) ≤ 3(i − 1)(i − 2) = 3(i² − 3i + 2),  i = 1, 2, ...,
we have
    Σ_{i≥1} (i² − 3i + 2) p_i ≥ (2/3) Σ_{i≥1} (i² − 2i) p_i = (2/3)(m_2 − 2m_1).
Hence
    1 − p_0 ≥ (3m_1 − m_2)/2 + max{0, (m_2 − 2m_1)/3},
which implies our inequalities. It shows also that the first inequality in
the theorem holds unconditionally, which is no surprise since it is a
particular case of Corollary 1.12.
To see that the inequalities are best possible, in the first case de-
fine the distribution of X by p_0 = 1 − (3m_1 − m_2)/2, p_1 = 2m_1 − m_2,
p_2 = (m_2 − m_1)/2, 0 = p_3 = p_4 = ..., and in the second case by
p_0 = 1 − (5m_1 − m_2)/6, p_1 = 0, p_2 = (3m_1 − m_2)/2, p_3 = (m_2 − 2m_1)/3,
0 = p_4 = p_5 = .... The conditions on m_1 and m_2 imply that these choices
are possible. It is easily checked that the moments are as required. □
P(Ā_2 ∩ ... ∩ Ā_{d+1} | Ā_{d+2} ∩ ... ∩ Ā_m) = P(Ā_2 | Ā_3 ∩ ... ∩ Ā_m) P(Ā_3 | Ā_4 ∩ ... ∩ Ā_m)
    ... P(Ā_{d+1} | Ā_{d+2} ∩ ... ∩ Ā_m) ≥ Π_{j∈J(1)} (1 − γ) = (1 − γ)^d,
Theorem 1.18 Let A_1, A_2, ..., A_m be events with dependence graph Γ. Sup-
pose 0 < γ < 1 and
    P(A_i) ≤ γ(1 − γ)^{d(i)−1}
for every i. Then
    P(∩_{i=1}^{m} Ā_i) > 0.
Proof The first part follows by imitating the proof of Theorem 1.17, with
J(i) = Γ(i). Note that if j is a neighbour of vertex 1, then j has degree
d(j) − 1 in Γ − {1}, so with d = d(1) and the notation of the proof of
Theorem 1.17,
    P(Ā_j | Ā_{j+1} ∩ ... ∩ Ā_m) ≥ 1 − P(A_j)(1 − γ)^{−d(j)+1} ≥ 1 − γ,
as claimed.
The second part follows by putting γ = 1/Δ. □
Note that {(Δ − 1)/Δ}^{Δ−1} > 1/e if Δ ≥ 2, so if Δ(Γ) = Δ ≥ 2 and
P(A_i) ≤ 1/(eΔ) for every i, then P(∩_{i=1}^{m} Ā_i) > 0. In the original paper
of Erdős and Lovász (1975) this was proved under the assumption that
P(A_i) ≤ 1/(4Δ).
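The numerical remark is quickly confirmed (a trivial check, not from the book): the quantity {(Δ − 1)/Δ}^{Δ−1} stays above 1/e, so the condition P(A_i) ≤ 1/(eΔ) is indeed stronger than the hypothesis of Theorem 1.18 with γ = 1/Δ.

```python
from math import e

# ((D - 1)/D)^(D - 1) for a range of maximum degrees D
vals = {D: ((D - 1) / D) ** (D - 1) for D in range(2, 200)}
all_above = all(v > 1 / e for v in vals.values())
# hence 1/(e*D) <= (1/D) * ((D - 1)/D)^(D - 1), the bound with gamma = 1/D
implied = all(1 / (e * D) <= (1 / D) * vals[D] for D in range(2, 200))
print(all_above, implied)
```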
The Lovász Local Lemma enables us to show that certain unlikely
events do occur with positive probability, even when this probability is
exponentially small. In some applications, mere existence is frequently
Theorem 1.19 Let (X_k)_0^l be a martingale such that |X_k − X_{k−1}| ≤ c_k, k =
1, ..., l, for some constants c_1, ..., c_l. Then, for every t ≥ 0,
for some positive constants c_1, ..., c_k. Then the random variable X =
    P(X ≤ E X − t) ≤ exp{−2t²/Σ_k c_k²}
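For a sum of n independent Bernoulli(1/2) variables, changing any one variable changes the sum by at most 1 (each c_k = 1, so Σ c_k² = n), and the bound exp{−2t²/Σ c_k²} can be compared with the exact lower tail; a sketch, not from the text:

```python
from math import comb, exp

n = 60
mean = n / 2  # E X for X ~ Bin(n, 1/2); each c_k = 1, so sum c_k^2 = n

def lower_tail(t):
    # exact P(X <= mean - t)
    cutoff = int(mean - t)
    return sum(comb(n, k) for k in range(cutoff + 1)) / 2 ** n

ok = all(lower_tail(t) <= exp(-2 * t * t / n) + 1e-12 for t in range(1, 31))
print(ok)
```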
and
    lim_{r→∞} E_r(X_n) r^m/r! = 0,  m = 0, 1, ....
Then
    X_n →d X.
Then
    d(X_n, P_λ) → 0.
This simple result will often be used in the main body of the book [see
also Bender (1974a) for many other applications and refinements of the
method]. We shall also need the analogue of Theorem 1.22 for several
sequences of r.vs. The result below is implied by Corollary 1.15.
Then X_1(n), ..., X_m(n) are asymptotically independent Poisson random vari-
ables with means λ_1, ..., λ_m, that is
    M_n = ∫ x^n dF.
If all moments exist and
Theorem 1.24 Let (X_n) be a sequence of r.vs and let λ_n → ∞. Suppose
    E_r(X_n) − λ_n^r = o(λ_n^{r−1/2})
for every fixed r. Then
    (X_n − λ_n)λ_n^{−1/2} →d N(0, 1).
    E{(X_n − λ_n)^k λ_n^{−k/2}} = Σ_{r=0}^{k} Σ_{t=0}^{k} c_{k,r,t} E_r(X_n) λ_n^{t−k/2}   (1.21)
    E(Z_n^k) = Σ_{r=0}^{k} Σ_{t=0}^{k} c_{k,r,t} λ_n^{r+t−k/2} → m_k,
where m_k is the kth moment of N(0, 1). Therefore (1.21) and our assump-
tion on E_r(X_n) imply that the kth moments of the sequence (X_n − λ_n)λ_n^{−1/2}
also converge to m_k.
Suppose
    s_n^{−2} Σ_{k=1}^{n} ∫_{|y|≥ts_n} y² dF_k(y) → 0   (1.22)
Corollary 1.26 Suppose (X_k) and (s_n) are as in Theorem 1.25, and for some
δ > 0
    Σ_{k=1}^{n} E(|X_k|^{2+δ}) = o(s_n^{2+δ}).   (1.23)
Then the Lindeberg condition (1.22) is satisfied and
Theorem 1.27 Let f_1(x_1, ..., x_N), f_2(x_1, ..., x_N), ..., f_r(x_1, ..., x_N) be non-
negative functions, non-decreasing in each variable x_i. Then if X_1, ..., X_N
are independent r.vs, we have
    E{f_1(X_1, ..., X_N) ... f_r(X_1, ..., X_N)} ≥ E{f_1(X_1, ..., X_N)} ... E{f_r(X_1, ..., X_N)}.
It is easily seen that the general case follows from the case r = 2 and
N = 1 (Ex. 21). By replacing f_j(x_1, ..., x_N) by f_j(−x_1, ..., −x_N) we see
that the same inequality holds if the functions are non-increasing in each
variable x_i.
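The case r = 2, N = 1 can be checked exactly for a small discrete distribution; the functions below are arbitrary non-decreasing choices (an illustration, not from the book).

```python
from fractions import Fraction

values = [0, 1, 2, 3]          # X_1 uniform on these values
p = Fraction(1, 4)

def f1(x):
    return x * x               # non-negative, non-decreasing

def f2(x):
    return max(x - 1, 0)       # non-negative, non-decreasing

def E(g):
    return sum(p * g(x) for x in values)

lhs = E(lambda x: f1(x) * f2(x))
rhs = E(f1) * E(f2)
print(lhs, rhs, lhs >= rhs)
```

Intuitively, both functions are large on the same (upper) part of the range, so their product has expectation at least the product of expectations.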
Theorems 1.22 and 1.24 give rather simple-minded, though very useful,
methods for showing approximations by Poisson and normal distribu-
tions. In the context of random graphs, a considerably more sophisticated
method was used by Barbour (1982). This approach is based on that of
C. Stein (1970) for the normal distribution and that of Chen (1975) and
Barbour and Eagleson (1982) for the Poisson distribution.
The approximation by the Poisson distribution rests on an ingenious
observation we shall state as a theorem. For A ⊆ ℤ⁺ and λ > 0 let
x = x_{A,λ} be the real-valued function on ℤ⁺ defined by x(0) = 0 and
Here
Cm = {0, 1. ,
and
P;(B) = e-' ZE k/k!
kcB
is the probability that a Poisson r.v. with mean ) has its value in B.
Proof For a proof of (i) and (ii) we refer the reader to Barbour and Eagleson (1982). Although the proof of (iii) is even more straightforward, we give it here, since the significance of $x$ is exactly that with the aid of $x$ we can express the difference $P(Y \in A) - P_\lambda(A)$ in a pleasant form.
Assertion (iii) is equivalent to the relation
\[
\lambda x(k+1) - kx(k) = \begin{cases} 1 - P_\lambda(A), & \text{if } k \in A, \\ -P_\lambda(A), & \text{if } k \notin A. \end{cases} \tag{1.24}
\]
To see (1.24) note that if $k \ge 1$, then
\[
\lambda x(k+1) - kx(k) = \lambda^{-k} e^{\lambda} k!\,\{P_\lambda(A \cap C_k) - P_\lambda(A)P_\lambda(C_k) - P_\lambda(A \cap C_{k-1}) + P_\lambda(A)P_\lambda(C_{k-1})\}
\]
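The defining formula for $x_{A,\lambda}$ and the key relation (1.24) can be checked numerically. The sketch below (the function names `poisson_prob` and `stein_solution` are mine, not from the text) evaluates the solution and verifies that $\lambda x(k+1) - kx(k) = \mathbf{1}_A(k) - P_\lambda(A)$:

```python
import math

def poisson_prob(lam, B, cutoff=200):
    # P_lambda(B) = e^{-lam} * sum_{k in B} lam^k / k!
    return sum(math.exp(-lam) * lam**k / math.factorial(k)
               for k in B if k <= cutoff)

def stein_solution(lam, A, k, cutoff=200):
    # x(0) = 0; for k >= 1,
    # x(k) = (k-1)! lam^{-k} e^{lam} {P(A ∩ C_{k-1}) - P(A) P(C_{k-1})},
    # where C_m = {0, 1, ..., m}.
    if k == 0:
        return 0.0
    C = set(range(k))          # C_{k-1}
    PA = poisson_prob(lam, A, cutoff)
    return (math.factorial(k - 1) * lam**(-k) * math.exp(lam)
            * (poisson_prob(lam, A & C, cutoff)
               - PA * poisson_prob(lam, C, cutoff)))

lam = 2.5
A = {0, 2, 3, 7}
PA = poisson_prob(lam, A)
for k in range(12):
    lhs = lam * stein_solution(lam, A, k + 1) - k * stein_solution(lam, A, k)
    rhs = (1.0 if k in A else 0.0) - PA
    assert abs(lhs - rhs) < 1e-9
```

The identity holds for every $k$, which is exactly what makes $x$ useful: summing it against the distribution of $Y$ turns $P(Y \in A) - P_\lambda(A)$ into an expectation of $\lambda x(Y+1) - Yx(Y)$.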
Exercises
1.1 Prove inequalities (1.8), (1.9) and (1.11).
1.2 Check that $\varphi(x)$ is indeed a density function.
1.3 Let $Y$ be a geometric r.v. with mean $q/p$, i.e. let $P(Y = k) = q^k p$, $k = 0, 1, \ldots$. Prove that $E(Y) = q/p$, $\sigma^2(Y) = q/p^2$ and $E_r(Y) = r!(q/p)^r$.
1.4 Show that an exponential r.v. with parameter $\lambda > 0$ has mean $1/\lambda$ and variance $1/\lambda^2$.
1.5 Let $(Y_n)$ be a sequence of geometric r.vs with $Y_n$ having mean $(1 - p_n)/p_n$. Show that if $p_n \to 0$, then $p_n Y_n \xrightarrow{d} L$, where $L$ is an exponential r.v. with mean 1.
1.6 Let $X$ and $Y$ be independent Poisson r.vs with means $\lambda$ and $\mu$. Prove that $X + Y$ has Poisson distribution with mean $\lambda + \mu$.
1.7 Prove formula (1.15). [See Feller (1966, vol. I, p. 179) for a proof of the case $l = 0$.]
1.8 Prove that the $r$th absolute moment of $N(0, 1)$ is $(r-1)!! = (r-1)(r-3)\cdots 3\cdot 1 \sim \sqrt{2}\,(r/e)^{r/2}$ if $r$ is even, and $(2/\pi)^{1/2}\, 2^{(r-1)/2}\{(r-1)/2\}! \sim \sqrt{2}\,(r/e)^{r/2}$ if $r$ is odd. Indeed,
\[
\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} |x|^r e^{-x^2/2}\,dx
= \frac{2}{\sqrt{2\pi}} \int_0^{\infty} x^r e^{-x^2/2}\,dx
= \frac{1}{\sqrt{\pi}} \int_0^{\infty} t^{-1/2}(2t)^{r/2} e^{-t}\,dt
= \frac{2^{r/2}}{\sqrt{\pi}} \int_0^{\infty} t^{(r-1)/2} e^{-t}\,dt
= \frac{2^{r/2}}{\sqrt{\pi}}\, \Gamma\!\left(\frac{r+1}{2}\right).
\]
\[
\int_0^{\infty} x^n e^{-x^2/2}\,dx \le (n!)^{1/2}.
\]
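The closed form in Exercise 1.8 can be checked against the gamma-function expression $E|X|^r = 2^{r/2}\Gamma((r+1)/2)/\sqrt{\pi}$. A small sketch (the helper names are mine):

```python
from math import gamma, pi, sqrt

def abs_moment(r):
    # E|X|^r = 2^{r/2} Γ((r+1)/2) / √π for X ~ N(0, 1)
    return 2 ** (r / 2) * gamma((r + 1) / 2) / sqrt(pi)

def double_factorial(m):
    # m!! = m (m-2) (m-4) ... down to 1 or 2
    out = 1
    while m > 1:
        out *= m
        m -= 2
    return out

# even r: the moment equals (r-1)!!
for r in range(2, 12, 2):
    assert abs(abs_moment(r) - double_factorial(r - 1)) < 1e-9 * double_factorial(r - 1)

# odd case r = 1: E|X| = (2/π)^{1/2}
assert abs(abs_moment(1) - sqrt(2 / pi)) < 1e-12
```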
holds in every probability space and for all choices of the $A_i$'s iff it holds for all choices of the events in the space $\{x_1, x_2\}$, $P(x_1) = P(x_2) = 1/2$. (Galambos and Rényi, 1968.)
1.15 Let $L$ be the lattice of all subsets of $[n] = \{1, 2, \ldots, n\}$ and let $\mathscr{P}_n$ be the set of probability measures on $[n]$ such that no point gets measure 0. Then every $P \in \mathscr{P}_n$ can be considered as a lattice homomorphism $L \to [0, 1]$, $A \mapsto P(A)$. Finally, let us say that two maps $\varphi, \psi : L \to [0, 1]$ are homotopic if $\varphi(A) \le \varphi(B)$ iff $\psi(A) \le \psi(B)$.
Is it true that a lattice homomorphism $\mu : L \to [0, 1]$ is homotopic to $P$ for some $P \in \mathscr{P}_n$ iff
\[
P(S_{n,p} \ge M) = \sum_{k=M}^{n} \binom{n}{k} p^k q^{n-k}
\]
$i = 1, \ldots, m$, then
\[
P\left(\bigcap_{j=1}^{m} A_j\right) \ge \prod_{j=1}^{m}\left(1 - \delta_j P(A_j)\right) > 0
\]
and so
\[
E\{f_1(X_1) f_2(X_1)\} \ge E\{f_1(X_1)\}\, E\{f_2(X_1)\}.
\]
2.1 The Basic Models 35
Proof (i) Let us select the $M_2$ edges one by one. The resulting graph will certainly have property $Q$ if the graph formed by the first $M_1$ edges alone has the property.
(ii) Put $p = (p_2 - p_1)/(1 - p_1)$. Choose independently $G_1 \in \mathscr{G}(n, p_1)$ and $G \in \mathscr{G}(n, p)$ and set $G_2 = G_1 \cup G$. Then the edges of $G_2$ have been selected independently and with probability $p_1 + p - p_1 p = p_2$, so $G_2$ is exactly an element of $\mathscr{G}(n, p_2)$. As $Q$ is monotone increasing, if $G_1$ has $Q$ so does $G_2$. Hence $P_{p_1}(Q) \le P_{p_2}(Q)$, as claimed. $\Box$
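The two-round exposure in part (ii) is easy to simulate. The sketch below (function name mine) builds $G_1$ and $G_2$ by one pass over the potential edges and checks both the probability identity $p_1 + p - p_1 p = p_2$ and the coupling $G_1 \subseteq G_2$ that drives the monotonicity argument:

```python
import random

def two_round(n, p1, p2, rng):
    # Expose each potential edge twice: first with probability p1 (giving G1),
    # then independently with p = (p2 - p1)/(1 - p1); G2 is the union.
    p = (p2 - p1) / (1 - p1)
    edges1, edges2 = set(), set()
    for i in range(n):
        for j in range(i + 1, n):
            first = rng.random() < p1
            second = rng.random() < p
            if first:
                edges1.add((i, j))
            if first or second:
                edges2.add((i, j))
    return edges1, edges2

p1, p2 = 0.2, 0.5
p = (p2 - p1) / (1 - p1)
assert abs(p1 + p - p1 * p - p2) < 1e-12   # combined edge probability is exactly p2

rng = random.Random(0)
for _ in range(100):
    e1, e2 = two_round(8, p1, p2, rng)
    assert e1 <= e2    # G1 ⊆ G2: any monotone increasing property transfers
```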
\[
P_p(Q) = \sum_{A \in Q} p^{|A|}(1 - p)^{N - |A|}.
\]
The second part of Theorem 2.1 says that $P_p(Q)$ is a monotone increasing function of $p$. Let us give another proof of this fact. This proof is somewhat pedestrian, but fairly informative. We may assume that $Q$ is non-trivial, i.e. $\emptyset \notin Q$ and $K^n \in Q$. Set $h_k = |\{A \in Q : |A| = k\}|$, the number of graphs in $Q$ with exactly $k$ edges. Then $h_0 = 0$, $h_N = 1$,
\[
P_p(Q) = \sum_{k=0}^{N} h_k\, p^k (1 - p)^{N - k},
\]
Theorem 2.2 (i) Let Q be any property and suppose pqN --+ oo. Then the
following two assertions are equivalent.
"PP(Q) = Z (N)pmqN-mpm(Q)
(N pmqN-MpM(Q)
1 1
> PM(Q)[e /6M(27rpqN) /2]-l.
This result shows that if we know $P_M(Q)$ with fair accuracy for every $M$ close to $pN$, then we know $P_p(Q)$ with comparable accuracy. The converse is far from true for a general property. As a trivial example, let $Q$ be the property that the graph has an even number of edges. Then $P_p(Q) \to 1/2$ whenever $pqN \to \infty$. On the other hand, if $M = 2\lfloor pN/2 \rfloor$ then $0 \le pN - M < 2$ and $P_M(Q) = 1$ for every $n$.
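The mixture identity $P_p(Q) = \sum_M \binom{N}{M} p^M q^{N-M} P_M(Q)$ can be verified by brute force for a small $n$. The sketch below (helper names mine) uses $n = 4$, $N = 6$ and $Q$ = "connected", enumerating all $2^6$ graphs:

```python
from itertools import combinations
from math import comb

n = 4
pairs = list(combinations(range(n), 2))   # the N = 6 potential edges
N = len(pairs)

def connected(edge_mask):
    adj = {v: set() for v in range(n)}
    for b, (u, v) in enumerate(pairs):
        if edge_mask >> b & 1:
            adj[u].add(v); adj[v].add(u)
    seen, stack = {0}, [0]
    while stack:
        u = stack.pop()
        for w in adj[u] - seen:
            seen.add(w); stack.append(w)
    return len(seen) == n

# P_M(Q): proportion of graphs with exactly M edges that are connected
count = [0] * (N + 1)
for mask in range(1 << N):
    if connected(mask):
        count[bin(mask).count('1')] += 1
PM = [count[M] / comb(N, M) for M in range(N + 1)]

p, q = 0.4, 0.6
# direct value of P_p(Q) and the mixture over M must agree
direct = sum(p ** bin(m).count('1') * q ** (N - bin(m).count('1'))
             for m in range(1 << N) if connected(m))
mixture = sum(comb(N, M) * p ** M * q ** (N - M) * PM[M] for M in range(N + 1))
assert abs(direct - mixture) < 1e-12
```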
However, as it is much more pleasant to work in $\mathscr{G}_p$ than in $\mathscr{G}_M$, let us note three cases when we can do so without losing much information about $\mathscr{G}_M$. First, Theorem 2.2 implies that if $Q$ is a convex property and a.e. graph in $\mathscr{G}_p$ has $Q$ then a.e. graph in $\mathscr{G}(n, \lfloor pN \rfloor)$ has $Q$, provided $pqN \to \infty$. Once again, it is obvious that the convexity restriction cannot be discarded. For example, if $Q$ is the property that the number of edges is not a square, then if $pqN \to \infty$ then $P_p(Q) \to 1$ but $P_M(Q) = 0$ for $M = \lfloor (pN)^{1/2} \rfloor^2$. The second useful connection between $\mathscr{G}_p$ and $\mathscr{G}_M$ is only a little less trivial, though we state it as a theorem.
Theorem 2.3 Suppose $a > 0$ and the function $X : \mathscr{G}^n \to [0, a]$ is such that $X(G) \le X(H)$ whenever $G \subset H$. Assume, furthermore, that $0 < p_1(n) < p_2(n) < 1$, $M(n) \in \mathbb{N}$,
\[
\lim p_1 q_1 N = \lim p_2 q_2 N = \infty
\]
and
\[
\sum_{M=0}^{N} E_M(X) = (N + 1) \int_0^1 E_p(X)\,dp.
\]
Proof Since
\[
\int_0^1 x^M (1 - x)^{N - M}\,dx = B(M + 1,\, N - M + 1) = \left[(N + 1)\binom{N}{M}\right]^{-1},
\]
we have
where $B(\nu, \mu)$ is the beta function (see Gradshteyn and Ryzhik, 1980, p. 950). The identity is easily proved by partial integration. $\Box$
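The beta-function identity used in the proof can be verified exactly in rational arithmetic by expanding $(1-x)^{N-M}$ with the binomial theorem and integrating term by term (a sketch; `beta_integral` is my name for the helper):

```python
from fractions import Fraction
from math import comb

def beta_integral(M, N):
    # ∫_0^1 x^M (1-x)^{N-M} dx, exactly: expand (1-x)^{N-M} and integrate
    return sum(Fraction((-1) ** j * comb(N - M, j), M + j + 1)
               for j in range(N - M + 1))

# check B(M+1, N-M+1) = 1 / ((N+1) C(N, M)) for a range of cases
for N in range(1, 12):
    for M in range(N + 1):
        assert beta_integral(M, N) == Fraction(1, (N + 1) * comb(N, M))
```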
40 Models of Random Graphs
Erdős and Rényi (1959, 1960, 1961a) discovered the important fact that most monotone properties appear rather suddenly: for some $M = M(n)$ almost no $G_M$ has $Q$ while for 'slightly' larger $M$ almost every $G_M$ has $Q$. Given a monotone increasing property $Q$, a function $M^*(n)$ is said to be a threshold function for $Q$ if
M(n)/M*(n) -* 0 implies that almost no GM has Q,
and
M(n)/M*(n) -+ oo implies that almost every GM has Q.
A priori there is no reason why a property should have a threshold function, though this follows from an observation of Bollobás and Thomason (1987), noting that, with the natural definition, every monotone property of subsets of a set has a threshold function. Considerably deeper
results were proved by Friedgut (1999): he gave necessary and sufficient
conditions for the existence of sharp thresholds for monotone graph
properties.
When we study a specific graph property $Q$, we do not rely on general results, and often succeed in doing more than merely identifying a threshold function: for many a property $Q$ we shall determine not just the threshold function but an exact probability distribution: for every $x$, $0 < x < 1$, we shall find a function $M_x(n)$ such that $P(G_{M_x}$ has $Q) \to x$ as $n \to \infty$.
It is clear that threshold functions are not unique. However, threshold functions are unique within factors $m(n)$, $0 < \liminf m(n) \le \limsup m(n) < \infty$; that is, if $M_1^*$ is a threshold function of $Q$ then $M_2^*$ is also a threshold function of $Q$ iff $M_1^* = O(M_2^*)$ and $M_2^* = O(M_1^*)$. In this sense we shall speak of the threshold function of a property.
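The threshold phenomenon is easy to see empirically. The Monte Carlo sketch below (helper names mine) uses connectivity, for which $M^*(n) = (n/2)\log n$: well below the threshold almost no $G_M$ is connected, well above it almost every $G_M$ is. With $n = 300$, $M^*(n) \approx 855$:

```python
import random

def random_graph_M(n, M, rng):
    # G(n, M): M distinct edges chosen uniformly at random
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return rng.sample(pairs, M)

def is_connected(n, edges):
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v); adj[v].append(u)
    seen, stack = {0}, [0]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w); stack.append(w)
    return len(seen) == n

rng = random.Random(1)
n, trials = 300, 30
frac = {}
for M in (300, 3000):          # well below / well above M*(n) ≈ 855
    hits = sum(is_connected(n, random_graph_M(n, M, rng)) for _ in range(trials))
    frac[M] = hits / trials

assert frac[300] < 0.2 and frac[3000] > 0.8
```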
As mentioned earlier, the nearer our r.vs are to being independent, the easier it is to handle them. This is the reason why $\mathscr{G}_p$ is more pleasant to work with than $\mathscr{G}_M$. In $\mathscr{G}_p$ the edges are chosen independently, while in the model $\mathscr{G}_M$ the choice of an edge does have a (fortunately small) effect on the choice of another edge.
Another model with good independence properties is $\mathscr{G}(n; k\text{-out}) = \mathscr{G}_{k\text{-out}}$. A random graph $\vec{G}_{k\text{-out}}$ is constructed as follows. The vertex set is $V = \{1, 2, \ldots, n\}$. For each vertex $x \in V$ select $k$ other vertices, all $\binom{n-1}{k}$ choices being equiprobable, and all choices being independent, and for each selected vertex $y$ direct an edge from $x$ to $y$. Let $\vec{G} = \vec{G}_{k\text{-out}}$ be the random directed graph constructed in this way, with $\vec{\mathscr{G}} = \vec{\mathscr{G}}_{k\text{-out}}$ the probability space formed by them. Equivalently, $\vec{\mathscr{G}}_{k\text{-out}}$ is the probability space of all directed graphs with vertex set $V$ in which every vertex has outdegree $k$.
It is fascinating to note that $\mathscr{G}_{k\text{-out}}$ is mentioned in the Scottish Book: in 1935 Ulam asked for the probability of $G_{k\text{-out}}$ being connected (see Mauldin, 1979, p. 107, Problem 2.38).
For $\vec{G} \in \vec{\mathscr{G}}$ let $G_{k\text{-out}} = \varphi(\vec{G})$ be the random graph with vertex set $V$ and edge set $\{xy : \text{at least one of } \vec{xy} \text{ and } \vec{yx} \text{ is an edge of } \vec{G}\}$, and let $\mathscr{G}_{k\text{-out}} = \{\varphi(\vec{G}) : \vec{G} \in \vec{\mathscr{G}}_{k\text{-out}}\}$. The probability of a graph $H$ in $\mathscr{G}_{k\text{-out}}$ is the probability of the set of graphs $\varphi^{-1}(H)$ in $\vec{\mathscr{G}}_{k\text{-out}}$. Note that every $G_{k\text{-out}}$ has at most $kn$ edges but every vertex of it has degree at least $k$. As we shall see in Chapter 3, this is in sharp contrast with the graphs $G_M$, for a.e. $G_M$ has isolated vertices unless $M$ is not much smaller than $(n/2)\log n$.
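The construction of $G_{k\text{-out}}$ and the two structural facts just stated (at most $kn$ edges, minimum degree at least $k$) can be checked directly (a sketch; function name mine):

```python
import random

def k_out_graph(n, k, rng):
    # For each vertex x choose k distinct other vertices uniformly at random
    # and direct an edge from x to each chosen vertex.
    out = {x: rng.sample([y for y in range(n) if y != x], k) for x in range(n)}
    # Underlying undirected graph: xy is an edge iff x->y or y->x is present
    edges = {frozenset((x, y)) for x in out for y in out[x]}
    return out, edges

rng = random.Random(7)
n, k = 50, 3
out, edges = k_out_graph(n, k, rng)

deg = {v: 0 for v in range(n)}
for e in edges:
    for v in e:
        deg[v] += 1

assert all(len(out[x]) == k for x in out)   # every outdegree is exactly k
assert len(edges) <= k * n                  # at most kn edges after identification
assert min(deg.values()) >= k               # minimum degree at least k
```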
The model $\mathscr{G}(n; p_1, p_2, \ldots, p_k)$ is also obtained by identifying edges obtained in different ways. The graphs of this model are obtained by first selecting edges of colour 1 with probability $p_1$, then edges of colour 2 with probability $p_2$, etc. This gives us a multigraph (graph with multiple edges) whose edges are coloured $1, 2, \ldots, k$. To obtain our r.g. we just forget the colours of the edges and identify multiple edges. Of course, the model we get in this way is nothing but $\mathscr{G}(n, p)$, where $p = 1 - \prod_{i=1}^{k}(1 - p_i)$. However, by selecting the edges in several stages we can work with independent graphs: the graph formed by the edges of colour $i$ is independent of the graph formed by the edges of colour $j$, $j \ne i$. This model was used in the second part of the proof of Theorem 2.1.
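The combined edge probability $p = 1 - \prod_i(1 - p_i)$ is just the probability that a pair is selected in at least one colour round, which a short seeded simulation confirms (parameter values are mine):

```python
import random

rng = random.Random(5)
p_list = [0.1, 0.2, 0.25]          # edge probabilities for colours 1..k
p = 1.0
for pi in p_list:
    p *= (1.0 - pi)
p = 1.0 - p                        # p = 1 - prod(1 - p_i) = 0.46 here

# A potential edge survives to the final (uncoloured) graph iff it was
# selected in at least one of the colour rounds.
trials = 200000
hits = sum(any(rng.random() < pi for pi in p_list) for _ in range(trials))
freq = hits / trials

assert abs(p - 0.46) < 1e-12       # 1 - 0.9 * 0.8 * 0.75 = 0.46
assert abs(freq - p) < 0.01
```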
There are many other models of random graphs. By giving all graphs
in a finite set the same probability we obtain a probability space. In this
way we have random k-regular graphs, random trees, random forests,
random planar graphs, etc. Of these, only random regular graphs will
be investigated in detail. As regular graphs are not easily generated,
we devote an entire section (§4) to the introduction of random regular
graphs.
Let us conclude this section with two models which, in some sense, are refinements of $\mathscr{G}(n, p)$ and $\mathscr{G}(n, M)$.
Let $0 < p < 1$ be fixed. The model $\mathscr{G}(\mathbb{N}, p)$, studied in Bollobás and Erdős (1976), consists of all graphs with vertex set $\mathbb{N}$, whose edges are chosen independently and with probability $p$. In other words a r.g. $G \in \mathscr{G}(\mathbb{N}, p)$ is a collection $(X_{ij}) = \{X_{ij} : 1 \le i < j\}$ of independent r.vs with $P(X_{ij} = 1) = p$ and $P(X_{ij} = 0) = q$: a pair $ij$ is an edge of $G$ iff $X_{ij} = 1$. Clearly $G_n = G[\{1, 2, \ldots, n\}]$, the subgraph of $G$ spanned by $1, 2, \ldots, n$, is exactly a r.g. $G_p$ on $V = \{1, 2, \ldots, n\}$.
Note that the space $\mathscr{G}(\mathbb{N}, p)$ has uncountably many points and depends only on $p$. The definition of 'almost every' is then just the standard definition in measure theory: almost every (a.e.) $G \in \mathscr{G}(\mathbb{N}, p)$ is said to have a property $Q$ if $P(G$ has $Q) = 1$. Note that if a.e. $G$ is such that for some $n(G) \in \mathbb{N}$ every $G_n$, $n \ge n(G)$, has property $Q$, then a.e. $G_p$ has $Q$. On the other hand, the converse is far from being true. Thus assertions concerning a.e. graph in $\mathscr{G}(\mathbb{N}, p)$ tend to be stronger than assertions about a.e. $G_p$.
A random graph process on $V = \{1, 2, \ldots, n\}$, or simply a graph process, is a Markov chain $\tilde{G} = (G_t)_0^N$, whose states are graphs on $V$. The process starts with the empty graph and for $1 \le t \le \binom{n}{2}$ the graph $G_t$ is obtained from $G_{t-1}$ by the addition of an edge, all new edges being equiprobable. Then $G_t$ has exactly $t$ edges, so for $t = \binom{n}{2}$ we have $G_t = K^n$. For $t > \binom{n}{2}$ we take $G_t = K^n$ as well.
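A graph process is simply a uniformly random ordering of the $N$ potential edges, and the hitting time of a monotone property is the first $t$ at which $G_t$ has it. A sketch (function names mine) using the property 'no isolated vertices':

```python
import random

def graph_process(n, rng):
    # Random graph process: the N = n(n-1)/2 edges in uniformly random order;
    # G_t consists of the first t edges.
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    rng.shuffle(pairs)
    return pairs

def no_isolated(n, edges):
    touched = {v for e in edges for v in e}
    return len(touched) == n

def hitting_time(n, order, has_Q):
    edges = []
    for t, e in enumerate(order, start=1):
        edges.append(e)
        if has_Q(n, edges):
            return t
    return None

rng = random.Random(3)
n = 60
order = graph_process(n, rng)
tau = hitting_time(n, order, no_isolated)

assert 1 <= tau <= n * (n - 1) // 2            # Q certainly holds for G_N = K^n
assert no_isolated(n, order[:tau])             # Q holds at the hitting time...
assert not no_isolated(n, order[:tau - 1])     # ...and not one step earlier
```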
A slightly different approach to random graph processes goes as follows. A graph process is a sequence $(G_t)_{t=0}^{N}$ such that
Thus $M^*(n)$ is the threshold function of $Q$ if whenever $\omega(n) \to \infty$, the hitting time is almost surely between $M^*(n)/\omega(n)$ and $M^*(n)\omega(n)$:
Theorem 2.5 Suppose $M = M(n)$ and $p = p(n)$ are such that for every $\varepsilon > 0$ we have
to show that if $p$ satisfies (2.2) then $P_p(\neg P_k) = o(n^{-1})$. This is true with plenty to spare.
Note that if $|G| \ge 2k$, then $G$ has $P_k$ if and only if a vertex $z$ can be found for all pairs $W_1, W_2$ with $|W_1| = |W_2| = k$.
Suppose $p$ satisfies (2.2). Write $P_p(W_1, W_2; z)$ for the probability that a fixed vertex $z \notin W_1 \cup W_2$ will not do for a given pair $(W_1, W_2)$, $|W_1| = |W_2| = k$, and let $P_p(W_1, W_2)$ be the probability that no vertex $z$ will do for $(W_1, W_2)$. Then
and so
Theorem 2.6 Suppose $M$ and $p$ satisfy (2.1) and (2.2), and $Q$ is a property of graphs given by a first-order sentence. Then either $Q$ holds for a.e. graph in $\mathscr{G}(n, M)$ and $\mathscr{G}(n, p)$ or it fails for a.e. graph in $\mathscr{G}(n, M)$ and $\mathscr{G}(n, p)$.
Since the theory of graphs together with all $P_k$'s has no finite models and all countable models are isomorphic, the theory is complete (see Vaught, 1954; Gaifman, 1964). Hence if $Q$ is any first-order sentence then either $Q$ or $\neg Q$ is implied by some set of $P_k$'s and so by some $P_k$, say by $P_{k_0}$. In particular, either $P(G$ has $Q) \ge P(G$ has $P_{k_0})$ or $P(G$ has $\neg Q) \ge P(G$ has $P_{k_0})$. By Theorem 2.5, a.e. graph in our range has $P_{k_0}$, so the proof is complete. $\Box$
For monotone properties Theorem 2.6 can be restated in terms of hitting times as follows: if $Q$ is a monotone property given by a first-order sentence then for almost no graph process does the hitting time $M = \tau_Q(\tilde{G})$ satisfy (2.1). The moral of Theorem 2.6 is that if we wish to study a property of random graphs given by a first-order sentence, then we have to do it in models that do not satisfy (2.1) and (2.2). In fact, many graph properties one studies are not given by first-order sentences: '$G$ is bipartite', '$G$ has chromatic number at least $k$', '$G$ is Hamiltonian', '$G$ contains $\lfloor n^{1/2} \rfloor$ vertices dominating all vertices', etc.
Theorems 2.5 and 2.6 are essentially best possible, as is shown by the
following result.
Theorem 2.7 Given $\varepsilon > 0$ choose $k \in \mathbb{N}$ such that $k\varepsilon > 1$ and define $p = p(n) > 0$ by
\[
2 n^k e^{-p^k n} = k!.
\]
Let $Q$ be the property that for every set of $k$ vertices there is a vertex joined to all of them. Then $0 < p < \varepsilon$ if $n$ is sufficiently large and
\[
\tfrac{1}{2} \le \liminf_{n\to\infty} P(G_p \text{ has } Q) \le \limsup_{n\to\infty} P(G_p \text{ has } Q) \le \tfrac{5}{8}.
\]
Proof By definition
\[
E(X) = \binom{n}{k}(1 - p^k)^{n-k} = \tfrac{1}{2} + o(1).
\]
Hence
\[
E_2(X) = E(X)^2 + O(n^{2k-1}e^{-2p^k n}),
\]
since
\[
(1 + p^{2k-l})^n \le e^{np^{2k-l}} = 1 + o(1) \quad\text{for } 1 \le l \le k-1, \text{ because } p \sim \{(k\log n)/n\}^{1/k}.
\]
Consequently,
\[
E_2(X) = E(X)^2\{1 + O(1/n)\} \to \tfrac{1}{4}.
\]
By Corollary 1.11 (or the first part of Theorem 1.16) and the trivial inequality $P(X = 0) \ge 1 - E(X)$, we have
\[
1 - E(X) \le P(X = 0) \le 1 - E(X) + \tfrac{1}{2}E_2(X) \le \tfrac{5}{8} + o(1).
\]
Theorem 2.8 Let $0 < p = p(n) \le \tfrac{1}{2}$. Then a.e. $G \in \mathscr{G}_p$ is such that if $U$ is any set of $u \ge (252/p)\log n$ vertices, then
\[
e(U) > \frac{p u^2}{7 \log n}. \tag{2.3}
\]
Since there are $\binom{n}{u} < n^u/u!$ sets of $u$ vertices, the probability that some set $U$ with $u \ge u_0 = \lceil (252/p)\log n \rceil$ elements violates the inequality of the theorem is at most
\[
\sum_{u \ge u_0} n^u e^{-pu^2/7} \le n^{-2}.
\]
As inequality (2.3) defines a convex property, by Theorem 2.2 the result above holds for $\mathscr{G}_M$ as well, where $M = \lfloor pN \rfloor$.
Note that Theorem 2.8 concerns only graphs with fairly many edges, for (2.3) holds vacuously unless $u \le n$ and so $p > (144/n)\log n$, $pn^2/2 > 72n\log n$. In fact, large sets of graphs of much smaller size also behave rather regularly. We state the result for $\mathscr{G}_M$.
Theorem 2.9 Suppose $M = M(n) \le n^2/4$ and $u_0 = u_0(n) \le n/2$ satisfy
\[
Mu_0/\{n^2\log(n/u_0)\} \to \infty.
\]
Then for every $\varepsilon > 0$ a.e. $G_M$ is such that every set $U$ of $u \ge u_0$ vertices satisfies
\[
|e(U) - M(u/n)^2| < \varepsilon M(u/n)^2. \tag{2.5}
\]
We have $\binom{n}{u} < (en/u)^u$ choices for $U$, so the probability that (2.6) fails for some set $U$, $|U| \ge u_0$, is at most
\[
\sum_{u \ge u_0} \exp[u\{-\varepsilon^2 p u_0/7 + 1 + \log(n/u_0)\}] = o(1),
\]
for
\[
-\varepsilon^2 p u_0/7 + \log(n/u_0) \le -(\varepsilon^2/7)Mu_0/n^2 + \log(n/u_0) \to -\infty,
\]
and this implies (2.5). $\Box$
2.3 Large Subsets of Vertices 49
Corollary 2.10 If $M/n \to \infty$ and $\varepsilon > 0$ then a.e. $G_M$ is such that
\[
|e(U) - Mu^2/n^2| < \varepsilon Mu^2/n^2
\]
Theorem 2.11 Let $0 < p = p(n) \le 1/2$. Then a.e. $G_p$ is such that whenever $U$ and $W$ are disjoint sets of vertices satisfying $(252/p)\log n \le u = |U| \le w = |W|$.
Theorem 2.12 Suppose $M$ and $u_0$ are as in Theorem 2.9. Then for every $\varepsilon > 0$ a.e. $G_M$ is such that
The next two results, essentially from Bollobás and Thomason (1982), are more technical. They also express the fact that in most graphs all large sets of vertices behave similarly.
satisfies
\[
|Z_W| \le 12\log n/(\varepsilon^2 p).
\]
Proof We may assume that $w_0 < n$ and so $pn \to \infty$. For fixed $w \ge w_0$, $W$ and $x \in V - W$ Corollary 1.4 implies that
Therefore, if $z \ge 4(\log n)/p$, the probability that some set $W$ of order $w$ exists which fails to satisfy the conclusions of the theorem is at most
\[
n^w e^{-pwz/3}
\]
Theorem 2.15 Suppose $\delta = \delta(n)$ and $C = C(n)$ satisfy $\delta pn \ge 3\log n$, $C \ge 3\log(e/\delta)$ and $C\delta n \to \infty$. Then a.e. $G_p$ is such that for every $U \subset V$, $|U| = u = \lceil C/p \rceil$, the set $T$ satisfies
\[
P(|T| \le \delta n) \le \binom{n}{\delta n} e^{-C\delta n} \le (e/\delta)^{\delta n} e^{-C\delta n} \le e^{-2C\delta n/3},
\]
so the probability that the assertion fails for some $U$ is at most
\[
\exp\left(4C^2\log n - C\delta n\right) \le \exp\left(-\tfrac{1}{2}C\delta n\right) = o(1).
\]
\[
|\mathscr{Y}(\mathbf{d}; G_0)| \sim e^{-\lambda/2 - \lambda^2/4}\, (2m)! \Big/ \Big\{2^m m! \prod_{i=1}^{n} d_i!\Big\},
\]
where $\lambda = \sum_{i=1}^{n} d_i(d_i - 1)/(2m)$.
Let $n_2$ denote the number of $d_i$'s greater than 1, i.e. set $n_2 = \max\{l : 1 \le l \le n,\ d_l \ge 2\}$. Since $2m - n \to \infty$, we have $n_2 \to \infty$. Clearly
and a trite calculation gives
\[
C_k(\mathbf{d}) \le \Big\{\sum_{i=1}^{n} d_i(d_i - 1)\Big\}^k \Big/ k! \tag{2.8}
\]
Then by (2.8)
\[
\liminf_{n\to\infty} C_k(\mathbf{d})/n^k > 0.
\]
Furthermore, put
Clearly there are exactly $C_k(\mathbf{d})$ sets of pairs of vertices that can be $k$-cycles of configurations.
Let us show that if the sequence $\mathbf{d}$ is decreased a little then $C_k(\mathbf{d})$ does not decrease too much. More precisely, given a non-negative integer $t$, denote by $\mathbf{d} - t$ the sequence $d_{t+1}, d_{t+2}, \ldots, d_n$ and define $C_k(\mathbf{d} - t)$ by (2.9) and (2.10). Very crudely indeed, for all $k \ge 1$ and $1 \le t \le n$ we have
are such that for some $1 \le t = 2r_0 + \sum_i i r_i$ altogether they have at most $t$ edges and these edges join at most $t - 1$ groups $W_i$. Hence, very crudely indeed,
\[
P(X_0 + X_1 + X_2 = 0) \to e^{-\lambda_0 - \lambda_1 - \lambda_2},
\]
as required.
It is not hard to show that $\Delta$ in Theorem 2.16 need not be fixed. The same proof executed with slightly more care shows that the assertion holds for $\Delta \le (\log n)^{1/2}/2$ (see Bollobás, 1980b; Bollobás and McKay, 1985).
The main significance of Theorem 2.16 and its proof is that when we
study random regular graphs (whose degree does not grow too fast with
n), instead of considering the set of regular graphs we are allowed to
consider the set of configurations. In order to make this statement a little
more precise, we restate some of the facts above for regular graphs.
Let $d_1 = d_2 = \cdots = d_n = r \ge 2$ and let $\Phi$ and $\phi$ be as in the proof of Theorem 2.16. Clearly,
\[
\phi^{-1}(\mathscr{G}_{r\text{-reg}}) = \{F \in \Phi : X_1(F) = X_2(F) = 0\}
\]
and $|\phi^{-1}(G)| = (r!)^n$ for every $G \in \mathscr{G}_{r\text{-reg}}$.
Given a set (property) $Q \subset \mathscr{G}_{r\text{-reg}}$, how can we estimate $P(Q)$? Find a set $Q^* \subset \Phi$ such that
\[
Q^* \cap \phi^{-1}(\mathscr{G}_{r\text{-reg}}) = \phi^{-1}(Q).
\]
Then, trivially,
Corollary 2.18 If $r \ge 2$ is fixed and $Q^*$ holds for a.e. configuration [i.e. $P(Q^*) \to 1$], then $Q$ holds for a.e. $r$-regular graph.
Although it does not really belong to this section, let us note another trivial consequence of the proof of Theorem 2.16 (see Bollobás, 1980b; Wormald, 1981a, b).
Corollary 2.19 Let $r \ge 2$ and $m \ge 3$ be fixed natural numbers and denote by $Y_i = Y_i(G)$ the number of $i$-cycles in a graph $G \in \mathscr{G}_{r\text{-reg}}$. Then $Y_3, Y_4, \ldots, Y_m$ are asymptotically independent Poisson random variables with means $\lambda_3, \lambda_4, \ldots, \lambda_m$, where $\lambda_i = (r-1)^i/(2i)$.
where $L_2(0)$ is defined to be 1. From this one can deduce that the exponential generating function of $L_2(n)$ is
\[
\sum_{n \ge 0} L_2(n)\, t^n/n! = \frac{\exp\left\{-\frac{t}{2} - \frac{t^2}{4}\right\}}{(1 - t)^{1/2}}.
\]
that of graphs with given degree sequences. For $\mathbf{d} = (d_i)_1^n$, write $L(\mathbf{d})$ for the number of graphs on $[n]$ with $d(i) = d_i$, $i = 1, \ldots, n$. Thus, if $G_0$ is the empty graph then, with the notation of Theorem 2.16, $L(\mathbf{d}) = |\mathscr{Y}(\mathbf{d}; G_0)|$.
McKay and Wormald (1990b) gave a complex integral formula for $L(\mathbf{d})$ under the assumption that for a sufficiently small $\varepsilon > 0$ and some $C$ we have $|d_i - d_j| < n^{1/2+\varepsilon}$ and $\min\{d_i, n - d_i\} > Cn/\log n$ for all $i$ and $j$. However, the integral itself is rather difficult to estimate.
Using a different approach, McKay and Wormald (1991) gave an asymptotic formula for $L(\mathbf{d})$ under the assumption that $\Delta = \max d_i = o(m^{1/3})$, where $m = \frac{1}{2}\sum_{i=1}^{n} d_i$ is the number of edges. In particular, they proved the following substantial extension of Corollary 2.17: if $1 \le r = o(n^{1/2})$ and $rn$ is even then
\[
L_r(n) \sim \exp\left(-\frac{r^2 - 1}{4} - \frac{r^3}{12n}\right) \frac{(rn)!}{(rn/2)!\, 2^{rn/2}\, (r!)^n}.
\]
Exercises
In Exercises 1-3 Q is a monotone increasing property.
2.1 Suppose $0 \le M_1 < M_2 \le N$, $P_{M_1}(Q) < 1$ and $P_{M_2}(Q) > 0$. Prove that $P_{M_1}(Q) < P_{M_2}(Q)$.
2.2 Suppose $0 < p_1 < p_2 < 1$, $p = (p_2 - p_1)/(1 - p_1)$ and
What is Q?
2.3 Let $M = \lfloor pN \rfloor$ and $pqN \to \infty$. Show that
and
\[
P_M(Q) \ge 2P_p(Q) - 1 + o(1).
\]
and
\[
P(G_{k\text{-out}} \text{ does not contain } i,\ i = 1, \ldots, s) \le q^s.
\]
2.9 Consider the following Markov chain $(G_t)_1^{\infty}$ defined by Capobianco and Frank (1983). The states are finite graphs and the initial state $G_1$ is the trivial graph with one vertex. If $G_{t-1}$ is complete then $G_t$ is obtained from $G_{t-1}$ by the addition of a vertex. If $G_{t-1}$ is not complete then with probability $p$ we add an edge to obtain $G_t$ and with probability $q = 1 - p$ we add a vertex. Here $0 < p < 1$ is fixed, and all missing edges are equally likely to be added. [Note that $|G_t| + e(G_t) = t$ for every $t$.] Show that, with probability tending to 1, $G_t$ has about $qt$ vertices and $pt$ edges.
2.10 Let $r \ge 3$ be fixed and let $u_0 = o(n)$. Prove that there is a function $\varepsilon = \varepsilon(n, u_0) \to 0$ such that if $G_0 = G_0(n)$ is a graph with vertex set $V = \{1, \ldots, n\}$, size $u \le u_0$ and maximum degree at most $r$, then
\[
(1 - \varepsilon)\left(\frac{r}{en}\right)^{u} \le P(G_{r\text{-reg}} \text{ contains } G_0) \le (1 + \varepsilon)\left(\frac{r}{n}\right)^{u}.
\]
2.11 For a fixed $k \in \mathbb{N}$ consider $\mathscr{G}_{k\text{-out}}$. Call an edge $xy$ of a random graph a double edge if the original directed graph contains both $\vec{xy}$ and $\vec{yx}$. Use the technique of the proof of Theorem 2.16 to
3.1 The Distribution of an Element of the Degree Sequence 61
Clearly
\[
d_r \ge k \quad\text{iff}\quad Y_k \ge r
\]
and
\[
d_{n-r} < k \quad\text{iff}\quad Z_k \ge r + 1.
\]
If $p = o(n^{-3/2})$, then
\[
E(Y_2) \le \sum_{j=2}^{n-1} E(X_j) \le n \sum_{j=2}^{n-1} (pn)^j/j! = o(1).
\]
Theorem 3.1 Let $\varepsilon > 0$ be fixed, $\varepsilon n^{-3/2} \le p = p(n) \le 1 - \varepsilon n^{-3/2}$, let $k = k(n)$ be a natural number and set $\lambda_k = \lambda_k(n) = n\, b(k; n-1, p)$. Then the following assertions hold.
(i) If $\lim \lambda_k(n) = 0$, then $\lim P(X_k = 0) = 1$.
(ii) If $\lim \lambda_k(n) = \infty$, then $\lim P(X_k \ge t) = 1$ for every fixed $t$.
(iii) If $0 < \liminf \lambda_k(n) \le \limsup \lambda_k(n) < \infty$, then $X_k$ has asymptotically Poisson distribution with mean $\lambda_k$:
\[
P(X_k = r) \sim e^{-\lambda_k} \lambda_k^r/r!.
\]
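The mean $\lambda_k = n\, b(k; n-1, p)$ is exactly $E(X_k)$, which a seeded simulation confirms (a sketch; the parameter values and helper `b` are mine):

```python
import random
from math import comb

def b(k, n, p):
    # binomial point probability b(k; n, p)
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

rng = random.Random(11)
n, p, k = 80, 0.03, 0
lam = n * b(k, n - 1, p)          # λ_k = n b(k; n-1, p), here ≈ 7.2

trials = 800
counts = []
for _ in range(trials):
    deg = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                deg[i] += 1; deg[j] += 1
    counts.append(sum(d == k for d in deg))   # X_k = # vertices of degree k

mean = sum(counts) / trials
assert abs(mean - lam) / lam < 0.1            # sample mean close to λ_k
```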
Proof We may assume without loss of generality that $p \le 1/2$. Note that $\lambda_k = \lambda_k(n)$ is exactly the expectation of $X_k(G)$. Hence
Let us distinguish two cases according to the size of $p(n)$. Suppose first that $p(n) \le \tfrac{1}{2}$ is bounded away from 0. Then $\lim \lambda_k(n) > 0$ implies that $k(n)$ is about $pn$, say
\[
\tfrac{1}{3}pn \le k(n) \le 3pn.
\]
Therefore, by (1),
\[
E_r(X_k) \sim (n)_r\, b(k; n-1, p)^r \sim \lambda_k^r. \tag{3.2}
\]
Now suppose that $p = o(1)$. Then (1) implies the following upper bound for $E_r(X_k)$:
where $R = \binom{r}{2}$ and the maximum is over all sequences $d_1^*, d_2^*, \ldots, d_r^*$ with $\sum d_i^* = 2l$ and $0 \le d_i^* \le \min\{r-1, k\}$. Clearly $\lim \lambda_k(n) > 0$ implies that $k(n) = o(n)$, so if $0 \le d^* \le \min\{r-1, k\}$ and $n$ is sufficiently large,
Our assumption that $\lim \lambda_k(n) > 0$ implies that $k^2 = o(pn^2)$. Indeed, if for some fixed $\eta > 0$ the inequality $k \ge \eta p^{1/2} n$ held for arbitrarily large $n$ then we would have
\[
\lim_{n\to\infty} P(X_k \ge t) = 1
\]
Theorem 3.2 Let $\varepsilon > 0$ be fixed, $\varepsilon n^{-3/2} \le p = p(n) \le 1 - \varepsilon n^{-3/2}$, let $k = k(n)$ be a natural number and set
This last result enables us to determine the distribution of the first few
and the last few values in the degree sequence. First we present some
results concerning the case when p is neither too large nor too small.
Then
\[
\lim P\{d_m < pn + x(pqn)^{1/2}\} = e^{-c}\sum_{k=0}^{m-1} \frac{c^k}{k!}.
\]
Proof Put $K = \lceil pn + x(pqn)^{1/2} \rceil$. Then $d_m < pn + x(pqn)^{1/2}$ means that at most $m - 1$ vertices have degree at least $K$, that is
\[
Y_K \le m - 1,
\]
so by Theorem 1.6
\[
\mu_K \sim (c/n)\, n = c.
\]
This implies the assertion of the theorem. $\Box$
For every fixed positive $c$ the value of $x$ calculated from (5) is about $(2\log n)^{1/2}$. Furthermore,
\[
x(n, 1) = (2\log n)^{1/2}\left\{1 - \frac{\log\log n}{4\log n} - \frac{\log(2\pi^{1/2})}{2\log n}\right\} + O\{(\log n)^{-1/2}\}.
\]
3.2 Almost Determined Degrees 65
\[
e^{-e^{-y}} \sum_{k=0}^{m-1} e^{-ky}/k!. \qquad \Box
\]
\[
\lim_n P\left\{\Delta < pn + (2pqn\log n)^{1/2}\left(1 - \frac{\log\log n}{4\log n} - \frac{\log(2\pi^{1/2})}{2\log n} + \frac{y}{2\log n}\right)\right\} = e^{-e^{-y}};
\]
if
\[
p = (\log n + k\log\log n + y)/n
\]
then
\[
P\{\delta(G_p) = k\} \to 1 - e^{-e^{-y}/k!} \quad\text{and}\quad P\{\delta(G_p) = k+1\} \to e^{-e^{-y}/k!}. \qquad \Box
\]
it true that a.e. Gp has a unique vertex of maximum degree? The aim of
this section is to answer these questions. The proofs of the results are
based on Theorem 3.1.
We show first that if almost all graphs have the same maximum degree
then p = o(log n/n) and then we show that this necessary condition is
essentially sufficient.
Theorem 3.6 Suppose that $p \le \tfrac{1}{2}$ and there is a function $D(n)$ such that in the space $\mathscr{G}(n, p)$ we have
and so
\[
\lim_{n\to\infty} n(epn/cpn)^{cpn} = \lim_{n\to\infty} n(e/c)^{cpn} = \infty.
\]
Theorem 3.7 Suppose $p = o(\log n/n)$. Let $k = k(n) > pn$ be such that $\max\{\lambda_k(n), \lambda_k(n)^{-1}\}$ is minimal. Then the following assertions hold.
(i) If $0 < \liminf \lambda_k(n) \le \limsup \lambda_k(n) < \infty$, then
and
\[
P\{\Delta(G_p) = k(n) - 1\} \to e^{-\lambda_k(n)}.
\]
(ii) If $\lim \lambda_k(n) = \infty$, then a.e. $G_p$ has maximum degree $k(n)$ and if $\lim \lambda_k(n) = 0$, then a.e. $G_p$ has maximum degree $k(n) - 1$.
Hence a.e. $G_p$ is such that the maximum degree is $k(n) - 1$ unless there is a vertex of degree $k(n)$, in which case it is exactly $k(n)$:
and
Theorem 3.8 (i) Suppose that $p \le \tfrac{1}{2}$ and there is a function $d(n)$ such that in the space $\mathscr{G}(n, p)$ we have
Then
\[
p \le \{1 + o(1)\}\log n/n.
\]
(ii) If $p \le \{1 + o(1)\}\log n/n$, then there is a function $d(n)$ such that the minimum degree of a.e. $G_p$ is either $d(n)$ or $d(n) - 1$.
68 The Degree Sequence
Proof (i) Write $p$ in the form $p = \{\log n + \omega(n)\}/n$. Then, if $\omega(n) = O(\log n)$, say,
\[
\lambda_0 = n\, b(0; n-1, p) = n(1-p)^{n-1} \sim n\cdot n^{-1} e^{-\omega(n)} = e^{-\omega(n)},
\]
so, by Theorem 3.1, $G_p$ almost surely has an isolated vertex iff $\omega(n) \to -\infty$. Therefore we may assume that $p \ge (\log n + C)/n$ for some constant $C$.
As in the proof of Theorem 3.6, we find that if $\lim_{n\to\infty} P\{\delta(G_p) = d(n)\} = 1$, then
Hence
Theorem 3.9 (i) If $p \le \tfrac{1}{2}$ and $pn/\log n \to \infty$, then a.e. $G_p$ has a unique vertex of maximum degree and a unique vertex of minimum degree.
(ii) If $p \le \tfrac{1}{2}$ and a.e. $G_p$ has a unique vertex of maximum degree or a unique vertex of minimum degree, then $pn/\log n \to \infty$.
Proof (i) For simplicity write $b(l) = b(l; n-1, p)$ and $B(l) = B(l; n-1, p)$. Let $k$ be the maximal natural number with
\[
nB(k) \ge 1.
\]
It is easily checked that $k > pn$, $k \sim pn$, $nb(k-1) \to 0$ and $b(k)/b(k-1) \to 1$. Consequently one can choose $l = l(n)$, $pn < l < k$, such that $nb(l) \to 0$ and $nB(l) \to \infty$. If we vary $l$ in such a way that these relations remain valid then we can find an $m = m(n)$, $pn < m < k$, such that
\[
nB(m) \to \infty \quad\text{and}\quad nB(m)\,nb(m) \to 0.
\]
On the other hand, the probability that there are at least two vertices whose degrees are equal and at least $m$ is at most
\[
\sum_{j \ge m} P(X_j \ge 2) \le \sum_{j \ge m} E_2(X_j) \le \sum_{j \ge m-1} n^2 b(j; n-2, p)^2
\le nb(m-1; n-2, p)\, nB(m-1; n-2, p) \sim nb(m)\, nB(m).
\]
To see the second inequality above, note that the expected number of ordered pairs of vertices of degree $j \ge m$ is
\[
n(n-1)\{p\, b(j-1; n-2, p)^2 + (1-p)\, b(j; n-2, p)^2\} \le n^2\{b(j-1; n-2, p)^2 + b(j; n-2, p)^2\}.
\]
Hence $\sum_{j \ge m} P(X_j \ge 2) \to 0$, proving the assertion.
(ii) A sharper form of this assertion is left as an exercise for the reader [Ex. 3, see also Bollobás (1982a)]. $\Box$
Let us summarize the results of this section in a somewhat imprecise form. If $p = o(\log n/n)$, then both the minimum and maximum degrees are essentially determined and there are many vertices of minimum degree and many vertices of maximum degree. If $\varepsilon\log n/n \le p \le \{1 + o(1)\}\log n/n$ for some $\varepsilon > 0$, then the minimum degree is essentially determined and there are many vertices of minimum degree, while the maximum degree is not confined to any finite set with probability tending to 1. If $(1 + \varepsilon)\log n/n \le p \le C\log n/n$ for some $\varepsilon > 0$ and $C > 0$, then neither the minimum nor the maximum degree is essentially determined and the probability that there are several vertices of minimum and maximum degrees is bounded away from 0. Finally, if $pn/\log n \to \infty$ and $p \le \tfrac{1}{2}$, then a.e. $G_p$ has a unique vertex of maximum degree and a unique vertex of minimum degree.
The events $\{ab \in E(G_p)\}$, $\{d_{G-ab}(a) \ge k-1\}$ and $\{d_{G-ab}(b) \ge k-1\}$ are independent, so the probability of their intersection is
The second term in the expression (3.6) for $E_2(Y_k)$ is justified analogously. Hence
\[
\sigma^2(Y_k) = E(Y_k^2) - \mu_k^2
= \mu_k + n(n-1)\{pB(k-1; n-2)^2 + qB(k; n-2)^2\} - \mu_k^2
\]
\[
\le \mu_k + n^2\{pB(k-1; n-2)^2 + qB(k; n-2)^2 - B(k; n-1)^2\}, \tag{3.7}
\]
and
\[
B(k-1; n-2) - B(k; n-2) = b(k-1; n-2).
\]
Lemma 3.11
\[
\sigma^2(Y_k) = O(\mu_k).
\]
Proof This is immediate from Lemma 3.10, since one can check from Theorems 1.1-1.5 that
Theorem 3.12 Suppose $m = m(n) = o(n)$, $m(n) \to \infty$ and $\omega(n) \to \infty$ arbitrarily slowly. Define $x$ by
\[
1 - \Phi(x) = m/n.
\]
3.3 The Shape of the Degree Sequence 71
Then a.e. $G_p$ satisfies
\[
\left|d_m - pn - x(pqn)^{1/2}\right| < \omega(n)\left\{\frac{pqn}{m\log(n/m)}\right\}^{1/2}.
\]
Note that
\[
\frac{m}{n} = 1 - \Phi(x) \sim \frac{e^{-x^2/2}}{(2\pi)^{1/2}x}
\]
and
\[
x \sim \{2\log(n/m)\}^{1/2}.
\]
We may assume without loss of generality that $\omega(n) \le m^{1/2}$ and $\omega(n) = o(x)$. Set
\[
k = pn + x(pqn)^{1/2} + \omega(n)\{pqn/(m\log(n/m))\}^{1/2}.
\]
By Theorem 1.6
and
Hence
\[
m \ge E(Y_k) + \omega(n)E(Y_k)^{1/2}.
\]
If $m$ does not grow too fast, then the dependence of $x$ above on $m$ can be calculated precisely and one is led to the following corollary.
Corollary 3.13 Let $\omega(n) \to \infty$ and $m = m(n) = O\{\log n/(\log\log n)^4\}$. Then a.e. $G_p$ satisfies
\[
\left|d_m - pn - (2pqn\log n)^{1/2} + \log\log n\left(\frac{pqn}{8\log n}\right)^{1/2} + \log(2\pi^{1/2}m)\left(\frac{pqn}{2\log n}\right)^{1/2}\right| \le \omega(n)\left(\frac{pqn}{m\log n}\right)^{1/2}
\]
almost surely.
The result above is best possible (Ex. 8). Furthermore, if $(pn - \log n)/\log\log n \to \infty$, then almost all $G_p$'s have about the same minimal degree (Ex. 9).
McKay and Wormald (1997) proved several powerful general results about the joint distribution of the degrees of a random graph that imply the results above. They showed that the joint distribution can be closely approximated by models derived from a set of independent binomial distributions. For example, they proved that for a wide range of functions $M = M(n)$, the distribution of the degree sequence of a random graph $G_M \in \mathscr{G}(n, M)$ is well approximated by the conditional distribution $\{(X_1, \ldots, X_n) : \sum X_i = 2M\}$, where $X_1, \ldots, X_n$ are independent r.vs, each with the same distribution as the degree of one vertex.
3.4 Jumps and Repeated Values
We have seen that if $p$ is not too close to 0 or 1, then a.e. $G_p$ has a unique vertex of maximum degree. In fact, unless $p$ is close to 0 or 1, for a.e. $G_p$ the first several values of the degree sequence are strictly decreasing. The following result gives a good estimate of the likely jumps $d_i - d_{i+1}$.
\[
1 - \Phi(x) = m/n.
\]
1 2
Note that x = O{logn) / }. By Theorem 3.12, a.e. Gp satisfies
dm Ž: pn + (x - g)(pqn)
where
r = {log(n/m)}l 1 2.
where
oc(n) pqn 1/j2
- i 2 G ogn J
The expected number of such ordered pairs is clearly
\[
L = n(n-1)\left\{p\sum_{k=K-1}^{n-2} b(k; n-2)\sum_{l=k}^{k+J-1} b(l; n-2) + q\sum_{k=K}^{n-2} b(k; n-2)\sum_{l=k}^{k+J-1} b(l; n-2)\right\}
\]
\[
\le n^2 \sum_{k=K-1}^{n-2} b(k; n-2)\, J\, b(k; n-2) = O\{\alpha(n)\}.
\]
Since $\alpha(n) \to 0$, the expected number of pairs tends to 0. $\Box$
Let us state without proof that Theorem 3.15 is essentially best possible.
Theorem 3.16 Suppose $m \to \infty$ and $\omega(n) \to \infty$. Then a.e. $G_p$ is such that
\[
d_i - d_{i+1} < \frac{\omega(n)}{m}\left(\frac{pqn}{\log n}\right)^{1/2} \quad\text{for some } i \le m.
\]
$d(x_1) = d_1 \ge d(x_2) = d_2 \ge \cdots \ge d(x_n) = d_n$. If $d_i < d_{i+1} + 2$ for some $i \le m$, then put $G_p$ into $\mathscr{G}(n, p) - \mathscr{A}$ and end the algorithm. Otherwise for $i = 1, 2, \ldots, n$ compute
\[
f(x_i) = \sum_{j=1}^{n} a(i, j) 2^j,
\]
Exercises
3.1 Let $x$ be a positive constant, $k \ge 2$ a fixed integer and $p = p(n) \sim x n^{-1-1/k}$. Then a.e. $G_p$ has no vertices of degree $k + 1$ and the distribution of the number of vertices of degree $k$ tends to $P_\lambda$, where $\lambda = x^k/k!$. Deduce that if for fixed $k$ we have $pn^{1+1/k} \to \infty$ and $pn^{(k+2)/(k+1)} \to 0$, then a.e. $G_p$ has maximum degree $k$.
where $\lambda = \sum_{i=1}^{l} \lambda_i$. (Bollobás, 1982a.)
3.4 Let $c$ and $p$ be as in the previous exercise. Let $\eta_\Delta = \eta_\Delta(c)$ be the unique positive root of
\[
1 + c\eta = c(1 + \eta)\log(1 + \eta)
\]
and, for $c > 1$, let $\eta_\delta = \eta_\delta(c)$ be the unique root satisfying $-1 < \eta_\delta < 0$ (see Figure 3.1). Show that a.e. $G_p$ is such that
Fig. 3.1. The functions $\eta_\Delta(c)$ and $\eta_\delta(c)$.
3.5 Let $c > 0$ and $p = p(n) \sim c/n$. Prove that for a.e. $G_p$ the maximum degree satisfies
\[
\Delta(G_p) = \frac{\log n}{\log\log n}\left\{1 + \frac{\log\log\log n + 1 + \log c + o(1)}{\log\log n}\right\}.
\]
3.6 Let $k \in \mathbb{N}$. Show that a.e. $G_{k\text{-out}}$ has minimum degree $k$ and maximum degree
\[
\frac{\log n}{\log\log n}\left\{1 + \frac{\log\log\log n + 1 + \log k + o(1)}{\log\log n}\right\}.
\]
3.7 Write $B_p$ and $B_M$ for random graphs from $\mathscr{G}\{K(n,n), p\}$ and $\mathscr{G}\{K(n,n), M\}$. Show that if $x \in \mathbb{R}$, $p(n) = \{\log n + k\log\log n + x + o(1)\}/n$ and $M(n) = (n/2)\{\log n + k\log\log n + x + o(1)\}$, then
and
3.8 Prove that Corollary 3.14 is best possible: if almost all $G_p$'s have about the same maximal degree, then $pn/\log n \to \infty$.
3.9 Show that if $(pn - \log n)/\log\log n \to \infty$, then there is a function $d(n) \to \infty$ such that $\delta(G_p) = \{1 + o(1)\}d(n)$ almost surely. [Note that $d(n)$ need not be $pn$.]
3.10 Let V_1, V_2, …, V_t be disjoint copies of V = {1, 2, …, n}. Select N elements of ⋃_k V_k^{(2)} at random and join i to j by as many edges as the number of copies of ij that have been selected. Denote by G_{t,N} the random multigraph obtained in this way and let G_{s,t,N} be the graph in which i is joined to j iff in G_{t,N} they are joined by at least s edges. Use the connection between G_N and G_p to prove that if
$$N = \frac{t}{2}\binom{t}{s}^{-1/s} n^{2-1/s}\{\log n + c + o(1)\}^{1/s},$$
then the distribution of the number of isolated vertices of G_{s,t,N} tends to P(e^{−c}). (Godehardt, 1980, 1981.)
4
Small Subgraphs
Given a fixed graph F, what is the probability that a r.g. Gp (or GM)
contains F? The present chapter is devoted to a thorough examination
of this question.
The degree sequence of a graph G ∈ 𝒢ⁿ depends only on the number of edges in the sets V_i^{(2)} = {ij : 1 ≤ j ≤ n, j ≠ i} ⊂ V^{(2)}, i = 1, …, n. Our job in the previous chapter was made fairly easy by the fact that these subsets V_i^{(2)} are not very numerous and not very large (there are n of them, each of size n − 1) and that they are essentially disjoint (any two of them have exactly one element in common). The property of containing a fixed graph F has similar advantages: it depends on the edge distribution in fairly few, fairly small and essentially independent subsets of V^{(2)}.
Erdős and Rényi (1959, 1960, 1961a) were the first to study the appearance of small subgraphs; further results were obtained by Schürger (1979b), Bollobás (1981d), Palka (1982) and Barbour (1982). In the first section we shall study a rather special class of subgraphs. As a corollary of the results, in the second section we shall determine the threshold function for the existence of an arbitrary subgraph F. It will turn out that the threshold function depends only on the maximum of the average degrees of the subgraphs of F. The third section is devoted to the case when a r.g. is likely to contain many copies of our graph F. The first two sections are based on Bollobás (1981d); the third section contains results from Barbour (1982).
The study of the number of subgraphs in a r.g. generated by some underlying geometric structure is related to the topic of this chapter but is outside our scope (see Hafner, 1972a,b; Schürger, 1976).
4.1 Strictly Balanced Graphs

where λ = c^l/a.

$$E(X) = \binom{n}{k}\frac{k!}{a}\,p^l \sim \frac{c^l}{a} = \lambda. \qquad (4.1)$$
Indeed, there are $\binom{n}{k}$ ways of selecting the k vertices of an H-graph from V = {1, 2, …, n}, and then there are k!/a ways of choosing the edges of an H-graph with a given vertex set. The probability of selecting l given edges is p^l.
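This count is easy to check numerically for the simplest strictly balanced graph, H = K_3 (k = 3, l = 3, a = 6, so k!/a = 1). The sketch below — an illustration, not taken from the text — samples G(n, p) at p = c/n and compares the empirical mean number of triangles with $\binom{n}{3}p^3 \sim c^3/6$:

```python
import random
from itertools import combinations
from math import comb

def count_triangles(n, p, rng):
    # sample G(n, p) as adjacency sets, keeping the edge list as we go
    adj = [set() for _ in range(n)]
    edges = []
    for i, j in combinations(range(n), 2):
        if rng.random() < p:
            adj[i].add(j); adj[j].add(i)
            edges.append((i, j))
    # each triangle contains three edges, so it is counted three times below
    return sum(len(adj[i] & adj[j]) for i, j in edges) // 3

n, c = 200, 2.0
p = c / n                       # the scaling n^{-k/l} with k = l = 3 for H = K_3
rng = random.Random(0)
trials = 300
avg = sum(count_triangles(n, p, rng) for _ in range(trials)) / trials
expected = comb(n, 3) * p ** 3  # (n choose 3) * (3!/a) * p^3 with a = 6
print(avg, expected)            # both should be close to c^3/6 = 4/3
```

The agreement improves as n grows; the specific values n = 200, c = 2 are arbitrary choices for the experiment.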
By Theorem 1.20 it suffices to show that for every fixed r, r = 0, 1, …, the rth factorial moment E_r(X) of X tends to λ^r as n → ∞.
Now let us estimate E_r(X), the expected number of ordered r-tuples of H-graphs. Denote by E′_r(X) the expected number of ordered r-tuples of H-graphs having disjoint vertex sets V_1, V_2, …, V_r. Then, clearly,
$$E_r'(X) = \binom{n}{k}\binom{n-k}{k}\cdots\binom{n-(r-1)k}{k}\Bigl(\frac{k!}{a}\Bigr)^r p^{rl} = (n)_{rk}\,p^{rl}a^{-r} \sim \lambda^r. \qquad (4.2)$$
Therefore all we need is that E″_r(X) = E_r(X) − E′_r(X) = o(1). This will follow from the fact that H is strictly balanced.
Suppose the graphs A and B are such that B is an H-graph and it has exactly t vertices not belonging to A. If 0 < t < k, then
$$e(A \cup B) \ge e(A) + \frac{tl}{k} + \frac{1}{k}. \qquad (4.3)$$
Indeed, since H is strictly balanced, fewer than (k − t)l/k edges of B join its k − t vertices in V(A) ∩ V(B).
Let H_1, H_2, …, H_r be H-graphs with |⋃_{i=1}^r V(H_i)| = s < rk. Suppose H_j has t_j vertices not belonging to ⋃_{i<j} H_i, so that Σ_{j=2}^r t_j = s − k. We may assume that 0 < t_2 < k, so by (4.3)
$$e(H_1 \cup H_2) \ge l + \frac{t_2 l}{k} + \frac{1}{k}.$$
Since each further H_j contributes at least t_j l/k edges, we have
$$e\Bigl(\bigcup_{i=1}^{r} H_i\Bigr) \ge \frac{sl}{k} + \frac{1}{k}. \qquad (4.4)$$
Proof Let 0 < c_1 < c < c_2 and p_i = c_i n^{−k/l}, i = 1, 2. Since X(G) ≤ X(G′) whenever G ⊂ G′, the result follows from Theorem 4.1 if we recall that c_1 and c_2 are arbitrary numbers satisfying 0 < c_1 < c < c_2. □
The conditions in Theorem 4.1 can be weakened to include the case when H depends on n but k and l grow rather slowly with n.
krkl = o(n).
Proof All we have to check is that E′_r → λ^r and E″_r = o(1) continue to hold. The first is immediate:
$$(c^l/a)^r \ge E_r' \ge (c^l/a)^r\Bigl(1-\frac{rk}{n}\Bigr)^{rk} = \{1+o(1)\}(c^l/a)^r.$$
For the second,
$$E_r'' \le \sum_{s=k+1}^{rk-1}\binom{n}{s}s^{rk}\,p^{sl/k+1/k} = o(1).$$
Corollary 4.4 The conclusions of Theorem 4.1 and Corollary 4.2 hold if H, k, l, a and c depend on n, provided c^l/a → λ for some positive constant λ and
k = O{(log n)^{1/4}}.
Proof Since there are (k + 1)^{k−1} labelled trees of order k + 1, the probability that G_p contains a tree of order k + 1 is at most
$$\binom{n}{k+1}(k+1)^{k-1}p^k = O\bigl(n^{-1/(k-1)}\bigr).$$
The proofs of Theorem 4.1 and Theorem 1.21 easily imply that if H_1, …, H_m are non-isomorphic strictly balanced graphs having the same (average) degree, then in an appropriate range of p the multiplicities of the H_i's are asymptotically independent Poisson r.vs. The easy proof is left as an exercise.
Since cycles are strictly balanced and have average degree 2, we have
the following corollary.
where λ_i = c^i/(2i).
Proof The automorphism group of a cycle of length i has order 2i. □
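The corollary can be checked numerically for i = 4 (an illustration, not from the text): each 4-cycle a–b–c–d is determined by its two diagonal pairs {a, c} and {b, d}, so summing $\binom{|N(u)\cap N(v)|}{2}$ over all vertex pairs counts every 4-cycle exactly twice. At p = c/n the empirical mean should approach λ_4 = c⁴/8:

```python
import random
from itertools import combinations
from math import comb, perm

def count_4cycles(n, p, rng):
    adj = [set() for _ in range(n)]
    for i, j in combinations(range(n), 2):
        if rng.random() < p:
            adj[i].add(j); adj[j].add(i)
    # a 4-cycle a-b-c-d is counted once by each of its diagonals {a,c} and {b,d}
    total = sum(comb(len(adj[i] & adj[j]), 2)
                for i, j in combinations(range(n), 2))
    return total // 2

n, c = 200, 1.5
p = c / n
trials = 200
rng = random.Random(1)
avg = sum(count_4cycles(n, p, rng) for _ in range(trials)) / trials
exact = perm(n, 4) * p ** 4 / 8   # (n)_4 p^4 / (2*4); |Aut(C_4)| = 8
print(avg, exact, c ** 4 / 8)
```

The parameters n = 200, c = 1.5, and the seed are arbitrary choices for the experiment.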
and
pn/ 0
for all r, t ∈ ℕ.
Let r and t be fixed natural numbers and let ε > 0. By (4.2) we have
where
0 < Cn < (rk)2 /n = o(An t ). (4.7)
4.2 Arbitrary Subgraphs 85
Furthermore, by (4.4),
$$E_r''(X) \le \sum_{s=k}^{rk-1}\binom{n}{s}s^{rk}\,p^{sl/k+1/k} \le p^{1/k}\sum_{s=k}^{rk-1} n^s s^{rk} p^{sl/k} = o(\lambda^r). \qquad (4.8)$$
Since E_r(X) = E′_r(X) + E″_r(X), relation (4.5) follows from (4.6), (4.7) and (4.8). □
In §3 we shall prove a considerably stronger result about Poisson
approximation.
$$d(F_1) = m(F) \ge d(F_2|F_1) \ge d(F_3|F_2) \ge \cdots \ge d(F_s|F_{s-1}).$$
E(Y) - c (k ='it
where c ≥ 1 depends only on the pair (F, F_0). Since pn^{k/l} → ∞, we have μ → ∞ as n → ∞.
Let us give an upper bound for the second factorial moment E_2(Y), that is, for the expected number of ordered pairs (H_1, H_2) of appropriate H-graphs. Denote by E′_2(Y) the expected number of pairs (H_1, H_2) with V(H_1) ∩ V(H_2) = V(F_0) and set E″_2(Y) = E_2(Y) − E′_2(Y). Clearly
$$E_2''(Y) \le c_2\mu^2\sum_{s=k}^{2k-1} n^{s-2k}\,p^{sl/k+1/k-2l} \le c_3\mu^2\,n^{-1}\,p^{(1-l)/k} = o(\mu^2).$$
and
$$E(Y^2) = \{1 + o(1)\}\mu^2.$$
Hence, by Chebyshev's inequality,
$$P(Y = 0) = o(1).$$
Theorem 4.12 Suppose the graph F has a unique subgraph F_1 with d(F_1) = m(F) = 2l/k, where k = |F_1| and l = e(F_1) ≥ 2. Let p = cn^{−k/l}, where c is a positive constant, and denote by Y = Y(G) the number of F_1-graphs contained in F-graphs of G ∈ 𝒢(p). Then Y →_d P(λ), where λ = c^l/a and a is the order of the automorphism group of F_1.
In particular, the probability that a r.g. G_p contains at least one F-graph tends to 1 − e^{−λ}.
n_1 = |V_1| = {1 − (s − 1)ε}n and n_i = |V_i| = εn, i = 2, 3, …, s.
Note that p ~ ĉ n_1^{−k/l}, so by Corollary 4.5
$$P(G[V_1] \text{ contains exactly } r \text{ disjoint } F_1\text{-graphs}) \to e^{-\lambda_\varepsilon}\lambda_\varepsilon^r/r!.$$
Since p n_i^{k_i/l_i} → ∞ for i = 2, 3, …, s, for every fixed F_1-graph with vertex set in V_1 we have
$$P(G[V_1 \cup V_2] \supset H \supset F_1 \text{ for some } H \cong F_2 \mid G[V_1] \supset F_1) \to 1.$$
Similarly, if F̃_i is a fixed F_i-graph whose vertex set is contained in ⋃_{j=1}^{i} V_j, then
$$P\Bigl(G\Bigl[\bigcup_{j=1}^{i+1}V_j\Bigr] \supset H \supset \tilde F_i \text{ for some } H \cong F_{i+1} \Bigm| G\Bigl[\bigcup_{j=1}^{i}V_j\Bigr] \supset \tilde F_i\Bigr) \to 1.$$
Hence almost every graph containing r disjoint F_1-graphs contains r F-graphs whose F_1-subgraphs are disjoint. Therefore, since λ_ε → λ as ε → 0,
Y →_d P(λ), as claimed. □
In the result above we cannot discard the condition that F_1 is the unique subgraph of maximal average degree. For example, if F = 2K_3, that is, F is a disjoint union of two triangles, and X(G) is the number of F-graphs in G, then with p = c/n we have
$$\lim_{n\to\infty} P_p(G \supset F) = \begin{cases} 0, & \text{if } pn^{k/l} \to 0,\\ 1, & \text{if } pn^{k/l} \to \infty.\end{cases}$$
$$P_p(G_p \not\supset F) = P_p(X_F = 0) \le \frac{\sigma^2(X_F)}{\{E(X_F)\}^2}.$$
The real content of this theorem is the upper bound; it can be proved by making use of the martingale inequality given in Theorem 1.20. For a sketch of this proof and many other beautiful results about small subgraphs of random graphs, see Chapter III of Janson, Łuczak and Ruciński (2000).
Then the total variation distance between the distribution of Y and P_λ tends to 0; in fact,
$$|\mathscr{A}| = \binom{n}{k}\frac{k!}{a} = \frac{(n)_k}{a}$$
and
$$E(Y) = p^l\,|\mathscr{A}| = \lambda.$$
With a slight abuse of notation the elements of 𝒜 will be denoted by α, β, γ, …. For α ∈ 𝒜 and G_p ∈ 𝒢_p set
$$X_\alpha = X_\alpha(G_p) = \begin{cases}1, & \text{if } \alpha \subset G_p,\\ 0, & \text{otherwise,}\end{cases}$$
and
$$Y_\alpha = \sum_\beta X_\beta,$$
where β ∩ α denotes the set of common edges of the graphs β and α, and the summation is over all elements β of 𝒜 satisfying β ∩ α = ∅. Finally, let us write p_α for E(X_α) = p^l.
Our aim is to apply Theorem 1.28 of Chapter 1 to estimate the
total variation distance between the distribution of Y and the Poisson
distribution. The following identity, valid for all functions x : Z+ --+ R,
will enable us to estimate the left-hand side of the equation appearing in
Theorem 1.28(iii):
Furthermore, if X_α = 1, then
$$0 \le Y - Y_\alpha - 1 = \sum_{\beta\cap\alpha\neq\emptyset,\ \beta\neq\alpha} X_\beta,$$
so
Let us take the expectation of (4.14). Then (4.15), (4.16) and the independence of X_α and Y_α imply that
$$\bigl|E\{\lambda x(Y+1) - Yx(Y)\}\bigr| \le (\Delta x)\sum_\alpha\Bigl\{p_\alpha^2 + \sum_{\beta\cap\alpha\neq\emptyset,\ \beta\neq\alpha} E(X_\alpha X_\beta)\Bigr\}. \qquad (4.17)$$
$$\sum_\alpha\ \sum_{\beta\cap\alpha\neq\emptyset,\ \beta\neq\alpha} E(X_\alpha X_\beta) = O\Bigl(\sum_{s=k+1}^{2k-2} n^s p^{sl/k+1/k}\Bigr) = O\bigl(n^{2k-2}p^{(2k-2)l/k+1/k}\bigr). \qquad (4.18)$$
Also,
Therefore Theorem 1.28(ii) and relations (4.17), (4.18) and (4.19) imply (4.20). Since the constants hidden in the O's above are independent of the set A, Theorem 1.28(iii) and (4.20) give
$$d\{\mathscr{L}(Y), P_\lambda\} = \sup_A\,\bigl|E\{\lambda x(Y+1) - Yx(Y)\}\bigr| = O\bigl(n^{k-2}p^{(k-2)l/k+1/k}\bigr),$$
as claimed by (4.13).
Exercises
4.1 What is the threshold function of the property of containing a
cycle? And a cycle and a diagonal?
4.2 Use Theorem 1.21 to prove Theorem 4.8.
4.3 Let F be a fixed graph with maximal average degree m > 0. Show that if lim_n P(G_M ⊃ F) = 1, then M/n^{2−2/m} → ∞, and if lim_n P(G_M ⊃ F) = 0, then M/n^{2−2/m} → 0.
4.4 Given ε > 0 and 0 < x < 1, find a first-order sentence Q and a sequence {M(n)} such that
n^{2−ε} ≤ M(n) ≤ N/2 and lim P_M(Q) = x.
[Moon and Moser (1962); see also Frank (1979b). The probability P_{1,n} that a tournament contains no 3-cycle (i.e. is acyclic) was estimated by Bollobás, Frank and Karoński (1983), who also calculated the asymptotic value of P_{n,⌈cn⌉}. In particular, they proved that
2
P,,n - 2 - (n !)2 (log 2)-2n+1 {nr(1 - log 2)}1/2.
5
The Evolution of Random Graphs — Sparse Components
In the next two chapters we shall study the global structure of a random
graph of order n and size M(n). We are interested mostly in the orders
of the components, especially in the maximum of the orders of the
components.
The chapters contain the classical results due to Erdős and Rényi (1960, 1961a) and the later refinements due to Bollobás (1984b). Before
getting down to the details, it seems appropriate to give a brief, intuitive
and somewhat imprecise description of the fundamental results. Denote
by Lj(G) the order of the jth largest component of a graph G. If G has
fewer than j components, then we put Lj(G) = 0. Furthermore, often
we write L(G) for L_1(G). Let us consider our random graph G_M as a living organism, which develops by acquiring more and more edges. More precisely, consider a random graph process G̃ = (G_t)_0^N. How does L(G_M) depend on M? If M/n^{(k−2)/(k−1)} → ∞ and M/n^{(k−1)/k} → 0 then a.e. G_M is such that L(G_M) = k. If M = ⌊cn⌋, where 0 < c < ½, then in a.e. G_M the order of a largest component is of order log n, but if M = ⌊cn⌋ and c > ½ then a.e. G_M has a unique largest component with about εn vertices, where ε > 0 depends only on c. (The unique largest component is usually called the giant component of G_M.) Furthermore, in the neighbourhood of t = n/2 for a.e. G̃ = (G_t)_0^N the function L(G_t) grows rather regularly with time: L(G_t) increases about four times as fast as t.
The present chapter is devoted to the study of components which are
trees or have only a few more edges than trees. The main aim of the
subsequent chapter is to examine the giant component.
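The phase transition described above can be watched in a small experiment (an illustration, not part of the text): grow a graph process by adding uniformly random vertex pairs one at a time, tracking the largest component with a union–find structure. With average degree 2t/n = 0.6 the largest component stays tiny; at 1.4 it occupies a positive fraction of the vertices:

```python
import random

def largest_component(n, m, rng):
    # union-find with path halving; add m uniformly random (multi)edges
    parent = list(range(n)); size = [1] * n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    biggest = 1
    for _ in range(m):
        i, j = rng.randrange(n), rng.randrange(n)  # repeats and loops are harmless here
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            size[rj] += size[ri]
            biggest = max(biggest, size[rj])
    return biggest

n = 20000
sub = largest_component(n, int(0.3 * n), random.Random(2))   # 2t/n = 0.6 < 1
sup = largest_component(n, int(0.7 * n), random.Random(3))   # 2t/n = 1.4 > 1
print(sub, sup)   # sub is O(log n)-sized; sup is a positive fraction of n
```

Sampling edges with replacement gives a multigraph process, which does not change the component behaviour for these purposes; n and the two times are arbitrary choices.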
5.1 Trees of Given Sizes as Components
M = M(n) ~ ½cn^{2−k/(k−1)} for some constant c > 0 and X_1 denotes the number of subgraphs of a random graph G_M isomorphic to T, then X_1 →_d P(λ), where λ = c^{k−1}/a. Also, if X_2 denotes the number of trees of order k contained in a graph G_M, then X_2 →_d P(μ), where μ = c^{k−1}k^{k−2}/k!. Furthermore, if X_3 is the number of components of G_M that are trees of order k, then X_3 →_d P(μ) also holds.
As M(n) grows to become of larger order than n^{2−k/(k−1)}, the number of tree components of order k of a typical G_M tends to infinity with n. However, in a large range of M(n) the distribution of the number of tree components of order k is still asymptotically Poisson. The proof of this beautiful theorem of Barbour (1982) is rather similar to the proof of Theorem 4.15; in particular, it is also based on Theorem 1.28. To simplify the calculation we work with the model 𝒢(n, p).
Let p = p(n), k = k(n) and denote by T_k = T_k(G_p) the number of isolated trees of order k in G_p. Write λ for the expectation of T_k. As by Cayley's formula there are k^{k−2} trees with k labelled vertices,
$$\lambda = E(T_k) = \binom{n}{k}k^{k-2}p^{k-1}q^{k(n-k)+\binom{k}{2}-k+1}. \qquad (5.1)$$
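Formula (5.1) is convenient to sanity-check by simulation (an illustration, not from the text): count the components of G(n, p) that are trees of order k and compare the empirical mean with λ:

```python
import random
from itertools import combinations
from math import comb

def isolated_trees(n, p, k, rng):
    # count components of G(n, p) that are trees with exactly k vertices
    adj = [set() for _ in range(n)]
    for i, j in combinations(range(n), 2):
        if rng.random() < p:
            adj[i].add(j); adj[j].add(i)
    seen, count = [False] * n, 0
    for v in range(n):
        if seen[v]:
            continue
        comp, stack = [], [v]
        seen[v] = True
        while stack:
            u = stack.pop()
            comp.append(u)
            for w in adj[u]:
                if not seen[w]:
                    seen[w] = True
                    stack.append(w)
        edges = sum(len(adj[u]) for u in comp) // 2
        if len(comp) == k and edges == k - 1:   # k vertices, k-1 edges: a tree
            count += 1
    return count

n, k, c = 400, 3, 1.0
p = c / n
lam = comb(n, k) * k ** (k - 2) * p ** (k - 1) \
      * (1 - p) ** (k * (n - k) + comb(k, 2) - k + 1)   # formula (5.1)
rng = random.Random(4)
trials = 150
avg = sum(isolated_trees(n, p, k, rng) for _ in range(trials)) / trials
print(avg, lam)
```

For these parameters λ ≈ n k^{k−2}c^{k−1}e^{−kc}/k! ≈ 10; the choices n = 400, k = 3, c = 1 are arbitrary.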
Theorem 5.1
where
c1(~k(n=
(lnk~)=
-
1)
k- 2
_k-I qk(n-k) T( ) )k -k i -
n
and
Proof Let 𝒯 be the set of all trees of order k whose vertex set is contained in V. As in the proof of Theorem 4.15, we shall use α, β, γ, … to denote elements of 𝒯. Furthermore, we shall frequently suppress the suffix k in T_k. For α ∈ 𝒯 and G_p ∈ 𝒢_p set
and
$$E(T) = E(T_k) = \sum_\alpha E(X_\alpha) = \binom{n}{k}k^{k-2}p^{k-1}q^{k(n-k)+\binom{k}{2}-k+1} = \lambda.$$
Hence
we have
$$\bigl|E\{\lambda x(T+1) - Tx(T)\}\bigr| \le 2(\Delta x)\,E(|T - T^*|). \qquad (5.2)$$
j=n-k+l
where
(5.4)
Now let A ⊂ ℤ⁺ and let x = x_{λ,A} be the function defined in Theorem 1.28. Then by Theorem 1.28 and relations (5.2), (5.3), (5.4) we have
$$|P(T \in A) - P_\lambda(A)| \le 2\lambda\min\{1, \lambda^{-1}\}\{c_1(n,k,p) + c_2(n,k,p)\} \le 2\{c_1(n,k,p) + c_2(n,k,p)\}$$
and so
$$d\{\mathscr{L}(T_k), P_\lambda\} \le 2\{c_1(n,k,p) + c_2(n,k,p)\}.$$
Consequently
$$d\{\mathscr{L}(T_k), P_\lambda\} \to 0$$
provided any of the following three conditions is satisfied:
(i) k(n) → ∞, (ii) np(n) → 0, (iii) k(n) ≥ 2 and p(n) = o(n^{−1}).
where a(n) is a bounded function. This can be read out of Corollary 5.2,
but it is also a simple consequence of Theorem 1.20.
Proof Parts (i) and (ii) were proved in Chapter 4, and (iv) is Theorem 5.3,
above.
To see (iii) suppose pkn ≥ 2k log n. Then, as in the proof of Theorem 5.3,
$$E(T_k) \le \frac{n^k k^{k-2}}{k!}\,p^{k-1}(1-p)^{kn} \le \frac{k^{k-2}}{k!}\,n\bigl(pn\,e^{-kpn/(k-1)}\bigr)^{k-1} \to 0,$$
implying (v).
The proof above shows that for a fixed k the expected number of isolated trees of order k in G_p increases as p increases up to p ~ (k − 1)/(kn), and from then on it decreases. Furthermore, the maximum is about
$$\frac{(k-1)^{k-1}}{k\cdot k!\,e^{k-1}}\,n.$$
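This maximization is elementary to verify numerically (an illustration, not from the text): for fixed k, the asymptotic form n k^{k−2}(pn)^{k−1}e^{−kpn}/k! of (5.1) is maximized over c = pn at c = (k − 1)/k, with maximum (k − 1)^{k−1}n/(k·k!·e^{k−1}):

```python
from math import exp, factorial

def expected_isolated_trees(n, k, c):
    # asymptotic form of (5.1) for fixed k and p = c/n
    return n * k ** (k - 2) * c ** (k - 1) * exp(-k * c) / factorial(k)

n, k = 10 ** 6, 4
grid = [i / 1000 for i in range(1, 3000)]
best_c = max(grid, key=lambda c: expected_isolated_trees(n, k, c))
peak = expected_isolated_trees(n, k, best_c)
claim = n * (k - 1) ** (k - 1) / (k * factorial(k) * exp(k - 1))
print(best_c, peak, claim)   # best_c = (k-1)/k = 0.75 and peak matches the claim
```

The grid search over c is a crude but sufficient check; k = 4 and n = 10⁶ are arbitrary.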
Theorem 5.5 Let k ≥ 2 and c > 0 be fixed and let p = p(n) ~ c/n. Then
where
$$\lambda_k = E(T_k) \sim \frac{k^{k-2}}{k!}\,c^{k-1}e^{-kc}\,n$$
and
$$\sigma_k^2 = \sigma^2(T_k) \sim \lambda_k\bigl\{1 + (c-1)\,k(kc)^{k-1}e^{-kc}/k!\bigr\}.$$
Theorem 5.6 For every fixed k ≥ 2 there exists a constant C(k) such that the r.v. T_k on 𝒢{n, p(n)} satisfies
where
$$t(c) = \frac{1}{c}\sum_{k=1}^{\infty}\frac{k^{k-1}}{k!}\,(ce^{-c})^k.$$
Remark. By standard but tedious manipulations one can show that the function s = s(c) = ct(c) is the only solution of
$$se^{-s} = ce^{-c}, \qquad 0 < s \le 1.$$
$$E\{T(G_{1/n})\} = n + O(1).$$
This shows that s = s(c) = ct(c) is a solution of se^{−s} = ce^{−c} in 0 < s < 1. It is easily seen that there is only one such solution, so s is exactly this solution.
Proof of Theorem 5.7 (i) Write X_k for the number of k-cycles in G_p, where p ~ c/n. Then
$$E(X_k) = \frac{(n)_k}{2k}\,p^k \sim \frac{c^k}{2k},$$
so
$$\sum_{k=3}^{\infty} E(X_k) \sim \sum_{k=3}^{\infty}\frac{c^k}{2k} < \infty.$$
$$E(T_k) = \binom{n}{k}k^{k-2}\Bigl(\frac{c}{n}\Bigr)^{k-1}\Bigl(1-\frac{c}{n}\Bigr)^{kn-k(k+3)/2+1} \qquad (5.9)$$
and
$$E(T_k) \ge \frac{k^{k-2}}{k!}\Bigl(1-\frac{k}{n}\Bigr)^k c^{k-1}e^{-ck}e^{-ck^2/n}\,n,$$
so
$$E\Bigl(\sum_k kT_k\Bigr) \sim n\sum_k \frac{k^{k-1}}{k!}\,c^{k-1}e^{-ck}.$$
show that the terms with k > n^{1/2} contribute rather little to the sums. Note that, by relation (5.9), if 1 ≤ k < n, then
$$E\{(k+1)T_{k+1}\}/E(kT_k) = (n-k)\Bigl(1+\frac{1}{k}\Bigr)^{k-1}\frac{c}{n}\Bigl(1-\frac{c}{n}\Bigr)^{n-k-2}, \qquad (5.13)$$
so
$$E\Bigl(\sum_{k=1}^{n} kT_k\Bigr) \sim n\sum_{k=1}^{\infty}\frac{k^{k-1}}{k!}\,c^{k-1}e^{-ck}
\quad\text{and}\quad
E\Bigl(\sum_{k=k_1}^{n} kT_k\Bigr) = O(1).$$
Corollary 5.8 Suppose p = c/n, 0 < c < 1, and ω(n) → ∞. Then a.e. G_p is such that every component is a tree or a unicyclic graph, and there are at most ω(n) vertices on the unicyclic components.
Proof We may assume that ω(n) ≤ log log n. Since T(G_p) ≤ n for every G_p and E(T) = n + O(1),
$$P\{n - T(G_p) \ge \omega(n)\} \le E\{n - T(G_p)\}/\omega(n) \to 0.$$
For 1 ≤ k ≤ l ≤ k_1 the expected number of ordered pairs of components of G_p, the first of which is a tree of order k and the second is a tree of order l, is clearly
$$\binom{n}{k}\binom{n-k}{l}\,k^{k-2}\,l^{l-2}\Bigl(\frac{c}{n}\Bigr)^{k+l-2}\Bigl(1-\frac{c}{n}\Bigr)^{(k+l)\{n-(k+l+3)/2\}+2}.$$
Hence
$$E(T^2) \le \Bigl\{\sum_{k=1}^{k_1} k\,E(T_k)\Bigr\}^2\bigl(1 + 2Ck_1^2/n\bigr) + \sum_{k=1}^{k_1}k^2\,E(T_k) = \{E(T)\}^2 + O\Bigl(\frac{k_1^2}{n}\{E(T)\}^2\Bigr) + O\bigl(k_1^2\,E(T)\bigr).$$
In the proofs of Theorems 5.7 and 5.9 we made use of the fact that Σ_{k=k_1}^{n} k E_p(T_k) = o(1) if p ~ c/n and c > 1. In fact, trees of considerably smaller order are also unlikely to appear in G_p.
Theorem 5.10 Let 0 < c < ∞, c ≠ 1 and p = c/n. Set α = c − 1 − log c and let
$$\lambda = \frac{n}{c\sqrt{2\pi}}\,\frac{e^{-\alpha k_0}}{(1-e^{-\alpha})\,k_0^{5/2}}.$$
Proof It is easily seen that for M > 0 there is a constant c_2 > 0 such that the terms below decrease at least geometrically. Set
$$Y = \sum_{k=k_0}^{n} T_k,$$
so that
$$E(Y) = \sum_{k=k_0}^{n}\binom{n}{k}k^{k-2}\Bigl(\frac{c}{n}\Bigr)^{k-1}\Bigl(1-\frac{c}{n}\Bigr)^{kn-k(k+3)/2+1} \sim \frac{n}{c\sqrt{2\pi}}\sum_{k=k_0}^{\infty}k^{-5/2}e^{-\alpha k} = \{1+o(1)\}\lambda.$$
Corollary 5.11 Let 0 < c < ∞, c ≠ 1, p = c/n, α = c − 1 − log c and ω(n) → ∞. Denote by L^{(1)}(G_p) the maximum order of a tree component of G_p. Then a.e. G_p is such that
$$\Bigl|L^{(1)}(G_p) - \frac{1}{\alpha}\Bigl(\log n - \frac{5}{2}\log\log n\Bigr)\Bigr| \le \omega(n).$$
The last result indicates that if p is very close to 1/n, then the maximum order of a tree component of G_p is of larger order than log n. We shall see in the next section that as p approaches 1/n, this maximum order becomes n^{2/3}.
The study of the number of tree components is similar to the study of
the number of vertices on tree components and the result is no surprise.
The proof, which is very close to the proofs of Theorems 5.7 and 5.9, is
left to the reader.
Theorem 5.12 Let 0 < c < ∞, c ≠ 1, p = c/n and denote by U = U(G_p) the number of tree components of G_p. Then
$$E(U) = u(c)\,n + O(1),$$
where
$$u(c) = \frac{1}{c}\sum_{k=1}^{\infty}\frac{k^{k-2}}{k!}\,(ce^{-c})^k$$
and
$$\sigma^2(U) = O(n).$$
Theorem 5.13 Let 2 ≤ k(n) ≤ n. Then E_{θ/n}(T_k) attains its supremum in θ > 0 at θ_k = (k − 1)n/[k{n − (k + 1)/2}].
If k = n, then θ_n = 2 and
$$E_{2/n}(T_n) \sim \frac{e^2}{2n}\Bigl(\frac{2}{e}\Bigr)^n. \qquad (5.19)$$
Moreover, for 2 ≤ k < n,
$$E_{\theta_k/n}(T_k) = O(nk^{-5/2}). \qquad (5.20)$$
5.3 The Largest Tree Components
Furthermore, for every θ > 0,
$$E_{\theta/n}(T_k) \le \{1 + o(1)\}\,n\,k^{-5/2}. \qquad (5.21)$$
Proof The fact that E_{θ/n}(T_k) attains its supremum at θ_k = (k − 1)n/[k{n − (k + 1)/2}] follows from (5.18) upon differentiating with respect to θ. Hence
$$E_{\theta_k/n}(T_k) = \binom{n}{k}k^{k-2}\Bigl(\frac{\theta_k}{n}\Bigr)^{k-1}\Bigl(1-\frac{\theta_k}{n}\Bigr)^{k\{n-(k+3)/2\}+1}.$$
Clearly θ_n = 2 and
$$E_{2/n}(T_n) = n^{n-2}\Bigl(\frac{2}{n}\Bigr)^{n-1}\Bigl(1-\frac{2}{n}\Bigr)^{n(n-3)/2+1} \sim \frac{e^2}{2n}\Bigl(\frac{2}{e}\Bigr)^n,$$
since
$$\Bigl(1-\frac{2}{n}\Bigr)^{n(n-3)/2+1} \sim e^{-n+2}.$$
Therefore
$$E_{\theta_k/n}(T_k) = O\bigl(nk^{-5/2}\bigr),$$
proving (5.20).
Armed with Theorem 5.13 we can determine how small k(n) has to be
if we want to guarantee that for some p almost every Gp contains an
isolated tree of order k.
Proof (i) Suppose k/n^{2/5} → ∞ and 0 < p < 1. Then by (5.19) and (5.20)
$$P_p(T_k > 0) \le E_p(T_k) \le E_{\theta_k/n}(T_k) = O(nk^{-5/2}) = o(1).$$
(ii) Assume that k ~ cn^{2/5}. Then in 𝒢(n, 1/n) the rth factorial moment of T_k is easily estimated:
$$E_r(T_k) \sim \Bigl\{\binom{n}{k}k^{k-2}\,n^{-(k-1)}\Bigl(1-\frac{1}{n}\Bigr)^{k(n-k)+\binom{k}{2}-k+1}\Bigr\}^r \sim \{E(T_k)\}^r,$$
so
$$\frac{E_p(T_{k_1}T_{k_2})}{E_p(T_{k_1})E_p(T_{k_2})} = \binom{n-k_1}{k_2}\binom{n}{k_2}^{-1}(1-p)^{-k_1k_2} \le \Bigl(1-\frac{k_1}{n}\Bigr)^{k_2}(1-p)^{-k_1k_2} \le 1.$$
Consequently
$$E_p(T_{k_1}T_{k_2}) \le E_p(T_{k_1})\,E_p(T_{k_2}). \qquad (5.27)$$
Now if k = o(n^{2/5}), then in 𝒢(n, 1/n) we have E(T_k) → ∞. Therefore, by (5.27),
$$E(T_k^2) = E\{T_k(T_k - 1)\} + E(T_k) \le \{E(T_k)\}^2 + E(T_k) = \{1+o(1)\}\{E(T_k)\}^2.$$
Therefore P(T_k > 0) → 1.

Proof (i) Suppose k_0/n^{2/3} → ∞. Then by Theorem 5.13
$$E_p\Bigl(\sum_{k=k_0}^{n}T_k\Bigr) \le \sum_{k=k_0}^{n}E_{\theta_k/n}(T_k) = O(n)\sum_{k=k_0}^{n}k^{-5/2} = o(1).$$
$$E_{1/n}(T_k) \sim \frac{n}{\sqrt{2\pi}}\,k^{-5/2}\exp\{-k^3/(6n^2)\}. \qquad (5.28)$$
In particular, if k = o(n^{2/3}), then E_{1/n}(T_k) ~ nk^{−5/2}/√(2π).
Hence with U = Σ{T_k : c_1n^{2/3} < k ≤ c_2n^{2/3}} we have
In the second relation above we made use of the fact that {c_2(n) − c_1(n)}n^{2/3} → ∞.
Analogously to part (iii) of the proof of Theorem 5.14, denote by E(T_{k_1}, T_{k_2}, …, T_{k_r}) the expected number of ordered r-tuples of components of G_{1/n} such that the ith component is a tree of order k_i. If K = Σ_{i=1}^{r} k_i = o(n^{2/3}), then
$$E(T_{k_1}, \dots, T_{k_r}) \sim \prod_{i=1}^{r} E(T_{k_i}).$$
Consequently
$$E_r(U) \sim \{E(U)\}^r \sim \lambda^r$$
for every r ≥ 1, and so our assertion follows from Theorem 1.21.
(iii) This is an immediate consequence of (ii). □
[Fig. 5.3. The function $\rho(c) = \frac{1}{6\sqrt{3\pi}}\int_{c^3/6}^{\infty} v^{-3/2}e^{-v}\,dv$, the limit of the expected number of isolated trees of order at least cn^{2/3} in G_{1/n}.]
Hence
Ez(V) •< e-C 3{E(V)}2 = e c3•2,
With a little work one can show that if either k(n) → ∞ or θ(n) → ∞ then E(C_k) → 0. Finally, if k is constant and p ~ c/n for some constant c > 0, then
$$E(C_k) \to \frac{c^k e^{-kc}}{2k} = \lambda,$$
so C_k →_d P_λ by Theorem 1.20. The details are left to the reader (Ex. 5).
= 0.011564 ....
The study of the components other than cycles needs some preparation. Denote by 𝒞(n, m) the set of connected graphs with vertex set V = {1, 2, …, n} having m edges, and put C(n, m) = |𝒞(n, m)|. Clearly 𝒞(n, n + k) = ∅ if k ≤ −2, 𝒞(n, n − 1) is the set of trees so C(n, n − 1) = n^{n−2}, and 𝒞(n, n) is the set of connected unicyclic graphs. The function C(n, n) was determined by Katz (1955) and Rényi (1959b). The formula can also be read out of more general results of Ford and Uhlenbeck (1957), and another proof was given by Wright (1977a). Here we shall prove it from the following lemma, due to Rényi (1959a) but asserted already by Cayley (1889).
Lemma 5.17 Let 1 ≤ k ≤ n. Denote by F(n, k) the number of forests on V = {1, 2, …, n} which have k components and in which the vertices 1, 2, …, k belong to distinct components. Then
$$F(n, k) = k\,n^{n-k-1}. \qquad (5.30)$$
Proof The lemma can be proved in many ways. Here we shall present two proofs. Both are different from those given by Katz (1955), Rényi (1959a) and Wright (1977a).
Our first proof is based on the following special case of Abel's generalization of the binomial formula:
proving (5.30).
Our second proof is based on Prüfer codes for trees. The forests appearing in the lemma are in one-to-one correspondence with the trees on V′ = {0, 1, 2, …, n} in which the neighbours of vertex 0 are 1, 2, …, k. If the code of a tree on V′ is constructed by writing down the neighbour of the current end vertex with the maximal label, then the code we obtain is such that the last k − 1 elements are 0's, 0 does not occur anywhere else, and the element preceding the 0's is one of 1, 2, …, k. Conversely, every such code belongs to a tree in which the neighbours of 0 are 1, 2, …, k. Since the number of such codes is k·n^{n−k−1}, our lemma follows. □
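Lemma 5.17 can be confirmed by brute force for small n (an illustration, not from the text): enumerate all (n − k)-edge acyclic subgraphs of K_n and keep those in which the first k vertices lie in distinct components:

```python
from itertools import combinations

def count_forests(n, k):
    # brute force: forests on n labelled vertices with k components,
    # vertices 0, ..., k-1 required to lie in distinct components
    all_edges = list(combinations(range(n), 2))
    total = 0
    for edges in combinations(all_edges, n - k):   # a k-component forest has n-k edges
        parent = list(range(n))

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        acyclic = True
        for a, b in edges:
            ra, rb = find(a), find(b)
            if ra == rb:
                acyclic = False
                break
            parent[ra] = rb
        if acyclic and len({find(v) for v in range(k)}) == k:
            total += 1
    return total

for n, k in [(4, 1), (4, 2), (5, 1), (5, 2), (5, 3)]:
    print(n, k, count_forests(n, k), k * n ** (n - k - 1))
```

For every pair tried the enumeration matches k·n^{n−k−1}; only tiny n are feasible, since the search is exponential.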
Theorem 5.18
$$C(n,n) = \frac{1}{2}\sum_{r=3}^{n}(n)_r\,n^{n-r-1} = \frac{n^{n-1}}{2}\sum_{r=3}^{n}\,\prod_{j=1}^{r-1}\Bigl(1-\frac{j}{n}\Bigr).$$
Proof Since
$$\prod_{j=1}^{r-1}\Bigl(1-\frac{j}{n}\Bigr) = \exp\Bigl\{-\binom{r}{2}\frac{1}{n} + O(r^3/n^2)\Bigr\} = e^{-r^2/(2n)}\{1 + O(r/n) + O(r^3/n^2)\},$$
we have
$$\sum_{r=3}^{n}\,\prod_{j=1}^{r-1}\Bigl(1-\frac{j}{n}\Bigr) \sim \sum_{r=3}^{n} e^{-r^2/(2n)} \sim \Bigl(\frac{\pi n}{2}\Bigr)^{1/2}.$$
Therefore
$$C(n, n) = (\pi/8)^{1/2}\,n^{n-1/2} + O(n^{n-1}). \qquad\Box$$
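Theorem 5.18 meshes with the cycle-plus-forest decomposition: choosing an r-cycle ($\binom{n}{r}(r-1)!/2$ ways) and attaching a forest whose components each contain one cycle vertex (r·n^{n−r−1} ways, by Lemma 5.17) gives C(n, n) = ½Σ_r (n)_r n^{n−r−1}. The sketch below (an illustration, not from the text) checks this against direct enumeration for n ≤ 5:

```python
from itertools import combinations
from math import comb, factorial

def unicyclic_formula(n):
    # C(n,n) via cycle-plus-forest: choose the r cycle vertices, a cyclic order,
    # then a forest whose components each contain one cycle vertex (Lemma 5.17)
    return sum(comb(n, r) * (factorial(r - 1) // 2) * (n ** (n - r) * r // n)
               for r in range(3, n + 1))

def unicyclic_brute(n):
    all_edges = list(combinations(range(n), 2))
    total = 0
    for edges in combinations(all_edges, n):       # n vertices, n edges
        # connected with n edges on n vertices <=> connected and unicyclic
        adj = [[] for _ in range(n)]
        for a, b in edges:
            adj[a].append(b); adj[b].append(a)
        seen, stack = {0}, [0]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w); stack.append(w)
        if len(seen) == n:
            total += 1
    return total

for n in (3, 4, 5):
    print(n, unicyclic_brute(n), unicyclic_formula(n))
```

The counts 1, 15, 222 for n = 3, 4, 5 agree with both expressions; the integer arithmetic n^{n−r}·r//n avoids the fractional n^{n−r−1} when r = n.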
The use of Lemma 5.17 enabled Rényi (1959b) to determine the asymptotic distribution of the length of the unique cycle contained in a connected graph of order n and size n (Ex. 7).
We shall present a number of consequences of Theorem 5.18 concerning the unicyclic components of random graphs, but first we continue the examination of the function C(n, n + k).
Bagaev (1973) showed that C(n, n + 1) ~ (5/24)n^{n+1}, and for 2 ≤ k = o(n^{1/3}) Wright (1977a, 1978b, 1980, 1983) determined the asymptotic value of C(n, n + k):
$$C(n, n+k) = \gamma_k\,n^{n+(3k-1)/2}\{1 + O(k^{3/2}n^{-1/2})\},$$
where
$$\gamma_k = \frac{\pi^{1/2}\,3^k\,(k-1)!\,A_k}{2^{(5k-1)/2}\,\Gamma(k/2)}$$
and the coefficients A_k are given by A_1 = 5/36 and a quadratic recurrence for k ≥ 2.
$$C(n, n+k) = \Bigl(\frac{3}{\pi}\Bigr)^{1/2}\Bigl(\frac{e}{12k}\Bigr)^{k/2} n^{n+(3k-1)/2}\bigl\{1 + O(k^{3/2}n^{-1/2}) + O(k^{-1})\bigr\}.$$
5.4 Components Containing Cycles
Additional numerical data were given by Gray, Murray and Young (1977).
The results above were considerably extended by Łuczak (1990a), who showed that the last formula is a good approximation of C(n, n + k) provided k = o(n) and k → ∞. Furthermore, Bender, Canfield and McKay (1990) proved a rather complicated asymptotic formula for C(n, n + k) for every function k = k(n) as n → ∞.
In order to study the evolution of random graphs we shall not need an asymptotic formula for C(n, n + k), but we do need the following upper bound, due to Bollobás (1984a).
Theorem 5.20 There is an absolute constant c such that for 1 ≤ k ≤ n
$$C(n, n+k) \le (c/k)^{k/2}\,n^{n+(3k-1)/2}.$$
Proof Recall that 𝒞(n, n + k) is the set of connected graphs with vertex set V = {1, 2, …, n} having n + k edges. Every graph G ∈ 𝒞(n, n + k) contains a unique maximal subgraph H in which every vertex has degree at least 2. Let W be the vertex set of H. Clearly e(H) = w + k, where w = |W|. Furthermore, G is obtained from H by adding to it a forest with vertex set V, in which each component contains exactly one vertex of W. Hence, by Lemma 5.17, w·n^{n−1−w} graphs G ∈ 𝒞(n, n + k) give the same graph H.
Denote by T the set of branch vertices of H and by U = W − T the set of vertices of degree 2. Set t = |T| and u = |U| = w − t. Clearly, H is obtained from a multigraph M (i.e. a 'graph' which may contain loops and multiple edges) with vertex set T by subdividing the edges with the vertices from U (see Fig. 5.4).
The multigraph M has t + k edges (counting loops), so there are at most u!·$\binom{u+t+k-1}{t+k-1}$ ways of inserting the u vertices in order to obtain H from M. Since every vertex of M has degree at least 3 (a loop contributing 2 to the degree of the vertex incident with it),
3t ≤ 2(t + k), that is, t ≤ 2k.
[Fig. 5.4. A graph H and a multigraph M with T = {t₁, …, t₄} and U = {u₁, u₂, …, u₁₀}.]

Consequently, the number of choices for the multigraph M is at most
$$\binom{t^2/2 + 3t/2 + k - 1}{t + k},$$
the number of multisets of t + k edges and loops on the vertex set T,
and
$$(n)_{t+u} \le n^{t+u}\exp(-u^2/2n + u/n).$$
Therefore
$$C(n, n+k) \le n^{n-1}\sum_{t=1}^{2k}\,\sum_{u=0}^{n-t} A(t,u),$$
where
$$A(t,u) = \frac{t+u}{t!}\binom{t^2/2+3t/2+k-1}{t+k}\binom{u+t+k-1}{t+k-1}\exp(-u^2/2n).$$
Let us split the double sum above into two parts, according to the size of u. A straightforward computation shows that each part is at most
$$(c/k)^{k/2}\,n^{(3k-1)/2}$$
for a suitable absolute constant c, and the theorem follows. □
$$C(n, n+k) \le \binom{N}{n+k} \le \Bigl(\frac{en^2}{2(n+k)}\Bigr)^{n+k}. \qquad (5.36)$$
Inequality (5.36) and Theorem 5.20 have the following immediate conse-
quence.
Proof Theorem 5.20 implies that in proving this corollary we may assume that n < k ≤ N − n; in particular, n ≥ 6. By inequality (5.36), in this range we have
$$C(n, n+k) \le \Bigl(\frac{en^2}{2(n+k)}\Bigr)^{n+k} \le k^{-k/2}\,n^{n+(3k-1)/2},$$
as claimed.
With a little more work one can show that c = 1 will do in Theorem 5.20. Using this sharpening of Theorem 5.20, the proof above shows that c = 1 will do in Corollary 5.21 as well.
As mentioned earlier, Theorem 5.18 and Corollary 5.19 can be used to obtain fairly precise information about the unicyclic components of random graphs. Our first result concerns components of bounded order.

Theorem 5.22 Let p ~ c/n, 0 < c < ∞, and denote by X_k = X_k(G_p) the number of unicyclic components of order k in G_p. Let 3 ≤ k_1 < k_2 < ⋯ < k_s be fixed. Then X_{k_1}, X_{k_2}, …, X_{k_s} are asymptotically independent Poisson random variables with means λ_{k_1}, λ_{k_2}, …, λ_{k_s}, where
$$\lambda_k = \frac{(ce^{-c})^k}{2k}\sum_{j=0}^{k-3}\frac{k^j}{j!},$$
and
$$\sum_{k=3}^{\infty}k\lambda_k = \frac{1}{2}\sum_{k=3}^{\infty}(ce^{-c})^k\sum_{j=0}^{k-3}\frac{k^j}{j!}.$$
Proof (i) Suppose p ~ c/n, 0 < c < ∞, c ≠ 1. Then by Theorem 5.18
$$E(X) = \sum_{k=3}^{n} k\binom{n}{k}C(k,k)\Bigl(\frac{c}{n}\Bigr)^{k}\Bigl(1-\frac{c}{n}\Bigr)^{kn-k^2/2-3k/2}.$$
and
$$E(X_{k_1}X_{k_2}) \sim E(X_{k_1})\,E(X_{k_2}).$$
From this it follows that
$$\sigma^2(X) \sim \sum_{k=3}^{n}k^2\,E(X_k) \sim \frac{1}{2}\sum_{k=3}^{\infty}k\,(ce^{-c})^k\sum_{j=0}^{k-3}\frac{k^j}{j!}.$$
(ii) Suppose now that p = 1/n. Since
$$\prod_{j=1}^{k-1}\Bigl(1-\frac{j}{n}\Bigr) \le \exp\Bigl\{-\frac{k(k-1)}{2n}-\frac{k(k-1)(2k-1)}{12n^2}\Bigr\},$$
we find, consequently, that
$$E(X) = O\Bigl(\sum_{k=3}^{n} e^{-k^3/(6n^2)}\Bigr) = O(n^{2/3}).$$
Corollary 5.24 (i) Given 0 < c < ∞, c ≠ 1, there is a b = b(c) such that
$$P\{X(G_{c/n}) > t\} \le b/t^2 \quad\text{for all } t > 0.$$
In particular, if ω(n) → ∞, then a.e. G_{c/n} has at most ω(n) vertices on its unicyclic components.
(ii) If ω(n) → ∞ then a.e. G_{1/n} has at most ω(n)n^{2/3} vertices on its unicyclic components.
Proof Both assertions follow from Theorem 5.23 and Chebyshev's inequality. □
Exercises
5.1 Prove that if k ≥ 2 is fixed and p = p(n) = o(n^{−1}), then c_1(n, k, p) + c_2(n, k, p) = o(1), where c_1 and c_2 are as in Theorem 5.1. Deduce that
$$d\{\mathscr{L}(T_k), P_\lambda\} \to 0,$$
and
( n) knn
kj,
-'-l = ( njn
ni, n2, nm
n l-1...n
, m-1,
where the summation is over all ordered partitions (n_1, n_2, …, n_m) of n − j. (Katz, 1955.)
5.6 Prove that there are
$$\binom{n-2}{k-1}(n-1)^{n-k-1}$$
trees on {1, 2, …, n} in which vertex 1 has degree k. [Clarke (1958); see also Rényi (1959a, p. 83) and Moon (1970a, p. 14).]
5.7 Denote by γ_n the length of the unique cycle in a connected graph of order n and size n. Let x > 0 be fixed and set r_0 = ⌊xn^{1/2}⌋. Show that, analogously to Theorem 5.16,
6
The Evolution of Random Graphs — the Giant Component
The results of the previous chapter imply that for c < 1 a.e. graph process G̃ = (G_t)_0^N is such that for t ~ ½cn it satisfies L_1(G_t) = O(log n), i.e. the maximum of the orders of the components of G_t is O(log n). As t passes n/2, L_1(G_t) suddenly begins to grow. The main aim of this chapter is to investigate the speed of this growth.
Erdős and Rényi (1960, 1961a) proved that a.e. G̃ is such that L_1(G_{⌊n/2⌋}) has order n^{2/3} and if c > 1 then a.e. G̃ satisfies
$$L_1(G_{\lfloor cn/2\rfloor}) \sim \{1 - t(c)\}n.$$
6.1 A Gap in the Sequence of Components
Theorem 6.1 Let s_0 = (2 log n)^{1/2}n^{2/3} and for s ≠ 0 set k_0(s) = ⌊(3 log |s| − log n − 2 log log n)n²/s²⌋. Then a.e. graph process G̃ = (G_t)_0^N is such that if t = n/2 + s, s_0 ≤ |s| ≤ n/2, and G_t has a component of order k, then either k ≤ k_0(s) or k ≥ n^{2/3}.
In particular, a.e. graph process is such that, for t_0 = ⌊(n/2) + s_0⌋ ≤ t ≤ n, the graph G_t has no component whose order is between n^{2/3}/2 and n^{2/3}.
$$\sum_{t=t_0}^{t_1}\ \sum_{k=k_0(s)}^{n^{2/3}} E_p(k) = o(1), \qquad (6.1)$$
where
$$E_p(k) = \sum\bigl[\,E_p\{X(k,d)\} : -1 \le d \le k^*\,\bigr].$$
Since
$$C(k, k+d) \le c_0\,\delta^{d}\,k^{k+(3d-1)/2},$$
we have
$$E_p(k) = \sum_{d=-1}^{k^*} E_p\{X(k,d)\} \le c_1\binom{n}{k}k^{k-2}p^{k-1}\exp\Bigl\{-p\Bigl(kn - \frac{k^2}{2}\Bigr)\Bigr\}.$$
Thus
$$E_p(k) \le c_3\,n\,k^{-5/2}\exp\bigl\{-0.499\,\varepsilon^2 k + \varepsilon^3 k/3\bigr\} = F_p(k),$$
where ε = 2s/n. Consequently
$$\sum_{t=t_0}^{t_1}\ \sum_{k=k_0(s)}^{k_1(s)} E_p(k) \le \sum_{t=t_0}^{t_1}\ \sum_{k=k_0(s)}^{k_1(s)} F_p(k) = o(1).$$
Let us call a component small if it has fewer than n^{2/3}/2 vertices, and large if it has at least n^{2/3} vertices. By Theorem 6.1 a.e. graph process is such that for t_0 ≤ t ≤ n every component of G_t is either small or large. This will enable us to show that a.e. graph process is such that shortly after time n/2 it has a giant component and all other components are small. Hence, the order of the giant component is the difference between n and the number of vertices on the small components. The following theorem will be used to estimate the number of vertices on the small components.
Theorem 6.2 Suppose A ⊂ ℕ × {ℕ ∪ {0, −1}}, 0 < ε = ε(n) ≤ 1/4 and p = (1 + ε)/n. Set
and
k̄ = max{k : (k, d) ∈ A}.
If μ ≥ (1 + ε)²/n and k̄{B + (1 + ε)²/n} ≤ n, then
a 2 (X,)
a <2i+2Ui+- { + (1 +- - 2
) } i+1
n n
$$1 - y \le (1 - x)\,e^{x-y}.$$
Consequently
((;)k7+k2 h n)
_n{k,kn.k (i-I k2 + i) ~ n_
-!
2
= expkk2 e+ ) =) 1 +6(ki,k 2 ).
2(1+ ) Z{k+1 1
a Ep(X2i) + 2 1 k2+ Ep{X(kl,dj)}
Sn2
x Ep{X(k 2 ,d 2 )} (ki,dl),(k 2 ,d 2 ) E A}
S/t2i + 2E + 2(1+r)2)2
n n2 ) i+1.[
Theorem 6.3 Let −1/4 ≤ ε = ε(n) ≤ 1/4, |ε| ≥ 2n^{−1/3}, p = (1 + ε)/n and define
$$g_\varepsilon(k) = \log n - \tfrac{5}{2}\log k + k\{\log(1+\varepsilon) - \varepsilon\} + 2\log(1/|\varepsilon|) + k^2\varepsilon/2n.$$
$$\sum_{d=-1}^{k^*} E_p\{X(k,d)\} \le c_1\,E_p\{X(k,-1)\}.$$
Hence
$$c_1\sum_{k=k_2}^{n^{2/3}} E_p\{X(k,-1)\} \le c_3\sum_{k=k_2}^{n^{2/3}} n\,k^{-5/2}\exp\bigl[k\{\log(1+\varepsilon)-\varepsilon\} + k^2\varepsilon/2n\bigr] \le c_4\exp\{g_\varepsilon(k_2)\}.$$
By our choice of k_2 we have g_ε(k_2) → −∞, so almost no G_p has a component whose order is between k_2 and n^{2/3}.
(ii) In the proof of the second inequality we may and shall assume that nε³(log n)² → ∞ and k_0 → ∞. Set k_ε = ⌊8(log n)ε^{−2}⌋, A = {(k, −1) : k_0 ≤ k ≤ k_ε} and let X_0 be as in Theorem 6.2. Then g_ε(k_ε) → −∞ and
$$\mu = E(X_0) = \sum_{k=k_0}^{k_\varepsilon} E(T_k) \ge c_5\,n\sum_{k=k_0}^{k_\varepsilon} k^{-5/2}\exp\bigl[k\{\log(1+\varepsilon)-\varepsilon\}\bigr].$$
Let us state some explicit bounds for S(Gp) implied by Theorem 6.3.
Proof (i) As we may assume that $(\log n)\varepsilon^{-2} \le n^{2/3}$, all we have to check is that $g_\varepsilon(3(\log n)\varepsilon^{-2}) \to -\infty$.
(ii) Straightforward calculations show that
Theorem 6.5 A.e. graph process $\tilde G = (G_t)_0^N$ is such that for $t \ge 5n/8$ the graph $G_t$ has no small component of order at least $100\log n$.

Proof By Corollary 6.4, a.e. graph process is such that for some $t$ satisfying $3n/5 \le t \le 5n/8$ the graph $G_t$ has no small component whose order is at least $75\log n$. Since the union of two components of order at most $a = \lfloor 100\log n\rfloor$ has order at most $2a < n^{2/3}$, the assertion of the theorem will follow if we show that a.e. graph process is such that for $t \ge 3n/5$ the graph $G_t$ has no component whose order is at least $a$ and at most $2a$. Furthermore, since for $t \ge 2n\log n$ a.e. $G_t$ is connected (cf. Ex. 15 of Chapter 2), it suffices to prove this for $t \le 2n\log n$.

Let $3n/5 \le t \le 2n\log n$ and set $c = 2t/n$. Then the expected number of components of $G_t$ having order at least $a$ and at most $2a$ is
\[ \sum_{k=a}^{2a} c_0\binom{n}{k} k^{k-2}\Bigl(\frac{t}{N}\Bigr)^{k-1}\Bigl(1-\frac{t}{N}\Bigr)^{k(n-k)}\cdots < c_2\,n\sum_{k=a}^{2a} e^k k^{-5/2} c^k e^{-ck} = c_2\,n\sum_{k=a}^{2a} k^{-5/2}(c\,e^{1-c})^k, \]
where $c_0$, $c_1$ and $c_2$ are absolute constants. Since for $c \ge 1.2$ we have $c - 1 - \log c > 0.0176$, the last expression is at most
\[ c_3\,n(\log n)^{-3/2} n^{-1.76} = o(n^{-1/2}). \quad\square \]
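The numerical constant used above is easy to confirm: $\alpha(c) = c - 1 - \log c$ is increasing for $c > 1$, so its value at $c = 1.2$ bounds it from below for all larger $c$. A quick check (illustrative code, not from the text):

```python
import math

# alpha(c) = c - 1 - log c; alpha'(c) = 1 - 1/c > 0 for c > 1,
# so alpha is increasing on (1, infinity)
def alpha(c):
    return c - 1 - math.log(c)

print(alpha(1.2))  # 0.01767..., just above the 0.0176 used in the proof
```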
Since this estimate will help us to prove that the large components of $G_{t_0}$ are likely to merge soon after time $t_0$, we start with this estimate. Set
\[ Y_0 = \sum_{k} k\,X(k,0) \]
and
\[ Y_{-1} = \sum_{k} k\,X(k,-1). \]
\[ = O(1)\sum_{k} k\cdots, \]
since
\[ k^2/n = o(\varepsilon^2 k). \]
This gives us
\[ E(Y_1^*) = O(n^{-1}\varepsilon^{-3}) = o(1). \]
\[ E(Y_0) = \sum_{k\le k_1} k\,E\{X(k,0)\} = \cdots \sim \sqrt{\pi/8}\sum_{k\le k_1}\cdots\exp\{-k\varepsilon^2/2 + k^2\varepsilon/2n + O(k\varepsilon^3) + O(k/n)\}, \]
and
\[ P\{|Y_0 - E(Y_0)| \ge \omega(n)\cdots\} \to 0. \]
(iii) It is easily checked that $\varepsilon' = \varepsilon - \tfrac{2}{3}\varepsilon^2 + O(\varepsilon^3)$, since $0 < \varepsilon' < 1$ is defined by $(1-\varepsilon')e^{\varepsilon'} = (1+\varepsilon)e^{-\varepsilon}$. Hence it suffices to prove the first inequality.
We shall estimate $E(Y_{-1})$ rather precisely and then we shall make use of Theorem 6.2 to conclude that for a.e. $G_p$ the variable $Y_{-1}$ is rather close to its expectation.
Set $p' = (1-\varepsilon')/n$ and write $E'$ for the expectation in $\mathscr{G}(n,p')$. We shall exploit the fact that $E$ and $E'$ are rather closely related.
First of all, calculations analogous to those in (i) and (ii) show that
and
\[ E'(Y_0) \sim \cdots. \]
Hence
\[ E'(Y_{-1}) = n - \cdots + o(\cdots). \]

6.2 The Emergence of the Giant Component
In order to pass from $E'(Y_{-1})$ to $E(Y_{-1})$, note that for $k \le k_1$ we have
\[ \frac{E\{X(k,-1)\}}{E'\{X(k,-1)\}} = \cdots = \frac{1}{1+\varepsilon}\cdots\exp\{O(\varepsilon k^2/n)\}. \]
Consequently
\[ \cdots = O(n)\int x^{-1/2} e^{-x}\cdots\,dx = O(n\varepsilon^{-1}). \]
Theorem 6.8 A.e. graph process $\tilde G = (G_t)_0^N$ is such that for every $t \ge t_1 = \lfloor n/2 + 2(\log n)^{1/2} n^{2/3}\rfloor$ the graph $G_t$ has a unique component of order at least $n^{2/3}$, and the other components have at most $n^{2/3}/2$ vertices each.

Proof Let $s_0 = (3\log n)^{1/2} n^{2/3}$. By Theorems 6.1 and 6.5, a.e. graph process is such that for $t \ge t_0 = \lfloor n/2 + s_0\rfloor$ every component of $G_t$ is either small or large. Therefore our theorem will follow if we show that a.e. $\tilde G = (G_t)_0^N$ is such that for some $t_0 \le t \le t_1$ all large components of $G_{t_0}$ are contained in a single component of $G_{t_1}$.
By Corollary 6.7, a.e. $G_{t_0}$ has at least $3s_0$ vertices on its large components. For such a graph $H = G_{t_0}$ we can find disjoint subsets $V_1, V_2, \ldots, V_m$ of $V(H) = V$ such that
\[ \bigcup_{i=1}^{m} V_i = \bigcup_{j=1}^{l} C_j, \]
where $C_1, C_2, \ldots, C_l$ are the large components of $H = G_{t_0}$. Set $p = n^{-4/3}$,
\[ 1 - (1-p)^{\cdots} \ge 1 - e^{-\cdots} \ge \cdots. \]
Hence the probability that in $H_p$ all $V_i$'s are contained in the same component is at least the probability that a random graph in $\mathscr{G}(m, 1/2)$ is connected.

6.3 Small Components after Time n/2
Theorem 6.9 Let $t = n/2 + s$ and $\omega(n) \to \infty$. If $2(\log n)^{1/2} n^{2/3} \le s = o(n)$, then a.e. $G_t$ is such that $\cdots$ and
\[ L_2(G_t) \le (\log n)n^2/s^2. \]
If $n^{1-\gamma_0} \le s = o(n/\log n)$ for some $\gamma_0 < 1$ then a.e. $G_t$ satisfies $\cdots$

Proof The assertion concerning $L_1(G_t)$ follows from Theorem 6.8 and Corollary 6.7. The inequalities concerning $L_2(G_t)$ can be read out of Theorem 6.8 and Corollary 6.4, with $\varepsilon = 2s/n$. Unfortunately, Corollary 6.4 cannot be applied as it stands, since the property in question is not convex, but the proof of Theorem 6.3 can be adapted to give an analogous result for $G_t$ instead of $G_p$ (see Ex. 2). □
(ii) A.e. $\tilde G$ is such that if $t \ge c_0 n/2$, then every small component of $G_t$ is a unicyclic graph or a tree, and there are at most $\omega(n)\log n$ unicyclic components.
(iii) For $\omega(n) \to \infty$ a.e. $\tilde G$ is such that if $t \ge \omega(n)n$, then every component of $G_t$, with the exception of its giant component, is a tree.
\[ \cdots \le 2n\,e^k k^{-5/2} e^{-2kt/n}(2t/n)^{k-1} \le 4n\,k^{-5/2}(c\,e^{1-c})^k, \quad (6.10) \]
where $c = 2t/n$.
Therefore the probability that for some $t_0 \le t \le \tau_1$ the graph $G_t$ contains a component whose order is at least $a = a(t)$ and at most $b$, is
\[ O\{(\log n)^{-3/2}\}\sum_{t=t_0}^{\tau_1}\exp\{-\log n + 3\log\log n - \omega(n)\} = o(1). \]
In the second inequality we made use of $\log(c\,e^{1-c}) = -\alpha$. This completes the proof of (i).
The proofs of (ii) and (iii) are similar. □
The reason for the inequality $t \le \tau_1$ in part (i) of the theorem was to enable us to use a fairly small function for $a(t)$. With $a' = \lceil(2/\alpha)\log n\rceil$ no upper bound has to be put on $t$ (see Ex. 9).
Let $c_0 > 1$, $c = 2t/n \ge c_0$ and let $\alpha$ and $a$ be as in Theorem 6.10. By Corollary 5.19 the expected number of unicyclic components of $G_t$ having order at most $a$ is not more than
\[ \sum_{k=3}^{a}\cdots(c\,e^{1-c})^{k}\cdots = O(1), \]
Theorem 6.11 Let $c > 1$ be a constant, $t = \lfloor cn/2\rfloor$ and $\omega(n) \to \infty$. Then a.e. $G_t$ is the union of the giant component, the small unicyclic components and the small tree components. There are at most $\omega(n)$ vertices on the unicyclic components. The order of the giant component satisfies $\cdots$ where
\[ t(c) = \frac{1}{c}\sum_{k=1}^{\infty}\frac{k^{k-1}}{k!}(c\,e^{-c})^{k} \]
and $\alpha = c - 1 - \log c$.
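The series defining $t(c)$ is easy to evaluate, and it passes a natural consistency check: $1 - t(c)$ coincides with the survival probability $\beta(c)$ of a Poisson($c$) branching process, the unique positive root of $\beta = 1 - e^{-c\beta}$. The following sketch (an illustration, not part of the text) compares the two:

```python
import math

def t_fn(c, kmax=400):
    """t(c) = (1/c) * sum_{k>=1} k^(k-1)/k! * (c e^{-c})^k, truncated at kmax.
    Terms are computed via logs to avoid overflow for large k."""
    x = c * math.exp(-c)
    s = sum(math.exp((k - 1) * math.log(k) - math.lgamma(k + 1) + k * math.log(x))
            for k in range(1, kmax + 1))
    return s / c

def giant_fraction(c, iters=200):
    """Solve beta = 1 - exp(-c*beta), beta > 0, by fixed-point iteration."""
    b = 1.0
    for _ in range(iters):
        b = 1 - math.exp(-c * b)
    return b

for c in (1.5, 2.0, 3.0):
    print(c, 1 - t_fn(c), giant_fraction(c))  # the two columns agree
```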
In the range of $t$ covered by Theorem 6.9, the giant component increases about four times as fast as $t$. Another way of proving Theorem 6.9 would be to establish this fact first and then use it to deduce the assertion about the size of the giant component. What is the expected increase of $L_1(G_t)$ as $t = (1+\varepsilon)n/2 = n/2 + s$ changes to $t+1$? The probability that the $(t+1)$st edge will join the giant component to a component of order $L_j$ is about $L_1 L_j/\binom{n}{2}$. Hence the expected increase of $L_1(G_t)$ is about
\[ (2L_1/n^2)\sum_{k} k\binom{n}{k} k^{k-2}\cdots \approx (2L_1/n^2)\,n\sum_{k}^{n^{2/3}} k^{-1/2}\cdots\exp(-\varepsilon^2 k/2) \approx (2\pi)^{-1/2}(L_1/n)\int_0^{\infty} x^{-1/2} e^{-x\varepsilon^2/2}\,dx. \]
From this one can deduce that if $s$ is $o(n)$ but not too small, then $L_1(t) = L_1(n/2+s) \sim c_1 s$. Since $t'(1) = -2$, by Theorem 6.11 we must have $c_1 = 4$.
Theorem 6.13 Let $1/2 < \beta < 1$ be fixed and $t = cn/2$, where $c = c(n) \le 8\log n$. Then
\[ P\{|w(G_t) - u(c)n| \ge (\log n)n^{\beta}\} \le 100\,n^{1-2\beta}(\log n)^{-1}. \]
\[ \sigma^2(w^*(G_p)) \le \cdots \le 17(\log n)n. \]
Hence, by Chebyshev's inequality, $\cdots$ Clearly
\[ w^*(G_p) \le w(G_p) \le w^*(G_p) + n^{1/2}. \quad (6.13) \]

The Evolution of Random Graphs - the Giant Component

Inequalities (6.11), (6.12) and (6.13) imply $\cdots$ Finally, since $n \ge w(G_t) \ge w(G_{t+1}) \ge 1$ for every $t$, by the De Moivre-Laplace theorem (Theorem 1.6) we have for every $t$, $0 \le t \le 4n\log n$, that $\cdots$ as claimed. □
Corollary 6.14 The probability that for a fixed $\beta$ a graph process $(G_t)_0^N$ satisfies
\[ |w(G_t) - u(2t/n)n| \le 2(\log n)n^{\beta} \]
$\cdots$
Putting it slightly differently, with $\varepsilon = 2s/n$, we set $s' = \varepsilon' n/2$. Note that with $p = (1+\varepsilon)/n$ and $p' = (1-\varepsilon')/n$, a random graph $G_p$ has about $n/2 + s$ edges and $G_{p'}$ about $n/2 - s'$.

Theorem 6.15 Let $t = n/2 + s$, where $s/n^{2/3} \to \infty$. Then a.e. $G_t$ is such that
Łuczak (1990b) gave the following precise estimate for the excess of $G_t$ in the supercritical range.

Theorem 6.16 Let $\omega(n) \to \infty$, and $t = n/2 + s$, where $\omega(n)n^{2/3} \le s \le n/\omega(n)$. Then a.e. $G_t$ is such that
\[ \Bigl|\mathrm{exc}(G_t) - \frac{16s^3}{3n^2}\Bigr| = o(s^3/n^2). \quad\square \]

Janson, Knuth, Łuczak and Pittel (1993) extended this result: they showed that $\mathrm{exc}(G_t)$, suitably normalized, has asymptotically normal distribution.
Taking up a suggestion of Erdős, Janson (1987), Bollobás (1988b) and Flajolet, Knuth and Pittel (1989) studied the length $\xi_n = \xi(\tilde G)$ of the first cycle that appears in a random graph process $\tilde G = (G_t)_0^N$ on $n$ vertices. Rather interestingly, $E(\xi_n) \to \infty$ as $n \to \infty$, but $\xi_n$ has a beautiful limit distribution.
\[ \lim_{n\to\infty} P(\xi_n = k) = \frac{1}{2}\int_0^1 t^{k-1} e^{t/2 + t^2/4}\sqrt{1-t}\,dt. \]
Furthermore,
\[ \lim_{n\to\infty} E(\xi_n)/n^{1/6} = \frac{\sqrt{\pi}\,\Gamma(1/3)}{2^{1/6}\,3^{2/3}}. \]
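The limit distribution above can be sanity-checked numerically: summing the geometric series over $k \ge 3$ in closed form, the total probability mass should be 1, while the partial sums converge slowly (the heavy $k^{-3/2}$ tail is what makes $E(\xi_n)$ unbounded). A sketch of the check, an illustration of the displayed formula rather than part of the original text:

```python
import math

def _simpson(f, m):
    """Composite Simpson rule on [0, 1] with m (even) subintervals."""
    h = 1.0 / m
    s = f(0.0) + f(1.0)
    for i in range(1, m):
        s += (4 if i % 2 else 2) * f(i * h)
    return s * h / 3

def p_first_cycle(k, m=400):
    """(1/2) Int_0^1 t^(k-1) e^(t/2+t^2/4) sqrt(1-t) dt, after substituting
    t = 1 - u^2 to remove the square-root singularity at t = 1."""
    def f(u):
        t = 1 - u * u
        return t ** (k - 1) * math.exp(t / 2 + t * t / 4) * u * u
    return _simpson(f, m)

def total_mass(m=2000):
    """Sum over k >= 3 done in closed form: sum t^(k-1) = t^2/(1-t)."""
    def g(u):
        t = 1 - u * u
        return t * t * math.exp(t / 2 + t * t / 4)
    return _simpson(g, m)

print(total_mass())  # ~ 1.0: the limit distribution is proper
```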
In fact, Flajolet, Knuth and Pittel (1989) also accomplished the difficult task of giving precise estimates for all the moments of $\xi_n$, and Janson, Knuth, Łuczak and Pittel (1993) also proved a number of related detailed results. The first cycle in a certain random directed graph process, thought to be relevant to the explanation of the origin of life based on the idea of self-organization, as expounded by Eigen and Schuster (1979), was studied by Bollobás and Rasmussen (1989).
Let us turn to the k-core of a graph, introduced by Bollobás (1984a). For $k \ge 2$, the k-core $\mathrm{core}_k(G)$ of a graph $G$ is the maximal subgraph of minimal degree at least $k$. The 2-core is simply the core of the graph: $\mathrm{core}(G) = \mathrm{core}_2(G)$. Like every graph of minimal degree at least 2, the core is made up of some vertex-disjoint cycles and a set of branchvertices (vertices of degree at least 3) joined by vertex-disjoint paths. There is a unique multigraph $K$ of minimal degree at least 3 such that the 2-core of $G$ is a subdivision of $K$. This multigraph $K$ is the kernel $\ker(G)$ of $G$. Here the term 'multigraph' is used in a wide sense: in addition to loops and multiple edges, vertexless loops (called free loops) are allowed as well. Note that the multigraph $M$ defined in the proof of Theorem 5.20 was precisely the kernel of $G \in \mathscr{G}(n, n+k)$.
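The k-core can be computed by the obvious peeling procedure: repeatedly delete any vertex of degree less than $k$. The deletion order does not affect the result, since the k-core is the unique maximal subgraph of minimal degree at least $k$. A sketch (illustrative code, not from the text):

```python
from collections import deque

def k_core(adj, k):
    """Return the vertex set of the k-core of a graph.
    adj: dict mapping each vertex to its set of neighbours."""
    deg = {v: len(nb) for v, nb in adj.items()}
    alive = set(adj)
    queue = deque(v for v in adj if deg[v] < k)
    while queue:
        v = queue.popleft()
        if v not in alive:
            continue
        alive.discard(v)
        for u in adj[v]:
            if u in alive:
                deg[u] -= 1
                if deg[u] < k:
                    queue.append(u)
    return alive

# a triangle with a pendant path: the 2-core is exactly the triangle
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
print(sorted(k_core(adj, 2)))  # [0, 1, 2]
```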
Łuczak (1991a) conducted a detailed study of the core just beyond the critical time $n/2$. Let us write $D_k(G)$ for the number of vertices of degree $k$ in the core of $G$. Among other results, Łuczak (1991a) proved that if $t = n/2 + s$, where $s = o(n)$ and $s/n^{2/3} \to \infty$, then a.e. $G_t$ is such that $D_2(G_t) \sim 8s^2/n$, $D_3(G_t) \sim 32s^3/3n^2$ and, for every fixed $k$, $D_k(G_t) = O(s^k/n^{k-1})$. Furthermore, a.e. $G_t$ contains an induced topological random cubic graph on about $32s^3/3n^2$ vertices.
Since by the theorem of Robinson and Wormald (1992) a.e. cubic graph has a Hamilton cycle (see Theorem 8.22), a.e. $G_t$ has a cycle that goes through about $32s^3/3n^2$ cubic vertices of the core. By tightening and continuing this argument, Łuczak (1991a) was able to give a good estimate of the circumference, the length of a longest cycle, of a random graph.
6.4 Further Results
Theorem 6.18 Let $t = n/2 + s$, where $s = o(n)$ and $s/n^{2/3} \to \infty$. Then a.e. $G_t$ has circumference at least $(16/3 + o(1))s^2/n$ and at most $15s^2/2n$. □
Łuczak, Pittel and Wierman (1994) showed that the transition range from being planar to being non-planar is precisely the window of the phase transition.

Let $\tau_k(\tilde G)$ be the time at which a non-empty k-core first appears in the graph process $\tilde G$, and set $A_k(\tilde G) = |\mathrm{core}_k(G_{\tau_k})|$.
The following beautiful and difficult result about the sudden emergence of the k-core was proved by Pittel, Spencer and Wormald (1996).

Theorem 6.20 For every $k \ge 3$ there are constants $c_k > 1$ and $\rho_k > 0$ such that $\tau_k(\tilde G) \sim c_k n/2$ and $A_k(\tilde G) \sim \rho_k n$ for a.e. graph process $\tilde G$. Furthermore, $c_k$ and $\rho_k$ can be expressed in terms of Poisson branching processes. In particular, $c_k = \min_{\lambda > 0}\lambda/\pi_k(\lambda) = k + \sqrt{k\log k} + O(\log k)$, where $\pi_k(\lambda) = P(\mathrm{Poisson}(\lambda) \ge k-1)$.
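The characterisation $c_k = \min_{\lambda>0}\lambda/\pi_k(\lambda)$ is straightforward to evaluate numerically; for instance it gives $c_3 \approx 3.35$, i.e. the 3-core first appears around time $1.675n$. A sketch (the crude grid minimisation is an implementation choice of this illustration):

```python
import math

def pi_k(lam, k):
    """P(Poisson(lam) >= k-1)."""
    return 1 - sum(math.exp(-lam) * lam**i / math.factorial(i)
                   for i in range(k - 1))

def c_k(k, lo=1e-3, hi=50.0, steps=20000):
    """min over lambda > 0 of lambda / pi_k(lambda), by a simple grid scan."""
    return min((lo + i * (hi - lo) / steps) / pi_k(lo + i * (hi - lo) / steps, k)
               for i in range(steps + 1))

print(c_k(3))  # ~ 3.35
```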
Consider the space of random graphs with a fixed degree sequence in which a proportion $\lambda_i$ of the vertices have degree $i$. Molloy and Reed (1995) proved that the existence of a giant component is governed by the quantity $Q = \sum_i i(i-2)\lambda_i$. If $Q > 0$ then a.e. random graph in this space has a giant component, and if $Q < 0$ then almost no random graph has a giant component.
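As a consistency check, for a Poisson($c$) degree distribution $Q = E(D^2) - 2E(D) = c^2 - c$, so $Q$ changes sign exactly at the classical threshold $c = 1$. Illustrative code (not from the text):

```python
import math

def molloy_reed_q(lam):
    """Q = sum_i i*(i-2)*lambda_i for a degree distribution given as a list."""
    return sum(i * (i - 2) * l for i, l in enumerate(lam))

def poisson_pmf(c, imax=60):
    """Truncated Poisson(c) probabilities lambda_0, ..., lambda_imax."""
    return [math.exp(-c) * c**i / math.factorial(i) for i in range(imax + 1)]

for c in (0.5, 1.0, 2.0):
    print(c, molloy_reed_q(poisson_pmf(c)))  # Q = c^2 - c up to truncation
```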
In conclusion, let us say a few words about the various approaches to random graph problems, especially phase transitions. 'Statistical graph theory' had existed before Erdős and Rényi started their work on random graphs, but its aim was exact enumeration. Generating functions were found and then analytic methods were applied to obtain asymptotics. In this method there was nothing probabilistic: random variables and probability spaces were not mentioned.
Erdős and Rényi introduced an entirely novel approach: rather than striving for exact enumeration, they adopted a probabilistic point of view and treated every graph property as a random variable on a certain probability space. This is precisely the method we apply much of the time in this book; in particular, the results in this chapter were obtained by this 'static' probabilistic method.
It is fascinating to note that percolation theory (the theory of random subgraphs of lattices) was started by Broadbent and Hammersley (1957) at about the same time as Erdős and Rényi were laying the foundations of the theory of random graphs, but the two theories were developed with minimal interaction (see Grimmett (1999)). By now there is more and more interaction between the two disciplines; for example, the detailed study of the phase transition in the mean-field random-cluster model by Bollobás, Grimmett and Janson (1996) was based on the properties of the phase transition in the random graph process $(G_t)_0^N$.
The original approach to random graphs based on generating functions is far from out of date; rather the opposite is true: it is more powerful than ever. This approach may require much work and technical expertise, but it can produce deep and difficult results. Perhaps the best examples of this can be found in the monumental paper of Janson, Knuth, Łuczak and Pittel (1993) about the structure of a random graph process next to its phase transition.
In addition to natural mixtures of these two basic approaches, there are several other lines of attack; perhaps the most exciting of these is the dynamical approach pioneered by Karp (1990). (The germs of this method can be found in the work of Erdős and Rényi.) In Karp's dynamical approach edges are 'uncovered' or 'tested' only when we need them: the random graph $G_p$ unravels in front of our eyes, much as a birth process.
6.5 Two Applications
Then $\tau(1) = 1$ and $\tau(n-1)$ is the first time $t$ when the graph $G_t$ is connected. By the greedy algorithm of Kruskal (1954) (see also Bollobás, 1979a, Chapter I) an economical spanning tree of $H_n$ is formed by the edges of $\tilde G$ appearing at times $\tau(1), \tau(2), \ldots, \tau(n-1)$, i.e.
\[ s(\tilde G) = \sum_{k=1}^{n-1} X_{\tau(k)}. \]
By Ex. 15 in Chapter 2
About how large is $\tau(k)$? Since $\tau(k)$ is the first $t$ for which $G_t$ has $n-k$ components, and by Corollary 6.14 a graph $G_t$ has about $u(2t/n)n$ components, $\tau(k)$ is about $\tau_0(k)$, where $\tau_0(k)$ is defined by
\[ u\{2\tau_0(k)/n\}n = n - k. \]
\[ u\{2\tau_0(k)/n\} \ge \lfloor n^{7/8}\rfloor/n \]
and so
\[ \exp[-2\{\tau_0(k) + \cdots\}/n] \ge \cdots n^{-1/8}, \]
since $u(c)e^{c} \to 1$ as $c \to \infty$. Easy calculations show that $|u'(c)| > e^{-c}$ (Ex. 10). Hence, for $k \le n_0$
and, similarly, $\cdots$ for every $t \le 4n\log n$. Hence by (6.14), (6.15), (6.16) and (6.17), with probability $1 - o(n^{-1/4})$,
\[ \lim_{n\to\infty}\frac{s_0(n)}{n} = \lim_{n\to\infty}\frac{2}{n}\sum_{k=1}^{n-1}\cdots = \log 2 - 1 + \sum_{k\ge 1}\cdots = 2.0847\ldots \]
Strictly speaking, the value $2.0847\ldots$ for the Bollobás-Simon constant is due to Knuth (2000, Chapter 21), who gave precise estimates for it and determined it up to 45 decimal places.
It is interesting to note that the analogous expectation for the Quick-Find algorithm, in which $w(xy) = (|C_x| + |C_y|)/2$, is considerably larger, namely $n^2/8 + O(n(\log n)^2)$.
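The constant can also be approached by simulation. The sketch below runs Kruskal's algorithm on a random edge order using union-by-size, charging each union of components of orders $a$ and $b$ a cost of $\min(a,b)$; this cost convention is an assumption of the sketch (the text only pins down the Quick-Find convention $(|C_x|+|C_y|)/2$), so the printed average is illustrative rather than a computation of the Bollobás-Simon constant.

```python
import random

def kruskal_merge_cost(n, seed=1):
    """Process uniformly random edges until the graph is connected; each union
    of components of orders a and b is charged min(a, b) (an assumed cost
    convention for this illustration).  Returns the total cost."""
    rng = random.Random(seed)
    parent, size = list(range(n)), [1] * n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    cost, merges = 0, 0
    while merges < n - 1:
        u, v = rng.randrange(n), rng.randrange(n)
        ru, rv = find(u), find(v)
        if ru == rv:
            continue
        cost += min(size[ru], size[rv])
        if size[ru] < size[rv]:
            ru, rv = rv, ru
        parent[rv] = ru
        size[ru] += size[rv]
        merges += 1
    return cost

n = 20000
print(kruskal_merge_cost(n) / n)  # average merge cost per vertex
```

Note that each element can be on the smaller side of a union at most $\log_2 n$ times (its component at least doubles each time), so the cost is deterministically between $n-1$ and $n\log_2 n$.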
Exercises
6.1 Prove that if $0 \le x \le y \le 1$, then
\[ 1 - y < (1-x)e^{x-y}. \]
6.2 Suppose $t = n/2 + s$, where $s = o(n)$. Imitate the proof of Theorem 6.3 to show that a.e. $G_t$ satisfies $S(G_t) \le (\log n)n^2/s^2$. Show also that if $\cdots$
(This follows rather easily from Corollary 6.4 and Theorem 6.8.)
6.5 Let $c > 1$. Prove that there is a constant $c_1$ such that $\cdots = o(n^{-2})$.
6.6 (Ex. 5 ctd.) Show for $t \ge cn/2$ that the expected number of $(k,d)$-components of $G_t$ with $k \le c_1\log n$ and $d \ge 1$ is $O(n^{-1})$.
6.7 (Ex. 5 ctd.) Prove that there is a constant $c_2 = c_2(c, c_1)$ such that for $t \ge cn/2$ the expected number of vertices of $G_t$ on unicyclic components of order at most $c_1\log n$ is at most $c_2$.
6.8 Prove that for $t \le 2n\log n$ and $k = O(\log n)$ the lifetime of a component of order $k$ of $G_t$ has approximately exponential distribution with mean $n/(2k)$.
6.9 For $t > n/2$ define $c(t) = 2t/n$, $\alpha(t) = c - 1 - \log c$ and $a'(t) = \lceil(2/\alpha)\log n\rceil$. Prove that for $c_0 > 1$, a.e. graph process $\tilde G$ is such that for $t \ge c_0 n/2$ the graph $G_t$ has no small component of order at least $a'(t)$ (cf. Theorem 6.10(i)).
6.10 Show that the function $u(c)$ of Theorem 6.12 satisfies $|u'(c)| > e^{-c}$ (cf. the proof of Theorem 6.15).
7
Connectivity and Matchings

7.1 The Connectedness of Random Graphs
Erdős and Rényi (1964, 1966) proved the following result, which is very surprising at first sight: if $M$ is large enough to ensure that a.e. $G_M$ has minimum degree at least 1, then a.e. $G_M$ is connected and, if $n$ is even, a.e. $G_M$ has a perfect matching. A sharp version of this result is due to Bollobás and Thomason (1985): a.e. graph process is such that the moment the last isolated vertex vanishes, the graph becomes connected and, if $n$ is even, the graph acquires a perfect matching. In §3 we consider matchings in bipartite graphs and in §4 we examine the general case.
The aim of §5 is to take a quick look at the rich body of results concerning reliable networks or, what amounts to the same thing, the probability of connectedness of $G \in \mathscr{G}(H; p)$. Once again, we are faced with an impossible task. Rather than give an unsatisfactorily superficial account of a good many results, we concentrate on a striking theorem of Margulis (1974a).
The connectivity properties and 1-factors of random regular graphs will be examined in §6.
set
\[ \tau(T_k) = \max\bigl(\{0\}\cup\{t : G_t \text{ has a component which is a tree of order } k\}\bigr). \]
Thus $\tau(T_k)$ is the last time our graph contains an isolated tree of order $k$, provided it ever contains one.
As in Chapter 5, denote by $T_k$ the number of components which are trees of order $k$. Clearly
\[ E_p(T_k) = \binom{n}{k} k^{k-2} p^{k-1}(1-p)^{kn+O(k^2)}. \]
For $k \in \mathbb{N}$ define $\theta_0 = \theta_0(k) > 1$ by
\[ n k^{-5/2}(\theta_0\,e^{1-\theta_0})^k = 1. \]
Theorem 7.1 If $k = o(\log n)$ and $\omega(n) \to \infty$, then $\cdots$

Proof (i) The lower bound on $\tau(T_k)$ is rather trivial. Let $t_1 = \lfloor\{\theta_0 - \omega(n)/k\}n/2\rfloor$ and $\theta_1 = 2t_1/n$. Then $\cdots$ Consequently $\cdots$ and $\cdots$
In fact, recalling Theorem 5.4, we see also that the number of isolated vertices is asymptotically a Poisson r.v. with mean $e^{-c}$.
2nd Proof The probability that $G_p$ has a component of order 2 (i.e. an isolated edge) is at most $\cdots$
What is the probability that for some $r$, $3 \le r \le n/2$, our random graph $G_p$ contains a component of order $r$? Since a component of order $r$ contains a tree of order $r$ whose vertices are joined to no vertex outside the tree, this probability is at most
\[ \sum_{r=3}^{\lfloor n/2\rfloor}\binom{n}{r} r^{r-2} p^{r-1}(1-p)^{r(n-r)} \le \sum_{r=3}^{\lfloor n/2\rfloor}\cdots\exp\bigl[-r\{\log n + c + o(1)\} + \cdots\bigr] \le n\sum_{r=3}^{\lfloor n/2\rfloor} r^{5/2}\exp\{-2r(\log n)/5\} \cdots \]
and if
\[ \sum_{k}\sum_{i} q_{ik} - (e^{\cdots}-1)\cdots \to 0 \quad (7.6) \]
then a.e. graph in $\mathscr{G}\{n,(p_{ij})\}$ consists of a giant component and isolated vertices, and the number of isolated vertices has asymptotically a Poisson distribution.
Although in general relation (7.6) is not easily checked, it is satisfied if the probabilities $p_{ij}$ are close to $(\log n)/n$ (see Ex. 3).
The analogue of Theorem 7.3 for random $r$-partite graphs (see Ex. 5) was proved by Palásti (1963, 1968). Considerably more detailed investigations of the connectivity of random $r$-partite graphs were carried out by Kelmans (1967b) and Ruciński (1981).
Since connectedness is a monotone property, $P(G_M \text{ is connected}) \le P(G_{M+1} \text{ is connected})$ for every $M = M(n)$. It is fairly surprising that, as proved by Wright (1972) (see also Wright, 1973b, 1974c, 1975a), the probability of connectedness of an unlabelled graph can be less for more edges. We shall return to this in Chapter 9.
Thus $\tau(Q;\tilde G)$ is the first time when the graph acquires property $Q$. In particular, $\tau\{\kappa(G) \ge k\}$ is the first time when the graph becomes $k$-connected and $\tau\{\delta(G) \ge k\}$ is the first time the minimum degree becomes $k$. Trivially $\tau\{\delta(G) \ge k;\tilde G\} \le \tau\{\kappa(G) \ge k;\tilde G\}$ for every graph process $\tilde G$. It is fascinating, and at first sight rather surprising, that in fact equality holds for a.e. graph process. This was proved by Bollobás and Thomason (1985) for every function $k = k(n)$, $1 \le k \le n-1$, but here we prove it only in the case when $k$ is constant.
Lemma 7.5 Let $k \ge 2$, $\omega(n) \to \infty$, $\omega(n) \le \log\log\log n$ and $p = \{\log n + (k-1)\log\log n - \omega(n)\}/n$. Then almost no $G_p$ contains a non-trivial $(k-1)$-separator.

or, equivalently,
\[ P\Bigl(\bigcup_{r=2}^{r_0} A_r\Bigr) + P\Bigl(\bigcup_{r=r_0+1}^{r_1} A_r\Bigr) + P\Bigl(\bigcup_{r>r_1} A_r\Bigr) = o(1). \quad (7.7) \]
In estimating the first probability above we shall make use of two facts.
(a) A.e. $G_p$ has minimum degree $k-1$. This follows by Theorem 3.5.
(b) Given $l \in \mathbb{N}$, a.e. $G_p$ is such that no two vertices of degree at most $l$ are at distance at most $l$. Indeed, the expected number of paths of length $k \ge 1$ ending in vertices of degrees $i$ and $j$ is at most
\[ n^{k+1} p^{k}(pn)^{i-1}(pn)^{j-1}(1-p)^{2n-i-j-2}\cdots = o(1). \]
Now (a) and (b), with $l = k + r_0$, imply that $P(\bigcup_{r=2}^{r_0} A_r) = o(1)$. To see this note that if $\delta(G_p) \ge k-1$ and $G_p$ contains a $(k-1)$-separator with $|W_1| = r \ge 2$, then $G_p$ contains two vertices of degree less than $k+r$ at a distance less than $k+r$.
Let us turn to the estimate of the second term in (7.7). If $\delta(G_p) \ge k-1$ and $S$ is a $(k-1)$-separator with $r_0 < |W_1| = r \le r_1$, then the graph spanned by $W_1 \cup S$ has at least $r(k-1)/2$ edges. Given $S$ and $W_1$, the probability of such a graph is at most
\[ \binom{\binom{r+k-1}{2}}{r(k-1)/2} p^{r(k-1)/2}. \]
Consequently, by (a),
\[ P\Bigl(\bigcup_{r=r_0+1}^{r_1} A_r\Bigr) \le o(1) + \sum_{r=r_0+1}^{r_1}\binom{n}{r}\binom{n}{k-1}\cdots p^{r(k-1)/2}(1-p)^{r(n-r-k+1)} \le \cdots \]
This shows that, with probability $1 - o(1)$, every edge incident with some $z_i$ and drawn not later than time $M_2$ joins $z_i$ to the giant component of $G_{M_1}$. On the other hand, a.e. $G_{M_2}$ is connected. Hence $\tau\{\kappa(G) \ge 1\} = \tau\{\delta(G) \ge 1\}$ with probability tending to 1.
(ii) Suppose $k \ge 2$. In this case our theorem is an immediate consequence of Lemma 7.5. Indeed, by Lemma 7.5 and Theorem 2.2 there is a function $M(n)$ such that $M(n) \le (n/2)\{\log n + (k-1)\log\log n - \log\log\log n\}$ and almost no $G_M$ contains a non-trivial $(k-1)$-separator. Hence a.e. graph process $\tilde G = (G_t)_0^N$ is such that for $t \ge M(n)$ the graph $\cdots$
Theorem 7.6 If $p(n) \le (\log n + k\log\log n)/n$ for some fixed $k$, then
\[ P\{\kappa(G_p) = k\} \to 1 - \cdots \]
and
Theorem 7.8 Let $p$ be fixed, $0 < p < 1$. Then a.e. $G_p$ contains $t = \lfloor n^{1/7}\rfloor$ distinct vertices $x_1, x_2, \ldots, x_t$ such that if $G_0 = G_p$ and $G_i = G_{i-1} - \{x_i\}$, $i = 1, 2, \ldots, t$, then $\cdots$ Furthermore, for each $i$, $0 \le i \le t$, we have $\kappa(G_i) = \delta(G_i)$ and $G_i$ has larger minimum degree and larger vertex connectivity than any other subgraph of order $n-i$. □
In 1995, Cooper and Frieze introduced the $k$th nearest neighbour graph and studied its connectivity properties. To define this random graph $O_k$, we start with the complete graph $K_n$ with vertex set $[n]$, and pick a random order $e_1, \ldots, e_N$ of the edges of $K_n$. For each vertex $i$, let $E_i$ be the set of the first $k$ edges incident with $i$, and define $O_k = ([n], \bigcup_i E_i)$.
The random graph $O_k$ has a beautiful description in terms of random weights on the edges. Assign i.i.d. r.vs from an atomless distribution to the edges of $K_n$. Then, with probability 1, no two edges have the same weight. To get the edge set of $O_k$, for each vertex take the $k$ lightest incident edges.
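The weight description translates directly into code. The sketch below (an illustration; the brute-force edge scan is an implementation choice) builds $O_k$ from uniform weights; by construction every vertex has degree at least $k$, and the number of edges lies between $kn/2$ and $kn$.

```python
import random
from itertools import combinations

def nearest_neighbour_graph(n, k, seed=0):
    """O_k: assign i.i.d. uniform weights to the edges of K_n; each vertex
    keeps its k lightest incident edges, and O_k is the union of the kept
    edges over all vertices."""
    rng = random.Random(seed)
    weight = {e: rng.random() for e in combinations(range(n), 2)}
    edges = set()
    for v in range(n):
        incident = sorted((e for e in weight if v in e), key=weight.get)
        edges.update(incident[:k])
    return edges

E = nearest_neighbour_graph(60, 2)
deg = {v: sum(v in e for e in E) for v in range(60)}
print(len(E), min(deg.values()))  # edge count in [kn/2, kn]; min degree >= k
```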
There is a 'dynamic' description of $O_k$ as well, based on a random graph process $(G_t)_0^N$ or, what is the same, a random permutation $e_1, e_2, \ldots, e_N$ of the edges. Define a sequence of graphs $H_0 \subset H_1 \subset \cdots \subset H_N$ with vertex set $[n]$ as follows. Set $E(H_0) = \emptyset$. Having defined $H_t$, if at least one of the two endvertices of $e_{t+1}$ has degree less than $k$ then set $E(H_{t+1}) = E(H_t) \cup \{e_{t+1}\}$; otherwise set $E(H_{t+1}) = E(H_t)$. The process stops when every degree of $H_t$ is at least $k$.
The random graph $O_k$ is rather close to $G_{k\text{-out}}$; for example, $kn/2 \le e(O_k) \le kn - \binom{k+1}{2}$. As pointed out by Cooper and Frieze (1995), a.e. $O_1$ has about $3n/4$ edges, a.e. $O_2$ has about $11n/8$ edges and, in general, for $k = o(\log n)$, a.e. $O_k$ has
\[ kn - \frac{n}{2}\sum_{1\le i<j\le k}\cdots + O(\cdots) \]
edges, where $d(i,j)$ is the Kronecker delta. This is easy to deduce from the martingale inequalities, Theorems 1.19 and 1.20, provided one can calculate the expectation
Concerning the connectivity of $O_k$, Cooper and Frieze (1995) proved that a.e. $O_1$ is disconnected and, for every fixed $k \ge 3$, a.e. $O_k$ is $k$-connected. It is rather surprising that $O_2$ behaves in a somewhat peculiar way: for $\omega(n) \to \infty$, a.e. $O_2$ has a giant component with at least $n - \omega(n)$ vertices. Furthermore,
\[ \lim_{n\to\infty} P(O_2 \text{ is connected}) = r, \]
where $r$ is a constant satisfying $0.99081 \le r \le 0.99586$. The proofs of these assertions are ingenious and complicated.
A geometric random graph resembling $O_k$ was introduced and studied by Penrose (1999). For a set $X_n$ of $n$ points in a metric space and a positive real $r > 0$, let $G_r(X_n)$ be the graph with vertex set $X_n$ in which two vertices are joined by an edge if their distance is at most $r$. As $r$ increases, $G_r(X_n)$ has larger and larger minimum degree and connectivity. Given $X_n$, let $\rho_{n,k} = \rho_k(X_n)$ be the minimal value of $r$ at which $G_r(X_n)$ has minimal degree at least $k$, and define $\sigma_{n,k} = \sigma_k(X_n)$ similarly for $k$-connectedness. Formally, the hitting radii $\rho_{n,k}$ and $\sigma_{n,k}$ are given by
\[ \rho_{n,k} = \rho_k(X_n) = \min\{r > 0 : \delta(G_r(X_n)) \ge k\} \]
and
\[ \sigma_{n,k} = \sigma_k(X_n) = \min\{r > 0 : \kappa(G_r(X_n)) \ge k\}. \]
Extending his result from 1997, Penrose (1999) proved that if $X_n$ is a random $n$-subset of the $d$-dimensional unit cube $I^d$ then, with probability tending to 1, for the geometric random graph $G_r(X_n)$ these two hitting radii coincide:
\[ \lim_{n\to\infty} P(\sigma_{n,k} = \rho_{n,k}) = 1. \]
Note that this result is the exact analogue of Theorem 7.6. However, we should not be misled by the easy proof of that result: the proof of this extension to geometric random graphs is much harder.
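For a finite point set the first hitting radius is simply the largest $k$-th nearest-neighbour distance, so it is easy to compute; checking $\sigma_{n,k}$ in addition would require a connectivity test at each candidate radius, which the sketch below (illustrative code, not from the text) omits.

```python
import math
import random

def hitting_radius_min_degree(pts, k):
    """rho_{n,k}: smallest r such that G_r(X_n) has minimum degree >= k.
    Equals the maximum over vertices of the k-th nearest-neighbour distance."""
    n = len(pts)
    r = 0.0
    for i in range(n):
        dists = sorted(math.dist(pts[i], pts[j]) for j in range(n) if j != i)
        r = max(r, dists[k - 1])
    return r

rng = random.Random(0)
pts = [(rng.random(), rng.random()) for _ in range(200)]  # random subset of I^2
r1 = hitting_radius_min_degree(pts, 1)
r2 = hitting_radius_min_degree(pts, 2)
print(r1, r2)  # monotone in k
```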
7.3 Matchings in Bipartite Graphs

where $\omega(n) \to \infty$ and $\omega(n) \le \log\log\log n$. If a.e. graph in $\mathscr{G}(n,p;\ge k)$ has $Q$, then
\[ \tau(Q;\tilde G) = \tau\{\delta(G) \ge k;\tilde G\} \]
for a.e. graph process, where $p = \{\log n + (k-1)\log\log n - \omega(n)\}/n$.
$\cdots$ let $\Phi(\tilde G)$ be the coloured graph whose blue subgraph is $G_{M_1}$ and whose green edges are the first edges added after time $M_1$ and not later than $M_2$ which increased the degree of a vertex to $k$. Clearly $\Phi(\mathscr{S}_n) = \mathscr{S}_n$ and
\[ (1-\varepsilon_n)P\{\Phi^{-1}(A)\} \le P(A) \le \cdots \text{ for all } A \subset \mathscr{S}_n. \quad (7.8) \]
$\cdots$ is such that $P(\cdots) = 1 - \eta_n$, where $\eta_n \to 0$. Hence
\[ P(\cdots) \ge 1 - \eta_n - \cdots \]
and by (7.8) the set $\cdots$ satisfies
\[ P(\cdots) \ge 1 - \cdots. \]
Lemma 7.10 Let $A$ and $B$ be (disjoint) subsets of $V^{(2)}$. Then the probability that a graph $G$ in $\mathscr{G}(n,p;\ge k)$ is such that
\[ A \subset E(G) \quad\text{and}\quad B \cap E(G) = \emptyset \]
is at most
\[ (2p)^{|A|}(1-p)^{|B|}. \]
Proof Note simply that the probability that a set $A_1 \subset V^{(2)}$ consists of green edges is at most $\{2/(n-k)\}^{|A_1|}$. □
Now we are well prepared to prove the main result of the section.
Lemma 7.12 Let $G$ be a bipartite graph with vertex classes $V_1$ and $V_2$, $|V_1| = |V_2| = n$. Suppose $G$ has no isolated vertices and it does not have a complete matching. Then there is a set $A \subset V_i$ ($i = 1$ or 2) such that:
(i) $\Gamma(A) = \{y : xy \in E(G) \text{ for some } x \in A\}$ has $|A| - 1$ elements,
(ii) the subgraph of $G$ spanned by $A \cup \Gamma(A)$ is connected, and
(iii) $2 \le |A| \le (n+1)/2$.
$\cdots$ Hence $|A| \le |B| = |V_2| - |\Gamma(A)| = n - (|A| - 1)$, so $|A| \le (n+1)/2$.
Let us continue the proof of Theorem 7.11. Denote by $F_a$ the event that there is a set $A \subset V_i$ ($i = 1$ or 2), $|A| = a$, satisfying (i), (ii) and (iii) of Lemma 7.12. All we have to show is that in $\mathscr{G}\{K(n,n), p;\ge 1\}$ we have
\[ P\Bigl(\bigcup_{a=2}^{(n+1)/2} F_a\Bigr) = o(1). \]
The first non-trivial lower bound was given by Lazarus (1993), who proved that $\lim_{n\to\infty} E(m(U_n)) \ge 1 + 1/e$. Goemans and Kodialam (1993) improved the lower bound to 1.44 by extending the methods of Lazarus, and Olin (1992) to 1.51.
It is clear that if we are interested in the limit of the expectation $E(m(X_n))$ then only the distribution of the $X_{ij}$ near 0 matters. In particular, we may replace $U_{ij}$ by $A_{ij}$, where $A_{ij}$ has exponential distribution with mean 1, i.e., $P(A_{ij} \le x) = 1 - e^{-x}$ for $x \ge 0$. Aldous (1992) observed that considering the exponential random assignment matrix $A_n = (A_{ij})$ has many advantages over the uniform random assignment matrix $U_n = (U_{ij})$, and many of the proofs become much simpler. Among other results, Aldous (1992) proved that $c^*$ does exist: the expected cost of a minimum assignment (with the exponential distribution, say) does tend to a constant $c^*$ between 1.51 and 2.
Let us see how the lower bound $1 + 1/e$ given by Lazarus (1993) can be proved if we use exponential r.vs $A_{ij}$, $1 \le i,j \le n$, each with mean 1. Set $A_i = \min_j A_{ij}$ and $B_{ij} = A_{ij} - A_i$. Then each $A_i$ is exponentially distributed with mean $1/n$. Furthermore, the entries of the matrix $B_n = (B_{ij})$ are again i.i.d. exponential r.vs with mean 1, except that a random entry $B_{ij(i)}$ of each row is replaced by 0. For $1 \le j \le n$, let $B_j = \min_i B_{ij}$. Trivially,
\[ E(m(A_n)) \ge \sum_{i=1}^{n} E(A_i) + \sum_{j=1}^{n} E(B_j) = 1 + \cdots \ge 1 + \Bigl(1 - \frac{1}{n}\Bigr)^{n} \to 1 + \frac{1}{e}. \]
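The bound is easy to check empirically. The sketch below computes the minimum assignment cost by brute force over permutations (feasible only for small $n$) and averages over exponential matrices; the exact expectation is known to be $\sum_{i=1}^{n} 1/i^2$ (the Parisi formula, proved after this book appeared), which is $\approx 1.464$ for $n = 5$, comfortably above $1 + 1/e \approx 1.368$ and consistent with $c^* = \pi^2/6 \approx 1.645$ in the limit. Illustrative code, not from the text:

```python
import random
from itertools import permutations

def min_assignment(matrix):
    """Brute-force optimal assignment cost; fine for small n."""
    n = len(matrix)
    return min(sum(matrix[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

def avg_cost(n, trials=2000, seed=0):
    """Monte Carlo estimate of E(m(A_n)) with i.i.d. Exp(1) entries."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        a = [[rng.expovariate(1.0) for _ in range(n)] for _ in range(n)]
        total += min_assignment(a)
    return total / trials

print(avg_cost(5))  # ~ 1.49 for n = 5 (= 1 + 1/4 + 1/9 + 1/16 + 1/25)
```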
7.4 Matchings in Random Graphs

\[ \binom{n}{2}(n-2)p^2(1-p)^{2(n-3)}. \]
If this expectation is at least some constant $c$, then with probability at least $\varepsilon(c) > 0$ the graph contains such a configuration of six vertices (Ex. 7). Hence $p$ has to be at least $(1/2n)\{\log n + 2\log\log n + \omega(n)\}$, where $\omega(n) \to \infty$. The main result of the section shows that this trivial necessary condition is sufficient.
Proof Let $p_0 = (\log n + \log\log n)/n$. Since a.e. $G_{p_0}$ is connected, we may and shall assume that $p \le p_0$. The proof consists of a series of lemmas. The first three have no obvious connection with matchings.
Let $A_1$ be the event that (i) $G_p$ consists of a giant component and at most $n^{1/2}/\log n$ isolated vertices, (ii) it has at most $n^{1/2}$ vertices of degree 1, and (iii) no two vertices of degree 1 have a common neighbour.
Proof By the remark after Theorem 7.2, the trees of order 2 vanish after time $(n/4)\{\log n + \log\log n + \omega(n)\}$. Furthermore, it is immediate (cf. the proof of Theorem 7.3) that the expected number of vertices of degree 0 is $o(n^{1/2}/\log n)$ and the expected number of vertices of degree 1 is $o(n^{1/2})$. Finally, $p$ was chosen exactly to ensure that the expected number of pairs of vertices of degree 1 having a common neighbour tends to 0:
\[ \binom{n}{2}(n-2)p^2(1-p)^{2(n-3)} \le \cdots(\log n)^2\exp\{-\log n - 2\log\log n - \omega(n)\} = o(1). \]
Let $A_2$ be the event that $G_p$ does not have $r_0 = \lceil(4\log\log n/\log n)n\rceil$ independent vertices.
Since
\[ \binom{n}{r_0}(1-p)^{\binom{r_0}{2}} \le \Bigl\{\frac{en}{r_0}\,e^{-p(r_0-1)/2}\Bigr\}^{r_0}\cdots, \]
the right-hand side of (7.9) is $o(1)$. □
Proof The two vertex classes of a $K(r_0,r_0)$ can be selected in $\binom{n}{r_0}\binom{n-r_0}{r_0} \le \binom{n}{r_0}^2$ different ways. Hence, as in the proof of Lemma 7.16, the expected number of $K(r_0,r_0)$ graphs in the complement of $G_p$ is at most
\[ \binom{n}{r_0}^2(1-p)^{r_0^2} \le \Bigl\{\Bigl(\frac{en}{r_0}\Bigr)^2 e^{-pr_0}\Bigr\}^{r_0} = o(1). \quad\square \]
\[ P\Bigl(\bigcup_{r=h}^{\lfloor n/2\rfloor} B_r\Bigr) = o(1). \quad (7.12) \]
If $B_r$ holds, then $G_p$ contains a set of $r+2$ independent vertices:
Let $A_4$ be the event that one cannot find a set $R \subset V$ such that $1 \le |R| \le r_0$, $G_p - R$ has at least two components and the union of all but a largest of the components of $G_p - R$ has at least $r_0$ vertices.

Proof We claim that $A_3 \subset A_4$, so the assertion follows from Lemma 7.17. To see $A_3 \subset A_4$, suppose $G_p - R$ has components of orders $a_1 \le a_2 \le \cdots \le a_l$ and $\sum_{i<l} a_i \ge r_0$. Let $j$ be the smallest index with $\sum_{i\le j} a_i \ge r_0$. Then $\cdots$ so
\[ \sum_{i=j+1}^{l} a_i \ge n - r_0 - (n+r_0)/2 \ge r_0. \]
Since no edge joins one of the first $j$ components to the others, the complement of $G_p$ contains a
\[ K\Bigl(\sum_{i\le j} a_i,\;\sum_{i>j} a_i\Bigr) \supset K(r_0, r_0). \quad\square \]
Clearly
\[ \bigcup_{r} B_r \subset \bigcup_{r}\bigcup_{s=r+1} B(r,s) \]
and
\[ \Bigl\{\bigcup B(r,s)\Bigr\} \cap A_4 = \emptyset, \]
so by Lemma 7.18 relation (7.13) and our theorem follow if we show that
\[ \sum_{r}\sum_{s} P\{B(r,s)\} = o(1). \quad (7.15) \]
Now
\[ P\{B(r,s)\} \le \binom{n}{r}\binom{n}{s}\Bigl(\frac{en}{r}\Bigr)^{r}\cdots p^{\cdots}(1-p)^{s(n-r-s)}\cdots, \]
where $n_0 = n - 2r_0$. By the definition of $p$ and $n_0$,
\[ e^{-pn_0} \le \cdots, \]
and so
\[ \sum_{r=s_0} P\{B(r,r)\}\cdots = o(1). \]
For $h \ge 1$ let $C_h$ be the event that $V$ contains pairwise disjoint sets
$T$, $H$ and $S$ such that $0 \le |T| = t \le s_0 - h$, $|H| = h$, $3 \le |S| = s \le s_0$, $S$
spans a component of $G_p - (T \cup H)$, each vertex of $T \cup H$ is joined by an
edge to some vertex in $S$, and for each $x \in T$ there is a vertex $y$ which is
isolated in $G_p - (T \cup H)$ and is adjacent to $x$.
Hence the expected number of triples $(T, H, S)$ with the parameters above
is at most
$$\binom{n}{t}\binom{n}{h}\binom{n}{s}(np)^t(esp)^s(sp)^h(1-p)^{s(n-s-t-h)}.$$
Consequently, $P(C_h) = o(1)$.
We have arrived at the last step in the proof of Theorem 7.14. Let $h_0 \ge 6$
be fixed. Given $t \ge 0$, $u \ge h_0$ and $w \ge 3u$, denote by $C(t, u, w)$ the event
that with $r = t + u$ and $s = t + w$ the event $B_r$ holds, $R = T \cup U$, $S =
T' \cup W$, $|T| = |T'| = t$, $|U| = u$ and $|W| = w$, the set $T'$ consists of
vertices isolated in $G_p - (T \cup U)$, $G_p$ contains a complete matching from $T$
to $T'$, each vertex in $T$ is adjacent to at least one vertex in $W$, $W$ consists
of $u + 1$ components of $G_p - (T \cup U)$ and each of those components sends
at least $h_0$ edges to $U$. Recalling that if $A_1$ holds then no vertex has two
neighbours of degree 1,
$$\bigcup_{r=1}^{s_0}\bigcup_{s=r+1}^{s_0} \{B(r, s) \cap A_1\} \subset \bigcup_{h=0}^{h_0} C_h \cup \bigcup_{t=0}^{s_0}\bigcup_{u=h_0}^{s_0}\bigcup_{w=3u+3}^{s_0} C(t, u, w).$$
To prove (7.16) we make use of the fact that $G_p[W]$ has at least $w - u - 1$
edges and $U$ and $W$ are joined by at least $uh_0$ edges. We obtain the
following crude upper bound on $P\{C(t, u, w)\}$:
$$P\{C(t, u, w)\} \le \binom{n}{t}\binom{n}{u}\binom{n}{w}\, t!\, (np)^t \binom{w(w-1)/2}{w-u-1} p^{w-u-1} \binom{uw}{uh_0} p^{uh_0} =: \gamma(t, u, w),$$
and a short calculation gives
$$\gamma(t, u, w) \le c^{t+w}(\log n)^{5t+4w+4h_0}\, n^{-t/2 - w/2 + 2h_0 + 1 - h_0/2}\big/ w^t.$$
Therefore
$$\sum_{t=0}^{s_0}\sum_{w=h_0}^{s_0} \gamma(t, h_0, w) \le \sum_{t=0}^{s_0}\sum_{w=h_0}^{s_0} (\log n)^{12t}\, n^{-t/2 - w/2 - 2} = o(1),$$
and the analogous estimate holds for the sum of $\gamma(t, u, w)$ over the whole
range of $t$, $u$ and $w$.
Not much effort is needed to prove the hitting time version of Corol-
lary 7.21: one checks that the estimates in the proof of Theorem 7.14
carry over to $\mathscr{G}(n, p;\, \delta \ge 1)$ with $p = (\log n - \log\log\log n)/n$, and then
applies Lemma 7.9.
Theorem 7.23 Let $M = n\omega(n)$, where $\omega(n) \to \infty$. Then a.e. $G_M$ contains
$n/2 + o(n)$ independent edges.
Alon, Rödl and Ruciński (1998) proved that if $0 < 2\varepsilon < c < 1$ and
$n > n(\varepsilon)$, then every super $(c, \varepsilon)$-regular graph of order $2n$ has at least
$(c - 2\varepsilon)^n n!$ and at most $(c + 2\varepsilon)^n n!$ complete matchings. Note that for
$r = r(n) \ge 2\varepsilon n$ almost every $r$-regular bipartite graph satisfies these
conditions.
The rest of this section is somewhat of a digression from the main
line of this book; we shall give a brief account of some results about
1-factors and large matchings (sets of vertex disjoint hyperedges) in
uniform hypergraphs.
Using Chebyshev's inequality, de la Vega (1982) proved the analogue
of Theorem 7.23 for hypergraphs: if $r \ge 2$ is fixed and $M = n\omega(n)$, where
$\omega(n) \to \infty$, then a.e. $r$-uniform hypergraph with $n$ labelled vertices and
$M$ edges has $n/r + o(n)$ vertex-disjoint hyperedges (see Ex. 9).
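de la Vega's statement is easy to see experimentally: even a single greedy pass through the random hyperedges already leaves only a small fraction of vertices uncovered. A minimal sketch (the function name and all parameter values are ours, not from the text):

```python
import random

def greedy_hypergraph_matching(n, M, r=3, seed=0):
    """Sample an r-uniform hypergraph with n vertices and M random edges,
    then greedily collect pairwise disjoint hyperedges."""
    rng = random.Random(seed)
    matching, used = [], set()
    for _ in range(M):
        e = frozenset(rng.sample(range(n), r))
        if used.isdisjoint(e):       # keep e only if it avoids chosen edges
            matching.append(e)
            used.update(e)
    return matching

# With M = n*omega(n) edges the greedy matching already covers most vertices.
m = greedy_hypergraph_matching(n=300, M=3000, r=3, seed=1)
print(len(m))   # a large fraction of the maximum possible n/r = 100
```

A more careful two-round argument (as in the actual proof) pushes the number of uncovered vertices down to $o(n)$.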
It is considerably more difficult to investigate 1-factors of hypergraphs.
Schmidt and Shamir (1983) proved that if $r \ge 3$ is fixed, $n$ is a multiple
of $r$, $\omega(n) \to \infty$ and $M = n^{3/2}\omega(n)$, then a.e. $r$-uniform hypergraph with $n$
labelled vertices and $M$ edges has a 1-factor, i.e., has $n/r$ vertex-disjoint
edges.
By applying the analysis of variance technique first used by Robinson
and Wormald (1992) in their study of Hamilton cycles (see Chapter 8),
7.5 Reliable Networks 189
Frieze and Janson (1995) greatly improved this result: for every $\varepsilon > 0$,
a.e. 3-uniform hypergraph on $[n]$ with $M(n) = \lfloor n^{4/3+\varepsilon} \rfloor$ edges (3-tuples)
has a 1-factor. It seems that this difficult result is still far from best
possible: it is likely that the threshold for a 1-factor is again the same as
the threshold for minimal degree at least 1. Thus the following extension
of Corollary 7.21 is likely to be true. Let $x \in \mathbb{R}$ and $M = M(n) =
(n/r)(\log n + x + o(1))$. Then if $n \to \infty$ through multiples of $r$, the probability
that a random $r$-uniform hypergraph on $[n]$ with $M$ edges has a 1-factor
tends to $1 - e^{-e^{-x}}$. In fact, with probability tending to 1, the hitting time
of a 1-factor may well be exactly the hitting time of minimal degree at
least 1.
Cooper, Frieze, Molloy and Reed (1996) used a similar technique to
prove surprisingly precise results about 1-factors in random regular hy-
pergraphs. Fix $r \ge 2$ and $d \ge 2$, and set
$$\rho_d = 1 + \frac{\log d}{(d-1)\log\frac{d}{d-1}}.$$
Let $G^{(r)}_{d\text{-reg}}$ be a random $d$-regular $r$-uniform hypergraph with vertex
set $[n]$, where $r$ divides $dn$. If $r < \rho_d$ then a.e. $G^{(r)}_{d\text{-reg}}$ has a 1-factor, and
if $r > \rho_d$ then almost no $G^{(r)}_{d\text{-reg}}$ has a 1-factor. The proof requires rather
difficult moment calculations.
Finally, we turn to a beautiful and deep theorem of Alon, Kim and
Spencer (1997) about near-perfect matchings in fixed graphs. There are
constants $c_3, c_4, \dots$ such that the following holds. Let $H$ be an $r$-uniform
$d$-regular hypergraph on $n$ vertices such that no two edges have more
than one vertex in common. Then, for $r = 3$, $H$ contains a matching
covering all but $c_3 n d^{-1/2}(\log d)^{3/2}$ vertices and, for $r > 3$, there is a
matching missing fewer than $c_r n d^{-1/(r-1)}$ vertices. In a Steiner triple
system on $n$ points, every point is in $d = (n-1)/2$ triples, so this theorem
implies that every Steiner triple system on $n$ points contains at least
$n/3 - c_3 n^{1/2}(\log n)^{3/2}$ pairwise disjoint triples.
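The degree count $d = (n-1)/2$ and the "no two edges share more than one vertex" condition can be checked concretely on the smallest non-trivial Steiner triple system, the Fano plane ($n = 7$, so each point lies in 3 triples):

```python
# The 7 lines of the Fano plane: a Steiner triple system on n = 7 points.
fano = [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5),
        (1, 4, 6), (2, 3, 6), (2, 4, 5)]
n = 7

# Every point lies in exactly (n - 1)/2 = 3 triples ...
degree = {v: sum(v in t for t in fano) for v in range(n)}
print(degree)

# ... and any two triples share exactly one point, so the triples form a
# 3-uniform 3-regular "linear" hypergraph of the kind the theorem covers.
for i in range(len(fano)):
    for j in range(i + 1, len(fano)):
        assert len(set(fano[i]) & set(fano[j])) == 1
```

(For $n = 7$ the matching bound above is vacuous; it only bites for large $n$.)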
Given $0 < \varepsilon < \tfrac{1}{2}$, does $f_{H_n}^{-1}(1 - \varepsilon) - f_{H_n}^{-1}(\varepsilon)$ tend to 0 as $n \to \infty$? If it does, then
the sequence $(H_n)$ behaves like $(K_n)$, the sequence of complete graphs,
namely there is a threshold function $p_0(n) \in (0, 1)$ such that if $p$ is slightly
smaller than $p_0$, then $f_{H_n}(p)$ is close to 0, and if $p$ is slightly larger than
$p_0$, then $f_{H_n}(p)$ is close to 1.
Not every sequence has a threshold function for connectedness. As a
simple example, let $H_n$ consist of two complete graphs $K^{n_1}$ and $K^{n_2} =
K^{n-n_1}$ joined by an edge, where $n_1 = \lfloor n/2 \rfloor$. Then $f_{H_n}(p) \to p$ uniformly
on $[0, 1]$, since the probability of $H_n$ being connected depends mostly on
the probability of the edge joining $K^{n_1}$ to $K^{n_2}$ being present.
Clearly, if we join $K^{n_1}$ and $K^{n_2}$ by a fixed number of edges, then
the obtained sequence $(H_n)$ still fails to have a threshold function for
connectedness. On the other hand, if we join $K^{n_1}$ and $K^{n_2}$ by $l(n)$ edges
and $l(n) \to \infty$, then $(H_n)$ does have a threshold function. Note that in the
latter case the edge-connectivity $\lambda(H_n)$ of $H_n$ tends to $\infty$ as $n \to \infty$. What
Margulis proved is exactly that $\lambda(H_n) \to \infty$ ensures that the sequence
$(H_n)$ has a threshold function for connectedness.
Theorem 7.26 There is a function $k : (0, 1) \times (0, 1) \to \mathbb{R}$ such that if
$\lambda(H) \ge k(\delta, \varepsilon)$, then
$$f_H^{-1}(1 - \varepsilon) - f_H^{-1}(\varepsilon) < \delta.$$
$\emptyset \ne \mathscr{A} \subsetneq \mathscr{P}(X)$, then (cf. the remark after Theorem 2.1) $f_{\mathscr{A}}(p)$ is a
strictly increasing function, $f_{\mathscr{A}}(0) = 0$ and $f_{\mathscr{A}}(1) = 1$.
Let us see first that Theorem 7.27 implies Theorem 7.26. Let $k(\delta, \varepsilon)$ be
the function given in Theorem 7.27 and let $H = (V, E)$ be a graph with
$\lambda(H) \ge k(\delta, \varepsilon)$. Set $X = E$ and
$$f_{\mathscr{A}}(p) = P_p(\mathscr{A}) = \sum_{B \in \mathscr{A}} P_p(B).$$
By the definition of $h_{\mathscr{A}}$, writing $f_{\mathscr{A}}(p) = \sum_{k=0}^{m} a_k p^k (1-p)^{m-k}$, we have
$$\frac{df_{\mathscr{A}}(p)}{dp} = \sum_{k=0}^{m} a_k \left(\frac{k}{p} - \frac{m-k}{1-p}\right) p^k (1-p)^{m-k}.$$
Then $\tau > 0$ and we claim that $k(\delta, \varepsilon) = \tau^{-1}(2/\delta)^2$ has the required
properties.
Suppose this is not so and the non-trivial monotone family $\mathscr{A}$ is such
that $\lambda(\mathscr{A}) \ge k(\delta, \varepsilon) = \tau^{-1}(2/\delta)^2$ and
$$f_{\mathscr{A}}^{-1}(1 - \varepsilon) - f_{\mathscr{A}}^{-1}(\varepsilon) \ge \delta.$$
Then there exist $p_1$ and $p_2$ such that
$$\delta/2 \le p_1 < p_1 + \delta/2 \le p_2 \le 1 - \delta/2$$
and
$$\varepsilon \le f_{\mathscr{A}}(p_1) \le f_{\mathscr{A}}(p_2) \le 1 - \varepsilon.$$
Since $f_{\mathscr{A}}(p)$ is a polynomial in $p$, by the mean value theorem there is a
$p \in (p_1, p_2)$ such that
$$\frac{df_{\mathscr{A}}(p)}{dp} \le \frac{1 - 2\varepsilon}{\delta/2} < \frac{2}{\delta}.$$
Consequently, by Lemma 7.28, inequality (7.18) and Theorem 7.29 we
have
$$\frac{2}{\delta} > \frac{df_{\mathscr{A}}(p)}{dp} \ge \tau\, k(\delta, \varepsilon)\, P_p(\mathscr{A})\, P_p(\bar{\mathscr{A}}) \ge \left(\frac{2}{\delta}\right)^2 P_p(\mathscr{A})\, P_p(\bar{\mathscr{A}}),$$
contradicting the bound on $df_{\mathscr{A}}(p)/dp$. $\Box$
Corollary 7.31 Let $g_H(p)$ be the probability that a graph in $\mathscr{G}(H, p)$ contains
a cycle. There is a function $k : (0, 1) \times (0, 1) \to \mathbb{R}$ such that if $H$ has girth
at least $k(\delta, \varepsilon)$, then
$$g_H^{-1}(1 - \varepsilon) - g_H^{-1}(\varepsilon) < \delta.$$
Theorem 7.32 For $r \ge 3$ and $a_0 \ge 3$, a.e. $G_{r\text{-reg}}$ is such that if $V =
A \cup S \cup B$, $a = |A| \le |B|$, $s = |S|$ and there is no $A$-$B$ edge, then
$$s \ge r, \quad\text{if } a = 1,$$
$$s \ge 2r - 3, \quad\text{if } a = 2,$$
$$s \ge (r-2)a, \quad\text{if } 3 \le a \le a_0,$$
and
$$s > (r-2)a_0, \quad\text{if } a > a_0,$$
so s > (r - 2)a.
To complete the proof it suffices to show that a.e. $G_{r\text{-reg}}$ is such that if
$V = A \cup S \cup B$, $a = |A| \ge a_1$ and $|S| = s_1$, then there is at least one $A$-$B$
edge. By Corollary 2.18 it suffices to prove that a.e. configuration has
the corresponding property, and this is what we proceed to do, using the
notation of Theorem 2.16. What is the probability, $S(a)$, that for a given
value of $a$ there is a partition with no $A$-$B$ edge? We have $\binom{n}{s_1}$ choices
for $S$ and $\binom{n-s_1}{a}$ choices for $A$. With probability at least $(n - s_1 - a)/n$, a
configuration edge incident with an element of $A^* = \bigcup_{i \in A} W_i$ joins this
element to $B^* = \bigcup_{i \in B} W_i$, and so, pairing off the elements of $A^*$ one by
one, we see, very crudely, that
$$S(a) \le \binom{n}{s_1}\binom{n-s_1}{a}\left(\frac{a + s_1}{n}\right)^{ar/2}.$$
Now $S(a_1) = o(1)$ and for $a_1 \le a < a + 1 \le a_2$ we have $S(a+1)/S(a) =
O(n^{-1/2})$, so
$$\sum_{a=a_1}^{a_2} S(a) = o(1),$$
as required.
The proof above shows that Theorem 7.32 holds for $a_0 = \lfloor c_0 \log n \rfloor$ as
well, where $c_0 > 0$ depends only on $r$.
Theorem 7.32 implies immediately that for $r \ge 3$ a.e. $r$-regular graph
of even order has a 1-factor.
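The configuration model invoked above is easy to sketch: give each vertex $r$ half-edges ($W_i$ in the notation of Theorem 2.16) and pair all $rn$ half-edges uniformly at random; collapsing the pairs yields a random $r$-regular multigraph. A minimal version (function name ours):

```python
import random

def configuration_multigraph(n, r, seed=0):
    """Pair up the r*n half-edges uniformly at random (r*n must be even)."""
    assert (n * r) % 2 == 0
    rng = random.Random(seed)
    stubs = [v for v in range(n) for _ in range(r)]  # r half-edges per vertex
    rng.shuffle(stubs)
    return [(stubs[i], stubs[i + 1]) for i in range(0, len(stubs), 2)]

edges = configuration_multigraph(n=10, r=3, seed=2)
deg = [0] * 10
for u, v in edges:
    deg[u] += 1
    deg[v] += 1
print(deg)   # every vertex has degree 3 (a loop contributes 2)
```

Conditioning on the multigraph being simple recovers the uniform $r$-regular graph, which is what makes Corollary 2.18-style transfer arguments work.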
7.6 Random Regular Graphs 197
Corollary 7.33 For $r \ge 3$, a.e. element of $\mathscr{G}(n, r\text{-reg})$ has $\lfloor n/2 \rfloor$ independent
edges.
Proof It is well known that every $r$-connected $r$-regular graph has $\lfloor n/2 \rfloor$
independent edges (see, for example, Bollobás, 1978a, p. 88).
Bollobás and McKay (1986) calculated the expectation and variance of
the number of 1-factors of $G_{r\text{-reg}}$. It is interesting to note that although
the expectation is large, so is the variance, and therefore Chebyshev's
inequality does not even imply Corollary 7.33, which itself is very weak.
Theorem 7.34 Let $r \ge 3$ be fixed, $n$ even, and denote by $X = X(G)$ the
number of 1-factors of $G \in \mathscr{G}(n, r\text{-reg})$. Then
$$E(X) \sim e^{1/4}\left\{\frac{(r-1)^{r-1}}{r^{r-2}}\right\}^{n/2}$$
and
$$E(X^2) \sim E(X)^2\, e^{(2r-1)/4(r-1)^2}.$$
one has to work to prove it. Shamir and Upfal (1982) proved that a.e.
$G_{6\text{-out}}$ has $\lfloor n/2 \rfloor$ independent edges, and Frieze (1985c) extended this to
the following best possible result.
Theorem 7.36 For $k \ge 2$, a.e. $G_{k\text{-out}}$ has $\lfloor n/2 \rfloor$ independent edges. In Chap-
ter 8 we shall see considerably stronger results about $\mathscr{G}_{k\text{-out}}$.
Exercises
7.1 Let $0 < p < 1$ be fixed and set $P_n = P(G_p$ is connected$)$.
By considering the probability that the component containing
vertex 1 has $k$ vertices, prove that
$$1 - P_n = \sum_{k=1}^{n-1}\binom{n-1}{k-1}P_k\, q^{k(n-k)},$$
where $q = 1 - p$. (Gilbert, 1959.)
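Gilbert's recurrence in Ex. 7.1 translates directly into an exact computation of $P_n$; a sketch with the convention $P_1 = 1$:

```python
from math import comb

def connectedness_probabilities(n_max, p):
    """Exact P_k = P(G(k, p) is connected) via Gilbert's recurrence:
    1 - P_n = sum_{k=1}^{n-1} C(n-1, k-1) * P_k * q^(k*(n-k))."""
    q = 1.0 - p
    P = [0.0, 1.0]                  # P[0] unused, P[1] = 1
    for n in range(2, n_max + 1):
        s = sum(comb(n - 1, k - 1) * P[k] * q ** (k * (n - k))
                for k in range(1, n))
        P.append(1.0 - s)
    return P

P = connectedness_probabilities(4, 0.5)
print(P[2], P[3])   # P_2 = p = 0.5 and P_3 = 3p^2 - 2p^3 = 0.5
```

For instance $P_4$ at $p = 1/2$ comes out as $38/64$, matching the count of 38 connected labelled graphs on 4 vertices.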
7.2 Let $0 < p < 1$ be fixed and set $R_n = P($vertices 1 and 2 belong
to the same component of $G_p)$. As in Ex. 1, prove that
$$1 - R_n = \sum_{k=1}^{n-1}\binom{n-2}{k-1}P_k\, q^{k(n-k)}.$$
(Gilbert, 1959.)
7.3 Prove that if $p_{ij} = (\log n + c_{ij})/n$ and $\max_{i,j}|c_{ij}| = O(1)$, then
Kovalenko's condition (7.6) is satisfied.
7.4 Use Lemma 8 to prove that
$$\sum_{k=1}^{n}\exp\{-\omega(n)(k/n)^{r-1}\} = o(n).$$
Prove that there are $n_0(c)$ and $0 < b < 1$ such that if $n > n_0(c)$,
then
$$a_n(s) \le e^{(3-c)s}/s!, \quad\text{if } s \le n/\log n,$$
and
$$a_n(s) \le b^s, \quad\text{if } n/\log n \le s \le n/2.$$
[Godehardt and Steinbach (1981), correcting an inequality
claimed by Erdős and Rényi (1959), who applied it to prove
that for $M = \lfloor (n/2)(\log n + c) \rfloor$ a.e. $G_M$ consists of a giant
component and isolated vertices (see Theorem 7.3).]
7.17 Let $x \in \mathbb{R}$ be fixed and $p = \left\{\tfrac{3}{2}\log n + \tfrac{1}{2}\log\log n + \tfrac{1}{2}\log\tfrac{3}{2} + x +
o(1)\right\}^{1/2}/n^{1/2}$. Write $X$ for the number of edges of $G_p$ which are
contained in no triangle. Prove that $X \xrightarrow{d} P(e^{-x})$.
7.18 Call a graph $G$ locally connected if for every vertex $u$ the subgraph
$G[\Gamma(u)]$ of $G$ induced by the set of neighbours of $u$ is connected.
Show that for $p$ as in Ex. 17, the probability that $G_p$ is locally
connected tends to $e^{-e^{-x}}$. (Erdős, Palmer and Robinson, 1983.)
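Local connectedness is cheap to test directly: $K_4$ is locally connected (each neighbourhood induces a triangle), while the 5-cycle is not (each neighbourhood is a pair of non-adjacent vertices). A sketch:

```python
def is_locally_connected(n, edges):
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    for u in range(n):
        nbrs = adj[u]
        if not nbrs:                  # convention: isolated vertex fails
            return False
        # DFS inside the subgraph induced by the neighbourhood of u
        start = next(iter(nbrs))
        seen, stack = {start}, [start]
        while stack:
            x = stack.pop()
            for y in adj[x] & nbrs:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        if seen != nbrs:
            return False
    return True

k4 = [(u, v) for u in range(4) for v in range(u + 1, 4)]
c5 = [(i, (i + 1) % 5) for i in range(5)]
print(is_locally_connected(4, k4), is_locally_connected(5, c5))  # True False
```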
8
Long Paths and Cycles
vertices of degree 0 and 1. What happens if our model is such that
every vertex has a reasonably large degree? Is a.e. $G_{k\text{-out}}$ Hamiltonian
if $k$ is sufficiently large? And $G_{r\text{-reg}}$? In this area the breakthrough is
due to Fenner and Frieze (1983), who answered the first question in
the affirmative. Subsequently the method of the proof was extended by
Bollobás (1983a) and Fenner and Frieze (1984) to prove the analogous
result about $G_{r\text{-reg}}$.
Theorem 8.1 Let $0 < \theta = \theta(n) \le \log n - 3\log\log n$ and $p = \theta/n$. Then a.e.
$G_p$ contains a path of length at least
$$\left(1 - \frac{4\log 2}{\theta}\right)n.$$
Proof Set $r = p/2$ and consider the space $\mathscr{G}(n; r, r)$ of red-blue coloured
multigraphs (see §1 of Chapter 2), in which the red edges are selected
with probability $r$ and the blue edges are selected with probability $r$.
By forgetting the colours of the edges and replacing multiple edges by
simple edges we obtain a random graph $G_{p_0}$, where $p_0 = 1 - (1-r)^2 =
p - p^2/4 \le p$. Hence it suffices to show that a.e. $G \in \mathscr{G}(n; r, r)$ contains a
long path. Before going into details, let us give an intuitive description
of the proof.
The advantage of the model $\mathscr{G}(n; r, r)$ is that we have two chances of
using independence: first when considering the red edges and then when
considering the blue edges. We shall try to construct a long path in a
beautifully straightforward fashion. We start with an arbitrary vertex and
try to construct a long path by extending our path edge by edge, using
8.1 Long Paths in G_{c/n} - First Approach 203
only red edges. When we grind to a halt, we give the blue edges a try to
help us out. If there is a blue edge which can be used to continue our
path, we use it, and continue testing the graph for red edges. If at some
stage we find that no blue edge goes to a new vertex either, we retrace
our steps to the first vertex from which the blue edges have not yet been
tried. If during this process our current path shrinks to a vertex which
we can no longer leave, we start a new sequence of paths by picking
a new initial vertex. As we shall see, with probability tending to 1 this
algorithm does produce a long path.
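In modern terms the procedure is a depth-first search that never returns to vertices it has fully backtracked from; the red/blue distinction only matters for the probabilistic analysis, so a one-colour sketch already shows the mechanics (names and parameters ours):

```python
import random

def greedy_long_path(n, edges, start=0):
    """Extend the current path to any unvisited neighbour of its endpoint;
    when stuck, retrace one step.  Returns the longest path encountered."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    path, visited, best = [start], {start}, [start]
    while path:
        nxt = next((w for w in adj[path[-1]] if w not in visited), None)
        if nxt is None:
            path.pop()                       # dead end: retrace one step
        else:
            visited.add(nxt)
            path.append(nxt)
            if len(path) > len(best):
                best = list(path)
    return best

rng = random.Random(3)
n, theta = 200, 5.0
edges = [(u, v) for u in range(n) for v in range(u + 1, n)
         if rng.random() < theta / n]
deg = [0] * n
for u, v in edges:
    deg[u] += 1
    deg[v] += 1
path = greedy_long_path(n, edges, start=max(range(n), key=deg.__getitem__))
print(len(path))   # typically a constant fraction of n
```

The book's two-colour bookkeeping is what allows the stopping times of this search to be analysed as a Markov chain.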
Now let us give a formal definition of the algorithm. Given a random
coloured graph $G \in \mathscr{G}(n; r, r)$ the algorithm constructs a sequence of
triples $(P_k, U_k, B_k)$, where $P_k$ is a directed path (the current path after
$k$ steps), $U_k$ is a set of vertices (the set of untried vertices) and $B_k$ is
the set of blue vertices (the set of vertices which have been tried for
red edges). To start the algorithm, pick a vertex $x_0 \in V(G)$ and set
$P_0 = x_0$, $U_0 = V - \{x_0\}$ and $B_0 = \emptyset$. Suppose we have constructed
$(P_k, U_k, B_k)$, where the directed path $P_k$ has initial vertex $x_k$, terminal
vertex $y_k$ and the set $U_k$ has $u_k$ vertices. If $U_k = \emptyset$, we stop the
algorithm; otherwise we distinguish three cases.
for all $k$, $j$ and $r$. Since $u_{k+1}(G)$ is $u_k(G)$ or $u_k(G) - 1$, the sequence
$\{u_k(G)\}_0^\infty$ is a Markov chain, as claimed.
For $1 \le i \le n - 1$ set
$$X_i = \max\{k - l : u_k = u_l = i\}.$$
This shows that with probability tending to 1 our algorithm does con-
struct a long path, provided for some pair $(j, k)$ we have $P(Y_j \le k) \to 1$
and $2j + k$ is fairly small.
Moreover,
$$\sigma^2(X_i) = \frac{(1-r)^i}{\{1 - (1-r)^i\}^2}.$$
Furthermore,
$$E(Y_j) = \sum_{i=j+1}^{n-1} E(X_i + 1) \le \sum_{i=j+1}^{n-1} \frac{1}{1 - e^{-ri}} \le \frac{1}{r}\int_{rj}^{rn} \frac{dy}{1 - e^{-y}} = n - \frac{2}{r}\{\log 2 + o(1)\} + o(n),$$
so the path constructed has length at least
$$\left(1 - \frac{4\log 2}{\theta}\right)n + o(n)$$
with probability tending to 1.
A rather attractive feature of the proof above is that the algorithm
it is based on never reverses the direction of an edge with respect to the
direction of the current path. Consequently de la Vega's proof implies
the following result as well.
Theorem 8.3 Let $M_1(n) \le (n/2)(\log n - 3\log\log n)$ and $M_2(n) \le n(\log n -
3\log\log n)$. Then a.e. $G_{M_1}$ has a path of length at least $n - 2(\log 2)n^2/M_1(n)$,
and a.e. directed graph with $M_2(n)$ edges has a directed path of length at least
$n - 2(\log 2)n^2/M_2(n)$.
erased for the first time, then one of $y$ and $w$ became an endvertex, so
we have $\{y, w\} \subset U \cup N$. Hence $y \in U \cup N$. $\Box$
Lemma 8.5 Suppose $h, u \in \mathbb{N}$, $h \ge 2$, and a graph is such that its longest
path has length $h$ but it contains no cycle of length $h + 1$. Suppose, further-
more, that for every set $U$ with $|U| \le u$ we have
$$|U \cup \Gamma(U)| \ge 3|U|. \tag{8.4}$$
Then there are $u$ distinct vertices $y_1, y_2, \dots, y_u$ and $u$ not necessarily distinct
sets $Y_1, Y_2, \dots, Y_u$ such that for $i = 1, 2, \dots, u$ we have $y_i \notin Y_i$, $|Y_i| \ge u$ and
the graph has no $y_i$-$Y_i$ edge. Furthermore, the addition of any of the
$y_i$-$Y_i$ edges creates a cycle of length $h + 1$. In particular, there are at
least $u(u+1)/2$ non-edges such that if any one of them is turned into an
edge, the new graph contains an $(h+1)$-cycle.
so
$$|U \cup \Gamma(U)| \le 3|U| - 1.$$
Consequently, by relation (8.4) we have $|U| \ge u$, say
$$\{y_1, y_2, \dots, y_u\} \subset U.$$
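The boosters of Lemma 8.5 come from Pósa's rotation technique: keeping one endpoint fixed and "rotating" the path at each chord incident with the other endpoint produces many candidate endpoints. A sketch of the rotation closure on a toy example (this illustrates the standard rotation step, not the book's exact notation):

```python
def rotation_closure(path, adj):
    """Endpoints obtainable from `path` (first vertex and vertex set fixed)
    by repeated Posa rotations."""
    start = tuple(path)
    seen_paths = {start}
    endpoints = {start[-1]}
    stack = [start]
    while stack:
        p = stack.pop()
        tip = p[-1]
        for w in adj[tip]:
            if w not in p:
                continue
            i = p.index(w)
            if i < len(p) - 2:          # rotate at the chord (tip, w)
                q = p[:i + 1] + tuple(reversed(p[i + 1:]))
                if q not in seen_paths:
                    seen_paths.add(q)
                    endpoints.add(q[-1])
                    stack.append(q)
    return endpoints

# Path 0-1-2-3-4 plus the chord {4,1}: one rotation makes 2 an endpoint.
adj = {0: {1}, 1: {0, 2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {3, 1}}
print(sorted(rotation_closure([0, 1, 2, 3, 4], adj)))   # [2, 4]
```

Under an expansion condition such as (8.4), the endpoint set grows to size at least $u$, which is exactly where the $u(u+1)/2$ booster pairs come from.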
Lemma 8.7 (i) Let $p = \gamma\log n/n$, $0 < \delta < 1/\gamma < 1$, $u_0 = \lceil (\log n)^2 \rceil$ and
$u_1 = \lfloor \delta n \rfloor$. Then a.e. $G_p$ is such that if $U \subset V(G_p)$ and $u_0 \le |U| \le u_1$, then
$$|U \cup \Gamma(U)| \ge 3|U|.$$
(ii) Let $p = O(\log n/n)$, $c > 1$ and $l_0 = o(n(\log n)^{c/(1-c)})$. Then a.e. $G_p$ is
such that for every $l \le l_0$ a set of $l$ vertices spans at most $cl$ edges.
Proof (i) Let $E(u, w)$ be the expected number of pairs $(U, W)$ such that
$U, W \subset V(G_p)$, $U \cap W = \emptyset$, $|U| = u \ge 1$, $|W| = w \ge 1$ and
$\Gamma(U) \setminus U = W$.
Then, clearly,
$$E(u, w) \le \binom{n}{u}\binom{n}{w}(pu)^w(1-p)^{u(n-u-w)},$$
since
$$1 - (1-p)^u \le pu.$$
8.2 Hamilton Cycles - First Approach 209
Summing this bound over $u_0 \le u \le u_1$ and $w \le 2u - 1$ shows that the
expected number of offending pairs $(U, W)$ is $o(1)$, proving (i).
(ii) Since
$$\binom{\binom{l}{2}}{cl}p^{cl} < 2(elp)^{cl},$$
the probability that $G_p$ does not have the required property is at most
$$2\sum_{l=4}^{l_0}\binom{n}{l}(elp)^{cl} = o(1). \qquad \Box$$
Lemma 8.8 Let $\omega(n) \to \infty$ and $p = (1/n)\{\log n + \log\log n + \omega(n)\}$. Then
a.e. $G_p$ is such that if $U \subset V(G_p)$ and $|U| \le n/4$, then
$$|U \cup \Gamma(U)| \ge 3|U|. \tag{8.5}$$
Since
$$\sum_{t=4}^{3u_0} n^{1 - \lceil t/3 \rceil}(12\log n)^t = o(1),$$
the assertion follows.
Since $P\{\delta(G_M) \ge 2\} \to 1$ iff $\omega(n) = 2M/n - \log n - \log\log n \to \infty$ (see
Theorem 3.5), if a.e. $G_M$ is Hamiltonian we must have $\omega(n) \to \infty$. Komlós
and Szemerédi (1983) and Korshunov (1977) were the first to show that
this rather trivial necessary condition is also sufficient to ensure that a.e.
$G_M$ is Hamiltonian. The proof we present is from Bollobás (1983a).
Theorem 8.9 Let $\omega(n) \to \infty$, $p = (1/n)\{\log n + \log\log n + \omega(n)\}$ and $M(n) =
\lfloor (n/2)\{\log n + \log\log n + \omega(n)\} \rfloor$. Then a.e. $G_p$ is Hamiltonian and a.e. $G_M$
is Hamiltonian.
by first showing that for $0 \le j \le k$ an element of $\mathscr{G}_j$ contains a long path
with probability $1 - o(1)$.
For $G_k \in \mathscr{G}_k$ and $0 \le j \le k$ let $G_j$ be the subgraph of $G_k$ formed by the
edges of colours $0, 1, \dots, j$. Then $G_k \mapsto G_j$ defines a measure-preserving
map of $\mathscr{G}_k$ into $\mathscr{G}_j$. Conversely, a random element of $\mathscr{G}_j$ is obtained from
a random element of $\mathscr{G}_{j-1}$ by adding edges of colour $j$ with probability
$p_j$.
Given a non-Hamiltonian graph $G$ of order $n$, let $l(G)$ be the maximal
length of a path in $G$; if $G$ is Hamiltonian, set $l(G) = n$. Furthermore, let
$Q$ be the property that $G$ is connected and if $U \subset V(G)$ and $|U| \le n/4$,
then $|U \cup \Gamma(U)| \ge 3|U|$.
as required. $\Box$
Lemma 8.10 Let $M(n) = (n/2)\{\log n + \log\log n - \alpha(n)\} \in \mathbb{N}$, $\alpha(n) \to
\infty$, $\alpha(n) = o(\log\log\log n)$ and let $k \ge 1$ be fixed. Then a.e. graph $G_c \in
\mathscr{G}(n; M, \delta \ge 2)$ is such that
(i) $\delta(G_c) = 2$, $\Delta(G_c) \le 3\log n$ and $G_c$ is connected,
(ii) the graph $G_c$ has at most $g_0 = \lceil \log\log n \rceil$ green edges and the green
edges are independent,
(iii) there are $o\{n/\log n\}$ vertices of degree at most $\frac{1}{10}\log n$, and there
are at most $n$ edges incident with these vertices,
(iv) no two vertices of degree at most $\frac{1}{10}\log n$ are within distance $k$ of
each other,
(v) no cycle of length at most $k$ contains a vertex of degree at most
$\frac{1}{10}\log n$,
(vi) for $(\log n)^4 \le s \le n/4$, every set $S$ of $s$ vertices has at least $3s$
neighbours which do not belong to $S$,
(vii) for $l \le n(\log n)^{-3}$, every set of $l$ vertices spans at most $2l$ edges.
8.3 Hamilton Cycles - Second Approach 213
Proof A.e. $G_M$ is connected and has at most $\log\log n$ vertices of degree
1, implying assertions (i) and (ii). To prove (iii) it suffices to show that
in $G_p$, $p = M/N$, the expected number of vertices of degree at most
$d_0 = \lfloor \frac{1}{10}\log n \rfloor$ is $o(n/\log n)$:
$$n\sum_{d=0}^{d_0}\binom{n-1}{d}p^d(1-p)^{n-1-d} \le n(d_0 + 1)\binom{n-1}{d_0}p^{d_0}e^{-(n-1-d_0)p} = o(n/\log n).$$
Hence a.e. $G_M$ is such that the balls of radius $k$ about its vertices of
degree at most $\frac{1}{10}\log n$ are disjoint and contain $o(n^{3/4})$ vertices. Therefore
the probability that the other endvertex of a green edge belongs to such
a ball is $o(n^{-1/4})$. Since a.e. $G_c$ has at most $g_0$ green edges, (iv) follows.
A slight variant of the argument above shows (v).
Assertion (vi) is immediate from Lemma 8.7(i).
Finally, Lemma 8.7(ii) implies that a.e. $G_M$ is such that any set of
$l \le n(\log n)^{-3}$ vertices spans at most $2l$ edges. Since the green edges are
independent in almost every $G_c$, assertion (vii) follows.
Our next aim is to give an upper bound for $F(n)$. Let $B$ be the blue-
green subgraph of $G$ obtained after the omission of the red edges.
Set
$$S_1 = \{x \in S : d_G(x) < \tfrac{1}{10}\log n\}, \qquad S_2 = S - S_1.$$
Corollary 8.12 Set $\omega(n) = 2M(n)/n - \log n - \log\log n$. Then a.e. $G_M$ is
Hamiltonian iff $\omega(n) \to \infty$, and a.e. $G_M$ is non-Hamiltonian iff $\omega(n) \to -\infty$.
$$P(D_{n,p}\ \text{is Hamiltonian}) \to \begin{cases} 1, & \text{if } \varepsilon > 0,\\ 0, & \text{if } \varepsilon < 0. \end{cases}$$
The second relation follows from the fact that for $\varepsilon < 0$ almost no $D_p$
is connected (see Ex. 6).
Theorem 8.11 tells us that the primary obstruction to a Hamilton cycle
is the existence of a vertex of degree at most 1. Similarly, the primary
obstruction to $l$ edge-disjoint Hamilton cycles is the existence of a vertex
of degree at most $2l - 1$. Bollobás, Fenner and Frieze (1990) studied the
secondary obstruction to these properties. More precisely, for $k \ge 1$ we
say that a graph $G$ has property $\mathscr{Q}_k$, in notation $G \in \mathscr{Q}_k$, if $G$ contains
$\lfloor k/2 \rfloor$ edge-disjoint Hamilton cycles and, if $k$ is odd, also a matching with
$\lfloor n/2 \rfloor$ edges. Thus, if $n$ is even, $G \in \mathscr{Q}_k$ iff $G$ has a $k$-factor $F$ which
contains $\lfloor k/2 \rfloor$ edge-disjoint Hamilton cycles. To rule out the primary
obstruction, we simply condition on the minimal degree being at least $k$,
that is, we consider the space $\mathscr{G}(n, M; \delta \ge k)$ of all graphs on $[n]$ with $M$
edges and minimal degree at least $k$. (If $M < kn/2$ then this set is clearly
empty.) We write $G_{n,M;k}$ for a random element of $\mathscr{G}(n, M; \delta \ge k)$.
For $k \ge 2$, what is the main obstruction to $\mathscr{Q}_k$ in $\mathscr{G}(n, M; \delta \ge k)$? As
noted by Łuczak (1987), $k$-legged spiders or simply $k$-spiders provide such
an obstruction: $k + 1$ vertices of degree $k$ having a common neighbour.
Bollobás, Fenner and Frieze (1990) proved that this is indeed the
case and so, somewhat surprisingly, the threshold function for $\mathscr{Q}_k$ in
$\mathscr{G}(n, M; \delta \ge k)$ is decreasing with $k$. The proof of this result is rather
involved, partly due to the fact that the random graphs $G_{n,M;k}$ are not
too easy to generate in a form that is pleasant to study.
$$\lim_{n\to\infty} P(G_{n,M;k} \in \mathscr{Q}_k) = \begin{cases} 0, & \text{if } d_n \to -\infty \text{ sufficiently slowly},\\ e^{-\varepsilon(d)}, & \text{if } d_n \to d,\\ 1, & \text{if } d_n \to \infty, \end{cases}$$
where
$$\varepsilon(d) = \frac{e^{-(k+1)d}}{(k+1)!\{(k-1)!\}^{k+1}(k+1)^{k(k+1)}}.$$
$$1 - \alpha(c) = \sup\{\alpha \ge 0 : \text{a.e. } G_{c/n} \text{ contains a path of length} \ge \alpha n\}$$
and
$$1 - \beta(c) = \sup\{\beta \ge 0 : \text{a.e. } G_{c/n} \text{ contains a cycle of length} \ge \beta n\}.$$
Clearly $\alpha(c) \le \beta(c)$ and it is, in fact, likely that $\alpha(c) = \beta(c)$ for all $c > 0$.
We shall not prove this but we shall give the same bounds for $\alpha(c)$ and
$\beta(c)$.
At least how large does $\alpha(c)$ have to be? Since a.e. $G_{c/n}$ contains
$\{1 + o(1)\}(1 + c)e^{-c}n$ vertices of degree at most 1,
$$\alpha(c) \ge \{1 + o(1)\}(1 + c)e^{-c}. \tag{8.15}$$
This bound can be improved by taking into account that after the
deletion of the vertices of degree 1, many more vertices of degree 1 will
be created in a typical Go/,, whose deletion results in a further batch of
vertices of degree 1, etc., but we shall be content with (8.15).
Now let us turn to the upper bound on $\beta(c)$. We know from Theorem
8.1 that $\alpha(c) = O(1/c)$. Bollobás (1982d) was the first to show that $\beta(c)$
decays exponentially: $\beta(c) \le c^{24}e^{-c/2}$ if $c$ is sufficiently large. This bound
was improved to an essentially best possible bound by Bollobás, Fenner
and Frieze (1987).
Lemma 8.17 Let $\varepsilon > 0$. If $c$ is sufficiently large, then a.e. $G_{c/n}$ is such that
(i) any $t \le n/6c^3$ vertices of $G_{c/n}$ span at most $3t/2$ edges,
(ii) any $t \le n/3$ vertices of $G_{c/n}$ span at most $ct/5$ edges,
(iii) $n - h \le c^6 e^{-c} n$,
(iv) the set $W = \{x : (1-\varepsilon)c \le d_H(x) \le (1+\varepsilon)c\}$ has at least $(1 - c^{-4})h$
elements, and spans at least $(1-\varepsilon)ch/2$ edges, and
(v) $H$ has at most $(1+\varepsilon)ch/2$ edges.
Let us assume that the graph $H = H(G_{c/n})$ has vertex set $\{1, 2, \dots, h\}$
and degree sequence $6 \le d_1 \le d_2 \le \dots \le d_h \le 4c$. Let $\mathscr{H} = \mathscr{H}(G_{c/n})$
be the set of all graphs with vertex set $\{1, 2, \dots, h\}$ and degree sequence
$(d_i)_1^h$. Turn $\mathscr{H}$ into a probability space by giving all members of $\mathscr{H}$ the
same probability. Note that all members of $\mathscr{H}$ occur as $H = H(G_{c/n})$
with precisely the same probability, since if $H_1, H_2 \in \mathscr{H}$ and $F$ is such
that $V(F) = \{1, 2, \dots, n\}$ and no edge of $F$ joins two vertices of $H_1$, then
8.5 Hamilton Cycles in Regular Graphs - First Approach 221
$H_1 = H(H_1 \cup F)$ and $H_2 = H(H_2 \cup F)$. Hence a.e. $G_{c/n}$ is such that a.e.
element $H$ of $\mathscr{H}(G_{c/n})$ satisfies the conclusions of Lemma 8.17.
In view of this, it suffices to show that for some $\varepsilon > 0$ and large
enough $c$ a.e. graph in $\mathscr{H}$ is Hamiltonian. The proof of this is based on
the colouring argument due to Fenner and Frieze (1983), encountered
in the previous section, and the model of random graphs with a fixed
degree sequence, introduced by Bollobás (1980b) and discussed in §4 of
Chapter 2.
The space $\mathscr{H}$ is rather close to the space of regular graphs (say $c$-
regular graphs), so this assertion is rather close to Theorem 8.21, below,
stating that for large enough $r$, a.e. $r$-regular graph is Hamiltonian. In
fact, our task here is considerably easier since we may use the prop-
erties in Lemma 8.17. Nevertheless, we shall not give the proof, for we
shall encounter the colouring argument once again in the next section
(Theorem 8.19), in a somewhat simpler form.
Frieze (1986d) further improved the upper bound on $\beta(c)$ and showed
that, asymptotically, $\alpha(c)$ and $\beta(c)$ are equal to the right-hand side of
(8.15), which gives the correct order of $\alpha(c)$ and $\beta(c)$.
Theorem 8.18 There is a function $\varepsilon(c)$ satisfying $\lim_{c\to\infty}\varepsilon(c) = 0$ such that
In fact, Frieze proved that the analogue of Theorem 8.18 holds for
$c = c(n) \to \infty$ as well; in particular, Theorem 8.9 is a special case of this
extension.
Theorem 8.19 There is a $k_0 \in \mathbb{N}$ such that if $k \ge k_0$, then a.e. $G_{k\text{-out}}$ is
Hamiltonian.
Proof We shall start by showing that a.e. $G_{k\text{-out}}$ spreads, i.e. satisfies an
assertion similar to Lemma 8.7. To be precise, consider $\mathscr{G}_{k\text{-out}}$ and for
$U \subset V$ and $G \in \mathscr{G}_{k\text{-out}}$ write
$$\Gamma(U) = \Gamma_G(U) = \{x \in V : \vec{ux} \in E(G) \text{ for some } u \in U\}.$$
Proof Note that the conditions imply that $k \ge 5$. If (8.17) fails, then
$|U| \ge u_0 = \lceil k/4 \rceil \ge 2$. For $u_0 \le u \le u_1 = \lfloor \alpha n \rfloor$, the probability that (8.17)
fails is clearly at most
$$\binom{n}{u}\binom{n-u}{3u}\left\{\binom{4u}{k}\Big/\binom{n}{k}\right\}^u < \left\{3^4 k\,(uk/n)^{k-4}(1 - 4u/n)^{4 - n/u}\right\}^u = S_u \le c_k n^{-2}.$$
We shall show that Theorem 8.19 holds with $k_0 = 23$. Set $c = 0.202$
and note that for $k \ge k_0$ inequality (8.16) holds; let $\mathscr{G}_0$ be the set of
graphs in $\mathscr{G}_{k\text{-out}}$ satisfying (8.17).
Given $G \in \mathscr{G}_0$, for each vertex $x$ colour $k - 1$ of the arcs leaving $x$ blue
and the remaining arc red: let $G = B \cup R$ be the resulting colouring. Let
$\mathscr{H}$ be the collection of coloured graphs obtained from elements of $\mathscr{G}_0$
for which the blue graph $B$ has as long a path as $G$ itself. By construction,
With more effort one could prove that Theorem 8.19 holds for a smaller
$k_0$, say for $k_0 = 19$. Nevertheless, this would not be worth the additional
effort, however slight, since Cooper and Frieze (2000) used a different
method to prove it for $k_0 = 4$. We shall return to this in the next section.
Furthermore, it is very likely that the theorem holds even for $k_0 = 3$: a.e.
$G_{3\text{-out}}$ is Hamiltonian. If true, this result would be best possible since
$\lim_{n\to\infty} P(G_{2\text{-out}}$ is not Hamiltonian$) > 0$ (see Exx. 4 and 5).
In view of Theorem 8.19, it is not too surprising that if $r$ is sufficiently
large, then a.e. $G_{r\text{-reg}}$ is Hamiltonian. This was proved independently by
Bollobás (1983a) and Fenner and Frieze (1984), by adapting the proof
above to the model of random regular graphs. Since the proof is fairly
intricate and not too short, we do not give it here.
Theorem 8.21 There is an $r_0 \in \mathbb{N}$ such that if $r \ge r_0$, then a.e. $G_{r\text{-reg}}$ is
Hamiltonian.
As we shall see in the next section, much better results have been
proved by different methods.
Fenner and Frieze (1984) gave a fairly small bound for r0 , namely
r0 < 796, and with more work even that bound could be reduced.
However, the remarks after Theorem 8.19 are valid in this case as well:
it would not be worthwhile to prove r0 < 10, say, at the expense of
more involved calculations, for it is very likely that a.e. cubic graph
is Hamiltonian, so one can take r0 = 3. In fact, by estimating the
variance of the number of Hamilton cycles, Robinson and Wormald
(1984) proved that the probability that a cubic graph is Hamiltonian is at
least 0.974. Furthermore, they also proved that a.e. bipartite cubic graph
is Hamiltonian. On the other hand, a result of Richmond, Robinson and
Wormald (1985) shows that, occasionally, Hamilton cycles are rare: a.e.
cubic planar graph is non-Hamiltonian.
$P(X > 0) \to 1$ as $n \to \infty$. This was the approach used in Chapters 3 and
4, when we studied degrees and small subgraphs, but the methods in the
proofs of Theorems 8.19 and 8.21 were entirely different. The obvious
explanation for this is that in Theorems 8.19 and 8.21 we were looking for
very large and fairly complicated subgraphs; after all, it is NP-complete
to decide whether a graph has a Hamilton cycle. In view of this it is rather surprising
that a streamlined and more sophisticated version of the second moment
method enabled Robinson and Wormald (1994) to prove Theorem 8.21
with $r_0 = 3$.
Before we say a few words about the proof of Theorem 8.22, let us
note that occasionally Hamilton cycles are more elusive than they seem.
For example, Tait conjectured in 1880 that every 3-connected, 3-regular,
planar graph is Hamiltonian, and it was only in 1946 that Tutte found
a counterexample. In 1985 Richmond, Robinson and Wormald proved
that counterexamples to Tait's conjecture are anything but rare: almost
no 3-connected, 3-regular, planar graph has a Hamilton cycle.
At first sight it is surprising that Robinson and Wormald first
proved Theorem 8.21 for cubic graphs (i.e., $r = 3$) and needed more
ideas to extend the result to all $r$-regular graphs with $r \ge 3$. The reason
for this is the method they used: it is easier to compute moments of the
number of Hamilton cycles in the space of cubic graphs than in the space
of $r$-regular graphs for $r \ge 4$.
Here we give only an all-too-brief sketch of the work of Robinson and
Wormald. Let us write $H = H(G_{r\text{-reg}})$ for the number of Hamilton cycles
in $G_{r\text{-reg}}$. First, Robinson and Wormald (1984) determined the asymptotics
of $E(H(G_{3\text{-reg}}))$ and $E(H(G_{3\text{-reg}})^2)$ in the space $\mathscr{G}_{3\text{-reg}}$ of cubic graphs of
order $n$, with $n$ even. Although the variance is far too large to enable one
to deduce that a.e. cubic graph is Hamiltonian, these relations imply that
asymptotically at least $2 - 3/e$ of all cubic graphs are Hamiltonian.
Robinson and Wormald
had the striking idea that the second moment method (i.e., Chebyshev's
inequality) may still give the required result if the space $\mathscr{G}_{3\text{-reg}}$ is par-
titioned into suitable subsets. To start this programme, by considering
triangle-free cubic graphs, Robinson and Wormald (1984) proved that
for large $n$ the probability that a cubic graph is Hamiltonian is at least
0.974.
To prove Theorem 8.22 for $r = 3$, Robinson and Wormald (1992) went
considerably further: they partitioned $\mathscr{G}_{3\text{-reg}}$ according to the distribution
of short odd cycles. (Even cycles do not seem to matter.) Recall from
Corollary 2.19 of Bollobás (1980a) and Wormald (1981a,b) that if we
denote by $Y_k = Y_k(G_{r\text{-reg}})$ the number of cycles of length $k$ in $G_{r\text{-reg}}$, then
$Y_3, Y_4, \dots, Y_l$ are asymptotically independent Poisson r.v.s with means
$\lambda_3, \lambda_4, \dots, \lambda_l$, where $\lambda_k = (r-1)^k/2k$. In particular,
$$E(Y_{2k+1}(G_{3\text{-reg}})) \to \frac{2^{2k}}{2k+1}.$$
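The means $\lambda_k = (r-1)^k/2k$ are trivial to tabulate; for cubic graphs, for instance, the triangle count is asymptotically Poisson with mean $4/3$, so $P(Y_3 = 0) \to e^{-4/3} \approx 0.264$. A sketch:

```python
from math import exp

def cycle_mean(r, k):
    """Asymptotic expected number of k-cycles in a random r-regular graph."""
    return (r - 1) ** k / (2 * k)

# Cubic graphs: triangles, 5-cycles, 7-cycles, ...
for k in (3, 5, 7):
    lam = cycle_mean(3, k)
    print(k, lam, exp(-lam))   # E(Y_k) and the limiting P(Y_k = 0)
```

Note `cycle_mean(3, 3)` is $8/6 = 4/3 = 2^2/3$, agreeing with the displayed formula for odd cycles.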
They estimated the mean and variance of $H$ on such a set, and applied
Chebyshev's inequality to show that, as $l \to \infty$, the conditional probability
$P(H > 0 \mid (c_1, \dots, c_l))$ tends to 1 for every sequence $c_1, c_2, \dots$. Needless to
say, much work and ingenuity are needed to carry out this task.
The case $r \ge 4$ of Theorem 8.22 is proved in a rather different way. It
is easily seen that $E(H(G_{r\text{-reg}})) \to \infty$.
In order to prove that $P(H > 0) \to 1$, one could proceed as in the cubic
case sketched above, but it turns out that the work would be technically
extremely difficult. This is simply because it is much harder to compute
the variance of the number of Hamilton cycles on a set $\Omega(c_1, c_2, \dots, c_l)$,
due to the fact that in $\mathscr{G}_{3\text{-reg}}$ the union of two Hamilton cycles has only
vertices of degrees 2 and 3, while in $\mathscr{G}_{r\text{-reg}}$, with $r \ge 4$, there may be
vertices of degree 4 as well.
Rather than taking this route, Robinson and Wormald conditioned
on the distribution of 1-factors, assuming that n is even. They showed
that the removal of a random 1-factor (complete matching) from Gr-reg
8.6 Hamilton Cycles in Regular Graphs - Second Approach 227
The model D_{k-in, l-out} was studied by Fenner and Frieze (1982), Cooper and Frieze
(1990) and Reed and McDiarmid (1992).
Concerning Hamilton cycles, Frieze and Łuczak (1990) greatly im-
proved on the bound in Theorem 8.19: they proved that a.e. G_{5-out} is
Hamiltonian. Cooper and Frieze (1994) proved the difficult result that a.e.
D_{3-in, 3-out} digraph is Hamiltonian and in 2000, in a long and complicated
paper, they further improved this result.
Exercises
8.1 Denote by μ_p and σ_p² the expectation and variance of the number
of Hamilton cycles in a graph in 𝒢(n, p). Note that μ_p = ½(n -
1)! pⁿ and conclude that if c > 0 is a constant, then
    ((r - 2)^{(r-2)/2} (r - 1) / r^{(r-2)/2})^n.
9
The Automorphism Group
Before even starting on a reformulation of (9.1), let us see a rather
simple condition M has to satisfy if (9.1) is true. If the automorphism
group of a graph of order n is trivial, then the graph contains at most
one isolated vertex and at most one vertex of degree n - 1. Hence, by
Theorem 3.5, if (9.1) holds then

    (2M/n) - log n → ∞  and  {2(N - M)/n} - log n → ∞.    (9.2)

It is perhaps a little surprising that this simple necessary condition on M
is in fact sufficient to imply (9.1). This important result is due to Wright
(1971); earlier Pólya (see Ford and Uhlenbeck, 1957), Erdős and Rényi
(1963) and Oberschelp (1967) had proved (9.1) under much stronger
conditions than (9.2). In order to give an entirely combinatorial proof of
this result, rather different from the original one, we shall proceed very
slowly and cautiously.
What do we have to do to prove (9.1)? Consider the symmetric group
S_n acting on V = {1, 2, ..., n}. For ρ ∈ S_n, let 𝒢(ρ) be the set of graphs in
𝒢_M invariant under ρ and put I(ρ) = |𝒢(ρ)|. Then
For the identity permutation 1 ∈ S_n, we have I(1) = L_M, so (9.1) holds iff
It is clear that I(ρ) depends only on the sizes of the orbits of ρ acting
on V^(2), the set of N pairs of vertices, because a graph G_M belongs to
𝒢(ρ) if and only if the edge set of G_M is the union of some entire orbits
of ρ acting on V^(2). Let us pause for a moment to consider in how many
ways a set of given size can be made up of entire orbits.
Let A be a fixed set with a elements. We shall consider set systems
𝒜 = {A_1, A_2, ..., A_t} partitioning A: A = ⋃_{i=1}^t A_i. Denote by ℱ(b; 𝒜) =
ℱ(b; A_1, ..., A_t) the collection of those b-element subsets of A which are
unions of some of the sets A_i. Thus B ∈ ℱ(b; 𝒜) iff |B| = b and for
9.1 The Number of Unlabelled Graphs
    a = Σ_{j=1}^t j m_j.
The definition of ℱ(b; 𝒜) implies that if f(b; 𝒜) = |ℱ(b; 𝒜)|, then
Furthermore, if 𝒜 is the union of two disjoint systems, say 𝒜' and 𝒜'',
then

    f(b; 𝒜) = Σ_{i=0}^b f(b - i; 𝒜') f(i; 𝒜''),

that is,

    f(b; m_1, ..., m_t) = Σ_{i=0}^b f(b - i; m_1', ..., m_t') f(i; m_1'', ..., m_t'').    (9.7)
The last property of the function f(b; m_1, ..., m_t) we need is not quite
immediate, so we state it as a lemma.

    f(b; m_1, m_2, ..., m_t) ≤ f(b; m_1 + 2, ½(a - m_1 - 2)).
Proof Since a set A_i with more than three elements can be partitioned
into sets of size 2 and 3, by (9.5) we may suppose that 𝒜 contains no
sets of size greater than 3, that is m_4 = m_5 = ⋯ = m_t = 0. Furthermore,
if m_3 = 0 the lemma is proved, since then a - m_1 = 2m_2, and by (9.6)
and

    f(b; m_1 + 1, m_2, 3k + 1) = Σ_{i=0}^b f(b - i; m_1, m_2) f(i; 1, 3k + 1).

In order to show (9.8) we use (9.4) to write out the two sides explicitly.
If the left-hand side of (9.8) is not zero, then i is a multiple of 3.
Furthermore, if i = 6j, then (9.8) becomes
    n!/(n - m)! = (n)_m.    (9.9)
Lemma 9.2 Let 2 ≤ m ≤ n, n ≥ 3, and denote by N_1 = N_1(m) the integer
satisfying
and so

    M_1 ≤ N_1.
    I(ρ) ≤ p^{-M} q^{-(N-M)} (p² + q²)^{(N-M_1)/2}.
so

    N - M_1 ≥ (m/2)(n - 1 - m/2) = N(m).
By inequality (9.5) of Chapter 1 we have

    L_M ≥ 8^{-1/2} p^{-M} q^{-(N-M)} (pqN)^{-1/2}.

Hence, recalling (9.9), we find that

    Σ_{ρ ∈ S(m)} I(ρ)/L_M ≤ 8^{1/2} (pqN)^{1/2} (n)_m (p² + q²)^{N(m)/2}
        ≤ n^{m+1} (p² + q²)^{N(m)/2}.
Therefore

and so
Then

    U_M ~ L_M/n!.
Proof The map sending a graph into its complement preserves the
automorphism group and sets up a one-to-one correspondence between
𝒢_M and 𝒢_{N-M}. Hence in proving this result we may assume that M ≤
N/2. Furthermore, because of Theorem 3 we may and shall assume that
We shall also assume that n ≥ n_0, where n_0 depends on the function ω(n)
and is chosen so that all our inequalities hold. It will be easy to check
that there is such an n_0. As usual, we shall assume that ω(n) does not
grow fast, say ω(n) = o(log log n).
Let N_1 = N_1(m) and N_2 = N_2(m) = ½(N - N_1) be as in Lemma 9.2
and set

    l(m, i) = (N_2 choose i)(N_1 choose M - 2i) / (N choose M),

where the summation is over all i satisfying 0 ≤ i ≤ N_2 and M - N_1 ≤
2i ≤ M. If ρ ∈ S(m), then by Lemma 9.2 we have
Set

    l(m, i)/l(m, i - 1) = {(N - N_1 - 2i + 2)/2i} · (M - 2i + 2)(M - 2i + 1)/{(N_1 - M + 2i - 1)(N_1 - M + 2i)}

and note that for a fixed value of m the ratio l(m, i)/l(m, i - 1) is a decreasing
function of i. In fact, if 1 ≤ i ≤ j, then

    l(m, j)/l(m, j - 1) ≤ l(m, i)/l(m, i - 1),

and so

    l(m, 1)/l(m, 0) = {(N - N_1)/2} · M(M - 1)/{(N_1 - M + 1)(N_1 - M + 2)}.
Since
Then

    l(m, j + 1)/l(m, j) ≤ l(m, 1)/l(m, 0) ≤ ½.
Since

by (9.12) we have

Furthermore,
    N_1 - M ≥ (2/5)(N - M),
so
    l(m, 1) ≤ n²(2n log n)² / {2n^{3/2}(log n)⁴}² < n/(log n)⁶.
    Σ_m (n)_m l(m, 1)/L_M = o(1).

    l(m, i) = (N_2 choose i)(N_1 choose M - 2i) / (N choose M)
            ≤ {N_2^i/i!} {N_1^{M-2i}/(M - 2i)!} {M!/(N - M)^M} = h(m, i).
Proof Let

    U(Q) = {G ∈ 𝒢ᵘ(n, M) : G has Q},
    L(Q) = {G ∈ 𝒢(n, M) : G has Q},
Despite Theorem 9.5 and Corollary 9.6, the model 𝒢ᵘ(n, M) has to
be treated with caution. For example, P_M(Q) ~ P_M^u(Q) need not hold
if P_M(Q) is not bounded away from 0 or P_M^u(Q) is not bounded away
from 1 (see Ex. 4). More importantly, Theorem 2.1, concerning monotone
properties, does not hold for unlabelled graphs. We do not even have
to look far to find a property showing this, for one of the simplest
properties, connectedness, will do. Denote by pᵘ(n, M) the probability
that Gᵘ ∈ 𝒢ᵘ(n, M) is connected. Wright (1972) proved that for every
k ∈ ℕ there is an n = n_0(k) such that
Theorem 9.7 Let r ≥ 3 be fixed and denote by L_r(n) the number of labelled
r-regular graphs of order n and by U_r(n) the number of unlabelled r-regular
graphs of order n. Then, provided rn is even,

    U_r(n) ~ L_r(n)/n! ~ e^{-(r²-1)/4} (rn)!/{(rn/2)! 2^{rn/2} (r!)ⁿ n!}.    □
The second relation is just Corollary 2.17, but the proof of the first
relation is more complicated than that of Theorem 9.4. This is only to
be expected, since the graphs in Theorem 9.7 have many fewer edges
than those in Theorem 9.4 and, furthermore, when proving Theorem 9.7,
we have to work with a fairly complicated model of random graphs.
Of course, the basic aim is the same: we have to prove the analogue of
relation (9.3). Instead of presenting the fairly involved proof, we shall
just pinpoint some of the crucial steps.
To start with, recall the machinery set up in the proof of Theorem
2.16. In particular, m = rn/2 and N(m) is the number of configurations
associated with labelled r-regular graphs of order n. We know from the
proof of Theorem 2.16 that if 0 < c_0 < e^{-(r²-1)/4} and n is sufficiently
large, then

    L_r(n) ≥ c_0 N(m)/(r!)ⁿ.
Theorem 9.8 Let r ≥ 3 be fixed and let n → ∞ such that rn is even. Let
Q be a property of graphs of order n. Then

    P_{r-reg}(Q) = {1 + o(1)} P^u_{r-reg}(Q),

Q holds for a.e. G_{r-reg} iff it holds for a.e. G^u_{r-reg}, and lim_{n→∞} P_{r-reg}(Q) =
c, 0 ≤ c ≤ 1, iff lim_{n→∞} P^u_{r-reg}(Q) = c.    □
We close this brief section with a result resembling Theorem 9.7 but
outside the scope of this book: Babai (1980c) proved that a.e. Steiner
triple system is asymmetric, i.e., has a trivial automorphism group. Note,
however, that a Steiner triple system on n points has fairly many triples:
every point is contained in (n - 1)/2 triples, so the lack of symmetry is
not too surprising.
Then a.e. G_M is such that for every pair of vertices (x, y) there is an
l, 1 ≤ l ≤ l_0 = 3⌈{(log n)/log(M/n)}^{1/2}⌉, with d_l(x) ≠ d_l(y).    □
Theorem 9.10 Let r ≥ 3 and ε > 0 be fixed and set l_0 = ⌈(½ + ε) log n/
log(r - 1)⌉. Then a.e. r-regular labelled graph of order n is such that every
vertex x is uniquely determined by {d_l(x)}_1^{l_0}.    □
Theorem 9.11 For every fixed r ≥ 3 there is an algorithm which, for a.e. G ∈
𝒢(n, r-reg), tests in O(n²) time, for any graph H, whether H is isomorphic
to G or not.
Proof Imitate the proof of Theorem 3.17, putting a graph G ∈ 𝒢(n, r-reg)
into 𝒜 iff the vertices of G are distinguished by their distance sequences.
Order these distance sequences lexicographically.    □
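The algorithm behind Theorem 9.11 can be sketched as follows (an illustrative implementation with hypothetical function names; graphs are adjacency-set dictionaries). For a.e. random regular graph the distance sequences are pairwise distinct, so sorting the vertices by their sequences yields a canonical order; the test then compares edge sets under the induced relabelling, and gives up (returns None) on the exceptional graphs whose sequences fail to separate the vertices:

```python
from collections import deque

def distance_sequence(adj, x):
    # (d_1(x), d_2(x), ...): numbers of vertices at each distance from x, by BFS.
    n = len(adj)
    dist = {x: 0}
    counts = [0] * n
    q = deque([x])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                counts[dist[v]] += 1
                q.append(v)
    return tuple(counts)

def canonical_order(adj):
    # Succeeds only when the distance sequences are pairwise distinct,
    # which holds for a.e. random r-regular graph (Theorem 9.10).
    seqs = {v: distance_sequence(adj, v) for v in adj}
    if len(set(seqs.values())) < len(adj):
        return None                        # sequences fail to separate vertices
    return sorted(adj, key=lambda v: seqs[v])

def probably_isomorphic(adj_g, adj_h):
    if len(adj_g) != len(adj_h):
        return False
    og, oh = canonical_order(adj_g), canonical_order(adj_h)
    if og is None or oh is None:
        return None                        # the algorithm gives up
    rel = dict(zip(og, oh))                # the only candidate isomorphism
    edges_g = {frozenset((rel[u], rel[v])) for u in adj_g for v in adj_g[u]}
    edges_h = {frozenset((u, v)) for u in adj_h for v in adj_h[u]}
    return edges_g == edges_h

# A cycle is vertex-transitive, so the method must give up on it.
cycle = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
print(probably_isomorphic(cycle, cycle))   # None
```

On almost every r-regular graph the test succeeds after n BFS runs of O(n) time each, i.e. in O(n²) time, as the theorem asserts.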
We would not like to give the impression that research on the graph
isomorphism problem is closely related to the theory of random graphs:
this is certainly not the case, although random algorithms are often
faster than deterministic ones. The main and very substantial body of
research on the graph isomorphism problem is outside the scope of this
book; for earlier results the reader is referred to Read and Corneil (1977)
and for more recent results, many of which are based on the theory
of permutation groups, the reader could consult, among others, Babai
(1980a, b, 1981), Babai, Grigoryev and Mount (1982), Filotti, Miller and
Reif (1979), Furst, Hopcroft and Luks (1980), Hoffmann (1980), Hopcroft
and Tarjan (1972, 1974), Hopcroft and Wong (1974), Lichtenstein (1980),
Luks (1980), Mathon (1979), Miller (1978, 1979, 1980) and Zemlyachenko
(1975). For example, Luks (1980) proved that isomorphism of graphs of
bounded degree can be tested in polynomial time (cf. Theorem 9.11).
i.e. A(G) is the minimal number of total changes (additions and deletions
where

    Δ(x, y) = Γ(x) Δ Γ(y) - {x, y}
            = {z ∈ V(G) - {x, y} : z is joined to precisely one of x and y}.
For every vertex z ∈ V(G) there are 2d(z){n - 1 - d(z)} ordered pairs
(x, y) with z ∈ Δ(x, y). Hence

    Σ_{x≠y} |Δ(x, y)| = 2 Σ_z d(z){n - 1 - d(z)} ≤ 2n{(n - 1)/2}².

Therefore

    A(G) ≤ A*(G) ≤ {1/n(n - 1)} Σ_{x≠y} |Δ(x, y)| ≤ (n - 1)/2,

as claimed.    □
The proof above shows that A*(G) = (n - 1)/2 if and only if G is
(n - 1)/2-regular and |Δ(x, y)| = (n - 1)/2 for any two distinct vertices x
and y, i.e. if for any two distinct vertices there are precisely (n - 1)/2
vertices in a different relation to x than to y. Graphs with this property
are the so-called conference graphs (see §3 of Chapter 13), the prime
examples of which are the Paley graphs (see §2 of Chapter 13).
Erdős and Rényi (1963) showed that the bound above is essentially
best possible and, even more, that a.e. G_{1/2} shows that the bound is essentially
best possible.
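The conference-graph identity |Δ(x, y)| = (n - 1)/2 can be checked directly on a small Paley graph; the sketch below (illustrative code, not from the book) builds P_13 and verifies that every pair of distinct vertices is distinguished by exactly (q - 1)/2 = 6 vertices:

```python
def paley_graph(q):
    # Vertices 0..q-1; xy is an edge iff x - y is a nonzero square mod q.
    # For q = 1 (mod 4), -1 is a square, so the relation is symmetric.
    squares = {(i * i) % q for i in range(1, q)}
    return {x: {y for y in range(q) if y != x and (x - y) % q in squares}
            for x in range(q)}

q = 13
adj = paley_graph(q)
delta_sizes = {len((adj[x] ^ adj[y]) - {x, y})
               for x in range(q) for y in range(x + 1, q)}
print(delta_sizes)   # {6}: every pair is separated by exactly (q - 1)/2 vertices
```

Note that P_13 is also (q - 1)/2 = 6-regular, the other half of the equality condition in the proof above.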
Theorem 9.14 (i) Every graph of order n and average degree d has degree
of asymmetry at most 2d{1 - d/(n - 1)}.
(ii) Suppose M = M(n) = (n/2)d(n) ∈ ℕ satisfies
Then

    A(G_M) = {2 + o(1)} d (1 - d/(n - 1))

almost surely.
Theorems 9.12 and 9.13 show that

    A(n) = max_{|G|=n} A(G) = {½ + o(1)} n.    (9.19)
2A*(n). Furthermore, it is easily seen that B(G) ≤ 2A(G) ≤ 2A*(G) so, in
fact, B(n) ≤ 2A(n) ≤ 2A*(n) (Ex. 9).
Spencer (1976) proved that these inequalities are rather close to being
equalities.
Furthermore,

    n/2 - {1 + o(1)}(n log n)^{1/2} ≤ ½B(n) ≤ A(n) ≤ A*(n) ≤ (n - 1)/2.    (9.21)
    □
Just a few words about the proof. It is easily shown that a.e. G ∈
𝒢(n, ½) is such that A*(G) = (n/2) - {1 + o(1)}(n log n)^{1/2}; in fact, this is
an immediate consequence of the fact that each |Δ(x, y)| has binomial
distribution with parameters n - 2 and ½. In the main part of the proof
one shows that if A*(n) ≥ 0.49n and n is sufficiently large, then (9.20)
holds. This is done by picking a graph G with A*(G) = A*(n) and
replacing G by GΔH, where H is the union of 100 random sets of ⌊n/2⌋
independent edges (not necessarily in G). Almost equivalently, H is a
random 200-regular graph. It turns out that a.e. H destroys all global
symmetries, i.e. symmetries which are not transpositions.
For some values of n one can do much better than (9.21): if there is a
conference graph of order n, then A*(n) = (n - 1)/2, so (n - 1)/2 - 200 ≤
A(n) ≤ (n - 1)/2. In particular, as pointed out by Erdős and Rényi (1963),
for every prime power q satisfying q ≡ 1 (mod 4) there is a Paley graph
P_q of order q, so we have (q - 1)/2 - 200 ≤ A(q) ≤ (q - 1)/2. Spencer (1976)
conjectured that A*(n) = n/2 + O(1) and so A(n) = n/2 + O(1). In fact, it
seems likely that A(n) = (n - 1)/2 for infinitely many values of n.
Theorem 9.16 (i) lim_{n→∞} a_n(H) = a(H) exists for every finite group H and
it is a rational number.
(ii) a(H) = 1 if and only if H is a direct product of symmetric groups.
(iii) If H is Abelian but not an elementary Abelian 2-group, then
a(H) = 0.
(iv) The values of a(H) for metabelian groups H are dense in [0, 1].    □
The result above is in striking contrast with the case of Cayley graphs.
Babai and Godsil (1982) conjectured that, unless a group H belongs to a
known class of exceptions, almost all Cayley graphs of H have H for their
full automorphism group. Though this assertion is somewhat cryptic, we
do not expand on it; Cayley graphs will make another appearance in §1
of Chapter 13.
Exercises
9.1 Without relying on any of the results in this chapter, give a simple
proof of the fact that a.e. G ∈ 𝒢(n, ½) has trivial automorphism
group.
9.2 Define an appropriate model 𝒢ᵘ(n, p) and state and prove an
analogue of Theorem 9.5 for the models 𝒢(n, p) and 𝒢ᵘ(n, p).
9.3 (cf. Theorem 9.5.) Let M = (n/2)(log n + x), where x → c, c ∈ ℝ,
and let Q be the property of having at least two isolated vertices.
Sketch a proof of the fact that

    lim_{n→∞} P_M(Q)/P_M^u(Q) > 1.
10
The Diameter
posed by Erdős and Rényi (1962), and Murty and Vijayan (1964); in
this form it seems to be due to Elspas (1964). In spite of its innocent
appearance, the problem is very resilient; we are still far from a satis-
factory solution. For reviews of this problem and other related questions
see Bollobás (1978a, Chapter IV) and Bermond and Bollobás (1982).
A cycle of length 2D + 1 shows that n(D, 2) = 2D + 1 for all D ∈ ℕ.
Furthermore, as in a graph of maximal degree Δ there are at most
Δ(Δ - 1)^{k-1} vertices at distance k from a vertex, we have the following
obvious upper bound on n(D, Δ).
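Summing these neighbourhood counts over the distances 0, 1, ..., D gives the bound of Theorem 10.1, usually written as follows:

```latex
n(D,\Delta) \;\le\; n_0(D,\Delta) \;=\; 1 + \Delta \sum_{k=0}^{D-1} (\Delta-1)^k
 \;=\; 1 + \Delta\,\frac{(\Delta-1)^D - 1}{\Delta - 2} \qquad (\Delta > 2).
```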
The bound in Theorem 10.1 is usually called the Moore bound, and the
graphs showing equality are the Moore graphs of diameter D and degree
Δ. It is clear from the definition of n_0(D, Δ) that a Moore graph of
diameter D and degree Δ is a Δ-regular graph of girth 2D + 1 and order
n_0(D, Δ). This description may also be taken as the definition of a Moore
graph. It is unfortunate that there are very few Moore graphs: Hoffman
and Singleton (1960) proved that if there is a Moore graph of diameter
2 and degree Δ ≥ 3, then Δ ∈ {3, 7, 57}, and Damerell (1973) and Bannai
and Ito (1972) proved that there is no Moore graph for D ≥ 3 and Δ ≥ 3.
In fact, Moore graphs do exist for D = 2 and Δ = 3 and 7, but it is not
known whether there is a Moore graph of diameter 2 and degree 57.
Though there are only a few Moore graphs, it is difficult to improve
substantially on the upper bound n_0(D, Δ). For D = 2, Erdős, Fajtlowicz
and Hoffman (1980) proved that if Δ is even and at least 4, then
n(2, Δ) ≤ Δ² - 1 = n_0(2, Δ) - 2. However, we do not know whether
sup_Δ {n_0(2, Δ) - n(2, Δ)} = ∞; even more, it is not even known whether
sup_{D,Δ} {n_0(D, Δ) - n(D, Δ)} = ∞.
Of course, the relation sup_{D,Δ} {n_0(D, Δ) - n(D, Δ)} < ∞ would show that
the trivial upper bound n_0(D, Δ) is, in fact, almost the correct value of
n(D, Δ). Sadly, the lower bounds for n(D, Δ) we know at the moment
are rather far from n_0(D, Δ). It is precisely here that random graphs come
in: several of the best bounds follow from results on random graphs.
However, this is for later: in the present section we present bounds
obtained by constructions. Although this book is on random graphs, it
would be unreasonable to deny that constructions, especially simple
constructions, are more informative than bounds based on random graphs.
10.1 Large Graphs of Small Diameter
Theorem 10.2 The de Bruijn graph B(h, k) has order h^k, maximal degree
2h and diameter at most k. In particular, if Δ ≥ 2 is even, then

    n(D, Δ) ≥ (Δ/2)^D.
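The three claims of Theorem 10.2 are easy to check by direct construction. The sketch below (illustrative code, using an undirected version of B(h, k) with words joined by one-symbol shifts) builds B(3, 4) and confirms the order h^k = 81, maximal degree at most 2h = 6 and diameter at most k = 4 by BFS:

```python
from collections import deque
from itertools import product

def de_bruijn_graph(h, k):
    # Vertices: the h**k words of length k over an h-letter alphabet;
    # u and v are joined iff one is a one-symbol shift of the other,
    # so every vertex lies on at most 2h edges.
    adj = {w: set() for w in product(range(h), repeat=k)}
    for u in adj:
        for c in range(h):
            v = u[1:] + (c,)
            if v != u:
                adj[u].add(v)
                adj[v].add(u)
    return adj

def diameter(adj):
    def ecc(s):
        dist = {s: 0}
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        return max(dist.values())
    return max(ecc(s) for s in adj)

h, k = 3, 4
adj = de_bruijn_graph(h, k)
print(len(adj), max(len(nb) for nb in adj.values()), diameter(adj))
# order h**k = 81, maximal degree <= 2h = 6, diameter <= k = 4
```

Taking h = Δ/2 and k = D gives the displayed lower bound n(D, Δ) ≥ (Δ/2)^D.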
What can we say about n(D, Δ) for a fixed value of Δ and large values
of D? Leland and Solomon (1982) constructed cubic graphs of small
diameter and large order. For k ∈ ℕ, k ≥ 2, the Leland–Solomon graph
    Γ_k(x) = {y ∈ G : d(x, y) = k}

and write N_k(x) for the set of vertices within distance k of x:

    N_k(x) = ⋃_{i=0}^k Γ_i(x).

Thus diam G ≤ d iff N_d(x) = V(G) for every vertex x, and diam G ≥ d
iff there is a vertex y for which N_{d-1}(y) ≠ V.
Why does a.e. G_p have small diameter, provided p is not too small?
Because a random graph is likely to be 'spreading': given a not too
large subset W of V, with large probability G_p has almost pn|W| vertices
adjacent to vertices in W. Thus, with large probability, the number of
neighbours of a vertex x, d(x) = |Γ_1(x)|, is not much smaller than pn;
the number of vertices at distance 2, |Γ_2(x)|, is not much smaller than
(pn)²; and so on, the number of vertices at distance k, |Γ_k(x)|, is not much
smaller than (pn)^k, where, say, d = d(n) = 2k. Now if |Γ_k(x)| and |Γ_k(y)|
are sufficiently large, then, with large probability, either Γ_k(x) ∩ Γ_k(y) ≠ ∅
or else G_p contains a Γ_k(x)–Γ_k(y) edge. Hence, with large probability,
diam G_p ≤ 2k + 1 = d + 1.
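The heuristic |Γ_k(x)| ≈ (pn)^k is easy to observe in a sample. The sketch below (illustrative code, not from the book) draws one G_{n,p} with pn = 10 and prints the successive neighbourhood sizes found by BFS next to (pn)^k:

```python
import random
from collections import deque

def neighbourhood_sizes(n, p, x, rng):
    # Sample G(n, p) and return {k: |Gamma_k(x)|} via BFS from x.
    adj = [[] for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    dist = {x: 0}
    sizes = {}
    q = deque([x])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sizes[dist[v]] = sizes.get(dist[v], 0) + 1
                q.append(v)
    return sizes

rng = random.Random(1)
n, p = 2000, 0.005                       # pn = 10
sizes = neighbourhood_sizes(n, p, 0, rng)
for k in (1, 2, 3):
    print(k, sizes.get(k, 0), (p * n) ** k)  # |Gamma_k(x)| next to (pn)**k
```

The growth is only approximate at the last level, where the neighbourhoods start to exhaust the vertex set — which is exactly why d is chosen with (pn)^{d-1} = o(n) in the lemmas below.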
When we try to justify the steps above, we encounter two problems.
First, the exceptional probability we work with must be kept rather small,
since the estimates have to be carried out about d(n) times for each vertex.
To make sure that the sum of the exceptional probabilities is still small, we
shall work with concrete constants rather than constants c_1, c_2, ..., and we
shall avoid terms of the form O(n^{-k}), o(1), etc. As concrete constants are
rather unpleasant to work with, we shall keep our estimates very crude
indeed. Secondly, we have to try to make sure that in each subproblem
we tackle we can work with independent events. This is why we shall
keep estimating conditional probabilities.
In preparation for our main result, we need four lemmas. In these
lemmas we shall suppose that 0 < p = p(n) < 1, d = d(n) is a natural
number, d ≥ 2, p^d n^{d-1} = log(n²/c) for some positive constant c, and
pn/log n → ∞ as n → ∞. As our final conclusion will be that, in a range
of p and d slightly smaller than the range given just now, almost every G_p
has diameter d or d + 1, the restriction on p is not too severe. Indeed, we
know that if lim_{n→∞} P(G_p is connected) = lim_{n→∞} P(diam G_p < ∞) > 0,
then p ≥ (log n + c_0)/n for some constant c_0. We also know that if p is
bounded away from 0, or even if just pn^{1/2-ε} is bounded away from 0,
then diam G_p ≤ 2 for a.e. G_p (see Theorem 2.6), so we may assume that
p = o(n^{-1/2+ε}) for every ε > 0.
Note that

    p = n^{-1+1/d} {log(n²/c)}^{1/d}

and

    d = {log n + log log n + log 2 + O(1/log n)}/log(pn),

so the maximum of d is {1 + o(1)} log n/log log n. Clearly

    (pn)^{d-1} = log(n²/c)/p = o(n),
Lemma 10.6 Let x be a fixed vertex, let 1 ≤ k = k(n) ≤ d - 1, and let
K = K(n) satisfy

    6 ≤ K ≤ (pn/log n)^{1/2}/12,

and

    b ≤ 2(pn)^{k-1}.

Set

Then

    P{ ||Γ_k(x)| - apn| ≥ (α_k + β_k + γ_k)apn } < n^{-K²/9}.
Proof How do we find the sets Γ_{k-1}(x) and N_{k-1}(x)? First we test which
vertices are adjacent to x, then which vertices are adjacent to Γ_1(x), and
so on, up to Γ_{k-2}(x). At each stage we have to test pairs of vertices,
at least one of which belongs to N_{k-2}(x). Hence the probability that a
10.2 The Diameter of G_p
    P{ ||Γ_k(x)| - apn| ≥ (α_k + β_k)apn }
        ≤ exp{-α_k² apn/3} ≤ exp{-α_k²(pn)^k/9} = n^{-K²/9}.
and
Proof The conditions imply that δ_{d-1} → 0 as n → ∞, so we may assume
that δ_{d-1} ≤ ½.
Let x be a fixed vertex and denote by 𝒬_k the set of graphs for which

    ||Γ_l(x)| - (pn)^l| ≤ δ_l(pn)^l,   0 ≤ l ≤ k.

Clearly, 𝒬_{d-1} ⊂ ⋯ ⊂ 𝒬_1 ⊂ 𝒬_0.
We shall prove by induction that

    1 - P(𝒬_k) ≤ 2k n^{-K²/9}    (10.2)

for every k, 0 ≤ k ≤ d - 1. For k = 0 there is nothing to prove. Assume
that 1 ≤ k ≤ d - 1 and (10.2) holds for smaller values of k. Clearly

    1 - P(𝒬_k) = 1 - P(𝒬_{k-1}) + P(𝒬_{k-1}) P{ ||Γ_k(x)| - (pn)^k| > δ_k(pn)^k | 𝒬_{k-1} }.

Now if G ∈ 𝒬_{k-1}, then a = |Γ_{k-1}(x)| satisfies |(pn)^k - apn| ≤ δ_{k-1}(pn)^k.
Therefore

    P{ ||Γ_k(x)| - (pn)^k| ≥ δ_k(pn)^k | 𝒬_{k-1} }
        ≤ P(𝒬_{k-1})^{-1} P{ ||Γ_k(x)| - apn| ≥ (δ_k - δ_{k-1})(pn)^k | 𝒬_{k-1} } ≤ 2n^{-K²/9}.

The last inequality holds because of Lemma 10.6, and the second in-
equality holds since dn^{-K²/9} ≤ 1. Consequently

    1 - P(𝒬_k) ≤ 2(k - 1)n^{-K²/9} + 2n^{-K²/9} = 2k n^{-K²/9},

proving (10.2). Lemma 10.7 is an immediate consequence of inequal-
ity (10.2).    □
Lemmas 10.6 and 10.7 tell us that, with probability close to 1, the
sets Γ_1(x), Γ_2(x), ..., Γ_{d-1}(x) do grow at the required rate. The next two
lemmas, which we shall state only, show that, with probability close to 1,
two given vertices are far from each other. These lemmas enable one to
prove that the diameter of G_p is unlikely to be small.
Given distinct vertices x and y, and a natural number k, define

    Γ*_k(x, y) = {z ∈ Γ_k(x) ∩ Γ_k(y) :
        Γ(z) ∩ {Γ_{k-1}(x) - Γ_{k-1}(y)} ≠ ∅ and Γ(z) ∩ {Γ_{k-1}(y) - Γ_{k-1}(x)} ≠ ∅}.

Denote by Π_k ⊂ 𝒢(n, p) the set of graphs satisfying |Γ_{k-1}(x)| ≤ 2(pn)^{k-1}
and |Γ_{k-1}(y)| ≤ 2(pn)^{k-1}. Pick a constant K > e⁷ and for 1 ≤ k ≤ d/2
define c_k = c_k(n, p, K) by
    |N_{d-2}(x)| ≤ 2(pn)^{d-2}  and  ||Γ_{d-1}(x)| - (pn)^{d-1}| ≤ δ_{d-1}(pn)^{d-1},

and

    |Γ{N_{d-1}(x) Δ N_{d-1}(y)}| ≤ 16 p^{2d-1} n^{2d-2}.
We are ready to present the main result of the section, from Bollobás
(1981b).
Proof Though, strictly speaking, we do not know yet that a.e. G_p has
diameter at least d and at most d + 1, it is clear from the statement of
our theorem that diam G_p ≤ d + 1 and diam G_p ≥ d almost surely. This
will be proved by considering the number of pairs of vertices 'far' from
each other.
If for some vertices x, y ∈ G we have y ∉ N_d(x) [equivalently, x ∉
N_d(y)], then we say that x is remote from y and we call (x, y) a remote
pair. Let X = X(G) be the number of remote pairs of vertices in G. We
shall show that the distribution of X tends to the Poisson distribution with
parameter c/2. Assuming P(diam G_p ≤ d - 1) + P(diam G_p ≥ d + 2) = o(1),
this is more than enough to imply the assertion of the theorem.
Our first aim is to prove that almost no G_p contains two remote pairs
Consequently,

    P(x is remote from both y and z) ≤ c^{5/3} n^{-10/3} = o(n^{-3}).
A similar argument shows that, for every fixed r, the rth factorial
moment of X is within o(1) of the expected number of ordered r-tuples
of disjoint remote pairs. In turn, this implies that

    T = ⋃_{i≠j} {N_d(x_i) ∩ N_{d-1}(x_j)},
    S = V(G) - ⋃_{j=1}^r N_{d-1}(x_j),
    S' = S - Γ(T),
    a_i = |A_i|,  s = |S|,  s' = |S'|  and  t = |T|.
    δ_{d-1} log n → 0  and  ε → 0.

Indeed, the first relation holds since pn/(log n)³ → ∞, and, if n is large,

    δ_{d-1} ≤ 3 Σ_{l=1}^{d-1} (α_l + β_l + γ_l) ≤ 4(α_{d-1} + β_{d-1} + γ_{d-1})
        ≤ 4 {K(log n/pn)^{1/2} + 3 log n/pn + 6 log n/(pn)²}.

Furthermore, ε → 0 since

    p^{2d-1} n^{2d-3} = (p^d n^{d-1})²/(pn) ≤ (3 log n)²/pn → 0.
and

    R'_r = P'(for each i, 1 ≤ i ≤ r, some y_i ∈ S' is joined to no vertex of A_i),

    R_r = {1 - (1 - (1 - p)^a)^s},

The assertion of the theorem is precisely the union of the three relations
(10.7), (10.8) and (10.9).    □
Corollary 10.11 (i) Suppose p²n - 2 log n → ∞ and n²(1 - p) → ∞. Then
a.e. G_p has diameter 2.
(ii) Suppose the function M = M(n) ≤ N satisfies

    2M²/n³ - log n → ∞.
Corollary 10.12 (i) Suppose that the functions d = d(n) ≥ 3 and 0 < p =
p(n) < 1 satisfy
and
Corollary 10.13 Suppose 0 < ε < 1 and the sequences (D_k), (Δ_k) are such
that
Proof We shall omit most of the details but will give some of the
underlying ideas.
Recall from Chapter 2 that elements of 𝒢(n, r-reg) are obtained as
images of configurations, i.e. images of partitions of ⋃_{i=1}^n W_i into pairs
(edges), where W_1, W_2, ..., W_n are disjoint r-sets. Let F be a configuration
and let W_i, W_j be classes. Define the distance d(W_i, W_j) = d_F(W_i, W_j)
between W_i and W_j in F as the minimal k for which one can find classes
W_i = W_{i_0}, W_{i_1}, ..., W_{i_k} = W_j such that for every h, 0 ≤ h ≤ k - 1, some
edge of F joins W_{i_h} to W_{i_{h+1}}. If there is no such k, then the distance
between W_i and W_j is said to be infinite. The diameter of F is diam
F = max_{1≤i<j≤n} d(W_i, W_j). By Corollary 2.17 the theorem follows if we
show that diam F ≤ d for a.e. configuration F.
The proof of this resembles the proof of Theorem 10.10. Let t_1 = ⌊d/2⌋
and t_2 = ⌈d/2⌉ - 1. We wish to show that for a fixed class, say W_1, and
any integer k ≤ t_1, with probability close to 1, the set Γ_k(W_1) of classes
at distance k from W_1 is rather large; furthermore, there are many edges
joining the classes in Γ_k(W_1) to the classes in Γ_{k+1}(W_1). The final step is
then to show that, with probability tending to 1, any two classes, say W_1
and W_2, are such that either they are within distance d - 1 or else some
class in Γ_{t_1}(W_1) is joined to some class in Γ_{t_2}(W_2).
As in the proof of Theorem 10.10, the key is to keep our choices
independent as far as possible. In fact, in the proof it is convenient to
use a recursive definition of a configuration, rather than the, admittedly
very simple, original definition.
We shall construct a random configuration by picking edges one by
one, taking those nearest to a fixed class, say W_1, first. To be precise, we
shall construct sets E_i, S_i and M_i in such a way that S_i will be the set of
indices j with d(W_1, W_j) = i in the final configuration, E_i will turn out to
be the set of edges incident with the classes at distance less than i from
W_1, and M_i will be the set of vertices (members of the classes W_j) not
incident with any edge in E_i. Thus to get E_{i+1} from E_i we add to E_i all
the edges incident with the vertices in L_i = M_i ∩ (⋃_{j∈S_i} W_j).
By the remarks above, it is clear that we shall set E_0 = ∅, S_0 = {1} and
M_0 = W = ⋃_{j=1}^n W_j. Suppose we have defined E_i, S_i and M_i. In order
to define E_{i+1}, S_{i+1} and M_{i+1}, we pass through a number of intermediate
stages, corresponding to random selections of single edges.
Set E_i^(0) = E_i, S_i^(0) = ∅, M_i^(0) = M_i and L_i^(0) = L_i = M_i ∩ (⋃_{l∈S_i} W_l).
The set L_i is the set of elements of classes at distance i from W_1 which
are not (and will not be) joined to classes nearer to W_1. Suppose we
have defined E_i^(j), S_i^(j), M_i^(j) and L_i^(j). If L_i^(j) = ∅ (so we can no longer
add edges incident with classes at distance i from W_1, for we have run
out of free elements), then put E_{i+1} = E_i^(j), S_{i+1} = S_i^(j), M_{i+1} = M_i^(j) and
L_{i+1} = M_{i+1} ∩ (⋃_{l∈S_{i+1}} W_l). Otherwise pick a vertex x ∈ L_i^(j) (say take the
first vertex in L_i^(j) in some predefined order, or just take a random vertex
in L_i^(j)), give all vertices in M_i^(j) - {x} the same probability, and choose
one of them, say y, at random. Add the edge xy to our configuration
to be constructed: set E_i^(j+1) = E_i^(j) ∪ {(x, y)}, L_i^(j+1) = L_i^(j) - {x, y} and
M_i^(j+1) = M_i^(j) - {x, y}. Finally, if y ∈ W_l and l ∉ S_i, then set S_i^(j+1) =
S_i^(j) ∪ {l}, otherwise put S_i^(j+1) = S_i^(j).
Since |M_i^(j+1)| = |M_i^(j)| - 2 and L_i^(j+1) ⊂ M_i^(j+1), the process does
terminate, so it does define E_{i+1}, S_{i+1} and M_{i+1}.
What is the significance of the sets E_i, S_i, M_i for a random configura-
tion? After a certain number of steps we must arrive at E_{k+1} = E_k and so
S_{k+1} = ∅. If M_k = ∅, then we have constructed a random configuration.
If M_k ≠ ∅, then M_k is a union of some entire classes W_j. Pick one of
these classes, say W_2, and repeat the process for M_k and W_2 instead of
W and W_1. Having ground to a halt, once again check whether we have
got an entire configuration. If not, then what we have is a partition of
some entire classes W_j, so pick a new class, say W_3, and continue the
process. When, finally, the process does grind to a halt, we have selected
a random configuration.
Fortunately, there is no need to construct an entire random configu-
ration: we may stop with E_k, S_k and M_k, where k is the minimal natural
number with E_{k+1} = E_k. Then E_k is precisely the edge set of the com-
ponent of a random configuration containing W_1, in the following sense:
for every i ≤ k the probability that E_i equals a given set of edges is the
same as the probability that that set is the set of edges incident with the
classes at distance less than i from W_1.
The description above, of course, does nothing to justify the claims at
the beginning of the proof, but it does make our task fairly pleasant.
Nevertheless, for the details we refer the reader to Bollobás and de la
Vega (1982).
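The edge-by-edge construction just sketched is easy to realize in code. In the illustrative implementation below (not the authors'), class W_j consists of the r points p with p // r = j - 1; free points are processed in breadth-first order from W_1, and each is paired with a uniformly chosen free partner — which is exactly why the resulting configuration is uniform, whatever order the free points are processed in:

```python
import random
from collections import deque

def random_configuration_bfs(n, r, rng):
    # Pair the rn points by repeatedly taking a free point -- in BFS order
    # of classes, starting from the class of point 0 -- and matching it
    # with a uniform random other free point.
    free = set(range(n * r))
    pairs = []
    queue = deque([0])             # classes to process (class of point p is p // r)
    seen = {0}
    while free:
        if not queue:              # component finished: restart in a new class
            c = min(free) // r
            seen.add(c)
            queue.append(c)
        c = queue.popleft()
        for x in sorted(p for p in free if p // r == c):
            if x not in free:      # x may already have been used as a partner
                continue
            free.discard(x)
            y = rng.choice(sorted(free))
            free.discard(y)
            pairs.append((x, y))
            if y // r not in seen:  # y's class is reached for the first time
                seen.add(y // r)
                queue.append(y // r)
    return pairs

rng = random.Random(7)
n, r = 60, 3
pairs = random_configuration_bfs(n, r, rng)
print(len(pairs))                  # rn/2 = 90 edges of the configuration
```

Stopping the loop as soon as the queue first empties yields just the component of W_1, which is all the diameter argument needs.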
Theorem 10.15 For a fixed natural number r ≥ 3, the diameter of a.e.
r-regular graph of order n is at least

    ⌊log_{r-1} n⌋ + ⌊log_{r-1} log n - log_{r-1} {6r/(r - 2)}⌋ + 1.    □
By Theorems 10.14 and 10.15, for every fixed r ≥ 3, the diameters of
almost all r-regular graphs take only boundedly many values; in fact, the results
show that if r is large, then for many values of n the diameter tends to
take only one of three possible values. With considerably more work one
could show that for some sequences n_k → ∞ and d_k → ∞ a.e. r-regular
graph of order n_k has diameter d_k or d_k + 1. Thus the diameter of a
random r-regular graph is also almost constant.
If we restrict our attention to r-regular graphs of large girth, then one
can show that there are r-regular graphs of slightly smaller diameter than
guaranteed by Theorem 10.14. Namely, if r ≥ 3 and ε > 0 are fixed and
then for every large enough n there is an r-regular graph of order n and
diameter at most d. This gives us the following bound on n(D, Δ).
Corollary 10.16 If Δ ≥ 3 and ε > 0 are fixed and D is sufficiently large,
then

    n(D, Δ) ≥ (Δ - 1)^D / {(2 + ε) D log(Δ - 1)}.

This shows that lim_{D→∞} {log_2 n(D, Δ)}/D = log_2(Δ - 1), guaranteeing
that there is plenty of room for improvement on the constructions men-
tioned after Theorem 10.5: one should aim for the constant {log_2(Δ - 1)}^{-1},
which would be best possible.
{log n - ω(n)}/n. We know from Chapter 2 that 𝒢_M and 𝒢_p are closely
related; say, by Theorem 2.2,

    P(G_M has Q) ≤ 3 P(G_p has Q)    (10.12)

for every monotone property Q. We shall often use (10.12) to estimate
probabilities in the blue subgraph of a graph in 𝒢(n, M; ≥ 1).
Let us show that a.e. graph in 𝒢(n, M; ≥ 1) is such that for k ≤ D/2
every vertex has many vertices at distance k from it. As earlier in this
chapter, for a graph G and vertex x ∈ V = V(G) set Γ_k(x) = {y ∈ V :
d_G(x, y) = k}, and put d_k(x) = |Γ_k(x)|. We claim that a.e. G ∈ 𝒢(n, M; ≥ 1)
is such that for every vertex x we have

    (1/9)(log n)² ≤ d_3(x) ≤ 9(log n)³    (10.13)

and if 3 ≤ k ≤ D/2, then

    (1/9)(log n)^{k-1} · (1/3) log n ≤ d_k(x) ≤ 9(log n)^{k-1} · 3 log n    (10.14)
10.4 Graph Processes
To see (10.13), note that in a.e. G ∈ 𝒢(n, M; ≥ 1) there are few green
edges and few vertices of degree at most (1/3) log n. Hence a.e. graph is such
that no green edge has an endvertex of degree at most (1/3) log n, so

    |Γ_1(x) ∪ Γ_1(y)| ≥ ⌊(1/3) log n⌋ + 1 = m_0,    (10.15)

whenever xy is a green edge. Furthermore, the probability that some blue
edge xy fails to satisfy (10.15) is, by (10.12), at most
Then

    P_p(| |f(A)| - a log n | > δ a log n) = o(n^{-2}).    (10.16)

To prove (10.16), note that for any given vertex y ∈ V \ B we have
P_p{y ∉ f(A)} = (1 - p)^a = γ, say, so, in 𝒢(n, p), |f(A)| has binomial
distribution with parameters n - b and 1 - γ. Since

    ap(1 - ap) ≤ 1 - e^{-ap} ≤ 1 - (1 - p)^a ≤ pa,

we have

    |(n - b)(1 - γ) - a log n| ≤ (δ/10) a log n.
    3(1 - p)^{a/2} ≤ exp{-(1/9)(log n)^{D-2}} ≤ exp(-5 log n) = o(n^{-2}).

This shows that a.e. graph is such that any two of its vertices are within
distance D_1 + D_2 + 1 = D of each other, completing the proof of (10.11).
The first inequality in Theorem 10.10 is considerably easier to prove.
For example, take p = (log n + log log log n)/n. Then a.e. G_p is connected,
so the first inequality holds if a.e. G_p has diameter at least ⌈(log n -
log 2)/log log n⌉ + 1. This can be shown by proving appropriate analogues
of inequalities (10.13) and (10.14).    □
With some more work, one could increase the lower bound in Theo-
rem 10.17 by about 1 and so obtain that for most values of n the diameter
of G_t is, almost surely, one of two values. Furthermore, by imitating the
proof of Theorem 10.17, one can show that for any t ≥ t_0 the diameter
of G_t is almost determined.
10.5 Related Results
Theorem 10.18 Suppose M = M(n) = (n/2)d(n) ∈ ℕ, d(n) - log n → ∞ and
d(n) ≤ n - 1. Then a.e. G_M is such that

    ⌊(log n - log 2)/log d(n)⌋ ≤ diam G_M ≤ ⌈(log n + 6)/log log n⌉.
Recalling Theorem 10.10, we see that the assertion of Theorem 10.18 is
new only in the range d(n) − log n → ∞ and d(n) = O{(log n)³}. Moreover,
if d(n), the average degree, is considerably greater than log n, then the
gap in Theorem 10.18 can be narrowed (cf. Ex. 2).
A variant of the proof of Theorem 10.17 shows that if c > 1 and
M = ⌊cn/2⌋, then, almost surely, the diameter of the giant component
of G_M is O(log n). Furthermore, if c is large enough, then the asymptotic
order of the diameter of the giant component is almost determined.
for some constant c, then a.e. G_{m,n;p} is such that any two vertices of V have
a common neighbour in W. A fortiori, a.e. G_{m,n;p} has diameter at most 4.
□
or if k is even and

p^k m^{k/2} n^{k/2−1} − 2 log n → ∞,

or if k is even and

p^k m^{k/2} n^{⌊k/2⌋} − 2 log n → −∞,
and
lim_{n→∞} P_p{d(G_p) = s} = 1 − exp(−c(G)/s!).
For G ∈ 𝒢, let X_G be the 0–1 r.v. on Ω which is 0 at a point ω ∈ Ω iff
the walk on G, given by the sequence ω and starting at the distinguished
vertex of G, visits all vertices. Define

Y(ω) = 1
and so
2
E(Y) = E(XG) 2J.J] < 2-t+ln (rnz)n/ 1,
and give precise bounds for the diameter. In what follows, we shall give
a brief sketch of these results.
Our first aim is to define a random graph process (G_t^k)_{t≥1} for each
k ≥ 1 such that G_t^k has vertex set [t] and kt edges, and the probability of
an edge tj is proportional to the degree of j in G_t. Since the distribution
of a random k-element subset of [t] is not specified by the marginal
probabilities with which the various elements are contained in this set,
when we add a vertex to G_t we shall add k new edges one at a time.
If we want to start with really small graphs like the 'vertexless and
edgeless graph' or a graph with one vertex and some edges (loops), then
we are forced to allow multiple edges and loops. Following Bollobás and
Riordan (2002), this is the route we shall take.
First we define a random process (G_t^1)_{t≥1}, for k = 1; the general case
will follow easily. The graphs are nested: G_1^1 ⊂ G_2^1 ⊂ ⋯, with G_t^1 having
vertex set [t] and t edges (and loops), with precisely one of these edges
incident with t. Thus for s < t the graph G_s^1 is the subgraph of G_t^1 induced
by [s]. Clearly, G_1^1 is the graph on [1] with a single loop. All that remains
is to specify how to choose the random edge st incident with t in G_t^1,
given G_{t−1}^1. The random variable s is distributed as follows:
For the rather long proof of this theorem we refer the reader to the
original article; here we shall give only a brief sketch, paying more
attention to the lower bound, which is considerably easier than the upper
bound.
The lower bound can be proved by comparing G_{kn}^1 with a random
graph on [N] = [kn], in which the edges are chosen independently of
each other, and the probability of an edge ij is c/√(ij) for some constant
c > 0. To be precise, the following lemma turns out to be very useful.
Lemma 10.30 Let H be a loopless graph on [N] in which every vertex i
has at most one neighbour j with 1 ≤ j < i. Then

P(H ⊂ G_N^1) ≤ 2^{e(H)(Δ(H)+2)} ∏_{ij∈E(H)} 1/√(ij).
10.6 Small Worlds 279
This lemma can be proved by examining what happens in the process
(G_t^1) as we pass from G_{t−1}^1 to G_t^1.
With the aid of Lemma 10.30, one can prove the following slightly
stronger form of the lower bound in (10.18): a.e. G_n^k is such that the
distance ρ(n, n − 1) between n and n − 1 is at least L = log n/(20k² log log n).
In fact, this stronger assertion can be proved by estimating the expectation
E(X_l) of the number X_l of paths of length l between n and n − 1.
Consider a particular path Q = v_0v_1 ⋯ v_l on [n] with v_0 = n, v_l = n − 1
and l < L. A graph H on [kn] is a realization of Q if H has edges
x_j y_{j+1}, j = 0, 1, …, l − 1, with ⌈x_j/k⌉ = ⌈y_j/k⌉ = v_j. Since H has maximal
degree at most 2, Lemma 10.30 implies that
P(H ⊂ G_{kn}^1) ≤ 2^{4l} ∏_{j=0}^{l−1} 1/√(x_j y_{j+1}) ≤ 16^l ∏_{j=0}^{l−1} 1/(k√(v_j v_{j+1})),

so that, summing over the realizations H of Q,

P(Q ⊂ G_n^k) ≤ (16k²)^l (1/√(v_0 v_l)) ∏_{j=1}^{l−1} 1/v_j.

Summing over the choices of the interior vertices v_1, …, v_{l−1},

E(X_l) ≤ (16k²)^l (1/√(n(n−1))) (Σ_{v=1}^{n} 1/v)^{l−1} ≤ (16k²)^l (2 log n)^{l−1}/√(n(n−1)),

which, summed over all l < L, is o(1).
w(NI(v)) = Z 1
jENI(v)
Exercises
10.1 Give a detailed proof of Theorem 10.18.
10.2 Let M = M(n) = (n/2)d(n) ∈ ℕ be such that d(n)/log n → ∞
and d(n) ≤ (log n)⁴. Prove that almost every G_M satisfies

⌊(log n + log log n)/log d(n)⌋ ≤ diam G_M ≤ ⌈(log n + log log n + 1)/log d(n)⌉.
10.3 (cf. Theorem 10.21 and the assertion after Theorem 10.22.) Prove
that almost no G_{m,n;p} has an isolated vertex iff

m(1 − p)^n + n(1 − p)^m → 0.
The main aim of this chapter is the study of the complete subgraphs
of G_{n,p} for a fixed p. We know from §1 of Chapter 2 that one of the
advantages of keeping p fixed is that the spaces 𝒢{n, P(edge) = p} need
not be considered one at a time, for they are images of the single prob-
ability space 𝒢(ℕ, p) consisting of graphs on ℕ whose edges are chosen
independently and with probability p. A pleasing property of 𝒢(ℕ, p) is
that the term 'almost every' has its usual measure-theoretic meaning.
A clique of G_{n,p} depends on many of the edges incident with few of the
vertices. A consequence of this turns out to be that the clique number of
G_{n,p} is remarkably close to a constant. There are a set M ⊂ ℕ of density 1
and a function cl: M → ℕ such that for a.e. G ∈ 𝒢(ℕ, p) there is a number
m_0 = m_0(G) such that if n ≥ m_0, n ∈ M, then cl(G_n) = cl(n). A proof of
this theorem of Bollobás and Erdős (1976a) is the main aim of §1. In §2
we discuss the possibility of approximating the distribution of the number
of complete subgraphs of a given order by a Poisson distribution.
As an application of the results of §1, in §3 we obtain estimates of the
greedy chromatic number of G_{n,p}, while the second section is about an
approximation of the number of complete r-graphs by a Poisson r.v. The
highlight of the chapter is §4, in which we use martingale inequalities to
determine the asymptotic value of χ(G_{n,p}) for p constant. The last section
is about sparse graphs: we shall sketch a large number of results.
11.1 Cliques in G_p
A maximal complete subgraph of a graph is called a clique of the graph.
The clique number cl(G) of a graph is the maximal order of a clique of G;
equivalently, cl(G) is simply the maximal order of a complete subgraph.
The independence number β_0(G) of G is the maximal cardinality of an
independent set of vertices in G. Thus β_0(G) = cl(Ḡ).
Matula (1970, 1972, 1976) was the first to notice that for fixed values of
p the distribution of the clique number of a r.g. G_p is highly concentrated:
almost all G_p's have about the same clique number. Results asserting this
phenomenon were proved by Grimmett and McDiarmid (1975); these
results were further strengthened by Bollobás and Erdős (1976a).
As we mentioned earlier, for the greater part of this section we fix
0 < p < 1 and consider the space 𝒢(ℕ, p). Given a graph G ∈ 𝒢(ℕ, p),
we write G_n for G[{1, 2, …, n}], the subgraph of G spanned by the set
{1, 2, …, n}. It turns out that for a.e. G ∈ 𝒢(ℕ, p) the sequence cl(G_n) is
almost entirely determined.
For r ∈ ℕ let Y_r be the r.v. on 𝒢(n, p) defined by
[Note that f(r) is simply the expression for E_r in which \binom{n}{r} has been
replaced by its Stirling approximation.] It is easily checked that

r_0 = 2 log_b n − 2 log_b log_b n + 2 log_b(e/2) + 1 + o(1)
    = 2 log_b n + O(log log n) = (2/log b) log n + O(log log n),

where, to simplify the notation, we wrote b for 1/p.
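Already for modest n the value r_0(n) predicts the clique number well. The sketch below (function names ours; Bron–Kerbosch with pivoting is one standard exact algorithm, not the book's method) compares r_0(n) with the clique number of one sample of 𝒢(n, p):

```python
import math
import random
from itertools import combinations

def r0(n, p):
    """r_0(n) = 2log_b n - 2log_b log_b n + 2log_b(e/2) + 1, with
    b = 1/p (the o(1) term is dropped)."""
    logb = lambda x: math.log(x) / math.log(1.0 / p)
    return 2 * logb(n) - 2 * logb(logb(n)) + 2 * logb(math.e / 2) + 1

def clique_number(n, p, seed=0):
    """Exact cl(G) for one sample of G(n,p), via Bron-Kerbosch."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    best = 0

    def bk(r, cand, excl):
        nonlocal best
        if not cand and not excl:          # r is a maximal clique
            best = max(best, len(r))
            return
        pivot = max(cand | excl, key=lambda w: len(adj[w] & cand))
        for v in list(cand - adj[pivot]):
            bk(r | {v}, cand & adj[v], excl & adj[v])
            cand.remove(v)
            excl.add(v)

    bk(set(), set(range(n)), set())
    return best

print(round(r0(50, 0.5), 2), clique_number(50, 0.5, seed=1))
```

For n = 50 and p = 1/2 the formula gives r_0 ≈ 8.2, and sampled clique numbers do indeed cluster around 8 or 9.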
284 Cliques, Independent Sets and Colouring
Let 0 < ε < 2. Given a natural number r ≥ 2 let n_r be the maximal
natural number for which

E(n_r, r) ≥ r^{1+ε}.

Then

n_r = (r/e) b^{(r−1)/2} + o(r b^{r/2}),   n'_r = (r/e) b^{(r−1)/2} + o(r b^{r/2})

and

n'_r/n_r → 1 as r → ∞.

In particular, the set of natural numbers M = {n : n'_r ≤ n ≤ n_{r+1} for
some r ≥ 2} has density 1. This is the set of graph orders for which the
clique number is almost determined.
E(Y_r²) = \binom{n}{r} Σ_{l=0}^{r} \binom{r}{l} \binom{n−r}{r−l} p^{2\binom{r}{2} − \binom{l}{2}},   (11.2)
so the variance of Y_r is

σ_r² = E(Y_r²) − E_r²

and thus
By our assumption,

b^r > n^{1+ε} and r ≤ 3 log_b n.

Turning now to the space 𝒢(ℕ, p), these inequalities imply that, for a
fixed r,

P{G : Y_r(G_n) = 0 for some n, n'_r ≤ n ≤ n_{r+1}} ≤ 4r^{−(1+ε)}.

As Σ_{r=2}^{∞} r^{−(1+ε)} < ∞, the Borel–Cantelli lemma implies that for a.e. graph
G ∈ 𝒢(ℕ, p), with the exception of finitely many r's we have
Since cl(G_n) ≤ cl(G_{n'}) whenever n ≤ n', roughly speaking, Theorem 11.1
tells us that there is a function cl: ℕ → ℕ such that for a.e. G ∈ 𝒢(ℕ, p)
there is an integer m_0(G) such that if n ≥ m_0(G), then cl(G_n) = cl(n) or
cl(n) + 1; furthermore, for a.e. n we have cl(G_n) = cl(n). What is this
function cl(n)? Note that for δ = o(1) we have
Corollary 11.2 For a.e. G ∈ 𝒢(ℕ, p) there is a constant m_0(G) such that if
n ≥ m_0(G), then

⌊r_0(n) − 2(log log n)/log n⌋ ≤ cl(G_n) ≤ ⌊r_0(n) + 2(log log n)/log n⌋.

Furthermore,
Let us estimate the probability that cl(G_n) differs from r_0(n) substan-
tially. First, for the probability that cl(G_n) is substantially larger than
r_0(n) we have the following trivial but fairly good upper bound:

Theorem 11.3 (i) Let 0 < ε, r: ℕ → ℕ, 0 < r(n) ≤ r_0(n), r(n) → ∞ and
put

t = t(n) = ⌈r_0(n) − r(n) − ε⌉.

Then
r_0(|V_i|) ≥ r_0(n) − ε/2 − 2 log_b s ≥ r_0(n) − ε/2 − t ≥ r(n) + 1 + ε/2.

Thus, if n is large, the probability that G_n[V_i] contains no K^l with l ≥ r(n)
is at most n^{−1}. Since the graphs G_n[V_i], i = 1, 2, …, s, are independent of
each other,

P{cl(G_n) < r(n)} ≤ n^{−s}.
Q = P² + P + 1 and m = ⌊n/Q⌋.

Partition V(G_n) = {1, 2, …, n} into Q classes, say C_1, C_2, …, C_Q, each
having m or m + 1 vertices. Consider the sets C_1, C_2, …, C_Q as points of
a finite projective geometry. For each line e of this geometry, let G_e be
the subgraph of G_n with vertex set W_e = ∪_{C_i∈e} C_i whose edges are the
edges of G_n joining distinct C_i's. Note that G_e is just a random element
of 𝒢(K_e, p), where K_e is the complete (P + 1)-partite graph with vertex
classes C_i ∈ e. [See §1 of Chapter 2 for the model 𝒢(H, p).]
Since

(P + 1)m ≤ |W_e|,

we have

P{cl(G_n) < r} ≤ n^{−3Q}.
Theorem 11.4 Given constants 0 < p < 1 and 0 < ε < 1, a.e. random graph
in 𝒢(n, p) is such that it has a clique of order r for all r satisfying

(1 + ε) log_b n ≤ r ≤ (2 − ε) log_b n.
To see that c_0 < b note that not only is an element of 𝒢(n, p) unlikely
to contain cliques of order less than (1 − ε) log_b n, but it is also unlikely
to contain any set of r ≤ (1 − ε) log_b n vertices not dominated by another
vertex. Even more, if r ≤ (1 − ε) log_b n, then the probability that some
r-set of vertices is not dominated by any vertex in a fixed subset of εn
vertices is at most

\binom{n}{r} (1 − p^r)^{εn} ≤ n^r exp(−p^r εn/2) ≤ exp(−n^{ε/2}),
Theorem 11.5 Let 0 < p < 1 and ε > 0 be fixed. Set b = 1/p.
(i) A.e. G ∈ 𝒢(ℕ, p) contains a K(x_1, x_2, …) such that

x_r ≤ b^{r(1+ε)}

for every r.
(ii) A.e. G ∈ 𝒢(ℕ, p) contains no K(x_1, x_2, …) such that

x_r ≥ b^{r(1−ε)}

for every r.
Theorem 11.6 Let r = r(n) = O(n^{1/3}) and let p = p(n), 0 < p < 1, be such
that
->o and2
Now F_r ≤ E(Y_r) and

F_2 ≤ (r⁴/n²)(p^{−1} − 1) = o(1)
and
rnrn ( n) r- 1
1
Fr r= ") )< {E(Yr) p
Theorem 11.7 Suppose 0 < p < 1 is fixed and r = r(n) ∈ ℕ is such that
E(Y_r) = E{k_r(G_p)} → λ for some λ > 0. Then Y_r →_d P_λ,

lim_{n→∞} P{cl(G_p) = r − 1} = e^{−λ} and lim_{n→∞} P{cl(G_p) = r} = 1 − e^{−λ}.

Set

λ_r = E(Y_r) and λ_{r+1} = E(Y_{r+1}).

Then

P{cl(G_p) = r − 1} = e^{−λ_r} + o(1),

P{cl(G_p) = r} = e^{−λ_{r+1}} − e^{−λ_r} + o(1)

and

P{cl(G_p) = r + 1} = 1 − e^{−λ_{r+1}} + o(1).
The form of Corollary 11.8 is slightly misleading: it gives the impression
as if in a certain range of n three clique numbers could appear with
probabilities bounded away from 0 although, by Corollary 11.2, we know
that this is not the case. The explanation is, of course, very simple: given
0 < c < C < ∞, if n is sufficiently large, then at most one of λ_r and
λ_{r+1} is in the interval (c, C). Note also that instead of the definition of r
in Corollary 11.8, we could have taken r = ⌊r_0⌋ or r = ⌈r_0⌉, where r_0
is the real number used in the proof of Theorem 11.1 and appearing in
Corollary 11.2: an approximation of r ∈ ℝ satisfying E(Y_r) = 1, with a
suitable interpretation of this expectation.
Theorem 11.7 is easily extended to the case when E(Y_r) tends to ∞
rather slowly. However, by the method of Barbour we can do much
better. All we have to do is imitate the proof of Theorem 4.15.
Theorem 11.9 Let 0 < p < 1 be fixed and let r = r(n) be such that with
Proof As in the proof of Theorem 4.15, write 𝒦 for the set of all
complete r-graphs with vertex set contained in {1, 2, …, n}, write α, β, …
for the elements of 𝒦 and for α ∈ 𝒦 define a r.v. X_α on 𝒢(n, p) by

X_α(G) = 1, if α ⊂ G, and X_α(G) = 0, otherwise.
Therefore

Σ_{α} Σ_{β≠α, α∩β≠∅} E(X_α X_β) = \binom{n}{r} Σ_{l=2}^{r−1} \binom{r}{l} \binom{n−r}{r−l} p^{2\binom{r}{2} − \binom{l}{2}}.
11.2 Poisson Approximation 293
Now the ratio of consecutive terms in this sum is at most

(r − l)²/{(l + 1)(n − 2r + l + 1)} p^{−l}.

Thus

R = O{(log n)⁴/n² + (log n)³/n},

as required. □
large, then

(ii) Let 0 < ε_0 = ε_0(n) < 1 and k_0(n) = ⌈n/[{1 − ε_0(n)} log_d n]⌉. Then

Proof (i) We shall prove considerably more than we have to: we shall
show that for each l, 1 ≤ l ≤ n, with probability greater than 1 − n^{−u/2}
the lth colour class C_l has fewer than

vertices. Indeed, what is the probability P(l; j_1, j_2, …, j_{k_2}) that for each i,
1 ≤ i ≤ k_2, vertex j_i is precisely the (k_1 + i)th vertex of C_l? Clearly

since the probability that vertex j_i is not joined to any vertex in a given
11.3 Greedy Colouring of Random Graphs 295
if n is sufficiently large.
Therefore with probability greater than 1 − n^{−u/2} every colour class
produced by the greedy algorithm has at most k − 1 vertices, so P{χ_g(G_p) <
n/(k − 1)} < n^{−u/2}.
(ii) Let A_j^k be the event that at least k colours are used to colour the
first j vertices, let A^k = A_n^k be the event χ_g(G_p) ≥ k and let B_j^k be the
event that vertex j gets colour k. Note that for 1 ≤ j < n we have

B_{j+1}^{k+1} ∩ A_j^k ⊂ A_{j+1}^{k+1},  A_n^k = A^k and A^{k+1} = ∪_{j=k+1}^{n} B_j^{k+1},

and so

P(B_{j+1}^{k+1} | A_j^k) ≤ p_{j,k}

and

P(A^{k+1}) ≤ P(A^{k+1} | A^k) ≤ Σ_{j=k+1}^{n} p_{j,k} ≤ n p_{n,k}.
Therefore
and
Theorem 11.12 Let 0 < p < 1 and ε > 0 be fixed and set d = 1/(1 − p).
Then a.e. G_p satisfies

(n/(2 log_d n)){1 + log_d log n/log n} ≤ χ(G_p) ≤ (n/log_d n){1 + (2 + ε) log log n/log n}.
Theorem 11.13 Let 0 < p < 1 and c > 0 be constants and set d = 1/(1 − p)
and k = ⌊cn/log_d n⌋. Then

lim_{n→∞} E{P_{G_p}(k)} = 0 if c < 1/2, and lim_{n→∞} E{P_{G_p}(k)} = ∞ if c > 1/2.
The result above seems to indicate that χ(G_p) is indeed about n/(2 log_d n)
for a.e. G_p; the trouble is that for k ~ (1 + ε)n/(2 log_d n) the variance of P_{G_p}(k)
is rather large, so Chebyshev's inequality does not imply that P_{G_p}(k) > 0
for a.e. G_p.
Let us take another look at the greedy algorithm. By Theorem 11.10(ii)
we know that the probability of the greedy algorithm using substantially
more than n/log_d n colours is very small. In fact, as pointed out by
McDiarmid (1979a), if we wish, we can very easily obtain a much better
bound on the probability of this unlikely event. Indeed, in the proof we
were very generous: all we used was that the (k_0 + 1)st colour is unlikely
to occur, even if k_0 colours do occur. But for any k ∈ ℕ the (k_0 + k)th
colour is even less likely to arise provided the first k_0 + k − 1 colours do
occur. Hence the least we can conclude from the proof is that

P{χ_g(G_p) ≥ k_0 + k} ≤ exp(k log n − kn^{ε_0}/log_d n).
Corollary 11.15 A.e. G_p is such that, no matter what order of the vertices we
take, the greedy algorithm uses fewer than {1 + 5(log log n)/log n} n/log_d n
colours.
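The greedy (first-fit) algorithm discussed in this section is easy to state in code; in the sketch below (ours) the random edges are exposed vertex by vertex, which is exactly the view taken in the proofs above:

```python
import math
import random

def greedy_chromatic(n, p, seed=None):
    """Greedy (first-fit) colouring of one sample of G_{n,p}: vertices
    are taken in order, each receiving the least colour absent from its
    earlier neighbours; edges are exposed vertex by vertex."""
    rng = random.Random(seed)
    colour = []
    for v in range(n):
        # colours already used on the earlier neighbours of v
        used = {colour[u] for u in range(v) if rng.random() < p}
        c = 0
        while c in used:
            c += 1
        colour.append(c)
    return max(colour) + 1

n, p = 2000, 0.5
print(greedy_chromatic(n, p, seed=0), n / math.log(n, 2))
```

For p = 1/2 and n = 2000 the greedy colouring typically uses about n/log₂ n ≈ 182 colours, roughly twice the optimal n/(2 log₂ n), in line with Theorems 11.10 and 11.12.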
P(X ≤ EX − t) ≤ e^{−t²/2nc²}   (11.8)

and

P(|X − EX| ≥ t) ≤ 2e^{−t²/2nc²}.   (11.9)
There are many graph invariants that satisfy (11.8) for a small value of
c. For example, if X is the clique number then (11.8) holds with c = 1, so

P(|cl(G_{n,p}) − E(cl(G_{n,p}))| ≥ t) ≤ 2e^{−t²/2n}.

Similarly, the chromatic number of a graph changes by at most 1 if we
change some of the edges incident with a vertex, so

P(|χ(G_{n,p}) − E(χ(G_{n,p}))| ≥ t) ≤ 2e^{−t²/2n}.
In particular, if ω(n) → ∞ then a.e. G_{n,p} ∈ 𝒢(n, p) is such that its clique
number is within ω(n)n^{1/2} of f(n) = E(cl(G_{n,p})) and its chromatic number
is within ω(n)n^{1/2} of g(n) = E(χ(G_{n,p})). This theorem of Shamir and
Spencer (1987) was the first result about random graphs that used a
martingale inequality to prove that, with probability tending to 1, a r.v.
is concentrated in a short interval. In fact, Shamir and Spencer also
proved that for small values of p the chromatic number of a.e. G_{n,p} is in
a considerably shorter interval.
Theorem 11.16 Let 0 < p = p(n) = cn^{−a} < 1, where c > 0 and 0 < a < 1.
Then there is a function u = u(p, n) such that

(i) if 0 < a < 1/2 and ω_0(n) → ∞ then

P(u ≤ χ(G_{n,p}) ≤ u + ω_0(n)(log n)n^{1/2−a}) → 1

as n → ∞;

(ii) if 1/2 < a < 1 then

P(u ≤ χ(G_{n,p}) ≤ u + ⌊(2a + 1)/(2a − 1)⌋) → 1.

Note that, in particular, if a > 5/6 then, for a.e. G_{n,p}, the chromatic
number χ(G_{n,p}) takes one of five values.
What is fascinating about the result above is that although it tells
us that for p = 1/2, say, χ(G_{n,1/2}) is concentrated in an interval of length
ω(n)n^{1/2}, it gives no information about the location of this interval.
All we know, from Theorem 11.12, is that this interval is contained in
the interval

[(1 + o(1)) n/(2 log₂ n), (1 + o(1)) n/log₂ n].
Following the idea in Bollobás (1988c), what we can prove very easily is
that a variant of Y_r is highly concentrated. As customary, we formulate
the result in terms of complete graphs rather than independent sets. As
before, write Y_r = Y_r(G_{n,p}) for the number of complete r-graphs in G_{n,p},
and E_r = E_p(n, r) for the expectation E(Y_r(G_{n,p})).
Theorem 11.17 Let 0 < p < 1 be fixed, and let r_1 = r_1(n) ≥ 3 and
σ = σ(n) ≥ 1/2 be such that E_p(n, r_1) = n^σ = o(n²/(log n)²). Then for
0 < c ≤ 1 we have
op2n = 0 (in),
for r − r_0 = O(1) we have E_p(n, r + 1)/E_p(n, r) = n^{−1+o(1)},
so for n large enough r_1 and σ can be chosen as
prescribed. Furthermore, for r' ≥ r_0 + 1 we have E_p(n, r') = o(1), and if
3 ≤ r'' ≤ r_0 − 3 then E_p(n, r'')/n² → ∞. Hence r_0 − 3 < r_1 < r_0 + 1 and
so, in particular,

p^{−r_1} = n^{2+o(1)}.
Let U = U(G_{n,p}) be the maximal number of edge-disjoint complete
r_1-graphs K^{r_1} in G_{n,p}. By definition, U ≤ Y_{r_1}, so it suffices to bound the
probability that

U ≤ (1 − c)n^σ.
(n
E(U') Ž (1 + o(1))
Theorem 11.18 Let 0 < p < 1 be fixed, and set q = 1 − p and d = 1/q.
Then a.e. G_{n,p} is such that

(n/(2 log_d n)){1 − (log log n)/log n} ≤ χ(G_{n,p}) ≤ (n/(2 log_d n)){1 + 3(log log n)/log n}.

In particular, χ(G_{n,p}) = (1 + o(1)) n/(2 log_d n) for a.e. G_{n,p}.
Proof The lower bound is in Theorem 11.12, so our task is to prove the
upper bound.
Set s_1 = ⌈2 log_d n − 5 log_d log n⌉, and let n_1 be the smallest natural
number such that

E_q(n_1, s_1) ≥ 2n^{5/3}.

It is easily seen that if n is large enough,

n/(log n)³ < n_1 < n/{(log n)(log log n)}

and

2n^{5/3} ≤ E_q(n_1, s_1) ≤ 3n^{5/3}.

Hence, by Theorem 11.17,

P(cl(G_{n_1,q}) < s_1) ≤ e^{−n^{4/3}}.

This implies that a.e. G_{n,p} is such that every set of n_1 vertices contains s_1
independent vertices, and so

χ(G_{n,p}) ≤ (n − n_1)/s_1 + n_1 ≤ (n/(2 log_d n)){1 + 3(log_d log n)/log_d n},

as claimed. □
Matula (1987) used a rather different approach to the study of the chro-
matic number of a random graph; with his 'expose-and-merge' algorithm,
Matula and Kučera (1990) gave an alternative proof of Theorem 11.18.
As an immediate consequence of Theorem 11.18, we see that a.e. G_{n,p}
11.5 Sparse Graphs 303
is such that every subgraph of chromatic number χ(G_{n,p}) has (1 + o(1))n
vertices.
By refining the above proof of Theorem 11.18, McDiarmid (1990)
proved that χ(G_{n,p}) is concentrated in an even smaller interval than
claimed there.
Theorem 11.19 Let 0 < p < 1 be fixed, and set q = 1 − p, d = 1/q and
s = s(n) = 2 log_d n − 2 log_d log_d n + 2 log_d(e/2). Then for a.e. G_p we have

n/s ≤ χ(G_p) ≤ n/(s − 1 + o(1)).
Although for close to thirty years the aim had been to prove that
χ(G_{n,p}) is highly concentrated, after the results above the obvious task is
to show that χ(G_{n,p}) is not concentrated in a very short interval. It would
not be too surprising if, for a fixed p, the shortest interval containing
the chromatic number of a.e. G_p were of order about n^{1/2}, but at the
moment we cannot even exclude the unlikely possibility that for some
l = l(p) and c = c(n, p), the chromatic number of a.e. G_p is in the interval
[c, c + l].
Theorem 11.20 For 0 < α < 1 we have f(α) = f_1(α) + f_2(α), where

f_1(α) = e^{−α} + Σ_{k=2}^{∞} Σ_{l=1}^{k−1} (k^l l^{k−1}/(k! l!)) (αe^{−α})^{k+l} + Σ_{k=1}^{∞} (k^{2k−1}/(2(k!)²)) (αe^{−α})^{2k}
and

f_2(α) = Σ_{k=1}^{∞} Σ_{l=0}^{k} Σ_{m=k}^{2k−1} ((m − k)/(k! l!)) (αe^{−α})^{k+l} t_{k,l,m},

where t_{k,l,m} is the number of trees on {1, 2, …, k + l} with bipartition (k, l)
and independence number m.
components isomorphic to T, where a(T) is the order of the automor-
phism group of T and |T| is the order of T.
Denote by 𝒯_s the set of trees with vertex set {1, 2, …, s} and by 𝒯_{k,l}
the set of trees with vertex set {1, 2, …, k + l} in which every edge is of
the form ij with i ≤ k and j > k. Thus
Then, by (11.10),

f(α) = Σ_{T} (β_0(T)/a(T)) α^{|T|−1} e^{−α|T|}   (11.11)

and so

f(α) = (1/α) Σ_{s=1}^{∞} Σ_{T∈𝒯_s} (β_0(T)/s!) (αe^{−α})^s.   (11.12)
Since for s = k + l, k > l, the set 𝒯_s contains \binom{s}{k} k^{l−1} l^{k−1} trees isomorphic
to some tree in 𝒯_{k,l}, and for s = 2k the set 𝒯_s contains ½\binom{2k}{k} k^{2(k−1)} trees
isomorphic to some tree in 𝒯_{k,k}, the right-hand side of (11.12) is f_1(α) and
the difference of the two sides is precisely f_2(α). □
Although the series giving f_1(α) does not converge fast, it is fairly
easy to calculate concrete values of f_1(α). The series of f_2(α) is not too
pleasant, but it does converge fast.
For α > 1 the lower bounds on f(α) are rather weak. In fact, for some
values of α we can hardly do better than apply the following theorem
of Shearer (1983) concerning the independence number of (non-random)
triangle-free graphs. The theorem will be proved, in a slightly different
form, in the next chapter (Theorem 12.13); as we shall see there, the
proof is based on the greedy algorithm for constructing independent
sets.
Corollary 11.22 f(α) ≥ (α log α − α + 1)/(α − 1)² for all α > 1.
a.e. G_{n,α/n} has at most log n triangles. Hence a.e. G_{n,α/n} contains a triangle-
free subgraph G_0 spanned by at least n − log n vertices. Furthermore, a.e.
G_{n,α/n} is such that G_0 has average degree at most α_1. Consequently

β_0(G_{n,α/n}) ≥ β_0(G_0) ≥ ((α_1 log α_1 − α_1 + 1)/(α_1 − 1)²)(n − log n).
(ii) If α(n) = o(n/log n), then for every fixed positive ε

β_0(G_{n,p}) ≥ n(log α − ε)/α.
Lemma 11.24 Suppose 1 ≤ α = α(n) ≤ n and b = b(n) > 1 are such that

Proof We may suppose that k = n/b is an integer. Set p = α/n and write
E_k for the expected number of independent k-sets in G_p. It suffices to
show that lim_{n→∞} E_k = 0. Clearly, by inequality (1.5),
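The first-moment computation in this proof is conveniently done in logarithms; the helper below (name ours) evaluates log E_k and illustrates the regime k ≈ (2 log α/α)n in which E_k → 0:

```python
import math

def log_expected_indep_sets(n, k, p):
    """log of E_k = binom(n, k) (1 - p)^{binom(k, 2)}, the expected
    number of independent k-sets in G_p."""
    log_binom = (math.lgamma(n + 1) - math.lgamma(k + 1)
                 - math.lgamma(n - k + 1))
    return log_binom + math.comb(k, 2) * math.log1p(-p)

n, alpha = 10_000, 10.0
k = math.ceil(2 * math.log(alpha) / alpha * n)   # k ~ (2 log a / a) n
print(log_expected_indep_sets(n, k, alpha / n))
```

For these parameters log E_k is a large negative number (of order −n(log α)²/α), so by Markov's inequality a.e. G_p has no independent set of k vertices.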
For large values of α the upper bound on f(α), implied by Lemma 11.24,
is about 2 log α/α, as shown by Bollobás and Thomason (1985), Gazmuri
(1984), McDiarmid (1984) and de la Vega (1984).

Theorem 11.25 (i) Suppose α > 1 and b > 1 are constants and

2b² log b − 2b(b − 1) log(b − 1) ≤ α.   (11.14)

Then

f(α) ≤ 1/b.

In particular,

f(α) ≤ 2 log α/α.

(ii) A.e. G_{n,p} satisfies

β_0(G_{n,p}) < 2 log_d n.
For 2.27 ≤ α ≤ e^{e/2} + 0.1 the assertion follows from Lemma 11.24 by
a computer check.
(iii) With k = ⌈2 log_d n⌉ and E_k as in Lemma 11.24, we have
Theorem 11.26 Let p = α(n)/n and suppose α(n) ≤ k(n) log k(n) ≤ n^{1/2}.
Then a.e. G_{n,p} is such that

χ_g(G_{n,p}) ≤ k + 3.

Proof If α < 1 is a constant and p = α/n, then by Corollary 5.8 a.e. G_{n,p}
is a union of components each of which is a tree or a unicyclic graph.
Therefore in this case χ_g(G_{n,p}) ≤ 3.
Suppose next that α > 1 is a constant and k = k(n) ≤ 15. Then by the
second part of Theorem 11.23 a.e. G_{n,p} is such that the first colour class
found by the greedy algorithm has at least n_1 = ⌈[{log(α + 1) − ε}/α]n⌉
vertices. In the subgraph spanned by the remaining at most n_2 = n − n_1
vertices the probability of an edge is

α/n = (α n_2/n)/n_2 ≤ {α − log(α + 1) + ε}/n_2 = α_1/n_2, say.
Hence if a.e. G_{n,α_1/n} satisfies

χ_g(G_{n,α_1/n}) ≤ s,

then a.e. G_{n,α/n} satisfies

χ_g(G_{n,α/n}) ≤ s + 1;

in particular, if

χ_g(G_{n,α_1/n}) ≤ k + 3

for a.e. G_{n,α_1/n}, then

χ_g(G_{n,α_k/n}) ≤ k + 3
for a.e. G_{n,α/n}. If α_l = l log l ≤ n^{1/2} for 15 ≤ l ≤ k ≤ n^{1/2}/log log n then
these conditions are satisfied since

inf_{16≤l≤k} [α_{l−1} − α_l + log α_l]
  = inf_{16≤l≤k} {(l − 1) log(l − 1) − l log l + log l + log log l}
  = 15 log(15/16) + log log 16 > 0.

This completes the proof of our theorem. □
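The closing numerical claim — that the infimum over 16 ≤ l ≤ k is attained at l = 16 and equals 15 log(15/16) + log log 16 > 0 — can be checked directly:

```python
import math

def phi(l):
    """(l-1)log(l-1) - l log l + log l + log log l; the proof claims its
    infimum over l >= 16 is phi(16) = 15 log(15/16) + log log 16 > 0."""
    return ((l - 1) * math.log(l - 1) - l * math.log(l)
            + math.log(l) + math.log(math.log(l)))

assert min(phi(l) for l in range(16, 10_000)) == phi(16)
print(phi(16))
```

Indeed phi(16) ≈ 0.052 > 0, and phi is increasing from l = 16 on, since the term log log l grows while (l − 1) log((l − 1)/l) barely decreases towards −1.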
If α is not too small, then the greedy algorithm works rather well, even
without a good colouring of the sparse subgraph at the end.

Theorem 11.27 Suppose α(n) ≤ n, α(n)/log n → ∞ and p = α(n)/n. Set
ε = (3 log log n)/log α. Then a.e. G_{n,p} is such that

χ_g(G_{n,p}) ≤ (1 + ε){log(1/q)/log α} n,

where q = 1 − p.
Proof What is the probability that the simple greedy algorithm uses the
(k + 1)st colour to colour vertex j + 1? It is at most

Hence

P{χ_g(G_{n,p}) > k} ≤ Σ_{j=0}^{n−1} (1 − q^{j/k})^k
  < n exp(−α^{−1/(1+ε)} k)
  = exp{log n + (1 + ε) α^{−1/(1+ε)} n log(1 − α/n)/log α} = o(1).
Once again, it is easily seen that the upper bound is essentially best
possible: for p = α/n a.e. G_p satisfies χ_g(G_{n,p}) ~ {log(1/q)/log α} n.
Theorems 11.26 and 11.27 contain most of the early results about
colouring sparse random graphs, including several results of Kalnins
(1970), Erdős and Spencer (1974, pp. 94–96), McDiarmid (1979a), de la
Vega (1985) and Bollobás and Thomason (1985).
These results are reminiscent of Theorem 11.12: although they give
good bounds on greedy colourings, they do not come close to determin-
ing the asymptotic chromatic number of almost every G_{n,α/n}. After the
appearance of Theorem 11.18, martingale methods were brought in to
tackle colouring problems, and all this changed rapidly.
Frieze (1990) proved the following sharp result about the independence
number.

Theorem 11.28 For every ε > 0 there is an α_ε > 0 such that if α_ε ≤ α =
α(n) = o(n) then a.e. G_{n,α/n} is such that

|β_0(G_{n,α/n}) − (2/α)(log α − log log α − log 2 + 1) n| ≤ εn/α.
Theorem 11.30 For all positive ε and σ there is an integer n_0 = n_0(ε, σ) such
that if 0 < p = p(n) ≤ n^{−1/2−σ} then there is an integer k = k(n, p) such
that

and

d''_k = inf{c : P(χ(G_{n,c/n}) ≤ k) → 0}.
d_k for every k ≥ 3, i.e., there is a constant d_k > 0 such that for every ε,
0 < ε < d_k, we have

P(χ(G_{n,(d_k−ε)/n}) ≤ k) → 1

and

P(χ(G_{n,(d_k+ε)/n}) ≤ k) → 0.
Building on the deep results of Friedgut (1999) on necessary and suf-
ficient conditions for sharp threshold functions, Achlioptas and Friedgut
(1999) proved that there is a sharp transition from being k-colourable to
having no k-colourings. To state this result, for n > k ≥ 3 and 0 < ε < 1,
define d_k(n, ε) by

P(χ(G_{n,d_k(n,ε)/n}) = k) = ε.

Then, for fixed k and ε, d_k(n, ε) − d_k(n, 1 − ε) → 0 as n → ∞. This result
of Achlioptas and Friedgut corresponds to Theorem 11.16, guaranteeing
concentration in some interval, but what is missing is the analogue of
Theorem 11.18, telling us that the short threshold interval does not move
around much as n → ∞. It is likely that a major new idea is needed to
prove that d'_k = d''_k.
The sequences (d'_k) and (d''_k) are related to the sequence (c_k) associated
with the appearance of a k-core discussed in Section 6.4. Recall that the
k-core of a graph is the maximal subgraph of minimal degree at least k,
and we may define c_k as follows:
Also, as stated in Theorem 6.20, Pittel, Spencer and Wormald (1996) gave
an exact expression for c_k in terms of Poisson branching processes, and
deduced that c_k = k + √(k log k) + O(log k). In particular, c_3 = 3.35… < d'_3.
Hence, by (11.16), we have d'_4 ≥ 5.16 > c_4 = 5.14…, and repeated
application of (11.16) gives that d'_k > c_k for 4 ≤ k ≤ 40. As noted by
Molloy (1996), Theorem 6.20 of Pittel, Spencer and Wormald implies
that c_k ≤ 2.3k. On the other hand, (11.16) implies that d'_k ≥ k log k > 3k
for k ≥ 40, and so d'_k > c_k for k ≥ 4, as claimed.
Let us say a few words about d'_k and d''_k for small values of k. The
first lower bounds on d'_3, namely d'_3 ≥ 2.88, due to Chvátal (1991), and
d'_3 ≥ 3.34, due to Molloy and Reed (1995), were really lower bounds on c_3.
Achlioptas and Molloy (1997) were the first to go considerably beyond the
bound given by the 3-core: they proved that d'_3 ≥ 3.846. Achlioptas and
Molloy (1999) made use of a technique of Kirousis, Kranakis, Krizanc
and Stamatiou (1998) to give upper bounds for d''_k. Among other results,
they showed that d''_3 ≤ 5.044, d''_4 ≤ 9.174, d''_5 ≤ 13.896, d''_6 ≤ 19.078 and
d''_7 ≤ 24.632.
Although in a random graph process a k-core appears well before a
(k + 1)-chromatic subgraph, as shown by Molloy and Reed (1999), shortly
after the emergence of a k-core a.e. random graph contains a subgraph
of the type Gallai showed to appear in (k + 1)-critical graphs. Thus to
get closer to proving the existence of dk and determining its value, we
shall have to go beyond Gallai's condition.
To conclude this section, we shall consider the independence ratio and
chromatic number of random r-regular graphs. We start with a brief review of
some results on the independence number of (non-random) graphs. The
independence ratio i(G) of a graph G of order n is β_0(G)/n. Let i(Δ, g) be
the infimum of the independence ratio of graphs with maximum degree
Δ and girth at least g. For g = 3 we have no restriction on the girth, so
put i(Δ) = i(Δ, 3). Clearly i(Δ) = i(Δ, 3) ≤ i(Δ, 4) ≤ ⋯; furthermore, set
i*(Δ) = lim_{g→∞} i(Δ, g).
Turán's classical theorem [see Bollobás (1978a, Chapter 6) for numer-
ous extensions] implies that i(G) ≥ 1/(d + 1), where d = d(G) is the
average degree of G. The inequality was improved by Wei (1980, 1981)
to the following: if G has degree sequence d_1, …, d_n, then

β_0(G) ≥ Σ_{i=1}^{n} 1/(d_i + 1).   (11.17)
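Inequality (11.17) has a simple algorithmic witness: repeatedly pick a vertex of minimum degree and delete its closed neighbourhood. Both the bound and the witness are sketched below (names ours):

```python
def wei_bound(degrees):
    """The right-hand side of (11.17): sum of 1/(d_i + 1)."""
    return sum(1.0 / (d + 1) for d in degrees)

def greedy_independent(adj):
    """Greedy witness for (11.17): repeatedly take a minimum-degree
    vertex and delete its closed neighbourhood."""
    live = set(adj)
    ind = []
    while live:
        v = min(live, key=lambda u: len(adj[u] & live))
        ind.append(v)
        live -= adj[v] | {v}
    return ind

# a 5-cycle: all degrees 2, bound 5/3, true independence number 2
adj = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
assert len(greedy_independent(adj)) >= wei_bound([2] * 5)
```

For regular graphs the bound reduces to Turán's n/(d + 1); the degree-sequence form is strictly better whenever the degrees are unequal.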
It turns out that inequalities (11.18)–(11.21) are useful only for small
values of Δ since Ajtai, Komlós and Szemerédi (1980, 1981b) achieved a
breakthrough for large values of Δ:

i(Δ, 4) ≥ c log Δ/Δ   (11.23)

for some constant c > 0. This result was sharpened by Griggs (1983b)
and Shearer (1982), the latter improving (11.23) to

i(Δ, 4) ≥ (Δ log Δ − Δ + 1)/(Δ − 1)²,   (11.24)

which is a consequence of a theorem to be proved in the next
chapter (Theorem 12.14).
11.5 Sparse Graphs 315
Let us turn, at last, to the independence ratio of random regular
graphs. We know from Corollary 2.19 that for every r ≥ 3, a.e. G_{r-reg} is
essentially triangle-free: it has, say, fewer than ω(n) triangles, provided
ω(n) → ∞. Hence deleting an edge from each triangle will hardly increase
the independence number, so β_0(G_{r-reg}) ≥ {n − ω(n)} i(r, 4). Even more,
in this inequality i(r, 4) can be replaced by i(r, g) for any fixed g since,
again by Corollary 2.18, a.e. G_{r-reg} contains at most ω(n) cycles of length
less than g. Hence inequalities (11.22) and (11.24) imply lower bounds
for the independence ratio of a.e. G_{r-reg}.
By making use of these ideas and properties of the model 𝒢(n, r-reg)
described in Chapter 2, Bollobás (1981c) proved that

(r log r − r + 1)/(r − 1)² ≤ i(G_{r-reg}) ≤ 2 log r/r   (11.25)

if r ≥ 4 and

7/18 + o(1) ≤ i(G_{3-reg}) ≤ 6/13.   (11.26)
For r large enough, inequality (11.25) was improved to an essentially
best possible result by Frieze and Łuczak (1992); the proof is based
on a detailed and ingenious analysis of an algorithm that constructs an
independent set.
Theorem 11.32 For ε > 0 and 0 < θ < 1/3, there is an r_0 such that if
r_0 ≤ r = r(n) ≤ n^θ then the independence ratio i(G_{r-reg}) of a.e. r-regular
graph satisfies

|i(G_{r-reg}) − (2/r)(log r − log log r + 1 − log 2)| ≤ ε/r. □
Frieze and Łuczak (1992) also determined the chromatic number of
a.e. r-regular graph; not surprisingly, χ(G_{r-reg}) is about 1/i(G_{r-reg}) for
a.e. G_{r-reg} if r is in an appropriate range.
Theorem 11.33 If r = r(n) is sufficiently large and not more than n^θ for
some fixed θ < 1/3 then the chromatic number χ(G_{r-reg}) of a.e. r-regular
graph satisfies

r/(2 log r) − 8r(log log r)/(log r)² ≤ χ(G_{r-reg}) ≤ r/(2 log r) + 8r(log log r)/(log r)². □
For fixed r ≥ 3 and g ≥ 4, the probability that a random r-regular
graph has girth at least g is bounded away from 0. Hence, Theorem 11.33
implies that if r ≥ 3, g ≥ 4, n is large enough and rn is even then there is an
Exercises
11.1 Show that for every ε > 0 there are 0 < q(n) ≤ (4 + ε)(log n)n^{−1/2},
p(n) = 1 − q(n) and r(n) ∈ ℕ such that p and r satisfy the condi-
tions of Theorem 11.1 and so a.e. G_p has clique number r.
11.2 Prove that if H is any graph of order n, then

k_r(H) ≤ \binom{n}{r},

so
320 Ramsey Theory
k-colouring of a sufficiently large system contains a monochromatic
subsystem of a given size; furthermore, for k = 2 one almost invariably
considers red–blue colourings. However, as our main aim in this chapter
is to study 2-colourings of complete graphs, and a red–blue colouring of
K_n is naturally identified with the subgraph G formed by the red edges,
say, so that the blue subgraph is the complement of G in K_n, more
often than not we shall study graphs and their complements rather than
red–blue colourings of complete graphs.
and
R(s, t) (-- +t
s- 12 )'
(12.2)
Proof (i) In the proof of (12.1) we may assume that R(s-1,t) and
R(s,t-1) are finite. Let G be a graph of order R(s-1,t) + R(s,t-1)
and let H be its complement. Pick a vertex x. Since

\[ d_G(x) + d_H(x) = R(s-1,t) + R(s,t-1) - 1, \]

by symmetry we may assume that d_G(x) \ge R(s-1,t). Let W = \Gamma_G(x),
G_1 = G[W] and H_1 = H[W]. As |W| \ge R(s-1,t), either G_1 contains a
complete graph K_1 of order s-1 or else H_1 contains a K_t. In the first
case K_1 and x form a K_s in G and in the second case H contains a K_t.
Thus (12.1) holds.
(ii) If s = 2 and t \ge 2, then R(s,t) = t, so equality holds in (12.2);
similarly we have equality in (12.2) for s \ge 2 and t = 2. Assume now that
s \ge 3, t \ge 3 and (12.2) holds for all (s',t') with 2 \le s' \le s, 2 \le t' \le t and
s' + t' < s + t. Then by (12.1) we have

\[ R(s,t) \le R(s-1,t) + R(s,t-1) \le \binom{s+t-3}{s-2} + \binom{s+t-3}{s-1} = \binom{s+t-2}{s-1}. \]
Much effort has been spent on determining exact values of the function
R(s,t) for 3 \le s \le t, with distressingly little success: R(3,3) = 6, R(3,4) =
9, R(3,5) = 14, R(4,4) = 18, R(3,6) = 18 and R(3,7) = 23. The first four
values were determined by Greenwood and Gleason (1955), the last two
are due to Kalbfleisch (1967) and Graver and Yackel (1968); for a survey
of Ramsey numbers see Chung and Grinstead (1983).
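The smallest of these values can be verified directly by exhaustive search. The sketch below (our own code, not the book's) confirms R(3,3) = 6 by exhibiting a C_5-based colouring of K_5 with no monochromatic triangle and then checking all 2^{15} colourings of K_6.

```python
from itertools import combinations

def has_mono_triangle(n, red_edges):
    """Does the red/blue colouring of K_n with the given red edge set
    contain a monochromatic triangle?"""
    red = set(red_edges)
    for tri in combinations(range(n), 3):
        colours = {(tuple(sorted(p)) in red) for p in combinations(tri, 2)}
        if len(colours) == 1:  # all three edges red, or all three blue
            return True
    return False

# R(3,3) > 5: colour a 5-cycle red in K_5; both colour classes are C_5's.
c5 = [tuple(sorted((i, (i + 1) % 5))) for i in range(5)]
print(has_mono_triangle(5, c5))  # False

# R(3,3) <= 6: every one of the 2^15 colourings of K_6 is "bad".
edges6 = list(combinations(range(6), 2))
all_bad = all(has_mono_triangle(6, {e for e, b in zip(edges6, bin(m)[2:].zfill(15)) if b == '1'})
              for m in range(2 ** 15))
print(all_bad)  # True
```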
It would be even more interesting to determine the correct order of
R(s). By Theorem 12.1,
\[ R(s) \le \binom{2s-2}{s-1} < s^{-1/2}\,4^{s-1}. \]
Erdős (1947) was the first to give an exponential lower bound for
R(s). Although, by now, the result looks very simple indeed, the proof
was the first exciting application of random graphs. We know from
Corollary 11.2 that a.e. G_{n,1/2} has clique number about 2\log_2 n. Hence
for s slightly larger than 2\log_2 n a.e. G_{n,1/2} is such that neither G_{n,1/2}
nor its complement contains a K_s. In other words, a.e. G_{n,1/2} shows that
R(s) \ge n + 1, giving a lower bound only slightly greater than 2^{s/2}. In
fact, there is no need to require \mathrm{cl}(G_{n,1/2}) \le s - 1 for a.e. G_{n,1/2}: clearly
P\{\mathrm{cl}(G_{n,1/2}) \le s - 1\} > 0 will suffice.

\[ \binom{n}{s} < 2^{\binom{s}{2}-1}. \]

Then R(s) \ge n + 1.
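This bound is easy to exercise numerically. The sketch below (our own helper, using `math.comb`) finds, for a given s, the largest n satisfying the hypothesis, so that R(s) \ge n + 1.

```python
from math import comb

def erdos_lower_bound_n(s):
    """Largest n with C(n, s) < 2^(C(s,2) - 1); the theorem above then
    gives R(s) >= n + 1."""
    threshold = 2 ** (comb(s, 2) - 1)
    n = s
    while comb(n + 1, s) < threshold:
        n += 1
    return n

# For s = 10 the condition holds up to n = 100, so R(10) >= 101,
# in line with the roughly 2^(s/2) growth noted above.
print(erdos_lower_bound_n(10))
```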
Proof Consider the space 𝒢(n, 1/2). Denote by Y_s = Y_s(G_{1/2}) the number
of K_s subgraphs of G_{1/2} and by \bar Y_s the number of K_s subgraphs in the
complement. Then we have

\[ E(Y_s) = \binom{n}{s}2^{-\binom{s}{2}} < \frac{1}{2} \tag{12.3} \]

and

\[ P(Y_s \ge 1) + P(\bar Y_s \ge 1) \le E(Y_s) + E(\bar Y_s) = 2E(Y_s) < 1. \tag{12.4} \]

Both inequalities are crude, but which is the cruder in the context above?
We know from Theorem 11.7 that if E(Y_s) tends to a constant \mu, then
P(Y_s \ge 1) is not only bounded by \mu + o(1), as asserted by (12.4), but it
tends to 1 - e^{-\mu}. All this means that it suffices to make sure that E(Y_s) is
slightly smaller than \log 2, rather than 1/2. Unfortunately, this only allows
us to increase the bound n + 1 in Theorem 12.3 to about n + cn/\log n, a
gain hardly worth bothering about.
12.1 Bounds on R(s)

What can we gain by improving inequality (12.3)? As discovered by
Spencer (1975), we can gain a factor 2 by applying the Lovász Local
Lemma (Theorem 1.18) to a family of almost independent events.
Suppose that

\[ e\Bigl\{\sum_{k=2}^{s-1}\binom{s}{k}\binom{n-s}{s-k} + 1\Bigr\}2^{1-\binom{s}{2}} \le 1. \]

Then R(s) > n.

Proof Consider the probability space 𝒢(n, 1/2). For \sigma \in V^{(s)}, i.e. for an
s-subset \sigma of the vertex set V, let A_\sigma be the event that the graph spanned
by \sigma is either complete or empty. Furthermore, let F be a graph on V^{(s)}
in which two vertices \sigma_1, \sigma_2 \in V^{(s)} are joined by an edge iff |\sigma_1 \cap \sigma_2| \ge 2.
Then F is a dependence graph of the system of events \{A_\sigma : \sigma \in V^{(s)}\},

\[ \Delta = \Delta(F) = \sum_{k=2}^{s-1}\binom{s}{k}\binom{n-s}{s-k} \]

and

\[ P(A_\sigma) = 2^{-\binom{s}{2}+1} \le (\Delta-1)^{\Delta-1}/\Delta^{\Delta}, \]

so that, by the Lovász Local Lemma,

\[ P\Bigl(\bigcap_\sigma \bar A_\sigma\Bigr) > 0. \]
It is a little sad that the use of the ingenious Lovász Local Lemma
has gained us only a factor 2 over the bound resulting from the most
straightforward method.
12.2 Off-Diagonal Ramsey Numbers
Let 3 \le s \le t \le n, 0 < p < 1, and consider the space 𝒢(n,p). As
before, write Y_s = Y_s(G_p) for the number of K_s-subgraphs of G_p and
\bar Y_t = \bar Y_t(G_p) for the number of K_t-subgraphs in the complement of G_p.
Then R(s,t) \ge n + 1 iff Y_s(G_p) = \bar Y_t(G_p) = 0 for some G_p \in 𝒢(n,p), i.e.

\[ R(s,t) \ge n + 1 \quad\text{iff}\quad P\{(Y_s + \bar Y_t)(G_p) = 0\} > 0. \tag{12.5} \]
for all t > s. In §3 we shall return to upper bounds and we shall improve
on (12.7).
As shown by Spencer (1975), the straightforward method used by
Erdős (1947) in Theorem 12.3 gives a reasonably good lower bound for
every R(s,t): all we have to do is apply (12.5) for suitable values of n
and p.

Theorem 12.7 Suppose 3 \le s \le t, 0 < p < 1 and

Then

\[ R(s,t) \ge n + 1. \]
provided one can find 0 < \gamma_A < 1 and 0 < \gamma_B < 1 such that

\[ p_A \le \gamma_A(1-\gamma_A)^{d_A}(1-\gamma_B)^{d_{AB}} \tag{12.12} \]

and

\[ p_B \le \gamma_B(1-\gamma_A)^{d_{BA}}(1-\gamma_B)^{d_B}. \tag{12.13} \]

Now if, say, p_A > 2^{-d_A} and p_B > 2^{-d_B}, then (12.12) and (12.13) imply
that 0 < \gamma_A < 1 and 0 < \gamma_B < 1. In that case, by inequality (12.10) of
Chapter 1, inequality (12.11) follows if instead of (12.12) and (12.13) we
require that

\[ p_A \le \gamma_A\exp\{-(1+\gamma_A)\gamma_A d_A - (1+\gamma_B)\gamma_B d_{AB}\} \tag{12.12'} \]

and

\[ p_B \le \gamma_B\exp\{-(1+\gamma_A)\gamma_A d_{BA} - (1+\gamma_B)\gamma_B d_B\}. \tag{12.13'} \]

Substituting \delta_A = \gamma_A/p_A and \delta_B = \gamma_B/p_B, as in Ex. 19 of Chapter 1, we
find the following consequence of Theorem 1.18.

Theorem 12.10 If there are \delta_A, \delta_B such that 1 < \delta_A < 1/(2p_A), 1 < \delta_B <
1/(2p_B),

\[ \log\delta_A - (1+\delta_Ap_A)(d_Ap_A)\delta_A - (1+\delta_Bp_B)(d_{AB}p_B)\delta_B \ge 0 \tag{12.14} \]

and

then

\[ P\Bigl\{\bigcap_\sigma \bar A_\sigma \cap \bigcap_\tau \bar B_\tau\Bigr\} > 0. \]
The following theorem is a slight sharpening of some results of Spencer
(1977), giving the promised improvement on (12.10).
where

\[ c_s = \Bigl\{\Bigl(\frac{(s-2)(s+1)}{2}\Bigr)^{(s+1)/2}\frac{2}{(s-2)!}\Bigr\}^{1/(s-2)} \]

and

\[ a > 2b + s - 1. \tag{12.18} \]

Such a choice is possible since if we take \varepsilon = 0, c = c_s and replace the
inequalities in (12.17) and (12.18) by equalities, then, as one can check,
the solutions in a and b are positive.
Set n = \lfloor c(t/\log t)^{(s+1)/2}\rfloor, p = a(\log t)/t and consider the probability
space 𝒢(n,p). Let U = V^{(s)}, W = V^{(t)} and let A_\sigma, B_\tau (\sigma \in U, \tau \in W)
be the events as in (12.6). Let F be the graph on U \cup W in which
two vertices are joined by an edge iff the corresponding subsets of V
have at least two vertices in common. Then F is precisely a dependence
graph of \{A_\sigma : \sigma \in U\} \cup \{B_\tau : \tau \in W\} and, with the notation used in
Theorem 12.10,

\[ p_A = p^{\binom{s}{2}} = \{a(\log t)/t\}^{\binom{s}{2}}, \]
\[ d_A \le \binom{s}{2}\binom{n}{s-2} < 3n^{s-2} \le 3c^{s-2}(t/\log t)^{(s-2)(s+1)/2}, \]

\[ d_{BA} \le \binom{t}{2}\binom{n}{s-2} < t^2(t/\log t)^{(s-2)(s+1)/2}, \]

and

\[ d_{AB}p_B\delta_B \le \binom{n}{t}\,t^{-a(t-1)/2}\,t^{bt} \le (ec)^t\,t^{(s-1)t/2 - a(t-1)/2 + bt}\,(\log t)^{-t(s+1)/2} = o(1), \]

since, by (12.18), the exponent of t is negative if t is sufficiently large.
Hence (12.14) does hold.
To complete the proof we have to check (12.15). Now, by the last
inequality, d_Bp_B\delta_B = o(1). Furthermore,

\[ \delta_Ap_Ad_{BA} \le (1+\varepsilon)\{a(\log t)/t\}^{\binom{s}{2}}\,\frac{c^{s-2}t^2}{2(s-2)!}\,(t/\log t)^{(s-2)(s+1)/2} = \frac{(1+\varepsilon)\,a^{\binom{s}{2}}c^{s-2}}{2(s-2)!}\,t\log t < (1+o(1))\log\delta_B. \]
Once again, one is tempted to exclaim: all this ingenuity and effort
just to increase the exponent (s - 1)/2 to (s + 1)/2. However, this time
the complaint would not be justified, partly because we gain a factor
(t/ log t) rather than only 2, and t can be large, and more importantly
because it is of great interest to estimate R(s, t) for a fixed small value
of s and if s is small, say 3 or 4, then the improvement replaces a trivial
result by a substantial one.
Of course, Theorem 12.11 is sharpest for s = 3: the lower bound on
R(3, t) given by Theorem 12.11 is rather close to the upper bound given
by (12.7), which will be improved slightly in §3. This lower bound is
essentially due to Erdős (1961), and the proof was one of the first real,
and fairly complicated, applications of random graphs. Spencer (1977)
improved the lower bound a little (to the one given in Theorem 12.11)
and gave a considerably simpler proof, namely the proof presented above.
To close this section we present a streamlined version of the proof due
to Erdős (1961) that

\[ R(3,t) \ge c(t/\log t)^2. \tag{12.19} \]

Theorem 12.12 A.e. G_p is such that H_p has no set of t independent vertices.
Proof We know from Corollary 3.4 that a.e. G_p has maximal degree
at most \Delta_0 = \lfloor 2pn\rfloor = \lfloor 2act/\log t\rfloor. From now onwards in this proof
most probabilities are taken conditional on \Delta(G_p) \le \Delta_0; for simplicity,
we shall just speak of c-probability and write P^c instead of P. As we shall
estimate c-probabilities only from above, our task is rather easy: crudely,
we bound the c-probability of an event by twice its probability. Let us
pick a set W \subset V of vertices and denote by C_W the event that G_p[W]
has an edge xy such that G_p contains no triangle xyz with z \in V - W.
Our interest in C_W stems from the fact that if G_p \in C_W, then W is not an
independent set in H_p. Indeed, if G_p \in C_W and a minimal set E \subset E(G_p)
contains xy, then H_p + xy \subset G_p contains a triangle xyw. Now G_p \in C_W
implies w \in W, and so H_p contains the edges xw and yw.
Hence, our theorem follows if we prove that a.e. G_p belongs to \bigcap C_W,
where the intersection is taken over all W \in V^{(t)}. Since we have \binom{n}{t} \le
(en/t)^t < t^t/(\log t)^t choices for W (if t is sufficiently large), this, in turn,
follows if we show that

\[ P^c(\bar C_W) \le t^{-t}. \tag{12.21} \]
Why is (12.21) true? Because (i) the c-probability that 'many' pairs of
vertices of W have common neighbours (outside W) is very small and
(ii) the c-probability that in a given large set of pairs of vertices of W
no pair is an edge is also small.
Let us turn to a real proof of (12.21). Set q = p^2/2, d_0 = \lceil pt(1+\varepsilon)\rceil,
n_0 = \lfloor t^{2-\varepsilon}\rfloor, d_i = \lceil e^{2i}(pt)/i\rceil and n_i = \lfloor t/e^{2i}\rfloor, i = 1, 2, \ldots, i_0 = \lfloor(\log t -
\log\log t)/2\rfloor. Write P_i for the c-probability that at least n_i vertices in
V - W are joined to at least d_i vertices in W. Denoting by d_W(z) the
number of neighbours of a vertex z \in V - W in W, we see that the r.vs
d_W(z), z \in V - W, are independent binomial r.vs, each having parameters
p and t, and so mean pt = a\log t. Hence, by Theorem 1.7(i),

\[ P_0 \le 2\binom{n}{n_0}e^{-\varepsilon^2a(\log t)n_0/3} < t^{-2t-1}, \]

\[ P_i \le 2\binom{n}{n_i}e^{-e^{2i}a(\log t)n_i} < t^{2t}e^{-at\log t} < t^{-2t-1}. \]
These inequalities show that, with c-probability at least 1 - t^{-2t}, we have
d_W(z) \ge d_0 for at most n_0 - 1 vertices z \in V - W, d_W(z) \ge d_1 for at most n_1 - 1
vertices, and so on; finally, d_W(z) \ge d_{i_0} for no vertex z \in V - W, since
d_{i_0} > \Delta_0. Therefore, with c-probability at least 1 - t^{-2t}, the number of
pairs of vertices in W sharing a neighbour in V - W is at most

\[ n\binom{d_0}{2} + \sum_{i=1}^{i_0}n_i\binom{d_i}{2} \le \frac{a^2(1+\varepsilon)^3}{2}\,ct^2 + 2t(a\log t)^2\sum_{i=1}^{i_0}\frac{e^{2i}}{i^2} < \frac{ca^2(1+\varepsilon)^4}{2}\,t^2, \tag{12.22} \]

where r = \lfloor(1+\varepsilon)t^2/a\rfloor. What is the c-probability that none of r given
pairs of vertices in W is an edge of G_p? It is at most
\[ f(d) = (d\log d - d + 1)/(d-1)^2, \qquad d \ne 0, 1. \]
12.3 Triangle-Free Graphs
\[ |G'| = n - d(x) - 1 \]

and so

\[ h = \frac{nd - 2d}{n-d-1} - d = \frac{d^2 - d}{n-d-1}, \]

that is,

Letting n \to \infty, we have h \to 0 and therefore, assuming that the derivative
exists,

\[ (d^2 - d)g'(d) \le 1 - (d+1)g(d). \tag{12.27} \]

Assuming, though we should not, that (12.27) holds for all d \ge 1, we
cannot do better than choosing for g(d), d \ge 1, the solution of

\[ (d^2 - d)g'(d) = 1 - (d+1)g(d), \qquad g(1) = \tfrac{1}{2}, \]
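The solution of this initial-value problem is the function f(d) = (d log d − d + 1)/(d−1)² displayed earlier, as one can check directly. A small numerical sketch (our own code, with names of our own choosing) evaluates f and verifies the differential equation:

```python
import math

def f(d):
    """f(d) = (d log d - d + 1)/(d - 1)^2, extended by continuity with
    f(1) = 1/2; it solves (d^2 - d) g'(d) = 1 - (d + 1) g(d)."""
    if abs(d - 1.0) < 1e-9:
        return 0.5
    return (d * math.log(d) - d + 1) / (d - 1) ** 2

def ode_residual(d, h=1e-6):
    """(d^2 - d) f'(d) - {1 - (d + 1) f(d)}, with f' taken numerically."""
    fprime = (f(d + h) - f(d - h)) / (2 * h)
    return (d * d - d) * fprime - (1 - (d + 1) * f(d))

# f decreases from 1/2 and behaves like (log d)/d for large d.
print([round(f(d), 4) for d in (1.0, 2.0, 10.0)])
print([abs(ode_residual(d)) < 1e-6 for d in (2.0, 5.0, 10.0)])
```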
no ~ n
e(G)~~ no/n
Theorem 12.17 There are constants c_3, c_4, \ldots such that for every s \ge 3 we
have c_s \le 2(20)^{s-3} and, if t is sufficiently large,

\[ R(s,t) \le c_s\,t^{s-1}/(\log t)^{s-2}. \]
Theorem 12.18 There is a constant c > 0 such that every 2-colouring of K_n
contains a complete graph whose colouring has discrepancy at least cn^{3/2}.

Theorem 12.19 (i) For 0 < \alpha < 1/2 there is a constant c_1(\alpha) such that
R_\alpha(r) \le c_1(\alpha)r.
(ii) For 1/2 < \alpha < 1 there is a constant c_2(\alpha) > 1 such that R_\alpha(r) \ge
\{c_2(\alpha)\}^r. \square
The first part of this result can be proved by successive deletion of
appropriate vertices (Ex. 8); the second part is proved by considering
𝒢(n, 1/2) for n = \lfloor e^{r(\alpha-1/2)^2/2}\rfloor (see Ex. 9).
What can we say about R_\alpha(r) in the most interesting case, namely
when \alpha = 1/2? Erdős and Pach came rather close to determining the exact
growth of R_{1/2}(r).

Theorem 12.20' For every c \in \mathbb{N} there exist \varepsilon(c) > 0 and r_0(c) \in \mathbb{N} such
that for every r \ge r_0(c) there is a graph G such that |G| \ge cr and if H is
any subgraph of G or \bar G with at least r vertices, then \delta(H) < \{\tfrac12 - \varepsilon(c)\}|H|.
12.5 The Size-Ramsey Number of a Path
The proof of Theorem 12.20' makes use of the model 𝒢\{n, (p_{ij})\}, defined
in §1 of Chapter 2. Let c \in \mathbb{N}, c \ge 2, and let r \in \mathbb{N} be sufficiently large.
Set k = c + 2, l = \lfloor\{1 - (1/3k)\}r\rfloor, n = kl \ge cr, and V = \{1, 2, \ldots, n\}.
Partition V into k equal classes: V = \bigcup_{i=1}^{k}V_i. For 1 \le s < t \le n set

\[ p_{st} = \begin{cases} \tfrac12 - (3k)^{-3}, & \text{if } s \in V_i,\ t \in V_j \text{ and } i \ne j,\\[2pt] \tfrac12 + (3k)^{-1}, & \text{if } s, t \in V_i. \end{cases} \]

Then, as shown by Erdős and Pach, a.e. G \in 𝒢\{n;(p_{ij})\} will suffice for
the graph G in Theorem 12.20'.
Given graphs G_1, G_2 and H, we write H \to (G_1, G_2) for the assertion
that every red-blue colouring of the edges of H contains either a red G_1
or a blue G_2.
Define the Ramsey number of G_1 versus G_2 as r(G_1,G_2) = \min\{n :
K_n \to (G_1, G_2)\}; for simplicity, write r(G) = r(G,G) and call it the
Ramsey number of G. A central problem in generalized Ramsey theory
is the determination, or at least good estimation, of r(G_1,G_2) for various
pairs of graphs G_1, G_2. We shall not go into this problem here, but in §3
of Chapter 13 we shall estimate the Ramsey number of C_4, a four-cycle,
versus K_s.
when n = r(G_1, G_2), since K_n \to (G_1, G_2); furthermore, even more trivially,
r(G_1, G_2) \le 2\hat r(G_1, G_2).
For dense graphs G_1 and G_2 inequality (12.29) is not too crude, but
for sparse graphs (12.29) is very bad indeed. The aim of this section is
to present a striking example of the latter, namely a beautiful theorem
of Beck (1983a) showing that the size-Ramsey number \hat r(P_s) of a path
of length s is surprisingly small: \hat r(P_s) = O(s). [Note that inequality
(12.29) is slightly worse than \hat r(P_s) \le s^2/2.] The proof below is essentially
from Beck (1983a); for a slight variant of the proof, relying on different
lemmas, see Bollobás (1985a).

\[ \hat r(P_s) \le 720s. \]
Proof Let d \in \mathbb{N}, 16 \le 4d \le c < c' < (4.001)d, 0 < a < 0.03, p = c/n,
u = \lfloor an\rfloor and s = 3u - 1. Let us consider the space 𝒢(n,p).
We know that a.e. G_p satisfies

\[ \bar d(G_p) \ge d \quad\text{and}\quad e(G_p) \le c'n/2. \tag{12.31} \]

Our aim is to show that if c, c', d, a and p are chosen suitably, then a.e.
G_p is such that if X \subset Y \subset V, |Y| = 3|X| = 3t \le 3u, then

\[ e(X,Y) < dt. \tag{12.32} \]

Because of Lemma 12.21, relations (12.31) and (12.32) will imply that a.e.
G_p is such that if G' \subset G_p and e(G') \ge \tfrac12 e(G_p), then G' contains a path of
length s, and so \hat r(P_s) \le e(G_p) \le c'n/2 if s is sufficiently large.
Denote by Z_t = Z_t(G_p) the number of pairs (X, Y) violating condition
(12.32). Then our aim is to prove that

\[ P\Bigl(\sum_{t=1}^{u}Z_t \ge 1\Bigr) = o(1). \tag{12.33} \]

One can check that

\[ E(Z_t) < ct^{3/2}(t/n)^{(d-3)t}\,2^{-2t}e^{(d+3)t}(10.01)^{dt}e^{-5t^2c/2n}, \tag{12.34} \]
where c_0 = c_0(a, c, d), since 1 - x \le e^{-x} for 0 \le x \le 1. Hence

\[ \sum_{t=1}^{u_0}E(Z_t) = o(1), \]

where u_0 = \lfloor a_0n\rfloor, a_0 = a_0(d) > 0. Also, if u_0 \le t = \beta n \le u, then (12.34)
gives that

\[ \log E(Z_t) \le t\bigl\{(d-3)\log\beta - 2\log 2 + \bigl(3 - \tfrac13\bigr)\log(1-3\beta) + d\log(10.01e) - 10d\beta\bigr\} = tf(d,\beta). \]
Exercises
12.1 Prove that if G has n vertices and average degree d, then \beta_0(G) \ge
n/(d+1). [This is more or less Turán's classical theorem; see
Bollobás (1978a, Chapter 6) for numerous related results.]
12.2 Consider the following two random greedy algorithms for con-
structing independent sets in a graph G.
(a) Take a random linear order on V(G) and run the greedy
algorithm.
(b) Pick a vertex at random, put it in your independent set and
delete the vertex and all its neighbours; pick another vertex
at random, put it in your independent set and delete the
vertex and all its neighbours; and so on.
Show that (a) and (b) define the same probability measure on
the set of independent sets of G; in particular, they give the same
expected order \hat\beta_0(G) of the independent set constructed.
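For a small graph the two random greedy algorithms can be compared exactly. The sketch below (our own code) computes both distributions with exact rational arithmetic and checks them on the path 0-1-2.

```python
import itertools, math
from collections import defaultdict
from fractions import Fraction

def dist_a(adj):
    """Variant (a): random linear order on V(G), then run the greedy."""
    n = len(adj)
    dist = defaultdict(Fraction)
    for order in itertools.permutations(range(n)):
        taken = []
        for v in order:
            if all(u not in adj[v] for u in taken):
                taken.append(v)
        dist[frozenset(taken)] += Fraction(1, math.factorial(n))
    return dict(dist)

def dist_b(adj):
    """Variant (b): repeatedly pick a uniformly random surviving vertex,
    take it, and delete it together with its neighbours."""
    dist = defaultdict(Fraction)

    def rec(alive, taken, prob):
        if not alive:
            dist[frozenset(taken)] += prob
            return
        for v in alive:
            rest = [u for u in alive if u != v and u not in adj[v]]
            rec(rest, taken + [v], prob / len(alive))

    rec(list(range(len(adj))), [], Fraction(1))
    return dict(dist)

# The path 0-1-2: both variants produce {0, 2} with probability 2/3
# and {1} with probability 1/3.
path = {0: {1}, 1: {0, 2}, 2: {1}}
print(dist_a(path) == dist_b(path))  # True
```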
12.3 Do all independent k-sets occur with the same probability in an
independent set constructed by a random greedy algorithm? Do
they occur with the same probability as the first k elements of
an independent set?
12.4 Prove that if G is a graph of order n and average degree d \le 1,
then

\[ \beta_0(G) \ge \hat\beta_0(G) \ge (1 - d/2)n. \]
12.5 Let G be a graph of order n, containing h triangles. Run the
following greedy algorithm to eliminate the triangles in G. Set
G_0 = G. Having constructed G_j, if k_3(G_j) > 0, pick a vertex x_j
of G_j in as many triangles as possible and set G_{j+1} = G_j - \{x_j\};
if k_3(G_j) = 0, terminate the sequence in G_j. Let G_s be the final
member of the sequence. Show that

\[ |G_s| \ge \alpha_l(ln - 3h) \quad\text{for } \{(l-1)/3\}n < h \le (l/3)n, \]

where

\[ \alpha_l = \prod_{j=1}^{l}\frac{2j}{2j+3}, \quad\text{so } \alpha_1 = \tfrac{2}{5},\ \alpha_2 = \tfrac{8}{35},\ \text{etc.} \]
12.6 Show that for fixed values of l the bounds in Ex. 5 are essen-
tially best possible. Note that as h/n \to \infty the bound is about
cn^{5/2}h^{-3/2}.
12.7 Deduce from the proof of Theorem 12.17 that for all \varepsilon > 0 and
s \ge 3 there exists a constant c(\varepsilon, s) > 0 such that if t is sufficiently
large, then either

\[ R(s,t) \le c(\varepsilon,s)R(s-1,t)(t/\log t) \]

or

\[ R(s-1,t) \le R(s-2,t)\,t^{1+\varepsilon}. \]
13 Explicit Constructions

13.1 Character Sums
we have

\[ p(X) = (X - x_0)^L s(X). \qquad\square \]

\[ y^2 = f(x) \]

satisfies

\[ |S - q| \le 4dq^{1/2}. \]
Proof We start by listing the elementary relations that will be needed
in the proof proper. Put L = \lfloor q^{1/2}\rfloor and J = \tfrac12(L + u), where u = 2d or
2d + 1, so that J is an integer. It is easily checked that

\[ L\bigl\{\tfrac12(q-d-2) + (L-1)(d-1) + J\bigr\} < J(q-d) \tag{13.1} \]

and

\[ \tfrac12(q-d-2) + (J-1)q + d(q-1)/2 \le L\{(q/2) + 2dq^{1/2} - d\}. \tag{13.2} \]

For 0 \le j < J and k = 0, 1 put A(j,k) = qj + kd(q-1)/2 and B(j,k) =
\tfrac12(q-d-2) + qj + kd(q-1)/2. We claim that the intervals [A(j,k), B(j,k)]
are disjoint. Indeed, suppose that [A(j,k), B(j,k)] \cap [A(j',k'), B(j',k')] \ne \emptyset,
where (j,k) \ne (j',k'). Then clearly k \ne k', say k = 0 and k' = 1, and so
either

\[ A(j,0) \le A(j',1) \le B(j,0) \]

or

\[ A(j',1) \le A(j,0) \le B(j',1). \]

Now the first relation implies that

\[ 0 \le q(j'-j) + d(q-1)/2 \le \tfrac12(q-d-2), \]
D1 {p(X)g(X)} = ')__
\[ 2S \le s \le 2S + d. \]

We shall show that both S and S' are at most \tfrac12 q + 2dq^{1/2} - d. Then

\[ S \ge q - d - S' \ge \tfrac12 q - 2dq^{1/2}, \]

so
This polynomial \Phi(X) is not zero since all non-zero summands have
different degrees. Indeed, if p_{jk}(X) is not the zero polynomial, then
and we know that the intervals [A(j,k), B(j,k)] are disjoint. The most
important property of \Phi is that (D^l\Phi)(x) = 0 (0 \le l < L) for all x \in F
satisfying g(x) = 1. Indeed, as D^l\{p(X)X^{qj}\} = X^{qj}D^l\{p(X)\}, and x^q = x
for every x \in F, if g(x) = 1 we have

\[ (D^l\Phi)(x) = \{f(x)\}^{-l}\sum_{j=0}^{J-1}\bigl\{(D^lp_{j0})(x) + (D^lp_{j1})(x)g(x)\bigr\}x^{qj} = \{f(x)\}^{-l}\sum_{j=0}^{J-1}\bigl\{(D^lp_{j0})(x) + (D^lp_{j1})(x)\bigr\}x^{j} = \{f(x)\}^{-l}P_l(x) = 0. \]
By Lemma 13.1 the polynomial \Phi(X) has a zero of order at least L at each
x satisfying g(x) = 1, so SL \le \deg\Phi \le \tfrac12(q-d-2) + (J-1)q + d(q-1)/2.
Hence (13.2) implies

\[ S \le q/2 + 2dq^{1/2} - d, \]

as required. \square
where S and S' are as in the proof of Theorem 13.2. Therefore, if f(X) has
odd degree m > 3 and q > 4m2 is a prime, then the proof of Theorem 13.2
implies that
f(x,y) = 0
satisfies
where g_0 \ne 0 is constant. If

\[ \max_{1\le i\le d}(\deg g_i)/i = m/d \quad\text{and}\quad (m,d) = 1, \tag{13.4} \]

where

\[ e(x) = \exp(2\pi i x). \]
\[ \sum_{x\in F}\chi(f(x))\psi(g(x)), \tag{13.5} \]

\[ \sum_x\sum_{y\ne0}\chi(x)\psi(x)\chi(1/y)\psi(-y) = \sum_{y\ne0}\sum_z\chi(zy)\psi(zy)\chi(1/y)\psi(-y) = q + \sum_{z\ne1}\chi(z)\sum_u\psi(u) = q, \]
\[ \sum_x\psi(ax^2 + bx + c) = \sum_x\psi\Bigl(a\Bigl(x + \frac{b}{2a}\Bigr)^2 + c - \frac{b^2}{4a}\Bigr) \]

and

\[ \sum_x\chi(x/c)\psi(x) = \sum_y\chi(y)\psi(cy). \]
y
= E ip((t-t')x)= 5 51P((t-t'Ox)
x t,t'ET ttcT x
-~ S ~~ (Y)i((u - w)y)
y U EU WGy
5
E 5i(uy) 5ýW W)
y UGU wEWE
qUUI1 /21 W 11 2
- / ,
Fig. 13.1. Two drawings of the Paley graph P_9, with F_9 = Z_3[x]/(x^2 + 1); its
vertices are 0, 1, 2, x, x+1, x+2, 2x, 2x+1, 2x+2.
13.2 The Paley Graph P_q

everywhere except for a 0 in the main diagonal: denote by C this new
matrix. Thus for q = 5, with the earlier convention and the parentheses
omitted, A, B and C are as follows (the displays of A and B are not
legible in this scan; C, with entry \chi(j-i) in position (i,j) and 0 on the
main diagonal, has rows the cyclic shifts of 0 + - - +).
\[ 2^{|U|+|W|}v(U,W) = \sum_{x\notin U\cup W}\prod_{u\in U}\{1+\chi(x-u)\}\prod_{w\in W}\{1-\chi(x-w)\} = \sum_{Z\subset U\cup W}(-1)^{|Z\cap W|}\sum_x\chi(P_Z(x)), \]

where P_Z(x) = \prod_{z\in Z}(x-z).
Finally,
Theorem 13.10 Let U and W be disjoint sets of vertices of the Paley graph
P_q and denote by v(U,W) the number of vertices not in U \cup W joined to
each vertex of U and to no vertex of W. Then
Then
Theorem 13.13 Let U and W be disjoint sets of vertices of the Paley graph
Pq. Then
and

\[ \bigl|e(U,W) - \tfrac12|U||W|\bigr| \le \tfrac12|U|^{1/2}|W|^{1/2}q^{1/2}. \]
\[ \sum_{u,u'\in U}\chi(u-u') = 2\Bigl[e(U) - \Bigl\{\binom{|U|}{2} - e(U)\Bigr\}\Bigr] = 4e(U) - 2\binom{|U|}{2} \]

and

\[ \sum_{u\in U}\sum_{w\in W}\chi(u-w) = e(U,W) - \{|U||W| - e(U,W)\} = 2e(U,W) - |U||W|. \]

By Lemma 13.7, the first expression is at most |U|q^{1/2} and the second is
at most |U|^{1/2}|W|^{1/2}q^{1/2}.
All our results so far concern the existence of many vertices and
edges of one kind or another. However, r.gs are used not only because
they contain many subgraphs of various kinds, but also because they
do not contain certain subgraphs. Thus, as a.e. G_{n,1/2} has fairly small
clique number, in Chapter 12 we could use it to give lower bounds
for Ramsey numbers. Do Paley graphs give us good lower bounds for
Ramsey numbers? In other words, do Paley graphs have small clique
numbers?
On examining Pq for some small values of q, we are greatly encouraged.
Thus P 5 (which happens to be the pentagon C 5) is triangle-free so shows
\[ e(U) - \frac12\binom{u}{2} \le \frac{u\,q^{1/2}}{4}, \]

so

\[ \mathrm{cl}(P_q) = u \le q^{1/2} + 1. \tag{13.11} \]
Theorem 13.14 Let U be a set of u vertices of the Paley graph Pq. Then
Furthermore,

\[ \mathrm{cl}(P_q) \le q^{1/2}. \]
Proof We may assume that 0 < u < q. Let d = 2e(U)/u be the average
degree in the graph P_q[U] and denote by n(U) the number of unordered
pairs of neighbouring edges (ac, bc) with a, b \in U. By Theorem 13.8,
for each pair (a,b) of adjacent vertices of U there are (q-5)/4 such
pairs and for each pair (a,b) of non-adjacent vertices there are (q-1)/4
such pairs. Hence

\[ n(U) = \frac{q-5}{4}\,e(U) + \frac{q-1}{4}\Bigl\{\binom{u}{2} - e(U)\Bigr\}. \]

Therefore
implying
ud(d - 1) + u q 1-) - -) 1
u(u- 1)(q- 1) du
8 2
Multiplying by u(q - u)/(2q) and rearranging, we find that
2 16q
/
(cu)
which is precisely the first inequality to be proved.
To see the assertion about the clique number, note that if U spans a
complete subgraph of P_q and |U| = u, then d = u - 1; substituting this
into the first inequality, we find that u \le q^{1/2}. \square
Can we do better than Theorem 13.14? Not too easily. In fact, for a
general prime power q \equiv 1 \pmod 4 Theorem 13.14 is best possible: if
q = p^2 for some odd prime p and F_0 is the prime field of F_q, then every
element of F_0 is a square. Hence F_0 spans a complete subgraph of P_q
and so \mathrm{cl}(P_q) \ge p = q^{1/2}. This shows that if we want P_q to have a small
clique number, then q had better be a prime. Furthermore, even when
q is a prime, \mathrm{cl}(P_q) cannot be shown to be small by the use of general
character sum estimates, for those estimates would apply equally in the
case when q is a square.
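For a small prime q one can compute cl(P_q) directly. The brute-force sketch below (our own code, feasible only for small q) confirms that the clique number stays well below the q^{1/2} + 1 bound for q = 13 and q = 17.

```python
from itertools import combinations

def paley_adjacency(q):
    """Adjacency test for the Paley graph P_q (q prime, q = 1 mod 4):
    a ~ b iff a - b is a non-zero square mod q."""
    squares = {(x * x) % q for x in range(1, q)}
    return lambda a, b: (a - b) % q in squares

def clique_number(q):
    """cl(P_q) by exhaustive search over vertex subsets of growing size."""
    adj = paley_adjacency(q)
    best, k = 1, 2
    while True:
        found = any(all(adj(a, b) for a, b in combinations(S, 2))
                    for S in combinations(range(q), k))
        if not found:
            return best
        best, k = k, k + 1

print(clique_number(13), clique_number(17))  # 3 3
```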
Suppose then that q is a prime, q \equiv 1 \pmod 4. Write n(q) for the least
natural number which is not a quadratic residue modulo q. Then the set
\{1, 2, \ldots, n(q)\} spans a complete subgraph of P_q, so

\[ n(q) \le \mathrm{cl}(P_q). \]
The function n(q) (to be precise, its analogue for a general character \chi of
F_q) has been studied extensively; for an excellent survey see Montgomery
(1971, Chapter 13). Extending results of Pólya (1918) and Vinogradov
(1918), Burgess (1957, 1962, 1963) proved that if \varepsilon > 0 and q is sufficiently
large, then

\[ n(q) \le q^{1/(4\sqrt{e})+\varepsilon}. \]

13.3 Dense Graphs
such that e(U) is about M\binom{u}{2}\big/\binom{n}{2} for all u-sets U if u is not too small,
provided \delta(G) is not much smaller than 2M/n, the average degree, and
no two vertices have substantially more than 4M^2/n^3 common neighbours, which
is about the expected number in G_M. Thus, to show a fair resemblance
to a typical element of 𝒢(n, M), it suffices to prove that all degrees are
about the same and all pairs of vertices have about the same number of
common neighbours.
Let us start with natural generalizations of Paley graphs: Cayley
graphs. Given a group H and a set X of generators of H, the Cayley graph
G(H,X) is a directed multigraph with vertex set H in which there is an
arc (directed edge) from a to b if a^{-1}b \in X. [The assumption that X is a
set of generators is not very important and we usually ignore it; if X does
not happen to be a set of generators then G(H,X) falls into isomorphic
components, with each component being isomorphic to a Cayley graph
of a subgroup of H.] With a slight abuse of terminology, the graph
obtained from G(H,X) by forgetting the orientation and multiplicity of
edges is also called a Cayley graph and is denoted by G(H,X). There
seems to be little harm in this ambiguity; in fact, usually we choose X
to be self-inverse: X = X^{-1} = \{x^{-1} : x \in X\}, in which case G(H,X) is
obtained from G(H,X) by replacing each edge ab by the arcs ab and ba.
Clearly the Paley graph P_q is a Cayley graph: P_q = G\{(F_q,+), Q_q\}, where
Q_q is the set of non-zero squares in F_q.
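The Cayley description P_q = G{(F_q,+), Q_q} translates directly into code. A small sketch (our own), which also checks that Q_q is self-inverse when q \equiv 1 (mod 4):

```python
def paley_graph(q):
    """Paley graph P_q = G{(F_q,+), Q_q} for a prime q = 1 (mod 4):
    vertices Z_q, with x ~ y iff x - y is a non-zero square mod q."""
    assert q % 4 == 1
    squares = {(x * x) % q for x in range(1, q)}      # Q_q
    assert all((-s) % q in squares for s in squares)  # Q_q is self-inverse
    return {v: {(v + s) % q for s in squares} for v in range(q)}

g = paley_graph(13)
print(sorted({len(nbrs) for nbrs in g.values()}))  # [6]: P_q is (q-1)/2-regular
```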
Cayley graphs give us many new models of random graphs. For
example, given a sequence of groups H_1, H_2, \ldots with H_n having order
2k_n + 1, k_n \to \infty, and r_n \in \mathbb{N}, we may define 𝒢(H_n, r_n) as the set of all
Cayley graphs G(H_n, X_n) with X_n running through all self-inverse sets of
2r_n elements. Thus 𝒢(H_n, r_n) consists of \binom{k_n}{r_n} 2r_n-regular graphs of order
2k_n + 1.
In a rather vague sense, it is clear that 'most' elements of 'most' of the
spaces 𝒢(H_n, r_n) resemble appropriate typical (more conventional) r.gs.
Our problem is, then, to choose a particular Cayley graph that we can
prove to be similar to an r.g.
As a slight generalization of P_q, we can choose q and d such that q is
a prime power, d \ge 2, d divides q-1 and -1 \in X_q = \{x^d : x \in F_q, x \ne 0\},
and then take G\{(F_q,+), X_q\}. As another variant, we can take q and d
as above, pick a subset Y_q of F_q and define X_q = \{x : x^d \in Y_q, x \ne 0\}.
Methods analogous to those in §2 can be used to show that, under certain
conditions, these graphs do resemble typical r.gs in 𝒢(n,p) for constant p.
How can we define concrete r.gs which are slightly less dense? Instead
of F_q, let us take F_p = \mathbb{Z}_p for some prime p and let t \in \mathbb{N}, 2 \le t < p. Let
Q(p,t) be the graph with vertex set F_p and edge set \{uv : f((u-v)^2) = 1\},
where

\[ f(s) = \begin{cases} 1, & \text{if } 0 < \{s/p\} \le t/p,\\ 0, & \text{otherwise,} \end{cases} \]

and let (c_j)_0^{p-1} be the Fourier coefficients of f:

\[ f(s) = \sum_{j=0}^{p-1}c_j e(sj/p). \]

Thus

\[ c_j = \frac{1}{p}\sum_{s=1}^{t}e(-sj/p); \]

in particular, c_0 = t/p.
\[ \sum_{j=1}^{p-1}|c_j| < \log p. \]

Proof Clearly

\[ |c_j| = \frac{1}{p}\Bigl|\sum_{s=1}^{t}e(-sj/p)\Bigr| \le \frac{2}{p\,|1-e(j/p)|}, \]

so

\[ \sum_{j=1}^{p-1}|c_j| \le \frac{4}{p}\sum_{j=1}^{(p-1)/2}\frac{1}{|1-e(j/p)|} \le \sum_{j=1}^{(p-1)/2}\frac{1}{j} < \log p, \]

since

\[ |1 - e(j/p)| = |1 - e^{2\pi i j/p}| \ge \frac{4j}{p} \]

if 0 < j < p/2. \square
For large values of p, Lemma 13.15 is easily improved (see Ex. 9).
Proof Let us consider the case U = \{u,v\}, W = \emptyset; the other cases being
similar (see Ex. 10). Let f be as above, so that the number of common
neighbours of u and v is

\[ \sum_{r=0}^{p-1}f\{(u-r)^2\}f\{(v-r)^2\} = \sum_{r=0}^{p-1}\sum_{j,k}c_jc_k\,e\{(u-r)^2j/p\}\,e\{(v-r)^2k/p\} = pc_0^2 + \sum_{(j,k)\ne(0,0)}c_jc_k\sum_{r=0}^{p-1}e\{(u-r)^2j/p + (v-r)^2k/p\} = pc_0^2 + R, \]

say. Since \psi(x) = e(x/p) is an additive character of F_p, by (13.6) and
Lemma 13.6 we have

Hence

\[ |R| \le \sum_{j,k}|c_jc_k|\,p^{1/2} < p^{1/2}(\log p)^2, \]
For 0 \le k < l \le r - 1 let G(r,q,k,l) be the bipartite graph with
bipartition \{V_k(r,q), V_l(r,q)\}, with A \in V_k(r,q) adjacent to B \in V_l(r,q) if
B contains A. Reiman (1958) was the first to consider a graph G(r,q,k,l),
namely G = G(r,q,0,1), in connection with the problem of Zarankiewicz
for K(2,2). Clearly G is an m by n bipartite graph of girth 6 with
m = v_0(r,q), n = v_1(r,q); furthermore, it has the maximal number of
edges among all m by n bipartite graphs without a quadrilateral.
The graphs G(r, q, 0, r-1) were considered recently by Alon (1985), who
showed that they resemble a typical Gp in the sense that most subsets of
vertices have 'many' neighbours (see Theorem 2.15 and Theorem 13.10).
Graphs in which all subsets of vertices of certain sizes have 'many'
neighbours are usually called expanders. There are a good many different
definitions in use, some of which will be discussed here and in §4. Call a
graph (a,b)-expanding if for any set A of a vertices

\[ |A \cup \Gamma(A)| \ge a + b, \tag{13.14} \]

i.e. if A has at least b neighbours not in A. Similarly, a bipartite graph is
said to be (a,b)-expanding if (13.14) holds whenever A is a subset of one
of the vertex classes. [In that case, (13.14) is equivalent to |\Gamma(A)| \ge b.]
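Condition (13.14) is easy to test exhaustively on a small graph. The sketch below (our own code; the graph K_{3,3} is just a convenient test case) checks every a-set of vertices.

```python
from itertools import combinations

def is_ab_expanding(adj, a, b):
    """Check condition (13.14): every a-set A of vertices satisfies
    |A u Gamma(A)| >= a + b, i.e. A has >= b neighbours outside A."""
    for A in combinations(list(adj), a):
        gamma = set().union(*(adj[v] for v in A))
        if len(set(A) | gamma) < a + b:
            return False
    return True

# K_{3,3}: any single vertex already has 3 neighbours on the other side.
k33 = {v: set(range(3, 6)) if v < 3 else set(range(3)) for v in range(6)}
print(is_ab_expanding(k33, 1, 3), is_ab_expanding(k33, 1, 4))  # True False
```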
Tanner (1985) showed that the expansion properties of a graph are
related to the eigenvalues of various matrices associated with the graph;
this connection was further exploited by Alon and Milman (1984, 1985)
and then Alon (1985) proved the following theorem. The result also
follows from an argument along the lines of the proof of Theorem 13.14.
\[ a\alpha + b\beta + c\gamma = 0. \]
Mathon (1978) was the first to construct conference graphs with trivial
automorphism groups: 1152 of the 1504 conference graphs of order 45
constructed by Mathon have trivial automorphism groups. There is little
doubt that many of these graphs have degree of asymmetry 22 and so
show A(45) = 22; and it seems very likely that A(n) = (n - 1)/2 infinitely
often.
be the entropy function.

and if D \subset U, |D| \le \alpha n, then G contains a complete matching from D into
W.
With a little work one can check that the conditions imply \sum_{d \le \alpha n} S(d) =
o(1). \square
Theorem 13.22 Let 0 < \alpha < 1/2 \le \lambda and r \in \mathbb{N} be such that
U-(ý+l)•( -- IO) (1=(
N - 2a)-(1-U) < i 0-,,(1 _ )2 1•r
_)(-),
Then for every fixed k and sufficiently large n there is an n by n+k expander
graph with parameters (\lambda, \alpha, r). \square

Corollary 13.23 If 0 < \alpha < 1/2 \le \lambda and k is a non-negative integer, then
there is an r such that for every sufficiently large n there is an n by n + k
expander graph with parameters (\lambda, \alpha, r). \square
Although expanders are easily shown to exist, they are rather difficult
to construct, so it was a real breakthrough when Margulis (1973) gave
an explicit construction for an infinite family of expander graphs. It is
easily seen that small sets are more likely to expand than large ones, so it
makes sense to take this into account when defining expanders. Thus, to
state the result of Margulis, it is natural to define an (n, r, \lambda)-expander as
an n by n bipartite graph of maximal degree at most r, with bipartition

13.4 Sparse Graphs
Theorem 13.24 For m \ge 4, let U and W be two disjoint copies of \mathbb{Z}_m \times \mathbb{Z}_m
and define a bipartite graph G_m with bipartition (U,W) by joining (x,y) \in
U to (x,y), (x+1,y), (x,y+1), (x,x+y) and (-y,x) in W. Then G_m is an
(m^2, 5, \lambda)-expander for some absolute constant \lambda > 0. \square
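The construction translates directly into code. The neighbour list below follows the statement as reconstructed above, so treat this as a sketch rather than a faithful transcription of Margulis's graph.

```python
def margulis_graph(m):
    """Margulis-style bipartite graph on two copies of Z_m x Z_m:
    (x, y) in U is joined in W to (x,y), (x+1,y), (x,y+1), (x,x+y)
    and (-y,x), coordinates taken mod m."""
    return {(x, y): {(x, y), ((x + 1) % m, y), (x, (y + 1) % m),
                     (x, (x + y) % m), ((-y) % m, x)}
            for x in range(m) for y in range(m)}

gm = margulis_graph(5)
print(len(gm), max(len(nbrs) for nbrs in gm.values()))  # 25 5
```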
For 0 < p < 1 \le \alpha, a graph G is called (p,\alpha)-jumbled if every induced
subgraph H of G satisfies

\[ \Bigl|e(H) - p\binom{|H|}{2}\Bigr| \le \alpha|H|. \]
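The definition can be tested exhaustively on a small graph. The sketch below (our own code) computes the best \alpha for the Paley graph P_13 at density p = 1/2 by brute force over all 2^{13} vertex subsets.

```python
from itertools import combinations

def jumbled_alpha(adj, p):
    """Smallest alpha with |e(H) - p*C(|H|,2)| <= alpha*|H| for every
    induced subgraph H (brute force over all vertex subsets)."""
    vertices = list(adj)
    worst = 0.0
    for k in range(1, len(vertices) + 1):
        for S in combinations(vertices, k):
            e = sum(1 for a, b in combinations(S, 2) if b in adj[a])
            worst = max(worst, abs(e - p * k * (k - 1) / 2) / k)
    return worst

# Paley graph P_13 at density p = 1/2: alpha stays below q^(1/2).
squares = {(x * x) % 13 for x in range(1, 13)}
p13 = {v: {(v + s) % 13 for s in squares} for v in range(13)}
print(jumbled_alpha(p13, 0.5) <= 13 ** 0.5)  # True
```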
\[ \sum_{i=1}^{k}d_i = kd \]

and

\[ \sum_{j=k+1}^{n}d_j \ge \sum_{i=1}^{k}(pn - d_i) = k(pn - d), \]

\[ d_i \le c_1(p^2n + 1). \]

This is equivalent to

\[ (d - pk)^2 \le \frac{n-k}{n}\bigl\{(k-1) + p(1-p)n\bigr\}, \]

which implies that

\[ |d - p(k-1)| \le ((p+1)n)^{1/2}, \]
Theorem 13.28 Let 0 < p < 1 and let h, m and n be positive integers with
2 \le h \le n-2. Let G be a graph such that for every induced subgraph H
of G with h vertices we have

\[ \Bigl|e(H) - p\binom{h}{2}\Bigr| \le m. \]

Then every induced subgraph H of G satisfies

\[ \Bigl|e(H) - p\binom{|H|}{2}\Bigr| \le 80\,m\Bigl(\frac{|H|}{h}\Bigr)^2. \]
Theorem 13.29 Let H be a graph of order h \ge 3 with m edges, and let G
be a (p,\alpha)-jumbled graph of order n, where p \le 1/2. Furthermore, suppose
that \varepsilon^2p^hn \ge 4^{2h}h^2, where 0 < \varepsilon < 1. Then

\[ (1-\varepsilon)^hp^m(1-p)^{\binom{h}{2}-m}n^h \le N^*(H) \le (1+\varepsilon)^hp^m(1-p)^{\binom{h}{2}-m}n^h. \qquad\square \]
1. For s \ge 1, P_1(s) is the property that for every graph H with s vertices
we have

\[ N^*(H) = (1+o(1))\,n^s2^{-\binom{s}{2}}. \]

2. For t \ge 3, P_2(t) is the property that e(G_n) \ge (1+o(1))n^2/4 and

6. For vertices u and v of G_n, let a_n(u,v) be the number of vertices joined
either to both u and v or to neither. Then P_6 is the property that

\[ \sum_{u,v\in V(G_n)}\Bigl|a_n(u,v) - \frac{n}{2}\Bigr| = o(n^3). \]
Here is then the theorem of Chung, Graham and Wilson (1989) con-
cerning these properties.

Theorem 13.30 For s \ge 4 and t \ge 4 even, all the properties P_1(s), P_2(t),
P_3, \ldots, P_7 are equivalent. \square

Although some of the implications, like P_1(s) \Rightarrow P_2(s) and P_4 \Rightarrow P_5,
are immediate, and some others are fairly easy, much work is needed to
prove the equivalence of all the properties. In particular, it is surprising
that condition P_2(4) is so strong, although it looks rather weak.
A graph Gₙ is said to be quasi-random if its natural sequence (Gₙ)
satisfies any (and so all) of the properties in Theorem 13.30. Clearly, one
could easily define quasi-random graphs 'imitating' the properties of G_{n,p}
for any constant p, not only for p = 1/2. In this way we obtain quasi-
random graphs of density p. Chung and Graham (1989, 1990, 1991a, b)
went considerably further: in addition to quasi-random graphs, they
studied quasi-random hypergraphs, set systems and tournaments as well.
Simonovits and Sós (1991) brought to light the following fascinating
connection between quasi-random graphs and Szemerédi's lemma: if a
graph is quasi-random then in every Szemerédi partition all pairs are
regular with density 1/2 and, conversely, if a graph has a Szemerédi
partition in which almost all pairs are regular with density 1/2 then the
graph is quasi-random. An unexpected bonus of this is that if a graph
has a Szemerédi partition in which almost all pairs have density 1/2 then
there are no exceptional pairs.
It is easily shown that if N_{Gₙ}(H) = (1 + o(1)) n^h p^{e(H)} for all graphs H
of order h, where 0 < p < 1 is fixed, then Gₙ is quasi-random of density
p. It would be tempting to conjecture that if N_{Gₙ}(H) = (1 + o(1)) n^h p^{e(H)}
for a not too exceptional graph H of order h then Gₙ is quasi-random.
In fact, this is far from being the case. Nevertheless, Simonovits and
Exercises
13.1 Prove that if χ is a non-principal multiplicative character of
F = F_q and ψ is a non-principal additive character, then
Σ_{x∈F} χ(x) = 0 and Σ_{x∈F} ψ(x) = 0.
13.2 Prove that if χ is a non-principal multiplicative character of F_p,
then for all integers M and N,
|Σ_{c=M+1}^{M+N} χ(c)| ≤ p^{1/2} log p.
[By Lemma 13.5 and relation (13.8), with ψ(x) = e(x/p).]
Given a natural number n, denote by Qⁿ the graph whose vertices are all
the 0–1 sequences (ε₁, ε₂, …, εₙ) of length n and in which two sequences
(i.e. vertices of Qⁿ) are adjacent iff they differ in exactly one member.
Equivalently, V(Qⁿ) = {0, …, 2ⁿ − 1} and i is joined to j if, when written in
binary form, they differ in precisely one digit. We call Qⁿ the n-dimensional
cube or simply n-cube. Clearly Qⁿ is an n-regular, n-connected bipartite
graph of order 2ⁿ.
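The second description above lends itself to a direct computational check: i and j are adjacent in Qⁿ exactly when i XOR j is a power of 2. A minimal sketch (the function name is ours, not part of the text):

```python
def cube_graph(n):
    """Adjacency lists of Q^n: vertices 0..2^n - 1, with i ~ j iff
    i XOR j is a power of 2, i.e. the binary forms differ in one digit."""
    return {v: [v ^ (1 << b) for b in range(n)] for v in range(2 ** n)}

n = 4
Q = cube_graph(n)
assert len(Q) == 2 ** n                                # order 2^n
assert all(len(nbrs) == n for nbrs in Q.values())      # n-regular
# bipartite: adjacent vertices have binary digit sums of opposite parity
assert all(bin(v).count("1") % 2 != bin(w).count("1") % 2
           for v in Q for w in Q[v])
```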
In §1 we shall study the space 𝒢(Qⁿ; p) of random subgraphs of Qⁿ, an
example of the probability space introduced in §1 of Chapter 2. Thus, to
obtain a random element Q_p of 𝒢(Qⁿ; p), we delete the edges of Qⁿ with
probability q = 1 − p, independently of each other. For what values of p is
Q_p likely to be connected? As we shall see, the critical value of p is 1/2, and
the main obstruction to connectedness is the existence of isolated vertices.
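The model 𝒢(Qⁿ; p) is easy to simulate; a sketch (function names are ours) that keeps each cube edge with probability p and tests connectedness by breadth-first search:

```python
import random
from collections import deque

def random_cube_subgraph(n, p, rng):
    """A random element of G(Q^n; p): each edge kept independently
    with probability p (deleted with probability q = 1 - p)."""
    adj = {v: set() for v in range(2 ** n)}
    for v in adj:
        for b in range(n):
            w = v ^ (1 << b)
            if v < w and rng.random() < p:
                adj[v].add(w)
                adj[w].add(v)
    return adj

def is_connected(adj):
    seen, queue = {0}, deque([0])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(adj)

rng = random.Random(0)
assert is_connected(random_cube_subgraph(4, 1.0, rng))      # p = 1: the whole cube
assert not is_connected(random_cube_subgraph(4, 0.0, rng))  # p = 0: no edges at all
```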
Let x₁, x₂, …, xₙ be random vertices of Qⁿ and consider each xᵢ as a
vector in ℝⁿ. What is the probability that these n (not necessarily distinct)
vectors are linearly dependent? As one would expect, this probability
tends to 0 as n → ∞; however, the proof, due to Komlós (1967, 1968),
is surprisingly difficult. Later Komlós (1981) extended his own theorem:
he showed that the probability is O(n^{−1/2}); our main aim in §2 is to
prove this result. It is fascinating that the proof is based on a lemma of
Littlewood and Offord (1938), which was also the starting point of the
study of random polynomials by Littlewood and Offord, mentioned in
the Preface.
The problem to be discussed in §3 seems to have nothing to do with
the n-cube but, on closer examination, it reduces to a question concerning
subsets of vertices of the n-cube. Given finite sets S₁, …, Sₙ, split ⋃ᵢ₌₁ⁿ Sᵢ
into two sets, say R and B, such that each set Sᵢ has about as many
elements in R as in B. How good a splitting can we choose?
384 Sequences, Matrices and Permutations
The problem can also be formulated in terms of bipartite graphs: given
an n by m bipartite graph with bipartition (U, W), split W into two sets,
say R and B, such that each of the n vertices of U is joined to about as
many vertices in R as in B.
In §3 we shall present a beautiful theorem of Spencer (1981) about
the existence of good splittings. The ingenious proof uses probabilistic
arguments; among others, it makes use of the symmetric group as a
probability space.
In §4 we glance at random permutations and random elements of
other finite groups. Although the topic is definitely not part of the theory
of random graphs, there are several reasons for a very brief discussion
of the area in this chapter. The symmetric group was among the first
combinatorial objects studied by probabilistic methods, there is a tangible
kinship between results concerning random elements of groups and many
of the theorems appearing in this book, and often the methods are similar
to those we have been using.
Finally, §5 is about random mappings of a finite set into itself.
b_{Qⁿ}(H) = kn − 2e(H).
Therefore
b_{Qⁿ}(k) = kn − 2eₙ(k),
where eₙ(k) = max{e(H) : H ⊂ Qⁿ, |H| = k}.
Denote by h(i) the sum of digits in the binary expansion of i, and set
f(l, l + k) = Σᵢ₌ₗ^{l+k−1} h(i)
and
fₙ(k) = f(0, k).
The following theorem of Harper, Bernstein and Hart solves the isoperi-
metric problem in the cube Qⁿ.
Theorem 14.1
eₙ(k) = fₙ(k) and b_{Qⁿ}(k) = kn − 2fₙ(k).
Proof Let us prepare the ground by examining the function f(l, l + k).
Note that
h(i) + h(2ʳ − 1 − i) = r
for all i, 0 ≤ i ≤ 2ʳ − 1, and so
f(l, l + k) + f(2ʳ − l − k, 2ʳ − l) = rk (14.1)
whenever l + k ≤ 2ʳ. Furthermore, for each j ≥ 1, in the binary expansions
of the numbers 0, 1, 2, … the jth digit is 2^{j−1} times 0, then 2^{j−1} times 1, etc.
Hence the sum of the jth digits of l, l + 1, …, l + k − 1 is minimal if the
first block of 0's is as long as possible. Therefore
f(l, l + k) ≥ fₙ(k) (14.2)
for all l ≥ 0, and
f(l, l + k) ≥ f(2ʳ, 2ʳ + k) = fₙ(k) + k (14.3)
whenever k ≤ 2ʳ ≤ l.
Relations (14.1) and (14.2) imply that
f(l, l + k) = f(l, 2ʳ) + f(2ʳ, l + k)
≥ {(2ʳ − l)r − fₙ(2ʳ − l)} + {fₙ(l + k − 2ʳ) + l + k − 2ʳ}
≥ {(2ʳ − l)r − f(2ʳ − k, 2ʳ − k + 2ʳ − l)} + fₙ(l + k − 2ʳ) + l + k − 2ʳ
= f(l + k − 2ʳ, k) + fₙ(l + k − 2ʳ) + k = fₙ(k) + k.
The penultimate equality followed from (14.1) since 2^{r+1} − k − l < 2ʳ.
This completes the proof of (14.5).
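Theorem 14.1 can be confirmed by brute force for small n: eₙ(k), the maximal number of cube edges spanned by k vertices, agrees with fₙ(k) = h(0) + … + h(k − 1). A sketch (names ours):

```python
from itertools import combinations

def h(i):
    """Sum of digits in the binary expansion of i."""
    return bin(i).count("1")

def f_n(k):
    """f_n(k) = h(0) + h(1) + ... + h(k-1)."""
    return sum(h(i) for i in range(k))

def e_n(n, k):
    """Max number of Q^n-edges spanned by a k-set of vertices (brute force)."""
    best = 0
    for S in combinations(range(2 ** n), k):
        S = set(S)
        edges = sum(1 for v in S for b in range(n) if (v ^ (1 << b)) in S) // 2
        best = max(best, edges)
    return best

n = 3
for k in range(2 ** n + 1):
    assert e_n(n, k) == f_n(k)    # Theorem 14.1 checked exhaustively for n = 3
```

For example f(4) = 0 + 1 + 1 + 2 = 4, attained by a 2-dimensional subcube.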
14.1 Random Subgraphs of the Cube 387
and
b_{Qⁿ}(k) ≥ k(n − ⌈log₂ k⌉).
Suppose that (2q)ⁿ → λ, where 0 ≤ λ < ∞. Then
lim_{n→∞} P(Q_p is connected) = e^{−λ}.
By Theorem 1.20, it suffices to show that for every fixed r ≥ 0, the rth
factorial moment of X tends to λʳ. Counting very crudely indeed, a set
of r vertices is incident with at least r(n − r) edges and at most rn edges.
Furthermore, there are at most 2^{rn} ways of choosing an ordered set of
r vertices. This gives
(2q)^{rn}(1 − r²2^{−n}) < E_r(X) < (2q)^{rn}{1 + 2^{−n} r n q^{−r²}}.
and so
Σ_{S∈𝒞_s} q^{b(S)} ≤ |𝒞_s| q^{b(s)},
and so
Σ_{s=2}^{s₁} Σ_{S∈𝒞_s} q^{b(S)} = o(1).
Let us turn to the range s₁ < s ≤ 2^{n−1}. Here we shall partition 𝒞_s into
two sets. Let 𝒞_s^− consist of those sets S in 𝒞_s for which … Hence
Σ_{s=s₁+1}^{2^{n−1}} Σ_{S∈𝒞_s^−} q^{b(S)} ≤ Σ_{s=s₁+1}^{2^{n−1}} (e log n)^s …
Lemma 14.5 Let G be a graph of order v and suppose that Δ(G) ≤ Δ,
2e(G) = vd and Δ + 1 ≤ u ≤ v − Δ − 1. Then there is a u-set U of vertices
with
|N(U)| = |U ∪ Γ(U)| ≥ (vd/(Δ + 1)){1 − exp(−u(Δ + 1)/v)}.
Indeed, for a random u-set U,
E|N(U)| ≥ (vd/(Δ + 1)){1 − (1 − u/v)^{Δ+1}}
≥ (vd/(Δ + 1)){1 − exp(−u(Δ + 1)/v)}.
Let us continue the proof of Lemma 14.4.
(iii) Let us estimate the double sum corresponding to (14.7) over the
range s₁ < s ≤ s₂ = ⌊2ⁿ/(log n)⁴⌋ and S ∈ 𝒞_s^+. Set
u = ⌊2s/n⌋.
By Lemma 14.5 there is a u-set U ⊂ S such that N(U), the set of vertices
of H within distance 1 of U, satisfies the bound of that lemma.
This shows that the sets in 𝒞_s^+ can be selected as follows: first we select
u vertices of Qⁿ, then ⌈s/3⌉ − u neighbours of these vertices and then
⌊2s/3⌋ other vertices. Hence
|𝒞_s^+| ≤ binom(2ⁿ, u) binom(un, ⌈s/3⌉ − u) binom(2ⁿ, ⌊2s/3⌋),
and so
Σ_{S∈𝒞_s^+} q^{b(S)} ≤ binom(2ⁿ, u) binom(un, ⌈s/3⌉ − u) binom(2ⁿ, ⌊2s/3⌋) q^{s(n − log₂ s)/(log n)}.
Write s = 2^{βn}, so that β ≤ 1 − 4 log₂ log n / n. Then the sum above is at
most
(c log n)ˢ 2^{(2/3)sn(1−β) − sn(1−β)} = (c log n)ˢ 2^{−sn(1−β)/3},
Hence
Σ_{s=s₁+1}^{s₂} Σ_{S∈𝒞_s^+} q^{b(S)} = o(1).
Furthermore,
N_H(U) ≥ N_{H'}(U) ≥ |N(U)|{1 − exp(−n^{1/4})},
where the summation is over all (k₁, k₂, …, k_u) with kᵢ ≤ (log₂ n)², since
as required. This concludes the proof of Lemma 14.4, and so the proof of
Theorem 14.3 is complete. □
What can we say about the orders of the large components of Q_p for
fairly small values of p? For what values of p is there a 'giant' component,
if ever? As in Chapter 6, let us write Lᵢ(Q_p) for the order of an ith largest
component of Q_p. Erdős and Spencer (1979) conjectured that L₁(Q_p),
with p = c/n, has a jump at c = 1. This was proved by Ajtai, Komlós
and Szemerédi (1982), who remarked also that L₂(Q_p) is almost always
small. Thus a giant component emerges shortly after p = 1/n and the
graph becomes connected shortly after p = 1/2. Note that the ratio of
these probabilities is about the logarithm of the order of the graph, just
as in the case of G_p, studied in Chapter 6.
binom(n, ⌊n/2⌋)
of the 2ⁿ sums Σᵢ₌₁ⁿ εᵢxᵢ, εᵢ = ±1, fall into the open interval (S − 1, S + 1).
… is a Sperner family. □
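The Littlewood–Offord bound — at most binom(n, ⌊n/2⌋) of the 2ⁿ signed sums fall into any open interval of length 2 when |xᵢ| ≥ 1 — can be checked exhaustively for small n. A sketch (names and the sample data are ours):

```python
from itertools import product
from math import comb

def sums_in_unit_window(x, S):
    """Count sign patterns eps with sum(eps_i * x_i) in the open interval (S-1, S+1)."""
    return sum(1 for eps in product((-1, 1), repeat=len(x))
               if S - 1 < sum(e * xi for e, xi in zip(eps, x)) < S + 1)

x = [1.0, 1.25, 1.5, 2.0, 3.0, 1.0]        # arbitrary reals with |x_i| >= 1
n = len(x)
bound = comb(n, n // 2)                    # binom(n, floor(n/2)) = 20 for n = 6
assert all(sums_in_unit_window(x, S) <= bound for S in range(-10, 11))
```

The extremal case x₁ = … = xₙ = 1 attains the bound at S = 0 for even n, since exactly binom(n, n/2) sign patterns give the sum 0.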
14.2 Random Matrices 395
same value. □
where
p_k = binom(k, ⌊k/2⌋) 2^{−k}.
Σⱼ₌₁ⁿ αⱼaᵢⱼ = 0 (14.9)
Noting that
p_k ~ (2/(πk))^{1/2},
for if n₀ ≤ r(a₁, …, aₙ) < n, then for some k ≥ n₀ the vectors a₁, …, a_{k−1}
are independent but a_k depends on them. Therefore to complete the proof
of our theorem it suffices to show that
P(E_k) ≤ c₁ 2^{k−n} n^{−1/2} (14.10)
for some constant c₁ and every k, n₀ ≤ k < n. Let n₀ ≤ k < n. Denote by
b₁, …, bₙ the row vectors of the matrix (a₁, …, a_{k−1}). By Lemma 14.10,
r(b₁, …, bₙ) ≥ ⌊n/2⌋, so at least n/2 of the βᵢ are non-zero, where β_k = 1.
If E_k holds then a_k is a linear combination of a₁, …, a_{k−1}, so the linear
relation (14.12) has to hold for the coordinates of a_k:
Σᵢ₌₁ᵏ βᵢaᵢₖ = 0. (14.12′)
Theorem 14.12 For 1 ≤ i ≤ j let aᵢⱼ be independent real-valued r.vs such
that all aᵢⱼ, i < j, have the same distribution and all aᵢᵢ have the same
distribution. Suppose, furthermore, that all central moments of the aᵢⱼ are
finite and put σ² = σ²(a₁₂). For i < j set aⱼᵢ = aᵢⱼ and let Aₙ = (aᵢⱼ)ⁿᵢ,ⱼ₌₁.
Finally, denote by Wₙ(x)(Aₙ) the number of eigenvalues of Aₙ not larger
than x, divided by n. Then Wₙ(x) is a r.v. for every n ∈ ℕ and x ∈ ℝ, defined
on the space of random matrices Aₙ, and limₙ→∞ Wₙ(2σxn^{1/2}) = W(x) in
distribution, where
W(x) = 0 if x ≤ −1, W(x) = (2/π)∫₋₁ˣ (1 − t²)^{1/2} dt if −1 < x < 1, and
W(x) = 1 if x ≥ 1.
Note that W(x) is precisely the distribution function whose density is
2/π times the unit semicircle.
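The semicircle law is easy to observe numerically: for a symmetric matrix with independent ±1 entries (σ = 1), the eigenvalues divided by 2σ√n settle into [−1, 1] with median near 0. A sketch (the parameters are our choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300
# symmetric matrix with independent +-1 entries above and on the diagonal
A = np.where(rng.random((n, n)) < 0.5, -1.0, 1.0)
A = np.triu(A) + np.triu(A, 1).T
lam = np.linalg.eigvalsh(A) / (2 * np.sqrt(n))   # rescale by 2 * sigma * sqrt(n)

assert lam.max() < 1.2 and lam.min() > -1.2      # spectrum essentially in [-1, 1]
frac_below_zero = np.mean(lam <= 0)
assert 0.4 < frac_below_zero < 0.6               # W(0) = 1/2
```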
Juhász (1981) and Füredi and Komlós (1981) studied the largest and
second-largest eigenvalues of a random symmetric matrix. Juhász proved
that if E(aᵢⱼ) > 0, then the second-largest eigenvalue is much smaller than
the largest eigenvalue. This result of Juhász was extended by Füredi and
Komlós to the following theorem which, in view of the semicircle law, is
essentially best possible.
Theorem 14.13 Let K, μ, σ and ν be fixed positive numbers. Let aᵢⱼ, 1 ≤
i ≤ j ≤ n, be independent r.vs such that |aᵢⱼ| ≤ K for all i ≤ j, E(aᵢⱼ) = μ
and σ²(aᵢⱼ) = σ² for all i < j, and E(aᵢᵢ) = ν for all i. For i < j set
aⱼᵢ = aᵢⱼ, and define A = (aᵢⱼ)ⁿᵢ,ⱼ₌₁. Let λ₁(A) ≥ λ₂(A) ≥ … ≥ λₙ(A) be the
eigenvalues of A.
(i) If μ > 0, then, as n → ∞,
maxᵢ≥₂ |λᵢ(A)| ≤ 2σ√n + O(n^{1/3} log n).
14.3 Balancing Families of Sets 399
Then our aim is to find a partition R ∪ B for which the vector (disc Sᵢ)ⁿᵢ₌₁ ∈
ℝⁿ is small in some norm.
After several results concerning this problem, due to Brown and
Spencer (1971), Olson and Spencer (1978) and Beck and Fiala (1981),
the l₂ norm case was settled completely by Spencer (1981). Our main aim
in this section is to prove Spencer's theorem (Theorem 14.21). Given n,
define f₂(n) as follows:
f₂(n) = max_{(Sᵢ)} min_{(R,B)} {Σᵢ₌₁ⁿ (disc Sᵢ)²}^{1/2}. (14.13)
To justify the use of the maximum above note that by splitting the atoms
of a system {Sᵢ}ⁿ₁ as equally as possible we obtain a partition for which
disc Sᵢ is less than 2ⁿ for every i. Hence f₂(n) ≤ n^{1/2}2ⁿ. As we shall see
later, this simple estimate is very crude indeed.
Before getting down to some mathematics, let us reformulate the
definition of f₂(n). Given sets S₁, S₂, …, Sₙ ⊂ {1, 2, …, m}, write
uⱼ = (u₁ⱼ, …, uₙⱼ), where uᵢⱼ = 1 if j ∈ Sᵢ and uᵢⱼ = 0 if j ∉ Sᵢ.
Then
Σᵢ₌₁ⁿ (disc Sᵢ)² = Σᵢ₌₁ⁿ (|Sᵢ ∩ R| − |Sᵢ ∩ B|)² = |Σ_{j∈R} uⱼ − Σ_{j∈B} uⱼ|²,
and so
f₂(n) = max_m max_{uⱼ∈Qⁿ} min_{εⱼ=±1} |Σⱼ₌₁ᵐ εⱼuⱼ|. (14.14)
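The equivalence between set discrepancies and signed sums of 0–1 column vectors is easy to verify by brute force on a tiny instance; a sketch (the instance and all names are ours):

```python
from itertools import product

def disc_vector_norm_sq(sets, m, signs):
    """|sum_j eps_j u_j|^2, where u_j records which sets contain ground element j;
    eps_j = +1 puts j into R, eps_j = -1 puts j into B."""
    total = 0
    for S in sets:
        total += sum(signs[j] for j in range(m) if j in S) ** 2
    return total

def min_sum_disc_sq(sets, m):
    """min over all partitions (R, B) of sum_i (disc S_i)^2."""
    return min(disc_vector_norm_sq(sets, m, signs)
               for signs in product((-1, 1), repeat=m))

# S_1, S_2, S_3 over the ground set {0, ..., 4}
sets = [{0, 1, 2}, {1, 3}, {2, 3, 4}]
best = min_sum_disc_sq(sets, 5)
# the odd-sized sets force |disc| >= 1, while disc S_2 can be made 0
assert best == 2
```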
|Σᵢ₌₁ʰ εᵢuᵢ|² ≤ hn.
Indeed, if |Σᵢ₌₁ʰ εᵢuᵢ|² ≤ hn and ε_{h+1} is chosen so that
ε_{h+1}⟨Σᵢ₌₁ʰ εᵢuᵢ, u_{h+1}⟩ ≤ 0, then
|Σᵢ₌₁^{h+1} εᵢuᵢ|² = |Σᵢ₌₁ʰ εᵢuᵢ|² + 2ε_{h+1}⟨Σᵢ₌₁ʰ εᵢuᵢ, u_{h+1}⟩ + |u_{h+1}|² ≤ (h + 1)n.
In order to get our first bound on f₂(n) we note that Qⁿ is contained in
the ball of radius √n about the origin. Hence, if for an arbitrary number
of vectors u₁, u₂, …, u_m in the unit ball of ℝⁿ we can find εᵢ = ±1 with
|Σ εᵢuᵢ| ≤ h(n), say, then f₂(n) ≤ √n·h(n). It turns out that the best
possible function h(n) is easily determined.
Theorem 14.15 Let u₁, …, u_m be vectors in the unit ball of ℝⁿ. Then there
are εᵢ = ±1 such that |Σᵢ₌₁ᵐ εᵢuᵢ| ≤ n^{1/2}. The bound n^{1/2} is best possible.
Proof If u₁, …, uₙ are orthonormal then |Σ εᵢuᵢ| = n^{1/2} for every choice
of the εᵢ, so the bound cannot be improved. To see that n^{1/2} is an upper
bound, in the first instance we choose α₁, α₂, …, α_m between −1 and 1 so
that most of them are exactly ±1 and the sum Σ₁ᵐ αᵢuᵢ is 0, and then
change the remaining αᵢ to ±1 in such a way that the sum will not be
altered by much. We state these two steps as separate lemmas.
Lemma 14.16 Given u₁, …, u_m ∈ ℝⁿ there are (αᵢ)₁ᵐ such that
Σᵢ₌₁ᵐ αᵢuᵢ = 0, −1 ≤ αᵢ ≤ 1,
and all but at most n of the αᵢ are ±1.
Proof Set H = {(α₁, …, α_m) ∈ ℝᵐ : Σᵢ₌₁ᵐ αᵢuᵢ = 0} and let Kᵐ be the solid
cube: Kᵐ = {(x₁, …, x_m) : |xᵢ| ≤ 1} ⊂ ℝᵐ. Let α be an extreme point of
Kᵐ ∩ H. Then since codim H ≤ n, α is not in the interior of a face of
dimension greater than n of Kᵐ. Hence α is in the n-dimensional skeleton
of Kᵐ, i.e. all but at most n of the αᵢ are ±1. □
Lemma 14.17 Let u₁, …, uₙ ∈ ℝⁿ, |uᵢ| ≤ 1, and let (α₁, …, αₙ) ∈ Kⁿ. Then
there exist εᵢ = ±1 such that
|Σᵢ₌₁ⁿ (εᵢ − αᵢ)uᵢ| ≤ n^{1/2}.
Having selected ε₁, ε₂, …, ε_k, put a = Σᵢ₌₁ᵏ (εᵢ − αᵢ)uᵢ, b = a − (α_{k+1} + 1)u_{k+1},
c = a − (α_{k+1} − 1)u_{k+1}. Then b and c are the endpoints of a segment of
length at most 2 containing a. Hence by the remark above
|Σᵢ₌₁ⁿ εᵢuᵢ − Σᵢ₌₁ⁿ αᵢuᵢ| = |Σᵢ₌₁ⁿ (εᵢ − αᵢ)uᵢ| ≤ n^{1/2}. □
Proof Given S₁, S₂, …, Sₙ, let u₁, u₂, …, u_m ∈ Qⁿ be the vectors determined
by the Sᵢ. Then |uᵢ/n^{1/2}| ≤ 1 for every i, so for some εᵢ = ±1 we have
|Σ εᵢuᵢ/n^{1/2}| ≤ n^{1/2},
implying
|Σᵢ₌₁ᵐ εᵢuᵢ|² ≤ n². □
The above attack on f₂(n) has one main deficiency. Having selected
α₁, α₂, …, α_m, the εᵢ are chosen one by one, so in the choice of εᵢ we do
not take into account u_{i+1}, α_{i+1}, …. In order to get a better bound on
f₂(n) we select the εᵢ all at once. To do this we shall replace the εᵢ by
appropriate random variables Xᵢ and will show that the expected value
of |Σ₁ⁿ Xᵢuᵢ|² is small. The existence of these random variables will be
implied by the following lemma.
Lemma 14.19 Given ε > 0 there is an n₀ such that if n > n₀ and 0 ≤ pᵢ ≤
1, i = 1, …, n, then there are Bernoulli r.vs X₁, X₂, …, Xₙ satisfying
P(Xᵢ = 1) = pᵢ, P(Xᵢ = 0) = 1 − pᵢ
and
σ²(Σ_{i∈S} Xᵢ) ≤ ¼(1 + ε)n for every set S ⊂ {1, 2, …, n}.
Proof We shall show first that the result is true if for some integer a we
have p₁ = p₂ = … = pₙ = p = a/n. Consider the symmetric group Sₙ, acting
on {1, 2, …, n}, as a probability space. For i = 1, …, n, define
Aᵢ = {π ∈ Sₙ : π(i) ≤ a} and Yᵢ = 1_{Aᵢ}.
Then E(Yᵢ) = P(Aᵢ) = a/n = p and, for i < j,
E(YᵢYⱼ) = P(Aᵢ ∩ Aⱼ) = (a/n)·(a − 1)/(n − 1) = p(a − 1)/(n − 1).
Hence, if S ⊂ {1, 2, …, n} has s elements, then
σ²(Σ_{i∈S} Yᵢ) = Σ_{i∈S} E(Yᵢ²) + 2Σ_{i<j} E(YᵢYⱼ) − (sp)²
= sp{1 + (s − 1)(a − 1)/(n − 1) − sp}
= spq(n − s)/(n − 1) ≤ spq ≤ n/4,
where q = 1 − p.
Therefore
σ²(Σ_{i∈S} Yᵢ) ≤ n/4.
Let X₁, X₂, …, Xₙ be 0–1 r.vs satisfying P(Xᵢ = 1) = pᵢ, Xᵢ ≥ Yᵢ and
P(A_{i₁} ∩ A_{i₂} ∩ … ∩ A_{iₖ}) ≤ E(X_{i₁}X_{i₂} … X_{iₖ}) ≤ P(C_{i₁} ∩ C_{i₂} ∩ … ∩ C_{iₖ}).
for some integer a. Use the construction above to define r.vs for the pᵢ's
in each group and make these groups of r.vs independent of each other.
Since the variance is additive on independent r.vs, if n is large enough,
the variance of any sum of our r.vs is at most ¼(1 + ε)n.
Lemma 14.19′ Given ε > 0, there is an n₀ such that the following holds.
If n > n₀ and 0 ≤ pᵢ ≤ 1, 1 ≤ i ≤ n, then there exist r.vs X₁, X₂, …, Xₙ
such that
P(Xᵢ = 1) = pᵢ, P(Xᵢ = −1) = 1 − pᵢ
and
σ²(Σ_{i∈S} Xᵢ) ≤ (1 + ε)n for every set S ⊂ {1, 2, …, n}.
Lemma 14.20 Given ε > 0, there is an n₀ such that the following holds.
If n ≥ n₀, v₁, v₂, …, vₙ are (not necessarily distinct) vertices of the n-cube
Qⁿ and −1 ≤ αᵢ ≤ 1 for i = 1, 2, …, n, then there exist εᵢ = ±1 such that
|Σᵢ₌₁ⁿ (εᵢ − αᵢ)vᵢ|² ≤ (1 + ε)n². (14.15)
Proof Let pᵢ = (1 + αᵢ)/2 and let X₁, X₂, …, Xₙ be the ±1-valued r.vs
guaranteed by Lemma 14.19′, so that
σ²(Σ_{i∈S} Xᵢ) ≤ (1 + ε)n
for every set S ⊂ {1, 2, …, n}. Then E(Xᵢ) = pᵢ − (1 − pᵢ) = αᵢ. Denote by
Sⱼ the set of indices i for which the jth coordinate of vᵢ is 1. Then
E|Σᵢ₌₁ⁿ (Xᵢ − αᵢ)vᵢ|² = Σⱼ₌₁ⁿ σ²(Σ_{i∈Sⱼ} Xᵢ) ≤ (1 + ε)n²,
since
σ²(Σ_{i∈Sⱼ} Xᵢ) ≤ (1 + ε)n
for every j. Consequently (14.15) holds for some choice of the εᵢ.
To get a lower bound for f₂(n), assume first that there is an Hadamard
matrix of order n, i.e. there are orthogonal vectors v₁, v₂, …, vₙ ∈ ℝⁿ
with coordinates ±1. We may also assume that v₁ = (1, 1, …, 1). Set
uᵢ = ½(v₁ + vᵢ), i = 1, 2, …, n, so that uᵢ ∈ Qⁿ. Then, writing s = Σᵢ₌₁ⁿ εᵢ,
for every choice of the εᵢ = ±1 we have
|Σᵢ₌₁ⁿ εᵢuᵢ|² = ¼{n(s + ε₁)² + n(n − 1)} ≥ n(n − 1)/4. (14.16)
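The Hadamard lower bound (14.16) can be verified directly for n = 4 via Sylvester's doubling construction; a sketch (names ours):

```python
from itertools import product

def sylvester(k):
    """Hadamard matrix of order 2^k by Sylvester's doubling construction."""
    H = [[1]]
    for _ in range(k):
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H

n = 4
V = sylvester(2)                        # rows v_1, ..., v_4 with v_1 = (1, 1, 1, 1)
U = [[(V[0][j] + V[i][j]) // 2 for j in range(n)] for i in range(n)]  # u_i in Q^n

def norm_sq(eps):
    return sum(sum(e * U[i][j] for i, e in enumerate(eps)) ** 2 for j in range(n))

best = min(norm_sq(eps) for eps in product((-1, 1), repeat=n))
assert best >= n * (n - 1) // 4         # the lower bound (14.16)
assert best == 4                        # for n = 4 the exact minimum is (s + e_1)^2 + 3
```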
The thrust of Theorem 14.21 is, of course, the upper bound which,
loosely stated, claims that f₂(n) = O(n). Putting it another way, given
a family of n sets, for some partition of the underlying set the average
of the discrepancies is O(n^{1/2}). Recently, Spencer (1985)
greatly improved this assertion: he showed that for some colouring not
only the average but even the maximum of the discrepancies is O(n^{1/2})!
To formulate Spencer's theorem, analogously to (14.13), define
f_∞(n) = max_{(Sᵢ)} min_{(R,B)} maxᵢ |disc Sᵢ|.
Note that, by the easy part of Theorem 14.21, f_∞(n) ≥ {½ + o(1)}n^{1/2}.
To change the topic slightly, let 𝓜ₙ be the set of n by n matrices whose
entries are ±1. For A = (aᵢⱼ) ∈ 𝓜ₙ, let d(A) = |Σᵢ,ⱼ aᵢⱼ|. How small can
we make d(A) by successively changing the signs of all entries in some
rows or some columns? And how large can we make it? These questions
were studied by Gleason (1960), Moon and Moser (1966), Komlós and
Sulyok (1970) and Brown and Spencer (1971).
To formulate the results precisely, for A, B ∈ 𝓜ₙ let A ⊢ B if A is
obtained from B by changing the signs of all entries in a column or in a
row; let ∼ be the equivalence relation induced by ⊢. Define
d(n) = max_{A∈𝓜ₙ} min_{B∼A} d(B)
and
D(n) = min_{A∈𝓜ₙ} max_{B∼A} d(B).
= { 1, if n is odd,
  { 2, if n is even. □
and it is easily seen, using Theorem 1.20, that for each fixed k we have
X_k →^d P_{1/k} (Ex. 7). In particular, the probability that a random element
of Sₙ has no fixed point tends to e^{−1}: a rather well-known result. Also,
the expected number of cycles (or orbits) is precisely Σₖ₌₁ⁿ 1/k.
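Both facts are easy to confirm by exhaustive enumeration of Sₙ for small n; a sketch (names ours):

```python
from itertools import permutations
from fractions import Fraction

def cycles(perm):
    """Number of cycles (orbits) of a permutation given in one-line notation."""
    seen, count = set(), 0
    for i in range(len(perm)):
        if i not in seen:
            count += 1
            j = i
            while j not in seen:
                seen.add(j)
                j = perm[j]
    return count

n = 5
perms = list(permutations(range(n)))
derangements = sum(1 for p in perms if all(p[i] != i for i in range(n)))
assert derangements == 44                       # P(no fixed point) = 44/120, close to 1/e
total_cycles = sum(cycles(p) for p in perms)
harmonic = sum(Fraction(1, k) for k in range(1, n + 1))
assert Fraction(total_cycles, len(perms)) == harmonic   # E(#cycles) = 1 + 1/2 + ... + 1/n
```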
Let us have a look at some more involved r.vs on Sₙ. Fix a set
R ⊂ V = {1, 2, …, n}, |R| = r ≥ 1. The orbits of a permutation σ ∈ Sₙ
partition V and so induce a partition π(σ) = π(σ, R) of R. Denote by
I(σ) = I(σ, R) the set of elements of V − R in the orbits of σ containing
elements of R. [Thus R ∪ I(σ, R) is the minimal subset of V containing
R and invariant under σ.] By definition, π and I are r.vs on Sₙ.
Proof By symmetry. Here Σ′ denotes summation over all integer sequences
(kᵢ) with kᵢ ≥ 0 and Σᵢ kᵢ = k. Note that
Σ′ Πᵢ binom(rᵢ + kᵢ − 1, kᵢ) = binom(r + k − 1, k). (14.19)
Corollary 14.26 π(σ) and I(σ) are independent r.vs, the distribution of π(σ)
is independent of n,
Pₙ{π(σ) = π} = p_π = (1/r!) Πᵢ (rᵢ − 1)!
and
Pₙ{|I(σ)| = k} = p_k = binom(r + k − 1, k) r! (n − r)ₖ/(n)_{r+k},
k = 0, 1, …, n − r,
and so
Theorem 14.27 The Bernoulli r.vs W₁, …, Wₙ are independent and satisfy
P(W_h = 1) = 1/h, h = 1, …, n.
Proof Since (W₁, …, W_h) is a function of π(σ, {1, …, h}) and W_{h+1}
is a function of I(σ, {1, …, h}), the independence of the W_h follows
from Corollary 14.26. Furthermore, W_h(σ) = 1 iff in the partition
π(σ, {1, …, h}) the class containing h contains only h. By Corollary 14.26
the probability of this is precisely the probability that a random permu-
tation of {1, 2, …, h} fixes h: P_h{σ(h) = h} = 1/h. □
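Theorem 14.27 can be checked exactly for small n: W_h = 1 iff the cycle of h meets {1, …, h} only in h itself, and over all of Sₙ the joint distribution of (W₁, …, Wₙ) factorises. A sketch (names and the 0-based indexing are ours):

```python
from itertools import permutations, product
from fractions import Fraction

def W(perm, h):
    """W_h = 1 iff the cycle of h contains no element of {1,...,h} other than h
    (internally 0-based: h corresponds to index h - 1)."""
    i = h - 1
    j = perm[i]
    while j != i:
        if j < i:
            return 0
        j = perm[j]
    return 1

n = 4
perms = list(permutations(range(n)))
N = Fraction(len(perms))
for h in range(1, n + 1):                      # marginals: P(W_h = 1) = 1/h
    assert Fraction(sum(W(p, h) for p in perms)) / N == Fraction(1, h)
for pattern in product((0, 1), repeat=n):      # independence: joint = product
    count = sum(1 for p in perms
                if all(W(p, h) == pattern[h - 1] for h in range(1, n + 1)))
    expected = N
    for h in range(1, n + 1):
        expected *= Fraction(1, h) if pattern[h - 1] else Fraction(h - 1, h)
    assert Fraction(count) == expected
```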
The result is rather surprising since, at first sight, θₙ hardly seems
to be concentrated about (log n)²/2. Indeed, already Landau (1909) had
proved that with Θₙ = max{θₙ(σ) : σ ∈ Sₙ} we have
lim_{n→∞} log Θₙ/(n log n)^{1/2} = 1,
Theorem 14.33
(i) P(G_f is connected) = P(γ = 1) = (1/n) Σ_{r=1}^{n} Π_{j=0}^{r−1} (1 − j/n)
~ (π/2)^{1/2} n^{−1/2}.
(ii) For every fixed r we have γ_r →^d P_{1/r}.
(iii) E(γ) = Σ_{r=1}^{n} (1/r) Π_{j=0}^{r−1} (1 − j/n) ~ ½ log n
and
σ²(γ) ~ ½ log n.
(iv) P(α = k) = P(θ = k) = k(n)ₖ n^{−k−1} = (k/n) Πⱼ₌₁^{k−1} (1 − j/n)
and
P(γ = 1) = (1/n) Σₖ₌₁ⁿ (n)ₖ n^{−k} = (1/n) Σₖ₌₁ⁿ Πⱼ₌₁^{k−1} (1 − j/n).
k n^{n−k−1} ways of extending this cycle to a r.m.d. (see Lemma 5.17 and
Theorem 5.18). As there are nⁿ r.m.ds, the exact formula for P(γ = 1)
follows. To see the asymptotic formula note that, as in the proof of
Theorem 5.18,
E(γ_r) = binom(n, r) (r − 1)! n^{−r},
and so
P(α = k, δ = 1) = (n − 1)_{k−1} n^{−k} = (n)ₖ n^{−k−1}
and
P(α = k) = k(n)ₖ n^{−k−1}.
and
E{τ(m)} → c^{−1/2} − 1. (14.21)
Proof What is the expected number of trees of order m for which the
unique cycle in the component containing the tree has r + 1 edges? There
are binom(n, m) ways of selecting the vertex set of the tree and m^{m−1} ways of
selecting a rooted spanning tree on that set. There are (n − m)_r ways of
attaching a directed (r + 1)-cycle to the root and (n − m)^{n−m−r} mʳ ways of
extending the union of our tree and cycle to a r.m.d. Hence, summing over r
and dividing by the number nⁿ of r.m.ds, a calculation as in the proof of
Theorem 1.21 gives (14.21).
Corollary 14.35 Denote by τ* the maximal tree size in a r.m.g. Then for
½ < c < 1 we have limₙ→∞ P(τ*/n ≥ c) = (1/√c) − 1.
Proof Clearly τ*/n ≥ c iff τ(⌈cn⌉) ≥ 1. Furthermore, a r.m.g. G_f cannot
contain two trees each with more than n/2 vertices, so τ(⌈cn⌉) takes only
two values, 0 and 1. Hence
P(τ*/n ≥ c) = P{τ(⌈cn⌉) = 1} = E{τ(⌈cn⌉)} → (1/√c) − 1. □
Theorem 14.33 and Corollary 14.35 show that random mappings are
rather peculiar: with high probability there are exceptionally large trees,
i.e. there are vertices with exceptionally many ancestors. On the other
hand, the number of cyclic vertices is only of order √n.
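The distribution of α, the number of cyclic vertices, can be checked exhaustively for tiny n: the formula P(α = k) = k(n)ₖn^{−k−1} matches a count over all nⁿ mappings. A sketch (names ours):

```python
from itertools import product
from math import prod

def is_cyclic(f, x):
    """x is a cyclic vertex iff some iterate of f returns to x."""
    y = x
    for _ in range(len(f)):
        y = f[y]
        if y == x:
            return True
    return False

n = 4
counts = {k: 0 for k in range(1, n + 1)}
for f in product(range(n), repeat=n):          # all 4^4 = 256 mappings
    counts[sum(is_cyclic(f, x) for x in range(n))] += 1

for k in range(1, n + 1):                      # P(alpha = k) = k (n)_k n^{-k-1}
    falling = prod(range(n, n - k, -1))
    assert counts[k] * n ** (k + 1) == k * falling * n ** n
```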
A random mapping digraph was proposed by Gertsbakh (1977) as a
model for an epidemic process: X is the population, initially a subset
A of X is infected with a contagious disease and the disease spreads to
other elements of X according to some rule depending on a r.m.d. D_f.
There are three obvious possibilities: the disease spreads only forwards
along an arc, it spreads only backwards, or it spreads in both directions.
In this way we get direct, inverse and two-sided epidemic processes.
For a function f ∈ 𝓕 let f* : X → 𝒫(X) be defined by
f*(x) = {f(x)} ∪ f⁻¹(x),
and denote by f(A), f⁻¹(A) and f*(A) the
closures of A under f, f⁻¹ and f*, respectively. Thus f(A) is the set of all
descendants of elements of A and f*(A) is the minimal subset B of X
which contains A and satisfies f*(b) ⊂ B for every b ∈ B. Clearly
ξ(A) = |f(A)|, η(A) = |f⁻¹(A)| and ζ(A) = |f*(A)|
are the numbers of elements which are eventually infected if the disease is
initially confined to A and then spreads according to the function f and
the rules above.
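The three closures are simple fixed-point computations; a sketch (function names and the sample mapping are ours):

```python
def forward_closure(f, A):
    """Direct epidemic: all descendants of A under x -> f(x)."""
    B, frontier = set(A), set(A)
    while frontier:
        frontier = {f[x] for x in frontier} - B
        B |= frontier
    return B

def inverse_closure(f, A):
    """Inverse epidemic: closure of A under f^{-1}."""
    B, changed = set(A), True
    while changed:
        new = {x for x in range(len(f)) if f[x] in B} - B
        changed = bool(new)
        B |= new
    return B

def twosided_closure(f, A):
    """Two-sided epidemic: closure under f*(x) = {f(x)} union f^{-1}(x)."""
    B, changed = set(A), True
    while changed:
        new = ({f[x] for x in B} | {x for x in range(len(f)) if f[x] in B}) - B
        changed = bool(new)
        B |= new
    return B

# f on X = {0, ..., 5}: the cycle is 0 -> 1 -> 2 -> 0, with 4 -> 3 -> 1 and 5 -> 5
f = [1, 2, 0, 1, 3, 5]
assert forward_closure(f, {4}) == {0, 1, 2, 3, 4}
assert inverse_closure(f, {4}) == {4}
assert inverse_closure(f, {1}) == {0, 1, 2, 3, 4}
assert twosided_closure(f, {4}) == {0, 1, 2, 3, 4}
```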
Gertsbakh (1977) posed a number of interesting questions. For how
large a set A are we likely to end up with at least a fixed proportion of
infected elements? Is there a threshold function m₀ = m₀(n) in the sense
that if |A| = o(m₀), then for a.e. function (in the usual r.g. sense) only o(n)
of the elements will be infected but if |A|/m₀ → ∞ then for a.e. function
n − o(n) elements will be infected? Or, simply, what can we say about the
distributions of ξ(A), η(A) and ζ(A)? As these distributions depend only
on |A|, for |A| = m we may write ξₘ, ηₘ and ζₘ instead of ξ(A), η(A) and
ζ(A). Note that ξ₁ = α and η₁ = β. Each of the r.vs ξₘ, ηₘ and ζₘ has
range {m, m + 1, …, n}.
It so happens that ηₘ, the r.v. associated with the inverse epidemic
process, is the easiest to handle. Fortunately, as far as threshold functions
are concerned, it is also the most interesting.
The probability that in an inverse epidemic process a single element
infects the entire set was determined by Gertsbakh (1977).
Theorem 14.36
P(η₁ = n) = P(β = n) = 1/n.
Proof When does an element x infect the entire set? If and only if G_f is
connected and x is on the unique cycle of G_f. Hence, arguing as in the
proof of Theorem 14.33(i) and making use of Theorem 14.33(iv),
P(η₁ = n) = Σₖ₌₁ⁿ (k/n)(1/k)P(α = k) = (1/n) Σₖ₌₁ⁿ P(α = k) = 1/n. □
P(D_f is connected | {i : Yᵢ = 0} = S) = P(D_{f/S} is connected) ≥ 1 − 2θ,
where the second inequality follows from the induction hypothesis, with
S ∪ {0} playing the role of the distinguished vertex 0. Consequently
P(D_f is connected) ≥ (1 − 2θ) Σₖ₌₀^{n−1} binom(n − 1, k) θᵏ(1 − θ)^{n−1−k}
= 1 − 2θ.
Proof Let A ⊂ X, B ⊂ X∖A, |A| = m and |B| = s. Denote by C the event
that, starting from A, B will be the set of infected elements in the inverse
epidemic process. Furthermore, let C₁ be the event that f(B) ⊂ A ∪ B, C₂
be the event that f(X∖(A ∪ B)) ⊂ X∖(A ∪ B) and C₃ be the event that
for every b ∈ B there is an oriented path whose initial vertex is b and
which ends in A. Clearly C = C₁ ∩ C₂ ∩ C₃ and the events C₁ and C₂ are
independent. Furthermore,
(iii) If m = o(n) and m/n^{1/2} → ∞, then (m²/n²)(n − ηₘ) converges in
distribution to a r.v. with density
Quite crudely, Theorem 14.39 tells us that our process has a threshold
function.
From this theorem we can find the threshold functions for the direct
and two-sided epidemic process; in a way the results are disappointing:
the threshold functions are extreme in both cases.
Corollary 14.42 (i) If m = o(n) and m → ∞, then, starting with m infected
elements, in the direct epidemic process the number of eventually infected
elements is of order (nm)^{1/2}. In particular, m₀ = n is a threshold function
for the direct epidemic process.
(ii) If m → ∞, then in the two-sided epidemic process almost all elements
will be infected, provided we start with m infected elements. In particular,
m₀ = 1 is a threshold function for the two-sided epidemic process.
Exercises
14.1 Given a graph G of order n, consider a one-to-one map f :
V(G) → {0, 1, …, n − 1}, called a numbering of G, and set
σ(f, G) = Σ_{xy∈E(G)} |f(x) − f(y)|.
Show that for the cube Qⁿ with V(Qⁿ) = {0, …, 2ⁿ − 1}, σ(f, Qⁿ)
is minimized by the natural numbering f(i) = i, i = 0, 1, …, 2ⁿ − 1.
(Harper, 1964, 1966a.)
14.2 The bandwidth bw(G) of a graph G is
bw(G) = min_f max{|f(x) − f(y)| : xy ∈ E(G)},
where the minimum is taken over all numberings of G (see
Ex. 1). Show that the bandwidth of the cube Qⁿ is attained by a
14.5 By considering scalar products show that if a₁, a₂, …, a_{d+2} ∈ ℝᵈ
and |aᵢ| ≥ 1 for each i, then |aᵢ + aⱼ| ≥ √2 for some i ≠ j. Use this
and Turán's theorem to deduce that if X and Y are identically
distributed random vectors in ℝᵈ, then
(Katona, 1977.)
14.6 For 1 ≤ p ≤ ∞ write ‖·‖_p for the p-norm: ‖(xᵢ)‖_p = (Σ₁ⁿ |xᵢ|ᵖ)^{1/p}.
Define f_p(n) analogously to f₂(n) and f_∞(n):
f_p(n) = max_{(Sᵢ)} min_{(R,B)} ‖(disc Sᵢ)ⁿᵢ₌₁‖_p.
Deduce from Theorems 14.21 and 14.22 that for 2 ≤ p₁ ≤
p₂ ≤ ∞ we have
(Spencer, 1981.)
E[Σ{Rₖ(b) − 2ᵏ/n}²] = 2ᵏ(1 − …)
E{γ(s)} ~ log(1/c).
(Stepanov, 1969c.)
14.13 Alter the definition of an inverse epidemic process so that loops
are not allowed. Prove that the probability that a single element
infects the entire set tends to e/n.
[cf. Theorem 14.36; Gertsbakh (1977).]
14.14 Deduce from relation (14.31) that in a two-sided epidemic process
a positive proportion of the elements is expected to be infected
by a single element:
lim_{n→∞} E(ζ₁)/n > 0.
(Pittel, 1983.)
15
Sorting Algorithms
the nature of this problem depends considerably on the magnitude of r.
If we restrict the depth severely, then the size has to be of considerably
larger order than n log n; on the other hand, with width O(n) there are
algorithms of size O(n log n). The problem of sorting with small depth
(i.e. in few rounds) was suggested only rather recently by Roberts and
was motivated by testing consumer preferences by correspondence (see
Scheele, 1977). On the other hand, the problem of sorting with (relatively)
small width is a classical problem in computing (see Knuth, 1973) and
there is a formidable literature on the subject.
In this chapter we shall consider only the two extreme cases: that
of very small depth and that of small width. The first two sections are
devoted to variants of the problem of sorting in two rounds, and the third
section gives a brief description of a recent major breakthrough by Ajtai,
Komlós and Szemerédi (1983a, b) concerning sorting with width O(n). In
the fourth section we present some recent results about bin packing.
A wider view of sorting is taken in a recent review article by Bollobás
and Hell (1984), to whom the reader is referred for additional references.
G has a directed path whose initial vertex is x and whose terminal vertex
is y.
The question we investigate is this. How few edges can a graph of
order n have if every acyclic orientation of it is such that its d-step
closure contains binom(n, 2) − o(n²) edges? The results we are going to present
are from Bollobás and Rosenfeld (1981).
One of our concrete random graphs has relatively few edges but there
are many more edges in its 2-step closure. Let p be a prime power and let
G₀ be the graph constructed by Erdős and Rényi (1962) and described
in §4 of Chapter 13. Thus the vertex set of G₀ is the set of points of the
projective plane PG(2, p) over the field of order p and a point (a, b, c) is
joined to a point (α, β, γ) iff aα + bβ + cγ = 0.
Theorem 15.1 The graph G₀ has n = p² + p + 1 vertices and ½p(p + 1)² ≈
½n^{3/2} edges, and every orientation of G₀ contains at least (1/10)(p² − 1)² ≈
n²/10 direct implications.
Proof Let us recall the following facts about G₀ (see pp. 328–329): p²
vertices have degree p + 1, the remaining p + 1 vertices are independent
and have degree p, no triangle of G₀ contains any of these p + 1 vertices,
and the graph G₀ has diameter 2 and contains no quadrilaterals.
Consequently, no pentagon of G₀ has a diagonal. Since every orienta-
tion of an odd cycle contains a directed path of length 2, every orientation
G⃗₀ of G₀ is such that every pentagon of G₀ has a diagonal which is a
direct implication in G⃗₀. Thus in order to prove that C₂(G⃗₀) has many
edges we shall show that G₀ has many pentagons and no pair of vertices
is the diagonal of many pentagons.
At least how many pentagons contain an edge x₂x₃ of G₀? We may
assume that x₃ has degree p + 1. Let x₄ be a vertex adjacent to x₃ but
not to x₂. Since G₀ has no quadrilateral, we have at least p − 1 choices
for x₄. Let x₁ be a vertex adjacent to x₂ but not to x₃. If d(x₂) = p + 1,
then, as before, we have at least p − 1 choices for x₁. If d(x₂) = p, then
x₂ is contained in no triangle, so once again we have p − 1 choices for x₁.
Finally, since G₀ has diameter 2, there is a (unique) vertex x₅ adjacent to
both x₁ and x₄. In this way we have constructed a pentagon x₁x₂…x₅.
Go has at least
- 1)2
2p(p + 1)2(p
pentagons.
In at most how many pentagons of G₀ can a given pair ab ∈ V^{(2)} be
a diagonal? Suppose x₁x₂x₃x₄x₅ is a pentagon and x₁ = a, x₃ = b. Since
G₀ has no quadrilateral, the vertex x₂ is determined by a and b. As x₄
has to be a neighbour of x₃ different from x₂, we have at most p choices
for x₄. Having chosen x₄, once again we have at most one choice for x₅.
Hence at most p pentagons contain ab as a diagonal.
Finally, every orientation of a pentagon gives at least one direct impli-
cation, so no matter how we orient G₀, we obtain at least
(1/10)p(p + 1)²(p − 1)²/p = (1/10)(p² − 1)²
direct implications.
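The facts about G₀ used above are easy to verify computationally for a small prime, say p = 3; a sketch (the construction code is ours):

```python
from itertools import product, combinations

p = 3
# points of PG(2, p): nonzero triples over F_p, normalised so that the first
# nonzero coordinate is 1 (one representative per projective point)
points = []
for v in product(range(p), repeat=3):
    if any(v):
        i = next(j for j in range(3) if v[j])
        inv = pow(v[i], -1, p)
        w = tuple(x * inv % p for x in v)
        if w not in points:
            points.append(w)

n = len(points)
assert n == p * p + p + 1                        # 13 vertices

adj = {a: {b for b in points
           if b != a and sum(x * y for x, y in zip(a, b)) % p == 0}
       for a in points}
edges = sum(len(s) for s in adj.values()) // 2
assert edges == p * (p + 1) ** 2 // 2            # 24 edges, about n^{3/2}/2
assert max(len(s) for s in adj.values()) == p + 1
# no quadrilateral: any two vertices have at most one common neighbour
for a, b in combinations(points, 2):
    assert len(adj[a] & adj[b]) <= 1
```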
Z_W = {z ∈ V − W : |Γ(z) ∩ W| ≥ 3pw}
and
⌊n/k⌋ ≤ |Wᵢ| ≤ ⌈n/k⌉.
then |W| ≥ w₀, so |Z_W| ≤ w₀. Each x ∈ Wᵢ − Z_W is joined to a set Uₓ of
at least 3pw ≥ u₀ vertices in W. By assumption at most (ε/3)n vertices
of G₀ are not joined to some vertices in Uₓ. Hence for such a vertex x
there are at most 2⌈n/k⌉ + (ε/3)n < (2ε/3)n vertices y ∈ ⋃_{j>i} Wⱼ such that
xy is not a direct implication. Consequently, for a fixed i there are at
most w₀n + ⌈n/k⌉(2ε/3)n < εn²/k pairs of the form {xⱼ, xₗ}, xₗ ∈ Wᵢ, j < l,
for which xⱼxₗ is not a direct implication. As there are k blocks Wᵢ, the
assertion follows.
Proof The first assertion is immediate from Theorem 15.2. To see the
second, note simply that a.e. G_p has at most ω₀(n)n^{3/2} edges [in fact, at
most (1 + ε)ω₀(n)n^{3/2}/2 edges]. □
Theorem 15.4 (i) Given ε > 0, there is a constant C > 0 such that with p =
Cn^{1/d−1} a.e. G_p is such that the d-step closure of every acyclic orientation
of G_p contains at least (1 − ε)binom(n, 2) edges.
In particular, if ω₀(n) → ∞, then there are graphs G₁, G₂, … such that
Gₙ has n vertices, at most ω₀(n)n^{1+1/d} edges, and the d-step closure of every
acyclic orientation of Gₙ has {1 − o(1)}binom(n, 2) edges.
(ii) Given C > 0, there is an ε > 0 such that if n is sufficiently large, then
every graph of order n and size at most Cn^{1+1/d} has a consistent orientation
with at most (1 − ε)binom(n, 2) d-step implications. □
$$E(M) = \sum_{i=1}^{n} \frac{1}{d_i + 1}.$$
Since
$$\sum_{i=1}^{n} d_i \le 2Cn,$$
$$E(M) \ge \frac{n}{2C + 1},$$
so
$$E\{M(M-1)/2\} \ge \binom{n/(2C+1)}{2} > n^2/9C^2
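The identity $E(M) = \sum_i 1/(d_i+1)$ can be checked by exhaustive enumeration on a toy graph, taking M to be the number of vertices that precede all their neighbours in a uniformly random ordering: each vertex does so with probability $1/(d_i+1)$, and the vertices kept in this way form an independent set. The graph below is my own small example, not one from the text.

```python
import itertools
import math
from fractions import Fraction

# Toy graph (my example): the path 0-1-2-3-4.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
n = len(adj)

# Keep a vertex iff it precedes all of its neighbours in a random order;
# the kept vertices form an independent set of size M.
total = Fraction(0)
for order in itertools.permutations(range(n)):
    pos = {v: i for i, v in enumerate(order)}
    kept = [v for v in adj if all(pos[v] < pos[u] for u in adj[v])]
    total += len(kept)
exact = total / Fraction(math.factorial(n))

# Linearity of expectation: E(M) = sum_i 1/(d_i + 1).
caro_wei = sum(Fraction(1, len(adj[v]) + 1) for v in adj)
print(exact, caro_wei)
```

For the path, both quantities equal 1/2 + 1/3 + 1/3 + 1/3 + 1/2 = 2, in agreement with the displayed formula.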
and p = n^{-1/3}(log n)^{1/3}. Then a.e. Gp is such that it has at most
½n^{5/3}(log n)^{1/3} edges and every consistent orientation of it has at least
$\binom{n}{2} - 2n^{5/3}(\log n)^{1/3}$
direct implications. In particular, SORT(n, 2, 2) ≤ 3n^{5/3}(log n)^{1/3} if n is
sufficiently large.
Proof We shall assume that n is sufficiently large, so that for the sake
of simplicity the integer part signs can be safely ignored. Let H be
the subgraph of G spanned by the s = n/2 vertices of smallest degree.
Then Δ = Δ(H) ≤ 4m/n = 2m/s. The consistent orientation of G we are
looking for will be such that no vertex of H will be dominated by a
vertex not in H. Thus for x, y ∈ H there is a d-step implication from
x to y in G if and only if there is a d-step implication in H. Hence
it suffices to show that for some consistent orientation H' of H at least
2^{-14}n^{(3d-1)/(d-1)}m^{-d/(d-1)} pairs of vertices of H do not belong to the d-step
closure of H'.
Consider the probability space of all s! orderings of the vertices of
15.2 Sorting in Two Rounds 433
H. We shall estimate from above the probability that there is an (i + 1)-
step implication between a and b, where a and b are given vertices and
1 ≤ i ≤ d - 1.
Let Au be the event that there are exactly u vertices between a and b.
Then
$$P(A_u) = \frac{2\{s - (u + 1)\}}{s(s - 1)}.$$
The probability that a given a - b path of length i + 1 yields an
(i + 1)-step implication, given that Au occurs, is
$$\frac{1}{i!}\,\frac{u(u-1)\cdots(u-i+1)}{(s-2)(s-3)\cdots(s-i-1)} \le \frac{1}{i!}\left(\frac{u}{s-2}\right)^i.$$
Since H contains at most Δ^i paths of length i + 1 from a to b, the
probability of an (i + 1)-step implication between a and b is at most
$$\sum_{u=0}^{s-2} P(A_u)\sum_{i=1}^{d-1}\frac{\Delta^i}{i!}\left(\frac{u}{s-2}\right)^i \le 3\sum_{u=0}^{s-2} P(A_u)\max_{1\le i\le d-1}\left(\frac{\Delta u}{s-2}\right)^i.$$
Let us minimize
$$\sum_{i} x_i^2 \qquad (15.2)$$
subject to the conditions
Now let us order the vertices of G in such a way that every vertex
in Ci precedes every vertex in Ci+1, i = 1, 2, ..., k - 1, and let G' be the
orientation of G induced by this ordering. Then for x, y ∈ Ci, x ≠ y, there
is no implication from x to y, so at most
It is perhaps worth noting that the proof above gives the following
assertion in extremal graph theory. Colour a graph of order n and size m
by the greedy algorithm. Then there are at least (2n³/9m){1 + o(1)} pairs
of vertices of the same colour, provided m/n → ∞ and m/n² → 0.
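The greedy colouring referred to above is easy to state in code. The sketch below is a minimal illustration (the graph parameters are arbitrary stand-ins, and this is not the authors' program): vertices are coloured in a fixed order with the smallest colour absent from their already-coloured neighbours, and the pairs of vertices sharing a colour are then counted.

```python
import random
from collections import Counter

def greedy_colour(n, edges):
    """Colour vertices 0..n-1 in index order with the smallest colour
    not used on an already-coloured neighbour."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v); adj[v].add(u)
    colour = {}
    for v in range(n):
        used = {colour[u] for u in adj[v] if u in colour}
        c = 0
        while c in used:
            c += 1
        colour[v] = c
    return colour

def same_colour_pairs(colour):
    # Number of pairs of vertices receiving the same colour.
    sizes = Counter(colour.values()).values()
    return sum(s * (s - 1) // 2 for s in sizes)

# Illustration on a sparse random graph (n and m chosen arbitrarily; the
# bound 2n^3/9m is asymptotic, so this run is only indicative).
random.seed(0)
n, m = 200, 1000
edges = random.sample([(u, v) for u in range(n) for v in range(u + 1, n)], m)
col = greedy_colour(n, edges)
print(len(set(col.values())), same_colour_pairs(col))
```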
To conclude this section, let us say a few words about explicitly
constructed two-round sorting algorithms. Häggkvist and Hell (1981)
constructed a two-round sorting algorithm of width (13/30)n² - (13/30)n,
based on the Petersen graph and balanced incomplete block designs,
and the explicit algorithm in Theorem 15.1 can be viewed as a two-
round sorting algorithm of width (2/5)n² + O(n^{3/2}). The first constructive
subquadratic two-round sorting algorithm is due to Pippenger (1984);
the algorithm is based on Pippenger's expander graphs. Recently Alon
(1986) used his expander graphs, described in §3 of Chapter 13, to give
an explicit two-round sorting algorithm of width n^{7/4}.
by Ajtai, Komlós and Szemerédi (1983a, b). Our aim in this section is to
describe their solution.
In fact, instead of restricting the depth, in this case it is more natural
to restrict the width to precisely n/2. Indeed, with width n/2 we can use
a matching for the questions in each round, assuming, of course, that n
is even.
One of the best-known sorting algorithms was constructed by
Batcher (1968): it has depth O{(log n)²} and does use matchings in
each round. Valiant (1975a) gave an algorithm of width O(n) and depth
O(log n log log n), thereby coming very close to the information-theoretic
bound. C. P. Kruskal (1983) extended Valiant's algorithm to obtain sort-
ing algorithms of any width w = O(n/log n) and depth O(n log n/w). This
meant that for width O(n/log n) the order of the information-theoretic
lower bound on the size can indeed be attained.
Numerous related results were obtained by Hirschberg (1978), Prep-
arata (1978), Borodin and Cook (1980), Borodin et al. (1981), Reischuk
(1981), Shiloach and Vishkin (1981), Borodin and Hopcroft (1982), Yao
(1982) and Reif and Valiant (1983). Among others, width 0(n) algorithms
were given which, with probability tending to 1, need only 0(log n) time.
Finally Ajtai, Komlós and Szemerédi (1983a, b) managed to get rid of
the unwanted O(log log n) factor. They proved that there is a deterministic
sorting algorithm of depth O(log n) and width n/2 in which, for even n,
the questions in each round are distributed according to a matching.
To state the result precisely, we need some definitions. Let F1, F2, ..., Fl
be partitions of {1, 2, ..., m} into pairs and singletons, i.e. let each Fi
consist of independent edges of the complete graph with vertex set
{1, 2, ..., m}. If m is even, then the Fi's will always be chosen to be
1-factors, i.e. matchings. Denote by ALG[F1, F2, ..., Fl; m] the following
algorithm, used for ordering m elements. Let x1, x2, ..., xm be the elements
to be ordered; place them into registers marked 1, 2, ..., m. In round 1
compare the contents of the registers joined by an edge of F1: if the
larger register contains the smaller element, interchange the elements,
otherwise do not move them. In round 2, compare the current contents
of the registers joined by an edge of F2 and interchange the contents
whenever they are the wrong way round. In round 3 use F3, etc., up to
Fl. Thus an algorithm of the form ALG[F1, F2, ..., Fl; m] is a so-called
comparator network, and one with a rather simple structure. We think
of ALG[F1, F2, ..., Fl; m] as an algorithm which accepts a permutation
of 1, 2, ..., m, namely x1, x2, ..., xm, and produces another of 1, 2, ..., m,
namely the permutation whose ith element is the element in register i at
the end.
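The definition of ALG[F1, F2, ..., Fl; m] translates almost verbatim into code. In the sketch below (function names and the 0-based registers are mine), the network runs an arbitrary sequence of partial matchings; as a concrete choice of factors it uses the classical odd-even transposition network, in which m alternating matchings are known to sort m elements. This is only an illustration of the ALG[...] format, not the Ajtai-Komlós-Szemerédi construction discussed in this section.

```python
def alg(factors, x):
    """Run the comparator network ALG[F1,...,Fl; m] on the list x.
    Each Fi is a set of disjoint register pairs (i, j) with i < j; a
    round compares the registers joined by each pair and swaps their
    contents whenever the larger register holds the smaller element."""
    regs = list(x)
    for F in factors:
        for i, j in F:
            if regs[i] > regs[j]:
                regs[i], regs[j] = regs[j], regs[i]
    return regs

def odd_even_factors(m):
    """The two alternating partial matchings of odd-even transposition
    sort; m rounds suffice to sort m elements (a classical fact)."""
    even = {(i, i + 1) for i in range(0, m - 1, 2)}
    odd = {(i, i + 1) for i in range(1, m - 1, 2)}
    return [even if r % 2 == 0 else odd for r in range(m)]

print(alg(odd_even_factors(8), [5, 2, 7, 0, 6, 1, 4, 3]))
# -> [0, 1, 2, 3, 4, 5, 6, 7]
```

Within a round the pairs are disjoint, so the order in which they are processed does not matter; this is exactly what makes each round a "width n/2" step when the Fi are 1-factors.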
15.3 Sorting with Width n/2 437
Lemma 15.12 For every ε > 0 there is a constant l1 = l1(ε) such that for ev-
ery m ∈ N there is an ε-halving algorithm of the form ALG[F1, F2, ..., Fl1; m].
and
$$|\{i : i \le k,\ \pi(i) > k\}| < \varepsilon m.$$
Lemma 15.13 For every ε > 0 there is an l2 = l2(ε) such that for every
m ∈ N there is an ε-sorting algorithm of the form ALG[F1, F2, ..., Fl2; m].
binary tree Tt of order 2^{t+1} - 1 and height t, t = 0, 1, ..., t0. The 1-factors
in each batch are chosen with the aid of Lemma 15.13.
The partitions associated with the trees Tt depend on a positive constant
A, which will be chosen to be a power of 2; the algorithms associated with
the partitions depend on a positive constant ε as well, which will be chosen
to be considerably smaller than 1/A. Before giving exact definitions, let
us give an intuitive description of the entire algorithm.
Let {xt(i, j) : 0 ≤ i ≤ t, 1 ≤ j ≤ 2^i} be the vertices of Tt, with
xt(i, j), i < t, having two descendants: xt(i + 1, 2j - 1) and xt(i + 1, 2j). Thus
xt(0, 1) is the root of Tt and xt(t, 1), xt(t, 2), ..., xt(t, 2^t) are its endvertices.
What partition of the register set {1, 2, ..., n} shall we associate with
Tt? Say Jt(i, j) is the set corresponding to the vertex xt(i, j). Then the union
of the sets of registers corresponding to the vertices in the branch at
xt(i, j) is a large subset of the interval [(j - 1)2^{-i}n + 1, j2^{-i}n], with
the extremes of the union at Jt(i, j), the apex of the branch, and the
middle of the union at the other endvertices. Where is the rest of the
interval [(j - 1)2^{-i}n + 1, j2^{-i}n]? At the antecedents of the vertex xt(i, j):
at xt(i - 1, ⌈j/2⌉), xt(i - 2, ⌈j/4⌉), etc., which contain even more extreme
elements of the interval than Jt(i, j). Furthermore, each Jt(i, j) is chosen
to be the union of at most two intervals.
For fixed i and j, as t increases, Jt(i, j) becomes smaller and smaller
and, eventually, it becomes empty. Thus more and more registers belong
to sets at higher levels. Eventually, all registers filter down to
the endvertices; the partition corresponding to Tt0 is just the discrete
partition: Jt0(i, j) is empty unless i = t0, and Jt0(t0, j) = {j}, 1 ≤ j ≤
2^{t0} = n.
Putting it another way: the subgraph Ut of Tt spanned by the vertices
xt(i, j) for which Jt(i, j) ≠ ∅ is a forest: with t increasing, eventually
Jt(0, 1) will be empty so Ut will have two components, then Jt(1, 1) and
Jt(1, 2) also become empty and the forest Ut will have four components,
etc. Finally, Ut0 will consist of 2^{t0} = n vertices, namely the endvertices of
Tt0. In Fig. 15.1 we show some of the partitions for n = 128.
To define the sequence of 1-factors associated with a tree, call a
subgraph spanned by a set {xt(i, j), xt(i + 1, 2j - 1), xt(i + 1, 2j)} a fork.
A fork is even if i is even and odd if i is odd. Note that a fork has one
or three vertices; it has only one vertex if either i = -1, in which case
the fork is the root of the tree, or i = t, in which case the fork is an
endvertex. Now for the 1-factors associated with Tt. Take an ε-sorting
algorithm, of the kind guaranteed by Lemma 15.13, on each even fork of
Tt, follow it by an ε-sorting algorithm on each odd fork of Tt, and repeat
this, say, eight times. This means choosing at most 16l2(ε) 1-factors on
the set of registers: this is the sequence of 1-factors we associate with Tt.
Altogether then we shall choose 16l2(ε)(log2 n + 1) = O(log n) 1-factors,
as required by Theorem 15.11.
The effect of the ε-sorting algorithms is that more and more of the elements
to be sorted will belong to the correct set of registers. During the part of
the algorithm corresponding to Tt, each element moves a distance of at
most 16 on Tt, and most elements move towards the set containing
the register in which they have to end up. Of course, movement from one
register to another is possible only if the registers belong to the same
component of Ut, so it is crucial that, at each stage, the registers in each
component of Ut contain, collectively, the right set of elements.
It is not too difficult to show that with an appropriate choice of A
and ε, the algorithm we defined is indeed a sorting algorithm. However,
for the details the reader is referred to Ajtai, Komlós and Szemerédi
(1983a, b).
To conclude this section, let us give the exact partition of [1, n] associ-
ated with Tt for some A = 2^a. Set Xt(i) = ⌊A^i(2A)^{-t-1}n⌋ and
$$Y_t(i) = \sum_{j=1}^{i} X_t(j).$$
[Fig. 15.1: some of the partitions for n = 128, showing sets such as
[1] ∪ [128], [1, 2] ∪ [127, 128], [9, 64] and [65, 120].]
where ni is the number of xj's equal to yi. Then clearly $\sum_{i=1}^{m} n_i = n$.
A bin Bi used in a packing of a list L can be identified with the sublist
Li of L consisting of the elements of L in Bi. We call the multiset Li the
type of Bi. If L = {y1 · n1, y2 · n2, ..., ym · nm}, then we can represent the
type of a bin by a vector T ∈ Z^m, where T = (t1, t2, ..., tm) corresponds
to the bin with exactly ti members equal to yi. In this way, to every
packing of L corresponds a multiset of types of bins used in the packing.
15.4 Bin Packing 443
Proof Note first that for any multiset μ ∈ M(η, m) the number of possible
types of bins is bounded by a function τ(η, m). In fact rather crudely we
may take
$$\tau(\eta, m) = (\lfloor 1/\eta \rfloor + 1)^m,$$
since a bin cannot contain more than ⌊1/η⌋ elements of the same value,
and there are m distinct values.
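The crude bound τ(η, m) = (⌊1/η⌋ + 1)^m can be checked by brute force for tiny instances. In the sketch below (the item values and η are hypothetical stand-ins), a type is any vector (t1, ..., tm) with Σ ti yi ≤ 1, and each ti is at most ⌊1/η⌋ because every yi exceeds η; the empty bin (0, ..., 0) is included in the count.

```python
from itertools import product
from math import floor

def bin_types(values, eta):
    """Enumerate all bin types for item values y_1, ..., y_m, each > eta:
    vectors (t_1,...,t_m) with sum_i t_i * y_i <= 1.  Since each y_i > eta,
    no bin holds more than floor(1/eta) items of any one value."""
    cap = floor(1 / eta)
    return [t for t in product(range(cap + 1), repeat=len(values))
            if sum(ti * yi for ti, yi in zip(t, values)) <= 1]

# Hypothetical instance: m = 2 values, eta = 1/3.
types = bin_types([0.4, 0.35], 1 / 3)
print(len(types))  # at most (floor(1/eta) + 1)^m = 4^2 = 16 types
```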
Write q = q(μ) ≤ τ(η, m) for the actual number of types of bins which
can be used in a packing of a multiset μ ∈ M(η, m). Let T1, T2, ..., Tq be
the vectors representing these types of bins, say Ti = (t1^i, t2^i, ..., tm^i).
If a packing β of μ has bi bins of type Ti, then |β| = Σi bi and
$\sum_{i=1}^{q} b_i t_j^i = n_j$, j = 1, 2, ..., m. We shall call a multiset γ = {T1 · c1, T2 ·
c2, ..., Tq · cq} a loose packing of μ if
$$\sum_{i=1}^{q} c_i t_j^i \ge n_j, \qquad j = 1, 2, \ldots, m.$$
subject to
$$\sum_{i=1}^{q} a_i t_j^i \ge n_j, \qquad j = 1, 2, \ldots, m,$$
and
$$a_i \ge 0, \qquad i = 1, 2, \ldots, q.$$
Clearly P can be solved in time depending only on q. Let (α1, α2, ..., αq)
be a solution.
Then $\sum_{i=1}^{q} \alpha_i \le \sum_i b_i = |\beta|$ for every packing β = (b1, b2, ..., bq) of
μ, and so $\sum_{i=1}^{q} \alpha_i \le \mu^*$.
Furthermore, with αi' = ⌈αi⌉ we have
$$\sum_{i=1}^{q} \alpha_i' \le \sum_{i=1}^{q} \alpha_i + q$$
and
$$\alpha_i' \ge 0, \qquad i = 1, 2, \ldots, q.$$
Hence the multiset {T1 · α1', T2 · α2', ..., Tq · αq'} defines a loose packing of
μ and the number of bins used is at most μ* + q ≤ μ* + τ(η, m). □
Theorem 15.15 For every 0 < ε < 1 there is a bin packing algorithm S
with R_S ≤ 1 + ε, whose running time is at most Cε + Cn log(1/ε), where
Cε depends only on ε and C is an absolute constant.
bins. If L* is sufficiently large, then this number is less than L*(1 + ε).
Suppose now that h > 0. Let xj1 ≤ xj2 ≤ ··· be an increasing
rearrangement of the terms of L greater than η and set
$$L_3 = (y_1^h, y_2^h, \ldots, y_R^h).$$
Then
$$L_3^* \le L^*(1 + \varepsilon),$$
which is Claim 1.
Let us turn to the proof of Claim 2.
Exercises
15.1 Check that Theorem 15.9 implies Theorem 15.4(ii).
15.2 By imitating the proof of Theorem 15.2, prove Theorem 15.6.
15.3 For a tournament T, denote by h(T) the maximal cardinality of a
consistent (i.e. acyclic) set of arcs in T and set f(n) = min{h(T) :
T is a tournament of order n}. Prove that $h(T) - \frac{1}{2}\binom{n}{2} =
O\{n^{3/2}(\log n)^{1/2}\}$ for a.e. tournament T of order n and deduce
that
$$0 \le f(n) - \tfrac{1}{2}\binom{n}{2} = O\{n^{3/2}(\log n)^{1/2}\}.$$
16.1 Connectivity
We know from Theorem 7.3 that if c ∈ R is a constant, p = (log n + c)/n and
M = ⌊(n/2)(log n + c)⌋, then lim_{n→∞} P(Gp is connected) = lim_{n→∞} P(GM
is connected) = e^{-e^{-c}}.
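The column headings of Tables 16.1 and 16.2 below appear to be chosen so that the limiting probability e^{-e^{-c}} is approximately 0.1, 0.2, ..., 0.9, 0.99. A few lines of code reproduce these limiting values (this is my own check, not part of the original computations):

```python
from math import exp

def p_connected_limit(c):
    """Limiting probability that G_p (or G_M) is connected when
    p = (log n + c)/n, from Theorem 7.3: exp(-exp(-c))."""
    return exp(-exp(-c))

# The column headings of Table 16.1:
cs = [-0.83, -0.48, -0.19, 0.0, 0.087, 0.367, 0.672, 1.03, 1.50, 2.25, 4.61]
print([round(p_connected_limit(c), 3) for c in cs])
```

The computed limits agree with the large-n rows of Table 16.1 to within about 0.002, the finite-n values approaching the limit from below.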
Several authors have examined the probability that certain random
graphs with very few vertices are connected (see Fu and Yau, 1962; Na
and Rapoport, 1967). Of course, used indiscriminately, the approxima-
tions hidden in the relations above cannot be expected to be too good
for small values of n (see Fillenbaum and Rapoport, 1971; Rapoport and
Fillenbaum, 1972; Schultz and Hubert, 1973, 1975). However, values of
448 Random Graphs of Small Order
P(n,M) = P(GM is connected) have been directly calculated for small
values of n. Using the formula
$$P(n, M) = 1 - \binom{N}{M}^{-1}\sum_{j=2}^{n}\frac{(-1)^j}{j!}\,{\sum}'\,\frac{n!}{n_1!\,n_2!\cdots n_j!}\binom{\sum_{k=1}^{j}\binom{n_k}{2}}{M},$$
where Σ' denotes the sum over all partitions of n into exactly j parts, Ling
(1973a, b, 1975) computed P(n,M) for all n < 16 and Ling and Killough
(1976) computed P(n,M) for larger values of n. The disadvantages of
this formula are clearly the occurrence of the alternating sign, requiring
numbers to be stored with considerable accuracy, and the sum over all
partitions, using up a lot of CPU time. Nevertheless, Ling and Killough
gave extensive and accurate tables of P(n, M) for n < 100.
For small values of n, Bollobás and Thomason (1985) used the recursive
formula of Gilbert (1959) (see Ex. 1 in Chapter 7) to estimate the
probability of connectedness, while for larger values of n they relied on
the Jordan-Bonferroni inequalities (Theorem 1.10). The results are given
in Table 16.1.
Table 16.2 gives values of P(n, M), where M is the nearest integer to
n(c + log n)/2.
Table 16.1. The probability of Gn,,p being connected for p = (c + log n)/n
n -0.83 -0.48 -0.19 0.000 0.087 0.367 0.672 1.03 1.50 2.25 4.61
10 0.057 0.153 0.266 0.348 0.388 0.514 0.639 0.760 0.869 0.958 0.999
20 0.083 0.192 0.306 0.384 0.421 0.536 0.647 0.753 0.852 0.940 0.998
30 0.094 0.204 0.316 0.391 0.427 0.536 0.643 0.746 0.843 0.932 0.996
40 0.099 0.209 0.319 0.393 0.428 0.535 0.639 0.740 0.837 0.927 0.995
50 0.102 0.211 0.319 0.393 0.427 0.533 0.636 0.736 0.833 0.924 0.995
60 0.103 0.211 0.319 0.392 0.426 0.531 0.633 0.733 0.830 0.921 0.994
70 0.104 0.212 0.319 0.391 0.425 0.529 0.631 0.731 0.827 0.919 0.994
80 0.104 0.212 0.319 0.390 0.424 0.527 0.629 0.729 0.825 0.918 0.994
90 0.105 0.212 0.318 0.389 0.423 0.526 0.628 0.727 0.823 0.916 0.993
100 0.105 0.212 0.318 0.389 0.422 0.525 0.626 0.725 0.822 0.915 0.993
150 0.105 0.211 0.315 0.385 0.419 0.520 0.621 0.720 0.817 0.912 0.992
200 0.105 0.210 0.314 0.383 0.416 0.518 0.618 0.717 0.814 0.910 0.992
250 0.105 0.209 0.312 0.382 0.414 0.516 0.616 0.715 0.812 0.908 0.992
300 0.105 0.208 0.311 0.380 0.413 0.514 0.614 0.713 0.811 0.907 0.991
350 0.104 0.208 0.310 0.379 0.412 0.513 0.613 0.712 0.810 0.907 0.991
400 0.104 0.207 0.310 0.379 0.411 0.512 0.612 0.711 0.809 0.906 0.991
450 0.104 0.207 0.309 0.378 0.410 0.511 0.611 0.710 0.808 0.906 0.991
500 0.104 0.206 0.308 0.377 0.410 0.510 0.610 0.709 0.808 0.905 0.991
550 0.104 0.206 0.308 0.377 0.409 0.510 0.610 0.709 0.807 0.905 0.991
600 0.103 0.206 0.308 0.376 0.409 0.509 0.609 0.708 0.807 0.904 0.991
650 0.103 0.206 0.307 0.376 0.408 0.509 0.609 0.708 0.806 0.904 0.991
700 0.103 0.205 0.307 0.375 0.408 0.508 0.608 0.707 0.806 0.904 0.991
750 0.103 0.205 0.307 0.375 0.408 0.508 0.608 0.707 0.806 0.904 0.991
800 0.103 0.205 0.306 0.375 0.407 0.508 0.607 0.707 0.806 0.904 0.991
850 0.103 0.205 0.306 0.375 0.407 0.507 0.607 0.707 0.805 0.903 0.991
900 0.103 0.205 0.306 0.374 0.407 0.507 0.607 0.706 0.805 0.903 0.991
950 0.103 0.205 0.306 0.374 0.406 0.507 0.607 0.706 0.805 0.903 0.991
1000 0.103 0.204 0.306 0.374 0.406 0.507 0.606 0.706 0.805 0.903 0.991
1500 0.102 0.203 0.304 0.373 0.405 0.505 0.605 0.704 0.804 0.902 0.990
2000 0.102 0.203 0.304 0.372 0.404 0.504 0.604 0.704 0.803 0.902 0.990
2500 0.102 0.202 0.303 0.371 0.403 0.503 0.603 0.703 0.802 0.902 0.990
3000 0.101 0.202 0.303 0.371 0.403 0.503 0.603 0.703 0.802 0.901 0.990
3500 0.101 0.202 0.302 0.370 0.403 0.503 0.603 0.702 0.802 0.901 0.990
4000 0.101 0.202 0.302 0.370 0.402 0.502 0.602 0.702 0.802 0.901 0.990
4500 0.101 0.202 0.302 0.370 0.402 0.502 0.602 0.702 0.802 0.901 0.990
5000 0.101 0.202 0.302 0.370 0.402 0.502 0.602 0.702 0.801 0.901 0.990
6000 0.101 0.201 0.302 0.370 0.402 0.502 0.602 0.702 0.801 0.901 0.990
7000 0.101 0.201 0.301 0.369 0.402 0.502 0.602 0.701 0.801 0.901 0.990
8000 0.101 0.201 0.301 0.369 0.401 0.501 0.601 0.701 0.801 0.901 0.990
9000 0.101 0.201 0.301 0.369 0.401 0.501 0.601 0.701 0.801 0.901 0.990
10,000 0.101 0.201 0.301 0.369 0.401 0.501 0.601 0.701 0.801 0.901 0.990
20,000 0.100 0.201 0.301 0.369 0.401 0.501 0.601 0.701 0.800 0.900 0.990
40,000 0.100 0.200 0.300 0.368 0.400 0.500 0.600 0.700 0.800 0.900 0.990
Table 16.2. The probability of GM being connected for M the nearest
integer to n(c + log n)/2
n -0.83 -0.48 -0.19 0.000 0.087 0.367 0.672 1.03 1.50 2.25 4.61
10 0.000 0.113 0.437 0.575 0.575 0.686 0.838 0.922 0.965 0.994 1.0
20 0.068 0.202 0.365 0.472 0.522 0.654 0.757 0.834 0.914 0.968 0.999
30 0.091 0.225 0.353 0.449 0.480 0.620 0.712 0.801 0.893 0.957 0.998
40 0.089 0.217 0.352 0.443 0.486 0.588 0.691 0.785 0.876 0.948 0.997
50 0.096 0.224 0.345 0.433 0.467 0.579 0.687 0.781 0.863 0.941 0.997
60 0.100 0.228 0.340 0.425 0.453 0.572 0.673 0.770 0.858 0.936 0.996
70 0.104 0.221 0.338 0.422 0.457 0.568 0.664 0.763 0.851 0.932 0.995
80 0.101 0.218 0.339 0.411 0.452 0.558 0.659 0.759 0.846 0.930 0.995
90 0.102 0.219 0.334 0.407 0.443 0.553 0.657 0.753 0.843 0.928 0.995
100 0.104 0.215 0.333 0.406 0.446 0.553 0.652 0.749 0.840 0.926 0.994
150 0.102 0.215 0.327 0.400 0.432 0.537 0.638 0.736 0.830 0.920 0.993
200 0.102 0.213 0.321 0.395 0.429 0.529 0.632 0.730 0.825 0.916 0.993
250 0.104 0.213 0.319 0.390 0.424 0.527 0.627 0.725 0.821 0.913 0.992
300 0.103 0.210 0.318 0.389 0.422 0.525 0.623 0.722 0.819 0.912 0.992
350 0.103 0.210 0.316 0.386 0.418 0.521 0.622 0.721 0.817 0.911 0.992
400 0.103 0.209 0.314 0.384 0.418 0.520 0.620 0.717 0.815 0.909 0.992
450 0.103 0.209 0.313 0.384 0.416 0.518 0.618 0.717 0.814 0.909 0.992
500 0.103 0.209 0.312 0.383 0.416 0.516 0.617 0.715 0.813 0.908 0.991
550 0.103 0.207 0.311 0.381 0.414 0.515 0.615 0.714 0.812 0.907 0.991
600 0.103 0.207 0.310 0.380 0.413 0.514 0.615 0.713 0.811 0.907 0.991
650 0.103 0.207 0.311 0.380 0.412 0.513 0.613 0.713 0.810 0.906 0.991
700 0.103 0.206 0.310 0.379 0.411 0.513 0.613 0.712 0.810 0.906 0.991
750 0.103 0.206 0.309 0.379 0.411 0.512 0.612 0.711 0.809 0.906 0.991
800 0.102 0.206 0.309 0.378 0.411 0.511 0.612 0.711 0.809 0.906 0.991
850 0.102 0.205 0.309 0.378 0.410 0.511 0.611 0.710 0.808 0.905 0.991
900 0.102 0.206 0.309 0.377 0.409 0.511 0.610 0.710 0.808 0.905 0.991
950 0.102 0.206 0.308 0.377 0.409 0.510 0.610 0.710 0.808 0.905 0.991
1000 0.102 0.205 0.308 0.377 0.409 0.510 0.610 0.709 0.807 0.905 0.991
1500 0.102 0.204 0.306 0.374 0.407 0.507 0.607 0.707 0.805 0.903 0.991
2000 0.102 0.203 0.305 0.373 0.405 0.506 0.606 0.705 0.804 0.903 0.991
2500 0.101 0.203 0.304 0.372 0.405 0.505 0.605 0.705 0.804 0.902 0.990
3000 0.101 0.203 0.303 0.372 0.404 0.504 0.604 0.704 0.803 0.902 0.990
3500 0.101 0.202 0.303 0.371 0.404 0.504 0.604 0.703 0.803 0.902 0.990
4000 0.101 0.202 0.303 0.371 0.403 0.503 0.603 0.703 0.802 0.902 0.990
4500 0.101 0.202 0.303 0.371 0.403 0.503 0.603 0.703 0.802 0.901 0.990
5000 0.101 0.202 0.302 0.371 0.403 0.503 0.603 0.703 0.802 0.901 0.990
6000 0.101 0.202 0.302 0.370 0.402 0.503 0.602 0.702 0.802 0.901 0.990
7000 0.101 0.201 0.302 0.370 0.402 0.502 0.602 0.702 0.802 0.901 0.990
8000 0.101 0.201 0.302 0.370 0.402 0.502 0.602 0.702 0.801 0.901 0.990
9000 0.101 0.201 0.301 0.370 0.402 0.502 0.602 0.702 0.801 0.901 0.990
10,000 0.101 0.201 0.301 0.369 0.402 0.502 0.602 0.701 0.801 0.901 0.990
20,000 0.100 0.201 0.301 0.369 0.401 0.501 0.601 0.701 0.801 0.900 0.990
40,000 0.100 0.200 0.300 0.368 0.401 0.501 0.601 0.700 0.800 0.900 0.990
45,000 0.100 0.200 0.300 0.368 0.400 0.500 0.600 0.700 0.800 0.900 0.990
16.3 Colouring 451
distribution of β0(Gp). Clearly
$$E(X_k) = \binom{n}{k} q^{\binom{k}{2}} \qquad (16.1)$$
and
$$\sigma^2(X_k) = \sum_{j=0}^{k} \binom{n}{k}\binom{k}{j}\binom{n-k}{k-j} q^{2\binom{k}{2}-\binom{j}{2}} - E(X_k)^2,$$
and by Chebyshev's inequality
$$P(X_k = 0) \le \sigma^2(X_k)/E(X_k)^2 = U(X_k). \qquad (16.2)$$
Table 16.3 shows the information one can derive from (16.1) and (16.2)
in the case p = ½ for all those values of k for which E(Xk) ≥ 0.01 and
U(Xk) ≥ 0.01.
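Assuming (16.1) reads E(Xk) = C(n, k) q^C(k,2) with q = ½, which is consistent with the entries of Table 16.3, individual table entries can be reproduced directly:

```python
from math import comb

def expected_independent_k_sets(n, k, q=0.5):
    """E(X_k) = C(n,k) * q^C(k,2): expected number of independent k-sets
    in G_{n,p} with q = 1 - p (here p = 1/2, so q = 1/2)."""
    return comb(n, k) * q ** comb(k, 2)

# A few entries of Table 16.3 recomputed this way (values agree with the
# printed 117.13, 184.16 and 27.68 up to rounding):
for n, k in [(40, 6), (60, 7), (100, 9)]:
    print(n, k, round(expected_independent_k_sets(n, k), 2))
```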
For large values of n (5000 and 10,000) let us consider four algorithms
on random graphs G_{n,c/n}, c = 3 and 4, searching for large independent
sets. All four algorithms choose vertices one by one. The first algorithm,
called hammer, always chooses an available vertex of minimum degree. In
the other three algorithms the vertices are ordered at the beginning, and
at each stage we choose the first available vertex. In simple greedy the
natural order is taken, in low first (greedy) we choose an order in which
the degrees are non-decreasing and in high first (greedy) we choose an
order in which the degrees are non-increasing. The outcomes are shown
in Table 16.4.
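The four heuristics can be sketched as follows. This is a toy reimplementation with made-up parameters (n = 500 rather than 5000 or 10,000), not the code behind Table 16.4; "available" means not yet chosen and not adjacent to a chosen vertex.

```python
import random

def independent_set(adj, order_key=None, adaptive=False):
    """Greedy independent set.  With adaptive=True ('hammer') the next
    vertex is an available one of minimum current degree; otherwise the
    vertices are scanned once in the order given by order_key."""
    n = len(adj)
    available = set(range(n))
    chosen = []
    if adaptive:
        while available:
            v = min(available, key=lambda u: len(adj[u] & available))
            chosen.append(v)
            available -= adj[v] | {v}
    else:
        for v in sorted(range(n), key=order_key):
            if v in available:
                chosen.append(v)
                available -= adj[v] | {v}
    return chosen

# Sparse random graph G_{n,c/n} (small stand-in parameters).
random.seed(1)
n, c = 500, 3
adj = [set() for _ in range(n)]
for u in range(n):
    for v in range(u + 1, n):
        if random.random() < c / n:
            adj[u].add(v); adj[v].add(u)

deg = [len(adj[v]) for v in range(n)]
results = {
    'hammer':        independent_set(adj, adaptive=True),
    'simple greedy': independent_set(adj, order_key=lambda v: v),
    'low first':     independent_set(adj, order_key=lambda v: deg[v]),
    'high first':    independent_set(adj, order_key=lambda v: -deg[v]),
}
print({name: len(s) for name, s in results.items()})
```

Each strategy returns a maximal independent set; only their sizes differ, which is what Table 16.4 compares.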
16.3 Colouring
Graph-colouring algorithms have been constructed and their proba-
ble behaviour analysed by numerous authors, including Brélaz (1979),
Christofides (1971), Corneil and Graham (1973), Kučera (1977) and
Lawler (1970). Algorithms colouring random graphs which are just about
manageable on available machines were constructed by numerous au-
thors, including Johnson (1974), Garey and Johnson (1976), Leighton
(1979), Johri and Matula (1982) and Bollobás and Thomason (1985).
The standard test for the efficiency of the algorithm is usually a Gn,p with
n = 1000 and p = ½. Improving on previous results, Johri and Matula
used an average of 95.9 colours; this was further improved by Bollobás
and Thomason to an average of 86.9 colours.
For a r.g. Gp with n = 1000 and p = ½, with large probability we have
80 < χ(G) < 126.
Table 16.3. Values of E(Xk) and U(Xk) for p = ½
n k E(Xk) U(Xk)
40 5 42.58 0.181
40 6 117.13 0.601
40 7 8.89 2.577
40 8 0.29 23.502
60 6 1527.83 0.198
60 7 184.16 0.589
60 8 9.53 2.504
60 9 0.22 27.810
80 7 1514.78 0.250
80 8 107.99 0.746
80 9 3.37 4.079
80 10 0.05 84.320
100 7 7633.00 0.138
100 8 693.23 0.346
100 9 27.68 1.224
100 10 0.49 12.818
200 9 17,104.98 0.105
200 10 638.10 0.236
200 11 10.76 1.103
200 12 0.08 37.313
400 11 25,386.89 0.057
400 12 401.84 0.119
400 13 2.93 1.359
600 12 55,133.78 0.033
600 13 608.83 0.063
600 14 3.12 0.967
800 12 > 105 0.017
800 13 26,484.02 0.026
800 14 181.74 0.059
800 15 0.58 3.591
1000 13 > 105 0.015
1000 14 4228.26 0.022
1000 15 16.96 0.178
1000 16 0.03 50.685
2000 17 3.95 0.434
3000 18 5.04 0.305
4000 19 0.72 1.847
5000 19 50.62 0.031
5000 20 0.02 50.261
$$E(k) = \sum \frac{n!}{\prod_{i=1}^{k} a_i!\ \prod_{j} n_j!}\ 2^{-\sum_{i=1}^{k}\binom{a_i}{2}},$$
where the sum is over all sequences (ai) of length k with Σ ai = n, and
nj denotes the number of ai equal to j.
Table 16.5. Times (minutes : seconds) and colours used on 10 graphs
Graph number  1      2      3      4      5
Time taken    48:38  51:56  54:58  49:11  55:23
Colours used  87     86     87     87     87
Graph number  6      7      8      9      10
Time taken    43:45  50:16  55:31  56:46  67:40
Colours used  87     87     87     87     87
Mean time: 53:23, mean colours used: 86.9.
Table 16.6. The diameter and girth of a random sample of s cubic graphs
of order n
could find the actual largest independent set at each stage. For lack of
space, we shall not describe the various refinements incorporated into
this basic strategy.
The results of colouring 10 graphs by one of the algorithms used by
Bollobás and Thomason on an IBM 3081 (5 MIPS), with a program
written in FORTRAN, are given in Table 16.5.
References
Bermond, J.-C. and Bollobás, B. (1982). The diameter of graphs - a survey, Proc.
Twelfth Southeastern Conf. on Combinatorics, Graph Theory and Computing,
Congressus Numerantium 32, 3-27.
Bermond, J.-C., Delorme, C. and Farhi, G. (1982). Large graphs with given
degree and diameter III. In Graph Theory (Bollobás, B., Ed.). Annls Discrete
Math. 13, 23-32.
Bermond, J.-C., Delorme, C. and Farhi, G. (1984). Large graphs with given
degree and diameter II, J. Combinatorial Theory (B) 36, 32-48.
Bermond, J.-C., Bond, J., Paoli, M. and Peyrat, C. (1983). Graphs and
intercommunication networks: diameter and vulnerability. In Surveys in
Combinatorics (Lloyd, E. K., Ed.). Lond. Math. Soc. Lecture Notes, 82,
pp. 1-30.
Bernstein, A. J. (1967). Maximally connected arrays on the n-cube, SIAM J.
Appl. Math. 15, 1485-1489.
Best, M. R. (1970). The distribution of some variables on symmetric groups,
Indag. Math. 32, 385-402.
Biggs, N. L. (1974). Algebraic Graph Theory. Cambridge Tracts in Math. 67,
Cambridge University Press, vii + 170pp.
Bilardi, G. and Preparata, F. P. (1984a). Minimum area VLSI network for
O(logn) time sorting, Proc. 16th Annual ACM Symp. on Theory of
Computing, Washington, D.C., 64-70.
Bilardi, G. and Preparata, F. P. (1984b). The VLSI optimality of the AKS
sorting network. Coordinated Science Laboratory Report R-1008,
University of Illinois, Urbana.
Birnbaum, Z. W., Esary, J. D. and Saunders, S. C. (1961). Multicomponent
systems and structures and their reliability, Technometrics 3, 55-57.
Bixby, R. (1975). The minimum number of edges and vertices in a graph with
edge connectivity N and MN-bounds, Networks 5, 253-298.
Blass, A., Exoo, G. and Harary, F. (1981). Paley graphs satisfy all first-order
adjacency axioms, J. Graph Theory 5, 435-439.
Bloemena, A. R. (1964). Sampling from a Graph. Math. Centre Tracts,
Amsterdam.
Blum, M., Pratt, V., Tarjan, R. E., Floyd, R. W. and Rivest, R. L. (1973). Time
bounds for selection, J. Computer, Syst. Sci. 7, 448-461.
Boesch, F. T. and Felzer, A. (1972). A general class of invulnerable graphs,
Networks 2, 261-283.
Boesch, F. T., Harary, F. and Kabell, J. A. (1981). Graphs as models of
communication network vulnerability: connectivity and persistence,
Networks 11, 57-63.
Bollobás, B. (1978a). Extremal Graph Theory. Academic Press, London, New
York, San Francisco, xx + 488pp.
Bollobás, B. (1978b). Chromatic number, girth and maximal degree, Discrete
Math. 24, 311-314.
Bollobás, B. (1979a). Graph Theory: An Introductory Course. Graduate Texts in
Mathematics, Springer-Verlag, New York, Heidelberg, Berlin, x + 180pp.
Bollobás, B. (1979b). A probabilistic proof of an asymptotic formula for the
number of labelled regular graphs, Preprint Series, Matematisk Institut,
Aarhus Universitet.
Bollobás, B. (1979c). Graphs with equal edge-connectivity and minimum degree,
Discrete Math. 28, 321-323.
Bollobás, B. (1980a). The distribution of the maximum degree of a random
Cooper, C., Frieze, A., Molloy, M. and Reed, B. (1996). Perfect matchings in
random r-regular, s-uniform hypergraphs, Combinatorics, Prob. Computing
5, 1-14.
Coppersmith, D. and Sorkin, G. (1999). Constructive bounds and exact
expectations for the random assignment problem, Random Structures and
Algorithms 15, 113-144.
Coppersmith, D. and Sorkin, G. B. (2001). On the expected incremental cost of
a minimum assignment, Contemporary Combinatorics, to appear.
Corneil, D. G. and Graham, B. (1973). An algorithm for determining the
chromatic number of a graph, SIAM J. Comput. 2, 311-318.
Cornuejols, G. (1981). Degree sequences of random graphs, unpublished preprint.
Damerell, R. M. (1973). On Moore graphs, Math. Proc. Camb. Phil. Soc. 74,
227-236.
Delorme, C. (1985). Large bipartite graphs with given degree and diameter,
J. Graph Theory.
Delorme, C. and Farhi, G. (1984). Large graphs with given degree and diameter,
IEEE Trans. Comput. 33, 857-860.
Donath, W. E. (1969). Algorithm and average-value bounds for assignment
problems, IBM J. Res. Dev. 13, 380-386.
Doyle, J. and Rivest, R. (1976). Linear expected time of a simple union-find
algorithm, Info. Process. Lett. 5, 146-148.
Dvoretzky, A. and Erdős, P. (1959). Divergence of random power series,
Michigan Math. Journal 6, 343-347.
Dyer, M., and Frieze, A. (1991). Randomized greedy matching, Random
Structures, Algorithms 2, 29-45.
Dyer, M., Frieze, A. and Jerrum, M. (1998). Approximately counting Hamilton
paths and cycles in dense graphs, SIAM J. Comput. 27, 1262-1272.
Dyer, M. E., Frieze, A. M. and McDiarmid, C. J. H. (1986). On linear programs
with random costs, Math. Program. 35, 3-16.
Dyer, M., Frieze, A. and Pittel, B. (1993). The average performance of the
greedy matching algorithm, Ann. of Appl. Prob. 3, 526-552.
Ebert, W. and Hafner, R. (1971). Die asymptotische Verteilung von
Koinzidenzen, Z. Wahrscheinlichkeitstheorie verw. Gebiete 18, 322-332.
Edwards, C. S. and Elphick, C. H. (1983). Lower bounds for the clique and the
chromatic numbers of a graph, Discrete Appl. Math. 5, 51-64.
Egorychev, G. P. (1980a). New formulae for the permanent, Dokl. Akad. Nauk
SSSR 254, 784-787 (in Russian).
Egorychev, G. P. (1980b). A solution of a problem of van der Waerden for
permanents, Inst. Fiziki im. L.V. Kirenskogo, S.S.S.R. Akad. Nauk, Siberian
Branch, preprint IFSO-13M, Krasnoyarsk (in Russian).
Eigen, M. and Schuster, P. (1979). The Hypercycle: A Principle of Natural
Selforganization, Springer-Verlag, Heidelberg.
Elspas, B. (1964). Topological constraints on interconnection limited logic,
Switching Circuit Theory, Logical Design 5, 133-147.
Epihin, V. V. (1972). An estimate of the probability of connectedness of a graph,
Systems for the Distribution of Information. Izdat. Nauka, Moscow,
pp. 124-127 (in Russian).
Erdős, P. (1945). On a lemma of Littlewood and Offord, Bull. Amer. Math. Soc.
51, 898-902.
Erdős, P. (1947). Some remarks on the theory of graphs, Bull. Amer. Math. Soc.
53, 292-294.
Fajtlowicz, S. (1977). The independence ratio for cubic graphs, Proc. Eighth
South-eastern Conf. on Combinatorics, Graph Theory and Computing,
Utilitas Math., Winnipeg, pp. 273-277.
Fajtlowicz, S. (1978). On the size of independent sets in graphs, Proc. Ninth
South-eastern Conf. on Combinatorics, Graph Theory and Computing,
Utilitas Math., Winnipeg, pp. 269-274.
Falikman, D. I. (1981). A proof of van der Waerden's conjecture on the
permanent of a doubly stochastic matrix, Mat. Zametki 29, 931-938 (in
Russian).
Fararo, T. J. and Sunshine, M. (1964). A Study of a Biased Friendship Net,
Syracuse Univ. Press, Syracuse, New York.
Feller, W. (1945). The fundamental limit theorems in probability, Bull. Amer.
Math. Soc. 51, 800-832.
Feller, W. (1966). An Introduction to Probability Theory and its Applications, vols
I and II. John Wiley and Sons, New York, London, Sydney.
Fenner, T. I. and Frieze, A. M. (1982). On the connectivity of random
m-orientable graphs and digraphs, Combinatorica 2, 347-359.
Fenner, T. I. and Frieze, A. M. (1983). On the existence of Hamiltonian cycles in
a class of random graphs, Discrete Math. 45, 301-305.
Fenner, T. I. and Frieze, A. M. (1984). Hamiltonian cycles in random regular
graphs, J. Combinatorial Theory (B) 37, 103-112.
Fillenbaum, S. and Rapoport, A. (1971). Subjects in the Subjective Lexicon.
Academic Press, New York.
Filotti, I. S., Miller, G. L. and Reif, J. (1979). On determining the genus of a
graph in O(v^O(g)) steps, Proc. Eleventh Annual ACM Symp. on Theory of
Computing, Atlanta, Georgia, pp. 27-37.
Flajolet, P., Knuth, D. E. and Pittel, B. (1989). The first cycles in an evolving
graph, Discrete Math. 75, 167-215.
Folkert, J. E. (1955). The distribution of the number of components of a
random mapping function, Ph.D. Dissertation, Michigan State University.
Ford, G. W. and Uhlenbeck, G. E. (1957). Combinatorial problems in the theory
of graphs, Proc. Natn. Acad. Sci. U.S.A. 43, 163-167.
Frank, H. and Frisch, I. (1971). Communication, Transmission and Transportation
Networks. Addison-Wesley, Reading, Massachusetts.
Frank, H., Kahn, R. E. and Kleinrock, L. (1972). Computer communication
network design: experience with theory and practice, AFIPS Conf. Proc.
40, 255-270.
Frank, O. (1977). Survey sampling in graphs, J. Statist. Plan. Inf. 1, 235-264.
Frank, O. (1978). Estimation of the number of connected components in a
graph by using a sampled subgraph, Scand. J. Statist. 5, 177-188.
Frank, O. (1979a). Estimating a graph from triad counts, J. Statist. Comput.
Simul. 9, 31-46.
Frank, O. (1979b). Moment properties of subgraph counts in stochastic graphs,
Annls N.Y. Acad. Sci. 319, 207-218.
Frank, O. (1981). A survey of statistical methods for graph analysis. In
Sociological Methodology 1981 (Leinhardt, S., Ed.). Jossey-Bass, San
Francisco, pp. 110-115.
Frank, O. and Harary, F. (1980a). Balance in stochastic signed graphs, Social
Networks 2, 155-163.
Frank, O. and Harary, F. (1980b). Maximum triad counts in graphs and
digraphs, J. Combinatorics, Info. Systems Sci. 5, 1-9.
Luczak, T. and Wierman, J. C. (1989). The chromatic number of random graphs
at the double-jump threshold, Combinatorica 9, 39-49.
Lueker, G. S. (1981). Optimisation problems on graphs with independent
random edge weights, SIAM J. Comput. 10, 338-351.
Luks, E. M. (1980). Isomorphism of graphs of bounded valence can be tested in
polynomial time, 21st Annual IEEE Symp. on Foundations of Computer
Science, pp. 42-49.
McDiarmid, C. J. H. (1979a). Colouring random graphs badly, Graph Theory
and Combinatorics (Wilson, R. J., Ed.). Pitman Research Notes in
Mathematics, vol. 34, 76-86.
McDiarmid, C. J. H. (1979b). Determining the chromatic number of a graph,
SIAM J. Comput. 8, 1-14.
McDiarmid, C. J. H. (1980a). Clutter percolation and random graphs.
Combinatorial Optimization (Rayward-Smith, V. J., Ed.). Mathematical
Programming Study, vol. 13. North-Holland, Amsterdam, New York,
Oxford, pp. 17-25.
McDiarmid, C. J. H. (1980b). Percolation on subsets of the square lattice, J.
Appl. Probab. 11, 278-283.
McDiarmid, C. J. H. (1981). General percolation and random graphs, Adv. Appl.
Probab. 13, 40-60.
McDiarmid, C. J. H. (1982). Achromatic numbers of random graphs, Math.
Proc. Camb. Phil. Soc. 92, 21-28.
McDiarmid, C. J. H. (1983a). On the chromatic forcing number of a random
graph, Discrete Appl. Math. 5, 123-132.
McDiarmid, C. J. H. (1983b). General first-passage percolation, Adv. Appl.
Probab. 15, 149-161.
McDiarmid, C. (1990). On the chromatic number of random graphs, Random
Structures and Algorithms 4, 435-442.
McKay, B. D. and Wormald, N. C. (1984). Automorphisms of random graphs
with specified vertices, Combinatorica 4, 325-338.
McKay, B. D. and Wormald, N. C. (1990a). Uniform generation of random
regular graphs of moderate degree, J. Algorithms 11, 52-67.
McKay, B. D. and Wormald, N. C. (1990b). Asymptotic enumeration by degree
sequence of graphs of high degree, Europ. J. Combinatorics 11, 565-580.
McKay, B. D. and Wormald, N. C. (1991). Asymptotic enumeration by degree
sequence of graphs with degrees o(n^{1/2}), Combinatorica 11, 369-382.
McKay, B. D. and Wormald, N. C. (1997). The degree sequence of a random
graph. I. The models, Random Structures and Algorithms 11, 97-117.
McLeish, D. L. (1979). Central limit theorems for absolute deviations from the
sample mean and applications, Canad. Math. Bull. 22, 391-396.
Mader, W. (1967). Homomorphieeigenschaften und mittlere Kantendichte von
Graphen, Math. Annln 174, 265-268.
Maehara, H. (1980). On random simplices in product distributions, J. Appl.
Probab. 17, 553-558.
Maehara, H. (1985). On the number of induced subgraphs of a random graph,
Discrete Math.
Margulis, G. A. (1973). Explicit constructions of concentrators, Problemy
Peredachi Informatsii 9(4), 71-80 (in Russian). English transl. in Problems
Info. Transmission 9 (1975), 325-332. Plenum Press, New York.
Margulis, G. A. (1974). Probabilistic characteristics of graphs with large
Pólya, G. (1918). Über die Verteilung der quadratischen Reste und Nichtreste,
Nachr. Akad. Wiss. Göttingen Math. Phys., 21-29.
Pólya, G. (1937). Kombinatorische Anzahlbestimmungen für Gruppen, Graphen
und chemische Verbindungen, Acta Math. 68, 145-254.
Pósa, L. (1976). Hamiltonian circuits in random graphs, Discrete Math. 14,
359-364.
Pratt, V. R. and Yao, F. F. (1973). On lower bounds for computing the ith
largest element, Proc. 14th Annual IEEE Symp. on Switching and Automata
Theory, IEEE Comput. Soc., Northridge, California, pp. 70-81.
Preparata, F. P. (1978). New parallel-sorting schemes, IEEE Trans. Comput.
C-27, 669-673.
Preparata, F. P. and Vuillemin, J. (1979). The cube-connected cycles: a versatile
network for parallel computation, Proc. 20th Annual IEEE Symp. on
Foundations of Computer Science, pp. 140-147.
Proskurin, G. V. (1973). On the distribution of the number of vertices in the
strata of a random mapping, Theory Probab. Applics 18, 803-808.
Provan, J. S. and Ball, M. O. (1981). The complexity of counting cuts and
computing the probability that a graph is connected. Working Paper MS/S
#81-002, College of Business and Management, University of Maryland at
College Park.
Quintas, L. V., Schiano, M. D. and Yarmish, J. (1980). Degree partition matrices
for identity 4-trees. Mathematics Department, Pace University, New York.
Quintas, L. V., Stehlik, M. and Yarmish, J. (1979). Degree partition matrices for
4-trees having specified properties, Proc. Tenth Southeastern Conf. on
Combinatorics, Graph Theory and Computing, Congressus Numerantium
XXIV, Utilitas Math., Winnipeg, Manitoba.
Quintas, L. V. and Yarmish, J. (1981). Valence isomers for chemical trees,
MATCH 12, 75-86.
Rado, R. (1964). Universal graphs and universal functions, Acta Arith. 9,
331-340.
Ramanujacharyulu, C. (1966). On colouring a polygon and restricted random
graphs. In Theory of Graphs, International Symposium, Rome, 1966, Gordon
and Breach, New York, Dunod, Paris (1967), pp. 333-337.
Ramsey, F. P. (1930). On a problem of formal logic, Proc. Lond. Math. Soc. (2)
30, 264-286.
Rapoport, A. and Fillenbaum, S. (1972). An experimental study of semantic
structures. In Multidimensional Scaling (Romney, A. K., Shephard, R. N.
and Nerlove, S. B., Eds), vol. II. Seminar Press, New York, pp. 93-131.
Read, R. C. (1959). The enumeration of locally restricted graphs (I), J. Lond.
Math. Soc. 34, 417-436.
Read, R. C. (1960). The enumeration of locally restricted graphs (II), J. Lond.
Math. Soc. 35, 344-351.
Read, R. C. and Corneil, D. G. (1977). The graph isomorphism disease, J. Graph
Theory 1, 339-363.
Recski, A. (1979). Some remarks on the arboricity of tree-complements, Proc.
Fac. Sci. Tokai Univ. 15, 71-74.
Reed, B. and McDiarmid, C. J. H. (1992). The strongly connected components
of 1-in 1-out, Combinatorics, Probability and Computing 1, 265-274.
Reif, J. H. and Valiant, L. G. (1983). A logarithmic time sort for linear size
networks, Proc. 15th ACM Symp. on Theory of Computing, pp. 10-16.
Reiman, I. (1958). Über ein Problem von K. Zarankiewicz, Acta Math. Acad.
ISBN 0-521-79722-5