11 - Number of Isolated Vertices - Concentration - Size Bias Coupling PDF
Abstract
Let Y be a nonnegative random variable with mean µ and let Y^s, defined on the same space as Y, have the Y-size biased distribution, that is, the distribution characterized by
\[ E[Y f(Y)] = \mu E[f(Y^s)] \quad \text{for all functions } f \text{ for which these expectations exist.} \]
The size bias coupling of Y to Y^s can be used to obtain the following concentration of measure result when Y counts the number of isolated vertices in an Erdős-Rényi random graph model on n vertices with edge probability p. With σ² denoting the variance of Y,
\[ P\left(\frac{Y-\mu}{\sigma} \ge t\right) \le \inf_{\theta \ge 0} \exp(-\theta t + H(\theta)) \quad \text{where} \quad H(\theta) = \frac{\mu}{2\sigma^2} \int_0^\theta s\,\gamma_s\,ds \]
with
\[ \gamma_s = 2e^{2s}\left(1 + \frac{pe^s}{1-p}\right)^n + (1-p)^{-n} + 1. \]
Left tail inequalities may be obtained in a similar fashion. When np → c for some constant c ∈ (0, ∞) as n → ∞, the bound is of order at most e^{−kt} for some positive constant k.
∗ Department of Mathematics, University of Southern California, Los Angeles, CA 90089, USA; subhankg@usc.edu and larry@usc.edu
2000 Mathematics Subject Classification: Primary 60E15; Secondary 60C05, 60D05.
Keywords: Large deviations; graph degree; size biased couplings
counting the number of vertices v of K with degree d(v) = d for some fixed d. In this paper we derive upper
bounds, for fixed n, on the distribution function of the number of isolated vertices of K, that is, (2) for the
case d = 0 where Y counts the number of vertices having no incident edges.
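To make the object of study concrete, the count Y is easy to simulate. Below is a minimal Python sketch (function and variable names are ours, not from the paper) that samples an Erdős-Rényi graph and counts its isolated vertices.

```python
import random

def count_isolated(n, p, rng):
    """Sample G(n, p) -- each of the C(n, 2) possible edges present
    independently with probability p -- and count degree-0 vertices."""
    degree = [0] * n
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                degree[u] += 1
                degree[v] += 1
    return sum(1 for d in degree if d == 0)

rng = random.Random(0)
# Sanity checks: p = 0 leaves every vertex isolated; p = 1 leaves none.
assert count_isolated(10, 0.0, rng) == 10
assert count_isolated(10, 1.0, rng) == 0
```

Averaging `count_isolated` over many independent samples approximates µ = EY.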
For d in general, and p = p_n depending on n, previously in [7] the asymptotic normality of Y was shown when n^{(d+1)/d} p_n → ∞ and np_n → 0, or np_n → ∞ and np_n − log n − d log log n → −∞; see also [10] and [3]. Asymptotic normality of Y when np_n → c > 0 was obtained in [2]. The size bias coupling considered here
was used in [6] to study the rate of convergence to the multivariate normal distribution for a vector whose
components count the number of vertices of some fixed degrees. In [8], the mean µ and variance σ² of Y for the particular case d = 0 are computed in closed form. In the same paper, Kolmogorov distance bounds to the normal were obtained and asymptotic normality shown.
O’Connell [9] showed that an asymptotic large deviation principle holds for Y . Raič [11] obtained nonuniform large deviation bounds in some generality, for random variables W with E(W ) = 0 and Var(W ) = 1, of the form
\[ \frac{P(W \ge t)}{1 - \Phi(t)} \le e^{t^3\beta(t)/6}\left(1 + Q(t)\beta(t)\right) \quad \text{for all } t \ge 0 \tag{4} \]
where Φ(t) denotes the distribution function of a standard normal variate and Q(t) is a quadratic in t.
Although in general the expression for β(t) is not simple, when W equals Y properly standardized and np → c as n → ∞, (4) holds for all n sufficiently large with
\[ \beta(t) = \frac{C_1}{\sqrt{n}} \exp\left(\frac{C_2 t}{\sqrt{n}} + C_3\left(e^{C_4 t/\sqrt{n}} - 1\right)\right) \]
for some constants C1, C2, C3 and C4. For t of order n^{1/2}, for instance, the function β(t) will be small as n → ∞, allowing an approximation of the deviation probability P(W ≥ t) by the normal, to within some factors.
Theorem 1.1 below, by contrast, produces a non-asymptotic, explicit bound, that is, it does not require any relation between n and p and is satisfied for all n. Moreover, by (9) and (7), the bound is of order e^{−λt²} over some range of t, and of worst case order e^{−ρt} for the right tail and e^{−ηt²} for the left tail, where λ, ρ and η are explicit, with the bound holding for all t ∈ R.
Theorem 1.1. For n ∈ {1, 2, . . .} and p ∈ (0, 1) let K denote the random graph on n vertices where each edge is present with probability p, independently of all other edges, and let Y denote the number of isolated vertices in K. Then for all t > 0,
\[ P\left(\frac{Y-\mu}{\sigma} \ge t\right) \le \inf_{\theta \ge 0} \exp(-\theta t + H(\theta)) \quad \text{where} \quad H(\theta) = \frac{\mu}{2\sigma^2} \int_0^\theta s\,\gamma_s\,ds, \tag{5} \]
Recall that for a nonnegative random variable Y with finite, nonzero mean µ, the size bias distribution of Y is given by the law of a variable Y^s satisfying
\[ E[Y f(Y)] = \mu E[f(Y^s)] \tag{8} \]
for all f for which the expectations above exist. The main tool used in proving Theorem 1.1 is size bias coupling, that is, constructing Y and Y^s, having the Y-size biased distribution, on the same space. In [4]
and [5], size bias couplings were used to prove concentration of measure inequalities when |Y^s − Y| can be almost surely bounded by a constant independent of the problem size, the number of vertices in the present case. Here, when Y is the number of isolated vertices of K, we consider a coupling of Y to Y^s with the Y-size bias distribution where this boundedness condition is violated. Unlike the theorem used in [4] and [5], which can be applied to a wide variety of situations under a bounded coupling assumption, it seems that cases where the coupling is unbounded, such as the one we consider here, need application specific treatment, and cannot be handled by one single general result.
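For intuition about the characterizing identity E[Y f(Y)] = µE[f(Y^s)], it can be checked exactly in a case where the size biased law is available in closed form: if Y is Poisson(λ), then Y^s has the law of Y + 1 (a standard fact, used here purely as an illustration; the code and names are ours).

```python
from math import exp, factorial

lam = 2.3
pmf = lambda k: exp(-lam) * lam**k / factorial(k)  # Poisson(lam) pmf

def E_Yf(f, K=60):
    """E[Y f(Y)] for Y ~ Poisson(lam), by (truncated) direct summation."""
    return sum(k * f(k) * pmf(k) for k in range(K))

def mu_E_f_size_biased(f, K=60):
    """mu * E[f(Y^s)], using that Y^s has the law of 1 + Poisson(lam)."""
    return lam * sum(f(j + 1) * pmf(j) for j in range(K))

f = lambda y: y**2 + 1.0
assert abs(E_Yf(f) - mu_E_f_size_biased(f)) < 1e-9
```

The two sums agree term by term since k·λ^k/k! = λ·λ^{k−1}/(k−1)!, which is exactly the size bias relation.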
Remark 1.1. Useful bounds for the minimization in (5) may be obtained by restricting to θ ∈ [0, θ0] for some θ0. In this case, as γ_s is an increasing function of s, we have
\[ H(\theta) \le \frac{\mu}{4\sigma^2}\,\gamma_{\theta_0}\theta^2 \quad \text{for } \theta \in [0, \theta_0]. \]
The quadratic −θt + µγ_{θ0}θ²/(4σ²) in θ is minimized at θ = 2tσ²/(µγ_{θ0}). When this value falls in [0, θ0] we obtain the first bound in (9), and setting θ = θ0 yields the second, thus,
\[ P\left(\frac{Y-\mu}{\sigma} \ge t\right) \le \begin{cases} \exp\left(-\dfrac{t^2\sigma^2}{\mu\gamma_{\theta_0}}\right) & \text{for } t \in [0, \theta_0\mu\gamma_{\theta_0}/(2\sigma^2)] \\[2ex] \exp\left(-\theta_0 t + \dfrac{\mu\gamma_{\theta_0}\theta_0^2}{4\sigma^2}\right) & \text{for } t \in (\theta_0\mu\gamma_{\theta_0}/(2\sigma^2), \infty). \end{cases} \tag{9} \]
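Remark 1.1 is explicit enough to evaluate numerically. The sketch below is ours: it assumes the standard closed forms µ = n(1−p)^{n−1} and σ² = µ + n(n−1)(1−p)^{2n−3} − µ² for the isolated-vertex count (the paper's displays for these are not reproduced in this excerpt), together with γ_s as in the theorem and an arbitrary choice of θ0.

```python
from math import exp

def bound_right_tail(n, p, t, theta0=0.5):
    """Two-case right tail bound (9) from Remark 1.1.
    mu and sigma^2 below are the standard closed forms for the
    isolated-vertex count (an assumption of this sketch)."""
    q = 1.0 - p
    mu = n * q**(n - 1)
    sigma2 = mu + n * (n - 1) * q**(2 * n - 3) - mu * mu
    def gamma(s):
        return 2.0 * exp(2 * s) * (1.0 + p * exp(s) / q)**n + q**(-n) + 1.0
    g0 = gamma(theta0)
    cut = theta0 * mu * g0 / (2.0 * sigma2)
    if t <= cut:
        # optimal theta = 2 t sigma^2 / (mu gamma_{theta0}) lies in [0, theta0]
        return exp(-t * t * sigma2 / (mu * g0))
    return exp(-theta0 * t + mu * g0 * theta0**2 / (4.0 * sigma2))
```

The bound equals 1 at t = 0 and decreases in t, as a Chernoff-type bound should.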
Though Theorem 1.1 is not an asymptotic result, when np → c as n → ∞, (3) and (6) yield
\[ \frac{\sigma^2}{\mu} \to 1 + ce^{-c} - e^{-c}, \quad \beta \to e^c \quad \text{and} \quad \gamma_s \to 2e^{2s + ce^s} + e^c + 1 \quad \text{as } n \to \infty. \]
Since lim_{n→∞} γ_s and lim_{n→∞} µ/σ² exist, the right tail probability is at most of the exponential order e^{−ρt} for some ρ > 0. Also, the left tail bound (7) in this asymptotic behaves as
\[ \lim_{n\to\infty} \exp\left(-\frac{t^2}{2}\,\frac{\sigma^2}{\mu(\beta+1)}\right) = \exp\left(-\frac{t^2}{2}\,\frac{1 + ce^{-c} - e^{-c}}{e^c + 1}\right), \]
that is, as e^{−ηt²} with η > 0.
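The limit of σ²/µ can be checked numerically with p = c/n. The sketch (ours) again assumes the standard closed forms for µ and σ².

```python
from math import exp

def ratio_sigma2_over_mu(n, c):
    """sigma^2 / mu for the isolated-vertex count with p = c/n, using the
    standard closed forms (an assumption of this sketch)."""
    p = c / n
    q = 1.0 - p
    mu = n * q**(n - 1)
    sigma2 = mu + n * (n - 1) * q**(2 * n - 3) - mu * mu
    return sigma2 / mu

c = 2.0
limit = 1.0 + c * exp(-c) - exp(-c)   # the limit stated above
assert abs(ratio_sigma2_over_mu(10**6, c) - limit) < 1e-3
```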
The paper is organized as follows. In Section 2 we review results leading to the construction of size biased couplings for sums of possibly dependent variables, and then apply them in Section 3 to the number Y of isolated vertices; this construction first appeared in [6]. The proof of Theorem 1.1 is also given in Section 3.
Definition 2.1. Let A be an arbitrary index set and let X = {Xα : α ∈ A} be a collection of nonnegative random variables with finite, nonzero expectations EXα = µα and joint distribution dF(x). For β ∈ A, we say that X^β = {X^β_α : α ∈ A} has the X-size bias distribution in coordinate β if X^β has joint distribution
\[ dF^\beta(x) = x_\beta\, dF(x)/\mu_\beta. \tag{10} \]
Just as (10) is related to (8), the random vector X^β has the X-size bias distribution in coordinate β if and only if
\[ E[X_\beta f(X)] = \mu_\beta E[f(X^\beta)] \quad \text{for all functions } f \text{ for which these expectations exist.} \]
Letting f(X) = g(Xβ) for some function g one recovers (8), showing that the βth coordinate of X^β, that is, X^β_β, has the Xβ-size bias distribution.
The factorization
\[ dF(x) = P(X \in dx \mid X_\beta = x)\,P(X_\beta \in dx) \]
of the joint distribution of X suggests the following way to construct X. First generate Xβ, a variable with distribution P(Xβ ∈ dx). If Xβ = x, then generate the remaining variates {Xα, α ≠ β} with distribution P(X ∈ dx | Xβ = x). Similarly, the factorization
\[ dF^\beta(x) = x_\beta\, dF(x)/\mu_\beta = P(X \in dx \mid X_\beta = x)\,x_\beta P(X_\beta \in dx)/\mu_\beta = P(X \in dx \mid X_\beta = x)\,P(X^\beta_\beta \in dx) \tag{11} \]
suggests that to generate X^β with distribution dF^β(x), one may first generate a variable X^β_β with the Xβ-size bias distribution, then, when X^β_β = x, generate the remaining variables according to their original conditional distribution given that the βth coordinate takes on the value x.
Definition 2.1 and the following special case of a proposition from Section 2 of [6] will be applied in the subsequent constructions; the reader is referred there for the simple proof.
Proposition 2.1. Let A be an arbitrary index set, and let X = {Xα, α ∈ A} be a collection of nonnegative random variables with finite means. Let Y = \sum_{β∈A} X_β and assume µ_A = EY is finite and positive. Let X^β have the X-size biased distribution in coordinate β as in Definition 2.1. Then, if I is a random index, independent of X, taking values in A with distribution
\[ P(I = \beta) = \mu_\beta/\mu_A, \quad \beta \in A, \]
the variable Y^s = \sum_{α∈A} X^I_α has the Y-size biased distribution.
In our examples we use Proposition 2.1 and the random index I, and (11), to obtain Y^s by first generating X^I_I with the size bias distribution of XI, then, if I = β and X^β_β = x, generating {X^β_α : α ∈ A \ {β}} according to the (original) conditional distribution P(Xα, α ≠ β | Xβ = x).
To derive the needed properties of this coupling, let N(v) be the set of neighbors of v ∈ V, and T be the collection of isolated vertices of K, that is, with d(v), the degree of v, given in (1),
\[ N(v) = \{w : X_{vw} = 1\} \quad \text{and} \quad T = \{v : d(v) = 0\}. \]
Note that Y = |T|. Since all edges incident to the chosen V are removed in order to form K^s, any neighbor of V which had degree one thus becomes isolated, and V also becomes isolated if it was not so earlier. As all other vertices are otherwise unaffected, as far as their being isolated or not, we have
\[ Y^s - Y = d_1(V) + \mathbf{1}(d(V) \neq 0) \quad \text{where} \quad d_1(V) = \sum_{w \in N(V)} \mathbf{1}(d(w) = 1). \tag{12} \]
In particular the coupling is monotone, that is, Y^s ≥ Y. Since d_1(V) ≤ d(V), (12) yields
\[ Y^s - Y \le d(V) + 1. \tag{13} \]
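The coupling and identity (12) are easy to verify by direct simulation: sample K, pick V uniformly and independently of K (the index I of Proposition 2.1 is uniform here since every vertex has the same mean indicator), delete every edge incident to V to form K^s, and compare. The helper names in this sketch are ours.

```python
import random

def sample_graph(n, p, rng):
    """Edge set of a G(n, p) sample, as pairs (u, v) with u < v."""
    return {(u, v) for u in range(n) for v in range(u + 1, n)
            if rng.random() < p}

def degree(edges, v):
    return sum(1 for e in edges if v in e)

rng = random.Random(1)
n, p = 12, 0.2
for _ in range(200):
    edges = sample_graph(n, p, rng)
    V = rng.randrange(n)                        # uniform, independent of K
    deg = [degree(edges, v) for v in range(n)]
    Y = sum(1 for v in range(n) if deg[v] == 0)
    edges_s = {e for e in edges if V not in e}  # K^s: drop all edges at V
    Ys = sum(1 for v in range(n) if degree(edges_s, v) == 0)
    d1V = sum(1 for w in range(n)
              if (min(V, w), max(V, w)) in edges and deg[w] == 1)
    assert Ys - Y == d1V + (1 if deg[V] != 0 else 0)   # identity (12)
    assert Ys >= Y                                      # monotonicity
```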
Now note that for real x ≠ y, the convexity of the exponential function implies
\[ \frac{e^y - e^x}{y - x} = \int_0^1 e^{ty + (1-t)x}\,dt \le \int_0^1 \left(te^y + (1-t)e^x\right)dt = \frac{e^y + e^x}{2}. \tag{14} \]
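Inequality (14) is elementary; as a quick numerical sanity check (illustration only, names ours):

```python
from math import exp

def secant_slope(x, y):
    """Slope of the secant of exp between x and y (x != y)."""
    return (exp(y) - exp(x)) / (y - x)

# Check (14) on a grid of distinct points: the secant slope is at most
# the average of the endpoint values, by convexity of exp.
pts = [i * 0.37 - 3.0 for i in range(17)]
for x in pts:
    for y in pts:
        if x != y:
            assert secant_slope(x, y) <= (exp(x) + exp(y)) / 2.0 + 1e-12
```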
Using (14), that the coupling is monotone, and that Y is a function of T, for θ ≥ 0 we have
\[ E(e^{\theta Y^s} - e^{\theta Y}) \le \frac{\theta}{2} E\left[(Y^s - Y)\left(e^{\theta Y^s} + e^{\theta Y}\right)\right] = \frac{\theta}{2} E\left[e^{\theta Y}(Y^s - Y)\left(e^{\theta(Y^s - Y)} + 1\right)\right] = \frac{\theta}{2} E\left\{e^{\theta Y} E\left((Y^s - Y)\left(e^{\theta(Y^s - Y)} + 1\right) \mid T\right)\right\}. \tag{15} \]
Now using that Y^s = Y when V ∈ T, and (13), we have
where in the final inequality we have used that 1(V ∉ T) ≤ d(V)1(V ∉ T).
To derive a bound on the expectation in (16), first note that by conditioning we obtain
\[ P\left(d(V) = k, \mathbf{1}(V \notin T) = 1 \mid T\right) = P(V \notin T)\,P\left(d(V) = k \mid T, V \notin T\right). \tag{17} \]
Since V is chosen independently of K, the distribution on the right hand side of (17) is a Binomial sum with number of trials equal to the number n − 1 − Y of non-isolated vertices other than V, each having success probability p, conditioned to be nonzero, that is,
\[ P\left(d(V) = k \mid T, V \notin T\right) = \begin{cases} \dbinom{n-1-Y}{k} \dfrac{p^k (1-p)^{n-1-Y-k}}{1 - (1-p)^{n-1-Y}} & \text{for } 1 \le k \le n-1-Y \\[2ex] 0 & \text{otherwise.} \end{cases} \tag{18} \]
Using the conditional distribution (18) and (17), and dropping the factor P(V ∉ T) in the latter to obtain an inequality, it can be easily verified that the first derivative of the conditional moment generating function of d(V) satisfies
\[ E\left(d(V)e^{\theta d(V)}\mathbf{1}(V \notin T) \mid T\right) \le \frac{(n-1-Y)\left(pe^\theta + 1 - p\right)^{n-2-Y}pe^\theta}{1 - (1-p)^{n-1-Y}}. \]
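Writing m = n − 1 − Y, the display above becomes an inequality only because the factor P(V ∉ T) ≤ 1 was dropped; for the conditioned Binomial in (18) itself, the corresponding identity is exact and can be checked directly (a numerical illustration, names ours).

```python
from math import comb, exp

def lhs(m, p, theta):
    """E[k e^{theta k}] under Binomial(m, p) conditioned to be nonzero,
    summed directly from the pmf in (18)."""
    q = 1.0 - p
    Z = 1.0 - q**m                    # P(Binomial(m, p) >= 1)
    return sum(k * exp(theta * k) * comb(m, k) * p**k * q**(m - k)
               for k in range(1, m + 1)) / Z

def rhs(m, p, theta):
    """Closed form m p e^theta (p e^theta + q)^{m-1} / (1 - q^m)."""
    q = 1.0 - p
    return m * p * exp(theta) * (p * exp(theta) + q)**(m - 1) / (1.0 - q**m)

for m, p, theta in [(9, 0.3, 0.4), (15, 0.05, 1.0)]:
    assert abs(lhs(m, p, theta) - rhs(m, p, theta)) < 1e-9
```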
Starting with the first term of (16), by the mean value theorem applied to the function f(x) = x^{n−1−Y}, for some ξ ∈ (1 − p, 1) we have
Hence, recalling θ ≥ 0,
Lastly, again applying (17) and (18) we may handle the second term in (16) by the inequality
\[ E\left(d(V)\mathbf{1}(V \notin T) \mid T\right) \le \frac{(n-1-Y)p}{1 - (1-p)^{n-1-Y}} \le \frac{(n-1-Y)p}{(n-1-Y)p(1-p)^n} = \beta, \tag{20} \]
with β as in (6).
Substituting inequalities (19) and (20) into (16) yields
Hence for t ≥ 0,
\[ P\left(\frac{Y-\mu}{\sigma} \ge t\right) \le P\left(\exp\left(\frac{\theta(Y-\mu)}{\sigma}\right) \ge e^{\theta t}\right) \le e^{-\theta t} M(\theta) \le \exp(-\theta t + H(\theta)). \]
As the inequality holds for all θ ≥ 0, it holds for the θ achieving the minimal value, proving (5).
To demonstrate the left tail bound let θ < 0. Since Y^s ≥ Y and θ < 0, using (14) and (13), and recalling Y^s = Y when V ∈ T, we obtain
\[ E(e^{\theta Y} - e^{\theta Y^s}) \le \frac{|\theta|}{2} E\left[\left(e^{\theta Y^s} + e^{\theta Y}\right)(Y^s - Y)\right] \le |\theta| E\left(e^{\theta Y}(Y^s - Y)\right) = |\theta| E\left(e^{\theta Y} E(Y^s - Y \mid T)\right) \le |\theta| E\left(e^{\theta Y} E\left((d(V) + 1)\mathbf{1}(V \notin T) \mid T\right)\right), \]
and therefore
\[ m'(\theta) = \mu E(e^{\theta Y^s}) \ge \mu\left(1 + (\beta + 1)\theta\right) m(\theta), \]
from which one obtains
\[ \log(M(\theta)) \le \frac{\mu(\beta + 1)\theta^2}{2\sigma^2}. \tag{25} \]
The inequality in (25) implies that for all t > 0 and θ < 0,
\[ P\left(\frac{Y-\mu}{\sigma} \le -t\right) \le \exp\left(\theta t + \frac{\mu(\beta + 1)\theta^2}{2\sigma^2}\right). \]
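Minimizing the right hand side over θ < 0 at θ = −tσ²/(µ(β + 1)) gives exp(−t²σ²/(2µ(β + 1))), the expression whose limit appeared earlier. A small sketch (ours, assuming as before the standard closed forms for µ and σ², and β = (1 − p)^{−n} as identified in (20)):

```python
from math import exp

def left_tail_bound(n, p, t):
    """Minimize exp(theta*t + mu*(beta+1)*theta^2/(2*sigma^2)) over theta < 0.
    mu and sigma^2 are the standard closed forms for the isolated-vertex
    count (an assumption of this sketch); beta = (1-p)^(-n) as in (20)."""
    q = 1.0 - p
    mu = n * q**(n - 1)
    sigma2 = mu + n * (n - 1) * q**(2 * n - 3) - mu * mu
    beta = q**(-n)
    theta = -t * sigma2 / (mu * (beta + 1.0))   # minimizing theta
    return exp(theta * t + mu * (beta + 1.0) * theta**2 / (2.0 * sigma2))
```

By the quadratic's algebra, the returned value agrees with the closed form exp(−t²σ²/(2µ(β + 1))).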
References
[1] Baldi, P., Rinott, Y. and Stein, C. (1989). A normal approximation for the number of local maxima of a random function on a graph, Probability, Statistics and Mathematics, Papers in Honor of Samuel Karlin, T. W. Anderson, K. B. Athreya and D. L. Iglehart eds., Academic Press, 59–81.
[2] Barbour, A. D., Karoński, M. and Ruciński, A. (1989). A central limit theorem for decomposable random variables with applications to random graphs, J. Combinatorial Theory B, 47, 125–145.
[3] Bollobás, B. (1985). Random Graphs. Academic Press Inc., London.
[4] Ghosh, S. and Goldstein, L. (2009). Concentration of measures via size biased couplings, Probab. Th. Rel. Fields, DOI: 10.1007/s00440-009-0253-3.
[5] Ghosh, S. and Goldstein, L. (2011). Applications of size biased couplings for concentration of measures, Electronic Communications in Probability, 16, 70–83.
[6] Goldstein, L. and Rinott, Y. (1996). Multivariate normal approximations by Stein's method and size bias couplings, Journal of Applied Probability, 33, 1–17.
[7] Karoński, M. and Ruciński, A. (1987). Poisson convergence and semi-induced properties of random graphs, Math. Proc. Cambridge Philos. Soc., 101, 291–300.
[8] Kordecki, W. (1990). Normal approximation and isolated vertices in random graphs, Random Graphs '87, Karoński, M., Jaworski, J. and Ruciński, A. eds., John Wiley & Sons Ltd., 131–139.
[9] O'Connell, N. (1998). Some large deviation results for sparse random graphs, Probab. Th. Rel. Fields, 110, 277–285.
[10] Palka, Z. (1984). On the number of vertices of given degree in a random graph, J. Graph Theory, 8, 167–170.
[11] Raič, M. (2007). CLT related large deviation bounds based on Stein's method, Adv. Appl. Prob., 39, 731–752.