The Density Ratio of Binomial versus Poisson Distributions
Lutz Dümbgen (University of Bern)∗
and
Jon A. Wellner (University of Washington, Seattle)†
October 9, 2019
arXiv:1910.03444v1 [math.ST] 8 Oct 2019
Abstract
Let $b(x)$ be the probability that a sum of independent Bernoulli random variables with parameters $p_1, p_2, p_3, \ldots \in [0,1)$ equals $x$, where $\lambda := p_1 + p_2 + p_3 + \cdots$ is finite. We prove two inequalities for the maximal ratio $b(x)/\pi_\lambda(x)$, where $\pi_\lambda$ is the weight function of the Poisson distribution with parameter $\lambda$.
1 Introduction
Throughout, let $X$ be a sum of independent Bernoulli random variables with parameters $p_1, p_2, p_3, \ldots \in [0,1)$ such that $\lambda := p_1 + p_2 + p_3 + \cdots$ is finite, and we exclude the trivial case $\lambda = 0$. Under this assumption, the distribution $Q$ of $X$ is given by
\[ \mathrm{I\!P}(X = x) \ =:\ b(x) \ =\ \sum_{J \subset \mathbb{N}:\, \#J = x}\ \prod_{i\in J} p_i \prod_{k\in J^c} (1-p_k), \qquad x \in \mathbb{N}_0. \]
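For small examples, the displayed subset-sum formula for $b(x)$ can be checked against the standard dynamic-programming (convolution) evaluation of the distribution of $X$; a minimal Python sketch, where the helper names and the particular parameters are ours, purely for illustration:

```python
import itertools

def poisson_binomial_pmf(p):
    """Distribution of a sum of independent Bernoulli(p_i)'s, by convolution."""
    b = [1.0]
    for pi in p:
        nb = [0.0] * (len(b) + 1)
        for x, bx in enumerate(b):
            nb[x] += (1.0 - pi) * bx      # trial with parameter pi fails
            nb[x + 1] += pi * bx          # trial with parameter pi succeeds
        b = nb
    return b

def pmf_by_subsets(p, x):
    """Direct evaluation of the displayed sum over subsets J with #J = x."""
    total = 0.0
    for J in itertools.combinations(range(len(p)), x):
        term = 1.0
        for i in range(len(p)):
            term *= p[i] if i in J else 1.0 - p[i]
        total += term
    return total

p = [0.1, 0.2, 0.3, 0.15]                 # arbitrary illustrative parameters
b = poisson_binomial_pmf(p)
assert abs(sum(b) - 1.0) < 1e-12
assert all(abs(b[x] - pmf_by_subsets(p, x)) < 1e-12 for x in range(len(p) + 1))
```

The convolution runs in $O(n^2)$ time, while the subset formula is exponential in $n$ and serves only as a cross-check.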
It is well known that $Q$ is close to the Poisson distribution $\mathrm{Poiss}(\lambda)$ provided that the quantity
\[ \Delta \ :=\ \lambda^{-1} \sum_{i\ge1} p_i^2 \]
is small. Indeed, a suitable version of Stein's method, developed by Chen (1975), leads to the remarkable bound
\[ d_{\mathrm{TV}}\bigl(Q, \mathrm{Poiss}(\lambda)\bigr) \ \le\ (1 - e^{-\lambda})\,\Delta \ \le\ \sum_{i\ge1} p_i^2 \Big/ \max(1, \lambda), \]
where $d_{\mathrm{TV}}(\cdot,\cdot)$ stands for total variation distance; see Theorem 2.M in Barbour et al. (1992). Note also that
\[ \mathrm{Var}(X) \ =\ \sum_{i\ge1} p_i (1 - p_i) \ =\ \lambda(1 - \Delta). \]
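Both displayed facts are easy to confirm numerically on a concrete example; the following Python sketch uses our own illustrative parameters and truncates the Poisson tail, which is negligible here:

```python
import math

def poisson_binomial_pmf(p):
    """Distribution of a sum of independent Bernoulli(p_i)'s, by convolution."""
    b = [1.0]
    for pi in p:
        nb = [0.0] * (len(b) + 1)
        for x, bx in enumerate(b):
            nb[x] += (1.0 - pi) * bx
            nb[x + 1] += pi * bx
        b = nb
    return b

p = [0.1, 0.2, 0.3, 0.15]                 # arbitrary illustrative parameters
lam = sum(p)
delta = sum(pi ** 2 for pi in p) / lam
b = poisson_binomial_pmf(p)

# d_TV(Q, Poiss(lam)) = (1/2) * sum_x |b(x) - pi_lam(x)|, Poisson tail cut at K.
K = 60
pois = [math.exp(-lam) * lam ** x / math.factorial(x) for x in range(K)]
bb = b + [0.0] * (K - len(b))
dtv = 0.5 * sum(abs(bb[x] - pois[x]) for x in range(K))

assert dtv <= (1.0 - math.exp(-lam)) * delta            # Chen's bound
var = sum(pi * (1.0 - pi) for pi in p)
assert abs(var - lam * (1.0 - delta)) < 1e-12           # Var(X) = λ(1 − Δ)
```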
Conjecture and main results. Motivated by Dümbgen et al. (2019), we are aiming at upper bounds for the maximal density ratio
\[ \rho\bigl(Q, \mathrm{Poiss}(\lambda)\bigr) \ :=\ \sup_{x \ge 0} r(x) \qquad\text{with}\qquad r(x) \ :=\ \frac{b(x)}{\pi(x)}, \]
where $\pi := \pi_\lambda$.
Note that for arbitrary sets $A \subset \mathbb{N}_0$, the probability $Q(A) = \mathrm{I\!P}(X \in A)$ is never larger than the corresponding Poisson probability times $\rho\bigl(Q, \mathrm{Poiss}(\lambda)\bigr)$, no matter how small the Poisson probability is. Moreover, $d_{\mathrm{TV}}\bigl(Q, \mathrm{Poiss}(\lambda)\bigr) \le 1 - \rho\bigl(Q, \mathrm{Poiss}(\lambda)\bigr)^{-1}$. Hence, $\rho\bigl(Q, \mathrm{Poiss}(\lambda)\bigr)$ is a strong measure of error when $Q$ is approximated by $\mathrm{Poiss}(\lambda)$. We conjecture that
\[ \rho\bigl(Q, \mathrm{Poiss}(\lambda)\bigr) \ \le\ (1 - \Delta)^{-1}. \tag{2} \]
While (2) remains open, our first main result provides an upper bound in terms of
\[ p_* \ :=\ \max_{i\ge1} p_i \ \ge\ \Delta. \]
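On a concrete example, $\rho(Q, \mathrm{Poiss}(\lambda))$ can be evaluated exactly and compared with the conjectured bound (2); the parameters below are our own illustrative choice, not from the paper:

```python
import math

def poisson_binomial_pmf(p):
    """Distribution of a sum of independent Bernoulli(p_i)'s, by convolution."""
    b = [1.0]
    for pi in p:
        nb = [0.0] * (len(b) + 1)
        for x, bx in enumerate(b):
            nb[x] += (1.0 - pi) * bx
            nb[x + 1] += pi * bx
        b = nb
    return b

p = [0.1, 0.2, 0.3, 0.15]                 # arbitrary illustrative parameters
lam = sum(p)
delta = sum(pi ** 2 for pi in p) / lam
b = poisson_binomial_pmf(p)

# r(x) = b(x)/pi_lam(x); b(x) = 0 for x > len(p), so the sup is over 0..len(p).
r = [b[x] * math.exp(lam) * math.factorial(x) / lam ** x for x in range(len(b))]
rho = max(r)

assert max(p) >= delta                        # p_* >= Delta
assert rho <= 1.0 / (1.0 - delta) + 1e-12     # conjecture (2) holds here
```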
2 Preparations
Discrete scores. With n := #{i ≥ 1 : pi > 0} ∈ N ∪ {∞}, note that b(x) > 0 if and
only if $x \le n$. For any $x \ge 0$,
\[ \frac{\pi(x+1)}{\pi(x)} \ =\ \frac{\lambda}{x+1}, \]
so the "scores" $r(x+1)/r(x)$ are given by
\[ \frac{r(x+1)}{r(x)} \ =\ \frac{(x+1)\,b(x+1)}{\lambda\,b(x)}. \]
In particular, $r(x+1) \le r(x)$ and $r(x) \ge r(x-1)$ if and only if
\[ \frac{(x+1)\,b(x+1)}{b(x)} \ \le\ \lambda \ \le\ \frac{x\,b(x)}{b(x-1)} \tag{5} \]
with $b(-1) := 0$.
In particular,
\[ b(0) \ =\ \prod_{k\ge1}(1-p_k) \ =\ \exp\Bigl(\sum_{k\ge1}\log(1-p_k)\Bigr) \ <\ \exp(-\lambda) \ =\ \pi(0). \]
Moreover, for integers $0 \le x \le n$ one may write $b(x) = b(0)\sum_{J:\#J=x} W(J)$ with $W(J) := \prod_{i\in J} q_i$, where
\[ q_i \ :=\ \frac{p_i}{1-p_i} \in [0,\infty), \qquad p_i \ =\ \frac{q_i}{1+q_i}. \]
Ratios of consecutive binomial weights. There are various ways to represent the ratios $b(x+1)/b(x)$. In the subsequent versions, the following notation will be useful: For any set $J \subset \mathbb{N}$, we define
\[ s(J) \ :=\ \sum_{i\in J} p_i \qquad\text{and}\qquad S(J) \ :=\ \sum_{i\in J} q_i. \]
In case of $\#J < \infty$ we set
\[ \bar s(J) := s(J)/\#J, \qquad \bar S(J) := S(J)/\#J, \qquad \bar W(J) := W(J) \Big/ \sum_{L:\#L=\#J} W(L) \]
with the convention $0/0 := 0$. Then for any integer $x \ge 0$ with $b(x) > 0$,
\begin{align*}
\frac{b(x+1)}{b(0)} &= \sum_{L:\#L=x+1} W(L) \ =\ \sum_{L:\#L=x+1} \frac{1}{x+1}\sum_{k\in L} W(L\setminus\{k\})\, q_k \\
&= \frac{1}{x+1} \sum_{J:\#J=x} W(J) \sum_{k\in J^c} q_k \ =\ \frac{1}{x+1} \sum_{J:\#J=x} W(J)\, S(J^c).
\end{align*}
Consequently,
\[ \frac{(x+1)\,b(x+1)}{b(x)} \ =\ \sum_{J:\#J=x} \bar W(J)\, S(J^c). \tag{6} \]
Likewise,
\[ \frac{b(x)}{(x+1)\,b(x+1)} \ =\ \sum_{L:\#L=x+1} \bar W(L)\, \frac{1}{x+1} \sum_{k\in L} \frac{1}{q_k + S(L^c)}. \tag{7} \]
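Identities (6) and (7) can be verified by brute force over subsets for a small number of parameters; in the following Python sketch, the parameter choice is ours, purely for illustration:

```python
import itertools

p = [0.1, 0.2, 0.3, 0.15]                 # arbitrary illustrative parameters
n = len(p)
q = [pi / (1.0 - pi) for pi in p]

def W(J):
    prod = 1.0
    for i in J:
        prod *= q[i]
    return prod

def S(I):
    return sum(q[k] for k in I)

b0 = 1.0
for pi in p:
    b0 *= 1.0 - pi

def b(x):
    # b(x) = b(0) * sum over subsets J with #J = x of W(J)
    return b0 * sum(W(J) for J in itertools.combinations(range(n), x))

for x in range(n):
    subsets = list(itertools.combinations(range(n), x))
    tot = sum(W(J) for J in subsets)
    # identity (6): (x+1) b(x+1)/b(x) = sum_J Wbar(J) S(J^c)
    lhs6 = (x + 1) * b(x + 1) / b(x)
    rhs6 = sum(W(J) / tot * S(set(range(n)) - set(J)) for J in subsets)
    assert abs(lhs6 - rhs6) < 1e-12
    # identity (7): b(x)/((x+1) b(x+1)) as a weighted average over L, #L = x+1
    bigs = list(itertools.combinations(range(n), x + 1))
    totL = sum(W(L) for L in bigs)
    rhs7 = sum(W(L) / totL / (x + 1) *
               sum(1.0 / (q[k] + S(set(range(n)) - set(L))) for k in L)
               for L in bigs)
    assert abs(b(x) / ((x + 1) * b(x + 1)) - rhs7) < 1e-12
```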
One can repeat the previous arguments with the sums $\sum_{k\in J^c} p_k/s(J^c) = 1$ in place of $\sum_{k\in J^c} q_k/S(J^c) = 1$. This leads to
\[ \frac{b(x)}{b(0)} \ =\ \sum_{J:\#J=x} \sum_{k\in J^c} \frac{W(J)\, p_k}{p_k + s\bigl((J\cup\{k\})^c\bigr)} \ =\ \sum_{L:\#L=x+1} W(L) \sum_{k\in L} \frac{1-p_k}{p_k + s(L^c)}, \]
whence
\[ \frac{b(x)}{(x+1)\,b(x+1)} \ =\ \sum_{L:\#L=x+1} \bar W(L)\, \frac{1}{x+1} \sum_{k\in L} \frac{1-p_k}{p_k + s(L^c)}. \tag{8} \]
Analyzing equation (8) will lead to a first result about the location of maximizers of $r(\cdot)$ plus a preliminary bound for $\rho\bigl(Q, \mathrm{Poiss}(\lambda)\bigr)$.
Proposition 1. Any maximizer $x \in \mathbb{N}_0$ of $r(x)$ satisfies the inequalities
\[ 1 \ \le\ x \ \le\ \lceil\lambda\rceil. \]
Moreover,
\[ \rho\bigl(Q, \mathrm{Poiss}(\lambda)\bigr) \ \le\ \Bigl(1 + \Delta\,\frac{e^{p_*}-1}{p_*}\Bigr)^{\lceil\lambda\rceil} \ \le\ e^{\lceil\lambda\rceil p_*}. \]
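Both conclusions of Proposition 1 are easy to check numerically; in the Python sketch below, the parameters (with $\lambda = 1.8$, so $\lceil\lambda\rceil = 2$) are an illustrative choice of ours:

```python
import math

def poisson_binomial_pmf(p):
    """Distribution of a sum of independent Bernoulli(p_i)'s, by convolution."""
    b = [1.0]
    for pi in p:
        nb = [0.0] * (len(b) + 1)
        for x, bx in enumerate(b):
            nb[x] += (1.0 - pi) * bx
            nb[x + 1] += pi * bx
        b = nb
    return b

p = [0.3, 0.25, 0.2, 0.35, 0.3, 0.4]      # illustrative parameters, lam = 1.8
lam = sum(p)
pstar = max(p)
delta = sum(pi ** 2 for pi in p) / lam
b = poisson_binomial_pmf(p)

r = [b[x] * math.exp(lam) * math.factorial(x) / lam ** x for x in range(len(b))]
xo = max(range(len(r)), key=lambda x: r[x])   # maximizer of r
rho = r[xo]
ceil_lam = math.ceil(lam)

assert 1 <= xo <= ceil_lam                    # location of the maximizer
bound = (1.0 + delta * (math.exp(pstar) - 1.0) / pstar) ** ceil_lam
assert rho <= bound + 1e-12 <= math.exp(ceil_lam * pstar) + 1e-12
```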
Proof of Proposition 1. Since $r(0) < 1$, any maximizer $x_o$ of $r(\cdot)$ has to satisfy $x_o \ge 1$. To verify the inequality $x_o \le \lceil\lambda\rceil$, it suffices to show that for any $x \ge \lambda$ with $b(x) > 0$,
\[ \frac{r(x+1)}{r(x)} \ \le\ 1. \]
This is equivalent to
\[ \frac{b(x)}{(x+1)\,b(x+1)} \ \ge\ \lambda^{-1}. \tag{9} \]
If $b(x+1) = 0$, this inequality is trivial. Otherwise, according to (8), the left hand side of (9) equals
\[ \sum_{L:\#L=x+1} \bar W(L)\, \frac{1}{x+1} \sum_{k\in L} \frac{1-p_k}{p_k + s(L^c)}. \]
By convexity of $p \mapsto (1-p)/(p + s(L^c))$ and Jensen's inequality, the inner average is at least $(1-\bar s(L))/(\bar s(L) + s(L^c)) = (1-\bar s(L))/(\lambda - x\bar s(L))$. But in case of $x \ge \lambda$,
\[ \frac{1-\bar s(L)}{\lambda - x\bar s(L)} \ \ge\ \frac{1-\bar s(L)}{\lambda - \lambda\bar s(L)} \ =\ \lambda^{-1}, \]
whence (9) holds true.
Now we only need an upper bound for $r(x)$ and apply it with $x \le \lceil\lambda\rceil$. First of all,
\begin{align*}
r(x) &= \lambda^{-x}\, x!\, e^{\lambda} \sum_{J:\#J=x} \prod_{i\in J} p_i \prod_{k\in J^c} (1-p_k) \\
&= \lambda^{-x}\, x! \sum_{J:\#J=x} \prod_{i\in J} p_i e^{p_i} \prod_{k\in J^c} \exp\bigl(p_k + \log(1-p_k)\bigr) \\
&\le \lambda^{-x}\, x! \sum_{J:\#J=x} \prod_{i\in J} p_i e^{p_i} \\
&\le \lambda^{-x} \sum_{k(1),\ldots,k(x)\ge1}\ \prod_{s=1}^{x} p_{k(s)} e^{p_{k(s)}} \ =\ \Bigl(\sum_{k\ge1} \frac{p_k}{\lambda}\, e^{p_k}\Bigr)^{x}.
\end{align*}
Moreover, by convexity of the exponential function,
\[ e^{p_k} \ \le\ 1 + (p_k/p_*)(e^{p_*}-1) \ \le\ e^{p_*}, \]
whence
\[ \sum_{k\ge1} \frac{p_k}{\lambda}\, e^{p_k} \ \le\ 1 + \Delta\,\frac{e^{p_*}-1}{p_*}, \]
and this yields the asserted bounds for $r(x)$ with $1 \le x \le \lceil\lambda\rceil$.
3 Bounds in terms of p∗
3.1 A general strategy to verify upper bounds
We want to verify upper bounds of the type
\[ \rho\bigl(Q, \mathrm{Poiss}(\lambda)\bigr) \ \le\ e^{g(A)}, \qquad A \in \{p_*, \Delta\}, \]
for a given differentiable function $g : [0,1) \to [0,\infty)$ with $g(0) = 0$ and $g'(0) \ge 1$. An explicit example is given by $g(s) := -\log(1-s)$. To verify such a bound, we analyze the function $f : (0,1] \to \mathbb{R}$ given by
\[ f(t) \ :=\ \log \rho\bigl(Q_{tp}, \mathrm{Poiss}(t\lambda)\bigr) - g(tA), \]
where $Q_{tp}$ denotes the distribution of a sum of independent Bernoulli random variables with parameters $tp_1, tp_2, tp_3, \ldots$ By Proposition 1, applied to $tp$ in place of $p$,
\[ f(t) \ \le\ \lceil t\lambda\rceil\, t p_* - g(tA). \]
This implies already that $f(0+) = 0$. If we can show that for any fixed $x \in \{1, \ldots, \lceil\lambda\rceil\}$, the log-density ratio $L_x(t) := \log r_{tp}(x)$ is a continuously differentiable function of $t \in (0,1]$, then $f$ is continuous on $(0,1]$ with limit $f(0+) = 0$, and for $t < 1$,
\[ f'(t+) \ =\ \max_{x \in N(t)} L_x'(t) - A\,g'(tA), \]
where
\[ N(t) \ :=\ \mathop{\arg\max}_{1 \le x \le \lceil t\lambda\rceil} r_{tp}(x) \]
and
\[ \tilde g(s) \ :=\ s\,g'(s). \]
Then a sufficient condition for $f(1) \le 0$ is that $f'(t+) \le 0$ for all $t \in (0,1)$, and this can be rewritten as follows: For $t \in (0,1)$ and $1 \le x \le \lceil t\lambda\rceil$,
\[ L_x'(t) \ \le\ \tilde g(tA)/t \quad\text{if } x \in N(t). \]
Since any $x \in N(t)$ satisfies $x\,b_{tp}(x)/b_{tp}(x-1) \ge t\lambda$ by (5), it even suffices to show that
\[ L_x'(t) \ \le\ \tilde g(tA)/t \quad\text{if } \frac{x\,b_{tp}(x)}{b_{tp}(x-1)} \ \ge\ t\lambda. \tag{11} \]
Now it is high time to analyze the functions $L_x(\cdot)$ for $1 \le x \le \lceil\lambda\rceil$. The inequality $x \le \lceil\lambda\rceil$ implies that $b(x) > 0$, because otherwise $\lambda$ would be a sum of at most $x-1$ weights $p_i \in [0,1)$, and this would lead to the contradiction $\lceil\lambda\rceil \le x-1$. For a set $J \subset \mathbb{N}$ with $\#J = x$ and $w_{tp}(J) := t^x \prod_{i\in J} p_i \prod_{k\in J^c}(1-tp_k)$,
\begin{align*}
\frac{\partial}{\partial t} w_{tp}(J) &= \frac{\partial}{\partial t}\, t^x \prod_{i\in J} p_i \prod_{k\in J^c} (1-tp_k) \\
&= x t^{x-1} \prod_{i\in J} p_i \prod_{k\in J^c} (1-tp_k) - \sum_{\ell\in J^c} p_\ell\, t^x \prod_{i\in J} p_i \prod_{k\in J^c\setminus\{\ell\}} (1-tp_k) \\
&= x t^{x-1} \prod_{i\in J} p_i \prod_{k\in J^c} (1-tp_k) - \frac{1}{t} \sum_{\ell\in J^c} t^{x+1} \prod_{i\in J\cup\{\ell\}} p_i \prod_{k\in (J\cup\{\ell\})^c} (1-tp_k) \\
&= \frac{x}{t}\, w_{tp}(J) - \frac{1}{t} \sum_{\ell\in J^c} w_{tp}(J\cup\{\ell\}).
\end{align*}
Consequently,
\begin{align*}
\frac{\partial}{\partial t} b_{tp}(x) &= \sum_{J:\#J=x} \frac{\partial}{\partial t} w_{tp}(J) \\
&= \frac{x}{t} \sum_{J:\#J=x} w_{tp}(J) - \frac{1}{t} \sum_{J:\#J=x} \sum_{\ell\in J^c} w_{tp}(J\cup\{\ell\}) \\
&= \frac{x}{t} \sum_{J:\#J=x} w_{tp}(J) - \frac{1}{t} \sum_{L:\#L=x+1} \sum_{\ell\in L} w_{tp}(L) \\
&= \frac{x}{t}\, b_{tp}(x) - \frac{x+1}{t}\, b_{tp}(x+1).
\end{align*}
This gives us the identity
\[ \frac{\partial}{\partial t} \log b_{tp}(x) \ =\ \frac{x}{t} - \frac{x+1}{t}\,\frac{b_{tp}(x+1)}{b_{tp}(x)}. \]
An elementary calculation yields
\[ \frac{\partial}{\partial t} \log \pi_{t\lambda}(x) \ =\ \frac{x}{t} - \lambda, \]
so
\[ L_x'(t) \ =\ \frac{\partial}{\partial t} \log r_{tp}(x) \ =\ \lambda - \frac{x+1}{t}\,\frac{b_{tp}(x+1)}{b_{tp}(x)}. \]
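The resulting formula for $L_x'(t)$ can be validated against a numerical derivative; in the Python sketch below, the parameters and the step size are illustrative choices of ours:

```python
import math

def poisson_binomial_pmf(p):
    """Distribution of a sum of independent Bernoulli(p_i)'s, by convolution."""
    b = [1.0]
    for pi in p:
        nb = [0.0] * (len(b) + 1)
        for x, bx in enumerate(b):
            nb[x] += (1.0 - pi) * bx
            nb[x + 1] += pi * bx
        b = nb
    return b

p = [0.3, 0.2, 0.4, 0.25]                 # arbitrary illustrative parameters
lam = sum(p)

def log_r(t, x):
    """L_x(t) = log r_{tp}(x) = log b_{tp}(x) - log pi_{t*lam}(x)."""
    b = poisson_binomial_pmf([t * pi for pi in p])
    log_pois = x * math.log(t * lam) - t * lam - math.log(math.factorial(x))
    return math.log(b[x]) - log_pois

t, h = 0.6, 1e-5
b = poisson_binomial_pmf([t * pi for pi in p])
for x in range(1, len(p)):
    exact = lam - (x + 1) * b[x + 1] / (t * b[x])     # the displayed formula
    numeric = (log_r(t + h, x) - log_r(t - h, x)) / (2.0 * h)
    assert abs(exact - numeric) < 1e-6
```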
Consequently, (11) may be rewritten as follows: For each $t \in (0,1)$ and $1 \le x \le \lceil t\lambda\rceil$,
\[ \frac{(x+1)\,b_{tp}(x+1)}{b_{tp}(x)} \ \ge\ t\lambda - \tilde g(tA) \quad\text{if } \frac{x\,b_{tp}(x)}{b_{tp}(x-1)} \ \ge\ t\lambda. \]
Since we could replace $p$ with $tp$, it even suffices to show that for $1 \le x \le \lceil\lambda\rceil$,
\[ \frac{(x+1)\,b(x+1)}{b(x)} \ \ge\ \lambda - \tilde g(A) \quad\text{if } \frac{x\,b(x)}{b(x-1)} \ \ge\ \lambda. \tag{12} \]
Note that $b(1)/b(0) = \sum_{i\ge1} q_i > \sum_{i\ge1} p_i = \lambda$, so (12) implies that
\[ \frac{2\,b(2)}{b(1)} \ \ge\ \lambda - \tilde g(A). \]
In case of $A = p_*$ and $g(s) = -\log(1-s)$, the strategy just outlined works nicely, leading to our first main result. Note that $\tilde g(s) = s/(1-s)$ in this case.
Theorem 1. For any sequence $p$ of probabilities $p_i \in [0,1)$ with $\lambda = \sum_{i\ge1} p_i < \infty$,
\[ \rho\bigl(Q, \mathrm{Poiss}(\lambda)\bigr) \ \le\ (1 - p_*)^{-1}. \]
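Theorem 1 is straightforward to stress-test numerically over random parameter sequences; the sketch below (random seed, ranges, and sample sizes are our own illustrative choices) checks the bound exactly:

```python
import math
import random

def poisson_binomial_pmf(p):
    """Distribution of a sum of independent Bernoulli(p_i)'s, by convolution."""
    b = [1.0]
    for pi in p:
        nb = [0.0] * (len(b) + 1)
        for x, bx in enumerate(b):
            nb[x] += (1.0 - pi) * bx
            nb[x + 1] += pi * bx
        b = nb
    return b

def rho(p):
    """Maximal density ratio rho(Q, Poiss(lam)) for a finite sequence p."""
    lam = sum(p)
    b = poisson_binomial_pmf(p)
    return max(b[x] * math.exp(lam) * math.factorial(x) / lam ** x
               for x in range(len(b)))

random.seed(1)
for _ in range(200):
    n = random.randint(1, 7)
    p = [random.uniform(0.01, 0.9) for _ in range(n)]
    assert rho(p) <= 1.0 / (1.0 - max(p)) + 1e-9      # Theorem 1
```

For a single Bernoulli($u$) variable, $\rho = e^u < (1-u)^{-1}$, so the bound is not attained there.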
Proof of Theorem 1. By the strategy of Section 3.1 with $A = p_*$ and $g(s) = -\log(1-s)$, it suffices to verify condition (12), i.e. that $(x+1)\,b(x+1)/b(x) \ge \lambda - p_*/(1-p_*)$ whenever $x\,b(x)/b(x-1) \ge \lambda$. By Jensen's inequality,
\[ \frac{1}{x}\sum_{i\in J} \frac{1}{q_i + S(J^c)} \ \ge\ \Bigl(\frac{1}{x}\sum_{i\in J} \bigl(q_i + S(J^c)\bigr)\Bigr)^{-1} \ =\ \bigl(\bar S(J) + S(J^c)\bigr)^{-1}, \]
so (7), applied with $x-1$ in place of $x$, yields
\[ \frac{b(x-1)}{x\,b(x)} \ \ge\ \sum_{J:\#J=x} \bar W(J)\,\bigl(\bar S(J) + S(J^c)\bigr)^{-1}. \]
In case of $x\,b(x)/b(x-1) \ge \lambda$, this and Jensen's inequality imply that
\[ \sum_{J:\#J=x} \bar W(J)\,\bigl(\bar S(J) + S(J^c)\bigr) \ \ge\ \Bigl(\sum_{J:\#J=x} \bar W(J)\,\bigl(\bar S(J) + S(J^c)\bigr)^{-1}\Bigr)^{-1} \ \ge\ \frac{x\,b(x)}{b(x-1)} \ \ge\ \lambda. \]
On the other hand, (6) yields
\begin{align*}
\frac{(x+1)\,b(x+1)}{b(x)} &= \sum_{J:\#J=x} \bar W(J)\, S(J^c) \\
&= \sum_{J:\#J=x} \bar W(J)\,\bigl(\bar S(J) + S(J^c)\bigr) - \sum_{J:\#J=x} \bar W(J)\,\bar S(J) \\
&\ge\ \lambda - \sum_{J:\#J=x} \bar W(J)\,\bar S(J) \\
&\ge\ \lambda - \frac{p_*}{1-p_*} \ =\ \lambda - \tilde g(p_*),
\end{align*}
because $q_i = p_i/(1-p_i) \le p_*/(1-p_*)$ for each $i$, so that for any set $J$ with $x$ elements,
\[ \bar S(J) \ =\ \frac{1}{x}\sum_{i\in J} q_i \ \le\ \frac{p_*}{1-p_*}. \]
4 Bounds in terms of ∆
At the moment we do not know whether our general strategy works for $A = \Delta$. Instead we derive some bounds via direct arguments. We start with an elementary result about the log-density ratio $L_1(t) = \log r_{tp}(1)$. Note that
\[ L_1(t) \ =\ \log\Bigl(\sum_{i\ge1} \frac{p_i}{1-tp_i}\Bigr) - \log\lambda + \sum_{i\ge1} \log(1-tp_i) + t\lambda. \]
The right hand side is a smooth function of $t \in [0,1]$ with $L_1(0) = 0$. Moreover,
\begin{align*}
L_1'(t) &= \sum_{i\ge1} p_i - \sum_{i\ge1} \frac{p_i}{1-tp_i} + \sum_{i\ge1} \frac{p_i^2}{(1-tp_i)^2} \Big/ \sum_{i\ge1} \frac{p_i}{1-tp_i} \\
&= -\,t \sum_{i\ge1} \frac{p_i^2}{1-tp_i} + \sum_{i\ge1} \frac{p_i^2}{(1-tp_i)^2} \Big/ \sum_{i\ge1} \frac{p_i}{1-tp_i},
\end{align*}
so
\[ L_1'(0) \ =\ \sum_{i\ge1} p_i^2 \Big/ \sum_{i\ge1} p_i \ =\ \Delta. \]
Finally, with $a_i(t) := p_i/(1-tp_i)$ and $S(t) := \sum_{i\ge1} a_i(t)$,
\begin{align*}
L_1''(t) &= -\sum_{i\ge1} \frac{p_i^2}{(1-tp_i)^2} + 2 \sum_{i\ge1} \frac{p_i^3}{(1-tp_i)^3} \Big/ \sum_{i\ge1} \frac{p_i}{1-tp_i} - \Bigl(\sum_{i\ge1} \frac{p_i^2}{(1-tp_i)^2} \Big/ \sum_{i\ge1} \frac{p_i}{1-tp_i}\Bigr)^2 \\
&= -\sum_{i\ge1} a_i(t)^2 + 2 \sum_{i\ge1} a_i(t)^3 / S(t) - \sum_{i,j\ge1} a_i(t)^2 a_j(t)^2 / S(t)^2 \\
&\le -\sum_{i\ge1} \bigl(a_i(t)^2 - 2 a_i(t)^3 / S(t) + a_i(t)^4 / S(t)^2\bigr) \\
&= -\sum_{i\ge1} a_i(t)^2 \bigl(1 - a_i(t)/S(t)\bigr)^2 \\
&\le 0.
\end{align*}
The second last inequality is strict, unless $\#\{i \ge 1 : p_i > 0\} = 1$, and in that case both preceding inequalities are equalities.
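The key properties of $L_1$, namely $L_1'(0) = \Delta$ and concavity, can be confirmed with finite differences; the parameters and step size in this Python sketch are illustrative choices of ours:

```python
import math

def poisson_binomial_pmf(p):
    """Distribution of a sum of independent Bernoulli(p_i)'s, by convolution."""
    b = [1.0]
    for pi in p:
        nb = [0.0] * (len(b) + 1)
        for x, bx in enumerate(b):
            nb[x] += (1.0 - pi) * bx
            nb[x + 1] += pi * bx
        b = nb
    return b

p = [0.1, 0.2, 0.3, 0.15]                 # arbitrary illustrative parameters
lam = sum(p)
delta = sum(pi ** 2 for pi in p) / lam

def L1(t):
    """L_1(t) = log b_{tp}(1) - log pi_{t*lam}(1)."""
    b = poisson_binomial_pmf([t * pi for pi in p])
    return math.log(b[1]) - (math.log(t * lam) - t * lam)

h = 1e-4
# L1(0+) = 0 and L1'(0) = Delta, via a one-sided difference:
assert abs(L1(h) / h - delta) < 1e-3
# concavity: second differences are negative on a grid of interior points
for k in range(1, 10):
    t = 0.1 * k
    assert L1(t + h) - 2.0 * L1(t) + L1(t - h) < 0.0
```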
Propositions 1 and 2 are the main ingredients for the following upper bound for $\log \rho\bigl(Q, \mathrm{Poiss}(\lambda)\bigr)$.

Theorem 2. For any sequence $p$ of probabilities $p_i \in [0,1)$ with $\lambda = \sum_{i\ge1} p_i \le 1$,
\[ \Delta \ \ge\ \log \rho\bigl(Q, \mathrm{Poiss}(\lambda)\bigr) \ \ge\ \Delta\Bigl(1 - \frac{\Delta}{2} - \frac{\lambda}{2(1-p_*)}\Bigr). \]
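Both inequalities of Theorem 2 can be stress-tested over random sequences with $\lambda \le 1$; in the Python sketch below, the sampling scheme and seed are our own illustrative choices:

```python
import math
import random

def poisson_binomial_pmf(p):
    """Distribution of a sum of independent Bernoulli(p_i)'s, by convolution."""
    b = [1.0]
    for pi in p:
        nb = [0.0] * (len(b) + 1)
        for x, bx in enumerate(b):
            nb[x] += (1.0 - pi) * bx
            nb[x + 1] += pi * bx
        b = nb
    return b

def log_rho(p):
    lam = sum(p)
    b = poisson_binomial_pmf(p)
    return math.log(max(b[x] * math.exp(lam) * math.factorial(x) / lam ** x
                        for x in range(len(b))))

random.seed(2)
for _ in range(200):
    n = random.randint(1, 6)
    raw = [random.random() for _ in range(n)]
    lam = random.uniform(0.05, 0.95)          # keep lam <= 1 as in Theorem 2
    s = sum(raw)
    p = [lam * w / s for w in raw]
    delta = sum(pi ** 2 for pi in p) / lam
    pstar = max(p)
    lower = delta * (1.0 - delta / 2.0 - lam / (2.0 * (1.0 - pstar)))
    assert lower - 1e-9 <= log_rho(p) <= delta + 1e-9
```

For a single Bernoulli($u$) variable, $\Delta = u$ and $\log\rho = u$, so the upper bound is attained with equality.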
Proof of Theorem 2. In case of $\lambda \le 1$, Proposition 1 implies that $\log \rho\bigl(Q, \mathrm{Poiss}(\lambda)\bigr) = L_1(1)$, because $\lceil\lambda\rceil = 1$. By Taylor's formula and concavity of $L_1$, there exists $\xi \in (0,1)$ with
\[ L_1(1) \ =\ L_1(0) + L_1'(0) + 2^{-1} L_1''(\xi) \ =\ 0 + \Delta + 2^{-1} L_1''(\xi) \ \le\ \Delta. \]
For the lower bound, note that $p + \log(1-p) = -\sum_{k\ge2} p^k/k \ge -p^2/(2(1-p)) \ge -p^2/(2(1-p_*))$ for $p \in [0, p_*]$, so
\[ \sum_{i\ge1} \bigl(p_i + \log(1-p_i)\bigr) \ \ge\ -\frac{1}{2(1-p_*)} \sum_{i\ge1} p_i^2 \ =\ -\frac{\lambda}{2(1-p_*)}\,\Delta. \]
Moreover,
\[ \log\Bigl(\lambda^{-1} \sum_{i\ge1} \frac{p_i}{1-p_i}\Bigr) \ \ge\ \log\Bigl(\lambda^{-1} \sum_{i\ge1} (p_i + p_i^2)\Bigr) \ =\ \log(1+\Delta) \ \ge\ \Delta - \Delta^2/2, \]
and combining these two bounds with the formula for $L_1(1)$ yields
\[ \log \rho\bigl(Q, \mathrm{Poiss}(\lambda)\bigr) \ =\ L_1(1) \ \ge\ \Delta - \frac{\Delta^2}{2} - \frac{\lambda}{2(1-p_*)}\,\Delta \ =\ \Delta\Bigl(1 - \frac{\Delta}{2} - \frac{\lambda}{2(1-p_*)}\Bigr). \]
Remark 3 (Total variation distance). Since $b(0) \le \pi(0)$, Theorem 2 implies that in case of $\lambda \le 1$,
\begin{align*}
d_{\mathrm{TV}}\bigl(Q, \mathrm{Poiss}(\lambda)\bigr) &= \sup_{A \subset \{1,2,3,\ldots\}} \bigl(Q(A) - \mathrm{Poiss}(\lambda)(A)\bigr) \\
&\le \sup_{A \subset \{1,2,3,\ldots\}} Q(A)\,\bigl(1 - \rho\bigl(Q, \mathrm{Poiss}(\lambda)\bigr)^{-1}\bigr) \\
&\le (1 - b(0))(1 - e^{-\Delta}) \\
&\le \lambda(1 - e^{-\Delta}) \ \le\ \lambda\Delta \ =\ \sum_{i\ge1} p_i^2.
\end{align*}
Here we used the elementary inequalities $1 - b(0) = 1 - \prod_{i\ge1}(1-p_i) \le \sum_{i\ge1} p_i = \lambda$ and $1 - e^{-\Delta} \le \Delta$. Consequently, Theorem 2 implies a reasonable upper bound for $d_{\mathrm{TV}}\bigl(Q, \mathrm{Poiss}(\lambda)\bigr)$.
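The chain of inequalities in Remark 3 is easy to confirm on a concrete example with $\lambda \le 1$; the parameters in this Python sketch are our own illustrative choice:

```python
import math

def poisson_binomial_pmf(p):
    """Distribution of a sum of independent Bernoulli(p_i)'s, by convolution."""
    b = [1.0]
    for pi in p:
        nb = [0.0] * (len(b) + 1)
        for x, bx in enumerate(b):
            nb[x] += (1.0 - pi) * bx
            nb[x + 1] += pi * bx
        b = nb
    return b

p = [0.05, 0.15, 0.25, 0.2, 0.1]          # illustrative parameters, lam = 0.75
lam = sum(p)
delta = sum(pi ** 2 for pi in p) / lam
b = poisson_binomial_pmf(p)

# total variation distance, Poisson tail truncated at K (negligible here)
K = 60
pois = [math.exp(-lam) * lam ** x / math.factorial(x) for x in range(K)]
bb = b + [0.0] * (K - len(b))
dtv = 0.5 * sum(abs(bb[x] - pois[x]) for x in range(K))

assert dtv <= lam * (1.0 - math.exp(-delta)) + 1e-12
assert lam * (1.0 - math.exp(-delta)) <= lam * delta + 1e-15
assert abs(lam * delta - sum(pi ** 2 for pi in p)) < 1e-15
```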
References