Stein’s Method for Poisson-Exponential Distributions

Anum Fatima
University of Oxford, UK & Lahore College for Women University, Pakistan
fatima@stats.ox.ac.uk

Gesine Reinert
University of Oxford & The Alan Turing Institute, London, UK
reinert@stats.ox.ac.uk

arXiv:2212.09615v2 [math.PR] 13 Dec 2023

Abstract
The distribution of the maximum of a zero-truncated Poisson number of
i.i.d. exponentially distributed random variables is known as a Poisson-Exponential
distribution. It arises, for example, as a model for monotone hazard rates, and
also as a limiting distribution, for instance of a Generalized Poisson-Exponential
distribution and of a scaled Poisson-Geometric distribution.
To assess these distributional approximations, we first derive a Stein equation for
the distribution of the maximum of a random number of random variables, and
detail it for Poisson-Exponential distributions. We then provide upper bounds on
the approximation errors in total variation distance when approximating Poisson-
Exponential distributions and a Generalized Poisson-Exponential distribution by
different Poisson-Exponential distributions. Moreover, employing standardised
Stein equations we obtain upper bounds on the bounded Wasserstein distance
when using a Poisson-Exponential distribution to approximate a scaled Poisson-
Geometric distribution. With this result we bound the bounded Wasserstein distance
between the scaled distribution of the maximum of a random number of waiting
times for the occurrence of patterns in Bernoulli sequences and a Poisson-Exponential
distribution.

Keywords: Stein’s Method, Poisson-Exponential distribution, Poisson-Geometric distribution


AMS 2020 Subject Classification: 60F05, 60E05

1 Introduction
Complementary risk and competing risk problems arise in many fields such as industrial reliability,
demography, biomedical studies, public health, and actuarial sciences. In such studies often there is
no information on which particular risk factor, among all the risk factors associated with the system
failure, is responsible for the failure. The only observable quantity is the time when the whole system
fails, which corresponds to either the maximum or the minimum lifetime among all possible risks of
failure. The case of the maximum observed is referred to as complementary risk, and the case of the
minimum observed is referred to as competing risks; in both situations, the risks can be viewed as
latent as often the actual risk factors cannot be measured. Both of these problems are often studied
together as max(X1, . . . , Xp) = − min(−X1, . . . , −Xp); for a discussion see for example [8].
This paper focuses on complementary risks. Some examples of such risks are: the number of
successive failures of the air conditioning system of each member of a fleet of 13 Boeing 720 jet
airplanes, attributed to defective components or errors committed during the production process;
the period between successive coal-mining disasters while considering construction faults or human
errors committed by inexperienced miners as latent risks [2]; serum-reversal time (days) of children
contaminated with HIV from vertical transmission [30]; the daily ozone concentrations in New York
during May–September 1973 [17]; and the fatigue life of alloy specimens from [7].

Preprint. Preliminary work.



As a simple probabilistic model which incorporates increasing hazard rates, [9] introduced the
Poisson-Exponential distribution, which is a distribution of maxima of independently and identically
distributed exponential random variables with parameter λ (mean 1/λ; the distribution is denoted
exp(λ)), each relating to an underlying latent risk, with the total number of such variables following
a zero-truncated Poisson distribution with parameter θ, independently of the exponential random
variables. As an application, they model the number of million revolutions before failure of ball
bearings in an endurance test, speculating that risk factors could include the risk of contamination
from dirt from the casting of the casing, the wear particles from hardened steel gear wheels and
the harsh working environments, along with other unobserved factors. They advocate that for this
situation complementary risks are unobserved, so the Poisson-Exponential distribution is a better
model than conventional models for lifetime data. The Poisson-Exponential distribution with
parameters θ, λ > 0, denoted by PE(θ, λ), has probability density function (pdf)
\[ p(x \mid \theta, \lambda) = \frac{\theta\lambda\, e^{-\lambda x - \theta e^{-\lambda x}}}{1 - e^{-\theta}}, \qquad x > 0, \tag{1.1} \]
and cumulative distribution function (cdf)
\[ F(x) = \frac{e^{-\theta e^{-\lambda x}} - e^{-\theta}}{1 - e^{-\theta}}, \qquad x > 0. \tag{1.2} \]
For θ → 0 this cdf converges to that of the exponential distribution with parameter λ.
The failure rate for the PE(θ, λ) distribution is
\[ \frac{\theta\lambda\, e^{-\lambda x - \theta e^{-\lambda x}}}{1 - e^{-\theta e^{-\lambda x}}}, \]
a strictly increasing function in x.
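As a quick numerical illustration of (1.1) and (1.2), the following Python sketch evaluates the pdf, the cdf and the failure rate of PE(θ, λ), and draws samples by inverting the cdf. The function names (pe_pdf, pe_cdf, pe_hazard, pe_quantile, pe_sample) and the seed argument are illustrative choices and do not come from the paper.

```python
import math
import random


def pe_pdf(x, theta, lam):
    """Density (1.1) of PE(theta, lam) at x > 0."""
    norm = -math.expm1(-theta)  # equals 1 - exp(-theta)
    return theta * lam * math.exp(-lam * x - theta * math.exp(-lam * x)) / norm


def pe_cdf(x, theta, lam):
    """Distribution function (1.2) of PE(theta, lam) at x > 0."""
    norm = -math.expm1(-theta)
    return (math.exp(-theta * math.exp(-lam * x)) - math.exp(-theta)) / norm


def pe_hazard(x, theta, lam):
    """Failure rate p(x) / (1 - F(x)), in the closed form stated in the text."""
    u = theta * math.exp(-lam * x)
    return lam * u * math.exp(-u) / (-math.expm1(-u))


def pe_quantile(u, theta, lam):
    """Inverse of (1.2): solves F(x) = u for 0 < u < 1."""
    inner = u * (-math.expm1(-theta)) + math.exp(-theta)
    return -math.log(-math.log(inner) / theta) / lam


def pe_sample(n, theta, lam, seed=0):
    """Inverse-transform sampling; the seed is only there for reproducibility."""
    rng = random.Random(seed)
    return [pe_quantile(rng.random(), theta, lam) for _ in range(n)]
```

Evaluating pe_hazard on an increasing grid of x values gives an increasing sequence approaching λ, in line with the monotonicity noted above.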
However, often in practice, underlying hazard rates are not increasing. For modelling data with
increasing or decreasing hazard rates, [16] suggest the generalized exponential distribution. For
decreasing, increasing, and upside-down bathtub failure rate function, [27] propose generalized
exponential geometric distributions. Modelling increasing, decreasing and sigmoid shaped hazard
rates can be carried out using the Extended Poisson Exponential distribution from [12]. More
modifications can be found in [2], [7], [30], [17].
Here we focus on the proposal by Fatima & Roohi [13]; they introduce the family of Generalized
Poisson-Exponential (GPE) distributions by adding a new parameter to the P E(θ, λ) model, to
provide more flexibility in modeling increasing as well as decreasing failure rates, depending on
the choice of parameters. The Generalized Poisson-Exponential distribution GPE(θ, λ, β) with
parameters β, θ, λ > 0, has cdf
\[ F(x) = \left( \frac{e^{-\theta e^{-\lambda x}} - e^{-\theta}}{1 - e^{-\theta}} \right)^{\beta}, \qquad x > 0. \tag{1.3} \]

If β = 1 in (1.3) we recover the Poisson-Exponential distribution (1.1). [13] showed that for the
Aarset [1] data set, consisting of 50 observations on the time to first failure of devices, the GPE
distribution provides a better fit than the PE and some other distributions used to model complementary
risk data. A GPE distribution is not as easy to manipulate and interpret as a PE distribution, and hence
it is natural to ask how far the GPE distribution is from a suitable PE distribution, for different
choices of parameters in the approximating PE distribution.
As a further application, we consider the approximation, by a PE distribution, of a scaled
Poisson-Geometric (PG) distribution, which is the distribution of the maximum of M i.i.d. geometric random variables where
M follows a zero-truncated Poisson distribution, independently of the geometric random variables.
The PG distribution is thus a discrete analog to the PE distribution. By taking the maximum of
waiting times until chosen patterns occur in a sequence of Bernoulli trials as an example, we illustrate
that the PE approximation can also be applicable for the maximum of dependent variables.
This paper provides explicit bounds for such distributional approximations. Here we use distances
between probability distributions of the form
\[ d_{\mathcal H}(\mathcal L(X), \mathcal L_0(Z)) := \sup_{h \in \mathcal H} |\mathbb E h(X) - \mathbb E h(Z)|, \tag{1.4} \]
where H is a set of bounded test functions. Table 1 gives the distances used in this paper; B(R)
denotes the Borel sets of R, and ∥·∥ the supremum norm.


Table 1: Sets of test functions H and associated probability distances

Distance                               Set of test functions H
Kolmogorov distance (dK)               {I[· ≤ x] : x ∈ R}
Total variation distance (dTV)         {I[· ∈ A] : A ∈ B(R)}
Bounded Wasserstein distance (dBW)     Lipb(1) := {h : R → R, |h(x) − h(y)| ≤ |x − y| for all x, y ∈ R; ∥h∥ ≤ 1}

We note here the alternative formulation for total variation distance, found for example as Proposition
C.3.5 in [21], based on Borel-measurable functions h : R → R:
\[ d_{TV}(\mathcal L(X), \mathcal L_0(Z)) = \frac{1}{2} \sup_{\|h\| \le 1} |\mathbb E h(X) - \mathbb E h(Z)|. \tag{1.5} \]

To assess distributional distances, we use Stein’s method. The seminal work of Charles Stein [28]
derives bounds on the approximation error for normal approximations using what is now called Stein’s
method. Following these ideas, [11] extended Stein’s work to Poisson approximation, see also [4] and
[5]. Generalizations to many other distributions and dimensions are now available, see for example
[10], [21], and [20]. An overview of applications in the area of statistics can be found in [3]. For this
paper perhaps the most relevant works on Stein’s method are [15], which uses Stein’s method to give
bounds on the distance between a beta distribution and the distribution of a Pólya-Eggenberger urn,
which has a discrete probability mass function, as well as [22] and [24], which develop Stein’s method
for geometric distributions. The preprint [14] contains ideas which are related to standardisations
used in this paper. The density method for finding Stein operators used in this paper originated in
[29] and is surveyed in [18].
As described in [6], the main aim of Stein’s method is to obtain explicit bounds on the distance
between a probability distribution L(X) of interest and a usually well-understood approximating
distribution L0(Z), often called the target distribution. The test function h is connected to the
distribution of interest through a Stein equation
\[ h(x) - \mathbb E h(Z) = T f(x). \tag{1.6} \]
In this equation T is a Stein operator for the distribution L0, with an associated Stein class
F(T) of functions such that E[T f(Z)] = 0 for all f ∈ F(T) ⇐⇒ Z ∼ L0. Hence the distance
(1.4) can be bounded using
\[ d_{\mathcal H}(\mathcal L(X), \mathcal L_0(Z)) \le \sup_{f \in \mathcal F(\mathcal H)} |\mathbb E\, T f(X)|, \]
where F(H) = {f_h | h ∈ H} is
the set of solutions of the Stein equation (1.6) for the test functions h ∈ H. In this paper we set up a
Stein equation for the distribution of the maximum of a random number of i.i.d. random variables,
and we specify it to develop Stein’s method for Poisson-Exponential distributions.
The remainder of this paper is organised as follows: Section 2 gives a general Stein operator for the
maximum of a random number of i.i.d. random variables via the Stein density approach, see [29] and
[19], and uses it to obtain a Stein operator for the Poisson-Exponential distribution. Non-uniform
bounds on the solutions of the corresponding Stein equation are derived. Section 3 contains a general
approach for comparing the distributions of maxima of a random number of i.i.d. random variables.
Subsection 3.2 derives (upper) bounds on the distances between two Poisson-Exponential distributions
with different parameters. Subsection 3.3 provides bounds on the distance between a Generalized
Poisson-Exponential distribution and the corresponding Poisson-Exponential distribution. Bounding
the distributional distance between a Poisson-Geometric distribution and a PE distribution is carried
out in Subsection 3.5. To this purpose, we use a joint standardization procedure for comparing
discrete and continuous Stein operators, which may be of independent interest. Subsection 3.6 applies
the results to approximating the maximum of waiting times until a given pattern is observed in a
random number of binary sequences; this example illustrates that the results include approximations
for maxima of a random number of dependent random variables. Proofs which are standard and
would disturb the flow of the argument are postponed to Section 4.

2 Stein’s Method for Poisson-Exponential Distributions


2.1 Stein Equations for the Maximum of a Random Number of Variables


This section starts with obtaining a Stein equation for the distribution of the maximum of a random
number of i.i.d. random variables before specifying it to the Poisson-Exponential distribution.
Following [29, 19], a score Stein operator for a continuous distribution with pdf p is given by
\[ T f(x) = \frac{(f p)'(x)}{p(x)} \tag{2.1} \]
acting on functions f such that the derivative exists. Here 0/0 = 0, so that the operator is 0 for values
x which are not in the support of the pdf p(·). The Stein class F(p) is the collection of functions f :
R → R such that f(x)p(x) is differentiable with integrable derivative and lim_{x→0,∞} f(x)p(x) = 0.
The Stein equation corresponding to T is
\[ T f(x) = h(x) - \mathbb E h(X) \tag{2.2} \]
where X ∼ p and h ∈ H, with H a class of test functions of interest. Taking expectations in (2.2) gives the
mean-zero property, E T f(X) = 0. It is straightforward to see that
\[ f(x) = \frac{1}{p(x)} \int_0^x [h(t) - \mathbb E h(X)]\, p(t)\, dt \tag{2.3} \]
solves (2.2) for h. If h is bounded then f ∈ F(p).
Now let N be a random variable taking values in the non-negative integers, with a finite second
moment, let Y = Y1, Y2, . . . , be a sequence of i.i.d. continuous random variables with cdf F_Y and
pdf p_Y, which is independent of N, and consider
\[ W = W(N, \mathbf Y) = \max\{Y_1, Y_2, \dots, Y_N\}. \tag{2.4} \]
Then W has pdf $p_W(x) = \sum_n \mathbb P(N = n)\, n\, (F_Y(x))^{n-1} p_Y(x)$. With $G_N(x) = \mathbb E x^N$ the
probability generating function of N, we have $p_W(x) = p_Y(x)\, G_N'(F_Y(x))$; differentiating,
$p_W'(x) = p_Y'(x)\, G_N'(F_Y(x)) + (p_Y(x))^2\, G_N''(F_Y(x))$; the second moment assumption on N ensures
that $G_N''(x)$ exists for 0 ≤ x ≤ 1. Thus, the score function for W is
\[ \rho_W(x) = p_Y(x)\, \frac{G_N''(F_Y(x))}{G_N'(F_Y(x))} + \rho_Y(x) \tag{2.5} \]
with ρ_Y the score function of Y. The corresponding Stein equation for h ∈ H is
\[ g'(w) + \rho_{W(N,\mathbf Y)}(w)\, g(w) = h(w) - \mathbb E h(W(N, \mathbf Y)). \tag{2.6} \]

2.2 The Poisson-Exponential Stein Equation


To obtain a Stein operator for the Poisson-Exponential distributions, PE(θ, λ) with pdf (1.1), note
that for N a zero-truncated Poisson(θ) variable, $G_N(x) = \frac{e^{-\theta}}{1 - e^{-\theta}} \left(e^{\theta x} - 1\right)$ and $G_N''(x)/G_N'(x) = \theta$. Hence
(2.5) yields the score function
\[ \rho(x) = \frac{p'(x)}{p(x)} = \lambda(\theta e^{-\lambda x} - 1), \tag{2.7} \]
and (2.1) gives
\[ T f(x) = f'(x) + \lambda(\theta e^{-\lambda x} - 1) f(x). \tag{2.8} \]
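The mean-zero property E T f(X) = 0 for the operator (2.8) can be checked numerically. The sketch below is only an illustration: the test function f(x) = x e^{-x} is a hypothetical choice (it vanishes at 0 and decays at infinity, so that f p vanishes at both ends), and the expectation is approximated by a plain trapezoidal rule on a truncated interval.

```python
import math


def pe_pdf(x, theta, lam):
    """Density (1.1) of PE(theta, lam)."""
    norm = -math.expm1(-theta)  # equals 1 - exp(-theta)
    return theta * lam * math.exp(-lam * x - theta * math.exp(-lam * x)) / norm


def pe_score(x, theta, lam):
    """Score function (2.7)."""
    return lam * (theta * math.exp(-lam * x) - 1.0)


def stein_operator(f, df, x, theta, lam):
    """Operator (2.8) applied to f, where df is the derivative of f."""
    return df(x) + pe_score(x, theta, lam) * f(x)


def expectation(g, theta, lam, upper=60.0, n=100_000):
    """Trapezoidal approximation of E g(X) on [0, upper] for X ~ PE(theta, lam)."""
    h = upper / n
    total = 0.5 * (g(0.0) * pe_pdf(0.0, theta, lam) + g(upper) * pe_pdf(upper, theta, lam))
    for k in range(1, n):
        x = k * h
        total += g(x) * pe_pdf(x, theta, lam)
    return total * h


def f(x):
    """Hypothetical test function with f(0) = 0 and exponential decay."""
    return x * math.exp(-x)


def df(x):
    """Derivative of f."""
    return (1.0 - x) * math.exp(-x)


def mean_of_stein_operator(theta, lam):
    """Approximates E[T f(X)], which should be close to zero."""
    return expectation(lambda x: stein_operator(f, df, x, theta, lam), theta, lam)
```

For moderate parameter values the returned number should be zero up to the quadrature error.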
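For a bounded test function h ∈ H, with H for instance as in Table 1, with (2.6) and X ∼ PE(θ, λ),
the corresponding Stein equation for the PE(θ, λ) distribution is
\[ f'(x) + \lambda(\theta e^{-\lambda x} - 1) f(x) = h(x) - \mathbb E h(X). \tag{2.9} \]
Lemma 2.1. Let h : R⁺ → R be bounded and f denote the solution (2.3) of the Stein equation (2.9)
for h. Let h̃(x) = h(x) − Eh(X) for X ∼ PE(θ, λ). Then for all x > 0
\[
\begin{aligned}
|e^{-\lambda x} f(x)| &\le \frac{\|\tilde h\|}{\theta\lambda} \left( 1 - e^{-\theta + \theta e^{-\lambda x}} \right) && (2.10) \\
&\le \frac{\|\tilde h\|}{\theta\lambda}; && (2.11) \\
|\lambda(\theta e^{-\lambda x} - 1) f(x)| &\le \|\tilde h\|; && (2.12) \\
|f(x)| &\le \frac{2\|\tilde h\|}{\lambda}; && (2.13) \\
|f'(x)| &\le 2\|\tilde h\|. && (2.14)
\end{aligned}
\]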


If in addition h ∈ Lipb(1) and h′ denotes its derivative which exists almost everywhere (by
Rademacher’s theorem) then at all points x at which h′ exists,
\[ |f''(x)| \le \|h'\| + 2\lambda\theta\|\tilde h\| + 3\lambda\|\tilde h\|. \tag{2.15} \]

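Proof. Proof of (2.10) and (2.11). First, we bound $e^{-\lambda x} f(x)$:
\[ |e^{-\lambda x} f(x)| = \frac{e^{-\lambda x}}{p(x)} \left| \int_0^x \tilde h(t)\, p(t)\, dt \right| \le \|\tilde h\|\, \frac{e^{-\lambda x}}{p(x)} \int_0^x p(t)\, dt = \|\tilde h\|\, \frac{1 - e^{-\theta(1 - e^{-\lambda x})}}{\theta\lambda}, \]
and (2.10) follows. From $1 - e^{-y} < 1$ for all y > 0, we get (2.11).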

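Proof of (2.12). Case 1: $\theta e^{-\lambda x} - 1 > 0$. In this case $0 < x < \frac{\ln\theta}{\lambda}$ and we have
\[ |\lambda(\theta e^{-\lambda x} - 1) f(x)| \le \|\tilde h\|\, \frac{\lambda(\theta e^{-\lambda x} - 1)}{p(x)} \int_0^x p(t)\, dt. \]
As $p'(t) = \lambda(\theta e^{-\lambda t} - 1)\, p(t) \ge \lambda(\theta e^{-\lambda x} - 1)\, p(t)$ for $0 < t < x < \frac{\ln\theta}{\lambda}$, it follows that
\[ 0 < \frac{\lambda(\theta e^{-\lambda x} - 1)}{p(x)} \int_0^x p(t)\, dt \le \frac{1}{p(x)} \int_0^x p'(t)\, dt = 1 - e^{-\theta(1 - e^{-\lambda x}) + \lambda x} \le 1. \]
Hence we obtain the bound (2.12) for $0 < x < \frac{\ln\theta}{\lambda}$.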

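Case 2: $\theta e^{-\lambda x} - 1 \le 0$. In this case $x \ge \frac{\ln\theta}{\lambda}$ and, for t > x,
\[ 0 \le \lambda(1 - \theta e^{-\lambda x})\, p(t) < \lambda(1 - \theta e^{-\lambda t})\, p(t) = -p'(t). \]
Using (2.3), together with the fact that h̃ p integrates to 0 over (0, ∞), gives
\[ |\lambda(\theta e^{-\lambda x} - 1) f(x)| \le \|\tilde h\|\, \frac{\lambda(1 - \theta e^{-\lambda x})}{p(x)} \int_x^\infty p(t)\, dt \le \|\tilde h\|\, \frac{1}{p(x)} \int_x^\infty (-p'(t))\, dt \le \|\tilde h\|. \]
Hence the bound (2.12) follows for all x > 0.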

Proof of (2.13), (2.14) and (2.15). As $\lambda(\theta e^{-\lambda x} - 1) f(x) = \lambda\theta e^{-\lambda x} f(x) - \lambda f(x)$, the triangle
inequality along with (2.12) and (2.11) gives (2.13), as
\[ |\lambda f(x)| \le |\lambda\theta e^{-\lambda x} f(x)| + |\lambda(\theta e^{-\lambda x} - 1) f(x)| \le 2\|\tilde h\|. \]
For (2.14), let X have pdf p. Using the triangle inequality, from (2.9) we obtain
$|f'(x)| \le |h(x) - \mathbb E h(X)| + |\lambda(\theta e^{-\lambda x} - 1) f(x)|$, and using (2.12) yields the bound (2.14).
Now for h differentiable at x, taking the first order derivative in (2.9) gives
$|f''(x)| \le |h'(x)| + |\lambda(\theta e^{-\lambda x} - 1) f'(x)| + |\theta\lambda^2 e^{-\lambda x} f(x)|$. Using (2.11) and (2.14) we obtain the bound (2.15) through
\[ |f''(x)| \le \|h'\| + \lambda|\theta e^{-\lambda x} - 1|\, 2\|\tilde h\| + \theta\lambda^2\, \frac{\|\tilde h\|}{\theta\lambda} \le \|h'\| + 2\lambda\theta e^{-\lambda x}\|\tilde h\| + 2\lambda\|\tilde h\| + \lambda\|\tilde h\|. \]
This completes the proof.
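The bounds (2.10), (2.12) and (2.13) can be examined numerically for a concrete test function. The sketch below takes the indicator h = I[· ≤ a], for which the solution (2.3) has a closed form in terms of the cdf (1.2); the names stein_solution and check_bounds, the grid and the small relative tolerance are illustrative assumptions, not part of the paper.

```python
import math


def pe_pdf(x, theta, lam):
    """Density (1.1)."""
    return theta * lam * math.exp(-lam * x - theta * math.exp(-lam * x)) / (-math.expm1(-theta))


def pe_cdf(x, theta, lam):
    """Distribution function (1.2)."""
    return (math.exp(-theta * math.exp(-lam * x)) - math.exp(-theta)) / (-math.expm1(-theta))


def pe_sf(x, theta, lam):
    """Survival function 1 - F(x), written with expm1 for accuracy in the tail."""
    return math.expm1(-theta * math.exp(-lam * x)) / math.expm1(-theta)


def stein_solution(x, a, theta, lam):
    """Solution (2.3) for h = I[. <= a], using E h(X) = F(a)."""
    Fa = pe_cdf(a, theta, lam)
    if x <= a:
        # integral over [0, x] of (1 - F(a)) p(t) dt
        return (1.0 - Fa) * pe_cdf(x, theta, lam) / pe_pdf(x, theta, lam)
    # for x > a the integral over [0, x] equals F(a) (1 - F(x))
    return Fa * pe_sf(x, theta, lam) / pe_pdf(x, theta, lam)


def check_bounds(a, theta, lam, grid, rel_tol=1e-9):
    """True if (2.10), (2.12) and (2.13) hold at every grid point, up to rounding."""
    Fa = pe_cdf(a, theta, lam)
    h_norm = max(Fa, 1.0 - Fa)  # sup norm of h - E h(X) for the indicator
    slack = 1.0 + rel_tol
    for x in grid:
        f = stein_solution(x, a, theta, lam)
        e = math.exp(-lam * x)
        b210 = h_norm * (1.0 - math.exp(-theta + theta * e)) / (theta * lam)
        if abs(e * f) > b210 * slack:
            return False
        if abs(lam * (theta * e - 1.0) * f) > h_norm * slack:
            return False
        if abs(f) > 2.0 * h_norm / lam * slack:
            return False
    return True
```

For instance, check_bounds(0.7, 2.0, 1.5, [0.01 * k for k in range(1, 2001)]) evaluates the three inequalities on a grid covering (0, 20].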
Remark 2.2. If θ → 0, the PE distribution converges to the exponential distribution with parameter
λ; when further λ = 1, (2.9) reduces to the Stein equation (4.2) in [23]. For this simplified version,
the bound in [23] is only 1/2 of the bound (2.14); this discrepancy arises through our use of the triangle
inequality for θ > 0. [23] then consider the distribution of a random sum of non-negative random
variables and give bounds on the distance between the distribution of such random sums and an
exp(1) distribution. Here instead we are interested in a maximum of a random number of i.i.d. random
variables, not a sum.

3 A Poisson-Exponential Distribution as Target


In this section, we compare a Poisson-Exponential distribution first, with another Poisson-Exponential
distribution with different parameters, second, with a Generalized Poisson-Exponential distribution,
and third, with a Poisson-Geometric distribution.


3.1 A General Comparison Approach


For a distributional comparison, let X1 and X2 be two random variables with densities p1, p2 defined
on the same probability space and on a nested support, with score Stein operators T1 and T2. Then
\[ \mathbb E[h(X_2)] - \mathbb E[h(X_1)] = \mathbb E[f_1(X_2)(\rho_2(X_2) - \rho_1(X_2))] \tag{3.1} \]
where ρ1 and ρ2 are the score functions for the densities p1 and p2 and f1 is the solution of the Stein
equation for T1 (Ley et al. [18]). For comparing maxima of a random number of variables, W(N, Y)
and W(M, Z) as in (2.4), using the score functions (2.5) is in principle straightforward, bounding
\[
\begin{aligned}
&\left| \mathbb E\left[ (\rho_{W(N,\mathbf Y)} - \rho_{W(M,\mathbf Z)})(W(M, \mathbf Z))\, g(W(M, \mathbf Z)) \right] \right| \\
&\quad = \left| \mathbb E\left\{ \left( p_Y(W)\, \frac{G_N''(F_Y(W))}{G_N'(F_Y(W))} - p_Z(W)\, \frac{G_M''(F_Z(W))}{G_M'(F_Z(W))} \right) g(W) \right\} + \mathbb E\left[ (\rho_Y - \rho_Z)(W)\, g(W) \right] \right|
\end{aligned}
\tag{3.2}
\]
with W = W(M, Z) and g solving the W(N, Y)-Stein equation (2.6) for a function h of interest. In
particular if $G_M''(z)/G_M'(z) = \theta_M$ and $G_N''(z)/G_N'(z) = \theta_N$ are constant, such as when M and N are
Poisson or zero-truncated Poisson distributed, then (3.2) simplifies to bounding
\[ \left| \mathbb E\left\{ (\theta_N\, p_Y(W) - \theta_M\, p_Z(W))\, g(W) \right\} + \mathbb E\left[ (\rho_Y - \rho_Z)(W)\, g(W) \right] \right|. \tag{3.3} \]

3.2 Comparison of Two Poisson-Exponential Distributions


The following theorem uses (3.3) to compare two Poisson-Exponential distributions.
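Theorem 3.1. Let X1 ∼ PE(θ1, λ1) and X2 ∼ PE(θ2, λ2) with λ1 ≤ λ2. Let H = {h : R →
R, ∥h∥ ≤ 1}. Then for all h ∈ H
\[ |\mathbb E h(X_2) - \mathbb E h(X_1)| \le \|\tilde h\| \left( \left| \frac{\theta_2\lambda_2}{\theta_1\lambda_1} - 1 \right| + \left( \frac{\lambda_2}{\lambda_1} - 1 \right) \left( \frac{\lambda_1\theta_2(1 - e^{-\theta_1})}{\lambda_2(1 - e^{-\theta_2})} + 2 \right) \right). \tag{3.4} \]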

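Proof. If p1 = PE(θ1, λ1) and p2 = PE(θ2, λ2) then using their corresponding score functions
(2.7), (3.1) gives
\[ \mathbb E h(X_2) - \mathbb E h(X_1) = \mathbb E\left[ f_1(X_2) \left( \lambda_2(\theta_2 e^{-\lambda_2 X_2} - 1) - \lambda_1(\theta_1 e^{-\lambda_1 X_2} - 1) \right) \right], \tag{3.5} \]
where f1 is the solution of the Stein equation (2.9) when X ∼ p1. Thus, realizing X1 and X2 on the
same probability space, (3.3) yields
\[ |\mathbb E h(X_2) - \mathbb E h(X_1)| \le \mathbb E\left[ \left| \theta_2\lambda_2 e^{-\lambda_2 X_2} - \theta_1\lambda_1 e^{-\lambda_1 X_2} \right| |f_1(X_2)| \right] + \mathbb E\left[ |\lambda_2 - \lambda_1|\, |f_1(X_2)| \right]. \]
Since for λ1 ≤ λ2 it holds that $e^{(\lambda_1 - \lambda_2)x} \le 1$ and $|e^{-x} - 1| \le x$ for all x ≥ 0, we have
\[ |\theta_2\lambda_2 e^{-\lambda_2 x_2} - \theta_1\lambda_1 e^{-\lambda_1 x_2}| \le |\theta_2\lambda_2 - \theta_1\lambda_1|\, e^{-\lambda_1 x_2} + (\lambda_2 - \lambda_1)\, x_2\, \theta_1\lambda_1 e^{-\lambda_1 x_2}. \tag{3.6} \]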
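For completeness, one way to obtain (3.6) is to add and subtract $\theta_1\lambda_1 e^{-\lambda_2 x}$ and then apply the two elementary inequalities just mentioned. The following display is a sketch of that intermediate step, written with a generic x ≥ 0 in place of x2 and assuming λ1 ≤ λ2:

```latex
\begin{align*}
|\theta_2\lambda_2 e^{-\lambda_2 x} - \theta_1\lambda_1 e^{-\lambda_1 x}|
&\le |\theta_2\lambda_2 - \theta_1\lambda_1|\, e^{-\lambda_2 x}
   + \theta_1\lambda_1 e^{-\lambda_1 x}\, \bigl| e^{-(\lambda_2 - \lambda_1) x} - 1 \bigr| \\
&\le |\theta_2\lambda_2 - \theta_1\lambda_1|\, e^{-\lambda_1 x}
   + (\lambda_2 - \lambda_1)\, x\, \theta_1\lambda_1 e^{-\lambda_1 x}.
\end{align*}
```

The first term uses $e^{-\lambda_2 x} \le e^{-\lambda_1 x}$ and the second uses $|e^{-y} - 1| \le y$ with $y = (\lambda_2 - \lambda_1) x$.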
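Hence
\[
\begin{aligned}
|\mathbb E h(X_2) - \mathbb E h(X_1)| \le{}& |\theta_2\lambda_2 - \theta_1\lambda_1|\, \mathbb E\left[ e^{-\lambda_1 X_2} |f_1(X_2)| \right] \\
&+ (\lambda_2 - \lambda_1)\, \theta_1\lambda_1\, \mathbb E\left[ X_2 e^{-\lambda_1 X_2} |f_1(X_2)| \right] + (\lambda_2 - \lambda_1)\, \mathbb E|f_1(X_2)|.
\end{aligned}
\]
Moreover by (2.10),
\[ \mathbb E\left[ X_2 e^{-\lambda_1 X_2} |f_1(X_2)| \right] \le \|\tilde h\|\, \frac{1}{\theta_1\lambda_1}\, \mathbb E\left[ X_2 \left( 1 - e^{-\theta_1 + \theta_1 e^{-\lambda_1 X_2}} \right) \right] \le \|\tilde h\|\, \frac{1 - e^{-\theta_1}}{\theta_1\lambda_1}\, \mathbb E X_2. \]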

While for the Poisson-Exponential distribution an exact formula for the mean is available, see [9], that formula
involves generalised hypergeometric functions and is not easy to analyse. For our purpose the
following bound on the mean of X ∼ PE(θ, λ) suffices. As $X = \max\{E_1, \dots, E_N\} \le \sum_{i=1}^N E_i$,
with N a zero-truncated Poisson random variable with parameter θ and Ei ∼ exp(λ), independent of
each other and of N, it follows that
\[ \mathbb E\left( \sum_{i=1}^N E_i \right) = \sum_n \mathbb P(N = n)\, n\, \frac{1}{\lambda} = \frac{\theta}{\lambda(1 - e^{-\theta})}. \tag{3.7} \]


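Using also the bounds (2.11) and (2.13),
\[
\begin{aligned}
|\mathbb E h(X_2) - \mathbb E h(X_1)| &\le \|\tilde h\| \left( |\theta_2\lambda_2 - \theta_1\lambda_1|\, \frac{1}{\lambda_1\theta_1} + |\lambda_2 - \lambda_1|\, \theta_1\lambda_1\, \frac{1}{\theta_1\lambda_1}\, \frac{\theta_2(1 - e^{-\theta_1})}{\lambda_2(1 - e^{-\theta_2})} + 2(\lambda_2 - \lambda_1)\, \frac{1}{\lambda_1} \right) \\
&= \|\tilde h\| \left( \left| \frac{\theta_2\lambda_2}{\theta_1\lambda_1} - 1 \right| + (\lambda_2 - \lambda_1)\, \frac{\theta_2(1 - e^{-\theta_1})}{\lambda_2(1 - e^{-\theta_2})} + 2\left( \frac{\lambda_2}{\lambda_1} - 1 \right) \right).
\end{aligned}
\]
Re-arranging gives the assertion.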

Remark 3.2. 1. As ∥h̃∥ ≤ 2∥h∥, for λ1 ≤ λ2 a bound in total variation distance follows using
(1.5),
\[ d_{TV}(PE(\theta_1, \lambda_1), PE(\theta_2, \lambda_2)) \le \left| \frac{\theta_2\lambda_2}{\theta_1\lambda_1} - 1 \right| + \left( \frac{\lambda_2}{\lambda_1} - 1 \right) \left( \frac{\lambda_1\theta_2(1 - e^{-\theta_1})}{\lambda_2(1 - e^{-\theta_2})} + 2 \right). \]

2. For λ1 = λ2, the bound (3.4) tends to 0 when the ratio θ2/θ1 → 1.

3. If X1 ∼ PE(θ, λ1) and X2 ∼ PE(θ, λ2), for λ1 ≤ λ2, the bound (3.4) gives
\[ |\mathbb E h(X_2) - \mathbb E h(X_1)| \le \|\tilde h\| \left( \frac{\lambda_2}{\lambda_1} - 1 \right) \left( \frac{\lambda_1}{\lambda_2}\, \theta + 3 \right). \tag{3.8} \]
Hence the closer λ1 and λ2, the smaller the bound will be, but for λ1 ≠ λ2 the bound increases
with θ.

4. For θ → 0 in (3.8), the comparison reduces to a bound on the distance between the distributions
of X1 ∼ exp(λ1) and X2 ∼ exp(λ2) which is of the same order but inflated by a factor 3
compared to the bound
\[ |\mathbb E h(X_2) - \mathbb E h(X_1)| \le \|h\| \left| 1 - \frac{\lambda_2}{\lambda_1} \right| \tag{3.9} \]
which is obtained by using the Stein operator (4.2) and the bound (4.4) in [23].
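To get a feeling for how conservative the total variation bound in item 1 is, one can compare it with a direct numerical evaluation of the total variation distance, computed as half the integral of |p1 − p2|. The sketch below does this with a trapezoidal rule; the function names, the truncation point and the grid size are illustrative assumptions.

```python
import math


def pe_pdf(x, theta, lam):
    """Density (1.1) of PE(theta, lam)."""
    return theta * lam * math.exp(-lam * x - theta * math.exp(-lam * x)) / (-math.expm1(-theta))


def tv_bound(theta1, lam1, theta2, lam2):
    """Right-hand side of the total variation bound in Remark 3.2, item 1."""
    if lam1 > lam2:
        raise ValueError("the bound is stated for lam1 <= lam2")
    ratio = lam1 * theta2 * (-math.expm1(-theta1)) / (lam2 * (-math.expm1(-theta2)))
    return abs(theta2 * lam2 / (theta1 * lam1) - 1.0) + (lam2 / lam1 - 1.0) * (ratio + 2.0)


def tv_numeric(theta1, lam1, theta2, lam2, n=100_000):
    """Trapezoidal approximation of (1/2) * integral of |p1 - p2| over [0, upper]."""
    upper = 60.0 / min(lam1, lam2)  # truncation point; the neglected tail mass is negligible
    h = upper / n

    def gap(x):
        return abs(pe_pdf(x, theta1, lam1) - pe_pdf(x, theta2, lam2))

    total = 0.5 * (gap(0.0) + gap(upper))
    for k in range(1, n):
        total += gap(k * h)
    return 0.5 * total * h
```

With equal θ the value of tv_bound coincides with the factor in (3.8), and in simple examples the numerically computed distance is typically well below the bound.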

3.3 Approximating the Generalized Poisson-Exponential Distribution


The Generalized Poisson-Exponential distribution is a generalization of the Poisson-Exponential
distribution with one additional parameter β. This distribution is not of the form (2.4) but its score
function, given in (3.12) below, can be derived from the pdf given in (3.10) below. It is of interest to
see how much we sacrifice when approximating a Generalized Poisson-Exponential distribution with
a corresponding Poisson-Exponential distribution. Such an approximation may be desirable since the
Poisson-Exponential model, with its construction as a maximum of a zero-truncated Poisson number of
i.i.d. exponential random variables, is easier to manipulate than the Generalized Poisson-Exponential
model. The pdf of the Generalized Poisson-Exponential distribution with parameters θ > 0, λ > 0,
and β > 0 is given by
\[ p(x \mid \theta, \lambda, \beta) = \frac{\beta\theta\lambda\, e^{-\lambda x - \theta\beta e^{-\lambda x}}}{(1 - e^{-\theta})^{\beta}} \left( 1 - e^{-\theta + \theta e^{-\lambda x}} \right)^{\beta - 1}, \qquad x > 0. \tag{3.10} \]
For β < 1 the pdf of the GPE distribution is monotonically decreasing, while for β ≥ 1 it is
unimodal positively skewed with skewness depending upon the values of two shape parameters β
and θ. Moreover for β < 1 the limit of the density function at 0 is undefined but for β ≥ 1 it is 0.
This difference in behaviour is reflected in the bound on the expectation in Lemma 3.3. The proof is
deferred to Section 4.
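Lemma 3.3. Let X ∼ GPE(θ, λ, β). Then,
\[
\mathbb E X \le
\begin{cases}
\dfrac{\beta\theta}{\lambda(1 - e^{-\theta})^{\beta}} & \text{if } \beta \ge 1; \\[2ex]
\dfrac{\theta^{\beta} e^{-\theta(\beta - 1)}}{\beta^{\beta}\, \lambda(1 - e^{-\theta})^{\beta}}\, \Gamma(\beta + 1) & \text{if } \beta < 1.
\end{cases}
\tag{3.11}
\]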
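The bound (3.11) can be compared with a numerically computed mean. The sketch below assumes that the GPE cdf is the PE cdf (1.2) raised to the power β, which is what integrating the pdf (3.10) gives, and computes E X as the integral of the survival function by a trapezoidal rule. The function names, the truncation point and the grid size are illustrative assumptions.

```python
import math


def gpe_cdf(x, theta, lam, beta):
    """Assumed GPE cdf: the PE cdf (1.2) raised to the power beta."""
    base = (math.exp(-theta * math.exp(-lam * x)) - math.exp(-theta)) / (-math.expm1(-theta))
    return base ** beta


def gpe_mean(theta, lam, beta, n=50_000):
    """E X as the integral of 1 - F(x) over [0, upper], by the trapezoidal rule."""
    upper = 60.0 / lam  # truncation point; the neglected tail is negligible
    h = upper / n

    def sf(x):
        return 1.0 - gpe_cdf(x, theta, lam, beta)

    total = 0.5 * (sf(0.0) + sf(upper))
    for k in range(1, n):
        total += sf(k * h)
    return total * h


def gpe_mean_bound(theta, lam, beta):
    """Right-hand side of (3.11)."""
    denom = lam * (-math.expm1(-theta)) ** beta
    if beta >= 1.0:
        return beta * theta / denom
    return (theta / beta) ** beta * math.exp(-theta * (beta - 1.0)) * math.gamma(beta + 1.0) / denom
```

For β = 1 the bound reduces to the bound (3.7) on the mean of the PE(θ, λ) distribution.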


The following theorem is based on the comparison of score Stein operators. We note that the score
function of the GPE distribution with parameters β, θ, λ > 0 is
\[ \rho(x) = \frac{p'(x)}{p(x)} = \lambda\theta e^{-\lambda x} \left[ \beta + \frac{(\beta - 1)\, e^{-\theta + \theta e^{-\lambda x}}}{1 - e^{-\theta + \theta e^{-\lambda x}}} \right] - \lambda, \qquad x > 0. \tag{3.12} \]

We denote its corresponding Stein operator from (2.1) by T_GPE.


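Theorem 3.4. Let X1 ∼ PE(θ1, λ1) and X2 ∼ GPE(θ2, λ2, β). Let h : R → R be
bounded. Then for λ2 ≥ λ1 it holds that
\[ |\mathbb E h(X_2) - \mathbb E h(X_1)| \le \|\tilde h\| \left( \frac{\lambda_2}{\lambda_1}\, |\beta - 1| + \left| \frac{\lambda_2\theta_2}{\lambda_1\theta_1}\, \beta - 1 \right| + \left( \frac{\lambda_2}{\lambda_1} - 1 \right) (\lambda_1\, \mathbb E X_2 + 2) \right). \tag{3.13} \]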

Proof. We set p1 = PE(θ1, λ1) and p2 = GPE(θ2, λ2, β). To employ (3.1), first we check that f
as in (2.3), for h bounded, is in the domain of this new Stein operator, that is, $\mathbb E|T_{GPE} f(X)| < \infty$.
Now $\mathbb E[T f(X)] = \int_0^\infty \frac{(f p)'(x)}{p(x)}\, p(x)\, dx = \lim_{x\to\infty} (f p)(x) - \lim_{x\to 0} (f p)(x)$, and for p as in (3.10)
we have |f(x)p(x)| tends to 0 as x → 0 and as x → ∞, showing that f is in the Stein class for p.
Hence we can use (3.1). With the score functions from (2.7) and (3.12) respectively, in (3.1) we have
\[
\mathbb E h(X_2) - \mathbb E h(X_1)
= \mathbb E\left[ \left( \lambda_2\theta_2 e^{-\lambda_2 X_2} \left( \beta + \frac{(\beta - 1)\, e^{-\theta_2 + \theta_2 e^{-\lambda_2 X_2}}}{1 - e^{-\theta_2 + \theta_2 e^{-\lambda_2 X_2}}} \right) - \lambda_2 - \lambda_1(\theta_1 e^{-\lambda_1 X_2} - 1) \right) f_1(X_2) \right]
\]

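where f1 is the solution of the Poisson-Exponential Stein equation (2.9). Hence,
\[
\begin{aligned}
|\mathbb E h(X_2) - \mathbb E h(X_1)|
\le{}& \mathbb E\left[ \left| \lambda_2\theta_2(\beta - 1)\, e^{(\lambda_1 - \lambda_2) X_2}\, \frac{e^{-\theta_2 + \theta_2 e^{-\lambda_2 X_2}}}{1 - e^{-\theta_2 + \theta_2 e^{-\lambda_2 X_2}}}\, e^{-\lambda_1 X_2} \right| |f_1(X_2)| \right] \\
&+ \mathbb E\left[ \left| \lambda_2\theta_2\beta e^{-\lambda_2 X_2} - \lambda_1\theta_1 e^{-\lambda_1 X_2} \right| |f_1(X_2)| \right] + \mathbb E\left[ |\lambda_1 - \lambda_2|\, |f_1(X_2)| \right] \\
\le{}& \lambda_2\theta_2\, |\beta - 1|\, \mathbb E\left| \frac{e^{-\theta_2 + \theta_2 e^{-\lambda_2 X_2}}}{1 - e^{-\theta_2 + \theta_2 e^{-\lambda_2 X_2}}}\, e^{-\lambda_1 X_2} f_1(X_2) \right| \\
&+ \lambda_1\theta_1 \left| \frac{\lambda_2\theta_2}{\lambda_1\theta_1}\, \beta - 1 \right| \mathbb E\left| e^{-\lambda_1 X_2} f_1(X_2) \right| + \lambda_1\theta_1 (\lambda_2 - \lambda_1)\, \mathbb E\left[ X_2 \left| e^{-\lambda_1 X_2} f_1(X_2) \right| \right] \\
&+ (\lambda_2 - \lambda_1)\, \mathbb E|f_1(X_2)|
\end{aligned}
\tag{3.14}
\]
by a calculation similar to (3.6) for the last inequality. Now from (2.10) with λ2 ≥ λ1,
\[
\frac{e^{-\theta_2 + \theta_2 e^{-\lambda_2 x}}}{1 - e^{-\theta_2 + \theta_2 e^{-\lambda_2 x}}}\, \left| e^{-\lambda_1 x} f_1(x) \right|
\le \frac{1}{e^{\theta_2(1 - e^{-\lambda_2 x})} - 1}\, \frac{\|\tilde h\|}{\lambda_1\theta_1} \left( 1 - e^{-\theta_1(1 - e^{-\lambda_1 x})} \right)
\le \frac{\|\tilde h\|}{\lambda_1\theta_2}.
\tag{3.15}
\]
Using (2.11), (2.13) and (3.11) along with (3.15) in (3.14) gives the bound (3.13).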

If λ1 = λ2 and θ1 = θ2 the bound (3.13) can be improved by a factor of 2:


Corollary 3.5. For X1 ∼ P E(θ, λ), X2 ∼ GP E(θ, λ, β) and h : R → R bounded,
|Eh(X2 ) − Eh(X1 )| ≤ ∥h̃∥|β − 1|. (3.16)

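Proof. With f1(x) the solution of (2.9), (3.1) gives
\[ |\mathbb E h(X_2) - \mathbb E h(X_1)| \le \mathbb E\left[ \lambda\theta\, |\beta - 1|\, \frac{e^{-\lambda X_2}}{1 - e^{-\theta + \theta e^{-\lambda X_2}}}\, |f_1(X_2)| \right]. \]
Hence using (2.10) yields the bound (3.16).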
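A simple sanity check on the order of the bound (3.16) is possible through the Kolmogorov distance, which never exceeds the total variation distance. Assuming, as the pdf (3.10) implies, that the GPE(θ, λ, β) cdf is the PE(θ, λ) cdf raised to the power β, the Kolmogorov distance between the two distributions is the supremum of |u − u^β| over u in [0, 1] and does not depend on θ or λ. The sketch below evaluates this supremum on a grid and in closed form; the function names are illustrative.

```python
def kolmogorov_grid(beta, n=100_000):
    """Supremum of |u - u**beta| over a uniform grid on [0, 1]."""
    return max(abs(k / n - (k / n) ** beta) for k in range(n + 1))


def kolmogorov_closed_form(beta):
    """Value of |u - u**beta| at the stationary point u = beta**(-1 / (beta - 1))."""
    if beta == 1.0:
        return 0.0
    u = beta ** (-1.0 / (beta - 1.0))
    return abs(u - u ** beta)


def upper_bound(beta):
    """Total variation bound |beta - 1| obtained from (3.16) together with (1.5)."""
    return abs(beta - 1.0)
```

The Kolmogorov value is only a lower benchmark for the total variation distance, so the comparison with |β − 1| indicates how much room the bound leaves, not how sharp it is.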


Remark 3.6. 1. The bound (3.16) solely depends on the parameter β and tends to 0 when β → 1,
which is in line with the fact that for β → 1, GPE(θ, λ, β) converges to PE(θ, λ), see [13].
2. As in Remark 3.2, item 1, the bounds can easily be converted into bounds in total variation distance using
(1.5).
3. In order to bound the distance between PE(θ1, λ1) and GPE(θ2, λ2, β) when λ1 ≤ λ2 we use
Theorem 3.4. When λ1 > λ2 we use
\[ |GPE(\theta_2, \lambda_2, \beta) - PE(\theta_1, \lambda_1)| \le |GPE(\theta_2, \lambda_2, \beta) - PE(\theta_2, \lambda_2)| + |PE(\theta_2, \lambda_2) - PE(\theta_1, \lambda_1)|. \]
The first term on the r.h.s. is bounded in Corollary 3.5 and the second term is bounded in Theorem
3.1.

3.4 Approximating a Random Number of Maxima


Here we consider the distance between the distribution of a maximum of M exponential random
variables Ei ∼ exp(λ) and of a maximum of M independent random variables Xi , where M is a
zero truncated Poisson random variable with parameter θ. For this task we employ two different
approaches, first one based on (3.2) and the second one based on a Lindeberg-type argument.
Proposition 3.7. Let M be a zero-truncated Poisson random variable with parameter θ, and let X1, X2, . . . , be
independent random variables with pdf p_X. Let E, E1, E2, . . . be exponentially distributed with
parameter λ. Then, with W = max{X1, . . . , XM},
\[ d_{TV}(PE(\theta, \lambda), \mathcal L(\max\{X_1, \dots, X_M\})) \le \frac{4}{\lambda} \left\{ \mathbb E|\lambda e^{-\lambda W} - p_X(W)| + \mathbb E|-\lambda - \rho_X(W)| \right\}. \]

Proof. Abbreviate W = max{X1, . . . , XM}. Let h : R⁺ → R be bounded and f denote the solution
(2.3) of the Stein equation (2.9) for h. From (3.3) and (2.13),
\[ \left| \theta\, \mathbb E\left[ \left( \lambda e^{-\lambda W} - p_X(W) \right) f(W) \right] + \mathbb E\left[ (-\lambda - \rho_X(W))\, f(W) \right] \right| \le \frac{2\|\tilde h\|}{\lambda} \left\{ \mathbb E|\lambda e^{-\lambda W} - p_X(W)| + \mathbb E|-\lambda - \rho_X(W)| \right\} \]
where ρ_X denotes the score function of X. The assertion follows by taking h an indicator function so
that ∥h̃∥ ≤ 2∥h∥ ≤ 2.

As −λ is the score function of the exponential distribution, the bound in Proposition 3.7 is close
to zero if the density and the score function of X are close to that of the exponential distribution.
An example is given in Theorem 3.1 with θ1 = θ2 , applying the bound (3.6) to bound the first
contribution in Proposition 3.7. The next result has weaker assumptions on the random variables of
interest, but only yields a bound in bounded Wasserstein distance.
Proposition 3.8. Let M ∈ N⁺ be a random variable with finite mean and let X1, X2, . . . , be
independent random variables. Let E, E1, E2, . . . be i.i.d. exponentially distributed with parameter
λ. Then for all bounded and Lipschitz functions h : R → R,
\[ |\mathbb E h(\max\{E_1, \dots, E_M\}) - \mathbb E h(\max\{X_1, \dots, X_M\})| \le \sum_{i=1}^{\infty} \mathbb E|E_i - X_i|\, \mathbb P(M \ge i)\, \|h'\|. \tag{3.17} \]
In particular,
\[ d_{BW}(\mathcal L(\max\{E_1, \dots, E_M\}), \mathcal L(\max\{X_1, \dots, X_M\})) \le \sum_{i=1}^{\infty} \mathbb E|E_i - X_i|\, \mathbb P(M \ge i). \]
If M is a zero-truncated Poisson random variable with parameter θ and X, X1, X2, . . . and
E, E1, E2, . . . , are i.i.d. random variables then
\[ d_{BW}(PE(\theta, \lambda), \mathcal L(\max\{X_1, \dots, X_M\})) \le \frac{\theta}{1 - e^{-\theta}}\, \mathbb E|E - X|. \]


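Proof. We employ a Lindeberg-type argument, as follows. Defining X1, X2, . . . and E1, E2, . . . on
the same probability space, as the maximum is Lipschitz(1) we have
\[
\begin{aligned}
&|\mathbb E h(\max\{E_1, \dots, E_M\}) - \mathbb E h(\max\{X_1, \dots, X_M\})| \\
&\quad \le \sum_{m=1}^{\infty} \mathbb P(M = m) \sum_{i=1}^{m} \mathbb E\left| h(\max\{E_1, \dots, E_i, X_{i+1}, \dots, X_m\}) - h(\max\{E_1, \dots, E_{i-1}, X_i, \dots, X_m\}) \right| \\
&\quad \le \sum_{m=1}^{\infty} \mathbb P(M = m) \sum_{i=1}^{m} \mathbb E|E_i - X_i|\, \|h'\| = \sum_{i=1}^{\infty} \mathbb E|E_i - X_i|\, \mathbb P(M \ge i)\, \|h'\|.
\end{aligned}
\]
If E, E1, E2, . . . and X, X1, X2, . . . are i.i.d. then the expression simplifies, as then
\[ \sum_{i=1}^{\infty} \mathbb E|E_i - X_i|\, \mathbb P(M \ge i)\, \|h'\| = \mathbb E(M)\, \mathbb E|E_1 - X_1|\, \|h'\|. \]
Using $\mathbb E(M) = \frac{\theta}{1 - e^{-\theta}}$ gives the last assertion.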
Remark 3.9. In real-world applications, there are unlikely to be infinitely many risks Xi. As an
example suppose that Xi = 0 for i > K. Then Proposition 3.8 gives
\[ d_{BW}(PE(\theta, \lambda), \mathcal L(\max\{X_1, \dots, X_M\})) \le \left( \sum_{i=1}^{K-1} \mathbb E|E_i - X_i|\, \mathbb P(M \ge i) + \lambda \sum_{i=K}^{\infty} \mathbb P(M \ge i) \right). \]
Now, with M having the zero-truncated Poisson(θ) distribution, $\sum_{i=K}^{\infty} \mathbb P(M \ge i) \le \frac{e^{\theta}}{1 - e^{-\theta}}\, \frac{\theta^K}{K!}$.
Hence if Xi = Ei for i = 1, . . . , K and M is a zero-truncated Poisson(θ) random variable,
\[ d_{BW}(PE(\theta, \lambda), \mathcal L(\max\{X_1, \dots, X_M\})) \le \lambda\, \frac{e^{\theta}}{1 - e^{-\theta}}\, \frac{\theta^K}{K!}. \]
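The tail sums of the zero-truncated Poisson distribution that appear in this remark are easy to evaluate numerically. The sketch below computes the sum over i ≥ K of P(M ≥ i) after exchanging the order of summation, and compares it with the stated bound; the function names and the truncation after 200 terms are illustrative assumptions.

```python
import math


def ztp_pmf(m, theta):
    """P(M = m), m = 1, 2, ..., for the zero-truncated Poisson(theta) distribution."""
    log_p = -theta + m * math.log(theta) - math.lgamma(m + 1)
    return math.exp(log_p) / (-math.expm1(-theta))


def tail_sum(K, theta, terms=200):
    """Sum over i >= K of P(M >= i), written as sum over m >= K of (m - K + 1) P(M = m)."""
    return sum((m - K + 1) * ztp_pmf(m, theta) for m in range(K, K + terms))


def tail_bound(K, theta):
    """The bound e^theta * theta^K / ((1 - e^{-theta}) K!) from the remark."""
    log_b = theta + K * math.log(theta) - math.lgamma(K + 1)
    return math.exp(log_b) / (-math.expm1(-theta))
```

For K = 1 the tail sum equals E(M) = θ/(1 − e^{−θ}), the quantity used at the end of the proof of Proposition 3.8.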
Remark 3.10. Although Proposition 3.8 provides a simple way to bound the distance between the
maxima of a random number of random variables without resorting to Stein’s method, its use
requires that the numbers of random variables in the two maxima have an identical probability distribution;
moreover the bound is in the bounded Wasserstein distance. Stein’s method, on the other hand, has
the flexibility to bound the stronger total variation distance when the distributions of the numbers of
random variables are not identical.
The next application shows that even in the setting of Proposition 3.8, using Stein’s method can be
advantageous.

3.5 Approximating the Poisson-Geometric Distribution


In this section, we consider the distribution of Y = max{T1, . . . , TM}, the maximum of independent
and identically distributed Geometric(p) random variables Ti, i = 1, . . . , M, when the number
of random variables, M, is an independent zero-truncated Poisson random variable with parameter
θ. We call this distribution the Poisson-Geometric distribution. The Poisson-Geometric distribution
PG(θ, p) with parameters θ and p has probability mass function (pmf)
\[ \mathbb P(Y = y) = \frac{e^{-q^{y}\theta} - e^{-q^{y-1}\theta}}{1 - e^{-\theta}}, \qquad y = 1, 2, 3, \dots, \tag{3.18} \]
where q = 1 − p. As for X having the Geometric(λ/n) distribution, the distribution of n⁻¹X converges to
exp(λ) as n → ∞, it is plausible to approximate the distribution of Y/n by a corresponding Poisson-Exponential
distribution, which is a continuous distribution. To bound the distance between a discrete and a
corresponding continuous distribution we can compare Stein operators for the two distributions, with
the complication that one of the operators is discrete and the other is continuous.
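The plausibility argument above can be illustrated numerically by comparing the cdf of Y/n, for Y ∼ PG(θ, λ/n), with the PE(θ, λ) cdf (1.2). The sketch below uses the cdf obtained by summing the pmf (3.18), which telescopes; the function names and the evaluation grid are illustrative assumptions.

```python
import math


def pg_pmf(y, theta, p):
    """pmf (3.18) of PG(theta, p) for y = 1, 2, ..."""
    q = 1.0 - p
    return (math.exp(-theta * q ** y) - math.exp(-theta * q ** (y - 1))) / (-math.expm1(-theta))


def pg_cdf(y, theta, p):
    """P(Y <= y) for integer y >= 0, the telescoping sum of (3.18)."""
    q = 1.0 - p
    return (math.exp(-theta * q ** y) - math.exp(-theta)) / (-math.expm1(-theta))


def pe_cdf(x, theta, lam):
    """Distribution function (1.2) of PE(theta, lam)."""
    return (math.exp(-theta * math.exp(-lam * x)) - math.exp(-theta)) / (-math.expm1(-theta))


def scaled_gap(theta, lam, n, xs):
    """Largest |P(Y / n <= x) - F_PE(x)| over the points xs, with Y ~ PG(theta, lam / n)."""
    return max(
        abs(pg_cdf(math.floor(n * x), theta, lam / n) - pe_cdf(x, theta, lam))
        for x in xs
    )
```

Calling scaled_gap with increasing n on a fixed grid shows the gap between the two cdfs shrinking, which is the behaviour quantified in bounded Wasserstein distance in this section.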
For a discrete distribution with pmf P(Y = y) = p(y) and support I = [1, ∞) a discrete backward
Stein operator is
\[ T f(y) = \Delta^{-} f(y) + \frac{\Delta^{-} p(y)}{p(y)}\, f(y), \qquad y \in I, \tag{3.19} \]



where ∆⁻ is the backward difference operator, ∆⁻f(x) = f(x) − f(x − 1), and ∆⁻p(y)/p(y) is the
discrete backward score function; for details see Remark 3.2 and Example 3.13 in [18].
If W = W(N, Y) is as in (2.4) but with Y having pmf p_Y on the non-negative integers, and cdf F_Y,
we have for the pmf p_W of W that
\[ p_W(x) = G_N(F_Y(x)) - G_N(F_Y(x - 1)) = \Delta^{-}(G_N \circ F_Y)(x) \]
and hence
\[ \frac{\Delta^{-} p_W(x)}{p_W(x)} = \frac{\Delta^{-}(\Delta^{-}(G_N \circ F_Y))(x)}{\Delta^{-}(G_N \circ F_Y)(x)}. \tag{3.20} \]
In particular, a straightforward calculation shows that if Y ∼ PG(θ, p) has pmf p(·) then
\[ \Delta^{-} p(y) = p(y - 1) \left( \frac{e^{-q^{y}\theta} - e^{-q^{y-1}\theta}}{e^{-q^{y-1}\theta} - e^{-q^{y-2}\theta}} - 1 \right). \]

This discrete backward score function includes the ratio of two exponential functions, which complicates
the comparison with the Poisson-Exponential Stein equation (2.9). However, there are many
Stein equations that characterize the same distribution, see for example [18]. So one way to simplify
this problem is to use the standardization concept from [18], which is based on the observation that
Stein operators can also be applied to products of functions, say cg, where c plays the role of a
standardization function. A related situation arose in [15] where a standardization, in the sense of
[18], is applied to one of the Stein operators. In our setting we instead standardize both operators,
as detailed in the next section. A second complication arises from scaling: since one distribution is
discrete and the other is continuous, for Y ∼ PG(θ, n⁻¹λ) we would approximate n⁻¹Y by X ∼ PE(θ, λ).
Using the concept of standardization and minding the re-scaling of the discrete random
variable we propose the following procedure.

3.5.1 Comparing a discrete and a continuous distribution using standardised Stein equations:
To compare a discrete random variable Y_n with distribution Q_n, discrete backward score function
ρ_n and standardised Stein operator T_n(dg), and a continuous random variable X with distribution
Q, score function ρ and standardised Stein operator T(cg), with support(Q_n) ⊂ support(Q), it is
convenient to first rescale the discrete random variable Y_n using a strictly monotone scale function s_n(·), so
that X_n = s_n(Y_n) ∼ Q̃_n with score function ρ̃_n. After re-scaling, for a function d : R → R, the
d-standardised Stein operator for X_n is
$$T_n(dg)(X_n) = d(BX_n)\,\Delta^- g(X_n) + \left[d(X_n)\tilde\rho_n(X_n) + \Delta^- d(X_n)\right] g(X_n), \tag{3.21}$$
while for a function c : R → R, the c−standardised Stein operator for X is
T (cg)(X) = c(X)g ′ (X) + [c(X)ρ(X) + c′ (X)] g(X). (3.22)
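Then, for any bounded test function h ∈ H,
$$\begin{aligned}
|E_{Q_n}h(X_n) - E_Q h(X)| &= \left|E_{Q_n}\left[T_n(dg)(X_n) - T(cg)(X_n)\right]\right| \\
&\le E_{Q_n}\left|d(X_n)\tilde\rho_n(X_n) - c(X_n)\rho(X_n) + \Delta^- d(X_n) - c'(X_n)\right| |g(X_n)| \\
&\quad + E_{Q_n}\left|d(BX_n)\left[g(X_n) - g(BX_n)\right] - c(X_n)g'(X_n)\right|.
\end{aligned} \tag{3.23}$$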
Here g(·) is the solution of the Stein equation (3.22) for some test function h and c(·), d(·) are two
standardization functions; B is the backward shift operator; Bf (x) = f (x − 1) and ∆− d(x) =
d(x) − d(Bx). With the scaling xn =sn (y) = y/n, (3.23) yields
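\begin{align}
|Eh(X_n) - Eh(X)| &\le E\left| d\left(X_n - \tfrac{1}{n}\right)\left[g(X_n) - g\left(X_n - \tfrac{1}{n}\right)\right] - c(X_n)g'(X_n) \right| \tag{3.24} \\
&\quad + E\left| d(X_n)\tilde\rho_n(X_n) - c(X_n)\rho(X_n) + d(X_n) - d\left(X_n - \tfrac{1}{n}\right) - c'(X_n) \right| |g(X_n)|. \tag{3.25}
\end{align}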
As an aside, such standardization and scaling could potentially be used to create a “bespoke
derivative” as developed in [14], but the connection is not obvious.


For PE(θ, λ), (3.22) gives the standardised Poisson-Exponential Stein equation
$$c(x)g'(x) + \left[c(x)\lambda(\theta e^{-\lambda x} - 1) + c'(x)\right] g(x) = h(x) - Eh(X). \tag{3.26}$$
Note that f(·) = (cg)(·), where f(·) is the solution of the score Stein equation defined using (2.1).
With this observation, bounds on the solution of this Stein equation are obtained in Lemma 3.11
below.

3.5.2 Application: Approximating a Poisson-Geometric distribution by a Poisson-Exponential distribution:
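For a PG(θ, p_n) distribution an appropriate scale function is given by s_n(y) = y/n. Using this scale function,
along with a standardization function d, a standardised Stein equation is
$$\begin{aligned}
& d\left(x_n - \tfrac{1}{n}\right)\left[g(x_n) - g\left(x_n - \tfrac{1}{n}\right)\right] \\
& + \left[ d(x_n)\left( \frac{e^{-\theta q_n^{n x_n + 1}} - e^{-\theta q_n^{n x_n}}}{e^{-\theta q_n^{n x_n}} - e^{-\theta q_n^{n x_n - 1}}} - 1 \right) + d(x_n) - d\left(x_n - \tfrac{1}{n}\right) \right] g(x_n) = h(x_n) - Eh(X_n).
\end{aligned} \tag{3.27}$$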

To avoid the ratio of exponentials in the standardised PG Stein equation (3.27), with q_n = 1 − λ/n,
we choose the standardization function d as
$$d(z) = e^{-\theta\left(1-\frac{\lambda}{n}\right)^{nz}} - e^{-\theta\left(1-\frac{\lambda}{n}\right)^{nz-1}}.$$
Finally, taking the limit as n → ∞ in d(z) and rescaling, we take
$$c(z) = \frac{\lambda\theta}{n^2}\, e^{-\lambda z - \theta e^{-\lambda z}}, \qquad \text{for } \theta > 0.$$
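The relation between the two standardization functions can be checked numerically. The sketch below rests on our reading of "taking the limit and rescaling", namely that d(z) behaves like n·c(z) for large n (an assumption on our part, obtained from a first-order expansion of d); it evaluates the ratio d(z)/(n c(z)), which should then approach 1. The function names are hypothetical.

```python
import math

def d_std(z, n, theta, lam):
    """d(z) = exp(-theta q^{nz}) - exp(-theta q^{nz-1}) with q = 1 - lam/n."""
    q = 1.0 - lam / n
    a = theta * q ** (n * z)
    # e^{-a} - e^{-a/q} = e^{-a} (1 - e^{-a(1/q - 1)}); expm1 limits cancellation error
    return math.exp(-a) * (-math.expm1(-a * (1.0 / q - 1.0)))

def c_std(z, n, theta, lam):
    """c(z) = (lam theta / n^2) exp(-lam z - theta exp(-lam z))."""
    return lam * theta / n**2 * math.exp(-lam * z - theta * math.exp(-lam * z))

def ratio(z, n, theta, lam):
    """d(z) / (n c(z)); expected to be close to 1 for large n (our assumption)."""
    return d_std(z, n, theta, lam) / (n * c_std(z, n, theta, lam))

if __name__ == "__main__":
    for n in (10, 1000, 10**6):
        print(n, ratio(0.7, n, theta=2.0, lam=1.5))
```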

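Next we obtain bounds on the solution of the standardised Stein equation for the Poisson-Exponential distribution
(3.26) in the following Lemma.
Lemma 3.11. For g(x), the solution of Stein equation (3.26), for bounded differentiable h such that
∥h∥ ≤ 1 and ∥h′∥ ≤ 1, and $c(x) = \frac{\lambda\theta}{n^2} e^{-\lambda x-\theta e^{-\lambda x}}$, we have
\begin{align}
\left|e^{-2\lambda x-\theta e^{-\lambda x}} g(x)\right| &\le \frac{n^2}{\lambda^2\theta^2}\,\|\tilde h\|, \tag{3.28} \\
\left|e^{-\lambda x-\theta e^{-\lambda x}}(\theta e^{-\lambda x} - 1)\, g(x)\right| &\le \frac{n^2}{\lambda^2\theta}\,\|\tilde h\|, \tag{3.29} \\
\left|e^{-\lambda x-\theta e^{-\lambda x}} g(x)\right| &\le 2\frac{n^2}{\lambda^2\theta}\,\|\tilde h\|, \tag{3.30} \\
\left|e^{-\lambda x-\theta e^{-\lambda x}} g'(x)\right| &\le 3\frac{n^2}{\lambda\theta}\,\|\tilde h\|, \tag{3.31} \\
\left|\frac{\lambda\theta}{n^2} e^{-\lambda x-\theta e^{-\lambda x}} g''(x)\right| &\le \|h'\| + 9\lambda\theta\|\tilde h\| + 11\lambda\|\tilde h\|. \tag{3.32}
\end{align}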

Proof. If g solves the standardised Stein equation (3.26), then f(·) = (cg)(·) solves the PE(θ, λ)
Stein equation (2.9). The bound (2.11) in Lemma 2.1 immediately gives (3.28). Also
$c'(x) = c(x)\lambda(\theta e^{-\lambda x} - 1)$, so that $c'(x)g(x) = \lambda(\theta e^{-\lambda x} - 1)f(x)$; using (2.12) we get (3.29).
Combining (3.28), (3.29) and the triangle inequality we obtain (3.30).
Since $(cg)'(x) = c(x)g'(x) + c'(x)g(x)$, rearranging and using (2.14), (2.12) with the triangle
inequality gives (3.31). For $(cg)''(x) - c''(x)g(x) - 2c'(x)g'(x) = c(x)g''(x)$, using (3.31),
$|c'(x)g'(x)| \le \lambda|\theta e^{-\lambda x} - 1|\,|c(x)g'(x)| \le 3\lambda(\theta+1)\|\tilde h\|$. For
$c''(x)g(x) = \lambda^2(\theta e^{-\lambda x} - 1)^2 c(x)g(x) - \lambda^2\theta e^{-\lambda x} c(x)g(x)$, using the triangle inequality and (3.29) and (3.28) we get
$|c''(x)g(x)| \le \lambda|\theta e^{-\lambda x} - 1|\,\|\tilde h\| + \lambda\|\tilde h\| \le \lambda\theta\|\tilde h\| + 2\lambda\|\tilde h\|$. These two results along with (2.15) give (3.32).


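Theorem 3.12. Let Y_n ∼ PG(θ, p_n) with p_n = λ/n and X ∼ PE(θ, λ). Then for the scaled
Poisson-Geometric random variable Z_n = Y_n/n and bounded h with bounded first derivative, we
have
$$\begin{aligned}
|Eh(Z_n) - Eh(X)| \le \frac{1}{n}\, e^{\frac{\theta\lambda}{n}} \left(1-\frac{\lambda}{n}\right)^{-2} \Bigg\{ & \theta\lambda \Bigg[ \frac{21}{2} + 3e^{-\frac{\theta\lambda}{n}}\left(1-\frac{\lambda}{n}\right) + \frac{8 + 3e^{\frac{\theta\lambda^2}{n(n-\lambda)}}}{1-e^{-\theta}} \\
& + \frac{1}{3}\left(1 + e^{\frac{\theta\lambda}{n}} + 9e^{\frac{\theta\lambda^2}{n(n-\lambda)}}\right)\left(1-\frac{\lambda}{n}\right)^{-1} + \frac{3}{2}\left(1-\frac{\lambda}{n}\right)^{-2} + \frac{8}{3}\left(1-\frac{\lambda}{n}\right)^{-4} \Bigg] \|\tilde h\| \\
& + \lambda \Bigg[ \frac{11}{2} + 2\left|\frac{\lambda}{n} - 2\right| e^{-\frac{\theta\lambda}{n}} + 6e^{\frac{\lambda\theta}{n}}\left(1-\frac{\lambda}{n}\right)^{-1} + 4\left(1-\frac{\lambda}{n}\right)^{-2} \Bigg] \|\tilde h\| + \frac{1}{2}\|h'\| \Bigg\}.
\end{aligned} \tag{3.33}$$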
Remark 3.13. As n → ∞, for fixed θ and λ the bound (3.33) decreases to 0 at rate n^{-1}. The bound
also allows for λ = λ(n) and θ = θ(n) to depend on n; as long as λ(n)θ(n)/n → 0 and λ(n)/n → 0, the
bound decreases to 0 at the same rate, that is, n^{-1}.
Remark 3.14. Equation (3.33) can be translated into a bound in the bounded Wasserstein distance,
using H = {h : R → R s.t. ∥h∥ ≤ 1 and ∥h′ ∥ ≤ 1} in (1.4).
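To make the rate in Remark 3.13 concrete, the following sketch evaluates the right-hand side of (3.33) as we read it from the displayed formula. The function name and the default values ∥h̃∥ = ∥h′∥ = 1 are our own choices for illustration.

```python
import math

def bound_3_33(n, theta, lam, norm_h_tilde=1.0, norm_h_prime=1.0):
    """Right-hand side of (3.33) as transcribed by us; requires n > lam."""
    if n <= lam:
        raise ValueError("the bound requires n > lambda")
    r = 1.0 - lam / n                                   # 1 - lambda/n
    e1 = math.exp(theta * lam / n)                      # e^{theta lambda / n}
    e2 = math.exp(theta * lam**2 / (n * (n - lam)))     # e^{theta lambda^2 / (n(n - lambda))}
    term_theta = (21.0 / 2.0 + 3.0 * r / e1
                  + (8.0 + 3.0 * e2) / (1.0 - math.exp(-theta))
                  + (1.0 + e1 + 9.0 * e2) / (3.0 * r)
                  + 1.5 / r**2 + (8.0 / 3.0) / r**4)
    term_lam = 11.0 / 2.0 + 2.0 * abs(lam / n - 2.0) / e1 + 6.0 * e1 / r + 4.0 / r**2
    inner = (theta * lam * term_theta * norm_h_tilde
             + lam * term_lam * norm_h_tilde
             + 0.5 * norm_h_prime)
    return e1 / (n * r**2) * inner

if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        # n * bound should level off if the bound is of order 1/n
        print(n, n * bound_3_33(n, theta=1.0, lam=1.0))
```

For fixed θ and λ, doubling n should roughly halve the value, in line with the stated n^{-1} rate.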
Proof. To obtain the stated bound we bound (3.24) and (3.25) separately.
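A. Bounding (3.25). Using the score functions of Poisson-Exponential and Poisson-Geometric
distributions with $c(z) = \frac{\lambda\theta}{n^2} e^{-\lambda z-\theta e^{-\lambda z}}$ and $d(z) = e^{-\theta(1-\frac{\lambda}{n})^{nz}} - e^{-\theta(1-\frac{\lambda}{n})^{nz-1}}$,
$$\begin{aligned}
& \left| d(z)\tilde\rho_n(z) - c(z)\rho(z) + d(z) - d\left(z - \tfrac{1}{n}\right) - c'(z) \right| \\
&= \left| 2\frac{\lambda^2\theta}{n^2} e^{-\lambda z-\theta e^{-\lambda z}}(\theta e^{-\lambda z} - 1) - \left( e^{-\theta(1-\frac{\lambda}{n})^{nz+1}} - e^{-\theta(1-\frac{\lambda}{n})^{nz}} - e^{-\theta(1-\frac{\lambda}{n})^{nz-1}} + e^{-\theta(1-\frac{\lambda}{n})^{nz-2}} \right) \right| \\
&\le 2\frac{\lambda^2\theta}{n^2} \left| 1 - \left(1-\frac{\lambda}{n}\right)^{-2} \right| \left| e^{-\lambda z-\theta e^{-\lambda z}}(\theta e^{-\lambda z} - 1) \right| \\
&\quad + \frac{\theta^2\lambda^3}{n^3}\left(1-\frac{\lambda}{n}\right)^{-2} e^{\frac{\theta\lambda}{n}} \Bigg\{ \frac{1}{3}\left(1-\frac{\lambda}{n}\right)^{-1} \Bigg[ \theta + \theta e^{\frac{\theta\lambda}{n}} + 6e^{\frac{\theta\lambda}{n}} + 12\left(1-\frac{\lambda}{n}\right)^{-1} + 8\theta\left(1-\frac{\lambda}{n}\right)^{-3} \Bigg] + 2(\theta + 2\lambda z) \Bigg\} e^{-2\lambda z-\theta e^{-\lambda z}} \\
&\quad + \frac{2\theta\lambda^3}{n^3}\left(1-\frac{\lambda}{n}\right)^{-2} e^{\frac{\theta\lambda}{n}} \Bigg[ \left(1-\frac{\lambda}{n}\right)^{-1} e^{\frac{\theta\lambda}{n}} + \lambda z + \theta \Bigg] e^{-\lambda z-\theta e^{-\lambda z}}.
\end{aligned} \tag{3.34}$$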
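Inequality (3.34) will be proved in Section 4. Note that
$$\left| 1 - \left(1-\frac{\lambda}{n}\right)^{-2} \right| = \left(1-\frac{\lambda}{n}\right)^{-2} \left| \left(1-\frac{\lambda}{n}\right)^{2} - 1 \right| = \left(1-\frac{\lambda}{n}\right)^{-2} \frac{\lambda}{n} \left| \frac{\lambda}{n} - 2 \right|. \tag{3.35}$$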
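Now using (3.34) with (3.28), (3.29), (3.30) and the simplification (3.35) we have
$$\begin{aligned}
& E\left| d(Z_n)\tilde\rho_n(Z_n) - c(Z_n)\rho(Z_n) + d(Z_n) - d\left(Z_n - \tfrac{1}{n}\right) - c'(Z_n) \right| |g(Z_n)| \\
&\le 2\frac{\lambda}{n}\left(1-\frac{\lambda}{n}\right)^{-2} \|\tilde h\| \left[ \left|\frac{\lambda}{n} - 2\right| + 3\theta e^{\frac{\theta\lambda}{n}} + 4\lambda e^{\frac{\theta\lambda}{n}} E(Z_n) \right] \\
&\quad + \frac{\lambda}{3n}\left(1-\frac{\lambda}{n}\right)^{-3} e^{\frac{\theta\lambda}{n}} \left[ \theta + (\theta + 18)e^{\frac{\lambda\theta}{n}} + 12\left(1-\frac{\lambda}{n}\right)^{-1} + 8\theta\left(1-\frac{\lambda}{n}\right)^{-3} \right] \|\tilde h\|.
\end{aligned} \tag{3.36}$$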


To bound the expectation E Z_n, we argue similarly as for (3.7): $Y_n = \max\{T_1, \ldots, T_M\} \le \sum_{i=1}^{M} T_i$,
with M being a zero truncated Poisson(θ) random variable and T_1, T_2, ..., T_M i.i.d. Geometric(λ/n).
So $EY_n \le E\sum_{i=1}^{M} T_i = \frac{n\theta}{\lambda(1-e^{-\theta})}$ and for $Z_n = \frac{1}{n} Y_n$,
$$EZ_n \le \frac{\theta}{\lambda(1 - e^{-\theta})}. \tag{3.37}$$
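The bound (3.37) can be compared with the exact mean, which is available from the tail sum E[Y] = Σ_{y≥0} P(Y > y) with P(Y > y) = (1 − e^{−θq^y})/(1 − e^{−θ}), obtained by summing the pmf (3.18). The sketch below is ours; the truncation tolerance `tol` is an arbitrary choice.

```python
import math

def mean_scaled_pg(n, theta, lam, tol=1e-15):
    """E[Z_n] = E[Y_n]/n for Y_n ~ PG(theta, lam/n), via the tail-sum formula."""
    q = 1.0 - lam / n
    norm = -math.expm1(-theta)                        # 1 - e^{-theta}
    total, qy = 0.0, 1.0                              # qy holds q**y, starting at y = 0
    while True:
        tail = -math.expm1(-theta * qy) / norm        # P(Y_n > y)
        if tail < tol:                                # remaining terms are negligible
            break
        total += tail
        qy *= q
    return total / n

def bound_3_37(theta, lam):
    """Right-hand side of (3.37)."""
    return theta / (lam * (1.0 - math.exp(-theta)))

if __name__ == "__main__":
    for n in (5, 50, 1000):
        print(n, mean_scaled_pg(n, 2.0, 1.5), bound_3_37(2.0, 1.5))
```

Since the maximum dominates each single waiting time, the exact mean also lies above 1/λ, which gives a simple two-sided sanity check.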
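Next, using (3.37) in (3.36) and simplifying we get
$$\begin{aligned}
& E\left| d(Z_n)\tilde\rho_n(Z_n) + d(Z_n) - d\left(Z_n - \tfrac{1}{n}\right) - c(Z_n)\rho(Z_n) - c'(Z_n) \right| |g(Z_n)| \\
&\le \frac{\theta\lambda e^{\frac{\theta\lambda}{n}}}{n}\left(1-\frac{\lambda}{n}\right)^{-2} \left[ 6 + \frac{8}{1-e^{-\theta}} + \frac{1}{3}\left(1-\frac{\lambda}{n}\right)^{-1}\left(1 + e^{\frac{\lambda\theta}{n}}\right) + \frac{8}{3}\left(1-\frac{\lambda}{n}\right)^{-4} \right] \|\tilde h\| \\
&\quad + \frac{\lambda}{n}\left(1-\frac{\lambda}{n}\right)^{-2} \left[ 2\left|\frac{\lambda}{n} - 2\right| + 6e^{\frac{2\lambda\theta}{n}}\left(1-\frac{\lambda}{n}\right)^{-1} + 4e^{\frac{\theta\lambda}{n}}\left(1-\frac{\lambda}{n}\right)^{-2} \right] \|\tilde h\|.
\end{aligned} \tag{3.38}$$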

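B. Bounding (3.24). Using Taylor expansion for (3.24) we have
$$\begin{aligned}
& c(z)g'(z) - d\left(z - \tfrac{1}{n}\right)\left[g(z) - g\left(z - \tfrac{1}{n}\right)\right] \\
&= \frac{\lambda\theta}{n^2} e^{-\lambda z-\theta e^{-\lambda z}} g'(z) - \left( e^{-\theta(1-\frac{\lambda}{n})^{nz-1}} - e^{-\theta(1-\frac{\lambda}{n})^{nz-2}} \right) \left[ \frac{1}{n} g'(z) - \frac{1}{2n^2} g''(z + \epsilon) \right]
\end{aligned}$$
for some 0 < ϵ < 1/n. In Section 4 we show that
$$\begin{aligned}
& \left| c(z)g'(z) - d\left(z - \tfrac{1}{n}\right)\left[g(z) - g\left(z - \tfrac{1}{n}\right)\right] \right| \\
&\le \frac{\theta\lambda^2}{n^3}\left(1-\frac{\lambda}{n}\right)^{-1} \Bigg[ \theta + \frac{\theta}{2}\left(1-\frac{\lambda}{n}\right)^{-3} e^{\frac{\theta\lambda}{n}} + \lambda\left(1-\frac{\lambda}{n}\right)^{-1} e^{\frac{\theta\lambda}{n-\lambda}} z + \theta\left(1-\frac{\lambda}{n}\right)^{-2} e^{\frac{\theta\lambda}{n-\lambda}} \Bigg] e^{-\lambda z-\theta e^{-\lambda z}} |g'(z)| \\
&\quad + \frac{\theta\lambda}{2n^3}\left(1-\frac{\lambda}{n}\right)^{-2} e^{\frac{\theta\lambda}{n}} e^{-\lambda z-\theta e^{-\lambda z}} |g''(z + \epsilon)|.
\end{aligned} \tag{3.39}$$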
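Taking expectation and using the bounds (3.31) and (3.32) we obtain
$$\begin{aligned}
& E\left| c(Z_n)g'(Z_n) - d\left(Z_n - \tfrac{1}{n}\right)\left[g(Z_n) - g\left(Z_n - \tfrac{1}{n}\right)\right] \right| \\
&\le \frac{\theta\lambda}{n}\left(1-\frac{\lambda}{n}\right)^{-1} \Bigg[ 3 + \frac{3}{2}\left(1-\frac{\lambda}{n}\right)^{-3} e^{\frac{\theta\lambda}{n}} + \left( \frac{3e^{\frac{\theta\lambda}{n-\lambda}}}{1-e^{-\theta}} + \frac{9e^{\frac{\theta\lambda}{n}}}{2} \right)\left(1-\frac{\lambda}{n}\right)^{-1} + 3\left(1-\frac{\lambda}{n}\right)^{-2} e^{\frac{\theta\lambda}{n-\lambda}} \Bigg] \|\tilde h\| \\
&\quad + \frac{\lambda}{n}\left(1-\frac{\lambda}{n}\right)^{-2} \frac{11}{2} e^{\frac{\theta\lambda}{n}} \|\tilde h\| + \frac{e^{\frac{\theta\lambda}{n}}}{2n}\left(1-\frac{\lambda}{n}\right)^{-2} \|h'\|.
\end{aligned} \tag{3.40}$$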
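Using (3.40) in (3.24), (3.38) in (3.25) and simplifying gives
$$\begin{aligned}
|Eh(Z_n) - Eh(X)| \le \frac{1}{n}\, e^{\frac{\theta\lambda}{n}} \left(1-\frac{\lambda}{n}\right)^{-2} \Bigg\{ & \theta\lambda \Bigg[ \frac{21}{2} + 3e^{-\frac{\theta\lambda}{n}}\left(1-\frac{\lambda}{n}\right) + \frac{8 + 3e^{\frac{\theta\lambda^2}{n(n-\lambda)}}}{1-e^{-\theta}} \\
& + \frac{1}{3}\left(1 + e^{\frac{\theta\lambda}{n}} + 9e^{\frac{\theta\lambda^2}{n(n-\lambda)}}\right)\left(1-\frac{\lambda}{n}\right)^{-1} + \frac{3}{2}\left(1-\frac{\lambda}{n}\right)^{-2} + \frac{8}{3}\left(1-\frac{\lambda}{n}\right)^{-4} \Bigg] \|\tilde h\| \\
& + \lambda \Bigg[ \frac{11}{2} + 2\left|\frac{\lambda}{n} - 2\right| e^{-\frac{\theta\lambda}{n}} + 6e^{\frac{\lambda\theta}{n}}\left(1-\frac{\lambda}{n}\right)^{-1} + 4\left(1-\frac{\lambda}{n}\right)^{-2} \Bigg] \|\tilde h\| + \frac{1}{2}\|h'\| \Bigg\}.
\end{aligned}$$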


Table 2: Values of n above which bound (3.33) outperforms bound (3.41)

λ     1    1   10   10   10   25   25   25   75    75    75   100   100
θ     1   10    1   10   25   10   25   75   25    75   100    75   100
n0    3    5   29   41   70  102  173  384  518  1152  1443  1535  1924

Remark 3.15. Instead of comparing Stein operators we could have employed Proposition 3.8
to bound the distance between the distribution of a maximum of exponential random variables,
X = max{E_1, ..., E_M} ∼ PE(θ, λ) where E_i ∼ exp(λ), and a scaled maximum of geometric
random variables Z_n = Y_n/n, with Y_n = max{G_1, ..., G_M} ∼ PG(θ, λ/n) and G_i ∼ geom(λ/n),
when M is a zero truncated Poisson random variable with parameter θ. Using (3.17) in Proposition
3.8 gives that for all bounded and Lipschitz functions h : R → R
$$\left| Eh(X) - Eh\left(\frac{Y_n}{n}\right) \right| \le \frac{\theta}{1 - e^{-\theta}}\, E\left| X - \frac{G}{n} \right| \|h'\|.$$

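Now we use the coupling G̃_i = ⌈nX_i⌉ ∼ geom(1 − e^{−λ/n}), so that if (k − 1) < nX_i ≤ k then
G̃_i = k, and we note E|nX − G| ≤ E|nX − G̃| + E|G̃ − G|. Here
$$E|nX - \tilde G| = \sum_{k=1}^{\infty} \int_{\frac{k-1}{n}}^{\frac{k}{n}} \lambda e^{-\lambda x}(k - nx)\, dx = \frac{n - e^{\frac{\lambda}{n}}(n - \lambda)}{\lambda\left(e^{\frac{\lambda}{n}} - 1\right)},$$
and since $1 - e^{-\frac{\lambda}{n}} \le \frac{\lambda}{n}$ gives that G̃ is stochastically greater than or equal to G, we get
$E|\tilde G - G| = E(\tilde G - G) = \frac{1}{1 - e^{-\lambda/n}} - \frac{n}{\lambda}$. Hence we get the bound
$$\left| Eh(X) - Eh\left(\frac{Y_n}{n}\right) \right| \le \frac{\theta}{1 - e^{-\theta}} \cdot \frac{2}{n} \left( \frac{n - e^{\frac{\lambda}{n}}(n - \lambda)}{\lambda\left(e^{\frac{\lambda}{n}} - 1\right)} \right) \|h'\|. \tag{3.41}$$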

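In (3.33) using scaling we can make ∥h̃∥ as small as desired, so to compare it with (3.41) we focus on
the terms with ∥h′∥ in both bounds. For continuous h′, our bound (3.33) outperforms the bound
(3.41) for large n, since
$$\frac{1}{2} e^{\frac{\theta\lambda}{n}}\left(1-\frac{\lambda}{n}\right)^{-2} \to \frac{1}{2} \quad \text{and} \quad \frac{\theta}{1-e^{-\theta}} \cdot 2\, \frac{n - e^{\frac{\lambda}{n}}(n-\lambda)}{\lambda\left(e^{\frac{\lambda}{n}} - 1\right)} \to \frac{\theta}{1-e^{-\theta}}, \quad \text{as } n \to \infty.$$
In particular, for any n > n_0 the r.h.s. of (3.41) is larger than the coefficient of ∥h′∥ in the bound
(3.33), where n_0 = n_0(θ, λ) solves
$$e^{\frac{\theta\lambda}{n}} = \frac{4\theta(n-\lambda)^2\left(n - e^{\frac{\lambda}{n}}(n-\lambda)\right)}{n^2\lambda(1-e^{-\theta})\left(e^{\frac{\lambda}{n}} - 1\right)}.$$
Table 2 shows such values of n_0.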
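The crossover point can be located numerically. The sketch below compares the coefficient of ∥h′∥ in (3.33) with the right-hand side of (3.41) (taken with ∥h′∥ = 1) and returns the first integer n > λ at which the former is smaller. Reading n_0 as this first integer is our assumption about how Table 2 was produced; by our hand calculation it gives 3 and 5 for the first two columns of Table 2. The function names are ours.

```python
import math

def coeff_3_33(n, theta, lam):
    """Coefficient of ||h'|| in (3.33)."""
    return math.exp(theta * lam / n) / (2.0 * n * (1.0 - lam / n) ** 2)

def rhs_3_41(n, theta, lam):
    """Right-hand side of (3.41) with ||h'|| = 1."""
    em1 = math.expm1(lam / n)                                   # e^{lambda/n} - 1
    gap = (n - math.exp(lam / n) * (n - lam)) / (lam * em1)     # E|nX - G~|
    return theta / (1.0 - math.exp(-theta)) * (2.0 / n) * gap

def n0(theta, lam, n_max=10**6):
    """First integer n > lambda with coeff_3_33 < rhs_3_41 (our reading of n_0)."""
    n = math.floor(lam) + 1
    while n <= n_max:
        if coeff_3_33(n, theta, lam) < rhs_3_41(n, theta, lam):
            return n
        n += 1
    raise ValueError("no crossover found up to n_max")

if __name__ == "__main__":
    for lam, theta in ((1, 1), (1, 10), (10, 1)):
        print(lam, theta, n0(float(theta), float(lam)))
```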

3.6 The Maximum Waiting Time of Sequence Patterns in Bernoulli Trials


As an example of maxima of a random number of dependent variables we approximate the distribution
of the maximum waiting time until, in each of M sequences of i.i.d. Bernoulli trials, a prespecified
pattern is observed. The pattern may differ for different sequences, but all patterns are of the same
fixed length.
Consider M independent parallel systems $(X_1^{(i)}, X_2^{(i)}, \ldots)$, i = 1, 2, ..., M, of i.i.d. Bernoulli trials
with the same success probability p = P(X_i = 1) = λ/n, where M is an independent zero truncated
Poisson random variable with parameter θ. For each sequence i let $I_j^{(i)}$ represent the indicator
function that a binary sequence pattern of length k occurs starting at $X_j^{(i)}$; the pattern may be specific
to sequence i. Let $V_i = \min\{j : I_j^{(i)} = 1\}$ denote the first occurrence of the pattern of interest in
the ith system. We are interested in approximating the distribution of the maximum waiting time
among all M parallel systems, that is, W = max{V_1, ..., V_M}. We approximate the distribution of U_n = W/n using a
Poisson-Exponential distribution with parameters θ and λ, as follows.


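Corollary 3.16. In the above setting,
$$\begin{aligned}
d_{BW}(\mathcal{L}(U_n), PE(\theta, \lambda)) \le{} & \frac{2\theta\lambda(k-1)}{n(1-e^{-\theta})} + \frac{1}{n}\, e^{\frac{\theta\lambda}{n}} \left(1-\frac{\lambda}{n}\right)^{-2} \Bigg\{ \theta\lambda \Bigg[ \frac{21}{2} + 3e^{-\frac{\theta\lambda}{n}}\left(1-\frac{\lambda}{n}\right) + \frac{8 + 3e^{\frac{\theta\lambda^2}{n(n-\lambda)}}}{1-e^{-\theta}} \\
& + \frac{1}{3}\left(1 + e^{\frac{\theta\lambda}{n}} + 9e^{\frac{\theta\lambda^2}{n(n-\lambda)}}\right)\left(1-\frac{\lambda}{n}\right)^{-1} + \frac{3}{2}\left(1-\frac{\lambda}{n}\right)^{-2} + \frac{8}{3}\left(1-\frac{\lambda}{n}\right)^{-4} \Bigg] \\
& + \lambda \Bigg[ \frac{11}{2} + 2\left|\frac{\lambda}{n} - 2\right| e^{-\frac{\theta\lambda}{n}} + 6e^{\frac{\lambda\theta}{n}}\left(1-\frac{\lambda}{n}\right)^{-1} + 4\left(1-\frac{\lambda}{n}\right)^{-2} \Bigg] \|\tilde h\| + \frac{1}{2} \Bigg\}.
\end{aligned}$$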

Proof. If X ∼ PE(θ, λ), Y_n ∼ PG(θ, λ/n) and Z_n = Y_n/n then we write
$$d_{BW}(\mathcal{L}(U_n), \mathcal{L}(X)) \le d_{BW}(\mathcal{L}(U_n), \mathcal{L}(Z_n)) + d_{BW}(\mathcal{L}(Z_n), \mathcal{L}(X)). \tag{3.42}$$
Here we couple $U_n = \frac{1}{n}\max\{V_1, \cdots, V_M\}$ and $Z_n = \frac{1}{n}\max\{T_1, \cdots, T_M\}$ by using the same
zero-truncated Poisson variable M. We have
$$d_{TV}(\mathcal{L}(U_n), \mathcal{L}(Z_n)) \le \sum_{m=1}^{\infty} \sum_{i=1}^{m} d_{TV}(\mathcal{L}(V_i), \mathcal{L}(T_i))\, P(M = m) \le \frac{\theta\lambda(k-1)}{n(1-e^{-\theta})},$$
using the bound from Corollary 1 in [22] in the last step. With (1.5),
$$d_{BW}(\mathcal{L}(U_n), \mathcal{L}(Z_n)) \le 2 d_{TV}(\mathcal{L}(U_n), \mathcal{L}(Z_n)) \le \frac{2\theta\lambda(k-1)}{n(1-e^{-\theta})}.$$
Combining this result with (3.33) in (3.42) we obtain the assertion.

The assumption of i.i.d. sequences can be weakened to that of a Markov chain by applying Theorem
5.5 in [25]. This theorem gives a Poisson process approximation for the number of “declumped”
counts of each pattern, which in turn yields that the waiting time for each pattern to occur is approxi-
mately exponentially distributed. The theorem also gives an explicit bound on the approximation, but
requires considerable notation, and hence we do not pursue it here.

4 Further proofs
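Proof of Lemma 3.3. For λ > 0, θ > 0 and β ≥ 1, $(1 - e^{-\theta+\theta e^{-\lambda x}})^{\beta-1} \le 1$ and hence
$$\begin{aligned}
E(X) &= \frac{\beta\theta\lambda}{(1 - e^{-\theta})^{\beta}} \int_0^{\infty} x e^{-\lambda x-\theta\beta e^{-\lambda x}} \left(1 - e^{-\theta+\theta e^{-\lambda x}}\right)^{\beta-1} dx \\
&\le \frac{\beta\theta\lambda}{(1 - e^{-\theta})^{\beta}} \int_0^{\infty} x e^{-\lambda x}\, dx.
\end{aligned}$$
This gives the first bound in (3.11). The second bound in (3.11), when β < 1, is obtained from
$$E(X) \le \frac{\beta(\theta\lambda)^{\beta} e^{-\theta\beta+\theta}}{(1-e^{-\theta})^{\beta}} \int_0^{\infty} x^{\beta} e^{-\beta\lambda x}\, dx. \qquad \square$$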

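Proof of Inequality (3.34). For z > 0, with $a = \theta\left(1 - \frac{\lambda}{n}\right)^{nz}$ and $b = \frac{n}{n-\lambda}$,
$$\begin{aligned}
& 2\frac{\lambda^2\theta}{n^2} e^{-\lambda z-\theta e^{-\lambda z}}(\theta e^{-\lambda z} - 1) - \left( e^{-\theta(1-\frac{\lambda}{n})^{nz+1}} - e^{-\theta(1-\frac{\lambda}{n})^{nz}} - e^{-\theta(1-\frac{\lambda}{n})^{nz-1}} + e^{-\theta(1-\frac{\lambda}{n})^{nz-2}} \right) \\
&= 2\frac{\lambda^2\theta}{n^2} e^{-\lambda z-\theta e^{-\lambda z}}(\theta e^{-\lambda z} - 1) - k(b)
\end{aligned}$$
where
$$k(b) = e^{-\frac{a}{b}} - e^{-a} - e^{-ab} + e^{-ab^2} = \frac{1}{2}(b - 1)^2\, 4ae^{-a}(a - 1) + R$$
via Taylor expansion for k(b) around 1, noting that k(1) = k′(1) = 0 and k″(1) = 4a(a − 1)e^{−a}.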
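The leading Taylor coefficient can be confirmed by finite differences. The sketch below, with function names of our own choosing, checks that k(1) = 0, that k′(1) ≈ 0, and that k″(1) agrees with 4a(a − 1)e^{−a}, the coefficient appearing in the displayed expansion. The step sizes are arbitrary choices.

```python
import math

def k(b, a):
    """k(b) = e^{-a/b} - e^{-a} - e^{-ab} + e^{-ab^2}."""
    return math.exp(-a / b) - math.exp(-a) - math.exp(-a * b) + math.exp(-a * b * b)

def first_derivative_at_one(a, h=1e-5):
    """Central difference approximation of k'(1)."""
    return (k(1.0 + h, a) - k(1.0 - h, a)) / (2.0 * h)

def second_derivative_at_one(a, h=1e-4):
    """Central difference approximation of k''(1)."""
    return (k(1.0 + h, a) - 2.0 * k(1.0, a) + k(1.0 - h, a)) / h**2

if __name__ == "__main__":
    for a in (0.3, 1.0, 2.5):
        exact = 4.0 * a * (a - 1.0) * math.exp(-a)
        print(a, first_derivative_at_one(a), second_derivative_at_one(a), exact)
```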
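With
$$k^{(3)}(b) = \frac{a^3 e^{-a/b}}{b^6} - 8a^3 b^3 e^{-ab^2} + a^3 e^{-ab} - \frac{6a^2 e^{-a/b}}{b^5} + \frac{6a e^{-a/b}}{b^4} + 12a^2 b e^{-ab^2},$$
for some $1 < \xi \le b = \frac{n}{n-\lambda}$,
$$R = \frac{1}{3!}(b - 1)^3 a \left( a^2 e^{-a\xi} + \frac{a^2 e^{-\frac{a}{\xi}}}{\xi^6} - \frac{6a e^{-\frac{a}{\xi}}}{\xi^5} + \frac{6 e^{-\frac{a}{\xi}}}{\xi^4} + 12a\xi e^{-a\xi^2} - 8a^2\xi^3 e^{-a\xi^2} \right).$$
To bound R we use that $1 < \xi < \frac{n}{n-\lambda}$ so that $e^{-a\xi^2} \le e^{-a}$ and $e^{-a\xi} \le e^{-a}$; also $e^{-\frac{a}{\xi}} \le e^{-a\frac{n-\lambda}{n}}$.
Hence with the crude bounds a ≤ θ, and e^{−a} ≤ 1,
$$\begin{aligned}
|R| &\le \frac{1}{3}(b - 1)^3 a \Bigg( a^2 e^{-a} + a^2 e^{-a\frac{n-\lambda}{n}}\left(\frac{n-\lambda}{n}\right)^{6} + 6a e^{-a\frac{n-\lambda}{n}}\left(\frac{n-\lambda}{n}\right)^{5} \\
&\qquad + 6 e^{-a\frac{n-\lambda}{n}}\left(\frac{n-\lambda}{n}\right)^{4} + 12a\left(\frac{n}{n-\lambda}\right) e^{-a} + 8a^2\left(\frac{n}{n-\lambda}\right)^{3} e^{-a} \Bigg) \\
&\le \frac{1}{3}(b - 1)^3 a^2 e^{-a} \left( \theta + \theta e^{\frac{\theta\lambda}{n}} + 6e^{\frac{\theta\lambda}{n}} + 6a^{-1} e^{\frac{\theta\lambda}{n}} + 12\left(\frac{n}{n-\lambda}\right) + 8\theta\left(\frac{n}{n-\lambda}\right)^{3} \right).
\end{aligned}$$
Substituting the expressions for a and b,
$$\begin{aligned}
|R| &\le \frac{1}{3}\left(\frac{\lambda}{n-\lambda}\right)^{3} \theta^2 \left(1-\frac{\lambda}{n}\right)^{2nz} e^{-\theta(1-\frac{\lambda}{n})^{nz}} \Bigg( \theta + \theta e^{\frac{\lambda\theta}{n}} + 6e^{\frac{\lambda\theta}{n}} + 6e^{\frac{\lambda\theta}{n}}\, \frac{1}{\theta}\left(1-\frac{\lambda}{n}\right)^{-nz} \\
&\qquad + 12\left(\frac{n}{n-\lambda}\right) + 8\theta\left(\frac{n}{n-\lambda}\right)^{3} \Bigg) \\
&\le \frac{\theta^2\lambda^3}{3n^3}\left(1-\frac{\lambda}{n}\right)^{-3} e^{-2\lambda z} e^{-\theta(1-\frac{\lambda}{n})^{nz}} \left( \theta + \theta e^{\frac{\lambda\theta}{n}} + 6e^{\frac{\lambda\theta}{n}} + 12\left(1-\frac{\lambda}{n}\right)^{-1} + 8\theta\left(1-\frac{\lambda}{n}\right)^{-3} \right) \\
&\quad + \frac{2\theta\lambda^3}{n^3}\left(1-\frac{\lambda}{n}\right)^{-3} e^{\frac{\lambda\theta}{n}} e^{-\lambda z} e^{-\theta(1-\frac{\lambda}{n})^{nz}}.
\end{aligned}$$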
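Here $e^{-\theta(1-\frac{\lambda}{n})^{nz}} = e^{\theta e^{-\lambda z} - \theta(1-\frac{\lambda}{n})^{nz}} e^{-\theta e^{-\lambda z}} = e^{\theta[e^{-\lambda z} - (1-\frac{\lambda}{n})^{nz}]} e^{-\theta e^{-\lambda z}}$. Also for |x| < n we
have $e^{x}\left(1 - \frac{x^2}{n}\right) \le \left(1 + \frac{x}{n}\right)^{n}$ and $0 \le e^{x} - \left(1 + \frac{x}{n}\right)^{n} \le \frac{x^2}{n} e^{x}$. Hence, for λz ≤ nz,
$$0 \le e^{-\lambda z} - \left(1 - \frac{\lambda z}{nz}\right)^{nz} \le \frac{(\lambda z)^2}{nz} e^{-\lambda z} = \frac{\lambda^2 z}{n} e^{-\lambda z} \tag{4.1}$$
and $x e^{-ax} = \frac{x}{e^{ax}} \le \frac{x}{1+ax} < \frac{1}{a}$, giving
$$e^{-\theta(1-\frac{\lambda}{n})^{nz}} \le e^{\theta\frac{\lambda^2 z}{n} e^{-\lambda z}} e^{-\theta e^{-\lambda z}} \le e^{\frac{\theta\lambda}{n}} e^{-\theta e^{-\lambda z}}. \tag{4.2}$$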
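The two inequalities (4.1) and (4.2) can be spot-checked numerically. The sketch below does so on a small parameter grid of our own choosing, restricted to λ/n ≤ 1/2; it is a sanity check under that restriction and not a proof. The small tolerances guard against floating-point rounding only.

```python
import math

def check_4_1_and_4_2(theta, lam, n, z):
    """Return True if (4.1) and both inequalities in (4.2) hold at these values."""
    exact = math.exp(-lam * z)                      # e^{-lambda z}
    approx = (1.0 - lam / n) ** (n * z)             # (1 - lambda/n)^{nz}
    gap = exact - approx
    ok_41 = -1e-15 <= gap <= lam**2 * z / n * exact + 1e-15
    lhs = math.exp(-theta * approx)
    mid = math.exp(theta * lam**2 * z / n * exact) * math.exp(-theta * exact)
    rhs = math.exp(theta * lam / n) * math.exp(-theta * exact)
    ok_42 = lhs <= mid * (1.0 + 1e-12) and mid <= rhs * (1.0 + 1e-12)
    return ok_41 and ok_42

def check_grid():
    """Evaluate the checks on an illustrative grid with lambda/n <= 1/2."""
    return all(check_4_1_and_4_2(theta, lam, n, z)
               for theta in (0.5, 2.0, 10.0)
               for lam in (0.5, 1.0, 3.0)
               for n in (6, 50, 1000)
               for z in (0.01, 0.5, 2.0, 10.0))

if __name__ == "__main__":
    print(check_grid())
```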
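Hence
$$\begin{aligned}
|R| &\le \frac{\theta^2\lambda^3}{3n^3}\, e^{\frac{\theta\lambda}{n}} \left(1-\frac{\lambda}{n}\right)^{-3} \left[ \theta + \theta e^{\frac{\lambda\theta}{n}} + 6e^{\frac{\lambda\theta}{n}} + 12\left(1-\frac{\lambda}{n}\right)^{-1} + 8\theta\left(1-\frac{\lambda}{n}\right)^{-3} \right] e^{-2\lambda z-\theta e^{-\lambda z}} \\
&\quad + 2\frac{\theta\lambda^3}{n^3}\left(1-\frac{\lambda}{n}\right)^{-3} e^{\frac{2\theta\lambda}{n}} e^{-\lambda z-\theta e^{-\lambda z}}.
\end{aligned} \tag{4.3}$$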


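Next, we bound the difference in (3.34) as
$$\begin{aligned}
& \left| 2\frac{\lambda^2\theta}{n^2} e^{-\lambda z-\theta e^{-\lambda z}}(\theta e^{-\lambda z} - 1) - \left( e^{-\theta(1-\frac{\lambda}{n})^{nz+1}} - e^{-\theta(1-\frac{\lambda}{n})^{nz}} - e^{-\theta(1-\frac{\lambda}{n})^{nz-1}} + e^{-\theta(1-\frac{\lambda}{n})^{nz-2}} \right) \right| \\
&\le \left| 2\frac{\lambda^2\theta}{n^2} e^{-\lambda z-\theta e^{-\lambda z}}(\theta e^{-\lambda z} - 1) \left( 1 - \left(1-\frac{\lambda}{n}\right)^{-2} \right) \right| + |R| + |R_2|
\end{aligned} \tag{4.4}$$
with
$$\begin{aligned}
R_2 = 2\left(\frac{\lambda}{n-\lambda}\right)^{2} \theta \Bigg[ & \theta\left( \left(1-\frac{\lambda}{n}\right)^{2nz} e^{-\theta(1-\frac{\lambda}{n})^{nz}} - e^{-2\lambda z-\theta e^{-\lambda z}} \right) \\
& - \left( \left(1-\frac{\lambda}{n}\right)^{nz} e^{-\theta(1-\frac{\lambda}{n})^{nz}} - e^{-\lambda z-\theta e^{-\lambda z}} \right) \Bigg].
\end{aligned}$$
We write R_2 = R_{2,1} + R_{2,2} with $R_{2,1} = e^{-2\lambda z-\theta e^{-\lambda z}} - \left(1-\frac{\lambda}{n}\right)^{2nz} e^{-\theta(1-\frac{\lambda}{n})^{nz}}$ and
$R_{2,2} = e^{-\lambda z-\theta e^{-\lambda z}} - \left(1-\frac{\lambda}{n}\right)^{nz} e^{-\theta(1-\frac{\lambda}{n})^{nz}}$. First,
$$|R_{2,1}| \le e^{-\theta(1-\frac{\lambda}{n})^{nz}} \left[ e^{-2\lambda z} - \left(1-\frac{\lambda}{n}\right)^{2nz} + e^{-2\lambda z}\,\theta\left( e^{-\lambda z} - \left(1-\frac{\lambda}{n}\right)^{nz} \right) \right].$$
Here we employed Property 4 in [26]: for all x > 0 we have $\left(1 + \frac{x}{n}\right)^{n} - 1 \le x e^{x}$ and $e^{x} - 1 \le x e^{x}$.
Now using (4.1) and (4.2) we bound R_{2,1} as
$$|R_{2,1}| \le \frac{2\lambda^2 z}{n} e^{\frac{\theta\lambda}{n}} e^{-2\lambda z-\theta e^{-\lambda z}} + \frac{\theta\lambda}{n} e^{\frac{\theta\lambda}{n}} e^{-2\lambda z-\theta e^{-\lambda z}}. \tag{4.5}$$
Similarly for R_{2,2},
$$\left| e^{-\lambda z-\theta e^{-\lambda z}} - \left(1-\frac{\lambda}{n}\right)^{nz} e^{-\theta(1-\frac{\lambda}{n})^{nz}} \right| \le \frac{\lambda^2 z}{n} e^{\frac{\theta\lambda}{n}} e^{-\lambda z-\theta e^{-\lambda z}} + \frac{\theta\lambda}{n} e^{\frac{\theta\lambda}{n}} e^{-\lambda z-\theta e^{-\lambda z}}. \tag{4.6}$$
Hence we bound R_2 using (4.5) and (4.6):
$$\begin{aligned}
|R_2| &\le 4\frac{\theta^2\lambda^4}{n^3}\left(1-\frac{\lambda}{n}\right)^{-2} z e^{\frac{\theta\lambda}{n}} e^{-2\lambda z-\theta e^{-\lambda z}} + 2\frac{\theta^3\lambda^3}{n^3}\left(1-\frac{\lambda}{n}\right)^{-2} e^{\frac{\theta\lambda}{n}} e^{-2\lambda z-\theta e^{-\lambda z}} \\
&\quad + 2\frac{\theta\lambda^4}{n^3}\left(1-\frac{\lambda}{n}\right)^{-2} z e^{\frac{\theta\lambda}{n}} e^{-\lambda z-\theta e^{-\lambda z}} + 2\frac{\theta^2\lambda^3}{n^3}\left(1-\frac{\lambda}{n}\right)^{-2} e^{\frac{\theta\lambda}{n}} e^{-\lambda z-\theta e^{-\lambda z}}.
\end{aligned} \tag{4.7}$$
Combining (4.3), (4.4) and (4.7) and simplifying the upper bound gives the bound as stated in
(3.34). □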

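Proof of Inequality (3.39). For z > 0 we let $a = \theta\left(1 - \frac{\lambda}{n}\right)^{nz}$, $b = \frac{n}{n-\lambda}$ and c = ab, so that for
(3.39), there is a 0 < ϵ < 1/n such that
$$\begin{aligned}
& c(z)g'(z) - d\left(z - \tfrac{1}{n}\right)\left[g(z) - g\left(z - \tfrac{1}{n}\right)\right] \\
&= \left[ \frac{\lambda\theta}{n^2} e^{-\lambda z-\theta e^{-\lambda z}} - \frac{1}{n} t(b) \right] g'(z) + \frac{1}{2n^2}\, t(b)\, g''(z + \epsilon),
\end{aligned}$$
where $t(b) := e^{-\theta(1-\frac{\lambda}{n})^{nz-1}} - e^{-\theta(1-\frac{\lambda}{n})^{nz-2}} = e^{-c} - e^{-cb}$. Using Taylor expansion around b = 1,
for some 1 < ξ < b,
$$t(b) = \frac{1}{n}\left(\frac{n}{n-\lambda}\right)^{2} \theta\lambda\, e^{-\lambda z - \frac{n\theta}{n-\lambda} e^{-\lambda z}} - \frac{1}{2}(b - 1)^2 c^2 e^{-c\xi} + R_3,$$
where
$$R_3 = \frac{\theta\lambda}{n}\left(\frac{n}{n-\lambda}\right)^{2} \left[ \left(1-\frac{\lambda}{n}\right)^{nz} e^{-\frac{n\theta}{n-\lambda}(1-\frac{\lambda}{n})^{nz}} - e^{-\lambda z-\frac{n\theta}{n-\lambda} e^{-\lambda z}} \right].$$
Hence
$$\begin{aligned}
& \left| c(z)g'(z) - d\left(z - \tfrac{1}{n}\right)\left[g(z) - g\left(z - \tfrac{1}{n}\right)\right] \right| \\
&\le \left| \frac{\lambda\theta}{n^2} e^{-\lambda z-\theta e^{-\lambda z}} \left\{ 1 - e^{\left(1-\frac{n}{n-\lambda}\right)\theta e^{-\lambda z}} \right\} g'(z) \right| + \left| \frac{1}{2n}(b - 1)^2 c^2 e^{-c\xi} g'(z) \right| \\
&\quad + \left| \frac{1}{n} R_3\, g'(z) \right| + \left| \frac{1}{2n^2}\, t(b)\, g''(z + \epsilon) \right|.
\end{aligned} \tag{4.8}$$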

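We bound the terms in this expression separately. First, we again use that for x ≥ 0 we have
$x e^{-x} \le 1 - e^{-x} \le x$ to obtain $1 - e^{-\frac{\lambda}{n-\lambda}\theta e^{-\lambda z}} \le \frac{\lambda}{n-\lambda}\theta e^{-\lambda z}$, and hence
$$\left| \frac{\lambda\theta}{n^2} e^{-\lambda z-\theta e^{-\lambda z}} \left\{ 1 - e^{\left(1-\frac{n}{n-\lambda}\right)\theta e^{-\lambda z}} \right\} g'(z) \right| \le \left(1-\frac{\lambda}{n}\right)^{-1} \frac{\lambda^2\theta^2}{n^3} e^{-\lambda z-\theta e^{-\lambda z}} |g'(z)|. \tag{4.9}$$
Using $\frac{1}{2}(b - 1)^2 = \frac{\lambda^2}{2(n-\lambda)^2}$ and $\frac{\theta}{1-\lambda/n}\left(1 - \frac{\lambda}{n}\right)^{nz} \ge \theta\left(1 - \frac{\lambda}{n}\right)^{nz}$, as well as $1 - \frac{\lambda}{n} \le 1$ and (4.2),
we have
$$\begin{aligned}
\left| \frac{1}{2n}(b - 1)^2 c^2 e^{-c\xi} g'(z) \right| &\le \frac{\lambda^2\theta^2}{2n^3}\left(\frac{n}{n-\lambda}\right)^{4} e^{-2\lambda z} e^{-\theta(1-\frac{\lambda}{n})^{nz}} |g'(z)| \\
&\le \frac{\lambda^2\theta^2}{2n^3}\left(1-\frac{\lambda}{n}\right)^{-4} e^{\frac{\theta\lambda}{n}} e^{-\lambda z-\theta e^{-\lambda z}} |g'(z)|.
\end{aligned} \tag{4.10}$$
Similarly for x ≥ 0, $e^{x} - 1 \le x e^{x}$ gives
$$t(b) = e^{-bc}\left(e^{c(b-1)} - 1\right) \le c(b - 1)e^{-c} \le \frac{\theta\lambda}{n}\left(1-\frac{\lambda}{n}\right)^{-2} e^{\frac{\theta\lambda}{n}} e^{-\lambda z-\theta e^{-\lambda z}}. \tag{4.11}$$
In order to bound R_3 we use (4.6) with $\frac{n\theta}{n-\lambda}$ instead of θ to obtain
$$\begin{aligned}
& \left| e^{-\lambda z-\frac{n\theta}{n-\lambda} e^{-\lambda z}} - \left(1-\frac{\lambda}{n}\right)^{nz} e^{-\frac{n\theta}{n-\lambda}(1-\frac{\lambda}{n})^{nz}} \right| \\
&\le \frac{\lambda^2 z}{n} e^{\frac{\theta\lambda}{n-\lambda}} e^{-\lambda z-\frac{n\theta}{n-\lambda} e^{-\lambda z}} + \left(\frac{n}{n-\lambda}\right) \frac{\theta\lambda}{n} e^{\frac{\theta\lambda}{n-\lambda}} e^{-\lambda z-\frac{n\theta}{n-\lambda} e^{-\lambda z}} \\
&\le \frac{\lambda^2 z}{n} e^{\frac{\theta\lambda}{n-\lambda}} e^{-\lambda z-\theta e^{-\lambda z}} + \left(\frac{n}{n-\lambda}\right) \frac{\theta\lambda}{n} e^{\frac{\theta\lambda}{n-\lambda}} e^{-\lambda z-\theta e^{-\lambda z}}.
\end{aligned}$$
Hence
$$\left| \frac{1}{n} R_3\, g'(z) \right| \le \frac{\theta\lambda^3}{n^3}\left(1-\frac{\lambda}{n}\right)^{-2} e^{\frac{\theta\lambda}{n-\lambda}} e^{-\lambda z-\theta e^{-\lambda z}} |g'(z)| \left( z + \frac{n\theta}{\lambda(n - \lambda)} \right). \tag{4.12}$$
Combining (4.9), (4.10), (4.11) and (4.12) gives for (4.8) that
$$\begin{aligned}
& \left| c(z)g'(z) - d\left(z - \tfrac{1}{n}\right)\left[g(z) - g\left(z - \tfrac{1}{n}\right)\right] \right| \\
&\le \frac{\lambda^2\theta^2}{n^3}\left(1-\frac{\lambda}{n}\right)^{-1} e^{-\lambda z-\theta e^{-\lambda z}} |g'(z)| + \frac{\lambda^2\theta^2}{2n^3}\left(1-\frac{\lambda}{n}\right)^{-4} e^{\frac{\theta\lambda}{n}} e^{-\lambda z-\theta e^{-\lambda z}} |g'(z)| \\
&\quad + \frac{\theta\lambda^2}{n^3}\left(1-\frac{\lambda}{n}\right)^{-2} e^{\frac{\theta\lambda}{n-\lambda}} e^{-\lambda z-\theta e^{-\lambda z}} |g'(z)| \left( \lambda z + \theta\left(1-\frac{\lambda}{n}\right)^{-1} \right) \\
&\quad + \frac{\theta\lambda}{2n^3}\left(1-\frac{\lambda}{n}\right)^{-2} e^{\frac{\theta\lambda}{n}} e^{-\lambda z-\theta e^{-\lambda z}} |g''(z + \epsilon)|. \qquad \square
\end{aligned}$$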


Acknowledgements. We thank Christina Goldschmidt, David Steinsaltz and Tadas Temcinas for
helpful discussions. We would also like to thank the anonymous reviewers for suggestions which
have led to an overall improved paper.
Funding Information. AF is supported by the Commonwealth Scholarship Commission, United
Kingdom. GR is supported in part by EPSRC grants EP/T018445/1 and EP/R018472/1.

References
[1] Aarset, M. V. (1987). How to identify a bathtub hazard rate. IEEE Transactions on Reliability
R-36, 106–108.
[2] Adamidis, K. and Loukas, S. (1998). A lifetime distribution with decreasing failure rate.
Statistics & Probability Letters 39, 35–42.
[3] Anastasiou, A., Barp, A., Briol, F.-X., Ebner, B., Gaunt, R. E., Ghaderinezhad,
F., Gorham, J., Gretton, A., Ley, C., Liu, Q. et al. (2023). Stein’s method meets
computational statistics: A review of some recent developments. Statistical Science 38, 120–139.
[4] Arratia, R., Goldstein, L. and Gordon, L. (1989). Two moments suffice for Poisson
approximations: the Chen-Stein method. The Annals of Probability 9–25.
[5] Barbour, A. D., Holst, L. and Janson, S. (1992). Poisson Approximation. The Clarendon
Press, Oxford, UK.
[6] Barbour, A. D., Ross, N. and Zheng, G. (2021). Stein’s method, smoothing and functional
approximation. arXiv:2106.01564.
[7] Barreto-Souza, W., de Morais, A. L. and Cordeiro, G. M. (2011). The Weibull-geometric
distribution. Journal of Statistical Computation and Simulation 81, 645–657.
[8] Basu, A. P. and Klein, J. P. (1982). Some recent results in competing risks theory. Lecture
Notes-Monograph Series 2, 216–229.
[9] Cancho, V. G., Louzada-Neto, F. and Barriga, G. D. C. (2011). The Poisson-exponential
lifetime distribution. Computational Statistics & Data Analysis 55, 677–686.
[10] Chen, L. H., Goldstein, L. and Shao, Q.-M. (2011). Normal Approximation by Stein’s
Method. Springer, Heidelberg, London.
[11] Chen, L. H. Y. (1975). Poisson approximation for dependent trials. The Annals of Probability
3, 534–545.
[12] Fatima, A. and Roohi, A. (2015). Extended Poisson exponential distribution. Pakistan
Journal of Statistics and Operation Research 361–375.
[13] Fatima, A. and Roohi, A. (2015). The Generalized Poisson-exponential distribution. Journal
of ISOSS 1, 103–118.
[14] Germain, G. and Swan, Y. (2023). One-dimensional Stein’s method with bespoke derivatives.
arXiv preprint arXiv:2310.03190.
[15] Goldstein, L. and Reinert, G. (2013). Stein’s method for the beta distribution and the
Pólya-Eggenberger urn. Journal of Applied Probability 50, 1187–1205.
[16] Gupta, R. D. and Kundu, D. (1999). Theory & methods: Generalized exponential distributions.
Australian & New Zealand Journal of Statistics 41, 173–188.
[17] Jayakumar, K., Babu, M. G. and Bakouch, H. S. (2021). General classes of complementary
distributions via random maxima and their discrete version. Japanese Journal of Statistics
and Data Science 4, 797–820.
[18] Ley, C., Reinert, G. and Swan, Y. (2017). Stein’s method for comparison of univariate
distributions. Probability Surveys 14, 1–52.
[19] Ley, C. and Swan, Y. (2013). Stein’s density approach and information inequalities.
Electronic Communications in Probability 18, 1–14.
[20] Mijoule, G., Raič, M., Reinert, G. and Swan, Y. (2023). Stein’s density method for
multivariate continuous distributions. Electronic Journal of Probability 28, 1–40.


[21] Nourdin, I. and Peccati, G. (2012). Normal Approximations with Malliavin Calculus:
From Stein’s Method to Universality. Cambridge University Press, Cambridge, UK.
[22] Peköz, E. A. (1996). Stein’s method for geometric approximation. Journal of Applied
Probability 33, 707–713.
[23] Peköz, E. A. and Röllin, A. (2011). New rates for exponential approximation and the
theorems of Rényi and Yaglom. The Annals of Probability 39, 587–608.
[24] Peköz, E. A., Röllin, A. and Ross, N. (2013). Total variation error bounds for geometric
approximation. Bernoulli 19, 610–632.
[25] Reinert, G., Schbath, S. and Waterman, M. S. (2000). Probabilistic and statistical
properties of words: an overview. Journal of Computational Biology 7, 1–46.
[26] Salas, A. H. (2012). The exponential function as a limit. Applied Mathematical Sciences 6,
4519–4526.
[27] Silva, R. B., Barreto-Souza, W. and Cordeiro, G. M. (2010). A new distribution with
decreasing, increasing and upside-down bathtub failure rate. Computational Statistics & Data
Analysis 54, 935–944.
[28] Stein, C. (1986). Approximate Computation of Expectations. Institute of Mathematical
Statistics, Hayward, California.
[29] Stein, C., Diaconis, P., Holmes, S. and Reinert, G. (2004). Use of exchangeable pairs
in the analysis of simulations. Lecture Notes-Monograph Series 46, 1–26.
[30] Tojeiro, C., Louzada, F., Roman, M. and Borges, P. (2014). The complementary
Weibull geometric distribution. Journal of Statistical Computation and Simulation 84, 1345–1362.
