
A MULTIPLE COUPON COLLECTION PROCESS

AND ITS MARKOV EMBEDDING STRUCTURE

ELLEN BAAKE AND MICHAEL BAAKE



Abstract. The embedding problem of Markov transition matrices into Markov semigroups
is a classic problem that has regained a lot of impetus and activity in recent years. We consider
it here for the following generalisation of the well-known coupon collection process: from
a finite set of distinct objects, a subset is drawn repeatedly according to some probability
distribution, independently and with replacement, and each time united with the set of
objects sampled so far. We derive and interpret properties and explicit conditions for the
resulting discrete-time Markov chain to be representable within a semigroup or a flow of a
continuous-time process of the same type.

1. Introduction
The classic coupon collector’s problem (CCP) starts from a set S = {1, 2, . . . , N } of distinct
types of objects (coupons). The objects are hidden in breakfast cereal boxes — exactly one in
each box, its type distributed according to U (S), the uniform distribution on S, independently
for every box. The collector opens one box at a time, that is, he samples with replacement
from U (S), and stops when he has all types in S. One is typically interested in the distribution
of the number of boxes opened, that is, the number of samples required.
The classic CCP has been generalised in various ways. Neal [15] investigates the situation
where the objects may have different frequencies (and may be absent), so that U(S) is replaced
by a discrete distribution p = (p_i)_{i∈S∪{0}} on S ∪ {0} with 0 < p_i < 1 for i ∈ S ∪ {0} and
∑_{i=0}^{N} p_i = 1. Schilling and Henze [17] allow for a fixed number 1 ≤ k < N of different objects
per box, uniformly on the set of subsets of S of size k, or slightly non-uniformly.
Here, we consider an even more general setting, the multiple coupon collection process
(MCCP), which we define as follows. Every box contains a random subset of S, where subset
K ⊆ S is present with probability p_K, with ∑_{K⊆S} p_K = 1, again independently for every box.
Our interest here is in the discrete-time Markov chain (X_n)_{n∈ℕ₀} on the set of subsets of S,
where X_n is the set of types collected until step n. While we do not consider the usual stopping
time problem, we are concerned with whether X is embeddable, that is, whether there is a
continuous-time Markov chain (Y_t)_{t≥0} such that P(Y_1 = J | Y_0 = I) = P(X_1 = J | X_0 = I)
for all I, J ⊆ S. Put differently, we ask under which conditions on the distribution
p = (p_K)_{K⊆S} the Markov transition matrix of X occurs in a time-homogeneous Markov
semigroup of the form {e^{tQ} : t ≥ 0} for some Markov generator Q.

2010 Mathematics Subject Classification. 60J10, 60J27, 15A16, 06A07.


Key words and phrases. Markov chains, coupon collection, embedding problem, incidence algebra.

More generally, one can
also consider the time-inhomogeneous situation of a general Markov flow; compare [13, 5] and
references therein. For our special matrix family, this extension does not seem to provide
extra insight, wherefore we only touch upon it briefly.
Concrete and exhaustive criteria for a discrete-time Markov chain to be embeddable this
way are known for state spaces of up to cardinality 4; see [6, 4] and references therein. Beyond
this, they are restricted to specific families of Markov matrices; see [3, 5] for examples. The
MCCP enriches this collection by an interesting member for which a complete, yet non-trivial
answer can be given. This is partly due to the fact that the transition matrix is of triangular
form, and partly due to a fruitful combination of algebraic, combinatorial and probabilistic
tools. The MCCP thus adds relevant insight into the embeddability problem, a topic that
has a long history [10, 14, 13] and currently receives increased attention [8, 9, 3].
The paper is organised as follows. We begin by recalling some material from combinatorics
and linear algebra in Section 2. Then, the MCCP is described in more detail in Section 3, with
some emphasis on the defining properties and their consequences. The corresponding families
of Markov matrices and generators are analysed in Section 4, which establishes the core for
the embedding problem. Section 5 then dives a little deeper into the algebraic, combinatorial
and analytic structures via the underlying moduli spaces for Markov matrices and generators.
Finally, Section 6 analyses the positivity conditions needed for embeddability, which leads to
a clear answer for this family of processes.

2. Preliminaries
Below, we need some rather classic notions and results from combinatorics and linear
algebra. Since they will be combined in a somewhat unusual manner, we recall a few properties
of partitions and matrices, while introducing our notation at the same time.
Let S be a finite set, and consider the lattice P(S) of partitions of S; see [1] for background.
Here, we write a partition of S as A = {A_1, . . . , A_m}, where m = |A| is the number of its
(non-empty) parts (also called blocks), and one has A_i ∩ A_j = ∅ for all i ≠ j together with
A_1 ∪ · · · ∪ A_m = S. Below, we shall need a specific combinatorial identity, which we state and
prove for lack of reference and convenience of the reader.

Fact 2.1. If S is a non-empty finite set and P(S) its partition lattice, one has the identity
∑_{A∈P(S)} (−1)^{|A|} |A|! = (−1)^{|S|}.

Proof. This can be proved by induction on the cardinality of S. When |S| = 1, there is only
the trivial partition of S, and both sides of the formula evaluate to −1.
Assume that the identity holds for |S| = n and consider the set S′ = S ∪ {e} with one new
element, e, which now allows the following induction argument. The partitions of S′ come in
two types. Either {e} forms a part of its own, and thus augments a partition of S by this
part; or we have a partition of S where e is added to one of its existing parts, for which there
are as many choices as there are parts.
This means that we can evaluate our new sum via two separate sums over P(S) as

∑_{B∈P(S′)} (−1)^{|B|} |B|!  =  ∑_{A∈P(S)} (−1)^{|A|+1} (|A| + 1)!  +  ∑_{A∈P(S)} (−1)^{|A|} |A|! · |A|

                          =  − ∑_{A∈P(S)} (−1)^{|A|} |A|!  =  −(−1)^{|S|}  =  (−1)^{|S′|},

which completes the induction step and thus the proof. □
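For readers who like to see such identities confirmed by machine, here is a minimal Python sketch; the recursive helper partitions is ours (not from any library) and mirrors exactly the two cases of the induction above. It is an illustration only, not part of the formal development.

    from math import factorial

    def partitions(s):
        # Yield all partitions of the non-empty list s as lists of blocks.
        if len(s) == 1:
            yield [s]
            return
        first, rest = s[0], s[1:]
        for part in partitions(rest):
            # either `first` joins one of the existing blocks ...
            for i in range(len(part)):
                yield part[:i] + [[first] + part[i]] + part[i + 1:]
            # ... or it forms a block of its own
            yield [[first]] + part

    for n in range(1, 8):
        lhs = sum((-1) ** len(A) * factorial(len(A))
                  for A in partitions(list(range(n))))
        assert lhs == (-1) ** n        # Fact 2.1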

One relevant lattice is the family of subsets of a given finite set with the partial order of set
inclusion. Here, we consider S = {1, 2, . . . , N} and its 2^N subsets, which form the power set
lattice, denoted by 2^S, where we use ⊆ for the partial order between two sets. Here, I ⊂ J is
then used for I ⊆ J with I ≠ J, and we write J − I for J \ I whenever I ⊆ J, for simplicity.
Further, |K| denotes the cardinality of K, and K̄ = S − K is the complement of K in S.
The Möbius function µ_M of this lattice is given by µ_M(I, J) = (−1)^{|J−I|} when I ⊆ J and
by µ_M(I, J) = 0 otherwise. Further, the Möbius inversion formula reads

(1)   g(K) = ∑_{I⊆K} f(I)   ⟺   f(K) = ∑_{I⊆K} (−1)^{|K−I|} g(I),

where f and g are arbitrary functions on 2^S; see [1, Ch. IV.2 and Ex. 4.15.IV] for more.
There is an interesting connection between the lattice 2^S of subsets of a non-empty finite
set S and the partition lattice of S, as follows.

Lemma 2.2. Let S be a non-empty finite set and let f : 2^S \ {∅} → ℝ₊ be a strictly
positive function. Assume that f is extended to a function on the partitions of S by setting
f(A) = ∏_{A∈A} f(A) for any A ∈ P(S). Then, one has the identity

∑_{A∈P(S)} (−1)^{|A|−1} (|A| − 1)! log f(A)  =  ∑_{∅≠A⊆S} (−1)^{|S−A|} log f(A).
Proof. Observing that log f(A) = ∑_{A∈A} log f(A), the order of the double summation on the
left-hand side can be changed, which turns it into

∑_{∅≠A⊆S} log f(A) ∑_{A∈A∈P(S)} (−1)^{|A|−1} (|A| − 1)!.

Here, the second sum is 1 when A = S, and otherwise turns into a sum over all partitions of
the form {A, B_1, . . . , B_m}, where B = {B_1, . . . , B_m} is a partition of S − A. Adjusting the
count of the number of parts, the second sum then becomes

∑_{B∈P(S−A)} (−1)^{|B|} |B|!  =  (−1)^{|S−A|}

by Fact 2.1. Putting the two cases together establishes the claimed identity. □

Some of our results would naturally be formulated in terms of the (complex) Jordan normal
form (JNF) of a given matrix. Since we will later see that we only need to deal with
diagonalisable matrices, we simplify the tools and methods to this setting. Such matrices have a JNF
with only trivial elementary Jordan blocks. We call a matrix B simple if its eigenvalues are
distinct, in which case B is automatically diagonalisable. When we have multiple eigenvalues,
we call σ(B), the spectrum of B, degenerate.
Markov matrices have a special spectral structure as follows, which we recall from [4].
Throughout, we use M_d to denote the subset of Markov matrices in Mat(d, ℝ), which is a
closed convex set, and A_d^{(0)} for the set of all matrices from Mat(d, ℝ) with zero row sums.
They form a non-unital algebra, because 1 is not an element of it, and no other two-sided
unit exists in it. All Markov generators are elements of A_d^{(0)}.

Fact 2.3. For all M ∈ Md , one has 1 ∈ σ(M ) together with equal algebraic and geometric
multiplicity. In particular, there is no non-trivial Jordan block for λ = 1.
Further, the corresponding statement holds for generators, which is to say that any Markov
generator has 0 as an eigenvalue, again with no non-trivial Jordan block for it. □

Given a real matrix B, we need to know when it possesses a real matrix logarithm, that is,
a real matrix R such that B = e^R holds, where the exponential of a matrix is defined by the
convergent series e^R = ∑_{n=0}^{∞} R^n/n!. Further, we ask when such a real logarithm is unique.
The following important characterisation follows from [7, Thms. 1 and 2].

Fact 2.4. A diagonalisable matrix B ∈ Mat(d, ℝ) has a real logarithm if and only if the
following two conditions are satisfied.
(1) The matrix B is non-singular.
(2) Each negative eigenvalue of B occurs with even multiplicity.
Further, when B has simple spectrum with positive eigenvalues only, the real logarithm of B
is unique. □
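Both conditions are easy to witness numerically. In the sketch below (Python with scipy, purely illustrative), the first matrix has a simple negative eigenvalue, so its principal logarithm is necessarily complex, while −1 in dimension two, where the eigenvalue −1 has even multiplicity, is the exponential of a real rotation generator.

    import numpy as np
    from scipy.linalg import expm, logm

    B = np.diag([2.0, -3.0])            # simple negative eigenvalue
    R = logm(B)                         # principal logarithm is complex here
    assert np.iscomplexobj(R) and not np.allclose(R.imag, 0.0)

    R2 = np.array([[0.0, -np.pi],       # rotation generator: a real logarithm of -1
                   [np.pi, 0.0]])
    assert np.allclose(expm(R2), -np.eye(2))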

Finally, we need the spectral mapping theorem (SMT). When B is a matrix with spectrum
σ(B), and h(x) is any polynomial in one variable, also h(B) is well defined, and the general
SMT [16, Thm. 10.33] applied to matrices states that

(2)   σ(h(B)) = h(σ(B)) := {h(λ) : λ ∈ σ(B)},
which holds for the spectrum including multiplicities. This extends easily to power series of
matrices when all eigenvalues lie inside the disk of convergence. This is all we will need below;
see [11, 12] for further background.

3. The multiple coupon collection process


Here, we use the set S = {1, 2, . . . , N} for the N objects and the corresponding lattice, 2^S,
as introduced above. Now, let Z ⊆ S be the (random) set of objects sampled in one step,
and let its distribution be given by p = (p_K)_{K⊆S}, that is, P(Z = K) = p_K ≥ 0 for K ⊆ S,
with ∑_{K⊆S} p_K = 1. As mentioned in the introduction, the MCCP then is the discrete-time
Markov chain (X_n)_{n∈ℕ₀} on the set of subsets of S, where X_n is the set of distinct objects
sampled until step n. Note that no record is kept of the number of times an object occurs in
the process. Below, we assume X_0 = ∅ unless stated otherwise. Clearly, X_{n+1} = X_n ∪ I with
probability p_I for I ⊆ S, independently for every n and independently of the current state
X_n. This gives the transition probabilities M_{IJ} = P(X_{n+1} = J | X_n = I) as

(3)   M_{IJ} = ∑_{I∪K=J} p_K

for all I, J ⊆ S, where K runs through all subsets of J subject to the given condition. The
relation (3) includes the vanishing condition M_{IJ} = 0 for I ⊈ J, because the empty sum is
zero by convention. In particular, M is triangular, with M_{SS} = 1. Let us formalise this as follows.

Definition 3.1. Let S = {1, 2, . . . , N}, for an arbitrary but fixed N ∈ ℕ. Then, a matrix
M ∈ Mat(2^N, ℝ) is said to possess Property PM, or to be a PM matrix for short, if there is
a probability vector p = (p_K)_{K⊆S} such that Eq. (3) holds for all I, J ⊆ S.

Despite the origin of (3) from the MCCP, Definition 3.1 does not require M to be a Markov
matrix in the first place. The Markov property should then be viewed as a consequence of
the parametrisation. Indeed, we have the following simple consistency result.

Fact 3.2. Any PM matrix is a Markov matrix.

Proof. If p is a probability vector, we have p_K ≥ 0 for all K ⊆ S, and then M_{IJ} ≥ 0 for all
I, J ⊆ S by (3). Since we also have ∑_{K⊆S} p_K = 1, we can calculate the row sums of M as

∑_{J⊆S} M_{IJ}  =  ∑_{J⊆S} ∑_{I∪K=J} p_K  =  ∑_{I⊆J⊆S} ∑_{L⊆I} p_{L ∪̇ (J−I)}  =  ∑_{K⊆S} p_K  =  1,

which is true for any I ⊆ S. □
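To make the parametrisation (3) concrete, here is a minimal Python sketch (using numpy; the helper names subsets and pm_matrix are ours, not from any library) that builds M from a given probability vector p and confirms Fact 3.2 numerically.

    import itertools
    import numpy as np

    def subsets(S):
        # all subsets of S as frozensets, the empty set first
        return [frozenset(c) for n in range(len(S) + 1)
                for c in itertools.combinations(sorted(S), n)]

    def pm_matrix(S, p):
        # PM matrix from Eq. (3): M[I, J] = sum of p_K over all K with I ∪ K = J
        subs = subsets(S)
        idx = {K: i for i, K in enumerate(subs)}
        M = np.zeros((len(subs), len(subs)))
        for I in subs:
            for K in subs:
                M[idx[I], idx[I | K]] += p[K]
        return M, subs

    S = {1, 2}                                  # a dependent example
    p = {frozenset(): 0.4, frozenset({1}): 0.2,
         frozenset({2}): 0.2, frozenset({1, 2}): 0.2}
    M, subs = pm_matrix(S, p)
    assert np.allclose(M.sum(axis=1), 1.0)      # Fact 3.2: M is Markov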


The class of PM matrices precisely consists of the MCCP Markov matrices. Clearly, Prop-
erty PM is preserved under convex combinations, as we shall exploit below and in Section 5.
Due to the bijective parametrisation by probability vectors, the set of PM matrices, for any
fixed set S, has local dimension 2|S| − 1. All such matrices have a rather specific triangular
form, and can be interpreted as elements of the incidence algebra [1, 18] of 2S , which manifests
itself in strong properties due to the underlying lattice structure.

Example 3.3. Let us look at the elementary special case of each object i ∈ S appearing
independently with probability πi , where we assume 0 < πi < 1 to avoid degeneracies. Let
us mention in passing that this case also appears naturally in the context of genetics, where
the objects are the links between the sites in a sequence, each of which is cut by genetic
recombination, independently in every generation and independently of each other [2].
For K ⊆ S, this gives

p_K = ∏_{i∈K} π_i ∏_{j∈K̄} (1 − π_j)

and thus the discrete-time Markov matrix M with entries

(4)   M_{IJ} = P(X_1 = J | X_0 = I) = ∏_{i∈J−I} π_i ∏_{j∈S−J} (1 − π_j),

where (X_n)_{n∈ℕ₀} is the corresponding discrete-time Markov chain. The derivation of (4)
employed the identity 1 = ∏_{i∈I} (π_i + (1 − π_i)) = ∑_{K⊆I} ∏_{i∈K} π_i ∏_{j∈I−K} (1 − π_j).
The corresponding continuous-time process, by an educated guess or insight from the
existing literature, consists of independent Poisson processes, with rate r_i = −log(1 − π_i) at
site i ∈ S. This gives e^{−r_i} = 1 − π_i and describes the appearance of links (or coupons) at
time 1 via the probabilities

P(Y_1 = J | Y_0 = I) = ∏_{i∈J−I} (1 − e^{−r_i}) ∏_{j∈S−J} e^{−r_j},

where (Y_t)_{t≥0} now is the Markov chain in continuous time. This matches the discrete-time
formula from (4). ♦
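This match is easy to confirm numerically. The sketch below (reusing subsets and pm_matrix from the sketch after Fact 3.2, together with scipy's matrix exponential; the explicit construction of Q is our translation of the Poisson-process description) checks that e^Q reproduces the matrix from (4) for N = 2.

    import numpy as np
    from scipy.linalg import expm

    pi = {1: 0.3, 2: 0.5}                        # appearance probability per site
    S = {1, 2}
    p = {K: np.prod([pi[i] if i in K else 1.0 - pi[i] for i in sorted(S)])
         for K in subsets(S)}
    M, subs = pm_matrix(S, p)

    # generator of independent Poisson processes with rates r_i = -log(1 - pi_i)
    rates = {i: -np.log(1.0 - pi[i]) for i in S}
    idx = {K: k for k, K in enumerate(subs)}
    Q = np.zeros_like(M)
    for I in subs:
        for i in S - I:
            Q[idx[I], idx[I | {i}]] = rates[i]
        Q[idx[I], idx[I]] = -sum(rates[i] for i in S - I)

    assert np.allclose(expm(Q), M)               # the time-1 flow matches Eq. (4)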

While this example looks simple, the connection between discrete- and continuous-time
processes is considerably more complex in the presence of dependencies, and requires a more
detailed investigation.

4. Markov matrices and generators for MCCPs


Let us analyse the structure of PM matrices in some detail, and then derive the corresponding
notion for Markov generators (and beyond). This will establish a rather strong algebraic
structure that is extremely helpful for dealing with the embedding problem.

Lemma 4.1. If M is a PM matrix with parametrising probability vector p, any power M^n
with n ∈ ℕ is a PM matrix as well, then with parameter vector p^{(n)} = pM^{n−1}, whose entries
are

p_K^{(n)} = (pM^{n−1})_K = ∑_{U_1,...,U_n : U_1∪···∪U_n = K}  ∏_{i=1}^{n} p_{U_i}.

Proof. The probability vector p coincides with the row (M_{∅K})_{K⊆S} of the Markov matrix M.
The corresponding row of M^n clearly is p^{(n)} = pM^{n−1}, and we can inductively calculate

(5)   p_K^{(n+1)} = (p^{(n)} M)_K = ∑_{I⊆K} p_I^{(n)} M_{IK}
                 = ∑_{I⊆K} ( ∑_{U_1,...,U_n : U_1∪···∪U_n = I} ∏_{i=1}^{n} p_{U_i} ) ∑_{I∪J=K} p_J
                 = ∑_{U_1,...,U_n,J : U_1∪···∪U_n∪J = K} p_J ∏_{i=1}^{n} p_{U_i}
                 = ∑_{U_1,...,U_{n+1} : U_1∪···∪U_{n+1} = K} ∏_{i=1}^{n+1} p_{U_i},

which establishes the claimed formula for the p^{(n)}.



It remains to prove that p^{(n)} is the probability vector needed for the validity of Property
PM for M^n. This is clear for n = 1 by assumption, so we can once again proceed inductively
from n to n + 1, this time via

(M^{n+1})_{IJ} = ∑_{I⊆K⊆J} (M^n)_{IK} M_{KJ} = ∑_{I⊆K⊆J} ∑_{I∪H=K} p_H^{(n)} ∑_{K∪U=J} p_U
             = ∑_{H,U⊆J : I∪H∪U=J} p_H^{(n)} p_U = ∑_{I∪K=J} p_K^{(n+1)},

where the last step used the relations for the p^{(n+1)} derived in (5). □
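Continuing the numerical sketch from Section 3, Lemma 4.1 can be checked directly (an illustration under our naming conventions): M^n must coincide with the PM matrix parametrised by pM^{n−1}.

    pvec = np.array([p[K] for K in subs])        # p is the ∅-row of M
    for n in range(2, 5):
        Mn = np.linalg.matrix_power(M, n)
        pn = pvec @ np.linalg.matrix_power(M, n - 1)
        Mn_from_pn, _ = pm_matrix(S, dict(zip(subs, pn)))
        assert np.allclose(Mn, Mn_from_pn)       # Lemma 4.1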

Next, let us look at the property analogous to PM at the level of rate matrices. To do
so, we consider A = M − 1, which is a rate matrix in the incidence algebra (its off-diagonal
entries are non-negative and its row sums are zero, as are all elements A_{IJ} with I ⊈ J).
What is more, it inherits some of Property PM as follows. For I ≠ J, we have A_{IJ} = M_{IJ}
and Eq. (3) applies, while the diagonal elements are fixed by the zero row sum condition, so
A_{II} = −∑_{I⊂J} A_{IJ} holds for all I ⊆ S.
This suggests that, given S, we can make a (preliminary) definition of what we will call
Property PG, for all elements B of the incidence algebra with zero row sums, by saying that
a real vector r = (r_K)_{∅≠K⊆S} of dimension 2^{|S|} − 1 exists such that

(6)   B_{IJ} = ∑_{I∪K=J} r_K  for I ≠ J,  together with  B_{II} = −∑_{I⊂J} B_{IJ},

holds for all I, J ⊆ S. Then, A = M − 1 satisfies Property PG with r_K = p_K for ∅ ≠ K ⊆ S,
and all row sums are zero. Note that Property PG is linear in the sense that any real linear
combination of PG matrices is another matrix with this property. In other words, the
PG matrices form a vector space over ℝ. Note also that this property is not restricted to rate
matrices. A little later, we shall see how one can consistently add a value for r_∅, which then
gives access to more powerful algebraic methods, but we do not need this for our present discussion
and thus first continue with the conditions from (6).

Lemma 4.2. Let M be a PM matrix and set A = M − 1. Then, for every k ∈ ℕ, the
matrix A^k has zero row sums and possesses Property PG according to (6), with a unique real
parameter vector r^{(k)} of dimension 2^{|S|} − 1.

Proof. It is clear that A has zero row sums, and writing A^k = A^{k−1}A for k > 1 can be used
to verify this property for all A^k.
Property PG was derived above for the matrix A, when we motivated the notion. It is also
satisfied by every matrix of the form M^k − 1, by Lemma 4.1, with a parameter vector that
derives from p^{(k)} in the obvious way. From here, we get Property PG for A^k via

A^k = ∑_{m=1}^{k} \binom{k}{m} (−1)^{k−m} (M^m − 1),

which easily follows from a double application of the binomial formula, together with the
linearity of Property PG mentioned earlier. The uniqueness of r^{(k)} follows from the relation
r_K^{(k)} = (A^k)_{∅K} for K ≠ ∅. □

At this point, we can state the situation for PM matrices as follows.

Proposition 4.3. If M is a non-singular PM matrix, it possesses a real logarithm, R, which
is an element of the incidence algebra with zero row sums and real spectrum.
Further, R possesses Property PG as in (6) with a unique real parameter vector r, and R
is a Markov generator if and only if r has non-negative entries only.

Proof. Being a PM matrix implies that all eigenvalues of M, which are its diagonal elements,
are non-negative numbers and bounded by 1. Since det(M) ≠ 0, we obtain σ(M) ⊂ (0, 1].
With A = M − 1, one real logarithm can be given as the principal matrix logarithm [11] via
the power series

(7)   R = log(1 + A) = ∑_{k=1}^{∞} ((−1)^{k−1}/k) A^k,

which is convergent because the spectral radius of A satisfies ϱ_A < 1. The matrix R, by an
application of the Cayley–Hamilton theorem, is a linear combination of A and finitely many
of its (positive) powers, so clearly a matrix with zero row sums.
Due to the triangular structure of R, which is inherited from that of A and its powers,
all eigenvalues of R are diagonal elements and thus real. Further, R is a PG matrix, by
Lemma 4.2, with a real parameter vector r that is unique. Since r_K = R_{∅K} for K ≠ ∅, we
see that R can only be a Markov generator if all these entries are non-negative. If so, Eq. (6)
then implies that all off-diagonal elements of R are non-negative as well. □

We are now in a position to further analyse the embedding problem, which is significantly
simplified in the sense that it suffices to check the entries of a vector rather than all off-diagonal
elements of R. To do so, we can now relate the parameters of M and R with the corresponding
eigenvalues, and then the latter with each other via the SMT from Eq. (2).
Indeed, given a non-singular PM matrix, its eigenvalues are labelled by K ⊆ S and read

(8)   λ_K = M_{KK} = ∑_{K∪I=K} p_I = ∑_{I⊆K} p_I

by an application of Eq. (3), where λ_∅ = M_{∅∅} = p_∅ and λ_S = M_{SS} = 1. At this point, we
can say more about the matrix structure as follows.

Proposition 4.4. Every PM matrix is diagonalisable. All such matrices, for any fixed set
S, commute with one another.

Proof. Consider the column vectors e_I := (δ_{I,J})_{J⊆S} with I ⊆ S, which define the standard
basis of ℝ^d with d = 2^{|S|}. Now, set

v_K = ∑_{I⊆K} e_I  with K ⊆ S,

which are linearly independent and thus also constitute a basis. We will now show that, when
M ∈ Mat(d, ℝ) has Property PM (so M ∈ M_d by Fact 3.2), we get M v_K = λ_K v_K for all
K ⊆ S, independently of the parameter vector p of M. Consequently, we always have equal
algebraic and geometric multiplicities of the eigenvalues, and M is diagonalisable.
The claim follows from an explicit calculation. For I, K ⊆ S, we have

(M v_K)_I = ∑_{J⊆S} M_{IJ} ∑_{L⊆K} (e_L)_J = ∑_{J⊆S} M_{IJ} ∑_{L⊆K} δ_{L,J}
         = ∑_{I⊆L⊆K} M_{IL} = ∑_{I⊆L⊆K} ∑_{I∪U=L} p_U = ∑_{I⊆L⊆K} ∑_{V⊆I} p_{(L−I)∪V},

where we have dropped terms that are zero in the third step, and then used (3). Now, the
last expression is zero whenever I ⊈ K. Otherwise, it equals

∑_{U⊆K−I} ∑_{V⊆I} p_{U∪V} = ∑_{J⊆K} p_J = λ_K

by (8). The two cases together establish that (M v_K)_I = (λ_K v_K)_I holds for all I, K ⊆ S, and
v_K is indeed an eigenvector for λ_K as claimed.
Consequently, for fixed S, all PM matrices are simultaneously diagonalised by the same
parameter-independent basis matrix, and hence commute. □

Remark 4.5. Given S and d = 2^{|S|}, the corresponding PM matrices form a monoid within
M_d that is commutative due to Proposition 4.4. This can alternatively be seen as follows.
For K ⊆ S, define the matrix M^{(K)} via

M^{(K)}_{IJ} = 1 if I ∪ K = J,  and  M^{(K)}_{IJ} = 0 otherwise,

which is the PM matrix with parameter vector p^{(K)} = (δ_{K,L})_{L⊆S}. The matrices M^{(K)} are
the d extremal elements among the PM matrices, and satisfy

(9)   M^{(K)} M^{(L)} = M^{(K∪L)} = M^{(L)} M^{(K)}.

Since every PM matrix is a convex combination of the extremal ones, the commutativity of all
PM matrices follows. Various other properties can also be derived from (9). ♦
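The relation (9) is equally easy to check by machine; in the following sketch (helpers as in Section 3), M_extremal is our name for the matrix M^{(K)}.

    def M_extremal(S, K):
        # the extremal PM matrix M^(K), with parameter vector (δ_{K,L})_{L⊆S}
        pK = {L: (1.0 if L == frozenset(K) else 0.0) for L in subsets(S)}
        return pm_matrix(S, pK)[0]

    S = {1, 2}
    for K in subsets(S):
        for L in subsets(S):
            assert np.allclose(M_extremal(S, K) @ M_extremal(S, L),
                               M_extremal(S, K | L))      # Eq. (9)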

From Eq. (8), it is also clear that we have

(10)   det(M) ≠ 0  ⟺  p_∅ > 0,

which has the following immediate consequence.

Corollary 4.6. When M ∈ M_d has Property PM with p_∅ = 0, it is not embeddable, neither
into a time-homogeneous semigroup nor into a time-inhomogeneous Markov flow.

Proof. Both types of embeddability require M to be non-singular, via det(e^Q) = e^{tr(Q)} and
Eq. (10) in the first case, and an application of Liouville's theorem to the Kolmogorov forward
equation, Ṁ(t) = M(t)Q(t), in the latter; compare [5, Rem. 3.3]. □

When p_∅ > 0, all eigenvalues of M lie in (0, 1], while all eigenvalues of R are still real. We
can thus employ the SMT with the standard real logarithm (which is the unique inverse of
the exponential function as a mapping from ℝ to ℝ₊) to calculate the eigenvalues of R as

µ_K = log(λ_K) = R_{KK},  with K ⊆ S.

Here, µ_S = 0 reflects the zero row sum property of R. Under our assumptions, Proposition 4.3
implies the existence of a real parameter vector (r_K)_{∅≠K⊆S} so that R_{IJ} = ∑_{I∪K=J} r_K for
I ≠ J, together with R_{II} = −∑_{I⊂J} R_{IJ}. Since R still is a triangular matrix, we thus also get
the relation

(11)   µ_K = −∑_{K⊂H} R_{KH} = −∑_{K⊂H} ∑_{K∪I=H} r_I = −∑_{I∩K̄≠∅} r_I.

It can be solved for the entries of r via the Möbius inversion formula from (1) and a few extra
steps (omitted here because we shall see a simpler approach in Remark 4.11), which gives

(12)   r_K = ∑_{∅≠H⊇K̄} (−1)^{|H−K̄|} µ_{H̄},  with µ_I = log(λ_I) and K ≠ ∅.
We have thus arrived at the following result.

Theorem 4.7. If M is a non-singular PM matrix, it has a real logarithm, R = log(1 + A),
which satisfies Property PG with the real parameter vector r from (12). This R is a Markov
generator if and only if all entries of r are non-negative, in which case M is embeddable.
Further, when M has simple spectrum, R is the only real logarithm of M, and M is
embeddable into a Markov semigroup if and only if R is a Markov generator.

Proof. The first part is a consequence of Proposition 4.3 together with our above calculations
around the eigenvalues.
The final claim is clear once we know that R is the only real logarithm of M , which is the
case when the spectrum of M is simple, by Fact 2.4, but not in general. □
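In computational terms, Theorem 4.7 turns embeddability into a finite test: compute the principal logarithm and inspect one row. A minimal sketch (with scipy's logm; pm_matrix as in Section 3; the function name embeddable is ours) could look as follows.

    from scipy.linalg import logm

    def embeddable(S, p, tol=1e-12):
        # Theorem 4.7: M is embeddable iff the principal logarithm R of M
        # is a Markov generator, i.e. iff r_K = R[∅, K] >= 0 for all K ≠ ∅.
        if p[frozenset()] <= 0:                  # Eq. (10): M singular
            return False
        M, subs = pm_matrix(S, p)
        r = np.real(logm(M))[0, :]               # the ∅-row of R (subs[0] = ∅)
        return bool(np.all(r[1:] >= -tol))

    # positively correlated sampling is embeddable ...
    assert embeddable({1, 2}, {frozenset(): 0.4, frozenset({1}): 0.1,
                               frozenset({2}): 0.1, frozenset({1, 2}): 0.4})
    # ... while objects that avoid each other are not (compare Section 6)
    assert not embeddable({1, 2}, {frozenset(): 0.2, frozenset({1}): 0.4,
                                   frozenset({2}): 0.4, frozenset({1, 2}): 0.0})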

Formally inserting K = ∅ in (12) and observing ∅̄ = S, one obtains

r_∅ = µ_∅ = log(λ_∅) = log(p_∅),

which satisfies r_∅ ≤ 0. From now on, we use this definition to extend the parameter vector
r to one of full dimension 2^{|S|}. Due to the zero row sum property of R, one then has the
relation ∑_{K⊆S} r_K = 0, as is most easily seen by inserting K = ∅ into (11). Also, with this
extension, we refine our previous definition as follows.

Definition 4.8. Let S = {1, 2, . . . , N} and let 2^S be the order lattice of the d = 2^N subsets
of S. A matrix B ∈ Mat(d, ℝ), indexed by the elements of 2^S, is said to have Property PG,
or to be a PG matrix, if there is a real vector r ∈ ℝ^d with ∑_{I⊆S} r_I = 0 such that

B_{IJ} = ∑_{I∪K=J} r_K

holds for all I, J ⊆ S, where empty sums are zero.

Before we continue, we need to check that this definition is consistent with our preliminary
one from above. Indeed, in comparison with (6), we see that we have the same type of
condition for all B_{IJ} with I ≠ J, where the element r_∅ never shows up. So, we have to verify
that the conditions for all B_{II} match. From (6), in analogy with (11), we get

B_{II} = −∑_{I⊂J} B_{IJ} = −∑_{I⊂J} ∑_{I∪K=J} r_K = −∑_{K∩Ī≠∅} r_K = ∑_{K⊆I} r_K,

where the last step is a consequence of ∑_{K⊆S} r_K = 0. Definition 4.8 simply gives

B_{II} = ∑_{I∪K=I} r_K = ∑_{K⊆I} r_K,

which is the same sum.

Lemma 4.9. If B is a PG matrix, it has zero row sums. If it is also a rate matrix, e^B is a
Markov matrix with Property PM.

Proof. The first claim is the analogue of Fact 3.2, and has the same proof (with the parameter
vector p replaced by r). This means that 0 ∈ σ(B), with eigenvector v = (1, 1, . . . , 1)^T.
If B is a Markov generator, e^B is a Markov matrix; this follows easily from [11, Eq. 10.2],

e^B = lim_{n→∞} (1 + (1/n)B)^n,

by observing that, for all sufficiently large n, the matrix 1 + (1/n)B has non-negative entries only.
The required row sum condition is a consequence of e^B v = v.
Now, observe that e^B = 1 + A, where A is a PG matrix with non-negative off-diagonal
entries. It thus has a parameter vector r with non-negative elements (except at index ∅).
One verifies that 1 + A satisfies Eq. (3) with parameter vector p = (r_K + δ_{∅,K})_{K⊆S}, which is
a probability vector. This establishes Property PM for e^B. □

So, we now have a completely analogous way to define Properties PM and PG, both with
a real parameter vector that agrees with one row of the matrix. The difference is the entry
sum, which is 1 for PM and 0 for PG, as we know it from the row sums of the corresponding
matrices. Since Definitions 3.1 and 4.8 show the same structure with respect to the parameter
vector, we can repeat the proof of Proposition 4.4 to get the following consequence.

Corollary 4.10. Any PG matrix is diagonalisable. All such matrices, for any fixed S,
commute with one another, and thus form an Abelian subalgebra of A_d^{(0)}, where d = 2^{|S|}. □

Remark 4.11. The above derivation can now be used to calculate the semigroup {e^{tQ} : t ≥ 0}
non-recursively, for any rate matrix Q with Property PG. Indeed, if r is the parameter vector
for Q, the eigenvalues of Q are its diagonal elements, µ_K = Q_{KK} with K ⊆ S, and we have
the relations from Eq. (11), with R replaced by Q. In fact, with Definition 4.8, we simply get

(13)   µ_K = Q_{KK} = ∑_{I⊆K} r_I   and   r_K = ∑_{I⊆K} (−1)^{|K−I|} µ_I,

which replaces Eq. (12) and its (omitted) derivation. Clearly, tQ with t ≥ 0 still has Property
PG, now with parameter vector tr and eigenvalues tµ_K. Then, for any t ≥ 0, the eigenvalues
of M(t) = e^{tQ} are λ_K(t) = e^{tµ_K}, with K ⊆ S.
If Q has Property PG, the same applies to Q^k for k ∈ ℕ, which need not be rate matrices
for k > 1, though they are always elements of A_d^{(0)}. Indeed, profiting from the consistent
definition of r_∅, we can now mimic our arguments for Lemma 4.1 to see that r^{(2)} = rQ is
the parameter vector for Q², and we get those of Q^k recursively via r^{(k+1)} = r^{(k)} Q for k ≥ 1.
Then, the exponential series in conjunction with the Cayley–Hamilton theorem tells us that
e^{tQ} = 1 + A(t) holds for every t ≥ 0, where A(t) is a polynomial in Q and its (positive)
powers, hence a PG matrix. This implies that each M(t) is a PM matrix, with a unique
parameter vector p(t). The latter satisfies the time-dependent analogue of Eq. (8), which can
be solved for the entries of p(t) by Möbius inversion, giving

(14)   p_K(t) = ∑_{J⊆K} (−1)^{|K−J|} λ_J(t) = ∑_{J⊆K} (−1)^{|K−J|} e^{tµ_J} = ∑_{J⊆K} (−1)^{|K−J|} exp( t ∑_{I⊆J} r_I ).

Now, we simply obtain M_{IJ}(t) = ∑_{I∪K=J} p_K(t) for each t ≥ 0 as in (3), and thus a simple,
explicit approach to the semigroup. ♦
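The recipe of Remark 4.11 is a short program. The following sketch (our naming; subsets as in Section 3) computes the parameter vector p(t) of M(t) = e^{tQ} directly from the rate vector r via (13) and (14).

    import numpy as np

    def p_of_t(S, r, t):
        # p(t) from Eq. (14), for a PG rate vector r (a dict over the
        # subsets of S whose entries sum to 0)
        subs = subsets(S)
        mu = {K: sum(r[I] for I in subs if I <= K) for K in subs}    # Eq. (13)
        return {K: sum((-1) ** len(K - J) * np.exp(t * mu[J])
                       for J in subs if J <= K) for K in subs}

    S = {1, 2}
    r = {frozenset(): -2.5, frozenset({1}): 1.0,
         frozenset({2}): 1.0, frozenset({1, 2}): 0.5}   # rates, summing to 0
    pt = p_of_t(S, r, t=1.0)
    assert abs(sum(pt.values()) - 1.0) < 1e-12          # a probability vector
    assert all(v >= 0 for v in pt.values())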
When, in Theorem 4.7, we hit a situation where more than one generator exists for M
(which can only happen in the non-generic case of multiple eigenvalues, and even then at
most for very small values of det(M); compare [8, 4] and references therein), only one of them
can have Property PG. In this case, two (or more) Markov semigroups cross each other in M,
but the extra solutions rarely have an interesting probabilistic interpretation.
At this point, in view of Theorem 4.7, we need to understand when we get non-negative
entries r_K for all K ≠ ∅. As it turns out, this is best looked at in a probabilistic fashion,
which we do after a closer inspection of the model structure.

5. Intermezzo: Algebraic and analytic properties of the model


As above, we use S = {1, 2, . . . , N}, its power set 2^S as a lattice with order relation ⊆,
and ℝ^d with d = 2^N as the vector space of row vectors, indexed by the elements of 2^S. In
particular, we write x ∈ ℝ^d as x = (x_I)_{I⊆S}.
Earlier, we considered the family of PM matrices M ∈ M_d, which had matrix elements
M_{IJ} = ∑_{I∪K=J} p_K for some p ∈ ℝ^d with non-negative entries only and ∑_{I⊆S} p_I = 1. In
other words, the (d − 1)-dimensional probability simplex

S_{d−1} = {p ∈ ℝ^d : all p_I ≥ 0 and ∑_{I⊆S} p_I = 1}

is the moduli space of this family of matrices. Note that the matrix family is a closed convex
subset of M_d, which is matched by S_{d−1} being a closed convex simplex in ℝ^d. It has d
extremal elements, namely the vectors e_K = (δ_{K,L})_{L⊆S}, which were used in Remark 4.5.
In view of the role of the parameter vectors p and their behaviour under multiplication of
PM matrices, we now define a multiplication ⋆ : ℝ^d × ℝ^d → ℝ^d by

(15)   (x ⋆ y)_K := ∑_{I⊆K} x_I ∑_{I∪J=K} y_J.

It has a remarkable property as follows, which relates to Proposition 4.4 and Remark 4.5.

Lemma 5.1. The multiplication defined by (15) is commutative and satisfies the sum rule

∑_{K⊆S} (x ⋆ y)_K = ( ∑_{I⊆S} x_I ) ( ∑_{J⊆S} y_J ).

Proof. By definition, one has

(x ⋆ y)_K = ∑_{I⊆K} x_I ∑_{I∪J=K} y_J = ∑_{I,J⊆K : I∪J=K} x_I y_J = ∑_{J⊆K} y_J ∑_{J∪I=K} x_I = (y ⋆ x)_K,

which establishes commutativity, while the normalisation follows from

∑_{K⊆S} (x ⋆ y)_K = ∑_{K⊆S} ∑_{I,J⊆K : I∪J=K} x_I y_J = ∑_{I,J⊆S} x_I y_J = ( ∑_{I⊆S} x_I ) ( ∑_{J⊆S} y_J ),

which completes the argument. □
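A direct implementation of ⋆ makes Lemma 5.1 tangible; the sketch below (our naming, with subsets from Section 3 and numpy only for random test vectors) checks commutativity and the sum rule.

    import numpy as np

    def star(x, y, subs):
        # the product from Eq. (15): (x ⋆ y)_K = Σ_{I∪J=K} x_I y_J
        out = {K: 0.0 for K in subs}
        for I in subs:
            for J in subs:
                out[I | J] += x[I] * y[J]
        return out

    subs = subsets({1, 2, 3})
    rng = np.random.default_rng(7)
    x = dict(zip(subs, rng.normal(size=len(subs))))
    y = dict(zip(subs, rng.normal(size=len(subs))))
    xy = star(x, y, subs)
    assert all(abs(xy[K] - star(y, x, subs)[K]) < 1e-12 for K in subs)
    assert abs(sum(xy.values()) - sum(x.values()) * sum(y.values())) < 1e-12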

If the entries of x and of y each sum to α, those of x ⋆ y sum to α², whence the cases α = 1 and
α = 0 are special. Let us analyse this a bit more. Clearly, if r, s ∈ S_{d−1}, then also r ⋆ s ∈ S_{d−1},
and we get that (S_{d−1}, ⋆) is an Abelian monoid, that is, an Abelian semigroup with unit. The
latter is ε = (δ_{∅,K})_{K⊆S}, because ε ⋆ x = x for all x, as one can easily verify.
Next, we look at the PG matrices, hence those with a parameter vector r = (r_I)_{I⊆S} ∈ ℝ^d
subject to the condition that ∑_{I⊆S} r_I = 0. These matrices have zero row sums, but need
not be rate matrices, which they are if and only if r_I ≥ 0 holds for all I ≠ ∅. Here, the PG
matrices form a subalgebra of A_d^{(0)} of dimension d − 1, and its moduli space is

X = {x ∈ ℝ^d : x_1 + · · · + x_d = 0} ≃ ℝ^{d−1},

where x + y and x ⋆ y correspond to the sum and the product of the matrices parametrised by
x and y, respectively, with the correct distributive law. In other words, (X, +, ⋆) is an Abelian
algebra, but without a multiplicative unit. It is closed under limits in the usual topology.
Next, consider the Cauchy (or initial value) problem

(16)   Ṁ(t) = M(t)Q(t)  with M(0) = 1,

where {Q(t) : t ≥ 0} is any continuous family of PG matrices. Since Q(t) and Q(s) commute
for all t, s ≥ 0 by Corollary 4.10, the solution is simply given by

(17)   M(t) = exp(R(t))  with  R(t) = ∫₀ᵗ Q(τ) dτ,

as follows from the standard theory of matrix-valued ordinary differential equations; we refer
to [19, Ch. IV] for background. Clearly, since the algebra of PG matrices is closed under
taking limits, R(t) is a PG matrix for every t ≥ 0, and we have the following result.

Corollary 5.2. The forward flow of the Cauchy problem (16), with {Q(t) : t ≥ 0} a
continuous family of PG matrices, consists of Markov matrices with Property PM only. Further,
every individual M(t) from this flow is embeddable into a time-homogeneous Markov
semigroup generated by a rate matrix with Property PG. □

The continuity condition can be relaxed considerably, by replacing the differential equation
with the corresponding Volterra integral equation; see [5] and references therein for more.
Instead, let us interpret (17) in terms of the moduli spaces introduced above. To do so,
we write PG matrices as Q_q, hence indexed by the corresponding parameter vector, and
analogously use M_p for PM matrices. Now, in Eq. (17), we clearly have R(t) = R_{r(t)} with

(18)   r(t) = ∫₀ᵗ q(τ) dτ,

where q(τ) denotes the parameter vector of Q(τ).



It is now possible to calculate exp(R(t)), for every fixed t ≥ 0, as follows, where we drop
the explicit notation for time dependence for clarity. Consider the exponential series for e^R
and observe that R^n, with n ∈ ℕ, has parameter vector r^{(n)} = r ⋆ · · · ⋆ r =: r^{⋆n}. Further,
1 = R⁰ is a PM matrix with our unit ε from above as its parameter vector. If we set r^{⋆0} := ε,
we can now define the mapping Exp : X → ℝ^d by

r ↦ Exp(r) := ∑_{n=0}^{∞} r^{⋆n}/n!,

which is a vector whose entries sum to 1. In fact, Euler's relation still holds, namely

Exp(r) = lim_{n→∞} (ε + r/n)^{⋆n},

which can then be used (as in the proof of Lemma 4.9) to show that Exp(r) is indeed a
probability vector, as it must be.

Corollary 5.3. In the setting of Corollary 5.2, each matrix M(t) from the flow is a PM
matrix with parameter vector p(t) = Exp(r(t)), with r(t) as in (18). □
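Exp can be computed as a truncated ⋆-exponential series and cross-checked against the eigenvalue formula of Remark 4.11. The sketch below (continuing the star and p_of_t helpers from the earlier sketches; the truncation at 60 terms is an arbitrary choice that is ample here) illustrates Corollary 5.3 for a constant family Q(t) ≡ Q_r.

    def Exp(r, subs, terms=60):
        # Exp(r) = Σ_{n≥0} r^{⋆n}/n!, truncated; ε = δ_∅ is the ⋆-unit
        eps = {K: (1.0 if not K else 0.0) for K in subs}
        result, term = dict(eps), dict(eps)
        for n in range(1, terms):
            term = {K: v / n for K, v in star(term, r, subs).items()}  # r^{⋆n}/n!
            for K in subs:
                result[K] += term[K]
        return result

    S = {1, 2}
    subs = subsets(S)
    r = {frozenset(): -2.5, frozenset({1}): 1.0,
         frozenset({2}): 1.0, frozenset({1, 2}): 0.5}
    pt = Exp(r, subs)
    p1 = p_of_t(S, r, 1.0)                       # Remark 4.11 / Eq. (14)
    assert all(abs(pt[K] - p1[K]) < 1e-10 for K in subs)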

Clearly, one has Exp(0) = ε. Observing that the binomial formula for ⋆ holds in the form

(x + y)^{⋆n} = ∑_{m=0}^{n} \binom{n}{m} x^{⋆m} ⋆ y^{⋆(n−m)},

we see that Exp(·) satisfies Exp(x + y) = Exp(x) ⋆ Exp(y). Recalling the computations from
Remark 4.11, one can evaluate Exp(r) explicitly via

(Exp(r))_K = ∑_{J⊆K} (−1)^{|K−J|} exp( ∑_{I⊆J} r_I ) = ∑_{J⊆K} (−1)^{|K−J|} ∏_{I⊆J} e^{r_I}.

If we set Y := {y ∈ ℝ^d : ∑_{K⊆S} y_K = 1}, we know that Exp(X) ⊂ Y. It would be of interest
to characterise Exp(X) more explicitly. Further, with

X_≥ := {x ∈ X : x_K ≥ 0 for all ∅ ≠ K ⊆ S},

we clearly have Exp(X_≥) ⊂ S_{d−1} ⊂ Y, where Exp(X_≥) is the moduli space of the embeddable
cases in S_{d−1}, which we still need to analyse. So far, we have the following.

Corollary 5.4. Let M be a PM matrix with parameter vector p. Then, there is a PG matrix
R with M = e^R if and only if the equation p = Exp(x) has a solution x ∈ X. Further, R is
a Markov generator if and only if x ∈ X_≥. □

Let us now return to our remaining task, namely to better understand the embeddability condition.

6. Positivity conditions and embeddability


Recall that Z ⊆ S denotes the (random) set of objects that are sampled in one step, with
P(Z = K) = p_K for K ⊆ S. In what follows, a crucial role will be played by the probability
of a relevant ‘non-event’,

(19)   q_K := P(Z avoids K) = P(Z ∩ K = ∅) = ∑_{I⊆K̄} p_I = 1 − ∑_{J∩K≠∅} p_J,

where the last step is a consequence of p being a probability vector. In particular, one has
q_∅ = ∑_{I⊆S} p_I = 1, as it must be, and q_K ≥ q_H whenever K ⊆ H. The q_K are related to
the eigenvalues of M via

(20)   λ_K = M_{KK} = ∑_{K∪I=K} p_I = ∑_{I⊆K} p_I = P(Z ∩ K̄ = ∅) = q_{K̄}.

This also gives λ_∅ = q_S = p_∅, which must be positive for embeddability by Corollary 4.6.
We therefore assume, for the rest of this section, that p_∅ > 0, which then also implies that
q_K > 0 holds for all K ⊆ S via (19).

Since we now have µ_K = log(λ_K) = log(q_{K̄}), we also get

(21)   r_K = ∑_{∅≠H⊇K̄} (−1)^{|H−K̄|} log(q_H) = log ∏_{∅≠H⊇K̄} q_H^{(−1)^{|H−K̄|}}

for K ≠ ∅, and need to know when they all satisfy r_K ≥ 0. Alternatively, due to our new
version (13) with the full parameter vector r, we can also write

(22)   r_K = ∑_{H⊆K} (−1)^{|K−H|} log(q_{H̄}) = log ∏_{H⊆K} q_{H̄}^{(−1)^{|K−H|}},

now for all K ⊆ S, where r_∅ = log(q_S) = log(p_∅). A simple calculation with the Möbius
inversion from Eq. (1) then gives

log(q_K) = ∑_{H⊆K̄} r_H = −∑_{I∩K≠∅} r_I ≤ 0.

Before we continue, let us look at a simple special case.

Example 6.1. If S = {1, 2}, we have 2^S = {∅, {1}, {2}, S}, and thus get

r_S = log( q_S / (q_{{1}} q_{{2}}) ),   r_{{1}} = log( q_{{2}} / q_S ),   r_{{2}} = log( q_{{1}} / q_S ),

together with r_∅ = log(q_S), so the sum of all four coefficients vanishes, as it must. Here, due
to q_K ≥ q_H for K ⊆ H, the conditions r_{{1}} ≥ 0 and r_{{2}} ≥ 0 are automatically satisfied,
which can also be understood via the conditional probabilities

r_{{i}} = −log( q_S / q_{S−{i}} ) = −log P( Z ∩ {i} = ∅ | Z ∩ (S − {i}) = ∅ ).

For the remaining coefficient, we have

(23)   r_S ≥ 0  ⟺  ∆q := q_S − q_{{1}} q_{{2}} ≥ 0,

where ∆q has the interpretation of a correlation, either between the two non-events {1 ∉ Z}
and {2 ∉ Z} or (equivalently) between the events {1 ∈ Z} and {2 ∈ Z}. Here, one has
P(i ∈ Z) = p_{{i}} + p_S for i ∈ S and P(Z = S) = p_S, which then gives

∆q = p_∅ p_S − p_{{1}} p_{{2}}

by a simple calculation via (19) and the relation ∑_{K⊆S} p_K = 1. ♦
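A two-line numerical illustration of (23) may be helpful; the helper r_S below is ours and evaluates r_S directly from p via (19).

    import math

    def r_S(p0, p1, p2):
        # r_S for S = {1,2}: log(q_S/(q_{1} q_{2})), with q_{1} = p_∅ + p_{2} etc.
        qS, q1, q2 = p0, p0 + p2, p0 + p1
        return math.log(qS / (q1 * q2))

    assert r_S(0.4, 0.1, 0.1) > 0      # p_∅ p_S > p_{1} p_{2}: embeddable
    assert r_S(0.2, 0.4, 0.4) < 0      # avoidance (p_{1,2} = 0): not embeddable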

In this example, the required non-negativity depends on just one condition, namely Eq. (23),
which is equivalent to

P(1, 2 ∈ Z) − P(1 ∈ Z) P(2 ∈ Z) ≥ 0.

This suggests the following probabilistic interpretation. Consider the continuous-time MCCP
with r_{{1}} r_{{2}} > 0, to avoid degeneracies, and r_{{1,2}} ≥ 0. For r_{{1,2}} = 0, we are back to
Example 3.3 and its independence structure, so

P(1, 2 ∈ Z) = (1 − e^{−r_{{1}} t})(1 − e^{−r_{{2}} t}) = P(1 ∈ Z) P(2 ∈ Z),

so there is no correlation. If we now add joint sampling at rate r_{{1,2}} > 0, we obtain

P(i ∈ Z) = p_{{i}}(t) + p_{{1,2}}(t) = 1 − e^{−t(r_{{i}} + r_{{1,2}})}  for i ∈ {1, 2},  and

P(1, 2 ∈ Z) = p_{{1,2}}(t) = 1 − e^{−t(r_{{1}} + r_{{1,2}})} − e^{−t(r_{{2}} + r_{{1,2}})} + e^{−t(r_{{1}} + r_{{2}} + r_{{1,2}})},

by using the formula for p_K(t) from (14). This gives P(1, 2 ∈ Z) ≥ P(1 ∈ Z) P(2 ∈ Z) and
thus a positive correlation, in line with the intuition for this simple case. Put differently,
there is no way to achieve a negative correlation within the continuous-time process, whereas,
in contrast, objects can very well avoid each other in discrete time: just set p_{{1}} p_{{2}} > 0
together with p_{{1,2}} = 0.

These considerations might trigger the naive question whether r_K ≥ 0 for all ∅ ≠ K ⊆ S
could be equivalent to C_K ≥ 0, where

(24)   C_K = ∑_{A∈P(K)} (−1)^{|A|−1} (|A| − 1)! ∏_{A∈A} P(A ⊆ Z)

is the correlation function of the events {i ∈ Z} for i ∈ K. In fact, this is not true, but it points
in the right direction. To proceed, we need to consider larger sets S.
Indeed, the situation gets more involved for sets S with |S| ≥ 3. Here, one still has

r_{{i}} = log( q_{S−{i}} / q_S ) = log( (p_∅ + p_{{i}}) / p_∅ ) ≥ 0

for i ∈ S, with non-negativity for the same reason as in Example 6.1, and the analogous
interpretation in terms of conditional probabilities. For i, j ∈ S with i ≠ j, we get

r_{{i,j}} = log( (q_S q_{S−{i,j}}) / (q_{S−{i}} q_{S−{j}}) ) = log( p_∅ (p_∅ + p_{{i}} + p_{{j}} + p_{{i,j}}) / ((p_∅ + p_{{i}})(p_∅ + p_{{j}})) )

from (22), which implies

r_{{i,j}} ≥ 0  ⟺  q_S q_{S−{i,j}} ≥ q_{S−{i}} q_{S−{j}}  ⟺  p_∅ p_{{i,j}} ≥ p_{{i}} p_{{j}}.

For S = {1, 2}, this simplifies as stated earlier because one then has q_{S−{i,j}} = q_∅ = 1.
Next, for i, j, k ∈ S distinct, one obtains

r_{{i,j,k}} = log( (q_{S−{i,j,k}} q_{S−{i}} q_{S−{j}} q_{S−{k}}) / (q_S q_{S−{i,j}} q_{S−{i,k}} q_{S−{j,k}}) )

           = log( q_{S−{i,j,k}} / q_S ) − log( q_{S−{i,j,k}}^2 / (q_{S−{i,j}} q_{S−{k}}) ) − log( q_{S−{i,j,k}}^2 / (q_{S−{i,k}} q_{S−{j}}) )
             − log( q_{S−{i,j,k}}^2 / (q_{S−{j,k}} q_{S−{i}}) ) + 2 log( q_{S−{i,j,k}}^3 / (q_{S−{i,j}} q_{S−{i,k}} q_{S−{j,k}}) ).

Again, for S = {1, 2, 3}, one gets a slight simplification due to q_{S−{i,j,k}} = q_∅ = 1. The
quantities r_{{i,j,k}} cannot reasonably be expressed in terms of the parameters p_K.
To see what is going on in general, we define, for A, B ⊆ S, the quantities

(25)   q_A^B := P(Z ∩ A = ∅ | Z ∩ B = ∅) = P(Z ∩ (A ∪ B) = ∅) / P(Z ∩ B = ∅) = q_{A∪B} / q_B,

where q_A^∅ = q_A because q_∅ = 1, and one also has

q_{A∪B}^B = q_{A∪B} / q_B = q_A^B.
Now, we can formulate the following result.

Proposition 6.2. For any non-empty subset K ⊆ S, we have the identity

r_K = (−1)^{|K|} ∑_{A∈P(K)} (−1)^{|A|−1} (|A| − 1)! log( q_A^{K̄} ),

where q_A^B := ∏_{A∈A} q_A^B for any partition A ∈ P(K) and any B ⊆ S.
Proof. From (22), upon substituting A = K − H, one gets

r_K = ∑_{A⊆K} (−1)^{|A|} log( q_{A∪K̄} ) = ∑_{A⊆K} (−1)^{|A|} ( log(q_{A∪K̄}) − log(q_{K̄}) )
    = ∑_{A⊆K} (−1)^{|A|} log( q_A^{K̄} ) = (−1)^{|K|} ∑_{A∈P(K)} (−1)^{|A|−1} (|A| − 1)! log( q_A^{K̄} ),

where the second step is true because ∑_{A⊆K} (−1)^{|A|} = 0, while the next step follows from
(25). Since, by convention, no part of a partition is the empty set, we can invoke Lemma 2.2
with f(A) = q_A^{K̄} in the last step. □
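Proposition 6.2 can be verified numerically against (22); the following sketch does so for a random distribution on S = {1, 2, 3}, reusing the partitions and subsets helpers from the earlier sketches (all names ours).

    import math
    import numpy as np

    S = frozenset({1, 2, 3})
    subs = subsets(S)
    rng = np.random.default_rng(1)
    w = rng.random(len(subs))
    p = dict(zip(subs, w / w.sum()))
    q = {K: sum(p[I] for I in subs if not (I & K)) for K in subs}    # Eq. (19)

    def r_moebius(K):
        # Eq. (22), using λ_H = q_{S−H}
        return sum((-1) ** len(K - H) * math.log(q[S - H])
                   for H in subs if H <= K)

    def r_partition(K):
        # Proposition 6.2, conditioning on the complement B = S − K
        qc = lambda A: q[A | (S - K)] / q[S - K]                     # Eq. (25)
        return (-1) ** len(K) * sum(
            (-1) ** (len(A) - 1) * math.factorial(len(A) - 1)
            * sum(math.log(qc(frozenset(B))) for B in A)
            for A in partitions(sorted(K)))

    assert all(abs(r_moebius(K) - r_partition(K)) < 1e-10 for K in subs if K)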
If we compare the formula for the r_K with the expression for the C_K from (24), we notice
the same basic structure, but three important differences. First, we have the appearance of
logarithms, which is perhaps not surprising. Second, the probabilities P(A ⊆ Z) are replaced
by the conditional probabilities P(Z ∩ A = ∅ | Z ∩ K̄ = ∅). Finally, there is an extra factor
(−1)^{|K|}, which is due to the appearance of non-events instead of events.
At this point, we can wrap up the embedding structure as follows.
Theorem 6.3. Let M be the Markov matrix of a discrete-time MCCP for S = {1, 2, . . . , N},
so a PM matrix with parameter vector p = (p_K)_{K⊆S}. If M is non-singular, which is
equivalent to p_∅ > 0, it has a real logarithm, R, in the form of the principal matrix logarithm.
This R is a triangular matrix with zero row sums and real spectrum. It satisfies Property
PG according to Definition 4.8, with the parameter vector r = (r_K)_{K⊆S} from Proposition 6.2.
Further, R is a Markov generator if and only if r_K ≥ 0 holds for all ∅ ≠ K ⊆ S, in which
case M is embeddable into a continuous-time MCCP.
Finally, R is the only real matrix logarithm of M with Property PG. In the generic case
that the spectrum of M is simple, no other real logarithm of M exists, and M is embeddable
if and only if R is a Markov generator.
Proof. The first part follows from Corollary 4.6 and Theorem 4.7, and the second from the
same theorem in conjunction with Eq. (22) and Proposition 6.2.
If R′ is any real logarithm of M in triangular form, it also has real spectrum, because its
eigenvalues are the diagonal elements. When it also has Property PG, its unique parameter
vector r′ follows from the spectrum by Eq. (22), and must then agree with r, hence R′ = R.
The final claim is then clear. □
If M has a degenerate spectrum and a sufficiently small determinant, other real logarithms
of M are possible, but never with Property PG. In this sense, Theorem 6.3 gives a complete
answer to the embedding problem, both in the generic case of simple spectrum and, under the
constraint that an embedding should be into a model of the same type (meaning an MCCP
in our case), also in general. One can say more in the case of a degenerate spectrum, by
comparing the centralisers of M and any real logarithm of it (see [7] as well as [3] and references
therein for some tools), but we leave further details to the interested reader.

Acknowledgements
It is our pleasure to thank Jeremy Sumner for interesting discussions on the embedding
problem and for various helpful comments on the manuscript. This work was supported by
the German Research Foundation (DFG), within the CRC 1283 (Project ID 317210226) at
Bielefeld University.

References
[1] M. Aigner, Combinatorial Theory, reprint, Springer, Berlin (1997).
[2] E. Baake and M. Baake, Ancestral lines under recombination, in: Probabilistic Structures
in Evolution, eds. E. Baake and A. Wakolbinger, EMS Press, Berlin (2021), pp. 365–382;
arXiv:2002.08658.
[3] M. Baake and J. Sumner, On equal-input and monotone Markov matrices, J. Appl. Probab. 54
(2022) 460–492; arXiv:2007.11433.
[4] M. Baake and J. Sumner, Embedding of Markov matrices for d ≤ 4, preprint, arXiv:2311.02596.
[5] M. Baake and J. Sumner, On the algebra of equal-input matrices in time-inhomogeneous Markov
flows, preprint, arXiv:2404.11222.
[6] M. Casanellas, J. Fernández-Sánchez and J. Roca-Lacostena, The embedding problem for Markov
matrices, Public. Matem. 67 (2023) 411–445; arXiv:2005.00818.
[7] W.J. Culver, On the existence and uniqueness of the real logarithm of a matrix, Proc. Amer.
Math. Soc. 17 (1966) 1146–1151.
[8] E.B. Davies, Embeddable Markov matrices, Electronic J. Probab. 15 (2010) paper 47, 1474–1486;
arXiv:1001.1693.
[9] T. Eisner and A. Radl, Embeddability of real and positive operators, Lin. Multilin. Alg. 70 (2022)
3747–3767; arXiv:2003.08186.
[10] G. Elfving, Zur Theorie der Markoffschen Ketten, Acta Soc. Sci. Fennicae A2 (1937) 1–17.
[11] N.J. Higham, Functions of Matrices: Theory and Computation, SIAM, Philadelphia, PA (2008).
[12] R.A. Horn and C.R. Johnson, Matrix Analysis, 2nd ed., Cambridge University Press, Cambridge
(2013).
[13] S. Johansen, Some results on the imbedding problem for finite Markov chains, J. London Math.
Soc. 8 (1974) 345–351.
[14] J.F.C. Kingman, The imbedding problem for finite Markov chains, Z. Wahrscheinlichkeitsth. verw.
Geb. 1 (1962) 14–24.
[15] P. Neal, The generalised coupon collector problem, J. Appl. Prob. 45 (2008) 621–629.
[16] W. Rudin, Functional Analysis, 2nd ed., McGraw-Hill, New York (1991).
[17] J. Schilling and N. Henze, Two Poisson limit theorems for the coupon collector’s problem with
group drawings, J. Appl. Prob. 58 (2021) 966–977.
[18] E. Spiegel and C.J. O’Donnell, Incidence Algebras, Dekker, New York (1997).
[19] W. Walter, Ordinary Differential Equations, Springer, New York (1998).

Technische Fakultät, Universität Bielefeld, Postfach 100131, 33501 Bielefeld, Germany

Fakultät für Mathematik, Universität Bielefeld, Postfach 100131, 33501 Bielefeld, Germany
