
STAT8310 Statistical Theory

2021

Topic 5.1
Generating functions



Generating functions
• The generating function of the sequence $\{a_0, a_1, a_2, \ldots\}$ is defined to be the function
$$g(z) = \sum_{j=0}^{\infty} a_j z^j.$$

• Note that this is a function of z.


• If the sequence is finite, then g(z) is a finite degree polynomial.
• If the sequence is infinite, then g(z) is not a polynomial, but it is
assumed that g(z) exists for some open interval containing 0.
• The number $a_j$ is the coefficient of $z^j$ in $g(z)$ (or in the expansion of $g(z)$).



• For example, if $a_j = 1$ for all $j$, then for all $z \in (-1, 1)$,
$$g(z) = \sum_{j=0}^{\infty} z^j = \frac{1}{1-z}.$$
• We can then get $a_j$ back by expanding $(1-z)^{-1}$ about $z = 0$ and finding the coefficient of $z^j$, which here is 1. This may seem trivial, but it is quite an important concept.
• If $g(z)$ exists for $z \in A$, where $A$ is open and contains 0, then, assuming we can differentiate term by term,
$$g^{(k)}(z) = \frac{d^k}{dz^k}\, g(z) = \sum_{j=k}^{\infty} j(j-1)\cdots(j-k+1)\, a_j z^{j-k} = k(k-1)\cdots 1\, a_k + \sum_{j=k+1}^{\infty} j(j-1)\cdots(j-k+1)\, a_j z^{j-k}.$$

• Thus
$$g^{(k)}(0) = k(k-1)\cdots 1\, a_k = k!\, a_k,$$
since all of the other terms are 0.



• We thus have
$$a_k = \frac{g^{(k)}(0)}{k!}.$$
• Taking the previous example, where $g(z) = \frac{1}{1-z}$, we can thus obtain the $a_k$ from the above:
$$a_k = \frac{1}{k!}\left.\frac{d^k}{dz^k}(1-z)^{-1}\right|_{z=0} = \frac{1}{k!}\left.(-1)^k(-1)(-2)\cdots(-k)\,(1-z)^{-k-1}\right|_{z=0} = 1.$$
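As a quick check of this idea, here is a minimal sympy sketch (the function and range of coefficients are illustrative) confirming that both routes — series expansion and $g^{(k)}(0)/k!$ — recover $a_k = 1$ for $g(z) = 1/(1-z)$:

```python
# Minimal sketch: recover the coefficients a_k of g(z) = 1/(1 - z)
# both by series expansion about 0 and via a_k = g^(k)(0)/k!.
import sympy as sp

z = sp.symbols('z')
g = 1 / (1 - z)

expansion = sp.series(g, z, 0, 6).removeO()             # 1 + z + z^2 + ... + z^5
print([expansion.coeff(z, j) for j in range(6)])        # [1, 1, 1, 1, 1, 1]
print([sp.diff(g, z, k).subs(z, 0) / sp.factorial(k)
       for k in range(6)])                              # [1, 1, 1, 1, 1, 1]
```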



Probability Generating Functions
• Let X be a discrete random variable.
• The probability generating function (pgf) $G_X(z)$ of X, which has probability function $f_X(x)$, is defined by
$$G_X(z) = E\left(z^X\right) = \sum_x f_X(x)\, z^x.$$
• The summation will usually start at 0 or 1, depending on the lowest value X takes on.



Example

Let X be uniformly distributed on $\{-a, -a+1, \ldots, b-1, b\}$, where $a, b > 0$. Then provided $z \neq 1$,
$$G_X(z) = \sum_{x=-a}^{b} \frac{1}{a+b+1}\, z^x = \frac{z^{-a} - z^{b+1}}{(a+b+1)(1-z)}.$$


• We shall assume from now on that X is non-negative.
• From above, we can find $f_X(x)$ from the formula
$$f_X(k) = \frac{1}{k!}\left.\frac{d^k}{dz^k}\, G_X(z)\right|_{z=0}.$$



Examples

1. $X \sim$ Bernoulli(p). Then for all z,
$$G_X(z) = (1-p)\, z^0 + p z^1 = 1 - p + pz.$$

2. $X \sim$ Bin(n, p). Then for all z,
$$G_X(z) = \sum_{x=0}^{n} \binom{n}{x} p^x (1-p)^{n-x} z^x = \sum_{x=0}^{n} \binom{n}{x} (1-p)^{n-x} (pz)^x = (1 - p + pz)^n.$$



3. $X \sim$ Poisson(θ). Then for all z,
$$G_X(z) = \sum_{x=0}^{\infty} e^{-\theta}\frac{\theta^x}{x!}\, z^x = e^{-\theta}\sum_{x=0}^{\infty} \frac{(\theta z)^x}{x!} = e^{-\theta} e^{\theta z}\left\{e^{-\theta z}\sum_{x=0}^{\infty} \frac{(\theta z)^x}{x!}\right\} = e^{-\theta + \theta z},$$
since the bracketed factor is the sum of a Poisson(θz) probability function and so equals 1.



4. $X \sim$ Geometric(p). Then, as long as $|z| < \frac{1}{1-p}$,
$$G_X(z) = \sum_{x=1}^{\infty} (1-p)^{x-1} p\, z^x = \frac{pz}{1 - (1-p)z}.$$



5. $X \sim$ NegBin(r, p).

Recall from Topic 2: Newton's generalized binomial theorem
$$(a+b)^r = \sum_{i=0}^{\infty} \binom{r}{i} a^{r-i} b^i,$$
where
$$\binom{r}{i} = \frac{r(r-1)\cdots(r-i+1)}{i!} \quad \text{for } r \text{ any real number.}$$
Applied to $p^{-k} = \{1 - (1-p)\}^{-k}$, this gives
$$p^{-k} = \sum_{y=0}^{\infty} \binom{-k}{y} (-1)^y (1-p)^y.$$



Now, as long as $|z| < \frac{1}{1-p}$, we have
$$\begin{aligned}
G_X(z) &= \sum_{x=r}^{\infty} \binom{x-1}{r-1} (1-p)^{x-r} p^r z^x \\
&= \sum_{x=r}^{\infty} \frac{(x-1)\cdots 1}{(x-r)(x-r-1)\cdots 1\,(r-1)\cdots 1}\, (1-p)^{x-r} p^r z^x \\
&= \sum_{x=r}^{\infty} \frac{(x-1)\cdots r}{(x-r)(x-r-1)\cdots 1}\, (1-p)^{x-r} p^r z^x \qquad (\text{put } y = x - r) \\
&= p^r z^r \sum_{y=0}^{\infty} \frac{(y+r-1)\cdots r}{y(y-1)\cdots 1}\, \{(1-p)z\}^y \\
&= p^r z^r \sum_{y=0}^{\infty} \frac{(-r)(-r-1)\cdots(-r-y+1)}{y(y-1)\cdots 1}\, \{-(1-p)z\}^y \\
&= p^r z^r \{1 - (1-p)z\}^{-r} \\
&= \left\{\frac{pz}{1 - (1-p)z}\right\}^r.
\end{aligned}$$
• Note that the pgf in 5 is the rth power of the pgf in 4, and that
the pgf in 2 is the nth power of the pgf in 1. This is not by
accident!
• Exercise: Now find P (X = x) from the above pgfs.
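One way to check your answers to the exercise is a minimal sympy sketch like the one below (the value of p is illustrative): expand the pgf about 0 and read off $P(X = x)$ as the coefficient of $z^x$.

```python
# Minimal sketch: recover P(X = x) from the Geometric(p) pgf by expansion.
import sympy as sp

z = sp.symbols('z')
p = sp.Rational(1, 3)                         # illustrative value
G = p * z / (1 - (1 - p) * z)                 # pgf of Geometric(p), from 4 above

expansion = sp.series(G, z, 0, 6).removeO()
print([expansion.coeff(z, x) for x in range(6)])
# [0, 1/3, 2/9, 4/27, 8/81, 16/243], i.e. P(X = x) = (1 - p)^(x-1) p for x >= 1
```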



Some properties of pgfs

• The function $G(z)$ is differentiable for $|z| < 1$ and its derivative is
$$G'(z) = \sum_x x f(x)\, z^{x-1} < \infty.$$

• (Uniqueness) Let X and Y have pgfs $G_X(z)$ and $G_Y(z)$. If for some $G(z)$ we have
$$G_X(z) = G_Y(z) = G(z) \quad \text{for } |z| < 1,$$
then X and Y have the same probability mass function.



Calculating moments from the pgf

• First, note that $G_X(1) = 1$ (the sum of all the probabilities).
• Next, note that, as long as we can differentiate term by term, we have
$$G'_X(z) = \sum_x x f_X(x)\, z^{x-1}.$$
• Thus
$$G'_X(1) = \sum_x x f_X(x) = E(X).$$
• Similarly,
$$G''_X(z) = \sum_x x(x-1) f_X(x)\, z^{x-2}.$$



• Thus
$$G''_X(1) = \sum_x x(x-1) f_X(x) = E\{X(X-1)\} = E\left(X^2 - X\right) = E\left(X^2\right) - E(X).$$
• Hence
$$\operatorname{var} X = E\left(X^2\right) - \{E(X)\}^2 = G''_X(1) + E(X) - \{E(X)\}^2 = G''_X(1) + G'_X(1) - \{G'_X(1)\}^2.$$

• And so on. Since
$$G'''_X(1) = E\{X(X-1)(X-2)\} = E\left(X^3\right) - 3E\left(X^2\right) + 2E(X),$$
it follows that
$$E\left(X^3\right) = G'''_X(1) + 3\left\{G''_X(1) + G'_X(1)\right\} - 2G'_X(1)$$
and
$$\begin{aligned}
E(X - \mu_X)^3 &= G'''_X(1) + 3G''_X(1) + G'_X(1) - 3G'_X(1)\left\{G''_X(1) + G'_X(1)\right\} + 3\left\{G'_X(1)\right\}^3 - \left\{G'_X(1)\right\}^3 \\
&= G'''_X(1) + 3G''_X(1) + G'_X(1) - 3G'_X(1)\, G''_X(1) - 3\left\{G'_X(1)\right\}^2 + 2\left\{G'_X(1)\right\}^3.
\end{aligned}$$
• There is no point in deriving a general formula.
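As a quick mechanical check of these identities, here is a minimal sympy sketch using the Poisson(θ) pgf derived earlier:

```python
# Minimal sketch: E(X) and var X for Poisson(theta) from pgf derivatives.
import sympy as sp

z, theta = sp.symbols('z theta', positive=True)
G = sp.exp(-theta + theta * z)              # pgf of Poisson(theta)

G1 = sp.diff(G, z).subs(z, 1)               # G'_X(1)  = E(X)
G2 = sp.diff(G, z, 2).subs(z, 1)            # G''_X(1) = E{X(X-1)}
print(G1, sp.simplify(G2 + G1 - G1**2))     # theta, theta
```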



The pgf of a sum of independent discrete rvs
• Let $X_1, X_2, \ldots, X_n$ be independent discrete rvs with probability functions $f_{X_1}(x), f_{X_2}(x), \ldots, f_{X_n}(x)$ and pgfs $G_{X_1}(z), G_{X_2}(z), \ldots, G_{X_n}(z)$.

• We wish to calculate the probability function of
$$Y = X_1 + X_2 + \cdots + X_n.$$
• This may be difficult to do from first principles (using, e.g.
discrete convolution).
• We shall need to assume in what follows that if the rvs $U_1, \ldots, U_n$ are independent, then
$$E\left\{\prod_{i=1}^{n} h(U_i)\right\} = \prod_{i=1}^{n} E\{h(U_i)\}.$$

• Now, the pgf of Y is
$$\begin{aligned}
G_Y(z) &= E\left(z^Y\right) = E\left(z^{X_1 + X_2 + \cdots + X_n}\right) = E\left(z^{X_1} z^{X_2} \cdots z^{X_n}\right) \\
&= E\left(z^{X_1}\right) E\left(z^{X_2}\right) \cdots E\left(z^{X_n}\right) = G_{X_1}(z)\, G_{X_2}(z) \cdots G_{X_n}(z).
\end{aligned}$$

• We may recognise this as the pgf corresponding to some probability function.
• In that case, Y will have that probability function.



• Even if we don’t recognise the probability function, we can find it by expanding $G_Y(z)$ about 0 and identifying $f_Y(y)$ as the coefficient of $z^y$ in this expansion, or by using
$$f_Y(k) = \frac{1}{k!}\left.\frac{d^k}{dz^k}\, G_Y(z)\right|_{z=0}.$$

• An important case is where the $X_i$ are also identically distributed, i.e. they have the same probability function $f_X(x)$ and thus the same pgf $G_X(z)$, for then
$$G_Y(z) = \{G_X(z)\}^n.$$

Example 1: Let $X_1, X_2, \ldots, X_n$ be i.i.d. Bern(p). Then $G_Y(z) = \{G_X(z)\}^n = (1 - p + pz)^n$, which is the pgf of a Bin(n, p) rv. Thus $Y \sim$ Bin(n, p).



Example 2: $X_1, X_2, \ldots, X_n$ i.i.d. Geometric(p). Then
$$G_Y(z) = \{G_X(z)\}^n = \left\{\frac{pz}{1 - (1-p)z}\right\}^n,$$
which is the pgf of a NegBin(n, p) rv. Thus $Y \sim$ NegBin(n, p).

Example 3: X is discrete uniform on $\{1, 2, \ldots, m\}$. Then
$$G_X(z) = \sum_{x=1}^{m} \frac{1}{m}\, z^x = \begin{cases} \dfrac{z(1 - z^m)}{m(1-z)} & ; z \neq 1 \\[4pt] 1 & ; z = 1, \end{cases}$$
and
$$G_Y(z) = \begin{cases} \left\{\dfrac{z(1 - z^m)}{m(1-z)}\right\}^n & ; z \neq 1 \\[4pt] 1 & ; z = 1, \end{cases}$$
which isn't of much use!

Example 4: $X_1, X_2, \ldots, X_n$ i.i.d. Binomial(m, p). Then
$$G_Y(z) = \{G_X(z)\}^n = \{(1 - p + pz)^m\}^n = (1 - p + pz)^{mn},$$
which is the pgf of a Binomial(mn, p) rv. Thus $Y \sim$ Bin(mn, p).



Example 5: $X_1, X_2, \ldots, X_n$ i.i.d. Poisson(θ). Then
$$G_Y(z) = \{G_X(z)\}^n = \left(e^{-\theta + \theta z}\right)^n = e^{-n\theta + n\theta z},$$
which is the pgf of a Poisson(nθ) rv. Thus $Y \sim$ Poisson(nθ).
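These pgf arguments can be checked against a direct discrete convolution. The numpy sketch below (parameter values illustrative, pmfs truncated at K terms) convolves n Poisson(θ) pmfs and compares the result with the Poisson(nθ) pmf:

```python
# Minimal numeric sketch: convolving n Poisson(theta) pmfs reproduces
# Poisson(n*theta), as the pgf argument in Example 5 predicts.
import numpy as np
from math import exp, factorial

def poisson_pmf(lam, K):
    return np.array([exp(-lam) * lam**k / factorial(k) for k in range(K)])

theta, n, K = 1.5, 3, 40                    # illustrative values
f = poisson_pmf(theta, K)

g = f.copy()
for _ in range(n - 1):
    g = np.convolve(g, f)[:K]               # pmf of a sum of independent rvs

print(np.max(np.abs(g - poisson_pmf(n * theta, K))))   # ~0 (truncation error only)
```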

The independence of rvs in terms of pgfs


It is natural to ask the following question: if X and Y are non-negative, integer-valued rvs such that
$$G_{X+Y}(z) = G_X(z)\, G_Y(z)$$
is satisfied, does it follow that X and Y are independent? We show by an example that in general the answer is negative.



Example

Let ξ and η be independent rvs such that ξ takes the values 0, 1 and
2 with probability 1/3 each, and η takes the values 0 and 1 with
probabilities 1/3 and 2/3 respectively. Define X = ξ and
Y = (ξ + η)(mod 3).



Then Y takes the values 0, 1 and 2 with probability 1/3 each. Further, the sum X + Y takes the values 0, 1, 2, 3 and 4 with probabilities 1/9, 2/9, 3/9, 2/9 and 1/9 respectively. It can be checked that
$$G_{X+Y}(z) = G_X(z)\, G_Y(z)$$
is satisfied. However, the variables X and Y are not independent; indeed, they are functionally dependent.
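A brute-force enumeration makes the check concrete. The sketch below builds the joint law of (X, Y), verifies the pgf factorisation at a few sample points, and exhibits the dependence:

```python
# Minimal sketch: enumerate X = xi, Y = (xi + eta) mod 3; the pgf of X + Y
# factorises, yet X and Y are (functionally) dependent.
from fractions import Fraction
from itertools import product

p_xi = {0: Fraction(1, 3), 1: Fraction(1, 3), 2: Fraction(1, 3)}
p_eta = {0: Fraction(1, 3), 1: Fraction(2, 3)}

joint, p_sum = {}, {}
for (xi, pa), (eta, pb) in product(p_xi.items(), p_eta.items()):
    x, y = xi, (xi + eta) % 3
    joint[(x, y)] = joint.get((x, y), Fraction(0)) + pa * pb
    p_sum[x + y] = p_sum.get(x + y, Fraction(0)) + pa * pb

p_x = {x: sum(p for (a, _), p in joint.items() if a == x) for x in range(3)}
p_y = {y: sum(p for (_, b), p in joint.items() if b == y) for y in range(3)}

def pgf(pmf, z):
    return sum(p * z**k for k, p in pmf.items())

for z in (Fraction(1, 2), Fraction(3, 4), 2):
    assert pgf(p_sum, z) == pgf(p_x, z) * pgf(p_y, z)  # factorisation holds

print(joint.get((0, 2), 0), "vs", p_x[0] * p_y[2])     # 0 vs 1/9: dependent
```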



Moment Generating Functions
• Let X be a rv. Then the moment generating function of X is defined by
$$M_X(t) = E\left(e^{tX}\right) = \begin{cases} \displaystyle\int_{-\infty}^{\infty} e^{tx} f_X(x)\, dx & ; X \text{ continuous} \\[6pt] \displaystyle\sum_x e^{tx} f_X(x) & ; X \text{ discrete.} \end{cases}$$
• This is related to the Laplace transform of $f_X(x)$.


• In fact, many texts define the mgf to be $E\left(e^{-tX}\right)$, which is the Laplace transform of $f_X(x)$. We will use the above definition, as it coincides with the definition in the textbook and some other well-known statistics textbooks.



The moment generating function generates moments:
$$M_X(t) = E\left(e^{tX}\right) = E\left(1 + tX + \frac{t^2 X^2}{2!} + \frac{t^3 X^3}{3!} + \cdots\right) = E\left\{\sum_{r=0}^{\infty} \frac{(tX)^r}{r!}\right\}.$$
Assuming that we can take expectations inside the summation sign (we can't always do this), we get
$$M_X(t) = 1 + tE(X) + \frac{t^2}{2!}\, E\left(X^2\right) + \cdots = \sum_{r=0}^{\infty} \frac{t^r}{r!}\, E(X^r) = \sum_{r=0}^{\infty} \mu'_r\, \frac{t^r}{r!}.$$
Therefore the moments $\mu'_k = E\left(X^k\right)$ can be obtained either by
1. finding the coefficient of $\frac{t^k}{k!}$ in the series expansion of $M_X(t)$ about $t = 0$, or



2. evaluating the kth derivative of $M_X(t)$ at $t = 0$: since
$$M_X^{(k)}(t) = \sum_{r=0}^{\infty} r(r-1)\cdots(r-k+1)\, \mu'_r\, \frac{t^{r-k}}{r!} = \sum_{r=k}^{\infty} r(r-1)\cdots(r-k+1)\, \mu'_r\, \frac{t^{r-k}}{r!},$$
it follows, since the only term not involving powers of t is the $r = k$ term, that
$$M_X^{(k)}(0) = \left.\frac{d^k}{dt^k}\, M_X(t)\right|_{t=0} = k!\,\mu'_k / k! = \mu'_k.$$
Note that $M_X(0) = E(1) = 1$.
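As a concrete illustration of both routes, here is a minimal sympy sketch using $M(t) = 1/(1-t)$ — the mgf of a standard exponential rv, a special case of the Gamma mgf derived later — for which $\mu'_k = k!$:

```python
# Minimal sketch: both routes to mu'_k from M(t) = 1/(1 - t), for which mu'_k = k!.
import sympy as sp

t = sp.symbols('t')
M = 1 / (1 - t)

expansion = sp.series(M, t, 0, 6).removeO()
print([expansion.coeff(t, k) * sp.factorial(k) for k in range(6)])  # [1, 1, 2, 6, 24, 120]
print([sp.diff(M, t, k).subs(t, 0) for k in range(6)])              # [1, 1, 2, 6, 24, 120]
```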

For the mgf to exist, it is necessary that $M_X(t)$ exist in a neighbourhood of zero, i.e. that $M_X(t)$ be finite in some interval $(-b, b)$.

The mgf is related to the pgf, since for a discrete random variable X,
$$M_X(t) = G_X(e^t).$$

Trivially, if we wish to calculate central moments, then we use the mgf of the random variable $Y = X - \mu$.

Sometimes it is easier to calculate the mgf of Y directly.



In other cases, we use
$$E\left(e^{tY}\right) = E\left(e^{t(X-\mu)}\right) = e^{-\mu t}\, E\left(e^{tX}\right) = e^{-\mu t}\, M_X(t).$$



Moment generating functions of some known distributions
1. $X \sim U[0, 1]$
$$M_X(t) = E\left(e^{tX}\right) = \int_0^1 e^{tx} \cdot 1\, dx = \begin{cases} \left[\dfrac{e^{tx}}{t}\right]_0^1 & ; t \neq 0 \\[4pt] 1 & ; t = 0 \end{cases} = \begin{cases} \dfrac{1}{t}\left(e^t - 1\right) & ; t \neq 0 \\[4pt] 1 & ; t = 0. \end{cases}$$



• Compute $\mu'_k$ by differentiation:
$$M'_X(t) = \frac{t e^t - (e^t - 1) \times 1}{t^2}.$$
• Thus
$$M'_X(0) = \lim_{t \to 0} \frac{t e^t - (e^t - 1)}{t^2}.$$
• Use l'Hôpital's rule: if $\lim_{t \to 0} f(t) = 0$ and $\lim_{t \to 0} g(t) = 0$, then
$$\lim_{t \to 0} \frac{f(t)}{g(t)} = \lim_{t \to 0} \frac{f'(t)}{g'(t)}.$$



• Therefore, using l'Hôpital's rule, we get
$$M'_X(0) = \lim_{t \to 0} \frac{t e^t + e^t - e^t}{2t} = \lim_{t \to 0} \frac{e^t}{2} = \frac{1}{2},$$
and so
$$\mu'_1 = \mu = \frac{1}{2}.$$



• Compute $\mu'_k$ by series expansion:
$$M_X(t) = \frac{1}{t}\left(e^t - 1\right) = \frac{1}{t}\left(1 + t + \frac{t^2}{2!} + \frac{t^3}{3!} + \frac{t^4}{4!} + \cdots - 1\right) = 1 + \frac{t}{2!} + \frac{t^2}{3!} + \frac{t^3}{4!} + \cdots = \sum_{r=0}^{\infty} \frac{t^r}{(r+1)!} = \sum_{r=0}^{\infty} \frac{t^r}{r!} \cdot \frac{1}{r+1}.$$
Thus
$$\mu'_k = \frac{1}{k+1} \quad \text{for } k = 0, 1, 2, 3, \ldots$$



• Hence
$$\mu = \mu'_1 = \frac{1}{1+1} = \frac{1}{2}, \qquad \mu'_2 = \frac{1}{2+1} = \frac{1}{3},$$
and
$$\sigma^2 = \mu'_2 - \mu^2 = \frac{1}{3} - \frac{1}{4} = \frac{1}{12},$$
etc.
• Computation of the moments is much easier directly!
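That said, the series route above is easy to mechanise; a minimal sympy sketch:

```python
# Minimal sketch: moments of U[0, 1] from the series expansion of its mgf.
import sympy as sp

t = sp.symbols('t')
M = (sp.exp(t) - 1) / t                           # mgf of U[0, 1] for t != 0

expansion = sp.series(M, t, 0, 5).removeO()
moments = [expansion.coeff(t, k) * sp.factorial(k) for k in range(5)]
print(moments)                                    # [1, 1/2, 1/3, 1/4, 1/5]
print(moments[2] - moments[1] ** 2)               # variance = 1/12
```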



2. $X \sim N\left(\mu, \sigma^2\right)$

The mgf of X is
$$M_X(t) = E\left(e^{tX}\right) = \int_{-\infty}^{\infty} e^{tx}\, \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right\} dx = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{\infty} \exp\left\{-\frac{1}{2\sigma^2}\left(x^2 - 2\mu x + \mu^2 - 2\sigma^2 t x\right)\right\} dx.$$

• We now illustrate the principle of integration by recognition: the exponent in the above is
$$-\frac{1}{2\sigma^2}\left(x^2 - 2x\mu + \mu^2 - 2\sigma^2 t x\right),$$
which is quadratic in x.
• The exponent of the pdf is of the form $c(x-a)^2 + d$.
• If we can get the exponent here in the same form, we can use the fact that the pdf of X integrates to 1 to find the mgf.
• The exponent is, completing the square with respect to x,
$$\begin{aligned}
&-\frac{1}{2\sigma^2}\left\{x^2 - 2x\left(\mu + \sigma^2 t\right) + \left(\mu + \sigma^2 t\right)^2 - \left(\mu + \sigma^2 t\right)^2 + \mu^2\right\} \\
&\quad= -\frac{1}{2\sigma^2}\left[\left\{x - \left(\mu + \sigma^2 t\right)\right\}^2 - \mu^2 - 2\mu\sigma^2 t - \sigma^4 t^2 + \mu^2\right] \\
&\quad= -\frac{1}{2\sigma^2}\left\{x - \left(\mu + \sigma^2 t\right)\right\}^2 + \mu t + \frac{1}{2}\sigma^2 t^2.
\end{aligned}$$
• Thus
$$M_X(t) = e^{\mu t + \frac{1}{2}\sigma^2 t^2} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{1}{2\sigma^2}\left\{x - \left(\mu + \sigma^2 t\right)\right\}^2\right] dx = e^{\mu t + \frac{1}{2}\sigma^2 t^2},$$
since the integral is that of the pdf of a $N\left(\mu + \sigma^2 t, \sigma^2\right)$ rv.
• Then
$$M'_X(t) = e^{\mu t + \frac{1}{2}\sigma^2 t^2}\left(\mu + \sigma^2 t\right)$$
and
$$M''_X(t) = e^{\mu t + \frac{1}{2}\sigma^2 t^2}\sigma^2 + \left(\mu + \sigma^2 t\right)^2 e^{\mu t + \frac{1}{2}\sigma^2 t^2}.$$
• Hence
$$E(X) = M'_X(0) = e^{0+0}(\mu + 0) = \mu$$
and
$$E\left(X^2\right) = M''_X(0) = e^{0+0}\sigma^2 + (\mu + 0)^2 e^{0+0} = \sigma^2 + \mu^2.$$
• Consequently
$$\operatorname{var} X = \mu_2 = \sigma^2 + \mu^2 - \mu^2 = \sigma^2.$$
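A minimal sympy sketch confirming these two derivative evaluations:

```python
# Minimal sketch: E(X) and var X for N(mu, sigma^2) from its mgf.
import sympy as sp

t, mu = sp.symbols('t mu', real=True)
sigma = sp.symbols('sigma', positive=True)
M = sp.exp(mu * t + sp.Rational(1, 2) * sigma**2 * t**2)

m1 = sp.diff(M, t).subs(t, 0)                     # E(X)
m2 = sp.diff(M, t, 2).subs(t, 0)                  # E(X^2)
print(m1, sp.simplify(m2 - m1**2))                # mu, sigma**2
```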

3. $X \sim$ Poisson(λ).
$$f_X(x) = \begin{cases} \dfrac{e^{-\lambda}\lambda^x}{x!} & x = 0, 1, 2, \ldots \\[4pt] 0 & \text{otherwise} \end{cases}$$



• Thus
$$M_X(t) = E\left(e^{tX}\right) = \sum_x e^{tx} f_X(x) = \sum_{x=0}^{\infty} e^{tx}\, \frac{e^{-\lambda}\lambda^x}{x!} = e^{-\lambda}\sum_{x=0}^{\infty} \frac{\left(\lambda e^t\right)^x}{x!} = e^{-\lambda} \cdot e^{\lambda e^t} = e^{\lambda(e^t - 1)}.$$



• The derivatives are given by
$$M'_X(t) = e^{\lambda(e^t - 1)}\, \lambda e^t,$$
$$M''_X(t) = \lambda\left\{e^{\lambda(e^t - 1)} e^t + e^t\, e^{\lambda(e^t - 1)}\, \lambda e^t\right\} = \lambda e^t e^{\lambda(e^t - 1)} + \lambda^2 e^{2t} e^{\lambda(e^t - 1)}.$$
• Hence
$$E(X) = M'_X(0) = e^{\lambda(1-1)}\, \lambda e^0 = \lambda$$
and
$$E\left(X^2\right) = M''_X(0) = \lambda + \lambda^2, \qquad \operatorname{var}(X) = \lambda + \lambda^2 - \lambda^2 = \lambda.$$

• The calculation using pgfs was clearly easier.



• Now try series expansion:
$$M_X(t) = 1 + \lambda\left(e^t - 1\right) + \frac{\lambda^2\left(e^t - 1\right)^2}{2!} + \frac{\lambda^3\left(e^t - 1\right)^3}{3!} + \cdots,$$
while
$$e^t - 1 = t + \frac{t^2}{2!} + \frac{t^3}{3!} + \cdots.$$



• Thus
$$\begin{aligned}
M_X(t) &= 1 + \lambda\left(t + \frac{t^2}{2!} + \frac{t^3}{3!} + \cdots\right) + \frac{\lambda^2}{2!}\left\{t^2 + t^3 + \left(\frac{2}{3!} + \frac{1}{4}\right)t^4 + \cdots\right\} + \frac{\lambda^3}{3!}\left(t^3 + \frac{3}{2!}\, t^4 + \cdots\right) + \cdots \\
&= 1 + \lambda t + \left(\lambda + \lambda^2\right)\frac{t^2}{2!} + \left(\lambda + \frac{3!}{2!}\lambda^2 + \lambda^3\right)\frac{t^3}{3!} + \cdots
\end{aligned}$$



• So
$$E(X) = \lambda, \qquad E\left(X^2\right) = \lambda + \lambda^2, \qquad E\left(X^3\right) = \lambda^3 + 3\lambda^2 + \lambda.$$
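These series coefficients can be checked mechanically; a minimal sympy sketch:

```python
# Minimal sketch: first three moments of Poisson(lambda) from its mgf series.
import sympy as sp

t = sp.symbols('t', real=True)
lam = sp.symbols('lamda', positive=True)
M = sp.exp(lam * (sp.exp(t) - 1))

expansion = sp.expand(sp.series(M, t, 0, 4).removeO())
for k in range(1, 4):
    print(sp.expand(expansion.coeff(t, k) * sp.factorial(k)))
# lamda;  lamda**2 + lamda;  lamda**3 + 3*lamda**2 + lamda
```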


• In none of the previous examples was it easy to use the mgf to obtain moments (using either the derivative or the series expansion method).
• Also, the mgf of a distribution may not always exist. Even the existence of all moments of a distribution does not guarantee that the mgf exists.



Example

Suppose that X is a rv with density
$$f(x) = \begin{cases} \frac{1}{2}\exp\left(-\sqrt{x}\right) & \text{if } x \geq 0 \\ 0 & \text{if } x < 0. \end{cases}$$
Then $E(X^k) = \Gamma(2k+2)$, $k = 0, 1, \ldots$, and hence X has moments of any order. However, the mgf $M_X(t)$ does not exist. Indeed, the mgf can be written in the form
$$M_X(t) = \frac{1}{2}\int_0^{\infty} \exp\left(tx - \sqrt{x}\right) dx.$$
If $\varepsilon > 0$ is small enough, then for every t with $0 < t < \varepsilon$ we have $tx - \sqrt{x} \to \infty$ as $x \to \infty$. This implies that $\int_0^{\infty} \exp\left(tx - \sqrt{x}\right) dx = \infty$. Therefore $M_X(t)$ does not exist, in spite of the fact that all moments of X exist.
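A quick numeric sketch of why the integral diverges (the grid of x values is illustrative): for any fixed t > 0 the exponent tx − √x eventually grows without bound, so the integrand does too.

```python
# Minimal numeric sketch: for any fixed t > 0, t*x - sqrt(x) -> infinity,
# so the integrand exp(t*x - sqrt(x)) blows up and the mgf integral diverges.
import numpy as np

t = 0.01                                    # any fixed t > 0 will do
for x in [1e2, 1e4, 1e6, 1e8]:
    print(f"x = {x:.0e}: t*x - sqrt(x) = {t * x - np.sqrt(x):,.0f}")
# x = 1e+02: -9;  x = 1e+04: 0;  x = 1e+06: 9,000;  x = 1e+08: 990,000
```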



Basic Properties of mgfs
• $M_X(0) = 1$.
• $M_X(t)$ may not exist for any real $t \neq 0$.
• $M_{a+bX}(t) = e^{at} M_X(bt)$ for a, b constants.
Proof:
$$M_{a+bX}(t) = E\left\{e^{t(a+bX)}\right\} = E\left\{e^{at} e^{(bt)X}\right\} = e^{at} M_X(bt).$$



Uniqueness Theorem
• The mgf of a rv X uniquely determines the distribution function
of X at all points of continuity.
• The proof of this result comes from the uniqueness theorem for
Laplace transforms.
• The pdf or pf of a rv can thus in principle be obtained by
inverting the mgf.



Moment generating functions of sums of random variables
• Let X have mgf $M_X(t)$ and Y have mgf $M_Y(t)$, with X and Y independent.
• Then the mgf of $W = X + Y$ is
$$\begin{aligned}
M_W(t) &= E\left(e^{tW}\right) = E\left\{e^{t(X+Y)}\right\} = E\left(e^{tX+tY}\right) = E\left(e^{tX} e^{tY}\right) \\
&= E\left(e^{tX}\right) E\left(e^{tY}\right) \quad \text{(by independence of } X \text{ and } Y\text{)} \\
&= M_X(t)\, M_Y(t),
\end{aligned}$$



i.e. the mgf of the sum of two independent random variables is
the product of the mgfs.
• Suppose $X \sim N\left(\mu_X, \sigma_X^2\right)$ and $Y \sim N\left(\mu_Y, \sigma_Y^2\right)$, with X and Y independent.
• Then
$$M_W(t) = M_X(t)\, M_Y(t) = e^{\mu_X t + \frac{1}{2}\sigma_X^2 t^2}\, e^{\mu_Y t + \frac{1}{2}\sigma_Y^2 t^2} = e^{(\mu_X + \mu_Y)t + \frac{1}{2}\left(\sigma_X^2 + \sigma_Y^2\right)t^2}.$$
• By the uniqueness theorem, we have
$$W \sim N\left(\mu_X + \mu_Y,\; \sigma_X^2 + \sigma_Y^2\right),$$
since $M_W(t)$ is recognised as the mgf of a rv which is normally distributed with mean $\mu_X + \mu_Y$ and variance $\sigma_X^2 + \sigma_Y^2$.



• We could have obtained this by using the convolution formula, but this idea is much more powerful and easily generalises:

Extension

• Let $X_1, X_2, \ldots, X_n$ be independent rvs, with mgfs $M_{X_i}(t)$, $i = 1, \ldots, n$.
• Let $Y = \sum_{i=1}^{n} X_i$.



• Then
$$\begin{aligned}
M_Y(t) &= E\left(e^{tY}\right) = E\left(e^{t\sum_{i=1}^{n} X_i}\right) = E\left(\prod_{i=1}^{n} e^{tX_i}\right) \\
&= \prod_{i=1}^{n} E\left(e^{tX_i}\right) \quad \text{(by independence of the } X_i\text{s)} \\
&= \prod_{i=1}^{n} M_{X_i}(t).
\end{aligned}$$
• Thus the mgf of the sum of independent random variables is the product of the mgfs.



We want to know if
$$M_{X_1+X_2}(t) = M_{X_1}(t)\, M_{X_2}(t)$$
implies the independence of $X_1$ and $X_2$.

Example

Let $(X_1, X_2)$ be a two-dimensional random vector defined by the table:

X1 \ X2      1       2       3
    1      2/18    1/18    3/18
    2      3/18    2/18    1/18
    3      1/18    3/18    2/18



We can easily find that $X_1$ and $X_2$ are identically distributed rvs taking each of the values 1, 2, 3 with probability 1/3. The sum $Y = X_1 + X_2$ is a rv taking values 2, 3, 4, 5, 6 with probabilities 1/9, 2/9, 3/9, 2/9, 1/9 respectively. For the mgfs we get
$$M_{X_1}(t) = M_{X_2}(t) = \frac{1}{3}\left(e^t + e^{2t} + e^{3t}\right),$$
$$M_Y(t) = \frac{1}{9}\left(e^{2t} + 2e^{3t} + 3e^{4t} + 2e^{5t} + e^{6t}\right).$$
Clearly
$$M_{X_1+X_2}(t) = M_{X_1}(t)\, M_{X_2}(t).$$
However, the rvs $X_1$ and $X_2$ are not independent, as can be seen easily from the table above:
$$P(X_1 = i, X_2 = j) \neq P(X_1 = i)\, P(X_2 = j) \quad \text{for all } i \neq j.$$



Special cases

• If the $X_i$ are identically distributed as well as being independent, then
$$M_{X_i}(t) = M_X(t), \text{ say, for all } i.$$
• Thus $Y = \sum_{i=1}^{n} X_i$ has mgf
$$M_Y(t) = \prod_{i=1}^{n} M_{X_i}(t) = \prod_{i=1}^{n} M_X(t) = \{M_X(t)\}^n.$$



• Now suppose that $Y = \sum_{i=1}^{n} a_i X_i$, where the $X_i$s are independent and the $a_i$s are constants.
• Then
$$\begin{aligned}
M_Y(t) &= E\left(e^{t\sum_{i=1}^{n} a_i X_i}\right) = E\left(\prod_{i=1}^{n} e^{t a_i X_i}\right) \\
&= \prod_{i=1}^{n} E\left(e^{t a_i X_i}\right) \quad \text{(by independence of the } X_i\text{s)} \\
&= \prod_{i=1}^{n} M_{X_i}(a_i t) = \prod_{i=1}^{n} M_X(a_i t) \text{ if the } X_i\text{s are iid.}
\end{aligned}$$



– Special case $a_i = \frac{1}{n}$, $i = 1, \ldots, n$. Then
$$Y = \sum_{i=1}^{n} \frac{1}{n} X_i = \frac{1}{n}\sum_{i=1}^{n} X_i = \bar{X},$$
and so
$$M_{\bar{X}}(t) = \prod_{i=1}^{n} M_X\left(\frac{t}{n}\right) = \left\{M_X\left(\frac{t}{n}\right)\right\}^n.$$



– Example: $X_i$ iid $\sim N\left(\mu_X, \sigma_X^2\right)$. Then
$$M_{\bar{X}}(t) = \left\{M_X\left(\frac{t}{n}\right)\right\}^n = \left[\exp\left\{\mu_X \frac{t}{n} + \frac{1}{2}\sigma_X^2\left(\frac{t}{n}\right)^2\right\}\right]^n = \exp\left\{\mu_X \frac{t}{n}\, n + \frac{1}{2}\sigma_X^2\left(\frac{t}{n}\right)^2 n\right\} = e^{\mu_X t + \frac{1}{2}\frac{\sigma_X^2}{n} t^2},$$
which is the mgf of a rv which is normally distributed with mean $\mu_X$ and variance $\frac{\sigma_X^2}{n}$.
– Hence, by applying the uniqueness theorem, it follows that if the $X_i$ are iid $N\left(\mu_X, \sigma_X^2\right)$, then
$$\bar{X} \sim N\left(\mu_X, \frac{\sigma_X^2}{n}\right).$$
– Previously, without the use of mgfs, we were able to find the mean and variance of $\bar{X}$, but not determine its exact distribution.
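A quick Monte Carlo sketch consistent with this result (all parameter values illustrative):

```python
# Minimal simulation sketch: the mean of n iid N(mu, sigma^2) draws behaves
# like N(mu, sigma^2/n), as the mgf argument shows.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 2.0, 3.0, 10, 200_000     # illustrative values

xbar = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)
print(xbar.mean(), xbar.var())                 # ~ mu = 2.0, sigma^2/n = 0.9
```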

Further examples
1. The mgf of a rv $X \sim$ Bin(n, p) is $\left(1 - p + pe^t\right)^n$ (by putting $z = e^t$ in its pgf).
• Suppose $X_1 \sim$ Bin($n_1$, p), independently of $X_2 \sim$ Bin($n_2$, p).



Then
$$M_{X_1+X_2}(t) = M_{X_1}(t)\, M_{X_2}(t) = \left(1 - p + pe^t\right)^{n_1}\left(1 - p + pe^t\right)^{n_2} = \left(1 - p + pe^t\right)^{n_1+n_2}.$$
• Thus $X_1 + X_2 \sim$ Bin($n_1 + n_2$, p) by the uniqueness theorem.

2. A rv $X \sim$ Poisson(λ) has mgf $M_X(t) = e^{\lambda(e^t - 1)}$ (by putting $z = e^t$ in its pgf).
• Suppose $X_i \sim$ Poisson($\lambda_i$), $i = 1, 2, \ldots, n$, are independent.
• Let $Y = \sum_{i=1}^{n} X_i$.



• Then
$$M_Y(t) = \prod_{i=1}^{n} M_{X_i}(t) = \prod_{i=1}^{n} e^{\lambda_i(e^t - 1)} = e^{\sum_{i=1}^{n}\lambda_i (e^t - 1)} = e^{\lambda(e^t - 1)},$$
where $\lambda = \sum_{i=1}^{n} \lambda_i$.
• Thus $Y \sim$ Poisson(λ) by application of the uniqueness theorem.
Question: Is $\bar{X}$ Poisson?
Answer: No, since $\bar{X}$ can take non-integer values.



3. The mgf of a rv X which is distributed NegBin(k, p) is
$$\left\{\frac{pe^t}{1 - (1-p)e^t}\right\}^k.$$
• Suppose $X_1$ and $X_2$ are independent and negative binomially distributed with parameters $(k_1, p)$ and $(k_2, p)$.
• Then
$$M_{X_1+X_2}(t) = \left\{\frac{pe^t}{1 - (1-p)e^t}\right\}^{k_1}\left\{\frac{pe^t}{1 - (1-p)e^t}\right\}^{k_2} = \left\{\frac{pe^t}{1 - (1-p)e^t}\right\}^{k_1+k_2}.$$
• Application of the uniqueness theorem gives $X_1 + X_2 \sim$ NegBin($k_1 + k_2$, p).



4. Sum of i.i.d. Gammas.
• We'll put $\beta = 1/\gamma$ to make the algebra easier.
• If $X \sim G(\alpha, \beta)$, then
$$\begin{aligned}
M_X(t) &= \int_0^{\infty} e^{tx}\, \frac{\gamma^{\alpha} x^{\alpha-1} e^{-\gamma x}}{\Gamma(\alpha)}\, dx = \gamma^{\alpha}\int_0^{\infty} \frac{x^{\alpha-1} e^{-(\gamma-t)x}}{\Gamma(\alpha)}\, dx \\
&= \frac{\gamma^{\alpha}}{(\gamma-t)^{\alpha}} \int_0^{\infty} \frac{(\gamma-t)^{\alpha} x^{\alpha-1} e^{-(\gamma-t)x}}{\Gamma(\alpha)}\, dx \\
&= \begin{cases} \left(\dfrac{\gamma}{\gamma-t}\right)^{\alpha} & ; t < \gamma \\[4pt] \infty & ; t \geq \gamma \end{cases}
= \begin{cases} \left(\dfrac{1}{1-\beta t}\right)^{\alpha} & ; t < 1/\beta \\[4pt] \infty & ; t \geq 1/\beta. \end{cases}
\end{aligned}$$



• If $X_1, \ldots, X_n$ are i.i.d. $G(\alpha, \beta)$, then the mgf of $Y = \sum_{i=1}^{n} X_i$ is
$$M_Y(t) = \prod_{i=1}^{n} (1 - \beta t)^{-\alpha} = (1 - \beta t)^{-n\alpha}.$$
• Application of the uniqueness theorem gives $\sum_{i=1}^{n} X_i \sim G(n\alpha, \beta)$.
• Note: we needed to keep the βs the same for each $X_i$.
• We previously obtained this result using convolutions.



5. Sample mean of i.i.d. Gamma rvs: letting $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$, we have from above
$$M_{\bar{X}}(t) = M_Y\left(\frac{t}{n}\right) = \left(1 - \frac{\beta t}{n}\right)^{-n\alpha}.$$
Application of the uniqueness theorem gives $\bar{X} \sim G\left(n\alpha, \frac{\beta}{n}\right)$.
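A Monte Carlo sketch consistent with results 4 and 5 (parameters illustrative; numpy's shape/scale parameterisation matches $(\alpha, \beta)$ as used here):

```python
# Minimal simulation sketch: sums and means of iid Gamma(alpha, beta) draws.
import numpy as np

rng = np.random.default_rng(1)
alpha, beta, n, reps = 2.0, 0.5, 5, 200_000

x = rng.gamma(alpha, beta, size=(reps, n))
y = x.sum(axis=1)                       # should behave like G(n*alpha, beta)
print(y.mean(), y.var())                # ~ n*alpha*beta = 5, n*alpha*beta^2 = 2.5
xbar = x.mean(axis=1)                   # should behave like G(n*alpha, beta/n)
print(xbar.mean(), xbar.var())          # ~ alpha*beta = 1, alpha*beta^2/n = 0.1
```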



Cumulant Generating Function
• Definition: $K_X(t) = \log M_X(t)$ is called the cumulant generating function (cgf) of X.
• It generates “cumulants” in the same way that the mgf generates moments.
• Let
$$K_X(t) = \sum_{r=0}^{\infty} \frac{\kappa_r}{r!}\, t^r.$$
• Then $\kappa_r$ is called the rth cumulant of X.
• If the mgf exists for a distribution, then so does the cgf.



• Let $X_1$ and $X_2$ be independent rvs and let $Y = X_1 + X_2$. Then
$$K_Y(t) = \log M_Y(t) = \log\{M_{X_1}(t)\, M_{X_2}(t)\} = \log M_{X_1}(t) + \log M_{X_2}(t) = K_{X_1}(t) + K_{X_2}(t).$$
• The cgf of the sum of two independent rvs is thus the sum of the two cgfs.
• Let $Y = \sum_{i=1}^{n} X_i$, where the $X_i$s are i.i.d.
• Then
$$K_Y(t) = \log M_Y(t) = \log\{M_X(t)\}^n = n\log M_X(t),$$



and
$$K_{\bar{X}}(t) = \log M_{\bar{X}}(t) = \log\left\{M_X\left(\frac{t}{n}\right)\right\}^n = n\log M_X\left(\frac{t}{n}\right) = nK_X\left(\frac{t}{n}\right).$$
• These results will come in useful later.
• As with mgfs, cumulants can be obtained from cgfs in two different ways:
1. Differentiation
2. Taylor Series Expansion



• Note: there is no simple formula for $\kappa_r$ in general, and cumulants have no intrinsic meaning.
1. Differentiation
$$K_X(0) = \log M_X(0) = \log 1 = 0,$$
$$K'_X(t) = \frac{d}{dt}\{\log M_X(t)\} = \frac{M'_X(t)}{M_X(t)},$$
$$\therefore \kappa_1 = K'_X(0) = \frac{M'_X(0)}{M_X(0)} = \frac{\mu'_1}{1} = \mu'_1 = \mu_X,$$
i.e. the first cumulant is $E(X) = \mu_X$.
2. Now
$$K''_X(t) = \frac{d}{dt}\left\{\frac{M'_X(t)}{M_X(t)}\right\} = \frac{M''_X(t)\, M_X(t) - \{M'_X(t)\}^2}{\{M_X(t)\}^2}.$$



Thus
$$\kappa_2 = K''_X(0) = \frac{1 \times \mu'_2 - (\mu'_1)^2}{1^2} = \sigma_X^2,$$
i.e. the second cumulant is the variance $\sigma_X^2$. Similarly,
$$K^{(3)}_X(0) = \kappa_3 = \mu_3,$$
the third central moment. Also
$$K^{(4)}_X(0) = \kappa_4 = \mu_4 - 3\sigma^4$$
is not really meaningful, except that all of the cumulants of a normal random variable from the 3rd onwards are 0, since when $X \sim N\left(\mu, \sigma^2\right)$,
$$K_X(t) = \mu t + \frac{1}{2}\sigma^2 t^2,$$
and so $K^{(r)}_X(t) = 0$ when $r \geq 3$.
3. The Taylor series expansion of $K_X(t)$ is found by expanding $M_X(t)$:
$$K_X(t) = \log M_X(t) = \log\left(1 + \mu'_1 t + \mu'_2\frac{t^2}{2!} + \mu'_3\frac{t^3}{3!} + \cdots\right) = \log\{1 + h(t)\},$$
where $h(t) = \mu'_1 t + \mu'_2\frac{t^2}{2!} + \mu'_3\frac{t^3}{3!} + \cdots$. Now, if $|S| < 1$,
$$\log(1 + S) = S - \frac{1}{2}S^2 + \frac{1}{3}S^3 - \frac{1}{4}S^4 + \cdots.$$



Thus, as long as $|h(t)| < 1$,
$$K_X(t) = h(t) - \frac{1}{2}\{h(t)\}^2 + \frac{1}{3}\{h(t)\}^3 - \frac{1}{4}\{h(t)\}^4 + \cdots = \left(\mu'_1 t + \mu'_2\frac{t^2}{2!} + \mu'_3\frac{t^3}{3!} + \cdots\right) - \frac{1}{2}\left(\mu'_1 t + \mu'_2\frac{t^2}{2!} + \mu'_3\frac{t^3}{3!} + \cdots\right)^2 + \cdots.$$
Expanding and grouping like terms, we have
$$K_X(t) = \mu'_1 t + \left\{\mu'_2 - (\mu'_1)^2\right\}\frac{t^2}{2!} + \cdots.$$
Since $\kappa_r$ is the coefficient of $\frac{t^r}{r!}$ in $K_X(t)$, we have
$$\kappa_0 = 0, \qquad \kappa_1 = \mu'_1 = \mu, \qquad \kappa_2 = \mu'_2 - (\mu'_1)^2 = \sigma^2.$$
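A minimal sympy sketch of this expansion route, using the Poisson(λ) cgf $K_X(t) = \lambda(e^t - 1)$, whose cumulants are all equal to λ:

```python
# Minimal sketch: cumulants from the series expansion of a cgf.
# For Poisson(lambda), K(t) = lambda*(e^t - 1), so kappa_r = lambda for r >= 1.
import sympy as sp

t = sp.symbols('t', real=True)
lam = sp.symbols('lamda', positive=True)
K = sp.log(sp.exp(lam * (sp.exp(t) - 1)))      # cgf = log of the mgf

expansion = sp.expand(sp.series(K, t, 0, 5).removeO())
print([expansion.coeff(t, r) * sp.factorial(r) for r in range(1, 5)])
# [lamda, lamda, lamda, lamda]
```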



“Central” Cumulant Generating Function
• As with the mgf, we can find the cgf of $Y = X - \mu$ from
$$K_{X-\mu}(t) = \log M_{X-\mu}(t) = \log\left\{e^{-\mu t} M_X(t)\right\} = -\mu t + K_X(t).$$
• Thus the only cumulants of Y and X which differ are the first cumulants.

