110SOR201(2002)
Chapter 6
6.1  Moment generating functions

Our previous discussion of probability generating functions was in the context of discrete r.v.s. We now introduce a more general form of generating function which can be used (though not exclusively so) for continuous r.v.s.
The moment generating function (MGF) of a random variable X is defined as

    M_X(θ) = E(e^{θX}) = { Σ_x e^{θx} P(X = x)            if X is discrete
                         { ∫_{−∞}^{∞} e^{θx} f_X(x) dx    if X is continuous     (6.1)

for all real θ for which the sum or integral converges absolutely. In some cases the existence of M_X(θ) can be a problem for non-zero θ: henceforth we assume that M_X(θ) exists in some neighbourhood of the origin, |θ| < θ_0. In this case the following can be proved:
(i) There is a unique distribution with MGF M_X(θ).

(ii) Moments about the origin may be found by power series expansion: thus we may write

    M_X(θ) = E(e^{θX}) = E( Σ_{r=0}^∞ (θX)^r / r! )
           = Σ_{r=0}^∞ (θ^r / r!) E(X^r)    [interchange of E and Σ valid]

i.e.

    M_X(θ) = Σ_{r=0}^∞ μ'_r θ^r / r!,    (6.2)

where μ'_r = E(X^r). So, given a function which is known to be the MGF of a r.v. X, expansion of this function in a power series of θ gives μ'_r, the r-th moment about the origin, as the coefficient of θ^r / r!.
(iii) Moments about the origin may also be found by differentiation: thus

    (d^r/dθ^r){M_X(θ)} = (d^r/dθ^r) E(e^{θX})
                       = E( (d^r/dθ^r) e^{θX} )    (i.e. interchange of E and differentiation valid)
                       = E( X^r e^{θX} ).

So

    [ (d^r/dθ^r){M_X(θ)} ]_{θ=0} = E(X^r) = μ'_r.    (6.3)
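As a quick numerical illustration of (6.3) (added here, not part of the original notes), an MGF can be differentiated by finite differences at θ = 0 to recover moments. The sketch below, in plain Python, uses the Exponential(λ) distribution as a worked case: its MGF is M(θ) = λ/(λ − θ) for θ < λ, so M'(0) = 1/λ and M''(0) = 2/λ².

```python
lam = 2.0

# MGF of the Exponential(lam) distribution: M(theta) = lam / (lam - theta), theta < lam
def M(theta):
    return lam / (lam - theta)

h = 1e-4  # step for central finite differences

# First derivative at theta = 0 -> E(X) = 1/lam = 0.5
m1 = (M(h) - M(-h)) / (2 * h)

# Second derivative at theta = 0 -> E(X^2) = 2/lam^2 = 0.5
m2 = (M(h) - 2 * M(0.0) + M(-h)) / h**2

print(m1, m2)
```

Central differences have O(h²) error, so both estimates agree with the exact moments to several decimal places.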
(iv) If we require moments about the mean, μ_r = E[(X − μ)^r], we consider M_{X−μ}(θ), which can be obtained from M_X(θ) as follows:

    M_{X−μ}(θ) = E(e^{θ(X−μ)}) = e^{−μθ} M_X(θ).    (6.4)

Then μ_r is found as the coefficient of θ^r/r! in the expansion

    M_{X−μ}(θ) = Σ_{r=0}^∞ μ_r θ^r / r!,    (6.5)

or by differentiation:

    μ_r = [ (d^r/dθ^r){M_{X−μ}(θ)} ]_{θ=0}.    (6.6)

More generally, for constants a and b,

    M_{a+bX}(θ) = E(e^{θ(a+bX)}) = e^{aθ} M_X(bθ).    (6.7)
Example
Find the MGF of the N(0, 1) distribution and hence of N(μ, σ²). Find the moments about the mean of N(μ, σ²).

Solution
If Z ∼ N(0, 1),

    M_Z(θ) = E(e^{θZ})
           = ∫_{−∞}^{∞} (1/√(2π)) e^{θz} e^{−z²/2} dz
           = ∫_{−∞}^{∞} (1/√(2π)) exp{−½(z² − 2θz + θ²) + ½θ²} dz
           = exp(½θ²) ∫_{−∞}^{∞} (1/√(2π)) exp{−½(z − θ)²} dz
           = exp(½θ²).

But here (1/√(2π)) exp{−½(z − θ)²} is the p.d.f. of N(θ, 1), so the integral equals 1.

If X = μ + σZ, then X ∼ N(μ, σ²), and

    M_X(θ) = M_{μ+σZ}(θ)
           = e^{μθ} M_Z(σθ)    by (6.7)
           = exp(μθ + ½σ²θ²).    (6.8)
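Formula (6.8) can be sanity-checked by simulation (an added sketch, not part of the notes; plain Python standard library only): estimate E(e^{θX}) from samples of N(μ, σ²) and compare with exp(μθ + ½σ²θ²).

```python
import math
import random

random.seed(42)

mu, sigma, theta = 1.0, 1.5, 0.5
n = 200_000

# Monte Carlo estimate of the MGF E(e^{theta X}) for X ~ N(mu, sigma^2)
est = sum(math.exp(theta * random.gauss(mu, sigma)) for _ in range(n)) / n

# Closed form from (6.8): exp(mu*theta + sigma^2 * theta^2 / 2)
exact = math.exp(mu * theta + 0.5 * sigma**2 * theta**2)

print(est, exact)
```

With 200 000 samples the Monte Carlo standard error is well below 0.01, so the two values agree to about two decimal places.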
Then

    M_{X−μ}(θ) = e^{−μθ} M_X(θ) = exp(½σ²θ²)
               = Σ_{r=0}^∞ (½σ²θ²)^r / r!  =  Σ_{r=0}^∞ (σ^{2r} / (2^r r!)) θ^{2r}
               = Σ_{r=0}^∞ [ σ^{2r} (2r)! / (2^r r!) ] · θ^{2r}/(2r)!.
So, comparing with (6.5),

    μ_{2r} = σ^{2r} (2r)! / (2^r r!),    μ_{2r+1} = 0,    (6.9)

and in particular μ_2 = σ², μ_4 = 3σ^4.

6.2  Sums of independent r.v.s
Theorem
Let X, Y be independent r.v.s with MGFs M_X(θ), M_Y(θ) respectively. Then

    M_{X+Y}(θ) = M_X(θ) . M_Y(θ).    (6.10)

Proof

    M_{X+Y}(θ) = E(e^{θ(X+Y)})
               = E(e^{θX} . e^{θY})
               = E(e^{θX}) . E(e^{θY})    [independence]
               = M_X(θ) . M_Y(θ).    ∎

Corollary
More generally, if X_1, ..., X_n are independent r.v.s, then

    M_{X_1+⋯+X_n}(θ) = M_{X_1}(θ) ⋯ M_{X_n}(θ).    (6.11)

Note that if X is a discrete r.v. taking values 0, 1, 2, ... with PGF G_X(s), then

    M_X(θ) = G_X(e^θ).    (6.12)

Here the PGF is generally preferred, so we shall concentrate on the MGF applied to continuous r.v.s.
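Relation (6.12) can be illustrated numerically for the Poisson(λ) distribution, whose PGF is G(s) = exp{λ(s − 1)} (an added check, not in the notes): substituting s = e^θ should reproduce the MGF obtained by direct summation.

```python
import math

lam, theta = 3.0, 0.4

# By (6.12): MGF = G(e^theta), with Poisson PGF G(s) = exp(lam*(s - 1))
mgf_via_pgf = math.exp(lam * (math.exp(theta) - 1.0))

# Direct evaluation of E(e^{theta X}) = sum_k e^{theta k} e^{-lam} lam^k / k!
# (terms beyond k = 60 are negligible for lam = 3)
mgf_direct = sum(
    math.exp(theta * k) * math.exp(-lam) * lam**k / math.factorial(k)
    for k in range(60)
)

print(mgf_via_pgf, mgf_direct)
```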
Example
Let Z_1, ..., Z_n be independent N(0, 1) r.v.s. Show that

    V = Z_1² + ⋯ + Z_n²  ∼  χ²_n.    (6.13)

Solution

    M_{Z²}(θ) = E(e^{θZ²})
              = ∫_{−∞}^{∞} (1/√(2π)) e^{θz²} e^{−z²/2} dz
              = ∫_{−∞}^{∞} (1/√(2π)) exp{−½(1 − 2θ)z²} dz.

Substitute y = z√(1 − 2θ), valid for θ < ½. Then

    M_{Z²}(θ) = ∫_{−∞}^{∞} (1/√(2π)) e^{−y²/2} (1 − 2θ)^{−1/2} dy
              = (1 − 2θ)^{−1/2},    θ < ½.    (6.14)

Hence, by (6.11),

    M_V(θ) = (1 − 2θ)^{−1/2} . (1 − 2θ)^{−1/2} ⋯ (1 − 2θ)^{−1/2}
           = (1 − 2θ)^{−n/2},    θ < ½.

Now χ²_n has the p.d.f.

    f(w) = [1 / (2^{n/2} Γ(n/2))] w^{n/2 − 1} e^{−w/2},    w ≥ 0;  n a positive integer.

Its MGF is

    ∫_0^∞ e^{θw} [1 / (2^{n/2} Γ(n/2))] w^{n/2 − 1} e^{−w/2} dw
        = ∫_0^∞ [1 / (2^{n/2} Γ(n/2))] w^{n/2 − 1} exp{−½w(1 − 2θ)} dw
        = [(1 − 2θ)^{−n/2} / Γ(n/2)] ∫_0^∞ t^{n/2 − 1} e^{−t} dt    (t = ½w(1 − 2θ),  θ < ½)
        = (1 − 2θ)^{−n/2},    θ < ½
        = M_V(θ).

Since there is a unique distribution with a given MGF, it follows that V ∼ χ²_n.    ∎
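A quick simulation of V = Z_1² + ⋯ + Z_n² (an added illustration, not in the notes): the sample mean and variance should be close to n and 2n, the known mean and variance of χ²_n.

```python
import random

random.seed(0)

n, trials = 5, 100_000

# Simulate V = Z_1^2 + ... + Z_n^2 with Z_i independent N(0, 1)
samples = [sum(random.gauss(0.0, 1.0) ** 2 for _ in range(n)) for _ in range(trials)]

mean = sum(samples) / trials
var = sum((v - mean) ** 2 for v in samples) / trials

# chi^2_n has mean n and variance 2n
print(mean, var)
```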
6.3  Bivariate MGF

The bivariate MGF (or joint MGF) of the continuous r.v.s (X, Y) with joint p.d.f. f(x, y), −∞ < x, y < ∞, is defined as

    M_{X,Y}(θ_1, θ_2) = E(e^{θ_1 X + θ_2 Y})    (6.15)

provided the integral converges absolutely (there is a similar definition for the discrete case). If M_{X,Y}(θ_1, θ_2) exists near the origin, for |θ_1| < θ_{10}, |θ_2| < θ_{20} say, then it can be shown that

    [ ∂^{r+s} M_{X,Y}(θ_1, θ_2) / ∂θ_1^r ∂θ_2^s ]_{θ_1=θ_2=0} = E(X^r Y^s).    (6.16)

The bivariate MGF can also be used to find the MGF of aX + bY, since

    M_{aX+bY}(θ) = E(e^{θ(aX+bY)}) = M_{X,Y}(aθ, bθ).    (6.17)
Example
Using MGFs:
(i) show that if (U, V) ∼ N(0, 0; 1, 1; ρ), then ρ(U, V) = ρ, and deduce ρ(X, Y), where (X, Y) ∼ N(μ_x, μ_y; σ_x², σ_y²; ρ);
(ii) for the variables (X, Y) in (i), find the distribution of a linear combination aX + bY, and generalise the result obtained to the multivariate Normal case.

Solution
(i) We have

    M_{U,V}(θ_1, θ_2) = E(e^{θ_1 U + θ_2 V})
        = ∫_{−∞}^{∞} ∫_{−∞}^{∞} e^{θ_1 u + θ_2 v} [1 / (2π√(1 − ρ²))] exp{ −[u² − 2ρuv + v²] / (2(1 − ρ²)) } du dv
        = ⋯ = exp{½(θ_1² + 2ρθ_1θ_2 + θ_2²)}.

Then

    ∂M_{U,V}(θ_1, θ_2)/∂θ_1 = exp{⋯}(θ_1 + ρθ_2)
    ∂²M_{U,V}(θ_1, θ_2)/∂θ_1∂θ_2 = exp{⋯}(θ_1 + ρθ_2)(θ_2 + ρθ_1) + ρ exp{⋯}.

So

    E(UV) = [ ∂²M_{U,V}(θ_1, θ_2)/∂θ_1∂θ_2 ]_{θ_1=θ_2=0} = ρ.

Since E(U) = E(V) = 0 and Var(U) = Var(V) = 1, we have that the correlation coefficient of U, V is

    ρ(U, V) = Cov(U, V)/√(Var(U).Var(V)) = (E(UV) − E(U)E(V))/1 = ρ.

Now let

    X = μ_x + σ_x U,    Y = μ_y + σ_y V.

Then

    M_{X,Y}(θ_1, θ_2) = E(e^{θ_1(μ_x + σ_x U) + θ_2(μ_y + σ_y V)})
        = e^{θ_1 μ_x + θ_2 μ_y} M_{U,V}(θ_1 σ_x, θ_2 σ_y)
        = exp{(θ_1 μ_x + θ_2 μ_y) + ½(θ_1² σ_x² + 2ρθ_1θ_2 σ_x σ_y + θ_2² σ_y²)},

the MGF of (X, Y) ∼ N(μ_x, μ_y; σ_x², σ_y²; ρ); since correlation is unchanged by the linear transformation, ρ(X, Y) = ρ, with Cov(X, Y) = ρσ_x σ_y.

(ii) So, for a linear combination of X and Y,

    M_{aX+bY}(θ) = M_{X,Y}(aθ, bθ)
        = exp{(aμ_x + bμ_y)θ + ½(a²σ_x² + 2ab Cov(X, Y) + b²σ_y²)θ²}
        = MGF of N(aμ_x + bμ_y, a²σ_x² + 2ab Cov(X, Y) + b²σ_y²),

i.e.

    aX + bY ∼ N( aE(X) + bE(Y),  a²Var(X) + 2ab Cov(X, Y) + b²Var(Y) ).    (6.18)
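Result (6.18) can be checked by simulating correlated normals (an added sketch, not in the notes): generating V = ρU + √(1 − ρ²)W with U, W independent N(0, 1) gives Corr(U, V) = ρ, after which the sample mean and variance of aX + bY can be compared with the formula.

```python
import math
import random

random.seed(1)

rho = 0.6
mx, my, sx, sy = 2.0, -1.0, 1.5, 0.5
a, b = 2.0, 3.0
trials = 200_000

vals = []
for _ in range(trials):
    u = random.gauss(0.0, 1.0)
    # v is standard normal with correlation rho to u
    v = rho * u + math.sqrt(1 - rho**2) * random.gauss(0.0, 1.0)
    x, y = mx + sx * u, my + sy * v
    vals.append(a * x + b * y)

mean = sum(vals) / trials
var = sum((w - mean) ** 2 for w in vals) / trials

# (6.18): mean = a*mx + b*my, variance = a^2 sx^2 + 2ab*rho*sx*sy + b^2 sy^2
exact_mean = a * mx + b * my                                    # = 1.0
exact_var = a**2 * sx**2 + 2 * a * b * rho * sx * sy + b**2 * sy**2  # = 16.65

print(mean, var)
```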
More generally, let (X_1, ..., X_n) be multivariate normally distributed. Then, by induction,

    Σ_{i=1}^n a_i X_i ∼ N( Σ_{i=1}^n a_i E(X_i),  Σ_{i=1}^n a_i² Var(X_i) + 2 Σ_{i<j} a_i a_j Cov(X_i, X_j) ).    (6.19)

(If the Xs are also independent, the covariance terms vanish, but then there is a simpler derivation (see HW 8).)
6.4  Sequences of r.v.s

6.4.1  Continuity theorem

Let {X_n} be a sequence of r.v.s with MGFs M_{X_n}(θ), and let X be a r.v. with MGF M_X(θ). If

    M_{X_n}(θ) → M_X(θ)    as n → ∞, for all θ,

then the c.d.f. of X_n converges to the c.d.f. of X as n → ∞ (at each point where the latter is continuous).

Example: Let X_n ∼ Bin(n, p). Then

    M_{X_n}(θ) = (q + pe^θ)^n = {1 + λ(e^θ − 1)/n}^n,    where λ = np.    (6.20)

Since (1 + a/n)^n → e^a as n → ∞, a constant, we have, for all θ,

    M_{X_n}(θ) → exp{λ(e^θ − 1)}    as n → ∞,

i.e.

    M_{X_n}(θ) → MGF of Poisson(λ)    (6.21)

(use (6.12), replacing s by e^θ in the Poisson PGF (3.7)). So, invoking the above continuity theorem,

    Bin(n, p) → Poisson(λ)    (6.22)

as n → ∞, p → 0 with np = λ > 0 fixed. Hence in large samples, the binomial distribution can be approximated by the Poisson distribution. As a rule of thumb: the approximation is acceptable when n is large, p small, and λ = np ≤ 5.
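The Poisson approximation (6.22) can be seen numerically (an added illustration, not in the notes): comparing the Bin(100, 0.03) p.m.f. with the Poisson(3) p.m.f. term by term shows a small maximum discrepancy.

```python
import math

n, p = 100, 0.03
lam = n * p  # = 3.0

def binom_pmf(k):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k):
    return math.exp(-lam) * lam**k / math.factorial(k)

# Largest pointwise difference between the two p.m.f.s
max_diff = max(abs(binom_pmf(k) - poisson_pmf(k)) for k in range(n + 1))

print(max_diff)
```

Here np² = 0.09, and the observed maximum pointwise difference is an order of magnitude smaller still, consistent with the rule of thumb.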
6.4.2  Asymptotic normality
Let {X_n} be a sequence of r.v.s (discrete or continuous). If two quantities a and b > 0 can be found such that

    c.d.f. of (X_n − a)/b  →  c.d.f. of N(0, 1)    as n → ∞,    (6.23)

X_n is said to be asymptotically normally distributed with mean a and variance b², and we write

    (X_n − a)/b ∼a N(0, 1)    or    X_n ∼a N(a, b²).    (6.24)

Notes: (i) a and b need not be functions of n; but often a and b² are the mean and variance of X_n (and so are functions of n).
(ii) In large samples we use N(a, b²) as an approximation to the distribution of X_n.
6.5  The Central Limit Theorem

A restricted form of this celebrated theorem will now be stated and proved.
Theorem
Let X_1, X_2, ... be a sequence of independent identically distributed r.v.s, each with mean μ and variance σ². Let

    S_n = X_1 + X_2 + ⋯ + X_n,    Z_n = (S_n − nμ)/(σ√n).

Then

    Z_n ∼a N(0, 1)    or    S_n ∼a N(nμ, nσ²),

i.e. P(Z_n ≤ z) → P(Z ≤ z) as n → ∞, where Z ∼ N(0, 1).

Proof
Let Y_i = X_i − μ. Then

    S_n − nμ = X_1 + ⋯ + X_n − nμ = Y_1 + ⋯ + Y_n.

So

    M_{S_n − nμ}(θ) = M_{Y_1}(θ) . M_{Y_2}(θ) ⋯ M_{Y_n}(θ) = {M_Y(θ)}^n,

and

    M_{Z_n}(θ) = M_{S_n − nμ}( θ/(σ√n) ) = { M_Y( θ/(σ√n) ) }^n.

Note that

    E(Y) = E(X − μ) = 0;    E(Y²) = E{(X − μ)²} = σ².

Then

    M_Y(θ) = 1 + (θ/1!) E(Y) + (θ²/2!) E(Y²) + (θ³/3!) E(Y³) + ⋯
           = 1 + ½σ²θ² + g(θ),

where g(θ)/θ² → 0 as θ → 0. So

    M_{Z_n}(θ) = {1 + ½σ² (θ²/(σ²n)) + o(1/n)}^n = {1 + (½θ²)/n + o(1/n)}^n
              → e^{θ²/2}    as n → ∞

(since h(n) = n . o(1/n) → 0 as n → ∞), and e^{θ²/2} is the MGF of N(0, 1). Hence, by the continuity theorem,

    c.d.f. of (S_n − nμ)/(σ√n)  →  c.d.f. of N(0, 1)    as n → ∞,

i.e.

    Z_n ∼a N(0, 1)    or    S_n ∼a N(nμ, nσ²).    (6.25)    ∎
Corollary
Let X̄_n = (1/n) Σ_{i=1}^n X_i. Then X̄_n ∼a N(μ, σ²/n).    (6.26)

Proof
X̄_n = S_n/n, which by the Theorem is asymptotically normal with mean nμ . (1/n) = μ and variance nσ² . (1/n²) = σ²/n, i.e.

    X̄_n ∼a N(μ, σ²/n).    ∎
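The CLT can be checked by simulation (an added sketch, not part of the notes): with X_i ∼ Uniform(0, 1), so μ = ½ and σ² = 1/12, the proportion of simulated samples with Z_n ≤ 1 should be close to Φ(1) ≈ 0.8413.

```python
import math
import random

random.seed(7)

n, trials = 48, 20_000
mu = 0.5
sigma = math.sqrt(1.0 / 12.0)  # s.d. of Uniform(0, 1)

# Estimate P(Z_n <= 1) for Z_n = (S_n - n*mu) / (sigma * sqrt(n))
hits = 0
for _ in range(trials):
    s = sum(random.random() for _ in range(n))
    z = (s - n * mu) / (sigma * math.sqrt(n))
    if z <= 1.0:
        hits += 1

prop = hits / trials
print(prop)  # should be close to Phi(1) = 0.8413...
```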
Example 1
Use the CLT to approximate the binomial distribution for large n.

Solution
If S_n ∼ Bin(n, p), then S_n = X_1 + ⋯ + X_n, where the X_i are independent and

    X_i = 1 with probability p,  0 with probability q = 1 − p,

so that E(X_i) = p, Var(X_i) = pq. Hence, by the CLT,

    S_n ∼a N(np, npq),

i.e., for large n, the binomial c.d.f. is approximated by the c.d.f. of N(np, npq).    ∎

[As a rule of thumb: the approximation is acceptable when n is large and p not too far from ½, such that np > 5.]
Example 2
As Example 1, but for the χ²_n distribution.

Solution
Write V_n = Z_1² + ⋯ + Z_n², where the Z_i are independent and

    Z_i ∼ N(0, 1),    so    E(Z_i²) = 1,  Var(Z_i²) = 2.

So, by the CLT,

    V_n ∼a N(n, 2n).    ∎

Note: These are not necessarily the best approximations for large n. Thus:

(i)

    P(S_n ≤ s) ≈ P( Z ≤ (s + ½ − np)/√(npq) ) = Φ( (s + ½ − np)/√(npq) ),    where Z ∼ N(0, 1).

The ½ is a continuity correction, to take account of the fact that we are approximating a discrete distribution by a continuous one.
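The effect of the continuity correction can be seen directly (an added illustration, not from the notes): for S_n ∼ Bin(20, 0.4), compare the exact P(S_n ≤ 9) with Φ((9.5 − np)/√(npq)).

```python
import math

def phi(x):
    # Standard normal c.d.f. via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

n, p, s = 20, 0.4, 9
q = 1 - p

# Exact binomial c.d.f. P(S_n <= s)
exact = sum(math.comb(n, k) * p**k * q**(n - k) for k in range(s + 1))

# Normal approximation with continuity correction
approx = phi((s + 0.5 - n * p) / math.sqrt(n * p * q))

print(exact, approx)
```

Even at n = 20 the corrected approximation agrees with the exact probability to about two decimal places; without the +½ the error is noticeably larger.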
(ii)

    √(2V_n)  is approximately  N(√(2n − 1), 1).

6.6
Characteristic function

The MGF does not exist unless all the moments of the distribution are finite, so many distributions (e.g. the t and F distributions) do not have MGFs, and another generating function is often used instead.

The characteristic function of a continuous r.v. X is

    C_X(θ) = E(e^{iθX}) = ∫_{−∞}^{∞} e^{iθx} f(x) dx,    (6.27)

where θ is real and i = √(−1). C_X(θ) always exists, and has similar properties to M_X(θ). The CF uniquely determines the p.d.f.:

    f(x) = (1/2π) ∫_{−∞}^{∞} C_X(θ) e^{−iθx} dθ    (6.28)

(cf. the Fourier transform). The CF is particularly useful in studying limiting distributions. However, we do not consider the CF further in this module.
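Definition (6.27) can be illustrated numerically (an added sketch, not part of the notes): for the N(0, 1) density the CF is e^{−θ²/2}, and since the density is even the imaginary part of (6.27) vanishes, leaving a cosine integral that a simple Riemann sum reproduces.

```python
import math

def cf_normal(theta, dx=1e-3, L=10.0):
    # Riemann sum for the real part of E(e^{i*theta*X}), X ~ N(0, 1):
    # integral of cos(theta*x) * (1/sqrt(2*pi)) * exp(-x^2/2) over [-L, L]
    steps = int(2 * L / dx) + 1
    total = 0.0
    for i in range(steps):
        x = -L + i * dx
        total += math.cos(theta * x) * math.exp(-0.5 * x * x) * dx
    return total / math.sqrt(2 * math.pi)

val = cf_normal(1.0)
exact = math.exp(-0.5)  # e^{-theta^2/2} at theta = 1

print(val, exact)
```

The integrand decays so fast that truncating at |x| = 10 and using a step of 10⁻³ already gives agreement to many decimal places.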