L18 Cumulants PDF

19:39 01/11/2001
TOPIC. Cumulants. Just as the generating function M of a random variable X generates its moments, the logarithm of M generates a sequence of numbers called the cumulants of X. Cumulants
are of interest for a variety of reasons, an especially important one
being the fact that the j th cumulant of a sum of independent random
variables is simply the sum of the j th cumulants of the summands.
Definitions and examples. To begin with, suppose that X is a
real random variable whose real moment generating function M (u) =
E(euX ) is finite for all us in an open interval about 0. Since M (0) =
1 6= 0 and since the composition of two functions which have power
series expansions itself has a power series expansion, we may write
X n un
(log M )(u) =
(1)
n=0 n!
for all u in some (possibly smaller) open interval about 0. The numbers 1 , 2 , . . . in this expansion are called the cumulants (for a
reason that will be explained subsequently). Notice that
n = (log M )(n) (0)
(2)
all n; in particular 0 = log(M (0)) = log(1) = 0. (1) implies (see

Theorem 12.4) that for all z in some open ball about 0 in C
X n z n
(log G)(z) =
;
(3)
n=1 n!
here G(z) = E(ezX ) is the complex generating function and log
denotes the principal branch of the complex logarithm function. In
particular the characteristic function (t) = E(eitX ) of X satisfies
X n in tn
K(t) := log((t)) =
(4)
n=1
n!
for all t in some open interval about 0 in R, and
n = K (n) (0)/in
(5)
tions were developed in the exercises in Section 11. None of those

results are used here. Moreover
there
is a change in notation
in
Section 11 K(t) denoted log M (t) , whereas here K(t) is log (t) .)
Formula (5) is used to define the first n cumulants of X when X
is only known to have an nth moment, i.e., when E(|X|n ) < . In
this situation is n-times continuously differentiable and close to 1
in an open interval about 0 R, and so K = log() is defined and
n-times continuously differentiable in that interval. Consequently the
j th cumulant
j := K (j) (0)/ij
(6)
exists for j = 1, . . . , n, and

Xn ij j tj
+ o(tn ) as t 0.
K(t) =
j=1
j!
(7)
The following observation is useful for recognizing the cumulants.

If X has a nth moment and 1 , . . . , n are numbers such that
Xn ij j tj
K(t) =
+ o(tn ) as t 0,
(81 )
j=1
j!
then (see Exercise 1)
j = j for j = 1, . . . , n .
(82 )
Example 1. (A) Suppose X N (, 2 ). Then

2 t2
2
for t near 0. X has infinitely many cumulants since its moment generating function is finite in a neighborhood of 0 (actually, finite everywhere); formula (5) gives
(t) = eit e
2 2
t /2
and K(t) = log((t)) = it
1 = , 2 = 2 , 3 = 4 = = 0.
(9)
for all n N. Since the functions log M , log G, and K = log generate the cumulants, they are called cumulant generating functions
(CGFs). (Some properties of cumulants and their generating func-
Since the first and second cumulants of any random variables are its
mean and variance (see (19) and (20)), the normal distribution has
the simplest possible cumulants.
18 1
18 2
(B) Suppose X Gamma(r). Then

1 r
(t) =
and
1 it
X (it)n
X (n 1)! (it)n
K(t) = r log(1 it) = r
=r
n=1
n=1
n
n!
Properties of cumulants. This section develops some useful properties of cumulants. The nth moment of cX is cn times the nth moment
of X; this scaling property is shared by the cumulants.
Theorem 1 (Homogeneity). Suppose X is a random variable with
an nth cumulant. Then for any c R, cX has an nth cumulant and
for t near 0. X has cumulants of

orders. By the uniqueness of the
Pall
coefficients in the power series n=1 n in tn/n!, we get

n = r(n 1)! for n 1.
(10)
it
1)
and
K(t) = (eit 1) =
(13)
Proof For t near 0 we have

cX (t) = E(eitcX ) = X (ct)
(C) Suppose that X Poisson(). Then

(t) = e(e
n (cX) = cn n (X).
X
n=1
= KcX (t) = KX (ct)
in tn
n!
(n)
(n)
= KcX (t) = cn KX (ct)
for t near 0. Evidently
(n)
1 = 2 = 3 = = .
(11)
(D) Suppose that X has an unnormalized t-distribution with 3 degrees

of freedom; its density has the form
c
(1 + x2 )2 .
X has only two moments, and thus only two cumulants. It turns out
(see Exercise 13.11) that (t) = e|t| (1 + |t|) for all t R. Thus
K(t) = |t| + log 1 + |t|
|t|2
|t|3
= |t| + |t|
+
+
(for |t| < 1)
2
3
it
i2 t2
it
i2 t2
t2
+ o(t2 ) = 1 + 2
+ o(t2 )
= + o(t2 ) = 0 + 1
2
1!
2!
1!
2!
as t 0. (82 ) gives
The nth moment of X + b is a linear combination of the first

n moments of X (with what coefficients?). The situation regarding
cumulants is much simpler:
Theorem 2 (Semi-invariance). Suppose X is a random variable
with an nth cumulant. Then for any b R, X + b has an nth cumulant
and
n (X) + b, if n = 1,
(14)
n (X + b) =
n (X),
if n > 1.
Proof For t near 0 we have
X+b (t) = E(eit(X+b) ) = eitb E(eitX ) = eitb X (t)
= KX+b (t) = itb + KX (t)
(n)
1 = 0 and 2 = 1.
(12)
3 , 4 , . . . are undefined.
(n)
= n (cX) = KcX (0)/in = cn KX (0)/in = cn n (X).
= KX+b (t) =
dn
dtn itb
(n)
+ KX (t)
= (14) holds.
18 3
18 4
19:39 01/11/2001
Here is the reason for the name cumulants; note that (16) is
much simpler than the corresponding relation for moments.
Theorem 3 (Cumulants accumulate). Suppose X and Y are independent random variables, each having an nth cumulant. Then
S := X + Y has an nth cumulant, and
n (S) = n (X) + n (Y ).
(15)
Proof S (t) = X (t)Y (t), so KS (t) = KX (t) + KY (t).

Now we investigate the relationship between moments and cumulants. We first consider moments about 0, which we write as
j := E(X j )
(16)
for j = 0, 1, 2, . . . . Note that 0 = 1.

Theorem 4 (The cumulant/moment connection). Suppose X
is a random variable with n moments 1 , . . . , n . Then X has n
cumulants 1 , . . . , n , and
Xr r
r+1 =
j r+1j for r = 0, . . . , n 1.
(17)
j=0 j
and j = K (j) (0)/ij
where (t) = E(eitX ) and K(t) = log((t)), or, equivalently, (t) =

eK(t) , for all t near 0. Differentiating this last identity gives
0 (t) = eK(t) K 0 (t) = (t)K 0 (t)
and evaluating this at t = 0 gives
i1 = 1(i1 ) = 1 = 1 = (17) holds for r = 0.
Differentiating (18) r times gives (see Exercise 3)
Xr r
(r+1) (t) =
(j) (t) (K 0 )rj (t)
j=0 j
and evaluating this at t = 0 shows that (17) holds for 1 r < n.
18 5
Pr
(18)

j r+1j for r = 0, . . . , n 1.
r
j=0 j
Writing out (17) for r = 0, . . . , 3 produces

1
2
3
4
= 1 ,
= 2 + 1 1 ,
= 3 + 21 2 + 2 1 ,
= 4 + 31 3 + 32 2 + 3 1 .
(19)
These recursive formulas can be used to calculate the s efficiently

from the s, and vice versa. When X has mean 0, that is, when
1 = 0 = 1 , j becomes
j := E (X E(X))j
and formulas (19) simplify to
2 = 2 ,
2 = 2 ,
3 = 3 ,
3 = 3
4 = 4 +
Proof For j = 0, . . . , n we have

j = (j) (0)/ij
(17): r+1 =
322 ,
4 = 4
(20)
322 .
Since the central moments 2 , 3 , . . . and the cumulants 2 , 3 , . . .

are unaffected by adding a constant to X, these formulas are valid
even when E(X) 6= 0. Note that 2 is simply the variance of X.
Example 2. The following display exhibits the moment/cumulant
connection for some important distributions:
X N (, 2 )
1 = , 1 = ,
X Gamma(r)
X Poisson()
1 = r, 1 = r,
1 = , 1 = ,
2 = r
2 = , 2 = ,
3 = 0,
3 = 0,
4 = 0,
4 = 3 4, 4 = 6r, 4 = 6r + 3r2, 4 = , 4 = + 32.
2 = , 2 = ,
2 = r,
3 = 2r, 3 = 2r,
18 6
3 = , 3 = ,
More on the cumulant/moment connection. Equations (19)

express 1 , . . . , 4 recursively in terms of 1 , . . . , 4 . By carrying
out the recursions one finds that
2 = 2 + 21 ,
m2 times
mk times
:= m1 + m2 + + mk
3 = 3 + 32 1 + 31 ,
4 = 4 + 43 1 + 322 + 62 21 + 41 .
Note that the formula for 4 is a linear combination of products of j s
and that, including multiplicities, the subscripts j on the j s in those
products form the following lists:
These are all the possible lists of nonincreasing positive integers which
add to 4; theyre called the additive partitions of 4. In what follows
IPam going to show that for any n N, n is a sum of the form
partitions of n, c is a
c where ranges over the additive
Q
certain number depending on , and = j j .
We need some notation. For a positive integer n let
Pn be the collection of all additive partitions of n.
(21)
By definition an element of Pn has the form

= [ j1 , . . . , j1 , j2 , . . . , j2 , . . . , jk , . . . , jk ]
| {z } | {z }
| {z }
m2 times
(221 )
mk times
for some number k, the ji s and

Pk mi s being positive integers satisfying
j1 > j2 > > jk and n = i=1 mi ji . Shorthand for (221 ) is
m
[ j1m1, j2m2, . . . , jk k ];
(23)
elements, of which k := k (namely, j1 , j2 , . . . , jk ) are distinct. The

quantities
c :=
n!
(j1 !)m1 (j2 !)m2
(jk
!)mk
1
m1 ! m2 ! mk !
d := c (1) 1 ( 1)!
[4], [3, 1], [2, 2], [2, 1, 1], and [1, 1, 1, 1] .
m1 times
ements, of which only 3 (namely 4, 2, and 1) are distinct. In general

the partition (22) has a total of
1 = 1 ,
m1 times
(22): = [ j1m1, j2m2, . . . , jk k ] := [ j1 , . . . , j1 , j2 , . . . , j2 , . . . , jk , . . . , jk ].

| {z } | {z }
| {z }
(222 )
(25)
play an important role in what follows, as do the products

Y
Y
:=
j and :=
j .
j
(24)
(26)
Theorem 5. For a random variable having moments 1 , . . . , n and

cumulants 1 , . . . , n ,
X
X
n =
c and n =
d ;
(27)
the sums here are taken over the elements = [ j1m1, j2m2, . . . , jk k ]
of Pn and c , d , , and are defined by (24)(26) above.
Example 3. One of the additive partitions of 11 is = [41 , 22 , 13 ] =
[ j1m1 , j2m2 , j3m3 ] for j1 = 4, j2 = 2, j3 = 1, m1 = 1, m2 = 2, m3 = 3.
According to (27), the contribution this makes to 11 is c where
:= 4 22 31
and
c :=
11!
1
= 34650 .
4! (2!)2 (1!)3 1! 2! 3!
note that in (222 ) the notation ji i means replicate ji a total of

1 2 3
mi times, not raise ji to the mth
i power. For example [4 , 2 , 1 ]
denotes the partition [4, 2, 2, 1, 1, 1] of n = 11; this partition has 6 el-
Moreover since = 1 + 2 + 3 = 6, the contribution makes to 11

is d where
18 7
18 8
= 4 22 13
and d = c (1)5 5! = 4158000 .
19:39 01/11/2001
(23): :=
Pk
i=1
mi
(24): c := n!/
Qk
mi
mi !
i=1 (ji !)
As we will see, Theorem 5 is a special case of the formula given

below for the nth derivative of the composition of two functions. The
formula applies to real or complex valued functions of a real or complex variable. For example, one of the functions may be a characteristic function, which is (in general) a complex-valued function of a
real variable, and the other may be the complex logarithm function.
Recall that a function is said to have an nth derivative at a point
if the function is (n 1)-times differentiable in an open neighborhood
of the point and the (n 1)st derivative is differentiable at the point.
Theorem 6 (Fa`
a di Brunos formula). Let n be a positive integer
and let f and g be two functions such that: the composite function
h := g(f ) is defined in an open neighborhood of a point x; f has an
nth derivative at x; and g has an nth derivative at y := f (x). Then h
has an nth derivative at x given by the formula
hY
i
X
h(n) (x) =
c g ( ) (y)
f (j) (x)
(28)
Pn
where c and are defined by (24) and (23) respectively, f (j) (x) is
the j th derivative of f at x, and g () (y) is the th derivative of g at y.
Example 4. For the additive partition [21 , 11 ] of n = 3 one has
= 1 + 1 = 2 and c = 3!/[(21 1!)(11 1!)] = 3. According to Fa`a di
Brunos formula,
derivative of h = g(f ) should contain the
0 the third
00
00
term 3g f (x) f (x)f (x). This can be verified by direct calculation:
h00 = g 00 (f )(f 0 )2 + g 0 (f )f 00 ,
h000 = g 000 (f ) (f 0 )3 + 2g 00 (f )f 0 f 00 + g 00 (f )f 0 f 00 + g 0 (f )f 000

= g 000 (f )(f 0 )3 + 3g 00 (f )f 0 f 00 + g 0 (f )f 000 .
18 9
Pn c
(j)
g ( ) f (x)
(x)
j f
Example 5. Let X be a random variable with characteristic function

, cumulant generating function K = log(), moments 1 , . . . , n and
cumulants 1 .. . n . in n is h(n) (x) for x = 0 and h(t) = (t) =
eK(t) = g f (t) with f (t) = K(t) and g(z) = ez . (28) applies because
K is defined in a neighborhood of t. Since f (j) (x) = ij j and g () (z) =
ez equals 1 at z = y = f (x) = 0, Fa`
a di Brunos formula (28) implies
in n =
X
Pn
hY
j
i
X
ij j = in
Pn
c ;
dividing through by in gives the LHS of (27). Similarly,
the RHS of
(27) follows by applying (28) for x = 0 to K(t) = g (t) with g(z) =
log(z) and using g () (z) = (1)1 ( 1)!/z = (1)1 ( 1)! for
z = y = (0) = 1.
Proof of Theorem 6. The method used in Example 4 shows that

h has an nth derivative at x, so that by Taylors theorem
h() =
Xn
j=0
h(j) (x)
( x)j + o | x|n as x.
j!
I am going to show that

h() =
h0 = g 0 (f )f 0 ,
(28): h(n) (x) =
h = g(f )
Xn
j=0
Hj
( x)j + o | x|n as x
j!
for certain numbers H0 , H1 , . . . , Hn , with Hn being defined by the

RHS of (28). This implies h(j) (x) = Hj for j = 0, . . . , n, and in
particular that h(n) (x) = Hn , as (28) asserts.
18 10
(28): h(n) (x) =
h = g(f )
Pn c
(j)
g ( ) f (x)
(x)
j f
Since f has an nth derivative at x, Taylors theorem implies that

Xn fj
( x)j + o | x|n as x
(29)
f () =
j=0 j!
for fj = f (j) (x). Similarly, since g has an nth derivative at y = f (x),
Xn g
(30)
g() =
( y) + o | y|n as y
=0 !
for g = g () (y). Thus
Xn g
h() = g f () =
f () f (x) + o |f () f (x)|n
=0 !
Xn g Xn fi
=
( x)i + o | x|n
+ o | x|n
=0 !
i=1 i!
Xn g Xn fi
=
( x)i + o | x|n
=0 !
i=1 i!
X n Hj
=
( x)j + o | x|n
(31)
j=0 j!
=1
n
X
g h
=1
j=0
j=0
as t 0 through R, then aj = bj for j = 0, . . . , n. [Hint: use

induction on j.]
Exercise 2. (a) Suppose X and Y are independent random variables,

each having an nth moment. As in Theorem 3, put S = X + Y .
Express n (S) in terms of j (X) and j (Y ) for j = 0, . . . , n. (b)
Suppose b R and X has an nth moment. Express n (X + b) in
terms of j (X) for j = 0, . . . , n.
Exercise 3. Let (a, b) be an open subinterval of R and let f and g

be complex-valued functions defined on (a, b). Show that if f and g
are n-times differentiable, then so is h := f g and
Xn n
h(n) (t) =
f (j) (t) g (nj) (t)
(33)
j=0 j
for each t (a, b). [Hint: use induction on n].
as x, where
Xn
Hn = n!
= n!
Exercise 1. Show that if a0 , . . . , an and b0 , . . . , bn are complex

numbers such that
Xn
Xn
aj tj =
bj tj + o(tn )
fi1 fi2 fi i
g hX
i1 ,i2 ,...,i 1
!
i1 +i2 ++i =n i1 ! i2 ! i !
m
fjm1 1 fjk k
X
m
[ j1 1,...,jk k ]Pn
m1 ++mk =
(j1 !)m1 (jk !)mk
i
!
. (32)
m1 ! mk !
The last step uses the fact that the number of -tuples i1 , i2 , . . . , i
which contain m1 j1 s, m2 j2 s, . . . , and mk jk s (for
j1 , . . . , jk )
distinct
!
is given by the multinomial coefficient m1 m
= m1 !m
. This
k
k!
completes the proof of (28).
It is worth noting that (29) and (30) for arbitrary fj s and g s
imply (31) with Hn given by (32). This result doesnt require f and
g to be differentiable,
18 11
The next four exercises deal with the cumulants of a random

variable U uniformly distributed over the interval [1/2, 1/2].
Exercise 4. Let U be as above. Show that U has CGF
sinh(t/2)
K(t) := log E(etU ) = log

t/2
for all real t. (sinh(x) := (ex ex )/2; take sinh(0)/0 := 1.)
(34)
Exercise 5. Let K be as in (34). Show that

K 0 (t) =
1 1
1
+
,
2
t
exp(t) 1
for t 6= 0, while K 0 (0) = 0.
18 12
19:39 01/11/2001
Let B0 , B1 , . . . be the so-called Bernoulli numbers, i.e., the

coefficients in the power series expansion
X Bk tk
t
=
;
k=0 k!
exp(t) 1
(35)
Exercise 8. Confirm Fa`

a di Brunos formula by computing the fifth
derivative of h = g(f ) and checking the result against the RHS of ().
in particular
B0 = 1,
B6 =
1
,
42
Exercise 7. Let U be as above. Confirm the values of 2 , . . . , 12

in (38) by computing the first 12 central moments of U and using
the formulas for cumulants in terms of moments; use Maple or the
equivalent to do the arithmetic.
1
B1 = ,
2
1
B8 = ,
30
1
,
6
5
=
,
66
B2 =
B10
1
,
30
691
=
2730
B4 =
B12
(36)
while B3 = B5 = = B11 = 0. (See Abramowitz and Stegun

page 804, or do help (bernoulli) in Maple.) Formula (35) holds
for |t| < 2 .
Exercise 9. For integers 1 m n let pn,m be the number of

additive partitions of n for which the largest element is m, and let pn
be the total number of additive partitions of n. Show that
if m = n,
1,
pn,m = Xmin(m,nm)
pnm,j , if m < n.
j=1
and that
Exercise 6. Let U and the Bernoulli numbers be as above. By
integrating K 0 (s) from 0 to t, show that
K(t) =
X Bk tk
k=2 k k!
(37)
pn =
Xn
m=1
pn,m .
Use these relations to compute pn for n = 1, . . . , 10.
for |t| < 2, and hence that the k th cumulant of U is Bk /k, for k 2;
in particular all cumulants of odd order equal zero, while
2 =
1
12
1
8 =
,
240
1
,
120
1
=
,
132
4 =
10
Optional: prove this.

function which is defined
|z z0 | < r }, then f is
P
(n)
(z0 )(z z0 )n/n!
n=0 f
1
,
252
691
=
.
32760
6 =
12
(38)
Use the fact that if f is a complex valued

and differentiable in the disk D := { z :
infinitely differentiable in D and f (z) =
for all z D.
18 13
18 14

L18 Cumulants PDF

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

L18 Cumulants PDF

Uploaded by

Copyright:

Available Formats

19:39 01/11/2001

all n; in particular 0 = log(M (0)) = log(1) = 0. (1) implies (see

tions were developed in the exercises in Section 11. None of those

exists for j = 1, . . . , n, and

The following observation is useful for recognizing the cumulants.

Example 1. (A) Suppose X N (, 2 ). Then

and K(t) = log((t)) = it

(B) Suppose X Gamma(r). Then

for t near 0. X has cumulants of

coefficients in the power series n=1 n in tn/n!, we get

Proof For t near 0 we have

(C) Suppose that X Poisson(). Then

= KcX (t) = KX (ct)

= KcX (t) = cn KX (ct)

for t near 0. Evidently

(D) Suppose that X has an unnormalized t-distribution with 3 degrees

K(t) = |t| + log 1 + |t|

The nth moment of X + b is a linear combination of the first

= n (cX) = KcX (0)/in = cn KX (0)/in = cn n (X).

Proof S (t) = X (t)Y (t), so KS (t) = KX (t) + KY (t).

for j = 0, 1, 2, . . . . Note that 0 = 1.

where (t) = E(eitX ) and K(t) = log((t)), or, equivalently, (t) =

Writing out (17) for r = 0, . . . , 3 produces

These recursive formulas can be used to calculate the s efficiently

Proof For j = 0, . . . , n we have

Since the central moments 2 , 3 , . . . and the cumulants 2 , 3 , . . .

4 = 3 4, 4 = 6r, 4 = 6r + 3r2, 4 = , 4 = + 32.

More on the cumulant/moment connection. Equations (19)

By definition an element of Pn has the form

for some number k, the ji s and

elements, of which k := k (namely, j1 , j2 , . . . , jk ) are distinct. The

[4], [3, 1], [2, 2], [2, 1, 1], and [1, 1, 1, 1] .

ements, of which only 3 (namely 4, 2, and 1) are distinct. In general

(22): = [ j1m1, j2m2, . . . , jk k ] := [ j1 , . . . , j1 , j2 , . . . , j2 , . . . , jk , . . . , jk ].

play an important role in what follows, as do the products

Theorem 5. For a random variable having moments 1 , . . . , n and

note that in (222 ) the notation ji i means replicate ji a total of

Moreover since = 1 + 2 + 3 = 6, the contribution makes to 11

and d = c (1)5 5! = 4158000 .

As we will see, Theorem 5 is a special case of the formula given

h000 = g 000 (f ) (f 0 )3 + 2g 00 (f )f 0 f 00 + g 00 (f )f 0 f 00 + g 0 (f )f 000

Example 5. Let X be a random variable with characteristic function

dividing through by in gives the LHS of (27). Similarly,

Proof of Theorem 6. The method used in Example 4 shows that

I am going to show that

(28): h(n) (x) =

for certain numbers H0 , H1 , . . . , Hn , with Hn being defined by the

(28): h(n) (x) =

Since f has an nth derivative at x, Taylors theorem implies that

as t 0 through R, then aj = bj for j = 0, . . . , n. [Hint: use

Exercise 2. (a) Suppose X and Y are independent random variables,

Exercise 3. Let (a, b) be an open subinterval of R and let f and g

Exercise 1. Show that if a0 , . . . , an and b0 , . . . , bn are complex

(j1 !)m1 (jk !)mk

The next four exercises deal with the cumulants of a random

K(t) := log E(etU ) = log

Exercise 5. Let K be as in (34). Show that

for t 6= 0, while K 0 (0) = 0.

Let B0 , B1 , . . . be the so-called Bernoulli numbers, i.e., the

Exercise 8. Confirm Fa`

Exercise 7. Let U be as above. Confirm the values of 2 , . . . , 12

while B3 = B5 = = B11 = 0. (See Abramowitz and Stegun

Exercise 9. For integers 1 m n let pn,m be the number of