You are on page 1of 7

19:39 01/11/2001

TOPIC. Cumulants. Just as the generating function M of a random variable X generates its moments, the logarithm of M generates a sequence of numbers called the cumulants of X. Cumulants
are of interest for a variety of reasons, an especially important one
being the fact that the j th cumulant of a sum of independent random
variables is simply the sum of the j th cumulants of the summands.
Definitions and examples. To begin with, suppose that X is a
real random variable whose real moment generating function M (u) =
E(euX ) is finite for all us in an open interval about 0. Since M (0) =
1 6= 0 and since the composition of two functions which have power
series expansions itself has a power series expansion, we may write
X n un
(log M )(u) =
(1)
n=0 n!
for all u in some (possibly smaller) open interval about 0. The numbers 1 , 2 , . . . in this expansion are called the cumulants (for a
reason that will be explained subsequently). Notice that
n = (log M )(n) (0)

(2)

all n; in particular 0 = log(M (0)) = log(1) = 0. (1) implies (see


Theorem 12.4) that for all z in some open ball about 0 in C
X n z n
(log G)(z) =
;
(3)
n=1 n!
here G(z) = E(ezX ) is the complex generating function and log
denotes the principal branch of the complex logarithm function. In
particular the characteristic function (t) = E(eitX ) of X satisfies
X n in tn
K(t) := log((t)) =
(4)
n=1
n!
for all t in some open interval about 0 in R, and
n = K (n) (0)/in

(5)

tions were developed in the exercises in Section 11. None of those


results are used here. Moreover
there
is a change in notation

in
Section 11 K(t) denoted log M (t) , whereas here K(t) is log (t) .)
Formula (5) is used to define the first n cumulants of X when X
is only known to have an nth moment, i.e., when E(|X|n ) < . In
this situation is n-times continuously differentiable and close to 1
in an open interval about 0 R, and so K = log() is defined and
n-times continuously differentiable in that interval. Consequently the
j th cumulant
j := K (j) (0)/ij

(6)

exists for j = 1, . . . , n, and


Xn ij j tj
+ o(tn ) as t 0.
K(t) =
j=1
j!

(7)

The following observation is useful for recognizing the cumulants.


If X has a nth moment and 1 , . . . , n are numbers such that
Xn ij j tj
K(t) =
+ o(tn ) as t 0,
(81 )
j=1
j!
then (see Exercise 1)
j = j for j = 1, . . . , n .

(82 )

Example 1. (A) Suppose X N (, 2 ). Then


2 t2
2
for t near 0. X has infinitely many cumulants since its moment generating function is finite in a neighborhood of 0 (actually, finite everywhere); formula (5) gives
(t) = eit e

2 2

t /2

and K(t) = log((t)) = it

1 = , 2 = 2 , 3 = 4 = = 0.

(9)

for all n N. Since the functions log M , log G, and K = log generate the cumulants, they are called cumulant generating functions
(CGFs). (Some properties of cumulants and their generating func-

Since the first and second cumulants of any random variables are its
mean and variance (see (19) and (20)), the normal distribution has
the simplest possible cumulants.

18 1

18 2

(B) Suppose X Gamma(r). Then


1 r
(t) =
and
1 it
X (it)n
X (n 1)! (it)n
K(t) = r log(1 it) = r
=r
n=1
n=1
n
n!

Properties of cumulants. This section develops some useful properties of cumulants. The nth moment of cX is cn times the nth moment
of X; this scaling property is shared by the cumulants.
Theorem 1 (Homogeneity). Suppose X is a random variable with
an nth cumulant. Then for any c R, cX has an nth cumulant and

for t near 0. X has cumulants of


orders. By the uniqueness of the
Pall

coefficients in the power series n=1 n in tn/n!, we get


n = r(n 1)! for n 1.

(10)

it

1)

and

K(t) = (eit 1) =

(13)

Proof For t near 0 we have


cX (t) = E(eitcX ) = X (ct)

(C) Suppose that X Poisson(). Then


(t) = e(e

n (cX) = cn n (X).

X
n=1

= KcX (t) = KX (ct)

in tn
n!

(n)

(n)

= KcX (t) = cn KX (ct)

for t near 0. Evidently

(n)

1 = 2 = 3 = = .

(11)

(D) Suppose that X has an unnormalized t-distribution with 3 degrees


of freedom; its density has the form
c
(1 + x2 )2 .
X has only two moments, and thus only two cumulants. It turns out
(see Exercise 13.11) that (t) = e|t| (1 + |t|) for all t R. Thus

K(t) = |t| + log 1 + |t|

|t|2
|t|3
= |t| + |t|
+
+
(for |t| < 1)
2
3
it
i2 t2
it
i2 t2
t2
+ o(t2 ) = 1 + 2
+ o(t2 )
= + o(t2 ) = 0 + 1
2
1!
2!
1!
2!
as t 0. (82 ) gives

The nth moment of X + b is a linear combination of the first


n moments of X (with what coefficients?). The situation regarding
cumulants is much simpler:
Theorem 2 (Semi-invariance). Suppose X is a random variable
with an nth cumulant. Then for any b R, X + b has an nth cumulant
and

n (X) + b, if n = 1,
(14)
n (X + b) =
n (X),
if n > 1.
Proof For t near 0 we have
X+b (t) = E(eit(X+b) ) = eitb E(eitX ) = eitb X (t)
= KX+b (t) = itb + KX (t)
(n)

1 = 0 and 2 = 1.

(12)

3 , 4 , . . . are undefined.

(n)

= n (cX) = KcX (0)/in = cn KX (0)/in = cn n (X).

= KX+b (t) =

dn
dtn itb

(n)

+ KX (t)

= (14) holds.

18 3

18 4

19:39 01/11/2001

Here is the reason for the name cumulants; note that (16) is
much simpler than the corresponding relation for moments.
Theorem 3 (Cumulants accumulate). Suppose X and Y are independent random variables, each having an nth cumulant. Then
S := X + Y has an nth cumulant, and
n (S) = n (X) + n (Y ).

(15)

Proof S (t) = X (t)Y (t), so KS (t) = KX (t) + KY (t).


Now we investigate the relationship between moments and cumulants. We first consider moments about 0, which we write as
j := E(X j )

(16)

for j = 0, 1, 2, . . . . Note that 0 = 1.


Theorem 4 (The cumulant/moment connection). Suppose X
is a random variable with n moments 1 , . . . , n . Then X has n
cumulants 1 , . . . , n , and
Xr r
r+1 =
j r+1j for r = 0, . . . , n 1.
(17)
j=0 j
and j = K (j) (0)/ij

where (t) = E(eitX ) and K(t) = log((t)), or, equivalently, (t) =


eK(t) , for all t near 0. Differentiating this last identity gives
0 (t) = eK(t) K 0 (t) = (t)K 0 (t)
and evaluating this at t = 0 gives
i1 = 1(i1 ) = 1 = 1 = (17) holds for r = 0.
Differentiating (18) r times gives (see Exercise 3)
Xr r
(r+1) (t) =
(j) (t) (K 0 )rj (t)
j=0 j
and evaluating this at t = 0 shows that (17) holds for 1 r < n.
18 5

Pr

(18)


j r+1j for r = 0, . . . , n 1.

r
j=0 j

Writing out (17) for r = 0, . . . , 3 produces


1
2
3
4

= 1 ,
= 2 + 1 1 ,
= 3 + 21 2 + 2 1 ,
= 4 + 31 3 + 32 2 + 3 1 .

(19)

These recursive formulas can be used to calculate the s efficiently


from the s, and vice versa. When X has mean 0, that is, when
1 = 0 = 1 , j becomes

j := E (X E(X))j
and formulas (19) simplify to
2 = 2 ,

2 = 2 ,

3 = 3 ,

3 = 3

4 = 4 +

Proof For j = 0, . . . , n we have


j = (j) (0)/ij

(17): r+1 =

322 ,

4 = 4

(20)
322 .

Since the central moments 2 , 3 , . . . and the cumulants 2 , 3 , . . .


are unaffected by adding a constant to X, these formulas are valid
even when E(X) 6= 0. Note that 2 is simply the variance of X.
Example 2. The following display exhibits the moment/cumulant
connection for some important distributions:
X N (, 2 )
1 = , 1 = ,

X Gamma(r)

X Poisson()

1 = r, 1 = r,

1 = , 1 = ,

2 = r

2 = , 2 = ,

3 = 0,

3 = 0,

4 = 0,

4 = 3 4, 4 = 6r, 4 = 6r + 3r2, 4 = , 4 = + 32.

2 = , 2 = ,

2 = r,

3 = 2r, 3 = 2r,

18 6

3 = , 3 = ,

More on the cumulant/moment connection. Equations (19)


express 1 , . . . , 4 recursively in terms of 1 , . . . , 4 . By carrying
out the recursions one finds that
2 = 2 + 21 ,

m2 times

mk times

:= m1 + m2 + + mk

3 = 3 + 32 1 + 31 ,
4 = 4 + 43 1 + 322 + 62 21 + 41 .
Note that the formula for 4 is a linear combination of products of j s
and that, including multiplicities, the subscripts j on the j s in those
products form the following lists:

These are all the possible lists of nonincreasing positive integers which
add to 4; theyre called the additive partitions of 4. In what follows
IPam going to show that for any n N, n is a sum of the form
partitions of n, c is a
c where ranges over the additive
Q
certain number depending on , and = j j .
We need some notation. For a positive integer n let
Pn be the collection of all additive partitions of n.

(21)

By definition an element of Pn has the form


= [ j1 , . . . , j1 , j2 , . . . , j2 , . . . , jk , . . . , jk ]
| {z } | {z }
| {z }
m2 times

(221 )

mk times

for some number k, the ji s and


Pk mi s being positive integers satisfying
j1 > j2 > > jk and n = i=1 mi ji . Shorthand for (221 ) is
m
[ j1m1, j2m2, . . . , jk k ];

(23)

elements, of which k := k (namely, j1 , j2 , . . . , jk ) are distinct. The


quantities
c :=

n!
(j1 !)m1 (j2 !)m2

(jk

!)mk

1
m1 ! m2 ! mk !

d := c (1) 1 ( 1)!

[4], [3, 1], [2, 2], [2, 1, 1], and [1, 1, 1, 1] .

m1 times

ements, of which only 3 (namely 4, 2, and 1) are distinct. In general


the partition (22) has a total of

1 = 1 ,

m1 times

(22): = [ j1m1, j2m2, . . . , jk k ] := [ j1 , . . . , j1 , j2 , . . . , j2 , . . . , jk , . . . , jk ].


| {z } | {z }
| {z }

(222 )

(25)

play an important role in what follows, as do the products


Y
Y
:=
j and :=
j .
j

(24)

(26)

Theorem 5. For a random variable having moments 1 , . . . , n and


cumulants 1 , . . . , n ,
X
X
n =
c and n =
d ;
(27)

the sums here are taken over the elements = [ j1m1, j2m2, . . . , jk k ]
of Pn and c , d , , and are defined by (24)(26) above.
Example 3. One of the additive partitions of 11 is = [41 , 22 , 13 ] =
[ j1m1 , j2m2 , j3m3 ] for j1 = 4, j2 = 2, j3 = 1, m1 = 1, m2 = 2, m3 = 3.
According to (27), the contribution this makes to 11 is c where
:= 4 22 31

and

c :=

11!
1
= 34650 .
4! (2!)2 (1!)3 1! 2! 3!

note that in (222 ) the notation ji i means replicate ji a total of


1 2 3
mi times, not raise ji to the mth
i power. For example [4 , 2 , 1 ]
denotes the partition [4, 2, 2, 1, 1, 1] of n = 11; this partition has 6 el-

Moreover since = 1 + 2 + 3 = 6, the contribution makes to 11


is d where

18 7

18 8

= 4 22 13

and d = c (1)5 5! = 4158000 .

19:39 01/11/2001

(23): :=

Pk

i=1

mi

(24): c := n!/

Qk
mi
mi !
i=1 (ji !)

As we will see, Theorem 5 is a special case of the formula given


below for the nth derivative of the composition of two functions. The
formula applies to real or complex valued functions of a real or complex variable. For example, one of the functions may be a characteristic function, which is (in general) a complex-valued function of a
real variable, and the other may be the complex logarithm function.
Recall that a function is said to have an nth derivative at a point
if the function is (n 1)-times differentiable in an open neighborhood
of the point and the (n 1)st derivative is differentiable at the point.
Theorem 6 (Fa`
a di Brunos formula). Let n be a positive integer
and let f and g be two functions such that: the composite function
h := g(f ) is defined in an open neighborhood of a point x; f has an
nth derivative at x; and g has an nth derivative at y := f (x). Then h
has an nth derivative at x given by the formula
hY
i
X
h(n) (x) =
c g ( ) (y)
f (j) (x)
(28)
Pn

where c and are defined by (24) and (23) respectively, f (j) (x) is
the j th derivative of f at x, and g () (y) is the th derivative of g at y.
Example 4. For the additive partition [21 , 11 ] of n = 3 one has
= 1 + 1 = 2 and c = 3!/[(21 1!)(11 1!)] = 3. According to Fa`a di
Brunos formula,
derivative of h = g(f ) should contain the

0 the third
00
00
term 3g f (x) f (x)f (x). This can be verified by direct calculation:

h00 = g 00 (f )(f 0 )2 + g 0 (f )f 00 ,

h000 = g 000 (f ) (f 0 )3 + 2g 00 (f )f 0 f 00 + g 00 (f )f 0 f 00 + g 0 (f )f 000


= g 000 (f )(f 0 )3 + 3g 00 (f )f 0 f 00 + g 0 (f )f 000 .
18 9

Pn c

(j)
g ( ) f (x)
(x)
j f

Example 5. Let X be a random variable with characteristic function


, cumulant generating function K = log(), moments 1 , . . . , n and
cumulants 1 .. . n . in n is h(n) (x) for x = 0 and h(t) = (t) =
eK(t) = g f (t) with f (t) = K(t) and g(z) = ez . (28) applies because
K is defined in a neighborhood of t. Since f (j) (x) = ij j and g () (z) =
ez equals 1 at z = y = f (x) = 0, Fa`
a di Brunos formula (28) implies
in n =

X
Pn

hY
j

i
X
ij j = in

Pn

c ;

dividing through by in gives the LHS of (27). Similarly,

the RHS of
(27) follows by applying (28) for x = 0 to K(t) = g (t) with g(z) =
log(z) and using g () (z) = (1)1 ( 1)!/z = (1)1 ( 1)! for
z = y = (0) = 1.

Proof of Theorem 6. The method used in Example 4 shows that


h has an nth derivative at x, so that by Taylors theorem
h() =

Xn
j=0

h(j) (x)
( x)j + o | x|n as x.
j!

I am going to show that


h() =

h0 = g 0 (f )f 0 ,

(28): h(n) (x) =

h = g(f )

Xn
j=0

Hj
( x)j + o | x|n as x
j!

for certain numbers H0 , H1 , . . . , Hn , with Hn being defined by the


RHS of (28). This implies h(j) (x) = Hj for j = 0, . . . , n, and in
particular that h(n) (x) = Hn , as (28) asserts.
18 10

(28): h(n) (x) =

h = g(f )

Pn c

(j)
g ( ) f (x)
(x)
j f

Since f has an nth derivative at x, Taylors theorem implies that


Xn fj

( x)j + o | x|n as x
(29)
f () =
j=0 j!
for fj = f (j) (x). Similarly, since g has an nth derivative at y = f (x),
Xn g

(30)
g() =
( y) + o | y|n as y
=0 !
for g = g () (y). Thus

Xn g

h() = g f () =
f () f (x) + o |f () f (x)|n
=0 !
Xn g Xn fi

=
( x)i + o | x|n
+ o | x|n
=0 !
i=1 i!

Xn g Xn fi

=
( x)i + o | x|n
=0 !
i=1 i!
X n Hj

=
( x)j + o | x|n
(31)
j=0 j!

=1

n
X
g h
=1

j=0

j=0

as t 0 through R, then aj = bj for j = 0, . . . , n. [Hint: use


induction on j.]

Exercise 2. (a) Suppose X and Y are independent random variables,


each having an nth moment. As in Theorem 3, put S = X + Y .
Express n (S) in terms of j (X) and j (Y ) for j = 0, . . . , n. (b)
Suppose b R and X has an nth moment. Express n (X + b) in
terms of j (X) for j = 0, . . . , n.

Exercise 3. Let (a, b) be an open subinterval of R and let f and g


be complex-valued functions defined on (a, b). Show that if f and g
are n-times differentiable, then so is h := f g and
Xn n
h(n) (t) =
f (j) (t) g (nj) (t)
(33)
j=0 j
for each t (a, b). [Hint: use induction on n].

as x, where
Xn
Hn = n!
= n!

Exercise 1. Show that if a0 , . . . , an and b0 , . . . , bn are complex


numbers such that
Xn
Xn
aj tj =
bj tj + o(tn )

fi1 fi2 fi i
g hX
i1 ,i2 ,...,i 1
!
i1 +i2 ++i =n i1 ! i2 ! i !
m

fjm1 1 fjk k

X
m

[ j1 1,...,jk k ]Pn
m1 ++mk =

(j1 !)m1 (jk !)mk

i
!
. (32)
m1 ! mk !

The last step uses the fact that the number of -tuples i1 , i2 , . . . , i
which contain m1 j1 s, m2 j2 s, . . . , and mk jk s (for
j1 , . . . , jk )
distinct

!
is given by the multinomial coefficient m1 m
= m1 !m
. This
k
k!
completes the proof of (28).
It is worth noting that (29) and (30) for arbitrary fj s and g s
imply (31) with Hn given by (32). This result doesnt require f and
g to be differentiable,
18 11

The next four exercises deal with the cumulants of a random


variable U uniformly distributed over the interval [1/2, 1/2].
Exercise 4. Let U be as above. Show that U has CGF
sinh(t/2)

K(t) := log E(etU ) = log


t/2
for all real t. (sinh(x) := (ex ex )/2; take sinh(0)/0 := 1.)

(34)

Exercise 5. Let K be as in (34). Show that


K 0 (t) =

1 1
1
+
,
2
t
exp(t) 1

for t 6= 0, while K 0 (0) = 0.

18 12

19:39 01/11/2001

Let B0 , B1 , . . . be the so-called Bernoulli numbers, i.e., the


coefficients in the power series expansion
X Bk tk
t
=
;
k=0 k!
exp(t) 1

(35)

Exercise 8. Confirm Fa`


a di Brunos formula by computing the fifth
derivative of h = g(f ) and checking the result against the RHS of ().

in particular
B0 = 1,
B6 =

1
,
42

Exercise 7. Let U be as above. Confirm the values of 2 , . . . , 12


in (38) by computing the first 12 central moments of U and using
the formulas for cumulants in terms of moments; use Maple or the
equivalent to do the arithmetic.

1
B1 = ,
2
1
B8 = ,
30

1
,
6
5
=
,
66

B2 =
B10

1
,
30
691
=
2730

B4 =
B12

(36)

while B3 = B5 = = B11 = 0. (See Abramowitz and Stegun


page 804, or do help (bernoulli) in Maple.) Formula (35) holds
for |t| < 2 .

Exercise 9. For integers 1 m n let pn,m be the number of


additive partitions of n for which the largest element is m, and let pn
be the total number of additive partitions of n. Show that

if m = n,
1,
pn,m = Xmin(m,nm)

pnm,j , if m < n.
j=1

and that
Exercise 6. Let U and the Bernoulli numbers be as above. By
integrating K 0 (s) from 0 to t, show that
K(t) =

X Bk tk
k=2 k k!

(37)

pn =

Xn
m=1

pn,m .

Use these relations to compute pn for n = 1, . . . , 10.

for |t| < 2, and hence that the k th cumulant of U is Bk /k, for k 2;
in particular all cumulants of odd order equal zero, while
2 =

1
12

1
8 =
,
240

1
,
120
1
=
,
132

4 =
10

Optional: prove this.


function which is defined
|z z0 | < r }, then f is
P
(n)
(z0 )(z z0 )n/n!
n=0 f

1
,
252
691
=
.
32760

6 =
12

(38)

Use the fact that if f is a complex valued


and differentiable in the disk D := { z :
infinitely differentiable in D and f (z) =
for all z D.
18 13

18 14

You might also like