
Solutions to Van Kampen

August 31, 2023


Chapter 1

Stochastic Variables

1.1 Dice - pg. 2


1.1.1 Problem
Let X be the number of points obtained by casting a die. Give its range and
probability distribution. Same question for casting two dice.

1.1.2 Solution
For one die, the range is {1, 2, 3, 4, 5, 6}. The probability distribution is 1/6 for
each. For two dice, the range is {2, 3, . . . , 12} and the probability distribution is
$$\frac{1}{36}\,\{1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1\}\,,$$
listed in order of the total from 2 to 12.
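As a quick numerical check (a minimal sketch, not part of the original solution), one can enumerate all 36 equally likely outcomes and recover these weights:

```python
from collections import Counter

# Enumerate the 36 equally likely outcomes of two fair dice and count each sum.
counts = Counter(d1 + d2 for d1 in range(1, 7) for d2 in range(1, 7))
for total in range(2, 13):
    print(total, counts[total], "/ 36")
# Prints the weights 1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1 over 36, as claimed.
```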

1.2 Random coin - pg. 3


1.2.1 Problem
Flip a coin N times. Prove that the probability that heads turns up exactly n
times is
$$p_n = \binom{N}{n}\,2^{-N} \qquad (n = 0, 1, 2, \ldots, N)$$
(binomial distribution). If heads gains one penny and tails loses one, find the
probability distribution for the total gain.

1.2.2 Solution
The probability that any one flip is heads is $2^{-1}$. Therefore, the probability
of any one particular string of heads and tails is $2^{-N}$. We have to multiply that


by the number of possible strings of heads/tails which contains exactly n heads.


Consider a prototypical string with n heads. There are N ! ways of arranging
all the flips. However, arranging the heads amongst themselves gives the same
string; there are n! such rearrangements. Similarly, there are (N − n)! ways of
rearranging the tails amongst themselves. Therefore, there are
 
$$2^{-N}\,\frac{N!}{n!\,(N-n)!} = 2^{-N}\binom{N}{n}$$

ways to get n heads.


For the probability distribution of the final random walk position, denote
the final gain as g, the number of heads as n, and the number of tails as m. We
have two equations

n+m=N
n−m = g.

Therefore, n = (N + g)/2 and the probability distribution is

$$P(g|N) = 2^{-N}\,\frac{N!}{\left(\frac{N+g}{2}\right)!\,\left(\frac{N-g}{2}\right)!}\,.$$
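As an illustrative check (a minimal sketch; N = 10 is an arbitrary choice, not from the text), the distribution of the gain g can be tabulated directly from this formula and verified to sum to one:

```python
from math import comb

N = 10  # illustrative number of flips (g then runs over -N, -N+2, ..., N)
# n = (N + g) / 2 heads gives gain g, so P(g|N) is a reindexed binomial weight.
dist = {g: comb(N, (N + g) // 2) * 2**-N for g in range(-N, N + 1, 2)}
assert abs(sum(dist.values()) - 1.0) < 1e-12
for g, p in dist.items():
    print(g, p)
```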

1.3 Distribution of speeds - pg. 3


1.3.1 Problem
Let X stand for the three components of the velocity of a molecule in a gas.
Give its range and probability distribution.

1.3.2 Solution

1.4 Electron in a Crystal - pg. 3


1.4.1 Problem
An electron moves freely through a crystal of volume Ω or may be trapped in
one of a number of point centers. What is the probability distribution of its
coordinate r?

1.4.2 Solution

1.5 Volumes - pg. 3


1.5.1 Problem
Two volumes, $V_1$ and $V_2$, communicate through a hole and contain N molecules
without interaction. Show that the probability of finding n molecules in $V_1$ is
$$P(n) = (1+\gamma)^{-N}\binom{N}{n}\gamma^n$$
where $\gamma = V_1/V_2$.

1.5.2 Solution
This is just like the coin problem. Each time we place a molecule, it goes into
$V_1$ with probability
$$p = \frac{V_1}{V_1+V_2} = 1-(1+\gamma)^{-1}$$
and it goes into $V_2$ with probability
$$q = 1-p = \frac{V_2}{V_1+V_2} = (1+\gamma)^{-1}\,.$$

Imagine a string of labels {1, 2, 2, 1, 2, 2, 2, 1, 1, . . .} where the $i$th label tells you
which volume the $i$th particle is in. The probability of any one particular
arrangement with n particles in $V_1$ and $N-n$ particles in $V_2$ is $p^n q^{N-n}$. There
are N ! ways to rearrange all the labels, n! ways to rearrange the “1” labels, and
(N − n)! ways to rearrange the “2” labels. Therefore, the probability to find n
“1” labels is
$$P(n) = \frac{N!}{n!\,(N-n)!}\,p^n q^{N-n}
       = \frac{N!}{n!\,(N-n)!}\left(1-(1+\gamma)^{-1}\right)^n(1+\gamma)^{-(N-n)}
       = \binom{N}{n}\gamma^n(1+\gamma)^{-N}\,,$$
where the last step is algebra.
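A quick Monte Carlo sanity check of this result (a minimal sketch; the values of N, V1, and V2 below are arbitrary illustrations, not from the text):

```python
import numpy as np
from math import comb

rng = np.random.default_rng(0)
N, V1, V2 = 50, 1.0, 3.0              # illustrative values
gamma = V1 / V2
p = V1 / (V1 + V2)                    # probability that a given molecule is in V1

# Place N molecules independently many times and count how many land in V1.
samples = rng.binomial(N, p, size=200_000)
for n in range(6):
    exact = comb(N, n) * gamma**n * (1 + gamma) ** (-N)
    print(n, round(exact, 5), round(float(np.mean(samples == n)), 5))
```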

1.6 Urn of white and black - pg. 3


1.6.1 Problem
An urn contains a mixture of Nw white balls and Nb black ones. I extract at
random M balls, without putting them back. Show that the probability for

having n white balls among them is
$$P(n) = \binom{N_w}{n}\binom{N_b}{M-n}\bigg/\binom{N_w+N_b}{M}\,. \qquad (\text{``hypergeometric distribution''})$$
This reduces to the volumes equation in the limit $N_w, N_b \to \infty$ with $N_w/N_b = \gamma$.

1.6.2 Solution
We solve the problem by first assuming that the balls are all distinguishable,
and that we draw them from the urn in a particular order. We will then apply
one correction for the fact that the black (white) balls are indistinguishable from
each other and another to account for the ordering.
Suppose that the balls, in addition to having color, have some other trait that
makes them each completely distinguishable, e.g. they could each be numbered.
Imagine a particular case where we extract M balls one by one, getting first n
white balls and then k black balls:
w1 , w2 , . . . , wn , b1 , b2 , . . . , bk
What is the probability of this precise string? The probability of drawing ball
w1 on the first draw is Nw /(Nw + Nb ). Then, with ball w1 already removed, the
probability of drawing ball w2 on the second draw is (Nw − 1)/(Nw + Nb − 1).
Continuing this reasoning, we find the probability P of the string to be
    
$$P = \underbrace{\frac{N_w}{N_w+N_b}\cdot\frac{N_w-1}{N_w+N_b-1}\cdots\frac{N_w-(n-1)}{N_w+N_b-n+1}}_{\text{probability of white balls}}
\;\underbrace{\frac{N_b}{N_w+N_b-n}\cdot\frac{N_b-1}{N_w+N_b-n-1}\cdots\frac{N_b-(k-1)}{N_w+N_b-n-k+1}}_{\text{probability of black balls}}$$
$$= \frac{N_w!}{(N_w-n)!}\,\frac{N_b!}{(N_b-k)!}\,\frac{(N_w+N_b-n-k)!}{(N_w+N_b)!}\,.$$
Now of course, we’re really trying to find the probability of getting n indistin-
guishable white balls and k indistinguishable black ones. Therefore, shuffling
the n white balls or k black balls among themselves does not result in a uniquely
distinguishable arrangement of the draws. To correct for the overcounting, we
should divide by n!k!. Furthermore, we don’t care what order we draw the balls
in, so we should multiply by (n + k)!. The result is
$$\frac{N_w!}{(N_w-n)!\,n!}\;\frac{N_b!}{(N_b-k)!\,k!}\;\frac{(n+k)!\,(N_w+N_b-(n+k))!}{(N_w+N_b)!}\,.$$
Using the binomial notation we can write
$$\binom{N_w}{n}\binom{N_b}{k}\bigg/\binom{N_w+N_b}{n+k}\,,$$
which is equivalent to what we are trying to show because M = n + k.
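As a sanity check (a minimal sketch; the urn sizes below are arbitrary illustrative values), the formula can be verified against a direct draw-without-replacement simulation:

```python
import numpy as np
from math import comb

Nw, Nb, M = 6, 8, 5                     # illustrative urn: 6 white, 8 black, draw 5
P = lambda n: comb(Nw, n) * comb(Nb, M - n) / comb(Nw + Nb, M)
assert abs(sum(P(n) for n in range(min(Nw, M) + 1)) - 1.0) < 1e-12   # normalized

rng = np.random.default_rng(1)
urn = np.array([1] * Nw + [0] * Nb)     # 1 = white, 0 = black
draws = np.array([rng.choice(urn, size=M, replace=False).sum()
                  for _ in range(50_000)])
for n in range(min(Nw, M) + 1):
    print(n, round(P(n), 4), round(float(np.mean(draws == n)), 4))
```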

1.7 Cumulative pdf - pg. 4


1.7.1 Problem
Show that $\mathbb{P}(x)$ must be a monotone non-decreasing function with $\mathbb{P}(-\infty) = 0$
and $\mathbb{P}(+\infty) = 1$. What is its relation to P?

1.7.2 Solution
That $\mathbb{P}$ is monotone non-decreasing is obvious from the fact that the density P is positive,
because
$$\mathbb{P}(x) = \int_{-\infty}^{x} P(x')\,dx'\,.$$
It's also obvious, for the same reason (and from the normalization condition),
that $\mathbb{P}(-\infty) = 0$ and $\mathbb{P}(+\infty) = 1$.
The relation between $\mathbb{P}$ and P is
$$\mathbb{P}(x) = P\big((-\infty, x)\big)\,.$$

1.8 Opinion poll - pg. 4


1.8.1 Problem
An opinion poll is conducted in a country with many political parties. How
large a sample is needed to be reasonably sure that a party of 5 percent will
show up in it with a percentage between 4.5 and 5.5?

1.8.2 Solution
This comes down to integrating the probability density of a random walk process
with a large number of trials and a fixed 5% probability of success on each trial.
Consider a string of N polls. Each one is either “success”, meaning the
person polled is from the party, or “fail”, meaning that the person is not from the
party. The probability of success is p and the probability of failure is q = 1 − p.
The probability of getting n successes is

$$P(n) = \frac{N!}{n!\,(N-n)!}\,p^n q^{N-n}\,.$$

It's a reasonably well known result that if N is big enough, P(n) is well approximated as$^{1}$
$$P(n) \approx \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left[-\frac{(n-Np)^2}{2\sigma^2}\right]$$
$^{1}$ Use Stirling's approximation to prove this.

where $\sigma^2 = Npq$. The probability that the fraction of successes is between 4.5%
and 5.5% is

$$K \equiv \text{Probability}(0.045\,N < n < 0.055\,N) = \sum_{n=0.045\,N}^{0.055\,N} P(n)\,.$$

I don’t think we can do this analytically (although there may be some tricky
way to approximate the result). Instead, we can evaluate the sum numerically
for a few values of N and see where we get, say, 90% probability. Doing this,
we find that we get 90% probability at around N = 5, 200. A python script
opinion poll.py for the numerical analysis is included with the source repo.
Note that it’s helpful to approximate the distribution as a Gaussian to make
the numerics simple. Otherwise, we’d have to compute really large factorials.
It would be very nice to have a real analytic treatment of this problem so
that we know the functional dependence on N . From the numerics you can see
that the dependence of K on N seems to follow the form 1 − exp(−N/N0 ) for
some N0 .
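For reference, a minimal sketch of the kind of numerical scan described above (not the script mentioned in the text); it uses the Gaussian approximation, under which the probability of landing in a symmetric window around the mean reduces to an error function:

```python
from math import erf, sqrt

p = 0.05                                  # the party's true share

def K(N):
    # Gaussian approximation: P(|n - Np| < 0.005 N) = erf(0.005 N / (sigma sqrt(2)))
    sigma = sqrt(N * p * (1 - p))
    return erf(0.005 * N / (sigma * sqrt(2)))

for N in (1000, 2000, 5000, 5200, 10000):
    print(N, round(K(N), 3))
# K passes 0.90 near N ~ 5,200, consistent with the numerics quoted in the text.
```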

1.9 Moments of square distribution - pg. 5


1.9.1 Problem
Find the moments of the “square distribution” defined by

$$P(x) = 0 \;\text{ for } |x| > a, \qquad P(x) = (2a)^{-1} \;\text{ for } |x| < a\,.$$

1.9.2 Solution
The moments are
$$\mu_n \equiv \frac{1}{2a}\int_{-a}^{a} x^n\,dx
       = \frac{1}{2a}\left[\int_{-a}^{0} x^n\,dx + \int_{0}^{a} x^n\,dx\right]
       = \frac{1}{2a}\int_{0}^{a}\left[x^n + (-x)^n\right]dx
       = \begin{cases} 0 & n \text{ odd} \\ a^n/(n+1) & n \text{ even.}\end{cases}$$
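A quick Monte Carlo check of these moments (a minimal sketch; a = 2 is an arbitrary illustrative value):

```python
import numpy as np

rng = np.random.default_rng(0)
a = 2.0
x = rng.uniform(-a, a, size=1_000_000)   # samples from the square distribution
for n in range(7):
    exact = 0.0 if n % 2 else a**n / (n + 1)
    print(n, round(float(np.mean(x**n)), 3), exact)
```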

1.10 Moments of Gaussian distribution - pg. 5


1.10.1 Problem
The Gauss distribution is defined by

$$P(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{x^2}{2\sigma^2}\right).$$
Show that $\mu_{2n+1} = 0$ and
$$\mu_{2n} = \sigma^{2n}(2n-1)!! = \sigma^{2n}(2n-1)(2n-3)(2n-5)\cdots 1\,.$$

1.10.2 Solution
The odd moments are obviously zero by symmetry. For the even moments, first
note that µ0 = 1 is just a statement that the distribution is normalized. Now
consider the general even moment $\mu_{2n}$. Integrating by parts,
$$\mu_{2n} = \frac{1}{\sqrt{2\pi\sigma^2}}\int_{-\infty}^{\infty} x^{2n}\exp\left(-\frac{x^2}{2\sigma^2}\right)dx
= \frac{1}{\sqrt{2\pi\sigma^2}}\Bigg[\underbrace{\frac{x^{2n+1}}{2n+1}\exp\left(-\frac{x^2}{2\sigma^2}\right)\Bigg|_{-\infty}^{\infty}}_{0}
+ \frac{1}{\sigma^2(2n+1)}\int_{-\infty}^{\infty} x^{2n+2}\exp\left(-\frac{x^2}{2\sigma^2}\right)dx\Bigg]$$
$$= \frac{1}{\sigma^2(2n+1)}\,\mu_{2n+2}\,,$$
so that
$$\mu_{2n+2} = \sigma^2(2n+1)\,\mu_{2n}\,. \qquad (\star)$$

Now use induction. The relation holds for $n = 0$ since $\mu_0 = 1$. Assuming the target relation
$\mu_{2n} = \sigma^{2n}(2n-1)(2n-3)\cdots 1$ holds for some n, ($\star$) gives
$$\mu_{2n+2} = \sigma^2(2n+1)\,\sigma^{2n}(2n-1)(2n-3)\cdots 1 = \sigma^{2(n+1)}\,(2n+1)!!\,,$$
which is the target relation with n replaced by $n+1$. This is what we wanted to prove.
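A numerical spot check of the even-moment formula (a minimal sketch; σ = 1.5 is an arbitrary illustrative value):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 1.5
x = rng.normal(0.0, sigma, size=2_000_000)

def double_factorial(m):
    # (2n-1)!! = (2n-1)(2n-3)...1, with (-1)!! = 1 by convention
    return 1 if m <= 0 else m * double_factorial(m - 2)

for n in range(1, 5):
    sample = float(np.mean(x ** (2 * n)))
    exact = sigma ** (2 * n) * double_factorial(2 * n - 1)
    print(2 * n, round(sample, 2), round(exact, 2))
```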

1.11 Square distribution moments from generating function - pg. 7

1.11.1 Problem
Compute the characteristic function of the square distribution and find its moments that way.

1.11.2 Solution
The generating function is
$$G(k) = \frac{1}{2a}\int_{-a}^{a} e^{ikx}\,dx
       = \frac{1}{2a}\,\frac{e^{ika}-e^{-ika}}{ik}
       = \frac{\sin(ak)}{ak}
       = \sum_{n=0}^{\infty}(-1)^n\,\frac{1}{ak}\,\frac{(ak)^{2n+1}}{(2n+1)!}$$
$$\phantom{G(k)} = \sum_{n=0}^{\infty}(-1)^n\,\frac{a^{2n}(ik)^{2n}}{i^{2n}(2n+1)!}
       = \sum_{n=0}^{\infty}\frac{a^{2n}(ik)^{2n}}{(2n+1)!}
       = \sum_{n\ \text{even}}\frac{a^{n}}{n+1}\,\frac{(ik)^{n}}{n!}\,.$$
Therefore, the moments are $\mu_n = a^n/(n+1)$ for even n, and zero for odd n.

1.12 Gauss distribution cumulants - pg. 7


1.12.1 Problem
Show that for the Gauss distribution all cumulants beyond the second are zero.
Find the most general distribution with this property.

1.12.2 Solution
The generating function for the Gaussian is
$$G(k) = \frac{1}{\sqrt{2\pi\sigma^2}}\int_{-\infty}^{\infty}\exp\left(-\frac{x^2}{2\sigma^2}\right)e^{ikx}\,dx
       = \exp\left(-\frac{\sigma^2 k^2}{2}\right),
\qquad \ln G(k) = -\frac{\sigma^2 k^2}{2}\,.$$
Therefore, all of the cumulants are zero except for the second, which is $\sigma^2$.
If we add in a first cumulant, we’d wind up with a Gaussian shifted away
from zero. So, I guess the most general distribution with all cumulants beyond
the second is just a shifted Gaussian.

1.13 Poisson cumulants - pg. 7


1.13.1 Problem
The Poisson distribution is defined on the discrete range n = 0, 1, 2, . . . by
$$p_n = \frac{a^n}{n!}\,e^{-a}\,.$$
Find its cumulants.

1.13.2 Solution
The Poisson distribution can be written as a continuous distribution using delta
functions:

$$p(x) = \sum_{n=0}^{\infty}\frac{a^n}{n!}\,e^{-a}\,\delta(x-n)\,.$$
The moment generating function is therefore
$$G(k) = \sum_{n=0}^{\infty}\frac{a^n}{n!}\,e^{-a}\int \delta(x-n)\,e^{ikx}\,dx
       = \sum_{n=0}^{\infty}\frac{a^n}{n!}\,e^{-a}\,e^{ikn}
       = \sum_{n=0}^{\infty}\frac{(ae^{ik})^n}{n!}\,e^{-a}
       = \exp\left[ae^{ik} - a\right]
       = \exp\left[a(e^{ik}-1)\right]$$
$$\ln G(k) = a(e^{ik}-1) = a\sum_{n=1}^{\infty}\frac{(ik)^n}{n!}\,.$$

Therefore, by definition, all of the cumulants of the Poisson distribution are equal to a,
which is also the mean of the Poisson distribution. In other words,
$$\kappa_1 = \kappa_2 = \kappa_3 = \cdots = \mu_1\,.$$
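A quick numerical check of the first three cumulants (a minimal sketch; a = 3.7 is an arbitrary illustrative value, using the standard relations κ1 = mean, κ2 = variance, κ3 = third central moment):

```python
import numpy as np

rng = np.random.default_rng(0)
a = 3.7
x = rng.poisson(a, size=2_000_000).astype(float)

kappa1 = x.mean()                       # first cumulant = mean
kappa2 = np.mean((x - kappa1) ** 2)     # second cumulant = variance
kappa3 = np.mean((x - kappa1) ** 3)     # third cumulant = third central moment
print(round(float(kappa1), 3), round(float(kappa2), 3), round(float(kappa3), 3))
# All three come out close to a = 3.7, as derived above.
```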

1.14 Gas particles in the Poisson limit - pg. 7


1.14.1 Problem
Express the gas particle result in the limit V2 → ∞, N → ∞, N/V2 = ρ =
constant. The result is the Poisson distribution
$$p_n = \frac{a^n}{n!}\,e^{-a}$$

with a = ρV1 .

1.14.2 Solution
We just barge ahead with Stirling’s approximation and ln(1 + x) ≈ x,
$$p_n = \frac{N!}{n!\,(N-n)!}\,(1+\gamma)^{-N}\gamma^n$$
$$\ln p_n = -N\ln(1+\gamma) + n\ln\gamma + \ln N! - \ln(N-n)! - \ln n!$$
$$\phantom{\ln p_n} = -N\gamma + n\ln\gamma + N\ln N - N - (N-n)\ln(N-n) + (N-n) - \ln n!$$
$$\phantom{\ln p_n} = -N\gamma + n\ln\gamma + N\ln N - N - N\ln(N-n) + n\ln(N-n) + (N-n) - \ln n!$$
$$(N \gg n)\qquad\phantom{\ln p_n} = -N\gamma + n\ln(N\gamma) - \ln n!$$
$$p_n = \frac{(\rho V_1)^n}{n!}\,e^{-\rho V_1}\,,$$
which is what we wanted to show.
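A numerical illustration of this limit (a minimal sketch; ρ and V1 below are arbitrary illustrative values, and N is increased at fixed ρ = N/V2):

```python
from math import comb, exp, factorial

rho, V1 = 2.0, 1.5                 # illustrative density and sub-volume
a = rho * V1
for N in (50, 500, 5000):
    V2 = N / rho                   # grow V2 with N so that N / V2 = rho stays fixed
    gamma = V1 / V2
    binom = [comb(N, n) * (1 + gamma) ** (-N) * gamma**n for n in range(8)]
    poisson = [a**n * exp(-a) / factorial(n) for n in range(8)]
    print(N, max(abs(b - q) for b, q in zip(binom, poisson)))
# The maximum difference over the first few n shrinks as N grows.
```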

1.15 Characteristic function of the Lorentz distribution - pg. 7

1.15.1 Problem
Calculate the characteristic function of the Lorentz distribution. How
does one see from it that the moments do not exist?

1.15.2 Solution
The Lorentz distribution is
$$P(x) = \frac{1}{\pi}\,\frac{\gamma}{(x-a)^2+\gamma^2}\,.$$
The characteristic function can be computed directly:

$$G(k) = \frac{\gamma}{\pi}\int\frac{e^{ikx}}{(x-a)^2+\gamma^2}\,dx
       = e^{ika}\,\frac{\gamma}{\pi}\int\frac{e^{iky}}{y^2+\gamma^2}\,dy
       = e^{ika}\,\frac{\gamma}{\pi}\int\frac{e^{iky}}{(y+i\gamma)(y-i\gamma)}\,dy\,.$$
Now we use contour integration. For k > 0, y must have a positive imaginary
part in order for the integrand to converge for large |y|. Therefore, we close the
contour in the upper half plane and pick up the residue at $y = i\gamma$:
$$G(k>0) = e^{ika}\,e^{-k\gamma}\,.$$



For k < 0 we close the contour in the lower half plane and get

$$G(k<0) = e^{ika}\,e^{k\gamma}\,.$$

Therefore, the characteristic function is

$$G(k) = e^{ika}\,e^{-|k|\gamma}\,.$$

This function is not differentiable at k = 0, so it has no Taylor expansion about k = 0. This
shows that the moments do not exist (the defining integrals diverge).
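An empirical check of this characteristic function (a minimal sketch; a and γ below are arbitrary illustrative values). The sample mean of e^{ikx} converges because the integrand is bounded, even though the moments themselves diverge:

```python
import numpy as np

rng = np.random.default_rng(0)
a, gamma = 1.0, 0.5
x = a + gamma * rng.standard_cauchy(size=500_000)   # Lorentz (Cauchy) samples

for k in (-2.0, -0.5, 0.5, 2.0):
    empirical = np.mean(np.exp(1j * k * x))
    exact = np.exp(1j * k * a - abs(k) * gamma)
    print(k, np.round(empirical, 3), np.round(exact, 3))
# The kink of exp(-|k| gamma) at k = 0 is what rules out a Taylor series there.
```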

1.16 Distribution with cos generating function - pg. 7

1.16.1 Problem
Find the distribution and its moments corresponding to the characteristic function $G(k) = \cos ak$.

1.16.2 Solution
The distribution is found from the inverse Fourier transform
$$P(x) = \int G(k)\,e^{-ikx}\,\frac{dk}{2\pi}
       = \frac{1}{2}\int\left(e^{iak}+e^{-iak}\right)e^{-ikx}\,\frac{dk}{2\pi}
       = \frac{1}{2}\Big(\delta(x-a)+\delta(x+a)\Big)\,.$$

This is a discrete distribution with probabilities of 1/2 to get value ±a.


The moments are found by writing G(k) as a Taylor series,

$$G(k) = \sum_{n=0}^{\infty}(-1)^n\frac{(ak)^{2n}}{(2n)!}
       = \sum_{n=0}^{\infty}a^{2n}\,\frac{(ik)^{2n}}{(2n)!}\,.$$
This shows that all the odd moments are zero, and the even moments are $\mu_{2n} = a^{2n}$.

1.17 Factorial moments - pg. 9


1.17.1 Problem
When X only takes the values 0, 1, 2, . . . one defines the factorial moments $\phi_m$
by $\phi_0 = 1$ and
$$\phi_m = \langle X(X-1)(X-2)\cdots(X-m+1)\rangle \qquad (m \ge 1)\,.$$
Show that they are also generated by F, viz.,
$$F(1-x) = \sum_{m=0}^{\infty}\frac{(-x)^m}{m!}\,\phi_m\,.$$

1.17.2 Solution
Let's recall the meaning of a generating function. The moment generating function G,
defined by the equation $G(k) = \langle e^{ikx}\rangle$, has the property that
$$(D^n G)(k=0) = i^n\langle x^n\rangle \equiv i^n\mu_n\,.$$

As the nth derivative at k = 0 is $i^n\mu_n$, we can Taylor expand G about k = 0 as
$$G(k) = \sum_{n=0}^{\infty}\frac{k^n}{n!}\,i^n\mu_n\,.$$

Define a function F as
$$F(k) = \langle k^x\rangle\,.$$
Differentiating F and evaluating at k = 1 gives us the factorial moments:
$$(D^m F)(k) = \left\langle x(x-1)(x-2)\cdots(x-m+1)\,k^{x-m}\right\rangle$$
$$(D^m F)(1) = \left\langle x(x-1)(x-2)\cdots(x-m+1)\right\rangle = \phi_m\,.$$

As the nth derivative of F at k = 1 is $\phi_n$, we can Taylor expand F as
$$F(k) = \sum_{n=0}^{\infty}\frac{(k-1)^n}{n!}\,\phi_n$$
which can also be written as
$$F(1-k) = \sum_{n=0}^{\infty}\frac{(-k)^n}{n!}\,\phi_n\,.$$

I have no idea why the book writes the latter form.



1.18 Factorial cumulants of Poisson distribution - pg. 9

1.18.1 Problem
The factorial cumulants $\theta_m$ are defined by
$$\log F(1-x) = \sum_{m=1}^{\infty}\frac{(-x)^m}{m!}\,\theta_m\,.$$

Express the first few in terms of the moments. Show that the Poisson distribu-
tion is characterized by the vanishing of all factorial cumulants beyond θ1 .

1.18.2 Solution
I’m skipping writing the cumulants in terms of the moments.
As shown in the previous problem, the factorial moment generating function
F (k) is
$$F(k) = \langle k^x\rangle$$
which, with the Poisson distribution, becomes
$$F(k) = \sum_{n=0}^{\infty}\frac{a^n}{n!}\,e^{-a}\,k^n = \exp\left[a(k-1)\right]$$
$$(\ln F)(k) = a(k-1)\,, \qquad (\ln F)(1-k) = -ak\,.$$

Therefore, θ1 = a and all other factorial cumulants are zero for the Poisson
distribution.
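A quick sample-based check (a minimal sketch; a = 2.5 is an arbitrary illustrative value): since F(k) = exp[a(k−1)], the factorial moments of the Poisson distribution are φ_m = a^m, which is equivalent to θ1 = a with all higher factorial cumulants vanishing.

```python
import numpy as np

rng = np.random.default_rng(0)
a = 2.5
x = rng.poisson(a, size=2_000_000).astype(float)

for m in range(1, 5):
    # falling factorial x (x-1) ... (x-m+1), averaged over the samples
    prod = np.ones_like(x)
    for j in range(m):
        prod *= x - j
    print(m, round(float(np.mean(prod)), 3), a**m)
```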

1.19 Factorial moments and cumulants of gas particles - pg. 9

1.19.1 Problem
Find the factorial moments and cumulants of

$$p_n = \frac{N!}{n!\,(N-n)!}\,(1+\gamma)^{-N}\gamma^n\,.$$

1.19.2 Solution
The factorial moment generating function is
$$F(k) = \langle k^x\rangle = \sum_{n=0}^{N}\frac{N!}{n!\,(N-n)!}\,(1+\gamma)^{-N}\gamma^n k^n = (1+\gamma)^{-N}(1+\gamma k)^N\,.$$

The factorial moments come by differentiating F and evaluating at k = 1,
$$(D^m F)(k) = (1+\gamma)^{-N}\,\frac{N!}{(N-m)!}\,\gamma^m(1+\gamma k)^{N-m}$$
$$(D^m F)(k=1) = \frac{N!}{(N-m)!}\left(1+\frac{1}{\gamma}\right)^{-m}$$
$$(\gamma \ll 1)\qquad \approx \frac{N!}{(N-m)!}\,\gamma^m\,.$$

For the factorial cumulants, we take the log:
$$(\ln F)(k) = N\left[\ln(1+\gamma k) - \ln(1+\gamma)\right]$$
$$(D^m \ln F)(k) = (-1)^{m+1}\,N\,(m-1)!\,\gamma^m(1+\gamma k)^{-m}$$
$$(D^m \ln F)(k=1) = (-1)^{m-1}\,N\,(m-1)!\left(1+\frac{1}{\gamma}\right)^{-m}\,.$$
Therefore, the factorial cumulants are
$$\theta_m = (-1)^{m-1}\,N\,(m-1)!\left(1+\frac{1}{\gamma}\right)^{-m} \qquad (m \ge 1)\,.$$

I’m not sure what the point of this problem is.

1.20 Normalization of conditional probability - pg. 11

1.20.1 Problem
Prove and interpret the normalization of the conditional probability
$$\int P_{A|B}(a|b)\,da = 1\,.$$

1.20.2 Solution
Use Bayes’s rule:
$$\int P_{A|B}(a|b)\,da = \frac{1}{P_B(b)}\int P_{A,B}(a,b)\,da = \frac{P_B(b)}{P_B(b)} = 1\,.$$

The normalization of the conditional probability $P_{A|B}$ can be interpreted as
the following English sentence: for any fixed value of the variables B, the sum of
the probabilities over all values of the variables A is one. In other words, after
fixing B, the remaining variables A still carry a properly normalized probability
distribution.

1.21 Joint density for independent variables - pg. 11

1.21.1 Problem
What is the form of the joint probability density if all variables are mutually
independent?

1.21.2 Solution
Use Bayes's rule again:
$$P_{A,B,C}(a,b,c) = P_A(a)\,P_{B,C|A}(b,c|a) = P_A(a)\,P_{B,C}(b,c) = P_A(a)\,P_B(b)\,P_{C|B}(c|b) = P_A(a)\,P_B(b)\,P_C(c)\,,$$
where the second and fourth equalities use the assumed independence.

Therefore, when all variables are mutually independent, the joint probability
factorizes into a product of the marginals.

1.22 Ring distribution - pg. 11


1.22.1 Problem
Compute the marginal and conditional probabilities for the ring distribution

$$P_{X_1,X_2}(x_1,x_2) = \frac{1}{\pi a}\,\delta(x_1^2 + x_2^2 - a^2)\,.$$

1.22.2 Solution

1.23 Dice which sum to 9 - pg. 11


1.23.1 Problem
Two dice are thrown and the outcome is 9. What is the probability distribution
of the first die conditional on the given total? Why is this result not incompatible
with the obvious fact that the two dice are independent?

1.23.2 Solution
The ways to get a total of 9 are:
(3, 6), (4, 5), (5, 4), (6, 3) .
Therefore, the probability distribution P (n) for the first die to show n points is
P (1) = P (2) = 0, P (3) = P (4) = P (5) = P (6) = 1/4 .
This is not incompatible with the fact that the dice are independent: conditioning on
the total being 9 imposes a constraint that is absent when describing the a priori
probabilities, and it is only under that constraint that the outcomes become correlated.
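A brute-force check of the conditional distribution (a minimal sketch, enumerating the 36 equally likely outcomes):

```python
from collections import Counter

# Keep only the outcomes of two dice whose total is 9, then look at the first die.
hits = [(d1, d2) for d1 in range(1, 7) for d2 in range(1, 7) if d1 + d2 == 9]
counts = Counter(d1 for d1, _ in hits)
print({n: counts.get(n, 0) / len(hits) for n in range(1, 7)})
# -> 3, 4, 5, 6 each with probability 1/4; 1 and 2 with probability 0.
```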

1.24 Conditional life time probabilities - pg. 11


1.24.1 Problem
The probability distribution of lifetimes in a population is P (t). Show that the
conditional probability for individuals of age τ is
$$P_{\text{die at age}\,|\,\text{alive at age}}(t|\tau) = P(t)\Big/\int_{\tau}^{\infty}P(t')\,dt' \qquad (t > \tau)\,.$$
Note that in the case $P(t) = \gamma e^{-\gamma t}$ one has $P_{\text{die at age}\,|\,\text{alive at age}}(t|\tau) = P(t-\tau)$:
the survival chance is independent of age. Show that this is the only P for which
that is true.

1.24.2 Solution
Use Bayes's rule,
$$P_{\text{die at age}\,|\,\text{alive at age}}(t|\tau) = \frac{P_{\text{die at age \& alive at age}}(t,\tau)}{P_{\text{alive at age}}(\tau)}\,.$$
If your age is τ, that means you die at some time t' such that t' > τ. Therefore,
the probability that a given person is still alive at age τ is
$$P_{\text{alive at age}}(\tau) = \int_{\tau}^{\infty}P(t')\,dt'\,.$$

Also, when t > τ , the joint probability that a person is alive at τ and dies at t is
the same thing as the marginal probability that the person dies at t. Therefore
we have
$$P_{\text{die at age}\,|\,\text{alive at age}}(t|\tau) = \frac{P(t)}{\int_{\tau}^{\infty}P(t')\,dt'}$$
as we wanted to show.
Now we suppose $P(t) = \gamma e^{-\gamma t}$. Then the integral is
$$\int_{\tau}^{\infty}P(t')\,dt' = \gamma\left[\frac{e^{-\gamma t'}}{-\gamma}\right]_{\tau}^{\infty} = e^{-\gamma\tau}\,.$$

Therefore, the conditional probability is

$$P_{\text{die at age}\,|\,\text{alive at age}}(t|\tau) = \frac{\gamma e^{-\gamma t}}{e^{-\gamma\tau}} = \gamma e^{-\gamma(t-\tau)} = P_{\text{die at age}}(t-\tau)\,,$$
which is what we wanted to show.
To show that there is only one possible P (t) with the property we just
demonstrated, we use calculus:

$$P(t-\tau) = \frac{P(t)}{\int_{\tau}^{\infty}P(t')\,dt'}$$
$$\int_{\tau}^{\infty}P(t')\,dt' = \frac{P(t)}{P(t-\tau)}$$
$$(\text{differentiate w.r.t. } \tau)\qquad -P(\tau) = -\frac{P(t)}{P(t-\tau)^2}\left(-P'(t-\tau)\right)$$
$$P'(t-\tau) = -\frac{P(\tau)\,P(t-\tau)^2}{P(t)}$$
$$(\text{set } \tau = 0)\qquad P'(t) = -P(0)\,P(t)$$
$$P(t) = P(0)\,e^{-P(0)\,t}\,.$$

Denoting P (0) as γ gives the result.
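A quick numerical illustration of the memoryless property derived above (a minimal sketch; γ and τ are arbitrary illustrative values):

```python
import numpy as np

gamma, tau = 0.8, 2.0
t = np.linspace(tau, tau + 10.0, 5)             # a few ages t > tau
P = lambda t: gamma * np.exp(-gamma * t)        # exponential lifetime density
survival = np.exp(-gamma * tau)                 # integral of P from tau to infinity
print(P(t) / survival)                          # conditional density given alive at tau
print(P(t - tau))                               # shifted density -- matches to rounding
```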

1.25 Moments and characteristic function of subset of variables - pg. 12

1.25.1 Problem
Consider the marginal distribution of a subset of all variables. Express the
moments in terms of the moments of the total distribution, and its characteristic
function in terms of the total one.

1.25.2 Solution
Call the subset of variables A and the other variables B. Suppose there are m
variables in A and n variables in B.
Denote the marginal distribution as PA , i.e.
$$P_A(a) \equiv \int_B P_{A,B}(a,b)\,db\,.$$

The moments of A are
$$\langle A_1^{p_1}\cdots A_m^{p_m}\rangle = \int_A P_A(a)\,a_1^{p_1}\cdots a_m^{p_m}\,da
= \int_{A,B} P_{A,B}(a,b)\,a_1^{p_1}\cdots a_m^{p_m}\,da\,db
= \langle A_1^{p_1}\cdots A_m^{p_m}\,B_1^0\cdots B_n^0\rangle\,.$$

The characteristic function of $P_A$ is
$$G_A(k_1,\ldots,k_m) \equiv \left\langle\exp\Big(i\sum_{j=1}^{m}k_j a_j\Big)\right\rangle
= \int_A \exp\Big(i\sum_{j=1}^{m}k_j a_j\Big)\,P_A(a)\,da
= \int_{A,B}\exp\Big(i\sum_{j=1}^{m}k_j a_j\Big)\,P_{A,B}(a,b)\,da\,db$$
$$= \int_{A,B}\exp\Big(i\sum_{j=1}^{m}k_j a_j\Big)\exp\Big(i\sum_{j=1}^{n}0\cdot b_j\Big)\,P_{A,B}(a,b)\,da\,db
= G_{A,B}(k_1,\ldots,k_m,0,\ldots,0)\,.$$

1.26 Equivalence of independent variable criteria - pg. 12

1.26.1 Problem
Prove the three criteria for independence mentioned above and generalize them
to N variables.
The criteria given in the book are

• All moments factorize: $\langle X_1^{m_1}X_2^{m_2}\rangle = \langle X_1^{m_1}\rangle\langle X_2^{m_2}\rangle$.



• The characteristic function factorizes: $G(k_1, k_2) = G_1(k_1)\,G_2(k_2)$.

• The cumulants $\langle\langle X_1^{m_1}X_2^{m_2}\rangle\rangle$ vanish when both $m_1$ and $m_2$ are nonzero.

1.26.2 Solution
First, we show that if the variables are independent, the moments factorize. We
just crank through the math:
$$\langle X_1^{m_1}X_2^{m_2}\rangle = \int_{X_1,X_2}P_{X_1,X_2}(x_1,x_2)\,x_1^{m_1}x_2^{m_2}\,dx_1\,dx_2
= \int_{X_1}P_{X_1}(x_1)\,x_1^{m_1}\,dx_1\int_{X_2}P_{X_2}(x_2)\,x_2^{m_2}\,dx_2
= \langle X_1^{m_1}\rangle\langle X_2^{m_2}\rangle\,.$$

This proves that if two variables are independent, the moments factorize. The
extension to more variables is obvious.
Next, we show that if the moments factorize, then the characteristic function
factorizes. Again, we just use math:

$$G(k_1,k_2) = \sum_{m_1,m_2=0}^{\infty}\frac{(ik_1)^{m_1}(ik_2)^{m_2}}{m_1!\,m_2!}\,\langle X_1^{m_1}X_2^{m_2}\rangle
= \sum_{m_1=0}^{\infty}\frac{(ik_1)^{m_1}}{m_1!}\,\langle X_1^{m_1}\rangle\;\sum_{m_2=0}^{\infty}\frac{(ik_2)^{m_2}}{m_2!}\,\langle X_2^{m_2}\rangle
= G_1(k_1)\,G_2(k_2)\,.$$

Again, the extension to multiple variables is obvious.


Next, we show that if the generating function factorizes, then the cumulants
$\langle\langle X_1^{m_1}X_2^{m_2}\rangle\rangle$ vanish when both $m_1$ and $m_2$ differ from zero. You guessed it,
just do math:
$$\log G(k_1, k_2) = \log G_1(k_1) + \log G_2(k_2)\,.$$

The Taylor series for this function obviously has no cross terms, which means
the cumulants with more than one variable having a non-zero power vanish.
It remains only to show that the vanishing of cumulants where more than
one power is nonzero implies that the variables are independent. If we can prove
that, then we’ll have completed a circle of implications showing that any one
of the three listed criteria, and the independence of the variables, all imply one
another.
If the cumulants vanish when more than one of the powers is nonzero, then

we can write
$$\log G(k_1,\ldots,k_n) = \sum_{j=1}^{\infty}\frac{(ik_1)^j}{j!}\,\langle\langle X_1^j\rangle\rangle + \cdots + \sum_{j=1}^{\infty}\frac{(ik_n)^j}{j!}\,\langle\langle X_n^j\rangle\rangle$$
$$G(k_1,\ldots,k_n) = \prod_{m=1}^{n}\exp\left[\sum_{j=1}^{\infty}\frac{(ik_m)^j}{j!}\,\langle\langle X_m^j\rangle\rangle\right] \equiv \prod_{m=1}^{n}G_m(k_m)$$
$$P_{X_1,\ldots,X_n}(x_1,\ldots,x_n) = \prod_{m=1}^{n}\int G_m(k_m)\,e^{-ik_m x_m}\,\frac{dk_m}{2\pi} = \prod_{m=1}^{n}P_{X_m}(x_m)\,.$$

1.27 Correlation is between -1 and 1 - pg. 12


1.27.1 Problem
Prove −1 ≤ ρij ≤ 1. Prove that if ρij is either -1 or 1 the variables Xi , Xj are
connected by a linear relation.
In the book, the correlation is defined as
$$\rho_{ij} = \frac{\langle\langle X_i X_j\rangle\rangle}{\sqrt{\langle\langle X_i^2\rangle\rangle\,\langle\langle X_j^2\rangle\rangle}}
            = \frac{\langle X_i X_j\rangle - \langle X_i\rangle\langle X_j\rangle}
                   {\sqrt{\left(\langle X_i^2\rangle - \langle X_i\rangle^2\right)\left(\langle X_j^2\rangle - \langle X_j\rangle^2\right)}}\,.$$

1.27.2 Solution
The Cauchy-Schwarz inequality says that for two vectors A and B in an inner
product space,
$$|\langle A|B\rangle|^2 \le \langle A|A\rangle\cdot\langle B|B\rangle\,,$$
where here $\langle\cdot|\cdot\rangle$ means inner product. Fortunately, given two variables X and
Y , the set of functions over those variables is a vector space. Also, given a
probability distribution PX,Y over those variables, the average
$$\langle f|g\rangle \equiv \int P_{X,Y}(x,y)\,f(x,y)\,g(x,y)\,dx\,dy$$

is an inner product on that vector space. If we define $f(x,y) \equiv x - \langle X\rangle$ and
$g(x,y) \equiv y - \langle Y\rangle$, then we have
$$|\langle f|g\rangle|^2 = |\langle(X-\langle X\rangle)(Y-\langle Y\rangle)\rangle|^2\,.$$

Using the Cauchy-Schwarz inequality we get
$$|\langle(X-\langle X\rangle)(Y-\langle Y\rangle)\rangle|^2 = |\langle f|g\rangle|^2 \le \langle f|f\rangle\,\langle g|g\rangle = \left\langle(X-\langle X\rangle)^2\right\rangle\left\langle(Y-\langle Y\rangle)^2\right\rangle$$
$$\frac{|\langle(X-\langle X\rangle)(Y-\langle Y\rangle)\rangle|^2}{\left\langle(X-\langle X\rangle)^2\right\rangle\left\langle(Y-\langle Y\rangle)^2\right\rangle} \le 1$$
$$\frac{|\langle XY\rangle - \langle X\rangle\langle Y\rangle|^2}{\left(\langle X^2\rangle - \langle X\rangle^2\right)\left(\langle Y^2\rangle - \langle Y\rangle^2\right)} \le 1\,.$$
Renaming $X \to X_i$ and $Y \to X_j$ we get
$$\frac{|\langle X_i X_j\rangle - \langle X_i\rangle\langle X_j\rangle|^2}{\left(\langle X_i^2\rangle - \langle X_i\rangle^2\right)\left(\langle X_j^2\rangle - \langle X_j\rangle^2\right)} \le 1\,,$$

which proves the first part of the problem.


I think it's reasonably clear that in order to saturate the Cauchy-Schwarz
inequality, the two vectors in question have to be parallel, meaning that
$$|f\rangle = a\,|g\rangle$$
for some scalar a. In the present case, that would mean
$$x - \langle X\rangle = a\,(y - \langle Y\rangle) \;\longrightarrow\; x = a y + \big(\langle X\rangle - a\langle Y\rangle\big)\,.$$
This proves that x and y are linearly related. I'm not really sure what this means
though. To say that X and Y are linearly related seems to be a statement about
the probability distribution itself (presumably that it is concentrated on this line),
but I'm not sure yet how to think about what that means.
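A small numerical illustration of both statements (a minimal sketch; the particular linear relations and sample sizes are arbitrary): the sample correlation always lands in [−1, 1], and it reaches ±1 exactly when one variable is a linear function of the other.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
noise = rng.normal(size=100_000)

for y in (3.0 * x + 1.0,        # exact linear relation -> rho = +1
          -2.0 * x + 0.5,       # exact linear relation -> rho = -1
          x + noise,            # partially correlated
          noise):               # independent -> rho near 0
    rho = np.corrcoef(x, y)[0, 1]
    print(round(float(rho), 4))
```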
