You are on page 1of 305

1

Events and their probabilities

1.2 Solutions. Events as sets


1. (a) Let a E (U Ai )c. Then a U Ai, so that a E A~ for all i. Hence (U Ai )c ~ n A~.
Conversely, if a E
A~, then a Ai for every i. Hence a U Ai, and so
A~ ~ (U Ai )c. The
first De Morgan law follows.

(b) Applying part (a) to the family {Af : i E /},we obtain that
Taking the complement of each side yields the second law.

(Ui

Aft =

ni(Af)c

ni

Ai.

2. Clearly
(i) An B = (Ac U Bc)c,
(ii) A\ B =An Be= (Ac U B)c,
(iii) A D. B =(A\ B) U (B \A)= (Ac U B)c U (AU Bc)c.
Now :F is closed under the operations of countable unions and complements, and therefore each of
these sets lies in :F.

3. Let us number the players 1, 2, ... , 2n in the order in which they appear in the initial table of draws.
The set of victors in the first round is a point in the space Vn = { 1, 2} x {3, 4} x .. x {2n - 1, 2n}.
Renumbering these victors in the same way as done for the initial draw, the set of second-round
victors can be thought of as a point in the space Vn-I, and so on. The sample space of all possible
outcomes of the tournament may therefore be taken to be Vn x Vn -I x x VI, a set containing
.
22n-122n-2 . . . 2I = 22n-I pomts.
Should we be interested in the ultimate winner only, we may take as sample space the set
{1, 2, ... , 2n} of all possible winners.
4. We must check that g, satisfies the definition of a a-field:
(a) 0 E :F., and therefore 0 = 0 n B E g,,
(b) if AI, A2, ... E :F., then Ui(Ai n B)= (Ui Ai) n BEg,,
(c) if A E :F., then Ac E :Fso that B \(An B)= Ac n B E g,.

Note that g, is a a-field of subsets of B but not a a-field of subsets of Q, since C E


that cc = Q \ C E g,.
5.

(a), (b), and (d) are identically true; (c) is true if and only if A~ C.

1.3 Solutions. Probability


1.

(i) We have (using the fact that lP' is a non-decreasing set function) that
lP'(A

n B) = lP'(A)

+ lP'(B) -

lP'(A U B) :::: lP'(A)

135

+ lP'(B) -

1=

fz.

g, does not imply

[1.3.2]-[1.3.4]

Solutions

Events and their probabilities

Also, since An B ~A and An B ~ B, li"(A n B) :::: min{lP'(A), lP'(B)} = ~These bounds are attained in the following example. Pick a number at random from {1, 2, ... , 12}.
Taking A = {1, 2, ... , 9} and B = {9, 10, 11, 12}, we find that An B = {9}, and so li"(A) =

i.

li"(B) = ~. li"(A n B)
B = {1, 2, 3, 4}.

= /2 .

To attain the upper bound for lP'(A n B), take A

= {1, 2, ... , 9} and

(ii) Likewise we have in this case lP'(A U B) :::: min{lP'(A) + li"(B), 1} = 1, and lP'(A U B) 2::
max{lP'(A), lP'(B)} =
These bounds are attained in the examples above.

2.

(i) We have (using the continuity property oflP') that


lP'(no head ever)

= n--+oo
lim lP'(no head in first n tosses) = lim 2-n = 0,
n--+oo

so that lP'(some head turns up) = 1 - lP'(no head ever) = 1.


(ii) Given a fixed sequences of heads and tails of length k, we consider the sequence of tosses arranged
in disjoint groups of consecutive outcomes, each group being of length k. There is probability 2-k
that any given one of these is s, independently of the others. The event {one of the first n such groups
iss} is a subset of the event {s occurs in the first nk tosses}. Hence (using the general properties of
probability measures) we have that
ll"(s turns up eventually) =

lim lP'(s occurs in the first nk tosses)

n--+oo

2:: lim lP'(s occurs as one of the first n groups)


n--+oo

= 1-

1-

lim lP'(none of the first n groups is s)

n--+oo

lim (1 -

n--+oo

rk( =

1.

3. Lay out the saucers in order, say as RRWWSS. The cups may be arranged in 6! ways, but since
each pair of a given colour may be switched without changing the appearance, there are 6!-:- (2 !) 3 = 90
distinct arrangements. By assumption these are equally likely. In how many such arrangements is
no cup on a saucer of the same colour? The only acceptable arrangements in which cups of the
same colour are paired off are WWSSRR and SSRRWW; by inspection, there are a further eight
arrangements in which the first pair of cups is either SW or WS, the second pair is either RS or SR,
and the third either RW or WR. Hence the required probability is 10/90 = ~.

4. Weprovethisbyinductiononn,consideringfirstthecasen = 2. Certainly B = (AnB)U(B\A)


is a union of disjoint sets, so that lP'(B) = li"(A n B)+ li"(B \A). Similarly AU B = AU (B \A), and
so
li"(A U B)= li"(A) + li"(B \A)= li"(A) + {li"(B) -li"(A n B)}.
Hence the result is true for n = 2. Let m 2:: 2 and suppose that the result is true for n :::: m. Then it is
true for pairs of events, so that
1
lP'Cu Ai)
1

= lP'(UAi) + li"(Am+d -11"{ (UAi)


1

n Am+1}

= lP'(u Ai) + li"(Am+d -11"{ U<Ai n


1

Am+1) }

Using the induction hypothesis, we may expand the two relevant terms on the right-hand side to obtain
the result.

136

Solutions

Conditional probability

[1.3.5]-[1.4.1]

Let A 1 , A 2, and A 3 be the respective events that you fail to obtain the ultimate, penultimate, and
ante-penultimate Vice-Chancellors. Then the required probability is, by symmetry,
3

UAi) = 1- 31P'(Al) + 31P'(A1 n Az) -1P'(A1 n Az n A 3 )

1 -11"(

= 1- 3(~)6 + 3(~)65.

(~)6.

By the continuity of 11", Exercise (1.2.1), and Problem (1.8.11),

= 1- lim

n-+oo

lP'(Un A;)::::: 1- n-+oo


lim ~lP'(A;) =
L
r=1

We have that 1 = ll"(lJAr) = Lli"(Ar)- Lli"(Ar


1
r
r<s
p 2:: n- 1, and ~n(n- 1)q = np- 1 ,:s n - 1.

6.

7.

1.

r=1

n As)

= np-

~n(n- 1)q.

Hence

Since at least one of the Ar occurs,


1=1P"(lJAr)=L:lP'(Ar)-LlP'(ArnAs)+ L
lP'(ArnAsnAt)
1
r
r<s
r<s<t

Since at least two of the events occur with probability ~,

~ = 11"( U<Ar n As))= Lli"(Ar n As)-~


r<s

r<s

L
li"(Ar
r<s
t<u
(r,s)#(t,u)

n As nAt n Au)+

By a careful consideration of the first three terms in the latter series, we find that

Hence

i = np- G)x, so that p 2:: 3/(2n). Also, (~)q = 2np- i. whence q .:S 4/n.
1.4 Solutions. Conditional probability

1.

By the definition of conditional probability,


li"(A

I B) _ li"(A n B) _ li"(B n A)
-

li"(B)

li"(A)

137

li"(A) _ li"(B
li"(B) -

I A) li"(A)
li"(B)

[1.4.2]-[1.4.5]

Solutions

Events and their probabilities

iflP'(A)lP'(B) =/; 0. Hence


lP'(A

I B)

lP'(B

I A)

lP'(B)

lP'(A)

whence the last part is immediate.


2.

Set Ao = Q for notational convenience. Expand each term on the right-hand side to obtain

3.

Let M be the event that the first coin is double-headed, R the event that it is double-tailed, and
Let H{ be the event that the lower face is a head on the ith toss, TJ the
event that the upper face is a tail on the ith toss, and so on. Then, using conditional probability ad
nauseam, we find:

N the event that it is normal.

(i)

lP'(Hl) = ~lP'(Hll M)
I

+ ~lP'(H/1 R) + ~lP'(HII IN)=~ +0+ ~ i

lP'(H/ n H.})
lP'(M)
2/3
2
I
= --I- = 5 5 = 3
lP'(Hu)
lP'(HI )

(ii)

lP'(HI I Hu) =

(iii)

lP'(H? I HJ) = 1 lP'(M I HJ)

+ ilP'(N

I HJ)
I + 2I( 1 - lP' (II
I) = 32
= lP' ( HIII Hu)
HI Hu)

+ 2I . 3I = o5

(iv)

~
4
= -2--I = -.
4 . lP'(N)
5 + TO
5

lP'(M)

1 . lP'(M)

(v) From (iv), the probability that he discards a double-headed coin is ~, the probability that he
discards a normal coin is ~- (There is of course no chance of it being double-tailed.) Hence, by
conditioning on the discard,

lP'(H~) = ~lP'(H~ I M)

+ !lP'(H~

IN)= ~Ct

+ i. i) +Hi+ i.

i) = M

4. The final calculation of ~ refers not to a single draw of one ball from an urn containing three, but
rather to a composite experiment comprising more than one stage (in this case, two stages). While it
is true that {two black, one white} is the only fixed collection of balls for which a random choice is
black with probability ~, the composition of the urn is not determined prior to the final draw.
After all, if Carroll's argument were correct then it would apply also in the situation when the urn
originally contains just one ball, either black or white. The final probability is now
implying that
the original ball was one half black and one half white! Carroll was himself aware of the fallacy in
this argument.

i,

5. (a) One cannot compute probabilities without knowing the rules governing the conditional probabilities. If the first door chosen conceals a goat, then the presenter has no choice in the door to be
opened, since exactly one of the remaining doors conceals a goat. If the first door conceals the car, then
a choice is necessary, and this is governed by the protocol of the presenter. Consider two 'extremal'
protocols for this latter situation.
(i) The presenter opens a door chosen at random from the two available.
(ii) There is some ordering of the doors (left to right, perhaps) and the presenter opens the earlier
door in this ordering which conceals a goat.
Analysis of the two situations yields p = ~ under (i), and p =

138

i under (ii).

Solutions [1.4.6]-[1.5.3]

Independence

1, ],

Let a E [ ~ and suppose the presenter possesses a coin which falls with heads upwards with
probability f3 = 6a - 3. He flips the coin before the show, and adopts strategy (i) if and only if the
coin shows heads. The probability in question is now ~ f3 + 1 - /3) = a.
You never lose by swapping, but whether you gain depends on the presenter's protocol.
(b) Let D denote the first door chosen, and consider the following protocols:
(iii) If D conceals a goat, open it. Otherwise open one of the other two doors at random. In this
case p = 0.
(iv) If D conceals the car, open it. Otherwise open the unique remaining door which conceals a
goat. In this case p = 1.
As in part (a), a randomized algorithm provides the protocol necessary for the last part.

1(

6.

This is immediate by the definition of conditional probability.

7. Let Ci be the colour of the ith ball picked, and use the obvious notation.
(a) Since each urn contains the same number n - 1 of balls, the second ball picked is equally likely to
be any of the n (n - 1) available. One half of these balls are magenta, whence 11"( C2 = M) =
(b) By conditioning on the choice of urn,

lP'(C2 = M I C 1 = M) = lP'(CJ, C2 = M) =
(n- r)(n- r - 1)
lP'(CJ = M)
r=l n(n- l)(n- 2)

j!

1.5 Solutions. Independence


1.

Clearly
li"(Ac n B)= li"(B \{An B}) = lP'(B) -lP'(A n B)
= lP'(B)- lP'(A)lP'(B) = lP'(Ac)lP'(B).

For the final part, apply the first part to the pair B, Ac.
2. Suppose i < j and m < n. If j < m, then Aij and Amn are determined by distinct independent
rolls, and are therefore independent. For the case j = m we have that
li"(Aij n Ajn) = lP'(ith, jth, and nth rolls show Sflme number)

tlP'(jth and nth rolls both show r J ith shows r) = 3~ = li"(Aij )li"(Ajn),

r=I

as required. However, if i

i= j i= k,

3. That (a) implies (b) is trivial. Suppose then that (b) holds. Consider the outcomes numbered
i 1, i2, ... , im, and let uJ E {H, T} for 1 :::.0 j :::.0 m. Let Sj be the set of all sequences oflength M =
max{iJ : 1 :::.0 j :::.0 m} showing u1 in the i1th position. Clearly IS} I =2M-I and Jn1 s1 J = 2M-m.
Therefore,

139

[1.5.4]-[1.7.2]
so that lP' ( nj

Solutions

Events and their probabilities

Sj) = IJj lP'(Sj ).


=

Suppose IAI
a, IBI
b, lA n Bl
c, and A and B are independent. Then lP'(A n B) =
lP'(A)lP'(B), which is to say that cfp = (a/p) (b/p), and hence ab = pc. If ab =I= 0 then pI ab
(i.e., p divides ab). However, pis prime, and hence either p I a or p I b. Therefore, either A= Q or
B = Q (or both).

4.

5. (a) Flip two coins; let A be the event that the first shows H, let B be the event that the second
shows H, and let C be the event that they show the same. Then A and B are independent, but not
conditionally independent given C.
(b) Roll two dice; let A be the event that the smaller is 3, let B be the event that the larger is 6, and let
C be the event that the smaller score is no more than 3, and the larger is 4 or more. Then A and B are
conditionally independent given C, but not independent.
(c) The definitions are equivalent if 11"(C) = 1.

6.

(fo)7 < ~-

k i

7. (a) lP'(A n B) = =
~ = li"(A)li"(B), and lP'(B n C) = = ~ = lP'(B)lP'(C).
(b) lP'(A n C)= 0 =/= lP'(A)lP'(C).
(c) Only in the trivial cases when children are either almost surely boys or almost surely girls.
(d) No.

8.

No. lP'(all alike)=

9.

lP'(lst shows rand sum is 7) = ~ =

iJ iJ =

lP'(lst shows r)lP'(sum is 7).

1.7 Solutions. Worked examples


1. Write EF for the event that there is an open road from E to F, and EF' for the complement of
this event; write E ~ F if there is an open route from E to F, and E
F if there is none. Now
{A~ C} = AB nBC, so that

lP'(AB I A

+ C) =

lP'(AB, A+ C)
li"(AB, B +C)
(l - p2)p2
li"(A
C)
= 1 - lP'(A ~ C) = -~---(-=1----=p-=-2)-=-2.

By a similar calculation (or otherwise) in the second case, one obtains the same answer:

2. Let A be the event of exactly one ace, and KK be the event of exactly two kings. Then lP'(A I
KK) = lP'(A n KK)/li"(KK). Now, by counting acceptable combinations,

so the required probability is

(4)1 (4)2 (44)


10 I (4)2 (48)
11 -_37.4611 .4737 -'"'"' 0 4.

140

.4

Solutions [1.7.3]-[1.8.2]

Problems

3. First method: Suppose that the coin is being tossed by a special machine which is not switched
off when the walker is absorbed. If the machine ever produces N heads in succession, then either the
game finishes at this point or it is already over. From Exercise (1.3.2), such a sequence of N heads
must (with probability one) occur sooner or later.
Alternative method: Write down the difference equations for Pk the probability the game finishes at
0 having started at k, and for /h, the corresponding probability that the game finishes at N; actually
these two difference equations are the same, but the respective boundary conditions are different.
Solve these equations and add their solutions to obtain the total 1.
4. It is a tricky question. One of the present authors is in agreement, since if li"(A
and li"(A I cc) > li"(B I cc) then
li"(A) = li"(A
> li"(B

I C)li"(C) + li"(A I cc)li"(Cc)


I C)li"(C) + li"(B I Cc)li"(Cc)

I C)

> li"(B

I C)

= li"(B).

The other author is more suspicious of the question, and points out that there is a difficulty arising
from the use of the word 'you'. In Example (1.7.10), Simpson's paradox, whilst drug I is preferable
to drug II for both males and females, it is drug II that wins overall.

5.

Let Lk be the label of the kth card. Then, using symmetry,


li"(Lk =mILk > Lr for

< r < k)
-

li"(Lk=m)
li"(Lk > Lr for 1 :S r < k)

1/1- =

=-

kjm.

1.8 Solutions to problems


1. (a) Method 1: There are 36 equally likely outcomes, and just 10 of these contain exactly one six.
The answer is therefore ~ =
Method II: Since the throws have independent outcomes,

fs.

ll"(first is 6, second is not 6)

= ll"(first is 6)ll"(second is not 6) =

! g= ft,.

There is an equal probability of the event {first is not 6, second is 6}.


(b) A die shows an odd number with probability ~; by independence, ll"(both odd) = ~ ~ = ;\.
(c) WriteS for the sum, and {i, j} for the event that the first is i and the second j. Then li"(S = 4) =
ll"(l, 3) + ll"(2, 2) + ll"(3, 1) =
(d) Similarly

li"(S divisible by 3)

= li"(S = 3) + li"(S = 6) + li"(S = 9) + li"(S =


= {ll"(1, 2)

12)

+ ll"(2, 1)}

+ {ll"(1, 5) + ll"(2, 4) + ll"(3, 3) + ll"(4, 2) + ll"(5, 1)}


+ {ll"(3, 6) + ll"(4, 5) + ll"(5, 4) + ll"(6, 3)} + ll"(6, 6)
=

j~

(a) By independence, ll"(n - 1 tails, followed by a head) = 2-n.


(b) If n is odd, ll"(# heads = # tails) = 0; #A denotes the cardinality of the set A. If n is even, there
are (n/ 2 ) sequences of outcomes with ~n heads and ~n tails. Any given sequence of heads and tails

2.

has probability 2-n; therefore ll"(# heads = # tails) = 2-n (nj 2 ).


141

[1.8.3]-[1.8.9]

Solutions

Events and their probabilities

G)

(c) There are


sequences containing 2 heads and n- 2 tails. Each sequence has probability 2-n,
and therefore JP(exactly two heads) = G)2-n.
(d) Clearly
JP(at least 2 heads)

= 1-

JP(no heads) -JP(exactly one head)

ni A;

= 1 - z-n - ( ~) z-n.

A7t,

3. (a) Recall De Morgan's Law (Exercise (1.2.1)):


= (U;
which lies in J"since it is
the complement of a countable union of complements of sets in :F.
(b) Jf is a a-field because:
(i) 0 E J"and 0 E g,; therefore 0 E Jf.
(ii) If A1, A2, ... is a sequence of sets belonging to both J"and g,, then their union lies in both J"and
g,, which is to say that Jf is closed under the operation of taking countable unions.
(iii) Likewise Ae is in Jf if A is in both J"and g,.
(c) We display an example. Let
Q

={a, b, c},

J"= {{a}, {b, c}, 0,

n},

g, ={{a, b}, {c}, 0, n}.

Then Jf = J"U g, is given by Jf = {{a}, {c}, {a, b}, {b, c}, 0, Q }. Note that {a}
but the union {a, c} is not in Jf, which is therefore not a a-field.

Jf and {c}

Jf,

4. In each case J"may be taken to be the set of all subsets of Q, and the probability of any member
of J" is the sum of the probabilities of the elements therein.
(a) Q = {H, T} 3 , the set of all triples of heads (H) and tails (T). With the usual assumption of
independence, the probability of any given triple containing h heads and t = 3- h tails is ph (1- p)t,
where p is the probability of heads on each throw.
(b) In the obvious notation, Q = {U, V} 2 = {UU, VV, UV, VU}. Also JP(UU) = JP(VV) = ~ ~ and
lP(UV) = JP(VU) = ~ ~.
(c) Q is the set of finite sequences of tails followed by a head, {Tn H : n ::: 0}, together with the infinite
sequence T 00 of tails. Now, lP(TnH) = (1- p)n p, and JP(T00 ) = limn--+ooO- p)n = 0 if p =j:. 0.
5.

As usual, JP(A!::.. B) = 1P( (AU B)\ JP(A n B)) = JP(A U B) -JP(A n B).

6.

Clearly, by Exercise (1.4.2),


JP(A U B U C)= 1P((Ae n Ben Ce)e) = 1 -JP(Ae n BenCe)
= 1 -JP(Ae

I Ben Ce)lP(Be I Ce)JP(Ce).

7. (a) If A is independent of itself, then JP(A) = JP(A n A) = JP(A) 2 , so that JP(A) = 0 or 1.


(b) If JP(A) = 0 then 0 = JP(A n B) = lP(A)lP(B) for all B. If JP(A) = 1 then JP(A n B) = JP(B), so
that JP(A n B) = lP(A)lP(B).
8. Q U 0 = Q and Q n 0 = 0, and therefore 1 = JP(Q U 0) = JP(Q) + lP(0) = 1 + 1P(0), implying
that lP(0) = 0.

9. (i) Q(0) = lP(0 I B)= 0. Also Q(Q) = JP(Q I B)= lP(B)/lP(B) = 1.


(ii) Let A1, A2, ... be disjoint members of :F. Then {A; n B: i 2:: 1} are disjoint members of :F,
implying that

142

Solutions

Problems

[1.8.10]-[1.8.13]

Finally, since Q is a probability measure,

I C)

Q(A

Q(A n C)
lP(A n C I B)
lP(A n B n C)
Q(C) =
lP(C I B) =
lP(B n C) = JP(A

I B n C).

The order of the conditioning (C before B, or vice versa) is thus irrelevant.

10. As usual,
00

JP(A) = lP ( U<A n Bj)


1

00

00

= L JP(A
1

n Bj) = L JP(A 1 Bj )JP(Bj ).


1

11. The first inequality is trivially true if n = 1. Let m :::: 1 and assume that the inequality holds for
n:::; m. Then

~ lP(Ai),

::": lP(lJAi) +1P(Am+1) ::":


1

by the hypothesis. The result follows by induction. Secondly, by the first part,

12. We have that

lP(~Ai) = lP( (yAf


= 1-

r)

1 -lP(yAf)

~JP(Aj) + ~JP(Af n Aj)- .. + (-l)nlP(rlAf)

(n) -

= 1- n + LlP(Ai) +

LlP(Ai U Aj)0

;;= JP(Ai)- - (-l)nlP(y

13. Clearly,

JP(Nk)

2::

JP(

Ss;{1,2, ... ,n}


ISI=k
For any such given
JP(

s, we write As =

n n
Ai

i ES

jrf_S

(n)
3

l<j

+ ( -l)n (:) - ( -l)nlP ( y Ai)

= (1- l)n +

by Exercise (1.3.4)

l<j

using De Morgan's laws again

Ai)

n n
Ai

iES

+ ...

by the binomial theorem.

A.i).

jrf_S

ni ES Ai Then
0

Aj) = JP(As)- LlP(Asu{jJ) + L


jrf_S

143

j <k

j,kS

lP(Asu{j,kJ)-

[1.8.14]-[1.8.16]

Solutions

Events and their probabilities

by Exercise (1.3.4). Hence

where a typical summation is over all subsets S of {1, 2, ... , n} having the required cardinality.
Let Ai be the event that a copy of the i th bust is obtained. Then, by symmetry,

where aj is the probability that the j most recent Vice-Chancellors are obtained. Now a3 is given in
Exercise (1.3.4), and a4 and as may be calculated similarly.

14. Assuming the conditional probabilities are defined,

I B)

lP'(A-

= lP'(Aj

I Aj)lP'(Aj)
1P'(B n (U7 Ai))
JP'(B

n B)

lP'(B)

15. (a) We have that


lP'(N = 2

s=

4 ) = lP'({N = 2} n {S = 4}) =
JP'(S = 4)

lP'(S = 4 1 N = 2)~(N = 2).


= I )lP'(N = I)

Li JP'(S = 4 I N

1 1
12 . 4

(b) Secondly,

ft

lP'(S = 4 I N = 2):! + lP'(S = 4 I N = 4)


JP'(S = 4 I N even) = ------------"-:-::---.,--------__,_,_
JP'(N even)
1

+ 641 . I61
:! + ft + . . .

12 . 4

4 33 + 1
4433 .

(c) Writing D for the number shown by the first die,

lP'(N = 2 I S = 4, D = 1) =

JP'(N = 2, S = 4, D = 1)
JP'(S = 4 D = 1)

= 1

004

o. 6. 4 + o. 30. 8 + 64. I6

(d) Writing M for the maximum number shown, if 1 :::: r :::: 6,


00

00

j= 1

j= 1

JP'(M:::: r) = LJP'(M:::: r IN= j)Tl. = L

Finally, lP'(M

= r) = JP'(M:::: r) -JP'(M:::: r-

(r)j
j1 = r- ( 1 - r6
2
12
12

)-1 = - -r .
12- r

1).

16. (a) w E B if and only if, for all n, w E U~n Ai, which is to say that w belongs to infinitely many
of the An.
(b) w E
if and only if, for some n, w E n~n Ai, which is to say that w belongs to all but a finite
number of the An.

144

Solutions [1.8.17]-[1.8.20]

Problems

(c) It suffices to note that B is a countable intersection of countable unions of events, and is therefore
an event.
(d) We have that

n
00

Cn =

00

Ai s; An s;

U Ai =

Bn,
i=n
i=n
and therefore JP(Cn) :::: JP(An) :::: JP(Bn). By the continuity of probability measures (1.3.5), if Cn -+ C
then JP(Cn) -+ JP(C), and if Bn -+ B then JP(Bn) -+ JP(B). If B = C = A then
JP(A) = JP(C) :::: lim JP(An) :::: JP(C) = JP(A).
n->-oo

17. If Bn and Cn are independent for all n then, using the fact that Cn s; Bn,
JP(Bn)lP(Cn) = JP(Bn

n Cn)

= JP(Cn) -+ JP(C)

as n-+ oo,

and also JP(Bn)lP(Cn) -+ JP(B)JP(C) as n -+ oo, so that JP(C) = JP(B)JP(C), whence either JP(C) = 0
or JP(B) = 1 or both. In any case JP(B n C) = JP(B)JP(C).
If An -+ A then A = B = C so that JP(A) equals 0 or 1.

18. It is standard (Lemma (1.3.5)) that lP is continuous if it is countably additive. Suppose then that lP is
finitely additive and continuous. Let At, Az, ... be disjoint events. Then Uf Ai = limn-;. 00 U1 Ai,
so that, by continuity and finite-additivity,

19. The network of friendship is best represented as a square with diagonals, with the comers labelled
A, B, C, and D. Draw a diagram. Each link of the network is absent with probability p. We write EF
for the event that a typical link EF is present, and EfC for its complement. We write A ++ D for the
event that A is connected to D by present links.

n BCc)p + JP(A

++ D 1 ADc

n BC)(1- p)

(d)

JP(A ++ D 1 ADc) = JP(A ++ D 1 ADc

(c)

n BCc)p + JP(A ++ D 1 BCc n AD)(1 - p)


={I_ (1- (1- p)2)2}p + (1- p).
JP(A ++ D 1 ABc)= JP(A ++ D 1 ABc n ADc)p + JP(A ++ D 1 ABc n AD) (I- p)
= (1- p){1- p(l- (1- p) 2)}p + (1- p).
JP(A ++D)= JP(A ++ D I ADc)p + JP(A ++ D I AD)( I - p)
= {1- (1- (1- p)2)2}p2 + (1- p2)2p(l- p) + (1- p).

= {1- (1- (1- p)2)2}p + (1- p2)2(1- p).

(b)

(a)

JP(A ++ D 1 BCc) = JP(A ++ D 1 ADc

20. We condition on the result of the first toss. If this is a head, then we require an odd number of
heads in the next n - 1 tosses. Similarly, if the first toss is a tail, we require an even number of heads
in the next n - 1 tosses. Hence
Pn = p(l- Pn-d

+ (1- P)Pn-1

= (1- 2p)Pn-1

+P

with PO = 1. As an alternative to induction, we may seek a solution of the form Pn = A


Substitute this into the above equation to obtain

A+ BA.n = (1- 2p)A

+ (1- 2p)BA.n-l + p

145

+ B A. n.

[1.8.21]-[1.8.23]

Solutions

and A+ B = 1. Hence A=

Events and their probabilities

1. B = 1. A.= 1- 2p.

21. Let A = {run of r heads precedes run of s tails}, B = {first toss is a head}, and C = {first s tosses
are tails}. Then

where p = 1 - q is the probability of heads on any single toss. Similarly JP(A I B)= pr-l + JP(A
Bc)(l_pr-l). WesolveforlP(A I B)andlP(A I Bc),andusethefactthatlP(A) =lP(A I B)p+JP(A
Bc)q, to obtain

I
I

22. (a) Since every cherry has the same chance to be this cherry, notwithstanding the fact that five
are now in the pig, the probability that the cherry in question contains a stone is
=

,fu ! .

(b) Think about it the other way round. First a random stone is removed, and then the pig chooses
his fruit. This does not change the relevant probabilities. Let C be the event that the removed cherry
contains a stone, and let P be the event that the pig gets at least one stone. Then JP(P I C) is the
probability that out of 19 cherries, 15 of which are stoned, the pig gets a stone. Therefore
JP(P

C)= 1 -JP(pig chooses only stoned cherries C) = 1 -

MH H j% .g..

23. Label the seats 1, 2, ... , 2n clockwise. For the sake of definiteness, we dictate that seat 1 be
occupied by a woman; this determines the sex of the occupant of every other seat. For 1 ::::: k ::::: 2n,
let Ak be the event that seats k, k + 1 are occupied by one of the couples (we identify seat 2n + 1 with
seat 1). The required probability is

Now, JP(Ai) = n (n - 1) !2 / n !2 , since there are n couples who may occupy seats i and i + 1, (n - 1)!
ways of distributing the remaining n - 1 women, and (n - 1)! ways of distributing the remaining n- 1
men. Similarly, if 1 ::::: i < j ::::: 2n, then
(n-2)t2

JP(AinAj)= { n(n-1)
0

n! 2

ifli-jlfl
if li- jl = 1,

subject to JP(At n A2n) = 0. In general,


(n- k)!

n!
if it < i2 < < ik and ij+t - ij ::::: 2 for 1 ::::: j < k, and 2n +it - ik ::::: 2; otherwise this
probability is 0. Hence
n
(n-k)l
lP nAj = ~)-1)k
1 Sk,n
1
k=O
n.

(2n)

where Sk,n is the number of ways of choosing k non-overlapping pairs of adjacent seats.
Finally, we calculate Sk,n Consider first the number Nk,m of ways of picking k non-overlapping
pairs of adjacent seats from a line (rather than a circle) of m seats labelled 1, 2, ... , m. There is a oneone correspondence between the set of such arrangements and the set of (m - k)-vectors containing

146

Solutions [1.8.24]-[1.8.27]

Problems

k 1'sand (m - 2k) O's. To see this, take such an arrangement of seats, and count 0 for an unchosen
seat and 1 for a chosen pair of seats; the result is such a vector. Conversely take such a vector, read its
elements in order, and construct the arrangement of seats in which each 0 corresponds to an unchosen
seat and each 1 corresponds to a chosen pair. It follows that Nk,m = (mk'k).
Turning to Sk,n either the pair 2n, 1 is chosen or it is not. If it is chosen, we require another
k - 1 pairs out of a line of 2n - 2 seats. If it is not chosen, we require k pairs out of a line of 2n seats.
Therefore
Sk,n

= Nk-1,2n-2 + Nk,2n =

( 2n-k-I)
k_ l

+ (2n-k)
k
=

(2n-k) 2n
k
2n _ k

24. Think about the experiment as laying down the b + r balls from left to right in a random order.
The number of possible orderings equals the number of ways of placing the blue balls, namely (bt).
The number of ways of placing the balls so that the first k are blue, and the next red, is the number of
ways of placing the red balls so that the first is in position k + 1 and the remainder are amongst the
r + b - k - 1 places to the right, namely (r+~=~- 1 ). The required result follows.
The probability that the last ball is red is r / (r + b), the same as the chance of being red for the
ball in any other given position in the ordering.
25. We argue by induction on the total number of balls in the urn. Let Pac be the probability that the
last ball is azure, and suppose that Pac = whenever a, c :::-_ 1, a+ c ::; k. Let a and CJ be such that
a, a : :-_ 1, a+ a = k + 1. Let A; be the event that i azure balls are drawn before the first carmine
ball, and let Cj be the event that j carmine balls are drawn before the first azure ball. We have, by
taking conditional probabilities and using the induction hypothesis, that

Paa = LPa-i,aiP'(A;)

+ LPa,a-jiP'(Cj)

i=1

}=1

a-1

= Po,aiP'(Aa) + Pa,oiP'(Ca) +

iL

a-1

IP'(A;) +

1L

i=l

Now PO,a = 0 and Pa,O

= 1.

IP'(Cj).

}=1

Also, by an easy calculation,

a
a+a

IP'(Aa) = - -

a-1
1
-a+a-1
a+l

ala!
= IP'(Ca).
(a+a)!

It follows from the above two equations that


Paa =

i (:tiP'( A;)+ t!P'(CJ)) + 1(IP'(Ca) -IP'(Aa)) = 1


i=l

}=1

26. (a) If she says the ace of hearts is present, then this imparts no information about the other card,
which is equally likely to be any of the three other possibilities.
(b) In the given protocol, interchange hearts and diamonds.
27. Writing A if A tells the truth, and Ac otherwise, etc., the only outcomes consistent with D telling
the truth are ABCD, ABcccD, NBCcD, and NBcCD, with a total probability of
Likewise, the
only outcomes consistent with D lying are AcBcccDc, AcBcDc, ABc cDc, and ABCcDc, with a total
probability of ~. Writing S for the given statement, we have that

H.

IP'(D 1 S)

IP'(DnS)
IP'(D n S) + IP'(Dc n S)
147

13

liT

H+ ~

13
41

[1.8.28]-[1.8.33]

Solutions

Events and their probabilities

#;

Eddington himself thought the answer to be


hence the 'controversy'. He argued that a truthful
denial leaves things unresolved, so that if, for example, B truthfully denies that C contradicts D, then
we cannot deduce that C supports D. He deduced that the only sequences which are inconsistent with
the given statement are ABcCD and ABcccoc, and therefore
lP'(D

I S) =

25

8I

46

+ 8I

25

=-

71

Which side are you on?

28. Let Br be the event that the rth vertex of a randomly selected cube is blue, and note that lP'(Br)
By Boole's inequality,

lo.

r=l

r=l

lP'(

UBr) .::=: LlP'(Br) = 1~ < 1,

so at least 20 per cent of such cubes have only red vertices.

I A) = lP'(A n B)/lP'(A) = lP'(A I B)lP'(B)/lP'(A) > lP'(B).


I Be) = lP'(A n BC)/lP'(Bc) = {lP'(A) -lP'(A n B)}/lP'(Bc) < lP'(A).

29. (a) lP'(B

(b) lP'(A

(c) No. Consider the case An C

= 0.

30. The number of possible combinations of birthdays of m people is 365m; the number of combinations of different birthdays is 365!/(365- m)!. Use your calculator for the final part.

ez

32. In the obvious notation, lP'(wS, xH, yD, zC) = (!;) (~) (~) 3)
tor. Turning to the 'shape vector' (w, x, y, z) with w:::: x:::: y:::: z,
lP'(w, x, y, z) = {

I m).

Now use your calcula-

4lP'(wS, xH, yD, zC)

if w

=f. x =

12lP'(wS, xH, yD, zC)

if w

= x =f. y =f. z,

on counting the disjoint ways of obtaining the shapes in question.


33. Use your calculator, and divide each of the following by

148

el).

y = z,

Solutions

Problems

[1.8.34]-[1.8.36]

34. Divide each of the following by 65 .


6!5!
3! (2!) 2 ,

6! 5!
3! (2!) 3 ,

6!,

6!5!
4!3!2!'

6!5!
2!(3!) 2 '
6!5!
(4!)2,

6!5!
(5!)2.
35. Let Sr denote the event that you receive r similar answers, and T the event that they are correct.
Denote the event that your interlocutor is a tourist by V. Then T n vc = 0, and
lP(T
JP(T I Sr)
Hence:

= i x VI = i
I S2) = (i) 2 V[{ (i) 2 +

n V n Sr)
lP(Sr)

lP(T

n Sr I V)lP(V)
lP(Sr)

(a) JP(T I S1)


(b) lP(T

(i) 3 V[{(i) 3 +
I S4) = (i) 4 . V[{(i) 4 +

(c) lP(T I S3) =

H+ ~] = i

(!) 2

(!) 3 }i +

~] =

zt

(d) JP(T
(!) 4 H + ~] = ~
(e) If the last answer differs, then the speaker is surely a tourist, so the required probability is
( 43 )3 . 41

( 1)3 X 1
4
4

+ (1)3
l
4 "4

9
= 10

36. Let E (respectively W) denote the event that the answer East (respectively West) is given.
(a) Using conditional probability,
JP(East correct I E)

JP(East correct I W)

ElP(E I East correct)


JP(E)
1

E(6

E(~.! + ~)
= E
1
2 3
+ 3) + 34(l- E)

(b) Likewise, one obtains for the answer EE,

and for the answer WW,

(c) Similarly for EEE,

149

E ~.
1
2 1
1
zE + (34 + 3)(1 +E)

= E,

[1.8.37]-[1.8.39]

Solutions

Events and their probabilities

andforWWW,
E{ (~)d)3

liE

E[(~)(~)3+j]+(1-E)~(i)3
Then forE =

=9+2E.

irJ, the first is !Jz; the second is i, as you would expect if you look at Problem (1.8.35).

37. Use induction. The inductive step employs Hoole's inequality and the fact that

38. We propose to prove by induction that

IP'(

Ar) ::: :t!P'(Ar)-

r=l

r=t

IP'(Ar nAt).

2~r~n

There is nothing special about the choice of At in this inequality, which will therefore hold with any
suffix k playing the role of the suffix 1. Kounias's inequality is then implied.
The above inequality holds trivially when n = 1. Assume that it holds for some value of n (2: 1).
We have that

::: :t!P'(Ar)-

r=t
n+l

::: L

IP'(Ar nAt)+ IP'(An+t) -IP'( An+t n

IP'(Ar)-

r=l

Ar)

r=t

2~r~n

IP'(Ar nAt)

2~r~n+t

since!P'(An+t nAt) ::OIP'(An+t nU~=tAr)


39. We taken 2: 2. We may assume without loss of generality that the seats are labelled 1, 2, ... , n,
and that the passengers are labelled by their seat assignments. Write F for the event that the last
passenger finds his assigned seat to be free. Let K (2: 2) be the seat taken by passenger I, so that
IP'(F) = (n-1)-t 2::/:= 2 akwhereak = IP'(F I K = k). Notethatan = 0. Passengers2, 3, ... , K-I
occupy their correct seats. Passenger K either occupies seat I, in which case all subsequent passengers
take their correct seats, or he occupies some seat L satisfying L > K. In the latter case, passengers
K + 1, K + 2, ... , L- 1 are correctly seated. We obtain thus that
ak

Therefore ak =

1
(1
n-k+1

+ ak+t + ak+2 ++an),

i for 2::: k < n, by induction, and so IP'(F) =

150

2::: k < n.

iCn- 2)/(n- 1).

2
Random variables and their distributions

2.1 Solutions. Random variables


1. (i) If a > 0, x E
variable. If a < 0,

~.then

{w: aX(w)

{w: aX(w)

~ x} =

x}

= {w:

{w: X(w) 2: xja} = {

X(w) ~ xja} E .T'since X is a random

U1 { w: X(w) ~ ~- ~}

n2'::

which lies in r since it is the complement of a countable union of members of :F. If a

{w: aX(w)

~ x} =

= 0,

if X< 0,
ifx 2: 0;

in either case, the event lies in :F.


(ii) For wE n, X(w)- X(w) = 0, so that X- X is the zero random variable (that this is a random
variable follows from part (i) with a = 0). Similarly X (w) +X (w) = 2X (w ).
2.

Set Y = aX +b. We have that

lP'(X ~ (y- b)/a)= F((y- b)/a)


lP'(Y < y) = {
JP>(X 2: (y- b)/a) = 1 -1imxt(y-b)fa F(x)
Finally, if a= 0, then Y = b, so that JP>(Y

y) equals 0 if b > y and 1 if b

if a> 0,
if a < 0.
~

y.

3. Assume that any specified sequence of heads and tails with length n has probability z-n. There
are exactly (k) such sequences with k heads.
If heads occurs with probability p then, assuming the independence of outcomes, the probability of
any givensequenceofk headsandn-k tails is pk (1- p)n-k. The answer is therefore (k) pk(l- p)n-k.

4. Write H = "AF + (1 - "A)G. Then limx-+-oo H(x) = 0, limx-+oo H(x)


non-decreasing and right-continuous. Therefore His a distribution function.

= 1, and clearly His

5. The function g(F(x)) is a distribution function whenever g is continuous and non-decreasing on


[0, 1], with g(O) = 0, g(l) = 1. This is easy to check in each special case.

151

[2.2.1]-[2.4.1]

Solutions

Random variables and their distributions

2.2 Solutions. The law of averages


1. Let p be the potentially embarrassed fraction of the population, and suppose that each sampled
individual would truthfully answer "yes" with probability p independently of all other individuals.
In the modified procedure, the chance that someone says yes is p + ~(1 - p) = ~(1 + p). If the
proportion of yes's is now ifJ, then 2J- 1 is a decent estimate of p.
The advantage of the given procedure is that it allows individuals to answer "yes" without their
being identified with certainty as having the embarrassing property.
2.

Clearly Hn

+ Tn

= n, so that (Hn - Tn)/n = (2Hn/n)- 1. Therefore

lP' ( 2p - 1 -

as n

--+

E~ ~ (Hn -

Tn)

~ 2p - 1 + E) = lP' (I~ Hn -

pI ~ n

--+

oo, by the law of large numbers (2.2.1 ).

3. Let/n (x) be the indicator function of the event {Xn ~ x}. By the law of averages, n -I I:~= 1 I r (x)
converges in the sense of (2.2.1) and (2.2.6) to lP'(Xn ~ x) = F(x).

2.3 Solutions. Discrete and continuous variables


1.

With 8 = supm lam- am-JI, we have that


IF(x)- G(x)l

IF(am)- F(am-dl

IF(x

+ 8)- F(x- 8)1

for x E [am-I, am). Hence G(x) approaches F(x) for any x at which F is continuous.
2.

For y lying in the range of g, {Y ~ y} ={X~ g- 1 (y)}

3.

Certainly Y is a random variable, using the result of the previous exercise (2). Also

:F.

lP'(Y ~ y) = IP'(F- 1 (X) ~ y) = IP'(X ~ F(y)) = F(y)

as required. IfF is discontinuous then p- 1 (x) is not defined for all x, so that Y is not well defined.
If F is non-decreasing and continuous, but not strictly increasing, then p-I (x) is not always defined
uniquely. Such difficulties may be circumvented by defining p-I (x) = inf {y : F (y) ~ x}.

4. The function A.f +(1- A. )g is non-negative and integrable over llHo 1. Finally, f g is not necessarily
a density, though it may be: e.g., iff= g = 1, 0 ~ x ~ 1 then f(x)g(x) = 1, 0 ~ x ~ 1.
5. (a) If d > 1, then J100 cx-d dx = c/(d- 1). Therefore f is a density function if c = d- 1, and
F(x) = 1- x-(d-I) when this holds. If d ~ 1, then f has infinite integral and cannot therefore be a
density function.
(b) By differentiating F(x) =ex j(l +ex), we see that F is the distribution function, and c = 1.

2.4 Solutions. Worked examples


1.

(a) If y ~ 0,

lP'(X 2 ~ y) = lP'(X ~

(b) We must assume that

X~

0. If y

,JY) -lP'(X
~

<

0,

152

-,JY) =

F(,J'Y)- F(-,J'Y).

Solutions [2.4.2]-[2.5.6]

Random vectors
(c)If-1

1,
00

lP'(sinX ~ y) =

IP'((2n

+ l)rr- sin- 1 y ~X~ (2n + 2)rr + sin- 1 y)

n=-oo
00

L {F((2n + 2)rr + sin- 1 y)- F((2n + l)rr- sin- 1 y) }


n=-oo

(d) lP'(G- 1 (X) ~ y) = lP'(X ~ G(y)) = F(G(y)).


(e) IfO ~ y ~ 1, then lP'(F(X) ~ y) = lP'(X ~ p- 1 (y)) = F(F- 1 (y)) = y. There is a small
difficulty ifF is not strictly increasing, but this is overcome by defining p- 1 (y) = sup{x : F (x) = y}.
(f) lP'(G- 1 (F(X)) ~ y) = lP'(F(X) ~ G(y)) = G(y).
2.

~.

It is the case that, for x E

Fy(x) and Fz(x) approach F(x) as

a~

-oo, b

oo.

2.5 Solutions. Random vectors


1. Write fxw = lP'(X = x, W = w). Then foo = h1 =
x,w.
2.

(a) We have that


/x,y(x,y)

r-

:!, /10 = i. and fxw = 0 for other pairs

if (x, y) = (1, 0),


if (x, y) = (0, 1),
otherwise.

(b) Secondly,
1- p

!x,z(x, z) =

if (x, z) = (0, 0),


if (x, z) = (1, 0),

otherwise.

3.

Differentiating gives fx,y(x, y) =e-x /{rr(l

4.

Let A = {X

+ y 2 )}, x

::: 0, y E R.

b, c < Y ~ d}, B = {a < X ~ b, Y ~ d}. Clearly

lP'(A) = F(b, d) - F(b, c),

lP'(B) = F(b, d) - F(a, d),

lP'(A U B) = F(b, d) - F(a, c);

now lP'(A n B)= lP'(A) + lP'(B) -lP'(A U B), which gives the answer. Draw a map of~2 and plot the
regions of values of (X, Y) involved.

5.

The given expression equals


lP'(X = x, Y

Secondly, for 1

y) -lP'(X = x, Y

y- 1) = lP'(X = x, Y = y).

x ~ y ~ 6,

if X < y,

ifx = y.
6.

No, because F is twice differentiable with a2 F jaxay < 0.

153

[2.7.1]-[2.7.6]

Solutions

Random variables and their distributions

2. 7 Solutions to problems
1.

By the independence of the tosses,


JP>(X > m) = JP>(first m tosses are tails)= (1- p)m.

Hence
JP>(X < x) = {

1-(1-p)lxJ

if X::: 0,

if X< 0.

Remember that Lx J denotes the integer part of x.


(a) If X takes values {xi : i ::: 1} then X = 2:::~ 1 Xi IAi where Ai = {X= Xi}.
(b) Partition the real line into intervals of the form [k2-m, (k + 1)2-m), -oo < k < oo, and
define Xm = L~-oo k2-m h,m where h,m is the indicator function of the event {k2-m :::: X <
(k + 1)2-m}. Clearly Xm is a random variable, and Xm(w) t X(w) as m--+ oo for all w.
2.

(c) Suppose {Xm} is a sequence of random variables such that Xm(w) t X(w) for all w. Then
{X :::: X} = nm {Xm :::: X}, which is a countable intersection of events and therefore lies in :F.

3.

(a) We have that


00

{X+Y::::x}=n
n=1

({X::::r}n{Y::::x-r+n- 1 })

rEI()>+

where IQI+ is the set of positive rationals.


In the second case, if X Y is a positive function, then X Y = exp{log X +log Y}; now use Exercise
(2.3.2) and the above. For the general case, note first that IZI is a random variable whenever Z is
a random variable, since {IZI :::: a} = {Z :::: a} \ {Z < -a} for a ::: 0. Now, if a ::: 0, then
{XY ::::a}= {XY < 0} U {IXYI ::::a} and
{XY < 0}

({X < 0}

n {Y

> 0}) U ({X > 0}

n {Y

< 0}).

Similar relations are valid if a < 0.


Finally {min{X, Y} > x} = {X > x} n {Y > y }, the intersection of events.
(b) It is enough to check that aX+ {JY is a random variable whenever a, f3 E lR and X, Yare random
variables. This follows from the argument above.
If Q is finite, we may take as basis the set {IA : A E

4.

(a) F(~)- F(~)

(b) F(2)- F(1) =

(c)

JP>(X 2 ::::

J1 of all indicator functions of events.

X)= JP>(X:::: 1) =

1) = l
i) = JP>(X:::: i) = !-

(d) li"(X:::: 2X 2 ) = li"(X:::


(e) JP>(X +

2 ::::

(f) JP>(vfx:::: z) = JP>(X:::: z2 ) =

5.

iz2 ifO:::: z:::: .J2.

JP>(X = -1) = 1- p, JP>(X = 0) = 0, li"(X::: 1) =

-!P

6. There are 6 intervals of 5 minutes, preceding the arrival times of buses. Each such interval has
5 = 1 , so the answer IS
. 6 12
1 = 1.
probab1"l"1ty 60
12
2

154

Solutions [2.7.7]-[2.7.12]

Problems

7. Let T and B be the numbers of people on given typical flights of TWA and BA. From Exercise
(2.1.3),
~T-~-

(10)
k

(~)k (_.!._) 10-k ,


10

JP'(B =k) = ( 20) ( -9


k
10

10

)k (-1 )20-k
10

Now
lP'(TWA overbooked)= JP'(T = 10) = ( [0 ) 10 ,
lP'(BAoverbooked) = JP'(B:::: 19) = 20(-fu) 19(fo) + (-fu) 20,
of which the latter is the larger.

8.

Assuming the coins are fair, the chance of getting at least five heads is (~) 6 + 6(~) 6 =

9.

(a) We have that


lP'(x+ < x) = { 0
-

F(x)

-l4.

if X< 0,
ifx :=:: 0.

(b) Secondly,
lP'(X- :::0 x) = { O .
1 - hmyt-x F(y)

if X < 0,
if X:::: 0.

(c) lP'(IXI:::: x) = JP'(-x:::: X:::: x) if x :=:: 0. Therefore


if X < 0,

lP'(IXI < x) = { 0
-

F(x) -limyt-x F(y)

ifx:::: 0.

(d) lP'(-X:::: x) = 1 -limyt-x F(y).


10. By the continuity of probability measures (1.3.5),
lP'(X = xo) = lim lP'(y <X:::: xo) = F(xo)- lim F(y) = F(xo)- F(xo-),

ytxo

ytxo

using general properties of F. The result follows.


11. Define m = sup{x: F(x) < ~}.Then F(y) < ~for y < m, and F(m) :=:: ~(if F(m) < ~then
F(m 1 ) < ~ for some m1 > m, by the right-continuity ofF, a contradiction). Hence m is a median,
and is smallest with this property.
A similar argument may be used to show that M = sup{x : F (x) :::: ~} is a median, and is largest
with this property. The set of medians is then the closed interval [m, M].

12. Let the dice show X andY. WriteS= X+ Y and fi = lP'(X = i), gi = JP'(Y = i). Assume that
lP'(S = 2) = lP'(S = 7) = lP'(S = 12) = 1\. Now
lP'(S = 2) = lP'(X = l)lP'(Y = 1) = flg1,
lP'(S = 12) = lP'(X = 6)lP'(Y = 6) = f6g6,
lP'(S = 7) :::: lP'(X = 1)lP'(Y = 6) + lP'(X = 6)lP'(Y = 1) = flg6 + f6g1.
It follows that

f1 g1 =

f6g6, and also

- 1 = lP'(S = 7) :::: flg1


11

(g6-g1 + -fl!6) = -111 (


155

+ -1)
X

[2.7.13]-[2.7.14]

Solutions

Random variables and their distributions

where x = g6/ gJ. However x +x-i > 1 for all x > 0, a contradiction.

13. (a) Clearly dL satisfies (i). As for (ii), suppose that dL(F, G)

= 0. Then

F(x) ::0 lim{G(x +E)+ E} = G(x)


E,l-0

and
F(y) 0:: lim{G(y- E)- E}
E,l-0

= G(y-).

Now G(y-) :::: G(x) if y > x; taking the limit as y ,j, x we obtain
F(x):::: limG(y-):::: G(x),
y,l-x

implying that F (x) = G (x) for all x.


Finally, if F(x) ::0 G(x +E)+ E and G(x) ::0 H(x + 8) + 8 for all x and some E, 8 > 0, then
F(x) ::0 H(x + 8 +E) + E + 8 for all x. A similar lower bound for F(x) is valid, implying that
dL(F, H) ::: dL(F, G)

+ dL(G, H).

(b) Clearly drv satisfies (i), and drv(X, Y)


inequality,
llP'(X

= k) -lP'(Z = k)l

:S llP'(X

= 0 if and only if lP'(X = Y) = 1. By the usual triangle


= k) -lP'(Y = k)l + llP'(Y = k) -lP'(Z = k)l,

and (iii) follows by summing over k.


We have that
2llP'(X E A) - lP'(Y E A) I = I (lP'(X E A) -lP'(Y E A)) - (lP'(X E A c) - lP'(Y E A c)) I

= ~~(lP'(X = k) -lP'(Y = k))lA(k)i


where J A (k) equals 1 if k E A and equals -1 if k E Ac. Therefore,
2llP'(X E A) -lP'(Y E A)l :S LllP'(X = k) -lP'(Y = k)lllA(k)l :S drv(X, Y).
k

Equality holds if A= {k: lP'(X = k) > lP'(Y = k)}.

14. (a) Note that


a2F
- = -e -x- Y
axay

<0,

so that F is not a joint distribution function.


(b) In this case
82 F ={e-Y
axay
0
and in addition

x,y > 0,

ifO ::0 x :S y,
ifO==oy==ox,

a2 F
--dxdy = 1.
in0 oo inoo
0
axay

Hence F is a joint distribution function, and easy substitutions reveal the marginals:
Fx(x)
Fy(y)

= y-+oo
lim F(x, y) =

1 -e-x,

x :::: 0,

= X-+00
lim F(x, y) = 1- e-Y- ye-Y,
156

y:::: 0.

Solutions [2.7.15]-[2.7.20]

Problems

15. Suppose that, for some i i= j, we have Pi < Pj and Bi is to the left of Bj. Write m for the
position of Bi and r for the position of Bj, and consider the effect of interchanging Bi and Bj. For
k :::;: m and k > r, lP'(T :::: k) is unchanged by the move. Form < k :::;: r, lP'(T :::: k) is decreased by an
amount p j - Pi, since this is the increased probability that the search is successful at the m th position.
Therefore the interchange of Bi and Bj is desirable.
It follows that the only ordering in which lP'(T :::: k) can be reduced for no k is that ordering in
which the books appear in decreasing order of probability. In the event of ties, it is of no importance
how the tied books are placed.

16. Intuitively, it may seem better to go first since the first person has greater choice. This conclusion
is in fact false. Denote the coins by C1, C2, C3 in order, and suppose you go second. If your opponent
chooses C1 then you choose C3, because 1P'(C3 beats C1) = ~ + ~ ~ = i~ >
Likewise
1P'(C1 beats C2) = 1P'(C2 beats C3) = ~ >
Whichever coin your opponent picks, you can arrange
to have a better than evens chance of winning.

17. Various difficulties arise in sequential decision theory, even in simple problems such as this one.
The following simple argument yields the optimal policy. Suppose that you have made a unsuccessful
searches "ahead" and b unsuccessful searches "behind" (if any of these searches were successful, then
there is no further problem). Let A be the event that the correct direction is ahead. Then
lP'(A

I current know1e dge)

lP'(current knowledge I A)lP'(A)


d
lP'(current knowle ge)

(1- p)aa

which exceeds if and only if (1 - p)aa > (1 - p)b(l -a). The optimal policy is to compare
(1 - p)aa with (1 - p)b(1 -a). You search ahead if the former is larger and behind otherwise; in
the event of a tie, do either.

i)

i).

18. (a) There are (6 possible layouts, of which 8+8+2 are linear. The answer is 18/(6
(b) Each row and column must contain exactly one pawn. There are 8 possible positions in the first
row. Having chosen which of these is occupied, there are 7 positions in the second row which are
admissible, 6 in the third, and so one. The answer is 8!/(<t).
19. (a) The density function is f(x)

F 1 (x)

= 2xe-x 2 ,

x:::: 0.

(b) The density function is f(x) = F 1(x) = x 2e-lfx, x > 0.


(c) The density function is f(x) = F'(x) = 2(ex + e-x)- 2 , x E .R
(d) This is not a distribution function because F 1 (1) < 0.

20. We have that


lP'(U

V)

= {

fu,v(u, v) du dv

= 0.

J{(u,v):u=v)

The random variables X, Y are continuous but not jointly continuous: there exists no integrable
function f : [0, 1]2 ---+ lR such that
lP'(X :5 x, Y :5 y) =

1x 1Y

f(u, v)dudv,

u=O v=O

157

0:5 x, y :51.

3
Discrete random variables

3.1 Solutions. Probability mass functions

c- 1 = I::i"' 2-k = 1.
(b) c- = I::i"' rk fk = Iog2.
(c) c- 1 = I::i"' k- 2 = :rr 2 /6.
(d) c- 1 = I::i"' 2k /k! = e 2 - 1.
1.

(a)

2. (i) ~; 1- (2log2)- 1 ; 1- 6:rr- 2; (e 2 - 3)j(e 2 - 1).


(ii) 1; 1; 1; 1 and 2.
(iii) It is the case that JP>(X even) = I:;~ 1 JP>(X = 2k), and the answers are therefore
(d) We have that
(a) ~,(b) 1 -(log 3)j(log4), (c)

so the answer is

-! (1 -

e- 2 ).

3. The number X of heads on the second round is the same as if we toss all the coins twice and count
the number which show heads on both occasions. Each coin shows heads twice with probability p 2 ,
so JP>(X = k) = (~)p2k(1 _ p2)n-k.

4.

Let Dk be the number of digits (to base 10) in the integer k. Then

5. (a) The assertion follows for the binomial distribution because k(n - k) :::: (n - k
The Poisson case is trivial.
(b) This follows from the fact that k8 2: (k2 - 1) 4 .
(c) Thegeometricmassfunctionf(k) =qpk, k 2:0.

3.2 Solutions. Independence


1.

We have that
JP>(X

= 1, Z = 1) = JP>(X = 1, Y = 1) = ! = JP>(X = 1)1P'(Z = 1).


158

+ l)(k + 1).

Solutions

Independence

[3.2.2]-[3.2.3]

This, together with three similar equations, shows that X and Z are independent. Likewise, Y and Z
are independent. However
lP'(X = 1, Y = 1, Z = -1) = 0 ~ ~ = lP'(X = 1)1P'(Y = 1)1P'(Z = -1),
so that X, Y, and Z are not independent.

2.

(a) If x :::: 1,
JP>( min{X, Y} ~ x) = 1 -JP'(X > x, Y > x) = 1 -lP'(X > x)lP'(Y > x)
= 1- 2-x . 2-x = 1- 4-x.

(b) lP'(Y >X)= lP'(Y < X) by symmetry. Also


lP'(Y > X)+ lP'(Y < X)+ lP'(Y =X)= 1.

Since

lP'(Y =X)= LlP'(Y =X =X)= Lz-x .z-x = ~,


X

we have that lP'(Y > X)= ~


(c) ~ by part (b).
00

(d)

lP'(X :::: kY) = L JP>( X :::: kY, y = y)


y=1
00

00

= LlP'(X:::: ky, y = y) = 2::JP>(X:::: ky)lP'(Y = y)


y=1
y=1
00
00
2
- " ' " ' 2-ky-xz-y - ..,...,.-;-.......-- L..., L...,
- 2k+ 1 - 1
y=1 x=O

00

(e)

00

00

lP'(X divides Y) = L lP'(Y = kX) = L L lP'(Y = kx, X= x)


k=1
k=1x=1
=

~~z-kxz-x

L..., L...,

k=1x=1

L..., 2k+ 1 -

k=1

(f) Let r = mfn where m and n are coprime. Then


00
00
1
lP'(X = rY) = L lP'(X = km, y = kn) = L 2-km2-kn = m+n- .
2
1
k=1
k=1

3.

(a) We have that


1P'(X1 < Xz < X3) =

(1- P1)(1- pz)(l- P3)P~- 1 p~- 1 p~- 1

i<j<k

= L:o - P1)0

. 1

. 1

- pz)p~- P~- P~

i<j

(1- P1)(1- pz)p~- 1 (PzP3)i P3

1- P2P3

(1- P1)(1- pz)pzp~


(1 - P2P3)(l- P1P2P3)

159

[3.2.4]-[3.2.5]

Solutions

Discrete random variables

(1 - pl)(l - P2)
(1- P2P3)(1 - P1P2P3).

4. (a) Either substitute P1 = P2 = P3 = ~ in the result of Exercise (3b), or argue as follows, with
the obvious notation. The event {A < B < C} occurs only if one of the following occurs on the first
round:
(i) A and B both rolled 6,
(ii) A rolled 6, B and C did not,
(iii) none rolled 6.
Hence, using conditional probabilities,

In calculating lP'(B < C) we may ignore A's rolls, and an argument similar to the above tells us that
1
lP'(B <C)= ( 65)2 lP'(B <C)+ 6

Hence lP'(B < C) = 161 , yielding lP'(A < B < C) = ?ob61 .


(b) One may argue as above. Alternatively, let N be the total number of rolls before the first 6 appears.
The probability that A rolls the first 6 is
lP'(N E {1, 4, 7,

00

})

(~)k- 1 g=

k=1,4,7, ...

Once A has thrown the first 6, the game restarts with the players rolling in order BCABCA .... Hence
the probability that B rolls the next 6 is ~~ also, and similarly for the probability that C throws the
third 6. The answer is therefore ( ~~) 3 .
5. The vector (- Xr : 1 ,::: r ,::: n) has the same joint distribution as (Xr : 1 ,::: r ,::: n), and the claim
follows.

Let X+ 2 and Y

+ 2 have joint mass function f, where /i,j


1
12
1
6
1
12

(! Y)
6

is the (i, j)th entry in the matrix

1,::: i,j.::: 3.

1
12

Then
lP'(X = -1) = lP'(X = 1) = lP'(Y = -1) = lP'(Y = 1) = ~.
lP'(X + Y = -2) =

g # /2 = lP'(X +

160

lP'(X = 0) = lP'(Y = 0) = ~.

Y = 2).

Expectation

Solutions [3.3.1]-[3.3.3]

3.3 Solutions. Expectation


1.

(a) No!

(b) Let X have mass function:

f(-1)

lE (X) -- -91

= ~, f(~:) = ~,

+ 92 + 98 --

f(2) =~Then

1 -- - 91

+ 98 + 92

-- lE( 1/ X )

2.

(a) If you have already j distinct types of object, the probability that the next packet contains
a different type is (c- j)jc, and the probability that it does not is jjc. Hence the number of days
required has the geometric distribution with parameter (c- j)jc; this distribution has mean cj(c- j).
(b) The time required to collect all the types is the sum of the successive times to collect each new
type. The mean is therefore
c

c-1

L:-c-.=cL~

j=O c- 1

3.

k=1

(a) Let lij be the indicator function of the event that players i and j throw the same number. Then
JE(/ij) = IJI'(lij =

o=

L: (~) 2 =

~,

:~

j.

i=1

The total score of the group is S = L:i <j lij, so


JE(S) =

L
lE(lij) = ~ (n).
. .
6 2
I<]

We claitn that the family {lij : i < j} is pairwise independent. The crucial calculation for this is
as follows: if i < j < k then
6

JE(/ij ljk) = IJI'(i, j, and k throw same number) =

L (~) 3 = l6 = JE(/ij )lE(/jk).


r=1

Hence
var(S) = var (

~ lij) = ~ var(lij) =
I<]

(;) var(/12)

I<]

by symmetry. But var(/12) = ~ (1 - ~).


(b) Let Xij be the common score of players i and j, so that Xij = 0 if their scores are different. This
time the total score isS = L:i<j Xij, and
lE(S) = (;)lE(X12) = (;)

~~=

7
12 (;).

The Xij are not pairwise independent, and you have to slog it out thus:

161

[3.3.4]-[3.4.1]

Solutions

Discrete random variables

4. The expected reward is I;~ 1 2-k 2k = oo. If your utility function is u, then your 'fair' entrance
fee is I;~ 1 2-ku(2k). For example, if u(k) = c(l - k-a) fork:::: 1, where c, a > 0, then the fair
fee is

c ~ 2-k(l- rak) = c
L

(1- 1 ).

k=1

2a+1 _ 1

This fee is certainly not 'fair' for the person offering the wager, unless possibly he is a noted philanthropist.

5.

We have that lE(Xa) = I;~ 1 xa f{x(x + 1)}, which is finite if and only if a < 1.

6.

Clearly
var(a +X)= lE( {(a+ X) -JE(a +X) } 2 ) = lE({X -JE(X)} 2 ) = var(X).

7. For each r, bet {1 + n(r)}- 1 on horse r. If the rth horse wins, your payoff is {n(r) + 1}{1 +
n(r)}- 1 = 1, which is in excess of your total stake I;k{n(k) + 1}- 1.

8. We may assume that: (a) after any given roll of the die, your decision whether or not to stop
depends only on the value V of the current roll; (b) if it is optimal to stop for V = r, then it is also
optimal to stop when V > r.
Consider the strategy: stop the first time that the die shows r or greater. Let S (r) be the expected
score achieved by following this stategy. By elementary calculations,
S(6) = 6 IP'(6 appears before 1) + 1 IP'(l appears before 6) =

i,

l,

and similarly S(5) = 4, S(4) = 4, S(3) = 1 S(2) = ~ The optimal strategy is therefore to stop at
the first throw showing 4, 5, or 6. Similar arguments may be used to show that 'stop at 5 or 6' is the
rule to maximize the expected squared score.

9.

Proceeding as in Exercise (8), we find the expected returns for the same strategies to be:
S(6) = ~ - 3c,

S(5) = 4- 2c,

S(4) = 4- ~c,

S(3) =

.!_f- ~c,

S(2) = ~ -c.

If c = ~,it is best to stop when the score is at least 4; if c = 1, you should stop when the score is at
least 3. The respective expected scores are ~ and

3.4 Solutions. Indicators and matching


1.

Let Ij be the indicator function ofthe event that the outcome of the (j + l)th toss is different from

the outcome of the jth toss. The number R of distinct runs is given by R = 1 +

L;J;;:;i Ij. Hence

lE(R) = 1 + (n- l)JE(/t) = 1 + (n- 1)2pq,


where q = 1- p. Now remark that Ij and hare independent if /j- kl > 1, so that

JE{(R- 1)2 } = lE{ (

n-1

~ Ij)

2}

= (n -l)lE(lt) + 2(n- 2)lE(I1h)

]=1
+ { (n -1) 2 - (n- 1)- 2(n- 2) }lE(/1) 2 .

162

Indicators and matching

Solutions

[3.4.2]-[3.4.5]

NowlE(ft) = 2pq andlEUth) = p 2 q + pq 2 = pq, and therefore


var(R) = var(R- 1) = (n- 1)1E(/t) + 2(n- 2)1EUth)- { (n- 1) + 2(n- 2) }lEUt) 2
= 2pq(2n- 3- 2pq(3n- 5)).

2. The required total is T = I:f=t X;, where X; is the number shown on the ith ball. Hence
JE(T) = klE(X t) = ~k(n + 1). Now calculate, boringly,

lE{ (

2
X;) } = klE(Xf) + k(k- 1)lE(XtX2)

l=l

k ~ 2
= - L...,l
n 1
=

k(k -1)2"' ..
+ n(n
L...,l]
- 1) . .
l>j

~ { 1n(n +

1)(n + 2) -

~n(n +

1)}

k(k- 1) n

n(n-1)~j{n(n+1)-j(j+l)}
J=l

= ik(n + 1)(2n + 1) +

l2 k(k- 1)(3n + 2)(n + 1).

Hence
var(T) = k(n +
3.

o{ i(2n + 1) + tz(k- 1)(3n + 2)- !k(n + 1)} = tz(n + 1)k(n- k).

Each couple survives with probability

so the required mean is

(1 - ~)
(1 - _
m) .
2n
2n-1

4. Any given red ball is in urn R after stage k if and only if it has been selected an even number of
times. The probability of this is

and the mean number of such red balls is n times this probability.
5.

Label the edges and vertices as in Figure 3.1. The structure function is
;(X)= X5 + (1- Xs){ (1- X1)X4[X3 + (1- X3)X2X6]
+Xt [x2 + (1- X2)(X3(X6 + X4(1- X6)))J}.

163

[3.4.6]-[3.4.9]

Solutions

Discrete random variables

Figure 3.1. The network with sources and sink t.


For the reliability, see Problem (1.8.19a).

6. The structure function is Irs:o:k) the indicator function of {S ::: k} where S = 2:~= 1 Xc. The
reliability is therefore 2:7=k (7) pi (1 - p )n-i.

7. Independently colour each vertex livid or bronze with probability ~ each, and let L be the random
set of livid vertices. Then ENL = ~ IE 1. There must exist one or more possible values of N L which
are at least as large as its mean.
8.

Let Ir be the indicator function that the rth pair have opposite polarity, so that X = 1 + 2:~~{ Ir.

We have that lP'(/r = 1) = ~and lP'(/r = Ir+1 = 1) =

!. whence EX= ~(n + 1) and var X=

t(n-1).

9.

(a) Let Ai be the event that the integer i remains in the ith position. Then

= n 1- -

(n)
2

1
++(-1) n-1 -1.
n(n- l)
n!

Therefore the number M of matches satisfies


1
1
lP'(M = 0) = - - 2!
3!

+ + (-1)

n 1

-.

n!

Now

lP'(M = r) =
=

(~ )lP'(r given numbers match, and the remaining n- rare deranged)


n!
(n-r)!(l
1
n-r
- - ---++(-1)
-1- )
r!(n-r)!
n!
2!
3!
(n-r)!

(b)
n+1

dn+1 =

L #{derangements with 1 in the rth place}

r=2

= n{#{derangements which swap 1 with 2}


+#{derangements in which 1 is in the 2nd place and 2 is not in the 1st place}}
= ndn-1

+ ndn,
164

Solutions

Dependence

[3.5.1]-[3.6.2]

where#A denotes the cardinality of the setA. By rearrangement, dn+l- (n+ 1)dn = -(dn -ndn-1).
Set Un = dn - ndn-1 and note that u2 = 1, to obtain Un = (-l)n, n 2: 2, and hence
n!
n!
dn = - - 2!
3!

+ .. + (-1) nn!
-.
n!

Now divide by n! to obtain the results above.

3.5 Solutions. Examples of discrete variables


1. There are n! I (n 1 ! n2! n 1 !) sequences of outcomes in which the ith possible outcome occurs
n; times for each i. The probability of any such sequence is p~ 1 p~ 2 p71 , and the result follows.

2.

The total number H of heads satisfies


IJI'(H = x) =

oo

~ IJI'(H

= x

oo (

I N = n)IJI'(N = n) = ~ : px (1 - p)n-x

+
An

-J,.

(Ap )x e-J,.p oo {A(l _ p) }n-x e-J,.(I- p)


=

x!

n=x

(n - x)!

The last summation equals 1, since it is the sum of the values of the Poisson mass function with
parameter A(l - p).
3.

dpn/dA = Pn-1- Pn where P-1 = 0. Hence (d/dA)IJI'(X ::0 n) = Pn(A).

4. The probability of a marked animal in the nth place is ajb. Conditional on this event, the chance
of n - 1 preceding places containing m - 1 marked and n - m unmarked animals is
a-1) (b-a) /(b-1),
( m-1
n-m
n-1
as required. Now let Xj be the number of unmarked animals between the j - 1th and jth marked
animals, if all were caught. By symmetry, lEXj = (b- a)j(a + 1), whence lEX= m(lEXt + 1) =
m(b + 1)/(a + 1).

3.6 Solutions. Dependence


1. Remembering Problem (2.7.3b), it suffices to show that var(aX +bY) < oo if a, b
var(X), var(Y) < oo. Now,

JR. and

var(aX +bY)= lE ( {aX+ bY -lE(aX +bY) } 2)

+ 2ab cov(X, Y) + b2 var(Y)


::0 a2 var(X) + 2ab)var(X) var(Y) + b 2 var(Y)

= a 2 var(X)

= ( a)var(X)

+ b)var(Y) ) 2

where we have used the Cauchy-Schwarz inequality (3.6.9) applied to X - JE(X), Y - JE(Y).
2. Let N; be the number of times the ith outcome occurs. Then N; has the binomial distribution
with parameters n and Pi.

165

[3.6.3]-[3.6.8]
3.

Solutions

Discrete random variables

For x = 1, 2, ... ,
00

lP'(X =x) = LlP'(X =x, Y = y)


y=l
00

=,?;

C{

(x + y- l)(x + y) - (x + y)(x + y + 1)

C
2x(x + 1)

C(l
~

=2

1)

- x +1

'

and hence C = 2. Clearly Y has the same mass function. Finally E(X) = I;~ 1 (x + 1)- 1 = oo, so
the covariance does not exist.

4.

Max{u, v} = i (u + v) + i lu -vi, and therefore


E( max{X 2, Y 2 }) = iE(X 2 + Y 2 ) + iEI(X- Y)(X +

Y)l

1 + h/E((X- Y)2)E((X + Y)2)

= 1 + h/(2-2p)(2+2p) = 1 +

p,

where we have used the Cauchy-Schwarz inequality.

5.

(a) logy

y - 1 with equality if and only if y = 1. Therefore,

E(1og fx(X)
fy (X)) < E[fy (X) - 1] = 0
fx(X)
'
with equality if and only iffy = fx.
(b) This holds likewise, with equality if and only if f (x, y) = fx (x) fy (y) for all x, y, which is to
say that X and Y are independent.

6. (a) a+ b + c = E{/{X>f} + l(Y>Z) + /{Z>XJ} = 2, whence rnin{a, b, c} ~ ~ Equality is


attained, for example, if the vector (X, Y, Z) takes only three values with probabilities f (2, 1, 3) =
/(3, 2, 1) = f(l, 3, 2) =
(b) lP'(X < Y) = lP'(Y < X), etc.
(c) We have that c =a= p and b = 1- p 2 . Also suprnin{p, (1- p 2)} = icJS- 1).

7.

We have for 1

fx(x) =

x ~ 9 that

y=O

log (1 +

1
lOx+ y

) =log

IT (1

y=O

1
lOx+ y

By calculation, EX ::::::: 3.44.

8.

00

(1) fx(J) = c"'""'


L
k=O

(ii) 1 =

. ak

-.,a lk'
-

J.

kai ak}
+k' .,
. J.

ea(j + a)ai
= c-----,-.,
.
J.

L fx(j) = 2ace a, whence c = e- a/(2a).


2

(ii) fx+r(r) =

r
crar
crar2r
L
.,
(
_
.),
= - - ,-, r ~ 1.
j=O J. r
J .
r.

166

) =log (1 +

_!_).
x

Solutions [3.7.1]-[3.7.4]

Conditional distributions and conditional expectation

~ cr(r- 1)(2a)r
1
(iii) E(X + Y- 1) = L...J
= 2a. Now E(X) = E(Y), and therefore E(X) =a+ -z
1

r.

r= 1

3.7 Solutions. Conditional distributions and conditional expectation


1.

(a) We have that

E(aY + bZ I X= x) = L(ay + bz)JP>(Y = y, Z = z

I X=

x)

y,z

=a LYIP'(Y = y, Z = z

I X=

I X=

x) + b EzJP>(Y = y, Z = z

x)

y,z

y,z

=aEyJP>(Y=y

I X=x)+bEzJP>(Z=z I X=x).

Parts (b)-(e) are verified by similar trivial calculations. Turning to (f),

E{E(Y

X,Z)IX~x} ~ ~ { ~yP(Y~y I X ~x,Z ~z)P(X~x.z~z I X ~x)}


= L LY JP>(Y = y, X = x, Z = z) . JP>(X = x, Z = z)
z Y
JP>(X=x,Z=z)
JP>(X=x)

IX

= LY JP>(Y = y

= X) = E(Y

IX

= X)

= m:{ E(Y I X) I X= x, Z = z }.

by part (e).

2. If and 1{1 are two such functions then E( (c/J(X) -l{I(X))g(X)) = 0 for any suitable g. Setting
g(X) = l(X=x! for any x E lR such that JP>(X = x) > 0, we obtain (x) = l{l(x). Therefore
JP>(cp(X) = l{I(X)) = 1.
3. We do not seriously expect you to want to do this one. However, if you insist, the method is to
check in each case that both sides satisfy the appropriate definition, and then to appeal to uniqueness,
deducing that the sides are almost surely equal (see Williams 1991, p. 88).

4.

The natural definition is given by

Now,
var(Y)

= E({Y- EY}2) = E
= E(var(Y

I X))

[m:( {Y- E(Y I X)+ E(Y I X)- EY} 2

+ var (E(Y

X)]

I X))

since the mean ofE(Y I X) is EY, and the cross product is, by Exercise (le),

2m:[m:( {Y- E(Y 1 X) }{E(Y 1 X)- EY} x) J


1

=2m:[ {E(Y
167

X)- EY}E{Y- E(Y

X)

x}] = o

[3.7.5]-[3.7.10]

Solutions

Discrete random variables

since lE{Y -JE(Y I X)/ X}= lE(Y I X) -JE(Y I X)= 0.

5.

We have that

L IP'(T > t + r I T > t) = r=O


L IP'(TIP'(T> >t +t) r) .
r=O
00

JE(T - t I T > t) =

00

N-t N- t - r

(a)

JE(T- t I T > t) =

N- t

= ~(N- t

+ 1).

r=O
oo 2-(t+r)

(b)

lE(T - t I T > t) =

2=t

= 2.

r=O

6.

Clearly
lE(SIN=n)=lE(txi) =Jm,
1=1

and hence JE(S I N) = J-tN. It follows that JE(S) = lE{JE(S I N)} =

lE(~-tN).

7. A robot passed is in fact faulty with probability n = {(1 - 8)}/(1- 8). Thus the number of
faulty passed robots, given Y, is bin(n- Y, n), with mean (n- Y){(1 - 8)}/(1 - 8). Hence
lE(X I Y) = Y

(n - Y).4.(1 - 8)
'+'
.
1-8

8. (a) Let m be the family size, 4>r the indicator that the rth child is female, and J-tr the indicator of
a male. The numbers G, B of girls and boys satisfy
m

B =

J-tr,

lE(G) = ~m = JE(B).

r=1

(It will be shown later that the result remains true for random m under reasonable conditions.) We
have not used the property of independence.
(b) With M the event that the selected child is male,

lE(G I M) = lE (

m-1

4>r

= ~(m- 1) = JE(B).

r=1

The independence is necessary for this argument.

9. Conditional expectation is defined in terms of the conditional distribution, so the first step is not
justified. Even if this step were accepted, the second equality is generally false.
10. By conditioning on Xn-1
lEXn = lE[lE(Xn I Xn-1)] = lE[p(Xn-1

+ 1) + (1- p)(Xn-1 + 1 + Xn)]

where Xn has the same distribution as Xn. Hence lEXn = (1


lEX1=p- 1.

168

+ lEXn-1)/p.

Solve this subject to

Solutions

Sums of random variables

[3.8.1)-[3.8.5)

3.8 Solutions. Sums of random variables


1.

By the convolution theorem,


lP'(X + Y

= z) = LlP'(X = k)lP'(Y = z- k)
k

k+1

if 0 ::=: k ::=: m 1\ n,

(m + 1)(n + 1)

+1
+ 1)(n + 1)

(m 1\ n)
(m

if m 1\ n < k < m v n,

m+n+1-k

if m v n ::=: k ::=: m + n,

(m + 1)(n + 1)

where m 1\ n

2.

= min{m, n} and m v n = max{m, n}.

If z:::: 2,
lP'(X

L lP'(X = k, Y = z- k) = z(z c+ 1) .
00

+ Y = z) =

k=1

Also, if z

:::: 0,
00

lP'(X- Y

= z) = L

= k + z, Y = k)

lP'(X

k=1

00

- c 2.::: -::-:---,.,--,:-:---,--:-::-:-----:-:k= 1 (2k

+ z- 1)(2k + z)(2k + z + 1)

00

= ic {;
,;C

By symmetry, if z ::=: 0, lP'(X- Y


3.

(2k + z- 1)(2k + z) -

oo

(2k + z)(2k + z + 1)

(-l)r+1

~ (r + z)(r + z +

1)

= z) = lP'(X- Y = -z) = lP'(X- Y =

z-1

La(l-al-1{3(1-{J)z-r-1=
~1

aR{(1
P

(1

R)Z-1
-p

lzl).

a)z-1}
-

a-{J

4. Repeatedly flip a coin that shows heads with probability p. Let Xr be the number of flips after
the r- 1th head up to, and including, the rth. Then Xr is geometric with parameter p. The number
of flips Z to obtain n heads is negative binomial, and Z = I:~= 1 X r by construction.
5. Sam. Let Xn be the number of sixes shown by 6n dice, so that Xn+1
same distribution as X 1 and is independent of X n. Then,

= Xn +

Y where Y has the

lP'(Xn+1

::=:::

+ 1) = LlP'(Xn

::=:::

+ 1- r)lP'(Y = r)

r=O
6

= lP'(Xn

::=:::

n)

+ L[lP'(Xn

::=:::

+ 1- r) -lP'(Xn

::=:::

n)]lP'(Y

= r).

r=O
We set g(k) = lP'(Xn = k) and use the fact, easily proved, that g(n) :::: g(n- 1) :::: :::: g(n- 5) to
find that the last sum is no bigger than
6
g(n) L ( r - 1)1P'(Y = r) = g(n)(lE(Y)- 1).

r=O

169

[3.8.6]-[3.9.2]

Solutions

Discrete random variables

The claim follows since lE(Y) = 1.


oo

6.

oo

( ) -J..

(i)LHS='2:ng(n)e-J..A.nfn!=A.L~ne

1 A.n-

1 =RHS.

n=1 n - 1).

n=O

(ii) Conditioning on Nand XN,


LHS = lE(lE(Sg(S)
=

I N)) = lE{ NlE( XN g(S) IN)}

L (:-J..A.n
_1)! 1xlE (g (n-1
L Xr +x )) dF(x)
n

=A.

r=1

xlE(g(S +x)) dF(x) = RHS.

3.9 Solutions. Simple random walk


1. (a) Consider an infinite sequence of tosses of a coin, any one of which turns up heads with
probability p. With probability one there will appear a run of N heads sooner or later. If the coin
tosses are 'driving' the random walk, then absorption occurs no later than this run, so that ultimate
absorption is (almost surely) certain. LetS be the number of tosses before the first run of N heads.
Certainly JP>(S > Nr) :::: (1- pNy, since Nr tosses may be divided into r blocks of N tosses, each
of which is such a run with probability pN. Hence JP>(S = s) :::: (1 - pN) LsI NJ, and in particular
lE(Sk) < oo for all k :::: 1. By the above argument, lE(Tk) < oo also.

2.

If So = k then the first step X 1 satisfies


JP>(X 1 = 1 1 W) = JP>(X1 = 1)JP>(W 1 x1 = 1) = PPk+1.
lP'(W)
Pk

Let T be the duration of the walk. Then

= lE(T I So = k, W)
= lE(T I So= k, W, X 1 = 1)1P'(X 1 = 1 I So= k, W)

+ lE(T I So = k, W, X 1 = -1)1P'(X 1 = -1 I So =
(1 + h+ 1) Pk+1P + (1 + lk-I) ( 1 - Pk+1P)
Pk

= 1

+ PPk+1Jk+1 +
Pk

k, W)

Pk

(Pk- PPk+!) h-1'


Pk

as required.
Certainly Jo = 0. If p =

1then Pk = 1- (k/ N), so the difference equation becomes

(N- k- 1)h+1 - 2(N- k)h

+ (N- k + 1)h-1

= 2(k- N)

for 1 :::: k :::: N- 1. Setting Uk = (N- k)h, we obtain


uk+1 - 2uk

+ Uk-1

= 2(k- N),

with general solution Uk =A+ Bk -1(N- k) 3 for constants A and B. Now uo =UN = 0, and
therefore A= 1N3 , B

= -1N2 , implying that Jk

= 1{N 2 - (N- k) 2 }, 0:::: k < N.

170

Solutions [3.9.3]-[3.10.3]

Random walk: counting sample paths

3. The recurrence relation may be established as in Exercise (2). Set uk = (pk- pN)h and use
the fact that Pk = (pk- pN)/(1- pN) where p = q jp, to obtain
PUk+1 - (1- r)uk

+ quk-1 = PN- pk.

The solution is
Uk

=A+ Bpk + k(pk + pN),


p-q

for constants A and B. The boundary conditions, uo = u N = 0, yield the answer.

4.

Conditioning in the obvious way on the result of the first toss, we obtain
Pmn

= PPm-1,n + (1- P)Pm,n-1

ifm,n:::: 1.

The boundary conditions are PmO = 0, POn = 1, if m, n :::: 1.


5.

Let Y be the number of negative steps of the walk up to absorption. Then lE(X + Y) = Dk and
N - k if the walk is absorbed at N,
X-Y= {

-k

if the walk is absorbed at 0.

Hence lE(X - Y) = (N - k)(1 - Pk) - kpk. and solving for lEX gives the result.

3.10 Solutions. Random walk: counting sample paths


1.

Conditioning on the first step X 1,


lP'(T = 2n) = -!lP'(T = 2n

I x1

= -!f-1(2n -1)

= 1)

+ -!lP'(T =

2n

I x1

= -1)

+ -!f1(2n -1)

where fb(m) is the probability that the first passage to b of a symmetric walk, starting from 0, takes
place at time m. From the hitting time theorem (3.10.14),

f1 (2n- 1) = f-1 (2n- 1) = _1_lP'(S2n-1 = 1) = _1_ (2n- 1) T(2n-1),


2n-1

2n-1

which therefore is the value oflP'(T = 2n).


For the last part, note first that
lP'(T = 2n) = 1, which is to say that lP'(T < oo) = 1; either
appeal to your favourite method in order to see this, or observe that lP'(T = 2n) is the coefficient of
s 2n in the expansion ofF (s) = 1 - ~- The required result is easily obtained by expanding the
binomial coefficient using Stirling's formula.

I:f

2.

By equation (3.10.13) ofPRP, for

r:::: 0,

+ 1)
+ 1) + lP'(Sn = r)- 2lP'(Sn :::: r + 2) -lP'(Sn = r + 1)
= lP'(Sn = r) + lP'(Sn = r + 1)
= max{lP'(Sn = r), lP'(Sn = r + 1)}

lP'(Mn = r) = lP'(Mn:::: r) -lP'(Mn:::: r

= 2lP'(Sn :::: r

since only one of these two terms is non-zero.


By considering the random walk reversed, we see that the probability of a first visit to S2n at time
2k is the same as the probability of a last visit to So at time 2n - 2k. The result is then immediate

3.

from the arc sine law (3.10.19) for the last visit to the origin.

171

[3.11.1]-[3.11.4]

Solutions

Discrete random variables

3.11 Solutions to problems


(a) Clearly, for all a, b E JR,

1.

lP'(g(X) =a, h(Y) =b) =

lP'(X=x,Y=y)
x,y:
g(x )=a,h(y )=b

lP'(X

= x)lP'(Y = y)

x,y:
g(x )=a,h(y )=b

lP'(X = x)

L
x:g(x)=a

lP'(Y = y)

L
y:h(y)=b

= lP'(g(X) = a)lP'(h(Y) =b).


(b) See the definition (3 .2.1) of independence.
(c) The only remaining part which requires proof is that X andY are independent if fx.r(x, y) =
g(x)h(y) for all x, y E JR. Suppose then that this holds. Then
fx(x) = L

fx.r(x, y) = g(x) Lh(y),

fy(y)

=L

fx.r(x, y)

= h(y) Lg(x).

Now

1 = Lfx(x) = Lg(x) Lh(y),


X

so that
fx(x)fy(y)

= g(x)h(y) L

g(x) L

h(y)

2. If E(X 2 ) = I:x x 21P'(X = x) = 0 then lP'(X


Therefore, if var(X) = 0, it follows that lP'(X -EX

3.

= g(x)h(y) = fx.r(x, y).

= x) = 0 for x
= 0) = 1.

=/= 0. Hence lP'(X

= 0) =

1.

(a)

E(g(X))

= LYlP'(g(X) = y) = L
Y

ylP'(X

Y x:g(x)=y

= x) = Lg(x)lP'(X = x)
x

as required.
(b)

E(g(X)h(Y))

=L

g(x)h(y)fx.r(x, y)

by Lemrna(3.6.6)

x,y

= Lg(x)h(y)fx(x)fy(y)

by independence

x,y

= Lg(x)fx(x) Lh(y)fy(y) = E(g(X))E(h(Y)).


X
y

= fy(i) = ~fori = 1, 2, 3.
(b) (X+ Y)(w,) = 3, (X+ Y)(wz) = 5, (X+ Y)(w3) = 4, and therefore fx+Y(i) = ~fori = 3, 4, 5.
(c) (XY)(w!) = 2, (XY)(wz) = 6, (XY)(w3) = 3, and therefore fxy(i) = ~fori = 2, 3, 6.

4.

(a) Clearly fx(i)

172

Problems

Solutions

(d) Similarly fx;r(i) =~fori=

and similarly frlz(3 I 2) =

(a)

lP'(Y = 2, Z = 2)
lP'(wJ)
lP'(Z = 2)
= lP'(wJ u W2) =

I 3) = fz1rO I 1) = 1.

1} = k, and therefore k =
L k = k n=l
L {1-n - -n=i n(n + 1)
n+1
00

2'

i. fr1z(l 11) = 1, and other values are 0.

(f) Likewise fzlr(2 I 2) = fzlr(2

5.

i. ~. 3.

frlz(2 I 2) =

(e)

[3.11.5]-[3.11.8]

00

1.

(b) 2::~ 1 kna = kl;(-a) where 1; is the Riemann zeta function, and we require a< -1 for the sum
to converge. In this case k = 1; (-a) -I .
6.

(a) We have that


n

lP'(X + Y = n) =

L lP'(X = n -

k)lP'(Y = k) =

k=O

-A.;..n-k

e-JLJ-tk

_e-..,.-- . - k=O (n- k)!


k!

e-A.-JL ~ (n) n-k k


e-A.-JL().. + J-t)n
=--LJ
A.
1-t = - - - - - n! k=O k
n!

lP'(X = k, X+ Y = n)
lP'(X = k I X+ Y = n) = --lP'-(X_+_Y_=_n_)_

(b)

= lP'(X = k)lP'(Y = n- k) = (n) ;..k/-tn-k .


lP'(X + Y = n)
k (A.+ J-t)n
Hence the conditional distribution is bin(n, A./(A. + J-t)).
7.

(i) We have that

lP'(X=n+k,X >n)
lP'(X=n+k I X >n)= - - - - - - lP'(X > n)
)n+k-1
(1
P - P
.
= p(1 - p)k-i = lP'(X = k).
L:~n+i p(1- p)J-1

(ii) Many random variables of interest are 'waiting times', i.e., the time one must wait before the
occurrence of some event A of interest. If such a time is geometric, the lack-of-memory property
states that, given that A has not occurred by time n, the time to wait for A starting from n has the same
distribution as it did to start with. With sufficient imagination this can be interpreted as a failure of
memory by the process giving rise to A.
(iii) No. This is because, by the above, any such process satisfies G(k + n) = G(k)G(n) where
G(n) = lP'(X > n). Hence G(k + 1) = G(li+ 1 and X is geometric.
8.

Clearly,
k

lP'(X + y = k) = LlP'(X = k- j, y = j)
j=O
=

t (:
j=O

.)Pk-jqm-k+j
J

m)

(~)pjqn-j
J

k m+n-k LJ ( k
pq
. (n)
.
j=O
- J
J

173

k m+n-k
pq

(m k+ n)

[3.11.9]-[3.11.13]

Discrete random variables

+ n, p).

which is bin(m

9.

Solutions

Turning immediately to the second request, by the binomial theorem,

as required. Now,
IP(N even) =

L (~) pk (1 _ p )n-k
k even

= i{ (p + 1 -

P)n

+ (1

- P - P )n}

= ~ {1 + (1 -

2p )n}

in agreement with Problem (1.8.20).


10. There are

(%)

ways of choosing k blue balls, and (~~f) ways of choosing n- k red balls. The

total number of ways of choosing n balls is (~),and the claim follows. Finally,

IP(B

= k) =

n)
b!
(N- b)!
(N- n)!
k (b- k)! . (N- b- n + k)! .
N!

=(~){~b~1b-~+1}
x { N-; b

~ ( ~) pk(1 -

... N- b-;

as N

p)n-k

+ k + 1} { ~ ...

N-;

+ 1} -I

~ oo.

11. Using the result of Problem (3.11.8),

12. (a) E(X) = c + d, E(Y) = b + d, and E(XY) = d, so cov(X, Y) = d- (c + d)(b +d), and X
and Y are uncorrelated if and only if this equals 0.
(b) For independence, we require f (i, j) = JP( X = i )JP( Y = j) for all i, j, which is to say that

a=(a+b)(a+c),

b=(a+b)(b+d),

c=(c+d)(a+c),

d=(b+d)(c+d).

Now a+ b + c + d = 1, and with a little work one sees that any one of these relations implies the
other three. Therefore X and Yare independent if and only if d = (b + d)(c +d), the same condition
as for uncorrelatedness.

13. (a) We have that


oo m-!

oo

E(X)

m=O

m!P(X

= m) =

L L

oo

JP(X

= m) =

m=O n=O

oo

L L

n=Om=n+!

174

oo

JP(X

= m) = LIP(X
n=O

> n).

Solutions [3.11.14]-[3.11.14]

Problems
(b) First method. Let N be the number of balls drawn. Then, by (a),
r

E(N) =

L W'(N > n) =LIT>(first n balls are red)

n=O
n=O
r
r
r- 1
r- n + 1
r
r!
(b + r- n)!
=.L:b+rb+r-lb+r-n+l =.L:(b+r)! (r-n)!
n=O
n=O
=

t (n

(b+r)! n=O

+b) = b + r + 1
b
b+ 1 '

where we have used the combinatorial identity I:~=O


the simple identity (r.:I)
summation, we find that

+ G)

f>rt

r=O

n=O

(nbb)

rt!ti).

To see this, either use

(x~I) repeatedly, or argue as follows. Changing the order of

(n+b)
b

_ 1 f=xn(n+b)
l - X n=O
b

= (1- x)-(b+2) =

f=xr
r=O

(b + ++ 1)
r

by the (negative) binomial theorem. Equating coefficients of xr, we obtain the required identity.
Second method. Writing m(b, r) for the mean in question, and conditioning on the colour of the first
ball, we find that
b
r
m(b,r) = - - + {1 +m(b,r -1)}--.
b+r
b+r
With appropriate boundary conditions and a little effort, one may obtain the result.
Third method. Withdraw all the balls, and let Ni be the number of red balls drawn between the i th and
(i + l)th blue ball (No = N, and Nb is defined analogously). Think of a possible 'colour sequence'
as comprising r reds, split by b blues into b + 1 red sequences. There is a one-one correspondence
between the set of such sequences with No = i, Nm = j (for given i, j, m) and the set of such
sequences with No = j, Nm = i; just interchange the 'Oth' red run with the mth red run. In particular
E(No) = E(Nm) for all m. Now No+ NI + + Nb = r, so that E(Nm) = r /(b + 1), whence the
claim is immediate.
(c) We use the notation just introduced. In addition, let Br be the number of blue balls remaining after
the removal of the last red ball. The length of the last 'colour run' is Nb + Br, only one of which is
non-zero. The answer is therefore rf(b + 1) + bf(r + 1), by the argument of the third solution to part
(b).

14. (a) We have that E(Xk) = Pk and var(Xk) = PkO- Pk), and the claims follow in the usual way,
the first by the linearity of][ and the second by the independence of the Xi; see Theorems (3.3.8) and
(3.3.11).
(b) Lets = L:k Pb and let Z be a random variable taking each of the values PI, P2 ... , Pn with
equal probability n-I. Now E(Z 2 ) - E(Z) 2 = var(Z) 2:: 0, so that

Lk

;;Pt 2::

Lk

1
;;Pk

)2 =

s2
n2

with equality if and only if Z is (almost surely) constant, which is to say that PI = P2 = = Pn
Hence

175

[3.11.15]-[3.11.18]

Solutions

Discrete random variables

with equality if and only if Pl = P2 = = Pn.


Essentially the same route may be followed using a Lagrange multiplier.
(c) This conclusion is not contrary to informed intuition, but experience shows it to be contrary to
much uninformed intuition.
15. A matrix V has zero determinant if and only if it is singular, that is to say if and only if there is a
non-zero vector x such that xVx' = 0. However,

Hence, by the result of Problem (3.11.2), L:k Xk(Xk- EXk) is constant with probability one, and the
result follows.
16. The random variables X+ Y and IX- Yl are uncorrelated since
cov(X + Y, IX- Yl) = E{ (X+ Y)IX- Yl}- E(X + Y)E(IX- Yl)
=

! + ! - 1. ~ =

0.

However,

! = lP'(X + Y = 0, IX- Yl = 0) =/= lP'(X + Y = O)lP'(IX- Yl = 0) = ! ~ = ~.


so that X + Y and IX - Y I are dependent.
17. Let h be the indicator function of the event that there is a match in the kth place. Then lP'(h =
1) = n -l , and for k =/= j,
lP'(h
Now X=

= 1, Ij = 1) = lP'(Ij = 1 I h = 1)1P'(h = 1) = -n(n- 1)

L:k=l h, so that E(X) = L:k=l n- 1 = 1 and


var(X)

= E(X 2)- (EX) 2 =

m:(t

h)

- 1

= tE(h) 2 +
1

LE(Ijh) -1

j#

=1

+2(n)

1
-1
2 n(n- 1)

= 1.

We have by the usual (mis)matching argument of Example (3.4.3) that

1 n-r (-1i
lP'(X = r) = - ' """"'
L.J - z.., '
r. i=O

0 _::: r _::: n- 2,

which tends to e- 1/r! as n--+ oo.


18. (a) Let Y1, Y2, ... , Yn be Bernoulli with parameter P2 and Z1, Z2, ... , Zn Bernoulli with parameter Pl / P2 and suppose the usual independence. Define Ai = Yi Zi, a Bernoulli random variable that
has parameter lP'(Ai = 1) = lP'(Yi = 1)1P'(Zi = 1) = Pl Now (AI, A2, ... , An) .::0 (Y1, Y2, ... , Yn)
so that /(A) _::: f(Y). Hence e(pl) = E(f(A)) _::: E(f(Y)) = e(p2)
(b) Suppose first that n = 1, and let X and X' be independent Bernoulli variables with parameter p.
We claim that
{J(X)- f(X')}{g(X)- g(X')} ~ 0;

176

Solutions [3.11.19]-[3.11.19]

Problems

to see this consider the three cases X = X', X < X', X > X' separately, using the fact that
are increasing. Taking expectations, we obtain

and g

E({f(X)- f(X')}{g(X)- g(X')}) 2:: 0,

which may be expanded to find that


0 :::; E(f(X)g(X)) - E(f(X')g(X)) - E(f(X)g(X'))
=

2{ E(f(X)g(X)) -

+ E(f(X )g(X
1

1 ))

E(f(X))E(g(X))}

by the properties of X and X'.


Suppose that the result is valid for all n satisfying n < k where k ::: 2. Now
m:(f(X)g(X)) = m: {E(f(X)g(X)

Ix 1, Xz, .. ., xk-1)} ;

here, the conditional expectation given X 1, Xz, ... , Xk-1 is defined in very much the same way as in
Definition (3.7.3), with broadly similar properties, in particular Theorem (3.7.4); see also Exercises
(3.7.1, 3). If X 1, X 2, ... , Xk-1 are given, then f(X) and g(X) may be thought of as increasing
functions of the single remaining variable Xko and therefore

E(f(X)g(X) X1, Xz, ... , Xk-1) ::: m:(f(X) X1, Xz, ... , Xk-1)m:(g(X) X1, Xz, ... , Xk-1)

by the induction hypothesis. Furthermore

!'(X)= E(f(X) X1, Xz, ... , Xk-1).

g'(X) = m:(g(X) X1, Xz, ... , xk-1),

are increasing functions of the k -1 variables X 1 , X 2, ... , X k-1 , implying by the induction hypothesis
that E(f' (X)g' (X)) ::: E(f' (X) )E(g' (X)). We substitute this into (*) to obtain
E(f(X)g(X)) 2:: E(f' (X))E(g' (X))= E(f(X))E(g(X))

by the definition of f' and g'.

19. Certainly R(p) = E(IA) =


Differentiating, we obtain

LUJ IA(w)IP'(w) and IP'(w)

= pN(UJ)qm-N(UJ) where p + q = 1.

Applying the Cauchy-Schwarz inequality (3.6.9) to the latter covariance, we find that R' (p) :::;
(pq)- 1Jvar(IA) var(N). However lA is Bernoulli with parameter R(p), so that var(IA) = R(p)(lR(p)), and finally N is bin(m, p) so that var(N) = mp(l- p), whence the upper bound for R'(p)
follows.
As for the lower bound, use the general fact that cov( X+ Y, Z) = cov( X, Z) +cov( Y, Z) to deduce
that cov(/A, N) = cov(/A, lA) + cov(IA, N- lA) Now lA and N- lA are increasing functions of
w, in the sense of Problem (3.11.18); you should check this. Hence cov(/A, N) 2:: var(/A) + 0 by the
result of that problem. The lower bound for R' (p) follows.

177

[3.11.20]-[3.11.21]

Solutions

Discrete random variables

20. (a) Let each edge be blue with probability P1 and yellow with probability P2; assume these two
events are independent of each other and of the colourings of all other edges. Call an edge green if it
is both blue and yellow, so that each edge is green with probability P1 P2 If there is a working green
connection from source to sink, then there is also a blue connection and a yellow connection. Thus
IP'(green connection) :::0 IP'(blue connection, and yellow connection)
= IP'(blue connection)IP'(yellow connection)
so that R(P1 P2) :::0 R(pt)R(p2).
(b) This is somewhat harder, and may be proved by induction on the number n of edges of G. If n = 1
then a consideration of the two possible cases yields that either R(p) = 1 for all p, or R(p) = p for
all p. In either case the required inequality holds.
Suppose then that the inequality is valid whenever n < k where k :::: 2, and consider the case
when G has kedges. Let e be an edge of G and write w(e) for the state of e; w(e) = 1 if e is working,
and w(e) = 0 otherwise. Writing A for the event that there is a working connection from source to
sink, we have that
R(pY) = IP'pr (A

I w(e) = 1)pY + IP'pr (A I w(e) = 0)(1- pY)

:::0 IP'p(A 1 w(e) = l)Y pY

+ IP'p(A

1 w(e)

= O)Y (1 - pY)

where IP'a is the appropriate probability measure when each edge is working with probability a. The
inequality here is valid since, if w(e) is given, then the network G is effectively reduced in size by one
edge; the induction hypothesis is then utilized for the case n = k - 1. It is a minor chore to check that
xYpY

+ yY(l- p)Y

:::0 {xp

+ y(l- p)JY

if X 0::: y 0::: 0;

to see this, check that equality holds when x = y :::: 0 and that the derivative of the left-hand side with
respect to x is at most the corresponding derivative of the right-hand side when x, y :::: 0. Apply the
latter inequality with x = IP'p(A I w(e) = 1) andy= IP'p(A I w(e) = 0) to obtain
R(pY) :::0 {IP'p(A

I w(e) = 1)p + IP'p(A I w(e) = 0)(1 - p) V =

R(p)Y.

21. (a) The number X of such extraordinary individuals has the bin(107 , 10- 7) distribution. Hence
lEX= 1 and

IP'(X > 1 I X > 1) = IP'(X > 1) = _1_-_IP'_c.(X_=_O_)_-_IP'_(X_=_1_)


IP'(X > 0)
1 - IP'(X = 0)
1- (1-10-7)107 -107 10-7(1-10-7)107-1
1- (1- 10-7)1o7

1- 2e- 1
1- e-1

:::::::

0.4.

(Shades of (3.5.4) here: X is approximately Poisson distributed with parameter 1.)


(b) Likewise
1 -1
1 - 2e -1 - ze
:::::::0.3.
IP'(X > 2 I X :::: 2) :::::::

1- 2e- 1

(c) Provided m

= 107,

IP'(X = m) =

N!
( 1) m (
1 ) N -m
e -1
1- ::::::: - ,
m!(N -m)! N
N
m!

178

Solutions [3.11.22]-[3.11.23]

Problems

the Poisson distribution. Assume that "reasonably confident that n is all" means that lP'(X > n I X ::::
n) :::0 r for some suitable small number r. Assuming the Poisson approximation, lP'(X > n) :::0 r lP'(X ::::
n) if and only if

00

00

e- 1 ~ - < re- 1 ~ - .
~ k!~k!
k=n+1

k=n

For any given r, the smallest acceptable value of n may be determined numerically. If r is small, then
very roughly n::::::: 1/r will do (e.g., if r = 0.05 then n::::::: 20).
(d) No level p of improbability is sufficiently small for one to be sure that the person is specified
uniquely. If p = w- 7 a, then X is bin(107 , w-7 a), which is approximately Poisson with parameter
a. Therefore, in this case,
lP'(X > 1 I X:::: 1):::::::

1 - e- 01 - ae- 01
= p,
1- e- 01

say.

An acceptable value of p for a very petty offence might be p ::::::: 0.05, in which case a : : : : 0.1 and so
p = w- 8 might be an acceptable level of improbability. For a capital offence, one would normally
require a much smaller value of p. We note that the rules of evidence do not allow an overt discussion
along these lines in a court of law in the United Kingdom.

22. The number G of girls has the binomial distribution bin(2n, p). Hence

lP'(G:::: 2n- G)= lP'(G:::: n) = 2n (2n)


k pkq2n-k
k=n

:::0 ( 2n)
n

= 0.485 and n =

(2n) p n q n - q- ,
n

q-p

er) :::0 enn) for all k.

where we have used the fact that


With p

~
~p k q 2n-k =
k=n

104 , we have using Stirling's formula (Exercise (3.10.1)) that

( 2n) Pn qn _ q - :::0

q- p

~ { (1 -

v (nrr)

0.03)(1

+ 0.03)} n 00.50135
.

104
0.515
= - ( 1- -9 )
< 1.23 x

3,Jir

lo4

w- 5 .

It follows that the probability that boys outnumber girls for 82 successive years is at least (1 - 1.23 x
w- 5 ) 82 :::: o.99899.

23. Let M be the number of such visits. If k =f. 0, then M :::: 1 if and only if the particle hits 0 before
it hits N, an event with probability 1- kN- 1 by equation (1.7.7). Having hit 0, the chance of another
visit to 0 before hitting N is 1 - N- 1, since the particle at 0 moves immediately to 1 whence there is
probability 1 - N- 1 of another visit to 0 before visiting N. Hence
lP'(M :::: r I So

= k) = (1 -

k) (1 - N1 )r-1 ,

r :::: 1,

so that
lP'(M = j

I So =

k) = lP'(M :::: j

I So =

k) - lP'(M :::: j

=(1-:)(1-~y-1~,
179

+ 1 I So =
j::::l.

0)

[3.11.24]-[3.11.27]

Solutions

Discrete random variables

24. Either read the solution to Exercise (3.9 .4), or the following two related solutions neither of which
uses difference equations.
First method. Let Tk be the event that A wins and exactly k tails appear. Then k < n so that
lP'(A wins) = L:/::JlP'(Tk). However lP'(Tk) is the probability that m + k tosses yield m heads, k tails,
and the last toss is heads. Hence

whence the result follows.


Second method. Suppose the coin is tossed m + n - 1 times. If the number of heads is m or more,
then A must have won; conversely if the number of heads is m - 1 or less, then the number of tails is
nor more, so that B has won. The number of heads is bin(m + n- 1, p) so that
lP'(A wins)=

m+n-1 (

+;

l)

pkqm+n-1-k.

k=m

25. The chance of winning, having started from k, is

1- (qjp)k
1- (qjp)N

1 + (q/p)zk.

1- (q/p)zk

which may be written as

1- (qjp)2N

1 + (qjp)'ZN

'

see Example (3.9.6). If k and N are even, doubling the stake is equivalent to playing the original game
with initial fortune k and the price of the Jaguar set at
The probability of winning is now

iN.

lk

1- (q/p)'Z
I

1- (qjp)'ZN
which is larger than before, since the final term in the above display is greater than 1 (when p <

i,

If p =
doubling the stake makes no difference to the chance of winning. If p >
to decrease the stake.

i ).

i, it is better

26. This is equivalent to taking the limit as N ~ oo in the previous Problem (3.11.25). In the limit
when p =f.
the probability of ultimate bankruptcy is

i,

lim (qjp)k _ (qjp)N = { (qjp)k


N-+oo
1- (qjp)N
1
where p

+q =

l. If p =

"f p > 2>


1
"f
1
1 p < 2
1

i, the corresponding limit is limN-+oo(l- k/N) =

l.

27. Using the technique of reversal, we have that


lP'(Rn

Rn-1

+ 1) = lP'(Sn-1 =f. Sn, Sn-2 =f. Sn, ... , So =f. Sn)


= lP'(Xn =f. 0, Xn-1 + Xn =f. 0, ... , X1 + + Xn =f. 0)
= 1P'(X1 =f. 0, Xz + X1 =f. 0, ... , Xn + + X1 =f. 0)
= lP'(S1 =f. 0, Sz =f. 0, ... , Sn =f. 0) = 1P'(S1 Sz Sn =f. 0).

It follows that E(Rn) = E(Rn-1)

+ lP'(S1S2 Sn =f. 0) for n:::

180

1, whence

Solutions [3.11.28]-[3.11.29]

Problems

since IP'(S1S2 Sm =f. 0) ,[_ IP'(Sk =f. 0 for all k 2:: 1) as m ~ oo.
There are various ways of showing that the last probability equals Ip - q I, and here is one.
Suppose p > q. If x 1 = 1, the probability of never subsequently hitting the origin equals 1- (q/p),
by the calculation in the solution to Problem (3 .11.26) above. If X 1 = -1, the probability of staying
away from the origin subsequently is 0. Hence the answer is p(1- (qfp)) + q 0 = p- q.
If q > p, the same argument yields q- p, and if p

=q=

~the answer is 0.

28. Consider first the event that M2n is first attained at time 2k. This event occurs if and only if: (i)
the walk makes a first passage to S2k (> 0) at time 2k, and (ii) the walk thereafter does not exceed
S2k. These two events are independent. The chance of (i) is, by reversal and symmetry,
IP'(S2k-1 < s2k s2k-2 < S2b ... , So < S2k)
= IP'(X2k > o, x 2k_ 1 + x 2k > o, ... , x 1 + ...
= IP'(X 1 >
= IP'(Si >

+ x 2k > O)
o, x 1 + x 2 > o, 00., x 1 + 00 . + x 2k > O)
0 for 1 :::0 i :::0 2k) = ~IP'(Si =f. 0 for 1 :::0 i :::0 2k)

= ~IP'(S2k = 0)

by equation (3.10.23).

As for the second event, we may translate S2k to the origin to obtain the probability of (ii):
IP'(S2k+1 :::0 S2b ... , S2n :::0 S2k) = IP'(M2n-2k = 0) = IP'(S2n-2k = 0),

where we have used the result of Exercise (3.10.2). The answer is therefore as given.
The probabilities of (i) and (ii) are unchanged in the case i = 2k + 1; the basic reason for this is
that S2r is even, and S2r+ 1 odd, for all r.
29. Let Uk = IP'(Sk = 0), fk = IP'(Sk = 0, Si
recall from equation (3.10.25)) to obtain

=f. 0 for 1 :::0 i

< k), and use conditional probability (or

U2n =

L U2n-2khk
k=1

Now N1 = 2, and therefore it suffices to prove that E(Nn) = E(Nn-1) for n:::: 2. Let N~_ 1 be
the number of points visited by the walk S1, S2, ... , Sn exactly once (we have removed So). Then
N~_ 1 + 1 if Sk =f. So for 1 :::0 k :::0 n,
{
Nn =
N~_ 1 - 1 if Sk =So for exactly one kin {1, 2, 00., n},
N~_ 1

Hence, writing an = IP'(Sk

otherwise.

=f. 0 for 1 :::0 k :::0 n),

E(Nn) = E(N~_ 1 ) +an -IP'(Sk =So exactly once)


= E(Nn-1) +an- {han-2

+ f4an-4 + + hLn/2J}

where LxJ is the integer part of x. Now a2m = a2m+1 = u2m by equation (3.10.23). If n = 2k is
even, then
E(N2k)- E(N2k-1) = u2k- {f2u2k-2

If n

+ + hkl =

+ + hk} =

= 2k + 1 is odd, then
E(N2k+d- E(N2k) = u2k- {hu2k-2

181

[3.11.30]-[3.11.34]

Solutions

Discrete random variables

In either case the claim is proved.

30. (a) Not much.


(b) The rhyme may be interpreted in any of several ways. Interpreting it as meaning that families stop
at their first son, we may represent the sample space of a typical family as {B, GB, G2B, ... }, with
lP'(GnB) = 2-(n+ 1). The mean number of girls is I:~ 1 nlP'(GnB) = I:~ 1 n2-(n+ 1) = 1; there is
exactly one boy.
The empirical sex ratio for large populations will be near to 1: 1, by the law of large numbers.
However the variance of the number of girls in a typical family is var(#girls) = 2, whilst var(#boys) =
0; #A denotes the cardinality of A. Considerable variation from 1:1 is therefore possible in smaller
populations, but in either direction. In a large number of small populations, the number of large
predominantly female families would be balanced by a large number of male singletons.
31. Any positive integer m has a unique factorization in the form m =
integers m(1), m(2), .... Hence,

lP'(M=m)=

( . . )= IT (1----p1 )
IT lP'N(z)=m(z)
i

where

c = IJ;(l- i;tJ).

Pi

1
{Jm(i)=C
Pi

Now L:m lP'(M = m) = 1, so that

c- 1 =

II; pr(i)

for non-negative

(IT Pi-m(i)){J =mfJc


i

L:m m-fJ.

32. Number the plates 0, 1, 2, ... , N where 0 is the starting plate, fix k satisfying 0 < k ::; N, and
let Ak be the event that plate number k is the last to be visited. In order to calculate lP'(Ak), we cut
the table open at k, and bend its outside edge into a line segment, along which the plate numbers read
k, k + 1, ... , N, 0, 1, ... , kin order. It is convenient to relabel the plates as -(N + 1- k), -(Nk), ... , -1, 0, 1, ... , k. Now Ak occurs if and only if a symmetric random walk, starting from 0,
visits both -(N- k) and k- 1 before it visits either -(N + 1 - k) or k. Suppose it visits -(N- k)
before it visits k - 1. The (conditional) probability that it subsequently visits k - 1 before visiting
-(N + 1 - k) is the same as the probability that a symmetric random walk, starting from 1, hits N
before it hits 0, a probability of N- 1 by (1.7.7). The same argument applies if the cake visits k- 1
before it visits -(N- k). Therefore lP'(Ak) = N- 1.
33. With j denoting the jth best vertex, the walk has transition probabilities Pjk = ( j - 1)- 1 for
1 ::; k < j. By conditional expectation,

rj

j-1

= 1 + -.-- .l.>b

TI = 0.

1- 1 k=1

Induction now supplies the result. Since rj


log(~).

log j for large j, the worst-case expectation is about

34. Let Pn denote the required probability. If (mr, mr+1) is first pair to make a dimer, then m1
is ultimately uncombined with probability Pr-1 By conditioning on the first pair, we find that
Pn = (P1 + P2 + + Pn-z)/(n - 1), giving n(Pn+1 - Pn) = -(Pn - Pn-1). Therefore,
n! (Pn+ 1 - Pn) = (-l)n- 1 (pz - P1) = (-l)n, and the claim follows by summing.
Finally,
n

EUn = LlP'(mr is uncombined)= Pn

+ P1Pn-1 + + Pn-1P1 + Pn.

r=1

since the rth molecule may be thought of as an end molecule of two sequences oflength r and n- r + 1.
Now Pn --* e- 1 as n--* oo, and it is an easy exercise of analysis to obtain that n- 1EUn --* e- 2.

182

Solutions [3.11.35]-[3.11.36]

Problems

35. First,

where the last summation is over all subsets {r1, ... , rk} of k distinct elements of {1, 2, ... , n}.
Secondly,

Ak :S k!

:S k!

L
PqPr2 Prk
{q, ... ,rkl
L
PqPr2 Prk
r,, ... ,rk

+ (~)

+ (~)

LPf
i

mfXPi

L
Pr1Pr2 Prk-2
r,, ... ,rk-2

(~ Pj) k- 1
J

Hence

By Taylor's theorem applied to the function log ( 1-x), there exist t:lr satisfying 0 < t:lr < {2( 1-c)2 )} - 1
such that
n

II (1- Pr) =II exp{-Pr- t:lrp;} = exp{ -A.- A.O(mfXPi) }


r=1

Finally,
lP'(X = k) =

(IIr (1 -

Pr ))

Pq ... Prk

{r ,, ... ,rk } (1 - Pr1) .. (1 - Prk)

The claim follows from(*) and(**).

36. It is elementary that


-

1 N

1 N

E(Y) =- LE(Xr) =- LXr

n r=1
We write Y

-=f.-(.

n r=1

- E(Y) as the mixture of indicator variables thus:


N

-Y - E(Y)
-

= ~Xr
~ r=1 n

n) .

( Ir - -

It follows from the fact


i =I= j,

183

[3.11.37]-[3.11.38]

Solutions

Discrete random variables

HnC

c
HnC

nC

(1- y)a

1-y

HnC
Figure 3.2. The tree of possibility and probability in Problem (3.11.37). The presence of the
disease is denoted by C, and hospitalization by H; their negations are denoted by C and H.
that

var(Y)

N
=L

x?:
n
n 2 E { ( Ir- N

i#j

r=l

~ x?: !!_ ( 1 L

r=l

n2 N

~ 2 N-n
= Lxr N2n
r=l

)2} + L --;:;rE
XiXj { ( /i-N
n) ( Ij- N
n)}

!!_)
N

+""' XiXj
L n2
i#j

{!!_N Nn -1
_ !!!:.__}
- 1 N2

""'
N-n
- ~XiXj n(N -1)N2
z#J

Nn~;~ 1) {tx?:- ~ tx?:- ~ ~XiXj}


r=l

r=l

z#J

N - n 1 {
2
-2}
N - n 1
_2
= n(N-1)N :=txr -Nx
= n(N-l)N :=t(xr-x).

37. The tree in Figure 3.2 illustrates the possibilities and probabilities. If G contains n individuals,
X isbin(n, yp+ (1- y)a) and Yis bin(n, yp). Itisnotdifficultto seethatcov(X, Y) = nyp(l-v)
where v = yp + (1- y)a. Also, var(Y) = nyp(1- yp) and var(X) = nk(1- v). The result follows
from the definition of correlation.

38. (a) This is an extension of Exercise (3.5.2). With lP'n denoting the probability measure conditional
on N = n, we have that

184

Solutions [3.11.39]-[3.11.39]

Problems
where s = n - I:~=l r;. Therefore,
00

lP'(X; = r; for 1 :::0 i :::0 k) = L

lP'n(X; = r; for 1 :::0 i :::0 k)lP'(N = n)

n=O

=IT {

vri J(iY~,e-vf(i)}

i=l

r1

vs(l-

~(kW e-v(I-F(k)).

s=O

The final sum is a Poisson sum, and equals 1.


(b) We use an argument relevant to Wald' s equation. The event {T :::0 n -1} depends only on the random
variables X1, X2, ... , Xn-t. and these are independent of Xn. It follows that Xn is independent of
the event {T ::: n} = {T :::0 n- l}c. Hence,
00

00

00

lE(S) = LlE(X;/{T;::i}) = LlE(X;)lE(/{T;::i)) = LlE(X;)lP'(T::: i)


i=l
i=l
i=l
00

00

00

= L vf(i) LlP'(T = t) = v LlP'(T = t) L f(i)


t=i
i=l
1=1
i=l
00

= v LlP'(T = t)F(t) = lE(F(T)).


t=l

39. (a) Place an absorbing barrier at a+ 1, and let Pa be the probability that the particle is absorbed
at 0. By conditioning on the first step, we obtain that

1 :::0 n :::0 a.

Pn = - (Po +PI+ P2 + + Pn+I),


n+ 2

The boundary conditions are Po = 1, Pa+l = 0. It follows that Pn+l - Pn = (n + 1)(pn - Pn-d
for 2 :::0 n :::0 a. We have also that P2 - PI = PI - 1, and
Pn+l- Pn =

i(n + 1)! (P2- PI)= i(n + 1)! (PI- Po).

Setting n =a we obtain that -Pa = iCa + 1)! (PI- 1). By summing over 2 :::0 n <a,
a

Pa- PI= (PI- Po)+ iCPt- Po) Lj!,


j=3

and we eliminate PI to conclude that


Pa =

(a+ 1)!

4 + 3! + 4! ++(a+ 1)!

It is now easy to see that, for given r, Pr = Pr (a) --+ 1 as a --+ oo, so that ultimate absorption
at 0 is (almost) certain, irrespective of the starting point.
(b) Let Ar be the probability that the last step is from 1 to 0, having started at r. Then

AI= ~(1 +At +A2),


(r +2)Ar =AI +A2 + +Ar+l

185

r::: 2.

[3.11.40]-[3.11.40]

Solutions

Discrete random variables

It follows that
Ar- Ar-1

whence

1
--(Ar+1 - Ar),

r+l

1
A3 - A2 = 4 . 5 ... (r

Letting r--+ oo, we deduce that'A3

+ l) ('Ar+1 -

Ar ),

= 'A2 sothatAr = 'A2 for r

~ 2. From(**) with r

and from(*) 'A1 = ~


(c) Let J-tr be the mean duration of the walk starting from r. As above, J-to
J-tr = 1 +

- -2 (J-t1 + /-(2 + + J-tr+1),


r+

3.

= 2, 'A2 = 1Al>

= 0, and
1,

whence J-tr+1- J-tr = (r + l)(J-tr- J-tr-1)- 1 for r ~ 2. Therefore, Vr+1 = (J-tr+1- J-tr)/(r
satisfies Vr+ 1 - Vr = -1 / (r + 1)! for r ~ 2, and some further algebra yields the value of J-t 1

+ 1)!

40. We label the vertices 1, 2, ... , n, and we let rr be a random permutation of this set. Let K be the
set of vertices v with the property that rr(w) > rr(v) for all neighbours w of v. It is not difficult to
see that K is an independent set, whence a( G) ~ IKI. Therefore, a( G) ~ ElK I = 'L:v lP'(v E K).
For any vertex v, a random permutation rr is equally likely to assign any given ordering to the set
comprising v and its neighbours. Also, v E K if and only if v is the earliest element in this ordering,
whence lP'(v E K) = lj(dv + 1). The result follows.

186

4
Continuous random variables

4.1 Solutions. Probability density functions


1.

(a) {x(l - x)} -~ is the derivative of sin- 1(2x - 1), and therefore C = n- 1.

(b) C = 1, since

(c) Substitute v = (1

+ x 2 )- 1 to obtain

dx

00

-oo -(1-+----,x2=-)-m =

{1 m 3
I
1
1
v -~ (1- v)-~ dv = B(2, m- 2)

lo

where B(, )is a beta function; see paragraph (4.4.8) and Exercise (4.4.2). Hence, if m >

i,

r( 21 )r(m - 21 )
C -1 -_ B(l2'm _ 21) -_
r(m)

2.

(i) The distribution function Fy of Y is


Fy(y)

So, differentiating, fy(y)

= lP'(Y :S y) = lP'(aX :S y) = lP'(X :S

yfa)

Fx(yfa).

= a- 1 fx(y/a).

(ii) Certainly

= lP'(-X :S x) = lP'(X::: -x) = 1 -lP'(X :S -x)


since lP'(X = -x) = 0. Hence f-x(x) = fx( -x). If X and -X have the same distribution function
then f-x(x) = fx(x), whence the claim follows. Conversely, if fx( -x) = fx(x) for all x, then,
by substituting u = - x,
F_x(x)

lP'(-X :S y)

= lP'(X::: -y) =

00

-y

fx(x)dx

1Y

fx(-u)du

-00

1Y

fx(u)du

= lP'(X :S y),

-00

whence X and -X have the same distribution function.


3.

Since a::: 0, f::: 0, and g::: 0, it follows that af

+ (1 -

a)g ::: 0. Also

l{af+(l-a)g}dx=a lfdx+(l-a) lgdx=a+1-a=l.

187

[4.1.4]-[4.2.4]

Solutions

Continuous random variables

If X is a random variable with density f, and Y a random variable with density g, then a f +(1-a) g
is the density of a random variable Z which takes the value X with probability a and Y otherwise.
Some minor technicalities are necessary in order to find an appropriate probability space for such
a Z. If X andY are defined on the probability space (Q, :F, lP'), it is necessary to define the product
space (Q, :F, lP') x (:E, g., Q) where :E = {0, 1}, g. is the set of all subsets of :E, and Q(O) = a,
Q(l) = 1 -a. For w x a E Q x :E, we define
Z(w x a)= {

X(w)

if a= 0,

Y(w)

if a = 1.

..
. 1 F(x +h)- F(x)
f(x)
(a) By defimtion, r(x) = hm= --h-1-0 h
1- F(x)
1 - F(x)
(b) We have that

4.

H(x)
x

= .!!_ { .!_ fx r(y) dy} =


dx

Jo

r(x) - 2_
x
x2

Jo

r(y) dy

= 2_
x2

Jo

[r(x)- r(y)] dy,

which is non-negative if r is non-increasing.


(c) H(x)fx is non-decreasing if and only if, for 0 _::::a_:::: 1,

1
1
-H(ax) _:::: -H(x)
ax
x

for all x

0,

which is to say that -a- 1 log[l - F(ax)] _:::: -log[l- F(x)]. We exponentiate to obtain the claim.
(d) Likewise, if H(x)/x is non-decreasing then H(at) _:::: aH(t) for 0 _::::a _:::: 1 and t ~ 0, whence
H(at) + H(t- at) _:::: H(t) as required.

4.2 Solutions. Independence


1. LetN be the required number. ThenlP'(N
distribution with mean [1 - F(K)r 1 .

2.

= n) = F(K)n- 1[1- F(K)] forn

~ 1, the geometric

(i) Max{X, Y} _:::: v if and only if X :S v and Y :S v. Hence, by independence,


lP'(max{X, Y} :S

v) = lP'(X :S v, Y :S

v)

= lP'(X

:S v)lP'(Y :S v)

= F(v) 2 .

Differentiate to obtain the density function of V = max{X, Y}.


(ii) Similarly min{X, Y} > u if and only if X> u andY> u. Hence
lP'(U :S u)
giving fu(u)

= 1 -lP'(U

= 2f(u)[l

> u)

1 -lP'(X > u)lP'(Y > u)

= 1- [1- F(u)] 2 ,

- F(u)].

3. The 24 permutations of the order statistics are equally likely by symmetry, and thus have equal
probability. Hence lP'(Xt < Xz < X3 < X4) =
and lP'(Xt > Xz < X3 < X4) =
by
enumerating the possibilities.

f4,

4. lP'(Y(y) > k)
Therefore,

F(y)k fork ~ 1. Hence JE:Y(y)

lP'(Y(y) > JE:Y(y))

:A,

F(y)/[1 - F(y)]

-+ oo as -+ oo.

= {1- [1- F(y)J}LF(y)/[l-F(y)]J

~ exp {1- [1- F(y)]

188

F(y)

1 - F(y)

J}-+ e-

as y-+ oo.

Solutions [4.3.1]-[4.3.5]

Expectation

4.3 Solutions. Expectation

(a)JE:(Xa) = 000 xae-x dx < oo if and only if a> -1.


(b) In this case

1.

if and only if -1 < a < 2m - 1.


2.

We have that

X;)

( -L.,n1 -

1 = JE:

= LE(X;/Sn).

Sn
i=1
By symmetry, JE:(X;/Sn) = JE:(XlfSn) for all i, and hence 1 = nJE:(X1/Sn). Therefore
m

E(Sm/Sn) = LE(Xi/Sn) = mE(XlfSn) = mjn.


i=1

3.

Either integrate by parts or use Fubini's theorem:

roo xr- 11P'(X > x)dx = r roo xr- 1 { roo f(y)dy} dx


k
k
h~
= roo f(y) {1y rxr-1 dx} dy = roo yr f(y)dy.
h=o
x=O
k

An alternative proof is as follows. Let lx be the indicator of the event that X > x, so that
J~ lx dx =X. Taking expectations, and taking a minor liberty with the integral which may be made

rigorous, we obtain EX = J000 E(Ix) dx. A similar argument may be used for the more general case.

4. We may suppose without loss of generality that t-t = 0 and a = 1. Assume further that m > 1.
In this case, at least half the probability mass lies to the right of 1, whence JE:(X/{x:::mJ) ::: ~- Now
O=E(X) =E{X[/{x:::m} +l{X<mJD,implyingthatJE:(Xl{X<mJ):::: -~.Likewise,
E(X 2 /{x:::mJ) ::: ~,

E(X 2 /{X<mJ) :::: ~

By the definition of the median, and the fact that X is continuous,


E(X I X< m):::: -1,

JE:(X 2 I X< m):::: 1.

It follows that var(X I X < m):::: 0, which implies in tum that, conditional on {X < m}, X is almost
surely concentrated at a single value. This contradicts the continuity of X, and we deduce that m :::: 1.
The possibility m < -1 may be ruled out similarly, or by considering the random variable -X.

5.

It is a standard to write X= x+- x- where x+ = max{X, 0} and


are non-negative, and so, by Lemma (4.3.4),

x+ and x-

t-t = E(X) = JE:(X+)- JE:(X-) =


=

r [1-F(x)]dx- lor
00

lo

00

x-

=- min{X, 0}. Now

fooo lP'(X > x) dx- looo lP'(X < -x) dx

F(-x)dx=

r [1-F(x)]dx-j-oo F(x)dx.
00

lo

It is a triviality that

t-t =foiL F(x) dx +foiL [1 - F(x)] dx


and the equation follows with a = I-t It is easy to see that it cannot hold with any other value of a,
since both sides are monotonic functions of a.

189

[4.4.1)-[4.4.6)

Solutions

Continuous random variables

4.4 Solutions. Examples of continuous variables


1.

(i) Integrating by parts,

fooo x

r(t) =

1e-x dx = (t-

1-

1) fooo x

1- 2 e-x

dx = (t- l)r(t-

1).

If n is an integer, then it follows that r(n) = (n- 1)r(n- 1) = = (n -1)! r(l) where r(l) = 1.
(ii) We have, using the substitution u2 = x, that

r(~i = {fooo x-!e-x dx


=

4Joroo

e-" 2 du

roo e-v

{fooo 2e-" 2 du

=
2

Jo

41r=O Je=O
r1(

12 e-r2 r dr d() = 7r

00

dv =

as required. For integral n,


r(n

2.

1
+ z)
=

1
1
1
3
1
1
(2n)! '(n- z)r(n- :z) = ... = (n- z)(n- z) ... :zr{z) = 4nn! y7r.

By the definition of the gamma function,


r(a)r(b)

Now set u

= fooo xa-1e-x dx fooo i-1e-Y dy = fooo fooo e-(x+y)xa-1i-1 dxdy.

= x + y, v = xf(x + y), obtaining

oo r1
1u=Olv=O

e-uua+b-1va-1(l- v)b-1 dvdu


=

fooo ua+b- 1e-"du fo 1 va- 1(l-v)b- 1 dv=r(a+b)B(a,b).

3. If g is strictly decreasing then lP'(g(X) ::S y) = lP'(X :::: g- 1 (y)) = 1- g- 1 (y) so long as
0 ::s g- 1(y) ::s 1. Therefore lP'(g(X) ::s y) = 1 - e-Y, y:::: 0, if and only if g- 1 (y) = e-Y, which is
to say that g(x) = -logx for 0 < x < 1.

4.

We have that
lP'(X < x)

j-oo rr(1 +
x

u2 )

du

7r

= - +- tan- 1 x.

Also,
lE(IXIa)

= ~oo

lxla 2 dx

-oorrO+x)

is finite if and only if Ia I < 1.


5.

Writing <I> for the N (0, 1) distribution function, lP'(Y ::S y) = lP'(X ::S logy) = <I> (logy). Hence

1
1
_I (1
)2
fy(y) =- fx(logy) = - - e 2 ogy ,

6.

y$

0< y<

00.

Integrating by parts,
LHS =

L:

g(x) { (x-

=- [g(x)a

tt)~4> (x: JL)} dx

C: JL)] ~00 + L:
190

g'(x)a

C: JL)

dx = RHS.

Dependence

7.

Solutions

[4.4.7]-[4.5.3]

(a) r(x) = afJxf3- 1.

(b) r(x) =A..


A.ae-J..x + J,L(1 - a)e-ttx
(c) r(x) =
J..
, which approaches min{A., J.L} asx---+ oo.
ae- x + (1 - a)e-ttx

8.

Clearly q/ = -xtj>. Using this identity and integrating by parts repeatedly,

1- <l>(x) =

1oo

tj>(u)du = -

= tj>(x) _ tj>(x)
x

1oo

-1

x3

4>'(u)
tj>(x)
--du = U

00 3'(u) du =
u5

1oo -

tj>(x) _ tj>(x)
x
x3

q/(u)
3-du
U

+ 3(x)
x5

-100
x

15(u) du.
u6

4.5 Solutions. Dependence


1.

(i) As the product of non-negative continuous functions,


g(x) = ~e-lxl

1-oo00

is non- negative and continuous. Also

I 2 2

e-zx Y dy = ~e-lxl

J2nx-2

if x =f. 0, since the integrand is the N(O, x- 2 ) density function. It is easily seen that g(O) = 0, so that
g is discontinuous, while

1-00oo

g(x)dx =

1oo ~e-lxl
-00

dx = 1.

(ii) Clearly f Q ::: 0 and

00 100
1-oo
-oo !Q(x,y)dxdy = 2::00
n=l

Ur

.1

1.

Also f Q is the uniform limit of continuous functions on any subset of ~2 of the form [- M, M] x ~;
hence f Q is continuous. Hence f Q is a continuous density function. On the other hand

00

-oo

!Q(x, y)dy =

00
2:: Or g(x- qn).

n=l

where g is discontinuous at 0.
(iii) Take Q to be the set of the rationals, in some order.
2. We may assume that the centre of the rod is uniformly positioned in a square of size a x b, while
the acute angle between the rod and a line of the first grid is uniform on [0, ~ n]. If the latter angle is
e then, with the aid of a diagram, one finds that there is no intersection if and only if the centre of the
rod lies within a certain inner rectangle of size (a- r cos&) x (b- r sin&). Hence the probability of
an intersection is

7r~

lorr/2 {ab- (a- rcos&)(b- rsin&)} de= -(a+


2r
b- ~r).
7r~

3. (i) Let I be the indicator of the event that the first needle intersects a line, and let J be the indicator
that the second needle intersects a line. By the result of Exercise (4.5.2), E(/) = E(J) = 2/n; hence
Z =I+ J satisfies E(~Z) = 2/n.

191

[4.5.4]-[ 4.5.6]

Solutions

Continuous random variables

(ii) We have that


var(iZ) = i{lE(l 2) + JE(J 2) + 2JE(l J)} -JE(iZ)2
1
4
1
4
1
= 4{lE(l) +lE(J) + 2JE(l J)}- 2 = - - 2 + -JE(/ J).
7r
JT:Jr:
2

in,

z i

then two intersections occur if < min{sine, cos&}


In the notation of (4.5.8), if 0 < e <
or 1 - z < min {sin e, cos e}. With a similar component when 1r ::S e < 1r, we find that

lE(/ J) = lP'(two intersections) =

~ jj

dz de

(z,ll):

O<z<i min(sinll,cosll}
0<11<i1f

41orr/2 imin{sine,cos&}de = 41orr/4 sine de=4 ( 1- 1v. ) ,


=1r

1r

v2

1r

and hence
1
1
4
1
V.
3-./2
4
var(,. Z) = - - - + - (2 - v 2) = - - - - .
..

7r

7r2

7r

7r

7r2

(iii) For Buffon's needle, the variance of the number of intersections is (2/n)- (2/n )2 which exceeds
var(i Z). You should therefore use Buffon's cross.

JJ

4. (i)Fu(u) = 1-(1-u)(l-u)ifO < u < l,andsolE(U) =


2u(l-u)du =~-(Alternatively,
place three points independently at random on the circumference of a circle of circumference 1.
Measure the distances X and Y from the first point to the other two, along the circumference clockwise.
Clearly X and Y are independent and uniform on [0, 1]. Hence by circular symmetry, JE(U)
JE(V- U) = JE(l- V) = ~-)
(ii) Clearly UV = XY, so that JE(UV) = lE(X)lE(Y) = ;\. Hence

cov(U, V) = JE(UV) -lE(U)lE(V) = ;\- j(l- ~) =

-fe;,

since JE(V) = 1 - JE(U) by 'symmetry'.


5.

(i) If X and Y are independent then, by (4.5.6) and independence,


JE(g(X)h(Y)) =
=

jj g(x)h(y)fx,y(x, y)dx dy

g(x)fx(x)dx

h(y)fy(y)dy = lE(g(X))lE(h(Y)).

(ii) By independence

6. If 0 is the centre of the circle, take the radius OA as origin of coordinates. That is, A = (1, 0),
B = (1, E>), C = (1, <1>), in polar coordinates, where we choose the labels in such a way that
0 ::5 e ::5 <1>. The pair e, <I> has joint density function j(e, cp) = (2n 2 )- 1 for 0 < e < cp < 2n.
The three angles of ABC are
of pairs (&, cp) such that 0 < e <

ie, i<<I>- E>),

cp

i<I>.

1rYou should plot in the 8/cp plane the set


< 2n and such that at least one of the three angles exceeds xn.

192

Solutions

Conditional distributions and conditional expectation

[4.5.7]-[4.6.3]

Then integrate f over this region to obtain the result. The shape of the region depends on whether or
not x < ~. The density function g of the largest angle is given by differentiation:
g(x) = {

'f 3::::
1

1
2

6(3x- 1)

6(1- x)

if~::::x::::l.

X::::

The expectation is found to be H-rr.


7.

We have that lE(X) = t-t. and therefore JE(Xr - X) = 0. Furthermore,

lE{X(Xr -X)} =

~JE ( L

XrXs) -JE(X2 ) =

~{a 2 + nt-t2 }- (var(X) + JE(X) 2 )

8.

The condition is that JE(Y) var(X)

+ JE(X) var(Y) =

0.

9. If X and Yare positive, then S positive entails T positive, which displays the dependence. Finally,
S2 =X and T 2 = Y.

4.6 Solutions. Conditional distributions and conditional expectation


1. The point is picked according to the uniform distribution on the surface of the unit sphere, which
is to say that, for any suitable subset C of the surface, the probability the point lies in C is the surface
integral fc(4rr)- 1 dS. Changing to polar coordinates, X = cos ecos</>, y = sine cos</>, z = sin</>,
subject to x 2 + y 2 + z2 = 1, this surface integral becomes (4:n')- 1 fc I cos</>! dB d<f>, whence the joint
density function of e and <I> is

1
f(B, </>)=-I cos<f>l,
4rr

14>1 :::: ~rr,

o:::: e <

2rr.

The marginals are then fe(B) = (2rr)- 1 , fq,(<f>) = ~I cos </>I, and the conditional density functions
are
for appropriate e and</>. Thus E> and <I> are independent. The fact that the conditional density functions
are different from each other is sometimes referred to as 'Borel's paradox'.
2.

We have that
1/f(x) =

1oo Y !x,y(x, y) dy
-oo fx(x)

and therefom
lE(l/f(X)g(X)) =
=

y(x y)
'
' g(x)fx(x)dxdy
1-oooo 1oo y fxfx(x)

L: L:
-~

{y g(x)}fx,y(x, y)dx dy = JE(Yg(X)).

3. Take Y to be a random variable with mean oo, say fy (y) = y- 2 for 1 :::: y < oo, and let X = Y.
Then JE(Y 1 X) = X which is (almost surely) finite.

193

[4.6.4]-[4.6.6]
4.

Solutions

Continuous random variables

(a) We have that

= A.e'-<x-y), for 0 ::S x

so that fYIX (y I x)
(b) Similarly,

I x) = xe-xY, for 0 :::=:

so that fYIX(Y
5.

X< 00,

0 ::S

X< 00,

::S y < oo.

= fooo xe-x(y+ 1) dy =e-x,

fx(x)

0 ::S

y < oo.

We have that
lP'(Y

= k) =
=

loo

lP'(Y

(n)

= k I X= x)fx(x) dx = fo1
o

+ k,

B(a

In the special case a

(n)

n-

xk(l - x)n-k

xa-1(1 _ x)b-1

B(a,b)

k +b).

B(a, b)

=b=

1, this yields

lP' Y =k)
(

(n)
k

r(k+ l)r(n -k+ 1)


r(n + 2)

= _1_
n + 1,

:::=:

k :::=: n,

whence Y is uniformly distributed.


We have in general that
E(Y)

loo

E(Y

na

I X= x)fx(x) dx = - - ,
a+b

and, by a similar computation ofE(Y 2 ),


nab(a

+ b +n)

var(Y) = -(a-+------,-b)""2,-(a-+_____,.b-+-l)

6.

By conditioning on X 1,
Gn(x) = lP'(N > n) =fox Gn-1 (x-u) du =fox Gn-1 (v) dv.

Now Go(v) = 1 for all v

(0, 1], and the result follows by induction. Now,


00

EN=

lP'(N > n) =ex.

n=O

More generally,
GN(s)

oo
oo
= L:snlP'(N
= n) = L:sn
n=1

whence var(N)

n=1

n) =

n-1
x
- ::..._
(n-l)!
n!

= G:Z,(l) + G/,r(l)- G/,r(1) 2 = 2xex +ex- e2x.


194

(s- l)esx

+ 1,

dx

Solutions [4.6.7]-[4.7.1]

Functions of random variables


7. We may assume without loss of generality that lEX
inequality,

lEY

0. By the Cauchy-Schwarz

Hence,
lE(var(Y

8.

I X)) = lE(Y 2 )

-lE(lE(Y

I X) 2 )

lE(XY) 2
lE(X 2 )

< JEY 2 -

(1 - p 2 ) var(Y).

One way is to evaluate

Another way is to observe that min{Y, Z} is exponentially distributed with parameter f-t + v, whence
lP'(X < min{Y, Z}) = /...f(/... + f-t + v). Similarly, lP'(Y < Z) = ~-t/(1-t + v), and the product of these
two terms is the required answer.

9.

By integration, for x, y > 0,

fy(y) =loy f(x, y) dx


whence c

gcy 3 e-Y,

fx(x)

1. It is simple to check the values of fXIY(x I y)

and then deduce by integration that JE(X I Y

00

f(x, y) dy =ex e-x,

f(x, y)/ fy(y) and fYIX(Y

= y) = 1Y and JE(Y I X = x) = x + 2.

I x),

10. We have that N > n if and only if Xo is largest of {Xo, X 1, ... , Xn}, an event having probability

1/(n + 1). Therefore, lP'(N


largest, whence

= n) = 1/{n(n + 1)} for n


oo

lP'(XN :::; x)

F(x)n+l
n(n + 1)

n=1

oo

:=:: 1. Next, on the event {N

F(x)n+l
n
-

n=1

F(x)n+l
n+1

oo

= n}, Xn is the

+ F(x),

n=1

as required. Finally,

lP'(M

= m) = lP'(Xo :=:: X1

:=:: :=:: Xm-1) -lP'(Xo :=:: X1 :=:: :=:: Xm)

= m!

- (m

+ l)!

4.7 Solutions. Functions of random variables


1.

We observe that, if 0 :::; u :::; 1,

lP'(XY :::; u)

= lP'(XY

:::; u, Y :::; u)

+ lP'(XY :::; u, Y

> u)

= lP'(Y

:::; u)

+ lP'(X:::; ufY,

Y > u)

1u

=u+

-dy=u(1-1ogu).

u y

By the independence of XY and Z,

lP'(XY:::; u,

z 2 :::; v) = lP'(XY

:::; u)lP'(Z :::; .y'v)

195

= u.,fV(l

-log u),

0 < u, v < 1.

[4.7.2]-[4.7.5]

Solutions

Continuous random variables

Differentiate to obtain the joint density function


) _ log(1/u)
.fo ,

g(u, v -

0:::: u, v::::

Hence

JJ

lP'(XY:::: Z 2 ) =

1.

~-

lo;%u) dudv =

o::::u::::v::::1
Arguing more directly,

Iff

lP'(XY:::: Z 2 ) =

dxdydz =

o::::x,y,z::::1
xy::::z 2

2.

The transformation x = uv, y = u- uv has Jacobian

=I

vv
1-

u
-u

I=

-u

Hence Ill= lui, and therefore fu. y(u, v) = ue-u, for 0:::: u < oo, 0:::: v:::: 1. Hence U and V are
independent, and fv(v) = 1 on [0, 1] as required.
3.

Arguing directly,
2

lP'(sin X :::: y) = lP'(X :::: sin- 1 y) = - sin- 1 y,

0::::

so that fy (y) =
variables.

4.

2/ (n ~), for 0 :::: y :::: 1.

Alternatively, make a one-dimensional change of

(a) lP'(sin- 1 X :::: y) = lP'(X :::: siny) = siny, for 0 :::: y ::::

o::::y::::in.
(b) Similarly, lP'(sin- 1 X:::: y) =
_ln
< Y <
ln.
2 - 2

y:::: 1,

in.

Hence fy(y) =cosy, for

iO + siny), for -in : : y:::: in. so that fy(y) = i cosy, for

5. Consider the mapping w = x,z = (y-px)/~ withinversex = w,y = pw+zJI=Pf


and Jacobian
J=

I~ Vl~p 2 1 = R

The mapping is one-one, and therefore W (= X) and Z satisfy

implying that W and Z are independent N (0, 1) variables. Now


{X> 0, y > 0}

= {w >

0,

> -Wp

I R},

and therefore, moving to polar coordinates,


lP'(X > 0, Y > 0) =

~JT

oo

1
I 2
-e-zr rdrdB =

Je=a r=O 2n
196

~JT 1

-dB

2n

Solutions [4.7.6]-[4.7.9]

Functions of random variables


where a= -tan- 1 (p/JI=P2") = -sin- 1 p.

6.

We confine ourselves to the more interesting case when p =I= 1. Writing X

U, Y

pU +

JI=P2 V, we have that U and V are independent N(O, 1) variables. It is easy to check that Y > X
if and only if (1 - p) U < JI=P2 V. Turning to polar coordinates,
E(max{X, Y}) =

1 ~ [~1{1+rr
00

{pr cosO+ r P sinO} dB+

~~rr r cos edB] dr

where tan 1{1' = ~(1- p)/(1 + p). Some algebra yields the result. For the second part,

by the symmetry of the marginals of X and Y. Adding, we obtain 2E(max{X, Y} 2 )


E(Y 2 ) = 2.
7.

E(X 2 ) +

We have that
lP'(X < Y, Z > z)

(a) lP'(X

= Z) = lP'(X <

Y)

A
= lP'(z <X< Y) = --e-(A+JL)z
= lP'(X

A+JL

< Y)lP'(Z > z).

= --.
A+JL

(b) By conditioning on Y,
lP'( (X- Y)+

= 0) = lP'(X ::;: Y) = A+ tL,

lP'((X- Y)+ >

w)

= _1-L_e-J.w

A+JL

for w > 0.

By conditioning on X,
lP'(V > v)

= lP'(IX- Yl

> v)

00

lP'(Y > v + x)fx(x) dx +

00

lP'(Y < x- v)fx(x) dx

v > 0.
(c) By conditioning on X, the required probability is found to be

8.
that

Either make a change of variables and find the Jacobian, or argue directly. With the convention
r2 - u2 = 0 when r 2 - u 2 < 0, we have that

F(r,x) = lP'(R::;: r, X::;: x) =

~
7r

a2 F
2r
f(r,x) = - =
~
arax
rry r2- x2
9.

jx

Jr2- u2du,

-r

lxl < r < l.

As in the previous exercise,


lP'(R::::: r, Z::::: z) = 3- ~z n(r 2 - w 2 )dw.

4n -r

197

[4.7.10]-[4.7.14]

Solutions

Continuous random variables

= ~r for lzl < r < 1. This question may be solved in spherical polars also.
The transformations = x + y, r = xf(x + y), has inverse x = rs, y = (1 - r)s and Jacobian

Hence f(r, z)

10.
J = s. Therefore,
fR(r)

roo fR s(r, s) ds = roo fx y(rs, (1 -

lo

lo

r)s )s ds
O::=or::=ol.

11. We have that

whence
fy(y) = 2fx(-J(afy)

-1)

1
rrJy(a- y)

0 :::0 y :::0 a.

12. Using the result of Example (4.6.7), and integrating by parts, we obtain
JP(X >a, Y >b)=

00

= [1 -

<f>(x) { 1 -<I> (

~)} dx

<l>(a)][1 - <l>(c)]

00

[1 - <l>(x)]</> (

~)
1- p2

1- p2

dx.

Since [1 - <l>(x)]/<f>(x) is decreasing, the last term on the right is no greater than
1 - <l>(a)
</>(a)

00

<f>(x)<f>

b- px )

JT=P2 JT=P2

dx,

which yields the upper bound after an integration.


13. The random variable Y is symmetric and, for a > 0,

P(Y > a)

by the transformation v

= JP(O <

-1

X < a

ra-1 rr(l +
du

= Jo

u2)

1aoo rr(1 +

-v-2 dv
v-2),

= 1/u. For another example, consider the density function


lx- 2 if x > 1

f(x) = { 2

'

14. The transformation w


J = w, whence
f(w, z) = w.

ifO:Sx:Sl.

= x + y, z =xI (x + y) has inverse x = wz, y = (1 -

z)w, and Jacobian

A.(A.wz)a-1 e-A.wz A.(A.(1 _ z)w).B-1 e-A.(1-z)w


r(a)
.
r(.B)

A.(A.w)a+.B-le-A.w

r(a

+ {3)

za-1(1- z).B-l
B(a, {3)

w > 0, 0 < z < 1.

Hence Wand Z are independent, and Z is beta distributed with parameters a and {3.

198

Solutions [4.8.1]-[4.8.4]

Sums of random variables

4.8 Solutions. Sums of random variables


1.

= X + Y has density function

By the convolution formula (4.8.2), Z


fz(z)

rz AJ.-Le-Axe-JL(z-x) dx = ~
(e-AZ- e-JLZ),
11- A

z ~ 0,

lo

if A ::j:. f-L. What happens if A = J1? (Z has a gamma distribution in this case.)
2.

Using the convolution formula (4.8.2), W =aX+ f3Y has density function
fw(w)

00

-oo

na(1 + (x/a) 2)

1
nf3(1 + {(w- x)/f3}2)

dx,

which equals the limit of a complex integral:


lim
R-+oo

1- .
1
af3 . - dz
z2 + a2 (z- w)2 + p2

JD n2

where D is the semicircle in the upper complex plane with diameter [- R, R] on the real axis. Evaluating the residues at z = ia and z = w + if3 yields
af32ni { 1
1
1
1
}
fw(w) = _n_2_ -2i-a (ia- w)2 + p2 + -2i-f3 -(w_+_i_f3--,)2,---+-a"""
2

n(a + f3) 1 + {wf(a + f3)}2'

< w <

-00

00

after some manipulation. Hence W has a Cauchy distribution also.


3.

Using the convolution formula (4.8.2),


fz(z) = loz

4.

~ze-z dy = ~z 2 e-z,

z ~ 0.

Let fn be the density function of Sn. By convolution,

_ , -J.. 1x
f 2 (x ) -II.Je

A
2
-A2 - At

A
2
n
A
+ "-2e
, -J..2x - 1
_ '"""" , -J..,x II
- - L...J"-re
- -s- .
At - A2

r=l

s=l As - Ar

sf.r

This leads to the guess that


n

fn(X) = LAre-J..,x
r=l

IIn __As_,
s=l As - Ar

sf.r

which may be proved by induction as follows. Assume that (*) holds for n ::::: N. Then

199

[4.8.5]-[4.8.8] Solutions

Continuous random variables

for some constant A. We integrate over x to find that

1=L IT _ s _ +A- ,
N N+1

r=1 s=1 As - Ar
sf.r

and(*) follows with n

5.

=N

AN+1

+ 1 on solving for A.

The density function of X + Y is, by convolution,

fz(x) = {

ifO:::::x:::::1,

2-

if 1 ::::::

X ::::::

2.

Therefore, for 1 :::::: x :::::: 2,

f3(x)=

lo1fz(x-y)dy= 11
0

(x-y)dy+

lox-1 (2-x+y)dy=~-(x-~) .
2

x-1

Likewise,

A simple induction yields the last part.

6. The covariance satisfies cov(U, V) = E(X 2 - Y 2 )


random variables taking values 1, then

= 0, as required.

If X andY are independent N(O, 1) variables, fu,v(u, v)


a function of u multiplied by a function of v.

If X andY are symmetric

= (4rr)- 1e-!<u 2 +v 2 ), which factorizes as

7. From the representation X= apU + a.Jl=PZV, Y = r:U, where U and V are independent
N(O, 1), we learn that
apy
E(X I Y = y) = E(apU I U = yfr:) = - .
r:
Similarly,

whence var(X 1 Y) = a 2 (1 - p 2 ). For parts (c) and (d), simply calculate that cov(X, X + Y)
a 2 +par:, var(X + Y) = a 2 + 2par: + r: 2 , and
r:2(1 - p2)

1- p(X, X+ Y) =

+ 2par: + r: 2

First recall that lP'(IXI :::::: y) = 2<1>(y) - 1. We shall use the fact that U = (X+ Y)j,fi,
V = (X - Y) I ,J2 are independent and N (0, 1) distributed. Let 8.. be the triangle of JR 2 with vertices
(0, 0), (0, Z), (Z, 0). Then

8.

lP'(Z :::::: z I X > 0, Y > 0) = 4lP'( (X, Y) E

8.)

= lP'(IUI :::::: zj../i, lVI :::::: z!..fi)

= 2{2<1>(z!..fi)- 1}2 ,
200

by symmetry

Solutions [4.9.1]-[4.9.5]

Multivariate normal distribution


whence the conditional density function is

f(z)

2J2{2<t>(z!J2) -l}tP(z;J2).

Finally,
lE(Z

IX

> 0, Y > 0) = 21E(X


= 21E(X

IX

> 0, Y > 0)

I X>

0) = 41E(X/[X>O}) = 4

loooo

I 2

r;c:e-2x dx.
v2rr

4.9 Solutions. Multivariate normal distribution


1. Since Vis symmetric, there exists a non-singular matrix M such that M' = M- 1 and V =
MAM- 1, where A is the diagonal matrix with diagonal entries the eigenvalues A1, Az, ... , An of V.
I

Let A 2 be the diagonal matrix with diagonal entries

.Jil, ../Ai,, ... , $n; A 2 is well defined since

V is non-negative definite. Writing W = MA 2M', we have that W = W' and also

as required. Clearly W is non-singular if and only if A 2 is non-singular. This happens if and only if
Ai > 0 for all i, which is to say that V is positive definite.
By Theorem (4.9.6), Y has the multivariate normal distribution with mean vector 0 and covariance
matrix

2.

Clearly Y = (X - J.L)a' + J.La' where a = (a1, az, ... , an). Using Theorem (4.9.6) as in the
previous solution, (X - J.L )a' is univariate normal with mean 0 and variance aVa'. Hence Y is normal
with mean J.La' and variance aVa'.

3.

4. Make the transformation u = x + y, v = x- y, with inverse x = !<u


that IJ I =
The exponent of the bivariate normal density function is

!.

----,2.-(x2- 2pxy
2(1 - p )

+ Y, V

and therefore U = X

+ i) = -

2 { u2(1- p)

4(1- p )

+ v), y

= !<u- v), so

+ v2(1 + p) },

= X - Y have joint density

1
{
u2
v2
}
f(u, v) = 4rrv'l=P2 exp - 4(1 + p) - 4(1- p) ,
whence U and V are independent with respective distributions N(O, 2(1

5.

That Y is N (0, 1) follows by showing that JP'(Y

y) = lP'(X

+ p)) and N(O, 2(1 -

y) for each of the cases y

p )).
~

-a,

IYI <a, y;,:: a.


Secondly,
p(a)=lE(XY)=

a x 2 ifJ(x)dx-

-a

~-a x 2 ifJ(x)dx-

-oo

1oo x ifJ(x)dx=1-4 1oo x ifJ(x)dx.


2

201

[4.9.6]-[4.10.1]

Solutions

Continuous random variables

The answer to the final part is no; X and Y are N (0, 1) variables, but the pair (X, Y) is not bivariate
normal. One way of seeing this is as follows. There exists a root a of the equation p(a) = 0. With
this value of a, if the pair X, Y is bivariate normal, then X andY are independent. This conclusion is
manifestly false: in particular, we have that lP'(X > a, Y > a) # lP'(X > a)lP'(Y > a).

6.

Recall from Exercise (4.8.7) that for any pair of centred normal random variables
E(X I Y) = cov(X, Y) Y,
varY

var(X

I Y)

= {1- p(X, Y) 2} var X.

The first claim follows immediately. Likewise,

7. As in the above exercise, we calculate a = E(X I I ~7 Xr) and b = var(X I I ~7 Xr) using the
facts that var XI = vu, var(~7 Xi) = ~ij ViJ and cov(XI, ~7 Xr) = ~r VIr
8.

Let p = lP'(X > 0, Y > 0, Z > 0) = lP'(X < 0, Y < 0, Z < 0). Then
1- p = lP'({X > 0} U {Y > 0} U {Z > 0})
= lP'(X > 0)

+ lP'(Y

> 0)

+ lP'(Z

> 0)

+p

- lP'(X > 0, Y > 0) - lP'(Y > 0, Z > 0) - lP'(X > 0, Z > 0)

9.

[3

. -I
. -I
. -I }]
2 + p - 4 + 2rr1 {sm
PI + sm P2 + sm P3 .

Let U, V, W be independent N(O, 1) variables, and represent X, Y, Z as X = U, Y = PI U

J1- p[V,

z=

p 3U

2 P32 + 2PIP2P3
1 -PI2 - P2-

+ P2- PIP3 V +
J1-p[

(1 -PI)

We have that U =X, V = (Y- PIX)/V1- Pf and E(Z


conditional variance.

w.

I X, Y) follows immediately, as does the

4.10 Solutions. Distributions arising from the normal distribution


1.

x2(m) density function is

First method. We have from (4.4.6) that the

The density function of Z = X I

2:0.

+ X 2 is, by the convolution formula,

() lo x2

z lm-I e -!x(
._ z- x )ln-I
._
e -l(z-x)d
2
x

g z = c

= cz2(m+n -Ie-2z

I
I
Jo{I u'Zm-I(lu)Zn-I du

202

Solutions [4.10.2]-[4.10.6]

Distributions arising from the normal distribution

by the substitution u = xjz, where cis a constant. Hence g(z) = c'z~(m+n)- 1 e-~z for z.:,:: 0, for
an appropriate constant c', as required.
Second method. If m and n are integral, the following argument is neat. Let Z1, Z2, ... , Zm+n be
independent N(O, 1) variables. Then X1 has the same distribution as Zf + Zi + + Z~, and X2
the same distribution as Z~+ 1 + Z~+ 2 + + Z~+n (see Problem (4.14.12)). Hence X1 + X2 has
the same distribution as Zf + + Z~+n i.e., the x2(m + n) distribution.

2. (i) The t(r) distribution is symmetric with finite mean, and hence this mean is 0.
(ii) Here is one way. Let U and V be independent x2 (r) and x2 (s) variables (respectively). Then

by independence. Now U is f(i,

ir) and Vis f(i, is), so that E(U) =rand


I

E(V

-1

lo0012-s/2 's-1-'v
r(is-1)lo002-:z(s-2) 's-2-'v
1
)=
---v:Z
e :Z dv=
v:Z
e 2 dv=-0
v r(is)
2f(is)
0
r(is- 1)
s- 2

if s > 2, since the integrand is a density function. Hence

s
( Ujr)
Vjs =s-2

if s > 2.

(iii) If s ::; 2 then E(v- 1) = oo.

3.

Substitute r = 1 into the t (r) density function.

First method. Find the density function of X I Y, using a change of variables. The answer is
F(2, 2).

4.

Second method. X and Y are independent


hence X/Y is F(2, 2).

x2 (2)

variables (just check the density functions), and

5. The vector (X, X1- X, X2- X, ... , Xn- X) has, by Theorem (4.9.6), a multivariate normal
distribution. We have as in Exercise (4.5.7) that cov(X, Xr -X) = 0 for all r, which implies that X
is independent of each X r. Using the form of the multivariate normal density function, it follows that
X is independent of the family {Xr -X : 1 ::; r ::; n}, and hence of any function of these variables.
Now s2 = (n- 1)- 1 L:r(Xr- X) 2 is such a function.
6. The choice of fixed vector is immaterial, since the joint distribution of the Xj is spherically
symmetric, and we therefore take this vector to be (0, 0, ... , 0, 1). We make the change of variables
U2 = Q2 +X~, tan W = Q/ Xn, where Q2 = 2::~;;;;;}
and Q .:,:: 0. Since Q has the x2(n - 1)
distribution, and is independent of Xn, the pair Q, Xn has joint density function

x;

-~x 2 l(lq)!(n-3)e-~x
/( x - _e__ . 2 2
q' ) - ../27i ~r'--(--;i-(n---1-)) -

X E

JR, q > 0.

The theory is now slightly easier than the practice. We solve for U, W, find the Jacobian, and deduce
the joint density function fu,'-l!(u, 1/1) of U, W. We now integrate over u, and choose the constant so
that the total integral is 1.

203

[4.11.1]-[4.11. 7]

Continuous random variables

Solutions

4.11 Solutions. Sampling from a distribution


1.

Uniform on the set {1, 2, ... , n}.

2.

The result holds trivially when n = 2, and more generally by induction on n.

3. We may assume without loss of generality thatA = 1 (since ZIA. is f(A., t) if Z is f(l, t)). Let U,
V be independent random variables which are uniformly distributed on [0, 1]. We set X= -t log V
and note that X has the exponential distribution with parameter 1It. It is easy to check that
for x > 0,
where c = tt e-t+ 1I f(t). Also, conditional on the event A that
U <
-

xt-1 -t

e
f(t)

te-Xft

X has the required gamma distribution. This observation may be used as a basis for sampling using
the rejection method. We note that A= {log U :=:: (n- l)(log(XIn)- (X In)+ 1) }. We have that
IF'( A) = 11c, and therefore there is a mean number c of attempts before a sample of size 1 is obtained.
4. Use your answer to Exercise (4.11.3) to sample X from f(l, a) andY from f(l, {3). By Exercise
(4.7.14), Z = XI(X + Y) has the required distribution.

S. (a) This is the beta distribution with parameters 2, 2. Use the result of Exercise (4).
(b) The required r (1, 2) variables may be more easily obtained and used by forming X = - log( U1U2)
and Y -log(U3 U4) where {Ui : 1 _::: i :=:: 4} are independent and uniform on [0, 1].
(c) Let U1, U2, U3 be as in (b) above, and let Z be the second order statistic U(2) That is, Z is the
middle of the three values taken by the Ui; see Problem (4.14.21). The random variable Z has the
required distribution.
(d) As a slight variant, take Z = max{U1, U2} conditional on the event {Z :=:: U3}.
(e) Finally, let X = .../UlI (.../Ul + ..JU2), Y = .../Ul + ..jUi. The distribution of X, conditional on
the event {Y :=:: 1}, is as required.
6. We use induction. The result is obvious when n = 2. Let n 2: 3 and let p = (P1, P2, ... , Pn) be
a probability vector. Since p sums to 1, its minimum entry P(1) and maximum entry P(n) must satisfy

1
1
P(1) :S - < - - ,
n
n -1

P(1) + P(n) 2: P(1) +

We relabel the entries of the vector p such that P1


(n -l)p1, 0, ... , 0). Then

1
n -2
P = - -1 v1 + - -1 Pn-1
nn-

where

1-P(1)
n- 1

1+(n-2)P(1)
1
2: --1
n- 1
n- .

= P(l) and P2 = P(n) and set v1 = ((n -1) P1, 1-

n-1
(
1
)
Pn-1 = - 0, P1 + P2- - - , P3, , Pn ,
n-2
n-1

is a probability vector with at most n - 1 non-zero entries. The induction step is complete.
It is a consequence that sampling from a discrete distribution may be achieved by sampling from
a collection of Bernoulli random variables.

7. It is an elementary exercise to show that lP'( R2 :=:: 1) = TC, and that, conditional on this event, the
vector (T1, T2) is uniformly distributed on the unit disk. Assume henceforth that R 2 :=:: 1, and write
(R, 8) for the point (T1, T2) expressed in polar coordinates. We have that R and e are independent
with joint density function fR,e(r, 8) = r In, 0 :=:: r :=:: 1, 0 :=:: () < 2n. Let (Q, W) be the polar

204

Solutions

Coupling and Poisson approximation

coordinates of (X, Y), and note that 111 =

e and e-2I Q2

[4.11.8]-[4.12.1]

= R2 . The random variables

Q and 111 are

I 2

independent, and, by a change of variables, Q has density function fQ(q) = qe-1q , q > 0. We
recognize the distribution of (Q, 111) as that of the polar coordinates of (X, Y) where X and Y are
independent N(O, 1) variables. [Alternatively, the last step may be achieved by a two-dimensional
change of variables.]
8.

We have that

9.

The polar coordinates (R, 8) of (X, Y) have joint density function


fR e(r, e)=
'

2r

-,
T(

Make a change of variables to find that y I X

= tan e has the Cauchy distribution.

10. By the definition of Z,


m-1

JP(Z

IT

(1- h(r))
r=O
JP(X > O)JP(X > 1 I X> 0) .. li"(X

= m) = h(m)

= m I X>

m- 1)

= JP(X = m).

11. Suppose g is increasing, so that h(x) = - g(1- x) is increasing also. By the FKG inequality of
Problem (3.11.18b), K = cov(g(U),-g(1 - U)) 2::: 0, yielding the result.

Estimating I by the average (2n) - 1 I:~~ 1 g (Ur) of 2n random vectors U r requires a sample
of size 2n and yields an estimate having some variance 2na 2 . If we estimate I by the average
(2n)- 1 {2:~= 1 g(Ur) + g(1 - Ur) }, we require a sample of size only n, and we obtain an estimate
with the smaller variance 2n(a 2 - K).
12. (a) By the law of the unconscious statistician,
lE [g(Y)fx(Y)J
fy(Y)

g(y)fx(Y)f ( )d =I.
fy(y)
y y y

(b) This is immediate from the fact that the variance of a sum of independent variables is the sum of
their variances; see Theorem (3.3.11b).
(c) This is an application of the strong law of large numbers, Theorem (7.5.1).

13. (a) If U is uniform on [0, 1], then X = sin(~rr U) has the required distribution. This is an example
of the inverse transform method.
(b) If U is uniform on [0, 1], then 1- U 2 has density function g(x) = {2~} - 1 , 0 ::=: x ::=: 1.
Now g(x) 2::: (rr/4)/(x), which fact may be used as a basis for the rejection method.

4.12 Solutions. Coupling and Poisson approximation


1.

Suppose that lE(u(X)) 2::: lE(u(Y)) for all increasing functions u. Let c E ffi. and set u
lc(x) = {

1 if X >
.
0 If X :S

205

C,
C,

= Ic where

[4.12.2]-[4.13.1]

Solutions

Continuous random variables

to find that lP'(X > c) = lE(lc(X)) 2: lE(lc(Y)) = lP'(Y > c).


Conversely, suppose that X 2:st Y. We may assume by Theorem (4.12.3) that X and Y are
defined on the same sample space, and that lP'(X 2: Y) = 1. Let u be an increasing function. Then
lP'(u(X) 2: u(Y)) 2: lP'(X 2: Y) = 1, whence lE(u(X) - u(Y)) 2: 0 whenever this expectation exists.
2. Let a= fl-f/..., and let Ur : r 2: 1} be independent Bernoulli random variables with parameter a.
Then Z =I:;= I lr has the Poisson distribution with parameter J...a = fl-, and Z _:::X.
3.

Use the argument in the solution to Problem (2.7.13).

4.

For any A

R,

lP'(X

#-

Y) 2: lP'(X E A, Y E Ac) = IP'(X E A) -lP'(X E A, Y E A)

2: IP'(X E A) - IP'(Y E A),


and similarly with X and Y interchanged. Hence,
IP'(X

#-

Y) 2: sup jiP'(X E A) -IP'(Y E A)j = ~dTV(X, Y).


A9R

5. For any positive x andy, we have that (y- x)+ + x


follows that

1\

y = y, where x

2)fx(k)- Jy(k)}+ = 2:)Jy(k)- fx(k)}+ = 1k

fx(k)

1\

y = min{x, y}. It

fy(k),

and by the definition of dTv(X, Y) that the common value in this display equals ~dTv(X, Y) = 8.
Let U be a Bernoulli variable with parameter 1-8, and let V, W, Z be independent integer-valued
variables with
IP'(V = k) = Ux(k)- fy(k)}+ /8,
lP'(W

= k) = {fy(k)

IP'(Z = k) = fx(k)

- fx(k)}+ /8,
A

fy(k)/(1 - 8).

Then X'= UZ + (1- U)V andY'= UZ + (1- U)W have the required marginals, and IP'(X' =
Y') = lP'(U = 1) = 1 - 8. See also Problem (7.11.16d).
6. Evidently dTv(X, Y) = Jp- qJ, and we may assume without loss of generality that p 2: q. We
have from Exercise (4.12.4) that lP'(X = Y) _::: 1 - (p- q). Let U and Z be independent Bernoulli
variables with respective parameters 1- p+q andq/(1-p+q). The pair X'= U(Z-1)+ 1, Y' = UZ
has the same marginal distributions as the pair X, Y, and IP'(X' = Y') = IP'(U = 1) = 1 - p + q as
required.
To achieve the minimum, we set X" = 1 - X' and Y" = Y', so that IP'(X" = Y") = 1 -IP'(X' =
Y') = P- q.

4.13 Solutions. Geometrical probability


1. The angular coordinates \II and I; of A and B have joint density f(1/J, u) = (2n)- 2 . We make
the change of variables from (p,B) r+ (1/J,u) by p = cos{~(u -1/1)}, e = ~(n +u +1/J), with
inverse
,,,
r
-r p, U = 8COS-I p,
'I' = 8 - 27i -cos

in+

and Jacobian Ill= 2/~.

206

Solutions [4.13.2]-[4.13.6]

Geometrical probability

2. Let A be the left shaded region and B the right shaded region in the figure. Writing A for the
random line, by Example (4.13.2),
IP'(A meets both Stand S2)

= IP'(A meets both A and B)


= IP'(A meets A) + IP'(A meets B) <X b(A) + b(B) - b(H) = b(X) -

IP'(A meets either A or B)


b(H),

whence IP'(A meets S2 I A meets St) = [b(X) - b(H)]/b(St).


The case when S2 s; St is treated in Example (4.13.2). When St n S2 =f. 0 and St A S2 =1- 0,
the argument above shows the answer to be [b(St) + b(S2) - b(H)]/b(St).

3. With III the length of the intercept I of Al with S2, we have that IP'(A2 meets I)
by the Buffon needle calculation (4.13.2). The required probability is
1 [ 2n

2 Jo
4.

21II

00

dpdB

-oo b(St) . b(St)

[ 2n

= Jo

IS2I
b(S 1) 2 dB

= 21II/b(St),

2niS21
b(S 1) 2 .

If the two points aredenotedP = (Xt, Yt), and Q = (X2, Y2), then

We use Crofton's method in order to calculate lE(Z). Consider a disc D of radius x surrounded by an
annulus A of width h. We set A(x) = lE(Z I P, Q E D), and find that

A(x +h) = A(x) ( 1 - :h - o(h)) + 2JE(Z I P E D, Q E A)


Now
JE(Z I P

D, Q

A)

=-

nx

k~nk2xcos0 2
r dr dB
o o

c:

+ o(1)

+ o(h)) .

32x

= -,
9n

whence

dA
4A
128
dx
x
9n
which is easily integrated subject to A(O) = 0 to give the result.

-=--+-,

5. (i) We may assume without loss of generality that the sphere has radius 1. The length X = IAOI
has density function f(x) = 3x 2 for 0 :::=: x :::=: 1. The triangle includes an obtuse angle ifB lies either
in the hemisphere opposite to A, or in the sphere with centre ~X and radius ~X, or in the segment
cut off by the plane through A perpendicular to AO. Hence,

(ii) In the case of the circle, X has density function 2x for 0 :::=: x :::=: 1, and similar calculations yield

1
1
1
~
3
IP'(obtuse) = - +- + -lE(cos- 1 X- Xy 1- X2) = -.

6.

Choose the x-axis along AB. With P


lEIABPI

JT

(X, Y) and G

= (Yt, Y2),

= ~IABilE(Y) = ~IABin =
207

IABGI.

[4.13.7]-[4.13.11]
7.

Solutions

Continuous random variables

We use Exercise (4.13.6). First fix P, and then Q, to find that

With b = lAB I and h the height of the triangle ABC on the base AB, we have that IGt G2l = ~ b and
the height of the triangle AGt G2 is ~h. Hence,
lEIAPQI = ~ ~b ~h = ~IABq.

8. Let the scale factor for the random triangle be X, where X E (0, 1). For a triangle with scale
factor x, any given vertex can lie anywhere in a certain triangle having area (1 - x) 2 1ABq. Picking
one at random from all possible such triangles amounts to supposing that X has density function
f(x) = 3(1 - x) 2 , 0::::: x ::::: 1. Hence the mean area is

9.

We have by conditioning that, for 0 :::::

z ::::: a,

a )
F(z, a+ da) = F(z, a) ( - -

a+da

= F(z, a)

( 1 - -2da)
a

+JP>(X :::::_a- z)

2ada
2 +o(da)
(a+da)

+-az -2da
+ o(da),
a

and the equation follows by taking the limit as da .j, 0. The boundary condition may be taken to be
F(a, a) = 1, and we deduce that

2z F(z, a) = --;;

(z-;; )2 ,

0 ::::: z ::::: a.

Likewise, by use of conditional expectation,

2da)
2da
mr(a+da)=mr(a) ( 1-~ +E((a-xn-~+o(da).
Now, JE((a-XY) = ar /(r+ 1), yieldingtherequiredequation. The boundaryconditionismr (0) = 0,
and therefore

2ar

mr(a)

(r

+ 1)(r + 2)

10. If n great circles meet each other, not more than two at any given point, then there are 2G)
intersections. It follows that there are 4(~) segments between vertices, and Euler's formula gives the
number of regions as n(n - 1) + 2. We may think of the plane as obtained by taking the limit as
R --+ oo and 'stretching out' the sphere. Each segment is a side of two polygons, so the average
number of sides satisfies
4n(n- 1)
-------+4
2 +n(n- 1)

as n--+ oo.

11. By making an affine transformation, we may without loss of generality assume the triangle has
vertices A= (0, 1), B = (0, 0), C = (1, 0). With P = (X, Y), we have that
L=

c~

y,

0) ,

M=

(X: y,
208

X : y) ,

N= (

0, 1~X) .

Solutions [4.13.12]-[4.14.1]

Problems
Hence,
lEJBLNJ = 2

ABC

xy
dxdy =
2(1-x)(l- y)

and likewise lEJCLMJ = lEJANMJ =


n 2 )JABCJ.

in

2 -

klo (

x
) dx =11:2
3
-x- --logx
--1-x
6
2'

~ It follows that lEJLMNJ = !OO- n 2 ) = (10-

12. Let the points be P, Q, R, S. By Example (4.13.6),


JP>( one lies inside the triangle formed by the other three)

= 41P'(S

PQR)

= 4 -lz.

13. We use Crofton's method. Let m(a) be the mean area, and condition on whether points do or do
not fall in the annulus with internal and external radii a, a+ h. Then
m(a+h)=m(a)C:hr

+[~ +o(h)Jm(a).

where m(a) is the mean area of a triangle having one vertex P on the boundary of the circle. Using
polar coordinates with P as origin,

Letting h .J, 0 above, we obtain

dm
da

6m
a

6
a

35a 2
36n

-=--+---,
whence m(a) = (35a 2 )1(48n).

14. Let a be the radius of C, and let R be the distance of A from the centre. Conditional on R, the
required probability is (a - R) 2 I a 2 , whence the answer is JE((a - R) 2 I a 2 ) =
(1 - r )2 2r dr =

i.

JJ

15. Let a be the radius of C, and let R be the distance of A from the centre. As in Exercise (4.13.14),
the answer is lE((a- R) 3 la 3 ) =
(1- r) 3 3r 2 dr = 2~.

JJ

4.14 Solutions to problems


1.

(a) We have that

Secondly, f 2: 0, and it is easily seen that J~00 f(x) dx

1 using the substitution y

(x -

f.-L)I(a./2).
I

I 2

I 2

(b) The mean is J~00 x (2n)- 'Z e- 'Z x dx, which equals 0 since x e- 'Z x is an odd integrable function.
I

I 2

The variance is J~00 x 2 (2n)- 'Z e- 'Z x dx, easily integrated by parts to obtain 1.

209

[4.14.2]-[4.14.5]

Solutions

Continuous random variables

(c) Note that

I 2

and also 1 - 3y- 4 < 1 < 1 + y- 2. Multiply throughout these inequalities by e--zY f./iii, and
integrate over [x, oo ), to obtain the required inequalities. More extensive inequalities may be found
in Exercise (4.4.8).
(d) The required probability is a(x) = [1 - <t>(x + ajx)]/[1 - <t>(x)]. By (c),

as x-+ oo.

2.

Clearly

f :::: 0 if and only if 0 : :=: a

<

f3 : :=: 1. Also

3. The Ai partition the sample space, and i - 1 : :=: X(w) < i if wE Ai Taking expectations and
using the fact that JE(/i) = JP>(Ai), we find that S::::: lE(X) ::::: 1 + S where
00

s=

00

i=2

4.

i-1

00

00

2:> -1)1P'(Ai) = LL 11P'(Ai) = L L


i=2j=1

(a) (i) Let p- 1 (y)

00

J!D(Ai) = LJP>(X:::: j).

j=1 i=j+1

= sup{x: F(x) = y}, so that

lP'(F(X)::::: y) = J!D(X::::: p- 1 (y)) = F(F- 1(y)) = y,


(ii) JP>( -log F(X) :::S y)

j=1

= JP>(F(X)
= PR,

:::: e-Y)

= 1-

0::::: y:::::

1.

e-Y if y :::: 0.

(b) Draw a picture. With D

JP>(D :::=::d)= lP'(tanPQR :::=::d)= J!D(PQR : :=: tan- 1 d)=~

+tan- 1 d).

Differentiate to obtain the result.


5.

Clearly
JP>(X > s + x

IX

> s) =

J!D(X > s + x)
e-}.(s+x)
JP>(X > s) =
e-}.s
= e-Ax

if x, s :::: 0, where A is the parameter of the distribution.


Suppose that the non-negative random variable X has the lack-of-memory property. Then G (x) =
JP>(X > x) is monotone and satisfies G(O) = 1 and G(s + x) = G(s)G(x) for s,x :::: 0. Hence
G(s) = e-}.s for some A; certainly A> 0 since G(s) : :=: 1 for all s.
Let us prove the hint. Suppose that g is monotone with g(O) = 1 and g(s + t) = g(s)g(t)
for s, t :::: 0. For an integer m, g(m) = g(l)g(m - 1) = = g(l)m. For rational x = mfn,
g(x)n = g(m) = g(l)m so that g(x) = g(l)x; all such powers are interpreted as exp{x logg(l)}.
Finally, if x is irrational, and g is monotone non-increasing (say), then g(u) : :=: g(x) : :=: g(v) for all

210

Solutions [4.14.6]-[4.14.9]

Problems

rationals u and v satisfying v ::0 x ::0 u. Hence g(l)" ::0 g(x) ::0 g(l)v. Take the limits as u {. x and
v t x through the rationals to obtain g (x) = eJLx where 1-i = log g ( 1).

6. If X and Y are independent, we may take g


f(x, y) = g(x)h(y). Then
fx(x)

= g(x)

and
1=
Hence fx(x)fy(y)

r:

r:

= g(x)h(y) =

h(y) dy,

fy(y)dy

fx and h

fy(y)

= h(y)

J: f:
g(x)dx

fy. Suppose conversely that

r:

g(x) dx

h(y)dy.

f(x, y), so that X andY are independent.

7. TheyarenotindependentsincelP'(Y < 1, X> 1) = OwhereaslP'(Y < 1) > OandlP'(X > 1) > 0.


As for the marginals,

fx(x)

=1 2e-x-y dy =2e- x,
00

fy(y) =loy 2e-x-y dx

=2e-Y(l- e-Y),

for x, y ::::: 0. Finally,

roo roo xy2e-x-y dx dy = 1


and E(X) =1. E(Y) =~.implying that cov(X, Y) =!
E(XY)

Jx=O }y=x

8.

As in Example (4.13.1), the desired property holds if and only if the length X of the chord satisfies
X ::0 ,J3. Writing R for the distance from P to the centre 0, and E> for the acute angle between the chord
sin E>, and therefore J!D(X ::0 ,J3) = JP>(R sinE> ::::: ~ ).
and the line OP, we have that X = 2 1 The answer is therefore

lP'

R2 2

(R >- _1_)
= ~ r!rr lP' (R > _1_) d(}
2sinE>
rr Jo
- 2sin8
'

which equals ~ - ,J3j(2rr) in case (a) and ~

9.

+ rr- 1 log tan(rr /12) in case (b).

Evidently,
E(U)

=JP>(Y ::0 g(X)) =!1O:o;x,y:::;t dx dy =lot0 g(x) dx,


y:::;g(x)

=E(g(X)) =lot g(x)dx,


E(W) =1lot {g(x) +g(l-x)}dx =lot g(x)dx.
E(V)

Secondly,

2)=E(U) =J, E(V2)=lol g(x)2dx ::0 J since g ::0 1,


)=! { 2 lot g(x)2dx + 2 lot g(x)g(1- x) dx}
E(W 2
E(U

=E(V 2)-1 lot g(x){g(x)- g(l - x)} dx

rt {g(x)- g(l- x)}

= E(V 2 ) - ! Jo

211

dx ::0 E(V 2 ).

[4.14.10]-[4.14.11]
Hence var(W)

:::=::

Solutions

var(V)

:::=::

Continuous random variables

var(U).

10. Clearly the claim is true for n = 1, since the r(A., 1) distribution is the exponential distribution.
Suppose it is true for n :::=:: k where k :=::: 1, and consider the case n = k + 1. Writing fn for the density
function of Sn, we have by the convolution formula (4.8.2) that

which is easily seen to be the r(A., k

+ 1) density function.

11. (a) Let Z1, Z2, ... , Zm+n be independent exponential variables with parameter A.. Then, by
Problem (4.14.10), X' = Z1 + + Zm is f(A, m), Y' = Zm+1 + + Zm+n is r(A., n), and
X'+ Y' is f(A., m + n). The pair (X, Y) has the same joint distribution as the pair (X', Y 1), and
therefore X+ Y has the same distribution as X'+ Y', i.e., f(A., m + n).
(b) Using the transformation u = x + y, v = xf(x + y), with inverse x = uv, y = u(l - v), and
Jacobian

-I

J we find that U

fu

v v
1-

1-

u - -u '
-u

= X+ Y, V = Xf(X + Y) have joint density function


Am+n

'

v(u, v)

= fx ' y(uv, u(l- v))lul = r (m)r (n) (uv)m- 1{u(l- v)}n-le-}."u


= {

v)n-1}

J..m+n um+n-le-}.u } { vm-1(1r(m+n)


B(m,n)

for u :=::: 0, 0 :::=:: v :::=:: 1. Hence U and V are independent, U being f(A., m + n), and V having the beta
distribution with parameters m and n.
(c) Integrating by parts,

where X' is f(A., m- 1). Hence, by induction,


JP>(X > t) =

m-1

(A.tY

e-M _ _ = J!D(Z < m).

k!

k=O

(d) This may be achieved by the usual change of variables technique. Alternatively, reflect that, using
the notation and result of part (b), the invertible mapping u = x + y, v = xf(x + y) maps a pair
X, Y of independent (r(A., m) and f(A., n)) variables to a pair U, V of independent (f(A., m +n) and
B(m, n)) variables. Now UV =X, so that (figuratively)
'T(A., m

+ n) x

B(m, n) = r(A., m)".

Replace n by n - m to obtain the required conclusion.

212

Solutions [4.14.12]-[4.14.15]

Problems

12. (a) Z

= xf satisfies
z 2:0,

the f(~, ~)or x2 (1) density function.


(b) If z 2: 0, Z

= xf +X~ satisfies

the 2 (2) distribution function.


(c) One way is to work inn-dimensional polar coordinates! Alternatively use induction. It suffices
to show that if X and Y are independent, X being x2 (n) and Y being x2 (2) where n 2: 1 , then
Z =X+ Y is x2 (n + 2). However, by the convolution formula (4.8.2),

z 2:0,
for some constant c. This is the x2 (n

+ 2) density function as required.

13. Concentrate on where x occurs in fxly(x

f XIY (x I Y ) =

I y); any multiplicative constant can be sorted out later:

2
Jx,y(x,y) = c1 ()
{
1
(x
2xp,l
y exp - - - - 2px(y-M))}
fy(y)
2(1 - p 2 ) a[
a[
awz

by Example (4.5.9), where q (y) depends on y only. Hence

X E

IR,

for some cz(y). This is the normal density function with mean 11-1 +pal (y- p,z)/az and variance
a[O - p 2 ). See also Exercise (4.8.7).

14. Set u = yjx, v = x, with inverse x = v, y = uv, and Ill= lvl. Hence the pair U = Y/X,
V = X has joint density fu. v(u, v) = fx.r(v, uv)lvl for -oo < u, v < oo. Therefore fu(u) =
f~oo f(v, uv)lvl dv.
15. By the result of Problem (4.14.14), U = Y I X has density function
fu(u)

L:

f(y)f(uy)IYI dy,

and therefore it suffices to show that U has the Cauchy distribution if and only if Z = tan -l U is
uniform on (-~n, ~n). Clearly
IP'(Z::; 0) = IP'(U :StanO),

213

-~n <

e<

~n,

[4.14.16]-[4.14.17]

Solutions

Continuous random variables

= fu(tanO)sec 2 e.

whence fz(O)

Therefore fz(O)

= n- 1 (for 101

fu(u) = n(l
When

+ u2)'

-00

< j;n)ifandonly if

< u < 00.

f is the N(O, 1) density,

1-oooo

f(x)f(xy)lxldx = 2

looo -e-2x
1
(l+y )xdx,
1 2

o 2n

which is easily integrated directly to obtain the Cauchy density. In the second case, we have the
following integral:

1-oo

a2 1xl

00

In this case, make the substitution

16. The transformation X

(1 +x4)(1 +x4y4) dx.

z = x 2 and expand as partial fractions.

= r COS 0, y = r sin 0 has Jacobian J = r, SO that


r > 0,

0:::::

e < 2n.

Therefore Rand 8 are independent, 8 being uniform on [0, 2n), and R 2 having distribution function

this is the exponential distribution with parameter j; (otherwise known as f(i, 1) or


density function of R is fR(r)
Now, by symmetry,

I 2

= re-2r

x2 (2)).

The

for r > 0.

In the first octant, i.e., in {(x, y) : 0 :::=:: y :::=:: x}, we have min{x, y} = y, max{x, y} = x. The
joint density Jx,Y is invariant under rotations, and hence the expectation in question is

!o

y
-Jx,y(x,y)dxdy=8

O.:::;y.:::;x X

lerr:f41oo
1:1=0

r=O

tane
1 2
2
- 2-re-'Zr drd0=-log2.
7i

7i

17. (i) Using independence,


lP'(U

:::=::

u) = 1-lP'(X > u, Y > u) = 1- (1- Fx(u))(l- Fy(u)).

Similarly
lP'(V

(ii) (a) By (i), lP'(U


(b) Also, Z = X

:::=::

u)

:::=::

v)

= lP'(X :::=:: v, Y :::=:: v) = Fx(v)Fy(v).

= 1- e- 2u for u :=::: 0.

+ j; Y satisfies

lP'(Z > v) =

looo lP'(Y > 2(v- x) )fx(x) dx = lov e- 2(v-x)e-x dx +

= e-2v(ev-

1)

+ e-v = 1- (1- e-v)2 = lP'(V >


214

v).

00

e-x dx

Solutions [4.14.18]-[4.14.19]

Problems
Therefore lE(V) = lE(X)

~,and var(V) = var(X)

+ ilE(Y) =

+ ! var(Y) =

i by independence.

18. (a) We have that

(b) Clearly, for w > 0,


JP>(U ::: u, W > w) = JP>(U ::: u, W > w, X ::: Y)

+ JP>(U ::: u, W >

w, X > Y).

Now
JP>(U :S u, W > w, X :S Y) = J!D(X::: u, Y > X+ w) =lou .l..e-}..xe-JL(x+w) dx

= _J.._e-JLW(l- e-(A+JL)U)
.l..+tt

and similarly

Hence, for 0 ::: u ::: u

+w <

oo,

an expression which factorizes into the product of a function of u with a function of w. Hence U and
W are independent.

19. U =X+ Y, V =X have joint density function fy(u- v)fx(v), 0::: v::: u. Hence
fv u(v I u) = !u, v(u, v) =
1
fu(u)
(a) We are given that fvJU(V

I u)

u fy(u- v)fx(v)
.
fy(u- y)fx(y)dy

fo

= u- 1 for 0::: v::: u; then

1 lou fy(u- y)fx(y)dy


fy(u- v)fx(v) = u 0
is a function of u alone, implying that
fy(u- v)fx(v) = fy(u)fx(O)

fy(O)fx(u)

by setting v = 0
by setting v = u.

In particular fy(u) and fx(u) differ only by a multiplicative constant; they are both density functions,
implying that this constant is 1, and fx = fy. Substituting this throughout the above display, we
find that g(x) = fx(x)ffx(O) satisfies g(O) = 1, g is continuous, and g(u- v)g(v) = g(u) for
0::: v ::: u. From the hint, g(x) = e-}..x for x 2:: 0 for some./.. > 0 (remember that g is integrable).
(b) Arguing similarly, we find that

ru

c
fy(u- v)fx(v) = ua+.B- 1 va- 1(u- v).B- 1 Jo fy(u- y)fx(y)dy

215

[4.14.20]-[4.14.22]

Solutions

Continuous random variables

for 0::: v::: u and some constant c. Setting fx(v) = x(v)va- 1 , fy(y) = 1J(y)yf3- 1, we obtain
IJ(U- v)x (v) = h(u) for 0::: v::: u, and for some function h. Arguing as before, we find that 17 and
x are proportional to negative exponential functions, so that X and Y have gamma distributions.

20. We are given that U is uniform on [0, 1], so that 0 ::: X, Y :::

1almost surely. For 0 <

<

i.

E = J!D(X + Y <E) :S J!D(X < E, Y <E)= J!D(X < E) 2 ,

and similarly
E = J!D(X + Y > 1 -E) :S J!D(X >

implying that J!D(X <E) ::':

JE and J!D(X >

1- E, Y > ! -E) = J!D(X > ! -

! -E) ::':

JE.

E) 2 ,

Now

2E = JP>(!- E <X+ Y <!+E)::': JP>(X > ! - E, Y <E) +JP>(X < E, Y >!-E)


= 21P'(X >

i- E)lP'(X <E)::': 2(,JE")

2.

Therefore all the above inequalities are in fact equalities, implying that J!D(X < E) = J!D(X >

JE if 0 <

<

i. Hence a contradiction:

1- E) =

1
1 X+ Y ::': g)<
1
1
1
g1 = JP>(X + Y < g)=
J!D(X, Y < g1) -J!D(X, Y < g
J!D(X < 81 , Y < g)=
s

21. Evidently

LlP' (Xrr

lP'(X(l) :S Y1 ... , X(n) :S Yn) =

:S Yb ... , Xrrn :S Yn, Xrr1 < < Xrrn)

Jf

where the sum is over all permutations n = (n1, nz, ... , l!n) of (1, 2, ... , n). By symmetry, each
term in the sum is equal, whence the sum equals
n!J!D(X1 :S Y1 , Xn :S Yn. X1 < Xz < < Xn).

The integral form is then immediate. The joint density function is, by its definition, the integrand.
22. (a) In the notation of Problem (4.14.21), the joint density function of X(2), ... , X(n) is
gz(yz, , Yn)

Y2

g(y1, , Yn) dy1

-00

= n! L(yz, ... , Yn)F(yz)f(yz)f(Y3) f(Yn)


where F is the common distribution function of the Xi. Similarly X (3), ... , X (n) have joint density

and by iteration, X(k) ... , X(n) have joint density

nl

gk(Yk. , Yn) = (k _ 1)! L(Yb , Yn)F(yk)

k-1

f(Yk) f(Yn).

We now integrate over Yn. Yn-b ... , Yk+l in turn, arriving at


f x(k) (Yk ) = (k _ 1)!n!(n _ k)! F(yk) k-1{ 1 - F(yk) }n-kf (Yk )

216

Solutions [4.14.23]-[4.14.25]

Problems

(b) It is neater to argue directly. Fix x, and let I r be the indicator function of the event {Xr ::::; x}, and
letS= It + h ++ln. Then Sis distributed as bin(n, F(x)), and
lP'(X(k) ::::; x) = lP'(S;:: k) =

t (;)

F(x) 1(1 - F(x))n-l_

l=k

Differentiate to obtain, with F = F(x),


fx(k)(x)

t (;)

f(x) {tF 1- 1(1- F)n-l- (n -l)F1(1- F)n-l- 1 }

l=k

= k (:) f(x)Fk-1 (1- F)n-k


by successive cancellation of the terms in the series.
23. Using the result of Problem (4.14.21), the joint density function is g(y) = n! L(y)T-n for
0::::; Yi ::::; T, 1 ::::; i ::::; n, where y = (Y1, Y2, ... , Yn).
24. (a) We make use of Problems (4.14.22)-(4.14.23). The density function of X(k) is fk(x)
k(~)xk- 1 (1 - x)n-k for 0::::; x ::::; 1, so that the density function of nX(k) is

1
kn(n-l)(n-k+l) k- 1 (
x)n-k
- fk(x jn) = x
1- n
k!
nk
n
as n

1
k- -x
- - - x 1e
(k - 1)!

cxp. The limit is the f(l, k) density function.

(b) For an increasing sequencex(l), X(2) ... , X(n) in [0, 1], we define the sequence Un = -n 1ogx(n)
Uk = -klog(x(k)/X(k+l)) for 1 ::::; k < n. This mapping has inverse

X(k) = X(k+1)e

-u

kl

n
= exp { - ~
1

-1

u;

l=k

with Jacobian J = ( -l)ne-ul-u2--un jn!. Applying the mapping to the sequence X(l) X(2) ... ,
X(n) we obtain a family U1, U2, ... , Un of random variables with joint density g(u1, u2, ... , un) =
e-ul-u2--un for u; ;:: 0, 1 ::::; i ::::; n, yielding that the U; are independent and exponentially
distributed, with parameter 1. Finally log X (k) = - 'L/t=k i - 1U;.
(c) In the notation of part (b), Zk = exp( -Uk) for 1 ::::; k ::::; n, a collection of independent variables.
Finally, Uk is exponential with parameter 1, and therefore
lP'(Zk ::::; z) = lP'(Uk :=:: -log z) = elogz = z,

O<z:Sl.

25. (i) (X 1, X2, X 3) is uniformly distributed over the unit cube of R 3, and the answeris therefore the
volume of that set of points (x1, x2, x3) of the cube which allow a triangle to be formed. A triangle
is impossible if x1 ;:: x2 + x3, or x2 ;:: x1 + x3, or x3 ;:: x1 + x2. This defines three regions of
the cube which overlap only at the origin. Each of these regions is a tetrahedron; for example, the
region x3 :=:: x1 + x2 is an isosceles tetrahedron with vertices (0, 0, 0), (1, 0, 1), (0, 1, 1), (0, 0, 1),
with volume ~. Hence the required probability is 1 - 3 ~ = ~.
(ii) The rods oflength x1, x2, ... , Xn fail to form a polygon if either x1 ;:: x2 + x3 + + Xn or any of
the other n - 1 corresponding inequalities hold. We therefore require the volume of then-dimensional
hypercube with n comers removed. The inequality x1 ;:: x2 + x3 + + Xn corresponds to the convex
hull of the points (0, 0, ... , 0), (1, 0, ... , 0), (1, 1, 0, ... , 0), (1, 0, 1, 0, ... , 0), ... , (1, 0, ... , 0, 1).

217

[4.14.26]-[4.14.27]

Solutions

Continuous random variables

Mapping x1 1-+ 1 - x1, we see that this has the same volume Vn as has the convex hull of the origin
0 together with then unit vectors e1, e2, ... , en. Clearly V2 = ~.and we claim that Vn = 1/n!.
Suppose this holds for n < k, and consider the case n = k. Then

where Vk-1 (0, x1 e2, ... , x1 ek) is the (k- I)-dimensional volume of the convex hull ofO, x1 e2, ... ,
x1ek. Now

so that

{1

xk-1

vk = lo

(k

~ l)! dx1 =

1
k!.

The required probability is therefore 1- n/(n!) = I- {(n- 1)!}- 1.


26. (i) The lengths of the pieces are U = rnin{X1, X2}, V = IX1- X2l, W =I- U- V, and we
require that U < V + W, etc, as in the solution to Problem (4.14.25). In terms of the Xi we require
either:

IX1- X2l < ~.

1-X2<~,

or:

IX1- X2l < ~.

1-

x1

< ~-

t,

Plot the corresponding region of JR 2. One then sees that the area of the region is which is therefore
the probability in question.
(ii) The pieces may form a polygon if no piece is as long as the sum of the lengths of the others. Since
the total length is 1, this requires that each piece has length less than ~. Neglecting certain null events,
this fails to occur if and only if the disjoint union of events Ao U A 1 U U An occurs, where
Ao = {no break in (0,

~J},

Ak = {no break in (Xk, Xk

+ ~l} for I :::; k:::; n;

remember that there is a permanent break at I. Now lP'(Ao) = (~)n, and fork 2: 1,

+ 1)2-n whence the required probability is 1- (n + 1)2-n.


(tP j p) + (t-q /q), fort > 0, has a unique minimum att = 1, and hence

Hence lP'(Ao U A1 U U An)= (n

27. (a) The function g(t) =


g(t) 2: g(l) =I fort> 0. Substitute t = x 1fqy- 11P where
X

_....:.1X__:._l......{EIXPI}1/p'

(we may as well suppose that lP'(XY = 0) :1= l) to find that


IX!P
pEIXPI

IYiq

+ qEIYql

IXYI
2: {EIXPI}l/P{EIYql}l/q.

Holder's inequality follows by taking expectations.

218

Solutions [4.14.28]-[4.14.31]

Problems
(b) We have, with Z =IX+

Yl.

E(ZP) = E(Z zP- 1 )::::; E(IXIZp-I) + E(IYizp-I)

::::; {EIXP1} 1/P{JE:(ZP)} 1/q + {EIYPI} 1/P{JE:(ZP)}l/q


by Holder's inequality, where p- 1 + q- 1 = 1. Divide by {E(ZP)} 1/q to get the result.
28. Apply the Cauchy-Schwarz inequality to IZI!(b-a) and IZI!(b+a), where 0::::; a::::; b, to obtain
{EIZbl} 2 ::::; Elzb-al Elzb+al. Now take logarithms: 2g(b)::::; g(b- a)+ g(b +a) for 0::::; a::::; b.
Also g(p) --+ g(O) = 1 asp .J, 0 (by dominated convergence). These two properties of g imply that
g is convex on intervals of the form [0, M) if it is finite on this interval. The reference to dominated
convergence may be avoided by using HOlder instead of Cauchy-Schwarz.
By convexity, g(x)jx is non-decreasing inx, and therefore g(r)/r 2': g(s)js ifO < s::::; r.
29. Assume that X, Y, Z are jointly continuously distributed with joint density function
E(X I Y = y, Z = z) =

xfxiY z(x I y, z) d x =
'

f.

Then

f(x,y,z)d
x.
!r,z(y, z)

Hence

E{E(X I Y, Z) Y =

y} =
=

j E(X I Y = y, Z = z)fzly(z I y)dz

rr

f(x, y, z) fy,z(y, z) dxdz

11 fr.z(y, z) fy(y)
f(x,y,z)
~r x fy(y) dxdz=E(XIY=y).

Alternatively, use the general properties of conditional expectation as laid down in Section 4.6.
30. The first car to arrive in a car park of length x + 1 effectively divides it into two disjoint parks
of length Y and x - Y, where Y is the position of the car's leftmost point. Now Y is uniform on
[0, x], and the formula follows by conditioning on Y. Laplace transforms are the key to exploring the
asymptotic behaviour of m (x) / x as x --+ oo.

31. (a) If the needle were oflength d, the answer would be 2/:rr as before. Think about the new needle
as being obtained from a needle of length d by dividing it into two parts, an 'active' part of length L,
and a 'passive' part of length d- L, and then counting only intersections involving the active part.
The chance of an 'active intersection' is now (2/:rr)(L/d) = 2Lj(:rrd).
(b) As in part (a), the angle between the line and the needle is independent of the distance between
the line and the needle's centre, each having the same distribution as before. The answer is therefore
unchanged.
(c) The following argument lacks a little rigour, which may be supplied as a consequence of the
statement that S has finite length. ForE > 0, let XI, x2, ... , Xn be points on S, taken in order along
S, such that xo and Xn are the endpoints of S, and Ixi+ 1 - Xi I < E for 0 ::::; i < n; lx - y I denotes
the Euclidean distance from x to y. Let Ji be the straight line segment joining Xi to xi+ 1, and let /i
be the indicator function of {Ji n A :1= 0}. If E is sufficiently small, the total number of intersections
between Jo U lt U U ln-1 and S has mean
n-1

'L:EUi)
i=O

2 n-1
=lxi+l -xil

:rrd i=O

219

[4.14.32]-[4.14.34]

Solutions

Continuous random variables

by part (b). In the limit as E .j.. 0, we have that L::i JE(/i) approaches the required mean, while

2L(S)

n-I

-:rrd L
lxi+l- xil--+ - - .
i=O
:rrd
32. (i) Fix Cartesian axes within the gut in question. Taking one end of the needle as the origin,
the other end is uniformly distributed on the unit sphere of R 3 . With the X-ray plate parallel to the
(x, y )-plane, the projected length V of the needle satisfies V ~ v if and only if IZ I ::::; ~. where
Z is the (random) z-coordinate of the 'loose' end of the needle. Hence, for 0::::; v ::::; 1,
;-::--;:)
;-::--;:)
4:rr ~
;-::--;:)
lP'(V~v)=lP'(-Yl-v~::;z::;vl-v2)=
=vl-v 2 ,
4:rr
since 4:rr ~is the surface area of that part of the unit sphere satisfying lzl ::::; ~(use
Archimedes's theorem of the circumscribing cylinder, or calculus). Therefore V has density function

fv(v) = vj~ forO::::; v::::; 1.


(ii) Draw a picture, if you can. The lengths of the projections are determined by the angle 8 between
the plane of the cross and the X-ray plate, together with the angle w of rotation of the cross about an
axis normal to its arms. Assume that 8 and Ware independent and uniform on [0, ~:rr]. If the axis
system has been chosen with enough care, we find that the lengths A, B of the projections of the arms
are given by
B = Vsin 2 w + cos2 w cos2

A= Vcos2 w+ sin 2 wcos2 e,


with inverse

e,

w =tan -~M-A2
--2.
1- B

Some slog is now required to calculate the Jacobian J of this mapping, and the answer will be
!A,B(a, b)= 41ll:rr- 2 for 0 <a, b < 1, a 2 + b 2 > 1.

33. The order statistics of the Xi have joint density function

on the set I of increasing sequences of positive reals. Define the one-one mapping from I onto
(0, oo)n by
Yr = (n + 1- r)(xr -Xr-1) for 1 < r ::5 n,
YI = nxt,
with inverse Xr = L:k=l Yk/(n - k + 1) for r ~ 1. The Jacobian is (n !)- 1 , whence the joint density
function of Yt, Y2, ... , Yn is

~Ann! exp(- txi(Y)) =Anexp(- tYk)

n.

i=l

k=l

34. Recall Problem (4.14.4). First, Zi = F(Xi), 1 ::::; i::::; n, is a sequence of independent variables
with the uniform distribution on [0, 1]. Secondly, a variable U has the latter distribution if and only
if - log U has the exponential distribution with parameter 1.

220

Solutions [4.14.35]-[4.14.37]

Problems

ltfollowsthatLi = -log F(Xi), 1 ::S i ::S n, isasequenceofindependentvariables with the exponential distribution. The order statistics L(l) ... , L(n) are in order -log F(X(n)), ... , -log F(X(l)),
since the function -log FO is non-increasing. Applying the result of Problem (4.14.33), Et
-n log F(X(n)) and
Er

= -(n + 1- r){log F(X(n+l-r)

1 < r ::S n,

-log F(X(n+2-r)) },

are independent with the exponential distribution. Therefore exp(- Er ), 1 ::S r ::S n, are independent
with the uniform distribution.
35. One may be said to be in state j if the first j - 1 prizes have been rejected and the jth prize has
just been viewed. There are two possible decisions at this stage: accept the jth prize if it is the best
so far (there is no point in accepting it if it is not), or reject it and continue. The mean return of the
first decision equals the probability j In that the j th prize is the best so far, and the mean return of the
second is the maximal probability f (j) that one may obtain the best prize having rejected the first j.
Thus the maximal mean return V (j) in state j satisfies
V(j)

= max{jln, f(j)}.

Now j In increases with j, and f(j) decreases with j (since a possible stategy is to reject the (j + l)th
prize also). Therefore there exists J such that j In ::=:: f (j) if and only if j ::S J. This confirms the
optimal stategy as having the following form: reject the first J prizes out of hand, and accept the
subsequent prize which is the best of those viewed so far. If there is no such prize, we pick the last
prize presented.
Let n, be the probability of achieving the best prize by following the above strategy. Let Ak be
the event that you pick the kth prize, and B the event that the prize picked is the best. Then,

Ln

rr, =

lP'(B

Ak)lP'(Ak) =

k=J+l

(k)(
-

Ln
k=J+l

1)

1
-k- -1 k

Ln

J
=-

k=J+l

-k- ,
-1

and we choose the integer J which maximizes this expresion.


When n is large, we have the asymptotic relation rr, ~ (J In) log(nl J). The maximum of the
function hn(x) = (xln) log(nlx) occurs atx = nle, and we deduce that J ~ nle. [A version of this
problem was posed by Cayley in 1875. Our solution is due to Lindley (1961).]
36. The joint density function of (X, Y, Z) is
f(x, y, z)

1
exp{ -!(r 2 - 2A.x- 2p,y- 2vz
2
(2Jra ) 312

+ ;.2 + J-L 2 + v2 )}

where r 2 = x 2 + y 2 + z 2 . The conditional density of X, Y, Z given R = r is therefore proportional


to exp{A.x +fLY+ vz}. Now choosing spherical polar coordinates with axis in the direction (A., J-L, v),
we obtain a density function proportional to exp(a cosO) sine, where a = r)A.2 + J-L2 + v2. The
constant is chosen in such a way that the total probability is unity.
37. (a) 1 (x) = -x(x), so H1 (x) = x. Differentiate the equation for Hn to obtain Hn+l (x) =
xHn(x)- H~(x), and use induction to deduce that Hn is a polynomial of degree n as required.
Integrating by parts gives, when m ::S n,

L:

Hm(x)Hn(x)(x) dx = (-l)n

L:
L:

(-l)n-1

Hm(x)(n)(x) dx
H:n(x)(n-1)(x)dx

= ... = (-l)n-m ~oo H~m\x)(n-m)(x) dx,


-00

221

[4.14.38]-[4.14.39]

Solutions

Continuous random variables

and the claim follows by the fact that H~m) (x) = m!.
(b) ~y Taylor's theorem and the first part,
tn
t)n
L -Hn(x)
= L --=--cp(n)(x) = (x- t),
oo

l/J(x)

oo (

n=O n!

n!

n=O

whence

38. The polynomials of Problem (4.14.37) are orthogonal, and there are unique expansions (subject
to mild conditions) of the form u(x) = 2::;~ 0 arHr(x) and v(x) = 2::;~ 0 brHr(x). Without loss of
generality, we may assume that E(U) = E(V) = 0, whence, by Problem (4.14.37a), ao = bo = 0.
By (4.14.37a) again,
00

00

var(U) = E(u(X) 2 ) = L:a?r!,


r=1

L b?r!.

var(V) =

r=1

By considering the coefficient of sm tn,


pnnl
E(Hm(X)Hn(Y)) = { 0

ifm=n,
ifm =/=n,

and so

where we have used the Cauchy-Schwarz inequality at the last stage.

= X(r) - X(r-1) with the convention that X(Q) = 0 and X(n+1) = 1. By Problem
(4.14.21) and a change of variables, we may see that Y1, Y2, ... , Yn+1 have the distribution of a point
chosen uniformly at random in the simplex of non-negative vectors y = (Y1, Y2, ... , Yn+ 1) with sum
1. [This may also be derived using a Poisson process representation and Theorem (6.12.7).] Consequently, the Yj are identically distributed, and their joint distribution is invariant under permutations

39. (a) Let Yr

of the indices of the Yj. Now I;~~f Yr = 1 and, by taking expectations, (n


E(X(r)) = rE(Y1) = r/(n + 1).
(b) We have that
E(Y 2 )
1

loo
[(I:
1

x 2n(1 - x)n- 1 dx

1= E

Yr) ] = (n

2
(n

+ 1)(n + 2)

+ 1)E(Yl) =

+ 1)E(Y[) + n(n + 1)E(Y1 Y2),

r=1

222

1, whence

Solutions [4.14.40]-[4.14.42]

Problems
implying that
I
JE(Y1 Y2) = (n + I)(n + 2),
and also

2
r(s + I)
JE(X(r)X(s)) = rlE(Y1 ) + r(s- I)JE(Y1 Y2) = -:-(n-+--'-:-I):-:(-n-'-+-:2-:-)

The required covariance follows.

40. (a) By paragraph (4.4.6), X 2 is f(!, !) and Y2 + z 2 is f(!, I). Now use theresultsofExercise
(4.7.I4).
(b) Since the distribution of x 2I R 2 is independent of the value of R 2 = x 2 + Y2 + z 2 ' it is valid
also if the three points are picked independently and uniformly within the sphere.

41. (a) Immediate, because the N(O, I) distribution is symmetric.


(b) We have that

L:

2(x)<I>(A.x) dx =
=

L:
L:

{(x)<I>(A.x) + (x)[l- <1>(-A.x)J} dx

L:

(x)dx +

(x)[<I>(A.x)- <1>(-J...x)]dx =I,

because the second integrand is odd. Using the symmetry of, the density function of IYI is
(x) + (x){ <I>(A.x)- <1>(-J...x)} + (-x) + (-x){ <1>(-A.x)- <I>(A.x)} = 2(x).
(c) Finally, make the change of variables W = IYI. Z =(X+ A.IYI)/)I + A. 2, with inverse IYI = w,
x = zJI + J...2- A.w, and Jacobian )I+ A. 2. Then

Jw,z(w,

z) =)I+ A. 2fx.IYI (zJI

+ A. 2 - A.w, w)

=)I+ A. 2 (zJI + A. 2 - A.w) 2(w),

> 0,

R.

The result follows by integrating over w and using the fact that

42. The required probability equals

where U1, U2 are N(O,


therefore
p =

i), V1, V2 are N(O, ~),and U1, U2, V1, V2 are independent. The answer is

JP>(i(Nf + Nj') ::S iCN} + N}))

= 1P'(K1 ::::: ~ K2)


K1
= lP' ( K1 + K2 ::::;

I)

where the

Ni

are independent N(O, 1)

where the Ki are independent chi-squared


1

= lP'(B ::::; 4) = 4

223

x2(2)

[4.14.43]-[4.14.48]

Solutions

Continuous random variables

where we have used the result of Exercise (4.7.14), and B is a beta-distributed random variable with
parameters 1, 1.
43. The argument of Problem (4.14.42) leads to the expression

IP'(Uf + u:} + uj :::

Vf +

vf + vf)

= lP'(K 1 :::

1K2)

where the K; are x2 (3)

.J3
3- 4n'

1
1
= JP'(B ::; 4 ) =

where B is beta-distributed with parameters ~. ~


44. (a) Simply expand thus: JE[(X - p,) 3 ] = JE[X 3 - 3X2p, + 3Xp,2 - p, 3 ] where p, = lE(X).

(b) var(Sn) = na 2 and lE[(Sn - np,) 3 ] = nlE[(X 1 - p,) 3 ] plus terms which equal zero because
lE(X1- p,) = 0.
(c) If Y is Bernoulli with parameter p, then skw(Y) = (1 - 2p)/ .fii(j, and the claim follows by (b).
(d) m1 =A, m2 =A+ A2, m3 = A3 + 3A 2 +A, and the claim follows by (a).
(e)SinceAXisf(l, t), wemayaswellassumethatA = 1. ItisimmediatethatlE(Xn) = f(t+n)/f(t),
whence
t(t + l)(t + 2)- 3t. t(t + 1) + 2t 3
2
skw(X) =
t 312
= 0 .
45. We find as above that kur(X) = (m4- 4m3m1 + 6m2mi- 3mj)/a 4 where mk = lE(Xk).

(a) m4 = 3a 4 for the N(O, a 2 ) distribution, whence kur(X) = 3a 4 ja 4 .


(b) mr = r !jAr, and the result follows.
(c) In this case, m4 = A4 + 6A 3 + n 2 +A, m3 = A3 + 3A2 +A, m2 = A2 +A, and m1 =A.
(d) (varSn) 2 = n 2a 4 andlE[(Sn- nm1) 4 ] = nlE[(X1 -m1) 4 ] + 3n(n -l)a 4 .
46. We have as n

--+

oo that
-x)n

JP'(X(n) ::; x + logn) = {1- e-(x+Iogn)}n = ( 1- ~

--+ e-e-x.

By the lack-of-memory property,

1
1
lE(X(2)) =- + - -1 ,
n
nwhence, by Lemma (4.3.4),

!oo

oo

{1- e-e -x }dx = lim lE(X(n) -logn) = lim


n-+oo

n-+oo

-1 + -1- + + 1-logn ) = y.
n
n- 1

47. By the argument presented in Section 4.11, conditional on acceptance, X has density function
fs. You might use this method when fs is itself particularly tiresome or expensive to calculate. If
a(x) and b(x) are easy to calculate and are close to fs, much computational effort may be saved.
48. M = max{U1, U2, ... , Uy} satisfies

e1 - 1
JP'(M::; t) = lE(ty) = - - .
e -I

224

Solutions [4.14.49]-[4.14.51]

Problems
Thus,

lP'(Z ~

z) = lP'(X
=

Y ~

LzJ + 2) + IP'(X = LzJ + 1,

(e- l)e-LzJ-2
1- e-1

+ (e-

LzJ + 1- z)

LJ 1 elzJ+1-z- 1
.
e- 1

= e-z

l)e- z -

49. (a) Y has density function e-Y for y > 0, and X has density function fx(x) = ae-ax for x > 0.
Now Y ~ (X - a) 2 if and only if

I 2

which is to say that aVfx(X) ~ f(X), where a = a- 1e2" 01 .J2[ii. Recalling the argument of
Example (4.11.5), we conclude that, conditional on this event occurring, X has density function f.
(b) The number of rejections is geometrically distributed with mean a- 1, so the optimal value of a is
I 2

that which minimizes ae- 2" 01


(c) Setting
Z = { +X
-X

.fii72, that is, a

= 1.

1}
!

with probability
with probability

conditional on Y >

1(X -

a )2 ,

we obtain a random variable Z with the N (0, 1) distribution.

50. (a) lE(X)

fo

~du = n/4.

In

(b) lE(Y) = -

{2" sin B dB = 2/rr.

n lo

51. You are asked to calculate the mean distance of a randomly chosen pebble from the nearest
collection point. Running through the cases, where we suppose the circle has radius a and we write
P for the position of the pebble,
(i)

lEIOPI =

(ii)

lEIAPI

__!__2
na

lE(IAPI

(iii)

IBPI)

=-

fa r 2dr dB

= 2a.

lo!n lo2acos8

na

{ 2n

Jo Jo

r 2 drdB

32a

= -.
9n

= __.;. [ [!n fasece r 2drdB + [!n f 2acose r 2drde]


na

lo lo

4a { 16
3rr 3

l!n lo

~
1
~ }
- 617 v2+
2 1og(l +v2)

:::::::

2a

x 1.13.

(iv)
lE(IAPI

IBPI 1\ ICPI) = ---; {


na

r
r 2dr dB+ [!n f 2acose r 2dr dB}
lo
J!n lo

[!n

lo

where x =a sin(jrr) cosec(~rr- B)

1
= -2a {loj-n -3.J3
cosec 3 (n
- +e)
7r

= 2a { 16 7r

dB+

.!...!.,J3
+ 3.J3 log~} ::::::: 2a
4
16
2
3
225

~!n 8 cos 3 B dB }

!n

x 0.67.

[4.14.52]-[4.14.55]

Solutions

Continuous random variables

52. By Problem (4.14.4 ), the displacement of R relative to P is the sum of two independent Cauchy
random variables. By Exercise (4.8.2), this sum has also a Cauchy distribution, and inverting the
transformation shows that e is uniformly distributed.
~ occurs if and only if
the sum of any two parts exceeds the length of the third part.
(a) If the breaks are at X, Y, where 0 < X < Y < 1, then ~occurs if and only if 2Y > 1, and
2(Y - X) < 1 and 2X < 1. These inequalities are satisfied with probability

53. We may assume without loss of generality that R has length 1. Note that

i.

(b) The length X of the shorter piece has density function fx (x) = 2 for 0 ~ x ~ ~. The other pieces
are oflength (1- X)Y and (1- X)(l- Y), where Y is uniform on (0, 1). The event~ occurs if and
only if2Y < 2XY + 1 and X+ Y- XY >
and this has probability

i.

(c) The three lengths are X,

lo0 ~ { 2(1 1- X) -

1- 2x } dx = log(4/e).
2(1 - X)

io -X), iO- X), where X is uniform on (0, 1). The event~ occurs

if and only if X < ~(d) This triangle is obtuse if and only if

lx
2

2 (1- X)

which is to say that X >

.fi -

>

;;:;
v2

1. Hence,

IP'(obtuse

IM =

IP'( .fi- 1 < X < ! )


2 = 3- 2./i.
1
IP'(X < 2)

54. The shorter piece has density function fx(x) = 2 for 0 ~ x ~


IP'(R

r) = IP' ( -X1-X

i Hence,

r ) = -2r
-,
l+r

with density function fR(r) = 2/(1 - r) 2 for 0 ~ r ~ 1. Therefore,


1
lo1 1 r
JE(R) = lo IP'(R > r) dr =
- - dr = 2log2- 1,
0
o 1 +r
1
lo1 2r(l - r)
JE(R 2 )= lo 2r!P'(R>r)dr=
dr=3-4log2,
0
0
1 +r
and var(R) = 2 - (2log 2) 2.
55. With an obvious notation,

By a natural re-scaling, we may assume that a = 1. Now, X1 - X2 and Y1 - Y2 have the


same triangular density symmetric on (-1, 1), whence (X 1 - X2) 2 and (Y1 - Y2) 2 have distribution

226

Solutions [4.14.56]-[4.14.59]

Problems
I

function F(z) = 2../i- z and density function fz(z) = z-2" - 1, for 0:::::: z:::::: 1. Therefore R 2 has
the density f given by

f(r) =

1
r (-1 _ 1) (~
_ 1) dz
{ lo 1 ../i

1 (-../i

1 - 1) ( 1
- 1) dz
~

r-1

if0::S:r::S:1,
if 1 ::S: r ::S: 2.

The claim follows since

b1-1 - dz = 2 (.
sm
a ../i~

-1 ~ .-1l)
- - sm
r

56. We use an argument similar to that used for Buffon's needle. Dropping the paper at random
amounts to dropping the lattice at random on the paper. The mean number of points of the lattice in a
small element of area dAis dA. By the additivity of expectations, the mean number of points on the
paper is A. There must therefore exist a position for the paper in which it covers at least rA l points.
57. Consider a small element of surface dS. Positioning the rock at random amounts to shining light
at this element from a randomly chosen direction. On averaging over all possible directions, we see
that the mean area of the shadow cast by d S is proportional to the area of d S. We now integrate over
the surface of the rock, and use the additivity of expectation, to deduce that the area A of the random
shadow satisfies E(A) = cS for some constant c which is independent of the shape of the rock. By
considering the special case of the sphere, we find c =
It follows that at least one orientation of
the rock gives a shadow of area at least S.

;! .

;!

58. (a) We have from Problem (4.14.1lb) that Yr = Xr/(X1 ++X,) is independent of X1 +
+X,, and therefore of the variables Xr+1 Xr+2 , Xk+1 X1 + + Xk+1 Therefore Yr is
independent of {Yr+s : s ~ 1}, and the claim follows.
(b) Lets= x1 + ... + xk+1 The inverse transformation X1 = Z1S, xz = zzs, ... ' Xk = ZkS,
xk+1 = s - z1s- zzs- - ZkS has Jacobian
s

0
0

Z1
zz
=sk.

1=

Zk

-s

-s

-s

1- Z1- - Zk

The joint density function of X 1 , X 2, ... , X k, S is therefore (with a = I:~!} fJr),

{IT

r=1

)._fJr (zrs)f3r-1e-'Azrs } . )..fJk+! {s(l _ z 1 _ ... _ Zk)}fJk+! -1e-'As(l-zJ--zk)

f'(fJr)

f'(fJk+J)
k

f(A.,

f3, s)

(IT

zf'- 1) (1 - Z1 - - Zk)fJk+J- 1,

r=1

where

is a function of the given variables. The result follows by integrating overs.

59. Let C = (crs) be an orthogonal n x n matrix with Cni = 1/ ,fii for 1 ::S: i ::S: n. Let Yir =
2::~= 1 XisCrs. and note that the vectors Y, = (Y1, Yz, ... , Ynr), 1 ::S: r ::S: n, are multivariate
normal. Clearly EYir = 0, and
E(YirYjs) = LCrtCsuE(XitXju) = LCrtCsu8tuVij = LCrtCstVij = 8rsVij
t,u
t,u
t

227

[4.14.60]-[4.14.60]

Solutions

Continuous random variables

where Dtu is the Kronecker delta, since C is orthogonal. It follows that the set of vectors Y r has the
same joint distribution as the set of Xr. Since C is orthogonal, Xir = 2::~= 1 Csr Yis, and therefore

n-1
= LYisYjs- YinYjn = LYisYjs
s=1

This has the same distribution as Tij because the Yr and the X7 are identically distributed.

60. We sketch this. Let EIPQRI = m(a), and use Crofton's method. A point randomly dropped in
S(a + da) lies in S(a) with probability
2
2da
( -a- ) = 1 - +o(da).
a+da
a

Hence

dm
6m
6mb
-=--+--,
da
a
a
where mb(a) is the conditional mean of IPQRI given that Pis constrained to lie on the boundary of
S(a). Let b(x) be the conditional mean of IPQRI given that Plies a distance x down one vertical
edge.

T1

R1

T2

R2
By conditioning on whether Q and R lie above or beneath P we find, in an obvious notation, that

x)2
(a-x) 2
b(x) = ( ~ mR 1 + -amR2

+ 2x(a-x)
a2
mR 1,Rz

By Exercise (4.13.6) (see also Exercise (4.13.7)), mR 1 ,R2 = icia)(ia) = ~a 2 . In order to find
m R 1, we condition on whether Q and R lie in the triangles T1 or T2, and use an obvious notation.

i,

iax. Next, arguing as we did in


Recalling Example (4.13.6), we have that mT1 = mT2 =
that example,
1 }.
mT1 ,T2 = 21 94{ ax- 41 ax- 41 ax- gax
Hence, by conditional expectation,

m R!

141

4 'I7 2 ax

1
143
13
+ 41 4
'I7 zax + 2 9 gax = TOSax.

We replace x by a - x to find m Rz, whence in total

x _ (::_)213ax
b( ) - a
108

(a -x) 2 13a(a -x)


a
108

228

2x(a -x). a 2 _ ~a 2 _ 12ax


a2
8 - 108
108

12x 2

108

Solutions [4.14.61]-[4.14.63]

Problems

Since the height of P is uniformly distributed on [0, a], we have that

11oa b(x) dx =

mb(a) = -

11a 2
--.
108

We substitute this into the differential equation to obtain the solution m(a) = -{;ha 2 .
Turning to the last part, by making an affine transformation, we may without loss of generality
take the parallelogram to be a square. The points form a convex quadrilateral when no point lies
inside the triangle formed by the other three, and the required probability is therefore 1 - 4m (a) I a 2 =

1- ~ = ~
61. Choose four points P, Q, R, S uniformly at random inside C, and let T be the event that their
convex hull is a triangle. By considering which of the four points lies in the interior of the convex
hull of the other three, we see that IP'(T) = 41P'(S E PQR) = 4EIPQRIIICI. Having chosen P, Q, R,
the four points form a triangle if and only if S lies in either the triangle PQR or the shaded region A.
Thus, IP'(T) ={I AI+ EIPQRI}IICI, and the claim follows on solving for IP'(T).
62. Since X has zero means and covariance matrix I, we have that E(Z) = fL + E(X)L = fL, and the
covariance matrix of Z is E(L'X'XL) = L'IL = V.
63. Let D = (dij) = AB-C. The claim is trivial if D = 0, and so we assume the converse. Choose
i, k such that dik i= 0, and write Yi = L,}=t dijXj = S + dikXk Now IP'(Yi = 0) = E(IP'(xk =
- SI dik I S)). For any given S, there is probability at least ~ that Xk i= - S I dik. and the second claim
follows.
Let Xt, x2, ... , Xm be independent random vectors with the given distribution. If D i= 0, the
probability that Dxs = 0 for 1 :::; s :::; m is at most ( ~ )m, which may be made as small as required by
choosing m sufficiently large.

229

Generating functions and their applications

5.1 Solutions. Generating functions


1.

(a)lflsl<(1-p)- 1,

G(s)

~ sm
= !;:o

(n + mm - 1) pn(l -

p)m

1- s(1- p)

}n .

Therefore the mean is G'(l) = n(1 - p)/ p. The variance is G"(l) +G' (1) -G'(1) 2 = n(l - p)f p 2 .
(b) If lsi < 1,
G(s)

sm

m= 1

(_!_-m +1-1) = 1 + ( 1 -s s) log(1-s).


m

Therefore G' (1) = oo, and there exist no moments of order 1 or greater.
(c) If p <lsi< p- 1,
G(s) =

The mean is G'(l)


2.

m~oo
00

sm

1- p ) plml = 1- p
1+ P
1+ p

1 + _!_!!__ +
pfs
1- sp
1- (pjs)

= 0, and the variance is G"(l) = 2p(l- p)- 2 .

(i) Either hack it out, or use indicator functions lA thus:


00
n
( 00 n
)
(X- 1 n)
(1-sX)
1-G(s)
T(s)=:;s IP'(X>n)=E :;s I[n<X) =E :;s
=E ~ =
1 _s

(ii) It follows that


. {1-G(s)}
. G'(s)
T(l) = hm
= hm - - = G'(l) = E(X)
st1
1- s
st1
1
by L'Hopital's rule. Also,
1

T (1)

{-(1-s)G'(s)+1-G(s)}
= hm
2

st1

(1-s)

= ~G" (1) = ~ {var(X)


230

- G' (1) + G' (1) 2 }

Solutions [5.1.3]-[5.1.5]

Generating functions

whence the claim is immediate.


3.

(i) We have that Gx,y(s, t) = E(sXtY), whence Gx,y(s, 1) = Gx(s) and Gx,Y(l, t) = Gy(t).

(ii) IfEIXYI < oo then

a2
E(XY) = E (xYsx-ItY-I) I
= --Gx,y(s,
t) I
.
s=t=l
as at
s=t=l
4.

We write G (s, t) for the joint generating function.


00

(a)

G(s, t) =

L L sj tk(l- a)(f3- a)aj pk-

j-l

j=Ok=O

(as)j (1- a)(f3- a) . 1- (f3t)H 1


j=O f3
f3
1 - {Jt

if f31tl < 1

1
f3t }
(1-f3t)f3
1-(as/{3)-1-ast
(1 - a)(f3 -a)
(1 - ast)(f3- as)
(1 - a)(f3 -a) {

if~lsl
f3

< 1

(the condition alstl < 1 is implied by the other two conditions on s and t). The marginal generating
functions are
G(s, 1) = (1- a)(f3- a) , G(l, t) = -1-a
-,
1 -at
(1 - as)(f3- as)
and the covariance is easily calculated by the conclusion of Exercise (5.1.3) as a(l - a)- 2.
(b) Arguing similarly, we obtain G(s, t) = (e- 1)/{e(l- tes- 2)} if itles- 2 < 1, with marginals
G(s,l)=

1- e- 1
,
1- es- 2

G(l,t)=

1- e- 1
,
1- te- 1

and covariance e(e- 1)-2.


(c) Once again,
G(s, t) =

log{ 1 - tp(l - p + sp)}


log(l _ p)

if ltp(l - p

+ sp)l

< 1.

The marginal generating functions are


G(s, 1) =

and the covariance is

5.

log{ 1 - p + p 2 (1 - s)}
log(l _ p)
,

G(l, t) = log(l- tp),


log(l- p)

p 2 {p + log(l - p)}
(1- p)2{log(l- p)}2.

(i) We have that

231

[5.1.6]-[5.2.1]

Solutions

Generating functions and their applications

where p + q = 1.
(ii) More generally, if each toss results in one oft possible outcomes, the ith of which has probability
Pi, then the corresponding quantity is a function of t variables, x1, x2, ... , x 1, and is found to be
(p1x1 + P2X2 + + PtXt)n.

6.

We have that

JE(sx)

= lE{lE(sx I U)} =

loo

{1 + u(s- 1)}n du

= -- n+1

sn+1

1-s

the probability generating function of the uniform distribution. See also Exercise (4.6.5).

7.

We have that
Gx,Y,z(x, y,z)

= G(x, y,z, 1) =
=

~(xyz +xy + yz +zx +x + y +z + 1)


~(x + 1)~(y + l)~(z + 1)

= Gx(x)Gy(y)Gz(z),

whence X, Y, Z are independent. The same conclusion holds for any other set of exactly three random
variables. However, G(x, y, z, w) i= Gx(x)Gy(y)Gz(z)Gw(w).

8. (a) We have by differentiating that JE(X 2n) = 0, whence JP(X = 0) = 1. This is not a moment
generating function.
(b) This is a moment generating function if and only if Er Pr = 1, in which case it is that of a random
variable X with JP(X = ar) = Pr
9. The coefficients of sn in both combinations of G 1, G2 are non-negative and sum to 1. They are
therefore probability generating functions, as is G (as) f G (a) for the same reasons.

5.2 Solutions. Some applications


1.

Let G(s) = JE(sx) and Gs(s) =

T(s)

~
L smJP(X 2: m)

EJ=o si Sj. By the result of Exercise (5.1.2),


~ k
s JP(X > k)

= 1 +s L

m=O

= 1+

s(1- G(s))
1- s

k=O

Now,

so that

T(s)- T(O)
s

where we have used the fact that T (0)

Gs(s- 1)- Gs(O)


s- 1

= G s (0) =

1. Therefore

i=1

j=1

L:si- 1JP(X::: i) = L:cs- l)i- 1si.


Equating coefficients of

si- 1, we obtain as required that

~
lP(X 2: i) =LSi
.

]=I

(j -1) ..
.

-1

(-1)1- 1 ,

232

1 _::: i _::: n.

1- sG(s)
1- s

Solutions [5.2.2]-[5.2.4]

Some applications
Similarly,
Gs(s)- Gs(O)
s
whence the second formula follows.

+ s)- T(O)

T(l

1 +s

2. Let Ai be the event that the ith person is chosen by nobody, and let X be the number of events
A 1, A2, ... , An which occur. Clearly

j)

nn- 1

IP'(A nA- n ... nA.) = ( - -

zJ

12

lJ

j (

n- j n- 1

1)

n- j

if i 1 ::j= i2 ::j= ::j= ij, since this event requires each of i 1, ... , ij to choose from a set of n - j people,
and each of the others to choose from a set of size n- j - 1. Using Waring's Theorem (Problem
(1.8.13) or equation (5.2.14)),
IP'(X = k) = t<-1)j-k
. k

]=

where

Sj =

(~)
1

(j)sj
k

(n-j)j (n-j-1)n-j.
n-1
n-1

Using the result of Exercise (5.2.1),


IP'(X ~ k)

n
( 1 = L(-1)j-k
. k

]=

while IF'( X
3.

0)

1)

k- 1

Sj,

1_:::k_:::n,

= 1.

(a)

E(xX+Y)

= E{E(xX+Y I Y)} = E{xy eY(x- 1)} = E{ (xex- 1 / } = exp{J.L(xex- 1 -

(b) The probability generating function of X 1 is

G(s) =

{s(l- p)}k
k= 1 klog(ljp)

log{1- s(l- p)}


logp

Using the 'compounding theorem' (5.1.25),

Gy(s)

4.

= GN(G(s)) = et.L(G(s)- 1) =

)-JL/1ogp

1-s(1-p)

Clearly,
E ( -1- ) =E
1+ X

(lo1o txdt ) = lo1o E(tx)dt= lo1o (q+pt)ndt= 1p(n-q+n+11)

where q = 1- p. In the limit,

1- (1 - 'A/n)n+ 1
1 )
1 +X =
'A(n + 1)/n

233

+ o(l)

--+

1- e-J...
'A

1) }.

[5.2.5]-[5.3.1]

Solutions

Generating functions and their applications

the corresponding moment of the Poisson distribution with parameter )...


Conditioning on the outcome of the first toss, we obtain hn = qhn-1 + p(1- hn-1) for n ~ 1,
where q = 1- p and ho = 1. Multiply throughout by sn and sum to find that H(s) = L:~0 snhn
satisfies H(s)- 1 = (q- p)sH(s) + ps/(1- s), and so

5.

H (s)

1- qs
(1-s){l-(q-p)s}

1
1
}
= -21 { - + 1-(q-p)s

1-s

6. By considering the event that HTH does not appear inn tosses and then appears in the next three,
we find that IP'(X > n) p 2 q = IF'( X = n + 1) pq +IF'( X = n + 3). We multiply by sn+ 3 and sum over
n to obtain
1- E(sx)
----:---'----'-p 2 qs 3 = pqs 2 E(sx) +E(sx),
1-s
which may be solved as required. Let Z be the time at which THT first appears, soY= min{X, Z}.
By a similar argument,
IP'(Y > n)p2q = IP'(X = Y = n + 1)pq + IP'(X = Y = n + 3) + IP'(Z = Y = n + 2)p,
IP'(Y > n)q 2 p = IP'(Z = Y = n + 1) pq + IP'(Z = Y = n + 3) +IF'( X = Y = n + 2)q.

We multipy by sn+ 1, sum over n, and use the fact that IP'(Y = n) =IF'( X= Y = n) + IP'(Z = Y = n).

7.

Suppose there are n + 1 matching letter/envelope pairs, numbered accordingly. Imagine the
envelopes lined up in order, and the letters dropped at random onto these envelopes. Assume that
exactly j + 1 letters land on their correct envelopes. The removal of any one of these j + 1 letters,
together with the corresponding envelope, results after re-ordering in a sequence of length n in which
exactly j letters are correctly placed. It is not difficult to see that, for each resulting sequence of length
n, there are exactly j + 1 originating sequences oflength n + 1. The first result follows. We multiply
by sj and sum over j to obtain the second. It is evident that G 1(s) = s. Either use induction, or
integrate repeatedly, to find that Gn(s) = L:~=0 (s- W /r!.

8.

We have for lsi < fL + 1 that


E(sx) = E{E(sx

9.

I A)}= E(eA(s-1)) =

_IL_ f (-s-)k

fL
=
fL- (s - 1)
fL + 1 k=O

fL + 1

Since the waiting times for new objects are geometric and independent,

E(s T )-s ( -3s- ) ( -s- ) ( -s- )


4- s
2- s
4 - 3s
Using partial fractions, the coefficient of skis ] 2 { iC~)k- 4 - 4(1-)k-4 +

iCi)k-4 }, fork~ 4.

5.3 Solutions. Random walk


1.

Let Ak be the event that the walk ever reaches the point k. Then Ak 2 Ak+ 1 if k
r-1
IP'(M ~ r) = IP'(Ar) = IP'(Ao)

IJ IP'(Ak+l I Ak) =

k=O

234

(p/q/,

0,

0, so that

Solutions

Random walk

since lP'(Ak+1
2.

I Ak)

[5.3.2]-[5.3.4]

= lP'(A1 I Ao) = pfq fork 2: 0 by Corollary (5.3.6).

(a) We have by Theorem (5.3.1c) that


2

00

""'"'
L.J s 2k 2kfo(2k)

00

S
= s F0I (s) = ;-;---:z
= s 2 lP'o(s) = ""'"'
L.J s 2k Po(2k- 2),

k=1

1 - s~

k=1

and the claim follows by equating the coefficients of s 2k.


(b) It is the case that an = lP'( S 1S2 S2n =I= 0) satisfies
00

an=

fo(k),

k=2n+2
keven

with the convention that a 0 = 1. We have used the fact that ultimate return to 0 occurs with probability
1. This sequence has generating function given by
00

00

L s2n L
n=O

!k-1

00

fo(k) =

k=2n+2
k even

fo(k)

k=2
k even
1 - Fo(s)

s2n

n=O

1- s 2

by Theorem (5.3.1c)

00

= lP'o(s) = L

s2nlP'(S2n

= 0).

n=O

Now equate the coefficients of s 2n. (Alternatively, use Exercise (5.1.2) to obtain the generating
function of the an directly.)
3. Draw a diagram of the square with the letters ABCD in clockwise order. Clearly p AA (m) = 0 if
m is odd. The walk is at A after 2n steps if and only if the numbers of leftward and rightward steps are
equal and the numbers of upward and downward steps are equal. The number of ways of choosing
2k horizontal steps out of 2n is (~k). Hence

with generating function

Writing FA (s) for the probability generating function of the time T of first return, we use the
argument which leads to Theorem (5.3.1a) to find that GA(s) = 1 + FA(s)GA(s), and therefore
FA(s) = 1-GA(s)- 1.
4. Write (Xn, Yn) for the position of the particle at time n. It is an elementary calculation to show
that the relations Un = Xn + Yn, Vn = Xn- Yn define independent simple symmetric random walks
U and V. Now T = min{n: Un = m}, and therefore Gr(s) = {s- 1(1- ~)}m for lsi :::: 1
by Theorem (5.3.5).

235

[5.3.5]-[5.3.6]

Solutions

Generating functions and their applications

Now X- Y = VT, so that

where we have used the independence of U and V. This converges if and only if I~ (s + s -I) I _:s 1,
which is to say that s = 1. Note that GT (s) converges in a non-trivial region of the complex plane.

5.

Let T be the time of the first return of the walk S to its starting point 0. During the time-interval
(0, T), the walk is equally likely to be to the left or to the right of 0, and therefore
Lzn = {

+ L'

TR
2nR

if T _:s 2n,
ifT>2n,

where R is Bernoulli with parameter~, L' has the distribution of Lzn-T, and Rand L' are independent.
It follows that Gzn (s) = JE(s Lzn) satisfies
n

Gzn(s) = L ~(1 + s 2k)Gzn-2k(s)f(2k) + L ~(1 + s 2n)/(2k)


k=l
k>n

where f(2k) = IP'(T = 2k). (Remember that Lzn and T are even numbers.) Let H(s, t)
L:~o t 2n Gzn (s). Multiply through the above equation by t 2n and sum over n, to find that
H(s, t) = ~H(s, t) {F(t) + F(st)} + ~ {J(t) + J(st)}

where F(x) = L:~0 x 2k f(2k) and


00

J(x)=L::x 2nLf(2k)=
n=O
k>n

~
1

lxl

< 1,

by the calculation in the solution to Exercise (5.3.2). Using the fact that F(x) = 1 - ~.we
deduce that H(s, t) = 1/VO- t 2)(1- s2t2). The coefficient of s 2kt 2n is
IP'(Lzn=2k)= (n-_!k)(-l)n-k.

(~~)(-1)k
2)Zn =

2k) (2n= ( k
n _ k2k) ( 1

IP'(Szk = O)IP'(Szn-2k = 0).

6. We show that all three terms have the same generating function, using various results established
for simple symmetric random walk. First, in the usual notation,
oo
2m
'
2s2
L 4mlP'(Szm = O)s
= 2sP0 (s) =
2 312 .
m=O
(1- s )
Secondly, by Exercise (5.3.2a, b),
m

lE(T

1\

2m)= 2mlP'(T > 2m)+ L 2kfo(2k) = 2m1P'(Szm = 0) + L IP'(Szk-2 = 0).


k=l
k=l

236

Solutions [5.3.7]-[5.3.8]

Random walk
Hence,
oo 2m
s2Po(s)
'
L
s E(T A 2m)= - -2- +sP0 (s) =
m=O
1- s

2s2
2 312 .

(1 - s )

Finally, using the hitting time theorem (3.10.14), (5.3.7), and some algebra at the last stage,
oo

m=O

oo

s 2m2EIS2m I = 4

oo

s 2m

m=O

L 2k1P'(S2m = 2k) = 4 L

s 2m

m=O

k=1

L 2mf2k(2m)
k=1

d ~ 2m~
d
F1(s) 2
2s 2
= 4s- L s L hk(2m) = 4s2 =
2 312 .
ds m=O
k= 1
ds 1 - F 1(s)
(1 - s )

7. Let In be the indicator of the event {Sn = 0}, so that Sn+1 = Sn + Xn+1 +ln. In equilibrium,
E(So) = E(So) + E(X 1) + E(/o), which implies that IP'(So = 0) = E(/o) = -E(X 1) and entails
E(X t) ::::: 0. Furthermore, it is impossible that IP'(So = 0) = 0 since this entails IP'(So =a) = 0 for all
a < oo. Hence E(X 1) < 0 if Sis in equilibrium. Next, in equilibrium,

Now,
E{ zSn+Xn+l +In (1 - In)} = E(zSn

I Sn

> O)E(zxl )IP'(Sn > 0)

E(zSn+Xn+I+ln In)= zE(zx 1 )IP'(Sn = 0).

Hence
E(zsO) = E(zxi) [{E(zso) - IP'(So = 0)} +ziP'( So = 0)]
which yields the appropriate choice for E(zsO).
8.

The hitting time theorem (3.10.14), (5.3.7), states that IP'(Tob = n) = (lbl/n)IP'(Sn =b), whence
E(Tob

I Tob

< oo)

The walk is transient if and only if p =f:.


p =f:.

1 Suppose henceforth that p =f:. 1

L IP'(Sn =b).

IP'(Tob < oo) n

1, and therefore E(Tob

I Tob

< oo) < oo if and only if

The required conditional mean may be found by conditioning on the first step, or alternatively

as follows. Assume first that p < q, so that IP'(Tob < oo) = (p/q)b by Corollary (5.3.6). Then
L:n IP'(Sn =b) is the mean of the number N of visits of the walk to b. Now

IP'(N=r)=(~rpr- 1 (1-p),

r0::1,

where p = IP'(Sn = 0 for some n 0:: 1) = 1 - IP- ql. Therefore E(N) = (pfq)b liP- ql and
E(Tob I Tob < oo)

b
(p/q)b
(p/q)b IP _ ql.

We have when p > q that IP'(Tob < oo) = 1, and E(TOI) = (p- q)- 1. The result follows from
the fact that E(Tob) = bE(To1).

237

[5.4.1]-[5.4.4]

Solutions

Generating functions and their applications

5.4 Solutions. Branching processes


1. Clearly E(Zn I Zm) = Zm!Ln-m since, given Zm, Zn is the sum of the numbers of (n - m)th
generation descendants of Zm progenitors. Hence E(ZmZn I Zm) = Z~{Ln-m and E(ZmZn) =
E{E(Zm Zn I Zm)} = E(Z~)!Ln-m. Hence
cov(Zm, Zn)

= fLn-mE(Z~)- E(Zm)E(Zn) =

f.ln-m var(Zm),

and, by Lemma (5.4.2),

2. Suppose 0 :::0 r :::0 n, and that everything is known about the process up to timer. Conditional on
this information, and using a symmetry argument, a randomly chosen individual in the nth generation
has probability 1/ Zr of having as rth generation ancestor any given member of the rth generation.
The chance that two individuals from the nth generation, chosen randomly and independently of each
other, have the same rth generation ancestor is therefore 1/Zr. Therefore
lP'(L < r)

= E{lP'(L

< r

I Zr)} = E(1- z; 1)

and so
lP'(L

= r) = lP'(L

< r

+ 1) -lP'(L

< r)

= E(Z; 1)- E(z;J 1),

0:::0 r < n.

If 0 < lP'(ZI = 0) < 1, then almost the same argument proves that lP'(L
Y/r- Ylr+l forO :::0 r < n, where Y/r = E(Z; 1 I Zn > 0).

3.

Zn > 0)

The number Zn of nth generation decendants satisfies

lP'(Zn

= 0) = Gn(O) =

{n:

if p = q,

q (p n -q n)

pn+l _ qn+l

if p # q,

whence, for n :::: 1,


1
lP'(T

= n) = lP'(Zn = 0) -lP'(Zn-1 = 0) =

{ n(n

+ 1)
p

n-1 n
2
q (p- q)

if p

= q,

if p # q.

It follows that E(T) < oo if and only if p < q.

4.

(a) As usual,

This suggests that Gn(s) = 1 - al+.B++.Bn-I (1 - s).Bn for n :::: 1; this formula may be proved
easily by induction, using the fact that Gn(s) = G(Gn-1 (s)).
(b) As in the above part (a),

238

Solutions

Age-dependent branching processes

[5.4.5]-[5.5.1]

whereP2(s) = P(P(s)). SimilarlyGn(s) = /- 1 (Pn(/(s)))forn:::: l,wherePn(s) = P(Pn-I(s)).


(c) With P(s) =as /{1- (1 - a)s} where a = y- 1 , it is an easy exercise to prove, by induction, that
Pn(s) = ans/{1- (1- an)s} for n:::: 1, implying that

5.

Let Zn be the number of members of the nth generation. The (n + 1)th generation has size

Cn+l + ln+l where Cn+I is the number of natural offspring of the previous generation, and ln+l is

the number of immigrants. Therefore by the independence,

whence
Gn+I(s)
6.

= lE(sZn+l) = lE{G(s)Zn}H(s) = Gn(G(s))H(s).

By Example (5.4.3),

lE(s n) =
Differentiate and set s

n - (n - 1)s
n- 1
1
= -- +
,
n + 1- ns
n
n2(1 + n-1 - s)

n:::: 0.

= 0 to find that

Similarly,
oo

oo

00

n2

00

oo

1 2

oo

lE(V2)

2:: (n + 1)3 = n=O


2:: (n + 1)2 - n=O
2:: (n + 1)3 = 0 JT - n=O
2:: (n + 1)3 '
n=O

lE(V3)

2:: (n + 1)4 = n=O


2::
n=O

(n + 1) 2 - 2(n + 1) + 1
(n + 1)4

1 2
1 4
()JT + 90JT - 2

00

The conclusion is obtained by eliminating ~n (n + 1) - 3.

5.5 Solutions. Age-dependent branching processes


1.

(i) The usual renewal argument shows as in Theorem (5.5.1) that


Gt(s)

G(Gt-u(s))fr(u) du +

00

sfr(u) du.

Differentiate with respect to t, to obtain

a
-Gt(s)
= G(Go(s))fr(t) +

at

Now Go(s) = s, and

lot -a {G(Gt-u(s))} fr(u) du- sfr(t).


o

- {G(Gt-u(s))}

at

at

a
=--a
{G(Gr-u(s))},
u
239

2:: -(n-+-1)-.,-3.
n=O

[5.5.2]-[5.5.2]

Solutions

Generating functions and their applications

so that, using the fact that fr (u) = Ae-A.u if u 2: 0,


{' ~ {G(Gt-u(s))} fr(u)du =- [G(Gt-u(s))Jr(u)] 1 -A {' G(Gt-u(s))fr(u)du,
lo at
o lo

having integrated by parts. Hence


:t G 1 (s) = G(s)Ae-At

+ { -G(s)Ae-A.t + AG(G 1 (s))}- A { G 1 (s)

-1

00

sfr(u) du}- sAe-At

=A {G(Gt(s))- G 1 (s)}.

(ii) Substitute G(s)

= s 2 into the last equation to obtain


aGt

ift = A(G 12 -

Gt)

with boundary condition Go(s) = s. Integrate to obtain At+ c(s) = log{l - G( 1} for some function
c(s). Using the boundary condition att = 0, we find thatc(s) = log{l- G01} = log{1-s- 1}, and
hence G 1 (s) =se-At /{1- s(1- e-At)}. Expand in powers of s to find that Z(t) has the geometric
distribution JID(Z(t) = k) = (1- e-At)k- 1e-A.t fork 2: 1.

2.

The equation becomes


aGt

ift
with boundary condition Go(s)

1
= 2(1

+ G21 ) - Gt

= s. This differential equation is easily solved with the result

4/t
G t (s ) -_ 2s + t(l - s) -_ -,-------,-,--2+t(1-s)
2+t(l-s)

2-t

We pick out the coefficient of sn to obtain


4 - ( -tJP>Zt -n - - ( () - ) - t(2 + t) 2 + t

)n ,

n 2: 1,

and therefore

( ()- )-?:;

JP>Zt >k-

00

4 - ( -t-t(2 + t) 2 + t

)n -- -2t ( -2 +t-t )k ,

It follows that, for x > 0 and in the limit as t -+ oo,

JID(Z(t) 2: xt

I Z(t)

> 0)

JID(Z(t) > xt)

JID(Z(t) 2: 1)

t ) LxtJ-1
2+t

= --

240

2)

= 1 +t

1-LxtJ

-+ e- 2x.

Solutions

Characteristic functions

[5.6.1]-[5.7.2]

5.6 Solutions. Expectation revisited


1. Set a = lE(X) to find that u(X) ::=: u(lEX) + A.(X -lEX) for some fixed A. Take expectations to
obtain the result.

2. Certainly Zn = L:?=l Xi and Z


dominated convergence.

3.

Apply Fatou's lemma to the sequence {- Xn : n ::=: 1} to find that


lE(limsupXn)
n->oo

4.

= L:~ 1 IXi I are such that IZn I :::: Z, and the result follows by

-lE(liminf -Xn) ::=: -liminflE(-Xn)


n->oo

n->oo

= limsuplE(Xn).
n->oo

Suppose that lEIXrl < oo where r > 0. We have that, if x > 0,


xrlP'(IXI :::: x) ::::

ur dF(u)--+ 0

as X--+

00,

Jrx,oo)

where F is the distribution function of IX 1.


Conversely suppose that xrlP'(IXI ::=: x) --+ 0 where r ::=: 0, and let 0 ::S s < r. Now lEIXs I =
limM->oo J0M us dF(u) and, by integration by parts,

The first term on the right-hand side is negative. The integrand in the second term satisfies sus - 11P'(I X I >
u) :::: sus-l u-r for all large u. Therefore the integral is bounded uniformly in M, as required.

5. Suppose first that, for all E > 0, there exists 8 = 8(E) > 0, such that lE(IXI/A) < E for all A
satisfying IP'(A) < 8. FixE > 0, and find x (> 0) such that IP'(IXI > x) < 8(E). Then, for y > x,

Hence

J!.y lui dFx(u) converges as y--+

oo, whence lEI XI < oo.

ConverselysupposethatlEIXI < oo. ItfollowsthatlE (IXI/{IXI>yJ)--+ Oasy--+ oo. LetE > 0,


and find y such thatlE (IXI/{IXI>yJ) < iE. For any event A, lA :::: /Anne+ Is where B = {lXI > y}.
Hence
lE(IXI/A):::: lE(IXI/Ansc) +lE(IXI/s):::: ylP'(A) + iE.
Writing 8 = E/(2y), we have thatlE(IXI/A) < E iflP'(A) < 8.

5.7 Solutions. Characteristic functions


1. Let X have the Cauchy distribution, with characteristic function c/>(s) = e-lsl. Setting Y = X,
we have that x+r(t) = (2t) = e- 2 ltl = c/>x(t)c/>y (t). However, X and Yare certainly dependent.

2.

(i) It is the case that Re{(t)} = lE(cos t X), so that, in the obvious notation,
Re{1 - (2t)}

L:
L:

:::: 4

{1- cos(2tx)} dF(x)


{1- cos(tx)} dF(x)

=2

L:

= 4Re{1

241

{1 - cos(tx)}{1 + cos(tx)} dF(x)


- (t)}.

[5.7.3]-[5.7.6]

Solutions

Generating functions and their applications

(ii) Note first that, if X and Y are independent with common characteristic function , then X - Y
has characteristic function

1/f(t) = lE(eitX)lE(e-itY) = c/>(t)c/>(-t) = c/>(t)c/>(t) = lc/>(t)l 2 .


Apply the result of part (i) to the function 1/1 to obtain that 1 -lc/>(2t)1 2 :;:; 4(1 -lc/>(t)l 2 ). However
lc/>(t)l :;:; 1, so that
1 - lc/>(2t)l :;:; 1 - lc/>(2t)l 2 :;:; 4(1 - lc/>(t)l 2 ) :;:; 8(1 - lc/>(t)l).

3.

(a) With mk = lE(Xk), we have that

say, and therefore, for sufficiently small values of(),


Kx(e) =

Loo <- w+l


S(e/.
r

r=l

ExpandS(()/ in powers of e, and equate the coefficients of e, e 2 , e 3 , in tum, to find thatk1 (X) = mt.
k2(X) = m2- mr, k3(X) = m3- 3mtm2 + 2mj.
(b) If X and Yare independent, Kx+y(()) = log{lE(e8 X)lE(e 8 f)} = Kx(())
claim is immediate.
I 2

4.

The N(O, 1) variable X has moment generating function JE(e 8 X) = e z8

5.

(a) Suppose X takes values in L(a, b). Then

+ Ky(()), whence the


so that Kx(()) = ie 2 .

since only numbers of the form x = a + bm make non-zero contributions to the sum.
Suppose in addition that X has span b, and that lc/>x(T)I = 1 for some T E (0, 2nlb). Then
c/>x(T) = eic for some c E JR. Now
JE(cos(TX _c))= ~lE(eiTX-ic

+ e-iTX+ic)

= 1,

using the fact that 1E(e-iTX) = cJ>x(T) = e-ic. However cosx :;:; 1 for all x, with equality if and
only if x is a multiple of 2rr. It follows that T X -cis a multiple of 2rr, with probability 1, and hence
that X takes values in the set L(ciT, 2rr IT). However 2rr IT > b, which contradicts the maximality
of the span b. We deduce that no such T exists.
(b) This follows by the argument above.

6.

This is a form of the 'Riemann-Lebesgue lemma'. It is a standard result of analysis that, forE > 0,
there exists a step function gE such that f~oo If (x) - gE (x) I dx < E. Let E (t) = f~oo eitx gE (x) dx.
Then

If we can prove that, for each E, lc/>E(t)l --+ 0 as t--+ oo, then it will follow that lc/>x(t)l < 2E for all
large t, and the claim then follows.

242

Solutions [5.7.7]-[5.7.9]

Characteristic functions

Now gE(x) is a finite linear combination of functions of the form ciA (x) for reals c and intervals
A, that is gE(x) = 'LJ=l q!Ak (x); elementary integration yields

where ak and bk are the endpoints of Ak. Therefore

2 K
II/IE (t) I < -

2:: Ck -+ 0,

as t -+ oo.

- t k=l

7.

If X is N(f1, 1), then the moment generating function of X 2 is

M 2(s)
X

= lE(esX 2 ) =

100

_ 00

1
I (x -JL )2 dx
esx 2 --e-2

..(iii

.J[=2S

112s ) ,
exp ( 1 - 2s

if s < ~, by completing the square in the exponent. It follows that

My(s) =

ll

n {

.Jf=2S

11~s

1
exp ( - 1- ) } =
exp ( -se- ) .
1 - 2s
(1 - 2s)nf2
1 - 2s

It is tempting to substitute s = it to obtain the answer. This procedure may be justified in this case
using the theory of analytic continuation.

8. (a) T 2 = X 2 /(Y jn), where X 2 is x2 (1; 112 ) by Exercise (5.7.7), andY is


F(1, n; 112 ).
(b) F has the same distribution function as

x2 (n).

Hence T 2 is

(A 2 + B)jm
Z= --------'-'--Vjn

where A, B, V are independent, A being N(.fO, 1), B being x2 (m-1), and V being x2 (n). Therefore
lE(Z) =
=

~ { JE(A2)JE ( ~) + (m- 1)lE ( Bj~~ 1))}


2_
{(1 + 0)___!!__2 + (mm
n-

1)___!!__2} = n~m + ~~,


nmn-

where we have used the fact (see Exercise (4.10.2)) that the F(r, s) distribution has mean sj(s- 2)
if s > 2.
9. ~t X be independent of X with the same distribution. Then 11/11 2 is the characteristic function of
X - X and, by the inversion theorem,

-1

100 lt/l(t)l 2 e-ltx. dt = fx_x_(x) = 100 f(y)f(x + y)dy.

2n -oo

-oo

Now set x = 0. We require that the density function of X 243

X be differentiable at 0.

[5.7.10]-[5.8.2]

Solutions

Generating functions and their applications

10. By definition,
e-ityt/Jx(y)

j_:

eiy(x-t) fx(x)dx.

Now multiply by fy (y ), integrate over y E JR, and change the order of integration with an appeal to
Fubini's theorem.

J:

11. (a) We adopt the usual convention that integrals of the form
g(y) dF(y) include any atom of
the distribution function Fat the upper endpoint v but not at the lower endpoint u. It is a consequence
that Fr: is right-continuous, and it is immediate that Fr: increases from 0 to 1. Therefore Fr: is a
distribution function. The corresponding moment generating function is

Mr:(t)

00

etx dFr:(x)

-oo

= -1M(t)

00

etx+r:x dF(x)

-oo

M(t + r) .
M(t)

(b) The required moment generating function is


Mx+r(t

+ r)

Mx(t

Mx+y(t)

+ r)My(t + r)

Mx(t)My(t)

the product of the moment generating functions of the individual tilted distributions.

5.8 Solutions. Examples of characteristic functions


1. (i) We have that {j)(t) = E(eitX) = E(e-itX) = tP-x(t).
(ii) If X 1 and X 2 are independent random variables with common characteristic function , then
t/Jx 1+Xz (t) = x 1 (t)x2 (t) = (t) 2 .
(iii) Similarly, x 1-x2 (t)

= x1 (t)-x2 (t) = (t)(t) =

l(t)l 2 .

(iv) Let X have characteristic function , and let Z be equal to X with probability
otherwise. The characteristic function of Z is given by

1and to -X

where we have used the argument of part (i) above.

1,

(v) If X is Bernoulli with parameter then its characteristic function is (t) = ~ + eit. Suppose Y
is a random variable with characteristic function 1/f(t) = l(t)l. Then 1/f(t) 2 = (t)(-t). Written
in terms of random variables this asserts that Y1 + Y2 has the same distribution as X 1 - Xz, where
the Y; are independent with characteristic function 1/f, and the X; are independent with characteristic
function. Now Xj E {0, 1}, so that Xr- Xz E {-1, 0, 1}, and therefore Yj E
Write

= lP'(Yj =

{-1, 1l-

J). Then

+ Yz = l) = a 2 = lP'(Xr- Xz = 1) = ~.
lP'(Yr + Yz = -1) = (1- a) 2 = lP'(Xr- Xz = -1) =
lP'(Yr

implying that a 2 = (1 - a) 2 so that a


such variable Y exists.
2.

Fort~

i, contradicting the fact that a 2 = ~- We deduce that no

0,

Now minimize overt

~.

0.

244

Solutions [5.8.3]-[5.8.5]

Examples of characteristic functions


3.

The moment generating function of Z is

Mz(t)

= lE { lE(etXY I Y)} = lE{Mx(tY)} = lE {C.~ tY) m}


-lo1 (-A.-)m yn-1(1 _ y)m-n-1
~
o A.- ty
B(n, m - n)

Substitute v = 1/y and integrate by parts to obtain that

oo (v _ l)m-n-1

Imn =

(A.v - t)m

dv

satisfies
1
(v-l)m-n- 1 ] 00
Imn =[- A.(m- 1) (A.v- t)m-1 1

m-n-1
A.(m- 1) Im-1,n =cmnA.I
( , , ) m-1,n

for some c(m, n, A.). We iterate this to obtain

Imn = c'In+1 n = c' {oo


dv
1
'
J1 (A.v - t)n+

c'

for somec' depending onm, n, A.. Therefore Mz(t) = c" (A.- t)-n for some c" depending on m, n, A..
However Mz(O) = 1, and hence c" = A.n, giving that Z is r(A., n). Throughout these calculations
we have assumed that tis sufficiently small and positive. Alternatively, we could have set t =is and
used characteristic functions. See also Problem (4.14.12).

4.

We have that

1E(e 11. x2 ) =

00

-oo

. 2
1 - exp ( - (x - ~-t) 2 ) dx
ettx
-2
J2na2
2a

= oo

1
(
---exp -oo J2na2

y'1- 2a2it

exp (

t) (

[x- ~-tO- 2a 2it)- 1


.
2a 2 (l - 2a 2 zt)- 1

itf-/,2 )
.
dx
1 - 2a 2zt

exp

itf-/,2 ) .
1- 2a 2it

The integral is evaluated by using Cauchy's theorem when integrating around a sector in the complex
plane. It is highly suggestive to observe that the integrand differs only by a multiplicative constant
from a hypothetical normal density function with (complex) mean ~-t(1 - 2a 2 it)- 1 and (complex)
variance a 2 (1- 2a 2 it)- 1.
5.

(a) Use the result of Exercise (5.8.4) with f-t

characteristic function of the

= 0 and a 2

1: 4>xz(t) = (1- 2it)-2, the


I

x2 ( 1) distribution.

(b) From (a), the sumS has characteristic function t/Js(t)


of the x2 (n) distribution.
(c) We have that

245

= (1- 2it)-z.n, the characteristic function

[5.8.6]-[5.8.6]

Solutions

Now

Generating functions and their applications

E(exp{-it2/X~}) =1oo
-00

_l_exp (- t22- x2) dx.


./2ii
2x
2

There are various ways of evaluating this integral. Using the result of Problem (5.12.18c), we find
that the answer is e-ltl, whence X 1/ Xz has the Cauchy distribution.
(d) We have that
E(eitX1X2) = E{E(eitX1X2 I Xz)} = E(if>x 1(tXz)) =
=

E(e-~t 2 X~)

~exp{-ix 2 (1 +t 2 )}dx = ~,
1 + t2

00

-oo v2rr

on observing that the integrand differs from the N (0, (1 + t 2 )- 2) density function only by a multiplicative constant. Now, examination of a standard work of reference, such as Abramowitz and Stegun
(1965, Section 9.6.21), reveals that

!o

oo cos(xt) d

~
y 1 -r r-

( )
t=Kox,

where Ko(x) is the second kind of modified Bessel function. Hence the required density, by the
inversion theorem, is f(x) = Ko(lxl)jn. Note that, for small x, Ko(x) ~ -logx, and for large
positive x, Ko(x) ~e-x .Jnx/2.
As a matter of interest, note that we may also invert the more general characteristic function
if>(t) = (1- it)-a(l + it)-f3. Setting 1- it= -zjx in the integral gives

1 1oo

f(x) = 2rr

e-itx

e-xxa-I 1-x+ixoo
e-z dz
dt = ----;;;---oo (1- it)a(l + it)f3
2f32ni
-x-ixoo (-z)a (1 + z/(2x))f3
1

ex (2x) z;(f3-a)

----'---'---wz;(a-f3),
1
1
(2x)
f(a)
2 (I-a-f3)
where W is a confluent hypergeometric function. When a = f3 this becomes
I

f(x)

(xj2)a-z;
f(a).J]r Ka-~ (x)

where K is a Bessel function of the second kind.


(e) Using (d), we find that the required characteristic function is >x 1x 2 (t)>x 3 x 4 (t) = (1 + t 2)- 1.
In order to invert this, either use the inversion theorem for the Cauchy distribution to find the required
density to be f(x) = ie-lxl for -oo < x < oo, or alternatively express (1 + t 2)- 1 as partial
it)- 1 + (1 + it)- 1}, and recall that (1- it)- 1 is the characteristic
fractions, (1 + t 2)- 1 =
function of an exponential distribution.

iW-

6. The joint characteristic function of X= (X 1, Xz, ... , Xn) satisfies if>x(t) = E(eitX') = E(eiY)
where t = (tJ, tz, ... , tn) E JRn andY= tX' = t1X 1 + + tnXn. Now Y is normal with mean and
variance
n

E(Y) = _2)jE(Xj) = tfL',

var(Y) =

tjtkcov(Xj, Xk) = tVt',

j,k=I

j=l

where fL is the mean vector of X, and Vis the covariance matrix of X. Therefore if>x(t) = if>y(l) =
exp(itp/- itVt') by (5.8.5).

246

Solutions [5.8.7]-[5.9.2]

Inversion and continuity theorems

Let Z = X- JL. It is easy to check that the vector Z has joint characteristic function Jz(t) =

e- !tvt', which we recognize by (5.8.6) as being that of the N (0, V) distribution.


I 2

I 2

7. We have that E(Z) = 0, E(Z 2 ) = 1, and E(e 12 ) = E{E(e12 I U, V)} = E(e2;t ) = e2 1


If X and Y have the bivariate normal distribution with correlation p, then the random variable Z =
(UX + VY)jy'u2 + 2pUV + y2 is N(O, 1).
8.

By definition, E(eitX)

loo

oo

= E(cos(tX)) + iE(sin(tX)).

A2
cos(tx )Ae -Ax dx = - 2,
2 A +t

loo

By integrating by parts,

oo .

sm(tx)Ae

-Ax

dx

At
= -2 2,
A +t

and
A2 +iA.t
A
A2 + t 2 = A - it .

9.

(a) We have that e-lxl =e-x l{x:c:O) +ex l{x<O) whence the required characteristic function is

J(t)

1(

1)

= 2 1 -it + 1 +it = 1 + t2.

(b) By a similar argument applied to the f(l, 2) distribution, we have in this case that

1(

J(t) =

1
(1 - it)2

+ (1 + it)2

1- t 2
= (1

+ t2) 2 .

10. Suppose X has moment generating function M(t). The proposed equation gives
M(t)

lo

M(ut) 2 du

= -1 lot M(v) 2 dv.


t

Differentiate to obtain tM' + M = M 2 , with solution M(t) = A/(A


distribution has the stated property.

+ t).

Thus the exponential

11. We have that


Jx,r(s, t)

= E(eisX+itY) = JsX+tY0).

Now sX + t Y is N(O, s 2a 2 + 2starp + r 2 ) where a 2 = var(X), r 2 = var(Y), p


therefore
Jx,r(s, t) = exp{ -i(s 2a 2 + 2starp + t 2 r 2) }.

= corr(X, Y), and

The fact that c/Jx, y may be expressed in terms of the characteristic function of a single normal variable
is sometimes referred to as the Cramer-Wold device.

5.9 Solutions. Inversion and continuity theorems


1.

Clearly, for 0 _::: y _::: 1, lP'(Xn _::: ny)

= n- 1 Lny j

~ y as n ~ oo.

2. (a) The derivative of Fn is fn (x) = 1 - cos(2nn x ), for 0 _::: x _::: 1. It is easy to see that fn is
non-negative and
fn (x) dx = 1. Therefore Fn is a distribution function with density function fn.
(b) As n ~ oo,
sin(2nn x)
1
<-~o
2nn
- 2nn
'

JJ

247

[5.9.3]-[5.9.6]

Solutions

Generating functions and their applications

and so Fn (x) ~ x for 0 ::= x ::= 1. On the other hand, cos(2nn x) does not converge unless x E {0, 1},
and therefore fn(x) does not converge on (0, 1).

3. We may express N as the sum N = T1 + T2 + + Tk of independent variables each having the


geometric distribution II"(7j = r) = pqr-I for r 2':: 1, where p + q = 1. Therefore
k
ifJN(t) = ifJTJ (t) =

implying that Z

= 2Np

pe't
1 - qeit

has characteristic function

pe2pit
r/Jz(t) = JN(2pt) = { 1 - (1 - p)e2pit

as p ,} 0, the characteristic function of the


theorem (5.9.5).

4.

{ . }k

r(

}k {
=

p(l + 2pit + o(p))


p(l - 2it + o(l))

}k ~

-k

(1 - 2zt)

i, k) distribution. The result follows by the continuity

All you need to know is the fact, easily proved, that 1/fm (t) = eitm satisfies

1-rr:rr 1/fj(t)l/fk(t) d t= {

2n

if j + k = 0,
ifj+k#O,

for integers j and k.


Now, ifJ(t) = "E-'i=-oo eit}II"(X = j), so that
-1
2n

1:rr e-ttkifJ(t)dt
.
1 L
=-:rr
2n .

00

II"(X = j)

J=-00

1:rr 1/fj(t)l/f-k(t)dt = 1 II"(X = k) 2n.


-:rr
2n

If X is arithmetic with span A., then X/A. is integer valued, whence

A.

II"(X = kA) = 2n

1:rrfl. e-itkAifJx(t) dt.


-:rr/1.

Let X be uniformly distributed on [-a, a], Y be uniformly distributed on [ -b, b ], and let X and
Y be independent. Then X has characteristic function sin(at)j(at), andY has characteristic function
sin(bt)/(bt). We apply the inversion theorem (5.9.1) to the characteristic function of X+ Y to find
that
00
00 sin(at) sin(bt)
1
1
a 1\ b
-2
ifJx+y(t) dt = dt = fx+y(O) = -b-.
2
n -oo
2n -oo
abt
2a

5.

6.

It is elementary that

In addition, a= n, f~'(a) = -n- 1 , and

and Stirling's formula follows.

248

Two limit theorems

Solutions

[5.9. 7]-[5.10.1]

7. The vector X has joint characteristic function (t) = exp(- itVt'). By the multidimensional
version of the inversion theorem (5.9.1), the joint density function of X is
j(x) = - 1 -

{ exp( -itx'- itVt') dt.

(2n)n }ftl.n

Therefore, if i

i=

j,

i=

and similarly when i = j. When i

a
-IP'(maxXk:::::
u) =
OVij

j,

- of dx

where Q = {x: Xk::::: u fork= 1, 2, ... , n}

Q OVjj

J'

where
dx' is an integral over the variables Xk for k i= i, j.
Therefore, IP'(maxk X k ::::: u) increases in every parameter Vij, and is therefore greater than its
value when Vij = 0 fori i= j, namely ITk IP'(Xk::::: u).
8.

By a two-dimensional version of the inversion theorem (5.9.1) applied to E(eitX' ), t = (t1, t2),

~ IP'(X 1 >

op

0, X2 > 0) =

("' ("' {

op lo lo

a 1 {{
= op 4n 2 llJRz
= - 12
4n

1~
JR2

~
4n

{ { exp( -itx'- itVt') dt} dx

1}JRz

exp(-itVt')

dt

(it 1 )(it2 )

1
2Jl' .JjV=Tf =
1
exp(- 2J tVt)dt=
~
2
4n
2n y 1 - p2

We integrate with respect top to find that, in agreement with Exercise (4.7.5),
IP'(X1 > 0, X2 > 0) =

4 + 2n

sin- p.

5.10 Solutions. Two limit theorems


1. (a) Let{ Xi : i 2': 1} be a collection of independent Bernoulli random variables with parameter
Then Sn =~I Xi is binomially distributed as bin(n,
Hence, by the central limit theorem,

i).

where <I> is the N(O, 1) distribution function.

249

i.

[5.10.2]-[5.10.4]

Solutions

Generating functions and their applications

(b) Let {Xi : i :=:: 1} be a collection of independent Poisson random variables, each with parameter 1.
Then Sn =
Xi is Poisson with parameter n, and by the central limit theorem

'J:'i

nk
(ISn-nl
k! = IP'
Jn ~ x ) ~ <l>(x)- <1>(-x),

as above.

k:

lk-nl::=:xy'n
2. A superficially plausible argument asserts that, if all babies look the same, then the number X of
correct answers inn trials is a random variable with the bin(n,
distribution. Then, for large n,

i)

X- ln
)
IP' ( -~-2 - > 3 ::::::1- <1>(3)::::::

zJn

rJrm

by the central limit theorem. For the given values of n and X,

X- ~n

--=
~Jn

910- 750

~s

5v'15 -

Now we might say that the event {X- ~n > iJn} is sufficiently unlikely that its occurrence casts
doubt on the original supposition that babies look the same.
A statistician would level a good many objections at drawing such a clear cut decision from such
murky data, but this is beyond our scope to elaborate.
3.

Clearly
l/Jy(t) = E{E(eitY

I X)}= E{exp(X(eit -1))} = (

~t

1 - (e 1

1)

)s

1 -.
= (1
2- e1

)s

It follows that

1 I
E(Y) = --:-l/Jy(O) = s,
l

whence var(Y) = 2s. Therefore the characteristic function of the normalized variable Z = (Y EY)/Jvar(Y) is
Now,
1og{ifJy(t/5s)} = -s log(2- eitf,ffs) = s(eitj,ffs- 1)
=

+ ~s(eitj,ffs- 1) 2 + o(l)

it/f!- !-t 2 - !-t 2 + o(l),

-it

2 ass~ oo, and the result follows


where the o(l) terms are ass~ oo. Hence log{l/Jz(t)} ~
by the continuity theorem (5.9.5).
Let P1, P2, ... be an infinite sequence of independent Poisson variables with parameter 1. Then
Sn = P1 + P2 + + Pn is Poisson with parameter n. Now Y has the Poisson distribution with
parameter X, and so Y is distributed as Sx. Also, X has the same distribution as the sum of s
independent exponential variables, implying that X ~ oo as s ~ oo, with probability 1. This
suggests by the central limit theorem that Sx (and hence Y also) is approximately normal in the limit
as s ~ oo. We have neglected the facts that s and X are not generally integer valued.

4. Since X1 is non-arithmetic, there exist integers nJ, n2, ... , nk with greatest common divisor 1
and such that IP'(X 1 = ni) > 0 for 1 ~ i ~ k. There exists N such that, for all n :=:: N, there exist nonnegative integers a1, a2, ... , ak such that n = a1n1 + +aknk. If xis a non-negative integer, write

250

Solutions

Two limit theorems

[5.10.5]-[5.10.5]

N = f31 n 1 + + f3knk> N + x = Y1 n 1 + + Yknk for non-negative integers f31, ... , f3b Y1 ... , Yk
Now Sn = X1 + + Xn is such that
k

IP'(Sn = N) 2: IP'(Xj = n; for B;-1 < j:::: B;, 1 ::::

i:::: k)

IJ IP'(X1 = n;)f3i > 0


i=1

where Bo = 0, B; = f31 + f32 +


G = Y1 + Y2 + + Yk Therefore

+ {3;,

B = Bk. Similarly IP'(Sa = N

IP'(Sa- SG,B+G = x) 2: IP'(Sa = N

+ x)IP'(Sn =

+ x)

> 0 where

N) > 0

where SG,B+G = ~f=+J'+ 1 X;. Also, IP'(Sn- Sn,B+G = -x) > 0 as required.

5.

Let X 1, X 2, ... be independent integer-valued random variables with mean 0, variance 1, span 1,
I 2

and common characteristic function cf>. We are required to prove that JnlP'(Un = x)--+ e -z-x
as n --+ oo where
1
X1 + X2 + + Xn
Un = JnSn =
Jn

;...fiJi

and xis any number of the form kf Jn for integral k. The case of generalt-t and a 2 is easily derived
from this.
By the result of Exercise (5.9.4), for any such x,

IP'(Un

= x) =

r.;;

2n v n

Jrr -fii e-ztxcf>Un


.
(t) dt,
-rr -fii

since Un is arithmetic. Arguing as in the proof of the local limit theorem (6),

2n IJnlP'(Un = x)- f(x)l :S In+ In


where

is the N(O, 1) density function, and

Now In = 2...fiii ( 1 - <I> (n Jn)) --+ 0 as n --+ oo, where <I> is the N (0, 1) distribution function. As
for In, pick 8 E (0, n). Then

I 2

The final term involving e -z-t is dealt with as was ln. By Exercise (5.7.5a), there exists).
such that lc/>(t)l <).if 8:::: ltl:::: n. This implies that

and it remains only to show that


as n--+ oo.

251

(0, 1)

[5.10.6]-[5.10.7]

Solutions

Generating functions and their applications

The proof of this is considerably simpler if we make the extra (though unnecessary) assumption
that m3 =lEI
I < oo, and we assume this henceforth. It is a consequence of Taylor's theorem (see

XI

Theorem (5.7.4)) that cp(t) = 1- ~t 2 -

iit 3m3 +o(t 3) as t --+ 0. It follows that cp(t) = e-'It +t


I 2

3
e(t)

for some finite &(t). Now lex- 11 :S lxlelxl, and therefore

Let K 0 = sup{l&(u)l : lui :S 8}, noting that K 0 < oo, and pick 8 sufficiently small that 0 < 8 < rr
and 8K0 <
For It I < 8../fi,

!-

an -e _lr21
1t1
2
1 2
1t1 _I 12
IcfJ(tfvn;
2
:SK0 .fiiexp(t8K0 - 2 t):SK0 .fiie 4 ,
3

and therefore
as n--+ oo
as required.

6.

The second moment of the Xi is


2

lo0e-

x2

2x (log x ) 2

dx

~-1 e2u
-du
-oo u 2

(substitute x = eu), a finite integral. Therefore the X's have finite mean and variance. The density
function is symmetric about 0, and so the mean is 0.
By the convolution formula, if 0 < x < e- 1,

e-l

h(x) =

f(y)f(x- y)dy

-e-1

r
Jo

f(y)f(x- y)dy

f(x)

r
Jo

f(y)dy,

since f(x- y), viewed as a function of y, is increasing on [0, x]. Hence

(x) >

....l.!:!)_ =

- 2log lx I

1
41x I(log lx 1) 3

for 0 < x < e - 1 . Continuing this procedure, we obtain


(x) >

~'
Jn

kn
lxl(log lxl)n+ 1 '

0 < x < e- ,

for some positive constant kn. Therefore fn (x) --+ oo as x --+ 0, and in particular the density function
of (X 1 + + X n) / .fii does not converge to the appropriate normal density at the origin.
7.

We have for s > 0 that

cp(is) =

v2rr
=

1
~
v2rr

roo exp( -(2x)- 1 - xs )x- 312 dx


looo exp (- 21 y 2 - sy - 2) 2dy by substituting x = y- 2

lo

=exp(-../2i),

252

Solutions

Large deviations

[5.10.8]-[5.11.3]

by the result of Problem (5.12.18c), or by consulting a table of integrals. The required conclusion
follows by analytic continuation in the upper half-plane. See Moran 1968, p. 27l.
(a) The sum Sn = I;~=l Xr has characteristic function E(eitSn) = rp(t)n = (tn 2 ), whence
Un = Sn/n has characteristic function rp(tn) = E(eitnX I). Therefore,

8.

IP'(Sn <c)= IP'(nX1 <c)= 1P' ( X1 <

(b) E(eitTn)

~) ~ 0

asn

oo.

= rp(t) = E(eitX 1 ).

9. (a) Yes, because Xn is the sum of independent identically distributed random variables with
non-zero variance.
(b) It cannot in general obey what we have called the central limit theorem, because var(Xn) =
(n 2 - n) var(8) + nE(8)(1 - E(8)) and n var(Xl) = nE(8)(1 - E(8)) are different whenever
var(8) =F 0. Indeed the right 'normalization' involves dividing by n rather than Jn. It may be shown
when var( e) =F 0 that the distribution of X n / n converges to that of the random variable 8.

5.11 Solutions. Large deviations


1. We may write Sn = I:'i Xi where the Xi have moment generating function M(t) = J.(e 1 +e- 1 ).
Applying the large deviation theorem (5.11.4), we obtain that, for 0 <a < l, IP'(Sn > an) 1fn ~
inf1 >o{g(t)} where g(t) =e-at M(t). Now g has a minimum when e 1 = .J(1 + a)/(1 -a), where
it takes the value 1/V(l + a)l+a(l- a)l-a as required. If a 2: 1, then IP'(Sn >an)= 0 for all n.

2. (i) Let Yn have the binomial distribution with parameters n and i. Then 2Yn - n has the same
distribution as the random variable Sn in Exercise (5.11.1). Therefore, if 0 < a < 1,

and similarly for IP'(Yn - in < -ian), by symmetry. Hence

(ii) This time let Sn = X 1+ + Xn, the sum of independent Poisson variables with parameter 1. Then
Tn = en!P'(Sn > n(l +a)). The moment generating function of X1 - l is M(t) = exp(e 1 - 1 - t),
and the large deviation theorem gives that r,f!n ~ einf1 >o{g(t)} where g(t) =e-at M(t). Now
g' (t) = (e 1 -a - l) exp(e 1 -at - t - 1) whence g has a minimum at t = log(a + l). Therefore
r,!ln ~ eg(log(1 +a)) = {ej(a + 1)}a+l.
3. SupposethatM(t) = E(e 1x) isfiniteontheinterval [-8, 8]. Now, fora> 0, M(8) 2: e 8a!P'(X >
a), sothat!P'(X >a)_:::: M(8)e- 8a. Similarly, IP'(X <-a)_:::: M(-8)e-8a_

Suppose conversely that such A, J-t exist. Then


M(t) _:::: E(eltXI) = {

J[O,oo)

eltlx dF(x)

where F is the distribution function of lXI. Integrate by parts to obtain


M(t) _:::: 1 + [-eltlx[l-

F(x)J]~ + fooo ltleltlx[l- F(x)]dx


253

[5.11.4]-[5.12.2]

Solutions

Generating functions and their applications

(the term '1' takes care of possible atoms at 0). However 1 - F(x) :::: ~-te-A.x, so that M(t) < oo if
ltl is sufficiently small.

4.

The characteristic function of Sn/n is {e-lt/nl}n = e-ltl, and hence Sn/n is Cauchy. Hence
lP'(Sn >an)=

dx

00

n(l

+ x 2)

(n

1 --tan- 1 a ) .
=n 2

5.12 Solutions to problems


1.

The probability generating function of the sum is

1 6
{6
~s

i}

10

1 ) 10 { 1 _ s6}
6s
~

10

1 ) 10
6
(1 - lOs + .. )(1 +lOs + ... ).
6s

The coefficient of s 27 is

2. (a) The initial sequences T, HT, HHT, HHH induce a partition of the sample space. By conditioning
on this initial sequence, we obtain f(k) = qf(k - 1) + pqf(k - 2) + p 2qf(k - 3) fork > 3,
where p + q = 1. Also f ( 1) = f (2) = 0, f (3) = p 3 . In principle, this difference equation
may be solved in the usual way (see Appendix 1). An alternative is to use generating functions.
Set G(s) = 2.::~ 1 sk f(k), multiply throughout the difference equation by sk and sum, to find that
G(s) = p 3s3/{1- qs- pqs 2 - p 2qs 3 }. To find the coefficient of sk, factorize the denominator,
expand in partial fractions, and use the binomial series.
Another equation for f(k) is obtained by observing that X =kif and only if X > k- 4 and the
last four tosses were THHH. Hence
k-4

f(k) = qp 3 ( 1 - ~ /(i) '

k > 3.

t=l

Applying the first argumentto the mean, we find that f1, = E(X) satisfies f1, = q(1 + f-1,) + pq(2+
~-t) + p 2q(3 + ~-t) + 3p3 and hence 1-t = (1 + p + p2)jp3.
As for HTH, consider the event that HTH does not occur in n tosses, and in addition the next
three tosses give HTH. The number Y until the first occurrence of HTH satisfies
lP'(Y > n)p 2q = lP'(Y = n + 1)pq + lP'(Y = n + 3),

:=:::

2.

Sum over n to obtain E(Y) = (pq + l)j(p2q).


(b) G N(s) = (q + ps)n, in the obvious notation.
(i) 1!"(2 divides N) = ~{GN(1) + GN(-1)}, since only the coefficients of the even powers of s
contribute to this probability.
(ii) Let w be a complex cube root of unity. Then the coefficient of lP'(X = k) in ~ {G N (1) + G N(w) +
GN(w 2 )} is

~{1

+ w 3 + w6 } =
~{1 +w +w 2 } =

~{1+w 2

1,

if k = 3r,

0,

ifk = 3r + 1,

+w4 }=0,

ifk=3r+2,

254

Solutions

Problems

~{GN(l)

for integers r. Hence


+ GN(w) + GN(w2 )} =
is a multiple of 3. Generalize this conclusion.

2::;!~J JP(N =

[5.12.3]-[5.12.6]

3r), the probability that N

3. We have that T = k if no run of n heads appears in the first k - n - 1 throws, then there is a
tail, and then a run of n heads. Therefore IP(T = k) = JP(T > k- n- 1)qpn fork :=::: n + 1 where
p + q = 1. Finally JP(T = n) = pn. Multiply by sk and sum to obtain a formula for the probability
generating function G of T:
00

00

G(s)-pnsn=qpn L
sk L
JP(T=j)=qpnLIP(T=j) L
sk
k=n+1 j>k-n-1
j=1
k=n+1

qpnsn+1 oo
.
qpnsn+1
1-s LIP(T=j)(l-sl)= 1-s (1-G(s)).
j=1

Therefore

4.

The required generating function is


G(s) =

~ sk
L...

(k- l)

k=r

r- 1

pr (1- p)k-r =

(___!!!_)r
l - qs

where p+q = 1. The mean is G'(l) = rjp and the variance is G"(l) +G'(l)- {G'(l)} 2 = rqjp 2 .
5.

It is standard (5.3.3) that Po(2n) =

Po(2n) ~

e::) (pq)n. Using Stirling's formula,

( 2n)2n+~ e-2n .J27f


1

{nn+2e-n.J2]r}2

(pq)n

(4 pq)n

;:;;-;; .

v7rn

The generating function Fo(s) for the first return time is given by Fo(s) = 1 - Po(s)- 1 where
Po(s) = l:n s 2n Po(2n). Therefore the probability of ultimate return is Fo(l) = 1 -A - 1 where, by
Abel's theorem,
.f
1
1 p=q=2>
A= LPo(2n){ = 00
if p =J:. q.
n
<00
Hence Fo(1) = l if and only if p = ~
6.

(a) Rn = X~

+ Y1' satisfies

Hence Rn = n + Ro = n.
(b) The quick way is to argue as in the solution to Exercise (5.3.4). Let Un = Xn + Yn, Vn = Xn- Yn.
Then U and V are simple symmetric random walks, and furthermore they are independent. Therefore

255

[5.12.7]-[5.12.8]

Generating functions and their applications

Solutions

by (5.3.3). Using Stirling's formula, Po(2n) ~ (mr)- 1, and therefore L:n Po(2n) = oo, implying
that the chance of eventual return is l.
A longer method is as follows. The walk is at the origin at time 0 if and only if it has taken equal
numbers of leftward and rightward steps, and also equal numbers of upward and downward steps.
Therefore
(2n)!
(l)4n (2n) 2
( l)2n n
Po(2n) = 4
(m!)2{(n- m)!}2 = 2
n

Let eij be the probability the walk ever reaches j having started from i. Clearly eao =
ea,a-1 ea-1,a-2 e10, since a passage to 0 from a requires a passage to a - 1, then a passage
to a- 2, and so on. By homogeneity, eao = (ero)a.

7.

By conditioning on the value of the first step, we find that e10


cubic equation x = px 3 + q has roots x = 1, c, d, where
c=

Now

lei

> 1, and

ldl

-p- Jp2 +4pq


2p

pe3o

+ qeoo = pef0 + q.

The

d= -p+Jp2+4pq.
2p

:=:: 1 if and only if p 2 + 4pq :=:: 9p 2 which is to say that p _::: ~-It follows that

e10 = 1 if p _::: ~.so that eao = 1 if p _::: ~-

When p > ~.we have that d < 1, and it is actually the case that e10

eao

-p + Jp2 +4pq)a
2p

= d, and hence

'f p > 3
1

In order to prove this, it suffices to prove that eao < 1 for all large a; this is a minor but necessary
chore. Write Tn = Sn - So = I:i= 1 Xi, where Xi is the value of the ith step. Then
eao

= lP'(Tn

_::: -a for some n :=:: 1)

= lP'(nJL-

Tn :=:: nJL +a for some n :=:: 1)

00

.::0

L lP'(nJL- Tn 2':: nJL +a)


n=1

where Jl = E(X1) = 2p- q > 0. As in the theory of large deviations, fort> 0,

where X is a typical step. Now E(e 1 (~t-X)) = 1 + o(t) as t 0, and therefore we may pick t > 0
such that e(t) = e- 11LE(e 1(tL-X)) < 1. It follows that eao .::0 I;~ 1 e-tae(t)n which is less than 1
for all large a, as required.
8.

We have that

where p + q = 1. Hence Gx,y(s, t) = G(ps + qt) where G is the probability generating function
of X + Y. Now X and Y are independent, so that
G(ps

+ qt) = Gx(s)Gy(t) = Gx,y(s, 1)Gx,Y(1, t) = G(ps + q)G(p + qt).


256

Solutions [5.12.9]-[5.12.11]

Problems

Write f(u) = G(l + u), x = s- 1, y = t - 1, to obtain f(px + qy) = f(px)f(qy), a functional


equation valid at least when -2 < x, y ::; 0. Now f is continuous within its disc of convergence,
and also f(O) = 1; the usual argument (see Problem (4.14.5)) implies that f(x) = eA.x for some
A, and therefore G(s) = f(s- 1) = eA(s- 1). Therefore X+ Y has the Poisson distribution with
parameter A. Furthermore, Gx(s) = G(ps + q) = e)cp(s-1), whence X has the Poisson distribution
with parameter Ap. Similarly Y has the Poisson distribution with parameter Aq.
In the usual notation, Gn+1(s) = Gn(G(s)). It follows that G~+ 1 (1) = G~(l)G 1 (1) 2
G~(l)G"(l) so that, after some work, var(Zn+1) = ~-t 2 var(Zn) + ~-tno- 2 . Iterate to obtain

9.

var(Zn+1) = o-2(~-tn

+ fl,n+1 + ... + f1,2n)

for the case f1, =F 1. If f1, = 1, then var(Zn+d = o- 2 (n

(52fl,n(l _

fl,n+1)

-fl,

n ::=::

0,

+ 1).

10. (a) Since the coin is unbiased, we may assume that each player, having won a round, continues
to back the same face (heads or tails) until losing. The duration D of the game equals kif and only
if k is the first time at which there has been either a run of r - 1 heads or a run of r - 1 tails; the
probability of this may be evaluated in a routine way. Alternatively, argue as follows. We record S
(for 'same') each time a coin shows the same face as its predecessor, and we record C (for 'change')
otherwise; start with a C. It is easy to see that each symbol in the resulting sequence is independent
of earlier symbols and is equally likely to be S or C. Now D = k if and only if the first run of r - 2
S's is completed at time k. It is immediate from the result of Problem (5.12.3) that

( .!st-2 (1 - .!s)

GD(s)= 2
1- s

+ (~sy- 1

(b) The probability that Ak wins is


00

1rk = ~)P'(D = n(r- 1)

+ k- 1).

n=1

Let w be a complex (r - 1)th root of unity, and set

1 {
G D(s)
r-1

Wk(s) = - -

1
1
2
) G D(w s)
+ kTG
D(ws) + ~(k
ww -1
1
r-2 }
+ ... + w(r-2)(k-1)
G D (w
s)

It may be seen (as for Problem (5.12.2)) that the coefficient of si in Wk(s) is lP'(D = i) if i is of the
form n(r- 1) + (k- 1) for some n, and is 0 otherwise. Therefore lP'(Ak wins)= Wk(1).
(c) The pool contains D when it is won. The required mean is therefore
JE(D I Ak wins)=

E(DI

{Ak wms}

lP'(Ak wins)

W 1 (1)

= _k_.
Wk(1)

(d) Using the result of Exercise (5.1.2), the generating function of the sequence lP'(D > k), k
T(s) = (1- G D(s))/(1 - s). The required probability is the coefficient of sn in T(s).

::=::

0, is

11. Let Tn be the total number of people in the first n generations. By considering the size Z 1 of the
first generation, we see that
ZJ

Tn = 1 + LTn-1(i)
i=1

257

[5.12.12]-[5.12.14]

Solutions

Generating functions and their applications

where Tn-1 (1), Tn-1 (2), ... are independent random variables, each being distributed as Tn-1 Using
the compounding formula (5.1.25), Hn(s) = sG(Hn-1 (s)).

12. We have that


lP'(Z > N
n

IZ
m

= O) = lP'(Zn > N, Zm = 0)
lP'(Zm = 0)
=

lP'(Zm = 0

I Zn

+ r)lP'(Zn

lP'(Zm-n = O)N+rlP'(Zn = N

= N

+ r)

+ r)

lP'(Zm = 0)

r= 1

:S

= N

lP'(Zm = 0)

r= 1

lP'(Zm = O)N+ 1
lP'(Zm = O)

00

L lP'(Zn = N + r) :S lP'(Zm = 0) N = Gm(O) N .


r=1

13. (a) We have that Gw(s) = GN(G(s)) = eJ.(G(s)- 1). Also, Gw(s) 11n = eJ.((G(s)- 1)/n, the
same probability generating function as Gw but with A replaced by Ajn.
(b) We can suppose that H (0) < 1, since if H (0) = 1 then H (s) = 1 for all s, and we may take A = 0
and G(s) = l. We may suppose also that H(O) > 0. To see this, suppose instead that H(O) = 0
so that H(s) = sr "L.'f':=o sj hj+r for some sequence (hk) and some r 2: 1 such that hr > 0. Find a
positive integer n such that r fn is non-integral; then H(s) 1/n is not a power series, which contradicts
the assumption that H is infinitely divisible.
Thus we take 0 < H(O) < 1, and so 0 < 1 - H(s) < 1 for 0::: s < l. Therefore
log H(s) =log( 1 - {1 - H(s)l) =A ( -1

+ A(s))

where A = -log H(O) and A(s) is a power series with A(O) = 0, A(1)
00
.
"'j=
1 ajsl, we have that

l. Writing A(s)

as n --+ oo. Now H(s) 11n is a probability generating function, so that each such expression is nonnegative. Therefore aj 2: 0 for all j, implying that A(s) is a probability generating function, as
required.

14. It is clear from the definition of infinite divisibility that a distribution has this property if and only
if, for each n, there exists a characteristic function 1/Jn such that cf>(t) = 1/Jn (t)n for all t.
(a) The characteristic functions in question are
"t

cf> (t) = e'


Poisson (A) :
r(A, ~-t):

4> (t) =

JL- 'I1 a 2t2


"t

eJ.(e' -1)

cf>(t) =(-A.
A -It

)JL.

In these respective cases, the 'nth root' 1/Jn of 4> is the characteristic function of the N(~-t/n, a 2 jn),
Poisson (A/n), and r(A, ~-t/n) distributions.

258

Solutions [5.12.15]-[5.12.16]

Problems

(b) Suppose that ifJ is the characteristic function of an infinitely divisible distribution, and let 1/Jn be a
characteristic function such thatifJ(t) = 1/Jn(t)n. Now lifJ(t)l ::0 l for all t, so that
11/ln(t)l = lifJ(t)l1/n--+ { l

~ lifJ(t)l

lf lifJ(t)l

=I= 0,

= 0.

For any value oft such that ifJ(t) =I= 0, it is the case that 1/Jn (t) --+ 1 as n --+ oo. To see this, suppose
< 2n such that 1/Jn (t) --+ eie along some subsequence.
instead that there exists satisfying 0 <
Then 1/Jn (t)n does not converge along this subsequence, a contradiction. It follows that

1/J(t)

lim

1/ln (t) = { 1 ~f ifJ (t)


0

n---+oo

=I= 0,

tf ifJ(t) = 0.

Now J is a characteristic function, so that ifJ(t) =I= 0 on some neighbourhood of the origin. Hence
1/f(t) = 1 on some neighbourhood of the origin, so that 1/J is continuous at the origin. Applying the
continuity theorem (5.9.5), we deduce that 1/J is itself a characteristic function. In particular, 1/J is
continuous, and hence 1/f(t) = 1 for all t, by ( *). We deduce that J (t) =I= 0 for all t.

15. We have that


lP'(N = n I S = N) =

ll"(S = n IN= n)lP'(N = n)

l:k lP'(S = k I N = k)lP'(N = k)

pnlP'(N = n)

2::~1 pkJP'(N = k)

Hence lE(xN I S = N) = G(px)jG(p).


If N is Poisson with parameter A, then
e)c(px-1)

lE(xN I S = N) = eA(p-1) = e)cp(x-1) = G(x)P.


Conversely, suppose that JE(xN I S = N) = G(x)P. Then G(px) = G(p)G(x)P, valid for lxl ::0 1,
0 < p < 1. Therefore f(x) = logG(x) satisfies f(px) = f(p) + pf(x), and in addition f has a
power series expansion which is convergent at least for 0 < x ::0 1. Substituting this expansion into the
above functional equation for f, and equating coefficients of pi xj, we obtain that f(x) = -A(1- x)
for some A :=:: 0. It follows that N has a Poisson distribution.

16. Certainly
1 - (Pl + P2)) n
Gx(s) = Gx,Y(s, 1) = ( 1
,
- P2- PIS

Gy(t) = Gx,Y(l, t) =

1- (Pl + P2)
Gx+Y(s) = Gx,Y(s, s) = ( 1 _ (Pl + p 2 )s

)n

+ P2)) n ,
1 - P1 - P2t

( 1 - (P1

giving that X, Y, and X+ Y have distributions similar to the negative binomial distribution. More
specifically,

ll"(X = k) =

C+~- )ci(l1

ll"(X

+y

fork 2':: 0, where a= pl/(1- P2), ,8

a)n,

= k) = ( n

= P2!(1

lP'(Y = k) = (n

+~ -

- Pl), Y

259

+~ -

1) yk (1 - y )n,

= Pl + P2

1),ak(1 _ ,B)n,

[5.12.17]-[5.12.19]

Solutions

Generating functions and their applications

Now
JE(sx I y = y) =

JE(sx I

{Y=yJ

JP'(Y = y)

where A is the coefficient of tYinG x,Y(s, t) and B is the coefficient of tY in Gy(t)o Therefore
JE(sx

Iy

P2)n (____!!l:_)y/ {( 1- Pl- P2)n (____!!l:_)y}

= y) = ( 1- Pl1-pls

=(

1-pls

1 - Pl ) n+y

1-pl

1- PIS
17. As in the previous solution,

18. (a) Substitute u = y I a to obtain

(b) Differentiating through the integral sign,

-ai = !ooo {-2b


- exp( -a2 u 2 ab

=-

by the substitution u

= b/ y

u2

b 2 u -2 ) } du

!ooo 2exp(-a 2b2y- 2- y 2)dy =

-2I(l,ab),

(c) Hence a I jab= -2ai, whence I= c(a)e- 2ab where


c(a)=I(a,O)=

loo

oo -a2u2

(d) We have that

by the substitution x = y 2
(e) Similarly

by substituting x = y-20

19. (a) We have that

260

..jii

du=-0
2a

1-pl

Solutions [5.12.20]-[5.12.22]

Problems

in the notation of Problem (5.12.18). Hence U has the Cauchy distribution.


(b) Similarly

fort> 0. Using the result of Problem (5.12.18e), V has density function


f(x) = _1_e-lf(2x),

x > 0.

J2nx3
(c) We have that

w- 2 =

x- 2 + y- 2 + z- 2 . Therefore, using (b),

fort > 0. It follows that w- 2 has the same distribution as 9V = 9X- 2 , and so W 2 has the same
distribution as ~ X 2 . Therefore, using the fact that both X and W are symmetric random variables, W
has the same distribution as

1X, that is N (0, ~).

20. It follows from the inversion theorem that


F(x +h)- F(x)

2n

------=-

1N

1"

1m

-N

N-+oo

1 - e-ith -itx,~.( ) d
e
'~" t
t.

it

Since 14>1 is integrable, we may use the dominated convergence theorem to take the limit as h ..j, 0
within the integral:
f(x)

lim

___!__

2n

N-+oo

1N

e-itxrf>(t)dt.

-N

The condition that 4> be absolutely integrable is stronger than necessary; note that the characteristic
function of the exponential distribution fails this condition, in reflection of the fact that its density
function has a discontinuity at the origin.

21. Let Gn denote the probability generating function of Zn. The (conditional) characteristic function
of Zn/fl.-n is
lE(eitZn/JLn

IZ

>

0) =

Gn(eit/JLn)- Gn(O).

1 - Gn(O)

It is a standard exercise (or see Example (5.4.3)) that

whence by an elementary calculation


as n -+ oo,
the characteristic function of the exponential distribution with parameter 1 - p., -l .
22. The imaginary part of rf>x(t) satisfies

261

[5.12.23]-[5.12.24]

Solutions

Generating functions and their applications

for all t, if and only if X and -X have the same characteristic function, or equivalently the same
distribution.

23. (a) U = X + Y and V = X - Y are independent, so that r/Ju + v = r/Jur/Jv, which is to say that
2X = X+YrfJX-Y or

Write 1/f(t) = rjJ(t)!(-t). Then


1fr(2t) =

(2t) = (t)3(-t) = 1/f(t)2


(-2t)
(-t)3(t)

Therefore
1/f(t) = 1fr(1t)2 = 1fr(!t)4 = ... = 1/f(t/2n)2n

for n:::: 0.

However, as h -+ 0,

so that 1/f(t) = { 1 + o(t 2 j22n)} zn -+ 1 as n -+ oo, whence 1/f(t)


(-t) = rjJ(t). It follows that
rp(t) = (1t)3(-1t) = rfJ(it)4 = rp(t/2nln
=

{1-! .._
2

22n

+o(t2/22n)}z2n-+ e-1t2

1 for all t, giving that

for n :::: 1
as n-+ oo,

so that X andY are N(O, 1).


(b) With U =X+ Y and V =X- Y, we have that 1/f(s, t) = E(eisU+itV) satisfies
1/f(s, t) = E(ei(s+t)X+i(s-t)Y) = rp(s + t)rjJ(s- t).
Using what is given,

However, by (*),

a2t/
at

=2{"(s)(s)-'(s)2},

t=o

yielding the required differential equation, which may be written as


d ' /)
-(
ds

= -1.
1 2

Hence log(s) =a+ bs -1s 2 for constants a, b, whence (s) = e-:zs .


24. (a) Using characteristic functions, rpz(t) = r/Jx(t/n)n = e-ltl.
(b) EIX;I = oo.

262

Solutions [5.12.25]-[5.12.27]

Problems

25. (a) See the solution to Problem (5.12.24).


(b) This is much longer. Having established the hint, the rest follows thus:
fX+Y(Y)

=
=

L:

f(x)f(y -x)dx

100 {J(x) +

JT(4 + y ) -oo

f(y- x)} dx + J g(y)

JT(4 + y )

+ J g(y)

where
1= L:{xf(x)+(y-x)f(y-x)}dx

lim

M,N---+oo

[__!_{log(l+x 2)-log(1+(y-x) 2)}]N


2JT

-M

=0.

Finally,
fz(z) = 2fx+Y(2z) =

1
JT(1

+ z2 )

26. (a)X1 +X2++Xn.


(b) X 1 - X!, where X 1 and X! are independent and identically distributed.
(c) XN, where N is a random variable with W'(N = j) = P) for 1 :::: j :::: n, independent of
X1, X2, ... , Xn.
(d) ~~ 1 ZJ where Z1, Z2, ... are independent and distributed as X1, and M is independent of the
Zj with W'(M = m) = <i)m+ 1 form ::: 0.
(e) Y X 1, where Y is independent of X 1 with the exponential distribution parameter 1.

27. (a) We require

1_00

2eitx

oo

rjJ(t)

e:n:x

+ e -:n:x dx.

First method. Consider the contour integral

where C is a rectangular contour with vertices at K, K + i. The integrand has a simple pole at
1

z = ~i, with residue e-'J.t j(iJT). Hence, by Cauchy's theorem,


asK-+ oo.

Second method. Expand the denominator to obtain

00

cosh(Jrx)

k=O

- - - = :L<-1)k exp{ -(2k +


Multiply by eitx and integrate term by term.

263

1)Jrlxl}.

[5.12.28]-[5.12.30]
(b) Define </J(t)

Solutions

Generating functions and their applications

= 1 -It I for It I .::::


__!__

2rr

= 0 otherwise.

1, and </J(t)

Then

1oo
11 e-itx (1- It I) dt
-oo e-itx</J(t) dt = __!__
2rr
-1

= -1 lo1 (1 -

t) cos(tx) dt

7i

= - 1 2 (1- cosx).
liX

Using the inversion theorem, </J is the required characteristic function.


(c) In this case,

1-oo
00

eitxe-x-e-x dx =

roo y-ite-Y dy = r(l- it)

lo

where r is the gamma function.


(d) Similarly,

L:

~eitxe-lxl dx = ~ {fooo eitxe-x dx + fooo e-itxe-x dx}


1{

(e) We have that lE(X)


that

-i</J'(O)

1
1- it

+ 1 +it

= 1

-r'(l). Now, Euler's product for the gamma function states

r(z) =

.
hm

n!nz

n->oo z(z + 1) (z + n)

where the convergence is uniform on a neighbourhood of the point z

r , (1) =

hm
n->oo

+ t2.

= 1. By differentiation,

{ -n- ( logn- 1 - 1- - ... - -1- )}


n+1
2
n +1

= -y.

28. (a) See Problem (5.12.27b).


(b) Suppose <P is the characteristic function of X. Since <P' (0) = <P" (0) = <P"' (0) = 0, we have that
lE(X) = var(X) = 0, so that lP'(X = 0) = 1, and hence </J(t) = 1, a contradiction. Hence </J is not a
characteristic function.
(c) As for (b).
(d) We have that cost

= ~ (eit + e-it), whence </J is the characteristic function of a random variable

taking values 1 each with probability ~ .


(e) By the same working as in the solution to Problem (5.12.27b), </J is the characteristic function of
the density function
1 - lxl if lxl < 1,
f(x) = {
.
0
otherwtse.

29. We have that

11 _ </J(t)l .::::JEll_ eitxl

= lEVO _ eitX)(l _ e-itX)

= JEy'2{1- cos(tX)}

.:S lEitXI

since 2(1 - cos x) .:S x 2 for all x.

30. This is a consequence of Taylor's theorem for functions of two variables:

</J(s, t)

= "'
L....t

m<M
n?:_N

smtn
--</Jmn(O, 0)
m!n!

264

+ RMN(s, t)

Solutions [5.12.31]-[5.12.33]

Problems

where ifJmn is the derivative of ifJ in question, and RM N is the remainder. However, subject to appropriate conditions,

whence the claim follows.

31. (a) We have that


x2
x2
x4
< - - - < 1- cosx
3 - 2!
4! -

if lx I _:::: 1, and hence


(tx) 2 dF(x) _:::: {

{
Jr-r-l,r-1]

L:

3{1-cos(tx)}dF(x)

J[-r-l,r-1]

_:::: 3

{1- cos(tx)} dF(x) = 3{1 - Re ifJ(t)}.

(b) Using Fubini's theorem,

!
t

r {1-cos(vx)}dvdF(x)
x=-oo !t Jv=O
= loo (1- sin(tx)) dF(x)
-oo
tx

r{1-ReJ(v)}dv=1

00

2: {

x:

Jo

Jltxl~1

(1-

sin(tx)) dF(x)
tx

since 1 - (tx)- 1 sin(tx) 2: 0 if !tx! < 1. Also, sin(tx) _:::: (tx) sin 1 for !tx! 2: 1, whence the last
integral is at least
{ x: (1- sin 1) dF(x) 2:

Jltxl~1

~lP'(IXI

2: t- 1).

32. It is easily seen that, if y > 0 and n is large,

33. (a) The characteristic function of YA is

1/JA (t) =IE{ exp(it(X- A)/.J'i..)} = exp{ A(eit/v'A- 1) - it.J'i..} = exp{- ~t 2 + o(l)}
as A-+ oo. Now use the continuity theorem.
(b) In this case,

so that, as A -+ oo,
it
t2
1 ) -+--t.
1 2
loglfrA(t)=-itv'A-Alog ( 1 -it- ) =-itv'A+A ( ---+o(A-)

,ft.

,ft.

265

2A

[5.12.34]-[5.12.36]

Solutions

Generating functions and their applications

(c) Let Zn be Poisson with parameter n. By part (a),


JED (

z~ n

o)

::=:

-+ <f>(O)

where <f> is the N(O, 1) distribution function. The left hand side equals lfD(Zn ::=: n) =

'Z=o e-nnk jk!.

34. If you are in possession of r - 1 different types, the waiting time for the acquisition of the next
new type is geometric with probability generating function
Gr(s)
Therefore the characteristic function of Un

= (Tn

II

+ l)s

(n- r

n- (r- l)s

- n log n) / n is

II

.
n
.
. n {(n-r+l)eitjn}
n-itnl
1/ln(t) = e-ztlogn
Gr(eztfn) = n-zt
=
.
.
n- (r- l)elt/n
rrn-l(ne-ztjn- r)
r=l
r=l
r=O
The denominator satisfies
n-1

n-1

r=O

r=O

II (ne-itfn- r) = (1 + o(l)) II (n-it- r)

as n -+ oo, by expanding the exponential function, and hence


lim 1/Jn(t)

n---+oo

n-itn'

lim

I
.
n---+00 rr~,:o(n- it- r)

= r ( l - it),

where we have used Euler's product for the gamma function:


n!nZ
n
IT r=o(z
+ r)

as n -+ oo

-+ r(z)

the convergence being uniform on a11y region of the complex plane containing no singularity of r.
The claim now follows by the result of Problem (5.12.27c).

35. Let Xn be uniform on [-n, n], with characteristic function

1
n

</Jn(t)

1 .t
- e 1 x dx
-n 2n

sin(nt)
nt
1

if t

=f. 0,

if t = 0.

It follows that, as n -+ oo, <Pn (t) -+ 8o1 , the Kronecker delta. The limit function is discontinuous at
t = 0 and is therefore not itself a characteristic function.

36. Let G i (s) be the probability generating function of the number shown by the i th die, and suppose
that
12

G1 (s)G2(s) =
so that 1- s 11
However

1 k
{;2 rrs
=

11(1- s)HJ (s)H2(s) where Hi(s)


5

1- s 11

s 2(1 -s 11)
11 (1 _ s) ,

= s- 1Gi(s) is a real polynomial of degree 5.

= (1- s) II (wk- s)(wk- s)


k=l

266

Solutions [5.12.37]-[5.12.38]

Problems

where w1, w1, ... , ws, ms 'are the ten complex eleventh roots of unity. The Wk come in conjugate
pairs, and therefore no five of the ten terms in rr~=1 (Wk - S )(Wk - S) have a product Which is a real
polynomial. This is a contradiction.
37. (a) Let H and T be the numbers of heads and tails. The joint probability generating function of
Hand Tis

where p = 1 - q is the probability of heads on each throw. Hence


G H,T(s, t)

= GN(qt + ps) =

exp {A.(qt

+ ps- 1)}.

It follows that
GT(f)

= G H,T(l, t) = eA.q(s- 1),

so that GH,T(s, t) = G H(s)GT(f), whence Hand Tare independent.


(b) Suppose conversely that H and T are independent, and write G for the probability generating
function of N. From the above calculation, G H,T(s, t) = G(qt + ps), whence G H(s) = G(q + ps)
and GT(f) = G(qt + p), so that G(qt + ps) = G(q + ps)G(qt + p) for all appropriates, t. Write
f(x) = G(l- x) to obtain f(x + y) = f(x)f(y), valid at least for all 0 ::::: x, y ::::: min{p, q}.
The only continuous solutions to this functional equation which satisfy f (0) = 1 are of the form
f(x) = eJ.Lx for some p,, whence it is immediate that G(x) = eA.(x- 1) where A.= -p,.
38. The number of such paths Tr containing exactly n nodes is 2n- 1, and each such Tr satisfies
lP'(B(rr) :::: k) = lP'(Sn :::: k) where Sn = Y1 + Y2 + + Yn is the sum of n independent Bernoulli
variables having parameter p (= 1- q). Therefore E{Xn(k)} = 2n- 11P'(Sn :::: k). We set k = nf3,
and need to estimate lP'(Sn :::: nf3). It is a consequence of the large deviation theorem (5.11.4) that, if
p ::::: f3 < 1,
lP'(Sn 0:: n{3) 1fn -+ inf {e-t{J M(t)}
t>O

where M(t) = E(etYt) = (q +pet). With the aid of a little calculus, we find that
lP'(Sn 0:: n/3)1/n -+

(~ ) {3(11 =~ )1-{J

Hence
E{Xn(f3n)}-+ { 0
00

where
y(f3) = 2

(~r

p::S/3<1.

~fy(f3) < 1,
tf y(/3) > 1,

c=

~r-{3

is a decreasing function of {3. If p < ~,there is a unique f3c E [p, 1) such that y(f3c) = 1; if p ::::
then y ({3) > 1 for all f3 E [p, 1) so that we may take f3c = 1.
Turning to the final part,
1P'(Xn(f3n) 0::

1)

::S E{Xn(f3n)}-+ 0

As for the other case, we shall make use of the inequality


E(N) 2
lP'(N =I= 0) :::: E(N2)

267

if f3 > f3c.

[5.12.38]-[5.12.38]

Solutions

Generating functions and their applications

for any N taking values in the non-negative integers. This is easily proved: certainly
var(N

IN

=I= 0) = E(N 2

IN

=I= 0) - E(N

IN

=I= 0) 2 ::: 0,

whence
E(N 2)

E(N) 2

--->
.
W'(N =/= 0) - W'(N =I= 0)2

We have that E{Xn (.Bn) 2 } = L:n,p E(Inlp) where the sum is over all such paths n, p, and In is the
indicator function of the event {B(n) ::: ,Bn }. Hence
E{Xn(,8n) 2 } = LE(/n)

+L

E(/nlp)

= E{Xn(,Bn)} + 2n- 1 L

n#p

E(hlp)

p#L

where L is the path which always takes the left fork (there are 2n- 1 choices for n, and by symmetry
each provides the same contribution to the sum). We divide up the last sum according to the number
of nodes in common top and L, obtaining 2::~-:,\ 2n-m- 1E(hlM) where M is a path having exactly
m nodes in common with L. Now
E(h/M) = E(IM

Ih

= 1)E(h) :S W'(Tn-m ::: ,Bn- m)E(h)

where Tn-m has the bin(n - m, p) distribution (the 'most value' to IM of the event {h
obtained when all m nodes in L n M are black). However

1} is

so that E(h/M)::; p-mE(h)E(/M). It follows that N = Xn(,Bn) satisfies

E(N2) :S E(N)

+ 2n-1

n-1

n-1 (

2n-m-1. -kE(h)E(/M)

= E(N) + iE(N)2 L

m=1

m=1

21 )
P

whence, by (*),

W'(N =I= 0) >

- E(N)-1

+ i 2::~-:,\ (2p)-m

If ,8 < ,Be then E(N) -+ oo as n -+ oo. It is immediately evident that W'(N =I= 0) -+ 1 if p ::;

Suppose finally that p >

i and ,8 < .Be By the above inequality,


P(Xn(,Bn) >

0) ::: c(,B)

for all n

where c(,B) is some positive constant. Find E > 0 such that ,8 + E < .Be Fix a positive integer m, and
let :Pm be a collection of 2m disjoint paths each of length n - m starting from depth m in the tree.
Now
P(Xn(,Bn) = 0) :S P( B(v) < ,Bn for all v E 9'm) = P( B(v) < ,Bn ) 2m
where v

9'm . However
P(B(v) < ,Bn) :S P(B(v) <

if ,Bn < (,8

+ E)(n -

m), which is to say that n ::: (,8

268

(,8

+ E)(n- m))

+ E)m/E.

Hence, for all large n,

Problems

Solutions

[5.12.39]-[5.12.42]

by(**); we let n--+ oo and m--+ oo in that order, to obtain lP'(Xn(.Bn) = 0)--+ 0 as n--+ oo.

39. (a) The characteristic function of Xn satisfies

the characteristic function of the Poisson distribution.


(b) Similarly,
A

Peitfn .

E(eitYnfn) =

--+ _ _
A- it

1 - (1- p)elt/n

as n --+ oo, the limit being the characteristic function of the exponential distribution.
40. If you cannot follow the hints, take a look at one or more of the following: Moran 1968 (p. 389),
Breiman 1968 (p. 186), Loeve 1977 (p. 287), Laha and Rohatgi 1979 (p. 288).
41. With Yk = kXk. we have that E(Yk)
Y1 + Y2 + + Yn is such that

0, var(Yk)

k 2 , EIYf I

k 3 . Note that Sn

as n --+ oo, where cis a positive constant. Applying the central limit theorem ((5.10.5) or Problem
(5.12.40)), we find that
Sn
D
as n --+ oo,
~ ---+ N(O, 1),
yvarSn
where var Sn

= l:k= 1 k 2 ~

1n 3 as n--+ oo.

42. We may suppose that 11 = 0 and a = 1; if this is not so, then replace Xi by Y; = (Xi - 11) /a.
Lett = (to, t1, t2, ... , tn) E JRn+l, and set 7 = n - 1 l:J= 1 fj. The joint characteristic function of the

+ 1 variables X, ZJ, Z2, ... , Zn is


ifJ(t)

= m:{ exp(itoX + titjZj)} = m:{


=

fl

exp (-

exp

(i [~ +

fj -7]

Xj)}

j=1

]=1

~ [ ~ + fj - 7r)

by independence. Hence
ifJ(t)

= exp ( - -1 Ln

where we have used the fact that

whence

s2 =

[t

_Q
2 j= 1 n

+ (fj

-7)

]2) = {

t2
1 L(tj
n
exp _...Q_--7) 2 }
2n
2 j= 1

L:J= 1(tj -7) = 0. Therefore

is independent of the collection z1, Z2, ... , Zn. It follows that


Compare with Exercise (4.10.5).

(n- 1)- 1 2::)= 1

zJ.

269

is independent of

[5.12.43]-[5.12.47]

Solutions

Generating functions and their applications

43. (i) Clearly, lP'(Y ::; y) = lP'(X::; logy)= cp(log y) for y > 0, where cp is the N(O, 1) distribution
function. The density function of Y follows by differentiating.
(ii) We have that fa(x) :=::: 0 if /a/::; 1, and

lo

oo

asin(2nlogx)

00
1
I I
2
1
I 2
M"::"e-z(ogx) dx=
M"::"asin(2ny)e-zY dy=O
x-v2n
-oo -v2n

since sine is an odd function. Therefore J~oo fa (x) dx = 1, so that each such fa is a density function.
For any positive integer k, the kth moment of fa is J~00 xk f(x) dx + la(k) where

la(k) =

1
k
I 2
M"::"a sin(2ny)e Y-zY dy = 0
-oo -v2n
00

since the integrand is an odd function of y- k. It follows that each fa has the same moments as f.

44. Here is one way of proving this. Let X 1, X 2, . . . be the steps of the walk, and let Sn be the
position of the walk after the nth step. Suppose t-t = E(X ,) satisfies t-t < 0, and let em = lP'(Sn =
0 for some n :=::: 1 I So= -m) where m > 0. Then em ::; 2:~ 1 1P'(Tn > m) where Tn = x, + X2 +
+ Xn = Sn - So. Now, fort > 0,
lP'(Tn > m)

= lP'(Tn

- nt-t > m- nt-t) :5 e-t(m-nf.J.)E(et(Tn-nf.J.))

= e-tm { ett.J. M(t)

where M(t) = E(et(XI-JJ.)). Now M(t) = 1 +O(t 2) as t --+ 0, and therefore there exists t (> 0) such
thate(t) = e 1 f.J.M(t) < 1 (rememberthatt-t < 0). Withthischoiceoft, em::; 2::~ 1 e-tme(t)n--+ 0
as m --+ oo, whence there exists K such that em < for m :=::: K.
Finally, there exist 8, E > 0 such that lP'(X, < -8) > E, implying that lP'(SN < -K I So= 0) >
EN where N = K I 8l, and therefore

lP'(Sn =I= 0 for all n

:=:::

1 I So = 0) :=::: (1 - e K )EN :=::: iEN;

therefore the walk is transient. This proof may be shortened by using the Borel-Cantelli lemma.

45. Obviously,

L={~,+L

ifX,>a,
ifX1 ::;a,

where L has the same distribution as L. Therefore,


a

E(sL) = salP'(X! >a)+ 2__:>rE(sL)lP'(X! = r).


r=l

46. We have that


with probability p,
with probability q,
where Wn is independentofWn-1 and has the same distribution as Wn. Hence Gn(s) = psGn-! (s)+
qsGn-! (s)Gn(s). Now Go(s) = 1, and the recurrence relation may be solved by induction. (Alternatively use Problem (5.12.45) with appropriate X;.)

47. Let Wr be the number of flips until you first see r consecutive heads, so that lP'(Ln < r) =
lP'(Wr > n). Hence,
oo
oo
1- E(s Wr)
1 + """snlP'(Ln < r) = """snlP'(W > n) = - - - L.J
L.J
1-s

n=l

n=O

270

Solutions

Problems

[5.12.48]-[5.12.52]

where E(s Wr) = Gr(s) is given in Problem (5.12.46).

48. We have that


Xn+l =

{ iXn
I

zXn

with probability

i,

with probability

i.

+ Yn

Hence the characteristic functions satisfy


'

n+l (t) = E(e

itX
I

I
I
I
I
A
n+l) = zn(zt) + zn(zt)--.
A -zt
,

A-

I .
zlt

A-

4!(

= n(zt)-,-.- = c/Jn-l(4t)~t = = 1(t2


A-lf

-n

A-!

A-

lf

2-n
.t

-!

-+ -,-------;-t
A-!

as n -+ oo. The limiting distribution is exponential with parameter A.


49. We have that

(a) (1-e-A.)/A, (b) -(pjq 2 )(q+1og p), (c) (1-qn+l )/[(n+ 1) p ], (d) -[l+(p/q) log p]jlog p.
(e) Not if lP'(X + 1 > 0) = 1, by Jensen's inequality (see Exercise (5.6.1)) and the strict concavity of
the function f(x) = 1jx. If X+ 1 is permitted to be negative, consider the case when lP'(X + 1 =
-1) =lP'(X+ 1 = 1) =

50. By compounding, as in Theorem (5.1.25), the sum has characteristic function


GN(x(t)) =

Px(t)
- _AP_
1- qx(t)
Ap- it'

whence the sum is exponentially distributed with parameter Ap.

51. Consider the function G(x) = {E(X 2 )}- 1 J~oo y 2 dF(y). This function is right-continuous and
increases from 0 to 1, and is therefore a distribution function. Its characteristic function is

52. By integration, fx(x) = fy(y) =


are not independent. Now,
fx+Y(Z)=

1
1

i, lxl <

1,

lyl < 1. Since

f(x,z-x)dx=

,\(z + 2)
1

4 (2-z)

-1

f(x,

y) i=

if - 2 <

fx(x)fy(y), X andY

z<

0,

if0<z<2,

the 'triangular' density function on ( -2, 2). This is the density function of the sum of two independent
random variables uniform on (-1, 1).

271

6
Markov chains

6.1 Solutions. Markov wocesses


1.

The sequence X 1, X2, ... of independent random variables satisfies


lP'(Xn+l = j

I X1

= i1, ... , Xn =in)= lP'(Xn+l = j),

whence the sequence is a Markov chain. The chain is homogeneous if the Xi are identically distributed.
2.

(a) With Yn the outcome of the nth throw, Xn+l = max{Xn, Yn+d, so that

0
{ I .
Pij =

if j < i

=i

if j

if j > i,

for 1 :::: i, j :::: 6. Similarly,


0
Pij(n) = { (l)n
6!

ifj<i
'f.
.
1 1 = !.

If j > i, then Pij(n) = lP'(Zn = j), where Zn = max{Y1, Y2, ... , Yn}, and an elementary calculation
yields
Pi} (n) =

(i)n - (1~ l)n ,

i < j:::: 6.

(b) Nn+l- Nn is independent of N1, N2, ... , Nn, so that N is Markovian with

i
{
Pi} =
t

if j = i
if j

+ 1,

= i,

otherwise.

(c) The evolution of Cis given by

Cr+l = {

~r + 1

whence C is Markovian with

Pi}=

if the die shows 6,


otherwise,

j =0,

j = i

otherwise.

272

+ 1,

Markov processes

Solutions

[6.1.3]-[6.1.4]

(d) This time,


Br+! = {

Br -1

if Br > 0,
if Br = 0,

Yr

where Yr is a geometrically distributed random variable with parameter


Bo, B2, , Br. Hence B is Markovian with
Pij=

(-65)j-1_61

(i) If Xn = i, then Xn+! E {i- 1, i

3.

(*)
lP'(Xn+! = i

+ 1 I Xn

if j = i - 1 ~ 0,
.f.
0 . 1
ll=,j~.

+ 1}. Now, fori

1,

= i, B) = lP'(Xn+! = i + 1 I Sn = i, B)lP'(Sn = i I Xn = i, B)
+ lP'(Xn+! = i + 1 I Sn = -i, B)lP'(Sn = -i I Xn = i, B)

where B = {Xr = ir for 0


lP'(Xn+! = i

! ,independent of the sequence

r < n} and io, i1, ... , in-! are integers. Clearly

+ 1 I Sn =

i, B)= p,

lP'(Xn+! = i

+ 1 I Sn

= -i, B)= q,

where p (= 1 - q) is the chance of a rightward step. Let l be the time of the last visit to 0 prior to
the time n, l = max{r : ir = 0}. During the time-interval (1, n], the path lies entirely in either the
positive integers or the negative integers. If the former, it is required to follow the route prescribed by
the event B n {Sn = i }, and if the latter by the event B n {Sn = -i}. The absolute probabilities of
these two routes are

whence
lP'(Sn = i

n, +

I Xn = i, B) =

Jl"i

7r2

L
= 1 - lP'(Sn = -i I Xn = i, B).
pz + qz

Substitute into (*) to obtain

.
.
pi+!+ qi+!
lP'(Xn+! = l + 1 I Xn = l, B)=
.
.
= 1 -lP'(Xn+l = i - 1 I Xn = i, B).
pl +ql
Finally lP'(Xn+l = 1 I Xn = 0, B) = 1.
(ii) If Yn > 0, then Yn - Yn+! equals the (n + l)th step, a random variable which is independent of
the past history of the process. If Yn = 0 then Sn = Mn, so that Yn+! takes the values 0 and 1 with
respective probabilities p and q, independently of the past history. Therefore Y is a Markov chain
with transition probabilities
fori > 0,

Pij =

ifj=i-1
'f .
.+1
1 J = l
'

POj = { q

ifj=O
ifj=l.

The sequence Y is a random walk with a retaining barrier at 0.

4.

For any sequence io, i" ... of states,


lP'(Yk+! = ik+!

I Yr = ir for 0

~ r ~

k) =

=
=

lP'(Xns =is for 0 < s < k + 1)


.
- lP'(Xns = ls forO~ s ~ k)

TI~=O Pis .is+! (ns+ I

ns)

k-!

Tis=O Pis,is+l (ns+l- ns)


Pikoik+l

273

(nk+! - nk) = lP'(Yk+! = ik+!

I Yk = ik),

[6.1.5]-[6.1.9]

Markov chains

Solutions

where Pij (n) denotes the appropriate n-step transition probability of X.


(a) With the usual notation, the transition matrix of Y is
p2
lrjj

if j = i

{ 2pq
q2

if j

+ 2,

= i,

if j = i- 2.

(b) With the usual notation, the transition probability lrij is the coefficient of sj in G(G(s))i.
5.

Writing X= (X1, X2, ... , Xn), we have that

.)

( I

lP' F /(X)= 1, Xn = z =

lP'(F,/(X)=l.Xn=i)
(
.)
lP' /(X)= 1,Xn = I

where F is any event defined in terms of Xn, Xn+1 .... Let A be the set of all sequences x =
(x1, x2, ... , Xn-1, i) of states such that I (x) = 1. Then
lP'(F, /(X)= 1, Xn =

i)

L lP'(F, X= x) = lP'(F I Xn = i) L lP'(X = x)


XEA

XEA

by the Markov property. Divide through by the final summation to obtain lP'(F /(X) = 1, Xn =
i) = lP'(F I Xn = i).
6.

Let Hn = {Xk = Xk for 0:::: k < n, Xn = i}. The required probability may be written as
lP'(XT+m = j, HT)

l:n lP'(XT+m = j, HT, T = n)

lP'(HT)

lP'(HT)

Now lP'(XT+m = j I HT, T = n) = lP'(Xn+m = j I Hn, T = n). Let I be the indicator function of
the event Hn n {T = n }, an event which depends only upon the values of X 1, X2, ... , Xn. Using the
result of Exercise (6.1.5),
lP'(Xn+m = j

I Hn,

T = n) = lP'(Xn+m = j

I Xn

= i) = Pij(m).

Hence
lP'(X

7.

. (m) "' lP'(H T - n)


. I H ) - PlJ
LJn
n,
-

T +m - 1

lP'(HT)

- PJ m

Clearly
lP'(Yn+1 = j

I Yr

= ir for 0 ::S r ::S n) = lP'(Xn+1 = b

I Xr

= ar for 0 ::S r ::S n)

where b = h- 1(j), ar = h- 1Cir ); the claim follows by the Markov property of X.

It is easy to find an example in which h is not one-one, for which X is a Markov chain but Y is
not. The first part of Exercise (6.1.3) describes such a case if So =f::. 0.
8.

Not necessarily! Take as example the chains SandY of Exercise (6.1.3). The sum is Sn

+ Yn

Mn, which is not a Markov chain.


9.

All of them. (a) Using the Markov property of X,


lP'(Xm+r = k

I Xm

= im, , Xm+r-1 = im+r-1) = lP'(Xm+r = k

274

I Xm+r-1

= im+r-1).

Solutions [6.1.10]-[6.2.1]

Classification of states

(b) Let {even}= {X2r = i2r forO:::: r:::: m} and {odd}= {X2r+1 = i2r+1 forO :S r :S m- 1}.
Then,

""''lP'(X2m+2 = k, X2m+1 = i2m+1 even, odd)


lP'(X2m+2 = k I even) = L..J
lP'(even)
= L,lP'(X2m+2 = k, X2m+1 = i2m+1 I X2m = i2m)IP'(even, odd)
lP'(even)
= lP'(X2m+2 = k I X 2m= i2m),
where the sum is taken over all possible values of is for odd s.
(c) With Yn = (Xn, Xn+1),

I
(k, l) I Yn =

IP'(Yn+1 = (k, l) Yo = (io, i1), ... , Yn = Cin, k)) = IP'(Yn+1 = (k, l) Xn+1 = k)
= IP'(Yn+1 =

Cin, k)),

by the Markov property of X.

10. We have by Lemma (6.1.8) that, with 11-Ji) = lP'(Xi = j),

11. (a) Since Sn+ 1 = Sn

+ Xn+ 1, a sum of independent random variables, S is a Markov chain.

(b) We have that

lP'(Yn+1 = k I Yi =Xi

+ Xi-1 for 1 :S i

:S n) = lP'(Yn+1 = k I Xn = Xn)

by the Markov property of X. However, conditioning on Xn is not generally equivalent to conditioning


on Yn = Xn + Xn-1, so Y does not generally constitute a Markov chain.
(c) Zn = nX1 + (n -l)X2 + + Xn, so Zn+1 is the sum of Xn+1 andacertainlinearcombination
of Z1, Z2, ... , Zn, and so cannot be Markovian.
(d) Since Sn+1 = Sn + Xn+1 Zn+1 = Zn + Sn + Xn+1 and Xn+1 is independent of X1, ... , Xn,
this is a Markov chain.

12. With 1 a row vector of 1's, a matrix Pis stochastic (respectively, doubly stochastic, sub-stochastic)
ifP1' = 1 (respectively, 1P = 1, P1' :::: 1, with inequalities interpreted coordinatewise). By recursion,
P satisfies any of these equations if and only if pn satisfies the same equation.

6.2 Solutions. Classification of states


1. Let Ak be the event that the last visit to i, prior to n, took place at time k. Suppose that Xo = i,
so that Ao, A 1, ... , An-1 form a partition of the sample space. It follows, by conditioning on the Ai,
that
n-1
Pi} (n) = L Pii (k)lij (n - k)
k=O
for i of. j. Multiply by sn and sum over n (0:: 1) to obtain Pij (s) = Pii (s )Lij (s) for i of. j. Now
Pij (s) = Fij (s)Pjj (s) if i of. j, so that Fij (s) = Lij (s) whenever Pii (s) = Pjj (s).
As examples of chains for which Pii (s) does not depend on i, consider a simple random walk on
the integers, or a symmetric random walk on a complete graph.
275

[6.2.2]-[6.3.1]

Solutions

Markov chains

2. Let i (# s) be a state of the chain, and define ni = min{n: Pis(n) > 0}. If Xo = i and Xn; = s
then, with probability one, X makes no visit to i during the intervening period [1, n;- 1]; this follows
from the minimality of n;. Now s is absorbing, and hence
lP'(no return to i

I Xo = i)

;::: lP'(Xn;

= s I Xo = i)

> 0.

3. Let h be the indicator function of the event {Xk = i}, so that N = I:~o
visits to i. Then
00

h is the number of

00

JE:(N) = "LE(h) = LPii(k)


k=O

k=O

which diverges if and only if i is persistent. There is another argument which we shall encounter in
some detail when solving Problem (6.15.5).

4. We write lP'iO = lP'( I Xo = i). One way is as follows, another is via the calculation of Problem
(6.15.5). Note that lP';(Yj ;::: 1) = lP';(1j < oo).
(a) We have that

by the strong Markov property (Exercise (6.1.6)) applied at the stopping time Ti. By iteration, lP'i (Vi ;:::
n) = lP'i ( V; ;::: l)n, and allowing n -+ oo gives the result.
(b) Suppose i = j. Form ;::: 1,

by the strong Markov property. Now let m -+ oo, and use the result of (a).
5. Let e = lP'(1j < Ti
before visiting i. Then

Xo

= i) = lP'(Ti

lP'(N:::: 1 I Xo
Likewise, lP'(N:::: k I Xo

< 1j

Xo

= i) = lP'(1j

= i) =eo - e)k-! fork::::

= j), and let N

< T;

I Xo = i)

be the number of visits to j

=e.

1, whence

00

E(N 1 Xo

= i) = "Leo - e)k-l = 1.
k=l

6.3 Solutions. Classification of chains


If r = 1, then state i is absorbing for i ;::: 1; also, 0 is transient unless ao = l.
Assume r < 1 and let J = sup{j : aj > 0}. The states 0, 1, ... , J form an irreducible persistent
class; they are aperiodic if r > 0. All other states are transient. For 0 _:::: i _:::: J, the recurrence time
Ti of i satisfies lP'(Ti = 1) = r. If T; > 1 then Ti may be expressed as the sum of
1.

r/') := time to reach 0, given that the first step is leftwards,


r?) := time spent in excursions from 0 not reaching i,
T-(3)
:=
l

time taken to reach i in final excursion.

276

Solutions

Classification of chains

[6.3.2]-[6.3.3]

It is easy to see that JE:(Tp)) = 1 + (i - 1)/(1 - r) if i 0:: 1, since the waiting time at each
intermediate point has mean (1- r)- 1. The number N of such 'small' excursions has mass function
f'(N = n) = ai(l- ai)n, n :=: 0, where ai = L~i aj; hence JE:(N) = (1- ai)/ai. Each such small
excursion has mean duration
i-1 (

Eo

- 1-

1-r

+1

__!!1_- 1 +

i-1

1-ai-

Eo

1{
= (1 - ai)
ai

i-1
+L

Jaj

(1-ai)(l-r)

and therefore

E(T?))

Jaj

. 1- r
]=0

By a similar argument,

E(T?))

1
=-""'
00

a~

j=i

1 + J - t' ) aj.
1-r

Combining this information, we obtain that

and a similar argument yields JE:(To) = 1 + Lj jaj j(l- r). The apparent simplicity of these formulae
suggests the possibility of an easier derivation; see Exercise (6.4.2). Clearly JE:(Ti) < oo for i ::::: J
whenever Lj jaj < oo, a condition which certainly holds if J < oo.
2. Assume that 0 < p < 1. The mean jump-size is 3p - 1, whence the chain is persistent if and
only if p =
see Theorem (5.10.17).

1;

3. (a) All states are absorbing if p = 0. Assume henceforth that p =f. 0. Diagonalize P to obtain
P = BAB- 1 where

B~ (:

~}

1
0
-1

P"

~ BA'n-

B-1=

0
~ 0 -gp)'
0
l-2p
0

A=

Therefore

(! -\).
I

(I

whence Pij (n) is easily found.


In particular,

and P33(n) = Pll (n) by symmetry.


277

2
0

) B- 1

0 ) .
l-4p

(1- 4p)n

[6.3.4]-[6.3.5]

Markov chains

Solutions

Now Fu(s) = 1- Pu(s)- 1 , where

Pn (s)

p 33 (s)

Pzz(s) =

1
4(1 - s)

+ 2{1 -

1
s(l- 2p)}

1
2(1 - s)

+ 2{1 -

1
.
s(l- 4p)}

+ 4{1 -

1
s(l- 4p)}'

Mter a little work one obtains the mean recurrence times J-ti = F{i (1): t-tl = t-t3 = 4, t-t2 = 2.
(b) The chain has period 2 (if p =f. 0), and all states are non-null and persistent. By symmetry, the
mean recurrence times J-ti are equal. One way of calculating their common value (we shall encounter
an easier way in Section 6.4) is to observe that the sequence of visits to any given state j is a renewal
process (see Example (5.2.15)). Suppose for simplicity that p =f. 0. The times between successive
visits to j must be even, and therefore we work on a new time-scale in which one new unit equals two
old units. Using the renewal theorem (5.2.24), we obtain
2

Pij (2n) -+ - i f Jj- i I is even,


1-tj '

Pij(2n

+ 1)-+

- i f Jj1-tj

il is odd;

note that the mean recurrence time of j in the new time-scale is it-tj. Now "E,j Pij (m) = 1 for all m,
and so, letting m = 2n -+ oo, we find that 4/ t-t = 1 where t-t is a typical mean recurrence time.
There is insufficient space here to calculate Pij (n). One way is to diagonalize the transition
matrix. Another is to write down a family of difference equations of the form PI2 (n) = p P22 (n 1) + (1 - p) P4z(n- 1), and solve them.
4. (a) By symmetry, all states have the same mean-recurrence time. Using the renewal-process
argument of the last solution, the common value equals 8, being the number of vertices of the cube.
Hence t-tv = 8.
Alternatively, lets be a neighbour of v, and lett be a neighbour of s other than v. In the obvious
notation, by symmetry,

- 1 + 4/-tsv.
3
l-tvI

1-ttv = 1 + 21-tsv

I
I
+ 4/-ttv
+ 41-twv,

1-tsv = 1 +
1-twv = 1 +

I
+ 'J.I-ttv.
I
3
41-twv + 4/-ttv,

4/-tsv

a system of equations which may be solved to obtain t-tv = 8.


(b) Using the above equations, t-twv =

'j0 , whence t-tvw = 'j0 by symmetry.

(c) The required number X satisfies lP'(X = n) = en-! (1 - e) 2 for n ?: 1, where e is the probability
that the first return of the walk to its starting point precedes its first visit to the diametrically opposed
vertex. Therefore
00

JE(X)

I>en-1(1- e)2 = 1.
n=I

5.

(a) Let lP'i() = lP'(

I Xo =

i). Since i is persistent,

1 = 1P'i(Vi = oo) = 1P'i(Vj = 0, Vi= oo)


_:::: lP'j(Vj = 0)

+ 1P'i(7} < oo, vi= oo)


+ 1P'i(7} < oo)lP'j(Vi

_:::: 1 -lP'i(7} < oo)

+ lP'i(V;

> 0, Vi= oo)

= oo),

by the strong Markov property. Since i -+ j, we have that lP'j (Vi = oo) ?: 1, which implies IJji = 1.
Also, lP'i (1j < oo) = 1, and hence j -+ i and j is persistent. This implies IJij = 1.
(b) This is an immediate consequence of Exercise (6.2.4b).

278

Classification of chains

Solutions

6. Let lP'i 0 = lP'( I Xo = i). It is trivial that T/j = 1 for j


step and use the Markov property to obtain
TJi =

2.:: PiklP'(TA < oo

x1

= k) =

rf A, condition on the first

For j

E A.

[6.3.6]-[ 6.3.9]

2.:: PjkTJk
k

kES

If x = (Xj : j E S) is any non-negative solution of these equations, then Xj = 1 ::: T/j for j E A. For
j rf A,
Xj

=L
kES

PjkXk = L Pjk
kEA

= IP'j(TA = 1)

+L

1)

+L

PjkXk

krjA

L Pjk{ L Pki
krjA
iEA

We obtain by iteration that, for j

PjkXk = IP'j (TA

krjA

+L

PkiXi} = IP'j(TA ::::: 2)

irjA

+L
krjA

Pjk L PkiXi.
irjA

rf. A,

where the sum is over all k1, k2, ... , kn

rf.

A. We let n --+ oo to find that Xj ::: IP'j(TA < oo) = T/j.

7. The first part follows as in Exercise (6.3.6). Suppose x = (xj : j E S) is a non-negative solution
to the equations. As above, for j rf A,
Xj = 1 + LPjkXk = IP'j(TA::: 1)

+L

= IP'j(TA ::: 1)

Pjk(1

+ LPkiXi)

~A

+ IP'j(TA

::: 2)

+ + IP'j(TA

i~

::: n)

+L

Pjk 1 Pk1k2 Pkk-!knXkn

::: L lP'(TA::: m),


m=i

where the penultimate sum is over all paths of length n that do not visit A. We let n --+ oo to obtain
thatxj :::Ej(TA)=Pj

8. Yes, because the Sr and Tr are stopping times whenever they are finite. Whether or not the exit
times are stopping times depends on their exact definition. The times Ur = min{k > Ur-i : Xu, E
A, Xu,+! rf. A} are not stopping times, but the times Ur + 1 are stopping times.
9. (a) Using the aperiodicity of j, there exist integers TJ, r2, ... , rs having highest common factor
1 and such that Pjj (rk) > 0 for 1 ::::: k ::::: s. There exists a positive integer M such that, if r ::: M, then
r = Z::~=l akrk for some sequence aJ, a2, ... , as of non-negative integers. Now, by the ChapmanKolmogorov equations,
s

Pjj (r) :::

IT Pjj (rk)ak > 0,


k=i

sothatpjj(r) > Oforallr::: M.


Finally, find m such that Pij (m) > 0. Then
Pij(r

+ m)::: Pij(m)Pjj(r)

> 0

ifr::: M.

(b) Since there are only finitely many pairs i, j, the maximum R(P) = max{N(i, j) : i, j E S} is
finite. Now R(P) depends only on the positions of the non-negative entries in the transition matrix P.

279

[6.3.10]-[6.3.10]

Solutions

Markov chains

There are only finitely many subsets of entries of P, and so there exists f(n) such that R(P) ::": f(n)
for all relevant n x n transition matrices P.
(c) Consider the two chains with diagrams in the figure beneath. In the case on the left, we have that
Pll (5) = 0, and in the case on the right, we may apply the postage stamp lemma with a = n and
b=n-1.
3
2

n
10. Let Xn be the number of green balls after n steps. Let ej be the probability that Xn is ever zero
when Xo = j. By conditioning on the first removal,

j+2
j
ej = 2 (j + 1) ej+1 + 2 (j + 1) ej-1

j::: 1,

with eo = 1. Solving recursively gives


q1
q1q2 .. qj-1 }
ej = 1- (1 - e1) { 1 + - + +
,
P1
P1P2Pj-1

(*)

where

j+2
Pj = 2(j + 1)'

It is easy to see that


j-1

"'

q1q2 ... qj-1

f::QP1P2Pj-1

= 2 - -- ~ 2

j+1

as J. ~ oo.

By the result of Exercise (6.3.6), we seek the minimal non-negative solution (ej) to(*), which is
attained when 2(1 - q) = 1, that is, e1 =
Hence

j-1

ej =

1-! L

q1q2 .. qj-1
2 r=O P1P2 Pj-1

+1

For the second part, let dj be the expected time until j - 1 green balls remain, having started with j
green balls and j + 2 red. We condition as above to obtain
j
dj = 1 + 2(j + 1) {dj+1 +dj}.

We set ej = dj - (2j + 1) to find that (j + 2)ej = jej+b whence ej = ij (j + 1)e1. The expected
time to remove all the green balls is
n
n
n
dj =
ej + 2(j - 1)} = n(n + 2) + e1
j (j + 1).

L{

L-!

j=1

j=1

j=1

The minimal non-negative solution is found by setting q = 0, and the conclusion follows by Exercise
(6.3.7).

280

Solutions

Stationary distributions and the limit theorem

[6.4.1]-[6.4.2]

6.4 Solutions. Stationary distributions and the limit theorem


1. Let Yn be the number of new errors introduced at the nth stage, and let G be the common probability
generating function of the Yn. Now Xn+1 = Sn + Yn+1 where Sn has the binomial distribution with
parameters Xn and q (= 1- p). Thus the probability generating function Gn of Xn satisfies
Gn+1(s) = G(s)lE(sSn) = G(s)lE{lE(sSn I Xn)} = G(s)JE{(p + qs)Xn}
= G(s)Gn(P + qs) = G(s)Gn(1- q(1- s)).

Therefore, for s < 1,


Gn(s)

= G(s)G(l- q(l- s))Gn-2(1- q 2 (l- s)) =


n-1
= Go(1- qn(l- s))

oo

II G(1- qr(l- s))---+ II G(1- qr(l- s))


r=O

as n ---+ oo, assuming q < 1. This infinite product is therefore the probability generating function of
the stationary distribution whenever this exists. If G(s) = el.(s-1), then

IT

G(1 - qr (1- s)) = exp{ A.(s- 1)

r=O

qr} = el.(s- 1)/P,

r=O

so that the stationary distribution is Poisson with parameter A./ p.


2. (6.3.1): Assume for simplicity that sup{j : aj > 0} = oo. The chain is irreducible if r < 1.
Look for a stationary distribution :rr with probability generating function G. We have that

:rro = ao:rro + (1 - r):rr1,

:rri = aino + rni + (1 - r)ni+1 fori ::: 1.

Hence
sG(s) = nosA(s) + rs(G(s)- no)+ (1- r)(G(s)- no)
where A (s) =

2:,]:_0 aj s j, and therefore


G(s) =no (

Taking the limit ass

sA(s)- (1- r + sr))


.
(1 - r)(s- 1)

1, we obtain by L'H6pital's rule that

A'(1)+1-r)
G(1) =no (
.
1-r
There exists a stationary distribution if and only if r < 1 and A' (1) < oo, in which case
sA(s)- (1 - r + sr)
G(s)- - - - - - - - (s- 1)(A'(l) + 1- r)
Hence the chain is non-null persistent if and only if r < 1 and A' (1) < oo. The mean recurrence time
j;.,i is found by expanding G and setting f-Li = 1/ni.
(6.3.2): Assume that 0 < p < 1, and suppose first that p =f. ~- Look for a solution {Yj : j =f. 0} of
the equations
Yi =

L PijYj.
j-10

281

i =f. 0,

[6.4.3]-[6.4.3]

Solutions

Markov chains

as in (6.4.10). Away from the origin, this equation is Yi = qyi-1 + PYi+2 where p + q = 1, with
auxiliary equation pe 3 - e + q = 0. Now pe 3 - e + q = p(e - 1)(8 - a)(8 - /3) where
a=

-p- y'p2 +4pq


< -1 ,
2p

Note that 0 < f3 < 1 if p > ~,while f3 > 1 if p < ~


For p > ~,set
. = { A+ Bf3i
Yz
C + Dai

if i :::: 1,
if i :::; -1,

the constants A, B, C, D being chosen in such a manner as to 'patch over' the omission of 0 in the
equations (*):
Y-2 = qy-3,

Y-1 = qy_2 + PY1

Y1 = PY3

The result is a bounded non-zero solution {Yj} to (*), and it follows that the chain is transient.
For p < ~, follow the same route with
if i :::: 1,
ifi:::;-1,
the constants being chosen such that Y-2 = qy_3, Y-1 = qY-2
Finally suppose that p = ~,so that a= -2 and f3 = 1. The general solution to(*) is
Yi = { A+ Bi + Cai.
D+ Ei +Fa 1

if i :::: 1,
ifi::::: -1,

subject to (** ). Any bounded solution has B = E = C = 0, and (**) implies that A = D = F = 0.
Therefore the only bounded solution to(*) is the zero solution, whence the chain is persistent. The
equation x = xP is satisfied by the vector x of 1's; by an appeal to (6.4.6), the walk is null.

(!, 1, !)

(6.3.3): (a) Solve the equation 1r = 1rP to find a stationary distribution 1r =


when p =I= 0.
Hence the chain is non-null and persistent, with f.L1 = rr1 1 = 4, and similarly f.L2 = 2, f.L3 = 4.
. a stationary
.
d"tstn"but10n,
.
. "1 ar1y, 1r = ( 41:, 41:. 41 , 41) ts
(b) Snru
an d J.Li = ni-1 = 4 .
(6.3.4): (a) The stationary distribution may be found to be Jri = ~ for all i, so that J.Lv = 8.
3.

The quantities X 1, X2, ... , Xn depend only on the initial contents of the reservoir and the rainfalls
Yo, Y1, ... , Yn-1 The contents on day n + 1 depend only on the value Xn of the previous contents
and the rainfall Yn. Since Yn is independent of all earlier rainfalls, the process X is a Markov chain.
Its state space isS= {0, 1, 2, ... , K- 1} and it has transition matrix

[go+
lP'=

go
0

Kl

g2
g1
go

g3
g2
g1

gK-1
gK-2
gK-3

GK-1
Gx
GK-2

go

G1

where gi = lP'(Y1 = i) and Gi = 2:-'f:=i gj. The equation 1r = 1rP is as follows:


no= no(go + g1) + rr1go,
1rr = nogr+1 + n1gr + + 1rr+1go,

1rK-1 = noGK +rr1GK-1 + +rrK-1G1.

282

0 < r < K- 1,

Solutions

Stationary distributions and the limit theorem

[6.4.4]-[6.4.5]

The final equation is a consequence of the previous ones, since L;~(/ Jri = 1. Suppose then that
v = ( v1, v2, ... ) is an infinite vector satisfying

Multiply through the equation for Vr by sr+ 1, and sum over r to find (after a little work) that
00

N(s) =

00

G(s) = Lgisi

VjSi,

i=O

satisfy sN(s) = N(s)G(s)

i=O

+ vogo(s- 1), and hence


1
go(s - 1)
-N(s) =
.
vo
s- G(s)

The probability generating function of the Jri is therefore a constant multiplied by the coefficients of
s0 , s 1, ... , sK - 1 in go(s -1)/(s- G(s)), the constant being chosen in such a way that L;~0 1 rri = 1.

When G(s) = p(1 - qs)- 1, then go= p and


go(s- 1)

p(l - qs)

s-G(s)

p-qs

-=-=--'------- =

= p

q
+ _ ______;:...__
1-(qsjp)

The coefficient of s0 is 1, and of si is qi+ 1j pi if i ::: 1. The stationary distribution is therefore given
byrri =qno(qfpi fori::: 1,where

+ 1) if p =

if p

4.

The transition matrices

q, and no= 2/(K

q =

P,~ (~

1
2
1
2

0
0

0
0
1
2
1
2

i i p, i (1 -

have respective stationary distributions n 1 = (p, 1 - p) and n 2 = ( p,


for any 0 ~ p ~ 1.

p),

i(1 -

P))

5. (a) Set i = 1, and find an increasing sequence n1 (1), n1 (2), ... along which x1 (n) converges.
Now set i = 2, and find a subsequence of (n1(j) : j ::: 1) along which x2(n) converges; denote
this subsequence by n2(1), n2(2), .... Continue inductively to obtain, for each i, a sequence Di =
(ni (j) : j ::: 1), noting that:
(i) Di+1 is a subsequence ofni, and
(ii) limr-+ooXi(ni(r)) exists for all i.
Finally, define m k = n k ( k). For each i ::: 1, the sequence m i, m i+1, . . . is a subsequence of Di , and
therefore limr-+oo Xi (mr) exists.
(b) Let S be the state space of the irreducible Markov chain X. There are countably many pairs i, j
of states, and part (a) may be applied to show that there exists a sequence (nr : r ::: 1) and a family
(aij : i, j E S), not all zero, such that Pij (nr) --+ Ciij as r --+ oo.

283

[6.4.6]-[6.4.10]

Solutions

Markov chains

Now X is persistent, since otherwise Pi} (n) ---+ 0 for all i, j. The coupling argument in the proof
of the ergodic theorem (6.4.17) is valid, so that PaJ (n) - Pbj (n) ---+ 0 as n ---+ oo, implying that
aaj = abj for all a, b, j.

6.

Just check that 1r satisfies 1r = 1rP and L:v Irv = 1.

7.

Let Xn be the Markov chain which takes the valuer if the walk is at any of theY nodes at level
r. Then Xn executes a simple random walk with retaining barrier having p = 1 - q = ~,and it is
thus transient by Example (6.4.15).
8.

Assume that Xn includes particles present just after the entry of the fresh batch Yn. We may write
Xn

Xn+l =

L Bi,n + Yn
i=l

where the Bi,n are independent Bernoulli variables with parameter 1 - p. Therefore X is a Markov
chain. It follows also that

In equilibrium, Gn+l = Gn = G, where G(s) = G(p + qs)el.(s-l). There is a unique stationary


distribution, and it is easy to see that G(s) = el.(s-l)fp must therefore be the solution. The answer is
the Poisson distribution with parameter 'A/ p.
9. The Markov chain X has a uniform transition distribution P}k = 1/(j + 2), 0 ::::: k ::::: j + 1.
Therefore,
E(Xn) =lE(E(Xn

I Xn-d) = ~(1 +E(Xn-1)) =

= 1- (~)n + (~)nXo.
The equilibrium probability generating function satisfies

whence

d
ds {0-s)G(s)} = -sG(s),

subject to G(1) = 1. The solution is G(s) = es-l, which is the probability generating function of
the Poisson distribution with parameter 1.

10. This is the claim of Theorem (6.4.13). Without loss of generality we may takes = 0 and the Yj to
be non-negative (since if the Yj solve the equations, then so do Yj + c for any constant c). LetT be the
matrix obtained from P by deleting the row and column labelled 0, and write Tn = (tij (n) : i, j # 0).
Then Tn includes all the n-step probabilities of paths that never visit zero.
We claim first that, for all i, j it is the case that tij (n) ---+ 0 as n ---+ oo. The quantity tij (n) may
be thought of as the n-step transition probability from i to j in an altered chain in which s has been
made absorbing. Since the original chain is assumed irreducible, all states communicate with s, and
therefore all states other than s are transient in the altered chain, implying by the summability of fij (n)
(Corollary (6.2.4)) that tij (n) ---+ 0 as required.
Iterating the inequality y :=: Ty yields y :=: Tny, which is to say that
00

00

Yi 0:: Lfij(n)yj 0:: min{Yr+s} L


tij(n),
. I
s>
I
.
I
J=
J=r+

284

i ::: 1.

Solutions [6.4.11]-[6.4.12]

Stationary distributions and the limit theorem


Let An= {Xk

0 fork::::: n). Fori

1,
00

lP'(Aoo

I Xo = i) = n-+oo
lim lP'(An I Xo =

i)

= '"'tij
(n)
L
j=1

+ .

r
::::: lim { Ltij(n)

n-+oo

} .
Yi
mms:::dYr+s}

j= 1

Let E > 0, and pick R such that

Yi
---'--'--< E.
mins:::dYR+sl
Taker = Rand let n --+ oo, implying that lP'(A 00
by irreducibility that all states are persistent.

I Xo =

11. By Exercise (6.4.6), the stationary distribution is 1TA

i)

= 0.

It follows that 0 is persistent, and

= 1TB = rrn = 1TE =

j,, nc = 1

(a) By Theorem (6.4.3), the answer is f-LA = 1/rrA = 6.


(b) By the argument around Lemma (6.4.5), the answer is Pn(A)

= nnt-tA = nnfnA =

1.

(c) Using the same argument, the answer is pc(A) = ncfnA = 2.


(d)LetlP'iO = lP'( I Xo = i), let 1j be the time of the first passage to state j, and let Vi
By conditioning on the first step,

"th

Wl

1 .

= IP'i(TA

< TE).

5
3
1
1
8,
VB = 4> VC = 2, VD = 4
A typical conditional transition probability Tij = IP'i (X 1 = j I TA < TE) is calculated as follows:
SO UtlOn VA

and similarly,

We now compute the conditional expectations Iii = lEi (TA I TA < TE) by conditioning on the first
step of the conditioned process. This yields equations of the form iiA = 1 + ~JiB + ~ Jic, whose
solution gives ji A =

1:.

(e) Either use the stationary distribution of the conditional transition matrix T, or condition on the first
step as follows. With N the number of visits to D, and Tli = lEi (N I TA < TE), we obtain
TIA =

~TIB +~Tic,

whence in particular Tl A =

TIB =

0+

~TIC,

TIC=

0+

~TIB + ~(1 +TID),

TID =TIC,

1~.

12. By Exercise (6.4.6), the stationary distribution has n A


around Lemma (6.4.5), the answer is PB(A) = 1TBf-LA = 1TB/1TA

285

-h, 1TB
= 2.

.t.

Using the argument

[6.5.1]-[6.5.6]

Solutions

Markov chains

6.5 Solutions. Reversibility


1. Look for a solution to the equations JriPij = lrj Pji. The only non-trivial cases of interest are
those with j = i + 1, and therefore Ailri = fLi+llri+l for 0::::: i < b, with solution

0::::: i::::: b,

/Ll/L2 ... /Li

an empty product being interpreted as 1. The constant no is chosen in order that the ni sum to 1, and
the chain is therefore time-reversible.

2. Letn be the stationary distribution of X, and suppose X is reversible. We have thatniPij = Pjilrj
for all i, j, and furthermore n i > 0 for all i. Hence

as required when n = 3. A similar calculation is valid when n > 3.


Suppose conversely that the given display holds for all finite sequences of states. Sum over all
values of the subsequence h, ... , jn-1 to deduce that Pij (n- 1)Pji = Pij Pji(n- 1), where i = h,
j = jn. Take the limit as n ~ oo to obtain lrj Pji = Pij Jri as required for time-reversibility.

3.

With n the stationary distribution of X, look for a stationary distribution v of Y of the form
ififjC,

ifi

C.

There are four cases.


(a) i E C, j fj C: Viqjj = cnif3Pij = cf3njPji = Vjqji,

(b) i, j

C: Vjqjj = ClriPij = ClrjPji = VNji,

(c) i fj C, j E C: Viqjj = cf3JriPij = cf3nj Pji = Vjqji,


(d) i, j fj C: viqij = cf3niPij = cf3JrjPji = Vjqji
Hence the modified chain is reversible in equilibrium with stationary distribution v, when

In the limit as f3

+0, the chain Y never leaves the set C once it has arrived in it.

4.

Only if the period is 2, because of the detailed balance equations.

5.

With Yn = Xn - im,

Now iterate.

6. (a) The distribution n1 = f3/(a + /3), n2 = aj(a + /3) satisfies the detailed balance equations,
so this chain is reversible.
(b) By symmetry, the stationary distribution is 1r = (~, ~,~),which satisfies the detailed balance
equations if and only if p =

i.

(c) This chain is reversible if and only if p =

i.
286

Solutions

Chains with .finitely many states

[6.5.7]-[6.6.2]

7. A simple random walk which moves rightwards with probability p has a stationary measure
TCn = A(p/q)n, in the sense that :n: is a vector satisfying :n: = :n:P. It is not necessarily the case that
this :n: has finite sum. It may then be checked that the recipe given in the solution to Exercise (6.5.3)

yields n(i, j) = pfp4 L:(r,s)EC PIp~ as stationary distribution for the given process, where Cis the
relevant region of the plane, and Pi = pi/ qi and Pi (= 1 - qi) is the chance that the i th walk moves
rightwards on any given step.

8.

Since the chain is irreducible with a finite state space, we have that TCi > 0 for all i. Assume the
chain is reversible. The balance equations TCi Pij = TCj Pji give Pij = TCj Pji/TCi. Let D be the matrix
with entries 1/ni on the diagonal, and S the matrix (nj Pji ), and check that P = DS.

Conversely, ifP = DS, then di- 1 Pij = dj- 1 Pji whence TCi = di- 1 L:k di: 1 satisfies the detailed
balance equations.
Note that
Pij =

~ [!; Pij y'Jfj.

If the chain is reversible in equilibrium, the matrix M = ( y'ni/nj Pij) is symmetric, and therefore M,
and, by the above, P, has real eigenvalues. An example of the failure of the converse is the transition
matrix

P=(i ~
1

!).
0

which has real eigenvalues 1, and-! (twice), and stationary distribution :n: = (~,~,b). However,
n1 Pl3 = 0

9.

=f::.

b= n3 P31, so that such a chain is not reversible.

Simply check the detailed balance equations TCi Pij = TCj Pji.

6.6 Solutions. Chains with finitely many states


1. Let P = (Pij : 1 ::::= i, j ::::= n) be a stochastic matrix and let C be the subset of JR.n containing
all vectors x = (x1, x2, ... , Xn) satisfying Xi 0:: 0 for all i and 2::?= 1 Xi = 1; for x E C, let
llxll = maxj{Xj }. Define the linear mapping T : C ~ JR.n by T(x) = xP. Let us check that Tis a
continuous function from C into C. First,
IIT(x)ll

= mr{ "';XiPij}

::::: allxll

where

hence IIT(x)- T(y)ll

::::=

allx- Yll. Secondly, T(x)j 0:: Oforall j, and


L T(x)j = LLXiPij = LXi LPij = 1.
j
j
i
j

Applying the given theorem, there exists a point :n: in C such that T (n) = n, which is to say that

n = nP.
2.

Let P be a stochastic m x m matrix and letT be them x (m


Pi'- Oi'J
t{ 1J
lj -

287

if j

::::=

+ 1) matrix with (i, j)th entry

m,

if j = m + 1,

[6.6.3]-[6.6.4]

Solutions

Markov chains

where Dij is the Kronecker delta. Let v = (0, 0, ... , 0, 1)


is valid, there exists y = (YI, Y2 ... , Ym+l) such that

JRm+l. If statement (ii) of the question

"'<p
< i _< m,
L
!J - 8!J)y
J + ym +I -> 0 for 1 _

Ym+l < 0,

j=l
this implies that
m

for all i,

L Pij Yj :0:: Yi - Ym+l > Yi


j=l

and hence the impossibility that 'L/}=I Pij Yj > maxi {Yi}. It follows that statement (i) holds, which
is to say that there exists a non-negative vector x = (xl, x2, ... , Xm) such that x(P - I) = 0 and
I:t=l Xi = 1; such an xis the required eigenvector.

3. Thinking of Xn+ 1 as the amount you may be sure of winning, you seek a betting scheme x such
that Xn+ 1 is maximized subject to the inequalities
n

for 1 :S j :Sm.

Xn+l :S LXitij
i=l

Writing aij = -tij for 1 :S i :S nand an+l,j = 1, we obtain the linear program:
maximize

Xn+l

subject to

n+l
L Xiaij :S 0
i=l

for 1 :S j :Sm.

The dua11inear program is:


m

minimize

subject to

L aij Yj = 0
j=l

for 1 :S i :S n,

Lan+l,jYj = 1,
j=l

Yj :0::0

for 1 :S j :Sm.

Re-expressing the aij in terms of the tij as above, the dual program takes the form:
m

minimize

subject to

L tij Pj
j=l

=0

for 1 :S i :S n,

L Pj = 1,
j=l

Pj :::: 0

for 1 :S j :Sm.

The vector x = 0 is a feasible solution of the primal program. The dual program has a feasible
solution if and only if statement (a) holds. Therefore, if (a) holds, the dual program has minimal value
0, whence by the duality theorem of linear programming, the maximal value of the primal program is
0, in contradiction of statement (b). On the other hand, if (a) does not hold, the dual has no feasible
solution, and therefore the primal program has no optimal solution. That is, the objective function of
the primal is unbounded, and therefore (b) holds. [This was proved by De Finetti in 1937.]
4. Use induction, the claim being evidently true when n = 1. Suppose it is true for n = m. Certainly
pm+l is of the correct form, and the equation pm+ 1x' = P(Pmx') with x = (1, w, w2 ) yields in its
first row

288

Solutions

Branching processes revisited

[6.6.5]-[6.7.2]

as required.
5. The first part follows from the fact that Jr1 1 = 1 if and only if 1rU = 1. The second part follows
from the fact that ni > 0 for all i if P is finite and irreducible, since this implies the invertibility of
1-P+ U.

6. The chessboard corresponds to a graph with 8 x 8 = 64 vertices, pairs of which are connected
by edges when the corresponding move is legitimate for the piece in question. By Exercises (6.4.6),
(6.5.9), we need only check that the graph is connected, and to calculate the degree of a comer vertex.
(a) For the king there are 4 vertices of degree 3, 24 of degree 5, 36 of degree 8. Hence, the number of
edges is 210 and the degree of a comer is 3. Therefore tt(king) = 420/3 = 140.
(b) tt(queen) = (28 x 21 + 20 x 23 + 12 x 25 + 4 x 27)/21 = 208/3.
(c) We restrict ourselves to the set of 32 vertices accessible from a given comer. Then tt(bishop) =
(14x7+10x9+6x 11+2x 13)/7=40.
(d) tt(knight) = (4 X 2 + 8 X 3 + 20 X 4 + 16 X 6 + 16 X 8)/2 = 168.
(e) tt(rook) = 64 x 14/14 = 64.
7. They are walking on a product space of 8 x 16 vertices. Of these, 6 x 16 have degree 6 x 3 and
16 x 2 have degree 6 x 5. Hence
tt(C) = (6

16

3 + 16

5)/18 = 448/3.

IP- All =(A.- l)(A. + !)(A.+ ~). Tedious computation yields the eigenvectors, and thus

8.

0 0)

-4
-1

-3

-1
1

6. 7 Solutions. Branching processes revisited


1. We have (using Theorem (5.4.3), or the fact that Gn+1(s) = G(Gn(s))) that the probability
generating function of Zn is
n- (n- l)s
Gn(s) =
,
n + 1- ns
so that
n )k-1
nk-1
( n )k+1
IP(Zn = k) = n + 1
n+ 1
n+ 1
- (n + l)k+ 1

(n -1) (

fork :=::: 1. Therefore, for y > 0, as n --+ oo,

1
llD(Z <2yn1Z >0)j[

2.

n-

l2ynJ

- 1 - Gn(O)

"""'
t:J

nk-1

(n + l)k+1

=1-

1 ) - L2ynJ
1+-n
-+l-e- 2Y.

Using the independence of different lines of descent,


JE(s

~ skiP(Zn = k, extinction)

I extmct10n) = L

k=O

IP( extinction)

where Gn is the probability generating function of Zn.

289

~ skiP(Zn = k)r/
1
= L
= -Gn(sry),
k=O

TJ

TJ

[6.7.3]-[6.8.1]
3.

Markov chains

Solutions

We have that 17 = G(17). In this case G(s) = q(l - ps)- 1, and therefore 17 = q I p. Hence

P q{pn _ qn _ p(sqlp)(pn-1 _ qn-1)}


1
- G n (s 17) = - ..::.-.::---,-,..-::---,-7--=..:._::_-"----=---'17
q
pn+1 _ qn+1 _ p(sqlp)(pn _ qn)
p{qn _ Pn _ qs(qn-1 _ Pn-1)}
qn+1 _ pn+1 _ qs(qn _ pn) '
which is Gn (s) with p and q interchanged.

4.

(a) Using the fact that var(X I X > 0) ::=: 0,


JE(X 2 ) = JE(X 2

IX

> O)JP(X > 0) :::: lE(X

IX

> 0) 2 JP(X > 0) = lE(X)lE(X

IX

> 0).

(b) Hence

lE(Z~)

lE(Znl t-t I Zn > 0) <


= lE(Wn)
- t-tnlE(Zn)
where Wn = ZnllE(Zn). By an easy calculation (see Lemma (5.4.2)),

where 0" 2 = var(Z 1 ) = plq 2 .


(c) Doing the calculation exactly,
lE Z
n
( nlt-t

Z > 0 - lE(Znlt-tn) 1
) - IP'(Zn > 0)- 1- Gn(O)--+ 1-11

I n

where 17 =II"(ultimate extinction) = q I p.

6.8 Solutions. Birth processes and the Poisson process


1. Let F and W be the incoming Poisson processes, and let N(t) = F(t)+ W(t). Certainly N(O) = 0
and N is non-decreasing. Arrivals of flies during [0, s] are independent of arrivals during (s, t ], if
s < t; similarly for wasps. Therefore the aggregated arrival process during [0, s] is independent of
the aggregated process during (s, t]. Now
IP'(N(t +h)=

n +II N(t)

n)

= JP(A Ll B)

where

A = {one fly arrives during (t, t + h l},

B = {one wasp arrives during (t, t + h l}.

We have that
JP(A Ll B) = JP(A) + JP(B) - II"( A

n B)

= A.h + t-th- (A.h)(t-th) + o(h) = (A.+ t-t)h + o(h).


Finally

IP'(N(t +h)> n +II N(t) =

n)::::: IP'(A n B)+ JP(C u D),

290

Solutions, [6.8.2]-[6.8.4]

Birth processes and the Poisson process

where C ={two or more flies arrive in (t, t + h]} and D ={two or more wasps arrive in (t, t + h]}.
This probability is no greater than (A.h)(p,h) + o(h) = o(h).
2. Let I be the incoming Poisson process, and let G be the process of arrivals of green insects.
Matters of independence are dealt with as above. Finally,
IP'(G(t +h)= n +II G(t) =

n)

n)

= p!P'(I(t +h)= n +II I(t) =

+ o(h) = pA.h + o(h),

11"( G(t +h) > n +II G(t) = n) ::=:II" (I (t +h) > n +II I (t) = n) = o(h).
3.

Conditioning on T1 and using the time-homogeneity of the process,

1P'(E(t)>xiT1=u)=

IP'(E(t-u)>x)

if u :::: t,

ift<u=St+x,
ifu>t+x,

(draw a diagram to help you see this). Therefore


IP'(E(t) > x) = fooo IP'(E(t) >xI T1 = u)A.e-Audu
= t!P'(E(t-u) >x)A.e-Audu+1

Jo

00

A.e-Audu.

t+x

You may solve the integral equation using Laplace transforms. Alternately you may guess the
answer and then check that it works. The answer is IP'(E(t) ::=: x) = I - e-h, the exponential
distribution. Actually this answer is obvious since E(t) > x if and only if there is no arrival in
[t, t + x ], an event having probability e -Ax.
4.

The forward equation is


i :::: j,

with boundary conditions Pi} (0) = Oij, the Kronecker delta. We write Gi (s, t) = "E-1 si Pi} (t), the
probability generating function of B(t) conditional on B(O) = i. Multiply through the differential
equation by si and sum over j:

a partial differential equation with boundary condition Gi (s, 0) = si. This may be solved in the usual
way to obtain Gi(s, t) = g(eAt(l- s- 1)) for some function g. Using the boundary condition, we
find that g(I- s- 1) = si and so g(u) = (1 - u)-i, yielding
I
(se-At)i
Gi(s, t) = {I- eAI(I- s-1 )}i = {I- s(l- e-At)}i.

The coefficient of si is, by the binomial series,


j

as required.

291

?:_

i,

[6.8.5]-[ 6.8.6]

Solutions

Markov chains

Alternatively use induction. Set j =ito obtain pji(t) = -Aipu(t) (remember Pi,i-1 (t) = 0),
and therefore Pi i (t) = e- Ai t. Rewrite the differential equation as
d ( Pij(t)e A1't) = A(j -l)Pi,j-t(t)e A1't .
dt

Set j = i + 1 and solve to obtain Pi,i+l (t) = ie-Ait ( 1- e-At). Hence(*) holds, by induction.
The mean is
E(B(t)) = !_GJ(S,

as

t)l

=/eM,
s=l

by an easy calculation. Similarly var(B(t)) = A + E(B(t)) - E(B(t)) 2 where


A= -a22 G1(s,t)

as

s=l

Alternatively, note that B(t) has the negative binomial distribution with parameters e-At and I.

5.

The forward equations are


P~(t) = An-lPn-t(t)- AnPn(t),

n 2: 0,

where Ai = i A+ v. The process is honest, and therefore m(t) = L::n npn(t) satisfies
00

00

m'(t) = Ln[(n -l)A

+ v]Pn-t(f)-

n=l

Ln(nA + v)pn(t)
n=O

00

= L

{ A[(n + l)n- n 2 ] + v[(n + 1)- nl} Pn(t)

n=O
00

L(An
n=O

+ v)pn(t)

= Am(t)

+ v.

Solve subject to m(O) = 0 to obtain m(t) = v(eJ..t- 1)/A.

6. Using the fact that the time to the nth arrival is the sum of exponential interarrival times (or using
equation (6.8.15)), we have that

is given by

which may be expressed, using partial fractions, as

where
n

az-

A.

IT-1A-A

j=O 1

Hi

292

Solutions

Continuous-time Markov chains

[6.8.7]-[6.9.1]

so long as Ai =!= A.J whenever i =!= j. The Laplace transform Pn may now be inverted as

See also Exercise (4.8.4).

7. Let Tn be the time of the nth arrival, and letT=


Exercise (6.8.6),
AnPn(())

limn~oo

Tn = sup{t: N(t) < oo}. Now, as in

.
=ITn __). z= lE(e-eTn)

+ ()
Xo +X 1 + + Xn where Xk is the (k + l)th interarrival time, a random variable which is
i=O Ai

since Tn =
exponentially distributed with parameter Ak Using the continuity theorem, lE(e-eTn) -+ 1E(e-eT) as
n-+ oo, whence AnPn(()) -+ 1E(e-eT) as n-+ oo, which may be inverted to obtain AnPn(t) -+ f(t)
as n-+ oo where f is the density function ofT. Now

lE(N(t) N(t) < oo) = L:=;onPn(t)


L:=n=O Pn (t)
which converges or diverges according to whether or not L:=n npn (t) converges. However Pn (t) ~
A.; 1 f (t) as n -+ oo, so that L:=n npn (t) < oo if and only if L:=n nA.; 1 < oo.
When A.n = (n

+ ~) 2 , we have that
lE(e-eT) =

IT
00

n=O

1+

() 1 2
(n + 2)

}-1

= sech

(nv'e).

Inverting the Laplace transform (or consulting a table of such transforms) we find that

where 81 is the first Jacobi theta function.

6.9 Solutions. Continuous-time Markov chains


1.

(a) We have that

where P12 = 1 - Plio P21 = 1 - P22 Solve these subject to Pi} (t) = 8ij, the Kronecker delta, to
obtain that the matrix P1 = (Pi} (t)) is given by

(b) There are many ways of calculating Gn; let us use generating functions. Note first that G0 =I,
the identity matrix. Write

n ::=: 0,

293

[6.9.2]-[6.9.4]

Solutions

Markov chnins

and use the equation Gn+ 1 = G Gn to find that

Hence an+1 = -(pfA.)cn+1 for n 2: 0, and the first difference equation becomes an+1 = -(A.+ p,)an,
n 2: 1, which, subject to a1 = -p,, has solution an = (-l)np,(A. + p,)n- 1, n 2: 1. Therefore
Cn = (-l)n+ 1A.(A. + p,)n- 1 for n 2: 1, and one may see similarly that bn =-an, dn = -en for
n 2: 1. Using the facts that ao =do = 1 and bo = co = 0, we deduce that L:~0 (tn jn!)Gn = Pt
where P 1 is given in part (a).
(c) With1r = (n1, n2), we have that -p,n1 + A.n2 = 0 and p,n1- A.n2 = 0, whence n1 = (A.jp,)n2.
In addition, n1 + n2 = 1 if n1 = A./(A. + p,) = 1- n2.
2.

(a) The required probability is


lP'(X(t) = 2, X(3t) = 1 I X(O) =
1P'(X(3t) = 1 I X(O) = 1)

1)

p!2(t)P21 (2t)

Pll (3t)

using the Markov property and the homogeneity of the process.


(b) Likewise, the required probability is
p!2(t)p21 (2t)pu (t)

Pu (3t)pu (t)

the same as in part (a).

3. The interarrival times and runtimes are independent and exponentially distributed. It is the lackof-memory property which guarantees that X has the Markov property.
The state space isS= {0, 1, 2, ... } and the generator is

G~

c
~

A.
-(A.+ p,)
p,

0
A.
-(A.+ p,)

0
0
A.

...

Solutions of the equation 1rG = 0 satisfy


-A.no + p,n1 = 0,

A.nj-1 -(A.+ p,)nj + P,7rj+1 = 0 for j =:: 1,

with solutionni = no(A.jp,)i. We have in addition that:Li ni = 1 ifA < p,andno = {1- (A./p,)}- 1.
4.

One may use the strong Markov property. Alternatively, by the Markov property,
lP'(Yn+1 = j I Yn = i, Tn = t, B)= lP'(Yn+1 = j I Yn = i, Tn = t)

for any event B defined in terms of {X (s) : s :S Tn}. Hence


lP'(Yn+1 = j

I Yn =

i, B)= fooo lP'(Yn+1 = j

I Yn =

i, Tn = t)fTn(t)dt

= lP'(Yn+1 = j I Yn = i),
so that Y is a Markov chain. Now qij = lP'(Yn+1 = j I Yn = i) is given by

% = { 00 Pij(t)A.e-M dt,

lo

294

Solutions

Continuous-time Markov chains

[6.9.5]-[6.9.8]

by conditioning on the (n + 1)th interarrival time of N; here, as usual, Pij (t) is a transition probability
of X. Now

The jump chain Z = {Zn : n ::=: 0} has transition probabilities hij = 8i} / 8i, i -=f. j. The chance
that Z ever reaches A from j is also IJj, and 11} = Lk hjkiJk for j </:. A, by Exercise (6.3.6). Hence
-gjiJj = Lk 8jk11k> as required.

5.

6. Let T1 = inf {t : X (t) -=f. X (0)}, and more generally let Tm be the time of the mth change in value
of X. For j :. A,
/J-j = 1Ej(T1)

+ Lhjk/J-k>
kf-j

where Ej denotes expectation conditional on Xo = j. Now Ej (Tl) = g1- 1 , and the given equations
follow. Suppose next that (ak : k E S) is another non-negative solution of these equations. With
Ui = Ti+1 - Ti and R = min{n 2: 1 : Zn E A}, we have for j </:.A that

where

is a sum of non-negative terms. It follows that


a}

2: Ej(Uo)

= Ej (

+ 1Ej(U1/{R>l)) + + Ej(Unl(R>nJ)
Url(R>r})

= Ej (min{Tn, HA})

-+ Ej(HA)

r=O

as n -+ oo, by monotone convergence. Therefore, f.L is the minimal non-negative solution.

7. First note that i is persistent if and only if it is also a persistent state in the jump chain Z. The
integrand being positive, we can write

where {Tn : n ::=: 1} are the times of the jumps of X. The right side equals

00

00

L JE(T1 I X (0) = i)hu (n) = -:- L hu (n)


n=O

gz n=O

where H = (hij) is the transition matrix of Z. The sum diverges if and only if i is persistent for Z.

8. Since the imbedded jump walk is persistent, so is X. The probability of visiting m during an
excursion is a= (2m)- 1, since such a visit requires an initial step to the right, followed by a visit to
m before 0, cf. Example (3.9.6). Having arrived at m, the chance of returning tom before visiting 0
is 1 -a, by the same argument with 0 and m interchanged. In this way one sees that the number N
of visits tom during an excursion from 0 has distribution given by IP'(N ::=: k) = a(l - a)k- 1, k ::=: 1.
The 'total jump rate' from any state is A., whence T may be expressed as :L~o Vi where the Vi are
exponential with parameter A.. Therefore,
JE(e eT )=GN ( -A.- )
).-()

a).
=(1-a)+a--.
a).-()

295

[6.9.9]-[6.9.11]

Solutions

Markov chains

The distribution ofT is a mixture of an atom at 0 and the exponential distribution with parameter a)..

9. The number N of sojourns in i has a geometric distribution IP'(N = k) = fk-l(l- f), k =:: 1,
for some f < 1. The length of each of these sojourns has the exponential distribution with some
parameter 8i. By the independence of these lengths, the total time T in state i has moment generating
function
gi(l-f)
8i0- f ) -

The distribution ofT is exponential with parameter 8i (1 - f).

10. The jump chain is the simple random walk with probabilities ).j (). + JJ-) and JJ-/ (). + JJ-), and with
POl = 1. By Corollary (5.3.6), the chance of ever hitting 0 having started at 1 is JJ-/A, whence the
probability of returning to 0 having started there is f = JJ-/A. By the result of Exercise (6.9.9),

as required. Having started at 0, the walk visits the state r =:: 1 with probability 1. The probability of
returning to r having started there is

and each sojourn is exponentially distributed with parameter 8r = ). + JJ-. Now g 7 (1 - fr) = ). - JJ-,
whence, as above,
JE(eevr) =
). - /JA-JJ--e
The probability of ever reaching 0 from X (0) is (JJ-/A)x (O), and the time spent there subsequently
is exponential with parameter ). - JJ-. Therefore, the mean total time spent at 0 is

11. (a) The imbedded chain has transition probabilities

where 8k = -gkk Therefore, for any state j,

where we have used the fact thatJrG = 0. Also irk =:: 0 and L:k irk = 1, and thereforeJi is a stationary
distribution of Y.
Clearly fik = nk for all k if and only if 8k = l::i ni 8i for all k, which is to say that 8i = 8k for
all pairs i, k. This requires that the 'holding times' have the same distribution.
(b) Let Tn be the time of the nth change of value of X, with To= 0, and let Un = Tn+l - Tn. Fix a
state k, and let H = min{n =:: 1 : Zn = k}. Let Yi (k) be the mean time spent in state i between two
consecutive visits to k, and let Yi(k) be the mean number of visits to i by the jump chain in between two

296

Solutions

Birth-death processes and imbedding

[6.9.12]-[6.11.2]

visits to k (so that, in particular, Yk(k) = g/; 1 and n(k) = 1). With Ej and 1P'j denoting expectation
and probability conditional on X (0) = j, we have that
Yi(k) =lEk(f:unl(Zn=i,H>n}) = f:Ek(Un I f(Zn=iJ)lP'k(Zn = i, H > n)
n=O
n=O

1
1
-lP'k(Zn = i, H > n) = -yi(k).
n=O 8i
8i
00

= L

The vector y(k) = (Yi (k) : i E S) satisfies y(k)H = y(k), by Lemma (6.4.5), where H is the
transition matrix of the jump chain Z. That is to say,
for j E S,
whence I:i Yi (k)gij = 0 for all j E S. If tlk = I:i Yi (k) < oo, the vector (Yi (k)/ t-tk) is a stationary
distribution for X, whence :rri = Yi(k)/t-tk for all i. Setting i = k we deduce that Trk = 1/(gktLk).
Finally, ifi;i Tri8i < oo, then

tlk =-:::::--- =
Trk

I:i Tri8i
"'
= tlk L:rri8i
Trk8k
i

12. Define the generator G by gu = -vi, 8ij = vihij, so that the imbedded chain has transition
matrix H. A root of the equation 1rG = 0 satisfies

0 = L:rri8ij = -TrjVj
i

L (:rrivi)hij
i:i-/-j

whence the vectors = (:rrjVj : j E S) satisfies s = tH. Therefore s = av, which is to say that
Trj Vj = avj ,for some constant a. Now Vj > 0 for all j, so that Trj = a, which implies that I:j Trj =!= 1.
Therefore the continuous-time chain X with generator G has no stationary distribution.

6.11 Solutions. Birth-death processes and imbedding


1.

The jump chain is a walk {Zn} on the setS= {0, 1, 2, ... } satisfying, fori ::: 1,
lP'(Zn+I = j

I Zn = i) = {

if j = i + 1,
ifj=i-1,

Pi
1- Pi

where Pi= Ai/(Ai + t-ti) Also lP'(Zn+l = 1 I Zn = 0) = 1.


2.

The transition matrix H = (hij) of Z is given by


i tL
h .. - { A+it-t
lJA
-A+ it-t

'f .

1 ]=!-'

ifj=i+l.

To find the stationary distribution of Y, either solve the equation 1r = 1rQ, or look for a solution of
the detailed balance equations :rrihi,i+l = :rri+Ihi+ 1,i. Following the latter route, we have that

297

[6.11.3]-[6.11.4]

Solutions

Markov chains

whence TCi =reo pi (1 + i j p )/ i! fori 2:: 1. Choosing reo accordingly, we obtain the result.
It is a standard calculation that X has stationary distribution v given by vi = pie -p j i! fori 2:: 0.
The difference between 1r and v arises from the fact that the holding-times of X have distributions
which depend on the current state.
3.

We have, by conditioning on X(h), that

l](t +h)= lE{lP'(X(t +h) = 0 X(h))}


= JJ-h 1 + (1- A.h- JJ-h)l](t) + A.h~(t) + o(h)
where~ (t) = lP'(X (t) = 0 I X (0) = 2). The process X may be thought of as a collection of particles
each of which dies at rate fJ- and divides at rate A., different particles enjoying a certain independence;
this is a consequence of the linearity of A.n and JJ-n. Hence ~ (t) = 11 (t ) 2 , since each of the initial pair
is required to have no descendants at time t. Therefore

1J 1(t) = M- (A.+ JJ-)IJ(f) + A.17(t) 2

subject to 11 (0) = 0.
Rewrite the equation as
11'

--,-,---------,___:_-----,-- = 1

(1 - IJ)(JJ- - A.IJ)

and solve using partial fractions to obtain


if).=/)-,

Finally, if 0 < t < u,


lP'(X(t) = 0 I X(u) = 0) = lP'(X(u) = 0 I X(t) = 0)

4.

lP'(X(t) = 0)
lP'(X(u) = 0)

17(t)
= -.
17(u)

The random variable X (t) has generating function


JJ-(1 - s)- (M- A.s)e-t(A.-JL)
G (s, t) = ;____:___ ___:____:__ ____:_..,.,-;------,).(1 - s)- (M- A.s)e-t(A.-JL)

as usual. The generating function of X(t), conditional on {X(t) > 0}, is therefore

~ sn lP'(X(t) = n) = G(s, t)- G(O, t).


~

lP'(X(t) > 0)

Substitute for G and take the limit as t

H(s)

1 - G(O, t)

oo to obtain as limit
(M- A.)s
/)--AS

oo

= Ls

Pn

n=l

where, with p = A./ JJ-, we have that Pn = pn-l (1 - p) for n 2:: 1.

298

Solutions

Special processes
5.

[6.11.5]-[6.12.1]

Extinction is certain if A. < JJ-, and in this case, by Theorem (6.11.10),

IfA > fJ- then lP'(T < oo) = JJ-/A., so


JE(T

IT

< oo)

!oo

oo {

)..

1 - -lE(sX(t))l =O
M
s

dt

looo (A. - JJ-)e<~L-A)t


( -A)t dt
o A. - JJ-e 11-

1
( ).. )
-log - - .
M
A. - M

In the case A.= JJ-, lP'(T < oo) = 1 and JE(T) = oo.

6. By considering the imbedded random walk, we find that the probability of ever returning to 1
is max{A., JJ-}/(A. + JJ-), so that the number of visits is geometric with parameter min{A., JJ-}/(A. + JJ-).
Each visit has an exponentially distributed duration with parameter A. + JJ-, and a short calculation
using moment generating functions shows that V1 (oo) is exponential with parameter min{A., JJ-}.
Next, by a change of variables, Theorem (6.11.10), and some calculation,

~srlE(Vr(t)) = lE(~

sr f(X(u)=r)du) = lE

(l

sX(u) du)

1
{ A.(l- s)- (M- A.s)e-(A-~L)t}
loot lE(s X( ) du = -JJ-fA. - -log
A.
A.-JJ1 { A.s(eP 1)} + terms not involving s,
= --log 1 -

u )

1 -

A.

JJ-eP -A.

where p = JJ- - A.. We take the limit as t --+ oo and we pick out the coefficient of sr.

7.

If A.= JJ- then, by Theorem (6.11.10),


lE(sX(t))

A.t(1 - s)
A.t(l - s)

+s = 1 _
+1

1- s
A.t(1 - s) + 1

and

r lE(sX(u)) du

lo

= t - _!_ log{A.t(l- s)

A.

+ 1}

1 1og { 1 - Ats
= --}
A.
1 +A.t

.
. s.
+ terms not mvolvmg

Letting t--+ oo and picking out the coefficient of sr gives JE(Vr(oo)) = (rA.)- 1 . An alternative
method utilizes the imbedded simple random walk and the exponentiality of the sojourn times.

6.12 Solutions. Special processes


1. The jump chain is simple random walk with step probabilities A./(A.
expected time IJ-10 to pass from 1 to 0 satisfies

299

+ JJ-) and JJ-/(A. + JJ-).

The

[6.12.2]-[6.12.4]

Solutions

Markov chains

whence JJ,lQ = (JJ, + A.)/(JJ, -A.). Since each sojourn is exponentially distributed with parameter
JJ, +A., the result follows by an easy calculation. See also Theorem (11.3.17).
2.

We apply the method of Theorem (6.12.11) with

the probability generating function of the population size at time u in a simple birth process. In the
absence of disasters, the probability generating function of the ensuing population size at time v is

The individuals alive at timet arose subsequent to the most recent disaster at timet - D, where D
has density function 8e- 8x, x > 0. Therefore,

3. The mean number of descendants after time t of a single progenitor at time 0 is e<A-~L)t. The
expected number due to the arrival of a single individual at a uniformly distributed time in the interval
on [0, x] is therefore
1
e(A-!L)X - 1
e(A-~L)u du = --,----,-x 0
(J...- JJ,)X

lox

The aggregate effect at timex of N earlier arrivals is the same, by Theorem (6.12.7), as that of N
arrivals at independent times which are uniformly distributed on [0, x]. Since JE(N) = vx, the mean
population size at timex is v[e(A-~L)x- 1]/(A.- JJ,). The most recent disaster occurred at timet- D,
where D has density function 8e - 8x, x > 0, and it follows that
lE(X(t)) =

loo

8e- 8x--[e(A-11-)x-

A.-JJ,

1]dx

v
8 x[e(A-~L)t- 1].
+ --e-

A.-JJ,

This is bounded as t --+ oo if and only if 8 > A. - JJ,.

4.

Let N be the number of clients who arrive during the interval [0, t]. Conditional on the event
{N = n}, the arrival times have, by Theorem (6.12.7), the same joint distribution as n independent
variables chosen uniformly from [0, t]. The probability that an arrival at a uniform time in [0, t] is
still in service at timet is f3 = f~[l - G(t- x)]t- 1 dx, whence, conditional on {N = n}, the total
number M still in service is bin(n, {J). Therefore,

whence M has the Poisson distribution with parameter A.{Jt = A. f~[l - G(x)] dx. Note that this
parameter approaches A.JE(S) as t --+ oo.

300

Solutions

Spatial Poisson processes

[6.13.1]-[6.13.3]

6.13 Solutions. Spatial Poisson processes


1. It is easy to check from the axioms that the combined process N(t) = B(t) + G(t) is a Poisson
process with intensity f3 + y.
(a) The time S (respectively, T) until the arrival of the first brown (respectively, grizzly) bear is
exponentially distributed with parameter f3 (respectively, y ), and these times are independent. Now,
lP'(S < T) = {oo f3e-{Jse-ys ds = _f3__

lo

f3 + Y

(b) Using (a), and the lack-of-memory of the process, the required probability is

(c) Using Theorem (6.12.7),

lE(min{S, T} B(l)

1
}
G(1)+2

= 1) = lE {
.

y- 1 + e-Y
y2

2. Let B, be the ball with centre 0 and radius r, and let Nr = ITI
(6.13.11) that S, = Z::xEnnB, g(x) satisfies

JE(S, IN,)= N,
where A(B) = fyEB A.(y) dy. Therefore, JE(S,)
gence that JE(S) = JJRd g(u)A.(u) du. Similarly,

JE(s;

N,) = lE

(l L

1
Br

n Brl-

We have by Theorem

A.(u)
g(u)--du,
A(B,)

fsr g(u)A.(u) du, implying by monotone conver-

g(x)J )

EnnBr

~ c~H,
E

g(x)

2))

+E (

g(x)g(y))

x,yEnnBr
= N,

A.(u)
g(u) - - du

Br

whence
lE(S;) =

A(Br)

+ N, (Nr

- 1)

11u

g(u)g(v)

u,vEBr

A.(u)A.(v)
A(Br)

2 du dv,

fsr g(u) A.(u) du + (fsr g(u)A.(u) du


2

By monotone convergence,

and the formula for the variance follows.


3. If B 1 , B2, ... , Bn are disjoint regions of the disc, then the numbers of projected points therein
are Poisson-distributed and independent, since they originate from disjoint regions of the sphere. By

301

[6.13.4]-[6.13.8]

Solutions

Markov chains

elementary coordinate geometry, the intensity function in plane polar coordinates is 2'A/ ~.
< 2n.

0 ::::; r ::::; 1, 0 ::::;


4.

The same argument is valid with resulting intensity function 2'A ~-

The Mercator projection represents the spherical coordinates (e, )as Cartesian coordinates in
the range 0 ::::; < 2n, 0 ::::;
n . (Recall that is the angle made with the axis through the north
pole.) Therefore a uniform intensity on the globe corresponds to an intensity .function 'A sine on the
map. Likewise, a uniform intensity on the map corresponds to an intensity 'A/ sine on the globe.

5.

e ::;

6.

Let the Xr have characteristic function. Conditional on the value of N(t), the corresponding
arrival times have the same distribution as N(t) independent variables with the uniform distribution,
whence
JE(eieS(t)) = lE{lE(eieS(t) I N(t))} = lE{lE(eiexe-au )N(t)}

= exp{A.t(JE(ewxe-au)- I)}= exp{ 'A

lot {(ee-"u)- 1} du }

where U is uniformly distributed on [0, t]. By differentiation,

Now, for s < t, S(t) = S(s )e-a(t-s) + S(t- s) where S(t- s) is independent of S(s) with the same
distribution as S(t- s). Hence, for s < t,
'AlE(X 2 )
'AJE(X 2 )
cov(S(s), S(t)) = ~a(t-s) = ~(1- e-2as)e-a(t-s)--+ ~e-av
ass --+ oo with v = t - s fixed. Therefore, p(S(s), S(s + v))--+ e-av ass--+ oo.

7.

The first two arrival times T1 , Tz satisfy

Differentiate with respect to x andy to obtain the joint density function A.(x)'A(x

+ y)e-A(x+y),

x, y ::: 0. Since this does not generally factorize as the product of a function of x and a function of y,
T1 and Tz are dependent in general.
8.

Let Xi be the time of the first arrival in the process Ni. Then
lP'(I

=I, T:::

t)

= lP'(t::::; x 1

00

< inf{X 2 ,

x3 , ... })

lP'(inf{Xz, X3, ... } > x)'A1e-Jqx dx

302

Al e-J..t_
'A

Solutions

Markov chain Monte Carlo

[6.14.1]-[6.14.4]

6.14 Solutions. Markov chain Monte Carlo


1.

If P is reversible then

RHS

= L(~PijXj )YiJl"i = ~JriPijXjYi = ~JrjPjiYiXj


l

l,]

l,]

~njXj (~PjiYi)
J

= LHS.

Suppose conversely that (x, Py) = (Px, y) for all x, y E l 2 (n). Choose x, y to be unit vectors with 1
in the ith and jth place respectively, to obtain the detailed balance equations ni Pij = nj Pji.
2. Just check that 0 ::::; bij ::::; 1 and that the Pij = 8ij bij satisfy the detailed balance equations
(6.14.3).

3. It is immediate that Pjk = IAjkl, the Lebesgue measure of Ajk This is a method for simulating
a Markov chain with a given transition matrix.
4. (a) Note first from equation (4.12.7) that d(U) = ~ supi#j dTv(ui-o Uj.), where Ui. is the mass
function Uif, t E T. The required inequality may be hacked out, but instead we will use the maximal
coupling of Exercises (4.12.4, 5); see also Problem (7.11.16). Thus requires a little notation. For
i, j E S, i =f. j, we find a pair (Xi, Xj) of random variables taking values in T according to the
marginal mass functions Ui., Uj., and such that IP'(Xi =f. Xj) = ~dTv(uh Uj.). The existence of
such a pair was proved in Exercise (4.12.5). Note that the value of Xi depends on j, but this fact
has been suppressed from the notation for ease of reading. Having found (Xi, Xj ), we find a pair
(Y(Xi), Y(Xj)) taking values in U according to the marginal mass functions vx; vxj'' and such that
IP'(Y(Xi)

=f.

Y(Xj)

I Xi,

Xj) =

IP'(Y(Xi)

=f.

~dTV(vx;., vxr). Now, taking a further liberty with the notation,

Y(Xj)~=

IP'(Xi = r, Xj = s)IP'(Y(r)

=f. Y(s))

,sES

is
= L
IP'(Xi = r, Xj = s)idTV(Vr., Vs.)
r,sES
ri=s

::'0

g supdTV(Vr., Vs.)}IP'(Xi =f. Xj),


ri=s

whence
d(UV) = supiP'(Y(Xi)

=f.

Y(Xj)) ::'0

ii=j

g rfs
supdTV(Vr., Vs.) }{sup!P'(Xi =f. Xj)}
i,j

and the claim follows.


(b) WriteS= {1, 2, ... , m}, and take

u=

(IP'(Xo = 1)
IP'(Yo = 1)

IP'(Xo = 2)
IP'(Yo = 2)

m))

IP'(Xo =
IP'(Yo = m)

The claim follows by repeated application of the result of part (a).


It may be shown with the aid of a little matrix theory that the second largest eigenvalue of a finite
stochastic matrix Pis no larger in modulus that d(P); cf. the equation prior to Theorem (6.14.9).

303

[6.15.1]-[6.15.6]

Solutions

Marlwv chains

6.15 Solutions to problems


1. (a) The state 4 is absorbing. The state 3 communicates with 4, and is therefore transient. The set
{1, 2} is finite, closed, and aperiodic, and hence ergodic. We have that /34(n) = Ct)n- 1 ~'so that
/34 = l:n /34(n) = ~(b) The chain is irreducible with period 2. All states are non-null persistent. Solve the equation
1f = 1f P to find the stationary distribution 1f = ( ~, ft, 156 , ~) whence the mean recurrence times are
8 16 16 8 .
d
3 , 3 , 3" , m or er.

2.

(a) Let P be the transition matrix, assumed to be doubly stochastic. Then


LPij(n)
i

= LLPik(n -OPkj = "L(LPik(n -l))Pkj


i

whence, by induction, the n-step transition matrix pn is doubly stochastic for all n :::: 1.
If j is not non-null persistent, then Pij (n) --+ 0 as n --+ oo, for all i, implying that l::i Pij (n) --+ 0,
a contradiction. Therefore all states are non-null persistent.
If in addition the chain is irreducible and aperiodic then Pij (n) --+ lij, where 1f is the unique
stationary distribution. However, it is easy to check that 1f = (N- 1 , N- 1 , ... , N- 1) is a stationary
distribution if Pis doubly stochastic.
(b) Suppose the chain is persistent. In this case there exists a positive root of the equation x = xP, this
root being unique up to a multiplicative constant (see Theorem (6.4.6) and the forthcoming Problem
vector of 1's. By the
(7)). Since the transition matrix is doubly stochastic, we may take x = 1, ~e
above uniqueness of x, there can exist no stationary distribution, and there e the chain is null. We
deduce that the chain cannot be non-null persistent.
3.

By the Chapman-Kolmogorov equations,


m,r,n:::: 0.

Choose two states i and j, and pick m and n such that a= Pij (m)Pji(n) > 0. Then
Pu(m

+ r + n):::: apjj(r).

Set r = 0 to find that Pii (m + n) > 0, and so d(i) I (m + n). If d(i) f r then Pii (m + r
that Pjj(r) = 0; therefore d(i) I d(j). Similarly d(j) I d(i), giving that d(i) = d(j).

+ n) =

0, so

4. (a) See the solution to Exercise (6.3.9a).


(b) Let i, j, r, s E S, and choose N(i, r) and N(j, s) according to part (a). Then

lP'(Zn = (r, s) Zo = (i,

n)

= Pir(n)Pjs(n) > 0

if n:::: max{N(i, r), N(j, s)}, so that the chain is irreducible and aperiodic.
(c) SupposeS= {1, 2} and

P=C 6)
In this case { {1, 1}, {2, 2}} and { {1, 2}, {2, 1}} are closed sets of states for the bivariate chain.
5. Clearly lP'(N = 0) = 1 - /ij, while, by conditioning on the time of the nth visit to j, we
have that lP'(N :::: n + 1 I N :::: n) = fjj for n :::: 1, whence the answer is immediate. Now
lP'(N = oo) = 1 - l:~o lP'(N = n) which equals 1 if and only if /ij = fjj = 1.
6. Fix i =/= j and let m = min{n : Pij (n) > 0}. If Xo = i and Xm = j then there can be no
intermediate visit to i (with probability one), since such a visit would contradict the minimality of m.

304

Solutions

Problems

[6.15.7]-[6.15.8]

Suppose Xo = i, and note that (1 - f}i) Pij (m) :5 1 - Iii, since if the chain visits j at time m
and subsequently does not return to i, then no return to i takes place at all. However fii = 1 if i is
persistent, so that fj i = 1.
7.

(a) We may takeS= {0, 1, 2, ... }. Note that qij (n)::: 0, and
00

L%(n) =

%(n

1,

+ 1) = Lqu(l)qu(n),

1=0

whence Q = (% (1)) is the transition matrix of a Markov chain, and Qn = (% (n)). This chain is
persistent since
for all i,
n

and irreducible since i communicates with j in the new chain whenever j communicates with i in the
original chain.
That
i =I= j, n ::: 1,

is evident when n = 1 since both sides are qij (1). Suppose it is true for n = m where m ::: 1. Now
lji(m

+ 1) =

i =I= j,

ljk(m)Pki

k:kfj

so that

_l_ lji(m

+ 1) =

Xi

i =I= j,

gkj(m)qik(1),

k:kfj

which equals gij (m + 1) as required.


(b) Sum (*) over n to obtain that
i =I= j,

where Pi(j) is the mean number of visits to i between two visits to j; we have used the fact that
I:n gij (n) = 1, since the chain is persistent (see Problem (6.15.6)). It follows that Xi = XOPi (0) for
all i, and therefore x is unique up to a multiplicative constant.
(c) The claim is trivial when i = j, and we assume therefore that i =I= j. Let Ni (j) be the number
of visits to i before reaching j for the first time, and write lP'k and lEk for probability and expectation
conditional on Xo = k. Clearly, lP'j(Ni(j):::: r) = hjiO- hijy-l for r::: 1, whence

The claim follows by(**).

8.

(a) If such a Markov chain exists, then


n

Un =

fiUn-i

i=l

305

n:::

1,

[6.15.9]-[6.15.9]
where

Solutions

Marlwv chains

fi is the probability that the first return of X to its persistent starting point s takes place at time

i. Certainly uo = 1.
Conversely, suppose u is a renewal sequence with respect to the collection Um : m 2: 1). Let X
be a Markov chain on the state space S = {0, 1, 2, ... } with transition matrix

.. _ { lP'(T 2: i + 2 I T 2: i + 1)
PI] 1 - lP'(T 2: i + 2 I T 2: i + 1)

if j = i + 1,
if j = 0,

where Tis a random variable having mass function fm = lP'(T = m). With Xo = 0, the chance that
the first return to 0 takes place at time n is

lP'( Xn = 0,

IT

xi

i= 0 I Xo

= 0) = P01P12 ... Pn-2,n-1Pn-1,0

IT

= ( 1 _ G(n + 1))
G(i + 1)
G(n)
i=l
G(i)

= G(n) -

G(n

+ 1) =

fn

where G(m) = lP'(T 2: m) = L:~m fn Now Vn = lP'(Xn = 0 I Xo = 0) satisfies


n

vo

= 1,

forn::;::1,

Vn = LfiVn-i
i=l

whence Vn = Un for all n.


(b) Let X andY be the two Markov chains which are associated (respectively) with u and v in the
above sense. We shall assume that X andY are independent. The product (unvn: n 2: 1) is now the
renewal sequence associated with the bivariate Markov chain (Xn, Yn).

2n
(2n)

9. Of the first
steps, let there be i rightwards, j upwards, and k inwards. Now Xzn = 0 if
and only if there are also i leftwards, j downwards, and k outwards. The number of such possible
!/ {(i! j! k!) 2 }, and each such combination has probability (t) 2U+ j+k) = (t) 2
combinations is
The first equality follows, and the second is immediate.
Now

n.

1 ) 2n
lP'(Xzn = 0) ::S ( 2
where

M=max{ 3 n~!lkl:
l. J . .

(2n) M
n

i+J+k=n

n!
3n1 "lkl

l.J.

i,j,k::;::O, i+j+k=n}.

It is not difficult to see that the maximum M is attained when i, j, and k are all closest to :} n, so that

Furthermore the summation in (*) equals 1, since the summand is the probability that, in allocating n
balls randomly to three urns, the urns contain respectively i, j, and k balls. It follows tliat
(2n)!
lP'(X2 = 0) < -----'-----7----::n

- 12nn!

306

(Lj-nJ !)3

Solutions [6.15.10]-[6.15.13]

Problems

which, by an application of Stirling's formula, is no bigger than Cn -2 for some constant C. Hence
l:n lP'(X2n = 0) < oo, so that the origin is transient.

10. No. The line of ancestors of any cell-state is a random walk in three dimensions. The difference
between two such lines of ancestors is also a type of random walk, which in three dimensions is
transient.

11. There are one or two absorbing states according as whether one or both of a and f3 equal zero. If
af3 =f. 0, the chain is irreducible and persistent. It is periodic if and only if a = f3 = 1, in which case
it has period 2.

IfO < af3 < 1 then


1r=

(a!f3'a:f3)

is the stationary distribution. There are various ways of calculating pn; see Exercise (6.3.3) for
example. In this case the answer is given by

proof by induction. Hence


as n ---+ oo.
The chain is reversible in equilibrium if and only if n1 P12

= 7r2P21, which is to say that af3 = f3a

12. The transition matrix is given by

Pi}=

(NN-i)2
ifj=i+1,
(NN- i) 2 if j = i,
1 - (Ni )
if j = i - 1,
(Ni )2
2

for 0 :::; i :::; N. This process is a birth-death process in discrete time, and by Exercise (6.5.1) is
reversible in equilibrium. Its stationary distribution satisfies the detailed balance equation Jri Pi,i+l =
ni+lPi+l,i for 0:::; i < N, whence ni = no(~) 2 for 0:::; i :::; N, where

_!_

no

t (~)
i=O

(2N).
N

13. (a) The chain X is irreducible; all states are therefore of the same type. The state 0 is aperiodic,
and so therefore is every other state. Suppose that Xo = 0, and let T be the time of the first return to
0. Then lP'(T > n) = aoa1 an-1 = bn for n :::: 1, so that 0 is persistent if and only if bn ---+ 0 as
n ---+ oo.
(b) The mean of T is
00

JE(T)

00

L lP'(T > n) = L bn.


307

[6.15.14]-[6.15.14]

Solutions

Markov chains

The stationary distribution 1r satisfies


00

no= L 7l'k(l- ak),


k=O

Jl'n = 7l'n-1Gn-1 for n 2: 1.

Hence Jl'n = nobn and n 1 = :L~o bn if this sum is finite.


(c) Suppose ai has the stated form for i 2: I. Then

n-1

II (1 -

bn =hi

Ai-f3),

n 2: /.

i=I

Hence bn --+ 0 if and only if :Li Ai-fJ = oo, which is to say that
persistent if and only if f3 ::=: 1.
(d) We have that 1 - x :::: e-x for x 2: 0, and therefore

f3 ::=:

1. The chain is therefore

if{J<l.

(e) If f3 = 1 and A > 1, there is a constant c I such that


oo
oo
{
n-1 1 }
oo
oo
Lbn::SbiLexp - A L i ::SciLexp{-Alogn}=ciLn-A<oo,
n=I
n=I
z=I
n=I
n=I

giving that the chain is non-null.


(f) If f3 = 1 and A :::: 1,

bn = b I

n-1(
A)
II
1 - --;-

i=I

2: b I

n-1(.
II ~1)
i=I

= bI

(/ =1) .
n

Therefore :Ln bn = oo, and the chain is null.

14. Using the Chapman-Kolmogorov equations,

IPi} (t + h) -

Pij (t) I =

IL (Pik (h) -

Oik) Pkj (t)

I :::: (1 -

Pii (h)) Pij (t)

::S

(1 -

+ L Pik (h)
k=Ji

k
Pii (h))

+ (1 -

Pii (h)) --+ 0

as h .j, 0, if the semigroup is standard.


Now logx is continuous for 0 < x :::: 1, and therefore g is continuous. Certainly g(O) = 0. In
addition Pii (s + t) 2: Pii (s)pii (t) for s, t 2: 0, whence g(s + t) ::S g(s) + g(t), s, t 2: 0.
For the last part

1
((Pii(t) - 1)

g(t)
Pii (t) - 1
-t- -log{1- (1- pu(t))}--+ -A.

as t .j, 0, since x flog( 1 - x) --+ -1 as x .j, 0.

308

Solutions

Problems

[6.15.15]-[6.15.16]

15. Let i and j be distinct states, and suppose that Pij (t) > 0 for some t. Now

implying that (Gn)ij > 0 for some n, which is to say that

for some sequence k1, k2, ... , kn of states.


Suppose conversely that (*)holds for some sequence of states, and choose the minimal such value
of n. Then i, k1, k2, ... , kn, j are distinct states, since otherwise n is not minimal. It follows that
(Gn)ij > 0, while (Gm)ij = 0 for 0::: m < n. Therefore
00

Pij (t) = t

n"'l

L..J k! t (G )ij

k=n

is strictly positive for all sufficiently small positive values oft. Therefore i communicates with j.

16. (a) Suppose X is reversible, and let i and j be distinct states. Now
lP'(X(O) = i, X(t) =

j)

= lP'(X(t) = i, X(O) =

j),

which is to say that Tri Pij (t) = Trj Pji (t). Divide by t and take the limit as t

+0 to obtain that

Trigij = Trjgji

Suppose now that the chain is uniform, and X (0) has distribution 1r. If t > 0, then
lP'(X (t) = j) =

2.:: Tri Pij (t) = Trj,


i

so that X(t) has distribution 1r also. Now lett < 0, and suppose that X(t) has distribution J.L. The
distribution of X (s) for s ::: 0 is J.LPs-t = 1r, a polynomial identity in the variables - t, valid for all
s ::: 0. Such an identity must be valid for all s, and particularly for s = t, implying that J.L = 1r.
Suppose in addition that Tri gij = Trj gji for all i, j. For any sequence k1, k2, ... , kn of states,
Trigi,ktgkt,kz gkn,j = gkJ,iTrktgkt,kz gkn,j = = gk1,igkz,k1 gj,knTrj.

Sum this expression over all sequences k1, k2, ... , kn oflength n, to obtain
Tri(Gn+1)ij = Trj(Gn+1)ji

n::: 0.

It follows, by the fact that Pt = etG, that

foralli,j,t. Fort1 < t2 < < tn,


lP'(X(t1) = i1, X(t2) = i2, ... , X(tn) =in)

= Tri! Pit ,iz (t2 - t1) Pin_ 1,in (tn - tn-1)


= Piz,i1 (t2- tl)TrizPiz,i3 (t3- t2) Pin-J,in Ctn- tn-1) =
= Pi2 ,i 1 (t2- t1) Pin.in- 1 Ctn- tn-1)1rin
= lP'(Y(t1) = i1, Y(t2) = i2, ... , Y(tn) =in),

309

[6.15.17]-[6.15.19]

Markov chains

Solutions

giving that the chain is reversible.


(b) LetS = {1, 2} and

=(-a a)
f3

-{3

where af3 > 0. The chain is uniform with stationary distribution

and therefore Jqgl2 = n2g2i


(c) Let X be a birth--death process with birth rates A.; and death rates J.Li. The stationary distribution
n satisfies

Therefore nk+iiLk+i = Jl'kAk fork 2: 0, the conditions for reversibility.

17. Consider the continuous-time chain with generator


G = (-{3

f3 ) .

-y

It is a standard calculation (Exercise (6.9.1)) that the associated semigroup satisfies


({3

+ f3h(t)
y(1- h(t))

)P = ( y
y

{3(1- h(t)))
f3 + yh(t)

where h(t) = e-t(,B+y). Now P1 = Pifandonlyify + f3h(1) = f3 + yh(1)


to say that f3 = y = -1log(2a - 1), a solution which requires that a > 1

= a(f3 +

y), which is

18. The forward equations for Pn(t) = lP'(X(t) = n) are


Pb(t) = JLPi - A.po,

P~ (t) = APn-i - (A.+ nJL) Pn + JL(n + 1) Pn+i,


In the usual way,

aa
at

=(s- 1)(A.G- 11

no::

I.

aa)
as

with boundary condition G(s, 0) =sf. The characteristics are given by

ds

dt= - - JL(S - 1)

dG
A.(s - 1)G'

and therefore G = eP(s-i) f ( (s-l)e-JU), for some function f, determined by the boundary condition
to satisfy eP(s-l) f(s- 1) =sf. The claim follows.
As t-+ oo, G(s, t)-+ eP(s-i), the generating function of the Poisson distribution, parameter p.

19. (a) The forward equations are

-p;;(s, t) = -A.(t)p;;(s, t),


at

atPij(s,t) = -A.(t)pij(S, t) +A.(t)Pi,j-J(t),

310

i < j.

Solutions [6.15.20]-[6.15.20]

Problems
Assume N(s) = i and s < t. In the usual way,
00

G(s, t; x)

= 2:xilP'(N(t) = j

I N(s)

= i)

J=i

satisfies
iJG

at= A.(t)(x- 1)G.

We integrate subject to the boundary condition to obtain


G(s, t; x) =xi exp { (x- 1)

1t

A.(u) du},

whence Pi} (t) is found to be the probability that A = j - i where A has the Poisson distribution with

J:

parameter A.(u) du.


The backward equations are

osPij(S, t) = A.(s)Pi+l,j(S, t)- A.(s)Pij(S, t);

using the fact that Pi+ I,} (t) = Pi,J-1 (t), we are led to
iJG
- - = A.(s)(x- !)G.

OS

The solution is the same as above.


(b) We have that
lP'(T > t)

= Poo(t) = exp {

so that
fT(t) = A.(t)exp {

In the case A.(t) = cj(l

-l

lot

A.(u) du},

t 2:0.

A.(u)du }

+ t),
E(T)

= !o oo lP'(T
0

> t)dt

= !ooo
o

du

(1

+ u)c

which is finite if and only if c > 1.


20. Let s > 0. Each offer has probability 1 - F (s) of exceeding s, and therefore the first offer
exceeding sis the Mth offer overall, where lP'(M = m) = F(s)m-l [I - F(s)], m 2: 1. Conditional
on {M = m}, the value of XM is independent of the values of X1, X2, ... , XM-1, with
lP'(XM > u I M = m) =

1- F(u)
,
I - F(s)

0<

S ::": U,

and X 1, X 2, ... , X M -I have shared (conditional) distribution function


F(u)
G(u Is)= F(s),

311

0:::: u:::: s.

[6.15.21]-[6.15.21]

Solutions

Markov chains

For any event B defined in terms of X 1, X 2, ... , X M -! , we have that


00

IP'(XM > u, B)=

IP'(XM > u, B

IM =

m)IP'(M = m)

m=i
00

IM =

IP'(XM > u

m)IP'(B

IM =

m)IP'(M = m)

m=i
00

= IP'(XM > u)

L IP'(B I M = m)IP'(M = m)
m=i

0<

= IP'(XM > u)IP'(B),

S _:::: U,

where we have used the fact that IP'(XM > u I M = m) is independent of m. It follows that the first
record value exceeding s is independent of all record values not exceeding s. By a similar argument
(or an iteration of the above) all record values exceeding s are independent of all record values not
exceeding s.
The chance of a record value in (s, s + h] is
IP'(s < XM < s +h)=
-

F(s +h)- F(s)

1- F(s)

J(s)h

1- F(s)

+o(h).

A very similar argument works for the runners-up. Let XM 1 , XM2 , .. be the values, in order,
of offers exceeding s. It may be seen that this sequence is independent of the sequence of offers
not exceeding s, whence it follows that the sequence of runners-up is a non-homogeneous Poisson
process. There is a runner-up in (s, s + h] if (neglecting terms of order o(h)) the first offer exceeding
s is larger than s + h, and the second is in (s, s + h]. The probability of this is
( 1-F(s+h)) (F(s+h)-F(s)) +o(h ) = f(s)h +o (h) .
1-F(s)
1-F(s)
1-F(s)

21. Let F 1 (x) = IP'(N*(t) _:::: x), and let A be the event that N has a arrival during (t, t +h). Then
Fr+h(x) = A.h!P'(N*(t +h)_::::

xI A)+ (1- A.h)Fr(x) + o(h)

where
IP'(N*(t +h)_::::

Hence

~Fr(x) =
at

xI A)=

-A.Fr(x)

L:

t.../

00

-oo

Fr(x- y)f(y)dy.

Fr(x- y)f(y)dy.

Take Fourier transforms to find that 1 (8) = IE(eifiN*(t)) satisfies

an equation which may be solved subject to o(8) = 1 to obtain 1 (8) = eAI(</>(e)-ll.


Alternatively, using conditional expectation,

where N (t) is Poisson with parameter A.t.

312

Solutions [6.15.22]-[6.15.25]

Problems

22. We have that


E(sN(t)) = E{E(sN(t)

I A)}=

1{eJ.It(s-1)

+ eJ.2t(s-l)},

= ~(A.J + A.z)t and var(N(t)) = 1(A.J + A.z)t + i<A.I- A.z) 2 t 2 .


Conditional on {X (t) = i}, the next arrival in the birth process takes place at rate Ai.

whence E(N(t))
23.

24. The forward equations for Pn(t) = li"(X(t) = n) are

n 2: 0,
with the convention that P-i (t) = 0. Multiply by sn and sum to deduce that
(1

aa

aa

aG

2
+ J.Lf)at = sG + JLS as - G- JLSas

as required.
Differentiate with respect to s and take the limit ass
m(t)

l. If E(X (t) 2 ) < oo, then

= E(X(t)) = -aal

as

s=i

satisfies (1 + J.Lt)m 1(t) = 1 + J.Lm(t) subject to m(O) =I. Solving this in the usual way, we obtain
m(t) = I+ (1 + JL/)t.
Differentiate again to find that
n(t) = E(X(t)(X(t)

satisfies (1

+ J.Lf)n 1 (t)

= 2(m(t)

+ J.Lm(t) + J.Ln(t))

n(t) = I (I - 1)

The variance of X(t) is n(t)

-1))

a2al

= -

as

s=i

subject to n(O) =I (I- 1). The solution is

+ 2I (1 + JL/)t + (1 + 11 I)(l + 11 + 11 I)t 2 .

+ m(t) -

m(t) 2 .

25. (a) Condition on the value of the first step:


TJ} =

AJ

"}

as required. Set Xi

= TJi+i

ILJ

..,--------+ . TJ} +I + ..,--------+ . TJ} -!,


!Lj

"}

- TJi to obtain AJXJ


j

XJ

!Lj

= JLjXJ-i

for j 2: 1, so that

=xoiT
f.
i=i
l

It follows that
TJ}+i

= TJO +

k=O

k=O

L Xk = 1 + (TJJ - 1) L ek.

The TJ} are probabilities, and lie in [0, 1]. If l::f ek = oo then we must have TJJ = 1, which implies
that TJ} = 1 for all j.

313

[6.15.26]-[6.15.28]

Solutions

Markov chains

(b) By conditioning on the first step, the probability T/J, of visiting 0 having started from j, satisfies

u + I) 2 TJj+i + p,j-1
j2

T/j =

+ (j + 1)2

Hence, (j + 1) 2 (TJj+i- TJj) = P(TJj- T/j-J), giving (j + 1) 2 (TJj+i- TJj) = TJi- TJO Therefore,
j

1- TJj+i = (1- TJJ) '""


L.J

k=O (k

+ 1)

I 2
2 -+ (1- TJI)6n

as j-+ oo.

By Exercise (6.3.6), we seek the minimal non-negative solution, which is achieved when T/1 = 1 (6/n 2).

26. We may suppose that X(O) = 0. Let Tn = inf{t : X(t) = n}. Suppose Tn = T, and let
Y = Tn+i - T. Condition on all possible occurrences during the interval (T, T +h) to find that
E(Y) = (A.nh)h + f.lnh(h + E(Y 1)) + (1 - A.nh- P,nh)(h + E(Y)) + o(h),
where Y' is the mean time which elapses before reaching n + 1 from n- 1. Set mn = E(Tn+i - Tn)
to obtain that
mn = P,nh(mn-i + mn) + mn + h{l- (A.n + P,n)mn} + o(h).
Divide by h and take the limit ash

.,!..

0 to find that A.nmn = 1 + Jlnmn-i, n 2: 1. Therefore

1
f.ln
1
Jln
Jlnf.ln-i Ill
mn = - + -mn-i = = - + - - - + +

An
An
An
An An-i
An An-i A.o

since mo = A. 01 . The process is dishonest if ~~ 0 mn < oo, since in this case T00 = lim Tn has
finite mean, so that lP'(T00 < oo) = 1.
On the other hand, the process grows no faster than a birth process with birth rates Ai, which is
honest if ~~ 0 1/A.n = oo. Can you find a better condition?
27. We know that, conditional on X (0) = I, X (t) has generating function

A.t(l-s)+s)I
G(s, t) = ( A.t(l - s) + 1
,
so that
lP'(T:::; x

A.x
X(O) =I)= lP'(X(x) = 0 I X(O) =I)= G(O, x) = ( A.x + 1

)I

It follows that, in the limit as x -+ oo,


lP'(T :S x) = ~
L.J ( -A.xI=O Ax + 1

)I

lP'(X(O) = /) = Gx(O)

-A.x- ) -+ 1.
Ax + 1

For the final part, the required probability is {xI I (xI + 1)} I = { 1 + (x /)-I }-I, which tends to
e-ijx as I -+ oo.

28. Let Y be an immigration-death process without disasters, with Y (0) = 0. We have from Problem
(6.15.18) that Y(t) has generating function G(s, t)
exp{p(s- 1)(1- e-!Lt)} where p A.fp,. As
seen earlier, and as easily verified by taking the limit as t -+ oo, Y has a stationary distribution.

314

Solutions [6.15.29]-[6.15.31]

Problems

From the process Y we may generate the process X in the following way. At the epoch of each
disaster, we paint every member of the population grey. At any given time, the unpainted individuals
constitute X, and the aggregate population constitutes Y. When constructed in this way, it is the case
that Y(t) :::; X(t), so that Y is a Markov chain which is dominated by a chain having a stationary
distribution. It follows that X has a stationary distribution :n: (the state 0 is persistent for X, and
therefore persistent for Y also).
Suppose X is in equilibrium. The times of disasters form a Poisson process with intensity 8. At
any given time t, the elapsed time T since the last disaster is exponentially distributed with parameter
8. At the time of this disaster, the value of X (t) is reduced to 0 whatever its previous value.
It follows by averaging over the value ofT that the generating function H(s) = L~o snnn of
X (t) is given by

by the substitution x = e-JLu. The mean of X (t) is

29. Let G(IBI, s) be the generating function of X(B). If BnC = 0, then X(B UC) = X(B)+X(C),
so that G(a + f.l, s) = G(a, s)G(f.l, s) for Is I :::; 1, a, f.l ::=: 0. The only solutions to this equation which
are monotone in a are of the form G(a, s) = eaJ..(s) for lsi :::; 1, and for some function A.(s). Now any
interval may be divided into n equal sub-intervals, and therefore G(a, s) is the generating function of
an infinitely divisible distribution. Using the result of Problem (5.12.13b), A.(s) may be written in the
form A.(s) = (A(s) -l)A. for some A. and some probability generating function A(s) = L~ aisi. We
now use (iii): if lEI= a,
IP(X (B) :0:: 1)
IP(X(B) = 1)
as a .,), 0. Therefore ao + a 1 = 1, and hence A(s) = ao
distribution with parameter proportional to IB 1.

+ (1- ao)s,

and X(B) has a Poisson

30. (a) Let M (r, s) be the number of points of the resulting process on R+ lying in the interval (r, s].
Since disjoint intervals correspond to disjoint annuli of the plane, the process M has independent
increments in the sense that M(rJ, SJ), M(r2, s2), ... , M(rn, sn) are independent whenever r1 <
SJ < r2 < < rn < Sn. Furthermore, for r <sand k :0:: 0,
{A.:rr(s _
IP(M(r, s) = k) = IP(N has k points in the corresponding annulus) =

r)}ke-J..n(s-r)

k!

(b) We have similarly that

IP(R(k):::; x) = IP(N has leastk points in circle of radius x) =

Loo

(A.nx2t e-J..nx2

r= k

r.1

and the claim follows by differentiating, and utilizing the successive cancellation.
31. The number X(S) of points within the sphere with volume Sand centre at the origin has the
Poisson distribution with parameter A.S. Hence IP(X (S) = 0) = e-J..S, implying that the volume V of
the largest such empty ball has the exponential distribution with parameter A..

315

[6.15.32]-[6.15.33]

Solutions

Markov chains

It follows that lP'(R > r) = JP'(V > ern) =


ball in n dimensions. Therefore

e-J...crn

for r 2: 0, where c is the volume of the unit

r 2: 0.

Finally, E(R) = J000 e-J...crn dr, and we set v = 'Acrn.

32. The time between the kth and (k + 1)th infection has mean 'Ak" 1, whence
N

E(T)

=I:-.
Ak
k=1

Now
N

{N

.(.; k(N + 1 - k) = N + 1 .(.;

k + .(.; N

1
}
+ 1- k

1
2
L= --{logN +y +O(N- 1)}.
N +1
k
N +1
2

= --

k= 1

It may be shown with more work (as in the solution to Problem (5.12.34)) that the moment
generating function of 'A(N + l)T- 2log N converges as N-+ oo, the limit being {r(l- 8)} 2 .

33. (a) The forward equations for Pn (t) = JP'(V (t) = n + ~) are
p~(t)

= (n + 1)Pn+l(t)- (2n + l)pn(t) +nPn-1(t),

n 2: 0,

with the convention that P-1 (t) = 0. It follows as usual that


a a= aaat

( 2s-+G
aa
)

as

as

( s 2 -+sG
aa
)
as

as required. The general solution is


G(s,

t) =

- 1-J

1-s

1- )
(t + -1-s

for some function f. The boundary condition is G(s, 0) = 1, and the solution is as given.
(b) Clearly
mn(T)=E(foT lntdt) =loT ECint)dt

by Fubini's theorem, where lnt is the indicator function of the event that V (t) = n + ~.
As for the second part,

Loo snmn(T) =
n=O

r G(s,
lo
T

t) dt = log[ 1 + (1 - s)T],
1- s

so that, in the limit as T -+ oo,

Loo
n=O

1
( 1 + (1- s)T)
log(1- s)
sn (mn(T) -log T) =--log
-+
=-

1-s

316

1-s

Loo snan
n=1

Solutions [6.15.34]-[6.15.35]

Problems

if lsi < 1, where an = ~?= 1 i- 1 , as required.


(c) The mean velocity at time t is
1
aa
1
-+2
as s=l =t+-.
2
J

(0, 1)

(1, 0)

34. It is clear that Y is a Markov chain, and its possible transitions are illustrated in the above
diagram. Let x andy be the probabilities of ever reaching (1, 1) from (1, 2) and (2, 1), respectively.
By conditioning on the first step and using the translational symmetry, we see that x = ~ y + ~ x 2
andy= ~ + ~xy. Hence x 3 - 4x 2 + 4x- 1 = 0, an equation with roots x = 1, ~(3 0). Since
xis a probability, it must be that either x = 1 or x = ~(3- 0), with the corresponding values of
y = 1 and y = ~ (0 - 1). Starting from any state to the right of (1 , 1) in the above diagram, we
see by recursion that the chance of ever visiting (1, 1) is of the form xa yf3 for some non-negative
integers a, {3. The minimal non-negative solution is therefore achieved when x = ~(3- 0) and
y = ~(0- 1). Since x < 1, the chain is transient.
35. We write A, 1, 2, 3, 4, 5 for the vertices of the hexagon in clockwise order. Let Ti = min{n :::
1: Xn = i} and ll"iO = ll"( I Xo = i).
(a) By symmetry, the probabilities Pi = ll"i (h < Tc) satisfy
2

PA = 3PI

whence p A =

I
PI= 3

I
+ 3P2

I
P2 = 3PI

1
+ 3P3,

P3 = 3P2,

:J:r.

(b) By Exercise (6.4.6), the stationary distribution is nc =


-1
8
JTA = .
(c) By the argument leading to Lemma (6.4.5), this equals

:!, JTi

f-tAJTC

~ fori =/= C, whence MA =

= 2.

(d) We condition on the event E = {h < Tc} as in the solution to Exercise (6.2.7). The probabilities
bi = lP'i (E) satisfy

317

[6.15.36]-[6.15.37]

Solutions

yielding bi = -ft, b2 =
equations of the form

j;, b3

Markov chains

= ~- The transition probabilities conditional onE are now found by

ii2 =

li"2(E)p12
li"I (E)

. 'IarIy r2I = 97 , r23 = 92 , r32 = 2I , TIA = 76 . Hence,


and sum
IL2A = 1 +~ILIA+

giving ILIA=

~IL3A,

WI'th

IL3A = 1 + IL2A

.
.
theobvwus
notation,
ILIA= 1 + +IL2A

lj!, and the required answer is 1 +ILIA= 1 + .!j1 = J;f.

36. (a) We have that


f3(m - i) 2

Pi,i+I =

m2

Look for a solution to the detailed balance equations

to find the stationary distribution

(b) In this case,


f3(m - i)

a(i

Pi+I,i =

Pi,i+I =

m
Look for a solution to the detailed balance equations
f3(m - i)

lri

= lri+I

a(i

+ 1)
m

+ 1)
m

yielding the stationary distribution

37. We have that

by the Chapman-Kolmogorov equations

by the concavity of c

318

Solutions [6.15.38]-[6.15.41]

Problems

where we have used the fact that 'E,j njPjk(t) = nk. Now aj(s)---+ nj ass---+ oo, and therefore
d(t) ---+ c(l).

38. By the Chapman-Kolmogorov equations and the reversibility,


uo(2t) = LlP'(X(2t) = 0 I X(t) = j)lP'(X(t) = j

I X(O) = 0)

'
(u(t))
= "'no
L..J ----:-1P'(X(2t) = j I X(t) = O)uj(t) =no "
L..Jnj
- 1 -.
j

n]

n]

The function c(x) = - x 2 is concave, and the claim follows by the result of the previous problem.

39. This may be done in a variety of ways, by breaking up the distribution of a typical displacement and
using the superposition theorem (6.13.5), by the colouring theorem (6.13.14), or by Renyi's theorem
(6.13.17) as follows. Let B be a closed bounded region of Rd. We colour a point of IT at x E Rd
black with probability lP'(x + X E B), where X is a typical displacement. By the colouring theorem,
the number of black points has a Poisson distribution with parameter
{

A.lP'(x + X E B) dx = A. {

}Jf!.d

=A.

lP'(X E dy - x)

dy {

}yEB

lxEJRd

dy

}yEB

lP'(X

dv) = A.IBI,

lvE!Rd

by the change of variables v = y - x. Therefore the probability that no displaced point lies in B is
e-J..!BI, and the claim follows by Renyi's theorem.

40. Conditional on the number N (s) of points originally in the interval (0, s ), the positions of these
points are jointly distributed as uniform random variables, so the mean number of these points which
lie in ( -oo, a) after the perturbation satisfies
A.s

looo Fx(a-u)du=IE(RL)
lo0s1-lP'(X+u:sa)du---+A.
s
0

as s ---+ oo,

where X is a typical displacement. Likewise, IE(LR) =A. J000 [1 - Fx(a + u)] du. Equality is valid
if and only if

00

[1- Fx(v)]dv

ja-oo

Fx(v)dv,

which is equivalent to a =IE( X), by Exercise (4.3.5).


The last part follows immediately on setting Xr = Vrt, where Vr is the velocity of the rth car.

41. Conditional on the number N(t) of arrivals by timet, the arrival times of these ants are distributed
as independent random variables with the uniform distribution. Let U be a typical arrival time, so
that U is uniformly distributed on (0, t). The arriving ant is in the pantry at time t with probability
n = lP'(U +X > t), or in the sink with probability p = lP'(U +X < t < U +X+ Y), or departed
with probability 1 - p - n. Thus,
IE(xA(t)YB(t)) = IE{IE(xA(t)YB(t) I N(t))}

= IE{ (n x + py + 1 -

n - p )N(t)}

= eh(x-1) eJ..p(y-1).

Thus A(t) and B(t) are independent Poisson-distributed random variables. If the ants arrive in pairs
and then separate,

319

[6.15.42]-[6.15.45]

Solutions

Markov chains

where y = 1- n- p. Hence,

whence A(t) and B(t) are not independent in this case.

42. The sequence {Xr} generates a Poisson process N(t) = max{n : Sn _:::: t}. The statement that
Sn =tis equivalent to saying that there are n- 1 arrivals in (0, t), and in addition an arrival at t. By
Theorem (6.12.7) or Theorem (6.13.11), the first n - 1 arrival times have the required distribution.
Part (b) follows similarly, on noting that fu(u) depends on u = (UJ, u2, ... , un) only through
the constraints on the Ur.

43. Let Y be a Markov chain independent of X, having the same transition matrix and such that Yo
has the stationary distribution Jr. LetT= rnin{n ::: 1 : Xn = Yn} and suppose Xo = i. As in the
proof of Theorem (6.4.17),
IPij (n)- Tl:j I =

IL

nk(Pij (n)- Pkj (n))

I_: L

ll:klP'(T > n) = lP'(T > n).

Now,

lP'(T > r
where

+1 IT

for r::: 0,

> r) _:::: 1- E2

= rnin;j {Pi}} > 0. The claim follows with "A = 1 -

E2

44. Let/k(n)betheindicatorfunctionofavisittokattimen,sothatE(h(n)) =lP'(Xn =k) =ak(n),


say. By Problem (6.15.43), lak(n)- nkl _:::: ;..n. Now,

Lets = rnin{m, r} and t = lm- rl. The last summation equals

= n12

L L {(a;(s)- n;)(Pii (t)- n;) + n;(p;; (t)- n;)


r

+ n; (a; (s) -

n;) - n; (a; (r) - n;) - n; (a; (m) - n;)}

1
_:: : 2 LL(;..s+t +"At +"As +"Ar +"Am)
n r m
as n--+ oo,
where 0 < A < oo. For the last part, use the fact that I:~;:J f(Xr) = l:::iES f(i)V; (n). The result
is obtained by Minkowski's inequality (Problem (4.14.27b)) and the first part.

45. We have by the Markov property that f(Xn+i I Xn, Xn-i, ... , Xo) = f(Xn+i I Xn), whence
E(log f(Xn+i I Xn, Xn-J, ... , Xo) I Xn, ... , Xo) = E(log f(Xn+i I Xn) I Xn).

320

Problems

Solutions

[6.15.46]-[ 6.15.48]

Taking the expectation of each side gives the result. Furthermore,


H(Xn+! I Xn) =- 2)Pij log Pij )lP'(Xn = i).
i,j

Now X has a unique stationary distribution


is finite, and the claim follows.

JC,

so that lP'(Xn = i) ---+ Jri as n ---+ oo. The state space

46. Let T = inf{t : X 1 = Y1 }. Since X and Y are persistent, and since each process moves by
distance 1 at continuously distributed times, it is the case that lP'(T < oo) = 1. We define

Zt= {

Xt

if t < T,

Yt

ift 2: T,

noting that the processes X and Z have the same distributions.


(a) By the above remarks,
IIP'(Xt = k) -lP'(Yt = k)l = llP'(Zt = k) -lP'(Yt = k)l
:::silP'CZt=k, T:::St)+lP'(Zt=k, T>t)-lP'(Yr=k, T:::St)-lP'(Yt=k, T>t)l
:::S lP'(Xt = k, T > t)

+ lP'(Yt

= k, T > t).

We sum over k E A, and let t ---+ oo.


(b) We have in this case that Z 1 :::S Y1 for all t. The claim follows from the fact that X and Z are
processes with the same distributions.
47. We reformulate the problem in the following way. Suppose there are two containers, Wand N,
containing n particles in all. During the time interval (t, t + dt), any particle in W moves toN with
probability [tdt +o(dt), and any particle inN moves toW with probability A.dt +o(dt). The particles
move independently of one another. The number Z(t) of particles in W has the same rules of evolution
as the process X in the original problem. Now, Z (t) may be expressed as the sum of two independent
random variables U and V, where U is bin(r, e1 ), Vis bin(n- r, 1/11), and e1 is the probability that a
particle starting in W is in W at timet, 1/11 is the probability that a particle starting inN at 0 is in W
at t. By considering the two-state Markov chain of Exercise (6.9.1),

and therefore
E(sX(t)) = E(sU )E(s V) = (set

+1-

sr (s1flt

+1-

s)n-r.

Also, E(X(t)) = re 1 + (n- r)1/1 1 and var(X(t)) = re 1 (1- e1 ) + (n- r)1/f1 (1 -1/11 ). In the limit as
n ---+ oo, the distribution of X (t) approaches the bin(n, A./(A. + ft)) distribution.
48. Solving the equations

gives the first claim. We have that y = l:i (Pi - qi )Jri, and the formula for y follows.
Considering the three walks in order, we have that:
A. Jri =

1for each i, and YA = -2a < 0.

B. Substitution in the formula for YB gives the numerator as 3{- ~a +o(a) }, which is negative
for small a whereas the denominator is positive.

321

[6.15.49]-[6.15.51]

Solutions

Markov chains

i (Jb -

C. The transition probabilities are the averages of those for A and B, namely, Po =
a)+
a) =
a, and so on. The numerator in the formula for YC equals
+o(l),
which is positive for small a.

i<i-

-fu-

-rfu

49. Call a car green if it satisfies the given condition. The chance that a green car arrives on the scene
during the time interval (u, u +h) is A.hlP'(V < xf(t- u)) for u < t. Therefore, the arrival process
of green cars is an inhomogeneous Poisson process with rate function
A.(u) = {

A.lP'(V < xf(t- u))


0

if u < t,
if u :::: t.

Hence the required number has the Poisson distribution with mean
A.

lot lP' ( V < t ~ J du lot lP' (v < ~) du


lot E(/[Vu<xJ) du
=A.
=A.

= A.E(v- 1 min{x, Vt}).

50. The answer is the probability of exactly one arrival in the interval (s, t), which equals g(s) =
A.(t- s)e-J..(t-s). By differentiation, g has its maximum atS = max{O, t - A.- 1}, and g(s) = e- 1
whent:::: A.- 1 .
51. We measure money in millions and time in hours. The number of available houses has the
Poisson distribution with parameter 30A., whence the number A of affordable houses has the Poisson
distribution with parameter~ 30A. =SA. (cf. Exercise (3.5.2)). Since each viewing timeT has moment
generating function E(e 8T) = (e 28 - e8 )je, the answer is

322

7
Convergence of random variables

7.1 Solutions. Introduction


1. (a) El(cXYI = lcr {IIXIIr}'.
(b) This is Minkowski's inequality.
(c) Let E > 0. Certainly IX I 2: h where h is the indicator function of the event {I X I > E}. Hence
EIX' I 2: Eli; I = lP'(IXI > E), implying that lP'(IXI > E) = 0 for all E > 0. The converse is trivial.

2.

(a) E({aX

+ bY}Z) =
+ E({X -

(b) E({X + Y} 2 )
(c) Clearly

aE(XZ)

+ bE(YZ).
+ 2E(Y 2 ).

Y} 2 ) = 2E(X 2 )

3. Let f(u) = ~E, g(u) = 0, h(u) = -~E, for all u. Then df(f, g)+ dE(g, h) = 0 whereas
df(f, h)= 1.
~

Either argue directly, or as follows. With any distribution function F, we may associate a graph
F obtained by adding to the graph of F vertical line segments connecting the two endpoints at each
discontinuity ofF. By drawing a picture, you may see that .../2 d (F, G) equals the maximum distance
between F and G measured along lines of slope -1. It is now clear that d(F, G) = 0 if and only if
F = G, and that d(F, G) = d(G, F). Finally, by the triangle inequality for real numbers, we have
that d(F, H) ~ d(F, G)+ d(G, H).
5.

Take X to be any random variable satisfying E(X 2 ) = oo, and define Xn = X for all n.

7.2 Solutions. Modes of convergence


1.

(a) By Minkowski's inequality,

let n --+ oo to obtain lim infn---+oo EIX~ I 2: EIX' 1. By another application ofMinkowski's inequality,

whencelimsupn 400 EIX~I ~ EIX'I.

323

[7.2.2]-[7.2.4]

Solutions

Convergence of random variables

(b) We have that


IE(Xn)- E(X)I = IE(Xn- X) I ~ EIXn- XI--+ 0
as n --+ oo. The converse is clearly false. If each Xn takes the values 1, each with probability
then E(Xn) = 0, but EIXn - 01 = 1.

i,

(c) By part (a), E(X~) --+ E(X 2 ). Now Xn _..;. X by Theorem (7.2.3), and therefore E(Xn) --+ E(X)
by part (b). Therefore var(Xn) = E(X~) - E(Xn) 2 --+ var(X).

2. Assume that Xn ~ X. Since IXnl ~ Z for all n, it is the case that lXI
Zn = IXn -XI satisfies Zn ~ 2Z a.s. In addition, if E > 0,

~ Z a.s. Therefore

As n--+ oo, lP'(IZnl >E) --+ 0, and therefore the last term tends to 0; to see this, use the fact that
E(Z) < oo, together with the result of Exercise (5.6.5). Now let E ,!.. 0 to obtain that EIZnl --+ 0 as
n --+ oo.
3. We have that X- n- 1 ~ Xn ~ X, so that E(Xn) --+ E(X), and similarly E(Yn) --+ E(Y). By
the independence of Xn and Yn,

{( 1) ( 1) }
X - -;;_

Y - -;;_

= E(XY) -

E(X) + E(Y)
1
n
+ n 2 --+ E(XY)

as n --+ oo, so that E(Xn Yn) --+ E(XY).

4. Let F1, F2, ... be distribution functions. As in Section 5.9, we write Fn --+ F if Fn(x) --+ F(x)
for all x at which F is continuous. We are required to prove that Fn --+ F if and only if d (Fn, F) --+ 0.
Suppose that d(Fn, F) --+ 0. Then, forE > 0, there exists N such that
F(x- E)- E ~ Fn(x)

F(x +E)+ E

for all x.

Take the limits as n --+ oo and E --+ 0 in that order, to find that Fn (x) --+ F (x) whenever F is
continuous at x.
Suppose that Fn --+ F. Let E > 0, and find real numbers a = x1 < x2 < < Xn = b, each
being points of continuity of F, such that
(i) Fi(a) < E for all i, F(b) > 1- E,
(ii) lxi+1 -Xi I < E for 1 ~ i < n.
In order to pick a such that Fi (a) < E for all i, first choose a 1 such that F(a 1) < iE and F is
continuous at a', then find M such that IFm (a')- F(a 1 ) I < iE form ::::: M, and lastly find a continuity
point a ofF such that a~ a 1 and Fm(a) < E for 1 ~ m < M.
There are finitely many points Xi, and therefore there exists N such that IFm(Xi)- F(xi)l < E
for all i and m ::::: N. Now, if m ::::: N and Xi ~ x < Xi+ 1,

and similarly
Fm(x)::::: Fm(Xi) > F(xi)- E :0:: F(x- E)- E.

Similar inequalities hold if x


d(Fm, F) --+ 0 as m --+ oo.

a or x ::::: b, and it follows that d(Fm, F) < E if m ::::: N. Therefore

324

Modes of convergence
5.

n:::::

Solutions [7.2.5]-[7.2.7]

(a) Suppose c > 0 and pick 8 such that 0 < 8 < c. Find N such that lP'(/Yn - cJ > 8) < 8 for
N. Now, for x::::: 0,

lP'(XnYn.::; x) .::; IP'(XnYn .::; X, /Yn- cJ .:'S

8) +

IP'(/Yn- c/ >

8)

.:'S lP' ( Xn .:'S c ~ 8 ) + 8,

and similarly

lP'(XnYn > x).::; IP'(XnYn >X, /Yn- cJ .:'S

8)

+ 8 .:'S lP' ( Xn > c: 8 ) + 8.

Taking the limits as n --+ oo and 8 .,),. 0, we find that lP'(XnYn .::; x) --+ lP'(X .::; xfc) if xfc is a point
of continuity of the distribution function of X. A similar argument holds if x < 0, and we conclude
that XnYn

~ eX if c > 0. No extra difficulty arises if c < 0, and the case c = 0 is similar.

For the second part, it suffices to prove that Y,;- 1 ~ c- 1 if Yn ~ c (# 0). This is immediate
from the fact that IYn- 1 - c- 1 1 < E/{/c/(/c/- E)} if /Yn - c/ < E ( < Jcl).
(b) Let E > 0. There exists N such that

lP'(/Xn/ >E)< E,

IP'(/Yn- Y/ >E)< E,

ifn

::0:: N,

and in addition lP'(/YI > N) <E. By an elementary argument, g is uniformly continuous at points of
the form (0, y) for Jy/ .::; N. Therefore there exists 8 (> 0) such that

/g(x 1, y 1) - g(O, y)/ <

if Jx'J .::; 8, Jy'-

y/ .::; 8.

If /Xn/.::; 8, /Yn- Y/ .::; 8, and IYI .::; N, then Jg(Xn, Yn)- g(O, Y)J < E, so that

IP'(/g(Xn, Yn)- g(O, Y)J

::0:: E).::;

for n::::: N. Therefore g(Xn, Yn)

6.

lP'(/Xn/ > 8) + IP'(IYn- Y/ > 8) + lP'(/Y/ > N).::; 3E,

p
---+ g(O,

Y) as n--+ oo.

The subset A of the sample space Q may be expressed thus:

nU n
00

A=

00

00

{/Xn+m- Xn/ < k- 1 },

k=1 n=1m=1

a countable sequence of intersections and unions of events.


For the last part, define

X(w) = {

limn---+oo Xn(w)

ifw E A

ifw

rf. A.

The function X is .1"-measurable since A E :F.

7. (a) If Xn(w)--+ X(w) then cnXn(w)--+ cX(w).


(b) We have by Minkowski's inequality that, as n --+ oo,

E(/cnXn- cXJ').::; /cn/'E(/Xn- XI')+ /en- cJ'EJX'J--+ 0.


(c) If c = 0, the claim is nearly obvious. Otherwise c # 0, and we may assume that c > 0. For
0 < E < c, there exists N such that /en - cJ < E whenever n ::::: N. By the triangle inequality,
/cnXn- eX/ .::; /cn(Xn- X)/+ /(en- c)X/, so that, for n ::0:: N,

IP'(/cnXn- eX/> E).::; IP'(cn/Xn- X/> iE) + IP'(/cn- cJJXJ > iE)
.:'S lP' (/Xn -X/ >
--+ 0

E
) + lP' (/X/ >
E
)
2(c +E)
2/cn - cJ
as n --+ oo.

325

[7.2.8]-[7.3.1]

Solutions

Convergence of random variables

_!;. X, find random variables


~ cY, so that cnYn _!;. cY,

(d) A neat way is to use the Skorokhod representation (7.2.14). If Xn


Yn, Y with the same distributions such that Yn
implying the same conclusion for the X's.

8.

Y. Then cnYn

If X is not a.s. constant, there exist real numbers c and E such that 0 < E <

lP'(X > c +E) > 2E. Since Xn

i and lP'( X < c) > 2E,

!.. X, there exists N such that

lP'(Xn<c)>E,

lP'(Xn>c+E)>E,

ifn;:::N.

Also, by the triangle inequality, IXr- Xsl _:::: IXr- XI+ IXs- XI; therefore there exists M such
that IP'(IX, - Xs I > E) < E3 for r, s ::::: M. Assume now that the Xn are independent. Then, for
r, s ::::: max{M, N}, r =f. s,
E3 > IP'(IXr - Xs I > E)

IP'(Xr < c, Xs > c +E) = lP'(Xr < c)lP'(Xs > c +E) > E2 ,

::0::

a contradiction.
9. Either use the fact (Exercise (4.12.3)) that convergence in total variation implies convergence in
distribution, together with Theorem (7.2.19), or argue directly thus. Since luOI :::: K < oo,
IE(u(Xn))- E(u(X))I = jL:u(k){fn(k)- /(k)}l _:::: KL lfn(k)- /(k)l--+ 0.
k

10. The partial sum Sn

= 2::~= 1

X, is Poisson-distributed with parameter an

= 2::~= 1 A.,.

For fixed

x, the event {Sn _:::: x} is decreasing inn, whence by Lemma (1.3.5), if an --+ a < oo and xis a
non-negative integer,
oo

lP' ( LXr
r=1

:::=x =n~~lP'(Sn ::;x)

-crj

)=0

L~

Hence if a < oo, 2::~ 1 X, converges to a Poisson random variable. On the other hand, if an --+ oo,

L:J=

then e-crn
0 aj I j! --+ 0, giving that IP'(2::~ 1 X, > x)
diverges with probability 1, as required.

I for all

x,

and therefore the sum

7.3 Solutions. Some ancillary results


1.

(a) If IXn - Xm I > E then either IXn- XI > iE or IXm -XI > iE, so that

as n, m --+ oo, forE > 0.


Conversely, suppose that {Xn} is Cauchy convergent in probability. For each positive integer k,
there exists nk such that
forn,

m::::: nk.

The sequence (nk) may not be increasing, and we work instead with the sequence defined by N1 = n1,
Nk+1 = max{Nk + 1, nk+d We have that
LlP'(IXNk+J- XNkl::::: Tk) <
k

326

00,

Solutions

Some ancillary results

[7.3.2]-[7.3.3]

whence, by the first Borel-Cantelli lemma, a.s. only finitely many of the events {IXNk+l - XNk I 2::
2-k} occur. Therefore, the expression
00

X= XN 1 + L(XNk+i - XNk)
k=1
converges absolutely on an event C having probability one. Define X(w) accordingly for wE C, and
X(w) = 0 for w rf. C. We have, by the definition of X, that XNk ~ X ask-+ oo. Finally, we 'fill
in the gaps'. As before, forE > 0,

as n, k-+ oo, where we are using the assumption that {Xn} is Cauchy convergent in probability.
(b) Since Xn

!.. X, the sequence {Xn} is Cauchy convergent in probability. Hence


IP'(IYn- Yml >E)= IP'(IXn- Xml >E)-+ 0

as n, m-+ oo,

forE > 0. Therefore {Yn} is Cauchy convergent also, and the sequence converges in probability to
some limit Y. Finally, Xn
2.

... X and Yn ... X, so that X andY have the same distribution.

Since An ~ U;;'=n Am, we have that

lJ

lJ

limsuplP'(An) _:::: lim lP'(


Am) = lP'( lim
Am) = lP'(An i.o.),
n--+oo
n--+oo
m=n
n--+oo m=n
where we have used the continuity of lP'. Alternatively, apply Fatou's lemma to the sequence (4 ~ of
indicator functions.
3. (a) Suppose Xzn = 1, Xzn+1 = -1, for n 2:: 1. Then {Sn = 0 i.o.} occurs if X 1 = -1, and not
if X 1 = 1. The event is therefore not in the tail a-field of the X's.
(b) Here is a way. As usual, lP'(Szn = 0) =

e:)

{p(l - p W, so that

LlP'(Szn = 0) < oo

if p

=f.

i.

implying by the first Borel-Cantelli lemma that lP'(Sn = 0 i.o.) = 0.


(c) Changing the values of any finite collection of the steps has no effect on I = lim inf Tn and J =
lim sup Tn, since such changes are extinguished in the limit by the denominator '.[ii'. Hence I and J
are tail functions, and are measurable with respectto the tail a -field. In particular, {I _:::: - x} n {J ::::: x}
lies in the a -field.
Take x = 1, say. Then, lP'(/ _:::: -1) = lP'(J 2:: 1) by symmetry; using Exercise (7.3.2) and the
central limit theorem,
lP'(J 2:: 1) 2:: lP'(Sn 2:: .[ii i.o.) 2:: lim sup lP'(Sn 2::
n--+oo

,Jfi,) =

1 - <I>(l) > 0,

where <I> is theN (0, 1) distribution function. Since {J 2:: 1} is a tail event of an independent sequence,
it has probability either 0 or 1, and therefore lP'(/ _:::: -1) = lP'(J 2:: 1) = 1, and also lP'(/ _:::: -1, J 2::
1) = 1. That is, on an event having probability one, each visit of the walk to the left of -.[ii is
followed by a visit of the walk to the right of .[ii, and vice versa. It follows that the walk visits 0
infinitely often, with probability one.

327

[7.3.4]-[7.3.8]

Solutions

Convergence of random variables

4. Let A be exchangeable. Since A is defined in terms of the Xi, it follows by a standard result of
measure theory that, foreachn, thereexistsaneventAn E a(X1, X2, ... , Xn), suchthatlP'(Ab.An)--+
0 as n --+ oo. We may express An and A in the form
An = {Xn E Bn},

where Xn

A = {X E B},

= (X 1, X2, ... , Xn), and Bn and Bare appropriate subsets ofll~n and JR 00 Let
A~ ={X~

Bn},

A'= {X 1 E B},

where X~ = (Xn+1, Xn+2, ... , X2n) and X'= (Xn+1, Xn+2, ... , X2n, X1, X2, ... , Xn, X2n+1,
X2n+2, ).
Now !P'(An n A~) = lP'(An)lP'(A~), by independence. Also, !P'(An) = lP'(A~), and therefore

!P'(A

By the exchangeability of A, we have that lP'(A t. A~) = !P'(A' t. A~), which in turn equals
t. An), using the fact that the Xi are independent and identically distributed. Therefore,
liP'( An

n A~)

- lP'(A) I ::: lP'(A

t.

An)

+ !P'(A t. A~) --+ 0

as n --+ oo.

Combining this with(*), we obtain that lP'(A) = lP'(A) 2, and hence lP'(A) equals 0 or 1.

5. The value of Sn does not depend on the order of the first n steps, but only on their sum. If Sn = 0
i.o., then S~ = 0 i.o. for all walks {S~} obtained from {Sn} by permutations of finitely many steps.
6. Since f is continuous on a closed interval, it is bounded: If (y) I ::: c for all y E [0, 1]for some c.
Furthermore f is uniformly continuous on [0, 1], which is to say that, if E > 0, there exists 8 (> 0),
such that lf(y)- f(z)l < E if IY- zl :S 8. With this choice of E, 8, we have that IE(Z/Ac)l < E, and
IE(ZIA)I ::: 2clP'(A) ::: 2c

x(1 - x)

n 82

by Chebyshov's inequality. Therefore


2c
IE(Z)I < E + n 82 ,
which is less than 2E for values of n exceeding 2c/(E8 2).

7. If {Xn} converges completely to X then, by the first Borel-Cantelli lemma, IXn -XI > E only
finitely often with probability one, for all E > 0. This implies that Xn ~ X; see Theorem (7.2.4c).
Suppose conversely that {Xn} is a sequence of independent variables which converges almost
surely to X. By Exercise (7.2.8), X is almost surely constant, and we may therefore suppose that
Xn ~ c where c E JR. It follows that, for E > 0, only finitely many of the (independent) events
{IXn - cl > E} occur, with probability one. Using the second Borel-Cantelli lemma,
LlP'(IXn- cl >E)<

00.

8.

Of the various ways of doing this, here is one. We have that

(n)
2

-1

" ' XiXj


L

1~l<J~n

_n (_!_ ~

1 n L
z"--1

328

xi)2-

1
x2
n(n - 1) L. 1
z--1

Solutions [7.3.9]-[7.3.12]

Some ancillary results

Now n

-1

L;'i Xi

-+ f.l, by the law oflarge numbers (5.10.2); hence

P
n -1 L;'i Xi -+
~t (use Theorem

(7.2.4a)). It follows that (n- 1 L;'i Xi) 2 ~ ~t 2 ; to see this, either argue directly or use Problem
(7.11.3). Now use Exercise (7.2.7) to find that
n ( 1 n
--I: xi )2 -+p ~t 2 .
n- 1 n i=
1

Arguing similarly,

- - - """'x? ~ o
n(n- 1) ~
!=1

'

'

and the result follows by the fact (Theorem (7.3.9)) that the sum of these two expressions converges
in probability to the sum of their limits.
9.

Evidently,

Xn
) = -1lP' ( -->1+E
logn nl+E'

for lEI < 1.

By the Borel-Cantelli lemmas, the events An = {Xn/logn


-1 < E ::=: 0, and a.s. only finitely often for E > 0.

1 + E} occur a.s. infinitely often for

10. (a) Mills's ratio (Exercise (4.4.8) or Problem (4.14.1c)) informs us that 1- <P(x) ~ x- 1cp(x) as
x --+ oo. Therefore,
lP'(IXnl ~ y'2logn(l+E)) ~

.J2n logn(l + E)nO+E)

2.

The sum over n of these terms converges if and only if E > 0, and the Borel--Cantelli lemmas imply
the claim.
(b) This is an easy implication of the Borel--Cantelli lemmas.
11. Let X be uniformly distributed on the interval [-1, 1], and define Xn = /{X:S(-l)n /n) The distribution of Xn approaches the Bernoulli distribution which takes the values 1 with equal probability
The median of X n is 1 if n is even and - 1 if n is odd.

i.

12. (i) We have that

for x > 0. The result follows by the second Borel--Cantelli lemma.


(ii) (a) The stationary distribution 1l is found in the usual way to satisfy

Hence nk = {k(k + 1)}- 1 fork~ 1, a distribution with mean L;~ 1 (k + 1)- 1


(b) By construction, lP'(Xn ::=: Xo + n) = 1 for all n, whence

. sup -Xn ::=: 1)


lP' ( hm
n-+oo n
It may in fact be shown that lP'(lim supn-+oo Xn/n

= 0) = 1.

329

1.

= oo.

[7 .3.13]-[7.4.3]

Solutions

Convergence of random variables

13. We divide the numerator and denominator by Jna. By the central limit theorem, the former
converges in distribution to the N(O, 1) distribution. We expand the new denominator, squared, as

12:n

na 2

(Xr -

~-t)

2-

- -(Xna 2

r=1

~-t)

I:n (Xr

~-t)

1-

+-(Xa2

r=1

~-t)

By the weak law of large numbers (Theorem (5.10.2), combined with Theorem (7.2.3)), the first term
converges in probability to 1, and the other terms to 0. Their sum converges to 1, by Theorem (7.3.9),
and the result follows by Slutsky's theorem, Exercise (7.2.5).

7.4 Solutions. Laws of large numbers


1.

LetSn=X1+X2++Xn.Then
n

'"'L
1 -<
nE(S 2) =
n
i= 2 log i - log n

and therefore Snfn ~ 0. On the other hand, I::i lP'(IXil 2: i) = 1, so that IXil 2: i i.o., with
probability one, by the second Borel-Cantelli lemma. For such a value of i, we have that ISi - Si -11 2:
i, implying that Sn/ n does not converge, with probability one.

2.

Let the Xn satisfy

1
n

lP'(Xn = -n) = 1 - 2

lP'(Xn

= n3 -

n)

= 21 ,
n

whence they have zero mean. However,

implying by the first Borel-Cantelli lemma that lP'(Xnfn-+ -1) = 1. It is an elementary result of
real analysis that n - 1 I::~= 1 Xn -+ -1 if Xn -+ -1, and the claim follows.
3. The random variable N(S) has mean and variance AlSI = crd, where cis a constant depending
only on d. By Chebyshov's inequality,

lP'

(I

I )

N(S)
ISl-A :::

A (A)
~

.:5 E21SI =

By the first Borel-Cantelli lemma, I1Ski- 1N(Sk)- AI 2:

era

for only finitely many integers k, a.s.,

where Skis the sphere of radius k. It follows that N(Sk)/ISkl ~ A ask-+ oo. The same conclusion
holds ask -+ oo through the reals, since N(S) is non-decreasing in the radius of S.

330

Solutions

Martingales

[7.5.1]-[7.7.1]

7.5 Solutions. The strong law


1.

Let Iij be the indicator function of the event that X J lies in the i th interval. Then
n

log Rm

=L

Zm (i) log Pi

=L

i=l

where, for 1 .::0 j .::0 m, Yj


variables with mean

lij log Pi

=L

i=l }=I

l:t=I Iij

Yj

}=I

log Pi is the sum of independent identically distributed


n

lE(Yj) =

L Pi log Pi = -h.
i=l

By the strong law, m- 1 log Rm ~ -h.


2. The following two observations are clear:
(a) N(t) < n if and only if Tn > t,
(b) TN(t) .::0 t < TN(t)+I
If lE(X I) < oo, then lE(Tn) < oo, so that lP'(Tn > t) -+ 0 as t -+ oo. Therefore, by (a),
lP'(N(t) < n) = lP'(Tn > t) -+ 0

as t-+ oo,

implying that N(t) ~ oo as t-+ oo.


Secondly, by (b),
TN(t) < _t_ < TN(t)+I . ( 1 + N(t)-I).
N(t) - N(t)
N(t) + 1
Take the limit as t -+ oo, using the fact that Tn/n ~ lE(XJ) by the strong law, to deduce that
t/N(t) ~ lE(XJ).

By the strong law, Sn/n ~ lE(XJ) =I= 0. In particular, with probability 1, Sn = 0 only finitely
often.

3.

7.6 Solution. The law of the iterated logarithm


1.

The sum Sn is approximately N(O, n), so that


-~a

lP'(Sn > ylanlogn) = 1- <t>(Jalogn) <

~
alogn

for all large n, by the tail estimate of Exercise (4.4.8) or Problem (4.14.1c) for the normal distribution.
This is summable if a > 2, and the claim follows by an application of the first Borel-Cantelli lemma.

7.7 Solutions. Martingales


1.

Suppose i < j. Then

_I]}
= lE{ Xi [lE(SJ I So, S1, ... , SJ-I)- s1_I]} = 0

lE(XjXi) = lE{ lE[(SJ- Sj-I)Xi

331

I So, S1, ... , s1

[7. 7.2]-[7.8.2]

Solutions

Convergence of random variables

by the martingale property.


2.

Clearly EISnl < oo for all n. Also, for n 2: 0,

1 {
( 1 _ p,n+1)}
E(Sn+1 I Zo, z1' ... ' Zn) = p,n+1 E(Zn+1 I Zo, ... ' Zn)- m
1 - JL
1 {
( 1 _ p,n+1)}
= p,n+ 1 m+p,Zn-m
=Sn.
1 _p,
3.

CertainlyEISnl < ooforalln. Secondly,forn 2:1,


E(Sn+1 I Xo, X1, ... , Xn) = aE(Xn+1 I Xo, ... , Xn)
= (aa + 1)Xn +abXn-1,

+ Xn

which equals Sn if a= (1 - a)- 1.


4. The gambler stakes Zi = /i-1 (X 1, ... , Xi-1) on the ith play, at a return of Xi per unit. Therefore
Si = Si-1 + XiZi fori 2: 2, with S1 = X1Y. Secondly,

where we have used the fact that Zn+1 depends only on X1, X2, ... , Xn.

7.8 Solutions. Martingale convergence theorem


1. It is easily checked that Sn defines a martingale with respect to itself, and the claim follows from
the Doob-Kolmogorov inequality, using the fact that
n

E(S~) = I:var(Xj).
}=1

2. It would be easy but somewhat perverse to use the martingale convergence theorem, and so we
give a direct proof based on Kolmogorov 's inequality of Exercise (7 .8 .1 ). Applying this inequality to
the sequence Zm, Zm+1' ... where Zi =(Xi- EXi)/i, we obtain that Sn = Z1 + Z2 + + Zn
satisfies, for E > 0,
lP' ( max ISn- Sml
m<n<r
- -

~
~

>E) ::0 _!_2


E

var(Zn).

n=m+1

We take the limit as r -+ oo, using the continuity oflP', to obtain

1 00
1
lP' ( sup ISn- Sml > E ) ::0 2
2 var(Xn).
n:=::m
E n=m+1 n

.2:::

Now let m -+ oo to obtain (after a small step)


lP' ( lim sup ISn - Sm I ::0
m-;.oo n:=::m

E) = 1

for all

Any real sequence (xn) satisfying


lim sup lxn -

m-;.oo n:=::m

Xm

I ::0

332

for all

> 0,

> 0.

Solutions

Prediction and conditional expectation

[7.8.3]-[7.9.4]

is Cauchy convergent, and hence convergent. It follows that Sn converges a.s. to some limit Y.
The last part is an immediate consequence, using Kronecker's lemma.
3. By the martingale convergence theorem, S = limn-+oo Sn exists a.s., and Sn
Exercise (7.2.1c), var(Sn) --+ var(S), and therefore var(S) = 0.

m.s.

S. Using

7.9 Solutions. Prediction and conditional expectation


1. (a) Clearly the best predictors are lE(X
(b) We have, after expansion, that

I Y) = Y 2 , lE(Y I X) = 0.

since lE(Y) = lE(Y 3 ) = 0. This is a minimum when b = JE(Y 2 ) = ~.and a = 0. The best linear
predictor of X given Y is therefore ~ .
Note that lE(Y I X) = 0 is a linear function of X; it is therefore the best linear predictor of Y
given X.
2.

By the result of Problem (4.14.13), lE(Y I X) =

3.

Write

~-tz

+ paz(X -

~-tdfal,

in the natural notation.

g(a) = LaiXi =aX',


i=l

and
v(a) = lE{ (Y- g(a)) 2 } = lE(Y 2 ) - 2alE(YX1) + aVa'.
Let a be a vector satisfying Vi' = lE(YX'). Then
v(a) - v(a) = aVa' - 2alE(YX1) + 2ilE(YX1) -iVa'
= aVa'- 2aVa' +iVa'= (a- a)V(a- a)' :::: 0,
since Vis non-negative definite. Hence v(a) is a minimum when a =a, and the answer is g(i). If V
is non-singular, a= JE(YX)V- 1 .
4. Recall that Z = JE(Y I fj,) is the ('almost') unique fJ,-measurable random variable with finite
mean and satisfying JE{(Y- Z)/G} = 0 for all G E fJ,.

(i) Q E fJ,, and hence lE{lE(Y I fj,)/n} = lE(Z/n) = lE(Y /g).


(ii) U = alE(Y I fj,) + ,BlE(Z I fj,) satisfies
JE(U /G) = alE{lE(Y I fj,)/G} + ,BlE{lE(Z I fj,)/G}
= alE(Y I G)+ ,BlE(Z/G) = lE{ (aY + ,BZ)IG },
Also, U is fJ,-measurable.
(iii) Suppose there exists m (> 0) such that G = {lE(Y I fj,) < -m} has strictly positive probability.
Then G E fJ,, and so JE(Y /G) = lE{lE(Y I fj,)/G}. However Y IG :::: 0, whereas lE(Y I fj,)/G < -m.
We obtain a contradiction on taking expectations.
(iv) Just check the definition of conditional expectation.
(v) If Y is independent of fJ,, then lE(Y /G)= lE(Y)lP'(G) for G E fJ,. Hence JE{(Y -JE(Y))/G} = 0 for
G E fJ,, as required.

333

[7.9.5]-[7.10.1]

Solutions

Convergence of random variables

(vi) If g is convex then, for all a E JR, there exists A.(a) such that
g(y) 2: g(a)

+ (y- a)A.(a);

furthermore A. may be chosen to be a measurable function of a. Set a = E(Y I fJ,) andy = Y, to


obtain
g(Y) ::: g{E(Y I fJ,)} + {Y - E(Y I fJ,) }A.{E(Y I fJ,) }.
Take expectations conditional on g,, and use the fact that E(Y I fJ,) is [J,-measurable.
(vii) We have that
IECYn I fJ.) -E(Y I fJ.)I :':: E{IYn- Yll fJ.} :':: Vn
where Vn = E{supm;c:n IYm- Yll fJ.}. Now {Vn : n::: 1} is non-increasing and bounded below.
Hence V = limn--+oo Vn exists and satisfies V ::: 0. Also
E(V) :<:: E(Vn) = E {sup IYm- Yl},
m?:_n

which tends to 0 as m --+ oo, by the dominated convergence theorem. Therefore E(V) = 0, and hence
lP'(V = 0) = 1. The claim follows.

5.

E(Y

I X)=

X.

6. (a) Let {Xn : n ::: 1} be a sequence of members of H which is Cauchy convergent in mean
square, that is, E{IXn - Xm 12 } --+ 0 as m, n --+ oo. By Chebyshov's inequality, {Xn : n ::: 1} is
Cauchy convergent in probability, and therefore converges in probability to some limit X (see Exercise
(7.3.1)). It follows that there exists a subsequence {Xnk : k ::: 1} which converges to X almost surely.
Since each Xnk is fJ,-measurable, we may assume that X is fJ,-measurable. Now, as n --+ oo,

where we have used Fatou's lemma and Cauchy convergence in mean square. Therefore Xn ~ X.
That E(X 2 ) < oo is a consequence of Exercise (7.2.la).
(b) That (i) implies (ii) is obvious, since I G E H. Suppose that (ii) holds. Any Z ( E H) may be
written as the limit, as n --+ oo, of random variables of the form
m(n)
Zn =

a;(n)Ic;(n)

i=l

for reals a;(n) and events G;(n) in fJ,; furthermore we may assume that IZnl :<:: IZI. It is easy to see
that E{ (Y- M)Zn} = 0 for all n. By dominated convergence, E{(Y- M)Zn} --+ E{(Y- M)Z},
and the claim follows.

7.10 Solutions. Uniform integrability


1.

It is easily checked by considering whether lxl :<::a or IYI :<::a that, for a > 0,

Now substitute x = Xn andy= Yn, and take expectations.

334

Solutions

Uniform integrability

[7.10.2]-[7.10.6]

xn

2. (a) Let E > 0. There exists N such that lE(IXn < E if n > N. Now lEIXrl < oo, by
Exercise (7.2.1a), and therefore there exists 8 (> 0) such that
lE(IXnlr lA) < E for 1 S n S N,

lE(IXIr lA) < E,

for all events A such that lP'(A) < 8. By Minkowski's inequality,


{lE(IXn(/A)} 1/r S {lE(IXn- X(/A)} 1/r

+ {lE(IX(/A)} 1/r

S 2El/r

ifn > N

iflP'(A) < 8. Therefore {IXnr: n 2: 1} is uniformly integrable.


If r is an integer then {X~ : n 2: 1} is uniformly integrable also. Also X~ ~ xr since Xn ~ X
(use the result of Problem (7.11.3)). Therefore lE(X~) --+ E(Xr) as required.
(b) Suppose now that the collection {IXnlr : n 2: 1} is uniformly integrable and Xn ~X. We show
first that lEIXr I < oo, as follows. There exists a subsequence {Xnk : k 2: 1} which converges to X
almost surely. By Fatou's lemma,
lEIXrl = lE (liminf IXnklr) S
k-H>O

liminflEIX~kl
S suplEIX~I
k-+oo
n

< oo.

If E > 0, there exists 8 (> 0) such that


lE(IXriiA) < E,

lE(IX~IIA) < E

for all n,

whenever A is such that lP'(A) < 8. There exists N such that Bn(E) = {IXn- XI > E} satisfies
lP'(Bn(E)) < 8 for n > N. Consequently
lE(IXn-

xn S

Er

+ E (IXn- Xlr IBn(E)),

n > N,

of which the final term satisfies


{lE (IXn - Xlr IBn(E))} lfr S {lE (IX~IIBn(E))} lfr

+ {lE (IXr IIBn(E))} l/r

S 2El/r.

Therefore, Xn ~X.
3.

FixE > 0, and find a real number a such that g(x) > xjE if x >a. If b 2: a,
lE (IXnii(IXnl>bJ) < ElE{g(IXnl)} S EsupE{g(IXnl)},
n

whence the left side approaches 0, uniformly inn, as b --+ oo.

4.

Here is a quick way. Extinction is (almost) certain for such a branching process, so that Zn ~ 0,

and hence Zn ~ 0. If {Zn : n 2: 0} were uniformly integrable, it would follow that JE(Zn) --+ 0 as
n --+ oo; however lE(Zn) = 1 for all n.
5.

We may suppose that Xn, Yn, and Zn have finite means, for all n. We have that 0 S Yn - Xn S
p

Zn- Xn where, by Theorem (7.3.9c), Zn- Xn-+ Z- X. Also

lEIZn- Xn I = lE(Zn- Xn)--+ lE(Z- X)

= lEIZ- XI,

so that {Zn - Xn : n 2: 1} is uniformly integrable, by Theorem (7.10.3). It follows that {Yn - Xn :


n 2: 1} is uniformly integrable. Also Yn - Xn ~ Y- X, and therefore by Theorem (7.10.3),
lEIYn-Xnl-+ lEIY-XI,whichistosaythatlE(Yn)-lE(Xn)--+ lE(Y)-lE(X);hencelE(Yn)--+ E(Y).
It is not necessary to use uniform integrability; try doing it using the 'more primitive' Fatou's
lemma.

6. For any event A, lE(IXniiA) S JE(ZIA) where Z = supn IXnl The uniform integrability follows
by the assumption that lE(Z) < oo.

335

[7.11.1]-[7.11.2]

Solutions

Convergence of random variables

7.11 Solutions to problems


1.

lEI X~ I = oo for r 2: 1, so there is no convergence in any mean. However, if E > 0,


2

lP'(IXnl >E)= 1 - - tan- 1 (nE)-+ 0

as n-+ oo,

T(

so that Xn -+ 0.
You have insufficient information to decide whether or not Xn converges almost surely:
(a) Let X be Cauchy, and let Xn = Xjn. Then Xn has the required density function, and Xn ~ 0.
(b) Let the Xn be independent with the specified density functions. ForE > 0,

2 sin- 1 (
lP'(IXnl >E)=n

,h + n2E2

) "'2,
nnE

so that Ln lP'(IXnl >E)= oo. By the second Borel-Cantelli lemma, IXnl > E i.o. with probability
one, implying that Xn does not converge a.s. to 0.
2. (i) Assume all the random variables are defined on the same probability space; otherwise it is
meaningless to add them together.
(a) Clearly Xn(w) + Yn(w) -+ X(w) + Y(w) whenever Xn(w) -+ X(w) and Yn(w) -+ Y(w).
Therefore
{ Xn + Yn...,. X+ Y} ~ {Xn...,. X} U {Yn...,. Y},
a union of events having zero probability.
(b) Use Minkowski's inequality to obtain that
{lE (IXn + Yn -X- Yn} 1/r .::0 {lE(IXn -

xn} 1/r +

{lE(IYn - Yn} 1/r.

(c) If E > 0, we have that

and the probability of the right side tends to 0 as n -+ oo.


(d) If Xn ~ X and the Xn are symmetric, then -Xn ~ X. However Xn + (-Xn) ~ 0, which
generally differs from 2X in distribution.
(ii) (e) Almost-sure convergence follows as in (a) above.
(f) The corresponding statement for convergence in rth mean is false in general. Find a random
variable Z such that lEIZrl < oo but lEIZ 2rl = oo, and define Xn = Yn = Z for all n.
p

(g) Suppose Xn -+ X and Yn -+ Y. Let E > 0. Then

lP'(IXnYn- XYI >E)= lP'(I(Xn- X)(Yn- Y) + (Xn- X)Y + X(Yn-

nl >E)

.::0 lP'(IXn- XIIYn- Yl >~E) +lP'(IXn- XIIYI >~E)

+ lP'(IXI IYn- Yl > ~E).


Now, for 8 > 0,
lP'(IXn- XIIYI >~E) .::0 lP'(IXn- XI> E/(38)) +lP'(IYI >

336

8),

Solutions

Problems

[7 .11.3]-[7.11.5]

which tends to 0 in the limit as n --+ oo and 8 --+ oo in that order. Together with two similar facts, we
obtain that XnYn ~ XY.
(h) The example of (d) above indicates that the corresponding statement is false for convergence in
distribution.
3. Let E > 0. We may pick M such that lP'(IXI 2: M) .::::E. The continuous function g is uniformly
continuous on the bounded interval [- M, M]. There exists 8 > 0 such that
lg(x)- g(y)l
If lg(Xn)- g(X)I >

E,

.:::= E

.:::=

8 and lxl

.:::=

M.

then either IXn- XI > 8 or lXI 2: M. Therefore

lP'(Ig(Xn)- g(X)I

>E) _:: : lP'(IXn- XI > 8)

in the limit as n --+ oo. It follows that g(Xn)

4.

if lx- Yl

+ JI(IXI 2:

M)

--+ lP'(IXI 2:

M) _:: : E,

g(X).

Clearly
lE(eitXn)

n
.
n { 1 1
=II
lE(eitYj/101) =II _. - e .
.
.
.
10
1 _ eztf10J

it/10j-l }

]=1

]=1

1- eit
wn(l-eitjlOn)

1 - eit
--+----

it

as n --+ oo. The limit is the characteristic function of the uniform distribution on [0, 1].
Now Xn .::; Xn+1 _:::: 1 for all n, so that Y(w) = limn---+oo Xn(w) exists for all w. Therefore
Xn

5.

(a) Supposes < t. Then

Y; hence Xn

Er Y, whence Y has the uniform distribution.

lE(N(s)N(t)) = JE(N(s) 2) +lE{N(s)(N(t)- N(s))} = JE(N(s) 2) +lE(N(s))lE(N(t)- N(s)),


since N has independent increments. Therefore
cov(N(s), N(t)) = lE(N(s)N(t)) -JE(N(s))JE(N(t))
= (A.s) 2 + A.s + A.s{A.(t- s)}- (A.s)(A.t) = A.s.
In general, cov(N(s), N(t)) =A. min{s, t}.
(b) N(t +h)- N(t) has the same distribution as N(h), if h > 0. Hence

which tends to 0 as h --+ 0.


(c) By Markov's inequality,

lP'(IN(t+h)-N(t)l

>E).:::: E12lE({N(t+h)-N(t)} 2),

which tends to 0 as h --+ 0, if E > 0.


(d) Let E > 0. For 0 < h < E- 1,
lP'

(I

N(t +

h~- N(t) >E)

= lP'(N(t +h)- N(t):::

337

1)

= A.h + o(h),

[7.11.6]-[7.11.10]

Solutions

Convergence of random variables

which tends to 0 as h --+ 0.


On the other hand,

which tends to oo as h -J, 0.

6.

By Markov's inequality, Sn

= 2:7= 1 X;

satisfies

forE > 0. Using the properties of the X's,

since E(X;) = 0 for all i. Therefore there exists a constant C such that

implying (via the first Borel--Cantelli lemma) that n- 1 Sn ~ 0.

7.

We have by Markov's inequality that


<

00

forE > 0, so that Xn ~ X (via the first Borel-Cantelli lemma).

8. Either use the Skorokhod representation or characteristic functions. Following the latter route,
the characteristic function of aXn + b is

where >n is the characteristic function of Xn. The result follows by the continuity theorem.

9.

For any positive reals c, t,


li"(X > t)
-

= li"(X + c >- t + c)

<
-

E{(X+c) 2 }
.
(t +c) 2

Set c = a- 2 jt to obtain the required inequality.

10. Note that g(u) = uj(l + u) is an increasing function on [0, oo). Therefore, forE > 0,
lP' X >
(I nl

=11" (

IXnl
l+E
> -E- ) < - .JE: ( IXnl )
l+IXnl
l+E E
l+IXnl

338

Solutions [7.11.11]-[7.11.14]

Problems

by Markov's inequality. If this expectation tends to 0 then Xn

~ 0.

Suppose conversely that Xn -+ 0. Then


E(

IXnl )
E
E
:':: - - li"(IXnl :'::E)+ 1 lP'(IXnl >E)--+-1+1Xnl
1+E
1+E

as n --+ oo, for E > 0. However E is arbitrary, and hence the expectation has limit 0.

11. (i) The argument of the solution to Exercise (7.9.6a) shows that {Xn} converges in mean square
if it is mean-square Cauchy convergent. Conversely, suppose that Xn ~ X. By Minkowski's
inequality,

as m, n--+ oo, so that {Xn} is mean-square Cauchy convergent.


(ii) The corresponding result is valid for convergence almost surely, in rth mean, and in probability. For
a.s. convergence, it is self-evident by the properties of Cauchy-convergent sequences of real numbers.
For convergence in probability, see Exercise (7.3.1). For convergence in rth mean (r ~ 1), just adapt
the argument of (i) above.

12. Ifvar(Xi)::: M for all i, the variance of n- 1 Lt=1

xi is

1 n

2 L:var(Xi) :=:---+ 0
n i=1
n

as n--+ oo.

13. (a) We have that


JP>(Mn ::0 GnX) = F(anx)n --+ H(x)

If x :=: 0 then F (an X)n --+ 0, so that H (x)

= 0.

as n--+

00.

Suppose that x > 0. Then

-logH(x) =- lim {nlog[1- (1- F(anx))]} =


n----+-oo

lim {n(1- F(anx))}

n---+oo

since -y- 1 log(1 - y) --+ 1 as y t 0. Setting x = 1, we obtain n(l - F(an)) --+ -log H(l), and
the second limit follows.
(b) This is immediate from the fact that it is valid for all sequences {an}.
(c) We have that

1- F(tex+y)
1 - F(t)

1- F(tex+y)
1- F(tex)

1-F(tex)

1 - F(t)

logH(eY)

--+ --=--='---:::c=:'-:-:-c'log H(1)

logH(ex)

log H(l)

as t --+ oo. Therefore g(x + y) = g(x)g(y). Now g is non-increasing with g(O) = 1. Therefore
g(x) = e-f3x for some ,B, and hence H(u) = exp( -au-f3) for u > 0, where a= -log H(l).

14. Either use the result of Problem (7.11.13) or do the calculations directly thus. We have that
JP>(Mn :=oxn/rr)

= {~ +~tan- 1

c:) r=

{1-

~tan- 1 (;J

if x > 0, by elementary trigonometry. Now tan- 1 y = y + o(y) as y--+ 0, and therefore


as n --+ oo.

339

[7.11.15]-[7.11.16]

Solutions

Convergence of random variables

15. The characteristic function of the average satisfies


as n--+ oo.
By the continuity theorem, the average converges in distribution to the constant JL, and hence in
probability also.

16. (a) With Un = u(xn), we have that

if /lu/1 00 :::0 1. There is equality if Un equals the sign of fn - gn. The second equality holds as in
Problem (2.7.13) and Exercise (4.12.3).
(b) Similarly, if /lu/1 00 :::0 I,
IE(u(X))- E(u(Y))I :::0

L:

lu(x)l lf(x)- g(x)l dx :::0

L:

lf(x)- g(x)l dx

with equality if u(x) is the sign of f(x)- g(x). Secondly, we have that
llP'(X

A) -lP'(Y

A) I

i IE(u(X))- E(u(Y))I :::0 idTV(X, Y),

where
u(x)- {

if X E A,

-1

ifx.A.

Equality holds when A= {x E ffi(: f(x) :::: g(x)}.


(c) Suppose dTv(Xn, X)--+ 0. Fix a E ffi(, and let u be the indicator function of the interval ( -oo, a].
Then IE(u(Xn))- E(u(X))I = llP'(Xn :::0 a) -lP'(X :::0 a)l, and the claim follows.
On the other hand, if Xn = n- 1 with probability one, then Xn ..!?.. 0. However, by part (a),
dTV(Xn, 0) = 2 for all n.
(d) This is tricky without a knowledge of Radon-Nikodym derivatives, and we therefore restrict
ourselves to the case when X and Y are discrete. (The continuous case is analogous.) As in the
solution to Exercise (4.12.4), lP'(X =I Y) ::=:: idTv(X, Y). That equality is possible was proved for
Exercise (4.12.5), and we rephrase that solution here. Let J.Ln = min{fn, gn} and JL = Ln J.Ln, and
note that
dTV(X, Y) = L lfn- gnl = LUn + gn- 2J.Ln} = 2(1- JL).
n

It is easy to see that


1

z:dTV(X, Y) = lP'(X

=f

Y) =

1 if JL = 0,
ifJL=l,

and therefore we may assume that 0 < JL < I. Let U, V, W be random variables with mass functions
lP'(u _
-

Xn

) _ J.Ln
JL

lP'(V _
'

Xn

) _ max{fn- gn, 0} lP'(W _


'
l-JL

) _ - min{fn- gn, 0}
Xn

1-J.L

'

and let Z be a Bernoulli variable with parameter JL, independent of (U, V, W). We now choose the
pair X', Y' by
(X' Y') = { (U, U) if Z = I,
'
(V,W) ifZ=O.

340

Solutions [7.11.17]-[7.11.19]

Problems

It may be checked that X' andY' have the same distributions as X andY, and furthermore, li"(X'
Y 1) = li"(Z = 0) = 1- fi = idTV(X, Y).

i=

(e) By part (d), we may find independent pairs (X;, Y[), 1 :::0 i :::0 n, having the same marginals as
(Xi, Yi), respectively, and such thatli"(X;

:::: 211" (

i=

Y[) = idTV(Xi, Yi). Now,

L x; i= L:rf
i=l

n
::::2 Lll"(x;

i=l

i= Y[)

i=l

= 2

LdTV(Xj, Yj).
i=l

17. If XJ, X2, ... are independent variables having the Poisson distribution with parameter A, then
Sn = X1 +X2 + +Xn has the Poisson distribution with parameter An. Now n- 1sn -SA, so that
IE(g(n- 1Sn)) -+ g(A) for all bounded continuous g. The result follows.

18. The characteristic function 1/Jmn of


(Xn - n)- (Ym - m)
~

Umn=

vm+n

satisfies
log 1/lmn (t) = n (eit!.../m+n -

1)

+ m (e-it!.../m+n -

1)

+ (m - n)it -+ _lt2
2
Jm+n

as m, n-+ oo, implying by the continuity theorem that Umn -S N(O, 1). Now Xn + Ym is Poissondistributed with parameter m + n, and therefore

_
v:mn-

J+

Xn Ym P 1
-+
m+n

asm,n-+ oo
D

by the law oflarge numbers and Problem (3). It follows by Slutsky's theorem (7.2.5a) that Umn IVmn -+
N(O, 1) as required.

19. (a) The characteristic function of Xn is 1/Jn (t) = exp{i f-tnt - ia;t 2 } where fin and a; are
I 2

the mean and variance of Xn. Now, limn-+oo 1/Jn(l) exists. However 1/Jn(l) has modulus e-'Ian,
and therefore a 2 = limn-+oo a; exists. The remaining component eitLnt of 1/Jn (t) converges as
n -+ oo, say eitLnt -+ 8(t) as n -+ oo where (t) lies on the unit circle of the complex plane. Now

I 2 2

1/Jn(t) -+ 8(t)e-'Ia
which is required to be a characteristic function; therefore e is a continuous
function oft. Of the various ways of showing that 8(t) = eitLt for some f-t, here is one. The sequence
1/Jn(t) = eitLnt is a sequence of characteristic functions whose limit 8(t) is continuous at t = 0.
Therefore e is a characteristic function. However 1/1n is the characteristic function of the constant fin,
which must converge in distribution as n -+ oo; it follows that the real sequence {/In} converges to
some limit f-t, and 8(t) = eitLt as required.
This proves that 1/Jn(t)-+ exp{ifl,t- ia 2 t 2 }, and therefore the limit X is N(fl,, a 2).
1 ,

(b) Each linear combinations Xn + t Yn converges in probability, and hence in distribution, to s X +t Y.


Now s Xn + t Yn has a normal distribution, implying by part (a) that s X+ t Y is normal. Therefore the
joint characteristic function of X and Y satisfies

1/Jx,r(s, t) = I/JsX+tr(1) = exp{iE(sX + tY)- i var(sX + tY)}

ap)}

= exp{i(sf-tx + t{ty)- i<s 2 a_k + 2stpxyaxay + t 2

341

[7.11.20]-[7.11.21]

Solutions

Convergence of random variables

in the natural notation. Viewed as a function of s and t, this is the joint characteristic function of a
bivariate normal distribution.
When working in such a context, the technique of using linear combinations of Xn and Yn is
sometimes called the 'Cramer-Wold device'.

20. (i) Write


n --+ oo,

=X; -E(X;) and

Y;

E(Tn fn )

Tn

It suffices to show that n- 1Tn ~ 0. Now, as

I:t=l Y;.

I n
2
= 2 l:var(X;) + 2
n i=1
n

(ii) Let E > 0. There exists I such that lp(X;,

2:::

i,J=l

nc

2--+ 0.

Xj) ::0

l::Oi<} ::on

Xj )I

::: E if li - jl

2::: cov(X;, X1)::: 2:::

cov(X;,

2:::

cov(X;,Xj)+

li-}19
l::Oi,J::On

::=:: /.

Now

cov(X;,X1):::2n/c+n 2 Ec,

li-}1>1
1::oi,}::On

sincecov(X;, Xj)::: lp(Xi, Xj)lyivar(X;) var(Xj) Therefore,


2

2/c
n

E(Tn / n ) ::0 -

+ EC --+ EC

as n--+ oo.

This is valid for all positive E, and the result follows.

21. The integral


r)O_c_dx
12 x log lxl
diverges, and therefore IE( X I) does not exist.
The characteristic function cp of X 1 may be expressed as
,~., )

'f'(t

= 2C

100
2

whence
cp(t)- cp(O) = _ [
2c
12

cos(tx) d X
x 2 Iogx

---

00 1 -

cos(tx) dx.
x 2 Iogx

Now 0::: I- cose::: min{2, 8 2 }, and therefore

Icp(t)-2c cp(O) I -<


Now

11/t
2

11u

and

!00

t2
2
--dx+
---dx
Iogx
1/t x2Iogx
'

dx
----+0

Therefore

as u--+ oo,

Iogx

2
1oo 2 dx
- - dx
1uoo -x2Iogx
-Iogu u x 2
I
< --

if t > 0.

Icp(t) ~ cp(O) I = o(t)

2
= --

ulogu'

as t

u > I.

0.

Now cp is an even function, and hence cp' (0) exists and equals 0. Use the result of Problem (7.ll.l5)
to deduce that n- 1 I:'i Xi converges in distribution to 0, and therefore in probability also, since 0 is
constant. The X; do not obey the strong law since they have no mean.

342

Solutions [7.11.22]-[7.11.24]

Problems

22. If the two points are U and V then

and therefore
12
1~
2p1
-Xn = - L)Ui -Vi) -+ n
n i= 1
6

as n

oo,

--+

by the independence of the components. It follows that Xn/ ,fii ~


Problem (7 .11.3) or by the fact that

1/ .J6 either by the result of

23. The characteristic function of Yj = Xj 1 is


4J(t) = 1

. + e-ztfx)
. dx = lo1 cos(tjx) dx =It I100 -cosy- dy
lo 1(eztfx
ltl
Y2

2 0

by the substitution x = It II y. Therefore


l/J(t) = 1- It I

100
ltl

1- cosy

as t--+ 0,

dy = 1- /It I+ o(ltl)

where, integrating by parts,

-looo 1 - cos y dy--looo -sin-u du-_ -.

I-

11:

It follows that Tn = n- 1 ',)= 1 Xj 1 has characteristic function

as t--+ oo, whence 2Tn/7r is asymptotically Cauchy-distributed. In particular,


1P'(I2Tn/rrl >

2100

1)--+ -

11:

du
1+u

1
2

as t

--2 =-

--+

oo.

24. Let mn be a non-decreasing sequence of integers satisfying 1 ::: mn < n, mn

noting that Ynk takes the value 1 each with probability


'Z= 1 Ynk Then
n

lP'(Un-:/= Zn) :"':

LlP'(IXkl ~

i whenever mn

--+

< k < n. Let Zn

mn) :"':

L
k=mn

k=1

343

k2

--+

oo, and define

as n--+ oo,

[7.11.25]-[7.11.26]

Solutions

Convergence of random variables

from V.:hich it follows that Un/ Jn

-S N(O, 1) if and only if Zn/ Jn -S N(O, 1). Now


mn

Zn =

Ynk

+ Bn-mn

k=1

where Bn-mn is the sum of n- mn independent summands each of which takes the values 1, each
possibility having probability
Furthermore

1 mn
2
-2:Ynk ::S mn
Jn k=1
Jn

which tends to 0 if mn is chosen to be mn =


n- 1 Bn-mn

Ln 1/ 5J; with this choice for mn, we have that

-S N(O, 1), and the result follows.

Finally,
var(Un)

t (2 - ~)

k=1

so that
1 n 1
var(Un/ Jn) = 2- 2 --+ 2.
n k=1 k

25. (i) Let c/Jn and cp be the characteristic functions of Xn and X. The characteristic function !frk of
XNk is
00

!frk(t)

= 2::: cfJ1 (t)f'(Nk = J)


}=1

whence
00

11/fk(t)-

cp(t)l:::::

L lc/Jj(t)- cp(t)llP'(Nk = j).


}=1

Let E > 0. We have that ifJ} (t) --+ cp (t) as j --+ oo, and hence for any T > 0, there exists J (T) such that
lifJ} (t) - cp(t)l < E if j :::: J (T) and It I ::::: T. Finally, there exists K(T) such that lP'(Nk ::::: J (T)) ::::: E
if k :::: K (T). It follows that

if It I ::::: T and k :::: K(T); therefore !frk(t) --+ cp(t) ask--+ oo.
(ii) Let Yn = SUPm;:::n IXm -XI. ForE > 0, n :0:: 1,
lP'(IXNk- XI>

E)::::: lP'(Nk::::: n) + lP'(IXNk- XI> E, Nk


::::: lP'(Nk ::::: n)

+ lP'(Yn

> n)

> E) --+ lP'(Yn > E)

Now take the limit as n --+ oo and use the fact that Yn ~ 0.

26. (a) We have that


(
a(n

an- -,n
- =

+ 1, n)

ITk ( 1 i=O

"'l

k .)
-1. ) < exp ( - L..J
- .
n i=O n

344

ask --+

00.

Solutions

Problems

(b) The expectation is

L_g

En=

Jnn)

where the sum is over all j satisfying n - M Jn

(j-

:::

(nj+l _

Jn

j!

nl;~n

j ::: n. For such a value of j,

n) nle-n =e-n

Jn

[7.11.27]-[7.11.28]

nl
)
(j- l)! ,

j!

whence En has the form given.


(c) Now g is continuous on the interval [ -M, 0], and it follows by the central limit theorem that

I
I 2
g(x)--e-'1.x dx

En--+

.J2ii

- M

x
I 2
--e-'1.x dx

!oM

.J2ii

1-e-'J.M

------.,=-

.J2ii

Also,

where k

= LM JnJ.

Take the limits as n --+ oo and M --+ oo in that order to obtain


1

--<lim

.J2ii -

nn+2e-n
I

n--+ oo {

< --.

.J2ii

27. Clearly
E(Rn+1 I Ro, R1, ... , Rn)

+ 2).

since a red ball is added with probability Rn/(n

Rn
n+2

= Rn + - -

Hence

and also 0 ::: Sn ::: 1. Using the martingale convergence theorem, S = limn--+oo Sn exists almost
surely and in mean square.

28. Let 0 <

< ~, and let

k(t)

L8tJ, m(t)

= ro- E3)k(t)l,

n(t)

L(l

+ E3)k(t)J

and let Imn(t) be the indicator function of the event {m(t) ::: M(t) < n(t)}. Since M(t)jt ~ e, we
may find T such that EUmn(t)) > I-E fort~ T.
We may approximate SM(t) by the random variable sk(t) as follows. With Aj = {ISj - sk(t) I >

E.Jk(t)},
IP'(AM(t)) ::0 IP'(AM(t) lmn(t)
k(t)-1

= 1) + IP'(AM(t) lmn(t) = 0)
n(t)-1

:::lP'( U Aj)+lP'( U
j=m(t)
<
-

Aj)+IP'(lmn(t)=O)

j=k(t)

{k(t) - m(t) }a 2
E2k(t)

+
if t

345

{ n(t)- k(t) }a 2

E2k(t)
~

T,

+E

[7.11.29]-[7.11.32]

Solutions

Convergence of random variables

by Kolmogorov's inequality (Exercise (7.8.1) and Problem (7.11.29)). Send t --+ oo to find that
as t --+ oo.

Now Sk(t)/ .Jk(i)

N(O, a 2 ) as t --+ oo, by the usual central limit theorem. Therefore

which implies the first claim, since k(t)j(et) --+ 1 (see Exercise (7.2.7)). The second part follows by
Slutsky's theorem (7.2.5a).

29. We have that Sn = Sk

+ (Sn -

Sk), and so, for n :::: k,

Now S'fJAk :::: c 2 IAk; the second term on the right side is 0, by the independence of the X's, and the
third term is non-negative. The first inequality of the question follows. Summing over k, we obtain
E(S;) :::: c 2 1P'(Mn > c) as required.

30. (i) With Sn = Lt=1 xj, we have by Kolmogorov's inequality that


1 m+n
lP' ( max ISm+k- Sml > E) ::0 2
E(X~)
1~k~n
E k=m

forE > 0. Take the limit as m, n --+ oo to obtain in the usual way that {Sr : r :::: 0} is a.s. Cauchy
convergent, and therefore a.s. convergent, if 2::\"' E(X~) < oo. It is shorter to use the martingale
convergence theorem, noting that Sn is a martingale with uniformly bounded second moments.
(ii) Apply part (i) to the sequence Yk = Xkfbk to deduce that 2::~ 1 Xkfbk converges a.s. The claim
now follows by Kronecker's lemma (see Exercise (7.8.2)).

31. (a) This is immediate by the observation that


ei.(P) =

f Xo IJ PN_ij
lJ
i,j

(b) Clearly Lj Pij = 1 for each i, and we introduce Lagrange multipliers {JL; : i E S} and write
V = A.(P) + Li JLi Lj Pij. Differentiating V with respect to each Pij yields a stationary (maximum)
value when (N;j / Pij) + JLi = 0. Hence Lk N;k = - JLb and

z=E\

(c) We have that Nij =


Nik Ir where Iris the indicator function oftheeventthatthe rth transition
out ofi is to j. By the Markov property, the Ir are independent with constant mean Pij. Using the
strong law oflarge numbers and the fact that Lk N;k ~ oo as n --+ oo, Pij ~ E(/t) = Pij.
32. (a) If X is transient then V;(n) < oo a.s., and JLi = oo, whence V;(n)/n ~ 0 = ttj 1. If X is
persistent, then without loss of generality we may assume Xo = i. LetT (r) be the duration of the rth

346

Solutions [7.11.33]-[7.11.34]

Problems

excursion from i. By the strong Markov property, the T (r) are independent and identically distributed
with mean J.Li . Furthermore,
V;(n)-J

::r

V;(n)

V;(n)

::r

T(r) < < "


V;(n) - V;(n)

"

T(r)

By the strong law of large numbers and the fact that V; (n) ~ oo as n --+ oo, the two outer terms
sandwich the central term, and the result follows.
(b) Note that ~~,;;;J f(Xr) = ~iES f(i)V;(n). With Q a finite subset of S, and n; = J.Lj 1, the
unique stationary distribution,

::":

{"I

V;(n)
1
L.t ------:-

iEQ

J.Lz

1 )} 11/lloo,
+ "(V;(n)
L.t - - +---:i<f.Q

J.Lz

where 11/lloo = sup{l/(i)l : i E S}. The sum over i E Q converges a.s. to 0 as n--+ oo, by part (a).
The other sum satisfies
"(V;(n)
L.t - i<f.Q
n

1) -_2- L.t

"(Vi(n)
-iEQ
n

+-

J.Li

which approaches 0 a.s., in the limits as n --+ oo and Q

+ lrj )

S.

33. (a) Since the chain is persistent, we may assume without loss of generality that Xo = j. Define
the times RJ, R2, ... of return to j, the sojourn lengths SJ, S2, ... in j, and the times VJ, V2, ...
between visits to j. By the Markov property and the strong law of large numbers,
1

n
"S

-~

a.s.

1
1~
a.s.
- Rn = - L.t Vr ----+ J.Lj.
n
n r=l

r~-,

n r=l

gj

Also, Rn/ Rn+J ~ 1, since J.Lj = E(RJ) < oo. If Rn < t < Rn+l then

Let n --+ oo to obtain the result.


(b) Note by Theorem (6.9.21) that Pij (t) --+ nj as t --+ oo. We take expectations of the integral in
part (a), and the claim follows as in Corollary (6.4.22).
(c) Use the fact that

lot f(X(s))ds L l
=

l(X(s)=jJdS

jES 0

together with the method of solution of Problem (7 .11.32b).

34. (a) By the first Borel-Cantelli lemma, Xn = Yn for all but finitely many values of n, almost
surely. Off an event of probability zero, the sequences are identical for all large n.
(b) This follows immediately from part (a), since Xn - Yn = 0 for all large n, almost surely.
(c) By the above, a;;- 1 ~~~ (Xr - Yr) ~ 0, which implies the claim.

347

[7.11.35]-[7.11.37]

Solutions

Convergence of random variables

35. Let Yn = Xnl(IXn[.:Sa) Then,

by assumption (a), whence {Xn} and {Yn} are tail-equivalent (see Problem (7.11.34)). By assumption
(b) and the martingale convergence theorem (7.8.1) applied to the partial sums 2:::~= 1 (Yn - IE(Yn)),
the infinite sum 2:::~ 1 (Yn - IE(Yn)) converges almost surely. Finally, I:~ 1 IE(Yn) converges by
assumption (c), and therefore 2:::~ 1 Yn, and hence 2:::~ 1 Xn, converges a.s.

36. (a) Let n1 < n2 < < nr = n. Since the

h take only two values, it suffices to show that


r

lP'(/ns

= 1 for 1 .::5 s

.::5 r)

IT lP'Uns = 1).
s=1

Since F is continuous, the Xi take distinct values with probability 1, and furthermore the ranking of
X 1, X2, ... , Xn is equally likely to be any of the n! available. Let x1, x2, ... , Xn be distinct reals,
and write A= {Xi =Xi for 1 .::5 i .::5 n}. Now,
lP'Uns = 1 for 1 .::5 s .::5 r

I A)

1{(nns-1
- 1) (n- 1- ns-1)! }{(ns-ns-21 - 1) (ns-1- 1- ns-2)! } (n1- 1)!

=I
n.

and the claim follows on averaging over the Xi.


(b)WehavethatlE(h) = lP'(h = 1) = k- 1 and var(h) = r 1(1-k- 1), whence l::k var(h/logk) <
oo. By the independence of the h and the martingale convergence theorem (7.8.1), 2:::~ 1 (h k- 1) flog k converges a.s. Therefore, by Kronecker's lemma (see Exercise (7.8.2)),

t(/-~)
~0
1
J

- 1
logn }= 1
The result follows on recalling that

as n---+ oo.

I:,}= 1 j - 1 ~ log n as n ---+

oo.

37. By an application of the three series theorem of Problem (7.11.35), the series converges almost
surely.

348

Random processes

8.2 Solutions. Stationary processes


1.

With ai (n)

= lP'(Xn = i), we have that

cov(Xm, Xm+n) = lP'(Xm+n = 1 I Xm = 1)1P'(Xm = 1) -lP'(Xm+n = 1)1P'(Xm = 1)

= a1 (m)pu (n)- al (m)al (m + n),


and therefore,
a1 (m)pu (n)- a1 (m)al (m

+ n)

p(Xm,Xm+n)=-r=r~~~~~~~~~==T=~~

v'a1 (m)(l - a1 (m))al (m

Now, a1 (m)

+ n)(l -

a1 (m

+ n))

aj(a + fJ) as m ~ oo, and


a
a+fJ

Pll (n) = - whence p(Xm, Xm+n)

+ -fJ- (1
a+fJ

-a - fJ) ,

(1 -a- fJ)n as m ~ oo. Finally,

.
1 n
hm - LlP'(Xr

1)

= --.
a+ fJ

n-+oo n r=l

The process is strictly stationary if and only if Xo has the stationary distribution.

2.

We have that E(T(t)) = 0 and var(T(t)) = var(To) = 1. Hence:


(a) p(T(s), T(s + t)) = E(T(s)T(s + t)) = E[(-1)N(t+s)-N(sl] = e-2M.

(b) Evidently, E(X (t)) = 0, and


E[X(t) 2 ] = E ( l l T(u)T(v)dudv)

E(T(u)T(v))dudv=21 1 1v e- 2}..(v-u)dudv
~<u<v<t
v=O u=O

=2 {
=

3.

I1 ( t -

1
2A.

+ 21A. e -2At)

We show first the existence of the limitA.


g(x

+ y) =

It

as t ~ oo.

= limq.o g(t)jt, where g(t) = lP'(N(t)

+ y) > 0)
= lP'(N(x) > 0) + lP'({N(x) = 0} n
for x, y 2: 0.
:S g(x) + g(y)

> 0). Clearly,

lP'(N(x

349

{N(x

+ y)- N(x)

> 0})

[8.3.1]-[8.3.4]

Solutions

Random processes

Such a function g is called subadditive, and the existence of A follows by the subadditive limit theorem
discussed in Problem (6.15.14). Note that A= oo is a possibility.
Next, we partition the interval (0, 1] into n equal sub-intervals, and let ln(r) be the indicator
function of the event that at least one arrival lies in ( (r - 1) In, r In] , 1 .::5 r .::5 n. Then 2:::~= 1 In (r) t
N(l) as n ---+ oo, with probability 1. By stationarity and monotone convergence,
E(N(l)) =

m:( lim

~ In(r))

n-+oo L....t

lim

n-+oo

r=I

m:(~
In(r))
L....t

= lim ng(n- 1) =A.


n-+oo

r=I

8.3 Solutions. Renewal processes


1.

See Problem (6.15.8).

2.

With X a certain inter-event time, independent of the chain so far,

Bn+I = {

X - 1 if Bn = 0,
Bn - 1 if Bn > 0.

Therefore, B is a Markov chain with transition probabilities Pi,i-I = 1 fori > 0, and POJ = fJ+I
for j 2: 0, where fn = lP'(X = n). The stationary distribution satisfies TCJ = TCJ+J + nof}+J, j 2: 0,
with solution n1 = lP'(X > j)IE(X), provided E(X) is finite.
The transition probabilities of B when reversed in equilibrium are

Pi,i+I

TCi+I

= 7; =

lP'(X > i + 1)
lP'(X > i)

fi+I
PiO = lP'(X > i),

fori 2: 0.

These are the transition probabilities of the chain U of Exercise (8.3.1) with the Jj as given.

3. We have that pnun = 2:::~=! pn-kUn-kPk fk, whence Vn = pnun defines a renewal sequence
provided p > 0 and l:n pn fn = 1. By Exercise (8.3.1 ), there exists a Markov chain U and a state s
such that Vn = lP'(Un = s)---+ ns, as n---+ oo, as required.

4.

Noting that N(O) = 0,


oo

oo

oo

oo

k=I

r=k

L E(N(r))sr = L L UkSr = L Uk L sr
r=O

r=I k=I

= ~ uvk = U(s)- 1 = F(s)U(s).

L....t1-s
k=I

1-s

1-s

Let Sm = l:k=I Xk and So= 0. Then lP'(N(r) = n) = lP'(Sn .::5 r) -lP'(Sn+I .::5 r), and

~sfE
00

[(N(t)k

+k)] =~sf~ n +k)


k
(lP'(Sn .::5 t) -lP'(Sn+I .::5 t))
00

00

= ~sf
00

Now,

350

00

n+
k _k

1)

lP'(Sn .::5 t) .

Solutions

Queues

[8.3.5]-[8.4.4]

whence, by the negative binomial theorem,


"'too=OstlE

L...J

[(N(t)k+k)] =

(1- s)(l - F(s))k

U(d
1-s

5. This is an immediate consequence of the fact that the interarrival times of a Poisson process are
exponentially distributed, since this specifies the distribution of the process.

8.4 Solutions. Queues


1. We use the lack-of-memory property repeatedly, together with the fact that, if X and Y are
independent exponential variables with respective parameters 'A and J-t, then JP>(X < Y) = 'A/('A + J-t).
(a) In this case,
1{

=2

'A
1-t
1-t }
1 { 1-t
'A
'A }
'A + J-t . 'A + J-t + 'A + J-t + 2 'A + J-t . 'A + J-t + 'A + J-t

(b) If 'A< f.-t, and you pick the quicker server, p

= 1-

(-1-t-)

= 2+

2'A~-t
('A +

~-t)2.

"A+~-t

2'A~-t

(c) And finally, p = ('A+ J-t) 2 .

2. The given event occurs if the time X to the next arrival is less than t, and also less than the time
Y of service of the customer present. Now,

3.

By conditioning on the time of passage of the first vehicle,


IE(T) =loa (x

+ IE(T) )'Ae-1.x dx + ae-1.a,

and the result follows. If it takes a time b to cross the other lane, and so a + b to cross both, then, with
an obvious notation,
(a)
(b)

eal.- 1
ebJ.L- 1
IE(Ta) +IE(n) = - - + - - ,
'A
1-t
e<a+b){A+J.L) - 1
IE(Ta+b) = - - - - "A+~-t

The latter must be the greater, by a consideration of the problem, or by turgid calculation.

4.

Look for a solution of the detailed balance equations


f.-tlTn+l

'A(n + 1)
n + 2 Jl'n,

n:::: 0.

tofindthatnn = pnno/(n+1)isastationarydistributionifp < l,inwhichcaseno = -pjlog(l-p).


Hence l:n nnn = 'Ano/(J-t- 'A), and by the lack-of-memory property the mean time spent waiting for
service is pno/(J-t- 'A). An arriving customer joins the queue with probability

00

n + 1lTn = p + log(l- p).


n=O n + 2
p log(l - p)

351

[8.4.5]-[8.5.5]

Solutions

Random processes

5. By considering possible transitions during the interval (t, t +h), the probability Pi (t) that exactly
i demonstrators are busy at time t satisfies:
P2(t +h)= P1 (t)2h

+ P2(t)(l

- 2h)

+ o(h),

P1 (t +h)= Po(t)2h + P1 (t)(l - h)(l - 2h) + P2(t)2h + o(h),


Po(t +h)= Po(t)(l- 2h)

+ P1(t)h + o(h).

Hence,
P~ (t)

= 2pt (t)

- 2p2(t),

Pi (t)

= 2po(t) -

Jp1 (t)

+ 2p2 (t),

p (t)

-2po(t)

+ Pt (t),

and therefore P2(t) =a+ be- 2t + ce-St for some constants a, b, c. By considering the values of P2
and its derivatives at t = 0, the boundary conditions are found to be a + b + c = 0, - 2b - 5c = 0,
4b + 25c = 4, and the claim follows.

8.5 Solutions. The Wiener process


1. We might as well assume that W is standard, in that a 2 = 1. Because the joint distribution is
multivariate normal, we may use Exercise (4.7.5) for the first part, and Exercise (4.9.8) for the second,
giving the answer
-1 + 1 { sm. 1
8
4n
2.

Writing W(s)

..fiX, W(t)

If

-+sin- 1
t

= ,fiz,

If

and W(u)

-+sin- 1
u

If}
u

.JUY, we obtain random variables X,


= ..[ilU, P2 = ..filU,

Y, Z with the standard trivariate normal distribution, with correlations P1

P3 = .JS[i. By the solution to Exercise (4.9.9),

var(Z I X, Y) =
yielding var(W(t)

I W(s),

(u-t)(t-s)
t(u- s)

W(u)) as required. Also,

E(W(t)W(u) W(s), W(v)) = E {[

(u-t)W(s)+(t-s)W(u)]
u_ s
W(u) W(s), W(v)

},

which yields the conditional correlation after some algebra.


3.

Whenever a 2 + b 2 = 1.

4.

Let lij(n)

= W((j +

1)tfn)- W(jtfn). By the independence of these increments,

5. They all have mean zero and variance t, but only (a) has independent normally distributed increments.

352

Solutions [8.7.1]-[8.7.3]

Problems

8.7 Solutions to problems


1. E(Yn) = 0, and cov(Ym, Ym+n) = Ll=O CXiCXn+i form, n 2': 0, with the convention that CXk
fork> r. The covariance does not depend on m, and therefore the sequence is stationary.

=0

2. We have, by iteration, that Yn = Sn (m) +am+ 1 Yn-m-1 where Sn (m) = LJ=O cxi Zn- j. There
are various ways of showing that the sequence {Sn (m) : m ::: 1} converges in mean square and almost
surely, and the shortest is as follows. We have that cxm+ 1Yn-m-1 ---+ 0 in m.s. and a.s. as m ---+ oo;
to see this, use the facts that var(cxm+ 1 Yn-m-1) = cx 2(m+ 1) var(Yo), and

> 0.

It follows that Sn(m) = Yn- am+ 1Yn-m-1 converges in m.s. and a.s. as m---+ oo. A longer route to
the same conclusion is as follows. For r < s,

whence {Sn(m) : m ::: 1} is Cauchy convergent in mean square, and therefore converges in mean
square. In order to show the almost sure convergence of Sn(m), one may argue as follows. Certainly

whence Z:::f:=o cxi Zn- j is a.s. absolutely convergent, and therefore a.s. convergent also. We may
express limm-+oo Sn (m) as Z:::f:=o cxi Zn- j. Also, am+ 1Yn-m-1 ---+ 0 in mean square and a.s. as
m ---+ oo, and we may therefore express Yn as
00

Yn =

L cxi Zn-

a.s.

j=O

It follows that E(Yn) = limm-+oo E(Sn(m)) = 0. Finally, for r > 0, the autocovariance function
is given by
c(r) = cov(Yn, Yn-r) = E{ (cxYn-1 + Zn)Yn-r} = cxc(r - 1),

whence
cxlrl

c(r)

= cxlrlc(O) = - -2 ,
l-ex

r= ... ,-1,0,1, ... ,

since c(O) = var(Yn).


3.
N(t)

1ft is a non-negative integer, N(t) is the number ofO's and 1's preceding the (t+ l)th 1. Therefore

+ 1 has the negative binomial distribution with mass function

k:::t+l.

If tis not an integer, then N(t) = N(Lt J).

353

[8.7.4]-[8.7.6]
4.

Solutions

Random processes

We have that
A.h

JP>(Q(t +h)= j I Q(t)

= i) = {

+ o(h)
+ o(h)

+ 1,
= i - 1,

if j = i
if j

11-ih

1 -(A.+ 11-i)h

+ o(h)

if j = i,

an immigration-death process with constant birth rate A. and death rates /1-i = i fl-.
Either calculate the stationary distribution in the usual way, or use the fact that birth-death
processes are reversible in equilibrium. Hence 'Ani = 11-U + l)rri+l fori 2:: 0, whence

nl = -.,1
I.

(A.)i
e-A/JL
11-

i :::: 0.

'

5. We have that X(t) = R cos(IJ!) cos(&t) - R sin(IJ!) sin(&t). Consider the transformation u
r cos 1/f, v = -r sin 1/f, which maps [0, oo) x [0, 2n) to R 2 . The Jacobian is
au

au

ar

a!fr

ar

aljf

av

av

= -r,

whence U = R cos IJf, V = - R sin IJf have joint density function satisfying
rfu,v(rcosljf, -rsin 1/f)

Substitute fu,v(u, v)

= fR,\jl(r, 1/f).

= e-2I (u 2+v 2) /(2rr), to obtain


r > 0, 0::::: 1/f < 2n.

Thus Rand IJf are independent, the latter being uniform on [0, 2rr).
6. A customer arriving at time u is designated green if he is in state A at time t, an event having
probability p(u, t - u). By the colouring theorem (6.13.14), the arrival times of green customers form
a non-homogeneous Poisson process with intensity function A.(u)p(u, t - u), and the claim follows.

354

Stationary processes

9.1 Solutions. Introduction


1.

We examine sequences Wn of the form


00

Wn

LakZn-k
k=O

for the real sequence {ak : k 2:: 0}. Substitute, to obtain ao = l, a1 =a, ar = aar-1
with solution

+ fJar-2 r

2:: 2,

otherwise,
where A. 1 and A.2 are the (possibly complex) roots of the quadratic x 2 -ax - fJ = 0 (these roots are
distinct if and only if a 2 + 4{3 =!= 0).
Using the method in the solution to Problem (8.7.2), the sum in(*) converges in mean square and
almost surely if IA.1I < 1 and IA.2I < 1. Assuming this holds, we have from(*) that JE(Wn) = 0 and
the autocovariance function is
c(m) = JE(Wn Wn-m) = ac(m- 1)

+ {Jc(m- 2),

::=::

1,

by the independence of the Zn. Therefore W is weakly stationary, and the autocovariance function
may be expressed in terms of a and fJ.

2.

We adopt the convention that, if the binary expansion of U is non-unique, then we take the
(unique) non-terminating such expansion. It is clear that Xi takes values in {0, 1}, and

for all x1, x2, ... , Xn; therefore the X's are independent Bernoulli random variables. For any sequence k1 < k2 < ... < kr, the joint distribution of vk,' vk2' ... ' vkr depends only on that of
Xk 1+1, Xk 1+2 .... Since this distribution is the same as the distribution of X 1, X2, ... , we have that
(Vk 1 , Vk2 , ... , Vkr) has the same distribution as (Vo, Vk2-k1 , . , Vkr-k! ). Therefore V is strongly
stationary.
Clearly JE(Vn) = JE(Vo) = i. and, by the independence oftlie Xi,
00

cov(Vo, Vn)

= 2:T 2i-n var(Xi) =


i=1

355

tz(i)n.

[9.1.3]-[9.2.1]

Solutions

Stationary processes

3. (i) For mean-square convergence, we show that Sk = L~=O anXn is mean-square Cauchy convergent as k -+ oo. We have that, for r < s,
lE{ (Ss - Sr ) 2 }

= . _L
s

L
s

aiajc(i - j) .::; c(O) .


lai I
z,;=r+I
l=r+I

}2

since lc(m) I .::; c(O) for all m, by the Cauchy-Schwarz inequality. The last sum tends to 0 as r, s -+ oo
if Li lai I < oo. Hence Sk converges in mean square ask -+ oo.
Secondly,
lE(

lakXkl) .::::

k=I

lakl lEIXkl.:::: VJE<X5)

k=I

lakl

k=I

which converges as n -+ oo if the lakl are summable. It follows that Lk=I lakXkl converges
absolutely (almost surely), and hence Lk=I akXk converges a.s.
(ii) Each sum converges a.s. and in mean square, by part (i). Now
00

cy(m) =

ajakc(m + k- j)

j,k=O
whence

L lcy(m)l.:::: c(O){ f

laj I} < oo.

j=O

4. Clearly Xn has distribution :rr for all n, so that {f(Xn) : n :::: m} has fdds which do not depend
on the value of m. Therefore the sequence is strongly stationary.

9.2 Solutions. Linear prediction


1.

(i) We have that

which is minimized by setting a = c(l)/c(O). Hence Xn+I = c(l)Xn/c(O).


(ii) Similarly

an expression which is minimized by the choice

f3

c(l) (c(O)- c(2))


=
c(0) 2 - c(1)2 '

c(O)c(2) - c(1) 2
y = c(0)2 - c(1)2 ;

Xn+I is given accordingly.


(iii) Substitute a, {3, y into(*) and(**), and subtract to obtain, after some manipulation,
{c(l ) 2 - c(O)c(2) } 2
D - .:-,--:-~..-'--'--.:...:.:.:
- c(O){c(0)2- c(l)2}

356

Solutions

Autocovariances and spectra


1

(a) In this case c(O) = '2> and c(l)


(b) In this caseD = 0 also.

= c(2) = 0.

Therefore Xn+1

[9.2.2]-[9.3.2]

= Xn+1
= 0, and D = 0.

In both (a) and (b), little of substance is gained by using Xn+1 in place of Xn+1
2. Let {Zn : n = ... , -1, 0, 1, ... } be independent random variables with zero means and unit
variances, and define the moving-average process

Zn + aZn-1
~.

Xn=

It is easily checked that X has the required autocovariance function.

By the projection theorem, Xn - Xn is orthogonal to the collection {Xn-r : r > 1}, so that
00
.
E{(Xn- Xn)Xn-r} = 0, r :=:: 1. Set Xn = l::s=1 hsXn-s to obtam that
for s:::: 2,
where a = aj(l + a 2 ). The unique bounded solution to the above difference equation is hs
(-l)s+ 1as, and therefore
00

Xn

= 2:)-l)s+ 1as Xn-s


s=1

The mean squared error of prediction is

Clearly E(Xn) = 0 and


00

cov(Xn, Xn-m)

brbsc(m

+ r - s),

m ::::0,

r,s=1

so that

X is weakly stationary.
9.3 Solutions. Autocovariances and spectra

1.

It is clear that E(Xn) = 0 and var(Xn) = 1. Also

cov(Xm, Xm+n) = cos(mA.) cos{(m

+ n)A.} + sin(mA.) sin{(m + n)A.} =

cos(nA.),

so that X is stationary, and the spectrum of X is the singleton {A.}.

2.

Certainly 1/Ju (t)

= (eit:rr

cov(Xm, Xm+n)

- e-it:rr)/(2nit), so that E(Xn)

= E(XmXm+n) =

= 1/Ju (1)1/Jv (n) = 0.

E(ei{U-Vm-U+V(m+n)J)

whence X is stationary. Finally, the autocovariance function is


c(n) = 1/Jv(n) =

whence F is the spectral distribution function.

357

einJ... dF(A.),

Also

= 1/lv(n),

[9.3.3]-[9.3.4]

Solutions

Stationary processes

3.

The characteristic functions of these distributions are

(i)

p(t)

(ii)

p(t) =

4.

1 2

= e-zt

c~it

+ 1 ~it)

= 1

~ t2 .

(i) We have that

var ( -1 n Xj ) = 21
n }=!

Ln

cov(Xj, Xk) = -c(O)


2

n j,k=1

1 (L
n


) dF(A.).
ez(j-k))..

j,k=1

(-Jr,Jr]

The integrand is
1 - cos(nA.)
1- cos A. '
whence

~X )
var ( -1 L
n }= 1 1
It is easily seen that I sine I .:5

= c(O)

1 (
(-Jr,Jr]

sin(nA./2) ) 2 dF(A.).
n sin(A./2)

IB I, and therefore the integrand is no larger than


(

A./2 ) 2
1 2
sin(A./2)
.:5 (2n)

As n --+ oo, the integrand converges to the function which is zero everywhere except at the origin,
where (by continuity) we may assign it the value 1. It may be seen, using the dominated convergence
theorem, that the integral converges to F (0) - F (0-), the size of the discontinuity of F at the origin,
and therefore the variance tends to 0 if and only if F(O) - F(O-) = 0.
Using a similar argument,

-1 n-1
L c(j) =
n }=0

(0)
(n-1
)
_ceij).. dF(A.) = c(O)
gn(A.)dF(A.)
n
(-Jr,Jr] j=O
(-Jr,Jr]

where

ifA.

= 0,

ifA.

= 0,

gn(A.) = { 1ein)..- 1
n(ei).. - 1)

is a bounded sequence of functions which converges as before to the Kronecker delta function 8;..o.
Therefore
1 n-1
c(j) --+ c(O) ( F(O) - F(O-))
as n--+ oo.
n }=0

-L

358

Solutions

The ergodic theorem

[9.4.1]-[9.5.2]

9.4 Solutions. Stochastic integration and the spectral representation


1. Let Hx be the space of all linear combinations of the X;, and let H x be the closure of this space,
that is, H x together with the limits of all mean-square Cauchy-convergent sequences in H x. All
members of Hx have zero mean, and therefore all members of H x also. Now S(A.) E H x for all A.,
whence JE.(S(A.)- S(fl-)) = 0 for all A. and fl-.
2. First, each Ym lies in the space H x containing all linear combinations of the X n and all limits of
mean-square Cauchy-convergent sequences of the same form. As in the solution to Exercise (9 .4.1 ),
all members of H x have zero mean, and therefore JE.(Ym) = 0 for all m. Secondly,

_ 1

lE.(YmY n) =

(-n,n]

eimJ...e-inJ...
/(A.) dA. = 8mn

2rrf(A.)

As for the last part,

This proves that such a sequence Xn may be expressed as a moving average of an orthonormal
sequence.
3. Let H x be the space of all linear combinations of the Xn, together with all limits of (meansquare) Cauchy-convergent sequences of such combinations. Using the result of Problem (7 .11.19),
all elements in H x are normally distributed. In particular, all increments of the spectral process are
normal. Similarly, all pairs in H x are jointly normally distributed, and therefore two members of
H x are independent if and only if they are uncorrelated. Increments of the spectral process have
zero means (by Exercise (9.4.1)) and are orthogonal. Therefore they are uncorrelated, and hence
independent.

9.5 Solutions. The ergodic theorem


1. With the usual shift operator r, it is obvious that r- 1 0 = 0, so that 0 E 1. Secondly, if A
then r- 1(A c) = (r- 1A)c = Ac, whence Ac E 1. Thirdly, suppose A1, Az, ... E 1. Then

1,

so that U'f A; E 1.
2. The left-hand side is the sum of covariances, c(O) appearing n times, and c(i) appearing 2(n- i)
times for 0 < i < n, in agreement with the right-hand side.
Let E > 0. Ifc(j) =
when j 2::: J. Now

1 2:{~~

c(i) --+ a 2 as j --+

2n

2{1

00,

2 Ljc(j) :52 LjC(j) + L


j=1

j=1

there exists J such that lc(j) - a 2 1 <

j(a2

+E) }

--+ a2

+E

j=l+1

as n --+ oo. A related lower bound is proved similarly, and the claim follows since E ( > 0) is arbitrary.

359

[9.5.3]-[9.6.3]
3.

Solutions

Stationary processes

It is easily seen that Sm = l::~o ai Xn+i constitutes a martingale with respect to the X's, and
oo

E(S~)

I>tE(X~+i):::;

L:>f,

i=O

i=O

whence Sm converges a.s. and in mean square as m--+ oo.


Since the Xn are independent and identically distributed, the sequence Yn is strongly stationary;
also E(Yn) = 0, and so n- 1 l::t=I Yi --+ Z a.s. and in mean, for some random variable Z with mean
zero. For any fixed m (2: 1), the contribution of X 1, Xz, ... , Xm towards l::t=l Yi is, for large n, no
larger than
Cm =

Jf(ao +a,+ ... + aj-I)Xj I


J=l

Now n- 1cm --+ 0 as n --+ oo, so that Z is defined in terms of the subsequence Xm+I, Xm+2, ...
for all m, which is to say that Z is a tail function of a sequence of independent random variables.
Therefore Z is a.s. constant, and so Z = 0 a.s.

9.6 Solutions. Gaussian processes


1. The quick way is to observe that c is the autocovariance function of a Poisson process with
intensity 1. Alternatively, argue as follows. The sum is unchanged by taking complex conjugates, and
hence is real. Therefore it equals

where

to= 0.

2. For s, t :::: 0, X (s) and X (s + t) have a bivariate normal distribution with zero means, unit
variances, and covariance c(t). It is standard (see Problem (4.14.13)) that E(X(s + t) I X(s)) =
c(t)X(s). Now
c(s

+ t) = E(X(O)X(s + t)) = E{E(X(O)X(s + t) I X(O), X(s))}


= E(X(O)c(t)X(s)) = c(s)c(t)

by the Markov property. Therefore c satisfies c(s + t) = c(s)c(t), c(O) = 1, whence c(s) = c(l)lsl =
plsl. Using the inversion formula, the spectral density function is

Note that X has the same autocovariance function as a certain autoregressive process. Indeed,
stationary Gaussian Markov processes have such a representation.

3. If X is Gaussian and strongly stationary, then it is weakly stationary since it has a finite variance.
Conversely suppose X is Gaussian and weakly stationary. Then c(s, t) = cov(X(s), X(t)) depends

360

Solutions

Problems

[9.6.4]-[9.7.2]

on t - s only. The joint distribution of X(t1), X(t2), ... , X(tn) depends only on the common mean
and the covariances c(ti, fj ). Now c(ti, fj) depends on fj - ti only, whence X (tl), X (t2), ... , X Un)
have the same joint distribution as X(s + t1), X(s + t2), ... , X(s + tn). Therefore X is strongly
stationary.

4.

(a) If s, t > 0, we have from Problem (4.14.13) that

whence
cov(X(s) 2 , X(s

+ t) 2 ) =
=

+ t) 2) - 1
m:{ X(s) 2 E(X(s + t) 2 I X(s))}- 1
c(t) 2E(X(s) 4 ) + (1- c(t) 2 )E(X(s) 2 )- 1 =

E(X(s) 2 X(s

by an elementary calculation.
(b) Likewise cov(X(s) 3 , X(s + t) 3) = 3(3

2c(t) 2

+ 2c(t) 2 )c(t).

9.7 Solutions to problems

+ f3Yn-1, whence the autocovariance function c

1. It is easily seen that Yn = Xn +(a - f3)Xn-1


of Y is given by

if k = 0,

_1_+_a_2__
-=2-'-f3_2

1-/3

{
c(k) =

f31kl-1

2 }

a( 1 + af3- f3 )
1- f32

if k

0.

Set Yn+ 1 = 2:::~0 ai Yn-i and find the ai for which it is the case that E{ (Yn+ 1 - Yn+ 1) Yn-k} = 0
for k ::::: 0. These equations yield
00

+ 1) =

c(k

L aic(k- i),

k::::: 0,

i=O

which have solution ai


2.

= a(f3- a)i fori

::::: 0.

The autocorrelation functions of X and Y satisfy


r

aipx(n) = ap

ajakpy(n

+ k- j).

j,k=O

Therefore
2

aifx(A.)=

~;

oo

e-inJ...

n=-oo
2

ajakpy(n+k-j)

j,k=O
oo

= ay ""'aakei(k-j)J...

2n L

j,k=O

""' e-i(n+k-j)J...py(n+k-j)

n=-oo

2
iJ... 2
= ayiGa(e
)I fy(A.).

361

[9.7.3]-[9.7.5]

Solutions

Stationary processes

In the case of exponential smoothing, G 0 (eil.) = (1 - t-t)/(1 - t-teil.), so that


fx(A) =

c(l - t-t)2 fy(A) ,


1- 2~-tcosA + t-t 2

IAI

< n,

1 is a constant chosen to make this a density function.

where c = a{; I a

Consider the sequence {Xn} defined by

3.

Xn

= Yn

- Yn

= Yn- aYn-1

- f3Yn-2

Now Xn is orthogonal to {Yn-k : k 2: 1}, so that the Xn are uncorrelated random variables with
spectral density function fx(A) = (2n)- 1 , A E (-n, n). By the result of Problem (9.7.2),

al fx (A) = a{; 11 -

aeil. - f3e 2il. 12 fy (A),

whence

f y (A)-

a2ja2
X

2n 11 -ae il. - f3 e 2il. 12 ,

-n <A< n.

4. Let {X~ : n 2: 1} be the interarrival times of such a process counted from a time at which a
meteorite falls. Then Xi, X~, ... are independent and distributed as X2. Let Y~ be the indicator
function of the event {X:n = n for some m}. Then
E(YmYm+n)

= lP'(Ym = 1, Ym+n = 1)
= lP'(Ym+n = 1 I Ym = l)lP'(Ym =

1)

= lP'(Y~ =

where a = lP'(Ym = 1). The autocovariance function of Y is therefore c(n)


n 2: 0, and Y is stationary.
The spectral density function of Y satisfies

l)a

a{lP'(Y~

, c(n) = Re {
1
~
,
1 } .
fy(A) = - 1 ~
L..J e-znA
L..J eznAc(n)-2n n=-oo
a(l- a)
na(l- a) n=O
2n
Now

00

00

n=O

n=O

L einl. y~ = L eiAT~
where T~

= Xi + X~ + + X~; just check the non-zero terms.

when eil.

i=

1, where (jJ is the characteristic function of x2 It follows that


jy(A)

5.

Therefore

1
n(l-a)

Re {

1
1-(jJ(A)

a . } - -1,
- -1-e11.
2n

We have that
E(cos(nU)) =

:n:

-:n: 2n

cos(nu) du = 0,

362

IAI

< n.

1) -a},

Solutions

Problems

[9.7.6]-[9.7.7]

for n :::: 1. Also


E(cos(mU) cos(nU)) =

m:{i (cos[(m + n)U] + cos[(m -

n)U1)} = 0

if m =f. n. Hence X is stationary with autocorrelation function p(k) = 8ko, and spectral density
function f(J....) = (2n)- 1 for IJ....I < n. Finally
E{ cos(mU) cos(nU) cos(rU)}

im:{ (cos[(m + n)U] + cos[(m- n)U]) cos(rU)}

= i{p(m

+ n- r) + p(m- n- r)}

which takes different values in the two cases (m, n, r) = (1, 2, 3), (2, 3, 4).

6. (a) The increments of N during any collection of intervals {(u;, v;) : 1 :::0 i :::: n} have the same
fdds if all the intervals are shifted by the same constant. Therefore X is strongly stationary. Certainly
E(X(t)) = J....a for all t, and the autocovariance function is
c(t) = cov(X(O), X(t)) = { 0
J....(a- t)

if t >a,
ifO :::0 t :::0 a.

Therefore the autocorrelation function is

p(t)

0
1- itfai

if it I >a,
if it I Sa,

which we recognize as thecharacteristicfunctionofthe spectral density f(J....) = {1-cos(aJ....)}j(an J.. . 2 );


see Problems (5.12.27b, 28a).
(b) We have that E(X(t)) = 0; furthermore, for s :::0 t, the correlation of X(s) and X(t) is

2(J cov(X(s), X(t)) = 2(J cov(W(s)- W(s -1), W(t)- W(t -1))

= s- min{s, t - 1}- (s- 1) +


1
if s:::: t - 1,
{
s-t+1

(s- 1)

ift-1<sst.

This depends on t - s only, and therefore X is stationary; X is Gaussian and therefore strongly
stationary also.
The autocorrelation function is

p(h) =

0
1- lhl

if lhl :::: 1,
if lhl < 1,

which we recognize as the characteristic function of the density function f(J....)

= (1

-cos J....)j(nJ.... 2 ).

7. We have from Problem (8.7.1) that the general moving-average process of part (b) is stationary
with autocovariance function c(k) = '}=o a}ak+ J, k :::: 0, with the convention that as = 0 if s < 0
ors > r.
(a) In this case, the autocorrelation function is

p(k)={ 1 +a 2
0

363

if k = 0,
if lkl = 1,
iflk1>1,

[9.7.8]-[9.7.13]

Solutions

Stationary processes

whence the spectral density function is


1 ( p(O) + e 1;.. p(l) + e- 1;.. p(-1) ) = 1 ( 1+ 2acosA.)
f(A.) = 2 ,
~
~
1+a

IA.I < n.

(b) We have that

/(A.)=__!__

00

e-ik)..p(k)

2n k=-oo

8.

')..

IA(ez )I
2nc(O)

"2:,1 a1zi. See Problem (9.7.2) also.


The spectral density function f is given by the inversion theorem (5.9.1) as

where c(O) =

"2:,1 aJ and A(z)

_1_'Laeij).. Lak e-i(k+j))..


2nc(O) J 1
+J
k

f(x)

= -1

100 e-ztx
. p(t) dt

2n _ 00

00 lp(t)l dt < oo; see Problem (5.12.20). Now


1 100 lp(t)l dt
1/(x)l .:5-

under the condition J0

2n -oo

and

lf(x +h)- f(x)l .:5

__!__

100

2n -oo

Ieith- 11 lp(t)l dt.

The integrand is dominated by the integrable function 21p(t)l. Using the dominated convergence
theorem, we deduce that lf(x +h)- f(x)l--+ 0 ash--+ 0, uniformly in x.
By Exercise (9.5.2), var(n- 1 'L,'J=l XJ) --+ a 2 if Cn = n- 1 'L-j:J cov(Xo, Xj) --+ a 2 . If
cov(Xo, Xn) --+ 0 then Cn --+ 0, and the result follows.

9.

10. Let X 1, X 2, ... be independent identically distributed random variables with mean p,. The sequence X is stationary, and it is a consequence of the ergodic theorem that n- 1 'L,'J=I 1 --+ Z a.s.
and in mean, where Z is a tail function of X 1, X 2 ... with mean p,. Using the zero-one law, Z is a.s.
constant, and therefore lP'(Z = p,) = 1.

11. We have from the ergodic theorem that n- 1 'L-t=l Yi --+ lE(Y I 1) a.s. and in mean, where 1 is
the a-field of invariant events. The condition of the question is therefore
lE(Y 11) = lE(Y)

a.s., for all appropriate Y.

Suppose(*) holds. Pick A E 1, and set Y = lA to obtain lA = Q(A) a.s. Now lA takes the values 0
and 1, so that Q(A) equals 0 or 1, implying that Q is ergodic. Conversely, suppose Q is ergodic. Then
lE(Y I 1) is measurable on a trivial a-field, and therefore equals lE(Y) a.s.

12. Suppose Q is strongly mixing. If A is an invariant event then A = r-nA. Therefore Q(A) =
Q(A n r-nA) --+ Q(A) 2 as n --+ oo, implying that Q(A) equals 0 or 1, and therefore Q is ergodic.

13. The vector X = (X 1, X2, ... ) induces a probability measure Q on (l~T, JF,T). Since Tis measurepreserving, Q is stationary. Let Y : rntT --+ rnt be given by Y(x) = x1 for x = (xl, x2, .. .), and define
Yi (x) = Y(ri-l (x)) where r is the usual shift operator on rntT. The vector Y = (f1, Y2, ... ) has the
same distributions as the vector X. By the ergodic theorem for Y, n- 1 'L-t=l Yi --+ lE(Y I 'J.) a.s. and
in mean, where '!f. is the invariant a -field of r. It follows that the limit
1 n
Z = lim - ""Xi
n--+oo n L...t
i=l

364

Solutions

Problems

exists a.s. and in mean. Now U = limsupn-+oo (n- 1

[9. 7.14]-[9. 7.15]

L;1 Xi) is invariant, since

implying that U(w) = U(Tw) a.s. It follows that U is 1-measurable, and it is the case that Z = U
a.s. Take conditional expectations of(*), given 1, to obtain U = E(X 11) a.s.
If Tis ergodic, then 1 is trivial, so that E(X 11) is a.s. constant; therefore E(X 11) = E(X) a.s.

14. If(a,b) ~ [0, l),thenT- 1(a,b) = (1a, ib)u(i+1a, 1+!-b),andthereforeTismeasurable.


Secondly,
1P'(T- 1(a, b))= 2(1b -1a) = b- a= IP'((a, b)),
so that r- 1 preserves the measure of intervals. The intervals generate :B, and it is then standard that
r- 1 preserves the measures of all events.
Let A be invariant, in that A= r- 1A. Let 0 :S: w < 1; it is easily seen that T(w) = T(w + 1).
Therefore wE A if and only if w + 1 E A, implying that An [1, 1) =
lP'(A n E)= 11P'(A) = lP'(A)lP'(E)

i + {An [0, 1) }; hence

forE= [0, 1), [1, 1).

This proves that A is independent of both [0, 1) and [1, 1). A similar proof gives that A is independent
of any set E which is, for some n, the union of intervals of the form [k2-n, (k+ 1)2-n) forO ::::: k < 2n.
It is a fundamental result of measure theory that there exists a sequence E 1, Ez, ... of events such that
(a) En is of the above form, for each n,
(b) lP'(A LEn) --+ 0 as n --+ oo.
Choosing the En accordingly, it follows that
lP'(A n En) = lP'(A)lP'(En) --+ lP'(A) 2

by independence,

IIP'(A n En) -lP'(A)I ::::: lP'(A LEn) --+ 0.


Therefore IF'( A) = lP'(A) 2 so that lP'(A) equals 0 or 1.
For w E Q, expand w in base 2, w = O.w1 wz , and define Y (w) = w1. It is easily seen that
Y(Tn- 1w) = Wn, whence the ergodic theorem (Problem (9.7.13)) yields that n- 1 L;?=l Wi --+
as
n --+ oo for all w in some event of probability 1.

15. We may as well assume that 0 < a < 1. LetT : [0, 1) --+ [0, 1) be given by T(x) = x +a
(mod 1). It is easily seen that Tis invertible and measure-preserving. Furthermore T(X) is uniform
on [0, 1], and it follows that the sequence Z 1, Zz, ... has the same fdds as Zz, Z3, ... , which is to
say that Z is stationary. It therefore suffices to prove that T is an ergodic shift, since this will imply
by the ergodic theorem that
1 n
{I
Zj --+ E(ZI) =
g(u) du.
n j=l
o

-L

Jo

We use Fourier analysis. Let A be an invariant subset of [0, 1). The indicator function of A has
a Fourier series:
00

IA(x) ~

anen(x)

n=-oo

where en (x) = e 2:rrinx and

1
an=-

2n

lolo

IA(x)e-n(x)dx = 1
-

2n

365

e-n(x)dx.

[9.7.16]-[9.7.18]

Solutions

Stationary processes

Similarly the indicator function of r- 1 A has a Fourier series,


lr-!A(x) ~ Lbnen(x)
n

where, using the substitution y

= T(x),

since em(Y- a)= e- 2nimaem(y). Therefore Ir-IA has Fourier series


I r-IA (x ) ~ ""
L..Je -2rrina anen ( x ) .
n

Now I A = I r-1 A since A is invariant. We compare the previous formula with that of (* ), and deduce
that an = e- 2 nina an for all n. Since a is irrational, it follows that an = 0 if n =f 0, and therefore I A
has Fourier series ao, a constant. Therefore lA is a.s. constant, which is to say that either lP'(A) = 0
or lP'(A) = 1.

16. Let G 1 (z) = E(zX(t)), the probability generating function of X(t). Since X has stationary
independent increments, for any n (~ 1), X(t) may be expressed as the sum
n

X(t) = L { X(itjn)- X((i -l)tjn)}


i=l

of independent identically distributed variables. Hence X(t) is infinitely divisible. By Problem


(5.12.13), we may write
Gt(z)

= e-J..(t)(l-A(z))

for some probability generating function A, and some A.(t).


Similarly, X(s + t) = X(s) +{X (s + t)- X(s)}, whence Gs+t(z) = Gs(z)G 1 (z), implying that
G 1 (z) = e!h(z)t for some t-t(z); we have used a little monotonicity here. Combining this with(*), we
obtain that G 1 (z) = e-J..t(l-A(z)) for some A..
Finally, X (t) has jumps of unit magnitude only, whence the probability generating function A is
given by A(z) = z.

17. (a) We have that


X(t)- X(O)

= {X(s)- X(O)}

+ { X(t)- X(s) },

0:::::

s::::: t,

whence, by stationarity,
{m(t) -m(O)} = {m(s)- m(O)} + {m(t -s) -m(O)}.

Now m is continuous, so that m(t)- m(O) = {Jt, t ~ 0, for some {3; see Problem (4.14.5).
(b) Take variances of(*) to obtain v(t) = v(s) + v(t- s), 0 ::::: s ::::: t, whence v(t) = a 2 t for some
(J2.

18. In the context of this chapter, a process Z is a standard Wiener process if it is Gaussian with
Z(O) = 0, with zero means, and autocovariance function c(s, t) = min{s, t}.

366

Solutions [9.7.19]-[9.7.20]

Problems

(a) Z(t)

= aW(tja 2 ) satisfies Z(O) = 0, JE(Z(t)) = 0, and


= a 2 min{s ja 2 , t ja 2 } = min{s, t}.

cov(Z(s), Z(t))

(b) The only calculation of any interest here is


cov(W(s + a)-W(a), W(t +a)- W(a))

= c(s +a, t +a) -

c(a, t +a) - c(s +a, a)+ c(a, a)

=(s+a)-a-a+a=s,

s~t.

(c) V(O) = 0, and JE(V(t)) = 0. Finally, if s, t > 0,


cov(V(s), V(t)) = stcov(W(l/s), W(lft)) = stmin{l/s, 1ft}= min{t, s}.
(d) Z(t)

W(l)- W(l - t) satisfies Z(O)


cov(Z(s), Z(t))

= 0, JE(Z(t)) = 0.

Also Z is Gaussian, and

= 1- (1- s)- (1- t) + min{l- s, 1- t}


= min{s, t},
O~s,t~l.

19. The process W has stationary independent increments, and G(t) = JE(IW(t)l 2 ) satisfies G(t)
t --+ 0 as t --+ 0; hence
(u) d W (u) is well defined for any satisfying

jgo

fooo l(u)l 2 dG(u) = fooo (u) 2 du < oo.


It is obvious that (u) = l[O,tJ(u) and (u) = e-(t-u) l[O,tJ(u) are such functions.
Now X(t) is the limit (in mean-square) of the sequence
n-1

Sn(t)

= l:)W((j + 1)t/n)- W(jtjn)},

n~l.

}=0

However Sn(t) = W(t) for all n, and therefore Sn(t) ~ W(t) as n--+ oo.
Finally, Y (s) is the limit (in mean-square) of a sequence of normal random variables with mean
0, and therefore is Gaussian with mean 0. If s < t,
cov(Y(s), Y(t)) = fooo (e-(s-u) l[O,sJ(u)) (e-(t-u) I[o,t](u)) dG(u)
-los
e 2u-s-t d u_- 21 ( e s-t -e -s-t) .
0

Y is an Omstein-Uhlenbeck process.
20. (a) W(t) is N(O, t), so that
JEIW(t)l =

oo

-00

var(IW(t)l)

lui

--e-z<u ft) du =

..;z;rt

2)

= JE(W(t) 2 ) - 2t
n = t ( 1-;
367

J2ili(,
.

[9. 7.21]-[9. 7.21]

Solutions

Stationary processes

The process X is never negative, and therefore it is not Gaussian. It is Markov since, if s < t and
B is an event defined in terms of {X(u) : u ::5 s}, then the conditional distribution function of X(t)
satisfies

lP'(X(t) ::5 y X(s)

= x, B) = lP'(X(t) ::5 y IW(s) = x, B)lP'(W(s) =xI X(s) = x, B)


+ lP'(X(t) ::5 y IW(s) = -x, B)lP'(W(s) =-xI X(s) = x, B)
= i{lP'(X(t) ::5 y IW(s) =x) +lP'(X(t) ::5 y IW(s) = -x) }.

which does not depend on B.


(b) Certainly,
E(Y(t))

1 Jiiit
00

eU

--e-z(u ft) du

= e2 1

-00

Secondly, W(s) + W(t) = 2W(s) + {W(t)- W(s)} is N(O, 3s + t) if s < t, implying that
E(Y(s)Y(t)) = JE: (eW(s)+W(t)) = A(3s+t),
and therefore

cov(Y(s), Y(t)) = e'2( 3s+t)- ez(s+t),

s < t.

W(1) is N(O, 1), and therefore Y(1) has the log-normal distribution. Therefore Y is not Gaussian.
It is Markov since W is Markov, and Y(t) is a one-{Jne function of W (t).
(c) We shall assume that the random function W is a.s. continuous, a point to which we return in
Chapter 13. Certainly,
E(Z(t)) =lot E(W(u)) du
E(Z(s)Z(t))

!1

= 0,

E(W(u)W(v))dudv

O<u<s

o;v;t

=1s
u=O

{fu
vdv+ t
udv}du=is (3t-s),
lv=O
lv=u
2

s<t,

since E(W(u)W(v)) = min{u, v}.


Z is Gaussian, as the following argument indicates. The single random variable Z(t) may be
expressed as a limit of the form
lim

n-+oo

~ (!_)
W(itjn),
n

L...J

i=l

each such summation being normal. The limit of normal variables is normal (see Problem (7.11.19)),
and therefore Z(t) is normal. The limit in(*) exists a.s., and hence in probability. By an appeal to
(7.11.19b), pairs (Z(s), Z(t)) are bivariate normal, and a similar argument is valid for all n-tuples of
the Z(u).
The process Z is not Markov. An increment Z (t) - Z (s) depends very much on W (s) = Z' (s ),
and the collection {Z (u) : u ::; s} contains much information about Z' (s) in excess of the information
contained in the single value Z(s).

21. Let Ui
X(ti). The random variables A= U1, B
Uz- U1o C
U3- Uz, D
U4- U3
are independent and normal with zero means and respective variances t1, tz - t1, t3 - tz, t4 - t3. The
Jacobian of the transformation is 1, and it follows that U1, Uz, U3, U4 have joint density function

e-!Q
/u(u) = (2n) 2,Jtl(t2- t1)(t3- tz)(t4- t3)

368

Solutions

Problems

[9. 7.22]-[9. 7.22]

where
ui

(u2- u1) 2

fl

f2 - tl

Q=-+
Likewise ul and

(u3 - u2) 2
t3 - t2

(u4- u3) 2

+---'--~

f4 - t3

u4 have joint density function


where

Hence the joint density function of U2 and U3, given U1 = U4 = 0, is

where

)2
2
2
(
S=~+ U3-U2 +~.
f2 - fl
t3 - t2
f4 - f3

Now g is the density function of a bivariate normal distribution with zero means, marginal variances
(t2 - tl )(t3 - t2)
t3- tl

and correlation

See also Exercise (8.5.2).


22. (a) The random variables {/j (x) : I :::: j :::: n} are independent, so that
E(Fn(x)) =X,

1
x(l-x)
var(Fn(x)) = -var(h(x)) =
.
n
n

By the central limit theorem, ,Jn{Fn(x)- x} ~ Y(x), where Y(x) is N(O, x(1- x)).
(b) The limit distribution is multivariate normal. There are general methods for showing this, and
here is a sketch. If 0 ::::= x1 < x2 ::::= 1, then the number M2 (= nF(x2)) of the Ij not greater than
x2 is approximately N(nx2, nx2(l- x2)). Conditional on {M2 = m}, the number M1 = nF(xl)
is approximately N(mu, mu(l - u)) where u = xlfx2. It is now a small exercise to see that the
pair (M1, M2) is approximately bivariate normal with means nx1, nx2, with variances nx1 (1 - x1),
nx2(l - x2), and such that

whence cov(Ml, M2) ~ nx1 (1 - x2). It follows similarly that the limit of the general collection is
multivariate normal with mean 0, variances x;(1- x;), and covariances Cij = x;(l- Xj).
(c) The autocovariance function of the limit distribution is c(s, t) = min{s, t} - st, whereas, for
0:::: s :::: t :::: 1, we have that cov(Z(s), Z(t)) = s- ts- st + st = min{s, t}- st. It may be shown
that the limit of the process { Jn (Fn (x) - x) : n :::: 1} exists as n -+ oo, in a certain sense, the limit
being a Brownian bridge; such a limit theorem for processes is called a 'functional limit theorem'.

369

10
Renewals

10.1 Solutions. The renewal equation


1.

SincelE(Xl) > 0, there exists E (> 0) such thatlP'(X1 2: E)> E. LetXIc = El{xk::::E} and denote

by N' the related renewal process. Now N(t) :s N'(t), so that lE(eeN(t)) :s lE(eeN'{tl), fore > 0.
Let Zm be the number of renewals (in N') between the times at which N' reaches the values (m - 1)E
and mE. The Z's are independent with
Eee
JE(ee Zm) _ ----~
- 1 - (1 - E)ee'

if (I- E)ee <I,

whence JE(eeN'(t)) :S (Eee {1- (1- E)ee} - 1)t/E for sufficiently small positive e.

2. Let X 1 be the time of the first arrival. If X 1 > s, then W = s. On the other hand if X 1 < s, then
the process starts off afresh at the new starting time X 1. Therefore, by conditioning on the value of

x1,
Fw(x) =

fooo lP'(W :s

=los

I x1

=los

= u)dF(u)

lP'(W :S x- u)dF(u)

lP'(W

:s X - u)dF(u) +

00

1 dF(u)

+ {1- F(s)}

if x 2: s. It is clear that F w (x) = 0 if x < s. This integral equation for F w may be written in the
standard form
Fw(x) = H(x)

where H and

+fox Fw(x- u)dF(u)

F are given by
H(x) = {

~- F(s)

if X <

S,

ifx:=::s,

F(x) =

{ F(x)

F(s)

if x < s,
if X 2: s.

This renewal-type equation may be solved in the usual way by the method ofLaplace-Stieltjes transforms. We have that Fw = H* + FiVF*, whence Fw = H* j(l- F*). If N is a Poisson process
then F(x) = 1- e-l.x. In this case
H*(e) =

fooo e-ex dH(x) = e-(.A.+e)s,

since H is constant apart from a jump at x = s. Similarly

370

Solutions

Limit theorems

so that

(A.+ e)e-(A+O)s

Fw(e)

Finally, replace e with


3.

[10.1.3]-[10.2.2]

+ A.e-(A+O)s

-e, and differentiate to find the mean.

We have as usual that lP'(N(t) = n) = lP'(Sn :S t) -lP'(Sn+1 :S t). In the respective cases,

(a)

- - Jor {;,.nbxnb-1

(b)

4.

lP'(N(t)- n)-

r(nb)

;,.(n+1)bx(n+1)b-1} -A.x
f((n + l)b)
e
dx.

By conditioning on X1, m(t) = E(N(t)) satisfies


m(t) =lot (I+ m(t- x)) dx = t +lot m(x) dx,

O:::;t:::;l.

Hence m' = 1 + m, with solution m(t) = e1 - 1, for 0 ::::; t ::::; 1. (For larger values oft, m(t) =
1 + m(t- x) dx, and a tiresome iteration is in principle possible.)

JJ

With v(t) = E(N(t) 2 ),


v(t)= l[v(t-x)+2m(t-x)+l]dx=t+2(e 1 -t-l)+ lot v(x)dx,

Hence v' = v + 2et- 1, with solution v(t) = 1 - et

+ 2te1 for 0::::; t ::::;

O:::;t:::;l.

1.

10.2 Solutions. Limit theorems


1. Let Zi be the number of passengers in the ith plane, and assume that the Zi are independent
of each other and of the arrival process. The number of passengers who have arrived by time t is
S(t) = I:~i) Z;. Now
1
N(t) S(t)
E(Z1)
-S(t) = - - ----+ - a.s.
t
t
N(t)
f.l

by the law of the large numbers, since N(t)/t --+


2.

1/Jl a.s., and N(t)

--+ oo a.s.

We have that

since l(M:;:i)l{M:;:JJ = /{M:;:iv j} where i V j = max{i, j}. Now

since {M :S i - 1} is defined in terms of Z 1, Z2, ... , Zi -1, and is therefore independent of Z;.
Similarly E(Z; ZJ l(M:;:jj) = E(ZJ )E(Zi l(M:;:jj) = 0 if i < j. It follows that
E(Tli) =

00

00

i=1

i=1

2:: E(Zf)lP'(M 2:: i) = a 2 2:: lP'(M 2:: i) = a 2E(M).


371

[10.2.3]-[10.2.5]

Solutions

Renewals

3. (i) The shortest way is to observe that N(t) + k is a stopping time if k :::: 1. Alternatively, we
have by Wald's equation that JE(TN(t)+I) = Jl(m(t) + 1). Also

lE(XN(t)+k) = lE{lE(XN(t)+k N(t))} =

k:::: 2,

jl,

and therefore, for k :::: 1,


k

lE(TN(t)+k) = lE(TN(t)+I)

+ 2:JE(XN(t)+j) =

Jl(m(t)

+ k).

j=2

(ii) Suppose p

=f: 1 and
li"(Xt=a)= {

Then f.l = 2 - p

=f: 1.

if a= 1,

1- p

if a= 2.

Also

JE(TN(I)) = (1- p)JE(To 1 N(1) = O)


whereas m(1) = p. Therefore JE(TN(I))

=f:

+ plE(Tt

N(1) = 1) = p,

Jlm(l).

4. Let V(t) = N(t) + 1, and let W1, W2, ... be defined inductively as follows. W1 = V(1), W2
is obtained similarly to Wt but relative to the renewal process starting at the V(1)th renewal, i.e., at
time TN(I)+1 and Wn is obtained similarly:
Wn = N(Txn-l

+ 1)- N(Txn_ 1 ) + 1,

n:::: 2,

whereXm = W1 + W2+ + Wm. Foreachn, Wn isindependentofthesequence W1, W2, ... , Wn-1


and therefore the Wn are independent copies of V ( 1). It is easily seen, by measuring the time-intervals
covered, that V(t).:::; 2::[~ 1 W;, and hence
1
1 ftl
- V(t).:::; -2: Wi--+ lE(V(l))

a.s. and in mean, as t--+ oo.

t i=I

It follows that the family { m- 1 2::~ 1 W; : m :::: 1} is uniformly integrable (see Theorem (7.10.3)).
Now N(t).:::; V(t), and so {N(t)/t: t:::: 0} is uniformly integrable also.
Since N (t) It ~ J.l- 1, it follows by uniform integrability that there is also convergence in mean.

5.

(a) Using the fact that lP'(N(t) = k) = lP'(Sk .:::; t) -li"(Sk+1 .:::; t), we find that
lE(sN(T)) =

fooo (~ sklP'(N(t) = k)) ve-vt dt


~ sk {fooo [li"(Sk .:::; t) -lP'(Sk+I .:::; t)] ve-vt dt} .

By integration by parts, J000 lP'(Sk.:::; t)ve-vt dt = M(-v)k fork:::: 0. Therefore,


lE(sN(T)) = f>k{M(-v)k _ M(-v)k+I} = 1- M(-v).
k=O
1- sM(-v)
(b) In this case, lE(sN(T)) = lE(eAT(s-l)) = MT(A.(s -1)). When T has the given gamma distribution,
MT(e) = {vj(v- e)}b, and

lE(sN(T)) =

(-v-) (1 - ~)
b

v+A.

v+A.

The coefficient of sk may be found by use of the binomial theorem.

372

Excess life

Solutions

[10.3.1]-[10.3.3]

10.3 Solutions. Excess life


1. Let g (y) = lP'( E (t) > y), assumed notto depend on t. By the integral equation for the distribution
of E(t),

+ y) + g(y) dF(x).
Write h(x) = 1- F(x) to obtain g(y)h(t) = h(t + y), for y, t :=:: 0. With t = 0, we have that
g(y)h(O) = h(y), whence g(y) = h(y)/ h(O) satisfies g(t + y) = g(t)g(y), for y, t :=:: 0. Now g is
g(y)

= 1-

F(t

left-continuous, and we deduce as usual that g(t)


the renewal process is a Poisson process.

= e-At for some A..

Hence F(t)

1 -e-At, and

2.

(a) Examine a sample path of E. If E(t) = x, then the sample path decreases (with slope -1)
until it reaches the value 0, at which point it jumps to a height X, where X is the next interarrival time.
Since X is independent of all previous interarrival times, the process is Markovian.
(b) In contrast, C has sample paths which increase (with slope 1) until a renewal occurs, at which they
drop to 0. If C (s) = x and, in addition, we know the entire history of the process up to time s, the
time of the next renewal depends only on the length ofthe spent period (i.e., x) of the interarrival time
in process. Hence C is Markovian.

3.

(a) We have that


lP'(E(t) _:::: y) = F(t

+ y)

-l

G(t

+ y- x)dm(x)

where G(u) = 1- F(u). Check the conditions of the key renewal theorem (10.2.7): g(t) = G(t + y)
satisfies:
(i) g(t) :::: 0,
(ii) f0 g(t) dt .:::= fcF[1 - F(u)] du = E(X 1) < oo,
(iii) g is non-increasing.
We conclude, by that theorem, that

00

1
lim lP'(E(t) _:::: y) = 1 - -

t-+00

f1

100 g(x)dx =
0

loy -[11
F(x)]dx.
0 f1

(b) Integrating by parts,

i oo
o

xr
1
-[1-F(x)]dx=11

11

laoo

xr+1
E(xr+1)
--dF(x)=
1
.
M(r + 1)

o r+1

See Exercise (4.3.3).


(c) As in Exercise (4.3.3), we have that E(E(tY) = J0 ryr- 11P'(E(t) > y) dy, implying by(*) that

00

E(E(tn = E({CX1- t)+n

+ i:ol~ 0 ryr- 1 1P'(X 1

> t

+ y -x)dm(x)dy,

whence the given integral equation is valid with

Now h satisfies the conditions of the key renewal theorem, whence


lim E(E(tY)

t-+oo

= .!._ { 00 h(u) du = .!._ { {


yr dF(u + y) du
11 Jo
11 JJo<u,y<oo
1 ioo r
E(X~+1)
=y lP'(X1 >y)dy=
.
11 o
J1(r + 1)
373

[10.3.4]-[10.3.5]

4.

Renewals

Solutions

We have that
lP'(E(t)>yjC(t)=x)=lP'(Xt>y+xiXt>x)=

1- F(y +x)

1- F(x)

whence
E(E(t)jC(t)=x)= fool-F(y+x)dy=E{(Xt-x)+}_
Jo
1- F(x)
1- F(x)

5. (a) Apply Exercise (10.2.2) to the sequence X;- f-L, 1 :::= i < oo, to obtain var(TM(t)- t-LM(t)) =
a 2 E(M(t)).
(b) Clearly TM(t) = t + E(t), where E is excess lifetime, and hence t-LM(t) = (t + E(t))- (TM(t)t-LM(t)), implying in tum that
f-L 2 var(M(t)) =

where

SM(t)

var(E(t)) + var(SM(t))- 2cov(E(t),

SM(t)),

= TM(t)- {LM(t). Now

as t --+ oo

ifE(XI) < oo (see Exercise (10.3.3c)), implying that


1
- var(E(t)) --+ 0
t

as t --+ oo.

This is valid under the weaker assumption that E(XI) < oo, as the following argument shows. By
Exercise (10.3.3c),
E(E(t) 2 ) = a(t) +lot a(t- u) dm(u),
where a(u) = JE:({(Xt- u)+} 2 ). Now use the key renewal theorem together with the fact that
a(t):::: E(XIl(x 1 >tJ)--+ Oast--+ oo.
Using the Cauchy-Schwarz inequality,

as t --+ oo, by part (a) and (** ). Returning to (*), we have that
f-L2
-var(M(t))--+
lim
t
t--'>00

{a2

-(m(t)+l) } =a2
-.
t
{L

374

Solutions [10.4.1]-[10.5.3]

Renewal-reward processes

10.4 Solution. Applications


1. Visualize a renewal as arriving after two stages, type 1 stages being exponential parameter A and
type 2 stages being exponential parameter J-t. The 'stage' process is the flip-flop two-state Markov
process of Exercise (6.9.1). With an obvious notation,
Pn(t)

= _A_e-(A+JL)t + _1-t_.
A+J-t

A+J-t

Hence the excess lifetime distribution is a mixture of the exponential distribution with parameter J-t,
and the distribution of the sum of two exponential random variables, thus,

where g(x) is the density function of a typical interarrival time. By Wald's equation,
E(t

+ E(t)) = E(SN(t)+1) = E(X1)E(N(t) + 1) =

We substitute

( + ~)

(m(t)

+ 1).

p 11 (t)
(-A1+-1-t1) + (1- Pn(t))-1-t1=1-t-1 + A

E(E(t)) = Pn(t)
to obtain the required expression.

10.5 Solutions. Renewal-reward processes


1. Suppose, at time s, you are paid a reward at rate u(X(s)). By Theorem (10.5.10), equation
(10.5.7), and Exercise (6.9.11b),

lot

1
t 0

Suppose lu (i) I .::; K < oo for all i

a.s.

1
J-tjgj

l{X(s)=j)dS ~ - - = lfj.

S, and let F be a finite subset of the state space. Then

where T1 (F) is the total time spent in F up to time t. Take the limit as t
F t S, to obtain the required result.

oo using (*), and then as

2. Suppose you are paid a reward at unit rate during every interarrival time of type X, i.e., at all
times tat which M(t) is even. By the renewal-reward theorem (10.5.1),
1

lot I

a.s. E(reward during interarrival time)


EX 1
=
.
E(length of interarrival time)
EX 1 + EY1

M s is even ds ~

o { ()

3.

Suppose, at timet, you are paid a reward at rate C(t). The expected reward during an interval
(cycle) of length X is fox s ds = ~ X 2 , since the age C is the same at the time s into the interval. The

375

[10.5.4]-[10.6.3]

Solutions

Renewals

result follows by the renewal-reward theorem (10.5.1) and equation (10.5.7). The same conclusion is
valid for the excess lifetime E(s), the integral in this case being fox (X- s) ds = ~ X 2 .
4. Suppose Xo = j. Let V1 = min{n ::: 1 : Xn = j, Xm = k for some 1 :S m < n}, the first visit
to j subsequent to a visit to k, and let Vr+1 = min{n ::: Vr : Xn = j, Xm = k for some Vr + 1 :S
m < n}. The Vr are the times of a renewal process. Suppose a reward of one ecu is paid at every visit
to k. By the renewal-reward theorem and equation (10.5.7),

By considering the time of the first visit to k,


E(V1

I Xo = j) = E(Tk I Xo = j) +E(1) I Xo = k).

The latter expectation in (*) is the mean of a random variable N having the geometric distribution
f'(N = n) = p(l - p)n- 1 for n ::: 1, where p = f'(1) < Tk I Xo = k). Since E(N) = p- 1, we
deduce as required that
1/f'(1) < Tk I Xo = k)
lfk = - - - - " - - - - - - - - -

E(n

I Xo

= j) + E(1j

I Xo

= k)

10.6 Solutions to problems


1. (a) For any n, JFl'(N(t) < n) :S JFl'(Tn > t)--+ 0 as t--+ oo.
(b) Either use Exercise (10.1.1), or argue as follows. Since JL > 0, there exists E (> 0) such that
f'(X1 >E) > 0. For all n,
f'(Tn :S nE) = 1- f'(Tn > nE) :S 1- f'(X1 > E)n < 1,

so that, if t > 0, there exists n = n(t) such that f'(Tn :S t) < 1.


Fix t and let n be chosen accordingly. Any positive integer k may be expressed in the form
k =an+ fj where 0 :S fj < n. Now JFl'(Tk :S t) :S f'(Tn :S t)a for an :S k < (a+ 1)n, and hence
00

00

k=1

a=O

L JFl'(Tk :S t) :S L nf'(Tn :S t)a <

m(t) =

00.

(c) It is easiest to use Exercise (10.1.1), which implies the stronger conclusion that the moment
generating function of N (t) is finite in a neighbourhood of the origin.
2.

(i) Condition on X 1 to obtain

,,
2
v(t)= Jo E{(N(t-u)+1) }dF(u)= Jo {v(t-u)+2m(t-u)+1}dF(u).

Take Laplace-Stieltjes transforms to find that v* = (v* +2m*+ 1)F*, where m* = F* + m* F* as


usual. Therefore v* = m*(l +2m*), which may be inverted to obtain the required integral equation.
(ii) If N is a Poisson process with intensity A., then m(t) =At, and therefore v(t)
3.

Fix x

lit Then

376

= (A.t) 2 + A.t.

Solutions [10.6.4]-[10.6. 7]

Problems
wherea(t) = l<tfp,) +xv'ta2jp,3j. Now,

lP'(T () < t) = lP' ( Ta(t) - p,a(t) < t - p,a(t))


a t a ..j(i(i)
- a ..j(i(i) .
However a(t)

tIp, as t --+ oo, and therefore


t - p,a(t)

---'==='- --+
a ..j(i(i)

as t--+ oo,

-X

implying by the usual central limit theorem that


lP'(N(t)-(t/p,)
y'ta2jp,3

~x)

__... <1>(-x)

as t--+ oo

where <I> is the N(O, 1) distribution function.


An alternative proof makes use of Anscombe's theorem (7.11.28).
4.

We have that, for y .::0 t,


lP'(C(t) ~ y) = lP'(E(t- y) > y)--+ lim lP'(E(u) > y)
U---+00

as t--+ oo

= {oo !._[1- F(x)] dx

}y

1-i

by Exercise (10.3.3a). The current and excess lifetimes have the same asymptotic distributions.

5. Using the lack-of-memory property of the Poisson process, the current lifetime C(t) is independent
of the excess lifetime E(t), the latter being exponentially distributed with parameter J..... To derive the
density function of C(t) either solve (without difficulty in this case) the relevant integral equation, or
argue as follows. Looking backwards in time from t, the arrival process looks like a Poisson process
up to distance t (at the origin) where it stops. Therefore C(t) may be expressed as min{Z, t} where Z
is exponential with parameter J....; hence
J....e-J..s

!qt)(s) = { 0

and lP'(C(t) = t) = e-At. Now D(t)


convolution formula) to be as given.

C(t)

if s < t
f - ,
s > t,

+ E(t),

whose distribution is easily found (by the

6. The ith interarrival time may be expressed in the form T + Zi where Zi is exponential with
parameter J..... In addition, Z1, Zz, ... are independent, by the lack-of-memory property. Now

1- F(x)

= lP'(T + zl

> x)

= lP'(Zl

>X- T)

= e-A(x-T)'

X~

T.

Taking into account the (conventional) dead period beginning at time 0, we have that

lP'(N(t)

~ k)

= lP'(kT

+ ~ Zi

.::0 t) = lP'(N(t- kT)

~ k),

kT,

l=l

where N is a Poisson process.


7. We have that X1 = L + E(L) where Lis the length of the dead period beginning at 0, and E(L)
is the excess lifetime at L. Therefore, conditioning on L,

377

[10.6.8]-[10.6.10]

Solutions

Renewals

We have that
lP'(E(t) s y) = F(t

+ y)

-lot {1-

F(t

+ y -x)}dm(x).

By the renewal equation,

m(t

+ y) =

F(t

rr+y F(t + y- x)dm(x),

+ y) + Jo

whence, by subtraction,

t+y

{1-F(t+y-x)}dm(x).

lP'(E(t)sy)= t
It follows that

1P'(X 1 sx)=

{x

{x {1-F(x-y)}dm(y)dFL(l)

}z=O }y=l

= [FL(l) {x{1-F(x-y)}dm(y)]x

~0

r FL(l){1-F(x-l)}dm(l)

using integration by parts. The term in square brackets equals 0.

8. (a) Each interarrival time has the same distribution as the sum of two independent random variables
with the exponential distribution. Therefore N(t) has the same distribution as L~ M(t)J where M is a
Poisson process with intensity A. Therefore m(t) = ~JE(M(t))- ilP'(M(t) is odd). Now JE(M(t)) =
At, and
oo (At)Zn+le-At
lP'(M(t) is odd)="'"'
= ~e-A.t(eA.t- e-At).
~ (2n + 1)!

n=O

With more work, one may establish the probability generating function of N(t).
(b) Doing part (a) as above, one may see that iii(t) = m(t).

9.

Clearly C(t) and E(t) are independent if the process N is a Poisson process. Conversely, suppose
that C(t) and E(t) are independent, for each fixed choice oft. The event {C(t) 0:: y} n {E(t) 0:: x}
occurs if and only if E (t - y) ::=: x + y. Therefore
lP'(C(t) ::=: y)lP'(E(t) ::=: x) = lP'(E(t- y) ::=: x

+ y).

Take the limit as t ~ oo, remembering Exercise (10.3.3) and Problem (10.6.4), to obtain that
G(y)G(x) = G(x + y) if x, y ::=: 0, where

00

G(u) =

-[1- F(v)] dv.


f.-i

Now 1- G is a distribution function, and hence has the lack-of-memory property (Problem (4.14.5)),
implying that G(u) = e-Jcu for some A. This implies in tum that [1- F(u)]/ f.-i = -G'(u) = Ae-Jcu,
whence f.-i = 1/A and F(u) = 1- e-Jcu.

10. Clearly N is a renewal process if Nz is Poisson. Suppose that N is a renewal process, and write
A for the intensity of N 1 , and Fz for the interarrival time distribution of Nz. By considering the time
X 1 to the first arrival of N,

378

Solutions [10.6.11]-(10.6.12]

Problems

Writing E, Ei for the excess lifetimes of N, Ni, we have that

Take the limit as t

oo, using Exercise (10.3.3), to find that

1
100 -[1-

F(u)] du = e-Ax

xf.L

100 -[11
F2(u)]

du,

xf.L2

where f.L2 is the mean of F2. Differentiate, and use(*), to obtain

which simplifies to give 1- F2(x) = c


equation has solution F2 (x) = 1 - e -ex.

fx

00

[1- F2(u)] du where c = AJL/(JL2- J.L); this integral

11. (i) Taking transforms of the renewal equation in the usual way, we find that
m* e -

F*(e)

< ) - 1- F*(e)

where
as

-----1
1- F*(e)

F*(e) = E(e-ex,) = 1- elL+ ~e 2 (JL 2

+a 2 ) +o(e 2 )

e ~ 0. Substitute this into the above expression to obtain

and expand to obtain the given expression. A formal inversion yields the expression form.
(ii) The transform of the right-hand side of the integral equation is

By Exercise (10.3.3), Ff:(e) = [1 - F*(e)]/(J.Le), and(*) simplifies to m*(e) - (m* - m* F* F*)/(J.Le), which equals m*(e) since the quotient is 0 (by the renewal equation).
Using the key renewal theorem, as t ~ oo,

lo [1t

looo [1- FE(u)]du = --JE(X2)


a2 + JL2
=
2

FE(t -x)]dm(x) ~JL 0

2JL

by Exercise (10.3.3b). Therefore,

12. (i) Conditioning on X 1, we obtain

379

2JL

[10.6.13]-[10.6.16]

Solutions

Renewals

Thereforemd* = Fd* +m*Fd*. Alsom* = F* +m*F*,sothat

whence md* = Fd* + md* F*, the transform of the given integral equation.
(ii)ArguingasinProblem(10.6.2), vd* = Fd*+2m*Fd*+v*Fd*wherev* = F*(l+2m*)/(l-F*)
is the corresponding object in the ordinary renewal process. We eliminate v* to find that

by(*). Now invert.

13. Taking into account the structure of the process, it suffices to deal with the case I = 1. Refer
to Example (10.4.22) for the basic notation and analysis. It is easily seen that fJ = (v - 1)A.. Now
F(t) = 1- e-vA.t. Solve the renewal equation (10.4.24) to obtain

+lot

g(t) = h(t)

h(t- x) diii(x)

where iii(x) = vA.x is the renewal function associated with the interarrival time distribution
Therefore g(t) = 1, andm(t) = ef3t.

F.

14. We have from Lemma (10.4.5) that p* = 1 - Fz + p* F*, where F* = Fy Fz. Solve to obtain

*
p

1- Fz
1- F*F*
y z

15. The first locked period begins at the time of arrival of the first particle. Since all future events
may be timed relative to this arrival time, we may take this time to be 0. We shall therefore assume
that a particle arrives at 0; call this the Oth particle, with locking time Yo.
We shall condition on the time X 1 of the arrival of the next particle. Now
lP'(L > t

I X1

= u) = {

lP'(Yo>t)

ifu>t,

lP'(Yo > u)lP'(L' > t - u)

ifu ~ t,

where L' has the same distribution as L; the second part is a consequence of the fact that the process
'restarts' at each arrival. Therefore
lP'(L > t) = (1- G(t))lP'(X1 > t)

+lot

lP'(L > t - u)(1- G(u))fx 1 (u)du,

the required integral equation.


If G(x) = 1 - e-11-x, the solution is lP'(L > t) = e-11-t, so that L has the same distribution as
the locking times of individual particles. This striking fact may be attributed to the lack-of-memory
property of the exponential distribution.

16. (a) It is clear that M(tp) is a renewal process whose interarrival times are distributed as X1 +
Xz + + XR where lP'(R = r) = pqr- 1 for r:::: 1. It follows that M(t) is a renewal process whose
first interarrival time
X(p) = inf{t : M(t) = 1} = p inf{t : M(tp) = 1}

380

Solutions [10.6.17]-[10.6.19]

Problems

has distribution function


00

lP'(X(p)::::: x) = LlP'(R = r)Fr(xjp).


r=1

(b) The characteristic function QJp of Fp is given by


QJp(t)

pqr-1100 eixt dFr(tfp)

r=1

-oo

pqr-14J(pt{

pqy(pt)

1 - qqy(pt)

r=1

where qy is the characteristic function of F. Now qy (pt) = 1 + i Jlpt + o(p) as p ,), 0, so that
QJ (t)

1 + ijlpt + o(p)
1-itLt+o(1)

1 + o(l)
1-itLt

as p ,), 0. The limit is the characteristic function of the exponential distribution with mean fL, and the
continuity theorem tells us that the process M converges in distribution as p ,), 0 to a Poisson process
with intensity 1I fL (in the sense that the interarrival time distribution converges to the appropriate
limit).
(c) If M and N have the same fdds, then c/Jp(t) = qy(t), which implies that qy(pt) = qy(t)f(p + qqy(t)).
Hence 1/f(t) = c/J(t)- 1 satisfies 1/f(pt) = q + pl/f(t) fort E JR. Now 1/f is continuous, and it follows as
in the solution toProblem(5.12.15) that 1/f has the form 1/f(t) = 1+,Bt, implyingthatqy(t) = (1 +.Bt)- 1
for some .B E :::. The only characteristic function of this form is that of an exponential distribution,
and the claim follows.

17. (a) Let N(t) be the number of times the sequence has been typed up to the tth keystroke. Then N is
a renewal process whose interarrival times have the required mean fL; we have that JE(N(t))/t --+ fL- 1
as t --+ oo. Now each epoch of time marks the completion of such a sequence with probability
( 1 0 ) 14 , so that

~lE(N(t)) = ~
t

t (-1-) (-1-)
14--+

n= 14

100

100

14

as t--+ oo,

implying that fL = 1028 .


The problem with 'omo' is 'omomo' (i.e., appearances may overlap). Let us call an epoch
of time a 'renewal point' if it marks the completion of the word 'omo', disjoint from the words
completed at previous renewal points. In each appearance of 'omo', either the first 'o' or the second
'o' (but not both) is a renewal point. Therefore the probability un, that n is a renewal point, satisfies
( 1 0 ) 3 = un + un-2( 1 0 ) 2 . Average this over n to obtain

C~or = n~moo ~ j; {Un +un-2 C~or} = ~ + ~ C~or,


and therefore fL = 106 + 102 .
(b) (i) Arguing as for 'omo', we obtain p 3 = un + pun-1 + p 2un-2, whence p 3 = (1 + p + p 2 )/ fL.
(if) Similarly, p 2q = Un + pqun-2, so that fL = (1 + pq)f(p 2q).
18. The fdds of {N(u) - N(t) : u 2:: t} depend on the distributions of E(t) and of the interarrival
times. In a stationary renewal process, the distribution of E(t) does not depend on the value oft,
whence {N(u) - N(t) : u 2:: t} has the same fdds as {N(u) : u 2:: 0}, implying that X is strongly
stationary.

19. We use the renewal-reward theorem. The mean time between expeditions is B fL, and this is
the mean length of a cycle of the process. The mean cost of keeping the bears during a cycle is
B(B - 1)cfL, whence the long-run average cost is {d + B(B - 1)ctL/2}/(B fL).

381

11

Queues

11.2 Solutions. M/M/1


1. The stationary distribution satisfies 1f =
equations

1f P

when it exists, where P is the transition matrix. The


Plfn-1
lfn = - 1+p

lfn+1
+1+p

for n > 2,
-

with l::~o :rri = 1, have the given solution. If p ;::: 1, no such solution exists. It is slightly shorter to
use the fact that such a walk is reversible in equilibrium, from which it follows that 1f satisfies
for n ;::: 1.
2. (i) This continuous-time walk is a Markov chain with generator given by g01 = Bo, gn,n+ 1 =
Bnp/(1 + p) and gn,n-1 = Bn/(1 + p) for n ;::: 1, other off-diagonal terms being 0. Such a process
is reversible in equilibrium (see Problem (6.15.16)), and its stationary distribution v must satisfy
Vngn,n+1 = Vn+1gn+1,n These equations may be written as
for n ;::: 1.
These are identical to the equations labelled(*) in the previous solution, with :rrn replaced by vnBn. It
follows that Vn = C:rrn/Bn for some positive constant C.
(ii) If Bo = A., Bn =A.+ f.-i for n ;::: 1, we have that

whence C = 2A. and the result follows.


3. Let Q be the number of people ahead of the arriving customer at the time of his arrival. Using the
lack -of-memory property of the exponential distribution, the customer in service has residual servicetime with the exponential distribution, parameter f.-t, whence W may be expressed as S 1 + S2 + + S Q,
the sum of independent exponential variables, parameter f.-t. The characteristic function of W is
<f>w(t) =E{lE(eitW

I Q)}

=E{ (f.-t~it)Q}

-c--_1_-,...,---P--,- = (1 _ p)
1-pf.-t/({t-it)

382

+p

1-i - A. ) .
{t-A.-it

Solutions

M/M/1

[11.2.4]-[11.2.7]

This is the characteristic function of the given distribution. The atom at 0 corresponds to the possibility
that Q = 0.
4. We prove this by induction on the value of i + j. If i + j = 0 then i = j = 0, and it is easy to
check that rr(O; 0, 0) = 1 and A(O; 0, 0) = 1, A(n; 0, 0) = 0 for n ::: 1. Suppose then that K ::: 1,
and that the claim is valid for all pairs (i, j) satisfying i + j = K. Let i and j satisfy i + j = K + 1.
The last ball picked has probability i /(i + j) of being red; conditioning on the colour of the last ball,
we have that
rr(n;i,j)= "+i .rr(n-1;i-1,j)+ "+j .rr(n+1;i,j-1).
l
1
l
1
Now (i - 1) + j = K = i + ( j - 1). Applying the induction hypothesis, we find that
rr(n; i, j) = i

~j

{ A(n- 1; i - 1, j)- A(n; i - 1, j)}

+ i

~j

{ A(n + 1; i, j - 1)- A(n + 2; i, j - 1) }

Substitute to obtain the required answer, after a little cancellation and collection of terms. Can you
see a more natural way?

5.

Let A and B be independent Poisson process with intensities J.... and p, respectively. These processes
generate a queue-process as follows. At each arrival time of A, a customer arrives in the shop. At
each arrival-time of B, the customer being served completes his service and leaves; if the queue
is empty at this moment, then nothing happens. It is not difficult to see that this queue-process is
M(J....)/M(p,)/1. Suppose that A(t) = i and B(t) = j. During the time-interval [0, t], the order of
arrivals and departures follows the schedule of Exercise (11.2.4), arrivals being marked as red balls
and departures as lemon balls. The imbedded chain has the same distributions as the random walk
of that exercise, and it follows that JP>(Q(t) = n I A(t) = i, B(t) = j) = rr(n; i, j). Therefore
Pn (t) = 'E.i,j rr(n; i, j)JP>(A(t) = i)JP>(B(t) = j).
6.

With p

= A/ p,, the stationary distribution of the imbedded chain is, as in Exercise (11.2.1 ),

{ io - p)
Jrn =
iO - p2)pn-1
~

if n = 0,
if n::: 1.

In the usual notation of continuous-time Markov chains, go=


by Exercise (6.10.11), there exists a constant c such that
rro

Now'.; rr;

;J.... (1- p),

Jrn

J....

and gn =

J....

+ p, for n :::

1, whence,

2
1 for n::: 1.
2 (/...: p,) (1- p )pn-

= 1, and therefore c = 2/... and Jrn = (1 -

p)pn as required. The working is reversible.

7. (a) Let Q; (t) be the number of people in the ith queue at timet, including any currently in service.
The process Q1 is reversible in equilibrium, and departures in the original process correspond to arrivals
in the reversed process. It follows that the departure process of the first queue is a Poisson process
with intensity J...., and that the departure process of Q 1 is independent of the current value of Q1
(b) We have from part (a) that, for any given t, the random variables Q1 (t), Q2(t) are independent.
Consider an arriving customer when the queues are in equilibrium, and let W; be his waiting time
(before service) in the ith queue. With T the time of arrival, and recalling Exercise (11.2.3),
JP>(W1

= 0,

W2

= 0)

> JP>(Q;(T)

= (1 -

= 0 fori= 1, 2) = lP'(Q1 (T) = O)JP>(Q2(T) = 0)


= lP'(W1 = O)JP>(W2 = 0).

P1)(1 - P2)

Therefore W1 and W2 are not independent. There is a slight complication arising from the fact that T
is a random variable. However, T is independent of everybody who has gone before, and in particular
of the earlier values of the queue processes Q;.

383

[11.3.1]-[11.4.1]

Solutions

Queues

11.3 Solutions. M/G/1


1.

In equilibrium, the queue-length Qn just after the nth departure satisfies

where An is the number of arrivals during the (n + l)th service period, and h(m) = 1 - 8mo Now
Qn and Qn+l have the same distribution. Take expectations to obtain

0 = lE(An) - IP'(Qn > 0),


where lE (An) = A.d, the mean number of arrivals in an interval of length d. Next, square (*) and take
expectations:

Use the facts that An is independent of Qn, and that Qnh(Qn)


0

= {(A.d) 2 + A.d} + IP'(Qn

> 0)

Qn, to find that

+ 2{ (A.d- l)lE(Qn)- A.dlP'(Qn >

0)}

and therefore, by (** ),


2p- p2
lE(Qn) = 2 (1 _ p)

From the standard theory, Ms satisfies Ms(s) = Ms(s - A. + A.Ms(s)), where Ms(e)
~-t/(J-t- e). Substitute to find that x = Ms (s) is a root of the quadratic A.x 2 - x(A. + 1-t- s) + 1-t = 0.
For some small positive s, M B (s) is smooth and non-decreasing. Therefore M B (s) is the root given.

2.

3. Let Tn be the instant of time at which the server is freed for the nth time. By the lack-of-memory
property of the exponential distribution, the time of the first arrival after Tn is independent of all
arrivals prior to Tn, whence Tn is a 'regeneration point' of the queue (so to say). It follows that the
times which elapse between such regeneration points are independent, and it is easily seen that they
have the same distribution.

11.4 Solutions. G/M/1


1. The transition matrix of the imbedded chain obtained by observing queue-lengths just before
arrivals is
0
...
0 ...

ao

The equation :rr = :rr P A may be written as


00

Trn

= L ai:rrn+i-1

for n 2:: 1.

i=O

It is easily seen, by adding, that the first equation is a consequence of the remaining equations, taken
in conjunction with L:Q" :rri = 1. Therefore :rr is specified by the equation for :rrn, n 2:: 1.

384

Solutions [11.4.2]-[11.5.2]

G/G/1

The indicated substitution gives


00

en= en-1 l.:a;ei


i=O

which is satisfied whenever e satisfies

It is easily seen that A(8) = Mx(~-t(8 - I)) is convex and non-decreasing on [0, 1], and satisfies
A(O) > 0, A(l) = 1. Now A'(l) = ~-tlE(X) = p- 1 > 1, implying that there is a unique 'f/ E (0, 1)
such that A('f/) = 'f'J. With this value of 'f'J, the vector 1r given by Trj = (1 - TJ)'f/1, j ;:: 0, is a
stationary distribution of the imbedded chain. This 1r is the unique such distribution because the chain
is irreducible.

2. (i) The equilibrium distribution is Trn = (1- TJ)'f'Jn for n 2: 0, with mean l:~o nnn = TJ/0- TJ).
(ii) Using the lack-of-memory property of the service time in progress at the time of the arrival, we
see that the waiting time may be expressed as W = S1 + S2 + + S Q where Q has distribution 1r,
given above, and the Sn are service times independent of Q. Therefore
JE(W)

= lE(S1)lE(Q) = ___!}_j__.
(1- 'f/)

We have that Q(n+) = 1 + Q(n-) a.s. foreachintegern, whencelim 1 _, 00 lP'(Q(t) = m) cannot


exist.
Since the traffic intensity is less than 1, the imbedded chain is ergodic with stationary distribution
as in Exercise (11. 4.1).

3.

11.5 Solutions. G/G/1


1. Let Tn be the starting time of the nth busy period. Then Tn is an arrival time, and also the
beginning of a service period. Conditional on the value of Tn, the future evolution of the queue is
independent of the past, whence the random variables {Tn+l - Tn : n :::: 1} are independent. It is
easily seen that they are identically distributed.
2. If the server is freed at time T, the time I until the next arrival has the exponential distribution
with parameter J-t (since arrivals form a Poisson process).
By the duality theory of queues, the waiting time in question has moment generating function
Mw(s) = (1- 0/(1- M1(s)) where M1(s) = ~-t/(J-t- s) and = lP'(W > 0). Therefore,

Mw(s)=

s~-t( 1 -0 +(1-S),

~-tO-n-s

the moment generating function of a mixture of an atom at 0 and an exponential distribution with
parameter ~-t(1 - S).
If G is the probability generating function of the equilibrium queue-length, then, using the lackof-memory property of the exponential distribution, we have that Mw(s) = G(~-t/(J-t- s)), since W is
the sum of the (residual) service times of the customers already present. Set u = ~-t/(J-t- s) to find that
G(u) = (1- 0/(1- su), the generating function of the mass function f(k) = (1- Osk fork::: 0.

385

[11.5.3]-(11.7.2]

Solutions

Queues

It may of course be shown that ~ is the smallest positive root of the equation x = M x (~-t(x where X is a typical interarrival time.
3.

1)),

We have that
1- G(y)

= lP'(S- X>

y)

lx:;

lP'(S > u

+ y)dFx(u),

JR,

where Sand X are typical (independent) service and interarrival times. Hence, formally,
dG(y) = - {

lo

00

dlP'(S > u

+ y) dFx(u)

= dy

1-y ~-te-JJ,(u+y)
00

dFx(u),

since fs(u + y) = e-JJ,(u+y) if u > -y, and is 0 otherwise.


With F as given,

x F(x- y)dG(y) = {{
{I
-oo
}}-oo<y::::X
-y<u<OO

-7Je-JJ,(l-1})(x-y)}~-te-JJ,(u+y) dFx(u)dy.

First integrate over y, then over u (noting that F x (u) = 0 for u < 0), and the double integral collapses
to F(x), when x ::::: 0.

11.6 Solution. Heavy traffic


1.

Qp has characteristic function

Therefore the characteristic function of (1- p)Qp satisfies


1- p
1
cl>p((l- p)t) = 1- pei(I-p)t --+ 1- it

asptl.

The limit characteristic function is that of the exponential distribution, and the result follows by the
continuity theorem.

11.7 Solutions. Networks of queues


1. The first observation follows as in Example (11.7.4). The equilibrium distribution is given as in
Theorem (11.7.14) by
c

n(n) =

ni -a

a. e
II -'n'

'

1 --

i=I

forn = (nJ,nz, .. . ,nc) E 71/,

the product of Poisson distributions. This is related to Bartlett's theorem (see Problem (8.7.6)) by
defining the state A as 'being in station i at some given time'.

2. The number of customers in the queue is a birth-death process, and is therefore reversible in
equilibrium. The claims follow in the same manner as was argued in the solution to Exercise (11.2.7).

386

Solutions

Problems

[11.7.3]-[11.8.1]

3. (a) We may take as state space the set {0, 11, 111 , 2, 3, ... }, where i E {0, 2, 3, ... } is the state
of having i people in the system including any currently in service, and 11 (respectively 111 ) is the
state of having exactly one person in the system, this person being served by the first (respectively
second) server. It is straightforward to check that this process is reversible in equilibrium, whence the
departure process is as stated, by the argument used in Exercise (11.2.7).
(b) This time, we take as state space the set {0', 0 11 , 11 , 1", 2, 3, ... } having the same states as in part
(a) with the difference that O' (respectively O") is the state in which there are no customers present and
the first (respectively second) server has been free for the shorter time. It is easily seen that transition
from 01 to 111 has strictly positive probability whereas transition from 111 to 0 1 has zero probability,
implying that the process is not reversible. By drawing a diagram of the state space, or otherwise, it
may be seen that the time-reversal of the process has the same structure as the original, with the unique
change that states 01 are 011 are interchanged. Since departures in the original process correspond to
arrivals in the time-reversal, the required properties follow in the same manner as in Exercise (11.2.7).
4. The total time spent by a given customer in service may be expressed as the sum of geometrically
distributed number of exponential random variables, and this is easily shown to be exponential with
parameter 8~-t. The queue is therefore in effect aM(A.)/M(8~-t)ll system, and the stationary distribution
is the geometric distribution with parameter p = A./(8~-t), provided p < 1. As in Exercise (11.2.7),
the process of departures is Poisson.
Assume that rejoining customers go to the end of the queue, and note that the number of customers
present constitutes a Markov chain. However, the composite process of arrivals is not Poisson, since
increments are no longer independent. This may be seen as follows. In equilibrium, the probability of
an arrival of either kind during the time interval (t, t +h) is A.h + P~-tO- 8)h +o(h) = (A.j8)h + o(h).
If there were an arrival of either kind during (t- h, t), then (with conditional probability 1 - O(h))
the queue is non-empty at time t, whence the conditional probability of an arrival of either kind during
(t, t +h) is A.h + ~-tO - 8)h + o(h); this is of a larger order of magnitude than the earlier probability
(A./8)h + o(h).
5. For stations r, s, we write r --+ s if an individual at r visits s at a later time with a strictly positive
probability. Let C comprise the station j together with all stations i such that i --+ j. The process
restricted to C is an open migration process in equilibrium. By Theorem (11. 7 .19), the restricted
process is reversible, whence the process of departures from C via j is a Poisson process with some
intensity t;. Individuals departing C via j proceed directly to k with probability
Ajk~j(nj)

Ajk

1-tj~j(nj)+l:.rrtCAjr~j(nj) =

1-tj+l:.rrtCAjr'

independently of the number nj of individuals currently at j. Such a thinned Poisson process is a


Poisson process also (cf. Exercise (6.8.2)), and the claim follows.

11.8 Solutions to problems


1. Although the two cases may be done together, we choose to do them separately. When k = l,
the equilibrium distribution 1r satisfies:
J-t7r1 - A.1ro = 0,
J-t1rn+1 -(A.+ J-t)7rn + A'lrn-1 = 0,

1::::; n < N,

-J-t'lrN + A'lrN-1 = 0,

a system of equations with solution 7rn

= 7ro(A./ J-t)n for 0::::; n

N
'Jf-1 ="'\:'(A./ )n =

L
n=O

1-t

387

::::; N, where (if A.

(A./ )N+1
1-t

1-(A/~-t)

=f. J-t)

[11.8.2]-[11.8.3]

Solutions

Queues

Now let k = 2. The queue is a birth-death process with rates


A.- {
1 -

A.
0

ifi < N,
ifi 2:: N,

/Li =

/L
2JL

ifi = 1,
ifi 2:: 2.

It is reversible in equilibrium, and its stationary distribution satisfies A.ini


that ni = 2pino for 1 :::: i ::=: N, where p = A./(2/L) and

/Li+lni+l We deduce

no'= 1 + 2.:2pi.
i=l

2. The answer is obtainable in either case by following the usual method. It is shorter to use the fact
that such processes are reversible in equilibrium.
(a) The stationary distribution1r satisfies nnA.p(n) = nn+l/L for n 2:: 0, whence nn = nopn fn! where
p = A./JL. Therefore nn = pne-P fn!.
(b) Similarly,
n-1
n 2::0,
nn = nop n
P m = nop n 2 - ~n(n-1)
",

II ( )

m=O

where
oo
no'= LPn(~)Zn(n-1).
I

n=O

At the instant of arrival of a potential customer, the probability q that she joins the queue is
obtained by conditioning on its length:
oo
oo
oo
1
=2.:
p(n)nn =no 2.: pn2-n-zn<n-1) =no 2.: pn2-zn<n+l) =no-{no'- 1}.
I

n=O

n=O

n=O

3. First method. Let (Ql, Qz) be the queue-lengths, and suppose they are in equilibrium. Since
Q, is a birth-death process, it is reversible, and we write (h(t) = Q,(-t). The sample paths of
Q, have increasing jumps of size 1 at times of a Poisson process with intensity A.; these jumps mark
arrivals at the cash desk. By reversibility, Q, has the same property; such increasing jumps for Q,
are decreasing jumps for Q 1, and therefore the times of departures from the cash desk form a Poisson
process with intensity A.. Using the same argument, the quantity Q, (t) together with the departures
prior tot have the same joint distribution as the quantity Q, (-t) together with all arrivals after -t.
However Q, (-t) is independent of its subsequent arrivals, and therefore Q, (t) is independent of its
earlier departures.
It follows that arrivals at the second desk are in the manner of a Poisson process with intensity
A., and that Qz(t) is independent of Q, (t). Departures from the second desk form a Poisson process
also.
Hence, in equilibrium, Q, is M(A.)/M(J.LI)/1 and Qz is M(A.)/M(J.Lz)/1, and they are independent
at any given time. Therefore their joint stationary distribution is
nmn =IP'(Q,(t) =m, Qz(t) =n) = (1-p,)(l-pz)PiP!J.

= A./ J.Li.
Second method. The pair (Ql (t), Qz(t)) is a bivariate Markov chain. A stationary distribution
(nmn : m, n 2:: 0) satisfies
where Pi

(A.+ ILl+ J.Lz)nmn

= A.nm-l,n +

J.Linm+l,n-1 + J.Lznm,n+l

388

m,n 2::1,

Solutions [11.8.4]-[11.8.5]

Problems

together with other equations when m = 0 or n = 0. It is easily checked that these equations have the
solution given above, when Pi < 1 for i = 1, 2.

4. Let Dn be the time of then th departure, and let Q n = Q ( Dn +) be the number of waiting customers
immediately after Dn. We have in the usual way that Qn+l = An+ Qn - h(Qn), where An is the
number of arrivals during the (n + l)th service time, and h(x) = min{x, m}. Let G(s) = l:~o :rr;si
be the equilibrium probability generating function of the Qn. Then, since Qn is independent of An,

where

lE(sAn)

fooo

e)..u(s-l) fs(u)

du

= Ms (A.(s

1)),

M s being the moment generating function of a service time, and

Combining these relations, we obtain that G satisfies

smG(s) = Ms(A.(s-

1)) { G(s) +

fcsm - si):rr; }
1=0

whenever it exists.
Finally suppose that m = 2 and Ms(e) = M/(M- 8). In this case,
M{rro(s + 1) + :rr1s}
G(s) = '------'----=-----=----'M(s + 1)- A.s 2

Now G(l) = 1, whence M(2:rro + :rr1) = 2M- A.; this implies in particular that 2M- A. > 0. Also
G(s) converges for Is I _:::: 1. Therefore any zero of the denominator in the interval [-1, 1] is also a
zero of the numerator. There exists exactly one such zero, since the denominator is a quadratic which
takes the value -A. at s = -1 and the value 2M - A. at s = 1. The zero in question is at

and it follows that rro + (:rro + :rr1 )so

= 0.

Solving for rro and :rr1, we obtain

1-a
-as

G(s) = -1 - ,

where a= 2A./{M +

5.

J M2 + 4A.J-[}.

Recalling standard M/G/1 theory, the moment generating function M B satisfies

Ms(s) = Ms(s- A.+ A.Ms(s)) = -----,----M------,M-{s-A.+A.Ms(s)}

whence M B (s) is one of


(A.+ M- s)

V(A. +
2)..

389

M- s) 2 - 4A.J-[

[11.8.6]-[11.8. 7]

Solutions

Queues

Now M B (s) is non-decreasing in s, and therefore it is the value with the minus sign. The density
function of B may be found by inverting the moment generating function; see Feller (1971, p. 482),
who has also an alternative derivation of M B.
As for the mean and variance, either differentiate MB, or differentiate(*). Following the latter
route, we obtain the following relations involving M (= MB):
2AM M 1 + M

+ (s -A -

JL)M 1

= 0,

2AM M 11 + 2A(M 1) 2 + 2M 1 + (s -A - JL)M 11 = 0.

Sets = 0 to obtain M 1 (0) = (JL -A)- 1 and M 11 (0) = 2J.L(J.L-A)- 3 , whence the claims are immediate.

6. (i) This question is closely related to Exercise (11.3.1). With the same notation as in that solution,
we have that

where h(x) = min{l, x}. Taking expectations, we obtain IP'(Qn > 0) = JE(An) where
lE(An) =

lX>

lE(An IS= s)dFs(s) = AlE(S) = p,

and S is a typical service time. Square (*) and take expectations to obtain
p(l- 2p)

lE(Qn) =

+ JE(A~+l)

2 (1 _ p)

where JE(A~) is found (as above) to equal p +A 2 JE(S 2 ).


(ii) If a customer waits for time W and is served for time S, he leaves behind him a queue-length
which is Poisson with parameter A(W + S). In equilibrium, its mean satisfies AlE(W + S) = lE(Qn),
whence lE(W) is given as claimed.
(iii) JE(W) is a minimum when JE(S 2 ) is minimized, which occurs when Sis concentrated at its mean.
Deterministic service times minimize mean waiting time.
7. Condition on arrivals in (t, t+h). If there are no arrivals, then Wt+h ::::= x if and only ifW1 ::::= x+h.
If there is an arrival, and his service time is S, then Wt+h ::::= x if and only if W1 ::::= x + h- S. Therefore
F(x; t +h)= (l - Ah)F(x + h; t) + Ah

Jor+h F(x + h- s; t) dFs(s) + o(h).

Subtract F(x; t), divide by h, and take the limit ash ,).. 0, to obtain the differential equation.
We take Laplace-Stieltjes transforms. Integrating by parts, for e :::: 0,
{

lco,oo)
{

lco,oo)
{

lco,oo)

eex dh(x) = -h(O)- B{Mu(B) - H(O)},

eex dH(x) = Mu(B)- H(O),

eex dlP'(U

+ S:::: x) =

Mu(B)Ms(B),

and therefore
0 = -h(O)- B{Mu(B)- H(O)}

+ AH(O) + AMu(B){Ms(B)- 1}.

390

Solutions [11.8.8]-[11.8.10]

Problems

Set e = 0 to obtain that h(O) = 'AH(O), and therefore


1
H(O) = -eMu(e){'A(Ms(e)-

1)- e}.

Take the limit as e --.. 0, using L'Hopital's rule, to obtain H(O) = 1- 'AJE.(S) = 1 - p. The moment
generating function of U is given accordingly. Note that Mu is the same as the moment generating
function of the equilibrium distribution of actual waiting time. That is to say, virtual and actual
waiting times have the same equilibrium distributions in this case.

8. In this case U takes the values 1 and -2 each with probability ~ (as usual, U = S- X where S
and X are typical (independent) service and interarrival times). The integral equation for the limiting
waiting time distribution function F becomes
F(O) = ~ F(2),

F(x)

HF(x- 1) + F(x + 2)}

The auxiliary equation is e3 - 28 + 1 = 0, with roots 1 and- ~(1


can contribute, whence
F(x) =A+ B (

forx = 1,2, ....

J5).

Only roots lying in [-1, 1]

-l: JSr

for some constants A and B. Now F(x) -->- 1 as x -->- oo, since the queue is stable, and therefore
A= 1. Using the equation for F(O), we find that B = ~(1-

JS).

9. Q is a M('A)/M(JL)/oo queue, otherwise known as an immigration-death process (see Exercise


(6.11.3) and Problem (6.15.18)). As found in (6.15.18), Q(t) has probability generating function
G(s, t) = { 1 + (s- l)e-JLt} I exp{p(s- 1)(1 - e-JLt)}

where p ='A/ IL Hence


JE.(Q(t)) = Ie-JLt

+ p(l- e-JLt),

lP'(Q(t) =0) = (1-e-JLti exp{-p(1-e-ILt)},


as t -->- oo.
If JE.(J) and JE.(B) denote the mean lengths of an idle period and a busy period in equilibrium, we
have that the proportion of time spent idle is JE.(/)/{lE.(J) + JE.(B)}. This equals limt-H>o lP'(Q(t) =
0) = e-P. Now JE.(J) = A- 1, by the lack-of-memory property of the arrival process, so that JE.(B) =
(eP- 1)/'A.

10. We have in the usual way that


Q(t

+ 1) =At+ Q(t)- min{l, Q(t)}

where At has the Poisson distribution with parameter A. When the queue is in equilibrium, JE.(Q(t)) =
+ 1)), and hence

JE.(Q(t

lP'(Q(t) > 0) = JE.(min{l, Q(t)}) = JE.(At) ='A.


We have from (*)that the probability generating function G(s) of the equilibrium distribution of
Q(t) (= Q) is

G(s) = JE.(sAt)JE.(sQ-min{1,Q}) = e).(s-1){lE.(sQ-1 l(Q:::l))

391

+ lP'(Q =

0) }.

[11.8.11]-[11.8.13]

Solutions

Queues

Also,
G(s) = JE.(sQ I{Q:::1J) + JP>(Q = 0),
and hence
G(s) =
whence

e).(s- 1) {

~G(s) +

( 1-

~) (1- A)}

(1 - s)(l -A)
G(s) = 1- se -).(s -1).

The mean queue length is G' (1) = ~A(2 - A)/(1 -A). Since service times are of unit length,
and arrivals form a Poisson process, the mean residual service time of the customer in service at an
arrival time is ~, so long as the queue is non-empty. Hence
A

JE.(W) = JE.(Q)- ~JP>(Q > 0) = - - 2(1- A)

11. The length B of a typical busy period has moment generating function satisfying M B (s) =
exp{s- A+ AMs(s)}; this fact may be deduced from the standard theory ofM/G/1, or alternatively
by a random-walk approach. Now T may be expressed as T = I + B where I is the length of the
first idle period, a random variable with the exponential distribution, parameter A. It follows that
MT(s) = AMs(s)/(A- s). Therefore, as required,
(A- s)MT(s) = Aexp{s- A+ (A- s)MT(s) }.
If A :::: 1, the queue~length at moments of departure is either null persistent or transient, and it
follows that JE.(T) = oo. If A < 1, we differentiate(*) and sets = 0 to obtain AlE.(T)- 1 =A 2 JE.(T),
whence JE.(T) = {A(l - A)} -1.
12. (a) Q is a birth-death process with parameters Ai = A, /-[i = {[, and is therefore reversible in
equilibrium; see Problems (6.15.16) and (11.8.3).
(b) The equilibrium distribution satisfies krci = {[7ri+1 fori :::: 0, whence Jri = (1- p)pi where
p = A/ f.-[. A typical waiting time W is the sum of Q independent service times, so that

Mw(s)=GQ(Ms(s))=

1-p

1- P{[/({[- s)

(1-p)(f.-[-S)
.
f.-[(1- p)- s

(c) See the solution to Problem (11.8.3).


(d) Follow the solution to Problem (11.8.3) (either method) to find that, at any timet in equilibrium,
the queue lengths are independent, the jth having the equilibrium distribution ofM(A)IM({[j )/1. The
joint mass function is therefore

where

Pj

=A/{[J.

13. The size of the queue is a birth-death process with rates Ai =A, /-[i = {[ min{i, k}. Either solve
the equilibrium equations in order to find a stationary distribution 7r, or argue as follows. The process
is reversible in equilibrium (see Problem (6.15.16)), and therefore Ailri = /-[i+17ri+1 for all i. These
'balance equations' become
ifO~i<k,

ifi:::: k.

392

Solutions [11.8.14]-[11.8.14]

Problems

These are easily solved iteratively to obtain

Tri

noai I i!
no(aj k)i kk I k!

ifO:Si ::Sk,
ifi:::: k

where a = A./ JL. Therefore there exists a stationary distribution if and only if J.. . < kJL, and it is given
accordingly, with

The cost of having k servers is


oo
(a/ k)i kk
Ck = Ak + Bno L)- k + 1)-'---'--::-:-i=k
k!

where no = no(k). One finds, after a little computation, that

c1

Ba
=A+,
1-a

2Ba 2
4-a

C2 =2A+ - 2.

Therefore

a 3 (A - B)+ a 2 (2B - A) - 4a(A +B)+ 4A


C2- Cl =

(1 - a)(4- a2)

Viewed as a function of a, the numerator is a cubic taking the value 4A at a= 0 and the value -3B
at a= l. This cubic has a unique zero at some a* E (0, 1), and C1 < C2 if and only ifO <a <a*.

14. The state of the system is the number Q(t) of customers within it at timet. The state 1 may be
divided into two sub-states, being a1 and a2, where O"i is the state in which server i is occupied but
the other server is not. The state space is therefore S = {0, a1, a2, 2, 3, ... }.
The usual way of finding the stationary distribution, when it exists, is to solve the equilibrium
equations. An alternative is to argue as follows. If there exists a stationary distribution, then the
process is reversible in equilibrium if and only if

for all sequences i1, i2, ... , ik of states, where G = (guv)u,vES is the generator of the process (this
may be shown in very much the same way as was the corresponding claim for discrete-time chains
in Exercise (6.5.3); see also Problem (6.15.16)). It is clear that(*) is satisfied by this process for all
sequences of states which do not include both a1 and a2; this holds since the terms guv are exactly
those of a birth-death process in such a case. In order to see that (*) holds for a sequence containing
both a1 and a2, it suffices to perform the following calculation:
go,al gal ,2g2,azgaz,O = ( ~ J....)J....J.L2/Ll = go,az gaz,2g2,al gal ,0

Since the process is reversible in equilibrium, the stationary distribution


i= v. Therefore

1r:

satisfies Truguv

Trvgvu for all u, v E S, u

and hence
Tra1

= -2

J.. .

ILl

no,

')...2

Tru = - - 2JL 1/L2

393

')...

JL1

+ JL2

)u-2 no

for u ::=: 2.

[11.8.15]-[11.8.17]

Solutions

Queues

This gives a stationary distribution if and only if).. < JL1 + f.i-2, under which assumption no is easily
calculated.
A similar analysis is valid if there are s servers and an arriving customer is equally likely to go
to any free server, otherwise waiting in tum. This process also is reversible in equilibrium, and the
stationary distribution is similar to that given above.

15. We have from the standard theory that QJL has as mass function Trj = (1 - IJ)IJj, j 0::: 0, where
IJ is the smallest positive root of the equation x = eJL(x-I). The moment generating function of
(1- f.L- 1)QJL is
1 -1] _ .
MJL(8)=E(exp{8Cl-f.L-l)QJL})=
1
1 - IJeeO-JL l
Writing f.L = 1 + E, we have by expanding eiLC'7-l) as a Taylor series that IJ = IJ(E) = 1- 2E + o(E)
as E ,j, 0. This gives
2E + O(E)
2E + O(E)
2
= 1 - (1 - 2E)(l +BE)+ o(E) = (2- B)E + o(E) --+ 2- e

MJL(B)

as E ,j, 0, implying the result, by the continuity theorem.

16. The numbers P (of passengers) and T (of taxis) up to timet have the Poisson distribution with
respective parameters nt and rt. The required probabilities Pn = IP'(P = T + n) have generating
function
00

00

2.: 2.: IP'(P = m + n)IP'(T = m)zn


n=-oo

n=-OOm=O
00

2.: IP'(T = m)z-mG p(z)


m=O

= GT(Z-l)Gp(Z) = e-(rr+r)te(rrz+rz-l)t,
in which the coefficient of zn is easily found to be that given.

17. Let N(t) be the number of machines which have arrived by timet. Given that N(t) = n, the times
T1, T2, ... , Tn of their arrivals may be thought of as the order statistics of a family of independent
uniform variables on [0, t], say U1, U2, ... , Un; see Theorem (6.12.7). The machine which arrived
at time

ui

is, at time t,
in the X -stage}
{ a(t)
in the Y -stage
with probability
f3 ( t)
1 - a(t) - f3(t)
repaired

where a(t) = IP'(U +X > t) and f3(t) = IP'(U +X :=: t < U +X+ Y), where U is uniform on [0, t],
and (X, Y) is a typical repair pair, independent of U. Therefore
IP'(U(t) = . V(t) = k I N(t) = n) = n! a(t)j f3(ti (1 - a(t) - f3(t))n-k- j

1"lkl( n 1-k)'

implying that
oo e-At (At)n

IP'(U(t) = j, V(t) = k) =

2.:

n= 0

n.1

.,

394

IP'(U(t) = j, V(t) = k I N(t) = n)

k!

Solutions [11.8.18]-[11.8.19]

Problems

18. The maximum deficit Mn seen up to and including the time of the nth claim satisfies

Mn

= max{Mn-1 t ( K j - Xj)} = max{O, U1, U1 + U2, ... , U1 + U2 + + Un},


;=1

where the Xj are the inter-claim times, and Uj = Kj - Xj. We have as in the analysis of G/G/1 that
Mn has the same distribution as Vn = max{O, Un, Un + Un-1, ... , Un + Un-1 + + Ul}, whence
Mn has the same distribution as the (n + l)th waiting time in a M(A.)/G/1 queue with service times
Kj and interarrival times Xj. The result follows by Theorem (11.3 .16).

19. (a) Look for a solution to the detailed balance equations A.rri = (i + l)JL7ri+1, 0
that the stationary distribution is given by Jri = (pi/ i !)no.
(b) Let Pc be the required fraction. We have by Little's theorem (10.5.18) that

Pc =

A.(rrc-1 - Jrc)

= p(rrc-1 - rrc),

i < s, to find

c :=:: 2,

fL

and P1 = rr1, where Jrs is the probability that channels 1, 2, ... , s are busy in a queue M/M/s having
the property that further calls are lost when all s servers are occupied.

395

12
Martingales

12.1 Solutions. Introduction


1.

(i) We have that E(Ym) = E{E(Ym+l I Tm)} = E(Ym+J), and the result follows by induction.
(ii) For a submartingale, E(Ym) ::0 E{E(Ym+l I Tm)} = E(Ym+d, and the result for supermartingales
follows similarly.
2.

We have that

if m :::: 1, since Tn s; Tn+m-1 Iterate to obtain E(Yn+m I Tn) = E(Yn I Tn) = Yn.
3.

(i) Znt-t-n has mean 1, and

where Tn = cr(ZJ, Zz, ... , Zn).


(ii) Certainly 1'/Zn ::0 1, and therefore it has finite mean. Also,

where the Xi are independent family sizes with probability generating function G. Now G(ry) = ry,
and the claim follows.

4.

(i) With Xn denoting the size of the nth jump,

where Tn = cr(X 1, Xz, ... , Xn). Also EISn I ::0 n, so that {Sn} is a martingale.
(ii) Similarly E(S;) = var(Sn) = n, and

(iii) Suppose the walk starts at k, and there are absorbing barriers at 0 and N (:::: k). LetT be the time
at which the walk is absorbed, and make the assumptions that E(ST) =So, E(Sf- T) =
Then
the probability Pk of ultimate ruin satisfies

sg.

0 Pk

+N

(1 - Pk) = k,

396

Solutions

Introduction

[12.1.5]-[12.1.9]

and therefore Pk = 1 - (k/ N) and lE(T) = k(N - k).


5.

(i) By Exercise (12.1.2), for r ::=: i,


lE(YrYi)

= lE{lE(YrYi

I .rj)}

= lE{YilE(Yr

I .rj)}

= lE(Yh,

an answer which is independent of r. Therefore


ifi_:::=j_:::=k.

(ii) We have that


lE{ (Yk - YJ ) 2 l.rt} = lE(Yfl .rj) - 21E(Yk YJ I .rj)

+ lE(Y}I

.rj ).

NowlE(YkYj IJ=i) =lE{lE(YkYj I Jj)j J:'i} =E(Yf 1 J:'i),andthec1aimfo1lows.


(iii) Taking expectations of the last conclusion,

j.:::: k.
Now {lE(YJ) : n :;:: 1} is non-decreasing and bounded, and therefore converges. Therefore, by(*),
{ Yn : n :;:: 1} is Cauchy convergent in mean square, and therefore convergent in mean square, by
Problem (7.11.11).

6.

(i) Using Jensen's inequality (Exercise (7.9.4)),

(ii) It suffices to note that lx I, x 2 , and x+ are convex functions of x; draw pictures if you are in doubt
about these functions.
7. (i) This follows just as in Exercise (12.1.6), using the fact that u{lE(Yn+l I .7='n)} ::=: u(Yn) in this
case.
(ii) The function x+ is convex and non-decreasing. Finally, let {Sn : n ::=: 0} be a simple random walk
whose steps are +1 with probability p (= 1 - q > ~) and -1 otherwise. If Sn < 0, then
lE (ISn+III .7='n)

= p(ISnl-1) + q(ISnl + 1) =

ISnl- (p- q) < ISnl;

note that lP'(Sn < 0) > 0 if n ::=: 1. The same example suffices in the remaining case.
8.

Clearly lElA. -nlfr(Xn)l _::::A. -n sup{llfr(j)l : j

S}. Also,

lE(lfr(Xn+J) I .7='n) = LPXn,}lfr(j) _:::: A.lfr(Xn)


}ES

where .7='n = cr(X 1, Xz, ... , Xn). Divide by A. n+l to obtain that the given sequence is a supermartingale.
9. Since var(ZJ) > 0, the function G, and hence also Gn, is a strictly increasing function on [0, 1].
Since 1 = Gn+l (Hn+l (s)) = Gn(G(Hn+l (s))) and Gn(Hn(s)) = 1, we have that G(Hn+l (s)) =
Hn(s). With .7='m = cr(Zk : 0 _:::: k _:: : m),
lE(Hn+l (s) 2 n+l I .7='n) = G(Hn+l (s))Zn = Hn(s)Zn.

397

[12.2.1]-[12.3.2]

Solutions

Martingales

12.2 Solutions. Martingale differences and Hoeffding's inequality


1. Let:Fi = a({Vj, Wj: 1 _::: j _::: i}) and Yi = lE(Z I :fl). With Z(j) the maximal worth attainable
without using the jth object, we have that
lE(Z(j)

I:FJ) =

lE(Z(j)

I0-d.

Z(j) _::: Z _::: Z(j)

+ M.

Take conditional expectations of the second inequality, given :F) and given :F) -I , and deduce that
IYj - YJ-!1 _::: M. Therefore Y is a martingale with bounded differences, and Hoeffding's inequality
yields the result.
2. Let :fi be the a-field generated by the (random) edges joining pairs (va, Vb) with 1 _::: a, b _::: i,
and let Xi = lE(X I :fi ). We write x (j) for the minimal number of colours required in order to colour
each vertex in the graph obtained by deleting Vj. The argument now follows that of the last exercise,
using the fact that X (j) _::: X _::: X (j) + 1.

12.3 Solutions. Crossings and convergence


1.

Let T1

= min{n:

Yn

:=:::

b}, Tz

= min{n

Tzk-1 = min{n > Tzk-2 : Yn

> T1 : Yn _:::a}, and define Tk inductively by


:=:::

Tzk = min{n > Tzk-1 : Yn _:::a}.

b},

The number of downcrossings by time n is Dn(a, b; Y) = max{k: Tzk _::: n}.


(a) Between each pair of upcrossings of [a, b ], there must be a downcrossing, and vice versa. Hence
IDn(a, b; Y)- Un(a, b; Y)l _::: 1.
(b) Let Ii be the indicator function of the event that i

(Tzk-1, Tzk] for some k, and let

Zn =

L li(Yi- Yi-J),

:=:::

0.

i=l

It is easily seen that

Zn _::: -(b- a)Dn(a, b; Y)

+ (Yn- b)+,

whence

(b- a)lEDn(a, b; Y) _::: lE{ (Yn- b)+} -lE(Zn).


Now Ii is :Fi-1-measurable, since

Ui

= 1} =

U({Tzk-1 ::: i- 1} \ {Tzk::: i -1}).


k

Therefore,

since In :=::: 0 andY is a submartingale. It follows that lE(Zn) :=::: lE(Zn_J) :=::: :=::: lE(Zo) = 0, and
the final inequality follows from ( *).
2. If Y is a supermartingale, then - Y is a submartingale. Upcrossings of [a, b] by Y correspond to
downcrossings of [- b, -a] by - Y, so that
lEUn (a, b, Y) =lEDn (-b , - a, -Y) <
-

398

lE{(- Y +a)+}
n
ba

JE{(Y -a)-}
= __bn_ __
a
,

Solutions

Stopping times

[12.3.3]-[12.4.5]

by Exercise (12.3.1). If a, Yn 2':: 0 then (Yn -a)- :::::a.


3. The random sequence {1/r(Xn) : n :=:: 1} is a bounded supermartingale, which converges a.s. to
some limit Y. The chain is irreducible and persistent, so that each state is visited infinitely often a.s.;
it follows that limn-+oo 1/r(Xn) cannot exist (a.s.) unless 1fr is a constant function.

I:i"'

4. Y is a martingale since Yn is the sum of independent variables with zero means. Also
IP'(Zn =f.
0) = L:j"' n- 2 < oo, implying by the Borel-Cantelli lemma that Zn = 0 except for finitely many
values of n (a.s.); therefore the partial sum Yn converges a.s. as n -+ oo to some finite limit.
It is easily seen that an = San -1 and therefore an = 8 sn- 2 , if n :=:: 3. It follows that IYn I :=:: ian
if and only if IZn I = an. Therefore

which tends to infinity as n -+ oo.

12.4 Solutions. Stopping times


1.

We have that
n

U ({T1=k}n{T2=n-k}),
k=O
{ max{T1, T2}::::: n} = m : : : n} n {T2::::: n},
{ min{T1, T2}::::: n} = m : : : n} U {T2::::: n}.
{T1+T2=n}=

Each event on the right-hand side lies in :Fn.

2.

Let :Fn = a(X1, X2, ... , Xn) and Sn = X1 + X2 + + Xn. Now


{N(t)

+1=

n} = {Sn-1::::: t}

n {Sn

> t}

:Fn.

3. (Y+, :F) is a submartingale, and T = min{k: Yk :=:: x} is a stopping time. Now 0::::: T 1\ n::::: n,
so that JE(Yd) ::::: lE(YfAn) ::::: lE(Y,i), whence

4.

We may suppose that JE(Yo) < oo. With the notation of the previous solution, we have that

5. It suffices to prove that lEYs ::::: lEYT, since the other inequalities are of the same form but with
different choices of pairs of stopping times. Letlm be the indicator function of the event {S < m ::::: T},
and define
n

Zn =

lm(Ym- Ym-1),

m=1

Note that Im is :Fm-1-measurable, so that

399

O:sn :S N.

[12.4.6]-[12.5.2]

Solutions

Martingales

since Y is a submartingale. Therefore E(ZN) ~ E(ZN-1)


ZN = Yr- Ys, and therefore E(Yr) ~ E(Ys).

~ E(Zo)

= 0.

On the other hand,

6. De Moivre's martingale is Yn = (q I p )Sn, where q = 1 - p. Now Yn ~ 0, and E(Yo) = 1, and


the maximal inequality gives that
ll' (max Sm
O:=;m:=;n

~ x)

= ll' (max Ym
O:=;m:=;n

~ (qlp)x)::: (plq)x.

Take the limit as n -+ oo to find that S00 = supm Sm satisfies


00

L ll'(S

E(Soo) =

00

x=l

x)::: _P__
q- P

We can calculate E(S00 ) exactly as follows. It is the case that S00 ~ x if and only if the walk
ever visits the point x, an event with probability fx for x ~ 0, where f = pI q (see Exercise (5.3.1)).
The inequality of (*) may be replaced by equality.
7.

(a) First,

n {T::: n} =

0 E :Fn.

Secondly, if An {T::: n}

E :Fn

then

Ac n {T::: n} = {T::: n} \(An {T::: n}) E :Fn.

Thirdly, if A1, Az, ... satisfy Ai

n {T ::: n} E

:Fn for each i, then

(U Ai) n {T ::: n} = l) (Ai n {T ::: n}) E :Fn.


l

Therefore :Fr is a a-field.


For each integer m, it is the case that
{T < n}

{T<m}n{T<n}= {
{T:Sm}
an event lying in :Fn. Therefore {T ::: m}
(b) Let A E :Fs. Then, for any n,

E :Fr

ifm > n,
ifm::: n,

for all m.

(An{S:::T})n{T:Sn}=

U (An{S:::m})n{T=m},
m=O

the union of events in :Fn, which therefore lies in :Fn. Hence A n {S ::: T} E :Fr.
(c) We have {S::: T} = n, and (b) implies that A E :Fr whenever A E :Fs.

12.5 Solutions. Optional stopping


Under the conditions of (a) or (b), the family {YT/\n : n ~ 0} is uniformly integrable. Now
T 1\ n -+ T as n -+ oo, so that Yr An -+ Yr a.s. Using uniform integrability, E(Yr An) -+ E(Yr ),
and the claim follows by the fact that E(Yr An) = E(Yo).

1.

2. It suffices to prove that {Yr An : n


uniformly integrable if

0} is uniformly integrable. Recall that {Xn : n

400

0} is

Solutions

Optional stopping

[12.5.3]-[12.5.5]

(a) Now,
lE (IYTAnll(IYT/\nl:::aJ) = lE (IYrll(T:sn,IYTI2:aJ) + lE (IYnll(T>n,IYnl:::aJ)
::0 lE (IYrll(lhl:::aJ) + lE (IYnll{T>nJ) = g(a) + h(n),

say. We have that g(a) ----+ 0 as a ----+ oo, since lEIYrl < oo. Also h(n) ----+ 0 as n ----+ oo,
so that supn>N h(n) may be made arbitrarily small by suitable choice of N. On the other hand,
lE (IYnll(IYnl~aJ)----+ 0 as a----+ oo uniformly inn E {0, 1, ... , N}, and the claim follows.
(b) Since Y;i defines a submartingale, we have that supn lE(YfAn) ::0 supn JE(Y;i) < oo, the second
inequality following by the uniform integrability of {Yn }. Using the martingale convergence theorem,
YT/\n----+ Yr a.s. wherelEIYrl < oo. Now

Also lP'(T > n) ----+ 0 as n ----+ oo, so that the final two terms tend to 0 (by the uniform integrability of
the Yi and the finiteness of lEI Yr I respectively). Therefore Yr An
standard theorem (7.10.3).

Yr, and the claim follows by the

3. By uniform integrability, Y00 = limn-+oo Yn exists a.s. and in mean, and Yn = lE(Y00 I :Fn).
(a) On the event {T = n} it is the case that Yr = Yn and JE(Y00 I :Fr) = lE(Y00 I :Fn); for the
latter statement, use the definition of conditional expectation. It follows that Yr = JE(Y00 I :Fr ),
irrespective of the value of T.
(b) We have from Exercise (12.4.7) that :Fs ~ :Fr. Now Ys = JE(Yoo I :Fs) = lE{lE(Y00 I :Fr) I
:Fs)

= JE(Yr I :Fs).

4.

Let T be the time until absorption, and note that {Sn} is a bounded, and therefore uniformly
integrable, martingale. Also IP'(T < oo) = 1 since T is no larger than the waiting time for N
consecutive steps in the same direction. It follows that JE(So) = lE(Sr) = NIP'(Sr = N), so that
IP'(Sr = N) = JE(So)/ N. Secondly, {S~- n : n 2':: 0} is a martingale (see Exercise (12.1.4)), and the
optional stopping theorem (if it may be applied) gives that
lE(S6)

= JE(Sf -

T)

= N 21P'(Sr = N) -lE(T),

and hence JE(T) = NlE(So) - lECS6) as required.


It remains to check the conditions of the optional stopping theorem. Certainly IP'(T < oo) = 1,
and in addition lE(T 2) < oo by the argument above. We have that lEI Sf- Tl ::0 N 2 + JE(T) < oo.
Finally,
lE{ (S~ - n)l(T>n}} ::0 (N 2 + n)IP'(T > n) ----+ 0
as n ----+ oo, since JE(T 2) < oo.

5. Let :Fn = cr(SJ, S2, ... , Sn). It is immediate from the identity cos(A + ).) + cos(A- ).) =
2 cos A cos). that
lE(Yn+l I :Fn)

cos[,J,.(Sn + 1- ~(b- a))]+ cos[,J,.(Sn- 1- ~(b- a))]

2(cos ,J,.)n+l

= Yn,

and therefore Y is a martingale (it is easy to see that lEI Yn I < oo for all n ).
SupposethatO <). < rrj(a +b), andnotethatO ::0 I,J,.{Sn- ~(b -a)}l <~).(a +b)< ~JT for
n ::0 T. Now Yr An constitutes a martingale which satisfies

cos{~,J,.(a+b)}
_ _-=c__=-_ < Yr An <
(cos ,J,.)T 1\n

401

(cos ,J,.)T

[12.5.6]-[12.5.8]

Solutions

Martingales

If we can prove that E{(cos A.)-T} < oo, it will follow that {Yr 1\n} is uniformly integrable. This will
imply in tum that E(Yr) = Iimn-+oo E(Yr 1\n) = E(Yo), and therefore
cos{~A.(a + b)}E{(cosA.)-T} = cos{iA.(b- a)}

as required. We have from (*) that

Now T

1\

n --+ T as n --+ oo, implying by Fatou's lemma that

E{(cosA.)-T}:::

cos{iA.(a- b)}

E(Yo)

cos{~A.(a +b)}

cos{iA.(a +b)}

6. (a) The occurrence of the event {U = n} depends on Sr, S2, ... , Sn only, and therefore U is a
stopping time. Think of U as the time until the first sequence of five consecutive heads in a sequence
of coins tosses. Using the renewal-theory argument of Problem (10.5.17), we find that E(U) = 62.
(b) Knowledge of Sr, S2, ... , Sn is insufficient to determine whether or not V = n, and therefore V
is not a stopping time. Now E(V) = E(U)- 5 =57.
(c) W is a stopping time, since it is a first-passage time. Also E(W) = oo since the walk is null
persistent.
7.

With the usual notation,

E(Mm+n I :Fm)

=E(

r=O

Sr

Sr -

~(Sm+n -

Sm

+ Sm) 3

:Fm)

r=m+l

+ nSm- SmE{(Sm+n- Sm) 2}


= Mm + nSm- nSmE(XI) = Mm.

= Mm

Thus {Mn : n :::: 0} is a martingale, and evidently Tis a stopping time. The conditions of the optional
stopping theorem (12.5.1) hold, and therefore, by a result of Example (3.9.6),

8. We partition the sequence into consecutive batches of a + b flips. If any such batch contains only
l's, then the game is over. Hence lP'(T > n(a +b))::; {1- (~)a+b}n--+ 0 as n--+ oo. Therefore,
EISf- Tl ::; E(Sf) + E(T) ::; (a+ b) 2 + E(T) < oo,
and
as n--+ oo.

402

Solutions

Problems

[12.7.1]-[12.9.2]

12.7 Solutions. Backward martingales and continuous-time martingales


1.

Lets~ t. WehavethatE(I](X(t))

I J=S, Xs =

i) = 'j Pij(t- s)IJ(j). Hence

i)

:tlE(IJ(X(t)) :Fs, Xs =

= (Pr-sGJ7'); = 0,

so that E(I](X (t)) I :Fs, Xs = i) = l](i ), which is to say that lE(IJ (X (t))

2. Let W(t) = exp{-BN(t) + A.t(l- e-li)} where


constitutes a martingale. Furthermore

e 0::: 0.

I :Fs) = I](X (s )).

It may be seen that W(t

1\

Ta), t 0::: 0,

where, by assumption, the limit has finite expectation for sufficiently small positive e (this fact may be
checked easily). In this case, {W(t 1\ Ta): t 0::: 0} is uniformly integrable. Now W(t 1\ Ta)-+ W(Ta)
a.s. as t -+ oo, and it follows by the optional stopping theorem that

Writes = e-li to obtain s-a = E{eHae 1-sl}. Differentiate at s = 1 to find that a = AE(Ta) and
a(a + 1) =A 2E(T}), whence the claim is immediate.
3. Let 9-m be the a -field generated by the two sequences of random variables Sm, Sm+ 1 ... , Sn and
Um+1, Um+2 ... , Un. It is a straightforward exercise in conditional density functions to see that
-1

E (Um+1

!oUm+2
I 9-m+1) -- 0

(m

+ l)xm- 1 d x -m-+- 1

(U

m+2

)m+1

U
'
m m+2

whence E(Rm I 9-m+ 1) = Rm+1 as required. [The integrability condition is elementary.]


LetT= max{m : Rm 0::: 1} with the convention that T = l if Rm < 1 for all m. As in the closely
related Example (12.7.6), Tis a stopping time. We apply the optional stopping theorem (12.7.5) to
the backward martingale R to obtain that E(RT I 9-n) = Rn = Sn/t. Now, RT 0::: 1 on the event
{Rm 0::: l for some m ~ n}, whence

~ = E(RT I Sn =
t

y) 0::: IP'(Rm 0::: l forsomem

~ n I Sn = y).

[Equality may be shown to hold. See Karlin and Taylor 1981, pages 110-113, and Example (12.7.6).]

12.9 Solutions to problems


1.

Clearly E(Zn)

(M + m )n, and hence El Yn I < oo. Secondly, Zn+ 1 may be expressed as

'f,:; 1 X;+ A, where X1, X2, ... are the family sizes of the members of the nth generation, and A is
the number of immigrants to the (n + l)th generation. Therefore E(Zn+1

I Zn) = tJ-Zn +

m, whence

1 { MZn + m ( 1- 1 -l _Mn+1)}
E(Yn+1 I Zn) = Mn+
M
= Yn.
1
2. Each birth in the (n + 1)th generation is to an individual, say the sth, in the nth generation. Hence,
for each r, Ben+ 1),r may be expressed in the form Ben+ 1),r = Bn,s + Bj(s ), where Bj (s) is the age
of the parent when its jth child is born. Therefore
E{

L e-liB(n+l).r I:Fn} = E{ ~ e-lieBn.s+Bjes)) I:Fn} = L e-liBn.s M1 (B),


r

S,J

403

[12.9.3]-[12.9.5]

Solutions

which gives that E(Yn+ 1 I :Fn)


3.

Martingales

Yn. Finally, E(Y1 (e))

1, and hence E(Yn (e))

1.

If x, c > 0, then
lP' (max Yk >
1~k~n

x) ::=: lP' (max (Yk + c) > (x + c)


2

1~k~n

2 ).

Now (Yk + c) 2 is a convex function of Yk, and therefore defines a submartingale (Exercise (12.1.7)).
Applying the maximalinequality to this submartingale, we obtain an upper bound ofE{ (Yn +c) 2 }/ (x +
c) 2 for the right-hand side of(*). We set c = E(Y1)/x to obtain the result.

4. (a) Note that Zn = Zn-1 + Cn{Xn- E(Xn I :Fn_I)}, so that (Z, !F) is a martingale. LetT be
the stopping timeT= min{k: qYk 2:: x}. ThenE(ZTAn) = E(Zo) = 0, so that

since the final term in the definition of Zn is non-negative. Therefore


n

xlP'(T ::S n) ::S E{cTAnyT/\n} ::S LckE{E(Xk I :Fk-d},

k=1
where we have used the facts that Yn 2:: 0 and E(Xk I :Fk-d 2:: 0. The claim follows.
(b) Let X 1, X 2, ... be independent random variables, with zero means and finite variances, and let
Yj =
1 X;. Then Y/ defines a non-negative submartingale, whence

L;{=

5. The function h(u) = lulr is convex, and therefore Y;(m) = IS;- Smlr, i 2:: m, defines a
submartingale with respect to the filtration :F; = a ({x1 : 1 :::: j :::: i}). Apply the HRC inequality of
Problem (12.9.4), with q = 1, to obtain the required inequality.
If r = 1, we have that

E(ISm+n- Sml) ::S

m+n
L
EIZkl
k=m+1

by the triangle inequality. Let m, n --+ oo to find, in the usual way, that the sequence {Sn} converges
a.s.; Kronecker's lemma (see Exercise (7.8.2)) then yields the final claim.
Suppose 1 < r ::=: 2, in which case a little more work is required. The function h is differentiable,
and therefore

h(v)- h(u) = (v- u)h 1 (u)

+ fov-u {h'(u +x)- h 1(u)} dx.

Now h'(y) = rlylr- 1sign(y) has a derivative decreasing in IYI It follows (draw a picture) that
h 1 (u + x)- h 1(u) ::=: 2h 1 ( ix) if x 2:: 0, and therefore the above integral is no larger than 2h(i(v- u)).
Apply this with v = Sm+k+1 - Sm and u = Sm+k- Sm, to obtain

404

Solutions [12.9.6]-[12.9.10]

Problems

Sum over k and use the fact that

to deduce that
E(ISm+n- smn

~ 2 2 -r

m+n

E(lzkn

k=m+l

The argument is completed as after (*).


6.

With

= l{Yk=O) we have that

+ nYn-tiXn 1(1 - ln-t) I.Tn-1)


= ln-tE(Xn) + nYn-1 (1- ln-t)EIXnl = Yn-t

E(Yn I .Tn-t) = E( Xnln-1

since E(Xn) = 0, EIXnl = n- 1. Also EIYnl ~ E{IXniO


El Yn I < oo. Therefore (Y, !F) is a martingale.
Now Yn

0 if and only if Xn

0. Therefore lP'(Yn

+ niYn-tl)}
=

0)

and EIYtl < oo, whence

lP'(Xn

0)

1 - n- 1 ---+ 1

as n ---+ oo, implying that Yn ~ 0. On the other hand, Ln lP'(Xn =1- 0) = oo, and therefore
lP'(Yn =1- 0 i.o.) = 1 by the second Borel-Cantelli lemma. However, Yn takes only integer values, and
therefore Yn does not converge to 0 a.s. The martingale convergence theorem is inapplicable since
supn EIYnl = 00.
7. Assume that t > 0 and M(t) = 1. Then Yn = e 1Sn defines a positive martingale (with mean 1)
with respect to .Tn = a(Xt, X2, ... , Xn). By the maximal inequality,

and the result follows by taking the limit as n ---+ oo.

8. The sequence Yn = ~Zn defines a martingale; this may be seen easily, as in the solution to
Exercise (12.1.15). Now {Yn} is uniformly bounded, and therefore Y 00 = limn---+oo Yn exists a.s. and
satisfies E(Y00 ) = E(Yo) = ~.
Suppose 0 < ~ < 1. In this case Zt is not a.s. zero, so that Zn cannot converge a.s. to a constant
c unless c E {0, oo}. Therefore the a.s. convergence of Yn entails the a.s. convergence of Zn to a limit
random variable taking values 0 and oo. In this case, E(Y00 ) = 1 lP'(Zn ---+ 0) + 0 lP'(Zn ---+ oo ),
implying that lP'(Zn ---+ 0) = ~, and therefore lP'(Zn ---+ oo) = 1 - ~.

9. It is a consequence of the maximal inequality that lP'(Y; =:: x) ~ x- 1E(Ynl{Y,i::::x)) for x > 0.
Therefore
E(Y;) = fooo lP'(Y; =:: x) dx

~ 1 + E { Yn

00

~ 1+

00

lP'(Y;

=:: x) dx

x-l I(l,Y,t](x) dx}

= 1 + E(Yn log+ Y;) ~ 1 + E(Yn log+ Yn)

+ E(Y;)fe.

10. (a) We have, as in Exercise (12.7.1), that


E(h(X(t)) I B,X(s)

=i) =

LPij(t)h(j)
j

405

fors < t,

[12.9.11]-[12.9.13]

Solutions

Martingales

for any event B defined in terms of {X (u) : u ::=: s}. The derivative of this expression, with respect to
t, is (P1 Gh')i, where P1 is the transition semigroup, G is the generator, and h = (h(j) : j :::: 0). In
this case,

(Gh1)j

=L

gjkh(k)

= Aj { h(j + 1)- h(j)}- IJ-j { h(j)- h(j- 1)} = 0

for all j. Therefore the left side of (*) is constant for t ::=: s, and is equal to its value at time s, i.e.
X (s). Hence h(X(t)) defines a martingale.
(b) We apply the optional stopping theorem with T = min{t: X(t) E {0, n}} toobtainE(h(X(T))) =
E(h(X(O))), and therefore (1 - n(m))h(n) = h(m) as required. It is necessary but not difficult to
check the conditions of the optional stopping theorem.

11. (a) Since Y is a submartingale, so is y+ (see Exercise (12.1.6)). Now

Therefore {E(Y:+m I :Fn) : m :::: 0} is (a.s.) non-decreasing, and therefore converges (a.s.) to a limit
Mn. Also, by monotone convergence of conditional expectation,

E(Mn+l

I :Fn) = m--+oo
lim E{E(Y:+m+l I :Fn+I) I:Fn} = m--+00
lim E(Y:+m+l I :Fn) = Mn,

and furthermore E(Mn) = limm---+oo E(Y:+n) :::0 M. It is the case that Mn is :Fn-measurable, and
therefore it is a martingale.
(b) We have that Zn = Mn - Yn is the difference of a martingale and a submartingale, and is therefore
a supermauingale. Also Mn ::=: Y: ::=: 0, and the decomposition for Yn follows.
(c) In thi~ case Zn is a martingale, being the difference of two martingales. Also Mn :::: E(Y: I :Fn) =
Y: ::=: Yn a.s., and the claim follows.

12. We may as well assume that JJ- < P since the inequality is trivial otherwise. The moment
generating function of P- Cr is M(t) = iCP-JLH5;a 212 , and we choose t such that M(t) = l,
i.e., t = -2(P- J.J-)/a 2. Now define Zn = min{etYn, 1} and :Fn = a(Cr, C2, ... , Cn). Certainly
EIZnl < oo; also
E(Zn+l I :Fn) :::0 E(etYn+l I :Fn) = etYn M(t) = etYn
and E(Zn+l I :Fn) :S 1, implying that E(Zn+l I :Fn) :S Zn. Therefore (Zn, :Fn) is a positive
supermartingale. LetT = inf{n : Yn ::=: 0} = inf{n : Zn = 1}. Then T 1\ m is a bounded stopping
time, whence E(Zo) ::=: E(ZT Am) ::=: JP>(T ::=: m). Let m ---+ oo to obtain the result.

13. Let :Fn = a(Rr, R2, ... , Rn).


(a) 0 :S Yn :S 1, and Yn is :Fn-measurable. Also

E(Rn+l

I Rn)

= Rn

Rn

+ n+r+ b'

whence Yn satisfies E(Yn+I I :Fn) = Yn. Therefore {Yn : n ::=: 0} is a uniformly integrable martingale,
and therefore converges a.s. and in mean.
(b) In order to apply the optional stopping theorem, it suffices that JP>(T < oo) = l (since Y is
uniformly integrable). However JP>(T > n) =
~
n~I = (n + l)- 1 ---+ 0. Using that theorem,

1
E(YT) = E(Yo), which is to say that E{T I (T + 2)} = 1, and the result follows.
(c) Apply the maximal inequality.

406

Solutions [12.9.14]-[12.9.17]

Problems

14. As in the previous solution, with fi,n the a-field generated by A1, A2, ... and :Fn,
E(Yn+1

I 13
(J>n) =

Rn + An
) (
Rn
)
Rn + Bn +An
Rn + Bn
Rn
---=Yn,
Rn +Bn
(

+(

Rn

Rn

+ Bn + An

) (

Bn
Rn

+ Bn

sothatE(Yn+1 I :Fn) = E{E(Yn+1 I fi,n) I :Fn} = Yn. Also IYn I ::=: 1, and therefore Yn is a martingale.
We need to show that lP'(T < oo) = 1. Letln be the indicator function of the event {T > n}. We
have by conditioning on the An that

E(In

I A)

IT

= n-1 ( 1 - -1- ) --+

J=O

2 + SJ

IT
00

J=O

1 - -1- )
2 + S1

L-{=

as n --+ oo, where s1 =


1 Ai. The infinite product equals 0 a.s. if and only if"L,1 (2+ SJ )- 1 = oo
a.s. By monotone convergence, lP'(T < oo) = 1 under this condition. If this holds, we may apply the
optional stopping theorem to obtain that E(YT) = E(Yo), which is to say that

15. At each stage k, let Lk be the length of the sequence 'in play', and let Yk be the sum of its
entries, so that Lo = n, Yo = L,j= 1 Xi. If you lose the (k + 1)th gamble, then Lk+ 1 = Lk + 1 and
Yk+1 = Yk + Zk where Zk is the stake on that play, whereas if you win, then Lk+1 = Lk - 2 and
Yk+1 = Yk- Zk; we have assumed that Lk :::: 2, similar relations being valid if Lk = 1. Note that
Lk is a random walk with mean step size -1, implying that the first-passage time T to 0 is a.s. finite,
and has all moments finite. Your profits at time k amount to Yo - Yb whence your profit at time T is
Yo, since YT = 0.
Since the games are fair, Yk constitutes a martingale. Therefore E(YT 1\m) = E(Yo) =j:. 0 for
all m. However T 1\ m --+ T a.s. as m --+ oo, so that YT 1\m --+ YT a.s. Now E(YT) = 0 =j:.
Iimm---+ooE(YT/\m), and it follows that {YT/\m : m :0:: 1} is not uniformly integrable. Therefore
E(supm YT 1\m) = oo; see Exercise (7.10.6).
16. Since the game is fair, E(Sn+1 I Sn) = Sn. Also ISnl ::=: 1 + 2 + + n < oo. Therefore Sn is
a martingale. The occurrence of the event {N = n} depends only on the outcomes of the coin-tosses
up to and including the nth; therefore N is a stopping time.
A tail appeared at time N- 3, followed by three heads. Therefore the gamblers G1, G2, ... ,
G N -3 have forfeited their initial capital by time N, while G N -i has had i + 1 successful rounds for
0 ::=: i ::=: 2. Therefore SN = N- (p- 1 + p- 2 + p- 3 ), after a little calculation. It is easy to check that
N satisfies the conditions of the optional stopping theorem, and it follows that E(SN) = E(So) = 0,
which is to saythatE(N) = p-1 + p-2 + p-3.

In order to deal with HTH, the gamblers are re-programmed to act as follows. If they win
on their first bet, they bet their current fortune on tails, returning to heads thereafter. In this case,
SN = N- (p- 1 + p- 2q- 1) where q = 1- p (remember that the game is fair), and therefore
E(N) = p-1 + p-2q-1.

17. Let :Fn = a ({Xi , Yi : 1 ::=: i ::=: n}), and note that T is a stopping time with respect to this
filtration. Furthermore lP'(T < oo) = 1 since Tis no larger than the first-passage time to 0 of either
of the two single-coordinate random walks, each of which has mean 0 and is therefore persistent.
Let a'f = var(X1) and ai = var(YI). We have that Un - Uo and Vn - Vo are sums of
independent summands with means 0 and variances a'f and ai respectively. It follows by considering

407

[12.9.18]-[12.9.19]

Solutions

Martingales

the martingales (Un- Uo) 2 -no} and (Vn- Vo) 2 -no} (see equation (12.5.14) and Exercise (10.2.2))
that
E{(UT - Uo) 2} = o}E(T), E{(VT - Vo) 2} = o}E(T).
Applying the same argument to (Un
E { (UT

+ VT- Uo-

+ Vn)- (Uo + Vo), we obtain

Vo) 2} = E(T)E{(Xr

+ Yr) 2} =

E(T)(a12 + 2c + a 22).

Subtract the two earlier equations to obtain


E{ (UT - Uo)(VT - Vo)} = cE(T)
if E(T) < oo. Now UT VT = 0, and in addition E(UT) = Uo, E(VT) = Vo, by Wald's equation and
the fact that E(X r) = E(Yr) = 0. It follows that -E(Uo Vo) = cE(T) if E(T) < oo, in which case
c < 0.
Suppose conversely that c < 0. Then(*) is valid with T replaced throughout by the bounded
stopping timeT 1\ m, and hence

0 :S E(UT Am VTAm) = E(Uo Vo)

+ cE(T 1\ m).

Therefore E(T 1\ m) :S E(Uo Vo)/(21cl) for all m, implying that E(T) = limm-+oo E(T
and so E(T) = -E(Uo Vo)/c as before.

1\

m) < oo,

18. Certainly 0 :S Xn :S 1, and in addition Xn is measurable with respect to the a-field :Fn =
a(Rr, R2, ... , Rn). Also E(Rn+l I Rn) = Rn- Rn/(52- n), whence E(Xn+l I :Fn) = Xn.
Therefore Xn is a martingale.
A strategy corresponds to a stopping time. If the player decides to call at the stopping time T, he
wins with (conditional) probability X T and therefore IP'(wins) = E(XT ), which equals E(Xo) (=
by the optional stopping theorem.
Here is a trivial solution to the problem. It may be seen that the chance of winning is the same
for a player who, after calling "Red Now", picks the card placed at the bottom of the pack rather than
that at the top. The bottom card is red with probability
irrespective of the strategy of the player.

i)

i,

19. (a) A sums of money in weekt is equivalentto a sums /(1 +a)t in weekO, since the latter sum may
be invested now to yield s in week t. If he sells in week t, his discounted costs are :L~= 1 cI (1 + a )n
and his discounted profit is Xt/(1 + a)t. He wishes to find a stopping time for which his mean
discounted gain is a maximum.
Now

so that tt(T) = E{ (1

+ a)-T ZT} -

(c/a).

(b) The function h(y) = ay - fy IP'(Zn > y) dy is continuous and strictly increasing on [0, 100 ),
with h(O) = -E(Zn) < 0 and h(y) --+ oo as y --+ oo. Therefore there exists a unique y (> 0) such
that h (y) = 0, and we choose y accordingly.
(c) Let :Fn = a(Zr, Z2, ... , Zn). We have that
00

E(max{Zn, y}) = y

00

[1- G(y)]dy = (1 +a)y

where G(y) = IP'(Zn :S y). Therefore E(Vn+l I :Fn) = (1


non-negative supermartingale.

408

+ a)-ny

:S Vn, so that CVn, :Fn) is a

Solutions [12.9.20]-[12.9.21]

Problems

Let ~-t(r) be the mean gain of following the strategy 'accept the first offer exceeding r - (cja)'.
The corresponding stopping timeT satisfies JP>(T = n) = G(r)n(l- G(r)), and therefore
00

~-t(r)

+ (cja)

L E{ (1 + a)-T Zr l{T=n)}
n=O
00

= L(l

+ a)-nG(r)n(l- G(r))E(Z, I z,

n=O

1 +a
{ r(l - G(r))
1 +a- G(r)

00

> r)

(1 - G(y)) dy } .

Differentiate with care to find that the only value of r lying in the support of Z 1 such that ~-t' (r) = 0
is the value r = y. Furthermore this value gives a maximum for ~-t( r ). Therefore, amongst strategies
of the above sort, the best is that with r = y. Note that ~-t(Y) = y(l +a) - (cja).
Consider now a general strategy with corresponding stopping time T, where JP>(T < oo) = 1. For
anypositiveintegerm, T /\misaboundedstoppingtime, whenceE(VTAm):::: E(Vo) = y(l+a). Now
IVT Am I :::: l::~o IVil, and l::~o EIVi I < oo. Therefore {Vr Am : m :=::: 0} is uniformly integrable.
Also Vr Am --+ Vr a.s. as m --+ oo, and it follows that E(Vr Am) --+ E(Vr ). We conclude that
J-t(T) = E(Vr)- (cja) :::: y(l +a)- (cja) = J-t(y). Therefore the strategy given above is optimal.
(d) In the special case, JP>(ZI > y) = (y - 1)- 2 for y ::::, 2, whence y = 10. The target price is
therefore 9, and the mean number of weeks before selling is G(y)/(1 - G(y)) = 80.

20. Since G is convex on [0, oo) wherever it is finite, and since G(l) =land G'(l) < 1, there exists
a unique value of IJ (> 1) such that G(l]) = IJ. Furthermore, Yn = IJZn defines a martingale with
mean E(Yo) = IJ. Using the maximal inequality (12.6.6),

IJ) ::S

JP>(supZn::::, k) = lP' ( supYn :=:::


n

k-l
I]

for positive integers k. Therefore


00

E(supZn) ::S
n

L - .1

k=I I] -

21. LetMn bethenumberpresentafterthenthround, soMo = K, andMn+I = Mn -Xn+I n::::, 1,


where Xn is the number of matches in the nth round. By the result of Problem (3.11.17), EXn = 1
for all n, whence
E(Mn+I + n + 1 I :Fn) = Mn + n,
where :Fn is the a-field generated by Mo, M1, ... , Mn. Thus the sequence {Mn + n} is a martingale.
Now, N is clearly a stopping time, and therefore K = Mo + 0 = E(MN + N) =EN.
We have that

+ n + 1) 2 + Mn+I I Fn}
= (Mn + n) 2 - 2(Mn + n)E(Xn+I
::S (Mn + n) 2 + Mn,

E{ (Mn+I

- 1) + Mn

+ E{ (Xn+I

where we have used the fact that

1 if Mn > 1,
if Mn = 1.

var(Xn+l I :Fn) = { 0

409

- 1) 2 - Xn+I :Fn}

[12.9.22]-[12.9.24]

Solutions

Martingales

+ n) 2 + Mn}

Hence the sequence {(Mn


supermartingales,

is a supermartingale. By an optional stopping theorem for

and therefore var(N) _:::: K.

22. In the usual notation,


E(M(s

+ t) I.'Fs)

= E(fos W(u)du

+ 1s+t W(u)du-

HW(s

= M(s) + tW(s)- W(s)E([W(s + t)-

+ t)- W(s) + W(s)} 3 1.'Fs)

W(s)l 2

I.'Fs) = M(s)

as required. We apply the optional stopping theorem (12.7.12) with the stopping timeT= inf{ u :
W ( u) E {a, b}}. The hypotheses of the theorem follow easily from the boundedness of the process
fort E [0, T], and it follows that

Hence the required area A has mean


E(A) = E( (T W(u)du) =

3 ) = ~a 3 (~) + ~b 3 (-a-).
~E(W(T)
3
3
a-b
3
a-b

[We have used the optional stopping theorem twice actually, in that E(W(T)) = 0 and therefore
JP>(W(T) =a)= -bj(a- b).]

23. With .'Fs = a(W(u) : 0 _:::: u _:::: s), we have for s < t that
E(R(t) 2 I .'Fs) = E(IW(s)l 2 + IW(t)- W(s)l 2

+ 2W(s) (W(t)-

W(s)) I:Fs) = R(s) 2

+ (t- s),

and the first claim follows. We apply the optional stopping theorem (12.7.12) with T = inf{u :
IW(u)l =a}, as in Problem (12.9.22), to find that 0 = E(R(T) 2 - T) = a 2 - E(T).

24. We apply the optional stopping theorem to the martingale W(t) with the stopping timeT to find
that E(W(T)) = -a(1 - Pb) + bpb = 0, where Pb = JP>(W(T) = b). By Example (12.7.10),
W (t) 2 - t is a martingale, and therefore, by the optional stopping theorem again,

whence E(T) = ab. For the final part, we take a = band apply the optional stopping theorem to the
2 t] to obtain
martingale exp[eW(t)-

ie

on noting that the conditional distribution ofT given W (T) = b is the same as that given W (T) = -b.
Therefore, E(e-i 02 T)

1/ cosh(be), and the answer follows by substituting s

410

ie 2.

13
Diffusion processes

13.3 Solutions. Diffusion processes


1.

It is easily seen that

E{ X(t +h)- X(t) X(t)} =(A.- p,)X(t)h + o(h),


E( {X (t +h) - X (t)} 2 1 X (t)) = (A.+ p,)X (t)h + o(h),

which suggest a diffusion approximation with instantaneous mean a(t, x) = (A.- p,)x and instantaneous variance b(t, x) =(A.+ p,)x.
2. The following method is not entirely rigorous (it is an argument of the following well-known
type: it is valid when it works, and not otherwise). We have that

-aM =
at

1oo e
-oo

0Yaf
-

at

dy

1oo {ea(t, y) + ~e b(t, y) }e


-oo
2

0Y f

dy,

by using the forward equation and integrating by parts. Assume that a(t, y) = I:n an (t)yn, b(t, y) =
I:n f3n (t)yn. The required expression follows from the 'fact' that

3.

Using Exercise (13.3.2) or otherwise, we obtain the equation


aM= emM + le 2M
2
at

with boundary condition M(O, e) = l. The solution is M(t) = exp{ ~e (2m + e)t}.

4.

Using Exercise (13.3.2) or otherwise, we obtain the equation


aM
aM
1 2
-=-e-+-e M
2
ae
at

with boundary condition M(O, e) = l. The characteristics of the equation are given by
dt
1

de
e

2dM
e2M'

with solution M(t, e) = e! 02 g(ee- 1) where g is a function satisfying 1 = e! 02 g(e). Therefore


M = exp{~e 2 (1- e- 21 )}.

411

[13.3.5]-[13.3.10]

Solutions

Diffusion processes

5. Fix t > 0. Suppose we are given W1 (s), W2(s), W3(s), for 0:::;: s :::;: t. By Pythagoras's theorem,
R(t + u) 2 =Xi+ X~+ X~ where the Xi are independent N(Wi (t), u) variables. Using the result of
Exercise (5.7.7), the conditional distribution of R(t + u) 2 (and hence of R(t + u) also) depends only
on the value of the non-centrality parameter e = R(t) 2 of the relevant non-central x2 distribution.
It follows that R satisfies the Markov property. This argument is valid for the n-dimensional Bessel
process.
6. By the spherical symmetry of the process, the conditional distribution of R(s +a) given R(s) = x
is the same as that given W(s) = (x, 0, 0). Therefore, recalling the solution to Exercise (13.3.5),

lP'(R(s +a):::;: y I R(s) =x)


=

i
{ (u-x) 2 +v 2 +w 2 }
(u,v,w):
dudvdw
2a
2:rra 312 exp u2+v2+w2:sy2 (
)

y 1 2:n: l:n:
1
{
1p=O
312 exp =0 0=0 (2:rra)

=loy~ { exp (

(p

~ax)2) -

p 2 - 2px cos e + x 2 } p 2 sm8d8ddp


.
2a
exp (

(p ;ax)2)} dp,

and the result follows by differentiating with respect to y.


7. Continuous functions of continuous functions are continuous. The Markov property is preserved
because gO is single-valued with a unique inverse.
I 2

8. (a) Since E(eaW(t)) = e2a t, this is not a martingale.


(b) This is a Wiener process (see Problem (13.12.1)), and is certainly a martingale.
(c) With :Ft = a(W(s) : 0:::;: s :::;: t) and t, u > 0,
E{ (t + u)W(t + u)- lot+u W(s)ds

I:Ft} =

(t + u)W(t)- lot W(s)ds -lt+u W(t)ds

= tW(t)- lot W(s)ds,

whence this is a martingale. [The integrability condition is easily verified.]

9. (a) With s < t, S(t) = S(s) exp{a(t- s) + b(W(t)- W(s)) }. Now W(t)- W(s) is independent
of {W(u) : 0:::;: u :::;: s}, and the claim follows.
(b) S (t) is clearly integrable and adapted to the filtration !F = (:Ft) so that, for s < t,
E(S(t) I Fs) = S(s)E(exp{a(t- s) + b(W(t)- W(s))} I Fs) = S(s)exp{a(t- s) + ~b 2 (t- s)},
which equals S(s) if and only if a+ ~b 2 = 0. In this case, E(S(t)) = E(S(O)) = 1.

10. Either find the instantaneous mean and variance, and solve the forward equation, or argue directly
as follows. With s < t,
lP'(S(t):::;: y I S(s) = x) = lP'(bW(t):::;: -at+ logy I bW(s) =-as+ logx).
Now b(W(t)- W(s)) is independent of W(s) and is distributed as N(O, b2(t- s)), and we obtain on
differentiating with respect to y that

f (t, y 1 s, x) -_

1
( (log(y/x)-a(t-s)) 2 )
exp ,
2
yy2:rrb2(t- s)
2b (t- s)

412

x,y > 0.

Solutions

Excursions and the Brownian bridge

[13.4.1]-[13.6.1]

13.4 Solutions. First passage times


1.

Certainly X has continuous sample paths, and in addition EIX (t) I < oo. Also, if s < t,

as required, where we have used the fact that W(t)- W(s) is N(O, t - s) and is independent of :Fs.

2.

Apply the optional stopping theorem to the martingale X of Exercise (13.4.1), with the stopping
time T, to obtain E(X(T)) = 1. Now W(T) = aT + b, and therefore E(elfrT+ilib) = l where
1/J = i ae + 2 . Solve to find that

1e

E(elfrT)

= e-ilib = exp { -b( Jaz- 21/1 +a)}

is the solution which gives a moment generating function.


We have that T :::: u if and only if there is no zero in (u, t], an event with probability l (2/n) cos- 1 { .JU?t}, and the claim follows on drawing a triangle.

3.

13.5 Solution. Barriers


1. Solving the forward equation subject to the appropriate boundary conditions, we obtain as usual
that
F(t, y) = g(t, y I d)+ e-Zmd g(t, y I -d)-

j_:

2me 2 mx g(t, y I x) dx

where g(t, y I x) = (2nt) -2 exp{ -(y - x - mt) 2 j(2t)}. The first two terms tend to 0 as t -+ oo,
regardless of the sign of m. As for the integral, make the substitution u = (x - y- mt) I ..fi to obtain,
as t-+ oo,
-

-(d+y+mt)/../i

-00

I 2

e-zu
{ 21mle-21mly
2me 2my - - du -+
..,fiir
0

ifm < 0,
ifm:::: 0.

13.6 Solutions. Excursions and the Brownian bridge


1.

Let f(t, x) = (2nt)-ze-x /(Zt). It may be seen that

IP'(W(t)

>xI Z, W(O) =

0)

= lim IP'(W(t)
w-!-0

>xI Z, W(O) =

w)

where Z = {no zeros in (0, t]}; the small missing step here may be filled by conditioning instead on
the event {W(E) = w, no zeros in (E, t]}, and taking the limit as E .j, 0. Now, if w > 0,

IP'(W(t) > x, Z I W(O)

= w) =

00

{J(t, y- w)- f(t, y

by the reflection principle, and

IP'(Z I W(O) =

w)

= l- 2

Loo f(t, y)dy


413

i:

+ w)} dy

f(t, y)dy

[13.6.2]-[13.6.5]

Solutions

Diffusion processes

by a consideration of the minimum value of Won (0, t]. It follows that the density function of W(t),
conditional on Z n {W(O) = w}, where w > 0, is

h ( ) _ f(t, x- w)- f(t, x


X -w f(t, y) dy

Jw

+ w)

'

X> 0.

Divide top and bottom by 2w, and take the limit as w {. 0:


lim hw(x) = _ _1_ 8f = ::_e-x2/(2t)
f(t, 0) ax
t
,

X> 0.

w-),0

2.

It is a standard exercise that, for a Wiener process W,

C=;),

E{W(t) I W(s) =a, W(1) =

0}

=a

E{W(s) 2 W(O) = W(l) =

0}

= s(l- s),

if 0 ::::: s ::::: t ::::: 1. Therefore the Brownian bridge B satisfies, for 0 ::::: s ::::: t ::::: 1,
E(B(s)B(t)) = E{ B(s)E(B(t) I B(s))} = 1 - t E(B(s) 2 ) = s(l- t)
1-s
as required. Certainly E(B(s)) = 0 for all s, by symmetry.
3. W is a zero-mean Gaussian process on [0, 1] with continuous sample paths, and also W(O) =
W(l) = 0. Therefore W is a Brownian bridge if it has the same autocovariance function as the
Brownian bridge, that is, c(s, t) = min{s, t}- st. For s < t,
cov(W(s), W(t)) = cov(W(s)- sW(1), W(t)- tW(l)) = s- ts- st

+ st =s-st

since cov(W(u), W(v)) = min{u, v}. The claim follows.


4. Either calculate the instantaneous mean and variance of W, or repeat the argument in the solution
to Exercise (13.6.3). The only complication in this case is the necessity to show that W(t) is a.s.
continuous at t = l, i.e., that u- 1 W(u- l) ---+ 0 a.s. as u---+ oo. There are various ways to show this.
Certainly it is true in the limit as u---+ oo through the integers, since, for integral u, W(u- l) may be
expressed as the sumofu -1 independent N(O, 1) variables (use the strong law). It remains to fill in the
gaps. Let n be a positive integer, let x > 0, and write Mn = max{IW(u)- W(n)l : n::::: u ::::: n + 1}.
We have by the stationarity of the increments that
oo

oo

E~)

n=O

n=O

L)"CMn ::=: nx) = ~)P'CM1 ::=: nx)::::: 1 + - - < oo,


implying by the Borel-Cantelli lemma that n- 1 Mn ::::: x for all but finitely many values of n, a.s.
Therefore n - 1 Mn ---+ 0 a.s. as n ---+ oo, implying that

.
1
.
1
hm --IW(u)l::::: hm -{IW(n)l
+1
n-+oo n

u-+oo u

+ Mn}---+ 0

5.

a.s.

In the notation of Exercise (13.6.4), we are asked to calculate the probability that W has no zeros
in the time interval between s /(1 - s) and t /(1 - t). By Theorem (13.4.8), this equals
1
2
1 - - cos-

:rr

s(1-y)
2
-1
t(1-s) =;cos

414

V~

Solutions

Stochastic calculus

[13.7.1]-[13.7.5]

13.7 Solutions. Stochastic calculus


1. Let :Fs = u(Wu : 0:::;: u :::;: s). Fix n :::0: 1 and define Xn(k) = IWktjznl for 0:::;: k:::;: 2n.
By Jensen's inequality, the sequence {Xn (k) : 0 :::;: k :::;: 2n} is a non-negative submartingale with
respect to the filtration :Fktf2n, with finite variance. Hence, by Exercise (4.3.3) and equation (12.6.2),
X~= max{Xn(k): 0 ::=:: k ::=:: 2n} satisfies

E(X~2 ) = 2lo xlP'(X~ > x) dx :::;: 2lo


00

00

E(Wt

= 2E(Wt X~) ::=:: 2VECWhE(Xj?)

/{X~:o:x}) dx =

2E{ wt lox:; dx}

by the Cauchy-Schwarz inequality.

Hence E(X~ 2 ) :::;: 4E(Wh. Now X~ 2 is monotone increasing inn, and W has continuous sample
paths. By monotone convergence,

2.

See the solution to Exercise (8.5.4).

3.

(a) We have that

w?,

The first summation equals


by successive concellation, and the mean-square limit of the second
it in mean square.
summation is t, by Exercise (8.5.4). Hence limn--+oo Ir (n) = i
Likewise, we obtain the mean-square limits:
lim /z(n) = i

n--+oo

4.

w?-

w? +it,

lim /3(n) = lim h(n) =

n--+oo

n--+oo

iw?.

Clearly E(U (t)) = 0. The process U is Gaussian with autocovariance function

Thus U is a stationary Gaussian Markov process, namely the Omstein-Uhlenbeck process. [See
Example (9.6.10).]
5.

Clearly E(Ut) = 0. For s < t,


E(UsUs+t) = E(Ws Wt)

+ ,B 2 E (1~ 0 1~ 0 e-.B(s-u)Wue-.B(t-v)Wv du dv)

- E ( Wt.B los e-.B(s-u)Wu du) - E ( Ws lot e-.B(t-v)Wv dv)


=s+,B 2e-.B(s+t)1s

1
1

e.B(u+v)min{u,v}dudv

u=O v=O

- ,B los e-.B(s-u) min{u, t} du- lot e-,B(t-v) min{s, v} dv


e2.Bs - 1 -,B(s+t)
2,8 e

415

[13.8.1]-[13.8.1]

Solutions

Diffusion processes

after prolonged integration. By the linearity of the definition of U, it is a Gaussian process. From the
calculation above, it has autocovariance function c(s, s +t) = (e-f3(t-s) -e-f3(t+sl)j(2(3). From this
we may calculate the instantaneous mean and variance, and thus we recognize an Omstein-Uhlenbeck
process. See also Exercise (13.3.4) and Problem (13.12.4).

13.8 Solutions. The Ito integral


1. (a) Fix t > 0 and let n :=:: I and 8 = tjn. We write tj = jtjn and Vj = Wtr By the absence of
correlation of Wiener increments, and the Cauchy-Schwarz inequality,

~ ~{(tj+1- tj) ltj+l E(IVH1- Wsl 2)ds}


j=O

tl

n-1 1
-(tj+1 -tj) 3
. 0 2

n-1 1 (t)3

L.
2 n

--+ 0

asn--+ oo .

]= 0

]=

Therefore,

lim ( tWtn--'>00

n-1

L vj+1 (tj+1- tj) )

= tWt-

j=O

lor Ws ds.
t

(b) As n --+ oo,


n-1
n-1
2:: V/(Vi+1- Vj) =
2:: {v/+ 1 - V/- 3Vj(Vj+1 - Vj) 2 - <VJ+1 - Yj) 3 }
j=O
j=O
n-1
n-1
=
Vj(tj+1- tj) + Vj{ <VJ+1 - Vj) 2 - (tH1 - tj)}] 2::<VJ+1 - Yj) 3
j=O
j=O

t w?- L [

--+

tw?- lot W(s)ds+O+O.

The fact that the last two terms tend to 0 in mean square may be verified in the usual way. For example,

E(I~(Yj+1- Yj) 3
J=O

n ~E[(Vj+l=

J=O
n-1
= 6
(tj+ 1 j=O

Vj) 6]
n-1

fj ) 3 =

416

L U)
j=O n

3
--+ 0

as n--+ oo.

Solutions

Ito'sjormula
(c) It was shown in Exercise (13.7.3a) that f~ Ws dWs = i

[13.8.2]-[13.9.1]

w?- it. Hence,

and the result follows because IECWh = t.


2. Fix t > 0 and n 2: 1, and let 8 = t/n. We set Vj = Wjtfn It is the case that Xt =
limn---+oo L,j Vj(tj+l- tj). Each term in the sum is normally distributed, and all partial sums are
multivariate normal for all8 > 0, and hence also in the limit as 8 --+ 0. Obviously IE(Xt) = 0. For
s ::::: t,
IE(XsXt) =lot los IE(Wu Wv) du dv =lot los min{u, v} du dv
=

losl-u 2 du + los u(t0 2

u) du = s 2

(t2

- - -s) .

Hence var(X t) = ~ t 3 , and the autocovariance function is


p(Xs, Xt) = 3

3.

By the Cauchy-Schwarz inequality, as n --+ oo,

4. We square the equation II I ( 1/f1 + 1/f2) 112


i = 1, 2, to deduce the result.
5.

~).
v~t (~2
6t

ll1/f1

+ 1/f2ll and use the fact that II I (1/Fi) 112 =

111/Fi II for

The question permits us to use the integrating factor ef3t to give, formally,
ef3t Xt

__s ds = ef3t Wt - fJ lot ef3s Ws ds


lootef3s dW
ds
o

on integrating by parts. This is the required result, and substitution verifies that it satisfies the given
equation.
6.

Find a sequence cp = (cp(n)) of predictable step functions such that llc/J(n)

-1/f II

--+ 0 as n --+ oo.

By the argument before equation (13.8.9), /(cp(n)) ~ /(1/f) as n --+ oo. By Lemma (13.8.4),
II/ (cp(n)) 112 = llc/J(n) II, and the claim follows.

13.9 Solutions. Ito's formula


1. The process Z is continuous and adapted with Zo = 0. We have by Theorem (13.8.11) that
IE(Zt - Zs I :Fs) = 0, and by Exercise (13.8.6) that

The first claim follows by the Levy characterization of a Wiener process (12. 7.1 0).

417

[13.9.2]-[13.10.2]

Diffusion processes

Solutions

We have inn dimensions that R2 = XI +X~++ X~, and the same argument yields that
(Xi/ R) dXi is a Wiener process. By Example (13.9.7) and the above,

Zt = ~i

Jd

d(R 2) = 2

Ln Xi dXi + n dt = 2R Ln X dXi + n dt = 2R dW + n dt.


___!_

i=l R

i=I

2.

Applying Ito's formula (13.9.4) to Y1 = W 14 we obtain dY1 = 4W? dW1 + 6W? dt. Hence,

3.

Apply Ito's formula (13.9.4) to obtain dY1 = W 1 dt + t dW1 Cf. Exercise (13.8.1).

4.

Note that X 1 =cos Wand X2 =sin W. By Ito's formula (13.9.4),


dY = d(Xr + iXz) = dXr + i dXz = d(cos W) + i d(sin W)

= -sin W dW- ~cos W dt + i cos W dW - ~sin W dt.


5.

We apply Ito's formula to obtain:

(a) (1

+ t) dX

(c)

+ dW,
+ -v'l=X2 dW,

= -X dt

(b) dX = -~X dt

d(~) = -~ (~) dt + (b~a -~b) (~) dW.


13.10 Solutions. Option pricing

1.

(a) We have that


E((ae 2 - K)+) = { 00

(aez- K)

00

~ exp (- (z- ;) 2 )

v2nr2

lrog(Kfa)

I 2

e-zY
(aeY+<Y - K ) - - dy

= aeY+z'

21
a

00

I
2
e-z(y-r)

v2n

dz

2r

z- y
log(K/a)- y
where y = - - , a= - - - - r

dy- K<l>(-a)

= aeY+ir 2<l>(r- a)- K<l>(-a).

(b) We have that ST = ae 2 where a = S1 and, under the relevant conditional Q-distribution, Z is
normal with mean y = (r - ~a 2 )(T - t) and variance r 2 = a 2(T - t). The claim now follows by
the result of part (a).
2. (a) Set ~(t, S) = ~(t, S1 ) and 1/r(t, S) = 1/r(t, S1 ), in the natural notation. By Theorem (13.10.15),
we have 1/rx = 1fr1 = 0, whence 1/r(t, x) = c for all t, x, and some constant c.
(b) Recall that dS = MS dt +aS dW. Now,
d(~S + 1frer 1 ) = d(S 2 + 1frer 1 ) = (aS) 2 dt + 2S dS + ert dlfr + lfrrert dt,

418

Solutions [13.10.3]-[13.10.5]

Option pricing

by Example (13.9.7). By equation (13.10.4), the portfolio is self-financing if this equals S dS


1/frert dt, and thus we arrive at the SDE ert dl{! = -S dS- a 2 S 2 dt, whence
'''(t
S) --lot
e-ru Su dSu - a2 lot e-ru s2u du
o/
'
0

(c) Note first that Zt

Jd Su du satisfies dZt

St dt. By Example (13.9.8), d(StZt)

Zt dSt

s[ dt, whence
d(~ S + 1/fert)

= Zt dSt +sf dt + ert dl{! + rert dt.

Using equation (13.10.4), the portfolio is self-financing if this equals Zt dSt


require that ert dl{! = -S[ dt, which is to say that

+ 1/frert dt, and thus we

3. We need to check equation (13.10.4) remembering that dMt = 0. Each of these portfolios is
self-financing.
(a) This case is obvious.
(b)d(~S + 1/J)

= d(2S 2 - s2 - t) = 2SdS +dt -dt = ~dS.


(c) d(~S + 1/1) = -S- tdS + S = ~ dS.
(d) Recalling Example (13.9.8), we have that

4. The time of exercise of an American call option must be a stopping time for the filtration (:Ft).
The value of the option, if exercised at the stopping time r, is V, = (S, - K)+, and it follows by
the usual argument that the value at time 0 of the option exercised at r is E<Qi(e-rr V, ). Thus the
value at time 0 of the American option is sup, {E<Qi(e-rr V,) }, where the supremum is taken over all
stopping times r satisfying lP'( r :::= T) = 1. Under the probability measure Q, the process e-rt Vt is
a martingale, whence, by the optional stopping theorem, E<Qi(e-rr V,) = Vo for all stopping times r.
The claim follows.
5. We rewrite the value at time 0 of the European call option, possibly with the aid of Exercise
(13.10.1), as

where N is an N(O, 1) random variable. It is immediate that this is increasing in So and r and is
decreasing in K. To show monotonicity in T, we argue as follows. Let T1 < T2 and consider the
European option with exercise date T2. In the corresponding American option we are allowed to
exercise the option at the earlier time T1 . By Exercise (13.10.4), it is never better to stop earlier than
T2, and the claim follows.
Monotonicity in a may be shown by differentiation.

419

[13.11.1]-[13.12.2]

Solutions

Diffusion processes

13.11 Solutions. Passage probabilities and potentials


1. LetHbeaclosedspherewithradius R (> lwl), and define PR(r) =
Then PR satisfies Laplace's equation in JRd, and hence

!!._
dr

lP'( G before H

II W(O)I = r ).

(rd-ldPR) =O
dr

since PR is spherically symmetric. Solve subject to the boundary equations PR(E) = 1, PR(R) = 0,
to obtain
2-d - R2-d
r
d-2
as R-+ oo.
PR(r) = 2-d R2-d -+ (E/r)
E

2. The electrical resistance Rn between 0 and the set /':,.n is no smaller than the resistance obtained
by, for every i = 1, 2, ... , 'shorting out' all vertices in the set /':,.i. This new network amounts to a
linear chain of resistances in series, points labelled /':,.i and /':,.i+ 1 being joined by a resistance if size
Ni-l . It follows that
00
1
R(G)= lim Rn::::L-
n->-oo
i=O Ni
By Theorem (13.11.18), the walk is persistent if "Ei Ni-l = oo.
3. Thinking of G as an electrical network, one may obtain the network H by replacing the resistance
of every edge e lying in G but not in H by oo. Let 0 be a vertex of H. By a well known fact in the
theory of electrical networks, R(H) :::: R(G), and the result follows by Theorem (13.11.19).

13.12 Solutions to problems


(a) T(t) =a W(t ja 2 ) has continuous sample paths with stationary independent increments, since
W has these properties. Also T(t)ja is N(O, tja 2 ), whence T(t) is N(O, t).
(b) As for part (a).
(c) Certainly V has continuous sample paths on (0, oo). For continuity at 0 it suffices to prove that
t W(t- 1 ) -+ 0 a.s. as t ,), 0; this was done in the solution to Exercise (13.6.4).
If (u,v), (s,t) are disjoint time-intervals, then so are (v- 1 ,u- 1), (t- 1 ,s- 1); since W has
independent increments, so has V. Finally,
1.

is N(O, {3) if s, t > 0, where

f3 =

_!!__ +s2
s+t

(!- _1_)
s

s+t

= t.

2. Certainly W is Gaussian with continuous sample paths and zero means, and it is therefore sufficient
to prove that cov(W (s), W(t)) = min{s, t}. Now, if s .:::; t,

as required.

420

Solutions [13.12.3]-[13.12.4]

Problems

Ifu(s) = s, v(t) = 1- t, then r(t)


this case X(t) = (1- t)W(t/(1- t)).
3.

= tj(l- t), and r- 1(w) = wj(l +

w) forO::; w < oo. In

Certainly U is Gaussian with zero means, and U (0) = 0. Now, with s1 = e 2f3t - 1,
E{U(t +h) I U(t)

= u} = e-f3(t+h)E{W(sr+h) I W(sr) = uef3 1 }


= ue-f3(t+h)ef3t = u- f3uh + o(h),

whence the instantaneous mean of U is a(t, u)


therefore
E{U(t

+ h) 2

-f3u. Secondly, St+h

= s1 +

2{3e 2f3 1h + o(h), and

= u} = e- 2f3(t+h)E{W(s 1+h) 2 W(s 1 ) = uef3 1 }


= e-2f3(t+h) (u2e2f3t + 2f3e2f3t h + o(h))
= u 2 - 2f3h(u 2 - 1) + o(h).

U(t)

It follows that

E{IU(t +h)- U(t)1 2 l U(t)

u}

= u2 - 2f3h(u 2 = 2f3h + o(h),

1)- 2u(u- f3uh) + u 2 + o(h)

and the instantaneous variance is b(t, u) = 2{3.

4.

Bartlett's equation (see Exercise (13.3.4)) for M(t, B)= E(eeV(t)) is

aM
at

aM
aB

1 2 2

-=-{38-+-a B M
2

with boundary condition M (B, 0) = e 8 u. Solve this equation (as in the exercise given) to obtain

the moment generating function of the given normal distribution. Now M (t, B) --+ exp{ ~B 2 a 2 j (2{3)}
as t --+ oo, whence by the continuity theorem V (t) converges in distribution to the N (0, ~a 2 j fJ)
distribution.
If V (0) has this limit distribution, then so does V (t) for all t. Therefore the sequence (V Ct1), ... ,
V(tn)) has the same joint distribution as (V(t1 +h), ... , V(tn +h)) for all h, t1; ... , tn, whenever
V (0) has this normal distribution.
In the stationary case, E(V(t)) = 0 and, for s ::; t,
cov(V(s), V(t))

= E{V(s)E(V(t) I V(s))} = E{V(s) 2 e-f3(t-s)} = c(O)e-f31t-sl

where c(O) = var(V (s)); we have used the first part here. This is the autocovariance function
of a stationary Gaussian Markov process (see Example (9.6.10)). Since all such processes have
autocovariance functions of this form (i.e., for some choice of fJ), all such processes are stationary
Omstein-Uhlenbeck processes.
The autocorrelation function is p(s) = e-f31sl, which is the characteristic function of the Cauchy
density function
1
X ER
f(x) = fJn{l + (x/{3)2},

421

[13.12.5]-[13.12.7]
5.

Diffusion processes

Solutions

Bartlett's equation (see Exercise (13.3.2)) forM is

aM
at

aM + -{JB
1
2 aM
ae 2 -ae '

= aB-

subject to M(O, B)= e8d. The characteristics satisfy

The solution isM= g(Beat j(a


The solution follows as given.
By elementary calculations,

dM

dt

+ ifJB))

E(D(t)) =

2d()
2aB

where g is a function satisfying g(Bj(a

aM I =
ae 8=0

a2 M
E(D(t)2) = --2

+ f3() 2

ae 8=0

+ ifJB))

= e 8d.

deat,

fJd
= -eat
(eat - 1)

+ d2e2at'

whence var(D(t)) = (fJdja)eat (eat - 1). Finally


IP'(D(t) = 0) =

lim

8-->--oo

M(t, B)= exp {

2adeat

{3(1 - ea )

which converges to e- 2adffJ as t--+ oo.

6.

The equilibrium density function g(y) satisfies the (reduced) forward equation
d
--(ag)
dy

where a(y)
are

-fJy and b(y)

1 d
+ --(bg)
=
2 dy2

= a 2 are the instantaneous mean and variance.

The boundary conditions

y = -c,d.
Integrate (*) from -c to y, using the boundary conditions, to obtain

-c

~d.

Integrate again to obtain g(y) = Ae-f3Y 2fa 2 . The constant A is given by the fact that f~c g(y) dy = 1.

7.

First we show that the series converges uniformly (along a subsequence), implying that the limit
exists and is a continuous function of t. Set
n-1
. (k )
~ sm t
Zmn(t) = ~ -k-Xk>

Mmn = sup{IZmn(t)l: 0 ~ t ~

7r }.

k=m

We have that
M2 <
mn -

su

n-1 eikt

~ -X
P l~ k k

O::or:::,:rr k=m

12

n-1 x2
< ~ ___k_
~ k2

k=m

422

+2

n-m-11n-1-1 XX I
1 J+l
~
~
~
~ ( + l)
1=1

J=m 1 1

Solutions

Problems

[13.12.8]-[13.12.8]

The mean value of the final term is, by the Cauchy-Schwarz inequality, no larger than
n-m-1

n-l-1

L:

"

+ [)2

L.J j2(j
}=m

1=1

< 2(n - m)
-

F.t--

m
m4

--

Combine this with (*) to obtain

It follows that

implying that .2:::~ 1 M2n-I 2n < oo a.s. Therefore the series which defines W converges uniformly
with probability 1 (along a ;ubsequence), and hence W has (a.s.) continuous sample paths.
Certainly W is a Gaussian process since W(t) is the sum of normal variables (see Problem
(7.11.19)). Furthermore E(W(t)) = 0, and
st

cov ( W(s), W(t) ) = -

2 ~ sin(ks) sin(kt)

+- L.J

T(

T(

k=1

since the Xi are independent with zero means and unit variances. It is an exercise in Fourier analysis
to deduce that cov(W(s), W(t)) = min{s, t}.

8.

We wish to find a solution g(t, y) to the equation

IYI <

b,

satisfying the boundary conditions


g(O, y) = 8yo

if IYI :S b,

g(t, y) =

0 if IYI =b.

Let g(t, y I d) be the N(d, t) density function, and note that g(-, I d) satisfies (*) for any
'source' d. Let
00

L:

g(t, y) =

c-1)kg(t, y 12kb),

k=-00

a series which converges absolutely and is differentiable term by term. Since each summand satisfies
(*),so does the sum. Now g(O, y) is a combination of Dirac delta functions, one at each multiple of2b.
Only one such multiple lies in [-b, b], and hence g(y, 0) = 8ao Also, setting y = b, the contributions
from the sources at -2(k- 1)b and 2kb cancel, so that g(t, b) = 0. Similarly g(t, -b) = 0, and
therefore g is the required solution.
Here is an alternative method. Look for the solution to(*) of the form e-l.nt sin{inn(y + b)jb};
such a sine function vanishes when IYI =b. Substitute into(*) to obtain An = n 2 n 2 /(8b 2 ). A linear
combination of such functions has the form
00
g(t,y) =Lane

-Ant .

sm

n=1

423

nn(y +b)
) .
2b

[13.12.9]-[13.12.11]

Solutions

Diffusion processes

We choose the constants an such that g(O, y) = 8yo for IYI < b. With the aid of a little Fourier
analysis, one finds that an = b- 1 sin(~mr).
Finally, the required probability equals the probability that wa has been absorbed by time t, a
probability expressible as 1- f~b fa(t, y) dy. Using the second expression for
this yields

r,

9. Recall that U(t) = e- 2mD(t) is a martingale. LetT be the time of absorption, and assume that
the conditions of the optional stopping theorem are satisfied. Then E(U(O)) = E(U(T)), which is to
say that 1 = e2ma Pa + e-2mb(l - Pa).

10. (a) We may assume that a, b > 0. With


Pt(b)

= lP'(W(t)

> b, F(O, t) I W(O) =a),

we have by the reflection principle that


Pt(b) = lP'(W(t) > b I W(O) =a) -lP'(W(t) < -b I W(O) =a)
= lP'(b- a< W(t) < b +a I W(O) =

o),

giving that
apr (b)
---az;=

f(t, b +a)- f(t, b- a)

where f(t, x) is the N(O, t) density function. Now, using conditional probabilities,
lP'(F(O, t) I W(O) =a, W(t) =b)=-

1
apr(b) = 1 _ e-2abft.
f(t, b- a) ab

(b) We know that

lP'(F(s, t)) = 1 - - cos- 1 {


T(

VS/t} = -2 sin- 1 { VS/t}


T(

if 0 < s < t. The claim follows since F (to, t2) s; F (t1, t2).
(c) Remember that sinx = x + o(x) as x 0. Take the limit in part (b) as to

+0 to obtain ,fiJ!i2.

11. Let M(t) = sup{W(s): 0:::; s:::; t} and recall that M(t) has the same distribution as IW(t)l. By
symmetry,
lP'( sup IW(s)l::::

w) :::; 2lP'(M(t):::: w)

= 2lP'(IW(t)l::::

w).

O~s~t

By Chebyshov's inequality,

Fix E > 0, and let


An(E) = {I W(s)l/s > E for somes satisfying 2n- 1 < s :::; 2n }.
Note that

424

Solutions [13.12.12]-[13.12.13]

Problems

for all large n, and also

Therefore L:n lP'(An(E)) < oo, implying by the Borel-Cantelli lemma that (a.s.) only finitely many of
the An(E) occur. Therefore t- 1W(t)--+ 0 a.s. as t--+ oo. Compare with the solution to the relevant
part of Exercise (13.6.4).

12. We require the solution to Laplace's equation


p(w)

v2 p

= 0, subject to the boundary condition

0 ifwEH,
{ 1 'f
G.
1 W E

Look for a solution in polar coordinates of the form


00

p(r, e) =

L rn {an sin(ne) + bn cos(ne)}.


n=O

Certainly combinations having this form satisfy Laplace's equation, and the boundary condition gives
that
00

H(e) = bo

+L

{an sin(ne)

+ bn cos(ne) },

1e1

< n,

n=1
where
H(e) = {

if - n <

e < o,

ifO<e<n.

The collection {sin(me), cos( me) : m 2: 0} are orthogonal over ( -n, n). Multiply through(*) by
sin(me) and integrate over ( -n, n) to obtain nam = {1 - cos(nm)}/m, and similarly bo = ~ and
bm = 0 for m 2: 1.

13. The joint density function of two independent N (0, t) random variables is (2n t) - 1 exp{- (x 2 +
y2)j(2t)}. Since this function is unchanged by rotations of the plane, it follows that the two coordinates of the particle's position are independent Wiener processes, regardless of the orientation of the
coordinate system. We may thus assume that lis the line x = d for some fixed positive d.
The particle is bound to visit the line l sooner or later, since lP'(W1 (t) < d for all t) = 0. The
first-passage timeT has density function
t > 0.
Conditional on {T = t}, D = W2(T) is N(O, t). Therefore the density function of Dis
fv(u) =

loo

oo

fDIT(u I t)fT(t) dt =

looo --e-(u
d
+d )/( 2
2

o 2m 2

t) dt

d
n(u 2

giving that DId has the Cauchy distribution.


The angle E> = PoR satisfies e = tan- 1(D/d), whence
lP'(E>.::; e)= lP'(D.::; dtane) =

425

2 + -;

1e1

< ~n.

+ d2)

u E ffi:.,

[13.12.14]-[13.12.18]

Solutions

Diffusion processes

14. By an extension of Ito's formula to functions of two Wiener processes, U = u(Wr, W2 ) and
V = v(Wr, W2) satisfy
dU = Ux dWr
dV = Vx dWr

+ uy dW2 + J-Cuxx + Uyy) dt,


+ vy dW2 + J-Cvxx + Vyy) dt,

where Ux, Vyy, etc, denote partial derivatives of u and v. Since is analytic, u and v satisfy the
Cauchy-Riemann equations Ux = vy, uy = -vx, whence u and v are harmonic in that Uxx + uyy =
Vxx + Vyy = 0. Therefore,

The matrix (

~xUy

uy) is an orthogonal rotation oflH:2 when u~ +u~ = 1. Since the joint distribution
Ux

of the pair (Wr, W2) is invariant under such rotations, the claim follows.
15. One method of solution uses the fact that the reversed Wiener process {W (t- s)- W (t) : 0 ::: s :::
t} has the same distribution as {W(s) : 0 ::: s ::: t}. Thus M(t)- W (t) = maxo~s~dW(s)- W(t)} has
the same distribution as maxo~u~dW(u)- W(O)} = M(t). Alternatively, by the reflection principle,
lP'(M(t) 2':: x, W(t)::: y) = lP'(W(t) 2':: 2x- y)

for x 2':: max{O, y}.

By differentiation, the pair M(t), W(t) has joint density function -2' (2x - y) for y ::: x, x 2':: 0,
where is the density function of the N(O, t) distribution. Hence M(t) and M(t)- W(t) have the
joint density function -2'(x + y). Since this function is symmetric in its arguments, M(t) and
M(t)- W(t) have the same marginal distribution.
16. The Lebesgue measure A(Z) is given by

A(Z)

= fooo l{W(t)=u) du,

whence by Fubini's theorem (cf. equation (5.6.13)),


JE:(A(Z))

= fooo lP'(W(t) = u) dt = 0.

17. Let 0 <a < b < c < d, and let M(x, y)

= maxx 9 ~y W(s).

M(c, d)- M(a, b)= max {W(s)- W(c)}


c~s~d

Then

+ {W(c)- W(b)}-

max {W(s)- W(b) }.

a~s~b

Since the three terms on the right are independent and continuous random variables, it follows that
lP'((M(c, d) = M(a, b)) = 0. Since there are only countably many rationals, we deduce that
lP'( (M(c, d) = M(a, b) for all rationals a < b < c <d) = 1, and the result follows.

18. The result is easily seen by exhaustion to be true when n = 1. Suppose it is true for all m ::: n - 1
where n 2':: 2.
(i) If sn ::: 0, then (whatever the final term of the permutation) the number of positive partial sums and
the position of the first maximum depend only on the remaining n - 1 terms. Equality follows by the
induction hypothesis.
(ii) If Sn > 0, then
n

Ar =

L Ar-1 (k),
k=l

426

Solutions

Problems

[13.12.19]-[13.12.20]

where Ar-1 (k) is the number of permutations with Xk in the final place, for which exactly r - 1 of the
first n- 1 terms are strictly positive. Consider a permutation rc = (xi 1 , xi 2 , ... , Xin-I, Xk) with Xk in
the final place, and move the positionofxk to obtain thenewpermutationrc' = (xk. xi 1 , Xi 2 , . , Xin-l ).
The first appearance of the maximum in rc' is at its rth place if and only if the first maximum of the
reduced permutation (xi 1 , Xi 2 , ... , X in- I) is at its (r - l)th place. [Note that r = 0 is impossible
since sn > 0.] It follows that
n

Br =

Br-1 (k),
k=1
where Br-1 (k) is the number of permutations with Xk in the final place, for which the first appearance
of the maximum is at the (r - 1)th place.
By the induction hypothesis, Ar-1 (k) = Br-1 (k), since these quantities depend on the n - 1
terms excluding Xk. The result follows.

2::f=

19. Suppose that Sm =


1 Xj, 0 ::": m ::": n, are the partial sums of n independent identically
distributed random variables Xj. Let An be the number of strictly positive partial sums, and Rn the
index of the first appearance of the value of the maximal partial sum. Each of the n! permutations
of (X 1, X2, ... , Xn) has the same joint distribution. Consider the kth permutation, and let h be
the indicator function of the event that exactly r partial sums are positive, and let h be the indicator
function that the first appearance ofthe maximum is at the rth place. Then, using Problem (13.12.18),
1 n!
1 n!
lP'(An = r) = - LE(h) = - LE(h) = lP'(Rn = r).
n! k=1
n! k=1
We apply this with Xj

= W(jtjn)- W((j-

l)t/n), so that Sm

l::j I{W(jtfn)>O) has the same distribution as

Rn = min{k 2':: 0: W(kt/n) =

= W(mtjn).

Thus An

m;:tx W(jtjn) }.
O:'OJ:'On

By Problem (13.12.17), Rn ~ R as n---+ oo. By Problem (13.12.16), the time spent by W at zero
is a.s. a null set, whence An ~ A. Hence A and R have the same distribution. We argue as follows
to obtain that that L and R have the same distribution. Making repeated use of Theorem (13.4.6) and
the symmetry of W,
lP'(L < x) =

lP'c~~~t W(s)

<

0) + lP'cl~~t W(s) > 0)

= 2lP'( sup {W(s)- W(x)} < -W(x)) = 2lP'(IW(t)- W(x)l < W(x))
x:'Os:'Ot

= lP'(IW(t)- W(x)l < IW(x)l)

= lP'(

sup {W(s)- W(x)} <


X:'OS:'Ot

sup {W(s)- W(x)})

= lP'(R

::": x).

0:'0S:'OX

Finally, by Problem (13.12.15) and the circular symmetry of the joint density distribution of two
independent N(O, 1) variables U, V,
lP'(IW(t)- W(x)l < IW(x)l) = lP'((t- x)V 2 ::::: xU 2 ) = lP' (

20. Let
Tx = {

inf {t ::": 1 : W (t)

= x}

:::::

~)
= ~ sin- 1 ~.
t
Vt

if this set is non-empty,


otherwise,

427

u2 +V 2

Jr

[13.12.21]-[13.12.24]

Diffusion processes

Solutions

and similarly Vx = sup{t .::; 1 : W(t) = x}, with Vx = 1 if W(t) "# x for all t E [0, 1]. Recall that
Uo and Vo have an arc sine distribution as in Problem (13.12.19). On the event {Ux < 1}, we may
write (using the re-scaling property of W)
Ux

Tx

+ (1

- Tx)Uo,

Vx

Tx

+ (1 -

Tx)Vo,

where Uo and Yo are independent of Ux and Vx, and have the above arc sine distribution. Hence Ux
and Vx have the same distribution. Now Tx has the first passage distribution of Theorem (13.4.5),
whence

- (r r/J)

Tx,Uo

'

x2)} {

= { -x- exp ( - v'2nr3


2r

1
}
n./r/J(1- rp)

Therefore,

and
1
fux(u)= lo u frxux(t,u)du=
exp ( -x2)
- ,
o
'
n.ju(1-u)
2u

O<x<l.

21. Note that Vis a martingale, by Theorem (13.8.11). Fix t and let 1/fs = sign(Ws), 0.::; s .::; t.
We have that 111/fll = v'f, implying by Exercise (13.8.6) that E(V?) = ll/(1/f)ll~ = t. By a similar
calculation, E(V? 1 :Fs) = v? + t - s for 0 .::; s .::; t. That is to say, v? - t defines a martingale, and
the result follows by the Levy characterization theorem of Example (12.7.10).
22. The mean cost per unit time is
f.l(T) =

~{ R +CloT lP'(IW(t)l 2: a) dt} = ~{ R + 2C loT (1- <P(a/v'f)) dt }

Differentiate to obtain that 11' (T) = 0 if


R = 2C {loT <P(a/v'f)dt- T<P(a/fl)} = aC loT t- 1rp(a/v'f)dt,

where we have integrated by parts.

23. Consider the portfolio with Ht, S1) units of stock and 1/f(t, S1) units of bond, having total value
w(t, S1) = x~(t. x) + ert 1/f(t, S1). By assumption,
(1- y)x~(t, x) = yer 1 1jf(t, x).
Differentiate this equation with respect to x and substitute from equation (13.10.16) to obtain the
differential equation (1- y)~ + x~x = 0, with solution ~(t, x) = h(t)xY-I, for some function h(t).
We substitute this, together with(*), into equation (13.10.17) to obtain that
h'- h(l- y)(~yo- 2

+ r) =

0.

It follows that h(t) = A exp{(1 - y)( ~yo- 2 + r)t}, where A is an absolute constant to be determined
according to the size of the initial investment. Finally, w(t, x) = y- 1 x~(t. x) = y- 1h(t)xY.

24. Using Ito's formula (13.9.4), the drift term in the SDE for U1 is
(-ut(T- t, W)

+ ~u22(T- t, W)) dt,

where u1 and u22 denote partial derivatives of u. The drift function is identically zero if and only if
-

U[ -

zU22.

428

Bibliography

A man will turn over half a library to make one book.

Samuel Johnson

Abramowitz, M. and Stegun, I. A. ( 1965). Handbook ofmathematicalfunctions with formulas,


graphs and mathematical tables. Dover, New York.
Billingsley, P. (1995). Probability and measure (3rd edn). Wiley, New York.
Breiman, L. (1968). Probability. Addison-Wesley, Reading, MA, reprinted by SIAM, 1992.
Chung, K. L. (1974). A course in probability theory (2nd edn). Academic Press, New York.
Cox, D. R. and Miller, H. D. (1965). The theory of stochastic processes. Chapman and Hall,
London.
Doob, J. L. (1953). Stochastic processes. Wiley, New York.
Feller, W. (1968). An introduction to probability theory and its applications, Vol. 1 (3rd edn).
Wiley, New York.
Feller, W. (1971). An introduction to probability theory and its applications, Vol. 2 (2nd edn).
Wiley, New York.
Grimmett, G. R. and Stirzaker, D. R. (2001). Probability and random processes, (3rd edn).
Oxford University Press, Oxford.
Grimmett, G. R. and Welsh, D. J. A. (1986). Probability, an introduction. Clarendon Press,
Oxford.
Hall, M. (1983). Combinatorial theory (2nd edn). Wiley, New York.
Harris, T. E. (1963). The theory of branching processes. Springer, Berlin.
Karlin, S. and Taylor, H. M. (1975). A first course in stochastic processes (2nd edn). Academic
Press, New York.
Karlin, S. and Taylor, H. M. (1981). A second course in stochastic processes. Academic
Press, New York.
Laha, R. G. and Rohatgi, V. K. (1979). Probability theory. Wiley, New York.
Loeve, M. (1977). Probability theory, Vol. 1 (4th edn). Springer, Berlin.
Loeve, M. (1978). Probability theory, Vol. 2 (4th edn). Springer, Berlin.
Moran, P. A. P. (1968). An introduction to probability theory. Clarendon Press, Oxford.
Stirzaker, D. R. (1994). Elementary probability. Cambridge University Press, Cambridge.
Stirzaker, D. R. (1999). Probability and random variables. Cambridge University Press,
Cambridge.
Williams, D. (1991). Probability with martingales. Cambridge University Press, Cambridge.

429

Index

Abbreviations used in this index: c.f. characteristic function; distn distribution; eqn equation;
fn function; m.g.f. moment generating function; p.g.f. probability generating function; pr.
process; r.v. random variable; r.w. random walk; s.r.w. simple random walk; thm theorem.

A
absolute value of s.r. w. 6.1.3
absorbing barriers: s.r.w. 1.7.3,
3.9.1, 5, 3.11.23, 25-26,
12.5.4-5, 7; Wiener pr.
12.9.22-3, 13.12.8-9
absorbing state 6.2.1
adapted process 13.8.6
affine transformation 4.13.11;
4.14.60
age-dependent branching pr.
5.5.1-2; conditional 5.1.2;
honest martingale 12.9.2; mean
10.6.13
age, see current life
airlines 1.8.39, 2. 7. 7
alarm clock 6.15.21
algorithm 3.11.33, 4.14.63, 6.14.2
aliasing method 4.11.6
alternating renewal pr. 10.5.2,
10.6.14
American call option 13.10.4
analytic fn 13.12.14
ancestors in common 5.4.2
anomalous numbers 3.6.7
Anscombe's theorem 7.11.28
antithetic variable 4.11.11
ants 6.15.41
arbitrage 3.3.7, 6.6.3
Arbuthnot, J. 3.11.22
arc sine distn 4.11.13; sample
from 4.11.13
arc sine law density 4.1.1
arc sine laws for r.w.: maxima
3.11.28; sojourns 5.3.5; visits
3.10.3

arc sine laws for Wiener pr.


13.4.3, 13.12.10, 13.12.19
Archimedes's theorem 4.11.32
arithmetic r.v. 5.9.4
attraction 1.8.29
autocorrelation function 9.3.3,
9.7.5, 8
autocovariance function 9.3.3,
9.5.2, 9.7.6, 19-20, 22
autoregressive sequence 8.7.2,
9.1.1, 9.2.1, 9.7.3
average, see moving average

B
babies 5.10.2
backward martingale 12.7.3
bagged balls 7.11.27, 12.9.13-14
balance equations 11.7.13
Bandrika 1.8.35-36, 4.2.3
bankruptcy, see gambler's ruin
Barker's algorithm 6.14.2
barriers: absorbing/retaining in
r.w. 1.7.3, 3.9.1-2, 3.11.23,
25-26; hitting by Wiener pr.
13.4.2
Bartlett: equation 13.3.2-4;
theorem 8.7.6, 11.7.1
batch service 11.8.4
baulking 8.4.4, 11.8.2, 19
Bayes's formula 1.8.14, 1.8.36
bears 6.13.1, 10.6.19
Benford's distn 3.6.7
Berkson's fallacy 3.11.37
Bernoulli: Daniel 3.3.4, 3.4.3-4;
Nicholas 3.3.4

430

Bernoulli: model 6.15.36;


renewal 8.7.3; shift 9.17.14;
sum ofr.v.s 3.11.14, 35
Bertrand's paradox 4.14.8
Bessel: function 5.8.5, 11.8.5,
11.8.16; B. pr. 12.9.23,
13.3.5-6, 13.9.1
best predictor 7.9.1; linear 7.9.3,
9.2.1-2, 9.7.1, 3
beta fn 4.4.2, 4.10.6
beta distn: b.-binomial 4.6.5;
sample from 4.11.4-5
betting scheme 6.6.3
binary: expansion 9 .1.2; fission
5.5.1
binary tree 5.12.38; r.w. on 6.4.7
binomial r.v. 2.1.3; sum of 3.11.8,
11
birth process 6.8.6; dishonest
6.8.7; forward eqns
6.8.4; divergent 6.8.7;
with immigration 6.8.5;
non-homogeneous 6.15.24;
see also simple birth
birth-death process: coupled
6.15.46; extinction 6.11.3,
5, 6.15.25, 12.9.10; honest
6.15.26; immigration-death
6.11.3, 6.13.18, 28; jump chain
6.11.1; martingale 12.9.10;
queue 8.7.4; reversible 6.5.1,
6.15.16; symmetric 6.15.27;
see also simple birth-death
birthdays 1.8 .30
bivariate: Markov chain 6.15.4;
negative binomial distn
5.12.16; p.g.f. 5.1.3
bivariate normal distn 4.7.5-6,
12, 4.9.4-5, 4.14.13, 16, 7.9.2,

Index
7.11.19; c.f. 5.8.11; positive

part 4.7.5, 4.8.8, 5.9.8


Black-Scholes: model 13.12.23;
value 13.10.5
Bonferroni's inequality 1.8.37
books 2.7.15
Boole's inequalities 1.8.11
Borel: normal number theorem
9.7.14; paradox 4.6.1
Borel-Cantelli lemmas 7.6.1,
13.12.11
bounded convergence 12.1.5
bow tie 6.4.11
Box-Muller normals 4.11.7
branching process: age-dependent
5.5.1-2, 10.6.13; ancestors
5.4.2; conditioned 5.12.21,
6.7.1-4; convergence, 12.9.8;
correlation 5.4.1; critical
7.10.1; extinction 5.4.3;
geometric 5.4.3, 5.4.6;
imbedded in queue 11.3.2,
11.7.5, 11; with immigration
5.4.5, 7.7.2; inequality 5.12.12;
martingale 12.1.3, 9, 12.9.1-2,
8; maximum of 12.9.20;
moments 5.4.1; p.g.f. 5.4.4;
supercritical 6.7.2; total
population 5.12.11; variance
5.12.9; visits 5.4.6
bridge 1.8.32
Brownian bridge 9.7.22, 13.6.2-5;
autocovariance 9.7.22, 13.6.2;
zeros of 13.6.5
Brownian motion; geometric
13.3.9; tied-down, see
Brownian bridge
Buffon: cross 4.5.3; needle 4.5.2,
4.14.31-32; noodle 4.14.31
busy period 6.12.1; in G/G/1
11.5.1; in M/G/1 11.3.3; in
M/M/1 11.3.2, 11.8.5; in
M!M/oo 11.8.9

c
cake, hot 3.11.32
call option: American 13.10.4;
European 13.10.4-5
Campbell-Hardy theorem 6.13.2
capture-recapture 3.5.4
car, parking 4.14.30
cards 1.7.2, 5, 1.8.33
Carroll, Lewis 1.4.4
casino 3.9.6, 7.7.4, 12.9.16
Cauchy convergence 7.3.1; in
m.s. 7.11.11
Cauchy distn 4.4.4; maximum
7 .11.14; moments 4.4.4;

sample from 4.11.9; sum 4.8.2,


5.11.4, 5.12.24-25
Cauchy-Schwarz inequality
4.14.27
central limit theorem 5.10.1, 3, 9,
5.12.33, 40, 7.11.26, 10.6.3
characteristic function
5.12.26-31; bivariate normal
5.8.11; continuity theorem
5.12.39; exponential distn
5.8.8; extreme-value distn
5.12.27; first passage distn
5.10.7-8; joint 5.12.30; law
of large numbers 7.11.15;
multinormal distn 5.8.6; tails
5.7.6; weak law 7.11.15
Chebyshov's inequality, one-sided
7.11.9
cherries 1.8.22
chess 6.6.6-7
chimeras 3.11.36
chi-squared distn: non-central
5.7.7; sum 4.10.1, 4.14.12
Cholesky decomposition 4.14.62
chromatic number 12.2.2
coins: double 1.4.3; fair 1.3.2;
first head 1.3.2, 1.8.2, 2.7.1;
patterns 1.3.2, 5.2.6, 5.12.2,
10.6.17, 12.9.16; transitive
2.7.16; see Poisson flips
colouring: graph 12.2.2; sphere
1.8.28; theorem 6.15.39
competition lemma 6.13.8
complete convergence 7.3.7
complex-valued process 9.7.8
compound: Poisson pr. 6.15.21;
Poisson distn 3.8.6, 5.12.13
compounding 5.2.3, 5.2.8
computer queue 6.9.3
concave fn 6.15.37
conditional: birth-death pr.
6.11.4-5; branching pr.
5.12.21, 6.7.1-4; convergence
13.8.3; correlation 9.7.21;
entropy 6.15.45; expectation
3.7.2-3, 4.6.2, 4.14.13, 7.9.4;
independence 1.5.5; probability
1.8.9; s.r.w. 3.9.2-3; variance
3.7.4, 4.6.7; Wiener pr.
8.5.2, 9.7.21, 13.6.1; see also
regression
continuity of: distn fns 2. 7.1 0;
marginals 4.5.1; probability
measures 1.8.16, 1.8.18;
transition probabilities 6.15.14
continuity theorem 5.12.35, 39

431

continuous r.v.: independence


4.5.5, 4.14.6; limits of discrete
r.v.s 2.3.1
convergence: bounded 12.1.5;
Cauchy 7.3.1, 7.11.11;
complete 7.3.7; conditional
13.8.3; in distn 7.2.4, 7.11.8,
16, 24; dominated 5.6.3, 7.2.2;
event of 7.2.6; martingale
7.8.3, 12.1.5, 12.9.6; Poisson
pr. 7.11.5; in probability 7.2.8,
7.11.15; subsequence 7.11.25;
in total variation 7 .2.9
convex: fn 5.6.1, 12.1.6-7; rock
4.14.47; shape 4.13.2-3,
4.14.61
corkscrew 8.4.5
Com Flakes 1.3.4, 1.8.13
countable additivity 1.8.18
counters 10.6.6-8, 15
coupling: birth-death pr. 6.15.46;
maximal 4.12.4-6, 7.11.16
coupons 3.3.2, 5.2.9, 5.12.34
covariance: matrix 3.11.15, 7.9.3;
of Poisson pr. 7 .11.5
Cox process 6.15.22
Cp inequality
Cramer-Wold device 7.11.19,
5.8.11
criterion: irreducibility 6.15.15;
Kolmogorov's 6.5.2; for
persistence 6.4.1 0
Crofton's method 4.13.9
crudely stationary 8.2.3
cube: point in 7.11.22; r.w. on
6.3.4
cumulants 5.7.3-4
cups and saucers 1.3.3
current life 10.5.4; and excess
10.6.9; limit 10.6.4; Markov
10.3.2; Poisson 10.5.4

D
dam 6.4.3
dead period 10.6.6-7
death-immigration pr. 6.11.3
decay 5.12.48, 6.4.8
decimal expansion 3.1.4, 7 .11.4
decomposition: Cholesky 4.14.62;
Krickeberg 12.9.11
degrees of freedom 5.7.7-8
delayed renewal pr. 10.6.12
de Moivre: martingale 12.1.4,
12.4.6; trial 3.5.1
De Morgan laws 1.2.1
density: arc sine 4.11.13; arc
sine law 4.1.1; betsa 4.11.4,

Index

4.14.11, 19, 5.8.3; bivariate


normal 4.7.5-6, 12, 4.9.4-5,
4.14.13, 16, 7.9.2, 7.11.19;
Cauchy 4.4.4, 4.8.2, 4.10.3,
4.14.4, 16, 5.7.1, 5.11.4,
5.12.19, 24-25, 7.11.14;
chi-squared 4.10.1, 4.14.12,
5.7.7; Dirichlet 4.14.58;
exponential 4.4.3, 4.5.5, 4.7.2,
4.8.1, 4.10.4, 4.14.4-5, 17-19,
24, 33, 5.12.32, 39, 6.7.1;
extreme-value 4.1.1, 4.14.46,
7.11.13; F(r, s) 4.10.2, 4,
5.7.8; first passage 5.10.7-8,
5.12.18-19; Fisher's spherical
4.14.36; gamma 4.14.10-12,
5.8.3, 5.9.3, 5.10.3, 5.12.14,
33; hypoexponential 4.8.4;
log-normal 4.4.5, 5.12.43;
multinormal 4.9.2, 5.8.6;
normal4.9.3, 5, 4.14.1,
5.8.4-6, 5.12.23, 42, 7.11.19;
spectral 9.3 .3; standard normal
4.7.5; Student's t 4.10.2-3,
5.7.8; uniform 4.4.3, 4.5.4,
4.6.6, 4.7.1, 3, 4, 4.8.5, 4.11.1,
8, 4.14.4, 15, 19, 20, 23-26,
5.12.32, 7.11.4, 9.1.2, 9.7.5;
Weibull 4.4.7, 7.11.13
departure pr. 11.2.7, 11.7.2-4,
11.8.12
derangement 3.4.9
diagonal selection 6.4.5
dice 1.5.2, 3.2.4, 3.3.3, 6.1.2;
weighted/loaded 2.7.12,
5.12.36
difference eqns 1.8.20, 3.4.9,
5.2.5
difficult customers 11.7.4
diffusion: absorbing barrier
13.12.8-9; Bessel pr. 12.9.23,
13.3.5-6, 13.9.1; Ehrenfest
model 6.5.5, 36; first passage
13.4.2; Ito pr. 13.9.3; models
3.4.4, 6.5.5, 6.15.12, 36;
Omstein-Uhlenbeck pr.
13.3.4, 13.7.4-5, 13.12.3-4,
6; osmosis 6.15.36; reflecting
barrier 13.5.1, 13.12.6; Wiener
pr. 12.7.22-23; zeros 13.4.1,
13.12.10; Chapter 13 passim
diffusion approximation to
birth-death pr. 13.3 .1
dimer problem 3.11.34
Dirichlet: density 4.14.58; distn
3.11.31
disasters 6.12.2-3, 6.15.28
discontinuous marginal 4.5.1
dishonest birth pr. 6.8.7

distribution: see also density;


arc sine 4.11.13; arithmetic
5.9.4; Benford 3.6.7; Bernoulli
3.11.14, 35; beta 4.11.4;
beta-binomial 4.6.5; binomial
2.1.3, 3.11.8, 11, 5.12.39;
bivariate normal 4.7.5-6, 12;
Cauchy 4.4.4; chi-squared
4.10.1; compound 5.2.3;
compound Poisson 5.12.13;
convergence 7 .2.4; Dirichlet
3.11.31, 4.14.58; empirical
9.7.22; exponential 4.4.3,
5.12.39; F(r, s) 4.10.2-4;
extreme-value 4.1.1, 4.14.46,
7.11.13; first passage
5.10.7-8. 5.12.18-19; gamma
4.14.10-12; Gaussian, see
normal; geometric 3.1.1,
3.2.2, 3.7.5, 3.11.7, 5.12.34,
39, 6.11.4; hypergeometric
3 .11.1 0-11; hypoexponential
4.8.4; infinitely divisible
5.12.13-14; inverse square
3.1.1; joint 2.5.4; lattice 5.7.5;
logarithmic 3 .1.1, 5 .2.3;
log-normal 4.4.5, 5.12.43;
maximum 4.2.2, 4.14.17;
median 2.7.11, 4.3.4, 7.3.11;
mixed 2.1.4, 2.3.4, 4.1.3;
modified Poisson 3 .1.1;
moments 5.11.3; multinomial
3.5.1, 3.6.2; multinormal
4.9.2; negative binomial
3.8.4, 5.2.3, 5.12.4, 16;
negative hypergeometric 3.5.4;
non-central 5.7.7-8; normal
4.4.6, 8, 4.9.3-5; Poisson
3.1.1, 3.5.2-3, 3.11.6, 4.14.11,
5.2.3, 5.10.3, 5.12.8, 14, 17,
33, 37, 39, 7.11.18; spectral
9.3.2, 4; standard normal 4.7.5;
stationary 6.9.11; Student's
t 4.10.2-3, 5.7.8; symmetric
3.2.5; tails 5.1.2, 5.6.4, 5.11.3;
tilted 5.7.11; trapezoidal 3.8.1;
trinormal 4.9.8-9; uniform
2.7.20, 3.7.5, 3.8.1, 5.1.6,
9.7.5; Weibull 4.4.7, 7.11.13;
zeta or Zipf 3.11.5
divergent birth pr. 6.8.7
divine providence 3.11.22
Dobrushin's bound and ergodic
coefficient 6.14.4
dog-flea model 6.5.5, 6.15.36
dominated convergence 5.6.3,
7.2.2
Doob's Lz inequality 13.7.1
Doob-Kolmogorov inequality
7.8.1-2

432

doubly stochastic: matrix 6.1.12,


6.15.2; Poisson pr. 6.15.22-23
downcrossings inequality 12.3.1
drift 13.3.3, 13.5.1, 13.8.9,
13.12.9
dual queue 11.5.2
duration of play 12.1.4

E
Eddington's controversy 1.8.27
editors 6.4.1
eggs 5.12.13
Ehrenfest model 6.5.5, 6.15.36
eigenvector 6.6.1-2, 6.15.7
embarrassment 2.2.1
empires 6.15.10
empirical distn 9.7.22
entrance fee 3.3.4
entropy 7 .5.1; conditional
6.15.45; mutual 3.6.5
epidemic 6.15.32
equilibrium, see stationary
equivalence class 7 .1.1
ergodic: coefficient 6.14.4;
measure 9.7.11; stationary
measure 9.7.11
ergodic theorem: Markov chain
6.15.44, 7.11.32; Markov pr.
7.11.33, 10.5.1; stationary pr.
9.7.10-11, 13
Erlang's loss formula 11.8.19
error 3.7.9; of prediction 9.2.2
estimation 2.2.3, 4.5.3, 4.14.9,
7.11.31
Euler: constant 5.12.27, 6.15.32;
product 5.12.34
European call option 13.10.4-5
event: of convergence 7.2.6;
exchangeable 7.3.4-5; invariant
9.5.1; sequence 1.8.16; tail
7.3.3
excess life 10.5.4; conditional
10.3.4; and current 10.6.9;
limit 10.3.3; Markov 8.3.2,
10.3.2; moments 10.3.3;
Poisson 6.8.3, 10.3.1, 10.6.9;
reversed 8.3.2; stationary
10.3.3
exchangeability 7.3.4-5
expectation: conditional
3.7.2-3, 4.6.2, 4.14.12,
7.9.4; independent r.v.s 7.2.3;
linearity 5.6.2; tail integral
4.3.3, 5; tail sum 3.11.13,
4.14.3
exponential distn: c.f. 5.8.8;
holding time 11.2.2; in Poisson
pr. 6.8.3; lack-of-memory

Index
property 4.14.5; limit in
branching pr. 5.6.2, 5.12.21,
6. 7.1; limit of geometric distn
5.12.39; heavy traffic limit
11.6.1; distn of maximum
4.14.18; in Markov pr. 6.8.3,
6.9.9; order statistics 4.14.33;
sample from 4.14.48; sum
4.8.1, 4, 4.14.10, 5.12.50,
6.15.42
exponential martingale 13.3.9
exponential smoothing 9.7.2
extinction: of birth-death pr.
6.11.3, 6.15.25, 27, 12.9.10;
of branching pr. 6.7.2-3
extreme-value distn 4.1.1,
4.14.46, 5.12.34, 7.11.13; c.f.
and mean 5.12.27

F
F(r, s) distn 4.10.2, 4;

non-central 5.7.8
fair fee 3.3.4
families 1.5.7, 3.7.8
family, planning 3.11.30
Farkas's theorem 6.6.2
filter 9. 7.2
filtration 12.4.1-2, 7
fingerprinting 3.11.21
finite: Markov chain 6.5.8, 6.6.5,
6.15.43-44; stopping time
12.4.5; waiting room 11.8.1
first passage: c.f. 5.10.7-8;
diffusion pr. 13.4.2; distn
5.10.7-8, 5.12.18-19; Markov
chain 6.2.1, 6.3.6; Markov pr.
6.9.5-6; mean 6.3.7; m.g.f.
5.12.18; s.r.w. 5.3.8; Wiener
pr. 13.4.2
first visit by s.r.w. 3.10.1, 3
Fisher: spherical distn 4.14.36;
F.-Tippett-Gumbel distn
4.14.46
FKG inequality 3.11.18, 4.11.11
flip-flop 8.2.1
forest 6.15.30
Fourier: inversion thm 5.9.5;
series 9.7.15, 13.12.7
fourth moment strong law 7 .11.6
fractional moments 3.3.5, 4.3.1
function, upper-class 7.6.1
functional eqn 4.14.5, 19

gambling: advice 3.9.4; systems


7.7.4
gamma distn 4.14.10-12, 5.8.3,
5.9.3, 5.10.3, 5.12.14, 33; g.
and Poisson 4.14.11; sample
from 4.11.3; sum 4.14.11
gamma fn 4.4.1, 5.12.34
gaps: Poisson 8.4.3, 10.1.2;
recurrent events 5.12.45;
renewal 10.1.2
Gaussian distn, see normal distn
Gaussian pr. 9.6.2-4, 13.12.2;
Markov 9.6.2; stationary 9.4.3;
white noise 13 .8.5
generator 6.9.1
geometric Brownian motion,
Wiener pr. 13.3.9
geometric distn 3.1.1, 3.2.2,
3.7.5, 3.11.7, 5.12.34, 39;
lack-of-memory property
3.11.7; as limit 6.11.4; sample
from 4.11.8; sum 3.8.3-4
goat 1.4.5
graph: colouring 12.2.2; r.w.
6.4.6, 9, 13.11.2-3

H
Hajek-Renyi-Chow inequality
12.9.4-5
Hall, Monty 1.4.5
Hastings algorithm 6.14.2
Hawaii 2.7.17
hazard rate 4.1.4, 4.4.7; technique
4.11.10
heat eqn 13.12.24
Heathrow 10.2.1
heavy traffic 11.6.1, 11.7.16
hen, see eggs
Hewitt-Savage zero-one law
7.3.4-5
hitting time 6.9.5-6; theorem
3.10.1, 5.3.8
Hoeffding's inequality 12.2.1-2
Holder's inequality 4.14.27
holding time 11.2.2
homogeneous Markov chain 6.1.1
honest birth-death pr. 6.15.26
Hotelling's theorem 4.14.59
house 4.2.1, 6.15.20, 51
hypergeometric distn 3.11.10-11
hypoexponential distn 4.8.4

idle period 11.5.2, 11.8.9


Galton's paradox 1.5.8
imbedding: jump chain 6.9.11;
Markov chain 6.9.12, 6.15.17;
gambler's ruin 3.11.25-26, 12.1.4,
queues: D/M/1 11.7.16; GIM/1
12.5.8

433

11.4.1, 3; M/G/1 11.3.1,


11.7.4; unsuccessful 6.15.17
immigration: birth-i. 6.8.5;
branching 5.4.5, 7.7.2; i.-death
6.11.2, 6.15.18; with disasters
6.12.2-3, 6.15.28
immoral stimulus 1.2.1
importance sampling 4.11.12
inclusion-exclusion principle
1.3.4, 1.8.12
increasing sequence: of events
1.8.16; of r.v.s 2.7.2
increments: independent 9.7.6,
16--17; orthogonal 7.7.1;
spectral 9.4.1, 3; stationary
9.7.17; of Wiener pr. 9.7.6
independence and symmetry 1.5.3
independent: conditionally 1.5 .5;
continuous r.v.s 4.5.5, 4.14.6;
current and excess life 10.6.9;
customers 11.7 .1; discrete
r. v.s 3 .11.1, 3; events 1.5 .1;
increments 9.7.17; mean,
variance of normal sample
4.10.5, 5.12.42; normal distn
4.7.5; pairwise 1.5.2, 3.2.1,
5.1.7; set 3.11.40; triplewise
5.1.7
indicators and matching 3.11.17
inequality: bivariate normal
4.7.12; Bonferroni 1.8.37;
Boole 1.8.11; Cauchy-Schwarz
4.14.27; Chebyshov
7.11.9; Dobrushin 6.14.4;
Doob-Kolmogorov
7.8.1; Doob L2 13.7.1;
downcrossings 12.3.1; FKG
3.11.18; Hajek-Renyi-Chow
12.9.4-5; Hoeffding 12.2.1-2;
Holder 4.14.27; Jensen 5.6.1,
7.9.4; Kolmogorov 7.8.1-2,
7.11.29-30; Kounias 1.8.38;
Lyapunov 4.14.28; maximal
12.4.3-4, 12.9.3, 5, 9; m.g.f.
5.8.2, 12.9.7; Minkowski
4.14.27; triangle 7.1.1, 3;
upcrossings 12.3.2
infinite divisibility 5.12.13-14
inner product 6.14.1, 7.1.2
inspection paradox 10.6.5
insurance 11.8.18, 12.9.12
integral: Monte Carlo 4.14.9;
normal 4.14.1; stochastic
9.7.19, 13.8.1-2
invariant event 9.5.1
inverse square distn 3.1.1
inverse transform technique 2.3 .3
inversion theorem 5.9.5; c.f.
5.12.20

Index

5.12.33, 40, 7.11.26, 10.6.3;


diffusion 13.3.1; distns 2.3.1;
events 1.8.16; gamma 5.9.3;
geometric-exponential 5.12.39;
lim inf 1.8.16; lim sup 1.8.16,
5.6.3, 7.3.2, 9-10, 12; local
J
5.9.2, 5.10.5-6; martingale
Jaguar 3.11.25
7.8.3; normal 5.12.41, 7.11.19;
Jensen's inequality 5.6.1, 7.9.4
Poisson 3.11.17; probability
joint: c.f. 5.12.30; density 2.7.20;
1.8.16-18; r.v. 2.7.2; uniform
distn 2.5.4; mass fn 2.5.5;
5.9.1
moments 5.12.30; p.g.f. 5.1.3-5 linear dependence 3.11.15
jump chain: of MIM!l 11.2.6
linear fn of normal r.v. 4.9.3-4
linear prediction 9. 7.1, 3
K
local central limit theorem 5.9.2,
key renewal theorem 10.3.3, 5,
5.10.5-6
10.6.11
logarithmic distn 3.1.1, 5.2.3
Keynes, J. M. 3.9.6
log-convex 3.1.5
knapsack problem 12.2.1
log-likelihood 7.11.31
Kolmogorov: criterion 6.5.2;
log-normal r.v. 4.4.5, 5.12.43
inequality 7.8.1-2, 7.11.29-30 lottery 1.8.31
Korolyuk-Khinchin theorem 8.2.3 Lyapunov's inequality 4.14.28
Kounias's inequality 1.8.38
Krickeberg decomposition 12.9.11
M
Kronecker's lemma 7.8.2,
machine 11.8.17
7.11.30, 12.9.5
magnets 3.4.8
kurtosis 4.14.45
marginal: discontinuous 4.5.1;
multinomial 3.6.2; order
L
statistics 4.14.22
L2 inequality 13.7.1
Markov chain in continuous
Labouchere system 12.9.15
time: ergodic theorem 7 .11.33,
lack of anticipation 6.9.4
10.5.1; first passage 6.9.5-6;
lack-of-memory property:
irreducible 6.15.15; jump chain
exponential distn 4.14.5;
6.9.11; martingale 12.7.1;
geometric distn 3 .11.7
mean first passage 6.9.6;
mean recurrence time 6.9.11;
ladders, see records
renewal pr. 8.3.5; reversible
Lancaster's theorem 4.14.38
6.15.16, 38; stationary distn
large deviations 5.11.1-3, 12.9.7
6.9.11;
two-state 6.9.1-2,
last exits 6.2.1, 6.15.7
6.15.17; visits 6.9.9
lattice distn 5.7.5
Markov chain in discrete time:
law: anomalous numbers 3.6.7;
absorbing state 6.2.2; bivariate
arc sine 3.10.3, 3.11.28, 5.3.5;
6.15.4; convergence 6.15.43;
De Morgan 1.2.1; iterated
dice 6.1.2; ergodic theorem
logarithm 7.6.1; large numbers
7.11.32; finite 6.5.8, 6.15.43;
2.2.2; strong 7.4.1, 7.8.2,
first passages 6.2.1, 6.3.6;
7.11.6, 9.7.10; unconscious
homogeneous 6.1.1; imbedded
statistician 3.11.3; weak 7.4.1,
6.9.11, 6.15.17, 11.4.1; last
7.11.15, 20-21; zero-one
exits 6.2.1, 6.15.7; martingale
7.3.4-5
12.1.8, 12.3.3; mean first
Lebesgue measure 6.15.29,
passage 6.3.7; mean recurrence
13.12.16
time 6.9.11; persistent 6.4.10;
left-continuous r.w. 5.3.7, 5.12.7
renewal 10.3.2; reversible
level sets of Wiener pr. 13.12.16
6.14.1; sampled 6.1.4, 6.3.8;
simulation of 6.14.3; stationary
Levy metric 2.7.13, 7.1.4, 7.2.4
distn 6.9.11; sum 6.1.8;
limit: binomial 3.11.1 0;
two-state 6.15.11, 17, 8.2.1;
binomial-Poisson 5.12.39;
visits 6.2.3-5, 6.3.5, 6.15.5, 44
branching 12.9.8; central
Markov-Kakutani theorem 6.6.1
limit theorem 5.10.1, 3, 9,
irreducible Markov pr. 6.15.15
iterated logarithm 7 .6.1
Ito: formula 13.9.2; process
13.9.3

434

Markov process: Gaussian 9.6.2;


reversible 6.15.16
Markov property 6.1.5, 10; strong
6.1.6
Markov renewal, see Poisson pr.
Markov time, see stopping time
Markovian queue, see M/M/1
marriage problem 3.4.3, 4.14.35
martingale: backward 12.7.3;
birth-death 12.9.10;
branching pr. 12.1.3, 9,
12.9.1-2, 8, 20; casino
7.7.4; continuous parameter
12.7.1-2; convergence 7.8.3,
12.1.5, 12.9.6; de Moivre
12.1.4, 12.4.6; exponential
13.3.9; finite stopping time
12.4.5; gambling 12.1.4,
12.5.8; Markov chain 12.1.8,
12.3.3, 12.7.1; optional
stopping 12.5.1-8; orthogonal
increments 7. 7.1; partial sum
12.7 .3; patterns 12.9.16;
Poisson pr. 12.7.2; reversed
12.7.3; simple r.w. 12.1.4,
12.4.6, 12.5.4-7; stopping
time 12.4.1, 5, 7; urn 7.11.27,
12.9.13-14; Wiener pr.
12.9.22-23
mass function, joint 2.5.5
matching 3.4.9, 3.11.17, 5.2.7,
12.9.21
matrix: covariance 3.11.15;
definite 4.9.1; doubly
stochastic 6.1.12, 6.15.2;
multiplication 4.14.63; square
root 4.9.1; stochastic 6.1.12,
6.14.1; sub-stochastic 6.1.12;
transition 7.11.31; tridiagonal
6.5.1, 6.15.16
maximal: coupling 4.12.4-6,
7.11.16; inequality 12.4.3-4,
12.9.3, 5, 9
maximum of: branching pr.
12.9.20; multinormal 5.9.7;
r.w. 3.10.2, 3.11.28, 5.3.1,
6.1.3, 12.4.6; uniforms
5.12.32; Wiener pr. 13.12.8,
11, 15, 17
maximum r.v. 4.2.2, 4.5.4,
4.14.17-18, 5.12.32, 7.11.14
mean: extreme-value 5.12.27;
first passage 6.3.7, 6.9.6;
negative binomial 5.12.4;
normal 4.4.6; recurrence time
6.9.11; waiting time 11.4.2,
11.8.6, 10
measure: ergodic 9.7.11-12;
Lebesgue 6.15.29, 13.12.16;

Index
stationary 9.7.11-12; strongly
mixing 9.7.12
median 2.7.11, 4.3.4, 7.3.11
menages 1.8.23
Mercator projection 6.13.5
meteorites 9. 7.4
metric 2.7.13, 7.1.4; Levy 2.7.13,
7 .1.4, 7 .2.4; total variation
2.7.13
m.g.f. inequality 5.8.2, 12.9.7
migration pr., open 11.7.1, 5
millionaires 3.9.4
Mills's ratio 4.4.8, 4.14.1
minimal, solution 6.3.6--7, 6.9.6
Minkowski's inequality 4.14.27
misprints 6.4.1
mixing, strong 9. 7.12
mixture 2.1.4, 2.3.4, 4.1.3, 5.1.9
modified renewall0.6.12
moments: branching pr. 5.4.1;
fractional 3.3.5, 4.3.1;
generating fn 5.1.8, 5.8.2,
5.11.3; joint 5.12.30; problem
5.12.43; renewal pr. 10.1.1; tail
integral 4.3.3, 5.11.3
Monte Carlo 4.14.9
Monty Hall 1.4.5
Moscow 11.8.3
moving average 8.7.1, 9.1.3,
9.4.2, 9.5.3, 9.7.1-2, 7;
spectral density 9. 7. 7
multinomial distn 3.5.1; marginals
3.6.2; p.g.f. 5.1.5
multinormal distn 4.9.2; c.f.
5.8.6; conditioned 4.9.6-7;
covariance matrix 4.9.2;
maximum 5.9.7; sampling
from 4.14.62; standard 4.9.2;
transformed 4.9.3, 4.14.62
Murphy's law 1.3.2
mutual information 3.6.5

N
needle, Buffon's 4.5.2,
4.14.31-32
negative binomial distn 3.8.4;
bivariate 5.12.16; moments
5.12.4; p.g.f. 5.1.1, 5.12.4
negative hypergeometric distn
3.5.4
Newton, I. 3.8.5
non-central distn 5.7.7-8
non-homogeneous: birth pr.
6.15.24; Poisson pr. 6.13.7,
6.15.19-20
noodle, Buffon's 4.14.31

norm 7.1.1, 7.9.6; equivalence


class 7.1.1; mean square 7.9.6;
rth mean 7.1.1
normal distn 4.4.6; bivariate
4.7.5-6, 12; central limit
theory 5.10.1, 3, 9, 5.12.23,
40; characterization of 5.12.23;
cumulants 5.7.4; limit 7.11.19;
linear transfomations 4.9.3-4;
Mills's ratio 4.4.8, 4.14.1;
moments 4.4.6, 4.14.1;
multivariate 4.9.2, 5.8.6;
regression 4.8.7, 4.9.6, 4.14.13,
7.9.2; sample 4.10.5, 5.12.42;
simulation of 4.11.7, 4.14.49;
square 4.14.12; standard 4.7.5;
sum 4.9.3; sum of squares
4.14.12; trivariate 4.9.8-9;
uncorrelated 4.8.6
normal integral 4.14.1
normal number theorem 9.7.14
now 6.15.50

0
occupation time for Wiener pr.
13.12.20
open migration 11.7.1, 5
optimal: packing 12.2.1; price
12.9.19; reset time 13.12.22;
serving 11.8.13
optimal stopping: dice 3.3.8-9;
marriage 4.14.35
optional stopping 12.5.1-8,
12.9.19; diffusion 13.4.2;
Poisson 12.7.2
order statistics 4.14.21;
exponential 4.14.33; general
4.14.21; marginals 4.14.22;
uniform 4.14.23-24, 39,
6.15.42, 12.7.3
Omstein-Uhlenbeck pr. 9.7.19,
13.3.4, 13.7.4-5, 13.12.3-4, 6;
reflected 13.12.6
orthogonal: increments 7.7.1;
polynomials 4.14.37
osmosis 6.15.36

p
pairwise independent: events
1.5.2; r.v.s 3.2.1, 3.3.3, 5.1.7
paradox: Bertrand 4.14.8;
Borel 4.6.1; Carroll 1.4.4;
Galton 1.5.8; inspection
10.6.5; Parrando 6.15.48; St
Petersburg 3.3.4; voter 3.6.6
parallel lines 4.14.52
parallelogram 4.14.60; property
7.1.2;

435

parking 4.14.30
Parrando's paradox 6.15.48
particle 6.15.33
partition of sample space 1.8.10
Pasta property 6.9.4
patterns 1.3.2, 5.2.6, 5.12.2,
10.6.17, 12.9.16
pawns 2.7.18
Pepys's problem 3.8.5
periodic state 6.5.4, 6.15.3
persistent: chain 6.4.10, 6.15.6;
r.w. 5.12.5-6, 6.3.2, 6.9.8,
7.3.3, 13.11.2; state 6.2.3-4,
6.9.7
Petersburg, see St Petersburg
pig 1.8.22
points, problem of 3.9.4, 3.11.24
Poisson: approximation 3.11.35;
coupling 4.12.2; flips 3.5.2,
5.12.37; sampling 6.9.4
Poisson distn 3.5.3;
characterization of 5.12.8, 15;
compound 3.8.6, 5.12.13; and
gamma distn 4.14.11; limit of
binomial 5.12.39; modified
3.1.1; sum 3.11.6, 7.2.10
Poisson pr. 6.15.29; age 10.5.4;
arrivals 8.7.4, 10.6.8;
autocovariance 7 .11.5,
9 .6.1; characterization
6.15.29, 9.7.16; colouring
theorem 6.15.39; compound
6.15.21; conditional property
6.13.6; continuity in m.s.
7 .11.5; covariance 7 .11.5;
differentiability 7.11.5;
doubly stochastic 6.15.22-23;
excess life 6.8.3, 10.3.1;
forest 6.15.30; gaps 10.1.2;
Markov renewal pr. 8.3.5,
10.6.9; martingales 12.7.2;
non-homogeneous 6.13.7,
6.15.19-20; optional stopping
12.7.2; perturbed 6.15.39-40;
renewal 8.3.5, 10.6.9-10;
Renyi's theorem 6.15.39;
repairs 11.7.18; sampling
6.9.4; spatial 6.15.30-31,
7.4.3; spectral density
9.7.6; sphere 6.13.3-4;
stationary increments 9.7.6,
16; superposed 6.8.1; thinned
6.8.2; total life 10.6.5; traffic
6.15.40, 49, 8.4.3
poker 1.8.33; dice 1.8.34
P6lya's urn 12.9.13-14
portfolio 13.12.23; self-financing
13.10.2-3
positive definite 9.6.1, 4.9.1

Index
positive state, see non-null
postage stamp lemma 6.3.9
potential theory 13.11.1-3
power series approximation
7.11.17
Pratt's lemma 7.10.5
predictable step fn 13.8.4
predictor: best 7.9.1; linear 7.9.3,
9.2.1-2, 9.7.1, 3
probabilistic method 1.8.28, 3.4. 7
probability: continuity 1.8.16,
1.8.18; p.g.f. 5.12.4, 13; vector
4.11.6
problem: matching 3.4.9, 3.11.17,
5.2.7, 12.9.21; menages 1.8.23;
Pepys 3.8.5; of points 3.9.4,
3.11.24; Waldegrave 5.12.10
program, dual, linear, and primal
6.6.3
projected r.w. 5.12.6
projection theorem 9.2.10
proof-reading 6.4.1
proportion, see empirical ratio
proportional investor 13.12.23
prosecutor's fallacy 1.4.6
protocol 1.4.5, 1.8.26
pull-through property 3.7.1

Q
quadratic variation 8.5.4, 13.7.2
queue: batch 11.8.4; baulking
8.4.4, 11.8.2, 19; busy period
6.12.1, 11.3.2-3, 11.5.1,
11.8.5, 9; costs 11.8.13;
departure pr. 11.2.7, 11.7.2-4,
11.8.12; difficult customer
11.7.4; D/M/1 11.4.3, 11.8.15;
dual 11.5.2; Erlang's loss fn
11.8.19; finite waiting room
11.8.1; G/G/1 11.5.1, 11.8.8;
G/M/1 11.4.1-2, 11.5.2-3;
heavy traffic 11.6.1, 11.8.15;
idle period 11.5.2, 11.8.9;
imbedded branching 11.3.2,
11.8.5, 11; imbedded Markov
pr. 11.2.6, 11.4.1, 3, 11.4.1;
imbedded renewal 11.3.3,
11.5.1; imbedded r.w. 11.2.2,
5; Markov, see M/M/1; M/D/1
11.3.1, 11.8.10-11; M/G/1
11.3.3, 11.8.6-7; M/G/oo
6.12.4, 11.8.9; migration
system 11.7.1, 5; M/M/1 6.9.3,
6.12.1, 11.2.2-3, 5-6, 11.3.2,
11.6.1, 11.8.5, 12; M!Mik
11.7.2, 11.8.13; M/M/oo 8.7.4;
series 11.8.3, 12; supermarket
11.8.3; tandem 11.2.7, 11.8.3;
taxicabs 11.8.16; telephone

orthogonal 7.7.1; p.g.f. 5.12.4,


13; Poisson 3.5.3; standard
normal 4.7.5; Student's t
4.10.2-3, 5.7.8; symmetric
3.2.5, 4.1.2, 5.12.22; tails
3.11.13, 4.3.3, 5, 4.14.3, 5.1.2,
5.6.4, 5.11.3; tilted 5.1.9,
5.7.11; trivial3.11.2; truncated
2.4.2; uncorrelated 3.11.12, 16,
4.5.7-8, 4.8.6; uniform 3.8.1,
R
4.8.4, 4.11.1, 9.7.5; waiting
radioactivity 10.6.6-8
time, see queue; Weibull 4.4.7;
zeta or Zipf 3.11.5
random: bias 5.10.9; binomial
coefficient 5 .2.1; chord
random walk: absorbed 3.11.39,
4.13.1; dead period 10.6.7;
12.5.4-5; arc sine laws
harmonic series 7.11.37;
3.10.3, 3.11.28, 5.3.5; on
integers 6.15.34; line 4.13.1-3,
binary tree 6.4.7; conditional
4.14.52; paper 4.14.56;
3.9.2-3; on cube 6.3.4;
parameter 4.6.5, 5.1.6,
first passage 5.3.8; first
5.2.3, 5.2.8; particles 6.4.8;
visit 3.10.3; on graph 6.4.6,
pebbles 4.14.51; permutation
9, 13.11.2-3; on hexagon
4.11.2; perpendicular 4.14.50;
6.15.35; imbedded in queue
polygons 4.13.10, 6.4.9; rock
11.2.2, 5; left-continuous 5.3.7,
4.14.57; rods 4.14.25-26,
5.12.7; martingale 12.1.4,
53-54; sample 4.14.21;
12.4.6, 12.5.4-5; maximum
subsequence 7.11.25; sum
3.10.2, 3.11.28, 5.3.1;
3.7.6, 3.8.6, 5.2.3, 5.12.50,
persistent 5.12.5-6, 6.3.2;
10.2.2; telegraph 8.2.2; triangle
potentials 13.11.2-3; projected
4.5.6, 4.13.6-8, 11, 13;
5.12.6; range of 3.11.27;
velocity 6.15.33, 40
reflected 11.2.1-2; retaining
barrier 11.2.4; returns to origin
random sample: normal 4.10.5;
3.10.1, 5.3.2; reversible 6.5.1;
ordered 4.12.21
simple 3.9.1-3, 5, 3.10.1-3; on
random variable: see also density
square 5.3.3; symmetric 1.7.3,
and distribution; arc sine
3.11.23; three dimensional
4.11.13; arithmetic 5.9.4;
6.15.9-10; transient 5.12.44,
Bernoulli 3.11.14, 35; beta
6.15.9, 7.5.3; truncated 6.5.7;
4.11.4; beta-binomial 4.6.5;
two dimensional 5.3.4, 5.12.6,
binomial 2.1.3, 3.11.8, 11,
12.9.17; visits 3.11.23, 6.9.8,
5.12.39; bivariate normal
10; zero mean 7.5.3; zeros of
4.7.5-6, 12; Cauchy 4.4.4;
3.10.1, 5.3.2, 5.12.5-6
c.f. 5.12.26-31; chi-squared
range of r.w. 3.11.27
4.10.1; compounding
rate of convergence 6.15.43
5.2.3; continuous 2.3.1;
ratios 4.3.2; Mills's 4.4.8; sex
Dirichlet 3.11.31, 4.14.58;
expectation 5.6.2, 7.2.3;
3.11.22
exponential 4.4.3, 5.12.39;
record: times 4.2.1, 4, 4.6.6, 10;
extreme-value 4.1.1, 4.14.46;
values 6.15.20, 7.11.36
F(r, s) 4.10.2, 4, 5.7.8;
recurrence, see difference
gamma 4.11.3, 4.14.10-12;
recurrence time 6.9 .11
geometric 3.1.1, 3.11.7;
recurrent: event 5.12.45, 7.5.2,
hypergeometric 3 .11.1 0-11;
9.7.4; see persistent
independent 3.11.1, 3, 4.5.5,
red now 12.9.18
7.2.3; indicator 3.11.17;
infinitely divisible 5.12.13-14; reflecting barrier: s.r.w. 11.2.1-2;
drifting Wiener pr. 13.5.1;
logarithmic 3.1.1, 5.2.3;
Ornstein-Uhlenbeck pr.
log-normal 4.4.5; median
13.12.6
2.7.11, 4.3.4; m.g.f. 5.1.8,
regeneration 11.3 .3
5.8.2, 5.11.3; multinomial
regression 4.8.7, 4.9.6, 4.14.13,
3.5.1; multinormal 4.9.2;
7.9.2
negative binomial 3.8.4;
rejection method 4.11.3-4, 13
normal 4.4.6, 4.7.5-6, 12;
exchange 11.8.9; two servers
8.4.1, 5, 11.7.3, 11.8.14;
virtual waiting 11.8.7; waiting
time 11.2.3, 11.5.2-3, 11.8.6,
8, 10
quotient 3.3.1, 4.7.2, 10, 13-14,
4.10.4, 4.11.10, 4.14.11, 14,
16, 40, 5.2.4, 5.12.49, 6.15.42

436

Index
reliability 3.4.5-6, 3.11.18-20,
renewal: age, see current life;
alternating 10.5.2, 10.6.14;
asymptotics 10.6.11; Bernoulli
8.7.3; central limit theorem
10.6.3; counters 10.6.6-8,
15; current life 10.3.2,
10.5.4, 10.6.4; delayed
10.6.12; excess life 8.3.2,
10.3.1-4, 10.5.4; r. function
10.6.11; gaps 10.1.2; key r.
theorem 10.3.3, 5, 10.6.11;
Markov 8.3.5; rn.g.f. 10.1.1;
moments 10.1.1; Poisson
8.3.5, 10.6.9-10; r. process
8.3.4; r.-reward 10.5.1-4;
r. sequence 6.15.8, 8.3.1, 3;
stationary 10.6.18; stopping
time 12.4.2; sum/superposed
10.6.10; thinning 10.6.16
Renyi's theorem 6.15.39
repairman 11.7.18
repulsion 1.8.29
reservoir 6.4.3
resources 6.15.47
retaining barrier 11.2.4
reversible: birth-death pr.
6.15.16; chain 6.14.1; Markov
pr. 6.15.16, 38; queue
11.7.2-3, 11.8.12, 14; r.w.
6.5.1
Riemarm-Lebesgue lemma 5.7.6
robots 3.7.7
rods 4.14.25-26,
ruin 11.8.18, 12.9.12; see also
gambler's ruin
runs 1.8.21, 3.4.1, 3.7.10, 5.12.3,
46-47

s
a-field 1.2.2, 4, 1.8.3, 9.5.1,
9.7.13; increasing sequence
of 12.4.7
StJohn's College 4.14.51
St Petersburg paradox 3.3.4
sample: normal 4.10.5, 5.12.42;
ordered 4.12.21
sampling 3.11.36; Poisson 6.9.4;
with and without replacement
3.11.10
sampling from distn: arc sine
4.11.13; beta 4.11.4-5; Cauchy
4.11.9; exponential4.14.48;
gamma 4.11.3; geometric
4.11.8; Markov chain 6.14.3;
multinormal 4.14.62; normal
4.11.7, 4.14.49; s.r.w. 4.11.6;
uniform 4 .11.1

secretary problem 3.11.17,


4.14.35
self-financing portfolio 13.10.2-3
semi-invariant 5. 7.3-4
sequence: of c.f.s 5.12.35; of
distns 2.3.1; of events 1.8.16;
of heads and tails, see pattern;
renewal 6.15.8, 8.3.1, 3; of
r.v.s 2.7.2
series of queues 11.8.3, 12
shift operator 9.7.11-12
shocks 6.13.6
shorting 13.11.2
simple birth pr. 6.8.4-5, 6.15.23
simple birth-death pr.:
conditioned 6.11.4-5; diffusion
approximation 13.3.1;
extinction 6.11.3, 6.15.27;
visits 6.11.6-7
simple immigration-death pr.
6.11.2, 6.15.18
simple: process 8.2.3; r.w.
3.9.1-3, 5, 3.10.1-3, 3.11.23,
27-29, 11.2.1, 12.1.4, 12.4.6,
12.5.4-7
simplex 6.15.42; algorithm
3.11.33
simulation, see sampling
sixes 3.2.4
skewness 4.14.44
sleuth 3.11.21
Slutsky's theorem 7.2.5
smoothing 9.7.2
snow 1.7.1
space, vector 2.7.3, 3.6.1
span of r.v. 5.7.5, 5.9.4
Sparre Andersen theorem
13.12.18
spectral: density 9.3 .3;
distribution 9.3.2, 4, 9.7.2-7;
increments 9.4.1, 3
spectrum 9.3.1
sphere 1.8.28, 4.6.1, 6.13.3-4,
12.9.23, 13.11.1; empty
6.15.31
squeezing 4.14.47
standard: bivariate normal
4.7.5; multinormal 4.9.2;
normal 4.7.5; Wiener pr. 9.6.1,
9.7.18-21, 13.12.1-3
state: absorbing 6.2.2; persistent
6.2.3-4; symmetric 6.2.5;
transient 6.2.4
stationary distn 6.9.1, 3-4, 11-12;
6.11.2; birth-death pr. 6.11.4;
current life 10.6.4; excess
life 10.3.3; Markov chain
9.1.4; open migration 11.7.1,

437

5; queue length 8.4.4, 8.7.4,


11.2.1-2, 6, 11.4.1, 11.5.2,
and Section 11.8 passim; r.w.
11.2.1-2; waiting time 11.2.3,
11.5.2-3, 11.8.8
stationary excess life 10.3.3
stationary increments 9. 7.1 7
stationary measure 9.7.11-12
stationary renewal pr. 10.6.18,
Stirling's formula 3.10.1, 3.11.22,
5.9.6, 5.12.5, 6.15.9, 7.11.26
stochastic: integral 9.7.19,
13.8.1-2; matrix 6.1.12,
6.14.1, 6.15.2; ordering
4.12.1-2
stopping time 6.1.6, 10.2.2,
12.4.1-2, 5, 7; for renewal pr.
12.4.2
strategy 3.3.8-9, 3.11.25, 4.14.35,
6.15.50, 12.9.19, 13.12.22
strong law of large numbers
7.4.1, 7.5.1-3, 7.8.2, 7.11.6,
9.7.10
strong Markov property 6.1.6
strong mixing 9.7.12
Student's t distn 4.10.2-3;
non-central 5.7.8
subadditive fn 6.15.14, 8.3.3
subgraph 13.11.3
sum of independent r.v.s:
Bernoulli 3.11.14, 35; binomial
3.11.8, 11; Cauchy 4.8.2,
5.11.4, 5.12.24-25; chi-squared
4.10.1, 4.14.12; exponential
4.8.1, 4, 4.14.10, 5.12.50,
6.15.42; gamma 4.14.11;
geometric 3.8.3-4; normal
4.9.3; p.g.f. 5.12.1; Poisson
3.11.6, 7.2.10; random 3.7.6,
3.8.6, 5.2.3, 5.12.50, 10.2.2;
renewals 10.6.10; uniform
3.8.1, 4.8.5
sum of Markov chains 6.1.8
supercritical branching 6. 7.2
supermartingale 12.1.8
superposed: Poisson pr. 6.8.1;
renewal pr. 10.6.10
sure thing principle 1. 7.4
survival 3.4.3, 4.1.4
Sylvester's problem 4.13.12,
4.14.60
symmetric: r.v. 3.2.5, 4.1.2,
5.12.22; r.w. 1.7.3; state 6.2.5
symmetry and independence 1.5.3
system 7.7.4; Labouchere 12.9.15

Index

two-dimensional: r.w. 5.3.4,


5.12.6, 12.9.17; Wiener pr.
13.12.12-14
two server queue 8.4.1, 5, 11.7.3,
11.8.14
two-state Markov chain 6.15.11,
17, 8.2.1; Markov pr. 6.9.1-2,
6.15.16-17
Type: T. one counter 10.6.6-7; T.
two counter 10.6.15

11.4.2; in M/D/1 11.8.1 0;


in M/G/1 11.8.6; in MIM/1
11.2.3; stationary distn 11.2.3,
5.7.8
11.5.2-3; virtual 11.8.7
tail: c.f. 5.7.6; equivalent 7.11.34;
Wald's eqn 10.2.2-3
event 7.3.3, 5; function 9.5.3;
integral 4.3.3, 5; sum 3.11.13,
Waldegrave's problem 5.12.10
4.14.3
Waring's theorem 1.8.13, 5.2.1
tail of distn: and moments 5.6.4,
weak law of large numbers 7.4.1,
5.11.3; p.g.f. 5.1.2
7.11.15, 20-21
tandem queue 11.2.7, 11.8.3
Weibull distn 4.4.7, 7.11.13
taxis 11.8.16
u
Weierstrass's theorem 7.3.6
telekinesis 2. 7.8
u
Mysaka 8.7.7
white noise, Gaussian 13.8.5
telephone: exchange 11.8.9; sales
unconscious statistician 3.11.3
3.11.38
Wiener process: absorbing
uncorrelated r.v. 3.11.12, 16,
barriers 13.13.8, 9; arc
testimony 1.8.27
4.5.7-8, 4.8.6
sine laws 13.4.3, 13.12.10,
thinning 6.8.2; renewal 10.6.16
uniform integrability: Section
13.12.19; area 12.9.22;
three-dimensional r.w. 6.15.10;
7.10 passim, 10.2.4, 12.5.1-2
Bartlett eqn 13.3.3; Bessel
transience of 6.15.9,
uniform distn 3.8.1, 4.8.4, 4.11.1,
pr. 13.3.5; on circle 13.9.4;
three-dimensional Wiener pr.
9.7.5; maximum 5.12.32;
conditional 8.5.2, 9.7.21,
13.11.1
order statistics 4.14.23-24, 39,
13.6.1; constructed 13.12.7;
6.15.42; sample from 4.11.1;
three series theorem 7.11.35
d-dimensional 13.7.1; drift
sum 3.8.1, 4.8.4
13.3.3, 13.5.1; on ellipse
tied-down Wiener pr. 9.7.21-22,
13.9.5; expansion 13.12.7;
uniqueness of conditional
13.6.2-5
first
passage 13.4.2; geometric
expectation
3.7.2
tilted distn 5.1.9, 5.7.11
13.3.9, 13.4.1; hits sphere
time-reversibility 6.5.1-3, 6.15 ..16 upcrossings inequality 12.3.2
13.11.1; hitting barrier
upper class fn 7.6.1
total life 10.6.5
13.4.2; integrated 9.7.20,
urns 1.4.4, 1.8.24-25, 3.4.2,
total variation distance 4.12.3-4,
12.9.22, 13.3.8, 13.8.1-2; level
4, 6.3.10, 6.15.12; P6lya's
7.2.9, 7.11.16
sets 13.12.16; martingales
12.9.13-14
12.9.22-23, 13.3.8-9;
tower property 3.7.1, 4.14.29
maximum 13.12.8, 11, 15,
traffic: gaps 5.12.45, 8.4.3;
17; occupation time 13.12.20;
heavy 11.6.1, 11.8.15; Poisson
value function 13.10.5
quadratic variation 8.5.4,
6.15.40, 49, 8.4.3
variance: branching pr. 5.12.9;
13.7.2; reflected 13.5.1; sign
transform, inverse 2.3.3
conditional 3.7.4, 4.6.7; normal
13.12.21; standard 9.7.18-21;
transient: r.w. 5.12.44, 7.5.3;
4.4.6
three-dimensional 13.11.1;
Wiener pr. 13.11.1
tied-down, see Brownian
vector space 2.7.3, 3.6.1
transition matrix 7.11.31
bridge; transformed 9.7.18,
Vice-Chancellor 1.3.4, 1.8.13
13.12.1, 3; two-dimensional
transitive coins 2.7.16
virtual waiting 11.8.7
13.12.12-14; zeros of 13.4.3,
trapezoidal distn 3.8.1
visits: birth-death pr. 6.11.6-7;
13.12.10
Markov chain 6.2.3-5, 6.3.5,
trial, de Moivre 3.5.1
Wiener-Hopf eqn 11.5.3, 11.8.8
r.w.
3.11.23,
6.9.9,
6.15.5,
44;
triangle inequality 7 .1.1, 3
29, 6.9.8, 10
Trinity College 12.9.15
voter paradox 3.6.6
X
triplewise independent 5.1. 7
X-ray 4.14.32
trivariate normal distn 4.9.8-9
trivial r. v. 3.11.2
waiting room 11.8.1, 15
truncated: r.v. 2.4.2; r.w. 6.5.7
waiting time: dependent 11.2.7;
Tunin's theorem 3.11.40
zero-one law, Hewitt-Savage
for a gap 10.1.2; in G/G/1
turning time 4.6.10
11.5.2, 11.8.8; in G/M/1
7.3.4-5
t, Student's 4.10.2-3; non-central

438

FROM A REVIEW OF PROBABILITY AND RANDOM PROCESSES:


PROBLEMS AND SOLUTIONS