
Solutions of Selected Problems from Probability Essentials, Second Edition


Solutions to selected problems of Chapter 2
2.1 Let us first prove by induction that $\#(2^{\Omega_n}) = 2^n$ if $\Omega_n = \{x_1, \dots, x_n\}$. For $n = 1$ it is clear that $\#(2^{\Omega_1}) = \#(\{\emptyset, \{x_1\}\}) = 2$. Suppose $\#(2^{\Omega_{n-1}}) = 2^{n-1}$. Observe that $2^{\Omega_n} = \{\{x_n\} \cup A : A \in 2^{\Omega_{n-1}}\} \cup 2^{\Omega_{n-1}}$, hence $\#(2^{\Omega_n}) = 2\,\#(2^{\Omega_{n-1}}) = 2^n$. This proves finiteness. To show that $2^\Omega$ is a $\sigma$-algebra we check:
1. $\emptyset \subset \Omega$ hence $\emptyset \in 2^\Omega$.
2. If $A \in 2^\Omega$ then $A \subset \Omega$ and $A^c \subset \Omega$, hence $A^c \in 2^\Omega$.
3. Let $(A_n)_{n \ge 1}$ be a sequence of subsets of $\Omega$. Then $\bigcup_{n=1}^\infty A_n$ is also a subset of $\Omega$, hence in $2^\Omega$.
Therefore $2^\Omega$ is a $\sigma$-algebra.
2.2 We check that $\mathcal{H} = \bigcap_{\alpha \in A} \mathcal{G}_\alpha$ has the three properties of a $\sigma$-algebra:
1. $\Omega \in \mathcal{G}_\alpha$ for every $\alpha \in A$, hence $\Omega \in \bigcap_{\alpha \in A} \mathcal{G}_\alpha$.
2. If $B \in \bigcap_{\alpha \in A} \mathcal{G}_\alpha$ then $B \in \mathcal{G}_\alpha$ for every $\alpha \in A$. This implies that $B^c \in \mathcal{G}_\alpha$ for every $\alpha \in A$, since each $\mathcal{G}_\alpha$ is a $\sigma$-algebra. So $B^c \in \bigcap_{\alpha \in A} \mathcal{G}_\alpha$.
3. Let $(A_n)_{n \ge 1}$ be a sequence in $\mathcal{H}$. Since each $A_n \in \mathcal{G}_\alpha$, $\bigcup_{n=1}^\infty A_n$ is in $\mathcal{G}_\alpha$ since $\mathcal{G}_\alpha$ is a $\sigma$-algebra, for each $\alpha \in A$. Hence $\bigcup_{n=1}^\infty A_n \in \bigcap_{\alpha \in A} \mathcal{G}_\alpha$.
Therefore $\mathcal{H} = \bigcap_{\alpha \in A} \mathcal{G}_\alpha$ is a $\sigma$-algebra.
2.3 a. Let $x \in (\bigcup_{n=1}^\infty A_n)^c$. Then $x \in A_n^c$ for all $n$, hence $x \in \bigcap_{n=1}^\infty A_n^c$. So $(\bigcup_{n=1}^\infty A_n)^c \subset \bigcap_{n=1}^\infty A_n^c$. Similarly, if $x \in \bigcap_{n=1}^\infty A_n^c$ then $x \in A_n^c$ for every $n$, hence $x \in (\bigcup_{n=1}^\infty A_n)^c$. So
$$\Big(\bigcup_{n=1}^\infty A_n\Big)^c = \bigcap_{n=1}^\infty A_n^c.$$
b. By part a, $\bigcap_{n=1}^\infty A_n = (\bigcup_{n=1}^\infty A_n^c)^c$, hence $(\bigcap_{n=1}^\infty A_n)^c = \bigcup_{n=1}^\infty A_n^c$.
2.4 $\liminf_n A_n = \bigcup_{n=1}^\infty B_n$ where $B_n = \bigcap_{m \ge n} A_m \in \mathcal{A}$ for every $n$, since $\mathcal{A}$ is closed under taking countable intersections. Therefore $\liminf_n A_n \in \mathcal{A}$, since $\mathcal{A}$ is closed under taking countable unions.
By De Morgan's law it is easy to see that $\limsup_n A_n = (\liminf_n A_n^c)^c$, hence $\limsup_n A_n \in \mathcal{A}$, since $\liminf_n A_n^c \in \mathcal{A}$ and $\mathcal{A}$ is closed under taking complements.
Note that $x \in \liminf_n A_n$ implies that there exists $n^*$ s.t. $x \in \bigcap_{m \ge n^*} A_m$, which implies $x \in \bigcup_{m \ge n} A_m$ for every $n$, i.e. $x \in \limsup_n A_n$. Therefore $\liminf_n A_n \subset \limsup_n A_n$.
2.8 Let $\mathcal{L} = \{B \subset \mathbb{R} : f^{-1}(B) \in \mathcal{B}\}$. It is easy to check that $\mathcal{L}$ is a $\sigma$-algebra. Since $f$ is continuous, $f^{-1}(B)$ is open (hence Borel) if $B$ is open. Therefore $\mathcal{L}$ contains the open sets, which implies $\mathcal{L} \supset \mathcal{B}$ since $\mathcal{B}$ is generated by the open sets of $\mathbb{R}$. This proves that $f^{-1}(B) \in \mathcal{B}$ if $B \in \mathcal{B}$, and that $\mathcal{A} = \{A \subset \mathbb{R} : \exists B \in \mathcal{B} \text{ with } A = f^{-1}(B)\} \subset \mathcal{B}$.
Solutions to selected problems of Chapter 3
3.7 a. Since $P(B) > 0$, $P(\cdot|B)$ defines a probability measure on $\mathcal{A}$; therefore by Theorem 2.4, $\lim_n P(A_n|B) = P(A|B)$.
b. We have that $A \cap B_n \uparrow A \cap B$, since $1_{A \cap B_n}(w) = 1_A(w)1_{B_n}(w) \uparrow 1_A(w)1_B(w)$. Hence $P(A \cap B_n) \uparrow P(A \cap B)$. Also $P(B_n) \uparrow P(B)$. Hence
$$P(A|B_n) = \frac{P(A \cap B_n)}{P(B_n)} \to \frac{P(A \cap B)}{P(B)} = P(A|B).$$
c.
$$P(A_n|B_n) = \frac{P(A_n \cap B_n)}{P(B_n)} \to \frac{P(A \cap B)}{P(B)} = P(A|B)$$
since $A_n \cap B_n \to A \cap B$ and $B_n \to B$.
3.11 Let $B = \{x_1, x_2, \dots, x_b\}$ and $R = \{y_1, y_2, \dots, y_r\}$ be the sets of $b$ blue balls and $r$ red balls respectively. Let $B' = \{x_{b+1}, x_{b+2}, \dots, x_{b+d}\}$ and $R' = \{y_{r+1}, y_{r+2}, \dots, y_{r+d}\}$ be the sets of $d$ new blue balls and $d$ new red balls respectively. Then we can write down the sample space as
$$\Omega = \{(a, b) : (a \in B \text{ and } b \in B \cup B' \cup R) \text{ or } (a \in R \text{ and } b \in R \cup R' \cup B)\}.$$
Clearly $\mathrm{card}(\Omega) = b(b+d+r) + r(b+d+r) = (b+r)(b+d+r)$. Now we can define a probability measure $P$ on $2^\Omega$ by
$$P(A) = \frac{\mathrm{card}(A)}{\mathrm{card}(\Omega)}.$$
a. Let
$$A = \{\text{second ball drawn is blue}\} = \{(a, b) : a \in B,\ b \in B \cup B'\} \cup \{(a, b) : a \in R,\ b \in B\}.$$
Then $\mathrm{card}(A) = b(b+d) + rb = b(b+d+r)$, hence $P(A) = \frac{b}{b+r}$.
b. Let
$$B = \{\text{first ball drawn is blue}\} = \{(a, b) : a \in B\}.$$
Observe $A \cap B = \{(a, b) : a \in B,\ b \in B \cup B'\}$ and $\mathrm{card}(A \cap B) = b(b+d)$. Hence
$$P(B|A) = \frac{P(A \cap B)}{P(A)} = \frac{\mathrm{card}(A \cap B)}{\mathrm{card}(A)} = \frac{b+d}{b+d+r}.$$
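A quick Monte Carlo check of both answers (a sketch, not part of the original argument; the values of $b$, $r$, $d$ below are arbitrary choices):

```python
import random

def polya_two_draws(b, r, d, trials=200_000, seed=0):
    """Simulate two draws from Polya's urn: after the first draw, the ball
    is returned together with d extra balls of the same color."""
    rng = random.Random(seed)
    second_blue = 0
    both_blue = 0
    for _ in range(trials):
        first_blue = rng.random() < b / (b + r)
        blues_now = b + d if first_blue else b
        if rng.random() < blues_now / (b + r + d):
            second_blue += 1
            both_blue += first_blue
    print("P(second blue)              ~", second_blue / trials,
          "  exact:", b / (b + r))
    print("P(first blue | second blue) ~", both_blue / second_blue,
          "  exact:", (b + d) / (b + d + r))

polya_two_draws(b=3, r=5, d=2)
```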
3.17 We will use the inequality $1 - x \le e^{-x}$ for $x > 0$, which is obtained from the Taylor expansion of $e^{-x}$ around $0$. By independence,
$$P((A_1 \cup \dots \cup A_n)^c) = P(A_1^c \cap \dots \cap A_n^c) = (1 - P(A_1)) \cdots (1 - P(A_n)) \le \exp(-P(A_1)) \cdots \exp(-P(A_n)) = \exp\Big(-\sum_{i=1}^n P(A_i)\Big).$$
Solutions to selected problems of Chapter 4
4.1 Observe that
$$P(k \text{ successes}) = \binom{n}{k} \Big(\frac{\lambda}{n}\Big)^k \Big(1 - \frac{\lambda}{n}\Big)^{n-k} = C\, a_n\, b_{1,n} \cdots b_{k,n}\, d_n$$
where
$$C = \frac{\lambda^k}{k!}, \qquad a_n = \Big(1 - \frac{\lambda}{n}\Big)^n, \qquad b_{j,n} = \frac{n - j + 1}{n}, \qquad d_n = \Big(1 - \frac{\lambda}{n}\Big)^{-k}.$$
It is clear that $b_{j,n} \to 1$ for every $j$ and $d_n \to 1$ as $n \to \infty$. Observe that
$$\log\Big(\Big(1 - \frac{\lambda}{n}\Big)^n\Big) = n\Big(-\frac{\lambda}{n} - \frac{\lambda^2}{n^2}\,\frac{1}{2\xi^2}\Big) \quad \text{for some } \xi \in \Big(1 - \frac{\lambda}{n},\, 1\Big)$$
by the Taylor series expansion of $\log(x)$ around $1$. It follows that $a_n \to e^{-\lambda}$ as $n \to \infty$ and that
$$|\text{Error}| = \big|e^{n \log(1 - \lambda/n)} - e^{-\lambda}\big| \le \Big|n \log\Big(1 - \frac{\lambda}{n}\Big) + \lambda\Big| = \frac{\lambda^2}{n}\,\frac{1}{2\xi^2},$$
which is of order $\lambda p$. Hence in order to have a good approximation we need $n$ large and $p$ small, as well as $\lambda$ to be of moderate size.
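The error estimate says the Poisson approximation improves as $n$ grows and $p$ shrinks with $\lambda = np$ moderate. A small numerical illustration (the parameter choices are ours, keeping $\lambda = 5$ fixed):

```python
from math import comb, exp, factorial

def max_pmf_gap(n, p):
    """Largest pointwise gap between Binomial(n, p) and Poisson(n*p) pmfs."""
    lam = n * p
    return max(abs(comb(n, k) * p**k * (1 - p)**(n - k)
                   - exp(-lam) * lam**k / factorial(k))
               for k in range(n + 1))

for n, p in [(10, 0.5), (100, 0.05), (1000, 0.005)]:
    print(f"n={n:4d}  p={p:<6}  max gap = {max_pmf_gap(n, p):.2e}")
```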
Solutions to selected problems of Chapter 5
5.7 We put $x_n = P(X \text{ is even})$ for $X \sim B(p, n)$. Let us prove by induction that $x_n = \frac{1}{2}(1 + (1-2p)^n)$. For $n = 1$, $x_1 = 1 - p = \frac{1}{2}(1 + (1-2p)^1)$. Assume the formula is true for $n-1$. If we condition on the outcome of the first trial we can write
$$x_n = p(1 - x_{n-1}) + (1-p)x_{n-1} = p\Big(1 - \frac{1}{2}\big(1 + (1-2p)^{n-1}\big)\Big) + (1-p)\Big(\frac{1}{2}\big(1 + (1-2p)^{n-1}\big)\Big) = \frac{1}{2}\big(1 + (1-2p)^n\big),$$
hence we have the result.
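The closed form can be checked against the exact binomial sum (a sketch; the parameter values are arbitrary):

```python
from math import comb

def p_even_exact(n, p):
    """P(X even) for X ~ Binomial(n, p), summed directly."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(0, n + 1, 2))

def p_even_formula(n, p):
    return 0.5 * (1 + (1 - 2 * p)**n)

for n, p in [(1, 0.3), (7, 0.25), (20, 0.9)]:
    assert abs(p_even_exact(n, p) - p_even_formula(n, p)) < 1e-12
print("formula matches the direct sum")
```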
5.11 Observe that $E(|X - \lambda|) = \sum_{i < \lambda} (\lambda - i) p_i + \sum_{i \ge \lambda} (i - \lambda) p_i$. Since
$$\sum_{i \ge \lambda} (i - \lambda) p_i = \sum_{i=0}^\infty (i - \lambda) p_i - \sum_{i < \lambda} (i - \lambda) p_i = \sum_{i < \lambda} (\lambda - i) p_i$$
(using $\sum_{i=0}^\infty (i - \lambda) p_i = E(X) - \lambda = 0$), we have that $E(|X - \lambda|) = 2\sum_{i < \lambda} (\lambda - i) p_i$. So
$$E(|X - \lambda|) = 2\sum_{i < \lambda} (\lambda - i) p_i = 2\sum_{k=0}^{\lambda-1} (\lambda - k)\, \frac{e^{-\lambda} \lambda^k}{k!} = 2 e^{-\lambda} \sum_{k=0}^{\lambda-1} \Big(\frac{\lambda^{k+1}}{k!} - \frac{\lambda^k}{(k-1)!}\Big) = 2 e^{-\lambda}\, \frac{\lambda^\lambda}{(\lambda - 1)!}.$$
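A direct numerical check of the closed form $2e^{-\lambda}\lambda^\lambda/(\lambda-1)!$ for integer $\lambda$ (a sketch; the truncation point is chosen so the neglected tail is negligible):

```python
from math import exp, factorial

def mad_poisson(lam, cutoff=500):
    """E|X - lam| for X ~ Poisson(lam), by direct summation of the pmf."""
    pmf = exp(-lam)      # P(X = 0)
    total = lam * pmf    # |0 - lam| * P(X = 0)
    for k in range(1, cutoff):
        pmf *= lam / k   # P(X = k) from P(X = k - 1)
        total += abs(k - lam) * pmf
    return total

for lam in [1, 2, 5, 10]:
    closed = 2 * exp(-lam) * lam**lam / factorial(lam - 1)
    print(lam, round(mad_poisson(lam), 10), round(closed, 10))
```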
Solutions to selected problems of Chapter 7
7.1 Suppose $\lim_n P(A_n) \ne 0$. Then there exists $\epsilon > 0$ such that there are distinct $A_{n_1}, A_{n_2}, \dots$ with $P(A_{n_k}) > \epsilon$ for every $k \ge 1$. This gives $\sum_{k=1}^\infty P(A_{n_k}) = \infty$, which is a contradiction since, by the hypothesis that the $A_n$ are disjoint, we have that
$$\sum_{k=1}^\infty P(A_{n_k}) = P\Big(\bigcup_{k=1}^\infty A_{n_k}\Big) \le 1.$$
7.2 Let $\mathcal{A}_n = \{A_\alpha : P(A_\alpha) > 1/n\}$. $\mathcal{A}_n$ is a finite set, otherwise we could pick disjoint $A_{\alpha_1}, A_{\alpha_2}, \dots$ in $\mathcal{A}_n$. This would give us $P(\bigcup_{m=1}^\infty A_{\alpha_m}) = \sum_{m=1}^\infty P(A_{\alpha_m}) = \infty$, which is a contradiction. Now $\{A_\alpha : \alpha \in B\} = \bigcup_{n=1}^\infty \mathcal{A}_n$, hence $(A_\alpha)_{\alpha \in B}$ is countable, since it is a countable union of finite sets.
7.11 Note that $\{x_0\} = \bigcap_{n=1}^\infty [x_0 - 1/n, x_0]$, therefore $\{x_0\}$ is a Borel set, and $P(\{x_0\}) = \lim_n P([x_0 - 1/n, x_0])$. Assuming that $f$ is continuous, $f$ is bounded by some $M$ on the interval $[x_0 - 1, x_0]$, hence $P(\{x_0\}) \le \lim_n M(1/n) = 0$.
Remark: For this result to be true we don't need $f$ to be continuous. When we define the Lebesgue integral (or more generally the integral with respect to a measure) and study its properties, we will see that this result is true for all Borel measurable non-negative $f$.
7.16 First observe that $F(x) - F(x-) > 0$ iff $P(\{x\}) > 0$. The family of events $\{\{x\} : P(\{x\}) > 0\}$ can be at most countable, as we have proven in Problem 7.2, since these events are disjoint and have positive probability. Hence $F$ can have at most countably many discontinuities. For an example with infinitely many jump discontinuities consider the Poisson distribution.
7.18 Let $F$ be as given. It is clear that $F$ is a nondecreasing function. For $x < 0$ and $x \ge 1$ the right continuity of $F$ is clear. For any $0 < x < 1$ let $i^*$ be such that $\frac{1}{i^*+1} \le x < \frac{1}{i^*}$. If $x_n \downarrow x$ then there exists $N$ such that $\frac{1}{i^*+1} \le x_n < \frac{1}{i^*}$ for every $n \ge N$. Hence $F(x_n) = F(x)$ for every $n \ge N$, which implies that $F$ is right continuous at $x$. For $x = 0$ we have that $F(0) = 0$. Note that for any $\epsilon$ there exists $N$ such that $\sum_{i=N}^\infty \frac{1}{2^i} < \epsilon$. So for all $x$ s.t. $|x| \le \frac{1}{N}$ we have that $F(x) \le \epsilon$. Hence $F(0+) = 0$. This proves the right continuity of $F$ for all $x$. We also have that $F(\infty) = \sum_{i=1}^\infty \frac{1}{2^i} = 1$ and $F(-\infty) = 0$, so $F$ is a distribution function of a probability on $\mathbb{R}$.
a. $P([1, \infty)) = F(\infty) - F(1-) = 1 - \sum_{n=2}^\infty \frac{1}{2^n} = 1 - \frac{1}{2} = \frac{1}{2}$.
b. $P([\frac{1}{10}, \infty)) = F(\infty) - F(\frac{1}{10}-) = 1 - \sum_{n=11}^\infty \frac{1}{2^n} = 1 - 2^{-10}$.
c. $P(\{0\}) = F(0) - F(0-) = 0$.
d. $P([0, \frac{1}{2})) = F(\frac{1}{2}-) - F(0-) = \sum_{n=3}^\infty \frac{1}{2^n} - 0 = \frac{1}{4}$.
e. $P((-\infty, 0)) = F(0-) = 0$.
f. $P((0, \infty)) = 1 - F(0) = 1$.
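A short numerical sketch of parts a, b and d, assuming (as the computations above use) that $F$ jumps by $2^{-i}$ at each point $1/i$:

```python
def F(x, terms=60):
    """F(x) = sum of 2**-i over all i >= 1 with 1/i <= x (truncated)."""
    if x <= 0:
        return 0.0
    return sum(2.0**-i for i in range(1, terms + 1) if 1.0 / i <= x)

eps = 1e-12  # evaluate just below a point to get the left limit F(x-)
print("P([1, oo))    =", 1 - F(1 - eps))       # 1/2
print("P([1/10, oo)) =", 1 - F(1/10 - eps))    # 1 - 2**-10
print("P([0, 1/2))   =", F(1/2 - eps))         # 1/4
```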
Solutions to selected problems of Chapter 9
9.1 It is clear by the definition of $\mathcal{F}$ that $X^{-1}(B) \in \mathcal{F}$ for every $B \in \mathcal{B}$. So $X$ is measurable from $(\Omega, \mathcal{F})$ to $(\mathbb{R}, \mathcal{B})$.
9.2 Since $X$ is both $\mathcal{F}$ and $\mathcal{G}$ measurable, for any $B \in \mathcal{B}$ we have $P(X \in B) = P(X \in B)P(X \in B)$, hence $P(X \in B) = 0$ or $1$. Without loss of generality we can assume that there exists a closed interval $I$ such that $P(X \in I) = 1$. Let $\pi_n = \{t_0^n, \dots, t_{l_n}^n\}$ be a partition of $I$ such that $\pi_n \subset \pi_{n+1}$ and $\sup_k (t_k^n - t_{k-1}^n) \to 0$. For each $n$ there exists $k^*(n)$ such that $P(X \in [t_{k^*(n)}^n, t_{k^*(n)+1}^n]) = 1$ and $[t_{k^*(n+1)}^{n+1}, t_{k^*(n+1)+1}^{n+1}] \subset [t_{k^*(n)}^n, t_{k^*(n)+1}^n]$. Now $a_n = t_{k^*(n)}^n$ and $b_n = t_{k^*(n)+1}^n$ are both Cauchy sequences with a common limit $c$. So $1 = \lim_n P(X \in [t_{k^*(n)}^n, t_{k^*(n)+1}^n]) = P(X = c)$.
9.3 $X^{-1}(A) = \big(Y^{-1}(A) \cap (Y^{-1}(A) \cap X^{-1}(A)^c)^c\big) \cup \big(X^{-1}(A) \cap Y^{-1}(A)^c\big)$. Observe that both $Y^{-1}(A) \cap X^{-1}(A)^c$ and $X^{-1}(A) \cap Y^{-1}(A)^c$ are null sets and therefore measurable. Hence if $Y^{-1}(A) \in \mathcal{A}'$ then $X^{-1}(A) \in \mathcal{A}'$. In other words, if $Y$ is $\mathcal{A}'$ measurable so is $X$.
9.4 Since $X$ is integrable, for any $\epsilon > 0$ there exists $M$ such that $\int |X| 1_{\{|X| > M\}}\, dP < \epsilon$, by the dominated convergence theorem. Note that
$$E[X 1_{A_n}] = E[X 1_{A_n} 1_{\{|X| > M\}}] + E[X 1_{A_n} 1_{\{|X| \le M\}}] \le E[|X| 1_{\{|X| > M\}}] + M P(A_n).$$
Since $P(A_n) \to 0$, there exists $N$ such that $P(A_n) \le \frac{\epsilon}{M}$ for every $n \ge N$. Therefore $E[X 1_{A_n}] \le \epsilon + \epsilon = 2\epsilon$ for every $n \ge N$, i.e. $\lim_n E[X 1_{A_n}] = 0$.
9.5 It is clear that $0 \le Q(A) \le 1$ and $Q(\Omega) = 1$, since $X$ is nonnegative and $E[X] = 1$. Let $A_1, A_2, \dots$ be disjoint. Then
$$Q\Big(\bigcup_{n=1}^\infty A_n\Big) = E\big[X 1_{\bigcup_{n=1}^\infty A_n}\big] = E\Big[\sum_{n=1}^\infty X 1_{A_n}\Big] = \sum_{n=1}^\infty E[X 1_{A_n}],$$
where the last equality follows from the monotone convergence theorem. Hence $Q(\bigcup_{n=1}^\infty A_n) = \sum_{n=1}^\infty Q(A_n)$. Therefore $Q$ is a probability measure.
9.6 If $P(A) = 0$ then $X 1_A = 0$ a.s., hence $Q(A) = E[X 1_A] = 0$. Now assume $P$ is the uniform distribution on $[0, 1]$ and let $X(x) = 2 \cdot 1_{[0,1/2]}(x)$. The corresponding measure $Q$ assigns zero measure to $(1/2, 1]$, however $P((1/2, 1]) = 1/2 \ne 0$.
9.7 Let's prove this first for simple functions, i.e. let $Y$ be of the form
$$Y = \sum_{i=1}^n c_i 1_{A_i}$$
for disjoint $A_1, \dots, A_n$. Then
$$E_Q[Y] = \sum_{i=1}^n c_i Q(A_i) = \sum_{i=1}^n c_i E_P[X 1_{A_i}] = E_P[XY].$$
For non-negative $Y$ we take a sequence of simple functions $Y_n \uparrow Y$. Then
$$E_Q[Y] = \lim_n E_Q[Y_n] = \lim_n E_P[X Y_n] = E_P[XY],$$
where the last equality follows from the monotone convergence theorem. For general $Y \in L^1(Q)$ we have that $E_Q[Y] = E_Q[Y^+] - E_Q[Y^-] = E_P[(XY)^+] - E_P[(XY)^-] = E_P[XY]$.
9.8 a. Note that $\frac{1}{X} X = 1$ a.s., since $P(X > 0) = 1$. By Problem 9.7, $E_Q[\frac{1}{X}] = E_P[\frac{1}{X} X] = 1$. So $\frac{1}{X}$ is $Q$-integrable.
b. $R : \mathcal{A} \to \mathbb{R}$, $R(A) = E_Q[\frac{1}{X} 1_A]$, is a probability measure since $\frac{1}{X}$ is non-negative and $E_Q[\frac{1}{X}] = 1$. Also $R(A) = E_Q[\frac{1}{X} 1_A] = E_P[\frac{1}{X} X 1_A] = P(A)$. So $R = P$.
9.9 Since $P(A) = E_Q[\frac{1}{X} 1_A]$, we have that $Q(A) = 0 \Rightarrow P(A) = 0$. Now, combining the results of the previous problems, we can easily observe that $Q(A) = 0 \Leftrightarrow P(A) = 0$ iff $P(X > 0) = 1$.
9.17 Let
$$g(x) = \frac{((x - \mu)b + \sigma)^2}{\sigma^2 (1 + b^2)^2}.$$
Observe that $\{X \ge \mu + b\sigma\} \subset \{g(X) \ge 1\}$. So
$$P(\{X \ge \mu + b\sigma\}) \le P(\{g(X) \ge 1\}) \le \frac{E[g(X)]}{1},$$
where the last inequality follows from Markov's inequality. Since $E[g(X)] = \frac{\sigma^2(1 + b^2)}{\sigma^2(1 + b^2)^2}$, we get that
$$P(\{X \ge \mu + b\sigma\}) \le \frac{1}{1 + b^2}.$$
9.19
$$x P(\{X > x\}) \le E[X 1_{\{X > x\}}] = \int_x^\infty \frac{z}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}\, dz = \frac{e^{-\frac{x^2}{2}}}{\sqrt{2\pi}}.$$
Hence
$$P(\{X > x\}) \le \frac{e^{-\frac{x^2}{2}}}{x\sqrt{2\pi}}.$$
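The bound can be compared with the exact Gaussian tail, which is expressible through the complementary error function (a quick numerical sketch):

```python
from math import erfc, exp, pi, sqrt

def tail(x):
    """P(X > x) for X ~ N(0, 1)."""
    return 0.5 * erfc(x / sqrt(2))

def bound(x):
    return exp(-x * x / 2) / (x * sqrt(2 * pi))

for x in [0.5, 1.0, 2.0, 4.0]:
    print(f"x={x}:  P(X>x) = {tail(x):.6g}  <=  bound = {bound(x):.6g}")
```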
9.21 $h(t+s) = P(\{X > t+s\}) = P(\{X > t+s, X > s\}) = P(\{X > t+s\} \mid \{X > s\})\, P(\{X > s\}) = h(t)h(s)$ for all $t, s > 0$. Note that this gives $h(\frac{1}{n}) = h(1)^{\frac{1}{n}}$ and $h(\frac{m}{n}) = h(1)^{\frac{m}{n}}$. So for all rational $r$ we have that $h(r) = \exp(\log(h(1))\, r)$. Since $h$ is right continuous, this gives $h(x) = \exp(\log(h(1))\, x)$ for all $x > 0$. Hence $X$ has the exponential distribution with parameter $-\log h(1)$.
Solutions to selected problems of Chapter 10
10.5 Let $P$ be the uniform distribution on $[-1/2, 1/2]$. Let $X(x) = x\, 1_{[-1/4,1/4]}(x)$ and $Y(x) = x\, 1_{[-1/4,1/4]^c}(x)$. It is clear that $XY = 0$, hence $E[XY] = 0$. It is also true that $E[X] = 0$. So $E[XY] = E[X]E[Y]$, however it is clear that $X$ and $Y$ are not independent.
10.6 a. $P(\min(X, Y) > i) = P(X > i)P(Y > i) = \frac{1}{2^i}\frac{1}{2^i} = \frac{1}{4^i}$. So $P(\min(X, Y) \le i) = 1 - P(\min(X, Y) > i) = 1 - \frac{1}{4^i}$.
b. $P(X = Y) = \sum_{i=1}^\infty P(X = i)P(Y = i) = \sum_{i=1}^\infty \frac{1}{2^i}\frac{1}{2^i} = \frac{1}{1 - \frac{1}{4}} - 1 = \frac{1}{3}$.
c. $P(Y > X) = \sum_{i=1}^\infty P(Y > i)P(X = i) = \sum_{i=1}^\infty \frac{1}{2^i}\frac{1}{2^i} = \frac{1}{3}$.
d. $P(X \text{ divides } Y) = \sum_{i=1}^\infty \sum_{k=1}^\infty \frac{1}{2^i}\frac{1}{2^{ki}} = \sum_{i=1}^\infty \frac{1}{2^i}\,\frac{1}{2^i - 1}$.
e. $P(X \ge kY) = \sum_{i=1}^\infty P(X \ge ki)P(Y = i) = \sum_{i=1}^\infty \frac{1}{2^i}\frac{1}{2^{ki-1}} = \frac{2}{2^{k+1} - 1}$.
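A Monte Carlo check of parts b, c and e (a sketch; the sampler below realizes $P(X = i) = 2^{-i}$, $i \ge 1$, by counting fair-coin flips up to and including the first head):

```python
import random

rng = random.Random(1)

def geom():
    """Sample P(X = i) = 2**-i, i >= 1."""
    i = 1
    while rng.random() < 0.5:
        i += 1
    return i

trials = 400_000
pairs = [(geom(), geom()) for _ in range(trials)]
print("P(X = Y)   ~", sum(x == y for x, y in pairs) / trials, "  exact: 1/3")
print("P(Y > X)   ~", sum(y > x for x, y in pairs) / trials, "  exact: 1/3")
k = 2
print("P(X >= kY) ~", sum(x >= k * y for x, y in pairs) / trials,
      "  exact:", 2 / (2**(k + 1) - 1))
```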
Solutions to selected problems of Chapter 11
11.11 Since $P\{X > 0\} = 1$ we have that $P\{Y < 1\} = 1$. So $F_Y(y) = 1$ for $y \ge 1$. Also $P\{Y \le 0\} = 0$, hence $F_Y(y) = 0$ for $y \le 0$. For $0 < y < 1$, $P\{Y > y\} = P\{X < \frac{1-y}{y}\} = F_X(\frac{1-y}{y})$. So
$$F_Y(y) = 1 - \int_0^{\frac{1-y}{y}} f_X(x)\, dx = 1 - \int_y^1 \frac{1}{z^2}\, f_X\Big(\frac{1-z}{z}\Big)\, dz$$
by the change of variables $x = \frac{1-z}{z}$. Hence
$$f_Y(y) = \begin{cases} 0 & y \le 0 \\ \frac{1}{y^2}\, f_X\big(\frac{1-y}{y}\big) & 0 < y \le 1 \\ 0 & 1 \le y < \infty. \end{cases}$$
11.15 Let $G(u) = \inf\{x : F(x) \ge u\}$. We would like to show $\{u : G(u) > y\} = \{u : F(y) < u\}$. Let $u$ be such that $G(u) > y$. Then $F(y) < u$ by the definition of $G$. Hence $\{u : G(u) > y\} \subset \{u : F(y) < u\}$. Now let $u$ be such that $F(y) < u$. Then $y < x$ for any $x$ such that $F(x) \ge u$, by the monotonicity of $F$. Now by the right continuity and the monotonicity of $F$ we have that $F(G(u)) = \inf_{F(x) \ge u} F(x) \ge u$. Then by the previous statement $y < G(u)$. So $\{u : G(u) > y\} = \{u : F(y) < u\}$. Now $P\{G(U) > y\} = P\{U > F(y)\} = 1 - F(y)$, so $G(U)$ has the desired distribution. Remark: We only assumed the right continuity of $F$.
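This is the inverse transform sampling method. A minimal sketch for the exponential distribution, where $G$ has the closed form $G(u) = -\log(1-u)/\lambda$:

```python
import math
import random

rng = random.Random(0)

def G(u, lam=1.0):
    """Generalized inverse of F(x) = 1 - exp(-lam * x)."""
    return -math.log(1.0 - u) / lam

samples = [G(rng.random()) for _ in range(200_000)]
print("sample mean ~", round(sum(samples) / len(samples), 3), " (exact: 1.0)")
print("P(X > 1)    ~", sum(s > 1 for s in samples) / len(samples),
      " (exact:", round(math.exp(-1), 4), ")")
```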
Solutions to selected problems of Chapter 12
12.6 Let $Z = \frac{1}{\sigma_Y} Y - \frac{\rho_{XY}}{\sigma_X} X$. Then
$$\sigma_Z^2 = \frac{1}{\sigma_Y^2}\sigma_Y^2 + \frac{\rho_{XY}^2}{\sigma_X^2}\sigma_X^2 - 2\,\frac{\rho_{XY}}{\sigma_X \sigma_Y}\,\mathrm{Cov}(X, Y) = 1 - \rho_{XY}^2.$$
Note that $\rho_{XY} = \pm 1$ implies $\sigma_Z^2 = 0$, which implies $Z = c$ a.s. for some constant $c$. In this case $X = \frac{\sigma_X}{\rho_{XY}\sigma_Y}(Y - c\sigma_Y)$, hence $X$ is an affine function of $Y$.
12.11 Consider the mapping $g(x, y) = (\sqrt{x^2 + y^2},\, \arctan(\frac{x}{y}))$. Let $S_0 = \{(x, y) : y = 0\}$, $S_1 = \{(x, y) : y > 0\}$, $S_2 = \{(x, y) : y < 0\}$. Note that $\bigcup_{i=0}^2 S_i = \mathbb{R}^2$ and $m^2(S_0) = 0$. Also, for $i = 1, 2$, $g : S_i \to \mathbb{R}^2$ is injective and continuously differentiable. The corresponding inverses are given by $g_1^{-1}(z, w) = (z \sin w,\, z \cos w)$ and $g_2^{-1}(z, w) = (-z \sin w,\, -z \cos w)$. In both cases we have that $|J_{g_i^{-1}}(z, w)| = z$, hence by Corollary 12.1 the density of $(Z, W)$ is given by
$$f_{Z,W}(z, w) = \Big(\frac{1}{2\pi} e^{-\frac{z^2}{2}} z + \frac{1}{2\pi} e^{-\frac{z^2}{2}} z\Big) 1_{(-\frac{\pi}{2}, \frac{\pi}{2})}(w)\, 1_{(0,\infty)}(z) = \frac{1}{\pi}\, 1_{(-\frac{\pi}{2}, \frac{\pi}{2})}(w) \cdot z e^{-\frac{z^2}{2}}\, 1_{(0,\infty)}(z)$$
as desired.
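A simulation sketch of the result: pushing two independent standard normals through $g$ and comparing a few moments of $(Z, W)$ with the product density above (moment checks only, not a proof):

```python
import math
import random

rng = random.Random(2)
n = 200_000
zs, ws = [], []
for _ in range(n):
    x, y = rng.gauss(0, 1), rng.gauss(0, 1)
    zs.append(math.hypot(x, y))
    ws.append(math.atan(x / y))  # y = 0 occurs with probability zero

print("E[W]   ~", round(sum(ws) / n, 4), "  (uniform(-pi/2, pi/2): 0)")
print("Var(W) ~", round(sum(w * w for w in ws) / n, 4),
      "  (exact: pi^2/12 =", round(math.pi**2 / 12, 4), ")")
print("E[Z]   ~", round(sum(zs) / n, 4),
      "  (Rayleigh: sqrt(pi/2) =", round(math.sqrt(math.pi / 2), 4), ")")
```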
12.12 Let $\mathcal{P}$ be the set of all permutations of $\{1, \dots, n\}$. For any $\pi \in \mathcal{P}$ let $X^\pi$ be the corresponding permutation of $X$, i.e. $X_k^\pi = X_{\pi_k}$. Observe that
$$P(X_1^\pi \le x_1, \dots, X_n^\pi \le x_n) = F(x_1) \cdots F(x_n),$$
hence the laws of $X^\pi$ and $X$ coincide on a system generating $\mathcal{B}^n$, therefore they are equal. Now let $\Omega_0 = \{(x_1, \dots, x_n) \in \mathbb{R}^n : x_1 < x_2 < \dots < x_n\}$. Since the $X_i$ are i.i.d. and have a continuous distribution, $P(\bigcup_{\pi \in \mathcal{P}} \{X^\pi \in \Omega_0\}) = 1$. Observe that
$$P\{Y_1 \le y_1, \dots, Y_n \le y_n\} = P\Big(\bigcup_{\pi \in \mathcal{P}} \big(\{X_1^\pi \le y_1, \dots, X_n^\pi \le y_n\} \cap \{X^\pi \in \Omega_0\}\big)\Big).$$
Note that the events $\{X_1^\pi \le y_1, \dots, X_n^\pi \le y_n\} \cap \{X^\pi \in \Omega_0\}$, $\pi \in \mathcal{P}$, are disjoint, hence
$$P\{Y_1 \le y_1, \dots, Y_n \le y_n\} = \sum_{\pi \in \mathcal{P}} P\big(\{X_1^\pi \le y_1, \dots, X_n^\pi \le y_n\} \cap \{X^\pi \in \Omega_0\}\big).$$
For $y_1 < \dots < y_n$ each of the $n!$ summands equals $P(X_1 \le y_1, \dots, X_n \le y_n,\, X \in \Omega_0)$, whose density on $\Omega_0$ is $f(y_1) \cdots f(y_n)$. Hence
$$f_Y(y_1, \dots, y_n) = \begin{cases} n!\, f(y_1) \cdots f(y_n) & y_1 < \dots < y_n \\ 0 & \text{otherwise.} \end{cases}$$
Solutions to selected problems of Chapter 14
14.7 $\varphi_X(u)$ is real valued iff $\varphi_X(u) = \overline{\varphi_X(u)} = \varphi_{-X}(u)$. By the uniqueness theorem, $\varphi_X(u) = \varphi_{-X}(u)$ iff $F_X = F_{-X}$. Hence $\varphi_X(u)$ is real valued iff $F_X = F_{-X}$.
14.9 We use induction. It is clear that the statement is true for $n = 1$. Put $Y_n = \sum_{i=1}^n X_i$ and assume that $E[(Y_n)^3] = \sum_{i=1}^n E[(X_i)^3]$. Note that this implies $\frac{d^3}{du^3}\varphi_{Y_n}(0) = -i\sum_{i=1}^n E[(X_i)^3]$. Now $E[(Y_{n+1})^3] = E[(X_{n+1} + Y_n)^3] = i\,\frac{d^3}{du^3}(\varphi_{X_{n+1}}\varphi_{Y_n})(0)$ by the independence of $X_{n+1}$ and $Y_n$. Note that
$$\frac{d^3}{du^3}(\varphi_{X_{n+1}}\varphi_{Y_n})(0) = \frac{d^3}{du^3}\varphi_{X_{n+1}}(0)\,\varphi_{Y_n}(0) + 3\,\frac{d^2}{du^2}\varphi_{X_{n+1}}(0)\,\frac{d}{du}\varphi_{Y_n}(0) + 3\,\frac{d}{du}\varphi_{X_{n+1}}(0)\,\frac{d^2}{du^2}\varphi_{Y_n}(0) + \varphi_{X_{n+1}}(0)\,\frac{d^3}{du^3}\varphi_{Y_n}(0)$$
$$= \frac{d^3}{du^3}\varphi_{X_{n+1}}(0) + \frac{d^3}{du^3}\varphi_{Y_n}(0) = -i\Big(E[(X_{n+1})^3] + \sum_{i=1}^n E[(X_i)^3]\Big),$$
where we used the fact that $\frac{d}{du}\varphi_{X_{n+1}}(0) = iE(X_{n+1}) = 0$ and $\frac{d}{du}\varphi_{Y_n}(0) = iE(Y_n) = 0$. So $E[(Y_{n+1})^3] = \sum_{i=1}^{n+1} E[(X_i)^3]$, hence the induction is complete.
14.10 It is clear that $0 \le \mu(A) \le 1$ since
$$0 \le \sum_{j=1}^n \lambda_j \mu_j(A) \le \sum_{j=1}^n \lambda_j = 1.$$
Also, for $A_i$ disjoint,
$$\mu\Big(\bigcup_{i=1}^\infty A_i\Big) = \sum_{j=1}^n \lambda_j\, \mu_j\Big(\bigcup_{i=1}^\infty A_i\Big) = \sum_{j=1}^n \sum_{i=1}^\infty \lambda_j\, \mu_j(A_i) = \sum_{i=1}^\infty \sum_{j=1}^n \lambda_j\, \mu_j(A_i) = \sum_{i=1}^\infty \mu(A_i).$$
Hence $\mu$ is countably additive, therefore it is a probability measure. Note that $\int 1_A(x)\, \mu(dx) = \sum_{j=1}^n \lambda_j \int 1_A(x)\, \mu_j(dx)$ by the definition of $\mu$. Now by linearity and the monotone convergence theorem, for a non-negative Borel function $f$ we have that $\int f(x)\, \mu(dx) = \sum_{j=1}^n \lambda_j \int f(x)\, \mu_j(dx)$. Extending this to integrable $f$, we have that
$$\hat{\mu}(u) = \int e^{iux}\, \mu(dx) = \sum_{j=1}^n \lambda_j \int e^{iux}\, \mu_j(dx) = \sum_{j=1}^n \lambda_j\, \hat{\mu}_j(u).$$
14.11 Let $\mu$ be the double exponential distribution, $\mu_1$ be the distribution of $Y$ and $\mu_2$ be the distribution of $-Y$, where $Y$ is an exponential r.v. with parameter $\lambda = 1$. Then we have that
$$\mu(A) = \frac{1}{2}\int_{A \cap (0,\infty)} e^{-x}\, dx + \frac{1}{2}\int_{A \cap (-\infty,0)} e^{x}\, dx = \frac{1}{2}\mu_1(A) + \frac{1}{2}\mu_2(A).$$
By the previous exercise we have that
$$\hat{\mu}(u) = \frac{1}{2}\hat{\mu}_1(u) + \frac{1}{2}\hat{\mu}_2(u) = \frac{1}{2}\Big(\frac{1}{1 - iu} + \frac{1}{1 + iu}\Big) = \frac{1}{1 + u^2}.$$
14.15 Note that $E\{X^n\} = (-i)^n \frac{d^n}{du^n}\varphi_X(0)$. Since $X \sim N(0, 1)$, $\varphi_X(s) = e^{-s^2/2}$. Note that we can get the derivatives of any order of $e^{-s^2/2}$ at $0$ simply by taking the Taylor expansion of $e^x$:
$$e^{-s^2/2} = \sum_{n=0}^\infty \frac{(-s^2/2)^n}{n!} = \sum_{n=0}^\infty i^{2n}\, \frac{(2n)!}{2^n n!}\, \frac{s^{2n}}{(2n)!},$$
hence $E\{X^n\} = (-i)^n \frac{d^n}{du^n}\varphi_X(0) = 0$ for $n$ odd. For $n = 2k$,
$$E\{X^{2k}\} = (-i)^{2k}\, \frac{d^{2k}}{du^{2k}}\varphi_X(0) = (-i)^{2k} (i)^{2k}\, \frac{(2k)!}{2^k k!} = \frac{(2k)!}{2^k k!}$$
as desired.
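The even moments $(2k)!/(2^k k!)$ are the double factorials $1, 3, 15, \dots$; a short comparison with sample moments (illustrative only):

```python
import random
from math import factorial

rng = random.Random(4)
xs = [rng.gauss(0, 1) for _ in range(500_000)]

for n in range(1, 7):
    emp = sum(x**n for x in xs) / len(xs)
    exact = 0 if n % 2 else factorial(n) // (2**(n // 2) * factorial(n // 2))
    print(f"E[X^{n}]:  empirical ~ {emp:8.4f}   exact = {exact}")
```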
Solutions to selected problems of Chapter 15
15.1 a. $E\{\overline{X}\} = \frac{1}{n}\sum_{i=1}^n E\{X_i\} = \mu$.
b. Since $X_1, \dots, X_n$ are independent, $\mathrm{Var}(\overline{X}) = \frac{1}{n^2}\sum_{i=1}^n \mathrm{Var}\{X_i\} = \frac{\sigma^2}{n}$.
c. Note that $S^2 = \frac{1}{n}\sum_{i=1}^n (X_i)^2 - \overline{X}^2$. Hence
$$E(S^2) = \frac{1}{n}\sum_{i=1}^n (\sigma^2 + \mu^2) - \Big(\frac{\sigma^2}{n} + \mu^2\Big) = \frac{n-1}{n}\,\sigma^2.$$
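A simulation of the bias factor $(n-1)/n$ in part c (the sample size and the standard normal population are illustrative choices):

```python
import random

rng = random.Random(5)
n, reps, sigma2 = 5, 100_000, 1.0

acc = 0.0
for _ in range(reps):
    xs = [rng.gauss(0, 1) for _ in range(n)]
    xbar = sum(xs) / n
    acc += sum((x - xbar)**2 for x in xs) / n   # the biased S^2
print("E[S^2] ~", round(acc / reps, 4), "  exact:", (n - 1) / n * sigma2)
```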
15.17 Note that $\varphi_Y(u) = \prod_{i=1}^\alpha \varphi_{X_i}(u) = \big(\frac{\beta}{\beta - iu}\big)^\alpha$, which is the characteristic function of a Gamma$(\alpha, \beta)$ random variable. Hence by the uniqueness of the characteristic function, $Y$ is Gamma$(\alpha, \beta)$.
Solutions to selected problems of Chapter 16
16.3 $P(\{Y \le y\}) = P(\{X \le y\} \cap \{Z = 1\}) + P(\{-X \le y\} \cap \{Z = -1\}) = \frac{1}{2}\Phi(y) + \frac{1}{2}\Phi(y) = \Phi(y)$, since $Z$ and $X$ are independent and the distribution of $X$ is symmetric. So $Y$ is normal. Note that $P(X + Y = 0) = \frac{1}{2}$, hence $X + Y$ cannot be normal. So $(X, Y)$ is not Gaussian even though both $X$ and $Y$ are normal.
16.4 Observe that
$$Q = \begin{pmatrix} \sigma_X^2 & \rho\,\sigma_X \sigma_Y \\ \rho\,\sigma_X \sigma_Y & \sigma_Y^2 \end{pmatrix}.$$
So $\det(Q) = \sigma_X^2 \sigma_Y^2 (1 - \rho^2)$, and $\det(Q) = 0$ iff $\rho = \pm 1$. By Corollary 16.2 the joint density of $(X, Y)$ exists iff $-1 < \rho < 1$. (By Cauchy-Schwarz we know that $-1 \le \rho \le 1$.) Note that
$$Q^{-1} = \frac{1}{\sigma_X^2 \sigma_Y^2 (1 - \rho^2)} \begin{pmatrix} \sigma_Y^2 & -\rho\,\sigma_X \sigma_Y \\ -\rho\,\sigma_X \sigma_Y & \sigma_X^2 \end{pmatrix}.$$
Substituting this in formula 16.5 we get that
$$f_{(X,Y)}(x, y) = \frac{1}{2\pi \sigma_X \sigma_Y \sqrt{1 - \rho^2}} \exp\Bigg(-\frac{1}{2(1 - \rho^2)}\Bigg[\Big(\frac{x - \mu_X}{\sigma_X}\Big)^2 - \frac{2\rho (x - \mu_X)(y - \mu_Y)}{\sigma_X \sigma_Y} + \Big(\frac{y - \mu_Y}{\sigma_Y}\Big)^2\Bigg]\Bigg).$$
16.6 By Theorem 16.2 there exists a multivariate normal r.v. $Y$ with $E(Y) = 0$ and a diagonal covariance matrix $\Lambda$ s.t. $X - \mu = AY$, where $A$ is an orthogonal matrix. Since $Q = A \Lambda A^*$ and $\det(Q) > 0$, the diagonal entries of $\Lambda$ are strictly positive, hence we can define $B = \Lambda^{-1/2} A^*$. Now the covariance matrix $\tilde{Q}$ of $B(X - \mu)$ is given by
$$\tilde{Q} = \Lambda^{-1/2} A^* A \Lambda A^* A \Lambda^{-1/2} = I.$$
So $B(X - \mu)$ is standard normal.
16.17 We know, as in Exercise 16.6, that if $B = \Lambda^{-1/2} A^*$ where $A$ is the orthogonal matrix s.t. $Q = A \Lambda A^*$, then $B(X - \mu)$ is standard normal. Note that this gives
$$(X - \mu)^* Q^{-1} (X - \mu) = (X - \mu)^* B^* B (X - \mu),$$
which has the chi-square distribution with $n$ degrees of freedom.


Solutions to selected problems of Chapter 17
17.1 Let $n(m)$ and $j(m)$ be such that $Y_m = n(m)^{1/p} Z_{n(m), j(m)}$. This gives that $P(|Y_m| > 0) = \frac{1}{n(m)} \to 0$ as $m \to \infty$. So $Y_m$ converges to $0$ in probability. However $E[|Y_m|^p] = E[n(m) Z_{n(m), j(m)}] = 1$ for all $m$. So $Y_m$ does not converge to $0$ in $L^p$.
17.2 Let $X_n = 1/n$. It is clear that $X_n$ converges to $0$ in probability. If $f(x) = 1_{\{0\}}(x)$ then we have that $P(|f(X_n) - f(0)| > \epsilon) = 1$ for every $\epsilon < 1$, so $f(X_n)$ does not converge to $f(0)$ in probability.
17.3 First observe that $E(S_n) = \sum_{i=1}^n E(X_i) = 0$ and that $\mathrm{Var}(S_n) = \sum_{i=1}^n \mathrm{Var}(X_i) = n$, since $E(X_i) = 0$ and $\mathrm{Var}(X_i) = E(X_i^2) = 1$. By Chebyshev's inequality,
$$P\Big(\Big|\frac{S_n}{n}\Big| \ge \epsilon\Big) = P(|S_n| \ge n\epsilon) \le \frac{\mathrm{Var}(S_n)}{n^2 \epsilon^2} = \frac{n}{n^2 \epsilon^2} \to 0 \quad \text{as } n \to \infty.$$
Hence $\frac{S_n}{n}$ converges to $0$ in probability.
17.4 Note that Chebyshev's inequality gives $P(|\frac{S_{n^2}}{n^2}| \ge \epsilon) \le \frac{1}{n^2 \epsilon^2}$. Since $\sum_{n=1}^\infty \frac{1}{n^2 \epsilon^2} < \infty$, by the Borel-Cantelli theorem $P(\limsup_n \{|\frac{S_{n^2}}{n^2}| \ge \epsilon\}) = 0$. Let
$$\Omega_0 = \Big(\bigcup_{m=1}^\infty \limsup_n \Big\{\Big|\frac{S_{n^2}}{n^2}\Big| \ge \frac{1}{m}\Big\}\Big)^c.$$
Then $P(\Omega_0) = 1$. Now let's pick $w \in \Omega_0$. For any $\epsilon$ there exists $m$ s.t. $\frac{1}{m} \le \epsilon$ and $w \in (\limsup_n \{|\frac{S_{n^2}}{n^2}| \ge \frac{1}{m}\})^c$. Hence there are finitely many $n$ s.t. $|\frac{S_{n^2}}{n^2}| \ge \frac{1}{m}$, which implies that there exists $N(w)$ s.t. $|\frac{S_{n^2}}{n^2}| < \frac{1}{m} \le \epsilon$ for every $n \ge N(w)$. Hence $\frac{S_{n^2}(w)}{n^2} \to 0$. Since $P(\Omega_0) = 1$, we have almost sure convergence.
17.12 $Y < \infty$ a.s., which follows from Exercise 17.11 since $X_n < \infty$ and $X < \infty$ a.s. Let $Z = \frac{1}{c}\,\frac{1}{1+Y}$. Observe that $Z > 0$ a.s. and $E_P(Z) = 1$. Therefore, as in Exercise 9.8, $Q(A) = E_P(Z 1_A)$ defines a probability measure and $E_Q(|X_n - X|) = E_P(Z|X_n - X|)$. Note that $Z|X_n - X|$ is bounded a.s. and $X_n \to X$ a.s. by hypothesis, hence by the dominated convergence theorem $E_Q(|X_n - X|) = E_P(Z|X_n - X|) \to 0$, i.e. $X_n$ tends to $X$ in $L^1$ with respect to $Q$.
17.14 First observe that $|E(X_n^2) - E(X^2)| \le E(|X_n^2 - X^2|)$. Since $|X_n^2 - X^2| \le (X_n - X)^2 + 2|X||X_n - X|$, we get that $|E(X_n^2) - E(X^2)| \le E((X_n - X)^2) + 2E(|X||X_n - X|)$. Note that the first term goes to $0$ since $X_n$ tends to $X$ in $L^2$. Applying the Cauchy-Schwarz inequality to the second term we get $E(|X||X_n - X|) \le \sqrt{E(X^2) E(|X_n - X|^2)}$, hence the second term also goes to $0$ as $n \to \infty$. Now we can conclude that $E(X_n^2) \to E(X^2)$.
17.15 For any $\epsilon > 0$, $P(\{|X| \le c + \epsilon\}) \ge P(\{|X_n| \le c,\ |X_n - X| \le \epsilon\}) \to 1$ as $n \to \infty$. Hence $P(\{|X| \le c + \epsilon\}) = 1$. Since $\{|X| \le c\} = \bigcap_{m=1}^\infty \{|X| \le c + \frac{1}{m}\}$, we get that $P\{|X| \le c\} = 1$. Now we have that
$$E(|X_n - X|) = E\big(|X_n - X| 1_{\{|X_n - X| \le \epsilon\}}\big) + E\big(|X_n - X| 1_{\{|X_n - X| > \epsilon\}}\big) \le \epsilon + 2c\, P(\{|X_n - X| > \epsilon\}),$$
hence choosing $n$ large we can make $E(|X_n - X|)$ arbitrarily small, so $X_n$ tends to $X$ in $L^1$.
Solutions to selected problems of Chapter 18
18.8 Note that $\varphi_{Y_n}(u) = \prod_{i=1}^n \varphi_{X_i}(\frac{u}{n}) = \prod_{i=1}^n e^{-\frac{|u|}{n}} = e^{-|u|}$, hence $Y_n$ is also Cauchy with $\alpha = 0$ and $\beta = 1$, which is independent of $n$; hence trivially $Y_n$ converges in distribution to a Cauchy distributed r.v. with $\alpha = 0$ and $\beta = 1$. However $Y_n$ does not converge to any r.v. in probability. To see this, suppose there exists $Y$ s.t. $P(|Y_n - Y| > \epsilon) \to 0$. Note that $P(|Y_n - Y_m| > \epsilon) \le P(|Y_n - Y| > \frac{\epsilon}{2}) + P(|Y_m - Y| > \frac{\epsilon}{2})$. If we let $m = 2n$, then $|Y_n - Y_m| = \frac{1}{2}\big|\frac{1}{n}\sum_{i=1}^n X_i - \frac{1}{n}\sum_{i=n+1}^{2n} X_i\big|$, which is equal in distribution to $\frac{1}{2}|U - W|$ where $U$ and $W$ are independent Cauchy r.v.'s with $\alpha = 0$ and $\beta = 1$. Hence $P(|Y_n - Y_m| > \epsilon)$ does not depend on $n$ and does not converge to $0$ if we let $m = 2n$ and $n \to \infty$, which is a contradiction since we assumed the right hand side converges to $0$.
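A quick illustration that the sample means of Cauchy draws never settle down as $n$ grows (a sketch; the median of $|Y_n|$ stays of constant order, consistent with $Y_n$ being standard Cauchy for every $n$):

```python
import math
import random

rng = random.Random(6)

def cauchy():
    # Standard Cauchy via inverse transform: tan(pi * (U - 1/2)).
    return math.tan(math.pi * (rng.random() - 0.5))

for n in [10, 100, 1000, 10000]:
    abs_means = sorted(abs(sum(cauchy() for _ in range(n)) / n)
                       for _ in range(300))
    print(f"n={n:5d}:  median |Y_n| over 300 runs ~ {abs_means[150]:.3f}")
```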
18.16 Define $f_m$ as the following sequence of functions:
$$f_m(x) = \begin{cases} x^2 & |x| \le N - \frac{1}{m} \\ m\big(N - \frac{1}{m}\big)^2 (N - x) & N - \frac{1}{m} \le x \le N \\ m\big(N - \frac{1}{m}\big)^2 (N + x) & -N \le x \le -\big(N - \frac{1}{m}\big) \\ 0 & \text{otherwise.} \end{cases}$$
Note that each $f_m$ is continuous and bounded. Also $f_m(x) \uparrow 1_{(-N,N)}(x)\, x^2$ for every $x \in \mathbb{R}$. Hence
$$\int_{-N}^N x^2\, F(dx) = \lim_m \int_{-\infty}^\infty f_m(x)\, F(dx)$$
by the monotone convergence theorem. Now
$$\int_{-\infty}^\infty f_m(x)\, F(dx) = \lim_n \int_{-\infty}^\infty f_m(x)\, F_n(dx)$$
by weak convergence. Since $\int_{-\infty}^\infty f_m(x)\, F_n(dx) \le \int_{-N}^N x^2\, F_n(dx)$, it follows that
$$\int_{-N}^N x^2\, F(dx) \le \lim_m \limsup_n \int_{-N}^N x^2\, F_n(dx) = \limsup_n \int_{-N}^N x^2\, F_n(dx)$$
as desired.
18.17 Following the hint, suppose there exists a continuity point $y$ of $F$ such that
$$\lim_n F_n(y) \ne F(y).$$
Then there exist $\epsilon > 0$ and a subsequence $(n_k)_{k \ge 1}$ s.t. $F_{n_k}(y) - F(y) < -\epsilon$ for all $k$, or $F_{n_k}(y) - F(y) > \epsilon$ for all $k$. Suppose $F_{n_k}(y) - F(y) < -\epsilon$ for all $k$, and observe that for $x \le y$,
$$F_{n_k}(x) - F(x) \le F_{n_k}(y) - F(x) = F_{n_k}(y) - F(y) + (F(y) - F(x)) < -\epsilon + (F(y) - F(x)).$$
Since $F$ is continuous at $y$ there exists an interval $[y_1, y)$ on which $|F(y) - F(x)| < \frac{\epsilon}{2}$, hence $F_{n_k}(x) - F(x) < -\frac{\epsilon}{2}$ for all $x \in [y_1, y)$. Now suppose $F_{n_k}(y) - F(y) > \epsilon$; then for $x \ge y$,
$$F_{n_k}(x) - F(x) \ge F_{n_k}(y) - F(x) = F_{n_k}(y) - F(y) + (F(y) - F(x)) > \epsilon + (F(y) - F(x)).$$
Now we can find an interval $(y, y_1]$ on which $|F(y) - F(x)| < \frac{\epsilon}{2}$, which gives $F_{n_k}(x) - F(x) > \frac{\epsilon}{2}$ for all $x \in (y, y_1]$. Note that both cases would yield
$$\int_{-\infty}^\infty |F_{n_k}(x) - F(x)|^r\, dx > |y_1 - y| \Big(\frac{\epsilon}{2}\Big)^r,$$
which is a contradiction to the assumption that
$$\lim_n \int_{-\infty}^\infty |F_n(x) - F(x)|^r\, dx = 0.$$
Therefore $X_n$ converges to $X$ in distribution.
Solutions to selected problems of Chapter 19
19.1 Note that $\varphi_{X_n}(u) = e^{iu\mu_n - \frac{u^2 \sigma_n^2}{2}} \to e^{iu\mu - \frac{u^2 \sigma^2}{2}}$. By Lévy's continuity theorem it follows that $X_n \Rightarrow X$ where $X$ is $N(\mu, \sigma^2)$.
19.3 Note that $\varphi_{X_n + Y_n}(u) = \varphi_{X_n}(u)\,\varphi_{Y_n}(u) \to \varphi_X(u)\,\varphi_Y(u) = \varphi_{X+Y}(u)$. Therefore $X_n + Y_n \Rightarrow X + Y$.
Solutions to selected problems of Chapter 20
20.1 a. First observe that $E(S_n^2) = \sum_{i=1}^n \sum_{j=1}^n E(X_i X_j) = \sum_{i=1}^n E(X_i^2)$, since $E(X_i X_j) = 0$ for $i \ne j$. Now
$$P\Big(\frac{|S_n|}{n} \ge \epsilon\Big) \le \frac{E(S_n^2)}{\epsilon^2 n^2} = \frac{\sum_{i=1}^n E(X_i^2)}{\epsilon^2 n^2} \le \frac{c}{n \epsilon^2}$$
as desired.
b. From part a it is clear that $\frac{1}{n} S_n$ converges to $0$ in probability. Also $E((\frac{1}{n} S_n)^2) = \frac{1}{n^2}\sum_{i=1}^n E(X_i^2) \le \frac{c}{n} \to 0$ since $E(X_i^2) \le c$, so $\frac{1}{n} S_n$ converges to $0$ in $L^2$ as well.
20.5 Note that $Z_n \Rightarrow Z$ implies that $\varphi_{Z_n}(u) \to \varphi_Z(u)$ uniformly on compact subsets of $\mathbb{R}$ (see Remark 19.1). Fix $u$, $M > 0$ and $\epsilon > 0$. We can pick $N$ s.t. for every $n > N$: $|\frac{u}{\sqrt{n}}| < M$, $\sup_{x \in [-M, M]} |\varphi_{Z_n}(x) - \varphi_Z(x)| < \epsilon$ and $|\varphi_Z(\frac{u}{\sqrt{n}}) - \varphi_Z(0)| < \epsilon$. This gives us
$$\Big|\varphi_{Z_n}\Big(\frac{u}{\sqrt{n}}\Big) - \varphi_Z(0)\Big| \le \Big|\varphi_{Z_n}\Big(\frac{u}{\sqrt{n}}\Big) - \varphi_Z\Big(\frac{u}{\sqrt{n}}\Big)\Big| + \Big|\varphi_Z\Big(\frac{u}{\sqrt{n}}\Big) - \varphi_Z(0)\Big| \le 2\epsilon.$$
So $\varphi_{\frac{Z_n}{\sqrt{n}}}(u) = \varphi_{Z_n}(\frac{u}{\sqrt{n}})$ converges to $\varphi_Z(0) = 1$ for every $u$. Therefore $\frac{Z_n}{\sqrt{n}} \Rightarrow 0$ by the continuity theorem. We also have by the strong law of large numbers that $\frac{Z_n}{\sqrt{n}} \to E(X_j)$ a.s. This implies $E(X_j) = 0$, hence the assertion follows by the strong law of large numbers.