
Solutions of Selected Problems from Probability Essentials, Second Edition


Solutions to selected problems of Chapter 2
2.1 Let us first prove by induction that $\#(2^{\Omega_n}) = 2^n$ if $\Omega_n = \{x_1, \dots, x_n\}$. For $n = 1$ it is clear that $\#(2^{\Omega_1}) = \#(\{\emptyset, \{x_1\}\}) = 2$. Suppose $\#(2^{\Omega_{n-1}}) = 2^{n-1}$. Observe that $2^{\Omega_n} = \{\{x_n\} \cup A : A \in 2^{\Omega_{n-1}}\} \cup 2^{\Omega_{n-1}}$, hence $\#(2^{\Omega_n}) = 2\,\#(2^{\Omega_{n-1}}) = 2^n$. This proves finiteness. To show that $2^\Omega$ is a $\sigma$-algebra we check:
1. $\emptyset \subset \Omega$ hence $\emptyset \in 2^\Omega$.
2. If $A \in 2^\Omega$ then $A \subset \Omega$ and $A^c \subset \Omega$, hence $A^c \in 2^\Omega$.
3. Let $(A_n)_{n \ge 1}$ be a sequence of subsets of $\Omega$. Then $\bigcup_{n=1}^\infty A_n$ is also a subset of $\Omega$, hence in $2^\Omega$.
Therefore $2^\Omega$ is a $\sigma$-algebra.
2.2 We check that $\mathcal{H} = \bigcap_{\alpha \in A} \mathcal{G}_\alpha$ has the three properties of a $\sigma$-algebra:
1. $\Omega \in \mathcal{G}_\alpha$ for every $\alpha \in A$, hence $\Omega \in \bigcap_{\alpha \in A} \mathcal{G}_\alpha$.
2. If $B \in \bigcap_{\alpha \in A} \mathcal{G}_\alpha$ then $B \in \mathcal{G}_\alpha$ for every $\alpha \in A$. This implies that $B^c \in \mathcal{G}_\alpha$ for every $\alpha \in A$, since each $\mathcal{G}_\alpha$ is a $\sigma$-algebra. So $B^c \in \bigcap_{\alpha \in A} \mathcal{G}_\alpha$.
3. Let $(A_n)_{n \ge 1}$ be a sequence in $\mathcal{H}$. Since each $A_n \in \mathcal{G}_\alpha$, $\bigcup_{n=1}^\infty A_n$ is in $\mathcal{G}_\alpha$ since $\mathcal{G}_\alpha$ is a $\sigma$-algebra, for each $\alpha \in A$. Hence $\bigcup_{n=1}^\infty A_n \in \bigcap_{\alpha \in A} \mathcal{G}_\alpha$.
Therefore $\mathcal{H} = \bigcap_{\alpha \in A} \mathcal{G}_\alpha$ is a $\sigma$-algebra.
2.3 a. Let $x \in (\bigcup_{n=1}^\infty A_n)^c$. Then $x \in A_n^c$ for all $n$, hence $x \in \bigcap_{n=1}^\infty A_n^c$. So $(\bigcup_{n=1}^\infty A_n)^c \subset \bigcap_{n=1}^\infty A_n^c$. Similarly, if $x \in \bigcap_{n=1}^\infty A_n^c$ then $x \in A_n^c$ for every $n$, hence $x \in (\bigcup_{n=1}^\infty A_n)^c$. So
$$\Big(\bigcup_{n=1}^\infty A_n\Big)^c = \bigcap_{n=1}^\infty A_n^c.$$
b. By part a, $\bigcap_{n=1}^\infty A_n = (\bigcup_{n=1}^\infty A_n^c)^c$, hence $(\bigcap_{n=1}^\infty A_n)^c = \bigcup_{n=1}^\infty A_n^c$.
2.4 $\liminf_n A_n = \bigcup_{n=1}^\infty B_n$ where $B_n = \bigcap_{m \ge n} A_m \in \mathcal{A}$ for every $n$, since $\mathcal{A}$ is closed under taking countable intersections. Therefore $\liminf_n A_n \in \mathcal{A}$, since $\mathcal{A}$ is closed under taking countable unions.
By De Morgan's law it is easy to see that $\limsup_n A_n = (\liminf_n A_n^c)^c$, hence $\limsup_n A_n \in \mathcal{A}$, since $\liminf_n A_n^c \in \mathcal{A}$ and $\mathcal{A}$ is closed under taking complements.
Note that $x \in \liminf_n A_n$ implies that there exists $n^*$ s.t. $x \in \bigcap_{m \ge n^*} A_m$, which implies $x \in \bigcup_{m \ge n} A_m$ for every $n$, i.e. $x \in \limsup_n A_n$. Therefore $\liminf_n A_n \subset \limsup_n A_n$.
2.8 Let $\mathcal{L} = \{B \subset \mathbb{R} : f^{-1}(B) \in \mathcal{B}\}$. It is easy to check that $\mathcal{L}$ is a $\sigma$-algebra. Since $f$ is continuous, $f^{-1}(B)$ is open (hence Borel) if $B$ is open. Therefore $\mathcal{L}$ contains the open sets, which implies $\mathcal{L} \supset \mathcal{B}$ since $\mathcal{B}$ is generated by the open sets of $\mathbb{R}$. This proves that $f^{-1}(B) \in \mathcal{B}$ if $B \in \mathcal{B}$, and that $\mathcal{A} = \{A \subset \mathbb{R} : \exists B \in \mathcal{B} \text{ with } A = f^{-1}(B)\} \subset \mathcal{B}$.
Solutions to selected problems of Chapter 3
3.7 a. Since $P(B) > 0$, $P(\cdot|B)$ defines a probability measure on $\mathcal{A}$; therefore by Theorem 2.4, $\lim_n P(A_n|B) = P(A|B)$.
b. We have that $A \cap B_n \uparrow A \cap B$, since $1_{A \cap B_n}(w) = 1_A(w)1_{B_n}(w) \uparrow 1_A(w)1_B(w)$. Hence $P(A \cap B_n) \uparrow P(A \cap B)$. Also $P(B_n) \uparrow P(B)$. Hence
$$P(A|B_n) = \frac{P(A \cap B_n)}{P(B_n)} \to \frac{P(A \cap B)}{P(B)} = P(A|B).$$
c.
$$P(A_n|B_n) = \frac{P(A_n \cap B_n)}{P(B_n)} \to \frac{P(A \cap B)}{P(B)} = P(A|B)$$
since $A_n \cap B_n \to A \cap B$ and $B_n \to B$.
3.11 Let $B = \{x_1, x_2, \dots, x_b\}$ and $R = \{y_1, y_2, \dots, y_r\}$ be the sets of $b$ blue balls and $r$ red balls respectively. Let $B' = \{x_{b+1}, x_{b+2}, \dots, x_{b+d}\}$ and $R' = \{y_{r+1}, y_{r+2}, \dots, y_{r+d}\}$ be the sets of $d$ new blue balls and $d$ new red balls respectively. Then we can write down the sample space as
$$\Omega = \{(a, b) : (a \in B \text{ and } b \in B \cup B' \cup R) \text{ or } (a \in R \text{ and } b \in R \cup R' \cup B)\}.$$
Clearly $\mathrm{card}(\Omega) = b(b+d+r) + r(b+d+r) = (b+r)(b+d+r)$. Now we can define a probability measure $P$ on $2^\Omega$ by
$$P(A) = \frac{\mathrm{card}(A)}{\mathrm{card}(\Omega)}.$$
a. Let
$$A = \{\text{second ball drawn is blue}\} = \{(a, b) : a \in B,\ b \in B \cup B'\} \cup \{(a, b) : a \in R,\ b \in B\}.$$
Then $\mathrm{card}(A) = b(b+d) + rb = b(b+d+r)$, hence $P(A) = \frac{b}{b+r}$.
b. Let
$$B = \{\text{first ball drawn is blue}\} = \{(a, b) : a \in B\}.$$
Observe $A \cap B = \{(a, b) : a \in B,\ b \in B \cup B'\}$ and $\mathrm{card}(A \cap B) = b(b+d)$. Hence
$$P(B|A) = \frac{P(A \cap B)}{P(A)} = \frac{\mathrm{card}(A \cap B)}{\mathrm{card}(A)} = \frac{b+d}{b+d+r}.$$
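A quick Monte Carlo check of both answers (a sketch, not part of the original argument; the values of $b$, $r$, $d$ below are arbitrary choices):

```python
import random

def polya_two_draws(b, r, d, trials=200_000, seed=0):
    """Simulate two draws from Polya's urn: after the first draw, the ball
    is returned together with d extra balls of the same color."""
    rng = random.Random(seed)
    second_blue = 0
    both_blue = 0
    for _ in range(trials):
        first_blue = rng.random() < b / (b + r)
        blues_now = b + d if first_blue else b
        if rng.random() < blues_now / (b + r + d):
            second_blue += 1
            both_blue += first_blue
    print("P(second blue)              ~", second_blue / trials,
          "  exact:", b / (b + r))
    print("P(first blue | second blue) ~", both_blue / second_blue,
          "  exact:", (b + d) / (b + d + r))

polya_two_draws(b=3, r=5, d=2)
```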
3.17 We will use the inequality $1 - x \le e^{-x}$ for $x > 0$, which is obtained from the Taylor expansion of $e^{-x}$ around $0$. By independence,
$$P((A_1 \cup \dots \cup A_n)^c) = P(A_1^c \cap \dots \cap A_n^c) = (1 - P(A_1)) \cdots (1 - P(A_n)) \le \exp(-P(A_1)) \cdots \exp(-P(A_n)) = \exp\Big(-\sum_{i=1}^n P(A_i)\Big).$$
Solutions to selected problems of Chapter 4
4.1 Observe that
$$P(k \text{ successes}) = \binom{n}{k} \Big(\frac{\lambda}{n}\Big)^k \Big(1 - \frac{\lambda}{n}\Big)^{n-k} = C\, a_n\, b_{1,n} \cdots b_{k,n}\, d_n$$
where
$$C = \frac{\lambda^k}{k!}, \qquad a_n = \Big(1 - \frac{\lambda}{n}\Big)^n, \qquad b_{j,n} = \frac{n - j + 1}{n}, \qquad d_n = \Big(1 - \frac{\lambda}{n}\Big)^{-k}.$$
It is clear that $b_{j,n} \to 1$ for every $j$ and $d_n \to 1$ as $n \to \infty$. Observe that
$$\log\Big(\Big(1 - \frac{\lambda}{n}\Big)^n\Big) = n\Big(-\frac{\lambda}{n} - \frac{\lambda^2}{n^2}\,\frac{1}{2\xi^2}\Big) \quad \text{for some } \xi \in \Big(1 - \frac{\lambda}{n},\, 1\Big)$$
by the Taylor series expansion of $\log(x)$ around $1$. It follows that $a_n \to e^{-\lambda}$ as $n \to \infty$ and that
$$|\text{Error}| = \big|e^{n \log(1 - \lambda/n)} - e^{-\lambda}\big| \le \Big|n \log\Big(1 - \frac{\lambda}{n}\Big) + \lambda\Big| = \frac{\lambda^2}{n}\,\frac{1}{2\xi^2},$$
which is of order $\lambda p$. Hence in order to have a good approximation we need $n$ large and $p$ small, as well as $\lambda$ to be of moderate size.
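The error estimate says the Poisson approximation improves as $n$ grows and $p$ shrinks with $\lambda = np$ moderate. A small numerical illustration (the parameter choices are ours, keeping $\lambda = 5$ fixed):

```python
from math import comb, exp, factorial

def max_pmf_gap(n, p):
    """Largest pointwise gap between Binomial(n, p) and Poisson(n*p) pmfs."""
    lam = n * p
    return max(abs(comb(n, k) * p**k * (1 - p)**(n - k)
                   - exp(-lam) * lam**k / factorial(k))
               for k in range(n + 1))

for n, p in [(10, 0.5), (100, 0.05), (1000, 0.005)]:
    print(f"n={n:4d}  p={p:<6}  max gap = {max_pmf_gap(n, p):.2e}")
```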
Solutions to selected problems of Chapter 5
5.7 We put $x_n = P(X \text{ is even})$ for $X \sim B(p, n)$. Let us prove by induction that $x_n = \frac{1}{2}(1 + (1-2p)^n)$. For $n = 1$, $x_1 = 1 - p = \frac{1}{2}(1 + (1-2p)^1)$. Assume the formula is true for $n-1$. If we condition on the outcome of the first trial we can write
$$x_n = p(1 - x_{n-1}) + (1-p)x_{n-1} = p\Big(1 - \frac{1}{2}\big(1 + (1-2p)^{n-1}\big)\Big) + (1-p)\Big(\frac{1}{2}\big(1 + (1-2p)^{n-1}\big)\Big) = \frac{1}{2}\big(1 + (1-2p)^n\big),$$
hence we have the result.
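The closed form can be checked against the exact binomial sum (a sketch; the parameter values are arbitrary):

```python
from math import comb

def p_even_exact(n, p):
    """P(X even) for X ~ Binomial(n, p), summed directly."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(0, n + 1, 2))

def p_even_formula(n, p):
    return 0.5 * (1 + (1 - 2 * p)**n)

for n, p in [(1, 0.3), (7, 0.25), (20, 0.9)]:
    assert abs(p_even_exact(n, p) - p_even_formula(n, p)) < 1e-12
print("formula matches the direct sum")
```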
5.11 Observe that $E(|X - \lambda|) = \sum_{i < \lambda} (\lambda - i) p_i + \sum_{i \ge \lambda} (i - \lambda) p_i$. Since
$$\sum_{i \ge \lambda} (i - \lambda) p_i = \sum_{i=0}^\infty (i - \lambda) p_i - \sum_{i < \lambda} (i - \lambda) p_i = \sum_{i < \lambda} (\lambda - i) p_i$$
(using $\sum_{i=0}^\infty (i - \lambda) p_i = E(X) - \lambda = 0$), we have that $E(|X - \lambda|) = 2\sum_{i < \lambda} (\lambda - i) p_i$. So
$$E(|X - \lambda|) = 2\sum_{i < \lambda} (\lambda - i) p_i = 2\sum_{k=0}^{\lambda-1} (\lambda - k)\, \frac{e^{-\lambda} \lambda^k}{k!} = 2 e^{-\lambda} \sum_{k=0}^{\lambda-1} \Big(\frac{\lambda^{k+1}}{k!} - \frac{\lambda^k}{(k-1)!}\Big) = 2 e^{-\lambda}\, \frac{\lambda^\lambda}{(\lambda - 1)!}.$$
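A direct numerical check of the closed form $2e^{-\lambda}\lambda^\lambda/(\lambda-1)!$ for integer $\lambda$ (a sketch; the truncation point is chosen so the neglected tail is negligible):

```python
from math import exp, factorial

def mad_poisson(lam, cutoff=500):
    """E|X - lam| for X ~ Poisson(lam), by direct summation of the pmf."""
    pmf = exp(-lam)      # P(X = 0)
    total = lam * pmf    # |0 - lam| * P(X = 0)
    for k in range(1, cutoff):
        pmf *= lam / k   # P(X = k) from P(X = k - 1)
        total += abs(k - lam) * pmf
    return total

for lam in [1, 2, 5, 10]:
    closed = 2 * exp(-lam) * lam**lam / factorial(lam - 1)
    print(lam, round(mad_poisson(lam), 10), round(closed, 10))
```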
Solutions to selected problems of Chapter 7
7.1 Suppose $\lim_n P(A_n) \ne 0$. Then there exists $\epsilon > 0$ such that there are distinct $A_{n_1}, A_{n_2}, \dots$ with $P(A_{n_k}) > \epsilon$ for every $k \ge 1$. This gives $\sum_{k=1}^\infty P(A_{n_k}) = \infty$, which is a contradiction since, by the hypothesis that the $A_n$ are disjoint, we have that
$$\sum_{k=1}^\infty P(A_{n_k}) = P\Big(\bigcup_{k=1}^\infty A_{n_k}\Big) \le 1.$$
7.2 Let $\mathcal{A}_n = \{A_\alpha : P(A_\alpha) > 1/n\}$. $\mathcal{A}_n$ is a finite set, otherwise we could pick disjoint $A_{\alpha_1}, A_{\alpha_2}, \dots$ in $\mathcal{A}_n$. This would give us $P(\bigcup_{m=1}^\infty A_{\alpha_m}) = \sum_{m=1}^\infty P(A_{\alpha_m}) = \infty$, which is a contradiction. Now $\{A_\alpha : \alpha \in B\} = \bigcup_{n=1}^\infty \mathcal{A}_n$, hence $(A_\alpha)_{\alpha \in B}$ is countable, since it is a countable union of finite sets.
7.11 Note that $\{x_0\} = \bigcap_{n=1}^\infty [x_0 - 1/n, x_0]$, therefore $\{x_0\}$ is a Borel set, and $P(\{x_0\}) = \lim_n P([x_0 - 1/n, x_0])$. Assuming that $f$ is continuous, $f$ is bounded by some $M$ on the interval $[x_0 - 1, x_0]$, hence $P(\{x_0\}) \le \lim_n M(1/n) = 0$.
Remark: For this result to be true we don't need $f$ to be continuous. When we define the Lebesgue integral (or more generally the integral with respect to a measure) and study its properties, we will see that this result is true for all Borel measurable non-negative $f$.
7.16 First observe that $F(x) - F(x-) > 0$ iff $P(\{x\}) > 0$. The family of events $\{\{x\} : P(\{x\}) > 0\}$ can be at most countable, as we have proven in Problem 7.2, since these events are disjoint and have positive probability. Hence $F$ can have at most countably many discontinuities. For an example with infinitely many jump discontinuities consider the Poisson distribution.
7.18 Let $F$ be as given. It is clear that $F$ is a nondecreasing function. For $x < 0$ and $x \ge 1$ the right continuity of $F$ is clear. For any $0 < x < 1$ let $i^*$ be such that $\frac{1}{i^*+1} \le x < \frac{1}{i^*}$. If $x_n \downarrow x$ then there exists $N$ such that $\frac{1}{i^*+1} \le x_n < \frac{1}{i^*}$ for every $n \ge N$. Hence $F(x_n) = F(x)$ for every $n \ge N$, which implies that $F$ is right continuous at $x$. For $x = 0$ we have that $F(0) = 0$. Note that for any $\epsilon$ there exists $N$ such that $\sum_{i=N}^\infty \frac{1}{2^i} < \epsilon$. So for all $x$ s.t. $|x| \le \frac{1}{N}$ we have that $F(x) \le \epsilon$. Hence $F(0+) = 0$. This proves the right continuity of $F$ for all $x$. We also have that $F(\infty) = \sum_{i=1}^\infty \frac{1}{2^i} = 1$ and $F(-\infty) = 0$, so $F$ is a distribution function of a probability on $\mathbb{R}$.
a. $P([1, \infty)) = F(\infty) - F(1-) = 1 - \sum_{n=2}^\infty \frac{1}{2^n} = 1 - \frac{1}{2} = \frac{1}{2}$.
b. $P([\frac{1}{10}, \infty)) = F(\infty) - F(\frac{1}{10}-) = 1 - \sum_{n=11}^\infty \frac{1}{2^n} = 1 - 2^{-10}$.
c. $P(\{0\}) = F(0) - F(0-) = 0$.
d. $P([0, \frac{1}{2})) = F(\frac{1}{2}-) - F(0-) = \sum_{n=3}^\infty \frac{1}{2^n} - 0 = \frac{1}{4}$.
e. $P((-\infty, 0)) = F(0-) = 0$.
f. $P((0, \infty)) = 1 - F(0) = 1$.
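A short numerical sketch of parts a, b and d, assuming (as the computations above use) that $F$ jumps by $2^{-i}$ at each point $1/i$:

```python
def F(x, terms=60):
    """F(x) = sum of 2**-i over all i >= 1 with 1/i <= x (truncated)."""
    if x <= 0:
        return 0.0
    return sum(2.0**-i for i in range(1, terms + 1) if 1.0 / i <= x)

eps = 1e-12  # evaluate just below a point to get the left limit F(x-)
print("P([1, oo))    =", 1 - F(1 - eps))       # 1/2
print("P([1/10, oo)) =", 1 - F(1/10 - eps))    # 1 - 2**-10
print("P([0, 1/2))   =", F(1/2 - eps))         # 1/4
```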
Solutions to selected problems of Chapter 9
9.1 It is clear by the definition of $\mathcal{F}$ that $X^{-1}(B) \in \mathcal{F}$ for every $B \in \mathcal{B}$. So $X$ is measurable from $(\Omega, \mathcal{F})$ to $(\mathbb{R}, \mathcal{B})$.
9.2 Since $X$ is both $\mathcal{F}$ and $\mathcal{G}$ measurable, for any $B \in \mathcal{B}$ we have $P(X \in B) = P(X \in B)P(X \in B)$, hence $P(X \in B) = 0$ or $1$. Without loss of generality we can assume that there exists a closed interval $I$ such that $P(X \in I) = 1$. Let $\pi_n = \{t_0^n, \dots, t_{l_n}^n\}$ be a partition of $I$ such that $\pi_n \subset \pi_{n+1}$ and $\sup_k (t_k^n - t_{k-1}^n) \to 0$. For each $n$ there exists $k^*(n)$ such that $P(X \in [t_{k^*(n)}^n, t_{k^*(n)+1}^n]) = 1$ and $[t_{k^*(n+1)}^{n+1}, t_{k^*(n+1)+1}^{n+1}] \subset [t_{k^*(n)}^n, t_{k^*(n)+1}^n]$. Now $a_n = t_{k^*(n)}^n$ and $b_n = t_{k^*(n)+1}^n$ are both Cauchy sequences with a common limit $c$. So $1 = \lim_n P(X \in [t_{k^*(n)}^n, t_{k^*(n)+1}^n]) = P(X = c)$.
9.3 $X^{-1}(A) = \big(Y^{-1}(A) \cap (Y^{-1}(A) \cap X^{-1}(A)^c)^c\big) \cup \big(X^{-1}(A) \cap Y^{-1}(A)^c\big)$. Observe that both $Y^{-1}(A) \cap X^{-1}(A)^c$ and $X^{-1}(A) \cap Y^{-1}(A)^c$ are null sets and therefore measurable. Hence if $Y^{-1}(A) \in \mathcal{A}'$ then $X^{-1}(A) \in \mathcal{A}'$. In other words, if $Y$ is $\mathcal{A}'$ measurable so is $X$.
9.4 Since $X$ is integrable, for any $\epsilon > 0$ there exists $M$ such that $\int |X| 1_{\{|X| > M\}}\, dP < \epsilon$, by the dominated convergence theorem. Note that
$$E[X 1_{A_n}] = E[X 1_{A_n} 1_{\{|X| > M\}}] + E[X 1_{A_n} 1_{\{|X| \le M\}}] \le E[|X| 1_{\{|X| > M\}}] + M P(A_n).$$
Since $P(A_n) \to 0$, there exists $N$ such that $P(A_n) \le \frac{\epsilon}{M}$ for every $n \ge N$. Therefore $E[X 1_{A_n}] \le \epsilon + \epsilon = 2\epsilon$ for every $n \ge N$, i.e. $\lim_n E[X 1_{A_n}] = 0$.
9.5 It is clear that $0 \le Q(A) \le 1$ and $Q(\Omega) = 1$, since $X$ is nonnegative and $E[X] = 1$. Let $A_1, A_2, \dots$ be disjoint. Then
$$Q\Big(\bigcup_{n=1}^\infty A_n\Big) = E\big[X 1_{\bigcup_{n=1}^\infty A_n}\big] = E\Big[\sum_{n=1}^\infty X 1_{A_n}\Big] = \sum_{n=1}^\infty E[X 1_{A_n}],$$
where the last equality follows from the monotone convergence theorem. Hence $Q(\bigcup_{n=1}^\infty A_n) = \sum_{n=1}^\infty Q(A_n)$. Therefore $Q$ is a probability measure.
9.6 If $P(A) = 0$ then $X 1_A = 0$ a.s., hence $Q(A) = E[X 1_A] = 0$. Now assume $P$ is the uniform distribution on $[0, 1]$ and let $X(x) = 2 \cdot 1_{[0,1/2]}(x)$. The corresponding measure $Q$ assigns zero measure to $(1/2, 1]$, however $P((1/2, 1]) = 1/2 \ne 0$.
9.7 Let's prove this first for simple functions, i.e. let $Y$ be of the form
$$Y = \sum_{i=1}^n c_i 1_{A_i}$$
for disjoint $A_1, \dots, A_n$. Then
$$E_Q[Y] = \sum_{i=1}^n c_i Q(A_i) = \sum_{i=1}^n c_i E_P[X 1_{A_i}] = E_P[XY].$$
For non-negative $Y$ we take a sequence of simple functions $Y_n \uparrow Y$. Then
$$E_Q[Y] = \lim_n E_Q[Y_n] = \lim_n E_P[X Y_n] = E_P[XY],$$
where the last equality follows from the monotone convergence theorem. For general $Y \in L^1(Q)$ we have that $E_Q[Y] = E_Q[Y^+] - E_Q[Y^-] = E_P[(XY)^+] - E_P[(XY)^-] = E_P[XY]$.
9.8 a. Note that $\frac{1}{X} X = 1$ a.s., since $P(X > 0) = 1$. By Problem 9.7, $E_Q[\frac{1}{X}] = E_P[\frac{1}{X} X] = 1$. So $\frac{1}{X}$ is $Q$-integrable.
b. $R : \mathcal{A} \to \mathbb{R}$, $R(A) = E_Q[\frac{1}{X} 1_A]$, is a probability measure since $\frac{1}{X}$ is non-negative and $E_Q[\frac{1}{X}] = 1$. Also $R(A) = E_Q[\frac{1}{X} 1_A] = E_P[\frac{1}{X} X 1_A] = P(A)$. So $R = P$.
9.9 Since $P(A) = E_Q[\frac{1}{X} 1_A]$, we have that $Q(A) = 0 \Rightarrow P(A) = 0$. Now, combining the results of the previous problems, we can easily observe that $Q(A) = 0 \Leftrightarrow P(A) = 0$ iff $P(X > 0) = 1$.
9.17 Let
$$g(x) = \frac{((x - \mu)b + \sigma)^2}{\sigma^2 (1 + b^2)^2}.$$
Observe that $\{X \ge \mu + b\sigma\} \subset \{g(X) \ge 1\}$. So
$$P(\{X \ge \mu + b\sigma\}) \le P(\{g(X) \ge 1\}) \le \frac{E[g(X)]}{1},$$
where the last inequality follows from Markov's inequality. Since $E[g(X)] = \frac{\sigma^2(1 + b^2)}{\sigma^2(1 + b^2)^2}$, we get that
$$P(\{X \ge \mu + b\sigma\}) \le \frac{1}{1 + b^2}.$$
9.19
$$x P(\{X > x\}) \le E[X 1_{\{X > x\}}] = \int_x^\infty \frac{z}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}\, dz = \frac{e^{-\frac{x^2}{2}}}{\sqrt{2\pi}}.$$
Hence
$$P(\{X > x\}) \le \frac{e^{-\frac{x^2}{2}}}{x\sqrt{2\pi}}.$$
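The bound can be compared with the exact Gaussian tail, which is expressible through the complementary error function (a quick numerical sketch):

```python
from math import erfc, exp, pi, sqrt

def tail(x):
    """P(X > x) for X ~ N(0, 1)."""
    return 0.5 * erfc(x / sqrt(2))

def bound(x):
    return exp(-x * x / 2) / (x * sqrt(2 * pi))

for x in [0.5, 1.0, 2.0, 4.0]:
    print(f"x={x}:  P(X>x) = {tail(x):.6g}  <=  bound = {bound(x):.6g}")
```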
9.21 $h(t+s) = P(\{X > t+s\}) = P(\{X > t+s, X > s\}) = P(\{X > t+s\} \mid \{X > s\})\, P(\{X > s\}) = h(t)h(s)$ for all $t, s > 0$. Note that this gives $h(\frac{1}{n}) = h(1)^{\frac{1}{n}}$ and $h(\frac{m}{n}) = h(1)^{\frac{m}{n}}$. So for all rational $r$ we have that $h(r) = \exp(\log(h(1))\, r)$. Since $h$ is right continuous, this gives $h(x) = \exp(\log(h(1))\, x)$ for all $x > 0$. Hence $X$ has the exponential distribution with parameter $-\log h(1)$.
Solutions to selected problems of Chapter 10
10.5 Let $P$ be the uniform distribution on $[-1/2, 1/2]$. Let $X(x) = x\, 1_{[-1/4,1/4]}(x)$ and $Y(x) = x\, 1_{[-1/4,1/4]^c}(x)$. It is clear that $XY = 0$, hence $E[XY] = 0$. It is also true that $E[X] = 0$. So $E[XY] = E[X]E[Y]$, however it is clear that $X$ and $Y$ are not independent.
10.6 a. $P(\min(X, Y) > i) = P(X > i)P(Y > i) = \frac{1}{2^i}\frac{1}{2^i} = \frac{1}{4^i}$. So $P(\min(X, Y) \le i) = 1 - P(\min(X, Y) > i) = 1 - \frac{1}{4^i}$.
b. $P(X = Y) = \sum_{i=1}^\infty P(X = i)P(Y = i) = \sum_{i=1}^\infty \frac{1}{2^i}\frac{1}{2^i} = \frac{1}{1 - \frac{1}{4}} - 1 = \frac{1}{3}$.
c. $P(Y > X) = \sum_{i=1}^\infty P(Y > i)P(X = i) = \sum_{i=1}^\infty \frac{1}{2^i}\frac{1}{2^i} = \frac{1}{3}$.
d. $P(X \text{ divides } Y) = \sum_{i=1}^\infty \sum_{k=1}^\infty \frac{1}{2^i}\frac{1}{2^{ki}} = \sum_{i=1}^\infty \frac{1}{2^i}\,\frac{1}{2^i - 1}$.
e. $P(X \ge kY) = \sum_{i=1}^\infty P(X \ge ki)P(Y = i) = \sum_{i=1}^\infty \frac{1}{2^i}\frac{1}{2^{ki-1}} = \frac{2}{2^{k+1} - 1}$.
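A Monte Carlo check of parts b, c and e (a sketch; the sampler below realizes $P(X = i) = 2^{-i}$, $i \ge 1$, by counting fair-coin flips up to and including the first head):

```python
import random

rng = random.Random(1)

def geom():
    """Sample P(X = i) = 2**-i, i >= 1."""
    i = 1
    while rng.random() < 0.5:
        i += 1
    return i

trials = 400_000
pairs = [(geom(), geom()) for _ in range(trials)]
print("P(X = Y)   ~", sum(x == y for x, y in pairs) / trials, "  exact: 1/3")
print("P(Y > X)   ~", sum(y > x for x, y in pairs) / trials, "  exact: 1/3")
k = 2
print("P(X >= kY) ~", sum(x >= k * y for x, y in pairs) / trials,
      "  exact:", 2 / (2**(k + 1) - 1))
```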
Solutions to selected problems of Chapter 11
11.11 Since $P\{X > 0\} = 1$ we have that $P\{Y < 1\} = 1$. So $F_Y(y) = 1$ for $y \ge 1$. Also $P\{Y \le 0\} = 0$, hence $F_Y(y) = 0$ for $y \le 0$. For $0 < y < 1$, $P\{Y > y\} = P\{X < \frac{1-y}{y}\} = F_X(\frac{1-y}{y})$. So
$$F_Y(y) = 1 - \int_0^{\frac{1-y}{y}} f_X(x)\, dx = 1 - \int_y^1 \frac{1}{z^2}\, f_X\Big(\frac{1-z}{z}\Big)\, dz$$
by the change of variables $x = \frac{1-z}{z}$. Hence
$$f_Y(y) = \begin{cases} 0 & y \le 0 \\ \frac{1}{y^2}\, f_X\big(\frac{1-y}{y}\big) & 0 < y \le 1 \\ 0 & 1 \le y < \infty. \end{cases}$$
11.15 Let $G(u) = \inf\{x : F(x) \ge u\}$. We would like to show $\{u : G(u) > y\} = \{u : F(y) < u\}$. Let $u$ be such that $G(u) > y$. Then $F(y) < u$ by the definition of $G$. Hence $\{u : G(u) > y\} \subset \{u : F(y) < u\}$. Now let $u$ be such that $F(y) < u$. Then $y < x$ for any $x$ such that $F(x) \ge u$, by the monotonicity of $F$. Now by the right continuity and the monotonicity of $F$ we have that $F(G(u)) = \inf_{F(x) \ge u} F(x) \ge u$. Then by the previous statement $y < G(u)$. So $\{u : G(u) > y\} = \{u : F(y) < u\}$. Now $P\{G(U) > y\} = P\{U > F(y)\} = 1 - F(y)$, so $G(U)$ has the desired distribution. Remark: We only assumed the right continuity of $F$.
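This is the inverse transform sampling method. A minimal sketch for the exponential distribution, where $G$ has the closed form $G(u) = -\log(1-u)/\lambda$:

```python
import math
import random

rng = random.Random(0)

def G(u, lam=1.0):
    """Generalized inverse of F(x) = 1 - exp(-lam * x)."""
    return -math.log(1.0 - u) / lam

samples = [G(rng.random()) for _ in range(200_000)]
print("sample mean ~", round(sum(samples) / len(samples), 3), " (exact: 1.0)")
print("P(X > 1)    ~", sum(s > 1 for s in samples) / len(samples),
      " (exact:", round(math.exp(-1), 4), ")")
```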
Solutions to selected problems of Chapter 12
12.6 Let $Z = \frac{1}{\sigma_Y} Y - \frac{\rho_{XY}}{\sigma_X} X$. Then
$$\sigma_Z^2 = \frac{1}{\sigma_Y^2}\sigma_Y^2 + \frac{\rho_{XY}^2}{\sigma_X^2}\sigma_X^2 - 2\,\frac{\rho_{XY}}{\sigma_X \sigma_Y}\,\mathrm{Cov}(X, Y) = 1 - \rho_{XY}^2.$$
Note that $\rho_{XY} = \pm 1$ implies $\sigma_Z^2 = 0$, which implies $Z = c$ a.s. for some constant $c$. In this case $X = \frac{\sigma_X}{\rho_{XY}\sigma_Y}(Y - c\sigma_Y)$, hence $X$ is an affine function of $Y$.
12.11 Consider the mapping $g(x, y) = (\sqrt{x^2 + y^2},\, \arctan(\frac{x}{y}))$. Let $S_0 = \{(x, y) : y = 0\}$, $S_1 = \{(x, y) : y > 0\}$, $S_2 = \{(x, y) : y < 0\}$. Note that $\bigcup_{i=0}^2 S_i = \mathbb{R}^2$ and $m^2(S_0) = 0$. Also, for $i = 1, 2$, $g : S_i \to \mathbb{R}^2$ is injective and continuously differentiable. The corresponding inverses are given by $g_1^{-1}(z, w) = (z \sin w,\, z \cos w)$ and $g_2^{-1}(z, w) = (-z \sin w,\, -z \cos w)$. In both cases we have that $|J_{g_i^{-1}}(z, w)| = z$, hence by Corollary 12.1 the density of $(Z, W)$ is given by
$$f_{Z,W}(z, w) = \Big(\frac{1}{2\pi} e^{-\frac{z^2}{2}} z + \frac{1}{2\pi} e^{-\frac{z^2}{2}} z\Big) 1_{(-\frac{\pi}{2}, \frac{\pi}{2})}(w)\, 1_{(0,\infty)}(z) = \frac{1}{\pi}\, 1_{(-\frac{\pi}{2}, \frac{\pi}{2})}(w) \cdot z e^{-\frac{z^2}{2}}\, 1_{(0,\infty)}(z)$$
as desired.
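A simulation sketch of the result: pushing two independent standard normals through $g$ and comparing a few moments of $(Z, W)$ with the product density above (moment checks only, not a proof):

```python
import math
import random

rng = random.Random(2)
n = 200_000
zs, ws = [], []
for _ in range(n):
    x, y = rng.gauss(0, 1), rng.gauss(0, 1)
    zs.append(math.hypot(x, y))
    ws.append(math.atan(x / y))  # y = 0 occurs with probability zero

print("E[W]   ~", round(sum(ws) / n, 4), "  (uniform(-pi/2, pi/2): 0)")
print("Var(W) ~", round(sum(w * w for w in ws) / n, 4),
      "  (exact: pi^2/12 =", round(math.pi**2 / 12, 4), ")")
print("E[Z]   ~", round(sum(zs) / n, 4),
      "  (Rayleigh: sqrt(pi/2) =", round(math.sqrt(math.pi / 2), 4), ")")
```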
12.12 Let $\mathcal{P}$ be the set of all permutations of $\{1, \dots, n\}$. For any $\pi \in \mathcal{P}$ let $X^\pi$ be the corresponding permutation of $X$, i.e. $X_k^\pi = X_{\pi_k}$. Observe that
$$P(X_1^\pi \le x_1, \dots, X_n^\pi \le x_n) = F(x_1) \cdots F(x_n),$$
hence the laws of $X^\pi$ and $X$ coincide on a system generating $\mathcal{B}^n$, therefore they are equal. Now let $\Omega_0 = \{(x_1, \dots, x_n) \in \mathbb{R}^n : x_1 < x_2 < \dots < x_n\}$. Since the $X_i$ are i.i.d. and have a continuous distribution, $P(\bigcup_{\pi \in \mathcal{P}} \{X^\pi \in \Omega_0\}) = 1$. Observe that
$$P\{Y_1 \le y_1, \dots, Y_n \le y_n\} = P\Big(\bigcup_{\pi \in \mathcal{P}} \big(\{X_1^\pi \le y_1, \dots, X_n^\pi \le y_n\} \cap \{X^\pi \in \Omega_0\}\big)\Big).$$
Note that the events $\{X_1^\pi \le y_1, \dots, X_n^\pi \le y_n\} \cap \{X^\pi \in \Omega_0\}$, $\pi \in \mathcal{P}$, are disjoint, hence
$$P\{Y_1 \le y_1, \dots, Y_n \le y_n\} = \sum_{\pi \in \mathcal{P}} P\big(\{X_1^\pi \le y_1, \dots, X_n^\pi \le y_n\} \cap \{X^\pi \in \Omega_0\}\big).$$
For $y_1 < \dots < y_n$ each of the $n!$ summands equals $P(X_1 \le y_1, \dots, X_n \le y_n,\, X \in \Omega_0)$, whose density on $\Omega_0$ is $f(y_1) \cdots f(y_n)$. Hence
$$f_Y(y_1, \dots, y_n) = \begin{cases} n!\, f(y_1) \cdots f(y_n) & y_1 < \dots < y_n \\ 0 & \text{otherwise.} \end{cases}$$
Solutions to selected problems of Chapter 14
14.7 $\varphi_X(u)$ is real valued iff $\varphi_X(u) = \overline{\varphi_X(u)} = \varphi_{-X}(u)$. By the uniqueness theorem, $\varphi_X(u) = \varphi_{-X}(u)$ iff $F_X = F_{-X}$. Hence $\varphi_X(u)$ is real valued iff $F_X = F_{-X}$.
14.9 We use induction. It is clear that the statement is true for $n = 1$. Put $Y_n = \sum_{i=1}^n X_i$ and assume that $E[(Y_n)^3] = \sum_{i=1}^n E[(X_i)^3]$. Note that this implies $\frac{d^3}{du^3}\varphi_{Y_n}(0) = -i\sum_{i=1}^n E[(X_i)^3]$. Now $E[(Y_{n+1})^3] = E[(X_{n+1} + Y_n)^3] = i\,\frac{d^3}{du^3}(\varphi_{X_{n+1}}\varphi_{Y_n})(0)$ by the independence of $X_{n+1}$ and $Y_n$. Note that
$$\frac{d^3}{du^3}(\varphi_{X_{n+1}}\varphi_{Y_n})(0) = \frac{d^3}{du^3}\varphi_{X_{n+1}}(0)\,\varphi_{Y_n}(0) + 3\,\frac{d^2}{du^2}\varphi_{X_{n+1}}(0)\,\frac{d}{du}\varphi_{Y_n}(0) + 3\,\frac{d}{du}\varphi_{X_{n+1}}(0)\,\frac{d^2}{du^2}\varphi_{Y_n}(0) + \varphi_{X_{n+1}}(0)\,\frac{d^3}{du^3}\varphi_{Y_n}(0)$$
$$= \frac{d^3}{du^3}\varphi_{X_{n+1}}(0) + \frac{d^3}{du^3}\varphi_{Y_n}(0) = -i\Big(E[(X_{n+1})^3] + \sum_{i=1}^n E[(X_i)^3]\Big),$$
where we used the fact that $\frac{d}{du}\varphi_{X_{n+1}}(0) = iE(X_{n+1}) = 0$ and $\frac{d}{du}\varphi_{Y_n}(0) = iE(Y_n) = 0$. So $E[(Y_{n+1})^3] = \sum_{i=1}^{n+1} E[(X_i)^3]$, hence the induction is complete.
14.10 It is clear that $0 \le \mu(A) \le 1$ since
$$0 \le \sum_{j=1}^n \lambda_j \mu_j(A) \le \sum_{j=1}^n \lambda_j = 1.$$
Also, for $A_i$ disjoint,
$$\mu\Big(\bigcup_{i=1}^\infty A_i\Big) = \sum_{j=1}^n \lambda_j\, \mu_j\Big(\bigcup_{i=1}^\infty A_i\Big) = \sum_{j=1}^n \sum_{i=1}^\infty \lambda_j\, \mu_j(A_i) = \sum_{i=1}^\infty \sum_{j=1}^n \lambda_j\, \mu_j(A_i) = \sum_{i=1}^\infty \mu(A_i).$$
Hence $\mu$ is countably additive, therefore it is a probability measure. Note that $\int 1_A(x)\, \mu(dx) = \sum_{j=1}^n \lambda_j \int 1_A(x)\, \mu_j(dx)$ by the definition of $\mu$. Now by linearity and the monotone convergence theorem, for a non-negative Borel function $f$ we have that $\int f(x)\, \mu(dx) = \sum_{j=1}^n \lambda_j \int f(x)\, \mu_j(dx)$. Extending this to integrable $f$, we have that
$$\hat{\mu}(u) = \int e^{iux}\, \mu(dx) = \sum_{j=1}^n \lambda_j \int e^{iux}\, \mu_j(dx) = \sum_{j=1}^n \lambda_j\, \hat{\mu}_j(u).$$
14.11 Let $\mu$ be the double exponential distribution, $\mu_1$ be the distribution of $Y$ and $\mu_2$ be the distribution of $-Y$, where $Y$ is an exponential r.v. with parameter $\lambda = 1$. Then we have that
$$\mu(A) = \frac{1}{2}\int_{A \cap (0,\infty)} e^{-x}\, dx + \frac{1}{2}\int_{A \cap (-\infty,0)} e^{x}\, dx = \frac{1}{2}\mu_1(A) + \frac{1}{2}\mu_2(A).$$
By the previous exercise we have that
$$\hat{\mu}(u) = \frac{1}{2}\hat{\mu}_1(u) + \frac{1}{2}\hat{\mu}_2(u) = \frac{1}{2}\Big(\frac{1}{1 - iu} + \frac{1}{1 + iu}\Big) = \frac{1}{1 + u^2}.$$
14.15 Note that $E\{X^n\} = (-i)^n \frac{d^n}{du^n}\varphi_X(0)$. Since $X \sim N(0, 1)$, $\varphi_X(s) = e^{-s^2/2}$. Note that we can get the derivatives of any order of $e^{-s^2/2}$ at $0$ simply by taking the Taylor expansion of $e^x$:
$$e^{-s^2/2} = \sum_{n=0}^\infty \frac{(-s^2/2)^n}{n!} = \sum_{n=0}^\infty i^{2n}\, \frac{(2n)!}{2^n n!}\, \frac{s^{2n}}{(2n)!},$$
hence $E\{X^n\} = (-i)^n \frac{d^n}{du^n}\varphi_X(0) = 0$ for $n$ odd. For $n = 2k$,
$$E\{X^{2k}\} = (-i)^{2k}\, \frac{d^{2k}}{du^{2k}}\varphi_X(0) = (-i)^{2k} (i)^{2k}\, \frac{(2k)!}{2^k k!} = \frac{(2k)!}{2^k k!}$$
as desired.
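The even moments $(2k)!/(2^k k!)$ are the double factorials $1, 3, 15, \dots$; a short comparison with sample moments (illustrative only):

```python
import random
from math import factorial

rng = random.Random(4)
xs = [rng.gauss(0, 1) for _ in range(500_000)]

for n in range(1, 7):
    emp = sum(x**n for x in xs) / len(xs)
    exact = 0 if n % 2 else factorial(n) // (2**(n // 2) * factorial(n // 2))
    print(f"E[X^{n}]:  empirical ~ {emp:8.4f}   exact = {exact}")
```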
Solutions to selected problems of Chapter 15
15.1 a. $E\{\overline{X}\} = \frac{1}{n}\sum_{i=1}^n E\{X_i\} = \mu$.
b. Since $X_1, \dots, X_n$ are independent, $\mathrm{Var}(\overline{X}) = \frac{1}{n^2}\sum_{i=1}^n \mathrm{Var}\{X_i\} = \frac{\sigma^2}{n}$.
c. Note that $S^2 = \frac{1}{n}\sum_{i=1}^n (X_i)^2 - \overline{X}^2$. Hence
$$E(S^2) = \frac{1}{n}\sum_{i=1}^n (\sigma^2 + \mu^2) - \Big(\frac{\sigma^2}{n} + \mu^2\Big) = \frac{n-1}{n}\,\sigma^2.$$
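A simulation of the bias factor $(n-1)/n$ in part c (the sample size and the standard normal population are illustrative choices):

```python
import random

rng = random.Random(5)
n, reps, sigma2 = 5, 100_000, 1.0

acc = 0.0
for _ in range(reps):
    xs = [rng.gauss(0, 1) for _ in range(n)]
    xbar = sum(xs) / n
    acc += sum((x - xbar)**2 for x in xs) / n   # the biased S^2
print("E[S^2] ~", round(acc / reps, 4), "  exact:", (n - 1) / n * sigma2)
```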
15.17 Note that $\varphi_Y(u) = \prod_{i=1}^\alpha \varphi_{X_i}(u) = \big(\frac{\beta}{\beta - iu}\big)^\alpha$, which is the characteristic function of a Gamma$(\alpha, \beta)$ random variable. Hence by the uniqueness of the characteristic function, $Y$ is Gamma$(\alpha, \beta)$.
Solutions to selected problems of Chapter 16
16.3 $P(\{Y \le y\}) = P(\{X \le y\} \cap \{Z = 1\}) + P(\{-X \le y\} \cap \{Z = -1\}) = \frac{1}{2}\Phi(y) + \frac{1}{2}\Phi(y) = \Phi(y)$, since $Z$ and $X$ are independent and the distribution of $X$ is symmetric. So $Y$ is normal. Note that $P(X + Y = 0) = \frac{1}{2}$, hence $X + Y$ cannot be normal. So $(X, Y)$ is not Gaussian even though both $X$ and $Y$ are normal.
16.4 Observe that
$$Q = \begin{pmatrix} \sigma_X^2 & \rho\,\sigma_X \sigma_Y \\ \rho\,\sigma_X \sigma_Y & \sigma_Y^2 \end{pmatrix}.$$
So $\det(Q) = \sigma_X^2 \sigma_Y^2 (1 - \rho^2)$, and $\det(Q) = 0$ iff $\rho = \pm 1$. By Corollary 16.2 the joint density of $(X, Y)$ exists iff $-1 < \rho < 1$. (By Cauchy-Schwarz we know that $-1 \le \rho \le 1$.) Note that
$$Q^{-1} = \frac{1}{\sigma_X^2 \sigma_Y^2 (1 - \rho^2)} \begin{pmatrix} \sigma_Y^2 & -\rho\,\sigma_X \sigma_Y \\ -\rho\,\sigma_X \sigma_Y & \sigma_X^2 \end{pmatrix}.$$
Substituting this in formula 16.5 we get that
$$f_{(X,Y)}(x, y) = \frac{1}{2\pi \sigma_X \sigma_Y \sqrt{1 - \rho^2}} \exp\Bigg(-\frac{1}{2(1 - \rho^2)}\Bigg[\Big(\frac{x - \mu_X}{\sigma_X}\Big)^2 - \frac{2\rho (x - \mu_X)(y - \mu_Y)}{\sigma_X \sigma_Y} + \Big(\frac{y - \mu_Y}{\sigma_Y}\Big)^2\Bigg]\Bigg).$$
16.6 By Theorem 16.2 there exists a multivariate normal r.v. $Y$ with $E(Y) = 0$ and a diagonal covariance matrix $\Lambda$ s.t. $X - \mu = AY$, where $A$ is an orthogonal matrix. Since $Q = A \Lambda A^*$ and $\det(Q) > 0$, the diagonal entries of $\Lambda$ are strictly positive, hence we can define $B = \Lambda^{-1/2} A^*$. Now the covariance matrix $\tilde{Q}$ of $B(X - \mu)$ is given by
$$\tilde{Q} = \Lambda^{-1/2} A^* A \Lambda A^* A \Lambda^{-1/2} = I.$$
So $B(X - \mu)$ is standard normal.
16.17 We know, as in Exercise 16.6, that if $B = \Lambda^{-1/2} A^*$ where $A$ is the orthogonal matrix s.t. $Q = A \Lambda A^*$, then $B(X - \mu)$ is standard normal. Note that this gives
$$(X - \mu)^* Q^{-1} (X - \mu) = (X - \mu)^* B^* B (X - \mu),$$
which has the chi-square distribution with $n$ degrees of freedom.


Solutions to selected problems of Chapter 17
17.1 Let $n(m)$ and $j(m)$ be such that $Y_m = n(m)^{1/p} Z_{n(m), j(m)}$. This gives that $P(|Y_m| > 0) = \frac{1}{n(m)} \to 0$ as $m \to \infty$. So $Y_m$ converges to $0$ in probability. However $E[|Y_m|^p] = E[n(m) Z_{n(m), j(m)}] = 1$ for all $m$. So $Y_m$ does not converge to $0$ in $L^p$.
17.2 Let $X_n = 1/n$. It is clear that $X_n$ converges to $0$ in probability. If $f(x) = 1_{\{0\}}(x)$ then we have that $P(|f(X_n) - f(0)| > \epsilon) = 1$ for every $\epsilon < 1$, so $f(X_n)$ does not converge to $f(0)$ in probability.
17.3 First observe that $E(S_n) = \sum_{i=1}^n E(X_i) = 0$ and that $\mathrm{Var}(S_n) = \sum_{i=1}^n \mathrm{Var}(X_i) = n$, since $E(X_i) = 0$ and $\mathrm{Var}(X_i) = E(X_i^2) = 1$. By Chebyshev's inequality,
$$P\Big(\Big|\frac{S_n}{n}\Big| \ge \epsilon\Big) = P(|S_n| \ge n\epsilon) \le \frac{\mathrm{Var}(S_n)}{n^2 \epsilon^2} = \frac{n}{n^2 \epsilon^2} \to 0 \quad \text{as } n \to \infty.$$
Hence $\frac{S_n}{n}$ converges to $0$ in probability.
17.4 Note that Chebyshev's inequality gives $P(|\frac{S_{n^2}}{n^2}| \ge \epsilon) \le \frac{1}{n^2 \epsilon^2}$. Since $\sum_{n=1}^\infty \frac{1}{n^2 \epsilon^2} < \infty$, by the Borel-Cantelli theorem $P(\limsup_n \{|\frac{S_{n^2}}{n^2}| \ge \epsilon\}) = 0$. Let
$$\Omega_0 = \Big(\bigcup_{m=1}^\infty \limsup_n \Big\{\Big|\frac{S_{n^2}}{n^2}\Big| \ge \frac{1}{m}\Big\}\Big)^c.$$
Then $P(\Omega_0) = 1$. Now let's pick $w \in \Omega_0$. For any $\epsilon$ there exists $m$ s.t. $\frac{1}{m} \le \epsilon$ and $w \in (\limsup_n \{|\frac{S_{n^2}}{n^2}| \ge \frac{1}{m}\})^c$. Hence there are finitely many $n$ s.t. $|\frac{S_{n^2}}{n^2}| \ge \frac{1}{m}$, which implies that there exists $N(w)$ s.t. $|\frac{S_{n^2}}{n^2}| < \frac{1}{m} \le \epsilon$ for every $n \ge N(w)$. Hence $\frac{S_{n^2}(w)}{n^2} \to 0$. Since $P(\Omega_0) = 1$, we have almost sure convergence.
17.12 $Y < \infty$ a.s., which follows from Exercise 17.11 since $X_n < \infty$ and $X < \infty$ a.s. Let $Z = \frac{1}{c}\,\frac{1}{1+Y}$. Observe that $Z > 0$ a.s. and $E_P(Z) = 1$. Therefore, as in Exercise 9.8, $Q(A) = E_P(Z 1_A)$ defines a probability measure and $E_Q(|X_n - X|) = E_P(Z|X_n - X|)$. Note that $Z|X_n - X|$ is bounded a.s. and $X_n \to X$ a.s. by hypothesis, hence by the dominated convergence theorem $E_Q(|X_n - X|) = E_P(Z|X_n - X|) \to 0$, i.e. $X_n$ tends to $X$ in $L^1$ with respect to $Q$.
17.14 First observe that $|E(X_n^2) - E(X^2)| \le E(|X_n^2 - X^2|)$. Since $|X_n^2 - X^2| \le (X_n - X)^2 + 2|X||X_n - X|$, we get that $|E(X_n^2) - E(X^2)| \le E((X_n - X)^2) + 2E(|X||X_n - X|)$. Note that the first term goes to $0$ since $X_n$ tends to $X$ in $L^2$. Applying the Cauchy-Schwarz inequality to the second term we get $E(|X||X_n - X|) \le \sqrt{E(X^2) E(|X_n - X|^2)}$, hence the second term also goes to $0$ as $n \to \infty$. Now we can conclude that $E(X_n^2) \to E(X^2)$.
17.15 For any $\epsilon > 0$, $P(\{|X| \le c + \epsilon\}) \ge P(\{|X_n| \le c,\ |X_n - X| \le \epsilon\}) \to 1$ as $n \to \infty$. Hence $P(\{|X| \le c + \epsilon\}) = 1$. Since $\{|X| \le c\} = \bigcap_{m=1}^\infty \{|X| \le c + \frac{1}{m}\}$, we get that $P\{|X| \le c\} = 1$. Now we have that
$$E(|X_n - X|) = E\big(|X_n - X| 1_{\{|X_n - X| \le \epsilon\}}\big) + E\big(|X_n - X| 1_{\{|X_n - X| > \epsilon\}}\big) \le \epsilon + 2c\, P(\{|X_n - X| > \epsilon\}),$$
hence choosing $n$ large we can make $E(|X_n - X|)$ arbitrarily small, so $X_n$ tends to $X$ in $L^1$.
Solutions to selected problems of Chapter 18
18.8 Note that $\varphi_{Y_n}(u) = \prod_{i=1}^n \varphi_{X_i}(\frac{u}{n}) = \prod_{i=1}^n e^{-\frac{|u|}{n}} = e^{-|u|}$, hence $Y_n$ is also Cauchy with $\alpha = 0$ and $\beta = 1$, which is independent of $n$; hence trivially $Y_n$ converges in distribution to a Cauchy distributed r.v. with $\alpha = 0$ and $\beta = 1$. However $Y_n$ does not converge to any r.v. in probability. To see this, suppose there exists $Y$ s.t. $P(|Y_n - Y| > \epsilon) \to 0$. Note that $P(|Y_n - Y_m| > \epsilon) \le P(|Y_n - Y| > \frac{\epsilon}{2}) + P(|Y_m - Y| > \frac{\epsilon}{2})$. If we let $m = 2n$, then $|Y_n - Y_m| = \frac{1}{2}\big|\frac{1}{n}\sum_{i=1}^n X_i - \frac{1}{n}\sum_{i=n+1}^{2n} X_i\big|$, which is equal in distribution to $\frac{1}{2}|U - W|$ where $U$ and $W$ are independent Cauchy r.v.'s with $\alpha = 0$ and $\beta = 1$. Hence $P(|Y_n - Y_m| > \epsilon)$ does not depend on $n$ and does not converge to $0$ if we let $m = 2n$ and $n \to \infty$, which is a contradiction since we assumed the right hand side converges to $0$.
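A quick illustration that the sample means of Cauchy draws never settle down as $n$ grows (a sketch; the median of $|Y_n|$ stays of constant order, consistent with $Y_n$ being standard Cauchy for every $n$):

```python
import math
import random

rng = random.Random(6)

def cauchy():
    # Standard Cauchy via inverse transform: tan(pi * (U - 1/2)).
    return math.tan(math.pi * (rng.random() - 0.5))

for n in [10, 100, 1000, 10000]:
    abs_means = sorted(abs(sum(cauchy() for _ in range(n)) / n)
                       for _ in range(300))
    print(f"n={n:5d}:  median |Y_n| over 300 runs ~ {abs_means[150]:.3f}")
```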
18.16 Define $f_m$ as the following sequence of functions:
$$f_m(x) = \begin{cases} x^2 & |x| \le N - \frac{1}{m} \\ m\big(N - \frac{1}{m}\big)^2 (N - x) & N - \frac{1}{m} \le x \le N \\ m\big(N - \frac{1}{m}\big)^2 (N + x) & -N \le x \le -\big(N - \frac{1}{m}\big) \\ 0 & \text{otherwise.} \end{cases}$$
Note that each $f_m$ is continuous and bounded. Also $f_m(x) \uparrow 1_{(-N,N)}(x)\, x^2$ for every $x \in \mathbb{R}$. Hence
$$\int_{-N}^N x^2\, F(dx) = \lim_m \int_{-\infty}^\infty f_m(x)\, F(dx)$$
by the monotone convergence theorem. Now
$$\int_{-\infty}^\infty f_m(x)\, F(dx) = \lim_n \int_{-\infty}^\infty f_m(x)\, F_n(dx)$$
by weak convergence. Since $\int_{-\infty}^\infty f_m(x)\, F_n(dx) \le \int_{-N}^N x^2\, F_n(dx)$, it follows that
$$\int_{-N}^N x^2\, F(dx) \le \lim_m \limsup_n \int_{-N}^N x^2\, F_n(dx) = \limsup_n \int_{-N}^N x^2\, F_n(dx)$$
as desired.
18.17 Following the hint, suppose there exists a continuity point $y$ of $F$ such that
$$\lim_n F_n(y) \ne F(y).$$
Then there exist $\epsilon > 0$ and a subsequence $(n_k)_{k \ge 1}$ s.t. $F_{n_k}(y) - F(y) < -\epsilon$ for all $k$, or $F_{n_k}(y) - F(y) > \epsilon$ for all $k$. Suppose $F_{n_k}(y) - F(y) < -\epsilon$ for all $k$, and observe that for $x \le y$,
$$F_{n_k}(x) - F(x) \le F_{n_k}(y) - F(x) = F_{n_k}(y) - F(y) + (F(y) - F(x)) < -\epsilon + (F(y) - F(x)).$$
Since $F$ is continuous at $y$ there exists an interval $[y_1, y)$ on which $|F(y) - F(x)| < \frac{\epsilon}{2}$, hence $F_{n_k}(x) - F(x) < -\frac{\epsilon}{2}$ for all $x \in [y_1, y)$. Now suppose $F_{n_k}(y) - F(y) > \epsilon$; then for $x \ge y$,
$$F_{n_k}(x) - F(x) \ge F_{n_k}(y) - F(x) = F_{n_k}(y) - F(y) + (F(y) - F(x)) > \epsilon + (F(y) - F(x)).$$
Now we can find an interval $(y, y_1]$ on which $|F(y) - F(x)| < \frac{\epsilon}{2}$, which gives $F_{n_k}(x) - F(x) > \frac{\epsilon}{2}$ for all $x \in (y, y_1]$. Note that both cases would yield
$$\int_{-\infty}^\infty |F_{n_k}(x) - F(x)|^r\, dx > |y_1 - y| \Big(\frac{\epsilon}{2}\Big)^r,$$
which is a contradiction to the assumption that
$$\lim_n \int_{-\infty}^\infty |F_n(x) - F(x)|^r\, dx = 0.$$
Therefore $X_n$ converges to $X$ in distribution.
Solutions to selected problems of Chapter 19
19.1 Note that $\varphi_{X_n}(u) = e^{iu\mu_n - \frac{u^2 \sigma_n^2}{2}} \to e^{iu\mu - \frac{u^2 \sigma^2}{2}}$. By Lévy's continuity theorem it follows that $X_n \Rightarrow X$ where $X$ is $N(\mu, \sigma^2)$.
19.3 Note that $\varphi_{X_n + Y_n}(u) = \varphi_{X_n}(u)\,\varphi_{Y_n}(u) \to \varphi_X(u)\,\varphi_Y(u) = \varphi_{X+Y}(u)$. Therefore $X_n + Y_n \Rightarrow X + Y$.
Solutions to selected problems of Chapter 20
20.1 a. First observe that $E(S_n^2) = \sum_{i=1}^n \sum_{j=1}^n E(X_i X_j) = \sum_{i=1}^n E(X_i^2)$, since $E(X_i X_j) = 0$ for $i \ne j$. Now
$$P\Big(\frac{|S_n|}{n} \ge \epsilon\Big) \le \frac{E(S_n^2)}{\epsilon^2 n^2} = \frac{\sum_{i=1}^n E(X_i^2)}{\epsilon^2 n^2} \le \frac{c}{n \epsilon^2}$$
as desired.
b. From part a it is clear that $\frac{1}{n} S_n$ converges to $0$ in probability. Also $E((\frac{1}{n} S_n)^2) = \frac{1}{n^2}\sum_{i=1}^n E(X_i^2) \le \frac{c}{n} \to 0$ since $E(X_i^2) \le c$, so $\frac{1}{n} S_n$ converges to $0$ in $L^2$ as well.
20.5 Note that $Z_n \Rightarrow Z$ implies that $\varphi_{Z_n}(u) \to \varphi_Z(u)$ uniformly on compact subsets of $\mathbb{R}$ (see Remark 19.1). Fix $u$, $M > 0$ and $\epsilon > 0$. We can pick $N$ s.t. for every $n > N$: $|\frac{u}{\sqrt{n}}| < M$, $\sup_{x \in [-M, M]} |\varphi_{Z_n}(x) - \varphi_Z(x)| < \epsilon$ and $|\varphi_Z(\frac{u}{\sqrt{n}}) - \varphi_Z(0)| < \epsilon$. This gives us
$$\Big|\varphi_{Z_n}\Big(\frac{u}{\sqrt{n}}\Big) - \varphi_Z(0)\Big| \le \Big|\varphi_{Z_n}\Big(\frac{u}{\sqrt{n}}\Big) - \varphi_Z\Big(\frac{u}{\sqrt{n}}\Big)\Big| + \Big|\varphi_Z\Big(\frac{u}{\sqrt{n}}\Big) - \varphi_Z(0)\Big| \le 2\epsilon.$$
So $\varphi_{\frac{Z_n}{\sqrt{n}}}(u) = \varphi_{Z_n}(\frac{u}{\sqrt{n}})$ converges to $\varphi_Z(0) = 1$ for every $u$. Therefore $\frac{Z_n}{\sqrt{n}} \Rightarrow 0$ by the continuity theorem. We also have by the strong law of large numbers that $\frac{Z_n}{\sqrt{n}} \to E(X_j)$ a.s. This implies $E(X_j) = 0$, hence the assertion follows by the strong law of large numbers.