Ws1 - Probability Generating Functionssol

Book Problems
1.16
a. $1/2$
b. $1/n$
c. $E[I_i] = 1/n$
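A brute-force check of 1.16b and 1.17a, as a minimal sketch. It assumes the $X_i$ are continuous and iid, so ties have probability zero and every ordering of the first $n$ values is equally likely; the values can then be replaced by their ranks. The helper name `prob_max_at` is hypothetical, not from the worksheet.

```python
from fractions import Fraction
from itertools import permutations


def prob_max_at(n: int, pos: int) -> Fraction:
    """Exact P(the value at 1-based position `pos` is the largest of n).

    Assumes all n! rank orderings are equally likely (iid, no ties).
    """
    perms = list(permutations(range(n)))
    hits = sum(1 for p in perms if p[pos - 1] == n - 1)
    return Fraction(hits, len(perms))


for n in range(1, 7):
    # X_n is a record (1.16b) and X_1 is still the maximum (1.17a): both 1/n
    print(n, prob_max_at(n, n), prob_max_at(n, 1))
```

Exact fractions are used so the comparison with $1/n$ is not blurred by rounding.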

1.17
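a.
\[ P(N_1 > n) = P(X_1 = \max\{X_1, \dots, X_n\}) = \frac{1}{n} \]
b.
\[ P(N_1 = n) = P(N_1 > n-1) - P(N_1 > n) = \frac{1}{n-1} - \frac{1}{n}, \quad n \ge 2 \]
\[ \sum_{n=2}^{\infty} P(N_1 = n) = 1 \]
c.
\[ E N_1 = \sum_{i=2}^{\infty} i \, P(N_1 = i) = \sum_{i=2}^{\infty} i \cdot \frac{1}{i(i-1)} = \sum_{i=1}^{\infty} \frac{1}{i} = \infty \]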
d.
Suppose we have reached the first record. Use its value to start a separate process; whenever we reach the first record ($N_1$) of the separate process, we have reached the second record of the first process. Thus $N_2$ is a r.v. because $N_1$ is.
e.
Yup, crazy.
1.18
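a.
\[ P(X_n, X_{n+1} \text{ both records}) = \frac{1}{n(n+1)} \quad \text{by symmetry} \]
b.
\[ P(X_n, X_{n+1} \text{ both records}) = \frac{1}{n(n+1)} = \frac{1}{n} \cdot \frac{1}{n+1} = P(X_n \text{ record}) \, P(X_{n+1} \text{ record}) \quad \text{by 1.16b} \]
so the two events are independent.
c.
Let $N_A$ = # adjacent records and
\[ I_n = \begin{cases} 1 & \text{if } X_n, X_{n+1} \text{ are both records} \\ 0 & \text{o.w.} \end{cases} \]
Then
\[ E[N_A] = E\left[\sum_{n=1}^{\infty} I_n\right] = \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \cdots = 1 \]
Weird, right? You expect only one adjacent record in an infinite sequence.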
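The two facts used in 1.18 can be checked exactly on small cases. This is a sketch under the same assumption as before (all rank orderings equally likely); `is_record`, `prob_both_records` and `partial_expectation` are hypothetical helper names.

```python
from fractions import Fraction
from itertools import permutations


def is_record(p, k):
    """True if p[k] exceeds every earlier entry (k is a 0-based index)."""
    return all(p[k] > x for x in p[:k])


def prob_both_records(n):
    """Exact P(X_n and X_{n+1} are both records), using ranks of the first n+1 values."""
    perms = list(permutations(range(n + 1)))
    hits = sum(1 for p in perms if is_record(p, n - 1) and is_record(p, n))
    return Fraction(hits, len(perms))


def partial_expectation(N):
    """Sum of 1/(n(n+1)) for n = 1..N; it telescopes to 1 - 1/(N+1)."""
    return sum(Fraction(1, n * (n + 1)) for n in range(1, N + 1))


for n in range(1, 6):
    print(n, prob_both_records(n), Fraction(1, n * (n + 1)))
print(partial_expectation(10), partial_expectation(1000))
```

The partial sums approach 1 from below, which is the value of $E[N_A]$ found in part c.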
1.32
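a.
Monotone Convergence Theorem (MCT)
b.
By the Markov inequality. Check the condition that $\sum_i I_{A_i} \ge 0$, thus the Markov inequality applies. Then
\[ P\left(\sum_{i=1}^{\infty} I_{A_i} > m\right) \le \frac{E\left[\sum_{i=1}^{\infty} I_{A_i}\right]}{m} = \frac{\sum_{i=1}^{\infty} P(A_i)}{m} \to 0 \quad \text{as } m \to \infty, \]
since $\sum_{i=1}^{\infty} P(A_i) < \infty$.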
1.33
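a.
Write $\beta = P(X \ge b)$. Then
\[ \int_b^{\infty} x^2 f(x)\,dx \ge b^2 \int_b^{\infty} f(x)\,dx = b^2 \beta \]
b.
(1): by definition
(2):
\[ \int_{-\infty}^{b} x f(x)\,dx = EX - \int_b^{\infty} x f(x)\,dx \le 0 - b \int_b^{\infty} f(x)\,dx = -b\beta \]
c.
Per the hint. Minimizing
\[ \int_{-\infty}^{b} x^2 f(x)\,dx \]
is the same as minimizing
\[ \int_{-\infty}^{b} x^2 \frac{f(x)}{1-\beta}\,dx \]
Let $g(x) = \frac{1}{1-\beta} f(x)$, and let $Y \sim g(x)$ on $(-\infty, b)$. Then we want to minimize $EY^2$ given
\[ P(Y < b) = 1, \qquad EY \le -\frac{b\beta}{1-\beta} \]
Then
\[ EY^2 \ge (EY)^2 \ge \left(\frac{b\beta}{1-\beta}\right)^2 \]
So the minimum value is $EY^2 = E^2Y = \left(\frac{b\beta}{1-\beta}\right)^2$; this is achieved if $Y = -\frac{b\beta}{1-\beta}$ wp 1. Thus
\[ \int_{-\infty}^{b} x^2 f(x)\,dx = (1-\beta)\,EY^2 \ge \frac{(b\beta)^2}{1-\beta} \]
d.
By parts a and c, we have
\[ \sigma^2 \ge \frac{(b\beta)^2}{1-\beta} + b^2\beta = b^2\beta\left(\frac{\beta}{1-\beta} + 1\right) = \frac{b^2\beta}{1-\beta} \]
We then have
\[ \beta \le \frac{\sigma^2}{\sigma^2 + b^2}, \quad \text{i.e.} \quad P(X \ge b) \le \frac{\sigma^2}{\sigma^2 + b^2} \]
e.
Let $X = Y - EY$, then everything follows.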
1.33 (Alternative)
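\[ 1\{X \ge b\} \le \left(\frac{X + c}{b + c}\right)^2 \]
by construction. Then
\[ E[1\{X \ge b\}] = P(X \ge b) \le E\left[\left(\frac{X + c}{b + c}\right)^2\right] = \frac{\sigma^2 + c^2}{(b + c)^2} \qquad (1) \]
Minimize over $c$:
\[ \frac{d}{dc} \frac{\sigma^2 + c^2}{(b + c)^2} = \frac{2c(b + c)^2 - 2(\sigma^2 + c^2)(b + c)}{(b + c)^4} \overset{\text{set}}{=} 0 \]
\[ 0 = c(b + c) - (\sigma^2 + c^2) = bc - \sigma^2 \quad \Rightarrow \quad c = \frac{\sigma^2}{b} \]
Plugging into (1) we have
\[ P(X \ge b) \le \frac{\sigma^2 + \left(\frac{\sigma^2}{b}\right)^2}{\left(b + \frac{\sigma^2}{b}\right)^2} = \frac{b^2\sigma^2 + \sigma^4}{(b^2 + \sigma^2)^2} = \frac{\sigma^2}{\sigma^2 + b^2} \]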
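The minimization over $c$ can be sanity-checked numerically. The sketch below evaluates the right-hand side of (1) on a grid and compares the smallest value with the closed form $\sigma^2/(\sigma^2 + b^2)$; the values $b = 2$, $\sigma^2 = 3$ and the function name `quadratic_bound` are arbitrary choices for illustration.

```python
def quadratic_bound(c: float, b: float, var: float) -> float:
    """Right-hand side of (1): (sigma^2 + c^2) / (b + c)^2."""
    return (var + c * c) / (b + c) ** 2


b, var = 2.0, 3.0                  # illustrative values, not from the problem
c_star = var / b                   # stationary point c = sigma^2 / b
closed_form = var / (var + b * b)  # sigma^2 / (sigma^2 + b^2)

# Brute-force search over c in [0, 10] with step 0.001
grid_min = min(quadratic_bound(k / 1000, b, var) for k in range(10001))

print(quadratic_bound(c_star, b, var), closed_form, grid_min)
```

All three printed numbers should agree up to floating-point error, since the grid contains $c = \sigma^2/b = 1.5$.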
1.39
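a.
\[ P\left(\left|\frac{S_n - E[S_n]}{n}\right| \ge \varepsilon\right) \le \frac{\operatorname{var}(S_n)}{n^2\varepsilon^2} = \frac{\sum_{i=1}^{n} \operatorname{var}(X_i)}{n^2\varepsilon^2} \le \frac{nA}{n^2\varepsilon^2} = \frac{A}{n\varepsilon^2} \to 0 \]
b.
\[ P\left(\left|\frac{S_n - E[S_n]}{n}\right| \ge \varepsilon\right) \le \frac{\operatorname{var}(S_n)}{n^2\varepsilon^2} = \frac{\sum_{i=1}^{n} \operatorname{var}(X_i)}{n^2\varepsilon^2} \le \frac{\sum_{i=1}^{n} A n^{1-\alpha}}{n^2\varepsilon^2} = \frac{A n^{2-\alpha}}{n^2\varepsilon^2} = \frac{A}{\varepsilon^2} \cdot \frac{1}{n^{\alpha}} \to 0, \quad 0 < \alpha \le 1 \]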
1.41
A based proof is given in the solution handbook, available online.
1.45
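The question might be a bit vague. It actually asks us to show that $S_n/\sqrt{n}$ does not converge to anything in probability. To show this we first use a simple lemma.
Lemma. If $X_n \xrightarrow{P} X$ and $Y_n \xrightarrow{P} Y$, then $X_n + Y_n \xrightarrow{P} X + Y$.
Proof.
\[ P(|(X_n + Y_n) - (X + Y)| > \varepsilon) \le P(|X_n - X| + |Y_n - Y| > \varepsilon) \]
\[ \le P\left(\left\{|X_n - X| > \tfrac{\varepsilon}{2}\right\} \cup \left\{|Y_n - Y| > \tfrac{\varepsilon}{2}\right\}\right) \]
\[ \le P\left(|X_n - X| > \tfrac{\varepsilon}{2}\right) + P\left(|Y_n - Y| > \tfrac{\varepsilon}{2}\right) \to 0 \]
Suppose $S_n/\sqrt{n} \xrightarrow{P} Z$ for some r.v. $Z$; then obviously $S_{2n}/\sqrt{2n} \xrightarrow{P} Z$ as well, thus by the lemma
\[ \frac{S_n}{\sqrt{n}} - \frac{S_{2n}}{\sqrt{2n}} \xrightarrow{P} 0. \]
However,
\[ \frac{S_n}{\sqrt{n}} - \frac{S_{2n}}{\sqrt{2n}} = \frac{S_n}{\sqrt{n}} - \frac{S_n + S'_n}{\sqrt{2n}}, \quad \text{where } S'_n = \sum_{i=n+1}^{2n} X_i \overset{D}{=} S_n \]
\[ = \left(1 - \frac{1}{\sqrt{2}}\right) \frac{S_n}{\sqrt{n}} - \frac{1}{\sqrt{2}} \frac{S'_n}{\sqrt{n}} \qquad (2) \]
Now notice $S_n$ and $S'_n$ are independent (why?), thus
\[ (2) \xrightarrow{D} \left(1 - \frac{1}{\sqrt{2}}\right) Z_1 - \frac{1}{\sqrt{2}} Z_2 \sim N\left(0, \left[\left(1 - \frac{1}{\sqrt{2}}\right)^2 + \frac{1}{2}\right]\sigma^2\right) \ne 0 \]
where $Z_1$ and $Z_2$ are iid $N(0, \sigma^2)$.
This is a contradiction, thus $S_n/\sqrt{n}$ cannot converge in probability.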
1.47.
\[ 1 - \frac{1}{m} = \exp\left(\ln\left(1 - \frac{1}{m}\right)\right) \le \exp\left(-\frac{1}{m}\right) \]
Thus
\[ \prod_{m \ge n} \left(1 - \frac{1}{m}\right) \le \exp\left(-\sum_{m \ge n} \frac{1}{m}\right) = 0 \]
1.48.
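a.
\[ EX = 100 \]
\[ \operatorname{var}(X) = EX^2 - E^2X = 1 - 10^{-10} + 10^{14} - 100^2 \]
With $S_n = \sum_{i=1}^{n} X_i$,
\[ E[S_n] = n\,EX, \qquad \operatorname{var}[S_n] = n \operatorname{var}(X) \]
b.
The sum of the first $10^6$ variables is no larger than $10^6$ as long as none of them takes the large value.
The event $S_n \le 10^6$ occurs if $X_i \ne 10^{12}$ for all $i \le n$. The probability of this event is
\[ P(S_n \le 10^6) = \left(1 - 10^{-10}\right)^{10^6} \]
c.
We want $P(S_n > 10^6)$, which requires at least one $X_i > 10^6$, so
\[ P(S_n > 10^6) \le P\left(\bigcup_{i=1}^{10^6} \{X_i > 10^6\}\right) \le 10^6 \, P(X_1 > 10^6) = 10^6 \cdot 10^{-10} = 10^{-4} \]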
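The numbers in 1.48 b and c are easy to get wrong in floating point, because $1 - 10^{-10}$ is very close to 1. A minimal sketch using `math.log1p` and `math.expm1` to keep precision; the variable names are illustrative, and the probability $10^{-10}$ and $n = 10^6$ are taken from the solution above.

```python
import math

p_big = 1e-10   # P(X = 10**12), the value used in the solution above
n = 10**6       # number of summands

log_none = n * math.log1p(-p_big)       # log of (1 - p_big)**n
prob_all_small = math.exp(log_none)     # P(S_n <= 10**6), part b
prob_some_big = -math.expm1(log_none)   # 1 - (1 - p_big)**n, i.e. P(S_n > 10**6)
union_bound = n * p_big                 # the bound from part c

print(prob_all_small, prob_some_big, union_bound)
```

The exact probability comes out slightly below the union bound $10^{-4}$, so the bound in part c is nearly tight here.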
1.49.
\[ P[|Y_n| \ge \varepsilon] \le \frac{E|Y_n|}{\varepsilon} \to 0 \]
Worksheet Problems
1.
\[ G_X(z) = \sum_{i=1}^{\infty} p_i z^i = E\left[z^X\right], \quad \text{by definition} \]
2.
Suppose
\[ G_X(z) = G_Y(z), \quad \text{for some rv } Y \]
We have
\[ G_X(z) = G_Y(z), \quad |z| < 1 \]
Thus
\[ G_X^{(k)}(0) = G_Y^{(k)}(0), \quad \forall k \]
\[ \Rightarrow P(X = k) = P(Y = k), \quad \forall k \]
(can you justify switching the derivative and the sum? Uniform convergence of a power series within its radius of convergence)
Thus $X \overset{d}{=} Y$.
3.
\[ \frac{d^k}{dz^k} G_X(z) = \frac{d^k}{dz^k} \sum_{i=0}^{\infty} P(X = i) z^i = \sum_{i=k}^{\infty} i(i-1)(i-2)\cdots(i-k+1) \, P(X = i) \, z^{i-k}, \quad |z| < 1 \]
\[ G_X^{(k)}(1^-) = \sum_{i=k}^{\infty} i(i-1)(i-2)\cdots(i-k+1) \, P(X = i) = E[X(X-1)(X-2)\cdots(X-k+1)] \]
(Note 1: Why can't we just say $G_X^{(k)}(1) = E[X(X-1)(X-2)\cdots(X-k+1)]$, i.e. why is that $1^-$ necessary? What happens when $|z| > 1$ is highly dependent on the distribution, i.e. how fast $P(X = i)$ goes to 0.
Note 2: The fact that $\lim_{x \to 1^-} G_X(x) = G_X(1)$ is justified by MCT.)
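For a distribution with finite support the PGF is a polynomial, so the derivative at $z = 1$ exists outright and the factorial-moment identity can be checked exactly. The sketch below does this for a Binomial(5, 1/3) PMF, which is an arbitrary example and not part of the worksheet; all function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p):
    """PMF of Binomial(n, p) as a list indexed by the value i = 0..n."""
    return [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]


def differentiate(coeffs):
    """Coefficients of the derivative of sum_i coeffs[i] * z**i."""
    return [i * c for i, c in enumerate(coeffs)][1:]


def evaluate(coeffs, z):
    """Value of the polynomial sum_i coeffs[i] * z**i at z."""
    return sum(c * z**i for i, c in enumerate(coeffs))


def factorial_moment(pmf, k):
    """E[X (X-1) ... (X-k+1)] computed directly from the PMF."""
    total = 0
    for i, p in enumerate(pmf):
        term = p
        for j in range(k):
            term *= i - j
        total += term
    return total


pmf = binomial_pmf(5, Fraction(1, 3))
second_derivative = differentiate(differentiate(pmf))
print(evaluate(second_derivative, 1), factorial_moment(pmf, 2))
```

For an infinite-support PMF the same comparison only makes sense as the limit $z \to 1^-$, which is the point of Note 1.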
4.
Basic arithmetic.
5.
\[ G_{X+Y}(z) = E\left[z^{X+Y}\right] = E\left[z^X\right] E\left[z^Y\right] = G_X(z) \, G_Y(z) \]
6.
Collect terms:
\[ G_X(z) \, G_Y(z) = \sum_{i=0}^{\infty} P(X = i) z^i \sum_{j=0}^{\infty} P(Y = j) z^j = \sum_{k=0}^{\infty} \left(\sum_{l=0}^{k} P(X = l) \, P(Y = k - l)\right) z^k = \sum_{k=0}^{\infty} P(X + Y = k) z^k \]
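The "collect terms" step is a discrete convolution. A small sketch with two made-up finite PMFs (not from the worksheet) shows that convolving the PMFs and multiplying the PGFs give the same function; `convolve` and `pgf` are hypothetical helper names.

```python
from fractions import Fraction


def convolve(px, py):
    """PMF of X + Y for independent X, Y with finite PMFs px, py."""
    out = [Fraction(0)] * (len(px) + len(py) - 1)
    for i, a in enumerate(px):
        for j, b in enumerate(py):
            out[i + j] += a * b   # collect the coefficient of z**(i + j)
    return out


def pgf(pmf, z):
    """G(z) = sum_i pmf[i] * z**i."""
    return sum(p * z**i for i, p in enumerate(pmf))


px = [Fraction(1, 2), Fraction(1, 2)]                   # X uniform on {0, 1}
py = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]   # Y on {0, 1, 2}
pz = convolve(px, py)

z = Fraction(3, 7)   # any test point
print(pz, pgf(pz, z) == pgf(px, z) * pgf(py, z))
```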
7.
generalize
8.
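b.
\[ G_X(z) = \sum_{n=0}^{\infty} \frac{e^{-\lambda} \lambda^n}{n!} z^n = \sum_{n=0}^{\infty} \frac{e^{-\lambda} (\lambda z)^n}{n!} = e^{\lambda(z-1)} \sum_{n=0}^{\infty} \frac{e^{-\lambda z} (\lambda z)^n}{n!} = e^{\lambda(z-1)} \]
\[ G_{S_n}(z) = e^{(z-1) \sum_i \lambda_i} \]
d.
\[ G_X(z) = \sum_{n=r}^{\infty} \binom{n-1}{r-1} p^r q^{n-r} z^n = \sum_{n=r}^{\infty} \binom{n-1}{r-1} (zp)^r (zq)^{n-r} \]
\[ = \frac{(zp)^r}{(1 - zq)^r} \sum_{n=r}^{\infty} \binom{n-1}{r-1} (1 - zq)^r (zq)^{n-r} = \left(\frac{zp}{1 - zq}\right)^r \]
\[ G_{S_n}(z) = \left(\frac{zp}{1 - zq}\right)^{\sum_i r_i} \]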
9.
\[ G_{S_N}(z) = E\left[z^{S_N}\right] = E\left[E\left[z^{S_N} \mid N\right]\right] = E\left[\left(E\left[z^{X_i}\right]\right)^N\right] = G_N(G_X(z)) \]
10.
Apply 9:
\[ G_{S_N}(z) = G_N(G_X(z)) = G_N(1 - p + pz) = e^{\lambda(1 - p + pz - 1)} = e^{\lambda p (z - 1)} \]
11.
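\[ P(S_N = j, N - S_N = k) = \sum_{n=0}^{\infty} P(S_N = j, N - S_N = k \mid N = n) \, P(N = n) \]
\[ = P(S_N = j, N - S_N = k \mid N = j + k) \, P(N = j + k), \]
since $P(S_N = j, N - S_N = k \mid N = n) = 0$ for $n \ne j + k$,
\[ = \binom{j + k}{j} p^j q^k \, \frac{\lambda^{j+k} e^{-\lambda}}{(j + k)!} = \frac{e^{-\lambda}}{j! \, k!} (\lambda p)^j (\lambda q)^k = \left(\frac{e^{-\lambda p} (\lambda p)^j}{j!}\right) \left(\frac{e^{-\lambda q} (\lambda q)^k}{k!}\right) \]
\[ = P(S_N = j) \, P(N - S_N = k) \]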
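The factorization in problem 11 can be checked numerically. The sketch assumes, as in problems 10 and 11, that $N$ is Poisson($\lambda$) and each $X_i$ is Bernoulli($p$) with $q = 1 - p$; the values $\lambda = 2.5$, $p = 0.3$ and the function names are arbitrary illustrations.

```python
import math


def poisson_pmf(k, lam):
    """P(K = k) for K ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def joint_pmf(j, k, lam, p):
    """P(S_N = j, N - S_N = k): condition on N = j + k, then split binomially."""
    n = j + k
    return math.comb(n, j) * p**j * (1 - p) ** k * poisson_pmf(n, lam)


lam, p = 2.5, 0.3   # illustrative parameters
for j in range(3):
    for k in range(3):
        product = poisson_pmf(j, lam * p) * poisson_pmf(k, lam * (1 - p))
        print(j, k, joint_pmf(j, k, lam, p), product)
```

Each row prints the joint probability next to the product of the two Poisson marginals; they should match up to rounding.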
12.
Same process as 11.
Challenge Problem
(note: I )
1.
Suppose we draw $n = 2$ random variables, then
\[ P(N_1 = 0) = \frac{1}{2}, \qquad P(N_1 = 1) = \frac{1}{2} \]
At $n = 3$, we have
\[ P(N_1 = 0) = \frac{3}{6}, \qquad P(N_1 = 1) = \frac{2}{6}, \qquad P(N_1 = 2) = \frac{1}{6} \]
First of all, due to iid, WLOG let us replace the $n$ $X$'s with the numbers $1, \dots, n$. Obviously, the maximum number of adjacent records is $n - 1$, which can only be achieved by placing the digits in increasing order; thus, for all $n$, there is only one way to achieve $n - 1$ adjacent records. To get $n - 2$ adj.rec, it can be easily seen that the only way we can do this is to move one digit to the very back of the ordered set, e.g. if the ordered set is 1, 2, 3, 4, 5, then one can move one of the digits after the 5, e.g. 1, 2, 4, 5, 3, to achieve $5 - 2 = 3$ adj.rec. Since there are $n - 1$ digits to move, the number of arrangements with $n - 2$ adj.rec is simply $n - 1$.
Now we look at the case of $n - 3$ adj.rec using the same "push stuff to the back" method. Let's look at $n = 3$. To achieve $3 - 3 = 0$ adj.rec, simply move 1, 2 to the back of 3 (arrangements (3), (4)), or move 1 between 2 and 3 (arrangement (5)).
3 1 2    (3)
3 2 1    (4)
2 1 3    (5)
This corresponds to the 3 ways one can achieve $n - 3$ adj.rec for $n = 3$. Now we look at $n = 4$, i.e. we want 1 adj.rec. This can be achieved by moving 2 of $\{1, 2, 3\}$ to the back of 4 (block 1); since each pair of numbers has 2 different arrangements, the total number is $\binom{3}{2} 2! = 6$.
block 1:
3 4 1 2
3 4 2 1
2 4 1 3
2 4 3 1
1 4 2 3
1 4 3 2
We can also achieve 1 adj.rec by moving one of 1, 2 between 3 and 4, or 1 between 2 and 3. Thus the total number of possible 1-adj.rec arrangements is
\[ \binom{3}{2} 2! + 2 + 1 = 9 \]
You can easily see that for $n = 5$ this idea follows through as well, and the total number is
\[ \binom{4}{2} 2! + 3 + 2 + 1 = 18 \]
Generalizing to $n$, the total number of possible $(n-3)$-adj.rec arrangements is
\[ \binom{n-1}{2} 2! + \sum_{i=1}^{n-2} i = 3 \sum_{i=1}^{n-2} i \]
If one follows this pattern, for any $(n - m)$-adj.rec arrangement with $3 \le m \le n$ one needs
\[ m_0 \sum_{i_1=1}^{n-m+1} \sum_{i_2=1}^{i_1} \cdots \sum_{i_{m-2}=1}^{i_{m-3}} i_{m-2} = m_0 \binom{n-1}{m-1} \]
where $m_0$ = # ways to rearrange an $m$-tuple to achieve 0 adj.rec, with $1_0 = 1$. Thus the distribution is defined recursively.
To check the result, I wrote an R program to simulate the process, and the values up to $n = 11$ are shown below:
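Rows: n. Columns: # of adjacent records. The last entry in row n is n_0.

| n | n-1 | n-2 | n-3 | n-4 | n-5 | n-6 | n-7 | n-8 | n-9 | n-10 | n-11 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1_0=1 | | | | | | | | | | |
| 2 | 1 | 2_0=1 | | | | | | | | | |
| 3 | 1 | 2 | 3_0=3 | | | | | | | | |
| 4 | 1 | 3 | 9 | 4_0=11 | | | | | | | |
| 5 | 1 | 4 | 18 | 44 | 5_0=53 | | | | | | |
| 6 | 1 | 5 | 30 | 110 | 265 | 6_0=309 | | | | | |
| 7 | 1 | 6 | 45 | 220 | 795 | 1854 | 7_0=2119 | | | | |
| 8 | 1 | 7 | 63 | 385 | 1855 | 6489 | 14833 | 8_0=16687 | | | |
| 9 | 1 | 8 | 84 | 616 | 3710 | 17304 | 59332 | 133496 | 9_0=148329 | | |
| 10 | 1 | 9 | 108 | 924 | 6678 | 38934 | 177996 | 600732 | 1334961 | 10_0=1468457 | |
| 11 | 1 | 10 | 135 | 1320 | 11130 | 77868 | 444990 | 2002440 | 6674805 | 14684570 | 11_0=16019531 |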
To get the probabilities one can simply divide the raw # by the number of permutations (n!).
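The counts can be reproduced by exhaustive enumeration for small $n$. The sketch below is written in Python rather than R and is not the author's program (which is not shown); it simply counts, for every permutation of $1, \dots, n$, the number of positions where two consecutive entries are both records. The function names are hypothetical.

```python
from itertools import permutations


def adjacent_records(perm):
    """Number of indices k such that perm[k] and perm[k+1] are both records."""
    count = 0
    best = None               # running maximum seen so far
    prev_is_record = False
    for x in perm:
        is_record = best is None or x > best
        if is_record:
            best = x
        if is_record and prev_is_record:
            count += 1
        prev_is_record = is_record
    return count


def adjacent_record_counts(n):
    """counts[k] = number of permutations of 1..n with exactly k adjacent records."""
    counts = [0] * n
    for perm in permutations(range(1, n + 1)):
        counts[adjacent_records(perm)] += 1
    return counts


for n in range(1, 8):
    # Printed from 0 adjacent records up to n-1, i.e. the table row read right to left
    print(n, adjacent_record_counts(n))
```

A useful sanity check on any row is that its entries sum to $n!$. Enumeration is feasible only for small $n$, since the loop visits all $n!$ permutations.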