
CONVERGENCE OF PROBABILITY DISTRIBUTIONS

GENERATING FUNCTIONS
Probability generating function (pgf): for $X$ discrete with values in $\{0, 1, 2, \dots\}$, $g_X(s) = E[s^X]$ for $0 \le s \le 1$.
Moment generating function (mgf): for $X$ a general real-valued random variable, $M_X(t) = E[e^{tX}]$.
Characteristic function: for $X$ a general real-valued random variable, $\phi_X(t) = E[e^{itX}] = E[\cos(tX)] + i\,E[\sin(tX)]$.

These are just slices of the same complex function $E[e^{zX}]$. The real slice (the mgf) is the biggest possible; its radius of convergence is $\sup\{t : M_X(t) < \infty\}$.
Principle: bigger functions give better tail bounds. The mgf is very useful, if it exists:
Chernoff bound: $P\{X > x\} \le e^{-tx} M_X(t)$ for all $t > 0$.
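As an illustrative sketch: for a standard normal $Z$ the mgf is $e^{t^2/2}$ (derived below), and minimising the Chernoff bound over $t > 0$ gives $t = x$. The comparison below of the bound against the exact tail uses illustrative names and values only.

```python
import numpy as np
from scipy.stats import norm

# Chernoff bound for a standard normal Z: P{Z > x} <= exp(-t*x) * M_Z(t),
# with M_Z(t) = exp(t^2 / 2); minimising over t > 0 gives t = x.
def chernoff_bound(x, t):
    return np.exp(-t * x + t**2 / 2)

for x in [1.0, 2.0, 3.0]:
    print(x, chernoff_bound(x, t=x), norm.sf(x))   # bound vs exact tail P{Z > x}
```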

PROPERTIES OF MOMENT GENERATING FUNCTIONS


- $M_X(0) = 1$.
- If $Y = aX + b$ for constants $a$ and $b$, then $M_Y(t) = e^{bt} M_X(at)$.
- The $k$-th derivative of $M_X(t)$ at $t = 0$ is $E[X^k]$.
- If $X$ and $Y$ are independent, then $M_{X+Y}(t) = M_X(t)\,M_Y(t)$.
- Uniqueness: if $M_X(t) = M_Y(t)$ for all $|t| < t_0$, for some positive $t_0$, then $X \stackrel{d}{=} Y$.
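A quick Monte Carlo sanity check of the independence property, using two independent Exp(1) variables (for which $M_X(t) = 1/(1-t)$ for $t < 1$); the sample size and the choice $t = 0.25$ are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
x = rng.exponential(1.0, size=n)   # X ~ Exp(1): M_X(t) = 1/(1 - t) for t < 1
y = rng.exponential(1.0, size=n)   # Y ~ Exp(1), independent of X

t = 0.25
print(np.mean(np.exp(t * (x + y))),                     # Monte Carlo M_{X+Y}(t)
      np.mean(np.exp(t * x)) * np.mean(np.exp(t * y)),  # M_X(t) * M_Y(t)
      (1 - t) ** -2)                                     # exact value, 16/9
```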

USING UNIQUENESS
$Z$ standard normal:
$$M_Z(t) = \int_{-\infty}^{\infty} e^{tz}\,\frac{1}{\sqrt{2\pi}}\,e^{-z^2/2}\,dz = e^{t^2/2}\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\,e^{-(z-t)^2/2}\,dz = e^{t^2/2}.$$

$X \sim N(\mu, \sigma^2)$, so $X = \mu + \sigma Z$ and
$$M_X(t) = e^{\mu t} M_Z(\sigma t) = e^{\mu t + \sigma^2 t^2/2}.$$

GAMMA DISTRIBUTION
$$f(x) = \frac{\lambda^r x^{r-1}}{\Gamma(r)}\,e^{-\lambda x} \quad \text{for } x > 0.$$
$$M_X(t) = \int_0^{\infty} e^{tx}\,\frac{\lambda^r x^{r-1}}{\Gamma(r)}\,e^{-\lambda x}\,dx = \int_0^{\infty} \frac{\lambda^r x^{r-1}}{\Gamma(r)}\,e^{-(\lambda - t)x}\,dx = \left(\frac{\lambda}{\lambda - t}\right)^{r} = \left(1 - \frac{t}{\lambda}\right)^{-r} \quad \text{for } t < \lambda.$$

χ² DISTRIBUTION

Let $Y$ have the distribution of $Z^2$, with $Z$ standard normal. Then
$$M_Y(t) = E[e^{tY}] = E[e^{tZ^2}] = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\,e^{tz^2}\,e^{-z^2/2}\,dz = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\,e^{-(1-2t)z^2/2}\,dz = (1 - 2t)^{-1/2} \quad \text{for } t < \tfrac12.$$

This is the same mgf as the Gamma($\tfrac12$, $\tfrac12$), so by uniqueness $Z^2 \sim \mathrm{Gamma}(\tfrac12, \tfrac12)$.

DECOMPOSING A NORMAL DISTRIBUTION


Is there a non-normal distribution whose convolution with itself is standard normal? That is, can we have
$$\int_{-\infty}^{\infty} f(x)\,f(z - x)\,dx = \frac{1}{\sqrt{2\pi}}\,e^{-z^2/2}?$$
In terms of mgfs,
$$M_{X_1 + X_2}(t) = M_X(t)^2 = e^{t^2/2}, \qquad \text{so} \qquad M_X(t) = e^{t^2/4}.$$
By uniqueness, $X \sim N(0, \tfrac12)$: any distribution with an mgf whose convolution with itself is standard normal must itself be normal.
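A numerical sanity check of this conclusion (grid range and spacing are arbitrary choices): convolving the $N(0, \tfrac12)$ density with itself on a grid reproduces the standard normal density up to discretisation error.

```python
import numpy as np
from scipy.stats import norm

# Convolve the N(0, 1/2) density with itself numerically and compare with
# the standard normal density.
z = np.linspace(-8, 8, 4001)                  # symmetric grid around 0
dx = z[1] - z[0]
f = norm.pdf(z, scale=np.sqrt(0.5))           # density of N(0, 1/2)

conv = np.convolve(f, f, mode="same") * dx    # numerical (f * f)(z)
print(np.max(np.abs(conv - norm.pdf(z))))     # small: discretisation error only
```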

WHY UNIQUENESS?
Equal mgf on an interval ⟹ equal analytic continuation ⟹ equal characteristic function.
Distributions are equal when $P(X \in A) = P(Y \in A)$ for all events $A$. Equivalent: distributions are equal when $E[f(X)] = E[f(Y)]$ for all bounded functions $f$.

Equivalent: distributions are equal when $E[f(X)] = E[f(Y)]$ for enough functions $f$. What's enough? Enough to approximate all functions. Examples:
- all bounded continuous functions;
- smooth functions with bounded derivative;
- the functions $e^{itx}$ (Fourier analysis);
- polynomials?


WHAT COULD GO WRONG?: THE MOMENT PROBLEM


Suppose we have random variables $X$ and $Y$ with
$$E[X^k] = E[Y^k] \quad \text{for all } k.$$
Does it follow that $X \stackrel{d}{=} Y$?

"Proof" sketch: moments determine the moment generating function; moment generating functions determine the distribution; therefore, equal moments determine equal distributions. FALSE!


In fact, equality of moments does not determine equality of distributions. Let $X$ have the standard log-normal distribution,
$$f_X(x) = (2\pi)^{-1/2}\,x^{-1}\,e^{-\frac{1}{2}\ln^2 x} \quad \text{for } x > 0,$$
and define $Y$ to have density
$$f_Y(y) = f_X(y)\,\bigl(1 + \sin(2\pi \ln y)\bigr) \quad \text{for } y > 0.$$
For any integer $k \ge 0$ we have (substituting $\ln y = t = s + k$)
$$\int_0^{\infty} y^k f_X(y)\sin(2\pi \ln y)\,dy = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-t^2/2 + kt}\sin(2\pi t)\,dt = \frac{e^{k^2/2}}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-s^2/2}\sin(2\pi s)\,ds = 0 \quad \text{by symmetry,}$$
so $E[Y^k] = E[X^k]$ for every $k$, even though $f_Y \ne f_X$.
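A numerical sketch of this computation (the integration range is an arbitrary truncation): working in the variable $t = \ln y$, the $k$-th moments of $f_X$ and $f_Y$ both agree with $e^{k^2/2}$.

```python
import numpy as np
from scipy import integrate

# Moments of the log-normal f_X and the perturbed f_Y(y) = f_X(y)(1 + sin(2*pi*ln y)),
# computed after the substitution t = ln y: the k-th moment becomes
# int exp(k*t) * phi(t) * (1 + sin(2*pi*t)) dt, with phi the N(0,1) density.
phi = lambda t: np.exp(-t**2 / 2) / np.sqrt(2 * np.pi)

for k in range(4):
    m_X, _ = integrate.quad(lambda t: np.exp(k * t) * phi(t), -20, 20, limit=200)
    m_Y, _ = integrate.quad(lambda t: np.exp(k * t) * phi(t) * (1 + np.sin(2 * np.pi * t)),
                            -20, 20, limit=200)
    print(k, m_X, m_Y, np.exp(k**2 / 2))   # the three columns agree
```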


So equality of moments does not determine equality of distributions: $X$ and $Y$ above have the same moments (all finite), but different distributions.

What's wrong with our proof? The tails of the density fall off faster than any polynomial (hence all moments are finite) but slower than any exponential: $M_X(t) = \infty$ for all $t > 0$, so we can't apply the uniqueness theorem.

Theorem: the distribution is determined by its moments when $E[X^k]^{1/k}$ doesn't increase faster than linearly in $k$.

CONVERGENCE
Fundamentally two kinds of convergence in probability:

Strong convergence: $X_1, X_2, \dots$ are defined on the same space, and
$$X = \lim_{n \to \infty} X_n \text{ exists with probability 1,}$$
that is, $X$ is a r.v. defined on the same space and
$$P\{\forall \varepsilon > 0\ \exists N \text{ s.t. } \forall n > N,\ |X_n - X| < \varepsilon\} = 1.$$

Weak convergence: convergence of probabilities, $X_n \to_d X$. The variables need not be on the same space; we need a notion of when the probabilities are close.

STRONG CONVERGENCE
Example: $\varepsilon_1, \varepsilon_2, \dots$ i.i.d. with $P(\varepsilon_i = 1) = P(\varepsilon_i = 0) = \tfrac12$, and
$$X_n = \sum_{i=1}^{n} 2^{-i}\varepsilon_i.$$
This is always a Cauchy sequence, so the limit $X$ exists; $X$ has the uniform distribution on $(0,1)$. What if $\varepsilon_i$ has a different distribution? Then $X$ has a distribution on $(0,1)$ that is neither discrete nor continuous.
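An illustrative simulation (truncation at 50 bits and the sample size are arbitrary choices): with fair bits, the empirical cdf of $X_{50} = \sum_{i=1}^{50} 2^{-i}\varepsilon_i$ matches the uniform cdf.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate X_50 = sum_{i=1}^{50} 2^{-i} * eps_i with fair coin flips eps_i;
# the limit X should be uniform on (0, 1).
n_terms, n_samples = 50, 100_000
eps = rng.integers(0, 2, size=(n_samples, n_terms))
weights = 2.0 ** -np.arange(1, n_terms + 1)
x = eps @ weights

for q in [0.1, 0.25, 0.5, 0.9]:
    print(q, np.mean(x <= q))   # empirical cdf at q; should be close to q
```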

WEAK CONVERGENCE
First idea: random variables $X$ and $X'$ are close if $|P(X \in A) - P(X' \in A)|$ is small for all events $A$. This works for discrete $X, X'$, but is problematic for others.

Example: take
$$X_n = \sum_{i=1}^{n} 2^{-i}\varepsilon_i, \qquad X = \sum_{i=1}^{\infty} 2^{-i}\varepsilon_i$$
from before, and let $A = \{k/2^n : n = 1, 2, \dots;\ k = 0, 1, \dots, 2^n - 1\}$. Then
$$P\{X_n \in A\} = 1 \text{ for all } n, \qquad \text{but} \qquad P\{X \in A\} = 0.$$
You'd want strong convergence ⟹ weak convergence.

Idea: weaker convergence. Restrict the relevant sets. We have a class of functions $\mathcal{F}$ and
$$d(X, X') = \sup_{f \in \mathcal{F}} \bigl|E[f(X)] - E[f(X')]\bigr|.$$
(Metrics aren't really part of this course.) $X_n \to_d X$ if
$$\lim_{n \to \infty} E[f(X_n)] = E[f(X)]$$
for all bounded continuous $f$. Equivalent:
- $P(X_n \le x) \to P(X \le x)$ for every $x$ where the limit cdf is continuous;
- $E[f(X_n)] \to E[f(X)]$ for smooth $f$ with $|f^{(k)}| \le C$;
- $E[\exp(itX_n)] \to E[\exp(itX)]$ for every $t$;
- polynomials?

CONTINUITY THEOREM
- For probability generating functions
- For moment generating functions
- For characteristic functions

You can prove convergence with any of these collections of functions. You may want to know how far off a particular function may be.


DISCRETE CONVERGENCE
Let $X_n$ have the Binomial$(n, \lambda/n)$ distribution. Then
$$P\{X_n = k\} = \binom{n}{k}\left(\frac{\lambda}{n}\right)^{k}\left(1 - \frac{\lambda}{n}\right)^{n-k} = \frac{\lambda^k}{k!}\cdot\underbrace{\frac{n(n-1)\cdots(n-k+1)}{n^k}}_{\text{converges to } 1}\cdot\underbrace{\left(1 - \frac{\lambda}{n}\right)^{n-k}}_{\text{converges to } e^{-\lambda}}$$
so $P\{X_n = k\} \to e^{-\lambda}\lambda^k/k!$, the Poisson$(\lambda)$ probability.


DISCRETE CONVERGENCE
Let $X_n$ have the Binomial$(n, \lambda/n)$ distribution. Its pgf is
$$g_n(s) = \left(1 + \frac{\lambda(s-1)}{n}\right)^{n} \longrightarrow e^{\lambda(s-1)},$$
which we recognise as the pgf of Poisson$(\lambda)$.
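An illustrative numerical check (the values of $\lambda$, $n$ and the range of $k$ are arbitrary): the Binomial$(n, \lambda/n)$ pmf approaches the Poisson$(\lambda)$ pmf as $n$ grows.

```python
import numpy as np
from scipy.stats import binom, poisson

# Compare the Binomial(n, lam/n) pmf with the Poisson(lam) pmf at k = 0..10.
lam = 3.0
ks = np.arange(11)
for n in [10, 100, 10_000]:
    diff = np.max(np.abs(binom.pmf(ks, n, lam / n) - poisson.pmf(ks, lam)))
    print(n, diff)   # the maximum difference shrinks as n increases
```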


CONVERGENCE IN PROBABILITY
The simplest form of convergence in probability is when the limit is a $\delta$-distribution, that is, a deterministic point. $X_1, X_2, \dots$ converge in probability to $x$ ($X_n \to_p x$) if any of the following EQUIVALENT conditions holds:

(i) $X_n \to_d \delta_x$ (the distribution with $P(\{x\}) = 1$);
(ii) $E[f(X_n)] \to f(x)$ for bounded continuous $f$;
(iii) $\forall \varepsilon > 0$, $\lim_{n \to \infty} P\{|X_n - x| > \varepsilon\} = 0$;
(iv) $\forall \varepsilon, \delta > 0$, $P\{|X_n - x| > \varepsilon\} < \delta$ for $n$ sufficiently large.

CONVERGENCE IN PROBABILITY
(i) $X_n \to_d \delta_x$ (the distribution with $P(\{x\}) = 1$)
⟺ (ii) $E[f(X_n)] \to f(x)$ for bounded continuous $f$

This is just the definition of convergence in distribution, since for the limit $\delta_x$ we have $E[f(X)] = f(x)$.


CONVERGENCE IN PROBABILITY
(ii) $E[f(X_n)] \to f(x)$ for bounded continuous $f$
⟹ (iii) $\forall \varepsilon > 0$, $\lim_{n \to \infty} P\{|X_n - x| > \varepsilon\} = 0$

Proof: let $f(y) = \min\{1, |x - y|/\varepsilon\}$. Then $E[f(X_n)] \ge P\{|X_n - x| > \varepsilon\}$, and $f(x) = 0$.

[Figure: plot of $f(y) = \min\{1, |x - y|/\varepsilon\}$, equal to 0 at $y = x$ and to 1 for $|y - x| \ge \varepsilon$.]


CONVERGENCE IN PROBABILITY
(iii) $\forall \varepsilon > 0$, $\lim_{n \to \infty} P\{|X_n - x| > \varepsilon\} = 0$
⟺ (iv) $\forall \varepsilon, \delta > 0$, $P\{|X_n - x| > \varepsilon\} < \delta$ for $n$ sufficiently large.

This is just the definition of the limit.


CONVERGENCE IN PROBABILITY
(iv) $\forall \varepsilon, \delta > 0$, $P\{|X_n - x| > \varepsilon\} < \delta$ for $n$ sufficiently large
⟹ (ii) $E[f(X_n)] \to f(x)$ for bounded continuous $f$

Proof: let $f$ be any continuous function with $\sup |f| = 1$. Choose any $\delta > 0$. By the definition of continuity, we may find $\varepsilon > 0$ s.t. $|f(y) - f(x)| < \delta$ when $|x - y| < \varepsilon$. For $n$ sufficiently large,
$$\bigl|E[f(X_n)] - f(x)\bigr| = \Bigl|E\bigl[(f(X_n) - f(x))\,1_{\{|X_n - x| < \varepsilon\}} + (f(X_n) - f(x))\,1_{\{|X_n - x| \ge \varepsilon\}}\bigr]\Bigr| \le \delta + E\bigl[2 \cdot 1_{\{|X_n - x| \ge \varepsilon\}}\bigr] \le \delta + 2\delta.$$
Since $\delta$ may be chosen arbitrarily small, this completes the proof.


CONVERGENCE IN PROBABILITY
(ii) $E[f(X_n)] \to f(x)$ for bounded continuous $f$
⟺ (iii) $\forall \varepsilon > 0$, $\lim_{n \to \infty} P\{|X_n - x| > \varepsilon\} = 0$
⟺ (iv) $\forall \varepsilon, \delta > 0$, $P\{|X_n - x| > \varepsilon\} < \delta$ for $n$ sufficiently large.


WEAK LAW OF LARGE NUMBERS


Theorem: let $X_1, X_2, \dots$ be i.i.d. with $E[X_i] = \mu$ and $E[X_i^2] < \infty$. Let $S_n = n^{-1}(X_1 + \cdots + X_n)$. Then $S_n \to_p \mu$.

Proof: let $\sigma^2 = \mathrm{Var}(X_i) < \infty$. We know that $\mathrm{Var}(S_n) = n^{-1}\sigma^2$ and $E[S_n] = \mu$. By Chebyshev,
$$P\{|S_n - \mu| > \varepsilon\} \le \frac{\sigma^2}{n\varepsilon^2}.$$
Regardless of $\varepsilon$, this $\to 0$ as $n \to \infty$, which completes the proof.
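An illustrative simulation with Exp(1) variables ($\mu = \sigma^2 = 1$; the choice of $\varepsilon$ and the number of repetitions are arbitrary): the empirical frequency of a deviation larger than $\varepsilon$ stays below the Chebyshev bound and tends to 0.

```python
import numpy as np

rng = np.random.default_rng(2)

# Weak law for Exp(1) samples: frequency of |S_n - mu| > eps versus the
# Chebyshev bound sigma^2 / (n * eps^2), with mu = sigma^2 = 1.
eps, reps = 0.1, 10_000
for n in [10, 100, 1_000]:
    means = rng.exponential(1.0, size=(reps, n)).mean(axis=1)
    freq = np.mean(np.abs(means - 1.0) > eps)   # empirical P{|S_n - mu| > eps}
    print(n, freq, 1.0 / (n * eps**2))          # frequency vs Chebyshev bound
```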

