Inclusion–Exclusion:
$$P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i} P(A_i) - \sum_{i<j} P(A_i \cap A_j) + \sum_{i<j<k} P(A_i \cap A_j \cap A_k) - \cdots + (-1)^{n-1} P\left(\bigcap_{i=1}^{n} A_i\right)$$
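As a quick sanity check of the formula, here is a minimal Python sketch for $n = 3$; the sample space and the three event sets are assumptions for the demo, not from the slides:

```python
from itertools import combinations

# Illustrative example: uniform probability on {0, ..., 9},
# with three arbitrary events (these sets are assumed for the demo).
omega = set(range(10))
events = [{0, 1, 2, 3}, {2, 3, 4, 5}, {3, 5, 7, 9}]

def prob(event):
    """P(E) under the uniform distribution on omega."""
    return len(event) / len(omega)

# Left-hand side: P(A1 U A2 U A3) computed directly.
lhs = prob(set.union(*events))

# Right-hand side: alternating sum over intersections of size k.
rhs = 0.0
for k in range(1, len(events) + 1):
    sign = (-1) ** (k - 1)
    for subset in combinations(events, k):
        rhs += sign * prob(set.intersection(*subset))

print(lhs, rhs)  # both print 0.8
```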
Week 1: Review of Probability
Conditional Probability & Independence:
The conditional probability of A given B is
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
Events A and B are independent if
$$P(A \cap B) = P(A)P(B)$$
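A small worked example (assumed for illustration, not from the slides): roll a fair die and let $A = \{\text{even}\}$, $B = \{\text{roll} \geq 4\}$. Then $P(A \cap B) = P(\{4, 6\}) = 1/3$ and $P(B) = 1/2$, so $P(A \mid B) = (1/3)/(1/2) = 2/3$. Since $P(A)P(B) = (1/2)(1/2) = 1/4 \neq 1/3 = P(A \cap B)$, the events A and B are not independent.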
Discrete Probability
A discrete distribution is given by a list of values $x_1, x_2, \cdots, x_n$ with associated probabilities $P(x_1), P(x_2), \cdots, P(x_n)$ satisfying
(i) $P(x_i) \geq 0$
(ii) $\sum_i P(x_i) = 1$
Examples: Bernoulli(p), Binomial(n, p), hypergeometric, and Poisson.

Continuous Probability
A continuous probability distribution is completely defined via the probability density function, $f$. The p.d.f. satisfies
(i) $f \geq 0$
(ii) $\int_{-\infty}^{\infty} f(x)\,dx = 1$
(iii) Probabilities are calculated by $P(a, b) = \int_a^b f(x)\,dx$
(iv) Note: for a continuous distribution, $P(x) = 0$ for every single point $x$, so $P(a, b) = P[a, b]$.
Examples: normal, uniform, exponential, and Pareto.
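A minimal numerical check of properties (ii) and (iii), using an assumed Exponential(1) density $f(x) = e^{-x}$ on $[0, \infty)$:

```python
import numpy as np
from scipy import integrate, stats

# Assumed p.d.f. for the demo: f(x) = exp(-x) on [0, inf).
f = lambda x: np.exp(-x)

total, _ = integrate.quad(f, 0, np.inf)  # property (ii): should be 1
p_ab, _ = integrate.quad(f, 0.5, 2.0)    # property (iii): P(0.5 < X < 2.0)

print(total)  # ~1.0
print(p_ab)   # ~0.4712
print(stats.expon.cdf(2.0) - stats.expon.cdf(0.5))  # same value via the CDF
```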
Week 1: Review of Probability

Continuous Probability on $\mathbb{R}^2$
Given a p.d.f. $f(x_1, x_2)$, for any $a < b$ and $c < d$,
$$P(a < X_1 < b,\ c < X_2 < d) = \int_c^d \int_a^b f(x_1, x_2)\,dx_1\,dx_2$$

Random Variables
$$P(X \in A) = P(X^{-1}(A)) = P(\{\omega : X(\omega) \in A\})$$
Typically, all that matters is the distribution of $X$; the underlying sample space is not very relevant.
$E(e^{itX})$ is called the characteristic function. It always exists and has nice properties.
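The rectangle probability above can be computed numerically. A sketch under an assumed joint density (two independent standard normals; this choice is not from the slides):

```python
import numpy as np
from scipy import integrate

# Assumed joint p.d.f.: X1, X2 independent standard normals.
f = lambda x1, x2: np.exp(-(x1**2 + x2**2) / 2) / (2 * np.pi)

# P(0 < X1 < 1, 0 < X2 < 2) as a double integral over [0,1] x [0,2].
# dblquad's integrand takes the inner variable first, so we pass (x2, x1).
p, _ = integrate.dblquad(lambda x2, x1: f(x1, x2), 0, 1, 0, 2)
print(p)  # ~0.3413 * 0.4772 = 0.1629
```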
Week 1: Review of Probability

Joint Distribution: Discrete
If $(X, Y)$ are two discrete random variables, their joint probabilities are described by the joint probability distribution
$$P(x_i, y_j) = P(X = x_i, Y = y_j)$$
for all $x_i, y_j$; $P(x_i, y_j) \geq 0$; $\sum_{i,j} P(x_i, y_j) = 1$
$P_X(x_i) = \sum_j P(x_i, y_j)$ is called the marginal distribution of $X$.
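Computing marginals from a joint table is a one-line sum along an axis. A sketch with an assumed joint distribution (the numbers are made up for the demo):

```python
import numpy as np

# Assumed joint distribution P(X = x_i, Y = y_j):
# rows indexed by x, columns by y.
joint = np.array([[0.10, 0.20, 0.10],
                  [0.05, 0.25, 0.30]])

assert np.isclose(joint.sum(), 1.0)  # probabilities sum to 1

p_x = joint.sum(axis=1)  # marginal of X: sum over y for each x
p_y = joint.sum(axis=0)  # marginal of Y: sum over x for each y
print(p_x)  # [0.4 0.6]
print(p_y)  # [0.15 0.45 0.4]
```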
Joint Distribution: Continuous
The joint density of $(X, Y)$ is given by the joint p.d.f. $f(x, y)$:
$$P((X, Y) \in A) = \iint_A f(x, y)\,dx\,dy$$
The marginal density is given by
$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy$$
The conditional density is
$$f(y \mid x) = \frac{f(x, y)}{f_X(x)}$$
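A sketch of the marginal and conditional densities for an assumed joint p.d.f. $f(x, y) = x + y$ on the unit square (a standard textbook density, not one named in the slides):

```python
from scipy import integrate

# Assumed joint p.d.f.: f(x, y) = x + y, valid on [0,1]^2
# (non-negative and integrates to 1).
f = lambda x, y: x + y

def f_X(x):
    """Marginal density of X: integrate y out of f(x, y)."""
    val, _ = integrate.quad(lambda y: f(x, y), 0, 1)
    return val  # equals x + 1/2 analytically

x0, y0 = 0.3, 0.6
print(f_X(x0))              # 0.8
print(f(x0, y0) / f_X(x0))  # conditional density f(y0 | x0) = 0.9/0.8 = 1.125
```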
Week 1: Review of Probability

Independence
$X$ and $Y$ are independent if and only if $f(x, y) = f_X(x) f_Y(y)$ for all $x, y$.
If $X$ and $Y$ are independent, then $\rho(X, Y) = 0$. The converse is not true.
If $X$ and $Y$ are independent, then $V(X + Y) = V(X) + V(Y)$.
If $X_1$ and $X_2$ are independent, then $M_{X_1 + X_2}(t) = M_{X_1}(t) \times M_{X_2}(t)$.
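The "converse is not true" claim deserves a concrete counterexample. A simulation sketch of the classic one (this particular pair is an assumed illustration, not from the slides): $X \sim N(0, 1)$ and $Y = X^2$ are uncorrelated yet clearly dependent.

```python
import numpy as np

rng = np.random.default_rng(0)

# X ~ N(0,1) and Y = X^2: correlation is 0, but Y is a
# deterministic function of X, so they are not independent.
x = rng.standard_normal(1_000_000)
y = x**2

print(np.corrcoef(x, y)[0, 1])  # ~0
```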
Inequalities
Markov's inequality: if $X \geq 0$, then
$$P(X \geq a) \leq \frac{E(X)}{a}$$
Chebyshev's inequality:
$$P(|X - E(X)| > a) \leq \frac{V(X)}{a^2}$$
Week 1: Limit Theorems

Key Concepts:
Notions of convergence
Law of Large Numbers (LLN)
Central Limit Theorem (CLT)
Notions of Convergence
Let $(\Omega, P)$ be a probability space. For each $\omega \in \Omega$, we define a sequence of random variables
$$X_1(\omega), X_2(\omega), \cdots$$
We know what it means to say a sequence of numbers $a_n$ converges to $a$. But $X_1(\omega), X_2(\omega), \cdots$ are functions, so a natural definition is the following.
$X_n \to X$ in distribution if
$$F_{X_n}(t) \to F_X(t) \text{ for all } t \text{ such that } F_X \text{ is continuous at } t$$
The restriction to points of continuity makes this definition look artificial (we will discuss this in a moment).
Note: if the limit $X$ has a continuous distribution, then $F_X$ is continuous and the convergence holds at every $t$.
Example: $F$ is not continuous at $0$; at all other $t$, $F_n(t) \to F(t)$.
Example: the MGF of Poisson($\lambda_n$) is $M_n(t) = \exp(\lambda_n(e^t - 1))$. Write $X_n$ in its standardized form as $Z_n$; we see that $M_{Z_n}(t)$ converges to $M_Z(t)$.
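A simulation sketch of the Poisson example: if $X_n \sim$ Poisson($\lambda_n$) with $\lambda_n \to \infty$, the standardized $Z_n = (X_n - \lambda_n)/\sqrt{\lambda_n}$ converges in distribution to $N(0, 1)$. The value $\lambda_n = 400$ below is an assumption for the demo.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

lam = 400  # assumed large lambda_n
x = rng.poisson(lam, size=200_000)
z = (x - lam) / np.sqrt(lam)  # standardized form Z_n

# Empirical CDF of Z_n vs. the standard normal CDF at a few points t.
for t in (-1.0, 0.0, 1.5):
    print(t, np.mean(z <= t), stats.norm.cdf(t))
```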
Week 1: Limit Theorems

Law of Large Numbers:

Theorem (WLLN)
Let $X_1, X_2, \cdots, X_n, \cdots$ be a sequence of independent random variables with $E(X_i) = \mu$ and $Var(X_i) = \sigma^2$. Let $\bar{X}_n = n^{-1} \sum_{i=1}^{n} X_i$. Then, for any $\epsilon > 0$,
$$P(|\bar{X}_n - \mu| > \epsilon) \to 0 \text{ as } n \to \infty$$

Proof.
$$E(\bar{X}_n) = \frac{1}{n} \sum_{i=1}^{n} E(X_i) = \mu$$
$$Var(\bar{X}_n) = \frac{1}{n^2} \sum_{i=1}^{n} Var(X_i) = \frac{\sigma^2}{n}$$
By Chebyshev's inequality,
$$P(|\bar{X}_n - \mu| > \epsilon) \leq \frac{Var(\bar{X}_n)}{\epsilon^2} = \frac{\sigma^2}{n\epsilon^2} \to 0 \text{ as } n \to \infty$$
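A quick simulation of the theorem; the Exponential(1) inputs (so $\mu = 1$, $\sigma^2 = 1$) and the tolerance $\epsilon = 0.1$ are assumed values for the demo:

```python
import numpy as np

rng = np.random.default_rng(3)

mu, eps = 1.0, 0.1
for n in (10, 100, 1000, 10_000):
    # 1000 independent replications of X_bar_n.
    xbar = rng.exponential(1.0, size=(1000, n)).mean(axis=1)
    # Estimated P(|X_bar_n - mu| > eps): shrinks toward 0 as n grows.
    print(n, np.mean(np.abs(xbar - mu) > eps))
```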
Monte Carlo Integration
Simulate $x_1, x_2, \cdots, x_n$ (large $n$) from the $U(0, 1)$ distribution. By WLLN,
$$\frac{1}{n} \sum_{i=1}^{n} f(x_i) \approx \int_0^1 f(x)\,dx$$
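The method in three lines; the integrand $f(x) = x^2$ (exact integral $1/3$) is an assumed example:

```python
import numpy as np

rng = np.random.default_rng(4)

f = lambda x: x**2            # assumed integrand; exact answer is 1/3
n = 1_000_000

x = rng.uniform(0.0, 1.0, size=n)
estimate = f(x).mean()        # (1/n) * sum of f(x_i)
print(estimate)               # ~0.3333
```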
Theorem (CLT)
Let $X_1, X_2, \cdots$ be a sequence of independent random variables with $E(X_i) = 0$ and $Var(X_i) = \sigma^2$, with common distribution $F$ (that is, $X_1, X_2, \cdots$ i.i.d. $\sim F$).
Assume that $F$ has MGF $M(t)$ defined in an interval around $0$.
Let $S_n = \sum_{i=1}^{n} X_i$. Then for all $-\infty < x < \infty$,
$$P\left(\frac{S_n}{\sigma\sqrt{n}} \leq x\right) \to \Phi(x) \text{ as } n \to \infty$$
Proof.
$M(t) = E(e^{tX})$, and $M_{S_n}(t) = E(e^{tS_n}) = [M(t)]^n$.
Let
$$Z_n = \frac{S_n}{\sigma\sqrt{n}}, \qquad M_{Z_n}(t) = E\left(e^{t S_n / (\sigma\sqrt{n})}\right) = M_{S_n}\!\left(\frac{t}{\sigma\sqrt{n}}\right)$$
Thus,
$$M_{Z_n}(t) = \left[M\left(\frac{t}{\sigma\sqrt{n}}\right)\right]^n$$
We want to show that, as $n \to \infty$, this goes to $e^{t^2/2}$.
We will make use of the following result:
$$\left(1 + \frac{b + a_n}{n}\right)^n \to e^b \text{ as } n \to \infty, \text{ whenever } a_n \to 0$$
So we also need to express $M_{Z_n}(t)$ in this form (how?).
Note $M(0) = 1$; since $E(X) = 0$, $M'(0) = 0$; and $E(X^2) = \sigma^2$, so $M''(0) = \sigma^2$.
By Taylor expansion,
$$M(s) = M(0) + sM'(0) + \frac{s^2}{2}M''(0) + \frac{s^3}{6}M'''(0) = 1 + \frac{s^2}{2}\sigma^2 + \frac{s^3}{6}M'''(0)$$

Week 1: Limit Theorems

Central Limit Theorem (CLT): Proof Continued
With $s = \frac{t}{\sigma\sqrt{n}}$,
$$M\left(\frac{t}{\sigma\sqrt{n}}\right) = 1 + \frac{t^2}{2n} + \epsilon_n, \quad \text{where } \epsilon_n = \frac{t^3}{6\sigma^3 n^{3/2}} M'''(0)$$
Show that $M\left(\frac{t}{\sigma\sqrt{n}}\right)$ can be written as
$$M\left(\frac{t}{\sigma\sqrt{n}}\right) = 1 + \frac{t^2/2 + a_n}{n}$$
with $a_n \to 0$. We then have
$$\left[M\left(\frac{t}{\sigma\sqrt{n}}\right)\right]^n = \left[1 + \frac{t^2/2 + a_n}{n}\right]^n \to e^{t^2/2}$$
as required.
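A simulation sketch of the theorem's conclusion; the centered uniform inputs $X_i = U(0, 1) - \tfrac{1}{2}$ (so $E(X_i) = 0$, $\sigma^2 = 1/12$) are an assumed choice:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

n, reps = 500, 20_000
sigma = np.sqrt(1 / 12)  # variance of U(0,1) - 1/2

x = rng.uniform(0, 1, size=(reps, n)) - 0.5
z = x.sum(axis=1) / (sigma * np.sqrt(n))  # S_n / (sigma * sqrt(n))

# Empirical CDF of the standardized sum vs. Phi(x).
for t in (-1.96, 0.0, 1.0):
    print(t, np.mean(z <= t), stats.norm.cdf(t))
```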
Week 1: Limit Theorems

Problems
17. Suppose that a measurement has mean $\mu$ and variance $\sigma^2 = 25$. Let $\bar{X}$ be the average of $n$ such independent measurements. How large should $n$ be so that $P\{|\bar{X} - \mu| < 1\} = .95$?
$$P\{|\bar{X} - \mu| < 1\} = P\left\{\left|\frac{\sqrt{n}(\bar{X} - \mu)}{5}\right| < \frac{\sqrt{n}}{5}\right\} \approx P\left\{|Z| < \frac{\sqrt{n}}{5}\right\} = .95$$
$$1.96 = \frac{\sqrt{n}}{5}$$
$$n = (1.96)^2 \times 5^2 = 96.04, \text{ so take } n = 97$$

A second problem considers weights with $E(X_i) = 15$ and $\sigma = 10$, and the total weight $T = \sum_{i=1}^{100} X_i$.
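The same computation of problem 17 in code, pulling the critical value from the normal quantile function rather than hard-coding 1.96:

```python
import numpy as np
from scipy import stats

# Solve P(|X_bar - mu| < 1) = 0.95 with sigma = 5:
# the CLT gives sqrt(n)/5 = z_{0.975}.
z = stats.norm.ppf(0.975)   # ~1.96
n = (z * 5) ** 2            # (1.96)^2 * 25
print(n, int(np.ceil(n)))   # 96.04 -> n = 97
```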
Week 1: Limit Theorems

Problems
$X_1, X_2, \cdots, X_n \sim U(0, 1)$, $M = \max(X_1, X_2, \cdots, X_n)$ (the formula below is the distribution of the maximum).
$$P\{1 - M < t\} = P\{M > 1 - t\} = 1 - (1 - t)^n$$
$$P\left\{1 - M < \frac{t}{n}\right\} = P\{n(1 - M) < t\} = 1 - \left(1 - \frac{t}{n}\right)^n$$
As $n \to \infty$,
$$1 - \left(1 - \frac{t}{n}\right)^n \to 1 - e^{-t}$$
so $n(1 - M) \to \exp(1)$ in distribution.
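A simulation check of this limit; the values of $n$ and the number of replications are assumptions for the demo:

```python
import numpy as np

rng = np.random.default_rng(6)

# n(1 - M), with M the maximum of n U(0,1) draws, should be
# approximately Exponential(1) for large n.
n, reps = 500, 20_000
u = rng.uniform(0, 1, size=(reps, n))
w = n * (1 - u.max(axis=1))

# Empirical CDF of n(1 - M) vs. the Exponential(1) CDF 1 - e^{-t}.
for t in (0.5, 1.0, 2.0):
    print(t, np.mean(w < t), 1 - np.exp(-t))
```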