Week 4: Expected value
Introduction
1 X ∼ Bernoulli(p):

E[X] = 0 · (1 − p) + 1 · p = p

2 X ∼ Uniform{1, 2, 3, 4, 5, 6}:

E[X] = 1 · 1/6 + 2 · 1/6 + 3 · 1/6 + 4 · 1/6 + 5 · 1/6 + 6 · 1/6 = 3.5

3 Value of a lottery ticket (in Rs.) is {200 w.p. 1/1000, 20 w.p. 27/1000, 0 w.p. 972/1000}:

E[X] = 200 · 1/1000 + 20 · 27/1000 + 0 · 972/1000 = Rs. 0.74

4 Change in temperature (in °C) is {−2 w.p. 1/5, −1 w.p. 1/5, 0 w.p. 1/2, 1 w.p. 1/10}:

E[X] = −2 · 1/5 − 1 · 1/5 + 0 · 1/2 + 1 · 1/10 = −0.5 °C
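The four expectations above can be computed mechanically from their PMFs. A minimal Python sketch (the `expectation` helper and the choice p = 1/4 for the Bernoulli case are illustrative, not from the slides):

```python
from fractions import Fraction as F

def expectation(pmf):
    """E[X] = sum of t * P(X = t) over a PMF given as {value: probability}."""
    return sum(t * p for t, p in pmf.items())

# 1. Bernoulli: an illustrative choice p = 1/4 (p is not fixed in the slides)
p = F(1, 4)
e_bernoulli = expectation({0: 1 - p, 1: p})

# 2. Fair die
e_die = expectation({t: F(1, 6) for t in range(1, 7)})

# 3. Lottery ticket value in Rs.
e_ticket = expectation({200: F(1, 1000), 20: F(27, 1000), 0: F(972, 1000)})

# 4. Change in temperature in degrees C
e_temp = expectation({-2: F(1, 5), -1: F(1, 5), 0: F(1, 2), 1: F(1, 10)})
```

Exact rational arithmetic (`Fraction`) avoids any floating-point rounding in these small sums.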
Week 4: Expected value 6 / 22
Example: Uniform{a, a + 1, . . . , b}
X ∼ Uniform{a, a + 1, . . . , b}

E[X] = a · 1/(b − a + 1) + (a + 1) · 1/(b − a + 1) + · · · + b · 1/(b − a + 1)

E[X] = (a + b)/2
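A quick check of the closed form (a + b)/2 against the term-by-term sum, as a sketch (the helper name is mine):

```python
from fractions import Fraction as F

def uniform_mean(a, b):
    """E[X] for X ~ Uniform{a, a+1, ..., b}: each value has probability 1/(b-a+1)."""
    n = b - a + 1
    return sum(F(t, n) for t in range(a, b + 1))

# Agrees with the closed form (a + b)/2 on a range of cases:
assert all(uniform_mean(a, b) == F(a + b, 2)
           for a in range(-5, 6) for b in range(a, a + 10))
```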
1 X ∼ Geometric(p):

E[X] = Σ_{t=1}^{∞} t (1 − p)^{t−1} p

2 X ∼ Poisson(λ):

E[X] = Σ_{t=0}^{∞} t e^{−λ} λ^t / t!

3 X ∼ Binomial(n, p):

E[X] = Σ_{t=0}^{n} t (n choose t) p^t (1 − p)^{n−t}
3 Exponential function: Σ_{t=0}^{∞} e^{−λ} λ^t / t! = 1

4 Binomial formula: Σ_{k=0}^{n} (n choose k) a^k b^{n−k} = (a + b)^n
E[X] = e^{−λ} (0 + 1 · λ/1! + 2 · λ²/2! + 3 · λ³/3! + · · ·)

     = λ e^{−λ} (1 + λ/1! + λ²/2! + · · ·) = λ
3 X ∼ Binomial(n, p): E[X] = Σ_{t=0}^{n} t (n choose t) p^t (1 − p)^{n−t}

E[X] = Σ_{t=1}^{n} [t · n!/(t!(n−t)!)] p^t (1−p)^{n−t} = np Σ_{t=1}^{n} [(n−1)!/((t−1)!(n−t)!)] p^{t−1} (1−p)^{n−t}

     = np (p + (1 − p))^{n−1} = np
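The three means can be sanity-checked numerically. This is a sketch, not part of the slides: the binomial sum is evaluated exactly, while the geometric and Poisson series are truncated (the truncation lengths are arbitrary but more than sufficient for the parameters tested):

```python
import math
from fractions import Fraction as F

def binomial_mean(n, p):
    """Direct sum: E[X] = sum of t * C(n, t) * p^t * (1 - p)^(n - t)."""
    return sum(t * math.comb(n, t) * p**t * (1 - p)**(n - t)
               for t in range(n + 1))

def geometric_mean(p, terms=10_000):
    """Truncated sum of t * (1 - p)^(t - 1) * p; converges to 1/p."""
    return sum(t * (1 - p)**(t - 1) * p for t in range(1, terms + 1))

def poisson_mean(lam, terms=100):
    """Truncated sum of t * e^(-lam) * lam^t / t!; converges to lam."""
    total, term = 0.0, math.exp(-lam)   # term = e^(-lam) * lam^t / t!
    for t in range(terms + 1):
        total += t * term
        term *= lam / (t + 1)
    return total
```

Updating the Poisson term multiplicatively (`term *= lam / (t + 1)`) sidesteps overflow from computing large factorials directly.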
PMF and expected value
E[X] = Σ_{t∈T_X} t P(X = t)

If c is a constant, E[c] = c · 1 = c.

If X takes only non-negative values, E[X] ≥ 0.
1 X ∼ Uniform{−2, −1, 0, 1, 2}, g(X) = X²:

E[g(X)] = 0 · 1/5 + 1 · 2/5 + 4 · 2/5 = 2

E[g(X)] = (−2)² · 1/5 + (−1)² · 1/5 + (0)² · 1/5 + (1)² · 1/5 + (2)² · 1/5 = 2

2 (X, Y) ∼ Uniform{(0, 0), (1, 0), (0, 1), (1, 1), (−1, 1), (1, −1)}

g(X, Y) = X² + XY + Y² ∼ {0 w.p. 1/6, 1 w.p. 4/6, 3 w.p. 1/6}

E[g(X, Y)] = 0 · 1/6 + 1 · 4/6 + 3 · 1/6 = 7/6

E[g(X, Y)] = 0 · 1/6 + (1) · 1/6 + (1) · 1/6 + (3) · 1/6 + (1) · 1/6 + (1) · 1/6 = 7/6
E[X + Y] = E[X] + E[Y]

E[aX + bY] = aE[X] + bE[Y]
One of the most useful properties of expected value!
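Linearity can be checked directly on any joint PMF. A small sketch (the joint PMF and the coefficients a, b are arbitrary choices of mine, purely for illustration):

```python
from fractions import Fraction as F

# An arbitrary small joint PMF over (x, y), for illustration only
joint = {(0, 0): F(1, 8), (0, 1): F(3, 8), (1, 0): F(1, 4), (1, 1): F(1, 4)}

def E(g):
    """E[g(X, Y)] computed directly from the joint PMF."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

a, b = 3, -2
lhs = E(lambda x, y: a * x + b * y)
rhs = a * E(lambda x, y: x) + b * E(lambda x, y: y)
assert lhs == rhs  # E[aX + bY] = a E[X] + b E[Y]
```

Note that nothing about independence was used: linearity holds for any joint PMF.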
Illustrative examples
1 (X, Y) ∼ Uniform{(0, 0), (1, 0), (0, 1), (1, 1), (−1, 1), (1, −1)}

g(X, Y) = X² + XY + Y² ∼ {0 w.p. 1/6, 1 w.p. 4/6, 3 w.p. 1/6}

E[g(X, Y)] = 7/6

E[g(X, Y)] = E[X² + XY + Y²] = E[X²] + E[XY] + E[Y²] = 7/6

(E[X²] = 4/6, E[XY] = −1/6, E[Y²] = 4/6)

2 X ∼ fX, Y ∼ fY, the joint PMF fXY is not given. Can you compute E[X + Y]? Yes!

E[X + Y] = E[X] + E[Y] = Σ_{t∈T_X} t fX(t) + Σ_{t∈T_Y} t fY(t)

Note: Expected values of the form E[g(X) + h(Y)] can be computed with marginal PMFs, and they do not depend on the joint PMF.
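The note can be illustrated on the joint PMF from example 1: computing E[X + Y] from the marginals alone agrees with the direct joint computation. A sketch:

```python
from fractions import Fraction as F

joint = {(0, 0): F(1, 6), (1, 0): F(1, 6), (0, 1): F(1, 6),
         (1, 1): F(1, 6), (-1, 1): F(1, 6), (1, -1): F(1, 6)}

# Marginal PMFs obtained by summing the joint PMF over the other variable
fX, fY = {}, {}
for (x, y), p in joint.items():
    fX[x] = fX.get(x, 0) + p
    fY[y] = fY.get(y, 0) + p

# E[X + Y] from the marginals alone ...
via_marginals = (sum(t * p for t, p in fX.items())
                 + sum(t * p for t, p in fY.items()))
# ... agrees with the direct computation from the joint PMF
via_joint = sum((x + y) * p for (x, y), p in joint.items())
```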
If X1, . . . , Xn are independent Bernoulli(p) random variables, then Y = X1 + · · · + Xn ∼ Binomial(n, p), and by linearity E[Y] = E[X1] + · · · + E[Xn] = np.
Variance
E[X] = E[Y] = E[Z] = 10

Var(Y) = (9 − 10)² · 1/2 + (11 − 10)² · 1/2 = 1, SD(Y) = 1

Var(Z) = (0 − 10)² · 1/2 + (20 − 10)² · 1/2 = 100, SD(Z) = 10
E[X] = 3.5

Var(X) = (1 − 3.5)² · 1/6 + (2 − 3.5)² · 1/6 + (3 − 3.5)² · 1/6
       + (4 − 3.5)² · 1/6 + (5 − 3.5)² · 1/6 + (6 − 3.5)² · 1/6
       = 35/12 = 2.916 . . .

SD(X) = 1.7078 . . .
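The die computation above, checked in exact arithmetic (a sketch):

```python
from fractions import Fraction as F

die = {t: F(1, 6) for t in range(1, 7)}
mu = sum(t * p for t, p in die.items())                # E[X] = 7/2
var = sum((t - mu) ** 2 * p for t, p in die.items())   # Var(X) = 35/12
```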
Var(X) = E[X²] − E[X]².

Proof:

Var(X) = E[(X − E[X])²] = E[X² − 2E[X]X + E[X]²]

By linearity of expected value,

Var(X) = E[X²] − 2E[E[X]X] + E[E[X]²]

Since E[X] is a constant,

Var(X) = E[X²] − 2E[X]E[X] + E[X]² = E[X²] − E[X]²
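The identity can be verified on a concrete PMF by computing the variance both ways. A sketch (the helper name `var_two_ways` is mine):

```python
from fractions import Fraction as F

def var_two_ways(pmf):
    """Return (E[(X - E[X])^2], E[X^2] - E[X]^2) for a PMF {value: probability}."""
    ex = sum(t * p for t, p in pmf.items())
    by_definition = sum((t - ex) ** 2 * p for t, p in pmf.items())
    by_identity = sum(t * t * p for t, p in pmf.items()) - ex ** 2
    return by_definition, by_identity

# Uniform{-2, -1, 0, 1, 2}: E[X] = 0 and E[X^2] = 2, so both routes give 2
by_def, by_id = var_two_ways({t: F(1, 5) for t in range(-2, 3)})
```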
E [X + Y ] = E [X ] + E [Y ].
Definition
A random variable X is said to be standardised if

E[X] = 0, Var(X) = 1.

Theorem
Let X be a random variable with SD(X) > 0. Then,

(X − E[X]) / SD(X)

is standardised.
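The theorem can be checked numerically, e.g. by standardising a fair die (a float-based sketch; tiny rounding errors are expected):

```python
import math

# Standardising a fair die: Z = (X - E[X]) / SD(X)
die = {t: 1 / 6 for t in range(1, 7)}
mu = sum(t * p for t, p in die.items())
sd = math.sqrt(sum((t - mu) ** 2 * p for t, p in die.items()))

# Z has the same probabilities, on shifted and rescaled outcomes
z_pmf = {(t - mu) / sd: prob for t, prob in die.items()}
ez = sum(z * prob for z, prob in z_pmf.items())                 # ~0
varz = sum((z - ez) ** 2 * prob for z, prob in z_pmf.items())   # ~1
```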
fXY   0     1          fXY   0     1
0     1/4   1/4        0     0     1/2
1     1/4   1/4        1     1/2   0

Same marginal PMFs in both cases. So, mean and variance are the same for X and Y.

However, the two cases are very different:
- X and Y are independent in one case
- Value of X determines the value of Y in the other
Cov(X, Y) is positive
- (X − E[X])(Y − E[Y]) tends to be positive
- When X is above/below its average, Y tends to be correspondingly above/below its average

Cov(X, Y) is negative
- (X − E[X])(Y − E[Y]) tends to be negative
- When X is above/below its average, Y tends to be correspondingly below/above its average

Cov(X, Y) = 0
- X and Y are said to be "uncorrelated"
Examples: positive and negative covariance
1 Cov(X, X) = Var(X)
- Proof: Cov(X, X) = E[(X − E[X])(X − E[X])]

2 Cov(X, Y) = E[XY] − E[X]E[Y]
- Proof: Cov(X, Y) = E[(X − E[X])(Y − E[Y])] = E[XY − XE[Y] − YE[X] + E[X]E[Y]]

3 Covariance is symmetric: Cov(X, Y) = Cov(Y, X)

4 Covariance is linear in each argument:
  1 Cov(X, aY + bZ) = a Cov(X, Y) + b Cov(X, Z)
  2 Cov(aX + bY, Z) = a Cov(X, Z) + b Cov(Y, Z)
X = −1 X =0 X =1
Y =0 1/8 1/4 1/8
Y =1 1/4 0 1/4
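For the joint PMF in the table above, the covariance works out to 0 even though X and Y are clearly dependent (Y = 1 forces X ≠ 0), so zero covariance does not imply independence. A sketch of the computation:

```python
from fractions import Fraction as F

# Joint PMF from the table: rows Y = 0, 1; columns X = -1, 0, 1
joint = {(-1, 0): F(1, 8), (0, 0): F(1, 4), (1, 0): F(1, 8),
         (-1, 1): F(1, 4), (0, 1): F(0, 1), (1, 1): F(1, 4)}

def E(g):
    return sum(g(x, y) * p for (x, y), p in joint.items())

cov = E(lambda x, y: x * y) - E(lambda x, y: x) * E(lambda x, y: y)

# Zero covariance, yet X and Y are not independent:
px0 = sum(p for (x, y), p in joint.items() if x == 0)   # P(X = 0) = 1/4
py1 = sum(p for (x, y), p in joint.items() if y == 1)   # P(Y = 1) = 1/2
# P(X = 0, Y = 1) = 0, which differs from P(X = 0) P(Y = 1) = 1/8
```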
ρ(X, Y) = Cov(X, Y) / (SD(X) SD(Y)).
X =0 X =1
Y =0 1/4 − x 1/4 + x
Y =1 1/4 + x 1/4 − x
X =c X =c +1
Y =d 1/4 − x 1/4 + x
Y =d +1 1/4 + x 1/4 − x
X ∼ Uniform{1, 2, . . . , 100}
- µ = 50.5, σ ≈ 28.9
- P(|X − µ| > σ) = 42/100
- P(|X − µ| > 2σ) = 0
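These tail counts can be verified by brute force over the 100 equally likely outcomes (a sketch):

```python
import math

xs = range(1, 101)
mu = sum(xs) / 100                                          # 50.5
sigma = math.sqrt(sum((x - mu) ** 2 for x in xs) / 100)     # ~28.87

beyond_1sigma = sum(1 for x in xs if abs(x - mu) > sigma)       # 42 outcomes
beyond_2sigma = sum(1 for x in xs if abs(x - mu) > 2 * sigma)   # none
```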
Proof:

µ = Σ_{t∈T_X} t P(X = t) = Σ_{t<c} t P(X = t) + Σ_{t≥c} t P(X = t)

Since the first sum is non-negative,

µ ≥ Σ_{t≥c} t P(X = t) ≥ Σ_{t≥c} c P(X = t) = c Σ_{t≥c} P(X = t) = c P(X ≥ c)
Proof:
Apply Markov's inequality to (X − µ)².

Other forms:
- P(|X − µ| ≥ c) ≤ σ²/c²
- P((X − µ)² ≥ k²σ²) ≤ 1/k²
- P(µ − kσ < X < µ + kσ) ≥ 1 − 1/k²
- P(X ≥ µ + kσ) + P(X ≤ µ − kσ) ≤ 1/k²
  - P(X ≥ µ + kσ) ≤ 1/k², P(X ≤ µ − kσ) ≤ 1/k²
2 X ∼ Geometric(1/4), µ = 4, σ ≈ 3.46
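For the Geometric(1/4) example, Chebyshev's bound can be compared against the actual tail probabilities. A sketch (the truncation at 10,000 terms is an arbitrary but safe choice of mine):

```python
import math

# X ~ Geometric(1/4): mu = 1/p = 4, sigma = sqrt((1 - p)/p^2) = sqrt(12)
p = 0.25
mu, sigma = 1 / p, math.sqrt((1 - p) / p**2)

def tail(k, terms=10_000):
    """P(|X - mu| >= k*sigma), summed term by term over the geometric PMF."""
    return sum((1 - p)**(t - 1) * p
               for t in range(1, terms + 1) if abs(t - mu) >= k * sigma)

# Chebyshev's bound P(|X - mu| >= k*sigma) <= 1/k^2 holds, and is quite loose here
for k in (1, 2, 3):
    assert tail(k) <= 1 / k**2
```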