
Subsection 1

Introduction



Summarizing data
How do we provide summary information about a large set of data?
- Minimum value, maximum value, average value, median value

The average value of a set of data is very useful in practice.
- Average marks in a class exam
- Run rate and batting average in cricket (and similar numbers in many other sports)

What does the average represent?
- One number to represent a large set of numbers
- Useful in comparisons and in many other scenarios

In probability theory, the expected value of a random variable is an important theoretical construct that represents the average value.
- It is used to express various types of summary information.


Example: Casino math
Place bets on the sum seen when two dice are rolled.
Bets and returns
- Under 7: get the money back
- Over 7: get the money back
- Equal to 7: get 4 times the money back

Suppose you bet 1 unit of money.

Bet        Gain   Prob
Under 7     0     5/12
           −1     7/12
Over 7      0     5/12
           −1     7/12
Equal 7    +3     1/6
           −1     5/6

How did the casino decide on the above returns? How should you bet?
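These gains and probabilities are easy to sanity-check by simulation. Below is a minimal Python sketch (the bet names and the 1-unit stake are just the setup above); the printed averages should approach −7/12, −7/12 and −1/3.

```python
import random

def gain(bet: str) -> int:
    """Net gain from betting 1 unit on the sum of two fair dice."""
    s = random.randint(1, 6) + random.randint(1, 6)
    if bet == "under":
        return 0 if s < 7 else -1   # money back on a win, stake lost otherwise
    if bet == "over":
        return 0 if s > 7 else -1
    return 3 if s == 7 else -1      # 4x the money back = net gain of +3

n = 100_000
for bet in ("under", "over", "equal"):
    avg = sum(gain(bet) for _ in range(n)) / n
    print(bet, round(avg, 3))       # ≈ -0.583, -0.583, -0.333
```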
Expected value of a discrete random variable

Definition (Expected value)
Suppose X is a discrete random variable with range T_X and PMF f_X. The expected value of X, denoted E[X], is defined as

E[X] = Σ_{t∈T_X} t f_X(t),

assuming the above sum exists.

Other names: mean of X, average value of X
- E[X] may or may not belong to the range of X
- E[X] has the same units as X


Examples: Easy summation given PMF
1. X ∼ Bernoulli(p)

   E[X] = 0 · (1 − p) + 1 · p = p

2. X ∼ Uniform{1, 2, 3, 4, 5, 6}

   E[X] = 1 · 1/6 + 2 · 1/6 + 3 · 1/6 + 4 · 1/6 + 5 · 1/6 + 6 · 1/6 = 3.5

3. Value of a lottery ticket (in Rs.): 200 w.p. 1/1000, 20 w.p. 27/1000, 0 w.p. 972/1000

   E[X] = 200 · (1/1000) + 20 · (27/1000) + 0 · (972/1000) = Rs. 0.74

4. Change in temperature (in °C): −2 w.p. 1/5, −1 w.p. 1/5, 0 w.p. 1/2, 1 w.p. 1/10

   E[X] = −2 · (1/5) − 1 · (1/5) + 0 · (1/2) + 1 · (1/10) = −0.5 °C
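Each of these is a finite sum over the range, so they are easy to verify with a few lines of code. A minimal sketch, with the PMFs written as dictionaries (the helper name expected_value is our convention, not from the slides):

```python
def expected_value(pmf: dict) -> float:
    """E[X] = sum of t * f_X(t) over the range of X."""
    return sum(t * p for t, p in pmf.items())

die = {t: 1/6 for t in range(1, 7)}
lottery = {200: 1/1000, 20: 27/1000, 0: 972/1000}
temperature = {-2: 1/5, -1: 1/5, 0: 1/2, 1: 1/10}

print(expected_value(die))          # 3.5
print(expected_value(lottery))      # 0.74
print(expected_value(temperature))  # -0.5
```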
Example: Uniform{a, a + 1, . . . , b}

X ∼ Uniform{a, a + 1, . . . , b}

E[X] = a · 1/(b − a + 1) + (a + 1) · 1/(b − a + 1) + · · · + b · 1/(b − a + 1)

How to simplify the sum?

Identity: a + (a + 1) + · · · + b = (b − a + 1) · (a + b)/2

Therefore,

E[X] = (a + b)/2
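A quick numeric check of the identity and the resulting mean (a and b below are arbitrary choices):

```python
a, b = 3, 11
values = list(range(a, b + 1))    # the range of Uniform{a, ..., b}
print(sum(values) / len(values))  # 7.0
print((a + b) / 2)                # 7.0, matching the identity
```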


Examples: More involved summation

1. X ∼ Geometric(p)

   E[X] = Σ_{t=1}^∞ t (1 − p)^{t−1} p

2. X ∼ Poisson(λ)

   E[X] = Σ_{t=0}^∞ t e^{−λ} λ^t / t!

3. X ∼ Binomial(n, p)

   E[X] = Σ_{t=0}^n t C(n, t) p^t (1 − p)^{n−t}


How to simplify the sum?
1. Difference Equation (DE): a_{t+1} − r a_t = b_t (r ≠ 1)

   Σ_{t=1}^n a_t = (a_1 − r a_n)/(1 − r) + (1/(1 − r)) Σ_{t=1}^{n−1} b_t

2. Geometric Progression (GP): a_{t+1} − r a_t = 0 (r ≠ 1)

   Σ_{t=1}^n a_t = (a_1 − r a_n)/(1 − r), which tends to a_1/(1 − r) as n → ∞ when |r| < 1

3. Exponential function:

   Σ_{t=0}^∞ e^{−λ} λ^t / t! = 1

4. Binomial formula:

   Σ_{k=0}^n C(n, k) a^k b^{n−k} = (a + b)^n


Examples: More involved summation
1. X ∼ Geometric(p) (DE with a_1 = p, r = 1 − p, b_t = r^t p)

   E[X] = Σ_{t=1}^∞ t (1 − p)^{t−1} p = 1/p

2. X ∼ Poisson(λ): E[X] = Σ_{t=0}^∞ t e^{−λ} λ^t / t!

   E[X] = e^{−λ} (0 · 1 + 1 · λ/1! + 2 · λ²/2! + 3 · λ³/3! + · · ·)
        = λ e^{−λ} (1 + λ/1! + λ²/2! + · · ·) = λ

3. X ∼ Binomial(n, p): E[X] = Σ_{t=0}^n t C(n, t) p^t (1 − p)^{n−t}

   E[X] = Σ_{t=1}^n (t · n!)/(t! (n − t)!) p^t (1 − p)^{n−t}
        = np Σ_{t=1}^n ((n − 1)!)/((t − 1)! (n − t)!) p^{t−1} (1 − p)^{n−t}
        = np (p + (1 − p))^{n−1} = np
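All three closed forms can be checked numerically by truncating the infinite sums (the parameter values and truncation points below are arbitrary but sufficient for good accuracy):

```python
from math import comb, exp, factorial

p, lam, n = 0.3, 2.5, 10

geometric = sum(t * (1 - p)**(t - 1) * p for t in range(1, 500))
poisson = sum(t * exp(-lam) * lam**t / factorial(t) for t in range(101))
binomial = sum(t * comb(n, t) * p**t * (1 - p)**(n - t) for t in range(n + 1))

print(geometric, 1 / p)   # both ≈ 3.333
print(poisson, lam)       # both ≈ 2.5
print(binomial, n * p)    # both 3.0
```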
PMF and expected value
[Figure-only slide.]


Subsection 2

Properties of expected value



Example: Casino math

Bet        Gain   Prob
Under 7     0     5/12
           −1     7/12
Over 7      0     5/12
           −1     7/12
Equal 7    +3     1/6
           −1     5/6

Suppose you bet 1 unit of money with probability p1 each on "Under 7" and "Over 7", and with probability 1 − 2p1 on "Equal to 7". What is the expected gain?
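A sketch of the computation using linearity: condition on which bet is placed and average the three per-bet expected gains (all values come from the table above):

```python
# Expected gain of each individual 1-unit bet, from the table
e_under = 0 * (5/12) + (-1) * (7/12)   # -7/12
e_over  = 0 * (5/12) + (-1) * (7/12)   # -7/12
e_equal = 3 * (1/6)  + (-1) * (5/6)    # -1/3

def expected_gain(p1: float) -> float:
    """Bet Under/Over w.p. p1 each and Equal w.p. 1 - 2*p1."""
    return p1 * e_under + p1 * e_over + (1 - 2 * p1) * e_equal

for p1 in (0.0, 0.25, 0.5):
    print(p1, expected_gain(p1))   # -1/3 - p1/2: every strategy loses on average
```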


Constant random variable and positive random variable

E[X] = Σ_{t∈T_X} t P(X = t)

1. Consider a constant c as a random variable X with P(X = c) = 1. Then

   E[c] = c · 1 = c

2. Suppose X takes only non-negative values, i.e. P(X ≥ 0) = 1. Then

   E[X] ≥ 0


Expected value of a function of random variables

Theorem (Expected value of a function)
Suppose X_1, . . . , X_n have joint PMF f_{X_1···X_n}, with the range of X_i denoted T_{X_i}. Let g : T_{X_1} × · · · × T_{X_n} → R be a function, and let Y = g(X_1, . . . , X_n) have range T_Y and PMF f_Y. Then,

E[g(X_1, . . . , X_n)] = Σ_{t∈T_Y} t f_Y(t) = Σ_{t_i∈T_{X_i}} g(t_1, . . . , t_n) f_{X_1···X_n}(t_1, . . . , t_n).

We have seen how to find f_Y, the PMF of a function of multiple random variables.
The theorem says: to find E[Y], you do not always need f_Y. The joint PMF of X_1, . . . , X_n can be used directly.
A simple-sounding property with far-reaching consequences!


Examples for illustration
1. X ∼ {−2, −1, 0, 1, 2}, each w.p. 1/5; g(X) = X² ∼ {0 w.p. 1/5, 1 w.p. 2/5, 4 w.p. 2/5}

   E[g(X)] = 0 · 1/5 + 1 · 2/5 + 4 · 2/5 = 2
   E[g(X)] = (−2)² · 1/5 + (−1)² · 1/5 + 0² · 1/5 + 1² · 1/5 + 2² · 1/5 = 2

2. (X, Y) ∼ Uniform{(0, 0), (1, 0), (0, 1), (1, 1), (−1, 1), (1, −1)};
   g(X, Y) = X² + XY + Y² ∼ {0 w.p. 1/6, 1 w.p. 4/6, 3 w.p. 1/6}

   E[g(X, Y)] = 0 · 1/6 + 1 · 4/6 + 3 · 1/6 = 7/6
   E[g(X, Y)] = 0 · 1/6 + 1 · 1/6 + 1 · 1/6 + 3 · 1/6 + 1 · 1/6 + 1 · 1/6 = 7/6
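A sketch computing E[g(X, Y)] of example 2 both ways, via the PMF of g(X, Y) and via the joint PMF directly:

```python
from collections import Counter

points = [(0, 0), (1, 0), (0, 1), (1, 1), (-1, 1), (1, -1)]  # uniform, w.p. 1/6 each
g = lambda x, y: x * x + x * y + y * y

# Way 1: first build the PMF of Y = g(X, Y), then sum t * f_Y(t)
f_Y = Counter(g(x, y) for x, y in points)
e1 = sum(t * count / len(points) for t, count in f_Y.items())

# Way 2: sum g over the joint PMF directly, without finding f_Y
e2 = sum(g(x, y) / len(points) for x, y in points)

print(e1, e2)   # both 7/6 ≈ 1.1667
```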


Linearity of expected value
1. E[cX] = cE[X] for a random variable X and a constant c
   Proof:

   E[cX] = Σ_{t∈T_X} ct f_X(t) = c Σ_{t∈T_X} t f_X(t) = cE[X]

2. E[X + Y] = E[X] + E[Y] for any two random variables X, Y
   Proof:

   E[X + Y] = Σ_{t_1∈T_X, t_2∈T_Y} (t_1 + t_2) f_{XY}(t_1, t_2)
            = Σ_{t_1, t_2} t_1 f_{XY}(t_1, t_2) + Σ_{t_1, t_2} t_2 f_{XY}(t_1, t_2)
            = E[X] + E[Y]

Combining the two: E[aX + bY] = aE[X] + bE[Y]
One of the most useful properties of expected value!
Illustrative examples

1. (X, Y) ∼ Uniform{(0, 0), (1, 0), (0, 1), (1, 1), (−1, 1), (1, −1)};
   g(X, Y) = X² + XY + Y² ∼ {0 w.p. 1/6, 1 w.p. 4/6, 3 w.p. 1/6}
   E[g(X, Y)] = 7/6
   E[g(X, Y)] = E[X² + XY + Y²] = E[X²] + E[XY] + E[Y²] = 7/6
   (E[X²] = 4/6, E[XY] = −1/6, E[Y²] = 4/6)
2. X ∼ f_X, Y ∼ f_Y, and the joint PMF f_{XY} is not given. Can you compute E[X + Y]? Yes!
   E[X + Y] = E[X] + E[Y] = Σ_{t∈T_X} t f_X(t) + Σ_{t∈T_Y} t f_Y(t)

Note: Expected values of the form E[g(X) + h(Y)] can be computed from the marginal PMFs alone; they do not depend on the joint PMF.


Problem: Throw a fair die twice
What is the expected value of the sum of the two numbers seen?

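By linearity, the answer is E[X_1] + E[X_2] = 3.5 + 3.5 = 7, where X_1, X_2 are the two throws. A simulation sketch to confirm:

```python
import random

n = 100_000
total = sum(random.randint(1, 6) + random.randint(1, 6) for _ in range(n))
print(total / n)   # ≈ 7.0 = 3.5 + 3.5, by linearity
```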


Problem: Expected value of Binomial(n, p)
Suppose Y ∼ Binomial(n, p).

E[Y] = Σ_{k=0}^n k C(n, k) p^k (1 − p)^{n−k} = np

Such summations may be difficult to simplify.

Alternative method using linearity of expected value
Let X_1, . . . , X_n be i.i.d. Bernoulli(p). Then, E[X_i] = p and

Y = X_1 + · · · + X_n ∼ Binomial(n, p)

So, E[Y] = E[X_1] + · · · + E[X_n] = np
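A simulation sketch of this decomposition (n, p and the trial count are arbitrary choices):

```python
import random

n, p, trials = 20, 0.3, 50_000

def binomial_sample() -> int:
    """Sum of n i.i.d. Bernoulli(p) random variables."""
    return sum(random.random() < p for _ in range(n))

average = sum(binomial_sample() for _ in range(trials)) / trials
print(average, n * p)   # both ≈ 6.0
```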


Zero-mean random variable
A random variable X with E[X] = 0 is said to be a zero-mean random variable.

Translation of a random variable
X + c, where c is a constant, is a "translated" version of X.
- The range of X + c is {t + c : t ∈ T_X}, a translated version of T_X, the range of X
- P(X + c = t + c) = P(X = t), so the PMF is "translated" as well

Translation by expected value
- Important: E[X] is a constant.
- Y = X − E[X] is a translated version of X, and E[Y] = 0.
- So, X − E[X] is a zero-mean random variable.


Problem: Balls in bins
Suppose 10 balls are thrown uniformly at random into 3 bins. What is the expected number of empty bins?
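One standard approach (our sketch; the slide leaves the solution open) is linearity with indicators: writing I_j = 1 when bin j is empty, E[number of empty bins] = 3 · P(a given bin is empty) = 3 · (2/3)^10 ≈ 0.052. A simulation to check:

```python
import random

def empty_bins(balls: int = 10, bins: int = 3) -> int:
    """Throw each ball into a uniformly random bin; count empty bins."""
    occupied = {random.randrange(bins) for _ in range(balls)}
    return bins - len(occupied)

n = 100_000
print(sum(empty_bins() for _ in range(n)) / n)   # ≈ 3 * (2/3)**10 ≈ 0.052
```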


Subsection 3

Variance



Motivation

How good is the expected value at describing a random variable?

Let X ∼ {10 w.p. 1}, Y ∼ {9 w.p. 1/2, 11 w.p. 1/2}, Z ∼ {0 w.p. 1/2, 20 w.p. 1/2}.

E[X] = E[Y] = E[Z] = 10

However, the three random variables are quite different!

Center and spread
- Expected value represents the "center" of a random variable
- We need some indicator of the "spread" about the expected value


Variance and standard deviation
Definition (Variance and standard deviation)
The variance of a random variable X, denoted Var(X), is defined as

Var(X) = E[(X − E[X])²].

The standard deviation of X, denoted SD(X), is defined as

SD(X) = +√Var(X).

Variance is the expected value of the random variable (X − E[X])².
- Var(X) = Σ_{t∈T_X} (t − E[X])² P(X = t)
- Variance is non-negative, so the standard deviation is well-defined.
- The units of SD(X) are the same as the units of X.
Intuitively, the more "spread" there is in T_X, the larger Var(X) will be.


Example
X ∼ {10 w.p. 1}, Y ∼ {9 w.p. 1/2, 11 w.p. 1/2}, Z ∼ {0 w.p. 1/2, 20 w.p. 1/2}

E[X] = E[Y] = E[Z] = 10

Variances and standard deviations are different!

Var(X) = (10 − 10)² · 1 = 0, SD(X) = 0

Var(Y) = (9 − 10)² · 1/2 + (11 − 10)² · 1/2 = 1, SD(Y) = 1

Var(Z) = (0 − 10)² · 1/2 + (20 − 10)² · 1/2 = 100, SD(Z) = 10


Example: Throw a die
X ∼ Uniform{1, 2, 3, 4, 5, 6}

E[X] = 3.5

Var(X) = (1 − 3.5)² · 1/6 + (2 − 3.5)² · 1/6 + (3 − 3.5)² · 1/6
       + (4 − 3.5)² · 1/6 + (5 − 3.5)² · 1/6 + (6 − 3.5)² · 1/6
       = 35/12 = 2.916 . . .

SD(X) = 1.7078 . . .

Given the PMF of X with a small range, the variance and standard deviation can be readily computed.
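A sketch extending the earlier expected_value helper to variance (the pmf-as-dictionary representation is our convention, not the slides'):

```python
def expected_value(pmf: dict) -> float:
    return sum(t * p for t, p in pmf.items())

def variance(pmf: dict) -> float:
    """Var(X) = E[(X - E[X])^2], a finite sum over the range of X."""
    mu = expected_value(pmf)
    return sum((t - mu) ** 2 * p for t, p in pmf.items())

die = {t: 1/6 for t in range(1, 7)}
print(variance(die))          # 2.9166... = 35/12
print(variance(die) ** 0.5)   # 1.7078... = SD(X)
```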
Properties: Scaling and translation

Let X be a random variable and let a be a constant real number.
1. Var(aX) = a² Var(X)
2. SD(aX) = |a| SD(X)
3. Var(X + a) = Var(X)
4. SD(X + a) = SD(X)
Contrast these with the corresponding properties of E[X].


Alternative formula for variance
Theorem
The variance of a random variable X is given by

Var(X) = E[X²] − E[X]².

Proof:
Var(X) = E[(X − E[X])²] = E[X² − 2E[X]X + E[X]²]
By linearity of expected value,

Var(X) = E[X²] − 2E[E[X]X] + E[E[X]²]

Since E[X] is a constant,

Var(X) = E[X²] − 2E[X]E[X] + E[X]² = E[X²] − E[X]²

E[X], E[X²]: first and second moments; Var(X): second central moment
Sum and product of independent random variables

For any two random variables X and Y (independent or dependent),

E[X + Y] = E[X] + E[Y].

Suppose X and Y are independent random variables. Then, more is true.
1. E[XY] = E[X]E[Y]
2. Var(X + Y) = Var(X) + Var(Y)

Proof sketch: independence gives f_{XY}(t_1, t_2) = f_X(t_1) f_Y(t_2), so

E[XY] = Σ_{t_1, t_2} t_1 t_2 f_X(t_1) f_Y(t_2) = (Σ_{t_1} t_1 f_X(t_1)) (Σ_{t_2} t_2 f_Y(t_2)) = E[X]E[Y].

Then, expanding Var(X + Y) = E[(X + Y)²] − (E[X] + E[Y])² gives Var(X) + Var(Y) + 2(E[XY] − E[X]E[Y]) = Var(X) + Var(Y).


Example: Sum of two dice
What is the variance of the sum of two dice?

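Since the two throws are independent, Var(sum) = 35/12 + 35/12 = 35/6 ≈ 5.83, using the single-die variance computed earlier. A simulation sketch:

```python
import random

n = 200_000
samples = [random.randint(1, 6) + random.randint(1, 6) for _ in range(n)]
mean = sum(samples) / n
var = sum((s - mean) ** 2 for s in samples) / n
print(var)   # ≈ 35/6 ≈ 5.833 = Var(die) + Var(die)
```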


Variance of common distributions

Distribution           Expected value   Variance
Bernoulli(p)           p                p(1 − p)
Binomial(n, p)         np               np(1 − p)
Geometric(p)           1/p              (1 − p)/p²
Poisson(λ)             λ                λ
Uniform{1, . . . , n}  (n + 1)/2        (n² − 1)/12




Standardised random variables

Definition
A random variable X is said to be standardised if

E[X] = 0, Var(X) = 1.

Theorem
Let X be a random variable. Then,

(X − E[X]) / SD(X)

is a standardised random variable.
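A sketch checking the theorem on the fair-die PMF from earlier:

```python
die = {t: 1/6 for t in range(1, 7)}
mu = sum(t * p for t, p in die.items())
sd = sum((t - mu) ** 2 * p for t, p in die.items()) ** 0.5

std = {(t - mu) / sd: p for t, p in die.items()}  # PMF of (X - E[X]) / SD(X)
print(sum(t * p for t, p in std.items()))         # ≈ 0 (zero mean)
print(sum(t * t * p for t, p in std.items()))     # ≈ 1 (unit variance)
```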


Existence of expected value and variance

There are random variables such that E[X] goes to ∞ (or −∞)
- X ∼ {1 w.p. 1/2, 2 w.p. 1/4, 4 w.p. 1/8, . . . , 2^{n−1} w.p. 1/2^n, . . .}
- E[X²] will also go to ∞ in this case, and Var(X) is ill-defined
There are random variables such that E[X] is not well-defined
- X ∼ {1 w.p. 1/2, −2 w.p. 1/4, 4 w.p. 1/8, . . . , (−2)^{n−1} w.p. 1/2^n, . . .}
- E[X²] will go to ∞ in this case, and Var(X) is ill-defined
There are random variables such that E[X] is finite, but E[X²] goes to ∞
- Var(X) goes to ∞ too
In this course, we will generally consider well-behaved random variables with finite mean and variance.


Subsection 4

Covariance and correlation



Motivation
Consider the following two joint PMFs for two random variables X and Y.

f_{XY}   Y = 0   Y = 1        f_{XY}   Y = 0   Y = 1
X = 0    1/4     1/4          X = 0    0       1/2
X = 1    1/4     1/4          X = 1    1/2     0

Same marginal PMFs in both cases. So, the mean and variance are the same for X and Y.
However, the two cases are very different:
- X and Y are independent in one case
- The value of X determines the value of Y in the other
How do we summarize the relationship between random variables?
Covariance
Definition (Covariance)
Suppose X and Y are random variables on the same probability space. The covariance of X and Y, denoted Cov(X, Y), is defined as

Cov(X, Y) = E[(X − E[X])(Y − E[Y])].

Cov(X, Y) is positive:
- (X − E[X])(Y − E[Y]) tends to be positive
- When X is above/below its average, Y tends to be correspondingly above/below its average
Cov(X, Y) is negative:
- (X − E[X])(Y − E[Y]) tends to be negative
- When X is above/below its average, Y tends to be correspondingly below/above its average
Cov(X, Y) = 0:
- X and Y are said to be "uncorrelated"
Examples: positive and negative covariance

1. X: height of a person, Y: weight of a person
   - A higher value of X tends to come with a higher value of Y
   - We expect Cov(X, Y) to be positive
2. X: rainfall during the monsoon, Y: farmer debt
   - A higher value of X tends to come with a lower value of Y
   - We expect Cov(X, Y) to be negative
3. X: runs in an IPL over, Y: wickets in the same over
   - We expect Cov(X, Y) to be negative


Example: Computing covariance

          X = −1   X = 0   X = 1
Y = −1    1/15     2/15    2/15
Y = 0     2/15     1/15    1/15
Y = 1     2/15     2/15    1/15
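A sketch computing Cov(X, Y) = E[XY] − E[X]E[Y] directly from this joint PMF:

```python
# Joint PMF as {(x, y): probability}, from the table above
f = {(-1, -1): 1/15, (0, -1): 2/15, (1, -1): 2/15,
     (-1,  0): 2/15, (0,  0): 1/15, (1,  0): 1/15,
     (-1,  1): 2/15, (0,  1): 2/15, (1,  1): 1/15}

ex  = sum(x * p for (x, y), p in f.items())       # E[X] = -1/15
ey  = sum(y * p for (x, y), p in f.items())       # E[Y] = 0
exy = sum(x * y * p for (x, y), p in f.items())   # E[XY] = -2/15
print(exy - ex * ey)                              # Cov(X, Y) = -2/15 ≈ -0.133
```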


Properties

1. Cov(X, X) = Var(X)
   - Proof: Cov(X, X) = E[(X − E[X])(X − E[X])]
2. Cov(X, Y) = E[XY] − E[X]E[Y]
   - Proof: Cov(X, Y) = E[(X − E[X])(Y − E[Y])] = E[XY − XE[Y] − YE[X] + E[X]E[Y]]
3. Covariance is symmetric: Cov(X, Y) = Cov(Y, X)
4. Covariance is a "linear" quantity:
   - Cov(X, aY + bZ) = a Cov(X, Y) + b Cov(X, Z)
   - Cov(aX + bY, Z) = a Cov(X, Z) + b Cov(Y, Z)


Covariance and independence

1. If X and Y are independent, then X and Y are uncorrelated, i.e. Cov(X, Y) = 0
   - Proof: Cov(X, Y) = E[XY] − E[X]E[Y] = 0
2. If X and Y are uncorrelated, they may still be dependent. Example:

           X = −1   X = 0   X = 1
   Y = 0   1/8      1/4     1/8
   Y = 1   1/4      0       1/4


Correlation coefficient

Definition (Correlation coefficient)
The correlation coefficient, or simply correlation, of two random variables X and Y, denoted ρ(X, Y), is defined as

ρ(X, Y) = Cov(X, Y) / (SD(X) SD(Y)).

Result: −SD(X) SD(Y) ≤ Cov(X, Y) ≤ SD(X) SD(Y)
- Simplify E[((X − E[X])/SD(X) + (Y − E[Y])/SD(Y))²] ≥ 0 to get the lower bound
- Simplify E[((X − E[X])/SD(X) − (Y − E[Y])/SD(Y))²] ≥ 0 to get the upper bound
Using the above, −1 ≤ ρ(X, Y) ≤ 1


Correlation and random variables
1. ρ(X, Y)
   - One number that summarizes the "trend" between two random variables
   - Dimensionless quantity
2. ρ(X, Y) is close to zero
   - X and Y are close to being uncorrelated
   - No clear trend between X and Y
3. ρ(X, Y) = 1 or ρ(X, Y) = −1
   - There exist a ≠ 0 and b so that Y = aX + b with probability 1
   - Y is a linear function of X
4. |ρ(X, Y)| is close to one
   - X and Y are strongly correlated
   - An increase in X is likely to match up with an increase in Y (for ρ near 1) or a decrease in Y (for ρ near −1)


Problem
Consider the following joint PMF, where −1/4 ≤ x ≤ 1/4.

           X = 0      X = 1
Y = 0      1/4 − x    1/4 + x
Y = 1      1/4 + x    1/4 − x

The same problem with the values translated by constants c and d:

           X = c      X = c + 1
Y = d      1/4 − x    1/4 + x
Y = d + 1  1/4 + x    1/4 − x
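A sketch evaluating Cov(X, Y) and ρ(X, Y) for the first table as a function of x (by the scaling/translation properties, the shifted version with c and d gives the same answers):

```python
def cov_and_rho(x: float):
    f = {(0, 0): 1/4 - x, (1, 0): 1/4 + x,
         (0, 1): 1/4 + x, (1, 1): 1/4 - x}
    ex  = sum(a * p for (a, b), p in f.items())      # 1/2
    ey  = sum(b * p for (a, b), p in f.items())      # 1/2
    exy = sum(a * b * p for (a, b), p in f.items())  # 1/4 - x
    cov = exy - ex * ey                              # -x
    sd = 0.5                                         # SD of a Bernoulli(1/2)
    return cov, cov / (sd * sd)

for x in (-0.25, 0.0, 0.25):
    print(x, cov_and_rho(x))   # rho = -4x, sweeping from 1 to -1
```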


Scatter plots and correlation
[Figure-only slide.]


Subsection 5

Bounds on probabilities using mean and variance



Notation for mean and variance

If there is no confusion about the random variable X:
- µ will denote the mean E[X]
- σ² will denote the variance Var(X)
- σ will denote the standard deviation SD(X)

If there are multiple random variables under discussion:
- µ_X will denote the mean E[X]
- σ²_X will denote the variance Var(X)
- σ_X will denote the standard deviation SD(X)


What does the mean say about the distribution?

Suppose the average marks in a course is 50/100.

What fraction of students will have marks ≥ 50?
- It cannot be 0. Why?
What fraction of students will have marks ≥ 80?
- It cannot be 1. Why?
- Can it be 0.9?

The average says something about the distribution of marks!


Standard units in statistics

Consider a random variable X with mean µ and variance σ².
In an experiment, X may take a value that is close to µ or far from µ.
- X − µ measures the distance of X from the mean µ
- It could be positive or negative

Standard units: the number of standard deviations that a realization of a random variable is away from the mean.
- We expect X − µ to fall between −cσ and cσ for a small value of c
- In other words, we expect X to fall between µ − cσ and µ + cσ


Examples

Throw a pair of dice, X = sum of the two numbers
- µ = 7, σ ≈ 2.42
- P(|X − µ| ≤ σ) = P(4.58 ≤ X ≤ 9.42) = P(X ∈ {5, 6, 7, 8, 9}) = 2/3
  - So, P(|X − µ| > σ) = 1/3
- P(|X − µ| > 2σ) = P(X ∈ {2, 12}) = 2/36 ≈ 0.056

X ∼ Uniform{1, 2, . . . , 100}
- µ = 50.5, σ ≈ 28.9
- P(|X − µ| > σ) = 42/100
- P(|X − µ| > 2σ) = 0
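The two-dice numbers can be checked exactly by enumerating all 36 outcomes:

```python
from itertools import product
from math import sqrt

sums = [a + b for a, b in product(range(1, 7), repeat=2)]  # all 36 outcomes
mu = sum(sums) / 36
sigma = sqrt(sum((s - mu) ** 2 for s in sums) / 36)
print(mu, sigma)                                        # 7.0, ≈ 2.415

within = sum(abs(s - mu) <= sigma for s in sums) / 36
print(within, 1 - within)                               # 2/3 and 1/3
print(sum(abs(s - mu) > 2 * sigma for s in sums) / 36)  # 2/36 ≈ 0.056
```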


Markov’s inequality

Theorem (Markov’s inequality)
Let X be a discrete random variable taking non-negative values with a finite mean µ. Then, for any c > 0,

P(X ≥ c) ≤ µ/c.

Proof:
µ = Σ_{t∈T_X} t P(X = t) = Σ_{t<c} t P(X = t) + Σ_{t≥c} t P(X = t)
Since the first sum is non-negative,

µ ≥ Σ_{t≥c} t P(X = t) ≥ Σ_{t≥c} c P(X = t) = c Σ_{t≥c} P(X = t) = c P(X ≥ c)


Chebyshev’s inequality
Theorem (Chebyshev’s inequality)
Let X be a discrete random variable with a finite mean µ and a finite variance σ². Then, for any k > 0,

P(|X − µ| ≥ kσ) ≤ 1/k².

Proof: Apply Markov’s inequality to (X − µ)².

Other forms:
- P(|X − µ| ≥ c) ≤ σ²/c², P((X − µ)² ≥ k²σ²) ≤ 1/k²
- P(µ − kσ < X < µ + kσ) ≥ 1 − 1/k²
- P(X ≥ µ + kσ) + P(X ≤ µ − kσ) ≤ 1/k²
  - In particular, P(X ≥ µ + kσ) ≤ 1/k² and P(X ≤ µ − kσ) ≤ 1/k²


Compare actual and Chebyshev

Let X be a random variable with finite mean µ and variance σ².

P(|X − µ| ≥ 2σ) ≤ 1/4

1. X ∼ Binomial(10, 1/2), µ = 5, σ = √2.5 ≈ 1.58

   P(|X − 5| ≥ 2σ) = P(X ∈ {0, 1, 9, 10}) ≈ 0.021

2. X ∼ Geometric(1/4), µ = 4, σ ≈ 3.46

   P(|X − 4| ≥ 2σ) = P(X ∈ {11, 12, . . .}) ≈ 0.056
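A sketch computing the two actual tail probabilities and comparing them with the 1/4 bound:

```python
from math import comb, sqrt

# Binomial(10, 1/2): exact two-sided tail beyond 2 standard deviations
n, p = 10, 0.5
mu, sigma = n * p, sqrt(n * p * (1 - p))
tail = sum(comb(n, t) * p**t * (1 - p)**(n - t)
           for t in range(n + 1) if abs(t - mu) >= 2 * sigma)
print(tail)        # ≈ 0.021, well under the Chebyshev bound of 0.25

# Geometric(1/4): P(|X - 4| >= 2σ) = P(X >= 11) = (3/4)**10
print(0.75 ** 10)  # ≈ 0.056
```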


What do mean and variance say about the distribution?

- The mean µ, through Markov’s inequality, bounds the probability that a non-negative random variable takes values much larger than the mean.
- The mean µ and standard deviation σ, through Chebyshev’s inequality, bound the probability that X is away from µ by kσ.
- These are some of the most useful measures of expected centre and spread in practice.
  - Suppose the average number of accidents per day across the country decreases by 10000. Is that a "significant" decrease?
  - If the standard deviation of the number of accidents is known, we can express the decrease of 10000 in terms of standard deviations to answer the above question.
- There are also a lot of important theoretical implications, which we will see later on.
