Conditional Expectation and Variance
Note that E[X|Y = y] depends on the value of y. In other words, by changing y, E[X|Y = y] can also change. Thus, we can say that E[X|Y = y] is a function of y, so let's write

g(y) = E[X|Y = y].

Thus, we can think of g(y) = E[X|Y = y] as a function of the value of the random variable Y. We then write

g(Y) = E[X|Y].

We use this notation to indicate that E[X|Y] is a random variable whose value equals g(y) = E[X|Y = y] when Y = y. Thus, if Y is a random variable with range R_Y = {y_1, y_2, ⋯}, then E[X|Y] is also a random variable with
\[
E[X|Y] = \begin{cases}
E[X|Y=y_1] & \text{with probability } P(Y=y_1)\\
E[X|Y=y_2] & \text{with probability } P(Y=y_2)\\
\quad\vdots & \quad\vdots
\end{cases}
\]
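To make this concrete, here is a minimal Python sketch (mine, not the text's) that computes g(y) = E[X|Y = y] for every y from a joint PMF; the joint PMF values below are made up purely for illustration.

```python
from collections import defaultdict

# Hypothetical joint PMF: joint[(x, y)] = P(X = x, Y = y)
joint = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}

# Marginal PMF of Y
p_y = defaultdict(float)
for (x, y), p in joint.items():
    p_y[y] += p

def g(y):
    """g(y) = E[X | Y = y] = sum_x x * P(X = x | Y = y)."""
    return sum(x * p / p_y[y] for (x, yy), p in joint.items() if yy == y)

# E[X|Y] is the random variable taking value g(y) with probability P(Y = y)
for y in sorted(p_y):
    print(f"E[X|Y={y}] = {g(y):.4f} with probability P(Y={y}) = {p_y[y]:.2f}")
```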
Let's look at an example.
Example 5.10
Let X = aY + b. Then E[X|Y = y] = E[aY + b|Y = y] = ay + b. Here, we have g(y) = ay + b, and therefore,

E[X|Y] = aY + b.
Since E[X|Y ] is a random variable, we can find its PMF, CDF, variance, etc. Let's look at an
example to better understand E[X|Y].
Example 5.11
Consider two random variables X and Y with the joint PMF given in Table 5.2. Let Z = E[X|Y].

a. Find the marginal PMFs of X and Y.
b. Find the conditional PMF of X given Y = 0 and given Y = 1.
c. Find the PMF of Z.
d. Find EZ and Var(Z).
Table 5.2: Joint PMF of X and Y

            Y = 0    Y = 1
  X = 0      1/5      2/5
  X = 1      2/5       0
Solution
a. Using the table, we find

\[
P_X(0) = \frac{1}{5}+\frac{2}{5} = \frac{3}{5}, \qquad
P_X(1) = \frac{2}{5}+0 = \frac{2}{5},
\]
\[
P_Y(0) = \frac{1}{5}+\frac{2}{5} = \frac{3}{5}, \qquad
P_Y(1) = \frac{2}{5}+0 = \frac{2}{5}.
\]
Thus, the marginal distributions of X and Y are both Bernoulli(2/5). However, note that X and Y are not independent.
b. We have
\[
P_{X|Y}(0|0) = \frac{P_{XY}(0,0)}{P_Y(0)} = \frac{1/5}{3/5} = \frac{1}{3}.
\]
Thus,
\[
P_{X|Y}(1|0) = 1-\frac{1}{3} = \frac{2}{3}.
\]
We conclude
\[
X|Y=0 \;\sim\; \text{Bernoulli}\left(\frac{2}{3}\right).
\]
Similarly, we find
PX|Y (0|1) = 1,
PX|Y (1|1) = 0.
Thus, given Y = 1, we always have X = 0.
c. We note that the random variable Y can take two values: 0 and 1. Thus, the random variable
Z = E[X|Y ] can take two values as it is a function of Y . Specifically,
\[
Z = E[X|Y] = \begin{cases}
E[X|Y=0] & \text{if } Y=0\\
E[X|Y=1] & \text{if } Y=1
\end{cases}
\]
Now, using the previous part, we have
\[
E[X|Y=0]=\frac{2}{3}, \qquad E[X|Y=1]=0,
\]
and since P(Y = 0) = 3/5 and P(Y = 1) = 2/5, we conclude that
\[
Z = E[X|Y] = \begin{cases}
\frac{2}{3} & \text{with probability } \frac{3}{5}\\[4pt]
0 & \text{with probability } \frac{2}{5}
\end{cases}
\]
So we can write
\[
P_Z(z) = \begin{cases}
\frac{3}{5} & \text{if } z = \frac{2}{3}\\[4pt]
\frac{2}{5} & \text{if } z = 0\\[4pt]
0 & \text{otherwise}
\end{cases}
\]
d. Now that we have found the PMF of Z , we can find its mean and variance. Specifically,
\[
E[Z] = \frac{2}{3}\cdot\frac{3}{5} + 0\cdot\frac{2}{5} = \frac{2}{5}.
\]

We also note that EX = 2/5. Thus, here we have EZ = EX (this is not a coincidence, as we will see shortly). To find Var(Z), we first compute

\[
E[Z^2] = \frac{4}{9}\cdot\frac{3}{5} + 0\cdot\frac{2}{5} = \frac{4}{15}.
\]
Thus,
\[
\mathrm{Var}(Z) = E[Z^2]-(E[Z])^2 = \frac{4}{15}-\frac{4}{25} = \frac{8}{75}.
\]
Example 5.12
Let X and Y be two random variables and g and h be two functions. Show that E[g(X)h(Y)|Y] = h(Y)E[g(X)|Y].
Iterated Expectations:
Let us look again at the law of total probability for expectation. Assuming g(Y) = E[X|Y], we have

\[
\begin{aligned}
E[E[X|Y]] &= E[g(Y)]\\
&= \sum_{y\in R_Y} g(y)\,P_Y(y) && \text{(LOTUS)}\\
&= \sum_{y\in R_Y} E[X|Y=y]\,P_Y(y)\\
&= E[X] && \text{(by Equation 5.4)}.
\end{aligned}
\]

Thus, we conclude

\[
E[X] = E[E[X|Y]]. \tag{5.7}
\]

Equation 5.7 is called the law of iterated expectations. Since it is basically the same as Equation 5.4, it is also called the law of total expectation [3]. For example, if N is a random variable with E[N] = λ and, given N, we have X|N ∼ Binomial(N, p), then

\[
\begin{aligned}
E[X] &= E[E[X|N]]\\
&= E[Np] && \text{(since } X|N \sim \text{Binomial}(N,p)\text{)}\\
&= pE[N] = p\lambda.
\end{aligned}
\]
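As a quick numerical sanity check of this calculation, here is a Monte Carlo sketch; beyond the text, it assumes a concrete distribution for N, namely N ∼ Poisson(λ) (the derivation above only uses E[N] = λ):

```python
import math
import random

random.seed(0)
lam, p, trials = 4.0, 0.3, 100_000

def sample_poisson(lam):
    """Knuth's method: count uniform multiplications until the product drops below e^{-lam}."""
    threshold, k, prod = math.exp(-lam), 0, 1.0
    while True:
        prod *= random.random()
        if prod <= threshold:
            return k
        k += 1

total = 0
for _ in range(trials):
    n = sample_poisson(lam)
    total += sum(random.random() < p for _ in range(n))  # X | N = n ~ Binomial(n, p)

print(total / trials, "should be close to p * lambda =", p * lam)  # ~1.2
```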
Now suppose that X and Y are independent. Then P_{X|Y}(x|y) = P_X(x), so

\[
E[X|Y=y] = \sum_{x\in R_X} x\,P_{X|Y}(x|y) = \sum_{x\in R_X} x\,P_X(x) = E[X].
\]

Again, thinking of this as a random variable depending on Y, we obtain E[X|Y] = EX, and by the same argument, E[g(X)|Y] = E[g(X)]. Remember also that for independent random variables, P_{XY}(x, y) = P_X(x)P_Y(y). From this, we can show that E[XY] = EXEY.
Lemma 5.2
If X and Y are independent, then E[XY] = EXEY.

Proof. Using LOTUS, we have

\[
\begin{aligned}
E[XY] &= \sum_{x\in R_X}\sum_{y\in R_Y} xy\,P_{XY}(x,y)\\
&= \sum_{x\in R_X}\sum_{y\in R_Y} xy\,P_X(x)P_Y(y) && \text{(by independence)}\\
&= \Big(\sum_{x\in R_X} x\,P_X(x)\Big)\Big(\sum_{y\in R_Y} y\,P_Y(y)\Big)\\
&= EXEY.
\end{aligned}
\]
Note that the converse is not true. That is, if the only thing that we know about X and Y is that E[XY] = EXEY, then X and Y may or may not be independent. Using essentially the same proof as above, we can show that if X and Y are independent, then E[g(X)h(Y)] = E[g(X)]E[h(Y)] for any functions g : ℝ → ℝ and h : ℝ → ℝ.
To summarize, if X and Y are independent, then:

1. E[X|Y ] = EX ;
2. E[g(X)|Y ] = E[g(X)] ;
3. E[XY ] = EXEY ;
4. E[g(X)h(Y )] = E[g(X)]E[h(Y )].
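The failure of the converse is easy to exhibit with a concrete example of my own (not from the text): take X uniform on {−1, 0, 1} and Y = X². Then E[XY] = E[X³] = 0 = EXEY, yet X and Y are clearly dependent. A small exact-arithmetic check:

```python
from fractions import Fraction as F

# X uniform on {-1, 0, 1}; Y = X^2 is a deterministic function of X.
pmf_x = {-1: F(1, 3), 0: F(1, 3), 1: F(1, 3)}

EX = sum(x * p for x, p in pmf_x.items())          # 0
EY = sum(x**2 * p for x, p in pmf_x.items())       # E[X^2] = 2/3
EXY = sum(x * x**2 * p for x, p in pmf_x.items())  # E[X^3] = 0

print(EXY == EX * EY)  # True: E[XY] = EX * EY ...
# ... but X and Y are dependent: P(X = 1, Y = 0) = 0,
# while P(X = 1) * P(Y = 0) = 1/9.
print(F(0) == pmf_x[1] * pmf_x[0])  # False
```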
Conditional Variance:
Similar to the conditional expectation, we can define the conditional variance of X, Var(X|Y = y), which is the variance of X in the conditional space where we know Y = y. If we let μ_{X|Y}(y) = E[X|Y = y], then

\[
\mathrm{Var}(X|Y=y) = E\big[(X-\mu_{X|Y}(y))^2 \,\big|\, Y=y\big] = E[X^2|Y=y]-\mu_{X|Y}(y)^2.
\]
Note that Var(X|Y = y) is a function of y . Similar to our discussion on E[X|Y = y] and E[X|Y ] , we
define Var(X|Y ) as a function of the random variable Y . That is, Var(X|Y ) is a random variable
whose value equals Var(X|Y = y) whenever Y = y . Let us look at an example.
Example 5.13
Let X, Y, and Z = E[X|Y] be as in Example 5.11. Let also V = Var(X|Y).

a. Find the PMF of V.
b. Find EV.
c. Check that Var(X) = EV + Var(Z).
Solution
From Example 5.11, we already know that

\[
X|Y=0 \;\sim\; \text{Bernoulli}\left(\frac{2}{3}\right), \qquad
P(X=0|Y=1) = 1, \qquad
\mathrm{Var}(Z) = \frac{8}{75}.
\]
a. To find the PMF of V , we note that V is a function of Y . Specifically,
\[
V = \mathrm{Var}(X|Y) = \begin{cases}
\mathrm{Var}(X|Y=0) & \text{if } Y=0\\
\mathrm{Var}(X|Y=1) & \text{if } Y=1
\end{cases}
\]
Therefore, since X|Y = 0 ∼ Bernoulli(2/3), we have

\[
\mathrm{Var}(X|Y=0) = \frac{2}{3}\cdot\frac{1}{3} = \frac{2}{9},
\]

and since X is always equal to 0 when Y = 1,

\[
\mathrm{Var}(X|Y=1) = 0.
\]
Thus,
\[
V = \mathrm{Var}(X|Y) = \begin{cases}
\frac{2}{9} & \text{with probability } \frac{3}{5}\\[4pt]
0 & \text{with probability } \frac{2}{5}
\end{cases}
\]
So we can write
\[
P_V(v) = \begin{cases}
\frac{3}{5} & \text{if } v = \frac{2}{9}\\[4pt]
\frac{2}{5} & \text{if } v = 0\\[4pt]
0 & \text{otherwise}
\end{cases}
\]
b. To find EV , we write
\[
EV = \frac{2}{9}\cdot\frac{3}{5} + 0\cdot\frac{2}{5} = \frac{2}{15}.
\]
c. To check that Var(X) = EV + Var(Z), we just note that since X ∼ Bernoulli(2/5),

\[
\mathrm{Var}(X) = \frac{2}{5}\cdot\frac{3}{5} = \frac{6}{25},
\]

while

\[
EV = \frac{2}{15}, \qquad \mathrm{Var}(Z) = \frac{8}{75},
\]

so

\[
EV + \mathrm{Var}(Z) = \frac{10}{75}+\frac{8}{75} = \frac{18}{75} = \frac{6}{25} = \mathrm{Var}(X).
\]
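The same check can be scripted. This sketch recomputes EV, Var(Z), and Var(X) from the joint PMF of Table 5.2 using exact fractions:

```python
from fractions import Fraction as F

joint = {(0, 0): F(1, 5), (0, 1): F(2, 5), (1, 0): F(2, 5), (1, 1): F(0)}
p_y = {y: sum(p for (_, yy), p in joint.items() if yy == y) for y in (0, 1)}

def cond_moment(y, k):
    """E[X^k | Y = y] computed from the joint PMF."""
    return sum(x**k * p for (x, yy), p in joint.items() if yy == y) / p_y[y]

# V = Var(X|Y) and Z = E[X|Y], as functions of the observed y
var_x_given = {y: cond_moment(y, 2) - cond_moment(y, 1) ** 2 for y in (0, 1)}
EV = sum(var_x_given[y] * p_y[y] for y in (0, 1))                    # 2/15
EZ = sum(cond_moment(y, 1) * p_y[y] for y in (0, 1))                 # 2/5
VarZ = sum(cond_moment(y, 1) ** 2 * p_y[y] for y in (0, 1)) - EZ**2  # 8/75

p_x = {x: sum(p for (xx, _), p in joint.items() if xx == x) for x in (0, 1)}
EX = sum(x * p for x, p in p_x.items())
VarX = sum(x**2 * p for x, p in p_x.items()) - EX**2                 # 6/25
print(VarX == EV + VarZ)  # True
```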
In the above example, we checked that Var(X) = EV + Var(Z), which says

\[
\mathrm{Var}(X) = E[\mathrm{Var}(X|Y)] + \mathrm{Var}(E[X|Y]). \tag{5.10}
\]

It turns out that Equation 5.10 holds in general, and it is called the law of total variance. To see why, note that V = Var(X|Y) = E[X²|Y] − (E[X|Y])² = E[X²|Y] − Z², so

\[
\begin{aligned}
EV &= E[E[X^2|Y]] - E[Z^2]\\
&= E[X^2] - E[Z^2] && \text{(law of iterated expectations, Equation 5.7)}.
\end{aligned} \tag{5.8}
\]

Next, we have

\[
\mathrm{Var}(Z) = E[Z^2] - (EZ)^2 = E[Z^2] - (EX)^2, \tag{5.9}
\]

since EZ = E[E[X|Y]] = EX. Adding Equations 5.8 and 5.9, we obtain

\[
EV + \mathrm{Var}(Z) = E[X^2] - (EX)^2 = \mathrm{Var}(X),
\]

which is Equation 5.10.
There are several ways that we can look at the law of total variance to get some intuition. Let us first note that all the terms in Equation 5.10 are nonnegative (since variance is always nonnegative). Thus, we conclude

\[
\mathrm{Var}(X) \geq E[\mathrm{Var}(X|Y)], \qquad \mathrm{Var}(X) \geq \mathrm{Var}(E[X|Y]).
\]
To describe the law of total variance intuitively, it is often useful to look at a population divided into several groups. In particular, suppose that we have this random experiment: we pick a person in the world at random and look at his/her height. Let's call the resulting value X. Define another random variable Y whose value depends on the country of the chosen person, where Y = 1, 2, 3, ..., n, and n is the number of countries in the world. Then, let's look at the two terms in the law of total variance: E[Var(X|Y)] is the average of the variances of height within the individual countries, while Var(E[X|Y]) is the variance of the average heights across countries. The law of total variance says that the overall variance of height is the sum of these within-country and between-country contributions, as the simulation below illustrates.
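Here is a short Python simulation of this picture. All numbers below (three hypothetical "countries", their population shares, mean heights, and spreads) are invented purely for illustration:

```python
import random
import statistics

random.seed(0)
# Hypothetical countries: (population share, mean height, std dev within country)
groups = [(0.5, 170.0, 8.0), (0.3, 160.0, 6.0), (0.2, 180.0, 7.0)]

labels, heights = [], []
for _ in range(50_000):
    u, acc = random.random(), 0.0
    for idx, (share, mu, sigma) in enumerate(groups):
        acc += share
        if u < acc or idx == len(groups) - 1:  # last group is a fallback for rounding
            labels.append(idx)
            heights.append(random.gauss(mu, sigma))
            break

by_group = {i: [h for h, l in zip(heights, labels) if l == i] for i in range(len(groups))}
group_mean = {i: statistics.mean(v) for i, v in by_group.items()}

# E[Var(X|Y)]: average within-country variance (weighted by country size)
within = sum(len(v) / len(heights) * statistics.pvariance(v) for v in by_group.values())
# Var(E[X|Y]): variance of the country averages across the population
between = statistics.pvariance([group_mean[l] for l in labels])

print(statistics.pvariance(heights), "~=", within + between)
```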
Example 5.14
Let N be the number of customers that visit a certain store in a given day. Suppose that we know E[N] and Var(N). Let Xi be the amount that the i-th customer spends. We assume that the Xi's are independent of each other and also independent of N. We further assume that they all have the same mean and variance:

EXi = EX,
Var(Xi) = Var(X).

Let Y be the store's total sales, i.e.,

\[
Y = \sum_{i=1}^{N} X_i.
\]

Find EY and Var(Y).
Solution
To find EY, we cannot directly use the linearity of expectation because N is random. But, conditioned on N = n, we can use linearity and find E[Y|N = n]; so, we use the law of iterated expectations:

\[
\begin{aligned}
EY &= E[E[Y|N]] && \text{(law of iterated expectations)}\\
&= E\Big[E\Big[\sum_{i=1}^{N} X_i \,\Big|\, N\Big]\Big]\\
&= E\Big[\sum_{i=1}^{N} E[X_i|N]\Big] && \text{(linearity of expectation)}\\
&= E\Big[\sum_{i=1}^{N} E[X_i]\Big] && \text{(}X_i\text{'s and } N \text{ are independent)}\\
&= E[N\,E[X]] && \text{(since } EX_i = EX\text{)}\\
&= E[X]E[N] && \text{(since } EX \text{ is not random)}.
\end{aligned}
\]
To find Var(Y), we use the law of total variance: Var(Y) = E[Var(Y|N)] + Var(E[Y|N]). Given N, Y is a sum of N independent random variables, so

\[
\begin{aligned}
\mathrm{Var}(Y|N) &= \sum_{i=1}^{N} \mathrm{Var}(X_i|N)\\
&= \sum_{i=1}^{N} \mathrm{Var}(X_i) && \text{(since the } X_i\text{'s are independent of } N\text{)}\\
&= N\,\mathrm{Var}(X).
\end{aligned}
\]

Thus, using E[Y|N] = N E[X] from the first part, we have

\[
\begin{aligned}
\mathrm{Var}(Y) &= E[\mathrm{Var}(Y|N)] + \mathrm{Var}(E[Y|N])\\
&= E[N\,\mathrm{Var}(X)] + \mathrm{Var}(N\,E[X])\\
&= \mathrm{Var}(X)E[N] + (E[X])^2\,\mathrm{Var}(N).
\end{aligned}
\]
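To make these formulas concrete, here is a Monte Carlo sketch; beyond the text, it assumes specific distributions, N ∼ Poisson(λ = 10) (so E[N] = Var(N) = 10) and Xi ∼ Uniform(0, 20) (so EX = 10 and Var(X) = 100/3):

```python
import math
import random
import statistics

random.seed(1)
lam = 10.0                       # E[N] = Var(N) = lam for a Poisson N
EX, VarX = 10.0, 20.0**2 / 12.0  # mean and variance of Uniform(0, 20)

def sample_poisson(lam):
    """Knuth's method for sampling a Poisson random variable."""
    threshold, k, prod = math.exp(-lam), 0, 1.0
    while True:
        prod *= random.random()
        if prod <= threshold:
            return k
        k += 1

ys = []
for _ in range(100_000):
    n = sample_poisson(lam)
    ys.append(sum(random.uniform(0.0, 20.0) for _ in range(n)))  # Y = X_1 + ... + X_N

print(statistics.mean(ys), "~= E[X]E[N] =", EX * lam)
print(statistics.pvariance(ys), "~= Var(X)E[N] + (E[X])^2 Var(N) =",
      VarX * lam + EX**2 * lam)
```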