
MAT 271 Probability and Statistics

Lecture 5: Further Topics on Random Variables

Asst. Prof. N. Kemal Ure

Istanbul Technical University


ure@itu.edu.tr

April 28th, 2020


Overview

1 Introduction

2 Derived Distributions

3 Covariance and Correlation

4 Iterated Expectations

5 Summary
Introduction

▶ We now have a solid understanding of fundamental topics regarding
  random variables

∎ CDFs, Expectation, Conditioning etc.

▶ Now we are going to take a look at some advanced concepts that are
very useful in engineering applications

∎ Derived Distributions: How to transform PDFs

∎ Correlation and Covariance: How to measure dependency of RVs

∎ Iterated Expectations: Some nice applications of conditional expectation
  and variance
Derived Distributions

▶ Often we have a RV X and a transformed RV Y = g(X). How do we
  transform the distributions?

∎ If X is discrete, the transformation of PMFs is relatively straightforward
  (a minimal sketch follows below).

∎ If X is continuous, the transformation is a bit more complicated.
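In the discrete case, the PMF of Y = g(X) is obtained by summing pX over all x that map to the same value y, i.e. pY(y) = ∑_{x : g(x)=y} pX(x). Here is a minimal Python sketch of that rule; the particular PMF and the transform g(x) = x² are illustrative choices, not taken from the slides.

```python
from collections import defaultdict

# Illustrative PMF of X on {-2, -1, 0, 1, 2}
p_X = {-2: 0.1, -1: 0.2, 0: 0.4, 1: 0.2, 2: 0.1}

def pmf_of_transform(p_X, g):
    """PMF of Y = g(X): sum p_X(x) over all x with g(x) = y."""
    p_Y = defaultdict(float)
    for x, p in p_X.items():
        p_Y[g(x)] += p
    return dict(p_Y)

p_Y = pmf_of_transform(p_X, lambda x: x**2)
print(p_Y)   # {4: 0.2, 1: 0.4, 0: 0.4}
```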
Linear Case

▶ An important special case is where the transformation is linear:

Y = g(X) = aX + b
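The resulting PDF formula appears on the worked slides that were not extracted; the standard result for a ≠ 0 is fY(y) = (1/∣a∣) fX((y − b)/a). Below is a minimal Python sketch checking this numerically, with an illustrative Exponential(1) choice for X and a = 2, b = 3.

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = 2.0, 3.0                               # illustrative linear transform Y = aX + b
x = rng.exponential(scale=1.0, size=200_000)  # X ~ Exponential(1), illustrative
y = a * x + b

def f_X(t):
    """Exponential(1) PDF, evaluated at a scalar t."""
    return np.exp(-t) if t >= 0 else 0.0

# Compare an empirical density of Y against f_Y(y) = (1/|a|) f_X((y - b) / a)
hist, edges = np.histogram(y, bins=200, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
for y0 in (3.5, 4.0, 5.0, 7.0):
    emp = hist[np.argmin(np.abs(centers - y0))]
    print(f"y={y0:4.1f}  empirical={emp:.3f}  formula={f_X((y0 - b) / a) / abs(a):.3f}")
```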
Monotonic Case

▶ We can generalize the formula of the linear case to monotonic
  transformations

∎ g(x) is monotonically increasing if x > x′ ⟹ g(x) > g(x′)

∎ g(x) is monotonically decreasing if x > x′ ⟹ g(x) < g(x′)

∎ We will also assume that g is differentiable

▶ Monotonic functions are special because they are always invertible:

∎ A function y = g(x) is invertible if there exists a function h(y) such that

y = g(x) ⇐⇒ x = h(y)

∎ For instance, g(x) = 180/x and h(y) = 180/y


Monotonic Case

▶ Note that linear functions are monotonic (if a ≠ 0), hence invertible:

g(x) = ax + b ⇐⇒ h(y) = (y − b)/a

▶ Here is an example of a nonlinear monotonic invertible function:

g(x) = e^(ax) ⇐⇒ h(y) = (ln y)/a
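The general formula is on the worked slides that were not extracted; the standard change-of-variables result for a monotonic, differentiable g with inverse h is fY(y) = fX(h(y)) ∣h′(y)∣. The sketch below checks it numerically for the example g(x) = e^(ax), with an illustrative standard normal X and a = 0.5.

```python
import numpy as np

rng = np.random.default_rng(1)
a = 0.5                                   # illustrative rate in g(x) = exp(a*x)
x = rng.normal(size=200_000)              # X ~ N(0, 1), illustrative
y = np.exp(a * x)                         # Y = g(X), monotonically increasing

f_X = lambda t: np.exp(-t**2 / 2) / np.sqrt(2 * np.pi)   # standard normal PDF
h   = lambda v: np.log(v) / a             # inverse transform h(y)
dh  = lambda v: 1.0 / (a * v)             # h'(y)

hist, edges = np.histogram(y, bins=400, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
for y0 in (0.5, 1.0, 2.0, 4.0):
    emp = hist[np.argmin(np.abs(centers - y0))]
    print(f"y={y0:4.1f}  empirical={emp:.3f}  formula={f_X(h(y0)) * abs(dh(y0)):.3f}")
```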
Multiple Variables

▶ We can extend the transformation procedure to functions of more
  than one variable
Convolution

▶ Now let's look at an important special case for a function of two RVs.
▶ Let Z = X + Y , where X and Y are discrete and independent RVs.
  Let's find the PMF of Z:

pZ(z) = P(X + Y = z)
      = ∑_{(x,y) : x+y=z} P(X = x, Y = y)
      = ∑_x P(X = x, Y = z − x)
      = ∑_x pX(x) pY(z − x)

▶ The resulting PMF pZ is called the convolution of PMFs pX and pY .


▶ The convolution of continuous RVs can be derived similarly:

fZ(z) = ∫_{−∞}^{∞} fX(x) fY(z − x) dx
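A minimal sketch of the discrete convolution formula, using two fair dice as an illustrative choice; the resulting pZ is the familiar triangular PMF of the sum.

```python
from collections import defaultdict

# PMFs of two independent fair dice (illustrative choice)
p_X = {k: 1/6 for k in range(1, 7)}
p_Y = {k: 1/6 for k in range(1, 7)}

def convolve_pmf(p_X, p_Y):
    """p_Z(z) = sum_x p_X(x) * p_Y(z - x) for Z = X + Y (X, Y independent)."""
    p_Z = defaultdict(float)
    for x, px in p_X.items():
        for y, py in p_Y.items():
            p_Z[x + y] += px * py
    return dict(p_Z)

p_Z = convolve_pmf(p_X, p_Y)
print({z: round(p, 4) for z, p in sorted(p_Z.items())})
# p_Z(7) = 6/36 is the most likely value, as expected
```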
Covariance and Correlation

▶ One way to measure the relationship between two RVs X and Y is to
  compute their covariance:

cov(X, Y) = E[(X − E[X])(Y − E[Y])]

∎ If cov(X, Y) = 0, we say that the variables are uncorrelated.

∎ Roughly speaking, a positive or negative covariance indicates that
  X − E[X] and Y − E[Y] tend to have the same or the opposite sign.
Covariance and Correlation

▶ An alternative formula for the covariance

cov(X, Y ) = E[XY ] − E[X]E[Y ]

▶ And here are some easy-to-derive properties:

cov(X, X) = var(X)
cov(X, aY + b) = a cov(X, Y)
cov(X, Y + Z) = cov(X, Y) + cov(X, Z)

▶ Note that if X and Y are independent, then E[XY] = E[X]E[Y] and
  hence cov(X, Y) = 0.

∎ Thus independence implies being uncorrelated.

∎ What about the converse? It does not hold in general (see the sketch below).
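A classic counterexample, sketched below: X uniform on {−1, 0, 1} and Y = X² are clearly dependent, yet their covariance is exactly zero.

```python
# X uniform on {-1, 0, 1}, Y = X^2: dependent but uncorrelated
p_X = {-1: 1/3, 0: 1/3, 1: 1/3}

E_X  = sum(x * p for x, p in p_X.items())          # 0
E_Y  = sum(x**2 * p for x, p in p_X.items())       # 2/3
E_XY = sum(x * x**2 * p for x, p in p_X.items())   # 0

cov = E_XY - E_X * E_Y
print(cov)   # 0.0 -> uncorrelated, even though Y is a function of X
```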
Covariance and Correlation

▶ We sometimes prefer to use a normalized version of covariance, such
  as the correlation coefficient:

ρ(X, Y) = cov(X, Y) / √(var(X) var(Y))

∎ It can be shown that −1 ≤ ρ(X, Y ) ≤ 1.

▶ If ρ > 0 then X − E[X] and Y − E[Y ] tend to have the same sign.

▶ If ρ < 0 then X − E[X] and Y − E[Y ] tend to have the opposite sign.

▶ The magnitude ∣ρ∣ provides a normalized measure of the extent to which
  these tendencies hold.
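A minimal sketch computing ρ for an illustrative pair of RVs (Y is X plus independent noise, a choice made only for this example); the sample estimate lands strictly between −1 and 1 and approaches 1 as the noise shrinks.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=100_000)
y = x + 0.5 * rng.normal(size=100_000)   # illustrative noisy copy of X

cov_xy = np.mean((x - x.mean()) * (y - y.mean()))
rho = cov_xy / np.sqrt(x.var() * y.var())
print(rho)   # about 0.894, matching the theoretical 1 / sqrt(1 + 0.25)
```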
Variance of Sum of Random Variables

▶ The covariance can be used to obtain a formula for the variance of the
  sum of several RVs (not necessarily independent).

var(X1 + X2 ) = var(X1 ) + var(X2 ) + 2cov(X1 , X2 )

▶ In more general form:

var(∑_{i=1}^{n} Xi) = ∑_{i=1}^{n} var(Xi) + ∑_{(i,j) : i≠j} cov(Xi, Xj)
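A quick numerical check of the two-variable formula, with an illustrative pair of dependent RVs; the identity holds exactly even for sample moments.

```python
import numpy as np

rng = np.random.default_rng(3)
x1 = rng.normal(size=100_000)
x2 = 0.7 * x1 + rng.normal(size=100_000)   # illustrative dependent RV

lhs = np.var(x1 + x2)
rhs = np.var(x1) + np.var(x2) + 2 * np.cov(x1, x2, bias=True)[0, 1]
print(lhs, rhs)   # the two values agree (up to floating-point error)
```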
Iterated Expectations
Conditional Expectation and Variance

▶ In this last section, we are going to revisit the idea of conditional
  expectation and variance for some cool applications.

▶ Note that the conditional expectation E[X∣Y] is a random variable:
  it is a function of Y whose value is E[X∣Y = y] when Y = y.

▶ Since it is a function of Y, we can calculate its distribution using the
  PDF or PMF of Y.

▶ We will see that the expectation and variance of this random variable
  allow us to solve some difficult problems (a small numerical sketch follows).
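As a concrete sketch, the code below uses a small illustrative joint PMF (not from the slides) to tabulate the random variable E[X∣Y] and to check the law of iterated expectations, E[E[X∣Y]] = E[X].

```python
# Illustrative joint PMF p(x, y) on a small grid
p = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}

p_Y = {}
for (x, y), pr in p.items():
    p_Y[y] = p_Y.get(y, 0.0) + pr

# E[X | Y = y] for each y: these are the values taken by the RV E[X | Y]
E_X_given_Y = {y: sum(x * pr for (x, yy), pr in p.items() if yy == y) / p_Y[y]
               for y in p_Y}
print(E_X_given_Y)              # {0: 0.667, 1: 0.571} approximately

# Law of iterated expectations: E[ E[X | Y] ] = E[X]
lhs = sum(E_X_given_Y[y] * p_Y[y] for y in p_Y)
rhs = sum(x * pr for (x, _), pr in p.items())
print(lhs, rhs)                 # both equal 0.6
```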
Conditional Expectation as an Estimator

▶ If we view Y as an observation that provides information about X, it
  is natural to view:
X̂ = E[X∣Y ]

∎ Here X̂ is the estimator of X given Y .


∎ The estimation error is:
X̃ = X̂ − X

∎ Estimation error is a RV, and its expectation is:

E[X̃∣Y ] = E[X̂ − X∣Y ] = E[X̂∣Y ] − E[X∣Y ] = 0

Hence by the law of iterated expectations: E[X̃] = 0


This is great! The estimation error is zero on average, i.e., the estimator X̂
is correct on average (unbiased).
Conditional Expectation as an Estimator

▶ Another cool property of X̂ is that it is uncorrelated with the
  estimation error!

E[X̂ X̃] = E[E[X̂ X̃∣Y ]] = E[X̂E[X̃∣Y ]] = 0

▶ It follows that

cov(X̂, X̃) = E[X̂ X̃] − E[X̂]E[X̃] = 0

▶ An important consequence of this property is:

var(X) = var(X̃) + var(X̂)

▶ This property can be further exploited to derive a useful law.


Conditional Variance

▶ We introduce the following RV:

var(X∣Y) = E[(X − E[X∣Y])² ∣ Y] = E[X̃² ∣ Y]

▶ Using the fact that E[X̃] = 0 and the law of iterated expectations:

var(X̃) = E[X̃²] = E[E[X̃² ∣ Y]] = E[var(X∣Y)]

▶ Noting also that var(X̂) = var(E[X∣Y]), we obtain the law of total variance
  by rewriting var(X) = var(X̃) + var(X̂):

var(X) = E[var(X∣Y)] + var(E[X∣Y])
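A small numerical check of the law of total variance, reusing the illustrative joint PMF from the earlier sketch.

```python
# Illustrative joint PMF p(x, y)
p = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}

p_Y = {}
for (x, y), pr in p.items():
    p_Y[y] = p_Y.get(y, 0.0) + pr

def cond_moment(y, k):
    """E[X^k | Y = y] under the joint PMF p."""
    return sum(x**k * pr for (x, yy), pr in p.items() if yy == y) / p_Y[y]

E_X  = sum(x * pr for (x, _), pr in p.items())
E_X2 = sum(x**2 * pr for (x, _), pr in p.items())
var_X = E_X2 - E_X**2

exp_cond_var = sum((cond_moment(y, 2) - cond_moment(y, 1)**2) * p_Y[y] for y in p_Y)
var_cond_exp = sum((cond_moment(y, 1) - E_X)**2 * p_Y[y] for y in p_Y)

print(var_X, exp_cond_var + var_cond_exp)   # both sides of the law equal 0.24
```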
Summary

▶ This lecture:

∎ Derived Distributions

∎ Covariance and correlation

∎ Laws of Iterated Expectations

▶ What is next?

∎ Limit Theorems
