
MAT 271 Probability and Statistics

Lecture 5: Further Topics on Random Variables

Asst. Prof. N. Kemal Ure

Istanbul Technical University


ure@itu.edu.tr

April 28th, 2020


Overview

1 Introduction

2 Derived Distributions

3 Covariance and Correlation

4 Iterated Expectations

5 Summary
Introduction

▶ We now have a solid understanding of fundamental topics regarding
  random variables

∎ CDFs, Expectation, Conditioning etc.

▶ Now we are going to take a look at some advanced concepts that are
very useful in engineering applications

∎ Derived Distributions: How to transform PDFs

∎ Correlation and Covariance: How to measure dependency of RVs

∎ Iterated Expectations: Some nice applications of conditional expectation
  and variance
Derived Distributions

▶ Often we have a RV X and a transformed RV Y = g(X). How do we
  transform the distributions?

∎ If X is discrete, the transformation of PMFs is relatively straightforward
  (a minimal sketch follows below).

∎ If X is continuous, the transformation is a bit more complicated.
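In the discrete case, the PMF of Y = g(X) is obtained by summing pX over all x that map to the same value y, i.e. pY(y) = ∑_{x : g(x)=y} pX(x). Here is a minimal Python sketch of that rule; the particular PMF and the transform g(x) = x² are illustrative choices, not taken from the slides.

```python
from collections import defaultdict

# Illustrative PMF of X on {-2, -1, 0, 1, 2}
p_X = {-2: 0.1, -1: 0.2, 0: 0.4, 1: 0.2, 2: 0.1}

def pmf_of_transform(p_X, g):
    """PMF of Y = g(X): sum p_X(x) over all x with g(x) = y."""
    p_Y = defaultdict(float)
    for x, p in p_X.items():
        p_Y[g(x)] += p
    return dict(p_Y)

p_Y = pmf_of_transform(p_X, lambda x: x**2)
print(p_Y)   # {4: 0.2, 1: 0.4, 0: 0.4}
```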
Linear Case

▶ An important special case is where the transformation is linear:

Y = g(X) = aX + b
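The resulting PDF formula appears on the worked slides that were not extracted; the standard result for a ≠ 0 is fY(y) = (1/∣a∣) fX((y − b)/a). Below is a minimal Python sketch checking this numerically, with an illustrative Exponential(1) choice for X and a = 2, b = 3.

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = 2.0, 3.0                               # illustrative linear transform Y = aX + b
x = rng.exponential(scale=1.0, size=200_000)  # X ~ Exponential(1), illustrative
y = a * x + b

def f_X(t):
    """Exponential(1) PDF, evaluated at a scalar t."""
    return np.exp(-t) if t >= 0 else 0.0

# Compare an empirical density of Y against f_Y(y) = (1/|a|) f_X((y - b) / a)
hist, edges = np.histogram(y, bins=200, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
for y0 in (3.5, 4.0, 5.0, 7.0):
    emp = hist[np.argmin(np.abs(centers - y0))]
    print(f"y={y0:4.1f}  empirical={emp:.3f}  formula={f_X((y0 - b) / a) / abs(a):.3f}")
```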
Monotonic Case

▶ We can generalize the formula of the linear case to monotonic
  transformations

∎ g(x) is monotonically increasing if x > x′ ⟹ g(x) > g(x′)

∎ g(x) is monotonically decreasing if x > x′ ⟹ g(x) < g(x′)

∎ We will also assume that g is differentiable

▶ Monotonic functions are special because they are always invertible:

∎ A function y = g(x) is invertible if there exists a function h(y) such that

y = g(x) ⇐⇒ x = h(y)

∎ For instance, g(x) = 180/x and h(y) = 180/y


Monotonic Case

▶ Note that linear functions are monotonic (if a ≠ 0), hence invertible:

g(x) = ax + b ⇐⇒ h(y) = (y − b)/a

▶ Here is an example of a nonlinear monotonic invertible function:

g(x) = e^(ax) ⇐⇒ h(y) = (ln y)/a
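The general formula is on the worked slides that were not extracted; the standard change-of-variables result for a monotonic, differentiable g with inverse h is fY(y) = fX(h(y)) ∣h′(y)∣. The sketch below checks it numerically for the example g(x) = e^(ax), with an illustrative standard normal X and a = 0.5.

```python
import numpy as np

rng = np.random.default_rng(1)
a = 0.5                                   # illustrative rate in g(x) = exp(a*x)
x = rng.normal(size=200_000)              # X ~ N(0, 1), illustrative
y = np.exp(a * x)                         # Y = g(X), monotonically increasing

f_X = lambda t: np.exp(-t**2 / 2) / np.sqrt(2 * np.pi)   # standard normal PDF
h   = lambda v: np.log(v) / a             # inverse transform h(y)
dh  = lambda v: 1.0 / (a * v)             # h'(y)

hist, edges = np.histogram(y, bins=400, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
for y0 in (0.5, 1.0, 2.0, 4.0):
    emp = hist[np.argmin(np.abs(centers - y0))]
    print(f"y={y0:4.1f}  empirical={emp:.3f}  formula={f_X(h(y0)) * abs(dh(y0)):.3f}")
```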
Multiple Variables

▶ We can extend the transformation procedure to functions of more
  than one variable
Convolution

▶ Now let's look at an important special case for a function of two RVs.
▶ Let Z = X + Y , where X and Y are discrete and independent RVs.
  Let's find the PMF of Z:

pZ(z) = P(X + Y = z)
      = ∑_{(x,y) : x+y=z} P(X = x, Y = y)
      = ∑_x P(X = x, Y = z − x)
      = ∑_x pX(x) pY(z − x)

▶ The resulting PMF pZ is called the convolution of PMFs pX and pY .


▶ The convolution of continuous RVs can be derived similarly:

fZ(z) = ∫_{−∞}^{∞} fX(x) fY(z − x) dx
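A minimal sketch of the discrete convolution formula, using two fair dice as an illustrative choice; the resulting pZ is the familiar triangular PMF of the sum.

```python
from collections import defaultdict

# PMFs of two independent fair dice (illustrative choice)
p_X = {k: 1/6 for k in range(1, 7)}
p_Y = {k: 1/6 for k in range(1, 7)}

def convolve_pmf(p_X, p_Y):
    """p_Z(z) = sum_x p_X(x) * p_Y(z - x) for Z = X + Y (X, Y independent)."""
    p_Z = defaultdict(float)
    for x, px in p_X.items():
        for y, py in p_Y.items():
            p_Z[x + y] += px * py
    return dict(p_Z)

p_Z = convolve_pmf(p_X, p_Y)
print({z: round(p, 4) for z, p in sorted(p_Z.items())})
# p_Z(7) = 6/36 is the most likely value, as expected
```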
Covariance and Correlation

▶ One way to measure the relationship between two RVs X and Y is to
  compute their covariance:

cov(X, Y) = E[(X − E[X])(Y − E[Y])]

∎ If cov(X, Y) = 0, we say that the variables are uncorrelated.

∎ Roughly speaking, a positive or negative covariance indicates that
  X − E[X] and Y − E[Y] tend to have the same or the opposite sign.
Covariance and Correlation

▶ An alternative formula for the covariance

cov(X, Y ) = E[XY ] − E[X]E[Y ]

▶ And here are some easy-to-derive properties:

cov(X, X) = var(X)
cov(X, aY + b) = a cov(X, Y)
cov(X, Y + Z) = cov(X, Y) + cov(X, Z)

▶ Note that if X and Y are independent, then E[XY] = E[X]E[Y] and
  hence cov(X, Y) = 0.

∎ Thus independence implies being uncorrelated.

∎ What about the converse? It does not hold in general (see the sketch below).
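A classic counterexample, sketched below: X uniform on {−1, 0, 1} and Y = X² are clearly dependent, yet their covariance is exactly zero.

```python
# X uniform on {-1, 0, 1}, Y = X^2: dependent but uncorrelated
p_X = {-1: 1/3, 0: 1/3, 1: 1/3}

E_X  = sum(x * p for x, p in p_X.items())          # 0
E_Y  = sum(x**2 * p for x, p in p_X.items())       # 2/3
E_XY = sum(x * x**2 * p for x, p in p_X.items())   # 0

cov = E_XY - E_X * E_Y
print(cov)   # 0.0 -> uncorrelated, even though Y is a function of X
```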
Covariance and Correlation

▶ We sometimes prefer to use a normalized version of covariance, such
  as the correlation coefficient:

ρ(X, Y) = cov(X, Y) / √(var(X) var(Y))

∎ It can be shown that −1 ≤ ρ(X, Y ) ≤ 1.

▶ If ρ > 0 then X − E[X] and Y − E[Y ] tend to have the same sign.

▶ If ρ < 0 then X − E[X] and Y − E[Y ] tend to have the opposite sign.

▶ The magnitude ∣ρ∣ provides a normalized measure of the extent to which
  these tendencies hold.
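A minimal sketch computing ρ for an illustrative pair of RVs (Y is X plus independent noise, a choice made only for this example); the sample estimate lands strictly between −1 and 1 and approaches 1 as the noise shrinks.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=100_000)
y = x + 0.5 * rng.normal(size=100_000)   # illustrative noisy copy of X

cov_xy = np.mean((x - x.mean()) * (y - y.mean()))
rho = cov_xy / np.sqrt(x.var() * y.var())
print(rho)   # about 0.894, matching the theoretical 1 / sqrt(1 + 0.25)
```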
Variance of Sum of Random Variables

▶ The covariance can be used to obtain a formula for the variance of the
  sum of several RVs (not necessarily independent).

var(X1 + X2 ) = var(X1 ) + var(X2 ) + 2cov(X1 , X2 )

▶ In more general form:

var(∑_{i=1}^{n} Xi) = ∑_{i=1}^{n} var(Xi) + ∑_{(i,j) : i≠j} cov(Xi, Xj)
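A quick numerical check of the two-variable formula, with an illustrative pair of dependent RVs; the identity holds exactly even for sample moments.

```python
import numpy as np

rng = np.random.default_rng(3)
x1 = rng.normal(size=100_000)
x2 = 0.7 * x1 + rng.normal(size=100_000)   # illustrative dependent RV

lhs = np.var(x1 + x2)
rhs = np.var(x1) + np.var(x2) + 2 * np.cov(x1, x2, bias=True)[0, 1]
print(lhs, rhs)   # the two values agree (up to floating-point error)
```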
Iterated Expectations
Conditional Expectation and Variance

▶ In this last section, we are going to revisit the idea of conditional
  expectation and variance for some cool applications.

▶ Note that the conditional expectation E[X∣Y] is a random variable:
  it is a function of Y whose value is E[X∣Y = y] when Y = y.

▶ Since it is a function of Y, we can calculate its distribution using the
  PDF or PMF of Y.

▶ We will see that the expectation and variance of this random variable
  allow us to solve some difficult problems (a small numerical sketch follows).
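As a concrete sketch, the code below uses a small illustrative joint PMF (not from the slides) to tabulate the random variable E[X∣Y] and to check the law of iterated expectations, E[E[X∣Y]] = E[X].

```python
# Illustrative joint PMF p(x, y) on a small grid
p = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}

p_Y = {}
for (x, y), pr in p.items():
    p_Y[y] = p_Y.get(y, 0.0) + pr

# E[X | Y = y] for each y: these are the values taken by the RV E[X | Y]
E_X_given_Y = {y: sum(x * pr for (x, yy), pr in p.items() if yy == y) / p_Y[y]
               for y in p_Y}
print(E_X_given_Y)              # {0: 0.667, 1: 0.571} approximately

# Law of iterated expectations: E[ E[X | Y] ] = E[X]
lhs = sum(E_X_given_Y[y] * p_Y[y] for y in p_Y)
rhs = sum(x * pr for (x, _), pr in p.items())
print(lhs, rhs)                 # both equal 0.6
```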
Conditional Expectation as an Estimator

▶ If we view Y as an observation that provides information about X, it
  is natural to view:
X̂ = E[X∣Y ]

∎ Here X̂ is the estimator of X given Y .


∎ The estimation error is:
X̃ = X̂ − X

∎ Estimation error is a RV, and its expectation is:

E[X̃∣Y ] = E[X̂ − X∣Y ] = E[X̂∣Y ] − E[X∣Y ] = 0

Hence by the law of iterated expectations: E[X̃] = 0


This is great! The estimation error is zero on average, i.e., the estimator X̂
is correct on average (unbiased).
Conditional Expectation as an Estimator

▶ Another cool property of X̂ is that it is uncorrelated with the
  estimation error!

E[X̂ X̃] = E[E[X̂ X̃∣Y ]] = E[X̂E[X̃∣Y ]] = 0

▶ It follows that

cov(X̂, X̃) = E[X̂ X̃] − E[X̂]E[X̃] = 0

▶ An important consequence of this property is:

var(X) = var(X̃) + var(X̂)

▶ This property can be further exploited to derive a useful law.


Conditional Variance

▶ We introduce the following RV:

var(X∣Y) = E[(X − E[X∣Y])² ∣ Y] = E[X̃² ∣ Y]

▶ Using the fact that E[X̃] = 0 and the law of iterated expectations:

var(X̃) = E[X̃²] = E[E[X̃² ∣ Y]] = E[var(X∣Y)]

▶ Noting also that var(X̂) = var(E[X∣Y]), we obtain the law of total variance
  by rewriting var(X) = var(X̃) + var(X̂):

var(X) = E[var(X∣Y)] + var(E[X∣Y])
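A small numerical check of the law of total variance, reusing the illustrative joint PMF from the earlier sketch.

```python
# Illustrative joint PMF p(x, y)
p = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}

p_Y = {}
for (x, y), pr in p.items():
    p_Y[y] = p_Y.get(y, 0.0) + pr

def cond_moment(y, k):
    """E[X^k | Y = y] under the joint PMF p."""
    return sum(x**k * pr for (x, yy), pr in p.items() if yy == y) / p_Y[y]

E_X  = sum(x * pr for (x, _), pr in p.items())
E_X2 = sum(x**2 * pr for (x, _), pr in p.items())
var_X = E_X2 - E_X**2

exp_cond_var = sum((cond_moment(y, 2) - cond_moment(y, 1)**2) * p_Y[y] for y in p_Y)
var_cond_exp = sum((cond_moment(y, 1) - E_X)**2 * p_Y[y] for y in p_Y)

print(var_X, exp_cond_var + var_cond_exp)   # both sides of the law equal 0.24
```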
Summary

▶ This lecture:

∎ Derived Distributions

∎ Covariance and correlation

∎ Laws of Iterated Expectations

▶ What is next?

∎ Limit Theorems
