
AB1202

Statistics and Analysis


Lecture 3
Covariance and Correlation
Chin Chee Kai
cheekai@ntu.edu.sg
Nanyang Business School
Nanyang Technological University
NBS 2016S1 AB1202 CCK-STAT-018

Covariance and Correlation


• Covariance & Correlation Definitions
• Covariance & Correlation of Grouped Data
• Covariance & Correlation as Properties of Two
Joint Random Variables
• Covariance & Correlation of Distributions
• Mean of Sum of Random Variables
• Variance of Sum of Random Variables

Covariance & Correlation Definition


• $Cov(X,Y) = \frac{1}{N}\sum_{i=1}^{N}(x_i-\mu_X)(y_i-\mu_Y)$, for discrete random variables $X$ & $Y$ over a population of $N$ pairs
• $Cov_{sample}(X,Y) = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})$
• $Correl(X,Y) = \frac{Cov(X,Y)}{\sqrt{Var(X)\,Var(Y)}}$
• $Correl_{sample}(X,Y) = \frac{Cov_{sample}(X,Y)}{s_X s_Y}$
• More common correlation notations:
▫ $r = \frac{SS_{XY}}{\sqrt{SS_{XX}\,SS_{YY}}}$ for calculating correlation from samples, where $SS_{XY} = \sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})$
▫ $\rho = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}$ for calculating population correlation, where $\sigma_{XY} = \frac{1}{N}\sum_{i=1}^{N}(x_i-\mu_X)(y_i-\mu_Y)$ is just the population covariance of $X$ and $Y$, $Cov(X,Y)$.
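These definitions translate directly into code. A minimal sketch (not part of the original slides; the function names `sample_cov` and `sample_corr` are my own):

```python
import math

def sample_cov(xs, ys):
    """Sample covariance: sum of (x - x_bar)(y - y_bar) divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

def sample_corr(xs, ys):
    """Sample correlation r = SS_XY / sqrt(SS_XX * SS_YY)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_yy = sum((y - y_bar) ** 2 for y in ys)
    return ss_xy / math.sqrt(ss_xx * ss_yy)
```

Note that the $r$ formula is algebraically the same as $Cov_{sample}/(s_X s_Y)$: the $\frac{1}{n-1}$ factors in the covariance and in $s_X s_Y$ cancel, leaving only the sums of squares.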

Geometrical Interpretation of $Cov_{sample}(X,Y)$

$Cov_{sample}(X,Y) = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y}) = \frac{1}{n-1}SS_{XY}$, where $SS_{XY} = \sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})$.

[Figure: scatter plot of observations $(x_i, y_i)$ with axes crossing at $(\bar{x}, \bar{y})$. Each point contributes a rectangle with sides $x_i-\bar{x}$ and $y_i-\bar{y}$; rectangles in the upper-right and lower-left quadrants contribute positive area (+area), those in the upper-left and lower-right contribute negative area (−area), so $SS_{XY}$ is the sum of these signed areas.]

Covariance For Grouped Data


• $Cov(X,Y) = \sum_{k=1}^{M}(x_k-\mu_X)(y_k-\mu_Y)\,P(X=x_k, Y=y_k)$
where $M$ is the number of unique pairs of values of $X$ & $Y$.
• As we typically group data in tables, it is common to place $X$ & $Y$ on different axes. The covariance formula then looks like:
$Cov(X,Y) = \sum_{i=1}^{m}\sum_{j=1}^{n}(x_j-\mu_X)(y_i-\mu_Y)\,P(X=x_j, Y=y_i)$
where $m$ is the number of rows (unique values of $Y$), $n$ is the number of columns (unique values of $X$), and $m \times n = M$.

         | X1=8 | X2=9
    Y1=6 | 0.1  | 0.4
    Y2=7 | 0.3  | 0.2

• When we use probabilities, the covariance calculated is implicitly the population covariance.

Covariance as a Property of Two Joint Random Variables
• Whereas a single $X$ can have descriptives like mean, standard deviation, etc., ...
• Covariance can be thought of as a descriptive involving 2 random variables.
▫ Higher absolute value implies stronger linear relationship
▫ Negative sign implies inverted relationship (large X and small Y, or small X and large Y, are observed)

[Figure: 3D bar chart of the joint probabilities $P(X=x, Y=y)$ for $X \in \{8, 9\}$ and $Y \in \{6, 7\}$, with bar heights from 0.1 to 0.4.]

         | X1=8 | X2=9 | P(Y)
    Y1=6 | 0.1  | 0.4  | 0.5
    Y2=7 | 0.3  | 0.2  | 0.5
    P(X)=| 0.4  | 0.6  |

Calculating Covariance(X, Y)
• Given observation data on the right, calculate covariance of $X$ & $Y$.

    X | Y
    8 | 6
    9 | 7
    8 | 7
    9 | 6
    8 | 7
    9 | 7
    9 | 6
    8 | 7
    9 | 6
    9 | 6

• $Cov_{sample}(X,Y) = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})$
$= \frac{1}{10-1}\big[(8-8.6)(6-6.5) + (9-8.6)(7-6.5) + (8-8.6)(7-6.5) + (9-8.6)(6-6.5) + (8-8.6)(7-6.5) + (9-8.6)(7-6.5) + (9-8.6)(6-6.5) + (8-8.6)(7-6.5) + (9-8.6)(6-6.5) + (9-8.6)(6-6.5)\big]$
$= \frac{1}{9}(-1) = -0.1111$
• If data is the entire population, then:
$Cov(X,Y) = \frac{1}{N}\sum_{i=1}^{N}(x_i-\mu_X)(y_i-\mu_Y) = \frac{1}{10}(-1) = -0.1$
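The arithmetic above is easy to check programmatically. A sketch using the ten observations from this slide:

```python
# The ten (X, Y) observations from the slide.
xs = [8, 9, 8, 9, 8, 9, 9, 8, 9, 9]
ys = [6, 7, 7, 6, 7, 7, 6, 7, 6, 6]
n = len(xs)
x_bar = sum(xs) / n   # 8.6
y_bar = sum(ys) / n   # 6.5
# SS_XY: sum of cross-deviation products, equals -1 for this data.
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
cov_sample = ss_xy / (n - 1)   # divide by n - 1 for the sample covariance
cov_pop = ss_xy / n            # divide by N for the population covariance
print(round(cov_sample, 4), round(cov_pop, 4))  # -0.1111 -0.1
```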

Covariance of Distributions
• More commonly, we get tabulated data tables of joint probabilities $P(X=x, Y=y)$.

         | X1=8 | X2=9
    Y1=6 | 0.1  | 0.4
    Y2=7 | 0.3  | 0.2

• Table on the right is tabulated from the same data set as the previous slide, giving $\mu_X = 8 \cdot 0.4 + 9 \cdot 0.6 = 8.6$ and $\mu_Y = 6 \cdot 0.5 + 7 \cdot 0.5 = 6.5$.
• Calculating covariance will be:
$Cov(X,Y) = \sum_{i=1}^{m}\sum_{j=1}^{n}(x_j-\mu_X)(y_i-\mu_Y)\,P(X=x_j, Y=y_i)$
$= (8-8.6)(6-6.5)\cdot 0.1 + (9-8.6)(6-6.5)\cdot 0.4 + (8-8.6)(7-6.5)\cdot 0.3 + (9-8.6)(7-6.5)\cdot 0.2$
$= -0.1$
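The same weighted sum can be checked in a few lines; the dictionary layout of the joint table below is my own choice:

```python
# Joint probability table from the slide: (x, y) -> P(X=x, Y=y).
joint = {(8, 6): 0.1, (9, 6): 0.4, (8, 7): 0.3, (9, 7): 0.2}
mu_x = sum(x * p for (x, _), p in joint.items())   # 8.6
mu_y = sum(y * p for (_, y), p in joint.items())   # 6.5
# Population covariance: probability-weighted cross-deviation products.
cov = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())
print(round(cov, 4))  # -0.1
```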

Correlation of Random Variables


• Given observation data on the right (the same ten pairs as before), calculate correlation of $X$ & $Y$.

    X | Y
    8 | 6
    9 | 7
    8 | 7
    9 | 6
    8 | 7
    9 | 7
    9 | 6
    8 | 7
    9 | 6
    9 | 6

• $r = \frac{Cov_{sample}(X,Y)}{s_X s_Y} = \frac{-0.1111}{0.5164 \times 0.5270} = -0.4082$
• If data is the entire population, then correlation becomes:
$\rho = \frac{Cov(X,Y)}{\sigma_X \sigma_Y} = \frac{-0.1}{0.4899 \times 0.5} = -0.4083$
(would be $-0.4082$, due to rounding)
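Computing $r$ without intermediate rounding avoids the small discrepancy noted above; a sketch using the $SS$ notation from the definitions slide (exactly, $r = -1/\sqrt{6} \approx -0.4082$ for this data):

```python
import math

xs = [8, 9, 8, 9, 8, 9, 9, 8, 9, 9]
ys = [6, 7, 7, 6, 7, 7, 6, 7, 6, 6]
n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n
# Sums of squares; the 1/(n-1) factors cancel in the ratio.
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
ss_xx = sum((x - x_bar) ** 2 for x in xs)
ss_yy = sum((y - y_bar) ** 2 for y in ys)
r = ss_xy / math.sqrt(ss_xx * ss_yy)   # sample correlation
print(round(r, 4))  # -0.4082
```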

Correlation of Distributions
• More commonly, we get tabulated data tables of joint probabilities $P(X=x, Y=y)$.

         | X1=8 | X2=9
    Y1=6 | 0.1  | 0.4
    Y2=7 | 0.3  | 0.2

with $\mu_X = 8 \cdot 0.4 + 9 \cdot 0.6 = 8.6$ and $\mu_Y = 6 \cdot 0.5 + 7 \cdot 0.5 = 6.5$.
• Calculating correlation will be:
$\sigma_{XY} = (8-8.6)(6-6.5)\cdot 0.1 + (9-8.6)(6-6.5)\cdot 0.4 + (8-8.6)(7-6.5)\cdot 0.3 + (9-8.6)(7-6.5)\cdot 0.2 = -0.1$
$\sigma_X = \sqrt{(8-8.6)^2 \cdot 0.4 + (9-8.6)^2 \cdot 0.6} = 0.4899$
$\sigma_Y = \sqrt{(6-6.5)^2 \cdot 0.5 + (7-6.5)^2 \cdot 0.5} = 0.5$
$\rho = \frac{\sigma_{XY}}{\sigma_X \sigma_Y} = \frac{-0.1}{0.4899 \times 0.5} = -0.4083$
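The whole population calculation can be done from the joint table in one pass; computed this way, without rounding $\sigma_X$ to 0.4899 first, the result is $-0.4082$:

```python
import math

# Joint probability table from the slide: (x, y) -> P(X=x, Y=y).
joint = {(8, 6): 0.1, (9, 6): 0.4, (8, 7): 0.3, (9, 7): 0.2}
mu_x = sum(x * p for (x, _), p in joint.items())
mu_y = sum(y * p for (_, y), p in joint.items())
# Population covariance and standard deviations from the distribution.
sigma_xy = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())
sigma_x = math.sqrt(sum((x - mu_x) ** 2 * p for (x, _), p in joint.items()))
sigma_y = math.sqrt(sum((y - mu_y) ** 2 * p for (_, y), p in joint.items()))
rho = sigma_xy / (sigma_x * sigma_y)
print(round(rho, 4))  # -0.4082
```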

Covariance, Variance & Correlation


• Since covariance is defined for any random variables $X$ and $Y$, we might just let $Y = X$ and get:
$Var(X) = \sigma_X^2 = Cov(X,X)$ and $Var_{sample}(X) = s^2 = Cov_{sample}(X,X)$
• The correlation of $X$ and $Y (= X)$ will be:
$\rho = \frac{Cov(X,X)}{\sigma_X \cdot \sigma_X} = 1$ and $r = \frac{Cov_{sample}(X,X)}{s \cdot s} = 1$
• This means a random variable $X$ is always linearly and completely correlated with itself.
• Further, if $c$ is a constant, then:
$Var(c) = 0$, $Cov(X,c) = 0$, $Cov_{sample}(X,c) = 0$

• It is easy to also check that: $Cov(X,Y) = Cov(Y,X)$
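A quick numerical check of the $Cov(X,X) = s^2$ identity, using Python's standard `statistics` module and the ten $X$ observations from the earlier slides:

```python
import statistics

xs = [8, 9, 8, 9, 8, 9, 9, 8, 9, 9]
n = len(xs)
x_bar = sum(xs) / n
# Cov_sample(X, X): substitute the same series for both arguments.
cov_xx = sum((x - x_bar) * (x - x_bar) for x in xs) / (n - 1)
s2 = statistics.variance(xs)          # sample variance s^2, same value
r = cov_xx / (statistics.stdev(xs) ** 2)  # self-correlation, equals 1
print(round(cov_xx, 4), round(s2, 4), round(r, 4))  # 0.2667 0.2667 1.0
```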

Correlation as a Property of Two Joint Random Variables

• Just like covariance, correlation can also be thought of as a descriptive involving 2 random variables.
▫ Range is always from -1 to 1.
▫ Value closer to 0 implies little to no linear correlation.
▫ Value closer to 1 implies stronger positive linear
correlation (large X with large Y).
▫ Value closer to -1 implies stronger negative linear
correlation (large X with small Y).

Mean of Sum of Random Variables


• Suppose $X$ and $Y$ are random variables with means $E(X)$, $E(Y)$ and variances $Var(X)$ and $Var(Y)$.
• If $W = X + Y$ is a random variable, what is the mean (expected value) of $W$?
$E(W) = E(X + Y) = E(X) + E(Y)$
• Thus, in layman's terms, the mean of the sum of two random variables is the sum of the means of the two random variables.
• In general, if $a$ and $b$ are real constants:
$E(aX + bY) = aE(X) + bE(Y)$
• This property is known as linearity of the expected
value of random variable sums.
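Linearity can be verified directly on the joint distribution used in the earlier slides (reused here for illustration), both for $W = X + Y$ and for arbitrary constants, e.g. $a = 2$, $b = 3$:

```python
# Joint distribution from the earlier slides: (x, y) -> P(X=x, Y=y).
joint = {(8, 6): 0.1, (9, 6): 0.4, (8, 7): 0.3, (9, 7): 0.2}
e_x = sum(x * p for (x, _), p in joint.items())   # E(X) = 8.6
e_y = sum(y * p for (_, y), p in joint.items())   # E(Y) = 6.5
# E(W) for W = X + Y, computed directly from the joint distribution:
e_w = sum((x + y) * p for (x, y), p in joint.items())
# Linearity with constants a = 2, b = 3:
e_lin = sum((2 * x + 3 * y) * p for (x, y), p in joint.items())
print(round(e_w, 4), round(e_x + e_y, 4))            # 15.1 15.1
print(round(e_lin, 4), round(2 * e_x + 3 * e_y, 4))  # 36.7 36.7
```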

Variance of Sum of Random Variables


• What about variance $Var(W)$?
• $Var(W) = Var(X + Y) = Var(X) + Var(Y) + 2\,Cov(X,Y)$
• More generally, if $W = aX + bY$ where $a$ and $b$ are real constants, then:
$Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab\,Cov(X,Y)$
$= a^2 Var(X) + b^2 Var(Y) + 2ab \cdot \rho \sigma_X \sigma_Y$
• So linearity does not apply to variance calculations when the random variables are not independent.
• When $X$ and $Y$ are independent, it means $Cov(X,Y) = 0$, so:
$Var(aX + bY) = a^2 Var(X) + b^2 Var(Y)$
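The identity for $Var(X + Y)$ can be checked against the running joint-distribution example: computing $Var(W)$ directly from the distribution of $W = X + Y$ should match $Var(X) + Var(Y) + 2\,Cov(X,Y)$:

```python
joint = {(8, 6): 0.1, (9, 6): 0.4, (8, 7): 0.3, (9, 7): 0.2}
mu_x = sum(x * p for (x, _), p in joint.items())
mu_y = sum(y * p for (_, y), p in joint.items())
var_x = sum((x - mu_x) ** 2 * p for (x, _), p in joint.items())  # 0.24
var_y = sum((y - mu_y) ** 2 * p for (_, y), p in joint.items())  # 0.25
cov = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())  # -0.1
# Var(X + Y) computed directly from the distribution of W = X + Y:
mu_w = mu_x + mu_y
var_w = sum((x + y - mu_w) ** 2 * p for (x, y), p in joint.items())
print(round(var_w, 4), round(var_x + var_y + 2 * cov, 4))  # 0.29 0.29
```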

Wonders of Variance
• If $a, b \geq 0$, is it possible to make $Var(aX + bY) < Var(X) + Var(Y)$?
$a^2 Var(X) + b^2 Var(Y) + 2ab\,Cov(X,Y) < Var(X) + Var(Y)$
• For simplicity, suppose $a = b = 1$:
$Var(X) + Var(Y) + 2\,Cov(X,Y) < Var(X) + Var(Y)$ [*]
• For this to hold, we require $Cov(X,Y) < 0$.
• This means that [*] does not hold for any $X$ and $Y$ in general.
• But if we find $X, Y$ such that $Cov(X,Y) < 0$, then [*] holds, and we're in luck!
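The running example is exactly such a lucky case: with $Var(X) = 0.24$, $Var(Y) = 0.25$ and $Cov(X,Y) = -0.1$ (values computed on the earlier slides), the sum's variance is smaller than the sum of variances:

```python
# Values from the running joint-distribution example.
var_x, var_y, cov_xy = 0.24, 0.25, -0.1
var_sum = var_x + var_y + 2 * cov_xy   # Var(X + Y) = 0.29
# Negative covariance shrinks the variance of the sum below 0.49.
print(var_sum < var_x + var_y)  # True
```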

Wonders of Variance
• Variance is related to real-life properties, and affects our decisions.
• Variance (and the associated notion of standard deviation) is related to energy, uncertainty of outcomes, financial risk, etc.
• We typically seek to reduce variance, since most businesses prefer predictability to uncertainty.
• The theory and understanding of variance give us a solid mathematical foundation for spending time and resources on finding matching pairs of $X, Y$ whose covariance is negative.
