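$SS_{XY} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$
[Figure: scatter plot of five points $(x_1, y_1), \dots, (x_5, y_5)$ on axes $X$ and $Y$, with the means $\bar{x}$ and $\bar{y}$ marked. Each point spans a rectangle with sides $x_i - \bar{x}$ and $y_i - \bar{y}$, labelled +area or –area according to the sign of the product.]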
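The signed-area reading of $SS_{XY}$ can be sketched in Python. This is a minimal illustration only: the five points are invented for the sketch (they are not the points in the figure), and `signed_areas` is a hypothetical helper name.

```python
# Illustrative only: these five (x, y) points are invented for this sketch,
# not taken from the slide's figure.
points = [(1.0, 1.0), (2.0, 5.0), (3.0, 3.0), (4.0, 2.0), (5.0, 4.0)]

def signed_areas(points):
    """Return the signed rectangle area (x_i - x_bar)(y_i - y_bar) per point."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    return [(x - x_bar) * (y - y_bar) for x, y in points]

areas = signed_areas(points)
ss_xy = sum(areas)  # SS_XY is the sum of all signed areas

for (x, y), area in zip(points, areas):
    label = "+area" if area > 0 else "-area" if area < 0 else "zero"
    print(f"({x}, {y}): {area:+.2f} {label}")
print("SS_XY =", ss_xy)
```

Points above and to the right of the means, or below and to the left, add positive area; points in the other two quadrants subtract area.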
NBS 2016S1 AB1202 CCK-STAT-018
$\mathrm{Cov}(X, Y) = \sum_{i=1}^{m} \sum_{j=1}^{n} (x_j - \mu_X)(y_i - \mu_Y) \, P(X = x_j, Y = y_i)$
where $m$ is the number of rows (unique values of Y), $n$ is the number of columns (unique values of X), and $m \times n = M$.
• When we use probabilities, the covariance calculated is implicitly the population covariance.

        X1=8   X2=9
Y1=6    0.1    0.4
Y2=7    0.3    0.2
P(X=x, Y=y)
have descriptives like mean, standard deviation, etc, …
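Calculating Covariance(X, Y)
• Given the observation data below, calculate the covariance of X and Y.

X   Y
8   6
9   7
8   7
9   6
8   7
9   7
9   6
8   7
9   6
9   6

• For a sample:
$$
\begin{aligned}
\mathrm{Cov}_{sample}(X, Y) &= \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \\
&= \frac{1}{10-1} \big[ (8 - 8.6)(6 - 6.5) + (9 - 8.6)(7 - 6.5) \\
&\qquad + (8 - 8.6)(7 - 6.5) + (9 - 8.6)(6 - 6.5) \\
&\qquad + (8 - 8.6)(7 - 6.5) + (9 - 8.6)(7 - 6.5) \\
&\qquad + (9 - 8.6)(6 - 6.5) + (8 - 8.6)(7 - 6.5) \\
&\qquad + (9 - 8.6)(6 - 6.5) + (9 - 8.6)(6 - 6.5) \big] \\
&= \frac{1}{9}(-1) = -0.1111
\end{aligned}
$$
• If the data is the entire population, then:
$$
\mathrm{Cov}(X, Y) = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu_X)(y_i - \mu_Y) = \frac{1}{10}(-1) = -0.1
$$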
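The two calculations on this slide can be checked with a short Python sketch. The function name `covariance` and its `population` flag are assumptions made for this illustration; the ten observations are the ones listed on the slide.

```python
# The ten (X, Y) observations from the slide.
data = [(8, 6), (9, 7), (8, 7), (9, 6), (8, 7),
        (9, 7), (9, 6), (8, 7), (9, 6), (9, 6)]

def covariance(pairs, population=False):
    """Covariance of paired observations.

    Divides the sum of cross-products by n - 1 (sample) by default,
    or by n when the data is treated as the entire population.
    """
    n = len(pairs)
    x_bar = sum(x for x, _ in pairs) / n
    y_bar = sum(y for _, y in pairs) / n
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    return ss_xy / (n if population else n - 1)

print(round(covariance(data), 4))                   # sample covariance
print(round(covariance(data, population=True), 4))  # population covariance
```

The only difference between the two results is the divisor; the sum of cross-products is −1 in both cases.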
Covariance of Distributions
• More commonly, we are given tabulated joint probabilities.
• The table below is tabulated from the same data set as the previous slide.

        X1=8   X2=9
Y1=6    0.1    0.4
Y2=7    0.3    0.2
P(X=x, Y=y)

• Calculating covariance will be:
$$
\begin{aligned}
\mu_X &= 8 \cdot 0.4 + 9 \cdot 0.6 = 8.6 \\
\mu_Y &= 6 \cdot 0.5 + 7 \cdot 0.5 = 6.5 \\
\mathrm{Cov}(X, Y) &= \sum_{i=1}^{m} \sum_{j=1}^{n} (x_j - \mu_X)(y_i - \mu_Y) \, P(X = x_j, Y = y_i) \\
&= (8 - 8.6)(6 - 6.5) \cdot 0.1 + (9 - 8.6)(6 - 6.5) \cdot 0.4 \\
&\qquad + (8 - 8.6)(7 - 6.5) \cdot 0.3 + (9 - 8.6)(7 - 6.5) \cdot 0.2 \\
&= -0.1
\end{aligned}
$$
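The same double sum can be sketched in Python. The dictionary layout and the name `joint_covariance` are assumptions chosen for this illustration; the probabilities are those in the table.

```python
# Joint probabilities P(X = x, Y = y) from the table, keyed by (x, y).
joint = {(8, 6): 0.1, (9, 6): 0.4,
         (8, 7): 0.3, (9, 7): 0.2}

def joint_covariance(joint):
    """Population covariance of X and Y from a joint probability table."""
    # Summing x * p over every cell gives the same mean as using the
    # marginal probabilities P(X = x).
    mu_x = sum(x * p for (x, _), p in joint.items())
    mu_y = sum(y * p for (_, y), p in joint.items())
    return sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())

print(round(joint_covariance(joint), 4))
```

Because the probabilities are the relative frequencies of the ten observations, the result agrees with the population covariance computed from the raw data.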
Correlation of Distributions
• More commonly, we are given tabulated joint probabilities.
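Wonders of Variance
• If $a, b \ge 0$, is it possible to make
  $\mathrm{Var}(aX + bY) < \mathrm{Var}(X) + \mathrm{Var}(Y)$?
  Expanding the left-hand side, this requires
  $a^2 \mathrm{Var}(X) + b^2 \mathrm{Var}(Y) + 2ab \, \mathrm{Cov}(X, Y) < \mathrm{Var}(X) + \mathrm{Var}(Y)$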
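A quick numeric check of the inequality, as a sketch in Python. The helper name `var_linear_combo` is hypothetical. Cov(X, Y) = −0.1 is the value from the joint table used in these slides; Var(X) = 0.24 and Var(Y) = 0.25 are not stated on the slides but follow from that table's marginals.

```python
def var_linear_combo(a, b, var_x, var_y, cov_xy):
    """Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y)."""
    return a**2 * var_x + b**2 * var_y + 2 * a * b * cov_xy

# Population values for the 2x2 joint table in these slides.
# Var(X) and Var(Y) are computed from the table's marginals (an assumption
# of this sketch); Cov(X, Y) = -0.1 is the slide's own result.
var_x, var_y, cov_xy = 0.24, 0.25, -0.1

combined = var_linear_combo(1, 1, var_x, var_y, cov_xy)
print(round(combined, 4), "<", round(var_x + var_y, 4))
```

With a = b = 1 the inequality reduces to 2 Cov(X, Y) < 0, so any pair with negative covariance satisfies it.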
Wonders of Variance
• Variance is related to real-life properties, and affects
our decisions.
• Variance (and the associated notion of standard
deviation) is related to energy, uncertainty of
outcomes, financial risk, etc.
• We typically seek to reduce variance, since most
businesses prefer predictability to uncertainty.
• The theory of variance gives us a solid mathematical
foundation for spending time and resources on
finding matching pairs of $X, Y$ whose covariance
is negative.
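One way such a search for matching pairs might look, sketched in Python. Everything here is hypothetical: the three series are invented numbers, and `population_cov` and `negative_pairs` are names chosen for this illustration.

```python
from itertools import combinations

# Hypothetical series, invented purely for illustration.
series = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [4.0, 3.0, 2.0, 1.0],
    "C": [1.0, 3.0, 2.0, 4.0],
}

def population_cov(xs, ys):
    """Population covariance of two equal-length sequences."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n

def negative_pairs(series):
    """Return (name1, name2, cov) for pairs with negative covariance,
    most negative first."""
    pairs = [(a, b, population_cov(series[a], series[b]))
             for a, b in combinations(series, 2)]
    return sorted((p for p in pairs if p[2] < 0), key=lambda p: p[2])

for a, b, cov in negative_pairs(series):
    print(a, b, cov)
```

Pairs at the top of the list reduce the variance of the combined outcome the most when added together.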