Covariance and Correlation
Math 217 Probability and Statistics
Prof. D. Joyce, Fall 2014

Covariance. Let X and Y be joint random variables. Their covariance Cov(X, Y) is defined by
\[
\operatorname{Cov}(X,Y) = E\bigl((X-\mu_X)(Y-\mu_Y)\bigr).
\]
Notice that the variance of X is just the covariance of X with itself:
\[
\operatorname{Var}(X) = E\bigl((X-\mu_X)^2\bigr) = \operatorname{Cov}(X,X).
\]
Analogous to the identity for variance
\[
\operatorname{Var}(X) = E(X^2) - \mu_X^2,
\]
there is an identity for covariance:
\[
\operatorname{Cov}(X,Y) = E(XY) - \mu_X \mu_Y.
\]
Here’s the proof:
\[
\begin{aligned}
\operatorname{Cov}(X,Y) &= E\bigl((X-\mu_X)(Y-\mu_Y)\bigr) \\
&= E(XY - \mu_X Y - X \mu_Y + \mu_X \mu_Y) \\
&= E(XY) - \mu_X E(Y) - E(X)\,\mu_Y + \mu_X \mu_Y \\
&= E(XY) - \mu_X \mu_Y.
\end{aligned}
\]
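The identity is easy to check numerically. Here is a minimal sketch in Python (the joint distribution is made up purely for illustration), computing the covariance both from the definition and from the identity:

```python
# Check Cov(X, Y) = E((X - mu_X)(Y - mu_Y)) = E(XY) - mu_X mu_Y
# on a small discrete joint distribution (made-up numbers).
import numpy as np

xy = np.array([(0.0, 1.0), (1.0, 3.0), (2.0, 2.0)])  # outcomes (x, y)
p = np.array([0.2, 0.5, 0.3])                        # their probabilities

x, y = xy[:, 0], xy[:, 1]
mu_x, mu_y = p @ x, p @ y                            # E(X), E(Y)

cov_def = p @ ((x - mu_x) * (y - mu_y))              # E((X - mu_X)(Y - mu_Y))
cov_id = p @ (x * y) - mu_x * mu_y                   # E(XY) - mu_X mu_Y
print(cov_def, cov_id)                               # same value both ways
```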
Covariance can be positive, zero, or negative. Positive covariance indicates that there’s an overall tendency that when one variable increases, so does the other, while negative covariance indicates an overall tendency that when one increases the other decreases.
If X and Y are independent variables, then their covariance is 0:
\[
\operatorname{Cov}(X,Y) = E(XY) - \mu_X \mu_Y = E(X)\,E(Y) - \mu_X \mu_Y = 0.
\]
The converse, however, is not always true. Cov(X, Y) can be 0 for variables that are not independent. For an example where the covariance is 0 but X and Y aren’t independent, let there be three outcomes, (−1, 1), (0, −2), and (1, 1), all with the same probability 1/3. They’re clearly not independent since the value of X determines the value of Y. Note that µX = 0 and µY = 0, so
\[
\begin{aligned}
\operatorname{Cov}(X,Y) &= E\bigl((X-\mu_X)(Y-\mu_Y)\bigr) = E(XY) \\
&= \tfrac13(-1) + \tfrac13(0) + \tfrac13(1) = 0.
\end{aligned}
\]
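That example is small enough to check directly; here is a quick sketch in the same style as above:

```python
# The three-outcome example: Y is completely determined by X,
# yet the covariance is exactly 0.
import numpy as np

xy = np.array([(-1.0, 1.0), (0.0, -2.0), (1.0, 1.0)])  # outcomes (x, y)
p = np.full(3, 1/3)                                    # equal probabilities

x, y = xy[:, 0], xy[:, 1]
mu_x, mu_y = p @ x, p @ y                              # both are 0
cov = p @ ((x - mu_x) * (y - mu_y))
print(mu_x, mu_y, cov)                                 # 0.0 0.0 0.0
```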
We’ve already seen that when X and Y are independent, the variance of their sum is the sum of their variances. There’s a general formula to deal with their sum when they aren’t independent. A covariance term appears in that formula:
\[
\operatorname{Var}(X+Y) = \operatorname{Var}(X) + \operatorname{Var}(Y) + 2\operatorname{Cov}(X,Y).
\]
Here’s the proof:
\[
\begin{aligned}
\operatorname{Var}(X+Y) &= E\bigl((X+Y)^2\bigr) - \bigl(E(X+Y)\bigr)^2 \\
&= E(X^2 + 2XY + Y^2) - (\mu_X + \mu_Y)^2 \\
&= E(X^2) + 2E(XY) + E(Y^2) - \mu_X^2 - 2\mu_X \mu_Y - \mu_Y^2 \\
&= E(X^2) - \mu_X^2 + 2\bigl(E(XY) - \mu_X \mu_Y\bigr) + E(Y^2) - \mu_Y^2 \\
&= \operatorname{Var}(X) + 2\operatorname{Cov}(X,Y) + \operatorname{Var}(Y).
\end{aligned}
\]
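A simulation makes a good sanity check. The sketch below uses deliberately dependent variables (the data and coefficients are hypothetical) and population-style moments throughout, so the identity holds up to floating-point error:

```python
# Check Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y) on dependent samples.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = 0.5 * x + rng.normal(size=100_000)     # y depends on x

cov = np.mean(x * y) - np.mean(x) * np.mean(y)
lhs = np.var(x + y)
rhs = np.var(x) + np.var(y) + 2 * cov
print(lhs, rhs)                            # agree to floating-point precision
```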
Bilinearity of covariance. Covariance is linear in each coordinate. That means two things. First, you can pass constants through either coordinate:
\[
\operatorname{Cov}(aX, Y) = a\operatorname{Cov}(X,Y) = \operatorname{Cov}(X, aY).
\]
Second, it preserves sums in each coordinate:
\[
\operatorname{Cov}(X_1 + X_2,\, Y) = \operatorname{Cov}(X_1, Y) + \operatorname{Cov}(X_2, Y)
\]
and
\[
\operatorname{Cov}(X,\, Y_1 + Y_2) = \operatorname{Cov}(X, Y_1) + \operatorname{Cov}(X, Y_2).
\]
Here’s a proof of the first equation in the first condition:
\[
\begin{aligned}
\operatorname{Cov}(aX, Y) &= E\bigl((aX - E(aX))(Y - E(Y))\bigr) \\
&= E\bigl(a(X - E(X))(Y - E(Y))\bigr) \\
&= a\,E\bigl((X - E(X))(Y - E(Y))\bigr) \\
&= a\operatorname{Cov}(X,Y).
\end{aligned}
\]
The proof of the second condition is also straightforward.
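Both bilinearity properties also hold exactly for empirical (population-style) moments, so they can be spot-checked on simulated data; a minimal sketch:

```python
# Spot-check bilinearity: Cov(aX, Y) = a Cov(X, Y) and
# Cov(X1 + X2, Y) = Cov(X1, Y) + Cov(X2, Y).
import numpy as np

def cov(u, v):
    """Population covariance E(UV) - E(U)E(V) of two samples."""
    return np.mean(u * v) - np.mean(u) * np.mean(v)

rng = np.random.default_rng(1)
x1, x2, y = rng.normal(size=(3, 50_000))
a = 2.5

print(np.isclose(cov(a * x1, y), a * cov(x1, y)))            # True
print(np.isclose(cov(x1 + x2, y), cov(x1, y) + cov(x2, y)))  # True
```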
Correlation. The correlation ρXY of two joint variables X and Y is a normalized version of their covariance. It’s defined by the equation
\[
\rho_{XY} = \frac{\operatorname{Cov}(X,Y)}{\sigma_X \sigma_Y}.
\]
Note that independent variables have 0 correlation as well as 0 covariance.
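Here is the definition in code, compared against numpy’s built-in np.corrcoef (the simulated data are hypothetical; correlation is unchanged by how the moments are normalized, so the two computations agree):

```python
# Correlation from the definition, rho = Cov(X, Y) / (sigma_X sigma_Y).
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=50_000)
y = -0.8 * x + rng.normal(size=50_000)     # negatively related to x

cov = np.mean(x * y) - np.mean(x) * np.mean(y)
rho = cov / (np.std(x) * np.std(y))
print(rho, np.corrcoef(x, y)[0, 1])        # the two values agree
```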
By dividing by the product σX σY of the standard deviations, the correlation becomes bounded between plus and minus 1:
\[
-1 \le \rho_{XY} \le 1.
\]
There are various ways you can prove that inequality. Here’s one. We’ll start by proving
\[
0 \le \operatorname{Var}\!\left(\frac{X}{\sigma_X} \pm \frac{Y}{\sigma_Y}\right) = 2(1 \pm \rho_{XY}).
\]
There are actually two equations there, and we can prove them at the same time. First note the “0 ≤” parts follow from the fact that variance is nonnegative. Next use the property proved above about the variance of a sum:
\[
\operatorname{Var}\!\left(\frac{X}{\sigma_X} \pm \frac{Y}{\sigma_Y}\right)
= \operatorname{Var}\!\left(\frac{X}{\sigma_X}\right) + \operatorname{Var}\!\left(\pm\frac{Y}{\sigma_Y}\right) + 2\operatorname{Cov}\!\left(\frac{X}{\sigma_X},\, \pm\frac{Y}{\sigma_Y}\right).
\]
Now use the fact that Var(cX) = c² Var(X) to rewrite that as
\[
\frac{1}{\sigma_X^2}\operatorname{Var}(X) + \frac{1}{\sigma_Y^2}\operatorname{Var}(\pm Y) + 2\operatorname{Cov}\!\left(\frac{X}{\sigma_X},\, \pm\frac{Y}{\sigma_Y}\right).
\]
But Var(X) = σX² and Var(−Y) = Var(Y) = σY², so that equals
\[
2 + 2\operatorname{Cov}\!\left(\frac{X}{\sigma_X},\, \pm\frac{Y}{\sigma_Y}\right).
\]
By the bilinearity of covariance, that equals
\[
2 \pm \frac{2}{\sigma_X \sigma_Y}\operatorname{Cov}(X,Y) = 2 \pm 2\rho_{XY},
\]
and we’ve shown that
\[
0 \le 2(1 \pm \rho_{XY}).
\]
Next, divide by 2 and move one term to the other side of the inequality to get
\[
\mp\rho_{XY} \le 1,
\]
so
\[
-1 \le \rho_{XY} \le 1.
\]
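The key identity in that proof, Var(X/σX ± Y/σY) = 2(1 ± ρXY), also holds exactly for empirical moments, so it too can be checked on simulated (hypothetical) data:

```python
# Check Var(X/sx ± Y/sy) = 2(1 ± rho) with population-style moments.
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=50_000)
y = 0.6 * x + rng.normal(size=50_000)

sx, sy = np.std(x), np.std(y)
rho = (np.mean(x * y) - np.mean(x) * np.mean(y)) / (sx * sy)

for sign in (+1, -1):
    lhs = np.var(x / sx + sign * y / sy)
    rhs = 2 * (1 + sign * rho)
    print(np.isclose(lhs, rhs))            # True for both signs
```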
This exercise should remind you of the same kind of thing that goes on in linear algebra. In fact, it is the same thing exactly. Take a set of real-valued random variables, not necessarily independent. Their linear combinations form a vector space. Their covariance is the inner product (also called the dot product or scalar product) of two vectors in that space:
\[
X \cdot Y = \operatorname{Cov}(X,Y).
\]
The norm ‖X‖ of X is the square root of ‖X‖², defined by
\[
\|X\|^2 = X \cdot X = \operatorname{Cov}(X,X) = \operatorname{Var}(X) = \sigma_X^2,
\]
and, so, the angle θ between X and Y is defined by
\[
\cos\theta = \frac{X \cdot Y}{\|X\|\,\|Y\|} = \frac{\operatorname{Cov}(X,Y)}{\sigma_X \sigma_Y} = \rho_{XY},
\]
that is, θ is the arccosine of the correlation ρXY.
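As a last sketch, here is that angle computed for a pair of simulated variables (hypothetical data); with Y equal to X plus independent noise of equal variance, ρXY is about 1/√2, so θ comes out near 45°:

```python
# The angle between two random variables: theta = arccos(rho_XY).
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=50_000)
y = x + rng.normal(size=50_000)            # rho is about 1/sqrt(2)

rho = np.corrcoef(x, y)[0, 1]
theta = np.degrees(np.arccos(rho))
print(rho, theta)                          # about 0.707 and 45 degrees
```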
Math 217 Home Page at
http://math.clarku.edu/~djoyce/ma217/