
Prof. Thistleton
MAT370 Applied Probability
Lecture 25

Sections from Text and Homework Problems: Read 5.3; Problems 30, 31, 32.
Topics from Syllabus: Correlation Results.
Harvard Lectures: Lecture 21.

Review and Looking Ahead

What do we know about joint (typically pairwise, so far) distributions? We know about:
- Joint and marginal distributions, for both discrete and continuous random variables
- Conditional distributions, conditional expectation, and the total expectation theorem
- Covariance

We are about to review another example of a covariance calculation, the one you computed in the last lecture with face cards and spades. But first, let's explore some theory. I'll prove these in the continuous case; you should take out a blank sheet of paper and work them for the discrete case.

E[X + Y] = E[X] + E[Y]

We use our favorite result:

E[g(X, Y)] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y)\, f_{X,Y}(x, y)\, dx\, dy

Just apply some Calc III now. We will distribute, and then switch the order of integration, pulling independent variables through the integral as we go:

E[X + Y] = \int \int (x + y)\, f_{X,Y}(x, y)\, dx\, dy = \int x \left[ \int f_{X,Y}(x, y)\, dy \right] dx + \int y \left[ \int f_{X,Y}(x, y)\, dx \right] dy

Recognizing the definition of a marginal distribution:

E[X + Y] = \int x\, f_X(x)\, dx + \int y\, f_Y(y)\, dy = E[X] + E[Y]
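As a quick sanity check, linearity of expectation can be verified numerically on a small discrete example. The joint pmf below is made up purely for illustration; it is not from the notes.

```python
# Numerical check of E[X + Y] = E[X] + E[Y] on an arbitrary joint pmf.
import numpy as np

# Hypothetical joint pmf: p[i, j] = P(X = xs[i], Y = ys[j]); values are made up.
xs = np.array([0.0, 1.0, 2.0])
ys = np.array([0.0, 1.0])
p = np.array([[0.10, 0.20],
              [0.25, 0.15],
              [0.20, 0.10]])
assert np.isclose(p.sum(), 1.0)       # a valid pmf sums to one

# Marginals come from summing the joint pmf over the other variable.
px = p.sum(axis=1)                    # P(X = x)
py = p.sum(axis=0)                    # P(Y = y)

EX = (xs * px).sum()
EY = (ys * py).sum()

# E[X + Y] computed directly from the joint pmf.
EXplusY = sum(p[i, j] * (xs[i] + ys[j])
              for i in range(len(xs)) for j in range(len(ys)))

print(np.isclose(EXplusY, EX + EY))   # True
```

Note that no independence was needed anywhere: linearity of expectation holds for any joint distribution.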


E[XY] = E[X]\, E[Y] when the random variables are independent

Play the same game. Just remember that independence allows us to factor the joint distribution as the product of the marginals, and then pull constants through the integrals:

E[XY] = \int \int xy\, f_{X,Y}(x, y)\, dx\, dy = \int \int xy\, f_X(x)\, f_Y(y)\, dx\, dy

= \int x\, f_X(x) \left[ \int y\, f_Y(y)\, dy \right] dx

= E[Y] \int x\, f_X(x)\, dx

= E[X]\, E[Y]

Really, you just multiply and pull constants through the expected value operator:

Cov(X, Y) = E[(X - E[X])(Y - E[Y])] = E[XY - X E[Y] - Y E[X] + E[X] E[Y]] = E[XY] - E[X]\, E[Y]

Cov(X, Y) = 0 when the random variables are independent

Easy Peasy Lemon Squeezy. Since independence gives E[XY] = E[X]\, E[Y],

Cov(X, Y) = E[XY] - E[X]\, E[Y] = E[X]\, E[Y] - E[X]\, E[Y] = 0
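The same fact can be checked numerically: when the joint pmf factors as the product of the marginals, the covariance is zero. The marginals below are arbitrary choices for illustration, not from the notes.

```python
# Sanity check: a product-form joint pmf (independence) has zero covariance.
import numpy as np

xs, px = np.array([0.0, 1.0, 2.0]), np.array([0.2, 0.5, 0.3])  # made-up marginal of X
ys, py = np.array([-1.0, 1.0]),     np.array([0.4, 0.6])       # made-up marginal of Y
p = np.outer(px, py)               # independence: p(x, y) = pX(x) * pY(y)

X, Y = np.meshgrid(xs, ys, indexing="ij")   # grids of x and y values
E = lambda g: (p * g).sum()                  # expectation under the joint pmf

cov = E(X * Y) - E(X) * E(Y)
print(np.isclose(cov, 0.0))        # True
```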

Show that the converse is not true. Here we go: if we find one counterexample, we are done. Luckily, you calculated the covariance in the last lecture for faces and spades. As a reminder:


The joint pmf table from the last lecture has X = number of spades (taking values 0, 1, 2) across the top and Y = number of face cards down the side.

For the marginal expectations you can apply the definition or recall that X and Y are individually hypergeometric, so E[X] = n K / N. With n = 2 cards drawn, in either case,

E[X] = 2 \cdot \frac{13}{52} = \frac{1}{2} \quad \text{and} \quad E[Y] = 2 \cdot \frac{12}{52} = \frac{6}{13}

(A quick aside: the first time I thought up this example, I did my calculation with a hand calculator and had a terrible time. The calculation is very sensitive to rounding, so stay in fractions.) Looping around the table,

E[XY] = \frac{3}{13} = E[X]\, E[Y], \quad \text{so} \quad Cov(X, Y) = 0

even though X and Y are not independent.

This is actually an important point:

Independent random variables are uncorrelated, but uncorrelated random variables need not be independent.
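The counterexample can be verified by exact enumeration. The sketch below assumes, as the X values 0, 1, 2 in the table suggest, that two cards are drawn without replacement; it stays in exact fractions, as the aside above recommends.

```python
# Exact check of the faces-and-spades counterexample: zero covariance,
# yet not independent. Assumes a 2-card draw from a standard 52-card deck.
from fractions import Fraction
from itertools import combinations

# Encode the deck: 13 ranks x 4 suits; suit 0 is spades, ranks 11-13 are faces.
deck = [(rank, suit) for rank in range(1, 14) for suit in range(4)]
hands = list(combinations(deck, 2))
n = Fraction(len(hands))               # C(52, 2) = 1326 equally likely hands

def spades(hand):
    return sum(1 for r, s in hand if s == 0)

def faces(hand):
    return sum(1 for r, s in hand if r >= 11)

EX  = sum(Fraction(spades(h)) for h in hands) / n
EY  = sum(Fraction(faces(h))  for h in hands) / n
EXY = sum(Fraction(spades(h) * faces(h)) for h in hands) / n

print(EX, EY)                          # 1/2 6/13
print(EXY - EX * EY)                   # 0 -- X and Y are uncorrelated

# ...but they are NOT independent: the joint pmf does not factor.
pX2   = sum(1 for h in hands if spades(h) == 2) / n
pY2   = sum(1 for h in hands if faces(h)  == 2) / n
pX2Y2 = sum(1 for h in hands if spades(h) == 2 and faces(h) == 2) / n
print(pX2Y2 == pX2 * pY2)              # False
```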


This must be important, since I put it in a box. An especially important special case is the multivariate normal distribution: in that case, uncorrelated is synonymous with independent. More on that will follow.

Derive a formula for the variance of a linear combination of random variables. That is, find Var[aX + bY]. This will be a crucial result over the next several lectures. Writing \mu_X = E[X] and \mu_Y = E[Y], we can say

Var[aX + bY] = E[(aX + bY - (a\mu_X + b\mu_Y))^2]

Now just multiply like crazy, grouping terms as convenient:

Var[aX + bY] = E[a^2 (X - \mu_X)^2 + b^2 (Y - \mu_Y)^2 + 2ab (X - \mu_X)(Y - \mu_Y)]

= a^2\, Var[X] + b^2\, Var[Y] + 2ab\, Cov(X, Y)
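This identity can be checked numerically against an exact expectation over a small joint pmf. The pmf and the coefficients a, b below are arbitrary choices for illustration.

```python
# Numerical check of Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y).
import numpy as np

xs = np.array([0.0, 1.0, 2.0])
ys = np.array([0.0, 1.0])
p = np.array([[0.10, 0.20],
              [0.25, 0.15],
              [0.20, 0.10]])          # made-up joint pmf; rows index x, cols index y

X, Y = np.meshgrid(xs, ys, indexing="ij")   # grids of x and y values
E = lambda g: (p * g).sum()                  # expectation under the joint pmf

EX, EY = E(X), E(Y)
VarX  = E((X - EX) ** 2)
VarY  = E((Y - EY) ** 2)
CovXY = E((X - EX) * (Y - EY))

a, b = 3.0, -2.0                             # arbitrary coefficients
lhs = E((a * X + b * Y - (a * EX + b * EY)) ** 2)
rhs = a**2 * VarX + b**2 * VarY + 2 * a * b * CovXY
print(np.isclose(lhs, rhs))                  # True
```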

As an interesting special case, when X and Y are independent,

Var[X + Y] = Var[X] + Var[Y]

This extends to the following. If X_1, X_2, \ldots, X_n are independent,

Var[X_1 + X_2 + \cdots + X_n] = Var[X_1] + Var[X_2] + \cdots + Var[X_n]

So, if we take an average of random variables which are independent, identically distributed with common variance \sigma^2,

Var[\bar{X}] = Var\left[\frac{1}{n} \sum_{i=1}^{n} X_i\right] = \frac{\sigma^2}{n}
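A quick Monte Carlo illustration of the \sigma^2 / n result (the Normal population and its variance below are arbitrary choices, not from the notes):

```python
# The variance of the sample mean of n iid draws shrinks like sigma^2 / n.
import numpy as np

rng = np.random.default_rng(0)
sigma2 = 4.0                           # population variance of each draw (arbitrary)
results = {}
for n in (10, 100):
    # 200,000 replications of the mean of n iid Normal(0, sigma2) draws
    means = rng.normal(0.0, np.sqrt(sigma2), size=(200_000, n)).mean(axis=1)
    results[n] = means.var()
    print(n, results[n], sigma2 / n)   # empirical variance vs. theoretical sigma^2 / n
```

The printed empirical variances should sit very close to 0.4 for n = 10 and 0.04 for n = 100, matching the formula.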


Look at that denominator: the variability of an average decreases dramatically as the sample size increases. This is why we trust a sample of size 100 much more than a sample of size 10.

We have now seen several results concerning joint, marginal, and conditional distributions. We have also seen that a measure of linear relation between random variables is the covariance. There are other ways to measure dependency, such as mutual information, but a remarkable theory may be built upon covariances, especially when working with multivariate distributions. We take a moment here, before moving on to continuous distributions, to define the correlation between two random variables. This is useful because we feel that the strength of the relationship should not depend upon whether we measure in inches, feet, or miles.

The Correlation Coefficient, \rho_{X,Y}
If you think about it, it may be useful to scale the covariance by standardizing our random variables, just as we found it useful to consider the standard normal distribution. We had, given X \sim N(\mu, \sigma^2),

Z = \frac{X - \mu}{\sigma}

Since we defined the covariance as an expected value:

Cov[X, Y] = E[(X - \mu_X)(Y - \mu_Y)]

We can define the correlation coefficient as the covariance between the standardized random variables:

\rho_{X,Y} = E\left[\left(\frac{X - \mu_X}{\sigma_X}\right)\left(\frac{Y - \mu_Y}{\sigma_Y}\right)\right] = \frac{Cov(X, Y)}{\sigma_X \sigma_Y}
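The inches-feet-or-miles point can be made concrete: rescaling the data rescales the covariance but leaves the correlation untouched. The simulated heights below are made-up data for illustration.

```python
# Correlation is invariant under a change of units; covariance is not.
import numpy as np

rng = np.random.default_rng(1)
x_inches = rng.normal(70, 3, size=100_000)
y_inches = x_inches + rng.normal(0, 2, size=100_000)   # linearly related plus noise

x_feet, y_feet = x_inches / 12, y_inches / 12          # same data, different units

cov_in = np.cov(x_inches, y_inches)[0, 1]
cov_ft = np.cov(x_feet, y_feet)[0, 1]
rho_in = np.corrcoef(x_inches, y_inches)[0, 1]
rho_ft = np.corrcoef(x_feet, y_feet)[0, 1]

print(np.isclose(cov_in / cov_ft, 144.0))   # covariance scales by 12 * 12
print(np.isclose(rho_in, rho_ft))           # correlation does not change
```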

Obviously, if X and Y are independent, then the correlation between them is zero. People like the correlation coefficient because, among other things, it makes the degree of linear relationship easy to understand. In particular, we can show that

-1 \le \rho_{X,Y} \le 1

To see how a correlation might be unity, consider the following ideas. First, supposing X and Y are random variables such that Y = aX + b, show (trivially) that Cov(X, Y) = a\, Var[X]. (We have seen this before; this is just a reminder.) Then relate the variances of X and Y, so that \sigma_Y = |a|\, \sigma_X, and show that

\rho_{X,Y} = \frac{Cov(X, Y)}{\sigma_X \sigma_Y} = \frac{a\, Var[X]}{\sigma_X \cdot |a|\, \sigma_X} = \frac{a}{|a|} = \pm 1

Finally, we have that |\rho_{X,Y}| = 1 exactly when Y is a linear function of X.
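A one-line numerical confirmation that an exact linear relation gives correlation plus or minus one, with the sign of the slope (the slopes and intercept below are arbitrary):

```python
# If Y = aX + b exactly, the correlation coefficient is +1 or -1, matching sign(a).
import numpy as np

x = np.linspace(0.0, 10.0, 101)
rhos = {}
for a in (2.5, -0.7):                       # one positive and one negative slope
    y = a * x + 1.0                         # exact linear relation, intercept b = 1
    rhos[a] = np.corrcoef(x, y)[0, 1]
    print(a, rhos[a])
```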

We will soon be considering statistics, many of which are built from sums of independent random variables. Take a moment to compute the variance of the sum:

Var[X + Y] = Var[X] + Var[Y] + 2\, Cov(X, Y)

In the special case that X and Y are independent, show that

Var[X + Y] = Var[X] + Var[Y]

Finally, if X_1, X_2, \ldots, X_n are independent, what is the variance of \sum_{i=1}^{n} X_i?
