Introduction
• We have been looking at differences between means and at the chi-square test of
the independence of two variables.
• Now we are going to look at the relationship between two variables.
• We will use two examples. The first is the relationship between beta-endorphin
levels 12 hours before surgery and 10 minutes before surgery. Are high levels at
one reading associated with high levels at the other? (We ran a t test on these
data about two weeks ago.) The second is the relationship between SAT scores and
performance on an SAT-like test when the subjects have not read the passage on
which the questions are based.
• Here we see that there is a positive relationship between the two variables--we'll
talk about significance later.
• If we want a measure of the degree of this relationship, the correlation is r = 0.699.
o As we'll see later, the relationship is significant.
o What does that mean?
• In this particular example both of the variables are random--we don't know what
the values of X, or Y, will be before the experiment begins.
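For reference, and assuming the correlation reported here is the usual Pearson product-moment coefficient (the notes do not say so explicitly), r can be written as the covariance of X and Y divided by the product of their standard deviations:

```latex
r = \frac{\mathrm{cov}_{XY}}{s_X \, s_Y}
  = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}
         {\sqrt{\sum (X_i - \bar{X})^2 \, \sum (Y_i - \bar{Y})^2}}
```

The second form follows because the N - 1 divisors in the covariance and in the two standard deviations cancel.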
Example with Fixed X
X    Y        X    Y
1  2.201      3  2.811
1  2.411      3  2.857
1  2.407      3  3.422
1  2.403      4  3.233
1  2.826      4  3.505
1  3.380      4  3.192
2  1.893      4  3.209
2  3.102      4  2.860
2  2.355      4  3.111
2  3.644      5  3.200
2  2.767      5  3.253
2  2.109      5  3.357
3  2.906      5  3.169
3  2.118      5  3.291
              5  3.290
Notice that there is no sampling error in X, whereas there was in the previous example.
• Notice how the columns line up. Get them to explain why. (This is common with
fixed X.)
• Notice how judged attractiveness increases with the number of faces included in
the composite.
• Notice how the variability of data points decreases as we increase X. This is a
no-no from the point of view of assumptions behind correlation and regression. It
will also be a problem with the analysis of variance.
o Keep in mind that the assumptions concern the populations, not the sample,
though I'm pretty sure that the assumption is violated here.
o Ask why this might be expected to happen.
• The correlation is about the same as in the previous example--r = .56, and it is
significant.
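The points above can be checked directly from the table. The following is a minimal Python sketch, not the analysis used in class: it assumes the 29 (X, Y) pairs exactly as they appear in the table (X = 3 has only five values in this copy), computes the correlation with a hand-written helper `pearson_r` (a name introduced here for illustration), converts it to the usual t statistic on N - 2 df, and prints the standard deviation of Y at each level of X.

```python
from math import sqrt
from statistics import mean, stdev

# Y values at each fixed level of X, transcribed from the table above.
data = {
    1: [2.201, 2.411, 2.407, 2.403, 2.826, 3.380],
    2: [1.893, 3.102, 2.355, 3.644, 2.767, 2.109],
    3: [2.906, 2.118, 2.811, 2.857, 3.422],
    4: [3.233, 3.505, 3.192, 3.209, 2.860, 3.111],
    5: [3.200, 3.253, 3.357, 3.169, 3.291, 3.290],
}

# Flatten into parallel lists of X and Y, keeping pairs aligned.
xs = [x for x, group in data.items() for _ in group]
ys = [y for group in data.values() for y in group]


def pearson_r(xs, ys):
    """Pearson correlation from sums of squares and cross-products."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


n = len(xs)
r = pearson_r(xs, ys)

# t statistic for testing H0: rho = 0, on n - 2 degrees of freedom.
t = r * sqrt(n - 2) / sqrt(1 - r ** 2)

# Spread of Y within each level of X (homogeneity-of-variance check).
group_sd = {x: stdev(group) for x, group in data.items()}

print(f"N = {n}, r = {r:.3f}, t({n - 2}) = {t:.2f}")
for x, sd in group_sd.items():
    print(f"X = {x}: sd of Y = {sd:.3f}")
```

With these data r should come out close to the r = .56 reported above, and the standard deviations at X = 4 and X = 5 should be noticeably smaller than at X = 1 and X = 2, which is the shrinking variability noted above.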
• I chose this example because it is one that psychologists deal with, and relates to
an important health problem.
• The question is the relationship between age and low-birthweight (we know they
are related), and what happens when mothers do, and do not, smoke.
• Data on smoking mothers (pooled across 48 states; dv = % low birthweight).
• I want them to know what this is, but I don't want them to go away thinking that
we use it very often. (We rarely do.)
• What we want is an unbiased estimate of the correlation in the population.
• Comment that we very rarely use the adjusted coefficient, even though most
programs print it out.
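The notes do not give the formula, but the adjusted coefficient is commonly defined as r_adj = sqrt(1 - (1 - r^2)(N - 1)/(N - 2)). Here is a minimal Python sketch under that assumption. The function name `adjusted_r` is introduced here for illustration, and the example uses r = .56 with N = 29, the number of pairs listed in the fixed-X table in this copy.

```python
from math import sqrt


def adjusted_r(r, n):
    """Adjusted correlation: sqrt(1 - (1 - r^2)(n - 1)/(n - 2)).

    Assumes the common textbook definition. When the quantity under
    the root is negative (small r, small n), this sketch returns 0.0,
    which is a choice made here, not something the notes specify.
    """
    if n <= 2:
        raise ValueError("n must be greater than 2")
    inner = 1 - (1 - r ** 2) * (n - 1) / (n - 2)
    return sqrt(max(inner, 0.0))


r_adj = adjusted_r(0.56, 29)
print(f"r = 0.56, N = 29, adjusted r = {r_adj:.3f}")
```

The adjustment is small at this sample size (about .54 against .56), which is one reason the adjusted value is rarely reported.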
• Here we are looking for the best straight line that can be fit to these data.
• I have included those lines in the plots above.
• We want an equation of the form: