
Chp-07 124-144.qxd 26/5/04 4:26 pm Page 124

CHAPTER 7

Correlation

Learning objectives
In earlier chapters, only single variables have been considered. Now you will be working with
pairs of variables.
After studying this chapter, you should be able to:
■ investigate the strength of a linear relationship between two variables by using suitable
statistical analysis
■ evaluate and interpret the product moment correlation coefficient.

7.1 Scatter diagrams


Scatter diagrams are used where we are examining
possible relationships between two variables.

An athlete, recovering from injury, had her pulse rate measured
after performing a predetermined number of step-ups in a
gymnasium. The measurements were made at weekly intervals.
The table below shows the number of step-ups, x, the pulse rate,
y beats per minute, and the week in which the measurement was
made.

Week    1    2    3    4    5    6    7    8
x      15   50   35   25   20   30   10   45
y     114  155  132  112   96  105   78  113

To construct a scatter diagram you simply plot points with
coordinates (x, y) for each of the 8 weeks.
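The plotting step can be sketched in Python. This is a minimal illustration, not part of the original text: the names `step_ups`, `pulse_rate`, `points` and `plot_scatter` are hypothetical, and the plotting helper assumes the third-party matplotlib library is installed.

```python
# Step-up data from the table above: one (x, y) pair per week.
weeks = [1, 2, 3, 4, 5, 6, 7, 8]
step_ups = [15, 50, 35, 25, 20, 30, 10, 45]          # x
pulse_rate = [114, 155, 132, 112, 96, 105, 78, 113]  # y, beats per minute

# Each point on the scatter diagram has coordinates (x, y).
points = list(zip(step_ups, pulse_rate))


def plot_scatter(points):
    """Draw the scatter diagram (hypothetical helper; needs matplotlib)."""
    # Imported inside the function so the data code above runs without matplotlib.
    import matplotlib.pyplot as plt

    xs, ys = zip(*points)
    plt.scatter(xs, ys)
    plt.xlabel("Number of step-ups")
    plt.ylabel("Pulse rate (beats per minute)")
    plt.show()


# List the coordinates that would be plotted, week by week.
for week, (x, y) in zip(weeks, points):
    print(f"Week {week}: ({x}, {y})")
```

Calling `plot_scatter(points)` would then draw the eight points, with the axis ranges chosen automatically to cover the data.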

[Figure: scatter diagram of pulse rate, y (beats per minute), against number of step-ups, x]

In this case, as you would expect, there appears to be a clear
tendency for the pulse rate to increase with the number of
step-ups.

7.2 Interpreting scatter diagrams


Interpreting a scatter diagram is often the easiest way for
you to decide whether correlation exists. Correlation means
that there is a linear relationship between the two
variables. This could mean that the points lie on a straight
line, but it is much more likely to mean that they are
scattered about a straight line.

The main types of scatter diagram


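Positive correlation
[Figure: points scattered about a line rising from bottom-left to top-right]
● positive or direct correlation
● x increases as y increases
● clear linear relationship exists.

Negative correlation
[Figure: points scattered about a line falling from top-left to bottom-right]
● negative or inverse correlation
● x decreases as y increases
● clear linear relationship exists.

No correlation
[Figure: points scattered with no pattern]
● little or no correlation, no linear relationship
● x and y are not linked
● x and y appear to be independent.

Non-linear relationship
[Figure: two diagrams in which the points follow a curve]
● x and y are clearly linked by a non-linear relationship.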

7.3 Studying results


The table below gives the marks obtained by ten pupils taking
maths and physics tests.
Pupil                          A   B   C   D   E   F   G   H   I   J
Maths mark (out of 30), x     20  23   8  29  14  11  11  20  17  17
Physics mark (out of 40), y   30  35  21  33  33  26  22  31  33  36

Is there a connection between the marks obtained by the ten
pupils in the maths and physics tests?
The starting point would be to plot the marks on a scatter diagram.

[Figure: scatter diagram of physics mark, y (axis from 20 to 40), against maths mark, x (axis from 5 to 30)]

The areas in the bottom-right and top-left of the graph are
almost empty so there is a clear tendency for the points to run
from bottom-left to top-right. This indicates that positive
correlation exists between x and y.

Note: importance of scale.
Consider this change:

[Figure: the same scatter diagram of physics mark against maths mark, with the physics axis running from 0 to 100]

The appearance of the scatter diagram is now very different.
The existence of correlation is much more difficult to identify.
Scales should cover the range of the given data.

Calculating the means:

$\bar{x} = \frac{170}{10} = 17$ and $\bar{y} = \frac{300}{10} = 30$.

Using the lines $x = \bar{x} = 17$ and $y = \bar{y} = 30$, the graph can be
divided into four regions to show this tendency very clearly.

[Figure: scatter diagram of physics mark against maths mark, divided into four regions by the lines $x = 17$ and $y = 30$]
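The division into four regions can also be checked numerically. The sketch below is illustrative only: the function name `region_counts` is hypothetical, and tallying points that lie exactly on a mean line separately is an assumption of this sketch, not something the text specifies.

```python
from statistics import mean

maths = [20, 23, 8, 29, 14, 11, 11, 20, 17, 17]     # x, out of 30
physics = [30, 35, 21, 33, 33, 26, 22, 31, 33, 36]  # y, out of 40


def region_counts(xs, ys):
    """Count points in each region formed by the lines x = mean(xs), y = mean(ys).

    Assumption of this sketch: points lying exactly on either mean line
    are tallied separately under "on a line".
    """
    x_bar, y_bar = mean(xs), mean(ys)
    counts = {"top-right": 0, "top-left": 0,
              "bottom-left": 0, "bottom-right": 0, "on a line": 0}
    for x, y in zip(xs, ys):
        if x == x_bar or y == y_bar:
            counts["on a line"] += 1
        elif x > x_bar:
            counts["top-right" if y > y_bar else "bottom-right"] += 1
        else:
            counts["top-left" if y > y_bar else "bottom-left"] += 1
    return counts


print(mean(maths), mean(physics))
print(region_counts(maths, physics))
```

For the maths and physics marks this gives three points in the top-right region, three in the bottom-left, one in the top-left and none in the bottom-right, with three points lying on one of the mean lines. That matches the bottom-left to top-right tendency described above.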

The table below gives the marks obtained by the ten pupils
taking maths and history tests.
Pupil                          A   B   C   D   E   F   G   H   I   J
Maths mark (out of 30), x     20  23   8  29  14  11  11  20  17  17
History mark (out of 60), z   28  21  42  32  44  56  36  24  51  26

Calculating the mean for z:

$\bar{z} = \frac{360}{10} = 36$

[Figure: scatter diagram of history mark, z (axis from 20 to 60), against maths mark, x (axis from 5 to 30), divided into four regions by the lines $x = 17$ and $z = 36$]

The scatter diagram for maths and history shows a clear
tendency for points to run from top-left to bottom-right. This
indicates that negative correlation exists between x and z.
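To see which pupils produce this tendency, the marks can be grouped by region relative to the means. This is a rough sketch with hypothetical variable names. It assumes that pupils whose mark equals a mean exactly are left out of all four regions.

```python
pupils = "ABCDEFGHIJ"
maths = [20, 23, 8, 29, 14, 11, 11, 20, 17, 17]      # x, out of 30
history = [28, 21, 42, 32, 44, 56, 36, 24, 51, 26]   # z, out of 60

x_bar = sum(maths) / len(maths)      # 170 / 10
z_bar = sum(history) / len(history)  # 360 / 10

rows = list(zip(pupils, maths, history))

# Pupils strictly inside each region; marks equal to a mean are excluded.
top_left = [p for p, x, z in rows if x < x_bar and z > z_bar]
bottom_right = [p for p, x, z in rows if x > x_bar and z < z_bar]
top_right = [p for p, x, z in rows if x > x_bar and z > z_bar]
bottom_left = [p for p, x, z in rows if x < x_bar and z < z_bar]

print("top-left:", top_left)
print("bottom-right:", bottom_right)
print("top-right:", top_right)
print("bottom-left:", bottom_left)
```

Pupils C, E and F fall in the top-left region and A, B, D and H in the bottom-right, while the top-right and bottom-left regions are empty. Pupils G, I and J each have a mark equal to one of the means.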

7.4 Product moment correlation coefficient (PMCC)

(This is often known as Pearson’s correlation coefficient
after Karl Pearson, an applied mathematician who worked on
the application of statistics to genetics and evolution.)

How can the strength of correlation be quantified?


There are two main points to consider.
● How close to a straight line are the points?
● Is the correlation positive or negative?
