
AQA Statistics 1 Correlation and regression

1 of 3 27/02/13 MEI
Section 1: Correlation

Notes and Examples

These notes contain subsections on
- Scatter diagrams
- The product moment correlation coefficient


Scatter diagrams

You probably remember meeting scatter diagrams at GCSE level. You
probably used them to gain a visual impression of whether there was any
correlation in a set of bivariate data. You probably also drew lines of best fit by
eye and may have used them to predict values.

In this chapter you will learn to calculate correlation coefficients which
measure the degree of correlation, and you will also learn (in section 3) to
calculate the equation of the line of best fit for a set of bivariate data. You
might therefore feel that the need for scatter diagrams is superseded by the
use of calculated values. However, although the calculations you will learn in
this chapter will give you more reliable information than the by-eye methods
you used at GCSE, scatter diagrams are still an important part of the process.

- The scatter diagram allows you to spot any obvious outliers which may
affect the results of your calculation.

- You can see from the shape of the scatter diagram which procedures will
be valid. For example, a hypothesis test using the product moment
correlation coefficient is only a valid approach if both variables are
drawn from a Normal distribution, which is indicated by an approximately
elliptical shape. Alternatively, a scatter diagram may show non-linear
correlation, in which case other methods may be used.

- You can get some idea of the degree of correlation, and whether it is
positive or negative, from the scatter diagram, which might allow you to
spot if you have made an error in a calculation.
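
The points above can be illustrated with a short Python sketch. The data sets here are made up for illustration and the function name pmcc is an assumption, not something from the textbook. It shows how a single outlier can change the calculated value, and how a strong but non-linear relationship can give a coefficient of zero.

```python
import math

def pmcc(xs, ys):
    """Product moment correlation coefficient from raw data (hypothetical helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return s_xy / math.sqrt(s_xx * s_yy)

# Made-up data lying exactly on a straight line: perfect positive correlation.
line_x = [1, 2, 3, 4, 5, 6]
line_y = [2, 4, 6, 8, 10, 12]
r_line = pmcc(line_x, line_y)

# The same data with the last point replaced by an obvious outlier.
outlier_y = [2, 4, 6, 8, 10, 1]
r_outlier = pmcc(line_x, outlier_y)

# Made-up data lying exactly on the curve y = x^2, symmetric about x = 0:
# a clear relationship, but not a linear one.
curve_x = [-3, -2, -1, 0, 1, 2, 3]
curve_y = [x ** 2 for x in curve_x]
r_curve = pmcc(curve_x, curve_y)

print(r_line, r_outlier, r_curve)
```

In this sketch one outlier pulls the coefficient down from 1 to roughly 0.23, and the curved data give a coefficient of zero even though y is completely determined by x. Neither feature would be visible from the calculated value alone, but both are obvious on a scatter diagram.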


The product moment correlation coefficient

You will find an explanation of the product moment correlation coefficient, with
the different methods of finding it, on pages 118 and 120 to 121 of your
textbook. Note that there are two equivalent formulae for calculating the
product moment correlation coefficient, and the second one is usually the
more convenient. Examination questions sometimes give the summary
statistics ($\sum x$, $\sum x^2$, $\sum y$, $\sum y^2$, $\sum xy$ and $n$) rather than the raw data, in
which case you must use the second form given on page 121. If the data set
is quite small and the numbers are not awkward, you may prefer the form
given on page 118.
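
The equivalence of the two forms can be checked numerically. The sketch below assumes that the first form works from deviations from the means and that the second works from the sums $\sum x$, $\sum x^2$, $\sum y$, $\sum y^2$ and $\sum xy$ (as used in Example 1). The textbook's exact layout is not reproduced here, and the function names and data are made up for illustration.

```python
import math

def pmcc_deviation_form(xs, ys):
    """r computed from deviations from the means (assumed first form)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return s_xy / math.sqrt(s_xx * s_yy)

def pmcc_summary_form(xs, ys):
    """r computed from the summary sums (assumed second form)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum(x * x for x in xs) - n * x_bar ** 2
    s_yy = sum(y * y for y in ys) - n * y_bar ** 2
    s_xy = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar
    return s_xy / math.sqrt(s_xx * s_yy)

# Small made-up data set with convenient numbers.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

r_first = pmcc_deviation_form(xs, ys)
r_second = pmcc_summary_form(xs, ys)
print(r_first, r_second)
```

For this data set both functions give 0.8, apart from possible floating-point rounding in the last decimal places. The second form needs only the totals, which is why it is the one to use when a question supplies summary statistics.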

The Bivariate data interactive spreadsheet allows you to experiment with
data and see how the value of the correlation coefficient relates to the scatter
diagram. Select the first sheet (product moment). You can alter the position of
the points by changing the values in the table of data, and see how the
correlation coefficient changes. Try to arrange the data so that you have
strong positive correlation, weak positive correlation, strong negative
correlation and weak negative correlation. You could also try getting the
correlation as close to zero as you can.

You can also try the Geogebra resource Correlation, which can be used in a
similar way to the spreadsheet. You can vary the number of points used on
the scatter diagram.

You can also try the product moment correlation coefficient activity, in
which you match up scatter diagrams with values of the correlation coefficient.


Example 1
A researcher wishes to find out if there is any connection between the length of time
young children spend using computers and their reading ability.
She collects data on 50 seven-year-olds, summarised below.

The number of hours spent using a computer during a particular week is denoted by x.
The score in a reading test is denoted by y.
$\sum x = 222$, $\sum x^2 = 1604$, $\sum y = 2892$, $\sum y^2 = 179558$, $\sum xy = 11846$

Find the product moment correlation coefficient, and comment on your results.

Solution
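$$S_{xx} = \sum x^2 - n\bar{x}^2 = 1604 - 50\left(\frac{222}{50}\right)^2 = 618.32$$

$$S_{yy} = \sum y^2 - n\bar{y}^2 = 179558 - 50\left(\frac{2892}{50}\right)^2 = 12284.72$$

$$S_{xy} = \sum xy - n\bar{x}\bar{y} = 11846 - 50 \times \frac{222}{50} \times \frac{2892}{50} = -994.48$$

$$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} = \frac{-994.48}{\sqrt{618.32 \times 12284.72}} = -0.3608$$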


There is weak negative correlation between the number of hours the children
spend using computers and their reading ability.
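
The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch using only the summary statistics given in the question; the function name pmcc_from_summary is an assumption chosen for illustration.

```python
import math

def pmcc_from_summary(n, sum_x, sum_x2, sum_y, sum_y2, sum_xy):
    """Product moment correlation coefficient from summary statistics."""
    x_bar = sum_x / n
    y_bar = sum_y / n
    s_xx = sum_x2 - n * x_bar ** 2      # S_xx
    s_yy = sum_y2 - n * y_bar ** 2      # S_yy
    s_xy = sum_xy - n * x_bar * y_bar   # S_xy (negative for this data)
    return s_xy / math.sqrt(s_xx * s_yy)

# Summary statistics from Example 1 (50 seven-year-olds).
r = pmcc_from_summary(n=50, sum_x=222, sum_x2=1604,
                      sum_y=2892, sum_y2=179558, sum_xy=11846)
print(round(r, 4))  # -0.3608
```

The sign of r comes entirely from S_xy, so a quick check that S_xy is negative confirms that the correlation is negative before the final division is carried out.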



Note:
The value of the correlation coefficient suggests that there may be a
connection between the two variables. It does not mean that using
computers has a negative effect on a child's reading ability. For example, it
might be the case that children who read well spend less time on a computer
because they enjoy reading books, or perhaps parents who spend time
helping their children with reading may also be those who are more likely to
restrict the time their children spend on the computer. Issues such as these
are very complex, and all that the correlation coefficient shows is that there is
a slight connection between the two variables.

This illustrates a very important general point, that correlation does not
imply causation.
