Professional Documents
Culture Documents
Moment Correlation
University of St. La Salle
Source: statistics.laerd.com
The Pearson product-moment correlation generates a
coefficient called the Pearson correlation coefficient,
denoted as r (i.e., the italic lowercase letter r). This
coefficient measures the strength and direction of a linear
relationship between two continuous variables. Its value
can range from -1 for a perfect negative linear relationship
to +1 for a perfect positive linear relationship. A value of 0
(zero) indicates no relationship between two variables.
This statistical test is often known by its shorter title of the
Pearson correlation or Pearson's correlation.
Variables required
In order to run a Pearson correlation, you require the following variables:
» Two variables that are continuous (e.g., weight, cholesterol) and are
paired (i.e., each case has two values - one for each variable).
What problems can you solve using a
Pearson correlation?
If your variables show a monotonic relationship, you can either (1) run a Spearman's rank-
order correlation or (2) coax the non-linear relationship to a linear one using one or more
transformations on your data. You should choose (1) if you feel that transformations (e.g.,
log-transformations) are too advanced for you, or if the transformed variable will no longer
hold much meaning. This is the simplest option and the Spearman's rank-order correlation
is a well-established test. In all other cases, you should consider a transformation of the
data. However, it should be noted that not all non-linear relationships can be coaxed into a
linear relationship. For example, the far-right non-linear relationship in the diagram above
would be difficult to transform into a linear relationship (but potentially possible). So, to coax
a non-linear relationship to a linear one, you need to apply a transformation to either to one,
or both, of the variables. There are many different types of transformations that can be
applied for different types of non-linear relationships.
As a final note, you should be made aware that these are only guidelines. Many
researchers will, for example, transform their data into a new, relatively meaningless scale,
in order to run a Pearson's correlation or another parametric test rather than choose a non-
parametric test, such as a Spearman's rank-order correlation.
Pearson's correlation procedure
The following procedure shows you how to run a Pearson's correlation in SPSS.
Determining the correlation coefficient
The main reason for running a Pearson correlation analysis is to obtain the value of the
Pearson correlation coefficient. This value can be obtained from the Correlations table,
which is the only table produced by SPSS and is shown below:
You will notice from the table above that the results are in a matrix
such that each result is repeated. However, you are only concerned
with the table cells that contains the Pearson correlation between
the variables tv_time and cholesterol, such as the table cell shown
below:
Determining statistical significance
The results you have reported so far have only used the Pearson correlation coefficient to
describe the relationship between the two variables in your sample. If you wish to test
hypotheses about the linear relationship between your variables in the population your
sample is from, you need to test the level of statistical significance. The level of statistical
significance (p-value) of the correlation coefficient in this example would appear to be .000
(obtained from the "Sig. (2-tailed)" row). However, if you ever see SPSS print out a p-value
of .000, do not interpret this as a significance level that is actually zero; it does, in fact,
indicate that p < .0005. As p < .05 in this example (it is p < .0005), it can be concluded that
the correlation coefficient is statistically significantly different from zero. Remember that
statistical significance does not determine the strength of the relationship (r or ρ does that),
but whether the correlation coefficient is statistically significantly different from zero. You
can write this result as follows:
The implication in the above statement is that the increase in time spent watching TV
caused the increase in cholesterol concentration and/or that you manipulated how much
time participants watched TV. This is incorrect. It is only associated with cholesterol
concentration. Further knowledge would be required to make such a causal statement.
Putting it all together
You can write the full report as: