Pearson’s r correlation is a bivariate, correlational and parametric statistical test that measures
the strength of the linear relationship between two variables. Pearson’s r correlation assumes
that the variables are measured at least at the interval scale and that these two variables are
proportional, meaning they are linearly related (Hill and Lewicki, 2006). The assumptions of
Pearson’s r correlation are the following: (1) the units of measurement are the same for all
variables, (2) there is a linear relationship between the two variables, (3) the variables are
either normal or lognormal in distribution (Rollinson, 2014). Pearson’s r correlation requires
a multivariate normal distribution (Reimann et al., 2002).
The correlation coefficient can be misleading, as it is influenced by data outliers and skewed
data distributions (Filzmoser and Hron, 2009). The Pearson correlation coefficient, r, is also
known as Pearson’s product-moment correlation coefficient. The coefficient of
determination, r², is calculated by squaring the Pearson’s correlation coefficient.
The coefficient of determination measures the proportion of variance shared between two
variables. When converted into a percentage, it gives the percentage of shared variance
between the two variables.
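The calculation of r and r² described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the procedure of any particular statistics package; the function name `pearson_r` and the sample values are assumptions invented for this example.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired measurements, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

r = pearson_r(x, y)
r_squared = r ** 2  # coefficient of determination
print(f"r = {r:.3f}, r^2 = {r_squared:.3f} ({100 * r_squared:.1f}% shared variance)")
# prints: r = 0.775, r^2 = 0.600 (60.0% shared variance)
```

For these invented values, about 60% of the variance is shared between the two variables.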
A correlation coefficient ranges from -1 to +1, a perfect negative and perfect positive
correlation, respectively. A correlation coefficient of zero corresponds to no linear
correlation. A correlation coefficient with an absolute value between 0.5 and 0.7 is
considered good, while one with an absolute value between 0.7 and 1.0 is considered strong.
A positive association is one in which both variables increase together. A negative
association occurs when one variable increases as the other variable decreases.
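As a rough sketch, the bands quoted above can be turned into a small labelling helper. The function name `describe_correlation` is hypothetical. Two choices here are assumptions rather than statements from the text: a value of exactly 0.7 is assigned to the strong band, and magnitudes below 0.5, which the text does not name, are labelled "weak".

```python
def describe_correlation(r):
    """Label a correlation coefficient using the bands quoted in the text."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    if r == 0:
        return "no correlation"
    direction = "positive" if r > 0 else "negative"
    size = abs(r)
    if size >= 0.7:
        # Assumption: the shared boundary value 0.7 counts as strong.
        strength = "strong"
    elif size >= 0.5:
        strength = "good"
    else:
        # Assumption: the text gives no name for magnitudes below 0.5.
        strength = "weak"
    return f"{strength} {direction} correlation"

print(describe_correlation(0.85))   # strong positive correlation
print(describe_correlation(-0.60))  # good negative correlation
```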
The interpretation of the Pearson’s r and Spearman’s rho correlation coefficients is the
same. The p-value is the probability of obtaining a sample correlation at least as extreme as
the one observed if the null hypothesis were true; a small p-value indicates a statistically
significant relationship between the two variables, x and y. In a two-tailed test, the null
hypothesis states that there is no correlation between x and y, that is, the population
correlation coefficient is equal to zero. The alternative hypothesis states that the population
correlation coefficient is not equal to zero, that is, there is a correlation between x and y.
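One way to make the two-tailed test concrete is a permutation test, sketched below with the standard library only. This is a swapped-in technique chosen for illustration, not the t-distribution-based test that statistical packages usually report for Pearson’s r. The function names, the sample data, the number of permutations and the 0.05 significance level are all assumptions for this example.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(x, y, n_permutations=5000, seed=0):
    """Two-tailed p-value for H0: no correlation, estimated by shuffling y."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffling y breaks any pairing with x, which mimics the null hypothesis.
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)

# Hypothetical, nearly linear data invented for illustration only.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]

p = permutation_p_value(x, y)
alpha = 0.05  # assumed significance level
print(f"r = {pearson_r(x, y):.3f}, p = {p:.4f}")
print("reject H0" if p < alpha else "fail to reject H0")
```

Taking absolute values makes the test two-tailed: strongly negative and strongly positive shuffled correlations both count as evidence against the observed value being unusual.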
A strong correlation coefficient can be misleading when the relationship between the two
variables is non-linear (Hill and Lewicki, 2006). In geochemical data, the constant sum
constraint often forces correlations between variables and introduces a negative bias (Rollinson,
1993). Closed data affects the results of correlation analysis (Rock, 1988). Rock (1988)
suggests that when using closed data, a negative correlation is less significant, and a positive
correlation is more significant compared to open data. Data transformation does not correct
for data closure (Reimann et al., 2002). A large difference between the Pearson’s r and
Spearman’s rho correlation coefficients suggests that the Pearson’s correlation coefficient is
influenced by data abnormalities (Cooksey, 2014).
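The comparison between the two coefficients can be illustrated with a small sketch. Spearman’s rho is computed here as the Pearson correlation of the ranks, with tied values given their average rank. The helper names and the data, which contain one deliberately extreme point, are assumptions invented for this example.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the current run of tied values.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the ranked data."""
    return pearson_r(ranks(x), ranks(y))

# Hypothetical data: a decreasing trend in nine points plus one extreme point.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
y = [9, 8, 7, 6, 5, 4, 3, 2, 1, 30]

r = pearson_r(x, y)
rho = spearman_rho(x, y)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
# prints: Pearson r = 0.807, Spearman rho = -0.455
```

In this invented data set the single extreme point pulls Pearson’s r to a strong positive value, while Spearman’s rho, which uses only ranks, stays negative. A gap of this kind is the signal that the Pearson coefficient is being driven by an abnormality in the data.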