Correlation - refers to the degree to which two or more events vary together; that is, the relationship or association between two or more variables.
Correlation coefficient - also called the product-moment correlation; a simple descriptive statistic that measures the
strength of the linear relationship or association between two variables, as might be visualized in a scatter plot.
Scatter plot / Scattergram – a graphic technique used to represent the relationship between two variables.
Positive correlations
When two or more events change in the same direction, they are
said to be positively correlated. Ex: between age and height
The data have an upward trend: as the independent variable (x-axis)
increases, the dependent variable (y-axis) also increases in a linear
manner.
An r value of +1 indicates a perfect positive linear association,
that is, a perfect positive correlation. Ex: height in cm
and height in inches of the same subjects.
Perfect correlation - a change in one variable is accompanied by a proportional change in the
other variable, so all the points on the scatter plot fall on a straight line.
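The height example can be checked numerically. The following is a minimal sketch, assuming made-up heights and a helper named `pearson_r` written here for illustration (neither comes from the notes); it computes r directly from deviations about the means.

```python
import math

def pearson_r(xs, ys):
    """Sample product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical heights in cm, and the same heights converted to inches.
heights_cm = [150.0, 158.0, 163.0, 171.0, 180.0]
heights_in = [cm / 2.54 for cm in heights_cm]

r = pearson_r(heights_cm, heights_in)
print(round(r, 6))  # 1.0, up to floating-point rounding
```

Because inches are just centimetres divided by a constant, every point lies on one straight line, which is why r comes out as 1.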
Negative correlation
When two or more events change in opposite directions, they are
said to be negatively correlated. Ex: between interest rates and
lending activity
The data have a downward trend: as the independent variable (x-axis) increases, the dependent
variable (y-axis) decreases in a linear manner.
An r value of -1 indicates a perfect negative linear association, that is, a perfect
negative correlation. Ex: the volume of a gas decreases as pressure increases (a strong negative relationship,
although it is inverse rather than strictly linear, so r is close to but not exactly -1).
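A short sketch can separate a perfectly linear negative relationship from the gas example. It assumes hypothetical data: a straight line y = 100 - 2x, and gas volumes generated as V = 10 / P (volume inversely proportional to pressure at constant temperature). The helper `pearson_r` and all numbers are illustrative only.

```python
import math

def pearson_r(xs, ys):
    """Sample product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1.0, 2.0, 3.0, 4.0, 5.0]

# Perfectly linear, downward-sloping data: every point is on one line.
ys_linear = [100.0 - 2.0 * x for x in xs]
r_linear = pearson_r(xs, ys_linear)

# Hypothetical gas data: pressure in xs, volume = constant / pressure.
volumes = [10.0 / p for p in xs]
r_gas = pearson_r(xs, volumes)

print(round(r_linear, 6))  # -1.0
print(round(r_gas, 2))     # strongly negative, but not -1
```

The gas data trend downward along a curve, so r is strongly negative without reaching -1; only the straight-line data give exactly -1.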
Nonexistent correlation
Data has no trend. There is no correlation between
the variables. The size of one variable is unrelated
to the size of the other variable.
Ex: IQ and shoe size.
**This shows the importance of plotting the data rather than relying only on summary statistics such as the correlation coefficient. A
coefficient near zero does not rule out a relationship: the variables may be related in a nonlinear way that r does not capture.
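To see how r can miss a nonlinear relationship, consider a minimal sketch with invented data in which y is completely determined by x (y = x squared) yet the correlation coefficient is zero. The helper `pearson_r` is an illustrative name, not something from the notes.

```python
import math

def pearson_r(xs, ys):
    """Sample product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical U-shaped data: y depends entirely on x, but not linearly.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(r)  # 0.0: the downward and upward halves of the curve cancel out
```

A scatter plot of these points shows the curve immediately, while r alone would suggest no relationship at all.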
Interpreting correlation from a scatter plot alone can be subjective. A more precise way to measure the direction and strength of
a linear correlation between two variables is to calculate the correlation coefficient. The symbol r represents the sample correlation
coefficient. The formula for r is:
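One standard way to write the sample product-moment correlation is given here in two equivalent forms. The notation is an assumption made for illustration: n paired observations (x_i, y_i) with sample means x-bar and y-bar. The second form is the computational version that works from raw sums.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}
  = \frac{n\sum x_i y_i - \sum x_i \sum y_i}
         {\sqrt{\left[n\sum x_i^2 - \left(\sum x_i\right)^2\right]
                \left[n\sum y_i^2 - \left(\sum y_i\right)^2\right]}}
```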
Here is a rough but useful guide to the degree of relationship indicated by the size of the coefficient.
**The strength of the relationship is indicated by the absolute value of the correlation coefficient; the sign indicates only the direction.
Interpretation and Significance:
Whether our obtained (sample) r reflects a statistically significant population correlation depends on:
a. The size of the correlation coefficient obtained
- We reject the null hypothesis (that the population correlation is zero) if the absolute value of the obtained
coefficient equals or exceeds the tabled critical value at the level of risk chosen.
b. The size of the sample
- As a rule of thumb, a good-sized sample has a minimum of 30 participants. The tabled critical value gets smaller as the sample gets larger.
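The decision rule in (a) and the effect of sample size in (b) can be sketched as follows. This assumes the usual t test for a correlation with n - 2 degrees of freedom. The function names, the obtained r of 0.70, and the sample sizes are hypothetical; the critical t values are the kind read from a printed t table (two-tailed, level of risk 0.05).

```python
import math

def t_statistic(r, n):
    """t statistic for testing a zero population correlation (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

def critical_r(t_crit, n):
    """Convert a tabled critical t (df = n - 2) into the equivalent critical r."""
    df = n - 2
    return t_crit / math.sqrt(t_crit ** 2 + df)

def is_significant(r, n, t_crit):
    """Reject the null hypothesis when |r| equals or exceeds the critical r."""
    return abs(r) >= critical_r(t_crit, n)

# Hypothetical study: r = 0.70 from n = 10 pairs.
# 2.306 is the tabled two-tailed critical t for df = 8 at the 0.05 level.
print(round(critical_r(2.306, 10), 3))   # 0.632
print(is_significant(0.70, 10, 2.306))   # True

# With n = 30 (df = 28, tabled critical t = 2.048) the bar is lower.
print(round(critical_r(2.048, 30), 3))   # 0.361
```

The same coefficient is therefore easier to declare significant in a larger sample, which is one reason sample size appears alongside the size of r in the list above.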
Things to Remember:
"Correlation does not imply causation." Two events may vary together, but one does not necessarily
cause the other. For example, even if height and musical expression are positively correlated, no
one is likely to assert that growing taller would make a person more proficient at playing musical
instruments.
The correlation coefficient can also mislead (a) when there is a non-linear relationship and (b) when distinct subgroups are present. In both of these cases
the correlation coefficient quoted is spurious.
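Case (b) can be illustrated with a small sketch using invented data: two subgroups that each show no correlation internally, but whose pooled data give a very high r simply because the groups sit in different parts of the plot. The group values and the helper `pearson_r` are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Sample product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Two hypothetical subgroups, each a small square of points (no trend inside).
group_a = [(1, 1), (1, 2), (2, 1), (2, 2)]
group_b = [(8, 8), (8, 9), (9, 8), (9, 9)]

def r_of(points):
    """Correlation for a list of (x, y) pairs."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return pearson_r(xs, ys)

r_a = r_of(group_a)                 # no correlation within group A
r_b = r_of(group_b)                 # no correlation within group B
r_pooled = r_of(group_a + group_b)  # high correlation once groups are pooled

print(r_a, r_b, round(r_pooled, 2))
```

The pooled coefficient reflects only the gap between the two groups, not any relationship between the variables within either group, which is why plotting the data and checking for subgroups matters.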