
PEARSON PRODUCT MOMENT CORRELATION COEFFICIENT (r)


PSY675 ADVANCED PSYCHOLOGICAL STATISTICS
DEAN ELMER DE JOSE

BY:

CAMILLE MONIQUE YUSON


HISTORY OF KARL PEARSON
KARL PEARSON (1857-1936)

❏ Founder of the modern discipline of STATISTICS
❏ A famous PHILOSOPHER of SCIENCE
❏ Writer on social DARWINISM
❏ Leading mover to install EUGENICS as the key social science
❏ University College School to King’s College, Cambridge
❏ 1879 - third wrangler in the mathematics tripos
❏ University in Heidelberg and Berlin - postgraduate
❏ 1884 - appointed as chair of applied mathematics and mechanics: University College London
❏ 1890 - W.F.R Weldon – Francis Galton
❏ 1893 - produced memoirs – ‘mathematical theory of evolution’ – Philosophical Transactions of the Royal Society
❏ 1901 – Biometrika
❏ 1911 - 1st Galton Professor, director of a ‘Biometric Laboratory’, director of the ‘Galton Laboratory for National Eugenics’
❏ Department of Applied Statistics – 1st
WHAT IS THE PEARSON PRODUCT MOMENT
CORRELATION COEFFICIENT ( r ) ?
➢ Also known as Pearson’s R, PPMCC or PCC
➢ Is a measure of the linear relationship between two variables
➢ It can only be used to measure the relationship between two variables which are both normally distributed
➢ Commonly used in linear regression
➢ A statistical treatment used to measure the strength and direction of the linear relationship between 2 variables
➢ It is usually denoted by r and it can only take values between -1 and +1


Assumptions of Pearson’s R

Assumption #1: Your two variables should be measured on a continuous scale, that is, at the interval or ratio level.

Assumption #2: Your two continuous variables should be paired, which means that each case has two values: one for each variable. These "values" are also referred to as "data points".

Assumption #3: There should be independence of cases, which means that the two observations for one case should be independent of the two observations for any other case.

Note: If observations are not independent, they are related, and Pearson’s correlation is not an appropriate statistical test.

Assumption #4: There should be a linear relationship between your two continuous variables.

Note: It is not appropriate to analyse a non-linear relationship using a Pearson product-moment correlation.
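
A small sketch of why linearity matters, assuming made-up values that follow y = x² exactly. The relationship is perfectly predictable, yet r comes out as 0 because it is not linear. The helper name `pearson_r` and the data are illustrative assumptions, not part of the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r computed from deviations around the two means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical data: y depends on x exactly, but through a curve (y = x squared).
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]

r_curve = pearson_r(xs, ys)
print(r_curve)  # the positive and negative products cancel out
```

A scatterplot of the two variables is the usual way to check this assumption before computing r.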
Assumption #5: Theoretically, both continuous variables should follow a bivariate
normal distribution, although in practice it is frequently accepted that simply having
univariate normality in both variables is sufficient.

Note: When one or both variables is not normally distributed, there is disagreement
about whether Pearson’s correlation will still provide a valid result

Assumption #6: There should be homoscedasticity, which means that the variances along the line of best fit remain similar as you move along the line. If the variances are not similar, there is heteroscedasticity.

Assumption #7: There should be no univariate or multivariate outliers.

An outlier is an observation within your sample that does not follow a similar
pattern to the rest of your data. You need to consider outliers that are unusual
only on one variable, known as "univariate outliers", as well as those that are an
unusual "combination" of both variables, known as "multivariate outliers".
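
To illustrate how strongly a single outlier can affect r, the sketch below uses made-up values (not the sample data): five points on a perfect line, then the same points plus one case that breaks the pattern. The helper `pearson_r` and all numbers are assumptions for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r computed from deviations around the two means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical clean data: y equals x, so r is exactly 1.
xs = [1, 2, 3, 4, 5]
ys = [1, 2, 3, 4, 5]
r_clean = pearson_r(xs, ys)

# Add one hypothetical outlier: a high x paired with a very low y.
r_with_outlier = pearson_r(xs + [6], ys + [0])

print(round(r_clean, 3), round(r_with_outlier, 3))
```

With only six cases, one unusual pair is enough to pull r from a perfect correlation down to a weak one.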
SAMPLE PROBLEM

Does age influence the blood glucose levels of an individual?

Null Hypothesis (Ho): There is NO relationship between age and blood glucose levels.
Alternative Hypothesis (Ha): There is a relationship between age and blood glucose levels.
DATA
AGE   BLOOD GLUCOSE LEVEL
60    130
70    120
80    90
70    80
90    110
STEP 1: Compute the TOTAL of X and Y
X     Y
60    130
70    120
80    90
70    80
90    110
∑X = 370   ∑Y = 530
STEP 2: Compute the Deviation from the Mean
Mean of X: Sum (370) divided by number of participants (5) = 74
Deviation: Value of X (60) less the Mean (74) = -14
Mean of Y: Sum (530) divided by number of participants (5) = 106
Deviation: Value of Y (130) less the Mean (106) = 24

X     Y     (X-x)   (Y-y)
60    130   -14     24
70    120   -4      14
80    90    6       -16
70    80    -4      -26
90    110   16      4
∑X = 370   ∑Y = 530
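
Steps 1 and 2 can be checked with a short sketch. It uses the age and blood glucose values from the DATA slide; the variable names are assumptions chosen for illustration.

```python
# Data from the slides: age (X) and blood glucose level (Y).
ages = [60, 70, 80, 70, 90]
glucose = [130, 120, 90, 80, 110]

n = len(ages)

# Step 1: totals of X and Y.
sum_x = sum(ages)
sum_y = sum(glucose)

# Step 2: means, then each value's deviation from its mean.
mean_x = sum_x / n
mean_y = sum_y / n
dev_x = [x - mean_x for x in ages]
dev_y = [y - mean_y for y in glucose]

print(sum_x, sum_y)    # totals
print(mean_x, mean_y)  # means
print(dev_x)           # (X - mean of X) for each participant
print(dev_y)           # (Y - mean of Y) for each participant
```

A quick sanity check is that each list of deviations sums to zero.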
STEP 3: Multiply the Deviations
X     Y     (X-x)   (Y-y)   (X-x)(Y-y)
60    130   -14     24      -336
70    120   -4      14      -56
80    90    6       -16     -96
70    80    -4      -26     104
90    110   16      4       64
∑X = 370   ∑Y = 530   SP = -320


STEP 4: Square the Deviations
X     Y     (X-x)   (Y-y)   (X-x)(Y-y)   (X-x)²   (Y-y)²
60    130   -14     24      -336         196      576
70    120   -4      14      -56          16       196
80    90    6       -16     -96          36       256
70    80    -4      -26     104          16       676
90    110   16      4       64           256      16
∑X = 370   ∑Y = 530   SP = -320   SSx = 520   SSy = 1,720
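
The sums in Steps 3 and 4 can be recomputed from the raw data with the sketch below. It assumes the same age and glucose values as the DATA slide; the variable names are illustrative.

```python
# Data from the slides: age (X) and blood glucose level (Y).
ages = [60, 70, 80, 70, 90]
glucose = [130, 120, 90, 80, 110]

n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(glucose) / n

# Step 3: product of the two deviations for each participant, then their sum (SP).
products = [(x - mean_x) * (y - mean_y) for x, y in zip(ages, glucose)]
sp = sum(products)

# Step 4: squared deviations for each variable, then their sums (SSx and SSy).
ss_x = sum((x - mean_x) ** 2 for x in ages)
ss_y = sum((y - mean_y) ** 2 for y in glucose)

print(products)
print(sp, ss_x, ss_y)
```

Note that SP can be negative, while SSx and SSy are sums of squares and so can never be negative.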


STEP 5: Compute r
r = SP / √(SSx × SSy)
r = -320 / √(520 × 1,720)
r = -320 / 945.73
r = -0.338
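
A minimal sketch of Step 5, assuming the usual definition of r as SP divided by the square root of SSx times SSy. The sums are recomputed from the raw data so the block stands on its own; variable names are illustrative.

```python
import math

# Data from the slides: age (X) and blood glucose level (Y).
ages = [60, 70, 80, 70, 90]
glucose = [130, 120, 90, 80, 110]

n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(glucose) / n

sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, glucose))
ss_x = sum((x - mean_x) ** 2 for x in ages)
ss_y = sum((y - mean_y) ** 2 for y in glucose)

# Step 5: r = SP / sqrt(SSx * SSy)
r = sp / math.sqrt(ss_x * ss_y)
print(round(r, 5))
```

The negative sign indicates that, in this small sample, higher ages tend to go with lower glucose levels.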
STEP 6: Find the Critical Value
α = .05 (two-tailed)
df = N - 2
df = 5 - 2
df = 3
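
The table value of r can also be derived from a t table, since the test of r is equivalent to a t test with N - 2 degrees of freedom. The sketch below assumes the two-tailed critical t for α = .05 and df = 3 (about 3.182, a standard table value entered here by hand rather than computed).

```python
import math

n = 5        # number of participants
df = n - 2   # degrees of freedom for Pearson's r

# Assumption: two-tailed critical t for alpha = .05 and df = 3,
# copied from a standard t table.
t_crit = 3.182

# Convert the critical t into a critical r: r = t / sqrt(t^2 + df)
r_crit = t_crit / math.sqrt(t_crit ** 2 + df)
print(df, round(r_crit, 4))
```

With so few participants the critical value is very high, so only a very strong correlation would reach significance.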
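STEP 7: Compare the computed r with the critical value (from the table)
r = -0.338
Table r = .8783

If the table r is bigger than the absolute value of r, then we fail to reject the Ho.
But if the absolute value of r is bigger than the table r, then reject the Ho.
Here the absolute value of r (0.338) is smaller than the table r (.8783), so we fail to reject the Ho: the data do not show a significant relationship between age and blood glucose levels.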
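
The decision rule can be sketched as a small function. It compares the absolute value of r with the table value, because the sign of r only gives the direction of the relationship. The function name `decide` is a hypothetical helper; r is recomputed from the slide data and the critical value is the table r for df = 3.

```python
import math

def decide(r, r_critical):
    """Two-tailed rule: reject Ho only when |r| exceeds the table value."""
    return "reject Ho" if abs(r) > r_critical else "fail to reject Ho"

# Data from the slides: age (X) and blood glucose level (Y).
ages = [60, 70, 80, 70, 90]
glucose = [130, 120, 90, 80, 110]

n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(glucose) / n
sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, glucose))
ss_x = sum((x - mean_x) ** 2 for x in ages)
ss_y = sum((y - mean_y) ** 2 for y in glucose)
r = sp / math.sqrt(ss_x * ss_y)

decision = decide(r, 0.8783)  # table r for df = 3, alpha = .05
print(decision)
```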
How to interpret the data using EXCEL
Step 1: Type your data into two columns in Excel. For example, type your “x” data into column A and your “y” data into column B.
Step 2: Select any empty cell.
Step 3: Click the function button on the ribbon.
Step 4: Type “correlation” into the ‘Search for a function’ box.
Step 5: Click “Go.” CORREL will be highlighted.
Step 6: Click “OK.”
Step 7: Type the location of your data into the “Array 1” and “Array 2” boxes.
Step 8: Click “OK.” The result will appear in the cell you selected in Step 2.

For this particular data set, the correlation coefficient (r) is -0.33836.
[Screenshots: Excel Steps 1 to 8, ending with the correlation coefficient (r) shown in the selected cell]
How to interpret the data using JAMOVI
Step 1: Type your data into two columns.
Step 2: Input the Data Variable Title.
Step 3: Choose the Measure Type: Nominal, Ordinal, Continuous, ID.
Step 4: To run the analysis, click the Regression icon.
Step 5: Select Correlation Matrix.
Step 6: Click on both variables and drag them from left to right.
Step 7: Let the computer run the analysis.
Step 8: You can select the type of correlation coefficient and additional options.

For this particular data set, the correlation coefficient (r) is -0.338.
Comparing the 3 methods:
Via MANUAL computation: r = -0.338
Via EXCEL: r = -0.33836
Via JAMOVI: r = -0.338
