
Measures of Correlation

Correlation is a measure of association between variables. It can be used to predict one variable from another.
The correlation coefficient is the statistic used to report whether two variables are correlated, to what extent, and in which direction.
The correlation coefficient ranges from -1 to +1, where zero means no correlation and +1 or -1 means perfect correlation.
[Figure: Illustration of the degree of correlation between two variables. Panels: perfect positive correlation, perfect negative correlation, some degree of positive correlation, some degree of negative correlation, no correlation.]
The correlation can be positive, indicating a positive relationship between the variables: an increase in one is accompanied by an increase in the other.
A negative correlation indicates a negative (inverse) relationship between the variables: an increase in one is accompanied by a decrease in the other.
A scatter plot is a graphical way of showing the correlation between two variables. The horizontal axis represents one variable and the vertical axis represents the other.
Measures of Correlation
1. Pearson Product-Moment Coefficient (r)
- The most common correlation measure.
- Used for continuous data and when the scores are normally distributed.

$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$
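As an illustration only, the computational formula for r can be written as a short Python function. The name `pearson_r` and the choice to raise an error when a variable has no variation are assumptions made for this sketch, not part of the original slides.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r from raw scores, using the sums-based formula."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        # Assumption of this sketch: r is undefined when x or y is constant.
        raise ValueError("r is undefined when a variable has no variation")
    return numerator / denominator


# Usage: a perfectly linear increasing relationship gives r = 1.
print(pearson_r([1, 2, 3], [2, 4, 6]))
```

Working from the sums mirrors the hand computation, so each intermediate value can be compared with a worksheet column total.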
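2. Spearman Rank-order Coefficient (ρ)
- Another correlation measure, used when the scores are not normally distributed or when the data are ordinal or ranked.

$$\rho = \frac{n\sum R_x R_y - \sum R_x \sum R_y}{\sqrt{\left[n\sum R_x^2 - \left(\sum R_x\right)^2\right]\left[n\sum R_y^2 - \left(\sum R_y\right)^2\right]}}$$

where R_x and R_y are the ranks of the x and y scores.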
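The rank-based formula amounts to applying the Pearson formula to the ranks of the scores. The sketch below shows one way this might be done in Python. The helper names `average_ranks`, `pearson_r`, and `spearman_rho` are hypothetical, and giving tied scores the average of their ranks is an assumption of this sketch, since the slides do not say how ties are handled.

```python
from math import sqrt


def average_ranks(values):
    """Rank from smallest (1) to largest; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with the score at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Pearson r from raw scores, using the sums-based formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


def spearman_rho(xs, ys):
    """Spearman rho: the Pearson formula applied to the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Usage: y always increases with x, but not linearly; the ranks agree exactly.
print(spearman_rho([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]))
```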
Interpretation of Pearson r-values
The following degrees of relationship shall be used:

Pearson r-value      Interpretation
±0.80 to ±0.99       High Correlation
±0.60 to ±0.79       Moderately High Correlation
±0.40 to ±0.59       Moderate Correlation
±0.20 to ±0.39       Low Correlation
±0.10 to ±0.19       Negligible Correlation
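For illustration, the interpretation table can be turned into a small lookup. This is a sketch under stated assumptions: the function name `interpret_r` is hypothetical, r is rounded to two decimals before the lookup, the ranges are applied to the magnitude of r, and any value outside the listed ranges is reported as not covered rather than guessed.

```python
def interpret_r(r):
    """Return the table's label for r, using the magnitude rounded to 2 places."""
    magnitude = round(abs(r), 2)
    bands = [
        (0.80, 0.99, "High Correlation"),
        (0.60, 0.79, "Moderately High Correlation"),
        (0.40, 0.59, "Moderate Correlation"),
        (0.20, 0.39, "Low Correlation"),
        (0.10, 0.19, "Negligible Correlation"),
    ]
    for low, high, label in bands:
        if low <= magnitude <= high:
            return label
    # Assumption of this sketch: values outside the listed ranges are not labeled.
    return "Not covered by the table"


print(interpret_r(0.906986153))
print(interpret_r(-0.45))
```

The sign of r is reported separately as the direction (positive or negative) of the relationship.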


Ex: Suppose you are given two data sets X and Y.

  x       y      x^2        y^2         xy
  9     28.4       81     806.56      255.6
 15     29.3      225     858.49      439.5
 24     37.6      576    1413.76      902.4
 30     36.2      900    1310.44     1086
 38     36.5     1444    1332.25     1387
 46     35.3     2116    1246.09     1623.8
 53     36.2     2809    1310.44     1918.6
 60     44.1     3600    1944.81     2646
 64     44.8     4096    2007.04     2867.2
 76     47.2     5776    2227.84     3587.2
415    375.6    21623   14457.72    16713.3   (totals)
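The totals in the last row of the table, and the value of r they lead to, can be cross-checked with a short script. This is only a verification sketch with hypothetical variable names. The first y value is entered as 28.4, the value consistent with the table's y^2 and xy entries (806.56 and 255.6) and with the total of 375.6.

```python
from math import sqrt

x = [9, 15, 24, 30, 38, 46, 53, 60, 64, 76]
# First value taken as 28.4 (assumption: matches the y^2 and xy columns).
y = [28.4, 29.3, 37.6, 36.2, 36.5, 35.3, 36.2, 44.1, 44.8, 47.2]

n = len(x)
sum_x = sum(x)
sum_y = sum(y)
sum_x2 = sum(v * v for v in x)
sum_y2 = sum(v * v for v in y)
sum_xy = sum(a * b for a, b in zip(x, y))

# Same computational formula as used for the hand solution.
numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print("totals:", sum_x, round(sum_y, 2), sum_x2, round(sum_y2, 2), round(sum_xy, 2))
print("r =", round(r, 4))
```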
Solution:
ΣXY = 16713.3, ΣX = 415, ΣY = 375.6, ΣX² = 21623, ΣY² = 14457.72, n = 10

$$r = \frac{10(16713.3) - (415)(375.6)}{\sqrt{\left[10(21623) - (415)^2\right]\left[10(14457.72) - (375.6)^2\right]}}$$

$$r = \frac{11259}{\sqrt{(44005)(3501.84)}} = \frac{11259}{\sqrt{154098469.2}}$$

r = 0.906986153 ≈ 0.91, a high positive correlation
Problem 1
1. Age and Exercise. A researcher wishes to determine if a person's age is related to the number of hours he or she exercises per week. The data for the sample are shown here.
Age (X):    18   26   32   38   52    59
Hours (Y):   1    5    2    3   1.5    1
Testing the Significance of r
The test for the significance of r is needed in order to know whether the computed r is significant or not.
Solution:
1. Solve for r
2. Hypothesis:
Ho:
Ha:
3. Level of Significance = 5% ( .05)
df = 6 – 2 = 4
Tabular Value = 2.776
4. Test Statistic to be used is t for r
5. Compute t:

$$t = \frac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}}$$
6. Decision:
If |t computed| > TV, reject Ho and accept Ha.
7. Interpretation:
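The steps above can be sketched in Python as a way to check a hand computation for Problem 1. The function names `pearson_r` and `t_for_r` are hypothetical. The tabular value 2.776 (df = 4, 5% level) is taken from the outline above, and the decision rule compares |t| with it.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r from raw scores, using the sums-based formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


def t_for_r(r, n):
    """t statistic for r with n - 2 degrees of freedom: r / sqrt((1 - r^2) / (n - 2))."""
    return r / sqrt((1 - r ** 2) / (n - 2))


age = [18, 26, 32, 38, 52, 59]
hours = [1, 5, 2, 3, 1.5, 1]

r = pearson_r(age, hours)
t = t_for_r(r, len(age))
tabular_value = 2.776  # from the outline: df = 6 - 2 = 4, 5% level

# Decision rule from step 6: reject Ho only if |t| exceeds the tabular value.
reject_h0 = abs(t) > tabular_value
print("r =", round(r, 3), " t =", round(t, 3), " reject Ho:", reject_h0)
```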
Problem 2
2. Emergency Calls and Temperature:
An emergency service wishes to see whether a
relationship exists between the outside temperature
and the number of emergency calls it receives for a
7-hour period. The data are shown:
Temperature (x):    68   74   82   88   93   99   101
No. of calls (y):    7    4    8   10   11    9    13
Regression Analysis
- Designed to help us determine the probability that our inferences are sound.
- It helps us test the degree to which the dependent variable is affected by the independent variable.
- If there is no significant linear correlation, do not use the regression equation to make predictions.
Regression Equation
It is the equation used to predict the value of y from x.
Formula: y = a + bx
Written this way, the equation describes an exact (error-free) relationship, which the data follow only when the correlation is perfect.

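$$a = \frac{\left(\sum y\right)\left(\sum x^2\right) - \left(\sum x\right)\left(\sum xy\right)}{N\left(\sum x^2\right) - \left(\sum x\right)^2}$$

$$b = \frac{N\left(\sum xy\right) - \left(\sum x\right)\left(\sum y\right)}{N\left(\sum x^2\right) - \left(\sum x\right)^2}$$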
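As a minimal sketch, the two formulas for a and b can be coded directly from the sums. The function names `regression_coefficients` and `predict` are hypothetical, and the small data set in the usage lines is made up so that the points lie exactly on y = 1 + 2x.

```python
def regression_coefficients(xs, ys):
    """Return (a, b) for y = a + bx using the sums-based formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        # Assumption of this sketch: no line can be fitted if all x are equal.
        raise ValueError("all x values are identical")
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
    b = (n * sum_xy - sum_x * sum_y) / denominator
    return a, b


def predict(a, b, x):
    """Predicted y for a given x from y = a + bx."""
    return a + b * x


# Usage with illustrative data lying exactly on y = 1 + 2x.
a, b = regression_coefficients([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b, predict(a, b, 10))
```

Both coefficients share the same denominator, so it is computed once and reused.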
Problem 1
1. Age and Exercise. A researcher wishes to determine if a person's age is related to the number of hours he or she exercises per week. The data for the sample are shown here.
Age (X):    18   26   32   38   52    59
Hours (Y):   1    5    2    3   1.5    1
Problem 2
2. Emergency Calls and Temperature:
An emergency service wishes to see whether a
relationship exists between the outside temperature
and the number of emergency calls it receives for a
7-hour period. The data are shown:
Temperature (x):    68   74   82   88   93   99   101
No. of calls (y):    7    4    8   10   11    9    13