Chap 12 Correlation Analysis 249
Chapter 12
CORRELATION ANALYSIS
(IN SIX WAYS)
12.1 Correlation Analysis
This section aims to:
discuss the principles embodied in correlation analysis;
analyze and interpret data involving correlation analysis; and
apply correlation in different applications.
When the degree of relationship is measured, correlation is basically the test of
measurement. For sure, you have noticed some expected relationship between variables.
Correlation means that two variables tend to vary together; the presence of one indicates
the presence of the other; one can be predicted from the presence of the other.
The relationship of two variables does not necessarily mean that one is the cause
or the effect of the other variable. It does not imply a cause-effect relationship.
When the computed value of the correlation coefficient is high, it does not
necessarily mean that one factor is strongly dependent on the other. On the other hand,
when the computed value of the correlation coefficient is small it does not necessarily
mean that one factor has no dependence on the other factor.
If there is a reason to believe that the two variables are related and the computed
value of the correlation coefficient is high, the two variables can be regarded as truly
associated. On the other hand, if the computed correlation between the variables is low,
other factors might be responsible for such small association.
The correlation coefficient simply informs us that when two variables change,
there may be a strong or weak relationship taking place.
Correlations in Diagram Form
[Figures 12.1.1 to 12.1.5: diagrams showing perfect positive correlation, some positive correlation, no correlation, perfect negative correlation, and some negative correlation.]
Relationship Between Two Variables
From the diagrams above, we can therefore summarize that there are three degrees
of correlation between two variables, namely:
1. Perfect correlation (positive or negative)
2. Some degree of correlation (positive or negative)
3. No correlation
Correlation Interpretation Guide
   +1.00             perfect positive correlation
   +0.75 to +1.00    very high positive correlation
   +0.50 to +0.75    high positive correlation
   +0.25 to +0.50    moderately small positive correlation
    0.00 to +0.25    very small positive correlation
    0.00             no correlation
    0.00 to -0.25    very small negative correlation
   -0.25 to -0.50    moderately small negative correlation
   -0.50 to -0.75    high negative correlation
   -0.75 to -1.00    very high negative correlation
   -1.00             perfect negative correlation
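The guide can also be written as a small lookup function. The following is only a sketch: the function name `interpret_correlation` is hypothetical, and because the guide does not say to which band a cut-off value such as 0.50 belongs, the code assumes that each cut-off belongs to the band above it.

```python
def interpret_correlation(r: float) -> str:
    """Return the verbal label of the interpretation guide for a coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and +1")
    if r == 0:
        return "no correlation"
    sign = "positive" if r > 0 else "negative"
    size = abs(r)
    if size == 1.0:
        return f"perfect {sign} correlation"
    # Assumption (not stated in the guide): a cut-off value such as 0.50
    # is assigned to the band above it.
    if size >= 0.75:
        strength = "very high"
    elif size >= 0.50:
        strength = "high"
    elif size >= 0.25:
        strength = "moderately small"
    else:
        strength = "very small"
    return f"{strength} {sign} correlation"


print(interpret_correlation(0.62))
print(interpret_correlation(-0.10))
```

The same labels are used in the interpretations of the worked examples in this chapter.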
Six Measures of Correlation
The process of selecting the appropriate measure of correlation to be used
involves a number of important decisions. It might be thought that once a phenomenon
had been measured, the choice of a statistical formula would be a routine one. This all
depends on what one means by the word “measure.” If we use the term to refer only to
those types of measurement ordinarily used in science such as Physics (e.g. the
measurement of length, time, or mass), there is little or no problem in the choice of a
mathematical system. But if we broaden the concept of measurement to include certain
categorization procedures ordinarily used in statistics, as will be done in this text, the
whole problem becomes more complex. We can distinguish among several levels of
measurement, and shall find different statistical correlation measures appropriate to each.
1. Correlation between interval variables: Pearson r
2. Correlation between nominal variables: Guttman's Lambda
3. Correlation between ordinal variables of 30 samples or less: Spearman Rho
4. Correlation between ordinal variables of more than 30 samples: Goodman
and Kruskal's Gamma
5. Correlation between interval and dichotomous nominal variables: Point
Biserial
6. Correlation between interval and any nominal variables: Correlation Ratio
Three Levels of Measurement
According to level of measurement, we can categorize variables into interval,
nominal and ordinal.
1. Interval Variable
This variable refers to a property defined by an operation which permits making of
statements of equality of intervals rather than just statements of sameness or difference and
greater than or less than. An interval variable does not have a “true” zero point, although
for convenience a zero point may be arbitrarily assigned. The measurements for
Fahrenheit and Centigrade temperature constitute interval variables. Consider
four objects a, b, c and d, with temperatures of 12, 24, 36 and 48 respectively. It is
appropriate to say that the difference between the temperatures of a and c is twice the
difference between the temperatures of a and b.
2. Nominal Variable
This variable refers to a property of the members of a group defined by an
operation which allows making of statements only of equality or difference. We may
state that one member is the same as or different from the others. For example, individuals
may be classified according to their skin color. Color is an example of a nominal
variable. You may say that the color of your skin is the same as or different from another's
color. In dealing with nominal variables, you may assign numerals to represent
classes, but such numerals are labels only, whose purpose is to identify the members
in a given class. For example, “how many students are of light brown color, dark
[...]” and the like. In short, this is a frequency count of students belonging to that class.
3. Ordinal Variable
This is a property defined by an operation whereby members of a particular group
are ranked. In this operation, we can state that one member is greater or less than the
others in a criterion rather than saying only that he/it is equal to or different from the others.
For instance, when individuals are arranged according to aggressiveness, cooperativeness
and some other qualities by ranking them, the resulting variable is an ordinal variable.
Correlation Between Interval Variables
The Use of Pearson r Formula
\[ r = \frac{N\sum xy - \sum x \sum y}{\sqrt{\left[N\sum x^2 - (\sum x)^2\right]\left[N\sum y^2 - (\sum y)^2\right]}} \]
where N = number of samples
x = first variable
y = other variable
Example 12.1.1. Let us measure the degree of relationship between the students'
grades in Mathematics and Science.
Individual   Grade in      Grade in
Student      Mathematics   Science
 1           85            80
 2           90            89
 3           87            84
 4           79            86
 5           75            79
 6           80            86
 7           88            90
 8           85            90
 9           86            87
10           80            86
Solution.
Individual   Grade in      Grade in
Student      Mathematics   Science
             (x)           (y)        (x^2)     (y^2)     (xy)
 1           85            80         7,225     6,400     6,800
 2           90            89         8,100     7,921     8,010
 3           87            84         7,569     7,056     7,308
 4           79            86         6,241     7,396     6,794
 5           75            79         5,625     6,241     5,925
 6           80            86         6,400     7,396     6,880
 7           88            90         7,744     8,100     7,920
 8           85            90         7,225     8,100     7,650
 9           86            87         7,396     7,569     7,482
10           80            86         6,400     7,396     6,880
Total        835           857        69,925    73,575    71,649
\[ r = \frac{N\sum xy - \sum x \sum y}{\sqrt{\left[N\sum x^2 - (\sum x)^2\right]\left[N\sum y^2 - (\sum y)^2\right]}} \]
\[ r = \frac{10(71,649) - 835(857)}{\sqrt{\left[10(69,925) - (835)^2\right]\left[10(73,575) - (857)^2\right]}} \]
\[ r = \frac{895}{\sqrt{(2,025)(1,301)}} = 0.55 \]
Interpretation
There is a high positive correlation between the students' grades in Mathematics
and Science.
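The computation in Example 12.1.1 can be checked with a short program. This is a minimal sketch of the raw-score Pearson r formula given above; the function name `pearson_r` and the variable names are chosen here for illustration only.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson r computed from raw scores, following the formula in the text."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of samples")
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Grades of the ten students in Example 12.1.1
mathematics = [85, 90, 87, 79, 75, 80, 88, 85, 86, 80]
science = [80, 89, 84, 86, 79, 86, 90, 90, 87, 86]

r = pearson_r(mathematics, science)
print(round(r, 2))  # 0.55
```

The intermediate sums inside the function correspond to the column totals of the solution table (835, 857, 69,925, 73,575 and 71,649).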
Correlation Between Nominal Variables
The Use of Guttman’s Lambda Formula (also known as Guttman's Coefficient
of Predictability)
\[ \lambda_c = \frac{\sum FR - CT}{N - CT} \qquad \lambda_r = \frac{\sum FC - RT}{N - RT} \]
where
FR = the biggest cell frequency in each row
CT = the biggest column total
N = total frequency
and
FC = the biggest cell frequency in each column
RT = the biggest row total
N = total frequency
Example 12.1.2. Let us measure the degree of relationship between a person's
religion and the political party where he belongs.
                              Political Party
Religion             LAKAS-NUCD   LAMMP   REPORMA   Total
Catholic                 20          9       15       44
Iglesia ni Cristo         5         18        4       27
Protestant               11          8       10       29
Total                    36         35       29      100
Solution.
\[ \lambda_c = \frac{\sum FR - CT}{N - CT} = \frac{(20 + 18 + 11) - 36}{100 - 36} = \frac{13}{64} = 0.20 \]
\[ \lambda_r = \frac{\sum FC - RT}{N - RT} = \frac{(20 + 18 + 15) - 44}{100 - 44} = \frac{9}{56} = 0.16 \]
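Interpretation
When religion is used to predict political party, the error minimized in the
prediction (the increase in its accuracy) is 20 percent. When political party is used
to predict religion, the error minimized in the prediction (the increase in its
accuracy) is 16 percent. These results indicate that religion predicts political
party more accurately than political party predicts religion.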
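The two lambda values of this example can be reproduced from the contingency table alone. The sketch below assumes the table is stored as a list of rows (religion) by columns (political party); the function name `guttman_lambda` and the order of its two return values are choices made here for illustration.

```python
def guttman_lambda(table):
    """Return two lambdas for a contingency table given as a list of rows.

    The first value uses the biggest cell of each row and the biggest column
    total (rows predicting columns); the second uses the biggest cell of each
    column and the biggest row total (columns predicting rows).
    """
    columns = list(zip(*table))
    n = sum(sum(row) for row in table)
    sum_fr = sum(max(row) for row in table)   # biggest cell frequency in each row
    sum_fc = sum(max(col) for col in columns)  # biggest cell frequency in each column
    ct = max(sum(col) for col in columns)      # biggest column total
    rt = max(sum(row) for row in table)        # biggest row total
    lambda_rows_predict_columns = (sum_fr - ct) / (n - ct)
    lambda_columns_predict_rows = (sum_fc - rt) / (n - rt)
    return lambda_rows_predict_columns, lambda_columns_predict_rows


# Rows: Catholic, Iglesia ni Cristo, Protestant; columns: the three parties
table = [
    [20, 9, 15],
    [5, 18, 4],
    [11, 8, 10],
]

religion_predicts_party, party_predicts_religion = guttman_lambda(table)
print(round(religion_predicts_party, 2), round(party_predicts_religion, 2))  # 0.2 0.16
```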
Correlation Between Ordinal Variables of 30 Samples or Less
The Use of Spearman Rho Formula (also known as Spearman Rank-Order
Correlation Coefficient)
\[ \rho = 1 - \frac{6\sum D^2}{N(N^2 - 1)} \]
where N = number of samples
D = difference between ranks
Example 12.1.3. Let us measure the degree of relationship between the
performance rank obtained by the ten trainees during the first and second evaluation
period.
Student    Rank During       Rank During
Trainee    1st Evaluation    2nd Evaluation
A           8                 7
B           2                 5
C           7                10
D           1                 4
E           4                 2
F           9                 6
G           3                 1
H           6                 9
I          10                 8
J           5                 3
Solution.
Student    1st           2nd           D     D^2
Trainee    Evaluation    Evaluation
A           8             7            1     1
B           2             5            3     9
C           7            10            3     9
D           1             4            3     9
E           4             2            2     4
F           9             6            3     9
G           3             1            2     4
H           6             9            3     9
I          10             8            2     4
J           5             3            2     4
                                 \sum D^2 = 62
\[ \rho = 1 - \frac{6(62)}{10(10^2 - 1)} \]
\[ \rho = 1 - \frac{372}{990} \]
\[ \rho = 0.62 \]
Interpretation
There is a high positive correlation between the student trainees' performance
rank during the first and second evaluation period.
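Example 12.1.3 can be verified with a few lines of code. This is a sketch of the rank-difference formula only; it assumes, as in the example, that the ranks contain no ties, and the function name `spearman_rho` is illustrative.

```python
def spearman_rho(ranks_1, ranks_2):
    """Spearman rho from two lists of ranks, using 1 - 6*sum(D^2) / (N(N^2 - 1)).

    Assumes the ranks contain no ties, as in Example 12.1.3.
    """
    if len(ranks_1) != len(ranks_2):
        raise ValueError("both rankings must cover the same samples")
    n = len(ranks_1)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_1, ranks_2))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Ranks of trainees A to J in the first and second evaluation
first_evaluation = [8, 2, 7, 1, 4, 9, 3, 6, 10, 5]
second_evaluation = [7, 5, 10, 4, 2, 6, 1, 9, 8, 3]

rho = spearman_rho(first_evaluation, second_evaluation)
print(round(rho, 2))  # 0.62
```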
Correlation Between Ordinal Variables of More Than 30 Samples
The Use of Goodman and Kruskal’s Gamma Formula
\[ \gamma = \frac{X_p - X_o}{X_p + X_o} \]
X_p = number of pairs observed in parallel direction
X_o = number of pairs observed in opposite direction
                 Column Variable
              C_1      C_2      C_3
   R_1        x_11     x_12     x_13
   R_2        x_21     x_22     x_23
   R_3        x_31     x_32     x_33
where
\[ X_p = x_{11}(x_{22} + x_{23} + x_{32} + x_{33}) + x_{12}(x_{23} + x_{33}) + x_{21}(x_{32} + x_{33}) + x_{22}x_{33} \]
\[ X_o = x_{13}(x_{21} + x_{22} + x_{31} + x_{32}) + x_{12}(x_{21} + x_{31}) + x_{23}(x_{31} + x_{32}) + x_{22}x_{31} \]
Example 12.1.4. Let us measure the degree of relationship between the students'
performance in academic and non-academic areas.
Performance in Academic and Non-Academic Areas
5 12 6
3 8 7
2 5 2
Solution. We follow the indicated steps.
Step 1: Arrange the ordering for one of the two characteristics from the highest to
the lowest or vice versa from top to bottom through the rows, and for the other
characteristic from the highest to the lowest or vice versa from left to right through the
columns.
Step 2: Compute X_p by multiplying the frequency in every cell by the sum of the
frequencies in all of the other cells which are both to the right of the original cell and
below it, and then sum up the products obtained.
\[ X_p = 5(8 + 7 + 5 + 2) + 12(7 + 2) + 3(5 + 2) + 8(2) = 255 \]
Step 3: To solve X_o, you simply reverse partially the process described in Step 2.
You multiply the frequency of every cell by the sum of the frequencies in all of the cells
which are both to the left of the original cell and below it, and then sum up the products obtained.
\[ X_o = 6(3 + 8 + 2 + 5) + 12(3 + 2) + 7(2 + 5) + 8(2) = 233 \]
Step 4: Substitute the values of X_p and X_o in gamma.
\[ \gamma = \frac{X_p - X_o}{X_p + X_o} = \frac{255 - 233}{255 + 233} = \frac{22}{488} = 0.05 \]
Interpretation
There is a very small positive correlation between the students' performance in
academic and non-academic areas.
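Steps 2 to 4 can be sketched in code for a table of any size. The function name `goodman_kruskal_gamma` and the names `parallel` and `opposite` for the two pair counts are chosen here for illustration; the table is assumed to be already ordered as described in Step 1.

```python
def goodman_kruskal_gamma(table):
    """Return (parallel pairs, opposite pairs, gamma) for an ordered table.

    The table is a list of rows. For every cell, cells below and to the right
    contribute to the parallel count, and cells below and to the left
    contribute to the opposite count.
    """
    n_rows, n_cols = len(table), len(table[0])
    parallel = 0
    opposite = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):      # rows below the original cell
                for m in range(n_cols):
                    if m > j:                    # to the right: parallel direction
                        parallel += table[i][j] * table[k][m]
                    elif m < j:                  # to the left: opposite direction
                        opposite += table[i][j] * table[k][m]
    gamma = (parallel - opposite) / (parallel + opposite)
    return parallel, opposite, gamma


# Frequencies of Example 12.1.4
performance = [
    [5, 12, 6],
    [3, 8, 7],
    [2, 5, 2],
]

parallel, opposite, gamma = goodman_kruskal_gamma(performance)
print(parallel, opposite, round(gamma, 3))  # 255 233 0.045
```

To three decimal places gamma is 0.045, which the worked solution reports as 0.05.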
Correlation Between Interval and Dichotomous Nominal Variables
The Use of Point Biserial Coefficient of Correlation Formula
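\[ r_{pb} = \frac{\left(\sum f\right)\left(\sum f_1 x\right) - \left(\sum f_1\right)\left(\sum f x\right)}{\sqrt{\left(\sum f_1\right)\left(\sum f_2\right)\left[\left(\sum f\right)\left(\sum f x^2\right) - \left(\sum f x\right)^2\right]}} \]
where x = interval variable
f_1 = frequency of one category of the dichotomous nominal variable
f_2 = frequency of the other category of the dichotomous nominal variable
f = total frequency of the dichotomous nominal variable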
Example 12.1.5. Let us measure the degree of relationship between sex and
intelligence.
IQ Score   No. of Males   No. of Females
95         8              3
90         3              2
85         1              4
80         2              0
75         4              3
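Solution.
(x)     (f_1)   (f_2)   (f)    (fx)     (fx^2)     (f_1 x)
95       8       3      11     1,045    99,275     760
90       3       2       5       450    40,500     270
85       1       4       5       425    36,125      85
80       2       0       2       160    12,800     160
75       4       3       7       525    39,375     300
Total   18      12      30     2,605   228,075   1,575
\[ r_{pb} = \frac{\left(\sum f\right)\left(\sum f_1 x\right) - \left(\sum f_1\right)\left(\sum f x\right)}{\sqrt{\left(\sum f_1\right)\left(\sum f_2\right)\left[\left(\sum f\right)\left(\sum f x^2\right) - \left(\sum f x\right)^2\right]}} \]
\[ r_{pb} = \frac{30(1,575) - 18(2,605)}{\sqrt{18(12)\left[30(228,075) - (2,605)^2\right]}} \]
\[ r_{pb} = \frac{360}{3,484.9} = 0.10 \]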
Interpretation
There is a very small positive relationship between sex and intelligence.
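The point biserial computation of Example 12.1.5 can be checked as follows. This sketch takes the lowest IQ score as 75, which is the value implied by the worked totals (525 = 75 × 7 and 300 = 75 × 4); the function name `point_biserial` is illustrative.

```python
from math import sqrt


def point_biserial(scores, f1, f2):
    """Point biserial coefficient from a frequency table.

    scores: values of the interval variable
    f1, f2: frequencies of the two categories at each score
    """
    f = [a + b for a, b in zip(f1, f2)]           # total frequency per score
    n1, n2, n = sum(f1), sum(f2), sum(f)
    sum_fx = sum(fi * x for fi, x in zip(f, scores))
    sum_fx2 = sum(fi * x * x for fi, x in zip(f, scores))
    sum_f1x = sum(fi * x for fi, x in zip(f1, scores))
    numerator = n * sum_f1x - n1 * sum_fx
    denominator = sqrt(n1 * n2 * (n * sum_fx2 - sum_fx ** 2))
    return numerator / denominator


iq_scores = [95, 90, 85, 80, 75]
males = [8, 3, 1, 2, 4]
females = [3, 2, 4, 0, 3]

r_pb = point_biserial(iq_scores, males, females)
print(round(r_pb, 2))  # 0.1
```

Swapping the two frequency lists only changes the sign of the coefficient, so the sign here reflects which category is taken as f_1.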
Correlation Between Interval and Any Nominal Variables
The Use of Correlation Ratio Formula
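\[ E^2 = \frac{\sum N_k \bar{Y}_k^2 - N\bar{Y}_t^2}{\sum Y^2 - N\bar{Y}_t^2} \]
where N_k = number of samples per category
Y = individual item
\bar{Y}_k = mean of the items in each category
\bar{Y}_t = mean of all the items
N = total number of samples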
Example 12.1.6. Let us measure the degree of relationship between the civil
status and the annual salary (expressed in thousands of pesos) of the given samples.
Single     65  83  81  69  73  89  76  60
Married    70  67  90  84  78
Widowed    89  64  78
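Solution.
N_1 = 8     N_2 = 5     N_3 = 3
\[ \bar{Y}_1 = \frac{596}{8} = 74.5 \qquad \bar{Y}_2 = \frac{389}{5} = 77.8 \qquad \bar{Y}_3 = \frac{231}{3} = 77 \]
N = 16
\[ \bar{Y}_t = \frac{1,216}{16} = 76 \]
\[ \sum Y^2 = (65)^2 + (83)^2 + (81)^2 + \dots + (89)^2 + (64)^2 + (78)^2 = 93,792 \]
\[ E^2 = \frac{\sum N_k \bar{Y}_k^2 - N\bar{Y}_t^2}{\sum Y^2 - N\bar{Y}_t^2} = \frac{8(74.5)^2 + 5(77.8)^2 + 3(77)^2 - 16(76)^2}{93,792 - 16(76)^2} \]
\[ E^2 = \frac{37.2}{1,376} = 0.027 \qquad E = \sqrt{0.027} = 0.16 \]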
Interpretation
There is a very small positive relationship between the civil status and the annual
salary (expressed in thousands of pesos) of the given samples.
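The correlation ratio of Example 12.1.6 can be recomputed from the raw salaries. In this sketch the two values that are illegible in the data are taken as 73 (Single) and 78 (Married), the only values consistent with the group totals 596 and 389 and with the sum of squares 93,792 used in the solution; the function name `correlation_ratio_squared` is illustrative.

```python
from math import sqrt


def correlation_ratio_squared(groups):
    """Return E^2 for a dict mapping each category to its list of values."""
    all_values = [value for values in groups.values() for value in values]
    n = len(all_values)
    grand_mean = sum(all_values) / n
    # Sum over categories of N_k * (category mean)^2, minus N * (grand mean)^2
    between = sum(
        len(values) * (sum(values) / len(values)) ** 2 for values in groups.values()
    ) - n * grand_mean ** 2
    # Sum of squared items, minus N * (grand mean)^2
    total = sum(value ** 2 for value in all_values) - n * grand_mean ** 2
    return between / total


salaries = {
    "Single": [65, 83, 81, 69, 73, 89, 76, 60],
    "Married": [70, 67, 90, 84, 78],
    "Widowed": [89, 64, 78],
}

e_squared = correlation_ratio_squared(salaries)
print(round(e_squared, 3), round(sqrt(e_squared), 2))  # 0.027 0.16
```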