
CHAPTER XI

Correlation Coefficient

Lesson 1. Coefficient of Correlation.

The coefficient of correlation measures the strength of the linear
relationship between two sets of interval-scaled or ratio-scaled
variables.

Correlation Analysis. A group of techniques to measure the strength of
the association between two variables.

Dependent Variable. The variable that is being predicted or estimated.

Independent Variable. A variable that provides the basis for
estimation. It is the predictor variable.

The following requirements and characteristics apply to the coefficient
of correlation:

1. Both variables must be measured on at least an interval scale.
2. The coefficient of correlation can range from -1.00 to +1.00.
3. If the correlation between two variables is zero (0), there is no
linear association between them.
4. A value of +1.00 indicates a perfect positive correlation, and
-1.00 a perfect negative correlation.
5. A positive sign means there is a direct relationship between
the variables, and a negative sign means there is an inverse
relationship.
6. It is designated by the letter r and found by the following
equation:

$$
r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}}
$$
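The equation above can be sketched in Python as follows. This is only an illustration and not part of the original lesson: the function name `pearson_r` and the two small data sets are assumptions, chosen to show the extreme values r = +1.00 and r = -1.00 described in items 4 and 5.

```python
import math

def pearson_r(xs, ys):
    """Coefficient of correlation from the raw-score formula of item 6."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must be paired and contain at least two values")
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    numerator = n * sum_xy - sum_x * sum_y
    # The denominator is the square root of the product of the two brackets.
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when either variable is constant")
    return numerator / denominator

# Hypothetical data: Y rises by exactly 2 for each unit of X (perfect direct relationship).
r_direct = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
# Hypothetical data: Y falls by exactly 2 for each unit of X (perfect inverse relationship).
r_inverse = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
print(round(r_direct, 2), round(r_inverse, 2))  # 1.0 -1.0
```

The function raises an error when either variable is constant, because one of the brackets in the denominator is then zero and r cannot be computed.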

Example 1. To illustrate the step-by-step procedure for obtaining a
correlation coefficient, let us examine the relationship
between years of school completed (X) and prejudice (Y) as
found in the following sample of 10 respondents.

Respondent     X     X²     Y     Y²     XY
A             10    100     1      1     10
B              3      9     7     49     21
C             12    144     2      4     24
D             11    121     4     16     44
E              6     36     5     25     30
F              8     64     4     16     32
G             14    196     1      1     14
H              9     81     2      4     18
I             10    100     3      9     30
J              2      4    10    100     20
Sum (∑)       85    855    39    225    243

Using the formula for r, proceed as follows:

Step 1. Determine the mean of X and Y.

x̄ = 85/10 = 8.5; ȳ = 39/10 = 3.9

Step 2. Square X and Y and determine the sum of each.

∑X² = 855; ∑Y² = 225

Step 3. Multiply X and Y, then get the sum of the products.

∑XY = 243

Step 4. Get the sum of X and the sum of Y.

∑X = 85; ∑Y = 39

Step 5. Substitute the data into the formula.

$$
\begin{aligned}
r &= \frac{10(243) - (85)(39)}{\sqrt{\left[(10)(855) - (85)^2\right]\left[(10)(225) - (39)^2\right]}} \\
  &= \frac{2{,}430 - 3{,}315}{\sqrt{\left[8{,}550 - 7{,}225\right]\left[2{,}250 - 1{,}521\right]}} \\
  &= \frac{-885}{\sqrt{(1{,}325)(729)}} \\
  &= \frac{-885}{\sqrt{965{,}925}} \\
  &= \frac{-885}{982.81} \\
r &= -0.90
\end{aligned}
$$

The result shows a very high negative correlation. This means that
there is an inverse relationship between the variables: as years of
school completed increase, prejudice scores tend to decrease.
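Steps 1 to 5 can be retraced in a short Python sketch, which makes it easy to check each column sum and each bracket of the formula against the hand computation. The variable names are illustrative assumptions; the X and Y values are copied from the Example 1 table.

```python
import math

# Years of school completed (X) and prejudice (Y) for respondents A to J.
x = [10, 3, 12, 11, 6, 8, 14, 9, 10, 2]
y = [1, 7, 2, 4, 5, 4, 1, 2, 3, 10]
n = len(x)

# Step 1: means of X and Y.
mean_x = sum(x) / n
mean_y = sum(y) / n

# Steps 2 to 4: the column sums of the table.
sum_x, sum_y = sum(x), sum(y)
sum_x2 = sum(v * v for v in x)
sum_y2 = sum(v * v for v in y)
sum_xy = sum(a * b for a, b in zip(x, y))
print(sum_x, sum_x2, sum_y, sum_y2, sum_xy)  # 85 855 39 225 243

# Step 5: substitute into the formula, keeping each bracket visible.
numerator = n * sum_xy - sum_x * sum_y
bracket_x = n * sum_x2 - sum_x ** 2
bracket_y = n * sum_y2 - sum_y ** 2
r = numerator / math.sqrt(bracket_x * bracket_y)
print(numerator, bracket_x, bracket_y, round(r, 2))
```

Printing the numerator and the two brackets separately is a deliberate choice: a slip in any single multiplication or subtraction shows up at once when the printed values are compared with the written solution.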

Example 2. Computation of the Correlation Coefficient for Job Aptitude
Test (X) and Job Performance (Y) data.

Worker       X       X²      Y       Y²       XY
A           65    4,225     90    8,100    5,850
B           60    3,600     95    9,025    5,700
C           62    3,844     82    6,724    5,084
D           59    3,481     87    7,569    5,133
E           58    3,364     80    6,400    4,640
F           53    2,809     75    5,625    3,975
G           50    2,500     60    3,600    3,000
H           48    2,304     69    4,761    3,312
I           45    2,025     60    3,600    2,700
J           40    1,600     72    5,184    2,880
Sum (∑)    540   29,752    770   60,588   42,274

Step 1. Determine the mean of X and Y.

x̄ = 540/10 = 54; ȳ = 770/10 = 77

Step 2. Square X and Y and determine the sum of each.

∑X² = 29,752; ∑Y² = 60,588

Step 3. Determine the sum of the products of X and Y.

∑XY = 42,274

Step 4. Get the sum of X and the sum of Y.

∑X = 540; ∑Y = 770

Step 5. Substitute the data from the previous steps into the formula.

$$
\begin{aligned}
r &= \frac{10(42{,}274) - (540)(770)}{\sqrt{\left[(10)(29{,}752) - (540)^2\right]\left[(10)(60{,}588) - (770)^2\right]}} \\
  &= \frac{422{,}740 - 415{,}800}{\sqrt{\left[297{,}520 - 291{,}600\right]\left[605{,}880 - 592{,}900\right]}} \\
  &= \frac{6{,}940}{\sqrt{(5{,}920)(12{,}980)}} \\
  &= \frac{6{,}940}{8{,}765.93} \\
r &= 0.792
\end{aligned}
$$

The result reveals a very high positive correlation, which means that
there is a direct relationship between the two variables: as X
increases, Y tends to increase, and as X decreases, Y tends to
decrease.
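As a cross-check, the means from Step 1 can be used in the deviation-score form of the coefficient, which divides the sum of the products of deviations from the means by the square root of the product of the sums of squared deviations. This form is not stated in the lesson; it is a standard algebraic equivalent of the raw-score formula, and the sketch below (with assumed variable names) computes r both ways from the Example 2 data.

```python
import math

# Aptitude test scores (X) and job performance ratings (Y) for workers A to J.
x = [65, 60, 62, 59, 58, 53, 50, 48, 45, 40]
y = [90, 95, 82, 87, 80, 75, 60, 69, 60, 72]
n = len(x)

mean_x = sum(x) / n  # 54.0
mean_y = sum(y) / n  # 77.0

# Deviation-score form: products and squares of deviations from the means.
s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
s_xx = sum((a - mean_x) ** 2 for a in x)
s_yy = sum((b - mean_y) ** 2 for b in y)
r_deviation = s_xy / math.sqrt(s_xx * s_yy)

# Raw-score form used in the lesson, for comparison.
num = n * sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y)
den = math.sqrt((n * sum(a * a for a in x) - sum(x) ** 2)
                * (n * sum(b * b for b in y) - sum(y) ** 2))
r_raw = num / den

print(round(r_deviation, 3), round(r_raw, 3))  # 0.792 0.792
```

Both forms give the same value, so agreement between them is a quick way to catch an arithmetic slip in either computation.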

Lesson 2. Testing the significance of the Correlation Coefficient

The correlation coefficient can be tested to find out whether the
computed value could have come from a population of paired observations
with zero correlation. To do this, we formulate the null and alternative
hypotheses and then compute the t ratio.

H0: ρ = 0 (the correlation in the population is zero)
H1: ρ ≠ 0 (the correlation in the population is different from zero)

This is a two-tailed test. The following formula will be used:

t test for the coefficient of correlation:

$$
t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} \qquad \text{with } n-2 \text{ degrees of freedom} \tag{11-2}
$$

Using the 5% level of significance, the decision rule states that if
the computed t falls in the area between +2.306 and -2.306,
the null hypothesis is not rejected. (The value 2.306 is found in the table
of critical values at the 5% level of significance with 10 - 2 = 8 degrees of
freedom.)

For r = 0.792 and n = 10, substituting into the formula gives:

$$
t = \frac{0.792\sqrt{10-2}}{\sqrt{1-(0.792)^2}} = \frac{0.792(2.828)}{0.611} = \frac{2.24}{0.611} = 3.67
$$

Based on the above computation, t = 3.67 is greater than the
critical t of 2.306, which means that t falls in the rejection region. Thus,
H0 is rejected at the 5% level of significance with 8 degrees of freedom.
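The test can be sketched in Python as follows. The function name `correlation_t` is an assumption made for illustration; the values r = 0.792 and n = 10 and the critical value 2.306 are taken from the text rather than computed from a t table.

```python
import math

def correlation_t(r, n):
    """t statistic for H0: population correlation is zero (n - 2 degrees of freedom)."""
    if n <= 2:
        raise ValueError("n must be greater than 2")
    if abs(r) >= 1:
        raise ValueError("t is undefined when |r| is 1 or more")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

r, n = 0.792, 10      # sample correlation and sample size used in the text
critical_t = 2.306    # two-tailed, 5% level, 8 degrees of freedom (from the text)

t = correlation_t(r, n)
# Two-tailed decision rule: reject H0 when t falls outside -2.306 to +2.306.
reject_h0 = abs(t) > critical_t
print(round(t, 2), reject_h0)  # 3.67 True
```

Comparing the absolute value of t with the critical value applies the two-tailed rule in one step, so the same code handles negative correlations as well.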

Activity 12

Coefficient of Correlation

For each of the following problems:

a. Identify the dependent and independent variables.
b. Compute the coefficient of correlation.
c. Interpret the strength of the correlation coefficient.

1. The following sample observations were randomly selected:

X: 4 5 3 6 10
Y: 4 6 5 7 7

2. ACC Appliance Stores has outlets in several places in Albay. The general
sales manager plans to air a camcorder television commercial on selected
local stations at least twice prior to a gigantic sale starting on Saturday
and ending Sunday. She plans to get the figures for Saturday-Sunday
camcorder sales at the various outlets and pair them with the number of
times the advertisement was shown on the local TV stations. The basic
purpose of this research is to find whether there is any relationship
between the number of times the advertisement was aired and camcorder
sales. The pairings are:

Location of TV Station    Number of Airings    Saturday-Sunday Sales
                                               (Php in thousands)
Daraga                            4                    15
Legazpi City                      2                     8
Bitano, Legazpi City              5                    21
Washington Drive                  6                    24
Tabaco City                       3                    17
Total                            20                    85

3. The owner of Dana Motors wants to study the relationship between the
age of a car and its selling price. Listed below is a random sample of 12
used cars sold at Dana Motors during the last year.

Car    Age (in years)    Selling Price (Php000)
1            9                  8.1
2            7                  6.0
3           11                  3.6
4           12                  4.0
5            8                  5.0
6            7                 10.0
7            8                  7.6
8           11                  8.0
9           10                  8.0
10          12                  6.0
11           6                  8.6
12           6                  8.0

4. The Production Dept of ABC Electronics wants to explore the relationship
between the number of employees who assemble a subassembly and the
number produced. As an experiment, two employees were assigned to
assemble the subassemblies. They produced 15 during a one-hour
period. Then four employees assembled them. They produced 25 during
a one-hour period. The complete set of paired observations follows:

Number of Assemblers    One-Hour Production (Units)
        2                        15
        4                        25
        1                        10
        5                        40
        3                        30

5. The city council of Prince City is considering increasing the number of
police in an effort to reduce crime. Before making the final decision, the
council asks the Chief of Police to survey other cities of similar size to
observe the relationship between the number of police and the number of
crimes reported. The Chief gathered the following information.

City     Number of Police    Number of Crimes
A              15                  17
B              17                  13
C              25                   5
D              27                   7
E              17                   7
F              12                  21
G              11                  19
H              22                   6
Total         146                  95
