You are on page 1of 26

Rank Correlation method:

Sometimes we come across statistical series in which the variables under consideration are not
capable of quantitative measurement but can be arranged in serial order.

This happens when we are dealing with qualitative characteristics (attributes) such as honesty, beauty,
character, morality, etc., which cannot be measured quantitatively but can be arranged serially.

In such situations Karl Pearson’s coefficient of correlation cannot be used as such. Charles Edward
Spearman, a British psychologist, developed a formula in 1904 which consists in obtaining the correlation
coefficient between the ranks of n individuals in the two attributes under study.
For example:
Suppose we want to find if two characteristics A, say, intelligence and B, say, beauty are related or not.

Both the characteristics are incapable of quantitative measurements but we can arrange a group of n
individuals in order of merit (ranks) w.r.t. proficiency in the two characteristics.

Let the random variables X and Y denote the ranks of the individuals in the characteristics A and B
respectively.

If we assume that there is no tie, i.e., if no two individuals get the same rank in a characteristic then,
obviously, 𝑋 and 𝑌 assume numerical values ranging from 1 to 𝑛.
The Pearsonian correlation coefficient between the ranks 𝑋 and 𝑌 is called the rank correlation coefficient
between the characteristics 𝐴 and 𝐵 for that group of individuals.

Spearman’s rank correlation coefficient, usually denoted by 𝜌 (Rho) is given by the formula

𝟔 ∑ 𝒅𝟐
𝝆=𝟏 − 𝒏 𝒏𝟐 𝟏 (1)

Where 𝑑 is the difference between the pair of ranks of the same individual in the two characteristics

i.e. 𝑑 = 𝑥 − 𝑦 , 𝑖=1,2,……n, where (𝑥 , 𝑦 ) denote the ranks of the ith individual when ranked according to
the two characteristics.

and 𝑛 is the number of pairs.


Computation of rank correlation coefficient:

We shall discuss below the method of computing the Spearman’s rank correlation coefficient 𝜌 under the following
situations:
I. When actual ranks are given
II. When ranks are not given

Case I: When actual ranks are given:


In this situation the following steps are involved:
i. Compute 𝑑, the difference of ranks.
ii. Compute 𝑑
iii. Obtain the sum ∑ 𝑑
iv. Use formula (1) to get the value of 𝜌.
Example-2:

The ranks of the same 15 students in two subjects Mathematics and Physics are given below:

Here the two numbers within the brackets denoting the ranks of the same student in Mathematics and Physics
respectively. (1,10), (2,7), (3,2), (4,6), (5,4), (6,8), (7,3), (8,1), (9,11), (10,15), (11,9), (12,5), (13,14), (14,12),
(15,13).

Use Spearman’s formula to find the rank correlation coefficient.


Solution:

Rank in Math Rank in Phys. 𝑑 =𝑥−𝑦 𝑑


(x) (y)
1 10 -9 81
2 7 -5 5
3 2 1 1
4 6 -2 4
5 4 1 1
6 8 -2 4
7 3 4 16
8 1 7 49
9 11 -2 4
10 15 -5 25
11 9 2 4
12 5 7 49
13 14 -1 1
14 12 2 4
15 13 2 4
∑ 𝑑 =272
𝑑 =0
Spearman’s rank correlation coefficient 𝜌 (Rho) is given by

6∑𝑑
𝜌=1−
𝑛 𝑛 −1
6 × 272 17 18
=1− =1− = = 0.51
15 225 − 1 35 35

Example-3:

Calculate Spearman’s rank correlation coefficient between advertisement cost and sales from the following data,

Advertising cost (thousands Rs.) 39 65 62 90 82 75 25 98 36 78

Sales (lakhs Rs.) 47 53 58 86 62 68 60 91 51 84


Solution:

Let X denotes the advertising cost(‘000Rs.) and Y denotes the Sales (lakhs Rs.).

𝑋 𝑌 Rank of Rank of 𝑑 =𝑥−𝑦 𝑑


𝑋(𝑥) 𝑌(𝑦)
39 47 8 10 -2 4
65 53 6 8 -2 4 Here 𝑛 = 10
62 58 7 7 0 0
90 86 2 2 0 0 6∑𝑑
𝜌=1−
82 62 3 5 -2 4 𝑛 𝑛 −1
75 68 5 4 1 1
6 × 30
25 60 10 6 4 16 =1−
98 91 1 1 0 0 10 × 99
63 51 9 9 0 0 2 9
78 84 4 1 1 1 =1− = = 0.82.
11 11

∑ 𝑑 =0 ∑ 𝑑 =30
Example-4:
Find the rank correlation coefficient from the following data

Ranks 1 2 3 4 5 6 7
in X

Ranks 4 3 1 2 6 5 7
in Y
Solution:

In this problem ranks are not repeated

𝑥 𝑦 𝑑 =𝑥 −𝑦 𝑑
1 4 -3 9
2 3 -1 1
3 1 2 4
4 2 2 4
5 6 -1 1
6 5 1 1
7 7 0 0

𝑑 = 20

In this problem ranks are not repeated, so the rank correlation coefficient is

6∑𝑑 6 × 20
𝑟 𝑥, 𝑦 = 𝜌 = 1 − =1− = 0.6429
𝑛 𝑛 −1 7 7 −1
Example-5:

Calculate the rank correlation coefficient from the following data, which give the ranks of 10 students in Mathematics and
Computer Science

Mathematics 1 5 3 4 7 6 10 2 9 8
𝑥

Computer 6 9 1 3 5 4 8 2 10 7
Science 𝑦
Solution:

𝑥 𝑦 𝑑 =𝑥 −𝑦 𝑑 In this problem ranks are not repeated, so the


1 6 -5 25 rank correlation coefficient is
5 9 -4 16
3 1 2 4 6∑𝑑
𝜌=1−
𝑛 𝑛 −1
4 3 1 1
7 5 2 4 6 × 60
=1− = 0.63636
6 4 2 4 10 10 − 1
10 8 2 4
2 2 0 0
9 10 -1 1
8 7 1 1
∑𝑑 =60
HW:

The ranks of same 16 students in mathematics and physics are as follows. Calculate rank correlation
coefficients for proficiency in mathematics and physics

Mathematic 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
s 𝑥

Physics 𝑦 1 10 3 4 5 7 2 6 8 11 15 9 14 12 16 13
Example-6 ( based on beauty context):

Ten competitors in a beauty contest are ranked by three judges in the following order
Use the rank correlation coefficient to determine which pair of judges has the nearest approach in
common tastes in beauty.

1st Judge 1 6 5 10 3 2 4 9 7 8

2nd Judge 3 5 8 4 7 10 2 1 6 9

3rd Judge 6 4 9 8 1 2 3 10 5 7

Solution:
Let 𝑅 , 𝑅 and 𝑅 denote the ranks given by the first, second and third judges respectively and
let 𝜌 be the rank correlation coefficient between the ranks given by ith and jth judges 𝒊 ≠ 𝒋 = 𝟏, 𝟐, 𝟑.
Let 𝒅𝒊𝒋 = 𝑹𝒊 − 𝑹𝒋 , be the difference of ranks of an individual given by the ith and jth judge.
𝑅 𝑅 𝑅 𝑑 =𝑅 −𝑅 𝑑 =𝑅 −𝑅 𝑑 =𝑅 −𝑅 𝑑 𝑑 𝑑

1 3 6 -2 -5 -3 4 25 9
6 5 4 1 2 1 1 4 1
5 8 9 -3 -4 -1 9 16 1
10 4 8 6 2 -4 36 4 16
3 7 1 -4 2 6 16 4 36
2 10 2 -8 0 8 64 0 64
4 2 3 2 1 -1 4 1 1
9 1 10 8 -1 -9 64 1 81
7 6 5 1 2 1 1 4 1
8 9 7 -1 1 2 1 1 1

∑𝑑 =0 ∑𝑑 =0 ∑𝑑 =0 ∑𝑑 =200 ∑ 𝑑 =60 ∑ 𝑑 =214


We have 𝑛 = 10
Spearman’s rank correlation coefficient 𝜌 is given by

6∑𝑑 6 × 200 7
𝜌 = 1− =1− =− = −0.2121
𝑛 𝑛 −1 10 × 99 33
6∑𝑑 6 × 60 7
𝜌 =1− =1− = = 0.6363
𝑛 𝑛 −1 10 × 99 11
6∑𝑑 6 × 214 49
𝜌 = 1− =1− =− = −0.2970
𝑛 𝑛 −1 10 × 99 165
Since 𝜌 is maximum, the pair of first and third judges has the nearest approach to common tastes in
beauty.
Remark, since 𝜌 and 𝜌 are negative, the pair of judges (1,2) and (2,3) have opposite (divergent) tastes
for beauty.
Case II: When ranks are not given
Spearman’s rank correlation formula can also be used even if we are dealing with variables which are
measured quantitatively, i.e., when the actual data is available but not the ranks relating to two variables are
given.

In such a case we shall have to convert the data into ranks.

The highest (smallest) observation is given the rank 1. The next highest (next lowest) observation is given
rank 2 and so on.

It is immaterial in which way (descending or ascending) the ranks are assigned. However, the same
approach should be followed for the all the variables under consideration.
Repeated ranks:

In case of attributes if there is a tie i.e., if any two or more individuals are placed together in any

classification with respect to an attribute.

or if in case of variable data there is more than one item with the same value in either or both the

series,

then Spearman’s formula for calculating the rank correlation coefficient breaks down, since in this case

the variables X [the ranks of individuals in characteristic A (1st series)] and Y [ the ranks of individuals

in characteristic B (2nd series) do not take the values from 1 to n and consequently 𝑥̅ ≠ 𝑦, while

Spearman’s formula proving we had assumed that 𝑥̅ = 𝑦.


In this case, common ranks are assigned to the repeated items.

These common ranks are the arithmetic mean of the ranks which these items should have got if they are
different from each other and the next item will get the rank next to the rank used computing the common
rank.

For example,

suppose an item is repeated at rank 4.

The common rank to be assigned to each item is (4+5)/2 i.e., 4.5 which is the average of 4 and 5, the ranks which
these observations would have assigned if they were different.

The next item will be assigned the rank 6. If an item is repeated thrice at rank 7, then the common rank to be
assigned to each value will e (7+8+9)/3 i.e., 8 which is arithmetic mean of 7, 8 and 9.

The ranks these observations would have got if they were different from each other. The next rank to be assigned
will be 10.
For repeated/ tied rank:-

In the Spearman’s formula add the factor to ∑ 𝑑 ,

where 𝑚 is the number of times is repeated.

This correction factor is to be added for each repeated value in both the series.

Example-7:

A psychologist wanted to compare two methods A and B of teaching. He selected a random sample of 22 students. He
grouped them into 11 pairs so the students in a pair have approximately equal scores on an intelligence test. In each
pair one student was taught by method A and the other by method B and examined after the course. The marks
obtained by them are tabulated below:
Pair 1 2 3 4 5 6 7 8 9 10 11

A 24 29 19 14 30 19 27 30 20 28 11

B 37 35 16 26 23 27 19 20 16 11 21

Find the rank correlation coefficient

Solution:

In the X-series, we seen that the value 30 occurs twice. After arranging it in descending order the common rank
assigned to each of these values is 1.5, the arithmetic mean of 1 and 2, the ranks which these which observations
would have taken if they were different. The next value 29 gets the next i.e. rank 3. Again, the value 19 occurs twice.
The common rank assigned to it as 8.5, the arithmetic mean of 8 and 9 and the next value, 14 gets the rank 10.
Similarly, in the y-series the value 16 occurs twice and the common rank assigned to each is 9.5, the arithmetic mean
of 9 and 10, the next value, 11 gets the rank 11.
X Y Rank of X (x) Rank of Y (y) d=x-y 𝑑

24 37 6 1 5 25

29 35 3 2 1 1

19 16 8.5 9.5 -1 1

14 26 10 4 6 36

30 23 1.5 5 -3.5 12.25

19 27 8.5 3 5.5 30.25

27 19 5 8 -3 9

30 20 1.5 7 -5.5 30.25

20 16 7 9.5 -2.5 6.25

28 11 4 11 -7 49

11 21 11 6 5 25

𝑑 =0 𝑑 = 225

Arrange all the given values in descending order to find the rank.
Hence, we see that in the X-series the items 19 and 30 are repeated, each occurring twice and,

In the Y-serie, the item 16 is repeated twice. Thus, in each of the three cases 𝑚 = 2.

Hence on applying the correction factor for each repeated item, we get


𝜌 = 1− , here n=11

6 × 226.5
𝜌=1− = 1 − 1.0225 = −0.0225
11 × 120
Example-8:

A sample of 12 fathers and their eldest sons have the following data about their heights in inches.

Calculate the rank correlation coefficient.

Fathers 65 63 67 64 68 63 70 66 68 67 69 71
𝑥

Sons 68 66 68 65 69 66 68 65 71 67 68 70
𝑦
Solution:

Fathers 𝑥 Sons 𝑦 Rank of 𝑥 Rank of 𝑦 𝑑 =𝑥−𝑦 𝑑


65 68 9 5.5 3.5 12.25
63 66 11 9.5 1.5 2.25
67 68 6.5 5.5 1 1
64 65 10 11.5 -1.5 2.25
68 69 4.5 3 1.5 2.25
62 66 12 9.5 2.5 6.25
70 68 2 5.5 -3.5 12.25
66 65 8 11.5 -3.5 12.25
68 71 4.5 1 3.5 12.25
67 67 6.5 8 -1.5 2.25
69 68 3 5.5 2.5 6.25
71 70 1 2 -1 1
∑ 𝑑 =0 ∑ 𝑑 =72.5
Correction factors

In 𝑥, 68 is repeated twice, then = =

In 𝑥, 67 is repeated twice, then = =

In 𝑦, 68 is repeated 4 times, then = =5

In 𝑦, 66 is repeated twice, then = =

In 𝑦, 65 is repeated twice, then = =

Rank correlation is

1 1 1 1
6 ∑𝑑 + 2 + 2 + 5 + 2 + 2
𝜌=1− = 0.722
12 144 − 1

You might also like