
Correlation

What is the relation?

Dibyojyoti Bhattacharjee

Definition
• Correlation analysis is a statistical procedure by which
we determine the degree of association or relationship
between two variables.
• Correlation measures only the extent of the relationship
between variables; it does not say anything about cause
and effect.
• The correlation coefficient is denoted by r.
• Karl Pearson (1857 – 1936), a British biometrician,
developed the formula for the correlation coefficient.
• The correlation coefficient between two variables X and
Y is denoted by r(X,Y) or r_xy.
Formula
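• Let x1, x2, …, xn be a set of observations of
the variate X and let y1, y2, …, yn be the
corresponding values of Y. Then the
correlation coefficient is given by

$$
r_{xy} = \frac{\operatorname{Cov}(X, Y)}{\sigma_x \sigma_y}
= \frac{\frac{1}{n}\sum (x_i - \bar{x})(y_i - \bar{y})}
       {\sqrt{\frac{1}{n}\sum (x_i - \bar{x})^2}\,\sqrt{\frac{1}{n}\sum (y_i - \bar{y})^2}}
$$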
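The definition can be turned into a few lines of code. The sketch below is a minimal illustration, not part of the original slides; the function name `pearson_r` and the sample numbers are assumptions chosen only for demonstration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation from the definition Cov(X, Y) / (sigma_x * sigma_y)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Covariance and standard deviations, each with the 1/n factor
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n
    sd_x = sqrt(sum((x - x_bar) ** 2 for x in xs) / n)
    sd_y = sqrt(sum((y - y_bar) ** 2 for y in ys) / n)
    if sd_x == 0 or sd_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / (sd_x * sd_y)

# Illustrative data (hypothetical, not from the slides)
r_demo = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(r_demo, 4))  # 0.7746
```

The 1/n factors cancel between numerator and denominator, so the same value is obtained if they are left out everywhere.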

Formula For Calculation Purpose

$$
r_{xy} = \frac{\sum x_i y_i - n\bar{x}\bar{y}}
              {\sqrt{\sum x_i^2 - n\bar{x}^2}\,\sqrt{\sum y_i^2 - n\bar{y}^2}}
$$
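The calculation formula follows from the definition by expanding the sums. A short derivation, using the same symbols and the fact that the sum of the y values equals n times their mean (and likewise for x), might run as follows:

```latex
\begin{aligned}
\sum (x_i - \bar{x})(y_i - \bar{y})
  &= \sum x_i y_i - \bar{x}\sum y_i - \bar{y}\sum x_i + n\bar{x}\bar{y} \\
  &= \sum x_i y_i - n\bar{x}\bar{y} - n\bar{x}\bar{y} + n\bar{x}\bar{y}
   = \sum x_i y_i - n\bar{x}\bar{y}, \\
\sum (x_i - \bar{x})^2 &= \sum x_i^2 - n\bar{x}^2, \qquad
\sum (y_i - \bar{y})^2 = \sum y_i^2 - n\bar{y}^2 .
\end{aligned}
```

The factors 1/n in the numerator and denominator of the definition cancel, which leaves the calculation formula.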

Uses of Correlation Coefficient
• Correlation analysis helps us to
measure the degree of relationship that
exists between the variables.
• Correlation is used to determine the
regression coefficients when the standard
deviations of the two variables are known.
• Correlation is used in problems of
reliability and validity of tests.
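The second use can be made concrete with the standard relations between r and the two regression coefficients. The symbols b_yx (regression of Y on X) and b_xy (regression of X on Y) are notation introduced here for illustration and do not appear in the slides.

```latex
b_{yx} = r\,\frac{\sigma_y}{\sigma_x}, \qquad
b_{xy} = r\,\frac{\sigma_x}{\sigma_y}, \qquad
b_{yx}\, b_{xy} = r^2
```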
Properties of Correlation Coefficient
• It is rigidly defined.
• It is based on all the observations.
• The correlation coefficient is a pure
number and has no unit of
measurement.
• It lies between −1 and +1.
• If the two variables are independent, the
correlation coefficient between them is
zero but the converse is not true.
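Two of these properties are easy to check numerically. The sketch below uses made-up numbers (not from the slides) to show that r does not change when the unit of measurement changes, and that two variables can be fully dependent yet have zero correlation; `pearson_r` is a helper name assumed here.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation from sums of deviations (the 1/n factors cancel)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Pure number: the same heights in centimetres and in metres (hypothetical data)
heights_cm = [150, 155, 160, 165, 170]
weights_kg = [45, 50, 52, 58, 63]
heights_m = [h / 100 for h in heights_cm]
r_cm = pearson_r(heights_cm, weights_kg)
r_m = pearson_r(heights_m, weights_kg)
print(abs(r_cm - r_m) < 1e-9)  # True: changing the unit leaves r unchanged

# Converse fails: y is completely determined by x, yet r is zero
xs = [-2, -1, 0, 1, 2]
ys = [x * x for x in xs]
r_sq = pearson_r(xs, ys)
print(r_sq)  # 0.0
```

The second case works because the relationship y = x² is not linear and the x values are symmetric about zero, so positive and negative products of deviations cancel.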
Interpretation of Correlation Coefficient
• r = +1 indicates perfect positive correlation, i.e., the
two variables change in exact proportion and in the
same direction.
• r = −1 indicates perfect negative correlation, i.e., the
two variables change in exact proportion but in
opposite directions.
• r = 0 implies that the variables are uncorrelated.
• A value of r very near 0 means very little linear
correlation between X and Y; it does not by itself show
that X and Y are independent.
• A value of r near +1 or −1 means that X and Y are
strongly (linearly) related.
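The two extreme cases can be illustrated with exact straight-line data. This is a small sketch with invented values; it suggests that any exact linear relationship gives r = +1 or r = −1, whatever the slope, with only the sign of the slope deciding which.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation from sums of deviations."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]                    # illustrative values, not from the slides
rising = [2 * x + 3 for x in xs]        # exact line with positive slope
falling = [10 - 0.5 * x for x in xs]    # exact line with negative slope

r_up = pearson_r(xs, rising)
r_down = pearson_r(xs, falling)
print(round(r_up, 6), round(r_down, 6))  # 1.0 -1.0
```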

Scatter Diagram

The heights and weights of 10 girls, each 13 years old, are given below.
Compute the correlation coefficient.

Height (cm)    Weight (kg)
135            26
146            33
153            55
154            50
139            32
131            25
149            44
137            31
143            36
146            35

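$$
\begin{aligned}
r_{xy} &= \frac{\sum x_i y_i - n\bar{x}\bar{y}}
               {\sqrt{\sum x_i^2 - n\bar{x}^2}\,\sqrt{\sum y_i^2 - n\bar{y}^2}} \\
       &= \frac{53227 - 10 \times 143.3 \times 36.7}
               {\sqrt{205883 - 10 \times 143.3^2}\,\sqrt{14357 - 10 \times 36.7^2}} \\
       &= \frac{53227 - 52591.1}
               {\sqrt{205883 - 205348.9}\,\sqrt{14357 - 13468.9}}
        = \frac{635.9}{\sqrt{534.1}\,\sqrt{888.1}} \\
       &= \frac{635.9}{23.11 \times 29.80} \approx \frac{635.9}{688.7} \approx 0.92
\end{aligned}
$$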
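As a cross-check on the hand calculation, the sketch below recomputes each sum directly from the ten (height, weight) pairs and applies the calculation formula. The variable names are chosen here for illustration.

```python
from math import sqrt

# Data from the height and weight example
heights = [135, 146, 153, 154, 139, 131, 149, 137, 143, 146]  # cm
weights = [26, 33, 55, 50, 32, 25, 44, 31, 36, 35]            # kg

n = len(heights)
x_bar = sum(heights) / n                                 # 143.3
y_bar = sum(weights) / n                                 # 36.7
sum_xy = sum(x * y for x, y in zip(heights, weights))    # 53227
sum_x2 = sum(x * x for x in heights)                     # 205883
sum_y2 = sum(y * y for y in weights)                     # 14357

numerator = sum_xy - n * x_bar * y_bar
denominator = sqrt(sum_x2 - n * x_bar ** 2) * sqrt(sum_y2 - n * y_bar ** 2)
r = numerator / denominator
print(round(r, 2))  # 0.92
```

A value this close to +1 points to a strong positive linear relationship between height and weight in this sample.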

Rank Correlation
• Karl Pearson’s correlation coefficient
cannot be calculated when the data relate to
attributes (qualitative characteristics).
• However, attributes can be ranked.
• Charles Edward Spearman, a British
psychologist, developed a formula in 1904
for finding the correlation between
ranks.
Rank Correlation (contd.)
• The formula for rank correlation is given
by:

$$
\rho = 1 - \frac{6\sum d^2}{n(n^2 - 1)}
$$

• where n = number of individuals, and
• d = difference between the ranks of an
individual in the two attributes.
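When the ranks are already available, the formula can be applied directly. The sketch below is illustrative only: the function name `spearman_rho_from_ranks` and the two sets of ranks are hypothetical, and it assumes there are no tied ranks.

```python
def spearman_rho_from_ranks(rx, ry):
    """Rank correlation 1 - 6*sum(d^2) / (n*(n^2 - 1)); assumes no tied ranks."""
    if len(rx) != len(ry) or len(rx) < 2:
        raise ValueError("need two equally long rank lists with at least 2 entries")
    n = len(rx)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Hypothetical ranks given to five individuals on two attributes
rho_demo = spearman_rho_from_ranks([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(round(rho_demo, 2))  # 0.8
```

Identical rankings give 1 and exactly reversed rankings give −1, which mirrors the range of Pearson's r.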

Steps for Calculation of Spearman’s Rank
Correlation Coefficient (ρ)
• When ranks are not given, convert the individual
scores of the two series X and Y into ranks, in
either ascending or descending order (the same
order for both series), and name them Rx and Ry
respectively.
• Compute the difference between ranks (d), which
is equal to Rx − Ry.
• Square these differences (d²) and total them to
get Σd².
• Use the formula for the rank correlation coefficient to
find ρ.
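The steps above can be sketched in code as follows. The helper names and the sample scores are assumptions made for illustration; the ranking helper gives rank 1 to the highest score and assumes that no two scores in a series are equal.

```python
def ranks_descending(scores):
    """Step 1: rank 1 for the highest score. Assumes no tied scores."""
    ordered = sorted(scores, reverse=True)
    return [ordered.index(s) + 1 for s in scores]

def spearman_rho(xs, ys):
    rx = ranks_descending(xs)                # ranks Rx
    ry = ranks_descending(ys)                # ranks Ry
    d = [a - b for a, b in zip(rx, ry)]      # Step 2: d = Rx - Ry
    sum_d2 = sum(v * v for v in d)           # Step 3: total of d^2
    n = len(xs)
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))  # Step 4: apply the formula

# Hypothetical scores for four individuals
rho_steps = spearman_rho([10, 20, 30, 40], [1, 3, 2, 4])
print(round(rho_steps, 2))  # 0.8
```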

Calculate Rank correlation coefficient from the
marks scored by 12 students in the following two
subjects and interpret your result.
QT: 60, 34, 40, 50, 45, 41, 22, 43, 42, 66, 64, 46
Accountancy : 75, 32, 34, 40, 45, 33, 12, 30, 36, 72, 41, 57

Spearman’s rank correlation coefficient is given by

$$
\rho = 1 - \frac{6\sum d^2}{n(n^2 - 1)}
$$

Marks in QT   Rank in QT   Marks in Accountancy   Rank in Accountancy   d = Rx − Ry   d²
(X)           (Rx)         (Y)                    (Ry)
60            3            75                     1                      2            4
34            11           32                     10                     1            1
40            10           34                     8                      2            4
50            4            40                     6                     −2            4
45            6            45                     4                      2            4
41            9            33                     9                      0            0
22            12           12                     12                     0            0
43            7            30                     11                    −4            16
42            8            36                     7                      1            1
66            1            72                     2                     −1            1
64            2            41                     5                     −3            9
46            5            57                     3                      2            4
                                                                        Σd² =        48

$$
\rho = 1 - \frac{6\sum d^2}{n(n^2 - 1)}
     = 1 - \frac{6 \times 48}{12(12^2 - 1)}
     = 1 - 0.17 = 0.83
$$
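The ranks, the total of d², and the final value can be reproduced from the raw marks. This is a minimal sketch with helper names chosen here; it ranks the highest mark as 1 and relies on the fact that neither subject has tied marks.

```python
# Marks of the 12 students, as given in the problem
qt = [60, 34, 40, 50, 45, 41, 22, 43, 42, 66, 64, 46]
accountancy = [75, 32, 34, 40, 45, 33, 12, 30, 36, 72, 41, 57]

def ranks_descending(scores):
    """Rank 1 for the highest mark; valid here because no marks are tied."""
    ordered = sorted(scores, reverse=True)
    return [ordered.index(s) + 1 for s in scores]

rx = ranks_descending(qt)
ry = ranks_descending(accountancy)
d = [a - b for a, b in zip(rx, ry)]
sum_d2 = sum(v * v for v in d)
n = len(qt)
rho = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
print(sum_d2, round(rho, 2))  # 48 0.83
```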

Interpretation???

What if the ranks are provided???

