
Correlation analysis

Karl Pearson's correlation coefficient

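$$r = \frac{\sum dx\,dy}{\sqrt{\sum dx^2 \times \sum dy^2}}$$

where dx = X - mean of X and dy = Y - mean of Y.

The deviations may also be taken from assumed means, dx = X - A and dy = Y - B, where A and B are any constants. In that case the formula becomes

$$r = \frac{n\sum dx\,dy - \sum dx \sum dy}{\sqrt{\left[n\sum dx^2 - (\sum dx)^2\right]\left[n\sum dy^2 - (\sum dy)^2\right]}}$$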

Ex. Marks of 5 students in 2 tests, Test 1 (X) and Test 2 (Y), are given.

Here mean of X = 6 and mean of Y = 8. Taking dx = X - 6 and dy = Y - 8,

$\sum dx^2 = 20$, $\sum dy^2 = 14$, $\sum dx\,dy = 15$

Hence

$$r = \frac{\sum dx\,dy}{\sqrt{\sum dx^2 \times \sum dy^2}} = \frac{15}{\sqrt{20 \times 14}} = \frac{15}{16.73} \approx 0.90$$

Karl Pearson's correlation coefficient

When the X and Y values are given,

$$r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - (\sum X)^2\right]\left[n\sum Y^2 - (\sum Y)^2\right]}}$$
Ex. Find r and interpret its value.

X:   9   8   7   6   5   4   3   2   1
Y:  15  16  14  13  11  12  10   8   9

Solution: Here n = 9, $\sum X = 45$, $\sum Y = 108$, $\sum X^2 = 285$, $\sum Y^2 = 1356$, $\sum XY = 597$.

$$r = \frac{9 \times 597 - 45 \times 108}{\sqrt{(9 \times 285 - 45^2)(9 \times 1356 - 108^2)}} = \frac{513}{540} = 0.95$$

Result: There is a high positive correlation between X and Y.
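The arithmetic in this example can be checked with a short script. This is only a sketch: the function name `pearson_r` is an illustrative choice, and it simply applies the raw-sums formula given above to the nine (X, Y) pairs.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from raw sums: n, sum X, sum Y, sum X^2, sum Y^2, sum XY."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Data from the example above.
X = [9, 8, 7, 6, 5, 4, 3, 2, 1]
Y = [15, 16, 14, 13, 11, 12, 10, 8, 9]

r = pearson_r(X, Y)
print(round(r, 2))  # 0.95
```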

Karl Pearson's correlation coefficient

When the X and Y values are large, r can be calculated by the U-V method.
Here U = X - A and V = Y - B, where A and B are any convenient values near the X and Y values respectively.
Treat U as the new X and V as the new Y, and find the correlation between U and V.
This uses the property that correlation is independent of change of origin and scale.
Hence the correlation between U and V is the same as the correlation between X and Y.
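The change-of-origin property can be illustrated numerically. In the sketch below the data values and the constants A and B are assumptions made up for illustration, not values from the text; `pearson_r` is a hypothetical helper that uses deviations from the actual means.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r using deviations from the actual means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    return sum_dxdy / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

# Illustrative large values (assumed, not taken from the text).
X = [1009, 1008, 1007, 1006, 1005, 1004, 1003, 1002, 1001]
Y = [2015, 2016, 2014, 2013, 2011, 2012, 2010, 2008, 2009]

A, B = 1005, 2012  # assumed constants chosen near the X and Y values
U = [x - A for x in X]
V = [y - B for y in Y]

r_xy = pearson_r(X, Y)
r_uv = pearson_r(U, V)
print(round(r_xy, 4), round(r_uv, 4))  # 0.95 0.95
```

Working with U and V keeps the intermediate sums small, which is the practical reason for the method when the arithmetic is done by hand.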

Spearman's Rank Correlation Coefficient

This is used for qualitative characteristics, for example to find the relationship between the judgements of two judges, or between honesty and efficiency.

$$R = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$

where d = the difference between the ranks and n = the number of pairs of values.

Three cases arise:
1. The ranks of X and Y are known.
2. X and Y values are given, and we have to assign the ranks.
3. X and Y values are given, but some values are repeated.

Spearman's Rank Correlation Coefficient

Ex.1 Ten food products A, B, ... were ranked by two committees as follows:

I:   F A G B D C J E I H
II:  A B G D C E F H J I

What is the rank correlation coefficient?

Solution: Here

Food product:  A   B   C   D   E   F   G   H   I   J
Rank (X):      2   4   6   5   8   1   3  10   9   7
Rank (Y):      1   2   5   4   6   7   3   8  10   9
d:             1   2   1   1   2  -6   0   2  -1  -2
d^2:           1   4   1   1   4  36   0   4   1   4

$\sum d^2 = 56$, n = 10

$$R = 1 - \frac{6\sum d^2}{n(n^2 - 1)} = 1 - \frac{6 \times 56}{10 \times 99} = 1 - 0.34 = 0.66 \approx 0.7$$
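The step from the two orderings to the rank table can be sketched in code. This is a minimal illustration under the assumption that each committee's list is read from rank 1 to rank 10; the names `order_1`, `order_2` and `spearman_from_ranks` are hypothetical.

```python
def spearman_from_ranks(rank_x, rank_y):
    """Spearman's R = 1 - 6*sum(d^2) / (n*(n^2 - 1)) for untied ranks."""
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))

# Orderings given by committees I and II (position 1 = rank 1).
order_1 = "FAGBDCJEIH"
order_2 = "ABGDCEFHJI"

products = sorted(order_1)  # A, B, ..., J
rank_x = [order_1.index(p) + 1 for p in products]
rank_y = [order_2.index(p) + 1 for p in products]

R = spearman_from_ranks(rank_x, rank_y)
print(rank_x)       # [2, 4, 6, 5, 8, 1, 3, 10, 9, 7]
print(rank_y)       # [1, 2, 5, 4, 6, 7, 3, 8, 10, 9]
print(round(R, 2))  # 0.66
```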

Spearman's Rank Correlation Coefficient

Ex.2 Calculate R.

X:    12   17   22   27   32
Y:   113  119  117  115  121

Solution: Here the ranks are as below (rank 1 = largest value).

R1:    5    4    3    2    1
R2:    5    2    3    4    1
d:     0    2    0   -2    0
d^2:   0    4    0    4    0

$\sum d^2 = 8$. Hence

$$R = 1 - \frac{6\sum d^2}{n(n^2 - 1)} = 1 - \frac{6 \times 8}{5 \times 24} = 1 - 0.4 = 0.6$$
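Assigning the ranks is the only new step in this case. The sketch below assumes, as in the solution, that the largest value receives rank 1 and that there are no repeated values; `ranks_descending` is an illustrative helper name.

```python
def ranks_descending(values):
    """Rank 1 goes to the largest value; assumes no repeated values."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

X = [12, 17, 22, 27, 32]
Y = [113, 119, 117, 115, 121]

r1 = ranks_descending(X)
r2 = ranks_descending(Y)
n = len(X)
sum_d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
R = 1 - 6 * sum_d2 / (n * (n * n - 1))

print(r1)           # [5, 4, 3, 2, 1]
print(r2)           # [5, 2, 3, 4, 1]
print(round(R, 2))  # 0.6
```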

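Spearman's Rank Correlation Coefficient

Ex.3 Ranks with repetition. Find R.

X:     18    16   15   19   20   14   11   17   18    12
Y:     71    60   60   73   70   65   66   64   69    60

Here the ranks will be (repeated values share the average of the ranks they occupy):

R1:    3.5   6    7    2    1    8    10   5    3.5   9
R2:    2     9    9    1    3    6    5    7    4     9
d:     1.5   -3   -2   1    -2   2    5    -2   -0.5  0
d^2:   2.25  9    4    1    4    4    25   4    0.25  0

$\sum d^2 = 53.5$

C.F. = correction factor $= \sum \frac{m(m^2 - 1)}{12}$, where m is the number of times a value of x or y repeats.
Here 18 occurs twice in X (m = 2, giving 0.5) and 60 occurs three times in Y (m = 3, giving 2), so C.F. = 2 + 0.5 = 2.5.

Hence

$$R = 1 - \frac{6\left[\sum d^2 + \text{C.F.}\right]}{n(n^2 - 1)} = 1 - \frac{6 \times (53.5 + 2.5)}{10 \times 99} = 1 - 0.34 = 0.66$$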
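The tied-rank procedure of Ex.3 can be sketched as follows, taking the ten X and ten Y values of the example. The helper names `average_ranks` and `tie_correction` are illustrative, and the code assumes that rank 1 goes to the largest value and that tied values share the mean of the ranks they occupy.

```python
from collections import Counter

def average_ranks(values):
    """Rank 1 = largest; tied values share the mean of the ranks they occupy."""
    ordered = sorted(values, reverse=True)
    rank_of = {}
    for v in set(values):
        first = ordered.index(v) + 1   # best rank occupied by this value
        count = ordered.count(v)       # number of times the value repeats
        rank_of[v] = first + (count - 1) / 2
    return [rank_of[v] for v in values]

def tie_correction(values):
    """Sum of m*(m^2 - 1)/12 over each group of m repeated values."""
    return sum(m * (m * m - 1) / 12 for m in Counter(values).values())

X = [18, 16, 15, 19, 20, 14, 11, 17, 18, 12]
Y = [71, 60, 60, 73, 70, 65, 66, 64, 69, 60]

r1, r2 = average_ranks(X), average_ranks(Y)
n = len(X)
sum_d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
cf = tie_correction(X) + tie_correction(Y)
R = 1 - 6 * (sum_d2 + cf) / (n * (n * n - 1))

print(sum_d2, cf)   # 53.5 2.5
print(round(R, 2))  # 0.66
```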

Regression analysis

Correlation does not necessarily represent a cause-and-effect relationship between x and y.
1. Tall brothers have tall sisters, but there is no cause-and-effect relationship between the two; both are the children of tall parents.
2. The direction of the relationship may be unclear: sometimes production decreases when price increases, and sometimes production increases when price increases.
3. Correlation may be due to chance. There may exist a positive relation between the heights of fathers and the marks of their sons purely by chance.
4. The actual relation may sometimes differ from the apparent one. The correlation between time spent in studies and examination results may be positive, but in reality intelligence is a third factor related to both of these.

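Regression analysis

Regression lines are used for prediction.

Regression line of y on x: $(y - \bar{y}) = b_{yx}(x - \bar{x})$
Regression line of x on y: $(x - \bar{x}) = b_{xy}(y - \bar{y})$

where

$$b_{yx} = \frac{r\,\sigma_y}{\sigma_x}, \qquad b_{xy} = \frac{r\,\sigma_x}{\sigma_y}$$

$\sigma_y$ = s.d. of y, $\sigma_x$ = s.d. of x, r = correlation between x and y,
$\bar{x}$ = mean of x, $\bar{y}$ = mean of y.

Equivalently,

$$b_{yx} = \frac{\sum dx\,dy}{\sum dx^2}, \qquad b_{xy} = \frac{\sum dx\,dy}{\sum dy^2}$$

where dx = X - mean of x and dy = Y - mean of y.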

Regression analysis

Ex.1 Estimate the cost of maintaining a 3 year old car of the same make.

Age (yrs):                        2   4   6   7   8  10  12
Annual cost (in hundreds Rs.):   16  15  18  19  17  21  20
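One possible way to work Ex.1 is to fit the regression line of cost (y) on age (x) with $b_{yx} = \sum dx\,dy / \sum dx^2$ and evaluate it at x = 3. The sketch below is an illustration under that assumption, not a solution given in the text; the variable names are arbitrary.

```python
age = [2, 4, 6, 7, 8, 10, 12]          # x, in years
cost = [16, 15, 18, 19, 17, 21, 20]    # y, in hundreds of Rs.

n = len(age)
mean_x = sum(age) / n    # 7.0
mean_y = sum(cost) / n   # 18.0

dx = [x - mean_x for x in age]
dy = [y - mean_y for y in cost]
sum_dxdy = sum(a * b for a, b in zip(dx, dy))  # 37.0
sum_dx2 = sum(a * a for a in dx)               # 70.0

b_yx = sum_dxdy / sum_dx2                      # slope of the line of y on x

# Regression line of y on x: y - mean_y = b_yx * (x - mean_x)
estimate = mean_y + b_yx * (3 - mean_x)
print(round(estimate, 2))  # 15.89 (hundreds of Rs.)
```

Under these assumptions the estimated annual cost for a 3 year old car is about 15.89 hundred rupees.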
Ex.2 Given

          x series   y series
mean      18         100
s.d.      14         20
r = 0.8

Estimate 1) y if x = 70  2) x if y = 90
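Ex.2 can be worked directly from the summary figures with $b_{yx} = r\sigma_y/\sigma_x$ and $b_{xy} = r\sigma_x/\sigma_y$. The following is a sketch of that calculation rather than a solution taken from the text; the variable names are illustrative.

```python
mean_x, mean_y = 18, 100
sd_x, sd_y = 14, 20
r = 0.8

b_yx = r * sd_y / sd_x   # regression coefficient of y on x
b_xy = r * sd_x / sd_y   # regression coefficient of x on y

# 1) Line of y on x, evaluated at x = 70
y_at_70 = mean_y + b_yx * (70 - mean_x)
# 2) Line of x on y, evaluated at y = 90
x_at_90 = mean_x + b_xy * (90 - mean_y)

print(round(y_at_70, 2))  # 159.43
print(round(x_at_90, 2))  # 12.4
```

Note that each estimate uses its own regression line: y is predicted from the line of y on x, and x from the line of x on y.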