CHAPTER 6
CORRELATION ANALYSIS
CONTENTS
6.1 Introduction
6.2 Types of Correlation
6.2.1 Positive or Negative Correlation
6.2.2 Simple or Multiple Correlations
6.2.3 Partial or Total Correlation
6.2.4 Linear and Non-linear Correlation
6.3 Methods of Calculating Correlation
6.4 Scatter Diagram Method
INTRODUCTORY CASELET
N O T E S
The correlation between the Sensex and the rupee has been drifting
away from its historical averages, following RBI’s interventions
in the currency market. The central bank has been intervening
in the forex market in order to cap the significant upside in the
rupee as well as to build forex reserves. The 120-day correlation
between the Sensex and the rupee has fallen to –0.36.
Interestingly, such correlation levels were not seen before the
global financial crisis in September 2008.
6.1 INTRODUCTION
We often encounter situations where data appear as pairs of
figures relating to two variables, for example, price and demand of a
commodity, money supply and inflation, industrial growth and GDP,
advertising expenditure and market share, etc. Examples of correlation
problems are found in the study of the relationship between IQ and
aggregate percentage marks obtained in a mathematics examination, or
between blood pressure and metabolism. In these examples, both variables
are observed as they naturally occur, since neither variable can be fixed
at predetermined levels.
These are some of the important definitions about correlation.
variables on a third variable. In some cases there may not be any
cause-effect relationship at all. Therefore, if we do not consider and
study the underlying economic or physical relationship, correlation
may sometimes give absurd results. For example, take the case of global
average temperature and the Indian population. Both have been increasing
over the past 50 years, but they are obviously not causally related.
Correlation is an analysis of the degree to which two or more variables
fluctuate with reference to each other. Correlation is expressed by a
coefficient ranging between –1 and +1. A positive (+ve) sign indicates
movement of the variables in the same direction. For example, the
quantity of fertilizer used on a farm and the yield show a positive
relationship within technological limits. A negative (–ve) coefficient
indicates movement of the variables in opposite directions, i.e.
when one variable decreases, the other increases. For example, the price
and demand of a commodity have an inverse relationship. Absence of
linear correlation is indicated if the coefficient is close to zero. A
value of the coefficient close to ±1 denotes a very strong linear
relationship.
The study of correlation helps managers in the following ways:
To identify relationship of various factors and decision variables.
To estimate value of one variable for a given value of other if both
are correlated. E.g. estimating sales for a given advertising and
promotion expenditure.
us that one variable is independent and the other dependent on it. E.g.
surface temperature of the Pacific Ocean (El Niño) affects monsoons
in India but monsoons do not affect temperatures of the Pacific Ocean.
Thirdly, in some cases both variables under study may be fluctuating
together due to variation in a third variable. Thus both variables
under correlation analysis may be dependent variables and hence
not causally related to each other. In such a case, the manager cannot
vary one of them and expect the other variable to vary. For example,
correlation between an increase in share prices and a stronger rupee
against the dollar may be due to an increase in Foreign Direct Investment
(FDI). In this case, expecting the Reserve Bank to control falling share
prices by selling dollars is incorrect. To control these two variables we
need to control FDI. Further, if the falling share prices are due to market
sentiments or an overheated market, controlling FDI may not help. Thus,
the manager needs to analyze the problem in the business environment
before he/she can apply the correlation analysis in decision-making.
In managerial decision-making, it is a good practice to draw the scatter
diagram first, and then study the logical relationship to identify the
type of correlation and the cause-effect relation. Only then should the
manager calculate the coefficient of correlation for further mathematical
analysis. Types of correlation that need to be differentiated before
using the correlation coefficient for managerial decision-making are
given below.
in the value of other variable also.
Negative or inverse correlation refers to the movement of the
variables in opposite direction. Correlation is said to be negative, if
an increase (decrease) in the value of one variable is accompanied
by a decrease (increase) in the value of other.
6.2.3 PARTIAL OR TOTAL CORRELATION
In case of multiple correlation analysis there are two approaches to
study the correlation. In partial correlation, we study the variation
of two variables while excluding the effects of other variables by keeping
them under controlled conditions. In a ‘total correlation’ study we
allow all relevant variables to vary with respect to each other and find
the combined effect. With few variables, it is feasible to study ‘total
correlation’. As the number of variables increases, it becomes impractical
to study the ‘total correlation’. For example, the coefficient of correlation
between yield of wheat and chemical fertilizers, excluding the effects of
pesticides and manures, is called partial correlation. Total correlation
is based upon all the variables.
When the amount of change in one variable tends to keep a
constant ratio to the amount of change in the other variable, then
the correlation is said to be linear.
But if the amount of change in one variable does not bear a
constant ratio to the amount of change in the other variable, then
the correlation is said to be non-linear.
5. Correlation is said to be ..................., if an increase (decrease)
in the value of one variable is accompanied by a decrease
(increase) in the value of other.
6. When the amount of change in one variable tends to keep a
constant ratio to the amount of change in the other variable,
then the correlation is said to be ................... .
7. In case of ................... correlation the rate of variation changes
as values increase or decrease.
A scatter diagram tells us not only about linearity or non-linearity but
also whether the data is cyclic. When the values of two variables have a
constant rate of change, it is linear correlation.
6.3 METHODS OF CALCULATING CORRELATION
Simple linear correlation is a statistical tool applied in many business
problem solving. How will you find the correlation between your scores in
different subjects and interpret which was your strongest subject?
6.4 SCATTER DIAGRAM METHOD
Scatter diagram is the most fundamental graph plotted to show
relationship between two variables. It is a simple way to represent
bivariate distribution. Bivariate distribution is the distribution of two
random variables. The two variables are plotted against each other, one
on the X axis and the other on the Y axis. Thus, every data pair (xi, yi)
is represented by a point on the graph, x being the abscissa and y being
the ordinate of the point. From a scatter diagram we can find if there is
any relationship between x and y, and if yes, what type of relationship.
A scatter diagram thus indicates the nature and strength of the
correlation.
diagram. The way the dots scatter gives an indication of the kind of
relationship which exists between the two variables. While drawing a
scatter diagram, it is not necessary to take the zero values of the X and
Y variables at the point of origin; the minimum values of the variables
considered may be taken instead.
When there is a positive correlation between the variables, the dots
on the scatter diagram run from left hand bottom to the right hand
upper corner. In case of perfect positive correlation all the dots will lie
on a straight line.
When a negative correlation exists between the variables, dots on the
scatter diagram run from the upper left hand corner to the bottom
right hand corner. In case of perfect negative correlation, all the dots
lie on a straight line.
If a scatter diagram is drawn and no path is formed, there is no
correlation.
Example: Figures on advertisement expenditure (X) and Sales (Y) of
a firm for the last ten years are given below. Draw a scatter diagram.
Advertisement cost (₹ '000)   40  65  60  90  85  75  35  90  34  76
Sales (₹ lakh)                45  56  58  82  65  70  64  85  50  85
Solution:
[Figure: scatter diagram of Sales in ₹ lakh (vertical axis) against Advertisement cost in ₹ '000 (horizontal axis)]
Income (X) (₹)        100  110  113  120  125  130  130  140
Expenditure (Y) (₹)    85   90   91  100  110  125  125  130
Solution:
[Figure: Scatter Diagram of Expenditure (Y) (₹) (vertical axis) against Income (X) (₹) (horizontal axis)]
Fill in the blanks:
10. Scatter diagram is the most fundamental graph plotted to
show relationship between ................... variables.
\[
r = \frac{\frac{1}{n}\sum (X - \bar{X})(Y - \bar{Y})}{\sigma_X \sigma_Y} \qquad (1)
\]
Where r is the ‘Correlation Coefficient’ or ‘Product Moment Correlation
Coefficient’ between X and Y. $\sigma_X$ and $\sigma_Y$ are the standard
deviations of X and Y respectively. ‘n’ is the number of pairs of values of
X and Y in the given data. The expression
$\frac{1}{n}\sum (X - \bar{X})(Y - \bar{Y})$ is known as the covariance
between the variables X and Y. It is denoted as Cov(X, Y). The Correlation
Coefficient r is a dimensionless number whose value lies between +1 and –1.
Positive values of r indicate positive (or direct) correlation between the
two variables X and Y, i.e. both X and Y increase or decrease together.
Negative values of r indicate negative (or inverse) correlation, meaning
that an increase in one variable is accompanied by a decrease in the value
of the other variable. A zero correlation means that there is no linear
association between the two variables.
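The formula can be modified as,
\[
r = \frac{\frac{1}{n}\sum (X - \bar{X})(Y - \bar{Y})}{\sigma_X \sigma_Y}
  = \frac{\frac{1}{n}\sum (XY - X\bar{Y} - \bar{X}Y + \bar{X}\bar{Y})}{\sigma_X \sigma_Y}
\]
\[
= \frac{\dfrac{\sum XY}{n} - \dfrac{\sum X}{n} \times \dfrac{\sum Y}{n}}
       {\sqrt{\dfrac{\sum X^2}{n} - \left(\dfrac{\sum X}{n}\right)^2}\,
        \sqrt{\dfrac{\sum Y^2}{n} - \left(\dfrac{\sum Y}{n}\right)^2}} \qquad (2)
\]
\[
= \frac{E[XY] - E[X]\,E[Y]}
       {\sqrt{E[X^2] - (E[X])^2}\,\sqrt{E[Y^2] - (E[Y])^2}} \qquad (3)
\]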
Equations (2) and (3) are alternate forms of equation (1). These have
advantage that we don’t have to subtract each value from the mean.
Example: The data of advertisement expenditure (X) and sales (Y)
of a company for past 10 year period is given below. Determine the
correlation coefficient between these variables and comment on the
correlation.
X 50 50 50 40 30 20 20 15 10 5
Y 700 650 600 500 450 400 300 250 210 200
Solution:
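\[
\bar{X} = \frac{\sum X}{n} = \frac{290}{10} = 29, \qquad
\bar{Y} = \frac{\sum Y}{n} = \frac{4260}{10} = 426
\]
S.No.    X      Y     x = (X − X̄)   y = (Y − Ȳ)     x²       y²      xy
1        50    700        21            274          441    75076    5754
2        50    650        21            224          441    50176    4704
3        50    600        21            174          441    30276    3654
4        40    500        11             74          121     5476     814
5        30    450         1             24            1      576      24
6        20    400        –9            –26           81      676     234
7        20    300        –9           –126           81    15876    1134
8        15    250       –14           –176          196    30976    2464
9        10    210       –19           –216          361    46656    4104
10        5    200       –24           –226          576    51076    5424
Total   290   4260         0              0         2740   306840   28310
Now,
\[
r = \frac{\frac{1}{n}\sum (X - \bar{X})(Y - \bar{Y})}{\sigma_X \sigma_Y}
  = \frac{\frac{1}{n}\sum xy}{\sqrt{\frac{\sum x^2}{n}}\,\sqrt{\frac{\sum y^2}{n}}}
  = \frac{\sum xy}{\sqrt{\sum x^2 \sum y^2}}
  = \frac{28310}{\sqrt{2740 \times 306840}} = 0.976
\]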
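The calculation above can be checked with a short script. The following is a minimal sketch in Python, using only the standard library; the function names `pearson_r` and `pearson_r_raw_sums` are illustrative and not part of the text. The first function follows the deviation form of equation (1), the second the raw-sum form of equation (2).

```python
import math

def pearson_r(xs, ys):
    """Pearson r from deviations about the means, as in equation (1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    # The 1/n factors cancel between numerator and denominator.
    return sum_xy / math.sqrt(sum_xx * sum_yy)

def pearson_r_raw_sums(xs, ys):
    """Same coefficient from raw sums, as in equation (2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    mean_xy = sum(x * y for x, y in zip(xs, ys)) / n
    mean_xx = sum(x * x for x in xs) / n
    mean_yy = sum(y * y for y in ys) / n
    numerator = mean_xy - mean_x * mean_y
    denominator = math.sqrt(mean_xx - mean_x ** 2) * math.sqrt(mean_yy - mean_y ** 2)
    return numerator / denominator

# Advertisement expenditure (X) and sales (Y) from the example.
X = [50, 50, 50, 40, 30, 20, 20, 15, 10, 5]
Y = [700, 650, 600, 500, 450, 400, 300, 250, 210, 200]

r_def = pearson_r(X, Y)
r_raw = pearson_r_raw_sums(X, Y)
print(round(r_def, 3), round(r_raw, 3))  # 0.976 0.976
```

Both forms give the same value, which illustrates why the raw-sum form is convenient: no deviations from the mean need to be tabulated.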
Where a, b, g and h are constants.
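In this case, we have defined variables U and V through shift of origin
from (0, 0) to (a, b) and change of the X and Y scales by factors ‘g’ and
‘h’ respectively. Thus for every observation pair (xi, yi) there is a
corresponding pair (ui, vi) such that,
\[
u_i = \frac{x_i - a}{g} \quad \text{and} \quad v_i = \frac{y_i - b}{h}
\]
Now,
\[
\bar{X} = \frac{\sum x_i}{n} = \frac{\sum (g\,u_i + a)}{n} = \frac{g \sum u_i + n\,a}{n} = g\bar{U} + a
\]
Similarly,
\[
\bar{Y} = h\bar{V} + b
\]
Now,
\[
x_i - \bar{X} = (g\,u_i + a) - (g\bar{U} + a) = g(u_i - \bar{U})
\]
And
\[
y_i - \bar{Y} = h(v_i - \bar{V})
\]
Hence,
\[
\sigma_X^2 = \frac{\sum (x_i - \bar{X})^2}{n} = \frac{g^2 \sum (u_i - \bar{U})^2}{n} = g^2 \sigma_U^2
\]
And
\[
\sigma_Y^2 = h^2 \sigma_V^2
\]
Now,
\[
r_{XY} = \frac{\frac{1}{n}\sum (x_i - \bar{X})(y_i - \bar{Y})}{\sigma_X \sigma_Y}
       = \frac{\sum g(u_i - \bar{U}) \times h(v_i - \bar{V})}{n \times (g\,\sigma_U)(h\,\sigma_V)}
       = \frac{\frac{1}{n}\sum (u_i - \bar{U})(v_i - \bar{V})}{\sigma_U \sigma_V}
       = r_{UV}
\]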
This result is very useful for manual calculations. We can select
arbitrary constants a, b, g and h (with g and h positive) so as to simplify
the data and then find rUV, which gives the result rXY. Thus, if any
constant is added to or subtracted from the variables, or the variables are
multiplied or divided by any positive constant, the correlation coefficient
between these two variables does not change.
Example: The data of advertisement expenditure (X) and sales (Y)
of a company for past 10 year period is given below. Determine the
correlation coefficient between these variables and comment on the
correlation.
X 50 50 50 40 30 20 20 15 10 5
Y 700 650 600 500 450 400 300 250 210 200
Solution: We shall take U to be the deviation of X values from the
assumed mean of 30 divided by 5. Similarly, V represents the deviation
of Y values from the assumed mean of 400 divided by 10.
Short cut procedure for calculation of correlation coefficient
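\[
r = \frac{\sum_{i=1}^{n} u_i v_i - \frac{1}{n} \sum_{i=1}^{n} u_i \sum_{i=1}^{n} v_i}
         {\sqrt{\sum_{i=1}^{n} u_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} u_i\right)^2}\,
          \sqrt{\sum_{i=1}^{n} v_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} v_i\right)^2}}
\]
\[
= \frac{561 - \frac{(-2)(26)}{10}}
       {\sqrt{110 - \frac{(-2)^2}{10}}\,\sqrt{3136 - \frac{(26)^2}{10}}}
= \frac{561 + 5.2}{\sqrt{109.6}\,\sqrt{3068.4}} = 0.976
\]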
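The invariance of r under a change of origin and scale can also be checked numerically. The sketch below, in Python with only the standard library, transforms the example data with the assumed means 30 and 400 and the scale factors 5 and 10 used in the solution, and compares the two coefficients. The helper name `pearson_r` is illustrative.

```python
import math

def pearson_r(xs, ys):
    """Pearson r from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

X = [50, 50, 50, 40, 30, 20, 20, 15, 10, 5]
Y = [700, 650, 600, 500, 450, 400, 300, 250, 210, 200]

# Shift of origin to (30, 400) and change of scale by g = 5, h = 10.
U = [(x - 30) / 5 for x in X]
V = [(y - 400) / 10 for y in Y]

sum_u = sum(U)                               # -2.0
sum_v = sum(V)                               # 26.0
sum_uv = sum(u * v for u, v in zip(U, V))    # 561.0

r_xy = pearson_r(X, Y)
r_uv = pearson_r(U, V)
print(round(r_xy, 3), round(r_uv, 3))  # 0.976 0.976
```

The transformed values are small integers, which is what makes the short cut attractive for hand calculation.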
This is explained in the following example.
Example: Calculate coefficient of correlation for the following data.
Class interval   Class mark mx   dx = (mx − 1250)/500     f    f·dx   f·dx²
0-500                 250                –2              14    –28     56
500-1000              750                –1              29    –29     29
1000-1500            1250                 0              12      0      0
1500-2000            1750                 1               9      9      9
2000-2500            2250                 2               5     10     20
\[
\sum f \times d_x \times d_y = (-2)(-2)(12) + (-1)(-2)(6) + (-2)(-1)(2) + (-1)(-1)(18) + (-1)(1)(2) + (-1)(2)(1)
\]
\[
\qquad + (1)(-1)(1) + (1)(1)(2) + (1)(2)(1) + (2)(1)(2) + (2)(2)(3)
\]
\[
= 48 + 12 + 4 + 18 - 2 - 2 - 1 + 2 + 2 + 4 + 12 = 97
\]
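Hence,
\[
r = \frac{\sum f\,d_x d_y - \frac{1}{n} \sum (f\,d_x) \sum (f\,d_y)}
         {\sqrt{\sum f\,d_x^2 - \frac{\left(\sum f\,d_x\right)^2}{n}}\,
          \sqrt{\sum f\,d_y^2 - \frac{\left(\sum f\,d_y\right)^2}{n}}}
\]
\[
= \frac{97 - \frac{1}{69} \times (-38)(-47)}
       {\sqrt{114 - \frac{1}{69} \times (-38)^2}\,\sqrt{127 - \frac{1}{69} \times (-47)^2}}
= \frac{71.1159}{9.647 \times 9.746} = 0.76
\]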
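The final step for grouped data depends only on six totals, so it is easy to verify. The following is a minimal Python sketch that takes the totals quoted in the solution (n = 69, Σf·dx·dy = 97, Σf·dx = –38, Σf·dy = –47, Σf·dx² = 114, Σf·dy² = 127) as inputs; the function name `grouped_r` and its parameter names are illustrative.

```python
import math

def grouped_r(n, sum_fdxdy, sum_fdx, sum_fdy, sum_fdx2, sum_fdy2):
    """Correlation coefficient for grouped data from step-deviation totals."""
    numerator = sum_fdxdy - (sum_fdx * sum_fdy) / n
    var_x = sum_fdx2 - sum_fdx ** 2 / n   # n times the variance of dx
    var_y = sum_fdy2 - sum_fdy ** 2 / n   # n times the variance of dy
    return numerator / math.sqrt(var_x * var_y)

# Totals taken from the worked example.
r = grouped_r(n=69, sum_fdxdy=97, sum_fdx=-38, sum_fdy=-47,
              sum_fdx2=114, sum_fdy2=127)
print(round(r, 2))  # 0.76
```

Because r is unchanged by a shift of origin and a positive change of scale, the step deviations dx and dy can be used in place of the class marks themselves.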
6.5.2 INTERPRETATION OF R
The correlation coefficient, r ranges from −1 to 1. A value of 1 implies
that a linear equation describes the relationship between X and Y
perfectly, with all data points lying on a line for which Y increases
as X increases. A value of −1 implies that all data points lie on a line
for which Y decreases as X increases. A value of 0 implies that there
is no linear correlation between the variables.
More generally, note that $(X_i - \bar{X})(Y_i - \bar{Y})$ is positive if and
only if Xi and Yi lie on the same side of their respective means. Thus the
correlation coefficient is positive if Xi and Yi tend to be simultaneously
greater than, or simultaneously less than, their respective means.
The correlation coefficient is negative if Xi and Yi tend to lie on
opposite sides of their respective means.
The coefficient of correlation r lies between –1 and +1 inclusive
of those values.
When r is positive, the variables x and y increase or decrease
together.
r = +1 implies that there is a perfect positive correlation between
variables x and y.
When r is negative, the variables x and y move in the opposite
direction.
Symbolically, ρ = r ± P. E.
ρ = Correlation (coefficient) of the population.
Example: If r = 0.6 and n = 64 find out the probable error of the
coefficient of correlation.
Solution:
\[
P.E. = 0.6745 \times \frac{1 - r^2}{\sqrt{n}}
     = 0.6745 \times \frac{1 - (0.6)^2}{\sqrt{64}}
     = 0.6745 \times \frac{0.64}{8}
     = 0.054
\]
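The probable error formula, P. E. = 0.6745 (1 − r²)/√n, is simple enough to evaluate directly. The sketch below is a minimal Python illustration; the function names are hypothetical, and the use of the absolute value of r in the comparison is an assumption made here so that the rule also covers negative coefficients. The two thresholds (less than P. E., more than 6 times P. E.) are the ones stated in this section.

```python
import math

def probable_error(r, n):
    """Probable error of r: 0.6745 * (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)

def significance(r, n):
    """Apply the rule of thumb: |r| < P.E. or |r| > 6 * P.E."""
    pe = probable_error(r, n)
    if abs(r) < pe:
        return "not significant"
    if abs(r) > 6 * pe:
        return "significant"
    return "inconclusive"

pe = probable_error(0.6, 64)
verdict = significance(0.6, 64)
print(round(pe, 3), verdict)  # 0.054 significant
```

With r = 0.6 and n = 64, six times the probable error is about 0.32, so r comfortably exceeds it.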
15. Correlation coefficient does not change with shifting of
................... i.e. by adding or subtracting any constant from the
two variables (X, Y) correlation coefficient remains same.
16. If the value of r is ................... than P. E., then there is no
evidence of correlation i.e. r is not significant.
17. If r is ................... than 6 times the P. E. ‘r’ is practically certain
i.e. significant.
case during beauty contests. However, in these cases the experts may
rank the candidates. It is then necessary to find out whether the two
sets of ranks are in agreement with each other. This is measured by
Rank Correlation Coefficient. The purpose of computing a correlation
coefficient in such situations is to determine the extent to which the
two sets of ranking are in agreement. The coefficient that is determined
from these ranks is known as Spearman’s rank coefficient, rs.
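This is defined by the following formula:
\[
r_S = 1 - \frac{6 \times \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}
\]
where n is the number of pairs and di is the difference between the two
ranks of the i-th item.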
6.6.1 RANK CORRELATION WHEN RANKS ARE GIVEN
Example: Ranks obtained by a set of ten students in a mathematics
test (variable X) and a physics test (variable Y) are shown below:
Now, n = 10, $\sum_{i=1}^{n} d_i^2 = 50$
Using the formula
\[
r_S = 1 - \frac{6 \times \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}
    = 1 - \frac{6 \times 50}{10(100 - 1)} = 0.697
\]
We can say that there is a high degree of correlation between the
performance in mathematics and physics.
X: 75 88 95 70 60 80 81 50
Y: 120 134 150 115 110 140 142 100
Solution: Let R1 and R2 denote the ranks in X and Y respectively.
X     Y     R1   R2   d = R1 − R2   d²
75    120   5    5         0        0
88    134   2    4        –2        4
95    150   1    1         0        0
70    115   6    6         0        0
60    110   7    7         0        0
80    140   4    3         1        1
81    142   3    2         1        1
50    100   8    8         0        0
                          Σd² =     6
Coefficient of Correlation
\[
\rho = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} = 1 - \frac{6 \times 6}{8(64 - 1)} = +0.93
\]
In this method the biggest item gets the first rank, the next biggest
second rank and so on.
X: 87 22 35 75 37
Y: 29 63 52 46 48
Solution:
X     Y     R1   R2   d = R1 − R2   d²
87    29    1    5        –4        16
22    63    5    1         4        16
35    52    4    2         2         4
75    46    2    4        –2         4
37    48    3    3         0         0
                          Σd² =     40
Coefficient of correlation
\[
\rho = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} = 1 - \frac{6 \times 40}{5 \times 24} = -1
\]
This shows an absolute negative correlation or perfect inverse
correlation.
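The ranking and the rank correlation formula can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names are hypothetical, and the ranking helper assumes that no two values are equal (tied ranks need the correction described separately). As in the examples, the largest value receives rank 1.

```python
def ranks_desc(values):
    """Rank 1 for the largest value; assumes all values are distinct."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

def spearman(xs, ys):
    """Spearman's rank coefficient: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(xs)
    rank_x = ranks_desc(xs)
    rank_y = ranks_desc(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Eight-pair example (pairs as listed in the solution table).
r_eight = spearman([75, 88, 95, 70, 60, 80, 81, 50],
                   [120, 134, 150, 115, 110, 140, 142, 100])

# Five-pair example.
r_five = spearman([87, 22, 35, 75, 37], [29, 63, 52, 46, 48])

print(round(r_eight, 2), r_five)  # 0.93 -1.0
```

Ranking from the smallest value instead would give the same coefficient, provided both variables are ranked in the same direction.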
showing that there are two items with the same 3rd rank and fourth
rank is skipped, then instead of writing 3, we write 3½ for both. Thus
the sum of these ranks which is 7 (3+4 = 3½+3½ = 7) remains same
keeping the mean of ranks unaffected. But in such cases the standard
deviation is affected. Therefore, correction is required for the Rank
Correlation Coefficient. For this, $\sum d_i^2$ is increased by
$\frac{m^3 - m}{12}$ for each tie, where m is the number of items in each
tie. If there is more than one group of items with a common rank, this
correction factor is added once for each group.
Example: Twelve salesmen are ranked for efficiency and length of
service as below:
Salesman                A   B   C   D   E   F   G   H   I   J    K    L
Efficiency (X)          1   2   3   4   4   4   7   8   9   10   11   12
Length of Service (Y)   2   1   5   3   9   7   7   6   4   11   10   11
Find the value of Spearman’s Rank Coefficient.
Solution:
Computations of Spearman’s Rank Correlation as shown below:
Individual   Efficiency (X = xi)   Length of Service (Y = yi)   di = xi – yi   di²
A            1                     2                             –1            1
B            2                     1                              1            1
C            3                     5                             –2            4
D            (4+5+6)/3 = 5         3                              2            4
E            (4+5+6)/3 = 5         9                             –4            16
F            (4+5+6)/3 = 5         (7+8)/2 = 7.5                 –2.5          6.25
G            7                     (7+8)/2 = 7.5                 –0.5          0.25
H            8                     6                              2            4
I            9                     4                              5            25
J            10                    (11+12)/2 = 11.5              –1.5          2.25
K            11                    10                             1            1
L            12                    (11+12)/2 = 11.5               0.5          0.25
Total                                                                          65
Now, n = 12, $\sum_{i=1}^{n} d_i^2 = 65$
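The tied-rank procedure for the twelve salesmen can be sketched as follows. This is a minimal Python illustration, not the text's own worked solution: the function names are hypothetical, tied values share the mean of the positions they occupy, and $\frac{m^3 - m}{12}$ is added to $\sum d_i^2$ once for each group of m tied items, as described above. Rank 1 is given to the smallest number here, since the data are already ranks.

```python
def average_ranks(values):
    """Rank 1 = smallest value; tied values share the mean of their positions."""
    ordered = sorted(values)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1      # first position occupied by v
        count = ordered.count(v)          # number of items tied at v
        ranks.append(first + (count - 1) / 2)
    return ranks

def tie_correction(values):
    """Sum of (m^3 - m) / 12 over every group of m tied items."""
    sizes = [values.count(v) for v in set(values)]
    return sum((m ** 3 - m) / 12 for m in sizes if m > 1)

def spearman_with_ties(xs, ys):
    """Return (sum of d^2, corrected Spearman coefficient)."""
    n = len(xs)
    rank_x = average_ranks(xs)
    rank_y = average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    corrected = sum_d2 + tie_correction(xs) + tie_correction(ys)
    return sum_d2, 1 - 6 * corrected / (n * (n ** 2 - 1))

efficiency = [1, 2, 3, 4, 4, 4, 7, 8, 9, 10, 11, 12]
service = [2, 1, 5, 3, 9, 7, 7, 6, 4, 11, 10, 11]

sum_d2, r_s = spearman_with_ties(efficiency, service)
print(sum_d2, round(r_s, 3))  # 65.0 0.762
```

In this sketch the three tie groups (one of three items, two of two items) add 2 + 0.5 + 0.5 = 3 to the sum of squared differences before the formula is applied.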
educational and aptitude test scores, together with assessment score
by the Personnel department of their ability one year after joining the
company. 1 is a low score and 20 is a high score.
Employee   Educational test   Aptitude test   Assessment by officer
A                 9                17                  12
B                10                14                  14
C                15                12                  16
D                14                13                  15
E                16                10                  17
F                11                15                  10
G                12                12                  11
H                17                16                  18
Rank each set of the data
Calculate appropriate rank correlation coefficients
Solution: Let X denote the score in educational tests, let Y denote the
score in aptitude test and Z denote the assessment by the personnel officer.
Employee   X    Y    Z    Rx   Ry    Rz   d1 = Rx − Rz   d2 = Ry − Rz   d1²    d2²
A          9    17   12   8    1     6         2             –5         4      25
B          10   14   14   7    4     5         2             –1         4      1
C          15   12   16   3    6.5   3         0              3.5       0      12.25
D          14   13   15   4    5     4         0              1         0      1
E          16   10   17   2    8     2         0              6         0      36
F          11   15   10   6    3     8        –2             –5         4      25
G          12   12   11   5    6.5   7        –2             –0.5       4      0.25
H          17   16   18   1    2     1         0              1         0      1
Total                                                                   16     101.5
\[
\rho_{XZ} = 1 - \frac{6 \sum d_1^2}{N(N^2 - 1)} = 1 - \frac{6 \times 16}{8 \times 63} = 0.81
\]
\[
\rho_{YZ} = 1 - \frac{6\left[\sum d_2^2 + \sum \frac{m(m^2 - 1)}{12}\right]}{N(N^2 - 1)}
          = 1 - \frac{6 \times (101.5 + 0.5)}{8 \times 63} = -0.2143
\]
The rank correlation coefficient between educational test and
assessment score is positive and high and therefore high educational
test score will correspond to high ability in performance of the job.
18. The coefficient that is determined from these ranks is known
as ................... rank coefficient, rs.
19. When two or more items have the same rank, a correction has
to be applied to ................... .
Collect the data of marks of all the students of your class of any
two subjects. Convert them into ranks and find the rank correlation
between the two subjects.
\[
r = \pm\sqrt{\pm\frac{2 \times c - n}{n}}
\]
Where, n = total number of pairs.
c = Number of concurrent changes
Example: The data of advertisement expenditure (X) and sales (Y)
of a company for past 10 year period is given below. Determine the
correlation coefficient between these variables and comment on the
correlation.
X 50 50 50 40 30 20 20 15 10 5
Y 700 650 600 500 450 400 300 250 210 200
Solution:
\[
r = \pm\sqrt{\pm\frac{2 \times c - n}{n}} = +\sqrt{+\frac{2 \times 6 - 9}{9}} = 0.577
\]
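The counting behind c = 6 and n = 9 can be made explicit with a small script. The following is a minimal Python sketch under stated assumptions: the function names are illustrative, a deviation is recorded as +1, –1 or 0 according to whether a value rises, falls or stays equal relative to the previous one, and a pair counts as concurrent only when both signs are the same. Here n is the number of pairs of deviations, one fewer than the number of observations.

```python
import math

def sign(value):
    """Return +1, -1 or 0 for a positive, negative or zero change."""
    return (value > 0) - (value < 0)

def concurrent_deviation_r(xs, ys):
    """Return (c, n, r) for the concurrent deviation method."""
    dev_x = [sign(b - a) for a, b in zip(xs, xs[1:])]
    dev_y = [sign(b - a) for a, b in zip(ys, ys[1:])]
    n = len(dev_x)                                  # pairs of deviations
    c = sum(1 for a, b in zip(dev_x, dev_y) if a == b)
    ratio = (2 * c - n) / n
    s = 1 if ratio >= 0 else -1                     # same sign inside and outside the root
    return c, n, s * math.sqrt(s * ratio)

X = [50, 50, 50, 40, 30, 20, 20, 15, 10, 5]
Y = [700, 650, 600, 500, 450, 400, 300, 250, 210, 200]

c, n, r = concurrent_deviation_r(X, Y)
print(c, n, round(r, 3))  # 6 9 0.577
```

Three of the nine changes in X are zero while Y falls, so those pairs are not concurrent, which is why c is 6 rather than 9.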
Collect the data of heights and weights of all the boys in your class.
Find the correlation coefficient using concurrent deviation method
between the variables height and weight.
1. Sign ± is selected to make the value of $\frac{2 \times c - n}{n}$ positive. The
same sign is used outside the radical.
2. This method does not give strength of correlation. The
method is ad hoc and used only to reduce the efforts of tedious
calculations.
6.8 SUMMARY
In this chapter the concept of correlation or the association
between two variables has been discussed. A scatter plot of the
variables may suggest that the two variables are related but
the value of the Pearson correlation coefficient r quantifies this
association.
Correlation is a degree of linear association between two random
variables. In these two variables, we do not differentiate them
as dependent and independent variables. It may be the case
that one is the cause and other is an effect i.e. independent and
dependent variables respectively. On the other hand, both may
be dependent variables on a third variable.
In business, correlation analysis often helps managers to take
decisions by estimating the effects of changing the values of the
decision variables like promotion, advertising, price, production
processes, on the objective parameters like costs, sales, market
share, consumer satisfaction, competitive price. The decision
becomes more objective by removing subjectivity to a certain
extent.
The correlation coefficient r may assume values between –1 and
1. The sign indicates whether the association is direct (+ve) or
inverse (-ve). A numerical value of r equal to unity indicates
perfect association while a value of zero indicates no association.
The correlation is said to be positive when the increase
coefficient, coefficient of determination, Yule’s coefficient of
association, coefficient of colligation, etc.
The correlation coefficient measures the degree of association
between two variables X and Y.
Karl Pearson’s formula for correlation coefficient is given as,
\[
r = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}
\]
\[
r = \frac{\frac{1}{n}\sum (X - \bar{X})(Y - \bar{Y})}{\sigma_X \sigma_Y}
\]
The purpose of computing a correlation coefficient in such
situations is to determine the extent to which the two sets of
ranking are in agreement. The coefficient that is determined
from these ranks is known as Spearman’s rank coefficient, rs.
This is defined by the following formula:
\[
r_S = 1 - \frac{6 \times \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}
\]
Where, n = Number of observation pairs
di = Xi – Yi
Xi = Rank of the i-th item on variable X and Yi = rank of the i-th item on variable Y
Although the concurrent deviation method is effective in giving
Linear Correlation: When the amount of change in one
variable tends to keep a constant ratio to the amount of change
in the other variable, then the correlation is said to be linear.
Non-linear Correlation: When the amount of change in one variable
does not bear a constant ratio to the amount of change in the
other variable, then the correlation is said to be non-linear.
Coefficient of Correlation: The correlation coefficient
measures the degree of association between two variables X
and Y.
Scatter Diagram: The pattern of points obtained by plotting
the observed points is known as a scatter diagram.
Advertisement cost (₹ '000)   39  65  62  90  82  75  25  98  36  78
Sales (₹ lakh)                47  53  58  86  62  68  60  91  51  84
2.
Marks in Marks in
Statistics Economics
Mean 55 48
Standard Deviation 4 5
The correlation coefficient between marks in statistics and
economics is 0.8 given in table above. Estimate the marks in
statistics of a student who scored 50 marks in economics.
3. Calculate coefficient of correlation between X and Y as per the
data given below:
X 14 16 20 22 28 30 34 40 45
Y 97 89 68 65 56 50 37 18 12
4. Ten competitors in a beauty contest are ranked by three judges
in the following order. Determine which pair of judges has the
nearest approach to common taste in beauty.
Judge 1: 1 6 5 10 3 2 4 9 7 8
Judge 2: 3 5 8 4 7 10 2 1 6 9
Judge 3: 6 4 9 8 1 2 3 10 5 7
5. Ten candidates obtained the following marks in examinations in
Statistics and Mathematics. Find the rank correlation coefficient
to determine whether these results support the suggestion that
ability in one subject is associated with ability in the other.
Candidate A B C D E F G H I J
Statistics 40 65 61 49 53 42 68 57 58 46
Maths 51 58 67 55 76 45 69 56 73 63
16. Less
17. More
Rank Correlation Method 18. Spearman’s
19. $\sum d_i^2$
6. Refer Section 6.5
Karl Pearson’s formula for correlation coefficient is given as,
\[
r = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}
\]
\[
r = \frac{\frac{1}{n}\sum (X - \bar{X})(Y - \bar{Y})}{\sigma_X \sigma_Y}
\]
Where r is the ‘Correlation Coefficient’ or ‘Product Moment
Correlation Coefficient’ between X and Y. sX and sY are the
standard deviations of X and Y respectively. ‘n’ is the number of
the pairs of variables X and Y in the given data.
7. Refer Section 6.5.1
The assumptions underlying Karl Pearson’s correlation
coefficient are as follows:
(a) Your data on both variables is measured on either an Interval
Scale or a Ratio Scale.
(b) The traits you are measuring are normally distributed in the
population.
8. Refer Section 6.5.2
The correlation coefficient, r ranges from −1 to 1. A value
\[
r_S = 1 - \frac{6 \times \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}
\]
the number of items that increase or decrease or remains equal
concurrently and denote as c. The correlation coefficient is then
calculated as,
\[
r = \pm\sqrt{\pm\frac{2 \times c - n}{n}}
\]
Where, n = total number of pairs.
c = Number of concurrent changes
5. 0.6
testing beauty because the coefficient of correlation is highest
between them.
6.11 SUGGESTED READINGS FOR REFERENCE
SUGGESTED READINGS
Gupta, S.P. and Gupta, M.P., Business Statistics, Sultan Chand &
Sons, New Delhi, 1987
Loomba, M.P., Management – A Quantitative Perspective,
MacMillan Publishing Company, New York, 1978.
Levin, R.I., Statistics for Management, Prentice-Hall of India,
New Delhi, 1979
Shenoy, G.V., Srivastava, U.K. and Sharma, S.C., Quantitative
Techniques for Managerial Decision Making, Wiley Eastern, New
Delhi, 1985
Venkata Rao, K., Management Science, McGraw-Hill Book
Company, Singapore, 1986.
Bhardwaj, R.S., Business Statistics, 2nd Edition, Excel Books,
New Delhi.
Kothari, C.R., Quantitative Techniques, Vikas Publication.
E-REFERENCES
http://www.pinkmonkey.com/
https://www.tutorsland.com/
http://www.jstor.org/