
STATISTICS FOR MANAGEMENT

Presentation

By

Prof. (Dr.) T. Muthukumar, M.Sc; MCA; MBA; PhD.


Professor (Business Analytics) & Associate Dean (Academic)
XIME-Bangalore
REGRESSION ANALYSIS
• Regression: “The general process of predicting
one variable from another by statistical means,
using previous data”
• “A method that uses past data to estimate the
relationship between two variables”
• Regression Line: “A line fitted to a set of data
points to estimate the relationship between two
variables”
• Multiple Regression: “The statistical process
by which several variables are used to predict
another variable”
• Least-Squares Method: “A technique for
fitting a straight line through a set of points in
such a way that the sum of the squared
vertical distances from the n points to the line
is minimized”
• Linear Regression Model: “A regression model
that gives a straight-line relationship between
two variables”
• Formula, Type I: The regression lines are
1. x on y: (x – ¯x) = γ * (σ(x) / σ(y)) * (y – ¯y)
2. y on x: (y – ¯y) = γ * (σ(y) / σ(x)) * (x – ¯x)

• Type II: Actual mean is used
1. x on y: (x – ¯x) = b’ * (y – ¯y)
2. y on x: (y – ¯y) = b * (x – ¯x)
where b’ = ⅀((x – ¯x) * (y – ¯y)) / ⅀(y – ¯y)^2
and b = ⅀((x – ¯x) * (y – ¯y)) / ⅀(x – ¯x)^2

• Type III: Assumed mean is used
(i) (y – ¯y) = b * (x – ¯x)
(ii) (x – ¯x) = b’ * (y – ¯y)
where dx = x – A and dy = y – A’ are the deviations from the assumed means A and A’, and
b = (n * ⅀(dx*dy) – ⅀(dx) * ⅀(dy)) / (n * ⅀(dx^2) – (⅀(dx))^2)
b’ = (n * ⅀(dx*dy) – ⅀(dx) * ⅀(dy)) / (n * ⅀(dy^2) – (⅀(dy))^2)
• Ex.1: The following results were worked out from
scores in Statistics and Mathematics in a certain
examination.
-----------------------------------------------------------------
         Score in Statistics (x)    Score in Mathematics (y)
-----------------------------------------------------------------
Mean              40                         48
S.D.              10                         15
-----------------------------------------------------------------
• Karl Pearson’s correlation coefficient between x
and y = 0.42. Find both the regression lines. Use
these regression lines to estimate the value of y for
x = 50 and the value of x for y = 30.
• Ans: The following data are given
• ¯x = 40, ¯y = 48, σ(x) = 10, σ(y) = 15, γ = 0.42
• The regression line of x on y is
• (x – ¯x) = γ * (σ(x) / σ(y)) * (y – ¯y)
• (x – 40) = 0.42 * (10 / 15) * (y – 48)
• x = 0.28y + 26.56
• The regression line of y on x is
• (y – ¯y) = γ * (σ(y) / σ(x)) * (x – ¯x)
• (y – 48) = 0.42 * (15 / 10) * (x – 40)
• y = 0.63x + 22.80
• When y = 30, the line of x on y gives x = 0.28 * 30 + 26.56
• x = 34.96
• When x = 50, the line of y on x gives y = 0.63 * 50 + 22.80
• y = 54.3
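
The Type I formulas can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the function name `regression_lines` is an illustrative assumption. It builds both regression lines from the two means, the two standard deviations and the correlation coefficient, then evaluates them at x = 50 and y = 30.

```python
# Regression lines from summary statistics (means, SDs, correlation).
def regression_lines(mean_x, mean_y, sd_x, sd_y, r):
    """Return ((slope, intercept) of y on x, (slope, intercept) of x on y)."""
    b_yx = r * sd_y / sd_x          # slope of y on x
    b_xy = r * sd_x / sd_y          # slope of x on y
    a_yx = mean_y - b_yx * mean_x   # intercept of y on x
    a_xy = mean_x - b_xy * mean_y   # intercept of x on y
    return (b_yx, a_yx), (b_xy, a_xy)

# Data of Ex.1: means 40 and 48, S.D. 10 and 15, correlation 0.42.
(y_slope, y_icpt), (x_slope, x_icpt) = regression_lines(40, 48, 10, 15, 0.42)

y_at_50 = y_slope * 50 + y_icpt   # estimate of y when x = 50
x_at_30 = x_slope * 30 + x_icpt   # estimate of x when y = 30

print("y on x:", round(y_slope, 3), round(y_icpt, 3), "-> y(50) =", round(y_at_50, 3))
print("x on y:", round(x_slope, 3), round(x_icpt, 3), "-> x(30) =", round(x_at_30, 3))
```

Note that each estimate uses its own line: y is predicted from the line of y on x, and x from the line of x on y.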
• Ex.2: From the following data, obtain the two
regression equations
---------------------------------------------
x    12    4   20    8   16
y    18   22   10   16   14
---------------------------------------------
• Ans: (Actual mean is used)
• ¯x = (12+4+20+8+16) / 5 = 60 / 5 = 12
• ¯y = (18+22+10+16+14) / 5 = 80 / 5 = 16
-------------------------------------------------------------------------
  x    y   x–¯x   y–¯y   (x–¯x)^2   (y–¯y)^2   (x–¯x)(y–¯y)
-------------------------------------------------------------------------
 12   18     0     2        0          4            0
  4   22    -8     6       64         36          -48
 20   10     8    -6       64         36          -48
  8   16    -4     0       16          0            0
 16   14     4    -2       16          4           -8
-------------------------------------------------------------------------
Total                     160         80         -104
-------------------------------------------------------------------------
• b’ = ⅀((x – ¯x) * (y – ¯y)) / ⅀(y – ¯y)^2
• b’ = -104 / 80 = -1.3
• b = ⅀((x – ¯x) * (y – ¯y)) / ⅀(x – ¯x)^2
• b = -104 / 160 = -0.65
• Regression equation of x on y is
• (x – ¯x) = b’ * (y – ¯y)
• (x – 12) = -1.3 * (y – 16)
• x = 32.8 – 1.3y
• Regression equation of y on x is
• (y – ¯y) = b * (x – ¯x)
• (y – 16) = -0.65 * (x – 12)
• y = 23.8 – 0.65x
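
The actual-mean (Type II) calculation is easy to reproduce in code. This is a small Python sketch under the assumption that the data are held in plain lists; the helper name `regression_coefficients` is hypothetical and not from the slides.

```python
# Actual-mean method: both regression coefficients from raw data.
def regression_coefficients(xs, ys):
    """Return (mean_x, mean_y, b, b_prime) where b is the slope of y on x
    and b_prime is the slope of x on y."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return mean_x, mean_y, sxy / sxx, sxy / syy

# Data of Ex.2.
xs = [12, 4, 20, 8, 16]
ys = [18, 22, 10, 16, 14]
mean_x, mean_y, b, b_prime = regression_coefficients(xs, ys)

# Intercepts follow from the fact that both lines pass through the means.
y_intercept = mean_y - b * mean_x          # y = y_intercept + b * x
x_intercept = mean_x - b_prime * mean_y    # x = x_intercept + b_prime * y

print("y on x: y =", round(y_intercept, 2), "+", round(b, 2), "* x")
print("x on y: x =", round(x_intercept, 2), "+", round(b_prime, 2), "* y")
```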

• Ex.3: The following table gives the height x (in
inches) and weight y of 10 students aged 18 years
selected by random sampling from a population.
Calculate the weight of a student whose height is
71.
----------------------------------------------------------------
x =  61   68   68   64   65   70   63   62   64   67
y = 112  123  130  115  110  125  100  113  116  123
----------------------------------------------------------------
• Ans: (Assumed mean is used)
• It is required to calculate the weight of a student when
the height is given. The required regression line is
weight on height,
• ie, the regression line of y on x
• Let the assumed means be A = 70 (for x) and A’ = 125 (for y)
--------------------------------------------------------------------------
  x     y    dx=x–A   dy=y–A’   (dx)^2   (dy)^2   dx*dy
--------------------------------------------------------------------------
 61   112     -9       -13        81      169      117
 68   123     -2        -2         4        4        4
 68   130     -2         5         4       25      -10
 64   115     -6       -10        36      100       60
 65   110     -5       -15        25      225       75
 70   125      0         0         0        0        0
 63   100     -7       -25        49      625      175
 62   113     -8       -12        64      144       96
 64   116     -6        -9        36       81       54
 67   123     -3        -2         9        4        6
--------------------------------------------------------------------------
Total        -48       -83       308     1377      577
--------------------------------------------------------------------------
• b = (n * ⅀(dx*dy) – ⅀(dx) * ⅀(dy)) / (n * ⅀(dx^2) – (⅀(dx))^2)
• = (10 * 577 – (-48) * (-83)) / (10 * 308 – (-48)^2)
• = 1786 / 776 = 2.30
• ¯x = A + ⅀(dx) / n = 70 + (-48) / 10 = 65.2
• ¯y = A’ + ⅀(dy) / n = 125 + (-83) / 10 = 116.7
• The regression equation of y on x is
• y – ¯y = b * (x – ¯x)
• y – 116.7 = 2.30 * (x – 65.2)
• y = 2.30x – 33.26
• When x = 71,
• y = 2.30 * 71 – 33.26
• y = 130.04
• The estimated weight of a student whose height is 71 inches is about 130.04
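
Hand arithmetic with assumed means is error-prone, so a numerical cross-check is useful. The sketch below is an illustrative Python version of the Type III formula; the function name `slope_assumed_mean` and the second pair of assumed means (65, 115) are assumptions made only for the demonstration. It also shows that the slope does not depend on which assumed means are chosen.

```python
# Assumed-mean method for the slope of y on x.
def slope_assumed_mean(xs, ys, a_x, a_y):
    """Return (b, mean_x, mean_y) using deviations from assumed means a_x, a_y."""
    n = len(xs)
    dx = [x - a_x for x in xs]
    dy = [y - a_y for y in ys]
    s_dx, s_dy = sum(dx), sum(dy)
    s_dxdy = sum(p * q for p, q in zip(dx, dy))
    s_dx2 = sum(p * p for p in dx)
    b = (n * s_dxdy - s_dx * s_dy) / (n * s_dx2 - s_dx ** 2)
    mean_x = a_x + s_dx / n   # actual mean recovered from the assumed mean
    mean_y = a_y + s_dy / n
    return b, mean_x, mean_y

# Height and weight data of Ex.3.
heights = [61, 68, 68, 64, 65, 70, 63, 62, 64, 67]
weights = [112, 123, 130, 115, 110, 125, 100, 113, 116, 123]

b, mean_x, mean_y = slope_assumed_mean(heights, weights, 70, 125)
predicted = mean_y + b * (71 - mean_x)   # estimated weight at height 71

# A different (arbitrary) choice of assumed means gives the same slope.
b_other, _, _ = slope_assumed_mean(heights, weights, 65, 115)

print(round(b, 4), round(mean_x, 1), round(mean_y, 1), round(predicted, 2))
```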

• Ex.4: From the following data, find the regression
equation of x on y
• x = 6 2 10 4 8
• y = 9 11 5 8 7
• Ans.4: x = 16.4 – 1.3y

• Ex.5: Find the equation of the regression line of y
on x if the observations (xi, yi) are the following: (1,4),
(2,8), (3,2), (4,12), (5,10), (6,14), (7,6), (8,6), (9,18).
• Ans.5: y = x + 3.8889
CURVE FITTING-METHOD OF LEAST SQUARES

• Fitting of a line of the form Y = a + bX
• Fitting of a curve of the form Y = a + bX + cX^2
• Fitting of a curve of the form Y = aX^b
• Fitting of a curve of the form Y = ab^X
• Type I: Fitting of a straight line Y = a + bX
• Formula (normal equations):
• Sum(yi) = n*a + b*Sum(xi) --- (i)
• Sum(xi*yi) = a*Sum(xi) + b*Sum(xi^2) --- (ii)

• Ex.1: Fit a straight line to the following data
• X: 1 2 3 4 6 8
• Y: 2.4 3 3.6 4 5 6
• Ans.1: Let the line be Y = a + bX
------------------------------------------------------
  X      Y     X^2     X*Y
------------------------------------------------------
  1     2.4      1     2.4
  2     3.0      4     6.0
  3     3.6      9    10.8
  4     4.0     16    16.0
  6     5.0     36    30.0
  8     6.0     64    48.0
------------------------------------------------------
 24    24.0    130   113.2
------------------------------------------------------
• Using normal equation (i),
• Sum(yi) = n*a + b*Sum(xi)
• ie, 24 = 6a + 24b
• a + 4b = 4 … (i)
• and normal equation (ii),
• Sum(xi*yi) = a*Sum(xi) + b*Sum(xi^2)
• ie, 113.2 = 24a + 130b
• 24a + 130b = 113.2 … (ii)
• Solving (i) and (ii), we get a = 1.976, b = 0.506
• Therefore, Y = 1.976 + 0.506X
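
The two normal equations form a 2 x 2 linear system, so they can be solved in closed form. Below is a minimal Python sketch (the name `fit_line` is an assumption for illustration) that forms the four sums and solves for a and b by Cramer's rule.

```python
# Straight-line fit Y = a + bX via the two normal equations.
def fit_line(xs, ys):
    """Return (a, b) minimising the sum of squared vertical distances."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx           # determinant of the normal equations
    a = (sy * sxx - sx * sxy) / det   # intercept
    b = (n * sxy - sx * sy) / det     # slope
    return a, b

# Data of Ex.1 (straight line).
a, b = fit_line([1, 2, 3, 4, 6, 8], [2.4, 3, 3.6, 4, 5, 6])
print(round(a, 3), round(b, 3))
```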
• Ex.2: Fit a straight line to the following data
• X: 0 1 2 3 4
• Y: 1 1.8 3.3 4.5 6.3
• Ans.2: Y = 0.72 + 1.33X
• Type II: Fitting of a second-degree parabola
• Y = a + bX + cX^2
• The normal equations are
• Sum(yi) = n*a + b*Sum(xi) + c*Sum(xi^2)
• Sum(xi*yi) = a*Sum(xi) + b*Sum(xi^2) + c*Sum(xi^3)
• Sum(xi^2*yi) = a*Sum(xi^2) + b*Sum(xi^3) + c*Sum(xi^4)

• Ex.1: Fit a parabola of second degree to the
following data
• X: 0 1 2 3 4
• Y: 1 1.8 1.3 2.5 6.3
• Ans.1: Let Y = a + bX + cX^2 be the second-degree
parabola
-------------------------------------------------------------------------
  X     Y     X^2    X^3    X^4    X*Y    X^2*Y
-------------------------------------------------------------------------
  0    1.0      0      0      0    0.0      0.0
  1    1.8      1      1      1    1.8      1.8
  2    1.3      4      8     16    2.6      5.2
  3    2.5      9     27     81    7.5     22.5
  4    6.3     16     64    256   25.2    100.8
-------------------------------------------------------------------------
 10   12.9     30    100    354   37.1    130.3
-------------------------------------------------------------------------
• Using the normal equations,
• 12.9 = 5a + 10b + 30c … (i)
• 37.1 = 10a + 30b + 100c … (ii)
• 130.3 = 30a + 100b + 354c … (iii)
• Solving equations (i), (ii) and (iii), we get
• a = 1.42, b = -1.07, c = 0.55
• Therefore, Y = 1.42 – 1.07X + 0.55X^2
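
The three normal equations can also be assembled and solved programmatically. The following Python sketch is one possible way to do it, using a small Gauss-Jordan solver written for this illustration; the names `solve_linear` and `fit_parabola` are hypothetical.

```python
# Second-degree parabola Y = a + bX + cX^2 via the three normal equations.
def solve_linear(matrix, rhs):
    """Solve a small square system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [v - factor * p for v, p in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def fit_parabola(xs, ys):
    """Return [a, b, c] for the least-squares parabola."""
    s = [sum(x ** k for x in xs) for k in range(5)]                      # s[k] = Sum(x^k), s[0] = n
    t = [sum((x ** k) * y for x, y in zip(xs, ys)) for k in range(3)]    # t[k] = Sum(x^k * y)
    matrix = [[s[0], s[1], s[2]],
              [s[1], s[2], s[3]],
              [s[2], s[3], s[4]]]
    return solve_linear(matrix, t)

# Data of Ex.1 (parabola).
a, b, c = fit_parabola([0, 1, 2, 3, 4], [1, 1.8, 1.3, 2.5, 6.3])
print(round(a, 2), round(b, 2), round(c, 2))
```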

• Type III: Fitting of a power curve Y = aX^b
• Taking logarithms gives the straight line U = A + b*V,
• where U = log Y, V = log X, A = log a
• The normal equations are
• Sum(U) = n*A + b*Sum(V) … (i)
• Sum(U*V) = A*Sum(V) + b*Sum(V^2) … (ii)
• Ex.1: Fit a power curve of the form Y =
aX^b to the following data
• X: 1 2 3 4 5 6 7 8
• Y: 1.0 1.2 1.8 2.5 3.6 4.7 6.6 9.1
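
The log transformation reduces the power-curve fit to the straight-line case. The slides give no worked answer for this exercise, so the Python sketch below only illustrates the procedure; the function name `fit_power_curve` and the use of base-10 logarithms are assumptions (any base gives the same a and b).

```python
import math

# Power curve Y = a * X^b fitted as the straight line U = A + b*V,
# with U = log Y, V = log X, A = log a.
def fit_power_curve(xs, ys):
    """Return (a, b). All xs and ys must be positive."""
    vs = [math.log10(x) for x in xs]
    us = [math.log10(y) for y in ys]
    n = len(xs)
    sv, su = sum(vs), sum(us)
    svv = sum(v * v for v in vs)
    svu = sum(v * u for v, u in zip(vs, us))
    b = (n * svu - sv * su) / (n * svv - sv * sv)   # slope in log-log space
    log_a = (su - b * sv) / n                       # intercept A = log a
    return 10 ** log_a, b

# Data of the exercise above.
a, b = fit_power_curve([1, 2, 3, 4, 5, 6, 7, 8],
                       [1.0, 1.2, 1.8, 2.5, 3.6, 4.7, 6.6, 9.1])
print("a =", round(a, 3), "b =", round(b, 3))
```

Because the fit is done on the logarithms, it minimises squared errors in log Y rather than in Y itself.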

• Ex.2: Fit a power curve of the form Y = aX^b
• X: 2 3 4 5 6
• Y: 144 172.8 207.4 248.8 298.6

• Type IV: Fitting of an exponential curve Y = ab^X
• Taking logarithms gives U = A + B*X,
• where U = log Y, A = log a, B = log b
• The normal equations are
• Sum(U) = n*A + B*Sum(X)
• Sum(X*U) = A*Sum(X) + B*Sum(X^2)
• a = Antilog(A)
• b = Antilog(B)
• Ex.1: Fit an exponential curve of the form Y =
ab^X to the following data
• X: 1 2 3 4 5 6 7 8
• Y: 1.0 1.2 1.8 2.5 3.6 4.7 6.6 9.1
• Ans:
------------------------------------------------------------------
  X     Y     U=logY     X*U      X^2
------------------------------------------------------------------
  1    1.0    0.0000    0.0000      1
  2    1.2    0.0792    0.1584      4
  3    1.8    0.2553    0.7659      9
  4    2.5    0.3979    1.5916     16
  5    3.6    0.5563    2.7815     25
  6    4.7    0.6721    4.0326     36
  7    6.6    0.8195    5.7365     49
  8    9.1    0.9590    7.6720     64
------------------------------------------------------------------
 36   30.5    3.7393   22.7385    204
------------------------------------------------------------------
• The normal equations are
• 3.7393 = 8A + 36B … (i)
• 22.7385 = 36A + 204B … (ii)
• Solving (i) and (ii), B = 0.1408, A = -0.1660
• b = Antilog(B) = Antilog(0.1408) = 1.38
• a = Antilog(A) = Antilog(-0.1660) = 0.68
• Therefore, Y = 0.68(1.38)^X
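
The same steps can be sketched in Python. This is an illustrative version, assuming base-10 logarithms as in the table above; the function name `fit_exponential` is hypothetical.

```python
import math

# Exponential curve Y = a * b^X fitted as the straight line U = A + B*X,
# with U = log Y, A = log a, B = log b.
def fit_exponential(xs, ys):
    """Return (a, b). All ys must be positive."""
    us = [math.log10(y) for y in ys]
    n = len(xs)
    sx, su = sum(xs), sum(us)
    sxx = sum(x * x for x in xs)
    sxu = sum(x * u for x, u in zip(xs, us))
    big_b = (n * sxu - sx * su) / (n * sxx - sx * sx)   # B = log b
    big_a = (su - big_b * sx) / n                       # A = log a
    return 10 ** big_a, 10 ** big_b                     # antilogs

# Data of Ex.1 (exponential curve).
a, b = fit_exponential([1, 2, 3, 4, 5, 6, 7, 8],
                       [1.0, 1.2, 1.8, 2.5, 3.6, 4.7, 6.6, 9.1])
print(round(a, 2), round(b, 2))
```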

• Ex.2: Fit an exponential curve of the form Y = ab^X
• X: 2 3 4 5 6
• Y: 144 172.8 207.4 248.8 298.6
• Ans.2: Y = 100(1.2)^X approximately (the data agree with 100 * 1.2^X to the rounding shown)
CORRELATION ANALYSIS
• Correlation Coefficient
• Simple Correlation
• Partial Correlation
• Multiple Correlation
• Rank Correlation
• Correlation: “If the change in one
variable affects a change in the other variable, the
variables are said to be correlated”
• Direct or Positive Correlation: “If the two
variables deviate in the same direction, ie if the
increase (or decrease) in one results in a
corresponding increase (or decrease) in the other,
correlation is said to be direct or positive”
• Inverse or Negative Correlation: “If the two
variables constantly deviate in opposite
directions, ie if an increase (or decrease) in one results in a
corresponding decrease (or increase) in the other,
correlation is said to be inverse or negative”
• Examples:
• 1. Positive: The heights and weights of a group
of persons.
• 2. Positive: Income and expenditure.
• 3. Negative: The price and demand of a commodity.
• Perfect Correlation: “The correlation is said to be
perfect if the deviation in one variable is followed
by a corresponding and proportional deviation in
the other”
• Simple Correlation:
• Scatter Diagram Method
• Graphic Method
• Karl Pearson’s Coefficient of Correlation
• Concurrent Deviation Method
• Method of Least Squares
• Karl Pearson’s Coefficient of Correlation
• Type I: Raw Score Method
• The correlation coefficient between two random
variables X and Y, usually denoted by G(x,y) or
simply Gxy, is a numerical measure of the linear
relationship between them and is defined as:
• G(x,y) = Cov(X,Y) / (Sig(x) * Sig(y))
• If (xi, yi), i = 1, 2, 3, …, n is the bivariate distribution,
then
• Cov(X,Y) = (1/n)*Sum(xi*yi) – (¯x)*(¯y)
• Sig(x) = Sqrt((1/n)*Sum(xi^2) – (¯x)^2)
• Sig(y) = Sqrt((1/n)*Sum(yi^2) – (¯y)^2)
• ¯x = Sum(xi)/n, ¯y = Sum(yi)/n
• Type II: Actual Mean Method
• G = Sum(x*y) / Sqrt(Sum(x^2) * Sum(y^2))
• where x = X – ¯X, y = Y – ¯Y
• Ex.1: Calculate the correlation coefficient for the
following heights (in inches) of fathers (X) and
their sons (Y):
• X: 65 66 67 67 68 69 70 72
• Y: 67 68 65 68 72 72 69 71
• Ans: Calculation of the correlation coefficient
-----------------------------------------------------------
  X     Y     X^2     Y^2     X*Y
-----------------------------------------------------------
 65    67    4225    4489    4355
 66    68    4356    4624    4488
 67    65    4489    4225    4355
 67    68    4489    4624    4556
 68    72    4624    5184    4896
 69    72    4761    5184    4968
 70    69    4900    4761    4830
 72    71    5184    5041    5112
-----------------------------------------------------------
544   552   37028   38132   37560
-----------------------------------------------------------
• ¯x = Sum(x)/n = 544/8 = 68
• ¯y = Sum(y)/n = 552/8 = 69
• G(x,y) = Cov(X,Y) / (Sig(x) * Sig(y))
• = (Sum(x*y)/n – (¯x)*(¯y)) /
(Sqrt(Sum(x^2)/n – (¯x)^2) * Sqrt(Sum(y^2)/n – (¯y)^2))
• = (37560/8 – 68*69) /
(Sqrt(37028/8 – 68^2) * Sqrt(38132/8 – 69^2))
• = 0.603
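
The raw-score formula translates directly into code. The following Python sketch is illustrative only (the function name `pearson_raw_score` is an assumption) and reproduces the father and son heights calculation.

```python
import math

# Karl Pearson's coefficient by the raw-score method:
# G = Cov(X, Y) / (Sig(x) * Sig(y)), using population (divide-by-n) moments.
def pearson_raw_score(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y
    sig_x = math.sqrt(sum(x * x for x in xs) / n - mean_x ** 2)
    sig_y = math.sqrt(sum(y * y for y in ys) / n - mean_y ** 2)
    return cov / (sig_x * sig_y)

# Heights of fathers (X) and sons (Y) from Ex.1.
fathers = [65, 66, 67, 67, 68, 69, 70, 72]
sons = [67, 68, 65, 68, 72, 72, 69, 71]
r = pearson_raw_score(fathers, sons)
print(round(r, 3))
```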

• Ex.2: Calculate the correlation between the variables X
and Y
• X: 20 16 12 8 4
• Y: 22 14 4 12 8
• Ans:
------------------------------------------------------------
  X     Y    X^2    Y^2    X*Y
------------------------------------------------------------
 20    22    400    484    440
 16    14    256    196    224
 12     4    144     16     48
  8    12     64    144     96
  4     8     16     64     32
------------------------------------------------------------
 60    60    880    904    840
------------------------------------------------------------
• ¯x = Sum(x)/n = 60/5 = 12
• ¯y = Sum(y)/n = 60/5 = 12
• G(x,y) = Cov(X,Y) / (Sig(x) * Sig(y))
• = (Sum(x*y)/n – (¯x)*(¯y)) /
(Sqrt(Sum(x^2)/n – (¯x)^2) * Sqrt(Sum(y^2)/n – (¯y)^2))
• = (840/5 – 12*12) /
(Sqrt(880/5 – 12^2) * Sqrt(904/5 – 12^2))
• = 0.7
• Ex.3: Calculate the coefficient of correlation
between X and Y for the following data
• X: 15 16 17 17 18 20 10
• Y: 12 17 15 16 12 15 11
• Ans: 0.53

• Type II: Deviation Score Method (Actual Mean
Method)
• Ex.1: Calculate Karl Pearson’s coefficient of
correlation from the following data
• Year:       1985 1986 1987 1988 1989 1990 1991 1992
• Production:  100  102  104  107  105  112  103   99
• Unemployed:   15   12   13   11   12   12   19   26
• Ans: Calculation of Karl Pearson’s correlation
coefficient
----------------------------------------------------------------------------
Year    X     Y    x=X–¯X   y=Y–¯Y    x^2    y^2     xy
----------------------------------------------------------------------------
1985   100    15     -4        0       16      0      0
1986   102    12     -2       -3        4      9      6
1987   104    13      0       -2        0      4      0
1988   107    11      3       -4        9     16    -12
1989   105    12      1       -3        1      9     -3
1990   112    12      8       -3       64      9    -24
1991   103    19     -1        4        1     16     -4
1992    99    26     -5       11       25    121    -55
----------------------------------------------------------------------------
Total  832   120      0        0      120    184    -92
----------------------------------------------------------------------------
• ¯X = Sum(X) / N = 832 / 8 = 104
• ¯Y = Sum(Y) / N = 120 / 8 = 15
• G(x,y) = Sum(xy) / (Sqrt(Sum(x^2)) * Sqrt(Sum(y^2)))
• = -92 / (Sqrt(120) * Sqrt(184))
• = -0.619
• The correlation between the index of production and
the number unemployed is negative.

• Type III: Assumed Mean Method
• G = (N*Sum(dx*dy) – Sum(dx)*Sum(dy)) /
(Sqrt(N*Sum(dx^2) – (Sum(dx))^2) *
Sqrt(N*Sum(dy^2) – (Sum(dy))^2))
• where dx = X – A and dy = Y – A’ are deviations from the assumed means A and A’
• Ex.1: A company manufactures different types of
electrical appliances. It has been using radio for
advertising its products. The following table shows
the amount of radio time (X, in minutes) and the
number of electrical appliances sold (Y) over the
last six weeks.
• X: 25 18 32 21 35 29
• Y: 16 11 20 15 26 28
• Calculate the coefficient of correlation between the
two series.
• Ans.1: Taking assumed means A = 25 and A’ = 20,
----------------------------------------------------------------------
  X     Y    dx=X–A   (dx)^2   dy=Y–A’   (dy)^2   dx*dy
----------------------------------------------------------------------
 25    16      0        0        -4        16        0
 18    11     -7       49        -9        81       63
 32    20      7       49         0         0        0
 21    15     -4       16        -5        25       20
 35    26     10      100         6        36       60
 29    28      4       16         8        64       32
----------------------------------------------------------------------
160   116     10      230        -4       222      175
----------------------------------------------------------------------
• G = 0.84
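
The assumed-mean formula can be sketched in Python as follows. The function name `pearson_assumed_mean` is hypothetical, and the second call with assumed means of zero is included only to illustrate that the coefficient does not depend on the choice of A and A’.

```python
import math

# Karl Pearson's coefficient by the assumed-mean method.
def pearson_assumed_mean(xs, ys, a_x, a_y):
    n = len(xs)
    dx = [x - a_x for x in xs]   # deviations from the assumed mean of X
    dy = [y - a_y for y in ys]   # deviations from the assumed mean of Y
    s_dx, s_dy = sum(dx), sum(dy)
    s_dxdy = sum(p * q for p, q in zip(dx, dy))
    s_dx2 = sum(p * p for p in dx)
    s_dy2 = sum(q * q for q in dy)
    numerator = n * s_dxdy - s_dx * s_dy
    denominator = math.sqrt(n * s_dx2 - s_dx ** 2) * math.sqrt(n * s_dy2 - s_dy ** 2)
    return numerator / denominator

# Radio time (X, minutes) and appliances sold (Y) over six weeks.
radio_time = [25, 18, 32, 21, 35, 29]
units_sold = [16, 11, 20, 15, 26, 28]

r = pearson_assumed_mean(radio_time, units_sold, 25, 20)
r_zero = pearson_assumed_mean(radio_time, units_sold, 0, 0)
print(round(r, 2), round(r_zero, 2))
```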
• Ex.2: Calculate the correlation coefficient (using
assumed means):
• X: 50 60 58 47 49 33 65 43 46 68
• Y: 48 65 50 48 55 58 63 48 50 70
• Ans.2: Calculation of Karl Pearson’s coefficient of
correlation, taking assumed means A = 50 and A’ = 55
-----------------------------------------------------------------
  X     Y    dx=X–A    dx^2   dy=Y–A’   dy^2   dx*dy
-----------------------------------------------------------------
 50    48      0         0      -7       49       0
 60    65     10       100      10      100     100
 58    50      8        64      -5       25     -40
 47    48     -3         9      -7       49      21
 49    55     -1         1       0        0       0
 33    58    -17       289       3        9     -51
 65    63     15       225       8       64     120
 43    48     -7        49      -7       49      49
 46    50     -4        16      -5       25      20
 68    70     18       324      15      225     270
-----------------------------------------------------------------
519   555     19      1077       5      595     489
-----------------------------------------------------------------
• G(x,y) = 0.611
• Type IV: Grouped Data
• G = (N*Sum(f*dx*dy) – Sum(f*dx)*Sum(f*dy)) /
(Sqrt(N*Sum(f*dx^2) – (Sum(f*dx))^2) *
Sqrt(N*Sum(f*dy^2) – (Sum(f*dy))^2))
• Rank Correlation: “A method to determine
correlation when the data are not available in
numerical form and, as an alternative, the method
of ranking is used”
• R = 1 – 6*Sum(D^2) / (N*(N^2 – 1))
• or R = 1 – 6*Sum(D^2) / (N^3 – N)
• where D is the difference between the two ranks of an item and N is the number of items
• Ex.1: Suppose that 10 salesmen employed by a
company were given a month’s training. At the end
of the specified training, they took a test and were
ranked on the basis of their performance. They
were then posted to their respective areas. At the
end of six months, they were rated in respect of
their sales performance. These ranks are shown
below:
• Salesman:         1  2  3  4  5  6   7  8   9  10
• Rank in training: 4  6  1  3  9  7  10  2   8   5
• Rank in sales:    5  8  3  1  7  6   9  2  10   4
• Calculate the coefficient of rank correlation and
comment on the result.
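
The slides leave this exercise unanswered. As a hedged illustration, the Python sketch below applies the rank correlation formula R = 1 – 6*Sum(D^2)/(N*(N^2 – 1)) to the two rank series; the function name `spearman_from_ranks` is an assumption, and the formula in this form requires that there are no tied ranks.

```python
# Spearman's rank correlation from two series of untied ranks.
def spearman_from_ranks(ranks_a, ranks_b):
    """Return (R, sum_d2) where sum_d2 is the sum of squared rank differences."""
    n = len(ranks_a)
    sum_d2 = sum((p - q) ** 2 for p, q in zip(ranks_a, ranks_b))
    r = 1 - 6 * sum_d2 / (n * (n * n - 1))
    return r, sum_d2

# Ranks of the 10 salesmen in training and in sales.
training = [4, 6, 1, 3, 9, 7, 10, 2, 8, 5]
sales = [5, 8, 3, 1, 7, 6, 9, 2, 10, 4]

r, sum_d2 = spearman_from_ranks(training, sales)
print(sum_d2, round(r, 4))
```

A value of R close to +1 would indicate that salesmen who ranked well in training also tended to rank well in sales.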

• Ex.3: The ranks of 16 students in
Mathematics and Physics are as follows. The two
numbers within brackets denote the ranks of a
student in Mathematics and Physics: (1,1), (2,10),
(3,3), (4,4), (5,5), (6,7), (7,2), (8,6), (9,8), (10,11),
(11,15), (12,9), (13,14), (14,12), (15,16), (16,13).
• Ans.3: Sum(d^2) = 136, R = 0.8
• Ex.4: Ten competitors in a musical test were
ranked by three judges x, y and z in the following
order:
• Rank by x: 1 6 5 10 3 2 4 9 7 8
• Rank by y: 3 5 8 4 7 10 2 1 6 9
• Rank by z: 6 4 9 8 1 2 3 10 5 7
• Using the rank correlation method, discuss which
pair of judges has the nearest approach to common
liking in music.
• Ans.4:
-------------------------------------------------------------------------
  x    y    z   d1=x–y   d2=x–z   d3=y–z   d1^2   d2^2   d3^2
-------------------------------------------------------------------------
  1    3    6     -2       -5       -3        4     25      9
  6    5    4      1        2        1        1      4      1
  5    8    9     -3       -4       -1        9     16      1
 10    4    8      6        2       -4       36      4     16
  3    7    1     -4        2        6       16      4     36
  2   10    2     -8        0        8       64      0     64
  4    2    3      2        1       -1        4      1      1
  9    1   10      8       -1       -9       64      1     81
  7    6    5      1        2        1        1      4      1
  8    9    7     -1        1        2        1      1      4
-------------------------------------------------------------------------
Total              0        0        0      200     60    214
-------------------------------------------------------------------------
• R(x,y) = 1 – 6*Sum(d1^2) / (n*(n^2 – 1)) = 1 – 1200/990 = -7/33
• R(x,z) = 1 – 6*Sum(d2^2) / (n*(n^2 – 1)) = 1 – 360/990 = 7/11
• R(y,z) = 1 – 6*Sum(d3^2) / (n*(n^2 – 1)) = 1 – 1284/990 = -49/165
• Since R(x,z) is the maximum, we conclude that the pair
of judges x and z has the nearest approach to
common liking in music.
• Repeated Ranks: “If there is more than one item
with the same value in the series, then in
Spearman’s formula for calculating the rank
correlation coefficient we add the factor m(m^2 –
1)/12 to Sum(d^2), where m is the number of times
an item is repeated. This correction factor is to be
added for each repeated value”
• Items with the same value are each given the average of the ranks they would jointly occupy.
• Ex.5: Obtain the rank correlation coefficient for
the following data:
• X: 68 64 75 50 64 80 75 40 55 64
• Y: 62 58 68 45 81 60 68 48 50 70
• Ans.5: In the X-series 75 occurs twice and 64 occurs
three times, so the total correction for the X-series is
• 2(4 – 1)/12 + 3(9 – 1)/12 = 5/2
• The total correction for the Y-series is 2(4 – 1)/12 = 1/2,
as the value 68 occurs twice.
• With average ranks for the repeated values, Sum(d^2) = 72
• R = 1 – 6*(Sum(d^2) + 5/2 + 1/2) / (n*(n^2 – 1)) = 1 – 6*75/990 = 0.545
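
The repeated-ranks procedure has three steps: assign average ranks, add the correction m(m^2 – 1)/12 for every repeated value, and apply the rank correlation formula. The Python sketch below is one way to express this; the helper names are assumptions, and rank 1 is given to the largest value.

```python
from collections import Counter

def average_ranks(values):
    """Rank 1 = largest value; tied values share the mean of their positions."""
    ranks = []
    for v in values:
        greater = sum(1 for w in values if w > v)
        equal = sum(1 for w in values if w == v)
        ranks.append(greater + (equal + 1) / 2)
    return ranks

def tie_correction(values):
    """Sum of m(m^2 - 1)/12 over every value repeated m > 1 times."""
    return sum(m * (m * m - 1) / 12 for m in Counter(values).values() if m > 1)

def spearman_with_ties(xs, ys):
    """Return (R, sum_d2, correction) using the corrected formula."""
    n = len(xs)
    rx, ry = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((p - q) ** 2 for p, q in zip(rx, ry))
    correction = tie_correction(xs) + tie_correction(ys)
    r = 1 - 6 * (sum_d2 + correction) / (n * (n * n - 1))
    return r, sum_d2, correction

# Data of Ex.5.
xs = [68, 64, 75, 50, 64, 80, 75, 40, 55, 64]
ys = [62, 58, 68, 45, 81, 60, 68, 48, 50, 70]
r, sum_d2, correction = spearman_with_ties(xs, ys)
print(sum_d2, correction, round(r, 3))
```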

• Multiple Correlation: “In a tri-variate distribution
in which each of the variables X1, X2 and X3 has N
observations, the multiple correlation coefficient of
X1 on X2 and X3, usually denoted by R1.23, is the
simple correlation coefficient between X1 and the
joint effect of X2 and X3 on X1”
• R1.23^2 = (G12^2 + G13^2 – 2*G12*G13*G23) / (1 – G23^2)
• Ex.1: The following zero-order correlation
coefficients are given: r12 = 0.98, r13 = 0.44, r23 =
0.54. Calculate the multiple correlation coefficient
treating the first variable as dependent and the second and
third variables as independent.
• Ans.1: The first variable is dependent; the second and
third variables are independent.
• Therefore, the multiple correlation coefficient is
• R1.23 = Sqrt((r12^2 + r13^2 – 2*r12*r13*r23) / (1 – r23^2))
• = Sqrt((0.98^2 + 0.44^2 – 2*0.98*0.44*0.54) / (1 – 0.54^2))
• = 0.986
• Ex.2: Given the following zero-order coefficients of
correlation, calculate the multiple coefficient of
correlation taking the first variable as dependent and
the other two variables as independent: r12 = 0.56,
r13 = 0.38 and r23 = 0.69
• Ans.2: R1.23 = Sqrt((r12^2 + r13^2 – 2*r12*r13*r23) / (1 – r23^2))
• = Sqrt((0.56^2 + 0.38^2 – 2*0.56*0.38*0.69) / (1 – 0.69^2))
• = 0.56
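
The multiple correlation formula depends only on the three zero-order coefficients, so it fits in a few lines of code. The Python sketch below is illustrative; the function name `multiple_correlation` is an assumption.

```python
import math

# Multiple correlation coefficient R1.23 of variable 1 on variables 2 and 3.
def multiple_correlation(r12, r13, r23):
    r_squared = (r12 ** 2 + r13 ** 2 - 2 * r12 * r13 * r23) / (1 - r23 ** 2)
    return math.sqrt(r_squared)

# The two worked examples above.
R_ex1 = multiple_correlation(0.98, 0.44, 0.54)
R_ex2 = multiple_correlation(0.56, 0.38, 0.69)
print(round(R_ex1, 3), round(R_ex2, 2))
```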
• Partial Correlation Coefficient: “Sometimes the
correlation between two variables X1 and X2 may
be partly due to the correlation of a third variable,
X3, with both X1 and X2. In such a situation, one
may want to know what the correlation between
X1 and X2 would be if the effect of X3 on each of
X1 and X2 were eliminated. This correlation is
called the partial correlation, and the correlation
coefficient between X1 and X2 after the linear
effect of X3 on them has been eliminated is called
the partial correlation coefficient”
• The formula: G12.3 = (G12 – G13*G23) / Sqrt((1 – G13^2)*(1 – G23^2))
• where G12.3 = partial correlation between
variables 1 and 2, keeping variable 3 constant
• G12 = correlation between variables 1 and 2
• G13 = correlation between variables 1 and 3
• G23 = correlation between variables 2 and 3
• Similarly,
• G13.2 = (G13 – G12*G23) / Sqrt((1 – G12^2)*(1 – G23^2))
• G23.1 = (G23 – G12*G13) / Sqrt((1 – G12^2)*(1 – G13^2))
• Ex.1: From the following data, calculate the
correlation coefficient between variables 1 and 2,
keeping the effect of variable 3 constant: G12 =
0.7, G13 = 0.6, G23 = 0.4
• Ans: G12.3 = (G12 – G13*G23) / Sqrt((1 – G13^2)*(1 – G23^2))
• = (0.7 – 0.6*0.4) / Sqrt((1 – 0.6^2)*(1 – 0.4^2))
• = 0.46 / 0.7332
• = 0.627
• Ex.2: Calculate G23.1 and G13.2 from the following data:
G12 = 0.60, G13 = 0.51, G23 = 0.40
• Ans: G23.1 = (G23 – G12*G13) / Sqrt((1 – G12^2)*(1 – G13^2))
• = (0.40 – 0.60*0.51) / Sqrt((1 – 0.60^2)*(1 – 0.51^2))
• = 0.137
• G13.2 = (G13 – G12*G23) / Sqrt((1 – G12^2)*(1 – G23^2))
• = (0.51 – 0.60*0.40) / Sqrt((1 – 0.60^2)*(1 – 0.40^2))
• = 0.368
• Ex.3: Given the following zero-order correlation
coefficients: G12 = 0.8, G13 = 0.6, G23 = 0.5.
Calculate the partial correlation between the first and
third variables, keeping the second variable
constant.
• Ans: Given G12 = 0.8, G13 = 0.6, G23 = 0.5
• G13.2 = (G13 – G12*G23) / Sqrt((1 – G12^2)*(1 – G23^2))
• = (0.6 – 0.8*0.5) / Sqrt((1 – 0.8^2)*(1 – 0.5^2))
• = 0.385
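
All three partial correlation formulas share one pattern: the correlation of the two variables of interest, minus the product of their correlations with the controlled variable, divided by a normalising square root. The Python sketch below captures that pattern in a single function; the name `partial_correlation` and its argument names are assumptions made for illustration.

```python
import math

# Partial correlation between variables a and b, keeping variable c constant.
#   r_ab : correlation between a and b
#   r_ac : correlation between a and the controlled variable c
#   r_bc : correlation between b and the controlled variable c
def partial_correlation(r_ab, r_ac, r_bc):
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))

# G13.2 with G12 = 0.8, G13 = 0.6, G23 = 0.5 (variable 2 is controlled).
g13_2 = partial_correlation(0.6, 0.8, 0.5)

# G12.3 with G12 = 0.7, G13 = 0.6, G23 = 0.4 (variable 3 is controlled).
g12_3 = partial_correlation(0.7, 0.6, 0.4)

print(round(g13_2, 3), round(g12_3, 3))
```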
• Ex.4: From the data relating to the yield of dry
bark X1, height X2 and girth X3 for 18 cinchona
plants, the following correlation coefficients were
obtained: G12 = 0.77, G13 = 0.72 and G23 = 0.52.
Find the partial correlation coefficient G12.3 and the
multiple correlation coefficient R1.23
• Ans: G12.3 = (G12 – G13*G23) / Sqrt((1 – G13^2)*(1 – G23^2))
• = (0.77 – 0.72*0.52) / Sqrt((1 – 0.72^2)*(1 – 0.52^2))
• = 0.3956 / 0.5928 = 0.667
• R1.23^2 = (G12^2 + G13^2 – 2*G12*G13*G23) / (1 – G23^2)
• = (0.77^2 + 0.72^2 – 2*0.77*0.72*0.52) / (1 – 0.52^2) = 0.7329
• R1.23 = Sqrt(0.7329) = 0.8561
• Ex.5: In a trivariate distribution G21 = 0.7,
G23 = G31 = 0.5. Find (i) G23.1 (ii) R1.23
• Ans.5: (i) G23.1 = (G23 – G21*G31) / Sqrt((1 – G21^2)*(1 – G31^2))
• = (0.5 – 0.7*0.5) / Sqrt((1 – 0.49)*(1 – 0.25))
• = 0.2425
• (ii) R1.23^2 = (G12^2 + G13^2 – 2*G12*G13*G23) / (1 – G23^2)
• = (0.49 + 0.25 – 2*0.7*0.5*0.5) / (1 – 0.25) = 0.52
• R1.23 = Sqrt(0.52) = 0.7211
