REGRESSION ANALYSIS
Type-III: Assumed Mean is used
• (i) (y – ¯y) = b (x – ¯x)
• (ii) (x – ¯x) = b’(y – ¯y)
Karl Pearson’s correlation coefficient between x
and y = 0.42. Find both the regression lines. Use
these regression lines to estimate the value of y for
x = 50 and also estimate the value of x for y = 30.
• Ans: The following data are given
• ¯x = 40, ¯y = 48, σ(x) = 10, σ(y) = 15, γ = 0.42
• The regression lines are
• (x – ¯x) = γ * (σ(x) / σ(y)) * (y – ¯y)
• (x – 40) = 0.42 * (10 / 15) * (y-48)
• x = 0.28y + 26.56
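The question also asks for the line of y on x and for the two estimates, which the slide does not show. The following is a minimal Python sketch of that calculation from the given means, standard deviations and correlation coefficient. The function name `regression_lines` is an illustrative assumption, not something taken from the slides.

```python
def regression_lines(mean_x, mean_y, sd_x, sd_y, r):
    """Return (slope, intercept) for the line of y on x and of x on y."""
    b_yx = r * sd_y / sd_x            # regression coefficient of y on x
    b_xy = r * sd_x / sd_y            # regression coefficient of x on y
    y_on_x = (b_yx, mean_y - b_yx * mean_x)
    x_on_y = (b_xy, mean_x - b_xy * mean_y)
    return y_on_x, x_on_y


# Data from the slide: means 40 and 48, standard deviations 10 and 15, r = 0.42
(y_slope, y_icpt), (x_slope, x_icpt) = regression_lines(40, 48, 10, 15, 0.42)

y_at_50 = y_slope * 50 + y_icpt       # estimate of y for x = 50
x_at_30 = x_slope * 30 + x_icpt       # estimate of x for y = 30

print("y on x:", round(y_slope, 2), round(y_icpt, 2), "-> y(50) =", round(y_at_50, 2))
print("x on y:", round(x_slope, 2), round(x_icpt, 2), "-> x(30) =", round(x_at_30, 2))
```

With these inputs the sketch gives y of about 54.3 for x = 50 and x of about 34.96 for y = 30.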
----------------------------------------------------------------
  x    y   x–¯x  y–¯y  (x–¯x)^2  (y–¯y)^2  (x–¯x)(y–¯y)
----------------------------------------------------------------
 12   18     0     2        0         4           0
  4   22    -8     6       64        36         -48
 20   10     8    -6       64        36         -48
  8   16    -4     0       16         0           0
 16   14     4    -2       16         4          -8
----------------------------------------------------------------
 60   80     0     0      160        80        -104
• b’ = ⅀((x – ¯x) * (y – ¯y)) / ⅀(y – ¯y)^2
• b’ = -104 / 80 = -1.3
• b = ⅀((x – ¯x) * (y – ¯y)) / ⅀(x – ¯x)^2
• b = -104 / 160 = -0.65
• Regression equation x on y
• (x – ¯x) = b’ * (y – ¯y)
• (x – 12) = -1.3 * (y - 16)
• X = 32.8 – 1.3y
• Regression equation y on x is
• (y – ¯y) = b * (x – ¯x)
• y – 16 = -0.65 (x – 12)
• Y = 23.8 – 0.65x
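The two regression coefficients above can be checked directly from the five (x, y) pairs in the table. This is a minimal sketch using the actual-mean formulas. The function name `regression_coefficients` is an illustrative assumption.

```python
def regression_coefficients(xs, ys):
    """Return (mean_x, mean_y, b, b_prime) using deviations from the actual means.

    b       is the coefficient of y on x: Sum(dx*dy) / Sum(dx^2)
    b_prime is the coefficient of x on y: Sum(dx*dy) / Sum(dy^2)
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return mean_x, mean_y, s_xy / s_xx, s_xy / s_yy


xs = [12, 4, 20, 8, 16]
ys = [18, 22, 10, 16, 14]
mean_x, mean_y, b, b_prime = regression_coefficients(xs, ys)

# Intercepts of the two lines: y = y_icpt + b*x and x = x_icpt + b_prime*y
y_icpt = mean_y - b * mean_x
x_icpt = mean_x - b_prime * mean_y
print(b, y_icpt, b_prime, x_icpt)
```

The sketch reproduces b = -0.65, b' = -1.3 and the intercepts 23.8 and 32.8 of the two equations.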
--------------------------------------------------------------------------
   x     y   dx=x-70  dy=y-125  (dx)^2  (dy)^2  dx*dy
--------------------------------------------------------------------------
  61   112     -9       -13       81     169     117
  68   123     -2        -2        4       4       4
  68   120     -2        -5        4      25      10
  64   115     -6       -10       36     100      60
  65   110     -5       -15       25     225      75
  70   125      0         0        0       0       0
  63   100     -7       -25       49     625     175
  62   113     -8       -12       64     144      96
  64   116     -6        -9       36      81      54
  67   123     -3        -2        9       4       6
--------------------------------------------------------------------------
 652  1157    -48       -93      308    1377     597
--------------------------------------------------------------------------
• b = (n * ⅀(dx*dy) – ⅀(dx) * ⅀(dy)) / (n * ⅀(dx^2) – (⅀(dx))^2)
• = (10 * 597 – (-48) * (-93)) / (10 * 308 – (-48)^2)
• = 1506 / 776 = 1.94
• ¯x = A + ⅀(dx) / n = 70 + (-48) / 10 = 65.2
• ¯y = A’ + ⅀(dy) / n = 125 + (-93) / 10 = 115.7
• The Regression equation y on x is
• y – ¯y = b*(x – ¯x)
• y – 115.7 = 1.94*(x – 65.2)
• y = 1.94x – 10.788
• When x = 71
• Therefore, y = 1.94*71 – 10.788
• y = 126.952
• The weight of a student whose height is 71 is 126.952
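The assumed-mean calculation can be sketched in Python from the column totals alone. The totals are the ones in the table's last row. The assumed means A = 70 and A' = 125 are inferred from the deviation columns (for example 61 – 70 = -9 and 112 – 125 = -13). The function name `slope_from_assumed_mean` is an illustrative assumption.

```python
def slope_from_assumed_mean(n, sum_dx, sum_dy, sum_dx2, sum_dxdy):
    """Regression coefficient of y on x from deviations about assumed means."""
    return (n * sum_dxdy - sum_dx * sum_dy) / (n * sum_dx2 - sum_dx ** 2)


n = 10
A, A_prime = 70, 125                  # assumed means for x and y (inferred)
sum_dx, sum_dy = -48, -93             # totals of the dx and dy columns
sum_dx2, sum_dxdy = 308, 597          # totals of the (dx)^2 and dx*dy columns

b = slope_from_assumed_mean(n, sum_dx, sum_dy, sum_dx2, sum_dxdy)
mean_x = A + sum_dx / n
mean_y = A_prime + sum_dy / n

# Line of y on x: y = intercept + b*x
intercept = mean_y - b * mean_x
y_at_71 = intercept + b * 71
print(round(b, 4), round(mean_x, 1), round(mean_y, 1), round(y_at_71, 2))
```

Because the sketch keeps b unrounded, its estimate at x = 71 differs from the slide's 126.952 only in the third decimal place.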
Ex.4: From the following data, find the regression
equation of x on y
• x = 6 2 10 4 8
• y = 9 11 5 8 7
• Ans.4: x = 16.4 – 1.3y
CURVE FITTING-METHOD OF LEAST SQUARES
------------------------------------------------------
   X      Y     X^2     X*Y
------------------------------------------------------
   1     2.4      1      2.4
   2     3.0      4      6.0
   3     3.6      9     10.8
   4     4.0     16     16.0
   6     5.0     36     30.0
   8     6.0     64     48.0
------------------------------------------------------
  24    24.0    130    113.2
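The column totals in this table (Sum X, Sum Y, Sum X^2, Sum X*Y) are the quantities needed by the normal equations of a straight line Y = a + bX, namely Sum(Y) = n*a + b*Sum(X) and Sum(X*Y) = a*Sum(X) + b*Sum(X^2). The slide that states the problem is not present here, so treating this table as a straight-line fit is an assumption. Under that assumption, a minimal Python sketch of the fit looks like this (the name `fit_line` is illustrative):

```python
def fit_line(xs, ys):
    """Least-squares straight line Y = a + b*X via the two normal equations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Solve  sy = n*a + b*sx  and  sxy = a*sx + b*sxx  for a and b
    b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    a = (sy - b * sx) / n
    return a, b


xs = [1, 2, 3, 4, 6, 8]
ys = [2.4, 3.0, 3.6, 4.0, 5.0, 6.0]
a, b = fit_line(xs, ys)
print(round(a, 3), round(b, 3))
```

For this data the sketch gives roughly Y = 1.976 + 0.506X.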
• Type.II: Fitting of Second degree parabola
• Y = a + bX + c*X^2
• The Normal equations are
• Sum(yi) = n*a + b*Sum(xi) + c*Sum(xi^2)
• Sum(xi*yi) = a*Sum(xi) + b*Sum(xi^2) + c*Sum(xi^3)
• Sum(xi^2*yi) = a*Sum(xi^2) + b*Sum(xi^3) + c*Sum(xi^4)
-------------------------------------------------------------------------
   X      Y     X^2    X^3    X^4    X*Y    X^2*Y
-------------------------------------------------------------------------
   0     1.0      0      0      0    0.0      0.0
   1     1.8      1      1      1    1.8      1.8
   2     1.3      4      8     16    2.6      5.2
   3     2.5      9     27     81    7.5     22.5
   4     6.3     16     64    256   25.2    100.8
-------------------------------------------------------------------------
  10    12.9     30    100    354   37.1    130.3
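Substituting the column totals into the three normal equations of the second-degree parabola gives a 3-by-3 linear system in a, b and c. The following is a minimal Python sketch that builds and solves that system. The helper names `solve_linear` and `fit_parabola` are illustrative assumptions, and the small Gaussian elimination routine stands in for whatever hand method the slides used.

```python
def solve_linear(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Bring the row with the largest pivot into place
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution
    sol = [0.0] * n
    for i in reversed(range(n)):
        known = sum(aug[i][j] * sol[j] for j in range(i + 1, n))
        sol[i] = (aug[i][n] - known) / aug[i][i]
    return sol


def fit_parabola(xs, ys):
    """Least-squares parabola Y = a + b*X + c*X^2 via the normal equations."""
    sx = [sum(x ** p for x in xs) for p in range(5)]                   # Sum(x^0)..Sum(x^4)
    sxy = [sum((x ** p) * y for x, y in zip(xs, ys)) for p in range(3)]
    matrix = [[sx[0], sx[1], sx[2]],
              [sx[1], sx[2], sx[3]],
              [sx[2], sx[3], sx[4]]]
    return solve_linear(matrix, sxy)


xs = [0, 1, 2, 3, 4]
ys = [1, 1.8, 1.3, 2.5, 6.3]
a, b, c = fit_parabola(xs, ys)
print(round(a, 2), round(b, 2), round(c, 2))
```

For the tabulated data the sketch gives approximately Y = 1.42 – 1.07X + 0.55X^2.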
Type.IV: Fitting of Exponential Curve: Y = ab^x
• U = A + B*X
• Where, U=logY, A=loga, B=logb
• The normal equations are
• Sum(U) = n*A + B*Sum(X)
• Sum(X*U) = A*Sum(X) + B*Sum(X^2)
• a=Antilog(A)
• b=Antilog(B)
• Ex.1: Fit an exponential curve of the form Y = ab^x to the following data
• X: 1 2 3 4 5 6 7 8
• Y : 1.0 1.2 1.8 2.5 3.6 4.7 6.6 9.1
Ans:
------------------------------------------------------------------
   X      Y     U=logY      XU      X^2
------------------------------------------------------------------
   1     1.0    0.0000    0.0000      1
   2     1.2    0.0792    0.1584      4
   3     1.8    0.2553    0.7659      9
   4     2.5    0.3979    1.5916     16
   5     3.6    0.5563    2.7815     25
   6     4.7    0.6721    4.0326     36
   7     6.6    0.8195    5.7365     49
   8     9.1    0.9590    7.6720     64
------------------------------------------------------------------
  36    30.5    3.7393   22.7385    204
The normal equations are
• 3.7393 = 8A + 36B … (i)
• 22.7385 = 36A + 204B … (ii)
• B = 0.1406, A = -0.6655/4 = -0.1660
• b = Antilog(B) = Antilog(0.1406) = 1.38
• a = Antilog(A) = 0.68
• Therefore, Y = 0.68(1.38)^x
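The log transformation and the two normal equations can be sketched in Python as follows. Base-10 logarithms are used to match the "Antilog" steps above. The function name `fit_exponential` is an illustrative assumption, and because the sketch uses unrounded logarithms, its values agree with the slide's four-figure arithmetic only to about two decimal places.

```python
import math


def fit_exponential(xs, ys):
    """Fit Y = a * b**X by least squares on U = log10(Y) = A + B*X."""
    us = [math.log10(y) for y in ys]
    n = len(xs)
    sx, su = sum(xs), sum(us)
    sxx = sum(x * x for x in xs)
    sxu = sum(x * u for x, u in zip(xs, us))
    # Solve  su = n*A + B*sx  and  sxu = A*sx + B*sxx
    B = (n * sxu - sx * su) / (n * sxx - sx ** 2)
    A = (su - B * sx) / n
    return 10 ** A, 10 ** B          # a = antilog(A), b = antilog(B)


xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 1.8, 2.5, 3.6, 4.7, 6.6, 9.1]
a, b = fit_exponential(xs, ys)
print(round(a, 2), round(b, 2))
```

Note that this minimizes squared errors in log Y, not in Y itself, which is what the normal equations in U imply.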
Ex.2: Fit an exponential curve of the form Y=ab^x
• X 2 3 4 5 6
• Y 144 172.8 207.4 248.8 298.6
• Ans.2: Y ≈ 100*(1.2)^x, since each Y value is about 1.2 times the previous one
CORRELATION ANALYSIS
Correlation Coefficient
• Simple Correlation
• Partial Correlation
• Multiple Correlation
• Rank Correlation
• Correlation Coefficient: “If the change in one
variable affects a change in the other variable, the
variables are said to be correlated”
• Direct or Positive Correlation: “If the two
variables deviate in the same direction, i.e. if an
increase (or decrease) in one results in a
corresponding increase (or decrease) in the other,
the correlation is said to be direct or positive”
• Ex.1: Positive: The heights and weights of a group
of persons.
• 2. Positive: Income and expenditure.
• 3. Negative: The price and demand of a commodity.
Perfect correlation: “The Correlation is said to be
perfect if the deviation in one variable is followed
by a corresponding and proportional deviation in
the other”
• Simple Correlation
• Scatter Diagram Method
• Graphic Method
• Karl Pearson’s Coefficient of Correlation
• Concurrent Deviation Method
• Method of Least Squares
Karl Pearson’s Co-efficient of Correlation
• Type.I: Raw Score Method:
• G(x, y) = Cov(X,Y) / (Sig(x)*Sig(y))
• If (xi, yi), i = 1, 2, 3, …, n is the bivariate distribution,
then
• Cov(X,Y) = (1/n)*Sum(xi*yi) – ¯x * ¯y
• Sig(x) = Sqrt((1/n)*Sum(xi^2) – ¯x^2)
• Sig(y) = Sqrt((1/n)*Sum(yi^2) – ¯y^2)
• ¯x = Sum(x)/n, ¯y = Sum(y)/n
Type-II: Actual Mean Method:
• G = Sum(x*y) / Sqrt(Sum(x^2)*Sum(y^2))
• x = X – ¯X, y = Y – ¯Y
Ex.1: Calculate the correlation coefficient for the
following heights (in inches) of fathers (X) and
their Sons (Y):
• X : 65 66 67 67 68 69 70 72
• Y : 67 68 65 68 72 72 69 71
-----------------------------------------------------------
   X     Y     X^2     Y^2     X*Y
-----------------------------------------------------------
  65    67    4225    4489    4355
  66    68    4356    4624    4488
  67    65    4489    4225    4355
  67    68    4489    4624    4556
  68    72    4624    5184    4896
  69    72    4761    5184    4968
  70    69    4900    4761    4830
  72    71    5184    5041    5112
-----------------------------------------------------------
 544   552   37028   38132   37560
• ¯X = Sum(X)/n = 544/8 = 68
• ¯Y = Sum(Y)/n = 552/8 = 69
• G(x,y) = Cov(X,Y) / (Sig(x)*Sig(y))
• = (Sum(X*Y)/n – ¯X*¯Y) /
(Sqrt(Sum(X^2)/n – ¯X^2) *
Sqrt(Sum(Y^2)/n – ¯Y^2))
• = (37560/8 – 68*69) / (Sqrt(37028/8 – 68^2) *
Sqrt(38132/8 – 69^2))
• = 3 / Sqrt(4.5 * 5.5)
• = 0.603
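The raw score method can be sketched in Python directly from the covariance and standard deviation formulas. The function name `pearson_r` is an illustrative assumption.

```python
import math


def pearson_r(xs, ys):
    """Karl Pearson's coefficient by the raw score method."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y
    sd_x = math.sqrt(sum(x * x for x in xs) / n - mean_x ** 2)
    sd_y = math.sqrt(sum(y * y for y in ys) / n - mean_y ** 2)
    return cov / (sd_x * sd_y)


fathers = [65, 66, 67, 67, 68, 69, 70, 72]   # X, heights in inches
sons = [67, 68, 65, 68, 72, 72, 69, 71]      # Y, heights in inches
r = pearson_r(fathers, sons)
print(round(r, 3))
```

The same function applied to the data of Ex.2 (X = 20, 16, 12, 8, 4 and Y = 22, 14, 4, 12, 8) gives about 0.7.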
Ex.2: Calculate Correlation between the variable x
and y
• X 20 16 12 8 4
• Y 22 14 4 12 8
Ans:
------------------------------------------------------------
   X     Y    X^2    Y^2    X*Y
------------------------------------------------------------
  20    22    400    484    440
  16    14    256    196    224
  12     4    144     16     48
   8    12     64    144     96
   4     8     16     64     32
------------------------------------------------------------
  60    60    880    904    840
------------------------------------------------------------
• ¯X = Sum(X)/n = 60/5 = 12
• ¯Y = Sum(Y)/n = 60/5 = 12
• G(x,y) = Cov(X,Y) / (Sig(x)*Sig(y))
• = (Sum(X*Y)/n – ¯X*¯Y) /
(Sqrt(Sum(X^2)/n – ¯X^2) *
Sqrt(Sum(Y^2)/n – ¯Y^2))
• = (840/5 – 12*12) / (Sqrt(880/5 – 12^2) *
Sqrt(904/5 – 12^2))
• = 0.7
Ex.3: Calculate the coefficient of correlation
between x and y for the following data
• X 15 16 17 17 18 20 10
• Y 12 17 15 16 12 15 11
• Ans: 0.53
Type-II: Deviation Score Method (Actual Mean
Method)
• Ex.1: Calculate Karl Pearson Coefficient of
correlation from the following data
• ¯X = Sum(X) / N = 832 / 8 = 104
• ¯Y = Sum(Y) / N = 120 / 8 = 15
• G(x,y) = Sum(xy) / (Sqrt(Sum(x^2))*Sqrt(Sum(y^2)))
• = -92 / (Sqrt(120)*Sqrt(184))
• = -0.619
• Correlation between index of production and
number of unemployed is negative.
Type.III: Assumed Mean Method:
• G = (N*Sum(dx*dy) – Sum(dx)*Sum(dy)) /
(Sqrt(N*Sum(dx^2) – (Sum(dx))^2) *
Sqrt(N*Sum(dy^2) – (Sum(dy))^2))
• dx = X – A, dy = Y – A’, where A and A’ are the assumed means
Ex.1: A company manufactures different types of
electrical appliances. It has been using radio for
advertising its products. The following table shows
amounts of radio time (X, in minutes) and the
number of electrical appliances sold (Y) over the
last six weeks.
• X : 25 18 32 21 35 29
• Y : 16 11 20 15 26 28
• Calculate the coefficient of correlation between the
two series.
----------------------------------------------------------------------
   X     Y    dx=X-A   (dx)^2   dy=Y-A’   (dy)^2   dx*dy
----------------------------------------------------------------------
  25    16      0         0       -4        16       0
  18    11     -7        49       -9        81      63
  32    20      7        49        0         0       0
  21    15     -4        16       -5        25      20
  35    26     10       100        6        36      60
  29    28      4        16        8        64      32
----------------------------------------------------------------------
 160   116     10       230       -4       222     175
----------------------------------------------------------------------
(A = 25, A’ = 20)
• Ans.1: 0.84
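The assumed-mean formula can be sketched in Python as below. The assumed means 25 and 20 are read off from the deviation columns of the table. The function name `pearson_r_assumed_mean` is an illustrative assumption.

```python
import math


def pearson_r_assumed_mean(xs, ys, a_x, a_y):
    """Karl Pearson's coefficient using deviations from assumed means a_x, a_y."""
    n = len(xs)
    dx = [x - a_x for x in xs]
    dy = [y - a_y for y in ys]
    num = n * sum(p * q for p, q in zip(dx, dy)) - sum(dx) * sum(dy)
    den = (math.sqrt(n * sum(p * p for p in dx) - sum(dx) ** 2)
           * math.sqrt(n * sum(q * q for q in dy) - sum(dy) ** 2))
    return num / den


radio_minutes = [25, 18, 32, 21, 35, 29]     # X
appliances_sold = [16, 11, 20, 15, 26, 28]   # Y
r = pearson_r_assumed_mean(radio_minutes, appliances_sold, 25, 20)

# The result does not depend on which assumed means are chosen
r_other = pearson_r_assumed_mean(radio_minutes, appliances_sold, 0, 0)
print(round(r, 2), round(r_other, 2))
```

The choice of assumed means only affects how convenient the hand arithmetic is; any pair gives the same coefficient.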
• Ex.2: Calculation of Correlation Coefficient (Using
Assumed Mean):
• X 50 60 58 47 49 33 65 43 46 68
• Y 48 65 50 48 55 58 63 48 50 70
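Ans.2: Calculation of Karl Pearson coefficient of
correlation (A = 50, A’ = 55)
-----------------------------------------------------------------
   X     Y    dx=X-A   dx^2   dy=Y-A’   dy^2   dx*dy
-----------------------------------------------------------------
  50    48      0        0      -7       49       0
  60    65     10      100      10      100     100
  58    50      8       64      -5       25     -40
  47    48     -3        9      -7       49      21
  49    55     -1        1       0        0       0
  33    58    -17      289       3        9     -51
  65    63     15      225       8       64     120
  43    48     -7       49      -7       49      49
  46    50     -4       16      -5       25      20
  68    70     18      324      15      225     270
-----------------------------------------------------------------
 519   555     19     1077       5      595     489
• G(x,y) = (10*489 – 19*5) / (Sqrt(10*1077 – 19^2) * Sqrt(10*595 – 5^2)) = 0.611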
• Type – IV: Grouped Data:
Rank Correlation: “A method to determine
correlation when the data are not available in
numerical form and, as an alternative, the method
of ranking is used.”
• R = 1 – 6*Sum(D^2) / (N*(N^2 – 1))
• Or R = 1 – 6*Sum(D^2) / (N^3 – N)
• D is the difference between the two ranks of an item and N is the number of items
Ex.1: Suppose that 10 salesmen employed by a
company were given a month’s training. At the end
of the specified training, they took a test and were
ranked on the basis of their performance. They
were then posted to their respective areas. At the
end of six months, they were rated in respect of
their sales performance. These ranks are shown
below:
• Salesmen:          1  2  3  4  5  6   7  8   9  10
• Ranks in training: 4  6  1  3  9  7  10  2   8   5
• Ranks in sales:    5  8  3  1  7  6   9  2  10   4
• Calculate the coefficient of rank correlation and
comment on the result.
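The slides do not show the worked answer to this exercise, so the value below is computed rather than quoted. This is a minimal Python sketch of Spearman's formula for untied ranks. The function name `spearman_from_ranks` is an illustrative assumption.

```python
def spearman_from_ranks(rank_a, rank_b):
    """Spearman's rank correlation R = 1 - 6*Sum(D^2) / (N*(N^2 - 1)), no ties."""
    n = len(rank_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


training = [4, 6, 1, 3, 9, 7, 10, 2, 8, 5]   # ranks in training
sales = [5, 8, 3, 1, 7, 6, 9, 2, 10, 4]      # ranks in sales

sum_d2 = sum((a - b) ** 2 for a, b in zip(training, sales))
rho = spearman_from_ranks(training, sales)
print(sum_d2, round(rho, 3))
```

Here Sum(D^2) = 24 and R is about 0.85, which suggests that performance in training and performance in sales are closely and positively related.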
Ex.3: The ranks of some 16 students in
Mathematics and physics are as follows. Two
numbers within brackets denote the ranks of the
students in Mathematics and Physics (1,1), (2,10),
(3,3), (4,4), (5,5), (6,7), (7,2), (8,6), (9,8), (10,11),
(11,15), (12,9), (13,14), (14,12), (15,16), (16,13).
• Ans.3: Sum(d^2) = 136, R = 0.8
Ex.4: Ten competitors in a musical test were
ranked by the three judges x, y and z in the following
order:
• Rank of x: 1 6 5 10 3 2 4 9 7 8
• Rank of y: 3 5 8 4 7 10 2 1 6 9
• Rank of z: 6 4 9 8 1 2 3 10 5 7
• Using Rank Correlation method, discuss which
pair of judges has the nearest approach to common
liking in music.
Ans.4:
-----------------------------------------------------------------
  x    y    z   d1=x-y  d2=x-z  d3=y-z  d1^2  d2^2  d3^2
-----------------------------------------------------------------
  1    3    6     -2      -5      -3      4    25     9
  6    5    4      1       2       1      1     4     1
  5    8    9     -3      -4      -1      9    16     1
 10    4    8      6       2      -4     36     4    16
  3    7    1     -4       2       6     16     4    36
  2   10    2     -8       0       8     64     0    64
  4    2    3      2       1      -1      4     1     1
  9    1   10      8      -1      -9     64     1    81
  7    6    5      1       2       1      1     4     1
  8    9    7     -1       1       2      1     1     4
-----------------------------------------------------------------
                   0       0       0    200    60   214
-----------------------------------------------------------------
• R(x,y) = 1 – 6*Sum(d1^2) / (n*(n^2-1)) = 1 – 6*200/990 = -7/33
• R(x,z) = 1 – 6*Sum(d2^2) / (n*(n^2-1)) = 1 – 6*60/990 = 7/11
• R(y,z) = 1 – 6*Sum(d3^2) / (n*(n^2-1)) = 1 – 6*214/990 = -49/165
• Since R(x,z) is maximum, we conclude that the pair
of judges x and z has the nearest approach to
common liking in music.
Repeated Ranks: “If more than one item in the
series has the same value, then in Spearman’s
formula for calculating the rank correlation
coefficient we add the factor m(m^2 - 1)/12 to
Sum(d^2), where m is the number of times an item
is repeated. This correction factor is to be added
for each repeated value”
Ex.5: Obtain the rank correlation co-efficient for
the following data:
• X: 68 64 75 50 64 80 75 40 55 64
• Y: 62 58 68 45 81 60 68 48 50 70
• Ans.5: In the X-series 75 occurs twice and 64 occurs three
times, so the total correction for the X-series is
• 2(4-1)/12 + 3(9-1)/12 = 5/2
• The total correction for the Y-series is 2(4-1)/12 = 1/2,
as the value 68 occurs twice.
• R = 1 – 6*(Sum(d^2) + 5/2 + 1/2) / (n*(n^2-1)) = 1 – 6*(72 + 3)/990 = 0.545
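The repeated-ranks procedure can be sketched in Python as follows: tied values share the average of the ranks they occupy, and m(m^2 - 1)/12 is added to Sum(d^2) for each group of m tied values. All function names here are illustrative assumptions, and ranking from the largest value downwards is also an assumption (ranking upwards gives the same coefficient).

```python
from collections import Counter


def average_ranks(values):
    """Rank values from largest to smallest; tied values get their mean rank."""
    ordered = sorted(values, reverse=True)
    first_pos = {}
    for pos, v in enumerate(ordered, start=1):
        first_pos.setdefault(v, pos)          # first position each value occupies
    counts = Counter(values)
    # A value tied m times occupies positions p .. p+m-1, whose mean is p + (m-1)/2
    return [first_pos[v] + (counts[v] - 1) / 2 for v in values]


def tie_correction(values):
    """Sum of m*(m^2 - 1)/12 over every value repeated m > 1 times."""
    return sum(m * (m * m - 1) / 12 for m in Counter(values).values() if m > 1)


def spearman_with_ties(xs, ys):
    n = len(xs)
    rx, ry = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    adjusted = sum_d2 + tie_correction(xs) + tie_correction(ys)
    return 1 - 6 * adjusted / (n * (n * n - 1))


xs = [68, 64, 75, 50, 64, 80, 75, 40, 55, 64]
ys = [62, 58, 68, 45, 81, 60, 68, 48, 50, 70]
R = spearman_with_ties(xs, ys)
print(tie_correction(xs), tie_correction(ys), round(R, 3))
```

For this data the corrections are 2.5 and 0.5, and R comes out at about 0.545.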
Multiple Correlation: “In a tri-variate distribution
in which each of the variables X1, X2 and X3 has N
observations, the multiple correlation coefficient of
X1 on X2 and X3, usually denoted by R1.23, is the
simple correlation coefficient between X1 and the
joint effect of X2 and X3 on X1”. Then
• R1.23^2 = (G12^2 + G13^2 – 2*G12*G13*G23) / (1 – G23^2)
Ex.1: The following zero-order correlation
coefficients are given r12 = 0.98, r13 = 0.44, r23 =
0.54, Calculate multiple correlation coefficient
treating first variable as dependent and second and
third variable as independent.
• Ans.1: Given first variable is dependent, second and
third variable are independent.
• Therefore, Multiple correlation coefficient
• R1.23=Sqrt{(r12^2+r13^2–2*r12*r13*r23)/(1-
r23^2)}= Sqrt{(0.98^2+0.44^2-2*(.98)(.44)(.54))/(1-
0.54^2)} = 0.986
Ex.2: Given the following zero-order coefficients of
correlation, calculate multiple coefficient of
correlation taking first variable as dependent and
the other two variables as independent: r12 = 0.56,
r13 = 0.38 and r23 = 0.69
• Ans.2: R1.23 = Sqrt{(r12^2+r13^2-2*r12*r13*r23)
• / (1- r23^2)}
• =Sqrt{(0.56^2+0.38^2-2*0.56*0.38* 0.69)
• / (1- 0.69^2)}
• = 0.56
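Both multiple correlation exercises follow the same arithmetic, which can be sketched in Python as below. The function name `multiple_correlation` is an illustrative assumption.

```python
import math


def multiple_correlation(r12, r13, r23):
    """R1.23: multiple correlation of variable 1 on variables 2 and 3."""
    r_squared = (r12 ** 2 + r13 ** 2 - 2 * r12 * r13 * r23) / (1 - r23 ** 2)
    return math.sqrt(r_squared)


R_ex1 = multiple_correlation(0.98, 0.44, 0.54)   # Ex.1
R_ex2 = multiple_correlation(0.56, 0.38, 0.69)   # Ex.2
print(round(R_ex1, 3), round(R_ex2, 2))
```

The sketch assumes the three zero-order coefficients are mutually consistent, so that the quantity under the square root lies between 0 and 1.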
• Partial Correlation Coefficient: “Sometimes the
correlation between two variables X1 and X2 may
be partly due to the correlation of a third variable,
X3 with both X1 and X2. In such a situation, one
may want to know what the correlation between
X1 and X2 would be if the effect of X3 on each of
X1 and X2 were eliminated. This correlation is
called the partial correlation and the correlation
coefficient between X1 and X2 after the linear
effect of X3 on them has been eliminated is called
the partial correlation coefficient”
The formula: G12.3 = (G12 – G13*G23) / Sqrt((1 – G13^2)*(1 – G23^2))
• Where G12.3 = partial correlation between
variables 1 and 2, with variable 3 held constant
• G12 = correlation between variables 1 and 2
• G13 = correlation between variables 1 and 3
• G23 = correlation between variables 2 and 3
• G13.2 = (G13 – G12*G23) / Sqrt((1 – G12^2)*(1 – G23^2))
• G23.1 = (G23 – G12*G13) / Sqrt((1 – G12^2)*(1 – G13^2))
Ex.1: From the following data, Calculate the
correlation coefficient between variables 1 and 2
by keeping the effect of variable 3 constant, G12 =
0.7, G13 = 0.6, G23 = 0.4
• Ans: G12.3 = (G12 – G13*G23) / Sqrt((1 – G13^2)*(1 – G23^2))
• = (0.7 – 0.6*0.4) / Sqrt((1 – 0.6^2)*(1 – 0.4^2))
• = 0.46 / Sqrt(0.5376)
• = 0.627
• Ex.2: Calculate G23.1 and G13.2 from the following data
G12 = 0.60, G13 = 0.51, G23 = 0.40
• Ans:G23.1=(G23–G12*G13)/Sqrt((1 – G12^2)*(1 –
• G13^2))
• =(0.40-0.60*0.51)/Sqrt((1-0.60^2)*(1-
• 0.51^2))
• = 0.137
• G13.2 = (G13 – G12*G23) / Sqrt((1 – G12^2)*(1 – G23^2))
• = (0.51 – 0.60*0.40) / Sqrt((1 – 0.60^2)*(1 – 0.40^2))
• = 0.368
• Ex.3: Given the following zero order correlation
coefficients G12 = 0.8, G13 = 0.6, G23 = 0.5.
Calculate the partial correlation between first and
third variables, keeping the second variable
constant.
• Ans: Given G12 = 0.8, G13 = 0.6, G23 = 0.5
• G13.2 = (G13 – G12*G23) / Sqrt((1-G12^2)*(1-
• G23^2))
• = (0.6 – 0.8 * 0.5) / Sqrt((1-0.8^2)*(1-0.5^2))
• = 0.385
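All three partial correlation formulas have the same shape, so one function covers them if the arguments are passed in the right order. This is a minimal Python sketch using the coefficients of Ex.3. The function name `partial_correlation` and its argument names are illustrative assumptions.

```python
import math


def partial_correlation(r_ab, r_ac, r_bc):
    """Correlation between a and b after removing the linear effect of c."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


g12, g13, g23 = 0.8, 0.6, 0.5

# G13.2: a = 1, b = 3, c = 2, so pass G13, then G12 and G23
g13_2 = partial_correlation(g13, g12, g23)
# G12.3: a = 1, b = 2, c = 3, so pass G12, then G13 and G23
g12_3 = partial_correlation(g12, g13, g23)
# G23.1: a = 2, b = 3, c = 1, so pass G23, then G12 and G13
g23_1 = partial_correlation(g23, g12, g13)

print(round(g13_2, 3), round(g12_3, 3), round(g23_1, 3))
```

The first value is the one asked for in Ex.3; the other two are included only to show how the argument order changes with the variable being held constant.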
Ex.4: From the data relating to the yield of dry
bark X1, height X2 and girth X3 for 18 cinchona
plants, the following correlation coefficients were
obtained: G12 = 0.77, G13 = 0.72 and G23 = 0.52.
Find the partial correlation coefficient G12.3 and
the multiple correlation coefficient R1.23.
• Ans: G12.3 = (G12 – G13*G23) / Sqrt((1 – G13^2)*(1 – G23^2))
• = (0.77 – 0.72*0.52) / Sqrt((1 – 0.72^2)*(1 – 0.52^2))
• = 0.3956 / 0.5928 = 0.667
• R1.23^2 = (G12^2 + G13^2 – 2*G12*G13*G23) / (1 – G23^2)
• = (0.77^2 + 0.72^2 – 2*0.77*0.72*0.52) / (1 – 0.52^2) = 0.7329
• R1.23 = Sqrt(0.7329) = 0.8561
Ex.5: In a trivariate distribution G21=0.7,
G23=G31=0.5, find (i) G23.1 (ii) R1.23
• Ans.5: (i) G23.1 = (G23 – G21*G31) / Sqrt((1 – G21^2)*(1 – G31^2))
• = (0.5 – 0.7*0.5) / Sqrt((1 – 0.49)*(1 – 0.25))
• = 0.2425
• (ii) R1.23^2 = (G12^2 + G13^2 – 2*G12*G13*G23) / (1 – G23^2)
• = (0.49 + 0.25 – 2*0.7*0.5*0.5) / (1 – 0.25) = 0.52
• R1.23 = Sqrt(0.52) = 0.7211