
REGRESSION ANALYSIS

MEANING
“The term regression analysis refers to the methods by which estimates are made
of the values of a variable from a knowledge of the values of one or more other
variables and to the measurement of the errors involved in this estimation
process.” Morris Hamburg

REGRESSION LINES
A regression line is a graphic technique for showing the functional relationship between
two variables X and Y, i.e. the independent and the dependent variable. It is a line which
shows the average relationship between X and Y, and is therefore called a line of
averages. It is also called an estimating line, as it gives the average estimated value
of the dependent variable (Y) for any given value of the independent variable (X).
METHODS OF CALCULATING REGRESSION EQUATIONS OR
DERIVATION OF REGRESSION LINES
The following are the two methods of forming the two regression equations, that is, the
equation of Y on X and the equation of X on Y.
a. Regression equations through normal equations.
b. Regression equations through regression coefficients.

1. Regression Equations through normal Equations


The two main equations generally used in regression analysis are:
i. Y on X,
ii. X on Y
For Y on X, the equation is
𝑌𝑐 = 𝑎 + 𝑏𝑋
For X on Y, the equation is
𝑋𝑐 = 𝑎 + 𝑏𝑌

Uday N
Assistant Prof. of Commerce
The Regression Equation of Y on X
The regression equation of Y on X can be written as 𝒀𝑪 = 𝒂 + 𝒃𝑿
We can arrive at the two normal equations as follows.
Given $Y = a + bX$    (i)
Summing Eq. (i) over all N observations, we get
$\sum Y = Na + b\sum X$    (ii)
Multiplying Eq. (i) by X and then summing, we get
$\sum XY = a\sum X + b\sum X^2$    (iii)
Equations (ii) and (iii) are called the normal equations.

The Regression Equation of X on Y


The regression of X on Y is expressed as
$X_c = a + bY$    (i)
For determining the values of 'a' and 'b' we form two normal equations, which
can be solved simultaneously.
Summing Eq. (i) over all N observations, we get $\sum X = Na + b\sum Y$    (ii)
Multiplying Eq. (i) by Y and then summing, we get $\sum XY = a\sum Y + b\sum Y^2$    (iii)
Equations (ii) and (iii) are the normal equations.
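
The two normal equations are a pair of simultaneous linear equations in a and b, so they can be solved mechanically. The Python sketch below is an illustration only: the function name fit_by_normal_equations and the sample points are assumptions made for this example and do not come from the notes.

```python
def fit_by_normal_equations(xs, ys):
    """Solve the two normal equations for a and b in Yc = a + bX.

    sum(Y)  = N*a + b*sum(X)
    sum(XY) = a*sum(X) + b*sum(X^2)
    """
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx  # zero only when every X value is identical
    if det == 0:
        raise ValueError("all X values are equal; the slope is undefined")
    b = (n * sxy - sx * sy) / det   # eliminate a between the two equations
    a = (sy - b * sx) / n           # back-substitute into sum(Y) = N*a + b*sum(X)
    return a, b


# Illustrative points lying exactly on Y = 1 + 2X (hypothetical data)
a, b = fit_by_normal_equations([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)
```

Passing the two series in the opposite order, fit_by_normal_equations(ys, xs), gives the constants of the X on Y equation in the same way.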

1. Given the bivariate data:
X 2 6 4 3 2 2 8 4
Y 7 2 1 1 2 3 2 6
a) Fit the regression line of Y on X and hence predict Y, if X=20
b) Fit the regression line of X on Y and hence predict X, if Y=5
Solution
In this case both the regression lines are needed. It is necessary to find ‘a’ and ‘b’.
Given X, Y can be estimated and Vice versa.
X     Y     X²    Y²    XY
2     7     4     49    14
6     2     36    4     12
4     1     16    1     4
3     1     9     1     3
2     2     4     4     4
2     3     4     9     6
8     2     64    4     16
4     6     16    36    24
∑X = 31   ∑Y = 24   ∑X² = 153   ∑Y² = 108   ∑XY = 83

In order to find the values of a and b, the two normal equations are solved simultaneously.
$\sum Y = Na + b\sum X$
$\sum XY = a\sum X + b\sum X^2$
Substituting the values, we get
$24 = 8a + 31b$    (i)
$83 = 31a + 153b$    (ii)
Multiplying Eq. (i) by 31 and Eq. (ii) by 8, we get
$744 = 248a + 961b$    (iii)
$664 = 248a + 1224b$    (iv)

Subtracting (iii) from (iv):
$248a + 1224b = 664$
$248a + 961b = 744$
-----------------------
$263b = -80$
$b = -\frac{80}{263} = -0.30$ (approx.)
Substituting $b = -0.30$ in Eq. (i), we get
$24 = 8a + 31(-0.30)$
$24 = 8a - 9.3$
$8a = 33.3$
$a = \frac{33.3}{8} = 4.1625$
Thus the regression of Y on X is
$\mathbf{Y = 4.1625 - 0.30X}$ Ans.

Similarly, we can obtain the second regression equation with the help of its two
normal equations. The regression equation of X on Y is
$X = a + bY$
The two normal equations are
$\sum X = Na + b\sum Y$
$\sum XY = a\sum Y + b\sum Y^2$
Substituting the values, we get
$31 = 8a + 24b$    (i)
$83 = 24a + 108b$    (ii)

Multiplying Eq. (i) by 3 and subtracting Eq. (ii) from it, we get
$93 = 24a + 72b$
$83 = 24a + 108b$
-----------------------
$10 = -36b$
$b = -\frac{10}{36} = -\frac{5}{18} = -0.28$
Putting this in Eq. (i):
$8a + (24)(-0.28) = 31$
$8a - 6.72 = 31$
$8a = 31 + 6.72$
$a = \frac{37.72}{8} = 4.715$
Thus, the regression of X on Y is
$\mathbf{X = 4.715 - 0.28Y}$ Ans.
(a) Now let us predict the value of Y if X = 20.
The Y on X regression equation is
$Y = 4.1625 - 0.30X$
$= 4.1625 - 0.30(20)$
$= 4.1625 - 6$
$\mathbf{Y = -1.8375}$ Ans.
(b) Now let us predict X if Y = 5.
$X = 4.715 - 0.28(5)$
$X = 4.715 - 1.40$
$\mathbf{X = 3.315}$ Ans.
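
The hand calculation above rounds both slopes to two decimals before substituting. As a cross-check, the sketch below solves the same normal equations with exact fractions. It is only an illustration: the helper name normal_equation_fit is an assumption of this example. With unrounded coefficients the predictions come out near -1.90 and 3.32, slightly different from -1.8375 and 3.315, and the gap is due to rounding alone.

```python
from fractions import Fraction

X = [2, 6, 4, 3, 2, 2, 8, 4]
Y = [7, 2, 1, 1, 2, 3, 2, 6]


def normal_equation_fit(u, v):
    """Return (a, b) of the line v = a + b*u as exact fractions."""
    n = len(u)
    su, sv = sum(u), sum(v)
    suu = sum(p * p for p in u)
    suv = sum(p * q for p, q in zip(u, v))
    b = Fraction(n * suv - su * sv, n * suu - su * su)
    a = (sv - b * su) / n
    return a, b


a_yx, b_yx = normal_equation_fit(X, Y)   # Y on X
a_xy, b_xy = normal_equation_fit(Y, X)   # X on Y

y_at_20 = a_yx + b_yx * 20   # predicted Y when X = 20
x_at_5 = a_xy + b_xy * 5     # predicted X when Y = 5

print("b_yx =", b_yx, " Y(20) =", float(y_at_20))
print("b_xy =", b_xy, " X(5)  =", float(x_at_5))
```

Note that X = 20 lies well outside the observed range of X (2 to 8), which is why the fitted line returns a negative Y.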

2. Regression Equations through coefficients
A regression coefficient is the constant by which the independent variable is
multiplied in a given relation. In the relation $Y = a + bX$, b (the slope of the regression line)
is the regression coefficient, since it is the multiplier of the independent variable X.
Regression equations or lines can easily be arrived at by the use of regression
coefficients. For this purpose, we are required to calculate the mean, standard deviation and
correlation coefficient of the given series. The following are the main methods of
calculating the regression coefficient of Y on X ($b_{yx}$) or of X on Y ($b_{xy}$).
1. Taking deviations from actual means.
2. Taking deviations from assumed means.
3. Applying actual observations.
4. Applying grouped data.

(1) Deviation taken from Actual Arithmetic Means of X and Y


In the above example, we found the regression lines directly from the
actual data. If we take deviations of the X and Y variables from their respective means, the
calculation is simplified. In that case the equation $Y_c = a + bX$ is written as
$Y - \bar{Y} = b(X - \bar{X})$
Taking $(Y - \bar{Y}) = y$ and $(X - \bar{X}) = x$, this becomes $y = bx$, which can be shown as follows.
We know the normal equations are
$\sum Y = Na + b\sum X$
$\sum XY = a\sum X + b\sum X^2$
Writing them in terms of x and y, we get
$\sum y = Na + b\sum x$    (i)
$\sum xy = a\sum x + b\sum x^2$    (ii)
Since the deviations are taken from the actual means,
$\sum x = 0$ and $\sum y = 0$
Therefore Eq. (i) reduces to
$Na = 0$ or $a = 0$

Eq. (ii) reduces to
$\sum xy = b\sum x^2$
$b = \frac{\sum xy}{\sum x^2}$
Thus, the regression equation of Y on X can be written as
$y = \frac{\sum xy}{\sum x^2}\, x$
$Y - \bar{Y} = b_{yx}(X - \bar{X})$
In the same way, the regression equation of X on Y,
$X = a + bY$
is reduced to $x = by$, where the value of b can be obtained as
$b = \frac{\sum xy}{\sum y^2}$
$x = \frac{\sum xy}{\sum y^2}\, y$
$X - \bar{X} = b_{xy}(Y - \bar{Y})$

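We know that the regression coefficient of X on Y, i.e. $b_{xy}$, is written as
$b_{xy} = \frac{\sum xy}{\sum y^2} = r\frac{\sigma_x}{\sigma_y}$
since
$r\frac{\sigma_x}{\sigma_y} = \frac{\sum xy}{N\sigma_x\sigma_y}\cdot\frac{\sigma_x}{\sigma_y} = \frac{\sum xy}{N\sigma_y^2} = \frac{\sum xy}{\sum y^2}$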
(i) Regression equation of X on Y. This can be written as

$X - \bar{X} = r\frac{\sigma_x}{\sigma_y}(Y - \bar{Y})$, where $r\frac{\sigma_x}{\sigma_y} = \frac{\sum xy}{\sum y^2}$
$r\frac{\sigma_x}{\sigma_y}$ is known as the regression coefficient of X on Y. It is denoted by $b_{xy}$.

(ii) Regression equation of Y on X

$Y - \bar{Y} = r\frac{\sigma_y}{\sigma_x}(X - \bar{X})$
$r\frac{\sigma_y}{\sigma_x}$ is known as the regression coefficient of Y on X. It is denoted by $b_{yx}$.

When deviations are taken from the actual means, the regression coefficient of Y on X
can be written as
$b_{yx} = r\frac{\sigma_y}{\sigma_x} = \frac{\sum xy}{\sum x^2}$

2. Given the bivariate data.
X 1 5 3 2 1 2 7 3
Y 6 1 0 0 1 2 1 5
Find the regression equations by taking deviations of items from the means of X and Y
respectively.
Solution:
X     x = (X - X̄)   x²    Y     y = (Y - Ȳ)   y²    xy
1     -2            4     6     4             16    -8
5     2             4     1     -1            1     -2
3     0             0     0     -2            4     0
2     -1            1     0     -2            4     2
1     -2            4     1     -1            1     2
2     -1            1     2     0             0     0
7     4             16    1     -1            1     -4
3     0             0     5     3             9     0
∑X = 24   ∑x² = 30   ∑Y = 16   ∑y² = 36   ∑xy = -10
Regression Equation of Y on X
$y = bx$    (i)
$b = \frac{\sum xy}{\sum x^2} = -\frac{10}{30}$
$\bar{X} = \frac{\sum X}{N} = \frac{24}{8} = 3$
$\bar{Y} = \frac{\sum Y}{N} = \frac{16}{8} = 2$
Substituting the values in (i), we get
$y = -\frac{10}{30}x = -0.33x$
But $y = (Y - 2)$ and $x = (X - 3)$
$Y - 2 = -0.33(X - 3)$
$Y - 2 = -0.33X + 0.99$
$Y = -0.33X + 2.99$
$\mathbf{Y = 2.99 - 0.33X}$

Regression equation of X on Y, i.e.
$x = by$
$b = \frac{\sum xy}{\sum y^2}$

Substituting the values, we get

$x = \frac{-10}{36}y$
$= -0.28y$
But $x = (X - 3)$ and $y = (Y - 2)$
$(X - 3) = -0.28(Y - 2)$
$X - 3 = -0.28Y + 0.56$
$\mathbf{X = 3.56 - 0.28Y}$
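
The deviation method used in this example can be reproduced in a few lines of Python. This is a sketch for checking the arithmetic, not a prescribed procedure, and the variable names are chosen for this illustration only. Without intermediate rounding the intercepts come out as 3.00 and 3.56, so the 2.99 obtained above reflects the slope having been rounded to -0.33.

```python
X = [1, 5, 3, 2, 1, 2, 7, 3]
Y = [6, 1, 0, 0, 1, 2, 1, 5]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

# Deviations from the actual means
x = [v - x_bar for v in X]
y = [v - y_bar for v in Y]

sum_xy = sum(p * q for p, q in zip(x, y))
sum_x2 = sum(p * p for p in x)
sum_y2 = sum(q * q for q in y)

b_yx = sum_xy / sum_x2   # regression coefficient of Y on X
b_xy = sum_xy / sum_y2   # regression coefficient of X on Y

# Intercepts follow from Y - y_bar = b_yx (X - x_bar) and X - x_bar = b_xy (Y - y_bar)
a_yx = y_bar - b_yx * x_bar
a_xy = x_bar - b_xy * y_bar

print(f"Y = {a_yx:.2f} + ({b_yx:.2f}) X")
print(f"X = {a_xy:.2f} + ({b_xy:.2f}) Y")
```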

(2) Deviations taken from Assumed Means of X and Y
In practice the means often come out as fractions, so for simplicity we take deviations from
assumed means. When the deviations are taken from assumed means, the procedure for
finding the regression equations remains the same. In the case of actual means, the regression
equation of X on Y is
$X - \bar{X} = r\frac{\sigma_x}{\sigma_y}(Y - \bar{Y})$
The value of $r\frac{\sigma_x}{\sigma_y}$ will now be obtained as follows:

$r\frac{\sigma_x}{\sigma_y} = \frac{\sum dx\,dy - \frac{\sum dx \times \sum dy}{N}}{\sum dy^2 - \frac{(\sum dy)^2}{N}} = b_{xy}$
where $dx = (X - A)$ and $dy = (Y - B)$, A and B being the assumed means of X and Y.
For $b_{xy}$ and $b_{yx}$ the numerator is the same; the only difference is in the
denominator. In the last method only the value of b had to be found. In this method, the
regression coefficients are found first and then used to write the regression equations.

1.
X 2 6 4 3 2 3 8 4
Y 7 2 1 1 2 3 2 6
Obtain regression equations taking deviations from 5 in case of X and 4 in case of Y.

Solution

X     dx = (X - 5)   dx²   Y     dy = (Y - 4)   dy²   dxdy
2     -3             9     7     +3             9     -9
6     +1             1     2     -2             4     -2
4     -1             1     1     -3             9     +3
3     -2             4     1     -3             9     +6
2     -3             9     2     -2             4     +6
3     -2             4     3     -1             1     +2
8     +3             9     2     -2             4     -6
4     -1             1     6     +2             4     -2
∑X = 32   ∑dx = -8   ∑dx² = 38   ∑Y = 24   ∑dy = -8   ∑dy² = 44   ∑dxdy = -2

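Regression equation of X on Y
$b_{xy} = \frac{\sum dx\,dy - \frac{\sum dx \times \sum dy}{N}}{\sum dy^2 - \frac{(\sum dy)^2}{N}}$

$b_{xy} = \frac{-2 - \frac{(-8)(-8)}{8}}{44 - \frac{(-8)^2}{8}}$

$b_{xy} = \frac{-2 - 8}{44 - 8} = -\frac{10}{36} = \mathbf{-0.28}$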

$\bar{X} = \frac{\sum X}{N} = \frac{32}{8} = 4$
$\bar{Y} = \frac{\sum Y}{N} = \frac{24}{8} = 3$
$(X - 4) = -0.28(Y - 3)$
$(X - 4) = -0.28Y + 0.84$
$\mathbf{X = 4.84 - 0.28Y}$
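Regression equation of Y on X
$(Y - \bar{Y}) = b_{yx}(X - \bar{X})$
$b_{yx} = \frac{\sum dx\,dy - \frac{\sum dx \times \sum dy}{N}}{\sum dx^2 - \frac{(\sum dx)^2}{N}}$

$b_{yx} = \frac{-2 - \frac{(-8)(-8)}{8}}{38 - \frac{(-8)^2}{8}}$
$= \frac{-2 - 8}{38 - 8} = -\frac{10}{30} = \mathbf{-0.33}$
Thus, the regression equation of Y on X will be
$Y - 3 = -0.33(X - 4)$
$Y = 1.32 - 0.33X + 3$
$\mathbf{Y = 4.32 - 0.33X}$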
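
A useful property of the assumed-mean formula is that the choice of assumed means does not affect the result. The sketch below illustrates this with the data of this example, taking the Y values as listed in the solution table. The function name and the parameter names A and B are assumptions made for this illustration.

```python
def coefficients_from_assumed_means(X, Y, A, B):
    """Return (b_xy, b_yx) using deviations dx = X - A and dy = Y - B."""
    n = len(X)
    dx = [v - A for v in X]
    dy = [v - B for v in Y]
    s_dx, s_dy = sum(dx), sum(dy)
    s_dxdy = sum(p * q for p, q in zip(dx, dy))
    s_dx2 = sum(p * p for p in dx)
    s_dy2 = sum(q * q for q in dy)
    numerator = s_dxdy - s_dx * s_dy / n       # shared by both coefficients
    b_xy = numerator / (s_dy2 - s_dy ** 2 / n)
    b_yx = numerator / (s_dx2 - s_dx ** 2 / n)
    return b_xy, b_yx


X = [2, 6, 4, 3, 2, 3, 8, 4]
Y = [7, 2, 1, 1, 2, 3, 2, 6]

first = coefficients_from_assumed_means(X, Y, A=5, B=4)    # assumed means used above
second = coefficients_from_assumed_means(X, Y, A=0, B=0)   # a different choice of origin
print(first)
print(second)
```

Both calls give the same pair of coefficients, about -0.28 and -0.33. Taking A = 0 and B = 0 amounts to working directly with the actual observations.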

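2. If $x = 0.85y$ and $y = 0.89x$ are the two regression equations and $\sigma_x = 3$, calculate $\sigma_y$ and r.

Solution:
Regression coefficient of X on Y: $b_{xy} = r\frac{\sigma_x}{\sigma_y} = 0.85$
Regression coefficient of Y on X: $b_{yx} = r\frac{\sigma_y}{\sigma_x} = 0.89$

$r = \sqrt{b_{xy} \cdot b_{yx}} = \sqrt{(0.85)(0.89)} = 0.87$

Now $b_{xy} = r\frac{\sigma_x}{\sigma_y}$, with $b_{xy} = 0.85$, $r = 0.87$ and $\sigma_x = 3$.

So, $0.85 = \frac{0.87(3)}{\sigma_y}$ or $\sigma_y = 3.07$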
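
The relation between r and the two regression coefficients can be sketched in Python as follows. The function name r_from_coefficients is an assumption of this illustration. The sign handling reflects the fact that r takes the common sign of the two coefficients, and the guard reflects that their product cannot exceed 1.

```python
import math


def r_from_coefficients(b_xy, b_yx):
    """Return r as the square root of b_xy * b_yx, carrying their common sign."""
    product = b_xy * b_yx
    if product < 0:
        raise ValueError("the two regression coefficients must have the same sign")
    if product > 1:
        raise ValueError("b_xy * b_yx cannot exceed 1")
    return math.copysign(math.sqrt(product), b_xy)


b_xy, b_yx, sigma_x = 0.85, 0.89, 3.0
r = r_from_coefficients(b_xy, b_yx)
sigma_y = r * sigma_x / b_xy   # rearranged from b_xy = r * sigma_x / sigma_y
print(round(r, 2), round(sigma_y, 2))   # prints 0.87 3.07
```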

Regression Equations or lines when actual observations are used.
The regression equations or lines can also be arrived at by using actual observations.
1. Find the regression lines from the data:
X 1 2 3 4 5
Y 11 20 17 25 27

Solution:
X     Y     X²    Y²    XY
1     11    1     121   11
2     20    4     400   40
3     17    9     289   51
4     25    16    625   100
5     27    25    729   135
∑X = 15   ∑Y = 100   ∑X² = 55   ∑Y² = 2164   ∑XY = 337

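Regression equation of X on Y is
$X - \bar{X} = b_{xy}(Y - \bar{Y})$
$b_{xy} = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{N}}{\sum Y^2 - \frac{(\sum Y)^2}{N}}$
$b_{xy} = \frac{337 - \frac{(15)(100)}{5}}{2164 - \frac{(100)^2}{5}}$
$= \frac{337 - 300}{2164 - 2000} = \frac{37}{164} = 0.226$

$b_{yx} = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{N}}{\sum X^2 - \frac{(\sum X)^2}{N}}$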

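$b_{yx} = \frac{337 - \frac{(15)(100)}{5}}{55 - \frac{(15)^2}{5}}$
$= \frac{337 - 300}{55 - 45} = \frac{37}{10} = 3.7$
$\bar{X} = \frac{\sum X}{N} = \frac{15}{5} = 3$
$\bar{Y} = \frac{\sum Y}{N} = \frac{100}{5} = 20$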
Regression equation of X on Y
𝑋 − 𝑋̅ = 𝑏𝑥𝑦 (𝑌 − 𝑌̅)
(𝑋 − 3) = 0.226(𝑌 − 20)
𝑋 − 3 = 0.226 𝑌 − 4.52
𝑿 = −𝟏. 𝟓𝟐 + 𝟎. 𝟐𝟐𝟔𝒀 𝑨𝒏𝒔.
Regression equation of Y on X
$(Y - \bar{Y}) = b_{yx}(X - \bar{X})$

$Y - 20 = 3.7(X - 3)$
$Y - 20 = 3.7X - 11.1$
$Y = 20 - 11.1 + 3.7X$
$\mathbf{Y = 8.9 + 3.7X}$ Ans.
Hence the two regression lines are:
$\mathbf{X = -1.52 + 0.226Y}$
$\mathbf{Y = 8.9 + 3.7X}$
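
As a check on this example, the sketch below recomputes both coefficients from the raw sums and also derives r from them using $r = \sqrt{b_{xy} \times b_{yx}}$. The variable names are illustrative assumptions. Without intermediate rounding the X on Y intercept is about -1.51; the value -1.52 above comes from using the rounded slope 0.226.

```python
import math

X = [1, 2, 3, 4, 5]
Y = [11, 20, 17, 25, 27]
n = len(X)

sum_x, sum_y = sum(X), sum(Y)
sum_xy = sum(p * q for p, q in zip(X, Y))
sum_x2 = sum(p * p for p in X)
sum_y2 = sum(q * q for q in Y)

# Same numerator for both coefficients; only the denominator changes
numerator = sum_xy - sum_x * sum_y / n
b_xy = numerator / (sum_y2 - sum_y ** 2 / n)
b_yx = numerator / (sum_x2 - sum_x ** 2 / n)

x_bar, y_bar = sum_x / n, sum_y / n
a_xy = x_bar - b_xy * y_bar   # intercept of X on Y
a_yx = y_bar - b_yx * x_bar   # intercept of Y on X

# Both coefficients are positive here, so r is the positive root
r = math.sqrt(b_xy * b_yx)

print(f"X = {a_xy:.3f} + {b_xy:.3f} Y")
print(f"Y = {a_yx:.3f} + {b_yx:.3f} X")
print(f"r = {r:.3f}")
```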

MISCELLANEOUS SOLVED EXAMPLES
1. Given that the means of X and Y are 65 and 67, their standard deviations are 2.5
and 3.5 respectively and the coefficient of correlation between them is 0.8.
i. Write down the regression lines.
ii. Obtain the best estimate of X when Y=70.
iii. Using the estimated value of X as the given value of X, estimate the
corresponding value of Y.
Solution:
i. Regression line of Y on X
$Y - \bar{Y} = r\frac{\sigma_y}{\sigma_x}(X - \bar{X})$
$Y - 67 = 0.8 \times \frac{3.5}{2.5}(X - 65)$
$Y - 67 = 1.12X - 65 \times 1.12$
$Y - 67 = 1.12X - 72.8$
$Y = 1.12X - 72.8 + 67$
$Y = 1.12X - 5.8$

Regression line of X on Y

$X - \bar{X} = r\frac{\sigma_x}{\sigma_y}(Y - \bar{Y})$
$X - 65 = 0.8 \times \frac{2.5}{3.5}(Y - 67)$
$X - 65 = 0.571(Y - 67)$
$X - 65 = 0.571Y - 38.257$

$X = 0.571Y - 38.257 + 65$
$X = 0.571Y + 26.743$
ii. The best estimate of X when Y = 70 is obtained from the regression equation of X on Y.
$X = 0.571(70) + 26.743$
$= 39.97 + 26.743 = \mathbf{66.713}$
iii. When X = 66.713, the regression equation of Y on X gives
$Y = 1.12(66.713) - 5.8$
$Y = 74.72 - 5.8 = \mathbf{68.92}$
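
When only the means, standard deviations and r are given, both lines follow directly from $b_{yx} = r\sigma_y/\sigma_x$ and $b_{xy} = r\sigma_x/\sigma_y$. The sketch below applies this to the figures of this example; the function name regression_lines and its parameter names are assumptions made for illustration.

```python
def regression_lines(mean_x, mean_y, sd_x, sd_y, r):
    """Return ((a_yx, b_yx), (a_xy, b_xy)) for Y = a_yx + b_yx*X and X = a_xy + b_xy*Y."""
    b_yx = r * sd_y / sd_x
    b_xy = r * sd_x / sd_y
    a_yx = mean_y - b_yx * mean_x   # both lines pass through (mean_x, mean_y)
    a_xy = mean_x - b_xy * mean_y
    return (a_yx, b_yx), (a_xy, b_xy)


(a_yx, b_yx), (a_xy, b_xy) = regression_lines(65, 67, 2.5, 3.5, 0.8)

x_hat = a_xy + b_xy * 70      # best estimate of X when Y = 70
y_hat = a_yx + b_yx * x_hat   # Y estimated from that value of X

print(f"Y = {b_yx:.2f} X + ({a_yx:.2f})")
print(f"X = {b_xy:.3f} Y + ({a_xy:.3f})")
print(f"X at Y = 70: {x_hat:.3f};  Y at that X: {y_hat:.2f}")
```

Note that estimating X from Y = 70 and then Y from that X does not return 70. The two regression lines are different lines unless r = ±1.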

2. The correlation coefficient between the variables X and Y is $r = 0.60$. If $\sigma_x = 1.50$,
$\sigma_y = 2.00$, $\bar{X} = 10$ and $\bar{Y} = 20$, find the equations of the regression lines.
i. Y on X
ii. X on Y
Solution:
(i) Equation of the regression line of Y on X is
$Y - \bar{Y} = r\frac{\sigma_y}{\sigma_x}(X - \bar{X})$
$Y - 20 = 0.6 \times \frac{2}{1.50}(X - 10)$
$Y - 20 = 0.8X - 8$
$Y = 0.8X - 8 + 20$
$\mathbf{Y = 0.8X + 12}$

(ii) Equation of the regression line of X on Y is

$X - \bar{X} = r\frac{\sigma_x}{\sigma_y}(Y - \bar{Y})$
$X - 10 = 0.6 \times \frac{1.5}{2.0}(Y - 20)$
$X - 10 = 0.45(Y - 20)$
$X - 10 = 0.45Y - 9$
$X = 0.45Y - 9 + 10$
$\mathbf{X = 0.45Y + 1}$

3. You are given the following data:
                        X     Y
Arithmetic mean         36    85
Standard deviation      11    8
Correlation coefficient between X and Y = 0.66
(i) Find the two regression equations
(ii) Estimate value of X when Y=75
Solution:
(i) Regression equation of Y on X
$Y - \bar{Y} = r\frac{\sigma_y}{\sigma_x}(X - \bar{X})$
$Y - 85 = 0.66 \times \frac{8}{11}(X - 36)$
$Y - 85 = 0.48(X - 36)$
$Y - 85 = 0.48X - 17.28$
$Y = 0.48X - 17.28 + 85$
$\mathbf{Y = 0.48X + 67.72}$

Regression equation of X on Y
$X - \bar{X} = r\frac{\sigma_x}{\sigma_y}(Y - \bar{Y})$

$X - 36 = 0.66 \times \frac{11}{8}(Y - 85)$

$X - 36 = 0.908(Y - 85)$

$X - 36 = 0.908Y - 77.18$

$X = 0.908Y - 77.18 + 36$

$\mathbf{X = 0.908Y - 41.18}$

(ii) To estimate the value of X when Y = 75, we use the regression equation of X on Y:
$X = 0.908(75) - 41.18$
$= 68.1 - 41.18 = \mathbf{26.92}$

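4. From the following data calculate (i) the coefficient of correlation and (ii) the standard
deviation of Y.

$X = 0.854Y$; $Y = 0.89X$; $\sigma_x = 3$

Solution

$X = 0.854Y$ means that $b_{xy} = 0.854$

$Y = 0.89X$ means that $b_{yx} = 0.89$
Since $b_{xy}$ and $b_{yx}$ are positive, r is also positive:
$r = \sqrt{b_{xy} \times b_{yx}} = \sqrt{0.854 \times 0.89} = 0.87$
$b_{xy} = r\frac{\sigma_x}{\sigma_y} = 0.854$ or $0.87 \times \frac{3}{\sigma_y} = 0.854$
$0.854\,\sigma_y = 2.61$
$\sigma_y = \frac{2.61}{0.854} = 3.06$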
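5. From the following regression equations, calculate $\bar{X}$, $\bar{Y}$ and r.
$20X - 9Y = 107$
$4X - 5Y = -33$
Solution
Calculation of $\bar{X}$ and $\bar{Y}$
Both regression lines pass through the point $(\bar{X}, \bar{Y})$, so the means are found by solving the two equations simultaneously.
Given
$20X - 9Y = 107$    (i)
$4X - 5Y = -33$    (ii)
Multiplying Eq. (ii) by 5 and deducting it from Eq. (i):
$20X - 25Y = -165$
$16Y = 272$
$Y = 17$ or $\bar{Y} = 17$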

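Putting the value of Y in Eq. (i):
$20X - 9(17) = 107$
$20X = 107 + 153 = 260$
$X = \frac{260}{20} = 13$ or $\bar{X} = 13$
Calculation of r
Let us assume that Eq. (i) is the regression equation of X on Y and Eq. (ii) that of Y on X.
$X = \frac{107}{20} + \frac{9}{20}Y$, or $b_{xy} = \frac{9}{20}$
$Y = \frac{33}{5} + \frac{4}{5}X$, or $b_{yx} = \frac{4}{5}$

$r = \sqrt{b_{xy} \times b_{yx}} = \sqrt{\frac{9}{20} \times \frac{4}{5}} = \mathbf{0.6}$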
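
The steps of this example can be sketched in Python as follows: the means are the intersection of the two lines, and the coefficients are read off after assuming which line is which. The function name means_and_r and the (p, q, c) encoding of a line pX + qY = c are assumptions of this illustration. The swap inside the function reflects the fact that the product of the two regression coefficients cannot exceed 1, so a product above 1 means the initial assumption was the wrong way round.

```python
import math
from fractions import Fraction


def means_and_r(line1, line2):
    """Each line is (p, q, c), meaning p*X + q*Y = c.

    Starts by assuming line1 is X on Y and line2 is Y on X,
    and swaps the roles if that gives b_xy * b_yx > 1.
    """
    p1, q1, c1 = line1
    p2, q2, c2 = line2

    # Intersection of the two lines (Cramer's rule) gives the means
    det = p1 * q2 - p2 * q1
    x_bar = Fraction(c1 * q2 - c2 * q1, det)
    y_bar = Fraction(p1 * c2 - p2 * c1, det)

    b_xy = Fraction(-q1, p1)   # from X = c1/p1 - (q1/p1) Y
    b_yx = Fraction(-p2, q2)   # from Y = c2/q2 - (p2/q2) X
    if b_xy * b_yx > 1:        # impossible, so the roles must be the other way round
        b_xy = Fraction(-q2, p2)
        b_yx = Fraction(-p1, q1)

    r = math.copysign(math.sqrt(b_xy * b_yx), b_xy)
    return x_bar, y_bar, r


x_bar, y_bar, r = means_and_r((20, -9, 107), (4, -5, -33))
print(x_bar, y_bar, round(r, 2))
```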

6. Find the most likely production corresponding to a rainfall of 40" from the
following data.
                          Rainfall (X)   Production (Y)
Average                   30"            500 kg
Standard deviation        5"             100 kg
Coefficient of correlation = 0.8
Solution
As production depends on rainfall, we denote production by Y and rainfall by X.
We therefore have to fit the regression equation of Y on X to find the likely
production.
$Y - \bar{Y} = r\frac{\sigma_y}{\sigma_x}(X - \bar{X})$
or $Y - 500 = 0.8 \times \frac{100}{5}(X - 30)$
$Y - 500 = 16(X - 30)$
or $Y - 500 = 16X - 480$
$Y = 16X + 20$
When X is 40, Y shall be

𝑌 = (16)(40) + 20 = 𝟔𝟎𝟎
7. You are given the following data:
X Y
Average 36 85
Standard Deviation 11 8
r = 0.66
(i) Find two regression equations
(ii) Estimate the value of X when Y = 75
Solution
(i) Regression equation of X on Y:
$X - \bar{X} = r\frac{\sigma_x}{\sigma_y}(Y - \bar{Y})$
$X - 36 = 0.66 \times \frac{11}{8}(Y - 85)$
$X - 36 = 0.908(Y - 85)$
$X - 36 = 0.908Y - 77.18$
$\mathbf{X = 0.908Y - 41.18}$

Regression equation of Y on X:

$Y - \bar{Y} = r\frac{\sigma_y}{\sigma_x}(X - \bar{X})$
$Y - 85 = 0.66 \times \frac{8}{11}(X - 36)$
$Y - 85 = 0.48(X - 36)$
$Y - 85 = 0.48X - 17.28$
$\mathbf{Y = 0.48X + 67.72}$

(ii) By putting Y = 75 in the regression equation of X on Y, we find the estimated value of X:

$X = 0.908(75) - 41.18$
$X = 68.1 - 41.18 = 26.92$
Thus $\mathbf{X = 26.92}$

8. You are given below the following information about advertisement and sales.
        Adv. Exp. (X) (Rs. crs.)   Sales (Y) (Rs. crs.)
Mean    20                         120
SD      5                          25
r = 0.8
(i) Calculate the two regression equations.
(ii) Find the likely sales when advertisement expenditure is Rs. 25 crores.
(iii) What should the advertisement budget be if the company wants to attain a sales target
of Rs. 150 crores?
Solution
(i) Regression equation of X on Y
$X - \bar{X} = r\frac{\sigma_x}{\sigma_y}(Y - \bar{Y})$
$X - 20 = 0.8 \times \frac{5}{25}(Y - 120)$
$X - 20 = 0.16(Y - 120)$
$X - 20 = 0.16Y - 19.2$
$\mathbf{X = 0.16Y + 0.8}$

Regression equation of Y on X
$Y - \bar{Y} = r\frac{\sigma_y}{\sigma_x}(X - \bar{X})$
$Y - 120 = 0.8 \times \frac{25}{5}(X - 20)$
$Y - 120 = 4(X - 20)$
$Y - 120 = 4X - 80$
$\mathbf{Y = 4X + 40}$

(ii) From the regression equation of Y on X, we can find the likely sales when
advertisement expenditure is Rs. 25 crores.
$Y = 4(25) + 40 = \mathbf{140}$ crores
(iii) From the regression equation of X on Y, we can find the advertisement budget if the sales target is Rs. 150 crores.
$X = 0.16(150) + 0.8$
$= 24 + 0.8 = \mathbf{24.8}$ crores

To be solved
a. Obtain the regression equations for the following:
X 15 27 27 30 34 38 46
Y 120 140 150 170 180 200 250

b. Two regression equations are:


𝑌 𝑜𝑛 𝑋 ∶ 10𝑋 + 9𝑌 + 7 = 0
𝑋 𝑜𝑛 𝑌 ∶ 2𝑋 + 5𝑌 + 10 = 0

c. Find the two regression lines for the data:


X 1 2 3 4 5
Y 1 20 17 25 27

d. Find the coefficient of correlation from the following regression equations;


3𝑦 − 2𝑥 − 10 = 0
2𝑦 − 50 − 𝑥 = 0
Also find the estimated value of Y when X =10.

e. The following table shows the ages (X) and blood pressure (Y) of 8 persons:
X 52 63 45 36 72 65 47 25
Y 62 53 51 25 79 43 60 33
Obtain the two regression equations. Also find the expected blood pressure of a 49-year-old
person.
