Regression

Regression is a mathematical measure of the average relationship between two or more variables.

When only two variables are considered, the relationship is called bivariate regression; when more than two variables are considered, it is called multivariate regression.

In a bivariate distribution, one variable is called the independent, regressor, or explanatory variable, while the other is called the dependent, regressed, or explained variable.

Regression Lines

The regression line is the line of best fit, obtained by the method of least squares. For two variables there are two regression lines:

Regression line of Y on X

Regression line of X on Y

Regression line of Y on X

Let there be a set of n pairs of observations of X and Y, given as (X1, Y1), (X2, Y2), (X3, Y3), ..., (Xn, Yn).

Let the regression line of Y on X be

$$ Y = a + bX \qquad (1) $$

where X is the independent variable and Y is the dependent variable.

And the corresponding normal equations are

$$ \sum_{i} Y_i = an + b \sum_{i} X_i \qquad (2) $$

$$ \sum_{i} X_i Y_i = a \sum_{i} X_i + b \sum_{i} X_i^2 \qquad (3) $$
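
One way to carry out the solution of (2) and (3) is sketched here, assuming $\bar{X} = \frac{1}{n}\sum_i X_i$ and $\bar{Y} = \frac{1}{n}\sum_i Y_i$ denote the sample means. Dividing (2) by n gives the intercept, and substituting it into (3) isolates the slope:

```latex
\[
\begin{aligned}
a &= \bar{Y} - b\,\bar{X} \\
\sum_i X_i Y_i &= (\bar{Y} - b\,\bar{X}) \sum_i X_i + b \sum_i X_i^2 \\
b &= \frac{\sum_i X_i Y_i - n\,\bar{X}\,\bar{Y}}{\sum_i X_i^2 - n\,\bar{X}^2}
   = \frac{n \sum_i X_i Y_i - \left(\sum_i X_i\right)\left(\sum_i Y_i\right)}
          {n \sum_i X_i^2 - \left(\sum_i X_i\right)^2}
\end{aligned}
\]
```

Putting $a = \bar{Y} - b\,\bar{X}$ back into (1) shows that the fitted line passes through the point of means $(\bar{X}, \bar{Y})$.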

Solving the above normal equations, we get the regression of Y on X in the following form:

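$$ Y - \bar{Y} = r_{XY} \frac{\sigma_Y}{\sigma_X} (X - \bar{X}) $$

where $r_{XY}$ is the correlation coefficient between X and Y, and $\sigma_X$ and $\sigma_Y$ are the standard deviations of X and Y. The regression coefficient of Y on X is

$$ b_{YX} = r_{XY} \frac{\sigma_Y}{\sigma_X} $$

$$ b_{YX} = \frac{n \sum_{i} X_i Y_i - \left(\sum_{i} X_i\right)\left(\sum_{i} Y_i\right)}{n \sum_{i} X_i^2 - \left(\sum_{i} X_i\right)^2} $$

Thus, we get the regression line of Y on X as given below.

$$ Y - \bar{Y} = b_{YX} (X - \bar{X}) $$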
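
The summation formula for the slope translates directly into code. The following is a minimal sketch, not part of the original notes: the function name `fit_y_on_x` and the sample data are made up for illustration, and the intercept is taken as the mean of Y minus the slope times the mean of X.

```python
def fit_y_on_x(xs, ys):
    """Return (a, b) for the least-squares line Y = a + bX."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (X, Y) pairs of equal length")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    denom = n * sum_xx - sum_x ** 2
    if denom == 0:
        raise ValueError("all X values are equal; the slope is undefined")
    b = (n * sum_xy - sum_x * sum_y) / denom  # regression coefficient b_YX
    a = sum_y / n - b * sum_x / n             # intercept: Ybar - b * Xbar
    return a, b


# Hypothetical data lying exactly on Y = 1 + 2X
a, b = fit_y_on_x([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)  # 1.0 2.0
```

Swapping the two argument lists gives the coefficients of the regression line of X on Y, since that line simply exchanges the roles of the two variables.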

Regression line of X on Y

Let there be a set of n pairs of observations of X and Y, given as (X1, Y1), (X2, Y2), (X3, Y3), ..., (Xn, Yn).

Let the regression line of X on Y be

$$ X = a + bY \qquad (1) $$

where Y is the independent variable and X is the dependent variable.

And the corresponding normal equations are

$$ \sum_{i} X_i = an + b \sum_{i} Y_i \qquad (2) $$

$$ \sum_{i} X_i Y_i = a \sum_{i} Y_i + b \sum_{i} Y_i^2 \qquad (3) $$

Solving the above normal equations, we get the regression of X on Y in the following form:

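$$ X - \bar{X} = r_{XY} \frac{\sigma_X}{\sigma_Y} (Y - \bar{Y}) $$

$$ b_{XY} = r_{XY} \frac{\sigma_X}{\sigma_Y} $$

$$ b_{XY} = \frac{n \sum_{i} X_i Y_i - \left(\sum_{i} X_i\right)\left(\sum_{i} Y_i\right)}{n \sum_{i} Y_i^2 - \left(\sum_{i} Y_i\right)^2} $$

Thus, we get the regression line of X on Y as given below.

$$ X - \bar{X} = b_{XY} (Y - \bar{Y}) $$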
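
The worked examples recover the correlation coefficient from the two regression coefficients. A short derivation of that relation, using only the definitions of the two coefficients in terms of the correlation coefficient and the standard deviations:

```latex
\[
b_{YX} \, b_{XY}
  = \left( r_{XY} \frac{\sigma_Y}{\sigma_X} \right)
    \left( r_{XY} \frac{\sigma_X}{\sigma_Y} \right)
  = r_{XY}^2
\qquad\Longrightarrow\qquad
r_{XY} = \pm \sqrt{b_{YX} \, b_{XY}}
\]
```

Because the standard deviations are positive, both coefficients carry the sign of the correlation coefficient, so the root takes their common sign. Since the magnitude of the correlation coefficient is at most 1, the product of the two coefficients cannot exceed 1.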

Example-1

The following data relate to the advertising expenditure and sales.

Advertising expenditure (Rs lakhs)   12    5   10    3    4   18   12   16
Sales (Rs lakhs)                     83   75   80   78   68   89   88   87

Find the regression line of advertising expenditure on sales, the regression line of sales on advertising expenditure, and the correlation coefficient between advertising expenditure and sales.

Solution-

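Let X denote advertising expenditure and Y denote sales (both in Rs lakhs).

 X     X²     Y     Y²     XY
12    144    83   6889    996
 5     25    75   5625    375
10    100    80   6400    800
 3      9    78   6084    234
 4     16    68   4624    272
18    324    89   7921   1602
12    144    88   7744   1056
16    256    87   7569   1392

$$ \sum_{i} X_i = 80, \quad \sum_{i} X_i^2 = 1018, \quad \sum_{i} Y_i = 648, \quad \sum_{i} Y_i^2 = 52856, \quad \sum_{i} X_i Y_i = 6727 $$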

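$$ \bar{X} = \frac{\sum_{i} X_i}{n} = \frac{80}{8} = 10; \qquad \bar{Y} = \frac{\sum_{i} Y_i}{n} = \frac{648}{8} = 81 $$

$$ b_{YX} = \frac{n \sum_{i} X_i Y_i - \left(\sum_{i} X_i\right)\left(\sum_{i} Y_i\right)}{n \sum_{i} X_i^2 - \left(\sum_{i} X_i\right)^2} = \frac{8 \times 6727 - 80 \times 648}{8 \times 1018 - (80)^2} = 1.13 $$

$$ b_{XY} = \frac{n \sum_{i} X_i Y_i - \left(\sum_{i} X_i\right)\left(\sum_{i} Y_i\right)}{n \sum_{i} Y_i^2 - \left(\sum_{i} Y_i\right)^2} = \frac{8 \times 6727 - 80 \times 648}{8 \times 52856 - (648)^2} = 0.67 $$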

Regression line of advertising expenditure on sales

$$ X - \bar{X} = b_{XY} (Y - \bar{Y}) $$
$$ X - 10 = 0.67 (Y - 81) $$
$$ X = 0.67Y - 54.27 + 10 $$
$$ X = 0.67Y - 44.27 $$

Regression line of sales on advertising expenditure

$$ Y - \bar{Y} = b_{YX} (X - \bar{X}) $$
$$ Y - 81 = 1.13 (X - 10) $$
$$ Y = 1.13X + 69.7 $$

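Coefficient of correlation:

$$ r_{XY} = \sqrt{b_{XY} \times b_{YX}} = \sqrt{0.67 \times 1.13} = 0.87 $$

The positive root is taken because both regression coefficients are positive.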
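
The arithmetic of Example-1 can be checked with a short script. This is a sketch using only the data given in the example; the variable names are illustrative, and the script works at full precision rather than with the two-decimal slopes used in the hand calculation.

```python
import math

# Data from Example-1: X = advertising expenditure, Y = sales (Rs lakhs)
X = [12, 5, 10, 3, 4, 18, 12, 16]
Y = [83, 75, 80, 78, 68, 89, 88, 87]

n = len(X)
sum_x, sum_y = sum(X), sum(Y)
sum_xx = sum(x * x for x in X)
sum_yy = sum(y * y for y in Y)
sum_xy = sum(x * y for x, y in zip(X, Y))

x_bar, y_bar = sum_x / n, sum_y / n

# Both regression coefficients share the same numerator
numerator = n * sum_xy - sum_x * sum_y
b_yx = numerator / (n * sum_xx - sum_x ** 2)
b_xy = numerator / (n * sum_yy - sum_y ** 2)

# Intercepts of Y = a_yx + b_yx * X and X = a_xy + b_xy * Y
a_yx = y_bar - b_yx * x_bar
a_xy = x_bar - b_xy * y_bar

# r takes the common sign of the two regression coefficients
r = math.copysign(math.sqrt(b_yx * b_xy), b_yx)

print(f"b_YX = {b_yx:.2f}, b_XY = {b_xy:.2f}, r = {r:.2f}")
print(f"Y = {b_yx:.2f}X {a_yx:+.2f}")
print(f"X = {b_xy:.2f}Y {a_xy:+.2f}")
```

The slopes and the correlation coefficient agree with the hand calculation to two decimals. The intercepts come out near 69.67 and -44.37, slightly different from 69.7 and -44.27, because the hand calculation rounds the slopes before multiplying by the means.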

Example-2

In a study relating to the prices of two shares, X and Y, the following two
regression lines were found, where the share prices were expressed in rupees:

5X -145 = -10Y
14Y-208 = -8X

The standard deviation of the prices of share X = 2. You are required to calculate:
(i) Coefficient of correlation between the prices of shares X and Y
(ii) Standard deviation of the prices of shares Y
(iii) Average prices of shares X and Y

Solution-
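(i) Suppose the regression equation of X on Y is

$$ 5X - 145 = -10Y $$
$$ X = -\frac{10Y}{5} + \frac{145}{5} $$
$$ X = -2Y + 29 $$
$$ b_{XY} = -2 $$

and the regression equation of Y on X is

$$ 14Y - 208 = -8X $$
$$ Y = \frac{208}{14} - \frac{8X}{14}, \qquad b_{YX} = -\frac{8}{14} $$

Then

$$ r = \sqrt{b_{YX} \times b_{XY}} = \sqrt{\left(-\frac{8}{14}\right) \times (-2)} = 1.07 $$

But the magnitude of r cannot exceed one. Therefore we arrive at a contradiction, and this assignment of the two equations must be wrong.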


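Now let the regression equation of X on Y be

$$ 14Y - 208 = -8X $$
$$ X = \frac{208}{8} - \frac{14}{8} Y $$
$$ b_{XY} = -\frac{14}{8} $$

and the regression equation of Y on X be

$$ 5X - 145 = -10Y $$
$$ 10Y = 145 - 5X $$
$$ Y = \frac{145}{10} - \frac{5}{10} X $$
$$ b_{YX} = -\frac{5}{10} = -\frac{1}{2} $$

$$ r = \sqrt{b_{YX} \times b_{XY}} = \sqrt{\left(-\frac{1}{2}\right) \times \left(-\frac{14}{8}\right)} = 0.93 $$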

Since $b_{YX}$ and $b_{XY}$ are both negative, r must also be negative.

Hence r = -0.93
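
The trial-and-error used in part (i) can be sketched in code: try both ways of assigning the two lines, and keep the one whose regression coefficients have a product no greater than 1. The function names `slopes` and `correlation` are hypothetical, each line is assumed to be written as pX + qY = c with nonzero p and q, and the sketch simply returns the first valid assignment it finds.

```python
import math


def slopes(line_x_on_y, line_y_on_x):
    """Each line is (p, q, c) for pX + qY = c. Returns (b_XY, b_YX)."""
    p1, q1, _ = line_x_on_y
    p2, q2, _ = line_y_on_x
    b_xy = -q1 / p1  # solve the first line for X: X = c/p - (q/p) Y
    b_yx = -p2 / q2  # solve the second line for Y: Y = c/q - (p/q) X
    return b_xy, b_yx


def correlation(line_a, line_b):
    """Try both role assignments; keep the one with 0 <= b_XY * b_YX <= 1."""
    for x_on_y, y_on_x in ((line_a, line_b), (line_b, line_a)):
        b_xy, b_yx = slopes(x_on_y, y_on_x)
        product = b_xy * b_yx
        if 0 <= product <= 1:
            # r carries the common sign of the two coefficients
            return math.copysign(math.sqrt(product), b_yx)
    raise ValueError("no valid assignment of the two lines")


# 5X - 145 = -10Y  becomes  5X + 10Y = 145
# 14Y - 208 = -8X  becomes  8X + 14Y = 208
r = correlation((5, 10, 145), (8, 14, 208))
print(round(r, 4))  # -0.9354
```

At full precision the value is about -0.9354, which the hand calculation reports as -0.93.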

(ii)

$$ b_{YX} = r \frac{\sigma_Y}{\sigma_X} $$
$$ -\frac{1}{2} = -0.93 \times \frac{\sigma_Y}{2} $$
$$ \sigma_Y = \frac{2}{0.93 \times 2} $$
$$ \sigma_Y = 1.075 $$

Standard Deviation of Y = 1.075

(iii)
The two regression lines intersect at the point of means $(\bar{X}, \bar{Y})$. So, to find the average prices of X and Y, solve the two regression equations simultaneously:

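$$ 5X - 145 = -10Y $$
$$ 14Y - 208 = -8X $$
$$ 5X + 10Y = 145 \qquad (ii) $$
$$ 8X + 14Y = 208 \qquad (iii) $$

Multiplying (ii) by 8 and (iii) by 5 and subtracting, we get

$$ 10Y = 120, \qquad Y = 12 $$

Now substituting the value of Y in (ii), we get

$$ 5X + 10(12) = 145 $$
$$ X = 5 $$

Hence $\bar{X} = 5$ and $\bar{Y} = 12$; that is, the average prices of shares X and Y are Rs 5 and Rs 12 respectively.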
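
The simultaneous solution can be verified with Cramer's rule. This is a small sketch; the helper name `solve_2x2` is made up for illustration, and each equation is assumed to be written in the form aX + bY = c.

```python
def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*X + b1*Y = c1 and a2*X + b2*Y = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("lines are parallel or identical; no unique solution")
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y


# 5X + 10Y = 145 and 8X + 14Y = 208, the two regression lines of Example-2
x_bar, y_bar = solve_2x2(5, 10, 145, 8, 14, 208)
print(x_bar, y_bar)  # 5.0 12.0
```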
