
REGRESSION ANALYSIS
INTRODUCTION
The term "regression" was introduced in statistics by Sir Francis Galton in his 1886 research paper 'Regression towards Mediocrity in Hereditary Stature'.
He reached the following conclusions:

• Tall fathers had tall sons and short fathers had short sons.
• The mean height of the sons of a group of tall fathers was less than that of their fathers, and the mean height of the sons of a group of short fathers was greater than that of their fathers.
Definition:
Regression is the measure of the average relationship between two or more variables in terms of the original units of the data.

Utility:

Regression analysis is a statistical method used in fields where two or more correlated series show a tendency to move back towards the general average. It is especially useful in economics and business, where management uses it as a control tool and as an aid to decision-making.

Regression analysis can also be used in other fields, such as the natural, physical and social sciences. A reliable estimate is possible only if the series are correlated. The analysis can also be extended to more than two series.

Some functional highlights
• It helps us estimate the dependent variable with the help of the independent variables.
• It helps us measure the error involved in using the regression lines as the basis for estimation.
• It gives a measure of the degree of association or correlation that exists between the two variables, i.e. the dependent variable and the independent variable.
Types of Regression
Regression:
• Simple Regression
• Multiple Regression

Variables:
• Dependent variable
• Independent variable
Regression Lines
The lines of best fit drawn to show the mutual relationship between the X and Y variables are known as regression lines. For two variables we have two regression lines, one representing the regression of X on Y and the other the regression of Y on X. The line representing the regression of X on Y treats Y as the independent variable and X as the dependent variable, and gives the best estimate of X for a given value of Y. In the same way, the second line represents the regression of Y on X and gives the best estimate of Y for a given value of X.
Functions of Regression Lines
I. Best Estimate
II. Extent and Direction of Correlation
• Positive Correlation
• Negative Correlation
• Perfect Correlation
• Absence of Correlation
• Limited Correlation
Regression Equations
Regression equations are the algebraic form of regression lines. They are also known as estimating equations. As there are two regression lines, we have two regression equations.

• Regression Equation of X on Y: describes the variation in the value of X for given changes in Y.
• Regression Equation of Y on X: describes the variation in the value of Y for given changes in X.
Regression Equation of X on Y
The regression equation of X on Y is written in the form X = a + bY.
From this equation we obtain the best estimate of X for a given value of Y. Plotting the estimated values of X against the known values of Y gives the regression line of X on Y.
To determine the values of a and b, the following two normal equations are solved simultaneously:

$$\sum X = Na + b\sum Y$$
$$\sum XY = a\sum Y + b\sum Y^2$$
Regression Equation of Y on X
The regression equation of Y on X is written in the form Y = a + bX.
From this equation we obtain the best estimate of Y for a given value of X. Plotting the estimated values of Y against the known values of X gives the regression line of Y on X.
To determine the values of a and b, the following two normal equations are solved simultaneously:

$$\sum Y = Na + b\sum X$$
$$\sum XY = a\sum X + b\sum X^2$$
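The two normal equations can also be solved once in general form, which avoids repeating the elimination for every data set. The following is a sketch of that elimination for the line of Y on X; the line of X on Y follows by interchanging X and Y. The symbols $\bar{X}$ and $\bar{Y}$ for the means are used here only for convenience.

```latex
\begin{aligned}
\sum Y &= Na + b\sum X, \qquad \sum XY = a\sum X + b\sum X^2 \\
N\sum XY - \sum X \sum Y &= b\left(N\sum X^2 - \Bigl(\sum X\Bigr)^2\right) \\
b &= \frac{N\sum XY - \sum X \sum Y}{N\sum X^2 - \bigl(\sum X\bigr)^2},
\qquad
a = \frac{\sum Y - b\sum X}{N} = \bar{Y} - b\,\bar{X}
\end{aligned}
```

The second line comes from multiplying the second normal equation by N, multiplying the first by $\sum X$, and subtracting.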
Example
Calculate the regression equations of X on Y and Y on X from the following data:

X    2    3    4    5    6
Y    3    4    5    6    7

  X     Y     X²     Y²     XY
  2     3      4      9      6
  3     4      9     16     12
  4     5     16     25     20
  5     6     25     36     30
  6     7     36     49     42

∑X = 20    ∑Y = 25    ∑X² = 90    ∑Y² = 135    ∑XY = 110
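As an illustration, the normal equations for this data set can be solved numerically. The sketch below is one possible way to do it, not a prescribed method; the function name `fit_by_normal_equations` is hypothetical, and the sums are recomputed directly from the X and Y values rather than copied from the table.

```python
# Data from the example: X and Y series.
X = [2, 3, 4, 5, 6]
Y = [3, 4, 5, 6, 7]


def fit_by_normal_equations(dep, ind):
    """Solve the two normal equations
         sum(dep)     = N*a + b*sum(ind)
         sum(dep*ind) = a*sum(ind) + b*sum(ind^2)
    for the intercept a and slope b of  dep = a + b*ind."""
    n = len(dep)
    s_dep = sum(dep)
    s_ind = sum(ind)
    s_ind2 = sum(v * v for v in ind)
    s_prod = sum(d * i for d, i in zip(dep, ind))
    # Determinant of the 2x2 system; zero means ind never varies.
    det = n * s_ind2 - s_ind ** 2
    if det == 0:
        raise ValueError("independent variable has no variation")
    b = (n * s_prod - s_dep * s_ind) / det
    a = (s_dep - b * s_ind) / n
    return a, b


a_xy, b_xy = fit_by_normal_equations(X, Y)  # X on Y: X = a + bY
a_yx, b_yx = fit_by_normal_equations(Y, X)  # Y on X: Y = a + bX
print(f"X on Y: X = {a_xy:.2f} + {b_xy:.2f}Y")
print(f"Y on X: Y = {a_yx:.2f} + {b_yx:.2f}X")
```

For this data every point satisfies Y = X + 1, so the two regression lines coincide: X = -1 + Y and Y = 1 + X.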
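Calculations Based on Arithmetic Mean
Regression equation of X on Y:

$$(X - \bar{X}) = r\,\frac{\sigma_x}{\sigma_y}\,(Y - \bar{Y})$$

Regression equation of Y on X:

$$(Y - \bar{Y}) = r\,\frac{\sigma_y}{\sigma_x}\,(X - \bar{X})$$

where
• $\bar{X}$ denotes the actual mean of the X-series and $\bar{Y}$ the actual mean of the Y-series,
• $r$ = coefficient of correlation between the X and Y series,
• $\sigma_x$ = standard deviation of the X-series,
• $\sigma_y$ = standard deviation of the Y-series.

$r\,\sigma_x/\sigma_y$ is called the regression coefficient of X on Y, written $b_{xy}$. With $x = X - \bar{X}$ and $y = Y - \bar{Y}$:

$$b_{xy} = r\,\frac{\sigma_x}{\sigma_y} = \frac{\sum xy}{n\,\sigma_x\,\sigma_y}\times\frac{\sigma_x}{\sigma_y} = \frac{\sum xy}{n\,\sigma_y^2} = \frac{\sum xy}{n\times\frac{\sum y^2}{n}} = \frac{\sum xy}{\sum y^2}$$

$r\,\sigma_y/\sigma_x$ is called the regression coefficient of Y on X, written $b_{yx}$. Its value is $\sum xy / \sum x^2$, obtained in the same manner as for the regression equation of X on Y.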
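The identity between the two forms of each coefficient can be checked numerically. The sketch below uses made-up illustrative values (they are not from the source) and population standard deviations, that is, division by n, which is the convention the derivation relies on.

```python
import math

# Illustrative values only; any data with some spread would do.
X = [1.0, 2.0, 4.0, 7.0]
Y = [2.0, 3.0, 3.0, 8.0]

n = len(X)
mean_x = sum(X) / n
mean_y = sum(Y) / n
x = [v - mean_x for v in X]  # deviations from the actual mean of X
y = [v - mean_y for v in Y]  # deviations from the actual mean of Y

sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)

sigma_x = math.sqrt(sum_x2 / n)  # population standard deviation of X
sigma_y = math.sqrt(sum_y2 / n)  # population standard deviation of Y
r = sum_xy / (n * sigma_x * sigma_y)  # coefficient of correlation

# Each regression coefficient computed two ways.
b_xy_from_r = r * sigma_x / sigma_y
b_xy_direct = sum_xy / sum_y2
b_yx_from_r = r * sigma_y / sigma_x
b_yx_direct = sum_xy / sum_x2

print(b_xy_from_r, b_xy_direct)
print(b_yx_from_r, b_yx_direct)
```

Each printed pair should agree up to floating-point rounding.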
Example
X    3    5    7    9    11
Y    6    7    9    8    10

Solution:

  X    x = X − 7    x²     Y    y = Y − 8    y²    xy
  3       -4        16     6       -2         4    +8
  5       -2         4     7       -1         1    +2
  7        0         0     9       +1         1     0
  9       +2         4     8        0         0     0
 11       +4        16    10       +2         4    +8

∑X = 35    ∑x = 0    ∑x² = 40    ∑Y = 40    ∑y = 0    ∑y² = 10    ∑xy = 18
Mean of X = 35/5 = 7    Mean of Y = 40/5 = 8

Regression equation of X on Y:

$$(X - \bar{X}) = r\,\frac{\sigma_x}{\sigma_y}\,(Y - \bar{Y}) \quad\text{or}\quad (X - \bar{X}) = \frac{\sum xy}{\sum y^2}\,(Y - \bar{Y})$$
$$X - 7 = \frac{18}{10}\,(Y - 8)$$
$$X - 7 = 1.8\,(Y - 8)$$
$$X = 1.8Y - 7.4$$

Regression equation of Y on X:

$$(Y - \bar{Y}) = r\,\frac{\sigma_y}{\sigma_x}\,(X - \bar{X}) \quad\text{or}\quad (Y - \bar{Y}) = \frac{\sum xy}{\sum x^2}\,(X - \bar{X})$$
$$Y - 8 = \frac{18}{40}\,(X - 7)$$
$$Y - 8 = 0.45X - 3.15$$
$$Y = 0.45X + 4.85$$
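The same calculation can be written as a short script. This is a minimal sketch of the deviations-from-actual-mean method applied to the X and Y series of this example; the variable names are chosen for illustration.

```python
# Data from the example.
X = [3, 5, 7, 9, 11]
Y = [6, 7, 9, 8, 10]

n = len(X)
mean_x = sum(X) / n
mean_y = sum(Y) / n

# Deviations from the actual means.
x = [v - mean_x for v in X]
y = [v - mean_y for v in Y]

sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)

b_xy = sum_xy / sum_y2  # regression coefficient of X on Y
b_yx = sum_xy / sum_x2  # regression coefficient of Y on X

# Each line passes through the point of means, which fixes the intercept.
a_xy = mean_x - b_xy * mean_y
a_yx = mean_y - b_yx * mean_x

print(f"X on Y: X = {b_xy:.2f}Y + ({a_xy:.2f})")
print(f"Y on X: Y = {b_yx:.2f}X + ({a_yx:.2f})")
```

Because both lines pass through the point of means, substituting Y = 8 into the first equation should return X = 7, and substituting X = 7 into the second should return Y = 8. This is a quick sanity check on any hand calculation.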
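Calculations Based on Assumed Mean
Here x and y denote the deviations of X and Y from their assumed means.

Regression equation of Y on X:

$$(Y - \bar{Y}) = b_{yx}\,(X - \bar{X}), \qquad b_{yx} = \frac{N\sum xy - \sum x \sum y}{N\sum x^2 - \left(\sum x\right)^2}$$

Regression equation of X on Y:

$$(X - \bar{X}) = b_{xy}\,(Y - \bar{Y}), \qquad b_{xy} = \frac{N\sum xy - \sum x \sum y}{N\sum y^2 - \left(\sum y\right)^2}$$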
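The assumed-mean formulas give the same coefficients as deviations from the actual means, because shifting the origin does not change the centred sums. A short derivation follows, writing $x = X - A$ and $y = Y - B$ for deviations from assumed means A and B (the symbol B is introduced here only for illustration), so that $X - \bar{X} = x - \sum x / N$ and $Y - \bar{Y} = y - \sum y / N$.

```latex
\begin{aligned}
\sum (X - \bar{X})(Y - \bar{Y})
  &= \sum \Bigl(x - \tfrac{\sum x}{N}\Bigr)\Bigl(y - \tfrac{\sum y}{N}\Bigr)
   = \sum xy - \frac{\sum x \sum y}{N} \\
\sum (X - \bar{X})^2
  &= \sum x^2 - \frac{\bigl(\sum x\bigr)^2}{N} \\
b_{yx}
  &= \frac{\sum xy - \frac{\sum x \sum y}{N}}{\sum x^2 - \frac{(\sum x)^2}{N}}
   = \frac{N\sum xy - \sum x \sum y}{N\sum x^2 - \bigl(\sum x\bigr)^2}
\end{aligned}
```

The expression for $b_{xy}$ follows in the same way with $\sum y^2$ and $\sum y$ in the denominator.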
Example
Height of the fathers in inches    62  64  66  67  68  68  69  71  72  73
Height of the sons in inches       63  62  65  67  67  70  70  67  68  71

Solution:

   X    x = X − 65    x²     Y    y = Y − 65    y²    xy
  62       -3          9    63       -2          4     6
  64       -1          1    62       -3          9     3
  66        1          1    65        0          0     0
  67        2          4    67        2          4     4
  68        3          9    67        2          4     6
  68        3          9    70        5         25    15
  69        4         16    70        5         25    20
  71        6         36    67        2          4    12
  72        7         49    68        3          9    21
  73        8         64    71        6         36    48
Total      30        198             20        120   135

$$\bar{X} = A + \frac{\sum x}{N} = 65 + \frac{30}{10} = 65 + 3 = 68$$
$$\bar{Y} = A + \frac{\sum y}{N} = 65 + \frac{20}{10} = 65 + 2 = 67$$

Regression equation of X on Y:

$$(X - \bar{X}) = r\,\frac{\sigma_x}{\sigma_y}\,(Y - \bar{Y}) = b_{xy}\,(Y - \bar{Y}), \qquad b_{xy} = \frac{N\sum xy - \sum x \sum y}{N\sum y^2 - \left(\sum y\right)^2}$$
$$X - 68 = \frac{135 - \frac{30 \times 20}{10}}{120 - \frac{(20)^2}{10}}\,(Y - 67)$$
$$X - 68 = \frac{75}{80}\,(Y - 67) = 0.9375\,(Y - 67)$$
$$X = 68 - 62.8 + 0.94Y$$
$$\therefore X = 5.2 + 0.94Y$$

Regression equation of Y on X:

$$(Y - \bar{Y}) = r\,\frac{\sigma_y}{\sigma_x}\,(X - \bar{X}) = b_{yx}\,(X - \bar{X}), \qquad b_{yx} = \frac{N\sum xy - \sum x \sum y}{N\sum x^2 - \left(\sum x\right)^2}$$
$$Y - 67 = \frac{135 - \frac{30 \times 20}{10}}{198 - \frac{(30)^2}{10}}\,(X - 68)$$
$$Y - 67 = \frac{75}{108}\,(X - 68) \approx 0.6944\,(X - 68)$$
$$Y = 67 - 47.2 + 0.69X$$
$$\therefore Y = 19.8 + 0.69X$$
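The assumed-mean calculation lends itself to a small helper. The sketch below is illustrative only: the function name `regression_from_assumed_means` and its parameters are hypothetical, and it simply applies the assumed-mean formulas to the father and son heights with 65 as the assumed mean for both series.

```python
def regression_from_assumed_means(X, Y, assumed_x, assumed_y):
    """Return (mean_x, mean_y, b_xy, b_yx) using deviations taken
    from assumed means instead of the actual means."""
    n = len(X)
    x = [v - assumed_x for v in X]
    y = [v - assumed_y for v in Y]
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sx2 = sum(a * a for a in x)
    sy2 = sum(b * b for b in y)
    numerator = n * sxy - sx * sy
    b_xy = numerator / (n * sy2 - sy ** 2)  # X on Y
    b_yx = numerator / (n * sx2 - sx ** 2)  # Y on X
    mean_x = assumed_x + sx / n
    mean_y = assumed_y + sy / n
    return mean_x, mean_y, b_xy, b_yx


fathers = [62, 64, 66, 67, 68, 68, 69, 71, 72, 73]
sons = [63, 62, 65, 67, 67, 70, 70, 67, 68, 71]

mean_x, mean_y, b_xy, b_yx = regression_from_assumed_means(fathers, sons, 65, 65)
print(f"X on Y: X = {mean_x - b_xy * mean_y:.2f} + {b_xy:.4f}Y")
print(f"Y on X: Y = {mean_y - b_yx * mean_x:.2f} + {b_yx:.4f}X")
```

Carrying four decimal places in the coefficients keeps the intercepts from drifting, which matters here because the means (68 and 67) are large relative to the slopes.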
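Regression Equations in Grouped Frequency Distribution
Here x and y are step deviations of the class mid-points from the assumed means, f is the frequency, and $i_x$ and $i_y$ are the class intervals of X and Y.

• Regression Equation of X on Y:

$$(X - \bar{X}) = b_{xy}\,(Y - \bar{Y})$$
$$(X - \bar{X}) = \frac{\sum fxy - \frac{\sum fx \sum fy}{N}}{\sum fy^2 - \frac{\left(\sum fy\right)^2}{N}} \times \frac{i_x}{i_y} \times (Y - \bar{Y})$$

• Regression Equation of Y on X:

$$(Y - \bar{Y}) = b_{yx}\,(X - \bar{X})$$
$$(Y - \bar{Y}) = \frac{\sum fxy - \frac{\sum fx \sum fy}{N}}{\sum fx^2 - \frac{\left(\sum fx\right)^2}{N}} \times \frac{i_y}{i_x} \times (X - \bar{X})$$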
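Example
Height in inches (X) against weight in lbs. (Y):

Height \ Weight    80-90    90-100    100-110    110-120    Total
50-55                2        10         8          -         20
55-60                4        15         5          1         25
60-65                2        10        15          8         35
65-70                2         5         2         11         20
Total               10        40        30         20        100

Taking step deviations from A = 62.5 with i = 5 for X, and from A = 95 with i = 10 for Y, gives ∑fx = -45, ∑fy = 60, ∑fx² = 125, ∑fy² = 120 and ∑fxy = 7.

$$\bar{X} = A + \frac{\sum fx}{N} \times i = 62.5 + \frac{-45 \times 5}{100} = 60.25$$
$$\bar{Y} = A + \frac{\sum fy}{N} \times i = 95 + \frac{60}{100} \times 10 = 101$$

$$b_{xy} = \frac{7 - \frac{(-45)(60)}{100}}{120 - \frac{(60)^2}{100}} \times \frac{5}{10} = \frac{34}{84} \times 0.5 = 0.202$$
$$b_{yx} = \frac{7 - \frac{(-45)(60)}{100}}{125 - \frac{(-45)^2}{100}} \times \frac{10}{5} = \frac{34}{104.75} \times 2 = 0.649$$

Regression Equation of X on Y:

$$X - \bar{X} = b_{xy}\,(Y - \bar{Y})$$
$$X - 60.25 = 0.202\,(Y - 101)$$
$$X - 60.25 = 0.202Y - 0.202 \times 101$$
$$X = 60.25 - 20.402 + 0.202Y$$
$$\therefore X = 39.848 + 0.202Y$$

Regression Equation of Y on X (of the form Y = a + bX):

$$Y - \bar{Y} = b_{yx}\,(X - \bar{X})$$
$$Y - 101 = 0.649\,(X - 60.25)$$
$$Y - 101 = 0.649X - 0.649 \times 60.25$$
$$Y = 101 - 39.102 + 0.649X$$
$$\therefore Y = 61.898 + 0.649X$$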
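The grouped calculation involves several intermediate sums that are easy to get wrong by hand. The sketch below recomputes them from the cell frequencies of the two-way table. It assumes class mid-points as the representative values, treats the "-" cell as zero, and uses assumed means 62.5 and 95 with class intervals 5 and 10. The variable names are illustrative.

```python
# Class mid-points: heights (X) by row, weights (Y) by column.
height_mid = [52.5, 57.5, 62.5, 67.5]
weight_mid = [85.0, 95.0, 105.0, 115.0]

# Cell frequencies; the "-" entry in the table is taken as 0.
freq = [
    [2, 10, 8, 0],
    [4, 15, 5, 1],
    [2, 10, 15, 8],
    [2, 5, 2, 11],
]

A_x, i_x = 62.5, 5.0   # assumed mean and class interval for X
A_y, i_y = 95.0, 10.0  # assumed mean and class interval for Y

dx = [(m - A_x) / i_x for m in height_mid]  # step deviations of X
dy = [(m - A_y) / i_y for m in weight_mid]  # step deviations of Y

row_tot = [sum(row) for row in freq]        # marginal frequencies of X
col_tot = [sum(col) for col in zip(*freq)]  # marginal frequencies of Y
N = sum(row_tot)

sum_fx = sum(f * d for f, d in zip(row_tot, dx))
sum_fy = sum(f * d for f, d in zip(col_tot, dy))
sum_fx2 = sum(f * d * d for f, d in zip(row_tot, dx))
sum_fy2 = sum(f * d * d for f, d in zip(col_tot, dy))
sum_fxy = sum(freq[r][c] * dx[r] * dy[c]
              for r in range(len(dx)) for c in range(len(dy)))

mean_x = A_x + sum_fx / N * i_x
mean_y = A_y + sum_fy / N * i_y

numerator = sum_fxy - sum_fx * sum_fy / N
b_xy = numerator / (sum_fy2 - sum_fy ** 2 / N) * (i_x / i_y)
b_yx = numerator / (sum_fx2 - sum_fx ** 2 / N) * (i_y / i_x)

print(f"mean X = {mean_x}, mean Y = {mean_y}")
print(f"b_xy = {b_xy:.3f}, b_yx = {b_yx:.3f}")
```

The ratio of class intervals converts the coefficient from step-deviation units back to the original units, which is why it appears inverted between the two coefficients.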
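Regression Coefficients
• Regression Coefficient of X on Y:

$$b_{xy} = r\,\frac{\sigma_x}{\sigma_y}, \qquad b_{xy} = \frac{\sum xy}{\sum y^2}$$

$$b_{xy} = \frac{\sum xy - \frac{\sum x \sum y}{N}}{\sum y^2 - \frac{\left(\sum y\right)^2}{N}} \quad\text{or}\quad b_{xy} = \frac{N\sum xy - \sum x \sum y}{N\sum y^2 - \left(\sum y\right)^2}$$

With step deviations and class intervals $i_x$, $i_y$:

$$b_{xy} = \frac{\sum xy - \frac{\sum x \sum y}{N}}{\sum y^2 - \frac{\left(\sum y\right)^2}{N}} \times \frac{i_x}{i_y}$$

• Regression Coefficient of Y on X:

$$b_{yx} = r\,\frac{\sigma_y}{\sigma_x}, \qquad b_{yx} = \frac{\sum xy}{\sum x^2}$$

$$b_{yx} = \frac{\sum xy - \frac{\sum x \sum y}{N}}{\sum x^2 - \frac{\left(\sum x\right)^2}{N}} \quad\text{or}\quad b_{yx} = \frac{N\sum xy - \sum x \sum y}{N\sum x^2 - \left(\sum x\right)^2}$$

With step deviations and class intervals $i_x$, $i_y$:

$$b_{yx} = \frac{\sum xy - \frac{\sum x \sum y}{N}}{\sum x^2 - \frac{\left(\sum x\right)^2}{N}} \times \frac{i_y}{i_x}$$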
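Example:

Find the regression coefficients:

X    1    2    3    4    5
Y    2    5    3    8    7

Solution:

  X    x = X − 2    x²     Y    y = Y − 4    y²    xy
  1       -1         1     2       -2         4     2
  2        0         0     5       +1         1     0
  3       +1         1     3       -1         1    -1
  4       +2         4     8       +4        16     8
  5       +3         9     7       +3         9     9
Total     +5        15             +5        31    18

Means of X and Y:

$$\bar{X} = A + \frac{\sum x}{N} = 2 + \frac{5}{5} = 2 + 1 = 3$$
$$\bar{Y} = A + \frac{\sum y}{N} = 4 + \frac{5}{5} = 4 + 1 = 5$$

Regression Coefficient of X on Y:

$$b_{xy} = \frac{\sum xy - \frac{\sum x \sum y}{N}}{\sum y^2 - \frac{\left(\sum y\right)^2}{N}} = \frac{18 - \frac{5 \times 5}{5}}{31 - \frac{(5)^2}{5}} = \frac{13}{26} = 0.5$$
$$\therefore b_{xy} = +0.5$$

Regression Coefficient of Y on X:

$$b_{yx} = \frac{\sum xy - \frac{\sum x \sum y}{N}}{\sum x^2 - \frac{\left(\sum x\right)^2}{N}} = \frac{18 - \frac{5 \times 5}{5}}{15 - \frac{(5)^2}{5}} = \frac{13}{10} = 1.3$$
$$\therefore b_{yx} = +1.3$$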
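The two coefficients can be recomputed in a few lines, and they also recover the correlation coefficient. Since $b_{xy} = r\,\sigma_x/\sigma_y$ and $b_{yx} = r\,\sigma_y/\sigma_x$, their product is $r^2$. The sketch below assumes the same assumed means as the worked solution (2 for X and 4 for Y); the variable names are illustrative.

```python
import math

# Data from the example.
X = [1, 2, 3, 4, 5]
Y = [2, 5, 3, 8, 7]
A_x, A_y = 2, 4  # assumed means used in the worked solution

n = len(X)
x = [v - A_x for v in X]
y = [v - A_y for v in Y]

sx, sy = sum(x), sum(y)
sxy = sum(a * b for a, b in zip(x, y))
sx2 = sum(a * a for a in x)
sy2 = sum(b * b for b in y)

numerator = sxy - sx * sy / n
b_xy = numerator / (sy2 - sy ** 2 / n)  # regression coefficient of X on Y
b_yx = numerator / (sx2 - sx ** 2 / n)  # regression coefficient of Y on X

# r takes the common sign of the two coefficients.
r = math.copysign(math.sqrt(b_xy * b_yx), b_xy)

print(f"b_xy = {b_xy}, b_yx = {b_yx}, r = {r:.3f}")
```

Both coefficients always share the sign of r, and their product cannot exceed 1, so a pair such as 0.5 and 1.3 is admissible while a pair such as 1.5 and 1.3 would signal an arithmetic slip.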