Regression

by
Dr. Rajesh Moharana

Department of Mathematics
School of Advanced Sciences
Vellore Institute of Technology
Vellore, India

MAT 2001 ((Statistics for Engineers)) rajesh.moharana@vit.ac.in 1 / 16


Regression

Definition: Regression is the measure of the average relationship between
two or more variables in terms of the original units of the data.

Regression Equation: The functional relationship of a dependent variable
with one or more independent variables is called the regression equation.
It is also called a prediction equation or estimating equation.

Note: The independent variable in regression analysis is called the
"predictor" or "regressor", and the dependent variable is called the
"regressed" variable.


Types of Regression:

- If there are only two variables under consideration, then the regression
  is called simple regression.

- If there are more than two variables under consideration, then the
  regression is called multiple regression.

- If there are more than two variables under consideration, and only the
  relation between two variables is established, after excluding the effect of
  the remaining variables, then the regression is called partial regression.

- If the relationship between x and y is non-linear, then the regression
  is called curvilinear regression.



Regression Equations (Linear Fit)

Linear regression equation of y on x


Linear regression equation of x on y


Linear regression equation of y on x:

Let $(x_i, y_i)$, $i = 1, 2, \ldots, n$, be n bivariate data points, and let
the straight line to be fitted to these points be $y = a + bx$. The linear
regression equation of y on x is

    $y - \bar{y} = b_{yx}\,(x - \bar{x})$,    (1)

where $\bar{x}$ and $\bar{y}$ are the means of x and y, and
$b_{yx} = \operatorname{cov}(x, y) / \sigma_x^2$. Using the correlation
coefficient, Eq. (1) can be rewritten as

    $y - \bar{y} = \rho\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{x})$,    (2)

where $\rho$ is the correlation coefficient between x and y. Here, $b_{yx}$ is
also given by $b_{yx} = \rho\,\sigma_y / \sigma_x$ and is called the regression
coefficient of y on x.
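The slides state Eq. (1) without derivation. As a sketch, assuming the line is fitted by the usual least-squares criterion (an assumption, since the fitting criterion is not named here), the coefficient $b_{yx}$ follows from minimising the sum of squared vertical deviations $S(a, b)$, a symbol introduced only for this illustration:

```latex
\begin{align*}
S(a, b) &= \sum_{i=1}^{n} (y_i - a - b x_i)^2, \\
\frac{\partial S}{\partial a} = 0 &\;\Rightarrow\; \sum y_i = n a + b \sum x_i, \\
\frac{\partial S}{\partial b} = 0 &\;\Rightarrow\; \sum x_i y_i = a \sum x_i + b \sum x_i^2, \\
b &= \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2}
   = \frac{\operatorname{cov}(x, y)}{\sigma_x^2} = b_{yx}, \\
a &= \bar{y} - b_{yx}\,\bar{x}.
\end{align*}
```

Substituting $a = \bar{y} - b_{yx}\,\bar{x}$ into $y = a + bx$ gives $y - \bar{y} = b_{yx}(x - \bar{x})$, which is Eq. (1).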


Linear regression equation of x on y:

Let $(x_i, y_i)$, $i = 1, 2, \ldots, n$, be n bivariate data points, and let
the straight line to be fitted to these points be $x = a + by$. The linear
regression equation of x on y is

    $x - \bar{x} = b_{xy}\,(y - \bar{y})$,    (3)

where $b_{xy} = \operatorname{cov}(x, y) / \sigma_y^2$. Using the correlation
coefficient, Eq. (3) can be rewritten as

    $x - \bar{x} = \rho\,\frac{\sigma_x}{\sigma_y}\,(y - \bar{y})$,    (4)

where $\rho$ is the correlation coefficient between x and y. Here, $b_{xy}$ is
also given by $b_{xy} = \rho\,\sigma_x / \sigma_y$ and is called the regression
coefficient of x on y.
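As a rough illustration, the two forms of each regression coefficient (covariance over variance, and correlation times a ratio of standard deviations) can be checked numerically. The function name `regression_coefficients` and the small data set are hypothetical, chosen only for this sketch, and population (divide-by-n) moments are assumed.

```python
from math import sqrt

def regression_coefficients(xs, ys):
    """Return (b_yx, b_xy, rho, sigma_x, sigma_y) using divide-by-n moments."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n
    var_x = sum((x - x_bar) ** 2 for x in xs) / n
    var_y = sum((y - y_bar) ** 2 for y in ys) / n
    b_yx = cov_xy / var_x          # slope of the line of y on x
    b_xy = cov_xy / var_y          # slope of the line of x on y
    rho = cov_xy / sqrt(var_x * var_y)
    return b_yx, b_xy, rho, sqrt(var_x), sqrt(var_y)

# Illustrative data only (not taken from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b_yx, b_xy, rho, sigma_x, sigma_y = regression_coefficients(xs, ys)

# The correlation forms should match the covariance forms.
b_yx_alt = rho * sigma_y / sigma_x
b_xy_alt = rho * sigma_x / sigma_y
print(b_yx, b_yx_alt)
print(b_xy, b_xy_alt)
```

Note that the line of x on y is a separate fit, not the algebraic inverse of the line of y on x; the two slopes are reciprocals only when the correlation is perfect.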


Notes:

- If $X_i = x_i - \bar{x}$ and $Y_i = y_i - \bar{y}$, then
  $b_{yx} = \frac{\sum X_i Y_i}{\sum X_i^2}$. Similarly,
  $b_{xy} = \frac{\sum X_i Y_i}{\sum Y_i^2}$.
- The two regression lines given in Eqs. (1) and (3) are identical if
  $b_{yx} \times b_{xy} = 1$, i.e., $\rho^2 = 1$, i.e., $\rho = \pm 1$.
- The two regression lines always intersect at $(\bar{x}, \bar{y})$.
- When $\rho = 0$, the equations of the lines of regression are $x = \bar{x}$
  and $y = \bar{y}$, which are lines parallel to the axes.
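The notes above can be checked on small examples. The following is a minimal sketch; the helper `slopes` and both data sets are made up for illustration (one exactly linear, one with zero covariance).

```python
def slopes(xs, ys):
    """Return (b_yx, b_xy) from deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    X = [x - x_bar for x in xs]
    Y = [y - y_bar for y in ys]
    s_xy = sum(a * b for a, b in zip(X, Y))
    return s_xy / sum(a * a for a in X), s_xy / sum(b * b for b in Y)

# Case 1: exactly linear data, y = 2x + 1, so rho = 1.
b_yx_lin, b_xy_lin = slopes([1, 2, 3, 4], [3, 5, 7, 9])
product_lin = b_yx_lin * b_xy_lin   # equals rho^2, so the two lines coincide

# Case 2: zero covariance, so rho = 0.
b_yx_zero, b_xy_zero = slopes([-1, 0, 1], [1, 0, 1])
# Both slopes vanish: the lines reduce to y = y_bar and x = x_bar.

print(product_lin, b_yx_zero, b_xy_zero)
```

In general the product of the two regression coefficients equals $\rho^2$, which is why the lines coincide only for $\rho = \pm 1$.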



Problems

Problem 1: For the following data, find the regression line of y on x.

x 1 2 3 4 5 8 10
y 9 8 10 12 14 16 15
Solution 1: $\bar{x} = \frac{\sum x_i}{n} = \frac{33}{7} = 4.714$ and
$\bar{y} = \frac{\sum y_i}{n} = \frac{84}{7} = 12$.

x      y      xy     x^2
1      9      9      1
2      8      16     4
3      10     30     9
4      12     48     16
5      14     70     25
8      16     128    64
10     15     150    100
Total:
33     84     451    219


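$b_{yx} = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - (\sum x_i)^2}
        = \frac{7(451) - (33)(84)}{7(219) - (33)^2} = \frac{385}{444} = 0.867$

The regression equation of y on x is

$y - \bar{y} = b_{yx}(x - \bar{x})$
⇒ $y - 12 = 0.867(x - 4.714)$
⇒ $y = 0.867x + 7.9129$.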
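The arithmetic in Solution 1 can be reproduced with a few lines of Python. This is only a sketch of the raw-sums formula used above; the variable names are chosen for illustration.

```python
# Data from Problem 1.
xs = [1, 2, 3, 4, 5, 8, 10]
ys = [9, 8, 10, 12, 14, 16, 15]

n = len(xs)
sum_x = sum(xs)                                # 33
sum_y = sum(ys)                                # 84
sum_xy = sum(x * y for x, y in zip(xs, ys))    # 451
sum_x2 = sum(x * x for x in xs)                # 219

# Regression coefficient of y on x from raw sums.
b_yx = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)

# Intercept of the line y = a + b_yx * x, using the unrounded means.
a = sum_y / n - b_yx * sum_x / n

print(f"y = {b_yx:.3f}x + {a:.3f}")
```

With unrounded intermediate values the intercept comes out near 7.912; the value 7.9129 in the solution arises from using the rounded figures 0.867 and 4.714.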

Problem 2: From the following data, fit two regression equations by
finding actual means (of x and y ), i.e., by the actual mean method.

x 1 2 3 4 5 6 7
y 2 4 7 6 5 6 5
Solution 2: $\bar{x} = \frac{\sum x_i}{n} = \frac{28}{7} = 4$ and
$\bar{y} = \frac{\sum y_i}{n} = \frac{35}{7} = 5$.

x      y      X = x - \bar{x}    Y = y - \bar{y}    X^2    Y^2    XY
1      2      -3                 -3                 9      9      9
2      4      -2                 -1                 4      1      2
3      7      -1                 2                  1      4      -2
4      6      0                  1                  0      1      0
5      5      1                  0                  1      0      0
6      6      2                  1                  4      1      2
7      5      3                  0                  9      0      0
Total:
28     35     0                  0                  28     16     11
$b_{yx} = \frac{\sum X_i Y_i}{\sum X_i^2} = \frac{11}{28} = 0.3929$

$b_{xy} = \frac{\sum X_i Y_i}{\sum Y_i^2} = \frac{11}{16} = 0.6875$

The regression equation of y on x is

$y - \bar{y} = b_{yx}(x - \bar{x})$
⇒ $y - 5 = 0.3929(x - 4)$
⇒ $y = 0.393x + 3.429$.

The regression equation of x on y is

$x - \bar{x} = b_{xy}(y - \bar{y})$
⇒ $x - 4 = 0.6875(y - 5)$
⇒ $x = 0.688y + 0.56$.
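The actual mean method of Solution 2 can be sketched in Python as follows; the variable names are illustrative, and the deviation columns are recomputed from the data rather than copied from the table.

```python
# Data from Problem 2.
xs = [1, 2, 3, 4, 5, 6, 7]
ys = [2, 4, 7, 6, 5, 6, 5]

n = len(xs)
x_bar = sum(xs) / n                      # 4.0
y_bar = sum(ys) / n                      # 5.0

# Deviations from the actual means.
X = [x - x_bar for x in xs]
Y = [y - y_bar for y in ys]

sum_XY = sum(a * b for a, b in zip(X, Y))
sum_X2 = sum(a * a for a in X)
sum_Y2 = sum(b * b for b in Y)

b_yx = sum_XY / sum_X2                   # regression coefficient of y on x
b_xy = sum_XY / sum_Y2                   # regression coefficient of x on y

# Intercepts of y = a_yx + b_yx * x and x = a_xy + b_xy * y.
a_yx = y_bar - b_yx * x_bar
a_xy = x_bar - b_xy * y_bar

print(f"y = {b_yx:.4f}x + {a_yx:.4f}")
print(f"x = {b_xy:.4f}y + {a_xy:.4f}")
```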

Problem 3: From the following results, obtain the two regression equations
and estimate the yield of crops when the rainfall is 29 cm and the
rainfall when the yield is 600 kg.

          Y (yield in kg)    X (rainfall in cm)
Mean      508.4              26.7
S.D.      36.8               4.6

Coefficient of correlation between yield and rainfall is 0.52.


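Solution 3: We have $\bar{x} = 26.7$, $\bar{y} = 508.4$, $\sigma_x = 4.6$,
$\sigma_y = 36.8$ and $\rho = 0.52$. Now,

$b_{yx} = \rho\,\frac{\sigma_y}{\sigma_x} = 0.52 \times \frac{36.8}{4.6} = 4.16$
and
$b_{xy} = \rho\,\frac{\sigma_x}{\sigma_y} = 0.52 \times \frac{4.6}{36.8} = 0.065$.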


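Substituting into $y - \bar{y} = b_{yx}(x - \bar{x})$ and
$x - \bar{x} = b_{xy}(y - \bar{y})$, i.e., $y - 508.4 = 4.16(x - 26.7)$ and
$x - 26.7 = 0.065(y - 508.4)$, the required regression equations are

y = 4.16x + 397.328
and x = 0.065y − 6.346.

When x = 29 cm, we have y = (4.16 × 29) + 397.328 = 517.968 kg.

When y = 600 kg, we have x = (0.065 × 600) − 6.346 = 32.654 cm.

That is, when the rainfall is 29 cm, the estimated yield of the crop is
517.968 kg, and when the yield is 600 kg, the estimated rainfall is 32.654 cm.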
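Solution 3 works from summary statistics only, which can be sketched in Python as below. The function names `predict_yield` and `predict_rainfall` are hypothetical helpers introduced for this illustration.

```python
# Summary statistics from Problem 3 (x = rainfall in cm, y = yield in kg).
x_bar, y_bar = 26.7, 508.4
sigma_x, sigma_y = 4.6, 36.8
rho = 0.52

# Regression coefficients from the correlation coefficient.
b_yx = rho * sigma_y / sigma_x   # about 4.16
b_xy = rho * sigma_x / sigma_y   # about 0.065

def predict_yield(rainfall_cm):
    """Estimate yield from rainfall using the line of y on x."""
    return y_bar + b_yx * (rainfall_cm - x_bar)

def predict_rainfall(yield_kg):
    """Estimate rainfall from yield using the line of x on y."""
    return x_bar + b_xy * (yield_kg - y_bar)

yield_at_29 = predict_yield(29)          # about 517.968 kg
rainfall_at_600 = predict_rainfall(600)  # about 32.654 cm
print(round(yield_at_29, 3), round(rainfall_at_600, 3))
```

Each estimate uses the line whose dependent variable is the quantity being predicted: y on x for the yield, x on y for the rainfall.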
