Regression Analysis
Meaning and Origin
• Regression and correlation analyses show us how to determine
both the nature and the strength of a relationship between two
variables.
• The term regression was first used as a statistical concept in
1877 by Sir Francis Galton.
• Galton made a study that showed that the height of children born
to tall parents tends to move back, or regress, toward the mean
height of the population.
• He designated the word regression as the name of the general
process of predicting one variable from another.
• In regression analysis, we develop an estimating equation,
that is, a mathematical formula that relates the known
variables to the unknown variable.
• The statistical technique that expresses a functional (algebraic)
relationship between two or more variables in the form of an
equation, used to estimate the value of one variable from the
given value of another, is called regression analysis.
• The variable whose value is to be estimated is called the
dependent (or response) variable, and the variable whose value
is used to estimate it is called the independent (regressor or
predictor) variable.
• In regression, we can have only one dependent variable in our
estimating equation.
• However, we can use more than one independent variable.
Uses of Regression
• Regression analysis helps in predicting or forecasting data.
• Areas of application:
– Population and deforestation
– Income and expenditure
– Price and supply of commodity
– Price and demand for commodity
– Sales of woolen garments and daytime temperature
– Sales and advertisement
– Forecasting in the stock market
– Tax rate and price
Types of Regression Models
• A regression model is an algebraic equation relating two or more
variables, fitted to the given data and used to estimate the value
of a dependent variable from the known values of one or more
independent variables.
• Simple and Multiple Regression Models
• If a regression model represents the relationship between a
dependent variable, Y, and only one independent variable, X, then
the model is called a simple regression model.
• If more than one independent variable is associated with a
dependent variable, then the model is called a multiple
regression model.
• For example:
• Sales turnover of a product (a dependent variable) is associated
with more than one independent variable, such as the price of the
product, expenditure on advertisement, quality of the product,
and competitors.
• Linear and Non-Linear Regression Models
• If the change (increase or decrease) in the value of the
dependent (response) variable Y is directly proportional to the
change (increase or decrease) in the value of the independent
(predictor) variable X, so that every unit change in X produces
the same change in Y, then the model is called a linear
regression model.
• X = 1, 2, 3, 4, 5, 6
• Y = 5, 7, 9, 11, 13, 15
• Y = 3 + 2X
• A non-linear relationship implies that the change in Y is not
directly proportional to a unit change in the value of the
independent variable, X.
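The distinction can be checked numerically. The sketch below is only an illustration: it uses the line Y = 3 + 2X from the slide, and Y = X² as a hypothetical non-linear case that is not taken from the slides. The helper name unit_changes is also illustrative.

```python
def unit_changes(values):
    """Change in Y between consecutive X values that are one unit apart."""
    return [later - earlier for earlier, later in zip(values, values[1:])]

x = [1, 2, 3, 4, 5, 6]
linear_y = [3 + 2 * xi for xi in x]     # Y = 3 + 2X, the linear example
nonlinear_y = [xi ** 2 for xi in x]     # Y = X^2, a hypothetical non-linear case

linear_steps = unit_changes(linear_y)
nonlinear_steps = unit_changes(nonlinear_y)

print(linear_steps)     # [2, 2, 2, 2, 2]  -> same change for every unit of X
print(nonlinear_steps)  # [3, 5, 7, 9, 11] -> change depends on where X is
```

A constant step in Y for each unit step in X is what makes the first relationship linear.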
Regression Line
• A regression line is the line that gives the best estimate of one
variable for any given value of the other variable. With two
variables, say X and Y, there are two regression lines: the
regression of X on Y and the regression of Y on X.
• The line of regression of Y on X is the line that gives the
estimated value of Y for any specified value of X.
• Similarly, the line of regression of X on Y is the line that
gives the estimated value of X for any specified value of Y.
Regression Equations
• Regression equations, also known as estimating equations, are
algebraic expressions of the regression lines.
• Since there are two regression lines, there are two regression
equations:
• The regression equation of X on Y is used to describe the
variation in the values of X for given changes in Y, and
• The regression equation of Y on X is used to describe the
variation in the values of Y for given changes in X.
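As a rough illustration that the two equations are generally different lines, the sketch below fits both to the truck-repair data used in the example of this section (X = age of truck, Y = repair expenses). It assumes the usual least-squares formulas; the function name regression_line is hypothetical.

```python
def regression_line(dependent, independent):
    """Least-squares (intercept, slope) for estimating `dependent` from `independent`."""
    n = len(dependent)
    mean_dep = sum(dependent) / n
    mean_ind = sum(independent) / n
    sum_cross = sum(d * i for d, i in zip(dependent, independent))
    sum_ind_sq = sum(i * i for i in independent)
    slope = (sum_cross - n * mean_dep * mean_ind) / (sum_ind_sq - n * mean_ind ** 2)
    intercept = mean_dep - slope * mean_ind
    return intercept, slope

x = [5, 3, 3, 1]   # age of truck
y = [7, 7, 6, 4]   # repair expenses

a_yx, b_yx = regression_line(y, x)   # regression of Y on X: Y = a + bX
a_xy, b_xy = regression_line(x, y)   # regression of X on Y: X = a' + b'Y

print(f"Y on X: Y = {a_yx} + {b_yx}X")   # Y on X: Y = 3.75 + 0.75X
print(f"X on Y: X = {a_xy} + {b_xy}Y")   # X on Y: X = -3.0 + 1.0Y
```

The first line is used when Y is to be estimated from X, the second when X is to be estimated from Y.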
Estimation of Regression Equation
Method of Least Squares
Example
• Annual Truck-Repair Expenses
Age of Truck (X)    Repair Expenses (Y)
       5                    7
       3                    7
       3                    6
       1                    4
    ∑X = 12              ∑Y = 24
• From the above data:
– Estimate the regression equation.
– Find the value of Y when X is 10.
Solution
Age of Truck (X)   Repair Expenses (Y)      XY         X²         Y²
       5                    7               35         25         49
       3                    7               21          9         49
       3                    6               18          9         36
       1                    4                4          1         16
    ∑X = 12              ∑Y = 24         ∑XY = 78   ∑X² = 44   ∑Y² = 150
• We need to find the values of a and b, the intercept and slope of
the estimating equation Y = a + bX.
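A minimal sketch of the calculation, assuming the usual least-squares formulas b = (∑XY − n·X̄·Ȳ) / (∑X² − n·X̄²) and a = Ȳ − b·X̄ for the line Y = a + bX. The function name least_squares_line is illustrative; the data are the truck values from the table.

```python
def least_squares_line(x, y):
    """Return (a, b) for the least-squares line Y = a + bX."""
    n = len(x)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))   # ∑XY
    sum_x2 = sum(xi * xi for xi in x)               # ∑X²
    mean_x = sum(x) / n                             # X̄
    mean_y = sum(y) / n                             # Ȳ
    b = (sum_xy - n * mean_x * mean_y) / (sum_x2 - n * mean_x ** 2)
    a = mean_y - b * mean_x
    return a, b

x = [5, 3, 3, 1]   # age of truck
y = [7, 7, 6, 4]   # repair expenses

a, b = least_squares_line(x, y)
y_at_10 = a + b * 10

print(f"Y = {a} + {b}X")                       # Y = 3.75 + 0.75X
print(f"Estimated Y at X = 10: {y_at_10}")     # Estimated Y at X = 10: 11.25
```

With ∑X = 12, ∑Y = 24, ∑XY = 78 and ∑X² = 44, this gives b = 6/8 = 0.75 and a = 6 − 0.75 × 3 = 3.75. Note that X = 10 lies well outside the observed ages (1 to 5), so the estimate of 11.25 is an extrapolation.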
Standard Error of Estimate
• The standard error of estimate measures the variability, or
scatter, of the observed values around the regression line.
• Short-cut Method
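As a sketch under stated assumptions: the short-cut formula is taken here to be s = √((∑Y² − a∑Y − b∑XY) / (n − 2)), and it is compared with the direct definition s = √(∑(Y − Ŷ)² / (n − 2)) on the truck-repair data from the earlier example. The function and variable names are illustrative.

```python
import math

x = [5, 3, 3, 1]   # age of truck
y = [7, 7, 6, 4]   # repair expenses
n = len(x)

# Least-squares line Y = a + bX fitted to the same data.
mean_x, mean_y = sum(x) / n, sum(y) / n
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi * xi for xi in x)
sum_y2 = sum(yi * yi for yi in y)
b = (sum_xy - n * mean_x * mean_y) / (sum_x2 - n * mean_x ** 2)
a = mean_y - b * mean_x

# Direct method: scatter of the observed Y values around the fitted line.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
se_direct = math.sqrt(sum(r * r for r in residuals) / (n - 2))

# Short-cut method: uses only the column sums, no individual residuals.
se_shortcut = math.sqrt((sum_y2 - a * sum(y) - b * sum_xy) / (n - 2))

print(round(se_direct, 3), round(se_shortcut, 3))   # 0.866 0.866
```

Both routes give √0.75 ≈ 0.866; the short-cut version avoids computing each estimated value separately.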
Example
• Annual Relationship between Research and Development and Profits
Year    Exp. for R&D (X)   Annual Profits (Y)      XY          X²
1995           5                  31               155          25
1994          11                  40               440         121
1993           4                  30               120          16
1992           5                  34               170          25
1991           3                  25                75           9
1990           2                  20                40           4
            ∑X = 30            ∑Y = 180        ∑XY = 1000   ∑X² = 200
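A hedged sketch of how this table could be worked through, assuming it is meant for fitting the line Y = a + bX by least squares and then computing the standard error of estimate by the short-cut formula s = √((∑Y² − a∑Y − b∑XY) / (n − 2)). The ∑Y² value is not in the table and is computed from the Y column; variable names are illustrative.

```python
import math

rd_expense = [5, 11, 4, 5, 3, 2]        # X: expenditure for R&D, 1995 back to 1990
profit = [31, 40, 30, 34, 25, 20]       # Y: annual profits
n = len(rd_expense)

sum_x = sum(rd_expense)                                       # 30
sum_y = sum(profit)                                           # 180
sum_xy = sum(x * y for x, y in zip(rd_expense, profit))       # 1000
sum_x2 = sum(x * x for x in rd_expense)                       # 200
sum_y2 = sum(y * y for y in profit)                           # not in the table

mean_x, mean_y = sum_x / n, sum_y / n
b = (sum_xy - n * mean_x * mean_y) / (sum_x2 - n * mean_x ** 2)
a = mean_y - b * mean_x

# Short-cut standard error of estimate around the fitted line.
se = math.sqrt((sum_y2 - a * sum_y - b * sum_xy) / (n - 2))

print(f"Y = {a} + {b}X")    # Y = 20.0 + 2.0X
print(round(se, 2))         # 3.24
```

Here X̄ = 5 and Ȳ = 30, so b = (1000 − 900) / (200 − 150) = 2 and a = 30 − 2 × 5 = 20.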