Simple Linear Regression

Linear regression is a basic and commonly used type of predictive analysis.


These regression estimates are used to explain the relationship between one dependent variable
and one or more independent variables.
Dependent variable: also called an outcome variable, criterion variable, or endogenous variable.
Independent variables: also called exogenous variables, predictor variables, or regressors.
Three major uses for regression analysis are
(1) determining the strength of predictors
(2) forecasting an effect
(3) trend forecasting
Types of Linear Regression
1. Simple linear regression
• 1 dependent variable (interval or ratio), 1 independent variable (interval or ratio or
dichotomous)
2. Logistic regression
• 1 dependent variable (dichotomous), 2+ independent variables (interval or ratio or
dichotomous)
3. Ordinal regression
• 1 dependent variable (ordinal), 1+ independent variable(s) (nominal or dichotomous)
4. Multinomial regression
• 1 dependent variable (nominal), 1+ independent variable(s) (interval or ratio or
dichotomous)
5. Discriminant analysis
• 1 dependent variable (nominal), 1+ independent variable(s) (interval or ratio)

Regression
Pearson correlation is a technique for describing and measuring the linear relationship
between two variables. (E.g. – SAT and GPA)
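
As a rough illustration of the Pearson correlation described above, here is a minimal sketch in plain Python. The SAT and GPA numbers are hypothetical values made up for this example (they do not come from these notes), and the function name pearson_r is likewise only illustrative.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical SAT and GPA values, invented for illustration only.
sat = [1000, 1100, 1200, 1300, 1400]
gpa = [2.6, 3.0, 2.9, 3.4, 3.7]

r = pearson_r(sat, gpa)
print(round(r, 3))  # a value between -1 and +1; positive here
```

A value near +1 or -1 indicates a strong linear relationship, while a value near 0 indicates a weak one.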
Find the relationship between SAT & GPA
The line identifies the central tendency of the relationship.
The line is useful in making predictions.

The line can establish a precise relationship between each X value (SAT) and its
corresponding Y values (GPA).

The resulting straight line is called the regression line.

This relationship can even be presented in an equation.

Y = bX + a

where a and b are constants.


Example: A local tennis club charges a fee of $5 per hour plus an
annual membership fee of $25. With this information, the total cost
of playing tennis can be computed using a “linear equation” that
describes the relationship between the total cost (Y) and the number of
hours played (X).

Y = 5X + 25

Y = total cost
b = 5 = slope (the hourly fee)
a = 25 = fixed amount (the annual membership fee)

In general form: Y = bX + a

• b is the slope. It determines how much the Y variable changes when X
increases by 1 point.

• According to the equation, the total cost goes up by $5 for every
additional hour played (each 1-point increase in X).

• a is the Y-intercept. It determines the value of Y when X = 0.

• According to the equation, even if you do not play any tennis, you
still have to pay $25.
Y = 5X + 25

Predict how much you would need to pay to play for 10 hours.

Y = 5(10) + 25

Y = 50 + 25

Y = $75
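
The tennis club calculation can also be written as a small function. This is only a sketch: the function and parameter names (total_cost, hourly_fee, membership_fee) are chosen here for illustration, while the $5 and $25 figures come from the example above.

```python
def total_cost(hours, hourly_fee=5, membership_fee=25):
    """Linear equation Y = bX + a, with b = hourly_fee and a = membership_fee."""
    return hourly_fee * hours + membership_fee

print(total_cost(10))  # 75: cost of playing 10 hours
print(total_cost(0))   # 25: the Y-intercept, paid even with no play
print(total_cost(11) - total_cost(10))  # 5: the slope, cost of one extra hour
```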

The least-squared-error solution and the regression line

• To determine how well a line fits the data points, the first step is to define mathematically
the distance between the line and each data point.

• For every X value in the data, the linear equation determines a Y value on the line.
This value is the predicted Y and is called Ŷ (Y hat).

• The distance between this predicted value and the actual Y value in the data is
given by

Distance = Y - Ŷ

• Here we are simply measuring the vertical distance between the actual data point (Y) and
the predicted point on the line. This distance measures the error between the line and the
actual data.
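
The distance calculation can be sketched in a few lines of Python. The observed costs below are hypothetical numbers invented for illustration, and the line used is the tennis club equation Y = 5X + 25 from earlier. The last step, summing the squared distances, is an assumption about where the section heading is leading: a least-squares fit is the line that makes this total as small as possible.

```python
def predict(x, b, a):
    """Predicted value Y-hat on the line Y = bX + a."""
    return b * x + a

def distances(xs, ys, b, a):
    """Vertical distance Y - Y-hat for each data point."""
    return [y - predict(x, b, a) for x, y in zip(xs, ys)]

# Hypothetical observations, invented for illustration only.
hours = [1, 2, 3, 4]
cost = [32, 33, 41, 44]

errors = distances(hours, cost, b=5, a=25)
print(errors)  # [2, -2, 1, -1]

# Squaring removes the signs so positive and negative errors do not cancel.
sum_squared_error = sum(e ** 2 for e in errors)
print(sum_squared_error)  # 10
```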
