
Chapter 1 Simple Linear Regression Model

The General Model


Linear regression is used when the dependent variable is numeric. The independent variables can be numeric or categorical.
The assumed relationship in a linear regression model has the form:
$$ y_i = \beta_1 + \beta_2 x_i + e_i $$

$y_i$ is the dependent variable (or response variable)
$x_i$ is the independent variable (or explanatory variable)
$e_i$ is the error term
$\sigma^2$ is the variance of the error term
$\beta_1$ is the intercept parameter or coefficient
$\beta_2$ is the slope parameter or coefficient

$y_i$ has two components: $\beta_1 + \beta_2 x_i$, which is non-random, and $e_i$, which is a random component.
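As an illustration (not from the notes), the following Python sketch simulates data from this model, using hypothetical values $\beta_1 = 2$, $\beta_2 = 0.5$, and $\sigma = 1$:

```python
# Minimal sketch: simulate data from y_i = beta1 + beta2 * x_i + e_i
# The parameter values below are illustrative, not from the notes.
import numpy as np

rng = np.random.default_rng(42)

beta1, beta2, sigma = 2.0, 0.5, 1.0   # hypothetical true parameters
x = rng.uniform(0, 10, size=100)      # independent variable
e = rng.normal(0, sigma, size=100)    # random error component with variance sigma^2
y = beta1 + beta2 * x + e             # non-random part (beta1 + beta2*x) plus the error
```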

The estimates of $\beta_1$ and $\beta_2$ obtained using the Ordinary Least Squares (OLS) method are:

$$ \hat{\beta}_1 = \bar{y} - \hat{\beta}_2 \bar{x} $$

$$ \hat{\beta}_2 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{N} (x_i - \bar{x})^2} $$
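A minimal sketch of these two formulas in Python, applied to the simulated x and y from the previous example (the variable names are illustrative):

```python
# OLS estimates computed directly from the formulas above,
# reusing the simulated x and y from the earlier sketch.
import numpy as np

x_bar, y_bar = x.mean(), y.mean()

beta2_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)  # slope estimate
beta1_hat = y_bar - beta2_hat * x_bar                                     # intercept estimate

y_hat = beta1_hat + beta2_hat * x    # fitted values
residuals = y - y_hat                # estimated errors e_hat_i
```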

Goodness of Fit
To assess the performance of the predictive regression model, we use:
● $R^2$: the squared correlation between the observed values and the values predicted by the model.
⇨ The higher the $R^2$, the better the model; the closer it is to 1, the better the fit.
● Root Mean Squared Error (RMSE): measures the model's prediction error.
⇨ The lower the RMSE, the better the model.

Analysis of Variance (ANOVA) and Sums of Squares


$$ \sum_{i=1}^{N} (y_i - \bar{y})^2 = \sum_{i=1}^{N} (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^{N} \hat{e}_i^2 $$

𝑇𝑆𝑆 = 𝐸𝑆𝑆 + 𝑅𝑆𝑆


Total Sum of Squares (TSS) = Explained Sum of Squares (ESS) + Residual Sum of Squares (RSS)
$R^2$

$$ R^2 = \frac{\sum_{i=1}^{N} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2} = 1 - \frac{\sum_{i=1}^{N} \hat{e}_i^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2} $$

$$ R^2 = \frac{ESS}{TSS} = 1 - \frac{RSS}{TSS} $$

⇨ The closer $R^2$ is to 1, the better the fit.
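As a rough illustration, the sums of squares and $R^2$ can be computed directly from the fitted values and residuals of the earlier OLS sketch (continuing with the same illustrative variable names):

```python
# Sums-of-squares decomposition and R-squared,
# continuing from y, y_hat and residuals computed above.
import numpy as np

TSS = np.sum((y - y.mean()) ** 2)        # total sum of squares
ESS = np.sum((y_hat - y.mean()) ** 2)    # explained sum of squares
RSS = np.sum(residuals ** 2)             # residual sum of squares

r_squared = ESS / TSS                    # equivalently: 1 - RSS / TSS
```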
RMSE
- Randomly split your data into a training set (of size $N_1$) and a test set (of size $N_2$).
- Build your regression model using the training set.
- Make predictions on the test set and compute the model accuracy metrics.

$$ RMSE = \sqrt{\frac{\sum_{i=1}^{N_2} (y_i - \hat{y}_i)^2}{N_2}} $$
⇨ The lower the RMSE, the better the model.
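A minimal sketch of this train/test procedure in plain NumPy, assuming the simulated x and y from the first example and an illustrative 70/30 split:

```python
# Fit OLS on N1 training points, then compute RMSE on the N2 held-out test points.
# The 70/30 split and the reuse of the simulated x, y are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
idx = rng.permutation(len(x))
n_train = int(0.7 * len(x))              # N1 = 70% of the data (illustrative)
train, test = idx[:n_train], idx[n_train:]

x_tr, y_tr = x[train], y[train]
x_te, y_te = x[test], y[test]

# OLS estimated on the training set only
b2 = np.sum((x_tr - x_tr.mean()) * (y_tr - y_tr.mean())) / np.sum((x_tr - x_tr.mean()) ** 2)
b1 = y_tr.mean() - b2 * x_tr.mean()

# Predictions and RMSE on the test set
y_pred = b1 + b2 * x_te
rmse = np.sqrt(np.mean((y_te - y_pred) ** 2))
```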

Non-Linear Relationships
$$ y_i = \beta_1 + \beta_2 x_i^2 + e_i $$

The marginal effect: $me = \dfrac{dy}{dx} = 2\beta_2 x$

The elasticity: $\epsilon = \dfrac{dy}{dx} \cdot \dfrac{x}{y} = me \cdot \dfrac{x}{y}$

The log-linear model: $\log(y_i) = \beta_1 + \beta_2 x_i + e_i$
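As a numerical illustration with hypothetical coefficients ($\beta_1 = 1$, $\beta_2 = 0.3$) and an evaluation point $x = 2$, the marginal effect and elasticity of the quadratic model can be computed as follows:

```python
# Marginal effect and elasticity of y = beta1 + beta2 * x^2 at a chosen point.
# The coefficients and evaluation point below are hypothetical, not from the notes.
beta1_q, beta2_q = 1.0, 0.3
x0 = 2.0
y0 = beta1_q + beta2_q * x0 ** 2         # value of the non-random part at x0 (= 2.2)

marginal_effect = 2 * beta2_q * x0       # dy/dx = 2 * beta2 * x  (= 1.2)
elasticity = marginal_effect * x0 / y0   # (dy/dx) * (x / y)      (~ 1.09)
```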
