
Multiple linear regression

(R-squared, multicollinearity, categorical variates)


MMA1402 - Regression and Time Series Modeling

Nanang Susyanto
Department of Mathematics, UGM
R-Squared

Variance decomposition: SS(Tot) = SS(Reg) + SS(Res)

Conclusion: the total variation in the response splits into variation explained by the regression and residual variation.

Interpretation of R-Squared

Coefficient of determination (or Multiple R-squared in R output):

R² = SS(Reg) / SS(Tot) = 1 − SS(Res) / SS(Tot)

• the proportion of the total variation in the response (as expressed by
SS(Tot)) that is explained by the variates in the model
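The "proportion of variation explained" reading can be checked numerically. A minimal sketch using numpy and made-up data (every number below is hypothetical, chosen only for illustration):

```python
import numpy as np

# Hypothetical data: response y and two explanatory variates (illustrative only)
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.3, size=50)

# Fit by least squares: prepend an intercept column
A = np.column_stack([np.ones(len(y)), X])
beta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ beta_hat

ss_tot = np.sum((y - y.mean()) ** 2)   # total variation, SS(Tot)
ss_res = np.sum((y - y_hat) ** 2)      # residual variation, SS(Res)
r_squared = 1 - ss_res / ss_tot        # proportion explained by the model

print(round(r_squared, 3))
```

With a strong signal and small noise, as here, R² lands close to 1; shrinking the coefficients or inflating the noise drives it toward 0.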
Multicollinearity

Presence of strong (linear) relationships among two or more explanatory variates. It leads to wide (imprecise) confidence intervals and inaccurate conclusions from hypothesis tests. Why?
Assess Multicollinearity

• Regress x_j onto all other explanatory variates (do not include the original response)
• Measure the degree of multicollinearity associated with x_j via the variance inflation factor (VIF):

VIF_j = 1 / (1 − R_j²)

where R_j² is the coefficient of determination from that regression. VIF_j can be interpreted as the factor by which the variance of β̂_j is increased relative to the ideal case in which all explanatory variates are uncorrelated.

Rule of thumb: multicollinearity is considered a serious problem if VIF_j > 10 ⟺ R_j² > ⋯
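The two-step recipe above (regress x_j on the other variates, then apply 1/(1 − R_j²)) can be sketched directly in numpy. The data are hypothetical, with x3 built as a near-linear combination of x1 and x2 so that large VIFs appear:

```python
import numpy as np

# Hypothetical explanatory variates with deliberate collinearity: x3 ≈ x1 + x2
rng = np.random.default_rng(1)
x1 = rng.normal(size=100)
x2 = rng.normal(size=100)
x3 = x1 + x2 + rng.normal(scale=0.1, size=100)  # nearly a linear combination
X = np.column_stack([x1, x2, x3])

def vif(X, j):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    column j of X onto the remaining columns (response excluded)."""
    xj = X[:, j]
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(xj)), others])  # intercept + other variates
    beta, *_ = np.linalg.lstsq(A, xj, rcond=None)
    resid = xj - A @ beta
    r2 = 1 - resid @ resid / np.sum((xj - xj.mean()) ** 2)
    return 1 / (1 - r2)

for j in range(X.shape[1]):
    print(f"VIF_{j + 1} = {vif(X, j):.1f}")
```

All three VIFs come out far above the rule-of-thumb threshold of 10, since each variate is nearly a linear combination of the other two.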
Example of Multicollinearity

Related to clients: what do you see?
Modelling categorical variates
How to investigate the relationship between promo type and sales?

How about?
Dummy or indicator variates

How about adding a dummy variate?
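A dummy (indicator) variate turns a categorical variate like promo type into columns a regression can use: one 0/1 column per non-baseline level. A minimal numpy sketch with hypothetical sales figures (nine stores, three promo groups):

```python
import numpy as np

# Hypothetical sales data for three promo groups (illustrative numbers only)
promo = np.array(["none", "promo1", "promo2", "none", "promo1", "promo2",
                  "none", "promo1", "promo2"])
sales = np.array([10.0, 14.0, 12.0, 11.0, 15.0, 13.0, 9.0, 13.0, 11.0])

# Dummy (indicator) variates: d1 = 1 for promo1, d2 = 1 for promo2;
# "no promotion" is the baseline with d1 = d2 = 0
d1 = (promo == "promo1").astype(float)
d2 = (promo == "promo2").astype(float)

A = np.column_stack([np.ones(len(sales)), d1, d2])
beta_hat, *_ = np.linalg.lstsq(A, sales, rcond=None)
b0, b1, b2 = beta_hat

# b0      = estimated mean sales with no promotion
# b0 + b1 = estimated mean sales under promo1
# b0 + b2 = estimated mean sales under promo2
print(b0, b0 + b1, b0 + b2)
```

With dummy coding the fitted values are just the three group means, which is what makes the coefficients interpretable as differences from the baseline.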


Interpretation of parameter estimates

• the estimated mean sales (% increase) over this period is

• no promotion: β̂₀ can be interpreted as…

• estimated mean sales for stores using promo 1:

• estimated mean sales for stores using promo 2:

• β̂₁ and β̂₂ can be interpreted as…
Inference for parameters

Is there a difference in mean sales between stores that used promo1 and stores that used no promotion?

Is there a difference in mean sales between promo1 and promo2?
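Both questions amount to t-tests on linear combinations of the dummy-model coefficients: β₁ = 0 for promo1 vs no promotion, and β₁ − β₂ = 0 for promo1 vs promo2. A minimal numpy sketch with hypothetical sales figures; each |t| would be compared against a t critical value with n − 3 degrees of freedom:

```python
import numpy as np

# Hypothetical sales data for three promo groups (illustrative numbers only)
sales = np.array([10.0, 11.0, 9.0, 14.0, 15.0, 13.0, 12.0, 13.0, 11.0])
d1 = np.array([0, 0, 0, 1, 1, 1, 0, 0, 0], dtype=float)  # promo1 indicator
d2 = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1], dtype=float)  # promo2 indicator

A = np.column_stack([np.ones(len(sales)), d1, d2])
beta, *_ = np.linalg.lstsq(A, sales, rcond=None)
resid = sales - A @ beta
n, p = A.shape
sigma2 = resid @ resid / (n - p)        # residual variance estimate
cov = sigma2 * np.linalg.inv(A.T @ A)   # estimated Var(beta_hat)

# H0: beta1 = 0  (promo1 vs no promotion)
t_promo1 = beta[1] / np.sqrt(cov[1, 1])

# H0: beta1 = beta2  (promo1 vs promo2), via the contrast c = (0, 1, -1)
c = np.array([0.0, 1.0, -1.0])
t_diff = (c @ beta) / np.sqrt(c @ cov @ c)

print(t_promo1, t_diff)
```

The second test illustrates a general point: comparing two non-baseline levels needs the full covariance matrix of β̂, not just the individual standard errors, because β̂₁ and β̂₂ are correlated.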
