Professional Documents
Culture Documents
Key Takeaways
There is a linear relationship between the dependent variables and the independent
variables
The independent variables are not too highly correlated with each other
yi observations are selected independently and randomly from the population
Residuals should be normally distributed with a mean of 0 and variance σ
The coefficient of determination (R-squared) is a statistical metric that is used to measure how
much of the variation in outcome can be explained by the variation in the independent variables.
R2 always increases as more predictors are added to the MLR model, even though the predictors
may not be related to the outcome variable.
R2 by itself can't thus be used to identify which predictors should be included in a model and
which should be excluded. R2 can only be between 0 and 1, where 0 indicates that the outcome
cannot be predicted by any of the independent variables and 1 indicates that the outcome can be
predicted without error from the independent variables.1
When interpreting the results of multiple regression, beta coefficients are valid while holding all
other variables constant ("all else equal"). The output from a multiple regression can be
displayed horizontally as an equation, or vertically in table form.2
In reality, multiple factors predict the outcome of an event. The price movement of ExxonMobil,
for example, depends on more than just the performance of the overall market. Other predictors
such as the price of oil, interest rates, and the price movement of oil futures can affect the price
of XOM and stock prices of other oil companies. To understand a relationship in which more
than two variables are present, multiple linear regression is used.
Still, the model is not always perfectly accurate as each data point can differ slightly from the
outcome predicted by the model. The residual value, E, which is the difference between the
actual outcome and the predicted outcome, is included in the model to account for such slight
variations.
Assuming we run our XOM price regression model through a statistics computation software,
that returns this output:
An analyst would interpret this output to mean if other variables are held constant, the price of
XOM will increase by 7.8% if the price of oil in the markets increases by 1%. The model also
shows that the price of XOM will decrease by 1.5% following a 1% rise in interest rates. R2
indicates that 86.5% of the variations in the stock price of Exxon Mobil can be explained by
changes in the interest rate, oil price, oil futures, and S&P 500 index.
Multiple regressions are based on the assumption that there is a linear relationship between both
the dependent and independent variables. It also assumes no major correlation between the
independent variables.
What makes a multiple regression multiple?
A multiple regression considers the effect of more than one explanatory variable on some
outcome of interest. It evaluates the relative effect of these explanatory, or independent, variables
on the dependent variable when holding all the other variables in the model constant.