
ST 4011 – Econometrics (30L, 2C)
Department of Statistics

Please note that these ideas are mainly obtained from the book "Basic Econometrics" by D. N. Gujarati.

Recall from the last lesson that

Var(β̂₁) = σ² Σ x₂ᵢ² / [(Σ x₁ᵢ²)(Σ x₂ᵢ²) − (Σ x₁ᵢx₂ᵢ)²].

This can be simplified to

Var(β̂₁) = σ² / { Σ x₁ᵢ² [1 − (Σ x₁ᵢx₂ᵢ / (√(Σ x₁ᵢ²) √(Σ x₂ᵢ²)))²] }.

Note that the X variables are defined in deviation form from their respective mean values. Therefore, the above equation can be shown to equal

Var(β̂₁) = σ² / [Σ x₁ᵢ² (1 − r₁₂²)], where r₁₂ is the correlation between x₁ and x₂.

Similarly, Var(β̂₂) = σ² / [Σ x₂ᵢ² (1 − r₁₂²)]. (All sums run over i = 1, …, n.)

Practical Consequences of Multicollinearity

1. OLS estimators are BLUE (Best Linear Unbiased Estimators) even under multicollinearity conditions.
If the general assumptions of the linear model are satisfied, the OLS estimators of regression models are still BLUE even if the multicollinearity is high. Unbiasedness is a multi-sample, or repeated-sampling, property. It means that the average of the OLS estimates computed from repeated samples, keeping the values of the X variables fixed, converges to the corresponding unknown true parameters as the number of samples increases. Estimates from a given sample can still lie close to or far from their true parameter values depending on the variances of the estimators. Generally, multicollinearity produces estimators with larger variances, as discussed below.

2. Although BLUE, the OLS estimators have large variances and covariances, making precise estimation difficult.
It is evident from the above variance estimators that as r₁₂ tends toward 1, that is, as the multicollinearity increases, the variances of the two estimators increase, and they are infinite in the limit when r₁₂ = 1.

3. The confidence intervals tend to be wider, leading to the acceptance of null hypotheses in a misleading manner and resulting in increased Type II errors.
You know that confidence intervals for parameters get wider if the standard errors of the estimators are high. Then, the null hypotheses may not be rejected as they should be. Type II errors occur when null hypotheses are not rejected although they are false.
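As a quick numerical aside on consequences 2 and 3, the following short Python sketch (added here as an illustration, not part of the original notes) evaluates the inflation factor 1/(1 − r₁₂²) that multiplies both variances, for a few arbitrary values of r₁₂:

# Illustration: how the factor 1/(1 - r12^2) inflates Var(beta_hat_1)
# and Var(beta_hat_2) as the correlation r12 between x1 and x2 grows.
for r12 in [0.0, 0.5, 0.8, 0.9, 0.99]:
    inflation = 1.0 / (1.0 - r12 ** 2)
    print(f"r12 = {r12:4.2f}  ->  variance inflated by a factor of {inflation:7.2f}")

At r₁₂ = 0.99 the variances are already about 50 times larger than in the uncorrelated case, which is the practical meaning of consequence 2 above.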

4. The t-ratios of one or more coefficients tend to be statistically insignificant.
This is related to the issue discussed above of not rejecting the null hypotheses because of high values of the standard error estimates. Hence, we do not have enough evidence to reject the null hypotheses, implying that the corresponding coefficients are statistically insignificant.

5. Although the t-ratios of one or more coefficients are statistically insignificant, R², the overall measure of goodness of fit, can be high.
This happens when the F test is significant (the joint null hypothesis that all the coefficients are equal to zero is rejected), but one or more t-ratios of the explanatory variables are individually insignificant, meaning that some of the explanatory variables are highly correlated and it is therefore impossible to isolate their individual impact on the response variable of interest. However, these correlated explanatory variables can still jointly explain a considerable proportion of the variability of the response variable, leading to a high R² value in the model.

Exercise 01: Consider the following hypothetical data to fit a regression of consumption expenditure (Yₜ) on income (X₁ₜ) and wealth (X₂ₜ).

Table 01
Yₜ ($)   X₁ₜ ($)   X₂ₜ ($)
  70        80       810
  65       100      1009
  90       120      1273
  95       140      1425
 110       160      1633
 115       180      1876
 120       200      2052
 140       220      2201
 155       240      2435
 150       260      2686

i. Fit a linear regression model with both explanatory variables and observe that the individual regression coefficients are insignificant but the overall F test is significant with a high R² value.
ii. Fit separate linear regression models by taking one explanatory variable at a time and observe that the individual regression coefficients are significant.
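One possible way to carry out this exercise in Python, using the statsmodels package (an added sketch, not part of the original exercise), is given below. It fits the full model of part i and the two single-variable models of part ii on the Table 01 data:

import pandas as pd
import statsmodels.api as sm

# Table 01: consumption expenditure Y, income X1, wealth X2
data = pd.DataFrame({
    "Y":  [70, 65, 90, 95, 110, 115, 120, 140, 155, 150],
    "X1": [80, 100, 120, 140, 160, 180, 200, 220, 240, 260],
    "X2": [810, 1009, 1273, 1425, 1633, 1876, 2052, 2201, 2435, 2686],
})

# (i) Both explanatory variables together
full = sm.OLS(data["Y"], sm.add_constant(data[["X1", "X2"]])).fit()
print(full.summary())   # individual t-ratios, overall F test and R-squared

# (ii) One explanatory variable at a time
for col in ["X1", "X2"]:
    single = sm.OLS(data["Y"], sm.add_constant(data[[col]])).fit()
    print(col, "coef:", round(single.params[col], 4), "p-value:", round(single.pvalues[col], 4))

The full model shows individually insignificant t-ratios together with a significant F statistic and a high R², while each single-variable model shows a significant coefficient, which is exactly the pattern the exercise asks you to observe.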


6. The OLS estimators and their standard errors can be sensitive to small changes in the data.
Under less-than-perfect multicollinearity, the regression coefficients can be estimated, but the estimates and their standard errors become very sensitive to even the slightest change in the data. This is related to the fact that the individual regression coefficients cannot be estimated precisely.

Exercise 02: Consider the following hypothetical data and observe that there is only a small change in the data between the two tables. Fit linear regression models to the data and observe that the estimates can be considerably different between the models.

Table 02            Table 03
Y   X1   X2         Y   X1   X2
1    2    4         1    2    4
2    0    2         2    0    2
3    4    0         3    4   12
4    6   12         4    6    0
5    8   16         5    8   16
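A possible sketch for this exercise (again using statsmodels; note that the only difference between the two datasets is the swap of the third and fourth values of X2):

import pandas as pd
import statsmodels.api as sm

table02 = pd.DataFrame({"Y": [1, 2, 3, 4, 5],
                        "X1": [2, 0, 4, 6, 8],
                        "X2": [4, 2, 0, 12, 16]})
table03 = pd.DataFrame({"Y": [1, 2, 3, 4, 5],
                        "X1": [2, 0, 4, 6, 8],
                        "X2": [4, 2, 12, 0, 16]})

for name, df in [("Table 02", table02), ("Table 03", table03)]:
    fit = sm.OLS(df["Y"], sm.add_constant(df[["X1", "X2"]])).fit()
    print(name, "coefficients:", fit.params.round(3).to_dict(),
          "standard errors:", fit.bse.round(3).to_dict())

Comparing the two fits shows how much the coefficient estimates and their standard errors move in response to this single small change in the data.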


Detection of Multicollinearity

The following indicators are useful for measuring the degree of multicollinearity in a given context.

1. A high R², but few significant t-ratios among the regression coefficients of a model.

2. High pairwise correlations among the explanatory variables of a model.
Rule of thumb: r ≥ 0.8 generally indicates severe multicollinearity.

3. Auxiliary regression
Obtain a regression model for each Xᵢ based on the remaining X variables and obtain Rᵢ² for each model. Note that in the presence of multicollinearity, one or more X variables could be represented as exact or approximate linear combinations of one or more of the remaining X variables. Therefore, it is meaningful to consider auxiliary regressions in this context.
Klein's rule of thumb: The degree of multicollinearity is high if the Rᵢ² of any auxiliary regression is greater than the R² of the overall model with all explanatory variables.

4. Eigenvalues and Condition Index
Obtain the eigenvalues of the XᵀX matrix of the regression data and compute the condition index (CI) as
CI = √(maximum eigenvalue / minimum eigenvalue).
CI < 10 → fine
10 ≤ CI < 30 → moderate to strong multicollinearity
CI ≥ 30 → severe multicollinearity.

5. Variance Inflation Factor (VIF)
Here, the tolerance is calculated first before proceeding to the VIF:
toleranceᵢ = 1 − Rᵢ²  →  VIFᵢ = 1 / toleranceᵢ.  If VIFᵢ > 10 → high multicollinearity.
It is also useful to observe that for the two-explanatory-variable regression case, Rᵢ² = r₁₂². Hence, the variances discussed above, Var(β̂₁) = σ² / [Σ x₁ᵢ² (1 − r₁₂²)] and Var(β̂₂) = σ² / [Σ x₂ᵢ² (1 − r₁₂²)], can also be given as Var(β̂₁) = (σ² / Σ x₁ᵢ²) · VIF and Var(β̂₂) = (σ² / Σ x₂ᵢ²) · VIF, respectively.
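The following Python sketch (an added illustration using numpy and statsmodels on the income–wealth data of Table 01) computes the auxiliary-regression Rᵢ², the tolerances and VIFs, and the condition index. The condition index is computed from the raw XᵀX matrix, as defined above; statistical software often rescales the columns first, so its reported numbers can differ.

import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

X = pd.DataFrame({"X1": [80, 100, 120, 140, 160, 180, 200, 220, 240, 260],
                  "X2": [810, 1009, 1273, 1425, 1633, 1876, 2052, 2201, 2435, 2686]})

# Auxiliary regressions: regress each X_i on the remaining X variables
for col in X.columns:
    aux = sm.OLS(X[col], sm.add_constant(X.drop(columns=col))).fit()
    r2_i = aux.rsquared
    print(f"{col}: auxiliary R^2 = {r2_i:.4f}, tolerance = {1 - r2_i:.4f}, VIF = {1 / (1 - r2_i):.1f}")

# The same VIFs via the statsmodels helper (expects a design matrix with a constant)
design = sm.add_constant(X)
for j, col in enumerate(X.columns, start=1):
    print(col, "VIF:", variance_inflation_factor(design.values, j))

# Condition index from the eigenvalues of X'X
eigvals = np.linalg.eigvalsh(X.values.T @ X.values)
print("condition index:", np.sqrt(eigvals.max() / eigvals.min()))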


Remedial Methods

1. Combining cross-sectional and time series data (pooling data)
Consider the following example of studying the demand for automobiles in the USA using a time series regression model, where
Yₜ − number of cars sold at time t
Pₜ − average price at time t
Iₜ − average income at time t:
ln(Yₜ) = β₀ + β₁ ln(Pₜ) + β₂ ln(Iₜ) + uₜ.
Note that log values are generally used to satisfy the constant variance assumption of the model. Here, the objective is to estimate the price elasticity (β₁) and the income elasticity (β₂).
Note: Price elasticity is an economic measure that indicates the change in consumption of a certain product in relation to a change in its price. Income elasticity measures how responsive the quantity demanded of a good or service is to a change in income.
In time series data, price and income generally tend to be highly correlated as they both tend to increase over time. Therefore, the multicollinearity effect can be present in the fitted time series model.
Suppose we have cross-sectional data from consumer panels and budget studies conducted by various private and government agencies. From such data we can obtain a reliable estimate of the income elasticity β₂. The reliability of the estimate can be assured because prices do not vary much within cross-sectional data obtained at a particular point in time.
We can then write the following model based on the estimate β̂₂:
Yₜ* = β₀ + β₁ ln(Pₜ) + uₜ, where Yₜ* = ln(Yₜ) − β̂₂ ln(Iₜ).
Here, Yₜ* represents the adjusted value after removing the effect of income from ln(Yₜ). Now, β₁ can be estimated using the adjusted model.
Assumptions:
i. The cross-sectionally estimated income elasticity is approximately equal to the corresponding estimate obtained from the pure time series data.
ii. Cross-sectional estimates do not vary substantially from one cross section to another.
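A minimal sketch of the pooling idea is given below. The data are simulated (hypothetical trending price and income series), and the value β̂₂ = 1.1 merely stands in for an income elasticity assumed to be available from a cross-sectional study; only the construction of Yₜ* and the regression of Yₜ* on ln(Pₜ) follow the method described above.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
t = np.arange(40)
# hypothetical trending series: price P_t and income I_t both grow over time,
# so ln(P_t) and ln(I_t) are highly collinear in a pure time series regression
P = 100 * np.exp(0.02 * t + rng.normal(0, 0.02, t.size))
I = 200 * np.exp(0.03 * t + rng.normal(0, 0.02, t.size))
Y = np.exp(2.0 - 0.8 * np.log(P) + 1.1 * np.log(I) + rng.normal(0, 0.05, t.size))

beta2_hat = 1.1  # income elasticity assumed to come from a cross-sectional study

# adjusted response Y*_t = ln(Y_t) - beta2_hat * ln(I_t), regressed on ln(P_t)
y_star = np.log(Y) - beta2_hat * np.log(I)
fit = sm.OLS(y_star, sm.add_constant(np.log(P))).fit()
print(fit.params)  # intercept (beta_0) and price elasticity (beta_1)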


2. Dropping highly related variables
The simplest method to alleviate the effect of multicollinearity is to drop one or more collinear variables from the model. In the above-discussed example of regressing consumption expenditure on income and wealth, both coefficients were insignificant in the original model, but the coefficient of income was highly significant after removing wealth from the model.

3. Transformation of variables
Let us consider the following two-explanatory-variable time series model and assume that multicollinearity is present because the variables share common trends:
Yₜ = β₀ + β₁X₁ₜ + β₂X₂ₜ + uₜ.
If the model holds at time t, it must also hold at time t − 1 (the origin of time is arbitrary). Hence,
Yₜ₋₁ = β₀ + β₁X₁(t−1) + β₂X₂(t−1) + uₜ₋₁.
Now take the difference between the two models:
Yₜ − Yₜ₋₁ = β₁(X₁ₜ − X₁(t−1)) + β₂(X₂ₜ − X₂(t−1)) + vₜ, where vₜ = uₜ − uₜ₋₁.
Here, the regression analysis is conducted on the differenced series. Hence, the multicollinearity problem may not appear in the new model.
Idea: Although the original X series can be correlated, their differenced series may not be correlated after removing the trend. (A short numerical sketch of this idea appears after method 4 below.)

4. Ratio transformation
Consider the following time series regression model:
Yₜ = β₀ + β₁X₁ₜ + β₂X₂ₜ + uₜ, where
Yₜ − consumption expenditure at time t
X₁ₜ − Gross Domestic Product (GDP) at time t
X₂ₜ − total population at time t.
Assume that X₁ₜ and X₂ₜ can be correlated and that the multicollinearity problem arises due to common trends over time, as above. Here, one potential solution to reduce the multicollinearity effect is to express the model on a per capita basis by applying the ratio transformation to the original time series regression model:
Yₜ/X₂ₜ = β₀(1/X₂ₜ) + β₁(X₁ₜ/X₂ₜ) + β₂ + uₜ/X₂ₜ.
One potential problem with this model is the possibility of non-constant variance in uₜ/X₂ₜ, even though the constant variance assumption may be satisfied for the original error terms uₜ. Therefore, we must carefully assess the impact of this transformation on the severity of the violation of the OLS assumptions.
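The sketch below (simulated data, added as an illustration of methods 3 and 4) first shows that two series sharing a common trend are highly correlated in levels but far less so after first differencing, and then shows one way to estimate the per capita (ratio) form of the model:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
t = np.arange(50)

# two hypothetical series that share a common upward trend
X1 = 10 + 0.5 * t + rng.normal(0, 1, t.size)
X2 = 5 + 0.8 * t + rng.normal(0, 1, t.size)
print("correlation of the levels:     ", np.corrcoef(X1, X2)[0, 1])
print("correlation of the differences:", np.corrcoef(np.diff(X1), np.diff(X2))[0, 1])

# ratio (per capita) form of method 4, treating X2 as the population:
# Y/X2 is regressed on 1/X2 and X1/X2; the fitted constant estimates beta_2,
# the coefficient on 1/X2 estimates beta_0, and the coefficient on X1/X2 estimates beta_1
Y = 2 + 1.5 * X1 + 0.3 * X2 + rng.normal(0, 1, t.size)
Z = sm.add_constant(np.column_stack([1 / X2, X1 / X2]))
fit = sm.OLS(Y / X2, Z).fit()
print(fit.params)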


5. Increasing the sample size
Note that Var(β̂₁) = σ² / [Σ x₁ᵢ² (1 − r₁₂²)], as shown above. When the sample size increases, Σ x₁ᵢ² will generally increase. Therefore, Var(β̂₁) will decrease for a given correlation coefficient r₁₂.

6. Applying the Principal Component and Factor Analysis techniques
Consider a general regression model
Y = β₀ + β₁X₁ + β₂X₂ + β₃X₃ + u
and assume that X₁, X₂, and X₃ are correlated. According to Principal Component Analysis (PCA), three linear combinations of X₁, X₂, and X₃, denoted PC₁, PC₂, and PC₃ (the principal components), can be created in such a way that they are uncorrelated and together they explain the total variation of the original three variables X₁, X₂, and X₃.
The purpose of performing PCA is to reduce the dimension of the data. Therefore, we generally try to use a smaller number of PCs than the number of X variables, chosen so that they describe a considerable proportion of the variability of the original set of variables.
In summary, this technique can be used to overcome the problem of multicollinearity because the principal components are uncorrelated with each other. (A minimal principal-components-regression sketch is given at the end of this section.)
Factor Analysis is generally concerned not only with identifying uncorrelated linear combinations but also with interpretable ones. Hence, it can also be used to reduce the multicollinearity problem.

7. Applying mean-centered variables in polynomial regression
In polynomial regression models, an explanatory variable appears with different powers. Therefore, these terms can be correlated according to the way they are introduced into the model.
Consider the following cubic polynomial model with one explanatory variable X:
Y = β₀ + β₁X + β₂X² + β₃X³ + u.
Suppose the explanatory variable is expressed in mean-centered form as given below:
Y = β₀ + β₁(X − X̄) + β₂(X − X̄)² + β₃(X − X̄)³ + u.
This generally leads to a substantial reduction of the multicollinearity problem in practice. Mathematically, this deviation form from the overall mean (mean-centering) does not change the fitted regression itself. (A short sketch comparing the correlations of the raw and centered powers follows the PCA sketch below.)
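A minimal principal-components-regression sketch for method 6 is given below. The data are simulated (three hypothetical regressors driven by a shared factor); in practice the number of retained components is chosen from the explained-variance proportions.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 200

# three hypothetical regressors built from a common factor, hence highly correlated
f = rng.normal(size=n)
X = np.column_stack([f + 0.3 * rng.normal(size=n) for _ in range(3)])
y = 1.0 + X @ np.array([0.5, -0.2, 0.8]) + rng.normal(size=n)

# principal components of the standardised regressors
Z = (X - X.mean(axis=0)) / X.std(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
PC = Z @ eigvecs[:, order]                 # uncorrelated components
print("proportion of variance explained:", eigvals[order] / eigvals.sum())

# regress y on the leading component(s) only
fit = sm.OLS(y, sm.add_constant(PC[:, :1])).fit()
print(fit.params)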

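For method 7, the following short sketch (with an arbitrary simulated regressor lying well away from zero) compares the correlations among the raw powers X, X², X³ with those among the mean-centered powers:

import numpy as np

rng = np.random.default_rng(3)
X = rng.uniform(10, 20, size=100)   # regressor far from zero, so its powers are nearly collinear

raw = np.column_stack([X, X ** 2, X ** 3])
Xc = X - X.mean()
centred = np.column_stack([Xc, Xc ** 2, Xc ** 3])

print("correlations of the raw powers:\n", np.corrcoef(raw, rowvar=False).round(3))
print("correlations of the centred powers:\n", np.corrcoef(centred, rowvar=False).round(3))

The raw powers are almost perfectly correlated with one another, while several pairs of the centered powers are nearly uncorrelated, which illustrates why mean-centering reduces the multicollinearity in polynomial models.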