Chapter 9
Multicollinearity

Linear Regression Analysis, 6th Edition (Montgomery, Peck & Vining)

9.1 Introduction
• Multicollinearity is a problem that plagues many regression
models. It impacts the estimates of the individual regression
coefficients.
• Uses of regression:
1. Identifying the relative effects of the regressor variables
2. Prediction and/or estimation, and
3. Selection of an appropriate set of variables for the model.

• If all regressors are orthogonal, then multicollinearity is not a
problem. This is a rare situation in regression analysis.
• More often than not, there are near-linear dependencies among the regressors, such that

  \sum_{j=1}^{p} t_j X_j = 0

is approximately true for some set of constants t_j. If this relation holds exactly for a subset of the regressors, then X'X is singular and (X'X)^{-1} does not exist.


9.2 Sources of Multicollinearity

Four primary sources:


1. The data collection method employed
2. Constraints on the model or in the population
3. Model specification
4. An overdefined model


Data collection method employed
- Occurs when only a subsample of the entire sample space has been selected. (Soft drink delivery example: the number of cases and the delivery distance tend to be correlated; that is, the data may contain only small case counts paired with short distances and large case counts paired with longer distances.) We may be able to reduce this multicollinearity by changing the sampling technique, since there is no physical reason why we cannot collect data in the rest of the region (for example, small orders delivered over long distances).


Constraints on the model or in the population
- (Electricity consumption example: x1 = family income and x2 = house size.) Families with larger incomes tend to have larger houses, so a constraint is present in the population itself, and multicollinearity will exist regardless of the data collection method.


Model Specification
Adding polynomial terms to the model can cause ill-conditioning in the X'X matrix. This is especially true when the range of the regressor variable x is small.


Overdefined model
There are more regressor variables than observations. The best way to counter this is to remove regressor variables.
- Recommendations:
1) Redefine the model using a smaller set of regressors;
2) Do preliminary studies using subsets of the regressors; or
3) Use principal-components-type regression methods to remove regressors.


9.3 Effects of Multicollinearity


Strong multicollinearity can result in large variances and covariances for the least-squares estimates of the coefficients. Recall from Chapter 3 that, with X'X in correlation form, C = (X'X)^{-1} and

  C_{jj} = \frac{1}{1 - R_j^2}

where R_j^2 is the coefficient of determination from regressing x_j on the other regressors. Strong multicollinearity between x_j and any other regressor will make R_j^2 close to 1, and thus C_{jj} large. Since Var(\hat{\beta}_j) = \sigma^2 C_{jj}, the variance of the least-squares estimate of that coefficient will be very large.


Strong multicollinearity can also produce least-squares estimates of the coefficients that are too large in absolute value. The squared distance from the least-squares estimate to the true parameter vector is

  L_1^2 = (\hat{\beta} - \beta)'(\hat{\beta} - \beta)

and its expected value is

  E(L_1^2) = E[(\hat{\beta} - \beta)'(\hat{\beta} - \beta)] = \sigma^2 \, \mathrm{Tr}(X'X)^{-1}

The trace of (X'X)^{-1} is the sum of the reciprocals of the eigenvalues of X'X, so when multicollinearity drives some eigenvalues toward zero, \hat{\beta} is, on average, far from \beta.
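A small simulation sketch (the nearly collinear design, the true coefficients, and the number of replications are all invented for illustration) checks this identity by comparing the average squared distance to \sigma^2 Tr(X'X)^{-1}:

```python
import numpy as np

rng = np.random.default_rng(3)
n, sigma = 200, 1.0
x1 = rng.standard_normal(n)
x2 = x1 + 0.02 * rng.standard_normal(n)        # nearly collinear pair
X = np.column_stack([x1, x2])
beta = np.array([1.0, 1.0])

theory = sigma**2 * np.trace(np.linalg.inv(X.T @ X))

sq_dist = []
for _ in range(2000):
    y = X @ beta + sigma * rng.standard_normal(n)
    b = np.linalg.solve(X.T @ X, X.T @ y)      # least-squares estimate
    sq_dist.append((b - beta) @ (b - beta))

print(np.mean(sq_dist), theory)                # the two values should be close
```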


9.4 Multicollinearity Diagnostics

• Ideal characteristics of a multicollinearity diagnostic:


1. The procedure should correctly indicate whether multicollinearity is present; and
2. The procedure should provide some insight into which regressors are causing the problem.


9.4.1 Examination of the Correlation Matrix


• If we center and scale the regressors to unit length, X'X becomes the correlation matrix. The pairwise correlation between two regressors x_i and x_j is denoted r_ij; the off-diagonal elements of the centered and scaled X'X matrix (X'X in correlation form) are these pairwise correlations.
• If |r_ij| is close to unity, this may indicate multicollinearity. But the converse does not always hold: there may be instances when multicollinearity is present yet no pairwise correlation indicates a problem. This can happen when more than two regressors are involved in a near-linear dependence, because pairwise correlations examine only two variables at a time.
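The following sketch (invented three-regressor data, not from the book) shows exactly this failure mode: x3 is almost an exact linear combination of x1 and x2, yet no single pairwise correlation is close to 1.

```python
import numpy as np

rng = np.random.default_rng(1)
x1 = rng.standard_normal(50)
x2 = rng.standard_normal(50)
x3 = x1 + x2 + 0.01 * rng.standard_normal(50)   # near-linear dependence: x1 + x2 - x3 ~ 0
X = np.column_stack([x1, x2, x3])

# Off-diagonal elements of X'X in correlation form
print(np.corrcoef(X, rowvar=False).round(2))
# The largest |r_ij| is only about 0.7, even though the three
# regressors are severely multicollinear as a group.
```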


The correlation matrix fails to identify the multicollinearity problem in the Mason, Gunst & Webster data in Table 9.4, page 304.


9.4.2 Variance Inflation Factors


• As discussed in Chapter 3, variance inflation factors are very useful for determining whether multicollinearity is present:

  VIF_j = C_{jj} = (1 - R_j^2)^{-1}

• VIFs exceeding 5 to 10 indicate a multicollinearity problem. The regressors with high VIFs probably have poorly estimated regression coefficients.
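A minimal sketch of this computation (assuming the regressors are in a NumPy array X; the helper name vifs is ours, not from any library): the VIFs are the diagonal elements of the inverse of the regressors' correlation matrix.

```python
import numpy as np

def vifs(X):
    """Variance inflation factors: the diagonal of the inverse of the
    correlation matrix of the regressors (X'X in correlation form)."""
    R = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(R))

# Example with the three nearly dependent regressors sketched above:
# print(vifs(X).round(1))   # all three VIFs come out far above 10
```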


VIFs: A Second Look and Interpretation
• The length of the normal-theory confidence interval on the jth regression coefficient can be written as

  L_j = 2 (C_{jj} \hat{\sigma}^2)^{1/2} \, t_{\alpha/2, n-p-1}


• The length of the corresponding normal-theory confidence interval for a design with orthogonal regressors (same sample size, same root-mean-square values of the regressors) is

  L^* = 2 \hat{\sigma} \, t_{\alpha/2, n-p-1}


• Taking the ratio of these two lengths gives L_j / L^* = C_{jj}^{1/2}. That is, the square root of the jth VIF measures how much longer the confidence interval for the jth regression coefficient is because of multicollinearity.

• For example, suppose VIF_3 = 10. Then \sqrt{VIF_3} \approx 3.16, so the confidence interval for \beta_3 is roughly 3.2 times longer than it would be if the regressors were orthogonal (the best-case scenario).


9.4.3 Eigensystem Analysis of X’X


• The eigenvalues of X'X (denoted \lambda_1, \lambda_2, ..., \lambda_p) can be used to measure multicollinearity. Eigenvalues near zero indicate near-linear dependencies among the regressors.
• The condition number of X'X is

  \kappa = \lambda_{max} / \lambda_{min}

which measures the spread in the eigenvalues. Rough guidelines:
  \kappa < 100: no serious problem
  100 <= \kappa <= 1000: moderate to strong multicollinearity
  \kappa > 1000: severe multicollinearity

• A large condition number indicates that multicollinearity exists, but it does not tell us how many regressors are involved.
• The condition indices of X'X are

  \kappa_j = \lambda_{max} / \lambda_j,   j = 1, 2, ..., p

• The number of condition indices that are large (say, greater than 1000) is a useful measure of the number of near-linear dependencies in X'X.
• In SAS PROC REG, the COLLIN option on the MODEL statement produces the eigenvalues, condition indices, and related diagnostics.
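Outside SAS, the same quantities are easy to compute directly. A minimal NumPy sketch (the unit-length scaling mirrors the correlation-form convention used above; the function name is ours):

```python
import numpy as np

def eigensystem_diagnostics(X):
    """Eigenvalues, condition number, and condition indices of X'X,
    with the columns of X centered and scaled to unit length."""
    Z = X - X.mean(axis=0)
    Z = Z / np.linalg.norm(Z, axis=0)
    eigvals = np.linalg.eigvalsh(Z.T @ Z)       # ascending order
    kappa = eigvals.max() / eigvals.min()       # condition number
    indices = eigvals.max() / eigvals           # condition indices
    return eigvals, kappa, indices

# eigvals, kappa, indices = eigensystem_diagnostics(X)
# Counting how many condition indices exceed 1000 estimates the
# number of near-linear dependencies.
```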

9.5 Methods for Dealing with Multicollinearity

• Collect more data
• Respecify the model
• Ridge regression and related techniques (principal-component regression, the lasso, etc.)


• Least-squares estimation gives an unbiased estimator,

  E(\hat{\beta}) = \beta

with minimum variance among linear unbiased estimators. But that minimum variance may still be very large, resulting in unstable estimates of the coefficients.
  - Alternative: find an estimator that is biased but has a smaller variance (and hence a smaller mean squared error) than the unbiased estimator.


Ridge estimator \hat{\beta}_R:

  \hat{\beta}_R = (X'X + kI)^{-1} X'y
              = (X'X + kI)^{-1} X'X \hat{\beta}
              = Z_k \hat{\beta}

where \hat{\beta} is the ordinary least-squares estimator and k >= 0 is a "biasing parameter," usually between 0 and 1.
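A minimal NumPy sketch of this estimator (it assumes X has already been centered and scaled and y centered, as is customary for ridge regression; the function name is ours):

```python
import numpy as np

def ridge_estimator(X, y, k):
    """Ridge estimator beta_R = (X'X + kI)^{-1} X'y for a given
    biasing parameter k. X is centered and scaled, y is centered."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

# ridge_estimator(X, y, k=0.0) reproduces the least-squares estimate;
# increasing k shrinks the coefficients toward zero.
```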


The effect of k on the MSE
Recall that MSE(\hat{\beta}^*) = Var(\hat{\beta}^*) + (bias)^2. For the ridge estimator,

  MSE(\hat{\beta}_R) = Var(\hat{\beta}_R) + (bias in \hat{\beta}_R)^2
                     = \sigma^2 \sum_{j=1}^{p} \frac{\lambda_j}{(\lambda_j + k)^2} + k^2 \beta'(X'X + kI)^{-2} \beta

As k increases, the variance term decreases and the squared bias increases. Choose k so that the reduction in variance exceeds the increase in squared bias. Note also that the residual sum of squares

  SS_Res = (y - X\hat{\beta}_R)'(y - X\hat{\beta}_R)

increases with k, so ridge regression trades some goodness of fit for more stable coefficient estimates.
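The trade-off is easy to see numerically. In the sketch below, the eigenvalues of X'X and the true coefficients (expressed in the eigenvector basis of X'X) are invented purely for illustration:

```python
import numpy as np

# Invented eigenvalues of X'X (two nearly zero) and true coefficients
# in the eigenvector basis of X'X; illustration only.
lam = np.array([4.0, 1.0, 0.05, 0.01])
alpha = np.array([1.0, -2.0, 0.5, 1.5])
sigma2 = 1.0

for k in [0.0, 0.01, 0.05, 0.1, 0.5]:
    var_term = sigma2 * np.sum(lam / (lam + k) ** 2)
    bias_sq = k ** 2 * np.sum(alpha ** 2 / (lam + k) ** 2)
    print(f"k={k:4.2f}  variance={var_term:8.2f}  squared bias={bias_sq:6.3f}")
# The variance term falls sharply for small k > 0, while the squared
# bias grows slowly, so the total MSE can be smaller than at k = 0.
```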


• Ridge Trace
- A plot of the coefficient estimates \hat{\beta}_R against k. If multicollinearity is severe, the instability in the coefficients will be evident in the trace. Choose the smallest k at which \hat{\beta}_R is stable, and hope that the resulting MSE is acceptable.
- Ridge regression is a good alternative when the model user wants to keep all of the regressors in the model.
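A sketch of such a trace (the two-regressor data here are simulated and nearly collinear; the grid of k values is arbitrary):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)
n = 60
x1 = rng.standard_normal(n)
x2 = x1 + 0.05 * rng.standard_normal(n)            # nearly collinear with x1
X = np.column_stack([x1, x2])
X = X - X.mean(axis=0)
X = X / np.linalg.norm(X, axis=0)                  # unit-length columns
y = X @ np.array([2.0, -1.0]) + 0.1 * rng.standard_normal(n)
y = y - y.mean()

ks = np.linspace(0.0, 0.5, 100)
trace = np.array([np.linalg.solve(X.T @ X + k * np.eye(2), X.T @ y) for k in ks])

plt.plot(ks, trace)                                # one curve per coefficient
plt.xlabel("k (biasing parameter)")
plt.ylabel("ridge coefficient estimates")
plt.title("Ridge trace")
plt.show()
# Pick the smallest k at which the curves flatten out.
```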



More About Ridge Regression


• Methods for choosing k
• Relationship to other estimators
• Ridge regression and variable selection
• Generalized ridge regression (a procedure with a separate biasing parameter for each regressor)


Generalized Regression Techniques


9.5.4 Principal-Component Regression


The eigenvalues suggest that a model based on 4 or 5 of the principal components would probably be adequate.
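A compact sketch of the principal-component regression computation itself (assuming X is centered and scaled and y centered; the number of components to retain is the analyst's choice, passed in as n_components; the function name is ours):

```python
import numpy as np

def pc_regression(X, y, n_components):
    """Principal-component regression: regress y on the leading
    principal components of X, then transform the coefficients back
    to the scale of the original regressors."""
    eigvals, T = np.linalg.eigh(X.T @ X)        # eigenvectors in columns of T
    order = np.argsort(eigvals)[::-1]           # sort eigenvalues descending
    eigvals, T = eigvals[order], T[:, order]

    Z = X @ T                                   # principal components
    Zk = Z[:, :n_components]                    # keep the largest components
    alpha = np.linalg.solve(Zk.T @ Zk, Zk.T @ y)

    alpha_full = np.zeros(X.shape[1])           # discarded components get 0
    alpha_full[:n_components] = alpha
    return T @ alpha_full                       # coefficients for original X
```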


Models D and E are pretty similar.
