
Multiple Regression Analysis

y = β0 + β1x1 + β2x2 + … + βkxk + u

3. Asymptotic Properties



Multiple Regression: Recap
Linear in parameters: y = β0 + β1x1 + β2x2 + … + βkxk + u [MLR.1]
{(xi1, xi2, …, xik, yi): i = 1, 2, …, n} is a random sample from the population model, so that yi = β0 + β1xi1 + β2xi2 + … + βkxik + ui [MLR.2]
E(u|x1, x2, …, xk) = E(u) = 0. Conditional mean independence. [MLR.3]
No exact multicollinearity. [MLR.4]
Var(u|x1, x2, …, xk) = Var(u) = σ2. Homoscedasticity. [MLR.5]
u is independent of x1, x2, …, xk and normally distributed with zero mean and variance σ2: u ~ Normal(0, σ2). [MLR.6]
Asymptotic = Large Sample (n→∞)

In a large sample:
- estimators can have desirable properties under less stringent assumptions
- we can perform inference in the usual way without assuming that u (and hence y) is normally distributed
- we can derive useful additional methods of testing hypotheses



Consistency
Under the Gauss-Markov assumptions, OLS is BLUE, but in other cases it will not always be possible to find unbiased estimators.
In those cases, we may settle for estimators that are consistent, meaning that as n → ∞, the distribution of the estimator collapses to the parameter value.



Sampling Distributions as n ↑
[Figure: sampling distributions of β̂1 for three sample sizes n1 < n2 < n3, each centred on β1; the distribution tightens around β1 as n increases]
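A minimal simulation sketch (not from the lecture; the data-generating process is assumed) that reproduces the pattern in the figure, showing the spread of the OLS slope estimates shrinking as n grows:

```python
# Sampling distribution of the OLS slope tightens around the true beta1 = 1
# as n increases (assumed DGP: y = x + u with standard normal x and u).
import numpy as np

rng = np.random.default_rng(0)
for n in (50, 500, 5000):
    slopes = []
    for _ in range(2000):                          # 2,000 replications per n
        x = rng.normal(size=n)
        y = 1.0 * x + rng.normal(size=n)           # true beta0 = 0, beta1 = 1
        slopes.append(np.cov(x, y)[0, 1] / np.var(x, ddof=1))  # OLS slope
    print(n, np.std(slopes))                       # spread shrinks with n
```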
Consistency of OLS
Under the Gauss-Markov assumptions, the OLS estimator is consistent (and unbiased).
We say that the probability limit of β̂j equals βj, or plim β̂j = βj.
In the simple regression case we can establish consistency relatively easily.



Proving Consistency

$$\hat{\beta}_1 = \frac{\sum (x_{i1} - \bar{x}_1)\, y_i}{\sum (x_{i1} - \bar{x}_1)^2} = \beta_1 + \frac{n^{-1} \sum (x_{i1} - \bar{x}_1)\, u_i}{n^{-1} \sum (x_{i1} - \bar{x}_1)^2}$$

$$\operatorname{plim} \hat{\beta}_1 = \beta_1 + \frac{\operatorname{Cov}(x_1, u)}{\operatorname{Var}(x_1)} = \beta_1 \quad \text{since } \operatorname{Cov}(x_1, u) = 0$$
A Weaker Assumption
For unbiasedness, we assumed a zero conditional mean: E(u|x1, x2, …, xk) = 0.
For consistency, we can use the weaker assumption of zero mean and zero correlation: E(u) = 0 and Cov(xj, u) = 0 for j = 1, 2, …, k.
Without this weaker assumption, OLS is biased and inconsistent!



Deriving the Inconsistency
Just as we could derive the omitted variable bias earlier, now we want to think about the inconsistency, or asymptotic bias, in this case.
True model: y = β0 + β1x1 + β2x2 + v
You think: y = β0 + β1x1 + u, so that u = β2x2 + v, and

$$\operatorname{plim} \tilde{\beta}_1 = \beta_1 + \beta_2 \delta, \quad \text{where } \delta = \frac{\operatorname{Cov}(x_1, x_2)}{\operatorname{Var}(x_1)}$$
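A minimal simulation sketch (the coefficients and data-generating process are assumed for illustration) checking the plim formula numerically:

```python
# With x2 omitted, the short-regression slope converges to
# beta1 + beta2 * delta, where delta = Cov(x1, x2) / Var(x1).
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000                              # very large n, so estimate ~ plim
beta1, beta2 = 1.0, 0.5
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)         # Cov(x1, x2) = 0.8, Var(x1) = 1
y = beta1 * x1 + beta2 * x2 + rng.normal(size=n)

slope = np.cov(x1, y)[0, 1] / np.var(x1, ddof=1)  # regress y on x1 only
print(slope)                               # close to 1 + 0.5 * 0.8 = 1.4, not 1.0
```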
Asymptotic Bias (cont)
So, thinking about the direction of the asymptotic bias (inconsistency) is just like thinking about the direction of bias from an omitted variable.
The main difference is that asymptotic bias uses the population variance and covariance, while bias uses their sample counterparts.
Remember, inconsistency is a large-sample problem: it does not go away as we add data.



Large Sample Inference
Recall that under the CLM assumptions, the sampling distributions are exactly normal, so we could derive t and F distributions for testing.
This exact normality was due to assuming that the population error distribution was normal.
The assumption of normal errors implied that the distribution of y, given the x's, was normal as well.



Large Sample Inference (cont)
It is easy to come up with examples for which this exact normality assumption fails.
Any clearly skewed variable, like wages, arrests, or savings, cannot be normal, since a normal distribution is symmetric.
The normality assumption is not needed to conclude that OLS is BLUE; it is needed only for exact inference.



Central Limit Theorem
Based on the central limit theorem, we can show that the OLS estimators are asymptotically normal.
Asymptotic normality means that P(Z < z) → Φ(z) as n → ∞, so that P(Z < z) ≈ Φ(z) in a large sample.
The central limit theorem states that the standardized average of any population with mean μ and variance σ2 is asymptotically N(0,1):

$$Z = \frac{\bar{Y} - \mu_Y}{\sigma / \sqrt{n}} \overset{a}{\sim} N(0, 1)$$
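A minimal sketch illustrating the theorem with a clearly skewed population (Exponential(1), which has mean 1 and standard deviation 1; the sample sizes are chosen for illustration):

```python
# Standardized averages of skewed Exponential(1) draws behave like N(0,1)
# once n is large: P(Z < 1.96) should approach Phi(1.96) = 0.975.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 1.0, 1.0                       # mean and sd of Exponential(1)
for n in (5, 30, 1000):
    ybar = rng.exponential(scale=1.0, size=(10_000, n)).mean(axis=1)
    z = (ybar - mu) / (sigma / np.sqrt(n))
    print(n, np.mean(z < 1.96))            # approaches 0.975 as n grows
```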
Asymptotic Normality (Theorem 5.2)
Under the Gauss-Markov assumptions,

$$\text{(i)}\quad \sqrt{n}\left(\hat{\beta}_j - \beta_j\right) \overset{a}{\sim} \text{Normal}\!\left(0, \, \sigma^2 / a_j^2\right), \quad \text{where } a_j^2 = \operatorname{plim}\!\left(n^{-1} \sum \hat{r}_{ij}^2\right)$$

$$\text{(ii)}\quad \hat{\sigma}^2 \text{ is a consistent estimator of } \sigma^2$$

$$\text{(iii)}\quad \left(\hat{\beta}_j - \beta_j\right) \big/ \operatorname{se}\!\left(\hat{\beta}_j\right) \overset{a}{\sim} \text{Normal}(0, 1)$$


Asymptotic Normality (cont)
Because the t distribution approaches the normal distribution for large df, we can also say that

$$\left(\hat{\beta}_j - \beta_j\right) \big/ \operatorname{se}\!\left(\hat{\beta}_j\right) \overset{a}{\sim} t_{n-k-1}$$

Note that while we no longer need to assume normality with a large sample, we do still need homoscedasticity.
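A minimal simulation sketch (assumed data-generating process) showing that, even with skewed errors, the usual t statistic rejects at roughly the nominal 5% rate in a large sample:

```python
# With skewed chi-square errors, the OLS t statistic still satisfies
# P(|t| > 1.96) ~ 0.05 under the null once n is large.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
rejections = 0
for _ in range(2000):
    n = 1000
    x = rng.normal(size=n)
    u = rng.chisquare(df=1, size=n) - 1.0        # skewed errors, mean zero
    y = 2.0 + 0.0 * x + u                        # true slope on x is zero
    t = sm.OLS(y, sm.add_constant(x)).fit().tvalues[1]
    rejections += abs(t) > 1.96
print(rejections / 2000)                         # should be close to 0.05
```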
Implication (so what?)
Even if MLR.6 does not hold, we can continue to use the F and t tests we learned about in lecture 6, as these will have approximately F and t distributions if the sample is large enough.
How large is a large sample? (How long is a piece of string?)
Wooldridge: "Some econometricians think that n = 30 is satisfactory, but this cannot be sufficient for all possible distributions of u" (page 183).



Lagrange Multiplier statistic
Once we are using large samples and relying on asymptotic normality for inference, we can use more than t and F statistics.
The Lagrange multiplier or LM statistic is an alternative for testing multiple exclusion restrictions.
Because the LM statistic uses an auxiliary regression, it is sometimes called an nR2 statistic.



LM Statistic (cont)
Suppose we have the regression model y = β0 + β1x1 + β2x2 + … + βkxk + u and our null hypothesis is
H0: βk−q+1 = 0, …, βk = 0
First, we run the restricted model:

$$y = \tilde{\beta}_0 + \tilde{\beta}_1 x_1 + \dots + \tilde{\beta}_{k-q} x_{k-q} + \tilde{u}$$

Now take the residuals, ũ, and regress ũ on x1, x2, …, xk (i.e. all the variables).
Then LM = nR2u, where R2u is the R-squared from this auxiliary regression.
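A minimal sketch of the three steps on simulated data (the regressors, coefficients, and sample size are illustrative, not from the lecture):

```python
# LM (nR^2) test of q exclusion restrictions on assumed simulated data.
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(0)
n, q = 1000, 2                              # q = number of exclusion restrictions
X = rng.normal(size=(n, 4))                 # regressors x1..x4; H0: beta3 = beta4 = 0
y = 1.0 + 0.5 * X[:, 0] - 0.3 * X[:, 1] + rng.normal(size=n)   # H0 true here

# Step 1: estimate the restricted model (x1, x2 only) and save residuals
u_tilde = sm.OLS(y, sm.add_constant(X[:, :2])).fit().resid
# Step 2: regress the residuals on ALL the regressors (with an intercept)
aux = sm.OLS(u_tilde, sm.add_constant(X)).fit()
# Step 3: LM = n * R^2 from the auxiliary regression, compared with chi2(q)
LM = n * aux.rsquared
print(LM, stats.chi2.sf(LM, df=q))          # fail to reject when p-value is large
```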
LM Statistic (cont)
LM asymptotically has a χ2q distribution, so we compare LM with an appropriate critical value c from this distribution.
We reject H0 if LM > c; alternatively, calculate a p-value.
With a large sample, the results of an F test and an LM test should be similar.
Unlike the F test and t test for a single exclusion, the LM test and F test will not be identical.



Example 5.3: Arrests model
narr86 = β0 + β1pcnv + β2avgsen + β3tottime + β4ptime86 + β5qemp86 + u
H0: β2 = β3 = 0
Estimate the restricted model:
narr86 = β0 + β1pcnv + β4ptime86 + β5qemp86 + u
Save the residuals and regress them on all the explanatory variables and an intercept: R2u = 0.0015.
LM = 2,725 × 0.0015 = 4.09 < χ22(5%) = 5.99, so do not reject H0 [p-value = 0.129, very similar to the F test of the same null].
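A quick check of the arithmetic (n = 2,725 and R2u = 0.0015 are taken from the slide; only the critical value and p-value are computed here):

```python
# Verify LM = n * R^2_u against the chi2(2) critical value and p-value.
from scipy import stats

n, R2_u, q = 2725, 0.0015, 2
LM = n * R2_u                               # about 4.09
print(LM)
print(stats.chi2.ppf(0.95, df=q))           # 5% critical value of chi2(2): 5.99
print(stats.chi2.sf(LM, df=q))              # p-value: about 0.13
```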
Asymptotic Efficiency
Estimators besides OLS can also be consistent.
However, under the Gauss-Markov assumptions, the OLS estimators have the smallest asymptotic variances.
We say that OLS is asymptotically efficient.
It is important to remember our assumptions, though: if the errors are not homoscedastic, this result does not hold.



Next Time
What happens when variables are qualitative?
Binary or Dummy variables
Wooldridge, Chapter 7

