
Multiple Regression Analysis

y = β0 + β1x1 + β2x2 + … + βkxk + u

3. Asymptotic Properties



Multiple Regression: Recap
Linear in parameters: y = β0 + β1x1 + β2x2 + … + βkxk + u [MLR.1]
{(xi1, xi2, …, xik, yi): i = 1, 2, …, n} is a random sample from the population model, so that yi = β0 + β1xi1 + β2xi2 + … + βkxik + ui [MLR.2]
E(u|x1, x2, …, xk) = E(u) = 0. Conditional mean independence. [MLR.3]
No exact multicollinearity. [MLR.4]
Var(u|x1, x2, …, xk) = Var(u) = σ2. Homoscedasticity. [MLR.5]
u is independent of x1, x2, …, xk and normally distributed with zero mean and variance σ2: u ~ Normal(0, σ2). [MLR.6]
Asymptotic = Large Sample (n→∞)

In a large sample:
- estimators can have desirable properties under less stringent assumptions
- we can perform inference in the usual way without assuming that u (and hence y) is normally distributed
- we can derive useful additional methods of testing hypotheses



Consistency
Under the Gauss-Markov assumptions, OLS is BLUE, but in other cases it will not always be possible to find unbiased estimators.
In those cases, we may settle for estimators that are consistent, meaning that as n → ∞, the distribution of the estimator collapses to the parameter value.



Sampling Distributions as n ↑
[Figure: sampling distributions of β̂1 for three sample sizes n1 < n2 < n3, each centred on β1; the distribution tightens around β1 as n increases]
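A minimal simulation sketch (not from the lecture; the data-generating process is assumed) that reproduces the pattern in the figure, showing the spread of the OLS slope estimates shrinking as n grows:

```python
# Sampling distribution of the OLS slope tightens around the true beta1 = 1
# as n increases (assumed DGP: y = x + u with standard normal x and u).
import numpy as np

rng = np.random.default_rng(0)
for n in (50, 500, 5000):
    slopes = []
    for _ in range(2000):                          # 2,000 replications per n
        x = rng.normal(size=n)
        y = 1.0 * x + rng.normal(size=n)           # true beta0 = 0, beta1 = 1
        slopes.append(np.cov(x, y)[0, 1] / np.var(x, ddof=1))  # OLS slope
    print(n, np.std(slopes))                       # spread shrinks with n
```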
Consistency of OLS
Under the Gauss-Markov assumptions, the OLS estimator is consistent (and unbiased).
We say that the probability limit of β̂j equals βj, or plim β̂j = βj.
In the simple regression case we can establish consistency relatively easily.



Proving Consistency

$$\hat{\beta}_1 = \frac{\sum (x_{i1} - \bar{x}_1)\, y_i}{\sum (x_{i1} - \bar{x}_1)^2} = \beta_1 + \frac{n^{-1} \sum (x_{i1} - \bar{x}_1)\, u_i}{n^{-1} \sum (x_{i1} - \bar{x}_1)^2}$$

$$\operatorname{plim} \hat{\beta}_1 = \beta_1 + \frac{\operatorname{Cov}(x_1, u)}{\operatorname{Var}(x_1)} = \beta_1 \quad \text{since } \operatorname{Cov}(x_1, u) = 0$$
A Weaker Assumption
For unbiasedness, we assumed a zero conditional mean: E(u|x1, x2, …, xk) = 0.
For consistency, we can use the weaker assumption of zero mean and zero correlation: E(u) = 0 and Cov(xj, u) = 0 for j = 1, 2, …, k.
Without this weaker assumption, OLS is biased and inconsistent!



Deriving the Inconsistency
Just as we could derive the omitted variable bias earlier, now we want to think about the inconsistency, or asymptotic bias, in this case.
True model: y = β0 + β1x1 + β2x2 + v
You think: y = β0 + β1x1 + u, so that u = β2x2 + v, and

$$\operatorname{plim} \tilde{\beta}_1 = \beta_1 + \beta_2 \delta, \quad \text{where } \delta = \frac{\operatorname{Cov}(x_1, x_2)}{\operatorname{Var}(x_1)}$$
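A minimal simulation sketch (the coefficients and data-generating process are assumed for illustration) checking the plim formula numerically:

```python
# With x2 omitted, the short-regression slope converges to
# beta1 + beta2 * delta, where delta = Cov(x1, x2) / Var(x1).
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000                              # very large n, so estimate ~ plim
beta1, beta2 = 1.0, 0.5
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)         # Cov(x1, x2) = 0.8, Var(x1) = 1
y = beta1 * x1 + beta2 * x2 + rng.normal(size=n)

slope = np.cov(x1, y)[0, 1] / np.var(x1, ddof=1)  # regress y on x1 only
print(slope)                               # close to 1 + 0.5 * 0.8 = 1.4, not 1.0
```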
Asymptotic Bias (cont)
So, thinking about the direction of the asymptotic bias (inconsistency) is just like thinking about the direction of bias from an omitted variable.
The main difference is that asymptotic bias uses the population variance and covariance, while bias uses their sample counterparts.
Remember, inconsistency is a large-sample problem: it does not go away as we add data.



Large Sample Inference
Recall that under the CLM assumptions, the sampling distributions are exactly normal, so we could derive t and F distributions for testing.
This exact normality was due to assuming that the population error distribution was normal.
The assumption of normal errors implied that the distribution of y, given the x's, was normal as well.



Large Sample Inference (cont)
It is easy to come up with examples for which this exact normality assumption fails.
Any clearly skewed variable, like wages, arrests, or savings, cannot be normal, since a normal distribution is symmetric.
The normality assumption is not needed to conclude that OLS is BLUE; it is needed only for exact inference.



Central Limit Theorem
Based on the central limit theorem, we can show that the OLS estimators are asymptotically normal.
Asymptotic normality means that P(Z < z) → Φ(z) as n → ∞, so that P(Z < z) ≈ Φ(z) in a large sample.
The central limit theorem states that the standardized average of any population with mean μ and variance σ2 is asymptotically N(0,1):

$$Z = \frac{\bar{Y} - \mu_Y}{\sigma / \sqrt{n}} \overset{a}{\sim} N(0, 1)$$
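A minimal sketch illustrating the theorem with a clearly skewed population (Exponential(1), which has mean 1 and standard deviation 1; the sample sizes are chosen for illustration):

```python
# Standardized averages of skewed Exponential(1) draws behave like N(0,1)
# once n is large: P(Z < 1.96) should approach Phi(1.96) = 0.975.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 1.0, 1.0                       # mean and sd of Exponential(1)
for n in (5, 30, 1000):
    ybar = rng.exponential(scale=1.0, size=(10_000, n)).mean(axis=1)
    z = (ybar - mu) / (sigma / np.sqrt(n))
    print(n, np.mean(z < 1.96))            # approaches 0.975 as n grows
```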
Asymptotic Normality (Theorem 5.2)
Under the Gauss-Markov assumptions,

$$\text{(i)}\quad \sqrt{n}\left(\hat{\beta}_j - \beta_j\right) \overset{a}{\sim} \text{Normal}\!\left(0, \, \sigma^2 / a_j^2\right), \quad \text{where } a_j^2 = \operatorname{plim}\!\left(n^{-1} \sum \hat{r}_{ij}^2\right)$$

$$\text{(ii)}\quad \hat{\sigma}^2 \text{ is a consistent estimator of } \sigma^2$$

$$\text{(iii)}\quad \left(\hat{\beta}_j - \beta_j\right) \big/ \operatorname{se}\!\left(\hat{\beta}_j\right) \overset{a}{\sim} \text{Normal}(0, 1)$$


Asymptotic Normality (cont)
Because the t distribution approaches the normal distribution for large df, we can also say that

$$\left(\hat{\beta}_j - \beta_j\right) \big/ \operatorname{se}\!\left(\hat{\beta}_j\right) \overset{a}{\sim} t_{n-k-1}$$

Note that while we no longer need to assume normality with a large sample, we do still need homoscedasticity.
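A minimal simulation sketch (assumed data-generating process) showing that, even with skewed errors, the usual t statistic rejects at roughly the nominal 5% rate in a large sample:

```python
# With skewed chi-square errors, the OLS t statistic still satisfies
# P(|t| > 1.96) ~ 0.05 under the null once n is large.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
rejections = 0
for _ in range(2000):
    n = 1000
    x = rng.normal(size=n)
    u = rng.chisquare(df=1, size=n) - 1.0        # skewed errors, mean zero
    y = 2.0 + 0.0 * x + u                        # true slope on x is zero
    t = sm.OLS(y, sm.add_constant(x)).fit().tvalues[1]
    rejections += abs(t) > 1.96
print(rejections / 2000)                         # should be close to 0.05
```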
Implication (so what?)
Even if MLR.6 does not hold, we can continue to use the F and t tests we learned about in lecture 6, as these will have approximately F and t distributions if the sample is large enough.
How large is a large sample? (How long is a piece of string?)
Wooldridge: "Some econometricians think that n = 30 is satisfactory, but this cannot be sufficient for all possible distributions of u" (page 183).



Lagrange Multiplier statistic
Once we are using large samples and relying on asymptotic normality for inference, we can use more than t and F statistics.
The Lagrange multiplier or LM statistic is an alternative for testing multiple exclusion restrictions.
Because the LM statistic uses an auxiliary regression, it is sometimes called an nR2 statistic.



LM Statistic (cont)
Suppose we have the regression model y = β0 + β1x1 + β2x2 + … + βkxk + u and our null hypothesis is
H0: βk−q+1 = 0, …, βk = 0
First, we run the restricted model:

$$y = \tilde{\beta}_0 + \tilde{\beta}_1 x_1 + \dots + \tilde{\beta}_{k-q} x_{k-q} + \tilde{u}$$

Now take the residuals, ũ, and regress ũ on x1, x2, …, xk (i.e. all the variables).
Then LM = nR2u, where R2u is the R-squared from this auxiliary regression.
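A minimal sketch of the three steps on simulated data (the regressors, coefficients, and sample size are illustrative, not from the lecture):

```python
# LM (nR^2) test of q exclusion restrictions on assumed simulated data.
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(0)
n, q = 1000, 2                              # q = number of exclusion restrictions
X = rng.normal(size=(n, 4))                 # regressors x1..x4; H0: beta3 = beta4 = 0
y = 1.0 + 0.5 * X[:, 0] - 0.3 * X[:, 1] + rng.normal(size=n)   # H0 true here

# Step 1: estimate the restricted model (x1, x2 only) and save residuals
u_tilde = sm.OLS(y, sm.add_constant(X[:, :2])).fit().resid
# Step 2: regress the residuals on ALL the regressors (with an intercept)
aux = sm.OLS(u_tilde, sm.add_constant(X)).fit()
# Step 3: LM = n * R^2 from the auxiliary regression, compared with chi2(q)
LM = n * aux.rsquared
print(LM, stats.chi2.sf(LM, df=q))          # fail to reject when p-value is large
```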
LM Statistic (cont)
LM asymptotically has a χ2q distribution, so we compare LM with an appropriate critical value c from this distribution.
We reject H0 if LM > c; alternatively, calculate a p-value.
With a large sample, the results of an F test and an LM test should be similar.
Unlike the F test and t test for a single exclusion, the LM test and F test will not be identical.



Example 5.3: Arrests model
narr86 = β0 + β1pcnv + β2avgsen + β3tottime + β4ptime86 + β5qemp86 + u
H0: β2 = β3 = 0
Estimate the restricted model:
narr86 = β0 + β1pcnv + β4ptime86 + β5qemp86 + u
Save the residuals and regress them on all the explanatory variables and an intercept: R2u = 0.0015.
LM = 2,725 × 0.0015 = 4.09 < χ22(5%) = 5.99, so do not reject H0 [p-value = 0.129, very similar to the F test of the same null].
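A quick check of the arithmetic (n = 2,725 and R2u = 0.0015 are taken from the slide; only the critical value and p-value are computed here):

```python
# Verify LM = n * R^2_u against the chi2(2) critical value and p-value.
from scipy import stats

n, R2_u, q = 2725, 0.0015, 2
LM = n * R2_u                               # about 4.09
print(LM)
print(stats.chi2.ppf(0.95, df=q))           # 5% critical value of chi2(2): 5.99
print(stats.chi2.sf(LM, df=q))              # p-value: about 0.13
```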
Asymptotic Efficiency
Estimators besides OLS can also be consistent.
However, under the Gauss-Markov assumptions, the OLS estimators have the smallest asymptotic variances.
We say that OLS is asymptotically efficient.
It is important to remember our assumptions, though: if the errors are not homoscedastic, this result does not hold.



Next Time
What happens when variables are qualitative?
Binary or Dummy variables
Wooldridge, Chapter 7

