Jürgen Meinecke
Lecture 3 of 12
Research School of Economics, Australian National University
Roadmap
Let there be a probability space (Ω, F , P)
Example
X(ω) = 18 if ω even, 24 if ω odd
The event {6} is not assigned a probability
Of course we have a reasonable suspicion that P({6}) should equal 1/6, but strictly speaking it was never defined two slides earlier
So we have to treat P({6}) as unknown
To make sure that our random variables are not ill-defined like this, we need to rule out such situations
Here's a more robust definition
Definition (Random Variable—last attempt)
A random variable on (Ω, F) is a function Z : Ω → R such that
{ω ∈ Ω : Z(ω) ∈ B} ∈ F for all B ∈ B(R).
• pick B = {2}
• {ω ∈ Ω : Z(ω) = 2} = {6} ∉ F
• same for B = {7}
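The measurability requirement can be checked mechanically in a finite setting. Below is a minimal sketch for the even/odd example above; the coarse sigma-algebra F (which only distinguishes evens from odds) and the function names are my own illustrative choices, not from the lecture.

```python
# Check measurability of a function on a finite probability space:
# Z is a random variable iff every preimage {w : Z(w) in B} lies in F.
omega = {1, 2, 3, 4, 5, 6}
evens, odds = frozenset({2, 4, 6}), frozenset({1, 3, 5})
F = {frozenset(), evens, odds, frozenset(omega)}  # coarse sigma-algebra

def X(w):
    return 18 if w % 2 == 0 else 24               # the example above

def Y(w):
    return 2 if w == 6 else 7                     # needs {6}, which is not in F

def is_measurable(Z, test_sets):
    return all(frozenset(w for w in omega if Z(w) in B) in F
               for B in test_sets)

test_sets = [{18}, {24}, {18, 24}, {2}, {7}]
print(is_measurable(X, test_sets))   # True: X only distinguishes even vs odd
print(is_measurable(Y, test_sets))   # False: preimage of {2} is {6}, not in F
```

The check makes the point of the definition concrete: X respects the information in F, while Y would require assigning a probability to {6}.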
Definition (Distribution or Law)
Given a random variable Z on a probability space (Ω, F, P), the distribution or law of the random variable is the probability measure defined by
µ(B) := P(Z ∈ B),  B ∈ B(R).
Definition (Weak Convergence)
If F_N and F are distribution functions then F_N converges weakly to F if lim_{N→∞} F_N(z) = F(z) for each z at which F is continuous.
We write F_N →ʷ F.
Equivalently we could say µ_N →ʷ µ for weak convergence
Definition (Convergence in Distribution)
Let Z_N and Z be random variables. Then Z_N converges in distribution or law to Z if F_N →ʷ F.
We write Z_N →ᵈ Z.
Theorem (Continuous Mapping Theorem)
(i) If Z_N →ᵖ Z then g(Z_N) →ᵖ g(Z) for continuous g.
(ii) If Z_N →ᵈ Z then g(Z_N) →ᵈ g(Z) for continuous g.
Theorem (Central Limit Theorem (CLT))
Let Z_1, Z_2, … be a sequence of independent and identically distributed random vectors with µ_Z := EZ_i and E‖Z_i‖² < ∞. Define Z̄_N := ∑_{i=1}^N Z_i / N. Then
√N (Z̄_N − µ_Z) →ᵈ N(0, E[(Z_i − µ_Z)(Z_i − µ_Z)′]).
The restrictions imposed in it don't seem very strong
For example, it does not matter what distribution the Z_i come from (as long as E‖Z_i‖² < ∞)
The sample average, centered at µ_Z and multiplied by √N, converges to a normal distribution
Conventional terminology with regard to the result
√N (Z̄_N − µ_Z) →ᵈ N(0, Ω)
Primitive usage
Illustration of CLT
The underlying distribution of Z1 , . . . , ZN is exponential
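A quick simulation conveys the same picture. The sketch below draws heavily skewed exponential samples and shows that √N(Z̄_N − µ) nonetheless has mean near 0 and variance near 1; the sample size and number of replications are illustrative choices, not values from the lecture.

```python
import numpy as np

# CLT illustration: Exp(1) draws have mean 1 and variance 1, so
# sqrt(N)*(sample mean - 1) should be approximately N(0, 1) for large N.
rng = np.random.default_rng(0)
mu = 1.0                      # mean of Exp(1)
N, reps = 200, 5000           # illustrative sizes
zbar = rng.exponential(scale=1.0, size=(reps, N)).mean(axis=1)
t = np.sqrt(N) * (zbar - mu)  # one draw of the CLT statistic per replication
print(round(t.mean(), 2), round(t.var(), 2))  # close to 0 and 1
```

Plotting a histogram of `t` against the standard normal density reproduces the bell shape seen on the slide, despite the skewed underlying distribution.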
Roadmap
We know that β̂_OLS ∈ L²
We would like to know the exact distribution of β̂_OLS for finite samples (so-called small sample distribution)
Remember:
β̂_OLS = β* + (∑_{i=1}^N X_i X_i′)^{-1} ∑_{i=1}^N X_i u_i
β* = E(X_i X_i′)^{-1} E(X_i Y_i)
The CLT will be our main tool in deriving the asymptotic distribution of β̂_OLS
Big picture: we already know that β̂_OLS − β* = o_p(1)
From what I said earlier, we may suspect that √N (β̂_OLS − β*) could converge to a normal distribution
Again using β̂_OLS = β* + (∑_{i=1}^N X_i X_i′)^{-1} ∑_{i=1}^N X_i u_i
Then isolating √N (β̂_OLS − β*) and using sum operators instead of matrices:
√N (β̂_OLS − β*) = (N^{-1} ∑_{i=1}^N X_i X_i′)^{-1} (N^{-1/2} ∑_{i=1}^N X_i u_i)
Let's break the rhs up again into its bits and pieces
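The isolation step is pure algebra, so it can be verified numerically. A minimal check on simulated data (the DGP and sample size are my own illustrative choices):

```python
import numpy as np

# Verify: sqrt(N)*(beta_hat - beta*) equals
# (N^{-1} sum X_i X_i')^{-1} (N^{-1/2} sum X_i u_i), exactly, in-sample.
rng = np.random.default_rng(1)
N = 500
X = np.column_stack([np.ones(N), rng.normal(size=N)])  # (N, 2) regressors
beta_star = np.array([1.0, 2.0])
u = rng.normal(size=N)
y = X @ beta_star + u

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)           # OLS estimate

A = (X.T @ X) / N                                      # N^{-1} sum X_i X_i'
b = (X.T @ u) / np.sqrt(N)                             # N^{-1/2} sum X_i u_i
lhs = np.sqrt(N) * (beta_hat - beta_star)
rhs = np.linalg.solve(A, b)
print(np.allclose(lhs, rhs))                           # True
```

The identity holds exactly for every sample; the asymptotics enter only when we ask what the two factors converge to.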
We've already shown last week (using Slutsky's theorem) that, given E(X_i X_i′) < ∞,
(N^{-1} ∑_{i=1}^N X_i X_i′)^{-1} = E(X_i X_i′)^{-1} + o_p(1) = O_p(1)
The remaining factor, N^{-1/2} ∑_{i=1}^N X_i u_i, is O_p(1) by the CLT
Using our tools from basic asymptotic theory (part 2)
Proposition (Asymptotic Distribution of OLS Estimator)
√N (β̂_OLS − β*) = (N^{-1} ∑_{i=1}^N X_i X_i′)^{-1} (N^{-1/2} ∑_{i=1}^N X_i u_i) →ᵈ N(0, Ω)
Roadmap
The asymptotic variance of √N (β̂_OLS − β*) is
Ω := E(X_i X_i′)^{-1} E(u_i² X_i X_i′) E(X_i X_i′)^{-1}
If we observed u_i then we would surely use (1/N) ∑_{i=1}^N u_i² X_i X_i′
Personally, I don't care about this bias; I'll explain why later
Let's try to understand the source of this bias
First some new tools
Let M_X := I_N − P_X with P_X := X(X′X)^{-1} X′
Then û = M_X u
Second, the trace of a K × K matrix is the sum of its diagonal elements: tr A := ∑_{i=1}^K a_ii
Savvy tricks: tr(AB) = tr(BA) and tr(A + B) = tr A + tr B
Then (since M_X is symmetric and idempotent)
σ̂_u² := ∑_{i=1}^N û_i² / N = tr(u′ M_X u)/N = tr(M_X uu′)/N
i=1
Now studying the conditional expectation (assuming homoskedasticity, E(uu′|X) = σ_u² I_N):
E(σ̂_u²|X) = E[tr(M_X uu′)|X]/N
= tr(E[M_X uu′|X])/N
= tr(M_X E[uu′|X])/N
= σ_u² · tr(M_X)/N
= σ_u² (N − K)/N   (using tr(M_X) = tr(I_N) − tr(P_X) = N − K)
< σ_u²
There is an easy fix!
Use s_u² := (N/(N−K)) σ̂_u² = (1/(N−K)) ∑_{i=1}^N û_i² instead
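A small Monte Carlo makes the bias and its fix visible. With homoskedastic errors and σ_u² = 1, dividing the sum of squared residuals by N should center near (N − K)/N, while dividing by N − K should center near 1; the sample sizes below are my own illustrative choices.

```python
import numpy as np

# Degrees-of-freedom bias of the residual variance estimator:
# uhat = M_X u, so E[uhat'uhat / N] = sigma_u^2 * (N - K) / N.
rng = np.random.default_rng(2)
N, K, reps = 20, 5, 20000
biased = np.empty(reps)
unbiased = np.empty(reps)
for r in range(reps):
    X = rng.normal(size=(N, K))
    u = rng.normal(size=N)                             # sigma_u^2 = 1
    uhat = u - X @ np.linalg.solve(X.T @ X, X.T @ u)   # uhat = M_X u
    ssr = uhat @ uhat
    biased[r] = ssr / N                                # sigma-hat squared
    unbiased[r] = ssr / (N - K)                        # s squared
print(round(biased.mean(), 2))    # close to (N - K)/N = 0.75
print(round(unbiased.mean(), 2))  # close to 1.0
```

With N = 20 and K = 5 the downward bias is severe (about 25%), which is why the correction matters most in small samples.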
Combining things, we propose the following asymptotic variance estimator
Definition (Asymptotic Variance Estimator)
Ω̂ = ((1/N) ∑_{i=1}^N X_i X_i′)^{-1} ((1/(N−K)) ∑_{i=1}^N û_i² X_i X_i′) ((1/N) ∑_{i=1}^N X_i X_i′)^{-1}
Notice: Wooldridge, on page 61, proposes this version
Definition (Asymptotic Variance Estimator)
Ω̂_Wooldridge = ((1/N) ∑_{i=1}^N X_i X_i′)^{-1} ((1/N) ∑_{i=1}^N û_i² X_i X_i′) ((1/N) ∑_{i=1}^N X_i X_i′)^{-1}
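The two sandwich estimators differ only by the scalar factor N/(N−K) applied to the middle ("meat") matrix, so they are asymptotically equivalent. A minimal sketch on simulated heteroskedastic data (DGP and sizes are illustrative assumptions):

```python
import numpy as np

# Compare the N-K and Wooldridge (1/N) versions of the sandwich estimator:
# they differ exactly by the factor N/(N-K) in the meat matrix.
rng = np.random.default_rng(3)
N, K = 100, 3
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])
beta_star = np.ones(K)
u = rng.normal(size=N) * (1 + 0.5 * np.abs(X[:, 1]))   # heteroskedastic errors
y = X @ beta_star + u
uhat = y - X @ np.linalg.solve(X.T @ X, X.T @ y)       # OLS residuals

bread = np.linalg.inv(X.T @ X / N)                     # ((1/N) sum X_i X_i')^{-1}
meat = (X * (uhat**2)[:, None]).T @ X                  # sum uhat_i^2 X_i X_i'
omega_dfc = bread @ (meat / (N - K)) @ bread           # N-K version above
omega_wool = bread @ (meat / N) @ bread                # Wooldridge's version
print(np.allclose(omega_dfc, omega_wool * N / (N - K)))  # True
```

Since N/(N−K) → 1, both are consistent for Ω; the N−K version simply inflates the estimate slightly in small samples, mirroring the s_u² correction.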