Econometrics
1st Semester
Academic Year: 2018/2019
(2018) 2018/2019 1 / 42
Introduction
In certain applications and models, distributional assumptions like
normality are commonly imposed because estimation strategies that
do not require such assumptions are complex or unavailable.
If the distributional assumptions are correct, the maximum likelihood
estimator is, under weak regularity conditions, consistent and
asymptotically normal.
Moreover, it fully exploits the assumptions about the distribution, so
that the estimator is asymptotically efficient.
In other words, any alternative consistent estimator has an
asymptotic covariance matrix that is at least as large as that of the
maximum likelihood estimator.
The starting point of MLE is the assumption that the distribution of
an observed phenomenon (the endogenous variable) is known, except
for a finite number of unknown parameters.
These parameters will be estimated by taking those values for them
that give the observed values the highest probability, i.e. the highest
likelihood.
The maximum likelihood method thus provides a means of estimating
a set of parameters characterizing a distribution, if we know, or
assume we know the form of the distribution.
For example, we could characterize the distribution of a variable yi
(for given xi) as normal with mean β1 + β2 xi and variance σ². This
would represent the simple linear regression model with normal error
terms.
Example
Suppose the success or failure of a field goal in football can be
modeled with a Bernoulli(π) distribution. Let X = 0 if the field goal
is a failure and X = 1 if the field goal is a success. Then the
probability distribution for X is:

f(x) = π^x (1 − π)^(1−x)

Suppose ∑_{i=1}^n xi = 4 and n = 10. Then the following table can be formed:

π       L(π|x1, ..., xn)
0.20    0.000419
0.30    0.000953
0.35    0.001132
0.39    0.001192
0.40    0.001194
0.41    0.001192
0.50    0.000977
Plot of the likelihood function
[Figure: L(π|x1, ..., xn) plotted against π on [0, 1]; the curve peaks near π = 0.4 at about 0.0012.]
MLE Procedure
1. Find the natural log of the likelihood function, i.e., ℓ(π|x1, ..., xn);
2. Take the derivative of ℓ(π|x1, ..., xn) with respect to π;
3. Set the derivative equal to 0 and solve for π to find the maximum
likelihood estimate;
4. Note that the solution is the maximum of L(π|x1, ..., xn), provided
certain "regularity" conditions hold.
Example
ℓ(π|x1, ..., xn) = log( ∏_{i=1}^n π^xi (1 − π)^(1−xi) )

                 = (∑_{i=1}^n xi) log(π) + (n − ∑_{i=1}^n xi) log(1 − π)
Example
Then the derivative of ℓ(π|x1, ..., xn) with respect to π is,

dℓ(π|x1, ..., xn)/dπ = 0 ⟺

(∑_{i=1}^n xi)/π − (n − ∑_{i=1}^n xi)/(1 − π) = 0 ⟺

(1 − π) ∑_{i=1}^n xi = π (n − ∑_{i=1}^n xi) ⟺

∑_{i=1}^n xi = π n ⟺

π̂ = (∑_{i=1}^n xi)/n

Therefore, the maximum likelihood estimate of π is the proportion of field
goals made.
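As a quick numerical check (a sketch, not part of the slides beyond the values ∑xi = 4 and n = 10), the following Python snippet evaluates the Bernoulli log-likelihood on a grid and confirms that it peaks at π̂ = 0.4:

```python
# Verify numerically that the Bernoulli log-likelihood with
# sum(x_i) = 4 and n = 10 is maximised at pi-hat = 4/10.
import math

def log_lik(pi, s, n):
    """Bernoulli log-likelihood: l(pi) = s*log(pi) + (n - s)*log(1 - pi)."""
    return s * math.log(pi) + (n - s) * math.log(1 - pi)

s, n = 4, 10
# Grid search over the open interval (0, 1).
grid = [i / 1000 for i in range(1, 1000)]
pi_hat = max(grid, key=lambda p: log_lik(p, s, n))
print(pi_hat)                                   # 0.4, the proportion of successes
print(round(math.exp(log_lik(0.4, s, n)), 6))   # 0.001194, matching the table
```

The likelihood value at 0.4 reproduces the maximum shown in the table above.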
Example
Suppose x1, ..., xn are an i.i.d. sample from a N(µ, σ²) distribution. The
likelihood function is,

L(µ, σ²|x1, ..., xn) = ∏_{i=1}^n (1/(σ√(2π))) e^{−(xi − µ)²/(2σ²)}

                     = (1/((2π)^{n/2} σ^n)) e^{−(1/(2σ²)) ∑_{i=1}^n (xi − µ)²}
Example
The loglikelihood is then,

ℓ(µ, σ²|x1, ..., xn) = −(n/2) log(2π) − n log(σ) − (1/(2σ²)) ∑_{i=1}^n (xi − µ)²
Example
To find the maximum likelihood estimate of µ, take the derivative with
respect to µ and set it equal to 0.

∂ℓ(µ, σ²|x1, ..., xn)/∂µ = 0 ⟺

(1/(2σ²)) ∑_{i=1}^n 2(xi − µ) = 0 ⟺

µ̂ = (1/n) ∑_{i=1}^n xi = x̄
Example
To find the maximum likelihood estimate of σ, take the derivative with
respect to σ and set it equal to 0.

∂ℓ(µ, σ²|x1, ..., xn)/∂σ = 0 ⟺

−n/σ + (1/σ³) ∑_{i=1}^n (xi − µ)² = 0 ⟺

n/σ = (1/σ³) ∑_{i=1}^n (xi − µ)² ⟺ σ² = (∑_{i=1}^n (xi − µ)²)/n

and, substituting µ̂ = x̄,

σ̂ = √( (∑_{i=1}^n (xi − x̄)²)/n )
General Properties
Let the likelihood function be L(θ|y1, ..., yN) = ∏_{i=1}^N f(yi|θ), the
joint density of the observed sample, which is a function of θ.
General Properties
The maximum likelihood estimator θ̂ maximizes the loglikelihood,
θ̂ = argmax_θ log L(θ), where log L(θ) is the loglikelihood function and
for simplicity the other arguments were dropped.
General Properties
Define the score contribution of observation i as,

si(θ) = ∂ log Li(θ)/∂θ,

so that the first order conditions can be stated as,

∑_{i=1}^n si(θ̂) = 0.

This says that the sample averages of the K scores, evaluated at the
ML estimate θ̂, should be zero.
General Properties
The covariance matrix V is determined by the shape of the
loglikelihood function and can be shown to equal,

V = ( −E[ ∂² log Li(θ)/∂θ∂θ′ ] )^{−1}.

The term in brackets is the expected value of the matrix of second
derivatives and reflects the curvature of the loglikelihood function.
Clearly, if the loglikelihood function is highly curved around its
maximum, the second derivative is large, the variance is small and the
ML estimator is relatively accurate. If the function is less curved the
variance will be larger.
The symmetric matrix

I(θ) = −E[ ∂² log Li(θ)/∂θ∂θ′ ]

is known as the (Fisher) information matrix.
General Properties
Alternatively, the information matrix can be estimated from the first
derivatives of the loglikelihood as,

Î_G(θ̂) = (1/N) ∑_{i=1}^N si(θ̂) si(θ̂)′,

where the subscript G reflects that the variance estimator uses the outer
product of the gradients (first derivatives).
The Normal Linear Regression Model
Consider,
yi = xi′β + εi,  εi ~ NID(0, σ²)

under the usual OLS assumptions. This imposes that (conditional upon the
exogenous variables), yi is normal with mean xi′β and a constant
variance σ².
The Normal Linear Regression Model
The loglikelihood function for this model is,

log L(β, σ²) = −(N/2) log(2πσ²) − (1/(2σ²)) ∑_{i=1}^N (yi − xi′β)².
The Normal Linear Regression Model
The ML estimators β̂ and σ̂² satisfy the first order conditions

∑_{i=1}^N ((yi − xi′β)/σ²) xi = 0

and

−N/(2σ²) + (1/(2σ⁴)) ∑_{i=1}^N (yi − xi′β)² = 0.

The solutions to these equations are,

β̂ = ( ∑_{i=1}^N xi xi′ )^{−1} ∑_{i=1}^N xi yi

and

σ̂² = (1/N) ∑_{i=1}^N (yi − xi′β̂)².

The estimator for the vector of slopes is identical to the OLS estimator,
while σ̂² differs from OLS by dividing by N rather than N − K.
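A minimal Python sketch of these formulas, using assumed illustrative data and a simple regression with an intercept (so K = 2); it computes β̂ from the normal equations and contrasts the ML variance estimator (divisor N) with the OLS one (divisor N − K):

```python
# Simple regression y = b1 + b2*x + e: solve the 2x2 normal equations
# by Cramer's rule, then compute the ML and OLS variance estimators.
x = [1.0, 2.0, 3.0, 4.0, 5.0]   # illustrative (assumed) data
y = [2.0, 4.0, 5.0, 4.0, 5.0]
N = len(x)

sx, sy = sum(x), sum(y)
sxx = sum(v * v for v in x)
sxy = sum(a * b for a, b in zip(x, y))

# Normal equations: [N sx; sx sxx] [b1; b2] = [sy; sxy]
det = N * sxx - sx * sx
b1 = (sy * sxx - sx * sxy) / det
b2 = (N * sxy - sx * sy) / det

resid = [yi - b1 - b2 * xi for xi, yi in zip(x, y)]
sigma2_ml = sum(e * e for e in resid) / N          # ML: divide by N
sigma2_ols = sum(e * e for e in resid) / (N - 2)   # OLS: divide by N - K

print(round(b2, 2), round(sigma2_ml, 3), round(sigma2_ols, 3))
```

For this sample the slope coefficient is the same under either method, while the two variance estimates differ by the factor N/(N − K).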
The Normal Linear Regression Model
The information matrix is I(β, σ²) = E[ si(β, σ²) si(β, σ²)′ ]. Using

E{εi} = 0,  E{εi²} = σ²,  E{εi³} = 0,  E{εi⁴} = 3σ⁴,

it follows that

I(β, σ²) = [ σ⁻² E{xi xi′}    0
                  0        1/(2σ⁴) ].
Specification Tests
The three test principles can be summarised as follows:

Wald test. Estimate θ by ML and check whether the difference
Rθ̂ − r is close to zero, using its asymptotic covariance matrix. This
is the idea that underlies the well-known t- and F-tests.

Likelihood ratio test. Estimate the model twice: once without the
restriction imposed (giving θ̂) and once with the null hypothesis
imposed (giving the constrained ML estimator θ̃, where Rθ̃ = r), and
check whether the difference in loglikelihood values log L(θ̂) − log L(θ̃)
is significantly different from zero. This implies the comparison of an
unrestricted and a restricted maximum of log L(θ).

Lagrange multiplier test. Estimate the model with the restriction
from the null hypothesis imposed (giving θ̃) and check whether the
first order conditions from the general model are significantly violated.
That is, check whether ∂ log L(θ)/∂θ evaluated at θ̃ is significantly
different from zero.
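As an illustration of the likelihood ratio principle, the following Python sketch reuses the field-goal numbers from the earlier example; the choice of null hypothesis H0: π = 0.5 is an assumption made for this example:

```python
# Likelihood ratio test of H0: pi = 0.5 in the Bernoulli model:
# xi_LR = 2*(logL(pi_hat) - logL(pi_0)), asymptotically chi2(1) under H0.
import math

s, n = 4, 10          # sum of x_i and sample size from the field-goal example
pi_hat = s / n        # unrestricted ML estimate
pi_0 = 0.5            # value imposed under the (assumed) null

def log_lik(pi):
    return s * math.log(pi) + (n - s) * math.log(1 - pi)

xi_lr = 2 * (log_lik(pi_hat) - log_lik(pi_0))
print(round(xi_lr, 3))   # well below the 5% chi2(1) critical value 3.84
```

With a statistic of about 0.40, the null π = 0.5 would not be rejected in this tiny sample.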
Specification Tests
The Wald Test
The Wald test starts from the result that

√N (θ̂ − θ) → N(0, V),

from which it follows that the J-dimensional vector Rθ̂ also has an
asymptotic normal distribution, given by,

√N (Rθ̂ − Rθ) → N(0, RVR′).

Under the null hypothesis Rθ equals the known vector r, so that we
can construct a test statistic by forming the quadratic form

ξ_W = N (Rθ̂ − r)′ [R V̂ R′]^{−1} (Rθ̂ − r),

which has an asymptotic χ² distribution with J degrees of freedom under
the null.
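For a scalar parameter the quadratic form collapses to a simple ratio. A sketch in Python, again using the Bernoulli field-goal example with the assumed null H0: π = 0.5 (so R = 1, r = 0.5, and V = π(1 − π)):

```python
# Wald test of H0: pi = 0.5 in the Bernoulli model with J = 1 restriction:
# xi_W = N*(pi_hat - r)^2 / V_hat, asymptotically chi2(1) under H0.
s, n = 4, 10
pi_hat = s / n                       # unrestricted ML estimate
V_hat = pi_hat * (1 - pi_hat)        # estimated asymptotic variance of sqrt(N)(pi_hat - pi)

xi_w = n * (pi_hat - 0.5) ** 2 / V_hat
print(round(xi_w, 4))                # compare with the 5% chi2(1) critical value 3.84
```

The statistic (about 0.42) is close to, but not identical to, the likelihood ratio statistic for the same hypothesis; the two tests agree only asymptotically.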
Specification Tests
For the Lagrange multiplier test, partition θ as (θ1′, θ2′)′ and consider
the null hypothesis θ2 = 0. Maximizing log L(θ) subject to this
restriction, with a vector of Lagrange multipliers λ attached to the
constraint, yields the constrained ML estimator θ̃ = (θ̃1′, θ̃2′)′ and λ̃.
Specification Tests
and

λ̃ = ∑_{i=1}^N ∂ log Li(θ)/∂θ2 |_{θ̃} = ∑_{i=1}^N si2(θ̃).   (2)
Specification Tests
Partition the information matrix conformably with θ = (θ1′, θ2′)′ as,

I(θ) = [ I11(θ)  I12(θ)
         I21(θ)  I22(θ) ],
Specification Tests
The Lagrange Multiplier Test
Computation of the LM test statistic is particularly attractive if the
information matrix is estimated on the basis of the first derivatives of
the loglikelihood function, as

Î_G(θ̃) = (1/N) ∑_{i=1}^N si(θ̃) si(θ̃)′.

In that case the LM statistic can be computed as

ξ_LM = N R²,

where R² is the uncentred R² of an auxiliary regression of a vector of
ones on the score contributions si(θ̃).
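A small Python sketch of this computation, using the earlier field-goal data and the assumed null H0: π = 0.5 (an illustration, not an example from the slides). With a single parameter the auxiliary regression has one regressor and no intercept:

```python
# LM test of H0: pi = 0.5 in the Bernoulli model, computed as N * R^2
# from regressing a vector of ones on the score contributions
# s_i(pi0) = x_i/pi0 - (1 - x_i)/(1 - pi0), with the uncentred R^2.
pi0 = 0.5
x = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]   # sum = 4, n = 10, as in the field-goal example
N = len(x)

s = [xi / pi0 - (1 - xi) / (1 - pi0) for xi in x]   # scores at the restricted estimate

# Uncentred R^2 of regressing ones on s (one regressor, no constant):
# R^2 = (sum s_i)^2 / (sum s_i^2 * N)
r2 = sum(s) ** 2 / (sum(si ** 2 for si in s) * N)
xi_lm = N * r2
print(round(xi_lm, 4))   # about 0.4, close to the LR and Wald statistics above
```

Note the scores are evaluated only at the restricted estimate, so the unrestricted model never has to be estimated; this is the practical appeal of the LM test.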