Microeconometrics
Based on the textbooks
Verbeek: A Guide to Modern Econometrics
and Cameron and Trivedi: Microeconometrics
Robert M. Kunst
robert.kunst@univie.ac.at
University of Vienna
and
Institute for Advanced Studies Vienna
October 4, 2013
Microeconometrics University of Vienna and Institute for Advanced Studies Vienna
Basics Endogenous regressors Maximum likelihood Limited dependent variables Panel data
Outline
Basics
Ordinary least squares
The linear regression model
Goodness of fit
Restriction tests on coefficients
Asymptotic properties of OLS
Endogenous regressors
Maximum likelihood
Limited dependent variables
Binary choice models
Multiresponse models
Panel data
What is micro-econometrics?
Micro-econometrics is concerned with the statistical analysis of
individual (not aggregated) economic data. Some aspects of such
data call for special emphasis.

Ordinary least squares
The OLS objective

The vector minimizing

S(β) = ∑_{i=1}^N (y_i − x_i′β)²

is the OLS (ordinary least squares) vector b = (b_1, …, b_K)′.
Notation: x_{i1} ≡ 1, but x_1 = (1, x_{12}, …, x_{1K})′.
Ordinary least squares
The normal equations of OLS
To minimize S(β), set its derivative to zero:

∑_{i=1}^N x_i (y_i − x_i′b) = 0  ⟺  ( ∑_{i=1}^N x_i x_i′ ) b = ∑_{i=1}^N x_i y_i,

or simply, in matrix and vector notation, using y = (y_1, …, y_N)′
and X = (x_1, …, x_N)′,

b = (X′X)⁻¹X′y.

Notation: Here, x_i = (1, x_{i2}, …, x_{iK})′.
Ordinary least squares
ŷ and residuals

The (systematic or predicted) value

ŷ_i = x_i′b

is the best linear approximation of y from x_2, …, x_K, the best
approximation by linear combinations. The difference between y
and ŷ is called the residual

e_i = y_i − ŷ_i = y_i − x_i′b.
Ordinary least squares
Residual sum of squares
The function S(·) evaluated at its minimum

S(b) = ∑_{i=1}^N (y_i − ŷ_i)² = ∑_{i=1}^N e_i²

is the residual sum of squares (RSS). Because of the normal
equations,

∑_{i=1}^N x_i (y_i − x_i′b) = ∑_{i=1}^N x_i e_i = 0,

such that residuals and regressors are orthogonal. Also,

∑_{i=1}^N e_i = 0  ⟹  ∑_{i=1}^N (y_i − x_i′b) = 0  ⟹  ȳ = x̄′b,

so the fitted regression passes through the point of sample means.
For a simple regression on one regressor, this yields

b_2 = ∑_{i=1}^N (x_i − x̄)(y_i − ȳ) / ∑_{i=1}^N (x_i − x̄)²

for the slope.
Ordinary least squares
A dummy regressor
If x_i = 1 for some observations and x_i = 0 otherwise (man/woman,
employed/unemployed, before/after 1989), x is called a dummy
variable. x̄ will be the share of 1s in the sample.
b_1 is the average y for the 0-individuals, ȳ_[0], and b_2 is the
difference between ȳ_[1] and ȳ_[0].
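As a quick numerical illustration (a minimal sketch with made-up data), regressing y on an intercept plus a dummy recovers exactly the group means described above:

```python
import numpy as np

# Made-up sample: a 0/1 dummy regressor d and outcomes y
d = np.array([0, 0, 0, 1, 1, 1, 1])
y = np.array([2.0, 3.0, 4.0, 7.0, 8.0, 9.0, 8.0])

X = np.column_stack([np.ones_like(d, dtype=float), d])  # intercept + dummy
b, *_ = np.linalg.lstsq(X, y, rcond=None)

mean0 = y[d == 0].mean()   # average y of the 0-individuals
mean1 = y[d == 1].mean()   # average y of the 1-individuals
# b[0] equals the group-0 mean, b[1] the difference of group means
print(b[0], b[1])
```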
Ordinary least squares
Linear regression and matrix notation
Most issues are simpler and more compact in matrix notation:
X =
⎛ 1  x_12  …  x_1K ⎞     ⎛ x_1′ ⎞        ⎛ y_1 ⎞
⎜ ⋮    ⋮         ⋮ ⎟  =  ⎜  ⋮   ⎟ ,  y = ⎜  ⋮  ⎟ .
⎝ 1  x_N2  …  x_NK ⎠     ⎝ x_N′ ⎠        ⎝ y_N ⎠

The sum of squares to be minimized is now

S(β) = (y − Xβ)′(y − Xβ) = y′y − 2y′Xβ + β′X′Xβ,

and ∂S(β)/∂β = 0 yields immediately b = (X′X)⁻¹X′y.
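The closed form b = (X′X)⁻¹X′y can be checked numerically. This is a sketch with simulated data (all numbers are illustrative):

```python
import numpy as np

# Simulated regression data (illustrative values)
rng = np.random.default_rng(42)
N, K = 200, 3
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])
beta = np.array([1.0, 2.0, -0.5])
y = X @ beta + rng.normal(scale=0.1, size=N)

# Normal equations: (X'X) b = X'y; solve rather than invert explicitly
b = np.linalg.solve(X.T @ X, X.T @ y)

# Agrees with the generic least-squares routine
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Solving the normal equations is preferable to forming (X′X)⁻¹ explicitly, since an explicit inverse is slower and less numerically stable.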
Ordinary least squares
The projection matrix
The vector of observations is the sum of its systematic part and
the residuals

y = Xb + e = X(X′X)⁻¹X′y + e = ŷ + e.

The matrix P_X that transforms y into its systematic part

ŷ = X(X′X)⁻¹X′y = P_X y

is called the projection matrix. It is singular, N × N of rank K,
and P_X P_X = P_X. Note the orthogonality

P_X (I − P_X) = 0

to the matrix M_X = I − P_X that transforms y into residuals e.
The linear regression model
Linear regression in a statistical model
The observations are seen as generated by y_i = x_i′β + ε_i. It is
assumed that X′X is non-singular (no multicollinearity), that the
errors ε_i are random variables independent of the regressors, and
also that E[ε_i] = 0. With non-stochastic regressors, this
exogeneity assumption is unnecessary. (There are also other
definitions of exogeneity.)
The linear regression model
Estimator and estimate
An estimate is a statistic obtained from a sample that is designed
to approximate an unknown parameter closely: b is an estimate for
β. The unknown but fixed β is called a K-dimensional parameter
or a K-vector of scalar parameters (real numbers). In regression, b
may be called the coefficient estimates, β may be called the
coefficients.
An estimator is a rule that says how to obtain the estimate from
data. OLS is an estimator.
The residual e_i approximates the true error term ε_i, but it is not
an estimate, as ε_i is not a parameter. Don't be sloppy: never
confuse errors and residuals.
The linear regression model
Gauss-Markov conditions
The Gauss-Markov conditions imply nice properties for the OLS
estimator. In particular, for the linear regression model
y_i = x_i′β + ε_i, assume

A1  E(ε_i) = 0, i = 1, …, N;
A2  {ε_1, …, ε_N} and {x_1, …, x_N} are independent;
A3  V(ε_i) = σ², i = 1, …, N;
A4  cov(ε_i, ε_j) = 0, i, j = 1, …, N, i ≠ j.

(A1) identifies the intercept, (A2) is an exogeneity assumption,
(A3) is called the homoskedasticity assumption, (A4) assumes
the absence of autocorrelation.
The linear regression model
Some implications of the Gauss-Markov assumptions
(A1), (A3), (A4) imply for the vector random variable ε that
E(ε) = 0 and V(ε) = σ²I_N, a scalar matrix.
(A1) and (A2) imply that OLS is unbiased:

E(b) = E{(X′X)⁻¹X′y} = E{(X′X)⁻¹X′(Xβ + ε)}
     = β + E{(X′X)⁻¹X′ε} = β.
The linear regression model
Variance of OLS
Under assumptions (A1)–(A4), the variance of the OLS estimator
follows the simple formula

V(b|X) = σ²(X′X)⁻¹,

because of

V(b|X) = E{(b − β)(b − β)′|X}
       = E{(X′X)⁻¹X′εε′X(X′X)⁻¹|X}
       = (X′X)⁻¹X′E(εε′|X)X(X′X)⁻¹
       = σ²(X′X)⁻¹.
The linear regression model
The Gauss-Markov Theorem
Theorem
Under the assumptions (A1)–(A4), the OLS estimator is the best
linear unbiased estimator, i.e. the estimator among the linear
unbiased estimators with the smallest variance.

Here, "best" means that V(b̃) − V(b) is non-negative definite for
any other linear unbiased estimator b̃.
The variance σ²(X′X)⁻¹ is estimated by plugging in an
estimate for σ² = E(ε_i²) that uses residuals:

σ̂² = 1/(N − 1) ∑_{i=1}^N e_i²,    s² = 1/(N − K) ∑_{i=1}^N e_i²,

with s² often preferred as it is unbiased. Note that
Es² = σ², but Es ≠ σ.
The square roots of the variances in the diagonal of s²(X′X)⁻¹
are the standard errors of the coefficients b_k.
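A sketch of computing s² and the standard errors in code, on simulated data with true error variance σ² = 1 (all settings are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
N, K = 500, 3
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])
beta = np.array([0.5, 1.0, -2.0])
y = X @ beta + rng.normal(size=N)          # true sigma^2 = 1

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b
s2 = e @ e / (N - K)                       # unbiased estimate of sigma^2
se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))  # standard errors
```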
Goodness of fit
The R²

The most customary goodness-of-fit measure is the R² defined by

R² = ∑_{i=1}^N (ŷ_i − ȳ)² / ∑_{i=1}^N (y_i − ȳ)² = V̂(ŷ)/V̂(y),

with V̂ denoting empirical variance and ȳ the sample mean.
Orthogonality between ŷ and e implies that

V̂(y) = V̂(ŷ) + V̂(e),

such that

R² = 1 − V̂(e)/V̂(y) = 1 − ∑_{i=1}^N e_i² / ∑_{i=1}^N (y_i − ȳ)².
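When an intercept is included, the two expressions for R² coincide, which a short simulation confirms (illustrative data):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 300
x = rng.normal(size=N)
y = 1.0 + 2.0 * x + rng.normal(size=N)

X = np.column_stack([np.ones(N), x])
b = np.linalg.solve(X.T @ X, X.T @ y)
yhat = X @ b
e = y - yhat

tss = ((y - y.mean()) ** 2).sum()
r2_explained = ((yhat - y.mean()) ** 2).sum() / tss  # V(yhat)/V(y)
r2_residual = 1.0 - (e ** 2).sum() / tss             # 1 - V(e)/V(y)
```

Without an intercept, the mean of ŷ need not equal ȳ, and the two versions can diverge; this is one reason the uncentered variant exists.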
Goodness of fit
Variants of R²

If a regression is run without an intercept (homogeneous
regression), the uncentered R²

R²_0 = 1 − ∑_{i=1}^N e_i² / ∑_{i=1}^N y_i²

becomes attractive. Comparison with the usual R² becomes
difficult.
If R² is interpreted as an estimate of the squared correlation
coefficient of y and ŷ, it is severely biased. The adjusted R²

R̄² = 1 − [1/(N − K) ∑_{i=1}^N e_i²] / [1/(N − 1) ∑_{i=1}^N (y_i − ȳ)²]

has less bias. It is not a panacea, however, for the problem that R²
cannot be used for model selection.
Restriction tests on coefficients
The t-statistic

Additional to (A1)–(A4), assume

A5  The errors ε_i follow a normal distribution.

Then, it follows that the vector ε ∼ N(0, σ²I_N). Clearly,
y|X ∼ N(Xβ, σ²I_N), and also the OLS coefficient b follows a
normal distribution. It can be shown that the ratio

t_k = (b_k − β_k) / (s √c_kk)

is t-distributed with N − K degrees of freedom. Here, c_kk denotes
the k-th diagonal element of (X′X)⁻¹.
Restriction tests on coefficients
The t-test

The null hypothesis of interest is H_0: β_k = β⁰_k. Then, under the
null hypothesis, the t-statistic

t_k = (b_k − β⁰_k) / (s √c_kk)

is t_{N−K} distributed. As N − K increases, t_{N−K} approaches the
normal N(0, 1) distribution. It is customary to say that a
coefficient is significant whenever its |t_k| for H_0: β_k = 0 is larger
than 1.96 or even 2.
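In code, the t-test for H_0: β_k = 0 looks as follows (a sketch with simulated data; scipy's t distribution supplies the two-sided p-values):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
N, K = 100, 2
x = rng.normal(size=N)
X = np.column_stack([np.ones(N), x])
y = 1.0 + 0.8 * x + rng.normal(size=N)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b
s2 = e @ e / (N - K)
c = np.diag(np.linalg.inv(X.T @ X))        # c_kk
t = b / np.sqrt(s2 * c)                    # t_k for H0: beta_k = 0
p = 2 * stats.t.sf(np.abs(t), df=N - K)    # two-sided p-values
```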
Restriction tests on coefficients
The F-test

Assume the null hypothesis of interest is

H_0: β_{K−J+1} = … = β_K = 0,

i.e. J restrictions on the coefficients. Let S_0 denote the residual
sum of squares if OLS is applied to the restricted model with
K − J regressors, and S_1 the RSS for the unrestricted model with
K regressors. One can show that the F-statistic

F = [(S_0 − S_1)/J] / [S_1/(N − K)]

is under H_0 distributed F(J, N − K). As N gets large, the F-test
becomes essentially a χ²(J) test.
Restriction tests on coefficients
Variants of the F-test

A special case is the test of H_0: β_2 = … = β_K = 0, for which
S_0 = ∑_{i=1}^N (y_i − ȳ)². This is the only F shown in
a standard regression printout.
Rejecting H_0 although it is correct is a type I error;
not rejecting H_0 although it is incorrect is a type II error.

Asymptotic properties of OLS
OLS consistency

Additional to (A1)–(A4), assume

A6  plim N⁻¹X′X = Σ_X, a finite non-singular matrix,

which is stronger than no multicollinearity. It excludes asymptotic
multicollinearity as well as increasing regressors.

Theorem
Under the assumptions (A1)–(A4), (A6), it holds that plim b = β.

This property is also called in short "b is consistent for β".
Formally, it is defined by

lim_{N→∞} P{|b_k − β_k| > δ} = 0  ∀δ > 0, ∀k.

The proof uses the Chebyshev inequality.
Asymptotic properties of OLS
Remarks on OLS consistency
Write the OLS estimator as

b = (N⁻¹X′X)⁻¹ N⁻¹X′y = β + (N⁻¹X′X)⁻¹ N⁻¹X′ε,

such that b converges to β if the last term converges to 0.
The factor N⁻¹X′ε converges to 0, and this will hold true under
conditions weaker than the Gauss-Markov conditions, admitting
some correlation among the errors and some heteroskedasticity.
Asymptotic properties of OLS
Asymptotic normality of OLS
Theorem
Under assumptions (A1)–(A4), (A6), it holds that

√N (b − β) → N(0, σ²Σ_X⁻¹).

Again, for this result conditions weaker than the Gauss-Markov
conditions would suffice. Note that the normality assumption (A5)
is not needed, according to the Central Limit Theorem. The
theorem implies that b is asymptotically distributed as
N(β, σ²(X′X)⁻¹). Similarly, the corresponding t-statistics will be
asymptotically N(0, 1) distributed etc.
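A Monte Carlo sketch of the theorem: with deliberately non-normal (uniform) errors, the spread of √N(b_k − β_k) stays close to the theoretical value as N grows. All settings below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(11)
beta = np.array([1.0, 2.0])

def sd_scaled_dev(N, reps=500):
    """Std. dev. of sqrt(N)(b_2 - beta_2) across simulated samples."""
    devs = np.empty(reps)
    for r in range(reps):
        x = rng.normal(size=N)
        X = np.column_stack([np.ones(N), x])
        eps = rng.uniform(-1.0, 1.0, size=N)   # non-normal errors
        y = X @ beta + eps
        b = np.linalg.solve(X.T @ X, X.T @ y)
        devs[r] = np.sqrt(N) * (b[1] - beta[1])
    return devs.std()

# Theory: the limit std. dev. for the slope is sigma = sqrt(1/3) ~ 0.577
# (the std. dev. of U(-1,1) errors), since Var(x) = 1 here.
s100, s400 = sd_scaled_dev(100), sd_scaled_dev(400)
```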
Binary choice models
Binary choice models
Linear regression only works as long as it is reasonable to view the
dependent variable as distributed Gaussian or similar (unimodal,
continuous, support is ℝ). It fails when the dependent variable is
binary (0/1, buys a laptop / does not buy, employed/unemployed,
etc.).
Two suggestions:
P(y_i = 1|x_i) = G(x_i, β), with known G a link function with its
image in [0, 1];
y_i* = x_i′β + u_i for the latent (hidden, unobserved) variable y*,
and y = 1 if y* > 0.
Common choices for the link F (with P(y_i = 1|x_i) = F(x_i′β)):

F(w) = Φ(w), the standard normal cdf, defines the probit model;

F(w) = L(w) = e^w / (1 + e^w), the logistic cdf, defines the logit
model;

F(w) = 0 for w ≤ 0,  w for 0 ≤ w ≤ 1,  1 for w > 1
defines the linear probability model.
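The three choices of F can be written as small Python functions (a sketch; the probit line uses scipy's normal cdf):

```python
import numpy as np
from scipy.stats import norm

def probit_F(w):
    """Standard normal cdf (probit)."""
    return norm.cdf(w)

def logit_F(w):
    """Logistic cdf (logit)."""
    return np.exp(w) / (1.0 + np.exp(w))

def lpm_F(w):
    """Linear probability model: clamp w to [0, 1]."""
    return np.clip(w, 0.0, 1.0)
```

All three map the index into [0, 1]; probit and logit agree that w = 0 yields probability 0.5, the point of undecidedness.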
Binary choice models
Marginal effects

In linear regression, the β_k reflect the marginal effect of a change
in x_k. In binary choice models, this issue is more complex, as the
marginal effects (φ denotes the standard normal density)

∂Φ(x_i′β)/∂x_ik = φ(x_i′β) β_k ;
∂L(x_i′β)/∂x_ik = [exp(x_i′β) / {1 + exp(x_i′β)}²] β_k ,

depend on the values of the covariates. They are often computed
at sample averages.
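Computing logit marginal effects at the sample averages, as described above (the coefficients and data are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(13)
N = 500
X = np.column_stack([np.ones(N), rng.normal(size=N), rng.normal(size=N)])
beta = np.array([0.2, 1.0, -0.5])          # hypothetical coefficients

xbar = X.mean(axis=0)                      # sample averages of covariates
z = xbar @ beta
lam = np.exp(z) / (1.0 + np.exp(z)) ** 2   # logistic density at xbar'beta
me = lam * beta                            # marginal effects at the means
```

Since the logistic density is at most 1/4, each marginal effect is at most β_k/4 in absolute value; the effects keep the sign of the coefficients but not their scale.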
Binary choice models
The odds ratio
For p_i = P(y_i = 1|x_i), the term

log{p_i / (1 − p_i)} = x_i′β

is the log odds ratio. At the point of undecidedness, the log odds
ratio is 0. In the logit model, it is a linear function of the
covariates. β_k is the marginal reaction of the log odds ratio to a
change in x_k.
Binary choice models
Underlying latent model
Consider the second interpretation

y_i* = x_i′β + ε_i

with y_i = 1 if y_i* > 0 and y_i = 0 otherwise. y* is an unobserved
latent variable (imagine utility etc.). Note

P(y_i = 1) = P(y_i* > 0) = P(ε_i > −x_i′β) = F(x_i′β),

with F the distribution function of the errors ε. Normal (logistic)
distribution for ε implies the probit (logit) model.
Binary choice models
The likelihood of a binary choice model
The standard logit/probit model is fully parametric. The likelihood
is

L(β; y, X) = ∏_{i=1}^N P(y_i = 1|x_i; β)^{y_i} P(y_i = 0|x_i; β)^{1−y_i},

which yields for the log-likelihood

log L(β; y, X) = ∑_{i=1}^N y_i log F(x_i′β) + ∑_{i=1}^N (1 − y_i) log(1 − F(x_i′β)).
Binary choice models
The maximum-likelihood estimator
Maximizing the likelihood L in β yields the ML estimator β̂. There
is no closed form. Because of continuity, the ML estimator can be
obtained numerically by solving

∂ log L(β)/∂β = ∑_{i=1}^N [ (y_i − F(x_i′β)) / {F(x_i′β)(1 − F(x_i′β))} ] f(x_i′β) x_i = 0,

with f = F′, the density corresponding to F.
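A sketch of solving this first-order condition numerically for the logit model, where f = F(1 − F) makes the score simplify to ∑(y_i − F(x_i′β))x_i. The Newton-Raphson loop below uses simulated data (all values are illustrative):

```python
import numpy as np

# Simulated binary-choice data (illustrative true coefficients)
rng = np.random.default_rng(21)
N = 1000
x = rng.normal(size=N)
X = np.column_stack([np.ones(N), x])
beta_true = np.array([0.5, 1.0])
p = 1.0 / (1.0 + np.exp(-X @ beta_true))
y = (rng.uniform(size=N) < p).astype(float)

b = np.zeros(2)                          # start at beta = 0
for _ in range(25):                      # Newton-Raphson iterations
    F = 1.0 / (1.0 + np.exp(-X @ b))
    score = X.T @ (y - F)                # gradient of log L (logit case)
    W = F * (1.0 - F)                    # logistic density f = F(1 - F)
    H = -(X * W[:, None]).T @ X          # Hessian of log L
    b = b - np.linalg.solve(H, score)
```

At convergence the score is numerically zero, and b approximates the true coefficients up to sampling error.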