Linear Regression. Least Squares Estimator. Asymptotic Properties. Gauss-Markov Theorem
Jakub Mućk
SGH Warsaw School of Economics
Econometrics textbooks:
1. Wooldridge J. M., Econometric Analysis of Cross Section and Panel Data.
2. Pesaran M. H., Time Series and Panel Data Econometrics.
3. Greene W. H., Econometric Analysis.
4. Hill R. C., Griffiths W. E., Lim G. C., Principles of Econometrics.
5. Wooldridge J. M., Introductory Econometrics: A Modern Approach.
Software
Stata.
R.
Exam
Homework (× 2-3) and classroom activity.
Econometrics
is an application of statistical techniques to economics in the study of
problems, the analysis of data, and the development and testing of theories
and models.
We use econometrics
to estimate economic parameters (e.g. elasticities),
to forecast economic outcomes,
to verify economic hypotheses.
Example: a consumption function,
C = β0 + β1 Y, (1)
and its econometric counterpart with the error term added:
C = β0 + β1 Y + ε. (2)
More generally, the econometric model is:
yi = β0 + β1 x1i + . . . + βk xki + εi , (3)
where
yi is the dependent (outcome) variable,
x1i , . . . , xki is a set of k explanatory (independent) variables,
β0 , β1 , . . . , βk are the model parameters,
εi is the error term,
i is the index of observation.
Experimental data
Microeconomic data
Macroeconomic data
Financial data
The starting point – the conditional distribution of Y given X.
[Figure: conditional densities f(y|x = a) and f(y|x = b) with conditional means µy|a and µy|b.]
Simple regression:
E (y|x) = β0 + β1 x
The slope is the change in the conditional mean per unit change in x:
β1 = ∆E(y|x) / ∆x
Simple Linear Regression Model
y = β0 + β1 x + ε (4)
where
y is the dependent (outcome) variable;
x is the independent variable;
ε is the error term.
The dependent variable is explained by a component that varies with the explanatory variable and by the error term.
β0 is the intercept.
β1 is the coefficient (slope) on x.
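A minimal R sketch of this model: simulate the DGP in (4) under illustrative parameter values (β0 = 2 and β1 = 0.5 are assumptions, not values from the slides) and fit the line with lm().

```r
set.seed(123)
N   <- 100
x   <- runif(N, 0, 10)      # independent variable
eps <- rnorm(N)             # error term with E(eps) = 0
y   <- 2 + 0.5 * x + eps    # DGP: y = beta0 + beta1*x + eps
fit <- lm(y ~ x)            # least squares fit
coef(fit)                   # estimates of beta0 and beta1
```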
The least squares (LS) estimator
How to estimate the slope and intercept?
[Figure: scatter plot of the dependent variable y against the explanatory variable x.]
Assumptions of the least squares estimators
Assumption #1: true DGP (data generating process):
y = β0 + β1 x + ε. (5)
Assumption #2: the error term has zero mean:
E (ε) = 0, (6)
Assumption #3: the error term has constant variance (homoskedasticity):
var (ε) = σ², (7)
Assumption #4: the error terms are uncorrelated across observations:
cov (εi , εj ) = 0 for i ≠ j, (8)
Assumption #5: the explanatory variable x is not random and takes at least two different values.
Assumption #6 (optional): the error term is normally distributed:
ε ∼ N (0, σ²). (9)
Fitted values and residuals
The fitted values (ŷi ):
ŷi = β̂0 + β̂1 xi ,
where β̂0 and β̂1 are estimates of the intercept and slope, respectively.
The residuals (êi ):
êi = yi − ŷi = yi − β̂0 − β̂1 xi .
The least squares estimator
[Figure: scatter plot of y (dependent variable) against x with the fitted least squares line.]
The least squares estimators
β̂1^LS = Σi (xi − x̄)(yi − ȳ) / Σi (xi − x̄)²
β̂0^LS = ȳ − β̂1^LS x̄
where ȳ and x̄ are the sample averages of the dependent and independent variables, respectively, and the sums run over i = 1, . . . , N.
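These formulas translate directly into R; a sketch (reusing the x and y simulated above) that should reproduce coef(lm(y ~ x)):

```r
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)  # slope
b0 <- mean(y) - b1 * mean(x)                                     # intercept
c(b0, b1)                                                        # same as coef(fit)
```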
Gauss-Markov Theorem
Under assumptions A#1-A#5 of the simple linear regression model, the least squares estimators β̂0^LS and β̂1^LS have the smallest variance of all linear and unbiased estimators of β0 and β1.
β̂0^LS and β̂1^LS are the Best Linear Unbiased Estimators (BLUE) of β0 and β1.
Remarks on the Gauss-Markov Theorem
1. The estimators β̂0^LS and β̂1^LS are best when compared to other linear and unbiased estimators. Based on the Gauss-Markov theorem alone we cannot claim that they are the best of all possible estimators.
2. Why are the estimators β̂0^LS and β̂1^LS best? Because they have the minimum variance.
3. The Gauss-Markov theorem holds only if assumptions A#1-A#5 are satisfied. If not, then β̂0^LS and β̂1^LS are not BLUE.
4. The Gauss-Markov theorem does not require the assumption of normality (A#6).
5. Apart from that, the least squares estimator is consistent if assumptions A#1-A#5 are satisfied.
Linearity of the estimator
The slope estimator can be written as a weighted sum of the observations:
β̂1^LS = Σi wi yi ,
where wi = (xi − x̄) / Σi (xi − x̄)².
After manipulation we get:
β̂1^LS = β1 + Σi wi εi . (19)
Since the wi are known, this is a linear function of the random variable ε.
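A quick numerical check of this weight representation with the simulated data from above: the weighted sum of the yi reproduces the LS slope.

```r
w <- (x - mean(x)) / sum((x - mean(x))^2)  # known weights w_i
sum(w * y)                                 # equals the LS slope b1
```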
Unbiasedness
The estimator is unbiased if its expected value equals the true value, i.e.,
E(β̂) = β. (20)
Example: unbiased estimator
[Figure: sampling density f(θ̂) centred at the true value, E(θ̂) = θ.]
Example: biased estimator
[Figure: sampling density f(θ̂) centred away from the true value θ.]
The variance and covariance of the LS estimators
In general, the variance measures efficiency.
If assumptions A#1-A#5 are satisfied, then:
var(β̂0^LS) = σ² Σi xi² / ( N Σi (xi − x̄)² )
var(β̂1^LS) = σ² / Σi (xi − x̄)²
cov(β̂0^LS , β̂1^LS) = −σ² x̄ / Σi (xi − x̄)²
The greater the variance of the error term (σ²), i.e., the larger the role of the error term, the larger the variance and covariance of the estimators.
The larger the variability of the explanatory variable, Σi (xi − x̄)², the smaller the variance of the least squares estimators.
The larger the sample size (N), the smaller the variance of the least squares estimators.
The larger Σi xi², the greater the variance of the intercept estimator.
The covariance of the estimators has the opposite sign to x̄, and its magnitude grows with the magnitude of x̄.
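A sketch checking these variance formulas against vcov() in R, with σ² replaced by its usual estimate (data and fit from above):

```r
s2  <- sum(residuals(fit)^2) / (N - 2)             # estimate of sigma^2
vb1 <- s2 / sum((x - mean(x))^2)                   # var(beta1_hat)
vb0 <- s2 * sum(x^2) / (N * sum((x - mean(x))^2))  # var(beta0_hat)
cb  <- -s2 * mean(x) / sum((x - mean(x))^2)        # cov(beta0_hat, beta1_hat)
matrix(c(vb0, cb, cb, vb1), 2, 2)                  # same as vcov(fit)
```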
The probability distribution of the least squares estimators
If the normality assumption (A#6) holds, the least squares estimators are normally distributed. Without normality of the error term, they are approximately normally distributed in large samples.
Estimating the variance of the error term
The variance of the error term is estimated from the residuals:
σ̂² = Σi êi² / (N − 2),
where N − 2 is the number of degrees of freedom in the simple regression model.
Estimating the variance of the least squares estimators
The variances and covariance of the least squares estimators are estimated by replacing σ² with σ̂² in the formulas above.
Multiple regression
The multiple regression model:
y = β0 + β1 x1 + β2 x2 + . . . + βK xK + ε (26)
where
y is the dependent (outcome) variable;
x1 , x2 , . . . , xK is the set of independent variables;
ε is the error term.
The dependent variable is explained by components that vary with the independent variables and by the error term.
β0 is the intercept.
β1 , β2 , . . . , βK are the coefficients (slopes) on x1 , x2 , . . . , xK .
General form:
y = β0 + β1 x1 + β2 x2 + . . . + βK xK + ε. (27)
Matrix form:
y = Xβ + ε (28)
where
y = (y1 , y2 , . . . , yN )′ is the N×1 vector of observations on the dependent variable,
X is the N×(K+1) matrix of explanatory variables whose i-th row is (1, xi,1 , xi,2 , . . . , xi,K ),
β = (β0 , β1 , . . . , βK )′ is the (K+1)×1 vector of parameters,
ε = (ε1 , ε2 , . . . , εN )′ is the N×1 vector of error terms,
K is the number of explanatory variables; N is the number of observations.
Assumptions of the least squares estimators (multiple regression) I
Assumption #1: true DGP:
y = Xβ + ε. (29)
Assumption #2: the error term has zero mean:
E (ε) = 0, (30)
Assumption: the error term is uncorrelated with the explanatory variables:
E (X′ε) = 0. (34)
Assumptions of the least squares estimators (multiple regression) II
Assumption: no perfect collinearity, i.e., X has full column rank:
rank(X) = K + 1 ≤ N. (35)
Assumption (optional): normality of the error term:
ε ∼ N (0, σ² I). (36)
The least squares estimator minimizes the sum of squared errors (SSE) for the model
y = Xβ + ε, (37)
SSE(β) = e′e = (y − Xβ)′(y − Xβ),
where e = y − ŷ = y − Xβ̂.
The SSE can be expressed as a function of the unknown parameters; its derivative is:
∂SSE(β)/∂β = −2X′y + 2X′Xβ, (41)
and setting this derivative to zero gives, after manipulations, the normal equations:
X′y = X′Xβ. (42)
Finally, using the assumption that X has full column rank, we get:
β̂^OLS = (X′X)⁻¹ X′y. (43)
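Equation (43) implemented directly in R. A sketch with two simulated regressors; the variable names and parameter values are illustrative assumptions:

```r
set.seed(42)
N  <- 200
x1 <- rnorm(N); x2 <- rnorm(N)
y2 <- 1 + 0.5 * x1 - 0.3 * x2 + rnorm(N)    # assumed DGP
X  <- cbind(1, x1, x2)                      # N x (K+1) design matrix
beta_hat <- solve(t(X) %*% X, t(X) %*% y2)  # (X'X)^{-1} X'y
beta_hat                                    # matches coef(lm(y2 ~ x1 + x2))
```

solve(A, b) solves the normal equations directly, which is numerically preferable to forming the explicit inverse.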
Since β̂^OLS − β = (X′X)⁻¹X′ε, the variance of the OLS estimator is:
Var(β̂^OLS) = E[ ((X′X)⁻¹X′ε) ((X′X)⁻¹X′ε)′ ]
= E[ (X′X)⁻¹X′ εε′ X (X′X)⁻¹ ]
= (X′X)⁻¹X′ E(εε′) X (X′X)⁻¹
= σ² (X′X)⁻¹,
where the last step uses E(εε′) = σ² I.
The variance of the OLS estimator can be calculated with the estimate of the variance of the error term (S²):
V̂ar(β̂^OLS) = S² (X′X)⁻¹,
where
S² = e′e / (N − (K + 1)) = SSE(β̂^OLS) / df, (48)
SSE(β̂^OLS) is the sum of squared residuals, and df stands for degrees of freedom.
The diagonal elements of the variance-covariance matrix (denoted d̂ii) measure the variances of the respective parameter estimates. The standard error is then:
S(β̂i) = √(d̂ii). (49)
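Continuing the matrix example, a sketch computing S², the variance-covariance matrix, and the standard errors by hand; the results should match summary(lm(y2 ~ x1 + x2)):

```r
e  <- y2 - X %*% beta_hat       # residuals
K  <- ncol(X) - 1               # number of explanatory variables
S2 <- sum(e^2) / (N - (K + 1))  # S^2 = e'e / df, as in (48)
V  <- S2 * solve(t(X) %*% X)    # estimated var-cov matrix of beta_hat
sqrt(diag(V))                   # standard errors, as in (49)
```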
Gauss-Markov Theorem
Under assumptions A#1-A#5 of the multiple linear regression model, the least squares estimator β̂^OLS has the smallest variance of all linear and unbiased estimators of β.
Asymptotic properties are discussed in the Appendix.
Appendix. Probability Primer
Random variable
A random variable is a variable whose value is unknown until it is observed.
The probability density function
For a discrete random variable X, the probability density function f(x) gives the probability that X takes the value x. For a continuous random variable, probabilities are given by areas under f(x).
The cumulative distribution function
F(x) = P(X ≤ x).
For a discrete random variable, the cdf is obtained by summing the pdf over all values xi ≤ x.
For a continuous random variable, F(x) is the area under the pdf to the left of the point x.
The cdf is useful in calculating probabilities. For instance:
P(a < X ≤ b) = F(b) − F(a).
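In R, pnorm() is the normal cdf, so such probabilities are one-liners; a sketch for X ∼ N(0, 1):

```r
a <- 0; b <- 1.96
pnorm(b) - pnorm(a)  # P(a < X <= b) = F(b) - F(a)
1 - pnorm(b)         # P(X > b)
```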
Joint density, marginal and conditional distribution
For a set of (at least two) random variables it is useful to analyse the joint distribution.
The joint probability density function summarizes the information concerning the possible outcomes of (at least two) random variables and the corresponding probabilities. For discrete random variables:
f(x, y) = P(X = x, Y = y).
Random variables are independent if and only if their joint pdf is the product of the individual pdfs. For two random variables:
f(x, y) = f(x) f(y).
Expected value I
For a discrete random variable X with pdf f(x), the expected value is:
E(X) = Σi xi f(xi).
Expected value II
3. For any constants a1 , . . . , ak and random variables X1 , . . . , Xk :
E( Σi ai Xi ) = Σi ai E(Xi), (65)
and when ai = 1 for all i, the expected value of the sum is exactly the sum of the expected values.
4. For a function g(·) creating a new random variable:
E(g(X)) = Σi g(xi) fX(xi). (66)
Variance & Covariance I
The variance is one measure of variability:
var(X) = E[(X − µ)²], (69)
where E(X) = µ.
The variance is usually denoted σ² (or σX² for X).
Alternatively, the variance can be expressed as:
var(X) = E(X²) − µ².
χ² and t distributions
If V has a χ² distribution with k degrees of freedom, then:
E(V) = k,
var(V) = 2k.
Appendix. Asymptotic properties of the OLS estimator
Unbiasedness of the OLS estimator
Unbiasedness of an estimator:
E(β̂^OLS) = β. (80)
Using the true DGP, i.e., y = Xβ + ε, we can rewrite the least squares estimator:
β̂^OLS = (X′X)⁻¹X′y = (X′X)⁻¹X′(Xβ + ε) = β + (X′X)⁻¹X′ε.
Using the fact that E(β) = β and the assumption that the explanatory variables are not random, i.e., E(X) = X, we get E((X′X)⁻¹X′ε) = (X′X)⁻¹X′E(ε) = 0 and hence:
E(β̂^OLS) = β. (85)
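Unbiasedness can be illustrated by Monte Carlo: hold X fixed, redraw ε many times, and average the OLS estimates. A sketch using the design matrix X from the matrix example (the true coefficients are the assumed DGP values):

```r
set.seed(7)
beta_true <- c(1, 0.5, -0.3)
est <- replicate(5000, {
  eps <- rnorm(N)                            # new error draw, X held fixed
  yr  <- X %*% beta_true + eps
  as.vector(solve(t(X) %*% X, t(X) %*% yr))  # OLS estimate
})
rowMeans(est)                                # approximately beta_true
```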
Linearity of the OLS estimator
The OLS estimator is linear in y: β̂^OLS = (X′X)⁻¹X′y = Ay, where A = (X′X)⁻¹X′ is fixed when the explanatory variables are not random.
Efficiency of the OLS estimator
Let us consider some other linear and unbiased estimator of β, e.g. B̂ = Cy. Then:
Var(B̂) = σ² CC′. (89)
Let us introduce D = C − (X′X)⁻¹X′.
The variance of the estimator B̂:
Var(B̂) = σ² (D + (X′X)⁻¹X′)(D + (X′X)⁻¹X′)′, (90)
DX = 0 since unbiasedness requires CX = I = DX + (X′X)⁻¹(X′X). The variance can therefore be written as:
Var(B̂) = σ² (X′X)⁻¹ + σ² DD′ = Var(β̂^OLS) + σ² DD′. (91)
If D is the zero matrix, then the variance of B̂ is the lowest, but then B̂ is exactly the least squares estimator.
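A crude Monte Carlo illustration of efficiency in the simple regression case (not from the original slides): compare the LS slope with another linear unbiased estimator, the two-point slope (yN − y1)/(xN − x1).

```r
set.seed(11)
n  <- 100
xs <- runif(n, 0, 10)                      # fixed regressor
sims <- replicate(5000, {
  ys <- 2 + 0.5 * xs + rnorm(n)
  ls <- sum((xs - mean(xs)) * (ys - mean(ys))) / sum((xs - mean(xs))^2)
  tp <- (ys[n] - ys[1]) / (xs[n] - xs[1])  # two-point estimator
  c(ls, tp)
})
rowMeans(sims)       # both close to 0.5 (unbiased)
apply(sims, 1, var)  # LS variance is far smaller (BLUE)
```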
Example – efficiency
[Figure: sampling densities f(θ̂) of two unbiased estimators A and B with E(θ̂) = θ; the more efficient estimator has the more concentrated density.]
Consistency of the OLS estimator
A consistent estimator converges in probability to the true value of the unknown parameter.
Under assumptions A#1-A#5, the OLS estimator is consistent:
plim (N→∞) β̂^OLS = β.
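A sketch illustrating consistency: as N grows, the sampling distribution of the LS slope concentrates around the true value (the DGP values are illustrative):

```r
set.seed(99)
for (Nc in c(20, 100, 1000)) {
  b1s <- replicate(2000, {
    xc <- runif(Nc, 0, 10)
    yc <- 2 + 0.5 * xc + rnorm(Nc)
    sum((xc - mean(xc)) * (yc - mean(yc))) / sum((xc - mean(xc))^2)
  })
  cat("N =", Nc, "sd(beta1_hat) =", sd(b1s), "\n")
}
```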
Example – consistency of an estimator
[Figure: sampling densities f(θ̂) for growing sample sizes (θ̂1, θ̂2, θ̂3, . . . , θ̂n), concentrating around the true value as the sample size increases.]