Cours 2020
Jean-Paul Renne
HEC Lausanne
Statistics and Econometrics II

Organisational issues

- Lectures (#19)
- Exercise sessions (#7)
- Mid-term exam + correction (#2)
- Evaluation:
  - Two exams: mid-term and final exam.
  - Mid-term exam: 40% of the total grade if it is higher than the grade obtained at the final exam (0% otherwise).
  - Pen-and-paper exams.
  - A4 sheet (home-made) with pertinent information is allowed.
Outline of course

1. Preliminaries
2. Central Limit Theorem
3. Tests
4. Linear regression
5. Panels
6. Maximum Likelihood Estimation
7. Binary-choice models
8. Time Series
Definition 1.1 (Cumulative distribution function (c.d.f.))

The random variable (r.v.) X admits the cumulative distribution function F if, for all a:

  F(a) = P(X ≤ a), ∀a.

Definition 1.2 (Probability density function (p.d.f.))

A continuous random variable X admits the probability density function f if:

  P(a < X ≤ b) = ∫_a^b f(x) dx, ∀a ≤ b.

- In particular (if this limit exists), we have:

  f(x) = lim_{ε→0} P(x < X ≤ x + ε)/ε.
Definition 1.3 (Joint cumulative distribution function (c.d.f.))

The random variables X and Y admit the joint cumulative distribution function F_XY if, for all a and b:

  F_XY(a, b) = P(X ≤ a, Y ≤ b).

Definition 1.4 (Joint probability density function (p.d.f.))

The continuous random variables X and Y admit the joint p.d.f. f_XY if:

  P(a < X ≤ b, c < Y ≤ d) = ∫_a^b ∫_c^d f_XY(x, y) dy dx, ∀a ≤ b, ∀c ≤ d.

In particular,

  f_XY(x, y) = lim_{ε→0} P(x < X ≤ x + ε, y < Y ≤ y + ε)/ε²
             = lim_{ε→0} [F_XY(x + ε, y + ε) − F_XY(x, y + ε) − F_XY(x + ε, y) + F_XY(x, y)]/ε².
Figure 1: Joint normal distribution (surface of the joint p.d.f. of (X, Y))

- The volume between the horizontal plane (z = 0) and the surface is equal to F_XY(0.5, 1).
Figure 2: Joint normal distribution

- Assume that the basis of the black column is defined by those points …
Definition 1.5 (Conditional probability density function)

If X and Y are continuous r.v., then the distribution of X conditional on {Y = y} admits the p.d.f. f_{X|Y} defined by:

  f_{X|Y}(x, y) = lim_{ε→0} P(x < X ≤ x + ε | Y = y)/ε.

Proof. We have:

  f_{X|Y}(x, y) = lim_{ε→0} P(x < X ≤ x + ε | Y = y)/ε
               = lim_{ε→0} (1/ε) P(x < X ≤ x + ε | y < Y ≤ y + ε)
               = lim_{ε→0} (1/ε) P(x < X ≤ x + ε, y < Y ≤ y + ε)/P(y < Y ≤ y + ε)
               = lim_{ε→0} (1/ε) [ε² f_XY(x, y)]/[ε f_Y(y)] = f_XY(x, y)/f_Y(y). ∎
Definition 1.6 (Independent random variables)

Consider two r.v., X and Y, with c.d.f.s and p.d.f.s F_X, F_Y, f_X and f_Y.
These random variables are independent if and only if (iff) the joint c.d.f. of X and Y (see Def. 1.3) is given by:

  F_XY(x, y) = F_X(x) × F_Y(y),

or, equivalently, iff the joint p.d.f. of (X, Y) (see Def. 1.4) is given by:

  f_XY(x, y) = f_X(x) × f_Y(y).

Remark 1.1
Proposition 1.2 (Law of iterated expectations)

If X and Y are two random variables and if E(|X|) < ∞, we have:

  E(X) = E(E(X|Y)).

Proof (in the case where the p.d.f. of (X, Y) exists). Let us denote by f_X, f_Y and f_XY the probability distribution functions (p.d.f.) of X, Y and (X, Y), respectively. We have:

  E(X) = ∫∫ x f_XY(x, y) dx dy = ∫ [∫ x f_{X|Y}(x, y) dx] f_Y(y) dy = ∫ E[X|Y = y] f_Y(y) dy = E(E(X|Y)). ∎
Example 1.1 (Mixture of Gaussian distributions (1/2))

Consider the random variable X defined by:

  X = B × Y₁ + (1 − B) × Y₂,

where B, Y₁ and Y₂ are three independent variables:
B ∼ Bernoulli(p), Y₁ ∼ N(μ₁, σ₁²) and Y₂ ∼ N(μ₂, σ₂²).

[Figure: mixture densities for three parameter sets: (p = 0.5, μ₁ = −1, σ₁ = 1, μ₂ = 2, σ₂ = 1); (p = 0.1, μ₁ = −2, σ₁ = 0.5, μ₂ = 2, σ₂ = 1); (p = 0.3, μ₁ = −2, σ₁ = 2, μ₂ = 1, σ₂ = 1).]

By the law of iterated expectations (noting that 1{B=1} = B and 1{B=0} = 1 − B):

  E(X) = E(E(X|B)) = E(1{B=1} μ₁ + 1{B=0} μ₂) = pμ₁ + (1 − p)μ₂.

Web-interface illustration
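The iterated-expectations result above is easy to check by simulation; a minimal Python sketch (using the first parameter set listed in the figure, variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(0)
p, mu1, s1, mu2, s2 = 0.5, -1.0, 1.0, 2.0, 1.0  # first parameter set above

n = 1_000_000
B = rng.binomial(1, p, size=n)        # B ~ Bernoulli(p)
Y1 = rng.normal(mu1, s1, size=n)      # Y1 ~ N(mu1, s1^2)
Y2 = rng.normal(mu2, s2, size=n)      # Y2 ~ N(mu2, s2^2)
X = B * Y1 + (1 - B) * Y2             # mixture draws

# the sample mean should be close to p*mu1 + (1-p)*mu2 = 0.5
print(X.mean())
```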
Example 1.2 (Buffon (1733)'s needles and π)

Suppose we have a floor made of parallel strips of wood, each the same width, on which we drop needles. Let w = 1 if a needle crosses a line separating two strips, and w = 0 otherwise. Conditionally on the angle θ of the needle, the crossing probability can be computed explicitly, and the unconditional probability P(w = 1) involves π.

Web-interface illustration
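A Monte Carlo version of the experiment. The classical result (for a needle of length ℓ no longer than the strip width t) is P(w = 1) = 2ℓ/(πt), so the observed crossing frequency yields an estimate of π; a sketch, with illustrative values of ℓ and t:

```python
import numpy as np

rng = np.random.default_rng(1)
ell, t = 1.0, 1.0                      # needle length and strip width (ell <= t)

n = 1_000_000
x = rng.uniform(0, t / 2, n)           # distance from needle centre to nearest line
theta = rng.uniform(0, np.pi / 2, n)   # acute angle between needle and the lines
w = x <= (ell / 2) * np.sin(theta)     # crossing indicator

pi_hat = 2 * ell / (t * w.mean())      # invert P(w = 1) = 2*ell/(pi*t)
print(pi_hat)                          # close to 3.14
```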
Example 1.3 (Davis and Lo's (2001) model)

Davis and Lo (2001) introduce a contagion model and propose applications in terms of credit-risk management.

They consider n entities. The state of entity i is denoted by d_i: d_i = 0 if entity i stays non-infected and 1 if it gets infected. There are two ways of getting infected: direct (autonomous) infection, or contamination by a directly-infected entity.

Direct infection and contamination are modelled through two types of random variables: the X_i s (n of them) and the Y_{ji} s (n² of them). These n(n + 1) variables are independently Bernoulli distributed. The Bernoulli parameter is p for the X_i s and q for the Y_{ji} s. We have:

  d_i = X_i + (1 − X_i) (1 − ∏_{j≠i} (1 − X_j Y_{ji})),

where the first term (X_i) corresponds to autonomous infection and the second one to contamination.

In order to determine the distribution of the number of infected entities, it is convenient to first reason conditionally on the number of autonomous infections r, defined by r = Σ_i X_i.

Web-interface illustration
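A small simulation of the model makes the two infection channels concrete. The parameter values below are illustrative (the slides do not fix them), and q denotes the contagion Bernoulli parameter:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, q = 10, 0.05, 0.10               # illustrative values, not from the slides

def infections(rng):
    X = rng.binomial(1, p, n)          # autonomous infections X_i
    Y = rng.binomial(1, q, (n, n))     # Y[j, i]: j may contaminate i
    M = 1 - X[:, None] * Y             # entries (1 - X_j Y_ji)
    np.fill_diagonal(M, 1)             # exclude j = i from the product
    d = X + (1 - X) * (1 - M.prod(axis=0))
    return d.sum()                     # number of infected entities

sims = np.array([infections(rng) for _ in range(20_000)])
print(sims.mean())
```

The simulated mean can be compared with E(Σ d_i) = n[p + (1 − p)(1 − (1 − pq)^{n−1})], which follows from the independence of the Bernoulli draws.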
Proposition 1.3 (Law of total variance)

If X and Y are two random variables and if the variance of X is finite, we have:

  Var(X) = E(Var(X|Y)) + Var(E(X|Y)).

Example 1.4 (Mixture of Gaussian distributions (2/2))

We have:

  Var(X) = E(Var(X|B)) + Var(E(X|B))
         = pσ₁² + (1 − p)σ₂² + p(1 − p)(μ₁ − μ₂)².
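The same simulation approach verifies the variance formula (here with the third parameter set of Example 1.1; names are mine):

```python
import numpy as np

rng = np.random.default_rng(3)
p, mu1, s1, mu2, s2 = 0.3, -2.0, 2.0, 1.0, 1.0   # third parameter set of Example 1.1

n = 1_000_000
B = rng.binomial(1, p, n)
X = np.where(B == 1, rng.normal(mu1, s1, n), rng.normal(mu2, s2, n))

# law of total variance applied to the mixture
var_theory = p * s1**2 + (1 - p) * s2**2 + p * (1 - p) * (mu1 - mu2) ** 2
print(X.var(), var_theory)
```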
About consistent estimators

- The objective of econometrics is to estimate parameters out of data observations (samples).
- Except in degenerate cases, the estimates are different from the true parameter values. An estimator is consistent if it converges (in probability) to the true value as the sample size grows.
Example 1.6 (Example of non-convergent estimator)

- Assume that X_i ∼ i.i.d. Cauchy with a location parameter of 1 and a scale parameter of 1 (Def. 9.7).
- The sample mean

    X̄_n = (1/n) Σ_{i=1}^n X_i

  is not a consistent estimator of the location parameter: X̄_n is itself Cauchy(1, 1)-distributed, whatever n.

[Figure: two simulated trajectories of X̄_n for n up to 5000; the sample mean keeps experiencing large jumps and does not settle down.]
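The non-convergence can be illustrated by looking at the spread of X̄_n across many replications: for a Cauchy, averaging does not reduce the dispersion at all, since X̄_n is again Cauchy(1, 1), whose interquartile range is 2. A sketch:

```python
import numpy as np

rng = np.random.default_rng(4)

def iqr_of_sample_mean(n, reps=5_000):
    X = 1 + rng.standard_cauchy((reps, n))   # Cauchy, location 1, scale 1
    means = X.mean(axis=1)
    q75, q25 = np.percentile(means, [75, 25])
    return q75 - q25

# the interquartile range does not shrink with n (it would shrink like
# 1/sqrt(n) for a finite-variance distribution)
print(iqr_of_sample_mean(10), iqr_of_sample_mean(1000))   # both close to 2
```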
Definition 2.1 (Sample and Population)

Remarks

- Most of the time, it is necessary to use samples for research, because it is impractical to study the whole population.
- Swiss students: it is not possible to get the heights of all of the 20-year-old students in Switzerland, but it is possible to get them for a sample of them.
Figure 3: Data (red bars) and estimated density of men heights
- The central limit theorem (CLT) and the law of large numbers (LLN) are (the) two fundamental theorems of probability.
- LLN: The sample mean converges to the population mean (when it exists).
- CLT: The distribution of the sum (or average) of a large number of independent, identically distributed (i.i.d.) variables with finite variance is approximately normal, regardless of the underlying distribution.
Theorem 2.1 (Law of large numbers)

The sample mean is a consistent estimator of the population mean.

- The assumption of finite variance is not necessary.

Proof. See extra material.
- What is the minimal number n of needles that one has to throw to approximate π with a given precision (Example 1.2)?
- Assume x_i ∼ i.i.d. L(μ, σ²), where L(μ, σ²) denotes a distribution of mean μ and variance σ².
- By Chebyshev's inequality, we have:

  P(|x̄_n − μ| > ε) ≤ Var(μ − x̄_n)/ε² = σ²/(nε²) → 0 as n → ∞.

- How does the CLT complement this information?
Central limit theorem

If x_n is an i.i.d. sequence of random variables with mean μ and variance σ² (∈ ]0, +∞[), then:

  √n ((1/n) Σ_{i=1}^n x_i − μ)  →d  N(0, σ²).

[→d denotes the convergence in distribution (see Def. 9.12).]

Proof. See online extra material.
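A simulation makes the statement concrete: even for a markedly skewed distribution such as the exponential, the standardized sample mean behaves like a N(0, 1) draw for moderate n. A sketch:

```python
import numpy as np

rng = np.random.default_rng(5)

# Exponential(1) draws: mean 1, variance 1, but clearly non-normal
n, reps = 200, 50_000
x = rng.exponential(1.0, (reps, n))
z = np.sqrt(n) * (x.mean(axis=1) - 1.0)     # sqrt(n) (x_bar - mu), sigma = 1

# z should look like N(0, 1): unit std. dev., ~95% of mass within +/- 1.96
print(z.std(), (np.abs(z) < 1.96).mean())
```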
History of CLT

- First version of this theorem: postulated by the mathematician Abraham de Moivre. In an article published in 1733, he used the normal distribution to approximate the distribution of the number of heads resulting from many tosses of a fair coin.
Figure 4: Distribution of sample means, heights example (see Fig. 3)

Notes: To obtain the black distribution, we simulate a large number (4000) of samples of n = 100 points using the red distributions. For each sample, we compute the sample mean. The black curve is the distribution of the (simulated) sample means, after having multiplied them by √n = 10.

Web-interface illustration
Multivariate central limit theorem

If x_n is an i.i.d. sequence of m-dimensional random variables with mean μ and covariance matrix Ω, then:

  √n ((1/n) Σ_{i=1}^n x_i − μ)  →d  N(0, Ω).
- Assume you have a lot of time and a coin. How would you approximate the 100 percentiles of the N(0, 1) distribution?
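One possible answer, sketched in Python rather than with an actual coin: toss the coin n times, standardize the number of heads, and repeat; by the CLT the standardized counts are approximately N(0, 1), so their empirical percentiles approximate the normal percentiles.

```python
import numpy as np

rng = np.random.default_rng(6)

# one "experiment" = n tosses of a fair coin
n, reps = 400, 200_000
heads = rng.binomial(n, 0.5, reps)
z = (heads - n / 2) / np.sqrt(n / 4)   # standardized head counts, approx N(0, 1)

# empirical percentiles of z approximate those of N(0, 1)
print(np.percentile(z, [5, 50, 95]))
```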
Example 2.2 (Comparison of sample means)

- The question is: Is the average of hourly wage rates the same for men (μ_m) and for women (μ_w)?
- Wage sample for men: {m₁, ..., m_{n_m}}.
- Wage sample for women: {w₁, ..., w_{n_w}}.
- CLT:

  √n_m (m̄ − μ_m) ∼ N(0, σ_m²)  and  √n_w (w̄ − μ_w) ∼ N(0, σ_w²),

  or

  m̄ ∼ N(μ_m, σ_m²/n_m)  and  w̄ ∼ N(μ_w, σ_w²/n_w).

- Because m̄ and w̄ are based on randomly selected samples, they are independent (Def. 1.6) and:

  m̄ − w̄ ∼ N(μ_m − μ_w, σ²)  with  σ² = σ_m²/n_m + σ_w²/n_w.

- In particular, under the assumption that μ_m − μ_w = 0, the probability that m̄ − w̄ is in [−2σ, 2σ] is approximately 0.95.
Figure 5: Distributions of hourly wage rates
(Case A) education > 12 years; (Case B) education > 18 years. Each panel shows the men and women densities of hourly wage rates.

Source of the data: Fox (2008), Canadian Survey of Labour and Income Dynamics, Ontario province.
Example 2.2 (Comparison of sample means (continued))

- Case A (Education > 12 years): We have m̄ − w̄ = 3.44, √σ² = 0.33.
  - If μ_m − μ_w = 0, P(m̄ − w̄ ∉ [−0.65, 0.65]) ≈ 5%.
  - If μ_m − μ_w = 0, P(m̄ − w̄ ∉ [−0.85, 0.85]) ≈ 1%.
  - If μ_m − μ_w = 0, P(m̄ − w̄ > 3.44) ≈ 0.0000001%  (p-value).
  (Statistical Table)
- Case B (Education > 18 years): We have m̄ − w̄ = 2.34, √σ² = 1.22.
  - If μ_m − μ_w = 0, P(m̄ − w̄ ∉ [−2.39, 2.39]) ≈ 5%.
  - If μ_m − μ_w = 0, P(m̄ − w̄ ∉ [−3.14, 3.14]) ≈ 1%.
  - If μ_m − μ_w = 0, P(m̄ − w̄ > 2.34) ≈ 2.7%  (p-value).
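The Case B p-value can be reproduced with the normal c.d.f.; only the standard library is needed:

```python
from math import erf, sqrt

def normal_sf(x):
    """Survival function 1 - Phi(x) of the standard normal."""
    return 0.5 * (1 - erf(x / sqrt(2)))

# Case B figures from the slide: m_bar - w_bar = 2.34, sqrt(sigma^2) = 1.22
diff, sigma = 2.34, 1.22
p_value = normal_sf(diff / sigma)   # one-sided p-value under mu_m = mu_w
print(p_value)                      # close to the ~2.7% reported on the slide
```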
Tests
Context and objective

- Context: We consider a vector of parameters θ.
- Objective: We want to test a specific hypothesis about θ.

Definition 3.1 (Hypothesis)

- We consider:
  - a null hypothesis (H₀: {θ ∈ Θ}) and
  - an alternative hypothesis (H₁: {θ ∉ Θ}, i.e. H₁: {θ ∈ Θᶜ}).
- Researchers may want to find support for the null hypothesis H₀, or evidence against it.
- General idea: base the decision on a test statistic, and reject H₀ when the statistic takes a value that would be unlikely under H₀.
- When testing a hypothesis, one can make two types of errors:
  - Type I: Reject H₀ while it is true = False Positive (FP).
  - Type II: Accept H₀ while it is false = False Negative (FN).

Example 3.1 (A practical illustration of type-I and type-II errors)

- Early warning systems (EWS): approaches designed to warn policy-makers of upcoming crises.
  - H₀: No crisis is coming.
  - H₁: A crisis is coming.
- Four cases: true positive, false positive (false alarm), true negative, false negative (missed crisis).
Definition 3.2 (Size and Power of a test)

- The probability of type-I errors, denoted by α, is called the size (or significance level) of the test.
- For a given size, we prefer tests with high power. Power = probability that the test will lead to the rejection of the null hypothesis if the latter is false (a high power almost rules out false negatives).
- A test involves:
  - a vector of (unknown) parameters (θ),
  - a test statistic (say S) and
  - a critical region (say Ω).
- H₀ expresses some constraints on θ, formulated as h(θ) = 0.
- The computation of S is based on a set of observed data Y (i.e. S ≡ S(Y)).
- Decision:
  - H₀ is accepted if S ∉ Ω;
  - H₀ is rejected if S ∈ Ω.
- Notations:

  α = P(S ∈ Ω | H₀)  (Proba. of false positive)   (2)
  β = P(S ∉ Ω | H₁)  (Proba. of false negative).  (3)
Definition 3.3 (Different types of tests)
Example 3.2 (Factory example (1/2))

- The size of the metal cylinders produced by a factory has to be equal to 1cm. The tolerance is a = 0.01cm and more than 90% of the pieces have to satisfy the tolerance for the whole production to be valid.
- Let θ denote the fraction of pieces that do not satisfy the tolerance (the production is valid if θ ≤ 0.1).
- θ could be computed by measuring all the pieces but this would be costly.
Example 3.2 (Factory example (2/2))

- n pieces are measured; for the i-th one:

  d_i = 0 if the size of the i-th cylinder is in [1 − a, 1 + a]; 1 otherwise.

- We set x_n = Σ_{i=1}^n d_i, i.e. x_n is the number of measured pieces that do not satisfy the tolerance (out of n).
- A decision rule is: accept H₀ if x_n/n ≤ b, reject otherwise. Since x_n ∼ Binomial(n, θ), the probability of rejecting H₀ is Σ_{i=b×n+1}^n (n choose i) θⁱ (1 − θ)^{n−i}.
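Since x_n is Binomial(n, θ), the rejection probability can be computed exactly, which is how curves like those of Figure 6 are produced. A sketch:

```python
from math import comb

def reject_prob(theta, n, b):
    """P(x_n / n > b) when x_n ~ Binomial(n, theta)."""
    k = int(b * n)   # accept H0 iff x_n <= b*n
    return sum(comb(n, i) * theta**i * (1 - theta) ** (n - i)
               for i in range(k + 1, n + 1))

# rejection probability at the H0 boundary (theta = 0.1) for b = 0.10
for n in (100, 400, 1000):
    print(n, round(reject_prob(0.10, n, 0.10), 3))
```

As the plots illustrate, for θ well above the threshold the rejection probability approaches 1 as n grows.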
Figure 6: Factory example. Probability to reject H₀ as a function of the "true" θ, for N = 100 and b = 0.05, 0.10, 0.20 (the size of the test is indicated).

Web-interface illustration
Figure 6 (continued): Probability to reject H₀, N = 400.

Web-interface illustration
Figure 6 (continued): Probability to reject H₀, N = 1000.

Web-interface illustration
Definition 3.4 (Asymptotic level)

An asymptotic test with critical region Ω_n has an asymptotic level equal to α if:

  ∀θ ∈ Θ, lim_{n→∞} P_θ(S_n ∈ Ω_n) ≤ α.

An asymptotic test with critical region Ω_n is consistent if:

  ∀θ ∈ Θᶜ, P_θ(S_n ∈ Ω_n) → 1.
Back to the Factory example (Exmpl. 3.2): Asymptotic level

- Because S_n = d̄_n, and since E(d_i) = θ and Var(d_i) = θ(1 − θ), the CLT (Thm 9.6) leads to:

  S_n ≈ N(θ, θ(1 − θ)/n),  or  √n(S_n − θ)/√(θ(1 − θ)) ≈ N(0, 1).

- Hence, P_θ(S_n ∈ Ω_n) = P_θ(S_n > b) ≈ 1 − Φ(√n(b − θ)/√(θ(1 − θ))), where Φ denotes the c.d.f. of N(0, 1).
- θ ↦ 1 − Φ(√n(b − θ)/√(θ(1 − θ))) increases w.r.t. θ; therefore, the asymptotic level of the test is attained at the largest value of θ consistent with H₀.
Back to the Factory example (Exmpl. 3.2): Asymptotic consistency

- We proceed under the assumption that θ > 0.1 and we consider b_n = b = 0.1.
- We still have:

  P_θ(S_n > b) ≈ 1 − Φ(√n(b − θ)/√(θ(1 − θ))).

  Since b − θ < 0, the argument of Φ tends to −∞ as n grows, so P_θ(S_n > b) → 1: the test is consistent.
Normality tests

- For k ≥ 1, μ_k = E[(Y − E(Y))^k] is the k-th central moment of Y.
- In particular, μ₂ = Var(Y) = σ², say. Therefore:

  ψ_k = μ_k / (μ₂^{1/2})^k = μ_k / σ^k

  is the k-th standardized moment of Y.
For a Gaussian variable, the skewness (ψ₃) is 0 and the kurtosis (ψ₄) is 3.

Proof. For a centered Gaussian distribution, (−y)³f(−y) = −y³f(y). This implies that

  ∫_{−∞}^{+∞} y³ f(y) dy = ∫_{−∞}^{0} y³ f(y) dy + ∫_{0}^{+∞} y³ f(y) dy = 0,

hence μ₃ = 0 and ψ₃ = 0.
- k-th central sample moment of Y:

  m_k = (1/n) Σ_{i=1}^n (y_i − ȳ)^k.

- k-th standardized sample moment of Y:

  g_k = m_k / m₂^{k/2}.

Proposition 3.2 (Consistency of central sample moments)

If the k-th central moment of Y exists, then the sample central moment m_k is a consistent estimate of the central moment μ_k.
Asymptotically, the vector (√n g₃, √n (g₄ − 3)) is bivariate Gaussian.
The Jarque-Bera test statistic is defined as:

  JB = n (g₃²/6 + (g₄ − 3)²/24) = (n/6) (g₃² + (g₄ − 3)²/4).

Proposition 3.6 (Jarque-Bera asympt. distri.)

If Y is Gaussian, JB →d χ²(2).
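Computing JB from its definition (g₃ and g₄ as defined above); a sketch comparing a normal and a uniform sample:

```python
import numpy as np

rng = np.random.default_rng(7)

def jarque_bera(y):
    n = len(y)
    c = y - y.mean()
    m2, m3, m4 = (c**2).mean(), (c**3).mean(), (c**4).mean()
    g3 = m3 / m2**1.5              # sample skewness
    g4 = m4 / m2**2                # sample kurtosis
    return n * (g3**2 / 6 + (g4 - 3) ** 2 / 24)

y_norm = rng.normal(size=5_000)    # H0 true: JB behaves like a chi2(2) draw
y_unif = rng.uniform(size=5_000)   # H1: kurtosis near 1.8, JB grows with n
print(jarque_bera(y_norm), jarque_bera(y_unif))
```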
Figure 7: The distr. of the JB stat. in small samples (Y ∼ N), for n = 5, 10, 20, 50, 100, 500.

10,000 samples of size n are simulated (from a normal distribution). For each sample, the JB statistic is computed. The red line shows the χ²(2) density. This figure illustrates the convergence of the JB statistic to the χ²(2) distribution under H₀ (the null hypothesis of normality).
Figure 8: The distr. of the JB stat. in small samples (Y ∼ U), for n = 5, 10, 20, 50, 100, 500.

10,000 samples of size n are simulated (from a uniform distribution). For each sample, the JB statistic is computed. The red line shows the χ²(2) density. This figure illustrates the absence of convergence of the JB statistic to the χ²(2) distribution under H₁.
Linear regression
Figure 9: Infant mortality rate (deaths per 1000 births) and GDP/capita
Example 4.2 (Income, Seniority and Education)

[Scatter plots of Income against Education and Seniority, and a 3D scatter of Income against both.]

Source of the data: An Introduction to Statistical Learning, with applications in R (Springer, 2013), with permission from the authors: G. James, D. Witten, T. Hastie and R. Tibshirani.
Example 4.3 (Advertising and Sales)

[Scatter plots of Sales against TV, Radio and Newspaper advertising budgets.]

Source of the data: An Introduction to Statistical Learning, with applications in R (Springer, 2013), with permission from the authors: G. James, D. Witten, T. Hastie and R. Tibshirani.
- The linear regression model reads:

  y_i = x_{i,1} × β₁ + ... + x_{i,K} × β_K + ε_i,   (4)

  where y_i is the regressand, x_{i,k} is the k-th regressor, and ε_i is the disturbance.
- Using the notation x_i = [x_{i,1}, ..., x_{i,K}]′, we get:

  y_i = x_i′ β + ε_i,

  where y_i and ε_i are scalars and β is of dimension (K × 1). The regressors are also called covariates.
- We have a sample of n observations for y_i and for x_i.

Remark 4.1

- It is important to distinguish between population quantities (β and ε_i) and their sample counterparts (b and e_i):

  y_i = x_i′ β + ε_i = x_i′ b + e_i   but   β ≠ b and ε_i ≠ e_i.
Notations

  y = [y₁, ..., y_n]′  (n × 1),   ε = [ε₁, ..., ε_n]′  (n × 1),   X = [x₁′; ...; x_n′]  (n × K),

i.e. y and ε stack the y_i s and the ε_i s, and the i-th row of X is x_i′.
Assumptions of the classical linear regression model

Assumption A.1 (Full rank)

There is no exact linear relationship among the independent variables (the x_i s).
Example of exact linear relationship among the x_i s: x_{i,3} = 2x_{i,1} + x_{i,2} for all i.

Assumption A.2 (Exogeneity)

E(ε_i | X) = 0.

Implications. Under Assumption A.2:
i) E(ε_i) = 0;
ii) the x_{ij} s and the ε_i s are uncorrelated, i.e. ∀i, j, Corr(x_{ij}, ε_i) = 0.

Assumption A.3 (Homoskedasticity)

∀i, Var(ε_i | X) = σ².
Figure 12: Heteroskedasticity (left) and homoskedasticity (right)

X_i ∼ N(0, 1) and ε*_i ∼ N(0, 1).
Left-hand side (heteroskedasticity): Y_i = 2 + 2X_i + (2 × 1{X_i < 0} + 0.2 × 1{X_i ≥ 0}) ε*_i.
Right-hand side (homoskedasticity): Y_i = 2 + 2X_i + ε*_i.
Figure 13: Salaries versus years since end of PhD

Data from Fox J. and Weisberg, S. (2011).
Assumption A.4 (Non-correlated residuals)

∀i ≠ j, Cov(ε_i, ε_j | X) = 0.

Remark 4.2 (Implication of A.3 and A.4)

If A.3 and A.4 hold, then:

  Var(ε | X) = σ² Id,

where Id denotes the (n × n) identity matrix.

Assumption A.5 (Normality of the disturbances)

∀i, ε_i ∼ N(0, σ²).

As we will see later (Slide 108), even if this assumption is not satisfied, the normal distribution will show up in asymptotic results (by virtue of the Central Limit Theorem 9.6).
The least squares coefficient vector

- For a given vector of coefficients b, the sum of squared residuals is:

  f(b) = Σ_{i=1}^n ( y_i − Σ_{j=1}^K x_{i,j} b_j )² = Σ_{i=1}^n (y_i − x_i′b)²,

  or, in matrix form:

  f(b) = (y − Xb)′(y − Xb).
65 / 295
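The minimization of f(b) can be sketched numerically. A minimal illustration (not part of the original slides; simulated data, with a hypothetical coefficient vector beta):

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])  # constant + 2 regressors
beta = np.array([1.0, 2.0, -0.5])                               # hypothetical "true" coefficients
y = X @ beta + rng.normal(scale=0.3, size=n)

# b minimizes f(b) = (y - Xb)'(y - Xb); solve the normal equations X'X b = X'y
b = np.linalg.solve(X.T @ X, X.T @ y)
```

With this many observations and little noise, b lands close to beta.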
- Minimizing f(b) with respect to b yields the first-order conditions (normal equations):

  X′Xb = X′y. (5)

- Under Assumption A.1, X′X is invertible. Hence:

  b = (X′X)⁻¹X′y.
66 / 295
Example 4.4 (Income, Seniority and Education (Example 4.2))

Figure 14: Income, Seniority and Education
[Three-dimensional scatter plot of Income against Education and Seniority.]

Estimate Std. Error t value Pr(>|t|)
(Intercept) -50.086 5.999 -8.349 0.000

Table 1: Income, Education and Seniority
67 / 295
- The estimated residuals are:

  e = y − Xb = My,

  where M := I − X(X′X)⁻¹X′ is called the residual maker matrix.
- We have MX = 0: if one regresses one of the explanatory variables on X, the residuals are null.
- The fitted values are:

  ŷ = Xb = Py,

  where P = X(X′X)⁻¹X′ is a projection matrix. (ŷ is the projection of the vector y onto the vectorial space spanned by the columns of X.)

Remark 4.3
⇒ If intercepts are included in the regression (x_{i,1} ≡ 1), the average of the residuals is null.
68 / 295
Properties of M and P
- M is symmetric (M = M′) and idempotent (M = M² = Mᵏ for k > 0).
- Since M = I − P, P is likewise symmetric and idempotent, and PM = MP = 0.
69 / 295
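These properties of M and P can be verified numerically. A quick check with simulated regressors (a sketch, not part of the original slides):

```python
import numpy as np

rng = np.random.default_rng(1)
n, K = 50, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])

P = X @ np.linalg.inv(X.T @ X) @ X.T   # projection matrix
M = np.eye(n) - P                      # residual maker

# Symmetry and idempotence of M and P
assert np.allclose(M, M.T) and np.allclose(M @ M, M)
assert np.allclose(P, P.T) and np.allclose(P @ P, P)
# MX = 0: regressing a column of X on X leaves no residual
assert np.allclose(M @ X, 0)
```

Note also that Tr(M) = n − K, a fact used later when showing that s² is unbiased.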
Proposition 4.1
(i) Under Assumptions A.1 and A.2, the OLS estimator is linear and unbiased.
(ii) Under Assumptions A.1 to A.4, the conditional covariance matrix of b is: Var(b|X) = σ²(X′X)⁻¹.

Proof Under A.1, X′X can be inverted. We have:

b = (X′X)⁻¹X′y = β + (X′X)⁻¹X′ε.

(i) Let us consider the expectation of the last term, i.e. E((X′X)⁻¹X′ε). Using the law of iterated expectations (Prop. 1.2), we obtain:

E((X′X)⁻¹X′ε) = E(E[(X′X)⁻¹X′ε | X]) = E((X′X)⁻¹X′E[ε|X]).

By A.2, we have E[ε|X] = 0. Hence E((X′X)⁻¹X′ε) = 0 and result (i) follows.
(ii) Var(b|X) = (X′X)⁻¹X′E(εε′|X)X(X′X)⁻¹. By Remark 4.2, if A.3 and A.4 hold, then E(εε′|X) = σ²Id, and result (ii) follows. ∎
70 / 295
Example 4.5 (Bivariate case (K = 2))
- If K = 2 and if the regression includes a constant (x_{i,1} = 1), X is an n × 2 matrix whose ith row is [1, x_{i,2}]. Let us use the notation w_i = x_{i,2}:

  X′X = [ n, Σ_i w_i ; Σ_i w_i, Σ_i w_i² ],

  (X′X)⁻¹ = 1/( n Σ_i w_i² − (Σ_i w_i)² ) × [ Σ_i w_i², −Σ_i w_i ; −Σ_i w_i, n ],

  (X′X)⁻¹X′y = 1/( n Σ_i w_i² − (Σ_i w_i)² ) × [ Σ_i w_i² Σ_i y_i − Σ_i w_i Σ_i w_i y_i ; −Σ_i w_i Σ_i y_i + n Σ_i w_i y_i ]

             = 1/( (1/n) Σ_i (w_i − w̄)² ) × [ ȳ (1/n) Σ_i w_i² − w̄ (1/n) Σ_i w_i y_i ; (1/n) Σ_i (w_i − w̄)(y_i − ȳ) ].

- It can be seen that the second element of b = (X′X)⁻¹X′y is:

  b₂ = Cov(W, Y)/Var(W),

  where Cov(W, Y) and Var(W) are sample estimates.
- Since there is a constant in the regression, we have b₁ = ȳ − b₂w̄.
71 / 295
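The closed-form expressions of Example 4.5 can be checked numerically (a sketch with simulated data; note the matching ddof=0 normalizations so that the covariance and variance use the same 1/n convention):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
w = rng.normal(size=n)
y = 3.0 + 1.5 * w + rng.normal(scale=0.5, size=n)

X = np.column_stack([np.ones(n), w])
b = np.linalg.solve(X.T @ X, X.T @ y)   # [b1, b2]

# b2 = sample Cov(W, Y) / sample Var(W), b1 = ybar - b2 * wbar
b2 = np.cov(w, y, ddof=0)[0, 1] / np.var(w)
b1 = y.mean() - b2 * w.mean()
assert np.allclose(b, [b1, b2])
```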
Theorem 4.1 (Gauss-Markov Theorem)
Under Assumptions A.1 to A.4, for any vector w, the minimum-variance linear unbiased estimator of w′β is w′b, where b is the least squares estimator. (BLUE: Best Linear Unbiased Estimator.)

Proof: Consider b* = Cy, another linear unbiased estimator of β. Since it is unbiased, we must have E(Cy|X) = E(CXβ + Cε|X) = β. We have E(Cε|X) = CE(ε|X) = 0 (by A.2). Therefore b* is unbiased if E(CX)β = β. This has to be the case for any β, which implies that we must have CX = I.
Let us compute Var(b*|X). For this, we introduce D = C − (X′X)⁻¹X′, which is such that Dy = b* − b. The fact that CX = I implies that DX = 0.
We have Var(b*|X) = Var(Cy|X) = Var(Cε|X) = σ²CC′ (by Assumptions A.3 and A.4, see Remark 4.2). Using C = D + (X′X)⁻¹X′ and exploiting the fact that DX = 0 leads to:

Var(b*|X) = σ²(D + (X′X)⁻¹X′)(D + (X′X)⁻¹X′)′ = Var(b|X) + σ²DD′.

Therefore, we have Var(w′b*|X) = w′Var(b|X)w + σ²w′DD′w ≥ w′Var(b|X)w = Var(w′b|X). ∎
72 / 295
Prop. 4.1, Theorems 4.1 and 4.2 (next slide) hold for any sample size.

- Consider the case where we have two sets of explanatory variables: X = [X₁, X₂].
- With obvious notations: b_{y/X} = [b₁′, b₂′]′.
73 / 295
Theorem 4.2 (Frisch-Waugh Theorem)
We have:

b₂ = b_{M_{X₁}y / M_{X₁}X₂},

that is, b₂ is the vector of coefficients obtained when regressing M_{X₁}y on M_{X₁}X₂.

Proof The minimization of the least squares leads to (these are first-order conditions, see Eq. 5):

[ X₁′X₁, X₁′X₂ ; X₂′X₁, X₂′X₂ ] [ b₁ ; b₂ ] = [ X₁′y ; X₂′y ].

Use the first-row block of equations to solve for b₁ first; it comes as a function of b₂. Then use the second set of equations to solve for b₂, which leads to:

b₂ = [X₂′X₂ − X₂′X₁(X₁′X₁)⁻¹X₁′X₂]⁻¹ X₂′(Id − X₁(X₁′X₁)⁻¹X₁′)y = [X₂′M_{X₁}X₂]⁻¹ X₂′M_{X₁}y.

Using the fact that M_{X₁} is idempotent and symmetric leads to the result. ∎

Remark 4.5
74 / 295
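The Frisch-Waugh result can be illustrated numerically: the coefficients on X₂ in the full regression coincide with those from regressing the partialled-out y on the partialled-out X₂ (a sketch with simulated data):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 120
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])
X2 = rng.normal(size=(n, 2)) + 0.5
y = X1 @ np.array([1.0, 2.0]) + X2 @ np.array([0.5, -1.0]) + rng.normal(size=n)

# Full regression of y on [X1, X2]
X = np.hstack([X1, X2])
b = np.linalg.lstsq(X, y, rcond=None)[0]
b2_full = b[X1.shape[1]:]

# Frisch-Waugh: regress M_{X1} y on M_{X1} X2
M1 = np.eye(n) - X1 @ np.linalg.inv(X1.T @ X1) @ X1.T
b2_fw = np.linalg.lstsq(M1 @ X2, M1 @ y, rcond=None)[0]

assert np.allclose(b2_full, b2_fw)
```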
Figure 15: Google searches for “parapluie” (red) versus precipitations (black)
[Monthly time series, 2005 to 2011, values roughly between 50 and 200.]
75 / 295
Estimate Std. Error t value Pr(>|t|)
(Intercept) 21.762 1.954 11.138 0.000
76 / 295
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.000 0.465 0.000 1.000
deseas.precip 0.108 0.025 4.271 0.000

Table 4: Regression of deseas. Google searches on deseas. precipitations
77 / 295
Partial regression coefficients

b₂ = (X₂′M_{X₁}X₂)⁻¹ X₂′M_{X₁}y.
78 / 295
Goodness of fit

- Total variation in y = sum of squared deviations:

  TSS = Σ_{i=1}^n (y_i − ȳ)².

- We have:

  y = Xb + e = ŷ + e.

- In the following, we assume that the regression includes a constant (i.e. for all i, x_{i,1} = 1).
- Let us denote by M⁰ the matrix that transforms observations into deviations from sample means. Using that M⁰e = e and that X′e = 0, we have:

  y′M⁰y = (Xb + e)′M⁰(Xb + e) = b′X′M⁰Xb + e′e,

  i.e. (total sum of sq.) = (“explained” sum of sq.) + (sum of sq. residuals):

  TSS = Expl.SS + SSR.
79 / 295
Coefficient of determination R²

R² = Expl.SS/TSS = 1 − SSR/TSS = 1 − e′e/(y′M⁰y). (6)

Remark 4.6
- It can be shown [Greene, 2012, Section 3.5] that:

  R² = [ Σ_{i=1}^n (y_i − ȳ)(ŷ_i − ȳ) ]² / ( Σ_{i=1}^n (y_i − ȳ)² × Σ_{i=1}^n (ŷ_i − ȳ)² ).

⇒ R² is the sample squared correlation between y and the fitted values ŷ.
80 / 295
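The two expressions of R² (Eq. 6 and Remark 4.6) can be confirmed numerically (a sketch with simulated data):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 80
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.8, -0.3]) + rng.normal(size=n)

b = np.linalg.lstsq(X, y, rcond=None)[0]
yhat = X @ b
e = y - yhat

TSS = np.sum((y - y.mean()) ** 2)
R2 = 1 - e @ e / TSS                        # Eq. (6)
R2_corr = np.corrcoef(y, yhat)[0, 1] ** 2   # squared correlation between y and yhat
assert np.allclose(R2, R2_corr)
```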
Figure 16: Bivariate regressions with low/high R²
[Two scatter plots of Y against X (X between −3 and 2): left panel “Low R2”, right panel “High R2”.]
81 / 295
Definition 4.1 (Partial correlation coefficient)
The partial correlation between y and z, controlling for some variables X, is the sample correlation between y* and z*, where the latter two variables are the residuals in regressions of y on X and of z on X, respectively.

The correlation is denoted by r_yz^X. By definition, we have:

r_yz^X = z*′y* / √( (z*′z*)(y*′y*) ). (7)
82 / 295
Proposition 4.2 (Change in SSR (see Slide 79) when a variable is added)
We have:

u′u = e′e − c²(z*′z*) (≤ e′e), (8)

where (i) u and e are the residuals in the regressions of y on [X, z] and of y on X, respectively, (ii) c is the regression coefficient on z in the former regression, and where z* are the residuals in the regression of z on X.

Proof: The OLS estimates [d′, c]′ in the regression of y on [X, z] satisfy (first-order cond., Eq. 5):

[ X′X, X′z ; z′X, z′z ] [ d ; c ] = [ X′y ; z′y ].

Hence, in particular, d = b − (X′X)⁻¹X′zc, where b is the OLS of y on X. Substituting in u = y − Xd − zc, we get u = e − z*c. We therefore have:

u′u = (e − z*c)′(e − z*c) = e′e + c²(z*′z*) − 2cz*′e. (9)

Now z*′e = z*′(y − Xb) = z*′y because z* are the residuals in an OLS regression on X. Since c = (z*′z*)⁻¹z*′y* (by an application of Theorem 4.2), we have (z*′z*)c = z*′y* and, therefore, z*′e = (z*′z*)c. Inserting this in Eq. (9) leads to the result. ∎
83 / 295
Proposition 4.3 (Change in R² when a variable is added)
Denoting by R²_W the coefficient of determination in the regression of y on W, we have:

R²_{X,z} = R²_X + (1 − R²_X)(r_yz^X)²,

where r_yz^X is the coefficient of partial correlation (Def. 4.1).

Proof: Let's use the same notations as in Prop. 4.2. Theorem 4.2 implies that c = (z*′z*)⁻¹z*′y*. Using this in Eq. (8) gives u′u = e′e − (z*′y*)²/(z*′z*). Using the definition of the partial correlation (Eq. 7), we get u′u = e′e(1 − (r_yz^X)²). The result is obtained by dividing both sides of the previous equation by y′M⁰y. ∎

- The previous proposition shows that we necessarily increase the R² if we add variables, even if they are irrelevant.
- The adjusted R², denoted by R̄², is a fit measure that penalizes large numbers of regressors:

  R̄² = 1 − (n − 1)/(n − K) × (1 − R²).
84 / 295
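The mechanical increase of R² when an irrelevant regressor is appended, and the penalization built into the (standard textbook) adjusted R̄² formula, can both be sketched numerically:

```python
import numpy as np

def r2(X, y):
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ b
    return 1 - e @ e / np.sum((y - y.mean()) ** 2)

def adj_r2(X, y):
    # adjusted R^2 = 1 - (n-1)/(n-K) (1 - R^2): penalizes the number of regressors K
    n, K = X.shape
    return 1 - (n - 1) / (n - K) * (1 - r2(X, y))

rng = np.random.default_rng(5)
n = 60
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)
Xz = np.hstack([X, rng.normal(size=(n, 1))])  # append an irrelevant regressor

R2_small, R2_big = r2(X, y), r2(Xz, y)
assert R2_big >= R2_small  # R^2 never decreases when a variable is added
```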
Inference and Prediction

- Under the normality assumption (Assumption A.5), we know the distribution of b (conditional on X).
- Indeed, (b|X) ≡ (X′X)⁻¹X′y is multivariate Gaussian:

  b|X ~ N(β, σ²(X′X)⁻¹). (10)

- An unbiased estimator of σ² is:

  s² = e′e/(n − K). (11)

  (It is sometimes denoted by σ²_OLS.)

Proof: E(e′e|X) = E(ε′Mε|X) = E(Tr(ε′Mε)|X) = Tr(M E(εε′|X)) = σ²Tr(M). (Note that we have E(εε′|X) = σ²Id by Assumptions A.3 and A.4, see Remark 4.2.) Finally:
Tr(M) = n − Tr(X(X′X)⁻¹X′) = n − Tr((X′X)⁻¹X′X) = n − Tr(Id_{K×K}) = n − K (see also Prop. 9.2). ∎
85 / 295
- Two results will prove important to perform hypothesis testing:

Proposition 4.5 (Distribution of s²)
Under A.1 to A.5:

(n − K)s²/σ² | X ~ χ²(n − K).
86 / 295
Proposition 4.6 (Independence of s² and b)
Under A.1 to A.5, b and s² are independent.

Proof We have b = β + (X′X)⁻¹X′ε and s² = ε′Mε/(n − K). Hence b is an affine combination of ε and s² is a quadratic combination of the same Gaussian shocks. One can write s² as s² = (Mε)′Mε/(n − K) and b as β + Tε, with T = (X′X)⁻¹X′. Since TM = 0, Tε and Mε are independent (because two uncorrelated Gaussian variables are independent); therefore b and s², which are functions of respectively independent variables, are independent. ∎
87 / 295
- Under A.1 to A.5, let us consider b_k, the kth entry of b:

  b_k|X ~ N(β_k, σ²v_k),

  where v_k is the kth component of the diagonal of (X′X)⁻¹.
- Besides, we have (Prop. 4.5):

  (n − K)s²/σ² | X ~ χ²(n − K).

- As a result (using Props. 4.5 and 4.6), we have:

  t_k = [ (b_k − β_k)/√(σ²v_k) ] / √[ (n − K)s² / (σ²(n − K)) ] = (b_k − β_k)/√(s²v_k) ~ t(n − K), (12)

  where t(n − K) denotes a t distribution with n − K degrees of freedom.
88 / 295
Remark 4.7
- The previous result (Eq. 12) can be extended to any linear combination α′b of the entries of b.
- Therefore:

  (α′b − α′β) / √( σ²α′(X′X)⁻¹α ) | X ~ N(0, 1).

- Using the same approach as the one used to derive Eq. (12), one can show that:

  (α′b − α′β) / √( s²α′(X′X)⁻¹α ) ~ t(n − K). (13)
89 / 295
Figure 17: Normal and Student-t distributions
[Densities of N(0,1), t(3), t(7) and t(20) plotted on [−3, 3].]
Note: See Def. 9.5 of the t distribution. The chart shows that the higher the degree of freedom ν, the closer the distribution of t(ν) gets to the normal distribution. See also Table 23.
90 / 295
Confidence interval of β_k

- Assume we want to compute a (symmetrical) confidence interval [I_{d,1−α}, I_{u,1−α}] that is such that P(β_k ∈ [I_{d,1−α}, I_{u,1−α}]) = 1 − α.
- In particular, we want to have: P(β_k < I_{d,1−α}) = α/2.
- For this purpose, we make use of t_k = (b_k − β_k)/√(s²v_k) ~ t(n − K) (Eq. 12).
- We have:

  P(β_k < I_{d,1−α}) = α/2
  ⟺ P( (b_k − β_k)/√(s²v_k) > (b_k − I_{d,1−α})/√(s²v_k) ) = α/2
  ⟺ P( t_k > (b_k − I_{d,1−α})/√(s²v_k) ) = α/2
  ⟺ 1 − P( t_k ≤ (b_k − I_{d,1−α})/√(s²v_k) ) = α/2
  ⟺ (b_k − I_{d,1−α})/√(s²v_k) = Φ⁻¹_{t(n−K)}(1 − α/2),

  where Φ_{t(n−K)} is the c.d.f. of the t(n − K) distribution (Table 23).
- Doing the same for I_{u,1−α}, we obtain:

  [I_{d,1−α}, I_{u,1−α}] = [ b_k − Φ⁻¹_{t(n−K)}(1 − α/2)√(s²v_k), b_k + Φ⁻¹_{t(n−K)}(1 − α/2)√(s²v_k) ].
91 / 295
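The confidence-interval formula above can be sketched with scipy (simulated data; stats.t.ppf plays the role of the inverse c.d.f. of the t(n − K) distribution):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
n, K = 50, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 0.5]) + rng.normal(size=n)

b = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ b
s2 = e @ e / (n - K)                      # Eq. (11)
v = np.diag(np.linalg.inv(X.T @ X))       # v_k, diagonal of (X'X)^{-1}

alpha = 0.05
q = stats.t.ppf(1 - alpha / 2, df=n - K)  # quantile of the t(n-K) distribution
ci_low = b - q * np.sqrt(s2 * v)
ci_high = b + q * np.sqrt(s2 * v)

# t-statistics and p-values for H0: beta_k = 0 (used on the next slides)
t_stat = b / np.sqrt(s2 * v)
p_val = 2 * (1 - stats.t.cdf(np.abs(t_stat), df=n - K))
```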
Example 4.6 (Income, Seniority and Education (Example 4.2))

Estimate Std. Error t value Pr(>|t|)
(Intercept) -50.086 5.999 -8.349 0.000
Education 5.896 0.357 16.513 0.000
Seniority 0.173 0.024 7.079 0.000

Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.939 0.312 9.422 0.000

The data are the same as in Example 4.3.
92 / 295
- In Examples 4.6 and 4.7, the last two columns give the test statistics and p-values associated to the test (Section 3) whose null hypothesis is:

  H₀: β_k = 0.

- The t-statistic, that is b_k/√(s²v_k), is the test statistic of the test.
- Under H₀, the t-statistic is distributed as t(n − K) (see Eq. 12). Hence, the critical region (see Slide 38) for the test of size α is:

  ]−∞, −Φ⁻¹_{t(n−K)}(1 − α/2)] ∪ [Φ⁻¹_{t(n−K)}(1 − α/2), +∞[.

- The p-value is defined as the probability that |Z| > |t|, where t is the (computed) t-statistic and where Z ~ t(n − K). That is, the p-value is given by 2(1 − Φ_{t(n−K)}(|t_k|)).

Charts on tests (critical region, level, p-value)
93 / 295
Set of linear restrictions

- Consider the model:

  y = Xβ + ε, ε ~ i.i.d. N(0, σ²),

  and assume we want to test for the joint validity of a set of restrictions involving the components of β in a linear way.
- Set of linear restrictions:

  r_{1,1}β₁ + ··· + r_{1,K}β_K = q₁
  ...                                (14)
  r_{J,1}β₁ + ··· + r_{J,K}β_K = q_J,

  or, in matrix form:

  Rβ = q. (15)

Example 4.8 (Advertising and sales (Examples 4.3 and 4.7))
Is the effect of advertising on TV (one unit) equal to that of advertising on radio (one unit)?
⇒ R = [0, 1, −1, 0], q = 0.
94 / 295
- Discrepancy vector: m = Rb − q. Under H₀ (Rβ = q, Eq. 16):

  E(m|X) = Rβ − q = 0 and Var(m|X) = R Var(b|X) R′.

- Under A.1 to A.4, Var(m|X) = σ²R(X′X)⁻¹R′ (see Prop. 4.1).
- Test: Under A.1 to A.5 (we need the normality assumption) and under H₀, we have:

  W = m′Var(m|X)⁻¹m ~ χ²(J) (17)

  (see Prop. 9.9).
- However, σ² is unknown. Hence we cannot compute W. Replacing σ² with s², the distribution of this new statistic is not χ²(J) any more; it is an F distribution (next slide).
95 / 295
Proposition 4.7 (Definition and distribution of the F stat.)

F = (W/J)(σ²/s²) = m′(R(X′X)⁻¹R′)⁻¹m / (s²J) ~ F(J, n − K), (18)

where F(J, n − K) is the distribution of the F-statistic (see Def. 9.4).

Proof According to Eq. (17), W/J ~ χ²(J)/J. Consider the denominator s²/σ². We have s² = e′e/(n − K) (Eq. (11)) and e′e = ε′Mε; by Prop. 9.2, the latter term is ~ σ²χ²(n − K). The ratio of the two χ² variables, each divided by its degrees of freedom, is therefore F(J, n − K)-distributed. ∎
For large n − K, the F(J, n − K) distribution converges to F(J, ∞) = χ²(J)/J.
96 / 295
Proposition 4.8 (Other expressions of the F-statistic)
The F-statistic defined by Eq. (18) is also equal to:

F = [ (R² − R*²)/J ] / [ (1 − R²)/(n − K) ] = [ (SSR_restr − SSR_unrestr)/J ] / [ SSR_unrestr/(n − K) ], (19)

where R*² is the coef. of determination (Eq. 6) of the “restricted regression”. (SSR: sum of squared residuals, see Slide 79.)

Proof Let's denote by e* = y − Xb* the vector of residuals associated with the “restricted regression” (i.e. Rb* = q). We have e* = e − X(b* − b). Using e′X = 0, we get:

e*′e* = e′e + (b* − b)′X′X(b* − b) ≥ e′e.

By Prop. 9.3, we know that b* − b = −(X′X)⁻¹R′{R(X′X)⁻¹R′}⁻¹(Rb − q). Therefore:

e*′e* − e′e = (Rb − q)′[R(X′X)⁻¹R′]⁻¹(Rb − q).

This implies that the F-statistic defined in Prop. 4.7 is also equal to:

[ (e*′e* − e′e)/J ] / [ e′e/(n − K) ]. ∎

The null hypothesis H₀ (Eq. 16) of the F-test is rejected if F, defined by Eq. (18) or (19), is higher than F_{1−α}(J, n − K). [one-sided test: see Slide 287]
97 / 295
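The equality of the two expressions (18) and (19) of the F-statistic can be confirmed numerically. A sketch with simulated data, testing H₀: β₂ = β₃ (so that J = 1 and the restricted regression uses the single regressor x₂ + x₃):

```python
import numpy as np

rng = np.random.default_rng(7)
n, K = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.8, 0.8]) + rng.normal(size=n)

b = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ b
s2 = e @ e / (n - K)

# H0: beta_2 = beta_3, i.e. R = [0, 1, -1], q = 0, J = 1
R = np.array([[0.0, 1.0, -1.0]])
J = 1
m = R @ b
XtX_inv = np.linalg.inv(X.T @ X)
F_18 = (m @ np.linalg.inv(R @ XtX_inv @ R.T) @ m) / (s2 * J)   # Eq. (18)

# Restricted regression: impose beta_2 = beta_3 -> regress y on [1, x2 + x3]
Xr = np.column_stack([X[:, 0], X[:, 1] + X[:, 2]])
er = y - Xr @ np.linalg.lstsq(Xr, y, rcond=None)[0]
F_19 = ((er @ er - e @ e) / J) / (e @ e / (n - K))             # Eq. (19)

assert np.allclose(F_18, F_19)
```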
Prediction

- We consider the model:

  y_i = x_i′β + ε_i, ε_i ~ N(0, σ²).

- Prediction exercise: we want to predict y*, which would be associated with a new set of regressors x*.
- The Gauss-Markov Theorem (4.1) implies that, under A.1 to A.4, ŷ* = x*′b is the minimum-variance linear unbiased estimator of x*′β.
- The variance of the forecast error e* = y* − ŷ* is:

  Var(e*|X, x = x*) = x*′{σ²(X′X)⁻¹}x* + σ². (20)

- The prediction variance is estimated by replacing σ² by s².
98 / 295
- Let's focus on the specific case where the regression includes a constant term and a single regressor, i.e. each row of X is of the form [1, w_i]. (This is Example 4.5.)
- In this case, we have x* = [1, w*]′ and, using the expression of (X′X)⁻¹ computed in Example 4.5 (Eq. 20), we have:

  Var( y* − x*′b | X, x = x* ) = σ² [ 1 + 1/n + (w* − w̄)²/Σ_i(w_i − w̄)² ], (21)

  where y* − x*′b is the forecast error.

[Three scatter plots of simulated (x, y) data, each showing the theoretical regression line and the estimated regression line.]
100 / 295
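The closed-form prediction variance (Eq. 21) can be checked against the general formula (Eq. 20) in the single-regressor case (a sketch with simulated w and a hypothetical σ² and prediction point w*):

```python
import numpy as np

rng = np.random.default_rng(8)
n = 40
w = rng.normal(size=n)
X = np.column_stack([np.ones(n), w])
sigma2 = 0.5                     # hypothetical error variance
w_star = 1.7                     # hypothetical prediction point
x_star = np.array([1.0, w_star])

# General formula (Eq. 20)
var_gen = x_star @ (sigma2 * np.linalg.inv(X.T @ X)) @ x_star + sigma2
# Bivariate closed form (Eq. 21)
var_biv = sigma2 * (1 + 1 / n + (w_star - w.mean()) ** 2 / np.sum((w - w.mean()) ** 2))
assert np.allclose(var_gen, var_biv)
```

As Eq. (21) suggests, the further w* is from the sample mean w̄, the wider the prediction interval.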
[Scatter plot of Income data (values between 0 and 150) against a regressor.]
101 / 295
Potential common pitfalls

- Model: y_i = β₁x_{i,1} + β₂x_{i,2} + ε_i, where all variables are zero-mean, and:

  (X′X)⁻¹ = 1/( Σ_i x_{i,1}² Σ_i x_{i,2}² − (Σ_i x_{i,1}x_{i,2})² ) × [ Σ_i x_{i,2}², −Σ_i x_{i,1}x_{i,2} ; −Σ_i x_{i,1}x_{i,2}, Σ_i x_{i,1}² ].

- The inverse of the upper-left entry of (X′X)⁻¹ is:

  Σ_i x_{i,1}² − (Σ_i x_{i,1}x_{i,2})²/Σ_i x_{i,2}² = Σ_i x_{i,1}² (1 − correl²_{1,2}), (22)

  where correl_{1,2} is the sample correlation between x₁ and x₂.
- Hence, the closer correl_{1,2} is to one, the higher the variance of b₁ (recall that the variance of b₁ is the upper-left component of σ²(X′X)⁻¹).
102 / 295
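Eq. (22) and the resulting variance inflation can be illustrated numerically (a sketch: simulated zero-mean regressors with increasing correlation):

```python
import numpy as np

rng = np.random.default_rng(9)
n = 500
x2 = rng.normal(size=n)

variances = []
for rho in (0.0, 0.9, 0.99):
    x1 = rho * x2 + np.sqrt(1 - rho ** 2) * rng.normal(size=n)
    x1c, x2c = x1 - x1.mean(), x2 - x2.mean()   # zero-mean variables, as in the slide
    X = np.column_stack([x1c, x2c])
    var_b1 = np.linalg.inv(X.T @ X)[0, 0]       # Var(b1) up to the factor sigma^2
    # Eq. (22): the inverse of the upper-left entry of (X'X)^{-1}
    r = np.corrcoef(x1c, x2c)[0, 1]
    assert np.allclose(1 / var_b1, np.sum(x1c ** 2) * (1 - r ** 2))
    variances.append(var_b1)
# variances grows as the correlation approaches one
```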
Potential common pitfalls

Remark 4.10 (Omitted variables)
- “True model”:

  y = X₁β₁ + X₂β₂ + ε,

  where X₁ is n×K₁, β₁ is K₁×1, X₂ is n×K₂, and β₂ is K₂×1.
- “Estimated model”: y = X₁β₁ + ε. The OLS estimator b₁ of β₁ then satisfies:

  E(b₁|X) = β₁ + (X₁′X₁)⁻¹(X₁′X₂) β₂,

  where (X₁′X₁)⁻¹(X₁′X₂) is of dimension K₁×K₂ (the columns of (X₁′X₁)⁻¹(X₁′X₂) are the OLS coefficients obtained when regressing the columns of X₂ on X₁).
103 / 295
Potential common pitfalls

Example 4.9 (Omitted variables & Effect of education on wages)

  wage_i = β₀ + β₁edu_i + β₂ability_i + ε_i, ε_i ~ i.i.d. N(0, σ²_ε). (23)

- Further, we assume that the edu variable is correlated to the ability variable: edu_i = α₀ + α₁ability_i + η_i, with η_i ~ i.i.d. N(0, σ²_η) independent of ε_i.
- Assume we mistakenly run the regression omitting the ability variable:

  wage_i = γ₀ + γ₁edu_i + ξ_i. (24)

- It can be seen that ξ_i = ε_i − (β₂/α₁)η_i ~ i.i.d. N(0, σ²_ε + (β₂/α₁)²σ²_η) and that the population regression coefficient is γ₁ = β₁ + β₂/α₁ ≠ β₁.
104 / 295
Potential common pitfalls

Example 4.10 (Omitted variables & Effect of class size on performances)

Figure 20: California school district data
[Scatter plot of Test score (620 to 700) against student-teacher ratio (14 to 26).]
105 / 295
Potential common pitfalls

Table 9: Regression of Test score on Student-teacher ratio.

Estimate Std. Error t value Pr(>|t|)
(Intercept) 700.15 4.69 149.42 0.00
StockWatsonEduc$str -1.00 0.24 -4.18 0.00

Table 10: Regression of Test score on Student-teacher ratio.
106 / 295
Potential common pitfalls

Remark 4.11 (Irrelevant variable)
- “True model”: y = X₁β₁ + ε
- “Estimated model”: y = X₁β₁ + X₂β₂ + ε
- Unbiased estimate. Problem however: the variance of the estimate of β₁ resulting from this regression is higher than that stemming from the regression of the true model. One may then wrongly infer that variables are not relevant when they truly are (Type-II errors).
107 / 295
Least Squares: Large Sample Properties

In this section, we relax the normality assumption A.5. (We will see later how to deal with partial relaxations of Assumptions A.3 and A.4.)

Preview of OLS large-sample properties under A.1 to A.4
- Under A.1 to A.4, even if the residuals are not normally distributed, the least squares estimator can be asymptotically normal, and inference can be performed as in small samples when A.1 to A.5 hold.
- This derives from Prop. 4.9 (below).
- The F-test (Prop. 4.8) and the t-test described in Slide 93 can then be used.
108 / 295
Proposition 4.9 (Asymp. distri. of b with independent observations)
Under Assumptions A.1 to A.4, and assuming further that:

Q = plim_{n→∞} X′X/n, (25)

and that the (x_i, ε_i)s are independent (across entities i), we have:

√n (b − β) →d N(0, σ²Q⁻¹). (26)

Proof Since b = β + (X′X/n)⁻¹(X′ε/n), we have: √n(b − β) = (X′X/n)⁻¹ (1/√n) X′ε. Since (X′X/n)⁻¹ →p Q⁻¹, the result follows by applying a central limit theorem to (1/√n) X′ε and the Slutsky theorem. ∎
109 / 295
Instrumental Variables

- Here, we want to relax Assumption A.2 (conditional mean zero assumption, implying in particular that x_i and ε_i are uncorrelated).
- Model:

  y_i = x_i′β + ε_i, where E(ε_i) = 0 and x_i ⊥̸ ε_i. (27)

Definition 4.3 (Valid set of instruments)
The L-dimensional random variable z_i is a valid set of instruments if (a) z_i is correlated with x_i (relevance) and (b) E(z_iε_i) = 0 (exogeneity).

Example 4.11 (A situation where E(ε_i) = 0 and x_i ⊥̸ ε_i)
- Let us make the assumption x_i ⊥̸ ε_i in Model (27) more precise:

  E(ε_i) = 0 and E(ε_i x_i) = γ ≠ 0;

  by the law of large numbers (Theorem 2.1), plim_{n→∞} X′ε/n = γ.
- If Q_xx := plim X′X/n, the OLS estimator is not consistent because:

  b = β + (X′X)⁻¹X′ε →p β + Q_xx⁻¹γ ≠ β.
110 / 295
- If z_i is a valid set of instruments, we have:

  plim(Z′y/n) = plim(Z′(Xβ + ε)/n) = plim(Z′X/n) β.

  Indeed, by the law of large numbers (Theorem 2.1), Z′ε/n →p E(z_iε_i) = 0.
- If L = K, the matrix Z′X/n is of dimension K × K and we have:

  [plim(Z′X/n)]⁻¹ plim(Z′y/n) = β.

- By continuity of the inverse funct.: [plim(Z′X/n)]⁻¹ = plim[(Z′X/n)⁻¹].
- The Slutsky Theorem (Prop. 9.4) further implies that

  plim[(Z′X/n)⁻¹] plim(Z′y/n) = plim[(Z′X/n)⁻¹ (Z′y/n)].

- Hence b_iv is consistent if it is defined by:

  b_iv = (Z′X)⁻¹Z′y.
111 / 295
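The inconsistency of OLS under endogeneity, and the consistency of b_iv = (Z′X)⁻¹Z′y when L = K, can be illustrated with a simulation in the spirit of Example 4.11 (a sketch: a common shock u makes x endogenous, while z is correlated with x but not with the error term):

```python
import numpy as np

rng = np.random.default_rng(10)
n = 100_000
beta = 1.0

z = rng.normal(size=n)        # instrument: correlated with x, uncorrelated with eps
u = rng.normal(size=n)        # common shock creating endogeneity
eps = u + rng.normal(size=n)  # E(x_i eps_i) != 0 through u
x = z + u + rng.normal(size=n)
y = beta * x + eps

b_ols = (x @ y) / (x @ x)     # inconsistent: plim = beta + gamma / Q_xx
b_iv = (z @ y) / (z @ x)      # (Z'X)^{-1} Z'y with L = K = 1: consistent
```

With this design, b_ols concentrates around beta plus a positive bias, while b_iv concentrates around beta.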
Proposition 4.10 (Asymp. distri. of b_iv)
If z_i is an L-dimensional random variable that constitutes a valid set of instruments (see Def. 4.3) and if L = K, then the asymptotic distribution of b_iv is:

b_iv →d N( β, (σ²/n) [Q_xz Q_zz⁻¹ Q_zx]⁻¹ ),

where plim Z′Z/n =: Q_zz, plim Z′X/n =: Q_zx, plim X′Z/n =: Q_xz.

Proof The proof is very similar to that of Prop. 4.9, the starting point being that b_iv = β + (Z′X)⁻¹Z′ε. ∎

- Note that [Q_xz Q_zz⁻¹ Q_zx]⁻¹ = Q_zx⁻¹ Q_zz Q_xz⁻¹.
- In practice, to estimate Var(b_iv) = (σ²/n) Q_zx⁻¹ Q_zz Q_xz⁻¹, we replace σ² by:

  s²_iv = (1/n) Σ_{i=1}^n (y_i − x_i′b_iv)².
112 / 295
- And when L > K?
- Idea: First regress X on the space spanned by Z and then regress y on the fitted values X̂ := Z(Z′Z)⁻¹Z′X. That is, b_iv = (X̂′X̂)⁻¹X̂′y:

  b_iv = [X′Z(Z′Z)⁻¹Z′X]⁻¹ X′Z(Z′Z)⁻¹Z′y. (28)

- In this case, Prop. 4.10 still holds, with b_iv given by Eq. (28).
- b_iv is also the result of the regression of y on X*, where the columns of X* are the fitted values of the regressions of the columns of X on Z.

Definition 4.4 (Weak instruments)
- If the instruments do not properly satisfy Condition (a) in Def. 4.3 (i.e. if x_i and z_i are only loosely related), the instruments are said to be weak.
- This problem is for instance discussed in Stock and Yogo (2003). See also Stock and Watson pp. 489-490.
113 / 295
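The two-stage reading of Eq. (28) can be sketched numerically for L > K (simulated data with one endogenous regressor and two instruments; the closed form and the "regress y on the fitted values" recipe coincide exactly):

```python
import numpy as np

rng = np.random.default_rng(11)
n, L = 2_000, 2                      # L = 2 instruments for K = 1 endogenous regressor
Z = rng.normal(size=(n, L))
u = rng.normal(size=n)               # common shock: makes x endogenous
x = Z @ np.array([0.8, -0.5]) + u + rng.normal(size=n)
y = 2.0 * x + u + rng.normal(size=n)
X = x[:, None]

# First stage: fitted values Xhat = Z (Z'Z)^{-1} Z' X
Xhat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
# Second stage: b_iv = (Xhat'Xhat)^{-1} Xhat'y
b_2sls = np.linalg.solve(Xhat.T @ Xhat, Xhat.T @ y)

# Closed form of Eq. (28): [X'Z(Z'Z)^{-1}Z'X]^{-1} X'Z(Z'Z)^{-1}Z'y
A = X.T @ Z @ np.linalg.inv(Z.T @ Z)
b_closed = np.linalg.solve(A @ Z.T @ X, A @ Z.T @ y)
```

Both computations give the same vector, close to the (hypothetical) true coefficient 2.0.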
where MPI is the Moore-Penrose pseudo-inverse (see Def. 9.3).
- Under the null hypothesis, H ~ χ²(q), where q is the rank of
114 / 295
Example 4.12 (Estimation of price elasticity)
- See e.g. WHO and estimation of tobacco price elasticity of demand.
- Demand: q_t^d = α₀ + α₁p_t + α₂w_t + ε_t^d; supply: q_t^s = γ₀ + γ₁p_t + γ₂y_t + ε_t^s (y_t: cost factors; quantities and prices in logs), where y_t, w_t, ε_t^s ~ N(0, σ_s²) and ε_t^d ~ N(0, σ_d²) are independent.
- Equilibrium: q_t^d = q_t^s. This implies that prices are endogenous:

  p_t = ( α₀ + α₂w_t + ε_t^d − γ₀ − γ₂y_t − ε_t^s ) / (γ₁ − α₁).

- In particular, we have E(p_tε_t^d) = σ_d²/(γ₁ − α₁) ≠ 0.
⇒ Regressing by OLS q_t^d on p_t gives biased estimates (see Example 4.11).
115 / 295
Example 4.12 (Estimation of price elasticity, continued)

Figure 21: Demand and Supply
[log(quantities), between −0.5 and 3.0, against prices: four demand curves (Demand 1 to 4) and four supply curves (Supply 1 to 4), with equilibrium points marked.]
116 / 295
Example 4.12 (Estimation of price elasticity, continued)
- Estimation of the price elasticity of cigarette demand. Instrument: real tax on cigarettes arising from the state's general sales tax. Presumption: in states with a larger general sales tax, cigarette prices are higher, but the general tax is not determined by other forces affecting ε_t^d.

Table 11: IV Regression Results

                          Dependent variable: log(packs)
  log(rprice)              -1.277***  (0.263)
  log(rincome)              0.280     (0.239)
  Constant                  9.895***  (1.059)
  Observations             48
  R²                        0.429
  Adjusted R²               0.404
  Residual Std. Error       0.188 (df = 45)
  Note: *p<0.1; **p<0.05; ***p<0.01
Example 4.13 (Education and Wages)
- Potential problem with the OLS regression of wages on education: omitted variables (see Example 4.9).
- Card (1993)'s idea: use distance to college as an instrument.

Table 12: Effect of education on wage
General Regression Model
- We want to relax the assumption according to which the disturbances are uncorrelated with each other (Assumption A.4), as well as the homoskedasticity assumption A.3.
- We replace the latter two assumptions by the general formulation:

$$E(\varepsilon\varepsilon'|X) = \Omega. \quad (29)$$

- Note that Eq. (29) is more general than Assumptions A.3 and A.4, because the diagonal entries of Ω may be different (not the case under A.3) and its off-diagonal entries may be nonzero (not the case under A.4).

Definition 4.6 (General Regression Model)
Assumptions A.1 and A.2, together with Eq. (29), form the General Regression Model (GRM) framework.

Remark 4.12 (A specific case)
A regression model where Assumptions A.1 to A.4 hold is a specific case of the General Regression Model (GRM) framework.
- The GRM framework notably allows one to model heteroskedasticity and autocorrelation.
- Heteroskedasticity:

$$\Omega = \begin{bmatrix} \sigma_1^2 & 0 & \dots & 0 \\ 0 & \sigma_2^2 & & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \dots & \sigma_n^2 \end{bmatrix}. \quad (30)$$

- Autocorrelation:

$$\Omega = \sigma^2\begin{bmatrix} 1 & \rho_{2,1} & \dots & \rho_{n,1} \\ \rho_{2,1} & 1 & & \vdots \\ \vdots & & \ddots & \rho_{n,n-1} \\ \rho_{n,1} & \rho_{n,2} & \dots & 1 \end{bmatrix}. \quad (31)$$
Example 4.14 (Autocorrelation in the time-series context)
- Autocorrelation is, in particular, a recurrent problem when time-series data are used (see Section 8).
- In this case, we are in the GRM context, with:

$$\Omega = \frac{\sigma_v^2}{1-\rho^2}\begin{bmatrix} 1 & \rho & \dots & \rho^{n-1} \\ \rho & 1 & & \vdots \\ \vdots & & \ddots & \rho \\ \rho^{n-1} & \dots & \rho & 1 \end{bmatrix}. \quad (34)$$
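The covariance matrix of Eq. (34) can be built and sanity-checked against a simulated AR(1) path. This is a sketch: the helper name and the parameter values are illustrative.

```python
import numpy as np

def ar1_omega(n, rho, sigma_v=1.0):
    """Eq. (34): Omega[i, j] = sigma_v^2 / (1 - rho^2) * rho^{|i-j|}."""
    idx = np.arange(n)
    return sigma_v**2 / (1.0 - rho**2) * rho ** np.abs(idx[:, None] - idx[None, :])

# Compare with moments of a long simulated AR(1): eps_t = rho*eps_{t-1} + v_t
rng = np.random.default_rng(1)
rho, T = 0.8, 100_000
v = rng.normal(size=T)
eps = np.empty(T)
eps[0] = v[0] / np.sqrt(1 - rho**2)          # start in the stationary distribution
for t in range(1, T):
    eps[t] = rho * eps[t - 1] + v[t]

Om = ar1_omega(3, rho)
var_hat = np.var(eps)                        # should be close to Om[0, 0] = 1/(1 - rho^2)
acov1_hat = np.mean(eps[1:] * eps[:-1])      # should be close to Om[0, 1] = rho/(1 - rho^2)
print(Om[0, 0], var_hat, Om[0, 1], acov1_hat)
```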
Definition 4.7 (Generalized Least Squares)
Let P be such that Ω⁻¹ = PP′. Premultiplying the model by P′ gives

$$P'y = P'X\beta + P'\varepsilon \quad \text{or} \quad y^* = X^*\beta + \varepsilon^*.$$

Applying OLS to the transformed model yields:

$$b_{GLS} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}y. \quad (35)$$

- b_GLS = Generalized least squares estimator of β.
- We have:

$$Var(b_{GLS}|X) = (X'\Omega^{-1}X)^{-1}.$$
- When Ω is unknown, the GLS estimator is said to be infeasible. Some structure is required.
- Assume Ω admits a parametric form Ω(θ). The estimation becomes "feasible" (FGLS) if one replaces Ω(θ) with Ω(θ̂).
- Consider the case presented in Example 4.14.
  - Because the OLS estimate b of β is consistent, the estimates e_i of the ε_i also are.
  - Consistent estimators of ρ and σ_v are then obtained by regressing the e_i on the e_{i-1}.
  - Using these estimates in Eq. (34) provides a consistent estimate of Ω.
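The FGLS steps just listed (OLS, estimate ρ from the residuals, plug Ω̂ into Eq. 35) can be sketched as follows; the data-generating process and parameter values are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
n, rho, beta = 400, 0.8, 1.0

# Simulate y = x*beta + eps with AR(1) errors (v_t i.i.d. N(0, 1))
x = rng.normal(size=n)
v = rng.normal(size=n)
eps = np.empty(n)
eps[0] = v[0] / np.sqrt(1 - rho**2)
for t in range(1, n):
    eps[t] = rho * eps[t - 1] + v[t]
y = beta * x + eps
X = x.reshape(-1, 1)

# Step 1: OLS (consistent) and its residuals
b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b

# Step 2: estimate rho by regressing e_t on e_{t-1}
rho_hat = (e[1:] @ e[:-1]) / (e[:-1] @ e[:-1])

# Step 3: build Omega-hat from Eq. (34) (the sigma_v^2 factor cancels in Eq. 35)
idx = np.arange(n)
Om = rho_hat ** np.abs(idx[:, None] - idx[None, :]) / (1 - rho_hat**2)

# Step 4: FGLS, Eq. (35)
Oi = np.linalg.inv(Om)
b_fgls = np.linalg.solve(X.T @ Oi @ X, X.T @ Oi @ y)
print(b[0], rho_hat, b_fgls[0])
```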
Proposition 4.11 (Finite sample properties of b in the GRM (Def. 4.6))
Conditionally on X, we have:

$$Var(b|X) = \frac{1}{n}\left(\frac{1}{n}X'X\right)^{-1}\left(\frac{1}{n}X'\Omega X\right)\left(\frac{1}{n}X'X\right)^{-1}. \quad (36)$$

Under A.5, since b is linear in ε,

$$b|X \sim N\left(\beta,\; (X'X)^{-1}(X'\Omega X)(X'X)^{-1}\right). \quad (37)$$

Remark 4.13
The variance of the estimator is no longer σ²(X′X)⁻¹, so using s²(X′X)⁻¹ to conduct inference is inappropriate.

Proposition 4.12 (OLS consistency in the GRM (Def. 4.6))
If plim(X′X/n) and plim(X′ΩX/n) are finite positive definite matrices, then plim b = β.

Proof: Var(b) = E[Var(b|X)] + Var[E(b|X)]. Since E(b|X) = β, Var[E(b|X)] = 0. Eq. (36) implies that Var(b|X) → 0. Hence b converges in mean square and therefore in probability (Prop. 9.5). ∎
Proposition 4.13 (Asymptotic distribution of b in the GRM)
If Q_xx = plim(X′X/n) and Q_xΩx = plim(X′ΩX/n) are finite positive definite matrices, then:

$$\sqrt{n}(b - \beta) \xrightarrow{d} N\left(0,\; Q_{xx}^{-1} Q_{x\Omega x} Q_{xx}^{-1}\right).$$

Proposition 4.14 (Asymptotic distri. of b_iv in the GRM)

$$\sqrt{n}(b_{iv} - \beta) \xrightarrow{d} N(0, V_{iv}),$$

where

$$V_{iv} = Q^* \, \text{plim}\left(\frac{1}{n} Z'\Omega Z\right) (Q^*)', \quad \text{with} \quad Q^* = \left[Q_{xz}Q_{zz}^{-1}Q_{zx}\right]^{-1} Q_{xz}Q_{zz}^{-1}.$$
- For practical purposes, one needs estimates of Ω in Props. 4.11, 4.13 or 4.14.
- Idea: instead of estimating Ω (dimension n × n) directly, one can estimate (1/n)X′ΩX (dimension K × K), since it is this expression that eventually appears in the formulas (for instance in Eq. 36).
- We have:

$$\frac{1}{n}X'\Omega X = \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}\sigma_{i,j}\, x_i x_j'.$$

- The computation of such estimates is based on the fact that if b is consistent, then the e_i are consistent (pointwise) estimators of the ε_i.
First case: heteroskedasticity
- Under heteroskedasticity (Eq. 30), one can show that:

$$\text{plim}\;\frac{1}{n}\sum_{i=1}^{n}\sigma_i^2\, x_i x_i' = \text{plim}\;\frac{1}{n}\sum_{i=1}^{n} e_i^2\, x_i x_i'. \quad (39)$$

- The estimator of (1/n)X′ΩX therefore is:

$$\frac{1}{n}X'E^2X,$$

where E is an n × n diagonal matrix whose diagonal elements are the estimated residuals e_i.
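Plugging (1/n)X′E²X into the sandwich formula of Eq. (36) gives the heteroskedasticity-robust (White) covariance estimator. A sketch on simulated data, assuming a made-up design with error variance x_i² (standard normal x, chosen here so that all moments are finite):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000
x = rng.normal(size=n)
eps = rng.normal(scale=np.abs(x), size=n)   # heteroskedastic: Var(eps_i) = x_i^2
y = x + eps
X = np.column_stack([np.ones(n), x])

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b

# Standard OLS covariance: s^2 (X'X)^{-1}
XtXi = np.linalg.inv(X.T @ X)
s2 = e @ e / (n - X.shape[1])
V_ols = s2 * XtXi

# White estimator: (X'X)^{-1} (X'E^2 X) (X'X)^{-1}, with E = diag(e_i)
meat = (X * e[:, None] ** 2).T @ X          # = sum_i e_i^2 x_i x_i'
V_white = XtXi @ meat @ XtXi

print(np.sqrt(V_ols[1, 1]), np.sqrt(V_white[1, 1]))
```

In this design, large errors occur exactly where |x| is large, so the robust standard error of the slope is markedly larger than the conventional one.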
Figure 22: Salaries versus time since PhD obtention. [Scatter plot of salary against years since PhD.]

Data from Fox J. and Weisberg, S. (2011).
- Let us illustrate the influence of heteroskedasticity using simulations.
- We consider the following model:

$$y_i = x_i + \varepsilon_i, \quad \varepsilon_i \sim N(0, x_i^2),$$

where the x_i are i.i.d. t(4).
- Here is a simulated sample (n = 200) of this model:

Figure 23: [Scatter plot of y against x; the dispersion of y around the regression line increases with |x|.]
- We simulate 1000 samples of the same model with n = 200.
- For each sample, we compute the OLS estimate of β (= 1).
- Using these 1000 estimates of b, we construct an approximated (kernel-based) distribution of this OLS estimator (in red on the figure).
- For each of the 1000 OLS estimations, we employ the standard OLS variance formula (s²(X′X)⁻¹) to estimate the variance of b. The blue curve is a normal distribution centred on 1 whose variance is the average of the 1000 previous variance estimates.

Figure 24: Heteroskedasticity and the inappropriateness of the standard OLS variance formula. [Densities of b: the kernel-based distribution of the simulated estimates is wider than the normal density implied by the standard OLS variance formula.]
- The variance of the simulated b is 0.040 (that is the "true" one); the average of the estimated variances based on the standard OLS formula is 0.005 (a "bad" estimate); the average of the estimated variances based on the White robust covariance matrix is 0.030 (a "better" estimate).
- The standard OLS formula for the variance of b therefore overestimates the precision of the estimator.
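A Monte Carlo experiment in the spirit of the one above can be sketched as follows. The design is simplified relative to the slides (a single regressor without intercept, standard normal x instead of t(4), so that moments are finite); the qualitative conclusion is the same: the standard formula understates the true sampling variance.

```python
import numpy as np

rng = np.random.default_rng(4)
n, n_sim = 200, 2000
b_draws = np.empty(n_sim)
v_std = np.empty(n_sim)
for s in range(n_sim):
    x = rng.normal(size=n)
    eps = rng.normal(scale=np.abs(x))        # heteroskedastic: Var(eps_i) = x_i^2
    y = x + eps                              # true beta = 1
    b = (x @ y) / (x @ x)                    # OLS without intercept
    e = y - b * x
    b_draws[s] = b
    v_std[s] = (e @ e / (n - 1)) / (x @ x)   # standard formula s^2 (X'X)^{-1}

true_var = b_draws.var()                     # "true" sampling variance of b
avg_std = v_std.mean()                       # average standard-formula estimate
print(true_var, avg_std)
```

Here the true sampling variance is roughly three times the average of the standard-formula estimates.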
Second case: HAC
- This includes the cases of Eqs. (30) and (31).
- Newey and West (1987): if the correlation between terms i and j gets sufficiently small when |i − j| increases:

$$\text{plim}\left(\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}\sigma_{i,j}\, x_i x_j'\right) = \text{plim}\left(\frac{1}{n}\sum_{t=1}^{n} e_t^2\, x_t x_t' + \frac{1}{n}\sum_{\ell=1}^{L} w_\ell \sum_{t=\ell+1}^{n} e_t e_{t-\ell}\left(x_t x_{t-\ell}' + x_{t-\ell} x_t'\right)\right). \quad (40)$$
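Eq. (40) can be sketched in code. The weights w_ℓ are not specified above; the sketch uses the standard Newey-West (Bartlett) choice w_ℓ = 1 − ℓ/(L + 1), and a made-up AR(1) design for the demonstration:

```python
import numpy as np

def newey_west_cov(X, e, L):
    """HAC covariance of b built from Eq. (40), Bartlett weights w_l = 1 - l/(L+1)."""
    n, k = X.shape
    Xe = X * e[:, None]
    S = Xe.T @ Xe / n                        # lag-0 term: (1/n) sum e_t^2 x_t x_t'
    for l in range(1, L + 1):
        w = 1.0 - l / (L + 1.0)
        G = Xe[l:].T @ Xe[:-l] / n           # (1/n) sum_t e_t e_{t-l} x_t x_{t-l}'
        S += w * (G + G.T)
    XtXi = np.linalg.inv(X.T @ X / n)
    return XtXi @ S @ XtXi / n

# Demo: AR(1) regressor and AR(1) errors (illustrative parameter values)
rng = np.random.default_rng(5)
n = 500
x = np.zeros(n); eps = np.zeros(n)
for t in range(1, n):
    x[t] = 0.8 * x[t - 1] + rng.normal()
    eps[t] = 0.8 * eps[t - 1] + rng.normal()
y = x + eps
X = x.reshape(-1, 1)
b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b
V_hac = newey_west_cov(X, e, L=8)
V_std = (e @ e / (n - 1)) * np.linalg.inv(X.T @ X)
print(V_std[0, 0], V_hac[0, 0])
```

With positive autocorrelation in both the regressor and the error, the HAC variance is larger than the standard one.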
- Let us illustrate the influence of autocorrelation using simulations.
- We consider the following model:

$$y_i = x_i + \varepsilon_i, \quad (41)$$

where the x_i and the ε_i are such that:

$$x_i = 0.8\,x_{i-1} + u_i \quad \text{and} \quad \varepsilon_i = 0.8\,\varepsilon_{i-1} + v_i, \quad (42)$$

where the u_i and the v_i are i.i.d. N(0, 1).
- Here is a simulated sample (n = 200) of this model:

Figure 25: Simulation of Eqs. (41) and (42) on 200 periods. [Time-series plot of y_t, with x_t in red and ε_t in blue; both series are persistent.]
- We simulate 1000 samples of the same model with n = 200.
- For each sample, we compute the OLS estimate of β (= 1).
- Using these 1000 estimates of b, we construct an approximated (kernel-based) distribution of this OLS estimator (in red on the figure).
- For each of the 1000 OLS estimations, we employ the standard OLS variance formula (s²(X′X)⁻¹) to estimate the variance of b. The blue curve is a normal distribution centred on 1 whose variance is the average of the 1000 previous variance estimates.

Figure 26: Auto-correlation and the inappropriateness of the standard OLS variance formula. [Densities of b: the kernel-based distribution of the simulated estimates is wider than the normal density implied by the standard OLS variance formula.]
- The variance of the simulated b is 0.020 (that is the "true" one); the average of the estimated variances based on the standard OLS formula is 0.005 (a "bad" estimate); the average of the estimated variances based on the White robust covariance matrix is 0.015 (a "better" estimate).
- The standard OLS formula for the variance of b therefore overestimates the precision of the estimator.
- For the sake of comparison, let us consider a model with no auto-correlation (x_i ∼ i.i.d. N(0, 2.8) and ε_i ∼ i.i.d. N(0, 2.8)).

[Figure: scatter plot of y_t against x_t; time series of x_t (red) and ε_t (blue); density function of b, concentrated around 1 (horizontal axis from 0.8 to 1.3).]
How to detect autocorrelation in residuals?
- Consider the usual regression (say Eq. 32).
- The Durbin-Watson test is a typical autocorrelation test. Its test statistic is:

$$DW = \frac{\sum_{i=2}^{n}(e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2} = 2(1 - r) - \underbrace{\frac{e_1^2 + e_n^2}{\sum_{i=1}^{n} e_i^2}}_{\stackrel{p}{\to} 0}, \quad \text{where} \quad r = \frac{\sum_{i=2}^{n} e_i e_{i-1}}{\sum_{i=1}^{n-1} e_i^2}.$$

(r is a consistent estimator of Cor(ε_i, ε_{i-1}), i.e. ρ in Eq. 33.)
- Critical values depend only on T and K: see e.g. tables.
- The one-sided test of H₀: ρ = 0 against H₁: ρ > 0 is carried out as follows:
  - if DW < d_L, the null hypothesis is rejected;
  - if DW > d_U, the null hypothesis is not rejected;
  - if d_L ≤ DW ≤ d_U, no conclusion is drawn.
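The DW statistic and its relation to 2(1 − r) can be checked on simulated residuals; the series below are made up for illustration:

```python
import numpy as np

def durbin_watson(e):
    """DW = sum (e_i - e_{i-1})^2 / sum e_i^2, approximately 2(1 - r)."""
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

rng = np.random.default_rng(6)
n = 10_000

# Independent residuals: DW should be near 2
e_iid = rng.normal(size=n)
dw_iid = durbin_watson(e_iid)

# AR(1) residuals with rho = 0.8: DW should be near 2(1 - 0.8) = 0.4
e_ar = np.empty(n); e_ar[0] = rng.normal()
for t in range(1, n):
    e_ar[t] = 0.8 * e_ar[t - 1] + rng.normal()
dw_ar = durbin_watson(e_ar)

print(dw_iid, dw_ar)
```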
Table 13: Properties of the OLS estimator, depending on which assumptions hold (conditional mean-zero disturbances, homoskedasticity, uncorrelated residuals, normality of disturbances).
- Context:

$$y_{i,t} = x_{i,t}'\beta + z_i'\alpha + \varepsilon_{i,t}$$

- Objective: estimate the previous equation.
Figure 27: Simulated Panel Data. [Scatter plot of y against x.]
Figure 28: Simulated Panel Data. [Same scatter plot as Figure 27.]
Figure 29: Consumption of cigarettes versus price. [Scatter plot of log(Nb. of packs) against log(price).]

Cigarettes data from Stock and Watson (2003). Data for U.S. states, 2 years: 1985 and 1995.
Figure 30: Consumption of cigarettes versus price. [Same scatter plot as Figure 29: log(Nb. of packs) against log(price).]

Cigarettes data from Stock and Watson (2003). Data for U.S. states, 2 years: 1985 and 1995. Each colour corresponds to a given State.
Figure 31: Airlines: cost versus fuel price. [Scatter plot of log(cost) against log(fuel price).]

Airlines data from Greene (2003). Each colour corresponds to a given airline.
Notations

$$y_i = \begin{bmatrix} y_{i,1} \\ \vdots \\ y_{i,T} \end{bmatrix}, \quad \varepsilon_i = \begin{bmatrix} \varepsilon_{i,1} \\ \vdots \\ \varepsilon_{i,T} \end{bmatrix}, \quad x_i = \begin{bmatrix} x_{i,1}' \\ \vdots \\ x_{i,T}' \end{bmatrix}, \quad X = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix},$$

of dimensions T × 1, T × 1, T × K and (nT) × K, respectively.

$$\tilde{y}_i = \begin{bmatrix} y_{i,1} - \bar{y}_i \\ \vdots \\ y_{i,T} - \bar{y}_i \end{bmatrix}, \quad \tilde{\varepsilon}_i = \begin{bmatrix} \varepsilon_{i,1} - \bar{\varepsilon}_i \\ \vdots \\ \varepsilon_{i,T} - \bar{\varepsilon}_i \end{bmatrix}, \quad \tilde{x}_i = \begin{bmatrix} x_{i,1}' - \bar{x}_i' \\ \vdots \\ x_{i,T}' - \bar{x}_i' \end{bmatrix}, \quad \tilde{X} = \begin{bmatrix} \tilde{x}_1 \\ \vdots \\ \tilde{x}_n \end{bmatrix}, \quad \tilde{Y} = \begin{bmatrix} \tilde{y}_1 \\ \vdots \\ \tilde{y}_n \end{bmatrix},$$

where

$$\bar{y}_i = \frac{1}{T}\sum_{t=1}^{T} y_{i,t}, \quad \bar{\varepsilon}_i = \frac{1}{T}\sum_{t=1}^{T} \varepsilon_{i,t} \quad \text{and} \quad \bar{x}_i = \frac{1}{T}\sum_{t=1}^{T} x_{i,t}.$$
Three standard cases
- Fixed Effects: z_i is unobserved, but correlates with x_i ⇒ b is biased and inconsistent in the OLS regression of y on X (omitted variable, see Remark 4.10).
- Random Effects: z_i is unobserved, but uncorrelated with x_i. The model writes:

$$y_{i,t} = x_{i,t}'\beta + \alpha + \underbrace{u_i + \varepsilon_{i,t}}_{\text{compound disturbance}},$$

where α = E(z_i′α) and u_i = z_i′α − E(z_i′α) ⊥ x_i.
Least squares are consistent (but inefficient, see GLS Box 4.7).
Estimation of Fixed Effects Models

Assumptions:
(i) There is no perfect multicollinearity among the regressors.
(ii) E(ε_{i,t}|X) = 0, for all i, t.
(iii) E(ε_{i,t}ε_{j,s}|X) = σ² if i = j and s = t, and 0 otherwise.

Remark 5.1
These assumptions are the same as those introduced in the standard linear regression:
(i) ↔ A.1, (ii) ↔ A.2, (iii) ↔ A.3 + A.4.
- In matrix form, for a given i, the model writes:

$$y_i = X_i\beta + \mathbf{1}\alpha_i + \varepsilon_i,$$

where **1** is a T-dimensional vector of ones.
- This is the Least Square Dummy Variable (LSDV) model:

$$y = [X \; D]\begin{bmatrix} \beta \\ \alpha \end{bmatrix} + \varepsilon, \quad (43)$$

with:

$$D = \begin{bmatrix} \mathbf{1} & \mathbf{0} & \dots & \mathbf{0} \\ \mathbf{0} & \mathbf{1} & \dots & \mathbf{0} \\ & & \ddots & \\ \mathbf{0} & \mathbf{0} & \dots & \mathbf{1} \end{bmatrix},$$

of dimension (nT × n), where each entry is a T-dimensional vector (of ones or of zeros).
- The linear regression (43) –with the dummy variables– satisfies the assumptions of the standard linear regression.
- Denoting by Z the matrix [X D], and by b and a the respective OLS estimates of β and α, we have:

$$\begin{bmatrix} b \\ a \end{bmatrix} = [Z'Z]^{-1}Z'y. \quad (44)$$

- The asymptotic distribution of [b′, a′]′ derives from the standard OLS results.
- We have:

$$\begin{bmatrix} b \\ a \end{bmatrix} \xrightarrow{d} N\left(\begin{bmatrix} \beta \\ \alpha \end{bmatrix},\; \frac{\sigma^2}{nT} Q^{-1}\right), \quad (45)$$

where

$$Q = \text{plim}_{nT\to\infty}\; \frac{1}{nT} Z'Z.$$

- In practice, an estimator of the covariance matrix of [b′, a′]′ is:

$$s^2 \left(Z'Z\right)^{-1} \quad \text{with} \quad s^2 = \frac{e'e}{nT - K - n}.$$
- There is an alternative way of expressing the LSDV estimators.
- It is based on the matrix M_D = I − D(D′D)⁻¹D′, which acts as an operator that removes entity-specific means, i.e. (with the notation of Slide 146):

$$\tilde{Y} = M_D Y, \quad \tilde{X} = M_D X \quad \text{and} \quad \tilde{\varepsilon} = M_D \varepsilon.$$

- We have:

$$b = [X'M_D X]^{-1} X'M_D y. \quad (46)$$

- This amounts to regressing the ỹ_{i,t} (= y_{i,t} − ȳ_i) on the x̃_{i,t} (= x_{i,t} − x̄_i).
- The estimates of α are given by:

$$a = (D'D)^{-1}D'(y - Xb), \quad (47)$$

which is obtained by developing the second row of

$$\begin{bmatrix} X'X & X'D \\ D'X & D'D \end{bmatrix}\begin{bmatrix} b \\ a \end{bmatrix} = \begin{bmatrix} X'Y \\ D'Y \end{bmatrix},$$

which are the first-order conditions resulting from the least squares problem (see Eq. 5).
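The equivalence between the within (demeaned) regression of Eq. (46) and the LSDV regression of Eq. (43) can be checked on simulated panel data; the data-generating process below is made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(7)
n_ent, T = 30, 10

# Fixed-effects DGP: alpha_i correlates with x, so pooled OLS is biased
alpha = 2.0 * rng.normal(size=n_ent)
x = rng.normal(size=(n_ent, T)) + alpha[:, None]
y = x + alpha[:, None] + rng.normal(size=(n_ent, T))   # true beta = 1

# Within estimator, Eq. (46): demean by entity and regress
x_w = x - x.mean(axis=1, keepdims=True)
y_w = y - y.mean(axis=1, keepdims=True)
b_within = (x_w.ravel() @ y_w.ravel()) / (x_w.ravel() @ x_w.ravel())

# LSDV, Eq. (43): regress y on [x, entity dummies D]
D = np.kron(np.eye(n_ent), np.ones((T, 1)))            # (nT) x n dummy matrix
Z = np.column_stack([x.reshape(-1, 1), D])
b_lsdv = np.linalg.lstsq(Z, y.ravel(), rcond=None)[0][0]

# Pooled OLS without the dummies: biased by the omitted alpha_i
b_pooled = (x.ravel() @ y.ravel()) / (x.ravel() @ x.ravel())

print(b_within, b_lsdv, b_pooled)
```

The within and LSDV slope estimates coincide (up to numerical precision), while the pooled estimate is pulled away from the true value.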
Table 14: Dep. variable: log(cost). OLS regression.

                 Estimate   Std. Error   t value   Pr(>|t|)
  (Intercept)      9.517       0.229      41.514     0.000

Regression of the log of airline costs on log(output), log(fuel price) and capacity utilization of the fleet (Data from Greene (2003), see Fig. 31).
Table 15: Dep. variable: log(cost). LSDV regression.

                          Estimate   Std. Error   t value   Pr(>|t|)
  log(airlines$output)      0.919       0.030      30.756     0.000
  log(airlines$pf)          0.417       0.015      27.468     0.000
  airlines$lf              -1.070       0.202      -5.307     0.000
Table 16: Dep. variable: Nb of packs. OLS regression.

                               Estimate   Std. Error   t value   Pr(>|t|)
  (Intercept)                   10.067       0.516      19.525     0.000
  log(CigarettesSW$rincome)      0.318       0.136       2.337     0.022
  log(CigarettesSW$rprice)      -1.334       0.135      -9.856     0.000

Table 17: Dep. variable: Nb of packs. LSDV HAC fixed-effect regression.

Regression of the number of cigarette packs (log) on real income (log) and real price of cigarettes (log) (Data from Stock and Watson (2003), see Fig. 29.)
Extension: Fixed time and group effects
- Time effects are introduced:

$$y_{i,t} = x_{i,t}'\beta + \alpha_i + \gamma_t + \varepsilon_{i,t}.$$

- The LSDV model (Eq. 43) can be extended:

$$y = [X \; D \; C]\begin{bmatrix} \beta \\ \alpha \\ \gamma \end{bmatrix} + \varepsilon, \quad (48)$$

with:

$$C = \begin{bmatrix} \delta_1 & \delta_2 & \dots & \delta_{T-1} \\ \vdots & \vdots & & \vdots \\ \delta_1 & \delta_2 & \dots & \delta_{T-1} \end{bmatrix},$$

where the T-dimensional vector δ_t is [0, ..., 0, 1, 0, ..., 0]′, with the 1 in the t-th position.
Example 5.1 (Geo-located data)
- Geo-located data from Airbnb, Zürich (Airbnb, date 22 June 2017).

Table 18: Regression of prices on the number of people the apartment can accommodate and a dummy indicating if the apartment is a studio.
Example 5.1 (Geo-located data, continued)

Table 19: Regression of prices on the number of people the apartment can accommodate, a dummy indicating if the apartment is a studio and dummies for the city areas.
Source: Airbnb, date 22 June 2017. Regression of price for Entire home/apt on number of bedrooms, number of people that can be accommodated.
Source: Airbnb, date 22 June 2017. Regression of price for Entire home/apt on number of bedrooms, number of people that can be accommodated and dummies for the 12 city areas.
Source: Airbnb, date 22 June 2017. City's districts.
Estimation of random effects models
- Here, the individual effects are assumed not to be correlated with the other variables (the x_i):

$$y_{i,t} = x_{i,t}'\beta + (\alpha + \underbrace{u_i}_{\text{random heterogeneity}}) + \varepsilon_{i,t},$$

with

$$E(\varepsilon_{i,t}|X) = E(u_i|X) = 0,$$

$$E(\varepsilon_{i,t}\varepsilon_{j,s}|X) = \begin{cases} \sigma_\varepsilon^2 & \text{if } i = j \text{ and } s = t, \\ 0 & \text{otherwise,} \end{cases} \qquad E(u_i u_j|X) = \begin{cases} \sigma_u^2 & \text{if } i = j, \\ 0 & \text{otherwise.} \end{cases}$$
- Notation: η_{i,t} = u_i + ε_{i,t} and η_i = [η_{i,1}, η_{i,2}, ..., η_{i,T}]′.
- We have E(η_i|X) = 0 and Var(η_i|X) = Σ, where

$$\Sigma = \begin{bmatrix} \sigma_\varepsilon^2 + \sigma_u^2 & \sigma_u^2 & \dots & \sigma_u^2 \\ \sigma_u^2 & \sigma_\varepsilon^2 + \sigma_u^2 & \dots & \sigma_u^2 \\ \vdots & & \ddots & \vdots \\ \sigma_u^2 & \sigma_u^2 & \dots & \sigma_\varepsilon^2 + \sigma_u^2 \end{bmatrix} = \sigma_\varepsilon^2 I + \sigma_u^2 \mathbf{1}\mathbf{1}'.$$

- Denoting by Ω the covariance matrix of η = [η₁′, ..., η_n′]′, we have:

$$\Omega = I \otimes \Sigma.$$

- If we knew Ω, we could apply GLS (Eq. 35):

$$\hat{\beta} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}y$$

(recall that this amounts to regressing Ω^{-1/2}y on Ω^{-1/2}X).
- One can show that:

$$\Sigma^{-1/2} = \frac{1}{\sigma_\varepsilon}\left(I - \frac{\theta}{T}\mathbf{1}\mathbf{1}'\right), \quad \text{with} \quad \theta = 1 - \frac{\sigma_\varepsilon}{\sqrt{\sigma_\varepsilon^2 + T\sigma_u^2}}.$$

- Hence:

$$\Sigma^{-1/2} y_i = \frac{1}{\sigma_\varepsilon}\begin{bmatrix} y_{i,1} - \theta\bar{y}_i \\ y_{i,2} - \theta\bar{y}_i \\ \vdots \\ y_{i,T} - \theta\bar{y}_i \end{bmatrix}.$$
- What about when Ω is unknown?
- Idea: taking deviations from group means removes the heterogeneity:

$$y_{i,t} - \bar{y}_i = (x_{i,t} - \bar{x}_i)'\beta + (\varepsilon_{i,t} - \bar{\varepsilon}_i).$$

- The previous equation can be consistently estimated by OLS (the residuals are correlated across t within an entity, but the OLS estimates remain consistent, see Prop. 4.12).
- We have $E\left[\sum_{t=1}^{T}(\varepsilon_{i,t} - \bar{\varepsilon}_i)^2\right] = (T-1)\sigma_\varepsilon^2$.
- The ε_{i,t} are not observed, but b is a consistent estimator of β; the associated residuals therefore provide a consistent estimator σ̂_ε² of σ_ε².
- And for σ_u²? We can exploit the fact that OLS is consistent in the pooled regression:

$$\text{plim}\; s_{pooled}^2 = \text{plim}\; \frac{e'e}{nT - K - 1} = \sigma_u^2 + \sigma_\varepsilon^2,$$

and therefore use s²_pooled − σ̂_ε² as an approximation of σ_u².
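The two variance-component estimators just described can be sketched on a simulated random-effects panel (the data-generating process and parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(8)
n_ent, T = 200, 5
sig_u, sig_e = 1.5, 1.0           # true sigma_u^2 = 2.25, sigma_eps^2 = 1.0

u = rng.normal(scale=sig_u, size=n_ent)            # random effects, independent of x
x = rng.normal(size=(n_ent, T))
y = x + u[:, None] + rng.normal(scale=sig_e, size=(n_ent, T))  # true beta = 1

# sigma_eps^2 from the within (demeaned) regression residuals:
# E[sum_t (eps_it - epsbar_i)^2] = (T-1) sigma_eps^2
x_w = x - x.mean(axis=1, keepdims=True)
y_w = y - y.mean(axis=1, keepdims=True)
b_w = (x_w.ravel() @ y_w.ravel()) / (x_w.ravel() @ x_w.ravel())
e_w = y_w - b_w * x_w
sig_e2_hat = (e_w.ravel() @ e_w.ravel()) / (n_ent * (T - 1) - 1)

# sigma_u^2 from the pooled regression: plim s_pooled^2 = sigma_u^2 + sigma_eps^2
X = np.column_stack([np.ones(n_ent * T), x.ravel()])
b_p = np.linalg.lstsq(X, y.ravel(), rcond=None)[0]
e_p = y.ravel() - X @ b_p
s2_pooled = (e_p @ e_p) / (n_ent * T - 2)
sig_u2_hat = s2_pooled - sig_e2_hat

print(sig_e2_hat, sig_u2_hat)
```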
Context and Objective
- Context:
  - You observe a sample y = {y₁, ..., y_n}.
  - You know that these data have been generated by a model parameterized by θ₀ ∈ R^K.
- Objective: estimate θ₀.
- Assume that the time periods between the arrivals of two customers in a shop, denoted by y_i, are i.i.d. and follow an exponential distribution, i.e. y_i ∼ E(λ).
- You have observed these arrivals for some time, thereby constituting a sample {y₁, ..., y_n}. You want to estimate λ (i.e., in that case, the vector of parameters is simply θ = λ).
- Recall that E(y_i) = 1/λ. Fig. 32 displays exponential density functions for different values of λ.
- Your 200 observations are reported at the bottom of Fig. 33 (red).
- You build the histogram and report it on the same chart (Fig. 34).
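The estimation exercise can be sketched numerically. For an i.i.d. exponential sample, the log-likelihood is n·log(λ) − λ·Σy_i, whose maximizer is λ̂ = 1/ȳ; the code below (with a made-up true λ) checks that a grid search over the log-likelihood lands on this closed-form value:

```python
import numpy as np

rng = np.random.default_rng(9)
lam_true = 4.0
# numpy parameterizes the exponential by its scale = 1/lambda, so E(y_i) = 1/lambda
y = rng.exponential(scale=1.0 / lam_true, size=200)

def loglik(lam, y):
    """Log-likelihood of an i.i.d. E(lambda) sample: n*log(lam) - lam*sum(y)."""
    return len(y) * np.log(lam) - lam * np.sum(y)

grid = np.linspace(0.5, 10.0, 2000)
lam_grid = grid[np.argmax([loglik(l, y) for l in grid])]
lam_closed = 1.0 / np.mean(y)      # analytic maximizer of the log-likelihood

print(lam_grid, lam_closed)
```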
Figure 32: Densities of exponential distribution for different λ. [Density curves on (0, 12); the legend includes λ = 7.]
Figure 33: Densities of exponential distribution for different λ, with the 200 observations reported at the bottom (in red).
Figure 34: Densities of exponential distribution for different λ, with the histogram of the 200 observations superimposed.
- What is your estimate of λ?
- Now, assume that you have only four observations: y₁ = 1.1, y₂ = 2.2, y₃ = 0.7 and y₄ = 5.0.
- What was the probability of observing, for a small ε, a sample such that y_i − ε ≤ Y_i < y_i + ε for i ∈ {1, 2, 3, 4}?
- Because the y_i are i.i.d., this probability is $\prod_{i=1}^{4} 2\varepsilon f(y_i; \lambda)$.
- The next plot shows this probability (divided by 16ε⁴) as a function of λ.
Figure 35: Proba. that y_i − ε ≤ Y_i < y_i + ε, i ∈ {1, 2, 3, 4}. [Probability plotted as a function of λ, for λ from 0 to 10.]
Figure 36: log of Proba. that y_i − ε ≤ Y_i < y_i + ε, i ∈ {1, 2, 3, 4}. [Log-probability plotted as a function of λ, for λ from 0 to 10.]
Figure 37: Back to the example with 200 observations

[Plot: log(Probability) (y axis, −500 to −400) as a function of lambda (x axis, 0 to 10).]

174 / 295
Notations

- f(y; θ) denotes the probability density function (p.d.f.) of a random variable Y which depends on a set of parameters θ.
- Density of n independent and identically distributed (i.i.d.) observations of Y:

  f(y_1, ..., y_n; θ) = ∏_{i=1}^{n} f(y_i; θ).

- We often work with log L, the log-likelihood function.

Example 6.1
If y_i ∼ N(µ, σ²), then

  log L(θ; y) = −(1/2) Σ_{i=1}^{n} [ log σ² + log 2π + (y_i − µ)²/σ² ].

175 / 295
Definition 6.2 (Score)

The score S(y; θ) is given by ∂ log f(y; θ)/∂θ.

Example: if y ∼ N(µ, σ²) and θ = [µ, σ²]′, then

  ∂ log f(y; θ)/∂θ = [ ∂ log f(y; θ)/∂µ ; ∂ log f(y; θ)/∂σ² ]
                   = [ (y − µ)/σ² ; (1/(2σ²)) ( (y − µ)²/σ² − 1 ) ].

176 / 295
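A quick simulation check of the Gaussian score above: evaluated at the true parameters, its sample mean should be close to zero (the sample size below is an arbitrary choice).

```python
import numpy as np

# Hedged sketch: the N(mu, sig2) score of Definition 6.2, evaluated at the
# true parameters, has (approximately) zero sample mean.
rng = np.random.default_rng(1)
mu, sig2 = 1.0, 4.0
y = rng.normal(mu, np.sqrt(sig2), size=200_000)

score_mu = (y - mu) / sig2
score_sig2 = (1 / (2 * sig2)) * ((y - mu) ** 2 / sig2 - 1)
```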
The score has a zero expectation:

  E[ ∂ log f(Y; θ)/∂θ ] = ∫ (∂ log f(y; θ)/∂θ) f(y; θ) dy
  = ∫ [ (∂f(y; θ)/∂θ) / f(y; θ) ] f(y; θ) dy = (∂/∂θ) ∫ f(y; θ) dy = ∂1/∂θ = 0. ∎

The information matrix is (minus) the expectation of the second derivatives of the log-likelihood function:

  I_Y(θ) = −E[ ∂² log f(Y; θ) / (∂θ ∂θ′) ].

177 / 295
Proposition 6.2 (Information matrix)

We have I_Y(θ) = E[ (∂ log f(Y; θ)/∂θ)(∂ log f(Y; θ)/∂θ)′ ] = Var[S(Y; θ)].

Proof: ∂² log f(Y; θ)/(∂θ ∂θ′) = (1/f(Y; θ)) ∂²f(Y; θ)/(∂θ ∂θ′) − (∂ log f(Y; θ)/∂θ)(∂ log f(Y; θ)/∂θ′). The expectation of the first right-hand-side term is ∂²1/(∂θ ∂θ′) = 0. ∎

Example
If y_i ∼ N(µ, σ²), let θ = [µ, σ²]′; then

  ∂ log f(y; θ)/∂θ = [ (y − µ)/σ² , (1/(2σ²)) ( (y − µ)²/σ² − 1 ) ]′

and

  I(θ) = [ 1/σ²  0 ; 0  1/(2σ⁴) ].

178 / 295
The information matrix resulting from two independent experiments is the sum of the information matrices.

Theorem 6.1 (Fréchet-Darmois-Cramér-Rao bound)

Consider an unbiased estimator of θ denoted by θ̂(Y). Then, for any vector ω:

  Var(ω′θ̂(Y)) ≥ (ω′ω)² / (ω′ I_Y(θ) ω).

Proof: The Cauchy-Schwarz inequality (Theorem 9.7) implies that Var(ω′θ̂(Y)) Var(ω′S(Y; θ)) ≥ (ω′ Cov[θ̂(Y), S(Y; θ)] ω)². Now, Cov[θ̂(Y), S(Y; θ)] = ∫_y θ̂(y) (∂ log f(y; θ)/∂θ′) f(y; θ) dy = (∂/∂θ′) ∫_y θ̂(y) f(y; θ) dy = I because θ̂ is unbiased. Therefore Var(ω′θ̂(Y)) ≥ Var(ω′S(Y; θ))⁻¹ (ω′ω)². Prop. 6.2 leads to the result. ∎

179 / 295
The vector of parameters θ is identifiable if, for any other vector θ*:

  θ* ≠ θ ⇒ L(θ*; y) ≠ L(θ; y).

Definition 6.5 (Maximum Likelihood Estimator (MLE))

The maximum likelihood estimator (MLE) is the vector θ that maximizes the likelihood function.

Necessary condition for maximizing the likelihood function:

  ∂ log L(θ; y)/∂θ = 0. (50)

180 / 295
Assumption A.7 (Regularity Assumptions)

(v) The log-likelihood function is such that (1/n) log L(θ; y) converges almost surely to E_{θ0}(log f(Y; θ)), uniformly in θ ∈ Θ.

181 / 295
Under regularity conditions (Assumptions A.7), the MLE is:

(a) Consistent: plim θ̂_MLE = θ_0 (θ_0 is the true vector of parameters).
(b)-(c) Asymptotically normal, with an asymptotic covariance matrix based on −E( ∂² log L(θ; y)/(∂θ ∂θ′) ).
(d) Invariant: The MLE of g(θ_0) is g(θ̂_MLE) if g is a continuous and continuously differentiable function.

Proof: See online additional material.

Remark 6.1

182 / 295
- [I(θ_0)]⁻¹ = [ −E_0( ∂² log L(θ; y)/(∂θ ∂θ′) ) ]⁻¹.
- A direct (analytical) evaluation of this expectation is often out of reach. Two estimators are commonly used instead:

  Î₁⁻¹ = [ −∂² log L(θ̂_MLE; y)/(∂θ ∂θ′) ]⁻¹, (52)

  Î₂⁻¹ = [ Σ_{i=1}^{n} (∂ log L(θ̂_MLE; y_i)/∂θ)(∂ log L(θ̂_MLE; y_i)/∂θ′) ]⁻¹. (53)

183 / 295
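As an illustration of Eqs. (52)-(53), a sketch for an exponential model (a model chosen here for its closed forms, not taken from the slides): log f(y; λ) = log λ − λy, so the per-observation score is 1/λ − y and the Hessian of the log-likelihood is −n/λ².

```python
import numpy as np

# Hedged sketch: Hessian-based (Eq. 52-style) vs. outer-product-of-scores
# (Eq. 53-style) variance estimators for the MLE of an exponential rate.
rng = np.random.default_rng(2)
y = rng.exponential(scale=0.5, size=5_000)   # true rate = 2
lam_hat = 1 / y.mean()                       # MLE of the rate

scores = 1 / lam_hat - y                     # per-observation scores at the MLE
var_hess = lam_hat ** 2 / len(y)             # (-Hessian)^{-1} = (n / lam^2)^{-1}
var_opg = 1 / np.sum(scores ** 2)            # inverse sum of squared scores
```

The two estimators should agree up to sampling noise.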
To sum up – MLE in practice

- A parametric model (depending on the vector of parameters θ whose "true" value is θ_0) is specified.
- i.i.d. sources of randomness are identified.
- The MLE estimator results from the optimization problem.

184 / 295
- Consider the returns of the SMI index used in Fig. 38.
- Let's assume that these returns are independently drawn from a mixture of Gaussian distributions. The p.d.f. f(x; θ), with θ = [µ_1, σ_1, µ_2, σ_2, p]′, is given by:

  f(x; θ) = p (1/√(2πσ_1²)) exp( −(x − µ_1)²/(2σ_1²) ) + (1 − p) (1/√(2πσ_2²)) exp( −(x − µ_2)²/(2σ_2²) ).

p.d.f. of mixtures of Gaussian dist.

185 / 295
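The log-likelihood of this mixture can be maximized numerically. The sketch below uses a basic EM loop on simulated data (simulated because the SMI series itself is not reproduced here; the starting values and true parameters are arbitrary choices):

```python
import numpy as np

# Hedged sketch: MLE of a two-component Gaussian mixture via a plain EM loop.
rng = np.random.default_rng(3)
n = 4_000
z = rng.random(n) < 0.8                      # latent component indicator
x = np.where(z, rng.normal(0.2, 1.0, n), rng.normal(-1.0, 3.0, n))

def dens(x, mu, sig):
    return np.exp(-(x - mu) ** 2 / (2 * sig ** 2)) / np.sqrt(2 * np.pi * sig ** 2)

def mix_loglik(x, p, mu1, sig1, mu2, sig2):
    return np.sum(np.log(p * dens(x, mu1, sig1) + (1 - p) * dens(x, mu2, sig2)))

p, mu1, sig1, mu2, sig2 = 0.5, 1.0, 0.5, -2.0, 2.0   # arbitrary starting values
ll0 = mix_loglik(x, p, mu1, sig1, mu2, sig2)
for _ in range(300):
    w = p * dens(x, mu1, sig1)
    w = w / (w + (1 - p) * dens(x, mu2, sig2))       # E-step: posterior weights
    p = w.mean()                                     # M-step: weighted updates
    mu1, mu2 = np.average(x, weights=w), np.average(x, weights=1 - w)
    sig1 = np.sqrt(np.average((x - mu1) ** 2, weights=w))
    sig2 = np.sqrt(np.average((x - mu2) ** 2, weights=1 - w))

ll1 = mix_loglik(x, p, mu1, sig1, mu2, sig2)         # EM never decreases this
```

Each EM iteration weakly increases the log-likelihood, which is what the final comparison with the starting value checks.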
Example 6.3 (MLE estim. of a mixture of Gaussian distri. (cont'd))

[Two panels. Left: 5-day return on SMI index, in percent. Right: the return's density (solid line), estimated with a Gaussian kernel, h = 0.5 (in percent).]

The data spans the period from 2 June 2006 to 23 February 2016 at the daily frequency.
Left-hand plot: the blue lines indicate µ ± 2σ, where µ is the sample mean of the returns and σ is their sample standard deviation. Right-hand plot: the red dotted line is the density N(µ, σ²).

186 / 295
Example 6.3 (MLE estim. of a mixture of Gaussian distri. (cont'd))

[Plot: densities over the range −4 to 4.]

187 / 295
- Context:

  y_i | X ∼ B(g(x_i; θ)),

  where θ is a vector of parameters to be estimated.
- Objective: ...

189 / 295
Binary-choice models can be used to account for...

- binary outcomes (living in the city or in the countryside, in/out of the labour force, ...),
- success/failure (exams).

190 / 295
- A possibility would be to run a linear regression:

  y_i = θ′x_i + ε_i.

[Plot: observations (y_i ∈ {0, 1}) and fitted linear regressions against x, for x between −4 and 2.]

The model is P(y_i = 1|x_i) = Φ(0.5 + 2x_i), where Φ is the c.d.f. of the normal distribution.

191 / 295
- Two prominent models to tackle this situation. In both models, we have:

  P(y_i = 1|x_i; θ) = g(θ′x_i).

- Probit model:

  g(z) = Φ(z), (55)

- Logit model:

  g(z) = 1/(1 + exp(−z)). (56)

[Plot: g(x) for x between −4 and 4.]

192 / 295
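A minimal sketch of the two link functions; Φ is written with the error function, a standard identity.

```python
import math

# Hedged sketch: the probit link (Eq. 55) and the logit link (Eq. 56).
def probit_link(z):
    # Phi(z) via the identity Phi(z) = (1 + erf(z / sqrt(2))) / 2
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def logit_link(z):
    return 1 / (1 + math.exp(-z))
```

Both map the real line into (0, 1) and equal 1/2 at z = 0; the logit has heavier tails.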
[Plot: observations (y_i ∈ {0, 1}), fitted linear regressions, and P(y_i = 1|x_i) against x, for x between −4 and 2.]

The model is P(y_i = 1|x_i) = Φ(0.5 + 2x_i), where Φ is the c.d.f. of the normal distribution.

193 / 295
- These models have an interpretation in terms of latent variables.
- Consider the Probit case. We have:

  P(y_i = 1|x_i; θ) = Φ(θ′x_i) = P(−ε_i < θ′x_i),

  where ε_i ∼ N(0, 1). That is:

  P(y_i = 1|x_i; θ) = P(0 < y_i*), where y_i* = θ′x_i + ε_i.

Figure 43: Distribution of y_i* conditional on x_i

[Plot: density of y_i*, centered at θ′x_i; the area to the left of zero is P(y_i = 0|x_i; θ), the area to the right is P(y_i = 1|x_i; θ).]

194 / 295
Estimation

- The r.v. are assumed to be independent across entities i.
- The conditional distribution of y_i is:

  f(y_i|x_i; θ) = g(θ′x_i)^{y_i} (1 − g(θ′x_i))^{1−y_i}. (57)

- Therefore, if the observations (x_i, y_i) are independent (across entities i), then:

  log L(θ; y) = Σ_{i=1}^{n} [ y_i log g(θ′x_i) + (1 − y_i) log(1 − g(θ′x_i)) ].

195 / 295
- The likelihood equation (Def. 6.6), i.e. the first-order condition of the optimization program, is:

  Σ_{i=1}^{n} [ y_i g′(θ′x_i)/g(θ′x_i) − (1 − y_i) g′(θ′x_i)/(1 − g(θ′x_i)) ] x_i = 0.

⇒ Nonlinear equation that generally has to be solved numerically (no closed-form formula available).

196 / 295
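A sketch of such a numerical solution for the logit case, using Newton-Raphson on simulated data (the true parameters are arbitrary; concavity of the logit log-likelihood makes plain Newton well behaved here):

```python
import numpy as np

# Hedged sketch: solving the likelihood equation numerically for a logit model
# with Newton-Raphson steps theta <- theta - H^{-1} grad.
rng = np.random.default_rng(4)
n = 5_000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
theta_true = np.array([0.5, 2.0])
y = (rng.random(n) < 1 / (1 + np.exp(-X @ theta_true))).astype(float)

theta = np.zeros(2)
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ theta))
    grad = X.T @ (y - p)                          # score (the likelihood equation)
    hess = -(X * (p * (1 - p))[:, None]).T @ X    # Hessian, negative definite
    theta = theta - np.linalg.solve(hess, grad)
```

At convergence the score is (numerically) zero, and the estimate sits close to the true parameters up to sampling error.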
In practice, the inverse information matrix is approximated by:

  I(θ_0)⁻¹ ≈ −( ∂² log L(θ̂_MLE; y)/(∂θ ∂θ′) )⁻¹.

197 / 295
- In the Probit case (Eq. 55), it can be shown that we have:

  ∂² log L(θ; y)/(∂θ ∂θ′) = −Σ_{i=1}^{n} g′(θ′x_i) [x_i x_i′] ×
    [ y_i (g′(θ′x_i) + θ′x_i g(θ′x_i))/g(θ′x_i)² + (1 − y_i) (g′(θ′x_i) − θ′x_i (1 − g(θ′x_i)))/(1 − g(θ′x_i))² ].

- In the Logit case (Eq. 56), it can be shown that we have:

  ∂² log L(θ; y)/(∂θ ∂θ′) = −Σ_{i=1}^{n} g′(θ′x_i) x_i x_i′,

  where g′(x) = exp(−x)/(1 + exp(−x))².

(Note: because g′(x) > 0, it can be checked that this Hessian is negative semi-definite, i.e. the log-likelihood is concave.)

198 / 295
Marginal effects

- How to measure marginal effects, i.e. the effect on the probability that y_i = 1 of a marginal increase of x_{i,k}?
- It is given by:

  ∂P(y_i = 1|x_i; θ)/∂x_{i,k} = g′(θ′x_i) θ_k,   with g′(θ′x_i) > 0.

- This effect is of the same sign as θ_k.
⇒ Increases by 1 unit of x_{i,k} (entity i) and of x_{j,k} (entity j) do not necessarily have the same effects on P(y_i = 1|x_i; θ) and on P(y_j = 1|x_j; θ), respectively.
- Measure of "average" marginal effect? Two solutions. For each explanatory variable k, one can evaluate the effect at the sample mean of the regressors, g′(θ̂′_MLE x̄) θ̂_MLE,k, or average the individual effects g′(θ̂′_MLE x_i) θ̂_MLE,k across entities.

199 / 295
Goodness of fit

- There is no commonly accepted version of R² (see Eq. 6) for binary-choice models. McFadden's pseudo-R² is often reported:

  R²_MF = 1 − log L(θ̂; y) / log L_0(y),

  where log L_0(y) is the log-likelihood of the model that contains an intercept only.

200 / 295
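A sketch of the computation, in which the "fitted" probabilities are stood in for by the true ones on simulated data (so no estimation step is needed here):

```python
import numpy as np

# Hedged sketch: McFadden's pseudo R^2 from Bernoulli log-likelihoods.
rng = np.random.default_rng(6)
n = 2_000
x = rng.normal(size=n)
p_fit = 1 / (1 + np.exp(-(0.5 + 2 * x)))       # stand-in for fitted probabilities
y = (rng.random(n) < p_fit).astype(float)

def bern_loglik(p, y):
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

log_L = bern_loglik(p_fit, y)
log_L0 = bern_loglik(np.full(n, y.mean()), y)  # intercept-only benchmark
r2_mf = 1 - log_L / log_L0
```

Because the model with regressors fits better than the intercept-only benchmark, the ratio lies strictly between 0 and 1.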
Example 7.2 (Labor market participation)

Source: Gerfin (1996).

income: −0.22, age: 0.69, education: 0.006, youngkids: −0.24, oldkids: ...

201 / 295
Example 7.2 (Labor market participation (cont'd))

Figure 44: Labor market participation vs. age and education

[Two panels: participation (from "no" to "yes") plotted against age (left) and education (right).]

202 / 295
Example 7.3 (Credit risk)

[Two panels. Left: share of defaulted entities by rating (default status, from "No default" to "Default", for ratings A to F). Right: interest rate (about 0.05 to 0.25) by rating (A to G).]

Source: Lending Club (large online peer-to-peer loan platform). About 38,000 loans, 2007 to 2011.

203 / 295
amt2income = (monthly payment owed by the borrower)/(self-reported annual income); term is 36 or 60 months; delinq2yrs = number of 30+ days past-due incidences of delinquency in the borrower's credit file for the past 2 years; home_ownership is either MORTGAGE, OWN or RENT.

Marginal effects:
amt2income: 0.031; term 60 months: 0.11; homeownershipRENT: 0.023; delinq2yrs: 0.014.

204 / 295
Example 7.4 (Credit risk)

Figure 46: Comparing probit-based predictions and ratings

[Scatter plot: probit-based probabilities of default (about 0.1 to 0.4) against rating (or grade), A to G.]

Source: Lending Club (large online peer-to-peer loan platform). About 38,000 loans, 2007 to 2011.

205 / 295
Likelihood ratio tests

- Because the models are estimated by MLE, one can easily conduct likelihood ratio (LR) tests in the context of binary-choice models.
- In LR tests, we compare a restricted model (R) and an unrestricted one (U), under H0: the "true" model is the restricted one.
- Even under H0, Eq. (58) holds. However, if H0 is true, one expects that log L_U is not too far from log L_R.
- Formally, it can be proven that, asymptotically:

  2(log L_U − log L_R) →d χ²(J),

  where 2(log L_U − log L_R) is the LR test statistic.

206 / 295
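A minimal numerical sketch with hypothetical log-likelihood values (the 5% critical value of χ²(1) is about 3.84):

```python
# Hedged sketch: LR statistic 2(logL_U - logL_R) vs a chi-square critical value.
log_L_U, log_L_R = -612.3, -613.1   # hypothetical fitted values, J = 1 restriction
lr_stat = 2 * (log_L_U - log_L_R)
crit_5pct = 3.841                   # chi-square(1) critical value at the 5% level
reject_H0 = lr_stat > crit_5pct
```

Here the statistic is 1.6, below the critical value, so the restriction is not rejected.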
- A time series: {y_t}_{t=−∞}^{+∞} = {..., y_{−2}, y_{−1}, y_0, y_1, ..., y_t, ...}, y_t ∈ R^k.
- Typical sample: {y_1, ..., y_T}.

208 / 295
Figure 47: Federal fund rate

[Plot: Federal Fund Rate, about 0 to 15 percent, from 1960 to the 2000s.]

Source of the data: FRED database.

209 / 295
Figure 48: Federal fund rate and unemployment rate

[Plot: Federal Fund Rate and Unemployment Rate, about 0 to 15 percent, from 1960 to the 2000s.]

Source of the data: FRED database. Unemployment rate in red.

210 / 295
Figure 49: U.S. Gross Domestic Product

[Two panels: the level of GDP (about 5,000 to 15,000) and its growth rate (0.00 to 0.10), from 1960 to the 2000s.]

Source of the data: FRED database.

211 / 295
Figure 50: Intraday data

[Plot: a price in USD (about 117 to 119.5) between 11:00 and 15:00.]

Source of the data: Google Finance.

212 / 295
Figure 51: Seasonality illustration (1/2)

[Plot: number of deaths in France (monthly), about 40,000 to 60,000, from 1985 to 2005.]

Source of the data: FRED database.

213 / 295
Figure 52: Seasonality illustration (2/2)

[Plot: number of Google searches for "depression", index about 40 to 90, from 2004 to 2016.]

Source of data: Google trends.

214 / 295
- Standard time series models are built using "basic" shocks that we will often denote by ε_t. Typically: E(ε_t) = 0.
- A basic form of shocks: i.i.d. shocks (independence: Def. 1.6).

Definition 8.2 (White noise)

The process {ε_t}_{t∈]−∞,+∞[} is a white noise if, for all t, (a) E(ε_t) = 0, (b) Var(ε_t) = E(ε_t²) = σ² < ∞ and (c) for all s ≠ t, E(ε_t ε_s) = 0.

Example: consider a process ε_t with E(ε_t|ε_{t−1}) = 0 and Var(ε_t|ε_{t−1}) = E(ε_t²|ε_{t−1}) = 1 + 0.5ε_{t−1}². Then:

  E(ε_t) = E[E(ε_t|ε_{t−1})] = 0 and
  Var(ε_t) = Var[E(ε_t|ε_{t−1})] + E[Var(ε_t|ε_{t−1})] = 0 + E(1 + 0.5ε_{t−1}²).

- Therefore Var(ε_t) = 2 < ∞. It can be shown further that E(ε_t ε_s) = 0 if s ≠ t. Hence ε_t is a white noise. But the ε_t's are not independent, for instance because E(ε_t²|ε_{t−1}) ≠ E(ε_t²) = Var(ε_t).

215 / 295
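This example can be simulated directly. The sketch below draws ε_t as a normal with the stated conditional variance (one concrete process satisfying the two conditional-moment conditions; the Gaussian conditional law is an assumption) and checks that the unconditional variance is close to 2:

```python
import numpy as np

# Hedged sketch: an ARCH-type white noise with conditional variance
# 1 + 0.5 * eps_{t-1}^2; the unconditional variance solves V = 1 + 0.5 V, so V = 2.
rng = np.random.default_rng(7)
T = 200_000
eps = np.zeros(T)
for t in range(1, T):
    eps[t] = rng.normal() * np.sqrt(1 + 0.5 * eps[t - 1] ** 2)
```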
- In the general case, the expectation and the variance of y_t may depend on time t.
- However, in the following, we will focus on those processes y_t whose first two unconditional moments do not depend on t.

216 / 295
Definition (Covariance stationarity)

The process y_t is covariance-stationary if, for all t and j,

  E(y_t) = µ and E{(y_t − µ)(y_{t−j} − µ)} = γ_j < ∞.

For such a process, γ_j = γ_{−j}.

Proof: Since y_t is covariance-stationary, the covariance between y_t and y_{t−j} (i.e. γ_j) is the same as that between y_{t+j} and y_{t+j−j} (i.e. γ_{−j}). ∎

217 / 295
Annual GDP (log) Growth

[Plot: annual log GDP growth, about 0.00 to 0.10, from 1960 to the 2000s.]

Blue solid line: sample mean; blue dashed line: sample mean ± 2 std deviations.

218 / 295
The first-order auto-covariance

[Scatter plot: Growth[t] against Growth[t−1].]

219 / 295
The second-order auto-covariance

[Scatter plot: Growth[t] against Growth[t−2].]

220 / 295
The fifth-order auto-covariance

[Scatter plot: Growth[t] against Growth[t−5].]

221 / 295
Figure 57: An example of nonstationary process: y_t = 0.05 × t + ε_t

[Plot: a simulated path trending from about 0 to 12 over 200 periods.]

222 / 295
- It is very common to work with auto-correlated time series.
- Auto-correlation affects, among many other things, the estimation of the moments of a variable.
- Typically, assume you have a sample of observations {y_1, ..., y_T} of a time series y_t, and consider the sample mean ȳ_T = (1/T) Σ_{t=1}^{T} y_t.
- If you want to compute a confidence interval for µ, you are tempted to rely on a central limit theorem. The appropriate theorem (8.1 below) involves the sequence of autocovariances {γ_j}_{j∈]−∞,∞[}. Both theorems (9.6 and 8.1) coincide when γ_j = 0 for j ≠ 0.

223 / 295
Figure 58: Processes with same first two unconditional moments

[Panels (Case A, ..., Case C): simulated sample paths of 100 observations.]

Note: The three samples have been simulated using the following data generating process: ...

224 / 295
Theorem 8.1

If process y_t is covariance-stationary and if the sequence of autocovariances is absolutely summable (Σ_{j=−∞}^{+∞} |γ_j| < ∞), then:

  ȳ_T →p µ = E(y_t), (59)

  lim_{T→+∞} T · E[(ȳ_T − µ)²] = Σ_{j=−∞}^{+∞} γ_j. (60)

Proof: Extra online material. ∎

225 / 295
Definition 8.4 (Long-run variance)

Under the assumptions of Theorem 8.1, the limit appearing in Eq. (60) exists and is called long-run variance. It is denoted by S, i.e.:

  S = lim_{T→+∞} T · E[(ȳ_T − µ)²].

- A natural estimator is:

  Ŝ = γ̂_0 + 2 Σ_{ν=1}^{q} γ̂_ν, (62)

  where γ̂_ν = (1/T) Σ_{t=ν+1}^{T} (y_t − ȳ)(y_{t−ν} − ȳ).

- Problem: in small samples, using Eq. (62) may result in a negative figure...

Definition 8.5 (Newey-West estimator of S)

Newey and West (1987) have proposed the following (nonnegative) consistent estimator:

  S^NW = γ̂_0 + 2 Σ_{ν=1}^{q} (1 − ν/(q + 1)) γ̂_ν.

226 / 295
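A sketch of γ̂_ν and the Newey-West estimator for an AR(1), whose long-run variance has the closed form S = σ²/(1 − φ)² (here 4), so the estimate can be checked (the AR(1) choice and its parameters are illustrative, not from the slides):

```python
import numpy as np

# Hedged sketch: gamma_hat and the Newey-West estimator of Definition 8.5,
# applied to a simulated AR(1) with phi = 0.5 and unit shock variance.
rng = np.random.default_rng(8)
phi, T = 0.5, 200_000
y = np.zeros(T)
for t in range(1, T):
    y[t] = phi * y[t - 1] + rng.normal()

def gamma_hat(y, nu):
    yc = y - y.mean()
    return np.sum(yc[nu:] * yc[:len(y) - nu]) / len(y)

def newey_west(y, q):
    return gamma_hat(y, 0) + 2 * sum(
        (1 - nu / (q + 1)) * gamma_hat(y, nu) for nu in range(1, q + 1))

S_nw = newey_west(y, q=50)   # theoretical long-run variance: 1/(1-phi)^2 = 4
```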
Difference equations and Auto-Regressive (AR) processes

- Autoregressive processes constitute an important class of time series processes.
- These processes are flexible and can reproduce the dynamics of a wide range of time-varying variables.

227 / 295
Definition 8.6 (Difference equation)

A linear p-th order difference equation is of the form:

  y_t = φ_1 y_{t−1} + φ_2 y_{t−2} + ... + φ_p y_{t−p} + ε_t, (63)

where ε_t is a sequence of shocks.

- It will prove convenient to introduce the following notations:

  F = [ φ_1 φ_2 ... φ_{p−1} φ_p
        1   0   ...  0      0
        0   1   ...  0      0
        ...
        0   0   ...  1      0 ],
  ξ_t = [ ε_t 0 ... 0 ]′,  y_t = [ y_t y_{t−1} ... y_{t−p+1} ]′, (64)

so that Eq. (63) reads:

  y_t = F y_{t−1} + ξ_t. (65)

228 / 295
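The companion form can be checked numerically for an AR(2): iterating Eq. (65) must reproduce the scalar recursion of Eq. (63).

```python
import numpy as np

# Hedged sketch: companion matrices of Eq. (64) for p = 2, and a check that
# the vector recursion (65) matches the scalar recursion (63).
phi = np.array([0.9, -0.2])
F = np.array([[phi[0], phi[1]],
              [1.0,    0.0]])

rng = np.random.default_rng(9)
T = 50
eps = rng.normal(size=T)
y = np.zeros(T)
for t in range(2, T):
    y[t] = phi[0] * y[t - 1] + phi[1] * y[t - 2] + eps[t]   # Eq. (63)

state = np.array([y[1], y[0]])                              # y_1 = (y1, y0)'
for t in range(2, T):
    state = F @ state + np.array([eps[t], 0.0])             # Eq. (65)
```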
Definition 8.7 (Dynamic multiplier)

The dynamic multiplier of y_{t+j} w.r.t. ε_t is ∂y_{t+j}/∂ε_t.

- We have: ∂y_{t+j}/∂ε_t = (F^j)_{[1,1]}.
- Let us assume that the eigenvalues of F (see Def. 9.2), denoted by λ_1, ..., λ_p, are distinct. Then F can be diagonalized:

  F = P diag(λ_1, λ_2, ..., λ_p) P⁻¹ = P D P⁻¹,

and therefore:

  F^j = P diag(λ_1^j, λ_2^j, ..., λ_p^j) P⁻¹ = P D^j P⁻¹.

229 / 295
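For the AR(2) of Example 8.2 below (φ_1 = 0.9, φ_2 = −0.2), the multipliers (F^j)_{[1,1]} and the eigenvalues can be computed directly:

```python
import numpy as np

# Hedged sketch: dynamic multipliers (F^j)[1,1] and eigenvalues of the
# companion matrix for y_t = 0.9 y_{t-1} - 0.2 y_{t-2} + eps_t.
F = np.array([[0.9, -0.2],
              [1.0,  0.0]])

eig = np.sort(np.linalg.eigvals(F))            # roots of z^2 - 0.9 z + 0.2 = 0
multipliers = [np.linalg.matrix_power(F, j)[0, 0] for j in range(6)]
```

The eigenvalues come out as 0.4 and 0.5, matching the example, and the multipliers decay geometrically.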
- Defining z_t = P⁻¹ y_t and η_t = P⁻¹ ξ_t, Eq. (65) becomes:

  z_t = D z_{t−1} + η_t. (66)

- The i-th of the p equations of Eq. (66) reads z_{i,t} = λ_i z_{i,t−1} + η_{i,t}.
- We have:

  z_{i,t+h} = λ_i z_{i,t+h−1} + η_{i,t+h} = λ_i^h z_{i,t} + η_{i,t+h} + λ_i η_{i,t+h−1} + ... + λ_i^{h−1} η_{i,t+1}.

- If ε_t is a white noise sequence, then η_{i,t} also is. Let σ_i² denote the variance of η_{i,t}. We have:

  Var[η_{i,t+h} + λ_i η_{i,t+h−1} + ... + λ_i^{h−1} η_{i,t+1}] = σ_i² (1 + λ_i² + ... + λ_i^{2(h−1)}).

- For h → +∞, this series converges iff |λ_i| < 1.

230 / 295
- If one of the λ_i's is such that |λ_i| > 1, then F^j = P D^j P⁻¹ "explodes" when j gets large, and |∂y_{t+j}/∂ε_t| → ∞ as j → ∞.
- The λ_i's are the roots of the characteristic equation λ^p − φ_1 λ^{p−1} − ... − φ_{p−1} λ − φ_p = 0 (see Eq. 69).
- Hence, the process defined by Eq. (63) is "not stable" (explodes) if the roots of the previous equation lie outside the unit circle.
- "Not stable" will be given a more formal definition later on (Def. 8.3).

231 / 295
Example 8.2

- Consider the 2nd-order difference equation:

  y_t = 0.9y_{t−1} − 0.2y_{t−2} + ε_t, ε_t ∼ i.i.d. N(0, 1).

- The eigenvalues of F are 0.5 and 0.4.

[Two panels: dynamic multipliers (decaying from 1 over j = 0, ..., 20) and simulated samples (100 periods).]

Web-interface illustration (use the "ARMA(p,q)" panel, with q=0)

232 / 295
Example 8.3

- Consider the 2nd-order difference equation:

  y_t = 1.1y_{t−1} − 0.3y_{t−2} + ε_t, ε_t ∼ i.i.d. N(0, 1).

- The eigenvalues of F are 0.6 and 0.5.

[Two panels: dynamic multipliers (decaying from 1 over j = 0, ..., 20) and simulated samples (100 periods).]

Web-interface illustration (use ARMA(p,q) panel, with q=0)

233 / 295
Example 8.4

- Consider the 2nd-order difference equation:

  y_t = 1.4y_{t−1} − 0.7y_{t−2} + ε_t, ε_t ∼ i.i.d. N(0, 1).

- The eigenvalues of F are complex conjugates: 0.7 ± 0.46i.

[Two panels: dynamic multipliers (damped oscillations over j = 0, ..., 20) and simulated samples (100 periods).]

Web-interface illustration (use the "ARMA(p,q)" panel, with q=0)

234 / 295
Example 8.5

- Consider the 2nd-order difference equation:

  y_t = 0.9y_{t−1} + 0.2y_{t−2} + ε_t, ε_t ∼ i.i.d. N(0, 1).

- The eigenvalues of F are 1.084 and −0.184.

[Two panels: dynamic multipliers (increasing over j = 0, ..., 20) and simulated samples (explosive paths).]

Web-interface illustration (use the "ARMA(p,q)" panel, with q=0)

235 / 295
Consider the AR(1) process:

  y_t = c + φy_{t−1} + ε_t,

where {ε_t}_{t∈]−∞,+∞[} is a white noise process (see Def. 8.2).

- If |φ| ≥ 1, y_t is "explosive" (see Slide 230); its variance does not exist. If |φ| < 1, recursive substitutions give:

  y_t = c + ε_t + φ(c + ε_{t−1}) + φ²(c + ε_{t−2}) + ... + φ^k(c + ε_{t−k}) + ...,

  so that E(y_t) = c/(1 − φ) and Var(y_t) = σ²/(1 − φ²).

Web-interface illustration (use the "AR(1)" panel)

236 / 295
For j ≥ 0:

  E([y_t − µ][y_{t−j} − µ])
  = E([ε_t + φε_{t−1} + φ²ε_{t−2} + ... + φ^j ε_{t−j} + φ^{j+1} ε_{t−j−1} + ...] ×
      [ε_{t−j} + φε_{t−j−1} + φ²ε_{t−j−2} + ... + φ^k ε_{t−j−k} + ...])
  = E(φ^j ε_{t−j}² + φ^{j+2} ε_{t−j−1}² + φ^{j+4} ε_{t−j−2}² + ...)
  = φ^j σ² / (1 − φ²).

- Therefore ρ_j = φ^j.
- By what precedes, we have:

Proposition 8.2 (Covariance-stationarity of an AR(1) process)

The AR(1) process is covariance-stationary iff |φ| < 1.

Web-interface illustration (use AR(1) panel)

237 / 295
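The geometric decay ρ_j = φ^j can be checked on a simulated AR(1) (the parameter values below are arbitrary):

```python
import numpy as np

# Hedged sketch: sample autocorrelations of an AR(1) with phi = 0.8 should be
# close to phi^j, and the sample mean close to c / (1 - phi) = 5.
rng = np.random.default_rng(10)
phi, c, T = 0.8, 1.0, 200_000
y = np.zeros(T)
y[0] = c / (1 - phi)
for t in range(1, T):
    y[t] = c + phi * y[t - 1] + rng.normal()

def rho(y, j):
    yc = y - y.mean()
    return np.sum(yc[j:] * yc[:len(y) - j]) / np.sum(yc ** 2)
```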
[Figure: a simulated AR(1) sample path, y(t).]
Web-interface illustration (use the "AR(1)" panel)
[Figure: a simulated AR(1) sample path, y(t).]
Web-interface illustration (use the "AR(1)" panel)
Definition 8.9 (pth-order AR process)

Process y_t follows a pth-order autoregressive process (AR(p)) if:

    y_t = c + φ_1 y_{t-1} + φ_2 y_{t-2} + ... + φ_p y_{t-p} + ε_t,   (68)

where {ε_t}_{t ∈ ]-∞,+∞[} is a white noise process (see Def. 8.2).

Proposition 8.3 (Covariance-stationarity of an AR(p) process)

The AR(p) process defined in Def. 8.9 is covariance-stationary iff (i) the eigenvalues of F (as defined by Eq. 64) lie within the unit circle, i.e. iff (ii) the roots of

    λ^p - φ_1 λ^{p-1} - ... - φ_{p-1} λ - φ_p = 0   (69)

lie within the unit circle.

Proof The fact that (i) is equivalent to (ii) is Remark 8.3. Slide 230 provides the idea of the proof in the case where F can be diagonalized.

I If the AR(p) defined in Def. 8.9 is covariance-stationary, we have

    µ = E(y_t) = c / (1 - φ_1 - ... - φ_p).

I The autocovariances are derived from the Yule-Walker equations (see next slide).
Computation of the autocovariances of an AR(p) process

I To compute the autocovariances of y_t (assumed to be covariance-stationary), let's rewrite Eq. (68):

    (y_t - µ) = φ_1 (y_{t-1} - µ) + φ_2 (y_{t-2} - µ) + ... + φ_p (y_{t-p} - µ) + ε_t.

I Multiplying both sides by y_{t-j} - µ and taking expectations leads to the Yule-Walker equations:

    γ_j = φ_1 γ_{j-1} + φ_2 γ_{j-2} + ... + φ_p γ_{j-p}    if j > 0,
    γ_0 = φ_1 γ_1 + φ_2 γ_2 + ... + φ_p γ_p + σ²           for j = 0.   (70)

I Using γ_j = γ_{-j} (Prop. 8.1), one can solve for (γ_0, γ_1, ..., γ_p) as functions of (σ², φ_1, ..., φ_p).
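For a given lag order, the Yule-Walker system (70) is a small linear system in the unknown autocovariances. A minimal Python sketch for an AR(2); the parameter values (phi1 = 0.5, phi2 = 0.3, sigma2 = 1) are illustrative assumptions, not values from the course:

```python
import numpy as np

# Yule-Walker equations (Eq. 70) for an AR(2), written as a linear system
# in the unknown autocovariances (gamma_0, gamma_1, gamma_2).
phi1, phi2, sigma2 = 0.5, 0.3, 1.0  # illustrative values

# gamma_0 = phi1*gamma_1 + phi2*gamma_2 + sigma2
# gamma_1 = phi1*gamma_0 + phi2*gamma_1   (using gamma_{-1} = gamma_1)
# gamma_2 = phi1*gamma_1 + phi2*gamma_0
A = np.array([
    [1.0,   -phi1,       -phi2],
    [-phi1, 1.0 - phi2,  0.0],
    [-phi2, -phi1,       1.0],
])
b = np.array([sigma2, 0.0, 0.0])
gamma0, gamma1, gamma2 = np.linalg.solve(A, b)

# Higher-order autocovariances follow from the j > 0 recursion.
gamma3 = phi1 * gamma2 + phi2 * gamma1
print(gamma0, gamma1, gamma2, gamma3)
```

Once (γ_0, ..., γ_p) are known, all further autocovariances are generated by the j > 0 recursion alone.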
[Figure: a simulated AR process sample path, y(t).]
Web-interface illustration (use the "AR(1)" panel)
MLE of AR processes

I Context:

    y_t = c + φ_1 y_{t-1} + ... + φ_p y_{t-p} + ε_t,   (71)

  where {ε_t} is an i.i.d. white noise process (see Def. 8.2).
I Objective: estimate the vector of parameters θ = [c, φ_1, ..., φ_p, σ²]' by maximum likelihood.
I We use Bayes' formula:

    f_{X,Y}(x, y) = f_{X|Y}(x, y) f_Y(y)

I ... to get this key decomposition of our likelihood function:

    L(θ; y) = f_{Y_1}(y_1; θ) × Π_{t=2}^T f_{Y_t|Y_{t-1},...,Y_1}(y_t, ..., y_1; θ).

I Log-likelihood:

    log L(θ; y) = log f_{Y_1}(y_1; θ) + Σ_{t=2}^T log f_{Y_t|Y_{t-1},...,Y_1}(y_t, ..., y_1; θ).
MLE of a Gaussian AR(1) process

I For t > 1:

    f_{Y_t|Y_{t-1},...,Y_1}(y_t, ..., y_1; θ) = f_{Y_t|Y_{t-1}}(y_t, y_{t-1}; θ).

I If ε_t ~ N(0, σ²) and y_t = c + φ_1 y_{t-1} + ε_t:

    f_{Y_t|Y_{t-1}}(y_t, y_{t-1}; θ) = (1 / (√(2π) σ)) exp( -(y_t - c - φ_1 y_{t-1})² / (2σ²) ).

I What about f_{Y_1}(y_1; θ)? We may use the marginal distribution of y_t. Alternatively, one can work with the conditional (approximate) log-likelihood:

    log L*(θ; y) = log L(θ; y) - log f_{Y_1}(y_1; θ)
                 = Σ_{t=2}^T log f_{Y_t|Y_{t-1},...,Y_1}(y_t, ..., y_1; θ).   (72)

I Exact and approximate MLE have the same large-sample distribution (because, if the process is stationary, f_{Y_1}(y_1; θ) makes a relatively negligible contribution to log L(θ; y)).
I Practical advantage of the conditional MLE: it is simply obtained by OLS.
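The conditional MLE of a Gaussian AR(1) is the OLS regression of y_t on (1, y_{t-1}). A minimal Python sketch on simulated data; the true values (c = 1, φ_1 = 0.8, σ = 1, T = 5000) are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a stationary Gaussian AR(1): y_t = c + phi1*y_{t-1} + eps_t.
c, phi1, sigma, T = 1.0, 0.8, 1.0, 5000  # illustrative values
y = np.empty(T)
y[0] = c / (1 - phi1)  # start at the unconditional mean
for t in range(1, T):
    y[t] = c + phi1 * y[t - 1] + sigma * rng.standard_normal()

# Conditional (approximate) MLE = OLS of y_t on (1, y_{t-1}).
Y = y[1:]
X = np.column_stack([np.ones(T - 1), y[:-1]])
b = np.linalg.solve(X.T @ X, X.T @ Y)   # [c_hat, phi1_hat]
resid = Y - X @ b
sigma2_hat = resid @ resid / (T - 1)    # ML estimate of sigma^2
print(b, sigma2_hat)
```

With T = 5000 observations, the estimates land close to the true (c, φ_1, σ²).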
MLE of a Gaussian AR(1) process

I Denoting

    Y = [y_2, ..., y_T]'   and   X = [1, y_1; ...; 1, y_{T-1}],

Eq. (72) becomes:

    log L*(θ; y) = -((T-1)/2) log(2π) - (T-1) log(σ)
                   - (1/(2σ²)) (Y - X[c, φ_1]')'(Y - X[c, φ_1]'),   (73)

which is maximised for:

    [ĉ, φ̂_1]' = (X'X)^{-1} X'Y   and   σ̂² = (1/(T-1)) (Y - X[ĉ, φ̂_1]')'(Y - X[ĉ, φ̂_1]').
MLE of a Gaussian AR(p) process

I Key decomposition of the log-likelihood:

    log L(θ; y) = log f_{Y_p,...,Y_1}(y_p, ..., y_1; θ)
                  + Σ_{t=p+1}^T log f_{Y_t|Y_{t-1},...,Y_{t-p}}(y_t, ..., y_{t-p}; θ),

  where the sum on the second line is log L*(θ; y).

I Maximisation of the approximate log-likelihood log L*(θ; y): OLS, using

    Y = [y_{p+1}, ..., y_T]'   and   X = [1, y_p, ..., y_1; ...; 1, y_{T-1}, ..., y_{T-p}].

I For stationary processes, approximate and exact MLE have the same large-sample distribution.
MLE of a Gaussian AR(p) process

Remark 8.4

I The regression is:

    y_t = β'x_t + ε_t,   (76)

  where x_t = [1, y_{t-1}, ..., y_{t-p}]' and β = [c, φ_1, ..., φ_p]'.
I The fact that x_t correlates with the ε_s for s < t implies that the OLS estimator b of β is biased in small samples.

    b = β + (X'X)^{-1} X'ε   (77)

Because of the specific form of X, we have non-zero correlation between x_t and ε_s for s < t, therefore E[(X'X)^{-1} X'ε] ≠ 0. ∎
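The small-sample bias pointed out in Remark 8.4 is easy to see by simulation. The sketch below (illustrative values: φ_1 = 0.9, T = 50, 2000 replications) estimates an AR(1) by OLS on many short samples and reports the average estimation error:

```python
import numpy as np

rng = np.random.default_rng(1)

# Monte Carlo illustration of Remark 8.4: in an AR(1) with phi1 = 0.9
# (illustrative value), the OLS estimator of phi1 is biased downward in
# small samples, even though it is consistent.
phi1, T, n_sims = 0.9, 50, 2000
est = np.empty(n_sims)
for s in range(n_sims):
    y = np.empty(T)
    y[0] = 0.0
    for t in range(1, T):
        y[t] = phi1 * y[t - 1] + rng.standard_normal()
    X = np.column_stack([np.ones(T - 1), y[:-1]])
    est[s] = np.linalg.solve(X.T @ X, X.T @ y[1:])[1]

bias = est.mean() - phi1
print(bias)  # negative: OLS underestimates phi1 in small samples
```

The bias shrinks as T grows, consistent with the large-sample results of the next slides.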
MLE of a Gaussian AR(p) process

Proposition 8.4 (Asymptotic distribution of the OLS estimator)

Consider

    y_t = c + φ_1 y_{t-1} + ... + φ_p y_{t-p} + ε_t,

where {ε_t} is an i.i.d. white noise process. If b is the OLS estimator of β of Eq. (76), we have:

    √T (b - β) = [ (1/T) Σ_{t=p}^T x_t x_t' ]^{-1} × √T (1/T) Σ_{t=1}^T x_t ε_t,

where the first factor converges in probability to Q^{-1} and the second converges in distribution to N(0, σ²Q), so that √T (b - β) converges in distribution to N(0, σ² Q^{-1}). Here Q = plim (1/T) Σ_{t=p}^T x_t x_t' = plim (1/T) Σ_{t=1}^T x_t x_t' = E(x_t x_t'): its first row and first column are [1, µ, ..., µ], and its (1+i, 1+j) entry, for i, j ≥ 1, is E(y_{t-i} y_{t-j}) = γ_{|i-j|} + µ².
Proof (of Proposition 8.4) Rearranging Eq. (77), we have:

    √T (b - β) = (X'X/T)^{-1} √T (X'ε/T).

Let us consider the autocovariances of v_t = x_t ε_t, denoted by γ_j^v. Using the fact that x_t is a linear combination of past ε_t's and that ε_t is a white noise, we get that E(ε_t x_t) = 0. Therefore

    γ_j^v = E(ε_t ε_{t-j} x_t x_{t-j}').

If j > 0, we have E(ε_t ε_{t-j} x_t x_{t-j}') = E(E[ε_t ε_{t-j} x_t x_{t-j}' | ε_{t-j}, x_t, x_{t-j}]) = E(ε_{t-j} x_t x_{t-j}' E[ε_t | ε_{t-j}, x_t, x_{t-j}]) = 0. Note that we have E[ε_t | ε_{t-j}, x_t, x_{t-j}] = 0 because {ε_t} is an i.i.d. white noise sequence (see Remark 1.1). If j = 0, we have:

    γ_0^v = E(ε_t² x_t x_t') = E(ε_t²) E(x_t x_t') = σ² Q.

The convergence in distribution of √T (X'ε/T) = √T (1/T) Σ_{t=1}^T v_t results from Theorem 8.1 (applied to v_t = x_t ε_t), using the γ_j^v computed above. ∎
Information Criteria and Lag length selection

I Compromise: a larger lag length p improves the fit but reduces parsimony.
I Sequential-testing approach:
  (a) Estimate the model with a large lag length p_0.
  (b) If the t-statistic associated with the p_0th lag is low, consider a model with p_0 - 1 lags (instead of p_0), and iterate.
I Information criteria: AIC, BIC.
  Idea: minimisation of a loss function equal to the sum of
  (a) the sum of squared residuals (SSR) and
  (b) a penalty term depending (positively) on the number of parameters.
Definition 8.10 (Information criteria)

The Akaike (AIC), Hannan-Quinn (HQ) and Schwarz (BIC) information criteria are of the form

    c^(i)(k) = -2 log L(θ̂_T(k); y) / T  +  k φ^(i)(T) / T,

where the first term decreases w.r.t. k, the second increases w.r.t. k, and θ̂_T(k) is the maximum-likelihood estimate of θ(k), a vector of parameters of length k.

    Criterion       (i)    φ^(i)(T)
    Akaike          AIC    2
    Hannan-Quinn    HQ     2 log(log(T))
    Schwarz         BIC    log(T)

The selected specification is the one that minimises the criterion:

    k̂^(i) = argmin_k c^(i)(k).
Case of a normal linear regression

I We consider a linear regression with normal disturbances:

    y_t = x_t'β + ε_t,   ε_t ~ i.i.d. N(0, σ²).

  The associated log-likelihood is of the form of Eq. (72).
I In that case, we have:

    c^(i)(k) = -2 log L(θ̂_T(k); y) / T + k φ^(i)(T) / T
             ≈ log(2π) + log(σ̂²) + (1/T) Σ_{t=1}^T ε̂_t² / σ̂² + k φ^(i)(T) / T,

  with

    σ̂² ≈ (1/T) Σ_{t=1}^T ε̂_t² = SSR/T.

I Hence k̂^(i) ≈ argmin_k log(SSR/T) + k φ^(i)(T) / T.
I In the case of an ARMA(p,q) process, k = 2 + p + q.
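The criterion log(SSR/T) + k φ^(i)(T)/T can be applied directly for lag selection. A Python sketch on data simulated from an AR(2) (all numerical choices here are illustrative); each AR(p) is fitted by OLS on a common estimation sample so that the SSRs are comparable across p:

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulate an AR(2) with illustrative coefficients.
T = 2000
y = np.empty(T)
y[:2] = 0.0
for t in range(2, T):
    y[t] = 0.5 * y[t - 1] + 0.3 * y[t - 2] + rng.standard_normal()

PMAX = 6  # largest lag order considered

def ssr_ar(y, p):
    # OLS fit of an AR(p) on the common sample y_{PMAX+1..T}
    Y = y[PMAX:]
    X = np.column_stack([np.ones(len(Y))] +
                        [y[PMAX - j:-j] for j in range(1, p + 1)])
    b = np.linalg.lstsq(X, Y, rcond=None)[0]
    e = Y - X @ b
    return e @ e, len(Y)

def best_lag(penalty):
    crits = []
    for p in range(1, PMAX + 1):
        ssr, n = ssr_ar(y, p)
        k = p + 1  # intercept + p autoregressive coefficients
        crits.append(np.log(ssr / n) + k * penalty / n)
    return 1 + int(np.argmin(crits))

p_aic = best_lag(2.0)                 # AIC penalty: phi(T) = 2
p_bic = best_lag(np.log(T - PMAX))    # BIC penalty: phi(T) = log(T)
print(p_aic, p_bic)
```

With the heavier log(T) penalty, BIC is more parsimonious than AIC, in line with the table of Definition 8.10.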
Forecasting with AR models

I Macroeconomic forecasts are done in many places: public administrations (notably Treasuries), central banks, international institutions (e.g. IMF, OECD), banks, big firms.
I These institutions are interested in the point estimates (~ most likely value) of the variable of interest. They are also interested in measuring the uncertainty surrounding these point forecasts (see the fan chart on the next slide).
Figure 62: "Fan Chart" from the Bank of England

[Excerpt from the Bank of England Inflation Report: "In light of the economic outlook, at its meeting ending on 3 February the MPC voted to maintain Bank Rate at 0.5% and the stock of purchased assets at £375 billion." Chart 5.2: CPI inflation projection based on market interest rate expectations and £375 billion purchased assets; Chart 5.3: the corresponding projection from the November Report (percentage increase in prices on a year earlier, 2011-19).]

Charts 5.2 and 5.3 depict the probability of various outcomes for CPI inflation in the future. They have been conditioned on the assumption that the stock of purchased assets financed by the issuance of central bank reserves remains at £375 billion throughout the forecast period. If economic circumstances identical to today's were to prevail on 100 occasions, the MPC's best collective judgement is that inflation in any particular quarter would lie within the darkest central band on only 30 of those occasions. The fan charts are constructed so that outturns of inflation are also expected to lie within each pair of the lighter red areas on 30 occasions. In any particular quarter of the forecast period, inflation is therefore expected to lie somewhere within the fans on 90 out of 100 occasions. And on the remaining 10 out of 100 occasions inflation can fall anywhere outside the red area of the fan chart. Over the forecast period, this has been depicted by the light grey background. See the box on pages 48-49 of the May 2002 Inflation Report for a fuller description of the fan chart and what it represents.
Context and Objectives

I Context:
  We have observations for y_t and x_t (potentially multi-dimensional) variables.
  x_t may contain lagged values of y_t.
I Objective: find the forecast y*_{t+1} (a function of x_t) that minimises

    E([y_{t+1} - y*_{t+1}]²),

  the mean square error (MSE).
Theorem 8.2 (Smallest MSE)

The smallest MSE is obtained with the expectation of y_{t+1} conditional on x_t.

Proof (Sketch of) Consider y*_{t+1} = g(x_t). We have:

    E([y_{t+1} - y*_{t+1}]²) = E( [{y_{t+1} - E(y_{t+1}|x_t)} + {E(y_{t+1}|x_t) - y*_{t+1}}]² ).

Expand. Use the law of iterated expectations on the double product (conditioning on x_t); the latter is 0. It implies that E([y_{t+1} - y*_{t+1}]²) is never smaller than E([y_{t+1} - E(y_{t+1}|x_t)]²), with equality if E(y_{t+1}|x_t) = y*_{t+1}. ∎
Forecasting an AR(p) process

I Consider the process:

    y_t = c + φ_1 y_{t-1} + φ_2 y_{t-2} + ... + φ_p y_{t-p} + ε_t,   with ε_t ~ i.i.d. N(0, σ²).

I Using the notation of Eq. (64), we have:

    y_t - µ = F(y_{t-1} - µ) + ξ_t.

  Hence: y_{t+h} - µ = ξ_{t+h} + F ξ_{t+h-1} + ... + F^{h-1} ξ_{t+1} + F^h (y_t - µ).
I Therefore, denoting by Ω the covariance matrix of ξ_t:

    E(y_{t+h} | y_t, y_{t-1}, ...) = µ + F^h (y_t - µ)
    Var([y_{t+h} - E(y_{t+h} | y_t, y_{t-1}, ...)]) = Ω + F Ω F' + ... + F^{h-1} Ω (F^{h-1})'.

Web-interface illustration
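For a scalar AR(1), F reduces to φ, so the conditional mean is µ + φ^h (y_t - µ) and the forecast variance is a finite geometric sum. A minimal sketch with illustrative parameter values:

```python
# h-step-ahead forecast of a scalar AR(1) (the F matrix is then just phi):
# E(y_{t+h} | y_t) = mu + phi^h (y_t - mu),
# Var = sigma^2 * (1 + phi^2 + ... + phi^(2(h-1))). Illustrative values.
c, phi, sigma = 1.0, 0.8, 0.5
mu = c / (1 - phi)
y_t = 7.0  # current observation (illustrative)

def forecast(h):
    mean = mu + phi**h * (y_t - mu)
    var = sigma**2 * sum(phi**(2 * j) for j in range(h))
    return mean, var

m1, v1 = forecast(1)
m20, v20 = forecast(20)
print(m1, v1, m20, v20)
# As h grows, the forecast reverts to mu and the variance tends to
# sigma^2 / (1 - phi^2), the unconditional variance.
```

This mean reversion and the widening, then flattening, variance is exactly what the fan-shaped confidence bands of the next figure display.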
Figure 63: Forecasts and 95% confidence intervals when shocks are Gaussian

[Chart: E(y_t) and E_t(y_{t+h}) plotted against the forecasting horizon h, with widening confidence bands.]
Web-interface illustration
Assessing the performances of a forecasting model

I Once one has fitted a model on a given set of data (of length T, say), one can compute the MSE (mean square error) to evaluate the performance of the model.
I But this MSE is the in-sample one.
I It is easy to reduce the in-sample MSE: typically, if the model is estimated with a large number of parameters, the in-sample fit improves mechanically, at the risk of overfitting.
Figure 64: In-sample versus out-of-sample forecasting

[Four panels: p = 1, p = 2, p = 4 and p = 7; sample period 1970-2010.]

Data: U.S. GDP growth. The blue lines are in-sample forecasts; they are based on AR(p) models estimated on the whole sample. The red lines are out-of-sample forecasts; they are based on reduced samples (from the first date to the date at which the red line begins).
I The previous (out-of-sample) exercise is a form of backtesting.
I Other forms of backtesting aim at testing whether the realisations of the variable are consistent with (previous) confidence intervals.
I Typically, if realised (ex post) variables tend to be far outside (ex ante) confidence intervals, we may deem the model inadequate.

Figure 65: Backtesting a forecasting approach

[Two panels, sample period 1970-2010.] Model-implied 95% confidence intervals in red.
Appendix
Definition 9.1 (Skewness and Kurtosis)

Let Y be a random variable whose fourth moment exists. The expectation of Y is denoted by µ.

I The skewness of Y is given by:

    E[(Y - µ)³] / {E[(Y - µ)²]}^{3/2}.

I The kurtosis of Y is given by:

    E[(Y - µ)⁴] / {E[(Y - µ)²]}².
Definition 9.2 (Eigenvalues)

The eigenvalues of a matrix M are the numbers λ for which:

    |M - λI| = 0,

where |•| is the determinant operator.

I |M^{-1}| = |M|^{-1}.
I If M admits the diagonal representation M = TDT^{-1}, where D is a diagonal matrix whose diagonal entries are {λ_i}_{i=1,...,n}, then:

    |M| = Π_{i=1}^n λ_i.
Definition 9.3 (Pseudo-inverse)

The (Moore-Penrose) pseudo-inverse of a matrix M is the matrix M* satisfying:

(i) M M* M = M
(ii) M* M M* = M*
(iii) (M M*)' = M M*
(iv) (M* M)' = M* M.

Remark 9.1 (Some properties)

I The pseudo-inverse of a zero matrix is its transpose.
Definition 9.4 (F statistics)

Consider n = n_1 + n_2 i.i.d. N(0, 1) r.v. X_i. If the r.v. F is defined by:

    F = ( Σ_{i=1}^{n_1} X_i² / n_1 ) / ( Σ_{j=n_1+1}^{n_1+n_2} X_j² / n_2 ),

then F ~ F(n_1, n_2). Table

Definition 9.5 (Student-t distribution)

Z follows a Student-t distribution with ν d.f. if

    Z = X_0 / √( Σ_{i=1}^ν X_i² / ν ),   X_i ~ i.i.d. N(0, 1).

We have E(Z) = 0, and Var(Z) = ν/(ν-2) if ν > 2. Table

Definition 9.6 (χ² distribution)

Z follows a χ² distribution with ν d.f. if Z = Σ_{i=1}^ν X_i², where X_i ~ i.i.d. N(0, 1). We have E(Z) = ν. Table
Definition 9.7 (Cauchy distribution)

The probability density function of the Cauchy distribution defined by a location parameter µ and a scale parameter γ is:

    f(x) = 1 / ( πγ [ 1 + ((x - µ)/γ)² ] ).

[Chart: Cauchy density on (-4, 4).]
Definition 9.8 (Idempotent matrix)

Matrix M is idempotent if M² = M.

Proposition 9.1 (Roots of an idempotent matrix)

The eigenvalues of an idempotent matrix are equal to 0 or 1.

Proposition 9.2 (Idempotent matrix and χ² distribution)

If X ~ N(0, Id) and M is a symmetric idempotent matrix, then X'MX follows a χ² distribution whose number of degrees of freedom is the rank of M. The rank of a symmetric idempotent matrix is equal to its trace.

Proof The result follows from Prop. 9.1, combined with the fact that the rank of a symmetric matrix is equal to the number of its nonzero eigenvalues. ∎
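These properties can be checked numerically on the OLS projection matrix P = X(X'X)^{-1}X', a standard example of a symmetric idempotent matrix (the random design below is illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(3)

# The OLS projection matrix P = X (X'X)^{-1} X' is symmetric idempotent;
# per Prop. 9.1-9.2, its eigenvalues are 0 or 1 and its rank equals its
# trace (= number of columns of X here).
n, k = 40, 3
X = rng.standard_normal((n, k))
P = X @ np.linalg.inv(X.T @ X) @ X.T

idempotent_gap = np.abs(P @ P - P).max()  # should be ~0
eigs = np.linalg.eigvalsh(P)              # P is symmetric
rank = np.linalg.matrix_rank(P)
trace = np.trace(P)
print(idempotent_gap, rank, trace)
```
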
Constrained OLS

The solution of the following optimisation problem:

    min_β ||y - Xβ||²   subject to   Rβ = q

is given by:

    β* = β_0 - (X'X)^{-1} R' [ R (X'X)^{-1} R' ]^{-1} (R β_0 - q),

where β_0 = (X'X)^{-1} X'y.

Proof See for instance Jackman, 2007.
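The standard restricted-least-squares solution can be verified numerically: the restricted estimator satisfies the constraint and attains a (weakly) smaller SSR than any other feasible point. A sketch with illustrative data and a hypothetical constraint β_1 + β_2 = 3:

```python
import numpy as np

rng = np.random.default_rng(4)

# Restricted least squares: minimise ||y - X b||^2 subject to R b = q,
# via b* = b0 - (X'X)^{-1} R' [R (X'X)^{-1} R']^{-1} (R b0 - q),
# where b0 is the unrestricted OLS estimator. Illustrative data.
n, k = 100, 4
X = rng.standard_normal((n, k))
y = X @ np.array([1.0, 2.0, -1.0, 0.5]) + rng.standard_normal(n)

XtX_inv = np.linalg.inv(X.T @ X)
b0 = XtX_inv @ X.T @ y                 # unrestricted OLS

R = np.array([[1.0, 1.0, 0.0, 0.0]])   # hypothetical constraint: b1 + b2 = 3
q = np.array([3.0])
M = R @ XtX_inv @ R.T
b_star = b0 - XtX_inv @ R.T @ np.linalg.solve(M, R @ b0 - q)
print(b_star, R @ b_star)
```
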
Theorem 9.1 (Chebychev's inequality)

    ∀ε > 0,  P(|X - c| > ε) ≤ E[(X - c)²] / ε².

Theorem 9.2 (Generalized Chebychev's inequality)

If E(|X|^r) is finite for some r > 0 then, ∀ε > 0,

    P(|X - c| > ε) ≤ E[|X - c|^r] / ε^r.

Proof Remark that ε^r 1_{|X| ≥ ε} ≤ |X|^r and take the expectation of both sides. ∎
Definition 9.9 (Convergence in probability, x_n →p c)

x_n converges in probability towards the constant c if, for all ε > 0, lim_{n→∞} P(|x_n - c| > ε) = 0.

Definition 9.10 (Convergence in the r-th mean)

x_n converges in the r-th mean (or in the L^r-norm) towards x, if E(|x_n|^r) and E(|x|^r) exist and if

    lim_{n→∞} E(|x_n - x|^r) = 0.

In particular, for r = 2, this convergence is called mean square convergence.
Definition 9.12 (Convergence in distribution, x_n →d x)

x_n is said to converge in distribution (or in law) to x if

    lim_{n→∞} F_{x_n}(s) = F_x(s)

for all s at which F_x (the cumulative distribution of x) is continuous.

(i) Slutsky's theorem: If x_n →d x and y_n →p c (a constant), then:

    x_n y_n →d x c
    x_n + y_n →d x + c
    x_n / y_n →d x / c   (if c ≠ 0).

(ii) Continuous mapping theorem: If x_n →d x and g is a continuous function, then g(x_n) →d g(x).
Proposition 9.5 (Implications of stochastic convergences)

    →^{L^s}  ⇒  →^{L^r}   (for 1 ≤ r ≤ s),   and   →^{L^r}  ⇒  →^p;
    →^{a.s.}  ⇒  →^p  ⇒  →^d.

Proof of the fact that →p ⇒ →d. Assume that X_n →p X. Denoting by F and F_n the c.d.f. of X and X_n, respectively, one shows that F_n(s) → F(s) at every s where F is continuous.
Proposition 9.6 (Convergence in distribution to a constant)

If X_n converges in distribution to a constant c, then X_n converges in probability to c.

Proof If ε > 0, we have P(X_n < c - ε) → 0, i.e. P(X_n ≥ c - ε) → 1, and P(X_n < c + ε) → 1. Therefore P(c - ε ≤ X_n < c + ε) → 1. ∎

I Let {x_n}_{n∈N} be a series of random variables defined by:

    x_n = n u_n,

  where the u_n are independent random variables s.t. u_n ~ B(1/n).
I We have x_n →p 0, but x_n does not converge to 0 in the L^r sense, because E(|x_n - 0|) = E(x_n) = 1.
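A quick simulation of the counterexample x_n = n·u_n makes the distinction concrete (sample sizes below are illustrative): the probability that x_n ≠ 0 vanishes as n grows, yet the mean stays near 1:

```python
import numpy as np

rng = np.random.default_rng(5)

# x_n = n * u_n with u_n ~ Bernoulli(1/n): P(x_n != 0) = 1/n -> 0
# (convergence in probability to 0), yet E|x_n| = n * (1/n) = 1 for all n.
def sample_xn(n, size):
    return n * (rng.random(size) < 1.0 / n)

for n in (10, 100, 10000):
    x = sample_xn(n, 200000)
    # frequency of nonzero draws shrinks, while the mean stays near 1
    print(n, (x != 0).mean(), x.mean())
```
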
Theorem 9.3 (Cauchy criterion)

Σ_{i=0}^T a_i converges (T → ∞) iff, for any η > 0, there exists an integer N such that, for all M ≥ N,

    | Σ_{i=N+1}^M a_i | < η.

Theorem 9.4 (Cauchy criterion (stochastic case))

Σ_{i=0}^T θ_i ε_{t-i} converges in mean square (T → ∞) to a random variable iff, for any η > 0, there exists an integer N such that, for all M ≥ N,

    E[ ( Σ_{i=N+1}^M θ_i ε_{t-i} )² ] < η.
Definition 9.13 (Characteristic function)

The characteristic function of a random variable X is φ_X(u) = E(exp(iuX)).

Theorem 9.5 (Law of large numbers)

If the X_i are i.i.d. with finite mean µ, the sample mean (X_1 + ... + X_n)/n converges in probability to µ.

Proof The properties of the characteristic function (see Def. 9.13) imply that:

    φ_{(X_1+...+X_n)/n}(u) = Π_{i=1}^n [ 1 + i (u/n) µ + o(u/n) ]  →  e^{iuµ}.

The facts that (a) e^{iuµ} is the characteristic function of the constant µ and (b) that a characteristic function uniquely characterises a distribution imply that the sample mean converges in distribution to the constant µ, which further implies that it converges in probability to µ [Exercise]. ∎
Theorem 9.6 (Lindeberg-Levy Central limit theorem, CLT)

If the x_i are i.i.d. with mean µ and variance σ²,

    √n (x̄_n - µ) →d N(0, σ²),   where x̄_n = (1/n) Σ_{i=1}^n x_i.

Proof Let us introduce the r.v. Y_n := √n (X̄_n - µ). We have φ_{Y_n}(u) = [ E( exp(i (1/√n) u (X_1 - µ)) ) ]^n. We have:

    [ E( exp(i (1/√n) u (X_1 - µ)) ) ]^n
      = [ E( 1 + i (1/√n) u (X_1 - µ) - (1/(2n)) u² (X_1 - µ)² + o(u²/n) ) ]^n
      = ( 1 - (1/(2n)) u² σ² + o(u²/n) )^n.

Therefore φ_{Y_n}(u) → exp(-u²σ²/2), which is the characteristic function of N(0, σ²). ∎
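A Monte Carlo illustration of the CLT with uniform draws (for U(0,1), µ = 1/2 and σ² = 1/12); the sample sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(6)

# CLT illustration: sqrt(n) * (xbar_n - mu) for i.i.d. U(0,1) draws
# should be approximately N(0, 1/12) for large n.
n, n_sims = 500, 20000
draws = rng.random((n_sims, n))            # U(0,1): mu = 0.5, var = 1/12
z = np.sqrt(n) * (draws.mean(axis=1) - 0.5)
print(z.mean(), z.var())                   # close to 0 and 1/12
```

The fraction of draws inside the ±1.96σ band is close to 95%, as the normal limit predicts.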
Proposition 9.7 (Inverse of a partitioned matrix)

We have:

    [ A_11  A_12 ]^{-1}   [ (A_11 - A_12 A_22^{-1} A_21)^{-1}                 -A_11^{-1} A_12 (A_22 - A_21 A_11^{-1} A_12)^{-1} ]
    [ A_21  A_22 ]      = [ -(A_22 - A_21 A_11^{-1} A_12)^{-1} A_21 A_11^{-1}  (A_22 - A_21 A_11^{-1} A_12)^{-1}               ]
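The partitioned-inverse formula can be checked numerically against a direct inversion (the random well-conditioned matrix and block sizes below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)

# Numerical check of the partitioned-inverse formula (Prop. 9.7) on a
# random symmetric positive definite matrix, split into 2x2 blocks.
n1, n2 = 3, 2
A = rng.standard_normal((n1 + n2, n1 + n2))
A = A @ A.T + (n1 + n2) * np.eye(n1 + n2)   # make it well-conditioned
A11, A12 = A[:n1, :n1], A[:n1, n1:]
A21, A22 = A[n1:, :n1], A[n1:, n1:]

inv = np.linalg.inv
S11 = inv(A11 - A12 @ inv(A22) @ A21)       # top-left block
S22 = inv(A22 - A21 @ inv(A11) @ A12)       # bottom-right block
top = np.hstack([S11, -inv(A11) @ A12 @ S22])
bot = np.hstack([-S22 @ A21 @ inv(A11), S22])
A_inv_blocks = np.vstack([top, bot])
err = np.abs(A_inv_blocks - inv(A)).max()
print(err)
```
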
back

Proposition 9.8 (Independence of a linear and a quadratic form)

Let x ~ N(0, Id), let L be a matrix, and let A be a symmetric idempotent matrix. If LA = 0, then the linear form Lx and the quadratic form x'Ax are independent.

Proof If LA = 0, then the two Gaussian vectors Lx and Ax are independent. This implies the independence of any function of Lx and any function of Ax. The result then follows from the observation that x'Ax = (Ax)'(Ax), which is a function of Ax. ∎

Proposition 9.9 (Inner product of a multivariate Gaussian variable)

Let X be an n-dimensional Gaussian vector, X ~ N(0, Σ), with Σ invertible. We have:

    X' Σ^{-1} X ~ χ²(n).

Proof Because Σ is a symmetric positive definite matrix, it admits the spectral decomposition PDP', where P is an orthogonal matrix (i.e. PP' = Id) and D is a diagonal matrix with non-negative entries. Denoting by √D^{-1} the diagonal matrix whose diagonal entries are the inverses of the square roots of those of D, it is easily checked that the covariance matrix of Y := √D^{-1} P' X is Id. Therefore Y is a vector of uncorrelated Gaussian variables. The properties of Gaussian variables imply that the components of Y are then also independent. Hence Y'Y = Σ_i Y_i² ~ χ²(n).
It remains to note that Y'Y = X' P D^{-1} P' X = X' Var(X)^{-1} X to conclude. ∎
Cauchy-Schwarz inequality

We have:

    Cov(X, Y)² ≤ Var(X) Var(Y)

and, if X ≠ 0 and Y ≠ 0, the equality holds iff X and Y are the same up to an affine transformation.

Proof If Var(X) = 0, this is trivial. If this is not the case, then let's define Z as Z = Y - (Cov(X,Y)/Var(X)) X. It is easily seen that Cov(X, Z) = 0. Then, the variance of Y = Z + (Cov(X,Y)/Var(X)) X is equal to the sum of the variance of Z and of the variance of (Cov(X,Y)/Var(X)) X, that is:

    Var(Y) = Var(Z) + (Cov(X,Y)/Var(X))² Var(X) ≥ (Cov(X,Y)/Var(X))² Var(X) = Cov(X,Y)² / Var(X).

The equality holds iff Var(Z) = 0, i.e. iff Y = (Cov(X,Y)/Var(X)) X + cst. ∎
Definition 9.14 (Matrix derivatives)

Consider a function f: R^K → R. Its first-order derivative is:

    ∂f/∂b (b) = [ ∂f/∂b_1 (b), ..., ∂f/∂b_K (b) ]'.

I If f(b) = A'b, where A is a K × 1 vector, then ∂f/∂b (b) = A.
Asymptotic level and consistency of a test

An asymptotic test with critical region Ω_n has an asymptotic level equal to α if:

    sup_{θ ∈ Θ} lim_{n→∞} P_θ(S_n ∈ Ω_n) = α,

where S_n is the test statistic and Θ is such that the null hypothesis H_0 is equivalent to θ ∈ Θ.

An asymptotic test with critical region Ω_n is consistent if:

    ∀θ ∈ Θ^c,  P_θ(S_n ∈ Ω_n) → 1,

where S_n is the test statistic and Θ^c is such that the null hypothesis H_0 is equivalent to θ ∉ Θ^c.
Understanding two-sided tests

back to main    Web interface producing these charts

Figure 66: Understanding two-sided tests (1/3)

[Chart: under H_0, the test statistic X ~ t(5); α is the level (or size) of the test. The density of the test statistic under H_0 is plotted; the critical region (sum of areas = α) consists of the two tails beyond the quantiles Φ_X^{-1}(α/2) and Φ_X^{-1}(1 - α/2).]
Understanding two-sided tests

back to main    Web interface producing these charts

Figure 67: Understanding two-sided tests (2/3)

[Chart: the (sample-based) test statistic X falls between Φ_X^{-1}(α/2) and Φ_X^{-1}(1 - α/2); the sum of the tail areas equals the p-value, and p-value > α, which implies that H_0 is not rejected.]
Understanding two-sided tests

back to main    Web interface producing these charts

Figure 68: Understanding two-sided tests (3/3)

[Chart: the (sample-based) test statistic X falls in the critical region; the sum of the tail areas equals the p-value, and p-value < α, which implies that H_0 is rejected.]
Understanding one-sided tests

back to main    Web interface producing these charts

Figure 69: Understanding one-sided tests (1/3)

[Chart: under H_0, the test statistic X ~ χ²(5); α is the level (or size) of the test. The density of the test statistic under H_0 is plotted; the critical region (area = α) lies beyond the quantile Φ_X^{-1}(1 - α).]
Understanding one-sided tests

back to main    Web interface producing these charts

Figure 70: Understanding one-sided tests (2/3)

[Chart: the (sample-based) test statistic X lies below Φ_X^{-1}(1 - α); the area to its right equals the p-value, and p-value > α, which implies that H_0 is not rejected.]
Understanding one-sided tests

Figure 71: Understanding one-sided tests (3/3)

Example of a one-sided test [specifically, under H0, the test statistic X ∼ χ²(5)].
α is the level (or size) of the test. Here, p-value < α, which implies rejection of H0.

[Figure: density of the test statistic under H0 (χ²(5), test-statistic values 0 to 15). The sample-based test statistic X is marked; the area to its right equals the p-value, and the critical value is Φ⁻¹_X(1 − α).]

289 / 295
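A minimal sketch of the one-sided decision rule in Figures 69 to 71, assuming scipy is available. The two x_obs values are hypothetical, chosen to land on either side of the critical value:

```python
# Under H0 the test statistic X ~ chi2(5); the test rejects H0 when the
# right-tail area beyond the observed statistic (the p-value) is below alpha.
from scipy.stats import chi2

nu = 5
alpha = 0.05
critical = chi2.ppf(1 - alpha, nu)   # Phi^{-1}_X(1 - alpha): critical region has area alpha

for x_obs in (7.0, 13.0):
    p_value = chi2.sf(x_obs, nu)     # area to the right of x_obs under the H0 density
    # "p-value < alpha" and "x_obs > critical" are the same decision
    print(x_obs, round(p_value, 3), p_value < alpha, x_obs > critical)
```

This makes the equivalence on the charts explicit: comparing the p-value to α is the same as comparing the statistic itself to the quantile Φ⁻¹_X(1 − α).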
290 / 295
Statistical Tables

Statistical tables usually report either probabilities α or quantiles z, the two being related as follows:

[Figure: two charts. Left, for random variables on ℝ: the probability α = P(0 < X ≤ z) is the area under the density between 0 and z. Right, for random variables on ℝ⁺: likewise, α = P(0 < X ≤ z) is the area between 0 and z.]

291 / 295
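The probability-quantile relation on this slide can be sketched numerically; this is an illustration assuming scipy, with the N(0, 1) and χ²(5) choices being arbitrary examples:

```python
# alpha = P(0 < X <= z) = F(z) - F(0), so tables can report either alpha
# (given z) or z (given alpha); the two directions are inverses.
from scipy.stats import norm, chi2

# Random variable on R (e.g. N(0,1)): F(0) = 1/2, so alpha = F(z) - 1/2
z = 1.0
alpha = norm.cdf(z) - norm.cdf(0.0)
z_back = norm.ppf(alpha + 0.5)       # recover z: z = F^{-1}(alpha + 1/2)

# Random variable on R+ (e.g. chi2(5)): F(0) = 0, so alpha = F(z)
z2 = 9.0
alpha2 = chi2.cdf(z2, 5)
z2_back = chi2.ppf(alpha2, 5)        # recover z: z = F^{-1}(alpha)

print(round(z_back, 6), round(z2_back, 6))
```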
Table 22: Normal distribution N(0, 1)

z = y + x

y ↓  x →  0.00   0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09
0.0       0.000  0.004  0.008  0.012  0.016  0.020  0.024  0.028  0.032  0.036
0.1       0.040  0.044  0.048  0.052  0.056  0.060  0.064  0.067  0.071  0.075
0.2       0.079  0.083  0.087  0.091  0.095  0.099  0.103  0.106  0.110  0.114
0.3       0.118  0.122  0.126  0.129  0.133  0.137  0.141  0.144  0.148  0.152
0.4       0.155  0.159  0.163  0.166  0.170  0.174  0.177  0.181  0.184  0.188
0.5       0.191  0.195  0.198  0.202  0.205  0.209  0.212  0.216  0.219  0.222
0.6       0.226  0.229  0.232  0.236  0.239  0.242  0.245  0.249  0.252  0.255
0.7       0.258  0.261  0.264  0.267  0.270  0.273  0.276  0.279  0.282  0.285
0.8       0.288  0.291  0.294  0.297  0.300  0.302  0.305  0.308  0.311  0.313
0.9       0.316  0.319  0.321  0.324  0.326  0.329  0.331  0.334  0.336  0.339
1.0       0.341  0.344  0.346  0.348  0.351  0.353  0.355  0.358  0.360  0.362
1.1       0.364  0.367  0.369  0.371  0.373  0.375  0.377  0.379  0.381  0.383
1.2       0.385  0.387  0.389  0.391  0.393  0.394  0.396  0.398  0.400  0.401
1.3       0.403  0.405  0.407  0.408  0.410  0.411  0.413  0.415  0.416  0.418
1.4       0.419  0.421  0.422  0.424  0.425  0.426  0.428  0.429  0.431  0.432
1.5       0.433  0.434  0.436  0.437  0.438  0.439  0.441  0.442  0.443  0.444
1.6       0.445  0.446  0.447  0.448  0.449  0.451  0.452  0.453  0.454  0.454
1.7       0.455  0.456  0.457  0.458  0.459  0.460  0.461  0.462  0.462  0.463
1.8       0.464  0.465  0.466  0.466  0.467  0.468  0.469  0.469  0.470  0.471
1.9       0.471  0.472  0.473  0.473  0.474  0.474  0.475  0.476  0.476  0.477
2.0       0.477  0.478  0.478  0.479  0.479  0.480  0.480  0.481  0.481  0.482
2.1       0.482  0.483  0.483  0.483  0.484  0.484  0.485  0.485  0.485  0.486
2.2       0.486  0.486  0.487  0.487  0.487  0.488  0.488  0.488  0.489  0.489
2.3       0.489  0.490  0.490  0.490  0.490  0.491  0.491  0.491  0.491  0.492
2.4       0.492  0.492  0.492  0.492  0.493  0.493  0.493  0.493  0.493  0.494
2.5       0.494  0.494  0.494  0.494  0.494  0.495  0.495  0.495  0.495  0.495

Notes: If z = x + y and if X ∼ N(0, 1), the cell of the table corresponding to x and y gives
P(0 < X ≤ z) (see Slide 291, left chart).

292 / 295
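As a check on how Table 22 is built, each cell is Φ(y + x) − 1/2 for X ∼ N(0, 1). A standard-library-only sketch (the helper names are mine, for illustration):

```python
# Recompute a cell of the half-table: P(0 < X <= z) = Phi(z) - 1/2,
# with the standard normal c.d.f. expressed via the error function.
from math import erf, sqrt

def phi(z):
    """Standard normal c.d.f. Phi(z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def table_cell(y, x):
    """Entry in row y, column x of Table 22: P(0 < X <= y + x)."""
    return phi(y + x) - 0.5

# e.g. row y = 1.9, column x = 0.06 (so z = 1.96)
print(round(table_cell(1.9, 0.06), 3))  # -> 0.475, as in the table
```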
Table 23: Student-t distribution t(ν) (Def. 9.5)

[Table body not recovered from the extraction.]

Notes: This table reports the quantiles of Student-t distributions with ν degrees of freedom. If X ∼ t(ν), the cell of the table corresponding to ν and α gives z such that α = P(0 < X ≤ z) (see Slide 291, left chart). z is also such that α* = P(−z < X ≤ z).

293 / 295
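A sketch of what Table 23 reports, assuming scipy, with ν = 10 and α = 0.475 as illustrative values: since P(0 < X ≤ z) = F(z) − 1/2, the tabulated z is F⁻¹(α + 1/2), and by symmetry the same z satisfies α* = P(−z < X ≤ z) = 2α.

```python
# Reproduce one (hypothetical) cell of the Student-t quantile table.
from scipy.stats import t

nu, alpha = 10, 0.475
z = t.ppf(alpha + 0.5, nu)                 # z such that P(0 < X <= z) = alpha
alpha_star = t.cdf(z, nu) - t.cdf(-z, nu)  # two-sided mass P(-z < X <= z)

print(round(z, 3), round(alpha_star, 3))   # alpha_star = 2 * alpha = 0.95
```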
Table 24: χ² distribution χ²(ν) (Def. 9.6)

[Table body largely lost in extraction; only the ν = 500 row survives, α column headers missing:]
500   449.147  459.926  520.950  540.930  553.127  563.852  576.493  603.446

Notes: This table reports the quantiles of χ² distributions with ν degrees of freedom. If X ∼ χ²(ν), the cell of the table corresponding to ν and α gives z such that α = P(0 < X ≤ z) (see Slide 291, right chart).

294 / 295
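Since the α column headers of Table 24 did not survive, here is a hedged consistency check, assuming scipy: the first two surviving entries of the ν = 500 row are numerically consistent with α = 0.05 and α = 0.10 (this pairing is my inference, not stated in the source).

```python
# Compare chi2 quantiles against the two leading entries of the nu = 500
# row; the alpha labels (0.05, 0.10) are an assumption for illustration.
from scipy.stats import chi2

nu = 500
for alpha, table_value in [(0.05, 449.147), (0.10, 459.926)]:
    z = chi2.ppf(alpha, nu)   # quantile such that alpha = P(0 < X <= z)
    print(alpha, round(z, 3), round(table_value, 3))
```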
Table 25: F distribution (Def. 9.4)

n2 ↓  n1 →   1      2      3      4      5      6      7      8      9      10

α = 0.90
5     4.06   3.78   3.62   3.52   3.45   3.40   3.37   3.34   3.32   3.30
10    3.29   2.92   2.73   2.61   2.52   2.46   2.41   2.38   2.35   2.32
15    3.07   2.70   2.49   2.36   2.27   2.21   2.16   2.12   2.09   2.06
20    2.97   2.59   2.38   2.25   2.16   2.09   2.04   2.00   1.96   1.94
50    2.81   2.41   2.20   2.06   1.97   1.90   1.84   1.80   1.76   1.73
100   2.76   2.36   2.14   2.00   1.91   1.83   1.78   1.73   1.69   1.66
500   2.72   2.31   2.09   1.96   1.86   1.79   1.73   1.68   1.64   1.61

α = 0.95
5     6.61   5.79   5.41   5.19   5.05   4.95   4.88   4.82   4.77   4.74
10    4.96   4.10   3.71   3.48   3.33   3.22   3.14   3.07   3.02   2.98
15    4.54   3.68   3.29   3.06   2.90   2.79   2.71   2.64   2.59   2.54
20    4.35   3.49   3.10   2.87   2.71   2.60   2.51   2.45   2.39   2.35
50    4.03   3.18   2.79   2.56   2.40   2.29   2.20   2.13   2.07   2.03
100   3.94   3.09   2.70   2.46   2.31   2.19   2.10   2.03   1.97   1.93
500   3.86   3.01   2.62   2.39   2.23   2.12   2.03   1.96   1.90   1.85

α = 0.99
5     16.26  13.27  12.06  11.39  10.97  10.67  10.46  10.29  10.16  10.05
10    10.04  7.56   6.55   5.99   5.64   5.39   5.20   5.06   4.94   4.85
15    8.68   6.36   5.42   4.89   4.56   4.32   4.14   4.00   3.89   3.80
20    8.10   5.85   4.94   4.43   4.10   3.87   3.70   3.56   3.46   3.37
50    7.17   5.06   4.20   3.72   3.41   3.19   3.02   2.89   2.78   2.70
100   6.90   4.82   3.98   3.51   3.21   2.99   2.82   2.69   2.59   2.50
500   6.69   4.65   3.82   3.36   3.05   2.84   2.68   2.55   2.44   2.36

Notes: This table reports the quantiles of F distributions with (n1, n2) degrees of freedom. If
X ∼ F(n1, n2), the cell of the table corresponding to n1, n2 and a given α gives z such that
α = P(0 < X ≤ z) (see Slide 291, right chart).

295 / 295
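A spot-check of Table 25, assuming scipy: the cell (n1 = 1, n2 = 10, α = 0.95) is 4.96, and the n1 = 1 column illustrates the classical link between the F and Student-t distributions, F₀.₉₅(1, n2) = t₀.₉₇₅(n2)².

```python
# Recompute one cell of the F table and verify the F(1, n2) = t(n2)^2 link.
from scipy.stats import f, t

z = f.ppf(0.95, 1, 10)                  # quantile z with 0.95 = P(0 < X <= z)
print(round(z, 2))                      # matches the table cell 4.96

# The same number as the square of the two-sided t(10) critical value
print(round(t.ppf(0.975, 10) ** 2, 2))
```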