
OLS estimation and Monte Carlo Simulation

Introductory notes and comments

Athanassios Stavrakoudis
http://stavrakoudis.econ.uoi.gr

Spring 2014

Contents

1 Econometrics with Octave

2 Simple Regression, use of sums and/or matrices

3 Goodness of Fit

4 Distribution of $\hat{\beta}$ of OLS with Monte Carlo

5 Data Generation Processes, autocorrelation example

Matrix Algebra

Dirk Eddelbuettel:

Econometricians sweat linear algebra.

Dirk Eddelbuettel, Econometrics with Octave,
J. Appl. Econ., 15: 531-542 (2000), doi:10.1002/1099-1255(200009/10)15:5<531::AID-JAE573>3.0.CO;2-8

Also available at: http://dirk.eddelbuettel.com/papers/octave.pdf

Creel Econometrics (figures a-d)

House prices

price   sqft
199.9   1065
228.0   1254
235.0   1300
285.0   1577
239.0   1600
293.0   1750
285.0   1800
365.0   1870
295.0   1935
290.0   1948
385.0   2254
505.0   2600
425.0   2800
415.0   3000
HousePrices.txt

Source: Ramu Ramanathan, Introductory Econometrics with Applications, 5th Ed., 2002, ISBN: 0-03-034342-9
http://econweb.ucsd.edu/~rramanat/embook5.htm
House prices plot

Estimation with gretl

OLS estimation with R
    > y <- read.table("HousePrices.txt")[,1]
    > x <- read.table("HousePrices.txt")[,2]
    > f <- lm(y ~ x)
    > summary(f)

    Call:
    lm(formula = y ~ x)

    Residuals:
        Min      1Q  Median      3Q     Max
    -53.602 -23.650  -1.192  10.948  91.898

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) 52.35091   37.28549   1.404    0.186
    x            0.13875    0.01873   7.407  8.2e-06 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 39.02 on 12 degrees of freedom
    Multiple R-squared:  0.8205,    Adjusted R-squared:  0.8056
    F-statistic: 54.86 on 1 and 12 DF,  p-value: 8.199e-06
Estimation of beta with simple regression

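$$y = \alpha + \beta x + e$$

$$\hat{\beta} = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - \left(\sum x\right)^2}$$

$$\hat{\alpha} = \bar{y} - \hat{\beta}\,\bar{x}$$
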
Simple regression with Octave

$$\hat{\beta} = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - \left(\sum x\right)^2}$$

    clear;
    load HousePrices.txt;

    y = HousePrices(:,1);   % price
    x = HousePrices(:,2);   % sqft
    n = length(x)

    sumx  = sum(x);
    sumy  = sum(y);
    sumxy = sum(x .* y);
    sumx2 = sum(x .^ 2);    % sum of the squares of x
    sum2x = sum(x) ^ 2;     % square of the sum of x

    betahat  = (n*sumxy - sumx*sumy) / (n*sumx2 - sum2x)
    alphahat = mean(y) - betahat * mean(x)
olsHousePrices1.m

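As a cross-check on the sum formula, the sketch below recomputes the estimates and compares them with Octave's built-in `polyfit`. It is not part of olsHousePrices1.m. The data are typed in from the house price table (an assumption made so the block runs without HousePrices.txt), and the variable names `data` and `p` are illustrative.

```octave
% House price data copied from the table: columns are price and sqft.
data = [199.9 1065; 228.0 1254; 235.0 1300; 285.0 1577; 239.0 1600;
        293.0 1750; 285.0 1800; 365.0 1870; 295.0 1935; 290.0 1948;
        385.0 2254; 505.0 2600; 425.0 2800; 415.0 3000];
y = data(:,1);
x = data(:,2);
n = length(x);

% Slope and intercept from the sum formula.
betahat  = (n*sum(x .* y) - sum(x)*sum(y)) / (n*sum(x .^ 2) - sum(x)^2);
alphahat = mean(y) - betahat * mean(x);

% Degree-1 least-squares fit: p(1) is the slope, p(2) the intercept.
p = polyfit(x, y, 1);

printf("sums:    alpha = %.5f, beta = %.5f\n", alphahat, betahat);
printf("polyfit: alpha = %.5f, beta = %.5f\n", p(2), p(1));
```

Both lines should agree with the coefficients in the R output (52.35091 and 0.13875).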
Equivalent computation with Octave

$$\hat{\beta} = \frac{\sum (x - \bar{x})(y - \bar{y}) - \sum (x - \bar{x}) \sum (y - \bar{y})}{\sum (x - \bar{x})^2 - \left(\sum (x - \bar{x})\right)^2}$$

    clear;
    load HousePrices.txt;

    y = HousePrices(:,1);   % price
    x = HousePrices(:,2);   % sqft

    xm = x - mean(x);       % deviations from the mean
    ym = y - mean(y);

    sumx  = sum(xm);        % zero up to rounding error
    sumy  = sum(ym);        % zero up to rounding error
    sumxy = sum(xm .* ym);
    sumx2 = sum(xm .^ 2);
    sum2x = sum(xm) ^ 2;

    betahat = (sumxy - sumx*sumy) / (sumx2 - sum2x)
olsHousePrices2.m

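A short worked step may help explain why the deviation form gives the same slope. Deviations from the mean sum to zero, so the second term of both the numerator and the denominator vanishes:

$$\sum (x - \bar{x}) = \sum x - n\bar{x} = 0, \qquad \sum (y - \bar{y}) = \sum y - n\bar{y} = 0$$

$$\hat{\beta} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}$$

In floating point `sum(xm)` and `sum(ym)` are only approximately zero, which is why the code can keep both terms without changing the result noticeably.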
Calculation with matrices

$$y = \beta_0 + \beta_1 x + e \qquad (1)$$
$$y = X\beta + e \qquad (2)$$
$$\hat{\beta} = (X'X)^{-1} X'y \qquad (3)$$

    clear;
    load HousePrices.txt;

    y = HousePrices(:,1);   % price
    x = HousePrices(:,2);   % sqft

    T = length(x);
    X = [ones(T,1) x];      % column of ones for the intercept

    bhat = inv(X'*X)*X'*y
olsHousePrices3.m
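Equation (3) can also be evaluated without forming the inverse explicitly. The sketch below is not from the slides. It compares the textbook formula with Octave's backslash operator, which solves the least-squares problem directly and is usually the numerically safer choice. The inline `data` matrix is copied from the house price table so the block is self-contained.

```octave
% House price data copied from the table: columns are price and sqft.
data = [199.9 1065; 228.0 1254; 235.0 1300; 285.0 1577; 239.0 1600;
        293.0 1750; 285.0 1800; 365.0 1870; 295.0 1935; 290.0 1948;
        385.0 2254; 505.0 2600; 425.0 2800; 415.0 3000];
y = data(:,1);
x = data(:,2);

T = length(x);
X = [ones(T,1) x];

b_inv = inv(X'*X) * X' * y;   % equation (3) as written
b_ls  = X \ y;                % least-squares solve, no explicit inverse

disp([b_inv b_ls]);           % the two columns should agree
```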
Estimated y and residuals

$$\hat{y} = X\hat{\beta}$$
$$\hat{e} = y - \hat{y}$$

    bhat = inv(X'*X)*X'*y;
    yhat = X*bhat;           % fitted values
    ehat = y - X*bhat;       % residuals
olsHousePrices4.m

Plot of $y$, $\hat{y}$ vs $x$

Plot of residuals

Sum of Squares

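$$TSS = \sum (y_i - \bar{y})^2$$
$$ESS = \sum (\hat{y}_i - \bar{y})^2$$
$$RSS = \sum \hat{e}_i^2$$

    bhat = inv(X'*X)*X'*y;
    yhat = X*bhat;
    ehat = y - yhat;

    TSS = sumsq(y - mean(y))          % total sum of squares
    ESS = sumsq(yhat - mean(yhat))    % explained sum of squares
    RSS = sumsq(ehat)                 % residual sum of squares
olsHousePrices5.m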
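When the model contains an intercept, the three sums satisfy TSS = ESS + RSS. The sketch below checks this numerically on the house price data. It is an illustration rather than one of the course scripts, and it types the data in from the table so that it runs on its own.

```octave
% House price data copied from the table: columns are price and sqft.
data = [199.9 1065; 228.0 1254; 235.0 1300; 285.0 1577; 239.0 1600;
        293.0 1750; 285.0 1800; 365.0 1870; 295.0 1935; 290.0 1948;
        385.0 2254; 505.0 2600; 425.0 2800; 415.0 3000];
y = data(:,1);
X = [ones(rows(data),1) data(:,2)];

bhat = X \ y;          % OLS estimates
yhat = X*bhat;         % fitted values
ehat = y - yhat;       % residuals

TSS = sumsq(y - mean(y));
ESS = sumsq(yhat - mean(yhat));
RSS = sumsq(ehat);

printf("TSS = %.4f, ESS + RSS = %.4f\n", TSS, ESS + RSS);
printf("R2  = %.4f\n", 1 - RSS/TSS);
```

The printed R2 should match the Multiple R-squared of 0.8205 reported by R.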
$R^2$

$$R^2 = 1 - \frac{RSS}{TSS}$$

    TSS = sumsq(y - mean(y));
    ESS = sumsq(yhat - mean(yhat));
    RSS = sumsq(ehat);
    K   = size(X, 2) - 1;    % number of regressors, excluding the intercept

    R2    = 1 - RSS/TSS
    R2adj = 1 - (RSS/TSS) * ((T-1)/(T-K-1))
olsHousePrices6.m
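As a worked check of the adjusted $R^2$ line, take the values from the R output for the house price regression: $R^2 = 0.8205$, $T = 14$ observations and $K = 1$ regressor besides the intercept (so $T - K - 1 = 12$ degrees of freedom). Since $RSS/TSS = 1 - R^2$:

$$\bar{R}^2 = 1 - (1 - R^2)\,\frac{T-1}{T-K-1} = 1 - (1 - 0.8205)\cdot\frac{13}{12} \approx 0.806$$

This agrees with the Adjusted R-squared of 0.8056 reported by R, up to rounding of the inputs.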

$\hat{\beta}$ has a distribution

What is its mean?
What is its variance?
What is the shape of its distribution?

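Monte Carlo Simulation of the distribution of $\hat{\beta}$

    T = 100;     % sample size
    N = 1000;    % number of Monte Carlo replications

    beta = [1; 1];                             % true parameters (column vector)
    X = [ ones(T, 1), normrnd(0, 1, T, 1) ];   % normrnd is in the statistics package in recent Octave versions
    bhat1 = zeros(N, 1);

    for (i = 1:N)
      u = normrnd(0, 2, T, 1);                 % errors with standard deviation 2
      y = X*beta + u;
      bhat = inv(X'*X) * X' * y;
      bhat1(i) = bhat(2);                      % keep the slope estimate
    end

    [min(bhat1) max(bhat1) mean(bhat1) var(bhat1)]
olsMC1.m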
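One way to read the simulation output is to set it against the textbook variance $\sigma^2 (X'X)^{-1}$ for a fixed $X$. The sketch below is a variant of olsMC1.m, not the original script. It uses `randn` instead of `normrnd` so that no extra package is needed, fixes the random state for reproducibility, and introduces the names `sigma`, `XtXinv` and `var_theory` for illustration.

```octave
randn("state", 42);          % fix the random state (illustrative choice)

T = 100;                     % sample size
N = 1000;                    % Monte Carlo replications
sigma = 2;                   % standard deviation of the error term
beta = [1; 1];               % true parameters

X = [ones(T,1) randn(T,1)];  % X is drawn once and then held fixed
XtXinv = inv(X'*X);
bhat1 = zeros(N,1);

for i = 1:N
  u = sigma * randn(T,1);
  y = X*beta + u;
  bhat = XtXinv * X' * y;
  bhat1(i) = bhat(2);        % slope estimate of replication i
end

var_theory = sigma^2 * XtXinv(2,2);   % theoretical variance of the slope

printf("mean of slope estimates: %.4f (true value 1)\n", mean(bhat1));
printf("simulated variance:      %.5f\n", var(bhat1));
printf("theoretical variance:    %.5f\n", var_theory);
```

With 1000 replications the simulated and theoretical variances should be close but not identical, since the simulated value is itself an estimate.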
Distribution of $\hat{\beta}$

$$y = 1 + x + u$$

What theory tells us

Variance of $\hat{\beta}$ is lower when

1 Sample size is larger
2 Variance of explanatory variables is larger
3 Variance of error term is smaller
4 Fewer variables are omitted
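The first three items can be read off the standard variance formula, stated here under the classical assumptions (fixed regressors, homoskedastic and uncorrelated errors with variance $\sigma^2$), which the slides do not spell out:

$$\mathrm{Var}(\hat{\beta}) = \sigma^2 (X'X)^{-1}$$

For the slope of a simple regression with $T$ observations this becomes

$$\mathrm{Var}(\hat{\beta}_1) = \frac{\sigma^2}{\sum_{t=1}^{T} (x_t - \bar{x})^2} = \frac{\sigma^2}{T\, s_x^2}, \qquad s_x^2 = \frac{1}{T}\sum_{t=1}^{T} (x_t - \bar{x})^2$$

so the variance falls as $T$ grows, as the spread $s_x^2$ of the explanatory variable grows, and as the error variance $\sigma^2$ shrinks.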

Serial autocorrelation

$$y_t = X_t \beta + u_t$$
$$u_t = \phi u_{t-1} + e_t$$

    T = 100;
    phi = 0.95;

    u = zeros(T, 1);
    u(1) = randn;

    for (t=2:T)
      u(t) = phi*u(t-1) + randn;    % AR(1) recursion
    end
DGP1.m
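The AR(1) recursion can also be written without a loop. The sketch below is an alternative to DGP1.m, not the original script. It draws the innovations once, runs the same loop, and compares the result with Octave's `filter` function, which implements the same difference equation. The names `e` and `u_filt` are illustrative.

```octave
randn("state", 1);            % fix the random state (illustrative choice)

T = 100;
phi = 0.95;
e = randn(T,1);               % innovations e_t

% Loop version, as in the recursion u_t = phi*u_{t-1} + e_t.
u = zeros(T,1);
u(1) = e(1);
for t = 2:T
  u(t) = phi*u(t-1) + e(t);
end

% Vectorised version: filter(b, a, x) with b = 1 and a = [1 -phi]
% computes u(t) = e(t) + phi*u(t-1), starting from a zero initial state.
u_filt = filter(1, [1 -phi], e);

printf("max abs difference: %g\n", max(abs(u - u_filt)));
```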
Distribution of $\hat{\beta}$ with auto-correlated errors

$$y_t = X_t \beta + u_t$$
$$u_t = \phi u_{t-1} + e_t$$

MCDGP2.m
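The listing of MCDGP2.m is not reproduced in the slides, so the following is only a sketch of what such an experiment might look like, combining the Monte Carlo loop with the AR(1) error process. It repeats the simulation for uncorrelated errors ($\phi = 0$) and for $\phi = 0.95$, so the two distributions of the slope estimate can be compared. The names `phis` and `bhat1` and the use of `filter` are assumptions of this sketch.

```octave
randn("state", 7);             % fix the random state (illustrative choice)

T = 100;                       % sample size
N = 1000;                      % Monte Carlo replications
beta = [1; 1];                 % true parameters
phis = [0 0.95];               % no autocorrelation vs. strong autocorrelation

X = [ones(T,1) randn(T,1)];    % regressors, held fixed across replications
bhat1 = zeros(N, 2);           % one column of slope estimates per phi

for j = 1:2
  phi = phis(j);
  for i = 1:N
    u = filter(1, [1 -phi], randn(T,1));   % u_t = phi*u_{t-1} + e_t
    y = X*beta + u;
    b = X \ y;                             % OLS estimates
    bhat1(i,j) = b(2);
  end
end

printf("phi = %.2f: mean = %.4f, variance = %.5f\n", ...
       [phis; mean(bhat1); var(bhat1)]);
```

In both cases the mean of the slope estimates should stay close to the true value 1, while the variances will in general differ between the two settings.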
Durbin-Watson test

$$d = \frac{\sum_{t=2}^{T} (e_t - e_{t-1})^2}{\sum_{t=1}^{T} e_t^2}$$

MCDGP3.m
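The statistic is straightforward to compute from a vector of residuals. The sketch below is not MCDGP3.m, whose listing is not shown. It defines a hypothetical helper `dw` and, for illustration only, applies it directly to a white-noise series and to an AR(1) series rather than to regression residuals.

```octave
% Durbin-Watson statistic: squared successive differences over squared levels.
dw = @(e) sumsq(diff(e)) / sumsq(e);

randn("state", 3);             % fix the random state (illustrative choice)
T = 100;

e_white = randn(T,1);                          % no autocorrelation
e_ar    = filter(1, [1 -0.95], randn(T,1));    % AR(1) with phi = 0.95

printf("d (white noise) = %.3f\n", dw(e_white));
printf("d (AR(1))       = %.3f\n", dw(e_ar));
```

Since $d \approx 2(1 - \hat{\rho})$, where $\hat{\rho}$ is the first-order sample autocorrelation, values near 2 point to no first-order autocorrelation and values near 0 to strong positive autocorrelation.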