Econometrics - Summary
Gabriel F. Ferraz
September 2020
Chapter 2
y = β0 + β1 x + u (1)
• y → dependent variable
• x → independent variable
• β0 → intercept parameter
• β1 → slope parameter
• u → error term
E(u) = 0 (2)
Crucial assumption:
E(u|x) = E(u) (3)
Substituting (2) into (3) gives the zero conditional mean assumption:
E(u|x) = E(u) = 0 (4)
Population regression function (PRF):
E(y|x) = β0 + β1 x (5)
Let {(xi, yi): i = 1, 2, ..., n} denote a random sample of size n from the population:
yi = β0 + β1 xi + ui (6)

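The zero conditional mean assumption implies E(u) = 0 and Cov(u, x) = E(ux) = 0. Writing u = y − β0 − β1 x:
$$E(y - \beta_0 - \beta_1 x) = 0$$
$$E[(y - \beta_0 - \beta_1 x)\,x] = 0$$
Replacing each population moment with its sample analogue, the first condition gives
$$\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat\beta_0 - \hat\beta_1 x_i) = 0 \implies \bar y = \hat\beta_0 + \hat\beta_1 \bar x \qquad (7)$$
and the second gives
$$\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat\beta_0 - \hat\beta_1 x_i)\,x_i = 0$$
Substituting $\hat\beta_0 = \bar y - \hat\beta_1 \bar x$ from (7):
$$\frac{1}{n}\sum_{i=1}^{n}(y_i - \bar y + \hat\beta_1 \bar x - \hat\beta_1 x_i)\,x_i = 0 \implies \frac{\sum_{i=1}^{n}(y_i - \bar y)x_i}{n} - \hat\beta_1 \frac{\sum_{i=1}^{n}(x_i - \bar x)x_i}{n} = 0$$
$$\hat\beta_1 \sum_{i=1}^{n}(x_i - \bar x)x_i = \sum_{i=1}^{n}(y_i - \bar y)x_i \qquad (8)$$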
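We know that:
$$\sum_{i=1}^{n}(x_i - \bar x)(y_i - \bar y) = \sum_{i=1}^{n}(x_i y_i - x_i \bar y - \bar x y_i + \bar x \bar y) = \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \bar y - \sum_{i=1}^{n} \bar x y_i + \sum_{i=1}^{n} \bar x \bar y$$
$$= \sum_{i=1}^{n} x_i y_i - n\bar x \bar y - n\bar x \bar y + n\bar x \bar y = \sum_{i=1}^{n} x_i y_i - n\bar x \bar y = \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \bar y = \sum_{i=1}^{n}(x_i y_i - x_i \bar y) = \sum_{i=1}^{n}(y_i - \bar y)x_i$$
$$\sum_{i=1}^{n}(y_i - \bar y)x_i = \sum_{i=1}^{n}(x_i - \bar x)(y_i - \bar y) \qquad (9)$$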
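Also:
$$\sum_{i=1}^{n}(x_i - \bar x)^2 = \sum_{i=1}^{n}(x_i^2 - 2x_i \bar x + \bar x^2) = \sum_{i=1}^{n} x_i^2 - 2\sum_{i=1}^{n} x_i \bar x + \sum_{i=1}^{n} \bar x^2 = \sum_{i=1}^{n} x_i^2 - 2n\bar x^2 + n\bar x^2$$
$$= \sum_{i=1}^{n} x_i^2 - n\bar x^2 = \sum_{i=1}^{n} x_i^2 - (n\bar x)\cdot \bar x = \sum_{i=1}^{n} x_i^2 - \sum_{i=1}^{n} x_i \cdot \bar x = \sum_{i=1}^{n}(x_i \cdot x_i - x_i \cdot \bar x) = \sum_{i=1}^{n} x_i (x_i - \bar x)$$
$$\sum_{i=1}^{n}(x_i - \bar x)x_i = \sum_{i=1}^{n}(x_i - \bar x)^2 \qquad (10)$$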
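Substituting (9) and (10) into (8), we have:
$$\hat\beta_1 = \frac{\sum_{i=1}^{n}(x_i - \bar x)(y_i - \bar y)}{\sum_{i=1}^{n}(x_i - \bar x)^2} \qquad (11)$$
Dividing numerator and denominator by n:
$$\hat\beta_1 = \frac{\sum_{i=1}^{n}(x_i - \bar x)(y_i - \bar y)/n}{\sum_{i=1}^{n}(x_i - \bar x)^2/n} = \frac{\widehat{\mathrm{cov}}[y, x]}{\widehat{\mathrm{var}}[x]}$$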
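The closed form in (11) can be checked numerically. The following is a minimal sketch in base R on simulated data; the sample size, the coefficients and the variable names are illustrative assumptions, not part of the notes.

```r
# Simulated data: every name and parameter value here is an illustrative assumption.
set.seed(1)
n <- 200
x <- rnorm(n, mean = 10, sd = 2)
u <- rnorm(n)
y <- 1 + 0.5 * x + u

# Slope from equation (11) and intercept from equation (7)
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b0 <- mean(y) - b1 * mean(x)

# The same slope as a ratio of sample covariance to sample variance
# (cov() and var() both divide by n - 1, so the ratio is unchanged)
b1_cov <- cov(x, y) / var(x)

# Reference fit with lm()
fit <- lm(y ~ x)
print(c(b0 = b0, b1 = b1))
print(coef(fit))
```

Both printed vectors should agree up to floating-point error, since lm() solves the same two moment conditions.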
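$\hat y_i = \hat\beta_0 + \hat\beta_1 x_i$ → sample regression function
The residual is $\hat u_i = y_i - \hat y_i = y_i - \hat\beta_0 - \hat\beta_1 x_i$:
• $\hat u_i > 0$ → the fitted line underpredicts $y_i$
• $\hat u_i < 0$ → the fitted line overpredicts $y_i$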
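Properties
$\sum_{i=1}^{n} \hat u_i = 0$ → the sum, and therefore the sample average, of the OLS residuals is zero
$\sum_{i=1}^{n} x_i \hat u_i = 0$ → the sample covariance between the regressor and the OLS residuals is zero
$\bar y = \hat\beta_0 + \hat\beta_1 \bar x$ → the point $(\bar x, \bar y)$ is always on the OLS regression line
To see the last property, sum $y_i = \hat y_i + \hat u_i$ over the sample:
$$\sum_{i=1}^{n} y_i = \sum_{i=1}^{n} \hat y_i + \sum_{i=1}^{n} \hat u_i = \sum_{i=1}^{n} \hat y_i = \sum_{i=1}^{n}(\hat\beta_0 + \hat\beta_1 x_i)$$
$$\frac{\sum_{i=1}^{n} y_i}{n} = \frac{\sum_{i=1}^{n}(\hat\beta_0 + \hat\beta_1 x_i)}{n} \implies \bar y = \hat\beta_0 + \hat\beta_1 \bar x$$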
Total sum of squares → $SST \equiv \sum_{i=1}^{n}(y_i - \bar y)^2$
Explained sum of squares → $SSE \equiv \sum_{i=1}^{n}(\hat y_i - \bar y)^2$
Residual sum of squares → $SSR \equiv \sum_{i=1}^{n} \hat u_i^2$
∴ SST = SSE + SSR
$$R^2 = \frac{SSE}{SST} = 1 - \frac{SSR}{SST} \qquad (12)$$
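The residual properties and the decomposition SST = SSE + SSR can be verified on any fitted simple regression. A minimal sketch on simulated data (all names and values below are assumptions made for illustration):

```r
# Simulated data: illustrative assumptions only.
set.seed(2)
n <- 100
x <- runif(n, 0, 10)
y <- 2 + 3 * x + rnorm(n, sd = 4)

fit   <- lm(y ~ x)
u_hat <- resid(fit)    # OLS residuals
y_hat <- fitted(fit)   # fitted values

sum_u  <- sum(u_hat)       # property 1: should be numerically zero
sum_xu <- sum(x * u_hat)   # property 2: should be numerically zero

SST <- sum((y - mean(y))^2)
SSE <- sum((y_hat - mean(y))^2)
SSR <- sum(u_hat^2)
R2  <- SSE / SST           # equation (12)

print(c(sum_u = sum_u, sum_xu = sum_xu))
print(c(SST = SST, SSE_plus_SSR = SSE + SSR, R2 = R2))
```

The two sums are zero only up to floating-point error, which is why they are compared against a small tolerance rather than tested for exact equality.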
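Assumption 1: The model is linear in parameters → $y = \beta_0 + \beta_1 x + u$
Assumption 2: The sample is random → $\{(x_i, y_i): i = 1, 2, \dots, n\}$ with $y_i = \beta_0 + \beta_1 x_i + u_i$
Assumption 3: There is sample variation in the explanatory variable x → $\sum_{i=1}^{n}(x_i - \bar x)^2 > 0$
Assumption 4: Zero conditional mean → $E(u|x) = 0 \implies E(u_i|x_i) = 0$
Under Assumptions 1-4 we have:
$$E(\hat\beta_0) = \beta_0$$
$$E(\hat\beta_1) = \beta_1$$
Assumption 5: Homoskedasticity → $\mathrm{Var}(u_i|x_i) = \sigma^2 = \mathrm{Var}(u_i)$
$$\mathrm{Var}(\hat\beta_1) = \frac{\sigma^2}{SST_x}$$
$$\mathrm{Var}(\hat\beta_0) = \frac{\sigma^2 \sum_{i=1}^{n} x_i^2}{n \cdot SST_x}$$
where $SST_x = \sum_{i=1}^{n}(x_i - \bar x)^2$. The error variance is estimated by
$$\hat\sigma^2 = \frac{\sum_{i=1}^{n} \hat u_i^2}{n - 2}$$
which is unbiased:
$$E(\hat\sigma^2) = \sigma^2$$
In addition, the standard errors are obtained by replacing $\sigma^2$ with $\hat\sigma^2$ in the variance formulas:
$$se(\hat\beta_0) = \sqrt{\widehat{\mathrm{Var}}(\hat\beta_0)}$$
$$se(\hat\beta_1) = \sqrt{\widehat{\mathrm{Var}}(\hat\beta_1)}$$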
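The variance formulas can be turned into standard errors by hand and compared with what summary() reports. This is a sketch on simulated data under the homoskedasticity assumption; names and parameter values are hypothetical.

```r
# Simulated homoskedastic data: illustrative assumptions only.
set.seed(3)
n <- 150
x <- rnorm(n, mean = 5, sd = 2)
y <- 1 + 2 * x + rnorm(n, sd = 3)
fit <- lm(y ~ x)

# Error variance estimate with n - 2 degrees of freedom
sigma2_hat <- sum(resid(fit)^2) / (n - 2)
SST_x <- sum((x - mean(x))^2)

# Standard errors from the variance formulas, with sigma^2 replaced by its estimate
se_b1 <- sqrt(sigma2_hat / SST_x)
se_b0 <- sqrt(sigma2_hat * sum(x^2) / (n * SST_x))

# Standard errors reported by lm()
se_lm <- summary(fit)$coefficients[, "Std. Error"]
print(rbind(manual = c(se_b0, se_b1), lm = unname(se_lm)))
```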
Example:
Data set "salario"
reg <- lm(salariom ~ educ, data = salario, na.action = na.exclude)
See the summary in Table 1.
Table 1:
                       Dependent variable: salariom
educ                   180.674*** (1.339)
Constant               0.830 (13.313)
Observations           161,092
R2                     0.101
Adjusted R2            0.101
Residual Std. Error    2,308.464 (df = 161090)
F Statistic            18,196.630*** (df = 1; 161090)
Note: *p<0.1; **p<0.05; ***p<0.01. Standard errors in parentheses.
Interpretation (level-level): on average, each additional year of schooling raises the wage by 180.67 reais.
reg <- lm(lsalario_h ~ educ, data = salario, na.action = na.exclude)
Interpretation (log-level): on average, each additional year of schooling is associated with an hourly wage that is about 8.9% higher (Table 2).
reg <- lm(lsalario_h ~ log(idade), data = salario, na.action = na.exclude)
Interpretation (log-log): the elasticity of the hourly wage with respect to age is about 0.358, i.e. a 1% increase in age is associated, on average, with a 0.358% higher hourly wage (Table 3).
Table 2:
                       Dependent variable: lsalario_h
educ                   0.089*** (0.0005)
Constant               1.154*** (0.005)
R2                     0.192
Adjusted R2            0.192
Residual Std. Error    0.769 (df = 151932)
F Statistic            36,038.180*** (df = 1; 151932)
Note: *p<0.1; **p<0.05; ***p<0.01. Standard errors in parentheses.
Table 3:
                       Dependent variable: lsalario_h
log(idade)             0.358*** (0.007)
Constant               0.686*** (0.024)
R2                     0.019
Adjusted R2            0.019
Residual Std. Error    0.846 (df = 152357)
F Statistic            2,946.150*** (df = 1; 152357)
Note: *p<0.1; **p<0.05; ***p<0.01. Standard errors in parentheses.
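The percentage readings of the log-level and log-log coefficients are approximations. The short sketch below takes the two point estimates reported in Tables 2 and 3 and computes the exact implied changes; the 10% change in age is an arbitrary illustration, not something stated in the notes.

```r
# Coefficients copied from Tables 2 and 3.
b_educ <- 0.089    # log-level: lsalario_h on educ
b_age  <- 0.358    # log-log:   lsalario_h on log(idade)

# Log-level: approximate vs exact percentage change for one more year of schooling
approx_pct <- 100 * b_educ
exact_pct  <- 100 * (exp(b_educ) - 1)

# Log-log: exact percentage change in the hourly wage for a 10% increase in age
# (10% is a hypothetical choice made for illustration)
pct_for_10 <- 100 * (1.10^b_age - 1)

print(round(c(approx = approx_pct, exact = exact_pct, age_10pct = pct_for_10), 2))
```

The exact log-level effect is roughly 9.3% rather than 8.9%, and a 10% increase in age corresponds to roughly 3.5% higher hourly wages. The gap between approximate and exact values grows with the size of the coefficient.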
Chapter 3
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$$
• y → dependent variable
• x1, ..., xk → independent variables
• β0 → intercept parameter
• β1, ..., βk → slope parameters
• u → error term
Key assumption:
$$E(u|x_1, x_2, \dots, x_k) = 0$$
At a minimum, the equation above requires that all factors in the unobserved error term be uncorrelated with the explanatory variables.
Definition of the residual in a multiple regression:
$$\hat u_i = y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2} - \dots - \hat\beta_k x_{ik}$$
The OLS method chooses the k + 1 parameter estimates of the econometric model so as to minimize the sum of squared residuals:
$$\min_{\hat\beta_0, \hat\beta_1, \hat\beta_2, \dots, \hat\beta_k} \sum_{i=1}^{n} \hat u_i^2$$
Case of two regressors (three parameters):
$$\hat u_i = y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2}$$

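$$\min \sum_{i=1}^{n} \hat u_i^2 = \sum_{i=1}^{n}(y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2})^2$$
First-order conditions (F.O.C.):
$$[\hat\beta_0]: \; 2\sum_{i=1}^{n}(y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2})(-1) = 0 \qquad (*)$$
$$[\hat\beta_1]: \; 2\sum_{i=1}^{n}(y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2})(-x_{i1}) = 0 \qquad (**)$$
$$[\hat\beta_2]: \; 2\sum_{i=1}^{n}(y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2})(-x_{i2}) = 0 \qquad (***)$$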
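From (*):
$$-2\sum_{i=1}^{n}(y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2}) = 0 \implies \sum_{i=1}^{n}(y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2}) = 0$$
$$\sum_{i=1}^{n} y_i - n\hat\beta_0 - \hat\beta_1 \sum_{i=1}^{n} x_{i1} - \hat\beta_2 \sum_{i=1}^{n} x_{i2} = 0 \;\xrightarrow{\;\div n\;}\; \bar y - \hat\beta_0 - \hat\beta_1 \bar x_1 - \hat\beta_2 \bar x_2 = 0$$
$$\hat\beta_0 = \bar y - \hat\beta_1 \bar x_1 - \hat\beta_2 \bar x_2 \qquad (1)$$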

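Substituting (1) into (**):
$$-2\sum_{i=1}^{n} x_{i1}\left(y_i - (\bar y - \hat\beta_1 \bar x_1 - \hat\beta_2 \bar x_2) - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2}\right) = 0 \implies \sum_{i=1}^{n} x_{i1}(y_i - \bar y + \hat\beta_1 \bar x_1 + \hat\beta_2 \bar x_2 - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2}) = 0$$
$$\sum_{i=1}^{n} x_{i1}\left[(y_i - \bar y) - \hat\beta_1 (x_{i1} - \bar x_1) - \hat\beta_2 (x_{i2} - \bar x_2)\right] = 0$$
$$\sum_{i=1}^{n} x_{i1}(y_i - \bar y) = \hat\beta_1 \sum_{i=1}^{n} x_{i1}(x_{i1} - \bar x_1) + \hat\beta_2 \sum_{i=1}^{n} x_{i1}(x_{i2} - \bar x_2) \qquad (**)'$$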
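Symmetrically for (***):
$$\sum_{i=1}^{n} x_{i2}\left[(y_i - \bar y) - \hat\beta_1 (x_{i1} - \bar x_1) - \hat\beta_2 (x_{i2} - \bar x_2)\right] = 0$$
$$\sum_{i=1}^{n} x_{i2}(y_i - \bar y) = \hat\beta_1 \sum_{i=1}^{n} x_{i2}(x_{i1} - \bar x_1) + \hat\beta_2 \sum_{i=1}^{n} x_{i2}(x_{i2} - \bar x_2) \qquad (***)'$$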
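Writing (**)' and (***)' as a linear system Ax = b:
$$\begin{pmatrix} \sum_{i=1}^{n} x_{i1}(x_{i1} - \bar x_1) & \sum_{i=1}^{n} x_{i1}(x_{i2} - \bar x_2) \\ \sum_{i=1}^{n} x_{i2}(x_{i1} - \bar x_1) & \sum_{i=1}^{n} x_{i2}(x_{i2} - \bar x_2) \end{pmatrix} \begin{pmatrix} \hat\beta_1 \\ \hat\beta_2 \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^{n} x_{i1}(y_i - \bar y) \\ \sum_{i=1}^{n} x_{i2}(y_i - \bar y) \end{pmatrix}$$
$$A\hat\beta = b$$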
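Recall:
$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$$
$$\det A = a_{11}a_{22} - a_{12}a_{21}$$
$$A\hat\beta = b \implies \hat\beta = A^{-1}b, \quad \text{if } \det A \neq 0$$
$$\hat\beta = \frac{1}{\det A}\begin{pmatrix} (-1)^{1+1}a_{22} & (-1)^{1+2}a_{12} \\ (-1)^{2+1}a_{21} & (-1)^{2+2}a_{11} \end{pmatrix} b$$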
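$$\hat\beta = \frac{1}{\det A}\begin{pmatrix} (-1)^{2}\sum_{i=1}^{n} x_{i2}(x_{i2} - \bar x_2) & (-1)^{3}\sum_{i=1}^{n} x_{i1}(x_{i2} - \bar x_2) \\ (-1)^{3}\sum_{i=1}^{n} x_{i2}(x_{i1} - \bar x_1) & (-1)^{4}\sum_{i=1}^{n} x_{i1}(x_{i1} - \bar x_1) \end{pmatrix} \begin{pmatrix} \sum_{i=1}^{n} x_{i1}(y_i - \bar y) \\ \sum_{i=1}^{n} x_{i2}(y_i - \bar y) \end{pmatrix}$$
$$= \frac{1}{\det A}\begin{pmatrix} \sum_{i=1}^{n} x_{i2}(x_{i2} - \bar x_2) & -\sum_{i=1}^{n} x_{i1}(x_{i2} - \bar x_2) \\ -\sum_{i=1}^{n} x_{i2}(x_{i1} - \bar x_1) & \sum_{i=1}^{n} x_{i1}(x_{i1} - \bar x_1) \end{pmatrix} \begin{pmatrix} \sum_{i=1}^{n} x_{i1}(y_i - \bar y) \\ \sum_{i=1}^{n} x_{i2}(y_i - \bar y) \end{pmatrix}$$
$$\hat\beta_1 = \frac{\sum_{i=1}^{n} x_{i2}(x_{i2} - \bar x_2)\sum_{i=1}^{n} x_{i1}(y_i - \bar y) - \sum_{i=1}^{n} x_{i1}(x_{i2} - \bar x_2)\sum_{i=1}^{n} x_{i2}(y_i - \bar y)}{\sum_{i=1}^{n} x_{i1}(x_{i1} - \bar x_1)\sum_{i=1}^{n} x_{i2}(x_{i2} - \bar x_2) - \sum_{i=1}^{n} x_{i1}(x_{i2} - \bar x_2)\sum_{i=1}^{n} x_{i2}(x_{i1} - \bar x_1)}$$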
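Since
$$\sum_{i=1}^{n} x_{i1}(y_i - \bar y) = \sum_{i=1}^{n}(x_{i1} - \bar x_1)(y_i - \bar y)$$
$$\sum_{i=1}^{n} x_{i1}(x_{i1} - \bar x_1) = \sum_{i=1}^{n}(x_{i1} - \bar x_1)^2$$
$$\sum_{i=1}^{n} x_{i1}(x_{i2} - \bar x_2) = \sum_{i=1}^{n}(x_{i1} - \bar x_1)(x_{i2} - \bar x_2)$$
and the same identities hold with the roles of $x_1$ and $x_2$ exchanged.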
Therefore
$$\hat\beta_1 = \frac{\sum_{i=1}^{n}(x_{i2} - \bar x_2)^2 \sum_{i=1}^{n}(x_{i1} - \bar x_1)(y_i - \bar y) - \sum_{i=1}^{n}(x_{i1} - \bar x_1)(x_{i2} - \bar x_2)\sum_{i=1}^{n}(x_{i2} - \bar x_2)(y_i - \bar y)}{\sum_{i=1}^{n}(x_{i1} - \bar x_1)^2 \sum_{i=1}^{n}(x_{i2} - \bar x_2)^2 - \left[\sum_{i=1}^{n}(x_{i1} - \bar x_1)(x_{i2} - \bar x_2)\right]^2}$$
Dividing every sum by n turns each term into a sample variance or covariance:
$$\hat\beta_1 = \frac{\widehat{\mathrm{cov}}[y, x_1]\,\widehat{\mathrm{var}}[x_2] - \widehat{\mathrm{cov}}[x_1, x_2]\,\widehat{\mathrm{cov}}[y, x_2]}{\widehat{\mathrm{var}}[x_1]\,\widehat{\mathrm{var}}[x_2] - \left[\widehat{\mathrm{cov}}[x_1, x_2]\right]^2}$$
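Now for $\hat\beta_2$:
$$\hat\beta_2 = \frac{-\sum_{i=1}^{n} x_{i2}(x_{i1} - \bar x_1)\sum_{i=1}^{n} x_{i1}(y_i - \bar y) + \sum_{i=1}^{n} x_{i1}(x_{i1} - \bar x_1)\sum_{i=1}^{n} x_{i2}(y_i - \bar y)}{\sum_{i=1}^{n} x_{i1}(x_{i1} - \bar x_1)\sum_{i=1}^{n} x_{i2}(x_{i2} - \bar x_2) - \sum_{i=1}^{n} x_{i1}(x_{i2} - \bar x_2)\sum_{i=1}^{n} x_{i2}(x_{i1} - \bar x_1)}$$
$$\hat\beta_2 = \frac{-\sum_{i=1}^{n}(x_{i2} - \bar x_2)(x_{i1} - \bar x_1)\sum_{i=1}^{n}(x_{i1} - \bar x_1)(y_i - \bar y) + \sum_{i=1}^{n}(x_{i1} - \bar x_1)^2 \sum_{i=1}^{n}(x_{i2} - \bar x_2)(y_i - \bar y)}{\sum_{i=1}^{n}(x_{i1} - \bar x_1)^2 \sum_{i=1}^{n}(x_{i2} - \bar x_2)^2 - \left[\sum_{i=1}^{n}(x_{i1} - \bar x_1)(x_{i2} - \bar x_2)\right]^2}$$
Dividing every sum by n:
$$\hat\beta_2 = \frac{\widehat{\mathrm{cov}}[y, x_2]\,\widehat{\mathrm{var}}[x_1] - \widehat{\mathrm{cov}}[x_1, x_2]\,\widehat{\mathrm{cov}}[y, x_1]}{\widehat{\mathrm{var}}[x_1]\,\widehat{\mathrm{var}}[x_2] - \left[\widehat{\mathrm{cov}}[x_1, x_2]\right]^2}$$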
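The variance-covariance expressions for the two slopes can be checked against lm(). A minimal sketch on simulated data with correlated regressors; the data-generating values are assumptions chosen only for illustration.

```r
# Simulated data with correlated regressors: illustrative assumptions only.
set.seed(4)
n  <- 300
x2 <- rnorm(n)
x1 <- 0.6 * x2 + rnorm(n)
y  <- 1 + 2 * x1 - 1.5 * x2 + rnorm(n)

# Common denominator: var[x1] var[x2] - cov[x1, x2]^2
den <- var(x1) * var(x2) - cov(x1, x2)^2

# Slopes from the variance/covariance formulas
b1 <- (cov(y, x1) * var(x2) - cov(x1, x2) * cov(y, x2)) / den
b2 <- (cov(y, x2) * var(x1) - cov(x1, x2) * cov(y, x1)) / den

# Intercept from the first first-order condition
b0 <- mean(y) - b1 * mean(x1) - b2 * mean(x2)

fit <- lm(y ~ x1 + x2)
print(rbind(manual = c(b0, b1, b2), lm = unname(coef(fit))))
```

R's cov() and var() divide by n − 1 rather than n, but the factor cancels between numerator and denominator, so the estimates are unaffected.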
Predicted value:
$$\hat y_i = \hat\beta_0 + \hat\beta_1 x_{i1} + \hat\beta_2 x_{i2} + \dots + \hat\beta_k x_{ik}$$
where $\hat y_i$ is the fitted (or predicted) value.

Residual:
$$\hat u_i = y_i - \hat y_i$$
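Properties
$\sum_{i=1}^{n} \hat u_i = 0$ → deviations from the regression line sum to zero
$\sum_{i=1}^{n} x_{ij}\hat u_i = 0$ → the sample covariance between the residuals and each regressor is zero
$\bar y = \hat\beta_0 + \hat\beta_1 \bar x_1 + \hat\beta_2 \bar x_2 + \dots + \hat\beta_k \bar x_k$ → the sample averages of y and of the regressors lie on the regression line
To see the last property:
$$\hat u_i = y_i - \hat y_i \implies \sum_{i=1}^{n} \hat u_i = \sum_{i=1}^{n} y_i - \sum_{i=1}^{n} \hat y_i \implies \frac{\sum_{i=1}^{n} \hat u_i}{n} = \frac{\sum_{i=1}^{n} y_i}{n} - \frac{\sum_{i=1}^{n} \hat y_i}{n}$$
$$\bar{\hat u} = \bar y - \bar{\hat y}$$
$$\sum_{i=1}^{n} \hat u_i = 0 \implies \frac{\sum_{i=1}^{n} \hat u_i}{n} = \bar{\hat u} = 0$$
Hence $\bar{\hat y} = \bar y$. Since
$$\bar{\hat y} = \hat\beta_0 + \hat\beta_1 \bar x_1 + \hat\beta_2 \bar x_2 + \dots + \hat\beta_k \bar x_k$$
it follows that
$$\bar y = \hat\beta_0 + \hat\beta_1 \bar x_1 + \hat\beta_2 \bar x_2 + \dots + \hat\beta_k \bar x_k$$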
Consider the regression with two explanatory variables. We can use the residuals from an auxiliary simple regression of x1 on x2 to obtain $\hat\beta_1$ from a second simple regression → the "partialling out" interpretation of $\hat\beta_1$,
i.e. what is the relationship between y and x1 after the effect of x2 on x1 has been netted out?
Procedure:
• Regress the explanatory variable x1 on x2 and obtain the vector of residuals $\hat r_{i1}$ from that regression
• Regress y on the vector of residuals from step 1.
$$\hat\beta_1 = \frac{\sum_{i=1}^{n} \hat r_{i1} y_i}{\sum_{i=1}^{n} \hat r_{i1}^2}$$
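The two-step procedure above can be sketched directly in R. The data are simulated and every name and coefficient is a hypothetical choice; the point is only that the slope from the residual regression matches the slope on x1 in the full regression.

```r
# Simulated data: illustrative assumptions only.
set.seed(5)
n  <- 250
x2 <- rnorm(n)
x1 <- 1 + 0.8 * x2 + rnorm(n)
y  <- 0.5 + 1.2 * x1 + 0.7 * x2 + rnorm(n)

# Step 1: regress x1 on x2 and keep the residuals r_hat_i1
r1 <- resid(lm(x1 ~ x2))

# Step 2: slope of y on those residuals (the residuals have mean zero,
# so no intercept adjustment is needed in the ratio)
b1_partial <- sum(r1 * y) / sum(r1^2)

# Coefficient on x1 from the full two-regressor model
b1_full <- unname(coef(lm(y ~ x1 + x2))["x1"])

print(c(partialling_out = b1_partial, full_model = b1_full))
```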
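Show that
$$\hat\beta_1 = \frac{\sum_{i=1}^{n} \hat r_{i1} y_i}{\sum_{i=1}^{n} \hat r_{i1}^2} \qquad (*)$$
is equal to
$$\hat\beta_1 = \frac{\widehat{\mathrm{cov}}[y, x_1]\,\widehat{\mathrm{var}}[x_2] - \widehat{\mathrm{cov}}[x_1, x_2]\,\widehat{\mathrm{cov}}[y, x_2]}{\widehat{\mathrm{var}}[x_1]\,\widehat{\mathrm{var}}[x_2] - \left[\widehat{\mathrm{cov}}[x_1, x_2]\right]^2} \qquad (**)$$
Step 1: $\hat r_{i1} = x_{i1} - \hat x_{i1}$, where $\hat x_{i1} = \hat\alpha_0 + \hat\alpha_1 x_{i2}$ and
$$\hat\alpha_0 = \bar x_1 - \hat\alpha_1 \bar x_2$$
Substituting:
$$\hat r_{i1} = x_{i1} - \hat x_{i1} = x_{i1} - \hat\alpha_0 - \hat\alpha_1 x_{i2} = x_{i1} - (\bar x_1 - \hat\alpha_1 \bar x_2) - \hat\alpha_1 x_{i2}$$
$$\hat r_{i1} = (x_{i1} - \bar x_1) - \hat\alpha_1 (x_{i2} - \bar x_2)$$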

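Substituting into (*):
$$\hat\beta_1 = \frac{\sum_{i=1}^{n}\left[(x_{i1} - \bar x_1) - \hat\alpha_1 (x_{i2} - \bar x_2)\right] y_i}{\sum_{i=1}^{n} \hat r_{i1}^2} = \frac{\sum_{i=1}^{n}(x_{i1} - \bar x_1) y_i - \hat\alpha_1 \sum_{i=1}^{n}(x_{i2} - \bar x_2) y_i}{\sum_{i=1}^{n} \hat r_{i1}^2}$$
Using the simple-regression slope $\hat\alpha_1 = \sum_{i=1}^{n}(x_{i1} - \bar x_1)(x_{i2} - \bar x_2) / \sum_{i=1}^{n}(x_{i2} - \bar x_2)^2$:
$$\hat\beta_1 = \frac{\sum_{i=1}^{n}(x_{i1} - \bar x_1) y_i - \dfrac{\sum_{i=1}^{n}(x_{i1} - \bar x_1)(x_{i2} - \bar x_2)}{\sum_{i=1}^{n}(x_{i2} - \bar x_2)^2}\sum_{i=1}^{n}(x_{i2} - \bar x_2) y_i}{\sum_{i=1}^{n} \hat r_{i1}^2}$$
Multiplying numerator and denominator by $\sum_{i=1}^{n}(x_{i2} - \bar x_2)^2$ and using $\sum_{i=1}^{n}(x_{ij} - \bar x_j) y_i = \sum_{i=1}^{n}(x_{ij} - \bar x_j)(y_i - \bar y)$:
$$\hat\beta_1 = \frac{\sum_{i=1}^{n}(x_{i2} - \bar x_2)^2 \sum_{i=1}^{n}(x_{i1} - \bar x_1)(y_i - \bar y) - \sum_{i=1}^{n}(x_{i1} - \bar x_1)(x_{i2} - \bar x_2)\sum_{i=1}^{n}(x_{i2} - \bar x_2)(y_i - \bar y)}{\sum_{i=1}^{n}(x_{i2} - \bar x_2)^2 \sum_{i=1}^{n} \hat r_{i1}^2}$$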
Working out the denominator. For brevity write $S_{11} = \sum_{i=1}^{n}(x_{i1} - \bar x_1)^2$, $S_{22} = \sum_{i=1}^{n}(x_{i2} - \bar x_2)^2$ and $S_{12} = \sum_{i=1}^{n}(x_{i1} - \bar x_1)(x_{i2} - \bar x_2)$, so that $\hat\alpha_1 = S_{12}/S_{22}$. Then
$$S_{22}\sum_{i=1}^{n} \hat r_{i1}^2 = S_{22}\sum_{i=1}^{n}\left[(x_{i1} - \bar x_1) - \hat\alpha_1 (x_{i2} - \bar x_2)\right]^2$$
$$= S_{22}\left[S_{11} - 2\hat\alpha_1 S_{12} + \hat\alpha_1^2 S_{22}\right]$$
$$= S_{22}S_{11} - 2\frac{S_{12}}{S_{22}} S_{12} S_{22} + \left(\frac{S_{12}}{S_{22}}\right)^2 S_{22}^2$$
$$= S_{11}S_{22} - 2S_{12}^2 + S_{12}^2 = \sum_{i=1}^{n}(x_{i2} - \bar x_2)^2 \sum_{i=1}^{n}(x_{i1} - \bar x_1)^2 - \left[\sum_{i=1}^{n}(x_{i1} - \bar x_1)(x_{i2} - \bar x_2)\right]^2$$
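Putting everything together:
$$\hat\beta_1 = \frac{\sum_{i=1}^{n}(x_{i2} - \bar x_2)^2 \sum_{i=1}^{n}(x_{i1} - \bar x_1)(y_i - \bar y) - \sum_{i=1}^{n}(x_{i1} - \bar x_1)(x_{i2} - \bar x_2)\sum_{i=1}^{n}(x_{i2} - \bar x_2)(y_i - \bar y)}{\sum_{i=1}^{n}(x_{i2} - \bar x_2)^2 \sum_{i=1}^{n}(x_{i1} - \bar x_1)^2 - \left[\sum_{i=1}^{n}(x_{i1} - \bar x_1)(x_{i2} - \bar x_2)\right]^2}$$
Dividing every sum by n:
$$\hat\beta_1 = \frac{\widehat{\mathrm{cov}}[y, x_1]\,\widehat{\mathrm{var}}[x_2] - \widehat{\mathrm{cov}}[x_1, x_2]\,\widehat{\mathrm{cov}}[y, x_2]}{\widehat{\mathrm{var}}[x_1]\,\widehat{\mathrm{var}}[x_2] - \left[\widehat{\mathrm{cov}}[x_1, x_2]\right]^2}$$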
Decomposing the total variation of the explained variable (y):
$$SST = SSE + SSR$$
$$SST = \sum_{i=1}^{n}(y_i - \bar y)^2$$
$$SSE = \sum_{i=1}^{n}(\hat y_i - \bar y)^2$$
$$SSR = \sum_{i=1}^{n} \hat u_i^2$$
R-squared:
$$R^2 \equiv \frac{SSE}{SST} = 1 - \frac{SSR}{SST}$$
R2 never decreases when another variable is added → it is a poor tool for deciding whether a variable should be included in the model.
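$$\rho_{y,\hat y} = \frac{\mathrm{cov}[y, \hat y]}{\sigma_y \sigma_{\hat y}}$$
$$R^2 = (\rho_{y,\hat y})^2$$
$$R^2 = \frac{\left[\sum_{i=1}^{n}(y_i - \bar y)(\hat y_i - \bar{\hat y})\right]^2}{\sum_{i=1}^{n}(y_i - \bar y)^2 \sum_{i=1}^{n}(\hat y_i - \bar{\hat y})^2} = \frac{\left[\sum_{i=1}^{n}(y_i - \bar y)(\hat y_i - \bar{\hat y})\right]^2 / n^2}{\dfrac{\sum_{i=1}^{n}(y_i - \bar y)^2}{n} \cdot \dfrac{\sum_{i=1}^{n}(\hat y_i - \bar{\hat y})^2}{n}} = \frac{\widehat{\mathrm{cov}}[y, \hat y]^2}{\widehat{\mathrm{var}}[y]\,\widehat{\mathrm{var}}[\hat y]}$$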
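Show that $R^2 = (\rho_{y,\hat y})^2 = SSE/SST$.
Start from
$$(\rho_{y,\hat y})^2 = \frac{\left[\sum_{i=1}^{n}(y_i - \bar y)(\hat y_i - \bar{\hat y})\right]^2}{\sum_{i=1}^{n}(y_i - \bar y)^2 \sum_{i=1}^{n}(\hat y_i - \bar{\hat y})^2}$$
Since $y_i = \hat y_i + \hat u_i$:
$$(\rho_{y,\hat y})^2 = \frac{\left[\sum_{i=1}^{n}(\hat y_i + \hat u_i - \bar y)(\hat y_i - \bar{\hat y})\right]^2}{\sum_{i=1}^{n}(y_i - \bar y)^2 \sum_{i=1}^{n}(\hat y_i - \bar{\hat y})^2} = \frac{\left[\sum_{i=1}^{n}(\hat y_i - \bar y + \hat u_i)(\hat y_i - \bar{\hat y})\right]^2}{\sum_{i=1}^{n}(y_i - \bar y)^2 \sum_{i=1}^{n}(\hat y_i - \bar{\hat y})^2}$$
Since $\bar{\hat y} = \bar y$:
$$(\rho_{y,\hat y})^2 = \frac{\left[\sum_{i=1}^{n}(\hat y_i - \bar y)^2 + \sum_{i=1}^{n} \hat u_i (\hat y_i - \bar{\hat y})\right]^2}{\sum_{i=1}^{n}(y_i - \bar y)^2 \sum_{i=1}^{n}(\hat y_i - \bar{\hat y})^2}$$
Now
$$\sum_{i=1}^{n} \hat u_i (\hat y_i - \bar{\hat y}) = \sum_{i=1}^{n} \hat u_i \left[\hat\beta_1 (x_{i1} - \bar x_1) + \hat\beta_2 (x_{i2} - \bar x_2) + \dots + \hat\beta_k (x_{ik} - \bar x_k)\right]$$
$$= \hat\beta_1 \sum_{i=1}^{n} \hat u_i (x_{i1} - \bar x_1) + \hat\beta_2 \sum_{i=1}^{n} \hat u_i (x_{i2} - \bar x_2) + \dots + \hat\beta_k \sum_{i=1}^{n} \hat u_i (x_{ik} - \bar x_k)$$
By property 2, $\sum_{i=1}^{n} \hat u_i x_{ij} = 0$ for $j = 1, 2, \dots, k$, and by property 1, $\sum_{i=1}^{n} \hat u_i = 0$. It follows that $\sum_{i=1}^{n} \hat u_i (x_{ij} - \bar x_j) = 0$.
Therefore:
$$\sum_{i=1}^{n} \hat u_i (\hat y_i - \bar{\hat y}) = \hat\beta_1 \cdot 0 + \hat\beta_2 \cdot 0 + \dots + \hat\beta_k \cdot 0 = 0$$
So:
$$(\rho_{y,\hat y})^2 = \frac{\left[\sum_{i=1}^{n}(\hat y_i - \bar y)^2\right]^2}{\sum_{i=1}^{n}(y_i - \bar y)^2 \sum_{i=1}^{n}(\hat y_i - \bar y)^2} = \frac{\sum_{i=1}^{n}(\hat y_i - \bar y)^2}{\sum_{i=1}^{n}(y_i - \bar y)^2} = \frac{SSE}{SST} = R^2$$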
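The equality between R2 and the squared correlation of y with the fitted values, and the orthogonality step used in the proof, can be checked numerically. A sketch on simulated data (names and values are illustrative assumptions):

```r
# Simulated data: illustrative assumptions only.
set.seed(6)
n  <- 120
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + x1 - 2 * x2 + rnorm(n, sd = 2)

fit   <- lm(y ~ x1 + x2)
y_hat <- fitted(fit)

# R-squared as SSE / SST
R2_ratio <- sum((y_hat - mean(y))^2) / sum((y - mean(y))^2)

# R-squared as the squared sample correlation between y and y_hat
R2_corr <- cor(y, y_hat)^2

# Cross term that vanishes in the proof: sum of u_hat * (y_hat - mean(y_hat))
cross <- sum(resid(fit) * (y_hat - mean(y_hat)))

print(c(ratio = R2_ratio, corr_sq = R2_corr, cross_term = cross))
```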
Assumption 1: The model is linear in parameters → $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$
Assumption 2: The sample is random → $\{(x_{i1}, x_{i2}, \dots, x_{ik}, y_i): i = 1, 2, \dots, n\}$ with $y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_k x_{ik} + u_i$
Assumption 3: No perfect collinearity.
Perfect collinearity occurs when at least one independent variable is an exact linear combination of the other independent variables.
Assumption 4: Zero conditional mean → $E(u|x) = 0 \implies E(u_i|x_{i1}, x_{i2}, \dots, x_{ik}) = 0$
Endogenous variables → explanatory variables that are correlated with the error term
Exogenous variables → explanatory variables that are not correlated with the error term
Exogeneity is therefore a crucial assumption both for the causal interpretation of the regression and for the unbiasedness of the OLS estimators.
Theorem 1:
Under Assumptions 1-4 we have:
$$E(\hat\beta_j) = \beta_j, \quad j = 0, 1, \dots, k$$
Proof (matrix form)
$$y = X\beta + u$$
$$u = y - X\beta$$
To derive OLS we must find the value of β that minimizes the sum of squared residuals.
$$u'u = (y - X\beta)'(y - X\beta)$$
Note that the sum of squares of the elements of a vector is the product of its transpose with itself. The next step is simply to expand the product.
$$u'u = y'y - y'(X\beta) - (X\beta)'y + (X\beta)'(X\beta)$$
Since $y'(X\beta)$ is a scalar, it equals its transpose $(X\beta)'y$:
$$u'u = y'y - 2(X\beta)'y + (X\beta)'(X\beta)$$
$$u'u = y'y - 2(X\beta)'y + \beta'X'X\beta$$
To find the β that minimizes this expression, take the derivative with respect to β and set it equal to zero. This gives a stationary point, which is a minimum because X'X is positive definite when there is no perfect collinearity.
$$\frac{\partial u'u}{\partial \beta} = -2X'y + 2X'X\hat\beta = 0$$
$$X'X\hat\beta = X'y$$
$$(X'X)^{-1}X'X\hat\beta = (X'X)^{-1}X'y$$
$$\hat\beta = (X'X)^{-1}X'y$$
Substituting $y = X\beta + u$:
$$\hat\beta = (X'X)^{-1}X'(X\beta + u)$$
$$\hat\beta = (X'X)^{-1}X'X\beta + (X'X)^{-1}X'u$$
$$\hat\beta = I\beta + (X'X)^{-1}X'u$$
Taking expectations on both sides conditional on X, and using $E[u|X] = 0$:
$$E[\hat\beta|X] = \beta + (X'X)^{-1}X'E[u|X] = \beta \implies E[\hat\beta] = \beta$$
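The matrix formula can be evaluated directly with base R linear algebra. The sketch below uses simulated data (all names and values are assumptions) and solves the normal equations X'X β̂ = X'y instead of forming the inverse explicitly, which is a numerical design choice rather than something the notes prescribe.

```r
# Simulated data: illustrative assumptions only.
set.seed(7)
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 2 + 0.5 * x1 - x2 + rnorm(n)

# Design matrix with a column of ones for the intercept
X <- cbind(1, x1, x2)

# Solve the normal equations X'X b = X'y
beta_hat <- solve(t(X) %*% X, t(X) %*% y)

# Reference fit
fit <- lm(y ~ x1 + x2)
print(cbind(matrix_formula = as.vector(beta_hat), lm = unname(coef(fit))))
```

lm() itself relies on a QR decomposition rather than the normal equations, so agreement is up to floating-point error only.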
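Assumption 5: Homoskedasticity → $\mathrm{Var}[u_i|x_{i1}, x_{i2}, \dots, x_{ik}] = \sigma^2$
Notation: $\mathrm{Var}[u_i|x_i] = \sigma^2$, where $x_i = (x_{i1}, x_{i2}, \dots, x_{ik})$
Theorem 2:
Under Assumptions 1-5, the variance of the OLS slope estimators (j = 1, ..., k) is
$$\mathrm{Var}[\hat\beta_j] = \frac{\sigma^2}{SST_j (1 - R_j^2)}$$
where $SST_j = \sum_{i=1}^{n}(x_{ij} - \bar x_j)^2$ and $R_j^2$ is the R-squared from regressing $x_j$ on all the other independent variables.
The error variance can be estimated by
$$\hat\sigma^2 = \frac{\sum_{i=1}^{n} \hat u_i^2}{n - k - 1}$$
Theorem 3: The estimator of the error variance is unbiased.
Under Assumptions 1-5:
$$E[\hat\sigma^2] = \sigma^2$$
Computing the standard errors of the OLS estimators:
The true standard deviation of $\hat\beta_j$:
$$sd(\hat\beta_j) = \sqrt{\mathrm{Var}[\hat\beta_j]} = \sqrt{\frac{\sigma^2}{SST_j (1 - R_j^2)}}$$
The estimated standard deviation (standard error) of $\hat\beta_j$:
$$se(\hat\beta_j) = \sqrt{\widehat{\mathrm{Var}}[\hat\beta_j]} = \sqrt{\frac{\hat\sigma^2}{SST_j (1 - R_j^2)}}$$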
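The standard error formula can be reproduced step by step and compared with the value reported by summary(). A minimal sketch with two simulated regressors; all names and data-generating values are hypothetical.

```r
# Simulated data with correlated regressors: illustrative assumptions only.
set.seed(8)
n  <- 200
k  <- 2                      # number of slope parameters
x2 <- rnorm(n)
x1 <- 0.7 * x2 + rnorm(n)
y  <- 1 + 2 * x1 + 3 * x2 + rnorm(n, sd = 1.5)
fit <- lm(y ~ x1 + x2)

# Error variance estimate with n - k - 1 degrees of freedom
sigma2_hat <- sum(resid(fit)^2) / (n - k - 1)

# Total sample variation in x1 and R-squared of x1 on the other regressor
SST_1 <- sum((x1 - mean(x1))^2)
R2_1  <- summary(lm(x1 ~ x2))$r.squared

# Standard error of the coefficient on x1
se_b1 <- sqrt(sigma2_hat / (SST_1 * (1 - R2_1)))

se_lm <- summary(fit)$coefficients["x1", "Std. Error"]
print(c(manual = se_b1, lm = se_lm))
```

The factor 1 / (1 − R2_1) shows how correlation between x1 and x2 inflates the standard error relative to the simple-regression case.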
Theorem 5: Gauss-Markov
Under Assumptions 1-5, the OLS estimator is the best linear unbiased estimator ("BLUE") of the regression coefficients, i.e.
$$\mathrm{Var}[\hat\beta_j] \le \mathrm{Var}[\tilde\beta_j], \quad j = 0, 1, \dots, k$$
for every $\tilde\beta_j = \sum_{i=1}^{n} w_{ij} y_i$ such that $E[\tilde\beta_j] = \beta_j$, $j = 0, \dots, k$.
Table 4:
                       Dependent variable: salariom
educ                   208.138*** (1.341)
idade                  42.068*** (0.473)
horas_m                6.183*** (0.102)
Constant               −2,902.121*** (30.242)
R2                     0.160
Adjusted R2            0.160
Residual Std. Error    2,231.445 (df = 161088)
F Statistic            10,262.850*** (df = 3; 161088)
Note: *p<0.1; **p<0.05; ***p<0.01. Standard errors in parentheses.
Example: data set "salario"
reg <- lm(salariom ~ educ + idade + horas_m, data = salario, na.action = na.exclude)
See the summary in Table 4.
Interpretation: on average, each additional year of schooling raises the wage by 208.14 reais, controlling for age and monthly hours worked.
On average, each additional year of age raises the wage by 42.07 reais, controlling for schooling and monthly hours worked.
On average, each additional hour worked raises the wage by 6.18 reais, controlling for schooling and age.
Chapter 4
Assumption 6: Normality → $u \sim \mathrm{Normal}(0, \sigma^2)$, with u independent of $x_1, \dots, x_k$
Since u is independent of the $x_j$ under Assumption 6, $E[u|x_1, \dots, x_k] = E[u] = 0$ and $\mathrm{Var}[u|x_1, \dots, x_k] = \mathrm{Var}[u] = \sigma^2$ → a strong assumption
Assumptions 1-6: Classical Linear Model (CLM) assumptions
So CLM = Gauss-Markov + normally distributed error term
$$y|x \sim \mathrm{Normal}(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k, \sigma^2)$$
where $x = (x_1, \dots, x_k)$. Thus, conditional on x, y has a normal distribution with mean linear in $x_1, \dots, x_k$ and a constant variance.
Theorem: Sampling distribution of the OLS estimators
Under CLM:
$$\hat\beta_j \sim \mathrm{Normal}(\beta_j, \mathrm{Var}[\hat\beta_j]) \implies \frac{\hat\beta_j - \beta_j}{sd[\hat\beta_j]} \sim \mathrm{Normal}(0, 1)$$
conditional on the independent variables $x_i$.
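Proof:
$$y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + u$$
Two-step (partialling-out) method:
$$\hat\beta_1 = \frac{\sum_{i=1}^{n} \hat r_{i1} y_i}{\sum_{i=1}^{n} \hat r_{i1}^2}, \quad \text{where } \hat r_{i1} = x_{i1} - \hat x_{i1}$$
and $\hat x_{i1}$ is the fitted value of $x_{i1}$ from the regression of $x_{i1}$ on $\{x_{i2}, x_{i3}, \dots, x_{ik}\}$.
Rewriting $\hat\beta_1$ with the weights $w_{i1} = \hat r_{i1} / \sum_{l=1}^{n} \hat r_{l1}^2$:
$$\hat\beta_1 = \sum_{i=1}^{n} w_{i1} y_i = \sum_{i=1}^{n} w_{i1}(\beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik} + u_i)$$
$$= \beta_0 \sum_{i=1}^{n} w_{i1} + \beta_1 \sum_{i=1}^{n} w_{i1} x_{i1} + \dots + \beta_k \sum_{i=1}^{n} w_{i1} x_{ik} + \sum_{i=1}^{n} w_{i1} u_i$$
i) $\sum_{i=1}^{n} w_{i1} = 0$, since the residuals of the auxiliary regression sum to zero:
$$\sum_{i=1}^{n} \frac{\hat r_{i1}}{\sum_{l=1}^{n} \hat r_{l1}^2} = \frac{1}{\sum_{l=1}^{n} \hat r_{l1}^2}\sum_{i=1}^{n} \hat r_{i1} = 0$$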
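ii) $\sum_{i=1}^{n} w_{i1} x_{i1} = 1$, since
$$\sum_{i=1}^{n} \frac{\hat r_{i1}}{\sum_{l=1}^{n} \hat r_{l1}^2} x_{i1} = \frac{1}{\sum_{l=1}^{n} \hat r_{l1}^2}\sum_{i=1}^{n} \hat r_{i1}(\hat x_{i1} + \hat r_{i1}) = \frac{\sum_{i=1}^{n} \hat r_{i1}^2}{\sum_{l=1}^{n} \hat r_{l1}^2} = 1$$
because
$$\sum_{i=1}^{n} \hat r_{i1} \hat x_{i1} = \sum_{i=1}^{n} \hat r_{i1}(\hat\alpha_0 + \hat\alpha_1 x_{i2} + \dots + \hat\alpha_{k-1} x_{ik}) = \hat\alpha_0 \sum_{i=1}^{n} \hat r_{i1} + \hat\alpha_1 \sum_{i=1}^{n} \hat r_{i1} x_{i2} + \dots + \hat\alpha_{k-1}\sum_{i=1}^{n} \hat r_{i1} x_{ik} = 0$$
iii) $\sum_{i=1}^{n} w_{i1} x_{ij} = 0$ for every $j \ge 2$, since
$$\sum_{i=1}^{n} \frac{\hat r_{i1}}{\sum_{l=1}^{n} \hat r_{l1}^2} x_{ij} = \frac{1}{\sum_{l=1}^{n} \hat r_{l1}^2}\sum_{i=1}^{n} \hat r_{i1} x_{ij} = 0, \quad \forall j \ge 2$$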
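$$\implies \hat\beta_1 = \beta_1 + \sum_{i=1}^{n} w_{i1} u_i$$
where $w_{i1}$ depends only on the regressors.
Assumption 2 $\implies u_i$ i.i.d.
Assumption 6 $\implies u_i \sim N(0, \sigma^2)$
Conditional on the regressors, $\hat\beta_1 = \beta_1 + \sum_{i=1}^{n} w_{i1} u_i$ is normal, because a linear combination of independent normal random variables is normal.
What are its mean and variance?
$E[\hat\beta_1|x] = \beta_1$ ← unbiasedness, from exogeneity
$$\mathrm{Var}[\hat\beta_1|x] = \frac{\sigma^2}{\sum_{i=1}^{n}(x_{i1} - \bar x_1)^2 (1 - R_1^2)}, \quad R_1^2 \text{ from the regression of } x_1 \text{ on } \{x_2, \dots, x_k\}$$
$$\hat\beta_1|x \sim N(\beta_1, \mathrm{Var}[\hat\beta_1|x])$$
$$\frac{\hat\beta_1 - \beta_1}{sd(\hat\beta_1)} \sim N(0, 1)$$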
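Because the proof rests on three properties of the weights, it can be useful to confirm them on data where the true error term is known. The sketch below simulates such data; the true coefficients in `beta` and all other names are hypothetical, and the error vector u is only available because the data are simulated.

```r
# Simulated data with known errors: illustrative assumptions only.
set.seed(9)
n    <- 80
beta <- c(1, 2, -1)          # hypothetical true (beta0, beta1, beta2)
x2   <- rnorm(n)
x1   <- 0.5 * x2 + rnorm(n)
u    <- rnorm(n)
y    <- beta[1] + beta[2] * x1 + beta[3] * x2 + u

# Weights w_i1 = r_hat_i1 / sum(r_hat_i1^2) from the auxiliary regression
r1 <- resid(lm(x1 ~ x2))
w  <- r1 / sum(r1^2)

# Properties i), ii), iii): expected values 0, 1, 0
checks <- c(sum_w = sum(w), sum_w_x1 = sum(w * x1), sum_w_x2 = sum(w * x2))

# beta1_hat equals beta1 plus a weighted sum of the errors
b1_hat    <- unname(coef(lm(y ~ x1 + x2))["x1"])
b1_decomp <- beta[2] + sum(w * u)

print(round(checks, 10))
print(c(b1_hat = b1_hat, b1_decomp = b1_decomp))
```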
Theorem: (t distribution for the standardized estimators when the standard error is used)
Under CLM:
$$\frac{\hat\beta_j - \beta_j}{se(\hat\beta_j)} \sim t_{n-k-1}$$
Hypothesis testing
Choose a significance level (= the probability of rejecting H0 when it is true).
Testing against a one-sided H1 (greater than zero)
H0: βj = 0 against H1: βj > 0
1. Compute the t statistic, $t = \hat\beta_j / se(\hat\beta_j)$
2. Choose the significance level: 5% (the most common)
3. From the t distribution table, obtain the critical value (c) corresponding to 5% and n − k − 1 degrees of freedom. In this case n − k − 1 = 28, so c = 1.701
4. Reject H0 if the t statistic > 1.701
Testing against a one-sided H1 (less than zero)
H0: βj = 0 against H1: βj < 0
3. From the t distribution table, obtain the critical value (c) corresponding to 5% and n − k − 1 degrees of freedom. In this case n − k − 1 = 18, so c = −1.734
4. Reject H0 if the t statistic < −1.734
Testing against a two-sided H1 (different from zero)
H0: βj = 0 against H1: βj ≠ 0
3. From the t distribution table, obtain the critical value (c) corresponding to 5% and n − k − 1 degrees of freedom. In this case n − k − 1 = 25, so c = 2.06 and −2.06
4. Reject H0 if the t statistic < −2.06 or > 2.06
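Rules of thumb (two-sided tests, large degrees of freedom):
|t-ratio| > 1.645 → "statistically significant at the 10% level"
|t-ratio| > 1.96 → "statistically significant at the 5% level"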
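The critical values quoted in the three tests, and the large-sample thresholds, can be reproduced with the quantile functions qt() and qnorm() instead of a printed table. A minimal sketch; the degrees of freedom are the ones stated in the notes.

```r
# One-sided test (greater than zero), 5% level, 28 degrees of freedom
c_right <- qt(0.95, df = 28)

# One-sided test (less than zero), 5% level, 18 degrees of freedom
c_left <- qt(0.05, df = 18)

# Two-sided test, 5% level (2.5% in each tail), 25 degrees of freedom
c_two <- qt(0.975, df = 25)

# Large-sample (standard normal) two-sided thresholds at 10% and 5%
z_10 <- qnorm(0.95)
z_05 <- qnorm(0.975)

print(round(c(c_right = c_right, c_left = c_left, c_two = c_two,
              z_10 = z_10, z_05 = z_05), 3))
```

For a two-sided test at level α the quantile is taken at 1 − α/2, which is why the 5% two-sided value uses 0.975.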

Computing the p-value for tests on a single parameter
P-value: the smallest significance level at which H0 is still rejected
→ A small p-value is evidence against H0, since H0 can then be rejected even at a very small significance level.
How the p-value is computed
• In the two-sided case, the p-value is the probability that a random variable following the t distribution with n − k − 1 degrees of freedom exceeds the t statistic in absolute value, e.g. P(|T| > 1.85) = 2P(T > 1.85) = 2(0.0359) = 0.0718
• H0 is rejected if and only if the p-value is smaller than the chosen significance level
For example, if the level is 5%, H0 is not rejected (since 0.0718 > 0.05)
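The two-sided p-value in the example can be computed with pt(). The notes do not state the degrees of freedom behind the figure 0.0718; the sketch below assumes 40, which is a guess that appears consistent with that figure and should be treated as an assumption.

```r
t_stat <- 1.85
df     <- 40    # assumption: degrees of freedom are not given in the notes

# Two-sided p-value: P(|T| > t) = 2 * P(T > |t|)
p_two <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)

# Decision at two conventional significance levels
reject_5  <- p_two < 0.05
reject_10 <- p_two < 0.10

print(round(p_two, 4))
print(c(reject_at_5pct = reject_5, reject_at_10pct = reject_10))
```

With these inputs H0 is not rejected at the 5% level but is rejected at the 10% level, which illustrates why reporting the p-value is more informative than reporting a single reject/do-not-reject decision.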
Economic importance ≠ statistical significance
Economic importance depends on the size and sign of $\hat\beta_j$ and on the units of measurement of the dependent and independent variables.
