Stephen Bianchi
Department of Economics
UC Berkeley
H₀: β₁ = 1, β₂ = 0, β₃ = 0, β₄ = 0
H₁: β₁ ≠ 1 or β₂ ≠ 0 or β₃ ≠ 0 or β₄ ≠ 0
I Restricted:
    Yᵢ = β₀ + X₁ᵢ + vᵢ
or
    Yᵢ − X₁ᵢ = β₀ + vᵢ
I We then proceed as before.
I Note: in this case we cannot use the R² version of the
F-statistic. This is because the dependent variable differs
between the unrestricted and restricted models, i.e., TSS_ur ≠ TSS_r
(and hence the TSS terms won't cancel).
General joint hypotheses
    ln(price)^ = 0.264 + 1.043·ln(assess) + 0.0074·ln(lotsize)
                (0.57)   (0.151)           (0.386)

    n = 88, SSR_ur = 1.822, R²_ur = 0.7728

Restricted model (intercept only):

    ln(price) − ln(assess)^ = 0.0848
                             (0.0156)

    F = [(1.88 − 1.822)/4] / [1.822/83] = 0.661
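The F-statistic above can be reproduced directly from the SSRs reported on this slide; a quick check (q = 4 restrictions, n − k − 1 = 83 degrees of freedom):

```python
# F = [(SSR_r - SSR_ur)/q] / [SSR_ur/(n - k - 1)], using the
# restricted and unrestricted SSRs reported on the slide.
ssr_r, ssr_ur = 1.88, 1.822
q, df = 4, 83

F = ((ssr_r - ssr_ur) / q) / (ssr_ur / df)
print(round(F, 3))  # 0.661
```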
General joint hypotheses
I Hypothesis test:
    H₀: Σⱼ₌₀ᵏ cⱼβⱼ = r  vs  H₁: Σⱼ₌₀ᵏ cⱼβⱼ ≠ r

    H₀: β₁ = β₂ vs H₁: β₁ < β₂
    H₀: β₁ − β₂ = 0 vs H₁: β₁ − β₂ < 0
Thus we have
    H₀: c₀β₀ + c₁β₁ + c₂β₂ + c₃β₃ = r
where c₀ = 0, c₁ = 1, c₂ = −1, c₃ = 0, r = 0
Linear hypotheses
H₀: θ₁ = 0 vs H₁: θ₁ < 0
Linear hypotheses
    t = θ̂₁ / SE(θ̂₁)
I Since θ₁ = β₁ − β₂, we can also write β₁ = θ₁ + β₂ and plug
this into the model.
    ln(wage)^ = 1.472 − 0.0102·jc + 0.0769·totcoll + 0.0049·exper
               (0.022)  (0.0066)    (0.0024)         (0.0002)
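The substitution β₁ = θ₁ + β₂ can be checked numerically: regressing Y on X₁ and (X₁ + X₂) makes the coefficient on X₁ equal exactly θ₁ = β₁ − β₂. A minimal sketch on made-up, noiseless data (coefficients 2 and 3 are assumptions for the demo):

```python
import numpy as np

# Made-up noiseless data: y = 1 + 2*x1 + 3*x2, so beta1 - beta2 = -1.
rng = np.random.default_rng(0)
x1, x2 = rng.normal(size=200), rng.normal(size=200)
y = 1 + 2 * x1 + 3 * x2

# Reparameterized regression of y on x1 and (x1 + x2):
# the coefficient on x1 is theta1 = beta1 - beta2.
X = np.column_stack([np.ones(200), x1, x1 + x2])
theta1 = np.linalg.lstsq(X, y, rcond=None)[0][1]
print(round(theta1, 6))  # -1.0
```

The payoff is that OLS reports SE(θ̂₁) directly, so the t-test of β₁ = β₂ needs no extra covariance computation.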
H₀: β₁ = β₂ = β₃ vs H₁: β₁ ≠ β₂ or β₂ ≠ β₃ or β₁ ≠ β₃
H₀: θ₁ = 0, θ₂ = 0 vs H₁: θ₁ ≠ 0 or θ₂ ≠ 0
I Model:
    Yᵢ = β₀ + β₁X₁ᵢ + β₂X₂ᵢ + β₃X₃ᵢ + uᵢ
where Yᵢ is the dependent variable, X₁ᵢ the variable of
interest, X₂ᵢ and X₃ᵢ the control variables, and uᵢ the error term.
I Returns to education:
    Yᵢ = β₀ + β₁X₁ᵢ + β₂X₂ᵢ + uᵢ
where X₂ᵢ is a control variable.
I "Illustration":
I Suppose E[ui |X2i ] is linear in X2i , i.e.,
1
Statistical fact: if the random variables X and Y are jointly normally
distributed then E[Y |X ] = a + bX , for suitably chosen values of a and b.
Specification
I "Illustration":
I Then we have
where ↵0 = 0 + 0 and ↵2 = 2 + 2.
April 1, 2021
Nonlinear regression
I Nonlinear regression falls into two categories:
(i) Nonlinear in the dependent variable and/or nonlinear in the
independent variables.
I But linear in the coefficients.
(ii) Nonlinear in the coefficients. We will leave this to a more
advanced course in econometrics.
I First category:
    Yᵢ = β₀ + β₁X₁ᵢ + β₂X₂ᵢ² + uᵢ   (quadratic)
    Yᵢ = β₀ + β₁X₁ᵢ + β₂ln(X₂ᵢ) + uᵢ   (linear-log)
    Yᵢ = β₀ + β₁X₁ᵢ + β₂X₁ᵢ² + β₃X₁ᵢ³ + uᵢ   (polynomial)
More generally:
    Yᵢ = β₀ + β₁g₁(X₁ᵢ, X₂ᵢ) + β₂g₂(X₁ᵢ, X₂ᵢ) + uᵢ
where g1 and g2 are nonlinear functions of (potentially all) the
independent variables.
Nonlinear regression
Then
    Yᵢ = β₀ + β₁ln(X₁ᵢ) + β₂(X₁ᵢ² + X₂ᵢ²) + uᵢ
       = β₀ + β₁X̃₁ᵢ + β₂X̃₂ᵢ + uᵢ
where X̃₁ᵢ = ln(X₁ᵢ) and X̃₂ᵢ = X₁ᵢ² + X₂ᵢ², i.e., our regression
equation is still linear in the coefficients, as well as in the
redefined independent variables.
Nonlinear regression
I Polynomials:
    Yᵢ = β₀ + β₁X₁ᵢ + β₂X₁ᵢ² + ··· + βₖX₁ᵢᵏ + uᵢ
I All k regressors are functions of the same independent
variable!
I k = 2 (quadratic), k = 3 (cubic)
I The interpretation of the coefficients is different.
I Suppose
    Yᵢ = β₀ + β₁X₁ᵢ + β₂X₁ᵢ² + uᵢ
then
    ∂Y/∂X₁ = β₁ + 2β₂X₁
I A common joint hypothesis test with polynomial regressions is
to test whether the population regression function is linear in
X1 :
    H₀: β₂ = 0, . . . , βₖ = 0 vs
    H₁: βⱼ ≠ 0 for at least one j ∈ {2, . . . , k}
Nonlinear regression
I Quadratic example: suppose we think that average hourly
earnings depend on experience in a nonlinear way.
Model:
    wageᵢ = β₀ + β₁experᵢ + β₂experᵢ² + uᵢ
OLS results: wage^ = β̂₀ + 0.298·exper − 0.0061·exper²
I Comments:
I Experience has a diminishing effect on wage with this
estimated equation.
    The 1st year of experience is worth an average of $0.298.
    The 2nd year of experience is worth an average of
    $(0.298 − 0.0122) = $0.286.
I The change in wage will be positive provided exper < 24.4
years, i.e.,
    ∂wage/∂exper > 0, if 0.298 > 2(0.0061)·exper
I There is no change in wage if exper = 24.4 years, i.e.,
    ∂wage/∂exper = 0, if 0.298 = 2(0.0061)·exper
Nonlinear regression
I Comments:
I The change in wage will be negative if exper > 24.4 years,
i.e.,
    ∂wage/∂exper < 0, if 0.298 < 2(0.0061)·exper
I How do we interpret this?
I Some possibilities:
(i) It could be that few people in the sample have more than 24.4
years of experience. If this were true, this would not be much
of a concern. But it turns out that 28% of the people in this
sample have more than 24.4 years of experience.
(ii) Of course, it is possible that the return to experience becomes
negative at some point. However, 24.4 years seems a little
early!
(iii) It is more likely that this is either a biased estimate due to
OVB, or the quadratic model is misspecified.
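The turning-point arithmetic in the comments above follows directly from the first-order condition ∂wage/∂exper = 0, using the slide's estimates:

```python
# Estimated quadratic from this example:
# wage^ = b0 + 0.298*exper - 0.0061*exper^2.
b1, b2 = 0.298, -0.0061

turning_point = -b1 / (2 * b2)   # where d(wage)/d(exper) = 0
effect_year2 = b1 + 2 * b2 * 1   # marginal effect at exper = 1
print(round(turning_point, 1))   # 24.4
print(round(effect_year2, 3))    # 0.286
```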
Logarithmic Regression
(ii) Linear-log
    Yᵢ = β₀ + β₁ln(X₁ᵢ) + β₂X₂ᵢ + uᵢ
(iii) Log-log
    ln(price)^ = 11.08 − 0.954·ln(nox) − 0.134·ln(dist)
                 + 0.255·rooms − 0.052·stratio
I To summarize:

    Dep Var \ Regressor    X                    ln(X)
    Y                      ΔY = β₁·ΔX           ΔY = (β₁/100)·%ΔX
    ln(Y)                  %ΔY = (100·β₁)·ΔX    %ΔY = β₁·%ΔX
Single binary independent variable
I Example:
    wageᵢ = β₀ + δ₀femaleᵢ + β₁educᵢ + uᵢ
where
    femaleᵢ = 1 if female, 0 if male
I Interpretation of δ₀: the difference in hourly wage for men and
women, given the same amount of education.
I We can use this model to investigate the issue of wage
discrimination against women (i.e., if δ₀ < 0, then there is
wage discrimination).
I This can be depicted graphically as a parallel shift of the
regression line.
Single binary independent variable
where
    southᵢ = 1 if obs from the south region, 0 otherwise
    northᵢ = 1 if obs from the north region, 0 otherwise
    eastᵢ = 1 if obs from the east region, 0 otherwise
The omitted region is west.
Multiple binary independent variables
    H₀: β₂ = 0, β₃ = 0, β₄ = 0 vs
    H₁: β₂ ≠ 0 or β₃ ≠ 0 or β₄ ≠ 0
Econ 140
MLR: Interaction Terms, Internal Validity
April 6, 2021
Interaction terms
    D₂ \ D₃    0            1
    0          β₀           β₀ + β₃
    1          β₀ + β₂      β₀ + β₂ + β₃ + β₄
Interaction terms
I OLS results:
    ln(wage)^ = 0.356 + 0.087·educ + 0.0073·exper − 0.111·ufem
                − 0.141·mfem + 0.323·mmale
April 8, 2021
Errors-in-variables (measurement error)
I Sometimes we can’t collect data on the variable that truly
affects economic behaviour.
I But we can get data that is an imprecise measurement of that
variable.
I Good example: reported versus actual annual income
I Consider the simple regression model
    Yᵢ = β₀ + β₁Xᵢ* + uᵢ
    Xᵢ = Xᵢ* + eᵢ
Substituting Xᵢ* = Xᵢ − eᵢ gives
    Yᵢ = β₀ + β₁(Xᵢ − eᵢ) + uᵢ = β₀ + β₁Xᵢ + (uᵢ − β₁eᵢ)
Hence
    β̂₁ →ᵖ β₁ + Cov(Xᵢ, uᵢ − β₁eᵢ) / σ²ₓ
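With classical measurement error (eᵢ independent of Xᵢ* and uᵢ), the covariance term above is negative and the OLS slope is attenuated toward zero. A minimal simulation sketch (all numbers made up; true β₁ = 1, Var(X*) = Var(e) = 1, so the probability limit is 0.5):

```python
import numpy as np

# Simulate classical measurement error: Y = 1 + X* + u, but we
# observe X = X* + e. OLS on X converges to
# beta1 * Var(X*)/(Var(X*) + Var(e)) = 0.5, not 1.
rng = np.random.default_rng(0)
n = 50_000
x_star = rng.normal(size=n)
e = rng.normal(size=n)          # measurement error
u = 0.1 * rng.normal(size=n)
y = 1 + x_star + u
x = x_star + e                  # mismeasured regressor

slope = np.cov(x, y)[0, 1] / np.var(x)
print(round(slope, 2))  # close to 0.5, well below the true value 1
```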
I Under our assumption we have
    Yᵢ = β₀ + β₁Xᵢ + uᵢ, and
    Xᵢ = γ₀ + γ₁Yᵢ + vᵢ,
so Xᵢ is correlated with the error uᵢ unless γ₁ = 0.
I Hence, we will get biased and inconsistent estimates of β₁.
I Example: cities often want to determine whether additional
law enforcement will lower murder rates.
Model: murdpcᵢ = β₀ + β₁polpcᵢ + uᵢ
I But it is entirely plausible that a city’s spending on law
enforcement is determined (at least in part) by its murder
rate, i.e.,
    polpcᵢ = γ₀ + γ₁murdpcᵢ + vᵢ
I Possible remedy: use IV.
Panel Data
I Information recorded for an entity (individual, household,
firm, state, country, etc) over several periods of time (day,
month, quarter, year).
    entity \ time    1      2     ···    T
    1                Y11    Y12   ···    Y1T
    2                Y21    Y22   ···    Y2T
    ⋮
    N                YN1    YN2   ···    YNT
    1980: tfr^ = 6.45 + 1.69·p1424
                        (1.41)
    1992: tfr^ = 15.16 + 2.26·p1424
                         (1.01)
I Thus, if we let Δtfrᵢ = tfrᵢ,1992 − tfrᵢ,1980 and
Δp1424ᵢ = p1424ᵢ,1992 − p1424ᵢ,1980, we can estimate
    Δtfrᵢ = β₀ + β₁Δp1424ᵢ + vᵢ
I This is sometimes called the "first-differenced" equation.
I We have included an intercept to allow for the possibility that
the mean change in the fatality rate, in the absence of a
change in the percentage of young people, is nonzero.
I Our results from estimating this model give a t-stat for the
intercept of 0.89 and a t-stat for the slope coefficient of 2.58.
I This regression controls for factors that affect traffic fatality
rates that vary across states (e.g., speed limits) but do not
vary across time.
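The reason first-differencing controls for such factors is that any time-invariant state effect cancels in the difference. A sketch on made-up, noiseless panel data with T = 2 (entity effects aᵢ and true slope 2 are assumptions for the demo):

```python
import numpy as np

# First-differencing removes time-invariant entity effects:
# Y_it = a_i + 2 * X_it, so (Y_i2 - Y_i1) = 2 * (X_i2 - X_i1).
rng = np.random.default_rng(0)
n = 100
a = rng.normal(size=n)                    # entity fixed effects
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y1, y2 = a + 2 * x1, a + 2 * x2

# Regress the change in Y on the change in X; a_i drops out.
dy, dx = y2 - y1, x2 - x1
X = np.column_stack([np.ones(n), dx])
coef, *_ = np.linalg.lstsq(X, dy, rcond=None)
print(round(coef[1], 6))  # 2.0
```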
Fixed Effects Regression
I We can write the model from our example more generally as
    tfrit = β₀ + β₁p1424it + αᵢ + uit
where αᵢ is a state fixed effect.
I Graphically, this corresponds to parallel regression lines with
a different intercept for each state.
Fixed Effects Regression
I Using entity dummies, we write the regression model for our
traffic fatality example as
    tfrit = β₀ + β₁p1424it + β₂D2ᵢ + ··· + β₄₈D48ᵢ + uit
I Based on this specification, our estimation results (using the
data for 1980 and 1992) are
    tfr^ = 1.85·p1424 + state fixed effects
          (0.23)
which we write as
    X̃ᵢ₁ = Xᵢ₁ − (Xᵢ₁ + Xᵢ₂)/2
         = (1/2)Xᵢ₁ − (1/2)Xᵢ₂
         = −(1/2)(Xᵢ₂ − Xᵢ₁)
and
    X̃ᵢ₂ = Xᵢ₂ − (Xᵢ₁ + Xᵢ₂)/2
         = (1/2)(Xᵢ₂ − Xᵢ₁)
I Similarly, we have
    Ỹᵢ₁ = −(1/2)(Yᵢ₂ − Yᵢ₁) and Ỹᵢ₂ = (1/2)(Yᵢ₂ − Yᵢ₁)
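The T = 2 demeaning algebra above can be verified on made-up numbers — the entity-demeaned values are exactly plus/minus half the first difference:

```python
import numpy as np

# Two entities, two periods of made-up data.
x = np.array([[3.0, 7.0],    # entity 1: (X_11, X_12)
              [1.0, 6.0]])   # entity 2: (X_21, X_22)

x_tilde = x - x.mean(axis=1, keepdims=True)  # subtract entity means
half_diff = (x[:, 1] - x[:, 0]) / 2

print(np.allclose(x_tilde[:, 0], -half_diff))  # True
print(np.allclose(x_tilde[:, 1], half_diff))   # True
```

This is why, with T = 2, fixed-effects (within) estimation and first-differencing give the same slope estimate.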
Interaction terms
I OLS results:
    ln(wage)^ = 0.356 + 0.087·educ + 0.0073·exper
                − 0.111·female + 0.323·married
                − 0.353(female·married)
    H₀: β₄ = 0 vs H₁: β₄ ≠ 0
Since
    ∂ln(wage)/∂educ = β₁ + β₄·female
if β₄ = 0, then additional years of education result in the same
percentage increase in wage for men and women.
Interaction terms
    ln(wage)^ = 0.461 + 0.093·educ + 0.009·exper − 0.296·female
                − 0.004(female·educ)
I Note:
    Corr(female, female·educ) = 0.96
    Corr(female, female·(educ − mean(educ))) = 0.07
Hence, the effect on SE(β̂₃) is much smaller!
Interaction terms
I Note:
Cov (female, female · educ) = 3.08
I Recall:
    β̂₁ = β₁ + (Σᵢ₌₁ᴺ Σₜ₌₁ᵀ X̃ᵢₜ ũᵢₜ) / (Σᵢ₌₁ᴺ Σₜ₌₁ᵀ X̃ᵢₜ²)
    Var(Σᵢ₌₁ᴺ Xᵢ) = Σᵢ₌₁ᴺ Var(Xᵢ) + 2 Σᵢ₌₁ᴺ Σⱼ₌ᵢ₊₁ᴺ Cov(Xᵢ, Xⱼ)
                  = Σᵢ₌₁ᴺ Σⱼ₌₁ᴺ Cov(Xᵢ, Xⱼ)
Fixed Effects Standard Errors
I Using this we can write
    Var(β̂₁) ∝ Σᵢ₌₁ᴺ Σₛ₌₁ᵀ Σₜ₌₁ᵀ Cov(X̃ᵢₛ ũᵢₛ, X̃ᵢₜ ũᵢₜ)
where
    Iₛₜ = 1 if s = t, 0 otherwise
I Hence, only the terms with s = t remain (Iₛₜ = 1 when s = t).
Time Fixed Effects
or
    Ȳₜ = β₀ + β₁X̄ₜ + β₃Wₜ + ūₜ
I We then subtract this from the original (time fixed effects)
equation to get
    (Yit − Ȳₜ) = β₁(Xit − X̄ₜ) + (uit − ūₜ)
where
    Djᵢ = 1 if i = j, 0 otherwise
    Iₛₜ = 1 if s = t, 0 otherwise
Entity and Time Fixed Effects
I Of course, we can also demean using both entity and time
means to get
    (Yit − Ȳᵢ − Ȳₜ + Ȳ̄) = β₁(Xit − X̄ᵢ − X̄ₜ + X̄̄) + (uit − ūᵢ − ūₜ + ū̄)
where Ȳ̄ = (1/NT) Σᵢ₌₁ᴺ Σₜ₌₁ᵀ Yit, and similarly for X̄̄ and ū̄.
I This is a bivariate regression with entity and time demeaned
variables.
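The two-way demeaning transformation can be checked on a small made-up panel: after subtracting entity means, time means, and adding back the grand mean, both the entity means and the time means of the transformed data are zero.

```python
import numpy as np

# Two-way (entity and time) demeaning on a small made-up N x T panel.
rng = np.random.default_rng(0)
N, T = 4, 3
y = rng.normal(size=(N, T))

grand = y.mean()
y_demeaned = (y - y.mean(axis=1, keepdims=True)
                - y.mean(axis=0, keepdims=True) + grand)

# Both entity means and time means of the transformed data are zero.
print(np.allclose(y_demeaned.mean(axis=1), 0))  # True
print(np.allclose(y_demeaned.mean(axis=0), 0))  # True
```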
I Going back to our example, using entity and time fixed effects
yields
    tfr^ = 0.54·p1424 + state fixed effects + time fixed effects
          (0.20)
    Yᵢ = β₀ + β₁Xᵢ + uᵢ
    ln(wageᵢ) = β₀ + β₁educᵢ + vᵢ
I If we estimate this model via OLS, and educ and abil are
correlated, we will get a biased and inconsistent estimator for β₁.
I It turns out that we can still use this equation as a basis for
estimation, provided we find an instrumental variable for
educ.
I Consider the simple regression model
    Yᵢ = β₀ + β₁Xᵢ + uᵢ
where we think Cov(Xᵢ, uᵢ) ≠ 0, i.e., we think Xᵢ is
endogenous.
Instrumental Variables (IV)
I In order to obtain consistent estimates of β₀ and β₁, we need
an instrumental variable, call it Zᵢ.
I A valid instrument Zi must satisfy two conditions:
(i) Instrument relevance: Cov(Zᵢ, Xᵢ) ≠ 0
(ii) Instrument exogeneity: Cov(Zᵢ, uᵢ) = 0
I These conditions say that Zi must be uncorrelated with the
omitted variables and Zi must be related, either positively or
negatively, to the endogenous variable Xi .
I There is a very important difference between these two
requirements.
I Since Cov(Zᵢ, uᵢ) involves the unobserved error uᵢ, we cannot
generally hope to test this condition.
I We must appeal to economic theory or common sense.
I On the other hand, we can test the condition
Cov(Zᵢ, Xᵢ) ≠ 0.
I The easiest way to do this is to estimate a simple regression
of Xᵢ on Zᵢ:
    Xᵢ = π₀ + π₁Zᵢ + vᵢ
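This first-stage relevance check is just an OLS regression of X on Z; a minimal sketch on made-up data (the true first-stage slope 0.8 is an assumption for the demo):

```python
import numpy as np

# First-stage relevance check: regress X on Z and inspect pi1_hat.
rng = np.random.default_rng(0)
n = 1_000
z = rng.normal(size=n)
x = 0.5 + 0.8 * z + rng.normal(size=n)   # relevant instrument: pi1 = 0.8

Z = np.column_stack([np.ones(n), z])
pi0_hat, pi1_hat = np.linalg.lstsq(Z, x, rcond=None)[0]
print(pi1_hat)  # roughly 0.8, clearly away from zero
```

In practice one looks at the t-statistic (or F-statistic, with several instruments) on the instrument coefficients rather than the point estimate alone.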
Instrumental Variables (IV)
Stephen Bianchi
Department of Economics
UC Berkeley
    QᵢD = β₀ + β₁Pᵢ + uᵢ,
    QᵢS = γ₀ + γ₁Pᵢ + γ₂Zᵢ + vᵢ,

    Demand: Qᵢ = β₀ + β₁Pᵢ + uᵢ
    Supply: Qᵢ = γ₀ + γ₁Pᵢ + γ₂Zᵢ + vᵢ
I If we set the two equations equal, solve for Pi , and then plug
this Pi into the demand equation, we obtain the following
reduced form equations
    Pᵢ = (γ₀ − β₀)/(β₁ − γ₁) + [γ₂/(β₁ − γ₁)]·Zᵢ + (vᵢ − uᵢ)/(β₁ − γ₁)
    Qᵢ = (β₁γ₀ − β₀γ₁)/(β₁ − γ₁) + [β₁γ₂/(β₁ − γ₁)]·Zᵢ + (β₁vᵢ − γ₁uᵢ)/(β₁ − γ₁)
which we can write more compactly as
    Pᵢ = π₀^P + π₁^P·Zᵢ + εᵢ^P
    Qᵢ = π₀^Q + π₁^Q·Zᵢ + εᵢ^Q
I Since
    β₁ = [β₁γ₂/(β₁ − γ₁)] · [(β₁ − γ₁)/γ₂] = π₁^Q / π₁^P
we can get a consistent estimate of β₁ as long as we can get
consistent estimates of π₁^P and π₁^Q.
Simultaneous causality
    π̂₁^P = S_ZP / S_Z²,   π̂₁^Q = S_ZQ / S_Z²
I Then
    β̂₁ = π̂₁^Q / π̂₁^P = S_ZQ / S_ZP
    Xᵢ = π₀ + π₁Zᵢ + vᵢ,
    X̂ᵢ = π̂₀ + π̂₁Zᵢ
where
    π̂₁ = S_XZ / S_Z²,   π̂₀ = X̄ − π̂₁Z̄
and
    β̂₁^TSLS = S_YX̂ / S²_X̂
Two Stage Least Squares (TSLS)
so that
    β̂₁^TSLS = π̂₁·Cov(Yᵢ, Zᵢ) / (π̂₁²·Var(Zᵢ)) = Cov(Yᵢ, Zᵢ) / (π̂₁·Var(Zᵢ))
Two Stage Least Squares (TSLS)
    β̂₁^TSLS = S_YZ / (π̂₁·S_Z²)
            = (S_YZ / S_Z²) · (S_Z² / S_XZ)
            = S_YZ / S_XZ
            = [Σᵢ₌₁ⁿ (Zᵢ − Z̄)(Yᵢ − Ȳ)] / [Σᵢ₌₁ⁿ (Zᵢ − Z̄)(Xᵢ − X̄)]
            = [Σᵢ₌₁ⁿ (Zᵢ − Z̄)Yᵢ] / [Σᵢ₌₁ⁿ (Zᵢ − Z̄)Xᵢ]
Large sample distribution of β̂₁^TSLS
    Cov^(Zᵢ, Xᵢ) = (1/(n − 1)) Σᵢ₌₁ⁿ (Zᵢ − Z̄)(Xᵢ − X̄)
Two Stage Least Squares (TSLS)
I Let’s consider another approach which gets us to the same
place.
I Recall that we can write
    β̂₁^TSLS = S_YZ / S_XZ
            = (S_YZ / S_Z²) · (S_Z² / S_XZ)
    Yᵢ = β₀ + β₁Xᵢ + uᵢ
       = β₀ + β₁(π₀ + π₁Zᵢ + vᵢ) + uᵢ
       = (β₀ + β₁π₀) + β₁π₁Zᵢ + (β₁vᵢ + uᵢ)
       = λ₀ + λ₁Zᵢ + eᵢ
where
    λ₀ = β₀ + β₁π₀
    λ₁ = β₁π₁
    eᵢ = β₁vᵢ + uᵢ
Two Stage Least Squares (TSLS)
    Yᵢ = β₀ + β₁Xᵢ + uᵢ
    Xᵢ = π₀ + π₁Zᵢ + vᵢ
    Yᵢ = (β₀ + β₁π₀) + β₁π₁Zᵢ + (β₁vᵢ + uᵢ)
       = λ₀ + λ₁Zᵢ + eᵢ
    eᵢ = β₁vᵢ + uᵢ
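Since λ₁ = β₁π₁, the ratio of the reduced-form slope to the first-stage slope recovers β₁, and it coincides numerically with the ratio-of-covariances form S_YZ/S_XZ derived earlier. A sketch on made-up data (true π₁ = 0.8 and β₁ = 3 are assumptions for the demo):

```python
import numpy as np

# Check that lambda1_hat / pi1_hat equals S_YZ / S_XZ.
rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=n)
x = 1 + 0.8 * z + rng.normal(size=n)   # first stage
y = 2 + 3 * x + rng.normal(size=n)     # structural equation

def slope(w, v):
    # OLS slope of v on w (with intercept): S_wv / S_w^2
    return np.cov(w, v)[0, 1] / np.var(w, ddof=1)

iv_ratio = slope(z, y) / slope(z, x)   # lambda1_hat / pi1_hat
s_yz = np.cov(z, y)[0, 1]
s_xz = np.cov(z, x)[0, 1]
print(np.isclose(iv_ratio, s_yz / s_xz))  # True
```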
is "OK" since
    Pᵢ = (γ₀ − β₀)/(β₁ − γ₁) + [γ₂/(β₁ − γ₁)]·Zᵢ + (vᵢ − uᵢ)/(β₁ − γ₁)
and
    β̂₁^TSLS = S_ZQ / S_ZP
I Note that this is exactly the same estimator we found before
using indirect least squares (ILS).
I Since there is no exogenous demand shifter in the demand
equation, we cannot estimate the coefficient of price in the
supply equation: it is not identified.
Econ 140
Instrumental Variables
    Xᵢ = π₀ + π₁Z1ᵢ + ··· + πₘZmᵢ
         + πₘ₊₁W1ᵢ + ··· + πₘ₊ᵣWrᵢ + vᵢ
    X̂ᵢ = π̂₀ + π̂₁Z1ᵢ + ··· + π̂ₘZmᵢ
         + π̂ₘ₊₁W1ᵢ + ··· + π̂ₘ₊ᵣWrᵢ
I Second stage:
    Yᵢ = β₀ + β₁Xᵢ + β₂Wᵢ + vᵢ
where vᵢ = uᵢ − β₁eᵢ.
IV solutions to errors-in-variables*
    ln(wageᵢ) = β₀ + β₁educᵢ + vᵢ
    H₀: π₁ = ··· = πₘ = 0
¹The intuition for the cutoff of 10 is a bit complicated and involves the
asymptotic bias of the TSLS coefficients and how much bias you are willing to
tolerate (for more details see S&W (4th Edition), Appendix 12.5).
Instrument exogeneity
    Yᵢ = β₀ + β₁Xᵢ + β₂W1ᵢ + β₃W2ᵢ + uᵢ
    q = 3 − 1 = 2
overidentifying restrictions.
I When q ≥ 2, comparing several IV estimates is cumbersome.
I But we can easily compute a test statistic based on the TSLS
residuals.
I If all instruments are exogenous, the TSLS residuals should be
uncorrelated with the instruments (up to sampling error).
Basmann test of overidentifying restrictions
1. Run TSLS to obtain the residuals
    ûᵢ^TSLS = Yᵢ − Ŷᵢ^TSLS
2. Regress ûᵢ^TSLS on the instruments (and the included exogenous
regressors) and test
    H₀: δ₁ = ··· = δₘ = 0
Basmann test of overidentifying restrictions
where q = m k.
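As a hedged sketch of the idea behind the test (all data and parameter values made up; the slides' Basmann test uses a formal F statistic, while here we only inspect the raw correlations): with exogenous instruments, the TSLS residuals should be nearly uncorrelated with each instrument.

```python
import numpy as np

# Two instruments, one endogenous regressor, both instruments exogenous.
rng = np.random.default_rng(0)
n = 2_000
z1, z2 = rng.normal(size=n), rng.normal(size=n)
v = rng.normal(size=n)
x = 0.7 * z1 + 0.5 * z2 + v          # first stage
u = 0.5 * v + rng.normal(size=n)     # endogeneity: Cov(x, u) != 0
y = 1 + 2 * x + u

# Stage 1: fitted values of x from the instruments.
Z = np.column_stack([np.ones(n), z1, z2])
x_hat = Z @ np.linalg.lstsq(Z, x, rcond=None)[0]

# Stage 2: regress y on x_hat; TSLS residuals use x itself.
X2 = np.column_stack([np.ones(n), x_hat])
b0, b1 = np.linalg.lstsq(X2, y, rcond=None)[0]
resid = y - b0 - b1 * x

# Residuals are nearly uncorrelated with each instrument.
print(abs(np.corrcoef(z1, resid)[0, 1]) < 0.1)  # True
print(abs(np.corrcoef(z2, resid)[0, 1]) < 0.1)  # True
```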