5. Dummy Variables
[Figure: the line y = b0 + b1x for d = 0, with intercept b0 and a vertical shift of d0 marked; horizontal axis x]
Economics 20 - Prof. Anderson 4
Dummies for Multiple Categories
We can use dummy variables to control for
something with multiple categories
Suppose everyone in your data is either a
HS dropout, HS grad only, or college grad
To compare HS and college grads to HS
dropouts, include 2 dummy variables
hsgrad = 1 if HS grad only, 0 otherwise;
and colgrad = 1 if college grad, 0 otherwise
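The two-dummy setup above can be sketched in a few lines. The wage figures below are made up for illustration, and the variable name `wage` is an assumption (the slides do not name the dependent variable). With only an intercept and the two dummies, OLS reproduces group means, so the coefficients can be read off directly as differences from the omitted base group (HS dropouts).

```python
from statistics import mean

# Hypothetical toy data: (education category, hourly wage). Values are made up.
data = [
    ("dropout", 8.0), ("dropout", 10.0),
    ("hsgrad", 12.0), ("hsgrad", 14.0), ("hsgrad", 13.0),
    ("colgrad", 20.0), ("colgrad", 22.0),
]

# Two dummies for three categories; HS dropouts are the omitted base group.
hsgrad = [1 if edu == "hsgrad" else 0 for edu, _ in data]
colgrad = [1 if edu == "colgrad" else 0 for edu, _ in data]
wage = [w for _, w in data]


def group_mean(category):
    """Average wage within one education category."""
    return mean(w for edu, w in data if edu == category)


# In a regression of wage on an intercept, hsgrad and colgrad,
# the OLS coefficients equal these group-mean differences.
b0 = group_mean("dropout")            # intercept: base-group mean
d_hs = group_mean("hsgrad") - b0      # HS grads relative to dropouts
d_col = group_mean("colgrad") - b0    # college grads relative to dropouts

fitted = [b0 + d_hs * h + d_col * c for h, c in zip(hsgrad, colgrad)]
residuals = [w - f for w, f in zip(wage, fitted)]

print(b0, d_hs, d_col)
```

Including a third dummy for dropouts alongside the intercept would make the regressors perfectly collinear, which is why only two dummies are used for three categories.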
[Figure: the line y = (b0 + d0) + (b1 + d1)x for d = 1; horizontal axis x]
Testing for Differences Across Groups
Testing whether a regression function is
different for one group versus another can
be thought of as simply testing for the joint
significance of the dummy and its
interactions with all other x variables
So, you can estimate the model with all the
interactions and without and form an F
statistic, but this could be unwieldy
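As a rough sketch of the joint test described above, the code below fits a model with and without the dummy and its interaction, then forms the F statistic from the two sums of squared residuals. The data are hypothetical, the helper `ols_ssr` is an illustrative name rather than anything from the slides, and the model has a single x variable to keep the example short.

```python
def ols_ssr(X, y):
    """Sum of squared residuals from regressing y on the columns of X."""
    k = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda row: abs(A[row][c]))
        A[c], A[p] = A[p], A[c]
        for row in range(k):
            if row != c:
                f = A[row][c] / A[c][c]
                A[row] = [a - f * b for a, b in zip(A[row], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return sum(
        (yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(X, y)
    )


# Hypothetical data: two groups (d = 0, 1) with different intercepts and slopes.
x = [1, 2, 3, 4, 5, 6] * 2
d = [0] * 6 + [1] * 6
noise = [0.1, -0.2, 0.1, 0.0, -0.1, 0.1, -0.1, 0.2, 0.0, -0.2, 0.1, 0.0]
y = [1 + 2 * xi + di * (2 + xi) + e for xi, di, e in zip(x, d, noise)]

X_r = [[1, xi] for xi in x]                             # restricted: no dummy terms
X_ur = [[1, xi, di, di * xi] for xi, di in zip(x, d)]   # dummy plus interaction

ssr_r = ols_ssr(X_r, y)
ssr_ur = ols_ssr(X_ur, y)

n = len(y)
q = 2       # restrictions tested: the dummy and its interaction
k_ur = 3    # regressors in the unrestricted model, excluding the intercept
F = ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - k_ur - 1))
print(F)
```

With many x variables the unrestricted design matrix needs one interaction column per regressor, which is where this approach becomes unwieldy.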
The Chow Test
Turns out you can compute the proper F statistic
without running the unrestricted model with
interactions with all k continuous variables
If we run the restricted model for group one and get
SSR1, then for group two and get SSR2, and then
run the restricted model for all observations to get SSR, then
$$F = \frac{SSR - (SSR_1 + SSR_2)}{SSR_1 + SSR_2} \cdot \frac{n - 2(k + 1)}{k + 1}$$
The Chow Test (continued)
The Chow test is really just a simple F test
for exclusion restrictions, but we’ve
realized that SSRur = SSR1 + SSR2
Note, we have k + 1 restrictions (each of
the slope coefficients and the intercept)
Note the unrestricted model would estimate
2 different intercepts and 2 different sets of k slope
coefficients, so the df is n - 2k - 2
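A minimal sketch of the Chow test, assuming a single x variable (k = 1) and hypothetical data: the same model is run separately on each group and on the pooled sample, and only the three SSRs are needed. The helper name `simple_ssr` is illustrative.

```python
def simple_ssr(x, y):
    """SSR from a simple regression of y on an intercept and x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sst_x = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sst_x
    b0 = y_bar - b1 * x_bar
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))


# Hypothetical data for two groups with different intercepts and slopes.
x1 = [1, 2, 3, 4, 5, 6]
y1 = [1 + 2 * xi + e for xi, e in zip(x1, [0.1, -0.2, 0.1, 0.0, -0.1, 0.1])]
x2 = [1, 2, 3, 4, 5, 6]
y2 = [3 + 3 * xi + e for xi, e in zip(x2, [-0.1, 0.2, 0.0, -0.2, 0.1, 0.0])]

ssr1 = simple_ssr(x1, y1)             # group one only
ssr2 = simple_ssr(x2, y2)             # group two only
ssr = simple_ssr(x1 + x2, y1 + y2)    # pooled sample

n = len(x1) + len(x2)
k = 1  # number of slope coefficients in the model

# Chow statistic: k + 1 restrictions, n - 2(k + 1) denominator df.
F = (ssr - (ssr1 + ssr2)) / (ssr1 + ssr2) * (n - 2 * (k + 1)) / (k + 1)
print(F)
```

Because SSRur = SSR1 + SSR2, this gives the same number as the usual exclusion-restriction F statistic from the fully interacted model.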
6. Heteroskedasticity
[Figure: the regression line E(y|x) = b0 + b1x, with the points x1, x2, x3 marked on the x axis]
Why Worry About
Heteroskedasticity?
OLS is still unbiased and consistent, even if
we do not assume homoskedasticity
The standard errors of the estimates are
biased if we have heteroskedasticity
If the standard errors are biased, we cannot
use the usual t statistics or F statistics or LM
statistics for drawing inferences
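For the simple case,
$$\hat{b}_1 = b_1 + \frac{\sum_i (x_i - \bar{x})\, u_i}{\sum_i (x_i - \bar{x})^2},$$
so
$$Var(\hat{b}_1) = \frac{\sum_i (x_i - \bar{x})^2 \sigma_i^2}{SST_x^2}.$$
A valid estimator of this variance is
$$\frac{\sum_i (x_i - \bar{x})^2 \hat{u}_i^2}{SST_x^2},$$
where $\hat{u}_i$ are the OLS residuals and $SST_x = \sum_i (x_i - \bar{x})^2$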
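The robust variance estimator for the simple regression case, the sum of (xi - x̄)² ûi² divided by SST_x squared, can be sketched as follows and compared with the usual homoskedasticity-based formula σ̂²/SST_x. The data are hypothetical, with an error spread that grows with x, and the function names are illustrative.

```python
from math import sqrt


def ols_simple(x, y):
    """Intercept, slope and residuals from a simple OLS regression."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sst_x = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sst_x
    b0 = y_bar - b1 * x_bar
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    return b0, b1, resid


def robust_var_b1(x, resid):
    """Heteroskedasticity-robust variance: sum((x - xbar)^2 * u^2) / SST_x^2."""
    x_bar = sum(x) / len(x)
    sst_x = sum((xi - x_bar) ** 2 for xi in x)
    return sum((xi - x_bar) ** 2 * u ** 2 for xi, u in zip(x, resid)) / sst_x ** 2


def usual_var_b1(x, resid):
    """Usual variance under homoskedasticity: sigma_hat^2 / SST_x."""
    n = len(x)
    x_bar = sum(x) / n
    sst_x = sum((xi - x_bar) ** 2 for xi in x)
    sigma2_hat = sum(u ** 2 for u in resid) / (n - 2)
    return sigma2_hat / sst_x


# Hypothetical data: the error spread grows with x.
x = [1, 2, 3, 4, 5, 6, 7, 8]
u = [0.1, -0.1, 0.2, -0.2, 0.4, -0.4, 0.8, -0.8]
y = [1 + 2 * xi + ui for xi, ui in zip(x, u)]

b0, b1, resid = ols_simple(x, y)
robust_se = sqrt(robust_var_b1(x, resid))
usual_se = sqrt(usual_var_b1(x, resid))
print(b1, usual_se, robust_se)
```

The slope estimate is the same either way; only the standard error attached to it changes.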
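$$\widehat{Var}(\hat{b}_j) = \frac{\sum_i \hat{r}_{ij}^2 \hat{u}_i^2}{SSR_j^2},$$
where $\hat{r}_{ij}$ is the ith residual from regressing $x_j$ on all other independent variables, and $SSR_j$ is the sum of squared residuals from that regression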
$$\lambda = 1 - \left[ \frac{\sigma_u^2}{\sigma_u^2 + T \sigma_a^2} \right]^{1/2}$$
$$\hat{b}_1 = \frac{\sum_i (z_i - \bar{z})(y_i - \bar{y})}{\sum_i (z_i - \bar{z})(x_i - \bar{x})}$$
$$Var(\hat{b}_1) = \frac{\sigma^2}{n \sigma_x^2 \rho_{x,z}^2}$$
$$se(\hat{b}_1) = \sqrt{\frac{\hat{\sigma}^2}{SST_x R_{x,z}^2}}$$
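A minimal sketch of the simple IV estimator and its standard error, using made-up numbers. The function name `iv_simple` is illustrative, and the n - 2 degrees-of-freedom correction in σ̂² is an assumption of this sketch. R² from regressing x on z is computed as the squared sample correlation between the two.

```python
from math import sqrt


def iv_simple(y, x, z):
    """Simple IV estimate of y = b0 + b1*x + u, using z as an instrument for x."""
    n = len(y)
    x_bar, y_bar, z_bar = sum(x) / n, sum(y) / n, sum(z) / n
    s_zx = sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z, x))
    s_zy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    b1 = s_zy / s_zx
    b0 = y_bar - b1 * x_bar
    # IV residuals use the actual x, not the fitted values from a first stage.
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    sigma2_hat = sum(u ** 2 for u in resid) / (n - 2)  # df correction assumed
    sst_x = sum((xi - x_bar) ** 2 for xi in x)
    sst_z = sum((zi - z_bar) ** 2 for zi in z)
    r2_xz = s_zx ** 2 / (sst_x * sst_z)  # R^2 from regressing x on z
    se_b1 = sqrt(sigma2_hat / (sst_x * r2_xz))
    return b0, b1, se_b1, r2_xz


# Hypothetical data, made up for illustration.
z = [1, 2, 3, 4, 5, 6, 7, 8]
x = [1.2, 1.9, 3.4, 3.8, 5.5, 5.7, 7.3, 7.9]
y = [3.1, 5.2, 7.4, 9.1, 12.3, 12.8, 16.0, 17.2]

b0_iv, b1_iv, se_iv, r2_xz = iv_simple(y, x, z)
print(b1_iv, se_iv, r2_xz)
```

Setting z equal to x gives R² = 1 and collapses the estimator to OLS, which is a convenient sanity check on the formula.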
IV versus OLS estimation
Standard error in IV case differs from OLS
only in the R2 from regressing x on z
Since R2 < 1, IV standard errors are larger
However, IV is consistent, while OLS is
inconsistent, when Cov(x,u) ≠ 0
The stronger the correlation between z and
x, the smaller the IV standard errors
$$\text{IV: } plim\ \hat{b}_1 = b_1 + \frac{Corr(z,u)}{Corr(z,x)} \cdot \frac{\sigma_u}{\sigma_x}$$
$$\text{OLS: } plim\ \tilde{b}_1 = b_1 + Corr(x,u) \cdot \frac{\sigma_u}{\sigma_x}$$
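To make the comparison concrete, the two probability limits can be evaluated for a few hypothetical correlation values. In both cases the asymptotic bias is the term added to b1: Corr(z,u)/Corr(z,x) times σu/σx for IV, and Corr(x,u) times σu/σx for OLS. All numbers below are made up, and the function names are illustrative.

```python
def iv_asymptotic_bias(corr_zu, corr_zx, sigma_u, sigma_x):
    """plim(b1_IV) - b1 = [Corr(z,u) / Corr(z,x)] * (sigma_u / sigma_x)."""
    return (corr_zu / corr_zx) * (sigma_u / sigma_x)


def ols_asymptotic_bias(corr_xu, sigma_u, sigma_x):
    """plim(b1_OLS) - b1 = Corr(x,u) * (sigma_u / sigma_x)."""
    return corr_xu * (sigma_u / sigma_x)


# Hypothetical values: x is moderately endogenous and the instrument is
# only slightly correlated with the error.
sigma_u, sigma_x = 1.0, 1.0
corr_xu = 0.2
corr_zu = 0.05

ols_bias = ols_asymptotic_bias(corr_xu, sigma_u, sigma_x)
iv_bias_strong = iv_asymptotic_bias(corr_zu, 0.5, sigma_u, sigma_x)  # Corr(z,x) = 0.5
iv_bias_weak = iv_asymptotic_bias(corr_zu, 0.1, sigma_u, sigma_x)    # Corr(z,x) = 0.1

print(ols_bias, iv_bias_strong, iv_bias_weak)
```

Under these assumed values the IV bias is below the OLS bias when Corr(z,x) = 0.5 and above it when Corr(z,x) = 0.1, so a small Corr(z,u) can still matter a great deal when the instrument is only weakly related to x.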
y1 = a1y2 + b1z1 + u1
y2 = a2y1 + b2z2 + u2
[Figure: supply curve S (z = z3); axis label h]
Using IV to Estimate Demand
So, we can estimate the structural demand
equation, using z as an instrument for w
First stage equation is w = p0 + p1z + v2
Second stage equation is h = a2ŵ + u2
Thus, 2SLS provides a consistent estimator
of a2, the slope of the demand curve
We cannot estimate a1, the slope of the
supply curve
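The two stages can be sketched on simulated data. The setup below is an assumption made for illustration: supply is h = a1w + b1z + u1, demand is h = a2w + u2, the parameter values are made up, and both stages include an intercept. The equilibrium w is solved from the two equations, so w is correlated with u2 and OLS on the demand equation is inconsistent.

```python
import random

random.seed(0)
n = 5000
a1, b1, a2 = 0.5, 1.0, -1.0  # hypothetical structural parameters

z, w, h = [], [], []
for _ in range(n):
    zi, u1, u2 = random.gauss(0, 1), random.gauss(0, 1), random.gauss(0, 1)
    # Equilibrium: a1*w + b1*z + u1 = a2*w + u2, solved for w.
    wi = (b1 * zi + u1 - u2) / (a2 - a1)
    z.append(zi)
    w.append(wi)
    h.append(a2 * wi + u2)  # demand equation


def simple_ols(x, y):
    """Intercept and slope from a simple regression of y on x."""
    m = len(x)
    x_bar, y_bar = sum(x) / m, sum(y) / m
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
        (xi - x_bar) ** 2 for xi in x
    )
    return y_bar - slope * x_bar, slope


# First stage: w = p0 + p1*z + v2
p0, p1 = simple_ols(z, w)
w_hat = [p0 + p1 * zi for zi in z]

# Second stage: regress h on the fitted values w_hat
_, a2_2sls = simple_ols(w_hat, h)

# For comparison: OLS of h on w ignores the endogeneity of w
_, a2_ols = simple_ols(w, h)

print(a2_2sls, a2_ols)
```

Standard errors taken from a literal second-stage regression are not valid, because its residuals are built from ŵ rather than w; the sketch is meant to illustrate the point estimate only.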
y* = b0 + xb + u, y = max(0,y*)