
Homework # 2 Solutions

9.3 Let math10 denote the percentage of students at a Michigan high school receiving a
passing score on a standardized math test (see also Example 4.2). We are interested in
estimating the effect of per-student spending on math performance. A simple model is:
$$math10 = \beta_0 + \beta_1 \log(expend) + \beta_2 \log(enroll) + \beta_3 poverty + u$$
where poverty is the percentage of students living in poverty.
(i) The variable lnchprg is the percentage of students eligible for the federally funded
school lunch program. Why is this a sensible proxy variable for poverty?

Eligibility for the federally funded school lunch program is tightly linked to
being economically disadvantaged, i.e., poor. This suggests that the percentage of
students eligible for the lunch program closely tracks the percentage of
students living in poverty.

(ii) The table that follows contains OLS estimates, with and without lnchprg as an
explanatory variable. Explain why the effect of expenditures on math10 is
lower in column (2) than in column (1). Is the effect in column (2) still
statistically greater than zero?

The effect of expenditures on math10 is lower in column (2) than in column (1) because
the estimate in (1) is likely biased upward by unobserved characteristics
that are correlated with expenditure and also affect math10. For example,
students who attend schools where expenditure is high probably come from
high-income families (since expenditure at a school is determined by the
households in the neighborhood), and these families likely provide resources that
raise test achievement. Another possible unobserved characteristic
is teacher quality: schools with high expenditure likely have better teachers,
which also raises student achievement.

It is likely that schools with high expenditure have low poverty rates, so the
correlation between expenditure and lnchprg (our proxy for poverty) is
negative. Moreover, schools with high poverty are likely to do worse on
student achievement. When lnchprg is omitted, as in column (1), the coefficient
on expenditure picks up part of the effect of poverty: the negative effect of
poverty on achievement combined with the negative correlation between poverty
and expenditure produces a positive bias in the estimated effect of expenditure.
Including lnchprg in column (2) controls for poverty and removes this bias.

https://www.coursehero.com/file/24493676/Homework-2-Solutionspdf/
The t statistic on log(expend) is 7.75/3.04 ≈ 2.55. For a two-sided test at the
5% level with 428 degrees of freedom, the critical value is 1.96. Since the t
statistic exceeds the critical value, we reject the null hypothesis that the
coefficient is zero. That is, the effect of expenditure on math10 is statistically
different from zero at the 5% level. (It is also statistically greater than zero
against the one-sided alternative, whose 5% critical value is about 1.645.)
(iii) Does it appear that pass rates are lower at larger schools, other factors being
equal?

The coefficient on log(enroll) in column (2) is -1.26 (s.e. = .58), so the t
statistic is -1.26/.58 ≈ -2.17. For a two-sided test at the 5% level with 428 degrees of
freedom, the critical value is 1.96, so the coefficient on log(enroll) is
statistically different from zero. A 10% increase in enrollment is associated with a drop
in math10 of about .126 percentage points (1.26/100 × 10), so pass rates do appear
to be lower at larger schools.
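
The t statistics in parts (ii) and (iii) can be reproduced with a few lines of arithmetic. The sketch below is only a check on the numbers quoted above; the helper name `t_stat` is illustrative, and 1.96 is the large-sample critical value used in the text.

```python
# Estimates and standard errors quoted in the solution for column (2).
coef_expend, se_expend = 7.75, 3.04
coef_enroll, se_enroll = -1.26, 0.58
critical_5pct_two_sided = 1.96  # large-sample value used in the text


def t_stat(coef, se, null_value=0.0):
    """t statistic for H0: coefficient equals null_value."""
    return (coef - null_value) / se


t_expend = t_stat(coef_expend, se_expend)
t_enroll = t_stat(coef_enroll, se_enroll)

# Two-sided decisions at the 5% level.
reject_expend = abs(t_expend) > critical_5pct_two_sided
reject_enroll = abs(t_enroll) > critical_5pct_two_sided

# Level-log model: a 10% rise in enroll changes math10 by about (coef/100)*10 points.
effect_10pct_enroll = coef_enroll / 100 * 10

print(round(t_expend, 2), reject_expend)
print(round(t_enroll, 2), reject_enroll)
print(round(effect_10pct_enroll, 3))
```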

(iv) Interpret the coefficient on lnchprg in column (2).

A 10 percentage point increase in lnchprg is associated with about a 3.24 percentage
point fall in math10, holding expenditure and enrollment fixed.
(v) What do you make of the substantial increase in R2 from column (1) to column
(2)?

In column (1) we explain only about 3% of the variation in math10,
whereas in column (2) we explain about 18%. This suggests that
lnchprg adds a lot of explanatory power to the regression.


9.4 The following equation explains weekly hours of television viewing by a child in
terms of the child’s age, mother’s education, father’s education, and number of siblings:
$$tvhours^* = \beta_0 + \beta_1 age + \beta_2 age^2 + \beta_3 motheduc + \beta_4 fatheduc + \beta_5 sibs + u$$
We are worried that tvhours* is measured with error in our survey. Let tvhours denote the
reported hours of television viewing per week.


(i) What do the classical errors-in-variables (CEV) assumptions require in this
application?

For the CEV assumptions to hold, we must be able to write
tvhours = tvhours* + e, where the measurement error e has zero mean and is
uncorrelated with tvhours* and with each explanatory variable in the equation.

(ii) Do you think the CEV assumptions are likely to hold?


It is unlikely that the CEV assumptions hold. For instance, we could argue
that more highly educated parents are more likely to underreport their child’s
television watching time because, say, they feel a stigma attached to watching
too much television. This would suggest that the measurement error e is negatively
correlated with the mother’s and father’s education (i.e., the higher the
education, the lower e is).

C9.1
(i) Apply RESET from equation (9.3) to the model estimated in Computer Exercise C7.5.
Is there evidence of functional form misspecification in the equation?

See the attached log. To run the RESET test, we first estimate the equation and obtain
the predicted values for log(salary), denoted lsalaryhat. We then create
lsalaryhat_sq = lsalaryhat*lsalaryhat and lsalaryhat_cube =
lsalaryhat*lsalaryhat*lsalaryhat. We add these regressors to the original model and
calculate the F statistic for their joint significance. We can calculate it by hand
or use the “test” command in STATA. The F statistic is 1.33 with p-value = 0.27.

The F statistic has 2 numerator degrees of freedom and 203 denominator degrees of
freedom. This gives a critical value of about c = 3 at alpha = 5%. Since the F statistic
is smaller than the critical value, we fail to reject the null hypothesis of joint
insignificance. That is, there is no evidence of functional form misspecification.

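The "by hand" calculation of the RESET F statistic can be sketched as follows, using the residual sums of squares reported in the attached log for the original and the augmented regressions. The function name `f_statistic` is illustrative and not part of the original solution.

```python
# Residual sums of squares and degrees of freedom read from the attached log.
ssr_restricted = 46.9319613    # original model: lsales, roe, rosneg
ssr_unrestricted = 46.3233711  # adds lsalaryhat_sq and lsalaryhat_cube
num_restrictions = 2           # two added terms are tested jointly
df_unrestricted = 203          # 209 observations - 5 regressors - 1


def f_statistic(ssr_r, ssr_ur, q, df_ur):
    """Standard F statistic for q exclusion restrictions."""
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / df_ur)


f_reset = f_statistic(ssr_restricted, ssr_unrestricted,
                      num_restrictions, df_unrestricted)

# Compare with the approximate 5% critical value used in the text.
critical_value = 3.0
reject_null = f_reset > critical_value

print(round(f_reset, 2), reject_null)
```

The result agrees with the value reported by the test command in the log. This formula relies on homoskedasticity, which is why it is not used for the robust version in part (ii).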
(ii) Compute a heteroskedasticity-robust form of RESET. Does your conclusion from part
(i) change?

We re-run the regression above with lsalaryhat_sq and lsalaryhat_cube as regressors and
use the robust option. We cannot compute the F statistic by hand from the sums of
squared residuals when using the robust option, so we use STATA’s test command to
obtain it. The F statistic is 2.17 with p-value 0.1163. The critical value for
alpha = 5% with 2 numerator degrees of freedom and 203 denominator degrees of freedom
is about c = 3. We still fail to reject the null of joint insignificance, meaning there
is no evidence of functional form misspecification. However, there is slightly more
evidence of functional form misspecification in part (ii) than in part (i), although
not enough to change our conclusion at alpha = 5%.


C9.3
Use the data from JTRAIN.RAW for this exercise.

(i) Consider the simple regression model
$$\log(scrap) = \beta_0 + \beta_1 grant + u$$
where scrap is the firm scrap rate and grant is a dummy variable indicating whether a
firm received a job training grant. Can you think of some reasons why the unobserved
factors in u might be correlated with grant?

If the grants were awarded to firms based on firm or worker characteristics, grant could
easily be correlated with such factors that affect productivity. In the simple regression
model, these are contained in u.

(ii) Estimate the simple regression model using data for 1988. Does receiving a job
training grant significantly lower a firm’s scrap rate?

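See the attached log. The coefficient on grant is positive (0.057) and not statistically
different from zero: the t statistic is (0.057 - 0)/0.406 ≈ 0.14. There is therefore no
evidence that receiving a grant lowers the scrap rate.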

(iii) Now, add as an explanatory variable log(scrap87). How does this change the
estimated effect of grant? Interpret the coefficient on grant. Is it statistically
significant at the 5% level against the one-sided alternative $H_1: \beta_{grant} < 0$?

See the attached log. When we add the lagged value of log(scrap), the coefficient on
grant changes from 0.057 to -0.254: holding the 1987 scrap rate fixed, firms that
received a grant have a scrap rate that is roughly 25.4% lower. The t statistic for
testing $H_0: \beta_{grant} = 0$ is (-0.254 - 0)/0.147 = -1.73. The 5% one-sided critical
value for n-k-1 = 54-2-1 = 51 degrees of freedom is not available in
Table G.2, so we use the critical value for dof = 60, which is 1.671 (it is also fine to
use the critical value for dof = 40, which is 1.684).

Since -1.73 < -1.671, we reject the null that grants have no effect on the scrap rate in
favor of the alternative that the effect is negative.
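
The one-sided test can be checked numerically with the estimates from the attached log. This is a minimal sketch: the critical values are the tabulated ones quoted above, and the exact percentage effect computed at the end is an added illustration, not part of the original solution.

```python
import math

# Coefficient and usual (non-robust) standard error on grant from the log.
coef_grant = -0.2539697
se_grant = 0.1470311

t_grant = coef_grant / se_grant

# Tabulated 5% one-sided critical values bracketing 51 degrees of freedom.
critical_df60 = 1.671
critical_df40 = 1.684

# Reject H0: beta_grant = 0 in favor of H1: beta_grant < 0 when t < -c.
reject_df60 = t_grant < -critical_df60
reject_df40 = t_grant < -critical_df40

# Reading the coefficient as a percentage uses the log approximation;
# the exact percentage change implied by a dummy is 100*(exp(b) - 1).
approx_pct = 100 * coef_grant
exact_pct = 100 * (math.exp(coef_grant) - 1)

print(round(t_grant, 2), reject_df60, reject_df40)
print(round(approx_pct, 1), round(exact_pct, 1))
```

The conclusion does not depend on which tabulated critical value is used.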

(iv) Test the null hypothesis that the parameter on log(scrap87) is one against the
two-sided alternative. Report the p-value.

From part (iii), the coefficient estimate on log(scrap87) is 0.8311606 with a
standard error of 0.0444444. For $H_0: \beta_{lscrap87} = 1$ versus
$H_1: \beta_{lscrap87} \neq 1$, the t statistic is
(0.8311606 - 1)/0.0444444 ≈ -3.799.
Since |-3.799| > 2.0 (the two-tailed 5% critical value for dof = 60),
we reject the null in favor of the alternative.

To get the p-value, we use the test command (see the attached log). The test
command reports an F statistic for the restriction specified.
Since only one restriction is being tested, the F statistic and its p-value
map to the t statistic through the following relationship:

t-stat^2 = F-stat (3.799^2 ≈ 14.43)

p-value of t-stat = p-value of F-stat

Therefore the two-sided p-value for the t statistic is 0.0004.
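
The relationship between the t and F statistics, and the resulting p-value, can be verified without Stata. The sketch below integrates the Student's t density numerically with Simpson's rule, using only the Python standard library. The function names are illustrative, and the step count is an arbitrary choice that is more than fine enough here.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t, df, steps=2000):
    """Two-sided p-value: 1 - 2 * integral of the density from 0 to |t|."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (total * h / 3)


# Estimate, standard error, and residual degrees of freedom from the log.
coef, se, df = 0.8311606, 0.0444444, 51

t_value = (coef - 1) / se       # test of H0: coefficient = 1
f_value = t_value ** 2          # F(1, df) statistic for the same restriction
p_value = two_sided_p_value(t_value, df)

print(round(t_value, 3), round(f_value, 2), round(p_value, 4))
```

The squared t statistic matches the F(1, 51) value that the test command reports in the log.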

(v) Repeat parts (iii) and (iv) using heteroskedasticity-robust standard errors, and
briefly discuss differences.

See the attached log. With the heteroskedasticity-robust standard error, the t statistic
for grant is -.254/.146 ≈ -1.74, so the coefficient is very slightly more significant
than with the usual standard error (t = -1.73), and we still reject the null against
the one-sided alternative at the 5% level. The t statistic for
$H_0: \beta_{\log(scrap87)} = 1$ is (.831 - 1)/.0735 ≈ -2.30, which is notably smaller in
magnitude than before, but the null is still rejected at the 5% level (robust
p-value = 0.0258).
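
To make the comparison in part (v) concrete, the sketch below recomputes both t statistics under the usual and the robust standard errors reported in the attached log. The dictionary layout and variable names are illustrative only.

```python
# Coefficients and standard errors from the 1988 regression in the log.
coef = {"grant": -0.2539697, "lscrap_1": 0.8311606}
se_usual = {"grant": 0.1470311, "lscrap_1": 0.0444444}
se_robust = {"grant": 0.1463727, "lscrap_1": 0.0735407}

# Null values: beta_grant = 0 and beta_lscrap_1 = 1.
null_value = {"grant": 0.0, "lscrap_1": 1.0}


def t_stats(standard_errors):
    """t statistic for each coefficient against its null value."""
    return {name: (coef[name] - null_value[name]) / standard_errors[name]
            for name in coef}


t_usual = t_stats(se_usual)
t_robust = t_stats(se_robust)

# For a single restriction, the squared t statistic equals the F statistic.
f_robust_lscrap = t_robust["lscrap_1"] ** 2

for name in coef:
    print(name, round(t_usual[name], 2), round(t_robust[name], 2))
print(round(f_robust_lscrap, 2))
```

The robust standard error on grant is almost unchanged, while the one on lscrap_1 rises by roughly two thirds, which is what shrinks the second t statistic.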

Wednesday November 25 17:50:38 2009 Page 1



log: C:\Documents and Settings\user\My Documents\Teaching\Fall 2009\Econ 8740\Homework Solut
log type: smcl
opened on: 25 Nov 2009, 17:44:10

1 . do "C:\DOCUME~1\user\LOCALS~1\Temp\STD0d000000.tmp"

2 . /*C 9.1*/
3 .
4 . /*(i)*/
5 . gen rosneg=.
(209 missing values generated)

6 . replace rosneg=1 if ros<0


(23 real changes made)

7 . replace rosneg=0 if ros>=0
(186 real changes made)

8 .
9 . reg lsalary lsales roe rosneg

Source SS df MS Number of obs = 209

F( 3, 205) = 28.81
Model 19.7902019 3 6.59673397 Prob > F = 0.0000
Residual 46.9319613 205 .228936397 R-squared = 0.2966
Adj R-squared = 0.2863
Total 66.7221632 208 .320779631 Root MSE = .47847

lsalary Coef. Std. Err. t P>|t| [95% Conf. Interval]



lsales .2883868 .0336172 8.58 0.000 .222107 .3546665


roe .0166571 .0039681 4.20 0.000 .0088336 .0244806
rosneg -.225675 .109338 -2.06 0.040 -.4412462 -.0101038
_cons 4.297602 .2932526 14.65 0.000 3.719425 4.87578

10 .

11 . predict lsalaryhat, xb

12 . gen lsalaryhat_sq=lsalaryhat*lsalaryhat

13 . gen lsalaryhat_cube=lsalaryhat*lsalaryhat*lsalaryhat

14 .

15 . reg lsalary lsales roe rosneg lsalaryhat_sq lsalaryhat_cube

Source SS df MS Number of obs = 209


F( 5, 203) = 17.88
Model 20.3987921 5 4.07975843 Prob > F = 0.0000
Residual 46.3233711 203 .228193946 R-squared = 0.3057
Adj R-squared = 0.2886
Total 66.7221632 208 .320779631 Root MSE = .4777

lsalary Coef. Std. Err. t P>|t| [95% Conf. Interval]

lsales -1.210438 17.77697 -0.07 0.946 -36.26162 33.84074


roe -.0685937 1.026873 -0.07 0.947 -2.093299 1.956112
rosneg .9572086 13.9183 0.07 0.945 -26.48576 28.40018
lsalaryhat~q 1.140818 8.878866 0.13 0.898 -16.36581 18.64744
lsalaryhat~e -.0732213 .4256701 -0.17 0.864 -.9125231 .7660804
_cons -12.42752 122.5815 -0.10 0.919 -254.1238 229.2687


16 . test lsalaryhat_sq lsalaryhat_cube

( 1) lsalaryhat_sq = 0
( 2) lsalaryhat_cube = 0

F( 2, 203) = 1.33
Prob > F = 0.2659

17 .
18 . /*(ii)*/
19 .
20 . reg lsalary lsales roe rosneg lsalaryhat_sq lsalaryhat_cube, robust

Linear regression Number of obs = 209


F( 5, 203) = 29.19
Prob > F = 0.0000
R-squared = 0.3057
Root MSE = .4777

Robust

lsalary Coef. Std. Err. t P>|t| [95% Conf. Interval]

lsales -1.210438 13.37127 -0.09 0.928 -27.57482 25.15394

roe -.0685937 .7721086 -0.09 0.929 -1.590975 1.453788
rosneg .9572086 10.47364 0.09 0.927 -19.69387 21.60829
lsalaryhat~q 1.140818 6.689588 0.17 0.865 -12.04917 14.3308

lsalaryhat~e -.0732213 .3211029 -0.23 0.820 -.706346 .5599033
_cons -12.42752 92.4416 -0.13 0.893 -194.6964 169.8413
21 . test lsalaryhat_sq lsalaryhat_cube

( 1) lsalaryhat_sq = 0

( 2) lsalaryhat_cube = 0

F( 2, 203) = 2.17

Prob > F = 0.1163

22 .
23 .
end of do-file

24 . clear

25 . use "C:\Documents and Settings\user\My Documents\Teaching\Fall 2009\Econ 8740\Homework Assignments\Data\

26 . do "C:\DOCUME~1\user\LOCALS~1\Temp\STD0d000000.tmp"

27 .
28 . /*C 9.8*/

29 .
30 . /*(i)*/
31 .
32 . reg lscrap grant if year==1988

Source SS df MS Number of obs = 54


F( 1, 52) = 0.02
Model .039451758 1 .039451758 Prob > F = 0.8895
Residual 105.323208 52 2.02544631 R-squared = 0.0004
Adj R-squared = -0.0188
Total 105.36266 53 1.98797472 Root MSE = 1.4232

Monday February 1 12:06:40 2016 Page 1


name: <unnamed>
log: C:\Users\rbhatt\Desktop\Log C9.3.smcl
log type: smcl
opened on: 1 Feb 2016, 12:04:10

1 . do "C:\Users\rbhatt\AppData\Local\Temp\STD0i000000.tmp"

2 . /*C. 9.3*/
3 .
4 . /*(ii)*/
5 .
6 . reg lscrap grant if year==1988

Source SS df MS Number of obs = 54


F(1, 52) = 0.02

Model .039451758 1 .039451758 Prob > F = 0.8895
Residual 105.323208 52 2.02544631 R-squared = 0.0004

Adj R-squared = -0.0188

Total 105.36266 53 1.98797472 Root MSE = 1.4232

lscrap Coef. Std. Err. t P>|t| [95% Conf. Interval]

grant .0566004 .4055519 0.14 0.890 -.757199 .8703998
_cons .408526 .2405616 1.70 0.095 -.0741962 .8912482

7 .

8 .

9 . /*(iii)*/
10 . reg lscrap grant lscrap_1 if year==1988

Source SS df MS Number of obs = 54


F(2, 51) = 174.94
Model 91.9584791 2 45.9792396 Prob > F = 0.0000
Residual 13.4041809 51 .262827077 R-squared = 0.8728

Adj R-squared = 0.8678


Total 105.36266 53 1.98797472 Root MSE = .51267

lscrap Coef. Std. Err. t P>|t| [95% Conf. Interval]

grant -.2539697 .1470311 -1.73 0.090 -.5491469 .0412076



lscrap_1 .8311606 .0444444 18.70 0.000 .7419347 .9203865


_cons .021237 .0890967 0.24 0.813 -.1576321 .2001061

11 .
12 .
13 . /*(iv)*/
14 . reg lscrap grant lscrap_1 if year==1988

Source SS df MS Number of obs = 54


F(2, 51) = 174.94
Model 91.9584791 2 45.9792396 Prob > F = 0.0000
Residual 13.4041809 51 .262827077 R-squared = 0.8728
Adj R-squared = 0.8678
Total 105.36266 53 1.98797472 Root MSE = .51267


lscrap Coef. Std. Err. t P>|t| [95% Conf. Interval]

grant -.2539697 .1470311 -1.73 0.090 -.5491469 .0412076


lscrap_1 .8311606 .0444444 18.70 0.000 .7419347 .9203865
_cons .021237 .0890967 0.24 0.813 -.1576321 .2001061

15 . test lscrap_1=1

( 1) lscrap_1 = 1

F( 1, 51) = 14.43
Prob > F = 0.0004

16 .
17 .
18 . /*(v)*/
19 . reg lscrap grant lscrap_1 if year==1988, robust

Linear regression Number of obs = 54

F(2, 51) = 77.79
Prob > F = 0.0000

R-squared = 0.8728

Root MSE = .51267

Robust
lscrap Coef. Std. Err. t P>|t| [95% Conf. Interval]
grant -.2539697 .1463727 -1.74 0.089 -.5478251 .0398857
lscrap_1 .8311606 .0735407 11.30 0.000 .6835215 .9787996
_cons .021237 .0998451 0.21 0.832 -.1792103 .2216843

20 . test lscrap_1=1

( 1) lscrap_1 = 1

F( 1, 51) = 5.27
Prob > F = 0.0258

21 .
end of do-file

22 . log close
name: <unnamed>
log: C:\Users\rbhatt\Desktop\Log C9.3.smcl
log type: smcl

closed on: 1 Feb 2016, 12:04:21

