Multiple Regression Analysis: Inference (with Answer as of Oct 9, 2012)



You can obtain the data (in Stata format) from this website: http://fmwww.bc.edu/ec-p/data/wooldridge/datasets.list.html (as of Oct 2, 2012).

Example 4.1 Hourly wage equation
Data: wage1s8.dta
1. Check the number of observations. Ans: 526
2. Check the data for log(wage), educ, exper, and tenure.
Answer:

sum lwage educ exper tenure

Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
lwage | 526 1.623268 .5315382 -.6348783 3.218076
educ | 526 12.56274 2.769022 0 18
exper | 526 17.01711 13.57216 1 51
tenure | 526 5.104563 7.224462 0 44

3. Estimate with OLS the regression:
$$\log(\mathit{wage}_i) = \beta_0 + \beta_1 \mathit{educ}_i + \beta_2 \mathit{exper}_i + \beta_3 \mathit{tenure}_i + u_i$$
Answer:

reg lwage educ exper tenure

Source | SS df MS Number of obs = 526
-------------+------------------------------ F( 3, 522) = 80.39
Model | 46.8741805 3 15.6247268 Prob > F = 0.0000
Residual | 101.455581 522 .194359351 R-squared = 0.3160
-------------+------------------------------ Adj R-squared = 0.3121
Total | 148.329762 525 .28253288 Root MSE = .44086

------------------------------------------------------------------------------
lwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
educ | .092029 .0073299 12.56 0.000 .0776292 .1064288
exper | .0041211 .0017233 2.39 0.017 .0007357 .0075065
tenure | .0220672 .0030936 7.13 0.000 .0159897 .0281448
_cons | .2843595 .1041904 2.73 0.007 .0796755 .4890435
------------------------------------------------------------------------------

4. Test whether the return to exper, controlling for educ and tenure, is zero in the population, against the alternative that it is positive. Write the hypotheses. Determine the t-statistic and the critical values at $\alpha = 1\%$ and $\alpha = 5\%$. What is your conclusion?
Answer:
$H_0: \beta_{\mathit{exper}} = 0$
$H_1: \beta_{\mathit{exper}} > 0$
$t = 0.0041211 / 0.0017233 = 2.3914$

One-sided critical value at $\alpha = 5\%$ is 1.65. This value can be obtained with the command disp invttail(522,0.05), which yields 1.6477779.
One-sided critical value at $\alpha = 1\%$ is 2.33. This value can be obtained with the command disp invttail(522,0.01), which yields 2.3335127.

invttail(n,p)
Domain n 2e-10 to 2e+17 (may be nonintegral)
Domain p 0 to 1
Range 0 to 1e+10
Description returns the inverse reverse cumulative (upper-tail,
survival) Student's t distribution: if ttail(n,t) = p,
then invttail(n,p) = t.

Conclusion: Since the t-statistic (2.39) is greater than the critical value at $\alpha = 1\%$ (2.33), $H_0$ is rejected at the 1% level, and therefore also at the 5% level. Controlling for educ and tenure, the return to experience is statistically significantly positive.
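
The test above can also be reproduced from Stata's stored results instead of retyping the numbers. The following is a minimal sketch, assuming wage1s8.dta is in the working directory and the regression is the one from step 3; the scalar name t_exper is only illustrative.

```stata
* Sketch: one-sided t test of H0: beta_exper = 0 vs H1: beta_exper > 0
use wage1s8.dta, clear
quietly regress lwage educ exper tenure

* t statistic built from the stored coefficient and standard error
scalar t_exper = _b[exper] / _se[exper]
display "t statistic       = " t_exper

* e(df_r) holds the residual degrees of freedom (522 in the output above)
display "5% critical value = " invttail(e(df_r), 0.05)
display "1% critical value = " invttail(e(df_r), 0.01)

* upper-tail probability, i.e. the one-sided p-value
display "one-sided p-value = " ttail(e(df_r), t_exper)
```

Using e(df_r) rather than a typed 522 keeps the sketch valid if the sample or the set of regressors changes.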

We can also draw the conclusion by using the p-value information given by the Stata output.
However that p-value corresponds to the two-sided alternative hypothesis:
$H_0: \beta_{\mathit{exper}} = 0$
$H_1: \beta_{\mathit{exper}} \neq 0$
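To get the one-sided p-value, divide the reported p-value by two: 0.017/2 = 0.0085. (This shortcut applies because the estimated coefficient is positive, which is the direction of the alternative.)
Since the one-sided p-value is less than 1%, we can reject $H_0$ at $\alpha = 1\%$ (and hence at 5%), and thus conclude that the return to experience is significantly positive.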
The one-sided p-value can also be obtained by using the command: di ttail(522,2.3914) which
yields 0.00856867.

ttail(n,t)
Domain n 2e-10 to 2e+17
Domain t -8e+307 to 8e+307
Range 0 to 1
Description returns the reverse cumulative (upper-tail, survival)
Student's t distribution; it returns the probability T>t.

5. By how much would the wage change if experience increased by three years?
Answer: The model is of the log-level type. The coefficient on exper is 0.0041211, meaning that one more year of experience is predicted to raise the wage by about (0.0041211 x 100) percent = 0.41 percent, holding educ and tenure fixed. Hence, if experience increases by three years, the wage is predicted to increase by about (3 x 0.0041211 x 100) percent = 1.24 percent.
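
As a worked check of the log-level arithmetic, the lines below compute the approximate percentage change and, for comparison, the exact change. The exact version uses the standard formula $100[\exp(\hat\beta \, \Delta x) - 1]$, which this handout does not itself use; it is included only to show that the approximation is close for small changes.

```latex
\begin{aligned}
\%\Delta \mathit{wage} &\approx 100 \cdot \hat\beta_{\mathit{exper}} \cdot \Delta \mathit{exper}
  = 100 \times 0.0041211 \times 3 \approx 1.24\% \\
\%\Delta \mathit{wage}_{\text{exact}} &= 100\,[\exp(0.0041211 \times 3) - 1] \approx 1.24\%
\end{aligned}
```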

Example 4.2 Student performance and school size
Data: meap93s8.dta
We want to investigate the effect of school size on student performance. One claim is that, everything else
being equal, students at smaller schools fare better than those at larger schools. The meap93s8.dta
contains data on 408 high schools in Michigan for the year 1993. We will use the data to test the null
hypothesis that school size has no effect on standardized test scores against the alternative that size has a
negative effect. Performance is measured by the percentage of students receiving a passing score on the
tenth-grade math test (math10). School size is measured by student enrollment (enroll).

1. Check the number of observations. Ans: 408
2. Check the data of math10, enroll, totcomp (teacher compensation), and staff (number of staff per
1000 students)
Answer:
sum math enroll totcom staff

Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
math10 | 408 24.10686 10.49361 1.9 66.7
enroll | 408 2663.806 2696.821 212 16793
totcomp | 408 38237.94 5985.086 24498 63518
staff | 408 100.6417 13.29952 65.9 166.6

3. Estimate with OLS the regression
$$\mathit{math10}_i = \beta_0 + \beta_1 \mathit{totcomp}_i + \beta_2 \mathit{staff}_i + \beta_3 \mathit{enroll}_i + u_i$$
Answer:
reg math totcom staff enroll

Source | SS df MS Number of obs = 408
-------------+------------------------------ F( 3, 404) = 7.70
Model | 2422.93434 3 807.644779 Prob > F = 0.0001
Residual | 42394.2462 404 104.936253 R-squared = 0.0541
-------------+------------------------------ Adj R-squared = 0.0470
Total | 44817.1805 407 110.115923 Root MSE = 10.244

------------------------------------------------------------------------------
math10 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
totcomp | .0004586 .0001004 4.57 0.000 .0002613 .0006559
staff | .0479199 .039814 1.20 0.229 -.0303487 .1261884
enroll | -.0001976 .0002152 -0.92 0.359 -.0006207 .0002255
_cons | 2.274021 6.113794 0.37 0.710 -9.744801 14.29284
------------------------------------------------------------------------------

3a. Test whether the impact of school size (enroll) is zero in the population against the alternative that it is negative. Write the hypotheses. Determine the t-statistic and the critical values at $\alpha = 5\%$ and $\alpha = 15\%$. What is your conclusion?
Answer:
$H_0: \beta_{\mathit{enroll}} = 0$
$H_1: \beta_{\mathit{enroll}} < 0$
$t = -0.0001976 / 0.0002152 = -0.91822$
One-sided (left) critical value at $\alpha = 5\%$ is -1.65. This value can be obtained with the command disp invttail(404,0.95), which yields -1.648634.
One-sided (left) critical value at $\alpha = 15\%$ is -1.04. This value can be obtained with the command disp invttail(404,0.85), which yields -1.0377654.
Conclusion: Since the absolute t-statistic (0.92) is smaller than the absolute critical value at the 15% level (1.04), $H_0$ cannot be rejected at the 15% significance level, which also means it cannot be rejected at any lower significance level. In this model there is no statistically significant evidence that school size affects student performance.
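
For a left-tailed alternative the same stored results can be used, with the tail probabilities flipped. The following is a minimal sketch, assuming meap93s8.dta is in the working directory; the scalar name t_enroll is only illustrative.

```stata
* Sketch: left-tailed t test of H0: beta_enroll = 0 vs H1: beta_enroll < 0
use meap93s8.dta, clear
quietly regress math10 totcomp staff enroll

scalar t_enroll = _b[enroll] / _se[enroll]
display "t statistic        = " t_enroll

* invttail() takes an upper-tail probability, so 0.95 and 0.85
* give the left-tail critical values for alpha = 5% and 15%
display "5% critical value  = " invttail(e(df_r), 0.95)
display "15% critical value = " invttail(e(df_r), 0.85)

* left-tail p-value: P(T < t) = 1 - P(T > t)
display "one-sided p-value  = " 1 - ttail(e(df_r), t_enroll)
```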

3b. Test whether the impact of totcomp is zero in the population against the alternative that it is positive. Write the hypotheses. Determine the t-statistic and the critical values at $\alpha = 1\%$, $\alpha = 5\%$, and $\alpha = 10\%$. What is your conclusion?

3c. Test whether the impact of staff is zero in the population against the alternative that it is positive. Write the hypotheses. Determine the t-statistic and the critical values at $\alpha = 1\%$, $\alpha = 5\%$, and $\alpha = 10\%$. What is your conclusion?

4. Estimate the regression
$$\mathit{math10}_i = \beta_0 + \beta_1 \log(\mathit{totcomp}_i) + \beta_2 \log(\mathit{staff}_i) + \beta_3 \log(\mathit{enroll}_i) + u_i$$
Answer:

reg math ltotcom lstaff lenroll

Source | SS df MS Number of obs = 408
-------------+------------------------------ F( 3, 404) = 9.42
Model | 2930.03231 3 976.677437 Prob > F = 0.0000
Residual | 41887.1482 404 103.68106 R-squared = 0.0654
-------------+------------------------------ Adj R-squared = 0.0584
Total | 44817.1805 407 110.115923 Root MSE = 10.182

------------------------------------------------------------------------------
math10 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ltotcomp | 21.15498 4.055549 5.22 0.000 13.18237 29.1276
lstaff | 3.979981 4.189659 0.95 0.343 -4.256274 12.21624
lenroll | -1.268042 .6932037 -1.83 0.068 -2.630778 .0946951
_cons | -207.6645 48.70311 -4.26 0.000 -303.4077 -111.9213
------------------------------------------------------------------------------

4a. Test whether the impact of log school size is zero in the population against the alternative that it is negative. Write the hypotheses. Determine the t-statistic and the critical value at $\alpha = 5\%$. What is your conclusion?
Answer:
$H_0: \beta_{\mathit{lenroll}} = 0$
$H_1: \beta_{\mathit{lenroll}} < 0$
$t = -1.268042 / 0.6932037 = -1.82925$
One-sided (left) critical value at $\alpha = 5\%$ is -1.65. This value can be obtained with the command disp invttail(404,0.95), which yields -1.648634.
Conclusion: Since the absolute t-statistic (1.829) is greater than the absolute critical value (1.65), we reject $H_0$ at the 5% significance level. This model indicates that school size has a negative effect on student performance.

4b. Conduct the test whether the impact of log(totcomp) is zero in the population against the
alternative that it is positive.
4c. Conduct the test whether the impact of log(staff) is zero in the population against the alternative
that it is positive.

5. What is the impact of an increase in enroll by 10% on math10?
Answer:
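Under the first model, $\hat\beta_{\mathit{enroll}}$ is -.0001976, meaning that if school size (enroll) increases by one student, the percentage of students receiving a passing score on the tenth-grade math test (math10) is predicted to drop by 0.0001976 percentage points. Evaluated at the sample mean of enroll (about 2,664), a 10% increase is about 266 students, which implies a drop of about 266 x 0.0001976 = 0.053 percentage points. However, this effect is not statistically significant even at $\alpha = 15\%$.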

Under the second model (note: level-log type), $\hat\beta_{\mathit{lenroll}}$ is -1.27, meaning that if enroll increases by one percent, the percentage of students receiving a passing score on the tenth-grade math test (math10) is predicted to drop by (1.27/100) = 0.013 percentage points.
Hence the impact of a 10% increase in enroll would be a reduction in math10 of about (10 x 0.013) = 0.13 percentage points.
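
As a worked check of the level-log arithmetic, the lines below give the usual approximation and, for comparison, the exact change implied by the model. The exact version, which multiplies the coefficient by $\log(1.10)$, is not used in this handout and is shown only to indicate the size of the approximation error.

```latex
\begin{aligned}
\Delta \mathit{math10} &\approx \frac{\hat\beta_{\mathit{lenroll}}}{100} \cdot \%\Delta \mathit{enroll}
  = \frac{-1.268042}{100} \times 10 \approx -0.127 \\
\Delta \mathit{math10}_{\text{exact}} &= \hat\beta_{\mathit{lenroll}} \cdot \log(1.10)
  = -1.268042 \times 0.09531 \approx -0.121
\end{aligned}
```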

6. Which model do you prefer?
Answer:
The first regression suggests that there is no statistically significant linear relationship between school size and student performance. The second regression says that student performance is linearly related to the log of school size, so that it responds to percentage changes in enrollment (i.e. the model captures a non-linear relationship between school size and student performance). If we believe that school size affects student performance in a non-linear fashion, then the first model is inadequate and the second model should be preferred. The second model also has the higher R-squared (0.0654 versus 0.0541) with the same dependent variable.

Example 4.4 Campus crime and enrollment
Data: campuss8.dta
1. Check the data. It contains 97 observations on colleges and universities in the US for the year 1992 on
the number of campus crimes (crime) and the enrollment (enroll).
Answer:

desc

Contains data from C:\STORAGE\Z_Copy\Dept IE\Ekonometri 1_S2\BKF\Data\Data stata
8\campuss8.dta
obs: 97
vars: 7
size: 2,716 (99.9% of memory free)
------------------------------------------------------------------------
storage display value
variable name type format label variable label
------------------------------------------------------------------------
enroll long %12.0g total enrollment
priv byte %8.0g =1 if private college
police byte %8.0g employed officers
crime int %8.0g total campus crimes
lcrime float %9.0g log(crime)
lenroll float %9.0g log(enroll)
lpolice float %9.0g log(police)
------------------------------------------------------------------------

2. What are the average values of crime and enroll?
Answer:

sum crime enroll

Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
crime | 97 394.4536 460.7839 1 2052
enroll | 97 16076.35 12298.99 1799 56350

3. Estimate with OLS the regression:
$$\log(\mathit{crime}_i) = \beta_0 + \beta_1 \log(\mathit{enroll}_i) + u_i$$
This is a constant elasticity model where $\beta_1$ is the elasticity of crime with respect to enrollment.
Answer:

reg lcrime lenroll

Source | SS df MS Number of obs = 97
-------------+------------------------------ F( 1, 95) = 133.79
Model | 107.083654 1 107.083654 Prob > F = 0.0000
Residual | 76.0358244 95 .800377098 R-squared = 0.5848
-------------+------------------------------ Adj R-squared = 0.5804
Total | 183.119479 96 1.90749457 Root MSE = .89464

------------------------------------------------------------------------------
lcrime | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lenroll | 1.26976 .109776 11.57 0.000 1.051827 1.487693
_cons | -6.63137 1.03354 -6.42 0.000 -8.683206 -4.579533
------------------------------------------------------------------------------
Interpretation of $\hat\beta_1$:
If enrollment (enroll) increases by 1 percent, then the number of campus crimes (crime) is predicted to increase by about 1.27 percent.

4. Test the hypothesis that the elasticity of crime with respect to enrollment is one against the alternative that the elasticity is greater than one. Write the hypotheses. Determine the t-statistic and the critical value at $\alpha = 5\%$. What is your conclusion?
Note: The t-stat should be computed as
$$t_{\mathit{stat}} = \frac{\hat\beta_1 - \beta_1}{\mathrm{sd}(\hat\beta_1)} = \frac{\hat\beta_1 - 1}{\mathrm{sd}(\hat\beta_1)}$$
Answer:
$H_0: \beta_{\mathit{lenroll}} = 1$
$H_1: \beta_{\mathit{lenroll}} > 1$
$t = (1.26976 - 1) / 0.109776 = 2.45736$
One-sided (right) critical value at $\alpha = 5\%$ is 1.66. This value can be obtained with the command disp invttail(95,0.05), which yields 1.6610518.
Conclusion: Since the t-statistic (2.46) is greater than the critical value (1.66), $H_0$ is rejected. We conclude that the elasticity of crime with respect to enroll is greater than one.
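
When the hypothesized value is not zero, the t-statistic reported in the regression table cannot be used directly. The following is a minimal sketch of the computation above, assuming campuss8.dta is in the working directory; the scalar name t_one is only illustrative.

```stata
* Sketch: test H0: beta_lenroll = 1 vs H1: beta_lenroll > 1
use campuss8.dta, clear
quietly regress lcrime lenroll

* subtract the hypothesized value (1) before dividing by the standard error
scalar t_one = (_b[lenroll] - 1) / _se[lenroll]
display "t statistic       = " t_one
display "5% critical value = " invttail(e(df_r), 0.05)
display "one-sided p-value = " ttail(e(df_r), t_one)

* built-in two-sided Wald test of the same null; its F statistic equals t^2
test lenroll = 1
```

Note that test reports a two-sided p-value, so for the one-sided alternative the ttail() line is the relevant one.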

Example 4.5 Housing prices and air pollution
Data: hprice2s8.dta
For a sample of 506 communities in the Boston area, we estimate a model relating median housing price (price) in the community to various community characteristics: nox is the amount of nitrogen oxide in the air, in parts per million; dist is a weighted distance of the community from five employment centers, in miles; rooms is the average number of rooms in houses in the community; and stratio is the average student-teacher ratio of schools in the community.
1. Estimate the regression:
$$\log(\mathit{price}_i) = \beta_0 + \beta_1 \log(\mathit{nox}_i) + \beta_2 \log(\mathit{dist}_i) + \beta_3 \mathit{rooms}_i + \beta_4 \mathit{stratio}_i + u_i$$
2. Test the hypothesis that the elasticity of price with respect to nox is negative one against the alternative that it does not equal negative one. Write the hypotheses. Determine the t-statistic and the critical value at $\alpha = 5\%$. What is your conclusion?
Note: The t-stat should be computed as
$$t_{\mathit{stat}} = \frac{\hat\beta_1 - \beta_1}{\mathrm{sd}(\hat\beta_1)} = \frac{\hat\beta_1 - (-1)}{\mathrm{sd}(\hat\beta_1)}$$
3. Calculate the p-value for testing the hypothesis that the elasticity of price with respect to nox is
negative one against the alternative that it does not equal negative one.
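
One possible way to set up steps 1 to 3 is sketched below. It assumes that hprice2s8.dta contains the variables price, nox, dist, rooms, and stratio as described above; the names log_price, log_nox, log_dist, and t_nox are hypothetical and are created here only for illustration.

```stata
* Sketch: two-sided test of H0: beta_1 = -1 in the housing price model
use hprice2s8.dta, clear

* hypothetical names for the logged variables
generate log_price = log(price)
generate log_nox   = log(nox)
generate log_dist  = log(dist)

regress log_price log_nox log_dist rooms stratio

* t statistic for H0: beta_1 = -1, i.e. (b1 - (-1)) / se(b1)
scalar t_nox = (_b[log_nox] + 1) / _se[log_nox]
display "t statistic             = " t_nox

* two-sided test: put alpha/2 = 0.025 in each tail
display "5% critical value (+/-) = " invttail(e(df_r), 0.025)

* two-sided p-value: twice the upper-tail area beyond |t|
display "two-sided p-value       = " 2 * ttail(e(df_r), abs(t_nox))
```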

Example 4.7 Effect of job training on firm scrap rates
Data: jtrains8.dta
The scrap rate for a manufacturing firm is the number of defective items out of every 100 produced. Thus,
for a given number of items produced, a decrease in the scrap rate reflects higher worker productivity. We
want to use the scrap rate to measure the effect of worker training on productivity.
1. Check the data. For the year 1987 and for non-unionized firms, how many observations does the data set have?
2. Estimate the following regression only for the year 1987 and for non-unionized firms:
$$\log(\mathit{scrap}_i) = \beta_0 + \beta_1 \mathit{hrsemp}_i + \beta_2 \log(\mathit{sales}_i) + \beta_3 \log(\mathit{employ}_i) + u_i$$
where hrsemp is annual hours of training per employee, sales is annual firm sales (in dollars), and
employ is the number of firm employees.
3. What is the average scrap rate and average hrsemp in 1987?
4. What is your comment on the economic significance of the training variable?
5. What about the statistical significance of the training variable?
5a. Test the hypothesis that the effect of training on scrap is zero in the population against the alternative that it is negative. Write the hypotheses. Determine the t-statistic and the critical values at $\alpha = 1\%$, $\alpha = 5\%$, and $\alpha = 10\%$. What is your conclusion?
5b. Calculate the p-value to test the hypothesis that the effect of training on scrap is zero in the
population against the alternative that it is negative.
Note: The one-sided p-value is obtained as one-half of the p-value for the two-tailed test.
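
The halving rule in the note gives the correct one-sided p-value only when the estimate has the sign stated in the alternative. The sketch below makes that condition explicit. It assumes that jtrains8.dta contains variables named year, union, scrap, hrsemp, sales, and employ (names taken from the description above, not confirmed by any output in this handout); log_scrap, log_sales, log_employ, and the scalar names are hypothetical.

```stata
* Sketch: one-sided (left) p-value for the training variable
use jtrains8.dta, clear

* hypothetical names for the logged variables
generate log_scrap  = log(scrap)
generate log_sales  = log(sales)
generate log_employ = log(employ)

regress log_scrap hrsemp log_sales log_employ if year == 1987 & union == 0

scalar t_hrs = _b[hrsemp] / _se[hrsemp]
scalar p_two = 2 * ttail(e(df_r), abs(t_hrs))

* halve the two-sided p-value only if the estimate is negative,
* which is the direction of the alternative; otherwise use 1 - p/2
scalar p_left = cond(t_hrs < 0, p_two / 2, 1 - p_two / 2)
display "two-sided p-value        = " p_two
display "one-sided (left) p-value = " p_left

* the same left-tail probability computed directly
display "direct left-tail p-value = " 1 - ttail(e(df_r), t_hrs)
```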

Section 4.4 Testing hypotheses about a single linear combination of the parameters
In this section we show how to test a single hypothesis involving more than one of the $\beta_j$. Consider a simple model to compare the returns to education at junior colleges and four-year colleges (universities).
The model is
$$\log(\mathit{wage}) = \beta_0 + \beta_1 \mathit{jc} + \beta_2 \mathit{univ} + \beta_3 \mathit{exper} + u$$
The hypothesis of interest is whether one year at a junior college is worth one year at a university, against the one-sided alternative that a year at a junior college is worth less than a year at a university. Thus:
$H_0: \beta_1 = \beta_2$, $H_1: \beta_1 < \beta_2$.
The t-statistic for these hypotheses is
$$t = \frac{\hat\beta_1 - \hat\beta_2}{\mathrm{se}(\hat\beta_1 - \hat\beta_2)}$$
1. Estimate with OLS the regression
$$\log(\mathit{wage}_i) = \beta_0 + \beta_1 \mathit{jc}_i + \beta_2 \mathit{univ}_i + \beta_3 \mathit{exper}_i + u_i$$

Answer:
use twoyears8.dta, clear

desc

Contains data from C:\STORAGE\Z_Copy\Dept IE\Ekonometri 1_S2\BKF\Data\Data stata
8\twoyears8.dta
obs: 6,763
vars: 22 27 Feb 2012 08:36
size: 311,098 (97.3% of memory free)
------------------------------------------------------------------------------
storage display value
variable name type format label variable label
------------------------------------------------------------------------------
female byte %8.0g =1 if female
phsrank byte %8.0g % high school rank; 100 = best
ba byte %8.0g =1 if bachelor's degree
aa byte %8.0g =1 if associate's degree
black byte %8.0g =1 if african-american
hispanic byte %8.0g =1 if hispanic
id long %12.0g id number
exper int %8.0g total (actual) work experience
jc float %9.0g total 2-year credits
univ float %9.0g total 4-year credits
lwage float %9.0g log hourly wage
stotal float %9.0g total standardized test score
smcity byte %8.0g =1 if small city, 1972
medcity byte %8.0g =1 if med. city, 1972
submed byte %8.0g =1 if suburb med. city, 1972
lgcity byte %8.0g =1 if large city, 1972
sublg byte %8.0g =1 if suburb large city, 1972
vlgcity byte %8.0g =1 if very large city, 1972
subvlg byte %8.0g =1 if sub. very lge. city, 1972
variabl0 byte %8.0g =1 if northeast
nc byte %8.0g =1 if north central
south byte %8.0g =1 if south
------------------------------------------------------------------------------

sum lwage jc univ exper

Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
lwage | 6763 2.248096 .4876918 .5555456 3.911953
jc | 6763 .3388946 .7721268 0 3.833333
univ | 6763 1.926274 2.297001 0 7.5
exper | 6763 122.3816 33.42799 3 166

reg lwage jc univ exper

Source | SS df MS Number of obs = 6763
-------------+------------------------------ F( 3, 6759) = 644.53
Model | 357.752575 3 119.250858 Prob > F = 0.0000
Residual | 1250.54352 6759 .185019014 R-squared = 0.2224
-------------+------------------------------ Adj R-squared = 0.2221
Total | 1608.29609 6762 .237843255 Root MSE = .43014

------------------------------------------------------------------------------
lwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
jc | .0666967 .0068288 9.77 0.000 .0533101 .0800833
univ | .0768762 .0023087 33.30 0.000 .0723504 .0814021
exper | .0049442 .0001575 31.40 0.000 .0046355 .0052529
_cons | 1.472326 .0210602 69.91 0.000 1.431041 1.51361
------------------------------------------------------------------------------

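2. Find
$$\mathrm{se}(\hat\beta_1 - \hat\beta_2) = \{[\mathrm{se}(\hat\beta_1)]^2 + [\mathrm{se}(\hat\beta_2)]^2 - 2 s_{12}\}^{1/2}$$
where $s_{12}$ denotes the estimated covariance between $\hat\beta_1$ and $\hat\beta_2$.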
Answer:

qui reg lwage jc univ exper
matrix v=e(V)
matrix list v

symmetric v[4,4]
jc univ exper _cons
jc .00004663
univ 1.928e-06 5.330e-06
exper -1.718e-08 3.933e-08 2.480e-08
_cons -.00001741 -.00001573 -3.105e-06 .00044353

scalar s12 = v[2,1]
scalar se_jc_min_univ=(_se[jc]^2 + _se[univ]^2 -2*s12)^(1/2)
disp se_jc_min_univ

.00693591

From the last command we get $\mathrm{se}(\hat\beta_1 - \hat\beta_2) = 0.00693591$.
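
The same number can be checked by hand from the entries of the variance-covariance matrix listed above (the diagonal entries are the squared standard errors). Because the listed entries are rounded, the hand calculation agrees with the Stata result only to about three significant digits.

```latex
\begin{aligned}
\mathrm{se}(\hat\beta_1 - \hat\beta_2)
  &= \{[\mathrm{se}(\hat\beta_1)]^2 + [\mathrm{se}(\hat\beta_2)]^2 - 2 s_{12}\}^{1/2} \\
  &\approx \{0.00004663 + 0.00000533 - 2 \times 0.000001928\}^{1/2} \\
  &= \{0.000048104\}^{1/2} \approx 0.00694
\end{aligned}
```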

3. Find the p-value for the test.
Answer:
The t-statistic for these hypotheses is
$$t = \frac{\hat\beta_1 - \hat\beta_2}{\mathrm{se}(\hat\beta_1 - \hat\beta_2)}$$


scalar t = (_b[jc]-_b[univ])/se_jc_min_univ
disp t
-1.4676566
disp 1-ttail(6759, -1.4676566)
.07112203
The p-value suitable for the test is 0.07112203.
Since the one-sided p-value is 7.11%, which is greater than 5%, $H_0$ is not rejected at the 5% significance level; we cannot reject the hypothesis that one year at a junior college is worth one year at a university.
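
Stata's lincom command offers a shortcut for the same calculation: it reports the estimate, standard error, t-statistic, and two-sided p-value of a linear combination of coefficients. The following is a minimal sketch, assuming twoyears8.dta is in the working directory; the scalar name t_diff is only illustrative.

```stata
* Sketch: test H0: beta_jc = beta_univ vs H1: beta_jc < beta_univ
use twoyears8.dta, clear
quietly regress lwage jc univ exper

* estimate and standard error of beta_jc - beta_univ
* (the p-value lincom prints is two-sided)
lincom jc - univ

* lincom leaves the estimate and its standard error in r()
scalar t_diff = r(estimate) / r(se)
display "t statistic              = " t_diff

* left-tail p-value for the one-sided alternative
display "one-sided (left) p-value = " 1 - ttail(e(df_r), t_diff)
```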

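Define a new parameter $\theta_1 = \beta_1 - \beta_2$ and rearrange the model to give:
$$\begin{aligned}
\log(\mathit{wage}) &= \beta_0 + \beta_1 \mathit{jc} - \beta_2 \mathit{jc} + \beta_2 \mathit{jc} + \beta_2 \mathit{univ} + \beta_3 \mathit{exper} + u \\
&= \beta_0 + (\beta_1 - \beta_2) \mathit{jc} + \beta_2 (\mathit{jc} + \mathit{univ}) + \beta_3 \mathit{exper} + u \\
&= \beta_0 + \theta_1 \mathit{jc} + \beta_2 \mathit{totcoll} + \beta_3 \mathit{exper} + u
\end{aligned}$$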

4. Estimate the modified model with OLS.

Answer:

gen totcoll=jc+univ

reg lwage jc totcoll exper

Source | SS df MS Number of obs = 6763
-------------+------------------------------ F( 3, 6759) = 644.53
Model | 357.752575 3 119.250858 Prob > F = 0.0000
Residual | 1250.54352 6759 .185019014 R-squared = 0.2224
-------------+------------------------------ Adj R-squared = 0.2221
Total | 1608.29609 6762 .237843255 Root MSE = .43014

------------------------------------------------------------------------------
lwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
jc | -.0101795 .0069359 -1.47 0.142 -.0237761 .003417
totcoll | .0768762 .0023087 33.30 0.000 .0723504 .0814021
exper | .0049442 .0001575 31.40 0.000 .0046355 .0052529
_cons | 1.472326 .0210602 69.91 0.000 1.431041 1.51361
------------------------------------------------------------------------------

5. Test $H_0: \theta_1 = 0$ against the one-sided alternative $H_1: \theta_1 < 0$.
Answer:
$t = -0.0101795 / 0.0069359 = -1.4676$

One-sided (left) critical value at $\alpha = 5\%$ is -1.65. This value can be obtained with the command disp invttail(6759,0.95), which yields -1.6450791.
Since the absolute t-statistic (1.47) is less than the absolute critical value (1.65), $H_0$ is not rejected at the 5% level. We cannot reject $\theta_1 = 0$, that is, $\beta_1 = \beta_2$.

6. What is the p-value for the test?
Answer:
The one-sided (left) p-value is 0.071. This value can be obtained with the command disp 1-ttail(6759,-1.47) or the command disp ttail(6759,1.47), which yields 0.07080415, or by taking the P>|t| value from the Stata output (0.142) and dividing it by two.
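
The value 0.0708 here differs slightly from the 0.0711 obtained earlier because the t-statistic was rounded to -1.47 before the tail probability was computed. The sketch below, which assumes twoyears8.dta is in the working directory and uses the illustrative scalar name t_theta, shows both versions side by side.

```stata
* Sketch: left-tail p-value for H0: theta_1 = 0 in the rearranged model
use twoyears8.dta, clear
generate totcoll = jc + univ
quietly regress lwage jc totcoll exper

* in this regression the coefficient on jc is theta_1 = beta_1 - beta_2
scalar t_theta = _b[jc] / _se[jc]
display "t statistic             = " t_theta

* p-value from the unrounded t statistic
display "left p-value, unrounded = " 1 - ttail(e(df_r), t_theta)

* p-value from the t statistic rounded to -1.47, as in the answer above
display "left p-value, t = -1.47 = " 1 - ttail(6759, -1.47)
```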

Section 4.5 Testing multiple linear restrictions: The F test
Data: mlb1s8.dta
1. Check the data.
2. Estimate the regression model that explains major league baseball players' salaries:
$$\log(\mathit{salary}_i) = \beta_0 + \beta_1 \mathit{years}_i + \beta_2 \mathit{gamesyr}_i + \beta_3 \mathit{bavg}_i + \beta_4 \mathit{hrunsyr}_i + \beta_5 \mathit{rbisyr}_i + u_i$$
where salary is the 1993 total salary, years is years in the league, gamesyr is average games played per year, bavg is career batting average, hrunsyr is home runs per year, and rbisyr is runs batted in per year.
3. Test whether bavg, hrunsyr, and rbisyr are jointly statistically insignificant, using the SSR form of the F-test.
4. Test whether bavg, hrunsyr, and rbisyr are jointly statistically insignificant, using the R-squared form of the F-test.
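
One possible way to carry out steps 3 and 4 is sketched below. It assumes that mlb1s8.dta contains the variables salary, years, gamesyr, bavg, hrunsyr, and rbisyr with no missing values, so that both regressions use the same sample; log_salary and the scalar names are hypothetical. The two formulas are the standard SSR and R-squared forms of the F statistic for q exclusion restrictions.

```stata
* Sketch: F test that bavg, hrunsyr and rbisyr are jointly insignificant
use mlb1s8.dta, clear
generate log_salary = log(salary)

* unrestricted model
quietly regress log_salary years gamesyr bavg hrunsyr rbisyr
scalar ssr_ur = e(rss)
scalar r2_ur  = e(r2)
scalar df_ur  = e(df_r)

* restricted model: drop the three variables being tested
quietly regress log_salary years gamesyr
scalar ssr_r = e(rss)
scalar r2_r  = e(r2)

* q = number of restrictions
scalar q = 3
scalar F_ssr = ((ssr_r - ssr_ur) / q) / (ssr_ur / df_ur)
scalar F_r2  = ((r2_ur - r2_r) / q) / ((1 - r2_ur) / df_ur)
display "F (SSR form)       = " F_ssr
display "F (R-squared form) = " F_r2
display "p-value            = " Ftail(q, df_ur, F_ssr)

* built-in joint test for comparison
quietly regress log_salary years gamesyr bavg hrunsyr rbisyr
test bavg hrunsyr rbisyr
```

The two forms should give the same F statistic, since both regressions share the same dependent variable and therefore the same total sum of squares.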

Example 5.3 Testing multiple linear restrictions: The LM test
Data: crime1.dta
1. Check the data (number of observations and number of variables).
Answer: the number of observations is 2725; the number of variables is 16.

2. We will conduct the LM test using the crime model below
$$\mathit{narr86} = \beta_0 + \beta_1 \mathit{pcnv} + \beta_2 \mathit{avgsen} + \beta_3 \mathit{tottime} + \beta_4 \mathit{ptime86} + \beta_5 \mathit{qemp86} + u$$
Estimate the model using OLS.
Answer:
reg narr86 pcnv avgsen tottime ptime86 qemp86

Source | SS df MS Number of obs = 2725
-------------+------------------------------ F( 5, 2719) = 24.29
Model | 85.9532425 5 17.1906485 Prob > F = 0.0000
Residual | 1924.39391 2719 .707757967 R-squared = 0.0428
-------------+------------------------------ Adj R-squared = 0.0410
Total | 2010.34716 2724 .738012906 Root MSE = .84128

------------------------------------------------------------------------------
narr86 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
pcnv | -.1512246 .040855 -3.70 0.000 -.2313346 -.0711145
avgsen | -.0070487 .0124122 -0.57 0.570 -.031387 .0172897
tottime | .0120953 .0095768 1.26 0.207 -.0066833 .030874
ptime86 | -.0392585 .0089166 -4.40 0.000 -.0567425 -.0217745
qemp86 | -.1030909 .0103972 -9.92 0.000 -.1234782 -.0827037
_cons | .7060607 .0331524 21.30 0.000 .6410542 .7710671
------------------------------------------------------------------------------

3. Use the LM statistic to test the null hypothesis that avgsen and tottime have no effect on narr86
once the other factors have been controlled for.

Step 1. Estimate the restricted model

reg narr86 pcnv ptime86 qemp86

Source | SS df MS Number of obs = 2725
-------------+------------------------------ F( 3, 2721) = 39.10
Model | 83.0741941 3 27.691398 Prob > F = 0.0000
Residual | 1927.27296 2721 .708295833 R-squared = 0.0413
-------------+------------------------------ Adj R-squared = 0.0403
Total | 2010.34716 2724 .738012906 Root MSE = .8416

------------------------------------------------------------------------------
narr86 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
pcnv | -.1499274 .0408653 -3.67 0.000 -.2300576 -.0697973
ptime86 | -.0344199 .008591 -4.01 0.000 -.0512655 -.0175744
qemp86 | -.104113 .0103877 -10.02 0.000 -.1244816 -.0837445
_cons | .7117715 .0330066 21.56 0.000 .647051 .776492
------------------------------------------------------------------------------

Step 2. Obtain the residuals $\tilde{u}$ from the regression.
predict ures, resid

Step 3. Run the regression of $\tilde{u}$ on pcnv, ptime86, qemp86, avgsen, and tottime

reg ures pcnv avgsen tottime ptime86 qemp86

Source | SS df MS Number of obs = 2725
-------------+------------------------------ F( 5, 2719) = 0.81
Model | 2.87904835 5 .575809669 Prob > F = 0.5398
Residual | 1924.39392 2719 .707757969 R-squared = 0.0015
-------------+------------------------------ Adj R-squared = -0.0003
Total | 1927.27297 2724 .707515773 Root MSE = .84128

------------------------------------------------------------------------------
ures | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
pcnv | -.0012971 .040855 -0.03 0.975 -.0814072 .0788129
avgsen | -.0070487 .0124122 -0.57 0.570 -.031387 .0172897
tottime | .0120953 .0095768 1.26 0.207 -.0066833 .030874
ptime86 | -.0048386 .0089166 -0.54 0.587 -.0223226 .0126454
qemp86 | .0010221 .0103972 0.10 0.922 -.0193652 .0214093
_cons | -.0057108 .0331524 -0.17 0.863 -.0707173 .0592956
------------------------------------------------------------------------------

Step 4. Calculate the LM statistic

. scalar lm = e(N)*e(r2)
. disp lm
4.0707294

Step 5a. Calculate the p-value

. disp chi2tail(2,lm)
.13063283

Step 5b. Calculate the critical value at $\alpha = 10\%$

. disp invchi2tail(2,0.1)
4.6051702

Step 6. Conclude

Since the critical value (4.605) is greater than the LM statistic (4.0707), we fail to reject the null hypothesis that $\beta_2 = 0$ and $\beta_3 = 0$ at the 10% level.
The p-value is $P(\chi^2_2 > 4.0707294) = 0.1306$, so we would reject the null at the 15% level.
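
The LM statistic in Step 4 can be checked by hand from the sums of squares of the auxiliary regression in Step 3, where $R^2_{\tilde u}$ is the model sum of squares divided by the total sum of squares. The subscript notation is introduced here only for this check.

```latex
\begin{aligned}
R^2_{\tilde u} &= \frac{2.87904835}{1927.27297} \approx 0.0014938 \\
LM &= n \cdot R^2_{\tilde u} = 2725 \times 0.0014938 \approx 4.07 \\
LM &\approx 4.07 < 4.61 \approx \chi^2_{2,\,0.10}
  \quad\Rightarrow\quad \text{fail to reject } H_0 \text{ at the 10\% level}
\end{aligned}
```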


Problems 3.4 Heteroskedasticity
Data: sleep75.dta
1. Check the data
2. The model below
$$\mathit{sleep} = \beta_0 + \beta_1 \mathit{totwrk} + \beta_2 \mathit{educ} + \beta_3 \mathit{age} + u$$
can be used to study the tradeoff between time spent sleeping and working and to look at other factors affecting sleep. If adults trade off sleep for work, what is the sign of $\beta_1$? What signs do you think $\beta_2$ and $\beta_3$ will have?
3. Estimate the model.
4. If someone works five more hours per week, by how many minutes is sleep predicted to fall? Is
this a large tradeoff?
5. Discuss the sign and magnitude of the estimated coefficient on educ.
6. Would you say totwrk, educ, and age explain much of the variation in sleep? What other factors might affect the time spent sleeping? Are these likely to be correlated with totwrk? (Note: F stat.)
7. Explain intuitively the procedures of the Breusch-Pagan and White tests for the presence of heteroskedastic errors. Compare the two approaches. (Note: with Breusch-Pagan.)
8. Conduct the Breusch-Pagan test for heteroskedasticity for the error term in the equation above
and explain whether you think the error u is heteroskedastic.
9. Conduct the White test for heteroskedasticity for the error term in the equation above and explain
whether you think the error u is heteroskedastic.
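
One possible way to set up the two tests in Stata is sketched below. It assumes that sleep75.dta contains the variables sleep, totwrk, educ, and age; the names uhat, uhat2, and bp_lm are hypothetical. The manual version follows the same n times R-squared logic as the LM test in Example 5.3, applied to the squared residuals.

```stata
* Sketch: Breusch-Pagan and White tests for heteroskedasticity
use sleep75.dta, clear
quietly regress sleep totwrk educ age

* manual Breusch-Pagan: regress squared residuals on the regressors
predict uhat, residuals
generate uhat2 = uhat^2
quietly regress uhat2 totwrk educ age

* LM = n * R-squared, chi-squared with df = number of regressors
scalar bp_lm = e(N) * e(r2)
display "BP LM statistic = " bp_lm
display "p-value         = " chi2tail(e(df_m), bp_lm)

* built-in versions, run after re-estimating the original model
quietly regress sleep totwrk educ age
estat hettest totwrk educ age, iid
estat imtest, white
```

The iid option requests the n times R-squared version of the Breusch-Pagan test, so its statistic should match the manual calculation.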