
Econ 201 PS6 Suggested Solutions

Misung Ahn

May 26, 2012

(1) Gasoline Demand and the 1973 Embargo


(a) From the STATA results below,

$$\log(G_t) = -9.8990 - 0.1604 \log(P_t) + 1.6917 \log(Y_t) + U_t$$

Table 1: Estimation results: regress

Variable     Coefficient   (Std. Err.)
lnP            -0.160        (0.022)
lnY             1.692        (0.070)
Intercept      -9.899        (0.626)

N              27
$R^2$          0.981
F(2,24)        616.39

(b) Notice that the dependent variable and the independent variables are all transformed by the logarithm function. Mathematically,

$$\beta_1 = \frac{\partial \log(G_t)}{\partial \log(P_t)} \quad \text{and} \quad \beta_2 = \frac{\partial \log(G_t)}{\partial \log(Y_t)}.$$

Therefore, $\beta_1$ measures the percentage change in gas consumption with respect to a percentage change in the gasoline price; that is, the price elasticity of gasoline demand. On the other hand, $\beta_2$ measures the percentage change in gas consumption with respect to a percentage change in income; that is, the income elasticity of gasoline demand.
(c)

$$H_0: \beta_1 = 0 \quad \text{vs} \quad H_1: \beta_1 < 0$$

$$t \text{ statistic} = \frac{\hat{\beta}_1 - 0}{SE(\hat{\beta}_1)} = \frac{-0.1604 - 0}{0.0220} = -7.30 \sim t_{df = 27 - 3 = 24},$$

which gives the p-value 0.000. Since the critical value of the one-sided test at the 5% level for $t_{24}$ is 1.711 and since $|-7.30| > 1.711$, we may reject the null hypothesis at the 5% level and conclude that the gasoline demand responds to the gasoline price change.

Note: Please let me know if you find any error or typo. mahn@uchicago.edu.


(d)

$$H_0: \beta_2 = 1 \quad \text{vs} \quad H_1: \beta_2 \neq 1$$

$$t \text{ statistic} = \frac{\hat{\beta}_2 - 1}{SE(\hat{\beta}_2)} = \frac{1.6917 - 1}{0.0703} = \frac{0.6917}{0.0703} = 9.839 \sim t_{df = 27 - 3 = 24}$$

This is a two-sided test, which gives the critical value of 2.064 for $t_{df=24}$ at the 5% level. Since $|9.839| > 2.064$, we may reject the null hypothesis and conclude that the gasoline demand is not unitary income elastic.
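As a quick arithmetic check of the t statistics in (c) and (d), the following minimal sketch recomputes them from the rounded estimates and standard errors quoted above. The function and variable names are illustrative and are not part of the STATA output; because the inputs are rounded, the statistic for (c) comes out near -7.29 rather than exactly -7.30.

```python
# Estimates and standard errors as quoted in parts (c) and (d).
b1, se1 = -0.1604, 0.0220   # price elasticity estimate and its standard error
b2, se2 = 1.6917, 0.0703    # income elasticity estimate and its standard error


def t_stat(estimate, hypothesized, std_err):
    """t statistic for H0: coefficient equals the hypothesized value."""
    return (estimate - hypothesized) / std_err


t_c = t_stat(b1, 0.0, se1)   # part (c): H0 is beta_1 = 0
t_d = t_stat(b2, 1.0, se2)   # part (d): H0 is beta_2 = 1

# 5% critical values for t with 24 degrees of freedom, as quoted in the text.
crit_one_sided, crit_two_sided = 1.711, 2.064

# Compare |t| with the critical value, as the text does.
reject_c = abs(t_c) > crit_one_sided
reject_d = abs(t_d) > crit_two_sided

print(round(t_c, 2), reject_c, round(t_d, 3), reject_d)
```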
(e) From the following STATA results,

$$\ln(G_t) = -5.4403 - 0.1427 \ln(P_t) + 1.2048 \ln(Y_t) + 0.7775\, PE_t + 1.0807\, PE_t \ln(P_t) - 0.1010\, PE_t \ln(Y_t) + U_t$$

Table 2: Estimation results: regress

Variable     Coefficient   (Std. Err.)
lnP            -0.143        (0.025)
lnY             1.205        (0.171)
PE              0.778        (2.462)
PElnP           1.081        (0.408)
PElnY          -0.101        (0.274)
Intercept      -5.440        (1.549)

N              27
$R^2$          0.991
F(5,21)        444.968

(f) F Test

If the gasoline demand was unaffected by the Arab Oil Embargo of 1973, the dummy variable $PE_t$ and its interaction terms do not influence $\log(G_t)$. This is the case when $\beta_3 = \beta_4 = \beta_5 = 0$. This is equivalent to considering a structural break before and after 1973.

$$\text{before 1973}: \log(G_t) = (\beta_0 + \beta_3) + (\beta_1 + \beta_4)\log(P_t) + (\beta_2 + \beta_5)\log(Y_t) + U_t \tag{1}$$

$$\text{after 1973}: \log(G_t) = \beta_0 + \beta_1 \log(P_t) + \beta_2 \log(Y_t) + V_t \tag{2}$$

Notice that only when $\beta_3 = \beta_4 = \beta_5 = 0$ do we have equivalent regression equations. Therefore we conduct the following hypothesis test:

$$H_0: \beta_3 = \beta_4 = \beta_5 = 0 \quad \text{vs} \quad H_1: \text{not } H_0$$

The alternative hypothesis of not $H_0$ includes all cases: one of the $\beta_i$ for $i \in \{3, 4, 5\}$ is not 0; all are nonzero but equal; all are nonzero and unequal; and so on. In order to conduct the F test, we consider two regression models:

$$\text{Unrestricted}: \log(G_t) = \beta_0 + \beta_1 \log(P_t) + \beta_2 \log(Y_t) + \beta_3 PE_t + \beta_4 PE_t \log(P_t) + \beta_5 PE_t \log(Y_t) + U_t$$

$$\text{Restricted}: \log(G_t) = \beta_0 + \beta_1 \log(P_t) + \beta_2 \log(Y_t) + V_t$$

For the restricted model, we use Table 1. There are three restrictions ($q = 3$) in this test. As we learned in class, this counts the number of equal signs, not the number of coefficients, in $H_0$.

SOLUTION 1 USING SSR:

Although the SSRs are not shown in Tables 1 and 2, I used the SSRs from the STATA output.

$$\text{test statistic} = \frac{(SSR_R - SSR_u)/3}{SSR_u/(27 - 6)} = \frac{(0.0266 - 0.0130)/3}{0.0130/21} = 7.3231 \sim F_{(3,21)}$$

SOLUTION 2 USING $R^2$:

$$\text{test statistic} = \frac{(R_u^2 - R_R^2)/3}{(1 - R_u^2)/(27 - 6)} = \frac{(0.9906 - 0.9809)/3}{(1 - 0.9906)/21} = 7.2234 \sim F_{(3,21)}$$

The critical value at the 5% level for $F_{(3,21)}$ is 3.072. Therefore, we may reject the null and conclude that there is a structural break before and after 1973.
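The two versions of the F statistic can be reproduced with a few lines of arithmetic. This is only a sketch using the SSR and $R^2$ values quoted above; the helper names `f_from_ssr` and `f_from_r2` are made up for illustration.

```python
def f_from_ssr(ssr_r, ssr_u, q, n, k):
    """F statistic from restricted and unrestricted sums of squared residuals."""
    return ((ssr_r - ssr_u) / q) / (ssr_u / (n - k))


def f_from_r2(r2_u, r2_r, q, n, k):
    """Equivalent F statistic written in terms of the two R-squared values."""
    return ((r2_u - r2_r) / q) / ((1.0 - r2_u) / (n - k))


n, k, q = 27, 6, 3   # observations, unrestricted coefficients, restrictions

f_ssr = f_from_ssr(0.0266, 0.0130, q, n, k)
f_r2 = f_from_r2(0.9906, 0.9809, q, n, k)

crit_f_3_21 = 3.072  # 5% critical value for F(3,21), as quoted in the text
print(round(f_ssr, 4), round(f_r2, 4), f_ssr > crit_f_3_21, f_r2 > crit_f_3_21)
```

The two numbers differ slightly only because the SSR and $R^2$ inputs are rounded to four decimals.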


(g) Lagrange multiplier Test

To run the auxiliary regression, we compute $\hat{V}_t$, the estimated residual of the restricted equation. This is obtained by the STATA command predict resid, residuals following reg lnG lnP lnY. Then we run the auxiliary regression reg resid lnP lnY PE PElnP PElnY and get $R_u^2$, which is 0.5103 in this problem.

Table 3: Estimation results: regress

Variable     Coefficient   (Std. Err.)
lnP             0.018        (0.025)
lnY            -0.487        (0.171)
PE              0.778        (2.462)
PElnP           1.081        (0.408)
PElnY          -0.101        (0.274)
Intercept       4.459        (1.549)

N              27
$R^2$          0.51
F(5,21)        4.378

$$\text{test statistic} = N R_u^2 = 27 \times 0.5103 = 13.7781 \sim \chi^2_{df=3}$$

The critical value at the 5% level for $\chi^2_{df=3}$ is 7.815, which is less than our test statistic of 13.7781. Therefore we reject the null hypothesis of $\beta_3 = \beta_4 = \beta_5 = 0$.
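The LM statistic and an approximate p-value can be checked as follows. This is a sketch, not the STATA procedure itself: the p-value uses the closed-form upper tail of a chi-square distribution with 3 degrees of freedom, and the function name `chi2_sf_df3` is an illustrative choice.

```python
import math


def chi2_sf_df3(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 3 df."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))


n_obs = 27        # sample size
r2_aux = 0.5103   # R-squared of the auxiliary regression (Table 3)

lm_stat = n_obs * r2_aux
p_value = chi2_sf_df3(lm_stat)
reject = lm_stat > 7.815   # 5% critical value quoted in the text

print(round(lm_stat, 4), p_value, reject)
```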

(h) Before the embargo, the estimated price elasticity of demand is $\hat{\beta}_1 + \hat{\beta}_4 = -0.1427 + 1.0807 = 0.9380$, as shown in equation (1). After the embargo, it is $\hat{\beta}_1 = -0.1427$, as in equation (2).

On the other hand, the estimated income elasticity of demand before the embargo is $\hat{\beta}_2 + \hat{\beta}_5 = 1.2048 - 0.1010 = 1.1038$, as in equation (1), while it is $\hat{\beta}_2 = 1.2048$ after the embargo, as in equation (2).
(i) Notice that the price elasticity of gasoline demand used to be 0.9380, which is, surprisingly, positive, and that it then changed to -0.1427. To see whether this is a statistically significant change, we test

$$H_0: \beta_4 = 0 \quad \text{vs} \quad H_1: \beta_4 \neq 0.$$

From the STATA results corresponding to Table 2, we see that the t value is 2.65 with a p-value of 0.015. Thus, we reject the null at the 5% level. Prior to the Embargo, the price elasticity was positive. Considering that the price elasticity is negative for almost all goods except Giffen goods, one explanation for this unintuitive result is a shift in demand. Before the Embargo, when the gasoline price was low and stable, people could afford large cars, which demand more gasoline. Hence, the estimated price elasticity is positive; that is, gasoline demand increased as the price increased. After the Embargo, people would substitute smaller cars for their larger cars, and this gives the usual negative price elasticity, where a higher gasoline price yields lower gasoline demand. Thus, we can say that the demand became more sensitive to changes in price.

On the other hand, the income elasticity became higher in (h). It remains positive; the more income one has, the more gasoline one demands. Thus, gasoline demand became more sensitive to income. The analogous argument with car sizes goes through for the income elasticity.

(2) F-test

The unrestricted regression model:

$$Y_n = \beta_0 + \beta_1 X_{n1} + \beta_2 X_{n2} + \beta_3 X_{n3} + \beta_4 X_{n4} + U_n, \quad n = 1, \ldots, N$$

(a) $H_0: \beta_0 + c\beta_1 = 0$

The restricted equation after imposing $\beta_0 = -c\beta_1$ is

$$Y_n = \beta_1 (X_{n1} - c) + \beta_2 X_{n2} + \beta_3 X_{n3} + \beta_4 X_{n4} + V_n.$$

So we will use $X_{n1} - c$ instead of $X_{n1}$ as the variable for the coefficient $\beta_1$. Since the number of restrictions is the number of equal signs in the hypothesis, we know there is only 1 restriction. Then, the relevant F-distribution is $F_{(1, N-5)}$.

(b) $H_0: \beta_2 = \beta_3 = \beta_4$

The restricted equation after imposing $\beta_2 = \beta_3 = \beta_4$ is

$$Y_n = \beta_0 + \beta_1 X_{n1} + \beta_2 (X_{n2} + X_{n3} + X_{n4}) + W_n.$$

The number of restrictions is 2 and the relevant F-distribution is $F_{(2, N-5)}$.

(c) $H_0: \beta_1 + \beta_2 + d\beta_3 + e\beta_4 = 1$

The restricted equation after plugging in $\beta_1 = 1 - \beta_2 - d\beta_3 - e\beta_4$ and arranging terms is

$$Y_n - X_{n1} = \beta_0 + \beta_2 (X_{n2} - X_{n1}) + \beta_3 (X_{n3} - dX_{n1}) + \beta_4 (X_{n4} - eX_{n1}) + Z_n.$$

We have one restriction and the relevant F-distribution is $F_{(1, N-5)}$.

(4) Linear Prob Model for Smoking Bans

(a)
(i) $p$ = the probability of smoking for all workers = 0.2423
(ii) $q$ = the probability of smoking for workers affected by workplace smoking bans = 0.2120
(iii) $r$ = the probability of smoking for workers not affected by workplace smoking bans = 0.2896

Table 4: Estimated Probability of Smoking

smkban    mean
0         0.2896
1         0.2120
Total     0.2423

Source: Smoking.dta

(b) The estimated difference in the probability of smoking between workers affected by a ban and those not affected is

$$q - r = 0.2120 - 0.2896 = -0.0776.$$

One linear probability model is

$$Y_n = \beta_0 + \beta_1 X_n + U_n$$

where $Y_n$ is the smoking indicator (so that its conditional mean is the probability of smoking) and $X_n$ is the dummy variable which is 1 if there is a work area smoking ban and 0 otherwise. To account for heteroskedasticity, we use the option robust and obtain the following STATA results.

Table 5: Estimation results: regress

Variable     Coefficient   (Std. Err.)
smkban         -0.078        (0.009)
Intercept       0.290        (0.007)

N              10000
$R^2$          0.008
F(1,9998)      75.061

To test the statistical significance of the probability difference, we test

$$H_0: \beta_1 = 0 \quad \text{vs} \quad H_1: \beta_1 \neq 0.$$

$$t \text{ statistic} = \frac{-0.0776 - 0}{0.0090} = -8.6222 \sim t_{df=9998}$$

With this large number of observations, the t-distribution is well approximated by the normal distribution, so that the p-value is 0.000. Hence, we may reject the null hypothesis that $\beta_1 = 0$ and conclude that the workplace smoking ban affects the probability of smoking.
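With a single dummy regressor, the LPM slope equals the difference in group means and the intercept equals the mean of the no-ban group. The sketch below checks this and the t statistic using only the numbers quoted in (a) and (b); the variable names are illustrative.

```python
# Group means from Table 4.
mean_ban = 0.2120      # smokers' share among workers with a ban (q)
mean_no_ban = 0.2896   # smokers' share among workers without a ban (r)

# LPM with one dummy: the slope is the difference in group means.
slope = mean_ban - mean_no_ban
intercept = mean_no_ban

# Fitted probability for the ban group reproduces its sample mean.
fitted_ban = intercept + slope * 1

se_slope = 0.0090      # robust standard error quoted in the text
t_value = slope / se_slope

print(round(slope, 4), round(fitted_ban, 4), round(t_value, 4))
```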


(c) We posit the following population regression model.

$$\begin{aligned} smoker_n = {} & \beta_0 + \beta_1 smkban_n + \beta_2 female_n + \beta_3 age_n + \beta_4 age_n^2 \\ & + \beta_5 hsdropout_n + \beta_6 hsgrad_n + \beta_7 colsome_n + \beta_8 colgrad_n \\ & + \beta_9 black_n + \beta_{10} hispanic_n + U_n \end{aligned}$$

The corresponding estimated sample equation is

$$\begin{aligned} smoker_n = {} & -0.01411 - 0.0472\, smkban_n - 0.0333\, female_n + 0.0097\, age_n - 0.0001\, age_n^2 \\ & + 0.3227\, hsdropout_n + 0.2327\, hsgrad_n + 0.1643\, colsome_n + 0.0448\, colgrad_n \\ & - 0.0276\, black_n - 0.1048\, hispanic_n + U_n \end{aligned} \tag{3}$$

The coefficient on smkban changes from -0.0776 to -0.0472. That is, the impact of the workplace smoking ban on the probability of smoking decreases in absolute magnitude. Why does it decrease? Mechanically, as more independent variables enter the regression equation, the explanatory power of any one variable is likely to decrease, since the newly added variables also explain the dependent variable. Also, the model in (b) suffers from omitted variable bias. That is, smkban may be correlated with the education/race/gender indicators or with age. For example, workers with a college degree are more likely to work in an office with a smoking ban than high-school dropouts, and college graduates are less likely to smoke than high-school dropouts.

[Figure 1: Linear probability model of smoking (STATA output)]
(d)

$$H_0: \beta_1 = 0 \quad \text{vs} \quad H_1: \beta_1 \neq 0$$

$$\text{test statistic} = \frac{-0.0472 - 0}{0.0090} = -5.2444 \sim t_{df=9989}$$

With this large number of observations, the t-distribution is well approximated by the normal distribution, so that the p-value is 0.000. Hence, we may reject the null hypothesis that $\beta_1 = 0$ and conclude that the workplace smoking ban affects the probability of smoking. The 95% confidence interval can be read off the STATA output in Figure 1 and is (-0.0648, -0.0297).
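The t statistic and an approximate 95% confidence interval can be recomputed by hand. This sketch assumes the normal critical value 1.96, which is reasonable with 9989 degrees of freedom, and uses the rounded coefficient and standard error from the text, so the upper bound may differ from the STATA output in the fourth decimal.

```python
coef = -0.0472     # coefficient on smkban in equation (3)
std_err = 0.0090   # its standard error as quoted in the text
z_crit = 1.96      # assumption: normal approximation to the t distribution

t_value = coef / std_err
ci_low = coef - z_crit * std_err
ci_high = coef + z_crit * std_err

print(round(t_value, 4), round(ci_low, 4), round(ci_high, 4))
```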
(e)

$$H_0: \beta_5 = \beta_6 = \beta_7 = \beta_8 = 0 \quad \text{vs} \quad H_1: \text{not } H_0.$$

We use the STATA command test. That is, test (hsdrop=0) (hsgrad=0) (colsome=0) (colgrad=0). The resulting F-statistic is 140.09 with p-value 0.0000; hence, we reject the null hypothesis.

The baseline education level is a Master's degree or higher. Notice that $\hat{\beta}_5 > \hat{\beta}_6 > \hat{\beta}_7 > \hat{\beta}_8 > 0$. Therefore, as the education level rises, we observe a lower probability of smoking.
(f) The coefficient on $age^2$ is statistically significant. This suggests a nonlinear relationship between age and the probability of smoking. The figure below shows the estimated probability for a white, non-Hispanic male college graduate with no workplace smoking ban.

[Figure 2: Nonlinear Relationship between age and Probability of Smoking. Vertical axis: fitted values; horizontal axis: age in years.]

(5) Probit Model for Smoking Bans


(a) From the STATA results below, we have the following probit model.

$$\begin{aligned} Y_n = {} & -1.7349 - 0.1586\, smkban_n - 0.1117\, female_n + 0.0345\, age_n - 0.0005\, age_n^2 \\ & + 1.1416\, hsdrop_n + 0.8827\, hsgrad_n + 0.6771\, colsome_n + 0.2347\, colgrad_n \\ & - 0.0843\, black_n - 0.3383\, hispanic_n + U_n \end{aligned} \tag{4}$$

so that $\Pr(Y_n = 1 | X_n) = \Phi(X_n \beta)$ for the normal CDF $\Phi(\cdot)$. Denote the coefficients in order so that $\beta_0$ corresponds to the constant; $\beta_1$ to the one for smkban, and so forth.

Table 6: Estimation results: probit

Variable         Coefficient   (Std. Err.)
smkban             -0.159        (0.029)
female             -0.112        (0.029)
age                 0.035        (0.007)
age2                0.000        (0.000)
hsdrop              1.142        (0.072)
hsgrad              0.883        (0.060)
colsome             0.677        (0.061)
colgrad             0.235        (0.065)
black              -0.084        (0.053)
hispanic           -0.338        (0.048)
Intercept          -1.735        (0.153)

N                  10000
Log-likelihood     -5235.868
$\chi^2_{(10)}$    602.597

(b)

$$H_0: \beta_1 = 0 \quad \text{vs} \quad H_1: \beta_1 \neq 0$$

$$t \text{ statistic} = \frac{-0.15860}{0.0290} = -5.4690 \sim t_{df=9989}$$

where the degrees of freedom are calculated as 9989 = # of obs - # of coefficients = 10000 - 11.

Assuming the t-distribution is well approximated by the standard normal distribution for this large sample, at the 5% level we may reject the null because the p-value is 0.000. The conclusion is the same as in part (d) of Question 4, which gives the t-statistic of -5.2444.
(c)

$$H_0: \beta_5 = \beta_6 = \beta_7 = \beta_8 = 0 \quad \text{vs} \quad H_1: \text{not } H_0$$

The test statistic for the likelihood ratio is

$$\text{test statistic} = 2(l_u - l_R) = 2((-5235.8679) - (-5484.2301)) = 496.7244 \sim \chi^2_{df=4}$$

The critical value for $\chi^2_{df=4}$ at the 5% level is 9.488 < 496.7244 and thus we may reject the null and conclude that education affects the probability of smoking. Notice that df = q = # of restrictions = 4. Just like Question 4 part (e), this yields the identical results of rejecting the null and $\hat{\beta}_5 > \hat{\beta}_6 > \hat{\beta}_7 > \hat{\beta}_8 > 0$. Since the baseline education level is a Master's degree or higher, we observe a lower probability of smoking as the education level rises.

Table 7: Estimation results: probit

Variable         Coefficient   (Std. Err.)
smkban             -0.241        (0.028)
female             -0.059        (0.028)
age                 0.014        (0.007)
age2                0.000        (0.000)
black               0.001        (0.052)
hispanic           -0.120        (0.045)
Intercept          -0.708        (0.134)

N                  10000
Log-likelihood     -5484.230
$\chi^2_{(6)}$     105.872

(d)

$$H_0: \beta_3 = \beta_4 = 0 \quad \text{vs} \quad H_1: \text{not } H_0$$

The test statistic for the likelihood ratio is

$$\text{test statistic} = 2(l_u - l_R) = 2((-5235.8679) - (-5258.7768)) = 45.8178 \sim \chi^2_{df=2}$$

The critical value for $\chi^2_{df=2}$ at the 5% level is 5.991 < 45.8178 and thus we may reject the null and conclude that age matters in determining the probability of smoking. Here df = 2.
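Both likelihood ratio statistics follow directly from the reported log-likelihoods. The sketch below recomputes them and, as an added check, evaluates the chi-square upper tail in closed form, which is available for even degrees of freedom. The function names are illustrative.

```python
import math

ll_unrestricted = -5235.8679   # full probit (Table 6)
ll_no_education = -5484.2301   # education dummies dropped (Table 7)
ll_no_age = -5258.7768         # age and age2 dropped (Table 8)


def lr_stat(ll_u, ll_r):
    """Likelihood ratio statistic 2 * (l_u - l_R)."""
    return 2.0 * (ll_u - ll_r)


def chi2_sf_df2(x):
    """P(X > x) for a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2.0)


def chi2_sf_df4(x):
    """P(X > x) for a chi-square variable with 4 degrees of freedom."""
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)


lr_education = lr_stat(ll_unrestricted, ll_no_education)   # part (c), df = 4
lr_age = lr_stat(ll_unrestricted, ll_no_age)               # part (d), df = 2

print(round(lr_education, 4), round(lr_age, 4))
print(chi2_sf_df4(lr_education), chi2_sf_df2(lr_age))
```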

Table 8: Estimation results: probit

Variable         Coefficient   (Std. Err.)
smkban             -0.156        (0.029)
female             -0.114        (0.029)
hsdrop              1.093        (0.071)
hsgrad              0.863        (0.059)
colsome             0.669        (0.060)
colgrad             0.240        (0.065)
black              -0.078        (0.053)
hispanic           -0.313        (0.047)
Intercept          -1.157        (0.058)

N                  10000
Log-likelihood     -5258.777
$\chi^2_{(8)}$     556.779

(e) Fitted value in this probit model for Aloysius. To calculate the probabilities, take the estimation results from the probit model to calculate $\hat{Y}_n$, and calculate the cumulative standard normal distribution at $\hat{Y}_n$, i.e., $Prob(smoke) = \Phi(\hat{Y}_n)$.

Using equation (4), denote the estimated coefficients as the vector

$$\hat{\beta}^T = [-0.15863 \;\; -0.1117313 \;\; 0.0345114 \;\; -0.0004675 \;\; 1.14161 \;\; 0.8826708 \;\; 0.6771192 \;\; 0.2346839 \;\; -0.0842789 \;\; -0.3382743 \;\; -1.734926]$$

Aloysius' covariates are $X_{A,No} = [0 \; 0 \; 20 \; 400 \; 1 \; 0 \; 0 \; 0 \; 0 \; 0 \; 1]$ without the ban. Then $X_{A,No}\hat{\beta} = -0.090088$ and hence prob(Aloysius is a smoker | without ban) $= \Phi(-0.090088) = 0.4641$. With the ban, $X_{A,Yes} = [1 \; 0 \; 20 \; 400 \; 1 \; 0 \; 0 \; 0 \; 0 \; 0 \; 1]$ and $X_{A,Yes}\hat{\beta} = -0.248718$, and hence prob(Aloysius is a smoker | with ban) $= \Phi(-0.248718) = 0.4013$.

The marginal effect of the smoking ban for Aloysius is the lowering of the probability of smoking by 0.4641 - 0.4013 = 0.0628.
(f) Fitted value for Brenda. Brenda's covariates without the ban are $X_{B,No} = [0 \; 1 \; 40 \; 1600 \; 0 \; 0 \; 0 \; 1 \; 1 \; 0 \; 1]$, so that $X_{B,No}\hat{\beta} = -1.0637963$ and hence prob(Brenda is a smoker | without ban) $= \Phi(-1.0637963) = 0.1446$. With the ban, $X_{B,Yes} = [1 \; 1 \; 40 \; 1600 \; 0 \; 0 \; 0 \; 1 \; 1 \; 0 \; 1]$ and $X_{B,Yes}\hat{\beta} = -1.2224263$, and hence prob(Brenda is a smoker | with ban) $= \Phi(-1.2224263) = 0.1112$.

The marginal effect of the smoking ban for Brenda is the lowering of the probability of smoking by 0.1446 - 0.1112 = 0.0334.
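The probit fitted probabilities in (e) and (f) can be reproduced with the coefficient vector and covariate vectors given above. In this sketch the coefficient signs are taken from equation (4), and the standard normal CDF is evaluated exactly through `math.erf`. The probabilities in the text appear to come from a normal table with the index rounded to two decimals, so the exact values below can differ from them in the third or fourth decimal.

```python
import math

# Coefficient order: smkban, female, age, age2, hsdrop, hsgrad, colsome,
# colgrad, black, hispanic, constant (signs as in equation (4)).
beta = [-0.15863, -0.1117313, 0.0345114, -0.0004675, 1.14161, 0.8826708,
        0.6771192, 0.2346839, -0.0842789, -0.3382743, -1.734926]


def norm_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_index(x):
    """Linear index X * beta for one covariate vector."""
    return sum(b * v for b, v in zip(beta, x))


aloysius_no = [0, 0, 20, 400, 1, 0, 0, 0, 0, 0, 1]
aloysius_yes = [1, 0, 20, 400, 1, 0, 0, 0, 0, 0, 1]
brenda_no = [0, 1, 40, 1600, 0, 0, 0, 1, 1, 0, 1]
brenda_yes = [1, 1, 40, 1600, 0, 0, 0, 1, 1, 0, 1]

p_a_no = norm_cdf(probit_index(aloysius_no))
p_a_yes = norm_cdf(probit_index(aloysius_yes))
p_b_no = norm_cdf(probit_index(brenda_no))
p_b_yes = norm_cdf(probit_index(brenda_yes))

print(round(p_a_no, 4), round(p_a_yes, 4), round(p_a_no - p_a_yes, 4))
print(round(p_b_no, 4), round(p_b_yes, 4), round(p_b_no - p_b_yes, 4))
```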
(g)

REPEAT (E) FOR THE LINEAR PROB MODEL:

Using equation (3),

$$\hat{\beta}^T = [-0.0472399 \;\; -0.0332569 \;\; 0.0096744 \;\; -0.0001318 \;\; 0.3227142 \;\; 0.2327012 \;\; 0.1642968 \;\; 0.0447983 \;\; -0.0275658 \;\; -0.1048159 \;\; -0.0141099]$$

$$\Pr(smoker_A | No) = X_{A,No}\hat{\beta} = 0.4494$$

$$\Pr(smoker_A | Yes) = X_{A,Yes}\hat{\beta} = 0.4021$$

The marginal effect of the smoking ban for Aloysius from the LPM (linear probability model) is the lowering of the probability of smoking by 0.4494 - 0.4021 ≈ 0.0472.

REPEAT (F) FOR THE LINEAR PROB MODEL:

Similarly for Brenda,

$$\Pr(smoker_B | No) = X_{B,No}\hat{\beta} = 0.1460$$

$$\Pr(smoker_B | Yes) = X_{B,Yes}\hat{\beta} = 0.0987$$

The marginal effect of the smoking ban for Brenda from the LPM is the lowering of the probability of smoking by 0.1460 - 0.0987 ≈ 0.0472.
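The LPM fitted values can be checked the same way. In this sketch the coefficient signs are taken from equation (3), including a negative intercept, which is an assumption that is consistent with the fitted value 0.4494 for Aloysius. The names are illustrative.

```python
# Coefficient order: smkban, female, age, age2, hsdrop, hsgrad, colsome,
# colgrad, black, hispanic, constant (signs as in equation (3)).
beta_lpm = [-0.0472399, -0.0332569, 0.0096744, -0.0001318, 0.3227142,
            0.2327012, 0.1642968, 0.0447983, -0.0275658, -0.1048159,
            -0.0141099]


def lpm_fitted(x):
    """Fitted probability X * beta from the linear probability model."""
    return sum(b * v for b, v in zip(beta_lpm, x))


aloysius_no = [0, 0, 20, 400, 1, 0, 0, 0, 0, 0, 1]
aloysius_yes = [1, 0, 20, 400, 1, 0, 0, 0, 0, 0, 1]
brenda_no = [0, 1, 40, 1600, 0, 0, 0, 1, 1, 0, 1]
brenda_yes = [1, 1, 40, 1600, 0, 0, 0, 1, 1, 0, 1]

effect_a = lpm_fitted(aloysius_no) - lpm_fitted(aloysius_yes)
effect_b = lpm_fitted(brenda_no) - lpm_fitted(brenda_yes)

# In the LPM both effects equal minus the smkban coefficient.
print(round(lpm_fitted(aloysius_no), 4), round(lpm_fitted(brenda_no), 4))
print(round(effect_a, 4), round(effect_b, 4))
```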
(h) The probit and LPM results give similar predictions for the probabilities of both Aloysius and Brenda. However, they yield different predictions of the effect of the smoking ban on the probability of smoking. That is, the marginal effect of the smoking ban in the probit model varies with the characteristics of an individual; here, the marginal effects for Aloysius and Brenda are different. On the other hand, the marginal effect of the smoking ban in the LPM is identical for Aloysius and Brenda, since it is given by the coefficient on the smkban variable. This makes us believe that the probit model is more convincing than the LPM.

Are the estimated effects large in an economic (i.e., real-world) sense? Most people might believe the impacts are large. For example, in (e) the reduction in the probability is 6.3 percentage points. Applied to a large number of people, this translates into a reduction of 6.3 percentage points in the share of people smoking.
(i) Across groups, observe that the smoker ratio with the smoking ban is 0.2120 while that without the ban is 0.2896. Thus, the average effect of a workplace smoking ban is 0.2896 - 0.2120 = 0.0776, around 8 percentage points. Practically, this might not be meaningful if the group characteristics differ from each other. In particular, the education level varies considerably between the groups with and without a ban, and thus this difference might not measure the average partial effect of the smoking ban policy well enough.

Table 9: Sample Means By Smkban

smkban   smoker   female   age       hsdrop   hsgrad   colsome   colgrad   black    hispanic
0        0.2896   0.4923   38.0871   0.1258   0.3721   0.2717    0.1548    0.0746   0.1271
1        0.2120   0.6094   39.0810   0.0690   0.2975   0.2857    0.2243    0.0784   0.1046
Total    0.2423   0.5637   38.6932   0.0912   0.3266   0.2802    0.1972    0.0769   0.1134

Source: Smoking.dta

(j) What we did in part (i) corresponds to option 2 of last Friday's lecture, i.e., the quick and dirty way of obtaining the average partial effect. It plugs the average values of the variables into the partial derivatives. However, as we saw in (i), it might be less appealing if we cannot define an individual of average characteristics. An alternative method is option 1: the average partial effect is calculated as the sample average of the partial effects of all individuals. The quantities from options 1 and 2 are usually different because $F$ is a nonlinear function. Specifically, option 1 measures

$$E\left[\frac{\partial F(X\beta)}{\partial X_2}\right] \approx \left[\sum_{n=1}^{N} \frac{\partial F(X_n\beta)}{\partial X_{2n}}\right] / N$$

$$E[\Delta F(X\beta) | X] \approx \left[\sum_{n=1}^{N} \left(F(X_n\beta | X_{1n} = 1) - F(X_n\beta | X_{1n} = 0)\right)\right] / N$$

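(k)

$$\frac{\partial \Pr[Y_n = 1 | X_n]}{\partial age} = \frac{\partial G(X_n\beta)}{\partial age} = g(X_n\beta) \left.\frac{\partial \hat{Y}_n}{\partial age}\right|_{\text{evaluated at } \bar{X}_n} = g(X_n\beta)\,[\beta_3 + 2\beta_4\, age] \Big|_{\text{evaluated at } \bar{X}_n} \tag{5}$$

We want to evaluate this average effect of age by plugging in the sample means of the variables in the data, which are given below.

Table 10: Sample Means of the Entire Data

stats   smkban   female   age       hsdrop   hsgrad   colsome   colgrad   black    hispanic
mean    0.6098   0.5637   38.6932   0.0912   0.3266   0.2802    0.1972    0.0769   0.1134

Source: Smoking.dta

$$\begin{aligned} \bar{X}\hat{\beta} = {} & -1.7349 - 0.1586\, smkban_n - 0.1117\, female_n + 0.0345\, age_n - 0.0005\, age_n^2 \\ & + 1.1416\, hsdrop_n + 0.8827\, hsgrad_n + 0.6771\, colsome_n + 0.2347\, colgrad_n \\ & - 0.0843\, black_n - 0.3383\, hispanic_n \;\big|_{\text{all evaluated at } \bar{X}} \\ = {} & -1.7349 - 0.1586 \times 0.6098 - 0.1117 \times 0.5637 + 0.0345 \times 38.6932 - 0.0005 \times 38.6932^2 \\ & + 1.1416 \times .0912 + 0.8827 \times .3266 + 0.6771 \times .2802 + 0.2347 \times .1972 \\ & - 0.0843 \times .0769 - 0.3383 \times .1134 \\ = {} & -0.7247 \end{aligned}$$

$$Prob(smoking | \bar{X}) = \Phi(-0.7247) \approx 1 - \Phi(0.72) = 1 - 0.7642 = 0.2358$$

The previous formula (5) gives

$$\begin{aligned} g(X_n\beta)\,[\beta_3 + 2\beta_4\, age] \big|_{\text{evaluated at } \bar{X}} & = g(-0.7247)\,[0.0345 + 2(-0.0005) \times 38.6932] \\ & = 0.3068\,[0.0345 + 2(-0.0005) \times 38.6932] = -0.0013 \end{aligned}$$

where $g(\cdot)$ is the pdf of the standard normal distribution. This is the average marginal effect of a one-unit change in age on the tendency to smoke.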
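The evaluation at the sample means can be verified numerically. The sketch below uses the rounded coefficients of equation (4) and the sample means of Table 10 (with the hispanic mean 0.1134 as used in the calculation above); the dictionary layout and names are illustrative. The probability is computed with the exact normal CDF, so it differs slightly from the table lookup at z = 0.72.

```python
import math

# Rounded probit coefficients from equation (4).
coef = {"const": -1.7349, "smkban": -0.1586, "female": -0.1117,
        "age": 0.0345, "age2": -0.0005, "hsdrop": 1.1416, "hsgrad": 0.8827,
        "colsome": 0.6771, "colgrad": 0.2347, "black": -0.0843,
        "hispanic": -0.3383}

# Sample means from Table 10; age2 is evaluated at the squared mean age.
mean_age = 38.6932
means = {"const": 1.0, "smkban": 0.6098, "female": 0.5637, "age": mean_age,
         "age2": mean_age ** 2, "hsdrop": 0.0912, "hsgrad": 0.3266,
         "colsome": 0.2802, "colgrad": 0.1972, "black": 0.0769,
         "hispanic": 0.1134}


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


index_at_means = sum(coef[name] * means[name] for name in coef)
prob_at_means = norm_cdf(index_at_means)

# Formula (5): g(X beta) * (beta_3 + 2 * beta_4 * age) at the sample means.
marginal_effect_age = norm_pdf(index_at_means) * (
    coef["age"] + 2.0 * coef["age2"] * mean_age)

print(round(index_at_means, 4), round(prob_at_means, 4),
      round(marginal_effect_age, 4))
```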


(l) While the coefficient on age in $X\beta$ is positive at 0.0345, so that higher age by itself increases the probability of smoking, the average marginal effect of age is negative. This implies that there may exist different age-specific marginal changes. Ignoring this difference in age-specific marginal effects can give completely wrong estimates of the change in smoking probability induced by a workplace smoking ban.
(m) Plugging the sample means of Table 11 into the $\bar{X}\hat{\beta}$ equation for each age cohort and then into (5), we have

$$\bar{X}_{18}\hat{\beta} = -0.5370 \;\Rightarrow\; g(-0.5370)\,[0.0345 + 2(-0.0005) \times 18] = 0.005698696$$

$$\bar{X}_{24}\hat{\beta} = -0.7783 \;\Rightarrow\; g(-0.7783)\,[0.0345 + 2(-0.0005) \times 24] = 0.003094299$$

$$\bar{X}_{30}\hat{\beta} = -0.7477 \;\Rightarrow\; g(-0.7477)\,[0.0345 + 2(-0.0005) \times 30] = 0.001357454$$

$$\bar{X}_{36}\hat{\beta} = -0.7550 \;\Rightarrow\; g(-0.7550)\,[0.0345 + 2(-0.0005) \times 36] = -0.00045001$$

Table 11: Sample Means By Age

age     smkban   female   hsdrop   hsgrad   colsome   colgrad   black   hispanic
18      0.49     0.48     0.31     0.56     0.13      0.00      0.06    0.18
24      0.60     0.58     0.08     0.31     0.32      0.27      0.09    0.18
30      0.59     0.58     0.09     0.31     0.26      0.24      0.09    0.12
36      0.62     0.53     0.05     0.33     0.28      0.20      0.09    0.10
Total   0.59     0.55     0.10     0.34     0.27      0.21      0.09    0.14

Source: Smoking.dta

It seems that the average effect of aging on smoking tendency is approximately concave, and this is consistent with Figure 2. From the data, the marginal effect over a 6-year age difference is less than 1%, but for the economy as a whole this could represent a large shift in the smoking population.
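The cohort-specific marginal effects follow from formula (5) once the index for each cohort is known. This sketch takes the four index values as given in (m), with negative signs as implied by the sample means in Table 11, and applies the rounded age coefficients 0.0345 and -0.0005. The names are illustrative.

```python
import math

beta_age, beta_age2 = 0.0345, -0.0005   # rounded coefficients from equation (4)

# Index X * beta at the cohort sample means, as reported in part (m).
cohort_index = {18: -0.5370, 24: -0.7783, 30: -0.7477, 36: -0.7550}


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


# Formula (5) evaluated at each cohort's index and age.
marginal_effect = {
    age: norm_pdf(z) * (beta_age + 2.0 * beta_age2 * age)
    for age, z in cohort_index.items()
}

for age in sorted(marginal_effect):
    print(age, round(marginal_effect[age], 6))
```

The effect shrinks with age and turns negative once age exceeds $-\beta_3 / (2\beta_4) = 34.5$ years under these rounded coefficients.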
(n) Method 2 serves our aim of computing the change in smoking probability induced by a workplace smoking ban better than method 1. However, it is not clear how to divide the age bins, and this arbitrary assignment may result in an inappropriate outcome.