Final Exam
Section 2 (Tue/Thur section)
(Seyhan Erden Arkonac)
Instructions
1. Do not turn this page until so instructed.
3. This exam has six questions for a total of 100 points and a bonus question for 2 points.
5. You are permitted to use a simple calculator. No computers, wireless, or other electronic
devices without prior permission. You may not share resources with anyone else.
6. Some questions ask you to draw a real-world judgment in a problem of practical importance.
The quality of that judgment counts. For example, consider the question: “It is 10°F outside.
In your judgment, why are so many people wearing heavy coats?” The answer, “To stay
warm” would receive more points than the answer, “Because they are fashion-conscious.”
NAME:_________________________________________________________
UNI:__________________________________________________________
Question 1 [17 points]:
A study analyzed the probability that Major League Baseball (MLB) players "survive" for
another season, or, in other words, play one more season. They studied a model of the following
form:
The dependent variable is a binary variable that takes on a value of one if the player played one
more season (a minimum of 50 at bats or 25 innings pitched), and zero otherwise. Seasons is the
number of total seasons played, measured in years, Perf is the performance of the player this
year, and Avgperf is the average performance of the player over their career.
The researchers had a sample of 4,728 hitters and 3,803 pitchers for the years 1901-1999. All
explanatory variables are standardized (sample mean of 0, variance of 1). Probit estimation
yielded the results as shown in the table:
(a) (6p) Interpret the two probit equations and calculate survival probabilities for hitters and
pitchers at the sample mean. Provide an explanation for why these are so high.
(b) (6p) Calculate the change in the survival probability for a player who has a very bad year by
performing two standard deviations below the average (assume also that this player has been
in the majors for many years so that his average performance is negligibly affected). How
does this change the survival probability when compared to the answer in (a)?
(c) (5p) Since the results for hitters and pitchers seem similar, the researcher could consider
combining the two samples. With a combined sample, how could you test the hypothesis
that the coefficients for the explanatory variables are the same for hitters and pitchers?
Explain in some detail.
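As a hedged illustration of the mechanics behind parts (a) and (b): because every regressor is standardized, the probit index at the sample mean reduces to the intercept, and a two-standard-deviation drop in Perf shifts the index by minus two times its coefficient. The coefficient values below (`b_const`, `b_perf`) are hypothetical placeholders, not the estimates from the exam's table, and the index is assumed to be linear in the three regressors.

```python
import math


def std_normal_cdf(z: float) -> float:
    """Standard normal CDF, Phi(z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# HYPOTHETICAL coefficients for illustration only (not the exam's estimates).
b_const = 2.0  # intercept
b_perf = 0.8   # coefficient on standardized Perf

# Standardized regressors all equal 0 at the sample mean,
# so the survival probability there is Phi(intercept).
p_mean = std_normal_cdf(b_const)

# A very bad year: Perf is 2 standard deviations below its mean,
# while Seasons and Avgperf stay at their means (0).
p_bad_year = std_normal_cdf(b_const + b_perf * (-2.0))

change = p_bad_year - p_mean

print(f"P(survive) at sample mean: {p_mean:.4f}")
print(f"P(survive) after bad year: {p_bad_year:.4f}")
print(f"Change in probability:     {change:.4f}")
```

Substituting the estimated intercept and Perf coefficient for hitters and for pitchers gives the two sets of probabilities the question asks for.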
Question 2 [21 points]: (ch 10)
Consider the following panel data regression with a single explanatory variable
Yit = β0 + β1Xit + uit.
In each of the examples below, you will be including entity and time fixed effects.
(a) (3 p) Consider the effect of beer taxes on the fatality rate using annual data from 1982-1988,
and nine U.S. regions (New England, Pacific, Mid-Atlantic, South, etc.). How many total
coefficients do you need to estimate?
(b) (4 p) Certain regions (e.g. New England) that tend to have higher beer taxes also tend to
have consistently higher quality hospitals. Does this pose a threat to your analysis?
(c) (3 p) Consider the effect of the minimum wage on teenage employment using annual data
from 1963-2000 for five Canadian Regions (Atlantic Provinces, Quebec, Ontario, Prairies,
British Columbia). How many total coefficients do you need to estimate?
(d) (4 p) Nationwide recessions impact both teenage employment and the minimum wage across
the country. Does this pose a threat to your analysis?
(e) (3 p) Consider the effect of savings rates on per capita income using data for three decades
(1960-1969, 1970-1979, 1980-1989; one observation per decade) and 104 countries. How
many total coefficients do you need to estimate?
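One way to check the counts in parts (a), (c), and (e) is sketched below. It assumes the usual dummy-variable formulation with a common intercept: one slope on X, one intercept, n - 1 entity dummies, and T - 1 time dummies, for n + T coefficients in total. The function name `n_coefficients` is introduced here for illustration only.

```python
def n_coefficients(n_entities: int, n_periods: int) -> int:
    """Coefficients in a regression with one regressor plus entity and
    time fixed effects, written with a common intercept."""
    slope = 1
    intercept = 1
    entity_dummies = n_entities - 1  # one entity is the omitted category
    time_dummies = n_periods - 1     # one period is the omitted category
    return slope + intercept + entity_dummies + time_dummies


# (a) nine U.S. regions, annual data 1982-1988
count_a = n_coefficients(9, 1988 - 1982 + 1)

# (c) five Canadian regions, annual data 1963-2000
count_c = n_coefficients(5, 2000 - 1963 + 1)

# (e) 104 countries, three decade observations
count_e = n_coefficients(104, 3)

print(count_a, count_c, count_e)
```

A different normalization (for example, dropping the intercept and keeping all entity dummies) changes which terms appear but not the total.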
Question 3 [15 points]:
Consider a supply model for edible chicken, which the U.S. Department of Agriculture calls
“broilers.” Data for this question are adapted from the data provided by Epple and McCallum
(2006)¹. The data are annual, 1950-2001. The supply equation is:
ln(QPROD_t) = β0 + β1 ln(P_t) + β2 ln(PF_t) + β3 TIME_t + β4 ln(QPROD_{t-1}) + e_t
where QPROD is aggregate production of young chickens, P is the real price index of fresh
chicken, PF is the real price index of broiler feed, and TIME is a time trend, which is included to
capture any technical progress in production. Some potential external instrumental variables
are ln(Y_t), where Y is the real per capita income; ln(PB_t), where PB is the real price of
beef; POPGRO_t is the percent population growth from year t-1 to year t; ln(P_{t-1}) is the lagged
log of the real price of chickens; ln(EXPTS_t) is the log of exports of chicken.
Estimated supply equation for chicken can be written from the following output:
Regression 1:
. reg lnQPROD lnP lnPF TIME lnQPROD_1
------------------------------------------------------------------------------
     lnQPROD |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         lnP |   .0091099   .0679409     0.13   0.894    -.1288175    .1470373
        lnPF |  -.0901945   .0426459    -2.11   0.042    -.1767703   -.0036186
        TIME |   .0111706   .0051486     2.17   0.037     .0007183    .0216229
   lnQPROD_1 |   .7326902   .1066347     6.87   0.000     .5162103      .94917
       _cons |   2.109681   .7991519     2.64   0.012      .487316    3.732045
------------------------------------------------------------------------------
1 “Simultaneous Equation Econometrics: The Missing Example”, Economic Inquiry, 44(2), 374-384
Regression 2:
. ivreg lnQPROD (lnP=lnPB lnY POPGRO lnEXPTS) lnPF TIME lnQPROD_1
------------------------------------------------------------------------------
     lnQPROD |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         lnP |    .393975   .1749342     2.25   0.031     .0388398    .7491103
        lnPF |  -.1909911   .0705566    -2.71   0.010    -.3342286   -.0477535
        TIME |   .0242389   .0087117     2.78   0.009     .0065532    .0419247
   lnQPROD_1 |   .5489031   .1635754     3.36   0.002     .2168274    .8809789
       _cons |   3.298617   1.196567     2.76   0.009     .8694559    5.727778
------------------------------------------------------------------------------
Instrumented: lnP
Instruments: lnPF TIME lnQPROD_1 lnPB lnY POPGRO lnEXPTS
------------------------------------------------------------------------------
Regression 3:
. reg lnP lnPB lnY POPGRO lnEXPTS lnPF TIME lnQPROD_1
------------------------------------------------------------------------------
         lnP |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnPB |   .1159974   .2186138     0.53   0.599    -.3293044    .5612991
         lnY |   1.471961   .6529929     2.25   0.031     .1418577    2.802064
      POPGRO |   .0697965   .0908676     0.77   0.448    -.1152949    .2548878
     lnEXPTS |   2.438689   .6971098     3.50   0.001     1.018723    3.858655
        lnPF |    .154805   .1068706     1.45   0.157    -.0628833    .3724932
        TIME |  -.0735312   .0230427    -3.19   0.003    -.1204676   -.0265948
   lnQPROD_1 |  -.0086269   .2911554    -0.03   0.977     -.601691    .5844372
       _cons |  -11.95739   6.311461    -1.89   0.067    -24.81341    .8986362
------------------------------------------------------------------------------
Regression 4:
. reg lnQPROD lnP lnPF TIME lnQPROD_1
------------------------------------------------------------------------------
     lnQPROD |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         lnP |   .0091099   .0679409     0.13   0.894    -.1288175    .1470373
        lnPF |  -.0901945   .0426459    -2.11   0.042    -.1767703   -.0036186
        TIME |   .0111706   .0051486     2.17   0.037     .0007183    .0216229
   lnQPROD_1 |   .7326902   .1066347     6.87   0.000     .5162103      .94917
       _cons |   2.109681   .7991519     2.64   0.012      .487316    3.732045
------------------------------------------------------------------------------
. predict e, residuals
(1 missing values generated)
Regression 5:
. reg e lnPB lnY POPGRO lnEXPTS lnPF TIME lnQPROD_1
------------------------------------------------------------------------------
           e |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnPB |   .1180813   .0856913     1.38   0.178    -.0564662    .2926289
         lnY |   .2378684   .2559575     0.93   0.360       -.2835    .7592367
      POPGRO |  -.0123288   .0356179    -0.35   0.732    -.0848802    .0602225
     lnEXPTS |   .9702997   .2732502     3.55   0.001     .4137072    1.526892
        lnPF |  -.0522353   .0418907    -1.25   0.221    -.1375639    .0330932
        TIME |  -.0045154   .0090322    -0.50   0.621    -.0229133    .0138826
   lnQPROD_1 |  -.2648651   .1141259    -2.32   0.027     -.497332   -.0323983
       _cons |   .1666471   2.473941     0.07   0.947    -4.872605    5.205899
------------------------------------------------------------------------------
( 1) lnPB = 0
( 2) lnY = 0
( 3) POPGRO = 0
( 4) lnEXPTS = 0
F( 4, 32) = 3.83
Prob > F = 0.0118
(a) (4p) Compare the results in regressions 1 and 2. Explain the reasons for using instrumental
variables in regression 2.
(b) (5p) What are the requirements for valid instruments? Explain with mathematical
conditions.
(c) (6p) Do these instruments satisfy the requirements? You must use the necessary
regression results for your answer. Please specify the regression number you use while
answering each part of this question.
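For the exogeneity half of part (c), the overidentifying-restrictions statistic can be sketched as follows. This assumes Regression 5 and the F-test reported after it are intended as the J-test regression (the textbook version uses TSLS residuals), with J = mF distributed chi-squared with m - k degrees of freedom, where m = 4 excluded instruments and k = 1 endogenous regressor. The 5% critical value 7.81 is the standard tabulated value for chi-squared with 3 degrees of freedom; the helper `chi2_sf_3df` is introduced here for illustration.

```python
import math

m = 4          # excluded instruments: lnPB, lnY, POPGRO, lnEXPTS
k = 1          # endogenous regressors: lnP
f_stat = 3.83  # F(4, 32) reported after Regression 5

j_stat = m * f_stat  # J = m * F
df = m - k           # degree of overidentification


def chi2_sf_3df(x: float) -> float:
    """Upper-tail probability of a chi-squared(3) variable (closed form)."""
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


crit_5pct = 7.81  # tabulated 5% critical value, chi-squared with 3 df
p_value = chi2_sf_3df(j_stat)
reject_exogeneity = j_stat > crit_5pct

print(f"J = {j_stat:.2f}, df = {df}, p-value = {p_value:.4f}")
print("Reject joint instrument exogeneity at 5%:", reject_exogeneity)
```

Instrument relevance is a separate check that relies on the first-stage regression (Regression 3); the joint F-statistic on the four excluded instruments is not reported there, so it is not computed in this sketch.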
Question 4 [15 points]:
There is some economic research that suggests that oil prices play a central role in causing
recessions in developed countries. In particular, this research suggests that it is specifically
increases in oil prices that matter. As a result, economists often look only at the percentage point
difference between oil prices at date t and the maximum value over the previous year. However,
you notice that energy prices can fluctuate quite dramatically in both directions and believe that
geographic areas also benefit substantially from oil price decreases. As a result, you decide to
consider the effect of real oil prices (Poil/CPI) on GDP growth (Yt). You estimate the following
distributed lag model using annual data (numbers in parentheses are HAC standard errors):
(a) (5p) What is the impact effect of a 25 percent increase in real oil prices?
(b) (5p) What is the predicted cumulative change in GDP Growth over two years of this effect?
(c) (5p) The HAC F-statistic is 4.07. Can you reject the null hypothesis that oil price changes
have no effect on real GDP growth? What is the critical value you considered? Is there any
reason why you should be cautious using an F-test in this case, given the sample period?
Question 5 [20 points]:
The following Stata output reports a VAR(2) (vector autoregression) model of the
change in inflation (cinf) and the unemployment rate (unem).
. var unem cinf
Vector autoregression
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
unem         |
        unem |
         L1. |   1.061241   .1303681     8.14   0.000     .8057245    1.316758
         L2. |  -.2874012    .133048    -2.16   0.031    -.5481705    -.026632
             |
        cinf |
         L1. |   .0976014   .0668152     1.46   0.144    -.0333539    .2285567
         L2. |   .0623594   .0572543     1.09   0.276     -.049857    .1745758
             |
       _cons |   1.345204    .513183     2.62   0.009     .3393835    2.351024
-------------+----------------------------------------------------------------
cinf         |
        unem |
         L1. |  -.4678597   .2243671    -2.09   0.037     -.907611   -.0281084
         L2. |   .2932862   .2289793     1.28   0.200     -.155505    .7420773
             |
        cinf |
         L1. |  -.0527481   .1149907    -0.46   0.646    -.2781258    .1726296
         L2. |   -.430232   .0985363    -4.37   0.000    -.6233595   -.2371044
             |
       _cons |    1.00306    .883202     1.14   0.256    -.7279845    2.734104
------------------------------------------------------------------------------
Table 1
Year Unem Inflation
2008 5.8 3.8
2009 9.3 -0.3
2010 9.6 1.6
2011 8.9 3.1
2012 8.1 2.1
(a) (4p) Given the actual realizations of unemployment and inflation in Table 1, forecast
unemployment for 2013. Show your work.
(b) (4p) Given the actual realizations of unemployment and inflation in Table 1, forecast
inflation for 2013. Show your work.
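The arithmetic for parts (a) and (b) can be sketched as below, using the coefficients from the VAR output and the values in Table 1. It assumes cinf in year t is inflation in year t minus inflation in year t-1, so the two lags needed are cinf for 2012 and 2011; the inflation forecast is then 2012 inflation plus the forecast change. The dictionary and function names are introduced here for illustration.

```python
# Table 1: year -> (unemployment rate, inflation rate)
table1 = {
    2008: (5.8, 3.8),
    2009: (9.3, -0.3),
    2010: (9.6, 1.6),
    2011: (8.9, 3.1),
    2012: (8.1, 2.1),
}

# Coefficients copied from the VAR(2) output above.
unem_eq = {"unem_L1": 1.061241, "unem_L2": -0.2874012,
           "cinf_L1": 0.0976014, "cinf_L2": 0.0623594, "const": 1.345204}
cinf_eq = {"unem_L1": -0.4678597, "unem_L2": 0.2932862,
           "cinf_L1": -0.0527481, "cinf_L2": -0.430232, "const": 1.00306}


def cinf(year: int) -> float:
    """Change in inflation: inflation in `year` minus inflation in `year - 1`."""
    return table1[year][1] - table1[year - 1][1]


def forecast(eq: dict, year: int) -> float:
    """One-step-ahead forecast for `year` from one VAR(2) equation."""
    return (eq["const"]
            + eq["unem_L1"] * table1[year - 1][0]
            + eq["unem_L2"] * table1[year - 2][0]
            + eq["cinf_L1"] * cinf(year - 1)
            + eq["cinf_L2"] * cinf(year - 2))


unem_2013 = forecast(unem_eq, 2013)
cinf_2013 = forecast(cinf_eq, 2013)
infl_2013 = table1[2012][1] + cinf_2013  # convert the change back to a level

print(f"Forecast unemployment 2013:         {unem_2013:.2f}")
print(f"Forecast change in inflation 2013:  {cinf_2013:.2f}")
print(f"Forecast inflation 2013:            {infl_2013:.2f}")
```

Working in changes and then adding back the last observed level keeps the forecast consistent with the variable the VAR was actually estimated on.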
(c) (4p) The following is the joint test result for the second lags of the unemployment rate and
the change in inflation. According to this test, would a VAR(1) model be a better
forecasting model than a VAR(2) model? Explain why.
( 1) [unem]L2.cinf = 0
( 2) [cinf]L2.cinf = 0
( 3) [unem]L2.unem = 0
( 4) [cinf]L2.unem = 0
chi2( 4) = 30.26
Prob > chi2 = 0.0000
(d) (4p) Why might a researcher use change in inflation as opposed to inflation in this
model? Explain.
(e) (4p) Should one use change in unemployment instead of unemployment? Explain.
Question 6 [12 points]:
Consider the panel data model:
where are i.i.d. and independent of Xs with mean zero and variance ,
(c) (3 p) Derive algebraically ̂ the fixed-effects estimator of . The fixed effects
estimator minimizes the sum of squared residuals of the model you wrote in part b.
(d) (3 p) Show that, if is a random variable that is independent of X and u, the
estimator
∑
̃
∑
is unbiased for . Explain your answer.
Bonus Question [2 points]:
The two conditions for instrument validity are corr(Zi, Xi) ≠ 0 and corr(Zi, ui) = 0. The reason for the
inconsistency of OLS is that corr(Xi, ui) ≠ 0. If X and Z are correlated, and X and u are also correlated, how
is it possible that Z and u are not correlated? Explain.
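A small simulation can make the point concrete. The sketch below assumes a hypothetical data-generating process in which X is the sum of three independent standard normal pieces: the instrument Z, the regression error u, and extra noise v. X is then correlated with both Z and u even though Z and u share no common component. The variable names and the process itself are illustrative assumptions, not part of the exam.

```python
import math
import random


def corr(a: list, b: list) -> float:
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 20000

z = [rng.gauss(0, 1) for _ in range(n)]  # instrument
u = [rng.gauss(0, 1) for _ in range(n)]  # regression error, independent of z
v = [rng.gauss(0, 1) for _ in range(n)]  # other variation in X

# X has one part driven by Z (exogenous) and one part driven by u (endogenous).
x = [zi + ui + vi for zi, ui, vi in zip(z, u, v)]

corr_zx = corr(z, x)
corr_xu = corr(x, u)
corr_zu = corr(z, u)

print(f"corr(Z, X) = {corr_zx:.3f}")
print(f"corr(X, u) = {corr_xu:.3f}")
print(f"corr(Z, u) = {corr_zu:.3f}")
```

Correlation is not transitive: Z moves only the part of X that is unrelated to u, which is exactly the variation instrumental variables regression isolates.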
Selected Tables from Stock and Watson, Introduction to Econometrics