EC220
Introduction to Econometrics
Instructions to candidates
This paper contains FOUR questions, divided into two sections. Section A contains ONE question
related to Michaelmas Term and Section B contains THREE questions related to Lent Term. You
should answer ALL questions from Section A and ALL questions from Section B.
If at any point in this exam you feel that anything is unclear, please make additional assumptions that
you feel are necessary and state them clearly.
For Section A: Please type your answer using word-processing software on a computer (e.g. Word).
You could combine the typed document with scanned or photographed hand-drawn diagrams and
computations. The maximum word count is 1500 words, beyond which nothing will be marked. There
is no minimum word count and concise answers will be rewarded.
For Section B: Please use pen and paper and scan (or photograph) your answers. You could also use
an iPad or a tablet. There is no maximum word count for Section B. Please annotate your answers
clearly.
The answers must then be converted to pdf and uploaded to Moodle as ONE individual file together
with the Coversheet. Please make sure every single scanned page is legible and properly ordered.
The file will be run through Turnitin to ensure academic integrity.
Time Allowed: Submit PDF with answers within 24 hours after the official start of the exam
You are supplied with: Lindley & Scott Cambridge Statistical Tables
Table A5 Durbin-Watson d-statistic
You may also use: Open book examination
Question 1
[33.34 marks]
A blowout of the BP Deepwater Horizon oil-well in April 2010 led to the largest marine oil spill in history,
lasting until July of that year. Researchers would like to analyse whether consumers reacted to the
disaster by reducing their consumption of BP branded petrol during the oil spill. They collected data
on the prices and quantities sold at BP-branded and non-BP petrol stations across zip codes (postal
codes; small local areas) in the US. Either a zip code contains BP stations, in which case the average
price and average number of gallons sold for each of these BP stations is recorded (and the indicator
variable BP = 1), or a zip code contains no BP station, in which case the price and quantity at these
non-BP stations is recorded (and BP = 0). An observation is a particular petrol station. Non-BP
stations in BP zip codes are not used in the sample.
Prices and quantities are the averages either for the period January 2009 to March 2010 (before the
oil spill) in columns (1) and (2) or for April 2010 to July 2010 (during the oil spill) in columns (3) to (6)
of the table below. Prices are in US Dollars per gallon and coded as Price. Quantities are in logarithms
and coded as ln(sales). The researchers also constructed a variable called Green Index, which is
supposed to measure the environmental orientation of consumers in the zip code. The Green Index
is constructed by combining the share of hybrid vehicle registrations, per capita membership in the
Sierra Club, an environmental organisation, and per capita contributions to Green Party election funds
in the zip code, all measured prior to 2010. The Green Index is then standardised to have mean 0
and standard deviation 1. Using either Price or ln(sales) as the dependent variable, the researchers
obtain the following results.
(a) Define the treatment, the outcome, and the counterfactuals implicit in the regression in column
(3). What do the researchers use as the control group in this regression?
[3.34 marks]
(b) Why do the researchers run the regressions in columns (1) and (2) for the period before the oil
spill? What do you conclude from this exercise?
[6 marks]
(c) What is the average effect of the oil spill on BP prices in column (3)? Discuss whether this is
likely a causal effect.
[6 marks]
(d) The researchers also have a variable available which measures the advertising expenditures by
BP in a particular zip code from April to July 2010. Would this variable be useful for the analysis?
Question 2
[22.33 marks]
Consider the bivariate regression model without intercept
y_i = β x_i + u_i,
(a) Let β̂ be the OLS estimator for the regression from y on x (without intercept). Show that β̂ is a
consistent estimator for β under SLR.1-4.
[3 marks]
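As an informal illustration of what consistency means here, the following Python sketch simulates the no-intercept model and computes β̂ = Σ x_i y_i / Σ x_i² for increasing sample sizes. The true value β = 2, the uniform distribution of x and the standard normal errors are assumptions chosen purely for illustration; a simulation is not a substitute for the proof requested.

```python
import random


def ols_through_origin(x, y):
    """OLS slope for y = beta * x + u without intercept: sum(x*y) / sum(x^2)."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / sxx


def simulate_beta_hat(n, beta=2.0, seed=0):
    """Draw n observations from an illustrative DGP with E(u|x) = 0 and estimate beta."""
    rng = random.Random(seed)
    x = [rng.uniform(1.0, 3.0) for _ in range(n)]        # assumed regressor distribution
    y = [beta * xi + rng.gauss(0.0, 1.0) for xi in x]    # assumed N(0, 1) errors
    return ols_through_origin(x, y)


if __name__ == "__main__":
    # The estimates should settle around the assumed true value as n grows.
    for n in (10, 1_000, 100_000):
        print(n, round(simulate_beta_hat(n), 4))
```

The printed estimates vary with the seed, but their spread around the assumed β should shrink as n increases.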
(b) In addition to SLR.1-4, suppose we know that
Var(u|x) = σ² x².
where w(x) > 0 is a positive weight function of x. Derive the expression for β̃ .
[4 marks]
(e) Show that β̃ is an unbiased estimator for β under SLR.1-4.
[4 marks]
(f) Additionally, suppose:
SLR.5 The error term u satisfies Var(u|x) = σ² for any value of x (homoskedasticity).
H_0: β_1 = 10 in favor of H_1: β_1 > 10
at significance level α, then he/she would certainly reject the null hypothesis
H_0: β_1 = 10 in favor of H_1: β_1 ≠ 10
at the same significance level.
[4 marks]
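To see how one-sided and two-sided rejection regions relate at a common significance level, the sketch below compares the two critical values. It assumes a large-sample standard normal approximation to the t statistic, and the value 1.8 used in the example is hypothetical; with a small sample the t distribution's critical values would be used instead.

```python
from statistics import NormalDist


def critical_values(alpha):
    """Return (one-sided, two-sided) standard normal critical values at level alpha."""
    z = NormalDist()
    one_sided = z.inv_cdf(1.0 - alpha)
    two_sided = z.inv_cdf(1.0 - alpha / 2.0)
    return one_sided, two_sided


def decisions(t_stat, alpha):
    """Return (reject against H1: beta1 > 10, reject against H1: beta1 != 10)."""
    one_sided, two_sided = critical_values(alpha)
    return t_stat > one_sided, abs(t_stat) > two_sided


if __name__ == "__main__":
    print(critical_values(0.05))
    print(decisions(1.8, 0.05))  # hypothetical t statistic
```

Because the two-sided test splits α across both tails, its critical value is the larger of the two.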
(iii) Consider a multiple regression model y = β_0 + β_1 x_1 + β_2 x_2 + u, where u is independent of (x_1, x_2)
and u ∼ Normal(0, σ²) (i.e., Assumptions MLR.1-6 are satisfied). If the sample correlation
between x_1 and x_2 is extremely high (say, 0.99), then the t statistic for testing the hypothesis
H_0: β_2 = 0 does not follow the t distribution under H_0.
[3 marks]
(b) We are interested in evaluating whether the decision by loan officials to deny a mortgage may
be racially motivated. Let the binary variable deny equal 1 if the application for a mortgage was
denied, and deny = 0 if an application for a mortgage was successful. minority is a dummy variable
which indicates if the applicant belongs to an ethnic minority group (1 = yes) or not (0 = no).
The data set contains information on a wide range of variables which a loan officer might legally
consider when deciding on a mortgage application. We will restrict our attention to the variables:
pirat (ratio of total monthly debt payment to total monthly income), lvrat_med and lvrat_high
(dummy variables indicating whether the loan-to-value ratio is intermediate or high, with the
excluded dummy being low), and a consumer credit score, chist (which ranges from 1 to 5, where 5
is the worst rating). We have a random sample of 2,380 observations. There are 285 denied
applications, the average of pirat equals 0.33, and the average of chist equals 2.12. The following
table provides various regression results based on this data that may shed light on this question.
(i) Let us consider the results that are based on the linear probability model (LPM). Provide
the interpretation of the parameter estimate on minority, denoted β̂_minority, and test
whether the effect is statistically significant. Clearly indicate the assumptions you rely
on for your interpretation and for the test you conduct.
[4 marks]
(ii) Obtain the partial effect of being a minority on the mortgage denial rate using the logit
model, when evaluated at the mean values of the explanatory variables and given an
intermediate loan-to-value ratio. Briefly indicate whether you expect this effect to be significant
(no formal test expected). Should we expect this effect to be the same given a high
loan-to-value ratio?
[4.33 marks]
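The mechanics of a discrete partial effect in a logit model can be sketched as the difference Λ(index with minority = 1) − Λ(index with minority = 0), holding the other regressors fixed. In the sketch below the coefficient values in COEF are HYPOTHETICAL placeholders, not the estimates from the exam's table; only the sample means of pirat (0.33) and chist (2.12) come from the question.

```python
import math


def logistic(z):
    """Logistic CDF, Lambda(z) = exp(z) / (1 + exp(z))."""
    return 1.0 / (1.0 + math.exp(-z))


# HYPOTHETICAL coefficients for illustration only; replace with the table's estimates.
COEF = {"const": -5.0, "minority": 0.6, "pirat": 5.0,
        "lvrat_med": 0.4, "lvrat_high": 1.5, "chist": 0.3}


def index(minority, lvrat_med, lvrat_high, pirat=0.33, chist=2.12, coef=COEF):
    """Linear index of the logit model at the given regressor values."""
    return (coef["const"] + coef["minority"] * minority + coef["pirat"] * pirat
            + coef["lvrat_med"] * lvrat_med + coef["lvrat_high"] * lvrat_high
            + coef["chist"] * chist)


def minority_effect(lvrat_med, lvrat_high, coef=COEF):
    """Change in Pr(deny = 1) when minority switches from 0 to 1."""
    p1 = logistic(index(1, lvrat_med, lvrat_high, coef=coef))
    p0 = logistic(index(0, lvrat_med, lvrat_high, coef=coef))
    return p1 - p0


if __name__ == "__main__":
    print("intermediate LTV:", minority_effect(1, 0))
    print("high LTV:        ", minority_effect(0, 1))
```

Because Λ is nonlinear, the effect depends on where the index is evaluated, so the two printed numbers generally differ even though the minority coefficient is the same.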
(iii) Let us define Pr(deny = 1|x) = Λ(α_0 + α_1 pirat + α_2 lvrat_med + α_3 lvrat_high + α_4 chist),
where Λ(z) = exp(z)/(1 + exp(z)). You are interested in deciding whether there is
evidence that the decision rule taken by the loan officer is the same for minorities as it is for
non-minorities. Using the results provided, conduct this test. Clearly specify the null and
the alternative hypothesis, the test statistic and its distribution under the null, the rejection
rule, and interpret your findings.
[3 marks]
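One common way to test restrictions of this kind in a maximum likelihood setting is a likelihood ratio test, LR = 2(ln L_unrestricted − ln L_restricted), compared with a chi-squared distribution whose degrees of freedom equal the number of restrictions. The sketch below assumes this approach; the log-likelihood values and the choice of 5 restrictions in the example are hypothetical and must be replaced by the figures implied by the results table.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(chi2_df > x) for integer df >= 1 (finite closed-form sums)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2-1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_j (x/2)^(j-1/2) / Gamma(j+1/2)
    total = 0.0
    for j in range(1, (df - 1) // 2 + 1):
        total += half ** (j - 0.5) / math.gamma(j + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


def lr_statistic(loglik_restricted, loglik_unrestricted):
    """Likelihood ratio statistic 2 * (lnL_unrestricted - lnL_restricted)."""
    return 2.0 * (loglik_unrestricted - loglik_restricted)


if __name__ == "__main__":
    lr = lr_statistic(-700.0, -690.0)   # HYPOTHETICAL log-likelihoods
    p_value = chi2_sf(lr, 5)            # 5 restrictions is an assumption
    print(lr, p_value, "reject at 5%" if p_value < 0.05 else "do not reject at 5%")
```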
The error process has mean zero and exhibits autocorrelation. You may assume that ε_t is independent
of x_s for all s ≤ t, and y_0 and y_{−1} are available and equal 0.
(a) Discuss how you would test the null hypothesis that the error process does not exhibit
autocorrelation against the alternative that the error can be represented by a stationary AR(2) process.
In your answer you should clearly indicate what an AR(2) process is.
[5 marks]
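For orientation, an AR(2) error process can be written as u_t = φ_1 u_{t−1} + φ_2 u_{t−2} + e_t with e_t white noise. The symbols φ_1, φ_2 and e_t are notation assumed here (chosen to avoid confusion with the model's own parameters), and the parameter values in the example are arbitrary. The sketch checks the usual stationarity conditions for an AR(2) and simulates a path.

```python
import random


def is_stationary_ar2(phi1, phi2):
    """Stationarity region of an AR(2): phi1 + phi2 < 1, phi2 - phi1 < 1, |phi2| < 1."""
    return (phi1 + phi2 < 1.0) and (phi2 - phi1 < 1.0) and (abs(phi2) < 1.0)


def simulate_ar2(n, phi1, phi2, sigma=1.0, seed=0):
    """Simulate u_t = phi1*u_{t-1} + phi2*u_{t-2} + e_t, starting from u = 0."""
    rng = random.Random(seed)
    u = [0.0, 0.0]  # two start-up values
    for _ in range(n):
        u.append(phi1 * u[-1] + phi2 * u[-2] + rng.gauss(0.0, sigma))
    return u[2:]


if __name__ == "__main__":
    print(is_stationary_ar2(0.5, 0.3))   # inside the stationarity region
    print(is_stationary_ar2(0.8, 0.3))   # phi1 + phi2 >= 1, not stationary
    print(simulate_ar2(5, 0.5, 0.3))
```

Under the null of no autocorrelation both φ_1 and φ_2 are zero, which is why a joint test on two lagged residual terms is the natural starting point.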
(b) Briefly discuss the properties of the OLS estimator when applied to the above model
(unbiasedness, consistency) recognizing the presence of autocorrelation. Support your answers with
suitable arguments.
[4 marks]
(c) What is the purpose of heteroskedasticity and autocorrelation robust (HAC) standard errors?
Discuss whether HAC standard errors can resolve the main problem associated with estimating
the model using OLS.
[3 marks]
(d) Describe in detail a method to resolve the main problem with using OLS to estimate (α, ρ_1, ρ_2, β).
Clearly indicate the assumptions underlying this method. If you think there is no method that
can resolve the main problem, explain why.
[5 marks]
(e) What is the Long Run Propensity (the long-run effect that a permanent change in x_t has on y_t) in
this model and what is the Impact Propensity? Describe how you would conduct, using a single
linear hypothesis, a test for the hypothesis that the LRP is the same as the Impact Propensity.
[5 marks]
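The distinction between the two propensities can be sketched numerically. The code below ASSUMES the model has the form y_t = α + ρ_1 y_{t−1} + ρ_2 y_{t−2} + β x_t + error, which is an inference from the parameter list (α, ρ_1, ρ_2, β) and the initial values y_0 and y_{−1}, not something stated in the text above. Under that assumed form the impact propensity is β and the long run propensity is β/(1 − ρ_1 − ρ_2), provided the dynamics are stable. The parameter values are arbitrary.

```python
def impact_propensity(beta):
    """Immediate effect of a one-unit change in x_t on y_t (assumed model form)."""
    return beta


def long_run_propensity(rho1, rho2, beta):
    """Long-run effect of a permanent one-unit change in x: beta / (1 - rho1 - rho2)."""
    denom = 1.0 - rho1 - rho2
    if denom == 0.0:
        raise ValueError("rho1 + rho2 = 1: the long-run effect is not defined")
    return beta / denom


def simulated_long_run_effect(alpha, rho1, rho2, beta, periods=500):
    """Iterate the noise-free recursion with x fixed at 1 and at 0; return the gap in y."""
    def path(x):
        y_lag1 = y_lag2 = 0.0  # y_0 = y_{-1} = 0
        for _ in range(periods):
            y_lag1, y_lag2 = alpha + rho1 * y_lag1 + rho2 * y_lag2 + beta * x, y_lag1
        return y_lag1
    return path(1.0) - path(0.0)


if __name__ == "__main__":
    print(impact_propensity(1.5))
    print(long_run_propensity(0.5, 0.2, 1.5))
    print(simulated_long_run_effect(0.3, 0.5, 0.2, 1.5))
```

Under this assumed form, and for β ≠ 0, the two propensities coincide only when ρ_1 + ρ_2 = 0, which is a single linear restriction on the parameters.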
END OF PAPER