
Professor Sitian Liu

Email: sitianliu@econ.queensu.ca
Office: Dunning Hall, Room 345
Office hours: Monday 4:00-5:00pm

Econ 361 Assignment 1 Solution

Due: October 22, 2020

Note: More details on the calculation can be found in the Excel sheet.

Problem 1: Statistics (Lecture 2) (15 points)

1. You are asked to conduct the following one-sided hypothesis test:


Null hypothesis: The population mean wage per hour in Ontario is lower than $20.

H0 : µ < 20

Alternative hypothesis: The population mean wage per hour in Ontario is equal to or higher than
$20.
H1 : µ ≥ 20

To conduct the hypothesis test, you collected information on hourly wages (X_i) from a sample
of 400 workers in Ontario, and calculated the sample mean (X̄) and sample variance (s²):

X̄ = 22 and s² = 16.

Recall that the T-statistic is given by

\[ t = \frac{\bar{X} - \mu}{s/\sqrt{n}} \sim t_{n-1}, \]

where n − 1 denotes the degrees of freedom and n is the number of observations.

(a) What is the critical value corresponding to a significance level of 0.05? (5 points)
(Hint: The one-sided hypothesis test is different from the two-sided test, introduced in
Lecture 2. In a two-sided test, you reject the null hypothesis when the absolute value of
t is too large—i.e., the sample mean is either far greater or smaller than the proposed
population mean (µ). Here, you reject the null only when t is too large—i.e., the sample
mean is far greater than µ. How does this affect your choice of the critical value for a
given significance level?)

Answer: 1.64. You can find the number using the t-distribution table: Column “0.05”
and Row “z” (when the degrees of freedom are large enough, a t-distribution is close to a
standard normal distribution).
(b) Will you reject the null hypothesis and why? (5 points)
Answer: The T-statistic is
\[ t = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{22 - 20}{4/\sqrt{400}} = 10, \]

which is greater than the critical value 1.64. Therefore, you will reject the null hypothesis.
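The test in parts (a) and (b) can be reproduced with a few lines of Python. This is only a sketch: it uses the standard normal approximation for the critical value (as the answer to (a) does), and the variable names are illustrative choices, not part of the assignment.

```python
import math
from statistics import NormalDist

# Values taken from the problem statement.
n = 400        # sample size
x_bar = 22.0   # sample mean
s2 = 16.0      # sample variance
mu_0 = 20.0    # value of the population mean under the null
alpha = 0.05   # significance level

# t-statistic: (x_bar - mu_0) / (s / sqrt(n))
t_stat = (x_bar - mu_0) / (math.sqrt(s2) / math.sqrt(n))

# One-sided critical value from the standard normal distribution
# (assumption: 399 degrees of freedom is large enough for this approximation).
critical_value = NormalDist().inv_cdf(1 - alpha)

# Reject the null only when t is too large (upper tail).
reject_null = t_stat > critical_value
print(f"t = {t_stat:.2f}, critical value = {critical_value:.3f}, reject: {reject_null}")
```

The critical value comes out at about 1.645, which the answer above reports as 1.64.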

2. Based on X̄ and s² in part 1, calculate the interval that contains the true population mean
with a 95% probability. (5 points)
Answer:
\[ \Pr\left(-1.96 \le \frac{\bar{x} - \mu}{s/\sqrt{n}} < 1.96\right) = 0.95 \;\Rightarrow \]
\[ \Pr\left(\bar{x} - 1.96\frac{s}{\sqrt{n}} < \mu \le \bar{x} + 1.96\frac{s}{\sqrt{n}}\right) = 0.95 \;\Rightarrow \]
\[ \Pr(21.608 < \mu \le 22.392) = 0.95. \]
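As a quick check of the interval, the following sketch recomputes the bounds from the sample mean and variance given in part 1. It assumes the normal approximation (a critical value of about 1.96); the variable names are illustrative.

```python
import math
from statistics import NormalDist

n, x_bar, s2 = 400, 22.0, 16.0     # from part 1

# Two-sided 95% critical value (about 1.96 under the normal approximation).
z = NormalDist().inv_cdf(0.975)

# Half-width of the interval: z * s / sqrt(n)
half_width = z * math.sqrt(s2 / n)
lower, upper = x_bar - half_width, x_bar + half_width
print(f"95% interval: [{lower:.3f}, {upper:.3f}]")
```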

Problem 2: Econometric Methods (Lectures 3–4) (35 points)

1. You are interested in estimating the causal effect of parental income (X) on children’s college
enrollment (Y ). You collected data from a sample of individuals aged 18–20, which provide
information on whether individual i is enrolled in college (Y_i) and his/her parental income
(X_i). Suppose you estimate the following equation:

\[ Y_i = \alpha + \beta X_i + \varepsilon_i \]

and obtain the OLS estimates α̂ and β̂.

(a) Can you interpret β̂ as the causal effect of parental income on children’s education? (2
points)
Answer: No.
(b) Use 1–2 sentences to describe what omitted variable bias is. (4 points)
Answer: Omitted variable bias arises when a variable that is correlated with parental income
and affects children’s education directly is left out of the regression, so that β̂ also picks
up the effect of that variable.
(c) Provide an example of a potential omitted variable. (2 points)
Answer: An example is inherited ability/genetic traits. More specifically, parents with
higher ability may have higher income. At the same time, children may inherit such
ability from their parents, which may directly affect their educational choices.
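The mechanism in (b) and (c) can be illustrated with a small simulation. Everything here is hypothetical and stylized: the "true" coefficients, the normal distributions, and the continuous outcome (a stand-in for an education index rather than the binary enrollment indicator) are assumptions made only to show the direction of the bias.

```python
import random

rng = random.Random(0)
n = 20_000
beta, gamma = 0.5, 1.0   # hypothetical true effects of income and ability

# Unobserved ability raises both parental income and the child's outcome.
ability = [rng.gauss(0, 1) for _ in range(n)]
income = [a + rng.gauss(0, 1) for a in ability]
outcome = [beta * x + gamma * a + rng.gauss(0, 1)
           for x, a in zip(income, ability)]

def ols_slope(x, y):
    """Slope from a simple regression of y on x (covariance over variance)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    var = sum((xi - mx) ** 2 for xi in x)
    return cov / var

# Regressing the outcome on income alone omits ability.
beta_hat = ols_slope(income, outcome)
print(f"true beta = {beta}, OLS estimate without ability = {beta_hat:.3f}")
```

In this setup the estimate settles near 1.0 rather than 0.5, because income also proxies for the omitted ability term.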

2. What is selection? Provide an example to show that self-selection can make it difficult to
evaluate the effect of policies. (4 points)
Answer: Self-selection is when individuals select themselves into a group (i.e., choose whether
they are in the treatment or not). This makes it difficult to estimate the causal effect of
policies because the control group (i.e., the group not affected by a policy) is not an accurate
counterfactual for the treatment group (i.e., the group affected by the policy).
If we want to estimate the effect of a GED program in prisons on recidivism, we cannot
estimate the causal effect by just comparing the difference in recidivism rates between those
who completed the program and those who did not. This is because inmates who chose to
enroll in the GED program may differ from those who did not, and these unobserved factors
may directly affect their recidivism. For instance, those who chose to enroll in the GED
program while serving time in prison may have higher ambition or a stronger preference for
working in the legal sector than others. These factors may directly affect their after-release
criminal behavior.

3. Explain how randomized controlled trials can help overcome selection bias. (4 points)
Answer: Randomized controlled trials can help overcome selection bias because treatment is
randomly assigned to individuals. This means that there are no unobservables that are
correlated with both treatment and the dependent variable.
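The contrast between self-selection and random assignment in parts 2 and 3 can be sketched with simulated data. The setup is entirely hypothetical: an unobserved trait ("ambition") raises both the chance of enrolling and the outcome, and the true treatment effect `tau` is set by assumption.

```python
import random

rng = random.Random(1)
n = 20_000
tau = 1.0   # hypothetical true treatment effect

def mean(xs):
    return sum(xs) / len(xs)

def difference_in_means(treated, outcomes):
    """Average outcome of the treated minus average outcome of the untreated."""
    y1 = [y for d, y in zip(treated, outcomes) if d]
    y0 = [y for d, y in zip(treated, outcomes) if not d]
    return mean(y1) - mean(y0)

# Unobserved trait that also affects the outcome directly.
ambition = [rng.gauss(0, 1) for _ in range(n)]

def simulate_outcomes(treated):
    return [tau * d + a + rng.gauss(0, 1) for d, a in zip(treated, ambition)]

# Self-selection: more ambitious individuals are more likely to enroll.
self_selected = [a + rng.gauss(0, 1) > 0 for a in ambition]
# Randomized trial: treatment is assigned by a coin flip.
randomized = [rng.random() < 0.5 for _ in range(n)]

naive_gap = difference_in_means(self_selected, simulate_outcomes(self_selected))
rct_gap = difference_in_means(randomized, simulate_outcomes(randomized))
print(f"true effect = {tau}, self-selected gap = {naive_gap:.2f}, RCT gap = {rct_gap:.2f}")
```

Under self-selection the simple comparison overstates the effect, while under random assignment it stays close to `tau`.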

4. The U.S. federal government enacted the Higher Education Act in 2001, which made people
convicted of drug offenses ineligible for federal financial aid. Lovenheim and Owens (2014)
study the effect of this policy on college enrollment for those with convictions relative to those
without, using a difference-in-differences (DD) strategy. The following table shows the average
college enrollment rates:

              No Conviction   Conviction
Pre-Policy        0.623          0.358
Post-Policy       0.651          0.269

(a) Using the information above, calculate the DD estimate. (4 points)


Answer:
Difference between before and after for students without convictions: 0.651 − 0.623 =
0.028.
Difference between before and after for students with convictions: 0.269 − 0.358 =
−0.089.
Difference in differences: −0.089 − 0.028 = −0.117.
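The arithmetic in (a) can be checked with a short script. The dictionary keys and the helper name `change` are illustrative; the four rates are the ones in the table.

```python
# Average college enrollment rates from the table.
enroll = {
    ("pre", "no_conviction"): 0.623,
    ("pre", "conviction"): 0.358,
    ("post", "no_conviction"): 0.651,
    ("post", "conviction"): 0.269,
}

def change(group):
    """Post-policy minus pre-policy enrollment rate for one group."""
    return enroll[("post", group)] - enroll[("pre", group)]

# Difference in differences: change for the convicted group
# minus change for the group without convictions.
dd = change("conviction") - change("no_conviction")
print(f"DD estimate: {dd:.3f}")
```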
(b) Write down a regression equation that will allow you to estimate the DD effect of the
policy. (4 points)

Answer:

\[ \text{Enroll}_{it} = \beta_0 + \beta_1 \text{Post}_t + \beta_2 \text{Convict}_i + \beta_3 \text{Post}_t \times \text{Convict}_i + \varepsilon_{it}, \]

where Enroll_it is an indicator equal to 1 if individual i is enrolled in college in year t, and
0 otherwise. Post_t is an indicator that t is after the policy implementation. Convict_i is
an indicator that individual i was convicted of a drug offense. We can also control for a
vector of observable characteristics of the students. β3 is the parameter of interest.
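To see how the regression connects to the table in (a), note that without additional controls the model has four parameters and four group-by-period cells, so OLS reproduces the cell means exactly. The sketch below works backward from the cell means to the implied coefficients; it is an illustration under that no-controls assumption, not an estimate from microdata.

```python
# Cell means from the table, keyed by (post, convict).
cell_means = {(0, 0): 0.623, (0, 1): 0.358, (1, 0): 0.651, (1, 1): 0.269}

# Coefficients implied by the cell means in the model without controls.
beta0 = cell_means[(0, 0)]                          # pre-policy, no conviction
beta1 = cell_means[(1, 0)] - beta0                  # time change for non-offenders
beta2 = cell_means[(0, 1)] - beta0                  # pre-policy gap between groups
beta3 = cell_means[(1, 1)] - beta0 - beta1 - beta2  # DD coefficient

def fitted(post, convict):
    """Predicted enrollment rate for a given period and group."""
    return beta0 + beta1 * post + beta2 * convict + beta3 * post * convict

print(f"beta3 = {beta3:.3f}")
```

The implied value of β3 is −0.117, the same number as the DD estimate computed by hand.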
(c) What is the key identifying assumption of the strategy? (5 points)
Answer: The key identifying assumption required to interpret β3 as causal is that the
only reason for a change in the relative enrollment rates between drug offenders and
non-offenders post-2001 is the financial aid restrictions in the Higher Education Act.
(d) Can you suggest a way to test for the assumption? (6 points)
(Hint: Suppose you have access to some confidential data that provide information
on college enrollment and criminal records over years.)
Answer: One way to test for the assumption is to see whether there were differential
pre-treatment trends in college enrollment between drug offenders and non-offenders. In
particular, if college enrollment for drug offenders was declining/increasing over time,
this could confound the effect of the change in the financial aid rule. For instance, the
implementation of the Higher Education Act might have been prompted by changing trends in
drug offenders’ college enrollment. In Figure 1, Lovenheim and Owens (2014) plot the fraction
of people attending college over time, and they find very similar trends for drug offenders
and non-offenders.

Figure 1: Trends in College Enrollment Rates by Conviction Status and High
School Cohort

[Figure: line plot of the college attendance rate (within 2 years) against predicted HS
graduation year, 1998–2003, with series for No Convictions, Drug Conviction, and Drug Charge.]

Source: Author’s calculations from the 1997 National Longitudinal Survey of Youth as described in the
text.

Problem 3: Inequality (Lectures 5–6) (30 points)

1. Consider an income distribution over a sample of 10 individuals: 5,000, 20,000, 45,000, 80,000,
10,000, 35,000, 150,000, 42,000, 36,000, 28,000. Use the following two figures to visualize
inequality. (4 points)

(a) Shares of income of quintiles


(b) Mean income of quintiles

Answer: Find the graphs in the Excel sheet.
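The numbers behind the two figures can be reproduced outside Excel. The sketch below sorts the ten incomes, splits them into quintiles of two individuals each, and reports each quintile's income share and mean income; the function name `quintile_summary` is an illustrative choice.

```python
incomes = [5_000, 20_000, 45_000, 80_000, 10_000,
           35_000, 150_000, 42_000, 36_000, 28_000]

def quintile_summary(values):
    """Return (income share, mean income) for each quintile, poorest first."""
    ordered = sorted(values)
    size = len(ordered) // 5   # assumes the sample size is a multiple of 5
    total = sum(ordered)
    summary = []
    for q in range(5):
        group = ordered[q * size:(q + 1) * size]
        summary.append((sum(group) / total, sum(group) / len(group)))
    return summary

summary = quintile_summary(incomes)
for q, (share, mean_income) in enumerate(summary, start=1):
    print(f"Quintile {q}: share = {share:.2%}, mean income = {mean_income:,.0f}")
```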

2. Based on the income distribution in part 1, calculate the following inequality indices. (4
points)

(a) Decile dispersion ratio (using 10%)


(b) The fraction of income accruing to the top 10% earners

Answer: The decile dispersion ratio is $150,000/$5,000 = 30. The total income is $451,000,
so the fraction of income accruing to the top 10% earners is $150,000/$451,000 = 33.26%.
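Both indices can be verified with a few lines of code. With ten individuals each decile contains one person, so the sketch below compares the mean income of the top decile with that of the bottom decile; the variable names are illustrative.

```python
incomes = [5_000, 20_000, 45_000, 80_000, 10_000,
           35_000, 150_000, 42_000, 36_000, 28_000]

ordered = sorted(incomes)
decile_size = len(ordered) // 10   # one individual per decile in this sample

bottom = ordered[:decile_size]     # poorest 10%
top = ordered[-decile_size:]       # richest 10%

# Decile dispersion ratio: mean income of the top 10% over the bottom 10%.
decile_ratio = (sum(top) / len(top)) / (sum(bottom) / len(bottom))

# Share of total income accruing to the top 10%.
top_share = sum(top) / sum(ordered)

print(f"decile dispersion ratio = {decile_ratio:.0f}, top 10% share = {top_share:.2%}")
```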

3. Show the Lorenz curve of the income distribution in part 1 and calculate the Gini coefficient.
(10 points)
Answer: Find the Lorenz curve in the Excel sheet. The area under the Lorenz curve is 0.282.
Therefore,
\[ \text{Gini} = \frac{0.5 - 0.282}{0.5} = 0.436. \]
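The Lorenz curve and the Gini coefficient can also be computed directly. The sketch below assumes the Lorenz curve is drawn as straight segments between the ten cumulative points, so the area under it is a sum of trapezoids; the function names are illustrative. Without rounding the area first, it gives a Gini coefficient of about 0.435, which agrees with the 0.436 above up to the rounding of the area to 0.282.

```python
incomes = [5_000, 20_000, 45_000, 80_000, 10_000,
           35_000, 150_000, 42_000, 36_000, 28_000]

def lorenz_points(values):
    """Cumulative (population share, income share) pairs, starting at (0, 0)."""
    ordered = sorted(values)
    total = sum(ordered)
    points = [(0.0, 0.0)]
    running = 0
    for k, value in enumerate(ordered, start=1):
        running += value
        points.append((k / len(ordered), running / total))
    return points

def lorenz_area(values):
    """Area under the piecewise-linear Lorenz curve (trapezoid rule)."""
    pts = lorenz_points(values)
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))

def gini(values):
    """Gini = (area between the 45-degree line and the Lorenz curve) / 0.5."""
    return (0.5 - lorenz_area(values)) / 0.5

print(f"area = {lorenz_area(incomes):.3f}, Gini = {gini(incomes):.3f}")
```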

4. The Gini coefficient is not easily decomposable. Argue whether the Gini coefficient satisfies
mean independence, population size independence, symmetry, and Pigou-Dalton Transfer
sensitivity. (4 points)
Answer: The above four criteria are satisfied.
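A numerical spot check of the four criteria, using the income distribution from part 1, is sketched below. It illustrates the properties on one example rather than proving them, and the specific choices (doubling incomes, duplicating the population, a $10,000 transfer from the richest to the poorest person) are assumptions made for the illustration.

```python
incomes = [5_000, 20_000, 45_000, 80_000, 10_000,
           35_000, 150_000, 42_000, 36_000, 28_000]

def gini(values):
    """Gini coefficient from the piecewise-linear Lorenz curve."""
    ordered = sorted(values)
    n, total = len(ordered), sum(ordered)
    running, area = 0, 0.0
    for value in ordered:
        # Trapezoid under the Lorenz curve for this individual.
        area += (2 * running + value) / (2 * total * n)
        running += value
    return 1 - 2 * area

base = gini(incomes)

# Mean independence: doubling every income leaves the index unchanged.
scaled = gini([2 * y for y in incomes])
# Population size independence: duplicating the population leaves it unchanged.
replicated = gini(incomes * 2)
# Symmetry: reordering individuals leaves it unchanged.
reordered = gini(list(reversed(incomes)))
# Pigou-Dalton: a transfer from the richest to the poorest person lowers it.
transferred = list(incomes)
transferred[transferred.index(max(incomes))] -= 10_000
transferred[transferred.index(min(incomes))] += 10_000
after_transfer = gini(transferred)

print(base, scaled, replicated, reordered, after_transfer)
```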

5. Consider another income distribution: 8,000, 24,000, 45,000, 85,000, 12,000, 35,000, 120,000,
40,000, 38,000, 30,000. Show the Lorenz curve of this income distribution. What can be said
about Lorenz dominance compared with the income distribution in part 1? (8 points)
Answer: Find the Lorenz curves in the Excel sheet. The new income distribution Lorenz
dominates the income distribution in part 1, because the orange (corresponding to part 5)
Lorenz curve lies nowhere below the blue (corresponding to part 1) Lorenz curve.
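The dominance claim can be checked point by point. Because both distributions have ten individuals, the two Lorenz curves share the same population grid, so it is enough to compare cumulative income shares at each decile. The names `part1`, `part5`, and `lorenz_shares` are illustrative.

```python
part1 = [5_000, 20_000, 45_000, 80_000, 10_000,
         35_000, 150_000, 42_000, 36_000, 28_000]
part5 = [8_000, 24_000, 45_000, 85_000, 12_000,
         35_000, 120_000, 40_000, 38_000, 30_000]

def lorenz_shares(values):
    """Cumulative income shares at each step of the sorted population."""
    ordered = sorted(values)
    total = sum(ordered)
    shares, running = [], 0
    for value in ordered:
        running += value
        shares.append(running / total)
    return shares

old, new = lorenz_shares(part1), lorenz_shares(part5)

# The part 5 distribution Lorenz dominates if its curve is never below
# the part 1 curve and is strictly above it somewhere.
dominates = (all(b >= a for a, b in zip(old, new))
             and any(b > a for a, b in zip(old, new)))
print(f"part 5 Lorenz dominates part 1: {dominates}")
```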

Problem 4: Education (Lectures 9–11) (20 points)

1. The Mincer equation is

\[ \ln(Y_i) = \beta_0 + \beta_1 S_i + \beta_2 \text{Exp}_i + \beta_3 \text{Exp}_i^2 + \varepsilon_i, \]

where Y_i is the earnings of individual i; S_i is the years of education; Exp_i is the years of
working experience.

(a) Assume that the error term ε_i is independent and identically distributed, with E(ε_i) = 0
and Var(ε_i) = σ². Then the OLS estimates of the equation can give us unbiased estimates
of the economic return to education. Suppose you obtain the estimated parameters β̂0–β̂3.
What is the effect of an additional year of schooling on log earnings, holding working
experience constant? For someone with 5 years of working experience, what is the effect
of an additional year of working experience on log earnings, holding education constant?
(4 points)
(Hint: Take the first order derivative with respect to S or Exp.)
Answer:
\[ \frac{\partial \ln(Y)}{\partial S} = \beta_1 \]
\[ \frac{\partial \ln(Y)}{\partial \text{Exp}} = \beta_2 + 2\beta_3 \text{Exp}. \]

Therefore, the effect of an additional year of schooling on log earnings is β̂1. The effect of an
additional year of working experience on log earnings for someone with 5 years of working
experience is β̂2 + 10β̂3.
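The two marginal effects can be written as small functions. The coefficient values below are hypothetical placeholders chosen only to make the example run; they are not estimates from the assignment or from any data.

```python
def schooling_effect(b1):
    """Marginal effect of one more year of schooling on log earnings."""
    return b1

def experience_effect(b2, b3, exp):
    """Marginal effect of one more year of experience: b2 + 2 * b3 * exp."""
    return b2 + 2 * b3 * exp

# Hypothetical coefficient values, for illustration only.
b1_hat, b2_hat, b3_hat = 0.10, 0.04, -0.0005

effect_schooling = schooling_effect(b1_hat)
effect_at_5 = experience_effect(b2_hat, b3_hat, 5)   # equals b2_hat + 10 * b3_hat
print(f"schooling: {effect_schooling:.3f}, experience at 5 years: {effect_at_5:.3f}")
```

With a negative β̂3, as in these placeholder values, the effect of experience shrinks as experience accumulates.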
(b) Why might the OLS estimates of the Mincer equation not be interpreted as the causal
effect of education on earnings? (2 points)
Answer: The OLS estimate of β1 from the Mincer equation may not be interpreted as
the causal effect of education on earnings because of ability bias. For example,
higher-ability individuals may tend to get more schooling. At the same time, they could also
earn more regardless of their schooling.

2. Angrist and Krueger (1991) attempt to overcome ability bias by exploiting the quarter of
birth and compulsory schooling laws.

(a) Who are the compliers in this study? (3 points)


(Hint: Recall that compliers are the subpopulation who are induced to change their
behavior because of an intervention or natural experiment.)
Answer: Compliers in this study are people who, by accident of birth and compulsory
schooling laws, are induced to complete additional years of schooling.
(b) They find that the IV estimates are very similar to the OLS estimates. List two potential
explanations. (6 points)
Answer: First, ability bias may be small. Second, the economic return to schooling for
compliers may differ from the return for the average student.

3. Hoekstra (2009) studies the economic returns to a flagship university using a regression
discontinuity design. He compares the earnings of young adults who were barely admitted to
the flagship to those who were barely rejected. Discuss a potential limitation of Hoekstra’s
(2009) research design or results. (5 points)
Answer: A potential limitation of the research design is that applicants or the university
may be able to manipulate the side of the cutoff on which applicants fall. If so, those just
above and below the cutoff are unlikely to be comparable. A potential limitation of the results
is that students just above and below the cutoff might not be representative of students in
general.
