
ECON 361: Income & Inequality

Lecture 2: Review of Statistics

Prof. Sitian Liu


Queen’s University

September 14, 2020


Outline

I Basic statistics review


I Probability density function
I Cumulative distribution function
I Frequency distribution
I Hypothesis testing
I Confidence intervals
I Joint distribution
An Analogy for Probability and Statistics
(Kalid Azad, http://betterexplained.com)

I Probability: start with an animal (model) and predict what


footprints (data) it will make

I Statistics: study a footprint (data) and try to figure out what


animal (model) produced it

Example: Flipping a coin


Fundamental Concepts

I Population: Includes all members of the group we are


studying or collecting information about.

I Sample: A subset of the population.

Since observing the population is often impossible (or


prohibitively expensive), we study samples to learn about the
population.
Example: Suppose we’re interested in studying the IQ of
Canadians.

We randomly sample from the population of Canada and


administer IQ tests.

How can we summarize the data to make it easier to learn from?

Two types of measures that are useful:

1. Measures of central tendency: trying to come up with a


“typical” data point.
2. Measures of dispersion: trying to capture how spread out
the observations are.
Measures of Central Tendency

I Sample mean (or sample average):

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

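I Median: order the n observations from smallest to largest; the median is then

$$\tilde{x} = \begin{cases} \text{the } \frac{n+1}{2}\text{th value} & \text{if } n \text{ is odd} \\ \text{the average of the } \frac{n}{2}\text{th and } \left(\frac{n}{2}+1\right)\text{th values} & \text{if } n \text{ is even} \end{cases}$$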
Measures of Dispersion

I Range: The difference between the highest and lowest


observations.

I Interquartile range: The difference between the 75th and


25th percentile (excludes observations in the tails).

I Sample variance: The average squared distance of


observations from the sample mean.

$$s^2 = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2$$
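A minimal sketch of these summary measures, using made-up IQ scores (the data are an assumption for illustration only). The interquartile range is left out because percentile conventions differ across software.

```python
import statistics

# Hypothetical IQ scores for a small sample (illustrative values only).
scores = [88, 94, 100, 103, 107, 112, 131]

n = len(scores)
mean = sum(scores) / n                      # sample mean
median = statistics.median(scores)          # middle value (n is odd here)
value_range = max(scores) - min(scores)     # highest minus lowest
# Sample variance with the N - 1 divisor.
variance = sum((x - mean) ** 2 for x in scores) / (n - 1)

print(mean, median, value_range, variance)
```

The single high score (131) pulls the mean above the median, which is one reason to report both.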
Probability Density Function (PDF)
Probability Theory: A function that maps all the possible
values of a random variable to their (relative) likelihood of
occurring. In discrete cases, the probability mass function is
f (x) = Pr(X = x).

Normal Distribution:

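$$f(x, \mu, \sigma) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

where µ is the mean and σ² is the variance.

Standard normal distribution: µ = 0 and σ² = 1

$$f(x, 0, 1) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right)$$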
[Figure: normal density. x-axis: X (-4 to 4); y-axis: Density (0 to 0.4).]
Cumulative Distribution Function (CDF)

Probability that a random variable (X) will take a value less than
or equal to x.

Formally:
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$$
[Figure: cumulative distribution function. x-axis: X (-4 to 4); y-axis: CDF (0 to 1).]
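As a rough illustration, the normal PDF and CDF can be evaluated with the standard library alone. The function names `normal_pdf` and `normal_cdf` are chosen here for the sketch; the CDF uses the standard identity linking the normal CDF to the error function.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for X ~ N(mu, sigma^2), via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

print(normal_pdf(0.0))                        # peak of the standard normal density
print(normal_cdf(0.0))                        # half the mass lies below the mean
print(normal_cdf(1.96) - normal_cdf(-1.96))   # mass within 1.96 standard deviations
```

The last line gives roughly 0.95, which is where the 1.96 used for 95% confidence intervals comes from.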
Some notes on distributions:

I The distribution is symmetric if the left half is a mirror


image of the right half

I A positive skewness occurs when the right half of the


distribution is stretched out (compared with the left)

I A negative skewness occurs when the left half of the


distribution is stretched out (compared with the right)

I The kurtosis captures how heavy the tails of the distribution are (the amount of “stretching” into the tails).
Frequency Distribution

Think histogram!

Statistics: The frequency or count of the occurrences of values for a variable (X) in a sample.

The frequency distribution groups all values in data to bins and


tells us the number of values observed in each bin.

Next slides are frequency distributions for 100 and 1000


observations that are normally distributed with mean 0 and
variance 1.
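[Figure: frequency distribution of 100 observations. x-axis: X (-4 to 4); y-axis: Frequency (0 to 10).]
[Figure: frequency distribution of 1000 observations. x-axis: X (-4 to 4); y-axis: Frequency (0 to 80).]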
Income Distribution in the U.S.
[Figure: density of wage income. x-axis: Wage Income (< 99th Percentile), 0 to 400,000; y-axis: Density (0 to 1.5e-05).]
Note: Sample includes full-time workers aged between 25 and 69 from the 2013-2017 ACS.
Log-normal Distribution

We cannot use a normal distribution to approximate an income distribution: incomes are positively skewed, with a long right tail.

Log-normal distribution: If the random variable X is


log-normally distributed, then Y = ln(X) has a normal
distribution.
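The relationship can be sketched with simulated data. The parameters below (mean 10.5 and standard deviation 0.7 on the log scale) are assumptions picked for illustration, not estimates from the ACS sample.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Draw hypothetical log incomes from a normal distribution, then exponentiate.
log_incomes = [random.gauss(10.5, 0.7) for _ in range(10_000)]
incomes = [math.exp(v) for v in log_incomes]

# Positive skew: the long right tail pulls the mean above the median.
print(statistics.mean(incomes) > statistics.median(incomes))

# On the log scale the mean and median nearly coincide.
gap = abs(statistics.mean(log_incomes) - statistics.median(log_incomes))
print(gap < 0.05)
```

Taking logs turns the skewed income variable into one that is roughly symmetric, which is why the normal approximation is applied to log income.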
(Log) Income Distribution in the U.S.
[Figure: density of log wage income. x-axis: Log Wage Income (0 to 20); y-axis: Density (0 to 0.6).]
Note: Sample includes full-time workers aged between 25 and 69 from the 2013-2017 ACS.
(Log) Income Distribution in the U.S.
[Figure: c.d.f. of log wage income compared with the normal c.d.f. x-axis: Log Wage Income (0 to 15); y-axis: CDF (0 to 1).]


Hypothesis Testing

Hypothesis testing: The Ontario Average Wage

I Suppose you are told that average earnings in Ontario are


$20.00/hour.
I You collect data from a sample of individuals and calculate
the sample mean to be $22.46 (i.e., Ȳ ).
I Given our data, we might question whether the population
mean (µY ) is $20.00.
How do we formulate this test?
Hypothesis Testing

Null hypothesis: “The population mean wage per hour is $20.”

H0 : µY = 20

Alternative hypothesis: “The population mean wage per hour


is not $20.”
H1 : µY ≠ 20
This is known as a two-sided test. A one-sided test can be
H0 : µY ≥ 20 and H1 : µY < 20.

We either reject H0 or fail to reject H0 , but in general we do not


say that we accept H0 .
Sample Statistics
I Sample mean:
$$\bar{X} = (X_1 + \dots + X_n)/n = (1/n)\sum_i X_i$$

I Sample variance:
$$s^2 = \frac{1}{n-1}\sum_i (X_i - \bar{X})^2.$$

I Sample Mean Theorem: In random sampling, sample size n, from any population with E(X) = µ and Var(X) = σ², the sample mean X̄ has

$$E(\bar{X}) = \mu$$
$$Var(\bar{X}) = \sigma^2/n.$$
Asymptotic Distribution Theory
I Standardized sample mean (z-statistic):

$$Z = \frac{\bar{X} - E(\bar{X})}{[Var(\bar{X})]^{1/2}} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{\sqrt{n}(\bar{X} - \mu)}{\sigma}$$

I Central Limit Theorem: The limiting distribution of Z is N(0, 1).
I T-statistic:
$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}} \sim t_{n-1}$$
where n − 1 denotes the degrees of freedom.
I If the t-statistic is small in absolute value, the sample mean is close to the proposed population mean, so the data are consistent with the null hypothesis.
Student’s t-test
Student’s t distribution: similar to the normal distribution but
has heavier tails (i.e., greater chances for extreme values). Tail
heaviness is determined by the degrees of freedom.
Student’s t-test
Significance level (α): the probability of rejecting the null
hypothesis when it is true. Graphically, α determines how far out
from 0 we draw the line on the graph. α = 0.05 is most
commonly used.
Student’s t-test

Procedure for a two-sided test:

I Compute the t-statistic.


I Use the “t table” to find the critical value c(α/2)
corresponding to an appropriate significance level α.
I The “t table” contains critical values of the t distribution
computed using the CDF for different degrees of freedom and
significance levels.
I Excel function TINV.

I Reject the null if |t-statistic| > c(α/2).
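The procedure can be sketched for the Ontario wage example. The hypothesized mean (20.00) and sample mean (22.46) come from the example; the sample standard deviation and sample size are not given in the slides and are assumed here purely for illustration.

```python
import math

mu_0 = 20.00     # hypothesized population mean (from the example)
y_bar = 22.46    # sample mean (from the example)
s = 5.00         # sample standard deviation: ASSUMED for illustration
n = 25           # sample size: ASSUMED for illustration

t_stat = (y_bar - mu_0) / (s / math.sqrt(n))

# Two-sided critical value for alpha = 0.05 with n - 1 = 24 degrees of
# freedom, read from a standard t table.
critical = 2.064

reject = abs(t_stat) > critical
print(round(t_stat, 2), reject)
```

Under these assumed values the t-statistic exceeds the critical value, so the null of a $20.00 mean would be rejected at the 5% level; a larger s or a smaller n could reverse that conclusion.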


Student’s t-test

P-values: the probability of obtaining a result at least as extreme as the result actually observed during the test, assuming that the null hypothesis is correct.

$$p\text{-value} = \Pr_{H_0}(T > |t\text{-statistic}|)$$

Intuitively, a p-value is the probability that something “weirder” happened than what actually occurred.

An alternative way to conduct the two-sided hypothesis test: reject the null if this one-tail p-value < α/2 (equivalently, if the two-sided p-value, which is twice as large, is < α).
Confidence Intervals

I Due to randomness we can’t recover the true population


parameter (e.g., population mean µ) using only sample
information.
I However, we can use sample statistics to come up with a
set of values that contains µ with some specified probability.
I Given X̄ and s², we can calculate an interval that contains µ with 95% probability:

$$\Pr\left(-1.96 \le \frac{\bar{X} - \mu}{s/\sqrt{n}} < 1.96\right) = 0.95,$$
$$\Pr\left(\bar{X} - 1.96\frac{s}{\sqrt{n}} \le \mu < \bar{X} + 1.96\frac{s}{\sqrt{n}}\right) = 0.95.$$
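A small sketch of the interval calculation. The helper name `confidence_interval_95` is made up for this example, the sample mean is taken from the wage example, and s and n are assumed values.

```python
import math

def confidence_interval_95(x_bar, s, n):
    """Approximate 95% interval for the mean using the critical value 1.96."""
    half_width = 1.96 * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Sample mean from the wage example; s and n are ASSUMED for illustration.
low, high = confidence_interval_95(x_bar=22.46, s=5.0, n=100)
print(round(low, 2), round(high, 2))
```

With these assumed inputs the interval runs from about 21.48 to 23.44 and excludes 20, which matches rejecting the null hypothesis of a $20.00 mean in a two-sided test at the 5% level.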
Joint Distribution
We’ll be talking a lot about the determinants of earnings in this
class. But that requires that we understand the conditional
distribution. Consider the following:
[Figure: mean log wage income by education group. x-axis: Education Group (0 to 10); y-axis: Log Wage Income (Mean), 10 to 11.5.]
Note: Sample includes full-time workers aged between 25 and 69 from the 2013-2017 ACS.
Education group: 0: no schooling; 1: grade 4 or less; 2: grades 5-8; 3: grade 9; 4: grade 10;
5: grade 11; 6: grade 12; 7: 1 year college; 8: 2 years college; 9: 3 years college
10: 4 years college; 11: 5 years college or more
[Figure: density of log wage income in two panels, No and Yes. x-axis: Log Wage Income (0 to 15); y-axis: Density (0 to 0.6).]
Graphs by whether or not the worker has at least 1 year of college
Joint Distributions (Discrete Case)
I The joint probability mass function for (X, Y) gives:
$$f(x, y) = \Pr(X = x, Y = y).$$
I The conditional distribution defines the probability that y occurs given that x occurs:
$$\Pr(Y = y \mid X = x) = \frac{\Pr(X = x, Y = y)}{\Pr(X = x)}$$
I X and Y are independent iff
$$f(x, y) = f(x) f(y) \text{ for all } (x, y).$$
I Covariance:
$$Cov(X, Y) = E[(X - E(X))(Y - E(Y))] = E(XY) - E(X)E(Y)$$
X and Y are uncorrelated if Cov(X, Y) = 0.
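These definitions can be checked on a toy joint distribution. The probabilities below are invented for illustration, with X indicating some college and Y indicating high earnings.

```python
# Hypothetical joint pmf for X (1 = some college, 0 = none) and
# Y (1 = high earner, 0 = not). The probabilities are made up and sum to 1.
joint = {(0, 0): 0.40, (0, 1): 0.10, (1, 0): 0.20, (1, 1): 0.30}

def marginal_x(x):
    return sum(p for (xi, _), p in joint.items() if xi == x)

def marginal_y(y):
    return sum(p for (_, yi), p in joint.items() if yi == y)

def conditional_y_given_x(y, x):
    """Pr(Y = y | X = x) = Pr(X = x, Y = y) / Pr(X = x)."""
    return joint[(x, y)] / marginal_x(x)

e_x = sum(x * p for (x, _), p in joint.items())
e_y = sum(y * p for (_, y), p in joint.items())
e_xy = sum(x * y * p for (x, y), p in joint.items())
cov_xy = e_xy - e_x * e_y   # Cov(X, Y) = E(XY) - E(X)E(Y)

# Independence requires f(x, y) = f(x) f(y) in every cell.
independent = all(
    abs(p - marginal_x(x) * marginal_y(y)) < 1e-12
    for (x, y), p in joint.items()
)

print(conditional_y_given_x(1, 1), conditional_y_given_x(1, 0), cov_xy, independent)
```

In this toy table the probability of high earnings is higher conditional on college than without it, the covariance is positive, and the independence check fails.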
Coming Soon!

I Introduction to econometrics
I Regression analysis
I Causal effect
ECON 361: Income & Inequality
Lecture 3: Introduction to Econometrics I

Prof. Sitian Liu


Queen’s University

September 17, 2020


Outline
Last class:

I Basic statistics review

Today & Next Lecture:


I Regression analysis: Ordinary least squares
I Omitted variable
I Measurement error
I Reverse causality

I Causal effect
I Randomized control trial (RCT)
I Difference-in-differences (DD)
I Instrumental variables (IV)
I Regression discontinuity (RD)
Education and Income

People with higher levels of education have higher income:

I Does higher education lead to higher income?

I Gain knowledge and skills → increase productivity in the labor market.

I Usually find a positive correlation in the data.

Does this imply causality? Not necessarily!


Education and Income

What is the causal effect of an additional year of school on


income?
I Imagine a hypothetical situation in which we can “treat” an
individual (say Larry) with more schooling or not.
I YT (Larry) = Larry’s wage if he finishes one more year of
high school.
I YC (Larry) = Larry’s wage if he doesn’t finish one more
year of high school.
I Treatment effect = YT (Larry) − YC (Larry).
I We do not observe the counterfactual: What would have
happened to an individual in the absence of the treatment?
I Goal: Find a valid counterfactual!
Education and Income

Policy implications:

I If increasing someone’s level of education leads to higher income in the future, then policy should improve access to schooling and learning for poor children.

I If there is no causal effect, then policy implications are very


different!
Regression Analysis
We want to know how another year of education affects
earnings. Start with the following plot, where each point
represents a combination of education and earnings for a
different individual. Why would we not expect a perfect
correlation?
Independent and Dependent Variables

I Independent variable (explanatory variable): years of


education (Ed).
I If and how much an outcome of interest varies as the
independent variable varies.

I Dependent variable (outcome variable): earnings (Y ).


I Seek to explain its variation.
Bivariate Linear Regression
I Assume that the link between education and earnings is
linear: Each additional year of education predicts a
constant increase in earnings. We could describe the data
as follows (true model):

$$Y_i = \alpha + \beta Ed_i + \epsilon_i$$

where
I i: individual observation.
I ε: error term (i.e., other determinants of earnings), i.i.d. with E(ε_i) = 0 and Var(ε_i) = σ².
I α: intercept (i.e., the amount of income a person with zero years of education could expect to earn).
I β: slope.
Ordinary Least Squares

I We call α and β the true parameters (or population parameters); we never observe them.

I E(Y|X) is the conditional expectation function. It shows the expected value of Y for each value of X.

I Compare the expected earnings with Ed = 1 and Ed = 0:

$$E(Y_i \mid Ed = 1) - E(Y_i \mid Ed = 0) = \alpha + \beta - \alpha = \beta.$$

I Therefore β shows how average earnings change when education changes by one year.
Ordinary Least Squares
I Regression analysis is a statistical technique to pick parameters that best fit the data: the estimated parameters α̂ and β̂.
I Ordinary Least Squares: Minimize the Residual Sum of Squares (RSS):
$$\min_{\hat{\alpha}, \hat{\beta}} \sum_i (Y_i - \hat{Y}_i)^2 = \sum_i (Y_i - \hat{\alpha} - \hat{\beta} Ed_i)^2$$

I The OLS estimate of β is
$$\hat{\beta} = \frac{\sum_i (Y_i - \bar{Y})(X_i - \bar{X})}{\sum_i (X_i - \bar{X})^2} = \frac{Cov(X, Y)}{Var(X)},$$

where X denotes the independent variable and Y denotes the dependent variable.
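A minimal sketch of the slope formula on made-up data. The function name `ols_fit` and the data are assumptions; the intercept uses the standard result that the fitted line passes through the point of means, which the slides do not state explicitly.

```python
def ols_fit(x, y):
    """Return (alpha_hat, beta_hat) for the regression y = alpha + beta * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta_hat = s_xy / s_xx                 # sample Cov(X, Y) / Var(X)
    alpha_hat = y_bar - beta_hat * x_bar   # line passes through the means
    return alpha_hat, beta_hat

# Hypothetical data: years of education and hourly earnings.
education = [10, 12, 12, 14, 16, 18]
earnings = [14.0, 17.5, 16.0, 21.0, 24.5, 27.0]

alpha_hat, beta_hat = ols_fit(education, earnings)
print(round(alpha_hat, 3), round(beta_hat, 3))
```

Here `beta_hat` is the predicted change in hourly earnings for one more year of education in this toy sample; it describes an association and says nothing yet about causality.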
Causality

I Under the assumptions of the true model (i.i.d. ε_i with E(ε_i) = 0 and Var(ε_i) = σ²), the OLS estimator is unbiased:
$$E(\hat{\beta}) = \beta$$

I However, in general, if we compute an estimate β̂ based on a sample, we cannot interpret this measure of association between education and earnings as the causal impact of education on earnings. Three common reasons:
I Omitted variable
I Measurement error
I Reverse causality
Omitted Variable

I Individuals who choose to get more education are likely to


differ from those who choose lower levels in ways that are
also related to earnings.
I The counterfactual earnings for the high-education group is
unlikely to be the earnings that we observe for the
low-education group.
I E.g., Individuals with higher ability tend to choose higher
levels of education.
I From an econometric standpoint, ability here is an omitted
variable, which leads to an omitted variable bias.
I A variable is not included in the regression but it is
correlated both with the treatment and the outcome.
Omitted Variable
I Formally, suppose the true model is:

$$Y_i = \alpha + \beta Ed_i + \gamma Ability_i + \epsilon_i,$$

where ε_i is i.i.d. with E(ε_i) = 0 and Var(ε_i) = σ².
I However, we don’t have data on Ability, so we simply estimate the following bivariate regression:

$$Y_i = \beta_0 + \beta_1 Ed_i + \mu_i.$$

I Then the OLS estimate of β_1 satisfies

$$E(\hat{\beta}_1) = \frac{Cov(Y, Ed)}{Var(Ed)} = \beta + \gamma \frac{Cov(Ed, Ability)}{Var(Ed)}$$
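The bias formula can be illustrated on a toy data set. In this sketch the coefficient names `beta` (education) and `gamma` (ability) are labels assumed here, the data are invented, and the error term is set to zero so that the decomposition holds exactly rather than only on average.

```python
def cov(a, b):
    """Population-style covariance (1/n divisor); cov(a, a) is the variance."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / n

# Hypothetical data in which ability and education are positively correlated.
education = [10, 12, 12, 14, 16, 18]
ability = [1.0, 2.0, 1.5, 3.0, 3.5, 5.0]

# Assumed "true" coefficients; no noise is added in this toy sample.
alpha, beta, gamma = 2.0, 1.0, 3.0
earnings = [alpha + beta * e + gamma * a for e, a in zip(education, ability)]

# Slope from the short regression that omits ability.
short_slope = cov(earnings, education) / cov(education, education)
# Bias term: gamma * Cov(Ed, Ability) / Var(Ed).
bias = gamma * cov(education, ability) / cov(education, education)

print(short_slope, beta + bias)
```

Because ability raises earnings and is positively correlated with education in this example, the short regression overstates the return to education.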
Omitted Variable
Measurement Error

I The observed variables are measured with error (for either


dependent or independent variables, or both).
I Not measured perfectly (e.g., income).
I Does not measure what we want (e.g., using IQ or test scores to measure ability).

I The true model is

$$Y_i = \alpha + \beta X_i^* + \epsilon_i.$$

However, we observe the independent variable with an error:
$$X_i = X_i^* + \mu_i.$$
Reverse Causality

I Belief: X → Y.
Reality: Y → X or X ↔ Y.
I E.g., women’s educational attainment →(?) the timing of entry into motherhood.
I Simultaneous equations:

$$Y = \beta_0 + \beta_1 X + \epsilon$$
$$X = \alpha_0 + \alpha_1 Y + \alpha_2 Z + \mu.$$

Everything depends on everything!


Coming Soon!

I Randomized control trial (RCT)

I Difference-in-differences (DD)

I Instrumental variables (IV)

I Regression discontinuity (RD)


ECON 361: Income & Inequality
Lecture 4: Introduction to Econometrics II

Sitian Liu
Queen’s University

September 21, 2020


Outline
Last class:
I Regression analysis: Ordinary Least Squares
I Omitted variable
I Measurement error
I Reverse causality

Today:

I Randomized control trial (RCT)

I Difference-in-differences (DD)

I Instrumental variables (IV)

I Regression discontinuity (RD)


Causal Effect

I Randomized control trial (RCT)

I Difference-in-differences (DD)

I Instrumental variables (IV)

I Regression discontinuity (RD)


Randomized Control Trial (RCT)

I Randomized control trials (RCTs) or experiments are


often viewed as the gold standard for measuring causal
effects.
I People are randomly assigned to treatment and control
groups (e.g., receiving a new medication or participating in
additional schooling).
I On average, this makes the two groups identical but for
receiving treatment.
I Therefore, any differences between the outcomes of the
treatment and control groups at the end of the experiment
should be due to the treatment: The control group is an
accurate counterfactual for the treatment group.
Randomized Control Trial (RCT)
I Long history in the life sciences of testing the effectiveness of
new treatments, but less common in economics.
I 1962 Perry preschool project:
I Families from Ypsilanti, Michigan were randomly assigned
to the treatment and control groups.
I 2 years of intensive preschool for 3–4 year old children and
home visits.
I Adults from the treatment group were more likely to
graduate high school, have higher earnings, and go on to college,
and less likely to commit crime (Heckman et al., 2010).
I RCTs address the problem of selection bias:
I Individuals choose the activity that will maximize their utility.
I Individuals can choose whether they are in the treatment or
not.
Randomized Control Trial (RCT)

It is difficult in economics to conduct experiments that are double blind (i.e., neither the researchers nor the participants know the treatment status). Threats to internal validity:
1. Differential attrition: A treatment is typically desired by
the participants, and we cannot compel people to comply
with their assigned treatment status.
2. Sampling error: Because of small sample sizes, two
groups may differ for random reasons.
3. Hawthorne effect: Participants may behave in a way to
make an effect occur.

Lack of external validity: Whether we can generalize the results to other settings.
Natural Experiments

Experiments:
I Internal/ external validity.
I Ethics.
I Expensive.

Natural experiments (quasi-experiments):


I Find variation in the treatment exposure determined by
nature or changes in policy.
I Outside of researchers’ control, but approximate random
assignment (uncorrelated with differences across
individuals or institutions).
Difference-in-Differences Approach

I When governments change policies (e.g., related to funding


for education), they create opportunities to evaluate the
effects of these policies.
I E.g., In 1993, the state of Georgia implemented a merit aid
program, called HOPE, providing essentially free tuition at
Georgia public postsecondary schools to high school
graduates with a high school GPA above 3.5.
I Policy question: Did HOPE increase college enrollment
among Georgia high school graduates?

I The easiest way is to look at how outcomes of interest


change after the policy is implemented relative to before.
I Concern: Would the outcome afterwards have been the same as before in the
absence of the policy change? (E.g., economic shock.)
Difference-in-Differences Approach

I Find some unaffected areas (control) similar to the affected


area (treatment).
I Use changes in the unaffected areas to control for the
confounding factors.
I Use the enrollment change in states that border Georgia as the
counterfactual change in enrollment for Georgia.
I Assumption: The policy change of interest is the only
reason why outcomes among the treated group change
relative to outcomes among the control group.
DD in graphical form:

The difference-in-differences (DD) estimate of the effect of HOPE on college enrollment is calculated by comparing enrollment changes in Georgia with those in bordering states when the policy was enacted: (D − C) − (B − A).
DD in tabular form:

$$DD = \{E[Enr_{it} \mid i = GA, t = Post] - E[Enr_{it} \mid i = GA, t = Pre]\} - \{E[Enr_{it} \mid i = Bor, t = Post] - E[Enr_{it} \mid i = Bor, t = Pre]\}$$
DD in regression form:

$$Enr_{it} = \alpha + \beta GA_i + \xi Post_t + \delta (GA_i \times Post_t) + \epsilon_{it}$$

I i indexes state and t indexes year.
I GA is an indicator (or dummy variable) that is equal to 1 if the state is Georgia and is equal to 0 otherwise.
I Post is an indicator that is equal to 1 after the implementation of HOPE and equal to 0 before.
I δ is identical to the estimate in the previous table.
DD in regression form:

$$Enr_{it} = \alpha + \beta GA_i + \xi Post_t + \delta (GA_i \times Post_t) + \epsilon_{it}$$

I How to show that δ represents the DD estimate?
1. Show points A, B, C, D in the graph using the parameters in the equation. E.g., A is the average enrollment rate of the control group in the pre-HOPE period.

$$E(Enr \mid GA = 0, Post = 0) = \alpha + \beta \cdot 0 + \xi \cdot 0 + \delta \cdot 0 \cdot 0 = \alpha.$$

2. The DD estimate is (D − C) − (B − A).

I It is easy to control for time-varying observable characteristics (e.g., student income and racial composition) of each state in the regression. (Then δ will be different from the previous estimate.)
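A worked sketch of the four-means calculation, with invented enrollment rates. The mapping of the labels is inferred from the text (A and B for the bordering states before and after, C and D for Georgia before and after), and the coefficient names `beta` and `delta` are labels assumed for this example.

```python
# Hypothetical average enrollment rates (percent).
# A, B: bordering states pre/post; C, D: Georgia pre/post (inferred labels).
A, B = 40.0, 42.0
C, D = 38.0, 45.0

dd = (D - C) - (B - A)

# Regression parameters implied by the four cell means.
alpha = A                       # control group, pre-period
xi = B - A                      # common change over time
beta = C - A                    # pre-period gap between Georgia and neighbors
delta = D - alpha - beta - xi   # what remains for the interaction term

print(dd, delta)
```

Both numbers agree: the interaction coefficient is the change in Georgia minus the change in the bordering states.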
Instrumental Variables (IV)
I Sometimes, life events are shaped by random events:
I Whether or not young men were drafted to serve in the
Vietnam war was directly related to lottery numbers.
I The enrollment of a student in a charter school that is
oversubscribed depends on a random lottery.

I These lotteries produce instrumental variables (or instruments) that can be used to disentangle causality:
I Causal effects of military service on long-run earnings.
I Causal effects of charter school enrollment on academic
achievement.

I Winning or losing a lottery is random, so comparing outcomes of winners versus losers essentially mimics an RCT.
Instrumental Variables (IV)

I An instrument is a variable that isolates variation in the treatment that is uncorrelated with underlying characteristics of those who are treated or untreated.

I An instrument works under two assumptions:


1. The instrument must be correlated with treatment (easy to
test with data).
2. The only reason the instrument and the outcome are
correlated is because the instrument affects the treatment
(major source of controversy).

I Many government policy changes can be used as


instruments as long as the policy change is random with
respect to unobserved changes in a school or state.
Regression Discontinuity (RD)

I Policies generate sharp breaks in eligibility for a treatment:


I Policies that designate eligibility for financial aid at a
particular score cutoff or income threshold;
I University admission policies that only admit students with
an SAT score above a certain mark.
I Assumption: Students just above and below the cutoff in
these cases are nearly identical.
I Students are unable to manipulate their test performance to
place themselves over the cutoff.
I There is some randomness in test outcomes.
I Estimate the effect by comparing student outcomes just above and below
the threshold.
Regression Discontinuity (RD)

I Fuzzy Regression Discontinuity: The likelihood of


treatment changes by less than 1 at the threshold.
I There are other eligibility criteria (in addition to test scores)
that determine treatment (scholarship receipt).

I The discontinuity rule acts as an instrument for scholarship


receipt.
Regression Discontinuity (RD)

I Advantages:
I Many programs have eligibility rules.
I Relatively weak assumptions.

I Disadvantages:
I Individuals can respond to eligibility rules.
I The causal effect is identified only for individuals near the cutoff.
Coming Soon!

I Measures of inequality.
ECON 361: Income & Inequality
Lecture 5: Measures of Inequality

Prof. Sitian Liu


Queen’s University

September 24, 2020


Outline

Last class:

I Econometrics

Today:

I Measuring inequality visually


I Measuring inequality using indices
Visualize Inequality

1. Quantile graph

2. Pen’s Parade (“Parade of Dwarfs”)

3. Frequency distribution

4. Lorenz curve
Quantile Graph

How-to Guide: Quantile Graph

1. Sort individuals by income (i.e., poorest to richest).


2. Divide the population into quintiles.
3. Calculate the proportion of total income that goes to each
quintile or the average income of each quintile.
4. Graph the results.
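The first three steps can be sketched as follows. The function name `quintile_shares` and the household incomes are assumptions for illustration, and the sketch assumes the number of households is divisible by five.

```python
def quintile_shares(incomes):
    """Share of total income received by each fifth of the population.

    Assumes len(incomes) is divisible by 5 to keep the sketch short.
    """
    ordered = sorted(incomes)            # step 1: poorest to richest
    size = len(ordered) // 5             # step 2: five equal-sized groups
    total = sum(ordered)
    # Step 3: income of each group as a proportion of total income.
    return [sum(ordered[k * size:(k + 1) * size]) / total for k in range(5)]

# Hypothetical incomes (in thousands) for ten households.
incomes = [12, 18, 25, 31, 40, 52, 60, 85, 110, 240]
shares = quintile_shares(incomes)
print([round(s, 3) for s in shares])
```

Plotting `shares` as a bar chart gives the quantile graph; with survey data one would also apply sampling weights, which this sketch ignores.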
[Figure: Shares of Household Income of Quintiles (U.S. 2018). x-axis: quintile (1 to 5); y-axis: Share of Household Income (2018), 0 to 50.]
[Figure: Mean Household Income of Quintiles (U.S. 2018). Quintiles 1 to 5; Mean Household Income (2018), 0 to 250,000.]
Pen’s Parade

I Pen’s Parade is a graphical representation of income


distribution created by Dutch economist Jan Pen (1971).
I Imagine a parade where every person in the economy is
lined up, with their height proportional to their income, and
ordered from lowest to greatest.
I The original context of the parade is the UK, and the
duration is one hour.
I Consider a spectator with the average height. What will he
see during the parade?
Pen’s Parade

1. At the beginning of the parade, the marchers cannot be


seen at all (businesses with losses).
2. Marchers with positive income begin to pass by (tiny).
3. 10 mins, full-time labor force has arrived (last several mins).
4. 45 mins, marchers are as tall as the spectator.
5. Last 6 mins, top 10% begin to arrive (e.g., doctors, lawyers,
successful corporate executives, bankers).
6. Last few seconds, a glimpse of pop stars and the most
successful entrepreneurs (knees).
7. At the very end, John Paul Getty (the sole of his shoe).
Figure from the Atlantic (2006).
Figure 6.2 Pen’s Parade (Quantile Function) for Expenditure per Capita, Vietnam,
1993 and 1998

Source: Created by the authors, based on data from the Vietnam Living Standards Surveys of 1992–93
and 1998.
Note: This function is truncated at the 95th percentile.

Figure from the Handbook on Poverty and Inequality, Chapter 6.


Household Income Distribution

[Figure: percentage of households in each income bracket (< $15K, 15K-25K, 25K-35K, 35K-50K, 50K-75K, 75K-100K, 100K-150K, 150K-200K, >= 200K), 2008 vs. 2018. x-axis: Percentage (0 to 20).]
2008: median income is $58,811 and mean income is $79,997.
2018: median income is $63,179 and mean income is $90,021.
Lorenz Curve

I The Lorenz curve was developed by American economist Max O. Lorenz in 1905.
I It is a graphical representation of the distribution of income
or wealth.
I The curve shows the proportion of overall income or wealth
held by the bottom x% of the total population.
I x-axis: the cumulative proportion of the population ranked
by income level.
I y -axis: the cumulative proportion of the income for a given
proportion of the population.
Figure 6.1 Lorenz Curve

Source: Authors’ illustration.

Figure from the Handbook on Poverty and Inequality, Chapter 6.
Lorenz Curve

I Line of perfect equality: 45 degree line.


I Everyone makes exactly the same amount of income.
I The bottom x% of the population hold x% of total income.

I Line of perfect inequality:


I Total income is held by one person.
Lorenz Curve
How-to Guide: Lorenz Curve

1. Sort individuals by income (i.e., poorest to richest): $y_1 \le y_2 \le \dots \le y_N$.
2. Define the proportion of income owned by each individual ($y_i / \sum_{i=1}^{N} y_i$) and define his/her proportion of the total population (1/N).
3. Calculate the cumulative proportion of population and the cumulative proportion of income: $k/N$ and $(\sum_{i=1}^{k} y_i)/(\sum_{i=1}^{N} y_i)$, for k = 1, ..., N.
4. Plot the cumulative proportion of income against the cumulative proportion of population.
5. Include the line of perfect equality.
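The steps above can be sketched in a few lines. The function name `lorenz_points` and the incomes are assumptions; the origin (0, 0) is added so the plotted curve starts where the line of perfect equality does.

```python
def lorenz_points(incomes):
    """Return [(k/N, cumulative income share)] for k = 0, ..., N."""
    ordered = sorted(incomes)      # poorest to richest
    total = sum(ordered)
    n = len(ordered)
    points = [(0.0, 0.0)]          # the curve starts at the origin
    running = 0.0
    for k, y in enumerate(ordered, start=1):
        running += y
        points.append((k / n, running / total))
    return points

incomes = [10, 20, 30, 40, 100]    # hypothetical incomes
for pop_share, income_share in lorenz_points(incomes):
    print(pop_share, round(income_share, 2))
```

Every point lies on or below the 45 degree line: in this example the bottom 80% of people hold 50% of income.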
Lorenz Curve

I As the Lorenz curve becomes more convex, the level of
inequality increases.

I We say an income distribution X Lorenz dominates


another distribution Y if X ’s Lorenz curve lies nowhere
below that of Y .
I This implies that X is more equally distributed than Y .
I If two Lorenz curves intersect, then we cannot say
anything definitive about inequality.
I Lorenz dominance is a visual inspection, so it cannot
answer questions such as “How much less unequal is
country X than Y?”
Measure Inequality using Indices

Criteria for a “good” measure of income inequality:

1. Mean independence. If all incomes were doubled, the


measure would not change.

2. Population size independence. If the population were to


change, the measure of inequality should not change, all
else equal.

3. Symmetry. If any two people swap incomes, the measure


should not change.
Measure Inequality using Indices

Criteria for a “good” measure of income inequality (cont.):

4. Pigou-Dalton Transfer sensitivity. Under this criterion, the


transfer of income from rich to poor (without altering their
rank) reduces measured inequality.

5. Decomposability. Inequality may be broken down by


population groups or income sources or in other
dimensions.

6. Statistical testability. One should be able to test for the


significance of changes in the index over time (not
discussed here).
Measures of Inequality

Commonly Used Measures of Inequality:

I Decile Dispersion Ratio

I Share of Income

I Coefficient of Variation

I Gini Coefficient

I Generalized Entropy Measure

I Theil Index
Decile Dispersion Ratio

The ratio of the average consumption (or income) of the richest


10% of the population to the average consumption (or income)
of the poorest 10%. (Other percentiles: top and bottom 5%.)

I Pros: easy to interpret.


I Cons: It ignores information about incomes in the middle of
the income distribution, and does not even use information
about the distribution of income within the top and bottom
deciles.
Share of Income

The fraction of income accruing to top earners—top 10%, 1%,


0.1%.

I Concentration of income (income distribution is generally


positively skewed).
I E.g., In the 2000s, the top 10% of US earners accounted for
45–50% of total pretax income.
Coefficient of Variation

A standardized measure of dispersion of a probability


distribution or frequency distribution:
$$CV = \frac{\sqrt{\frac{1}{N}\sum_{i=1}^{N}(y_i - \bar{y})^2}}{\bar{y}}.$$

I Pros: mathematically tractable (complete equality


corresponds to the minimum value of zero).
I Cons: It is not bounded from above, so it cannot be
normalized to be within a fixed range.
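A short sketch of the coefficient of variation as the standard deviation divided by the mean, using the 1/N divisor. The function name and the income lists are assumptions for illustration.

```python
import math

def coefficient_of_variation(incomes):
    n = len(incomes)
    mean = sum(incomes) / n
    sd = math.sqrt(sum((y - mean) ** 2 for y in incomes) / n)  # 1/N divisor
    return sd / mean

cv_equal = coefficient_of_variation([50, 50, 50, 50])
cv = coefficient_of_variation([10, 20, 30, 40, 100])
cv_doubled = coefficient_of_variation([20, 40, 60, 80, 200])

print(cv_equal)                      # zero under complete equality
print(abs(cv - cv_doubled) < 1e-12)  # unchanged when all incomes double
```

The second check illustrates the mean independence criterion: dividing by the mean removes the effect of rescaling every income.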
Gini Coefficient
[Figure 6.1 Lorenz Curve. Source: Authors’ illustration.]

The Gini coefficient is defined as A/(A + B), where A is the area between the line of perfect equality and the Lorenz curve and B is the area under the Lorenz curve.

I Gini = 0 if perfect equality.
I Gini = 1 if perfect inequality.
Gini Coefficient
Formally,

I $Gini = 1 - 2\int_0^1 L(x)\,dx$.
I If the Lorenz curve is approximated on each interval as a line between consecutive points, then the area B can be approximated with trapezoids:

$$Gini = 1 - \sum_{i=1}^{N} (x_i - x_{i-1})(y_i + y_{i-1}).$$
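The trapezoid approximation can be sketched directly from Lorenz curve points, taking x as the cumulative population share and y as the cumulative income share, with the curve starting at (0, 0). The function name `gini` and the incomes are assumptions, and sampling weights are ignored.

```python
def gini(incomes):
    """Gini = 1 - sum of (x_i - x_{i-1}) * (y_i + y_{i-1}) over Lorenz points."""
    ordered = sorted(incomes)
    n = len(ordered)
    total = sum(ordered)
    g = 1.0
    prev_x, prev_y, running = 0.0, 0.0, 0.0
    for k, income in enumerate(ordered, start=1):
        running += income
        x, y = k / n, running / total    # next point on the Lorenz curve
        g -= (x - prev_x) * (y + prev_y)
        prev_x, prev_y = x, y
    return g

print(round(gini([50, 50, 50, 50]), 6))       # equal incomes
print(round(gini([10, 20, 30, 40, 100]), 6))  # unequal incomes
print(round(gini([0, 0, 0, 100]), 6))         # one person holds everything
```

With a finite population of N people, the value when one person holds all income is 1 − 1/N rather than exactly 1; it approaches 1 as N grows.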
Gini Coefficient
(2018 or latest available)
Generalized Entropy Measures
A family of inequality measures given by:
$$GE(\alpha) = \frac{1}{\alpha(\alpha - 1)}\left[\frac{1}{N}\sum_{i=1}^{N}\left(\frac{y_i}{\bar{y}}\right)^{\alpha} - 1\right].$$

I GE ∈ [0, ∞).
I 0 is perfect equality and GE increases in inequality.
I α governs the weight given to distances between incomes
at different parts of the income distribution and α ∈ ℝ.
I When α is small ⇒ more sensitive to changes in the lower
tail of the distribution.
I When α is large ⇒ more sensitive to changes in the upper
tail of the distribution.
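A sketch of the GE family for strictly positive incomes. The general formula is undefined at α = 0 and α = 1, so the sketch uses the limiting forms there (the Theil L and Theil T indices); the function name and data are assumptions.

```python
import math

def generalized_entropy(incomes, alpha):
    """GE(alpha) for strictly positive incomes; alpha = 0 and 1 use the limits."""
    n = len(incomes)
    mean = sum(incomes) / n
    ratios = [y / mean for y in incomes]
    if alpha == 0:      # Theil's L (mean log deviation)
        return sum(math.log(1.0 / r) for r in ratios) / n
    if alpha == 1:      # Theil's T
        return sum(r * math.log(r) for r in ratios) / n
    return (sum(r ** alpha for r in ratios) / n - 1.0) / (alpha * (alpha - 1.0))

incomes = [10, 20, 30, 40, 100]    # hypothetical incomes
for a in (0, 1, 2):
    print(a, round(generalized_entropy(incomes, a), 4))
```

For α = 2 the measure equals half the squared coefficient of variation, which gives a quick consistency check against that index.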
Theil Indices
The Theil indices belong to the family GE.
GE(1) is Theil’s T index:
$$GE(1) = \frac{1}{N}\sum_{i=1}^{N} \frac{y_i}{\bar{y}} \ln\left(\frac{y_i}{\bar{y}}\right).$$

The index allows us to decompose the inequality measure into multiple groups:
$$GE(1) = \sum_j \left(\frac{Y_j}{Y}\right) T_j + \sum_j \left(\frac{Y_j}{Y}\right) \ln\left(\frac{Y_j/Y}{N_j/N}\right).$$

I Y_j is the total income in subgroup j; N_j is its population; T_j is its Theil T index.
I The first term represents the within-group inequality.
I The second term represents the between-group inequality.
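
The within/between decomposition of Theil's T can be checked numerically. The sketch below is illustrative only: the function names and the three made-up subgroups are assumptions, and incomes must be strictly positive.

```python
import math


def theil_t(incomes):
    """Theil's T index: (1/N) * sum (y_i / mean) * ln(y_i / mean)."""
    n = len(incomes)
    mean = sum(incomes) / n
    return sum((y / mean) * math.log(y / mean) for y in incomes) / n


def theil_t_decomposition(groups):
    """Return (within, between) for a list of subgroup income lists."""
    total_income = sum(sum(g) for g in groups)
    total_people = sum(len(g) for g in groups)
    within = 0.0
    between = 0.0
    for g in groups:
        income_share = sum(g) / total_income      # Y_j / Y
        pop_share = len(g) / total_people         # N_j / N
        within += income_share * theil_t(g)
        between += income_share * math.log(income_share / pop_share)
    return within, between


groups = [[10, 20, 30], [40, 80], [5, 5, 5, 25]]   # hypothetical subgroups
within, between = theil_t_decomposition(groups)
overall = theil_t([y for g in groups for y in g])
```

Here `within + between` reproduces `overall` up to floating-point error, which is the additive decomposability property the slide refers to.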
Theil Indices

GE(0) is Theil’s L index:

$$GE(0) = \frac{1}{N} \sum_{i=1}^{N} \ln\left( \frac{\bar{y}}{y_i} \right).$$

The index allows us to decompose the inequality measure across
multiple groups:

$$GE(0) = \sum_j \left( \frac{N_j}{N} \right) L_j + \sum_j \left( \frac{N_j}{N} \right) \ln\left( \frac{\bar{y}}{\bar{y}_j} \right).$$

I $\bar{y}_j$ is the mean income in subgroup $j$; $L_j$ is the Theil L index within subgroup $j$.
I The first term represents the within-group inequality.
I The second term represents the between-group inequality.
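
For readers who want to see why the Theil L decomposition holds, here is a short derivation. It uses only the definitions on the slide (N_j people in subgroup j, subgroup mean ȳ_j, overall mean ȳ) and splits each log term at the subgroup mean.

```latex
% Notation follows the slide: \bar{y} is the overall mean, \bar{y}_j the mean in subgroup j.
\begin{align*}
GE(0) &= \frac{1}{N} \sum_{i=1}^{N} \ln\left(\frac{\bar{y}}{y_i}\right)
       = \frac{1}{N} \sum_j \sum_{i \in j}
         \left[ \ln\left(\frac{\bar{y}_j}{y_i}\right) + \ln\left(\frac{\bar{y}}{\bar{y}_j}\right) \right] \\
      &= \sum_j \frac{N_j}{N}
         \underbrace{\frac{1}{N_j} \sum_{i \in j} \ln\left(\frac{\bar{y}_j}{y_i}\right)}_{L_j}
       + \sum_j \frac{N_j}{N} \ln\left(\frac{\bar{y}}{\bar{y}_j}\right).
\end{align*}
```

The weights are population shares here, whereas the Theil T decomposition weights by income shares.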
Coming Soon!

I Changes in inequality over time

I Kopczuk, Wojciech, Emmanuel Saez, and Jae Song (2010).
Earnings inequality and mobility in the United States:
Evidence from social security data since 1937. The
Quarterly Journal of Economics, 125(1), 91-128.
ECON 361: Income & Inequality
Lecture 6: Trends in Inequality

Prof. Sitian Liu


Queen’s University

September 28, 2020


Outline

Last class:

I Measuring inequality

Today:

I Inequality in the long run (Piketty and Saez, 2014)


I Long-run versus short-run earnings mobility (Kopczuk,
Saez, and Song, 2010)
Piketty and Saez (2014, Science)

Inequality in the long run

I The distribution of income and wealth is a widely discussed


and controversial topic:
I Karl Marx (19th century): The dynamics of private capital
accumulation inevitably lead to the concentration of income
and wealth in fewer hands;

I Simon Kuznets (20th century): The balancing forces of


growth, competition, and technological progress lead to
reduced inequality among the classes.
Piketty and Saez (2014, Science)

Income inequality in Europe and the United States, 1900–2010
Share of top income decile in total pretax income
Vertical axis: percent (25 to 50). Horizontal axis: year (1900 to 2010).
Series: Top 10% income share: Europe; Top 10% income share: U.S.
Piketty and Saez (2014, Science)

Wealth inequality in Europe and the United States, 1870–2010
Share of top wealth decile in total net wealth
Vertical axis: percent (50 to 100). Horizontal axis: year (1870 to 2010).
Series: Top 10% wealth share: Europe; Top 10% wealth share: U.S.
Piketty and Saez: Key Facts
I Income and wealth ineq. was very high a century ago,
particularly in Europe, but dropped dramatically in the first
half of the 20th century.
I Today, the income ineq. ordering between Europe and the US is reversed
(true for every measure).
I In Europe, the decline in income ineq. is greater with
income after taxes and transfers.
I In the US, income ineq. is now higher than at any earlier point on record.
I “Great inequality reversal” for wealth ineq. as well.
I Wealth concentration is always higher than income
concentration: the top 10% wealth share is 60–90%.
I Bottom half of population has almost 0 wealth (only income).
I US wealth ineq. today is lower than the record in Europe.
Piketty and Saez: Why?

I Why did income inequality decline in developed countries
during the first half of the 20th century?
I ↓ top capital incomes induced by the world wars, the Great
Depression, and the policies in response to these shocks.
I No structural decline in the inequality of labor income.
I Long-run labor income inequality:
I Determined by the race between education and technology.
I Expansion of education → ↑ supply of skills.
I Technological change → ↑ demand for skills.
Piketty and Saez: Why?

Why has income inequality increased in recent decades in the US?

I Rise in skill-biased technological progress and information
technology.
I Variations between countries?
I Insufficient educational investment.
I Rise of very top labor incomes: the ↑ in the top 10% income share
comes from the top 1% (or even the top 0.1%) (↑ top executive
compensation in large U.S. corporations).

I Inequality does not follow a deterministic process. In this


sense, both Marx and Kuznets were wrong.
I Powerful forces: institutions and policies that societies
choose to adopt.
Kopczuk, Saez, and Song (KSS) (2010)
I Economic inequality is often measured with high-frequency
outcomes (e.g., annual income).

I Substantial mobility in earnings over lifetime.


I Annual earnings inequality might exaggerate the true
economic disparity among individuals (smooth earnings).
I Debate on whether the increase in inequality since the
1970s has been offset by increases in earnings mobility.
I Development of performance pay (bonus and stock): ↑
year-to-year earnings variability for top earners.

I This paper: Use Social Security Administration (SSA)


longitudinal data to analyze long-run inequality and mobility
in the U.S.
KSS: Data and Measurement

Social Security Administration (SSA) earnings micro data:

I 1% sample of the full US covered workforce since 1957


(0.1% since 1937)
I Annual and cover almost 70 years.
I Longitudinal balanced panels, based on the same SSN.
I Earnings are not top-coded since 1978.
I Core sample: workers aged 25-69, in commerce and
industry, with earnings above some minimum threshold.
KSS: Annual Earnings Inequality

FIGURE I
Annual Gini Coefficients
Vertical axis: Gini coefficient (0.30 to 0.50). Horizontal axis: year (1940 to 2000). Series: all workers, men, women.
The figure displays the Gini coefficients from 1937 to 2004 for earnings of individuals in the core sample, men in the core sample, and women in the core sample.
The core sample in year t is defined as all employees with commerce and industry ...
KSS: Annual vs. Five-Year Earnings

FIGURE III
Gini Coefficients: Annual Earnings vs. Five-Year Earnings
Vertical axis: Gini coefficient (0.25 to 0.45). Horizontal axis: year (1940 to 2000).
Series: annual earnings, all workers; five-year earnings, all workers; annual earnings, men; five-year earnings, men.
The figure displays the Gini coefficients for annual earnings and for earnings averaged over five years from 1939 to 2002. In year t, the sample for both series is defined as all individuals aged 25 to 60 in year t, with commerce and industry ...
KSS: Short-Term Mobility

FIGURE IV
Short-Term Mobility: Shorrocks’ Index and Rank Correlation
Vertical axis: Shorrocks Gini mobility index and rank correlation (0.5 to 1.0). Horizontal axis: year (1940 to 2000).
Series: Shorrocks Index (five-year Gini/annual Gini), all workers; Shorrocks Index (five-year Gini/annual Gini), men; rank correlation (after one year), all workers; rank correlation (after one year), men.
The figure displays the Shorrocks mobility coefficient based on annual earnings Gini vs. five-year average earnings Gini and the rank correlation between earnings in year t and year t + 1. The Shorrocks mobility coefficient in year t is defined as the ...
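
The Shorrocks index in the figure legend is the ratio of the five-year Gini to the annual Gini. A minimal Python sketch of that ratio follows. Using the simple mean of the annual Ginis as the denominator is an assumption, since the caption's exact definition is cut off in the source. The function names and the three-worker panels are made up for illustration.

```python
def gini(values):
    """Gini coefficient from the ordered-values formula, equal weights."""
    ordered = sorted(values)
    n = len(ordered)
    weighted = sum((2 * i - n - 1) * y for i, y in enumerate(ordered, start=1))
    return weighted / (n * sum(ordered))


def shorrocks_index(panel):
    """panel[t][w] = earnings of worker w in year t.

    Returns Gini of multi-year average earnings divided by the mean annual Gini
    (the choice of denominator is an assumption of this sketch).
    """
    n_years = len(panel)
    n_workers = len(panel[0])
    annual_ginis = [gini(year) for year in panel]
    averages = [sum(year[w] for year in panel) / n_years for w in range(n_workers)]
    return gini(averages) / (sum(annual_ginis) / n_years)


# Hypothetical five-year panels for three workers.
mobile_panel = [
    [10, 20, 60],
    [20, 60, 10],
    [60, 10, 20],
    [10, 20, 60],
    [20, 10, 60],
]
immobile_panel = [[10, 20, 60]] * 5

mobile = shorrocks_index(mobile_panel)
immobile = shorrocks_index(immobile_panel)
```

An index of 1 means that averaging over five years does not reduce measured inequality (no mobility in this sense). Values below 1 mean that workers trade places enough for longer-run earnings to be more equal than annual earnings.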
KSS: Mobility and Top Earnings

FIGURE VI
Top Percentile Earnings Share and Mobility
A. Top 1% earnings share: annual vs. five-year. Vertical axis: earnings share (%). Series: annual earnings; five-year average earnings.
B. Probability of staying in the top 1%. Vertical axis: probability (%). Horizontal axis: year (1980 to 2005). Series: after one year; after three years; after five years.
In Panel A, the sample in year t is all individuals aged 25 to 60 in year t and ...
KSS: Long-Term Upward Mobility

FIGURE XI
Long-Term Upward Mobility: Gender Effects
Vertical axis: probability of moving from P0–40 to P80–100 (%) after twenty years. Horizontal axis: year (middle of the initial eleven-year span), 1950 to 1980. Series: all, men, women.
The figure displays in year t the probability of moving to the top quintile group (P80–100) for eleven-year average earnings centered around year t + 20 conditional on having eleven-year average earnings centered around year t in the bottom two quintile groups (P0–40). The sample is defined as all individuals aged ...
KSS: Conclusion

I Annual earnings inequality is U-shaped.

I Short-run earnings mobility measures are stable over the


full period (except for a temporary surge during WW II).

I Mobility at the top of the earnings distribution is stable.

I Substantial increase in upward mobility over a lifetime for


women: the driving force behind the increase in long-term
mobility among all workers.
Coming Soon!

I Intergenerational transfer

I Sacerdote, Bruce (2007). How large are the effects from


changes in family environment? A study of Korean
American adoptees. The Quarterly Journal of Economics,
122(1), 119-157.
ECON 361: Income & Inequality
Lecture 7: Intergenerational Transfer

Prof. Sitian Liu


Queen’s University

October 1, 2020
Outline

Last class:

I Trends in inequality

Today:

I Intergenerational elasticity of earnings

I Nature versus nurture (causal effects)


A Thought Experiment

Inequality within generations vs. across generations:

I Society A and society B, with identical distributions of


earnings.
I Within the generation, they are equally unequal.
I In society A, one’s relative position in the earnings
distribution is exactly inherited from one’s parents (extreme
caste society).
I In society B, one’s relative position in the earnings
distribution is completely independent of the position of
one’s parents (complete intergenerational mobility).
I Implications for policies.
Intergenerational Elasticity of Earnings
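I Benchmark regression of permanent earnings of parents
(0) and children (1):

$$\log(Y_1) = \alpha + \beta \log(Y_0) + \epsilon.$$

Use lowercase for logs and demean:

$$y_1 = \beta y_0 + e.$$

I $\beta$ is the intergenerational elasticity (IGE).
I The intergenerational correlation $\rho$ is

$$\rho = \frac{\operatorname{cov}(y_0, y_1)}{\sigma_0 \sigma_1} = \frac{\operatorname{cov}(y_0, y_1)}{\sigma_0^2} \cdot \sigma_0^2 \cdot \frac{1}{\sigma_0 \sigma_1} = \beta \frac{\sigma_0}{\sigma_1}.$$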
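
The link between the elasticity and the correlation can be checked on simulated data, where σ0 and σ1 denote the standard deviations of parents' and children's log earnings. The sketch below is illustrative only: the true elasticity of 0.4, the noise levels, and the sample size are hypothetical choices, not estimates from any study.

```python
import math
import random


def mean(values):
    return sum(values) / len(values)


def covariance(a, b):
    """Population covariance; covariance(a, a) is the variance."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


random.seed(0)
true_beta = 0.4                      # hypothetical IGE used to simulate data
n = 5000
y0 = [random.gauss(0.0, 1.0) for _ in range(n)]             # parents' log earnings
y1 = [true_beta * p + random.gauss(0.0, 0.5) for p in y0]   # children's log earnings

sd0 = math.sqrt(covariance(y0, y0))
sd1 = math.sqrt(covariance(y1, y1))
beta_hat = covariance(y0, y1) / sd0 ** 2     # OLS slope of y1 on y0
rho_hat = covariance(y0, y1) / (sd0 * sd1)   # intergenerational correlation
```

In this simulation log earnings are less dispersed among children than among parents, so `rho_hat` exceeds `beta_hat`. The two coincide only when σ0 = σ1, which is why the elasticity and correlation columns can rank countries differently.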
Intergenerational Elasticity of Earnings

Measurement issues: y should be a measure of permanent


earnings.

1. Persistent transitory shocks


I Measurement error in y0 ; errors are correlated over time.
I Averaging over 20 to 30 years (Mazumder, 2005).

2. Lifecycle bias
I Fathers’ and sons’ earnings are measured at different ages.
I Correlation between log earnings at different ages and log
of the present value of lifetime earnings: low in 20s, and
close to 1 in 30s to late 40s (Haider and Solon, 2006).
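
The reason transitory shocks matter can be seen in the textbook errors-in-variables formula. The derivation below is a sketch under classical assumptions that are not necessarily those of the cited papers: parental log earnings observed in year t equal permanent earnings plus a shock v_t that is uncorrelated with permanent earnings and with e, and the regressor is a T-year average.

```latex
% Assumed setup: observed parental log earnings in year t are y_0 + v_t.
\begin{align*}
\bar{y}_0 &= y_0 + \bar{v}, \qquad \bar{v} = \frac{1}{T} \sum_{t=1}^{T} v_t, \\
\operatorname{plim} \hat{\beta}
  &= \frac{\operatorname{cov}(\bar{y}_0, y_1)}{\operatorname{var}(\bar{y}_0)}
   = \beta \cdot \frac{\sigma_0^2}{\sigma_0^2 + \operatorname{var}(\bar{v})}.
\end{align*}
% If the shocks were i.i.d. with variance \sigma_v^2, then var(\bar{v}) = \sigma_v^2 / T.
```

The estimate is biased toward zero, and the bias shrinks as var(v̄) falls. When shocks are positively correlated over time, var(v̄) falls more slowly than σ_v²/T, which is consistent with the slide's point that long averages are needed.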
Table 1: Elasticities and correlations from Jäntti et al. (2006).

Country   Elasticity               Correlation
Men
Denmark   0.071 [0.064, 0.079]     0.089 [0.079, 0.099]
Finland   0.173 [0.135, 0.211]     0.157 [0.128, 0.186]
Norway    0.155 [0.137, 0.174]     0.138 [0.123, 0.152]
Sweden    0.258 [0.234, 0.281]     0.141 [0.129, 0.152]
UK        0.306 [0.242, 0.370]     0.198 [0.156, 0.240]
US        0.517 [0.444, 0.590]     0.357 [0.306, 0.409]
Women
Denmark   0.034 [0.027, 0.041]     0.045 [0.036, 0.054]
Finland   0.080 [0.042, 0.118]     0.074 [0.045, 0.103]
Norway    0.114 [0.090, 0.137]     0.084 [0.070, 0.099]
Sweden    0.191 [0.166, 0.216]     0.102 [0.090, 0.113]
UK        0.331 [0.223, 0.440]     0.141 [0.099, 0.183]
US        0.283 [0.181, 0.385]     0.160 [0.105, 0.215]

Numbers in brackets next to the point estimates show the bias-corrected 95% bootstrap confidence interval.
Source: This reproduces much of Table 2 from Jäntti et al. (2006).
Causal Effects Underlying the Correlations

Causal mechanisms:

I Nature/nurture debate


I Parental attributes

Methodologies:

I Regression analysis using adoptees


I Natural experiments/instrumental variable estimates
Sacerdote (2007)
I Social scientists, policy makers, and parents everywhere ask:
To what degree are children’s behavior and outcomes
determined by nature, nurture, and their interaction?
I Korean American adoptees:
I Quasi-randomly assigned by Holt International
Children’s Services during 1964-1985 to US families.
I First-come, first-served policy: The timing of when
applications are completed creates the match, rather than
parent and child characteristics.
I Exception for families with all boys or all girls: Effectively
random conditional on the adoptee’s cohort and gender.

I Survey of adoptees and their families (including


non-adoptees) during 2004-2005.
Sacerdote (2007)
Treatment effects:
I Type 1 (T1): High education, small families.
I Type 3 (T3): Low education, large families.
I Type 2 (T2): All other families not in the extreme groups.

$$E_i = \alpha + \beta_1 T1_i + \beta_2 T2_i + \beta_3 Male_i + \gamma A_i + \rho C_i + \epsilon_i,$$

where
I $E_i$: educational attainment for child $i$.
I $A_i$: a set of age dummies.
I $C_i$: a set of cohort dummies, i.e., the year in which the child
initially entered Holt. (Parent and child characteristics co-vary
systematically over time.)
I $\beta_1$: the causal effect of assignment to a type 1 family, relative to
assignment to a type 3 family.
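
To see what the coefficients on the family-type dummies measure, the sketch below fits a stripped-down version of the regression with only a constant, T1, and T2 (no gender, age, or cohort controls, which is a simplification). In that special case OLS reproduces differences in group means relative to the omitted Type 3 group. The data are made up and the helper names are assumptions.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(x_rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(x_rows[0])
    xtx = [[sum(r[i] * r[j] for r in x_rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x_rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# (family type, child's years of education): hypothetical values.
children = [
    (1, 16), (1, 15), (1, 17),
    (2, 14), (2, 15), (2, 13),
    (3, 13), (3, 12), (3, 14),
]
x_rows = [[1.0, float(t == 1), float(t == 2)] for t, _ in children]
y = [float(e) for _, e in children]
alpha, beta1, beta2 = ols(x_rows, y)   # Type 3 is the omitted category
```

With these made-up numbers the Type 3 mean is 13 years, so `beta1` is the Type 1 mean minus 13 and `beta2` is the Type 2 mean minus 13. Adding the controls from the slide changes the estimates but not this interpretation.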
Sacerdote (2007)

FIGURE I
Mean (College Attendance) By Family Size
Dashed line is for nonadoptees (higher line), solid line is for adoptees.
Sacerdote (2007)
FIGURE II
Mean Child’s Years of Education vs. Mother’s
Dashed line is for nonadoptees. Solid line is for adoptees.
Sacerdote (2007)

FIGURE III
Mean of Child’s Family Income By Parents’ Income at Adoption
Dashed line is for nonadoptees (higher line). Solid line is for adoptees.
Sacerdote (2007)
Table: Treatment Effects from Assignment to High Education, Small Family
(Part of Table VII from the Original Paper)

Columns:
(1) Treatment effect: “middle group” of families vs. large, less educated
(2) Treatment effect: high education small family vs. large, less educated
(3) Nonadoptees: H... education sm... family vs. lar... less educate... (column cut off in the source after the standard error; any significance markers are not visible)

Outcome                                     (1)               (2)                (3)
Child’s years of education                  0.314 (0.226)     0.749 (0.245)**    2.157 (0.264)
Child has 4+ years college                  0.060 (0.056)     0.161 (0.057)**    0.317 (0.031)
Log child’s household income                0.071 (0.081)     0.113 (0.089)      0.210 (0.089)
Child four-year college ranked by US News   0.082 (0.052)     0.231 (0.060)**    0.365 (0.052)
Acceptance rate of child’s college          −0.007 (0.035)    0.016 (0.036)      −0.053 (0.032)
Child drinks (yes/no)                       0.099 (0.050)*    0.178 (0.049)**    0.229 (0.041)
Child smokes (yes/no)                       0.013 (0.044)     −0.006 (0.048)     −0.075 (0.024)
Child’s BMI                                 −0.509 (0.460)    −0.941 (0.468)*    −0.929 (0.498)
Child overweight                            −0.030 (0.047)    −0.077 (0.045)     −0.088 (0.048)
Child obese                                 −0.020 (0.023)    −0.044 (0.018)*    −0.037 (0.018)
Child has asthma                            −0.005 (0.028)    0.013 (0.031)      −0.005 (0.034)
Number of children                          −0.070 (0.099)    −0.199 (0.103)*    −0.580 (0.132)
Child is married                            0.014 (0.050)     0.000 (0.056)      −0.092 (0.053)

I split the sample into three groups: High education small families are defined as those with three or fewer children in which both the ... (Type 1). Twenty-seven percent of adoptees are assigned to such a family. Large lesser educated families are defined as those with four or ... has a college degree (Type 3). Thirteen percent of adoptees are assigned to such a family. The remaining families (which are either small or ...
Black, Devereux, and Salvanes (2005)

I Parents with higher education levels have children with
higher education levels. Why?

I Selection: the type of parent who has more education has
the type of child who will do so as well.

I Causation: obtaining more education makes one a different
type of parent.

I Education policy: equality of opportunity → spillover effect
on later generations?
Black, Devereux, and Salvanes (2005)
I Change in compulsory schooling laws in Norway in the
1960s.

I Pre-reform: required to attend school through 7th grade;

I After reform: extended to 9th grade.

I The reform occurred in different municipalities at different


times.

$$S_1 = \beta_0 + \beta_1 S_0 + \beta_2 Age_1 + \beta_3 Age_0 + \beta_4 M_0 + \epsilon$$
$$S_0 = \alpha_0 + \alpha_1 Reform_0 + \alpha_2 Age_1 + \alpha_3 Age_0 + \alpha_4 M_0 + \nu$$

where subscript 0 denotes parent and 1 denotes child.

Estimate using 2SLS, with $Reform_0$ serving as an
instrument for $S_0$.
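
A simulated example may help show why the instrument matters. The sketch below is a simplification: it drops the age and municipality controls, and every number in it (the true effect of 0.1, the reform raising parental schooling by one year, the unobserved "ability" term) is hypothetical. It compares plain OLS with a manual two-stage procedure.

```python
import random


def mean(values):
    return sum(values) / len(values)


def slope_and_intercept(x, y):
    """OLS of y on a constant and a single regressor x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


random.seed(1)
true_beta1 = 0.1          # hypothetical causal effect of parent schooling
n = 50000
reform, s0, s1 = [], [], []
for _ in range(n):
    z = 1.0 if random.random() < 0.5 else 0.0   # parent exposed to the reform
    ability = random.gauss(0.0, 1.0)            # unobserved family trait (selection)
    parent = 9.0 + 1.0 * z + ability + random.gauss(0.0, 1.0)
    child = 10.0 + true_beta1 * parent + 2.0 * ability + random.gauss(0.0, 1.0)
    reform.append(z)
    s0.append(parent)
    s1.append(child)

ols_beta1, _ = slope_and_intercept(s0, s1)        # picks up selection on ability
pi1, pi0 = slope_and_intercept(reform, s0)        # first stage: S0 on Reform0
s0_hat = [pi0 + pi1 * z for z in reform]          # fitted parent schooling
iv_beta1, _ = slope_and_intercept(s0_hat, s1)     # second stage: S1 on fitted S0
```

In this setup OLS is pulled well above the true effect because ability raises both generations' schooling, while the two-stage estimate recovers something close to 0.1. Standard errors from a hand-rolled second stage are not valid, so applied work relies on a dedicated 2SLS routine.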
Source: Black, Devereux, and Salvanes (2005), p. 442.

TABLE 2: DISTRIBUTION OF EDUCATION TWO YEARS BEFORE AND AFTER THE REFORM

Years of education   Before    After
7                    3.5%      1.2%
8                    8.9%      1.6%
9                    3.4%      12.9%
10                   29.6%     26.6%
11                   8.5%      8.8%
12                   17.2%     19.1%
13                   6.7%      6.7%
14                   5.4%      5.8%
15                   2.7%      3.4%
16+                  14.2%     14.1%
N                    89,320    92,227

Notes: Before indicates education distribution of cohorts in the two years prior to the reform, while After indicates the distribution of those two years post reform. Note that be...

TABLE 3: RELATIONSHIP BETWEEN PARENTS’ AND CHILDREN’S EDUCATION
Dependent variable: Children’s education

                   Full sample                        Parent’s education <10
                   OLS              IV                OLS              IV
Mother–all         0.237* (0.003)   0.076 (0.139)     0.211* (0.017)   0.122* (0.043)
                   N = 143,579                        N = 39,605
Mother–son         0.212* (0.004)   0.199 (0.185)     0.197* (0.021)   0.176* (0.054)
                   N = 73,663                         N = 20,135
Mother–daughter    0.264* (0.004)   −0.029 (0.186)    0.225* (0.023)   0.066 (0.063)
                   N = 69,916                         N = 19,470
Father–all         0.217* (0.003)   0.030 (0.132)     0.200* (0.021)   0.041 (0.062)
                   N = 96,275                         N = 22,148
Father–son         0.209* (0.004)   0.029 (0.171)     0.151* (0.027)   0.008 (0.071)
                   N = 49,492                         N = 11,235
Father–daughter    0.226* (0.004)   0.022 (0.186)     0.244* (0.033)   0.081 (0.094)
                   N = 46,783                         N = 10,913
Coming Soon!

I Neighborhood Effect

I The Geography of Intergenerational Mobility


I Chetty, Raj, Nathaniel Hendren, Patrick Kline, and
Emmanuel Saez (2014). Where is the land of opportunity?
The geography of intergenerational mobility in the United
States. The Quarterly Journal of Economics, 129(4),
1553-1623.
I (Optional): Corak, Miles, and Andrew Heisz (1999). The
intergenerational earnings and income mobility of Canadian
men: Evidence from longitudinal income tax data. The
Journal of Human Resources, 34(3), 504-533.
ECON 361: Income & Inequality
Lecture 8: Geography of Intergenerational Mobility

Prof. Sitian Liu


Queen’s University

October 5, 2020
Outline

Last class:

I Intergenerational elasticity of earnings


I Nature versus nurture (causal effects)

Today:

I Neighborhood effects
I The geography of intergenerational mobility
I Race and intergenerational mobility
Intergenerational Influences

I Intergenerational elasticity of earnings.


I Sibling correlations in earnings provide another measure of
intergenerational transfer:
I Positive correlations imply that shared genetic and
environmental factors cause siblings to be more similar than
two randomly selected persons in society.
I Solon (1999): 0.4 – correlation of log earnings between
brothers in the US.
I Black and Devereux (2001): 0.15-0.2 – Nordic countries.
I Parental attributes (nature and nurture).
I Neighborhood characteristics.
Moving to Opportunity (MTO)
The Moving to Opportunity (MTO) experiment offers an
opportunity to provide experimental evidence on neighborhood
effects.

I MTO offered randomly selected low-income families


housing vouchers to move from high-poverty to
lower-poverty neighborhoods in five US cities 1994-1998.
I Experimental group: housing vouchers that subsidized
private-market rents in low-poverty census tracts (for the
first year).
I Section 8 group: regular housing vouchers without any
relocation constraint.
I Control group: no assistance through MTO.
Chetty, Hendren, and Katz (2016)

I Analyze MTO’s long-term impacts on children who were


young when their families moved to better neighborhoods.
I Improved the mental and physical health, and family safety.
I No significant impacts on the earnings and employment of
adults and older youth.
I The amount of time individuals spend in a neighborhood
during childhood can be crucial for the neighborhood’s
effects on children’s long-term outcomes.
Chetty, Hendren, and Katz (2016): Data

I MTO data: 4,606 households and 15,892 individuals


participated in the experiment.
I Background characteristics: children’s schooling, household
income, etc.
I Neighborhood information: poverty rate.

I Tax data: federal income tax records 1996-2012.


I Income, college attendance, college quality, neighborhood
characteristics in adulthood, marital status and fertility, and
taxes paid.
Chetty, Hendren, and Katz (2016): Analysis

I What is the causal impact of being offered a voucher to


move through MTO?
I "Intent-to-treat" (ITT) effects: differences between
treatment and control group means.

$$y_i = \alpha + \beta_E^{ITT} Exp_i + \beta_S^{ITT} S8_i + \gamma X_i + \epsilon_i \qquad (1)$$

I $Exp_i$: indicator for being randomly assigned to the experimental
group.
I $S8_i$: indicator for being randomly assigned to the Section 8 group.
I $X_i$: baseline characteristics.
Chetty, Hendren, and Katz (2016): Analysis
I Since not all the families offered vouchers actually took
them up, the ITT estimates understate the causal effect of
actually moving to a different neighborhood.
I What is the causal impact of moving through MTO?
I "Treatment on the treated" (TOT) effects.

$$y_i = \alpha + \beta_E^{TOT} TakeExp_i + \beta_S^{TOT} TakeS8_i + \gamma X_i + \epsilon_i \qquad (2)$$

I $TakeExp_i$: indicator for taking up the experimental vouchers.
I $TakeS8_i$: indicator for taking up the Section 8 vouchers.
I $X_i$: baseline characteristics.
I $TakeExp$ and $TakeS8$ are endogenous, so the authors
instrument for them using the randomly-assigned MTO
treatment group indicators ($Exp$ and $S8$).
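
The relationship between ITT and TOT can be illustrated with a simulation. The sketch below is a simplification with a single voucher arm and no covariates, and every number in it (a 48 percent take-up rate, a $3,000 effect of moving, the earnings noise) is hypothetical. With one binary instrument, the 2SLS estimate reduces to the ITT divided by the difference in take-up rates.

```python
import random

random.seed(2)
n = 40000
effect_of_moving = 3000.0     # hypothetical earnings gain for children who move
takeup_prob = 0.48            # hypothetical take-up rate among families offered a voucher

offered, took_up, earnings = [], [], []
for _ in range(n):
    z = random.random() < 0.5                      # randomly offered a voucher
    d = z and (random.random() < takeup_prob)      # only offered families can take up
    y = 11000.0 + effect_of_moving * d + random.gauss(0.0, 5000.0)
    offered.append(z)
    took_up.append(d)
    earnings.append(y)


def group_mean(values, flags, wanted):
    """Mean of values over observations whose flag equals wanted."""
    picked = [v for v, f in zip(values, flags) if f == wanted]
    return sum(picked) / len(picked)


itt = group_mean(earnings, offered, True) - group_mean(earnings, offered, False)
first_stage = (group_mean(took_up, offered, True)
               - group_mean(took_up, offered, False))
tot = itt / first_stage       # IV estimate with one binary instrument
```

The ITT is diluted by families who were offered a voucher but did not move, so it is roughly the take-up rate times the effect of moving. Dividing by the take-up rate undoes that dilution.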
Table 2: First-Stage Impacts of MTO on Voucher Take-Up
and Neighborhood Poverty Rates (Percentage Points)

Columns:
(1) Housing voucher take-up
(2), (3) Poverty rate in tract one year post-RA
(4), (5) Mean poverty rate in tract, post-RA to age 18
(6), (7) Mean poverty rate in zip, post-RA to age 18

                        (1)        (2)        (3)        (4)        (5)        (6)        (7)
                        Take-up    ITT        TOT        ITT        TOT        ITT        TOT
Panel A. Children < age 13 at random assignment
Exp. versus control     47.66***   −17.05***  −35.96***  −10.27***  −21.56***  −5.84***   −12.23***
                        (1.653)    (0.853)    (1.392)    (0.650)    (1.118)    (0.425)    (0.752)
Sec. 8 versus control   65.80***   −14.88***  −22.57***  −7.97***   −12.06***  −3.43***   −5.17***
                        (1.934)    (0.802)    (1.024)    (0.615)    (0.872)    (0.423)    (0.622)
Observations            5,044      4,958      4,958      5,035      5,035      5,035      5,035
Control group mean      0          50.23      50.23      41.17      41.17      31.81      31.81

Panel B. Children age 13–18 at random assignment
Exp. versus control     40.15***   −14.00***  −34.70***  −10.04***  −24.66***  −5.51***   −13.52***
                        (2.157)    (1.136)    (2.231)    (0.948)    (1.967)    (0.541)    (1.113)
Sec. 8 versus control   55.04***   −12.21***  −22.03***  −8.60***   −15.40***  −3.95***   −7.07***
                        (2.537)    (1.078)    (1.738)    (0.920)    (1.530)    (0.528)    (0.921)
Observations            2,358      2,302      2,302      2,293      2,293      2,292      2,292
Control group mean      0          49.14      49.14      47.90      47.90      35.17      35.17

Notes: Columns 1, 2, 4, and 6 report ITT estimates from OLS regressions (weighted to adjust for differences in sampling probabilities across sites and over time) of an outcome on indicators for being assigned to the experimental voucher group and the Section 8 voucher group as well as randomization site indicators. Columns 3, 5, and 7 report TOT estimates using a 2SLS specification, instrumenting for voucher take-up with the experimental and Section 8 assignment indicators. Standard errors, reported in parentheses, are clustered by family. Panel A restricts the sample to children below age 13 at random assignment; panel B includes children between age 13 and 18 at random assignment. The estimates in panels A and B are obtained from separate regressions. The dependent variable in column 1 is an indicator for the family taking up an MTO voucher and moving. The dependent variable in columns 2 and 3 is the census tract-level poverty rate one year after random assignment. The dependent variable in columns 4–7 is the duration-weighted mean poverty rate in the census tracts (columns 4 and 5) and zip codes (columns 6 and 7) where the child lived from random assignment till age 18. The sample in this table includes all children born before 1991 in the MTO data for whom an SSN was collected prior to RA because we were unable to link the MTO tract-level location information to the tax data. This sample is nearly identical our linked analysis sample because ...

I Column 1: Replace $y$ in Eq. 1 with an indicator for taking up a housing
voucher. Among younger children, 48 percent who were assigned to
the experimental group took up the voucher they were offered.
I Column 3: Instrument for ($TakeExp$, $TakeS8$) with ($Exp$, $S8$).
Estimate Eq. 2, replacing $TakeExp$ with the expected take-up
predicted in the first stage (Column 1).
Table 3: Impacts of MTO on Children’s Income in Adulthood

Columns:
(1) W-2 earnings ($), 2008–2012, ITT
(2) Individual earnings 2008–2012 ($), ITT
(3) Individual earnings 2008–2012 ($), ITT w/ controls
(4) Individual earnings 2008–2012 ($), TOT
(5) Individual earnings ($), age 26, ITT
(6) Individual earnings ($), 2012, ITT
(7) Employed (%), 2008–2012, ITT
(8) Hhold. inc. ($), 2008–2012, ITT
(9) Inc. growth ($), 2008–2012, ITT

                       (1)         (2)         (3)         (4)         (5)         (6)         (7)         (8)         (9)
Panel A. Children < age 13 at random assignment
Exp. versus control    1,339.8**   1,624.0**   1,298.9**   3,476.8**   1,751.4*    1,443.8**   1.824       2,231.1***  1,309.4**
                       (671.3)     (662.4)     (636.9)     (1,418.2)   (917.4)     (665.8)     (2.083)     (771.3)     (518.5)
Sec. 8 versus control  687.4       1,109.3     908.6       1,723.2     551.5       1,157.7*    1.352       1,452.4**   800.2
                       (698.7)     (676.1)     (655.8)     (1,051.5)   (888.1)     (690.1)     (2.294)     (735.5)     (517.0)
Observations           8,420       8,420       8,420       8,420       1,625       2,922       8,420       8,420       8,420
Control group mean     9,548.6     11,270.3    11,270.3    11,270.3    11,398.3    11,302.9    61.8        12,702.4    4,002.2

Panel B. Children age 13–18 at random assignment
Exp. versus control    −761.2      −966.9      −879.5      −2,426.7    −539.0      −969.2      −2.173      −1,519.8    −693.6
                       (870.6)     (854.3)     (817.3)     (2,154.4)   (795.4)     (1,122.2)   (2.140)     (1,102.2)   (571.6)
Sec. 8 versus control  −1,048.9    −1,132.8    −1,136.9    −2,051.1    −15.11      −869.0      −1.329      −936.7      −885.3
                       (932.5)     (922.3)     (866.6)     (1,673.7)   (845.9)     (1,213.3)   (2.275)     (1,185.9)   (625.2)
Observations           11,623      11,623      11,623      11,623      2,331       2,331       11,623      11,623      11,623
Control group mean     13,897.1    15,881.5    15,881.5    15,881.5    13,968.9    16,602.0    63.6        19,169.1    4,128.1

Notes: Columns 1–3 and 5–9 report ITT estimates from OLS regressions (weighted to adjust for differences in sampling probabilities across sites and over time) of an outcome on indicators for being assigned to the experimental voucher group and the Section 8 voucher group as well as randomization site indicators. Column 4 reports TOT estimates using a 2SLS specification, instrumenting for voucher take-up with the experimental and Section 8 assignment indicators. Standard errors, reported in parentheses, are clustered by family. Panel A restricts the sample to children below age 13 at random assignment; panel B includes children between age 13 and 18 at random assignment. The estimates in panels A and B are obtained from separate regressions. The number of individuals is 2,922 in panel A (except in column 5, where it is 1,625) and 2,331 in panel B. The dependent variable in column 1 is individual W-2 wage earnings, summing over all available W-2 forms. Column 1 includes one observation per individual per year from 2008–2012 in which the individual is 24 or older. Column 2 replicates column 1 using individual earnings as the dependent variable. Individual earnings is defined as the sum of individual W-2 and non-W-2 earnings. Non-W-2 earnings is adjusted gross income minus own and spouse’s W-2 earnings, social security and disability benefits, and UI payments, divided by the number of filers on the tax return. Non-W-2 earnings is ...

I Column 2: Among younger children, being assigned an experimental
voucher increases individual earnings by $1,624.
I Column 4: Children whose families took up the experimental voucher
and moved when they were young experience an increase in annual
individual earnings in early adulthood of $3,477.
Figure 1. Impacts of Experimental Voucher by Age of Earnings Measurement
Vertical axis: Experimental versus control ITT on earnings ($). Horizontal axis: Age of income measurement (20 to 28).
Series: < Age 13 at random assignment; Age 13–18 at random assignment.
Notes: This figure presents ITT estimates of the impact of being assigned to the experimental voucher group on ...
The Geography of Intergenerational Mobility

Chetty, Hendren, Kline, and Saez (2014) use administrative


records on the incomes of more than 40 million children and
their parents to describe three features of intergenerational
mobility in the US.

I A 10 percentile increase in parent income is associated


with a 3.4 percentile increase in a child’s income.
I Intergenerational mobility varies substantially across areas.
I Factors correlated with upward mobility.
Chetty Et Al. (2014): Data

I US citizens born between 1980-1982.


I Identify the parents of a child as the first tax filers (between
1996-2012) who claim the child as a child dependent and
were between the ages of 15 and 40 when the child was
born.
I Parental income is from federal income tax records
(including labor earnings and capital income,
unemployment insurance, social security, and disability
benefits). Parent family income is averaged between 1996
and 2000.
I Child family income (averaged over 2011 and 2012),
college attendance, college quality.
Chetty Et Al. (2014): National Statistics
I Rank-rank: roughly linear. A 10 percentile increase in parent income
rank is associated with a 3.4 percentile increase in a child’s income
rank.

FIGURE II: Association between Children’s and Parents’ Percentile Ranks
A. Mean Child Income Rank vs. Parent Income Rank in the U.S.
Vertical axis: Mean Child Income Rank (20 to 70). Horizontal axis: Parent Income Rank (0 to 100).
Rank-Rank Slope = 0.341 (0.0003)
Chetty Et Al. (2014): National Statistics

I The chances of achieving the “American Dream” are considerably
higher for children in Denmark and Canada than those in the US.
B. Cross-Country Comparisons
(x-axis: Parent Income Rank; y-axis: Mean Child Income Rank; series: United States, Denmark, Canada)
Rank-Rank Slope (Denmark) = 0.180 (0.006)
Rank-Rank Slope (Canada) = 0.174 (0.005)
Chetty Et Al. (2014): Spatial Variation
I Variation in mobility across commuting zones (CZ): based
on where children lived at age 16.
I Relative mobility:
$R_{ic} = \alpha_c + \beta_c P_{ic} + \varepsilon_{ic}$,

where $R_{ic}$ and $P_{ic}$ are children’s and parents’ income ranks
in their respective income distributions for child $i$ who grew
up in CZ $c$. $\beta_c$ measures the relative mobility in CZ $c$.
I Absolute mobility:
$\bar{r}_{pc} = \alpha_c + \beta_c p$.

$\bar{r}_{pc}$ is the expected rank of a child who grew up in CZ $c$ with
parents who have a national income rank of $p$.
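The rank-rank regression above can be sketched in a few lines. This is a minimal illustration on simulated data, not the authors' code: the income process, the persistence parameter 0.35, and the helper names `percentile_ranks` and `rank_rank_fit` are all assumptions made for the example.

```python
import numpy as np

def percentile_ranks(x):
    """Map values to percentile ranks in (0, 100]; ties are not handled specially."""
    order = np.argsort(np.argsort(x))  # 0-based rank of each observation
    return 100.0 * (order + 1) / len(x)

def rank_rank_fit(parent_income, child_income):
    """OLS of child rank on parent rank; returns (alpha, beta)."""
    p = percentile_ranks(parent_income)
    r = percentile_ranks(child_income)
    beta, alpha = np.polyfit(p, r, deg=1)  # polyfit returns the slope first
    return alpha, beta

# Simulated incomes for one hypothetical commuting zone (assumed process).
rng = np.random.default_rng(0)
n = 50_000
log_parent = rng.normal(size=n)
log_child = 0.35 * log_parent + rng.normal(size=n)

alpha, beta = rank_rank_fit(np.exp(log_parent), np.exp(log_child))

# Absolute upward mobility: expected child rank for parents at p = 25.
r25 = alpha + beta * 25
# Relative mobility: (r_100 - r_0) / 100, which equals the slope beta.
relative = ((alpha + beta * 100) - (alpha + beta * 0)) / 100
print(f"alpha={alpha:.2f}, beta={beta:.3f}, r25={r25:.2f}")
```

In this sketch the slope plays the role of $\beta_c$ and `r25` the role of $\bar{r}_{25,c}$; with real data one would rank children and parents in the national distributions and then fit the line separately for each CZ.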
Chetty Et Al. (2014): Absolute Mobility
I Upward mobility varies substantially at the regional level: lowest in the
Southeast and highest in the Great Plains. The West Coast and
Northeast also have high rates of upward mobility.
FIGURE VI: The Geography of Intergenerational Mobility
A. Absolute Upward Mobility: Mean Child Rank for Parents at 25th Percentile ($\bar{r}_{25}$) by CZ
Chetty Et Al. (2014): Relative Mobility
I Similar patterns.
B. Relative Mobility: Rank-Rank Slopes $(\bar{r}_{100} - \bar{r}_0)/100$ by CZ
Chetty Et Al. (2014): Correlates
FIGURE VIII: Correlates of Spatial Variation in Upward Mobility

(x-axis: Correlation, 0 to 1.0; the sign of each correlation is shown in parentheses)

SEG
Frac. Black Residents (-)
Racial Segregation (-)
Segregation of Poverty (-)
Frac. < 15 Mins to Work (+)
INC
Mean Household Income (+)
Gini Coef. (-)
Top 1% Inc. Share (-)
K-12
Student-Teacher Ratio (-)
Test Scores (Inc Adjusted) (+)
High School Dropout (-)
SOC
Social Capital Index (+)
Frac. Religious (+)
Violent Crime Rate (-)
FAM
Frac. Single Moms (-)
Divorce Rate (-)
Frac. Married (+)
TAX
Local Tax Rate (+)
State EITC Exposure (+)
Tax Progressivity (+)
COLL
Colleges per Capita (+)
College Tuition (-)
Coll Grad Rate (Inc Adjusted) (+)
LAB
Manufacturing Share (-)
Chinese Import Growth (-)
Teenage LFP Rate (+)
MIG
Migration Inflow (-)
Migration Outflow (-)
Frac. Foreign Born (-)
Chetty Et Al. (2018): Race and Intergenerational Mobility
FIGURE II: Empirical Estimates of Intergenerational Mobility and Racial Disparities

A. Intergenerational Mobility and Steady States for Blacks vs. Whites

Race and Intergenerational Mobility

B. Current Mean Ranks vs. Predicted Ranks in Steady State, by Race

These figures show how empirical estimates of intergenerational mobility by race (Panel A) relate to the evo
disparities (Panel B) using the model in Section II. These figures use the primary analysis sample (children
Race and Intergenerational Mobility

New York Times: Mobility Animation!


Black-White Gaps in Income: Males
FIGURE V: Black-White Gaps in Individual Income, by Gender
A. Males
Black-White Gaps in Income: Females
B. Females
Black-White Gaps in Incarceration

E. Incarceration, Females F. Incarceration, Males

Notes: Panels A-D show the relationship between children’s educational attainment and their parents’ household income, by
race and gender. Data on educational attainment is obtained from the American Community Survey. Panels A and B plot
the fraction of children who complete high school by parental income percentile, by race and gender. Panels C and D replicate
Panels A and B using college attendance as the outcome. Panels A-B include only children observed in the 2005-15 ACS at
age 19 or older, while Panels C-D include those observed at age 20 or older. High school completion is defined as having a
high school diploma or GED. College attendance is defined as having obtained “at least some college credit”. Panels E and
F plot incarceration rates vs. parent income percentile, by race and gender. Incarceration is defined as being incarcerated
on April 1, 2010 using data from the 2010 Census short form. The children in our sample are between the ages of 27-32 at
Coming Soon!

I The Returns to Education


I Angrist, Joshua D., and Alan B. Krueger (1991). Does
compulsory school attendance affect schooling and
earnings? The Quarterly Journal of Economics. 106(4),
979-1014.
ECON 361: Income & Inequality
Lecture 9: Education and Earnings

Sitian Liu
Queen’s University

October 8, 2020
Outline

Last class:

I Neighborhood effects
I The geography of intergenerational mobility
I Race and intergenerational mobility

Today:

I The difficulty of estimating the causal effect of education on
earnings
I Empirical evidence on the returns to education
Is College Worth It?
I A fundamental concern facing every high school student is
the economic return to investing in a college education.
I A big concern about the value of a college education is the
rising tuition costs and debt.

Figure: Trends in Full-Time College Tuition


Is College Worth It?

Figure: Debt Among Public and Private BA Recipients


Is College Worth It?
I Discussing the cost of higher education without discussing
the benefits can be deeply misleading.

Figure: Annual Earnings Among Workers, Age 25 and Over


Is College Worth It?

I To justify large expenditures on post-secondary training,
the relationship found in the previous figure must be causal
(the topic of the rest of this lecture).
I The earnings differences in the figure pay no attention to
education quality (the topic of the next lecture).
The Difficulty of Estimating the Causal Effect
of Education on Earnings

I The positive correlation between education and future
earnings does not necessarily mean a positive return to
education investment.
I Selection: Students who choose to obtain more education
are systematically different from those who choose to obtain
less education in ways that also should influence earnings.
I Selection when ability is one-dimensional
I Selection when ability is multidimensional
One-Dimensional Ability: Rosen’s Model
I Human capital production function for person $i$:

$\ln y_i(s) = h(s, A_i)$,

where $A_i$ measures $i$’s ability.
I Each person faces the same interest rate $r$.
I Person $i$’s optimal schooling choice is given by

$\max_s V(s) = \int_s^{\infty} y_i(s) e^{-rt} \, dt = \frac{y_i(s) e^{-rs}}{r}$

FOC: $\frac{d y_i(s)}{ds} = y_i(s) \, r \iff \frac{\partial h(s, A_i)}{\partial s} = r$

I The FOC implies that the person should continue schooling
until the marginal rate of return is equal to the interest rate.
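A small numerical check of the first-order condition may help. The sketch below assumes a hypothetical functional form $h(s, A) = A s - \tfrac{1}{2} b s^2$, which is not from the lecture, so that the FOC $\partial h / \partial s = r$ gives the closed form $s^* = (A - r)/b$. The grid search maximizes $\ln V(s) = h(s, A) - rs - \ln r$ directly.

```python
import numpy as np

def log_earnings(s, ability, b=0.01):
    """Assumed h(s, A) = A*s - 0.5*b*s^2 (illustrative functional form)."""
    return ability * s - 0.5 * b * s**2

def optimal_schooling(ability, r=0.05, b=0.01):
    """Grid search for the s that maximizes ln V(s) = h(s, A) - r*s - ln(r)."""
    s_grid = np.linspace(0.0, 20.0, 2001)  # steps of 0.01 years
    log_v = log_earnings(s_grid, ability, b) - r * s_grid - np.log(r)
    return s_grid[np.argmax(log_v)]

s_low = optimal_schooling(ability=0.15)   # closed form: (0.15 - 0.05) / 0.01 = 10
s_high = optimal_schooling(ability=0.20)  # closed form: (0.20 - 0.05) / 0.01 = 15
print(s_low, s_high)
```

Under this assumed form, higher-ability people choose more schooling, which is exactly why a raw comparison of earnings by schooling level mixes the return to schooling with ability.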
One-Dimensional Ability: Rosen’s Model

I Ability bias
I Are more educated people more productive because of their
schooling, or would they have earned more regardless of
their schooling?
I Ability is hard to measure.
Multidimensional Ability: Roy Model
I People differ markedly in their skills, talents, and interests.
I These differences will lead people to sort into different
education levels and jobs that they are best suited for.
I Example: Anita and Tara.
I Anita: Interested in focusing on very specific problems for
long periods; has poor interpersonal/communications skills
(well suited for academia) → PhD, academic researcher.
I Tara: Enjoys interacting with others and is a strong
communicator; doesn’t enjoy focusing on specific things for
long periods → BA, consultant.
I Sort into the education levels and professions that suit their
talents. Neither has higher “ability.”
I Comparing their earnings and calling this the return to a
PhD would be highly misleading.
Empirical Evidence on the Returns to
Education
I One of the first and most important contributions to how
economists think about the returns to schooling comes from
economist Jacob Mincer (1958, 1974).
I Human capital model:
I Individuals invest in education early in their lives and enjoy
the benefits throughout their working careers;
I Individuals can continue to invest in human capital after
school through on-the-job training.

I Mincer’s theoretical model yields a simple estimating
equation for the relationship between schooling, earnings,
and work experience, which is often called the human
capital earnings function.
The Human Capital Earnings Function
I The human capital earnings function (“Mincer equation”):

$\ln(Y) = \beta_0 + \beta_1 S + \beta_2 Exp + \beta_3 Exp^2 + U$,

where
I $Y$: earnings,
I $S$: years of education,
I $Exp$: number of years since completion of formal schooling,
I $U$: error term.
I $\beta_1$ shows the returns to another year of education.
I Estimation based on US Census data:
I 10-12% for whites;
I 9-15% for blacks.
I Returns have grown dramatically over time.
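To make the estimating equation concrete, here is a minimal sketch that simulates data from a Mincer equation and recovers the coefficients by OLS. All parameter values (a 10% return to schooling, the experience profile, the noise level) are assumptions chosen for illustration, not estimates from the Census data mentioned above.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Simulated workers (all parameter values are assumptions for illustration).
schooling = rng.integers(8, 21, size=n).astype(float)   # years of education S
experience = rng.uniform(0, 40, size=n)                 # years since schooling, Exp
log_earnings = (1.5 + 0.10 * schooling + 0.04 * experience
                - 0.0007 * experience**2 + rng.normal(scale=0.3, size=n))

# Design matrix [1, S, Exp, Exp^2] and OLS via least squares.
X = np.column_stack([np.ones(n), schooling, experience, experience**2])
b0, b1, b2, b3 = np.linalg.lstsq(X, log_earnings, rcond=None)[0]

print(f"estimated return to a year of schooling: {b1:.3f}")
print(f"experience profile: {b2:.3f} * Exp + {b3:.5f} * Exp^2")
```

Because schooling is assigned independently of the error term in this simulation, OLS recovers the assumed return; the next slides explain why that independence is unlikely to hold in real data.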
The Human Capital Earnings Function

Two issues with the Mincer equation:

I It assumes a linear relationship between schooling and log
earnings. In practice, an additional year of education may have
different effects depending on the educational level.
I Estimates of $\beta_1$ may not be interpreted as the causal effect
of education on earnings (omitted variable bias).
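The omitted variable bias can be written out explicitly. The following is a standard textbook derivation under a simplifying assumption that is not on the slide: unobserved ability $A_i$ enters the true model linearly with coefficient $\gamma$, and the experience terms are dropped to keep the algebra short.

```latex
% Assumed true model (A_i and \gamma are introduced here for illustration):
\ln Y_i = \beta_0 + \beta_1 S_i + \gamma A_i + U_i

% Regressing \ln Y_i on S_i alone, omitting A_i, gives
\operatorname{plim} \hat{\beta}_1^{OLS}
  = \beta_1 + \gamma \, \frac{\operatorname{Cov}(S_i, A_i)}{\operatorname{Var}(S_i)}
```

If ability raises earnings ($\gamma > 0$) and more able people obtain more schooling ($\operatorname{Cov}(S_i, A_i) > 0$), the OLS estimate overstates the causal return $\beta_1$.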
Modern Approaches

1. Control for observed differences across students: age,
race/ethnicity, geographic location, parental education and
income, and IQ scores.
I Unobserved characteristics, e.g., motivation and work ethic,
remain uncontrolled.
I Some types of ability cannot be measured by IQ scores.

2. Twins studies: compare differences across identical twins
in their earnings and educational attainment.
I Assumption: Educational differences across twins are
unrelated to their characteristics that affect earnings.
I Estimates of returns: 8-11%, similar to those found with the
first method.
I Concerns: (1) Twins may still differ, e.g., in preferences or
health; (2) Twins may seek to differentiate themselves from
each other.
Modern Approaches

3. Quasi-experiments: A policy/event that has the effect of
randomly increasing or decreasing educational attainment
among people, even though this was not the intent.
I Quasi-experiments produce variation in education that is
effectively random and uncorrelated with underlying student
abilities.

I Examples:
I Proximity to a college;
I Compulsory schooling laws;
I College tuition;
I Quarter of birth.
Angrist and Krueger (1991)
I One of the most influential studies on the returns to
schooling that attempts to overcome ability bias using a
natural experiment.
I Two fundamental aspects of the structure of the education
system in the US:
I Most school districts do not admit students to first grade
unless they will attain age six by January 1 of the calendar
year when they enter school. Therefore, students born early
in the calendar year are typically older when they enter
school than those born late in the year.
I Compulsory schooling laws require students to attend
school until they reach a specified birthday.
I Students born early in the calendar year attain the legal
dropout age after having attended school for a shorter
period of time than those born near the end of the year.
Angrist and Krueger (1991)

I Use quarter of birth indicators as instruments for schooling.
I Quarter of birth is related to educational attainment: Those
born in the first quarter obtain 0.1 fewer years of education
and are 2 percentage points less likely to graduate from
high school.

FIGURE I: Years of Education and Season of Birth, 1980 Census (x-axis: Year of Birth, 30-40). Note: Quarter of birth is listed below each observation.
FIGURE II: Years of Education and Season of Birth, 1980 Census (x-axis: Year of Birth, 40-50). Note: Quarter of birth is listed below each observation.
Angrist and Krueger (1991)

I Men born in the first quarter earn slightly less than men
born in surrounding months.
I The age-earnings profile is positively sloped for men ages
30-39, but fairly flat for men ages 40-49.

FIGURE V: Mean Log Weekly Wage, by Quarter of Birth. All Men Born 1930-1949; 1980 Census (x-axis: Year of Birth)
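TABLE III
PANEL A: WALD ESTIMATES FOR 1970 CENSUS-MEN BORN 1920-1929

                                   (1)            (2)               (3)
                                   Born in        Born in 2nd,      Difference
                                   1st quarter    3rd, or 4th       (std. error)
                                   of year        quarter of year   (1) - (2)

ln (wkly. wage)                    5.1484         5.1574            -0.00898
                                                                    (0.00301)
Education                          11.3996        11.5252           -0.1256
                                                                    (0.0155)
Wald est. of return to education                                    0.0715
                                                                    (0.0219)
OLS return to education                                             0.0801
                                                                    (0.0004)

PANEL B: WALD ESTIMATES FOR 1980 CENSUS-MEN BORN 1930-1939

                                   (1)            (2)               (3)
                                   Born in        Born in 2nd,      Difference
                                   1st quarter    3rd, or 4th       (std. error)
                                   of year        quarter of year   (1) - (2)

ln (wkly. wage)                    5.8916         5.9027            -0.01110
                                                                    (0.00274)
Education                          12.6881        12.7969           -0.1088
                                                                    (0.0132)
Wald est. of return to education                                    0.1020
                                                                    (0.0239)
OLS return to education                                             0.0709
                                                                    (0.0003)

Notes: The sample size is 247,199 in Panel A and 327,509 in Panel B.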
Angrist and Krueger (1991)
I IV estimates are very similar to the OLS estimates:
I Ability bias is very small.
I Those who are induced to stay in school by compulsory
schooling laws have particularly high returns to schooling
relative to the average student.

I An instrument estimates a local average treatment effect
(LATE): The treatment effect among those who are induced
to change their behavior because of an intervention or
natural experiment.
I The estimated effect is local to this group (“compliers”).
I It tells us little about the returns to schooling among
students who would never drop out (“always-takers”) or
who would drop out anyway (“never-takers”) regardless of
the compulsory schooling laws.
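With a single binary instrument (born in the first quarter or not), the IV estimate reduces to a Wald ratio: the difference in mean log wages divided by the difference in mean years of education. The sketch below applies that ratio to the group means reported in Angrist and Krueger's Table III; the function name `wald_estimate` is just a label chosen for this example.

```python
def wald_estimate(y_q1, y_rest, s_q1, s_rest):
    """Wald/IV estimate: difference in outcomes over difference in schooling."""
    return (y_q1 - y_rest) / (s_q1 - s_rest)

# Group means from Table III (first quarter vs. quarters 2-4).
wald_1970 = wald_estimate(y_q1=5.1484, y_rest=5.1574, s_q1=11.3996, s_rest=11.5252)
wald_1980 = wald_estimate(y_q1=5.8916, y_rest=5.9027, s_q1=12.6881, s_rest=12.7969)

print(f"1970 Census, men born 1920-1929: {wald_1970:.4f}")
print(f"1980 Census, men born 1930-1939: {wald_1980:.4f}")
```

The results are close to the published 0.0715 and 0.1020; small discrepancies come from using means rounded to four decimals.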
Lemieux and Card (2001): Canada

I Canadian World War II veterans benefited from an
extensive educational program.

I Because of differences in military enlistment rates, a much
lower fraction of Quebec men were eligible for these
benefits than men from other provinces.

I Examine patterns of education and earnings for
English-speaking men from Ontario and French-speaking
men from Quebec.
FIGURE 1: Proportion of men who served in WW II by year of birth (five-year moving average). Series: Quebec, French-speakers; Ontario, English-speakers. x-axis: Year of birth.
FIGURE 2: Full-time male university enrolment. Series: Total enrolment; Non-veterans only.
FIGURE 4: Fraction of men and women with some university, 1971 Census (five-year moving average). Panels: a. Men; b. Women. Series: Ontario; Quebec. x-axis: Birth year.
FIGURE 5: Labour market outcomes of men, 1971 Census (five-year moving average). Panels: a. Employment rates; b. Mean log annual earnings. Series: Ontario; Quebec. x-axis: Birth year.
TABLE 5
OLS and IV estimates of return to education using 1970 earnings

                                                  (1)       (2)       (3)
Models controlling for experience (or age)
1. OLS education coefficient                      0.070     0.070     0.070
                                                  (0.002)   (0.002)   (0.002)
2. IV using Ontario × age 18-21 in 1945
   a. Reduced-form education                      0.425     0.320     0.465
                                                  (0.093)   (0.096)   (0.101)
   b. Reduced-form earnings                       0.060     0.056     0.073
                                                  (0.021)   (0.022)   (0.023)
   c. IV estimate                                 0.141     0.175     0.157
                                                  (0.050)   (0.073)   (0.051)
3. IV using Ontario × age 18-24 in 1945
   a. Reduced-form education                      0.303     0.172     0.442
                                                  (0.076)   (0.082)   (0.088)
   b. Reduced-form earnings                       0.025     0.021     0.035
                                                  (0.017)   (0.019)   (0.020)
   c. IV estimate                                 0.081     0.125     0.080
                                                  (0.055)   (0.107)   (0.044)
4. IV estimates for women using Ontario × age 18-21 in 1945
   a. Reduced-form education                      -0.048    -0.221    -0.080
                                                  (0.110)   (0.114)   (0.122)
   b. Reduced-form earnings                       0.017     -0.002    0.009
                                                  (0.033)   (0.034)   (0.037)
   c. IV estimate                                 -0.360    0.007     -0.111
                                                  (1.189)   (0.153)   (0.524)

NOTES: Standard errors are in parentheses. Controls include weeks and hours per week plus: column 1:
quartic in experience (or age) and dummy for Quebec; column 2: quartic in experience (or age), dummy
for Quebec, and interaction of Quebec dummy with quartic in experience; column 3: full set of
experience (or age) dummies, dummy for Quebec, and interaction of Quebec dummy with quartic in
experience.
Lemieux and Card (2001): Canada

I Local average treatment effect.
I Shortfall: direct effect of WW II service on earnings.
I The direct effect of WW II service on earnings is likely to be
negative, so the IV results likely understate the true return
to education.
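The logic of the understatement can be made explicit with the ratio-of-reduced-forms view of IV. The symbols below ($\rho$ for the true return, $\pi$ for reduced-form coefficients, $\delta$ for the direct effect of service, $\Delta v$ for the Ontario-Quebec gap in the veteran share) are notation assumed for this sketch rather than taken from the paper.

```latex
% IV as a ratio of reduced forms, e.g. Table 5, row 2, column (1):
\hat{\rho}_{IV} = \frac{\hat{\pi}_{\text{earnings}}}{\hat{\pi}_{\text{education}}}
                = \frac{0.060}{0.425} \approx 0.141

% If service also affects earnings directly by \delta, and the instrument
% raises the veteran share by \Delta v > 0:
\pi_{\text{earnings}} = \rho \, \pi_{\text{education}} + \delta \, \Delta v
\quad \Longrightarrow \quad
\rho_{IV} = \rho + \frac{\delta \, \Delta v}{\pi_{\text{education}}}
```

With $\delta < 0$ and $\pi_{\text{education}} > 0$, the second term is negative, so $\rho_{IV} < \rho$.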
Coming Soon!

I The Quality of Education


I Hoekstra, Mark (2009), “The effect of attending the flagship
state university on earnings: A discontinuity-based
approach.” Review of Economics and Statistics, 91(4),
717-724.
ECON 361: Income & Inequality
Lecture 10: Education Quality

Sitian Liu
Queen’s University

October 15, 2020


Outline

Last class:

I Difficulty of estimating the causal effect of education on
earnings
I Empirical evidence on the returns to education

Today:

I Education quality
I Causal effects of education quality on students’
performance and future earnings: Hoekstra (2009) and
Angrist and Lavy (1999)
Education Quality Matters

I Studies have estimated the returns to education as
measured by individuals’ earnings:
I Estimates range from 6-12%, using different methods across
different time periods.

I Limitation: These studies treat education as a single good.
I Is the return to a year of schooling at an elite private
university the same as the return to a year of schooling at a
community college?

I Studying education quality provides insight into what types of
schools students should attend and what the returns may be to
subsidizing different types of schools with public funds.
School Quality

How to characterize one educational environment as being
higher quality than another?

I Use variation in education inputs.
I Resource levels: student-faculty ratios.
I Average SAT/ACT scores: the average pre-collegiate
academic ability of students at an institution.
I Instructional expenditures per student.
I Selectivity of undergraduate admissions.

I Distinguish among institutions by organizational structure.


Education Quality

How to characterize one educational environment as being
higher quality than another?

I Use variation in education inputs.
I Distinguish among institutions by organizational structure.
I Public vs. private.
I The degrees offered (two-year vs. four-year).
I Flagship schools and other public schools: University of
Michigan–Ann Arbor, University of Minnesota–Twin Cities,
etc.
The Causal Effect of School Quality

I Students select the schools they want to attend, and the
schools select the students they want.
I Factors for school choices: academics, social life, proximity
to home, athletics, specific program strength, etc.
I Comparing earnings among even observationally similar
students who attend schools of different quality amounts to
comparing students who differ in their preferences for these
attributes.
I There will be a bias if such preferences are related to labor
market outcomes.
The Causal Effect of School Quality
I Controlling directly for students’ academic preparation
and background characteristics.
I Different datasets and across different time periods.
I Examine how wages and earnings differ across quality tiers,
conditional on academic and student background controls.
I Students who attend a top-ranked private or public school
have earnings 20-25% higher than those of students who
attend a bottom-ranked public school.
I Students enrolling in a community college earn about 7%
less in the future than those beginning college at a four-year
school.
I Concern: unobserved student differences across schools,
such as differences in preferences.
The Causal Effect of School Quality

I Regression discontinuity design.
I Certain schools have admissions rules such that students
above a minimum SAT score or GPA will be
automatically admitted if they apply, and those below the
threshold are unlikely to be admitted.
I The threshold does not perfectly predict enrollment: e.g.,
exceptional athletes may be admitted below it, and admitted
students may choose to go elsewhere.

I Students cannot choose which side of the cutoff they are on
⇒ Students just above and below the cutoff will be identical
(on average).
I Compare earnings just above and below the cutoff.
I Attending the higher-quality school increases earnings by
about 20%.
I Hoekstra (2009) and Angrist and Lavy (1999)
Hoekstra (2009): Overview

I Examine the economic returns to college quality in the
context of attending the most selective public state
university (flagship).
I Compare the earnings of 28 to 33 year olds who were
barely admitted to the flagship to those of individuals who
were barely rejected.
I Selection bias: Attendance at a more selective university is
likely correlated with unobserved characteristics that
themselves will affect future earnings.
I Data: Combine administrative records from a large flagship
state university with earnings records collected by the state
through the unemployment insurance program.
Hoekstra (2009): Empirical Strategy

I Compare the earnings of individuals narrowly below the
cutoff with those who were narrowly above the cutoff.
I The randomness of scores leads to those just above and
below the cutoff being identical (on average) in terms of
their observed and unobserved characteristics.
I Distinguish the effect of enrollment at the flagship university
from other confounding factors, e.g., motivation, parental
support, as long as these factors are continuous at the
admission cutoff.
I Any discontinuous jump in earnings at the admission
cutoff can be interpreted as the causal effect of admission
to the flagship on earnings.
Hoekstra (2009): Empirical Strategy

I Threat: Applicants or the university can manipulate the
side of the cutoff on which applicants fall.
I Applicants who would barely miss the cutoff might retake
the SAT until they surpass the cutoff. (Cutoffs were not
made public.)
I The university might define the cutoff such that students with
above-average unobservables lie just above the line,
whereas those with below-average unobservables lie just
below the line.
Does the admission cutoff predict the enrollment decision of
applicants? Being just above the cutoff causes a large and
statistically significant increase in the probability of attending the
flagship university.

FIGURE 1: Fraction Enrolled at the Flagship State University (series: Local Average)
FIGURE 2: Natural Log of Annual Earnings for White Men Ten to Fifteen Years after High School Graduation (Fit with a Cubic Polynomial of Adjusted SAT Score). Estimated Discontinuity = 0.095 (z = 3.01). (x-axis: SAT Points Above the Admission Cutoff; y-axis: (Residual) Natural Log of Earnings; series: Predicted Earnings, Local Average)


Hoekstra (2009): Results

I At the cutoff, the likelihood of enrolling in the flagship
university increases by 39 percentage points.
I The discontinuity is fuzzy because not all admitted students
enroll, and some students below the cutoff are admitted.
I At the cutoff, earnings increase by 9.5%.
I Assuming that the earnings increase is solely due to the
jump in enrolling in the flagship university, the estimate of
the effect of enrolling in the flagship university on earnings
is 24% (0.095/0.39).
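The fuzzy regression discontinuity logic (jump in earnings divided by jump in enrollment) can be illustrated on simulated data. Everything in this sketch is assumed for illustration: the data-generating process, the 0.24 enrollment effect, the bandwidth of 100 SAT points, and the helper `jump_at_cutoff`, which fits a separate line on each side of the cutoff. It is not Hoekstra's specification, which uses a cubic polynomial.

```python
import numpy as np

def jump_at_cutoff(x, y, bandwidth):
    """Right-minus-left difference in fitted values at x = 0 (linear fit per side)."""
    right = (x >= 0) & (x <= bandwidth)
    left = (x < 0) & (x >= -bandwidth)
    fit_right = np.polyfit(x[right], y[right], deg=1)  # [slope, intercept]
    fit_left = np.polyfit(x[left], y[left], deg=1)
    return fit_right[1] - fit_left[1]

# Simulated applicants (assumed process): x is SAT points above the cutoff.
rng = np.random.default_rng(1)
n = 400_000
x = rng.uniform(-300, 350, size=n)
above = x >= 0
enroll = rng.random(n) < (0.20 + 0.39 * above)   # enrollment jumps by 0.39
log_earn = 0.0005 * x + 0.24 * enroll + rng.normal(scale=0.5, size=n)

first_stage = jump_at_cutoff(x, enroll.astype(float), bandwidth=100)
reduced_form = jump_at_cutoff(x, log_earn, bandwidth=100)
effect = reduced_form / first_stage   # fuzzy RD (Wald) estimate

print(f"jump in enrollment: {first_stage:.3f}")
print(f"jump in log earnings: {reduced_form:.3f}")
print(f"implied effect of enrolling: {effect:.3f}")
```

The ratio recovers roughly the assumed enrollment effect, mirroring the 0.095/0.39 calculation on the slide.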
Angrist and Lavy (1999)

I Parents and teachers generally prefer smaller classes:
promote student learning, pleasant environment.
I Policy interest: easier to manipulate than other school
inputs; school quality and the allocation of school
resources.
I Difficult to measure the causal effects of class size on pupil
achievement.
I The level of educational inputs differs substantially both
between and within schools;
I These differences are often associated with factors such as
remedial training or students’ socioeconomic
backgrounds.
Angrist and Lavy (1999): Maimonides
I The great twelfth century Rabbinic scholar, Maimonides: “If
there are more than forty, two teachers must be appointed”
(Hyamson, 1939, p. 58b).
I Since 1969, Maimonides’ maximum of 40 students rule has
been used to determine the division of enrollment cohorts
into classes in Israeli public schools.
I The rule generates a potentially exogenous source of
variation in class size: 41 students → 2 classes, 81
students → 3 classes, etc.
I Assuming grade cohorts are split up into classes of equal
size, the predicted class size $m$ with enrollment $e$ is

$m = \frac{e}{\operatorname{int}\left[\frac{e - 1}{40}\right] + 1}$
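The rule is easy to compute directly. The function below is a direct transcription of the formula above; the name `predicted_class_size` and the `cap` parameter are labels chosen for this sketch.

```python
def predicted_class_size(enrollment: int, cap: int = 40) -> float:
    """Maimonides' rule: enrollment split evenly into int((e - 1) / cap) + 1 classes."""
    if enrollment < 1:
        raise ValueError("enrollment must be a positive integer")
    n_classes = (enrollment - 1) // cap + 1
    return enrollment / n_classes

for e in (40, 41, 80, 81, 120, 121):
    print(e, predicted_class_size(e))
# 40 -> 40.0, 41 -> 20.5, 80 -> 40.0, 81 -> 27.0, 120 -> 40.0, 121 -> 30.25
```

The sharp drops at 41, 81, and 121 are the discontinuities that provide the identifying variation in class size.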
FIGURE I: Class Size in 1991 by Initial Enrollment Count, Actual Average Size and as
Predicted by Maimonides’ Rule

FIGURE II: Average Reading Scores by Enrollment Count, and the Corresponding Average
Class Size Predicted by Maimonides’ Rule
Angrist and Lavy (1999): Empirical Strategy

$\bar{y}_{sc} = X_s \beta + \alpha n_{sc} + \mu_s + \varepsilon_{sc}$,

I $\bar{y}_{sc}$: average score of class $c$ in school $s$,
I $X_s$: a vector of school characteristics, including enrollment,
I $n_{sc}$: the size of class $c$ in school $s$.

Class size is not randomly assigned! Instrument for $n_{sc}$ with the
predicted class size $m_{sc}$.
Angrist and Lavy (1999): Empirical Strategy
I msc is a deterministic function of enrollment, and enrollment
is almost certainly related to test scores for reasons other
than changing class sizes.
I Better schools might face increased demand if parents
choose districts based on school quality.
I More educated parents might try to avoid large-enrollment
schools.

I These effects are likely to be smooth (captured by $X_s$).
I Variation in test scores with enrollment has a rough
up-and-down pattern that mirrors Maimonides’ rule.
I First stage relationship:

$n_{sc} = X_s \pi_0 + \pi_1 m_{sc} + \xi_{sc}$.
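The two-stage procedure can be sketched on simulated data. The data-generating process below is entirely assumed (an unobserved school-quality term that raises both class size and scores, a true class-size effect of -0.3, a linear enrollment control); it is meant only to show why instrumenting actual class size with the predicted class size can remove the bias in OLS.

```python
import numpy as np

rng = np.random.default_rng(2)
n_obs = 20_000

# Assumed data-generating process (illustrative only).
enrollment = rng.integers(20, 201, size=n_obs)
predicted = enrollment / ((enrollment - 1) // 40 + 1)   # Maimonides' rule, m_sc
quality = rng.normal(size=n_obs)                        # unobserved confounder
class_size = predicted + 2.0 * quality + rng.normal(size=n_obs)
true_alpha = -0.3
score = (70 + 0.02 * enrollment + true_alpha * class_size
         + 3.0 * quality + rng.normal(scale=2.0, size=n_obs))

def ols(X, y):
    """Least-squares coefficients for y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

ones = np.ones(n_obs)
X = np.column_stack([ones, enrollment, class_size])   # regressors
Z = np.column_stack([ones, enrollment, predicted])    # instruments

alpha_ols = ols(X, score)[2]

# 2SLS: the first stage predicts class size from Z; the second stage uses the fit.
fitted_size = Z @ ols(Z, class_size)
alpha_iv = ols(np.column_stack([ones, enrollment, fitted_size]), score)[2]

print(f"OLS: {alpha_ols:.3f}   2SLS: {alpha_iv:.3f}   truth: {true_alpha}")
```

In this simulation OLS is biased toward zero because higher-quality schools have larger classes, while 2SLS recovers a value close to the assumed effect. Proper 2SLS standard errors would need a correction that this sketch omits.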


TABLE III
REDUCED-FORM ESTIMATES FOR 1991

                                     5th Graders                                        4th Graders
                       Class size    Reading comp.   Math            Class size    Reading comp.   Math
                       (1)    (2)    (3)    (4)     (5)    (6)     (7)    (8)    (9)    (10)    (11)   (12)

A. Full sample
Means                  29.9          74.4           67.3           30.3          72.5           68.9
(s.d.)                 (6.5)         (7.7)          (9.6)          (6.3)         (8.0)          (8.8)
Regressors
f_sc                   .704   .542   -.111  -.149   -.009  -.124   .772   .670   -.085  -.089   .038   -.033
                       (.022) (.027) (.028) (.035)  (.039) (.049)  (.020) (.025) (.031) (.040)  (.037) (.047)
Percent disadvantaged  -.076  -.053  -.360  -.355   -.354  -.338   -.054  -.039  -.340  -.340   -.292  -.282
                       (.010) (.009) (.012) (.013)  (.017) (.018)  (.008) (.009) (.013) (.014)  (.016) (.016)
Enrollment                    .043          .010           .031           .027          .001           .019
                              (.005)        (.006)         (.009)         (.005)        (.007)         (.009)
Root MSE               4.56   4.38   6.07   6.07    8.33   8.28    4.20   4.13   6.64   6.64    7.83   7.81
R2                     .516   .553   .375   .377    .247   .255    .561   .575   .311   .311    .204   .207
N                      2,019         2,019          2,018          2,049         2,049          2,049

B. Discontinuity sample
Means                  30.8          74.5           67.0           31.1          72.5           68.7
(s.d.)                 (7.4)         (8.2)          (10.2)         (7.2)         (7.8)          (9.1)
Regressors
f_sc                   .481   .346   -.197  -.202   -.089  -.154   .625   .503   -.061  -.075   .059   .012
                       (.053) (.052) (.050) (.054)  (.071) (.077)  (.050) (.053) (.056) (.063)  (.072) (.080)
Percent disadvantaged  -.130  -.067  -.424  -.422   -.435  -.405   -.068  -.029  -.348  -.343   -.306  -.291
                       (.029) (.028) (.027) (.029)  (.039) (.042)  (.029) (.028) (.032) (.034)  (.041) (.043)
Enrollment                    .086          .003           .041           .063          .007           .024
                              (.015)        (.015)         (.022)         (.014)        (.017)         (.022)
Root MSE               5.95   5.58   6.24   6.24    8.58   8.53    5.49   5.26   6.57   6.57    8.26   8.25
R2                     .360   .437   .421   .421    .296   .305    .428   .475   .299   .299    .178   .182
N                      471           471            471            415           415            415

The function f_sc is equal to enrollment/[int((enrollment - 1)/40) + 1]. Standard errors are reported in parentheses. Standard errors were corrected for within-school correlation
between classes. The unit of observation is the average score in the class.
TABLE IV
2SLS ESTIMATES FOR 1991 (FIFTH GRADERS)

                                     Reading comprehension                              Math
                                                       +/- 5                                       +/- 5
                                                       Discontinuity                               Discontinuity
                       Full sample                     sample        Full sample                   sample
                       (1)    (2)    (3)    (4)        (5)    (6)    (7)    (8)    (9)    (10)     (11)   (12)

Mean score             74.4                            74.5          67.3                          67.0
(s.d.)                 (7.7)                           (8.2)         (9.6)                         (10.2)
Regressors
Class size             -.158  -.275  -.260  -.186      -.410  -.582  -.013  -.230  -.261  -.202    -.185  -.443
                       (.040) (.066) (.081) (.104)     (.113) (.181) (.056) (.092) (.113) (.131)   (.151) (.236)
Percent disadvantaged  -.372  -.369  -.369             -.477  -.461  -.355  -.350  -.350           -.459  -.435
                       (.014) (.014) (.013)            (.037) (.037) (.019) (.019) (.019)          (.049) (.049)
Enrollment                    .022   .012                     .053          .041   .062                   .079
                              (.009) (.026)                   (.028)        (.012) (.037)                 (.036)
Enrollment squared/100               .005                                          -.010
                                     (.011)                                        (.016)
Piecewise linear trend                      .136                                          .193
                                            (.032)                                        (.040)
Root MSE               6.15   6.23   6.22   7.71       6.79   7.15   8.34   8.40   8.42   9.49     8.79   9.10
N                      2019                 1961       471           2018                 1960     471

The unit of observation is the average score in the class. Standard errors are reported in parentheses. Standard errors were corrected for within-school correlation between classes.
All estimates use f_sc as an instrument for class size.
Coming Soon!

I Minimum Wage
I Card, David, and Alan B. Krueger (1994), “Minimum wages
and employment: A case study of the fast-food industry in
New Jersey and Pennsylvania.” The American Economic
Review, 84(4), 772–793.
ECON 361: Income & Inequality
Lecture 11: The Minimum Wage

Prof. Sitian Liu


Queen’s University

October 19, 2020


Outline

Today:

I Background on the minimum wage
I The standard model to analyze the impact of the minimum
wage on employment
I Empirical Evidence
I Adams, Blackburn, and Cotti (2012)
I Card and Krueger (1994, 2000)
The Minimum Wage

U.S.

I The U.S. federal government introduced mandatory
minimum wages in 1938 as one of the provisions of the Fair
Labor Standards Act (FLSA).
I The nominal minimum wage was initially set at 25 cents an
hour, with 43% of nonsupervisory workers covered.
I The nominal minimum wage was $7.25/hour in 2016 and
the coverage has been greatly expanded.
Source: Labor Economics by George J. Borjas.
Source: U.S. Department of Labor (https://www.dol.gov/agencies/whd/minimum-wage/state).
The Minimum Wage

Canada

I In Canada, the 10 provinces and 3 territories have the
power to set the minimum wage. (It may be lower for liquor
servers or inexperienced employees.)
I The federal government in past years set its own minimum
wage for workers in federal jurisdiction industries (e.g.,
railways). Since 1996, it is set by the province/territory
where work is performed.
Figure: Minimum Hourly Wage Rates as of October 1st, 2018 (map of the 13 provinces and
territories; the rates shown range from $11.00 to $15.00)
Source: Retail Council of Canada, www.retailcouncilofcanada.org
Fraction of Employees Paid at the Minimum Wage Rate
by Province
A Standard Model

Source: Labor Economics by George J. Borjas.


A Standard Model

I Some workers are displaced from their current jobs and
become unemployed: \(E^* - \bar{E}\).
I Additional workers enter the labor market, cannot find jobs,
and are added to the unemployment rolls: \(E_S - E^*\).
I The unemployment rate is given by \((E_S - \bar{E})/E_S\), which
depends on the level of the minimum wage and the
elasticities of labor supply and labor demand.
I The minimum wage is presumably supposed to raise the
income of the least-skilled workers, who also become
particularly vulnerable to layoffs.
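A rough numerical sketch of the quantities above, assuming hypothetical linear labor demand and supply curves. The functional forms, the equilibrium wage of 8, and the minimum wage of 10 are illustrative assumptions, not numbers from the lecture or from Borjas.

```python
# Hypothetical linear curves, chosen only for illustration.
def labor_demand(w):
    """Workers that firms want to hire at wage w (assumed linear demand)."""
    return 1000 - 50 * w


def labor_supply(w):
    """Workers willing to work at wage w (assumed linear supply)."""
    return 200 + 50 * w


def minimum_wage_outcome(w_min, w_star=8.0):
    """Displacement, new entrants, and unemployment rate under a binding minimum.

    w_star is the competitive wage for the assumed curves
    (1000 - 50w = 200 + 50w gives w = 8). The sketch assumes w_min >= w_star.
    """
    e_star = labor_demand(w_star)  # E*: competitive employment
    e_bar = labor_demand(w_min)    # E-bar: employment at the minimum wage
    e_s = labor_supply(w_min)      # E_S: labor supplied at the minimum wage
    return {
        "displaced": e_star - e_bar,               # E* - E-bar
        "new_entrants": e_s - e_star,              # E_S - E*
        "unemployment_rate": (e_s - e_bar) / e_s,  # (E_S - E-bar) / E_S
    }


result = minimum_wage_outcome(w_min=10.0)
print(result)
```

With flatter (more elastic) curves, the same minimum wage would produce a larger unemployment rate, which is the sense in which the rate depends on the two elasticities.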
A Standard Model

I Compliance with the minimum wage law:
I Many employers do not comply with the law. In 2010, 2.5 million
workers (3.5% of all workers paid by the hour) were paid less
than $7.25 (the min. wage).
I Trivial penalties: firms that are caught pay the wage difference
for the last two years, so non-compliance amounts to an
interest-free loan (delayed payment), etc.
I The covered and uncovered sectors:
I The adverse employment effects of minimum wages may
be moderated by less-than-universal coverage.
A Standard Model
I Displaced workers migrate to the uncovered sector and
find jobs there: \(S_U \to S_U'\).
I Some workers employed in the uncovered sector find it
worthwhile to quit their current low-paying jobs and wait
around in the covered sector for a job: \(S_U \to S_U''\).
I Migration stops when the expected wage is the same
in the two sectors:
I \(\pi\): the probability that a worker who enters the covered
sector gets a job there.
I The expected wage in the covered sector =

\[ [\pi \times \bar{w}] + [(1 - \pi) \times 0] = \pi \bar{w}. \]

I The alternative is to stay in the uncovered sector, facing a
sure wage determined in equilibrium: \(w_U\).
I Free migration should lead to \(\pi \bar{w} = w_U\).
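The migration condition can be illustrated with a small sketch. The wage values below are hypothetical, and the uncovered-sector wage is taken as given here even though the model determines it in equilibrium.

```python
def expected_covered_wage(pi, w_bar):
    """Expected wage from entering the covered sector: pi * w_bar + (1 - pi) * 0."""
    return pi * w_bar + (1 - pi) * 0.0


def equilibrium_job_probability(w_bar, w_u):
    """Job-finding probability at which migration stops (pi * w_bar = w_u)."""
    return w_u / w_bar


w_bar = 10.0  # hypothetical minimum wage in the covered sector
w_u = 7.5     # hypothetical uncovered-sector wage (treated as given)

pi_star = equilibrium_job_probability(w_bar, w_u)
print(pi_star)                                # 0.75
print(expected_covered_wage(pi_star, w_bar))  # 7.5
```

If the job-finding probability were above 0.75 in this example, the covered sector would offer the higher expected wage and workers would keep moving there.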
Empirical Evidence

I The simplest model predicts that if the demand curve for
labor is downward sloping, an increase in the minimum wage
must reduce employment of the affected group.
I The size of the employment effect depends on the elasticity
of labor demand.
I Legislated changes in the minimum wage help identify the
labor demand elasticity.
I Empirical studies:
I The impact of minimum wages on teenagers, a group
clearly affected by the legislation.
I Increase in the minimum wage in New Jersey on April 1,
1992.
The Minimum Wage and Drunk Driving
(Adams, Blackburn, and Cotti, 2012)

I Much of the impact of the min. wage in the U.S. falls on
teenagers: employment opportunities & disposable income.
I Teenagers tend to spend the increased income resulting from a
min. wage increase on “non-necessities”: video games,
music, and alcohol.
I Motor vehicle accidents are the leading cause of death for
persons aged 16-20.
I 1/3 of these crashes are alcohol-related.
I 20% of people in this age group have driven under the
influence of alcohol.
I By adding to the disposable income of teenagers, min.
wage increases may raise the expenditures of teenagers on
alcohol and driving activities.
TABLE 1. PERCENTAGE OF WORKERS EARNING LOW HOURLY WAGES, BY AGE GROUP, 1998 AND 2006

Age Group      Minimum Wage   No More than $1     No More than $2
               or Less        above the Minimum   above the Minimum
1998
16–20          16.4%          49.2%               64.1%
21–25          5.6            17.1                28.1
26 and over    1.8            5.8                 9.9
2006
16–20          15.6           36.2                54.8
21–25          5.7            12.5                21.3
26 and over    1.6            3.8                 7.0

Notes: Calculations are based on the 1998 and 2006 outgoing rotation groups of the Current Population Survey. Minimum-wage comparisons are made on the basis of the higher minimum wage (federal or state) effective in that year, based on state of residence. The reported statistics are the percentage of all workers paid by the hour at a rate in the stated range.
FIGURE 2. ALCOHOL-RELATED FATAL ACCIDENT RATES FOR 16 TO 20 YEAR OLDS: STATES BEFORE AND AFTER MINIMUM WAGE INCREASE
(Vertical axis: ARFA rate. Horizontal axis: years since first MW increase, -4 to 2. Series: treatment states, control states.)
A Case Study: New Jersey vs. Pennsylvania
(Card and Krueger, 1994)

I One of the most cited papers in the “minimum wage”
debate.

I Important ingredients for a good paper:
I Simple and transparent methodology.
I Novel data.
I Counter-intuitive finding.
I A Queen’s alumnus is an author! (David Card, B.A. in 1978)
Card and Krueger (1994): Overview

I How do employers in a low-wage labor market respond to
an increase in the min. wage?
I Although the prediction from conventional economic theory
is unambiguous, empirical findings are mixed!
I Studies in the 1970s based on teenage employment rates
usually confirmed the prediction.
I Several studies have failed to detect a negative employment
effect: the 1990-1991 increases in the federal min. wage and
the increase in the min. wage in California.

I The paper presents new evidence on the effect of min. wages on
establishment-level employment outcomes in the fast-food
industry.
Card and Krueger (1994): Methodology

I On April 1, 1992, New Jersey (NJ) increased its min. wage
to $5.05 per hour, the highest min. wage in the U.S.

I The neighboring state of Pennsylvania (PA) was unaffected
and kept the min. wage at $4.25, the federal level.

I Compare employment, wages, and prices at stores in NJ
and PA before and after the rise in the min. wage.

I Compare within NJ between initially high-wage stores
(unaffected by the min. wage increase) and other stores.
Card and Krueger (1994): Methodology

I Consider a large number of fast-food establishments (e.g.,
Burger King and KFC) on both sides of the NJ-PA state line
prior to and after the NJ min. wage change.
I Restaurants on both sides face the same changes in economic
conditions, such as seasonal shifts in consumer demand
for fried chicken and hamburgers.
I Comparing the employment change in the restaurants on
both sides can “net out” the effect of changes in economic
conditions and isolate the impact of the min. wage.
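The "netting out" logic above is a difference-in-differences comparison. A minimal sketch follows; the store averages are hypothetical round numbers chosen for illustration, not figures from the paper.

```python
def diff_in_diff(treat_before, treat_after, control_before, control_after):
    """Change in the treated group minus change in the control group."""
    return (treat_after - treat_before) - (control_after - control_before)


# Hypothetical average employment per store (not taken from the paper).
nj_before, nj_after = 20.0, 21.0  # treated: New Jersey
pa_before, pa_after = 23.0, 21.0  # control: Pennsylvania

effect = diff_in_diff(nj_before, nj_after, pa_before, pa_after)
print(effect)  # 3.0
```

In this made-up example NJ employment rises by 1 while PA employment falls by 2. The PA change stands in for what would have happened in NJ without the min. wage increase, so the estimated effect is 1 - (-2) = 3.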
Card and Krueger (1994): Data

I Combine phone calls and site visits to create a data set on
fast-food establishments.
I The 1st wave of interviews was done in late February and
early March 1992, before the min. wage increase.
I The 2nd wave of interviews was done in November and
December 1992, after the min. wage increase.
FIGURE 1. DISTRIBUTION OF STARTING WAGE RATES
(Panels: February 1992 and November 1992. Horizontal axis: wage range. Series: New Jersey, Pennsylvania.)
Source: Card and Krueger (1994), vol. 84, no. 4.
Card and Krueger (1994): Diff-in-Diff

I Method 1: Compare New Jersey and Pennsylvania.

\[ \Delta E_i = \alpha + \beta X_i + \gamma NJ_i + \epsilon_i \]

I Method 2: Compare initially high-wage and low-wage
stores within New Jersey.

\[ \Delta E_i = \alpha' + \beta' X_i + \gamma' GAP_i + \epsilon'_i \]

I \(\Delta E_i\): change in employment between the two waves at store i;
I \(X_i\): a set of characteristics of store i;
I \(NJ_i\): a dummy variable that equals 1 for stores in NJ;
I \(GAP_i\): 0 for stores in PA and for stores in NJ with an initial wage
\(w_{1i} \geq 5.05\), and \((5.05 - w_{1i})/w_{1i}\) for other stores in NJ.
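The wage-gap variable can be sketched as a small function. The function and argument names are illustrative assumptions; the rule itself follows the definition of the gap given above.

```python
NEW_MINIMUM = 5.05  # New Jersey min. wage after April 1, 1992


def wage_gap(initial_wage, in_new_jersey):
    """Proportional raise needed to bring the wave-1 starting wage up to $5.05.

    The gap is 0 for Pennsylvania stores and for NJ stores already
    paying at least the new minimum.
    """
    if not in_new_jersey or initial_wage >= NEW_MINIMUM:
        return 0.0
    return (NEW_MINIMUM - initial_wage) / initial_wage


print(round(wage_gap(4.25, in_new_jersey=True), 3))  # 0.188
print(wage_gap(5.25, in_new_jersey=True))            # 0.0
print(wage_gap(4.25, in_new_jersey=False))           # 0.0
```

A NJ store that started at the old minimum of $4.25 therefore needed a raise of about 19% to comply, which is the largest possible gap in the sample.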
b Difference in employment between low-wage ($4.25 per hour) and high-wage (≥ $5.00 per hour) stores; and difference in employment between midrange ($4.26-$4.99 per hour) and high-wage stores.

Card and Krueger (1994): Results


Subset of stores with available employment data in wave 1 and wave 2.
In this row only, wave-2 employment at four temporarily closed stores is set to 0. Employment changes are based on the subset of stores with available employment data in wave 1 and wave 2.

TABLE 4. REDUCED-FORM MODELS FOR CHANGE IN EMPLOYMENT

                                                     Model
Independent variable                  (i)      (ii)     (iii)    (iv)     (v)
1. New Jersey dummy                   2.33     2.30     -        -        -
                                      (1.19)   (1.20)
2. Initial wage gap (a)               -        -        15.65    14.92    11.91
                                                        (6.08)   (6.21)   (7.39)
3. Controls for chain and             no       yes      no       yes      yes
   ownership (b)
4. Controls for region (c)
5. Standard error of regression
6. Probability value for controls (d)

Notes: Standard errors are given in parentheses. The sample consists of 357 stores
with available data on employment and starting wages in waves 1 and 2. The
dependent variable in all models is change in FTE employment. The mean and
standard deviation of the dependent variable are -0.237 and 8.825, respectively. All
models include an unrestricted constant (not reported).
(a) Proportional increase in starting wage necessary to raise starting wage to new
minimum rate. For stores in Pennsylvania the wage gap is 0.
Card and Krueger (1994): Results
I The min. wage change increases employment by about 2.3
workers in the typical fast-food restaurant.
I The average gap among NJ stores (Gapi ) is 0.11. The
estimate in Column (iii) implies a 1.72 increase in
employment in NJ relative to PA.
I Why does this evidence differ so sharply from earlier results?
I The employment effect of the min. wage may be negative but
small. It might be hard to detect the reduction in
employment in a rapidly changing economic environment
with very noisy data.
I The sector may not be representative of the low-wage labor
market. E.g., the number of employees per store may be
relatively fixed (production technology), and a higher min. wage
may instead discourage national chains from
opening additional stores.
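The 1.72 figure quoted above is the wage-gap coefficient multiplied by the average gap. The short check below reproduces it from the numbers on the slides. The ratio of the New Jersey dummy estimate to its standard error is added only as a rough gauge of precision and is not a calculation that appears in the lecture.

```python
gap_coefficient = 15.65  # initial wage gap estimate, column (iii)
average_gap_nj = 0.11    # average gap among NJ stores, as stated above

implied_effect = gap_coefficient * average_gap_nj
print(round(implied_effect, 2))  # 1.72

nj_estimate, nj_std_error = 2.33, 1.19  # New Jersey dummy, column (i)
t_ratio = nj_estimate / nj_std_error
print(round(t_ratio, 2))  # 1.96
```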
Card and Krueger (2000): Revisit

I Replication and reanalysis are important endeavors in
economics, especially when new findings run counter to
conventional wisdom.
I Unemployment-insurance (UI) payroll-tax records:
employment records reported quarterly by employers to
their state employment security agencies for UI tax
purposes.
Card and Krueger (2000)
FIGURE 1. AREAS OF NEW JERSEY AND PENNSYLVANIA COVERED BY ORIGINAL SURVEY AND BLS DATA
(Map legend: original 7 counties; additional 7 counties; number of restaurants in original survey.)
Source: The American Economic Review, December 2000, p. 1400.
Card and Krueger (2000)
FIGURE 2. EMPLOYMENT IN NEW JERSEY AND PENNSYLVANIA FAST-FOOD RESTAURANTS, OCTOBER 1991 TO SEPTEMBER 1997
(Series: NJ; PA, 7 counties; PA, 14 counties.)

Note: Vertical lines indicate dates of original Card-Krueger survey and the October 1996 federal minimum-wage increase.
Source: Authors' calculations based on BLS ES-202 data. The American Economic Review, December 2000, p. 1406.
Card and Krueger (2000)
TABLE 2. BASIC REGRESSION RESULTS; BLS ES-202 FAST-FOOD DATA AND CARD-KRUEGER SURVEY DATA

                                              Dependent variable:
                                              Change in levels      Proportionate change
Explanatory variables                         (1)        (2)        (3)        (4)

A. All of New Jersey and 7 Pennsylvania Counties, BLS Data
New Jersey indicator                          0.536      0.225      0.007      0.009
                                              (1.017)    (1.029)    (0.029)    (0.029)
Chain dummies and subunit dummy variable      No         Yes        No         Yes
Standard error of regression                  10.09      9.99       0.286      0.281
R^2                                           0.001      0.029      0.000      0.046

B. All of New Jersey and 14 Pennsylvania Counties, BLS Data
New Jersey indicator                          0.946      0.272      0.045      0.032
                                              (0.856)    (0.859)    (0.024)    (0.024)
Chain dummies and subunit dummy variable      No         Yes        No         Yes
Standard error of regression                  10.80      10.63      0.303      0.294
R^2                                           0.002      0.042      0.005      0.071

C. Original Card-Krueger Survey Data
New Jersey indicator                          2.411      2.488      0.029      0.030
                                              (1.323)    (1.323)    (0.050)    (0.049)
Chain and company-ownership dummies           No         Yes        No         Yes
Standard error of regression                  10.28      10.25      0.385      0.382
R^2                                           0.009      0.025      0.001      0.024

Notes: Each regression also includes a constant. Sample size is 564 for panel A, 687 for panel B, and 384 for panel C. Subunit
dummy variable equals one if the reporting unit is a subunit of a multiunit employer. For comparability with the BLS data,
employment in the CK sample is measured by the total number of full- and part-time employees. Standard errors are in
parentheses.
Source: Card and Krueger (2000), vol. 90, no. 5, p. 1403.
Coming Soon!

I The labor market impact of immigration
I Card, David (1990), “The impact of the Mariel Boatlift on the
Miami labor market.” Industrial and Labor Relations Review,
43(2), 245–257.
ECON 361: Income & Inequality
Lecture 12: The Labor Market Impact of
Immigration

Prof. Sitian Liu


Queen’s University

October 22, 2020


Outline

Last class:

I The impact of the minimum wage on employment: model
and empirics

Today:

I Facts on immigration
I A simple model
I Empirical evidence
I Spatial correlation
I National labor market
Foreign-Born as a Share of Total Population (%)
Country        1981    1998    2009    2017
Australia      20.6    23.2    26.5    28.1
Austria        3.9     8.5     15.5    19.0
Belgium        9.0     8.7     13.0    16.6
Canada         16.1    17.8    19.6    20.5
Denmark        2.0     4.8     7.5     11.2
Finland        0.3     1.6     4.4     6.5
France         6.8     6.3     11.6    12.7
Germany        7.5     8.9     12.9    15.5
Italy          0.6     1.9     7.1     10.2
Netherlands    3.8     4.2     11.1    12.5
Norway         2.1     3.7     10.9    15.1
Spain          0.5     1.9     14.3    13.0
Sweden         5.0     5.6     14.4    18.0
UK             2.8     3.8     11.3    14.2
US             6.2     10.8    12.7    13.5
Note: The foreign-born share grew in every listed country between 1981 and 2017 (faster in some
European countries than in the U.S.). Data source: OECD.


Number of People Obtaining Lawful
Permanent Resident Status in the U.S.
(Figure: legal inflows in thousands, 0 to 2,000, by year, 1800 to 2000.)

Source: Yearbook of Immigration Statistics (Table 1).


Three Major Phases

I 1880-1924: Great Migration, with massive inflows from
Southern and Eastern Europe.

I 1924-1965: Migration restrictions under the national origin
quota system.
I Prevented immigration from Asia.
I Germany and the UK were favored relative to other European
countries.

I 1965-today: Immigration and Nationality Act of 1965
(Hart-Celler Act).
I Relatives of U.S. citizens and green card holders are privileged.
I Professionals are privileged.
A Simple Model

I The central issue in the immigration debate revolves around
the impact of immigration on the employment opportunities
of native-born workers.
I The simplest model starts by assuming that immigrants and
natives are perfect substitutes in production.
I The two groups compete for the same types of jobs.
I In the short run, capital is fixed.
I As immigrants enter the labor market, the supply curve
shifts out.
I Fewer native workers are willing to work at the lower wage.
I Immigration reduces the wage and employment of
native-born workers.
Source: Labor Economics by George J. Borjas.
A Simple Model

I Immigration frees up the more skilled native workforce to
perform tasks that make better use of their talents ⇒
increases native productivity.
I Immigrants and natives complement each other in the labor
market.
I Immigrants raise the value of marginal product of natives,
shifting up their labor demand curve.
I More natives find it worthwhile to work, because the higher
wage rate increases their incentive to enter the labor
market.
Source: Labor Economics by George J. Borjas.
A Simple Model

The Short Run and the Long Run Impacts:

I Suppose that immigrants and natives are perfect substitutes.
I In the short run, the immigrant supply shock means that
employers can hire workers at a lower wage, raising the
returns to capital and increasing profits.
I The increased profitability attracts capital into the market
(e.g., old firms expand and new firms open up).
I The increased capital stock shifts the labor demand curve
to the right, attenuating the initial negative wage impact of
immigration.
A Simple Model

By how much will the demand curve shift to the right in the long
run? It depends on the production technology!

I Suppose that production can be described by
the Cobb-Douglas production function:

\[ q = A K^{\alpha} E^{1 - \alpha} \]

I A: constant (total factor productivity); K: capital; E: labor;
\(\alpha \in (0, 1)\).
I Constant returns to scale.
I Suppose the price of capital is r, the wage is w, and the price
of output is $1.
A Simple Model
I Profit maximization in a competitive labor market requires that
I r equals the value of the marginal product of capital.
I w equals the value of the marginal product of labor.

\[ r = 1 \times \alpha A K^{\alpha - 1} E^{1 - \alpha} = \alpha A \left( \frac{K}{E} \right)^{\alpha - 1} \]

\[ w = 1 \times (1 - \alpha) A K^{\alpha} E^{-\alpha} = (1 - \alpha) A \left( \frac{K}{E} \right)^{\alpha} \]

I Immigration: \(\uparrow E \to \uparrow r\) and \(\downarrow w\) in the short run.
I Over time, the higher return to capital will stimulate an increase
in the capital stock K.
I In the long run, after K adjusts fully to the supply shock, the
rate of return to K falls back to its “normal” level.
I (K/E) therefore returns to its initial value in the long run, so w
also returns to its initial level.
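The short-run and long-run predictions of the Cobb-Douglas setup can be checked numerically. The parameter values, the 10% supply shock, and the function name below are hypothetical choices for illustration.

```python
def factor_prices(A, K, E, alpha):
    """Return (r, w) for q = A * K**alpha * E**(1 - alpha) with output price 1."""
    k_per_worker = K / E
    r = alpha * A * k_per_worker ** (alpha - 1)  # value of marginal product of capital
    w = (1 - alpha) * A * k_per_worker ** alpha  # value of marginal product of labor
    return r, w


A, alpha = 1.0, 0.3    # hypothetical technology parameters
K0, E0 = 100.0, 100.0  # hypothetical initial capital and labor

r0, w0 = factor_prices(A, K0, E0, alpha)      # before immigration
E1 = 1.10 * E0                                # 10% immigrant supply shock
r_sr, w_sr = factor_prices(A, K0, E1, alpha)  # short run: K fixed
K1 = K0 * E1 / E0                             # long run: K/E restored
r_lr, w_lr = factor_prices(A, K1, E1, alpha)

print(f"before:    r={r0:.4f}, w={w0:.4f}")
print(f"short run: r={r_sr:.4f}, w={w_sr:.4f}")
print(f"long run:  r={r_lr:.4f}, w={w_lr:.4f}")
```

In the short run r rises and w falls. Once capital has grown in proportion to labor, both factor prices are back at their initial values, because each depends only on the ratio K/E.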
Source: Labor Economics by George J. Borjas.
Spatial Correlations
I Empirical implication: Determine if immigrants and natives
are complements or substitutes in production by examining
whether natives earn more or less in labor markets where
immigrants cluster.
I Compare native earnings in cities where immigrants are a
substantial fraction of the workforce (e.g., LA or NY) with
native earnings in cities where immigrants are a relatively
small fraction (e.g., Pittsburgh or Nashville).

\[ w_{it} = \beta p_{it} + \text{Other variables} \]

I \(w_{it}\): native wage in city i at time t;
I \(p_{it}\): the percent of the workforce in city i at time t that is foreign-born.
I Other variables: native skills, industrial composition, city
fixed effect, etc.
I \(\beta\): cross-city correlation between wages and
immigration (the spatial correlation).
Spatial Correlations
I Most empirical studies report a negative, but weak,
correlation between local wages and immigration.
I β (the coefficient on the immigrant share) need not measure
the causal effect of immigration.
I The supply shock is not randomly distributed across cities.
I Immigrants want to settle in high-wage cities with robust
labor markets → this generates a positive correlation between
immigration and wages.
I Remedy: an instrument that leads to an exogenous change in the
number of immigrants settling in a given city. The instrument
must have nothing to do with regional wage differences.
I General equilibrium effects.
I Natives may respond to the wage impact by moving to other
cities.
I Immigration then affects every city, not just the ones that actually
receive immigrants.
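The non-random settlement problem can be illustrated with a small simulation. Everything here is an assumption made for illustration: the true effect of -0.5, the strength with which immigrants are drawn to booming cities, and the data-generating process itself. The point is only that a simple cross-city regression can look "negative but weak" even when the causal effect is sizable.

```python
import random


def ols_slope(x, y):
    """Slope from a simple regression of y on x (with an intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    return cov / var


random.seed(0)
TRUE_EFFECT = -0.5  # assumed causal effect of the immigrant share on wages
n_cities = 5000

immigrant_share, wage = [], []
for _ in range(n_cities):
    demand = random.gauss(0, 1)                # unobserved local labor demand
    share = 0.8 * demand + random.gauss(0, 1)  # immigrants settle in booming cities
    w = TRUE_EFFECT * share + demand + random.gauss(0, 1)
    immigrant_share.append(share)
    wage.append(w)

estimate = ols_slope(immigrant_share, wage)
print(f"true effect: {TRUE_EFFECT}, spatial correlation estimate: {estimate:.3f}")
```

Because strong local demand raises both wages and the immigrant share, the estimated slope is pulled toward zero relative to the assumed true effect. An instrument that moves the immigrant share without being related to local demand would be needed to recover it.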
Card (1990): The Mariel Boatlift
I A natural experiment where a large number of immigrants
are randomly dropped in a particular location.
I On April 20, 1980, Fidel Castro declared that Cuban
nationals wishing to move to the U.S. could leave freely
from the port of Mariel.
I By September 1980, about 125,000 Cubans had chosen to
undertake the journey.
I Almost overnight, Miami’s labor force had unexpectedly
grown by 7%.
Card (1990): Data

I Individual-level data from the 1978-85 Current Population
Survey (CPS).
I Large sample of the Miami metropolitan area.
I Cubans are separately identified in the CPS.
I Wage and unemployment for both Cubans and
non-Cubans.
I Comparison cities: Atlanta, Los Angeles, Houston, and
Tampa-St. Petersburg.
I Large Black and Hispanic populations (similar to Miami).
I The impact of immigrants often focuses on minorities.
I Parallel trend (Angrist and Krueger, 1999).
Fig. 1. Changes in employment in Miami and comparison cities. (Horizontal axis: year, 1970 to 1998. Series: Miami; 4 comparison cities. Marked events: “Mariel Boatlift” and “Mariel Boatlift that didn't happen”.) Source: authors' calculations from BLS State and Area Employment, Hours, and Earnings Establishment Survey.
Source: Angrist and Krueger (1999), p. 1297.
Card (1990): Results
Table 4. Differences-in-differences estimates of the effect of immigration on unemployment (a)

Group                               1979          1981          1981-1979
                                    (1)           (2)           (3)
Whites
(1) Miami                           5.1 (1.1)     3.9 (0.9)     -1.2 (1.4)
(2) Comparison cities               4.4 (0.3)     4.3 (0.3)     -0.1 (0.4)
(3) Miami-Comparison Difference     0.7 (1.1)     -0.4 (0.95)   -1.1 (1.5)

Blacks
(4) Miami                           8.3 (1.7)     9.6 (1.8)     1.3 (2.5)
(5) Comparison cities               10.3 (0.8)    12.6 (0.9)    2.3 (1.2)
(6) Miami-Comparison Difference     -2.0 (1.9)    -3.0 (2.0)    -1.0 (2.8)

(a) Notes: Adapted from Card (1990, Tables 3 and 6). Standard errors are shown in parentheses.
Source: Angrist and Krueger (1999), p. 1298.

Table 4 illustrates DD estimation of the effect of Boatlift immigrants on unemployment
rates, separately for whites and blacks. The first column reports unemployment rates in
1979, the second column reports unemployment rates in 1981, and the third column
reports the 1981-1979 difference. The rows give numbers for Miami, the comparison
cities, and the difference between them.

The results suggest that average employment in Miami was
barely affected by the Mariel supply shock.
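The difference-in-differences entries can be reproduced directly from the 1979 and 1981 unemployment rates in the table above. The sketch below uses the table's rates and the standard errors reported for rows (3) and (6). The ratio of each estimate to its standard error is added here only as a rough gauge and is not a statistic reported on the slides.

```python
def did(treat_pre, treat_post, control_pre, control_post):
    """Change in the treated city minus change in the comparison cities."""
    return (treat_post - treat_pre) - (control_post - control_pre)


# Unemployment rates (%) as (1979, 1981), plus the reported standard error
# of the Miami-Comparison difference in column (3).
groups = {
    "Whites": {"miami": (5.1, 3.9), "comparison": (4.4, 4.3), "se": 1.5},
    "Blacks": {"miami": (8.3, 9.6), "comparison": (10.3, 12.6), "se": 2.8},
}

results = {}
for name, g in groups.items():
    estimate = did(*g["miami"], *g["comparison"])
    results[name] = (round(estimate, 1), estimate / g["se"])
    print(name, results[name][0], f"ratio to s.e.: {results[name][1]:.2f}")
```

Both estimates are negative and small relative to their standard errors, which is consistent with reading the results as showing little effect of the supply shock on unemployment.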
The Mariel Boatlift: Revisit

I The explosion in refugee flows worldwide has sparked
renewed interest in the Mariel experience.
I The recent research that examines data at a more
meticulous level reaches conflicting findings.
I Almost 2/3 of the Mariel refugees didn’t have high school
diplomas. The supply shock increased the number of high
school dropouts in the Miami labor market by almost 20%.
I Borjas (2017): Compare the average wage of prime-age,
non-Hispanic men without a high school diploma in Miami
and in the rest of the country.
Borjas (2017)
Figure 2. Log Wage of High School Dropouts, 1972–2003
Source: Data are drawn from the March CPS files. (ILR Review, p. 1086.)

Borjas (2017)
Figure 3. Trends in the Wage of Low-Skill Workers in the March CPS, 1977–1992
A. Log weekly wage of high school dropouts
The Mariel Boatlift: Revisit

I Borjas (2017) finds that the wage of high school dropouts in
Miami dropped dramatically, by 10 to 30%.
I The wage of this group appears to have taken a nosedive
after 1980, and it took a decade for it to recover.
I Other studies claim that the trend in the wage of low-skilled
workers in Miami depends on the definition of the “low-skill”
labor force and on changes in racial composition.
I Sampling error.
I The increasing importance of refugee flows throughout the
world guarantees that researchers will fine-tune the analysis
and search for alternative data sources.
Immigration and the National Labor Market

I The entry of immigrants into a particular city may lower the
wage of competing workers, but this is unlikely to be the
end of the story.
I Natives have incentives to respond to the changed
economic landscape (e.g., move to other cities).
I The shift in native supply would diffuse the impact of
immigration over the national economy.
I Comparisons of geographic wage differences might provide
little information about the true wage impact of immigration.
I Relate the wage changes experienced by specific skill
groups to the number of immigrants that entered each of
those groups.
Source: Labor Economics by George J. Borjas.
Immigration and the National Labor Market

I The regression line in the previous figure suggests that a
10% increase in the size of the skill group reduces the
wage of that group by 3-4%.
I Some studies use a model-based approach to estimate the
wage effect on specific skill groups (next table).
I Allowing for complementarities between immigrants and
natives with observationally similar skills attenuates the
adverse wage impact of immigration (not shown here).
Source: Labor Economics by George J. Borjas.
Coming Soon!

I The decision to immigrate
I Reading: Abramitzky, Ran, Leah Platt Boustan, and
Katherine Eriksson (2012), “Europe’s tired, poor, huddled
masses: Self-selection and economic outcomes in the age
of mass migration.” American Economic Review, 102(5),
1832–1856.

I Immigrant performance
