14.75 CAUSAL INFERENCE REVIEW: REGRESSION DISCONTINUITY DESIGNS

Regression Discontinuity Designs


We will first analyze what is called a “Sharp Regression Discontinuity Design (RDD).” In a sharp RDD,
we have a treatment we are interested in analyzing, and we have a variable and cutoff (e.g. a test score on
an entrance exam, or a distance from a geographic boundary) such that all individuals on one side of the
cutoff get the treatment, and all individuals on the other side do not. Under certain assumptions, we can
then look at the change in the outcome variable right at the cutoff: this change is the effect of the treatment
for those individuals who are right at the cutoff. This will appear as a discontinuous jump in the outcome,
which is why we call it regression discontinuity. We often represent regression discontinuities using graphs,
which look something like this (from Lee and Lemieux’s 2010 review article):

You can ignore the A and B; the key thing to note is that c is the cutoff and τ is the estimated treatment
effect. However, to interpret the discontinuous jump at the cutoff as the treatment effect (for individuals
at the cutoff), we need to make one critical assumption. In order to understand this assumption, we first
need to spend a moment explaining the concept of “potential outcomes.” Suppose there is some treatment
(getting a vaccine, going to a certain school, being inside the mita, etc.), and we are interested in the effects
of that treatment on outcomes. “Potential outcomes” refers to what an individual’s outcome would be if
they were treated or untreated: Yi (1) denotes person i’s outcome if she is treated, and Yi (0) denotes her
outcome if she is untreated. The difficulty is that we only observe one of these two outcomes: we observe
Yi (1) if she is treated and we observe Yi (0) if she is untreated. We do not observe the counterfactual, but
it is important to think about the counterfactual if we want to estimate causal effects.
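To make the missing-counterfactual problem concrete, here is a minimal simulated sketch. The sample size, the constant effect of 2, and the coin-flip treatment are all made-up assumptions for illustration; none of them come from these notes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000                                   # hypothetical sample size

# Both potential outcomes exist for every person; we know them only because we simulate.
y0 = rng.normal(0.0, 1.0, size=n)          # Y_i(0): outcome if untreated
y1 = y0 + 2.0                              # Y_i(1): outcome if treated (assumed effect of 2)

treated = rng.random(n) < 0.5              # assumed coin-flip treatment
y_obs = np.where(treated, y1, y0)          # the data reveal only one of the two

# What an analyst actually sees: the counterfactual is missing for each person.
y0_seen = np.where(treated, np.nan, y0)
y1_seen = np.where(treated, y1, np.nan)
both_seen = int(np.sum(~np.isnan(y0_seen) & ~np.isnan(y1_seen)))

print("share treated:", treated.mean())
print("people with both outcomes observed:", both_seen)
```

In real data only `y_obs` and `treated` are available, so the individual effect `y1 - y0` can never be computed directly.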
Now that we’ve defined potential outcomes, we can state the critical assumption for regression discontinuity.
Regression discontinuity requires that the conditional expectation of the potential outcomes is continuous
at the cutoff. Formally, we can write this as

\[
\lim_{x \to c^+} E[Y_i(0) \mid X_i = x] = \lim_{x \to c^-} E[Y_i(0) \mid X_i = x]
\]
AND
\[
\lim_{x \to c^+} E[Y_i(1) \mid X_i = x] = \lim_{x \to c^-} E[Y_i(1) \mid X_i = x]
\]

where X is the assignment variable and c is the cutoff, such that individuals are assigned to the treatment
if X ≥ c. This can be represented with the following figure:

Again, we can ignore the labeled points, and just focus on the curves. The curve on top denotes the
average potential outcomes under treatment, and the bottom curve denotes average potential outcomes with
no treatment. We only observe the top curve to the right of the cutoff because everyone to the right is treated,
and we only observe the bottom curve to the left of the cutoff because nobody to the left is treated. However,
if we assume that these curves are both continuous at the cutoff, then we can interpret the discontinuous
jump at the cutoff as the treatment effect for people at the cutoff.
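The step from continuity to the treatment effect can be written out explicitly. The sketch below writes $Y_i$ for the observed outcome (notation these notes do not define, so treat it as an assumption) and reads continuity as saying that each conditional expectation's one-sided limits equal its value at $c$.

```latex
\begin{aligned}
\lim_{x \to c^+} E[Y_i \mid X_i = x] - \lim_{x \to c^-} E[Y_i \mid X_i = x]
  &= \lim_{x \to c^+} E[Y_i(1) \mid X_i = x] - \lim_{x \to c^-} E[Y_i(0) \mid X_i = x] \\
  &= E[Y_i(1) \mid X_i = c] - E[Y_i(0) \mid X_i = c] \\
  &= E[Y_i(1) - Y_i(0) \mid X_i = c] = \tau
\end{aligned}
```

The first line uses the sharp design (everyone with $X_i \ge c$ is treated and nobody below $c$ is), and the second line is the only place where continuity is needed.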
What does this mean in practice? To build intuition, we can think of this assumption as breaking into
two parts:
(1) As-good-as-random assignment: Observations just to the left or the right of the cutoff are the same
except for the fact that they were assigned to the treatment group or to the control group. This
rules out individuals having precise control over where they fall: if individuals could choose whether
they fall just above or just below the cutoff, then we’d expect individuals who chose to be on one
side vs. the other to be different. However, this assumption is not so strong: it still allows for sorting
(e.g. smarter kids get higher test scores on an entrance exam), it just requires that the sorting is
imprecise (e.g. smarter kids will do better on the entrance exam, but whether you score just at the
cutoff or one point below is just random luck). This also rules out cutoffs that were set ex post for
non-arbitrary reasons. For example, if a geographic boundary was set along a river or at a cliff, then
we would probably not expect things to be the same on either side of the boundary, even absent
the treatment.
(2) Excludability: Being above vs. below the cutoff only affects the outcome through its effect on the
treatment. This would be violated if, for example, being above the cutoff not only got you into a
particular high school, but also made you eligible for special scholarships and after-school programs.
If we have these two assumptions, then it is as if we have a randomized controlled trial right around the
cutoff: individuals near the cutoff are randomly assigned to be just above (treated) and just below
(control), and the assignment only affects their treatment status.
You may remember these assumptions from our discussion of instrumental variables. They are very similar
to the assumptions we needed for IV, just adapted to the context of RDD. This leads us naturally
to Fuzzy RDD, where we will use the discontinuity as an instrument for the treatment.
Fuzzy Regression Discontinuity. We previously assumed that everyone on one side of the cutoff was
treated, and everyone on the other side was untreated. What if instead the cutoff only affects the probability
of treatment? We can then use the discontinuity as an instrument for the treatment. One way to implement
this is as a Wald estimator: we look at the discontinuous jump in the outcome at the cutoff (this is our
reduced form), and divide it by the discontinuous jump in the probability of treatment (this is our first stage).
For this to be valid, we need the same assumptions as in typical IV. We already needed as-good-as-random
assignment and excludability for Sharp RDD: these together tell us that our instrument is uncorrelated with
the residual. We just need to add the assumption that the instrument is relevant: it needs to be the case
that the instrument really does increase the probability of treatment.
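The Wald estimator described above can be sketched on simulated data. Everything in the simulation is an assumption made for illustration: the cutoff at zero, the jump in treatment probability from 0.2 to 0.8, the true effect of 2, and the window half-width `h`. The function name `wald_rdd` is hypothetical.

```python
import numpy as np

def wald_rdd(x, y, t, cutoff=0.0, h=0.05):
    """Jump in mean outcome divided by jump in treatment share, within h of the cutoff."""
    right = (x >= cutoff) & (x < cutoff + h)
    left = (x < cutoff) & (x >= cutoff - h)
    reduced_form = y[right].mean() - y[left].mean()   # jump in the outcome
    first_stage = t[right].mean() - t[left].mean()    # jump in P(treated)
    return reduced_form / first_stage, reduced_form, first_stage

# Simulated fuzzy design (all numbers are assumptions).
rng = np.random.default_rng(0)
n = 100_000
x = rng.uniform(-1.0, 1.0, size=n)
above = x >= 0.0
t = (rng.random(n) < 0.2 + 0.6 * above).astype(float)       # cutoff shifts P(treated)
y = 1.0 + 0.5 * x + 2.0 * t + rng.normal(0.0, 0.5, size=n)  # assumed true effect of 2

wald, rf, fs = wald_rdd(x, y, t)
print(f"reduced form: {rf:.3f}, first stage: {fs:.3f}, Wald estimate: {wald:.3f}")
```

Because only some people change treatment status at the cutoff, the reduced-form jump understates the effect of treatment itself; dividing by the first stage rescales it.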
Implementation in a Regression Framework. So how do we implement all this? One way is to just
pick a very small “bandwidth” and compare observations to the left and right of the cutoff which are all very
near the cutoff. This is valid, but often is not the most efficient use of the data. Instead, it often makes sense
to estimate a linear function (or even a quadratic function) on either side of the cutoff (you should always
estimate your regression so that the line or curve is estimated separately on each side of the cutoff). To do
this, we typically normalize our cutoff value to zero (this makes things a lot easier), and then we estimate:
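\[
Y_i = \alpha + \beta \mathbf{1}_{X_i \ge 0} + \delta_1 X_i + \delta_2 X_i \times \mathbf{1}_{X_i \ge 0} + \varepsilon_i
\]
where $X_i$ is the running variable and $\mathbf{1}_{X_i \ge 0}$ is an indicator that the running variable is at or above the cutoff.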
This fits lines on either side of the cutoff. Because we have normalized the cutoff to be zero, we can interpret
β directly as the jump in the outcome at the cutoff. This will work for Sharp RDD; for fuzzy RDD we
instead estimate:

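\[
\begin{aligned}
Y_i &= \alpha + \beta T_i + \delta_1 X_i + \delta_2 X_i \times \mathbf{1}_{X_i \ge 0} + \varepsilon_i \\
T_i &= \kappa + \gamma \mathbf{1}_{X_i \ge 0} + \theta_1 X_i + \theta_2 X_i \times \mathbf{1}_{X_i \ge 0} + \omega_i
\end{aligned}
\]
where $T_i$ is the treatment and now $\mathbf{1}_{X_i \ge 0}$ serves as our instrument.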
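Both specifications can be sketched with plain least squares. This is a minimal illustration on simulated data, assuming a cutoff normalized to zero, a true effect of 2, and a bandwidth of 0.5; the helper names (`design`, `ols`, `sharp_rdd`, `fuzzy_rdd`) are hypothetical. The fuzzy version runs two-stage least squares by hand, so it returns point estimates only. Standard errors read off a manual second stage would not be valid.

```python
import numpy as np

def design(x, first_col):
    """Regressors: constant, first_col, X, and X * 1[X >= 0]."""
    d = (x >= 0).astype(float)
    return np.column_stack([np.ones_like(x), first_col, x, x * d])

def ols(X, y):
    """Least-squares coefficients."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def sharp_rdd(x, y, h):
    """Coefficient on 1[X >= 0] (beta), using observations within h of the cutoff."""
    keep = np.abs(x) <= h
    x, y = x[keep], y[keep]
    d = (x >= 0).astype(float)
    return ols(design(x, d), y)[1]

def fuzzy_rdd(x, y, t, h):
    """2SLS coefficient on T (beta), with 1[X >= 0] as the instrument."""
    keep = np.abs(x) <= h
    x, y, t = x[keep], y[keep], t[keep]
    d = (x >= 0).astype(float)
    first = design(x, d)
    t_hat = first @ ols(first, t)            # first stage: fitted treatment
    return ols(design(x, t_hat), y)[1]       # second stage: outcome on fitted treatment

# Simulated data (all numbers are assumptions for illustration).
rng = np.random.default_rng(0)
n = 20_000
x = rng.uniform(-1.0, 1.0, size=n)
d = (x >= 0).astype(float)
y_sharp = 1.0 + 2.0 * d + 0.5 * x + 0.3 * x * d + rng.normal(0.0, 0.5, size=n)
t = (rng.random(n) < 0.2 + 0.6 * d).astype(float)
y_fuzzy = 1.0 + 2.0 * t + 0.5 * x + rng.normal(0.0, 0.5, size=n)

beta_sharp = sharp_rdd(x, y_sharp, h=0.5)
beta_fuzzy = fuzzy_rdd(x, y_fuzzy, t, h=0.5)
print(f"sharp RDD estimate: {beta_sharp:.3f}")
print(f"fuzzy RDD estimate: {beta_fuzzy:.3f}")
```

In applied work one would normally use a regression package that reports correct IV standard errors rather than this hand-rolled version.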
Econometric research on regression discontinuity is still quite an active topic, and there are a variety of
ways to do RD. Here is a current list of best practices:
• DO use a bandwidth (even if you are estimating using a linear or a quadratic model) to limit your
estimation sample to observations that are not too far from the cutoff.
• DO show your results are robust to other reasonable bandwidths.
• DO make a graph to show the RD visually, and to check whether the specification and bandwidth
you’ve chosen seems reasonable.
• DO NOT use higher-order polynomials like cubic, quartic, or beyond. These have been shown to be
unstable in finite samples, and can lead to wildly inaccurate results (Gelman and Imbens 2018).
• FEEL FREE to use non-parametric methods as well as optimal bandwidth selection, although they
are beyond the scope of this class.
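The bandwidth-robustness check from the list above might look like the following sketch. The data are simulated with an assumed true jump of 2, and the grid of bandwidths is arbitrary; `rdd_jump` is a hypothetical helper that fits separate lines on each side of a cutoff at zero.

```python
import numpy as np

def rdd_jump(x, y, h):
    """Sharp RDD jump at a cutoff of zero from separate linear fits within bandwidth h."""
    keep = np.abs(x) <= h
    xs, ys = x[keep], y[keep]
    d = (xs >= 0).astype(float)
    X = np.column_stack([np.ones_like(xs), d, xs, xs * d])
    coef, *_ = np.linalg.lstsq(X, ys, rcond=None)
    return coef[1], int(keep.sum())

# Simulated sharp design (all numbers are assumptions for illustration).
rng = np.random.default_rng(1)
n = 20_000
x = rng.uniform(-1.0, 1.0, size=n)
d = (x >= 0).astype(float)
y = 1.0 + 2.0 * d + 0.5 * x + 0.3 * x * d + rng.normal(0.0, 0.5, size=n)

bandwidths = [0.1, 0.2, 0.4, 0.8]            # arbitrary grid of "reasonable" bandwidths
estimates = {h: rdd_jump(x, y, h) for h in bandwidths}
for h, (jump, n_used) in estimates.items():
    print(f"h = {h:.1f}: jump = {jump:.3f} using {n_used} observations")
```

Narrow bandwidths use fewer observations and give noisier estimates, while wide ones lean harder on the linear specification. Estimates that move a lot across the grid are a warning sign.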
Some Tests of the Validity of the RDD. When you have a regression discontinuity, there are two types
of tests you can run in order to test the underlying assumptions. These tests are, of course, necessary but
not sufficient: if they fail then you’re in trouble, but even if they succeed you still need a good economic
argument that as-good-as-random assignment and excludability hold.
First, we can conduct placebo tests: if individuals really are as-good-as-randomly assigned to one side of
the cutoff or the other, then there should be no discontinuity in baseline characteristics. To implement this,
we just run the same RDD as before, but try out different baseline characteristics as the outcome (of course,
if we try many baseline characteristics as outcomes, some of them will be significant just by chance, so we
want to account for that when looking at results).
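A placebo test along these lines can be sketched as follows. The three baseline characteristics are invented and simulated to be smooth through the cutoff, so none should show a jump. The Bonferroni correction is one simple way to account for testing several outcomes; it is our choice for this sketch and is not prescribed by these notes. Standard errors are the homoskedastic OLS ones.

```python
import numpy as np
from statistics import NormalDist

def rdd_jump_with_se(x, y, h):
    """Jump at a cutoff of zero and its homoskedastic OLS standard error."""
    keep = np.abs(x) <= h
    xs, ys = x[keep], y[keep]
    d = (xs >= 0).astype(float)
    X = np.column_stack([np.ones_like(xs), d, xs, xs * d])
    coef, *_ = np.linalg.lstsq(X, ys, rcond=None)
    resid = ys - X @ coef
    sigma2 = resid @ resid / (len(ys) - X.shape[1])
    se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return coef[1], se

# Hypothetical baseline characteristics, smooth in the running variable by construction.
rng = np.random.default_rng(2)
n = 20_000
x = rng.uniform(-1.0, 1.0, size=n)
baseline = {
    "age": 30.0 + 2.0 * x + rng.normal(0.0, 1.0, size=n),
    "household_size": 4.0 + rng.normal(0.0, 1.0, size=n),
    "prior_score": 0.5 * x + rng.normal(0.0, 1.0, size=n),
}

alpha = 0.05
# Bonferroni: split alpha across the tests, so the two-sided critical value rises above 1.96.
crit = NormalDist().inv_cdf(1 - alpha / (2 * len(baseline)))

results = {}
for name, w in baseline.items():
    jump, se = rdd_jump_with_se(x, w, h=0.5)
    results[name] = (jump, se, abs(jump / se) > crit)
    print(f"{name}: jump = {jump:.3f}, se = {se:.3f}, flagged = {results[name][2]}")
```

A flagged characteristic does not prove the design is invalid, but it calls for an explanation before the main estimate is trusted.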
Second, we can test for bunching. If individuals do not have precise control over which side of the cutoff
they end up, then we would expect the density to be continuous at the cutoff: there would be roughly as
many people just to the left of the cutoff as just to the right. One way to test this is to construct a histogram
and look to see if we see a spike just to the left or just to the right of the cutoff. If so, that indicates that
people are in fact choosing which side they end up on, which also suggests that the people just to the left
are probably different (perhaps in unobservable ways) from the people just to the right.
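A crude numerical companion to the histogram check is to compare counts in the bins just left and just right of the cutoff. The sketch below is a rough stand-in for a formal density test. It assumes the density is roughly flat near the cutoff, and the bin width, the data, and the manipulation rule (half of those just below move just above) are all invented for illustration.

```python
import numpy as np

def bunching_z(x, cutoff=0.0, width=0.05):
    """z-statistic comparing counts in the bins just left and just right of the cutoff.

    With a continuous, locally flat density, each observation in the two bins is
    about equally likely to fall on either side, so z is roughly standard normal.
    """
    n_left = int(np.sum((x >= cutoff - width) & (x < cutoff)))
    n_right = int(np.sum((x >= cutoff) & (x < cutoff + width)))
    z = (n_right - n_left) / np.sqrt(n_right + n_left)
    return z, n_left, n_right

rng = np.random.default_rng(3)
n = 20_000
x_smooth = rng.uniform(-1.0, 1.0, size=n)     # no manipulation

# Assumed manipulation: about half of those just below the cutoff move just above it.
x_manip = x_smooth.copy()
just_below = (x_manip >= -0.05) & (x_manip < 0.0)
movers = just_below & (rng.random(n) < 0.5)
x_manip[movers] = -x_manip[movers]

z_smooth, left_s, right_s = bunching_z(x_smooth)
z_manip, left_m, right_m = bunching_z(x_manip)
print(f"smooth:      left = {left_s}, right = {right_s}, z = {z_smooth:.2f}")
print(f"manipulated: left = {left_m}, right = {right_m}, z = {z_manip:.2f}")
```

A large positive `z` means a pile-up just to the right of the cutoff and a large negative one a pile-up just to the left. The result depends on the chosen bin width, so it is worth trying more than one.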
