Asset-V1 UBCx+ITSx+2T2015+type@asset+block@ITSx Week 5 - RD

ITSx: Policy Analysis Using
Interrupted Time Series

Week 5 Slides
Michael Law, Ph.D.
The University of British Columbia
COURSE OVERVIEW
Layout of the weeks

1. Introduction, setup, data sources
2. Single series interrupted time series analysis
3. ITS with a control group
4. ITS Extensions
5. Regression discontinuities & Wrap-up
REGRESSION DISCONTINUITIES
Regression Discontinuity (RD)

Design
Compare trends in an outcome across an exposure variable
below and above a threshold
Major Assumption
The level and trend in the outcome above/below the threshold
would have continued absent the threshold
The Counterfactual
[Figure: Outcome of Interest (y-axis) against Forcing Variable (x-axis), with regions labeled Below Threshold and Above Threshold and an annotation marking the Change at Threshold.]
Estimates
RD estimates what is known as a local average treatment
effect (LATE)
Comparing people just below to just above the threshold
Forcing Variable Examples
Student Achievement
Vote Margin
Birth Year
Minute of birth
Many others
Integrity of the Forcing Variable

Institutional integrity
Describe the process of assigning variables, and how access to
the intervention was assigned
Should not be subject to potential manipulation
Statistical integrity
There should not be a discontinuity in the density of cases at the
threshold
Testing Assumptions
Other variables should be smooth through the threshold
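A rough way to probe the statistical integrity condition is to count cases in a narrow window on either side of the threshold and test whether the split departs from 50/50. The sketch below uses simulated data. The window width `h`, the variable names, and the choice of a binomial test are illustrative assumptions, not a method prescribed in these slides; formal density tests are more appropriate for a real analysis.

```r
set.seed(42)
# Simulated forcing variable with no manipulation (assumed threshold at 0)
forcing <- runif(2000, min = -50, max = 50)
threshold_value <- 0
h <- 5  # hypothetical window width on each side of the threshold

# Count cases just below and just above the threshold
n_below <- sum(forcing >= threshold_value - h & forcing < threshold_value)
n_above <- sum(forcing > threshold_value & forcing <= threshold_value + h)

# Under a smooth density, cases near the threshold should split roughly evenly
density_check <- binom.test(n_above, n_below + n_above, p = 0.5)
density_check$p.value
```

A very small p-value would suggest bunching on one side of the threshold, which is the pattern expected if the forcing variable were being manipulated.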
Potential RD Biases
1. Co-intervention / Non-smooth curve
Something aside from the intervention affects the outcome and changes at the same threshold as the intervention
2. Instrumentation
The method of measurement differs above and below the threshold
3. Attrition
Individuals are differentially included in the sample on either side of the threshold
4. Manipulation of threshold
PERFORMING AN RD ANALYSIS
Basic data setup

Person ID
Forcing
Threshold
Forcing_Threshold
Outcome
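The columns listed above can be built from the forcing variable alone. The sketch below uses a small made-up data frame. The column names, the example values, and the threshold at 0 (`cutoff`) are assumptions for illustration; the interaction is formed as forcing × threshold, mirroring how `dmargin_demwin` is built in the incumbency example later in these slides.

```r
# Hypothetical example data: five people with a forcing variable and an outcome
data <- data.frame(
  person_id = 1:5,
  forcing   = c(-2, -1, 0.5, 1, 3),
  outcome   = c(10, 11, 15, 16, 18)
)
cutoff <- 0  # assumed threshold value

# Indicator: 1 if the person is above the threshold, 0 otherwise
data$threshold <- as.numeric(data$forcing > cutoff)

# Interaction: equals the forcing variable above the threshold, 0 below it
data$forcing_threshold <- data$forcing * data$threshold
data
```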
Basic RD model
For threshold j and forcing variable k:
$$\text{outcome}_{jk} = \beta_0 + \beta_1 (k - j) + \beta_2 [k > j] + \beta_3 [k > j]\, k + \epsilon_{jk}$$

$\beta_0$: Predicted level at smallest forcing variable value
$\beta_1$: Pre-existing slope in the outcome of interest
$\beta_2$: Change in the level above the threshold (* variable of interest)
$\beta_3$: Change in the slope above the threshold
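$$\text{outcome}_{jk} = \beta_0 + \beta_1 (k - j) + \beta_2 [k > j] + \beta_3 [k > j]\, k + \epsilon_{jk}$$
[Figure: Outcome of Interest (y-axis) against Forcing Variable (x-axis), with regions labeled Below Threshold and Above Threshold. Annotations: $\beta_0$ (intercept), $\beta_1$ (slope below threshold), $\beta_2$ (RD estimate, at the Threshold), and the slope above threshold, which under this model is $\beta_1 + \beta_3$.]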
Running an RD Model
########################
# Modeling an RD
########################
# gls() is provided by the nlme package
library(nlme)
# Fit the standard regression model
rd_model <- gls(outcome ~ forcing + threshold + forcing_threshold,
                data=data,
                method="ML")
summary(rd_model)
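To see what the coefficient on `threshold` captures, the same model can be fit to simulated data with a known jump. This is a minimal sketch: the data, the true coefficient values, and the threshold at 0 are all made up, and it uses base R `lm()` rather than `gls()` so that it runs without extra packages (for this specification the point estimates should coincide).

```r
set.seed(1)
n <- 1000
forcing <- runif(n, -10, 10)             # simulated forcing variable, threshold at 0
threshold <- as.numeric(forcing > 0)     # 1 above the threshold, 0 below
forcing_threshold <- forcing * threshold # slope-change term

# Assumed true model: intercept 2, slope 0.5, jump of 3, slope change 0.2
outcome <- 2 + 0.5 * forcing + 3 * threshold + 0.2 * forcing_threshold +
  rnorm(n, sd = 1)
sim <- data.frame(outcome, forcing, threshold, forcing_threshold)

sim_model <- lm(outcome ~ forcing + threshold + forcing_threshold, data = sim)

# The RD estimate is the coefficient on the threshold indicator
rd_estimate <- unname(coef(sim_model)["threshold"])
rd_estimate  # should land close to the simulated jump of 3
```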
Higher-order Polynomials
Often the relationship between the forcing variable and the
outcome on either side of the threshold will be non-linear
Solution: model in polynomial terms
Similar in structure and form to using a quadratic trend in a time series analysis
Running an RD Model
#####################################
# Modeling an RD with square terms
#####################################
# Construct a square term on either side of the threshold
data$forcing_sq <- data$forcing^2
data$forcing_threshold_sq <- data$forcing_threshold^2
# Fit the regression model with square terms (gls() requires library(nlme))
rd_model <- gls(outcome ~ forcing + forcing_sq + threshold +
                  forcing_threshold + forcing_threshold_sq,
                data=data,
                method="ML")
summary(rd_model)
Modeling
Have to make decisions about range
Trade-off between linearity and data, or precision and bias as
Lee and Lemieux refer to it
Other considerations
Local linear regression
Kernel densities
Fuzzy RD designs
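The range decision can be explored by re-fitting the linear RD model within progressively narrower windows around the threshold. The sketch below uses simulated data with a mildly cubic relationship and a true jump of 1. The bandwidths, coefficients, and sample size are arbitrary assumptions chosen to make the trade-off visible, and this is not a substitute for the local linear methods listed above.

```r
set.seed(2)
n <- 4000
forcing <- runif(n, -50, 50)
threshold <- as.numeric(forcing > 0)

# Simulated outcome: non-linear in the forcing variable, true jump of 1 at 0
outcome <- 0.02 * forcing + 0.00001 * forcing^3 + 1 * threshold +
  rnorm(n, sd = 0.5)
sim <- data.frame(outcome, forcing, threshold,
                  forcing_threshold = forcing * threshold)

# Hypothetical ranges (distance from the threshold) to compare
bandwidths <- c(50, 25, 10, 5)

# Fit the linear RD model within each range and collect the RD estimate
results <- t(sapply(bandwidths, function(h) {
  fit <- lm(outcome ~ forcing + threshold + forcing_threshold,
            data = sim[abs(sim$forcing) <= h, ])
  est <- summary(fit)$coefficients["threshold", c("Estimate", "Std. Error")]
  c(bandwidth = h, n = nobs(fit), est)
}))
results
```

In this setup the widest range should give a precise but biased estimate, because a straight line fits the curved relationship poorly, while the narrowest range should sit closer to the true jump with a larger standard error.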
Presenting an RD Analysis
Common to present two figures:
Forcing variable and exposure to the intervention
Forcing variable and outcome
RD EXAMPLE: INCUMBENCY
Lee (2008)
Interested in the effect of incumbent party advantage
Uses data from US House of Representatives elections
Our data are from a replication by Caughey and Sekhon
Includes 7,598 elections from 1942 through 2006
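[Figure: Probability of Winning Next Election (y-axis) against Democratic Party Margin of Victory (x-axis). Democrat loss (negative margin) lies to the left and Democrat win (positive margin) to the right of the Equal Vote Share line, where the RD estimate is marked.]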
Data Setup
Columns: state, year, dmargin, demwin, dwinnext, bin

year   dmargin   bin
1946    -6.218    22
1950    -4.146    23
1954    -5.118    23
1956     6.148    29
Setup Variables
# Set up square and cubic terms for forcing variable
dataset$dmargin2 <- dataset$dmargin^2
dataset$dmargin3 <- dataset$dmargin^3
# Set up interaction between forcing variable and threshold
dataset$dmargin_demwin <- dataset$dmargin * dataset$demwin
# Set up square and cubic terms for forcing variable * threshold interactions
dataset$dmargin_demwin2 <- dataset$dmargin_demwin^2
dataset$dmargin_demwin3 <- dataset$dmargin_demwin^3
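Because `demwin` is a 0/1 indicator, squaring or cubing the interaction term gives the same values as interacting the squared or cubed forcing variable with the indicator. The short check below uses made-up margins to illustrate this; the values are hypothetical.

```r
# Hypothetical margins on both sides of the threshold
dmargin <- c(-12.5, -3, 0.4, 7, 21)
demwin <- as.numeric(dmargin > 0)
dmargin_demwin <- dmargin * demwin

# Powers of the interaction equal the powered forcing variable times the indicator
same_square <- isTRUE(all.equal(dmargin_demwin^2, dmargin^2 * demwin))
same_cube   <- isTRUE(all.equal(dmargin_demwin^3, dmargin^3 * demwin))
c(same_square, same_cube)
```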
Preliminary Plot
###################################
# Preliminary Plot
###################################
# Setup bins for plotting
bins <- seq(-49,49,2)
# Get the mean within each bin
means <- tapply(dataset$dwinnext,dataset$bin,mean)
# Plot the results
plot(bins,means,
     pch=19,
     ylab="Probability of Winning Next Election",
     xlab="Vote Margin in the Last Election",
     xlim=c(-50,50),
     col="lightblue")
# Add line at zero
abline(v=0,lty=2,col="grey")
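The plotting code relies on a `bin` column that already exists in the data. One rule that reproduces the four rows shown on the Data Setup slide is 2-point-wide bins starting at -50 and numbered from 1. This is an inference from those rows, not a documented construction, so treat it as an assumption.

```r
# Four dmargin values taken from the Data Setup slide
dmargin <- c(-6.218, -4.146, -5.118, 6.148)

# Assumed rule: 2-point-wide bins starting at -50, numbered from 1
bin <- floor((dmargin + 50) / 2) + 1
bin  # 22 23 23 29, matching the bin column on the slide

# Midpoint of each bin, which lines up with seq(-49, 49, 2) used for plotting
bin_mid <- -50 + 2 * bin - 1
bin_mid
```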
Run Basic Model

###################################
# Modeling
###################################
model1 <- lm(dwinnext ~ dmargin + demwin + dmargin_demwin,
             data=dataset)
summary(model1)
Model 1 Results
Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)     0.2362171  0.0096311  24.526   <2e-16 ***
dmargin         0.0051402  0.0003727  13.790   <2e-16 ***
demwin          0.5558085  0.0139324  39.893   <2e-16 ***
dmargin_demwin -0.0008619  0.0005163  -1.669   0.0951 .
Incumbent Party Advantage: 56%
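The 56% figure is the coefficient on `demwin` expressed in percentage points. A rough 95% confidence interval can be computed from the reported estimate and standard error. The sketch below copies those two numbers from the Model 1 output and takes the residual degrees of freedom (7593) from the analysis of variance table for Model 1; it assumes the usual t-based interval for a linear model.

```r
# Values copied from the Model 1 output
estimate  <- 0.5558085  # coefficient on demwin
std_error <- 0.0139324
resid_df  <- 7593       # residual degrees of freedom for Model 1

# t-based 95% confidence interval for the RD estimate
t_crit <- qt(0.975, df = resid_df)
ci <- estimate + c(-1, 1) * t_crit * std_error

# Express the estimate and interval in percentage points
round(100 * c(estimate = estimate, lower = ci[1], upper = ci[2]), 1)
```

The interval runs from roughly 53 to 58 percentage points.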
Add square terms

# Add square terms
model2 <- lm(dwinnext ~ dmargin + dmargin2 +
               demwin + dmargin_demwin + dmargin_demwin2,
             data=dataset)
summary(model2)
# Compare versus model 1
anova(model1, model2)
Model 2 Results
Coefficients:
                   Estimate  Std. Error t value Pr(>|t|)
(Intercept)      0.28847535  0.01425106  20.242  < 2e-16 ***
dmargin          0.01172643  0.00137841   8.507  < 2e-16 ***
dmargin2         0.00014036  0.00002829   4.962 7.14e-07 ***
demwin           0.44811150  0.02054055  21.816  < 2e-16 ***
dmargin_demwin  -0.00053605  0.00196543  -0.273    0.785
dmargin_demwin2 -0.00028161  0.00003958  -7.114 1.23e-12 ***
Model 1 vs. Model 2

Analysis of Variance Table
Model 1: dwinnext ~ dmargin + demwin + dmargin_demwin
Model 2: dwinnext ~ dmargin + dmargin2 + demwin +
    dmargin_demwin + dmargin_demwin2
  Res.Df    RSS Df Sum of Sq     F    Pr(>F)
1   7593 732.19
2   7591 727.33  2    4.8522 25.32 1.096e-11 ***
Add cubic terms

# Run full specified model
model3 <- lm(dwinnext ~ dmargin + dmargin2 + dmargin3 + demwin +
               dmargin_demwin + dmargin_demwin2 +
               dmargin_demwin3,
             data=dataset)
summary(model3)
# Compare versus model 2
anova(model2, model3)
Model 3 Results
Coefficients:
                    Estimate   Std. Error t value   Pr(>|t|)
(Intercept)      0.300040593  0.018943445  15.839    < 2e-16 ***
dmargin          0.014578041  0.003374783   4.320 0.00001582 ***
dmargin2         0.000288379  0.000162408   1.776     0.0758 .
dmargin3         0.000002045  0.000002209   0.926     0.3547
demwin           0.385243821  0.027359614  14.081    < 2e-16 ***
dmargin_demwin   0.009250574  0.004872682   1.898     0.0577 .
dmargin_demwin2 -0.001068132  0.000231675  -4.610 0.00000408 ***
dmargin_demwin3  0.000006539  0.000003111   2.102     0.0356 *
Model 2 vs. Model 3

Analysis of Variance Table
Model 1: dwinnext ~ dmargin + dmargin2 + demwin +
    dmargin_demwin + dmargin_demwin2
Model 2: dwinnext ~ dmargin + dmargin2 + dmargin3 + demwin +
    dmargin_demwin + dmargin_demwin2 + dmargin_demwin3
  Res.Df    RSS Df Sum of Sq      F    Pr(>F)
1   7591 727.33
2   7589 725.78  2    1.5515 8.1114 0.0003027 ***
A note on the example

I have modeled a discrete (win / loss) outcome using linear
regression
I have also posted code to perform the same analysis using
logistic regression
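A logistic version of the basic model might look like the sketch below. It is not the posted course code: the data are simulated, the coefficient values are arbitrary, and only the variable names follow the example. Because logistic coefficients are on the log-odds scale, the jump is reported here as the difference in predicted probabilities just above and just below the threshold.

```r
set.seed(3)
n <- 5000
dmargin <- runif(n, -50, 50)        # simulated vote margin
demwin <- as.numeric(dmargin > 0)   # 1 if the simulated margin is positive
dmargin_demwin <- dmargin * demwin

# Simulated win probabilities with a jump at the threshold (illustrative values)
p <- plogis(-1 + 0.03 * dmargin + 2 * demwin)
dwinnext <- rbinom(n, size = 1, prob = p)
sim <- data.frame(dwinnext, dmargin, demwin, dmargin_demwin)

# Same right-hand side as the basic linear model, with a logit link
logit_model <- glm(dwinnext ~ dmargin + demwin + dmargin_demwin,
                   data = sim, family = binomial)

# Predicted probabilities at the threshold, on the losing and winning sides
at_threshold <- data.frame(dmargin = 0, demwin = c(0, 1), dmargin_demwin = 0)
pred <- predict(logit_model, newdata = at_threshold, type = "response")
rd_probability <- unname(pred[2] - pred[1])
rd_probability
```

In this simulation the true jump in probability at the threshold is plogis(1) - plogis(-1), about 0.46, so the estimate should land near that value.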

Incumbent Party Advantage by Model
Model 1 (linear): 56%
Model 2 (square terms): 45%
Model 3 (cubic terms): 39%