
# ITSx: Policy Analysis Using Interrupted Time Series

Week 5 Slides
Michael Law, Ph.D.
The University of British Columbia

COURSE OVERVIEW

1. Introduction, setup, data sources
2. Single series interrupted time series analysis
3. ITS with a control group
4. ITS Extensions
5. Regression discontinuities & Wrap-up

REGRESSION DISCONTINUITIES

## Regression Discontinuity (RD)

Design
Compare trends in an outcome across an exposure variable
below and above a threshold

Major Assumption
The level and trend in the outcome above/below the threshold
would have continued absent the threshold

The Counterfactual
[Figure: outcome of interest plotted against the forcing variable, with fitted lines below and above the threshold and the change at the threshold marked.]

Estimates
RD estimates what's known as a local average treatment
effect (LATE)
Comparing people just below to just above the threshold
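
To make the "just below versus just above" comparison concrete, here is a minimal sketch on simulated data. The data, the true jump of 2, the window half-width `h`, and all variable names are assumptions for illustration and are not part of the course data.

```r
# Simulated example (hypothetical data): true jump of 2 at a threshold of 0
set.seed(1)
n <- 2000
forcing <- runif(n, -10, 10)
treated <- as.numeric(forcing > 0)
outcome <- 1 + 0.1 * forcing + 2 * treated + rnorm(n, sd = 0.5)

# Compare mean outcomes within a narrow window around the threshold
h <- 1   # assumed window half-width
just_below <- outcome[forcing > -h & forcing <= 0]
just_above <- outcome[forcing > 0 & forcing < h]
late_naive <- mean(just_above) - mean(just_below)
late_naive
```

Because the outcome also trends with the forcing variable, this raw difference in means picks up a little of the slope as well as the jump, which is one reason to model the slope on each side of the threshold rather than rely on the raw comparison.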

## Forcing Variable Examples

Student Achievement
Vote Margin
Birth Year
Minute of birth
Many others

## Integrity of the Forcing Variable

Institutional integrity
the intervention was assigned
Should not be subject to potential manipulation

Statistical integrity
There should not be a discontinuity in the density of cases at the
threshold
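
A rough way to look for a discontinuity in the density is to count cases in a narrow window on either side of the threshold. The sketch below uses simulated, unmanipulated data; the window half-width `h` and the use of an exact binomial test are assumptions chosen for simplicity, and this is a simplified stand-in for a formal density test rather than a replacement for one.

```r
# Simulated forcing variable with no manipulation (hypothetical data)
set.seed(2)
forcing <- rnorm(5000, mean = 0, sd = 10)
threshold <- 0
h <- 2   # assumed window half-width

# Count cases just below and just above the threshold
n_below <- sum(forcing > threshold - h & forcing <= threshold)
n_above <- sum(forcing > threshold & forcing <= threshold + h)
c(below = n_below, above = n_above)

# If the density is smooth, cases this close to the threshold should
# fall on either side with roughly equal probability
density_check <- binom.test(n_above, n_below + n_above, p = 0.5)
density_check$p.value
```

A very small p-value would suggest bunching on one side, which is the pattern expected if people can manipulate their value of the forcing variable.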

Testing Assumptions
Other variables should be smooth through the threshold
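
One way to check smoothness is to fit the RD model with a baseline covariate in place of the outcome. The sketch below assumes a hypothetical covariate `age` in simulated data, built with no jump at the threshold; the variable names follow the data setup used later in these slides.

```r
# Simulated data (hypothetical): a baseline covariate that varies
# smoothly with the forcing variable and has no jump at the threshold
set.seed(3)
n <- 2000
forcing <- runif(n, -10, 10)
threshold <- as.numeric(forcing > 0)
forcing_threshold <- forcing * threshold
age <- 40 + 0.5 * forcing + rnorm(n, sd = 5)   # hypothetical covariate

# Fit the RD model with the covariate as the outcome
placebo_model <- lm(age ~ forcing + threshold + forcing_threshold)
coef(summary(placebo_model))["threshold", ]
```

A `threshold` coefficient that is large relative to its standard error would indicate that the groups on either side differ in ways other than the intervention.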

Potential RD Biases
1. Co-intervention / Non-smooth curve: something aside from the intervention affects the outcome and changes at the same threshold as the intervention
2. Instrumentation
3. Attrition: individuals are differentially included in the sample on either side of the threshold
4. Manipulation of threshold

PERFORMING AN RD ANALYSIS

## Basic data setup

Columns: Person ID, Forcing, Threshold, Forcing_Threshold, Outcome
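
A minimal sketch of how these columns could be built in R. The raw data here are simulated, the forcing variable is assumed to be already centred so the threshold sits at 0, and the names `person_id` and `data` are illustrative.

```r
# Hypothetical raw data: one row per person, with a forcing variable
# already centred so that the threshold sits at 0
set.seed(4)
data <- data.frame(person_id = 1:500,
                   forcing = round(runif(500, -20, 20), 1))
data$outcome <- 10 + 0.3 * data$forcing + 4 * (data$forcing > 0) +
  rnorm(500)

# Indicator for being above the threshold
data$threshold <- as.numeric(data$forcing > 0)
# Interaction: equals the forcing variable above the threshold, 0 below it
data$forcing_threshold <- data$forcing * data$threshold

head(data)
```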

Basic RD model
For threshold j and forcing variable k:

$$\text{outcome}_{jk} = \beta_0 + \beta_1 (k - j) + \beta_2 [k > j] + \beta_3 [k > j]\,k + \epsilon_{jk}$$

$\beta_0$: Predicted level at smallest forcing variable value
$\beta_1$: Pre-existing slope in the outcome of interest
$\beta_2$: Change in the level above the threshold (* Variable of interest)
$\beta_3$: Change in the slope above the threshold

[Figure: outcome of interest plotted against the forcing variable, with fitted lines below and above the threshold. Labels mark the intercept, the slope above the threshold, and the change in level at the threshold, $\beta_2$ (the RD estimate).]

Running an RD Model
########################
# Modeling an RD
########################
# Load nlme, which provides gls()
library(nlme)
# Fit the standard regression model
rd_model <- gls(outcome ~ forcing + threshold +
                  forcing_threshold,
                data=data,
                method="ML")
summary(rd_model)

Higher-order Polynomials
Often the relationship between the forcing variable and the outcome on either side of the threshold will be non-linear
Solution: model in polynomial terms
Similar in structure and form to using a quadratic trend in a time series analysis

Running an RD Model
#####################################
# Modeling an RD with square terms
#####################################
# Construct a square term on either side of the threshold
data$forcing_sq <- data$forcing^2
data$forcing_threshold_sq <- data$forcing_threshold^2
# Fit the standard regression model (gls() is from the nlme package)
rd_model <- gls(outcome ~ forcing + forcing_sq + threshold +
                  forcing_threshold + forcing_threshold_sq,
                data=data,
                method="ML")
summary(rd_model)

Modeling
Have to make decisions about the range of the forcing variable to include
Trade-off between linearity and data, or bias and precision as Lee and Lemieux refer to it
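
The trade-off can be illustrated by refitting the linear RD model within progressively narrower ranges. The sketch below uses simulated data with a curved relationship and a true jump of 2; the data, the list of half-widths in `ranges`, and the variable names are all assumptions for illustration.

```r
# Simulated data (hypothetical): curved relationship, true jump of 2
set.seed(5)
n <- 4000
forcing <- runif(n, -20, 20)
threshold <- as.numeric(forcing > 0)
outcome <- 1 + 0.2 * forcing + 0.0005 * forcing^3 + 2 * threshold +
  rnorm(n)
sim <- data.frame(outcome, forcing, threshold,
                  forcing_threshold = forcing * threshold)

# Refit the linear RD model within progressively narrower ranges
ranges <- c(20, 10, 5, 2)   # assumed half-widths around the threshold
results <- t(sapply(ranges, function(h) {
  fit <- lm(outcome ~ forcing + threshold + forcing_threshold,
            data = sim[abs(sim$forcing) <= h, ])
  est <- coef(summary(fit))["threshold", c("Estimate", "Std. Error")]
  c(range = h, est)
}))
results
```

In a setup like this, the widest range should give the smallest standard error but an estimate pulled away from 2 by the unmodelled curvature, while the narrowest range should sit closer to 2 with a larger standard error.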

Other considerations
Local linear regression
Kernel densities
Fuzzy RD designs
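
As a rough illustration of local linear regression with a kernel, the sketch below weights observations with a triangular kernel so that cases near the threshold count the most, then fits the same linear RD model by weighted least squares. The simulated data, the bandwidth `h`, and the choice of a triangular kernel are assumptions, and the default standard errors from a weighted `lm()` fit should be treated with caution in this setting.

```r
# Simulated data (hypothetical) with a true jump of 2 at a threshold of 0
set.seed(6)
n <- 3000
forcing <- runif(n, -20, 20)
threshold <- as.numeric(forcing > 0)
outcome <- 1 + 0.2 * forcing + 0.0005 * forcing^3 + 2 * threshold +
  rnorm(n)
sim <- data.frame(outcome, forcing, threshold,
                  forcing_threshold = forcing * threshold)

# Triangular kernel: weight falls linearly from 1 at the threshold
# to 0 at the bandwidth h
h <- 5   # assumed bandwidth
sim$w <- pmax(0, 1 - abs(sim$forcing) / h)

# Weighted linear fit using only observations with positive weight
llr_model <- lm(outcome ~ forcing + threshold + forcing_threshold,
                data = sim[sim$w > 0, ], weights = w)
coef(llr_model)["threshold"]
```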

Presenting an RD Analysis
Common to present two figures:
Forcing variable and exposure to the intervention
Forcing variable and outcome
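
The two figures could be drawn along the following lines. This sketch uses simulated data from a sharp design, where exposure switches from 0 to 1 exactly at the threshold; the data, bin width, and variable names are assumptions for illustration.

```r
# Simulated sharp RD (hypothetical data): exposure switches at 0
set.seed(7)
n <- 2000
forcing <- runif(n, -50, 50)
exposed <- as.numeric(forcing > 0)
outcome <- 0.3 + 0.004 * forcing + 0.4 * exposed + rnorm(n, sd = 0.2)

# Average both variables within 2-unit bins of the forcing variable
bin <- cut(forcing, breaks = seq(-50, 50, 2))
mids <- seq(-49, 49, 2)
mean_exposed <- tapply(exposed, bin, mean)
mean_outcome <- tapply(outcome, bin, mean)

# Two panels side by side: exposure, then outcome
par(mfrow = c(1, 2))
plot(mids, mean_exposed, pch = 19,
     xlab = "Forcing variable", ylab = "Proportion exposed")
abline(v = 0, lty = 2, col = "grey")
plot(mids, mean_outcome, pch = 19,
     xlab = "Forcing variable", ylab = "Mean outcome")
abline(v = 0, lty = 2, col = "grey")
```

The first panel shows whether assignment really follows the threshold; the second shows whether the outcome moves at the same point.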

RD EXAMPLE: INCUMBENCY

Lee (2008)
Interested in the effect of incumbent party advantage
Uses data from US House of Representatives elections
Our data are from a replication by Caughey and Sekhon
Includes 7,598 elections from 1942 through 2006

[Figure: RD estimate shown at the threshold, plotted against Democratic Party margin of victory.]

Data Setup
| state | year | dmargin | demwin | dwinnext | bin |
|---|---|---|---|---|---|
| | 1946 | -6.218 | | | 22 |
| | 1950 | -4.146 | | | 23 |
| | 1954 | -5.118 | | | 23 |
| | 1956 | 6.148 | | | 29 |

Setup Variables
# Setup square and cubic terms for forcing variable
dataset$dmargin2 <- dataset$dmargin^2
dataset$dmargin3 <- dataset$dmargin^3
# Setup interaction between forcing variable and threshold
dataset$dmargin_demwin <- dataset$dmargin * dataset$demwin
# Setup square and cubic terms for forcing variable * threshold interactions
dataset$dmargin_demwin2 <- dataset$dmargin_demwin^2
dataset$dmargin_demwin3 <- dataset$dmargin_demwin^3

Preliminary Plot
###################################
# Preliminary Plot
###################################
# Setup bins for plotting
bins <- seq(-49,49,2)
# Get the mean within each bin
means <- tapply(dataset$dwinnext,dataset$bin,mean)
# Plot the results
plot(bins,means,
     pch=19,
     ylab="Probability of Winning Next Election",
     xlab="Vote Margin in the Last Election",
     xlim=c(-50,50),
     col="lightblue")
abline(v=0,lty=2,col="grey")
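
The `bin` column is supplied with the data. The sketch below shows one rule that reproduces it for the four `dmargin` values listed in the data setup: 2-point-wide bins starting at -50 and numbered from 1. The rule is an assumption inferred from those rows and from `seq(-49,49,2)`, not something stated in the source.

```r
# The four dmargin values shown in the data setup
dmargin <- c(-6.218, -4.146, -5.118, 6.148)

# Assumed rule: 2-point-wide bins starting at -50, numbered from 1
bin <- floor((dmargin + 50) / 2) + 1
bin
# → 22 23 23 29

# Midpoint of each bin, matching seq(-49, 49, 2)
bin_mid <- seq(-49, 49, 2)[bin]
bin_mid
# → -7 -5 -5 7
```

Note that `plot(bins, means, ...)` relies on every one of the 50 bins containing at least one election; if a bin were empty, `tapply()` would return fewer means than there are bin midpoints and the lengths would not match.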

## Run Basic Model

###################################
# Modeling
###################################
model1 <- lm(dwinnext ~ dmargin + demwin + dmargin_demwin,
             data=dataset)
summary(model1)

Model 1 Results
Coefficients:

| | Estimate | Std. Error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| (Intercept) | 0.2362171 | 0.0096311 | 24.526 | <2e-16 *** |
| dmargin | 0.0051402 | 0.0003727 | 13.790 | <2e-16 *** |
| demwin | 0.5558085 | 0.0139324 | 39.893 | <2e-16 *** |
| dmargin_demwin | -0.0008619 | 0.0005163 | -1.669 | 0.0951 . |
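
To read these coefficients, it can help to compute the fitted values on each side of the threshold. The sketch below uses the Model 1 estimates as reported; the object names are illustrative.

```r
# Coefficients as reported in the Model 1 results
b <- c(intercept = 0.2362171, dmargin = 0.0051402,
       demwin = 0.5558085, dmargin_demwin = -0.0008619)

# Predicted probability of winning the next election at a margin of 0
p_just_lost <- unname(b["intercept"])
p_just_won  <- unname(b["intercept"] + b["demwin"])
c(just_lost = p_just_lost, just_won = p_just_won)

# Slope of the fitted line on each side of the threshold
slope_below <- unname(b["dmargin"])
slope_above <- unname(b["dmargin"] + b["dmargin_demwin"])
c(below = slope_below, above = slope_above)
```

Under the linear model, the fitted probability of winning the next election is about 0.24 for a party that just lost and about 0.79 for one that just won, so the `demwin` coefficient of roughly 0.56 is the RD estimate.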

model2 <- lm(dwinnext ~ dmargin + dmargin2 +
demwin + dmargin_demwin + dmargin_demwin2,
data=dataset)
summary(model2)
# Compare versus model 1
anova(model1, model2)

Model 2 Results
Coefficients:

| | Estimate | Std. Error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| (Intercept) | 0.28847535 | 0.01425106 | 20.242 | < 2e-16 *** |
| dmargin | 0.01172643 | 0.00137841 | 8.507 | < 2e-16 *** |
| dmargin2 | 0.00014036 | 0.00002829 | 4.962 | 7.14e-07 *** |
| demwin | 0.44811150 | 0.02054055 | 21.816 | < 2e-16 *** |
| dmargin_demwin | -0.00053605 | 0.00196543 | -0.273 | 0.785 |
| dmargin_demwin2 | -0.00028161 | 0.00003958 | -7.114 | 1.23e-12 *** |

## Model 1 vs. Model 2

Analysis of Variance Table
Model 1: dwinnext ~ dmargin + demwin + dmargin_demwin
Model 2: dwinnext ~ dmargin + dmargin2 + demwin + dmargin_demwin + dmargin_demwin2

| Model | Res.Df | RSS | Df | Sum of Sq | F | Pr(>F) |
|---|---|---|---|---|---|---|
| 1 | 7593 | 732.19 | | | | |
| 2 | 7591 | 727.33 | 2 | 4.8522 | 25.32 | 1.096e-11 *** |
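
The F statistic in this table can be reproduced by hand from the reported values, which shows what the comparison is testing. The sketch below is a worked check using the numbers above; the object names are illustrative.

```r
# Values as reported in the ANOVA table
sum_sq  <- 4.8522   # reduction in residual sum of squares
df_diff <- 2        # number of added terms (7593 - 7591)
rss2    <- 727.33   # residual sum of squares, Model 2
df2     <- 7591     # residual degrees of freedom, Model 2

# F = (reduction in RSS per added term) /
#     (residual variance of the larger model)
f_stat  <- (sum_sq / df_diff) / (rss2 / df2)
p_value <- pf(f_stat, df_diff, df2, lower.tail = FALSE)
round(f_stat, 2)
# → 25.32
p_value
```

A small p-value here indicates that the two squared terms jointly improve the fit, which is the reason for preferring Model 2 over Model 1.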

# Run full specified model
model3 <- lm(dwinnext ~ dmargin + dmargin2 + dmargin3 + demwin
+ dmargin_demwin + dmargin_demwin2 +
dmargin_demwin3,
data=dataset)
summary(model3)
# Compare versus model 2
anova(model2, model3)

Model 3 Results
Coefficients:

| | Estimate | Std. Error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| (Intercept) | 0.300040593 | 0.018943445 | 15.839 | < 2e-16 *** |
| dmargin | 0.014578041 | 0.003374783 | 4.320 | 0.00001582 *** |
| dmargin2 | 0.000288379 | 0.000162408 | 1.776 | 0.0758 . |
| dmargin3 | 0.000002045 | 0.000002209 | 0.926 | 0.3547 |
| demwin | 0.385243821 | 0.027359614 | 14.081 | < 2e-16 *** |
| dmargin_demwin | 0.009250574 | 0.004872682 | 1.898 | 0.0577 . |
| dmargin_demwin2 | -0.001068132 | 0.000231675 | -4.610 | 0.00000408 *** |
| dmargin_demwin3 | 0.000006539 | 0.000003111 | 2.102 | 0.0356 * |

## Model 2 vs. Model 3

Analysis of Variance Table
Model 1: dwinnext ~ dmargin + dmargin2 + demwin + dmargin_demwin + dmargin_demwin2
Model 2: dwinnext ~ dmargin + dmargin2 + dmargin3 + demwin + dmargin_demwin + dmargin_demwin2 + dmargin_demwin3

| Model | Res.Df | RSS | Df | Sum of Sq | F | Pr(>F) |
|---|---|---|---|---|---|---|
| 1 | 7591 | 727.33 | | | | |
| 2 | 7589 | 725.78 | 2 | 1.5515 | 8.1114 | 0.0003027 *** |
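
Placing the `demwin` estimates from the three models side by side shows how sensitive the RD estimate is to the polynomial order. The sketch below uses the reported estimates and standard errors; the normal-approximation confidence intervals are an added convenience, not part of the original output.

```r
# demwin estimates and standard errors as reported for each model
rd <- data.frame(
  model    = c("Linear", "Quadratic", "Cubic"),
  estimate = c(0.5558085, 0.44811150, 0.385243821),
  se       = c(0.0139324, 0.02054055, 0.027359614)
)

# Approximate 95% confidence intervals (normal approximation)
rd$lower <- rd$estimate - 1.96 * rd$se
rd$upper <- rd$estimate + 1.96 * rd$se
rd
```

The estimate falls from about 0.56 to about 0.39 as higher-order terms are added, while the standard error roughly doubles, which is the bias and precision trade-off in practice.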