Effect = Outcome A (with programme) - Outcome B (without programme)
Problem: we only observe individuals that
• participate (we observe A, not B), or
• do not participate (we observe B, not A)
... but never A and B for everyone!
Treatment and selection effects
Definition of treatment effects
Estimating the Counterfactual
⇒ D = ATE + B. (4.5)
• In these equations, ATE is the average treatment effect
[E(Yi(1) | Ti = 1) – E(Yi(0) | Ti = 1)], namely, the average gain
in outcomes of participants relative to nonparticipants, as if
nonparticipating households were also treated.
• The ATE corresponds to a situation in which a randomly
chosen household from the population is assigned to
participate in the program, so participating and
nonparticipating households have an equal probability of
receiving the treatment T.
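Equation (4.5) can be reached in two steps. The sketch below assumes, because the preceding equations are not reproduced here, that D is the difference in mean observed outcomes between participants and nonparticipants and that B is the selection-bias term. The bracketed ATE expression is the one quoted above; the second line adds and subtracts E(Yi(0) | Ti = 1).

```latex
\begin{align*}
D &= E(Y_i(1) \mid T_i = 1) - E(Y_i(0) \mid T_i = 0) \\
  &= \underbrace{\bigl[E(Y_i(1) \mid T_i = 1) - E(Y_i(0) \mid T_i = 1)\bigr]}_{\text{ATE}}
   + \underbrace{\bigl[E(Y_i(0) \mid T_i = 1) - E(Y_i(0) \mid T_i = 0)\bigr]}_{B} \\
  &= \text{ATE} + B
\end{align*}
```

Under these assumptions, B is zero only when participants and nonparticipants would have had the same mean outcome without the program.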
4.5. Propensity score matching
Cont’d
We can match to more than one neighbour
5 nearest neighbours? Or more?
Radius matching: all neighbours within specific range
Kernel matching: all neighbours, but close neighbours have larger
weight than far neighbours.
Best approach?
Look at sensitivity to choice of approach
How many neighbours?
Using more information reduces bias
Using more control units than treated increases precision
But using control units more than once decreases precision
Cont’d
2. Caliper or radius matching: One problem with NN
matching is that the difference in propensity scores for a
participant and its closest nonparticipant neighbor may still
be very high.
• This situation results in poor matches and can be avoided
by imposing a threshold or “tolerance” on the maximum
propensity score distance (caliper).
• This procedure therefore involves matching with
replacement, only among propensity scores within a
certain range.
• A higher number of dropped non-participants is likely,
however, potentially increasing the chance of sampling
bias.
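The caliper rule can be sketched in a few lines of Python (used here only for illustration; the slides themselves work in Stata). The scores are borrowed from the ps1 column of the infant-mortality example later in this section, and the caliper width of 0.05 is an arbitrary assumption, not a recommended value.

```python
# Propensity scores keyed by serial number (ps1 values from the
# infant-mortality example in this section).
treated_scores = {1: 0.4165713, 2: 0.7358171, 3: 0.9284516, 4: 0.7358171}
control_scores = {5: 0.752714, 6: 0.395162, 7: 0.0016534, 8: 0.026803, 9: 0.0070107}

CALIPER = 0.05  # assumed tolerance on the propensity score distance

matched = {}    # treated serial number -> matched control serial number
unmatched = []  # treated units with no control inside the caliper

for t_id, t_score in treated_scores.items():
    # Nearest control by absolute score distance (matching with replacement).
    c_id = min(control_scores, key=lambda c: abs(control_scores[c] - t_score))
    if abs(control_scores[c_id] - t_score) <= CALIPER:
        matched[t_id] = c_id
    else:
        unmatched.append(t_id)

print("matched:", matched)
print("unmatched:", unmatched)
```

With this tolerance, treated unit 3 has no control within range and is left unmatched, which shows how a caliper trades sample size for match quality.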
Cont’d
3. Stratification or interval matching: This procedure
partitions the common support into different strata (or
intervals) and calculates the program’s impact within each
interval.
• Specifically, within each interval, the program effect is the
mean difference in outcomes between treated and control
observations.
• A weighted average of these interval impact estimates
yields the overall program impact, taking the share of
participants in each interval as the weights.
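The weighted average described above can be written compactly. The notation is an assumption introduced for illustration: S is the number of strata, N_s^T the number of participants in stratum s, N^T the total number of participants, and the barred terms the mean outcomes of treated and control observations within stratum s.

```latex
\[
\widehat{\text{Impact}}
  = \sum_{s=1}^{S} \frac{N_s^{T}}{N^{T}}
    \left( \bar{Y}_s^{T} - \bar{Y}_s^{C} \right)
\]
```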
Cont’d
4. Kernel matching: One risk with the methods just described is
that only a small subset of nonparticipants will ultimately satisfy
the criteria to fall within the common support and thus construct
the counterfactual outcome.
• Nonparametric matching estimators such as kernel matching
use a weighted average of all nonparticipants to construct the
counterfactual match for each participant.
• If Pi is the propensity score for participant i and Pj is the
propensity score for nonparticipant j, the weights for kernel
matching are given by
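The formula itself did not survive in the text above. A commonly used form of the kernel matching weight is sketched below; treat it as a reconstruction rather than the original equation. Here K is a kernel function, a_n is a bandwidth parameter, and C is the set of nonparticipants; these three symbols are assumptions.

```latex
\[
\omega(i, j) =
  \frac{K\!\left( \dfrac{P_j - P_i}{a_n} \right)}
       {\sum_{k \in C} K\!\left( \dfrac{P_k - P_i}{a_n} \right)}
\]
```

The weights sum to one over the nonparticipants, and nonparticipants whose score is close to P_i receive more weight.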
Cont’d
What is the effect of the project on infant mortality?
T imrate
treated 10
treated 15
treated 22
treated 19
control 25
control 19
control 4
control 8
control 6
The easiest and most straightforward answer to this question is
to compare average mortality rates in the two groups:
(10+15+22+19)/4-(25+19+4+8+6)/5 = 4.1
What does this mean? Does it mean that clinics have
increased infant mortality rates?
NO!
Pre-project characteristics of the two groups are very
important to answer the above question.
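The naive comparison of group means can be reproduced with a short Python sketch (Python is used only for illustration). The values are the imrate figures from the table above; the variable names are assumptions.

```python
# Infant mortality rates (imrate) by group, copied from the table above.
treated = [10, 15, 22, 19]
control = [25, 19, 4, 8, 6]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


naive_difference = mean(treated) - mean(control)

print(f"treated mean     = {mean(treated):.1f}")
print(f"control mean     = {mean(control):.1f}")
print(f"naive difference = {naive_difference:.1f}")
```

The positive difference says only that treated areas have higher mortality on average, not that the project caused it.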
Cont’d
T imrate povrate pcdocs
treated 10 0.5 0.01
treated 15 0.6 0.02
treated 22 0.7 0.01
treated 19 0.6 0.02
control 25 0.6 0.01
control 19 0.5 0.02
control 4 0.1 0.04
control 8 0.3 0.05
control 6 0.2 0.04
How similar are the treated and control groups?
On average, the treated group has a higher poverty rate and fewer doctors per capita.
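The comparison of pre-project characteristics can be checked directly. This is a minimal Python sketch using the rows of the table above; the tuple layout and helper name are assumptions made for illustration.

```python
# Rows copied from the table above: (T, imrate, povrate, pcdocs).
rows = [
    ("treated", 10, 0.5, 0.01),
    ("treated", 15, 0.6, 0.02),
    ("treated", 22, 0.7, 0.01),
    ("treated", 19, 0.6, 0.02),
    ("control", 25, 0.6, 0.01),
    ("control", 19, 0.5, 0.02),
    ("control", 4, 0.1, 0.04),
    ("control", 8, 0.3, 0.05),
    ("control", 6, 0.2, 0.04),
]

POVRATE, PCDOCS = 2, 3  # column positions inside each row


def group_mean(group, column):
    """Mean of one column over the rows belonging to one group."""
    values = [row[column] for row in rows if row[0] == group]
    return sum(values) / len(values)


for name, column in (("povrate", POVRATE), ("pcdocs", PCDOCS)):
    print(name,
          "treated:", round(group_mean("treated", column), 3),
          "control:", round(group_mean("control", column), 3))
```

The treated group's mean poverty rate is 0.6 against 0.34 for controls, and its mean doctors per capita is 0.015 against 0.032.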
Cont’d
The Basic Idea
1. Create a new control group
For each observation in the treatment group, select the
control observation that looks most like it based on the
selection variables (aka background characteristics)
2. Compute the treatment effect
Compare the average outcome in the treated group with
the average outcome in the control group
Cont’d
S. No T imrate povrate pcdocs Match using povrate Match using pcdocs
1 treated 10 0.5 0.01
2 treated 15 0.6 0.02
3 treated 22 0.7 0.01
4 treated 19 0.6 0.02
5 control 25 0.6 0.01
6 control 19 0.5 0.02
7 control 4 0.1 0.04
8 control 8 0.3 0.05
9 control 6 0.2 0.04
• Take povrate and pcdocs one at a time and match each treated observation
with the most similar control observation
• Then match on the two together. What do you observe?
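One way to work through the one-variable-at-a-time part of the exercise is sketched below in Python (for illustration only). It assumes nearest-neighbour matching with replacement on a single covariate; the function name and tuple layout are hypothetical.

```python
# Rows copied from the table above: (S. No, T, imrate, povrate, pcdocs).
rows = [
    (1, "treated", 10, 0.5, 0.01),
    (2, "treated", 15, 0.6, 0.02),
    (3, "treated", 22, 0.7, 0.01),
    (4, "treated", 19, 0.6, 0.02),
    (5, "control", 25, 0.6, 0.01),
    (6, "control", 19, 0.5, 0.02),
    (7, "control", 4, 0.1, 0.04),
    (8, "control", 8, 0.3, 0.05),
    (9, "control", 6, 0.2, 0.04),
]

POVRATE, PCDOCS = 3, 4  # column positions inside each row


def match_on(column):
    """Pair each treated row with the control row closest on one covariate."""
    treated = [r for r in rows if r[1] == "treated"]
    controls = [r for r in rows if r[1] == "control"]
    pairs = []
    for t in treated:
        nearest = min(controls, key=lambda c: abs(c[column] - t[column]))
        pairs.append((t[0], nearest[0]))  # (treated S. No, control S. No)
    return pairs


print("match on povrate:", match_on(POVRATE))
print("match on pcdocs: ", match_on(PCDOCS))
```

The two covariates select different controls for some treated observations, which is the motivation for collapsing them into a single score.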
Cont’d
Predicting Selection
How do we actually match treatment
observations to control observations?
In Stata, we use logistic or probit regression to predict:
Prob(T=1 | X1, X2, …, Xk)
In our example, the X variables are povrate and pcdocs.
So we run a logistic regression and save the predicted
probability of treatment.
We call this the propensity score.
The commands are (T must be coded 1 for treated and 0 for control):
logistic T povrate pcdocs
predict ps1
ps1 can be replaced by any name you want the propensity score to
have.
Cont’d
ps1 = predicted probability of treatment, or propensity score
S. No T imrate povrate pcdocs ps1 Match
1 treated 10 0.5 0.01 0.4165713
2 treated 15 0.6 0.02 0.7358171
3 treated 22 0.7 0.01 0.9284516
4 treated 19 0.6 0.02 0.7358171
5 control 25 0.6 0.01 0.752714
6 control 19 0.5 0.02 0.395162
7 control 4 0.1 0.04 0.0016534
8 control 8 0.3 0.05 0.026803
9 control 6 0.2 0.04 0.0070107
Exercise: Use the propensity score to match each treated observation with its
nearest control observation.
Find the average treatment effect on the treated:
((10+15+22+19)/4)-((19+25+25+25)/4) = -7
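The exercise can be checked with a short Python sketch (for illustration only). It assumes one-to-one nearest-neighbour matching with replacement on ps1, which is the rule that reproduces the matched control outcomes 19, 25, 25, 25 used above; variable names are assumptions.

```python
# Rows copied from the table above: (S. No, T, imrate, ps1).
rows = [
    (1, "treated", 10, 0.4165713),
    (2, "treated", 15, 0.7358171),
    (3, "treated", 22, 0.9284516),
    (4, "treated", 19, 0.7358171),
    (5, "control", 25, 0.752714),
    (6, "control", 19, 0.395162),
    (7, "control", 4, 0.0016534),
    (8, "control", 8, 0.026803),
    (9, "control", 6, 0.0070107),
]

treated = [r for r in rows if r[1] == "treated"]
controls = [r for r in rows if r[1] == "control"]

# For each treated row, pick the control with the closest propensity score.
# A control may be reused (matching with replacement).
matches = {}
for t in treated:
    nearest = min(controls, key=lambda c: abs(c[3] - t[3]))
    matches[t[0]] = nearest

treated_mean = sum(t[2] for t in treated) / len(treated)
matched_mean = sum(matches[t[0]][2] for t in treated) / len(treated)
att = treated_mean - matched_mean

print("matched control per treated unit:",
      {t_id: c[0] for t_id, c in matches.items()})
print("ATT =", att)
```

Control 5 is used three times, so the matched comparison rests on only two distinct control observations.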
Cont’d
How do we know how well matching worked?
1. Look at covariate balance between the treated and the
new control groups. They should be similar.
2. Compare distributions of propensity scores in the treated
and new control groups. They should be similar.
3. Compare distributions of the propensity
scores in the treated and original control groups.
If the two do not overlap much, then matching might not
work very well.
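The first check, covariate balance, can be sketched in Python for the infant-mortality example (for illustration only). It assumes the matched control group comes from nearest-neighbour matching with replacement on ps1; names and layout are assumptions.

```python
# Rows from the worked example: (S. No, T, povrate, pcdocs, ps1).
rows = [
    (1, "treated", 0.5, 0.01, 0.4165713),
    (2, "treated", 0.6, 0.02, 0.7358171),
    (3, "treated", 0.7, 0.01, 0.9284516),
    (4, "treated", 0.6, 0.02, 0.7358171),
    (5, "control", 0.6, 0.01, 0.752714),
    (6, "control", 0.5, 0.02, 0.395162),
    (7, "control", 0.1, 0.04, 0.0016534),
    (8, "control", 0.3, 0.05, 0.026803),
    (9, "control", 0.2, 0.04, 0.0070107),
]

treated = [r for r in rows if r[1] == "treated"]
controls = [r for r in rows if r[1] == "control"]

# New control group: nearest control on ps1 for each treated row,
# repeated as often as it is chosen.
matched = [min(controls, key=lambda c: abs(c[4] - t[4])) for t in treated]


def column_mean(group, column):
    """Mean of one column over a list of rows."""
    return sum(r[column] for r in group) / len(group)


balance = {}
for name, column in (("povrate", 2), ("pcdocs", 3)):
    balance[name] = (
        column_mean(treated, column),   # treated group
        column_mean(matched, column),   # new (matched) control group
        column_mean(controls, column),  # original control group
    )
    print(name, [round(v, 4) for v in balance[name]])
```

In this small example the matched controls sit much closer to the treated group on both covariates than the original controls do.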
Example 2 - use PSMExample.dta
• Command 1: psmatch2
• psmatch2 dfmfd sexhead agehead educhead lnland vaccess pcirr rice wheat oil egg, out(lexptot) common
• psgraph
• pstest
• psmatch2 dfmfd, out(lexptot) pscore(myscore) kernel k(normal) bw(0.01)
• psmatch2 dfmfd, out(lexptot) pscore(myscore) neighbor(2)
• psmatch2 dfmfd, out(lexptot) pscore(myscore) caliper(0.01)
• bs "psmatch2 dfmfd sexhead agehead educhead lnland vaccess pcirr rice wheat oil egg, out(lexptot)" "r(att)"
Cont’d
• Command 2: pscore
• pscore dfmfd sexhead agehead educhead lnland vaccess pcirr rice wheat oil egg, pscore(myscore) blockid(myblock) comsup
• psgraph, treated(dfmfd) pscore(myscore) bin(50)
• attnd lexptot dfmfd, pscore(myscore) comsup
• atts lexptot dfmfd, pscore(myscore) blockid(myblock) comsup
• attr lexptot dfmfd, pscore(myscore) radius(0.001) comsup
• attk lexptot dfmfd, pscore(myscore) comsup bootstrap reps(50)
Summarize: how to do PSM
Final comments on PSM and OLS
4.6. Difference-in-differences: Basic set-up
• 2 groups:
• Program group (“with program”)
• Comparison group (“without program”)
• 2 points in time:
• Baseline survey
• Follow-up survey
• Recommended: Follow-up survey is longitudinal at the
individual, household or locality level
Difference-in-Differences
[Figure: outcome plotted over time. Points A and B mark the program group; points C and D mark the comparison group. The changes B-A and D-C are shown.]
Impact = (B-A)-(D-C)
[Figure: the same plot with points E and F added and the distances A-E and C-F marked. Labels: "True change", "True impact". Annotation: when the true change differs from D-C, diff-in-diff under-estimates program impact.]
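The impact formula (B-A)-(D-C) can be illustrated with a few lines of Python. The four mean outcomes are hypothetical numbers chosen only to show the arithmetic, and reading A and C as baseline values and B and D as follow-up values is an assumption based on the basic set-up above.

```python
# Hypothetical mean outcomes; letters follow the labels in the formula.
A = 60.0  # program group, baseline (assumed reading of the label)
B = 74.0  # program group, follow-up
C = 50.0  # comparison group, baseline
D = 56.0  # comparison group, follow-up

change_program = B - A      # change over time with the program
change_comparison = D - C   # change over time without the program

# Difference-in-differences: the program group's change net of the
# change that the comparison group experienced over the same period.
impact = change_program - change_comparison

print("B-A =", change_program)
print("D-C =", change_comparison)
print("Impact =", impact)
```

The estimate equals the true impact only if the program group would have changed by D-C in the absence of the program.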
Difference-in-Differences: Extensions (3 points in time)
[Figure: outcome plotted at three points in time, with labelled points A, B, G, D and H. Impact 1 is marked at the second point in time and Impact 2 at the third.]
Key condition: “Parallel trends assumption” holds for each time period.
DID – More
• 4.24b