
Gary T. Henry
MacRae Professor of Public Policy
University of North Carolina at Chapel Hill

2008 Environmental Evaluator’s Networking Forum
June 12, 2008
Not just RCT vs. everything else
The Received Theory of Causality has Changed
◦ Campbell & Stanley (1966); Cook and Campbell (1979); Shadish, Cook and Campbell (2002)
◦ Design-based logic of inquiry approach
◦ Objective: to establish that a causal relationship exists and can reasonably be generalized
◦ Method: Making alternative explanations implausible
Units, individuals or plots of land, have
alternative potential outcomes, for example,
recycling/not recycling or deforested/forested,
respectively.
Each unit has alternative outcomes
Evaluation Question: Does an environmental
program alter the potential outcomes in the
desired direction for a unit?
For a particular unit, we would like to observe
the outcome after the intervention occurred
for two conditions:
1. If the unit was included in the intervention;
2. If the unit was not included in the intervention.
The objective is an unbiased estimate of the
effect of the program
Fortunately, someone (Donald Rubin) has
done the math for us…
(but unfortunately the fine print says we have to collect
the data)
The Objective: Unbiased estimate of the
effect of treatment
Possible assignments (X, treatment or
control)
Potential outcomes (Y)
YiT = outcome for individual i after exposure to
treatment
YiC = outcome for individual i after exposure to control
Unit   Potential outcome without program (YiC)   Potential outcome with program (YiT)   Label
1      deforested (0)                            forested (1)                           Program success
2      forested (1)                              forested (1)                           No difference
3      deforested (0)                            deforested (0)                         No difference
4      forested (1)                              deforested (0)                         Program failure

These four units exhaust all of the logical possibilities
$\delta_i = Y_{iT} - Y_{iC}$
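The four-unit table can be restated as a small calculation. The sketch below is only illustrative: it codes forested as 1 and deforested as 0, as the slide does, and computes each unit's effect as the outcome with the program minus the outcome without it. The names `units`, `unit_effect` and `LABELS` are assumptions made for this example.

```python
# Potential outcomes for the four units on the slide (forested = 1, deforested = 0).
# Each entry maps unit -> (Y_iC without the program, Y_iT with the program).
units = {1: (0, 1), 2: (1, 1), 3: (0, 0), 4: (1, 0)}

# Labels used on the slide for each possible unit-level effect.
LABELS = {1: "Program success", 0: "No difference", -1: "Program failure"}


def unit_effect(y_c, y_t):
    """Unit-level effect: outcome with the program minus outcome without it."""
    return y_t - y_c


effects = {unit: unit_effect(y_c, y_t) for unit, (y_c, y_t) in units.items()}

for unit, effect in effects.items():
    print(unit, effect, LABELS[effect])
```

In practice only one of the two potential outcomes is ever observed for a unit, so this calculation cannot be run on real data; it only makes the definition concrete.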
The fundamental problem with causal inference:
It is impossible to observe the ideal comparison
All designs including RCTs are approximations of the ideal
Causal inference requires assumptions: RCTs require the fewest
Extrapolation of treatment effects to target population requires
additional assumptions
Strata   Units in study population   YT   YC   Label
1        40                          1    0    Program success
2        20                          1    1    No difference
3        20                          0    0    No difference
4        20                          0    1    Program failure

With the program: 60 forested parcels
Without the program: 40 forested parcels
The program effect was 20 forested parcels, a 1.5-fold increase (50 percent) in forested parcels
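The arithmetic behind these figures can be checked directly from the strata table (40, 20, 20 and 20 units with their potential outcomes). The following is a minimal sketch; the variable names are assumptions for illustration, and the per-unit average is simply the parcel difference divided by the 100 units in the study population.

```python
# Strata from the slide: stratum -> (units in study population, Y_T, Y_C).
strata = {1: (40, 1, 0), 2: (20, 1, 1), 3: (20, 0, 0), 4: (20, 0, 1)}

total_units = sum(n for n, _, _ in strata.values())

# Forested parcels if every unit received the program, and if none did.
forested_with = sum(n * y_t for n, y_t, _ in strata.values())
forested_without = sum(n * y_c for n, _, y_c in strata.values())

effect_in_parcels = forested_with - forested_without
average_effect = effect_in_parcels / total_units  # average effect per unit
fold_increase = forested_with / forested_without

print(forested_with, forested_without, effect_in_parcels)
print(average_effect, fold_increase)
```

The calculation uses both potential outcomes for every unit, which is exactly the information an actual study never has.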
The average treatment effect (ATE)
$\delta = E(Y_T - Y_C)$
$\delta = E(Y_T) - E(Y_C)$
$\delta = \mu(Y_T) - \mu(Y_C)$
Independence = equivalence of the study population percentages for each stratum in control and treatment

Group             Strata   Percentage of study population   YT   YC
Control group     1        20                               1    0
Control group     2        10                               1    1
Control group     3        10                               0    0
Control group     4        10                               0    1
Treatment group   1        20                               1    0
Treatment group   2        10                               1    1
Treatment group   3        10                               0    0
Treatment group   4        10                               0    1
Independence = equivalence of the study population percentages for each stratum in control and treatment

Group             Strata   Percentage of study population   YT   YC
Control group     1        20                               ?    0
Control group     2        10                               ?    1
Control group     3        10                               ?    0
Control group     4        10                               ?    1
Treatment group   1        20                               1    ?
Treatment group   2        10                               1    ?
Treatment group   3        10                               0    ?
Treatment group   4        10                               0    ?
To complete the ingredients needed for
causal attribution (unbiased effect size
estimate) we need a switch to assign units
to treatment and control
We need a switch that meets the
independence assumption: creates
equivalent groups…
The Switch (S)
The switch assigns each individual to treatment (S = 1) or control (S = 0)
$E(Y_i \mid S_i = 1) - E(Y_i \mid S_i = 0)$
Group             Strata   Percentage of study population   Potential YT   Potential YC   X   Observed YT   Observed YC
Control group     1        20                               1              0              0   *             0
Control group     2        10                               1              1              0   *             1
Control group     3        10                               0              0              0   *             0
Control group     4        10                               0              1              0   *             1
Treatment group   1        20                               1              0              1   1             *
Treatment group   2        10                               1              1              1   1             *
Treatment group   3        10                               0              0              1   0             *
Treatment group   4        10                               0              1              1   0             *
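When the two groups contain each stratum in the same proportions, the difference between the observed group means recovers the average treatment effect even though half of the potential outcomes are missing. The sketch below works through the observed columns of the table; the list layout and the helper name `group_mean` are assumptions for illustration.

```python
# Observed rows from the table: (assignment X, percentage of study population, observed Y).
# X = 1 is the treatment group (Y_T observed); X = 0 is the control group (Y_C observed).
observed = [
    (0, 20, 0), (0, 10, 1), (0, 10, 0), (0, 10, 1),
    (1, 20, 1), (1, 10, 1), (1, 10, 0), (1, 10, 0),
]


def group_mean(rows, x):
    """Percentage-weighted mean of the observed outcome within one group."""
    weight = sum(pct for grp, pct, _ in rows if grp == x)
    return sum(pct * y for grp, pct, y in rows if grp == x) / weight


mean_treated = group_mean(observed, 1)  # 30 of 50 percentage points have Y = 1
mean_control = group_mean(observed, 0)  # 20 of 50 percentage points have Y = 1
estimate = mean_treated - mean_control

print(round(mean_treated, 3), round(mean_control, 3), round(estimate, 3))
```

The estimate of 0.2 matches the effect of 20 forested parcels per 100 units computed from the full set of potential outcomes, which is what the independence assumption is meant to guarantee.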
1. Random assignment to treatment & control
If independence produces equivalence, “extraneous” sources of variation (aka influence of disturbing variables) are equally distributed across treatment and control
Simplifies analysis
2. Matched sampling
3. Matched sampling using propensity scores
Propensity scores are each individual’s probability of being assigned to treatment
Each individual in treatment is matched to an individual in control with a similar propensity score
4. Cutoff on assignment variable assigns individuals to treatment and control (regression discontinuity)
If the model is correctly specified, produces an unbiased estimate of the average treatment effect
5. Instrumental variable
6. Fixed effects (within individual estimates for panel data)
Or using regression to adjust estimates…
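To see why the choice of switch matters, the list above can be illustrated with a small simulation. This is a sketch under stated assumptions, not a method from the talk: the strata shares and potential outcomes come from the earlier slides, while the sample size, the seed, and the self-selecting switch (treatment probability 0.8 for strata 1 and 2, 0.2 otherwise) are hypothetical values chosen for illustration.

```python
import random

# Strata shares and potential outcomes from the slides: stratum -> (share, Y_T, Y_C).
STRATA = {1: (0.4, 1, 0), 2: (0.2, 1, 1), 3: (0.2, 0, 0), 4: (0.2, 0, 1)}


def draw_units(n, rng):
    """Draw n units, each labelled with its stratum, according to the shares."""
    keys = list(STRATA)
    weights = [STRATA[k][0] for k in keys]
    return rng.choices(keys, weights=weights, k=n)


def difference_in_means(units, assign):
    """Observed mean outcome of treated units minus that of control units."""
    treated, control = [], []
    for stratum in units:
        _, y_t, y_c = STRATA[stratum]
        if assign(stratum):
            treated.append(y_t)   # only Y_T is observed for treated units
        else:
            control.append(y_c)   # only Y_C is observed for control units
    return sum(treated) / len(treated) - sum(control) / len(control)


rng = random.Random(0)            # fixed seed (assumption) for repeatability
units = draw_units(100_000, rng)  # hypothetical sample size

# Switch 1: a coin flip, independent of the potential outcomes.
random_estimate = difference_in_means(units, lambda s: rng.random() < 0.5)

# Switch 2 (hypothetical): units that would be forested under the program
# (strata 1 and 2) are far more likely to end up in treatment.
biased_estimate = difference_in_means(
    units, lambda s: rng.random() < (0.8 if s in (1, 2) else 0.2)
)

print(round(random_estimate, 2), round(biased_estimate, 2))
```

With the coin-flip switch the estimate should land close to the true effect of 0.2; with the self-selecting switch it is pushed well above it, because the treatment and control groups no longer contain the strata in the same proportions.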
Several important studies about differences in effect sizes between experimental and observational studies
Lipsey and Wilson (1992)
Weisburd, Lum & Petrosino (2001)
Glazerman, Levy & Myers (2003) matched sample labor force interventions; assumed randomized experiment unbiased
5. Matching works well (better with one-to-one matching and extensive covariates);
6. Regression works well (better with specification tests, numerous controls, especially pretests);
7. Large sample studies less biased
8. Controls selected from “similar” sites
Large consensus that regression discontinuity is second best switch after randomized control trials (van der Klaauw 2003; Trochim, Cappelleri & Reichardt 1991)
Columns: LEANAME; Students (ADM); Proficient; Teacher Retention; Teacher % greater than 5 years; % Above Poverty; Combined; Rank
WELDON CITY 1,078 51.10 81.72 55.56 73.18 261.56 116 Y
VANCE COUNTY 8,157 62.70 78.96 58.53 74.06 274.25 115 Y
HERTFORD COUNTY 3,606 61.30 81.25 65.17 70.78 278.50 114 Y
HOKE COUNTY 6,593 66.40 72.41 62.47 77.75 279.03 113 Y
WARREN COUNTY 3,110 67.30 82.79 62.56 70.07 282.72 112 Y
LEXINGTON CITY 3,162 64.30 86.75 54.88 76.92 282.85 111 Y
NORTHAMPTON COUNTY 3,254 63.80 83.22 67.58 70.52 285.12 110 Y
HALIFAX COUNTY 5,428 67.10 90.43 62.56 66.19 286.28 109 Y
THOMASVILLE CITY 2,666 64.10 78.86 63.27 81.10 287.33 108 Y
WASHINGTON COUNTY 2,155 57.40 88.36 71.20 73.13 290.09 107 Y
EDGECOMBE COUNTY 7,591 66.40 81.67 66.98 76.89 291.94 106 Y
FRANKLIN COUNTY 7,877 72.80 78.47 59.45 81.99 292.71 105 Y
MONTGOMERY COUNTY 4,502 68.20 81.95 64.26 78.65 293.06 104 Y
ROBESON COUNTY 24,134 67.80 86.03 68.04 74.01 295.88 103 Y
HYDE COUNTY 670 73.50 85.53 69.01 69.22 297.26 102 Y
ELIZABETH CITY/PASQUOTANK 5,902 71.90 81.43 70.55 75.52 299.40 101 Y
KANNAPOLIS CITY 4,673 72.10 87.43 56.10 83.90 299.53 100
DURHAM COUNTY 30,810 71.20 81.24 64.72 82.70 299.86 99
TYRRELL COUNTY 644 82.80 75.44 73.13 68.52 299.89 98
BERTIE COUNTY 3,404 65.50 92.31 70.52 71.99 300.32 97
DUPLIN COUNTY 8,802 75.60 79.62 67.97 78.25 301.44 96
HARNETT COUNTY 16,917 75.50 81.66 63.17 81.22 301.55 95
ANSON COUNTY 4,403 63.50 89.86 71.61 78.02 302.99 94
GREENE COUNTY 3,222 72.30 86.70 64.84 79.93 303.77 93
LENOIR COUNTY 10,211 77.70 79.87 70.21 76.25 304.03 92
CHARLOTTE/MECKLENBURG 117,773 75.60 83.27 59.17 86.58 304.62 91
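The table can be read as an example of a cutoff switch. In the rows shown, the Combined column equals the sum of the four percentage columns, and the Y flag appears only for ranks 101 and above. The sketch below checks both patterns on a few rows copied from the table. Treating the Y flag as the assignment indicator and rank 101 as the cutoff is an assumption inferred from the rows shown, not something the slide states; the names `RANK_CUTOFF`, `combined_score` and `assigned` are hypothetical.

```python
# Rows copied from the table: (district, proficient, teacher retention,
# teachers with more than 5 years, above poverty, combined, rank, flagged Y).
rows = [
    ("WELDON CITY", 51.10, 81.72, 55.56, 73.18, 261.56, 116, True),
    ("ELIZABETH CITY/PASQUOTANK", 71.90, 81.43, 70.55, 75.52, 299.40, 101, True),
    ("KANNAPOLIS CITY", 72.10, 87.43, 56.10, 83.90, 299.53, 100, False),
    ("DURHAM COUNTY", 71.20, 81.24, 64.72, 82.70, 299.86, 99, False),
]

RANK_CUTOFF = 101  # assumption: lowest rank carrying the Y flag in the table


def combined_score(proficient, retention, experienced, above_poverty):
    """Assignment variable: sum of the four percentage indicators."""
    return round(proficient + retention + experienced + above_poverty, 2)


def assigned(rank):
    """Cutoff switch: units at or beyond the cutoff rank are flagged."""
    return rank >= RANK_CUTOFF


for name, prof, ret, exp, pov, combined, rank, flagged in rows:
    print(name, combined_score(prof, ret, exp, pov), combined, assigned(rank), flagged)
```

Because assignment depends only on which side of the cutoff a district falls, districts just above and just below it (for example ranks 101 and 100) are plausibly similar, which is the comparison a regression discontinuity design relies on.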
(Net Effects Model) - All Students

Variable                        p-value   Coefficient   Std. Err.
Intercept                       0.000     -0.097667     (0.021596)
Asian mean                      0.466     -0.001846     (0.002530)
Black mean                      0.067      0.000669     (0.000365)
Hispanic mean                   0.266      0.001985     (0.001781)
Multiracial mean                0.039      0.008692     (0.004204)
American Indian mean            0.168     -0.001173     (0.000849)
Free lunch mean                 0.070     -0.001265     (0.000698)
Reduced lunch mean              0.569      0.000825     (0.001450)
School size                     0.236      0.000016     (0.000014)
DSSF Dummy                      0.000      0.164639     (0.036082)
Year 2006                       0.001     -0.037745     (0.010466)
Combined                        0.000      0.471141     (0.123720)
Combined Squared                0.020      0.524394     (0.224090)
Combined Cubed                  0.001     -2.669102     (0.731942)
Regular classroom instruction   ----       ----          ----

1. Estimates with individual level controls


Group               Strata   Percentage of target population   Potential YT   Potential YC   X   Observed YT   Observed YC
Control group       1        10                                1              0              0   *             0
Control group       2        5                                 1              1              0   *             1
Control group       3        5                                 0              0              0   *             0
Control group       4        5                                 0              1              0   *             1
Treatment group     1        10                                1              0              1   1             *
Treatment group     2        5                                 1              1              1   1             *
Treatment group     3        5                                 0              0              1   0             *
Treatment group     4        5                                 0              1              1   0             *
Unobserved sample   1        20                                1              0              *   *             *
Unobserved sample   2        10                                1              1              *   *             *
Unobserved sample   3        10                                0              0              *   *             *
Unobserved sample   4        10                                0              1              *   *             *
1. What kind of evidence is needed to influence environmental policy and program decisions?
2. Is there a program on the horizon for which it would be helpful to have this information?
… likely to have large benefits?
… highly controversial?
Can you find the resources to invest in obtaining trustworthy information about program effects?
Consider the extrapolation problem: how to estimate effects on target population based on study population.
