
Inference for Average Treatment Effects

Kosuke Imai

Harvard University

STAT 186/GOV 2002 Causal Inference

Fall 2019

Kosuke Imai (Harvard) Average Treatment Effects Stat186/Gov2002 Fall 2019 1 / 15


Motivation

Two limitations of permutation inference:
1 causal heterogeneity
2 population inference

Fundamental problem of causal inference: cannot identify individual causal effects

Neyman’s approach:
1 Average treatment effects as causal quantities of interest: the sample average treatment effect (SATE) and the population average treatment effect (PATE)
2 Design-based approach: randomization of treatment assignment, random sampling
3 Asymptotic approximation rather than exact inference



Social Pressure and Turnout (Gerber, et al. 2008. Am. Political Sci. Rev.)

August 2006 Primary Election in Michigan


Statewide elections: Governor, US Senator
180,000 households
Send postcards with different messages

Randomly assign each household to a group (or treatment)


1 no message (control group)
2 civic duty message
3 “you are being studied” message (Hawthorne effect)
4 household social pressure message
5 neighborhood social pressure message



Neighborhood Social Pressure Message



“You are being studied” Message



Standard Empirical Analysis
Groups         Control   Civic duty   Hawthorne   Self     Neighbor
Turnout rate   29.7%     31.5%        32.2%       34.5%    37.5%
# of voters    191,243   38,218       38,204      38,218   38,201

Neighborhood social pressure vs. Control (in percentage points):

\[ \hat\tau = 37.5 - 29.7 = 7.8 \]

\[ \text{s.e.} = \sqrt{\frac{37.5 \times (100 - 37.5)}{38201} + \frac{29.7 \times (100 - 29.7)}{191243}} \approx 0.3 \]

\[ 95\%\ \text{CI} = [7.8 - 1.96 \times 0.3,\ 7.8 + 1.96 \times 0.3] = [7.2,\ 8.4] \]

This calculation ignores the fact that some households have multiple voters: we will discuss this issue later in the course

How can we justify this standard difference-in-means analysis from the randomization perspective?
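
The arithmetic on this slide can be reproduced in a few lines. This is a minimal sketch that takes the turnout rates and group sizes from the table as given; the variable names are illustrative and not part of the original analysis.

```python
import math

# Turnout rates (in percent) and group sizes copied from the table above.
rate_neighbor, n_neighbor = 37.5, 38201
rate_control, n_control = 29.7, 191243

# Difference in means, in percentage points.
tau_hat = rate_neighbor - rate_control

# Standard error for a difference of two independent proportions,
# with rates kept on the 0-100 scale as on the slide.
se = math.sqrt(
    rate_neighbor * (100 - rate_neighbor) / n_neighbor
    + rate_control * (100 - rate_control) / n_control
)

# 95% interval using the unrounded standard error.
ci = (tau_hat - 1.96 * se, tau_hat + 1.96 * se)

print(f"tau_hat = {tau_hat:.1f}, s.e. = {se:.3f}, "
      f"95% CI = [{ci[0]:.2f}, {ci[1]:.2f}]")
```

The unrounded standard error is about 0.27, so the interval computed this way is roughly [7.3, 8.3], slightly narrower than the [7.2, 8.4] obtained by rounding the standard error to 0.3 first.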



Estimation of the Sample Average Treatment Effect

Due to Neyman (1923); English translation published as Neyman. 1990. Stat. Sci.


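Difference-in-means estimator:

\[ \hat\tau \equiv \frac{1}{n_1} \sum_{i=1}^n T_i Y_i - \frac{1}{n_0} \sum_{i=1}^n (1 - T_i) Y_i \]

Unbiasedness (over repeated treatment assignments):

\[
\begin{aligned}
E(\hat\tau \mid O_n) &= \frac{1}{n_1} \sum_{i=1}^n E(T_i \mid O_n) Y_i(1) - \frac{1}{n_0} \sum_{i=1}^n \{1 - E(T_i \mid O_n)\} Y_i(0) \\
&= \frac{1}{n} \sum_{i=1}^n (Y_i(1) - Y_i(0)) = \text{SATE}
\end{aligned}
\]

where $O_n = \{Y_i(0), Y_i(1)\}_{i=1}^n$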
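
Unbiasedness over repeated treatment assignments can be checked exactly on a small example by listing every possible assignment. The sketch below assumes complete randomization with a fixed number of treated units; the potential outcomes are made up for illustration, and the function names are not from the lecture.

```python
from itertools import combinations

def diff_in_means(y1, y0, treated):
    """Difference-in-means estimate for one assignment.

    `treated` is the set of unit indices assigned to treatment; treated
    units reveal Y_i(1) and control units reveal Y_i(0).
    """
    n = len(y1)
    control = [i for i in range(n) if i not in treated]
    mean_treated = sum(y1[i] for i in treated) / len(treated)
    mean_control = sum(y0[i] for i in control) / len(control)
    return mean_treated - mean_control

def all_estimates(y1, y0, n1):
    """Estimates under every way of choosing n1 treated units out of n."""
    n = len(y1)
    return [diff_in_means(y1, y0, set(c)) for c in combinations(range(n), n1)]

# Hypothetical potential outcomes with heterogeneous unit-level effects.
y1 = [3.0, 5.0, 4.0, 8.0, 6.0, 7.0]
y0 = [1.0, 4.0, 4.0, 5.0, 2.0, 9.0]

estimates = all_estimates(y1, y0, n1=3)

# Each assignment is equally likely, so the expectation is a plain average.
expected_tau_hat = sum(estimates) / len(estimates)
sate = sum(a - b for a, b in zip(y1, y0)) / len(y1)

print(expected_tau_hat, sate)
```

Individual estimates differ from assignment to assignment, but their average over all 20 assignments coincides with the SATE up to floating-point error.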



The Variance of the Difference-in-Means Estimator

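Variance of $\hat\tau$:

\[ V(\hat\tau \mid O_n) = \frac{1}{n} \left( \frac{n_0}{n_1} S_1^2 + \frac{n_1}{n_0} S_0^2 + 2 S_{01} \right), \]

where for $t = 0, 1$,

\[ S_t^2 = \frac{1}{n-1} \sum_{i=1}^n (Y_i(t) - \overline{Y(t)})^2 \quad \text{(sample variance of } Y_i(t)\text{)} \]

\[ S_{01} = \frac{1}{n-1} \sum_{i=1}^n (Y_i(0) - \overline{Y(0)})(Y_i(1) - \overline{Y(1)}) \quad \text{(sample covariance)} \]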

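The variance is NOT identifiable: $S_{01}$ depends on the joint distribution of $Y_i(1)$ and $Y_i(0)$, and only one of the two is observed for each unit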
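
The variance formula can be compared against the exact randomization variance on a toy example. This sketch is only possible because both potential outcomes are invented here; with real data the covariance term could not be computed. Function names are illustrative.

```python
from itertools import combinations
from statistics import mean, pvariance, variance

def sample_cov(a, b):
    """Sample covariance with the n - 1 denominator used for S_01."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)

def formula_variance(y1, y0, n1):
    """V(tau_hat | O_n) from the closed-form expression on the slide."""
    n = len(y1)
    n0 = n - n1
    return (n0 / n1 * variance(y1) + n1 / n0 * variance(y0)
            + 2 * sample_cov(y0, y1)) / n

def exact_variance(y1, y0, n1):
    """Variance of tau_hat over all equally likely assignments."""
    n = len(y1)
    estimates = []
    for treated in combinations(range(n), n1):
        chosen = set(treated)
        treated_mean = mean([y1[i] for i in chosen])
        control_mean = mean([y0[i] for i in range(n) if i not in chosen])
        estimates.append(treated_mean - control_mean)
    return pvariance(estimates)

# Hypothetical potential outcomes and an unbalanced design (2 treated, 4 control).
y1 = [3.0, 5.0, 4.0, 8.0, 6.0, 7.0]
y0 = [1.0, 4.0, 4.0, 5.0, 2.0, 9.0]

v_formula = formula_variance(y1, y0, n1=2)
v_exact = exact_variance(y1, y0, n1=2)
print(v_formula, v_exact)
```

The two numbers agree up to floating-point error, which is a useful sanity check on the signs and the $n_0/n_1$, $n_1/n_0$ weights in the formula.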



Details of the Variance Derivation
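1 Let $X_i = Y_i(1) + n_1 Y_i(0)/n_0$ and $D_i = nT_i/n_1 - 1$, and write

\[ V(\hat\tau \mid O_n) = \frac{1}{n^2} E\left\{ \left( \sum_{i=1}^n D_i X_i \right)^2 \,\Bigg|\, O_n \right\} \]

2 Show, for $i \neq j$,

\[ E(D_i \mid O_n) = 0, \quad E(D_i^2 \mid O_n) = \frac{n_0}{n_1}, \quad E(D_i D_j \mid O_n) = -\frac{n_0}{n_1 (n - 1)} \]

3 Use steps 1 and 2 to show

\[ V(\hat\tau \mid O_n) = \frac{n_0}{n(n-1)n_1} \sum_{i=1}^n (X_i - \overline{X})^2 \]

4 Substitute the potential outcome expressions for $X_i$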
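
The last step is stated without detail; one way to carry it out, using only the definitions of $X_i$, $S_t^2$ and $S_{01}$ given earlier, is sketched below.

```latex
\begin{aligned}
X_i - \overline{X}
  &= \{Y_i(1) - \overline{Y(1)}\} + \frac{n_1}{n_0}\{Y_i(0) - \overline{Y(0)}\}, \\
\sum_{i=1}^n (X_i - \overline{X})^2
  &= (n-1)\left( S_1^2 + \frac{n_1^2}{n_0^2} S_0^2 + 2\,\frac{n_1}{n_0} S_{01} \right), \\
V(\hat\tau \mid O_n)
  &= \frac{n_0}{n(n-1)n_1} \cdot (n-1)\left( S_1^2 + \frac{n_1^2}{n_0^2} S_0^2 + 2\,\frac{n_1}{n_0} S_{01} \right) \\
  &= \frac{1}{n}\left( \frac{n_0}{n_1} S_1^2 + \frac{n_1}{n_0} S_0^2 + 2 S_{01} \right).
\end{aligned}
```

The final line matches the variance expression stated on the previous slide.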


Conservative Variance Estimator
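The usual variance estimator is conservative on average:

\[ V(\hat\tau \mid O_n) \le \frac{S_1^2}{n_1} + \frac{S_0^2}{n_0} = E\left( \frac{\hat\sigma_1^2}{n_1} + \frac{\hat\sigma_0^2}{n_0} \,\Bigg|\, O_n \right) \]

where

\[ \hat\sigma_t^2 = \frac{1}{n_t - 1} \sum_{i=1}^n 1\{T_i = t\} (Y_i - \overline{Y}_t)^2 \quad \text{for } t = 0, 1 \]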

Under the constant additive unit causal effect assumption, i.e., $Y_i(1) - Y_i(0) = c$ for all $i$,

\[ S_{01} = \frac{1}{2}(S_1^2 + S_0^2) \quad \text{and} \quad V(\hat\tau \mid O_n) = \frac{S_1^2}{n_1} + \frac{S_0^2}{n_0} \]
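
Both claims on this slide can be checked by enumeration on a toy example: the usual estimator averages to $S_1^2/n_1 + S_0^2/n_0$, which is at least the true variance, with equality under a constant additive effect. The sketch assumes complete randomization; the data and function names are hypothetical.

```python
from itertools import combinations
from statistics import mean, pvariance, variance

def usual_variance_estimate(y_treated, y_control):
    """sigma_hat_1^2 / n1 + sigma_hat_0^2 / n0 from observed outcomes."""
    return (variance(y_treated) / len(y_treated)
            + variance(y_control) / len(y_control))

def enumerate_design(y1, y0, n1):
    """Point and variance estimates under every possible assignment."""
    n = len(y1)
    taus, v_hats = [], []
    for treated in combinations(range(n), n1):
        chosen = set(treated)
        y_treated = [y1[i] for i in chosen]
        y_control = [y0[i] for i in range(n) if i not in chosen]
        taus.append(mean(y_treated) - mean(y_control))
        v_hats.append(usual_variance_estimate(y_treated, y_control))
    return taus, v_hats

n1 = n0 = 3
y0 = [1.0, 4.0, 4.0, 5.0, 2.0, 9.0]

# Case 1: heterogeneous effects (hypothetical values).
y1 = [3.0, 5.0, 4.0, 8.0, 6.0, 7.0]
taus, v_hats = enumerate_design(y1, y0, n1)
true_var = pvariance(taus)          # exact randomization variance
expected_v_hat = mean(v_hats)       # expectation of the estimator
upper = variance(y1) / n1 + variance(y0) / n0

# Case 2: constant additive effect c = 2 for every unit.
y1_const = [y + 2.0 for y in y0]
taus_c, v_hats_c = enumerate_design(y1_const, y0, n1)
true_var_c = pvariance(taus_c)
expected_v_hat_c = mean(v_hats_c)

print(true_var, expected_v_hat, upper)
print(true_var_c, expected_v_hat_c)
```

In the heterogeneous case the expected estimate exceeds the true variance; in the constant-effect case the two coincide up to floating-point error.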
The optimal treatment assignment rule:

\[ n_1^{\text{opt}} = \frac{n}{1 + S_0/S_1}, \quad n_0^{\text{opt}} = \frac{n}{1 + S_1/S_0} \]
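
The allocation rule is easy to evaluate numerically. The sketch below reads the rule as the split that minimizes $S_1^2/n_1 + S_0^2/n_0$ subject to $n_1 + n_0 = n$, which is an interpretation rather than something the slide states; the standard deviations are hypothetical, and in practice they would have to be guessed or estimated from pilot data.

```python
def optimal_allocation(n, s1, s0):
    """Real-valued group sizes from the rule on the slide.

    Assumes s1 > 0 and s0 > 0; the results generally need rounding
    to integers before use.
    """
    n1 = n / (1 + s0 / s1)
    n0 = n / (1 + s1 / s0)
    return n1, n0

def variance_bound(s1, s0, n1, n0):
    """S_1^2 / n1 + S_0^2 / n0 for a given split."""
    return s1 ** 2 / n1 + s0 ** 2 / n0

# Hypothetical example: treated outcomes are three times as variable.
n, s1, s0 = 100, 3.0, 1.0
n1_opt, n0_opt = optimal_allocation(n, s1, s0)

bound_opt = variance_bound(s1, s0, n1_opt, n0_opt)
bound_equal = variance_bound(s1, s0, n / 2, n / 2)
print(n1_opt, n0_opt, bound_opt, bound_equal)
```

More units go to the arm with the more variable outcome, and the resulting bound is smaller than under an equal split.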
Bounds on the Variance

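Use of the Cauchy-Schwarz inequality:
1 Upper bound: sample correlation between $Y_i(1)$ and $Y_i(0)$ is 1
2 Lower bound: sample correlation between $Y_i(1)$ and $Y_i(0)$ is $-1$

\[ \frac{n_0 n_1}{n} \left( \frac{S_1}{n_1} - \frac{S_0}{n_0} \right)^2 \le V(\hat\tau \mid O_n) \le \frac{n_0 n_1}{n} \left( \frac{S_1}{n_1} + \frac{S_0}{n_0} \right)^2 \]

Constant additive unit causal effect $\Longrightarrow$ sample correlation is 1 (and $S_1 = S_0$):

\[ \frac{n_0 n_1}{n} \left( \frac{S_1}{n_1} + \frac{S_0}{n_0} \right)^2 = \frac{S_1^2}{n_1} + \frac{S_0^2}{n_0} \]

Sharp bounds based on the entire marginal distributions: application of Hoeffding’s lemma (Aronow et al. 2015. Ann. Stat.)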
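
The Cauchy-Schwarz bounds depend only on the marginal quantities $S_1$ and $S_0$, so they are simple to compute. This is a minimal sketch with hypothetical inputs; `variance_given_corr` is an illustrative helper that plugs $S_{01} = r S_1 S_0$ into the variance formula to show where the true variance falls for a given correlation $r$.

```python
def variance_bounds(s1, s0, n1, n0):
    """Lower and upper bounds on V(tau_hat | O_n) from the slide."""
    n = n1 + n0
    scale = n0 * n1 / n
    lower = scale * (s1 / n1 - s0 / n0) ** 2
    upper = scale * (s1 / n1 + s0 / n0) ** 2
    return lower, upper

def variance_given_corr(r, s1, s0, n1, n0):
    """Variance formula with the sample covariance set to r * s1 * s0."""
    n = n1 + n0
    return (n0 / n1 * s1 ** 2 + n1 / n0 * s0 ** 2 + 2 * r * s1 * s0) / n

# Hypothetical standard deviations and group sizes.
s1, s0, n1, n0 = 2.0, 1.5, 40, 60
lower, upper = variance_bounds(s1, s0, n1, n0)

for r in (-1.0, 0.0, 1.0):
    v = variance_given_corr(r, s1, s0, n1, n0)
    print(f"r = {r:+.0f}: variance = {v:.4f}, bounds = [{lower:.4f}, {upper:.4f}]")
```

The variance attains the lower bound at $r = -1$ and the upper bound at $r = 1$, and lies strictly between them otherwise.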



Inference for Population Average Treatment Effect
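Assumption: simple random sampling from an infinite population

Unbiasedness (over repeated sampling):

\[ E\{E(\hat\tau \mid O_n)\} = E(\text{SATE}) = \text{PATE} \]

Variance:

\[ V(\hat\tau) = V\{E(\hat\tau \mid O_n)\} + E\{V(\hat\tau \mid O_n)\} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_0^2}{n_0} \]

where $\sigma_t^2$ is the population variance of $Y_i(t)$ for $t = 0, 1$

Unbiased variance estimator:

\[ \widehat{V}(\hat\tau) = \frac{\hat\sigma_1^2}{n_1} + \frac{\hat\sigma_0^2}{n_0} \quad \text{where} \quad E\{\widehat{V}(\hat\tau)\} = V(\hat\tau) \]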
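
The step from the variance decomposition to $\sigma_1^2/n_1 + \sigma_0^2/n_0$ can be spelled out as follows. The symbols $S_\tau^2$ and $\sigma_\tau^2$ are introduced here for illustration and do not appear in the slides: they denote the sample and population variances of the unit-level effect $Y_i(1) - Y_i(0)$, so that $S_\tau^2 = S_1^2 + S_0^2 - 2S_{01}$.

```latex
\begin{aligned}
V(\hat\tau \mid O_n)
  &= \frac{S_1^2}{n_1} + \frac{S_0^2}{n_0} - \frac{S_\tau^2}{n}
  && \text{(rewriting the finite-sample variance)} \\
E\{V(\hat\tau \mid O_n)\}
  &= \frac{\sigma_1^2}{n_1} + \frac{\sigma_0^2}{n_0} - \frac{\sigma_\tau^2}{n}
  && \text{(sample variances are unbiased)} \\
V\{E(\hat\tau \mid O_n)\}
  &= V(\text{SATE}) = \frac{\sigma_\tau^2}{n}
  && \text{(mean of } n \text{ i.i.d. unit effects)} \\
V(\hat\tau)
  &= \frac{\sigma_1^2}{n_1} + \frac{\sigma_0^2}{n_0}
\end{aligned}
```

The unidentifiable term $\sigma_\tau^2/n$ cancels, which is why the usual estimator is conservative for the SATE but unbiased for the variance under PATE inference.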
Asymptotic Inference for PATE
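Hold $k = n_1/n$ constant:

\[ \hat\tau = \frac{1}{n} \sum_{i=1}^n \underbrace{\left( \frac{T_i Y_i(1)}{k} - \frac{(1 - T_i) Y_i(0)}{1 - k} \right)}_{\text{i.i.d. with mean PATE \& variance } nV(\hat\tau)} \]

Consistency via the Law of Large Numbers:

\[ \hat\tau \stackrel{p}{\longrightarrow} \text{PATE} \]

Asymptotic normality via the Central Limit Theorem:

\[ \sqrt{n}(\hat\tau - \text{PATE}) \stackrel{d}{\longrightarrow} N\left( 0,\ \frac{\sigma_1^2}{k} + \frac{\sigma_0^2}{1 - k} \right) \]

$(1 - \alpha) \times 100\%$ confidence intervals:

\[ [\hat\tau - \text{s.e.} \times z_{\alpha/2},\ \hat\tau + \text{s.e.} \times z_{\alpha/2}] \]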
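
The interval above can be computed directly from observed outcomes. This is a minimal sketch using only the Python standard library; it takes the standard error to be the square root of the usual variance estimator and $z_{\alpha/2}$ to be the upper $\alpha/2$ quantile of the standard normal. The function name and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def neyman_ci(y_treated, y_control, alpha=0.05):
    """Point estimate, standard error, and (1 - alpha) normal-approximation CI."""
    tau_hat = mean(y_treated) - mean(y_control)
    se = sqrt(variance(y_treated) / len(y_treated)
              + variance(y_control) / len(y_control))
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 when alpha = 0.05
    return tau_hat, se, (tau_hat - z * se, tau_hat + z * se)

# Hypothetical binary outcomes (1 = voted) for a small experiment.
y_treated = [1, 0, 1, 1, 0, 1, 1, 0]
y_control = [0, 0, 1, 0, 1, 0, 0, 0]

tau_hat, se, (ci_low, ci_high) = neyman_ci(y_treated, y_control)
print(f"tau_hat = {tau_hat:.3f}, s.e. = {se:.3f}, "
      f"95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

With samples this small the normal approximation is rough; the example is meant only to show the mechanics of the calculation.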



Exchange at the Royal Statistical Society
(Neyman et al. (1935) Suppl. of J. Royal Stat. Soc.)

Neyman: So long as the average yields of any treatments are identical, the question as to whether these treatments affect separate yields on single plots seems to be uninteresting

Fisher: It may be foolish, but that is what the z test was designed for,
and the only purpose for which it has been used.

Neyman: I am considering problems which are important from the point of view of agriculture.

Fisher: It may be that the question which Dr. Neyman thinks should be
answered is more important than the one I have proposed and
attempted to answer. I suggest that before criticizing previous work it is
always wise to give enough study to the subject to understand its
purpose.
Summary: Fisher vs. Neyman

Like Fisher, Neyman proposed randomization-based inference


Unlike Fisher,
1 estimands are average treatment effects
2 heterogeneous treatment effects are allowed
3 population as well as sample inference is possible
4 asymptotic approximation is required for inference

Reading: Imbens and Rubin, Chapter 6
