Compliance

L7 EC 402, 2022
Compliance, IV and LATE

Reading for next week:
Chapter on Matching
1 Dealing with noncompliance

When compliance was not a factor, the observed outcome Yi could be expressed in terms of the potential
outcomes Yi1, Yi0 as
Yi = Yi0 + (Yi1 − Yi0)Di
But given the distinction between assignment (Z) and participation (D), one can expand the
potential outcomes framework to consider:
Assignments [Zi = 0; Zi = 1] (under random assignment, for example)
Potential treatments that are a function of assignments

Di (Zi ) : [Di1 = 1; Di0 = 1; Di0 = 0; Di1 = 0].
One way to think of the relationship between assignment and treatment/participation is to let
Di∗ be the utility that i derives from participation (or any other criterion driving participation):
Di∗ = α + βZi + ui ; E[ui] = 0

In the case of random assignment Z is binary, but the specification is more general. Further,
Di = 1 if Di∗ ≥ 0; Di = 0 if Di∗ < 0
This is the process that describes selection; there are two components:
– D driven by Z ⇒ Selection on observables

– D driven by u ⇒ Selection on unobservables
Potential outcomes that are now a function of both D and Z, so

Yi (D, Z) : [Yi (1, 0); Yi (1, 1); Yi (0, 1); Yi (0, 0)].
Note these are all potential outcomes and treatments, so the missing observation problem means
that from each of these sets, only one is observed.
There are in principle four types of units
Compliers: Di0 = 0; Di1 = 1
Defiers: Di0 = 1; Di1 = 0
Always-takers: Di0 = 1; Di1 = 1
Never-takers: Di0 = 0; Di1 = 0
These can be represented by the following typology, which characterizes units by treatment and
assignment type:
                      Z = 0
                      Di0 = 0        Di0 = 1
Z = 1    Di1 = 0      Never-taker    Defier
         Di1 = 1      Complier       Always-taker
Never-takers never participate: their D = 0 irrespective of Z. Their outcome under participation is
therefore never observed, so their potential outcomes under both situations cannot be observed.
Similarly, always-takers always participate: their D = 1 irrespective of Z. Their outcome without
participation is never observed, so their potential outcomes under both situations cannot be observed.
Defiers (or deniers) do the opposite of the assignment rule. We rule them out by assumption; this is
termed the monotonicity assumption.
This leaves only the compliers, for whom we are able to observe outcomes with and without
participation.
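The selection process and the four types above can be sketched in a short simulation. This is only an illustration under stated assumptions: the values of α and β, the normal distribution for ui, and the sample size are hypothetical choices, not part of the notes. With a common β > 0 for all units, the index model cannot generate defiers.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical parameter values, chosen only for illustration.
alpha, beta = -0.5, 1.0

u = rng.normal(size=n)            # unobserved component u_i with E[u_i] = 0
z = rng.integers(0, 2, size=n)    # random binary assignment Z_i

# Potential treatments D_i(z) = 1[alpha + beta * z + u_i >= 0] for z = 0, 1.
d0 = (alpha + u >= 0).astype(int)          # D_i0: participation if Z_i = 0
d1 = (alpha + beta + u >= 0).astype(int)   # D_i1: participation if Z_i = 1

# Only one of the two potential treatments is observed for each unit.
d = np.where(z == 1, d1, d0)

TYPE_OF = {
    (0, 1): "complier",
    (1, 0): "defier",
    (1, 1): "always-taker",
    (0, 0): "never-taker",
}

# Classify each unit by its pair (D_i0, D_i1).
types = np.array([TYPE_OF[(int(a), int(b))] for a, b in zip(d0, d1)])
shares = {t: float(np.mean(types == t)) for t in TYPE_OF.values()}
print(shares)
```

In real data the pair (Di0, Di1) is never observed jointly, so this classification is possible only inside a simulation.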
This characterization had assumptions embedded, which we now formally state:
1.1 Assumptions
1. SUTVA: Potential outcomes and treatments of unit i are independent of the potential
assignments, treatments and outcomes of unit j ≠ i. I.e.
Di (Z) = Di (Zi )
Yi (D, Z) = Yi (Di , Zi ).
2. Ignorability or random assignment, or “as if” random assignment. All units have the same
probability of assignment to treatment. I.e. P (Zi = 1) = P (Zj = 1)
3. Treatment is correlated with assignment (first stage):
P (Di1 = 1) ≠ P (Di0 = 1) ⇔ E[Di (1) − Di (0)] ≠ 0
In other words, the probability of treatment is different across the two assignment groups.
4. Exclusion restriction: Assignment (second argument) affects outcome only through treatment:
Yi (1, 0) = Yi (1, 1); Yi (0, 1) = Yi (0, 0)
So conditional on treatment, Z does not affect observed outcomes.
Yi1 ≡ Yi (1, 1) = Yi (1, 0); Yi0 ≡ Yi (0, 0) = Yi (0, 1)
Using lower case z to denote variable (and upper case Z for its realization), the observed
outcome can be written as a function of potential outcomes:
Yi = Yi (0, zi ) + [Yi (1, zi ) − Yi (0, zi )]Di = Yi0 + [Yi1 − Yi0 ]Di
5. Monotonicity: Di1 − Di0 ≥ 0 ∀ i, or vice-versa. That is, no one does the opposite of their
assignment.
Assumptions 3 and 5 together (a non-zero first stage plus monotonicity) are known as strong
monotonicity. In a purely mechanical sense, what is ruled out can be seen in the following
tabulation, where the assumption is Di1 − Di0 ≥ 0:
Di1   Di0   Di1 − Di0   Assumption met?   Type
1     1      0          Yes               Always-taker
1     0      1          Yes               Complier
0     1     −1          No                Defier
0     0      0          Yes               Never-taker
2 Identifying Treatment Effects

(The idea of the tabular representation is borrowed from Ichino.)
In the context of impact assessment, the RHS (X) variable is the binary participation variable D,
which can be endogenous in the presence of partial compliance. A natural choice for the IV Z
is the assignment rule, which is distinct from the participation decision D unless there is perfect
compliance.
Under the assumptions in section 1.1, what treatment effects can be identified?
Causal ITT effect of Z on D (in other words, on participation): E(Di1 − Di0 )
Causal ITT effect of Z on Y : E[Yi (Di1 , 1) − Yi (Di0 , 0)]
Causal effect of D on Y : E(Yi1 − Yi0 ) = E[Yi (1, z) − Yi (0, z)]. But this is identified only as a
LATE, and not as an ATT, as seen below.
2.1 Causal ITT effect of Z on D

Participation in treatment Di is a function of the assignment mechanism Z; Di = Di (Z).
Observed treatment status can analogously be written as
Di = Di0 + (Di1 − Di0 )Zi
Under SUTVA and ignorability, for each unit i the ITT effect of assignment on participation is
given by Di1 − Di0 . On average then,
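\[
\begin{aligned}
E(D_i \mid Z_i = 1) - E(D_i \mid Z_i = 0) &= \frac{\mathrm{Cov}(D_i, Z_i)}{\mathrm{Var}(Z_i)} \\
&= E(D_{i1} \mid Z_i = 1) - E(D_{i0} \mid Z_i = 0) \\
&= E[D_{i1} - D_{i0}] \quad \text{by assumptions 1 and 2}
\end{aligned}
\]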
This means that the “first stage” regression of D on Z yields the causal effect of assignment
on treatment. By assumption 3 this effect should be non-zero. In other words, the
assignment either raises or lowers the probability of participation.
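The equality between the difference in participation rates and Cov(Di, Zi)/Var(Zi) can be checked numerically. The sketch below uses simulated data; the index model generating D and its coefficients are hypothetical and serve only to produce partial compliance.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

z = rng.integers(0, 2, size=n)    # random binary assignment Z_i
u = rng.normal(size=n)            # unobserved determinant of participation

# Observed participation D_i from a hypothetical index model.
d = (-0.5 + 1.0 * z + u >= 0).astype(int)

# Difference in participation rates between the two assignment groups.
diff_means = d[z == 1].mean() - d[z == 0].mean()

# Slope of the first-stage regression of D on Z: Cov(D, Z) / Var(Z).
# ddof=0 is used in both moments so the sample identity holds exactly.
cov_dz = np.cov(d, z, ddof=0)[0, 1]
slope = cov_dz / np.var(z)

print(diff_means, slope)
```

For a binary Z the two numbers agree up to floating-point error, which is why the first-stage regression coefficient can be read as a difference in participation rates.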
2.2 Causal ITT effect of Z on Y (D, Z)

Similarly the unit level ITT effect of assignment Zi on outcome Yi is: Yi (Di1 , 1) − Yi (Di0 , 0).
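\[
\begin{aligned}
\Rightarrow E(Y_i \mid Z_i = 1) - E(Y_i \mid Z_i = 0) &= \frac{\mathrm{Cov}(Y_i, Z_i)}{\mathrm{Var}(Z_i)} \\
&= E[Y_i(D_{i1}, 1) \mid Z_i = 1] - E[Y_i(D_{i0}, 0) \mid Z_i = 0] \\
&= E[Y_i(D_{i1}, 1)] - E[Y_i(D_{i0}, 0)] \quad \text{by assumptions 1 and 2}
\end{aligned}
\]
I.e. the “reduced form” regression of Y on Z yields the causal ITT.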
To examine the role of assumptions in greater detail, consider in tabular form the causal effects of
Z on Y at the unit level. In each cell, the first line is the effect written in terms of Yi (D, Z) and
the second line simplifies it using the exclusion restriction:

                      Z = 0
                      Di0 = 0                  Di0 = 1
Z = 1    Di1 = 0      Never-taker              Defier
                      Yi (0, 1) − Yi (0, 0)    Yi (0, 1) − Yi (1, 0)
                      = Yi0 − Yi0 = 0          = −(Yi1 − Yi0 )
         Di1 = 1      Complier                 Always-taker
                      Yi (1, 1) − Yi (0, 0)    Yi (1, 1) − Yi (1, 0)
                      = Yi1 − Yi0              = Yi1 − Yi1 = 0
By SUTVA and random assignment, the causal effect can be written separately for each i and
for each cell.
The exclusion restriction implies that the effect is zero for always- and never-takers.
Monotonicity implies that there are no defiers; together with the first stage (assumption 3), there
are at least some compliers.
That is, we can identify a treatment effect only on compliers.
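The cell-by-cell argument can be illustrated with simulated potential outcomes. This is a sketch under stated assumptions: the type shares (30/50/20 percent) and the unit-level effects are hypothetical, there are no defiers, and potential outcomes depend on D only, as the exclusion restriction requires.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Hypothetical type shares; defiers are excluded by monotonicity.
types = rng.choice(["never-taker", "complier", "always-taker"],
                   size=n, p=[0.3, 0.5, 0.2])
d0 = (types == "always-taker").astype(int)   # D_i0
d1 = (types != "never-taker").astype(int)    # D_i1

# Potential outcomes Y_i0, Y_i1 do not depend on Z (exclusion restriction).
y0 = rng.normal(size=n)
effect = np.where(types == "complier", 2.0, 1.0)   # hypothetical Y_i1 - Y_i0
y1 = y0 + effect

# Unit-level effect of Z on Y: (Y_i1 - Y_i0)(D_i1 - D_i0).
unit_itt = effect * (d1 - d0)

for t in ("never-taker", "complier", "always-taker"):
    print(t, unit_itt[types == t].mean())

# The average ITT on Y equals the complier share times the complier effect.
itt_y = unit_itt.mean()
complier_share = (types == "complier").mean()
print(itt_y, complier_share * 2.0)
```

Always-takers and never-takers have a non-zero treatment effect in this simulation, yet they contribute nothing to the ITT because assignment does not change their participation.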
2.3 The LATE theorem and the causal treatment effect of Di on Yi :

Yi1 − Yi0 = Yi (1, z) − Yi (0, z)
Notice that the IV estimator can be written as the ratio of the two ITT effects discussed above, i.e. as
\[
\frac{\mathrm{Cov}(Y_i, Z_i)}{\mathrm{Cov}(D_i, Z_i)}
\]
But as noted in the previous class, it does not identify an ATT. Instead, it identifies a
LATE, for reasons outlined above. A formal statement of the Local Average Treatment Effect (LATE)
theorem is
\[
\hat{\beta}_{IV} = \text{Wald} = \frac{E[Y_i \mid Z_i = 1] - E[Y_i \mid Z_i = 0]}{E[D_i \mid Z_i = 1] - E[D_i \mid Z_i = 0]} = E[Y_{i1} - Y_{i0} \mid D_{i1} > D_{i0}]
\]
(The LHS is a Wald estimator.) Proof of the LATE theorem. The numerator:
E[Yi |Zi = 1] = E[Yi0 + (Yi1 − Yi0 )Di |Zi = 1] = E[Yi0 + (Yi1 − Yi0 )Di1 ]
This is a consequence of SUTVA, the exclusion restriction and random assignment, as seen above. Similarly, the second term in the numerator is
E[Yi |Zi = 0] = E[Yi0 + (Yi1 − Yi0 )Di |Zi = 0] = E[Yi0 + (Yi1 − Yi0 )Di0 ]
⇒ E[Yi |Zi = 1] − E[Yi |Zi = 0] = E[(Yi1 − Yi0 )(Di1 − Di0 )]

At the unit level, (Yi1 − Yi0 )(Di1 − Di0 ) is the product of the causal effect of Z on D and the causal
effect of D on Y.
By monotonicity, Di1 − Di0 is either 1 or 0. The numerator then is:
E[(Yi1 − Yi0 )(Di1 − Di0 )] = E[(Yi1 − Yi0 ) · 1 | Di1 > Di0 ] P (Di1 > Di0 )
The denominator, as seen earlier is E(Di1 − Di0 ). By monotonicity and similar argument again,
E(Di1 − Di0 ) = E[(Di1 − Di0 )|Di1 > Di0 ]P (Di1 > Di0 )
⇒ E(Di1 − Di0 ) = P (Di1 > Di0 )

After cancelling the P (Di1 > Di0 ) from the numerator and denominator, we get the result of the
LATE theorem.
In other words, the ratio of the ITT effect of Z on Y to the ITT effect of Z on D, which we
showed earlier is given by Cov(Yi , Zi ) / Cov(Di , Zi ), identifies only a LATE.
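The LATE theorem can be illustrated by simulation. The sketch below is not an estimator from any particular package; the type shares and the effect sizes (2 for compliers, 3 for always-takers, 0.5 for never-takers) are hypothetical, chosen so that the LATE, the ATE and the ATT all differ.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000

# Hypothetical type shares; no defiers (monotonicity).
types = rng.choice(["never-taker", "complier", "always-taker"],
                   size=n, p=[0.3, 0.5, 0.2])
d0 = (types == "always-taker").astype(int)   # D_i0
d1 = (types != "never-taker").astype(int)    # D_i1

# Hypothetical heterogeneous effects Y_i1 - Y_i0 by type.
effect = np.where(types == "complier", 2.0,
                  np.where(types == "always-taker", 3.0, 0.5))
y0 = rng.normal(size=n)
y1 = y0 + effect

z = rng.integers(0, 2, size=n)     # random assignment
d = np.where(z == 1, d1, d0)       # observed treatment
y = y0 + (y1 - y0) * d             # observed outcome

def wald(y, d, z):
    """Ratio of the ITT effect of Z on Y to the ITT effect of Z on D."""
    num = y[z == 1].mean() - y[z == 0].mean()
    den = d[z == 1].mean() - d[z == 0].mean()
    return num / den

late_hat = wald(y, d, z)
iv_cov = np.cov(y, z, ddof=0)[0, 1] / np.cov(d, z, ddof=0)[0, 1]
ate = effect.mean()            # average effect over all units
att = effect[d == 1].mean()    # average effect over the treated

print(late_hat, iv_cov, ate, att)
```

The Wald ratio and the covariance ratio coincide, and both settle near the complier effect rather than the ATE or the ATT, which is possible to verify here only because the simulation reveals each unit's type.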
2.3.1 Why was monotonicity necessary?

Absent monotonicity, the numerator of the Wald, the ITT effect of Z on Y can be written as:
E(Yi |Zi = 1) − E(Yi |Zi = 0) = E[(Yi1 − Yi0 )(Di1 − Di0 )]

= E[(Yi1 − Yi0 )|(Di1 > Di0 )]P (Di1 > Di0 )
− E[(Yi1 − Yi0 )|(Di1 < Di0 )]P (Di1 < Di0 )
Thus treatment effects for compliers (the first term) may be cancelled out by treatment effects for
the defiers (second term), even though both sets of effects are positive.
Monotonicity removes the second term, so the causal effect can only be estimated for the group of
compliers, and the IV estimator identifies the LATE.
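A small worked example makes the cancellation concrete. The shares and average effects below are hypothetical, chosen so that both compliers and defiers gain from treatment.

```python
# Hypothetical shares and average effects; both groups benefit from treatment.
p_compliers, effect_compliers = 0.4, 1.0   # P(Di1 > Di0), E[Yi1 - Yi0 | Di1 > Di0]
p_defiers, effect_defiers = 0.2, 2.0       # P(Di1 < Di0), E[Yi1 - Yi0 | Di1 < Di0]

# Numerator of the Wald ratio: ITT effect of Z on Y without monotonicity.
itt_y = effect_compliers * p_compliers - effect_defiers * p_defiers

# Denominator: ITT effect of Z on D, E[Di1 - Di0].
itt_d = p_compliers - p_defiers

wald = itt_y / itt_d
print(itt_y, itt_d, wald)
```

Here the ITT effect on Y is zero, and so is the Wald ratio, although every unit whose participation responds to Z has a strictly positive treatment effect.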
3 Encouragement designs
Recall the types of RCT designs. With the characterisation of compliance and noncompliance, and
the IV toolkit in hand, we can examine encouragement designs. By way of a hypothetical example,
say:
The intervention is a 50 percent discount to purchase a bicycle to commute to college. The
intervention is, say, by the government and has universal coverage for the eligible population.
The eligibility criterion is women aged 18 to 19 years. Absent any other interventions, say the
takeup rate is 40 percent.
The encouragement design would consist of the researchers offering an additional 30 percent
discount on the bicycle price, but this additional discount would be offered only to a randomly
drawn subset of the eligible women. Say the takeup rate in this treated subgroup is 60 percent.
This means that the encouragement led to a 20 percentage point increase in the takeup of the
subsidy.
Importantly, note that the “control” group also had access to the subsidy, but at the policy
rate of 50 percent.
These designs are often considered easier (more practical) to implement.
The ITT estimates are based on the random allocation, but they are effectively the treatment
effects of the encouragement (the additional 30 percent discount), not of the programme
intervention (50 percent discount) per se.
But the LATE theorem above tells us we can recover the effect of the programme anyway.
The only price we pay is that it is defined only for the set of compliers, and requires all the
assumptions necessary to identify it.
Recall that the Wald estimator is the ITT divided by the proportion of compliers. How is the
proportion of compliers to be computed? Under exclusion and monotonicity, the proportion of
compliers is given by the difference between the takeup rate among those (randomly selected
into) receiving the encouragement and the takeup rate among the (randomly selected)
non-encouraged. Proof not required.
Newer work addresses sample size calculations for LATE (see e.g. Bansak, 2018).
The Fishman et al. (2021) paper uses an encouragement design. Their ITT estimates have the
additional wrinkle of including covariates, but that does not change the intuitions (or results)
derived above.
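The arithmetic for the bicycle example can be sketched as follows. The takeup rates (60 and 40 percent) come from the example above; the ITT effect of the encouragement on the outcome is a hypothetical number, since the example does not specify an outcome, and the function names are illustrative only.

```python
def complier_share(takeup_encouraged, takeup_not_encouraged):
    """Proportion of compliers: difference in takeup rates across the two arms."""
    return takeup_encouraged - takeup_not_encouraged

def wald_late(itt_on_outcome, share_compliers):
    """Wald estimator: ITT on the outcome divided by the proportion of compliers."""
    if share_compliers == 0:
        raise ValueError("no first stage: the proportion of compliers is zero")
    return itt_on_outcome / share_compliers

# Takeup rates from the bicycle example.
share = complier_share(0.60, 0.40)

# HYPOTHETICAL ITT effect of the encouragement on some outcome (not given in the notes).
itt_y = 0.5

late = wald_late(itt_y, share)
print(share, late)
```

With a complier share of 0.20, any ITT estimate is scaled up by a factor of five, so a hypothetical ITT of 0.5 corresponds to a LATE of 2.5 for the compliers.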
