
Evaluation Method

Introduction
• Program evaluation methods have become increasingly
popular:
– answers some relevant questions in public policy
– directly related to policy reform
• Fundamental interest in all program evaluation:
– whether a particular intervention (or treatment) is
effective in accomplishing its primary objectives.
• The main challenge of a credible impact evaluation:
– construction of the counterfactual outcome, that is,
what would have happened to participants in absence of
treatment. Since this counterfactual outcome is never
observed, it has to be estimated using statistical
methods.
• The counterfactual can be estimated using either
– Experimental data or
– Observational data (non-experimental or
quasi-experimental data)
Experimental and Non-experimental Data
• Experimental evaluation
– “Randomization”, “Randomized Control Trial (RCT)”
– Assignment to treatment (or participation in the
intervention) is random
– Random assignment is used to ensure that receiving the
intervention is the only systematic difference between
units that got the intervention and those excluded
from it.
• Since both groups are randomly selected, on average
their characteristics will be similar. So, the control
group can be used to assess what would have happened
to participants in the absence of the intervention.
• In this regard, the control group serves as a valid
proxy for the counterfactual outcome of the treatment
group.
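A small simulation can make the role of randomization concrete. The sketch below is illustrative only: the sample size, the `ability` characteristic and the assumed effect of 2.0 are made-up values, not taken from any study.

```python
import random
from statistics import mean

random.seed(0)
TRUE_EFFECT = 2.0  # assumed effect, chosen only for illustration

treated, control = [], []
for _ in range(10_000):
    ability = random.gauss(0, 1)   # hypothetical characteristic that also drives the outcome
    d = random.random() < 0.5      # coin-flip assignment, independent of ability
    y = 10 + ability + TRUE_EFFECT * d + random.gauss(0, 1)
    (treated if d else control).append(y)

# Because assignment is random, the control mean stands in for the counterfactual.
estimate = mean(treated) - mean(control)
print(round(estimate, 2))
```

The difference in means should land close to the assumed effect, because `ability` is balanced across the two groups on average.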
Experimental and Non-experimental
Data
– In individual randomization, individual units are
randomly assigned to a treatment or control group.
– In cluster randomization, clusters of units (e.g. villages)
rather than the units themselves are randomly assigned
to treatment and control groups.
Clustered RCTs are the preferred type when the
intervention is applied at the cluster rather than
the individual level (e.g. an intervention targeted
towards schools or health facilities).
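As a rough sketch of cluster randomization, the snippet below draws whole villages for treatment and lets every household inherit its village's status. The village and household identifiers are hypothetical, and assigning exactly half of the clusters is an assumption made for the example.

```python
import random

random.seed(1)
villages = [f"village_{k}" for k in range(6)]               # hypothetical clusters
households = [(v, h) for v in villages for h in range(4)]   # 4 hypothetical units per cluster

# Randomize at the cluster level: half of the villages are drawn for treatment.
treated_villages = set(random.sample(villages, k=len(villages) // 2))

# Every household inherits the status of its village.
assignment = {unit: unit[0] in treated_villages for unit in households}
```

No household is ever assigned on its own, so all units within a village always share the same status.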
Experimental and Non-experimental
Data
[Diagram: Target Population → Study Sample → Random assignment into
Treatment Group (get the treatment) and Control Group (don’t get the treatment)
→ Outcome measured in each group → Outcome Comparison
(this provides the effect of treatment)]
Experimental and Non-experimental
Data
– Complications of experimental evaluation
▪ Randomization not always feasible (e.g. genetics)
▪ Randomization not always ethical (e.g. smoking)
▪ People don’t always do what they’re told
(noncompliance)
▪ Time consuming and costly
▪ Requires baseline and endline data…
▪ Spillovers: when a program changes outcomes for
units in the control group; these may be physical,
behavioral, informational, market or general
equilibrium effects.
▪ Crossovers: when a control unit directly receives
the program, either intentionally or accidentally.
Experimental and Non-experimental
Data
• Observational Studies
• Also known as non-experimental studies
• Assignment to treatment (or participation in the
intervention) is NOT random.
[Diagram: Study Sample, split by NOT random assignment into
Treatment Group (get the treatment) and
Control Group (don’t get the treatment)]
Experimental and Non-experimental
Data
• Observational Studies
– Complications of observational studies
– Characteristics of an observation may be associated with
both its outcomes and its participation in treatment
• For example: we want to see how training might
affect earnings.
• Assignment to the control group (no training) and
treatment group (with training) is nonrandom.
• We have observational data where some people have
training and some don’t.
• We cannot simply compare the incomes of the two groups to
find the impact of training.
• There might be some characteristics of an
individual which affect both participation in training
and income.
Experimental and Non-experimental
Data
• Observational Studies
– Complications of observational studies
[Diagram: characteristics of the observation affect both
Treatment (participation in training) and Outcomes (income)]
– Selection Bias problem
• Our target: we would like to know the difference between
participants’ outcomes with and without treatment.
• But we never observe both outcomes for the same
individual.
• Taking the mean outcome of nonparticipants as an
approximation is not advisable, since participants and
nonparticipants usually differ even in the absence of treatment.
• “Selection bias”: a good example of selection bias is
the case where high-skilled individuals have a higher
probability of entering a training programme and also
have a higher probability of finding a job. The
matching approach is one possible solution to the
selection problem.
Solutions to address selection bias
problem
• Matching
• Instrumental variables
• Propensity scores (Rosenbaum and Rubin 1983)
• Regression discontinuity
• Difference-in-difference
Parameter of interest and selection
bias
• Binary treatment indicator: $D_i = 1$ if individual $i$ receives treatment
and $D_i = 0$ otherwise
• Treatment effect for individual $i$: $\delta_i = y_{i1} - y_{i0}$, $\forall i = 1, 2, 3, \dots, N$
• Average Treatment Effect (ATE): $ATE = E(\delta) = E(Y_1 - Y_0)$
– In general, an evaluation seeks to estimate the mean impact of the
program
– One parameter of interest
– The ATE parameter answers the question: ‘What is the expected
effect on the outcome if individuals in the population were
randomly assigned to treatment?’
– Heckman (1997) notes that this estimate might not be of relevance
to policy makers because it includes the effect on persons for whom
the programme was never intended. For example, if a programme is
specifically targeted at individuals with low family income, there is
little interest in the effect of such a programme for a millionaire.
• Average Treatment Effect on the Treated (ATT):
$ATT = E(\delta \mid D = 1) = E(Y_1 \mid D = 1) - E(Y_0 \mid D = 1)$
– Another parameter of interest
– We will mainly focus on this
– Problem:
• Here, $E(Y_0 \mid D = 1)$ is the “average outcome that the treated
individuals would have obtained had they not been treated”
• This is counterfactual, i.e. not observed. So, we have to replace
it with an estimate.
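To see how ATE and ATT can differ, here is a toy calculation on four hypothetical units for which both potential outcomes are written down. This is only possible in a constructed example; in real data one of the two outcomes is always missing.

```python
from statistics import mean

# Hypothetical units: (y1, y0, d). In real data only one of y1, y0 is observed per unit.
units = [(10, 6, 1), (9, 5, 1), (7, 6, 0), (6, 6, 0)]

ate = mean([y1 - y0 for y1, y0, _ in units])             # average effect over everyone
att = mean([y1 - y0 for y1, y0, d in units if d == 1])   # average effect over the treated only
print(ate, att)
```

In this made-up table the treated units gain more than the untreated would, so the ATT exceeds the ATE.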
Parameter of interest and selection bias
• Selection bias and $ATT$
$\Delta = E(Y_1 \mid D = 1) - E(Y_0 \mid D = 0)$

$\Delta = E(Y_1 \mid D = 1) - E(Y_0 \mid D = 1) + E(Y_0 \mid D = 1) - E(Y_0 \mid D = 0)$

$\Delta = ATT + \underbrace{E(Y_0 \mid D = 1) - E(Y_0 \mid D = 0)}_{\text{Selection Bias (SB)}}$

• Here, if SB is equal to zero, then we can
calculate $ATT$ by $\Delta$.
• In experimental data, SB is zero because assignment is random.
In the case of observational data, we have to make some
assumptions so that SB is equal to zero.
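The decomposition can be checked numerically on simulated data, where the counterfactual is known by construction. The sketch below assumes a hypothetical `skill` variable that raises both the chance of participating and the untreated outcome, and a constant effect of 5; none of these numbers come from the slides.

```python
import random
from statistics import mean

random.seed(3)
people = []
for _ in range(5_000):
    skill = random.random()          # hypothetical characteristic in [0, 1]
    d = random.random() < skill      # high-skilled are more likely to participate
    y0 = 20 + 10 * skill + random.gauss(0, 1)
    y1 = y0 + 5                      # assumed constant treatment effect of 5
    people.append((d, y0, y1))

e_y1_d1 = mean([y1 for d, _, y1 in people if d])
e_y0_d1 = mean([y0 for d, y0, _ in people if d])       # counterfactual, known only in a simulation
e_y0_d0 = mean([y0 for d, y0, _ in people if not d])

delta = e_y1_d1 - e_y0_d0   # naive comparison of observed means
att = e_y1_d1 - e_y0_d1
sb = e_y0_d1 - e_y0_d0
print(round(delta, 2), round(att, 2), round(sb, 2))
```

Here the naive comparison overstates the effect: `delta` equals `att + sb`, and `sb` is positive because participants would have earned more even without treatment.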
Assumptions for PSM
• Conditional Independence Assumption:
– Given a set of observable covariates $X$
which are not affected by treatment,
potential outcomes are independent of
treatment assignment: $(Y_1, Y_0) \perp D \mid X$
– This allows the untreated units to be
used to construct a counterfactual for
the treatment group.
– Also known as “unconfoundedness” or
“selection on observables”.
Assumptions for PSM
• Common Support Condition: $0 < P(D = 1 \mid X) < 1$
– For each value of $X$, there is a positive
probability of being both treated and
untreated.
– The probability of receiving treatment for
each value of $X$ lies strictly between 0 and 1. By the
rules of probability, this means that the
probability of not receiving treatment,
$1 - P(D = 1 \mid X)$, lies between the same values.
– Also known as the “overlap condition”,
because it ensures that there is sufficient
overlap in the characteristics of the
treated and untreated units to find
adequate matches.
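[Figure: densities of the propensity score, $P(D = 1 \mid X)$, on a horizontal
axis from 0 to 1 (higher values mean a high probability of participating
given X), drawn separately for non-participants and for participants.
The region of common support is where the two densities overlap.]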
Steps for implementing PSM
PSM uses statistical techniques to construct an artificial control group
by matching each treated unit with a non-treated unit of similar
characteristics.
The propensity score is defined as the probability of receiving
the treatment given the observed characteristics, $P(D = 1 \mid X)$.
Then, PSM matches treated units to untreated units based on the
propensity score.
Conditional on some observable characteristics, untreated units can
be compared to treated units, as if the treatment had been fully
randomized.
1. Calculate the score
– The dependent variable (treatment status) is dichotomous, so the
score is estimated with a logit or probit model
– If two individuals have the same probability of receiving
treatment, and one actually receives it while the other does not, this
allocation can be seen as random.
– Under the assumptions above, the difference in outcomes within groups
of similar propensity scores gives an unbiased estimate of the treatment effect.
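As a minimal sketch of step 1, the code below fits a one-covariate logit by plain gradient ascent on simulated data and then computes the fitted probabilities as propensity scores. The data-generating coefficients, the learning rate and the iteration count are all assumptions for illustration; in practice a statistics package would normally be used for the fit.

```python
import math
import random

random.seed(2)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Simulated data: one covariate x, treatment more likely when x is high (assumed model).
x = [random.gauss(0, 1) for _ in range(1000)]
d = [1 if random.random() < sigmoid(-0.5 + 1.0 * xi) else 0 for xi in x]

# Fit P(D = 1 | x) = sigmoid(b0 + b1 * x) by gradient ascent on the average log-likelihood.
b0, b1 = 0.0, 0.0
for _ in range(300):
    g0 = g1 = 0.0
    for xi, di in zip(x, d):
        resid = di - sigmoid(b0 + b1 * xi)
        g0 += resid
        g1 += resid * xi
    b0 += 0.5 * g0 / len(x)
    b1 += 0.5 * g1 / len(x)

# The propensity score is the fitted probability of treatment for each unit.
scores = [sigmoid(b0 + b1 * xi) for xi in x]
```

Each entry of `scores` is an estimate of $P(D = 1 \mid X)$ for one unit and is the only quantity the matching step needs.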
Steps for implementing PSM
2. Choosing a matching algorithm
• 2.1 Nearest Neighbour (NN) matching
– The individual from the control group ($j$) that is closest in
terms of the propensity score is chosen as the matching partner
for a treated individual ($i$): $\min_j |p_i - p_j|$
– NN matching can be ‘with replacement’ or ‘without
replacement’
– Matching without replacement means any observation in the
comparison group is matched to no more than one treated
observation, namely its closest match; matching with
replacement means a comparison observation can be matched
to multiple treated observations.
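The difference between the two variants can be sketched in a few lines. The function name `nn_match` and the scores are hypothetical, and the without-replacement version here is a simple greedy pass in the order the treated units are listed, which is one possible choice among several.

```python
def nn_match(treated, control, replacement=True):
    """Match each treated unit to the control with the closest propensity score.

    Without replacement, this greedy version assumes there are at least as
    many controls as treated units.
    """
    available = dict(control)
    matches = {}
    for i, p_i in treated.items():
        j = min(available, key=lambda c: abs(p_i - available[c]))
        matches[i] = j
        if not replacement:
            del available[j]   # a control can be used only once
    return matches

treated = {"t1": 0.30, "t2": 0.32, "t3": 0.80}   # hypothetical scores
control = {"c1": 0.31, "c2": 0.50, "c3": 0.78}

with_repl = nn_match(treated, control, replacement=True)
without_repl = nn_match(treated, control, replacement=False)
```

With replacement, `t1` and `t2` both use `c1`; without replacement, `t2` has to settle for the more distant `c2`, which illustrates the trade-off between match quality and reusing the same controls.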
Steps for implementing PSM
2. Choosing a matching algorithm
• 2.2 Radius and Caliper matching
– In radius matching, each treated observation ($i$) is matched
with all control observations ($j$) that fall within a specified
radius ($r$) of its propensity score: $|p_i - p_j| < r$
– In caliper matching, each treated observation ($i$) is matched
with the nearest control observation ($j$), provided that it falls
within the specified radius ($r$).
– This radius can be thought of as a “maximum distance
tolerance level”.
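A short sketch of both rules, using hypothetical scores and function names: the radius version returns every control within the tolerance, while the caliper version returns only the nearest one and gives up if even that one is too far away.

```python
def radius_match(p_i, control, r):
    """All controls whose score lies within r of p_i."""
    return [j for j, p_j in control.items() if abs(p_i - p_j) < r]

def caliper_match(p_i, control, r):
    """Nearest control, or None if even the nearest one is r or more away."""
    j = min(control, key=lambda c: abs(p_i - control[c]))
    return j if abs(p_i - control[j]) < r else None

control = {"c1": 0.28, "c2": 0.33, "c3": 0.50, "c4": 0.90}   # hypothetical scores

print(radius_match(0.30, control, r=0.05))
print(caliper_match(0.30, control, r=0.05))
print(caliper_match(0.60, control, r=0.05))
```

A treated unit with no control inside the tolerance stays unmatched, so a smaller `r` trades sample size for closer matches.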
Steps for implementing PSM
2. Choosing a matching algorithm
• 2.3 Stratification or interval matching
– The idea of stratification matching is to
partition the common support of the
propensity score into a set of intervals
(strata)
– Then calculate the impact within each interval
by taking the mean difference in outcomes
between treated and control observations.
– This method is also known as interval
matching, blocking and subclassification.
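The idea can be sketched as follows, assuming equal-width intervals on [0, 1] and weighting each interval by its number of treated units (a common choice when the target is the ATT, but an assumption here). The records are hypothetical.

```python
from statistics import mean

def stratified_att(records, n_strata):
    """records: (score, d, y) tuples; strata are equal-width intervals on [0, 1]."""
    strata = {}
    for score, d, y in records:
        k = min(int(score * n_strata), n_strata - 1)   # interval index
        strata.setdefault(k, {0: [], 1: []})[d].append(y)
    total, n_treated = 0.0, 0
    for groups in strata.values():
        if groups[0] and groups[1]:                    # need both groups in the interval
            diff = mean(groups[1]) - mean(groups[0])   # within-interval impact
            total += diff * len(groups[1])             # weight by number of treated (assumption)
            n_treated += len(groups[1])
    return total / n_treated

records = [(0.2, 1, 10), (0.4, 1, 12), (0.1, 0, 8), (0.3, 0, 9),
           (0.6, 1, 15), (0.7, 1, 17), (0.9, 1, 19), (0.8, 0, 14)]
att = stratified_att(records, n_strata=2)
```

Intervals that contain only treated or only control units contribute nothing, which mirrors the common support requirement.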
Steps for implementing PSM
2. Choosing a matching algorithm
• 2.4 Kernel matching
– Each treated observation outcome is
matched to a weighted average of the
outcomes of all the untreated
observations, with the highest weight
being placed on those with scores
closest to the treated individual.
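A minimal sketch of the weighting idea, assuming a Gaussian kernel and a bandwidth `h`; the slides do not specify either, and the function name and data are hypothetical.

```python
import math

def kernel_counterfactual(p_i, controls, h):
    """controls: (score, outcome) pairs; h: bandwidth of a Gaussian kernel (assumption)."""
    weights = [math.exp(-0.5 * ((p_i - p_j) / h) ** 2) for p_j, _ in controls]
    total = sum(weights)
    # Weighted average of control outcomes; closer scores get larger weights.
    return sum(w * y for w, (_, y) in zip(weights, controls)) / total

controls = [(0.45, 10), (0.55, 12), (0.90, 30)]   # hypothetical (score, outcome) pairs
cf = kernel_counterfactual(0.50, controls, h=0.1)
print(round(cf, 3))
```

The distant control at 0.90 still enters the average but with a weight close to zero, so the counterfactual stays near the outcomes of the two close controls.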
Steps for implementing PSM
3. Estimating intervention impacts
• After propensity scores have been
estimated and a matching algorithm
has been chosen, the impact of the
program is calculated by just
averaging the differences in
outcomes between each treated unit
and its neighbor (or neighbors)
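The averaging step can be sketched as below. The outcomes and the `matches` mapping are hypothetical; a treated unit with several neighbours is compared with the mean of their outcomes.

```python
from statistics import mean

# Hypothetical outcomes and matches (a treated unit may have one or several neighbours).
y_treated = {"t1": 12, "t2": 15}
y_control = {"c1": 10, "c2": 11}
matches = {"t1": ["c1"], "t2": ["c1", "c2"]}

differences = [
    y_treated[i] - mean([y_control[j] for j in neighbours])
    for i, neighbours in matches.items()
]
att = mean(differences)   # average difference over treated units
print(att)
```

The result is an estimate of the ATT, since the average runs over treated units only.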
