
Contribution Analysis: The promising new approach to causal claims

Sebastian Lemire

Ramboll Management Consulting, Denmark

Authors’ note

Please direct correspondence to Sebastian Lemire, setl@r-m.com, telephone +45 5161 8188,

Rambøll Management Consulting, Hannemanns Allé 53, DK 2300 København S, Denmark. The author

would like to thank Steffen Bohni Nielsen and Melanie Kill for their thoughtful comments on an earlier

version of this paper.

English abstract

This paper examines the methodological strengths and weaknesses of contribution analysis

(CA), awarding particular attention to the ability of CA to identify and determine the extent of

influence from alternative explanations. The author argues that CA – in its current form and application

– fares well in terms of identifying salient external factors, but finds itself in need of further

methodological elaboration to adequately account for the extent of influence from these factors. As

such, CA remains vulnerable to threats to internal validity. In strengthening the merit of causal claims

based on CA, focus should not only be directed at CA as an analytical strategy but should also involve

a much broader discussion on how to (re)think validity in the context of evaluation. An outline of the

implications of this new categorization for causal claims based on CA and a discussion of how to

enhance the credibility of CA conclude the paper.

Introduction

Impact evaluation has for many years received the lion’s share of attention in both evaluation

theory and practice. Yet, despite the sustained attention to determining the attribution of projects,

programs, and policies, there is to this day surprisingly little agreement on viable and methodologically

rigorous alternatives to the traditional counterfactual impact designs. Indeed, the received wisdom

among many an evaluator remains that rigorous impact evaluation cannot be done in the absence of

controlled comparisons. The counterfactual designs stand strong. One promising alternative to the

counterfactual designs is John Mayne’s contribution analysis (CA), which seeks to examine

attribution through contribution (2008). As suggested by Mayne, CA is useful in instances where it is

impractical, inappropriate, or impossible to address the attribution question through an experimental

evaluation design. “In complex systems”, Mayne goes on to argue, “experimenting with exogenous

variables is not possible or not practical: the counterfactual case cannot be established” (2008: 4).

Accordingly, the evaluation question must be readdressed by focusing on the extent to which the

evaluator can “build a case for reasonably inferring causality,” that is, the extent to which the

intervention can be said to have contributed to a set of observed (positive or negative) outcomes

(Mayne, n.d.: 5).

The methodological strength of CA, then, rests on its ability to accommodate – without a

counterfactual – the often complex and messy nature of programs by taking into account the nexus of

conditioning variables and interactions among program components. This is – perhaps needless to say

to the experienced evaluator – in and of itself a strong selling point. Yet, despite the significant

theoretical interest and attention awarded contribution analysis, especially in development circles, there

are to this day few examples of the systematic application of contribution analysis (see Dybdal, Bohni

and Lemire n.d. for a discussion).i As explained by Mayne:

A stumbling block to using contribution analysis may be that this type of analysis

remains seen, especially in some of the evaluation profession, as a very weak substitute

for the more traditional experimental approaches to assessing causality. (n.d.: 1)

In advancing CA as a sound alternative to the traditional experimental approaches, the need then is to develop and present CA as a methodologically rigorous alternative. In this effort, I would argue that attention should be awarded to the validity of causal claims based on CA. The foundation of confidence in any methodologically sound design, method, or analytical approach that aims to produce inferences—especially inferences of cause-and-effect relations—is contingent upon the attention awarded to validity issues. Contribution analysis is no exception.

The overarching purpose of this paper is to advance the application of CA. The aim is two-fold,

in that I seek to both (1) advance the theoretical discussion on the validity of causal claims based on

CA and (2) push for further practical application of CA. The paper consists of three sections. The first

presents a brief outline of contribution analysis and notes its ability to identify and determine

influence of external factors. However, in examining the merit of causal claims based on CA, focus

should not only be directed at CA as an analytical strategy but should also involve a much broader

discussion on how to (re)think validity in the context of evaluation. Thus the second section examines

the concept of validity. It discusses the overwhelming dominance of the Campbellian validity model in

both research and evaluation and makes the case for rethinking validity in the context of evaluation.

The result of the discussion is a new categorization of validity evidence for causal claims. The third

section outlines the implications of this new categorization for causal claims based on CA and

concludes with a discussion of how to enhance the credibility of CA and thereby pave the way for its

increased practical application.

Contribution analysis

Contribution analysis (CA) has been presented and conceptually developed by John Mayne

through a series of seminal papers (1999, 2001 and 2008). In its development, CA has over time moved

away from its original setting in performance measurement and towards its new role in evaluating

complex programs in complex settings (Dybdal, Bohni and Lemire n.d.). While the approach has

undergone refinements in its methodology and even more notably its scope (see Dybdal, Bohni and

Lemire n.d. for a detailed account), the underlying logic of and core steps in CA remain more or less

the same: (i) elaborating the intervention’s theory of change, (ii) identifying key threats to the theory of

change’s mechanisms, (iii) identifying other contributing factors, and (iv) testing the primary rival explanations. In the context of applying CA in complex settings, Mayne operationalizes CA in the

following steps: (1) Set out the cause-effect issue to be addressed, (2) develop the postulated theory of

change and risks to it, (3) gather the existing evidence on the theory of change, (4) assemble and assess

the contribution story and challenges to it, (5) seek out additional evidence, (6) revise and strengthen

the contribution story, (7) in complex settings, assemble and assess the complex contribution story

(Mayne, 2008).

According to Mayne, in the context of evaluation CA can primarily address attribution by

providing answers to contribution questions such as: Has the program made a difference? How much of

a difference? Mayne delineates three types of causal stories:

1. A minimalist contribution analysis can be construed when a theory of change was

developed and the expected outputs were delivered. Contribution is based on “the

inherent strength of the postulated theory of change and the fact that the expected

outputs were observed”.

2. A contribution analysis of direct influence can be construed when a theory of change

was developed, expected outputs occurred, immediate results were observed, and

evidence suggests the program was instrumental in creating those results, in light of

other influencing factors.

3. A contribution analysis of indirect influence can be construed when: “It would measure

the intermediate and final outcomes (or some of them) and gather evidence that the

assumptions (or some of them) in the theory of change in the areas of indirect influence

were born out. Statements of contribution at this level would attempt to provide factual

evidence for at least the key parts of the whole postulated theory of change” (Mayne

n.d.: 25-26).

The distinction between these three types of causal stories has less to do with the extent to which CA

can address the magnitude of the contribution and more to do with the relative strength or credibility of

the contribution story. The common denominator for all three causal stories is that the evaluator –

through systematic evaluative inquiry – seeks to infer “plausible association” between the program and

a set of relevant outcomes (Mayne, 1999: 5-7). The aim of CA, then, is not to provide proof of a one-to-one, linear causal linkage between a program and its intended outcomes, nor is it to determine the

exact contribution of the program. Rather the aim of CA is to provide evidence beyond reasonable

doubt that the program to some degree contributed to the specified outcomes.

It is important to note that Mayne does not commit himself, nor link the causal stories presented

above, to any specific evaluation design. He remains insistent that a mix of quantitative and qualitative

methods can be used in answering these questions through CA. According to Mayne (n.d.: 7), five criteria concerning the embedded theory of change are to be met in order to infer “plausible association”:

(i) Plausibility: Is the theory of change plausible?

(ii) Implementation according to plan: Was the program implemented with high fidelity?

(iii) Evidentiary confirmation of key elements: To what extent are the key elements of the

theory of change informed by or based on existing evidence?

(iv) Identification and examination of other influencing factors: To what extent have other

influencing factors been identified and accounted for?

(v) Disproof of alternative explanations: To what extent have the most relevant alternative explanations been disproved?
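To make the appraisal of these criteria explicit, one might record a judgment against each of them. The sketch below is a purely hypothetical illustration of such a record; the rating scale and the `appraise` helper are assumptions introduced here for clarity, not an instrument proposed by Mayne.

```python
# Hypothetical sketch: recording an explicit judgment for each of Mayne's five
# criteria for inferring "plausible association". The rating scale ("weak",
# "moderate", "strong") and the structure are illustrative assumptions only.
PLAUSIBLE_ASSOCIATION_CRITERIA = [
    "Plausibility of the theory of change",
    "Implementation according to plan",
    "Evidentiary confirmation of key elements",
    "Identification and examination of other influencing factors",
    "Disproof of alternative explanations",
]

ALLOWED_RATINGS = {"weak", "moderate", "strong"}

def appraise(ratings: dict[str, str]) -> dict[str, str]:
    """Check that every criterion has been rated on the assumed scale."""
    missing = [c for c in PLAUSIBLE_ASSOCIATION_CRITERIA if c not in ratings]
    invalid = {c: r for c, r in ratings.items() if r not in ALLOWED_RATINGS}
    if missing or invalid:
        raise ValueError(f"Missing criteria: {missing}; invalid ratings: {invalid}")
    return ratings

# Example usage with hypothetical ratings for an imagined evaluation:
example = appraise({
    "Plausibility of the theory of change": "strong",
    "Implementation according to plan": "moderate",
    "Evidentiary confirmation of key elements": "moderate",
    "Identification and examination of other influencing factors": "weak",
    "Disproof of alternative explanations": "weak",
})
```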

Together, these five criteria serve as the quality criteria of causal stories based on CA. While Mayne continues

throughout his conceptual advancement of CA to summarily address the issue of accounting for other

influencing factors and assessing their potential influence, he never goes into detail on the subject. As

such, there is no operational framework or discussion of what it means to account for other influencing

factors, despite their importance being stated repeatedly. Moreover, the ability to account for the

influence of other factors is key in establishing the internal validity of causal claims and inferences

based on CA. At the risk of adding to an already overused term, the following presents a discussion of

the concept of validity in relation to causal claims and inferences.

Validity – an outline of an elastic term

The concept of validity has for several decades been widely discussed and developed (see Chen

2010 for an overview)ii. Unfortunately, but perhaps not unexpectedly, the long-enduring debates have

in many instances served to muddy rather than to clarify the waters. As a result, the term has come to

mean many different things to many different people. I suggest that the conceptual murkiness – and

even some of the central points of conflict in the ongoing debates – stem from a lack of recognition that

the very meaning and application of the term may differ across fields of application, notably between research and evaluationiii.

For several decades the Campbellian validity model has been dominant in both research and

evaluation (Campbell and Stanley, 1963). Campbell and Stanley’s delineation between internal validity

(i.e., to what extent the design accounts for the influence of external factors) and external validity (i.e.,

to what extent the conclusions of the study can be generalized) has had profound influence on the

theory and practical application of validity amongst researchers and evaluators alike. The importance of

their contribution is axiomatic.

Indeed, the most oft-cited categorization of validity evidence still remains that of the

Campbellian model’s internal and external validity. External validity – generally stated – concerns the

extent to which one may safely generalize the conclusions derived from a study; that is, to what extent

the inferences and conclusions are valid for other subjects, other times and other places, or other

settings (Mohr 1995, p.92). This is obviously relevant in the context of research, as the aim of research

studies very often revolves around producing knowledge about a specific topic that can be generalized,

and in effect applied, to further an academic field. Internal validity – by some considered the sine qua

non of validity – is an expression of the extent to which a design accounts for external factors. As such,

internal validity constitutes a key component in isolating and determining the magnitude of the impact

of a program. The two types of validity are characterized by an inverse relationship (Chen, 2010). As

noted by Mohr, “The less successful a design is in accomplishing the first, the more it is depending on

the second for causal inference” (1995, preface).

Despite the heavy influence of the Campbellian validity model, it has not gone

unchallenged. Most recently, Huey Chen has questioned the relevance of the model in the context of

evaluation, writing that:

Because the Campbellian model was developed for academic research, ideas and principles

proposed by the model might not be wholly applicable or relevant to program evaluation.

Similarly, issues crucial to program evaluation but not to academic research are, in the

Campbellian validity model, most likely ignored. (2010, p. 206)

As just one example, Chen argues that the model’s emphasis on internal validity as the sine qua non of

research may not be as relevant in the context of evaluation (2010). Instead, the relative importance of

internal versus external validity must be reconsidered in the context of evaluation (Chen 2010).

My interest is not to engage in debates on the relative importance and weighing of internal

versus external validity. In my opinion internal and external validity together express two overarching

types of validity that are necessary to address in research and evaluation. However, I would argue,

inspired by Carol Weiss and Chen, that the meaning, purpose and application of these two types of

validity need to be clarified in the context of evaluation. Indeed, I think the hard-won clarity that could

result from such an effort would not only serve to enhance the credibility of contribution analysis but

also further the field of evaluation in general.

Simply consider external validity, which concerns the extent to which inferences and conclusions

can be generalized to other subjects at other times and places. As noted by Chen, “such an open-ended

quest for law-like propositions” is often more relevant in a research context whereas it may be

“extremely difficult or even impossible to achieve” in the context of evaluation (2010, p.207)iv. This is

not to say that this interpretation of external validity as statistical generalizability is not relevant for

some evaluations. In fact, in the early days of social engineering the aim of randomized controlled trials

was exactly to identify the programs that work and then implement these more widely. More recent

developments in the field of evaluation, such as systematic reviews and rapid evidence assessment, also

lend themselves well to this traditional interpretation of external validity. However, many – and

perhaps even most – evaluations have a much more practically oriented aim in that they seek to answer

very specific questions and to produce information that supports the practical implementation of

programs in other local contexts. Indeed, one oft-cited challenge related to the utilization of

information stemming from evaluations is how to translate the generic learning statements from one

local context into actual practice in other local contexts. The emphasis on statistical generalization in

the Campbellian interpretation of external validity seems less appropriate in these types of evaluations.

Inspired by Chen, and motivated by the field’s persistent investment in utilization of

information from evaluations, I would argue that the type of external validity that is particularly

relevant for many evaluations ought perhaps to be more in the direction of practical generalizability. This new interpretation of practical generalizability could express the extent to which inferences and

conclusions can support the local implementation of the program for other subjects, other times and

other places or other settings. Moreover, this may present a welcome twin to Samuel Messick’s concept of consequential validityv in the context of test validation, which concerns the extent to which adverse consequences are produced by invalid test interpretation and use (1989).

The Campbellian model has been applied without doing justice to the differences in

context, aim and quality criteria of research and evaluation, and it may be a model whose time has

come to an end in relation to many evaluations. The idea is not to replace the Campbellian validity

model with an everything-goes-approach to validity. I am not advocating for an approach that simply

allows evaluators to be opportunistic in their choice of validity evidence. Rather the aim is to develop a

framework that is more consistent with the utilization-oriented nature of evaluation.

Inspired by Messick’s unitary validity concept, I suggest a new classification for cutting and

framing validity evidence for causal claims and inferences (see table 1 below). The first dimension

covers two different types of justification for making causal inferences: the evidential and the

consequential. The second dimension is the function of the causal claims and inferences for either

theoretical or practical use. According to the classification, the justification for the interpretation of

causal claims and inferences is primarily based on the appraisal of the evidential basis (i.e. to what

extent other influencing factors have been accounted for in isolating the impact of the intervention and

to what extent can the causal claims be generalized to other subjects at other times and places?) and

perhaps secondarily supplemented by an appraisal of the consequential basis (i.e. to what extent have

the causal claims resulted in misconception due to flaws in the design or analytical strategy?).

Likewise, the justification for the practical use of causal inferences is primarily based on the appraisal

of the evidential basis (i.e. to what extent other influencing factors have been accounted for in isolating

the impact of the intervention and to what extent can the causal claims be practically applied to other

subjects at other times and places?) and secondarily supported by the consequential basis (i.e. to what

extent are the causal claims likely to lead to misapplication due to flaws in the design or analytical

strategy?).

A couple of examples might clarify how to apply the content of the table. A research study

aiming to examine the causal linkage between smoking and lung cancer would primarily build its case

on the evidential basis of internal validity and statistical generalizability. In addition, consequential

validity evidence would also strengthen the validity of the causal claims in the study. In marked

contrast, an evaluation of a pilot project on youth advising would more likely aim for the practical use

of its causal conclusions and therefore focus the validation effort on internal validity and practical generalizability. An examination of the consequential validity of the causal conclusions could further

strengthen the evaluation.

Table 1. A new framework for validity evidence for causal stories

                                 Interpretation of causal            Practical use of causal
                                 inferences and claims               inferences and claims
Evidential Basis (primary)       Internal validity & statistical     Internal validity & practical
                                 generalizability                    generalizability
Consequential Basis (secondary)  Consequential validity              Consequential validity
                                 (theoretical)                       (practical)

Adapted from Messick 1989

If we accept this cutting and combining of validity evidence, where does this leave us in our

quest to enhance the validity of causal claims based on CA?

Contribution analysis and validity evidence

As mentioned earlier in this paper, published examples of the systematic application of CA are

few and far between. Accordingly, the following discussion builds on how CA – in its current

conceptual state and presentation – fares in relation to the proposed categorization of validity evidence

(primarily Mayne 1999, 2001 and 2008). It is also important to note that it is the combination of a

design and an analytical strategy that collectively enhances the validity of causal claims. Accordingly,

the less successful the design is in enhancing validity, the more one might depend on the analytical

strategy to do the work. As mentioned earlier, Mayne does not commit himself to any specific designs.

As such, the true capacity of CA in terms of realizing validity evidence cannot be determined by

examining CA in isolation from specific designs. My examination of CA and validity, then, is more an

effort to gauge its relative strengths and weaknesses in justifying causal claims. My focus will be on the

evidential basis, as this dimension constitutes the primary source of justification. It is in effect central

in making the case for CA as a methodologically rigorous alternative to experimental designs.

Consequently, I will focus my effort on practical generalizability, statistical generalizability and internal

validity. Let’s consider these in turn.

In my opinion, CA holds great potential in terms of practical generalizability – the extent to

which inferences and conclusions can support the local implementation of the program for other

subjects, other times and other places or other settings. The inherent focus of CA on understanding the

nature and context of causal linkages between a program and a set of desired outcomes provides causal

stories that lend themselves well to local implementation in other settings. The emphasis on the

development and subsequent refinement of an embedded theory of change leads the evaluator towards

a deep and highly applicable understanding of how the nuts and bolts of the program may function and

behave in different contexts.

CA, however, holds a weaker position when it comes to statistical generalizability – the extent to

which conclusions can be generalized to other subjects, other times and other places or other settings.

The focus on systematically examining the nature and context of the causal linkages between the

program and the desired set of outcomes is not likely to produce the type of law-like general statements

that lend themselves well to generalization. However, and as mentioned above, the true capacity of CA

in relation to statistical generalizability remains contingent upon the specific design and methods

employed in the evaluation.

Third and finally, I think the real area of improvement in relation to CA and validity revolves

around internal validity. While Mayne provides different strategies for identifying the most salient

external factors, he never goes into any detail on how to gauge the influence of these. The importance

of an operational framework for identifying and assessing the influence of external factors is

particularly clear given the very aim of contribution analysis, especially in relation to contribution stories

of direct and indirect influence. The advantage that CA offers in complex contexts is that it aims to

determine the relative, rather than the specific, contribution of a program to a set of outcomes of

interest. This being the case, explicit guidance on how to systematically gauge the magnitude of the

contribution of a program relative to other influencing factors is essential. In my mind, an elaborated

version of this aspect of CA provides a necessary stepping stone to strengthen the ability of CA to

approximate the contribution of programs. The consequences of this missing stepping stone are real. As

just one example, Michael Patton’s use of contribution analysis in his advocacy impact evaluation of a

major philanthropic foundation (2008) resulted in the following conclusion:

Based on a thorough review of the campaign’s activities, interviews with key informants and

key knowledgeables, and careful analysis of the Supreme Court decision, we conclude that: The

coordinated final-push campaign contributed significantly to the Court’s decision (2008, p. 1 –

my italics).

One might wonder how to interpret the vague – yet heavily loaded – quantifier “significant” in the

above conclusion. How was it determined that the contribution was significant as opposed to moderate? What does it really mean that the campaign contributed significantly? Some sort of systematic approach that not only identifies but also gauges the influence of other external factors is, in my mind, called for. I would dare argue that the methodological soundness of CA demands a consistent and

rigorous use of strategies that aim to reduce the threat of external factors.
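One possible direction, offered here purely as a hypothetical sketch and not as part of Mayne’s method, is to score the program and each identified external factor on the strength of evidence for its influence and then report the program’s share of the total. The factor names, the 0-3 scale, and the normalization are all assumptions introduced for illustration.

```python
# Hypothetical sketch: making explicit, comparable judgments about the relative
# influence of a program versus other identified external factors. The 0-3
# evidence-strength scale, the factor names, and the normalization are
# illustrative assumptions, not an established part of contribution analysis.

def relative_contribution(evidence_scores: dict[str, int]) -> dict[str, float]:
    """Normalize evidence-strength scores (0 = no evidence of influence,
    3 = strong evidence of influence) into relative shares."""
    if any(score < 0 or score > 3 for score in evidence_scores.values()):
        raise ValueError("Scores must lie on the assumed 0-3 scale")
    total = sum(evidence_scores.values())
    if total == 0:
        raise ValueError("At least one factor must show some evidence of influence")
    return {factor: score / total for factor, score in evidence_scores.items()}

# Example usage with hypothetical factors for an imagined advocacy campaign:
shares = relative_contribution({
    "Campaign activities": 3,
    "Parallel media coverage": 2,
    "Shifts in public opinion": 1,
})
for factor, share in shares.items():
    print(f"{factor}: {share:.0%} of the weighted evidence of influence")
```

Such a normalization does not, of course, turn qualitative judgments into precise estimates; its value lies simply in forcing the evaluator to state how strong the evidence for each rival explanation is judged to be relative to the program.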

I am well aware that pushing for the increased application of CA requires more than a

theoretical discussion of validity issues. In closing this paper, I would like to share what I think are

some of the related issues that ought to be discussed and that may serve to strengthen the conceptual

and practical advancement of CA. Admittedly, these are only the beginnings.

First, I suggest we need to examine the underlying concept of causality that CA builds on.

Examining the underlying concept of causality should involve an examination of the counterfactual

framework that Mayne is positioning CA against when arguing that contribution analysis does not

involve a counterfactual-based argument. It is certainly true that CA offers a viable alternative to

counterfactual designs in settings where comparison and control groups are unfeasible. That being said,

CA still deals with counterfactual-based questions and is – in my mind – certainly compatible with

some counterfactual designs. As pointed out by Howard White:

Although it may not always be necessary or useful to make the counterfactual explicit, in attributing changes to outcomes to a development intervention it most likely is useful to have an

explicit counterfactual. An explicit counterfactual does not necessarily mean that one needs a

comparison group, though often it will. (2010, p. 157).

As he goes on to argue, the counterfactual may come in the form of different variants of interrupted

time-series design. If we accept this conceptualization of the explicit counterfactual, there is no reason

to hold that CA is incompatible with counterfactual-based designs or arguments as such. We may as

well keep our doors open to and be aware of this particular type of CA.
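To illustrate White’s point about interrupted time-series designs, the sketch below fits a simple segmented regression in which the projected pre-intervention trend serves as the explicit counterfactual, with no comparison group. The simulated data, variable names, and use of statsmodels are my own illustrative assumptions, not drawn from Mayne or White.

```python
# Hypothetical sketch of an explicit counterfactual without a comparison group:
# a segmented regression on an interrupted time series. All data and variable
# names are invented for illustration.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n_periods, intervention_at = 24, 12
time = np.arange(n_periods)
post = (time >= intervention_at).astype(int)
time_since = np.where(post == 1, time - intervention_at, 0)

# Simulated outcome: a gentle pre-existing trend plus a level shift after the intervention.
outcome = 10 + 0.3 * time + 2.5 * post + 0.2 * time_since + rng.normal(0, 0.5, n_periods)
df = pd.DataFrame({"outcome": outcome, "time": time, "post": post, "time_since": time_since})

# 'post' estimates the immediate level change; 'time_since' the change in trend.
# The fitted pre-intervention trend, projected forward, is the explicit counterfactual.
model = smf.ols("outcome ~ time + post + time_since", data=df).fit()
print(model.params)
```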

Second, I think it may further the discussion on CA as an alternative to experimental designs to

sharply distinguish between designs and analytical strategies, as these are often conflated. An

evaluation design specifies the frequency and placement of measurement points in relation to the

intervention being evaluated. It also specifies the demand for a control or comparison group. In doing

so a design delineates the overarching structure of the data collection. An analytical strategy specifies

how the data derived from the measurement points will be analyzed and connected with the questions

to be answered as part of the evaluation. One might argue that simply comparing experimental designs

with an analytical strategy is like comparing apples and oranges. I agree. However, I also recognize

that these comparisons are being made—and will continue to be made—and that contribution analysis,

as noted by Mayne, is “often perceived as a very weak substitute for the more traditional experimental

approaches to assessing causality” (n.d.: 1). In advancing CA, I think we have to accept that these

comparisons will be made and seek to inform and frame them as best as possible. This involves holding

on to the important distinction between a design and an analytical strategy and recognizing that the

internal validity of causal claims is contingent upon the evaluation design and the analytical strategy collectively.
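As a purely illustrative way of keeping the distinction in view, the sketch below separates what a design specifies (measurement points and the demand for a comparison group) from what an analytical strategy specifies (how the collected data are turned into a causal claim). The class names and fields are assumptions of mine, not an established formalism.

```python
# Hypothetical sketch: separating an evaluation design (structure of data
# collection) from an analytical strategy (how the data are analyzed).
# Class names and fields are illustrative assumptions only.
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass
class EvaluationDesign:
    measurement_points: Sequence[str]   # e.g. ("baseline", "endline")
    comparison_group: bool              # does the design demand a comparison group?

@dataclass
class AnalyticalStrategy:
    name: str                           # e.g. "contribution analysis"
    analyze: Callable[[dict], str]      # turns collected data into a causal claim

# The same analytical strategy (here, a stand-in for CA) can be paired with
# different designs; internal validity depends on the pairing, not on either alone.
pre_post_design = EvaluationDesign(("baseline", "endline"), comparison_group=False)
contribution_analysis = AnalyticalStrategy(
    "contribution analysis",
    analyze=lambda data: "contribution story assembled from the theory of change and evidence",
)
```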

Third, and in direct extension of the two points above, it may prove rewarding to

explore – in practical application or at least in theory – the methodological conditions or types of

designs that will strengthen contribution stories based on CA. This has me wondering: Is there really

any reason why we can’t combine a counterfactual pre-/post design with CA? Are there certain

counterfactual or non-counterfactual designs that lend themselves particularly well to CA? Are there

types of designs that will strengthen and connect particularly well with the three different types of

contribution stories? Are there certain types of counterfactual designs that are required to support

contribution stories of direct or indirect influence? In answering these questions we have to be clear on

the distinction between counterfactuals and control groups, designs and analytical strategies. We also

need to engage and systematically apply CA in our evaluations.

Fourth and finally, I would argue that we should continue the methodological discussion on

how to enhance and assess the quality of contribution stories. How would we recognize a

methodologically sound CA if it were right in front of us? Mayne points towards five criteria, but are

there other relevant quality markers? I’m thinking here of a set of quality markers equivalent – but not

identical – to the quality markers typically employed in research. These quality markers may

collectively serve as a backbone to strengthen CA as a viable and methodologically credible method

that can address attribution through contribution.

Concluding remarks

Contribution analysis (CA) presents a promising and viable alternative to the traditional

counterfactual impact designs; indeed, it is my strong belief in the potential of CA that motivates this

paper. However, in advancing CA as a methodologically sound way of addressing attribution, a need

arises for addressing the validity issues pertaining to CA. Validity is at its core about the extent to

which we can invest our trust in a set of inferences. As such, the foundation of confidence in any

methodologically sound design, method or analytical strategy that aims to produce inferences—

especially inferences of cause-and-effect relations—is contingent upon the extent to which attention is

awarded to validity issues. In order to pave the way for the increased practical application of CA we

need to address validity issues related to CA. However, we also need to make sure that the concept of

validity that we employ in building credible causal stories – by way of CA or other strategies – is

applicable and relevant to the field of evaluation. It is my modest hope that this paper will foster further

discussion of both validity and contribution analysis.

References

Bickman, L. 1987. The Function of Program Theory: Using Program Theory in Evaluation, New Directions for Evaluation, 33, pp. 5-18.

Campbell, D.T., & Stanley, J. 1963. Experimental and Quasi-experimental Designs for Research. Chicago: Rand McNally.

Chen, H. 1987. The Theory-driven Approach to Validity. Evaluation and Program Planning, 10, pp.
95-103.

Chen, H. 1988. Validity in evaluation research: A critical assessment of current issues, Policy and
Politics, 16 (1), pp. 1-16.

Chen, H. 2010. The Bottom-Up Approach to Integrative Validity: A New Perspective for Program
Evaluation. Evaluation and Program Planning, 33, pp. 205-214.

Davidson, E.J. 2000. Ascertaining Causality in Theory-Based Evaluation, New Directions for
Evaluation, 87, pp. 17-26.

Dybdal, Bohni and Lemire (n.d.).

House, E. 2001. Unfinished Business: Causes and Values, The American Journal of Evaluation 22 (3),
pp. 309-315.

Kane, M. 2001. Current Concerns in Validity Theory, Journal of educational measurement 38 (4), pp.
319-342.

Mayne, J. 1999. Addressing Attribution through Contribution Analysis: Using Performance Measures
Sensibly, discussion paper, Office of the Auditor General of Canada.

Mayne, J. 2001. Addressing Attribution through Contribution Analysis: Using Performance Measures
Sensibly, Canadian Journal of Program Evaluation, 16 (1), pp. 1-24.

Mayne, J. 2008. Contribution analysis: An approach to exploring cause and effect, ILAC Brief 16,
Institutional Learning and Change (ILAC) Initiative, Rome, Italy.

Mayne, J. n.d. Addressing Cause and Effect in Simple and Complex Settings through Contribution Analysis, in R. Schwartz, K. Forss, and M. Marra (Eds.), Evaluating the Complex, New York, Transaction Publishers (in print).

Messick, S. 1989. Validity in R. L. Linn (Ed.), Educational measurement (3rd ed.), pp. 13-103. New
York, American Council on Education and Macmillan.

Patton, M. 2008. Advocacy Impact Evaluation. JMDE, 5(9): pp. 1-10.

Rogers, P. et al. 2000. Program Theory Evaluation: Practice, Promise, and Problems, New Directions
for Evaluation, 87, pp. 5-13.

Rogers, P. 2007. Theory-Based Evaluation: Reflections Ten Years On, New Directions for Evaluation, 114, pp. 63-67.

Scheirer, A.M. 1987. Program Theory and Implementation Theory: Implications for Evaluators, New
Directions for Program Evaluation, 33, pp. 59-76.

White, H. 2010. A contribution to current debates in impact evaluation, Evaluation, 16 (2), pp. 153-164.

i
One published example is that of Michael Patton and his employment of contribution analysis in evaluating a stealth
campaign (Patton, 2008).
ii
The literature on internal and external validity in research and evaluation is extensive (see Chen 1988 & 2010 as well as
Kane 2001 for a good overview), and space does not allow for a detailed account of the development of the term here.
However, some of the trends are particularly relevant in relation to CA and therefore merit our attention. First of all, the
concept of validity in the Campbellian tradition pertains specifically to the research design; that is, it is the research design
that is being validated. Over the years the consensus has moved towards validity pertaining to the claims and inferences produced by different designs. Stated differently, it is the causal claims and inferences themselves that are
being validated. As a result, any credible combination of design and analytical strategy that seeks to produce sound causal
claims has to address internal validity.
iii
New interpretations and conceptual advancements in the area of validity have come most often from the research
community, especially from researchers in the area of psychometrics where test and instrument validation is central (see
Kane 2001 and Messick 1989 among others).
iv
Meta-evaluation is the exception.
v
The concept of consequential validity was introduced by Messick in the context of instrumental validity, that is, the
validation of tests and measurement instruments. However, it certainly appears relevant given the heavy focus on utilization
in the field of evaluation.

