
Evaluation and Program Planning, Vol. 15, pp. 263-270, 1992
0149-7189/92 $5.00 + .00
Copyright © 1992 Pergamon Press Ltd.
Printed in the USA. All rights reserved.

DANIEL B. FISHMAN
Rutgers University

Guba and Lincoln's recent book, Fourth Generation Evaluation, is a radical critique of the modernist, positivist foundation of traditional program evaluation, which the authors differentiate into three historical stages or "generations." Based upon their analysis, these highly esteemed authors propose a fundamental redefinition and restructuring of the whole evaluation field. In order to deal adequately with the deep and far-reaching implications of their proposals, this review has been extended to a full article length. The main focus of the book is an argument to replace traditional evaluation with "fourth generation evaluation," which is based on the postmodernist epistemology of "constructivism." In polar contrast to positivism's assumption that the true nature of external reality is discoverable through the scientific method, constructivism assumes that there are only alternative, subjective constructions of reality produced by different individuals. Therefore, instead of the positivist role of measuring a program's goal attainment in scientific, quantitative ways, the role of the program evaluator becomes one of facilitating interpretive dialogue among a wide variety of a program's stakeholders. The objective of the dialogue is to attain consensus among the stakeholders upon an emergent construction of the program's value and outcome. The present examination of Guba and Lincoln's book begins with general background, proceeds to a detailed summary of their conceptual framework, and ends with a critical assessment of their views.

Over the last 100 years or so, American intellectual life has been dominated by the idea of modernism. This view posits that the method of the natural sciences (like physics and chemistry) is the best way to pursue knowledge, and that the knowledge stemming from science will produce technological advance leading to societal progress. Over the past 30 years, this view has been directly attacked, and a new view called postmodernism has emerged (Gergen, 1991). In place of experimental science as a paradigm for knowledge, postmodernism uses as its model the type of qualitative scholarship found in such interpretive disciplines as history, journalism, and literature. Egon Guba and Yvonna Lincoln's (1989) book, Fourth Generation Evaluation, is a postmodernist attack on science-based program evaluation and an exposition of an alternative model. As such, it very much reflects the growing influence of the postmodernist movement.

In line with their postmodernist theme, Guba and Lincoln critique the traditional, positivist foundation of program evaluation and advocate replacing it with a new hermeneutic or "constructivist" approach. Thus, a serious reading of the book requires an openness to new paradigms for reconceptualizing and restructuring program evaluation at the most fundamental level. To do justice to the deep and far-reaching implications of the proposals put forth by these two highly regarded authors, this review has been extended to a full-length article. The review begins with general background, proceeds to a detailed summary of Guba and Lincoln's conceptual framework, and then ends with a critical assessment of their views.

Requests for reprints should be sent to Daniel B. Fishman, 56 Marion Road, E., Princeton, NJ 08540.



GENERAL BACKGROUND
Epistemology is the branch of philosophy that investigates the origins, nature, methods, and limits of human knowledge. An epistemological paradigm sets forth the criteria according to which the relevance and validity of a particular body of knowledge are judged. In other words, no knowledge is simply given in any absolute sense. Rather, there are a variety of possible, coherent epistemological systems that have been set forth, and the evaluation of a statement's truth or falsity will depend, in part, upon the epistemological criteria chosen for the evaluation, as opposed to the content of the statement per se (Bernstein, 1983; Fishman, 1988; Gergen, 1991; Rorty, 1979).

With Wilhelm Wundt's establishment of a psychophysiological laboratory in 1879, psychology initiated a Declaration of Independence from philosophy that developed and thrived on the adoption of the modernist epistemology of logical positivism. In broad terms, logical positivism contends that there is an external world independent of human experience and that objective, scientific knowledge about this world can be obtained through direct sense experience, as interpreted within the framework of the theory-embedded, hypothesis-testing laboratory experiment. The data upon which this knowledge is founded consist of discrete, molecular, objectively derived, sensorily based facts, most of which can be quantified. Knowledge is in the form of a cumulative body of context-free, universal laws about the phenomena studied. In the modernist tradition, psychologists who have adopted a positivist perspective generally assume that the universal laws that emerge from scientific study will have a form such that they can be applied to help solve significant psychological and social problems in a unique, rationally based manner.
For a variety of philosophical, scientific, cultural, and practical reasons, since around 1960 there has been a growing movement in psychology and the other social sciences that rejects positivism as the appropriate epistemology for the field and proposes in its stead social constructionism, or constructivism (e.g., Fishman & Neigher, 1987; Fiske & Shweder, 1986; Gergen, 1985; Krasner & Houts, 1984; Starr, 1985). As mentioned above, this movement is part of the broader postmodernist attack on science that has been taking place in many domains of our culture.

In contrast to logical positivism, constructivism takes the position that reality as an individual or group experiences it is, to a substantial degree, conceptually constructed rather than sensorily discovered by that group. Objective knowledge about the world is significantly limited because facts and raw data can be known only within a particular, pre-empirically established, cultural, social, historical, and linguistic context. In other words, in contrast to positivism's assumption that reality is discovered through the methods of natural science, constructivism assumes that reality is, to a large extent, invented by individuals and groups as a function of particular personal beliefs and historical, cultural, and social factors. Thus, constructivism views the nature of reality as relative, depending on the observer's point of view (Ryder, 1987). A growing number of psychologists and other social scientists are exploring or adopting constructivism as the foundation of their work. Egon Guba and Yvonna Lincoln are two such individuals. In reviewing their work, I will first describe their conceptual framework, and then critically evaluate it.

GUBA AND LINCOLN'S CONCEPTUAL FRAMEWORK
In Fourth Generation Evaluation, Guba and Lincoln present one of the clearest, best organized, most detailed, and most practically elaborated series of arguments for the nature and merits of the constructivist paradigm for the practice of applied psychology. They do this both in terms of the constructivist paradigm, generally, and in terms of their particular version of it, which they call responsive constructivist evaluation (p. 38). The name follows from their basic assumption that there is no discoverable reality that is independent of the observer, that is, there is no objective reality. Rather, there are only alternative, subjective constructions of reality produced by different individuals. Since there is no discoverable objective reality, the program evaluator's knowledge is simply another alternative construction of the evaluand (the program being evaluated). In many senses, this puts knowledge about the evaluand by all stakeholders at the same truth and validity level. Thus there are no correct social science theories or specific measurement procedures with which to plan a program evaluation; rather, the relevant theory and procedures must be negotiated among the stakeholders, that is, the evaluator must be responsive in the design process to the perspectives of the other stakeholders.
Moreover, in conducting the evaluation itself, Guba and Lincoln propose that the investigator adopt a constructivist paradigm, which they describe as follows:

[The constructivist paradigm's] basic assumptions are virtually polar to those of science. For ontologically, it denies the existence of an objective reality, asserting instead that realities are social constructions of the mind, and hence that there exist as many such constructions as there are individuals (although clearly many constructions will be shared). We argue that science itself is such a construction; we can admit it freely to the pantheon of constructions provided only that we are not asked to accept science as the right or true construction. . . . Epistemologically, the constructivist paradigm denies the possibility of subject-object dualism, suggesting instead that the findings of a study exist precisely because there is an interaction between observer and observed that literally creates what emerges from that inquiry. Methodologically, . . . the naturalistic [constructivist] rejects the controlling, manipulative (experimental) approach that characterizes science and substitutes for it a hermeneutic/dialectic process that takes full advantage, and account, of the observer/observed interaction to create a constructed reality that is as informed and sophisticated as it can be made at a particular point in time (pp. 43-44).

Postmodernism Comes to Program Evaluation

Guba and Lincoln view their model of evaluation as the fourth major phase or generation in the history of the field. The first generation (from about 1900 to about 1930) equated evaluation with measurement per se. It is typified by Binet's intelligence quotient (IQ) test, which objectively and quantitatively arrays individuals in a distribution in terms of their relative capacity to perform representative, age-appropriate mental tasks. This first generation was a direct application to human affairs of the measurement methods used in physics and chemistry.

The second generation of evaluation (about 1930 to 1967) developed because there emerged a need to assess the impact of curricular changes in educational experiments. The first generation only measured the functioning of individuals. Illustrative of the second generation was the work of Ralph Tyler, who developed a method resulting in a description of the degree to which certain educational objectives were achieved. Tyler's concept of describing the degree of a program's goal attainment is viewed by some as the real beginning of program evaluation, and he has been labelled as the Father of Evaluation (Joint Committee, 1981).

The third generation of evaluation (1967 and after) arose in response to educational and other programs in which the managers were not able to specify measurable goals. This led to a broadening of the evaluator's function to help in developing a program's goals and in appraising pre-established goals, in addition to assessing degree of goal attainment per se. In other words, the evaluator took on a judgement role, along with the measurement and description roles of the first two generations. An example is Scriven's Goal Free Model, in which the evaluator is not told a program's stated goals. Rather, the evaluator focuses on finding out what the program actually is accomplishing, and then relates these achievements to how well they are meeting the needs of the impacted population (Scriven, 1973, 1980).
Guba and Lincoln see at least three major flaws in the first three generations of evaluation. First is the problem of managerialism. In these types of assessments the evaluator is typically hired by the program manager and works for this individual; thus the evaluator is serving the manager rather than service recipients. A second problem is the purported value-free nature of the first three generations of evaluation, which derives from their identification with the natural sciences. This assumption of value-neutrality, argue Guba and Lincoln, does not allow evaluation to accommodate to the fact that our society is quite value-pluralistic. The third related problem is the overcommitment to the scientific paradigm of inquiry, with its emphasis upon decontextualized, immutable natural laws and upon formal quantitative measurement. By stripping away context and qualitative, narrative description, science is viewed as making program evaluation results less applicable to specific local conditions and less accessible and less relevant to lay decision-makers, who typically think in terms of narrative descriptions.
Since Guba and Lincoln view the problems of the first three generations of evaluation as stemming in large part from their adherence to a natural science model, these authors have created fourth generation evaluation (FGE), which is based on the nonscientific view of responsive constructivism, discussed above. They nicely summarize the assumptions of their responsive constructivist perspective, highlighting its direct conflict with the scientific perspective, in the following manner:

Truth is a matter of consensus among informed and sophisticated constructors. . . .
Facts have no meaning except within some value framework; hence there cannot be an objective assessment of any proposition.
Causes and effects do not exist except by imputation; . . .
Phenomena can be understood only within the context in which they are studied; findings from one context cannot be generalized to another; neither problems nor their solutions can be generalized from one setting to another.
Interventions are not stable; . . .
Change cannot be engineered; it is a nonlinear process . . . ;
Accountability is a characteristic of a conglomerate of mutual and simultaneous shapers, no one of which, nor one subset of which, can be singled out for praise or blame;
Evaluators are subjective partners with stakeholders in the literal creation of data. . . .
Evaluation data derived from constructivist inquiry have neither special status nor legitimation; they represent simply another construction to be taken into account in the move toward consensus (pp. 44-45).

The core task for the evaluator in FGE is to orchestrate a negotiation process that attempts to culminate in consensus on better informed and more sophisticated constructions among all stakeholders of the program being evaluated (p. 45). These stakeholders consist of agents, those who help to implement, produce, and use the program; beneficiaries, those who profit in some way from exposure to the program; and victims, those who are negatively affected by the program. The negotiation process is a hermeneutic dialectic one: hermeneutic because of its interpretive nature, and dialectic because it represents a comparison and contrast of divergent views, with a view to achieving a higher-level synthesis of them all, in the Hegelian sense (p. 149).
The negotiation process begins with the open-ended interview of one of the stakeholders, Respondent 1 (R1), to elicit an initial construction of the program being evaluated. R1 is then asked to nominate a second stakeholder, R2, who is viewed as having very different ideas. The evaluator then analyzes the central themes, concepts, ideas, values, concerns, and issues (p. 151) proposed by R1 and creates an initial formulation of R1's construction of the program and its effects, called C1. R2 is next interviewed, first in terms of R2's own views, and then in terms of R2's reaction to C1. R2 then nominates an R3, and the evaluator completes an analysis resulting in C2, a now more informed and sophisticated construction based on the two sources R1 and R2 (p. 152). The evaluator then interviews R3 and obtains this person's reaction to C2, integrating these results with C2 to derive a C3. This process is continued until the circle of respondents has been completed. At this point, it is sometimes useful to make the circle a second time, or the circle may be spiraled by making it a second time with a group of respondents similar to those in the first circle.
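
The sequence of interviews and successive constructions just described can be made concrete with a small sketch. It is offered only as an illustration and is not a procedure given by Guba and Lincoln: the names Respondent, Construction, and run_circle are hypothetical, a construction is reduced to a set of themes plus a set of contested themes, and the merging rule is an assumption chosen for simplicity. The example themes are also invented.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class Respondent:
    name: str
    own_themes: FrozenSet[str]              # elicited in the open-ended interview
    disputes: FrozenSet[str] = frozenset()  # themes this respondent would reject
    nominee: Optional[str] = None           # stakeholder seen as holding very different ideas


@dataclass(frozen=True)
class Construction:
    themes: FrozenSet[str]     # every theme raised so far
    contested: FrozenSet[str]  # raised themes that some respondent has disputed


def run_circle(respondents: Dict[str, Respondent], first: str,
               start: Optional[Construction] = None) -> List[Construction]:
    """Follow the nomination chain once and return [C1, C2, ...]."""
    current = start if start is not None else Construction(frozenset(), frozenset())
    results: List[Construction] = []
    visited = set()
    name: Optional[str] = first
    while name is not None and name not in visited:
        r = respondents[name]
        visited.add(name)
        # Assumed merging rule: add the respondent's own themes, then record
        # which of the themes now on the table this respondent disputes.
        themes = current.themes | r.own_themes
        contested = current.contested | (r.disputes & themes)
        current = Construction(themes, contested)
        results.append(current)
        name = r.nominee
    return results


# Invented example data: three stakeholders, each nominating the next.
stakeholders = {
    "R1": Respondent("R1", frozenset({"teaches coping skills"}), nominee="R2"),
    "R2": Respondent("R2", frozenset({"takes time from academics"}),
                     disputes=frozenset({"teaches coping skills"}), nominee="R3"),
    "R3": Respondent("R3", frozenset({"needs parent involvement"})),
}
c1, c2, c3 = run_circle(stakeholders, "R1")
```

In this toy version a second trip around the circle can be modeled by passing the last construction back in as start. What the sketch cannot capture is the interpretive work of the evaluator in formulating each construction, which is the heart of the process.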
The goal of the whole process is to derive an evaluative construction of the program's impact which has two properties: it is agreed upon by the various stakeholders, and it is informed and sophisticated, that is, it is of high quality. Guba and Lincoln spell out three types of quality criteria in the second half of Chapter Eight of the book. The first type involves the trustworthiness of the final construction, and it parallels such quality criteria in traditional, positivistic evaluation as internal validity, external validity, and reliability. For example, the Fourth Generation Evaluation parallel to internal validity is credibility, which Guba and Lincoln describe as the isomorphism between constructed realities of respondents and the reconstructions attributed to them (p. 237). A variety of techniques for enhancing credibility are discussed, such as prolonged and persistent engagement with stakeholders, debriefing one's findings with a disinterested peer, analyzing negative cases, and getting feedback on the final construction from the original stakeholder participants.
The second type of quality criteria is intrinsic to the hermeneutic, dialectical process of FGE. As information is collected, it is analyzed immediately and fed back for comment, elaboration, correction, revision, expansion, or whatever to the very respondents who provided them only a moment ago (p. 244). Thus, there is a continuous interplay among the views of a variety of stakeholders, many of whom are likely to have wide initial differences, and between the views of these stakeholders and the attempts of the evaluator to summarize and integrate them. In this process, the so-called biases or prejudices of the evaluator (p. 244) are substantially eliminated.
The quality indicators of the third type are called authenticity criteria. One of these is fairness, which refers to the extent to which different constructions and their underlying value structures are solicited and honored within the evaluation process (p. 246). This can be evaluated in three ways: (a) by examining written documentation of the process by which stakeholders were selected for interviews, (b) by openly negotiating with the various stakeholders the final recommendations for action coming out of the evaluative process, and (c) by creating an appropriate mechanism should any negotiating party feel that the rules are not observed (p. 247).

Another authenticity criterion is ontological authenticity, which refers to the extent to which the respondents' own constructions are improved by becoming more mature and elaborated. One way of measuring such authenticity is to examine the audit trail of entries of individual constructions recorded at different points over time.

Still another authenticity criterion is educative authenticity. This refers to the extent to which individual respondents' understanding of and appreciation for the constructions of others outside their stakeholding group are enhanced (p. 248). The final authenticity criterion is catalytic authenticity, and it refers to the extent to which action is stimulated and facilitated by the evaluation processes (p. 249). This criterion is considered crucial, because the authors view the ultimate purpose and raison d'être of evaluation as some form of action and/or decision making (p. 249).

CRITIQUE
In sum, then, Fourth Generation Evaluation (FGE) adopts a constructivist set of epistemological assumptions, which contrast radically with the positivist epistemological assumptions underlying traditional, science-based program evaluation. FGE is derived from the view that external reality is not directly knowable, but rather that the external reality we bump up against can be interpreted or understood from only one of a wide variety of possible, plausible perspectives. Moreover, these perspectives are highly embedded in language and historical and cultural context. Thus, to an important degree, outside reality is constructed rather than discovered.

If there is no single correct view of reality to be discovered, the ultimate criterion of the truth of a statement or conceptual position is its pragmatic value for helping those for whom the statement is relevant (Bernstein, 1983; Gergen, 1991; James, 1955; Rorty, 1979). In other words, the truth of a statement is in some sense relative to the sociopolitical dynamics of the group evaluating the statement. In the constructivist approach of FGE, the goal in evaluating a program is to help those who are involved in or impacted by the program, that is, the program's stakeholders.
There are two types of criticism that can be levelled at FGE. One is an attack on its constructivist epistemology, coupled with a reassertion of the values of positivism; for it should not be forgotten that positivism still dominates much of the social sciences (Gergen, 1985). Unfortunately, there is not space here to pursue the details of the debate between constructivism and positivism. (For a recent impassioned statement in defense of positivism, see Staats, 1991.) However, there is one issue that deserves to be briefly mentioned in this context: that of relativism.
Relativism
Since constructivism assumes that there is no single correct view of reality to be discovered, only multiple and alternative constructions of it, constructivism is susceptible to the logical problems of relativism (Bernstein, 1983). In other words, if any particular model like FGE is only one of a possible number of constructed views, there is no special argument per se that FGE is superior to a different point of view, such as logical positivism. To their credit, Guba and Lincoln recognize that at the end of their book:

The model of fourth generation evaluation, indeed this entire book, is a construction . . . [and thus] is subject to reconstruction wherever and whenever new information and/or increased sophistication can be brought to bear (p. 265).

The second type of criticism of FGE comes from within constructivist epistemology. From this perspective, there are at least five critical issues to be raised about FGE. Each will be considered in turn.
Inappropriate Mixing of Technology and Politics
Within FGE there appears to be an inappropriate blending of program evaluation's technical resources and its political advocacy. The program evaluator as a technician is viewed as having special conceptualization and negotiation skills in helping to clarify and express others' constructs, to stimulate the interchange among individuals with different constructs, and to encourage the emergence of new constructs that integrate across divergent points of view. There is nothing intrinsic to these special skills that dictates they be used in the service of the political view FGE advocates, namely to empower and enfranchise all stakeholders by setting the goal of achieving consensus among them. Thus, there is nothing from within the unique skills of FGE evaluators that would prevent them from using their capacity to conceptualize and negotiate in the service of meeting the goals of program managers or program funders specifically, rather than all stakeholders per se.
In a related vein, it is important to note the strong relationship between claims to truth and the distribution of power in society. In Gergen's (1991) words:

Those groups to whom knowledge is attributed are generally granted the privilege of making decisions. We want knowledgeable people, rather than the ignorant or uninformed, to decide on matters of importance. Thus the power of decision making is often granted to scientists, experienced politicians, learned judges, medical doctors, and so on (Gergen, 1991, p. 95).

In the modernist view of science as having a special capacity to generate the most accurate picture of external reality, scientific experts are provided special powers in the decision making process. This is reflected in third generation evaluation, where scientifically skilled evaluators are given the authority to decide what the goals of a social program should be, even though, upon reflection, the setting of those goals certainly seems a value-based rather than a science-based issue.
From a postmodernist, constructivist perspective, there is no clear way of deciding whose construction of reality is truer or better in some foundational sense. Guba and Lincoln state that

evaluation data derived from constructivist inquiry have neither special status nor legitimation; they represent simply another construction to be taken into account in the move towards consensus (p. 45).

Thus, the determination of whose view is more relevant to decision making and practical action becomes a matter of previously established political structures and ongoing political negotiation. The FGE evaluator has no special status in setting the political structure of a program, that is, in deciding what decision making authority is invested in which subgroups of stakeholders. This does not prevent the evaluator from assuming the role of an interested citizen and arguing for a particular structure, such as a directly democratic model of decision making. However, such advocacy is not related to the evaluator's special expertise as a professional in the area of evaluation.
Lack of Documented Case Studies Demonstrating the FGE Model
As mentioned above, in the constructivist approach of FGE, the ultimate criterion of the truth of a statement or conceptual position is its pragmatic value in helping those for whom the statement is relevant. From this perspective, there are reasons to question the FGE model of evaluation presented by Guba and Lincoln. Perhaps most important in this regard, there is not even one sample study that the authors present in enough detail to demonstrate in actuality the practical value of their model. Also, there are reasons to believe that the FGE model would be unwieldy and difficult to implement. For example, consider the study of a complex program like the implementation of a social problem-solving curriculum for middle-school children (Elias, 1991). Guba and Lincoln do not provide concrete techniques for deciding which and how many of the many hundreds or perhaps even thousands of stakeholders to interview in the hermeneutic negotiation process. These stakeholders include the children, their parents and siblings, teachers, administrative and support staff, school district board members, and tax-paying members of the community who don't have children in the school.
Problems With the Goal of Stakeholder Consensus
Moreover, doesn't it seem naive to believe that the skilled FGE evaluator can get groups who are frequently in intense political conflict to come to consensus? For example, in the social problem-solving program just mentioned, there are those stakeholders who support teaching psychological skills in school, and there are those who believe that school should devote itself only to teaching basic academic skills. And then there are those who would prefer to see any resources beyond the academic basics go into music and art only. When more controversial topics are considered, such as abortion counseling or programs that provide sex education in the schools, the achievement of consensus on the part of all stakeholders seems for all practical purposes impossible.
The Role of Existing Political Contexts
The evaluator cannot forget that programs take place in a pre-existing political context, with certain groups in power deciding to fund and operate the program in order to achieve certain goals. What seems typically feasible is for the evaluator to devise ways to measure the extent to which those goals have been achieved and perhaps to provide other information that might be relevant to various stakeholders in the program.
There are a number of models that differentiate the various political contexts in which evaluation can be conducted. For example, Windle and Neigher (1978) discuss three: the amelioration model, in which the purpose of the evaluation is to help program managers improve the internal operation of their program; the accountability model, in which the purpose is to focus on public data disclosure and citizen participation; and the advocacy model, in which the purpose is to help managers advocate with outside funders for additional resources. Windle and Neigher discuss in detail how there are ethical problems inherent in each model both separately and in attempts to combine models. In essence, each approach involves a series of tradeoffs, maximizing certain values in opposition to others. The amelioration model orients to the needs of program managers, but not to the needs of service recipients and other citizens; the accountability model is the reverse; and the advocacy model blurs the distinction between the goal of evaluating a program in a more objective manner and evaluating a program in a more political manner. FGE is based upon the view that all three models can be combined, yielding consensus among all stakeholders. Yet Windle and Neigher's arguments, as well as those of others, raise grave doubts about this approach.
Alternative Models Within the Constructivist Paradigm
FGE is not the only evaluation model that can be developed within the constructivist paradigm. For illustration, I will describe such an alternative that I have developed called the technological or pragmatic paradigm. This model incorporates many of the ideas of third generation evaluation into a constructivist epistemology (Fishman & Neigher, 1987; Fishman, 1991a). Although still using quantitative methods, as in traditional science, the pragmatic paradigm rejects the theory-based laboratory experiment and the search for general psychological laws. Rather, this model focuses on action-oriented approaches from engineering and research and development (Gilbert, 1978; Morell, 1979).
In the pragmatic paradigm, a conceptually coherent program is designed to address a significant social or psychological problem within a naturalistic, real-world setting, in a manner that is feasible, effective, and efficient. Quantification is used to develop performance indicators of a system's functioning. Then the system is monitored in terms of both baselines and changes due to identified interventions. Also, in the historical and cultural context of the particular individual, group, or organizational case, single-case experimental designs can be employed to assess causal relationships that appear true for that individual case (Barlow, Hayes, & Nelson, 1984).
The pragmatic paradigm focuses on getting programs to work within a particular real-world setting. The degree to which the program is generalizable from that particular contextual setting is an empirical question. Just because a program will not work in another setting does not diminish the program's relevance and validity in the original setting. The lack of success in the second setting is attributed to contextual differences between the two settings. These contextual differences are always present, and frequently they functionally interact with the program in question.
The process of program evaluation conducted within the pragmatic paradigm has four phases (Fishman & Neigher, 1987). In the first phase, the evaluator identifies the type of decision to be made. Next the context of the decision and the culture of the relevant decision-makers are described. This includes the decision-makers' interpretation of the decision and their values regarding such issues as quantitative versus qualitative data, formal decision models, and a deliberate versus a quick decision-making process. Based upon this description and any relevant research that helps in articulating and informing it, the evaluator constructs a conceptual model for understanding the nature of the decision to be made. In the second phase, a quantitative data methodology is developed that is explicitly linked to the decisions set forth in the first phase. In phase three the methodology is pilot-tested. When pilot-testing is successful, the methodology is implemented at full scale, and when this is successful, the methodology is disseminated at full scale.
In sum, evaluation in the pragmatic paradigm employs quantitative and conceptual elements from traditional, positivistic evaluation, but it does this within a constructivist context so that quantification is employed in the service of meeting the decision-makers' informational needs, rather than purporting to discover the real state of affairs. In developing the model, I have explicitly linked it to a variety of case studies (Fishman & Peterson, 1987; Fishman, 1991a, 1991b), showing how the model describes and explains more or less successful evaluation projects.
My point in discussing the pragmatic paradigm of evaluation is not to claim that it is correct and the FGE model is incorrect. Rather, I am arguing that since the ultimate justification of any evaluation model within constructivist methodology is its pragmatic value (that is, its value in helping decision-makers and other stakeholders in particular case situations), there must be documented, detailed case studies employing the model to assess properly the model's worth. Unfortunately, Fourth Generation Evaluation is lacking in such case study examples.
In conclusion, Guba and Lincoln's book is important in laying out in epistemological, conceptual, and methodological detail a postmodernist, constructivist hermeneutic model of program evaluation. Many of the elements of their approach appear quite intriguing, such as the systematic collection of qualitative perspectives from a wide variety of stakeholders as one of the inputs into the evaluative process. It appears to me that the quantitative data in an evaluation conducted in my pragmatic paradigm would be very importantly complemented by the type of hermeneutic dialectic assessment described by Guba and Lincoln. However, the contention of these authors that their model can be applied in the pure manner they describe, resulting in full consensus upon evaluative results by all stakeholders, seems, upon analysis, politically naive and, operationally, almost endless as one proceeds around the circle of respondents again and again in the search for consensus. On the other hand, I remain open to the possibility that their pure model can be successful, but to make this possibility a reality, the model must be demonstrated with detailed case examples. I look forward to reading reports by Guba and Lincoln and other FGE evaluators of efforts to apply their model fully in a variety of case studies.

REFERENCES
BARLOW, D.H., HAYES, S.C., & NELSON, R.O. (1984). The scientist practitioner: Research and accountability in clinical and educational settings. Elmsford, NY: Pergamon Press.
BERNSTEIN, R.J. (1983). Beyond objectivism and relativism. Philadelphia: University of Pennsylvania Press.
ELIAS, M.J. (1991). An action research approach to evaluating the impact of a social decision-making and problem solving curriculum for preventing behavior and academic dysfunction in children. Evaluation and Program Planning, 14, 397-401.
FISHMAN, D.B. (1988). Pragmatic behaviorism: Saving and nurturing the baby. In D.B. Fishman, F. Rotgers, & C.M. Franks (Eds.), Paradigms in behavior therapy: Present and promise (pp. 254-293). New York: Springer Publishing Company.
FISHMAN, D.B. (1991a). An introduction to the experimental versus the pragmatic paradigm in evaluation. Evaluation and Program Planning, 14, 353-363.
FISHMAN, D.B. (1991b). The experimental versus the pragmatic paradigm: Summary and conclusions. Evaluation and Program Planning, 14, 403-409.
FISHMAN, D.B., & NEIGHER, W.D. (1987). Technological assessment: Tapping a third culture for decision-focused psychological measurement. In D.R. Peterson & D.B. Fishman (Eds.), Assessment for decision (pp. 44-76). New Brunswick, NJ: Rutgers University Press.
FISHMAN, D.B., & PETERSON, D.P. (1987). On getting the right information and getting the information right. In D.B. Fishman, F. Rotgers, & C.M. Franks (Eds.), Paradigms in behavior therapy: Present and promise (pp. 254-293). New York: Springer Publishing Company.
FISKE, D.W., & SHWEDER, R.A. (1986). Metatheory in social science. Chicago: University of Chicago Press.
GERGEN, K.J. (1985). The social constructionist movement in modern psychology. American Psychologist, 40, 266-275.

GERGEN, K.J. (1991). The saturated self: Dilemmas of identity in contemporary life. New York: Basic Books.
GILBERT, T.F. (1978). Human competence: Engineering worthy performance. New York: McGraw-Hill.

GUBA, E.G., & LINCOLN, Y.S. (1989). Fourth generation evaluation. Newbury Park, CA: Sage Publications.
JAMES, W. (1955). Pragmatism and four essays from The Meaning of Truth. New York: Meridian Books.
JOINT COMMITTEE ON STANDARDS FOR EDUCATIONAL EVALUATION. (1981). Standards for evaluation of educational programs, projects, and materials. New York: McGraw-Hill.
KRASNER, L., & HOUTS, A.C. (1984). A study of the value systems of behavioral scientists. American Psychologist, 39, 840-850.
MORELL, J.A. (1979). Program evaluation in social research. Elmsford, NY: Pergamon Press.
RORTY, R. (1979). Philosophy and the mirror of nature. Princeton, NJ: Princeton University Press.
RYDER, R.G. (1987). The realistic therapist: Modesty and relativism in therapy and research. Newbury Park, CA: Sage Publications.
SCARR, S. (1985). Constructing psychology: Making facts and fables for our time. American Psychologist, 40, 499-512.
SCRIVEN, M. (1973). Goal-free evaluation. In E.R. House (Ed.), School evaluation: The politics and process (pp. 319-328). Berkeley, CA: McCutchan.
SCRIVEN, M. (1980). Evaluation thesaurus (second edition). Inverness, CA: Edgepress.
STAATS, A.W. (1991). Unified positivism and unification psychology: Fad or new field? American Psychologist, 46, 899-912.
WINDLE, C., & NEIGHER, W. (1978). Ethical problems in program evaluation: Advice for trapped evaluators. Evaluation and Program Planning, 1, 97-107.
