
How Modern Democracies Are Shaping Evaluation and the Emerging Challenges for Evaluation

GARY T. HENRY
Promote then, as an object of primary importance, Institutions for the general diffusion of
knowledge. In proportion as the structure of government gives force to public opinion, it
is essential that public opinion should be enlightened . . .
George Washington (1796/1992)1

INTRODUCTION

As he was leaving office, George Washington endorsed institutions that contribute to the
stock of information about public issues and that enlighten the opinions held by the public
with that information. Now, over 200 years later, evaluation is on the cusp of becoming such
an institution. Findings from evaluations routinely permeate the boundaries of academic
publications and technical reports and find their way to the public through the popular media.
A case in point is Arthur Reynolds’ evaluation of the Chicago Child-Parent Centers. These
findings appeared in the academic press (e.g., Reynolds, 2000; Reynolds, Temple, Robertson,
& Mann, 2000), and were echoed in at least 40 articles and editorials in newspapers and
magazines across the country. As this case illustrates, the media have recognized the public’s
demand for evaluation information, a demand similar to those routinely registered in
legislative chambers and agency offices across the nation.
With evaluation perched at the precipice of fulfilling the function of informing opinions
and actions pertaining to public issues, it is timely to analyze how the institutional arrange-
ments within modern democracies have interacted to create the possibility for evaluation to
reach such a vaunted and influential status. I will not look inward at how evaluations are
designed and carried out. Instead, I will focus on the development of evaluation within the
United States, and how the development was set in motion by the U.S. Constitution, the Bill
of Rights, and the marketing of these documents through the Federalist Papers. The institutional
arrangements that have resulted, which restrain the concentration of power in the hands of
any individual or group of like-minded individuals, have had a profound influence on the
nature of evaluation that is called for in democracies. Currently, evaluation is moving along
its developmental path within such modern democracies. In this article, I will analyze the
demands placed upon evaluation by modern democracies and comment on two emerging
challenges, the responses to which will affect the future of evaluation.

CONSTANTLY CONFRONTING THE CONCENTRATION OF POWER


The science of politics, however, like most other sciences, has received great improve-
ment. The regular distribution of power into distinct departments; the introduction of
legislative balances and checks; the institution of courts composed of judges holding their
offices during good behavior; the representation of the people in the legislature by
deputies of their own election; these are wholly new discoveries, or have made their
principal progress towards perfection in modern times. They are means, and powerful
means, by which the excellences of republican government may be retained and its
imperfections lessened or avoided.
The Federalist, Number 9, 1788/1992
Congress shall make no law respecting an establishment of religion, or prohibiting the free
exercise thereof; or abridging the freedom of speech, or of the press; or the right of the
people peaceably to assemble, and to petition the Government for a redress of grievances.
Amendment 1, Constitution of the United States of America
Reacting to the excesses that had resulted from both direct democracies and monarchies, the
U.S. Constitution provided for a tripartite separation of power within the government. Each
branch was to have different constituencies and different responsibilities. In other words, the
framers of the Constitution anticipated mistakes and errors in human judgment. They set out
the three branches and the further diffusion of responsibilities (e.g., the 10th Amendment,
which reserved all powers not specifically assigned to the federal government to the states or to the people)
to reduce the possibility that errors in judgment would be allowed to stand uncorrected. To
allay concerns about the concentration of power within the three coequal branches of
government, those at the Constitutional Convention agreed to attach the Bill of Rights as a
part of the ratification of the Constitution. The Bill of Rights guaranteed an independent press
and the right of people to meet and speak freely about the problems as they see them.
Countervailing powers, inside, outside, and across levels of government, ultimately
proved to be the strategy that would allow the benefits of democratic governance to be
retained and social betterment to be pursued. The enduring strength of this system was in
overturning bad decisions, although not necessarily with great haste, while conserving
decisions that contribute to the common good. Evaluation, with its focus on the retrospective
assessment of public policies and programs, has developed as a powerful resource in the
process of debunking bad ideas.
Evaluation developed primarily as an administrative function within the executive
branch during the late 1960s and 1970s. As House (1993) noted, administrators need to
legitimize their actions and, for this reason, evaluation would continue to grow within
bureaucracies. While growth did occur, reactions in other branches of government countered
the concentration of the power of evaluation within the executive branch. Legislative bodies
began to develop their own evaluation agencies to offset the authority of the executive branch
to decide what was evaluated, how it was evaluated, and—equally important—whether and
how the evaluation findings were released. Icons of this development include the legendary
Program Evaluation Methodology Division within the General Accounting Office and the
legislative evaluation units that arose in nearly every state capital (Jonas, 1999). Without the
specific responsibility to either administer or oversee public policies and programs, the
judiciary has not developed its own capacity for evaluation. Courts have been more
sporadic in using evaluation, mainly as a tool to monitor and assess the success of remedies
required by the courts, such as the evaluation of the required Pre-K services in low-income
school districts in New Jersey (Barnett, Tarr, & Lamy, 2001).
In addition to government sponsorship of evaluation, foundations and other types of
nongovernmental organizations (NGOs) supply an increasing amount of the resources for
evaluation within the United States and around the world. Based on a survey of foundation
leaders, Patrizi and McMullan (1998) identify “influencing public policy” as one of the
highest priorities that foundations have for sponsoring evaluations. Foundations enhance the
information available to the public and policymakers in at least two ways. They often
evaluate “model” or demonstration programs that the foundations themselves may sponsor,
to provide information that could influence replication of the program with public funds in
the future (e.g., Gomby, Culross, & Behrman, 1999). In addition, they sponsor evaluations
of ongoing public programs to show their current impacts, whether positive, negative, or
nonexistent. Foundation evaluations of public programs and pilot projects offer information
about government activities that is independent of the government.
In addition to the government and NGOs, the press influences the role of evaluation in
democracies. The media offer a mechanism by which Washington’s goals of disseminating
findings and enlightening public opinion could be achieved. A case in point is the previously
cited reporting about the evaluation of Child-Parent Centers (CPC) run by Chicago Public
Schools. But the media are not simply conduits for evaluation findings. For example, take this
interchange that occurred when National Public Radio reporter Larry Abramson reported on
the CPC evaluation on April 8, 2001:
Abramson: But it’s not clear that a study like this one can really prove the value of the $15
billion spent annually for preschool programs in this country. This new study only shows
that early intervention provides some benefits to some kids.
For Douglas Besharov of the American Enterprise Institute, this does not mean this
is the best way to spend this money.

Mr. Douglas Besharov (American Enterprise Institute): What if it’s the case that their
development, their social behavior and so forth would benefit from the mothers only
working part-time? Then we would be better served by spending less money on these
expensive—and they’re very expensive—preschool programs, and more money encour-
aging low-income mothers to work, but only to work part-time (Abramson, 2001).

As this exchange points out, the journalists’ credo—the public’s right to know—extends
their responsibilities beyond passive reporting of findings to seeking balance and perspective
about the study and its findings. Evaluations (as well as other studies) will always be
susceptible to criticism in terms of what they didn’t address, including testing the relative
merits of policy options preferred by some experts. When their job is done well, journalists can
review the evaluation, publish that review, and expose obvious biases. They can surface apparent
biases in which outcomes were addressed or raise concerns about the accuracy of the results.
To stand up in the contentious environs of modern democracies, evaluation must be perceived as
producing credible information that is not unduly biased by its sponsor or by bad decisions of the
evaluator (Datta, 2000), and journalism can play a role in these perceptions.
Evaluation findings can affect the delicate balance of power between competing institutions
and within those institutions. Evaluation findings have been used in Congress to justify
a committee’s jurisdiction over legislation for which there is competition among standing
committees (Shulock, 1999). Evaluations have also influenced the receptivity of state
policymakers to proposed policies (Greenberg, Mandell, & Onstott, 2000). Evaluations can
have a strong influence in the process of justifying a particular policy recommendation or in
the competition for jurisdiction over an issue within a democratic institution. To have such
effects, there must be some shared agreement on the nature of evaluation and how it is to be
conducted. Evaluation findings have to pass scrutiny from a variety of individuals represent-
ing numerous institutions that have a plethora of interests. No government administrator
wants to be upbraided in legislative hearings for having spent public resources on a
substantially biased and self-serving evaluation. No foundation wants its name prominently
cited in the media for sponsoring an evaluation that produced biased findings. These
conditions create a need for evaluation to be practiced in a way that would give evaluation
sponsors a degree of confidence that the findings from an evaluation would pass muster in a
competitive environment. But what is the nature of this practice?
House (1993) predicted a particular path for the development of evaluation in democ-
racies:
Society expects for evaluation to be based on scientific authority. . . the more objective
and less ideological evaluation becomes, the more useful it is and the more it threatens
established authority. A useful practice would provide sound evaluations that are not
ideological. . . The evaluation theory developed so far is too ideological (p. 30).

Evaluations that toe the line of less ideology and more scientific rigor have had notable
influence (Bickman, 1996; Riccio, Friedlander, & Freedman, 1994; Schweinhart & Weikart,
1997). Similarly, the recommendations for evaluation practice that follow this line
are evoked in several reviews and critiques of evaluations (Advisory Committee on Head
Start Research and Evaluation, 1999; Datta, 1983; Gilliam, Ripple, Zigler, & Leiter, 2000;
Gilliam & Zigler, 2000; National Academy of Sciences, 2001; U.S. General Accounting
Office, 1997). If the competition for power within modern democracies is indeed fostering
the development of a particular form of evaluation, what might the theory for that type of
evaluation be based upon and what would be the overriding concerns for its practice?

DEMOCRACY’S DEMANDS FOR EVALUATION THEORY AND PRACTICE


We the People of the United States, in Order to form a more perfect Union, establish
Justice, insure domestic Tranquility, provide for the common defense, promote the general
Welfare, and secure the Blessings of Liberty to ourselves and our posterity, do ordain and
establish this Constitution for the United States of America.
Constitution of the United States of America
As this quote from the U.S. Constitution makes clear, the institutional arrangements set forth
in modern democracies exist, not as ends in themselves, but to allow citizens to improve their
own well-being and to enhance the well-being of others in the society. To realize these
benefits through public policies and programs, we, as a society, must define the common
good, select a course of action that we believe will realize that common good, and adapt that
course of action to circumstances and local conditions (Henry, 1999). Framers of the
Constitution were well aware that the realization of the goals which they set out for the new
nation would require judgments and they struggled to establish self-correcting means for
overturning judgments that turned out not to be in the best interest of the people.
Citizens, their representatives, and those who take on stewardship of the public’s
business make countless judgments that set the course of public policies and programs. These
judgments are formed from individual and collective understandings of the problems and
issues at hand and from the attitudes and values concerning the significance of the problem
as well as what should be done about it. Mark, Henry, and Julnes label this as part of natural
sense-making and posit that “the field of evaluation has been developed to assist, support, and
extend natural human abilities to observe, understand, and make judgments about policies
and programs” (Mark, Henry, & Julnes, 2000, p. 5). Evaluation is fundamentally concerned
with enhancing the innate human activity of making judgments.
To extend natural sense-making abilities, evaluators “conduct . . . systematic inquiry
that describes and explains the policies’ and programs’ operations, effects, justifications, and
social implications” (Mark et al., 2000, p. 3). The systematic inquiry required for evaluation
is motivated by the pursuit of social betterment, a general framing of which appears in the
above quote from the U.S. Constitution. Working from this view of evaluation, evaluators
can be seen to have two fundamental responsibilities that, like two sides of a coin, are distinct
but inseparable. These two sides of the coin are the selection of indicators of success and the
selection of methods of systematic inquiry.

Selection of Indicators of Success

First, evaluators must choose the values, desired outcomes, and negative side effects that
indicate the success or failure of a policy or program. The indicators of success constitute a
working definition of the specific aspects of betterment that the policy or program would need
to advance to create a benefit for society. To provide sponsors and others with the confidence
that the evaluation is not overly ideological, these indicators must be selected in a way that
minimizes bias and is reasonably operationalized to form the basis for systematic inquiry.
Early on, some evaluators sought to sidestep the selection of values by following a research
tradition or discipline, using stated program objectives or attempting to objectively determine
“human needs.” Later, evaluation theorists focused on specific groups whose views should be
given special status, such as program managers (Wholey, 1983), the disadvantaged (House,
1993), intended users (Patton, 1997), or stakeholders (Weiss, 1983). However, more recent
theories have pointed to procedural rules to make these decisions more democratic (House &
Howe, 1999, 2000). Others have proposed that evaluators use systematic inquiry to reduce
bias in the selection of indicators as well as criticism to expose the biases that remain (Henry,
1999; Henry & Julnes, 1998; Mark et al., 2000).

Conducting Systematic Inquiry

Second, evaluators must carry out the systematic inquiry in a way that produces findings
that represent reality, again, with the minimal amount of bias possible. The accuracy of the
“facts” is important because we need to know, for example, if families are better off or worse
off after welfare reform or if Head Start enhances school readiness and success later in life.
We must thoroughly understand the techniques of systematic inquiry, make professional
judgments about their accuracy and adequacy for the inquiry at hand, and not allow
judgments concerning rigor to fall to those who need information but who are not and need
not be experts at obtaining accurate information.
To sum up, less ideological and more rigorous evaluation that sheds light on prior policy
judgments and leads to betterment, albeit indirectly, must be fair and reasonably compre-
hensive in the selection of indicators and accurate in the collection and interpretation of evidence.
Not all biases can be overcome, nor are any methods capable of producing infallible information.
Judgments about the evaluation must be available for scrutiny and redress, in the same manner
that all other decisions in democracies are subject to second-guessing and correction. The
challenge for evaluation that is posed by becoming a potent institution within modern democra-
cies is to open itself up for examinations of its biases. In this regard, two specific challenges come
to the top of the list: transparency and meta-evaluation.

Transparency

Each House shall keep a Journal of its Proceedings, and from time to time publish the same,
excepting such Parts as may in their Judgment require secrecy.
Constitution of the United States of America, Article 1, Section 5.

Remember all Men would be tyrants if they could. If perticular care and attention is not paid to
the Ladies we are determined to foment a Rebelion, and will not hold ourselves bound by any
laws in which we have no voice, or Representation.
Abigail Adams, 1776/1992

. . . when you assemble a number of men, to have the advantage of their joint wisdom, you
inevitably assemble with those men all their prejudices, their passions, their errors of opinion, their
local interests, and their selfish views. From such an assembly can a perfect production be expected?
Benjamin Franklin, speech at the Constitutional Convention

These quotes serve to remind us that we cannot prevent errors in judgment or biases in
how we think and act. For example, stakeholder processes are useful, but provide little
comfort for anyone who was not in the stakeholder group and who wishes to object to the
findings. To those who were excluded, the results will appear to be a collection of biases. One
imperfect but time-honored method of dealing with bias is based on the realization that bias
cannot be fully eliminated. Instead, the idea is to make decisions transparent, and to identify the
participants in decisions and their affiliations. With such information in hand, bias can be
confronted. If evaluation is to live up to its potential to indirectly enhance social betterment by
providing systematic evidence about the consequences of prior judgments about policies and
programs, the work of evaluators requires transparency. On the surface, few will quibble with the
desire for transparency, but making evaluation transparent carries with it many challenges as well.
To become more transparent, evaluation findings must become more widely available
and the description of the study design and processes must be easily accessible to all who are
interested. One reason for transparency is so that we can learn from the work of others, both
substantively and methodologically. However, checking for biases may be an equally if not
more important reason for making all evaluations of public programs available to the public.
Transparency affects both sides of the evaluation coin—selection of indicators of program
outcomes and methods of systematic inquiry. Evaluation reports should make the methods
for systematic inquiry transparent, as the best reports do currently. In addition, the reports
must make the selection of indicators transparent. Mark et al. (2000) propose that evaluators
accomplish this by framing the information supporting the selection of these indicators as
“findings.” This implies that the process for selecting the indicators should be articulated as
thoroughly as the systematic inquiry process in the evaluation documentation. Certainly,
information that would assist the reader in understanding the process that was used to arrive
at key decisions is crucial and allows the evaluator to describe the limitations of the process
(Henry, Dickey, & Areson, 1991). This could include information about which stakeholders
actually participated among those invited. Also, it is important to try to document the impact
that the participants had on the selection of outcomes, choice of measures and other elements
of the evaluation design. Crucial decisions about the design should be discussed, perhaps
most fruitfully by indicating how other options were ruled out, thereby establishing the
rationale for the path that was taken.
A second potential benefit of transparency is to demonstrate more systematically the
benefits of specific processes that can be used to select indicators of success, such as
deliberative dialogues, negotiations with intended users, or stakeholder processes. Accounts
documenting what was done and what decisions changed because of the particular decision
processes will provide case examples and a database for more thorough study. Many
promising evaluation strategies that support organizational learning, such as developmental
evaluation (Patton, 1997), could be emulated and assessed if the actual studies were more
transparent. Proponents and skeptics alike could use this evidence as a basis for developing
their own conclusions about a given approach. Without making evaluations transparent, we
are forced into arguments that can gain no traction because of information asymmetry; that
is, those who have used the approaches have complete control over what information is
available to those who are considering the use or adaptation of the approaches. Greater
transparency for evaluations of all types could lead to more informed judgments during
evaluation planning and toward the goal of evidence-based practice in evaluation.
Of course, as with the requirement for a journal of proceedings for the Congress, some
aspects of an evaluation must not be divulged. It is possible to err on the side of revealing
too much, such as the identities of human participants, just as it is possible to extend privacy
protections too far, such as to those who are appropriately accountable. A balance will need
to be struck by tempering the protection of informants with the public’s right to know.

Meta-Evaluation

It is error alone which needs the support of government. Truth can stand by itself.
Thomas Jefferson, Notes on the State of Virginia—Religion (1787/1992)

Reason and persuasion are the only practicable instruments. To make way for these, free enquiry
must be indulged; and how can we wish others to indulge it while we refuse it ourselves.
Thomas Jefferson, Notes on the State of Virginia—Religion (1787/1992)

While Jefferson was concerned in these statements about the issue of whether the state
should establish an official religion, it is fitting that they be cited in the context of evaluation.
Evaluation is a subversive activity because it aims reason and persuasion at existing beliefs
and powerful interests. Evaluators ask, “what is the worth of a program?” Mostly, evaluation
is accepted as a good thing to do. Few quibble that it should be done. But, sooner or later,
evaluators will be accountable for their decisions to others. It is likely that evaluators will be
accountable, not only in terms of following accepted procedures, but for their results and for
the impacts of their findings. In other words, echoing evaluators’ question about the worth of
a program, others will eventually ask, “what is the worth of an evaluation?” If evaluators
protest that we are not responsible for what is done with our work and that there are too many
intervening factors for us to conclude whether it had an impact, those protests might seem to be
reverberations of the objections registered by those whose efforts we evaluated.
Under current conditions, evaluation of evaluation, or meta-evaluation, as it is known,
is likely to undergo a revival. House (1993, p. xii) has remarked, “Evaluation is a discipline,
or nearly a discipline, because it is possible to have correct and incorrect evaluations. There
is enough agreement in the field to make such judgments about most evaluations. . .The
successful critique of studies is an important indicator of the joining of issues that charac-
terize a discipline.” At this juncture, we appear to be approaching tests of this proposition.
For example, Stufflebeam (2001a) evaluated 22 models or approaches to evaluation using the
Joint Committee Standards (1994), and pared down the list of those for future use to seven
or eight. By their omission from the final list, he suggested that evaluations using certain
approaches were “incorrect.” His approach was connoisseurial and focused on approaches
rather than individual evaluations. No individual evaluations that used each approach were
identified for analysis, nor were prototypes examined. In the future, meta-evaluations are
likely to begin a more intense scrutiny of individual evaluations. In the process, the criteria
for assessing the correctness of individual evaluations will necessarily be advanced, cri-
tiqued, revised, and, perhaps, accepted.
Interest in meta-evaluation may produce either of two important strands of work for the
field, or both. First, a very positive outcome could be a slate of empirical studies that examine
the impacts of actual evaluations. While these studies may be broadly conceived as research
on evaluation, they can add to our understandings of the impacts of evaluation. They are
meta-evaluations in the sense that they employ criteria such as providing a justification for
action or persuading a group to change their opinions to evaluate evaluations. Standing on the
shoulders of studies conducted in the 1970s (Alkin, Daillak, & White, 1979; Caplan, 1977;
Patton et al., 1977; Weiss & Bucuvalas, 1977), new studies could help to establish the
mechanisms by which evaluation achieves influence. In addition, they could establish
reasonable expectations for when various outcomes are likely to occur and the extent to
which they occur. These studies could be informed by (1) the prior research; (2) extensively
researched theories of persuasion and the influence of information from psychology and political
science; (3) methodological developments of the last 25 years; and (4) the development of a large
body of evaluations to act as stimuli for changing attitudes and actions. Studies such as these are
essential if evaluation is to move toward evidence-based practice that is informed by the ways and
extent to which evaluation can contribute to the realization of social betterment. These studies will
be costly but necessary if we are to continue to use research to guide the practice of evaluation
rather than untested beliefs or anecdotal experience.
Second, and, I believe, somewhat less likely, will be the review of individual evaluations.
One could envision the adaptation of checklists similar to those recommended by Scriven or
development of templates using the AEA guidelines or other standards (Stufflebeam, 2001b).
The methods for evaluating specific evaluations must be capable of ferreting out biases when
they have occurred and must make their findings about these biases accessible. Eventually,
to engender confidence in the independence of these reviews, the methods must not only be
capable of a determination of bias, they must actually identify some instances in which it
has occurred. The same pressures that point to the need for the transparency of
individual evaluations will also force the transparency of meta-evaluation. These pressures
will minimally require a compilation of the number of evaluations that were determined
meta-analytically to be “correct” or “incorrect.” This will allow an assessment of whether the
meta-evaluations were consistently biased in favor of concluding that the evaluation was
properly conducted. If the evaluation community develops these procedures, the outcome
could be self-regulation and monitoring that strengthens evaluation as a profession. If it is left
to political or judicial processes, the entire field could be undeservedly sullied.

CONCLUSIONS

Evaluation as a function has been thoroughly integrated into governments throughout the
U.S. and it is either equally developed or under development in other modern democracies
around the world. In some ways our work is more difficult and messier because of the deep
scorn for centralized authority that is embedded in most modern democracies. However,
evaluation has a much greater potential for fulfilling its “subversive” mission of undermining
support for bad decisions because of the associated fear of absolute power. Rather than carrying
out studies assigned by a central authority, with results controlled by that authority, we have a range of
constituencies and great latitude about how we choose to do the work of evaluation. Because
evaluation is a relatively young field populated by clever and thoughtful people, options for
evaluations abound and there is little consensus to constrain our practice. However, democracies
have their own needs to make evaluation more rigorous and less ideological to pursue social
betterment. These needs are served by an evaluation that is independent but not autonomous. The
countervailing powers within modern democracies have created contexts within which evaluators
can have both influence and independence. How evaluators use their independence in responding
to the needs of modern democracies will affect their influence.

ACKNOWLEDGMENT

An earlier version of this paper was presented at the Minnesota Evaluation Studies Institute,
St. Paul, MN, May 29, 2001.

NOTE

1. The quotes from historical documents in this article may contain anachronistic spellings or
capitalization that were in the source documents from which the quote was taken.

REFERENCES

Abramson, L. (April 8, 2001). CPC Evaluation. Washington, DC: National Public Radio.
Adams, A. (1992). Correspondence on women’s rights. In D. Ravitch & A. Thernstrom (Eds.), The
democracy reader: Classic and modern speeches, essays, poems, declarations and documents on
freedom and human rights worldwide (pp. 103–104). New York, NY: HarperCollins. (Original
Correspondence March 31, 1776.)
Advisory Committee on Head Start Research and Evaluation. (1999). Evaluating Head Start: A
recommended framework for studying the impact of the Head Start program. Washington, D.C.:
U.S. Department of Health and Human Services.
Alkin, M. C., Daillak, R., & White, P. (1979). Using evaluations: Does evaluation make a difference?
Beverly Hills, CA: Sage Publications.
Barnett, W. S. (1995). Long-term effects of early childhood programs on cognitive and school
outcomes. The Future of Children, 5(3), 24–49.
Barnett, W. S., Tarr, J., & Lamy, C. E. (2001). Fragile lives/shattered dreams: Second annual report on
implementation of preschool education in the Abbott districts. New Brunswick, NJ: Center for
Early Education Research, Rutgers University.
Bickman, L. (1996). A continuum of care: More is not always better. American Psychologist, 51,
689 –701.
Caplan, N. (1977). A minimal set of conditions necessary for the utilization of social science knowledge
in policy formulation at the national level. In C. H. Weiss (Ed.), Using social research in public
policy making (pp. 183–198). Lexington, KY: Lexington Books.
Datta, L.-e. (1983). A tale of two studies: The Westinghouse-Ohio Evaluation of Project Head Start and the
Consortium for Longitudinal Studies Report. Washington, D.C.: National Institute of Education.
Datta, L.-e. (2000). Seriously seeking fairness: Strategies for crafting a non-partisan evaluation in a
partisan world. American Journal of Evaluation, 21(1), 1–14.
The Federalist, Number 9. (1992). In D. Ravitch & A. Thernstrom (Eds.), The democracy reader:
Classic and modern speeches, essays, poems, declarations and documents on freedom and human
rights worldwide (pp. 123–132). New York, NY: HarperCollins. (Original work published 1788.)
Franklin, B. (1992). Speech at the Constitutional Convention. In D. Ravitch & A. Thernstrom (Eds.),
The democracy reader: Classic and modern speeches, essays, poems, declarations and documents on
freedom and human rights worldwide (pp. 109–111). New York, NY: HarperCollins. (Original speech 1787.)
Gilliam, W. S., & Zigler, E. F. (2000). A critical meta-analysis of all evaluations of state-funded
preschool from 1977 to 1998: Implications for policy, service delivery and program evaluation.
Early Childhood Research Quarterly, 15, 441– 472.
Gilliam, W. S., Ripple, C. H., Zigler, E. F., & Leiter, V. (2000). Evaluating child and family
demonstration initiatives: Lessons from the Comprehensive Child Development Program. Early
Childhood Research Quarterly, 15, 41–59.
Gomby, D. S., Culross, P. L., & Behrman, R. E. (1999). Home visiting: Recent program evaluations—
Analysis and recommendations. The Future of Children, 9(1), 4–24.
Greenberg, D., Mandell, M., & Onstott, M. (2000). The dissemination and utilization of welfare-to-work
experiments in state policy making. Journal of Policy Analysis and Management, 19(3), 367–382.
Harkreader, S. A., & Henry, G. T. (2000). Using performance measurement systems for assessing the
merit and worth of reforms. American Journal of Evaluation, 21, 151–170.
Henry, G. T. (1999, November). What do we expect from preschool? A systematic inquiry into values,
conflicts and consensus. Paper presented at The American Evaluation Association, Orlando, FL.
Henry, G. T., & Dickey, K. (1993). Implementing performance monitoring: A research and develop-
ment approach. Public Administration Review, 53(3), 203–212.
Henry, G. T., Dickey, K. C., & Areson, J. C. (1991). Stakeholder participation in educational
performance monitoring. Education Evaluation and Policy Analysis, 13(2), 177–188.
Henry, G. T., & Julnes, G. (1998). Values and realist evaluation. In G. T. Henry, M. M. Mark, & G.
Julnes (Eds.), Realist evaluation. New Directions for Evaluation, Number 78 (pp. 53–72). San
Francisco: Jossey-Bass.
Hilgartner, S., & Bosk, C. (1988). The rise and fall of social problems: A public arenas model.
American Journal of Sociology, 94, 53–78.
House, E. R. (1993). Professional evaluation: Social impact and political consequences. Newbury
Park, CA: Sage.
House, E. R., & Howe, K. R. (1999). Values in evaluation and social research. Thousand Oaks, CA: Sage.
House, E. R., & Howe, K. R. (2000). Deliberate democratic evaluation. In K. Ryan & L. DeStefano
(Eds.), Evaluation as a democratic process. New Directions for Evaluation, Number 85 (pp.
3–13). San Francisco: Jossey-Bass.
Jefferson, T. (1992). Notes on the state of Virginia-religion. In D. Ravitch & A. Thernstrom (Eds.), The
democracy reader: classic and modern speeches, essays, poems, declarations and documents on
freedom and human rights worldwide (pp. 108–109). New York, NY: HarperCollins. (Original work published 1787.)
Joint Committee on Standards for Educational Evaluation. (1994). The program evaluation standards.
Thousand Oaks, CA: Sage.
Jonas, R. K. (1999). Against the whim: State legislatures’ use of program evaluation. In R. K. Jonas
(Ed.), New Directions for Evaluation, Number 81 (pp. 3–10). San Francisco, CA: Jossey-Bass.
Katz, L. G. (1997). A developmental approach to assessment of young children. Champaign, IL:
University of Illinois.
Mark, M. M., Henry, G. T., & Julnes, G. (2000). Evaluation: An integrated framework for understanding,
guiding and improving policies and programs. San Francisco: Jossey-Bass.
National Academy of Sciences. (2001). Evaluating welfare reform in an era of transition. Washington,
D.C.: National Research Council.
Patrizi, P., & McMullan, B. (1998). Evaluation of foundations: The unrealized potential. (Unpublished
manuscript).
Patton, M. Q. (1997). Utilization-focused evaluation: The new century text (3rd ed.). Thousand Oaks, CA:
Sage.
Patton, M. Q., Guthrie, K. M., Brennan, N. J., French, B. B., & Blyth, D. A. (1977). In search of impact:
An analysis of the utilization of federal health evaluation research. In C. H. Weiss (Ed.), Using
social research in public policy making (pp. 141–164). Lexington, KY: Lexington Books.
Reynolds, A. J. (2000). Success in early intervention: The Chicago child-parent centers. Lincoln, NE:
University of Nebraska Press.
Reynolds, A. J., Temple, J. A., Robertson, D. L., & Mann, E. A. (2000). A 15-year follow-up of low-income
children in public schools. Journal of the American Medical Association, 285, 2339–2346.
Riccio, J., Friedlander, D., & Freedman, S. (1994). GAIN: Benefits, costs and three-year impacts of a
welfare-to-work program. New York: Manpower Demonstration Research Corporation.
Schweinhart, L. J., & Weikart, D. P. (1997). The high/scope preschool curriculum comparison study
through age 23. Early Childhood Research Quarterly, 12, 117–143.
Shulock, N. (1999). The paradox of policy analysis: If it is not used, why do we produce so much of
it? Journal of Policy Analysis and Management, 18(2), 226 –244.
Stufflebeam, D. L. (2001a). Foundation models for 21st century program evaluation (Vol. 89). San
Francisco, CA: Jossey-Bass.
Stufflebeam, D. L. (2001b). The meta-evaluation imperative. American Journal of Evaluation, 22(2),
183–209.
U.S. Constitution, art. 1, sec. 5. (1992). In D. Ravitch & A. Thernstrom (Eds.), The democracy reader:
Classic and modern speeches, essays, poems, declarations and documents on freedom and human
rights worldwide (p. 113). New York, NY: HarperCollins. (Original work published 1787.)
U.S. Constitution of the United States of America. (1992). In D. Ravitch & A. Thernstrom (Eds.), The
democracy reader: Classic and modern speeches, essays, poems, declarations and documents on
freedom and human rights worldwide (pp. 109–111). New York, NY: HarperCollins. (Original
work published 1787.)
U.S. Constitution, Amend 1. (1992). In D. Ravitch & A. Thernstrom (Eds.), The democracy reader:
Classic and modern speeches, essays, poems, declarations and documents on freedom and human
rights worldwide (pp. 118 –119). New York, NY: HarperCollins. (Original work published 1791.)
U.S. General Accounting Office. (1997). Head Start: Research provides little information on impact of
current program. Washington, D.C.: Author.
Washington, G. (1992). Farewell address. In D. Ravitch & A. Thernstrom (Eds.), The democracy
reader: Classic and modern speeches, essays, poems, declarations and documents on freedom and
human rights worldwide (pp. 136–137). New York, NY: HarperCollins. (Original speech 1796.)
Weiss, C. H. (1983). The stakeholder approach to evaluation: Origins and promise. In A. S. Bryk (Ed.),
New Directions for Program Evaluation, Number 17 (pp. 3–14). San Francisco, CA: Jossey-Bass.
Weiss, C. H., & Bucuvalas, M. J. (1977). The challenge of social research to decision making. In C. H.
Weiss (Ed.), Using social research in public policy making (pp. 213–232). Lexington, KY: Lexington Books.
Wholey, J. S. (1983). Evaluation and effective public management. Boston, MA: Little Brown.
