



March 2010

Presented in this document are a rationale, concept, and preliminary plans for an educational
evaluation group, tentatively named The Penn State Educational Evaluation Group, to be located
within the Penn State College of Education. A working group of College faculty, students, and
administrators has been involved formally in discussions about the Group since 2008, although former
Associate Dean Stefkovich began informal discussions in 2007. This précis first considers the
need to unify evaluation efforts within the College. Next, the Phi Delta Kappa Committee on
Evaluation’s 1971 evaluation model is resurrected as a candidate to guide the Group’s efforts. Then,
issues and challenges are discussed related to developing evaluation opportunities for the
Group, gathering required assets, financing the effort, operating evaluation projects within the University
structure, and growing the Group’s capacity to match current and anticipated demand for evaluation
projects. Last, development of a business plan is suggested as the next step in creating The Penn
State Educational Evaluation Group. This white paper was requested by the College of Education
Working Group on Evaluation during its 8 February 2010 meeting.

Table of Contents

Abstract
Focus of This Précis
    College Working Group on Evaluation
    Need for Unifying Effort on Evaluation in the College
The Conceptual Anchor: A Model Developed by the PDK Committee on Evaluation
    Evaluation Defined
    The Evaluation Process Serves Decision Making
    Types of Decisions Are Linked to Types of Evaluation
    Evaluation is a Process Defined & Organized Through a General Work Breakdown
        Delineating Information Needs—“Who Needs What When?”
        Obtaining Information—“How?”
        Providing Information—“To Whom & In What Form?”
    Evaluation is, Itself, Evaluated
Moving Forward
    Opportunities
    Assets
    Finances
    Operations
    Growth
The Next Step: A Business Plan



Focus of This Précis

College Working Group on Evaluation
A working group of College of Education administrators, faculty members, and graduate students
has been discussing development of evaluation capacity in the College. An ANGEL group, Educational
Evaluation Center,3 is a focal point for documentation of this working group’s efforts. The working group
decided on 8 Feb 2010 to use provisionally the title, The Penn State Educational Evaluation Group, to
envision the realization of this evaluation capacity as an organizational entity of the College.4
According to the minutes of the working group’s 10 Nov 2009 meeting,5 the goals of The Penn
State Educational Evaluation Group are to:
• Become a premier resource for providing education–related evaluation services;
• Train graduate students in evaluation;
• Fund graduate students; and
• Create new revenue streams for the College.
The working group’s laundry list of potential activities of the Evaluation Group includes:
• Developing and conducting evaluations based on educational outcomes;
• Serving as evaluators for NSF and other science–related educational programs;
• Serving as “external evaluators” for projects/grants submitted by other colleges and universities, government
agencies, etc.;
• Supporting instructors at Penn State who want to understand the impact of their teaching;
• Engaging in development projects (proof of concept/formative evaluation) to advance the goals of projects and,
perhaps, Penn State;
• Conducting public policy analysis;
• Educating other departments, colleges, and organizations about evaluation;
• Developing and conducting cost/benefit analyses;
• Assisting in the development of evaluation sections of grant proposals for Penn State principal investigators,
even when the Group cannot be the “external” evaluator;
• Serving as the focal point for collecting requests for proposals for evaluation projects from around the country;
• Seeking opportunities to conduct evaluations in the international arena, perhaps through collaborators such as
the World Bank and non–governmental organizations.
Emily Crawford, a graduate student in the Educational Theory and Policy
program, compiled a list of centers of activity at other universities that pursue goals and projects for
evaluation similar to those envisioned for The Penn State Educational Evaluation Group.6 The universities
listed include Cornell, Syracuse, Western Michigan, Iowa State, Harvard, Penn, UCLA, UC/Berkeley, and
Vanderbilt. Although the list of organizations that Ms. Crawford assembled probably is not comprehensive,

Assistant Professor of Education in Workforce Education and Development and Director (for Penn State Outreach) of the Center for Regional
Economic and Workforce Analysis, 814.865.9919,
Professor of Education in Workforce Education and Development and Director (for the College of Education) of the Institute for Research in
Training and Development, 914.689.9337, Baker and Passmore are leaders of Penn State’s Workforce Education and Development
Initiative (
3; requires PSU authentication and group authorization.
We suggest avoiding the description of this nascent organizational entity as a “Center” or “Institute,” at least at this point, to avoid the bottleneck of
Penn State administrivia that is required to establish research centers, consortia, and institutes (cf., “Guidelines for Establishing Research Centers,
Consortia and Institutes (PSU Central Administration Guidelines),” at Let’s try this approach
provisionally, and jump through the paperwork hoops when we become more certain about the viability of this initiative.
Unnumbered p. 1 of; requires PSU authentication and group authorization.
http://OtherInst–; requires PSU authentication and group authorization.


review of the information accompanying her list reveals that many Research I universities successfully fund
and operate centers of activity that conduct and report educational evaluation projects representing diverse
content and methods. In February 2009, prior to the formation of the current working group, Prof Stefkovich,
who was then Associate Dean for Research and Graduate Studies, arranged for a presentation to a meeting of
directors of the College’s centers and institutes by Joshua Smith from the School of Education, Indiana
University/Indianapolis.7 Directors attending that meeting were impressed by the opportunities, collaboration
among internal and external partners, and funding that Prof Smith was able to generate through educational
evaluation projects.
Although not specifically documented in the meeting minutes of the working group that are available
at the ANGEL web site, our impression from participating in discussions as members of the working group is
that the group believes opportunities exist for securing funds to initiate and support The Penn State
Educational Evaluation Group. Indeed, the group’s meeting minutes from 10 Nov 2009 indicate that “Some
government agencies and foundations are requiring X% of the budget to be devoted to serious evaluations.”8
Of course, a formal business plan should be developed to scrutinize the financial viability of The Penn State
Educational Evaluation Group, but preliminary information indicates that the Group could secure business.

Need for Unifying Effort on Evaluation in the College
Members of the working group represent various disciplines, methodological backgrounds and
preferences, and academic unit interests, as does the entire body of scholars within the College. This
diversity is evident in the “Related discussions” listed in the working group’s 10 Nov 2009 meeting minutes
in which concerns are expressed about, for example, ensuring that the “difference between program
evaluation and formative assessments” is respected and recognizing that “outcomes beyond learning
outcomes” are possible.9 A major organizational task in structuring The Penn State Educational Evaluation
Group will be to create institutional capacity to arch over the variegated conceptions of evaluation held in the
College.
We assert that a useful way to integrate this dappled mix of perspectives on evaluation held in the
College is by viewing the work of The Penn State Educational Evaluation Group through a meta–
methodological lens. We use the term, meta,10 to modify methodological in an epistemological sense—that
is, we believe that the work of the Evaluation Group can be structured around a high level conception of the
evaluation process that transcends particular disciplinary traditions, bursts through calcified methodological
commitments and biases, and stands above internecine competition for resources within the College.
In 1971, the Phi Delta Kappa (PDK) Committee on Evaluation authored Educational Evaluation and
Decision Making,11 which detailed a meta–methodological approach to evaluation. The work of the PDK
Committee was conceived partially in reaction to the difficulty and practical inconsequence of applying
experimental and quasi–experimental designs in educational and work settings that rarely afford
opportunities for attaining the gold standard of random assignment of subjects to experimental conditions.12
However, the PDK Committee was much more concerned with staking out a process for assembling
information that supports change and improvement—a process that the Committee meant to be a roadmap for
an evaluation methodology that is generalizable over many types of evaluation projects.
The PDK Committee produced a logic model13 of evaluation. The PDK Committee’s logic model is
easily translated into a work breakdown14 of tasks that define the scope of an evaluation project.

Unnumbered p. 3 of; requires PSU authentication and group authorization.
Statements made without further discussion on unnumbered p. 3 of; requires PSU authentication and group authorization.
According to Wikipedia, meta “(from Greek: μετά = ‘after’, ‘beyond’, ‘with’, ‘adjacent’, ‘self’) is a prefix
used…to indicate a concept which is an abstraction from another concept.”
Stufflebeam, D. L., Foley, W. J., Gephart, W. J., Guba, E. G., Hammond, R. L., Merriam, H. O., & Provus, M. M. (1971). Educational evaluation
and decision making. Itasca, Illinois: F. E. Peacock Publishers. This book, now out of print, was somewhat disruptive due to its practical focus on
decision making, but nevertheless received some enthusiastic reviews (cf., e.g., Clark, C. A. (1972). [Review of the book Educational evaluation
and decision making by D. L. Stufflebeam, W. J. Foley, E. G. Guba, et al.]. Educational and Psychological Measurement, 32, 219–223).
The epigraph in the front matter of Educational Evaluation and Decision Making reads, “The purpose of evaluation is not to prove but to improve.”
A logic model “is a general framework for describing work in an organization.”

In these ways, the PDK Committee’s evaluation model provides a structure for managing evaluation projects that
could be prospected, conducted, and reported through The Penn State Educational Evaluation Group. One of
us (Passmore) directed an economic evaluation program at the National Technical Institute for the Deaf that
was organized around the PDK Committee’s evaluation model.
In the remainder of this white paper, we, first, summarize the PDK Committee on Evaluation’s
design model to aid in deliberations about plans for The Penn State Educational Evaluation Group. We
devote substantial space in this white paper to a detailed summary of the PDK Committee’s book,
Educational Evaluation and Decision Making, because the book has been out of print for some time and, as a
result, the Committee’s work might be unfamiliar. And, second, we list the next steps that could be
undertaken to move closer to a decision about whether to invest in The Penn State Educational Evaluation
Group. Our précis is not a strategic, operational, or business plan, although we believe that these types of
plans will be necessary to launch the Evaluation Group. This white paper was requested by the College of
Education Working Group on Evaluation during its 8 February 2010 meeting.

The Conceptual Anchor: A Model Developed by the PDK Committee on Evaluation
The information we provide in this section is not meant to be a vade mecum of evaluation. Rather,
our simple aim is to summarize key elements of the PDK Committee’s evaluation model. We have taken
liberties in this white paper to freely scan (or redraw) and include graphics from the PDK Committee’s book,
Educational Evaluation and Decision Making. To save time, we do not cite the location of these graphics in
the PDK Committee’s book. Passmore owns a copy of this book, which is available for review (his original
bound copy was lost by a student who had taken it along on a trip to an equatorial jungle, but, hey, that’s
another story).

Evaluation Defined
The PDK Committee defines the term, evaluation, as simply:

A process for delineating, obtaining, and providing
useful information for judging decision alternatives.
This brief definition is loaded with meaning. An analytical breakdown of terms in this definition:
• Process—An activity subsuming many methods and involving a number of steps or operations.
• Decision alternatives—Two or more different actions that might be taken in response to some situation
requiring altered action. Situations include: (a) unmet needs exist; (b) some barrier to fulfillment of needs
exists; or (c) opportunities exist which ought to be exploited.
• Information—Quantitative or qualitative data about entities (tangible or intangible) and their relationships in
terms of some purpose. Information is derived from many sources and methods; it could come from scientific data,
precedent, or experience. Information is more than a collection of facts and data. Rather, facts and data must
be organized for intelligibility and must reduce the uncertainty that surrounds decision making. Information
often is limited and imperfect in evaluation situations in comparison with experimental situations in which
controls are possible and are formalized by rigorous design.
• Delineating—Identifying evaluative information required through an inventory of the decision alternatives to
be weighed and the criteria to be applied in weighing them. Knowing at least two elements is essential: (a)
decision alternatives and (b) values or criteria to be applied. These are obtained only from working with clients
for evaluations who must make or ratify decisions. Therefore, evaluation has an “interface” role as well as a
“technical” role because its worth is defined by meeting client needs.
• Obtaining—Collecting, organizing, and analyzing information through such formal means as observation,
review of artifacts, measurement, data processing, and statistical analysis. Ways of obtaining information are
guided by scientific criteria and disciplinary preferences.

A work breakdown “is a tool that defines a project and groups the project’s discrete work elements in a way that helps organize and define the total
work scope of the project” (Booz, Allen, & Hamilton. (n. d.). Earned value management tutorial module 2: Work breakdown structure. Retrieved


• Providing—Fitting information together into systems and subsystems that best serve the purposes of the
evaluation, and reporting the information to the decision maker. Involves interaction between the evaluator and
the various audiences for information. Multiple audiences are common. The direct audience for information is
the decision maker. Also important are people and organizations who must ratify decisions that are made, as are
others who have strong stakeholder positions (e.g., taxpayers might pay close attention to decisions about tax
rates made by legislators). The information delivery preferences of these audiences vary in specificity,
modality, and timing.
• Useful—Satisfying criteria for evaluation and matched to the judgment criteria to be employed in choosing
among decision alternatives. Criteria to be imposed on evaluations include: scientific criteria (internal and
external validity, reliability, objectivity); practical criteria (relevance, importance, scope, credibility,
timeliness, and pervasiveness); and prudential criteria (efficiency). Must meet the client’s identified
information needs.
• Judging—The act of choosing among decision alternatives. Judging is central because evaluation is meant to
serve decision making. Although judging is central to evaluation, the act of judging is not central to an
evaluator’s role.
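To make the parts of this definition concrete, the sketch below (ours, not the PDK Committee’s; the class and field names are our own) records the elements an evaluator would pin down during the delineating step—decision alternatives, criteria, and audiences—and checks the definition’s minimum requirement that at least two alternatives and at least one criterion exist before judging is possible.

```python
from dataclasses import dataclass, field

@dataclass
class DecisionSituation:
    """Elements that must be delineated before information is obtained."""
    alternatives: list[str]  # two or more candidate actions
    criteria: list[str]      # values applied when weighing the alternatives
    audiences: list[str] = field(default_factory=list)  # decision makers, ratifiers, stakeholders

    def is_judgeable(self) -> bool:
        # Judging decision alternatives requires at least two alternatives
        # and at least one criterion, per the definition above.
        return len(self.alternatives) >= 2 and len(self.criteria) >= 1
```

For example, a job training program facing a recycling decision might be delineated as `DecisionSituation(alternatives=["maintain program", "revise program"], criteria=["job placement rate"])`, which is judgeable; a single-alternative situation is not.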

The Evaluation Process Serves Decision Making
The PDK Committee’s definition of evaluation focuses on decision making to make system
improvements. The decision maker must judge among decision alternatives that present themselves as
options weighted by their (perhaps subjective) probability of success. The basis for decision making is
derived from the integration of, on one hand, personal or institutional values with, on the other hand,
information that is available to the decision maker.

Types of Decisions Are Linked to Types of Evaluation
Decisions served by the evaluation process are classified into a typology along two dimensions: one
dimension indicates whether decisions are focused on means or ends; another dimension refers to whether
decisions are relevant to intentions or actualities. The intersection of these two dimensions classifies four
types of decisions.
Decisions about what ends should be pursued are planning decisions. For instance, what goals
should an activity pursue? Are there priorities among goals? Planning decisions are about setting context.
Decisions about what means are required to reach intended outcomes are structuring decisions.
These types of decisions specify the means to achieve ends established as a result of planning decisions.
Most of these decisions are about allocation of scarce resources necessary to attain goals. Structuring
decisions are about assembling input.


Decisions about how the means to attain ends actually are working are implementing decisions.
These types of decisions mostly serve process control. For instance, is an activity following standards and
procedures, or should these standards and procedures be altered? Based on current experience, should inputs
to an activity be altered? Is there any evidence that mid–course corrections in an activity are necessary?
Implementing decisions are about monitoring process.
Judging and reacting to actual attainments are recycling decisions. These decisions often are made to
serve accountability requirements. Did an activity actually produce a desired or mandated outcome? For
instance, did a job training program attain a job placement rate specified as a criterion in a contract between
the program operator and a funding agency? Reacting to attainments also involves making decisions to
maintain, revise, or terminate an activity. For instance, do the outcomes of the job training program justify
maintaining it at current resource levels? Should the job training program be revised because outcomes are not
attained, or is the program’s performance so far below standard that termination is justified? Recycling
decisions can loop back to planning decisions because information about goal attainment is useful in
rethinking goals and their priorities. Recycling decisions are focused on products.
Several other ways are possible to organize the typology of decisions made in evaluation. First,
decisions about ends are about consequences, and decisions about means are about instruments to attain ends.
And, to take a tack more familiar to many, implementing decisions are served by what often is called
formative evaluation because they are meant to improve processes while they are occurring. Recycling
decisions are served by what often is called summative evaluation because these decisions are made after
activity has occurred.
The PDK Committee on Evaluation identifies four types of evaluation processes serving these four
types of decisions that are meant to lead to system improvement:
• Planning decisions → Context evaluation—This type of evaluation determines goals and objectives;
defines the relevant environment, its actual and desired conditions, its unmet needs and unused
opportunities, and why needs and opportunities are not being met and used; examines contingencies
that pressure and promote improvements; and assesses congruities between actual and intended
outcomes.
• Structuring decisions → Input evaluation—Essentially, this type of evaluation helps state objectives
operationally and determine whether their accomplishment is feasible.
• Implementing decisions → Process evaluation—This type of evaluation is used to identify and
monitor potential sources of failure in an activity; to service preprogrammed decisions that are to be
made during the implementation of an activity; and to record events that have occurred so that
“lessons learned” can be delineated at the end of an activity. Process evaluation assesses the extent to
which procedures operate as intended.
• Recycling decisions → Product evaluation—This type of evaluation measures criteria associated
with the objectives for an activity, compares these measurements with predetermined absolute or
relative standards, and makes rational interpretations of these outcomes using recorded context,
input, and process information. Product evaluation investigates the extent to which objectives have
been, or are being, attained.
The evaluation model specified by the PDK Committee in Educational Evaluation and Decision
Making often is referenced as the CIPP model to signify the four types of evaluation conducted—context,
input, process, product.
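The pairing of decision types with evaluation types can be summarized in a small sketch. The Python below is our illustration, not code from the PDK Committee’s book; the dictionary keys and labels simply restate the typology described above.

```python
# Illustrative only: a dictionary restating the CIPP typology described above.
CIPP_MODEL = {
    "planning":     {"focus": "ends / intentions",   "evaluation": "context",
                     "question": "What goals should be pursued?"},
    "structuring":  {"focus": "means / intentions",  "evaluation": "input",
                     "question": "What means can attain the chosen goals?"},
    "implementing": {"focus": "means / actualities", "evaluation": "process",
                     "question": "Are procedures operating as intended?"},
    "recycling":    {"focus": "ends / actualities",  "evaluation": "product",
                     "question": "Were the intended outcomes attained?"},
}

def evaluation_for(decision_type: str) -> str:
    """Return the type of evaluation that serves a given type of decision."""
    return CIPP_MODEL[decision_type]["evaluation"]
```

Thus `evaluation_for("planning")` yields `"context"`, and the four values together spell out the CIPP acronym.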


Evaluation is a Process Defined & Organized Through a General Work Breakdown
According to the PDK Committee, evaluation is “a process for delineating, obtaining, and providing
useful information for judging decision alternatives.” This process is identified as logical units of activity,
each containing a series of tasks. A work breakdown of this process specifies discrete work tasks to define
and organize the total scope of work for any evaluation project. For activities in which The Penn State
Educational Evaluation Group is involved, an evaluation design identifies the project under consideration
with a work breakdown that defines systems and subsystems involved in an evaluation.

Within the superstructure constructed through this work breakdown, The Penn State Educational
Evaluation Group can manage context, input, process, and product evaluation projects. Adopting this
framework for evaluation design supplies a professional and consistent methodology for any evaluation
project that The Penn State Educational Evaluation Group might manage. The work breakdown for
evaluation describes the general process that can be used to guide all evaluation projects conducted by The
Penn State Educational Evaluation Group.
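As a concrete illustration (ours, not the PDK Committee’s), the top level of such a work breakdown can be represented as a nested structure. The three activity names come from the Committee’s definition of evaluation; the subtask labels paraphrase the sections that follow and are our own wording.

```python
# Illustrative only: the PDK Committee's three top-level evaluation
# activities, with subtasks paraphrased from this paper's sections.
EVALUATION_WBS = {
    "delineating": ["identify audiences and decision authority",
                    "state decision questions, alternatives, and criteria",
                    "inventory existing (secondary) data",
                    "establish timing and rules of engagement"],
    "obtaining":   ["collect information",
                    "organize information",
                    "analyze information"],
    "providing":   ["prepare reports for each audience",
                    "publish reports",
                    "transmit reports"],
}

def flatten(wbs: dict[str, list[str]]) -> list[tuple[str, str]]:
    """List every (activity, task) pair in the work breakdown."""
    return [(activity, task) for activity, tasks in wbs.items() for task in tasks]
```

A flattened listing of (activity, task) pairs is the kind of discrete task inventory a project manager would use to scope any evaluation project the Group undertakes.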

Delineating Information Needs—“Who Needs What When?”
The PDK Committee on Evaluation originally cast its work to promote continuous evaluation
systems, so the Committee placed great stock in, first, defining the systems within activities to be evaluated.
Maps, process and control flow charts, system context diagrams, and definitions stated in ordinary language
help to model the activity being evaluated. All of these tools help to evoke the evaluation activity that is envisioned.
Because serving information needs for decision making drives the evaluation process, a starting point for
delineating information needs is to specify the nature of decisions to be made. This specification involves
documenting who has the authority and responsibility for making decisions as well as individuals and
organizations that influence or ratify decisions. In short, audiences for information to be provided are
identified.
The individual decision questions to be answered must be delineated, which involves listing clear,
unambiguous, operational statements defining the decisions to be made, the mutually exclusive decision
alternatives considered, the criteria by which alternatives will be selected, and the rules to be applied in
making these selections. Also considered is the evidence necessary to
turn fuzzy concepts into variables that are measurable from specific observations.
Resource–conscious planning of evaluation projects involves exhausting available evidence, such as
data collected primarily for administrative purposes, before collecting new information. Such use of
secondary data avoids redundancy in data collection. At the same time, evaluators must weigh the


appropriateness of using information merely because it is available. In evaluation projects with limited
duration and resources, however, an approximation to ideal, pristine information might be all that is possible.
In a research setting, timing of the production of information might not be critical. Research operates
in the fullness of time. However, in most evaluation settings, the timing of information delivery often is
crucial to meet statutory requirements, to maintain progress in activity, or to make periodic reports necessary
to achieve credibility with various audiences for information. Therefore, information about the necessary
timing of decisions is essential for creating evaluation designs that respond to decision makers’ requirements.
Because an evaluator works for a client, rules of engagement between the client and the evaluator are
important to the success of the evaluation process. For instance, what information sources may the evaluator
access? What information is off limits? With whom may the evaluator communicate? And, when? How
often? What budget is available to complete the evaluation project? Are there scheduling or resource
constraints that limit the evaluation project?
Information about the structure and aims of an evaluation project is necessary when a project is
initiated. Moreover, this information is required for the evaluator and the client to decide whether an
evaluation is feasible technically and fiscally.

Obtaining Information—“How?”
The PDK Committee on Evaluation divided the task of obtaining information into three major
subtasks: collecting, organizing, and analyzing information. These subtasks are a prominent and familiar part
of the training of almost every researcher. These are the “tools of the trade.” Particularly relevant, however,
is that the PDK Committee did not endorse any particular epistemological viewpoint about the best or
preferred methods for obtaining information or about any zeitgeist that conditions an evaluation mission. For
example, there was no endorsement of, say, hierarchical linear modeling, of hermeneutics, of null hypothesis
testing, or of focus groups. Nor did the PDK Evaluation Committee wade into any how–many–angels–on–
the–head–of–a–pin disputations about the Flyvbjerg Debate. Rather, the Committee remained aloof from and
indifferent to competing claims about specific methods, techniques, and tools preferred for evaluation, except
to assert that information obtained must serve the needs for clients’ decisions and should meet scientific criteria.
The PDK Committee on Evaluation asserted, in fact, that a major impediment for trained researchers
to overcome is the idea that evaluation methodology is identical to research methodology. The Committee
observed that most educational evaluations do not occur in antiseptic laboratories that model contrived,
one–time situations and rely on random assignment of subjects to treatment conditions. Rather, many
educational evaluations must occur in what the PDK Committee described as “the septic world of the
classroom and school,” a continuously shifting environment in which randomization and independence of
treatment conditions are wildly impractical and even detrimental.
Since the time of the work of the PDK Committee on Evaluation, however, the pendulum has swung
back toward the belief that quality is maintained best in evaluations by applying experimental and quasi–
experimental designs and by making statistical adjustments for selection bias due to lack of randomization.
For instance, an early 2010 request for research project applications from the U.S. Department of
Education’s Institute of Education Sciences stipulated that randomization is a sine qua non of rigor in
evaluation: “Studies using random assignment to intervention and comparison conditions have the strongest
internal validity for causal conclusions and thus are preferred whenever they are feasible,”15 even though
“Randomized controlled trials may face resistance from stakeholders who would like to see all eligible
participants receive the intervention in expectation of its benefits.”16
The approach taken by the PDK Committee on Evaluation is to accept that obtaining information for
evaluation is a catholic, eclectic, and diverse activity that is dictated by the expectations, worldviews, and

From p. 7 in Institute of Education Sciences. (2010, February 1). Request for applications: Evaluation of state and local educational programs
and policies (CFDA Number 84.305E). Retrieved from:
From p. 8 of the Institute of Education Sciences request.


hedonic tastes of both the client and the evaluator, the nature of the activity being evaluated, and the amounts
of time, funds, and other resources allocated to conduct the evaluation project.

Providing Information—“To Whom & In What Form?”
An evaluation design delineates information needs and provides a plan for obtaining information to
meet those needs. To round out an evaluation design, preparation of a plan for providing information is necessary.
Providing information includes the preparation and dissemination of reports. The term, report,
is not meant to stand only for a written document. Reports can take many forms. A report may, indeed, end
up in printed form, but reports can be offered in a variety of non–print media. For example, when one of us
(Passmore) conducted an evaluation of the factors related to the successful integration of deaf people into
workplaces, one report took the form of a staged play production. This format was chosen because features
of the culture of deaf Americans include a strong enjoyment of
mime, familiarity with apprehending concepts through
visualization, and pleasurable associations with theater.
Reports provide information to audiences. Preparation
of reports requires some knowledge of the communication
modalities that are preferred and effective with identified
audiences. Of course, some reports might appear formal, such
as an oral report to a group, a written document, or a web page.
Some reports are delivered to audiences internal to the project
being evaluated. Some reports are designed for external
audiences. Reports also can take on an air of informality, such
as in an elevator conversation, but they nevertheless might
occur as a result of a carefully planned rollout of evaluation. In
short, information obtained through evaluations should be
provided in forms that leverage the communication
preferences and styles of the audiences for evaluations—the
decision makers, influencers, and ratifiers.
Dissemination of reports involves two subtasks: publication of reports and transmission of reports.
Publication of reports involves creating, composing, editing, and producing reports for various audiences.
Many forms of publication can be quite resource intensive. In fact,
some items, such as web pages, can present substantial lifetime costs if the client wishes pages to be
maintained and updated for a period beyond the termination of the evaluation project. Therefore, careful
operational, technical, and fiscal planning of report publication is necessary. In our observation of many
projects, unplanned costs for publications that clients decide are necessary only after an
evaluation project is almost complete can become a serious budget liability and a threat to the success of an
evaluation project.
Transmission of reports involves identifying channels of communication and marketing that reach
intended audiences best. Dissemination sequences can become important in report transmission. For
example, we conducted research on the history and forecast for manufacturing in the Commonwealth. One of
the identified audiences included state legislators. In particular, majority and minority party
chairs of appropriations committees in the Pennsylvania House of Representatives and the
Pennsylvania State Senate were targeted to receive information from this report about
Pennsylvania manufacturing. Based on interaction with Penn State Office of Governmental
Affairs, we learned that the order of transmission of our report was a sensitive issue. As a result,
we spent several days negotiating a one–day schedule of four meetings to make oral reports and deliver
written reports of our work in an order that was acceptable to the four committee chairs. Order of
transmission often counts.


Evaluation is, Itself, Evaluated
Evaluation projects conducted by The Penn State Educational Evaluation Group must meet quality
criteria. The Group’s work must be correct technically, must fulfill the requirements of clients, and must be
conducted within resource constraints. The PDK Committee on Evaluation considered three groups of meta–
evaluation criteria—that is, criteria for evaluating evaluations:
• Scientific—These criteria primarily assess “representational goodness;” that is, they assess how well the
evaluation depicts a situation. These criteria, in their detailed and technical forms, are quite familiar to most
people with research training.
o Internal validity—Extent to which evaluation findings can be attributed to the activity evaluated rather
than to some other factor or artifact.
o External validity—The extent to which evaluation findings are generalizable over people, places, and time.
The notion of whether generalizability even is possible is a major point of difference among researchers.
o Reliability—Simply said, the extent to which evidence is measured consistently and dependably.
o Objectivity—Intersubjectivity in the sense of agreement on conclusions between independent observers of
the same evidence. The “Do you see what I see?” phenomenon.
• Practical—Meeting scientific criteria is a necessary, but not a sufficient, condition of the value of evaluative
information. Information for evaluation must be correct, but it is meant to serve decision makers. Practical
criteria concentrate on the extent to which evaluation has met decision makers’ information needs.
o Relevance—Evaluation information is obtained to serve decision making. If the information obtained does
not match the decisions to be made, then this information, no matter how well it is obtained scientifically, is irrelevant.
o Importance—Information needs to be more than nominally relevant to a decision. Not all information is
important. Information with the highest importance (relevance graded by quality) must be obtained and
provided within budget constraints. The evaluator must determine what the client believes is important. In
some cases, the evaluator might suggest to the client what information ought to be considered important
because the evaluator often has considerable experience with obtaining and using some types of information.
o Scope—This is a “completeness” criterion—that is, is all information obtained and provided that is needed
to make a decision?
o Credibility—Clients often are not in a position to judge whether information obtained and provided by an
evaluator meets scientific criteria. Therefore, the trust invested in the evaluator by the client is an
important dimension of the quality of the outcomes of an evaluation project. Just as the scientific
criterion of internal validity pertains to a particular evaluation situation, credibility is never generalizable
and always refers to a particular situation. Of course, the reputation of the evaluator is an important
mediator of credibility. So are specific processes or activities meant to promote credibility. For example,
the Penn State Workforce Education and Development Initiative (WEDI) states
explicitly certain ethical guides that are meant to promote credibility. In the WEDI report, Global
Competitiveness of Powder Metallurgy Part Manufacturing in North Central Pennsylvania, we wrote:
This report is publicly available, as are all reports of Penn State’s Workforce Education
and Development Initiative. This report about the global competitiveness of the PM
part manufacturing industry in north central Pennsylvania is issued in the public
interest solely for individuals and organizations seeking information about the industry
and its important role in the north central Pennsylvania regional economy. Penn State
and the organizations that provided advice for the research behind this report do not
represent this information for use in financial planning or investment, nor is any claim
made for the usefulness of this information for product design, engineering, or
The Penn State WED Initiative often conducts research and analysis about topics and
issues that, at times, are the focus of vigorous debate and public attention and that
frequently are associated with diverse stakeholders who represent divergent opinions.
The Initiative adds value, attention, and discussion to this debate by conducting and
reporting research and analysis for decisions affecting economic and workforce
development using the most objective approaches possible. The research and analysis
of the WED Initiative are pursued independent of the commercial or political interests
of any actual or potential sponsor of WED Initiative work.18

17. p. 52 in Baker, R. M., & Passmore, D. L. (2009). Global competitiveness of powder metallurgy part manufacturing in north central Pennsylvania. Retrieved from the Social Science Research Network:
o Timeliness—The best information is useless if it comes too late.
o Persuasiveness—Information from evaluation projects is provided to identified audiences, and this
information is meant to be used. The criterion of persuasiveness is met if all of the people and organizations
who should know about and use evaluative information do, in fact, do so.
• Prudential—Proper application of the practical criteria of relevance, importance, and scope should remedy many
inefficiencies in evaluation. Still, the conduct of an evaluation project must be weighed against alternative
evaluation designs that could have achieved the same outcome with different time, financial, and personnel
requirements.
Scientific, practical, and prudential criteria can form the basis for post–project autopsies by both
evaluators and clients. In addition, these criteria can guide the work and judgments of any advisory group
established for The Penn State Educational Evaluation Group.
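As one illustration of how these criteria could feed a post–project autopsy, the three groups might be coded as a simple rating rubric. The following is a minimal sketch only; the 1–5 rating scale, the criterion identifiers, and the group-average aggregation are our own assumptions, not part of the PDK model:

```python
# Sketch of a meta-evaluation rubric based on the three PDK-derived
# criteria groups named in the white paper. The 1-5 scale and the
# group-average aggregation are illustrative assumptions.

CRITERIA = {
    "scientific": ["internal_validity", "external_validity", "reliability", "objectivity"],
    "practical": ["relevance", "importance", "scope", "credibility", "timeliness", "persuasiveness"],
    "prudential": ["efficiency"],
}

def summarize_ratings(ratings):
    """Average the 1-5 ratings within each criteria group.

    `ratings` maps a criterion identifier to a rating; criteria left
    unrated are omitted from the group average, and a group with no
    rated criteria reports None.
    """
    summary = {}
    for group, names in CRITERIA.items():
        scores = [ratings[n] for n in names if n in ratings]
        summary[group] = round(sum(scores) / len(scores), 2) if scores else None
    return summary
```

A post–project review completed by both the evaluator and the client could then compare group summaries side by side.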

Moving Forward
We believe strongly that adoption of a meta–methodology for evaluation is necessary for success of
The Penn State Educational Evaluation Group, even if the particular meta–methodology outlined in this
white paper from the PDK Committee on Evaluation’s work is not adopted. Articulation and adoption of
such a meta–methodology would:
• Promote communication and marketing with potential clients;
• Professionalize the educational evaluation function;
• Generalize and organize the evaluation function to circumscribe seemingly disparate and superficially
contrasting decisions and types of evaluation;
• Reduce the reliance on particular philosophies and approaches to evaluation by substituting a catholic and
eclectic process capable of encompassing a wide variety of information requirements of decision makers; and
• Encourage the wide participation among College of Education faculty and students representing a diversity of
evaluation viewpoints and skills.
Beyond the question of methodology, however, is the need to grapple with challenges related to the
design, development, and execution of functions of The Penn State Educational Evaluation Group. We
describe a number of these opportunities and issues in this section of this white paper.

Clients
Work for The Penn State Educational Evaluation Group will arrive in the form of evaluation
projects from such clients as internal and external research projects in need of an evaluation component,
school districts, and governmental and non–governmental organizations. Some projects might be ad hoc,
once–and–done activities for clients. Some projects might take the form of a continuing relationship with
a client. But, how shall these clients be developed?
The Penn State Educational Evaluation Group must prospect for qualified clients. Prospecting will
involve establishment of a brand, reputation, and presence for the Group through public information,
professional visibility, web presence, and, most importantly, through establishment of relationships with key
clients. The ultimate aim of prospecting is acquisition of evaluation project contracts—that is, sales.
Instrumental to attaining this goal is the marketing of The Penn State Educational Evaluation Group by
providing information about the Group’s existence and availability and the creation of awareness of the
scope and quality of the Group’s capabilities. Because the Group does not exist currently, the establishment
of presence and brand is mission–critical. Association with the Group of well–known and productive faculty
during the start-up phase could allow the halo effect from their reputations to be cast on The Penn State
Educational Evaluation Group.

18. p. 53 in the Baker and Passmore report on powder metallurgy.


Not all prospects identified will become clients. Prospects developed for The Penn State Educational
Evaluation Group also need to be qualified in terms of (a) the match between evaluation requirements and
the Group’s capabilities and, especially, (b) the ability to finance the evaluation projects they envision. There
are a variety of strategies for pricing projects that can be followed, but the general rule is that the
University is a vendor that provides evaluation services to evaluation clients. Within this structure, cash
flows into the University for projects, but it generally does not flow out. More is presented about this matter
in a subsequent section of this white paper with a discussion of financing opportunities and issues.

Assets
Space is scarce throughout the University Park Campus, and especially in the College of Education.
However, we feel confident that space will be found for people associated with The Penn State Educational
Evaluation Group and that other physical assets can be assembled to meet the needs of the Group. A greater
challenge is likely to be assembling the human capital to fulfill requirements of evaluation projects that the
group undertakes.
Evaluation projects conducted by the Group probably will require a mix of skills. Some projects
might require high–end technical skills such as in instrument design, psychometrics, or statistics, although
others might demand relatively low–level skills such as in assembly of printed documents, mailing, or data
coding and preparation. In addition, the number of people required by The Penn State Educational
Evaluation Group will wax and wane over the life cycle of projects. As a consequence, staffing of evaluation
projects will challenge the Group.
The staff of The Penn State Educational Evaluation Group should develop a database of people who
can be allocated to evaluation projects. Populating this database will require: recruiting faculty, staff, and
students in the College, on the University Park Campus, on Commonwealth campuses, and beyond who have
evaluation interests; cataloging knowledge, skills, and abilities that have potential for productivity in
evaluation projects; and maintaining accuracy of the database through accessions, revisions, or terminations
of records of potential Group participants on a continuous basis. We do not want to trivialize the amount of
effort that populating and maintaining this database will require, but having knowledge of assets available to
deploy to projects, big and small, will be a crucial strength of The Penn State Educational Evaluation Group.
The Group must possess a strategic readiness,19 much as military forces maintain the support capability to
meet a commander’s operational requirements.20
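A minimal sketch of what such a participant database record and query might look like follows; the field names and the any-skill matching rule are our own illustrative assumptions, not a prescribed design:

```python
from dataclasses import dataclass, field

# Illustrative sketch of the participant database described above.
# Field names and the skill-matching query are assumptions, not a
# specification of the Group's actual system.

@dataclass
class Participant:
    name: str
    affiliation: str                       # e.g., College unit or campus
    skills: set[str] = field(default_factory=set)
    active: bool = True                    # maintained through accessions and terminations

def find_candidates(roster, required_skills):
    """Return active participants who cover any of the required skills."""
    needed = set(required_skills)
    return [p for p in roster if p.active and p.skills & needed]
```

Keeping the `active` flag current corresponds to the continuous accessions, revisions, and terminations of records noted above.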

Finances
A familiar cognomen and watchword of the times is sustainable, the “keep on keeping on” attribute
of persistent organizations. Of the many factors that will affect the sustainability of The Penn State
Educational Evaluation Group, finances take precedence.
Although The Penn State Educational Evaluation Group probably will require seed money for
initiation, our planning is based on the assumption that the Group, even if it receives a department
allotment,21 eventually must pay its bills and expenses almost entirely from its earnings.22 A business plan
should be developed to consider the Group’s cash stock and flow needs in detail, but several facts of
financial life unique to the University will challenge the sustainability of The Penn State Educational
Evaluation Group.
In the operating budget of the University, restricted funds earned through projects or programs can
only be expended under the terms of a governmental or private gift, grant, or contract.23 The terms of the gift,
grant, or contract dictate a restricted fund’s contract period. Carrying restricted funds beyond the contract

19. The state or quality of being ready, prepared, available, and willing.
20. The military measures the readiness of a unit as the ability to accomplish its assigned missions.
21. See unnumbered p. 9 in
22. We believe that the Group is not likely to receive a standing position from permanent funds (see unnumbered p. 11 in


period is not possible, except in payment of bills during holdover periods and in the case of client–approved
contract extensions.
The budget of the typical evaluation project of The Penn State Educational Evaluation Group is
most likely to be composed of restricted funds. The Group is highly unlikely to receive unrestricted money
from general or auxiliary funds or from intramural, interdepartmental service fund transactions.24 Therefore,
just as the people assets of the Group will wax and wane, so will its finances, which will depend largely
on restricted funds. The expected volatility of the external funding stream for The Penn State
Evaluation Group, especially during its start–up period, is a challenge to the Group’s sustainability.
Depending on the client, most evaluation projects on restricted funds will generate indirect costs,
some small percentage of which will be reallocated to the College administration by the University
administration through the University’s Research Incentive Fund (RIF).25 Whether any portion of the RIF
generated by evaluation projects can be returned to The Penn State Educational Evaluation Group is
unknown. The financial uncertainty about the availability of RIF is a challenge to the sustainability of the Group.
Again, we believe that these and other financial challenges to the sustainability of The Penn State
Educational Evaluation Group are best addressed in a formal business plan.
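The RIF arithmetic cited in the footnotes can be checked with a short calculation. The 48% indirect cost rate and 12% RIF share below are the example figures from the footnote; actual rates vary by agreement and over time:

```python
def rif_return(contract_amount, indirect_rate=0.48, rif_share=0.12):
    """Estimate the RIF amount returned to the College for a contract.

    Uses the white paper's example figures: indirect costs recovered at
    `indirect_rate` on the contract amount, of which `rif_share` is
    returned to the College. Rates here are illustrative defaults.
    """
    return round(contract_amount * indirect_rate * rif_share, 2)

# The footnote's example: a $1 million contract returns $57,600.
```

As the footnote notes, a $1,000,000 contract at these rates yields $1,000,000 x 0.48 x 0.12 = $57,600 to the College.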

Operations
One purpose for adopting the meta–methodology for evaluation that we summarized from the PDK
Committee on Evaluation’s work is to provide conceptual categories and common processes that are
adaptable over the variety of topics that might be evaluated and techniques applied to evaluations. We
anticipate that the development of general flowcharts and checklists for evaluation processes will help
organize projects and communicate processes to the faculty and student partners involved. These aids
especially are important for educational evaluation projects, which we anticipate will need to be conducted
on stringent timelines and conducted efficiently on tight budgets. Flowcharts and checklists have become
popular management tools for medical services, just as they are the mainstay of operations in other fields
such as in aviation’s use of pilot checklists.26
Managing evaluation projects, which usually are temporary endeavors, staffed by faculty and
students, engaged on an ad hoc basis, for clients, who have their own agendas, can be complex. The Penn
State Educational Evaluation Group must provide information in a timely manner (one of the practical
criteria by which we are suggesting that the Group be evaluated). It follows, then, that evaluations projects
must provide quality information according to a contracted schedule for deliverables. Professional project
management, therefore, would optimize the work of the Group. One of us (Baker) is a certified Project
Management Professional27 through the Project Management Institute.28

24. Penn State unrestricted funds are defined at: One exception to the inability to obtain unrestricted funds might
occur if the Group receives funds through an Interdepartmental Charges and Credit (IDCC) process. IDCC transfers may be used for budget
reallocation only between general funds and only within restricted funds (; requires authentication) or for the
purchase of, or the reimbursement for the use of, goods or services (including labor) (; requires authentication).
25. See Colleges receive an amount equal to 12% of indirect costs to promote faculty
research incentives and new sponsored programs. For instance, this means that a $1 million contract, from which 48% is recovered by the
University in indirect costs (see F&A Rates (7/1/07 – Until Amended (modified July 18, 2009) at for various indirect cost rates), will return $57,600 to the College (equal to
$1,000,000 x (.48 x .12)).
26. See the review of the literature by Hales, B., Terblanche, M., Fowler, R., & Sibbald, W. (2008). Development of medical checklists for improved quality of
patient care. International Journal for Quality in Health Care, 20(1), 22-30. doi:10.1093/intqhc/mzm062, and the prescriptions for use of medical
checklists developed by Pronovost, P., & Vohr, E. (2010). Safe patients, smart hospitals: How one doctor's checklist can help us change health care
from the inside out. New York: Penguin Group.


Adoption of project management to guide the work of The Penn State Educational Evaluation Group
would entail supporting five processes (four stages and one control function) in the development of an
evaluation project:29
• Initiating—Determine the nature, objectives, scope, purpose, and deliverables for an evaluation project. These
factors usually are embodied in a project charter. Also included in the charter are: details about
the organization, staffing, and workflow for the project; identification of stakeholders on the side of the client
and on the project team; assessment of risks, issues, and assumptions; and a general implementation plan.
• Planning—Specify time, costs, and resources adequately to estimate the work needed and to effectively
manage risk during project execution. Involves, among other matters: specification of tasks and assignment of
assets to task completion; creation of a financial plan; identification of risks, priorities among risks and the
likelihood of their occurrence; determination of the impact of realization of risks; delineation of standards for
client acceptance of deliverables; and listing of required communication events, their methods, and frequency.
• Executing—Implement processes used to accomplish the project's requirements. Includes time, cost, expense,
process, and quality management.
• Monitoring and controlling—Observe project execution so that potential
problems can be identified in a timely manner and corrective action can be
taken, when necessary, to control the execution of the project. Essentially,
this is management of the plan, including identifying, assessing the feasibility
of, structuring, implementing, and communicating any changes in the project
plan that are necessary.
• Closing—Formally accept and end the project. Involves ensuring that the
client has received all deliverables and necessary documentation (e.g.,
technical and financial reports, as needed), releasing staff and equipment
made redundant upon project completion, and informing stakeholders of the
closure of the project. A post–project review would allow the project team
and the client to examine the success of the project and to assess any lessons learned.
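The five processes listed above could be sketched as a simple stage tracker. The class below is an illustrative sketch only; treating the four stages as a sequence and modeling monitoring and controlling as a flag that spans execution are our own assumptions:

```python
# Sketch of a stage tracker for the project-management processes listed
# above. "Monitoring and controlling" is a control function that runs
# alongside the other stages, so it is modeled here as a flag that is
# active during execution rather than as a sequential step.

STAGES = ["initiating", "planning", "executing", "closing"]

class EvaluationProject:
    def __init__(self, name):
        self.name = name
        self.stage_index = 0
        self.monitoring = False  # turned on while the project is executing

    @property
    def stage(self):
        return STAGES[self.stage_index]

    def advance(self):
        """Move to the next sequential stage, if any remain."""
        if self.stage_index < len(STAGES) - 1:
            self.stage_index += 1
        self.monitoring = self.stage == "executing"
```

A tracker of this shape could back the general flowcharts and checklists proposed earlier for organizing projects.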
Management of evaluation projects is likely to be a challenge for The Penn State Educational
Evaluation Group. Penn State research administration guidelines for the responsible conduct of research
outline principles for the University research community to follow “to adhere to the highest ethical and
professional standards as they pursue research activities to further scientific understanding.”30 The guidelines
anticipate a variety of standards for work conduct and ethical conduct (e.g., peer review, conflicts of interest
and commitment, proper conduct with human and animal research subjects) and, among other principles,
exhort researchers to “Demonstrate integrity and professionalism.” The guidelines indicate that research
supervisors must “Ensure the proper fiscal management and conduct of the project…[and]…Oversee the
preparation and submission of technical reports and any other required deliverables.” Among the fiscal
responsibilities assigned to researchers are to “Comply with all terms and conditions imposed by the
financial sponsor…[and]…Initiate all required approvals for budgetary and programmatic changes that may
be necessary during a project.”
In spite of these guidelines for the responsible conduct of research, The Penn State Educational
Evaluation Group will be challenged to control the time that University participants allocate to creating
project deliverables. Most project participants face a full plate of activities that demand their time. Balancing
a time budget to include one activity, an evaluation project, out of many could pose difficulties. In addition,
many faculty members are most familiar with engagement in research projects that have no completion goals
other than personal ones. Research timelines are open-ended, while evaluation project timelines are
fixed and often rigid so that evaluation clients’ information needs are met.
The staff of The Penn State Educational Evaluation Group generally does not have the authority or
responsibility for primary assessment of job performance of evaluation project team participants. In most
cases, the Group has no recourse to ensure the timely delivery of products from project participants.
However, the Group can react to unsatisfactory performance by a project team participant in at least four
ways (some not independent of one another):

29. Adapted from and
30. Guideline RTAG16, The responsible conduct of research. Retrieved from


• Terminate contracts with clients prior to project completion based on terms agreed at the initiation of the project;
• Remove or replace the participants from the project teams based on terms agreed between The Penn State
Educational Evaluation Group and the participants, with total or partial refusal to pay the wages or salaries
that the participants might have earned (not all participants are paid; some might serve as volunteers);
• Remove or replace the participants from any other project teams established by the Group on which the
participants are serving simultaneously, with, again, total or partial refusal to pay the wages or salaries that the
participants would have earned; and
• Eliminate opportunities for the participants to engage in subsequent projects with the Group.
Evaluation projects must be completed, with quality, on time.

Growth
If successful, The Penn State Educational Evaluation Group must expand its capacity to meet the
demand for evaluation projects. Absent a business plan, the initial capacity necessary and the optimal
capacity desired for the Group remain unknown. However, we expect success to breed success—that is,
more identification and differentiation of the Group’s brand, increased awareness of the capacity possessed
by the Group, and stronger reputation of the value that the Group delivers will translate into more sales to
more clients. Challenges to any growth in the capacity of the Group are extensions of issues raised in
“Assets” and “Finances” paragraphs in this section of this white paper.
At least two challenges to growth of the Group are identifiable:
• Assets—The nature and quality of the talent pool recruited for evaluation projects managed through The Penn
State Educational Evaluation Group could either encourage or constrain growth in educational evaluation
work for the College. No talent, no growth. Talent with particular expertise will yield growth along lines
supported by that expertise.
• Finances—In many enterprises, retained earnings from one accounting period are invested in subsequent
accounting periods to grow and diversify the enterprise through research, development, and innovation. Of
course, most enterprises can record losses in any accounting period, but retained earnings and losses are
cumulative from year to year, with sustainability attained when long–run earnings offset losses. University
restricted funds do not have this dynamic, carryover feature. Indeed, restricted funds not expended in one
accounting period cannot be retained to offset losses in subsequent accounting periods. In addition, the
amount of RIF available is likely to be insufficient to sustain the Group’s operation unless contract amounts
are sufficiently high and all or some RIF actually is returned to the Group. The volatility in operating funds
will be a major challenge to the growth of the Group. One possibility to meet this challenge is through the
crafting of fee–for–service arrangements, already available to some University units, that allow revenue above
expenses to be retained. Such an approach could adapt Penn State agreements for measurement, fabrication,
and analysis,31 standard research,32 or fixed price research.33

The Next Step: A Business Plan
Although The Penn State Educational Evaluation Group is an academic effort, it must be sustainable
as a business entity within the College of Education. The best way to arrive at a decision about the Group’s
sustainability is to produce a written business plan. Many prescriptions for writing a plan are available, but,
according to the U.S. Small Business Administration (SBA):
A business plan precisely defines your business, identifies your goals, and serves as
your firm's resume. The basic components include a current and pro forma balance
sheet, an income statement, and a cash flow analysis. It helps you allocate resources
properly, handle unforeseen complications, and make good business decisions. As it
provides specific and organized information about your company and how you will
repay borrowed money, a good business plan is a crucial part of any loan application.
Additionally, it informs sales personnel, suppliers, and others about your operations
and goals.34
The SBA believes that a business plan should answer four core questions:
• What service or product does a business provide and what needs does it fill?
• Who are the potential customers for the product or service and why will they purchase it from the business?
• How will the business reach its potential customers?
• Where will the business obtain the financial resources necessary to start?
The business plan for The Penn State Educational Evaluation Group should lead to a decision about
whether to launch the Group. If a launch is decided, the Group can move forward with a provisional staff
that will create the systems and procedures necessary to conduct evaluation projects.