Introduction
The purpose of this paper is to examine the program evaluation literature
in educational psychology with a view to synthesizing that which is most
useful and applicable to the special problems of language program
evaluation. To that end, various definitions of program evaluation will
first be compared; then, the differences and similarities between 'testing',
'measurement' and 'evaluation' will be explored. Historical trends in
program evaluation are also discussed with a focus on the product-
oriented, static characteristic, process-oriented and decision facilitation
approaches which have developed over the past forty years in educational
psychology. Three dimensions, specifically applicable to language pro-
grams, appear to be dominant in the general literature: formative vs.
summative evaluation, product vs. process evaluation and quantitative
vs. qualitative evaluation. This is followed by an enumeration of twenty-
four different data-gathering procedures which are available to language
program evaluators. These procedures are classified into six different
categories, which can help the evaluator to select a more limited and
practical subset of procedures for use in a specific language program
evaluation. The paper concludes by tying together the historical develop-
ments, the three dimensions of evaluation theory and the large number of
data-gathering procedures through discussion of a model (currently
being applied at the University of Hawaii). This model functions as an
integral part of the overall process of language curriculum development
and maintenance.
Definitions of evaluation
Evaluation is defined in Richards et al. (1985) as 'the systematic
gathering of information for purposes of making decisions'. This defi-
nition is perhaps too broad for the purposes of this paper in that it could
equally well be used to define needs analysis. However, it is worth
Downloaded from https://www.cambridge.org/core. Hacettepe Universitesi, on 11 Nov 2020 at 10:46:33, subject to the Cambridge Core terms
of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781139524520
considering the possibility that the difference between needs analysis and
program evaluation may be more one of focus than of the actual activities
involved.
Another definition supplied by Popham (1975) may help to show how
foci can differ even in approaches to program evaluation: 'Systematic
educational evaluation consists of a formal assessment of the worth of
educational phenomena.' In contrast to Richards et al., this definition
may be too restrictive. Certainly, there should be a portion of any
program evaluation which focuses on 'formal assessment', but perhaps
provision should also be made for a number of informal activities, an
issue which will be discussed below. There are indeed forms of evaluation
that focus on 'the worth of educational phenomena'; however, there are
also types which focus on improving the curriculum. These may be
equally constructive and useful, as we will also see.
The definition offered by Worthen and Sanders provides a somewhat
broader perspective:
Evaluation is the determination of the worth of a thing. It includes
obtaining information for use in judging the worth of a program,
product, procedure, or object, or the potential utility of alternative
approaches designed to attain specified objectives.
(1973: 19)
This definition is less restrictive in the sense that, while it still includes the
notion of 'the worth of a program', it provides for judging 'the potential
utility of alternative approaches'. One problem with this definition,
however, is that it frames the goal of evaluation as attaining 'specified
objectives'.
This too may be unnecessarily limiting because it implies a goal-oriented
approach to the evaluation process. While there is certainly room for
some goal or product orientation in program evaluation, the processes
involved should also be considered for the sake of constantly upgrading
and modifying the program to meet new conditions.
It seems, then, that a modified definition is needed to suit the views and
purposes expressed in this paper: evaluation is the systematic collection
and analysis of all relevant information necessary to promote the
improvement of a curriculum, and assess its effectiveness and efficiency,
as well as the participants' attitudes within the context of the particular
institutions involved.
Notice that this definition requires that information not only be
gathered but also analyzed, and that both should be done systematically.
Note also that there are two purposes: the promotion of improvement as
well as the assessment of effectiveness, or 'worth' as Popham put it.
Finally, this definition stresses that evaluation is necessarily site-specific
in the sense that it must focus on a particular curriculum, and will be
affected by and bound to the institutions which are linked to the program,
Process-oriented approaches
A notable shift to process-oriented approaches began with the reali-
zation that meeting program goals and objectives was indeed important
but that evaluation procedures could also be utilized to facilitate curri-
culum change and improvement. Notable among the advocates of these
approaches were Scriven and Stake.
Scriven's (1967) contributions included a dramatic new set of foci for
program evaluations. Some of the most important of these are as
follows:
1 He originated the distinction between formative and summative
evaluation.
2 He emphasized the importance of evaluating not only if the goals had
been met but also if the goals themselves were worthy.
3 He advocated goal-free evaluation, i.e., evaluators should not limit
themselves to studying the expected goals of the program but should
also consider the possibility that there were unexpected outcomes
which should be recognized and studied.
Notice the degree to which product, in the form of goals, is
de-emphasized here in favor of process in the forms of formative and
goal-free evaluation.
Stake is best known for the Countenance model of evaluation. The
basic elements of this model (Stake, 1967a) begin with a rationale, then
focus on descriptive operations (intents and observations) and end with
judgmental operations (standards and judgments) at three different
Dimensions of evaluation
Among the evaluation approaches discussed above, there are certain
patterns which can help not only in understanding the similarities and
differences between the existing approaches but also in formulating an
approach tailored to (and, therefore, most advantageous to) a particular
program. These patterns center on three dimensions: formative vs.
summative, process vs. product and quantitative vs. qualitative. Such
opposing points of view are often considered dichotomies, but they are
referred to as 'dimensions' here because our experience at the University
of Hawaii indicates that they may be complementary rather than
mutually exclusive. We take the stance that our particular program
should utilize both points of view in each of the dimensions. In other
words, all available perspectives may prove valuable for the evaluation of
a given program. How they are utilized in a particular setting will
naturally depend on the various educational philosophies of the admini-
strators, teachers and students in the program.
Evaluation procedures
All of the approaches and dimensions discussed above are important in
the evaluation process because evaluators should decide,
at least tentatively, which combination of approaches they wish to use
(i.e., whether their evaluation will be summative or formative, process- or
product-oriented, and quantitative or qualitative - or all of the above).
These distinctions and decisions must eventually lead to determining
which measures will be applied to changing elements of a particular
language program.
There are numerous procedures available to evaluators for gathering
information. Some of these lend themselves to collecting quantitative
data and others to qualitative information. Thus the term procedure is
not only being used here to include measures (as defined earlier), which
are useful in gathering quantitative data, but also to encompass methods
used in gathering qualitative data, for example, observations, interviews,
meetings, etc. Many of the existing procedures are presented in Table 1.
The purpose of this section is not to explain each and every one of these
procedures (there are fuller descriptions in Brown and Pennington's
unpublished MS), but rather to demonstrate that they are only variants of
six basic categories, and that even these six categories can be classified
into two classes that differ in terms of the relationship between the
evaluator(s) and the program participants. It is hoped that these cate-
gories and classes will help evaluators to make balanced choices among
the many procedures available for the evaluation process. Obviously, it
would be absurd to attempt the use of all of the procedures listed in Table
1, but a reasonable selection can be made based on the realization that the
measures can be grouped in this manner.
column on the right may seem bewildering, though closer analysis reveals
a simpler pattern. Notice, for instance, that there are four types of testing,
each considered a separate procedure in the table. Surely, these forms of
testing are related and can be considered as one category of procedures.
By similar processes, all of the 24 instruments can be classified into the six
categories of procedures shown in the second column of Table 1.
Still further analysis reveals that three of the categories in the second
column, i.e., existing records, tests, and observations, leave the evaluator
more or less in the position of being an outsider looking in on the
program as the evaluation process proceeds. The other three, i.e.,
interviews, meetings and questionnaires, inevitably seem to draw the
evaluator into participating in the process of gathering or drawing out
information from the participants in the program. This difference can
have important consequences with regard to the way the procedures and
the results based on them are viewed by participants and evaluators alike.
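The two classes of data-gathering categories described above can be sketched as a small data structure; the category and class names are taken from the discussion of Table 1, while the helper function itself is purely illustrative:

```python
# Illustrative sketch of the two classes of data-gathering categories.
# Category names follow Table 1; the class labels ("outside" vs
# "participatory") paraphrase the distinction drawn in the text.

PROCEDURE_CLASSES = {
    # The evaluator remains an outsider looking in on the program.
    "outside": ["existing records", "tests", "observations"],
    # The evaluator is drawn into gathering information from participants.
    "participatory": ["interviews", "meetings", "questionnaires"],
}

def class_of(category: str) -> str:
    """Return which class a category of procedures belongs to."""
    for cls, categories in PROCEDURE_CLASSES.items():
        if category in categories:
            return cls
    raise ValueError(f"unknown category: {category}")
```

Grouping the six categories this way makes explicit which procedures position the evaluator as an outsider and which draw the evaluator into interaction with program participants.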
Let us now turn to the steps whereby all of the information presented so
far can be implemented. This should help to tie together the many threads
that have been developed here.
Discussion
Step one: a framework
Numerous models for curriculum development have been proposed in
the language teaching literature, especially in the area of English for
specific purposes, and some of them have included an evaluation
component (Perry, 1976; Strevens, 1977; Candlin et al., 1978). With a view to
establishing evaluation that is an integral and ongoing part of the English
Language Institute (ELI) curriculum at the University of Hawaii, we
instead adapted the rather elaborate 'systematic approach' model devel-
oped by Dick and Carey (1985). Ours is the much simpler version shown
in Figure 1.
In a systematic approach to curriculum design such as this, the primary
information-gathering and organizational elements include the needs
analysis, instructional objectives and testing (Dick and Carey, 1985;
Brown and Richards, in preparation). The information and insights
gained from these activities can then be analyzed and synthesized in the
design of materials and delivery of instruction. The working model for
this approach which is presently being used in our program may, at first
glance, appear to represent five steps which should be followed linearly.
Further examination of the model reveals, however, that all of the
elements are interconnected by arrows to each other and to an ongoing
process of 'evaluation'.
The arrows and their link to evaluation are meant to imply three
[Figure 1 shows five elements (NEEDS ANALYSIS, OBJECTIVES, TESTING,
MATERIALS and TEACHING) connected by two-way arrows to one another
and to an ongoing process of EVALUATION.]
Figure 1 Systematic approach for designing and maintaining language
curriculum
that connects and holds all of the elements together. Without it, there is
no cohesion among the elements and, if left in isolation, any one of them
may become pointless. In short, the heart of the systematic approach to
language curriculum design, as shown here, is evaluation — the part of the
model that includes, connects and gives meaning to all of the other
elements.
that our treatment is better than nothing. Of course, the degree to which
our treatment is better than nothing may be of interest. We have decided,
instead, in favor of a simple pretest-post-test design for each course
wherein we will examine improvement within the program rather than
comparison to a control group. Such information is being viewed as only
one important piece in a puzzle which must include many more pieces —
ones which can provide more detail about what is going on in the learning
process. Other sources of information of a more qualitative nature (for
example, interviews, observations, diaries, etc.) are therefore being
included in our evaluation process.
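The within-program comparison described above can be illustrated with a minimal sketch; the course scores below are invented for illustration, not drawn from the ELI data:

```python
# Minimal sketch of the pretest-post-test gain analysis described above:
# improvement is examined within the program (no control group).
# All scores below are hypothetical.

def mean_gain(pretest, posttest):
    """Mean per-student gain from pretest to post-test for one course."""
    if len(pretest) != len(posttest):
        raise ValueError("each student needs both a pretest and a post-test score")
    gains = [post - pre for pre, post in zip(pretest, posttest)]
    return sum(gains) / len(gains)

# Hypothetical scores for one course.
pre = [52, 61, 48, 70, 55]
post = [64, 70, 59, 78, 66]
print(mean_gain(pre, post))  # prints 10.2, the average improvement
```

As the surrounding text stresses, such a figure is only one piece of the puzzle and is meant to be read alongside the more qualitative sources of information.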
A Effective?
1 Which of our original perceptions of the students' needs, especially as
reflected in the goals and objectives, are being met and which are not?
2 Which of our instructional objectives, as reflected in criterion-refer-
enced tests, are being achieved at an acceptable level of performance
by the students and which are not?
3 To what degree are the students learning the stated objectives and
overall academic English in the three skill areas (reading, listening and
B Efficient?
1 Which of the students' needs, as originally perceived, turned out to be
necessary when taught, and which turned out to be superfluous, as
reflected by performance on the objectives-based criterion-referenced
tests?
2 Which objectives were necessary and which have previously been
learned by the students as indicated by the pretests?
3 How can the criterion-referenced and norm-referenced tests be made
more efficient, reliable and valid for our purposes?
4 How can materials resources be better organized for access by the
teachers and students?
5 What types of teacher training can be provided to improve the
teaching throughout the program?
C Attitudes?
What are the students', teachers' and administrators' attitudes and
feelings about the i) students' language learning needs; ii) goals and
objectives; iii) criterion-referenced and norm-referenced tests; iv)
materials; and v) teaching?
Categories    Procedures    Questions to be addressed
Conclusion
Near the beginning of this paper, evaluation was defined as the sys-
tematic collection and analysis of all relevant information necessary to
promote the improvement of a curriculum and assess its effectiveness and
efficiency, as well as the participants' attitudes within the context of the
particular institutions involved. The overall purpose of evaluation, it was
argued, is to determine the general effectiveness of a program — usually
for purposes of improving it, or defending its utility to outside admini-
strators or agencies. This is not a simple issue, but rather a complex of
interrelated issues. Evaluation, in fact, should probably be viewed as the
drawing together of many sources of information to help examine
selected research questions from different points of view, with the goal of
forming all of this into a cogent and useful picture of how well the
language learning needs of the students are being met. One way to view
program evaluation might be that it is a never-ending needs analysis, the
goal of which is to constantly refine the ideas gathered in the initial needs
analysis, such that the program can do an even better job of meeting those
needs. All of the components of the curriculum are necessarily involved
as sources of information, and each of these must be considered from
different points of view. But selectivity and planning are crucial so that
the enormous amount of potential information does not become over-
powering. In short, evaluation should be the part of a curriculum that
includes, connects and gives meaning to all of the other elements in a
program.