
Escudero Escorza, T. (2003). From tests to current evaluative research. One century, the XXth, of intense development of evaluation in education. RELIEVE, v. 9, n. 1. http://www.uv.es/RELIEVE/v9n1/RELIEVEv9n1_1eng.htm

Revista ELectrónica de Investigación y EValuación Educativa [ www.uv.es/RELIEVE ]

This article in Spanish (original version)

FROM TESTS TO CURRENT EVALUATIVE RESEARCH. ONE CENTURY, THE 20TH, OF INTENSE DEVELOPMENT OF EVALUATION IN EDUCATION
(Desde los tests hasta la investigación evaluativa actual. Un siglo, el XX, de intenso desarrollo de la evaluación en educación)
by
Tomás Escudero Escorza (tescuder@unizar.es)

Abstract
This article presents a critical review of the historical development of the field of educational evaluation during the 20th century. The main theoretical proposals are discussed.

Keywords
Evaluation, Evaluation Research, Evaluation Methods, Formative Evaluation, Summative Evaluation, Testing, Program Evaluation

Introduction

In any discipline, an investigation into its history tends to be a fundamental route to understanding its conception, status, functions, environment, etc. This is especially evident in the case of evaluation, a discipline that has undergone deep conceptual and functional transformations throughout history and, above all, throughout the 20th century, on which we concentrate our analysis. In this sense, a diachronic approach to the concept is essential.

We will organize the analysis around three positions that could be called classics in the recent literature on the topic and that we use interchangeably. We do not claim to offer a synthesis of them, but rather to make full use of all three, since they bear on the same key moments and movements.

One position, perhaps the most widely used in our context (Mateo et al., 1993; Hernández, 1993), is that which Madaus, Scriven, Stufflebeam and
other authors offer, which tends to establish six periods, beginning the analysis in the 19th century (Stufflebeam and Shinkfield, 1987; Madaus et al., 1991). They speak of: a) the period of reform (1800-1900), b) the efficiency and “testing” period (1900-1930), c) the Tyler period (1930-1945), d) the innocence period (1946-1956), e) the expansion period (1957-1972) and f) the professionalization period (from 1973), which connects with the current situation.

Other authors, such as Cabrera (1986) and Salvador (1992), cite three major stages, taking as the central reference point the figure of Tyler in the second quarter of the 20th century. The stages before Tyler are referred to as precedents or antecedents, Tyler’s stage as the birth, and those that follow as development stages.

Guba and his collaborators, mainly Yvonna Lincoln, distinguish different generations. We would currently be in the fourth (Guba and Lincoln, 1989), which according to them rests on the constructivist paradigm and on the needs of the stakeholders as the basis for determining what information is needed. The first generation is that of measurement, which lasts until the first third of the 20th century; the second is that of description; and the third is that of judgement or valuation.

After the historical analysis, as a complement and a synthesizing review, we offer a concise summary of the most relevant evaluative approaches, the different models and positions that, with greater or lesser force, come to mind when we try to delimit what evaluation research in education is today.

1. Precedents: before «tests» and measurement

Since antiquity, teachers have created and used instructional procedures that relied on implicit references, without an explicit theory of evaluation, to assess and, above all, to differentiate and select students. Dubois (1970) and Coffman (1971) cite the procedures used in imperial China more than three thousand years ago to select high officials. Other authors, such as Sundbery (1977), speak of passages referring to evaluation in the Bible, while Blanco (1994) refers to the exams of the Greek and Roman teachers. But according to McReynold (1975), the most important book of antiquity regarding evaluation is the Tetrabiblos, attributed to Ptolemy. Cicero and Saint Augustine also introduce evaluative concepts and positions in their writings.

It is during the Middle Ages that exams are introduced into the university environment with a more formal character. One should recall the famous public oral exams before a tribunal, although only those who had prior permission from their professors were admitted to them, so that the possibility of failure was practically non-existent. In the Renaissance selective procedures continue to be used, and Huarte de San Juan, in his Examen de ingenios para las ciencias, defends observation as a basic position of evaluation (Rodríguez et al., 1995).

In the 18th century, as the demand for and access to education increase, the need to verify individual merits is accentuated, and educational institutions begin to draw up and introduce norms on the use of written exams (Gil, 1992).

In the 19th century, national systems of education are established and graduation diplomas appear, awarded upon passing exams (state exams). According to Max Weber (Barbier, 1993), a system of exams validating a specific preparation arises to satisfy the needs of a new hierarchical and bureaucratized society. In the United States, in 1845, Horace Mann begins to use the first evaluative techniques in the form of written tests. They spread to the schools of Boston and begin the road toward more objective and explicit references with respect to certain reading and writing skills. However, this is still not an evaluation grounded in a theoretical approach, but rather something that responds to routine practices,
frequently based on not very reliable instruments.

At the end of the 19th century, in 1897, a work by J.M. Rice appears that is usually cited as the first evaluative research in education (Mateo et al., 1993). It was a comparative analysis, in American schools, of the value of instruction in spelling, using test scores as the criterion.

2. Psychometric tests

In this context, at the end of the 19th century, great interest awakens in the scientific measurement of human behavior. This is framed within the movement to renew the methodology of the human sciences by adopting the positivism of the physical and natural sciences. In this sense, evaluation receives the same influences as other pedagogic disciplines concerned with measurement processes, such as experimental and differential pedagogy (Cabrera, 1986).

Evaluative activity will be decisively conditioned by several factors that converge at this moment, such as:

a) The flourishing of positivist and empiricist philosophical currents, which upheld observation, experimentation, data, and facts as the sources of true knowledge. A demand appears for scientific rigor and objectivity in the measurement of human behavior (Planchard, 1960), and written tests are promoted as a means of combating the subjectivity of oral exams (Ahman and Cook, 1967).

b) The influence of evolutionist theories and of the works of Darwin, Galton and Cattell, supporting the measurement of the characteristics of individuals and the differences among them.

c) The development of statistical methods, which decisively favored the metric orientation of the time (Nunnally, 1978).

d) The development of industrial society, which strengthened the need to find mechanisms for accrediting and selecting students according to their knowledge.

In keeping with this state of affairs, in the period between the end of the 19th century and the beginning of the 20th, an intense evaluative activity known as “testing” develops, defined by the following characteristics:

- Measurement and evaluation were interchangeable terms. In practice, only measurement was spoken of.

- The objective was to detect and establish individual differences, within the trait-and-attribute model that characterized the psychological developments of the time (Fernández Ballesteros, 1981), that is, to find differential scores that determine the subject's relative position within the reference group.

- Achievement tests, synonymous with educational evaluation, were developed to discriminate among individuals, largely neglecting representativeness and consistency with educational objectives. In the words of Guba and Lincoln (1982), evaluation and measurement had little relationship with the school curriculum. The tests gave information about the students, but not about the curricula with which they had been educated.

Within the educational field, some instruments of that time stand out, such as Ayres and Freeman’s writing scales, Hillegas’ writing, Buckingham’s spelling, Wood’s mathematics, Thorndike and McCall’s reading, and Wood and McCall’s arithmetic (Planchard, 1960; Ahman and Cook, 1967; Ebel, 1977).

However, it was in psychological tests that the efforts had the greatest impact, the work of Thorndike (1904) probably being the most influential at the beginning of the 20th century. In France the work of Alfred Binet on tests of cognitive abilities stands out, later revised by Terman at Stanford University. Today we speak of the Stanford-Binet, one of the best-known tests in the history of psychometrics.

Years later, with the recruitment needs of the First World War, Arthur Otis directs a team that builds group tests of general intelligence (Alpha for the literate and Beta for the illiterate) and personality inventories (Phillips, 1974).

After the war, psychological tests are put to the service of social ends. The decade between 1920 and 1930 marks the high point of «testing», with the development of a multitude of standardized tests to measure all kinds of school abilities against external and explicit objectives. They are based on intelligence measurement procedures, for use with large numbers of students.

These standardized applications are welcomed in educational settings, and McCall (1920) proposes that teachers construct their own objective tests, so as not to be at the mercy of proposals made exclusively by external specialists.

This movement ran parallel to the improvement of psychological tests through the development of statistics and factor analysis. The fervor for «testing» began to decline at the start of the 1940s, and some movements highly critical of these practices even began to arise.

Guba and Lincoln (1989) refer to this evaluation as the first generation, which can rightfully be called the generation of measurement. The role of the evaluator was technical, as a supplier of measurement instruments. According to these authors, this first generation remains alive, because texts and publications that treat evaluation and measurement as inseparable still exist (Gronlund, 1985).

3. The birth of true educational evaluation: The great “Tylerian” reform

Before the revolution promoted by Ralph W. Tyler arrived, an independent current known as docimology emerged in France during the 1920s (Pieron, 1968 and 1969; Bonboir, 1972), which represented a first approach to true educational evaluation. It criticized, above all, the gap between what is taught and the goals of instruction. Evaluation was left to the completely personal interpretation of the teacher. As a solution it proposed: a) the elaboration of taxonomies to formulate objectives, b) diversification of information sources: exams, academic records, test retaking techniques, and tests, c) unification of correction criteria, starting from agreement among those who correct the tests, and d) revision of value judgements by means of procedures such as double correction or averaging across different correctors. As can be seen, these are criteria of good and valid measurement, in some cases even advanced ones.

Nevertheless, Tyler is traditionally considered the father of educational evaluation (Joint Committee, 1981), for being the first to give it a methodical vision, moving, from the behaviorism then in vogue, beyond mere psychological evaluation. Between 1932 and 1940, in his famous Eight-Year Study of Secondary Education for the Progressive Education Association, published two years later (Smith and Tyler, 1942), he sets out the need for a scientific evaluation that serves to improve the quality of education. The synthesis work is published some years later (Tyler, 1950), clearly explaining his idea of curriculum and integrating into it his systematic method of educational evaluation, as the process designed to determine to what extent the previously established objectives have been reached (see also Tyler, 1967 and 1969).

The curriculum is defined by the following four questions:

a) What objectives are desired?

b) With what activities can they be reached?

c) How can these experiences be organized efficiently?

d) How can it be proved that the objectives are reached?

And good evaluation requires the following conditions:

a) Clear statement of objectives.

b) Determination of the situations in which the expected behaviors should be shown.

c) Selection of appropriate evaluation instruments.

d) Interpretation of the test results.

e) Determination of the reliability and objectivity of the measures.

This evaluation is no longer simple measurement, because it involves a value judgement on the information collected. It alludes, without developing the point, to decision making about the success or failure of the curriculum according to students’ results, a theme that important evaluators such as Cronbach and Stufflebeam would take up some years later.

For Tyler, the central reference in evaluation is the pre-established objectives, which should be carefully defined in behavioral terms (Mager, 1962), keeping in mind that they should mark the student's individual development, but within a socialization process.

The object of the evaluative process is to determine the change that has occurred in the students, but its function is wider than making this change explicit to the students themselves, parents and teachers; it is also a means of informing about the effectiveness of the educational program and about the teacher's continuing education. According to Guba and Lincoln (1989), this is the second generation of evaluation. Unfortunately, this global vision of evaluation was neither sufficiently appreciated nor exploited by those who used his work (Bloom et al., 1975; Guba and Lincoln, 1982).

Despite the above, and although the Tylerian reforms were not always applied immediately, Tyler’s ideas were well received by curriculum development specialists and by teachers. His scheme was rational and was supported by a clear technology, easy to understand and apply (Guba and Lincoln, 1982; House, 1989), and it fit perfectly with the rationality of task analysis, which was beginning to be used successfully in military educational settings (Gagné, 1971). In Spain, Tyler's positions spread with the General Law of Education of 1970.

After the Second World War comes a period of expansion and optimism that Stufflebeam and Shinkfield (1987) have not hesitated to describe as one of “social irresponsibility”, owing to the great consumerist waste after a time of recession. It is the stage known as that of innocence (Madaus et al., 1991). Educational institutions and services of all kinds expand, a large quantity of standardized tests is produced, there are advances in measurement technology and in the statistical principles of experimental design (Gulliksen, 1950; Lindquist, 1953; Walberg and Haertel, 1990), and the famous taxonomies of educational objectives appear (Bloom et al., 1956; Krathwohl et al., 1964). However, at this time the contribution of evaluation to the improvement of education is scarce, owing to the lack of coherent plans of action. Much is written about evaluation, but with little influence on the improvement of instructional work. The true development of the Tylerian proposals came later (Taba, 1962; Popham and Baker, 1970; Fernández de Castro, 1973).

Ralph W. Tyler died on February 18, 1994, aged over ninety, after seven decades of fruitful contributions and services to evaluation, research, and education in general. Some months before, in April 1993, Pamela Perfumo, a graduate student at Stanford University, interviewed Tyler in order to learn his thoughts on the current development of evaluation and the controversial topics surrounding it. This interview, suitably prepared, was presented on April 16, 1993 at the AERA Conference in Atlanta. Horowitz (1995) analyzes the content and meaning of the interview, highlighting, among other things, the following aspects of Tyler’s thinking at the end of his days:

a) The need to analyze the purposes of the evaluation carefully before beginning to evaluate. Current approaches based on multiple and alternative evaluations should conform to this principle.

b) The most important purpose in evaluating students is to guide their learning, that is, to help them learn. A comprehensive evaluation of all the significant aspects of their performance is necessary; it is not enough to make sure that they regularly do their daily work.

c) The portfolio is a valuable evaluation instrument, but its value depends on its content. In any event, one should be wary of the predominance of any single evaluation procedure, including the portfolio, because no single procedure can cover the whole spectrum of aspects to be evaluated.

d) True evaluation should be idiosyncratic, suited to the particular characteristics of the student and of the learning center. Strictly speaking, centers cannot be compared with one another.

e) Teachers should report to parents on their educational work with students. To do this it is necessary to interact with them more frequently and more informally.

Half a century after Tyler revolutionized the world of educational evaluation, one can observe the strength, coherence, and validity of his thought. As we have just seen, his basic ideas, suitably updated, connect easily with the most current trends in educational evaluation.

4. The development of the sixties

The sixties would bring new airs to educational evaluation, among other things because people began to pay attention to Tyler’s calls concerning the effectiveness of programs and the intrinsic value of evaluation for the improvement of education.

At that time a certain conflict arises between American society and its educational system, mainly because Russia got ahead in the space race with the launch of Sputnik by the USSR in 1957. A certain disenchantment with public schools appears and pressure for accountability grows (MacDonald, 1976; Stenhouse, 1984). In 1958 a new national defense education law is promulgated that provides many programs and the means to evaluate them. In 1964 the Elementary and Secondary Education Act (ESEA) is established and the National Study Committee on Evaluation is created, giving rise to a new evaluation concerned not only with students but also with its impact on programs and overall educational practice (Mateo et al., 1993; Rodríguez et al., 1995).

To improve the situation and regain scientific and educational hegemony, millions of dollars of public funds were devoted to subsidizing new educational programs and initiatives by American public school personnel aimed at improving the quality of teaching (Popham, 1983; Rutman and Mowbray, 1983; Weiss, 1983). This movement was also strengthened by the development of new technological media (audiovisual, computers...) and of programmed instruction, whose educational possibilities aroused the interest of education professionals (Rosenthal, 1976).

In the same way that the proliferation of social programs in the previous decade had driven program evaluation in the social field, the sixties would be fruitful in demand for evaluation in the field of education. This new dynamic that evaluation enters, though still centered on students as learners and with their performance as the object of valuation, will vary in its functions, focus, and interpretation according to the type of decision sought.

To a great extent, this strong American drive toward evaluation is due to the aforementioned passage of the Elementary and Secondary Education Act (ESEA) in 1965 (Berk, 1981; Rutman, 1984). With this law began the first significant program for educational organization at the federal level
of the United States, and it stipulated that each project carried out with federal financial support should be evaluated annually in order to justify future grants.

Along with the disenchantment with public schools, one must point to the economic recession that characterized the final years of the sixties and, above all, the seventies. This led the general population, as taxpayers, and legislators themselves to worry about the effectiveness and return of the money spent on improving the school system. At the end of the sixties, as a consequence of the above, a new movement enters the scene, the era of accountability (Popham, 1980 and 1983; Rutman and Mowbray, 1983), which is fundamentally associated with teaching personnel's responsibility for achieving established educational objectives. In fact, in 1973 the legislation of many American states instituted the obligation to monitor the achievement of educational objectives and to adopt corrective measures in negative cases (MacDonald, 1976; Wilson et al., 1978). It is understandable that, framed in this way, the movement of accountability and school responsibility gave rise to a wave of protests from educational personnel.

Popham (1980) offers another dimension of school responsibility when he refers to the school decentralization movement of the late sixties and early seventies. Large school districts were divided into smaller geographical areas, with, consequently, more direct civic control over what happened in the schools.

As a consequence of this convergence of influences, the phenomenon of educational evaluation expanded considerably. The direct subject of evaluation continued to be the student, but it now also included all the factors that converge in the educational process (the educational program in a wide sense, teacher, means, contents, learning experiences, organization, etc.), as well as the educational product itself.

As a result of these new evaluation needs, a period of reflection and theoretical essays aimed at clarifying the multidimensionality of the evaluative process began at this time. These theoretical reflections would decisively enrich the conceptual and methodological scope of evaluation and, together with the tremendous expansion of program evaluation during these years, would give rise to the new modality of applied research that today we call evaluation research.

As landmarks of the time, two essays must be highlighted for their decisive influence: Cronbach’s article (1963), Course improvement through evaluation, and Scriven's (1967), The methodology of evaluation. The wealth of evaluative ideas set out in these works obliges us to refer to them, if only briefly.

Regarding Cronbach's analysis of the concept, functions and methodology of evaluation, we highlight the following suggestions:

a) Associate the concept of evaluation with decision making. The author distinguishes three types of educational decisions that evaluation serves: 1) about the improvement of the program and instruction, 2) about the students (needs and final merits) and 3) about administrative regulation of the quality of the system, teachers, organization, etc. In this way, Cronbach opens the conceptual and functional field of educational evaluation far beyond the framework given by Tyler, although he follows his line of thought.

b) Evaluation used to improve a program while it is being applied contributes more to the development of education than evaluation used to estimate the value of the product of an already concluded program.

c) Question the need for evaluative studies to be comparative. Among the objections to this type of study, the author highlights the fact that the differences between group averages are frequently smaller than the differences within groups, along with others concerning the technical difficulties that comparative designs present in the educational setting. Cronbach argues for absolute criteria of comparison, stressing the need for criterion-referenced evaluation: valuation in relation to well-defined objectives rather than comparison with other groups.

d) Large-scale studies are questioned, since the differences among their treatments can be very large and prevent the causes of the results from being clearly discerned. Better-controlled analytic studies, which can be used to compare alternative versions of a program, are defended.

e) Methodologically, Cronbach proposes that evaluation should include: 1) process studies (events that take place in the classroom); 2) measurement of performance and attitudes (changes observed in the students); and 3) follow-up studies, that is, the later path followed by the students who participated in the program.

f) From this point of view, evaluation techniques cannot be limited to performance tests. Questionnaires, interviews, systematic and non-systematic observation, and essays, according to the author, occupy an important place in evaluation, in contrast to the almost exclusive use that had been made of tests as information-gathering techniques.

If Cronbach's reflections were striking, those in Scriven’s essay (1967) were no less so. His many terminological distinctions vastly enlarged the semantic field of evaluation and at the same time clarified the evaluative task. Below we highlight the most significant contributions:

a) The difference is definitively established between evaluation as a methodological activity, which the author calls the goal of evaluation, and the functions of evaluation in a particular context. Evaluation as a methodological activity is essentially the same whatever we are evaluating. The goal of evaluation is invariant: it is the process by which we estimate the value of whatever is evaluated, while the functions of evaluation can be enormously varied. These functions relate to the use that is made of the information collected.

b) Scriven points out two different functions that evaluation can adopt: the formative and the summative. He proposes the term formative evaluation for the evaluation of a program in progress with the aim of improving it, and the term summative evaluation for the process oriented to checking the effectiveness of the program and making decisions about its continuation.

c) Another important contribution of Scriven is his criticism of the emphasis that evaluation places on the attainment of previously established objectives, because if the objectives lack value, there is no interest in knowing to what extent they have been achieved. He stresses the need for evaluation to assess the objectives themselves as well as to determine the degree to which they have been reached (Scriven, 1973 and 1974).

d) Scriven makes a clear distinction between intrinsic evaluation and extrinsic evaluation, two different ways of valuing an element of teaching. In intrinsic evaluation the element is valued for what it is, while in extrinsic evaluation it is valued for the effects it produces in the students. This distinction is very important when considering the criteria to be used, because in intrinsic evaluation the criteria are not formulated in terms of operative objectives, whereas in extrinsic evaluation they are.

e) Scriven adopts a position contrary to Cronbach's, defending the comparative character that evaluation studies should have. Like Cronbach, he acknowledges the technical problems that comparative studies involve and the difficulty in explaining the
differences among programs. However, Scriven considers that evaluation, as opposed to mere description, implies producing a judgement about the superiority or inferiority of what is evaluated with regard to its competitors or alternatives.

These two contributions decisively influenced the community of evaluators, affecting not only studies in the line of evaluation research, to which they mainly refer, but also evaluation orientated to the individual, in the line of evaluation as assessment (Mateo, 1986). We are before the third generation of evaluation that, according to Guba and Lincoln (1989), is characterized by introducing valuation, judgement, as an intrinsic content of evaluation. Now the evaluator not only analyzes and describes reality, but also assesses and judges it in relation to different criteria.

During the sixties many other contributions appeared that continued drawing the outline of a new evaluative conception, which would finish developing and, mainly, spreading in later decades. The conceptual nucleus of evaluation is perceived to be the valuation of the change in the student as an effect of a systematic educational situation, with well formulated objectives being the best criteria to assess this change. Likewise, attention begins to be paid not only to the desired results, but also to side or undesired effects, and even to long term results or effects (Cronbach, 1963; Glaser, 1963; Scriven, 1967; Stake, 1967).

There was criticism of the operativization of objectives (Eisner, 1967 and 1969; Atkin, 1968). Some criticism was directed at the structure of the underlying assessment. Another was about centering the assessment of learning on the most easily measurable products. Finally, other critics focused on the scant attention given to the affective domain, which is more difficult to operativize. In spite of this, Tyler’s evaluative model would experience great improvement in these years, with works on educational objectives that would continue and perfect the road undertaken in 1956 by Bloom and collaborators (Mager, 1962 and 1973; Lindvall, 1964; Krathwohl et al., 1964; Glaser, 1965; Popham, 1970; Bloom et al., 1971; Gagné, 1971). Among other things, new ideas appeared about the evaluation of interaction in the classroom and about its effects on the achievement of the students (Baker, 1969).

Stake (1967) proposed his evaluation model, the countenance model, that follows the line of Tyler. However, Stake’s is more complete in considering the discrepancies between what is observed and what is expected in the “antecedents” and “transactions”, and in providing some bases to formulate a hypothesis about the causes and the shortcomings in the final results. In his successive proposals, Stake would begin distancing himself from his initial positions.

Metfessell and Michael (1967) also presented a model for evaluating the effectiveness of an educational program which, still following the basic pattern of Tyler, proposed the use of a comprehensive list of diverse criteria. The evaluators could keep these criteria in mind at the moment of evaluation and, consequently, not be centered merely on the intellectual knowledge reached by the students.

Suchman (1967) emphasized the idea that evaluation should be based on objective data that are analyzed with scientific methodology, clarifying that scientific research is preferably theoretical whereas evaluation research is always applied. His main purpose was to discover the effectiveness, success or failure, of a program when comparing it with the proposed objectives and, in this way, trace the lines of its possible redefinition. According to Suchman, this evaluation research should keep in mind: a) the nature of the addressee of the objective and that of the objective itself, b) the time needed for the proposed change to be carried out, c) the knowledge of whether the prospective results are dispersed or concentrated and d) the methods that must be used to reach the objectives. Suchman also defends the use of external evaluators to avoid all types of misrepre-
sentation by the teachers highly involved in the instructional processes.

The emphasis on the objectives and their measurement will also bring about the need for a new orientation to evaluation, the so-called criterion-referenced evaluation. The distinction introduced by Glaser (1963) between norm-referenced and criterion-referenced measurements would have an echo at the end of the sixties, precisely as a result of the new demands that were outlined by educational evaluation. In this way, for example, when Hambleton (1985) studies the differences between criterion-referenced tests and norm-referenced tests, he points out for the former, in addition to the well known objectives of describing the subject’s performance and making decisions regarding whether or not a particular content is known, another objective, that of valuing the effectiveness of a program.

Since the end of the sixties, specialists have spoken decisively in favor of criterion-referenced evaluation, inasmuch as that is the evaluation type that gives real and descriptive information on the individual's or individuals’ status regarding the foreseen teaching objectives, as well as the evaluation of that status by comparison with a standard or criterion of desirable achievements, the results obtained by other individuals or groups of individuals being irrelevant for the purposes of contrast (Popham, 1970 and 1983; Mager, 1973; Carreño, 1977; Gronlund, 1985).

In the evaluative practices of this decade of the sixties, two levels of action are observed. We can qualify one as evaluation orientated toward individuals, fundamentally students and teachers. The other level is that of evaluation orientated to decision making on the “instrument” or “treatment” or educational “program.” This last level, also impelled by the evaluation of programs in the social environment, will be the basis for the consolidation in the educational field of program evaluation and of evaluation research.

5. From the seventies: The consolidation of evaluation research

If one could characterize the theoretical contributions that specialists offer us during the 1970’s, it would be by the proliferation of all kinds of models of evaluation that flood the bibliographical market, evaluation models that express the points of view of the authors who propose them on what an evaluative process is and how it should be conducted. It is a time characterized by conceptual and methodological plurality. Guba and Lincoln (1982) speak to us of more than forty models proposed in these years, and Mateo (1986) refers to the proliferation of models. These will enrich the evaluative vocabulary considerably; however, we share Popham’s idea (1980) that some are too complicated and others use quite confusing jargon.

Some authors like Guba and Lincoln (1982), Pérez (1983) and in some measure House (1989), tend to classify these models in two large groups, quantitative and qualitative, but we think along with Nevo (1983) and Cabrera (1986) that the situation is much richer in nuances.

It is certain that those two tendencies are observed today in evaluative proposals, and that some models can be representative of them, but the different models, considered individually, differ more by highlighting or emphasizing some of the components of the evaluative process and by the particular interpretation that they lend to this process. It is from this perspective, to our understanding, that the different models should be seen and be valued for their respective contributions in the conceptual and methodological fields (Worthen and Sanders, 1973; Stufflebeam and Shinkfield, 1987; Arnal et al., 1992; Scriven, 1994).

There are various authors (Lewy, 1976; Popham, 1980; Cronbach, 1982; Anderson and Ball, 1983; De la Orden, 1985) that consider the models not as exclusive, but rather as complementary, and that the study of them (at least those that have turned out to be more practical) will cause the evaluator to adopt a wider and more understanding vision of his work. We, in
some moment have dared to speak of modellic approaches, more than of models, since it is each evaluator who ends up building his own model in each evaluative study as a function of the type of work and the circumstances (Escudero, 1993).

In this movement of evaluation model proposals, it is necessary to distinguish two stages with marked conceptual and methodological differences. In a first stage, the proposals followed the line set out by Tyler in his position that has come to be called "Achievement of Goals." Besides those already mentioned by Stake and Metfessell and Michael that correspond to the last years of the sixties, in this stage the proposal of Hammond (1983) and the Model of Discrepancy of Provus (1971) stand out. For these authors the proposed objectives continue being the fundamental criteria of evaluation, but they emphasize the necessity to contribute data on the consistency or discrepancy between the designed guidelines and their execution in the reality of the classroom.

Other models consider the evaluation process to be at the service of the bodies that should make decisions. Notable examples of them include probably the most famous and utilized of all, the C.I.P.P. (context, input, process and product), proposed by Stufflebeam and collaborators (1971), and the C.S.E. (which takes its initials from the University of California’s Center for the Study of Evaluation) directed by Alkin (1969). The conceptual and methodological contribution of these models is positively assessed among the community of evaluators (Popham, 1980; Guba and Lincoln, 1982; House, 1989). These authors go beyond evaluation centered on final results, given that in their proposals they envisage different evaluation types, according to the necessities of the decisions which they serve.

A second stage in the proliferation of models is the one represented by the so-called alternative models that, with different conceptions of evaluation and methodology, continue appearing in the second half of the seventies. Those highlighted among them include Stake’s Responsive Evaluation (1975 and 1976), to which Guba and Lincoln adhere (1982), MacDonald’s Democratic Evaluation (1976), Parlett and Hamilton’s Evaluation as Illumination (1977) and Eisner’s Evaluation as Artistic Criticism (1985).

In general terms, this second group of evaluative models emphasizes the role of the evaluation audience and the relationship of the evaluator with it. The high-priority evaluation audience in these models is not whoever should make the decisions, as in the models orientated to decision making, nor the one responsible for elaborating the curricula or objectives, as in the models of achievement of goals. The high-priority audience is the participants of the program themselves. The relationship between the evaluator and the audience, in the words of Guba and Lincoln (1982), should be “transactional and phenomenological”. We are referring to models that advocate an ethnographic evaluation; hence the methodology considered most appropriate is that used in social anthropology (Parlett and Hamilton, 1977; Guba and Lincoln, 1982; Pérez, 1983).

This summary of models from the proliferation period is enough to give an approximate idea of the wide theoretical and methodological conceptual array that today is related with evaluation. This explains why, when Nevo (1983 and 1989) tries to carry out a conceptualization of evaluation, starting with the review of the specialized literature and attending to topics such as: What is evaluation? What functions does it have? What is the purpose of the evaluation?..., a single answer is not found to these questions. It is easily understandable that the demands posed by the evaluation of programs, on the one hand, and by evaluation for making decisions regarding individuals, on the other, drive a great variety of real evaluative schemes used by teachers, directors, inspectors, and public administrators. But it is also certain that below this diversity lie different theoretical and methodological conceptions about evaluation. Different conceptions have given way to an opening and conceptual plurality in the field of evaluation in
several senses (Cabrera, 1986). Next we highlight the most outstanding points of this conceptual plurality.

a) Different evaluation concepts. On one hand, we have the classic definition given by Tyler: evaluation as the process of determining the degree of consistency between the achievements and the previously established objectives, to which the models orientated toward the achievement of goals correspond. This definition contrasts with the wider one that is advocated by the models orientated to decision making: evaluation as the process of determining, obtaining, and providing relevant information to judge alternative decisions, defended by Alkin (1969), Stufflebeam et al. (1971), MacDonald (1976) and Cronbach (1982).

Moreover, Scriven’s concept of evaluation (1967), being the process of estimating the value or the merit of something, is taken up again by Cronbach (1982), Guba and Lincoln (1982), and House (1989), with the objective of pointing out the differences that value judgements would involve in the event of estimating merit (it would be linked to intrinsic characteristics of what is being evaluated) or value (being linked to the use and application that it would have for a certain context).

b) Different criteria. It is deduced from the previously noted definitions that the criterion to use for evaluation of the information also changes. From the point of view of the achievement of goals, a good and operative definition of the objectives constitutes the fundamental criterion. From the perspective of the decisions and situated inside a political context, Stufflebeam and collaborators, Alkin and MacDonald even end up suggesting that the evaluator should not evaluate the information, the decision maker being responsible for doing it.

The definitions of evaluation that accentuate the determination of “merit” as an objective of evaluation use standard criteria on which the experts or professionals agree. These are models related to accreditation and professional judgement (Popham, 1980).

The authors (Stake, 1975; Parlett and Hamilton, 1977; Guba and Lincoln, 1982; House, 1983) that accentuate the evaluation process in the service of determining the “value” more than the “merit” of the entity or object evaluated, advocate that the fundamental criterion of valuation be the needs of the context in which it is introduced. In this way, Guba and Lincoln (1982) relate the terms of the evaluative comparison: on one hand, the characteristics of the evaluated object and, on the other, the necessities, expectations and values of the group that it affects or with which the evaluated object is related.

c) Plurality of evaluative processes depending on the theoretical perception that is maintained about evaluation. The cited evaluation models as well as others, too numerous to be in the bibliography, represent different proposals to conduct an evaluation.

d) Plurality of evaluation objects. As Nevo says (1983 and 1989), there exist two important conclusions about evaluation that are obtained from the review of the literature. On one hand, anything can be an evaluation object and it should not be limited to students and teachers and, on the other, a clear identification of the evaluation object is an important part of any evaluation design.

e) Opening, generally recognized by all the authors, of the information necessary in an evaluative process, to take in not only the desired results, but also the possible effects of an educational program, intended or not. Scriven (1973 and 1974) even proposes an evaluation in which one doesn't have in mind the objectives sought, but values all the possible effects. Opening also regarding the collection of information, not only on the final product, but also on the educational process. And opening in the consideration of different results of short and long scope.
Lastly, opening not only in considering cognitive results, but also the affective ones (Anderson and Ball, 1983).

f) Plurality in the functions of evaluation in the educational field, taking up the distinction proposed by Scriven between formative and summative evaluation, and adding others of a socio-political and administrative type (Nevo, 1983).

g) Differences in the role played by the evaluator, which has come to be called internal evaluation vs. external evaluation. Nevertheless, a direct relationship between the evaluator and the different audiences of the evaluation is recognized by most of the authors (Nevo, 1983; Weiss, 1983; Rutman, 1984).

h) Plurality of the audience of the evaluation and, consequently, plurality in the evaluation reports, from informal narrative reports to very structured reports (Anderson and Ball, 1983).

i) Methodological plurality. The methodological questions arise from the dimension of evaluation as evaluation research, which is defined, in great measure, by methodological diversity.

The previous summary identifies the contributions made to evaluation in the 1970’s and 1980’s, a period that has been named the time of professionalization (Stufflebeam and Shinkfield, 1987; Madaus et al., 1991; Hernández, 1993; Mateo et al., 1993). In addition to the countless models of the seventies, theoretical and practical positions were deepened and evaluation was consolidated as evaluation research in the sense previously defined. In this context many new specialized journals appeared, such as Educational Evaluation and Policy Analysis, Studies in Evaluation, Evaluation Review, New Directions for Program Evaluation, Evaluation and Program Planning, Evaluation News, etc. Scientific associations related to the development of evaluation were founded and universities began to offer courses and programs in evaluation research, not only in graduate degrees and doctorate programs, but also in study plans for undergraduate degrees.

6. The fourth generation, according to Guba and Lincoln

At the end of the 1980’s, after all the development described above, Guba and Lincoln (1989) offer an evaluative alternative that they call fourth generation, seeking to overcome what, according to these authors, are deficiencies of the three previous generations, such as a managerial view of evaluation, scant attention to the pluralism of values, and an excessive attachment to the positivist paradigm. The alternative of Guba and Lincoln is called responsive and constructivist, integrating in some way the responsive focus proposed originally by Stake (1975) and the postmodern epistemology of constructivism (Russell and Willinsky, 1997). The demands, the concerns and the matters of the individuals involved or responsible (stakeholders) serve as the organizational focus of evaluation (as a base to determine what information is needed), which is carried out within the methodological positions of the constructivist paradigm.

The use of the demands, concerns, and matters of those involved is necessary, according to Guba and Lincoln, because:

a) They are groups at risk with regard to evaluation and their problems should be adequately contemplated, so that they are protected in the face of such a risk.

b) The results can be used against those involved in different senses, mainly if they are outside of the process.

c) They are potential users of the resulting evaluation information.

d) They can enlarge and improve the range of evaluation.

e) A positive interaction is produced among the different individuals involved.
These authors justify the paradigmatic change because:

a) Conventional methodology doesn't contemplate the necessity of identifying the demands, concerns, and matters of the individuals involved.

b) To carry out the above-mentioned, a posture of discovery is needed rather than one of verification, typical of positivism.

c) Contextual factors are not kept sufficiently in mind.

d) Means are not provided for case by case valuations.

e) The supposed neutrality of conventional methodology is of doubtful utility when value judgements are sought concerning a social object.

Starting from these premises, the evaluator is responsible for certain tasks that he/she will carry out sequentially or in parallel, building an orderly and systematic work process. The basic responsibilities of the evaluator of the fourth generation are as follows:

1) To identify all individuals involved with risk in the evaluation.

2) To bring out for each group involved their conceptions about what was evaluated and their demands and concerns about this matter.

3) To provide a context and a hermeneutic methodology in order to be able to keep in mind, understand, and criticize the different concepts, demands, and concerns.

4) To generate the maximum possible agreement about the said concepts, demands, and concerns.

5) To prepare an agenda for the negotiation of topics not in consensus.

6) To collect and to provide the necessary information for the negotiation.

7) To form and mediate a forum of involved individuals for the negotiation.

8) To develop and elaborate reports for each group of involved individuals on the different agreements and resolutions about their own interests and those of other groups (Stake, 1986; Zeller, 1987).

9) To redraft the evaluation whenever there are matters pending resolution.

The proposal of Guba and Lincoln (1989) goes on at some length in explaining the nature and characteristics of the constructivist paradigm as opposed to those of the positivist one.

When speaking of the steps or phases of evaluation in this fourth generation, its proponents mention twelve steps or phases, with different subphases in each one of these. These steps are the following:

1) Establishment of a contract with a sponsor or client.

   . Identification of the client or sponsor of the evaluation.

   . Identification of the object of the evaluation.

   . Purpose of the evaluation (Guba and Lincoln, 1982).

   . Agreement with the client over the type of evaluation.

   . Identification of the audiences.

   . Brief description of the employed methodology.

   . Guarantee of access to records and documents.

   . Agreement to guarantee confidentiality and anonymity as far as possible.

   . Description of the report type to elaborate.
   . Listing of technical specifications.

2) Organization to redraft the research.

   . Selection and training of the appraisal team.

   . Attainment of facilities and access to the information (Lincoln and Guba, 1985).

3) Identification of the audiences (Guba and Lincoln, 1982).

   . Agents.

   . Beneficiaries.

   . Victims.

4) Development of joint constructs within each group or audience (Glaser and Strauss, 1967; Glaser, 1978; Lincoln and Guba, 1985).

5) Contrast and development of the joint constructs of the audiences.

   . Documents and records.

   . Observation.

   . Professional literature.

   . Circles of other audiences.

   . Ethical construct of the evaluator.

6) Classification of the demands, concerns, and resolved matters.

7) Establishment of priorities in the unresolved topics.

8) Collection of information.

9) Preparation of the agenda for negotiation.

10) Development of the negotiation.

11) Reports (Zeller, 1987; Lincoln and Guba, 1988).

12) Recycling/review.

To judge the quality of the evaluation, we are offered three approaches: the so-called parallel one, the one linked to the hermeneutic process, and that of authenticity.

The parallel criteria are named this way because they try to be parallel to the criteria of rigor used for many years inside the conventional paradigm. These criteria have been: internal and external validity, reliability and objectivity. However, the criteria should be in agreement with the underlying paradigm (Morgan, 1983). In the case of the fourth generation, the criteria that are offered are those of credibility, transferability, dependability, and confirmability (Lincoln and Guba, 1986).

The credibility criterion is parallel to that of internal validity, so that the idea of isomorphism between the findings and reality is replaced by isomorphism between the realities constructed by the audiences and the evaluator's reconstructions attributed to them. To achieve this, several techniques exist, among which the following are highlighted: a) prolonged engagement, b) persistent observation, c) peer debriefing, d) negative case analysis (Kidder, 1981), e) progressive subjectivity and f) member checks. Transferability can be seen as parallel to external validity, dependability is parallel to the reliability criterion, and confirmability can be seen as parallel to objectivity.

Another way to judge the quality of evaluation is through an analysis of the process itself, something that fits with the hermeneutic paradigm, through a dialectical process.

However, these two classes of criteria, although useful, are not completely satisfactory for Guba and Lincoln, who also defend with more insistence the criteria that they call authenticity criteria, also of a constructivist basis. These criteria include the following: a) fairness (impartiality, justice), b) ontological authenticity, c) educative authenticity, d) catalytic authenticity and e) tactical authenticity (Lincoln and Guba, 1986).
We can complete this analysis of the fourth generation with the characteristics with which Guba and Lincoln define evaluation:

a) Evaluation is a sociopolitical process.

b) Evaluation is a joint process of collaboration.

c) Evaluation is a teaching/learning process.

d) Evaluation is a continuous process, recursive and highly divergent.

e) Evaluation is an emergent process.

f) Evaluation is a process with unpredictable results.

g) Evaluation is a process that creates reality.

In this evaluation the characteristics of the evaluator are inherited from the first three generations, namely those of the technician, the analyst, and the judge. However, these should be expanded with skills in order to gather and interpret qualitative data (Patton, 1980). These skills include those of a historian and enlightener and those of a mediator of judgements, which serve to create a more active role for the evaluator in the concrete socio-political context.

Russell and Willinsky (1997) defend the potential of the fourth generation’s position to develop alternative formulations of evaluative practice among those individuals involved, increasing the probability that evaluation serves to improve school teaching. This requires, on the part of the faculty, the recognition of other positions besides their own, the involvement of all from the beginning of the process and, on the other hand, the development of more pragmatic approaches to the conceptualization of Guba and Lincoln, adapted to the different school realities.

7. The new impulse around Stufflebeam

To finish this analytic-historical journey from the first attempts of educational measurement to current evaluation research in education, we want to gather the recommendations that come to us more recently from one of the figures of this field in the second half of the 20th century. We are referring to Daniel L. Stufflebeam, proposer of the CIPP model (the most used) at the end of the sixties, president of the Joint Committee on Standards for Educational Evaluation from 1975 to 1988, and current director of the Evaluation Center of Western Michigan University (headquarters of the Joint Committee) and of CREATE (Center for Research on Educational Accountability and Teacher Evaluation), a center sponsored and financed by the Department of Education of the American Government.

In gathering these recommendations (Stufflebeam, 1994, 1998, 1999, 2000 and 2001), which have been integrating ideas of diverse notable evaluators, we offer not just one of the latest contributions to the current conception of evaluation research in education; we also complete in good measure the vision of the current, rich and plural panorama, after analyzing the fourth generation of Guba and Lincoln.

Stufflebeam starts from the four principles of the Joint Committee (1981 and 1988), that is, from the idea that any good work of evaluative research should be: a) useful, that is, it should provide timely information and have influence, b) feasible, that is, it should involve a reasonable effort and should be politically viable, c) appropriate, adequate, legitimate, that is, ethical and just with those individuals involved, and d) sure and precise when offering information and judgements on the object of evaluation. Also, evaluation is seen as “transdisciplinary,” because it is applicable to many different disciplines and many diverse objects (Scriven, 1994).

Stufflebeam invokes the responsibility of the evaluator, who should act according to principles accepted by society and to criteria of professionalism, form judgements regarding the quality and educational value of the evaluated object, and assist the involved individuals in the interpretation and use of its information and judgements. However, it is also
their duty, and their right, to remain on the margin of the fight and of the political responsibility for the decision-making process and its eventual conclusion.

To evaluate education in a modern society, Stufflebeam (1994) suggests that some basic criteria of reference should be adopted, such as the following:

• Educational necessities. It is necessary to ask oneself if the education provided covers the necessities of the students and their families in all areas in view of basic rights, in this case, inside a democratic society (Nowakowski et al., 1985).

• Fairness, equity. It is necessary to ask oneself if the system is fair and equitable when providing educational services, access, the achievement of goals, the development of aspirations, and coverage for all sectors of the community (Kellagan, 1982).

• Feasibility. It is necessary to question the efficiency of the use and distribution of resources, the adequacy and viability of the legal norms, the commitment and participation of those individuals involved, and everything that makes it possible for the educational effort to bear as much fruit as possible.

• Excellence as a permanently sought-after objective. The improvement of quality, starting from the analysis of past and present practices, is one of the foundations of evaluation research.

Taking these criteria and their derivations as the point of reference, Stufflebeam summarizes a series of recommendations to carry out good evaluative research and to improve the educational system. These recommendations are the following:

1) Evaluation plans should satisfy the four requirements of utility, feasibility, legitimacy and precision (Joint Committee, 1981 and 1988).

2) Educational entities should be examined for their integration and service to the principles of democratic society, equality, well-being, etc.

3) Educational entities should be valued in terms of their merit (intrinsic value, quality regarding general criteria) as much as their value (extrinsic value, quality and service for a particular context) (Guba and Lincoln, 1982; Scriven, 1991), as well as for their significance in the reality of the context in which they are located. Scriven (1998) points out that, using other habitual denominations, merit has a fairly good equivalence with the term quality, value with that of cost-effectiveness, and significance with that of importance. In any event, the three concepts depend on the context, especially the one that refers to significance, meaning that understanding the difference between dependence on the context and arbitrariness is part of understanding the logic of evaluation.

4) Evaluation of teachers, educational institutions, programs, etc., should always be related to their duties, responsibilities, and professional or institutional obligations. Maybe one of the challenges that educational systems should tackle is a clearer and more precise definition of these duties and responsibilities. Without it, the evaluation is problematic, even in the formative field (Scriven, 1991a).

5) Evaluative studies should be able to assess to what extent teachers and educational institutions are responsible and accountable for the execution of their duties and professional obligations (Scriven, 1994).

6) Evaluative studies should provide direction for improvement, because it is not enough simply to form a judgement about the merit or the value of something.

7) Bringing together the previous points, every evaluative study should have formative and summative components.

8) Professional self-evaluation should be encouraged, providing the educators with the skills needed and favoring positive attitudes toward it (Madaus et al., 1991).

9) Evaluation of the context (necessities, opportunities, problems in an area,…) should be used in a prospective way, to locate the goals and objectives and to define priorities. Also, evaluation of the context should be used retrospectively to adequately judge the value of the services and educational results, in connection with the necessities of the students (Madaus et al., 1991; Scriven, 1991).

10) Evaluation of the inputs should be used in a prospective way, to assure the use of an appropriate range of approaches according to the necessities and plans.

11) Evaluation of the process should be used in a prospective way to improve the work plan, but also in a retrospective way to judge to what extent the quality of the process explains why the results are at one level or another (Stufflebeam and Shinkfield, 1987).

12) Evaluation of the product is the means of identifying the desired and undesired results in the participants or in those affected by the evaluated object. A prospective valuation of the results is needed to guide the process and to detect areas of need. A retrospective evaluation of the product is needed to be able to gauge as a whole the merit and value of the evaluated object (Scriven, 1991; Webster and Edwards, 1993; Webster et al., 1994).

13) Evaluative studies should be based on communication and on the substantive and functional involvement of stakeholders in the key questions, criteria, findings and implications of the evaluation, as well as in promoting the acceptance and use of its results (Chelimsky, 1998). Moreover, evaluative studies should be conceptualized and used systematically as part of the long-term process of educational improvement (Alkin et al., 1979; Joint Committee, 1988; Stronge and Helm, 1991; Keefe, 1994) and as grounds for action against social discrimination (Mertens, 1999). Empowerment evaluation, which Fetterman (1994) defends, is a democratically based procedure of participation by the individuals involved in the evaluated program, intended to promote their autonomy in the resolution of their problems. Weiss (1998) warns us that participative evaluation increases the probability that the results of the evaluation are used, but also that it is conservative in its conception, because it is difficult to imagine those responsible for an organization calling into question its foundations and its system of power. Their interest is generally in changing small things.

14) Evaluative studies should employ multiple perspectives, multiple measures of results, and quantitative as well as qualitative methods to collect and analyze the information. The plurality and complexity of the educational phenomenon make the use of multiple and multidimensional approaches in evaluative studies necessary (Scriven, 1991).

15) Evaluative studies should themselves be evaluated, including formative metaevaluations to improve their quality and use as well as summative metaevaluations to help users in the interpretation of their findings and to provide suggestions for the improvement of future evaluations (Joint Committee, 1981 and 1988; Madaus et al., 1991; Scriven, 1991; Stufflebeam, 2001).

These fifteen recommendations provide essential elements for an approach to evaluative studies that Stufflebeam calls objectivist and that is based on the ethical theory that moral goodness is objective and independent of personal or merely human feelings.

Without entering into debate concerning these final assessments of Stufflebeam, or initiating a comparative analysis with other proposals, for example with those of Guba and Lincoln (1989), we find it evident that the conceptions of evaluation research are diverse, depending on their epistemological origin. However, some clear and convincing common elements appear within all the perspectives, such as contextualization, service to society, methodological diversity, attention to, respect for and participation of those involved, etc., as well as a greater professionalization of the evaluators and a wider institutionalization of the studies (Worthen and Sanders, 1991).

Stufflebeam (1998) recognizes the conflict between the positions of the Joint Committee on Standards for Educational Evaluation and those of the present trends in evaluation known as postmodernist. Besides Guba and Lincoln, these trends are represented by other recognized evaluators such as Mabry, Stake and Walker. However, he does not accept that there are reasons for attitudes of scepticism and frustration with current evaluative practice, because many areas of convergence exist and the development of evaluation standards is perfectly compatible with attention to the diverse groups of involved individuals and their values, social contexts and methods. Stufflebeam defends greater collaboration in the improvement of evaluations, establishing standards in a participative way, because he believes that a rapprochement of positions is possible, with important contributions from all points of view.

Weiss (1998) takes a similar position when she suggests that constructivist ideas should cause us to think more carefully when using the results of evaluations, synthesizing them and establishing generalizations. However, she doubts that everything has to be interpreted in exclusively individual terms, as many common elements exist among people, programs and institutions.

8. To conclude: synthesis of model and methodological approaches of evaluation and Scriven’s final perspective

After this analysis of the development of evaluation throughout the 20th century, it seems opportune, as a synthesis and conclusion, to gather and emphasize those that are considered the main models, methodological positions, designs, perspectives and current visions. Their analysis, in compact form, is a necessary complement to a historical vision such as this one that, due to its linearity, runs the risk of offering an artificially divided image of the discipline.

During the seventies and the surrounding years, we saw the appearance of evaluative proposals that traditionally have been called models (Castillo and Gento, 1995) and in some cases designs (Arnal et al., 1992) of evaluation research. We know that several dozen of these proposals existed in the above-mentioned decade, though they were highly concentrated in time. In fact, the matter of proposed models for evaluation seems to have been a practically closed topic for nearly twenty years. New models or proposals no longer arise, apart from some exceptions that we will see later on.

Despite this, models, methods and designs have continued to be discussed in the specialized literature, mainly with a view to classifying them according to diverse approaches, paradigmatic origin, purpose, methodology, etc. Diversity also exists in the classifications, not only in the models, which shows that, alongside academic dynamism in the area of evaluation research, a certain theoretical weakness still exists in this respect.

We have previously pointed out (Escudero, 1993) that we agree with Nevo (1983 and 1989) in the appreciation that many of the approaches to the conceptualization of evaluation (for example, the responsive model, the goal-free model, the discrepancy model, etc.) have been unduly called models, although none of them has the degree of complexity and comprehensiveness that this concept should carry. What a classic text on evaluation (Worthen and Sanders, 1973) designates as «contemporary models of evaluation» (the well-known positions of Tyler, Scriven, Stake, Provus, Stufflebeam, etc.), Stake himself (1981) says would be better called “persuasions”, while House (1983) refers to “metaphors”.

Norris (1993) notes that the concept of model is used rather loosely when referring to a conception, approach or even method of evaluation. De Miguel (1989), for his part, thinks that many of the so-called models are only descriptions of processes or approaches to evaluation programs. Darling-Hammond et al. (1989) use the term “model” out of habit, but they indicate that they do not use it in the precise meaning the term has in the social sciences, that is, based on a structure of supposed theory-based interrelations. Finally, we will say that the very author of the CIPP model uses this denomination systematically only to refer to his own model (Stufflebeam and Shinkfield, 1987), using the terms approach, method, etc., when referring to the others. For us, perhaps the term evaluative approaches is the most appropriate, even if we continue speaking of models and designs simply due to academic tradition.

Our idea is that, when conducting evaluative research, we still do not have a select handful of well-founded, defined, structured and complete models from which to choose one in particular. However, we do have distinct model approaches and ample theoretical and empirical support that allow the evaluator to respond appropriately to the different questions that the research process raises, helping to configure a global plan, a coherent flowchart, and a scientifically robust «model» with which to carry out the evaluation (Escudero, 1993). Which questions must be addressed in this process of model construction? Drawing on the contributions of different authors (Worthen and Sanders, 1973; Nevo, 1989; Kogan, 1989; Smith and Haver, 1990), in building a model of evaluation research the evaluator should address and define an answer to the following aspects:

1) Object of the evaluation research.

2) Purpose, objectives.

3) Audiences/participants/clientele.

4) High-priority or preferential emphasis/aspects.

5) Criteria of merit or value.

6) Information to collect.

7) Methods of information collection.

8) Analysis methods.

9) Agents of the process.

10) Sequencing of the process.

11) Reports/utilization of results.

12) Limits of the evaluation.

13) Evaluation of evaluation research itself / metaevaluation.

To define these elements it is logically necessary to look for the support of the different model approaches, methods, procedures, etc., that evaluation research has developed, mainly in recent decades.

Returning to the so-called models of the seventies and to their classifications, we can gather some of those that appeared in the last decade in our academic field, based on different authors. For example, Arnal et al. (1992) offer a classification of what they call designs of evaluation research, reviewing those of diverse authors (Patton, 1980; Guba and Lincoln, 1982; Pérez, 1983; Stufflebeam and Shinkfield, 1987). The classification is as follows:

Chart 1 - Types of designs of educational research

Classifications compared: Patton (1980); Guba and Lincoln (1982); Pérez (1983); Stufflebeam and Shinkfield (1987)

Perspective | Design | Listed in (of the four classifications) | Creating authors
Empirical-analytic | Objectives | 4 | Tyler (1950)
Empirical-analytic | System analysis | 2 | Rivlin (1971); Rossi et al. (1979)
Empirical-analytic | Scientific method | 1 | Suchman (1967)
Liable to complementarity | CIPP | 3 | Stufflebeam (1966)
Liable to complementarity | Artistic criticism | 3 | Eisner (1971)
Liable to complementarity | Adversary proceedings | 2 | Wolf (1974)
Liable to complementarity | UTOS | 2 | Cronbach (1982)
Humanistic-interpretive | Responsive | 4 | Stake (1975)
Humanistic-interpretive | Illumination | 3 | Parlett and Hamilton (1977)
Humanistic-interpretive | Goal-free | 3 | Scriven (1967)
Humanistic-interpretive | Democratic | 1 | MacDonald (1976)

For their part, Castillo and Gento (1995) offer a classification of “methods of evaluation” within each one of the paradigms that they call behaviorist-efficientist, humanistic and holistic. The following is a synthesis of these classifications:

Chart 2 - Behaviorist-efficientist model

Method / author | Evaluative purpose | Dominant paradigm | Content of evaluation | Role of the evaluator
Achievement of objectives, Tyler (1940) | Measurement of achieved objectives | Quantitative | Results | External technician
CIPP, Stufflebeam (1967) | Information for making decisions | Mixed | C (context), I (input), P (process), P (product) | External technician
Countenance, Stake (1967) | Valuation of results and process | Mixed | Antecedents, transactions, results | External technician
CSE, Alkin (1969) | Information for determination of decisions | Mixed | Centered on achievements of necessities | External technician
Educational planning, Cronbach (1982) | Valuation of process and product | Mixed | U (evaluation units), T (treatment), O (operations) | External technician

Chart 3 – Humanistic model

Method / author | Evaluative purpose | Dominant paradigm | Content of evaluation | Role of the evaluator
Customer service, Scriven (1973) | Analysis of the client’s necessities | Mixed | All the effects of the program | External evaluator of necessities of the client
Opposition, Owens (1973), Wolf (1974) | Opinions for consensus decision | Mixed | Any aspect of the program | External referee of the debate
Artistic criticism, Eisner (1981) | Critical interpretation of educational actions | Qualitative | Context; emergent processes; relations of processes; impact on context | External stimulator of interpretations

Chart 4 – Holistic model

Method / author | Evaluative purpose | Dominant paradigm | Content of evaluation | Role of the evaluator
Responsive evaluation, Stake (1976) | Valuation of answer to necessities of participants | Qualitative | Result of total debate on program | External stimulator of the interpretation for individuals involved
Holistic evaluation, MacDonald (1976) | Educational interpretation for improvement | Qualitative | Elements that configure educational action | External stimulator of the interpretation for individuals involved
Evaluation as illumination, Parlett and Hamilton (1977) | Illumination and understanding of the program’s components | Qualitative | System of teaching and means of learning | External stimulator of the interpretation for individuals involved

Scriven (1994) also offers a classification of the “previously-mentioned models,” before introducing his transdisciplinary perspective, which will be commented on later. This author identifies six visions or alternative approaches in the “explosive” phase of the models, in addition to others that he refers to as “exotic” and that range from models of jurisprudence to expert models. Next we succinctly comment on these visions and the “models” that are attributed to them.

The strong decision-making vision (Vision A) gives the researching evaluator the objective of reaching evaluative conclusions that help whoever has to make decisions. Those that support this approach are concerned with whether the program will reach its objectives, but they go on to question whether such objectives cover the necessities that they should cover. This position is maintained, although not made explicit, by Ralph Tyler and is extensively elaborated in the CIPP model (Stufflebeam et al., 1971).

According to the Tylerian position, the decisions regarding a program should be based on the degree of coincidence between the objectives and the results. The degree of change in students, which is usually the pursued objective, is the evaluation criterion in this case.

Contrary to Tyler, Stufflebeam offers a wider perspective of the contents to be evaluated. Four dimensions identify his model: the context (C) where the program takes place or where the institution is located, the inputs (I), that is, elements and initial resources, the process (P) that must be followed toward the goal, and the product (P) that is obtained. Also, it is established that the fundamental objective of evaluation research is improvement, that is, decision-making for the improvement of each of the four above-mentioned dimensions.

Scriven (1994) tells us that Stufflebeam has continued developing his perspective since the development of the CIPP. However, one of his collaborators, Guba, later took a different direction, as we have seen when analyzing the fourth generation of evaluation (Guba and Lincoln, 1989).

The weak vision of decision-making (Vision B) has the evaluator provide relevant information for decision-making, but does not oblige him or her to produce critical or evaluative conclusions about the objectives of the programs. Its most genuine theoretical representative is Marv Alkin (1969), who defines evaluation as a factual process of collecting and generating information at the service of the individual who makes the decisions, but it is this person who has to draw the evaluative conclusions. This position is, logically, popular among those who think that true science should not or cannot enter into questions of value judgement. Alkin’s model is known as CSE (Center for the Study of Evaluation), and it outlines the following phases: assessment of necessities and definition of the problem, planning of the program, evaluation of implementation, evaluation of progress and evaluation of results.

The relativist vision (Vision C) also keeps its distance from evaluative conclusions, but uses the frame of the clients' values, without any judgement on the part of the evaluator about those values or any reference to others. This vision and the previous one have been the road that has allowed many social scientists to join the “bandwagon” of evaluative research without problems. In fact, one of the most widely used evaluation texts in the field of social sciences (Rossi and Freeman, 1993) adopts this perspective.

Visions B and C are the positions of scientists attached to a value-free conception of science. On the other hand, those that subscribe to vision A come from a different paradigm, probably due to their academic connection with history, philosophy of education, comparative education and educational administration.

Some years ago Alkin (1991) revised his positions from two decades earlier, but still did not include the terms merit, value or worth. He ends up defining a Management Information System (MIS) for the use of the individual who makes decisions, but he does not offer valuations in this respect.

The simplest form of the relativist vision (Vision C) is the one developed in Malcolm Provus’ “discrepancy model” of evaluation (1971). The discrepancies are the divergences from the projected sequence of tasks and the foreseen timing. This model is closely related to program control in the conventional sense; it is a type of simulation of an evaluation.

The vision of the fertile, rich, complete description (Vision D) understands evaluation as an ethnographic or journalistic task in which the evaluator reports on what he/she sees without trying to produce evaluative statements or to infer evaluative conclusions, not even in the frame of the client's values as in the relativist vision. This vision has been defended by Robert Stake as well as by many British theorists. It is a kind of naturalistic version of vision B, has something of a relativist flavor and sometimes appears to be a precursor of the vision of the fourth generation. It is based on observation, on the observable, more than on inference. Recently it has been called the vision of the solid, strong
description, to avoid the term rich, which seems more evaluative.

In his first stage, Stake is Tylerian in regard to an evaluative conception centered on the outlined objectives, proposing the countenance model (Stake, 1967) as a total image of the evaluation. It revolves around three components, antecedents, transactions and results, and elaborates two matrices of data, one of description and another of judgement. In the description matrix, intentions are gathered on one side and observations on the other; in the judgement matrix, the norms that are approved and the judgements that are believed to be appropriate are collected.

During the mid-seventies, Stake moves away from the Tylerian tradition of concern for objectives and revises his evaluation method toward a position that he describes as “responsive” (Stake, 1975 and 1975a), assuming that the objectives of the program can be modified over time, with the purpose of offering a complete and holistic vision of the program and of responding to the problems and real questions posed by those involved in the program. According to Stufflebeam and Shinkfield (1987), this model made Stake the leader of a new school of evaluation that demands a model that is pluralistic, flexible, interactive, holistic, subjective and service-oriented. This model is suggestive of the “customer service” proposed by Scriven (1973), valuing the clients' necessities and expectations.

In a graphic way, Stake (1975a) presents the phases of the method by comparison with the hours on a clock, putting the first one at twelve o’clock and continuing with the following phases in a clockwise direction. These phases are the following: 1) Speak with the clients, those responsible, and audiences, 2) Scope of the program, 3) Panorama of activities, 4) Purposes and interests, 5) Questions and problems, 6) Data to investigate the problems, 7) Observers, judges and instruments, 8) Antecedents, transactions and results, 9) Development of topics, descriptions and case studies, 10) Validation (confirmation), 11) Outline for the audience and 12) Gathering of formal reports. The evaluator can also follow the phases in a counterclockwise direction or in any other order.

In the responsive method the evaluator must interview the participants to learn their points of view and to look for the convergence of the diverse perspectives. The evaluator will interpret the opinions and differences in points of view (Stecher and Davis, 1990) and present a wide range of opinions or judgements, instead of presenting his/her personal conclusions.

The vision of social process (Vision E), which crystallized more than two decades ago around a group from Stanford University directed by Lee J. Cronbach (1980), plays down the importance of the summative orientation of evaluation (external decisions about the programs and accountability), emphasizing understanding, planning and the improvement of social programs for those they serve. Their positions were clearly established in ninety-five theses that have had an enormous diffusion among evaluators and users of evaluations.

As for the contents of the evaluation, Cronbach (1983) proposes that the following elements be planned and controlled:

• Units (U) that are subjected to evaluation, individuals or participant groups.

• Treatment (T) of the evaluation.

• Operations (O) that the evaluator carries out for the collection and analysis of data, as well as for the elaboration of conclusions.

• Context in which the program and its evaluation take place.

In a specific piece of evaluative research, several units, treatments and operations can occur, that is, several (uto), inside a (UTO) universe of acceptable situations.

Ernie House (1989), a theorist and practitioner of evaluation quite independent of the latest trends in fashion, also stressed the social connection of programs, but he distinguished himself mainly for his emphasis on the more ethical and argumentative dimensions of evaluation, perhaps motivated by the absence of these facets in the approaches of Cronbach and his collaborators.

The constructivist vision of the fourth generation (Vision F) is the last of these six visions that Scriven (1994) describes; it is maintained by Guba and Lincoln (1989) and continued by many American and British evaluators. We have already seen that this vision rejects an evaluation guided by the search for quality, merit, value, etc., and favors the idea that evaluation is the result of construction by individuals and negotiation among groups. According to Scriven, this means that all types of scientific knowledge are suspect, debatable and non-objective. The same thing happens to all analytic work, including its own philosophical analysis. Scriven notes that Guba himself has always been aware of the potential “self-contradictions” of his position.

Beyond this review by Scriven, there are some evaluative positions traditionally gathered and dealt with by analysts. For example, Suchman (1967) offers an evaluative design based on the scientific method or, at least, on some variation or adaptation of it. Owens (1973) and Wolf (1974 and 1975) propose an opposition or discussion method in which a program gives rise to two groups of evaluators, partisans and adversaries, in order to provide pertinent information to decision makers. Eisner (1971, 1975 and 1981) outlines evaluation in terms similar to the process of artistic criticism.

Scriven himself (1967 and 1973) proposed years ago to base evaluation on customer service and not so much on the foreseen goals, given that unforeseen achievements are frequently more important than those that figure in the planning of the program. Because of this, he tends to call his approach goal-free evaluation. The evaluator determines the value or merit of the program in order to inform the users; the evaluator is something like an informational intermediary (Scriven, 1980).

Evaluation as illumination (Parlett and Hamilton, 1977) has a holistic, descriptive and interpretive approach, with the aim of illuminating a complex range of questions that arise in an interactive way (Fernández, 1991). MacDonald’s democratic evaluation (1971 and 1976), also called holistic, involves the collaborative participation of those individuals involved, and the contrast of the participants' opinions is taken to be the fundamental evaluative element.

Scriven (1994) critically analyzes the six visions and shows himself to be closest to vision A, the strong decision-making vision, represented fundamentally by Stufflebeam's CIPP model and his positions. He claims that it is the one that comes closest to the common-sense vision, which is the one that working evaluators use in their own programs, in the same way that doctors work with patients, doing the best they can regardless of the type and general state of the patient. Scriven wants to extend this vision with a vision or model that he calls transdisciplinary and that he describes as significantly different from the above-mentioned vision A and radically different from the others.

In the transdisciplinary perspective, evaluation research has two components: the set of fields of application of evaluation and the content of the discipline itself. Something similar happens with disciplines such as statistics and measurement. In short, evaluation research is a discipline that includes its own contents as well as those of many other disciplines; its concern for analysis and improvement extends to many disciplines, making it transdisciplinary.

This vision is objectivist, like vision A, and holds that the evaluator determines the merit or the value of the program, of the personnel or of the researched products. In this sense, the logic used in the inference of the evaluative conclusions should be explicitly established and defended, starting from the defi-
Revista ELectrónica de Investigación y EValuación Educativa [ www.uv.es/RELIEVE ]


Escudero Escorza, T. (2003). From tests to current evaluative research. One century, the XXth, of intense development of
evaluation in education. RELIEVE, v. 9, n. 1. http://www.uv.es/RELIEVE/v9n1/RELIEVEv9n1_1eng.htm

nitional and factual premises. Likewise, the ar- ology. To conduct a proper evaluation, it
gumentational fallacies of the doctrine free of probably is not necessary to be a great special-
values should also be pursued (Evaluation The- ist in auxilliary techniques and their processes
saurus, 1991). of results synthesis, consequences, and their
placement in the evaluation process as a whole;
Secondly, the transdisciplinary perspective is however, it is a necessity to have a thorough
centered on the consumer rather than the agent or general understanding.
intermediary. The perspective is not exclusive to
the consumer, but it does consider the consumer This transdisciplinary perspective of
first as a justification of the program. In addi- Scriven’s evaluation research (1994), coincides
tion, it considers common good a primacy of in great measure with the positions that we
evaluation. Beginning here, valuable informa- have defended in other moments (Escudero,
tion is also produced for the agent who decides 1996). We don't have positions contrary to the
and can analyze the results of a particular pro- other visions in the same measure that Scriven
gram or institution in relation to its initial objec- does and, in fact, we consider from a prag-
tives. This position not only lends legitimacy to matic position that all the visions have strong
the researcher when generating evaluative con- points and that in any event, they contribute
clusions, but also creates a necessity to perform something useful to the conceptual understand-
such an analysis in the majority of cases. ing and the development of evaluation re-
search. However, we do think that this modern
It is also a widespread vision, though not ex- vision of Scriven is solid and coherent and
actly a general vision, that includes the generali- broadly accepted at the present time.
zation of concepts in the field of knowledge and
practice. From this perspective, evaluation re- A critique could be made of this position of
search is much more than the evaluation of pro- Scriven concerning the excessive relative em-
grams, processes, and institutions and impacts in phasis placed on client orientation, that is in the
many other objects. In a more detailed way, this user in the strict sense. We think that this ori-
widespread vision means that: entation should be integrated inside an orienta-
tion to individuals involved, where different
a) The distinctive application fields of the dis- types and different audiences exist and, of
cipline are the programs, personnel, achieve- course, the users in the sense of Scriven is one
ments, products, projects, administration, and of the most important, but it seems to us that
the metaevaluación of everything. evaluative research today has a more plural
high-priority orientation than that which is
b) Evaluation research impacts in all types of
defended by this author.
disciplines and the resulting practices.
c) Evaluation research is conducted on levels, Bibliographic references
from practical to conceptual. Ahman, S. J. y Cook, M. D. (1967). Evaluating
Pupil Growth. Principles of Tests Measure-
d) The different fields of evaluation research ment. Boston, Ma.: Allyn and Bacon
have many levels of interconnection and over- Alkin, M. (1969). Evaluation theory develop-
lapping. The evaluation of programs, per- ment. Evaluation Comment, 2, 1, 2-7.
sonal, centers, etc., have many aspects in
Alkin, M. (1991). Evaluation theory develop-
common.
ment: II. En M. McLaughlin, y Phillips (Eds.),
The fourth distinct element of the transdiscipli- Evaluation and education at quarter century
nary perspective of evaluation is that it concen- (pp.91-112). Chicago: NSSE/ University of
trates on a technical vision. Evaluation not only Chicago .
requires technical support from many other dis-
ciplines, but it includes its own unique method-


Anderson, S. B. and Ball, S. (1983). The profession and practice of program evaluation. San Francisco, Ca.: Jossey-Bass.
Arnal, J., Del Rincon, D. and Latorre, A. (1992). Investigación educativa. Fundamentos y Metodología. Barcelona: Labor.
Atkin, J. M. (1968). Behavioral objectives in curriculum design: A cautionary note. The Science Teacher, 35, 27-30.
Baker, L. R. (1969). Curriculum evaluation. Review of Educational Research, 39, 339-358.
Barbier, J. M. (1993). La evaluación en los procesos de formación. Barcelona: Paidos.
Berk, A. R. (Ed.) (1981). Educational evaluation methodology: The state of the art. Baltimore: The Johns Hopkins University Press.
Blanco, F. (1994). La evaluación en la educación secundaria. Salamanca: Amasú Ediciones.
Bloom, B. S., Engelhart, M. D., Furst, E. J., Hill, W. H. and Krathwohl, D. R. (1956). Taxonomy of educational objectives. Handbook I: Cognitive domain. New York: David McKay.
Bloom, B. S., Hastings, J. Th. and Madaus, F. (1971). Handbook on formative and summative evaluation of student learning. New York: McGraw-Hill.
Bloom, B. S., Hastings, T. and Madaus, G. (1975). Evaluación del aprendizaje. Buenos Aires: Troquel.
Bonboir, A. (1972). La Docimologie. Paris: PUF.
Cabrera, F. (1986). Proyecto docente sobre técnicas de medición y evaluación educativas. Barcelona: Universidad de Barcelona.
Carreño, H. F. (1977). Enfoques y principios teóricos de la evaluación. Mexico: Trillas.
Castillo, S. and Gento, S. (1995). Modelos de evaluación de programas educativos. In A. Medina and L. M. Willar (Coord.), Evaluación de programas educativos, centros y profesores (pp. 25-69). Madrid: Editorial Universitas, S. A.
Chelimsky, E. (1998). The role of experience in formulating theories of evaluation practice. American Journal of Evaluation, 19, 1, 35-55.
Coffman, W. E. (1971). Essay examinations. In R. L. Thorndike (Ed.), Educational Measurement. Washington, DC: American Council on Education.
Cronbach, L. J. (1963). Course improvement through evaluation. Teachers College Record, 64, 672-683.
Cronbach, L. J. (1982). Designing evaluations of educational and social programs. Chicago: Jossey-Bass.
Cronbach, L. J., Ambron, S. R., Dornbusch, S. M., Hess, R. D., Hornik, R. C., Phillips, D. C., Walker, D. F. and Weiner, S. S. (1980). Towards reform in program evaluation: Aims, methods and institutional arrangements. San Francisco: Jossey-Bass.
Cronbach, L. J. and Suppes, P. (1969). Research for tomorrow's schools: Disciplined inquiry for education. New York: Macmillan.
Darling-Hammond, L., Wise, A. E. and Pease, S. R. (1989). Teacher evaluation in the organizational context: A review of the literature. In E. R. House (Ed.), New directions in educational evaluation (pp. 203-253). London: The Falmer Press.
De la Orden, A. (1985). Investigación evaluativa. In Arturo De la Orden (Ed.), Investigación educativa. Diccionario de Ciencias de la Educación (pp. 133-137). Madrid: Anaya.
De Miguel, M. (1989). Modelos de investigación sobre organizaciones educativas. Revista de Investigación Educativa, 7, 13, 21-56.
Dubois, P. H. (1970). A History of Psychological Testing. Boston: Allyn and Bacon.
Ebel, R. L. (1977). Fundamentos de la medición educacional. Buenos Aires: Guadalupe.
Eisner, E. W. (1967). Educational objectives: Help or hindrance? The School Review, 75, 250-260.
Eisner, E. W. (1969). Instructional and expressive educational objectives: their formulation and use in curriculum. In J. Popham (Ed.), Instructional objectives (pp. 1-18). Chicago: AERA.
Eisner, E. (1971). Emerging models for educational evaluation. School Review, 2.
Eisner, E. W. (1975). The perceptive eye: Toward the reformation of educational evaluation. Stanford, Ca.: Stanford Evaluation Consortium.
Eisner, E. W. (1981). The methodology of qualitative evaluation: the case of educational connoisseurship and educational criticism. Stanford, Ca.: Stanford University.
Eisner, E. W. (1985). The art of educational evaluation. London: The Falmer Press.
Escudero, T. (1993). Enfoques modélicos en la evaluación de la enseñanza universitaria. Actas de las III Jornadas Nacionales de Didáctica Universitaria «Evaluación y Desarrollo Profesional» (pp. 5-59). Las Palmas: Servicio de Publicaciones, Universidad de Las Palmas.
Escudero, T. (1996). Proyecto docente e investigador. Zaragoza: Universidad de Zaragoza.
Fernández Ballesteros, R. (1981). Perspectivas históricas de la evaluación conductual. In R. Fernández and J. A. I. Carrobles (Eds.), Evaluación conductual. Madrid: Ediciones Pirámide.
Fernández de Castro, J. (1973). La enseñanza programada. Madrid: CSIC.
Fernández, J. (1991). La evaluación de la calidad docente. In A. Medina (Coord.), Teoría y métodos de evaluación. Madrid: Cincel.
Fetterman, D. M. (1994). Empowerment evaluation. Evaluation Practice, 15, 1, 1-15.
Gagne, R. M. (1971). Las condiciones del aprendizaje. Madrid: Aguilar.
Gil, E. (1992). El sistema educativo de la Compañía de Jesús. La «Ratio Studiorum». Madrid: UPCO.
Glaser, B. G. (1978). Theoretical sensitivity. Mill Valley, Ca.: Sociology Press.
Glaser, B. G. and Strauss, A. L. (1967). The discovery of grounded theory. Chicago: Aldine.
Glaser, R. (1963). Instructional technology and the measurement of learning outcomes: some questions. American Psychologist, 18, 519-521.
Glaser, R. (Dir.) (1965). Teaching machines and programmed learning. Washington: National Education Association.
Gronlund, N. E. (1985). Measurement and evaluation in teaching. New York: Macmillan.
Guba, E. G. and Lincoln, Y. S. (1982). Effective evaluation. San Francisco: Jossey-Bass Publishers.
Guba, E. G. and Lincoln, Y. S. (1989). Fourth Generation Evaluation. Newbury Park, Ca.: Sage Publications.
Gulliksen, H. (1950). Theory of mental tests. New York: Wiley.
Hambleton, R. K. (1985). Criterion-referenced measurement. In Encyclopedia of Educational Research. New York: Macmillan.
Hammond, R. L. (1983). Evaluation at the local level. In B. R. Worthen and J. R. Sanders, Educational Evaluation: Theory and Practice. Worthington, Ohio: Charles A. Jones Publishing Company.
Hernández, F. (1993). Proyecto docente e investigador. Murcia.
Horowitz, R. (1995). A 75-year legacy on assessment: Reflections from an interview with Ralph W. Tyler. The Journal of Educational Research, 89, 2, 68-75.
House, E. R. (1983). How we think about evaluation. In E. R. House, Philosophy of evaluation. San Francisco, Ca.: Jossey-Bass.
House, E. R. (1989). Evaluating with validity. Newbury Park, Ca.: Sage.
Joint Committee on Standards for Educational Evaluation (1981). Standards for evaluations of educational programs, projects, and materials. New York: McGraw-Hill.
Joint Committee on Standards for Educational Evaluation (1988). The personnel evaluation standards. Newbury Park, Ca.: Sage.
Keefe, J. (1994). School evaluation using the CASE-IMS model and improvement process. Studies in Educational Evaluation, 20, 1, 55-67.
Kellaghan, T. (1982). La evaluación educativa. Bogotá: Universidad Pontificia Javeriana.
Kidder, L. H. (1981). Qualitative research and quasi-experimental frameworks. In M. B. Brewer and B. E. Collins (Eds.), Scientific inquiry and the social sciences. San Francisco: Jossey-Bass.
Kogan, M. (Ed.) (1989). Evaluating higher education. London: Jessica Kingsley Publishers.
Krathwohl, D. R., Bloom, B. S. and Masia, B. B. (1964). Taxonomy of educational objectives. Handbook II: Affective domain. New York: David McKay.


Lewy, A. (Ed.) (1976). Manual de evaluación formativa del currículo. Bogotá: Voluntad/Unesco.
Lincoln, Y. S. and Guba, E. G. (1985). Naturalistic inquiry. Beverly Hills, Ca.: Sage.
Lincoln, Y. S. and Guba, E. G. (1986). But is it rigorous? Trustworthiness and authenticity in naturalistic evaluation. In D. D. Williams (Ed.), Naturalistic evaluation. San Francisco: Jossey-Bass.
Lincoln, Y. S. and Guba, E. G. (1988). Criteria for assessing naturalistic inquiries as products. Paper presented at the American Educational Research Association, New Orleans, LA.
Lindquist, E. F. (1953). Design and Analysis of Experiments in Psychology and Education. Boston, Ma.: Houghton Mifflin Company.
Lindvall, C. M. (Ed.) (1964). Defining educational objectives. Pittsburgh: University of Pittsburgh Press.
MacDonald, B. (1971). The evaluation of the Humanities Curriculum Project: a holistic approach. Theory into Practice, 10, 3, 163-169.
MacDonald, B. (1976). Evaluation and the control of education. In D. Tawney (Ed.), Curriculum evaluation today: trends and implications (pp. 125-136). London: Macmillan.
Madaus, G. F. et al. (1991). Evaluation Models. Viewpoints on Educational and Human Services Evaluation. Hingham, Mass.: Kluwer-Nijhoff Publishing.
Mager, R. F. (1962). Preparing instructional objectives. Palo Alto, Ca.: Fearon.
Mager, R. F. (1973). Análisis de metas. México: Trillas.
Mann, H. (1845). Boston Grammar and Writing Schools. Common School Journal, October, 7, 19.
Martínez de Toda, M. J. (1991). Metaevaluación de necesidades educativas: Hacia un sistema de normas. Doctoral thesis. Madrid: Universidad Complutense.
Mateo, J. (1986). Proyecto docente e investigador. Barcelona: Universidad de Barcelona.
Mateo, J. et al. (1993). La evaluación en el aula universitaria. Zaragoza: ICE-Universidad de Zaragoza.
McCall, W. A. (1920). A new kind of school examination. Journal of Educational Research, January.
McReynolds, P. (1975). Advances in Psychological Assessment, vol. III. San Francisco: Jossey-Bass.
Mertens, D. M. (1999). Inclusive evaluation: Implications of transformative theory for evaluation. American Journal of Evaluation, 20, 1, 1-14.
Metfessel, N. S. and Michael, W. B. (1967). A paradigm involving multiple criterion measures for the evaluation of effectiveness of school programs. Educational and Psychological Measurement, 27, 931-943.
Morgan, G. (1983). Beyond method. Beverly Hills, Ca.: Sage.
Nevo, D. (1983). The conceptualization of educational evaluation: An analytical review of the literature. Review of Educational Research, 53, 1, 117-128.
Nevo, D. (1989). The conceptualization of educational evaluation: An analytical review of the literature. In E. R. House (Ed.), New directions in educational evaluation (pp. 15-29). London: The Falmer Press.
Norris, N. (1993). Understanding educational evaluation. London: Kogan Page/CARE, School of Education, University of East Anglia.
Nowakowski, J., Bunda, M. A. and Working, R. (1985). A Handbook of Educational Variables. Boston: Kluwer-Nijhoff.
Nunnally, J. C. (1978). Psychometric Theory. New York: McGraw-Hill.
Owens, T. R. (1973). Educational evaluation by adversary proceedings. In E. R. House (Comp.), School Evaluation: The Politics and Process. Berkeley: McCutchan.
Parlett, M. and Hamilton, D. (1977). Evaluation as illumination: A new approach to the study of innovative programmes. In D. Hamilton et al. (Eds.), Beyond the numbers game. London: Macmillan.
Patton, M. Q. (1980). Qualitative Evaluation Methods. Beverly Hills, Ca.: Sage.
Pérez, A. (1983). Modelos contemporáneos de evaluación. In J. Gimeno and A. Pérez, La enseñanza: su teoría y su práctica (pp. 426-449). Madrid: Akal.
Phillips, R. C. (1974). Evaluation in education. Columbus, Ohio: Merrill.
Pieron, H. (1968). Vocabulaire de la psychologie. Paris: PUF.
Pieron, H. (1969). Examens et Docimologie. Paris: PUF.
Planchard, E. (1960). La investigación pedagógica. Madrid: Ediciones Fas.
Popham, W. J. (1970). Establishing instructional goals. Englewood Cliffs, N.J.: Prentice Hall.
Popham, W. J. (1980). Problemas y técnicas de la evaluación educativa. Madrid: Anaya.
Popham, W. J. (1983). Evaluación basada en criterios. Madrid: Magisterio Español, S. A.
Popham, W. and Baker, E. L. (1970). Systematic Instruction. Englewood Cliffs, N.J.: Prentice Hall.
Provus, M. (1971). Discrepancy evaluation. For educational program improvement and assessment. Berkeley, Ca.: McCutchan Publishing Co.
Rivlin, A. M. (1971). Systematic thinking for social action. Washington, D.C.: Brookings Institution.
Rodríguez, T. et al. (1995). Evaluación de los aprendizajes. Aula Abierta Monografías 25. Oviedo: ICE-Universidad de Oviedo.
Rosenthal, J. E. (1976). Evaluation history. In S. B. Anderson et al. (Eds.), Encyclopedia of Educational Evaluation. San Francisco: Jossey-Bass Publishers.
Rossi, P. H. and Freeman, H. (1993). Evaluation: A Systematic Approach. Beverly Hills, Ca.: Sage.
Rossi, P. H. et al. (1979). Evaluation: A systematic approach. Beverly Hills, Ca.: Sage.
Russell, N. and Willinsky, J. (1997). Fourth generation educational evaluation: The impact of a post-modern paradigm on school based evaluation. Studies in Educational Evaluation, 23, 3, 187-199.
Rutman, L. (Ed.) (1984). Evaluation research methods: A basic guide. Beverly Hills, Ca.: Sage.
Rutman, L. and Mowbray, G. (1983). Understanding program evaluation. Beverly Hills, Ca.: Sage.
Salvador, L. (1992). Proyecto docente. Universidad de Cantabria.
Scriven, M. (1967). The methodology of evaluation. In Perspectives of Curriculum Evaluation (pp. 39-83). AERA Monograph 1. Chicago: Rand McNally and Company.
Scriven, M. (1973). Goal-free evaluation. In E. R. House (Ed.), School evaluation: The politics and process (pp. 319-328). Berkeley, Ca.: McCutchan.
Scriven, M. (1974). Pros and cons about goal-free evaluation. Evaluation Comment, 3, 1-4.
Scriven, M. (1980). The logic of evaluation. Inverness, Ca.: Edgepress.
Scriven, M. (1991). Evaluation Thesaurus. Newbury Park, Ca.: Sage.
Scriven, M. (1991a). Duties of the teacher. Kalamazoo, Mi.: Center for Research on Educational Accountability and Teacher Evaluation.
Scriven, M. (1994). Evaluation as a discipline. Studies in Educational Evaluation, 20, 1, 147-166.
Scriven, M. (1998). Minimalist theory: The least theory that practice requires. American Journal of Evaluation, 19, 1, 57-70.
Smith, E. R. and Tyler, R. W. (1942). Appraising and recording student progress. New York: Harper & Row.
Smith, N. L. and Haver, D. M. (1990). The applicability of selected evaluation models to evolving investigative designs. Studies in Educational Evaluation, 16, 3, 489-500.
Stacher, B. M. and Davis, W. A. (1990). How to Focus on Evaluation. Newbury Park, Ca.: Sage.
Stake, R. E. (1967). The countenance of educational evaluation. Teachers College Record, 68, 523-540.
Stake, R. E. (1975a). Program evaluation: particularly responsive evaluation. Occasional Paper, 5. University of Western Michigan.
Stake, R. E. (1975b). Evaluating the arts in education: A responsive approach. Ohio: Merrill.
Stake, R. E. (1976). A theoretical statement of responsive evaluation. Studies in Educational Evaluation, 2, 19-22.


Stake, R. E. (1981). Setting standards for educational evaluators. Evaluation News, 22, 148-152.
Stake, R. E. (1986). Quieting reform. Urbana: University of Illinois Press.
Stenhouse, L. (1984). Investigación y desarrollo del curriculum. Madrid: Morata.
Stronge, J. H. and Helm, V. M. (1991). Evaluating professional support personnel in educational settings. Newbury Park, Ca.: Sage.
Stufflebeam, D. L. (1966). A depth study of the evaluation requirement. Theory into Practice, 5, 3, 121-134.
Stufflebeam, D. L. (1994). Introduction: Recommendations for improving evaluations in U.S. public schools. Studies in Educational Evaluation, 20, 1, 3-21.
Stufflebeam, D. L. (1998). Conflicts between standards-based and postmodernist evaluations: Toward rapprochement. Journal of Personnel Evaluation in Education, 12, 3, 287-296.
Stufflebeam, D. L. (1999). Using professional standards to legally and ethically release evaluation findings. Studies in Educational Evaluation, 25, 4, 325-334.
Stufflebeam, D. L. (2000). Guidelines for developing evaluation checklists. Retrieved from www.wmich.edu/evalctr/checklists/ on December 15, 2002.
Stufflebeam, D. L. (2001). The metaevaluation imperative. American Journal of Evaluation, 22, 2, 183-209.
Stufflebeam, D. L., Foley, W. J., Gephart, W. J., Guba, E. G., Hammond, R. L., Merriman, H. O. and Provus, M. M. (1971). Educational Evaluation and Decision-making. Itasca, Illinois: F. E. Peacock Publishing.
Stufflebeam, D. L. and Shinkfield, A. J. (1987). Evaluación sistemática. Guía teórica y práctica. Barcelona: Paidos/MEC.
Suchman, E. A. (1967). Evaluative Research: Principles and Practice in Public Service and Social Action Programs. New York: Russell Sage Foundation.
Sunberg, N. D. (1977). Assessment of person. Englewood Cliffs, N.J.: Prentice Hall.
Taba, H. (1962). Curriculum development. Theory and practice. New York: Harcourt Brace.
Thorndike, E. L. (1904). An Introduction to the Theory of Mental and Social Measurements. New York: Teachers College Press, Columbia University.
Tyler, R. W. (1950). Basic principles of curriculum and instruction. Chicago: University of Chicago Press.
Tyler, R. W. (1967). Changing concepts of educational evaluation. In R. E. Stake (Comp.), Perspectives of curriculum evaluation. AERA Monograph Series Curriculum Evaluation, 1. Chicago: Rand McNally.
Tyler, R. W. (Ed.) (1969). Educational evaluation: New roles, new means. Chicago: University of Chicago Press.
Walberg, H. J. and Haertel, G. D. (Eds.) (1990). The International Encyclopedia of Educational Evaluation. Oxford: Pergamon Press.
Webster, W. J. and Edwards, M. E. (1993). An accountability system for school improvement. Paper presented at the annual meeting (April) of the AERA, Atlanta, GA.
Webster, W. J., Mendro, R. L. and Almaguer, T. O. (1994). Effectiveness indices: A «value added» approach to measuring school effect. Studies in Educational Evaluation, 20, 1, 113-137.
Weiss, C. H. (1983). Investigación evaluativa. Métodos para determinar la eficiencia de los programas de acción. México: Trillas.
Weiss, C. H. (1998). Have we learned anything new about the use of evaluation? American Journal of Evaluation, 19, 1, 21-33.
Wilson, J. A. R. (1978). La evaluación de los objetivos. In J. A. R. Wilson (Ed.), Fundamentos psicológicos del aprendizaje y la enseñanza (pp. 549-578). Madrid: Anaya.
Wolf, R. L. (1974). The citizen as jurist: A new model of educational evaluation. Citizen Action in Education, 4.
Wolf, R. L. (1975). Trial by jury: a new evaluation method. Phi Delta Kappan, 57, 185-187.
Worthen, B. R. and Sanders, J. R. (1973). Educational Evaluation: Theory and Practice. Worthington, Ohio: Charles A. Jones Publishing Company.


Worthen, B. R. and Sanders, J. R. (1991). The changing face of educational evaluation. Theory into Practice, XXX, 1, 3-12.
Zeller, N. C. (1987). A rhetoric for naturalistic inquiry. Ph.D. dissertation, Indiana University.

ABOUT THE AUTHORS / SOBRE LOS AUTORES

Tomás Escudero Escorza (tescuder@unizar.es). Head of the Department of Methods of Research and Diagnostics in Education at the University of Zaragoza. He is a member of the Commission for Technical Coordination of Quality Planning of Universities. He has authored a large number of works and publications in the field of educational evaluation, particularly institutional evaluation.

ARTICLE RECORD / FICHA DEL ARTÍCULO

Reference: Escudero, Tomás (2003). Desde los tests hasta la investigación evaluativa actual. Un siglo, el XX, de intenso desarrollo de la evaluación en educación. Revista ELectrónica de Investigación y EValuación Educativa (RELIEVE), v. 9, n. 1. http://www.uv.es/RELIEVE/v9n1/RELIEVEv9n1_1.htm. Retrieved on (insert date).

Title: Desde los tests hasta la investigación evaluativa actual. Un siglo, el XX, de intenso desarrollo de la evaluación en educación. [From tests to current evaluative research. One century, the XXth, of intense development of evaluation in education]

Authors: Tomás Escudero Escorza

Translator: Laura M. McLeod

Journal: Revista ELectrónica de Investigación y EValuación Educativa (RELIEVE), v. 9, n. 1

ISSN: 1134-4032

Publication date: 2003 (Reception date: January 10, 2003; Publication date: February 27, 2003)

Abstract: This article presents a critical review and state of the art of the historical development of educational evaluation during the XXth century. The main theoretical proposals are discussed.

Keywords: Evaluation, Evaluation Research, Evaluation Methods, Formative Evaluation, Summative Evaluation, Testing, Program Evaluation.

Institution: Universidad de Zaragoza (Spain)

Publication site: http://www.uv.es/RELIEVE

Language: Spanish and English versions (title, abstract and keywords in English). English version added March 21, 2006.




Revista ELectrónica de Investigación y EValuación Educativa (RELIEVE) [ISSN: 1134-4032]

© Copyright 2002, RELIEVE. Reproduction and distribution of this article are authorized provided that the content is not modified and its origin is indicated (RELIEVE journal, volume, number and electronic address of the document).
