Theory Into Practice, Volume 32, Number 3, Summer 1993, pp. 179-186
Published by Routledge (Taylor & Francis). Journal homepage: http://www.tandfonline.com/loi/htip20
Critical Thinking Assessment

Robert H. Ennis
Professor of Education, University of Illinois at Urbana-Champaign

To cite this article: Robert H. Ennis (1993). Critical thinking assessment. Theory Into Practice, 32(3), 179-186. DOI: 10.1080/00405849309543594
Link: http://dx.doi.org/10.1080/00405849309543594
Copyright 1993 College of Education, The Ohio State University.

Although critical thinking has often been urged as a goal of education throughout most of this century (for example, John Dewey's How We Think, 1910; and the Educational Policies Commission's The Central Purpose of American Education, 1961), not a great deal has been done about it. Since the early 1980s, however, attention to critical thinking instruction has increased significantly—with some spillover to critical thinking assessment, an area that has been neglected even more than critical thinking instruction.

Partly as a result of this neglect, the picture I paint of critical thinking assessment is not all rosy, though there are some bright spots. More explicitly, my major theme is that, given our current state of knowledge, critical thinking assessment, albeit difficult to do well, is possible. Two subthemes are that (a) the difficulties and possibilities vary with the purpose of the critical thinking assessment and the format used, and (b) there are numerous traps for the unwary.

In pursuit of these themes, I consider some possible purposes in attempting to assess critical thinking, note some traps, list and comment on available critical thinking tests (none of which suit all of the purposes), and finish with suggestions for how to develop your own critical thinking assessment, including a discussion of some major formats. But first, some attention must be paid to the definition of critical thinking, because critical thinking assessment requires that we be clear about what we are trying to assess.

Defining Critical Thinking

The upper three levels of Bloom's taxonomy of educational objectives (analysis, synthesis, and evaluation) are often offered as a definition of critical thinking. Sometimes the next two levels (comprehension and application) are added. This conception is a good beginning, but it has problems. One is that the levels are not really hierarchical, as suggested by the theory, but rather are interdependent. For example, although synthesis and evaluation generally do require analysis, analysis generally requires synthesis and evaluation (Ennis, 1981).

More significantly, given our concern here, the three (or five) concepts are too vague to guide us in developing and judging critical thinking assessment. Consider analysis, for example. What do you assess when you test for ability to analyze? The difficulty becomes apparent when we consider the following variety of things that can be labeled "analysis": analysis of the political situation in the Middle East, analysis of a chemical substance, analysis of a word, analysis of an argument, and analysis of the opponent's weaknesses in a basketball game. What testable thing do all these activities have in common? None, except for the vague principle that it is often desirable to break things into parts.

A definition of critical thinking that I at one time endorsed is that critical thinking is the correct assessing of statements (Ennis, 1962). If I had not elaborated this definition, it would be as vague as Bloom's taxonomy. But even when elaborated, it suffers from excluding creative aspects of critical thinking, such as conceiving of alternatives, formulating hypotheses and definitions, and developing plans for experiments. I now think the contemporary conception of critical thinking includes these things, so the "correct assessing" definition is more narrow than standard usage, and thus could interfere with communication among proponents of critical thinking.

The following definition seems to be more in accord with contemporary usage and thus, I hope, will minimize confusion in communication: "Critical thinking is reasonable reflective thinking focused on deciding what to believe or do." As it stands, however, this definition is also as vague as Bloom's taxonomy. It too needs elaboration. Here is an abridgment of the elaborations I have provided and defended elsewhere (Ennis, 1987, 1991, in press):

In reasonably and reflectively going about deciding what to believe or do, a person characteristically needs to do most of these things (and do them interdependently):

1. Judge the credibility of sources.
2. Identify conclusions, reasons, and assumptions.
3. Judge the quality of an argument, including the acceptability of its reasons, assumptions, and evidence.
4. Develop and defend a position on an issue.
5. Ask appropriate clarifying questions.
6. Plan experiments and judge experimental designs.
7. Define terms in a way appropriate for the context.
8. Be open-minded.
9. Try to be well informed.
10. Draw conclusions when warranted, but with caution.

This interdependent list of abilities and dispositions can provide some specificity for guiding critical thinking testing. The elaborations, of which the list is an abridgment, are more thorough, but the simplicity of this list can make it useful. It can serve as a set of goals for an entire critical thinking curriculum or as a partial set of goals for some subject-matter or other instructional sequence. It can be the basis for a table of specifications for constructing a critical thinking test. (A table of specifications provides the areas that a test is supposed to assess and indicates the weighting assigned to each.)
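To make the idea concrete, a table of specifications can be represented as a simple mapping from aspects to weights. The following minimal sketch (in Python) abridges some aspect labels from the list above; the weights and the total item count are invented for illustration and would be set by the test developer's own priorities.

    # A minimal sketch of a table of specifications for a critical
    # thinking test. Aspect labels are abridged from the list above;
    # the weights are invented for illustration.
    SPECIFICATIONS = {
        "judging the credibility of sources":            0.20,
        "identifying conclusions, reasons, assumptions": 0.20,
        "judging the quality of an argument":            0.25,
        "defining terms appropriately for the context":  0.10,
        "planning and judging experiments":              0.10,
        "drawing conclusions cautiously":                0.15,
    }

    # The weights should exhaust the intended coverage of the test.
    assert abs(sum(SPECIFICATIONS.values()) - 1.0) < 1e-9

    def items_per_aspect(total_items: int) -> dict:
        """Translate the weights into a rough allocation of items."""
        return {aspect: round(weight * total_items)
                for aspect, weight in SPECIFICATIONS.items()}

    print(items_per_aspect(40))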
The elaboration also can be used as a guide in judging the extent to which an existing critical thinking test is comprehensive, and whether it assesses critical thinking at all. One of my chief criticisms of most existing critical thinking tests is their lack of comprehensiveness. For example, they typically fail to test for such important things as being open minded, and many even fail to test for judging the credibility of sources. Without some defensible conception of critical thinking, judgments about tests are likely to be erratic—or worse.

Two other well-known definitions of critical thinking are McPeck's "reflective skepticism" (1981, p. 7) and Paul's "strong sense" definition (1987). Paul's definition is similar in broad outline to the definition proposed here, but emphasizes more heavily being aware of one's own assumptions and seeing things from others' points of view. However, neither of these definitions provides sufficient elaboration for developing critical thinking tests. Furthermore, McPeck's definition is negative. Critical thinking must get beyond skepticism.

Purposes of Critical Thinking Assessment

Not only must we have a defensible elaborated definition of critical thinking when selecting, criticizing, or developing a test, we must also have a clear idea of the purpose for which the test is to be used. A variety of possible purposes exist, but no one test or assessment procedure fits them all. Here are some major possible purposes, accompanied by comments:

1. Diagnosing the levels of students' critical thinking. If we are to know where to focus our instruction, we must "start with where they are" in specific aspects of critical thinking. Tests can be helpful in this respect by showing specific areas of strength and weakness (for example, ability to identify assumptions; a sketch of such aspect-level diagnosis follows this list).

2. Giving students feedback about their critical thinking prowess. If students know their specific strengths and weaknesses, their attempts to improve can be better focused.

3. Motivating students to be better at critical thinking. Though frequently misused as a motivational device, tests can and do motivate students to learn the material they expect to be covered on the test. If critical thinking is omitted from tests, test batteries, or other assessment procedures, students will tend to neglect it (Smith, 1991; Shepard, 1991).

4. Informing teachers about the success of their efforts to teach students to think critically. Teachers can use tests to obtain feedback about their instruction in critical thinking.

5. Doing research about critical thinking instructional questions and issues. Without careful comparison of a variety of approaches, the difficult issues in critical thinking instruction and curriculum organization cannot be answered. But this research requires assessment, so that comparisons can be made.

6. Providing help in deciding whether a student should enter an educational program. People in some fields already use assessed critical thinking prowess to help make admissions decisions. Examples are medicine, nursing, law, and graduate school in general. The idea seems good, but the efficacy of existing efforts in selecting better critical thinkers has not been established. Research needs to be done in this area.

7. Providing information for holding schools accountable for the critical thinking prowess of their students. A currently popular purpose for testing, including critical thinking testing, is to pressure schools and teachers to "measure up" by holding them accountable for the test results of their students.
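As an illustration of the diagnostic purpose (purpose 1), the following minimal sketch computes aspect-level subscores from a mapping of test items to the aspects they assess. The item-to-aspect tags and the student's responses are invented; any real mapping would come from the test's own table of specifications.

    # A minimal sketch of aspect-level diagnosis: each item is tagged
    # with the aspect it tests, and per-aspect subscores reveal
    # strengths and weaknesses. All data here are hypothetical.
    from collections import defaultdict

    ITEM_ASPECTS = {1: "credibility", 2: "credibility",
                    3: "assumptions", 4: "assumptions",
                    5: "deduction",   6: "deduction"}

    def diagnose(responses: dict) -> dict:
        """responses maps item number -> True (correct) / False."""
        right = defaultdict(int)
        total = defaultdict(int)
        for item, correct in responses.items():
            aspect = ITEM_ASPECTS[item]
            total[aspect] += 1
            right[aspect] += int(correct)
        return {aspect: right[aspect] / total[aspect] for aspect in total}

    # A hypothetical student: strong on credibility,
    # weak on assumption identification.
    print(diagnose({1: True, 2: True, 3: False, 4: False, 5: True, 6: False}))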
Purposes 6 and 7 typically constitute "high-stakes" testing, so called because much often depends on the results. The science reasoning section of the American College Test (ACT), much of the new Medical Colleges Admissions Test (MCAT), College Board Advanced Placement (AP) tests, the Iowa Test of Educational Development, and the analytic and logical reasoning sections of the Graduate Record Examination (GRE) and the Law School Aptitude Test (LSAT) are examples of high-stakes critical thinking tests.
Traps

In pursuing the above purposes, educators need to be aware of several traps, including the following:

1. Test results may be compared with norms, and the claim made that the difference, or similarity, is the result of instruction. There are usually other possible explanations of the result, such as neighborhood influences. Currently popular accountability testing invites us into this trap.

2. A pretest and a posttest may be given without comparing the class to a control group. The lack of a control group renders the pretest-to-posttest results dubious, since many things other than the instruction have happened to the students, and could account for the results (a sketch of the needed comparison follows this list).
3. The use of the same test for the pretest and posttest has the problem of alerting the students to the test questions. On the other hand, the use of different forms of (allegedly) the same test for pretest-posttest comparisons, given that the testing is for critical thinking, is probably worse, since different forms are actually different tests. Comparability is always suspect, since so much depends on the specific content of the test.

4. Most critical thinking tests are not comprehensive, especially those that are easiest to use, the multiple-choice tests. These tests typically miss much that is important in critical thinking.

5. Another problem in the use of (especially) multiple-choice tests lies in differences in background beliefs and assumptions between test maker and test taker. Since a critical thinker employs a grasp of the situation, different beliefs about the situation can sometimes result in justifiably different answers to test questions (see Norris & Ennis, 1989).

6. Significant results may be expected in too short a time period. Learning to think critically takes a long time. Much reflective practice with many examples in a variety of situations is required.

7. High-stakes purposes often interfere with the validity of a test. This is partly because they motivate cram schools, which teach students how to do well on the tests without the students' having the critical thinking prowess for which the test is supposedly testing. The students often learn tricks for taking the tests.

This interference with validity occurs also in part because the high-stakes situation pressures the test makers to avoid taking risks with items, the answers to which might be subject to challenge. So the pressure is for them to limit their testing to multiple-choice deductive-logic items of various sorts, that is, items in which the conclusion necessarily follows from the premises (thus limiting the test's comprehensiveness and content validity). Deductive-logic items are the most immune to complaint about the keyed answer.

8. Scarce resources (indicated by low assessment budgets and overworked teachers) often lead to compromises that affect test validity. Because of the expense of, and/or teacher grading time required for, the tests necessary to assess critical thinking, many testing programs have resorted to multiple-choice tests that are arguably less valid than short answer, essay, and performance tests of critical thinking.
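The comparison that trap 2 calls for can be made explicit with a small worked example. In the minimal sketch below, all scores are invented; the point is only that the instructed class's mean pretest-to-posttest gain is read against a control group's gain, since events other than the instruction also raise scores.

    # A minimal sketch of why a control group matters (trap 2).
    # All scores are invented for illustration.
    def mean(xs):
        return sum(xs) / len(xs)

    def mean_gain(pre, post):
        return mean(post) - mean(pre)

    instructed_gain = mean_gain(pre=[12, 14, 11, 15], post=[18, 19, 15, 20])
    control_gain    = mean_gain(pre=[13, 12, 14, 15], post=[15, 14, 16, 17])

    # The candidate estimate of the instructional effect is the
    # difference in gains, not the instructed group's raw gain alone.
    print(f"instructed gain: {instructed_gain:.1f}")
    print(f"control gain:    {control_gain:.1f}")
    print(f"difference:      {instructed_gain - control_gain:.1f}")

Even this difference in gains is only suggestive without random assignment, but it rules out the most obvious rival explanations.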
Published Critical Thinking Tests

Although a number of tests incorporate critical thinking (including the high-stakes tests just mentioned), only a few have critical thinking (or some aspect of critical thinking) as their primary concern. None exist for students below fourth grade.

This dearth of critical thinking tests is unfortunate; many more are needed to fit the various situations and purposes of critical thinking testing. In Table 1, I have attempted to identify all currently available published tests that have critical thinking as their primary concern. The tests are grouped according to whether they aim at a single aspect of critical thinking or more than one aspect. The essay test is more comprehensive than the others.

It would also make sense to group the tests according to whether they are subject specific or general-content based. Subject-specific critical thinking tests assess critical thinking within one standard subject matter area, whereas general-content-based critical thinking tests use content from a variety of areas with which test takers are presumed to be already familiar. A committee of the National Academy of Education has recommended that there be a strong effort to develop subject-specific higher order thinking tests (The Nation's Report Card, 1987, p. 54). A full understanding of any subject matter area requires that the person be able to think well in that area.

Regrettably, I can find no subject-specific critical thinking tests (that is, tests whose primary purpose is to assess critical thinking in a subject matter area), although parts of some tests (such as the ACT section on science reasoning) fit this criterion. So there is no subject-specific grouping in this listing of tests primarily committed to critical thinking. All of the tests listed here are general-content-based tests.

Unfortunately, the National Academy committee also recommended the neglect of general-content-based higher order thinking tests (p. 54). This is a mistake. We need general-content-based tests to check for transfer of critical thinking instruction to everyday life, regardless of whether thinking instruction is embedded in subject matter instruction or whether it is offered in a separate course or unit, or some combination of the two.

Since I am a coauthor of some of the listed tests, my conflict of interest in presenting and discussing this listing is obvious. I have tried not to let it interfere with my objectivity, but do recommend Arter and Salmon's Assessing Higher Order Thinking Skills: A Consumer's Guide (1987), which provides more extensive coverage. A general discussion of the problems, prospects, and methods of critical thinking testing can be found in Evaluating Critical Thinking (Norris & Ennis, 1989).

Since statistical information about tests can be misleading, it is important to make one's own informal judgment about the validity of the content. Persons who are seriously considering using any test should take the test and score it themselves. There is no better way to get a feel for the test's content validity. One should not depend solely on the name given to the test by the author and publisher. The following questions should be considered:

1. Is the test based on a defensible conception of critical thinking?
2. How comprehensive is its coverage of this conception?
3. Does it seem to do a good job at the level of your students?

Though these questions might seem obvious, they are often neglected.

In varying degrees, all of the listed tests can be used for the first five purposes specified earlier (all but the high-stakes purposes). Their use for high stakes is problematic for two reasons: (a) there is no security on the tests, so prospective examinees can secure copies, and (b) most of the tests are not sufficiently comprehensive to provide valid results in a high-stakes situation. Let me elaborate on this second problem.

As indicated earlier, existing multiple-choice tests do not directly and effectively test for many significant aspects of critical thinking, such as being open minded and drawing warranted conclusions cautiously. In response to this problem, some people will hold that the various aspects of critical thinking are correlated with each other, so the lack of direct testing of specific aspects does not matter. For example, being open minded correlates highly with judging the credibility of sources and identifying assumptions, making all of these good indicators of the others, so the argument goes.

However, when the stakes are high, people prepare for the content areas that are expected to be on the tests. Even though these content areas might correlate highly with other critical thinking aspects when the stakes are low, specific preparation for the expected aspects in order to deal with them on the tests will lower the correlations, destroying their validity as indirect measures of the missing aspects of critical thinking. The danger is to accept correlations obtained in low-stakes situations as representative of the correlations obtainable in a high-stakes situation.

A possible exception to this warning about the use of the listed tests for high-stakes situations is the critical thinking essay test, which does test more comprehensively than the others.
Table 1
An Annotated List of Critical Thinking Tests

Tests Covering More Than One Aspect of Critical Thinking

The California Critical Thinking Skills Test: College Level (1990) by P. Facione. The California Academic Press, 217 LaCruz Ave, Millbrae, CA 94030. Aimed at college students, but probably usable with advanced and gifted high school students. Incorporates interpretation, argument analysis and appraisal, deduction, mind bender puzzles, and induction (including rudimentary statistical inference).

Cornell Critical Thinking Test, Level X (1985) by R.H. Ennis and J. Millman. Midwest Publications, PO Box 448, Pacific Grove, CA 93950. Aimed at grades 4-14. Sections on induction, credibility, observation, deduction, and assumption identification.

Cornell Critical Thinking Test, Level Z (1985) by R.H. Ennis and J. Millman. Midwest Publications, PO Box 448, Pacific Grove, CA 93950. Aimed at advanced or gifted high school students, college students, and other adults. Sections on induction, credibility, prediction and experimental planning, fallacies (especially equivocation), deduction, definition, and assumption identification.

The Ennis-Weir Critical Thinking Essay Test (1985) by R.H. Ennis and E. Weir. Midwest Publications, PO Box 448, Pacific Grove, CA 93950. Aimed at grades 7 through college. Also intended to be used as a teaching material. Incorporates getting the point, seeing the reasons and assumptions, stating one's point, offering good reasons, seeing other possibilities (including other possible explanations), and responding to and avoiding equivocation, irrelevance, circularity, reversal of an if-then (or other conditional) relationship, overgeneralization, credibility problems, and the use of emotive language to persuade.

Judgment: Deductive Logic and Assumption Recognition (1971) by E. Shaffer and J. Steiger. Instructional Objectives Exchange, PO Box 24095, Los Angeles, CA 90024. Aimed at grades 7-12. Developed as a criterion-referenced test, but without specific standards. Includes sections on deduction, assumption identification, and credibility, and distinguishes between emotionally loaded content and other content.

New Jersey Test of Reasoning Skills (1983) by V. Shipman. Institute for the Advancement of Philosophy for Children, Test Division, Montclair State College, Upper Montclair, NJ 08043. Aimed at grades 4 through college. Incorporates the syllogism (heavily represented), assumption identification, induction, good reasons, and kind and degree.

Ross Test of Higher Cognitive Processes (1976) by J.D. Ross and C.M. Ross. Academic Therapy Publications, 20 Commercial Blvd., Novato, CA 94947. Aimed at grades 4-6. Sections on verbal analogies, deduction, assumption identification, word relationships, sentence sequencing, interpreting answers to questions, information sufficiency and relevance in mathematics problems, and analysis of attributes of complex stick figures.

Test of Enquiry Skills (1979) by B.J. Fraser. Australian Council for Educational Research Limited, Frederick Street, Hawthorn, Victoria 3122, Australia. Aimed at Australian grades 7-10. Sections on using reference materials (library usage, index, and table of contents); interpreting and processing information (scales, averages, percentages, proportions, charts and tables, and graphs); and (subject-specific) thinking in science (comprehension of science reading, design of experiments, conclusions, and generalizations).

Test of Inference Ability in Reading Comprehension (1987) by L.M. Phillips and C. Patterson. Institute for Educational Research and Development, Memorial University of Newfoundland, St. John's, Newfoundland, Canada A1B 3X8. Aimed at grades 6-8. Tests for ability to infer information and interpretations from short passages. Multiple choice version (by both authors) and constructed response version (by Phillips only).

Watson-Glaser Critical Thinking Appraisal (1980) by G. Watson and E.M. Glaser. The Psychological Corporation, 555 Academic Court, San Antonio, TX 78204. Aimed at grade 9 through adulthood. Sections on induction, assumption identification, deduction, judging whether a conclusion follows beyond a reasonable doubt, and argument evaluation.

Tests Covering Only One Aspect of Critical Thinking

Cornell Class Reasoning Test (1964) by R.H. Ennis, W.L. Gardiner, R. Morrow, D. Paulus, and L. Ringel. Illinois Critical Thinking Project, University of Illinois, 1310 S. 6th St., Champaign, IL 61820. Aimed at grades 4-14. Tests for a variety of forms of (deductive) class reasoning.

Cornell Conditional Reasoning Test (1964) by R.H. Ennis, W. Gardiner, J. Guzzetta, R. Morrow, D. Paulus, and L. Ringel. Illinois Critical Thinking Project, University of Illinois, 1310 S. 6th St., Champaign, IL 61820. Aimed at grades 4-14. Tests for a variety of forms of (deductive) conditional reasoning.

Logical Reasoning (1955) by A. Hertzka and J.P. Guilford. Sheridan Psychological Services, PO Box 6101, Orange, CA 92667. Aimed at high school and college students and other adults. Tests for facility with class reasoning.

Test on Appraising Observations (1983) by S.P. Norris and R. King. Institute for Educational Research and Development, Memorial University of Newfoundland, St. John's, Newfoundland, Canada A1B 3X8. Aimed at grades 7-14. Tests for ability to judge the credibility of statements of observation. Multiple choice and constructed response versions.

But it is not secure. Furthermore, it is more expensive in time and/or money than multiple-choice tests to administer and score. The problem is serious in high-stakes testing. We do not yet have inexpensive critical thinking testing usable for high stakes. Research and development are needed here.

The listed multiple-choice tests can, to varying degrees, be used for the first five listed lower stakes purposes: diagnosis, feedback, motivation, impact of teaching, and research. But discriminating judgment is necessary. For example, if a test is to be used for diagnostic purposes, it can legitimately only reveal strengths and weaknesses in aspects for which it tests. The less comprehensive the test, the less comprehensive the diagnosis.

For comprehensive assessment, unless appropriate multiple-choice tests are developed, open-ended assessment techniques are probably needed. Until the published repertoire of open-ended critical thinking tests increases considerably, and unless one uses the published essay test, or parts of other open-ended tests, such as College Board's advanced placement (AP) tests, it is necessary to make your own test.

Making Your Own Test

In making your own test, it is probably better that it be at least somewhat open ended anyway, since making good multiple-choice tests is difficult and time consuming, and requires a series of revisions, tryouts, and more revisions. Suggestions for making multiple-choice critical thinking items may be found in Norris and Ennis (1989), but I do not present them here, because open-ended assessment is better adapted to do-it-yourself test makers and can be more comprehensive. Norris and Ennis also make suggestions for open-ended assessment, and the discussion here grows out of that presentation.

Multiple-choice assessment is labor intensive in the construction and revision of the tests. Open-ended assessment is labor intensive in the grading, once one has developed a knack for framing questions. One promising approach is to give a multiple-choice item, thus assuring attention to a particular aspect of critical thinking, and to ask for a brief written defense of the selected answer to the item.

As in the previous example, open-ended assessment can be fairly well structured. Or it can be much less structured—in the form of naturalistic observation of a student. Greater structure usually means greater effort beforehand, but also greater assurance that there will be opportunities to assess specific aspects of critical thinking. Less structure generally requires greater effort during and after the observation, and gives the opportunity for more life-like situations, but provides less assurance that a broad range of specific aspects of critical thinking will be assessed. The sections that follow illustrate several types of open-ended critical thinking tests that teachers can make themselves.

Multiple Choice With Written Justification

In the Illinois Critical Thinking Project—in conjunction with the Alliance for Essential Schools in Illinois—we are currently exploring the use of the multiple-choice-plus-written-justification format. We have taken 20 items from the Cornell Critical Thinking Test, Level X, and requested a brief written justification of the student's answer to each. In the following example of an item focusing on judging the credibility of a source, the situation is the exploration of a newly discovered planet:

WHICH IS MORE BELIEVABLE? Circle one:
A. The health officer investigates further and says, "This water supply is safe to drink."
B. Several others are soldiers. One of them says, "This water supply is not safe."
C. A and B are equally believable.
YOUR REASON:

One advantage of this promising format is that specific aspects of critical thinking can be covered (including an aspect not effectively tested in existing multiple-choice tests: being appropriately cautious in the drawing of conclusions). Another advantage is that answers that differ from those in the key, if well defended, can receive full credit. Answers that differ from the key, as I noted earlier, are sometimes defensible, given that the test taker has different beliefs about the world than the test maker. We have found that high inter-rater consistency (.98) can be obtained if guides to scoring are carefully constructed and if the scorers are proficient in the same conception of critical thinking.
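The statistic behind the .98 figure is not specified here, so the following minimal sketch should be read only as one plausible illustration: it treats inter-rater consistency as the Pearson correlation between two raters' scores on the same set of papers. All scores are invented.

    # A minimal sketch of inter-rater consistency computed as a
    # Pearson correlation between two raters' scores on the same
    # papers. Pearson r is one common choice, assumed here for
    # illustration; all scores are invented.
    import math

    def pearson_r(xs, ys):
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
        sy = math.sqrt(sum((y - my) ** 2 for y in ys))
        return cov / (sx * sy)

    rater_1 = [18, 15, 12, 19, 9, 14, 17, 11]
    rater_2 = [17, 15, 13, 20, 8, 14, 18, 10]
    print(f"inter-rater consistency: {pearson_r(rater_1, rater_2):.2f}")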
I recommend this approach to making your own test. It is fairly quick, can be comprehensive, provides forgiveness for unrefined multiple-choice items, and allows for differences in student backgrounds and interpretation of items.

Essay Testing of Critical Thinking

Several approaches to making one's own essay test of critical thinking are viable, depending on the purpose.

High structure. The use of the argumentative essay to assess critical thinking can vary considerably in degree of structure. The Ennis-Weir Critical Thinking Essay Test is an example of a highly structured essay test. It provides an argumentative passage (a letter to an editor) with numbered paragraphs, most of which have specific built-in errors. Students are asked to appraise the thinking in each paragraph and the passage as a whole, and to defend their appraisals.

A scoring guide assigns a certain number of possible points to the appraisal of each paragraph and the passage as a whole, and provides guidance for the grader. But the grader must be proficient in critical thinking in order to handle responses that differ from standard responses in varying degrees. Responses that are radically different, if well defended, receive full credit. Scoring by proficient graders takes about 6 minutes per essay.

Medium structure. Structure can be reduced by providing an argumentative passage and requesting an argumentative response to the thesis of the passage and its defense—without specifying the organization of the response. College Board AP tests use this approach.

Scoring can be either holistic (one overall grade for the essay) or analytic (a grade for each of several criteria). Holistic scoring is quicker and thus less expensive. Proficient graders take roughly 1 or 2 minutes for a two-page essay. Analytic scoring gives more information and is more useful for most purposes. Proficient graders take roughly 3 to 6 minutes for a two-page essay, depending on how elaborate the criteria are.
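The difference between the two modes can be shown with a small sketch: a holistic score is a single judgment, while an analytic rubric yields a profile across criteria plus a total. The criterion names and point scales below are invented for illustration and are not the scoring guides discussed in this article.

    # A minimal sketch contrasting holistic and analytic scoring of
    # one essay. Criterion names and scores are invented.
    ANALYTIC_CRITERIA = ["grasp of the issue", "quality of reasons",
                         "attention to other possibilities",
                         "organization", "use of evidence",
                         "avoidance of fallacies"]

    def analytic_report(scores: dict) -> dict:
        """scores maps each criterion -> 0..5; returns the profile
        plus a total, which is what makes the result diagnostic."""
        assert set(scores) == set(ANALYTIC_CRITERIA)
        return {**scores, "total": sum(scores.values())}

    holistic_score = 4          # one overall grade on a 0..6 scale
    analytic_scores = analytic_report({
        "grasp of the issue": 5, "quality of reasons": 3,
        "attention to other possibilities": 2, "organization": 4,
        "use of evidence": 3, "avoidance of fallacies": 4,
    })
    print(holistic_score, analytic_scores)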
Minimal structure. Structure can be further reduced by providing only a question to be answered or an issue to be addressed. The Illinois Critical Thinking Essay Contest uses this approach (Powers, 1989). In one year, students were asked to take and defend a position about the possible regulation of music television, a topic of great interest to students. Reduced structure gives students more freedom, but provides teachers with less assurance of diagnostic information, not a problem for the essay contest. Again, either holistic or analytic scoring is possible.

At Illinois we are also using this format for the development of the Illinois Critical Thinking Essay Test. We developed a six-factor analytic scoring system, an adaptation of scoring guides developed by the Illinois State Board of Education, and have secured high inter-rater consistency (.94). This approach also looks promising. Grading takes us about 5 minutes per essay for essays written in 40 minutes of class time.

Performance Assessment

Performance assessment is the most expensive of all, since it requires considerable expert time devoted to each student. It has the greatest face validity for whatever is revealed, since the situations are more realistic—possibly real life situations. However, the greater the realism, the less assurance of comprehensiveness. In real life situations, people generally reveal only what the situation requires, and most observable situations do not require all aspects of critical thinking. So real-life performance assessment encounters a difficulty similar to one found in multiple-choice assessment: reduced comprehensiveness. Another possible danger in performance assessment is excessive subjectivity.

The least structured performance assessment is naturalistic observation, as in a case study (Stake, 1978). Here, a trained observer takes extensive notes describing a series of events and focuses on the activities of one person or group. Interpretation is inevitable, but "rich" description is the goal.

An example of slightly more structured performance assessment is the use of a student's portfolio of work to determine graduation from high school (recommended, for example, by Sizer in Horace's Compromise, 1984). Validity of this type of assessment is not yet established. It is an attractive idea, but many problems exist, including probable lack of comprehensiveness of critical thinking assessment.

A more structured performance assessment is exemplified by an exploratory assessment effort by the National Assessment of Educational Progress (Blumberg, Epstein, MacDonald, & Mullis, 1986). A student is given a variety of materials and asked to investigate what factors affect the rate at which sugar cubes dissolve. The observer asks questions and watches to see whether the student goes about the task scientifically. In this kind of performance assessment, structure is provided by the assignment of a task, which is designed to check things of interest.

Performance assessment seems valid on the face of it. Expense, possible lack of comprehensiveness, possible excessive subjectivity, and lengthy reports are dangers.

Summary

Critical thinking testing is possible for a variety of purposes. The higher the stakes and the greater the budgetary restraints, the fewer the purposes that can be served. In particular, comprehensiveness of coverage of aspects of critical thinking is threatened in high-stakes testing.

A number of published tests focus on critical thinking. Almost all are multiple-choice tests, an advantage for efficiency and cost, but currently not for comprehensiveness. More research and development are needed.

Viable alternatives include the addition of justification requests to multiple-choice items, essay testing with varying degrees of structure, and performance assessment. All are considerably more expensive than multiple-choice testing when used on a large scale, but on a small scale, they offer a feasible alternative in terms of validity and expense. However, grading them does take more time than grading a prepackaged multiple-choice test.

Note: The author deeply appreciates the helpful comments of Michelle Commeyras, Marguerite Finken, Stephen Norris, and Amanda Shepherd.

References

Arter, J.A., & Salmon, J.R. (1987). Assessing higher order thinking skills: A consumer's guide. Portland, OR: Northwest Regional Educational Laboratory.
Blumberg, F., Epstein, M., MacDonald, W., & Mullis, I. (1986). A pilot study of higher-order thinking skills assessment techniques in science and mathematics. Princeton, NJ: National Assessment of Educational Progress.
Dewey, J. (1910). How we think. Boston: D.C. Heath.
Educational Policies Commission. (1961). The central purpose of American education. Washington, DC: National Education Association.
Ennis, R.H. (1962). A concept of critical thinking. Harvard Educational Review, 29, 128-136.
Ennis, R.H. (1981). Eight fallacies in Bloom's taxonomy. In C.J.B. Macmillan (Ed.), Philosophy of education 1980 (pp. 269-273). Bloomington, IL: Philosophy of Education Society.
Ennis, R.H. (1987). A taxonomy of critical thinking dispositions and abilities. In J. Baron & R. Sternberg (Eds.), Teaching thinking skills: Theory and practice (pp. 9-26). New York: W.H. Freeman.
Ennis, R.H. (1991). Critical thinking: A streamlined conception. Teaching Philosophy, 14(1), 5-25.
Ennis, R.H. (in press). Critical thinking. Englewood Cliffs, NJ: Prentice-Hall.
McPeck, J.E. (1981). Critical thinking and education. New York: St. Martin's Press.
The nation's report card. (1987). Cambridge, MA: National Academy of Education, Harvard Graduate School of Education.
Norris, S.P., & Ennis, R.H. (1989). Evaluating critical thinking. Pacific Grove, CA: Midwest Publications.
Paul, R.W. (1987). Dialogical thinking: Critical thought essential to the acquisition of rational knowledge and passions. In J. Baron & R. Sternberg (Eds.), Teaching thinking skills: Theory and practice (pp. 127-148). New York: W.H. Freeman.
Powers, B. (Ed.). (1989). Illinois critical thinking annual. Champaign, IL: University of Illinois College of Education.
Shepard, L.A. (1991). Will national tests improve student learning? Phi Delta Kappan, 73, 232-238.
Sizer, T. (1984). Horace's compromise. Boston: Houghton Mifflin.
Smith, M.L. (1991). Put to the test: The effects of external testing on teachers. Educational Researcher, 20(5), 8-11.
Stake, R.E. (1978). The case study method in social inquiry. Educational Researcher, 7(2), 5-8.