Robert H. Ennis

Critical Thinking Assessment

ALTHOUGH critical thinking has often been urged as a goal of education throughout most of this century (for example, John Dewey's How We Think, 1910, and the Educational Policies Commission's The Central Purpose of American Education, 1961), not a great deal has been done about it. Since the early 1980s, however, attention to critical thinking instruction has increased significantly, with some spillover to critical thinking assessment, an area that has been neglected even more than critical thinking instruction. Partly as a result of this neglect, the picture I paint of critical thinking assessment is not all rosy, though there are some bright spots. More explicitly, my major theme is that, given our current state of knowledge, critical thinking assessment, albeit difficult to do well, is possible. Two subthemes are that (a) the difficulties and possibilities vary with the purpose of the critical thinking assessment and the format used, and (b) there are numerous traps for the unwary.

In pursuit of these themes, I consider some possible purposes in attempting to assess critical thinking, note some traps, list and comment on available critical thinking tests (none of which suit all of the purposes), and finish with suggestions for how to develop your own critical thinking assessment, including a discussion of some major formats. But first, some attention must be paid to the definition of critical thinking, because critical thinking assessment requires that we be clear about what we are trying to assess.

Robert H. Ennis is professor of education at the University of Illinois at Urbana-Champaign.

Defining Critical Thinking

The upper three levels of Bloom's taxonomy of educational objectives (analysis, synthesis, and evaluation) are often offered as a definition of critical thinking. Sometimes the next two levels (comprehension and application) are added. This conception is a good beginning, but it has problems. One is that the levels are not really hierarchical, as suggested by the theory, but rather are interdependent. For example, although synthesis and evaluation generally do require analysis, analysis generally requires synthesis and evaluation (Ennis, 1981). More significantly, given our concern here, the three (or five) concepts are too vague to guide us in developing and judging critical thinking assessment.

Consider analysis, for example. What do you assess when you test for ability to analyze? The difficulty becomes apparent when we consider the following variety of things that can be labeled "analysis": analysis of the political situation in the Middle East, analysis of a chemical substance, analysis of a word, analysis of an argument, and analysis of the opponent's weaknesses in a basketball game. What testable thing do all these activities have in common? None, except for the vague principle that it is often desirable to break things into parts.

A definition of critical thinking that I at one time endorsed is that critical thinking is the correct assessing of statements (Ennis, 1962). If I had not elaborated this definition, it would be as vague as Bloom's taxonomy. But even when elaborated, it suffers from excluding creative aspects of critical thinking, such as conceiving of alternatives,
formulating hypotheses and definitions, and developing plans for experiments. I now think the contemporary conception of critical thinking includes these things, so the "correct assessing" definition is narrower than standard usage, and thus could interfere with communication among proponents of critical thinking.

The following definition seems to be more in accord with contemporary usage and thus, I hope, will minimize confusion in communication: "Critical thinking is reasonable reflective thinking focused on deciding what to believe or do." As it stands, however, this definition is also as vague as Bloom's taxonomy. It too needs elaboration. Here is an abridgment of the elaborations I have provided and defended elsewhere (Ennis, 1987, 1991, in press).

In reasonably and reflectively going about deciding what to believe or do, a person characteristically needs to do most of these things (and do them interdependently):

1. Judge the credibility of sources.
2. Identify conclusions, reasons, and assumptions.
3. Judge the quality of an argument, including the acceptability of its reasons, assumptions, and evidence.
4. Develop and defend a position on an issue.
5. Ask appropriate clarifying questions.
6. Plan experiments and judge experimental designs.
7. Define terms in a way appropriate for the context.
8. Be open-minded.
9. Try to be well informed.
10. Draw conclusions when warranted, but with caution.

This interdependent list of abilities and dispositions can provide some specificity for guiding critical thinking testing. The elaborations, of which this list is an abridgment, are more thorough, but the simplicity of this list can make it useful. It can serve as a set of goals for an entire critical thinking curriculum or as a partial set of goals for some subject-matter or other instructional sequence. It can be the basis for a table of specifications for constructing a critical thinking test. (A table of specifications provides the areas that a test is supposed to assess and indicates the weighting assigned to each; a hypothetical sketch follows.)
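To make the idea concrete, the short sketch below turns a table of specifications into an item allocation for a hypothetical 40-item test. The aspects and weights are illustrative assumptions only, not weights prescribed in this article; Python is used merely as a convenient notation for the arithmetic.

# Hypothetical table of specifications for a 40-item critical thinking test.
# Aspects and weights are illustrative assumptions, not prescribed here.
weights = {
    "credibility of sources": 0.20,
    "conclusions, reasons, and assumptions": 0.25,
    "quality of arguments": 0.25,
    "definition in context": 0.15,
    "cautious drawing of conclusions": 0.15,
}
total_items = 40

assert abs(sum(weights.values()) - 1.0) < 1e-9  # weights should sum to 1

# Translate each weight into a rounded count of items for that aspect.
for aspect, weight in weights.items():
    print(f"{aspect:40s} weight {weight:.2f} -> {round(weight * total_items)} items")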
The elaboration also can be used as a guide in judging the extent to which an existing critical thinking test is comprehensive, and whether it tests for critical thinking at all. One of my chief criticisms of most existing critical thinking tests is their lack of comprehensiveness. For example, they typically fail to test for such important things as being open-minded, and many even fail to test for judging the credibility of sources. Without some defensible conception of critical thinking, judgments about tests are likely to be erratic, or worse.

Two other well-known definitions of critical thinking are McPeck's "reflective skepticism" (1981, p. 7) and Paul's "strong sense" definition (1987). Paul's definition is similar in broad outline to the definition proposed here, but emphasizes more heavily being aware of one's own assumptions and seeing things from others' points of view. However, neither of these definitions provides sufficient elaboration for developing critical thinking tests. Furthermore, McPeck's definition is negative. Critical thinking must get beyond skepticism.

Purposes of Critical Thinking Assessment

Not only must we have a defensible elaborated definition of critical thinking when selecting, criticizing, or developing a test, we must also have a clear idea of the purpose for which the test is to be used. A variety of possible purposes exist, but no one test or assessment procedure fits them all. Here are some major possible purposes, accompanied by comments:

1. Diagnosing the levels of students' critical thinking. If we are to know where to focus our instruction, we must "start with where they are" in specific aspects of critical thinking. Tests can be helpful in this respect by showing specific areas of strength and weakness (for example, ability to identify assumptions).

2. Giving students feedback about their critical thinking prowess. If students know their specific strengths and weaknesses, their attempts to improve can be better focused.

3. Motivating students to be better at critical thinking. Though frequently misused as a motivational device, tests can and do motivate students to learn the material they expect to be covered on the test. If critical thinking is omitted from tests, test batteries, or other assessment procedures, students will tend to neglect it (Smith, 1991; Shepard, 1991).

4. Informing teachers about the success of their efforts to teach students to think critically. Teachers can use tests to obtain feedback about their instruction in critical thinking.

5. Doing research about critical thinking instructional questions and issues. Without careful comparison of a variety of approaches, the difficult issues in critical thinking instruction and curriculum organization cannot be answered. But this research requires assessment, so that comparisons can be made.

6. Providing help in deciding whether a student should enter an educational program. People in some fields already use assessed critical thinking prowess to help make admissions decisions. Examples are medicine, nursing, law, and graduate school in general. The idea seems good, but the efficacy of existing efforts in selecting better critical thinkers has not been established. Research needs to be done in this area.

7. Providing information for holding schools accountable for the critical thinking prowess of their students. A currently popular purpose for testing, including critical thinking testing, is to pressure schools and teachers to "measure up" by holding them accountable for the test results of their students.

Purposes 6 and 7 typically constitute "high-stakes" testing, so called because much often depends on the results. The science reasoning section of the American College Test (ACT), much of the new Medical College Admissions Test (MCAT), College Board Advanced Placement (AP) tests, the Iowa Tests of Educational Development, and the analytic and logical reasoning sections of the Graduate Record Examination (GRE) and the Law School Admission Test (LSAT) are examples of high-stakes critical thinking tests.

Traps

In pursuing the above purposes, educators need to be aware of several traps, including the following:

1. Test results may be compared with norms, and the claim made that the difference, or similarity, is the result of instruction. There are usually other possible explanations of the result, such as neighborhood influences. Currently popular accountability testing invites us into this trap.

2. A pretest and a posttest may be given without comparing the class to a control group. The lack of a control group renders the pretest-to-posttest results dubious, since many things other than the instruction have happened to the students and could account for the results.

3. The use of the same test for the pretest and posttest has the problem of alerting the students to the test questions.
On the other hand, the use of different forms of (allegedly) the same test for pretest-posttest comparisons, given that the testing is for critical thinking, is probably worse, since different forms are actually different tests. Comparability is always suspect, since so much depends on the specific content of the test.

4. Most critical thinking tests are not comprehensive, especially those that are easiest to use, the multiple-choice tests. These tests typically miss much that is important in critical thinking.

5. Another problem in the use of (especially) multiple-choice tests lies in differences in background beliefs and assumptions between test maker and test taker. Since a critical thinker employs a grasp of the situation, different beliefs about the situation can sometimes result in justifiably different answers to test questions (see Norris & Ennis, 1989).

6. Significant results may be expected in too short a time period. Learning to think critically takes a long time. Much reflective practice with many examples in a variety of situations is required.

7. High-stakes purposes often interfere with the validity of a test. This is partly because they motivate cram schools, which teach students how to do well on the tests without the students' having the critical thinking prowess for which the test is supposedly testing. The students often learn tricks for taking the tests.

This interference with validity occurs also in part because the high-stakes situation pressures the test makers to avoid taking risks with items, the answers to which might be subject to challenge. So the pressure is for them to limit their testing to multiple-choice deductive-logic items of various sorts, that is, items in which the conclusion necessarily follows from the premises (thus limiting the test's comprehensiveness and content validity). Deductive-logic items are the most immune to complaint about the keyed answer.

8. Scarce resources (indicated by low assessment budgets and overworked teachers) often lead to compromises that affect test validity. Because of the expense of, and/or teacher grading time required for, the tests necessary to assess critical thinking, many testing programs have resorted to multiple-choice tests that are arguably less valid than short answer, essay, and performance tests of critical thinking.

Although a number of tests incorporate critical thinking (including the high-stakes tests just mentioned), only a few have critical thinking (or some aspect of critical thinking) as their primary concern. None exist for students below fourth grade.

This dearth of critical thinking tests is unfortunate; many more are needed to fit the various situations and purposes of critical thinking testing. In Table 1, I have attempted to identify all currently available published tests that have critical thinking as their primary concern. The tests are grouped according to whether they aim at a single aspect of critical thinking or more than one aspect. The essay test is more comprehensive than the others.

It would also make sense to group the tests according to whether they are subject specific or general-content based.
Subject-specific critical thinking tests assess critical thinking within one standard subject matter area, whereas general-content-based critical thinking tests use content from a variety of areas with which test takers are presumed to be already familiar. A committee of the National Academy of Education has recommended that there be a strong effort to develop subject-specific higher order thinking tests (The Nation's Report Card, 1987, p. 54). A full understanding of any subject matter area requires that the person be able to think well in that area.

Regrettably, I can find no subject-specific critical thinking tests (that is, tests whose primary purpose is to assess critical thinking in a subject matter area), although parts of some tests (such as the ACT section on science reasoning) fit this criterion. So there is no subject-specific grouping in this listing of tests primarily committed to critical thinking. All of the tests listed here are general-content-based tests.

Unfortunately, the National Academy committee also recommended the neglect of general-content-based higher order thinking tests (p. 54). This is a mistake. We need general-content-based tests to check for transfer of critical thinking instruction to everyday life, regardless of whether thinking instruction is embedded in subject matter instruction, offered in a separate course or unit, or some combination of the two.

Since I am a coauthor of some of the listed tests, my conflict of interest in presenting and discussing this listing is obvious. I have tried not to let it interfere with my objectivity, but do recommend Arter and Salmon's Assessing Higher Order Thinking Skills: A Consumer's Guide (1987), which provides more extensive coverage. A general discussion of the problems, prospects, and methods of critical thinking testing can be found in Evaluating Critical Thinking (Norris & Ennis, 1989).

Since statistical information about tests can be misleading, it is important to make one's own informal judgment about the validity of the content. Persons who are seriously considering using any test should take the test and score it themselves. There is no better way to get a feel for the test's content validity. One should not depend solely on the name given to the test by the author and publisher. The following questions should be considered:

1. Is the test based on a defensible conception of critical thinking?
2. How comprehensive is its coverage of this conception?
3. Does it seem to do a good job at the level of your students?

Though these questions might seem obvious, they are often neglected.

In varying degrees, all of the listed tests can be used for the first five purposes specified earlier (all but the high-stakes purposes). Their use for high stakes is problematic for two reasons: (a) there is no security on the tests, so prospective examinees can secure copies, and (b) most of the tests are not sufficiently comprehensive to provide valid results in a high-stakes situation. Let me elaborate on this second problem.

As indicated earlier, existing multiple-choice tests do not directly and effectively test for many significant aspects of critical thinking, such as being open-minded and drawing warranted conclusions cautiously. In response to this problem, some people will hold that the various aspects of critical thinking are correlated with each other, so the lack of direct testing of specific aspects does not matter. For example,
being open-minded correlates highly with judging the credibility of sources and identifying assumptions, making all of these good indicators of the others, so the argument goes.

However, when the stakes are high, people prepare for the content areas that are expected to be on the tests. Even though these content areas might correlate highly with other critical thinking aspects when the stakes are low, specific preparation for the expected aspects in order to deal with them on the tests will lower the correlations, destroying their validity as indirect measures of the missing aspects of critical thinking. The danger is to accept correlations obtained in low-stakes situations as representative of the correlations obtainable in a high-stakes situation.

Table 1
An Annotated List of Critical Thinking Tests

Tests Covering More Than One Aspect of Critical Thinking

The California Critical Thinking Skills Test: College Level (1990) by P. Facione. The California Academic Press, 217 La Cruz Ave., Millbrae, CA 94030. Aimed at college students, but probably usable with advanced and gifted high school students. Incorporates interpretation, argument analysis and appraisal, deduction, mind-bender puzzles, and induction (including rudimentary statistical inference).

Cornell Critical Thinking Test, Level X (1985) by R.H. Ennis and J. Millman. Midwest Publications, PO Box 448, Pacific Grove, CA 93950. Aimed at grades 4-14. Sections on induction, credibility, observation, deduction, and assumption identification.

Cornell Critical Thinking Test, Level Z (1985) by R.H. Ennis and J. Millman. Midwest Publications, PO Box 448, Pacific Grove, CA 93950. Aimed at advanced or gifted high school students, college students, and other adults. Sections on induction, credibility, prediction and experimental planning, fallacies (especially equivocation), deduction, definition, and assumption identification.

The Ennis-Weir Critical Thinking Essay Test (1985) by R.H. Ennis and E. Weir. Midwest Publications, PO Box 448, Pacific Grove, CA 93950. Aimed at grades 7 through college. Also intended to be used as a teaching material. Incorporates getting the point, seeing the reasons and assumptions, stating one's point, offering good reasons, seeing other possibilities (including other possible explanations), and responding to and avoiding equivocation, irrelevance, circularity, reversal of an if-then (or other conditional) relationship, overgeneralization, credibility problems, and the use of emotive language to persuade.

Judgment: Deductive Logic and Assumption Recognition (1971) by E. Shaffer and J. Steiger. Instructional Objectives Exchange, PO Box 24095, Los Angeles, CA 90024. Aimed at grades 7-12. Developed as a criterion-referenced test, without specific norms. Includes sections on deduction, assumption identification, and credibility, and distinguishes between emotionally loaded content and other content.

New Jersey Test of Reasoning Skills (1983) by V. Shipman. Institute for the Advancement of Philosophy for Children, Test Division, Montclair State College, Upper Montclair, NJ 07043. Aimed at grades 4 through college. Incorporates the syllogism (heavily represented), assumption identification, induction, good reasons, and kind and degree.

Ross Test of Higher Cognitive Processes (1976) by J.D. Ross and C.M. Ross. Academic Therapy Publications, 20 Commercial Blvd., Novato, CA 94947. Aimed at grades 4-6.
Sections on verbal analogies, deduction, assumption identification, word relationships, sentence sequencing, interpreting answers to questions, information sufficiency and relevance in mathematics problems, and analysis of attributes of complex stick figures.

Test of Enquiry Skills (1979) by B.J. Fraser. Australian Council for Educational Research Limited, Frederick Street, Hawthorn, Victoria 3122, Australia. Aimed at Australian grades 7-10. Sections on using reference materials (library usage, index, and table of contents); interpreting and processing information (scales, averages, percentages, proportions, charts and tables, and graphs); and (subject-specific) thinking in science (comprehension of science reading, design of experiments, conclusions, and generalizations).

Test of Inference Ability in Reading Comprehension (1987) by L.M. Phillips and C. Patterson. Institute for Educational Research and Development, Memorial University of Newfoundland, St. John's, Newfoundland, Canada A1B 3X8. Aimed at grades 6-8. Tests for ability to infer information and interpretations from short passages. Multiple-choice version (by both authors) and constructed-response version (by Phillips only).

Watson-Glaser Critical Thinking Appraisal (1980) by G. Watson and E.M. Glaser. The Psychological Corporation, 555 Academic Court, San Antonio, TX 78204. Aimed at grade 9 through adulthood. Sections on induction, assumption identification, deduction, judging whether a conclusion follows beyond a reasonable doubt, and argument evaluation.

Tests Covering Only One Aspect of Critical Thinking

Cornell Class Reasoning Test (1964) by R.H. Ennis, W.L. Gardiner, R. Morrow, D. Paulus, and L. Ringel. Illinois Critical Thinking Project, University of Illinois, 1310 S. 6th St., Champaign, IL 61820. Aimed at grades 4-14. Tests for a variety of forms of (deductive) class reasoning.

Cornell Conditional Reasoning Test (1964) by R.H. Ennis, W. Gardiner, J. Guzzetta, R. Morrow, D. Paulus, and L. Ringel. Illinois Critical Thinking Project, University of Illinois, 1310 S. 6th St., Champaign, IL 61820. Aimed at grades 4-14. Tests for a variety of forms of (deductive) conditional reasoning.

Logical Reasoning (1955) by A. Hertzka and J.P. Guilford. Sheridan Psychological Services, PO Box 6101, Orange, CA 92667. Aimed at high school and college students and other adults. Tests for facility with class reasoning.

Test on Appraising Observations (1983) by S.P. Norris and R. King. Institute for Educational Research and Development, Memorial University of Newfoundland, St. John's, Newfoundland, Canada A1B 3X8. Aimed at grades 7-14. Tests for ability to judge the credibility of statements of observation. Multiple-choice and constructed-response versions.

A possible exception to this warning about the use of the listed tests for high-stakes situations is the critical thinking essay test, which does test more comprehensively than the others. But it is not secure. Furthermore, it is more expensive in time and/or money than multiple-choice tests to administer and score. The problem is serious in high-stakes testing. We do not yet have inexpensive critical thinking testing usable for high stakes. Research and development are needed here.

The listed multiple-choice tests can, to varying degrees, be used for the first five listed lower stakes purposes: diagnosis, feedback, motivation, impact of teaching, and research. But discriminating judgment is necessary.
For example, if a test is to be used for diagnostic purposes, it can legitimately reveal only strengths and weaknesses in aspects for which it tests. The less comprehensive the test, the less comprehensive the diagnosis.

For comprehensive assessment, unless appropriate multiple-choice tests are developed, open-ended assessment techniques are probably needed. Until the published repertoire of open-ended critical thinking tests increases considerably, and unless one uses the published essay test, or parts of other open-ended tests, such as College Board's Advanced Placement (AP) tests, it is necessary to make your own test.

Making Your Own Test

In making your own test, it is probably better that it be at least somewhat open ended anyway, since making good multiple-choice tests is difficult and time consuming, and requires a series of revisions, tryouts, and more revisions. Suggestions for making multiple-choice critical thinking items may be found in Norris and Ennis (1989), but I do not present them here because open-ended assessment is better adapted to do-it-yourself test makers and can be more comprehensive. Norris and Ennis also make suggestions for open-ended assessment, and the discussion here grows out of that presentation.

Multiple-choice assessment is labor intensive in the construction and revision of the tests. Open-ended assessment is labor intensive in the grading, once one has developed a knack for framing questions. One promising approach is to give a multiple-choice item, thus assuring attention to a particular aspect of critical thinking, and to ask for a brief written defense of the selected answer to the item.

As in the previous example, open-ended assessment can be fairly well structured. Or it can be much less structured, in the form of naturalistic observation of a student. Greater structure usually means greater effort beforehand, but also greater assurance that there will be opportunities to assess specific aspects of critical thinking. Less structure generally requires greater effort during and after the observation, and gives the opportunity for more life-like situations, but provides less assurance that a broad range of specific aspects of critical thinking will be assessed. The sections that follow illustrate several types of open-ended critical thinking tests that teachers can make themselves.

Multiple Choice With Written Justification

In the Illinois Critical Thinking Project, in conjunction with the Alliance for Essential Schools in Illinois, we are currently exploring the use of the multiple-choice-plus-written-justification format. We have taken 20 items from the Cornell Critical Thinking Test, Level X, and requested a brief written justification of the student's answer to each. In the following example of an item focusing on judging the credibility of a source, the situation is the exploration of a newly discovered planet:

WHICH IS MORE BELIEVABLE? (Circle one.)

A. The health officer investigates further and says, "This water supply is safe to drink."
B. Several others are soldiers. One of them says, "This water supply is not safe."
C. A and B are equally believable.

YOUR REASON:

One advantage of this promising format is that specific aspects of critical thinking can be covered (including an aspect not effectively tested in existing multiple-choice tests: being appropriately cautious in the drawing of conclusions). Another advantage is that answers that differ from those in the key,
if well defended, can receive full credit. Answers that differ from the key, as I noted earlier, are sometimes defensible, given that the test taker has different beliefs about the world than the test maker. We have found that high inter-rater consistency (.98) can be obtained if guides to scoring are carefully constructed and if the scorers are proficient in the same conception of critical thinking.

I recommend this approach to making your own test. It is fairly quick, can be comprehensive, provides forgiveness for unrefined multiple-choice items, and allows for differences in student background and interpretation of items.

Essay Testing of Critical Thinking

Several approaches to making one's own essay test of critical thinking are viable, depending on the purpose.

High structure. The use of the argumentative essay to assess critical thinking can vary considerably in degree of structure. The Ennis-Weir Critical Thinking Essay Test is an example of a highly structured essay test. It provides an argumentative passage (a letter to an editor) with numbered paragraphs, most of which have specific built-in errors. Students are asked to appraise the thinking in each paragraph and the passage as a whole, and to defend their appraisals.

A scoring guide assigns a certain number of possible points to the appraisal of each paragraph and the passage as a whole, and provides guidance for the grader. But the grader must be proficient in critical thinking in order to handle responses that differ from standard responses in varying degrees. Responses that are radically different, if well defended, receive full credit. Scoring by proficient graders takes about 6 minutes per essay.

Medium structure. Structure can be reduced by providing an argumentative passage and requesting an argumentative response to the thesis of the passage and its defense, without specifying the organization of the response. College Board AP tests use this approach.

Scoring can be either holistic (one overall grade for the essay) or analytic (a grade for each of several criteria). Holistic scoring is quicker and thus less expensive. Proficient graders take roughly 1 or 2 minutes for a two-page essay. Analytic scoring gives more information and is more useful for most purposes. Proficient graders take roughly 3 to 6 minutes for a two-page essay, depending on how elaborate the criteria are.

Minimal structure. Structure can be further reduced by providing only a question to be answered or an issue to be addressed. The Illinois Critical Thinking Essay Contest uses this approach (Powers, 1989). In one year, students were asked to take and defend a position about the possible regulation of music television, a topic of great interest to students. Reduced structure gives students more freedom, but provides teachers with less assurance of diagnostic information, not a problem for the essay contest. Again, either holistic or analytic scoring is possible.

At Illinois we are also using this format for the development of the Illinois Critical Thinking Essay Test. We developed a six-factor analytic scoring system, an adaptation of scoring guides developed by the Illinois State Board of Education, and have secured high inter-rater consistency (.94). This approach also looks promising. Grading takes us about 5 minutes per essay for essays written in 40 minutes of class time.
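The article reports these inter-rater consistency figures (.98 and .94) without specifying the statistic behind them. A common and simple choice for paired essay scores is the Pearson correlation between two raters, so the sketch below assumes that statistic and uses made-up scores; it illustrates the computation only, not the project's actual procedure.

# Minimal sketch: inter-rater consistency as the Pearson correlation
# between two raters' scores for the same essays. Both the statistic
# and the scores are assumptions; the article specifies neither.
import statistics

rater_a = [18, 22, 25, 14, 20, 27, 16]  # hypothetical essay scores
rater_b = [17, 23, 24, 15, 21, 26, 18]

def pearson_r(xs, ys):
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

print(f"Inter-rater consistency (Pearson r): {pearson_r(rater_a, rater_b):.2f}")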
Performance Assessment

Performance assessment is the most expensive of all, since it requires considerable expert time devoted to each student. It has the greatest face validity for whatever is revealed, since the situations are closer to real-life situations. However, the greater the realism, the less assurance of comprehensiveness. In real-life situations, people generally reveal only what the situation requires, and most observable situations do not require all aspects of critical thinking. So real-life performance assessment encounters a difficulty similar to one found in multiple-choice assessment: reduced comprehensiveness. Another possible danger in performance assessment is excessive subjectivity.

The least structured performance assessment is naturalistic observation, as in a case study (Stake, 1978). Here, a trained observer takes extensive notes describing a series of events and focuses on the activities of one person or group. Interpretation is inevitable, but "rich" description is the goal.

An example of slightly more structured performance assessment is the use of a student's portfolio of work to determine graduation from high school (recommended, for example, by Sizer in Horace's Compromise, 1984). Validity of this type of assessment is not yet established. It is an attractive idea, but many problems exist, including probable lack of comprehensiveness of critical thinking assessment.

A more structured performance assessment is exemplified by an exploratory assessment effort by the National Assessment of Educational Progress (Blumberg, Epstein, MacDonald, & Mullis, 1986). A student is given a variety of materials and asked to investigate what factors affect the rate at which sugar cubes dissolve. The observer asks questions and watches to see whether the student goes about the task scientifically. In this kind of performance assessment, structure is provided by the assignment of a task, which is designed to check things of interest.

Performance assessment seems valid on the face of it. Expense, possible lack of comprehensiveness, possible excessive subjectivity, and lengthy reports are dangers.

Summary

Critical thinking testing is possible for a variety of purposes. The higher the stakes and the greater the budgetary restraints, the fewer the purposes that can be served. In particular, comprehensiveness of coverage of aspects of critical thinking is threatened in high-stakes testing.

A number of published tests focus on critical thinking. Almost all are multiple-choice tests, an advantage for efficiency and cost, but currently not for comprehensiveness. More research and development are needed.

Viable alternatives include the addition of justification requests to multiple-choice items, essay testing with varying degrees of structure, and performance assessment. All are considerably more expensive than multiple-choice testing when used on a large scale, but on a small scale, they offer a feasible alternative in terms of validity and expense. However, grading them does take more time than grading a prepackaged multiple-choice test.

Note: The author deeply appreciates the helpful comments of Michelle Commeyras, Marguerite Finken, Stephen Norris, and Amanda Shepherd.

References

Arter, J.A., & Salmon, J.R. (1987). Assessing higher order thinking skills: A consumer's guide. Portland,
OR: Northwest Regional Educational Laboratory.

Blumberg, F., Epstein, M., MacDonald, W., & Mullis, I. (1986). A pilot study of higher-order thinking skills assessment techniques in science and mathematics. Princeton, NJ: National Assessment of Educational Progress.

Dewey, J. (1910). How we think. Boston: D.C. Heath.

Educational Policies Commission. (1961). The central purpose of American education. Washington, DC: National Education Association.

Ennis, R.H. (1962). A concept of critical thinking. Harvard Educational Review, 32, 81-111.

Ennis, R.H. (1981). Eight fallacies in Bloom's taxonomy. In C.J.B. Macmillan (Ed.), Philosophy of education 1980 (pp. 269-273). Bloomington, IL: Philosophy of Education Society.

Ennis, R.H. (1987). A taxonomy of critical thinking dispositions and abilities. In J. Baron & R. Sternberg (Eds.), Teaching thinking skills: Theory and practice (pp. 9-26). New York: W.H. Freeman.

Ennis, R.H. (1991). Critical thinking: A streamlined conception. Teaching Philosophy, 14(1), 5-25.

Ennis, R.H. (in press). Critical thinking. Englewood Cliffs, NJ: Prentice-Hall.

McPeck, J.E. (1981). Critical thinking and education. New York: St. Martin's Press.

The nation's report card. (1987). Cambridge, MA: National Academy of Education, Harvard Graduate School of Education.

Norris, S.P., & Ennis, R.H. (1989). Evaluating critical thinking. Pacific Grove, CA: Midwest Publications.

Paul, R.W. (1987). Dialogical thinking: Critical thought essential to the acquisition of rational knowledge and passions. In J. Baron & R. Sternberg (Eds.), Teaching thinking skills: Theory and practice (pp. 127-148). New York: W.H. Freeman.

Powers, B. (Ed.). (1989). Illinois critical thinking annual. Champaign, IL: University of Illinois College of Education.

Shepard, L.A. (1991). Will national tests improve student learning? Phi Delta Kappan, 73, 232-238.

Sizer, T. (1984). Horace's compromise. Boston: Houghton Mifflin.

Smith, M.L. (1991). Put to the test: The effects of external testing on teachers. Educational Researcher, 20(5), 8-11.

Stake, R.E. (1978). The case study method in social inquiry. Educational Researcher, 7(2), 5-8.
