
Science & Education (2005) 14: 117–135 © Springer 2005

Examining the Exam: A Critical Look at The California Critical Thinking Skills Test

DON FAWKES1, BILL O’MEARA2, DAVE WEBER2 and DAN FLAGE2


1 Eutaw Village Center Court, Box 35786, Fayetteville, NC 28303-0786, U.S.A.; 2 James Madison
University

Abstract. This paper examines the content of The California Critical Thinking Skills Test (1990).
This report is not a statistical review. Instead it brings under scrutiny the content of the exam. This
content will be of interest to the general reader, because the issues range from logic to ethics to
pedagogy, and to questions of evidential and epistemological support. Anyone interested in clear
thought and expression will find these issues of significance. Although the exam has a number of
strengths and has the clearest instructions of all the presently available Critical Thinking exams, the
content of 9 of the exam’s 34 questions is defective, namely the content of questions 6, 7, 8, 19,
21, 23, 24, 29, and 33. These questions make errors in critical thinking. Hence, no statistical results
pertaining to the administration of these questions to students can be acceptable. The remaining
questions are acceptable as to content. But until the problems are corrected, those who may use the
exam should remove the defective questions from test administration or from data collection and
reporting.
The scope of the exam also is quite limited, but this may be unavoidable for any instrument
designed to be completed in about an hour. Further, the scores resulting from any such testing can be
understood only as a measure of minimal competency (below which remediation likely is needed)
for the skills tested, but not as an adequate measure of critical thinking.

Before turning to the analysis it may be useful to give a brief sketch of the context
in which the exam is used, and its relevance to those interested in science and
education. In recent years critical thinking (CT) has become a matter of interest
to collegiate administrators and assessment offices. For most colleges CT has
become some sort of (usually lower division) requirement, and for many colleges
the assessment office is tasked to measure student success in CT. One way that
colleges try to teach CT is in science courses, but there are many approaches. One
interesting question is how CT is to be defined. There are several widely known
“models” of it, and Fawkes (2001) describes them and goes on to describe the CT
skills tested by the three widely used CT skills tests. The three tests are known as

 Disclosure: Three of the authors are engaged in producing and marketing a critical thinking test.
Though this paper was written before any of us considered developing such a test, the reader should
be informed. Each of the writers has exercised considerable care to avoid any bias, and we thank our
independent reviewers for helping us in this regard as well.
118 DON FAWKES ET AL.

the “California”, the “Cornell”, and the “Watson–Glaser”. None of these three tests
is based on any of the models; each test is independently developed and marketed.
In so far as the tests are used to check student acquisition of CT skills, it is the
tests that define CT. As teachers of science are sometimes expected to teach CT,
it is useful to know what is being tested and the quality of the tests. Fawkes et al.
(2001) does the analysis for the Watson–Glaser, the present paper does it for the
California, and a paper in progress will do it for the Cornell. Moreover, the analysis
is a logical analysis, and inasmuch as logic is a science and an essential part of
all other sciences, the analysis will be of interest to scientists and educators alike.
Every scientist and every educator benefits from exercising logical muscles.

Analysis of Questions and Instructions


The California Critical Thinking Skills Test (1990) consists of 34 questions. There
are two forms of the exam; this paper addresses Form A. This is not a statistical
review, but rather a review of the content of the exam. This content raises issues
that range from logic to ethics to epistemology to pedagogy, and so they will be of
interest to the general reader; these issues are interesting in themselves. Since the
California Skills Test (hereafter, CS) is a widely used measure of critical thinking
it warrants careful review. The analysis is offered in a collegial spirit, intended
to foster discussion and improvement. Different readers will find different parts
of the analysis controversial, and some possible challenges will be noted as the
analysis proceeds. Controversy strengthens the invitation to discussion and im-
provement. The reader should consider everything controversial and examine the
analysis carefully.
As the analysis uncovers a number of flaws in the exam it is useful to note at the
outset that the instructions in CS are especially clear and nontechnical, clearly the
best among the currently available standardized CT exams. There are difficulties
however with a number of questions, to which we now turn.
We begin with Question #6:
6. Suppose “Only those seeking challenge and adventure should join the Army”
were true. Which of the following would express the same idea?
(A) If you seek challenge and adventure, you should join the Army.
(B) If you join the Army, you should seek challenge and adventure.
(C) You shouldn’t seek challenge and adventure except by joining the Army.
(D) You shouldn’t join the Army unless you seek challenge and adventure. CORRECT
The problem with response D is that it is ambiguous. “You shouldn’t join the
Army” has two possible meanings. One of these meanings would express the same
idea as the statement in the stem, but the other would not express the same idea
as the statement in the stem. The two possible meanings turn on just exactly what
the negation in “shouldn’t” applies to. (Generally, logicians would express this
EXAMINING THE EXAM 119

by saying that the two possible meanings depend on what the negation “operates
over”.)
“You shouldn’t join the Army” could mean,

“It is the case that you should not join the Army”. (1)

Or, “You shouldn’t join the Army” could mean,

“It is not the case that you should join the Army”. (2)

(1) says that you have an obligation to not join the Army, but (2) says that you do
not have an obligation to join the army. “You shouldn’t join the Army” can mean
either (1) or (2). But only (2) allows the statement in D to mean the same as the
statement in the stem. To see this, consider that the stem statement,

“Only those seeking challenge and adventure should join the Army” (S1)

expresses the same idea as

“If you should join the Army, then you seek challenge and adventure”, (S2)

and both (S1) and (S2) express the same idea as

“If you do not seek challenge and adventure, then it is not the case that you should join the Army”. (S3)

(S1), (S2), and (S3) each express the same idea. D, however, can mean either

“If you do not seek challenge and adventure, then it is the case that you should not join the Army”, (D1)

or

“If you do not seek challenge and adventure, then it is not the case that you should join the Army”. (D2)

Note that only (D2) [and not (D1)] expresses the same idea as (S3), (S2), and (S1).
Basically, this is a matter of taking care in expressing a negation. As expressing
negations is one of the most basic and common uses of language, this item and the
analysis of it may have broader interest beyond tests and testing, broader interest
to those who simply are curious about expressing ideas clearly. Giving a thorough
demonstration of the relevant points (please see the Appendix) is somewhat daunt-
ing, and that is likely the best point to make about this test question: If professors
have to delve into logic textbooks to figure out a test question, it is not the sort
of question that can be a useful measure of students’ CT skills. But for the more
broadly interested reader, the best point may be to be on alert when negations are
used, and to attend to just what is being negated.
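The scope point about Question #6 can also be checked mechanically. The following is a minimal sketch, not the demonstration given in the Appendix: it assumes that "you should join the Army" and "you should not join the Army" may be treated as two independent atomic claims (here named `ought_join` and `ought_not`, both hypothetical labels), and it simply compares truth tables. A proper treatment would use a deontic logic; this propositional simplification is an assumption made for illustration only.

```python
from itertools import product


def implies(p, q):
    """Material conditional: 'if p then q'."""
    return (not p) or q


# Atoms (assumption: treated as independent truth values):
#   seeks      - you seek challenge and adventure
#   ought_join - it is the case that you should join the Army
#   ought_not  - it is the case that you should NOT join the Army
READINGS = {
    "S2": lambda seeks, ought_join, ought_not: implies(ought_join, seeks),
    "S3": lambda seeks, ought_join, ought_not: implies(not seeks, not ought_join),
    "D1": lambda seeks, ought_join, ought_not: implies(not seeks, ought_not),
    "D2": lambda seeks, ought_join, ought_not: implies(not seeks, not ought_join),
}

VALUATIONS = list(product([False, True], repeat=3))


def equivalent(a, b):
    """True when readings a and b agree under every valuation."""
    return all(READINGS[a](*v) == READINGS[b](*v) for v in VALUATIONS)


def counterexamples(a, b):
    """Valuations (seeks, ought_join, ought_not) on which a and b differ."""
    return [v for v in VALUATIONS if READINGS[a](*v) != READINGS[b](*v)]


if __name__ == "__main__":
    for pair in [("S2", "S3"), ("S3", "D2"), ("S3", "D1")]:
        print(pair, equivalent(*pair), counterexamples(*pair))
```

On this simplified reading (S2), (S3), and (D2) agree on every valuation, while (D1) comes apart from them, for instance when a person does not seek adventure and has neither an obligation to join nor an obligation to refrain.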
We turn next to Question #7. It reads as follows:
7. Suppose a botanist lecturing about garden plants said, “The rose offers many
colors”. Which would be the best interpretation of this claim?
(A) There is a rose which is more than one color.
(B) There is a thing that is more than one color and it is a rose.
(C) All roses are more than one color.
(D) Not every rose is the same color. CORRECT
(E) All of the above are equally acceptable interpretations.
It is not clear what are the criteria for “acceptable interpretations” here; accord-
ingly, it is not clear why E is not an answer as good as or better than the “correct”
answer, D. If an “acceptable” interpretation need only be consistent with the word-
ing of the botanist’s claim that “The rose offers many colors”, then there is no
obvious reason why any one of A, B, C, or D should be considered a superior
reading. For this claim is ambiguous: does “the rose” denote a particular rose, or
roses in general; and if the latter, is “offers many colors” being predicated of roses
collectively or distributively? Given this dual ambiguity of subject and predicate,
the claim taken by itself seems equally compatible with A, B, C, and D in which
case, the correct answer is E. However, the authors of the test designate D as the
correct answer. Given the ambiguity of the statement, perhaps everyday beliefs
about roses are given some say in determining the answer. But then the item seems
to be testing more for everyday knowledge than for reasoning ability. This question
does not serve to measure the presence or the absence of critical thinking skills.
Question #8 is as follows:
8. “Ezerinians tell lies”, means the same thing as:
(A) If anyone is Ezerinian, then that person is a liar. CORRECT
(B) If anyone is a liar, then that person is Ezerinian.
(C) There is at least one person who is an Ezerinian who lies.
(D) People don’t lie unless they are Ezerinian.
(E) All of the above mean the same thing.
The statement “Ezerinians tell lies”, does not indicate whether all Ezerinians tell
lies, or whether it is generally the case that Ezerinians tell lies. If the statement
is interpreted to mean the former, then response A is correct. But the question is
ambiguous as it stands. “Ezerinians tell lies” can mean either that all Ezerinians tell
lies (A: If anyone is an Ezerinian, then that person is a liar) or that some Ezerinians tell
lies (C: There is at least one person who is an Ezerinian and a liar). The interpretation
of this statement is unclear. Certainly, A is a plausible answer – it might even be
the most natural way to read the statement – but there are two problems with it:
(1) The word liar is obscure. How many lies does one need to tell to be a liar?
Consider:
a. Presumably, every person has told at least one lie at some time. Does that mean
that every person is a liar?
b. Or is it someone who makes a habit of it – who lies much or most of the time?
c. Or must it be someone who always lies? If it is the last, then one might reason-
ably claim that there are no liars, and A would be false on the assumption that
there are Ezerinians.
(2) Within the context of an argument, it is reasonable to tell students to interpret
the statement in such a way that the statement is true and (when possible) the
argument is valid or strong. For example, consider the premise, “Students aren’t
college graduates”. This premise might mean “No students are college graduates”
or “Some students are not college graduates”. Perhaps the more understanding
interpretation is, “Some students are not college graduates”, since that statement
is true: i.e. (virtually) all graduate students are college graduates. Within the con-
text of an argument, one would have some guidance as to the interpretation of
“Ezerinians tell lies”: If the statement “All Ezerinians are liars” would yield a valid
argument, then ceteris paribus that may be the more sympathetic interpretation – it
would also indicate that the argument relies on liar in sense (a) or (b). On the other
hand (again ceteris paribus), if “Some Ezerinians are liars” would yield a valid
argument, then that may be the better understanding. And that statement might be
true even given the highly improbable sense (c) of liar. So, either response A or
response C could be correct, and apart from some context, it would not be possible
to determine which is correct. One way to correct the problem with this question
would be to change option A to “If anyone is Ezerinian, then that person tells lies”.
Question #19 raises an interesting logical puzzle. But alas, no genuinely correct
answer is provided. Here is the item:
19. Consider the “krendalog” relationship. It is defined as follows: “Only humans
are krendalogs. But not every member of the human species has krendalogs.
Nobody can be a krendalog to themself, but today every human is someone’s
krendalog. If someone is your krendalog, then all that person’s krendalogs
are your krendalogs too. If someone is your krendalog, then you cannot be
that person’s krendalog. Assume the first two humans, the long ago deceased
ancestors of our species, were named Jake and Kathy”. Given this meaning of
“krendalog” we can say for sure that
(A) Jake and Kathy are krendalogs to one another.
(B) Jake or Kathy is each their own krendalog.
(C) Someone is neither Jake’s nor Kathy’s krendalog.
(D) All of us are krendalogs to Jake and Kathy. CORRECT
(E) None of the above because this concept does not make sense.
The idea suggested here seems to be this: the relation (a) x is a krendalog of y (the
relation K) is supposed to be identical to or isomorphic to or analogous to (b) x is
a (human) descendent of y. If one analogizes (a) to (b), answer D does seem to
follow from the information in the passage. However, the information given about
the relation K does not necessarily make it isomorphic to the descendent relation.
The information given is as follows:
(1) Only humans are krendalogs.
(2) Not every member of the human species has krendalogs.
(3) Nobody can be a krendalog to himself or herself, but today every human is
some human’s krendalog.
(4) If someone is your krendalog, then you cannot be that person’s krendalog.
In addition, we are told to assume that:
(A) The first two humans, the long ago deceased ancestors of our species, were
named Jake and Kathy.
There are a number of ways to analyze this information; here’s one: a relation
can be defined extensionally over individual humans. Consider the set of all hu-
mans living in the past, p1 . . . pn. Consider the set of all humans living now, s1
. . . sn. The following examples describe relations that satisfy the conditions.
Where the intended interpretation of K(ni, nj) is “ni is a krendalog of nj”:
(i) K(s1, p1), K(s2, p2), K(s3, p3), . . . K(sn, pn).1
(Note that here only currently alive individuals are krendalogs, and only members
of the previous generation have krendalogs.)
(ii) K(s1, p1), K(s2, p1), K(s3, p1), . . . K(sn, p1).2
(Note that here only currently alive individuals are krendalogs, and only a single
individual, p1, has everyone as his krendalog.)
Both (i) and (ii) satisfy the description of the K relation but neither makes all
currently alive individuals krendalogs to Jake and Kathy. Thus response D is not
the right answer here and neither are any of the other responses. Correlatively, (i)
and (ii) show that the K-relation is not the ancestor relation.
Now perhaps it might be thought that this question can be fixed by changing
option D to “Jake and/or Kathy”. But changing option D to “Jake and/or Kathy”,
is not sufficient to fix the problem, because with this change in place both (i) and
(ii) still satisfy the description of the K relation, and though (ii) makes all currently
alive individuals krendalogs to Jake and/or Kathy, (i) does not. And if D were to
read, “All of us are krendalogs to Jake and/or Kathy” then, as noted above, there are
a number of ways to analyze this information, only one of which would be to take
“All of us” to mean “all currently alive individuals”, as both (i) and (ii) interpret it
to mean. But obviously, “All of us” could mean something else, like “All humans”.
This item needs some careful revision.
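The two counter-models for Question #19 are small enough to verify by brute force. The sketch below is illustrative only and rests on several assumptions that are not in the exam or in the analysis: each generation has just three members, Jake is taken to be p1 and Kathy p2, the name `p3` and the function names are hypothetical, and "All of us" is read as "all currently alive individuals". The checker also tests the clause from the item's stem that a krendalog's krendalogs are one's own krendalogs, which both models satisfy vacuously.

```python
def satisfies_description(K, humans, living):
    """Check a finite relation K against the clauses of the item's stem.

    K is a set of pairs (x, y), read as "x is a krendalog of y".
    """
    has_krendalogs = {y for _, y in K}
    is_a_krendalog = {x for x, _ in K}
    return all([
        is_a_krendalog <= humans,            # only humans are krendalogs
        not humans <= has_krendalogs,        # not every human has krendalogs
        all(x != y for x, y in K),           # nobody is their own krendalog
        living <= is_a_krendalog,            # today every human is someone's krendalog
        all((z, y) in K                      # your krendalog's krendalogs are yours too
            for x, y in K for z, w in K if w == x),
        all((y, x) not in K for x, y in K),  # the relation is never mutual
    ])


def option_d(K, living):
    """'All of us are krendalogs to Jake and Kathy.'"""
    return all((s, "Jake") in K and (s, "Kathy") in K for s in living)


def option_d_and_or(K, living):
    """The revised wording: '... to Jake and/or Kathy.'"""
    return all((s, "Jake") in K or (s, "Kathy") in K for s in living)


# Assumed toy domain: three past humans (p1 = Jake, p2 = Kathy) and three living ones.
past = {"Jake", "Kathy", "p3"}
living = {"s1", "s2", "s3"}
humans = past | living

model_i = {("s1", "Jake"), ("s2", "Kathy"), ("s3", "p3")}  # K(si, pi)
model_ii = {(s, "Jake") for s in living}                    # K(si, p1)

if __name__ == "__main__":
    for name, K in [("(i)", model_i), ("(ii)", model_ii)]:
        print(name,
              satisfies_description(K, humans, living),
              option_d(K, living),
              option_d_and_or(K, living))
```

Under these assumptions both relations satisfy the description, neither makes option D true, and only (ii) makes the "and/or" rewording true, which matches the reasoning given for this item.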
Question #21 and its instructions are as follows:
For Questions 20 and 21 use this fictitious case: “In a study of high school stu-
dents at Mumford High, it was found that 75% of those who drank two or more beers each day
for a period of 60 days experienced measurable liver function deterioration. That
these results could have occurred by chance was ruled out experimentally with high
levels of confidence”.
21. If the information in the Mumford High case were true, which of the following
hypotheses would not have to be ruled out in order to confirm the claim that
for about 75 adolescents out of 100, after two months of drinking as little as
two beers a day, measurable liver deterioration can be found?
(A) Liver deterioration occurs only in inexperienced beer drinkers, but it levels
off after people have been drinking beer for longer periods of time.
(B) Since teens brag about their drinking, the positive relationship between
drinking and adolescent liver function deterioration is much higher than
reported.
(C) Since the students at Mumford High are predominantly Black or Hispanic,
the findings do not apply to adolescents in general.
(D) Liver function deterioration in adolescents is the result of other factors,
such as normal growth and development, poor diet, and sports injuries.
(E) Since school officials failed to keep this research project confidential, the
purpose of this study was known by the students being tested and by
unauthorized persons. CORRECT
It isn’t clear that A would have to be ruled out to confirm the claim. The claim that
the deterioration levels off is irrelevant to the hypothesis that for 75 of 100 adoles-
cents drinking two beers a day for two months causes measurable liver damage.
Both (A) and (E) appear to be correct.
Here is Question #23:
23. Consider this argument: “Person L is shorter than person X. Person Y is shorter
than person L, but person M is shorter than person Y. Therefore, person Y is
shorter than person J”. What information must be added to require that the
conclusion be true, assuming all the premises are true?
(A) Person L is taller than J.
(B) Person X is taller than J.
(C) Person J is taller than L. CORRECT
(D) Person J is taller than M.
The test’s authors designate the “correct” option C (“Person J is taller than L”) as
the conveyor of “information [that] must be added to require that the conclusion be
true, assuming all the premises are true” (original emphasis). In fact, no one of the
four options provided on the exam “must” be added “to require” the conclusion’s
truth: in order to certify the conclusion, given the premises, not only option C
but also the distinct statement “Person J is taller than X” will suffice. Option C
would clearly be the right answer if the question were to ask something like: “Of
the following four statements, which one would, if added to the other premises
and with all premises assumed to be true, make it certain that the conclusion is
true?” However, what is actually asked is, logically, something quite different.
Students who have really learned to think clearly and critically can be expected
to pick up on that difference – yet this may cost them at least a certain amount of
unnecessary confusion and lost time, because as the question is currently worded,
strictly speaking, the options provided contain no correct answer.
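The claim that option C is sufficient but not uniquely so in Question #23 can be confirmed by enumerating every possible height ranking. This is a minimal sketch under one simplifying assumption, namely that no two of the five people have the same height; the names `worlds`, `suffices`, and `CANDIDATES` are hypothetical and introduced only for illustration.

```python
from itertools import permutations

PEOPLE = ("L", "X", "Y", "M", "J")


def worlds():
    """Every strict height ranking of the five people (ties excluded by assumption)."""
    for ranks in permutations(range(len(PEOPLE))):
        yield dict(zip(PEOPLE, ranks))


def premises(h):
    """L is shorter than X, Y is shorter than L, and M is shorter than Y."""
    return h["L"] < h["X"] and h["Y"] < h["L"] and h["M"] < h["Y"]


def conclusion(h):
    """Y is shorter than J."""
    return h["Y"] < h["J"]


# The four options from the exam, plus the alternative statement from the analysis.
CANDIDATES = {
    "A": lambda h: h["L"] > h["J"],
    "B": lambda h: h["X"] > h["J"],
    "C": lambda h: h["J"] > h["L"],
    "D": lambda h: h["J"] > h["M"],
    "J taller than X": lambda h: h["J"] > h["X"],
}


def suffices(extra):
    """True when the premises plus `extra` guarantee the conclusion in every world."""
    return all(conclusion(h) for h in worlds() if premises(h) and extra(h))


sufficient = [name for name, extra in CANDIDATES.items() if suffices(extra)]

if __name__ == "__main__":
    print(sufficient)
```

The enumeration reports that option C and the statement "Person J is taller than X" each guarantee the conclusion, while options A, B, and D do not. Since C is one of at least two additions that would do the job, it is not information that "must" be added.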
Perhaps it might be thought that the criticism of this item is a mere quibble.
But logic is often a matter of slight changes, and slight changes are often crucial to
understanding what is being said, and to thinking critically. Quibbles are addressed
a little further below.
We turn next to Question #24. Here are the instructions and the question:
For Questions 24 and 25 use this fictional passage:
“Research at the Happy-Days Pre-School on the campus of State University
showed that four-year-old children who attended the Happy-Days Pre-School
all day for 9 months averaged 58 points on a standardized test of kindergarten
readiness. The research showed also that those four-year-olds who attended
only in the morning for 9 months averaged 52, and those four-year-olds who
attended afternoons only for 9 months averaged 51. A second study of four-
year-olds who attended Holy Church Pre-School all day for 9 months showed
these children averaged 54 on the same kindergarten readiness test. A third
study of four-year-olds who attended no pre-school and were all from low
income households showed an average score of 32 on the same test. The differ-
ence between 32 and the other scores was found to be statistically significant
at the .05 level of confidence”.
24. Initially, the most plausible scientific hypothesis regarding the data is
(A) a child who scores 50 or higher is ready for kindergarten.
(B) more testing is needed before a plausible hypothesis can be formed.
(C) pre-school attendance is not related to kindergarten readiness.
(D) there should be funding for four-year-olds to attend pre-school.
(E) attending a pre-school is correlated with kindergarten readiness. CORRECT
The construction of this item is somewhat puzzling. Note that the “third study”
cited in the fictional passage correlates low scores on the kindergarten readiness
test with a pair of what seem to be separable factors: “no pre-school” and “low-
income background”. This makes B a plausible option. A, C, and D are clear non
sequiturs, and B looks to have an edge over E, as E seems to neglect the possibility
(raised by the findings of the “third study”) that differences in household income
correlate with differences in test scores. Despite looking good on these grounds,
however, B seems to be ruled out, simply by the way in which the question stem
and option together are worded. Note that the test taker who selects B is effectively
committed by the wording of the item to the strange claim that “Initially, the most
plausible scientific hypothesis regarding the data is” [option B] “more testing is
needed before a plausible hypothesis can be formed”. This is a rather odd “hypo-
thesis”. So while option B on its own seems at least as good as, if not better than,
any of the other options, taken in combination with the question stem it makes for
a singularly odd answer. The result is unclarity (at minimum). Option B seems best
on strictly evidential grounds, but at the same time it appears questionable owing
to the wording of the item. As the question stands it cannot serve to measure either
the presence or absence of critical thinking skills.
We turn next to Question #29.
29. “Confidentiality is an important part of the relationship between doctor and
patient. But protecting innocent people from serious harm is also important.
Nobody can say with certainty which value is the more important of the two.
This can create some agonizing dilemmas. For example, a doctor may know
that a patient is going to harm someone or be harmed by someone, as in the case
of suspected child abuse. This puts the doctor in a difficult situation regarding
whether to maintain confidentiality or to inform the proper authorities about
the suspected danger”. The best evaluation of the speaker’s reasoning is
(A) good thinking, because confidentiality cannot be compromised.
(B) good thinking, because in the abstract these values conflict. CORRECT
(C) poor thinking, because in practice doctors do choose one value over
another.
(D) poor thinking, because the law clearly says protecting the child is more
important.
This question raises a number of concerns – from ethics to logic to epistemo-
logy – and appears to offer no clearly correct answer. The key selects option B as
the correct answer, but is it? First of all it is difficult to see how “values” might
“conflict” in the abstract, but even if they may, the description given in this item
poses a concrete case of alleged “conflict”. So on this ground alone, C is at least
as good an answer as B. And the “Nobody can say with certainty . . . ” line is oddly
out of place in the context of such a matter: for the law does govern such cases in
every U.S. State and in most places on earth. So the law does “say with certainty”
in so far as the law “says” anything “with certainty”. But on this ground, D is at
least as good as C or B. Admittedly, this point depends on background knowledge,
but it is background knowledge that most test takers are likely to have, and that
more advanced critical thinkers are even more likely to have. And the “doctor may
know . . . ” line is also confused here because even in the unlikely event of a clear
and direct statement by a patient, the most a doctor has is a suspicion – a suspicion
which may be strong or weak and well supported or poorly supported – that the
patient will act in some specific way. But this consideration further supports the
view that there is “poor thinking” in the item, even though none of the options
matches with it. But then an attentive test taker would just note this “poor thinking”
and wonder further about this item. For since the answers provided don’t take note
of this aspect of “poor thinking”, the especially adept critical thinker will begin
to wonder which of the instances of “poor thinking” in C and D were also not
noticed by those who prepared the test, and since the test taker doesn’t have the
key, the choice between C and D begins to look like an even bet from the point
of view of a test taker who sees the flaws in the question and who has no option
other than to try to guess what the test writers meant to say. In any case, the critical
thinking skills of a test taker who reasoned in such ways are not faulty, and so this
item cannot provide a measure of either the presence or the absence of CT skills.
Perhaps the item could be improved by deleting “in the abstract” from option B
and then rephrasing to correct the remaining problems.
Here is Question #33 and its instructions:
For Questions 31, 32, 33 and 34 focus on the faulty inference in the following
fictional case:
A speech writer working for a white supremacist group claimed that white
Americans were “genetically superior to Blacks, Hispanics, Asians, Iranians
and all the other mongrel races in terms of native human intelligence”. To sup-
port this claim, the speech writer quoted a study which compared two groups
of tenth graders. Each group was given the same exam covering European
geography. The exam focused on European rivers, mountain ranges, countries,
capital cities, agriculture, industry, religion, music and languages. Group A
was 35 tenth graders, 34 of whom were whites with Anglo-European family
names. Group A students attended a private college prep school in wealthy
Orange County, California. That school requires ninth graders to take a year
of European history. Group B was 40 tenth graders, all but 4 of whom were
Hispanic, Black, Asian or Middle Eastern. Group B students attended a public
high school in a violent, gang infested ghetto community of south central Los
Angeles County. Ninth graders at the public high school take a year of world
history. The writer pointed out that Group A did significantly better on the
geography test than Group B.
33. Suppose a female social worker objected, “You can’t expect Group B children
to be as intelligent. After all, they come from a background of poverty, crime
and broken families”. If true, would this social worker’s reason be a good or
bad reason, and why?
(A) Good reason. Poor neighborhoods mean poor schools, poor schools mean
poor teachers, poor teachers mean poor students, poor students mean poor
test scores.
(B) Good reason. Regardless of race, children from these kinds of backgrounds
are less intelligent than children from wealthy backgrounds.
(C) Bad reason. Regardless of socioeconomic conditions, intelligence depends
on the quality of the school you attend.
(D) Bad reason. Poverty, wealth and family circumstances do not make a
person more or less intelligent. CORRECT
Selecting the “correct” answer to this question, response D, seems to depend upon
taking the term “intelligent” to denote something like inborn intellectual aptitude.
If the term is taken in this sense, then the social worker’s objection is a clear non
sequitur – but it is not clear that this is the only sense in which the term “intelligent”
is regularly employed. Suppose, for example, that one takes “intelligent” to be
roughly equivalent to “particularly apt at assimilating and applying new concepts
and new information”. There is nothing particularly idiosyncratic about ascribing
this sense to that term, but if one does so, then it is not at all clear that option
D is correct, because it is not clear that this kind of intellectual aptitude cannot
be negatively affected by “socio-economic conditions”. Presumably this aptitude
is helped or hindered by some array of conditions importantly including brain
development, during early childhood especially, and by good education or the lack
thereof. And in that case, the social worker’s claim begins to look like a plausible
(if not unproblematic) objection to the supremacist: one could take the objection
to say that the difference in intelligence (supposedly) exhibited by the members of
these groups is a function of socio-economic background, not of race. Moreover,
because the social worker’s objection is clearly a non sequitur provided that “in-
telligent” is given a strict innatist construal, many of the better critical thinkers
taking the test may well reject this construal in favor of an alternate reading like
the one just offered. Such a reading is both permissible (owing to the ambiguity of
“intelligent”) and more plausible as an interpretation of the social worker’s view;
it is also incompatible with the selection of response D, which the test’s authors
deem the correct response. Since options A and C are not plausible, this likely will
lead the best critical thinkers to reconsider options B and D, costing them time on
the exam. Likely some of them in the end would conclude that the innatist notion
of intelligence is being assumed, and so choose D. But there is a case for option
B. For example, consider a student who reasons something like this: “Eating lead-
based paint off the walls (a matter of poverty and family circumstances, among
other things) causes lower intelligence. Nutrition (a matter of poverty and family
circumstances) has effects on intelligence. So, I’ll choose B – which has a grain of
truth to it”. In short, this item is ambiguous, and implausible in its construal of the
social worker’s claim. At best, the talented critical thinker will spend inordinate
time on it; at worst she will spend that time only to choose a “wrong” answer, B.
Before turning to the scope of the exam, there are a few matters of expression
worth noting. There is an improper expression of the subjunctive mood contained in
the instructions for Questions #11 and #12 viz. “For example, (4) suppose there was
a woman prisoner whom you knew for certain to be totally innocent”. This should
read, “. . . suppose there were . . . ”. Further, the expression in Question #19 reads
in part, “Nobody can be a krendalog to themself, but today . . . ”. This should read,
“Nobody can be a krendalog to himself or to herself, but today . . . ”. Now perhaps
these might be thought to be mere quibbles. But that would be a mistake: The
subjunctive mood is of considerable importance in ordinary English. It provides
English speakers with the ability to distinguish between possibility and actuality, a
logical distinction of considerable consequence: what is (or was) the case and what
can be the case are quite different things, and any critical thinker (including any
writer of a critical thinking test) must be competent in expressing this distinction. A
competent critical thinker might respond to “suppose there was a woman prisoner
whom you knew for certain to be totally innocent” by saying, “I don’t know what
I’m supposed to do about that now!” And in general careful attention to matters
of proper expression is a mark of a critical thinker. Hence, it is important that
academic materials reflect this sort of careful attention. Materials and teachers are
students’ role models.

Scope
Another consideration relevant to the assessment of any test is its breadth,
the range of competencies it attempts to measure. Posted at the website
<http://www.geocities.com/siuchu2002/ & http://www.geocities.com/fawkesdx/>
is a fairly comprehensive inventory of more than 250 basic CT skills. By comparing
the exam with the inventory we can identify the skills that the exam attempts to
measure.3 The reader is encouraged to make a comparison as well. In the presentation
below three CT models are crosschecked with the analysis, that is, if a skill
is found in one of the three CT models this is indicated by a letter designation: D,
for the Delphi model; E, for the United States National Educational Goals model;
and S, for the Sonoma model. (Please see references.) The CT skills listed in the
analysis are stated as objectives; each completes the phrase, “A critical thinker is
able to . . . ”.

QUESTIONS 1–4
• interpret, and apply complex texts, instructions D E
• distinguish:
  • conclusions D E S
  • premises (reasons) D E S
• distinguish supporting, conflicting, and compatible claims, arguments, explanations,
  descriptions, representations etc. D
• assess the relevance of claims to other claims D E
• evaluate whether a deductive argument is valid or invalid (logical form) D E
• evaluate whether an inductive argument is strong or weak D E
• evaluate claims and arguments in terms of criteria such as:
  • consistency D E S
  • relevance E S
  • support
QUESTIONS 5–9
• recognize ambiguity and unclarity in claims, arguments, and explanations D E
• interpret and apply complex texts, instructions, illustrations etc. D E
• distinguish supporting, conflicting, compatible, and equivalent claims, arguments,
  explanations, descriptions, representations, etc. D
QUESTION 10
• recognize and clarify issues, claims, arguments, and explanations D E
• interpret and apply complex texts, instructions, illustrations etc. D E
QUESTIONS 11–13
• distinguish:
  • conclusions D E S
  • premises (reasons) D E S
  • explanations D E S
  • assumptions (stated and unstated) D E S
QUESTIONS 14–19
• interpret and apply complex texts, instructions, illustrations etc. D E
• evaluate whether a deductive argument is valid or invalid (logical form) D
QUESTIONS 20–21
• interpret and apply complex texts, instructions, illustrations etc. D E
• evaluate whether an inductive argument is strong or weak D E
QUESTIONS 22–23
• interpret and apply complex texts, instructions, illustrations etc. D E
• evaluate whether a deductive argument is valid or invalid (logical form) D
QUESTIONS 24–27
• interpret and apply complex texts, instructions, illustrations etc. D E
• evaluate whether an inductive argument is strong or weak D E
QUESTION 28
• interpret and apply complex texts, instructions, illustrations, etc. D E
• evaluate whether an inductive argument is strong or weak D E
• identify and avoid errors in reasoning: D
  • informal fallacy:
    • post hoc, ergo propter hoc (after that, therefore because of that)
QUESTION 29
• interpret and apply complex texts, instructions, illustrations etc. D E
• assess the relevance of claims to other claims D E
• distinguish supporting, conflicting, compatible, and equivalent claims, arguments,
  explanations, descriptions, representations etc. D
QUESTION 30
• interpret and apply complex texts, instructions, illustrations etc. D E
• identify and avoid errors in reasoning: D
  • informal fallacy:
    • begging the question
QUESTIONS 31–34
• interpret and apply complex texts, instructions, illustrations etc. D E
• evaluate whether an inductive argument is strong or weak D E
• identify and avoid errors in reasoning: D
  • informal fallacy:
    • smokescreen/red herring/rationalizing
Summary for The California Critical Thinking Skills Test
• interpret, and apply complex texts, instructions D E
• distinguish:
  • conclusions D E S
  • premises (reasons) D E S
  • explanations D E S
  • assumptions (stated and unstated) D E S
• assess the relevance of claims to other claims D E
• evaluate whether a deductive argument is valid or invalid (logical form) D E
• evaluate whether an inductive argument is strong or weak D E
• evaluate claims and arguments in terms of criteria such as:
  • consistency D E S
  • relevance E S
  • support
• recognize ambiguity and unclarity in claims, arguments, and explanations D E
• distinguish supporting, conflicting, compatible, and equivalent claims,
  arguments, explanations, descriptions, representations, etc. D
• recognize and clarify issues, claims, arguments, and explanations D E
• identify and avoid errors in reasoning: D
  • informal fallacy:
    • post hoc, ergo propter hoc (after that, therefore because of that)
    • begging the question
    • smokescreen/red herring/rationalizing
Of more than 250 basic critical thinking skills listed in the inventory (and there
are surely more than these), 17 are addressed by CS.

Recommendations
CS is seriously flawed as it stands. The limited scope may be unavoidable for any
multiple choice test on Critical Thinking designed for completion in about an hour.
But the exam makes mistakes in critical thinking in questions 6, 7, 8, 19, 21, 23,
24, 29, and 33. In most of the places where it goes wrong, the exam seems likely
to produce “false negative” evaluations of the performance of students who have
better developed CT skills. But whether or not CS generally results in such false
negatives in practice has no bearing on this assessment of CS’s content. The details
of the assessment given above provide the grounds on which the conclusion rests.
And those grounds show that any results from the portions of the exam shown to be
defective cannot be meaningful. No statistical conclusion can follow from content-
defective testing material. Nevertheless, the remainder of the exam is acceptable
with respect to content, and the defective questions can be replaced or modified
fairly readily. In the interim, those who may use the exam can eliminate the defective
parts from test delivery or from data collection; elimination of these parts from
test delivery would be best, in the interest of saving time and avoiding unnecessary
distractions for test takers.
As to the scope of the exam and the significance of such multiple choice testing,
it is unlikely that any multiple choice exam can hope to capture the range of CT basic
skills, but that is not an argument against such testing. It is instead an argument
in favor of understanding the results of any such testing. Such results can give an
indication of competence for the skills measured (as a kind of minimal competency,
below which remediation is in order for those skills); but such results cannot serve
as an adequate measure of critical thinking skills generally, and any such testing
should involve several different tests to give better indications. But (1) since most
CT skills involve a “supply” response rather than a “select” response (i.e. most
CT skills involve initiating responses rather than making a selection from given
alternatives); and, (2) since most CT skills involve reflection on these “supply”
responses themselves (thinking about thinking); and, (3) since many CT skills
involve originating thought and then carefully examining it, rather than making
any response at all, such testing even when done well can only provide indicators
at a rudimentary level. For these reasons any attempt to use such testing to grant
any form of credit by exam, or to waive any CT requirement, or to make any
positive claim about scores on such exams as indicators of competence is sheer
folly. The better place for both the acquisition and the assessment of CT skills is
the traditional classroom (with small class size,4 without multiple choice testing,
and with the requirement that students explain every answer to a teacher competent
in CT skills, who cares enough and has time enough to read and listen and respond
to every response and every explanation). There are no short cuts.

Quibbles, Controversy, and the General Reader


Quibbles and controversy are useful in an open and objective analysis intended
to invite discussion and improvement; and perhaps these provide some of the most
interesting issues to the general reader. Critical thinking is not easy. It ranges across
all subjects and disciplines; it ranges across far more than the logical, ethical,
pedagogical and epistemological issues raised here. As to whether or not some of
the analysis herein amounts to quibbling, the reader may wish to consider various
points of view. One point of view that seems particularly relevant is that of the
student. The analysis shows that those students who are most advanced in their
critical thinking skills are the ones most likely to be adversely affected. This is a
serious ethical and pedagogical matter. As for the matters of expression criticized,
collegiate level material needs to meet collegiate standards. The general reader will
no doubt find these points of interest as well. This evaluation has been an endeavor
to raise and to answer some questions about the scope and content of the exam,
and to examine issues of clarity, accuracy and precision that may be of interest to
a wider readership, with the expectation that the reader will hold this evaluation to
high standards as well.

Appendix
Demonstrations That
“Only those seeking challenge and adventure should join the Army”
and
“You shouldn’t join the Army unless you seek challenge and adventure”
Do Not Necessarily Express the Same Idea

1. We can explore these matters and provide a little context by considering the
notions of “obligation” and “permissibility”.
Where O is an obligation operator and P is a permissibility operator, and φ is a
proposition, the relationship between obligation and permissibility is:
Oφ ↔ ∼P ∼ φ
∼O ∼ φ ↔ Pφ
∼Pφ ↔ O ∼ φ
∼Oφ ↔ P ∼ φ
Call this the “Obligation-Permissibility Relation”.
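
The Obligation-Permissibility Relation can be checked mechanically under an assumed semantics. The sketch below is illustrative only and forms no part of the exam or of the proofs that follow. It assumes the standard possible-worlds reading of deontic logic, on which a proposition is obligatory when it is true at every deontically ideal world and permissible when it is true at some such world. The function names (obligatory, permissible, negate, relation_holds) and the three-world universe are hypothetical choices made for the illustration.

```python
from itertools import combinations, product


def obligatory(prop, ideal):
    """O: the proposition holds at every deontically ideal world."""
    return all(prop[w] for w in ideal)


def permissible(prop, ideal):
    """P: the proposition holds at at least one deontically ideal world."""
    return any(prop[w] for w in ideal)


def negate(prop):
    """Pointwise negation of a proposition (one truth value per world)."""
    return tuple(not value for value in prop)


def relation_holds(n_worlds=3):
    """Check the four equivalences on every proposition and every
    nonempty set of ideal worlds over a universe of n_worlds worlds."""
    worlds = range(n_worlds)
    for prop in product([False, True], repeat=n_worlds):
        for size in range(1, n_worlds + 1):
            for ideal in combinations(worlds, size):
                o, p = obligatory, permissible
                checks = [
                    o(prop, ideal) == (not p(negate(prop), ideal)),
                    (not o(negate(prop), ideal)) == p(prop, ideal),
                    (not p(prop, ideal)) == o(negate(prop), ideal),
                    (not o(prop, ideal)) == p(negate(prop), ideal),
                ]
                if not all(checks):
                    return False
    return True


print(relation_holds())
```

On this assumed semantics all four equivalences hold in every model examined, which is what one would expect if O and P are duals in the way the relation states.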
The stem of Question 6 gives “Only those seeking challenge and adventure
should join the Army”. Where
• ‘Hx’ is “x is a human”,
• ‘Jx’ is “x joins the Army”,
• ‘xSy’ is “x seeks y”,
• ‘a’ is “adventure”,
• ‘c’ is “challenge”,
• ‘O’ is the obligation operator, and
• ‘P’ is the permissibility operator,
the claim “Only those seeking challenge and adventure should join the Army” is

(x){OJx → [Hx & (xSc & xSa)]}.

So, beginning with ‘(x){OJx → [Hx & (xSc & xSa)]}’ we can prove:
1. (x){OJx → [Hx & (xSc & xSa)]}
2. OJp → [Hp & (pSc & pSa)]                 From 1 by Universal Instantiation
3. ∼OJp ∨ [Hp & (pSc & pSa)]                From 2 by Implication
4. [∼OJp ∨ Hp] & [∼OJp ∨ (pSc & pSa)]       From 3 by Distribution
5. [∼OJp ∨ (pSc & pSa)] & [∼OJp ∨ Hp]       From 4 by Commutation
6. ∼OJp ∨ (pSc & pSa)                       From 5 by Simplification
7. OJp → (pSc & pSa)                        From 6 by Implication
8. ∼(pSc & pSa) → ∼OJp                      From 7 by Transposition
9. ∼(pSc & pSa) → P ∼ Jp                    From 8 by Obligation-Permissibility
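
The purely propositional steps of this proof (2 through 8) can be spot-checked by truth table. The sketch below is a hedged illustration and not the authors' own method: it assumes that OJp, Hp, pSc and pSa may be treated as four independent atoms, and the function and variable names are hypothetical.

```python
from itertools import product


def implies(a, b):
    """Material conditional."""
    return (not a) or b


def step2(oj, h, sc, sa):
    """OJp → [Hp & (pSc & pSa)]"""
    return implies(oj, h and (sc and sa))


def step7(oj, h, sc, sa):
    """OJp → (pSc & pSa)"""
    return implies(oj, sc and sa)


def step8(oj, h, sc, sa):
    """∼(pSc & pSa) → ∼OJp"""
    return implies(not (sc and sa), not oj)


# Every assignment of truth values to the four atoms.
rows = list(product([False, True], repeat=4))

entails_2_to_7 = all(implies(step2(*r), step7(*r)) for r in rows)
equiv_7_8 = all(step7(*r) == step8(*r) for r in rows)
entails_7_to_2 = all(implies(step7(*r), step2(*r)) for r in rows)

print(entails_2_to_7, equiv_7_8, entails_7_to_2)
```

Under these assumptions step 2 entails step 7, and steps 7 and 8 are equivalent, but step 7 does not entail step 2, since Simplification discards the conjunct Hp. Step 9 is therefore a consequence of the stem rather than a restatement of it.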

The allegedly correct answer is “You shouldn’t join the Army unless you seek
challenge and adventure”. So we have,

(x){[Hx & O ∼ Jx] ∨ (xSc & xSa)}.

So, beginning with ‘(x){[Hx & O ∼ Jx] ∨ (xSc & xSa)}’ we can prove:
D1. (x){[Hx & O ∼ Jx] ∨ (xSc & xSa)}
D2. [Hp & O ∼ Jp] ∨ (pSc & pSa)                     From D1 by Universal Instantiation
D3. (pSc & pSa) ∨ (Hp & O ∼ Jp)                     From D2 by Commutation
D4. [(pSc & pSa) ∨ Hp] & [(pSc & pSa) ∨ O ∼ Jp]     From D3 by Distribution
D5. [(pSc & pSa) ∨ O ∼ Jp] & [(pSc & pSa) ∨ Hp]     From D4 by Commutation
D6. (pSc & pSa) ∨ O ∼ Jp                            From D5 by Simplification
D7. ∼(pSc & pSa) → O ∼ Jp                           From D6 by Implication
D8. ∼(pSc & pSa) → ∼PJp                             From D7 by Obligation-Permissibility
Since step 9, in the first proof, is not equivalent to step D8 in the second proof, the
proposed answer to Question #6 is incorrect.
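
The non-equivalence of step 9 and step D8 can be illustrated with a small countermodel search. This is a sketch under stated assumptions, not a part of the original demonstration: it again assumes the possible-worlds reading on which P holds of a proposition when that proposition is true at some deontically ideal world, it treats “p seeks challenge and adventure” as a single truth value a, and it represents “p joins the Army” by one truth value per world. All names in the code are hypothetical.

```python
from itertools import combinations, product


def permissible(prop, ideal):
    """P: the proposition holds at at least one deontically ideal world."""
    return any(prop[w] for w in ideal)


def step9(a, joins, ideal):
    """∼(pSc & pSa) → P ∼ Jp, with a standing in for (pSc & pSa)."""
    not_joins = tuple(not j for j in joins)
    return a or permissible(not_joins, ideal)


def step_d8(a, joins, ideal):
    """∼(pSc & pSa) → ∼PJp, with a standing in for (pSc & pSa)."""
    return a or not permissible(joins, ideal)


def countermodels(n_worlds=2):
    """Collect every model in which step 9 and step D8 differ in truth value."""
    found = []
    for a in (False, True):
        for joins in product([False, True], repeat=n_worlds):
            for size in range(1, n_worlds + 1):
                for ideal in combinations(range(n_worlds), size):
                    if step9(a, joins, ideal) != step_d8(a, joins, ideal):
                        found.append((a, joins, ideal))
    return found


models = countermodels()
for a, joins, ideal in models:
    print(a, joins, ideal, step9(a, joins, ideal), step_d8(a, joins, ideal))
```

In each model the search finds, p does not seek challenge and adventure, and the ideal worlds include one in which p joins and one in which p does not. There step 9 is true (not joining is permitted) while step D8 is false (joining is permitted too). On this assumed semantics D8 is the stronger claim: it forbids joining, whereas step 9 merely permits not joining.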
The authors of the California Test might reply by saying,
The claim, “You shouldn’t join the Army” is to be understood as [Hx & ∼OJx].
But consider the following:
• Though “You shouldn’t join the Army” can be understood as [Hx & ∼OJx],
this is not clearly the only (or best) understanding of “You shouldn’t join the
Army”. While ‘should’ is ambiguous vis-à-vis the kind of obligation – presumably
it is prudential obligation – it would seem to make better sense to claim it
means, “It is in your best interest not to join the Army”, rather than, “It is not
in your best interest to join the Army”.
• For the “key correct” answer actually to be correct it is not sufficient for it to
be the case that “You shouldn’t join the Army” can be understood as [Hx &
∼OJx]; rather, this must be the case.
• If there is any ambiguity regarding what the statement means, the question is
defective, since it is a question of which statement expresses “the same idea”.
2. Perhaps a simpler way to see these matters is as follows:
The basic point is:
“should not p” may be interpreted either as
(1) “should not p” or (2) “not should p”.
The relevant structure of the statements can be expressed simply as:
The Stem Statement: “(x)[OJx → Ax]” where Ax = [Hx & (xSc & xSa)].
This translates “Only A are OJ” as the categorical statement “All OJ are A”.
The relevant structure of the second (purportedly equivalent) statement in
response D is
EITHER:
“not OJ unless A” = “∼OJ ∨ A” = “∼A → ∼ OJ” in which case response D
“would mean the same as the statement in the stem.”
OR:
“∼A → O ∼ J” in which case response D “would not mean the same as the
statement in the stem.”

Acknowledgements
Thanks to the following individuals for providing suggestions, review, and/or
commentary on earlier drafts: Bonnie Abney, President, Abney International; Henry
C. Byerly, Regents Professor Emeritus, The University of Arizona; Manuel Davenport,
Professor, Texas A & M University; James B. Dixon, Associate Dean
of Liberal and Interdisciplinary Studies, St. Augustine College; Charles Hudlin,
Professor, United States Air Force Academy; Nathan H. Miller, Attorney, Harrisonburg,
Virginia; Richard Morehouse, Editor, Analytic Teaching and Professor,
Viterbo College of La Crosse, Wisconsin; Ann-Janine Morey, Professor, Southern
Illinois University at Carbondale; Paul Newberry, Professor, California State
University at Bakersfield; Richard Parker, Professor, California State University at
Chico, and co-author of Critical Thinking (1998); James Slinger, Professor Emeritus,
California State University, Fresno; James M. Smith, Professor Emeritus,
California State University, Fresno; Barry Watts, Director of Programs Analysis
and Evaluation, Office of the United States Secretary of Defense. Thanks also to
Tom Adajian and Bill Knorpp of James Madison University, and Katherine Dimittiou
of the University of Virginia for suggestions for the first section of this
paper.

Notes
1 “K(s1, p1), K(s2, p2)” can be read as “s1 is a krendalog of p1, s2 is a krendalog of p2” and so on.
2 “K(s1, p1), K(s2, p1)” can be read as “s1 is a krendalog of p1, s2 is a krendalog of p1” and
so on.
3 Fawkes (2001) provides an analysis of the scope of all currently available commercial CT tests
and Fawkes et al. (2001) provides an assessment of the Watson–Glaser Critical Thinking Appraisal
exam as to both content and scope.
4 It may be appropriate to quantify the meaning of “small class size”. Perhaps it would be best
simply to note that at this writing the fairly widely publicized national goal of the U.S. Department
of Education for K through 12 education is a class size of 18. Is collegiate education less demanding?
The relevant question here is not whether or not college courses are more or less demanding than
K through 12 courses, but rather how demanding the relevant courses are to the students taking
them, and obviously there is a range of difficulty for any group of students. In that sense, K through
12 subjects likely are generally as demanding to those students as college subjects generally are to
college students.

References
Crosschecked Models of Critical Thinking:
D the Delphi model: Facione, Peter A. 1990, Critical Thinking: A Statement of Expert Consensus
for Purposes of Educational Assessment and Instruction, “The Delphi Report”, The California
Academic Press, Millbrae, California.
E the U.S. National Educational Goals model: Click, Benjamin A.L., Hoffman, Steven, Jones, Elizabeth,
Moore, Lynne M., Ratcliff, Gary & Tibbitts, Stacy: 1990, National Educational Goals,
National Assessment of College Student Learning: Identifying College Graduates’ Essential
Skills in Writing, Speech and Listening and Critical Thinking, Final Round Consensus of Faculty,
Employers, and Policymakers, United States Department of Education.
S the Sonoma model: Paul, Richard et al.: 1998, Center for Critical Thinking, Sonoma State University,
Critical Thinking: Basic Theory & Instructional Structures, Rohnert Park, California: The
Foundation for Critical Thinking.

General References:

Facione, Peter A. & Facione, Noreen C.: 1992, The California Critical Thinking Skills Test, Millbrae,
California: The California Academic Press, Second Edition (updated) 1994.
Fawkes, Don: 2001, ‘Analyzing the Scope of Critical Thinking Exams’, American Philosophical
Association Newsletter on Teaching Philosophy, Spring.
Fawkes, Don, Adajian, Tom, Flage, Dan, Hoeltzel, Steven, Knorpp, Bill, O’Meara, Bill, and Weber,
Dave: 2001, ‘Examining The Exam: A Critical Look At The Watson–Glaser Critical Thinking
Appraisal Exam’, Inquiry, Fall.