
Elementary School Children's Ability to Distinguish Hypothetical Beliefs

From Statements of Preference

Irene-Anna N. Diakidoy
University of Cyprus
Christos Ioannides
University of Piraeus
The authors examined students' understanding of hypotheses as beliefs that can be empirically verified.
Thirty second graders and 30 sixth graders considered cases of disagreement about foods and colors that
reflected either alternative hypotheses or different preferences. Their task was to decide whether the
validity of each expressed belief could be determined and to justify their decision. Younger students
considered both hypotheses and preferences as empirically verifiable, whereas older students were better
able to recognize in some cases that preferences are legitimately variable. This lack of distinction may
reflect limited metaconceptual ability or a deterministic epistemological view, both of which might
interfere with the understanding of the hypothesis-testing process.
In the past 15 years, there has been a proliferation of research
exploring issues related to the acquisition of scientific knowledge
and skills (Ioannides & Vosniadou, 2002; Klaczynski, 2000;
Klahr, Fay, & Dunbar, 1993; Kuhn, Amsel, & O'Loughlin, 1988;
Sodian, Zaitchik, & Carey, 1991) as well as to the teaching of
science in elementary and secondary schools (Guzzetti, Snyder,
Glass, & Gamas, 1993; Smith, Maclin, Grosslight, & Davis, 1997;
Toth, Klahr, & Chen, 2000). One line of this research has explic-
itly focused on the knowledge and skills that are taken to underlie
the understanding and the application of the scientific method (see,
e.g., Kuhn, Black, Keselman, & Kaplan, 2000). A central part of
this knowledge is the understanding of the hypothesis as a causal
or categorical statement that accounts for a particular state of
affairs. The scientific method revolves around the formulation and
empirical testing of hypotheses, and conclusions regarding their
validity enrich, modify, or restructure scientific knowledge. The
present study focused on the understanding of hypotheses as
statements or assertions whose truth value can be determined, at
least to some extent, through empirical testing.
Studies by Kuhn and her colleagues (e.g., Kuhn et al., 1988;
Kuhn, Garcia-Mila, Zohar, & Andersen, 1995) have documented
preadolescent children's weaknesses in scientific reasoning and,
specifically, in the ability to test causal hypotheses in a systematic
way. Kuhn has interpreted these findings as attributable to a lack
of differentiation between prior belief and the evidence that may
confirm or disconfirm it (Kuhn et al., 1988). On the other hand,
Sodian et al. (1991) have shown that first and second graders can
distinguish between conclusive and inconclusive tests of simple
hypotheses when these are presented to them. This finding led
Sodian et al. to the contrasting conclusion that even young children
can differentiate between a belief and the evidence that may
support or disconfirm it. The ability to differentiate between belief
and evidence is crucial to understanding the function and potential
outcomes of hypothesis testing in science domains. Equally cru-
cial, however, is the ability to understand what types of beliefs
would constitute a hypothesis and would, therefore, be subject to
empirical verification.
Hypotheses as a Subset of Belief Statements
To the extent that a hypothesis is promoted by one or more
individuals as a (potentially) true or valid account of a state of
affairs, it can be taken to represent a belief that, nevertheless, is
subject to disconfirmation on the basis of evidence. The opposite,
however, is not necessarily true. Only a subset of statements or
assertions from the universe of beliefs can be subject to discon-
firmation and therefore constitute legitimate hypotheses of the type
that would lend themselves to systematic scientific inquiry
(Dewey, 1906; Stace, 1945). This point is more clearly illustrated
by considering one such subset of beliefs that could be taken, more
appropriately, to represent preferences. The assertion "Red is a
more beautiful color than blue" represents an evaluative proposition
(Dworkin, 1996) that can be classified as a personal preference.
It also represents a belief to the extent that it is held to be true
by at least one person (Southerland, Sinatra, & Matthews, 2001).
As it is formulated, however, its generalizability is indeterminate,
and therefore, it cannot be subjected to the empirical test. On the
other hand, the validity and generalizability of the related assertion
"Most people think that red is a more beautiful color than blue"
can be determined, and therefore, it can function as a legitimate
hypothesis. The outstanding difference between these statements
of belief (the hypothesis and the preference) is that one can be
subjected to the empirical test whereas the other cannot because it
concerns a matter of personal taste (Carpendale & Chandler, 1996).

Irene-Anna N. Diakidoy, Department of Education, University of
Cyprus, Nicosia, Cyprus; Christos Ioannides, Department of Education,
University of Piraeus, Piraeus, Greece.
The authors made equal contributions to this article so author names
appear in alphabetical order.
We thank D. Natsopoulos for reviewing an earlier draft of the manu-
script, A. Raftopoulos for his comments about the philosophical distinc-
tions between kinds of beliefs, A. Andreou for his help with data collection,
and the children who participated in the study.
Correspondence concerning this article should be addressed to Irene-
Anna N. Diakidoy, Department of Education, University of Cyprus, Kal-
lipoleos 75, P.O. Box 20537, CY-1678 Nicosia, Cyprus. E-mail:
Journal of Educational Psychology Copyright 2004 by the American Psychological Association
2004, Vol. 96, No. 3, 536-544 0022-0663/04/$12.00 DOI: 10.1037/0022-0663.96.3.536

The understanding of this difference can be taken to reflect the
development of metacognitive competence (Kuhn et al., 1995) to
the extent that it implicates a reflection on the types of beliefs that
one may hold. As such, it presupposes the understanding of belief
as a mental state distinct from physical fact (Perner, 1991) but falls
short of the sophisticated metastrategic competence that is in-
volved in the coordination of multiple sources of evidence with
theory and that preadolescents have been found to lack (Kuhn et
al., 1988). It is, however, intimately connected with the ability to
recognize and/or formulate researchable questions and therefore is
a prerequisite to understanding the function of the scientific
method and its successful application. The immersion of students
in the scientific process, as in the case of inquiry-based learning,
involves, among other things, their active participation in the
formulation and testing of hypotheses (Chinn & Malhotra, 2002;
McGinn & Roth, 1999). It would appear, then, that the understand-
ing of hypotheses as a distinct set of beliefs defined by their
potential to be empirically verified would justify, to the mind of
the student, the process of testing them in the context of scientific
inquiry. That same understanding can be expected to highlight the
tentative and evolving nature of scientific knowledge (Elby &
Hammer, 2001; Polanyi, 1950), promoting thereby the develop-
ment of more sophisticated epistemological views (Carey & Smith,
1993; Southerland et al., 2001).
The ability to differentiate hypotheses from other belief state-
ments, such as preferences, is also related to the ability to differ-
entiate hypotheses from evidence (Sodian et al., 1991) but only to
the extent that it involves the (at least implicit) recognition that
the truth value of a belief is supportable by facts or reasons that are
distinct from the belief itself. The child, for example, who offers an
explicit reason for holding a belief of any kind that is not a simple
restatement of the belief itself demonstrates a basic understanding
that reasons represent distinct entities that are brought to bear on
beliefs. This understanding, however, does not necessarily imply
the ability to distinguish between different types of belief that, to
our mind, would also have implications concerning the ability to
distinguish between the various types of information or reasons
that can support them. Therefore, the highly constrained context of
the Sodian et al. (1991) study (i.e., children had to choose between
two tests that could yield either conclusive or inconclusive, but
otherwise appropriate, evidence concerning the validity of two
simple and mutually exclusive hypotheses) does not permit any
inferences concerning the extent to which children recognized the
presented contrasting beliefs as hypotheses.
Distinguishing Hypotheses From Other Belief Statements
Despite its importance with respect to the acquisition of scien-
tific knowledge and skills, the ability to distinguish between hy-
potheses and other types of beliefs has not been the subject of
intense investigation within the areas of cognitive-developmental
and educational psychology. However, one study by Carpendale
and Chandler (1996) has implications with respect to this issue.
Their study was designed to assess children's ability to understand
and account for differences in interpretation and preference in
relation to their understanding of false belief. In addition to a
standard false belief task, they also presented 5- to 6-year-old and
7- to 8-year-old children with scenarios in which two puppets
disagreed on matters of taste (e.g., which soup was tastier and
which picture was nicer) or arrived at a different interpretation of
an ambiguous stimulus. The ambiguous stimulus tasks involved
lexical ambiguity (e.g., having to wait for a ring), referential
ambiguity (e.g., object hidden under two equally large blocks),
and figural ambiguity (e.g., duck-rabbit and rat-man drawings).
For each scenario, children were asked to indicate whether it was
all right for the puppets to disagree and to justify their answer.
Indications that the different interpretations or tastes were equally
acceptable were scored as correct responses. Carpendale and
Chandler's results showed that children's scores on the matter-of-taste
tasks were significantly different from their scores on the
ambiguous stimuli tasks. Overall, all children were more likely to
indicate that differences in preference were more acceptable than
differences in interpretation. More importantly, however, there
was also a significant main effect of age, indicating that the older
children were more likely to correctly attribute differences in
preference to matters of personal taste and differences in interpre-
tation to the ambiguous nature of the stimuli.
Acknowledging variations in preference as more acceptable
than variations in interpretation reflects a basic recognition of
these statements of belief as different in some respects. In the
Carpendale and Chandler (1996) study, however, the conflicting
preferences and interpretations were equally acceptable. What
distinguished them was the source of their justification (i.e., person
based as opposed to stimulus based) and their potential to be
verified as true and generalizable. What Carpendale and Chandler
termed interpretations could essentially be conceptualized as cat-
egorical hypotheses concerning the nature or the meaning of an
ambiguous stimulus. Therefore, one could verify them the same
way one would test such a hypothesis: by collecting more infor-
mation about the stimulus in question and its relation to other
stimuli. In contrast, one would not be able to arrive at a similar
generalizable conclusion concerning variations in preference.
Their validity must be taken at face value because the observed
variations derive from personal subjective experience and neither
need nor admit external justification.
The fact that the older children were better able to distinguish
preferences from interpretations on the basis of differences in
sources of origin and justification (Carpendale & Chandler, 1996)
reflects the development of an understanding of a fundamental
way in which various beliefs may differ from each other. However,
it does not necessarily reflect a corresponding understanding that
what further distinguishes a particular set of beliefs is their poten-
tial for empirical validation. Therefore, the present study sought to
assess the extent to which elementary school children could dif-
ferentiate hypotheses from other beliefs as statements whose truth
value can be decided upon on the basis of empirical testing. The
sample included second graders, whose age range corresponded to
that of the older children in the Carpendale and Chandler (1996)
study, and sixth graders. The choice of the sample was motivated,
first, by the fact that it is in elementary school that children come
formally into contact with scientific knowledge and activities as
such and, second, by the possibility of a developmental progres-
sion toward a more mature understanding of what would constitute
a hypothesis and an empirical test in relation to that formal
exposure. We expected, then, that the sixth-grade students would
be better able to recognize that the validity of a hypothesis can be
determined whereas the validity of a preference must be taken at
face value.
Our basic procedure was similar to the one used by Carpendale
and Chandler (1996) in the sense that we also presented partici-
pants with scenarios of disagreement about foods and colors that
reflected either a difference in preference or alternative hypotheses
of the type that could be the focus of scientific investigation.
However, instead of asking children to decide whether the dis-
agreement was acceptable, we asked them to decide whether one
could determine which side was right or wrong and to justify their
decision. We asked for a justification because a positive answer
with respect to whether the validity of a particular assertion could
be determined would not permit any inferences regarding children's
understanding of what would constitute a hypothesis test.
For example, a child, or an adult for that matter, might indicate that
the validity of an assertion can be determined by asking an expert
or by reading. Although this would certainly influence one's
gravitation toward one side of an issue, it would not count as
legitimate empirical evidence on the basis of which hypotheses are
evaluated in science domains. Moreover, one could indicate that a
decision can be based on appearance and taste comparisons. In this
case, a belief is justified on the basis of subjective evaluation, and
although this is a way of assessing whether one shares a prefer-
ence, it would not count as a legitimate hypothesis test.
Research has shown that prior knowledge and personal beliefs
can influence reasoning (see, e.g., Stanovich & West, 1997), the
extent to which hypotheses are perceived as plausible (Klahr et al.,
1993), and the way evidence is interpreted and evaluated (see, e.g.,
Chinn & Brewer, 1993). Specifically, Klaczynski (2000) found
that although adolescents used higher order analytic reasoning to
evaluate evidence that was inconsistent with their beliefs, they
relied on simple heuristics to evaluate evidence that was consis-
tent. We reasoned, therefore, that favoring one alternative belief
might influence decisions concerning the extent to which its va-
lidity could be tested and, more importantly, proposals concerning
the way it could be tested. To control for possible prior belief
biases in the present study, half of all the disagreements involved
one alternative hypothesis or preference that students had been
found to favor, whereas the other half involved neutral
To summarize, the primary goal of the present study was to
determine the extent to which elementary school children distin-
guished hypotheses from preferences as belief statements whose
truth value can be determined and to examine the kinds of tests that
they proposed. We hypothesized that older children would be more
likely than younger children (a) to indicate that only hypotheses, as
opposed to preferences, could be tested and (b) to propose empir-
ical tests, although not necessarily well designed or correct, as
opposed to subjective evaluations or references to authority. An
additional but secondary goal was to examine the extent to which
younger and older children were influenced by their prior belief
biases in their decisions and test proposals.
Participants and Procedure
The participants in the study were 60 students who came from predom-
inantly middle-class backgrounds and whose native language was Greek.
All students were attending the same elementary school located in a large
metropolitan area in the island nation of Cyprus. Half of the students
(n = 30) were attending Grade 2 (15 boys and 15 girls), and their ages ranged
from 7 years 6 months to 8 years 3 months (mean age = 7 years 11
months). The rest of the students (n = 30) were attending Grade 6 (17 boys
and 13 girls), and their ages ranged from 11 years 3 months to 12 years 4
months (mean age = 11 years 7 months). All participants came from one
second-grade classroom and one sixth-grade classroom chosen randomly
from the three classrooms per grade level available at the school. Grade
point averages and individual subject test scores were not made available
as a matter of school policy. However, at the time of the study, the school
had an explicit mixed-ability grouping policy, and therefore, it is more
likely that a range of school achievement levels was represented in the
sample.
All students were interviewed individually twice by Christos Ioannides,
an experienced science educator, and his research assistant. Both inter-
views took place during the school day and were conducted in the students'
and the researchers' native language. At the beginning of the first inter-
view, each student was informed about the general purpose of the study and
the types of questions that he or she would be asked. The first interview
took place after the student had indicated his or her willingness to continue
and lasted approximately 5 min. Its purpose was to reveal students'
personal preferences about foods and colors and their prior beliefs con-
cerning the properties of these. The second interview took place 3 days
later and lasted approximately 10 min. Its purpose was to reveal students'
understanding of hypotheses as distinct from preferences. All responses
were tape-recorded and transcribed.
Preliminary questionnaire. The first interview was structured accord-
ing to a preliminary questionnaire that included 12 questions. Half of the
questions were designed to elicit students' preferences about foods and
colors. Specifically, students were asked to name foods and colors they
liked best, those they disliked, and those they neither liked nor disliked.
The rest of the questions were designed to elicit students' personal beliefs
about specific properties of foods and colors that could potentially function
as hypotheses. Specifically, students were asked to name foods they
believed had high nutritional value, foods they believed had low nutritional
value, and foods whose nutritional value they believed was average.
Similarly, students were asked to name colors they believed were highly
visible from a long distance away, colors they believed were not visible,
and colors whose visibility they believed was about average.
Main questionnaire. The second interview was structured according to
a main questionnaire that included eight disagreement scenarios or cases,
each followed by a question and a justification requirement (see Appen-
dix). Students were asked to consider each disagreement case, to evaluate
whether it would be possible to decide if one side or party was right, and
to justify their answer. Overall, half of all the cases concerned disagree-
ments about foods, whereas the rest concerned disagreements about colors.
Moreover, half of the cases referred to disagreements about matters of
preference (Cases 1, 2, 5, and 6; see Appendix), whereas the rest referred
to disagreements about beliefs that could function as hypotheses (Cases 3,
4, 7, and 8; see Appendix).
The main questionnaire was individualized according to each student's
personal beliefs, as indicated by his or her responses to the preliminary
questionnaire. Each student had to consider disagreements for which he or
she was found to share the preference or hypothetical belief expressed by
one party and disagreements for which the student had previously ex-
pressed no strong bias. As a result, the main questionnaire included two
biased preference cases (Cases 1 and 5; see Appendix), two neutral
preference cases (Cases 2 and 6; see Appendix), two biased belief cases
(Cases 3 and 7; see Appendix), and two neutral belief cases (Cases 4 and
8; see Appendix). The order of question presentation was counterbalanced
between students both with respect to conceptual category (foods and
colors) and with respect to within-category questions.
Students' responses to the questions following each disagreement case
were scored as to their appropriateness depending on whether the case
reflected a disagreement in preference or hypothetical belief. For example,
indications that the nutritional value of foods and the visibility of colors
could be determined would constitute appropriate responses. On the other
hand, indications that the tastiness of foods and the beauty of colors could
also be determined would constitute inappropriate responses. Appropriate
responses received a score of 1, and inappropriate responses received a
score of 0. Subsequently, Christos Ioannides and his research assistant,
independently of each other, classified responses and justifications into
categories designed to capture the full range of the responses obtained.
Interrater agreement was high (84%), and all differences were resolved by
Irene-Anna N. Diakidoy.
Differentiating Between Hypotheses and Preferences
Preliminary analysis indicated that scores received for all cases
within each conceptual category (food and color) were not significantly
different from each other (M = 2.53, SD = 0.90, and M = 2.70, SD = 0.91,
respectively), paired t(57) = 1.31, p = .19 (two-tailed), d = 0.21.
Therefore, students' scores across cases
were summed to yield six combined scores: a total belief score, a
total preference score, a biased belief score, a biased preference
score, a neutral belief score, and a neutral preference score. How-
ever, the small number of cases within each score category and the
near-ceiling performance on the belief cases resulted in marked
deviations from normality (see Table 1). Therefore, data were
analyzed with nonparametric procedures, and effect sizes were
estimated on the basis of proportion differences (Fleiss, 1994).
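The scoring rule and the six combined scores described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the case groupings follow the Appendix references given in the text, while the function names, the dictionary layout, and the sample student are hypothetical and not part of the study materials.

```python
# Case groupings as described in the text (see Appendix references).
PREFERENCE_CASES = {1, 2, 5, 6}
BELIEF_CASES = {3, 4, 7, 8}
BIASED_CASES = {1, 5, 3, 7}  # the remaining cases are neutral


def score_response(case, says_determinable):
    """Return 1 for an appropriate response and 0 for an inappropriate one.

    For belief (hypothesis) cases, saying that validity can be determined
    is appropriate; for preference cases, it is inappropriate.
    """
    if case in BELIEF_CASES:
        return 1 if says_determinable else 0
    return 0 if says_determinable else 1


def combined_scores(answers):
    """Sum case scores into the six combined scores.

    `answers` (hypothetical layout) maps case number -> True if the
    student said the validity of the belief could be determined.
    """
    scores = {case: score_response(case, ans) for case, ans in answers.items()}

    def total(cases):
        return sum(scores[c] for c in cases)

    return {
        "total_belief": total(BELIEF_CASES),
        "total_preference": total(PREFERENCE_CASES),
        "biased_belief": total(BELIEF_CASES & BIASED_CASES),
        "neutral_belief": total(BELIEF_CASES - BIASED_CASES),
        "biased_preference": total(PREFERENCE_CASES & BIASED_CASES),
        "neutral_preference": total(PREFERENCE_CASES - BIASED_CASES),
    }


# Made-up student who claims that every disagreement can be settled.
example = combined_scores({case: True for case in range(1, 9)})
print(example)
```

A student who treats every disagreement as decidable earns the maximum total belief score of 4 and a total preference score of 0, which is the response pattern the study associates with a failure to distinguish the two kinds of statement.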
Overall, 35 students (59%) recognized correctly that the validity of
hypothetical beliefs can be determined, χ²(4, N = 60) = 61.42, p = .00,
whereas 18 students (31%) indicated incorrectly that the validity
of personal preferences can also be determined, χ²(4, N = 60) = 14.31,
p = .01, D = 0.28. Moreover, it can be seen from Table 2 that
correct and incorrect responses were equally frequent in the prefer-
ence categories. A Wilcoxon signed-ranks test for matched pairs
showed that differences between beliefs and preferences, in terms of
correct responses, were significant, Z = 3.60, p = .00 (two-tailed),
with effect sizes ranging from D = 0.20 to D = 0.48. Although
response patterns in the biased belief and neutral belief categories
were significantly different from each other, Z = 3.35, p = .00
(two-tailed), with effect sizes ranging from D = 0.14 to D = 0.22,
this was not the case with the biased preference and neutral preference
categories, Z = 1.36, p = .17 (two-tailed).
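The chi-square values reported in Table 2 can be checked with a short goodness-of-fit calculation. The sketch below assumes that the expected frequencies are equal across the three possible scores (0, 1, and 2); the article does not state this explicitly, but under that assumption the computed statistics agree with the tabled values. The function name is hypothetical.

```python
def chi_square_uniform(observed):
    """Goodness-of-fit statistic against equal expected frequencies."""
    n = sum(observed)
    expected = n / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


# Frequencies of scores 0, 1, and 2 as listed in Table 2.
table_2 = {
    "Biased belief": [2, 9, 48],
    "Neutral belief": [7, 17, 35],
    "Biased preference": [24, 16, 20],
    "Neutral preference": [21, 15, 23],
}

results = {name: round(chi_square_uniform(freqs), 2)
           for name, freqs in table_2.items()}

for name, value in results.items():
    # Three score categories give 2 degrees of freedom.
    print(f"{name}: chi-square(2) = {value:.2f}")
```

The two belief categories depart strongly from an even spread of scores, whereas the two preference categories do not, which matches the statement that correct and incorrect responses were about equally frequent for preferences.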
A series of Mann-Whitney tests indicated that response patterns
in any of the categories were not significantly different as a
function of sex (p > .05). On the other hand, differences as a
function of grade emerged as significant only with respect to the
preference categories (see Table 3). Second graders (37%) were
more likely than sixth graders (24%) to indicate incorrectly that the
truth value of preferences can also be determined, Z = 2.41,
p = .02 (two-tailed), D = 0.13. Moreover, there were significant
differences in the sixth graders' response patterns in the biased belief
and neutral belief categories, Z = 2.80, p = .01 (two-tailed),
with effect sizes ranging from D = 0.10 to D = 0.26. In comparison,
response differences between biased belief and neutral belief
categories were not as pronounced in the second graders' group,
Z = 1.96, p = .05 (two-tailed). However, students at both grade
levels had similar response patterns in the biased preference and
neutral preference categories (p > .05).
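The effect size D used in this section appears to be a simple difference between two proportions. The sketch below works from that assumption, which is consistent with the two cases where the text reports both percentages and the resulting D; the function name is hypothetical.

```python
def proportion_difference(pct_a, pct_b):
    """Difference between two percentages, expressed as a proportion."""
    return (pct_a - pct_b) / 100


# 59% correct on hypothetical beliefs vs. 31% incorrect on preferences.
d_belief_vs_preference = proportion_difference(59, 31)

# 37% of second graders vs. 24% of sixth graders on preference cases.
d_grade_2_vs_grade_6 = proportion_difference(37, 24)

print(d_belief_vs_preference, d_grade_2_vs_grade_6)
```

Both results (0.28 and 0.13) match the D values given in the text for these comparisons.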
Justification Categories
Tables 4 and 5 show the types of justification given by students
for their positive and negative responses and their frequency in
each of the belief and preference cases. Overall, it can be seen that
the frequency of unjustified responses (9% on average) and simple
belief restatements (4% on average) was low across all cases of
disagreement. With respect to statements of preference (see Table
4), justifications on the basis of subjective evaluations that could
not lead to generalizable conclusions were the most prevalent
(38% on average). A typical example of this justification type was
given by Constantinos (Grade 2), who indicated that "yes, each
one should try the other's favorite food, and then they will see who
is right [about which food tastes better]." In fact, all students who
resorted to this type of justification stated that simply tasting the
foods or painting the colors would help them resolve the disagreement.

Table 3
Score Frequencies in Biased and Neutral Categories Within Grade

Category                Score 0   Score 1   Score 2   χ²         N
Grade 2
  Biased belief            2         5        22      24.07**    29
  Neutral belief           4         8        17       9.17*     29
  Biased preference       15         9         6       4.20      30
  Neutral preference      14        10         6       3.20      30
Grade 6
  Biased belief            0         4        26      16.13**    30
  Neutral belief           3         9        18      11.40**    30
  Biased preference        9         7        14       2.60      30
  Neutral preference       7         5        17       8.55*     29
Note. The degrees of freedom for all chi-square tests are 2.
* p < .05. ** p < .01.

Table 1
Means, Standard Deviations, and Normality Indices for
Combined Scores (N = 60)

Combined scores         M      SD     Kurtosis   Skewness
Biased belief           1.78   0.49   4.40       2.23
Neutral belief          1.48   0.70   0.30       0.98
Total belief            3.25   1.08   0.68       1.31
Biased preference       0.93   0.86   1.65       0.13
Neutral preference      1.03   0.87   1.69       0.07
Total preference        1.97   1.67   1.64       0.08
Note. Maximum possible biased and neutral scores = 2. Maximum
possible total score = 4.

Table 2
Score Frequencies in Biased and Neutral Categories

Category                Score 0   Score 1   Score 2   χ²         N
Biased belief              2         9        48      62.47**    59
Neutral belief             7        17        35      20.47**    59
Biased preference         24        16        20       1.60      60
Neutral preference        21        15        23       1.76      59
Note. The degrees of freedom for all chi-square tests are 2.
** p < .01.

On the other hand, negative answers justified on the basis of
individual differences, which would also be more appropriate for
statements of personal preference, were less frequent (28% on
average). For example, Aphrodite (Grade 6) decided that "no, we
cannot [determine who is right], because it depends on what color
each one likes." Moreover, about 15% of all students justified their
negative answers by resorting to egalitarian responses. An example
of such a response was given by Christos (Grade 6), who said that
"we cannot say [who is right], because both colors are nice." The
frequency of egalitarian responses was comparable across cases of
biased preference and neutral preference. Finally, it is interesting
to note that a few students (5% on average) justified their positive
responses by proposing an objective evaluation of a different but
related belief statement. Phanis (Grade 6), for example, proposed
that the two parties should paint one sheet green and another
yellow and then have their classmates vote which one is more
With respect to statements of belief that could function as
hypotheses (see Table 5), justifications on the basis of objective
evaluations were the most frequent (38% on average), especially
for cases concerning the visibility of colors from a distance.
Typically, these justifications involved looking from a distance.
Maria (Grade 2), for example, indicated that, to determine which
color is more visible from a distance, "they should paint both on
pieces of paper, then go far away and see which color can be seen
better." The objective evaluations proposed in the case of the
foods' nutritional values were slightly more varied. For Cypriana
(Grade 2), the way to determine which food is more nutritious was
to "have each party eat one kind of food and then see who gets
fatter." For Michael (Grade 6), on the other hand, the way to
determine which food is more nutritious was to "have each party
eat one kind of food only. Then, after a while, see which person
has gained more weight. The person who has gained the most has
eaten the least nutritious food." A few students thought that health,
instead of growth, was the important indicator and, like Stephanos
(Grade 6), stated that "each one should have a blood test, maybe
check for cholesterol?" Finally, one student, Stelios (Grade 6),
proposed that "we can analyze the hamburger and see how many
vitamins and other nutrition stuff it has. Then we can do the same
with corn flakes and compare them." Justifications on the basis of
objective evaluations, regardless of their completeness or scientific
adequacy, were considered to be the most appropriate for cases of
belief. However, several students (14% on average) also proposed
subjective evaluations similar to the ones proposed for cases of
preference.
Students also justified their positive responses with references to
properties of foods and colors or to objects having a particular
color (11% on average) and with references to authority, such as
parents, doctors, and favorite friends (8% on average). For example,
Marina (Grade 6) decided that one party was wrong because
"potatoes do not have as many vitamins as beans." Antonis (Grade
6) determined that yellow must be more visible from a distance
because "the sun is yellow and it is far away." Natasha (Grade 2),
on the other hand, suggested that "they should ask their mothers,
and then they will see who is right." It must be noted that
references to properties, objects, and authorities were more fre-
quent in cases of disagreement about the nutritional value of foods
than in cases concerning the visibility of colors (see Table 5).
Finally, egalitarian responses were more frequent in cases of
neutral belief (15% on average) than in cases of biased belief (5%
on average).
Justification categories were not equally frequent across grade
levels. Unjustified responses and belief restatements were more
frequent in Grade 2 (19% on average) than in Grade 6 (6% on
average). With respect to preference cases (biased or neutral), the
younger students were more likely to justify their responses on the
basis of subjective evaluations (46% on average), whereas sixth
graders were more likely to suggest that disagreements were due to
individual differences (51% on average). Moreover, the second
most frequent justification type for the second graders was the
egalitarian response (21% on average), whereas for the sixth
graders, it was the subjective evaluation (26% on average).

Table 5
Frequency of Justification Categories in Cases of Biased Belief
and Neutral Belief (N = 60)

                                           Biased belief      Neutral belief
Justification category                     Case 3   Case 7    Case 4   Case 8
No explanation                                3        1         0        1
No way to find out                            4        1         1        3
Belief restatement                            3        4         2        3
Reference to authority                       10        0         9        0
Reference to property or object              14        2         5        4
Subjective evaluation                         9        8        11        6
Objective evaluation of related belief        0        0         1        0
Objective evaluation                         12       36         9       34
No explanation                                1        2         5        2
Egalitarian responses                         2        4        13        5
Reference to individual differences           1        2         2        2
Don't know/no response                        1        0         2        0

Table 4
Frequency of Justification Categories in Cases of Biased
Preference and Neutral Preference (N = 60)

                                           Biased preference  Neutral preference
Justification category                     Case 1   Case 5    Case 2   Case 6
No explanation                                1        1         0        1
No way to find out                            0        4         2        1
Belief restatement                            1        0         3        2
Reference to authority                        0        2         1        2
Reference to property or object               1        2         1        2
Subjective evaluation                        28       19        25       18
Objective evaluation of related belief        2        5         1        3
Objective evaluation                          0        0         0        0
No explanation                                3        0         2        2
Egalitarian responses                         8       10         8        9
Reference to individual differences          15       17        17       19
Don't know/no response                        1        0         0        1

Differences between grade levels were not as clear with respect to the
belief cases. Although the older students were more likely to
propose objective evaluations (48% on average), they also resorted
to references to properties and authorities when considering the
nutritional value of foods (23% on average). In contrast, the
younger students proposed objective evaluations primarily when
considering the visibility of colors (47% on average). Otherwise,
they were more likely to propose subjective evaluations (23% on
average). Finally, egalitarian responses were given by both the
second graders (17% on average) and the sixth graders (23% on
average) only to neutral belief cases.
Distinguishing Hypotheses From Preferences
The first question that motivated this study was whether
younger and older elementary school children distinguish hypoth-
eses as beliefs whose validity can be tested from preferences that
are justifiably variable. The findings indicate that, whereas stu-
dents readily agreed that the validity of hypotheses can be deter-
mined, they had difficulty understanding that the same is not true
for statements of preference. As expected, however, older children
were more likely to consider preferences as legitimately variable
and to attribute them to individual differences. In contrast, younger
children were likely to think that disagreements about preference
could also be resolved in favor of one side, and they proposed
evaluations that by necessity involved subjective comparisons that
would not yield generalizable conclusions. These findings suggest
that younger children (7- to 8-year-olds) have difficulty differen-
tiating between hypotheses and preferences. Older children (11- to
12-year-olds), on the other hand, appear to be better able to
distinguish between hypotheses and preferences, but the relatively
small effect size indicates that they do not do so with accuracy.
These findings appear to be in contrast to those of the Carpen-
dale and Chandler (1996) study, which showed that all 5- to
8-year-old children tended to consider differences in preference as
more acceptable than differences in the interpretation of ambigu-
ous stimuli. The fact, however, that a difference in preference is
considered acceptable does not necessarily imply an understanding
that it is not resolvable (see also Flavell, Flavell, Green, & Moses,
1990). The present findings lend validity to the above claim.
Although we did not ask children to decide whether it was okay for
two people to disagree (as in Carpendale & Chandler, 1996), we
can assume that they also considered disagreements acceptable,
even if only in the sense that they recognized that different people
can believe different things. However, the majority of the younger
children and some of the older ones also believed that disagreements
can be resolved regardless of whether they involve differences
in preferences or hypotheses. Moreover, the way they proposed
this could be done was similar for both kinds of belief
statements: by directing attention outward, toward the
stimuli in question (comparing foods and colors), rather than
inward, as Flavell et al. (1990) had claimed to be the case with
preferences and as Carpendale and Chandler (1996) had found to
be the case with the older children in their sample. These findings
appear to suggest that the metaconceptual ability that would allow
a reflection on and a further differentiation between the kinds of
beliefs that one may have is either lacking or not consistently
manifested in the early elementary school years.
The tendency to consider preferences and hypotheses as simi-
larly testable can also be taken to reflect an epistemological
presupposition that all statements can be either right or wrong. To
the extent that beliefs and knowledge are justified similarly in the
minds of students (Southerland et al., 2001), such a presupposition
can be thought of as related to a starting, commonsense episte-
mology, according to which knowledge is certain and conflicts are
attributable to incomplete or inaccurate information (Carey &
Smith, 1993; Hofer & Pintrich, 1997). The possibilities of limited
metaconceptual ability and a deterministic epistemological presup-
position do not necessarily represent alternative interpretations of
the findings. In fact, we consider them related in the sense that lack
of reflection on one's own beliefs would help sustain a deterministic
outlook and that, in turn, a firm conviction that all beliefs can
be right or wrong could inhibit such a reflection and, thereby,
metaconceptual awareness.
Older children's increased ability to recognize that some belief
statements are not testable may reflect increased metaconceptual
understanding and a weakening deterministic epistemology, both
of these brought about by their greater familiarity with the kinds of
belief statements that drive scientific investigation. It must be
noted, however, that the way science is taught in the Cypriot
elementary school is more likely to reinforce, inadvertently, a less
sophisticated, deterministic epistemological view. Science educa-
tion content and outcomes are determined by a national curriculum
(Cypriot Ministry of Education and Culture, 1996). All teachers
are required to follow the instructional methods that are specified
in detail for each lesson unit (including presentation content, time
frames, activities, teacher demonstrations, and questions) in the
science teacher's manual (Kyprianou, Loizidou, Charalambous,
Matsikaris, & Yiannakis, 1997). According to the manual, lessons
start with definitions and examples of target concepts. Subse-
quently, students engage in classification and example-recognition
activities. In addition, they observe teacher demonstrations or
conduct prespecified miniexperiments, both of which serve to
prove the validity of the previously offered accounts.
Although the manual advises teachers to start each lesson by
asking students what they know about the target concepts or the
general topic, it offers no similarly explicit guidelines on how to
respond to students' ideas or, in fact, whether to respond at all (see,
e.g., Kyprianou et al., 1997). In this case, it is up to the individual
teacher to decide whether to have students pursue their ideas
further or to proceed with the lesson. The second option is more
viable given the large amount of content that must be covered
within specific lesson periods, as well as within the course of the
school year (Cypriot Ministry of Education and Culture, 1996). As
a result, the sixth graders in our sample had limited experience, if
any, in formulating and investigating hypotheses in science (see,
e.g., Chinn & Malhotra, 2002). Therefore, their increased ability to
differentiate between kinds of belief, when compared with that of
the younger students, may also simply reflect their greater expe-
rience with unresolved preference disputes in the context of ev-
eryday life.
Although there was a tendency to regard hypotheses and pref-
erences as similarly testable, prior belief bias was found to exert an
influence on responses to hypothetical beliefs only. In general,
students' confidence that a disagreement could be resolved was
greater when it involved a hypothetical belief that was favored as
opposed to a neutral belief. A corresponding tendency was not
evident in students' responses concerning preferences. Moreover,
the influence of prior belief bias was greater in the older age group
than in the younger age group. Although one could speculate that
such differential influences might reflect conceptual differentiation
to some degree, we consider such speculation unwarranted
given the modest effect sizes and the limited scope of the study
with respect to this factor.
Proposing Tests for Hypotheses
The second question that the study sought to address concerned
the extent to which students were able to propose objective,
empirical tests for determining the validity of hypotheses. The
findings indicate that although the older children were clearly
better able to think of empirical tests, the younger children were
also able to propose objective evaluations in some cases. More-
over, all children were better able to think of empirical tests when
considering the visibility of colors from a distance than when
considering the nutritional value of different kinds of food. Be-
cause of the current emphasis on health and nutrition, all students
in the Cypriot elementary school have been advised at least once
about the importance of a healthy diet. That, in turn, may explain
the overall higher frequency of references to properties of food,
such as vitamin and sugar content, and to authorities, such as
doctors and dieticians, when considering the nutritional value of
foods. In contrast, students' everyday life provided many opportunities
for discovering and evaluating what things can be seen and
from how far away (e.g., letters on the blackboard). Therefore,
conceptual and task familiarity may have been responsible for all
students' increased ability to think of a test to resolve disagreements
about the visibility of colors.
In agreement with previous research (e.g., Sodian et al., 1991;
Wimmer & Perner, 1983), our findings also indicate that young
elementary school children demonstrate an understanding that a
belief is justifiable on the basis of reasons or facts that are distinct from the
belief itself. Few children, even in the younger age group, failed to
justify their responses, and even fewer restated a belief as evidence
for its validity. Moreover, we would argue that failure to justify a
belief does not necessarily reflect lack of understanding of what an
external justification is and of its necessity. Evidence for this is
provided by the few children in our sample who explicitly attrib-
uted their inability to justify to the fact that they did not know of
or they could not think of a way to evaluate who was right or
wrong. Therefore, young children's notable reliance on external
justifications and evaluations (subjective or objective) necessarily
implies that they also possess the basic ingredients of the strategic
competence that, according to Kuhn (1997), is needed to under-
stand the inferential relationship between evidence and belief.
The objective evaluations that our sample proposed did not
necessarily represent well-designed hypothesis tests, and in the
case of nutritional values, they were often scientifically inaccurate.
They were, however, genuine attempts to set up test conditions that
could yield potentially useful evidence. On the other hand, sub-
jective evaluations can also be taken to represent attempts at
testing and, as expected, were more prevalent with personal pref-
erence conflicts. It is also notable that the few students who
proposed objective evaluations to resolve them did so by interpret-
ing the initial statements as hypothetical generalizations about
people's preferences. This dominant tendency to propose some
kind of test suggests that elementary school students have a basic
understanding of experimentation (Sodian et al., 1991) even if they
have not yet mastered the intricacies of the skill. We would further
expect the acquisition and the successful application of this skill to
go hand in hand with, if not to depend on, the ability to distinguish
the kinds of belief statements that can be the focus of experimental
Limitations and Suggestions for Further Research
One could argue that our basic task requirement may have
biased the students to answer in the affirmative and to do their best
to propose some kind of test. Children may have responded dif-
ferently in the cases of preference if we had asked them whether it
could be determined if someone was right, instead of who was
right. However, we consider this possibility less likely given the
almost equal frequency of correct and incorrect responses in all
cases of preference. On the other hand, it must be noted that our
task was relatively limited when it came to revealing the full extent
of students' ability to think of empirical tests. It is possible that
students who proposed objective tests would have been able to
revise and develop their proposals further if they had been asked to
elaborate them. Similarly, students who proposed subjective tests
might have been able to recognize their proposals' lack of potential
to conclusively resolve a disagreement if they had been confronted
with the issue. Future research could modify and extend the basic
task to require children to consider the possible and expected
results of their test proposals and the extent to which these would
allow them to resolve a disagreement conclusively.
Arguably, the beliefs that our sample had to consider repre-
sented simple categorical statements. They were not beliefs about
complex causal relationships of the type investigated by Kuhn
(e.g., Kuhn et al., 2000). Therefore, one could suppose that
children's extensive experience with classifying the world might have
better prepared them to deal with the contrasting beliefs that our
tasks involved. It is possible that if we had asked them to consider
disagreements about factors influencing the nutritional value of
foods or the visibility of colors, they might have resorted more
often to belief restatements. To our knowledge, there is no research
that has directly examined the extent to which young children's
ability to differentiate hypotheses and to coordinate them with
evidence varies as a function of the type of hypotheses. The
contrasting findings of previous research (e.g., Kuhn et al., 1995;
Sodian et al., 1991) suggest that this factor may also play a role.
Therefore, further research in this direction may help to better
describe and explain younger children's ability to engage in
scientific reasoning.
Future research could also be designed to provide a more
in-depth look at the influence of prior belief bias and conceptual
familiarity on the ability to distinguish hypotheses from other
beliefs and on the types of tests proposed. Although prior belief
bias was included as a factor and was found to selectively influ-
ence decisions concerning hypotheses only, the limited number of
disagreement cases involving biased and neutral hypothetical be-
liefs and preferences does not permit any definite conclusions to be
drawn at this point. In contrast, there was no systematic effort to
control for conceptual or task familiarity in this study. Nevertheless,
there were clear task differences in the ability to
think of objective hypothesis tests, and these appeared to be due to
differences in conceptual familiarity. Therefore, a more thorough
examination of this factor and its possible interaction with belief
bias would provide a test of this possibility and would contribute
to arguments concerning the influence of prior knowledge in
relation to domain-general heuristics in the development of scien-
tific reasoning skills (see, e.g., Klahr et al., 1993).
Conclusions and Implications
The findings of the present study indicate that the related notions
of belief justification and evaluation may be acquired, albeit to a
limited degree, earlier than and more or less independently of
the ability to distinguish between different kinds of beliefs.
Nevertheless, elementary school children's limited ability to
distinguish hypotheses from unverifiable belief statements is likely to
interfere with their understanding of what constitutes a conclusive
hypothesis test and with their ability to formulate researchable
questions in a scientific inquiry context. However, students' initial
understandings of belief and belief justification can provide a
fertile ground for subsequent conceptual differentiation, refine-
ment, and extension. Early and extensive practice in formulating
and evaluating their own beliefs should facilitate children's ability
to distinguish those that can be empirically and conclusively
verified from those that cannot. This, in turn, should promote the
development of the metacognitive and metastrategic competencies
that Kuhn (1997) has claimed to be necessary for scientific rea-
soning and increase the epistemological authenticity of scientific
inquiry in educational settings (Chinn & Malhotra, 2002; Smith,
Maclin, Houghton, & Hennessey, 2000).
References
Carey, S., & Smith, C. (1993). On understanding the nature of scientific
knowledge. Educational Psychologist, 28, 235–251.
Carpendale, J. I., & Chandler, M. J. (1996). On the distinction between
false belief understanding and subscribing to an interpretive theory of
mind. Child Development, 67, 1686–1706.
Chinn, C. A., & Brewer, W. F. (1993). The role of anomalous data in
knowledge acquisition: A theoretical framework and implications for
science instruction. Review of Educational Research, 63, 1–49.
Chinn, C. A., & Malhotra, B. A. (2002). Epistemologically authentic
inquiry in schools: A theoretical framework for evaluating inquiry tasks.
Science Education, 86, 175–218.
Cypriot Ministry of Education and Culture. (1996). Elementary education:
National curriculum. Nicosia: Government of Cyprus.
Dewey, J. (1906). Beliefs and realities. Philosophical Review, 15, 113–129.
Dworkin, R. (1996). Objectivity and truth: You'd better believe it.
Philosophy and Public Affairs, 25, 87–139.
Elby, A., & Hammer, D. (2001). On the substance of a sophisticated
epistemology. Science Education, 85, 554–567.
Flavell, J. H., Flavell, E. R., Green, F. L., & Moses, L. J. (1990). Young
children's understanding of fact beliefs versus value beliefs. Child
Development, 61, 915–928.
Fleiss, J. L. (1994). Measures of effect size for categorical data. In H.
Cooper & L. V. Hedges (Eds.), The handbook of research synthesis (pp.
245–260). New York: Russell Sage Foundation.
Guzzetti, B. J., Snyder, T. E., Glass, G. V., & Gamas, W. S. (1993).
Promoting conceptual change in science: A comparative meta-analysis
of instructional interventions from reading education and science
education. Reading Research Quarterly, 28, 116–159.
Hofer, B. K., & Pintrich, P. R. (1997). The development of epistemological
theories: Beliefs about knowledge and knowing and their relation to
learning. Review of Educational Research, 67, 88–140.
Ioannides, C., & Vosniadou, S. (2002). The changing meanings of force.
Cognitive Science Quarterly, 2, 5–62.
Klaczynski, P. A. (2000). Motivated scientific reasoning biases,
epistemological beliefs, and theory polarization: A two-process approach to
adolescent cognition. Child Development, 71, 1347–1366.
Klahr, D., Fay, A. L., & Dunbar, K. (1993). Heuristics for scientific
experimentation: A developmental study. Cognitive Psychology, 25,
Kuhn, D. (1997). Constraints or guideposts? Developmental psychology
and science education. Review of Educational Research, 67, 141–150.
Kuhn, D., Amsel, E., & O'Loughlin, M. (1988). The development of
scientific thinking skills. New York: Academic Press.
Kuhn, D., Black, J., Keselman, A., & Kaplan, D. (2000). The development
of cognitive skills to support inquiry learning. Cognition and Instruction,
18, 495–523.
Kuhn, D., Garcia-Mila, M., Zohar, A., & Andersen, C. (1995). Strategies
of knowledge acquisition. Monographs of the Society for Research in
Child Development, 60(4, Serial No. 245).
Kyprianou, K., Loizidou, P., Charalambous, P., Matsikaris, G., &
Yiannakis, I. (1997). First steps in science: Sixth-grade science teacher's
manual. Nicosia: Government of Cyprus.
McGinn, M. K., & Roth, W. M. (1999). Preparing students for competent
scientific practice: Implications of recent research in science and
technology studies. Educational Researcher, 28(3), 14–24.
Perner, J. (1991). Understanding the representational mind. Cambridge,
MA: MIT Press.
Polanyi, M. (1950). Scientific beliefs. Ethics, 61, 27–37.
Smith, C., Maclin, D., Grosslight, L., & Davis, H. (1997). Teaching for
understanding: A study of students' preinstruction theories of matter and
a comparison of the effectiveness of two approaches to teaching about
matter and density. Cognition and Instruction, 15, 317–393.
Smith, C., Maclin, D., Houghton, C., & Hennessey, M. G. (2000).
Sixth-grade students' epistemologies of science: The impact of school science
experiences on epistemological development. Cognition and Instruction,
18, 349–422.
Sodian, B., Zaitchik, D., & Carey, S. (1991). Young children's
differentiation of hypothetical beliefs from evidence. Child Development, 62,
Southerland, S. A., Sinatra, G. M., & Matthews, M. R. (2001). Belief,
knowledge, and science education. Educational Psychology Review, 13,
Stace, W. T. (1945). The problem of unreasoned beliefs. Mind, 54, 27–49.
Stanovich, K. E., & West, R. F. (1997). Reasoning independently of prior
belief and individual differences in actively open-minded thinking.
Journal of Educational Psychology, 89, 342–357.
Toth, E. E., Klahr, D., & Chen, Z. (2000). Bridging research and practice:
A cognitively based classroom intervention for teaching experimentation
skills to elementary school children. Cognition and Instruction, 18,
Wimmer, H., & Perner, J. (1983). Beliefs about beliefs: Representation and
constraining function of wrong beliefs in young children's understanding
of deception. Cognition, 13, 103–128.
Appendix
Main Questionnaire
Case 1
George and Michael disagree about which food tastes better. George
says that X (student's favorite food) tastes better than Y (the student
dislikes it). Michael says that Y tastes better than X.
Question 1: Can we determine who is right and who is wrong?
If yes, how?
If no, why not?
Case 2
Peter and Paul disagree about which food tastes better. Peter says that X
tastes better than Y. Paul says that Y tastes better than X (student is
indifferent to both foods).
Question 2: Can we determine who is right and who is wrong?
If yes, how?
If no, why not?
Case 3
Helen and Angela disagree about which food is more nutritious. Helen
says that X (student considers it very nutritious) is more nutritious than Y
(a food the student considers to be less nutritious). Angela says that Y is
more nutritious than X.
Question 3: Can we determine who is right and who is wrong?
If yes, how?
If no, why not?
Case 4
Mary and Silvia disagree about which food is more nutritious. Mary says
that X is more nutritious than Y. Silvia says that Y is more nutritious than
X (student believes that both foods are equally nutritious).
Question 4: Can we determine who is right and who is wrong?
If yes, how?
If no, why not?
Case 5
Lucia and Georgia disagree about which color is more beautiful. Lucia
says that X (student's favorite color) is more beautiful than Y (student does
not like it). Georgia says that Y is more beautiful than X.
Question 5: Can we determine who is right and who is wrong?
If yes, how?
If no, why not?
Case 6
Kostas and Lazaros disagree about which color is more beautiful. Kostas
says that X is more beautiful than Y. Lazaros says that Y is more beautiful
than X (student is indifferent to both colors).
Question 6: Can we determine who is right and who is wrong?
If yes, how?
If no, why not?
Case 7
Marina and Sofia disagree about which color can be seen from a longer
distance. Marina says that X (the student considers it most visible) can be
seen from a longer distance than Y (student considers it less visible from
a distance). Sofia says that Y can be seen from a longer distance than X.
Question 7: Can we determine who is right and who is wrong?
If yes, how?
If no, why not?
Case 8
Paul and Nikolas disagree about which color can be seen from a longer
distance. Paul says that X can be seen from a longer distance than Y.
Nikolas says that Y can be seen from a longer distance than X (student
believes that both colors are equally visible).
Question 8: Can we determine who is right and who is wrong?
If yes, how?
If no, why not?
Received February 3, 2003
Revision received January 29, 2004
Accepted February 10, 2004