
Journal of Experimental Child Psychology 222 (2022) 105466

journal homepage: www.elsevier.com/locate/jecp

Learning science concepts through prompts to consider alternative possible worlds
Angela Nyhout a,⇑, Patricia A. Ganea b
a School of Psychology, University of Kent, Canterbury, Kent CT2 7NP, UK
b Applied Psychology & Human Development, University of Toronto, Toronto, Ontario M5S 1V6, Canada

Article info

Article history:
Received 7 July 2020
Revised 28 April 2022
Available online 7 June 2022

Keywords:
Counterfactual thinking
Scientific reasoning
Causal reasoning
Science learning
Astronomy
Imagination

Abstract

We investigated whether prompting children to think counterfactually when learning a complex science
concept (planetary habitability) would promote their learning and transfer. In Study 1, children
(N = 102 6- and 7-year-olds) were either prompted to think counterfactually about Earth (e.g., whether
it is closer to or farther from the sun) or prompted to think about examples of different planets
(Venus and Neptune) during an illustrated tutorial. A control group did not receive the tutorial.
Children in the counterfactual and examples groups showed better comprehension and transfer of the
concept than those in the control group. Moreover, children who were prompted to think counterfactually
showed some evidence of better transfer to a novel planetary system than those who were prompted to
think about different examples. In Study 2, we investigated the nature of the counterfactual benefit
observed in Study 1. Children (N = 70 6- and 7-year-olds) received a tutorial featuring a novel
(imaginary) planet and were either prompted to think counterfactually about the planet or prompted to
think about examples of additional novel planets. Performance was equivalent across conditions and was
better than performance in the control condition on all measures. The results suggest that prompts to
think about alternative possibilities, both in the form of counterfactuals and in the form of
alternative possible worlds, are a promising pedagogical tool for promoting abstract learning of
complex science concepts.
© 2022 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license
(http://creativecommons.org/licenses/by/4.0/).

⇑ Corresponding author.
E-mail address: a.nyhout@kent.ac.uk (A. Nyhout).

https://doi.org/10.1016/j.jecp.2022.105466
0022-0965/© 2022 The Authors. Published by Elsevier Inc.

Introduction

A reliable indicator of one’s understanding of a science concept is the ability to transfer what one
has learned to exemplars and contexts with low surface similarity—a task known to be challenging for
children (Bransford, Brown, & Cocking, 2000; Brown, Kane, & Long, 1989; Gentner, 1989). A significant
challenge for researchers and educators, therefore, is to identify methods to support children’s ability
to transfer newly acquired concepts to novel contexts. In the current studies, we proposed and tested
one such method, guided counterfactual reasoning, which could be easily integrated into guided learn-
ing contexts, including classrooms and museums.
One common approach to promoting transfer involves presenting learners with multiple exem-
plars of a concept, phenomenon, or solution (Minervino, Olguin, & Trench, 2017). Compared with cases
where the learner has seen only a single example, presenting two or more exemplars allows the lear-
ner to identify abstract features or principles that generalize across contexts. According to structure
mapping theory, learners place exemplars in structural alignment by comparing across them, which
highlights their commonalities and differences (Gentner, 1983; Gentner & Markman, 1997). Research
with both children and adults across several domains of learning provides support for this approach.
For instance, comparison across exemplars facilitates adults’ transfer of problem solutions (Gentner,
Loewenstein, & Thompson, 2003; Gick & Holyoak, 1983; Kurtz & Loewenstein, 2007) as well as chil-
dren’s word learning and category learning (Gentner & Namy, 1999; Namy & Gentner, 2002) and
learning of complex science concepts (Brown & Kane, 1988; Ganea, Ma, & DeLoache, 2011; Strouse
& Ganea, 2021).
However, in certain cases, using different examples might not be optimal or feasible. Imagine that
one wants to teach a learner about migration paths in animals by presenting different examples.
Learning about migration in monarch butterflies and humpback whales, for instance, may present
considerable challenges given the number of variables that distinguish the two species (e.g., size, habi-
tat, diet, mating patterns, lifespan). According to structure mapping theory, in the case of multiple
examples, those that share surface similarities are more easily comparable and as a result are easier
to align on certain abstract dimensions (Gelman, Raman, & Gentner, 2009; Gentner & Gunn, 2001).
For certain concepts, we may also lack similar examples to promote abstract learning and transfer.
Imagine, as in the current study, that one wants to teach a child about planetary habitability. Because
we lack detailed knowledge about planets beyond our solar system, there are limited examples one
can provide of habitable planets (i.e., Earth and potentially Mars). How, then, might one promote chil-
dren’s abstract learning and transfer of science concepts in the absence of many viable examples? In
the current studies, we proposed that guiding children to think counterfactually about a particular
exemplar may promote these abilities.
When reasoning counterfactually, an individual constructs an alternative representation of reality by
considering possible outcomes following the manipulation of a variable. Researchers have pointed to
the overlap between the development of counterfactual reasoning and both causal reasoning
(Buchsbaum, Bridgers, Skolnick Weisberg, & Gopnik, 2012; Gopnik et al., 2004; Harris, German, &
Mills, 1996; Weisberg & Gopnik, 2013) and scientific reasoning (Rafetseder & Perner, 2014;
Wenzlhuemer, 2009), and some have argued that the imagined manipulations carried out during
counterfactual reasoning are akin to actual manipulations of variables carried out during scientific
experimentation (Gopnik, 2009; Nyhout & Ganea, 2021; Pearl, 2000; Walker & Gopnik, 2013). The
objective of scientific experimentation is, of course, to yield generalizable knowledge about the world
by intervening on a variable to uncover its effects. The imagined interventions carried out during coun-
terfactual thinking may similarly promote generalization by allowing the reasoner to compare two
models of the world: one of the world as it is and one of the world as it could be given a specific change.
These two models may be thought of as a control condition and a treatment condition, respectively. As
with experiments conducted in the real world, these imagined experiments may lead to generalizable
knowledge about the role of causal variables that goes beyond the specific example (or sample)
examined.


Imagined interventions by way of counterfactuals may be particularly useful when learning about
concepts where it is difficult to conduct real-life experiments or even make firsthand observations.
Despite this link, no previous work, to our knowledge, has looked at the relationship between coun-
terfactual reasoning and children’s generalization of scientific concepts. Crucially, counterfactuals may
be used to glean new knowledge and understanding in the absence of additional examples or data.
One can take what one already knows (e.g., about Earth) and imagine changes that yield alternate
models or representations of the same example from which one can draw conclusions.
An alternative proposal for a potential benefit of counterfactuals on learning and transfer is more
general. Specifically, researchers have proposed that thinking counterfactually induces a counterfac-
tual mindset that is more open to alternative possibilities (Galinsky & Moskowitz, 2000; Hirt &
Markman, 1995). Thinking about alternative possibilities by way of counterfactuals and multiple
explanations has been found to lead to a broad range of benefits, including debiasing predictions
and hypothesis testing (Galinsky & Moskowitz, 2000; Hirt & Markman, 1995; Nyhout, Iannuzziello,
Walker, & Ganea, 2019), increasing attention to anomalous evidence (Engle & Walker, 2021), and
decreasing functional fixedness (Galinsky & Moskowitz, 2000). Together, these findings suggest that
counterfactuals may lead learners to approach new tasks flexibly, shedding common biases and short-
comings, by encouraging the consideration of alternative possibilities.
In the current studies, we investigated the impact of guided counterfactual reasoning on children’s
learning in the domain of planetary habitability, a concept that involves a complex causal chain (Fig. 1)
and one that children can neither readily observe nor intervene in. We measured both children’s com-
prehension (i.e., their ability to explain the concept in relation to the content already provided) and
children’s transfer (i.e., their ability to extend the concept to a novel planetary system). Extensive prior
work demonstrates that children’s learning is often tied to the specific content that has been trained,
and abstract transfer or generalization can be a challenging task for learners (e.g., Brown et al., 1989;
Gentner, 1989; Marcus, Haden, & Uttal, 2018; Strouse, Nyhout, & Ganea, 2018).
In Study 1, children were given a short illustrated tutorial on planetary habitability and were then
guided to think counterfactually about Earth or to think about examples of different planets. A third
group of children were not exposed to the learning content or guided prompts and served as a control
group. Given previous work showing that presenting multiple exemplars fosters learning, we expected
that children who were presented with exemplars of different planets (e.g., Earth, Venus, Neptune)
would show better comprehension and transfer compared with those in the control group. However,
we were primarily interested in testing the efficacy of the novel approach of guided counterfactual
thinking.
We expected that children in both the guided counterfactuals and guided examples conditions
would show better comprehension and transfer than those in the control group, but we proposed that
the mechanisms underlying the proposed benefits of examples and counterfactuals are different. In
the case of multiple examples, the learner aligns models of the exemplars (e.g., Earth, Neptune) and
identifies their commonalities and differences. These models may vary from one another on a number
of dimensions (e.g., distance, size, color). To make use of these examples, the learner must focus on the
variables that are most relevant to the concept at hand and ignore extraneous ones (e.g., color).
In the case of counterfactuals, the learner takes a single model (e.g., Earth) and introduces changes
to it—reasoning through the consequences of these changes. Comparing between the model of reality
and its alternative may allow the learner to draw generalizable conclusions. Suppose, as in the current
study, one wants the learner to grasp the abstract principle that the distance of a planet from its star
matters for habitability. By considering two models of Earth, one in its actual position and one in a
nearer or farther position, the child may reinforce the lesson that “distance matters” by seeing how
the change would render Earth uninhabitable.

Fig. 1. A causal chain of planetary habitability based on the distance of a planet from its star.


We tested 6- and 7-year-olds for two reasons. First, children in this age range in Ontario, Canada
have learned many of the isolated facts underlying this concept (e.g., the sun is a source of heat; living
things need water to survive) but have not yet learned the concept itself (Ontario Ministry of
Education, 2007), and therefore the instructed content was novel but within reach for most children.
Second, children in this age range are capable of engaging in counterfactual reasoning about various
types of content (e.g., Beck & Riggs, 2014; McCormack, Ho, Gribben, O’Connor, & Hoerl, 2018; Nyhout
& Ganea, 2019a, 2019b).

Study 1

Method

In Study 1, we introduced children to a critical variable influencing planetary habitability: distance
of a planet from its star. Children were exposed to the concept in an illustrated tutorial, during which
they were guided either to think counterfactually about Earth or to think factually about examples of
other planets (Venus and Neptune). A control group of children did not receive exposure to the
concept.

Participants
Participants were 101 children aged 6 and 7 years (M = 7.02 years, SD = 0.47, range = 6.03–7.98; 48
girls). Participants were randomly assigned to a counterfactual condition (n = 33), an examples condi-
tion (n = 32), or a control condition (n = 36). All participants were tested between April 2016 and March
2017 in a semi-private area of a science museum (n = 39) or in a university laboratory (n = 62). For
inclusion in the study, children needed to be exposed to English at least 50% of the time, as assessed
by parental report. An additional 5 children were tested, but their data were excluded due to failing
pretest (n = 2), parental interference (n = 1), insufficient exposure to English (n = 1), or loss of video
(n = 1).
Parents reported the participating children’s ethnicity as follows: White (38%), mixed ethnicity
(10%), Chinese (8%), South Asian (5%), Southeast Asian (3%), Latin American (2%), West Asian (2%),
Japanese (2%), and Jewish (1%). Parents reported their education level as undergraduate degree
(27%), graduate or professional degree (25%), community college diploma (10%), high school (6%), or
some high school (5%). About a third of parents did not provide demographic information.

Design
Stimuli and test questions are presented in full in Appendixes A and B, respectively. We created
two versions of the tutorial to teach children the concept of planetary habitability. An illustrator cre-
ated visuals for the purpose of the study. Each picture was displayed on a PowerPoint slide with a
black background and accompanying text below the picture.
The first 7 slides of the tutorial were identical. On the 8th through 15th slides of the counterfactual
version, children were prompted to imagine that the Earth was closer to and farther from the sun. On
the corresponding slides of the examples version, children were prompted to imagine life on Venus and
Neptune. The wording and images were matched as closely as possible across the two versions of the
tutorial.

Procedure
The experimenter first asked children 4 pretest questions to ensure that they had prerequisite
knowledge important to understanding the tutorial. The questions, listed in Appendix B, were focused
on children’s understanding of what happens to water when cooled or heated and what happens to
plants and animals if they do not have water. When asked what happens to water when it is heated,
several children responded that it boils. In these cases, the experimenter asked children, ‘‘But do you
know what it turns into?” and the following answers were accepted: steam, smoke, (water) vapor, a
gas, and evaporates. The concepts covered in the pretest questions had been introduced in children’s
classrooms based on our review of curriculum documents for the tested age range. Children needed to
answer 3 of 4 questions correctly for inclusion in the study. Only 2 children did not meet this thresh-
old, and therefore their data were excluded from the study. In cases where children did not answer
correctly, the experimenter provided the correct answer.
Next, the experimenter opened the laptop and introduced the tutorial for children in the counter-
factual and examples conditions. The experimenter read the text aloud to children on each page before
advancing to the next page. Children in the control condition proceeded directly from answering the
pretest questions to answering comprehension questions.
The posttest phase included both comprehension questions about Earth’s solar system to test chil-
dren’s learning from the tutorial and transfer questions about a novel (pretend) planetary system.
All questions are listed in Appendix B, and sample questions and responses are presented in Table 1.
The experimenter first asked children 3 comprehension questions while an image of Earth’s solar sys-
tem was displayed on the screen to examine their learning of the concept of planetary habitability as it
relates to Earth and our solar system (e.g., ‘‘Why is Earth a planet that plants and animals can live
on?”). She then asked children 7 transfer questions while an image of the novel planetary system
was displayed on the screen (e.g., ‘‘Which planet might have plants and animals living on it? Why?”).
The novel planetary system was displayed right to left, with the star on the right to try to minimize
any likelihood that children would think this was a depiction of Earth’s solar system given that
depictions of our own solar system are usually left to right. The experimenter first asked 4 closed-ended
questions, then said, “I’m going to tell you the names of some of these planets,” named three of the
planets, and asked participants why each of the planets was or was not habitable.
For all posttest questions, each response was scored for the mention of the following four units that
correspond to the causal chain outlined in Fig. 1: distance of planet from sun/star, temperature differ-
ences, potential for existence of liquid water, and opportunity for life. We did not expect children to
mention each unit in response to every question because doing so may seem repetitive, and therefore
we did not expect children to perform at ceiling. Children received a maximum score of 4 on open-
ended questions and a score up to 5 on closed-ended questions. Closed-ended questions included
an extra point for children’s selection of an appropriate planet (e.g., selecting the planet closest to
the star as the hottest planet) in addition to the explanation they offered.
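
The scoring rule described above can be sketched in a few lines of Python. This is only an illustration: the unit labels, the function name, and the assumption that each mentioned unit earns exactly one point are ours, inferred from the stated maximum scores rather than taken from the coding manual.

```python
# Illustrative sketch only; names are hypothetical, not from the paper.
# Assumption: each of the four causal-chain units is worth one point.
CAUSAL_UNITS = ("distance", "temperature", "liquid_water", "life")


def score_response(units_mentioned, closed_ended=False, correct_selection=False):
    """Score one response.

    One point per distinct causal-chain unit mentioned (maximum 4), plus one
    point for selecting an appropriate planet on closed-ended questions
    (maximum 5). Mentions that are not causal-chain units are ignored.
    """
    score = len(set(units_mentioned) & set(CAUSAL_UNITS))
    if closed_ended and correct_selection:
        score += 1
    return score
```

Under these assumptions, a closed-ended response that selects an appropriate planet and mentions distance and temperature would receive 3 points.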
All responses were recorded live to the best of the experimenter’s ability and were later transcribed
fully by the experimenter from a video of each session and then coded. A second individual watched
all videos to check for accuracy of transcription. A third individual coded a third of the transcripts.
Coding agreement was 97%. All disagreements were resolved through discussion between the two
coders.

Results and discussion

Children’s pretest performance did not differ across the three conditions (p > .957). To analyze
children’s performance, we conducted three separate ordinal logistic regression analyses, one for each
dependent measure (comprehension, closed-ended transfer, and open-ended transfer), because the coding
scheme and range of scores were slightly different across measures. Each model included condition as a
predictor, exact age and pretest score as covariates, and score as the outcome variable.
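
For readers less familiar with ordinal logistic regression, the sketch below shows how such a model turns a linear predictor into probabilities for each score category. The text does not state the software or parameterization used, so the proportional-odds form assumed here (logit P(Y <= k) = threshold_k - eta), the function names, and the example thresholds are all illustrative assumptions.

```python
import math


def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def category_probabilities(thresholds, eta):
    """Return P(Y = k) for k = 0..K under an assumed proportional-odds model.

    thresholds: increasing cut points, one per boundary between categories.
    eta: linear predictor (e.g., condition effect plus covariate effects).
    Assumed form: logit P(Y <= k) = thresholds[k] - eta.
    """
    cumulative = [logistic(t - eta) for t in thresholds] + [1.0]
    probabilities = []
    previous = 0.0
    for c in cumulative:
        probabilities.append(c - previous)
        previous = c
    return probabilities


def odds_above(probabilities, k):
    """Odds of scoring above category k."""
    above = sum(probabilities[k + 1:])
    return above / (1.0 - above)
```

In this parameterization, raising eta by b multiplies the odds of scoring above any given category by exp(b), which is why the condition effects in such a model can be summarized as a single odds ratio per comparison.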
The model including comprehension score as the outcome variable was significant, χ²(4) = 25.86,
p < .001. Both condition, χ²(2) = 16.47, p < .001, and pretest score, χ²(1) = 10.46, p = .001, were
significant predictors, but age was not a significant predictor (p = .462). Both versions of the tutorial
conferred a benefit for children’s comprehension of the core concept relative to the control condition
(see Table 1 for descriptive statistics and Table 2 for inferential statistics). Performance did not differ
significantly between the two experimental groups.
The model including closed-ended transfer score as the outcome variable was also significant,
χ²(4) = 20.98, p < .001. Condition was a significant predictor of score, χ²(2) = 17.71, p < .001, but
pretest score, χ²(1) = 3.10, p = .078, and age, χ²(1) = 0.13, p = .722, were not significant predictors.
Both the counterfactual and examples versions of the tutorial conferred a benefit for children’s
closed-ended transfer relative to the control condition. Performance between the two experimental
conditions did not differ significantly.
Table 1
Mean scores (and standard deviations) on different sets of questions in Studies 1 and 2 grouped by condition.

                       Study 1 scores                                             Study 2 scores
Question type          Counterfactual–Earth  Examples–Venus/Neptune  Control      Counterfactual: Novel planets  Examples: Novel planets
Pretest                3.48 (0.83)           3.69 (0.47)             3.64 (0.76)  3.63 (0.49)                    3.51 (0.56)
Comprehension          5.12 (1.80)           5.00 (1.69)             3.58 (1.80)  5.34 (1.66)                    5.71 (1.74)
Closed-ended transfer  6.45 (1.80)           6.38 (1.98)             4.44 (2.75)  6.46 (1.65)                    6.77 (1.50)
Open-ended transfer    4.73 (1.57)           3.72 (1.11)             3.50 (1.84)  4.31 (1.64)                    4.94 (1.43)

Sample questions and responses

Pretest
  Sample question: “What happens to water when it gets really, really cold?”
  Sample response: “It freezes or turns into ice.”
Comprehension
  Sample question: “Why is Earth/Kepler a planet that plants and animals can live on?”
  Sample response: “It is in the middle; it has enough distance from the star and from far away so it
  can stay hot and cold so it has water so nature can live.”
Closed-ended transfer
  Sample question: “Which planet or planets might have liquid water on them? Why?”
  Sample response: “This one [middle] because it’s not too far or too close from the sun. It’s probably
  not too hot or too cold.”
Open-ended transfer
  Sample question: “This is Planet Eris. Animals or plants cannot live there. Why can’t animals or
  plants live there?”
  Sample response: “Because it’s really far away from the star so I think it would be freezing cold
  and all water would freeze. So it would not be able to have life or animals.”

Note. Sample questions and responses are provided for each question type.

Table 2
Study 1 results of ordinal regression analyses with condition and age as predictors.

Comparison                    Parameter estimate  SE    Wald’s χ²  p      Odds ratio  95% Confidence interval
Comprehension questions
Counterfactual vs. control    1.71                0.45  14.27      <.001  5.54        [2.28, 13.46]
Examples vs. control          1.43                0.45  9.97       .002   4.19        [1.72, 10.19]
Counterfactual vs. examples   0.28                0.46  0.38       .541   1.32        [0.54, 3.23]
Closed-ended transfer questions
Counterfactual vs. control    1.77                0.46  14.55      <.001  5.85        [2.36, 14.52]
Examples vs. control          1.62                0.46  12.42      <.001  5.08        [2.06, 12.52]
Counterfactual vs. examples   0.14                0.45  0.10       .750   1.15        [0.48, 2.78]
Open-ended transfer questions
Counterfactual vs. control    1.38                0.46  8.90       .003   3.95        [1.60, 9.76]
Examples vs. control          0.19                0.43  0.20       .656   1.21        [0.52, 2.83]
Counterfactual vs. examples   1.18                0.45  6.80       .009   3.26        [1.34, 7.93]
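
The columns of Table 2 are linked by standard formulas, which the following sketch makes explicit. It assumes that the Wald statistic is the squared ratio of the estimate to its standard error (1 degree of freedom), that the odds ratio is the exponentiated estimate, and that the interval is a normal-approximation 95% interval on the log-odds scale. The function name is hypothetical, and because the tabled inputs are rounded to two decimals, recomputed values will differ slightly from the published ones.

```python
import math


def wald_summary(estimate, se, z=1.96):
    """Recompute Wald chi-square, p value, odds ratio, and CI from b and SE.

    Assumptions (not stated in the paper): Wald = (b / SE)^2 with 1 df,
    OR = exp(b), and CI = exp(b +/- z * SE).
    """
    wald = (estimate / se) ** 2
    # Upper-tail probability of a chi-square variable with 1 df:
    # P(X > w) = P(|Z| > sqrt(w)) = erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(wald / 2.0))
    odds_ratio = math.exp(estimate)
    ci = (math.exp(estimate - z * se), math.exp(estimate + z * se))
    return wald, p_value, odds_ratio, ci


# First comprehension row: counterfactual vs. control (b = 1.71, SE = 0.45).
wald, p_value, odds_ratio, ci = wald_summary(1.71, 0.45)
print(round(wald, 2), p_value < 0.001, round(odds_ratio, 2))
```

A check of this kind is also a quick way to spot transcription slips in a results table, since an estimate, its standard error, and the reported odds ratio must agree with one another up to rounding.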

A different pattern of results emerged when looking at performance on open-ended transfer questions.
The overall model was significant, χ²(4) = 19.45, p = .001. Both condition, χ²(2) = 10.26, p = .006,
and pretest score, χ²(1) = 4.92, p = .027, were significant predictors, whereas age was marginally
significant, χ²(1) = 3.72, p = .054. Training with prompts to think counterfactually about Earth was
predictive of significantly higher open-ended transfer scores than the examples and control conditions.
Surprisingly, performance did not differ significantly between the examples and control conditions
on open-ended transfer.
Children in both experimental conditions showed better comprehension of the concept than chil-
dren in the control condition (who did not receive the instructional content), and the two experimen-
tal groups did not differ significantly from one another in their comprehension scores. These results
suggest that the tutorial, regardless of the prompts given, was successful in teaching children the tar-
get content. Those who did not receive instruction were not able to arrive at the concept on their own
by piecing together preexisting underlying knowledge they had of aspects of the causal chain (e.g., dis-
tance from a heat source affects temperature, heat causes water to evaporate).
The pattern of performance on transfer questions was more nuanced. Children in the
counterfactual condition outperformed those in the control condition on all measures. Children in
the counterfactual condition also outperformed those in the examples condition on open-ended trans-
fer questions but not on closed-ended questions. Those in the examples condition outperformed those
in the control condition on closed-ended transfer score but (surprisingly) not on open-ended transfer
score. Why might this be the case? The open-ended questions were more challenging and required
children to invoke the entire causal chain (Fig. 1) to explain a novel planet’s habitability, whereas
the closed-ended questions required children to understand only a segment of the causal chain
(e.g., ‘‘Which planet is the hottest?”). This finding also underscores the difficulty with far transfer;
in some cases, children who had received the instruction in the examples condition struggled to trans-
fer the concept at a level that was significantly better than those who had received no instruction.
Children’s performance in the counterfactual condition, however, suggests that counterfactual
prompts may be a helpful tool for meeting the challenges of far transfer. Thus, children in the coun-
terfactual condition appeared to achieve a more complete understanding of the causal chain than
those in the other conditions even though the manipulation between conditions was quite subtle.
Children in both the examples and counterfactual conditions were provided with the same explana-
tions for the phenomena in the tutorial, and the only difference was the prompts at the end.
There are a few possible explanations for children’s better performance in the counterfactual con-
dition relative to the examples condition. First, the counterfactuals may have conferred a benefit
because they provided an imagined intervention. Children who were prompted to think
counterfactually had the opportunity to mentally intervene on the causal structure in question and
therefore may have gained a more generalizable understanding of said structure. This explanation
proposes a specific benefit of counterfactuals for reasoning about a particular causal structure. A sec-
ond explanation proposes a more general benefit of counterfactuals. The counterfactual prompts
invited children to think about an imaginary or possible—but non-actual—world. Reasoning about
possible worlds in the form of counterfactuals may have conferred a particular benefit on transfer
questions because transfer questions also required children to reason about non-actual worlds,
although the planets in the transfer phase were presented to children as though they were real planets
far away. Consistent with arguments that thinking counterfactually leads to a more open-minded or
flexible mindset (Galinsky & Moskowitz, 2000), counterfactual prompts may encourage learners to
flexibly apply their understanding to new contexts by encouraging a focus on more abstract features
of the learning content. Finally, the counterfactual prompts may have conferred a benefit simply
because they invited children to think about Earth. Reasoning about non-actual premises involving
a highly familiar exemplar may have allowed children to draw more accurate inferences compared
with reasoning with less familiar exemplars, namely Venus and Neptune.
To investigate these explanations, we conducted a second study in which we introduced children to
a novel planetary system during instruction and prompted them either to think counterfactually about
a novel habitable planet, Kepler, or to think about examples of three novel planets: Kepler, Gliese, and
Moa. In addition to comparing performance across these two new conditions in Study 2, we compared
children’s performance in Study 2 with performance across the three conditions in Study 1 to gain a
better understanding of the relative benefits of the different types of prompts. Specifically, if counter-
factuals are beneficial to transfer because of specific effects involving imagined interventions on a cau-
sal structure, then we should expect better performance in the counterfactual condition involving
novel planets compared with the examples condition involving novel planets. If counterfactuals pro-
vide a more general benefit by inviting consideration of alternative possible worlds, then we should
expect performance to be relatively equal across both new conditions. Finally, if performance in Study
1 was facilitated specifically by consideration of Earth, then we should expect children’s performance
across both conditions in Study 2 to be poorer than performance in the Study 1 counterfactual
condition.

Study 2

Method

Participants
Participants were 70 children aged 6 and 7 years (M = 6.59 years, SD = 0.50; 32 girls). Participants
were randomly assigned to the counterfactual condition (n = 35) or the examples condition (n = 35).
Participants were tested between January and September 2021 online via Zoom due to the COVID-19
pandemic and were recruited through an existing participant database or through social media.
Children needed to be exposed to English at least 50% of the time for inclusion in the study. An
additional 4 children were tested, but their data were excluded due to failing the pretest questions.
Optional demographic information was provided by 51% of participating families. Parents reported
participating children’s ethnicity as White (23%), mixed ethnicity (20%), South Asian (4.3%), Chinese
(1.4%), Jewish (1.4%), or West Asian (1.4%). Information on parental education level was provided by
50% of families. Parents reported their education level as graduate or professional degree (26%), under-
graduate degree (20%), or high school (2.3%).

Design and procedure


Stimuli text and test questions are presented in Appendixes A and B, respectively. Visual stimuli
were adapted from and similar to the stimuli used in Study 1, but they also included some copyrighted
images and therefore are not reproduced here. Children were given the same 4 pretest questions as in
Study 1. The experimenter shared her screen and read the instructional content to children while dis-
playing the accompanying images. The wording of the tutorial was the same as in Study 1 but referred
to a novel habitable planet called Kepler rather than Earth. Children in the counterfactual condition
were then prompted to think counterfactually about Kepler being closer to or farther from its star.
Children in the examples condition were guided to think about examples of two other novel planets,
Gliese and Moa, that were too close to or too far from the star to be habitable.
Children were then asked 3 comprehension questions, 4 closed-ended transfer questions, and 3
open-ended transfer questions that mirrored those asked in Study 1. The only difference from Study 1 was
that the first comprehension question referred to Kepler instead of Earth. As in Study 1, we did not
expect children to achieve the maximum scores. Study sessions were recorded over Zoom and were
later transcribed and coded. Coding took place as in Study 1. A second coder, who was blind to con-
dition, coded one third of participants’ responses. Coding agreement was 98%. Any disagreements
were resolved through discussion between the two coders.

Results and discussion

Children’s pretest performance did not differ by condition (p = .419). We again conducted three
ordinal logistic regression analyses, one for each of the dependent measures (comprehension,
open-ended transfer, and closed-ended transfer questions), with condition as a predictor, exact age and
pretest score as covariates, and score as the outcome variable. Table 1 presents descriptive statistics, and
Table 3 presents inferential statistics for condition comparisons in Study 2. Odds ratios provide an
index of effect size for each comparison across conditions.

Table 3
Comparison across Studies 1 and 2 of the ordinal regression analyses with condition, age, and pretest scores as predictors.

| Comparison | Parameter estimate | SE | Wald’s χ² | p | Odds ratio | 95% confidence interval |
|---|---|---|---|---|---|---|
| Comprehension questions | | | | | | |
| CF (S2) vs. Examples (S2) | 0.59 | 0.43 | 1.85 | .174 | 1.80 | [0.75, 2.34] |
| CF (S2) vs. Control (S1) | 1.90 | 0.45 | 18.02 | <.001 | 6.68 | [2.78, 16.06] |
| Examples (S2) vs. Control (S1) | 2.49 | 0.47 | 28.20 | <.001 | 12.01 | [4.80, 30.06] |
| CF (S2) vs. CF–Earth (S1) | 0.34 | 0.46 | 0.55 | .459 | 1.40 | [0.57, 3.42] |
| CF (S2) vs. Examples–Venus/Neptune (S1) | 0.49 | 0.45 | 1.18 | .277 | 1.64 | [0.67, 3.97] |
| Examples (S2) vs. CF–Earth (S1) | 0.92 | 0.47 | 3.92 | .048 | 2.52 | [1.01, 6.28] |
| Examples (S2) vs. Examples–Venus/Neptune (S1) | 1.08 | 0.46 | 5.41 | .020 | 2.94 | [1.18, 7.29] |
| Closed-ended transfer questions | | | | | | |
| CF (S2) vs. Examples (S2) | 0.42 | 0.43 | 0.96 | .328 | 1.52 | [0.66, 3.53] |
| CF (S2) vs. Control (S1) | 1.86 | 0.46 | 16.20 | <.001 | 6.44 | [2.60, 15.96] |
| Examples (S2) vs. Control (S1) | 2.28 | 0.48 | 23.10 | <.001 | 9.81 | [3.87, 22.88] |
| CF (S2) vs. CF–Earth (S1) | 0.02 | 0.47 | 0.002 | .966 | 1.02 | [0.41, 2.56] |
| CF (S2) vs. Examples–Venus/Neptune (S1) | 0.16 | 0.46 | 0.12 | .732 | 1.17 | [0.48, 2.87] |
| Examples (S2) vs. CF–Earth (S1) | 0.44 | 0.47 | 0.86 | .354 | 1.55 | [0.61, 3.93] |
| Examples (S2) vs. Examples–Venus/Neptune (S1) | 0.58 | 0.47 | 1.53 | .216 | 1.78 | [0.71, 4.45] |
| Open-ended transfer questions | | | | | | |
| CF (S2) vs. Examples (S2) | 0.87 | 0.43 | 4.25 | .039 | 2.43 | [1.04, 5.63] |
| CF (S2) vs. Control (S1) | 1.26 | 0.46 | 7.46 | .006 | 3.54 | [1.43, 8.77] |
| Examples (S2) vs. Control (S1) | 2.15 | 0.46 | 21.62 | <.001 | 8.59 | [3.47, 21.25] |
| CF (S2) vs. CF–Earth (S1) | 0.08 | 0.47 | 0.03 | .867 | 1.08 | [0.43, 2.73] |
| CF (S2) vs. Examples–Venus/Neptune (S1) | 1.05 | 0.45 | 5.36 | .021 | 2.86 | [1.16, 6.96] |
| Examples (S2) vs. CF–Earth (S1) | 0.81 | 0.46 | 3.13 | .077 | 2.24 | [0.92, 5.48] |
| Examples (S2) vs. Examples–Venus/Neptune (S1) | 1.94 | 0.45 | 18.22 | <.001 | 6.93 | [2.85, 16.87] |

Note. CF, counterfactual. Conditions from Study 1 (S1) are marked in italic typeface, and conditions from Study 2 (S2) are marked
in roman typeface.
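
The columns of Table 3 are linked by standard Wald formulas, which can help when reading the table. The following is a minimal sketch, not the authors' analysis code: it assumes the usual relationships (Wald χ² as the squared ratio of estimate to standard error, odds ratio as the exponentiated estimate, and a 95% interval built from a normal critical value of 1.96). The function name `wald_summary` and its arguments are hypothetical.

```python
import math


def wald_summary(estimate: float, se: float, z: float = 1.96) -> dict:
    """Derive Wald chi-square, p value, odds ratio, and CI from a log-odds estimate.

    Assumes a single-parameter Wald test (1 degree of freedom) and a
    normal-approximation confidence interval on the log-odds scale.
    """
    wald = (estimate / se) ** 2
    # Upper-tail probability of a chi-square variable with 1 df:
    # P(X > w) = erfc(sqrt(w / 2)).
    p = math.erfc(math.sqrt(wald / 2.0))
    return {
        "wald": wald,
        "p": p,
        "odds_ratio": math.exp(estimate),
        "ci": (math.exp(estimate - z * se), math.exp(estimate + z * se)),
    }


# Rounded values from the "CF (S2) vs. Control (S1)" comprehension row.
row = wald_summary(estimate=1.90, se=0.45)
print(row)
```

With the rounded inputs 1.90 and 0.45, the sketch gives an odds ratio of about 6.69 and an interval of roughly [2.77, 16.15], close to the tabled 6.68 and [2.78, 16.06]. Small discrepancies are expected because the tabled estimate and standard error are themselves rounded.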


The model predicting comprehension score was not significant, χ²(3) = 5.50, p = .138. Differences in
scores were not predicted by condition (p = .214), age (p = .052), or pretest score (p = .312).
The model predicting closed-ended transfer score was similarly nonsignificant, χ²(3) = 4.10,
p = .215, with none of the three variables significantly predicting score: condition (p = .283), age
(p = .402), or pretest score (p = .087).
In contrast, the model predicting open-ended transfer score was significant, χ²(3) = 11.00, p = .012.
Condition was a marginally significant predictor, χ²(1) = 3.82, p = .051, such that exposure to the
examples prompts was actually predictive of marginally higher scores than exposure to the
counterfactual prompts. Age was a significant predictor of score, χ²(1) = 6.57, p = .010, whereas pretest score
was not (p = .738).

Comparison of Study 1 and Study 2 conditions


Children’s pretest performance did not differ by condition (p = .552). Using three ordinal logistic
regression analyses, one for each of the dependent measures, we compared children’s performance in
the counterfactual and examples conditions in Study 2 with children’s performance in the control,
counterfactual (henceforth counterfactual–Earth), and examples (henceforth examples–Venus/Neptune)
conditions from Study 1, and thus each analysis included five groups.
The model with comprehension score as the outcome variable was significant, χ²(6) = 38.61,
p < .001. Both condition, χ²(4) = 31.24, p < .001, and pretest score, χ²(1) = 9.56, p = .002, were significant
predictors, whereas age was not a significant predictor (p = .336). Children’s comprehension scores did
not differ significantly between the counterfactual and examples conditions in Study 2. Exposure to
the tutorial in both conditions in Study 2 was predictive of significantly higher comprehension scores
than in the Study 1 control condition. Moreover, exposure to the tutorial with examples prompts in
Study 2 was predictive of significantly better comprehension scores than instruction in Study 1 with
either examples–Venus/Neptune or counterfactual–Earth prompts.
The model including closed-ended transfer score as the outcome variable was significant,
χ²(6) = 32.99, p < .001. Both condition, χ²(4) = 28.49, p < .001, and pretest score, χ²(1) = 5.77,
p = .016, were significant predictors of closed-ended transfer score, whereas age was not a significant
predictor (p = .426). Performance between the counterfactual and examples conditions in Study 2 did
not differ significantly. Exposure to the Study 2 tutorial with counterfactual or examples prompts was
predictive of significantly higher closed-ended transfer scores than in the Study 1 control condition.
Performance in the two conditions in Study 2 did not differ significantly from performance in the
two experimental conditions in Study 1.
Finally, a model including open-ended transfer score was also significant, χ²(6) = 35.71, p < .001.
Condition, χ²(4) = 28.10, p < .001, age, χ²(1) = 11.04, p < .001, and pretest score, χ²(1) = 4.03, p = .045,
all were significant predictors. Exposure to the Study 2 tutorial with counterfactual or examples prompts
was predictive of a higher open-ended transfer score than in the Study 1 control condition. In contrast
to the results of Study 1, instruction with examples prompts in Study 2 was predictive of marginally
higher scores than instruction with counterfactual prompts. Instruction in the two conditions in Study
2 was predictive of higher open-ended transfer scores than exposure to the examples–Venus/Neptune
instruction in Study 1 but not to the counterfactual–Earth instruction in Study 1.
The transfer results differed from those in Study 1. Unlike in Study 1, children in the counterfactual
condition did not outperform those in the examples condition. In fact, children in the examples
condition performed marginally better than children in the counterfactual condition in the current study.
Children in both instruction conditions outperformed those in the control condition on transfer
questions, indicating that both types of instruction with prompts facilitated transfer.

General discussion

Across two studies, we investigated the effects of different types of learning prompts on children’s
learning and transfer of the concept of planetary habitability. Children were prompted to think coun-
terfactually about the distance of a habitable planet from its star and to consider the effects of this
change on its habitability or to consider examples of different planets that varied in their distance
from their star. In Study 1, children who were prompted to think counterfactually about Earth’s dis-
tance from the sun showed better transfer of the target concept than those in the control group. We
also found that children in the counterfactual condition outperformed children in the examples con-
dition—who thought about Earth, Venus, and Neptune—on open-ended transfer questions. Given the
difficulties that learners show with far transfer (e.g., Bransford et al., 2000), it is noteworthy that chil-
dren showed success at transferring their knowledge to a novel planetary system compared with
those in the control condition. Both the counterfactual and examples groups outperformed children
in the control group on the comprehension measure but did not perform significantly differently from
one another.
In Study 2, we introduced children to a novel planetary system similar to our own to further inves-
tigate the benefits of counterfactual and examples prompts. In this case, children who were prompted
to think about counterfactual versions of a novel planet, Kepler, did not perform significantly differ-
ently on comprehension and transfer questions than children who were prompted to think about
three novel planets. To better understand the effects of different types of learning prompts, we
compared performance across the five conditions in the two studies. Children in both Study 2 conditions
outperformed those in the Study 1 control condition by a larger margin than did children in the
Study 1 experimental conditions. Relative to control group performance, the instruction with prompts
in Study 2 was associated with odds ratios from 3.54 to 12.01 compared with odds ratios from 3.26 to
5.85 for instruction with prompts in Study 1. On some measures, children in Study 2 outperformed
those in the Study 1 experimental conditions. Specifically, children in the Study 2 examples condition
outperformed those in the Study 1 examples–Venus/Neptune and counterfactual–Earth conditions
on comprehension scores, and children in both conditions in Study 2 outperformed those in the
examples–Venus/Neptune condition in Study 1 on open-ended transfer.
These findings support the assertion that counterfactuals—and more broadly prompts to consider
alternative possibilities—benefit learning and transfer when children encounter complex causal phe-
nomena. The pattern of performance across the two studies also sheds light on the nature of this ben-
efit. Encouraging children to consider alternative possible worlds—whether in the form of
counterfactuals or in the form of novel (imaginary) exemplars—appears to facilitate learning and
transfer. It is notable that the weakest performance across the four experimental conditions came
when children were prompted to think of realistic exemplars (Earth, Venus, and Neptune) with no
deviation from reality. This may be because the exemplars they thought about (e.g., Venus, Neptune)
lacked surface similarity and varied along several dimensions (e.g., size, color, composition), which
could have made it more difficult to compare them and extract the relevant causal structure
(Gentner, 1983; Gentner & Markman, 1997). Having children consider familiar examples therefore
may present certain barriers for learning and transfer of a concept like planetary habitability
compared with having them consider counterfactuals or novel/imaginary exemplars. Alternatively, rather
than familiar exemplars attenuating learning and transfer, it may instead be the case that the
process of considering different non-actual possibilities leads to benefits. Our findings suggest that
by directing attention away from specific exemplars to counterfactual and alternative possible worlds,
children gained a more abstract understanding of the causal principles we aimed to teach.
Following Study 1, we considered two main proposals for the benefit of counterfactuals observed
on transfer questions. In Study 2, benefits of considering alternative possibilities were observed not
just for transfer but also for comprehension. The first proposal was that counterfactuals function as
an imagined intervention, allowing learners to mentally intervene on a causal structure to yield gen-
eralizable insights. The second proposal was that counterfactuals, as a form of reasoning about alter-
native possibilities, have more general benefits by promoting a mindset that is open to alternatives.
The results of the current studies are more in line with the second proposal and suggest that it was
not counterfactuals per se that benefited performance in Study 1. Hirt and Markman (1995) argued
that prompting individuals to consider alternatives leads them to adopt a mindset that is generally
open to alternative hypotheses. This proposal connects to a wide body of research with both adults
and children suggesting that prompts to think of alternatives facilitate reasoning on an array of tasks
(Chakravarty, Srivastava, & Patil, 2020; Galinsky & Moskowitz, 2000; Hirt & Markman, 1995; Nyhout &
Ganea, 2019a, 2019b). For instance, adults who have been primed to think counterfactually show
more divergent thinking and better performance on a hypothesis-testing task (Galinsky &
Moskowitz, 2000). Other work has found that prompting 7- to 10-year-olds to think counterfactually
leads to better performance on a control-of-variables experimental design task compared with chil-
dren given control prompts (Nyhout & Ganea, 2019a). Another study with preschoolers found that
counterfactual prompts scaffolded their ability to detect anomalies to an existing hypothesis in a cau-
sal learning task (Engle & Walker, 2021). Research with middle schoolers found that children who
were asked to think counterfactually subsequently generated more insightful questions
(Chakravarty et al., 2020).
The precise mechanisms underlying these effects may be as diverse as the tasks on which they are
put to use. For instance, whereas prompts to consider alternatives may debias reasoning on some tasks
by encouraging individuals to consider lower-likelihood hypotheses (Engle & Walker, 2021; Walker &
Nyhout, 2020), the facilitating effect in the current studies may be better explained by encouraging a
focus on abstract features of the problems by drawing attention to alternative possibilities. Future
work may investigate the effects of different types of prompts to consider alternatives on different
types of reasoning tasks.
Future work should identify the scope of the facilitating effects of prompts to consider alternatives
across a variety of scientific domains. On which types of tasks do they facilitate learning and reason-
ing? How long-lasting are any effects? The manipulation in the current studies was quite subtle,
which has also been the case in previous studies investigating the effects of counterfactuals
(Galinsky & Moskowitz, 2000; Nyhout & Ganea, 2019a, 2019b). We may expect effect sizes to mirror
the extent of the intervention. With more intense, frequent, or prolonged interventions, is there a sim-
ilar increase in learning? In which cases may prompts to consider alternatives hinder learning?
Research with adults indicates that counterfactual alternatives may bias performance on reasoning
tasks when consideration of alternative possibilities detracts focus from the task at hand (e.g., affirm-
ing the consequent on the Wason card selection task; Galinsky & Moskowitz, 2000). Performance may
be similarly limited by the nature of the alternative that individuals generate spontaneously. Out of
the infinite alternative possibilities that individuals could consider, the mind tends to focus on a select
few (Byrne, 2005), and this rather narrow focus may limit the scope of any effects of considering alter-
natives on learning and reasoning.
As a final note, the results of the current studies could contribute to allaying concerns that instruc-
tion delivered online is somehow inferior to in-person instruction. Children in Study 2, who learned
the content online, performed just as well as, and in some cases better than, children in Study 1,
who learned in person.
Taken together, the results of the current studies and previous studies suggest that counterfactual
prompts and prompts to consider alternative possibilities are a promising pedagogical tool that may
help to address the challenges of transfer (Bransford et al., 2000). In particular, the current studies
introduce novel pedagogical prompts as a tool for the abstract learning of science concepts that
may be as useful as (and in some cases more useful than) prompts to consider multiple existing exem-
plars. They are low cost and can be easily implemented across a range of formal and informal learning
settings. Whether used during book reading with a parent at home or in a classroom when learning a
new science concept, prompts to consider alternative possibilities may encourage learners to engage
with material in novel ways.

Acknowledgments

This work was supported by a Postdoctoral Fellowship Award from the Social Science and
Humanities Research Council of Canada (SSHRC) to A. Nyhout, a SSHRC Insight Development Grant
to A. Nyhout and P. A. Ganea, and a Natural Sciences and Engineering Research Council of Canada
(NSERC) Discovery Grant to P. A. Ganea. We are grateful to our dedicated team of research assistants
for their work on this project, including Cailie Gordon who created the illustrations and assisted with
testing, Lynn Nguyen and Mila Milicevic who assisted with testing, and Thahmina Rahman and Lydia
Jia who conducted coding. We thank the children and parents in the greater Toronto area who gave
their time to participate in these studies. We also thank Deena Weisberg and an anonymous reviewer
for their helpful comments and suggestions for Study 2.

Appendix A

Study 1 stimuli


Appendix B

Test questions from Studies 1 and 2

Pretest questions

1. What happens to water when it gets really, really hot?
2. What happens to water when it gets really, really cold?
3. What happens to animals if they don’t have any water for a long time?
4. What happens to plants if they don’t have any water for a long time?

Comprehension questions

1. Why is Earth/Kepler a planet that plants and animals can live on?
2. Can plants and animals live on a planet that is very, very far away from the sun/star? Why/why
not?
3. Can plants and animals live on a planet that is very, very close to the sun/star? Why/why not?

Transfer questions

Closed-ended questions

Experimenter says, “Now I’m going to show you some new planets that we haven’t seen before.”

1. Which planet is the hottest/coldest? Why?
2. Which planet is the hottest/coldest? Why?
3. Which planet or planets might have liquid water on them? Why?
4. Which planet might have plants and animals living on it? Why?

Open-ended questions

Experimenter says, “I’m going to tell you the names of some of these planets.”

1. This is Planet Moa/Ceres. Animals or plants cannot live there. Why can’t animals or plants live
there?
2. This is Planet Gliese/Eres. Animals or plants cannot live there. Why can’t animals or plants live
there?
3. This is Planet Kepler/Varuna. Animals and plants can live there. Why can animals and plants live
there?

References

Beck, S. R., & Riggs, K. J. (2014). Developing thoughts about what might have been. Child Development Perspectives, 8, 175–179.
Bransford, J. D., Brown, A. L., & Cocking, R. R. (2000). How people learn: Brain, mind, experience, and school. Washington, DC:
National Academies Press.
Brown, A. L., & Kane, M. J. (1988). Preschool children can learn to transfer: Learning to learn and learning from example.
Cognitive Psychology, 20, 493–523.
Brown, A. L., Kane, M. J., & Long, C. (1989). Analogical transfer in young children: Analogies as tools for communication and
exposition. Applied Cognitive Psychology, 3, 275–293.
Byrne, R. M. (2005). The rational imagination: How people create alternatives to reality. Cambridge: MIT Press.
Buchsbaum, D., Bridgers, S., Skolnick Weisberg, D., & Gopnik, A. (2012). The power of possibility: Causal learning, counterfactual
reasoning, and pretend play. Philosophical Transactions of the Royal Society B: Biological Sciences, 367, 2202–2212.
Chakravarty, S., Srivastava, A., & Patil, K. (2020). Middle-schoolers primed to reason counterfactually ask more interesting
questions. In epiSTEME 8 (Eighth International Conference to Review Research in Science, Technology, and Mathematics
Education, pp. 139–247). Mumbai, India: Homi Bhabha Centre for Science Education, TIFR.
https://episteme8.hbcse.tifr.res.in/proceedings/MIDDLE-SCHOOLERS%20PRIMED%20TO%20REASON%20COUNTERFACTUALLY%20ASK%20MORE%20INTERESTING%20QUESTIONS.pdf.
Engle, J., & Walker, C. M. (2021). Thinking counterfactually supports children’s evidence evaluation in causal learning. Child
Development, 92, 1636–1651.
Galinsky, A. D., & Moskowitz, G. B. (2000). Counterfactuals as behavioral primes: Priming the simulation heuristic and
consideration of alternatives. Journal of Experimental Social Psychology, 36, 384–409.
Ganea, P. A., Ma, L., & DeLoache, J. S. (2011). Young children’s learning and transfer of biological information from picture books
to real animals. Child Development, 82, 1421–1433.
Gelman, S. A., Raman, L., & Gentner, D. (2009). Effects of language and similarity on comparison processing. Language Learning
and Development, 5, 147–171.
Gentner, D. (1983). Structure-mapping: A theoretical framework for analogy. Cognitive Science, 7, 155–170.
Gentner, D. (1989). The mechanisms of analogical learning. In S. Vosniadou & A. Ortony (Eds.), Similarity and analogical reasoning
(pp. 199–241). Cambridge, UK: Cambridge University Press.
Gentner, D., & Gunn, V. (2001). Structural alignment facilitates the noticing of differences. Memory & Cognition, 29, 565–577.
Gentner, D., Loewenstein, J., & Thompson, L. (2003). Learning and transfer: A general role for analogical encoding. Journal of
Educational Psychology, 95, 393–408.
Gentner, D., & Markman, A. B. (1997). Structure mapping in analogy and similarity. American Psychologist, 52, 45–56.
Gentner, D., & Namy, L. L. (1999). Comparison in the development of categories. Cognitive Development, 14, 487–513.
Gick, M. L., & Holyoak, K. J. (1983). Schema induction and analogical transfer. Cognitive Psychology, 15, 1–38.
Gopnik, A. (2009). The philosophical baby: What children’s minds tell us about truth, love, and the meaning of life. New York:
Random House.
Gopnik, A., Glymour, C., Sobel, D. M., Schulz, L. E., Kushnir, T., & Danks, D. (2004). A theory of causal learning in children: Causal
maps and Bayes nets. Psychological Review, 111, 3–32.
Harris, P. L., German, T., & Mills, P. (1996). Children’s use of counterfactual thinking in causal reasoning. Cognition, 61, 233–259.
Hirt, E. R., & Markman, K. D. (1995). Multiple explanation: A consider-an-alternative strategy for debiasing judgments. Journal of
Personality and Social Psychology, 69, 1069–1086.
Kurtz, K. J., & Loewenstein, J. (2007). Converging on a new role for analogy in problem solving and retrieval: When two problems
are better than one. Memory & Cognition, 35, 334–341.
Marcus, M., Haden, C. A., & Uttal, D. H. (2018). Promoting children’s learning and transfer across informal science, technology,
engineering, and mathematics learning experiences. Journal of Experimental Child Psychology, 175, 80–95.
McCormack, T., Ho, M., Gribben, C., O’Connor, E., & Hoerl, C. (2018). The development of counterfactual reasoning about doubly-
determined events. Cognitive Development, 45, 1–9.
Minervino, R. A., Olguín, V., & Trench, M. (2017). Promoting interdomain analogical transfer: When creating a problem helps to
solve a problem. Memory & Cognition, 45, 221–232.
Namy, L. L., & Gentner, D. (2002). Making a silk purse out of two sows’ ears: Young children’s use of comparison in category
learning. Journal of Experimental Psychology: General, 131, 5–15.
Nyhout, A., & Ganea, P. A. (2019a). The development of the counterfactual imagination. Child Development Perspectives, 13,
254–259.
Nyhout, A., & Ganea, P. A. (2019b). Mature counterfactual reasoning in 4- and 5-year-olds. Cognition, 183, 57–66.
Nyhout, A., & Ganea, P. A. (2021). Scientific reasoning and counterfactual reasoning in development. Advances in Child
Development and Behavior, 61, 223–253.
Nyhout, A., Iannuzziello, A., Walker, C. M., & Ganea, P. A. (2019). Thinking counterfactually supports children’s ability to conduct
a controlled test of a hypothesis. In Proceedings of the 41st annual meeting of the Cognitive Science Society (pp. 2488–2494).
Montreal, QC: Cognitive Science Society.
Ontario Ministry of Education. (2007). The Ontario Curriculum Grades 1-8: Science and Technology.
http://www.edu.gov.on.ca/eng/curriculum/elementary/scientec.html.
Pearl, J. (2000). Causal inference without counterfactuals: Comment. Journal of the American Statistical Association, 95, 428–431.
Rafetseder, E., & Perner, J. (2014). Counterfactual reasoning: Sharpening conceptual distinctions in developmental studies. Child
Development Perspectives, 8, 54–58.
Strouse, G. A., & Ganea, P. A. (2021). The effect of object similarity and alignment of examples on children’s learning and transfer
from picture books. Journal of Experimental Child Psychology, 203, 105041.
Strouse, G. A., Nyhout, A., & Ganea, P. A. (2018). The role of book features in young children’s transfer of information from
picture books to real-world contexts. Frontiers in Psychology, 9. https://doi.org/10.3389/fpsyg.2018.00050.
Walker, C. M., & Gopnik, A. (2013). Causality and imagination. In M. Taylor (Ed.), The Oxford handbook of the development of
imagination (pp. 342–358). New York: Oxford University Press.
Walker, C. M., & Nyhout, A. (2020). Asking “why?” and “what if?”. In L. P. Butler, S. Ronfard, & K. H. Corriveau (Eds.), The
questioning child: Insights from psychology and education (pp. 252–280). Cambridge, UK: Cambridge University Press.
Weisberg, D. S., & Gopnik, A. (2013). Pretense, counterfactuals, and Bayesian causal models: Why what is not real really matters.
Cognitive Science, 37, 1368–1381.
Wenzlhuemer, R. (2009). Counterfactual thinking as a scientific method. Historical Social Research/Historische Sozialforschung, 34
(2), 27–54.

Computers and Education Open 4 (2023) 100124

Cognitive and motivational benefits of a theory-based immersive virtual reality design in science learning

Xiaoxia Huang a,c,*, Jeanine Huss a, Leslie North b, Kirsten Williams a, Angelica Boyd-Devine a
a School of Teacher Education, College of Education and Behavioral Sciences, Western Kentucky University, Bowling Green, KY 42101, United States
b Department of Earth, Environmental, and Atmospheric Sciences, Western Kentucky University, Bowling Green, KY 42101, United States
c School of Education, Syracuse University, Syracuse, NY 13244, United States

ARTICLE INFO

Keywords:
Immersive virtual reality
Cognitive theory of multimedia learning
Informal science learning
Cognitive load
Self-efficacy

ABSTRACT

This study investigated the effects of an immersive virtual reality (IVR) nature-trail tour on participants’ science learning, self-efficacy, cognitive load, perceived enjoyment, and perceived usefulness, as compared to actual walking tours. The IVR tour was designed based on the Cognitive Theory of Multimedia Learning. In a between-subjects quasi-experiment, participants learned environmental science topics in one of three types of nature-trail tours, including an IVR tour, a business-as-usual walking tour, and an enhanced walking tour. Results of analyses of covariance indicated that the theory-based IVR design was effective in improving participants’ science learning and their self-efficacy perceptions. At the same time, the IVR tour was found to be as enjoyable as the walking tours and did not pose an unnecessary cognitive load during the learning process. The results have implications for designing IVR environments to (1) enhance cognitive and motivational outcomes in science learning and (2) increase the accessibility of nature-based sites.

1. Introduction technology has improved to incorporate immersive technologies, like a


head-mounted display (HMD), that brings the 3D world to a new level.
From film reels in classrooms to the current use of digital technolo­ Immersive virtual reality (IVR) allows the plane of projection barriers to
gies, educators have consistently turned toward technology to find ways vanish, enabling the user to feel surrounded by a virtual world they can
to enhance learning and motivation. Recent years have seen the explore freely as in the real-world [3]. As such, a well-designed IVR
increased use of virtual reality (VR) in educational settings due to the environment facilitates learning of more complex concepts in natural­
increased accessibility and affordability of VR devices. At its core, VR is istic settings [5,6].
defined as a three-dimensional multimedia environment that is highly According to Psotka [7], IVR is distinguished from all preceding
interactive and involves multisensory experiences, which enables the technology by creating a sense of immediacy and control (through im­
user to become a participant in this virtual digital world as if in the real- mersion) from changing visual perspectives in accordance with the
world [1–3]. user’s head and eye movement, thus creating a perception that the
Ryan [3] addressed the history of VR by taking the simulated worlds virtual world “looks and feels to some degree like the real world” (p.
back to the Renaissance with the use of perspective in paintings. She 406). This level of immersion is achieved through two main character­
stated that although the artists were trying to create a virtual world, the istics of IVR: first, the user is unobtrusively tracked so that their head
medium was a flat surface that “the spectator cannot break through… and body positions are recorded, and consequently, the virtual envi­
and walk into the pictorial space” (p. 112). Many early computer pro­ ronment is updated to reflect the orientation changes; and second,
grams used in education were similar with perspective drawings that sensory information from the real world is minimized as much as
simulated 3D effects but contained a plane that the learner could not possible [8]. The HMD (e.g., Oculus Quest or HTC Vive) is primary to
cross. As Akpan and Shanker [4] illustrated, these graphics are often IVR technology in that it shuts off the physical reality to immerse the
confined to a 2D surface and that “3D visualizations contain real users in the virtual world and provides real-time visual images that
binocular stereographic depth effects” (p. 146). While VR in the class­ change as they move through the virtual world environment [9]. In
room was initially relegated to 3D graphics on a flat computer screen, contrast to IVR, desktop VR is realized through screen-based devices

* Corresponding author.
E-mail address: xhuang91@syr.edu (X. Huang).

https://doi.org/10.1016/j.caeo.2023.100124
Received 20 May 2022; Received in revised form 21 December 2022; Accepted 19 January 2023
Available online 21 January 2023
2666-5573/© 2023 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-
nc-nd/4.0/).
X. Huang et al. Computers and Education Open 4 (2023) 100124

such as a conventional computer monitor that is controlled with traditional computer hardware (e.g., a keyboard and a mouse), which is non-immersive [10].

1.1. IVR in facilitating cognitive and motivational processes in science learning

One prominent area of research on the use of IVR to increase student motivation and learning is S.T.E.M. (Science, Technology, Engineering, and Math) subjects. These subjects have higher dropout rates due to the difficulties many students have in understanding the complex theoretical concepts [6], which could lead to cognitive overload. In addition, students tend to have low self-efficacy and low enjoyment in S.T.E.M. subjects, which prevents them from engaging in scientific inquiry [11–13]. As an innovative technology, IVR provides a highly interactive and immersive environment that can be customized to meet intended learning needs. These technology affordances have great potential to promote cognitive and motivational processes in S.T.E.M. learning, such as learner performance, cognitive load, self-efficacy, and perceived enjoyment.
As a major discipline in S.T.E.M. education, science learning has been researched in immersive learning environments, although with limited empirical work. Most previous studies on IVR in science areas are media comparison studies, i.e., comparing the effect of an IVR environment with a less immersive learning environment, such as desktop VR [10,14–16,17], video instruction [18,19], or PowerPoint slide show [20]. A smaller number of studies involved media and methods experiments. That is, in addition to comparing IVR with another medium, the effect of a specific instructional method within each medium was investigated, such as summarizing strategy [17,20] and pre-training strategy [18]. Researchers have not reached a consensus on the role that immersion plays in IVR learning. For instance, while some studies suggested that participants learned less in IVR as compared to desktop VR [10,14,16], other research showed the benefits of IVR in promoting learning [21], and still other studies indicated there was no significant difference between the two types of media when it came to learning [15,17].
Some studies have focused specifically on how IVR affects learners’ cognitive load, self-efficacy, or perceived enjoyment in addition to science learning. For example, Pande et al. [19] showed that their IVR environment involving environmental biology increased undergraduates’ performance and perceived enjoyment as compared to the video instruction, but there was no significant difference in self-efficacy between the two conditions. Both Meyer et al. [18] and Jian Zhao et al. [17] investigated the effects of IVR instruction involving a science lesson about blood cells as compared to video instruction for college students. Meyer et al. [18] revealed the benefits of IVR coupled with pre-training on improving students’ learning, transfer, and self-efficacy. Jian Zhao et al. [17] showed that the IVR environment resulted in increased interest and value, but no significant difference in cognitive load was found. In K-12 settings, Makransky et al. [16] focused on IVR involving a science lesson on forensic analysis of a collected DNA sample. The results showed that the IVR condition led to higher perceived enjoyment for their high school students but not on learning, as compared to the video condition. Generally speaking, previous research in media comparison studies produced mixed results on the effectiveness of IVR on the intended cognitive and motivational outcomes.

1.2. IVR and informal nature-based science learning

IVR has great potential to support nature-based learning, which is defined as learning taking place in outdoor natural environments, such as national parks and state forests [22,23]. This approach offers experiential learning that brings learners out of a classroom and into nature for an authentic experience. It connects to the philosophy that people learn science from everyday experiences about the natural world in informal, non-school settings [24]. For instance, it has been pointed out that nature-based tours can improve visitors’ environmental knowledge and attitudes through first-hand interaction opportunities with plants and animals in their natural environments [25].
Nevertheless, conducting nature-based education has limitations due to high costs and limited budgets, logistics, and accessibility issues, making it hard for teachers to lead students on field trips [26,27]. For example, people with physical disabilities, seniors with limited mobility, or people with financial concerns may not be able to travel to a nature site to enjoy its educational and recreational benefits. Natural disasters such as severe weather, earthquakes, fires, and floods may also limit the general public's access to a nature site.
The use of IVR provides opportunities to engage individuals and promote social equity by reducing real-world barriers. For instance, Harrington [26] noted that the virtual field trip in her study allowed the students greater flexibility in movement and inquiry to the environment around them in the virtual world. In addition, IVR offers opportunities to enhance nature-based learning as it allows for greater manipulations of objects that are not possible in real life for many science subjects. IVR allows learners to ask, "can I try this?" in a risk-free environment and to go where conventional means cannot take them, allowing them considerable freedom to test and cognitively construct knowledge on their own [6,10,28].
A limited number of studies have investigated the effectiveness of IVR within a nature-based learning environment, more specifically, immersive virtual field trips, on science learning and motivation [29,30–32]. For example, Markowitz et al. [29] demonstrated the benefits of an IVR environment in increasing participants’ knowledge of and inquisitiveness about the learning topic concerning climate change, specifically, ocean acidification. Their IVR environment consisted of a narrated immersive underwater world of animated fish and sea life, and the main IVR design elements included physical immersion, embodiment, natural interactions, and time travel. Similarly, Petersen et al. [31] investigated the strategies to maximize the effect of an IVR environment on climate change education through a virtual field trip to Greenland. The IVR program consisted of a 360-degree noninteractive video documentary focusing on the melting Greenland ice sheet. The IVR was enhanced by a narration explaining some foundational concepts (pretraining) related to the learning content. The researchers explored the pretraining principle with this IVR environment, i.e., whether including the narrated pretraining before learners’ IVR exploration would improve learning and motivation as compared to integrated narration within the IVR exploration. The results showed both conditions improved various outcomes such as declarative knowledge, self-efficacy, and STEM intentions, although the pretraining group did perform better on the transfer test. It was argued that the improved transfer was likely due to the provision of pretraining before the IVR experience, resulting in reduced cognitive load during the virtual trip. However, cognitive load was not measured in this study to verify the assumption. Finally, Zhao et al. [32] investigated the effectiveness of an IVR field trip on introductory geoscience education as compared to a desktop VR trip and an actual field trip. The virtual trip included a series of 360-degree images of the study site - Salona Formation (sedimentary rocks) in Pennsylvania. Participants could interact with the IVR environment through various textual, visual or audio narration information, including answering questions related to the learning content. The results demonstrated the benefits of the IVR trip on learner motivation as compared to the desktop VR trip, and both types of virtual trips promoted learning as compared to the actual field trip.
In sum, only a limited number of studies have examined the effect of IVR in facilitating science learning and motivation in informal, nature-based settings. More empirical research is needed to explore effective IVR design elements, especially theory-based design, to maximize the educational and recreational value of this increasingly popular technology.


1.3. Theory-Based design of IVR learning environments

Previous systematic review research indicates that most empirical studies in IVR lack a theoretical foundation in the design of the virtual environment [33]. Indeed, only a small number of studies involving S.T.E.M. learning have explicitly discussed a theory-based approach in their IVR systems (e.g., [10,16,18]). Effective design of IVR learning environments should be guided by relevant theories and design principles in order to maximize the effectiveness of this innovative technology.
One fundamental issue to consider when designing an IVR system relates to the distinctive feature of “immersion” in immersive learning environments. Immersion refers to a state of “shutting out physical reality” while being surrounded by a virtual environment or another reality; it concerns the technological quality of a VR environment, which is an objective measure of how vividly a virtual environment is presented ([34], p.3). While it is often the goal of designers to create a high level of immersion in an IVR environment, recent research highlights the issue that the high immersion level can generate an entirely different learning experience as compared to a conventional multimedia environment, which may drain learners’ cognitive resources [35]. In particular, the perceptual realism in an IVR environment could serve as seductive details (interesting but irrelevant information) that negatively affect learning [10]. Hence, it is essential to use a theory-based approach in IVR design to provide appropriate learning direction and reduce the potential negative impact caused by a high level of immersion.
Two relevant theories that have potential to address the issue of unintended cognitive processing in IVR environments are cognitive load theory (CLT) and cognitive theory of multimedia learning (CTML). Both theories emphasize that learning and instruction should be designed in a way that does not overload learners’ limited cognitive capacity for processing information [36]. As a general theory of human cognition, CLT distinguishes three different types of load, including intrinsic cognitive load that is inherent in the difficulty level of the instructional topics, germane cognitive load that is essential to the learning process, and extraneous cognitive load that is irrelevant to the learning process; in order for effective learning to occur, extraneous cognitive load should be minimized while germane cognitive load should be promoted within the limit of working memory capacity [37–40].
CTML was derived from CLT but posited specifically for learning in interactive technology-enhanced environments [20,36]. Three kinds of cognitive processing during multimedia learning were proposed corresponding to the three types of cognitive load, namely, extraneous processing (extraneous cognitive load), essential processing (intrinsic cognitive load), and generative processing (germane cognitive load) [36]. A number of design principles have been developed from CTML, which summarize conditions of facilitating deeper learning in technology-enhanced learning environments [41,42]. According to Mayer and his colleagues, these principles propose that people learn better when:

(1) extraneous processing is reduced; e.g., words and corresponding pictures are presented concurrently rather than successively (temporal contiguity principle); words and corresponding pictures are physically unified rather than split up (spatial contiguity principle); essential information is signaled or cued (signaling principle), and unnecessary information is excluded rather than included (coherence principle);
(2) essential processing is managed; e.g., words and pictures are presented in segments rather than continuously (segmenting principle); and
(3) generative processing is fostered; e.g., words are presented in personalized conversational style rather than nonpersonal formal style (personalization principle); and information presented allows for learner control in terms of pacing, sequencing, and selecting (learner control principle).

CTML principles provide a practical, theoretical framework for effective IVR design that can prevent cognitive overload or reduce extraneous processing [10,18] for learners navigating in an immersive learning environment. However, only a limited number of empirical studies on S.T.E.M. learning with IVR have integrated one or more of these principles in their IVR design or research framework [10,16,18,20,43,44]. Further, these previous studies were conducted mostly with college students [10,20,67,71] or high school students [16] in a controlled lab setting. It is less clear how an IVR environment designed specifically based on selected CTML principles would impact learners of varying ages in an informal learning setting. In addition, although CTML design principles are based on the cognitive load theory, most of the CTML-related IVR studies did not include cognitive load as an intended outcome measure to validate the design in terms of facilitating learners’ cognitive processing during their IVR experience.

1.4. Purpose of the study

The main purpose of this study was to investigate effective design elements to create virtual nature-based experiences which may result in equivalent, or perhaps better, educational and enjoyable experiences for visitors unable to physically explore nature-based sites. More specifically, there were two components embedded in the main goal of the study: (1) explore effective, theory-based IVR design elements in the context of an informal science learning environment, and (2) increase the accessibility of nature-based sites through theory-based IVR design.
As mentioned previously, despite the benefits of nature-based sites, there are various factors limiting their access. In order to reach the goal of social equity, we need to create innovative ways to offer site-equivalent experiences to individuals who may be otherwise unable to participate or visit nature-based sites. A theory-based approach is important to meet the intended goal that the IVR experience is equivalent to, or perhaps better than, the real-life experience in terms of educational and recreational purposes.
The design of the IVR environment was guided by CLT and CTML to reduce unnecessary cognitive load while promoting germane cognitive load during the virtual tour. The primary research question of the study was: What is the impact of a theory-based IVR design on users’ science learning, cognitive load, self-efficacy and perceived enjoyment when they experience an immersive virtual nature trail, as compared to actual walking tours? We were interested in comparing the IVR tour with walking tours because the IVR design is a replica of the actual nature site, and the results should shed light on the effectiveness of the IVR environment as compared to the authentic physical reality of the nature site. Our hypothesis was that the CTML-based IVR design should have a positive impact on learners’ cognitive load, which would subsequently influence their learning, self-efficacy, and perceived enjoyment.

2. Materials and methods

2.1. The study site

The study site of this research is an urban nature park of more than 70 acres in a southeastern state in the U.S. The park provides an ideal site for nature-based learning experiences as it features karst landscape, including a cave, a spring, and multiple blue holes, as well as other natural features such as a butterfly habitat and wooded nature trails. In addition to directional signage, various signs are set up at the park for informal learning opportunities, such as introduction to the karst features, what blue holes are and why they appear blue, what a watershed is, what a butterfly’s lifecycle is, and what butterflies live in the habitat. Our study focused on one of the nature trails that is approximately a mile long, where participants could explore various native and invasive trees and plants, three blue holes, the butterfly habitat, and the spring.


2.2. Participants and procedure

The study included a total of 120 participants (females = 67, males = 53), who were recruited from both the nature-trail site and a nearby university. Most participants belonged to the Millennials group (N = 43 between ages of 25–40; 35.8%) and the Generation Z group (N = 30 between ages of 18–24; 25%), followed by Baby Boomers (N = 26 with an age above 57; 21.7%) and the Generation X group (N = 21 between ages of 41–56; 17.5%). Eighty percent of the participants reported that they were White (n = 96), followed by Two or More Races (n = 7), Asian (n = 5), Black (n = 4), Latinx or Hispanic (n = 3), Middle Eastern (n = 3), and Asian/Pacific Islander (n = 1). In addition, 31.7% of the participants reported that they had previous experience using an IVR device (n = 38), while 63.3% of the participants (n = 76) did not have any prior IVR experience. For those who had prior IVR experience, the majority of them (n = 30) reported having used it for video games, with a small number of participants (n = 9) who had experienced online VR tours.
Participants were assigned to three groups, an IVR group, an enhanced walking tour group (E-WT), and a business-as-usual control walking tour group (BAU-WT), with 40 participants in each group. The inclusion of the BAU-WT group was intended to examine the effectiveness of the IVR tour as compared to a “physical reality” tour in an authentic natural setting. The E-WT group was included for a similar reason, but with additional learning content support that was provided to the IVR group in order to establish the equivalence between the IVR tour and the walking tour in this aspect (see details in Section 2.3 Apparatus and Learning Materials).
The study was conducted both at the nature-trail site and the nearby university. Participants of the two walking groups were tourists recruited at the nature-trail site. Participants in the IVR group consisted of both tourists at the nature-trail site (N = 21) and undergraduate students at the university (N = 19). More specifically, tourists of the nature-trail site were approached by the researchers and asked whether they would be willing to participate in a research study by taking either a walking tour or an IVR tour. Participants self-selected into one of these two options, and those who were interested in a walking tour were randomly assigned to either the enhanced walking tour group or the control walking tour group. Tourists interested in the IVR tour were led to a quiet conference room at the nature-trail site and individually completed the virtual tour there. As there were not enough onsite tourists who were interested in the IVR tour within the expected timeline, a social media site of the park was used to recruit additional tourists. In addition, undergraduate students at the nearby university were also recruited to the IVR group who completed the study in a lab space on campus. Each participant was offered a ten-dollar gift card.
All participants completed a pre-survey consisting of demographic information and pre-self-efficacy measurement prior to their condition-dependent nature-trail tour, followed by a post-survey that included the instruments on the knowledge test, post-self-efficacy, cognitive load, perceived enjoyment, and perceived usefulness (see Fig. 1).

2.3. Apparatus and learning materials

2.3.1. IVR condition
A total of five learning objectives were included in the virtual nature-trail tour as informal science learning opportunities. These objectives included: (1) identifying different trees and plants at the site, (2) distinguishing native versus invasive species, (3) explaining karst landscapes, (4) explaining the importance of water in a karst environment and how pollution happens to karst environments, and (5) explaining the importance of butterflies in the ecosystem.
The virtual tour was created to replicate the physical site as authentically as possible. It consisted of a series of 360-degree panoramic pictures of the nature trail, with content information included as textual and visual information. Spatial ambient sounds of nature present at this nature-trail site were also included in the virtual tour, e.g., birds singing and water flowing. Participants could interact with the virtual environment through various actionable buttons (see Table 1) using the handheld controllers, such as information icons that would bring up additional content for the user to study, magnifier buttons that would allow for a closer view of signage along the nature trail, map buttons that would show the map of the trail for the user to check their location, as well as previous and forward buttons or icons for the user to go back to a previous scene or move to the next scene of the nature-trail tour.
In general, the IVR design featured three types of interactivity in a self-contained interactive multimodal learning environment, including controlling the pace and/or order of the learning events (e.g., back, continue, home), manipulating learning objects within the virtual environment (e.g., zooming in and out), and navigating to determine the learning content through user choices (e.g., hovering over learning objects for additional information) [45]. Fig. 2 shows the screenshots of example scenes of the IVR tour, including the built-in actionable buttons presented earlier. The virtual tour was created using a commercial VR platform CenarioVR that allows for various interactivity to be embedded in the system, such as linked scenes, hotspots, animation, 3D objects and modeling, and immersive audio. It is a VR authoring tool that does

Fig. 1. Overview of the Research Design.


not require coding.

Table 1
Actionable Buttons Built into the IVR Tour.
Name          Function
Home          Goes to the home scene of the IVR tour
Continue      Moves to the next scene of the IVR tour
Back          Moves to the previous scene of the IVR tour
Magnifier     Brings up a closer view of objects along the nature trail
Information   Brings up additional content for the user to study
Map           Brings up the trail map for the user to check location
Exit          Exits the IVR tour

The design of the IVR learning environment was guided by CLT in general and CTML in particular [41,46–48]. Key design principles focusing on minimizing extraneous processing, managing essential processing, and fostering generative processing were integrated into the IVR design. These principles and their applications are summarized in Table 2.
A systematic instructional design process was followed in creating the IVR environment [68]. It included the following main steps as shown in Fig. 3: (1) developing learning objectives and corresponding assessment instruments, (2) taking 360-degree pictures of the intended nature trail, (3) creating a paper prototype where the virtual tour was depicted by the 360-degree pictures in sequence with corresponding content text, (4) content expert review of the paper prototype, (5) revising the paper prototype based on the feedback, (6) creating an IVR prototype where 360-degree pictures and content were integrated into the VR development program, (7) IVR prototype review (content expert and potential user), (8) revising the IVR tour based on the feedback, and (9) evaluating the IVR tour where users experienced the IVR tour and completed the assessment instruments (see Section 2.4, Assessment Instruments).
Participants in the IVR condition individually viewed the virtual tour using a stand-alone Oculus Quest 2 head-mounted display (HMD) and the accompanying touch controllers (see Fig. 4). The virtual tour was loaded to the Oculus Quest device used for this study as a private, stand-alone app that participants could access through the App Library of the device. The VR HMD has a resolution of 1832 × 1920 pixels per eye and a field of view (FOV) of 89°, with six degrees of freedom (6 DoF) and a refresh rate of 72 Hz. The headset could track the movement of both participants’ head and body. Participants remained seated in swivel chairs at all times during the virtual tour, which allowed them to turn their heads and bodies to view the 360-degree images in a safe environment (e.g., reducing the likelihood of motion sickness or falling [69,70]). Participants were told that they could spend as much time as needed to explore the IVR tour. The average time of their IVR experience was approximately 25 minutes.

2.3.2. Walking conditions
Individuals in both walking groups were provided with a map of the nature-trail site. The route of the specific trail they were supposed to follow was highlighted in yellow, with the starting point and the endpoint marked (the same map that the IVR group could access virtually). Participants were instructed to follow the highlighted trail to complete their walking tour.
In addition, participants in the enhanced walking tour group (E-WT condition) studied the same materials presented in the IVR group, with

Fig. 2. An Annotated Screenshot of the IVR Learning Environment.


the differences lying in the presentation format and study setting. More specifically, before E-WT participants started their walking tour, they were provided with hard copy materials to guide their tour, including (1) a copy of the park map with the nature trail highlighted and marked with the starting and ending points, and (2) a copy of printed information including the screenshots of the IVR tour pictures with accompanying learning content (see Fig. 5).
Participants in the business-as-usual condition (BAU-WT condition) walked the same nature trail with a copy of the park map as in the E-WT condition. However, no additional instructional support was provided.

Table 2
CTML Design Principles Incorporated into the IVR Learning Environment.
Goal                             Design Principle                              Design Realization
Reducing extraneous processing   Contiguity principle (temporal and spatial)   Corresponding pictorial information and textual messages are physically integrated and presented concurrently.
                                 Signaling principle                           Essential information is signaled using actionable icons (e.g., information icon, magnifier icon) and/or textual cues (e.g., “check this sign”).
Managing essential processing    Segmenting principle                          Information is presented in bite-size segments across the 360-degree scenes.
Fostering generative processing  Personalization principle                     Messages are presented in conversational style whenever possible, e.g., “During this walking tour, we will learn about the [park name] system and all of the life that lives here.”
                                 Learner control principle                     Learners have the control to pace, select or navigate the information presented.

2.4. Assessment instruments

Participant learning was measured by a knowledge check test consisting of 15 multiple-choice questions (α = 0.79) related to the science learning topics covered during their exploration of the nature trail (Appendix A). The test included six questions on native and invasive trees and plants (e.g., Which type of species is not from a local habitat and causes harm to the local species?), six questions on the watershed, karst landscape, or blue holes (e.g., Why are karst landscapes more vulnerable to pollution?), and three questions on butterflies (e.g., Why are butterflies important animals to promote in local ecosystems?). The knowledge check

Fig. 3. The Systematic Process of the IVR Design.

Fig. 4. Picture that Shows an IVR Participant Viewing the Virtual Tour Using an Oculus Quest 2.


Fig. 5. An Example of the Guided Information Presented to the E-WT Condition.

was implemented after the completion of the condition-dependent tour. We did not include a pre-test on this measure because previous research shows that taking a pre-test can subsequently influence learner performance on a post-test [51].
In addition, perceived learning was measured by an 8-item, 7-point Likert scale (α = 0.96) adopted from previous research involving virtual reality learning environments [52], e.g., I learned a lot of factual information in the topics (1 = Strongly Disagree; 7 = Strongly Agree; Appendix B).
Self-efficacy was measured by a 5-item, 11-point Likert scale (0 = Cannot do at all; 10 = Highly certain can do), asking the participants to rate their confidence level in the specific knowledge and skills aligned with the learning objectives of the nature-trail tour (e.g., Identify trees and plants at “Park Name”; Appendix C). The instrument was developed by the authors using Bandura’s framework [53]. Self-efficacy was measured twice: before the participants started the tour (α = 0.91) and after they finished the tour (α = 0.92).
Cognitive load was measured by a single-item, 9-point Likert scale (1 = very, very low mental effort; 9 = very, very high mental effort) that asked the participants to rate the mental effort they spent during the nature-trail tour and during the knowledge test, respectively (Appendix D). The instrument was adapted from a well-established instrument measuring subjective ratings of cognitive load [54] that has been used widely in previous empirical studies.
Enjoyment was measured by a 3-item, 7-point Likert scale (α = 0.95), asking participants to rate how enjoyable, fun, or pleasant they perceived the tour, e.g., I found exploring “Park Name” to be enjoyable (1 = Strongly Disagree; 7 = Strongly Agree; Appendix E). The instrument was adapted from previous research [55,56].
Perceived usefulness was measured by a 4-item, 7-point Likert scale (α = 0.89), asking participants to rate how useful they found the tour as a tool to facilitate their learning during the process. The instrument was adapted from Davis [57] and Makransky and Petersen [55], e.g., This type of (virtual) tour was useful in supporting my learning (1 = Strongly Disagree; 7 = Strongly Agree; Appendix F).
In addition, participant baseline information was collected prior to the intervention in a pre-survey, which included items on characteristics such as age, gender, ethnicity, education level, prior IVR experience, and perceived knowledge level of environmental education.

2.5. Data analysis

First, several tests were performed to assess group equivalence among the three conditions in terms of participant background characteristics. Pearson’s Chi-square tests of independence were conducted on participant age, gender, highest degree received, and prior IVR experiences. In addition, one-way analyses of variance (ANOVA) were conducted on participants’ pre-self-efficacy and perceived knowledge of environmental education prior to the intervention.
Second, a series of one-way analyses of covariance (ANCOVA) were conducted to examine the effects of the IVR tour on the intended outcome measures, controlling for the baseline pre-intervention differences among the three conditions found in the group equivalence assessment tests mentioned previously. These control variables included participant age and their previous IVR experiences (see Section 3.1 under Results). The independent variable was the nature-trail tour type at three levels, i.e., the IVR tour versus the E-WT tour versus the BAU-WT tour. The dependent variables included participants’ learning, self-efficacy, cognitive load, perceived enjoyment, and perceived usefulness.

3. Results

3.1. Group basic characteristics prior to intervention

Pearson Chi-Square tests indicated that there was no significant difference among the three groups in the distribution of gender, χ² = 0.27, p = .87, or the highest degree received, χ² = 8.91, p = .06. In addition, a one-way ANOVA test showed that there was no significant difference among the groups on the pre-measures of self-efficacy, F(2, 113) = 1.93, p = .15, or their perceived knowledge of environmental education, F(2, 107) = 2.26, p = .11. However, Pearson Chi-Square tests showed that there was a significant difference among the three groups in the proportion of age categories, χ² = 35.72, p < .001, and their prior IVR experience, χ² = 14.60, p < .001. Therefore, participants’ age and prior IVR experience were used as covariates in the analysis of the main findings.

3.2. Effect on knowledge test

The first line in Table 3 shows the means and standard deviations of the knowledge test scores for the three tour groups, respectively. The descriptive statistics indicated that the IVR group achieved a mastery level of 81% on the knowledge test, followed by the E-WT group of 56%

Table 3
Descriptive Statistics for the Outcome Measures by Condition.
Measures                             IVR Group M (SD)   E-WT Group M (SD)   BAU-WT Group M (SD)
Knowledge Test* (15)                 12.21 (2.35)       8.41 (3.35)         7.31 (2.13)
Perceived Learning* (7)              6.04 (0.75)        5.42 (1.45)         4.91 (1.26)
Post-Self-Efficacy* (10)             6.31 (1.84)        4.68 (2.21)         3.64 (2.66)
Cognitive Load_Tour (9)              5.14 (1.73)        5.06 (1.94)         4.19 (1.49)
Cognitive Load_Knowledge Test (9)    5.42 (1.62)        5.79 (1.32)         5.94 (1.52)
Perceived Enjoyment (7)              6.30 (0.90)        6.69 (0.66)         6.41 (0.84)
Perceived Usefulness* (7)            6.14 (0.92)        5.89 (1.14)         5.45 (0.98)
Note: Values in parentheses in the first column under Measures indicate maximum scores.
* indicates a significant difference between the treatment group and a control group.

X. Huang et al. Computers and Education Open 4 (2023) 100124

and BAU-WT group of 49%. A one-way ANCOVA test showed that there was a significant main effect among the three groups on their knowledge test scores, F (2, 80) = 18.21, p < .001, η² = 0.31. Bonferroni post hoc multiple comparison tests indicated that the IVR group (M = 12.21, SD = 2.35) performed significantly better than both the E-WT group (M = 8.41, SD = 3.35), p < .001, and the BAU-WT group (M = 7.31, SD = 2.13), p < .001. No significant main effect on knowledge test scores was detected for the two covariates, i.e., participant age and prior IVR experience; neither was there any interaction effect, ps > 0.05.

3.3. Effect on perceived learning

The second line in Table 3 displays the means and standard deviations of participants’ perceived learning during their respective tour type. A one-way ANCOVA test indicated that there was a significant main effect among the three groups, F (2, 88) = 5.43, p = .006, η² = 0.11. Follow-up Bonferroni multiple comparisons showed that the IVR group (M = 6.04, SD = 0.75) perceived significantly higher learning than the BAU-WT group (M = 4.91, SD = 1.26). The analysis did not reveal any other significant main effect or interaction effect, ps > 0.05.

3.4. Effect on self-efficacy

The third line in Table 3 reveals the means and standard deviations of participants’ self-efficacy level among the three groups after their condition-dependent tour. A one-way ANCOVA test indicated that there was a significant difference among the three groups, F (2, 86) = 7.69, p < .001, η² = 0.15. Follow-up Bonferroni multiple comparisons indicated that the IVR group (M = 6.31, SD = 1.84) reported a significantly higher level of self-efficacy perceptions as compared to the E-WT group (M = 4.68, SD = 2.21), p = .009, and the BAU-WT group (M = 3.64, SD = 2.66), p < .001. No other significant main effect or interaction effect was detected, ps > 0.05.

3.5. Effect on cognitive load

Lines four and five in Table 3 present the means and standard deviations of participants’ cognitive load during the tour and during the knowledge test, respectively, for the three groups. A one-way ANCOVA test on cognitive load during the tour indicated that there was no significant difference among the IVR group (M = 5.14, SD = 1.73), the E-WT group (M = 5.06, SD = 1.94), and the BAU-WT group (M = 4.19, SD = 1.49), F (2, 78) = 1.61, p = .207, η² = 0.04. No significant effect was detected for other main effects or interaction effects, ps > 0.05, either.
Similarly, a one-way ANCOVA test on cognitive load during the knowledge test showed that there was no significant difference among the IVR group (M = 5.42, SD = 1.62), the E-WT group (M = 5.79, SD = 1.32), and the BAU-WT group (M = 5.94, SD = 1.52), F (2, 84) = 0.92, p = .403, η² = 0.02. Again, no significant effect was detected for other main effects or interaction effects, ps > 0.05, either.

3.6. Effect on perceived enjoyment

The second to the last line in Table 3 shows the means and standard deviations of participants’ perceived enjoyment among the three groups, respectively. A one-way ANCOVA analysis indicated that there was no significant difference among the IVR group (M = 6.30, SD = 0.90), the E-WT group (M = 6.69, SD = 0.66), and the BAU-WT group (M = 6.41, SD = 0.84) on their perceived enjoyment of the tour, F (2, 91) = 2.16, p = .12, η² = 0.05. Similarly, there was no other significant main effect or interaction effect detected, ps > 0.05.

3.7. Effect on perceived usefulness

The last line in Table 3 presents the means and standard deviations of participants’ perceived usefulness of the tour as a learning mechanism. A one-way ANCOVA test showed that there was a significant difference among the three groups, F (2, 91) = 3.07, p = .05, η² = 0.06. Follow-up Bonferroni multiple comparison tests revealed that the IVR group (M = 6.14, SD = 0.92) perceived the tour as more useful than the BAU-WT group (M = 5.45, SD = 0.98). There was no significant difference between the IVR group and the E-WT group (M = 5.89, SD = 1.14), nor was there any other significant main effect or interaction effect involving the two covariates, ps > 0.05.

4. Discussion

4.1. Main findings

This study investigated the effects of an informal, theory-based IVR science learning environment in the format of a virtual nature-trail tour on various cognitive and motivational outcomes, including learning (knowledge test and perceived learning), cognitive load, self-efficacy, perceived enjoyment, and perceived usefulness. Two control groups were included, consisting of an enhanced walking tour (E-WT) group and a business-as-usual walking tour (BAU-WT) group. As presented in the previous section, the results indicate that the IVR group performed significantly better on the knowledge test and reported significantly higher self-efficacy than both walking tour groups. In addition, the IVR group reported significantly higher levels of perceived learning and perceived usefulness than the BAU-WT group. There was no significant difference among the three groups on cognitive load and perceived enjoyment. In other words, these findings indicate that the theory-based IVR design was effective in improving users’ science learning and developing their self-efficacy perceptions; at the same time, the IVR tour was as enjoyable as the walking tours and did not pose additional cognitive load during the learning process.

4.2. Theoretical implications

This study contributes to the research on theory-based IVR design to support science learning and motivational outcomes in an informal nature-based learning environment. The design of the IVR environment was guided by the general CLT framework [40], and more specifically, CTML learning and design principles [41,46]. The goal of the design was to reduce unnecessary cognitive processing that may be caused by the high level of immersion in an IVR environment [10], while at the same time managing essential processing and promoting generative processing through different design elements. CTML principles were originally developed for less immersive multimedia learning environments and have been validated mostly with non-immersive learning environments, which calls for more empirical work investigating the effects of these principles in highly immersive learning environments [31]. The IVR design of this study incorporated a number of CTML principles, including the contiguity, signaling, segmenting, personalization, and learner control principles. The findings of the present study indicate that CTML principles can be applied in IVR environments to facilitate science learning and self-efficacy development without negatively affecting learners’ enjoyment and cognitive load.
In addition, although CTML can serve as an important theory in guiding the design of IVR environments in terms of facilitating learners’ cognitive processing, few researchers have investigated how their IVR design would actually impact learners’ cognitive load. The finding that the IVR group in the present study did not experience a higher level of cognitive load than the two walking groups is of particular interest. Previous research suggests that immersion characteristics in an IVR environment may serve as seductive details that induce extraneous cognitive processing [10]. The inclusion of the cognitive load measure in this study indicates that a CTML theory-based IVR environment does not have to induce unnecessary cognitive processing that negatively influences learning. At the same time, previous research [58] indicates


that the novelty of IVR environments can distract learners, i.e., a possible source of extraneous cognitive load; however, the novelty effect could be mitigated through strategies such as providing sufficient time for the users to explore an IVR environment. Consistent with this line of argument, the present study did not limit participants’ time on the IVR tour, which could offer an additional explanation for the absence of extra cognitive load in the IVR group.
Another unique aspect of the study is that it expands previous IVR research in S.T.E.M. education to include a participant sample not often studied. That is, unlike most previous IVR research involving S.T.E.M. learning content, which was conducted entirely with college students or K-12 students (e.g., [16–19]; see Discussion in Section 1), the present study recruited adult participants of varying ages in an informal learning setting. Furthermore, few studies have examined the effectiveness of IVR environments with the focus on the intended outcomes of learning, self-efficacy, cognitive load, perceived enjoyment and usefulness in one empirical study, as did the present study.
Previous research has reported mixed results concerning the use of IVR to promote science learning and motivational outcomes. While some studies support the effectiveness of IVR to increase science learning and motivational outcomes [32,59], others suggest, as previously indicated, that the perceptual realism in an IVR environment could serve as seductive details that negatively affect learning [10]. The findings of this research are consistent with Klippel et al. [59] and Zhao et al. [32] in supporting the effectiveness of an IVR environment to increase science learning and motivational outcomes. IVR content in our study was presented via textual and visual information, with background ambient sounds of the nature trail (e.g., birds singing, water flowing) to create an environment that was as authentic as possible. This format was shown to be effective in facilitating the intended learning and motivational outcomes. However, according to previous research, the effect of ambient sounds on learning is open to two contrasting interpretations. On the one hand, the ambient sounds should lead to an increased level of immersion, i.e., technological quality of the IVR environment [34], which may consequently improve learning [52,60]. On the other hand, the ambient sounds may be considered as seductive details, i.e., interesting but irrelevant information, which may be in violation of the coherence principle (although the textual instruction included in the IVR tour is mostly aligned with this principle), and consequently, might have negatively affected learning [61]. As both lines of argument are plausible, it will be interesting to compare the current IVR environment with a less immersive one that does not include the ambient sounds. Ultimately, the question perhaps boils down to how we should design an IVR environment to achieve a balance between the immersion level and the intended learning outcomes.

4.3. Practical implications

It is worth restating that the participants in the IVR group enjoyed the tour as much as their counterparts in the walking tours, without experiencing higher unnecessary cognitive load, while at the same time learning more about the nature-trail site and developing higher self-efficacy on the learning topics. In other words, wearing an HMD headset and sitting in a chair in an indoor room without getting in touch with nature and having all the senses involved (e.g., smell, touch) does not necessarily remove the users from the perceived benefits that a physical environment can offer. This finding suggests promising use of theory-based IVR environments for both educational and entertainment purposes.
In addition, the results of the study suggest that IVR can offer a viable way to increase the accessibility of nature-based education. Learning in the field presents many practical challenges that may limit accessibility, such as weather, transportation, and disabilities that leave some people physically unable to visit a nature-based site [30]. This study provides evidence that a theory-based IVR environment can be an effective alternative to learning in the field.

4.4. Limitations and future directions

One limitation of the study concerns the issue of learning assessment. We measured participant learning using a multiple-choice knowledge check test, which focused on the learning retention of factual knowledge gained through the nature-trail tours. Hence, the effect of the IVR tour on learning transfer as compared to the walking tours is unclear. In addition, previous research has pointed out the complexity of documenting science learning in informal settings [62]. For example, participants in the walking groups completed the study in a setting that was unlike a typically controlled lab setting. How participants in the walking tours actually studied the learning materials is less clear. There may be inadvertent factors, e.g., other tourists or other plants or trees not included in the instruction, that could distract participants from the central learning objectives intended for the study. It has been suggested that naturalistic and open-ended assessment methods may better capture the complexity of the informal learning process and better align with participants’ expectations about learning in such settings [62]. Future research may include qualitative learning assessment approaches (e.g., interviews) when investigating the effectiveness of an IVR tour as compared to a control group.
In addition, we measured cognitive load using a self-report, one-item scale [54]. Although this scale has been validated and widely used in previous empirical work, it cannot distinguish the three different types of cognitive load proposed in the literature (i.e., intrinsic, germane, and extraneous load). The challenges of measuring cognitive load in a technology-supported learning environment have generated numerous meaningful discussions, such as what constitutes germane load and extraneous load and their role in contributing to the overall load (e.g., [35,63]). These questions are especially crucial to investigate in an IVR environment in that various design elements contributing to creating an immersive environment, such as vividness and interactive quality, may potentially impose extraneous cognitive load and consequently interfere with learning [10]. Therefore, a valid instrument targeting the three distinct types of cognitive load in an interactive and immersive multimedia environment will be helpful to discern the role of different design elements contributing to cognitive processing.
Furthermore, this study did not completely assign participants at random to different groups. Rather, participants self-selected to the IVR group or the walking groups first, and then only those who preferred to walk were randomly assigned to a business-as-usual walking tour or an enhanced walking tour. This procedure was adopted because park tourists were recruited on site, most of whom had their own schedule of taking a walk rather than experiencing a virtual tour. Thus, it is possible that those who selected the IVR condition were more motivated to use this technology, which could have inadvertently affected the study results. A true experimental study with random assignment of participants to all conditions may provide insights on this aspect. In addition, future research may consider including a control condition in a lab setting, for example, asking participants to study the learning materials for the enhanced walking tour group in a quiet lab. Doing so will help remove possible distracting factors that are inherent in a physical informal learning setting.
Another limitation is that only one IVR condition was included in the present study; thus, it is not possible to pinpoint the relative contributions of the CTML design elements incorporated into the IVR condition. As the goal of the study was to investigate the efficacy of using IVR as a tool to support informal science learning, we were more interested in finding out how a theory-based IVR design would work to facilitate various learning and motivational outcomes as compared to a physical informal learning setting. However, it will be beneficial for future studies to compare different CTML design elements by employing multiple versions of an IVR design. Future researchers are also encouraged to explore CTML design elements targeting various content areas in virtual environments (e.g., learning cultural heritage through a guided city tour as in [64]). This will provide further insights on using


theory-based design to guide the development of IVR learning environments.
Similarly, future studies may examine how distinct levels of interactivity in an IVR environment impact the intended learning and motivational outcomes. The IVR tour of this research was created using 360-degree pictures of the nature-trail site. As compared to 3D modeled virtual environments, 360-degree panorama VR environments have the benefits of “low-cost, easy-to-capture, non-computer-generated simulations that can provide a true-to-reality representation of the environments” ([65], p. 572). 3D modeled environments can be costly, unrealistic, and may not fully depict real-world environments [65,66]. That said, a 3D modeled IVR environment has the capability of providing more opportunities for interactivity, e.g., users interacting with butterflies, plants, and tree fruits through grabbing-and-placing. Previous research [67] pointed out that the most unconstrained category of interactivity in a virtual environment is explorative interaction (allowing exploration freely without any restriction), while the most restricted form is passive experience (highly limited interactivity and movement, such as observing a virtual nature environment without any user interaction; [66]). Although the IVR environment in this study included interactivity elements (see Materials and Methods), the level of interactivity was limited to what 360-degree panorama VR environments could offer. More empirical research should explore how this “reality versus interactivity trade-off” impacts the learning process in IVR environments ([66], p. 3).
Last but not least, as this research is a short-term intervention study, some researchers may argue that the positive findings obtained for the IVR group (e.g., higher level of learning retention and self-efficacy) could be attributed partly to the novelty effect. That is, it may seem plausible that the improvement in these learning and motivational outcomes could be a result of participants’ increased interest and attention during the learning process due to the newness of this IVR technology [58,68]. Future long-term IVR studies could shed more light on this question by examining whether the same result patterns are obtained over a longer period, after the users become familiar with the IVR technology. In so doing, we may safely conclude that the novelty effect can be excluded as a probable explanation for the research findings.

5. Conclusion

In sum, the present study provides evidence for the efficacy of using a theory-based IVR environment as a tool to support informal, nature-based science learning and motivational outcomes. The design of the IVR environment is guided by multiple CTML principles, which supports recent research that this theoretical framework can be effectively implemented in highly immersive learning environments. In addition, the study indicates that IVR can serve as an effective alternative to in-person nature-based experiences, which supports the idea that IVR is a viable tool to increase the accessibility of nature-based learning. Future research should extend the current work. Possible directions may include: (1) conducting true experimental research involving multiple versions of CTML-based IVR design or including control conditions in a lab setting, (2) investigating how different levels of interactivity in an IVR design may influence the intended learning and motivational outcomes, (3) assessing different types of cognitive load to better understand how various IVR design elements contribute to cognitive processes, or (4) assessing learning in a way that captures the complexity of informal, nature-based learning.

Declaration of Competing Interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement

This research was supported by a Research & Creative Activities Program grant (RCAP #20–8027) at Western Kentucky University. The authors would like to thank Chadwick Singer at Lost River Cave for his contribution to the project, including his feedback on the learning content during the design and development phase and his help in scheduling during the data collection phase at the site.
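
As a reading aid, the effect sizes reported in Sections 3.2 to 3.7 can be cross-checked against their F statistics. The sketch below is not part of the authors' analysis. It assumes that the reported η² values are partial eta squared (the text does not say which variant was used) and applies the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The function and variable names are illustrative; the F values, degrees of freedom, and η² values are copied from the results text.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    effect_term = f_value * df_effect
    return effect_term / (effect_term + df_error)


# (outcome, F, df_effect, df_error, reported eta squared), copied from Sections 3.2-3.7.
REPORTED = [
    ("knowledge test", 18.21, 2, 80, 0.31),
    ("perceived learning", 5.43, 2, 88, 0.11),
    ("self-efficacy", 7.69, 2, 86, 0.15),
    ("cognitive load (tour)", 1.61, 2, 78, 0.04),
    ("cognitive load (knowledge test)", 0.92, 2, 84, 0.02),
    ("perceived enjoyment", 2.16, 2, 91, 0.05),
    ("perceived usefulness", 3.07, 2, 91, 0.06),
]


def check_reported(rows=REPORTED):
    """Return (outcome, recomputed value rounded to 2 decimals, reported value)."""
    return [
        (name, round(partial_eta_squared(f, df1, df2), 2), eta)
        for name, f, df1, df2, eta in rows
    ]


if __name__ == "__main__":
    for name, recomputed, reported in check_reported():
        print(f"{name}: recomputed {recomputed:.2f}, reported {reported:.2f}")
```

Under this assumption the recomputed values agree with the reported ones to two decimal places, which is consistent with (but does not by itself establish) the partial eta squared reading.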

Appendix A. Knowledge Check Test

Please circle the correct answer for each question below based on your understanding.

1 Osage Orange is a native tree to (State Name). What is the distinguishing feature to remember this tree?
A The tree has bark that peels off and the trunk becomes hollow.
B The tree has large, round green balls that drop from the female tree. *
C The tree has tulip looking flowers on it in the spring.
D The tree has rough, warty bark.
2 Which is a hardwood tree found in (Park Name)?
A Maple*
B Pine
C Cedar
D Redwood
3 Why are native plants important to grow?
A Native plants are from the area and have adapted to the climate and terrain. *
B Native plants keep nonnatives and invasives from growing in the area.
C Native plants require more water and more care from humans.
D Native plants spread quickly and take over a place.
4 Which type of species is not from a local habitat and causes harm to the local species?
A Local species
B Native species
C Nonnative species
D Invasive species *
5 Which plant is a native species?
A Wintercreeper
B Garlic mustard
C Tree-of-heaven
D Wild ginger *
6 Which of the following examples is a way to restore native prairies?
A Allow the meadow to go through succession naturally.
B Burn it. *
C Plant nonnative plants.
D Plant invasive plants.
7 In the ancient past, (State Name) was a/an:
A mountain range
B desert
C ocean *
D forest
8 Which is a feature of karst landscapes?
A rivers
B blueholes *
C plateaus
D prairies
9 What helps the bedrock dissolve in karst landscapes?
A acidic water *
B polluted water
C warm water
D cool water
10 Why are karst landscapes more vulnerable to pollution?
A They are found where a lot of people are located.
B Caves and sinkholes are large so trash can get in them.
C The surface and subsurface are highly interconnected. *
D They aren’t any more vulnerable to pollution than another type of landscape.
11 Why is it important to pick up trash at (Park Name)?
A Trash is an ugly nuisance.
B Animals might eat the trash on the trail.
C Trash takes a long time to decompose.
D Trash can end up in the watershed, negatively affecting the drinking water. *
12 Why are watersheds important for humans?
A They provide areas to view water systems safely.
B Water collects here because they are low points in the landscape.
C Watersheds collect water for human consumption. *
D Watersheds provide drinking water for animals.
13 Why does the viceroy mimic the monarch butterfly?
A The monarch camouflages itself from predators.
B The monarch has eyespots that scare away predators.
C The monarch eats poisonous food making it poisonous to predators. *
D The monarch migrates to Mexico in the wintertime.
14 Why are the butterflies from the butterfly house not allowed to be released to the wild?
A The butterflies are nonnatives to (State Name).
B The butterflies were not born in this local area and so might spread disease. *
C The butterflies would outcompete the local butterflies.
D The butterflies would disrupt the lifecycle of the local butterflies.
15 Why are butterflies important animals to promote in local ecosystems?
A They camouflage well with their environment.
B They are predators to insect pests.
C They help pollinate flowers and other plants. *
D They cause great damage to agriculture.

Note: * indicates the correct answer
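
For readers who want to reproduce the scoring, the answer key above can be applied mechanically. The following is a minimal sketch, assuming one point per correct item (consistent with the maximum score of 15 listed in Table 3, although the scoring rule is not spelled out in the text). The names `ANSWER_KEY`, `score_test`, and `percent_of_max` are illustrative and not taken from the study materials.

```python
# Answer key transcribed from the starred options in Appendix A (items 1 to 15).
ANSWER_KEY = "BAADDBCBACDCCBC"


def score_test(responses, key=ANSWER_KEY):
    """Count correct answers, assuming one point per item (an assumption)."""
    if len(responses) != len(key):
        raise ValueError("expected one response per item")
    return sum(given.upper() == correct for given, correct in zip(responses, key))


def percent_of_max(score, max_score=len(ANSWER_KEY)):
    """Express a raw score (or a group mean) as a percentage of the maximum."""
    return 100.0 * score / max_score


if __name__ == "__main__":
    # Group means from Table 3 converted to the percentages quoted in Section 3.2.
    for group, mean in [("IVR", 12.21), ("E-WT", 8.41), ("BAU-WT", 7.31)]:
        print(f"{group}: {percent_of_max(mean):.0f}%")
```

Applied to the group means in Table 3, this conversion yields the 81%, 56%, and 49% figures quoted in Section 3.2.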

Appendix B. Perceived Learning (Adopted from [52])

Using the scale below, please indicate the extent to which you agree or disagree with each of the following statements by circling the number that
corresponds to your opinion. (1 = Strongly Disagree; 7 = Strongly Agree)

1 I was more interested to learn the topics.
2 I learned a lot of factual information in the topics.
3 I gained a good understanding of the basic concepts of the materials.
4 I learned to identify the main and important issues of the topics.
5 I was interested and stimulated to learn more.
6 I was able to summarize and conclude what I learned.
7 The learning activities were meaningful.
8 What I learned, I can apply in real context.

Appendix C. Self-efficacy [Developed by the authors following Bandura’s [53] framework]

On a scale from 0 (cannot do at all) to 10 (highly certain can do), please rate how certain you are that you can perform each of the skills below.
Record a number from 0 to 10 for each statement to indicate your degree of confidence, using the scale above.

Skills / Your confidence (0–10)
Identify trees and plants at (Park Name).
Distinguish native versus invasive trees and plants at (Park Name).
Explain karst landscapes at (Park Name).
Explain the importance of water in a karst environment and how pollution happens to karst environments.
Explain the importance of butterflies in the ecosystem.

Appendix D. Cognitive Load (Adapted from [54])

Cognitive load during the tour

Please indicate the amount of mental effort you spent learning about (Park Name) during your tour. Circle the corresponding number below.

Cognitive load during the knowledge test

Please indicate the amount of mental effort you spent while completing the knowledge check questions. Circle the corresponding number below.

Appendix E. Perceived Enjoyment (Adapted from [55,56])

Using the scale below, please indicate the extent to which you agree or disagree with each of the following statements by circling the number that
corresponds to your opinion. (1 = Strongly Disagree; 7 = Strongly Agree)

1 I found exploring (Park Name) (virtually) to be enjoyable.
2 Touring (Park Name) (virtually) is pleasant.
3 I had fun exploring (Park Name) (virtually).

Appendix F. Perceived Usefulness (Adapted from [55,57])

Using the scale below, please indicate the extent to which you agree or disagree with each of the following statements by circling the number that
corresponds to your opinion. (1 = Strongly Disagree; 7 = Strongly Agree)

1 Using this type of virtual tour as a tool for learning increased my understanding and knowledge on the topics.
2 This type of virtual tour enhanced the effectiveness of my learning.
3 This type of virtual tour allowed me to progress at my own pace.
4 This type of virtual tour was useful in supporting my learning.

References

[1] Burbules NC. Rethinking the virtual. In: Weiss J, Nolan J, Hunsinger J, Trifonas P, editors. The international handbook of virtual learning environments. Dordrecht: Springer; 2006. p. 37–58. https://doi.org/10.1007/978-1-4020-3803-7.
[2] Pantelidis V. Virtual reality in the classroom. Educ Technol 1993;33(4):23–7.
[3] Ryan M-L. Immersion vs. interactivity: virtual reality and literary theory. SubStance 1999;28(2):110–37. https://doi.org/10.2307/3685793.
[4] Akpan IJ, Shanker M. A comparative evaluation of the effectiveness of virtual reality, 3D visualization and 2D visual interactive simulation: an exploratory meta-analysis. Simulation 2019;95(2):145–70. https://doi.org/10.1177/0037549718757039.
[5] Helsel S. Virtual reality and education. Educ Technol 1992;32(5):38–42. https://doi.org/10.1109/ICSMC.1992.271688.
[6] Holly M, Pirker J, Resch S, Brettschuh S, Gütl C. Designing VR experiences – Expectations for teaching and learning in VR. Educ Technol Soc 2021;24(2):107–19.
[7] Psotka J. Immersive training systems: virtual reality and education and training. Instr Sci 1995;23(5/6):405–31.
[8] Bailenson JN, Yee N, Blascovich J, Beall AC, Lundblad N, Jin M. The use of immersive virtual reality in the learning sciences: digital transformations of teachers, students, and social context. J Learn Sci 2008;17(1). https://doi.org/10.1080/10508400701793141.
[9] Mooradian N. Virtual reality, ontology, and value. Metaphilosophy 2006;37(5):673–90.
[10] Makransky G, Terkildsen TS, Mayer RE. Adding immersive virtual reality to a science lab simulation causes more presence but less learning. Learn Instr 2019;60(December 2017):225–36. https://doi.org/10.1016/j.learninstruc.2017.12.007.
[11] Falco LD, Summers JJ. Improving career decision self-efficacy and STEM self-efficacy in high school girls: evaluation of an intervention. J Career Develop 2019;46(1):62–76. https://doi.org/10.1177/0894845317721651.
[12] Ketelhut DJ. The impact of student self-efficacy on scientific inquiry skills: An exploratory investigation in river city, a multi-user virtual environment. J Sci Educ Technol 2007;16(1):99–111. https://doi.org/10.1007/s10956-006-9038-y.
[13] Lin L, Lee T, Snyder LA. Math self-efficacy and STEM intentions: a person-centered approach. Front Psychol 2018;9(OCT):1–13. https://doi.org/10.3389/fpsyg.2018.02033.
[22] Ballantyne R, Packer J. Nature-based excursions: School students’ perceptions of learning in natural environments. Int Res Geogr Environ Educ 2002;11(3):218–36. https://doi.org/10.1080/10382040208667488.
[23] Mann J, Gray T, Truong S, Sahlberg P, Bentsen P, Passy R, Ho S, Ward K, Cowper R. A systematic review protocol to identify the key benefits and efficacy of nature-based learning in outdoor educational settings. Int J Environ Res Public Health 2021;18(3):1–10. https://doi.org/10.3390/ijerph18031199.
[24] National Research Council. Learning science in informal environments: people, places, pursuits. The National Academies Press; 2009. https://doi.org/10.1080/09500690903454217.
[25] Ardoin NM, Wheaton M, Bowers AW, Hunt CA, Durham WH. Nature-based tourism’s impact on environmental knowledge, attitudes, and behavior: a review and analysis of the literature and potential future research. J Sustain Tour 2015;23(6):838–58. https://doi.org/10.1080/09669582.2015.1024258.
[26] Harrington MCR. An ethnographic comparison of real and virtual reality field trips to trillium trail: the salamander find as a salient event. Children, Youth Environ 2009;19(1):74–101. http://www.jstor.org/stable/10.7721/chilyoutenvi.19.1.0074.
[27] Liu AT, Tan T, Chu Y, Liu T, Tan T, Chu Y. Outdoor natural science learning with an RFID-supported immersive ubiquitous learning environment. J Educ Technol Soc 2009;12(4):161–75.
[28] Hanson K, Shelton BE. Design and development of virtual reality: analysis of challenges faced by educators. J Educ Technol Soc 2008;11(1):118–31.
[29] Markowitz DM, Laha R, Perone BP, Pea RD, Bailenson JN. Immersive virtual reality field trips facilitate learning about climate change. Front Psychol 2018;9(NOV):1–20. https://doi.org/10.3389/fpsyg.2018.02364.
[30] Mead C, Buxner S, Bruce G, Taylor W, Semken S, Anbar AD. Immersive, interactive virtual field trips promote science learning. J Geosci Educ 2019;67(2):131–42. https://doi.org/10.1080/10899995.2019.1565285.
[31] Petersen GB, Klingenberg S, Mayer RE, Makransky G. The virtual field trip: investigating how to optimize immersive virtual learning in climate change education. British J Educ Technol 2020;51(6):2098–114. https://doi.org/10.1111/bjet.12991.
[32] Zhao J, LaFemina P, Carr J, Sajjadi P, Wallgrün JO, Klippel A. Learning in the field: Comparison of desktop, immersive virtual reality, and actual field trips for place-based STEM education. In: Proceedings of the 2020 IEEE Conference on Virtual Reality and 3D User Interfaces (VR) Learning, March; 2020. p. 939–48. https://doi.org/10.1109/vr46266.2020.00114.
[14] Lai TL, Lin YS, Chou CY, Yueh HP. Evaluation of an inquiry-based virtual lab for
[33] Radianti J, Majchrzak TA, Fromm J, Wohlgenannt I. A systematic review of
junior high school science classes. J Educ Comput Res 2021;1. https://doi.org/
immersive virtual reality applications for higher education: Design elements,
10.1177/07356331211001579.
lessons learned, and research agenda. Comput Educ 2020;147(July 2019):103778.
[15] Madden J, Pandita S, Schuldt JP, Kim B, Won AS, Holmes NG. Ready student one:
https://doi.org/10.1016/j.compedu.2019.103778.
exploring the predictors of student learning in virtual reality. PLoS One 2020;15
[34] Cummings JJ, Bailenson JN. How immersive is enough? A meta-analysis of the
(3):1–26. https://doi.org/10.1371/journal.pone.0229788.
effect of immersive technology on user presence. Media Psychol 2015;19(2):
[16] Makransky G, Andreasen NK, Baceviciute S, Mayer RE. Immersive virtual reality
272–309. https://doi.org/10.1080/15213269.2015.1015740.
increases liking but not learning with a science simulation and generative learning
[35] Skulmowski A, Xu KM. Understanding cognitive load in digital and online learning:
strategies promote learning in immersive virtual reality. J Educ Psychol 2021;113
a new perspective on extraneous cognitive load. Educ Psychol Rev 2021. https://
(4):719–35. https://doi.org/10.1037/edu0000473.
doi.org/10.1007/s10648-021-09624-7.
[17] Zhao Jian, Lin L, Sun J, Liao Y. Using the summarizing strategy to engage learners:
[36] Mayer RE, Moreno R. Techniques that reduce extraneous cognitive load and
empirical evidence in an immersive virtual reality environment. Asia-Pacific Educ
manage intrinsic cognitive load during multimedia learning. In: Plass JL,
Researcher 2020;29(5):473–82. https://doi.org/10.1007/s40299-020-00499-w.
Moreno R, Brünken R, editors. Cognitive load theory. Cambridge University Press;
[18] Meyer OA, Omdahl MK, Makransky G. Investigating the effect of pre-training when
2010. p. 131–51. https://doi.org/10.1016/s0079-7421(02)80005-6.
learning through immersive virtual reality and video: a media and methods
[37] Chandler P, Sweller J. Cognitive load theory and the format of instruction. Cogn
experiment. Comput Educ 2019;140(December 2018):103603. https://doi.org/
Instr 1991;8(4):293–332.
10.1016/j.compedu.2019.103603.
[38] Sweller J. Cognitive load during problem solving: Effects on learning. Cogn Sci
[19] Pande P, Thit A, Sørensen AE, Mojsoska B, Moeller ME, Jepsen PM. Long-term
1988;12:257–85.
effectiveness of immersive VR simulations in undergraduate science learning:
[39] Sweller J. Cognitive load theory, learning difficulty, and instructional design. Learn
lessons from a media-comparison study. Res Learn Technol 2021;29(1063519):
Instr 1994;4(4):295–312. https://doi.org/10.1016/0959-4752(94)90003-5.
1–24. https://doi.org/10.25304/rlt.v29.2482.
[40] Sweller J, van Merriënboer JJG, Paas F. Cognitive architecture and instructional
[20] Parong J, Mayer RE. Learning science in immersive vrtual reality. J Educ Psychol
design: 20 years later. Educ Psychol Rev 2019;31(2):261–92. https://doi.org/
2018;110(6):785–97. https://doi.org/10.1037/edu0000241.
10.1007/s10648-019-09465-5.
[21] Kozhevnikov M, Gurlitt J, Kozhevnikov M. Learning relative motion concepts in
[41] Mayer, R.E. (2002). Cognitive theory and the design of multimedia instruction: An
immersive and non-immersive virtual environments. J Sci Educ Technol 2013;22
example of the two-way street between cognition and instruction. New Directions
(6):952–62. https://doi.org/10.1007/s10956-013-9441-0.
for Teaching and Learning, 2002(89), 55–71. 10.1002/tl.47.

X. Huang et al. Computers and Education Open 4 (2023) 100124

[42] Mayer RE, Fiorella L, editors. The Cambridge handbook of multimedia learning. 3rd ed. Cambridge University Press; 2021. https://doi.org/10.1017/9781108894333.
[43] Vogt A, Babel F, Hock P, Baumann M, Seufert T. Immersive virtual reality or auditory text first? Effects of adequate sequencing and prompting on learning outcome. British J Educ Technol 2021;January:1–19. https://doi.org/10.1111/bjet.13104.
[44] Moreno R, Mayer RE. Interactive multimodal learning environments: special issue on interactive learning environments: contemporary issues and trends. Educ Psychol Rev 2007;19(3):309–26. https://doi.org/10.1007/s10648-007-9047-2.
[45] Mayer RE. Multimedia learning. 3rd ed. Cambridge University Press; 2021.
[46] Mayer RE, Moreno R. Nine ways to reduce cognitive load in multimedia learning. Educ Psychol 2003;38(1):43–52. https://doi.org/10.1207/S15326985EP3801_6.
[47] Moreno R, Mayer RE. Techniques that increase generative processing in multimedia learning: open questions for cognitive load research. In: Plass JL, Moreno R, Brünken R, editors. Cognitive load theory. Cambridge University Press; 2010. p. 153–77.
[48] Brown PC, Roediger HL, McDaniel MA. Make it stick: the science of successful learning. Harvard University Press; 2014.
[49] Lee EA-L, Wong KW, Fung CC. How does desktop virtual reality enhance learning outcomes? A structural equation modeling approach. Comput Educ 2010;55(4):1424–42. https://doi.org/10.1016/j.compedu.2010.06.006.
[50] Bandura A. Guide for constructing self-efficacy scales. In: Pajares F, Urdan T, editors. Self-efficacy beliefs of adolescents. Information Age Publishing; 2006. p. 307–37.
[51] Paas F, Van Merriënboer JJG, Adams JJ. Measurement of cognitive load in instructional research. Percept Mot Skills 1994;79:419–30.
[52] Makransky G, Petersen GB. Investigating the process of learning with desktop virtual reality: a structural equation modeling approach. Comput Educ 2019;134(February):15–30. https://doi.org/10.1016/j.compedu.2019.02.002.
[53] Tokel, S.T. (2015). Acceptance of virtual worlds as learning space. Innovations in Education and Teaching International, 52(3), 254–264. https://eds-a-ebscohost-com.ezp.waldenulibrary.org/eds/pdfviewer/pdfviewer?vid=12&sid=ba0154b1-03fe-4695-95e4-0f4767694fd6%40sessionmgr4007.
[54] Davis FD. Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS Q 1989;13(3):319–40. https://doi.org/10.2307/249008.
[55] Checa, D., Miguel-Alonso, I., & Bustillo, A. (2021). Immersive virtual-reality computer-assembly serious game to enhance autonomous learning. Virtual Reality, 0123456789. https://doi.org/10.1007/s10055-021-00607-1.
[56] Klippel A, Zhao J, Jackson KL, La Femina P, Stubbs C, Wetzel R, Blair J, Wallgrün JO, Oprean D. Transforming earth science education through immersive experiences: delivering on a long held promise. J Educ Comput Res 2019;57(7):1745–71. https://doi.org/10.1177/0735633119854025.
[57] Makransky G, Lilleholt L. A structural equation modeling investigation of the emotional value of immersive virtual reality in education. Educ Technol Res Develop 2018;66(5):1141–64. https://doi.org/10.1007/s11423-018-9581-2.
[58] Parong J, Mayer RE. Cognitive and affective processes for learning science in immersive virtual reality. J Comput Assist Learn 2021;37(1):226–41. https://doi.org/10.1111/jcal.12482.
[59] Kisiel J, Anderson D. The challenges of understanding science learning in informal environments. Curator 2010;53(2):181–9.
[60] Cook AE, Zheng RZ, Blaz JW. Measurement of cognitive load during multimedia learning activities. Cognitive Effects Multimedia Learn 2009;January:34–50. https://doi.org/10.4018/978-1-60566-158-2.ch003.
[61] Checa D, Bustillo A. Advantages and limits of virtual reality in learning processes: Briviesca in the fifteenth century. Virtual Reality 2020;24(1):151–61. https://doi.org/10.1007/s10055-019-00389-7.
[62] Ritter KA, Chambers TL. Three-dimensional modeled environments versus 360 degree panoramas for mobile virtual reality training. Virtual Reality 2022;26(2):571–81. https://doi.org/10.1007/s10055-021-00502-9.
[63] Yeo NL, White MP, Alcock I, Garside R, Dean SG, Smalley AJ, Gatersleben B. What is the best way of delivering virtual nature for improving mood? An experimental comparison of high definition TV, 360° video, and computer generated virtual reality. J Environ Psychol 2020;72:101500. https://doi.org/10.1016/j.jenvp.2020.101500.
[64] Checa D, Bustillo A. A review of immersive virtual reality serious games to enhance learning and training. Multimed Tools Appl 2020;79(9–10):5501–27. https://doi.org/10.1007/s11042-019-08348-9.
[65] Koch M, Von Luck K, Schwarzer J, Draheim S. The novelty effect in large display deployments-experiences and lessons-learned for evaluating prototypes. In: Proceedings of the ECSCW 2018 - Proceedings of the 16th European Conference on Computer Supported Cooperative Work; 2018. https://doi.org/10.18420/ecscw2018.
[66] Cheng KH, Tsai CC. A case study of immersive virtual field trips in an elementary classroom: students’ learning experience and teacher-student interaction behaviors. Comput Educ 2019;140(December 2018):103600. https://doi.org/10.1016/j.compedu.2019.103600.
[67] Olson R, Wagler M. Afield in Wisconsin: Cultural tours, mobile learning, and place-based games. Western Folklore 2011;70(3–4):287–309.
[68] Frick, T., & Boling, E. (2002). Effective web instruction: Handbook for an inquiry-based process. Unpublished manuscript. Bloomington, IN: Indiana University.
[69] Merhi O, Faugloire E, Flanagan M, Stoffregen TA. Motion sickness, console video games, and head-mounted displays. Human Factors 2007;49(5):920–34. https://doi.org/10.1518/001872007X230262.
[70] Mayer RE. Multimedia learning. 3rd ed. Cambridge University Press; 2021.
[71] Parong J, Mayer RE. Cognitive and affective processes for learning science in immersive virtual reality. J Comput Assist Learn 2021;37(1):226–41. https://doi.org/10.1111/jcal.12482.
