
Computers & Education 176 (2022) 104350


The impact of video lecturers’ nonverbal communication on learning – An experiment on gestures and facial expressions of pedagogical agents
Sascha Schneider a, *, Felix Krieglstein a, **, Maik Beege b, Günter Daniel Rey a
a Psychology of Learning with Digital Media, Faculty of Humanities, Chemnitz University of Technology, Germany
b Digital Media in Education, Department of Psychology, University of Education, Freiburg, Germany

A R T I C L E  I N F O

Keywords:
Pedagogical agents
Learning with videos
Body movements
Facial expressions
Gestures
Cognitive load

A B S T R A C T

Body movements such as gestures and facial expressions are an essential part of human communication. While first results show that body movements of pedagogical agents in educational videos foster learning when they are learning-related, the impact of unrelated gestures and facial expressions is still not fully examined. This study investigated whether learning-unrelated gestures and facial expressions performed by an on-screen instructor were able to increase learning outcomes while considering differences in cognitive load and the perception of the agent. In a 2 (gestures; with vs. without) x 2 (facial expressions; with vs. without) between-subject design, data of 163 participants was collected. Results revealed that a pedagogical agent who performed both gestures and facial expressions led to better retention performances. In terms of transfer, participants performed better when they watched the agent performing gestures or facial expressions solely. In line with the computers-as-social-actors paradigm, gestures and facial expressions made the agent look more human-like. Gestures and facial expressions also led to higher perceptions of learning facilitation. In contrast to the hypotheses, implementing learning-irrelevant body movements did not cause extraneous processing. The results of this study are discussed with a focus on social processes while learning.

1. Introduction

Research has repeatedly shown pedagogical agents to be beneficial for learning (e.g., Castro-Alonso et al., 2021). In particular, when
such agents show body movements, such as gestures or facial expressions, learners’ performance increases (Baylor & Kim, 2009;
Davis, 2018). Research suggests that a pedagogical agent performing body movements triggers a social reaction in the learner (Krämer
& Bente, 2010). In consequence, the learner remains more focused on the learning content and makes more effort to understand the
learning content. However, it is still unclear whether a pure movement of the body without direct relation to the learning content can
also have an effect on learning. Accordingly, this experiment investigates the effects of non-informative gestures and facial expressions separately from each other.

1.1. Pedagogical agents in multimedia learning

Pedagogical agents are understood as on-screen characters or displayed humans guiding learners through a learning environment
(Heidig & Clarebout, 2011). By this, a pedagogical agent fulfills instructional purposes meaning that the agent supports the learner
while learning and navigating (Castro-Alonso et al., 2021; Martha & Santoso, 2019). This allows the agent to be understood as a mentor
who is both a motivator and a person with great expertise (cf. Baylor and PALS, 2003). Consequently, instructional designers create
and implement such on-screen agents for learning environments and educational purposes (Wang et al., 2018).
A meta-analysis by Schroeder et al. (2013) found a small effect of learning with pedagogical agents (g+ = 0.19) compared with
groups without such agents. A recent meta-analysis came to a similar result, whereby the learning-beneficial effect of pedagogical
agents was confirmed with an overall effect of g+ = 0.20 (Castro-Alonso et al., 2021). Moderator analyses indicated that pedagogical
agents are only effective for learning when the agent is female and uses a human voice while explaining the learning content. A 2D
agent was also associated with higher learning outcomes than a 3D agent. Performing gestures also moderated the effect of pedagogical
agents on learning. Moreover, a systematic review by Schroeder and Adesope (2014, p. 229) also indicated “that learners may prefer
pedagogical agents compared to non-agent control conditions”.
The positive effect of pedagogical agents is often explained, among other factors, by the computers-as-social-actors paradigm (CASA;
Nass et al., 1994). This paradigm suggests that subjects interact with computers in the same natural way as they would interact with
fellow human beings. In consequence, social responses to computers are carried out rather thoughtlessly and intuitively. This paradigm
is aligned with the communication science-based media equation theory claiming that people treat media and computers as if these
devices were real people or places (Reeves & Nass, 1996). Although from different scientific disciplines, both the CASA paradigm and
the media equation theory are thematically related (e.g., Gambino et al., 2020). However, a learning environment needs to contain
social cues in order to activate social processes. This can be achieved, for example, by using human-like pedagogical agents (Atkinson,
2002). In this context, researchers describe the persona effect (e.g., Ryu & Baylor, 2005; Woo, 2008) when a learner attributes
human-like characteristics to a digital agent. In the field of multimedia learning, the social agency theory builds on this premise and
assumes that pedagogical agents as social cues prime the feeling of being in a human-like conversation (Moreno et al., 2001). When
learners interpret the learning situation as a social one, they try to understand the agent’s statements with more effort leading to deeper
cognitive processing (e.g., Mayer, 2014b; Mayer et al., 2003).
Several experimental studies found evidence for the justification of such agents in multimedia learning settings (Mayer & DaPra,
2012; Wang et al., 2018, 2020). However, some previous studies also came to opposite conclusions (e.g., Bailenson et al., 2005;
Frechette & Moreno, 2010; Lin et al., 2020; Schroeder et al., 2017; Unal-Colak & Ozan, 2012) showing that the implementation of an
on-screen instructor did not lead to improved learning performance. In this vein, it is often argued that agents distract learners from
relevant information. Accordingly, learners no longer focus their full attention on the learning content and become distracted by the
presence of the agent, which interferes with learning (interference theory; Moreno et al., 2001).
An explanation of the negative influence of pedagogical agents is often based on the cognitive load theory (Sweller et al., 2019) which
assumes that processing information within human working memory is limited in terms of storage and duration. Therefore, instruc­
tional designers aim to avoid a cognitive overload in the learning process. In this vein, two additive types of cognitive load can be
distinguished (Sweller et al., 2019). First, intrinsic cognitive load (ICL, productive load) involves all cognitive processes needed to
master the task complexity (i.e., the degree to which information in an instructional message is interwoven). This load is highly
moderated by the learner’s domain-specific prior knowledge. Second, extraneous cognitive load (ECL, unproductive load) contains all
cognitive processes that result from suboptimal design or many information searching processes. However, in its origin, the cognitive
load theory proposed a third category – the germane cognitive load (GCL), which is defined as all learning-related activities such as
schema construction. Kalyuga (2011) argued that the GCL is indistinguishable from the ICL as both share the same theoretical
foundation. Thus, the GCL is essential for learning as it involves cognitive activities which are devoted to learning-relevant infor­
mation. Following Sweller (2010), effective learning environments should encourage learners to direct their cognitive resources to­
wards the achievement of the learning goals. Indeed, a recent confirmatory factor analysis found strong support for the three-factor
model including ICL, ECL, and GCL (Zavgorodniaia et al., 2020). Consequently, all three types should be included when examining
cognitive load effects.
Based on the cognitive load theory, instructional principles were derived to reduce unproductive and increase productive cognitive
processes (Sweller et al., 2019). In consequence, learning materials are more successful when a “less is more” approach is adopted
(Mayer, 2014a), meaning that distracting or even unnecessary elements should consequently be removed from the learning envi­
ronment in order to save cognitive resources for actual learning (Mayer & Moreno, 2003; Sundararajan & Adesope, 2020). Including
pedagogical agents into multimedia learning materials might thus cause extraneous cognitive load, as the learner needs to process
additional (learning-irrelevant) information (Davis, 2018). Hereby, the learner’s cognitive resources are required for processing the
aesthetic properties of the on-screen instructor (Clark & Choi, 2007). Besides this cognitive distraction caused by pedagogical agents,
their presence can also lead to affective distraction (Okonkwo & Vassileva, 2001). This negative effect may occur in particular when
the agent’s presence evokes an emotional response in the learner that is detrimental to learning. For instance, affective dissonances
interfere with learning when the learner’s expectations of the agent are not confirmed (Frechette & Moreno, 2010).


1.2. Influence of gestures performed by pedagogical agents on learning

Gestures, such as hand and arm movements, play an important role in human communication as well as in human-computer interaction
(Vuletic et al., 2019). Hereby, gestures and spoken words are often interwoven (McNeill, 1992; Kelly et al., 2008). With gestures, the
same message is transmitted via a second, image-based modality (Cook et al., 2012). Meta-analyses by Hostetter (2011) as well as
Dargue et al. (2019) confirmed that gestures combined with speech have a moderate, beneficial effect on comprehension. In this vein,
previous studies have shown that not only performing but also observing gestures have a learning-beneficial effect (Thompson, 1995;
for a short review, see Madan & Singhal, 2012). According to McNeill (1992), gestures can be classified into four different types: (1)
Iconic gestures refer to a pictorial representation whereby the spoken content and the simultaneously performed gestures are closely
connected. For instance, the speaker could show with his hands the shape of a specific figure. (2) In contrast, gestures can also be
non-pictorial. In this vein, beat gestures are defined as rhythmical movements performed by the hands (Loehr, 2012) which underline
the verbal information. (3) Gestures can also be used in order to direct the attention of the interlocutor. These deictic gestures denote
pointing movements towards visual information that is relevant in the current situation. (4) In a similar way to iconic gestures, metaphoric
gestures have a depictive character. However, they do not refer to a concrete or physically existing object but to a rather abstract
concept or idea.
In recent years, there has been an increased interest in investigating under which circumstances gestures performed by pedagogical
agents can improve learning. In this vein, a meta-analysis by Davis (2018) found that a pedagogical agent gesturing in multimedia
environments significantly increases the learning outcomes of transfer (g = 0.39) and retention (g = 0.28) compared to agents not
gesturing. However, the majority of positive effects stems from studies examining deictic gestures. These results align with the
assumption that pedagogical agents with human-like characteristics lead to better learning outcomes (Mayer, 2014b).
A comprehensive study by Mayer and DaPra (2012) examined whether social cues like gesturing, facial expressions, eye gaze, and
human-like movement affect learning with on-screen presented instructors. In their experimental series, learners were confronted with
narrated presentations in which an animated pedagogical agent was presented on the left side of successive slides. Hereby, the three
experiments examined if a high embodied agent who performed body movements was more learning-beneficial than a low embodied
agent without such movements. Across three experiments, students performed better in a transfer test when the on-screen instructor
showed gesturing, facial expressions, eye gaze, and human-like movements (high embodied) compared with a low embodied agent.
Experiments 1 and 2 found stronger social reactions to the high embodied on-screen agent, which is in harmony with the social agency
theory. However, when examining gestures in educational settings, it should be noted that not all types of gestures are equally effective
for learning. Besides the influence of different gesture types on learning effectiveness, the position of virtual humans’ gestures is of
particular importance. In line with the spatial contiguity principle, gestures, which are specifically targeting the to-be-learned content,
improve retention performances (Craig et al., 2015). In addition, the temporal contiguity effect can be also applied to virtual humans
performing gestures. A study by Twyford and Craig (2013) confirmed that the timing of a gesture is an essential factor when examining
their effectiveness. Following assumptions of the temporal contiguity principle, presenting gestures in temporal coincidence with the
learning content led to better learning performances. Across two experiments, Beege et al. (2020) showed that learners’ retention
performance was significantly affected by gestures. In line with the signaling principle (i.e., important information should be high­
lighted, Schneider et al., 2018) merely deictic gestures enhanced learning outcomes. Presenting an on-screen instructor performing
pointing gestures increased social presence and acted like an attention guide. In contrast, beat gestures and the absence of gestures did not
foster learning.
In addition to observing a pedagogical agent performing gestures, it is also possible to instruct learners to perform gestures while
learning. Hereby, it is assumed that physical movements while learning support information processing (embodied cognition; e.g.,
Wilson, 2002). In this vein, a study by Zhang et al. (2021) demonstrated that performing content-matching hand movements (hands
should be moved in line with the content) while watching a complex learning video is associated with learning benefits.

1.3. Influence of pedagogical agents’ facial expressions on learning

Similar to gestures, facial expressions are important in interpersonal communication (Motley & Camden, 1988). Such nonverbal
behaviors are signals for certain emotions (Carroll & Russell, 1996; Johnson, Dziurawiec, Ellis, & Morton, 1991). Messinger (2002)
argues that facial expressions may communicate both positive and negative emotions. Emotions also play an important role in learning
when an instructor uses certain facial expressions. In this vein, it is assumed that emotions also have a significant impact on the
learner’s cognitive engagement (Leutner, 2014; Moreno & Mayer, 2007). Facial expressions performed by a pedagogical agent make
the agent appear more human; thus, the persona effect is favored and social interaction is facilitated (Ryu & Yu, 2013).
As shown by Baylor and Kim (2009), an animated pedagogical agent performing facial expressions leads to higher learning out­
comes. Hereby, five different facial expressions were used: neutral, serious, happy, surprised, and sad. These emotions were oriented toward
the respective learning content. The authors conclude that nonverbal communication crucially determines learning-related outcomes.
Interestingly, facial expressions facilitated learning only when gestures were absent. The presence of facial expressions also positively affected the
perception of the agent: Learners assessed the on-screen instructor as more human-like or credible when it showed facial expressions. In
contrast, a study by Frechette and Moreno (2010) showed that a static agent leads to better comprehension scores than an instructor
performing facial expressions. Hereby, the agent performed numerous facial expressions such as lip movements, which were syn­
chronized with the spoken narration. These should serve to convey content-associated emotions. The authors assume that a partly
animated pedagogical agent with facial expressions is perceived as unnatural. This may have had a distracting effect on the learner. In
order to eliminate this disturbing factor as best as possible, special emphasis was placed in this study on making the pedagogical agent


appear as natural as possible. Providing a fully human agent aims to prevent uncanny valley perceptions (Mori et al., 1970/2012).
According to this assumption, an almost but not perfectly human-like presentation can lead to feelings of eeriness and discomfort,
probably decreasing learning.

2. The present experiment

In general, pedagogical agents provide a social connection that fosters social relationships, which in turn is expected to enhance
learner’s social beliefs and learning outcomes (Castro-Alonso et al., 2021; Kim & Baylor, 2007). The on-screen instructor used in this
study is intended to serve the same function. Hereby, the aim is to examine to what extent even non-specific and also
learning-irrelevant gestures and facial expressions are sufficient to promote learning. Since several previous studies have proven the
learning-beneficial effect of pedagogical agents’ learning-related gestures and facial expressions (e.g., Mayer & DaPra, 2012; Wang
et al., 2018), it was hypothesized that a similar result will occur in this study for learning-irrelevant gestures and facial expressions:
H1a: Learners receiving instructions from an on-screen instructor with gesturing perform better in the learning tests than learners
who are receiving instructions from an on-screen instructor without gesturing.
H1b: Learners receiving instructions from an on-screen instructor performing facial expressions perform better in the learning tests
than learners receiving instructions from an on-screen instructor performing no facial expressions.
Since pedagogical agents function as a mentor while learning (Baylor and Ryu, 2003), it is assumed that the inclusion of gestures
and facial expressions is perceived as learning-enhancing:
H2a: Learners who are shown the on-screen instructor performing gestures perceive the agent as more learning-facilitating than
learners who are shown the on-screen instructor without gestures.
H2b: Learners who are shown the on-screen instructor performing facial expressions perceive the agent as more learning-facilitating
than learners who are shown the on-screen instructor without facial expressions.
In line with the social agency theory (Moreno et al., 2001), it is assumed that a pedagogical agent performing gestures and facial
expressions (high embodied agent; Mayer, 2014b) while explaining the learning content causes higher ratings of human-likeness. By
this, these life-like features trigger a social learning situation (Kim & Baylor, 2006):
H3a: Learners receiving gesturing on-screen instructors perceive the agent as more human-like than learners who are shown the on-
screen instructor without gestures.
H3b: Learners receiving on-screen instructors performing facial expressions perceive the agent as more human-like than learners
who are shown the on-screen instructor without facial expressions.
Concerning the cognitive load theory (CLT), implementing an on-screen instructor is associated with an increase in extraneous cognitive
load as additional information requires processing (Clark & Choi, 2007). Therefore, the following hypotheses were formulated:
H4a: Learners receiving gesturing on-screen instructors perceive a higher extraneous cognitive load than learners who are shown
the on-screen instructor without gestures.
H4b: Learners receiving on-screen instructors performing facial expressions perceive a higher extraneous cognitive load than
learners who are shown the on-screen instructor without facial expressions.
Until now, there is only little evidence on the intertwining influence of gestures and facial expressions on learners’ perceptions and
learning performance (e.g., Frechette & Moreno, 2010; Ryu & Yu, 2013). Based on the CASA paradigm, showing both gestures and
facial expressions should increase the human-likeness of the pedagogical agent and thus, further increase learning. However, a clear
basis for a directional hypothesis was missing regarding the influence on human-likeness, cognitive load, and learning. For this, the
present study examined this supposed interaction in an exploratory manner:
H5: There are interaction effects of gestures and facial expressions performed by pedagogical agents in terms of human-likeness,
cognitive load, and learning.
Since the effects of gestures on learning were found to be influenced by prior knowledge (e.g., Congdon et al., 2018) or
learners’ emotional states (Shan et al., 2007, pp. 1–10), these variables were used as covariates in the experiment.

3. Method

3.1. Experimental design and participants

In this experiment, a two (gestures; with vs. without) × two (facial expressions; with vs. without) between-
subjects factorial design was used to test the influence and interaction of gestures and facial expressions. An a-priori power anal­
ysis was conducted for a two-factorial between-subject design with two factor levels each, a moderate effect size of ηp2 = 0.06, a test
power of 1 - β = 0.80, and an error probability of α = 0.05. This analysis revealed a minimum number of participants of N = 125.
Overall, 163 university students (83.4% female; age: M = 23.09; SD = 5.82) from Technische Universität Chemnitz, who received
either 1-h course credit or a financial allowance, were recruited for this experiment. Students were enrolled in media studies (47.9%),


pedagogy (15.3%), psychology (14.1%), linguistics (16%), and other fields of study (6.7%). Accordingly, 62% of the participants were
bachelor students, 24.5% master students and 13.5% were about to take their state exam. Mean prior knowledge (further described in
Measures) was 1.38 (SD = 1.36) out of 16 points, which can be seen as low prior knowledge. Each student was randomly assigned to
one experimental group. Forty participants were allocated to the condition without gestures and without facial expressions, while the
remaining participants were evenly distributed (N = 41 in each condition) among the three other conditions.
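
The reported a-priori power analysis can be approximated as follows. This is a minimal sketch (the authors’ tool and exact settings are not stated), assuming an F test for a single main effect with one numerator degree of freedom in the 2 × 2 between-subjects design and converting ηp2 = 0.06 into Cohen’s f².

```python
# A minimal sketch (assumptions noted above) of how the reported a-priori power
# analysis could be approximated with scipy: an F test for one main effect
# (1 numerator df) in a 2 x 2 between-subjects ANOVA.
from scipy.stats import f as f_dist, ncf

eta_p2 = 0.06                        # assumed moderate effect size (partial eta squared)
f2 = eta_p2 / (1 - eta_p2)           # conversion to Cohen's f squared

def power_for_n(n_total, alpha=0.05, df_num=1, n_groups=4):
    """Power of the F test for one main effect at a given total sample size."""
    df_den = n_total - n_groups      # error degrees of freedom
    ncp = f2 * n_total               # noncentrality parameter of the F distribution
    f_crit = f_dist.ppf(1 - alpha, df_num, df_den)
    return 1 - ncf.cdf(f_crit, df_num, df_den, ncp)

# Search for the smallest total N that reaches 80% power.
n = 8
while power_for_n(n) < 0.80:
    n += 1
print(n, round(power_for_n(n), 3))   # roughly N = 125, close to the value reported above
```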

3.2. Materials

For this experiment, six learning videos were created. The learning topic covered in these videos dealt with the characteristics,
functioning, and important features of geysers based on scientific texts (e.g., Rinehart, 1980). At the end of each video, participants
could independently click on a button to move to the next video. The learning videos ranged in length from about 30 s to just under 2
min (mean length = 87 s). Hereby, the length of the videos was based on recommendations that videos should not exceed 6 min
(Davis, 2018; Guo et al., 2014). Within the videos, a female on-screen instructor was implemented, while the learning content was
presented via an audio message. Before the videos were recorded, the human agent received concrete instructions from the authors of this
study. For this, the authors prepared a list of body movements and facial expressions, which should be performed in the respective
condition. Then, the human agent was trained to perform them as flawlessly as possible before the final videos were recorded.
Instructional illustrations were placed next to the pedagogical agent (for an overview, see Fig. 1). In order to be able to compare the
four treatment groups, the material to be learned and the design of the learning environment were identical. Only the animation of the
instructor differed according to the experimental conditions. In the condition without gestures/without facial expressions, the on-
screen instructor stood statically. In the condition with gestures/without facial expressions, the agent’s face remained almost
motionless the entire time. Gestures, on the other hand, were present, although task-related gestures such as pointing and underlining
were not used. The metaphoric gestures consisted, for example, of bracing the arm, sweeping the hair back, or extending the arm again.
These movements were performed in arbitrary order. In the condition without gestures/with facial expressions, the agent stood
without any noticeable movements. The facial expressions consisted of slight smiles, brief frowns, and small lip movements. In the
condition with gestures/with facial expressions, the instructor performed both gestures and facial expressions simultaneously. The
learning time was limited to 1800 s. However, this limit was not visible to the participants and was not communicated to them before
the experiment. The average learning time was 460.88 s (SD = 16.77s).

3.3. Measures

For all measurements, Cronbach’s alpha or McDonald’s omega were calculated in order to ensure reliability. Hereby, both co­
efficients can be interpreted in the same way (Gliem & Gliem, 2003).
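
For illustration, the following is a minimal sketch of how Cronbach’s alpha could be computed for any of the scales reported below; the item matrix and column names are hypothetical and not taken from the study’s data.

```python
# A small sketch (not the authors' code) of how Cronbach's alpha can be computed
# from an item-by-participant matrix; the data and column names are hypothetical.
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """items: rows = participants, columns = items belonging to one scale."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()   # sum of single-item variances
    total_variance = items.sum(axis=1).var(ddof=1)     # variance of the sum scores
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Example with an invented three-item subscale rated by four participants.
scale = pd.DataFrame({"item_1": [2, 5, 4, 3],
                      "item_2": [3, 6, 4, 2],
                      "item_3": [2, 5, 5, 3]})
print(round(cronbach_alpha(scale), 2))
```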

Fig. 1. Screenshots of the learning videos by experimental conditions.


Fig. 2. Mean retention test scores and corresponding standard errors by experimental group. Note. Retention test score ranged from 0 to 40.

Manipulation check. Before analyzing the data, a manipulation check was conducted in order to ensure that the manipulation within
the independent variables was effective (e.g., Hauser et al., 2018). Therefore, two items were created in which the participants were
asked how noticeable the gestures or facial expressions were. The answer possibilities ranged from 1 (“very noticeable”) to 5 (“not
noticeable at all”).
Prior knowledge. Since the domain-specific knowledge of the learners has a considerable influence on cognitive load perceptions and
learners’ performance (Chen et al., 2017), prior knowledge was measured with three open-format questions (“What is a geyser?”,
“Please briefly describe how geysers work!”, and “Where do geysers occur?”; ω = 0.70). For each question, a list with correct answers was
prepared. The participants’ answers were scored independently by two raters. Inter-rater reliability scores can be classified as strong
to almost perfect (McHugh, 2012; question 1: κ = 0.83, question 2: κ = 0.97, & question 3: κ = 0.80).
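
As an illustration of this inter-rater check, the sketch below computes Cohen’s kappa with scikit-learn; the two rating vectors are invented and do not reproduce the study’s data.

```python
# Sketch of the inter-rater agreement check with Cohen's kappa; the rating
# vectors below are invented and only illustrate the scikit-learn call.
from sklearn.metrics import cohen_kappa_score

rater_1 = [1, 0, 1, 1, 0, 1, 0, 0]   # e.g., answer element credited (1) or not (0)
rater_2 = [1, 0, 1, 0, 0, 1, 0, 0]

print(round(cohen_kappa_score(rater_1, rater_2), 2))
```
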
Emotional states. As the pedagogical agent might convey certain emotions by performing facial expressions and gestures, the
participants’ emotional states were measured. In order to record possible changes, the students were asked to fill in the two bipolar
valence items from the valence, positive affect, and negative affect (PANAVA) short scales (Schallberger, 2005) before and after viewing
the learning videos. On a seven-point scale ranging from “unhappy” to “happy” and “dissatisfied” to “satisfied”, the students were
asked to indicate how they were currently feeling (α = 0.84).
Learning performances. To assess participants’ learning performance, two tests (retention and transfer) were conducted. For
retention, which can be defined as remembering (Mayer, 2014a), ten multiple-choice questions were created (ω = 0.73). The tasks
served to assess whether the participants were able to remember contents explicitly mentioned in the videos. For each question, four
predefined answer options could be marked as correct. Either one, two, three, or all four items were correct. Each correctly marked
answer option and each correctly unmarked answer option (for false options) was awarded with one point. For example, the question
“Which countries are known for their geysers?” was provided including the answers options: “a) New Zealand”, “b) France”, “c)
Iceland”, and “d) USA”. Overall, for the retention test, 40 points could be maximally achieved by the participants. For the transfer test,
in which the students had to show that they could apply the acquired knowledge, ten multiple-choice questions were formulated (ω =
0.70). At least one of the four answer options was always correct. For example, the question “How can an earthquake cause a geyser
that is actually extinct to reawaken?” was displayed together with the answers: “a) An earthquake can loosen possible blockages in the
eruption channel.“, “b) Earthquakes can only cause geysers to go out, not to reawaken.“, “c) Shifts in the earth’s crust can cause altered
water flow paths.“, and “d) Shifts in the pressure conditions in the subsurface can lead to altered water conduction paths.” In
conclusion, students could reach a maximum of 40 points for the transfer test.
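
The scoring rule described above can be illustrated with a small helper function; the option coding and the example response are hypothetical and not the authors’ scoring script.

```python
# Sketch of the scoring rule described above (example data are invented): every
# answer option that is correctly marked or correctly left unmarked earns one
# point, so ten questions with four options each yield a maximum of 40 points.
def score_multiple_choice(selected, key, n_options=4):
    """selected/key: one set of marked/correct option numbers per question."""
    points = 0
    for chosen, correct in zip(selected, key):
        for option in range(1, n_options + 1):
            # A point is earned whenever the learner's decision matches the key.
            if (option in chosen) == (option in correct):
                points += 1
    return points

# One example question: options 1 and 3 are correct, the learner marked 1 and 4,
# so two of the four decisions (option 1 marked, option 2 unmarked) are correct.
print(score_multiple_choice([{1, 4}], [{1, 3}]))   # prints 2
```
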
Cognitive load. Learner’s perceived cognitive load while learning was assessed with the questionnaire from Leppink et al. (2013). In
detail, the subscales intrinsic cognitive load (ICL, three items, α = 0.85, e.g., “The activity covered concepts and definitions that I
perceived as very complex”), extraneous cognitive load (ECL, three items, α = 0.80, e.g., “The instructions and/or explanations during
the activity were very unclear”), and germane cognitive load (GCL, four items, α = 0.89, e.g., “The activity really enhanced my un­
derstanding of concepts and definitions”) were used. These items were rated by the students on a ten-point scale from (1) “not at all
applicable” to (10) “fully applicable”.
Perception of the video-instructor. In order to measure learners’ perceptions regarding the on-screen instructor, the agent persona
instrument from Ryu and Baylor (2005) was used. In detail, the two dimensions learning facilitation (ten items, e.g., “The agent made
the instruction interesting”, α = 0.91) and human-likeness (five items, e.g., “The agent’s emotion was natural” α = 0.74) were used for
this study. Participants had to rate these items on 5-point Likert scales ranging from (1) “strongly disagree” to (5) “strongly agree”.

3.4. Procedure

The study took place in a computer room with four workstations over three weeks. Participants were randomly assigned to one of
the four conditions by independently selecting a computer workstation, each of which had a small slip of paper prepared with the
respective condition and subject number. In the beginning, all participants were welcomed and briefly introduced to the study


procedure. Then, participants were instructed to put on the provided headphones while working within the learning environment.
After that, the participants could start with the experiment. Hereby, they independently followed the instructions on the screens.
The study consisted of three different sections, each connected via links. In the first block, learners’ prior knowledge, demographic
information, and current emotional states were requested. Following this, the learning environment started, which differed
depending on the experimental condition. During the final section, the remaining dependent variables were gathered in the following
order: (a) manipulation check; (b) valence, (c) assessments of the pedagogical agent, (d) cognitive load; and (e) knowledge tasks.

4. Results

In the analysis of data, multivariate analyses of covariance (MANCOVAs) and univariate analyses of covariance (ANCOVAs) were
conducted to assess differences between groups. For all variance analyses, the group variables gestures (with vs. without) and facial
expressions (with vs. without) were used as independent variables. For all analyses, prior knowledge was included as a covariate since it
differed significantly between the levels of the independent variables gestures (F(1, 159) = 8.90, p = .003, ηp2 = 0.05) and facial
expressions (F(1, 159) = 3.96, p = .048, ηp2 = 0.02). Also, the difference between the two emotional states before and after watching the learning videos was
used as a covariate for the data analysis since the on-screen instructor might induce several emotions with gestures and facial ex­
pressions but not without these body movements (Sato et al., 2019; Tonguç & Ozkara, 2020). However, only significant influences of
this covariate were reported. Furthermore, χ2-tests revealed no differences between the four treatment groups with regard to gender (p
= .10), subject of study (p = .36), and degree pursued (p = .26). Descriptive results of all dependent measures according to the experimental
groups are displayed in Table 1. Effect sizes were only computed if significant effects occurred.
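
As an illustration of this analysis strategy, the sketch below shows how such a MANCOVA with follow-up ANCOVAs (including partial eta squared) could be run in statsmodels; the data frame and its column names are assumptions for illustration, not the authors’ original analysis code.

```python
# A sketch (not the original analysis script) of the reported strategy using
# statsmodels; the file name and the column names retention, transfer, gestures,
# facial, prior_knowledge, and emo_diff are assumptions for illustration.
import pandas as pd
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm
from statsmodels.multivariate.manova import MANOVA

df = pd.read_csv("experiment_data.csv")  # hypothetical data file

# MANCOVA on both learning outcomes with the two factors and the covariates.
mancova = MANOVA.from_formula(
    "retention + transfer ~ C(gestures) * C(facial) + prior_knowledge + emo_diff",
    data=df,
)
print(mancova.mv_test())

# Follow-up ANCOVA for a single outcome with Type II sums of squares.
model = ols("retention ~ C(gestures) * C(facial) + prior_knowledge + emo_diff",
            data=df).fit()
anova = anova_lm(model, typ=2)
# Partial eta squared for each effect: SS_effect / (SS_effect + SS_residual).
anova["eta_p2"] = anova["sum_sq"] / (anova["sum_sq"] + anova.loc["Residual", "sum_sq"])
print(anova)
```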

4.1. Manipulation check

Significant main effects were found in a MANOVA with both manipulation check items, for gestures, Wilk’s Λ = 0.73, F(2, 156) =
29.24, p < .001, ηp2 = 0.27, and facial expressions, Wilk’s Λ = 0.79, F(2, 156) = 19.70, p < .001, ηp2 = 0.20. The interaction did not
reach significance (p = .717). Follow-up ANCOVAs revealed that the group with a lecturer performing gestures perceived the
gestures as more noticeable, F(1, 157) = 50.88, p < .001, ηp2 = 0.25. The effect for facial expressions was not significant (p = .438).
Furthermore, the facial expressions group rated the facial expressions as more noticeable than the group that had not seen them, F(1,
157) = 28.89, p < .001, ηp2 = 0.15. No effect was found regarding gestures (p = .432). Based on these results, the manipulation of the
independent variables can be assessed as successful.

4.2. Learning outcomes

In order to check for learning differences, a MANCOVA was conducted with the two dependent variables retention and transfer.
Hereby, significant main effects were found for gestures, Wilk’s Λ = 0.89, F(2, 156) = 9.31, p < .001, ηp2 = 0.11, and facial expressions,
Wilk’s Λ = 0.88, F(2, 156) = 10.37, p < .001, ηp2 = 0.12. The interaction failed to reach significance (p = .088). The covariate prior
knowledge was significant, Wilk’s Λ = 0.95, F(2, 156) = 4.34, p = .015, ηp2 = 0.05. The MANCOVA was followed up with two
ANCOVAs. Regarding retention, it could be shown that the video-instructor performing gestures led to better learning scores than the
video-instructor without gestures, F(1, 157) = 16.91, p < .001, ηp2 = 0.10. In a similar way, a video-lecturer performing facial ex­
pressions resulted in higher retention scores, F(1, 157) = 17.98, p < .001, ηp2 = 0.10. In terms of transfer, significant main effects were
found for gestures, F(1, 157) = 9.50, p = .002, ηp2 = 0.06, and facial expressions, F(1, 157) = 11.90, p = .001, ηp2 = 0.07.

Table 1
Mean scores of all dependent variables and possible covariates as well as corresponding standard deviations by experimental groups. Cells report M (SD); columns from left to right: with gestures/with facial expressions (N = 41), with gestures/without facial expressions (N = 41), without gestures/with facial expressions (N = 41), and without gestures/without facial expressions (N = 40).

Prior knowledge                          1.20 (1.67)    0.95 (1.09)    1.98 (1.75)    1.40 (1.13)
Retention                               31.27 (4.12)   26.78 (4.50)   27.42 (4.32)   25.63 (5.03)
Transfer                                27.85 (5.51)   25.07 (4.39)   25.85 (4.19)   23.33 (4.13)
Intrinsic cognitive load                 5.31 (2.37)    5.98 (2.04)    5.86 (2.01)    4.75 (2.29)
Extraneous cognitive load                4.35 (1.91)    4.37 (1.61)    4.04 (1.47)    3.75 (1.57)
Germane cognitive load                   5.45 (2.38)    6.00 (2.15)    5.87 (2.27)    5.98 (2.54)
Emotional state (before)                 4.16 (1.42)    4.55 (1.20)    4.61 (1.21)    4.65 (1.11)
Emotional state (after)                  4.56 (1.44)    4.76 (1.15)    5.10 (1.17)    4.89 (1.20)
Agent persona: learning facilitation     2.52 (0.90)    1.79 (0.80)    1.74 (0.56)    1.79 (0.74)
Agent persona: human likeness            3.68 (0.82)    3.58 (0.64)    3.48 (0.82)    2.86 (0.87)

Note. For prior knowledge, a maximum of 16 points could be achieved. Scores for retention and transfer ranged from 0 to 40. The scales of intrinsic cognitive load, extraneous cognitive load and germane cognitive load ranged from 1 to 10. Scores for the agent persona sub-scales learning facilitation and human likeness ranged from 1 to 5.


Participants were better in the transfer test when the instructor performed either gestures or facial expressions. Consequently,
hypotheses 1a and 1b can be confirmed (see Fig. 3).

4.3. Perception of the video-instructor

For the two dimensions of human-likeness and learning facilitation, a MANCOVA found significant main effects for gestures, Wilk’s
Λ = 0.89, F(2, 156) = 9.18, p < .001, ηp2 = 0.11, and facial expressions, Wilk’s Λ = 0.89, F(2, 156) = 10.14, p < .001, ηp2 = 0.12.
Moreover, the interaction of both factors reached significance, Wilk’s Λ = 0.91, F(2, 156) = 7.62, p = .001, ηp2 = 0.09. The covariate
prior knowledge was also found to be significant, Wilk’s Λ = 0.96, F(2, 156) = 3.59, p = .030, ηp2 = 0.04. The covariate emotional
states difference was not significant (p = .86). Follow-up ANCOVAs revealed that the gesturing instructor was perceived as
more learning-facilitating compared with the instructor not performing gestures, F(1, 157) = 7.53, p = .007, ηp2 = 0.05 (see Fig. 4).
The same effect occurred when the on-screen instructor performed facial expressions while explaining the learning content, F(1, 157)
= 11.37, p = .001, ηp2 = 0.07. Thus, hypotheses 2a and 2b can be confirmed. In addition, the significant interaction showed that the
instructor was perceived as most learning-facilitating when the agent performed both gestures and facial expressions, F(1, 158) =
10.58, p = .001, ηp2 = 0.06. In terms of human-likeness, follow-up ANCOVAs found on the one hand that the instructor was perceived
as more human-like when gestures were performed, F(1, 157) = 11.18, p = .001, ηp2 = 0.07. On the other hand, facial expressions
performed by the instructor led to a higher rating of the human-likeness, F(1, 157) = 9.29, p = .003, ηp2 = 0.06. Hypotheses 3a and 3b
can be confirmed as well. The significant interaction indicated that the instructor was perceived as most human-like when gestures and
facial expressions were performed simultaneously, F(1, 157) = 4.58, p = .034, ηp2 = 0.03 (see Fig. 5).

4.4. Cognitive load

A MANCOVA was conducted with the three cognitive load types: ECL, ICL, and GCL. Only a significant interaction between the two
main factors gestures and facial expressions was revealed, Wilk’s Λ = 0.95, F(3, 155) = 2.89, p = .037, ηp2 = 0.05. The main effects for
gestures (p = .317) and facial expressions (p = .939) were not significant. However, the covariate emotional states reached signifi­
cance, Wilk’s Λ = 0.92, F(3, 155) = 4.61, p = .004, ηp2 = 0.08. As a result, hypotheses 4a and 4b must be rejected. Follow-up ANCOVAs
found a significant interaction only for the ICL, F(1, 157) = 6.77, p = .010, ηp2 = 0.04. Accordingly, participants reported the
highest ICL score when they learned with the video-instructor performing gestures, but no facial expressions. The interaction effects for
ECL (p = .586) and GCL (p = .480) did not reach significance.

5. Discussion

The goal of this study was to gain deeper insights into whether learning-irrelevant gestures and facial expressions can facilitate
learning processes. The major contribution of this study is that implementing gestures and facial expressions performed by a peda­
gogical agent contributes to successful learning. In line with previous meta-analyses (e.g., Castro-Alonso et al., 2021; Davis, 2018),
implementing gestures into the learning environment is associated with learning benefits regarding retention and transfer. In contrast
to Frechette and Moreno (2010), an on-screen instructor performing facial expressions also led to better retention and transfer scores.
It is interesting to note that only for retention did the combined presentation of gestures and facial expressions lead to better performance. In
terms of transfer, each type of body movement contributed to better results on its own.
Explanations for this might be found in the results on the perceptions of the pedagogical agent and the learning material. Although
both performing gestures and performing facial expressions led to an increased perception of learning facilitation and human-likeness,
an agent performing both body movements was additionally perceived as more human-like, as indicated by the interaction effect. This is in
line with the CASA paradigm, supporting the idea that a greater approximation of digitally presented characters to human behaviors

Fig. 3. Mean transfer test scores and corresponding standard errors by experimental group. Note. Transfer test scores ranged from 0 to 40.


Fig. 4. Mean perception scores of learning facilitation and corresponding standard errors by experimental group. Note. The scale ranged from 1 to 5.

Fig. 5. Mean perception scores of human likeness and corresponding standard errors by experimental group. Note. The scale ranged from 1 to 5.

also leads to an increased triggering of social processes, which in turn was reported as an increased human likeness by the learners
(Nass et al., 1994). This may also have resulted in learners being less distracted by a non-human agent (i.e., an agent that performs no
or only partial body movements) during the presentation and therefore focusing more on the facts of the learning content. A reason for
not finding an interaction in terms of transfer might be found in the instruction given to the learners before learning. By telling learners
to memorize as much information as possible from the videos, learners were instructed to focus on retention knowledge rather than
understanding and applying learning content. Nonetheless, both gestures and facial expressions helped to increase transfer knowledge
independently indicating that such body movements can also have an impact on the application of knowledge. This could hint towards
a kind of embodiment effect (i.e., seeing others perform body movements helps internalize procedures, Mayer & DaPra, 2012) when
looking at such body movements.
Interestingly and counter-intuitively, no significant differences were found for the perceptions of extraneous cognitive load. Gestures
and facial expressions did not act as distracting elements that drew learner’s attention away from the actual learning content.
However, gestures were found to elicit more ICL, probably because learners assessed them as adding to the perceived task complexity. A probable
explanation for this result might lie in the assumption that humans attribute learning-relevant information much more to gestures
than to facial expressions. Including gestures might therefore be perceived as an additional information resource that is reflected
in increased ICL perceptions. The assumption by Clark and Choi (2007) and Davis (2018) that agents’ body movements claim
significantly more cognitive resources for processing can therefore only be partially accepted, since no extra ECL processes were found
and only extra ICL processes for gestures were shown. Also, a possible explanation is that rather simple body movements are
biologically anchored and therefore do not cause increases in cognitive load (Kirschner et al., 2018). From an evolutionary educational
perspective (Geary & Berch, 2016), gestures and facial expressions might be framed as biologically primary knowledge that is pro­
cessed automatically and without mental effort. The interpretation of gestures without a learning-relevant relation, however, might
induce a cognitive conflict that needs to be solved leading to higher ICL judgements.

5.1. Implications

From a theoretical perspective, the results of this study are in line with previous literature and underline the relevance of animated


pedagogical agents in multimedia learning. In unison with assumptions from human-to-human communication (e.g., Motley &
Camden, 1988; Vuletic et al., 2019), the CASA paradigm (Nass et al., 1994), and the embodiment effect (Mayer & DaPra, 2012), body
movements are useful supplements to increase understanding when learning information is transmitted by media.
The results of this study indicate that learners can reliably assess whether a pedagogical agent behaves naturally. Interestingly,
even learning-irrelevant body movements can improve learning and do not increase the perceived cognitive load. If an instructional
designer decides to animate such an agent for educational purposes, a rather holistic approach should be taken. Thus, in terms of
facilitating learning, both gestures and facial expressions should be used.

5.2. Limitations and future directions

Despite the interesting results of this study, several limitations might narrow the generalizability. For example, this study was only
able to show the short-term effects of an on-screen instructor. Although moderate effects were found, this does not mean that they
persist over a longer period of time. Moreover, the rather short duration of the educational videos might itself have increased learning. The
question of whether the found effects are also transferable to longer videos remains open. Future studies should use longitudinal experiments to make
even more accurate statements about the effectiveness of pedagogical agents in learning videos over time.
Furthermore, this experimental study only investigated a very coarse gradation of each factor (i.e., absence vs. presence), which
partially limits the informative value. Future studies should add several types of learning-irrelevant gestures (cf. Beege et al., 2020) in
order to examine how they interact with facial expressions when a virtually presented human acts as an instructor.
This study relied on a female pedagogical agent. However, body movements performed by female and male pedagogical
agents might differ in their effectiveness or the effectiveness of gestures and facial expressions might be moderated by both the sex of
the learner and the sex of the pedagogical agent (i.e., the gender matching effect, Arroyo et al., 2013). Further studies could also
measure additional variables, such as situational interest or engagement while learning in order to explain found effects (e.g., Liew
et al., 2017; Park, 2015). In research, motivation is often used as a mediating variable (e.g., Leutner, 2014) to examine the causal chain
between instructional interventions and learning achievement.

6. Conclusion

The current study showed how body movements performed by a pedagogical agent affect learning outcomes and the perception of
the agent. Gestures and facial expressions are, according to the results of this study, a simple and intuitive way to make the learning
situation with an on-screen instructor be perceived as more human-like and learning-facilitating. Accordingly, an agent performing
learning-irrelevant gestures and facial expressions is associated with better learning outcomes without increasing extraneous cognitive
load.

Credit author statement

Sascha Schneider: Project administration, Conceptualization, Methodology, Software, Validation, Formal Analysis, Investigation,
Writing. Felix Krieglstein: Data curation, Writing. Maik Beege: Visualization, Resources. Günter Daniel Rey: Supervision.

References

Arroyo, I., Burleson, W., Tai, M., Muldner, K., & Woolf, B. P. (2013). Gender differences in the use and benefit of advanced learning technologies for mathematics.
Journal of Educational Psychology, 105, 957–969. https://doi.org/10.1037/a0032748
Atkinson, R. K. (2002). Optimizing learning from examples using animated pedagogical agents. Journal of Educational Psychology, 94, 412–427. https://doi.org/
10.1037/0022-0663.94.2.416
Bailenson, J. N., Swinth, K., Hoyt, C., Persky, S., Dimov, A., & Blascovich, J. (2005). The independent and interactive effects of embodied-agent appearance and
behavior on self-report, cognitive, and behavioral markers of copresence in immersive virtual environments. Presence: Teleoperators and Virtual Environments, 14,
379–393. https://doi.org/10.1162/105474605774785235
Baylor, A., & Kim, S. (2009). Designing nonverbal communication for pedagogical agents: When less is more. Computers in Human Behavior, 25, 450–457. https://doi.
org/10.1016/j.chb.2008.10.008
Baylor, A. L., & PALS. (2003). The impact of three pedagogical agent roles. In Proceedings of the second international joint conference on Autonomous agents and multiagent
systems (pp. 928–929). https://doi.org/10.1145/860575.860729
Baylor, A. L., & Ryu, J. (2003). The effects of image and animation in enhancing pedagogical agent persona. Journal of Educational Computing Research, 28, 373–394.
https://doi.org/10.2190/2FV0WQ-NWGN-JB54-FAT4
Beege, M., Ninaus, M., Schneider, S., Nebel, S., Schlemmel, J., Weidenmüller, J., Moeller, K., & Rey, G. D. (2020). Investigating the effects of beat and deictic gestures
of a lecturer in educational videos. Computers & Education, 156, 103955. https://doi.org/10.1016/j.compedu.2020.103955
Carroll, J. M., & Russell, J. A. (1996). Do facial expressions signal specific emotions? Judging emotion from the face in context. Journal of Personality and Social
Psychology, 70, 205–218. https://doi.org/10.1037/0022-3514.70.2.205
Castro-Alonso, J. C., Wong, R. M., Adesope, O. O., & Paas, F. (2021). Effectiveness of multimedia pedagogical agents predicted by diverse theories: A meta-analysis.
Educational Psychology Review. https://doi.org/10.1007/s10648-020-09587-1
Chen, O., Kalyuga, S., & Sweller, J. (2017). The expertise reversal effect is a variant of the more general element interactivity effect. Educational Psychology Review, 29,
393–405. https://doi.org/10.1007/10648-016-9359-1
Clark, R. E., & Choi, S. (2007). The questionable benefits of pedagogical agents: Response to Veletsianos. Journal of Educational Computing Research, 36, 379–381.
https://doi.org/10.2190/2781-3471-67MG-5033
Congdon, E. L., Kwon, M. K., & Levine, S. C. (2018). Learning to measure through action and gesture: Children’s prior knowledge matters. Cognition, 180, 182–190.
https://doi.org/10.1016/j.cognition.2018.07.002
Cook, S. W., Yip, T. K., & Goldin-Meadow, S. (2012). Gestures, but not meaningless movements, lighten working memory load when explaining math. Language &
Cognitive Processes, 27, 594–610. https://doi.org/10.1080/01690965.2011.567074


Craig, S. D., Twyford, J., Irigoyen, N., & Zipp, S. A. (2015). A test of spatial contiguity for virtual human’s gestures in multimedia learning environments. Journal of
Educational Computing Research, 53, 3–14. https://doi.org/10.1177/0735633115585927
Dargue, N., Sweller, N., & Jones, M. P. (2019). When our hands help us understand: A meta-analysis into the effects of gesture on comprehension. Psychological
Bulletin, 145, 765–784. https://doi.org/10.1037/bul0000202
Davis, R. O. (2018). The impact of pedagogical agent gesturing in multimedia learning environments: A meta-analysis. Educational Research Review, 24, 193–209.
https://doi.org/10.1016/j.edurev.2018.05.002
Frechette, C., & Moreno, R. (2010). The roles of animated pedagogical agents’ presence and nonverbal communication in multimedia learning environments. Journal
of Media Psychology: Theories, Methods, and Applications, 22, 61–72. https://doi.org/10.1027/1864-1105/a000009
Gambino, A., Fox, J., & Ratan, R. A. (2020). Building a stronger CASA: Extending the computers are social actors paradigm. Human-Machine Communication, 1, 71–86.
https://doi.org/10.30658/hmc.1.5
Geary, D., & Berch, D. (2016). Evolution and children’s cognitive and academic development. In D. Geary, & D. Berch (Eds.), Evolutionary perspectives on child
development and education (pp. 217–249). Springer. https://doi.org/10.1007/978-3-319-29986-0_9.
Gliem, J., & Gliem, R. (2003). Calculating, interpreting, and reporting Cronbach’s alpha reliability co-efficient for Likert-type scales. In Paper presented at the 2003
midwest research to practice conference in adult, continuing and community education. Columbus, Ohio: The Ohio State University. https://scholarworks.iupui.edu/
bitstream/handle/1805/344/Gliem+&+Gliem.pdf?sequence=1.
Guo, P. J., Kim, J., & Rubin, R. (2014). How video production affects student engagement: An empirical study of mooc videos. In Proceedings of the first ACM conference
on Learning@ scale conference (pp. 41–50). ACM. https://doi.org/10.1145/2556325.2566239.
Hauser, D. J., Ellsworth, P. C., & Gonzalez, R. (2018). Are manipulation checks necessary? Frontiers in Psychology, 9, 1–10. https://doi.org/10.3389/fpsyg.2018.00998
Heidig, S., & Clarebout, G. (2011). Do pedagogical agents make a difference to student motivation and learning? Educational Research Review, 6, 27–54. https://doi.
org/10.1016/j.edurev.2010.07.004
Hostetter, A. B. (2011). When do gestures communicate? A meta-analysis. Psychological Bulletin, 137, 297–315. https://doi.org/10.1037/a0022128
Johnson, M. H., Dziurawiec, S., Ellis, H., & Morton, J. (1991). Newborns’ preferential tracking of face-like stimuli and its subsequent decline. Cognition, 40, 1–19.
https://doi.org/10.1016/0010-0277(91)90045-6
Kalyuga, S. (2011). Cognitive load theory: How many types of load does it really need? Educational Psychology Review, 23, 1–19. https://doi.org/10.1007/s10648-010-
9150-7
Kelly, S. D., Manning, S. M., & Rodak, S. (2008). Gesture gives a hand to language and learning: Perspectives from cognitive neuroscience, developmental psychology
and education. Language and Linguistics Compass, 2, 569–588. https://doi.org/10.1111/j.1749-818X.2008.00067.x
Kim, Y., & Baylor, A. L. (2006). A social-cognitive framework for pedagogical agents as learning companions. Educational Technology Research & Development, 54,
569–596. https://doi.org/10.1007/s11423-006-0637-3
Kim, Y., & Baylor, A. L. (2007). Pedagogical agents as social models to influence learner attitudes. Educational Technology, 23–28. https://www.jstor.org/stable/
44429373.
Kirschner, P. A., Sweller, J., Kirschner, F., & Zambrano, J. (2018). From cognitive load theory to collaborative cognitive load theory. International Journal of
Computer-Supported Collaborative Learning, 13, 213–233. https://doi.org/10.1007/s11412-018-9277-y
Krämer, N. C., & Bente, G. (2010). Personalizing e-learning. The social effects of pedagogical agents. Educational Psychology Review, 22, 71–87. https://doi.org/
10.1007/s10648-010-9123-x
Leppink, J., Paas, F., Van der Vleuten, C. P., Van Gog, T., & Van Merriënboer, J. J. (2013). Development of an instrument for measuring different types of cognitive
load. Behavior Research Methods, 45, 1058–1072. https://doi.org/10.3758/s13428-013-0334-1
Leutner, D. (2014). Motivation and emotion as mediators in multimedia learning. Learning and Instruction, 29, 174–175. https://doi.org/10.1016/j.
learninstruc.2013.05.004
Liew, T. W., Zin, N. A. M., & Sahari, N. (2017). Exploring the affective, motivational and cognitive effects of pedagogical agent enthusiasm in a multimedia learning
environment. Human-centric Computing and Information Sciences, 7, 1–21. https://doi.org/10.1186/s13673-017-0089-2
Lin, L., Ginns, P., Wang, T., & Zhang, P. (2020). Using a pedagogical agent to deliver conversational style instruction: What benefits can you obtain? Computers &
Education, 143, 103658. https://doi.org/10.1016/j.compedu.2019.103658
Loehr, D. P. (2012). Temporal, structural, and pragmatic synchrony between intonation and gesture. Laboratory Phonology, 3, 71–89. https://doi.org/10.1515/lp-
2012-0006
Madan, C. R., & Singhal, A. (2012). Using actions to enhance memory: Effects of enactment, gestures, and exercise on human memory. Frontiers in Psychology, 3, 507.
https://doi.org/10.3389/fpsyg.2012.00507
Martha, A. S. D., & Santoso, H. B. (2019). The design and impact of the pedagogical agent: A systematic literature review. Journal of Educators Online, 16(1).
Mayer, R. E. (2014a). Cognitive theory of multimedia learning. In R. E. Mayer (Ed.), The Cambridge handbook of multimedia learning (pp. 43–71). Cambridge University
Press. https://doi.org/10.1017/cbo9781139547369.005.
Mayer, R. E. (2014b). Principles based on social cues in multimedia learning: Personalization, voice, image, and embodiment principles. In R. E. Mayer (Ed.), The
Cambridge handbook of multimedia learning (pp. 345–368). Cambridge University Press. https://doi.org/10.1017/cbo9781139547369.017.
Mayer, R. E., & DaPra, C. S. (2012). An embodiment effect in computer-based learning with animated pedagogical agents. Journal of Experimental Psychology: Applied,
18, 239–252. https://doi.org/10.1037/a0028616
Mayer, R. E., & Moreno, R. (2003). Nine ways to reduce cognitive load in multimedia learning. Educational Psychologist, 38, 43–52. https://doi.org/10.1207/
S15326985EP3801_6
Mayer, R. E., Sobko, K., & Mautone, P. D. (2003). Social cues in multimedia learning: Role of speaker’s voice. Journal of Educational Psychology, 95, 419–425. https://
psycnet.apa.org/doi/10.1037/0022-0663.95.2.419.
McHugh, M. L. (2012). Interrater reliability: The kappa statistic. Biochemia Medica, 22, 276–282. https://doi.org/10.11613/BM.2012.031
McNeill, D. (1992). Hand and mind: What gestures reveal about thought. University of Chicago Press.
Messinger, D. S. (2002). Positive and negative: Infant facial expressions and emotions. Current Directions in Psychological Science, 11, 1–6. https://doi.org/10.1111/
1467-8721.00156
Moreno, R., & Mayer, R. E. (2007). Interactive multimodal learning environments. Educational Psychology Review, 19, 309–326. https://doi.org/10.1007/s10648-007-
9047-2
Moreno, R., Mayer, R. E., Spires, H. A., & Lester, J. C. (2001). The case for social agency in computer-based teaching: Do students learn more deeply when they
interact with animated pedagogical agents? Cognition and Instruction, 19, 177–213. https://doi.org/10.1207/S1532690XCI1902_02
Mori, M., MacDorman, K. F., & Kageki, N. (1970/2012). The uncanny valley. IEEE Robotics and Automation Magazine, 19, 98–100. https://doi.org/10.1109/
MRA.2012.2192811
Motley, M. T., & Camden, C. T. (1988). Facial expression of emotion: A comparison of posed expressions versus spontaneous expressions in an interpersonal
communication setting. Western Journal of Communication, 52, 1–22. https://doi.org/10.1080/10570318809389622
Nass, C., Steuer, J., & Tauber, E. R. (1994). Computers are social actors. In Proceedings of the SIGCHI conference on human factors in computing systems (pp. 72–78). https://doi.org/10.1145/259963.260288
Okonkwo, C., & Vassileva, J. (2001). Affective pedagogical agents and user persuasion. In C. Stephanidis (Ed.), Proceedings of the 9th international conference on human-
computer interaction (pp. 397–401). UAHCI.
Park, S. (2015). The effects of social cue principles on cognitive load, situational interest, motivation, and achievement in pedagogical agent multimedia learning.
Journal of Educational Technology & Society, 18, 211–229.
Reeves, B., & Nass, C. (1996). The media equation: How people treat computers, television, and new media like real people. Cambridge University Press.
Rinehart, J. S. (1980). Geysers and geothermal energy. Springer.
Ryu, J., & Baylor, A. L. (2005). The psychometric structure of pedagogical agent persona. Technology, Instruction, Cognition and Learning, 2, 291–314.
Ryu, J., & Yu, J. (2013). The impact of gesture and facial expression on learning comprehension and persona effect of pedagogical agent. Science of Emotion and
Sensibility, 16, 281–292. Retrieved from https://www.koreascience.or.kr/article/JAKO201330951781236.page.
Sato, W., Hyniewska, S., Minemoto, K., & Yoshikawa, S. (2019). Facial expressions of basic emotions in Japanese laypeople. Frontiers in Psychology, 10, 1–11. https://
doi.org/10.3389/fpsyg.2019.00259
Schallberger, U. (2005). Kurzskalen zur Erfassung der Positiven Aktivierung, Negativen Aktivierung und Valenz in Experience Sampling Studien (PANAVA-KS) [Short
scales for measuring positive activation, negative activation, and valence in experience sampling studies (PANAVA-KS)]. Research report, Department of Psychology.
Schneider, S., Beege, M., Nebel, S., & Rey, G. D. (2018). A meta-analysis of how signaling affects learning with media. Educational Research Review, 23, 1–24. https://
doi.org/10.1016/j.edurev.2017.11.001
Schroeder, N. L. (2017). The influence of a pedagogical agent on learners’ cognitive load. Educational Technology & Society, 20, 138–147.
Schroeder, N. L., & Adesope, O. O. (2014). A systematic review of pedagogical agents’ persona, motivation, and cognitive load implications for learners. Journal of
Research on Technology in Education, 46, 229–251. https://doi.org/10.1080/15391523.2014.888265
Schroeder, N. L., Adesope, O. O., & Gilbert, R. B. (2013). How effective are pedagogical agents for learning? A meta-analytic review. Journal of Educational Computing
Research, 49, 1–39. https://doi.org/10.2190/EC.49.1.a
Shan, C., Gong, S., & McOwan, P. W. (2007, September). Beyond facial expressions: Learning human emotion from body gestures. In British machine vision conference.
https://doi.org/10.5244/C.21.43
Sundararajan, N., & Adesope, O. (2020). Keep it coherent: A meta-analysis of the seductive details effect. Educational Psychology Review, 32, 707–734. https://doi.org/
10.1007/s10648-020-09522-4
Sweller, J. (2010). Element interactivity and intrinsic, extraneous and germane cognitive load. Educational Psychology Review, 22, 123–138. https://doi.org/10.1007/
s10648-010-9128-5
Sweller, J., Van Merriënboer, J. J., & Paas, F. (2019). Cognitive architecture and instructional design: 20 years later. Educational Psychology Review, 31, 261–292.
https://doi.org/10.1007/s10648-019-09465-5
Thompson, L. A. (1995). Encoding and memory for visible speech and gestures: A comparison between young and older adults. Psychology and Aging, 10, 215–228.
https://doi.org/10.1037/0882-7974.10.2.215
Tonguç, G., & Ozkara, B. O. (2020). Automatic recognition of student emotions from facial expressions during a lecture. Computers & Education, 148, 103797. https://
doi.org/10.1016/j.compedu.2019.103797
Twyford, J., & Craig, S. (2013). Virtual humans and gesturing during multimedia learning: An investigation of predictions from the temporal contiguity effect. In
T. Bastiaens, & G. Marks (Eds.), Proceedings of E-learn 2013–world conference on E-learning in corporate, government, healthcare, and higher education (pp.
2145–2149). Association for the Advancement of Computing in Education (AACE).
Unal-Colak, F., & Ozan, O. (2012). The effect of animated agents on students’ achievement and attitudes. The Turkish Online Journal of Distance Education, 13, 96–111.
https://dergipark.org.tr/en/pub/tojde/issue/16900/176146.
Vuletic, T., Duffy, A., Hay, L., McTeague, C., Campbell, G., & Grealy, M. (2019). Systematic literature review of hand gestures used in human computer interaction
interfaces. International Journal of Human-Computer Studies, 129, 74–94. https://doi.org/10.1016/j.ijhcs.2019.03.011
Wang, J., Antonenko, P., & Dawson, K. (2020). Does visual attention to the instructor in online video affect learning and learner perceptions? An eye-tracking analysis.
Computers & Education, 146, 103779. https://doi.org/10.1016/j.compedu.2019.103779
Wang, F., Li, W., Mayer, R. E., & Liu, H. (2018). Animated pedagogical agents as aids in multimedia learning: Effects on eye-fixations during learning and learning
outcomes. Journal of Educational Psychology, 110, 250–268. https://doi.org/10.1037/edu0000221
Wilson, M. (2002). Six views of embodied cognition. Psychonomic Bulletin & Review, 9, 625–636. https://doi.org/10.3758/BF03196322
Woo, H. L. (2008). Designing multimedia learning environments using animated pedagogical agents: Factors and issues. Journal of Computer Assisted Learning, 25,
203–218. https://doi.org/10.1111/j.1365-2729.2008.00299.x
Zavgorodniaia, A., Duran, R., Hellas, A., Seppala, O., & Sorva, J. (2020, September). Measuring the cognitive load of learning to program: A replication study. In
J. Maguire, & Q. Cutts (Eds.), United Kingdom & Ireland computing education research conference (pp. 3–9). https://doi.org/10.1145/3416465.3416468
Zhang, I., Givvin, K. B., Sipple, J. M., Son, J. Y., & Stigler, J. W. (2021). Instructed hand movements affect students’ learning of an abstract concept from video.
Cognitive Science, 45, e12940. https://doi.org/10.1111/cogs.12940