
Towards an HRI Tutoring Framework for Long-term Personalization and Real-time Adaptation

Giulia Belgiovine (giulia.belgiovine@iit.it), Italian Institute of Technology, Genova, Italy; University of Genova, Genova, Italy
Jonas Gonzalez-Billandon, Italian Institute of Technology, Genova, Italy
Giulio Sandini (giulio.sandini@iit.it), Italian Institute of Technology, Genova, Italy
Francesco Rea (francesco.rea@iit.it), Italian Institute of Technology, Genova, Italy
Alessandra Sciutti (alessandra.sciutti@iit.it), Italian Institute of Technology, Genova, Italy

ABSTRACT
Personalization and adaptation are key aspects of designing and developing effective and acceptable social robot tutors. They make it possible to tailor interactions to individual needs and preferences, improve engagement and the sense of familiarity over time, and facilitate trust between the user and the robot. To foster the development of autonomous adaptive social robots, we present a tutoring framework that recognizes new or previously met pupils and adapts the training experience through feedback about real-time performance and through the tailoring of exercises and interaction based on users' past encounters. The framework is suitable for multiparty scenarios, allowing for deployment in real-world tutoring contexts unfolding in groups.

A preliminary evaluation of the framework during pilot studies and demonstration events in yoga-based training and game scenarios showed that our framework could be adapted to different contexts and populations, including children and adolescents. The robot's ability to recognize people and personalize its behavior based on the performance of previous sessions was appreciated by participants, who reported the feeling of being followed and cared for by the robot. Overall, the framework can support autonomous robot-led training by allowing monitoring of both daily performance and improvements over multiple encounters. It also lends itself to further expansion to more complex behaviors, with the organic and modular inclusion of more advanced social capabilities, such as redirecting the robot's attention to different learners or estimating participant engagement.

CCS CONCEPTS
• Computer systems organization → Robotics; • Computing methodologies → Artificial intelligence.

KEYWORDS
robotic tutoring, social assistive robots, personalization

ACM Reference Format:
Giulia Belgiovine, Jonas Gonzalez-Billandon, Giulio Sandini, Francesco Rea, and Alessandra Sciutti. 2022. Towards an HRI Tutoring Framework for Long-term Personalization and Real-time Adaptation. In Adjunct Proceedings of the 30th ACM Conference on User Modeling, Adaptation and Personalization (UMAP '22 Adjunct), July 4–7, 2022, Barcelona, Spain. ACM, New York, NY, USA, 7 pages. https://doi.org/10.1145/3511047.3537689

1 INTRODUCTION

1.1 Motivation
Social robots are increasingly being designed for deployment in educational, healthcare, and rehabilitation contexts [7, 25], owing to the considerable benefits and positive impact they have been shown to bring to these fields. In such application domains, robots often fulfill the role of an expert tutor, aiming at supporting and encouraging significant improvements in the learning process and tackling the problem of low motivation and limited commitment to therapy [14, 18, 28, 30, 31].

Personalization and adaptation are fundamental aspects of designing successful Socially Assistive Robots (SARs), especially in those contexts in which interactions are expected to unfold over a relatively long period. Indeed, long-term interactions can suffer from decreased user interest and engagement, consequently affecting robots' effectiveness and acceptability. Adaptive systems are thus necessary to tailor interactions, improve engagement and the sense of familiarity over time, and facilitate rapport and trust between the user and the robot [9, 17, 20].

However, due to the vast needs in tutoring contexts, a completely non-autonomous personalization of SARs is infeasible. To foster the development of autonomous adaptive solutions, robot tutors should be provided with cognitive and social abilities such as recognizing their partners and keeping a memory of their shared experiences. In particular, they should retrieve experiences from memory as the context requires and flexibly reuse this knowledge to select appropriate actions in pursuit of inner goals (e.g., adapting to the user's needs). Moreover, they should guarantee real-time adaptive behaviors and continuously update their beliefs by exploiting their first-hand experience, as humans would do [29].


Figure 1: Two moments from the yoga training session: in the left panel, iCub shows a pose to the participants; in the right panel, the participants try to reproduce the pose, receiving feedback when it is not correctly executed (as in the case of the participants on the left and on the right of the figure).

Furthermore, although individualized tutoring and assistance is a first foreseen application for social robots, they also represent a promising means to promote learning in social contexts. The use of robot tutors could be beneficial in different group scenarios (e.g., nursing homes for the elderly or fitness and therapy centers), not only for logistic reasons (e.g., the growth of the population in need of assistance, the limited availability of resources, and the costs of robotic technologies) but also because of the importance of the relational and social component in stimulating a person's motivation [10, 12]. To promote group tutoring, the solutions developed for social robots must remain robust and reliable in unstructured and dynamic contexts such as multiparty interactions.

1.2 Research Objectives and Contribution
In this work, we present a tutoring framework for HRI resulting from the extension and adaptation of a previously developed computational architecture [15].

The resulting framework, presented in Fig. 2, allows for the autonomous organization (saving and labeling) of the robot's multimodal first-hand experience, thanks to the combination of perceptual and reasoning mechanisms with a working-memory system. Specifically, the framework autonomously creates a structured dataset of faces and voices, used to perform open-set and incremental person recognition, along with information about users' training. The robot can therefore recognize previously encountered users as well as users it has not met before and, in the latter case, add them to its long-term memory.

Since the tutoring framework can recognize new or previously met pupils, it can adapt the training experience through real-time performance-related feedback and by considering users' past experiences. We tested the framework in a multi-session physical task (Fig. 1), designed by leveraging a co-design process with an expert in the tutoring field (i.e., a gymnastics coach).

In presenting the tutoring framework, we provide initial discussions on how it can be further used to study personalization in group interactions, starting from observations gathered in preliminary pilot studies and demos, and leaving a more systematic user study as future work.

Indeed, our architecture is suitable for multiparty scenarios. This aspect opens the possibility for deployment in real-world tutoring contexts unfolding in groups, bringing advantages both in engagement/motivation and in logistics. Methods for personalizing the tutoring strategy in HRI have been extensively studied for single-user, single-robot interaction; however, it is reasonable to expect that robot tutors will interact with classes rather than just single students. Thus, they might need to adopt tailored training strategies that consider different students' profiles concomitantly, as well as the group dynamics arising from these profiles.

1.3 Related Work
The study of Winkle et al. [31], carried out with experts in the tutoring field, highlighted the crucial need to personalize the robot to the user in its overall functional, motivational, and interaction strategies, as well as to adapt in real time to users' behaviors during interactions. Thus, personalization needs to occur at different levels and focus on: 1) 'high-level' features that represent individual differences and can be assumed to be constant over time (e.g., personality, preferences, social background); 2) long-term changes, i.e., everything related to the user's history (such as therapy progress); and 3) real-time evidence (e.g., behaviors, performance) observed during the interaction, which should be informed by the high-level personalization. However, to the best of our knowledge, there is not yet an autonomous HRI system addressing all these aspects concomitantly.

Works on personalization for SARs have focused mainly on generating and monitoring short-term sessions based on performance information. As an example, a recent work [23] focused specifically on the automatic assessment of physical exercises, which is necessary for the robot to autonomously monitor user performance and adapt to it by designing specific therapy plans and generating clinical reports; a dedicated graphical interface allowed the therapist to modify the plan. Another work demonstrated a gait training system that could autonomously monitor task performance but required teleoperation of the social robot companion [21].

In addition to performance adaptation, other works have focused on customization strategies to learn user preferences and personalities [26, 32], or to personalize instructions and feedback [8, 27]. For example, Süssenbach et al. [27] implemented a motivational, action-based interaction model for HRI to autonomously generate the right type of feedback (including general social encouragement as well as comments on task performance) to give under different conditions. The system was designed based on social interaction patterns observed in expert-led human-human interactions (HHI).

Many other studies have shown the benefits of remembering users' personal attributes (e.g., name, age) [4, 6, 13, 17], preferences and behavior patterns [3, 6, 16], as well as previously shared history [3, 4, 16, 19] for improving the user experience in long-term interactions. Zheng et al. [33] compared remembering these four types of information with a personal assistant robot that tracked the status of users' tasks and gave health tips. Their findings suggested that commenting on observed user behavior patterns elicits stronger positive feelings, and that tracking the progress of users' goals and recalling previously shared history are more effective in building rapport than commenting on semantic information. However, a more comprehensive study is necessary for more conclusive results, since each of these types of information has proved useful in the literature.


Based on this evidence, we decided to let the robot store in its long-term memory the biometric data needed for recognition (face and voice) and semantic information (age, nationality, etc.), as well as episodic memories related to previous training sessions. For example, the robot gave personalized advice/comments for each exercise to be performed, according to the difficulties or successes the user had experienced in the past for that particular exercise.

As concerns SAR applications to coach/assist groups, the solutions present in the literature are still few [2, 11, 22] and do not fully address the problems of autonomous data organization and tutoring personalization in such complex interactions. Indeed, most studies focus on dyadic interactions. This is a limitation also for effective and engaging training sessions, given the confirmed importance of interaction and of the relational component in stimulating the motivational dimension of the person [10, 12]. Despite the recent growth in research investigating the influence of robots on groups, our overall understanding of what happens when robots are placed within groups is still highly limited. Developing such an understanding is essential for the training and educational fields, as they generally involve interactions in groups or teams, where the individual's experience depends strongly on group dynamics.

The novel contribution of our work is to provide a comprehensive framework addressing tutoring personalization for social robots that i) is suitable for multiparty interactions, ii) performs person recognition autonomously, and iii) allows for personalization at different time levels.

2 ARCHITECTURE
The architecture is composed of stand-alone modules integrated with the YARP middleware [24], and it was developed starting from a previous architecture presented in [15]. Thanks to this modular design, it is easy to adapt to new HRI scenarios. For the tutoring framework, we redesigned the module responsible for the interaction (the Yoga Supervisor) and added a new module to handle the teaching of the designed task (the Yoga Teacher), as shown in Figure 2. The perceptual system is composed of Face Detection, Sound Localization, Speech Recognition and Voice Activity Detection (VAD), Multiple Object Tracker (MOT), and Pose Estimation modules. The latter is responsible for computing the users' body skeletons using the openPose software [5].

Figure 2: Tutoring framework built upon a previous architecture [15], allowing for person recognition, real-time performance tracking and feedback delivery, participant profiling, and interaction with multiple users. The diagram connects the perception modules (Sound Localization, VAD + Speech Recognition, Face Detection, Multiple Objects Tracker, Pose Estimation, Person Recognition) and the action modules (iKinGazeCtrl for head and gaze, Body Motion, Audio Recorder, Dialogue System) to the Yoga Supervisor (NLU + dialogue management flow), the Yoga Teacher, the Working Memory (an Object Properties Collector database holding {tracker_id, name, is_verified, is_active, spatial_pos, is_tested, is_trained}), and the long-term Database of faces/voices, personal info, performance, and skill level.
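To give a flavor of this modular integration, the following is a minimal sketch of how one module could publish a message to another over YARP, using the standard YARP Python bindings. The port names are illustrative, not those of the actual system:

```python
import yarp  # standard YARP Python bindings

yarp.Network.init()

# Each stand-alone module opens named ports; connections define the dataflow.
feedback_out = yarp.BufferedPortBottle()
feedback_out.open("/yogaTeacher/feedback:o")        # illustrative port name
yarp.Network.connect("/yogaTeacher/feedback:o",     # source (illustrative)
                     "/dialogueSystem/feedback:i")  # sink (illustrative)

# Publish one corrective utterance for the dialogue system to speak.
bottle = feedback_out.prepare()
bottle.clear()
bottle.addString("Align your elbow with your shoulder")
feedback_out.write()
```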
The architecture also includes a Person Recognition module and a Working Memory, which stores, and shares with the other modules, temporary information associated with each user (e.g., is_trained, is_tested) to keep track of the users' states during the exercise session and to manage the group interaction.
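For illustration, here is a minimal Python sketch of such a per-user entry, using the property names shown in the Object Properties Collector of Fig. 2 (the class itself and its method are our own hypothetical rendering, not the paper's code):

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class UserEntry:
    """Temporary state the Working Memory shares about one tracked user."""
    tracker_id: int                       # id assigned by the Multiple Object Tracker
    name: Optional[str] = None            # filled in once Person Recognition succeeds
    is_verified: bool = False             # identity confirmed (known or newly enrolled)
    is_active: bool = True                # currently visible/engaged in the session
    spatial_pos: Optional[Tuple[float, float]] = None  # last known position in the scene
    is_tested: bool = False               # completed the test phase for the current pose
    is_trained: bool = False              # completed the training phase for the current pose

    def reset_for_new_pose(self) -> None:
        """Clear the per-pose flags when the session moves to the next pose."""
        self.is_tested = False
        self.is_trained = False
```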
Finally, the data stored in the Long-Term Memory include biometric data (to recognize users), personal information (name, age, profession, etc.), and information about performance and training progress.

In the following, we present in more detail the modules used in the tutoring scenario. For a detailed description of the other modules, we refer the reader to [15].

2.1 Yoga teacher module
This module is responsible for all the teaching-related behaviors, including the customization of the training program. It comprises three main components: the Pose, Trainer, and User Profiler classes.

The Pose class computes the list of joint angles (i.e., the angles between two consecutive body segments) that are important for the correct reproduction of the pose. For example, the first pose in Fig. 3 requires maintaining an angle of approximately 90° at the right knee, computed as the geometric angle between two body segments in the 2D space (in this case, the thigh and the calf).

The Trainer class deals with all the behaviors related to general coaching. For example, it evaluates the performance for each pose based on the differences between the joint angles of the target pose (the one reproduced by the robot and defined by the expert) and those of the current pose (the one detected from the user), within a specific Margin Error (ME) of tolerance (to be customized based on the context). Other tasks managed by this class include giving feedback to correct the posture if there are joints to be fixed, giving praise once no corrections remain, giving personalized feedback based on past performance (provided by the User Profiler), showing the poses, and giving a generic evaluation at the end of each test session. A minimal sketch of the angle computation and the ME-based check follows below.

The User Profiler class handles the customization of the training for each user, such as setting exercise parameters like the Pose Retention Time (PRT) or the Margin Error (ME) (for example, by increasing the latter in the case of children in a game scenario). It is also in charge of computing the skill level used to tailor the training program, and of saving, loading, and analyzing the log files with all the users' information.

We provide in Table 1 a summary of the tasks performed by each module, with the corresponding description and some implementation examples.
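To make the Pose and Trainer computations concrete, here is a minimal Python sketch of the angle computation and the ME-based check described above. The function names and example values are our own illustration; the paper does not publish its implementation:

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]  # 2D keypoint from the openPose skeleton

def joint_angle(a: Point, b: Point, c: Point) -> float:
    """Geometric angle (degrees) at joint b, between segments b->a and b->c.

    E.g., the right-knee angle uses hip (a), knee (b), and ankle (c):
    thigh and calf as the two consecutive body segments.
    """
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

def joints_to_fix(target: Dict[str, float],
                  current: Dict[str, float],
                  margin_error: float) -> Dict[str, float]:
    """Return, for each named joint, the signed deviation exceeding the ME.

    `target` holds the expert-defined angles of the pose shown by the robot;
    `current` holds the angles measured on the user. An empty result means
    the pose is held correctly and the Trainer can switch to praise.
    """
    errors = {}
    for joint, target_angle in target.items():
        deviation = current[joint] - target_angle
        if abs(deviation) > margin_error:
            errors[joint] = deviation
    return errors

# Example: the beginner pose requires ~90 degrees at the right knee.
target_pose = {"right_knee": 90.0}
current_pose = {"right_knee": joint_angle((0.0, 0.0), (0.0, 1.0), (1.0, 1.0))}
print(joints_to_fix(target_pose, current_pose, margin_error=10.0))  # {} -> pose correct
```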


2.2 Yoga supervisor module
The role of this module is to manage the tutoring session, dealing primarily with the interaction rather than with teaching proper. For example, through communication with the working memory, this module keeps track of users who have already been tested or trained. It also handles the transitions through the different phases of the yoga session, such as pose demonstration, start/end of the training, and start/end of the testing phase. Furthermore, it decides which other modules to start and stop (e.g., sound localization, tracking, etc.) based on external signals or events.

Of course, this module can be easily adapted and modified if one needs a different interaction design, as shown in Section 4.2, where we briefly describe how we adapted the interaction to a game-like training suitable for younger populations (children and adolescents).

Table 1: Tutoring modules and their tasks, with corresponding descriptions. TR = Training, TS = Test; V = Verbal, NV = Non-Verbal (body + expressions); LTP = Long-Term Personalization, RTA = Real-Time Adaptation.

| Module | Task | Description/Examples | Phase | Modality | Adaptation Level |
|---|---|---|---|---|---|
| Yoga Teacher | Show the pose | Reproduce the pose using the robot body | TR | NV | - |
| Yoga Teacher | Compute joint angles | Starting from the body skeleton, compute the joint angles important for that pose | TR+TS | - | - |
| Yoga Teacher | Compute performance | Compute differences between target and current pose (accuracy and time) | TR+TS | - | RTA |
| Yoga Teacher | Give general feedback | Say corrective utterances: "Align your elbow with your shoulder" | TR | V | RTA |
| Yoga Teacher | Motivate/praise | "Great! Go on like that!" | TR | V+NV | RTA |
| Yoga Teacher | Give personalized advice | Remember previous errors for each pose: "This time, try to focus on the knees!" | TR+TS | V+NV | LTP |
| Yoga Teacher | Test/evaluate user | Evaluate the user's skill and give a final score: "Well done, but you can do better!" | TS | V+NV | RTA |
| Yoga Teacher | Tailor training program | Compute the user's skill level and deliver a suitable training program | - | - | LTP |
| Yoga Teacher | Save/upload log files | Save performance information in long-term memory | TR/TS | - | LTP |
| Yoga Supervisor | Interact with known and unknown users | Greet known users and introduce new ones into the system | TR+TS | V+NV | LTP |
| Yoga Supervisor | Track participants' state | Remember which users have already performed the training/test | - | - | RTA |
| Yoga Supervisor | Switch through different stages | Decide when to switch between the training/test phases or other interaction phases | - | - | RTA |

3 METHODS
As previously explained, we tested the architecture through pilot experiments, by recruiting volunteer participants, and by conducting demos for external visitors of different ages, including children and adolescents (ages ranging from 7 to 16 years old); see Fig. 6.

The main objectives of these preliminary experiments were to test the expanded architecture and to fine-tune (thanks also to the observations of the expert human tutor) the possible behaviors of the robot and the characteristics of the task (poses to reproduce, parameters, etc.). The final design was chosen to make the session enjoyable and adaptable to people with different skills (both adults and children), so as not to result in boredom or frustration.

These preliminary evaluations also allowed us to reflect on possible improvements to the tutoring framework, on which users' behaviors (individual and group-related) the robot should attend to in order to learn to adapt, and on what methods of autonomous learning might be reasonable to investigate.

3.1 Task
We opted for a physical task that is engaging for all ages, modular in its difficulty, and suitable for performing in groups. Specifically, the task consisted in executing and maintaining different poses inspired by yoga practice. Fig. 1 shows some moments from the yoga training session.

With the help of the expert, we identified three poses for each of three levels of expertise (beginner, intermediate, expert). The poses were chosen by taking into account the degrees of freedom of the robot's movements and the possibility of correctly recognizing the figure in a 2D plane, as we used the openPose software to detect the skeleton of the participants [5]. The poses involved the whole body, including arms, legs, torso, and head.

We show, as representative examples, one pose for each difficulty level in Fig. 3. The expert chose the poses (top row), which we then reproduced on the robot as a prefixed set of positions in the joint space (bottom row).

3.2 Experimental protocol
Users met for the first time (i.e., recognized as 'unknown'), after being enrolled in the database, started their session from level one (beginner). The session then proceeded with poses of incremental difficulty, up to a total of 6 poses per session. Users recognized as already enrolled in the system (i.e., met in a previous yoga session) were greeted by the robot in a personalized way (e.g., "Hi Giulia! Nice to see you again!"). At the same time, the Yoga Teacher module retrieved the information relative to the previous training sessions and computed the user's skill level, along with information about past performance for each pose, including a detailed report about the joint positions mistaken most often. Thus, for an already known user, the training program started at an appropriate difficulty level, continuing then with poses of incremental difficulty (again, for a total of 6 poses per session). All these processes were handled autonomously by the robot.
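A minimal sketch of this session-start logic, under stated assumptions (the names, thresholds, and data layout are illustrative; in the actual framework, identity comes from the Person Recognition module and the profile from the long-term memory):

```python
from statistics import mean
from typing import Dict, List, Tuple

def compute_skill_level(history: Dict[int, List[float]]) -> int:
    """Assumed rule: start from the level above the highest one whose past
    average performance exceeds a mastery threshold (illustrative)."""
    level = 1
    for lvl in sorted(history):
        if history[lvl] and mean(history[lvl]) >= 70.0:  # % of time pose held
            level = lvl + 1
    return level

def start_session(user_id: str,
                  long_term_memory: Dict[str, dict]) -> Tuple[str, int]:
    """Greeting and starting difficulty for a recognized or new user."""
    profile = long_term_memory.get(user_id)
    if profile is None:
        # Unknown user: enroll them, then start from level one (beginner).
        long_term_memory[user_id] = {"name": user_id, "history": {}}
        return "Hello! Nice to meet you.", 1
    # Known user: personalized greeting and a skill-based starting level.
    level = compute_skill_level(profile["history"])
    return f"Hi {profile['name']}! Nice to see you again!", level

# First encounter vs. a returning user who mastered level 1:
ltm = {"giulia": {"name": "Giulia", "history": {1: [85.0, 90.0]}}}
print(start_session("marco", ltm))   # ('Hello! Nice to meet you.', 1)
print(start_session("giulia", ltm))  # ('Hi Giulia! Nice to see you again!', 2)
```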

Figure 3: Examples of one yoga pose for each difficulty level (left: beginner; middle: intermediate; right: expert), performed by the human expert (top row) and reproduced on the iCub robot (bottom row).


For each pose to be learned during the session, there was first a training phase and then a test phase, followed by a final general evaluation.

Training Phase: during this phase, the robot showed the pose and then invited participants to reproduce it, giving real-time feedback when it perceived errors in holding the posture correctly. Examples of corrective feedback were "Try to keep your knee perpendicular to your foot" or "Raise your elbow at shoulder level". After the pose had been held correctly for at least 10 seconds, the robot congratulated the user.

Moreover, before demonstrating each pose, the robot delivered personalized feedback to known users based on their previous performance on that particular pose (e.g., "This time, when doing the warrior pose, remember to pay particular attention to keeping the arms aligned!").

Figure 4: From left to right, yarp views showing: iCub cameras visualization, the body skeleton output computed with openPose, and a simulation of the real-time iCub behavior. These interfaces were visible only to the experimenter, to check the correct execution of the experiment, and not to participants.
For sessions with more than one person at the same time, the training phase could be handled in two different ways: 1) the robot invited people to train in performing the pose one at a time, monitoring each person individually; 2) the robot invited everyone to train at the same time but, not being able to monitor everyone simultaneously, it alternated its attention across the participants (for feedback on the poses) in an equally distributed way. The latter version is closer to what a human coach would do and is therefore preferable. Additionally, in this way, the participants have the possibility of self-correcting by observing each other.
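A minimal sketch of the second mode's attention policy, assuming a simple round-robin rotation (our illustration; the paper does not specify the exact scheduling):

```python
from itertools import cycle
from typing import Iterator, List

def attention_schedule(participants: List[str],
                       n_feedback_turns: int) -> Iterator[str]:
    """Yield the participant to monitor at each feedback turn, round-robin.

    Every participant receives the same share of the robot's attention
    (up to one turn), mimicking a coach scanning the whole class.
    """
    turns = cycle(participants)
    for _ in range(n_feedback_turns):
        yield next(turns)

# With 3 trainees and 7 feedback slots, attention is spread evenly:
print(list(attention_schedule(["P1", "P2", "P3"], 7)))
# ['P1', 'P2', 'P3', 'P1', 'P2', 'P3', 'P1']
```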

Test Phase: after the training phase, the robot had to assess users' acquired skills and save all the information needed to customize subsequent sessions. At this stage, we decided to address each participant individually. Thus, the robot asked one person at a time to take position in front of it (to better acquire posture data) and invited them to reproduce the pose for a fixed period. In this phase, the robot gave no feedback. However, it saved the positions of all joints over time, from which it computed the performance metric, i.e., the ratio between the number of frames in which the detected pose was correct and the PRT, expressed as a percentage.

At the end of this phase, the robot gave general feedback based on the average performance by saying, for example, "You did great! Go on like that!" or "It was not so bad, but I think we need to practice more".
4 PRELIMINARY RESULTS

4.1 Users evaluation and profiling
We provide in Fig. 5 an example of the performance shown by two different participants during their first session (including 6 poses in total, 3 of beginner level and 3 of intermediate level, represented respectively by green and yellow bars in Fig. 5).

Figure 5: Example of performance from two participants (P1 and P2). On the left: session performance of the participants in the first yoga session, beginning with the three poses of level one (green bars) and going on with three poses of level two (yellow ones). In the middle and on the right: joint angles over time for the poses Warrior (middle) and Warrior 2 (right). The blue line represents the margin error (ME).

As explained before, the performance was defined as the percentage of time the pose was held correctly during the test phase and depended on the Margin Error (ME), i.e., the acceptable difference in joint angles between the target and the actual pose (the blue line in Fig. 5). This threshold was chosen empirically based on observations from the expert human coach and may vary according to the application context and population (e.g., it was relaxed in the case of game-like interactions with children and adolescents). As shown in Fig. 5 (second and third columns), joint angles were constantly monitored over time, triggering feedback from the tutor robot and defining the final performance and skill level. The latter, in particular, was computed based on the average performance (and standard deviation) of the three poses of each difficulty level. Only when the participant acquired a certain mastery was the challenge level increased (e.g., by reducing the ME or increasing the PRT), and the training program was readapted with exercises of appropriate difficulty, with a focus on the poses and body parts that proved to be most critical.
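A sketch of this profiling rule, under our reading of the text (a level's three pose scores are summarized by their mean and standard deviation, and the challenge is raised only when both indicate consistent mastery; the thresholds and step sizes are illustrative, not values from the paper):

```python
from statistics import mean, pstdev
from typing import Dict, List

def next_session_parameters(level_scores: List[float],
                            me: float,
                            prt: float,
                            mastery: float = 75.0,
                            max_spread: float = 10.0) -> Dict[str, object]:
    """Decide whether to raise the challenge for the next session.

    `level_scores` are the test percentages of the three poses at the
    current difficulty level; `me` (margin error, degrees) and `prt`
    (pose retention time, seconds) are the current exercise parameters.
    """
    avg, spread = mean(level_scores), pstdev(level_scores)
    if avg >= mastery and spread <= max_spread:
        # Consistent mastery: tighten the tolerance and lengthen the hold.
        return {"me": me - 2.0, "prt": prt + 2.0, "level_up": True}
    # Otherwise keep the parameters and refocus on the weakest poses.
    return {"me": me, "prt": prt, "level_up": False}

print(next_session_parameters([80.0, 85.0, 78.0], me=10.0, prt=10.0))
# {'me': 8.0, 'prt': 12.0, 'level_up': True}
```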
The developed tutoring framework allows not only collecting information regarding the performance and progress in the task, but also autonomously saving, in a structured way, samples of the face and voice, as well as general information, of each user interacting with the robot (see Fig. 2). We believe that this, coupled with the possibility of autonomously managing interactions of (albeit small) groups, represents an innovative contribution to the real-world deployment of social robot tutors. Although having this variety of data available will not be the ultimate solution to providing the robot with human-like social intelligence, it represents an important starting point for the autonomous development of robots' more complex social skills.


4.2 Interaction with younger population
The framework was also tested in interactions with children and adolescents on the occasion of a demo, revisiting the tutoring session described above in a more game-oriented way, but one still valuable for physical training and learning (also about robotics). We tested our framework with 36 children interacting with the robot in 6 groups composed on average of 6±4.3 members each (11.1±4.6 years old).

Participants performed a modified version of the Chinese whispers game (also known as the telephone game)¹, but, instead of whispering a word to each other, they had to "pass a pose".

The iCub tutor started the interaction by showing the first pose to the first child without the others seeing it. At that point, the child had to learn the pose thanks to the feedback from the tutor robot. Subsequently, each child showed the pose to the next, until the last child had to perform the pose in front of the robot, which rated how similar the last pose was to the one it had initially shown. This process was repeated until each child had been the first in the chain. In this way, each participant was able to experience training and evaluation with the robot tutor while also collaborating with their peers.
Figure 6: Some moments from the yoga task in a gamified version revisited for younger participants.

Participants reported that they enjoyed the experience and showed no particular difficulty in reproducing the poses (although their level of accuracy was lower than that of adults, and the robot's evaluations therefore used a higher margin of error). The robot's feedback also seemed suitable, although in these cases it is preferable to give more weight to visual feedback (e.g., showing the pose several times) rather than simply correcting with verbal feedback.
5 DISCUSSION
This work showed an extension and application of a previously developed architecture for open-set incremental person recognition [15] in a physical training context, to start exploring personalized systems in multiparty tutoring scenarios. Thanks to our framework, the robot can autonomously organize its sensory experience during multiparty interactions. Based on this experience, it can recognize both previously encountered and new users, adding the latter to the system. Finally, we have provided the robot with some abilities essential to fostering personalized tutoring, leveraging real-time information and previous experience.

Effectively synthesizing and generalizing all this information so that the robot's learning process is successful and adaptive requires further investigation. However, equipping robots with the ability to infer (and continually update/fine-tune) this knowledge by exploiting their direct experience is a fundamental step towards this goal, as well as an important requirement for facilitating their use by the staff who will act as intermediaries to the end-users (e.g., therapists, coaches, etc.).

Our architecture is flexible and adaptable to different contexts and populations, including children and adolescents. With the help of the human expert coach, we tried to model the robot's behavior so that the final interaction was pleasant and effective. The robot's ability to recognize people and personalize comments/feedback based on the performance of previous sessions proved highly appreciated by participants in the first pilot studies, giving them the feeling of being truly followed and cared for by the robot.

However, our solutions for managing multiparty training sessions relied on an interaction design set a priori. A possible way to improve tutoring, so that the robot can decide when and how to change its supervisory behavior, would be for the robot to distribute its resources in a reasoned manner, i.e., considering whether and how to shift its attention to specific team members or, better still, thoughtfully inviting class members to help each other to bridge any disparities.

Furthermore, thanks to the developed architecture, we could provide the robot with the ability to respond proactively to users requesting its help, using, for example, audio-visual attentional mechanisms. However, the management of this eventual phase is also non-trivial and requires further investigation.

With our work, we want to encourage further investigation of personalization strategies in multiparty HRI. Indeed, in these contexts, the robot tutor should be both socially competent and successful in training, meaning that it should be aware not only of the individual learners' needs but also of the group dynamics that may arise in response to particular behaviors, as it may unintentionally favor one member of the group or disadvantage another. For example, if the robot's goal is to maximize the performance of each team member, it may decide to distribute more training resources to those team members identified as under-performing in the task, but this may make them uncomfortable or may lead other members to feel insufficiently monitored. How can the robot understand when to prioritize the "group component" over the "individual component"? And to what extent do the dynamics observed in human interactions apply to robot-group interactions in these peculiar contexts? It is necessary to explore these aspects more extensively in HRI in order to prevent algorithm-based learning from resulting in unequal treatment, inter-group bias, and social exclusion of team members [1].

¹ https://en.wikipedia.org/wiki/Chinese_whispers


ACKNOWLEDGMENTS
This work has been supported by a Starting Grant from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme, G.A. No 804388, wHiSPER. Thanks also to Irene Pippo for her collaboration as the expert tutor.

REFERENCES
[1] Anna MH Abrams and Astrid M der Pütten. 2020. I–c–e framework: Concepts for group dynamics research in human-robot interaction. International Journal of Social Robotics 12, 6 (2020), 1213–1229.
[2] Iivari Bäck, Kari Makela, and Jouko Kallio. 2013. Robot-led Exercise Program for the Rehabilitation of Older Nursing Home Residents. Annals of Long-Term Care (2013).
[3] Tony Belpaeme, Paul Baxter, Robin Read, Rachel Wood, Heriberto Cuayáhuitl, Bernd Kiefer, Stefania Racioppa, Ivana Kruijff-Korbayová, Georgios Athanasopoulos, Valentin Enescu, et al. 2012. Multimodal child-robot interaction: Building social bonds. Journal of Human-Robot Interaction 1, 2 (2012).
[4] Joana Campos, James Kennedy, and Jill F Lehman. 2018. Challenges in Exploiting Conversational Memory in Human-Agent Interaction. In Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems. 1649–1657.
[5] Z. Cao, G. Hidalgo Martinez, T. Simon, S. Wei, and Y. A. Sheikh. 2019. OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019).
[6] Nikhil Churamani, Paul Anton, Marc Brügger, Erik Fließwasser, Thomas Hummel, Julius Mayer, Waleed Mustafa, Hwei Geok Ng, Thi Linh Chi Nguyen, Quan Nguyen, et al. 2017. The impact of personalisation on human-robot interaction in learning scenarios. In Proceedings of the 5th International Conference on Human Agent Interaction. 171–180.
[7] Carlos A Cifuentes, Maria J Pinto, Nathalia Céspedes, and Marcela Múnera. 2020. Social robots in therapy and care. Current Robotics Reports 1, 3 (2020), 59–74.
[8] Caitlyn Clabaugh, Kartik Mahajan, Shomik Jain, Roxanna Pakkar, David Becerra, Zhonghao Shi, Eric Deng, Rhianna Lee, Gisele Ragusa, and Maja Matarić. 2019. Long-term personalization of an in-home socially assistive robot for children with autism spectrum disorders. Frontiers in Robotics and AI (2019), 110.
[9] Kerstin Dautenhahn. 2004. Robots we like to live with?! A developmental perspective on a personalized, life-long robot companion. In RO-MAN 2004. 13th IEEE International Workshop on Robot and Human Interactive Communication (IEEE Catalog No. 04TH8759). IEEE, 17–22.
[10] Willy De Weerdt, Godelieve Nuyens, Hilde Feys, Peter Vangronsveld, Ann Van de Winckel, Alice Nieuwboer, Jeroen Osaer, and Carlotte Kiekens. 2001. Group physiotherapy improves time use by patients with stroke in rehabilitation. Australian Journal of Physiotherapy 47, 1 (2001), 53–61.
[11] Jing Fan, Linda Beuscher, Paul A Newhouse, Lorraine C Mion, and Nilanjan Sarkar. 2016. A robotic coach architecture for multi-user human-robot interaction (RAMU) with the elderly and cognitively impaired. In 2016 25th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN). IEEE, 445–450.
[12] Louise Gauthier, Sandra Dalziel, and Serge Gauthier. 1987. The benefits of group occupational therapy for patients with Parkinson's disease. The American Journal of Occupational Therapy 41, 6 (1987), 360–365.
[13] Rachel Gockley, Allison Bruce, Jodi Forlizzi, Marek Michalowski, Anne Mundell, Stephanie Rosenthal, Brennan Sellner, Reid Simmons, Kevin Snipes, Alan C Schultz, et al. 2005. Designing robots for long-term social interaction. In 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE, 1338–1343.
[14] Rachel Gockley and Maja J Matarić. 2006. Encouraging physical therapy compliance with a hands-off mobile robot. In Proceedings of the 1st ACM SIGCHI/SIGART Conference on Human-Robot Interaction. 150–155.
[15] Jonas Gonzalez, Giulia Belgiovine, Alessandra Sciutti, Giulio Sandini, and Rea Francesco. 2021. Towards a Cognitive Framework for Multimodal Person Recognition in Multiparty HRI. In Proceedings of the 9th International Conference on Human-Agent Interaction. 412–416.
[16] Wan Ching Ho, Kerstin Dautenhahn, Mei Yii Lim, and Kyron Du Casse. 2010. Modelling Human Memory in Robotic Companions for Personalisation and Long-term Adaptation in HRI. In BICA. 64–71.
[17] Takayuki Kanda, Rumi Sato, Naoki Saiwaki, and Hiroshi Ishiguro. 2007. A two-month field trial in an elementary school for long-term human–robot interaction. IEEE Transactions on Robotics 23, 5 (2007), 962–971.
[18] Juan S Lara, Jonathan Casas, Andres Aguirre, Marcela Munera, Monica Rincon-Roncancio, Bahar Irfan, Emmanuel Senft, Tony Belpaeme, and Carlos A Cifuentes. 2017. Human-robot sensor interface for cardiac rehabilitation. In 2017 International Conference on Rehabilitation Robotics (ICORR). IEEE, 1013–1018.
[19] Iolanda Leite, Ginevra Castellano, André Pereira, Carlos Martinho, and Ana Paiva. 2014. Empathic robots for long-term interaction. International Journal of Social Robotics 6, 3 (2014), 329–341.
[20] Iolanda Leite, Carlos Martinho, and Ana Paiva. 2013. Social robots for long-term interaction: a survey. International Journal of Social Robotics 5, 2 (2013), 291–308.
[21] Bruno Leme, Masakazu Hirokawa, Hideki Kadone, and Kenji Suzuki. 2021. A socially assistive mobile platform for weight-support in gait training. International Journal of Social Robotics 13, 3 (2021), 459–468.
[22] Wing-Yue Geoffrey Louie and Goldie Nejat. 2020. A social robot learning to facilitate an assistive group-based activity from non-expert caregivers. International Journal of Social Robotics 12, 5 (2020), 1159–1176.
[23] Alejandro Martín, José C Pulido, José C González, Ángel García-Olaya, and Cristina Suárez. 2020. A framework for user adaptation and profiling for social robotics in rehabilitation. Sensors 20, 17 (2020), 4792.
[24] Giorgio Metta, Lorenzo Natale, Francesco Nori, Giulio Sandini, David Vernon, Luciano Fadiga, Claes Von Hofsten, Kerstin Rosander, Manuel Lopes, José Santos-Victor, et al. 2010. The iCub humanoid robot: An open-systems platform for research in cognitive development. Neural Networks 23, 8-9 (2010), 1125–1134.
[25] Irena Papadopoulos, Christina Koulouglioti, Runa Lazzarino, and Sheila Ali. 2020. Enablers and barriers to the implementation of socially assistive humanoid robots in health and social care: a systematic review. BMJ Open 10, 1 (2020), e033096.
[26] Marta Romeo, Daniel Hernández García, Ting Han, Angelo Cangelosi, and Kristiina Jokinen. 2021. Predicting apparent personality from body language: benchmarking deep learning architectures for adaptive social human–robot interaction. Advanced Robotics 35, 19 (2021), 1167–1179.
[27] Luise Süssenbach, Nina Riether, Sebastian Schneider, Ingmar Berger, Franz Kummert, Ingo Lütkebohle, and Karola Pitsch. 2014. A robot as fitness companion: towards an interactive action-based motivation model. In The 23rd IEEE International Symposium on Robot and Human Interactive Communication. IEEE, 286–293.
[28] Katelyn Swift-Spong, Elaine Short, Eric Wade, and Maja J Matarić. 2015. Effects of comparative feedback from a socially assistive robot on self-efficacy in post-stroke rehabilitation. In 2015 IEEE International Conference on Rehabilitation Robotics (ICORR). IEEE, 764–769.
[29] David Vernon, Serge Thill, and Tom Ziemke. 2016. The role of intention in cognitive robotics. In Toward Robotic Socially Believable Behaving Systems - Volume I. Springer, 15–27.
[30] Eric Wade, Avinash Parnandi, Ross Mead, and Maja Matarić. 2011. Socially assistive robotics for guiding motor task practice. Paladyn, Journal of Behavioral Robotics 2, 4 (2011), 218–227.
[31] Katie Winkle, Praminda Caleb-Solly, Ailie Turton, and Paul Bremner. 2018. Social robots for engagement in rehabilitative therapies: Design implications from a study with therapists. In 2018 13th ACM/IEEE International Conference on Human-Robot Interaction (HRI). IEEE, 289–297.
[32] Bryce Woodworth, Francesco Ferrari, Teofilo E Zosa, and Laurel D Riek. 2018. Preference learning in assistive robotics: Observational repeated inverse reinforcement learning. In Machine Learning for Healthcare Conference. PMLR, 420–439.
[33] Xiqian Zheng, Hiroshi Ishiguro, and Dylan F. Glas. 2019. Four memory categories to support socially-appropriate conversations in long-term HRI.
