Session 1: Motion HRI ’21, March 8–11, 2021, Boulder, CO, USA
semantic-free utterances in HRI, see [58]. But even if sound is not used to intentionally communicate in an interaction, it is always present and always impacts the listener. This section will give an overview of the HRI research investigating consequential sound and movement sonification.

2.1 Consequential sound
Several HRI studies have reported that consequential sound affected participants' perceptions. In an experiment by Trovato et al., participants reported feeling "unsafe" when hugging a robot, due to the noise emitted from its hand [50]. Inoue et al. found indicators that the motor sound of social robot Paro negatively impacted interactions [23]. Moore et al. reported a general aversion to engine noise [40]. In a study examining affective gestures and their associated servo sounds, Frid et al. showed possible problems of uncontrolled sound, stating that "sounds inherent to robot movement can communicate other affective states than those originally intended to be expressed" [17, p. 9]. This notion is supported by recent research that shows how motor sound affects the perception of characteristics like trustworthiness [39], strength and precision [40], or competence [48]. This power of sound to implicitly convey information and shape how humans perceive their environment through physical associations is well established in other disciplines, as is the practice of actively using this communication channel. In product sound design, communicated characteristics are often not noticed by the user, but subconsciously perceived. Car doors communicate safety [61], the click of a luxury lighter communicates quality [31], and bottles communicate freshness when opened [47]. In a process called "active sound design" (ASD), firms like BMW and Renault have added recorded and synthesised sound material to their cars' engine sound to enhance the user experience. In electric vehicles such as the Jaguar I-Pace, ASD has moved from augmenting existing engine sound to fully replacing it. In the case of autonomous vehicles, ASD can additionally provide a communication channel for information usually conveyed by the driver [38]. There are instances where auditory cues directly influence the perception of other senses. In an experiment by Jousmäki and Hari, participants rubbing their hands together reported having "dryer hands" when the auditory feedback to their movements had a brighter spectrum [28]. Watanabe and Shimojo showed how the movement of two objects that could be visually interpreted as either "bouncing off" or "passing through" each other was reliably categorised depending on which sound they were paired with [54].

Working towards using consequential sound as a design space in HRI, Frid et al. suggest masking or enhancing consequential sound [17]. They suggest using what Tünnermann et al. define as blended sonification: the process of changing the perception of a sound event by adding sound elements that contain additional perceptual information in a way that preserves the coherence of the original sound [52]. This proposed masking was implemented in a study by Trovato et al., who blended simulated consequential sound with a musical soundscape [51]. They investigated whether this enhancement of robotic sound would influence how close participants would comfortably approach a robot. They found no significant impact on participant distance, but reported that negative impressions of the robot resulting from the untreated sound condition were absent when the sound was masked. An early example of designed consequential sound can be found in Johannsen's work [24–26]. He combined his musical utterances with recordings from various robots to make use of the consequential sounds' ability to inform users about the robot's location. The recordings were subsequently filtered and pitched to better fit with the musical material. Another application of designed consequential sound was presented by Cha and colleagues, who processed audio recordings of the TurtleBot servo motors and integrated them into a robotic agent to perform a collaborative task with a human [9]. They found that adding a masking noise to the robot's sound profile made it easier to localize and less disturbing.

2.2 Movement Sonification
Kramer et al. broadly define sonification as the use of non-speech audio to convey information [30]. While the term movement sonification is commonly used in the context of making human movement audible [13], we will define it here as designing sound as an accompaniment to robot movement with the aim of conveying information. Few studies have examined the sonification of expressive robotic movement, although it is an active area of research in the SONAO project [17]. Dahl et al. investigated connections between movement quality and sound characteristics by analysing how trained musicians sonify robotic movement [12]. This work later resulted in a sound synthesis framework [4]. Latupeirissa et al. analysed the sonic profile of several robots in popular science fiction films and suggested their characteristics can be used as inspiration for the auditory communication of internal states and the sonification of gestures [34]. They later conducted a pilot study involving mainly expert musicians to investigate the effect of various movement sounds on the affective cues of robot Pepper. They found that some sound conditions shifted participants' interpretations of the affective content of the gestures [33]. None of these implementations were validated using larger-scale perceptual studies.

In the domain of semantic-free utterances, researchers have aimed to synchronize robot movement with expressions, which can be viewed as a form of movement sonification. Bramas et al. proposed a system that synchronizes robotic motion with various utterances, ranging from music extracts and sound effects to voice samples, to increase the emotional impact of affective gestures [5]. Their study did not include a user evaluation. This notion of increasing impact through synchrony can be connected to recent work by Thiessen et al., who investigated the effect of infrasound on a NAO robot's affective communication and found that it led to some emotions being perceived as more intense, despite participants not being aware of the infrasound's presence [49]. A technical implementation presented by Schwenk and Arras generated semantic-free utterances in real time, based on the movement data of their robot Daryl [46]. They used what they termed motion modulation to control the pitch and tone of the robot's utterances. This architecture allowed them to easily link expressive gestures like body, head, or ear pose with expressive utterances. Another example of a robot blending semantic-free utterances with motion is the toy robot Vector by the San Francisco-based consumer robot manufacturer Anki. Their designers filed a patent that describes enhancing the consequential
sound of their robot [56]. Their "condition-based audio techniques" described syncing sound content to Vector's movement and gesture repertoire. The sound material was additionally dependent on the robot's moods, meaning that the movement of an excited robot would sound different from the movement of a disappointed one.

3 METHODOLOGY
Taking inspiration from an experimental design by Tennent et al. [48], we designed a between-subject study to investigate the extent to which artificial movement sound (the independent variable) can influence how people perceive a robot's movement, safety, capability, and attractiveness (the dependent variables). We created three sound designs and a quiet control condition for a video depicting a robot in motion. After being randomly assigned to one of the four conditions, participants watched the video and rated the robot across the various measures in an online survey.

3.1 Task and Video Recording

3.2 Sound Design
Our practice-based sound design approach is informed by a Research through Design (RtD) methodology, an established framework within the human-computer interaction community, which enables designers to create prototypes and systems which "make research contributions based on their strength in addressing underconstrained problems" [59, p. 490]. Gaver notes its strength in creating "new, conceptually rich artefacts" [18, p. 937]. RtD is mentioned as a potentially valuable approach to HRI research in a growing number of publications [21, 22, 35, 41], with Luria et al. stating how it can "enable researchers to conduct design work that questions underlying assumptions and expands the boundaries of HRI" [36, p. 4]. HRI researchers are also beginning to draw from the insights of sound designers involved in the design of commercial robots [43].

3.2.1 Sound Design Approach. We designed three sound conditions (see Figure 2) and a control to be combined with the video footage. These are subsequently referred to as mechanical, harmonic, musical, and silent. Rather than using recordings of existing robots, we designed the conditions to cover a broad sonic spectrum, drawing from the first author's background in film sound. A multidisciplinary overview of the various concepts and methods applied in sound design for linear media can be found in [15]. As part of their ongoing investigation into the sound of robots in film, Latupeirissa and Bresin suggest three categories: inner workings, communication of movement, and expression of emotion [32]. Our sound designs exclusively focus on the former two. To communicate inner workings, we designed sounds to represent internal mechanics associated with specific actions, such as imaginary gears moving into place before the beginning of a new motion sequence. To communicate movement, we focused on start and end points of motion sequences, either accentuating them with short bursts of sound or distracting from them through gradual changes in volume and tone colour, or by not adding sound at all.

3.2.2 Sound Condition: Mechanical. The sound material in this condition consisted of recordings sourced from a sound design library, such as bike chains, clockworks, and sounds of various pneumatic equipment. The robot's movement changes were accompanied by clearly accentuated sounds of moving parts becoming aligned with each other or releasing pressure. During moments when the robot did not move, we added ticking noises to represent changes in the internal mechanics of the machine before a new motion sequence. We designed this condition to evoke associations of small, precise machinery, such as watches. While we were curious how participants would perceive the overall sound profile, we expected it to affect ratings on items relating to movement control and safety.

1 A video with all sound conditions is available in the ACM digital library.
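The accentuation-versus-masking contrast described in 3.2.1 can be illustrated with a minimal synthesis sketch. This is purely illustrative: the sample rate, the 220 Hz carrier, and the envelope constants are our own assumptions, not the sound material used in the study.

```python
import math

SR = 16_000   # sample rate in Hz (assumed for illustration)
DUR = 0.5     # length of one motion-onset sound, in seconds
N = int(SR * DUR)

def tone(i):
    """A plain 220 Hz sine carrier standing in for a motor-like tone."""
    return math.sin(2 * math.pi * 220.0 * i / SR)

# Accentuation: a short percussive burst at the motion onset,
# i.e. an instant attack followed by a fast exponential decay.
accent = [tone(i) * math.exp(-(i / SR) / 0.05) for i in range(N)]

# Masking: a gradual fade-in that distracts from the exact onset instant.
fade = [tone(i) * (i / N) for i in range(N)]

def front_energy(signal):
    """Fraction of the signal's energy contained in its first half."""
    total = sum(v * v for v in signal)
    return sum(v * v for v in signal[: len(signal) // 2]) / total

print(round(front_energy(accent), 3), round(front_energy(fade), 3))
```

The burst concentrates nearly all of its energy at the start point, while the fade pushes it toward the end of the clip, which is why the former draws attention to an onset and the latter away from it.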
3.2.3 Sound Condition: Harmonic. The sound material in this condition consists of a low, neutral pitch interval (open fifth), which is extended to a major chord as the robot approaches the camera. The arm movement is accompanied by a similar but higher-pitched sound. The harmonies were created using sine waves, resulting in a sound with reduced brightness that blended with the other sounds of the robot. The video additionally includes the sound of Fetch's wheels rolling across the wooden floor and the recorded sound of Fetch moving its arm. The robot's sound primarily uses neutral harmonic material as opposed to major or minor chords. This is based on a recent study by Moore et al., who found that the use of neutral pitch intervals to represent engine sound was generally preferred by participants [38]. The robot's movement changes were understated on the soundtrack. Changes in Fetch's direction were accompanied by fluctuations in the overall pitch of the harmony. Start and end points of the arm movement were not emphasised, but the movement sound was faded in and out. We designed this condition as a possible implementation of blended sonification, adding harmonic masking elements to an existing motor sound, and expected it to increase the ratings of some of the measures when compared to the silent condition.

3.2.4 Sound Condition: Musical. The sound material in this condition features the same low, neutral pitch interval as the harmonic condition, albeit with a more pronounced major chord accompanying the movement. It additionally contains a musical interpretation of the circular motion of turning gears or wheels, represented by a repeating melodic line that ascends in pitch when the robot moves forward, and descends in pitch when it turns around its axis. The robot's movement changes were accentuated on the soundtrack. Turning, and the beginning and end of the arm movement, were accompanied by a high-frequency clicking sound. The arm movement was especially emphasised. It was preceded by a sonic event building up to the motion, accompanied by bright harmonic elements which ended in a single note once the movement sequence was completed. We designed this condition with the aim of replacing commonly expected robot sound with musical and sound design elements. We were interested in seeing whether this clearly artificial sound would influence ratings, specifically movement, safety, and capability. We expected to see reported attractiveness increase.

3.2.5 Sound Condition: Silent. This version was used as a control and does not feature sound directly emitted by the robot. It does, however, contain various other sounds in order to make the silence of the robot convincing. The background soundscape consists of 1) the sound of Fetch's wheels quietly rolling over the wooden floor, 2) the recorded room tone - the ambience of the video recording space - consisting of noise and occasional clicks from the air conditioning unit, and 3) sounds created by the person holding the camera, such as clothes rustling and breathing.

We decided against including an additional sound condition featuring Fetch's natural sound, because its sonic profile is dominated by continuous fan noise, and while its movement sound is noticeable when listening to the physical robot, it is much less apparent on a recording. This can be attributed to the cocktail party effect, the human ability to selectively focus on sounds in the environment [3]. The most prominent example is the ability to acoustically understand a conversation partner in the loud environment of a cocktail party. Another example is an orchestra performance where the audience can focus on and thereby better hear the soloists. Listeners of audio recordings do not have this option, meaning that recorded sound can lose some of the detail present in a live context. While we could have accounted for this with microphone placement and mixing techniques, the recording would have become a design process in itself, and the natural sound condition could therefore no longer be considered an objective representation of the robot sound.

3.2.6 Integration into Video. We integrated the designed sound conditions into the video with the aim of creating the impression that all sounds associated with the robot were, in fact, emitted from the robot, and that all sounds heard in the video were recorded on-site and from the perspective of the camera. To achieve this, we took various measures. 1) We adjusted the volume and tone colour of the robot sound to correspond to its distance from the camera. 2) We recorded the room tone on-site and added it to all sound conditions. 3) We recorded the impulse response of the video recording space and used it as a reverb for all artificial sound using convolution [60, Chapter 2]. This essentially made the artificially added sound reverberate in the physical recording space. All on-site sounds were recorded using a Zoom H4n Portable Field Recorder. All sound associated with the robot was mixed to mono, and the room tone and reverb were in stereo. We normalised the integrated loudness (loudness throughout the entire video) across the mechanical, harmonic, and musical sound conditions. The considerably softer silent condition was not balanced with the others.

3.3 Measures
Participants rated their impressions and expectations of the robot across four blocks of questions. A first set of questions asked participants to rate their agreement with statements such as "I believe this robot could safely shake my hand" on a 5-point scale ranging from "strongly disagree" to "strongly agree". A second block asked participants to rate their impression of the robot along several semantic differential scale items, such as "unattractive-attractive." A third block presented participants with a list of adjectives, such as "calm" or "strained", and asked them how well the words described the robot's movement on a scale from "not at all" to "extremely well." The items within each block of questions were randomized, with the exception of a question regarding whether the video increased the viewer's stress or anxiety, which was always asked right after participants had watched the video for the first time. The items were then averaged into four measures and their consistency was tested using Cronbach's alpha. The four measures and their individual questions are shown in Table 1.

3.3.1 Movement Quality (α = .82). This measure represents the participants' impressions of the robot's movement. We took items from Tennent et al.'s "competence" and "aesthetic" measures [48] and adjusted and extended them to specifically relate to Fetch's control over its movement, both from functional (through items like "uncontrolled") and aesthetic perspectives (through items like "elegant"). Throughout its movement routine, Fetch performed sudden movements and slightly rocked back and forth when coming to a halt. By either accentuating or distracting from these movements with sound, we expected these movements to be more or
less noticed. We also expected ratings for this measure to have implications for the other measures in the survey.

3.3.2 Perceived Safety (α = .80). This measure reflects how comfortable and safe participants feel when imagining the robot in their work environment or home. We took items from Tennent et al.'s "trust" measure [48] and added two additional items, placing the measure's focus on safe interactions with people. We expected that the sound conditions would influence participants' perceived safety both positively and negatively. We thought certain sound elements might unsettle listeners or make them feel at ease, and that this would influence their perception of how safe this robot is to be around.

3.3.3 Perceived Capability (α = .86). This measure reflects the range of abilities and functionality which participants attribute to the robot after viewing the video clip. We took six items from a robot capability measure by Cha et al. [8] and hypothesised that designing the sound of the robot's internal workings to resemble precise machinery or heavy industrial equipment could influence what tasks the robot would be considered capable of.

3.3.4 Attractiveness (α = .86). Finally, we took the five items of the "attractiveness" measure by Ho and MacDorman [19] and used them to measure the participants' sentiment towards the robot. We were interested in seeing whether different sound conditions would generally improve or worsen ratings of the robot, and how these subjective ratings would relate to the other measures.

3.3.5 Additional Questions. Besides the four measures above, we asked three additional questions, shown in Table 2. Question 1 tested whether any of the elements in the sound conditions were interpreted as deliberate communication addressing the listener. The hum of the artificial motor in the harmonic condition, for example, could potentially be interpreted as a semantic-free utterance [58]. As the effect of utterances on our measures was not within the scope of this study, we wanted to ensure none of the sound conditions showed significant differences. Participants answered the question on a 5-point scale ranging from "definitely not" to "definitely yes". Question 2 asked how relevant participants considered the robot's appearance, sound, and movement for their ratings. We expected there to be no differences in the reported influence of movement and appearance across the sound conditions, but were interested in how influential the different sound conditions were rated. We also expected the absence of sound in the silent condition to be reflected in this metric. Participants answered each item on a 5-point scale ranging from "not at all" to "a great deal". Question 3 tested how aware participants were of the presence of artificial sound. We again expected the silent condition to receive the lowest ratings, and wanted to see whether our intention to design the harmonic condition as realistic was reflected in this rating. Participants answered the question on a 5-point scale ranging from "definitely not" to "definitely yes". Additionally, our survey asked participants to give context to their ratings using written text and to describe the sound of the robot.

3.4 Study Procedure
We designed a survey² using the Qualtrics survey software and recruited participants via the Prolific online platform. This platform is similar to Amazon Mechanical Turk (AMT), which is widely used in studies on human-robot interaction, and its results have been reported to match those of laboratory studies well [10, 27, 37]. Prolific provides high-quality data to researchers through its dedicated focus on research studies [42]. Both AMT and Prolific have been used in previous studies on the topic of robot sound [38–40, 48]. After signing up for the survey, participants were directed to an information statement. They were instructed to wear headphones, but not made aware of the study's specific focus on sound. Rather, they were asked to assess the robot's behaviour. This was done to avoid participants considering all questions exclusively in relation to the sound they were hearing, thereby potentially skewing the results. To test whether participants had functioning audio, they were asked to type in a spoken number on a test video's audio track. Even though more elaborate ways to test for headphone use are available [57], we considered this type of testing sufficient, as the sound conditions were considerably different from each other. After the audio test, participants were randomly assigned to one of the four sound conditions for the remainder of the survey and presented with the questions. They were encouraged to revisit the video if necessary. Towards the end of the survey, participants were asked whether they had impaired hearing, whether they had used headphones as requested, and how familiar they were with electronics, robots, and Fetch in particular. Finally, they were presented with a debriefing explaining the study's specific focus on sound and

2 The survey can be found at https://unsw.au1.qualtrics.com/jfe/form/SV_86Tn7dfp8WOdAyN

Table 2: Additional Questions
1) Do you have the impression the robot tried to communicate something to you?
2) Please indicate how much the following robot characteristics influenced your answers:
   The way it looks.
   The way it sounds.
   The way it moves.
3) Did you have the impression that artificial sound was added to the robot's motor sound?
given the opportunity to revoke their participation without affecting their payment. The procedure was approved by the institutional review board of UNSW.

3.5 Participants
Based on a sample size calculation informed by Tennent et al. [48], we recruited 240 participants (37.5% male, 62.5% female, M = 36 years, SD = 11 years) on the Prolific platform. Participants were nationals of either the United Kingdom or the United States and spoke English as a first language. They were paid £0.75 for the task, which was estimated to take six minutes. All participants had an approval rating of 95% or higher and had taken part in at least 100 studies on the platform. The participants took on average 6.3 minutes to complete the survey, with a minimum of 2.5 minutes and a maximum of 35. No significant differences in age and prior experience with electronics, robotics, and Fetch in particular were reported across sound conditions, and experience with robotics and Fetch was generally low. One participant revoked their consent after the debriefing and was removed from the data set prior to analysis. Three participants incorrectly answered the audio test question at the beginning of the survey, nine participants reported impaired hearing, and 21 reported not wearing headphones during the study. These participants were excluded from analysis. The final group sizes were mechanical: 50, harmonic: 53, musical: 53, and silent: 50.

4 RESULTS

4.1 Statistical Methods
Statistical analysis was done in R version 4.0.2 [11].³ To arrive at the measures movement quality, perceived safety, perceived capability, and attractiveness, we first reversed the ratings of the four questions describing negative impressions (marked with "(-)" in Table 1). With every item then ranging from 1, negative, to 5, positive, each measure's individual items were averaged. The measures' consistency was calculated using cronbach() from the psy package [14]. We conducted one-way between-subjects ANOVAs to compare the effect of the four sound conditions on the measures. These were performed using aov() from the R base package. Where the ANOVA showed a significant effect (p<.05), we used the TukeyHSD() test [1] to adjust p-values for multiple comparisons and see relationships between the individual sound conditions. Homogeneity of variance was tested with Levene's test using leveneTest() from the car package [16], and normality was confirmed using the Shapiro-Wilk test. Bar plots showing means with error bars representing ±1 standard error were created using ggplot2 [55]. Y axes have a theoretical range of 1 to 5, but have been limited to improve readability. Significance levels are highlighted as stars: * indicates p < .05, ** indicates p < .01, and *** indicates p < .001.

4.2 Measures
4.2.1 Movement Quality. Figure 3 shows the ratings of movement quality across sound conditions. There was a significant effect of sound on reported movement quality at the p<.05 level for the four conditions [F(3, 202) = 11.83, p < 0.001]. Post-hoc comparisons using the Tukey-HSD test indicated that the mean rating for the musical condition (M = 3.57, SD = 0.586) was significantly higher than both the silent condition (M = 3.21, SD = 0.636) and the mechanical condition (M = 2.82, SD = 0.629). Ratings of the mechanical condition were significantly lower than the harmonic (M = 3.26, SD = 0.685) and silent conditions. The harmonic condition was not significantly different from the silent and musical conditions. Taken together, compared to the silent robot, the mechanical sound decreased, the musical sound increased, and the harmonic sound did not affect movement quality. The effect size was η² = 0.149.

4.2.2 Perceived Safety. Figure 4 shows the ratings of perceived safety in each sound condition. There was a significant effect of sound on perceived safety at the p<.05 level for the four conditions [F(3, 202) = 4.048, p = 0.008]. Post-hoc comparisons using the Tukey-HSD test indicated that the mean rating for the mechanical condition (M = 2.91, SD = 0.752) was significantly lower than both the musical condition (M = 3.37, SD = 0.747) and the silent condition (M = 3.35, SD = 0.780), but not significantly lower than the harmonic condition (M = 3.29, SD = 0.745). The harmonic, musical, and silent conditions were not significantly different from each other. In summary, compared to the silent robot, the mechanical sound decreased perceived safety, while the harmonic and musical sound conditions did not affect it. The effect size was η² = 0.057.

4.2.3 Perceived Capability. Figure 5 shows the ratings of perceived capability across the four sound conditions mechanical (M = 2.44, SD = 0.709), harmonic (M = 2.75, SD = 0.953), musical (M = 2.72, SD = 0.838), and silent (M = 2.67, SD = 0.914). The ANOVA showed no significant effect of sound on perceived capability [F(3, 202) = 1.357, p = 0.257]. None of the individual items making up the perceived capability measure (see Table 1) significantly differed across sound conditions either. The effect size was η² = 0.019.

4.2.4 Attractiveness. Figure 6 shows the ratings of attractiveness in each sound condition. There was a significant effect of sound on reported attractiveness at the p<.05 level for the four conditions [F(3, 202) = 4.761, p = 0.003]. Post-hoc comparisons using the Tukey-HSD test indicated that the mean rating for the musical condition (M = 3.46, SD = 0.717) was significantly higher than both the mechanical (M = 2.99, SD = 0.761) and the harmonic (M = 3.01, SD = 0.780) conditions. Ratings of the silent condition (M = 3.22, SD = 0.658) did not significantly differ from any of the three other conditions. The harmonic and mechanical conditions were not significantly different either. Taken together, the robot was rated more attractive when paired with the musical sound than when paired with either the mechanical or harmonic sound. However, when using the silent condition as a baseline, none of the sounds significantly impacted the participants' ratings of the robot's attractiveness. The effect size was η² = 0.066.

4.2.5 Additional Questions. Analysing answers to the questions shown in Table 2, we found the following results:
1) The reported presence of deliberate communication from the robot was not significantly different across the sound conditions [F(3, 202) = 2.243, p = 0.0845], leading us to conclude that no sound elements were identified as utterances to a significant degree.
2) The reported influence of robot movement on ratings was not significantly different across the sound conditions [F(3, 202)

3 Survey data and R file are available in the ACM digital library.
Figure 3: Compared to the silent condition, the mechanical sound decreased and the musical sound increased movement quality ratings.

Figure 4: Perceived safety in the mechanical sound condition was rated significantly lower than in the silent and musical conditions.

Figure 5: Participants reported no significant differences in perceived robot capability across the four sound conditions.

Figure 6: The robot was rated more attractive when paired with the musical condition than when paired with the mechanical or harmonic sound.

Figure 7: Participants reported no differences in how much sound affected their impressions across the three designed sound conditions. The silent condition was rated significantly less impactful.

Figure 8: The mechanical and musical conditions were both identified as significantly more artificial than the harmonic and silent conditions.
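The analysis pipeline of Section 4.1 (reverse-coded items averaged into measures, internal consistency via Cronbach's alpha, one-way ANOVA followed by Tukey-HSD post-hocs, with Levene and Shapiro-Wilk assumption checks) was run in R using cronbach(), aov(), TukeyHSD(), and leveneTest(). The sketch below is a rough Python equivalent of that procedure using SciPy; every number in it is synthetic and hypothetical, none of it is the study's data or code.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# 1) Reverse-code negatively worded items (marked "(-)" in Table 1)
#    so every item runs from 1 (negative) to 5 (positive): r -> 6 - r.
reversed_item = 6 - np.array([1, 2, 5])   # -> [5, 4, 1]

# 2) Internal consistency of a measure (Cronbach's alpha).
def cronbach_alpha(items):
    """items: participants x questionnaire-items matrix of ratings."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

latent = rng.normal(3.0, 0.6, 200)   # a shared "true" attitude per participant
items = np.column_stack([latent + rng.normal(0, 0.4, 200) for _ in range(5)])
alpha = cronbach_alpha(items)        # high, since the items cohere

# 3) Synthetic per-participant measure scores in four groups
#    (group sizes mirror the paper; the means and SDs are made up).
samples = [
    rng.normal(2.80, 0.65, 50),   # "mechanical"
    rng.normal(3.25, 0.65, 53),   # "harmonic"
    rng.normal(3.55, 0.65, 53),   # "musical"
    rng.normal(3.20, 0.65, 50),   # "silent"
]

# 4) Assumption checks: homogeneity of variance, normality of residuals.
_, levene_p = stats.levene(*samples)
residuals = np.concatenate([g - g.mean() for g in samples])
_, shapiro_p = stats.shapiro(residuals)

# 5) One-way between-subjects ANOVA, then Tukey-HSD post-hoc comparisons.
f_stat, anova_p = stats.f_oneway(*samples)
posthoc = stats.tukey_hsd(*samples)   # 4x4 matrix of adjusted p-values
print(f"alpha = {alpha:.2f}, F(3, {sum(map(len, samples)) - 4}) = {f_stat:.2f}")
```

The design choice mirrors the paper's reporting: the omnibus ANOVA decides whether any condition differs, and the Tukey-HSD result's pairwise p-values support statements such as "the mechanical condition was rated significantly lower than the musical condition".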
= 0.825, p = 0.482]. Neither was the reported influence of robot appearance [F(3, 202) = 1.604, p = 0.19]. There was a significant effect of sound on the reported influence of sound on the participants' ratings (shown in Figure 7) at the p<.05 level for the four conditions [F(3, 202) = 6.378, p < 0.001]. Post-hoc comparisons using the Tukey-HSD test indicated that the mean rating for the silent condition (M = 2.06, SD = 1.11) was significantly lower than the mechanical (M = 2.94, SD = 1.35), harmonic (M = 2.81, SD = 1.21), and musical (M = 3.00, SD = 1.22) conditions. Those three designed sound conditions showed no significant differences among each other. In summary, the reported influence of the robot's movement, appearance, and sound on participants' ratings was not affected by the sound conditions, except in the case of the silent condition, where the sound was reported as significantly less influential. This met our expectations, as the silent condition was designed not to feature any sounds emitted by the robot, despite the wide range of background sounds present.
3) The presence of artificial sound as reported by participants is shown in Figure 8. As expected, there was a significant effect of sound condition: the mechanical and musical conditions were rated as significantly more artificial than the harmonic and silent conditions. The mechanical and musical conditions were not significantly different from each other, and neither were the harmonic and silent conditions. In summary, while the sounds in the musical and mechanical conditions were more likely to be identified as artificial, the ratings for the harmonic condition did not significantly differ from the silent condition, leading us to conclude that the harmonic condition was convincing as the robot's natural sound.

5 DISCUSSION
Our various sound conditions both increased and decreased the participants' ratings across some of the measures. Two thoroughly designed and unrealistic sound conditions, mechanical and musical, showed significant differences to the silent condition. The most notable differences were in the participants' ratings of the robot's movement, which were both positively and negatively affected. Coupling the same movement with different sounds led to the motions being rated as more or less precise, elegant, jerky, or uncontrolled, among others. However, these impressions of the robot's movement had few implications for the more functional
the p<.05 level for the four conditions [F(3, 202) = 13.72, p < 0.001]. ratings of perceived safety and capability. Looking at these results
Post-hoc comparisons using the Tukey-HSD test indicated that the in relation to a prior study by Tennent et al. [48], which explored
mean ratings for the musical (M = 3.55, SD = 1.20) and mechanical the effects of servo motor sound on similar measures, we can make
(M = 3.28, SD = 0.991) conditions were both significantly higher the following comparisons:
than those for the harmonic (M = 2.74, SD = 0.964) and silent 1) While the presence of sound generally decreased ratings across
(M = 2.4, SD = 0.808) conditions. The mechanical and musical their measures, this was not the case with our artificial sound con-
conditions were not significantly different from each other, and ditions, as two them, harmonic and musical, were never rated
below the silent condition in any of the measures. This indicates
that while the unprocessed sound of servo motors tends to lead to lower ratings, our artificial sound conditions are at least on par with silence, both concerning functional measures like perceived capability and safety, as well as less functional measures like movement quality and attractiveness.
2) Given that the subtle differences in the servo motor sounds used by Tennent et al. still resulted in some demonstrated effects, we expected the broader sonic range of our artificial sound conditions to result in more pronounced differences in ratings. While we found interesting effects on reported quality of movement, we found little difference in perceived safety and no differences in perceived capability. This suggests that sound plays a lesser role in the assessment of purely functional aspects ("Can this robot prepare a meal?"). When looking at the participants' comments on the sound, we found that while descriptions of the mechanical sound were generally similar ("clicky", "mechanical", etc.), descriptions of the musical condition were not. Participants described the latter as both "soothing" and "annoying," as "artificial, sinister" and "charming," and as "pleasant" and "unpleasant." This leads us to believe that while bolder creative approaches to robot movement sound can result in more pronounced differences in ratings, they also introduce the personal taste of the listener. One could then hypothesise that carefully removing potentially controversial elements from the musical condition would result in higher ratings. However, the harmonic condition, containing similar harmonic material but none of the more adventurous elements, was not rated differently to the silent condition in any of the four measures. We believe this can be addressed by working towards systems that learn to take personal preference into account, dynamically adjusting to individuals and context, rather than trying to please everyone through a common denominator. Viewing the ratings of the harmonic sound in the context of blended sonification [52], we found that the subtle (see Figure 8) addition of harmonic material to the robot's movement had no significant effect. This leads us to believe that it is unlikely to affect these measures beyond masking problematic consequential sound, making it a solution to a problem which would be better solved by preventing intrusive motor noise during manufacturing.
While our findings highlight the complexity of the design space for artificial movement sound, we see them as a promising step towards an implicit channel of communication that, when combined with speech or utterances, can provide robotic agents with a more versatile sonic language. We believe that embodiment- and context-dependent movement sound has the potential to become a rich and nuanced modality that may eventually be able to selectively target specific characteristics, helping designers create more refined and engaging robot behaviour.

and while the volume across sound conditions was normalised, we could not specify the headphone volume of individual participants. 2) Considering that participants saw a 20-second video clip of the robot, they had only a small window of opportunity to gather an impression. While their ratings can give early indicators of how robots with artificial movement sound are perceived, further studies involving extended periods of physical or virtual [2] interaction with the robot are needed to better understand these effects. 3) Prior HRI studies have found that the impressions caused by robot sound can be affected by external factors, such as context [44] and embodiment [29]. Tennent et al., for example, found that the presence or absence of a human in their scenario changed participants' impressions of the robot [48]. 4) Sound localisation [45] and other sound sources in the environment are also likely to introduce additional variables in a real-world implementation. Given the extensive possibilities of sound design approaches and the variety of qualitative comments we received, further work is required to identify which specific sound elements influence perception across the dimensions presented in this paper.

7 CONCLUSION
This study gives insight into the effect of artificial movement sound on robot perception. To our knowledge, it is the first HRI study that utilises a variety of artificially synthesised sounds to investigate their influence on people's perceptions of robot movements, safety, capability, and attractiveness. We found that our sound conditions both increased and decreased how participants perceived the robot's movement. We further found that although movement-related ratings like "jerky," "precise," and "uncontrolled" differed significantly across sound conditions, these impressions had little consequence for the robot's perceived safety and capability. Sound conditions that departed further from sounds commonly associated with robots caused significant differences in ratings, while the more realistic sound condition did not affect ratings compared to the silent control. This suggests that musical approaches to robotic motion sound have the potential to positively impact how people perceive robotic agents. Subtly harmonic, or entirely non-musical, approaches were found not to be preferable to silence. While the associations made by participants exposed to the different sound conditions proved hard to predict and, at times, contradictory, we found that artificial movement sound has the potential to affect how humans perceive robotic agents. With further study, this sound-based modality may eventually provide a refined, implicit channel of communication that enriches interactions and enables robotic agents to more seamlessly integrate into human environments.
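The analysis reported above (a one-way ANOVA across the four sound conditions, followed by Tukey-HSD post-hoc comparisons) can be sketched as follows. This is an illustrative Python sketch, not the study's actual analysis, which was carried out in R: the ratings are simulated from the means and standard deviations reported for the influence-of-sound measure (Figure 7), and the group size of 52 per condition is an assumption consistent with df = (3, 202).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulated ratings for the four sound conditions. Means and SDs follow
# the values reported in the text; the group size of 52 is assumed.
groups = {
    "mechanical": rng.normal(2.94, 1.35, 52),
    "harmonic":   rng.normal(2.81, 1.21, 52),
    "musical":    rng.normal(3.00, 1.22, 52),
    "silent":     rng.normal(2.06, 1.11, 52),
}

# Omnibus test: does sound condition affect the ratings at all?
f_stat, p_val = stats.f_oneway(*groups.values())
print(f"ANOVA: F = {f_stat:.2f}, p = {p_val:.4f}")

# Tukey-HSD post-hoc comparisons with family-wise error control,
# identifying which pairs of conditions differ.
tukey = stats.tukey_hsd(*groups.values())
names = list(groups)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        print(f"{names[i]} vs {names[j]}: p = {tukey.pvalue[i, j]:.4f}")
```

With real data, the reported F-statistics and pairwise significance levels would be read off these two results in the same way.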
[48] Hamish Tennent, Dylan Moore, Malte Jung, and Wendy Ju. 2017. Good vibrations: How consequential sounds affect perception of robotic arms. In 2017 26th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN). IEEE, Lisbon. https://doi.org/10.1109/ROMAN.2017.8172414
[49] Raquel Thiessen, Daniel J Rea, Diljot S Garcha, Cheng Cheng, and James E Young. 2019. Infrasound for HRI: A Robot Using Low-Frequency Vibrations to Impact How People Perceive its Actions. In 2019 14th ACM/IEEE International Conference on Human-Robot Interaction (HRI). IEEE.
[50] Gabriele Trovato, Martin Do, Ömer Terlemez, Christian Mandery, Hiroyuki Ishii, Nadia Bianchi-Berthouze, Tamim Asfour, and Atsuo Takanishi. 2016. Is hugging a robot weird? Investigating the influence of robot appearance on users' perception of hugging. In 2016 IEEE-RAS 16th International Conference on Humanoid Robots (Humanoids). IEEE.
[51] Gabriele Trovato, Renato Paredes, Javier Balvin, Francisco Cuellar, Nicolai Baek Thomsen, Soren Bech, and Zheng-Hua Tan. 2018. The Sound or Silence: Investigating the Influence of Robot Noise on Proxemics. In 2018 27th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN). IEEE, Nanjing. https://doi.org/10.1109/ROMAN.2018.8525795
[52] René Tünnermann, Jan Hammerschmidt, and Thomas Hermann. 2013. Blended sonification: sonification for casual information interaction. Georgia Institute of Technology.
[53] René Van Egmond. 2008. The experience of product sounds. In Product Experience. Elsevier.
[54] Katsumi Watanabe and Shinsuke Shimojo. 2001. When Sound Affects Vision: Effects of Auditory Grouping on Visual Motion Perception. Psychological Science 12, 2 (March 2001), 109–116. https://doi.org/10.1111/1467-9280.00319
[55] Hadley Wickham. 2016. ggplot2: Elegant Graphics for Data Analysis. Springer.
[56] Jason Wolford, Ben Gabaldon, Jordan Rivas, and Brian Min. 2019. Condition-Based Robot Audio Techniques. Google Patents.
[57] Kevin J. P. Woods, Max H. Siegel, James Traer, and Josh H. McDermott. 2017. Headphone screening to facilitate web-based auditory experiments. Attention, Perception, & Psychophysics 79, 7 (Oct. 2017). https://doi.org/10.3758/s13414-017-1361-2
[58] Selma Yilmazyildiz, Robin Read, Tony Belpaeme, and Werner Verhelst. 2016. Review of Semantic-Free Utterances in Social Human–Robot Interaction. International Journal of Human-Computer Interaction 32, 1 (Jan. 2016). https://doi.org/10.1080/10447318.2015.1093856
[59] John Zimmerman, Jodi Forlizzi, and Shelley Evenson. 2007. Research through design as a method for interaction design research in HCI. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems - CHI '07. ACM Press, San Jose, California, USA. https://doi.org/10.1145/1240624.1240704
[60] Udo Zölzer. 2011. DAFX: Digital Audio Effects. John Wiley & Sons. https://www.dafx.de/DAFX_Book_Page_2nd_edition/index.html
[61] Elif Özcan and René van Egmond. 2006. Product sound design and application: An overview. In Proceedings of the Fifth International Conference on Design and Emotion, Gothenburg.