

Empathic AI Sound Interfaces
Applying Emotional Artificial Intelligence
On Sound Interfaces

University of Otago
Thank you to all the experts for your participation. Without your feedback and
input, this report would not have been possible.

I would also like to mention two fantastic books that inspired my thoughts:
1. Emotional AI: The Rise of Empathic Media by Andrew McStay
2. Brand Machines, Sensory Media and Calculative Culture by Sven
Brodmerkel and Nicholas Carah.


Natasha Joe Dr. Roel Wijland

Table of Contents
Your roadmap.

Methodology
Context
Emily’s Pains
Emily’s Goals
Desires & Worries of EAI
Assertions
UI & UX Implications
Methodology
Data-driven & expert-informed intelligence.

Iterative Research
Gen Z’s habits were researched and located amidst the rich potential of soundscapes. With advancements in AI, EAI’s application could solve content delivery through sensing moods, habits and developing intimacy.

Performative Research
A persona, Emily, was created to encapsulate the pains and the goals of the Gen Zs. This established the context of the study to the experts.

The Modified eDelphi Method
Purposefully seeking UI, UX, sound industry and EAI experts to discuss future innovative developments in sound interfaces for the Gen Zs.

Filtration of Data
Assertions were developed from a Thematic Analysis of the interviews, and an iterative process of verifying the experts’ agreement with the assertions took place.

UI and UX Recommendations
A total of 9 recommendations were formulated based on the themes - moods, habits and intimacy.

Context
Iterative research revealed an opportunity for UI and UX designers and innovators to reinvent sound consumption.

The Gen Zs and their unhealthy relationship with media...

The Gen Zs interact with media every day, where scrolling, watching and binging are just natural for them. However, current media systems circulate content based on past content behaviour - mostly predictable and irrelevant - and the Gen Zs, coined the Mūdies, are satiated with screens and experience diminishing returns.

But, wait! What about sound interaction?

Sound content frees the Mūdies’ eyes and provides freedom from the screen. Sound has a lower production cost, enabling more creative and innovative content. In an environment where the Mūdies are increasingly pressured to multitask, sound offers mobility with wireless technology. They will be surrounded by rich and intimate content in the most invisible way; free of cords but tucked away in one’s ears.

The rich soundscape is underutilised...

The convergence of sound media and the nature of sound interaction show the potential for interface development. As Spotify’s CEO and founder, Daniel Ek, stated, ‘What I didn’t know when we launched to consumers in 2008 was that audio — not just music — would be the future of Spotify’. The development of a new way to deliver sound content is warranted.

So the integration of EAI could reinvent how Gen Z listens to sound

Advancements in AI have burgeoned the idea of EAI, whose application allows real-time feedback and the collision of virtual and physical experiences. This means that EAI takes note of contextual factors - heart rate, number of steps, location, moods and so on - for the personalisation and discovery of sound content.
The integration of EAI on sound interfaces would awaken intuitive capabilities in content delivery: sensing moods in the moment, intervening in predictable media habits, and eventually cultivating intimacy between the user and the interface.

Solo, an emotional radio by the creative studio Uniform, changes music by responding to a person’s facial expressions.
Emily’s Pains
Emily was created to represent the pains of Gen Z aka the Mūdies.
She appeared throughout the expert consultation highlighting the
pains and the goals of the Mūdies.

Emily has lost count of how many times she has scrolled through Facebook. She is tired and should head to bed but continues to watch TikTok videos and cute animal clips.

Like Emily, the Gen Zs’ routine revolves around the screen. The algorithms curate content based on content which was chosen in the past. This limits the discovery of new content and keeps them hooked on the old. We don’t know how much we are missing.

Emily is bored and frustrated as she feels like she wastes so much time looking for good content. She gives up and binges on a show she loved during her youth: Vampire Diaries.

Emily’s Goals

Emily tunes into sound - anywhere and anytime. She listens to other people’s stories with real voices, making her feel something - even on the...

Allow the discovery of new sound content for the moment and a healthier relationship with media. There is so much undiscovered music, radio and podcasts. Like your own moods, which are always changing. Through curating sound content based on your moods, it helps you escape from your past media consumption habits and discover new sound content.
Emily connects back to reality and realises how much she let the screen control her.

Desires & Worries of EAI
The main forces that impacted the formation of assertions
throughout the expert consultation.

The consultation revealed several desires and concerns about the application
of EAI on sound interfaces. On the vertical axis, factors that were desired
were placed higher on the scale and vice versa. On the horizontal axis, factors were grouped categorically into the three overarching themes of moods, habits and intimacy, though some factors overlapped. A larger font size indicates that a factor was discussed by more experts relative to the rest.


Desires (placed higher): Hands-free, Personalised Content, Unexpected Serendipity, Made for You, Discovery of New Genres.
Worries (placed lower): Unnecessary Interruptions, Invasiveness, Lack of Control.


Assertions
After three rounds of iterative consultation, ten assertions were agreed upon by the experts.

Concurrently, 83% of experts expressed that current sound interfaces do not recommend content outside of their comfort zone. This shows the need to reinvent the delivery of sound content; the following assertions were utilised to formulate recommendations.

EAI should use both passive (e.g. predefined playlists) and active ways (e.g. text,
voice, emojis) to sense your moods in the moment.

If EAI interacts with consumer moods, it would allow for greater personalisation of
content than current AI systems.

Initially, I would be all right with informing EAI my moods, as I will experience the
benefits later.

It is more enjoyable if EAI is designed for randomness & spontaneity to break predictable media habits.
If my listening habits are challenged, EAI should hint at the amount of sound content not yet discovered (e.g. ‘Check out the 1908 people who are feeling lonely’ or ‘Curious? Hear about how families deal with illnesses’).

EAI should be forgiven when it makes mistakes.

If EAI is trustworthy (e.g. data managed on the device instead of the cloud), I
would allow access to my instantaneous emotions.

If EAI responds with personality, it should only be centered around sound content
(e.g. the EAI asks, ‘Heya Natasha, wanna listen to something local?’).

Besides EAI responding about sound content, the only other acceptable responses are about my well-being (e.g. ’It has been a good 3 hours, take a break’).
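The first assertion’s active sensing could work roughly as follows - a minimal Python sketch where the mood vocabulary and the emoji/keyword mappings are illustrative assumptions, not findings from the consultation:

```python
# Sketch of *active* mood sensing: the user volunteers a signal
# (an emoji or short text) and the interface maps it to a mood label.
# Passive signals (playlists, heart rate, location) would fill the gaps.
EMOJI_MOODS = {"😊": "happy", "😴": "tired", "😢": "sad", "😤": "stressed"}
KEYWORD_MOODS = {"tired": "tired", "great": "happy", "lonely": "sad", "busy": "stressed"}

def sense_mood(signal):
    if signal in EMOJI_MOODS:  # emoji input
        return EMOJI_MOODS[signal]
    lowered = signal.lower()
    for keyword, mood in KEYWORD_MOODS.items():  # free-text input
        if keyword in lowered:
            return mood
    return "neutral"  # no active signal recognised

print(sense_mood("😴"))                          # tired
print(sense_mood("Feeling great after my run"))  # happy
```

A production interface would replace the keyword table with a proper sentiment model; the point is only that active input is cheap to capture in the moment.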

‘…if you’re able to label yourself, if you’re able to annotate,
you’re able to actively engage with the construction of
your media services.’

‘Like if I have to stop and interact with it, there’s really no point because there’s not much difference between that and Spotify right now’.

‘Cause I am a big proponent of the whole new AI kind of thing where we don’t have to interact with our phones. It is not how the user explicitly says that I’m at work...there is a physiological change without the person even realising’.

‘You could definitely have the EAI play maybe a few seconds of
different pieces of content? And, it could maybe ask Emily what she
thinks of this?...And you get this kind of whole interaction without
the screen.’

‘I think what’s lacking with Google Home and Amazon’s Alexa is the
inability to predict mood and suggest options/encourage/listen...
when we can’t connect with humans, we will find some way to
connect with our devices.’

Check out what the experts said during our conversations.

Emily in her comfortable albeit slightly mundane bubble.

UI & UX Implications
Implications for UI and UX designers of sound interface
brands and brands that integrate sound capabilities.

For the empathic sound interface to meet Emily’s goals, it needs to interact
with her moods to curate content for her moment.

1 Allow real-time annotation of moods

Build mechanisms to label and respond to the content that EAI curates through mood annotation. This will increase the accuracy of content curation over time through matching the user’s moods to media.

2 But prioritise user-control

Provide opportunities for both implicit and explicit interaction with the interface for the curation of content. Implicit interaction allows content curation without touching one’s device. Explicit interaction allows the user to direct one’s listening experience and provide feedback to the UI.

3 And train the UI

Provide ways for the user to train the interface for a deeper understanding of the user’s moods.
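Training the interface could be sketched as a simple online update - here an exponential moving average over hypothetical (mood, genre) affinities; a real interface would use a richer model, but the confirm/correct loop is the same.

```python
class MoodAffinityModel:
    """Sketch: the user 'trains' the interface by confirming or
    correcting its suggestions; an exponential moving average deepens
    the mapping from moods to content over time."""

    def __init__(self, learning_rate=0.2):
        self.lr = learning_rate
        self.affinity = {}  # (mood, genre) -> weight in [0, 1]

    def train(self, mood, genre, feedback):
        # feedback: 1.0 if the user confirms the suggestion fit the
        # mood, 0.0 if they correct it.
        key = (mood, genre)
        old = self.affinity.get(key, 0.5)  # neutral prior
        self.affinity[key] = old + self.lr * (feedback - old)

    def best_genre(self, mood, genres):
        return max(genres, key=lambda g: self.affinity.get((mood, g), 0.5))

model = MoodAffinityModel(learning_rate=0.5)
model.train("tired", "lofi", 1.0)   # user confirms: lofi fit "tired"
model.train("tired", "metal", 0.0)  # user corrects: metal did not
print(model.best_genre("tired", ["lofi", "metal"]))  # lofi
```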

‘So I think for me, what would annoy me is if
there were suggestions that seemed arbitrary.’

‘But some sort of tracking maybe, like what kind of music or podcast led you to get into this certain mood or mindset? And then kind of helping you realize those loops you find yourself in.’

‘I’ve seen AI at work and it can understand things that people don’t see and could it proactively start changing my music to positive music because I start spiralling into depression, for example, based on things we don’t realize, like heart rate, cholesterol levels or... before they even reached that stage of being sad…could AI find some kind of magical set of data that could stop that from occurring?’

‘I like it when my sound preferences are challenged, especially when it evokes a similar feeling within a different genre.’


Content plays a huge part in our lives, like in the recent Joker
movie. More needs to be done in the UI and UX of these systems.

To help Emily consistently discover new content, the empathic AI sound interface needs to intervene in her listening habits for serendipitous encounters. From our research, there are three traits of serendipity: randomness, unexpectedness and emotivity.

4 Limit random content suggestions

In most use cases, minimise random suggestions, as they may be annoying to the user.

5 But make interventions spontaneous

Keep the user wanting more through spontaneous voice interaction
and nudges that hint at the discovery of new content.

6 And focus on mood-management

Build the UI to draw connections between past moods and content curation for emotive interventions, aka to place the user in a better mood.
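One hedged reading of this recommendation: credit content that coincided with a lift in the user’s annotated mood, so the UI can surface the ‘loops’ one expert described. The data shape and scoring below are assumptions for illustration.

```python
def mood_lifting_content(history):
    """Sketch: history is a chronological list of (content_id, mood_score)
    pairs, where mood_score is the user's annotated mood after listening
    (higher = better). Credit content that coincided with a mood lift,
    so emotive interventions can draw on what has worked before."""
    lifts = {}
    prev_score = None
    for content_id, score in history:
        if prev_score is not None and score > prev_score:
            lifts[content_id] = lifts.get(content_id, 0) + 1
        prev_score = score
    return lifts

history = [("news:1", 2), ("track:lofi", 4), ("news:2", 1), ("track:lofi", 3)]
print(mood_lifting_content(history))  # {'track:lofi': 2}
```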

Emily binge-watching and eating at night ;).

‘There is much more interest in edge based computing which
pushes the processing back out to devices themselves so if you
push the process back to device, you’ve got a lot more control
over where the data goes.’

‘In order for like EAI to work, it senses your mood and has
access to like, your heart rate, your facial recognition, and
expressions and your location, all that to actually seem like
smart and empathetic to your situation. But how much are
consumers willing to give?’

‘Putting it (Sony’s robot dog, AIBO) in the context of a dog allows them to test things well. And be forgiven when things do not go right. Um, it’s almost like a Trojan horse into your heart because it’s so cute. You can sort of start to use technology in that context.’

‘I think that’s kind of a cute, like a human interaction that would be kind of like a nice novelty whereas right now, it’s kind of like no matter what you’re doing, the responses are always kind of pre-canned, that don’t really elicit an emotional response either way.’


Mr Humfreez, created by ANZ: a cute sheep which changes colour to monitor humidity and temperature, showing the delight of design.

For Emily to reveal intimate data to the empathic sound interface, there needs
to be trust in the brand and the overall listening experience.

7 Reassure the user

If the UI is trained locally, the user would be more prone to reveal intimate data.
8 And inculcate personal trust

Imbue EAI’s interactions with personality especially if it relates to
sound content or the user’s well-being.

9 To reinforce positive listening experiences

Cumulatively reinforce positive experiences, taking the user on an emotive and personalised listening adventure.

University of Otago
Thank you again to all the experts for your participation.
Natasha Joe Dr. Roel Wijland

find out more@

This Mūdie broke out of the listening bubble with Empathic AI Sound Interfaces!

Credits: Adobe Stock and various sources for the images.